gov.llnl.ontology.mapreduce.stats
Class WordsiMR

java.lang.Object
  extended by org.apache.hadoop.conf.Configured
      extended by gov.llnl.ontology.mapreduce.CorpusTableMR
          extended by gov.llnl.ontology.mapreduce.stats.WordsiMR
All Implemented Interfaces:
org.apache.hadoop.conf.Configurable, org.apache.hadoop.util.Tool

public class WordsiMR
extends CorpusTableMR

Author:
Keith Stevens

Nested Class Summary
static class WordsiMR.WordsiDependencyMapper
          The TableMapper responsible for the real work.
static class WordsiMR.WordsiOccurrenceMapper
          The TableMapper responsible for the real work.
 
Nested classes/interfaces inherited from class gov.llnl.ontology.mapreduce.CorpusTableMR
CorpusTableMR.CorpusTableMapper<K,V>
 
Field Summary
static String ABOUT
           
static String CONF_PREFIX
          A prefix for any Configuration setting.
static String DEFAULT_ACCEPTOR
          The classname of the default DependencyPathAcceptor.
static String DEFAULT_WEIGHT
          The classname of the default DependencyPathWeight.
static String DEPENDENCY_BASIS
          The configuration for setting the DependencyPathBasisMapping.
static String PATH_ACCEPTOR
          The configuration for setting the DependencyPathAcceptor.
static String PATH_LENGTH
          The configuration for setting the path length or word co-occurrence window.
static String PATH_WEIGHT
          The configuration for setting the DependencyPathWeight.
static String USE_ORDERING
          The configuration set when word ordering features should be used.
static String USE_POS
          The configuration set when part of speech features should be used.
 
Fields inherited from class gov.llnl.ontology.mapreduce.CorpusTableMR
TABLE
 
Constructor Summary
WordsiMR()
           
 
Method Summary
protected  void addOptions(MRArgOptions options)
          Add more command line arguments.
static
<T> void
emitContext(String focus, edu.ucla.sspace.basis.BasisMapping<T,String> basis, edu.ucla.sspace.vector.SparseDoubleVector vector, org.apache.hadoop.mapreduce.Mapper.Context context)
           
 String jobName()
          Returns a descriptive job name for this map reduce task.
static void main(String[] args)
          Runs the WordsiMR job.
protected  Class mapperClass()
          Returns the Class object for the Mapper task.
protected  Class mapperKeyClass()
          Returns the Class object for the Mapper Key of this task.
protected  Class mapperValueClass()
          Returns the Class object for the Mapper Value of this task.
protected  void setupConfiguration(MRArgOptions options, org.apache.hadoop.conf.Configuration conf)
          Copies command line arguments to a Configuration so that Map/Reduce jobs can utilize the values set.
protected  void setupReducer(String tableName, org.apache.hadoop.mapreduce.Job job, MRArgOptions options)
          Sets up the Reducer for this job.
protected  void validateOptions(MRArgOptions options)
          Validates that the MRArgOptions contains a valid value for each required option.
 
Methods inherited from class gov.llnl.ontology.mapreduce.CorpusTableMR
run
 
Methods inherited from class org.apache.hadoop.conf.Configured
getConf, setConf
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface org.apache.hadoop.conf.Configurable
getConf, setConf
 

Field Detail

ABOUT

public static final String ABOUT
See Also:
Constant Field Values

CONF_PREFIX

public static final String CONF_PREFIX
A prefix for any Configuration setting.

See Also:
Constant Field Values

PATH_ACCEPTOR

public static final String PATH_ACCEPTOR
The configuration for setting the DependencyPathAcceptor.

See Also:
Constant Field Values

PATH_WEIGHT

public static final String PATH_WEIGHT
The configuration for setting the DependencyPathWeight.

See Also:
Constant Field Values

PATH_LENGTH

public static final String PATH_LENGTH
The configuration for setting the path length or word co-occurrence window.

See Also:
Constant Field Values

DEPENDENCY_BASIS

public static final String DEPENDENCY_BASIS
The configuration for setting the DependencyPathBasisMapping.

See Also:
Constant Field Values

USE_POS

public static final String USE_POS
The configuration set when part of speech features should be used.

See Also:
Constant Field Values

USE_ORDERING

public static final String USE_ORDERING
The configuration set when word ordering features should be used.

See Also:
Constant Field Values

DEFAULT_ACCEPTOR

public static final String DEFAULT_ACCEPTOR
The classname of the default DependencyPathAcceptor.

See Also:
Constant Field Values

DEFAULT_WEIGHT

public static final String DEFAULT_WEIGHT
The classname of the default DependencyPathWeight.

See Also:
Constant Field Values
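
The fields above are String constants that name Configuration settings, with CONF_PREFIX acting as a shared prefix. The actual constant values are not shown on this page, so the following is only a minimal sketch of how prefixed setting names might be composed and read back. A plain Map stands in for org.apache.hadoop.conf.Configuration, and every key string and helper name (getInt, isSet) is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch only: a Map stands in for a Hadoop Configuration, and the key
// strings below are invented for illustration, not WordsiMR's real values.
class PrefixedConfSketch {
    // Hypothetical stand-ins for CONF_PREFIX, PATH_LENGTH, and USE_POS.
    static final String CONF_PREFIX = "example.wordsi";
    static final String PATH_LENGTH = CONF_PREFIX + ".pathLength";
    static final String USE_POS = CONF_PREFIX + ".usePos";

    // Reads an integer setting, falling back to a default when the key is unset.
    static int getInt(Map<String, String> conf, String key, int defaultValue) {
        String value = conf.get(key);
        return value == null ? defaultValue : Integer.parseInt(value);
    }

    // Treats a flag as enabled only when its key is present and set to "true".
    static boolean isSet(Map<String, String> conf, String key) {
        return Boolean.parseBoolean(conf.getOrDefault(key, "false"));
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(PATH_LENGTH, "5");
        conf.put(USE_POS, "true");
        System.out.println("path length: " + getInt(conf, PATH_LENGTH, 1));
        System.out.println("use POS: " + isSet(conf, USE_POS));
    }
}
```

Sharing one prefix keeps a job's settings grouped together and makes collisions with unrelated Configuration keys less likely.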
Constructor Detail

WordsiMR

public WordsiMR()
Method Detail

main

public static void main(String[] args)
                 throws Exception
Runs the WordsiMR job.

Throws:
Exception

validateOptions

protected void validateOptions(MRArgOptions options)
Validates that the MRArgOptions contains a valid value for each required option. By default, this does no validation.

Overrides:
validateOptions in class CorpusTableMR

jobName

public String jobName()
Returns a descriptive job name for this map reduce task.

Overrides:
jobName in class CorpusTableMR

addOptions

protected void addOptions(MRArgOptions options)
Add more command line arguments. By default, this adds no options.

Overrides:
addOptions in class CorpusTableMR

setupConfiguration

protected void setupConfiguration(MRArgOptions options,
                                  org.apache.hadoop.conf.Configuration conf)
Copies command line arguments to a Configuration so that Map/Reduce jobs can utilize the values set. By default, this does no configuration.

Overrides:
setupConfiguration in class CorpusTableMR
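
As a rough illustration of copying command line arguments into a configuration, the sketch below moves "--name value" pairs into a Map under a common prefix. It does not use MRArgOptions or Hadoop's Configuration. The argument format, the prefix, and the copyOptions helper are all assumptions made for this example.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch only: a Map stands in for a Hadoop Configuration and a String[]
// stands in for parsed MRArgOptions. Names and formats are hypothetical.
class CopyOptionsSketch {
    // Copies well-formed "--name value" pairs into a map, storing each value
    // under "<prefix>.<name>" so that later tasks can look it up by key.
    static Map<String, String> copyOptions(String[] args, String prefix) {
        Map<String, String> conf = new HashMap<>();
        for (int i = 0; i + 1 < args.length; i += 2) {
            if (args[i].startsWith("--")) {
                conf.put(prefix + "." + args[i].substring(2), args[i + 1]);
            }
        }
        return conf;
    }

    public static void main(String[] args) {
        String[] example = {"--pathLength", "5", "--usePos", "true"};
        Map<String, String> conf = copyOptions(example, "example.wordsi");
        System.out.println(conf.get("example.wordsi.pathLength"));
        System.out.println(conf.get("example.wordsi.usePos"));
    }
}
```

Values have to travel through the configuration because map tasks run in separate processes and cannot see the driver's parsed arguments directly.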

mapperClass

protected Class mapperClass()
Returns the Class object for the Mapper task.

Specified by:
mapperClass in class CorpusTableMR

mapperKeyClass

protected Class mapperKeyClass()
Returns the Class object for the Mapper Key of this task.

Overrides:
mapperKeyClass in class CorpusTableMR

mapperValueClass

protected Class mapperValueClass()
Returns the Class object for the Mapper Value of this task.

Overrides:
mapperValueClass in class CorpusTableMR

setupReducer

protected void setupReducer(String tableName,
                            org.apache.hadoop.mapreduce.Job job,
                            MRArgOptions options)
Sets up the Reducer for this job.

Overrides:
setupReducer in class CorpusTableMR
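
WordsiMR inherits run from CorpusTableMR and overrides hooks such as validateOptions, jobName, and mapperClass. That arrangement resembles a template method. The sketch below shows the general pattern with invented classes only. The order in which the real run method calls its hooks is not documented on this page, and the order used here is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: illustrates the template-method pattern with hypothetical
// classes. It is not the CorpusTableMR implementation.
abstract class TableJobSketch {
    // Hook with a default: performs no validation unless overridden.
    protected void validateOptions(List<String> log) {
        log.add("validate: none");
    }

    // Hook with a default job name.
    protected String jobName() {
        return "generic job";
    }

    // Hook that every subclass must supply.
    protected abstract Class<?> mapperClass();

    // The inherited driver: fixed control flow, subclass-specific details.
    public final List<String> run() {
        List<String> log = new ArrayList<>();
        validateOptions(log);
        log.add("job: " + jobName());
        log.add("mapper: " + mapperClass().getSimpleName());
        return log;
    }
}

class WordsiJobSketch extends TableJobSketch {
    // Placeholder for a mapper implementation.
    static class OccurrenceMapperSketch {}

    @Override
    protected String jobName() {
        return "wordsi sketch";
    }

    @Override
    protected Class<?> mapperClass() {
        return OccurrenceMapperSketch.class;
    }

    public static void main(String[] args) {
        for (String line : new WordsiJobSketch().run()) {
            System.out.println(line);
        }
    }
}
```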

emitContext

public static <T> void emitContext(String focus,
                                   edu.ucla.sspace.basis.BasisMapping<T,String> basis,
                                   edu.ucla.sspace.vector.SparseDoubleVector vector,
                                   org.apache.hadoop.mapreduce.Mapper.Context context)
                        throws IOException,
                               InterruptedException
Throws:
IOException
InterruptedException


Copyright © 2010-2011. All Rights Reserved.