gov.llnl.ontology.mapreduce.stats
Class TokenCountMR

java.lang.Object
  extended by org.apache.hadoop.conf.Configured
      extended by gov.llnl.ontology.mapreduce.CorpusTableMR
          extended by gov.llnl.ontology.mapreduce.stats.TokenCountMR
All Implemented Interfaces:
org.apache.hadoop.conf.Configurable, org.apache.hadoop.util.Tool

public class TokenCountMR
extends CorpusTableMR

A Map/Reduce job that counts the occurrences of each token in a corpus.

Author:
Keith Stevens
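
The counting pattern named in the class description can be illustrated without Hadoop or HBase. The following is a minimal in-memory sketch, not the actual TokenCountMR implementation: the class name TokenCountSketch, its methods, and the whitespace tokenization are all assumptions made for illustration. The real job reads from a corpus table and may tokenize differently.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; not part of gov.llnl.ontology.
public class TokenCountSketch {

    /** Map phase: emit one (token, 1) pair per whitespace-separated token. */
    public static List<Map.Entry<String, Long>> map(String document) {
        List<Map.Entry<String, Long>> pairs = new ArrayList<>();
        for (String token : document.trim().split("\\s+")) {
            // An empty document splits to a single empty string; skip it.
            if (!token.isEmpty()) {
                pairs.add(new AbstractMap.SimpleEntry<>(token, 1L));
            }
        }
        return pairs;
    }

    /** Reduce phase: sum the emitted counts for each distinct token. */
    public static Map<String, Long> reduce(List<Map.Entry<String, Long>> pairs) {
        // TreeMap keeps tokens sorted so the output order is deterministic.
        Map<String, Long> totals = new TreeMap<>();
        for (Map.Entry<String, Long> pair : pairs) {
            totals.merge(pair.getKey(), pair.getValue(), Long::sum);
        }
        return totals;
    }

    /** Runs the map phase over every document, then reduces the result. */
    public static Map<String, Long> countTokens(List<String> documents) {
        List<Map.Entry<String, Long>> emitted = new ArrayList<>();
        for (String document : documents) {
            emitted.addAll(map(document));
        }
        return reduce(emitted);
    }

    public static void main(String[] args) {
        Map<String, Long> counts =
            countTokens(Arrays.asList("the cat sat", "the dog sat down"));
        System.out.println(counts); // prints {cat=1, dog=1, down=1, sat=2, the=2}
    }
}
```

In a distributed run the framework groups the emitted pairs by key between the two phases; the sketch collapses that grouping into the single merge call in reduce.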

Nested Class Summary
static class TokenCountMR.TokenCountMapper
          The TableMapper responsible for most of the work.
 
Nested classes/interfaces inherited from class gov.llnl.ontology.mapreduce.CorpusTableMR
CorpusTableMR.CorpusTableMapper<K,V>
 
Field Summary
static String ABOUT
          The job description used in help text.
 
Fields inherited from class gov.llnl.ontology.mapreduce.CorpusTableMR
CONF_PREFIX, TABLE
 
Constructor Summary
TokenCountMR()
           
 
Method Summary
protected  String jobName()
          Returns a descriptive job name for this map reduce task.
static void main(String[] args)
          Runs the TokenCountMR.
protected  Class mapperClass()
          Returns the Class object for the Mapper task.
protected  Class mapperKeyClass()
          Returns the Class object for the Mapper Key of this task.
protected  Class mapperValueClass()
          Returns the Class object for the Mapper Value of this task.
protected  void setupReducer(String tableName, org.apache.hadoop.mapreduce.Job job, MRArgOptions options)
          Sets up the Reducer for this job.
protected  void validateOptions(MRArgOptions options)
          Validates that the MRArgOptions contains a valid value for each required option.
 
Methods inherited from class gov.llnl.ontology.mapreduce.CorpusTableMR
addOptions, run, setupConfiguration
 
Methods inherited from class org.apache.hadoop.conf.Configured
getConf, setConf
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface org.apache.hadoop.conf.Configurable
getConf, setConf
 

Field Detail

ABOUT

public static final String ABOUT
The job description used in help text.

See Also:
Constant Field Values
Constructor Detail

TokenCountMR

public TokenCountMR()
Method Detail

main

public static void main(String[] args)
                 throws Exception
Runs the TokenCountMR.

Throws:
Exception

validateOptions

protected void validateOptions(MRArgOptions options)
Validates that the MRArgOptions contains a valid value for each required option. By default, this does no validation.

Overrides:
validateOptions in class CorpusTableMR

jobName

protected String jobName()
Returns a descriptive job name for this map reduce task.

Overrides:
jobName in class CorpusTableMR

mapperClass

protected Class mapperClass()
Returns the Class object for the Mapper task.

Specified by:
mapperClass in class CorpusTableMR

mapperKeyClass

protected Class mapperKeyClass()
Returns the Class object for the Mapper Key of this task.

Overrides:
mapperKeyClass in class CorpusTableMR

mapperValueClass

protected Class mapperValueClass()
Returns the Class object for the Mapper Value of this task.

Overrides:
mapperValueClass in class CorpusTableMR

setupReducer

protected void setupReducer(String tableName,
                            org.apache.hadoop.mapreduce.Job job,
                            MRArgOptions options)
Sets up the Reducer for this job.

Overrides:
setupReducer in class CorpusTableMR


Copyright © 2010-2011. All Rights Reserved.