public class LSAMain extends GenericMain
An executable class for running LatentSemanticAnalysis (LSA) from the
 command line.  This class accepts the following command-line arguments:
   -d, --docFile=FILE[,FILE...] a file where each line is
        a document.  This is the preferred input format for large corpora.
   -f, --fileList=FILE[,FILE...] a list of document files
        where each file is specified on its own line.
   --dimensions=<int> how many dimensions to use for the LSA
        vectors.  See LatentSemanticAnalysis for the default value.
   --preprocess=<class name> specifies an instance of
        edu.ucla.sspace.lsa.MatrixTransformer to use in preprocessing the
        word-document matrix compiled by LSA prior to computing the SVD.  See
        LatentSemanticAnalysis for the default value.
   -F, --tokenFilter=FILE[include|exclude][,FILE...]
        specifies a list of one or more files to use for filtering the
        documents.  An option flag may be added to each file to specify how
        the words in the filter file should be used: include if only the
        words in the filter file should be retained in the document; exclude
        if only the words not in the filter file should be retained in the
        document.
   -S, --svdAlgorithm=SVD.Algorithm specifies a specific SVD.Algorithm
        method to use when reducing the dimensionality in LSA.
        In general, users should not need to specify this option, as the
        default setting will choose the fastest algorithm available on the
        system.  This is only provided as an advanced option for users who
        want to compare the algorithms' performance or any variations between
        the SVD results.
   -o, --outputFormat=text|binary specifies the output format to use
        when generating the semantic space (.sspace) file.  See
        SemanticSpaceUtils for format details.
   -t, --threads=INT how many threads to use when
        processing the documents.  The default is one per core.
 
   -w, --overwrite=BOOL specifies whether to overwrite
        the existing output files.  The default is true.  If set to
        false, a unique integer is inserted into the file name.
   -v, --verbose  specifies whether to print runtime
        information to standard output.
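
The include and exclude modes of the --tokenFilter option can be pictured with a small stand-alone sketch. This is not the S-Space implementation: the class TokenFilterSketch and its filter method are hypothetical names introduced here, and the filter file is assumed to have already been loaded into a set of words.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; not the S-Space implementation.
final class TokenFilterSketch {

    /**
     * Returns the tokens of one document that survive the filter.
     *
     * @param tokens      the tokens of the document, in order
     * @param filterWords the words assumed to have been read from a filter file
     * @param include     true for "include" mode, false for "exclude" mode
     */
    static List<String> filter(List<String> tokens,
                               Set<String> filterWords,
                               boolean include) {
        List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            // include: keep a token only if it appears in the filter file.
            // exclude: keep a token only if it does NOT appear in the filter file.
            if (filterWords.contains(token) == include) {
                kept.add(token);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> doc = List.of("the", "cat", "sat", "on", "the", "mat");
        Set<String> words = Set.of("the", "on");
        System.out.println(filter(doc, words, true));   // [the, on, the]
        System.out.println(filter(doc, words, false));  // [cat, sat, mat]
    }
}
```

With the same filter file, the two modes partition the document: every token is retained by exactly one of them.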
   
 An invocation will produce one output file, lsa-semantic-space.sspace.  If
 overwrite is set to true, this file will be replaced for each new semantic
 space.  Otherwise, a new output file of the format
 lsa-semantic-space<number>.sspace will be created, where <number> is a
 unique identifier for that program's invocation.  The output file will be
 placed in the directory specified on the command line.
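
The naming rule above can be sketched as follows. The class OutputNameSketch and the method chooseOutputFile are hypothetical, and the way the unique number is picked (the smallest non-negative integer whose file name is unused) is an assumption made for illustration; the page does not say how the real class generates it.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

// Hypothetical illustration only; not the S-Space implementation.
final class OutputNameSketch {
    static final String BASE = "lsa-semantic-space";
    static final String EXT = ".sspace";

    /** Picks the output file inside dir according to the overwrite setting. */
    static File chooseOutputFile(File dir, boolean overwrite) {
        if (overwrite) {
            // The same name is reused, so an existing file is replaced.
            return new File(dir, BASE + EXT);
        }
        // Assumption: probe increasing integers until an unused name is found.
        for (int n = 0; ; n++) {
            File candidate = new File(dir, BASE + n + EXT);
            if (!candidate.exists()) {
                return candidate;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        File dir = Files.createTempDirectory("lsa-out").toFile();
        System.out.println(chooseOutputFile(dir, true).getName());   // lsa-semantic-space.sspace
        File first = chooseOutputFile(dir, false);
        first.createNewFile();
        System.out.println(first.getName());                         // lsa-semantic-space0.sspace
        System.out.println(chooseOutputFile(dir, false).getName());  // lsa-semantic-space1.sspace
    }
}
```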
 
This class is designed to run multi-threaded and performs well with one thread per core, which is the default setting.
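
As a rough picture of the one-thread-per-core default, the sketch below sizes a fixed thread pool from the number of available processors and processes documents in parallel. The class ThreadedDocsSketch and its term-counting task are hypothetical stand-ins; the real class builds a semantic space rather than a word count, and its internal threading may differ.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; not the S-Space implementation.
final class ThreadedDocsSketch {

    /** One thread per core, mirroring the documented default. */
    static int defaultThreads() {
        return Runtime.getRuntime().availableProcessors();
    }

    /** Counts term occurrences across documents using a fixed pool of worker threads. */
    static Map<String, Integer> countTerms(List<String> documents, int threads) {
        Map<String, Integer> counts = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (String doc : documents) {
            pool.execute(() -> {
                for (String token : doc.split("\\s+")) {
                    if (!token.isEmpty()) {
                        // merge is atomic on ConcurrentHashMap, so workers can share the map.
                        counts.merge(token, 1, Integer::sum);
                    }
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("document processing timed out");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> docs = List.of("a b a", "b c");
        System.out.println("threads: " + defaultThreads());
        System.out.println(countTerms(docs, defaultThreads()));
    }
}
```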
See Also: LatentSemanticAnalysis, Transform

Fields inherited from class GenericMain: argOptions, EXT, isMultiThreaded, verbose

| Modifier and Type | Method and Description |
|---|---|
| protected void | addExtraOptions(ArgOptions options): Adds all of the options to the ArgOptions. |
| protected String | getAlgorithmSpecifics(): Returns a string describing algorithm-specific options and behaviors. |
| protected SemanticSpace | getSpace(): Returns the SemanticSpace that will be used for processing. |
| protected SemanticSpaceIO.SSpaceFormat | getSpaceFormat(): Returns the default format of a LatentSemanticAnalysis space. |
| static void | main(String[] args) |
| protected void | postProcessing(): Allows subclasses to interact with the SemanticSpace after the space has finished processing all of the text. |
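
The protected methods in the table are hooks that a subclass of GenericMain fills in. The sketch below imitates that shape with hypothetical stand-in classes (MiniMain and MiniLsaMain are not part of S-Space). The call order is an assumption based only on the method descriptions: getSpace runs after the arguments are parsed, and postProcessing runs after all text has been processed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins; these are NOT the S-Space classes.
abstract class MiniMain {
    protected final List<String> log = new ArrayList<>();

    protected void addExtraOptions() { }    // hook: register subclass-specific options
    protected abstract String getSpace();   // hook: called after the arguments are parsed
    protected void postProcessing() { }     // hook: called after all text is processed

    /** Runs the fixed sequence of steps, calling the hooks at the assumed points. */
    final List<String> run(String[] args) {
        addExtraOptions();
        log.add("parse " + args.length + " args");
        String space = getSpace();
        log.add("process documents with " + space);
        postProcessing();
        log.add("save " + space);
        return log;
    }
}

final class MiniLsaMain extends MiniMain {
    @Override protected void addExtraOptions() { log.add("add extra options"); }
    @Override protected String getSpace() { return "lsa"; }
    @Override protected void postProcessing() { log.add("post-process space"); }

    public static void main(String[] args) {
        new MiniLsaMain().run(new String[] {"-d", "corpus.txt"})
                         .forEach(System.out::println);
    }
}
```

Keeping the sequence in the base class and the algorithm-specific steps in overridable methods lets each command-line entry point share the same argument parsing and output handling.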
Methods inherited from class GenericMain: addCorpusReaderIterators, addDocIterators, addFileIterators, getDocumentIterator, handleExtraOptions, loadValidTermSet, parseDocumentsMultiThreaded, parseDocumentsSingleThreaded, processDocumentsAndSpace, run, saveSSpace, setupOptions, setupProperties, usage, verbose, verbose

protected void addExtraOptions(ArgOptions options)
Adds all of the options to the ArgOptions.
Overrides: addExtraOptions in class GenericMain
Parameters: options - the ArgOptions object to which more main-specific options can
        be added.
See Also: GenericMain.handleExtraOptions()

protected SemanticSpace getSpace()
Description copied from class: GenericMain
Returns the SemanticSpace that will be used for processing.  This
 method is guaranteed to be called after the command line arguments have
 been parsed, so the contents of GenericMain.argOptions are valid.
Overrides: getSpace in class GenericMain

protected SemanticSpaceIO.SSpaceFormat getSpaceFormat()
Returns the default format of a LatentSemanticAnalysis space.
Overrides: getSpaceFormat in class GenericMain

protected void postProcessing()
Description copied from class: GenericMain
Allows subclasses to interact with the SemanticSpace after the
 space has finished processing all of the text.
Overrides: postProcessing in class GenericMain

protected String getAlgorithmSpecifics()
Overrides: getAlgorithmSpecifics in class GenericMain

Copyright © 2012. All Rights Reserved.