public class PseudoWordDependencyContextExtractor extends DependencyContextExtractor
A pseudo word based DependencyContextExtractor. Given a mapping from
raw tokens to pseudo words, this extractor will automatically change the text
for any dependency node that has a valid pseudo word mapping. The pseudo
word will serve as the primary key for assignments and the original token
will serve as the secondary key.

Fields inherited from class DependencyContextExtractor: extractor, generator, readHeader

| Constructor and Description |
|---|
| PseudoWordDependencyContextExtractor(DependencyExtractor extractor, DependencyContextGenerator generator, Map<String,String> pseudoWordMap) Creates a new PseudoWordDependencyContextExtractor. |
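
The key assignment described in the class summary can be illustrated with a small standalone sketch. It does not use the S-Space classes: `PseudoWordKeySketch`, `Keys`, and `assignKeys` are hypothetical names introduced only for illustration, and the sketch assumes that tokens without a pseudo word mapping are simply skipped. It shows only how a token with a mapping ends up with the pseudo word as its primary key and the original token as its secondary key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: models the primary/secondary key assignment with plain
// collections. None of these names belong to the S-Space API.
public class PseudoWordKeySketch {

    /** Hypothetical holder for the two keys assigned to one token. */
    public static final class Keys {
        public final String primaryKey;   // the pseudo word
        public final String secondaryKey; // the original raw token

        public Keys(String primaryKey, String secondaryKey) {
            this.primaryKey = primaryKey;
            this.secondaryKey = secondaryKey;
        }
    }

    /**
     * Returns one Keys entry per token that has a pseudo word mapping.
     * Assumption: tokens without a mapping are skipped.
     */
    public static List<Keys> assignKeys(List<String> tokens,
                                        Map<String, String> pseudoWordMap) {
        List<Keys> result = new ArrayList<>();
        for (String token : tokens) {
            String pseudoWord = pseudoWordMap.get(token);
            if (pseudoWord != null) {
                result.add(new Keys(pseudoWord, token));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Two raw tokens conflated into a single pseudo word.
        Map<String, String> pseudoWordMap = new HashMap<>();
        pseudoWordMap.put("cat", "catdog");
        pseudoWordMap.put("dog", "catdog");

        List<String> tokens = Arrays.asList("the", "cat", "chased", "the", "dog");
        for (Keys k : assignKeys(tokens, pseudoWordMap)) {
            System.out.println(k.primaryKey + " <- " + k.secondaryKey);
        }
        // prints:
        // catdog <- cat
        // catdog <- dog
    }
}
```

Keeping the original token as the secondary key means the contexts gathered under one pseudo word can still be traced back to the raw token that produced them.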
| Modifier and Type | Method and Description |
|---|---|
| protected boolean | acceptWord(DependencyTreeNode focusNode, String contextHeader, Wordsi wordsi) Returns true if focusWord is a known pseudo word. |
| void | processDocument(BufferedReader document, Wordsi wordsi) Processes the content of document and calls Wordsi.handleContextVector(java.lang.String, java.lang.String, edu.ucla.sspace.vector.SparseDoubleVector) for each context vector that can be extracted from document. |

Methods inherited from class DependencyContextExtractor: getPrimaryKey, getSecondaryKey, getVectorLength, handleContextHeader

public PseudoWordDependencyContextExtractor(DependencyExtractor extractor, DependencyContextGenerator generator, Map<String,String> pseudoWordMap)
Creates a new PseudoWordDependencyContextExtractor.

Parameters:
extractor - The DependencyExtractor that parses the document and returns a valid dependency tree
basisMapping - A mapping from dependency paths to feature indices
weighter - A weighting function for dependency paths
acceptor - An accepting function that validates dependency paths which may serve as features
pseudoWordMap - A mapping from raw tokens to pseudo words

public void processDocument(BufferedReader document, Wordsi wordsi)
Processes the content of document and calls Wordsi.handleContextVector(java.lang.String, java.lang.String, edu.ucla.sspace.vector.SparseDoubleVector) for each context vector that can be extracted from document.

Specified by: processDocument in interface ContextExtractor
Overrides: processDocument in class DependencyContextExtractor

protected boolean acceptWord(DependencyTreeNode focusNode, String contextHeader, Wordsi wordsi)
Returns true if focusWord is a known pseudo word.

Overrides: acceptWord in class DependencyContextExtractor

Copyright © 2012. All Rights Reserved.