public class PreComputedContextExtractor extends Object implements ContextExtractor
A ContextExtractor that assumes the corpus has already been
pre-processed so that each document is a single line with the following format:
header score,feature(|score, feature)*
where header is some id that identifies the word being represented by this
context, score is a double, and feature is some string. With this style of
corpus, PreComputedContextExtractor will obtain a dimension for each
feature and transform the line into a SparseDoubleVector, passing it
to Wordsi for further processing.

| Constructor and Description |
|---|
| PreComputedContextExtractor() Constructs a new PreComputedContextExtractor using a StringBasisMapping. |
| PreComputedContextExtractor(BasisMapping<String,String> basis) Constructs a new PreComputedContextExtractor using the given BasisMapping. |

| Modifier and Type | Method and Description |
|---|---|
| int | getVectorLength() Returns the maximum number of dimensions used to represent any given context. |
| void | processDocument(BufferedReader document, Wordsi wordsi) Processes the content of document and calls Wordsi.handleContextVector(java.lang.String, java.lang.String, edu.ucla.sspace.vector.SparseDoubleVector) for each context vector that can be extracted from document. |

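The line format given in the class description (header score,feature(|score, feature)*) can be illustrated with a small standalone parser. This is only a sketch and not the library's implementation: ContextLineSketch, ParsedContext, and vectorLength() are hypothetical names, a plain Map stands in for BasisMapping, and a TreeMap stands in for SparseDoubleVector. It also assumes that the header is separated from the pairs by whitespace, that whitespace around a feature is ignored, and that repeated features on one line have their scores summed; none of these details are confirmed by this page. Malformed lines are not handled.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

/** Illustrative stand-in only; not part of the S-Space API. */
class ContextLineSketch {

    /** Hypothetical result type: the header plus a sparse vector. */
    static final class ParsedContext {
        final String header;
        final Map<Integer, Double> vector;  // dimension index -> score

        ParsedContext(String header, Map<Integer, Double> vector) {
            this.header = header;
            this.vector = vector;
        }
    }

    // Stand-in for a BasisMapping: each new feature gets the next free index.
    private final Map<String, Integer> featureToDimension = new LinkedHashMap<>();

    /** Parses one line of the form: header score,feature(|score,feature)* */
    ParsedContext parse(String line) {
        // Assumption: the header is separated from the pairs by whitespace.
        String[] headerAndRest = line.trim().split("\\s+", 2);
        Map<Integer, Double> vector = new TreeMap<>();
        if (headerAndRest.length == 2) {
            for (String pair : headerAndRest[1].split("\\|")) {
                // Split on the first comma only: score first, then feature.
                String[] scoreAndFeature = pair.split(",", 2);
                double score = Double.parseDouble(scoreAndFeature[0].trim());
                String feature = scoreAndFeature[1].trim();
                Integer dimension = featureToDimension.get(feature);
                if (dimension == null) {
                    dimension = featureToDimension.size();
                    featureToDimension.put(feature, dimension);
                }
                // Assumption: repeated features on one line are summed.
                vector.merge(dimension, score, Double::sum);
            }
        }
        return new ParsedContext(headerAndRest[0], vector);
    }

    /** Distinct features seen so far; loosely analogous to getVectorLength(). */
    int vectorLength() {
        return featureToDimension.size();
    }

    public static void main(String[] args) {
        ContextLineSketch sketch = new ContextLineSketch();
        ParsedContext context = sketch.parse("cat 0.5,dog|1.25, mouse");
        System.out.println(context.header + " " + context.vector);
        System.out.println(sketch.vectorLength());
    }
}
```

In the real class, the header and the resulting vector are handed to Wordsi.handleContextVector rather than returned; the sketch returns them so the parsing step can be inspected on its own.
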
public PreComputedContextExtractor()
Constructs a new PreComputedContextExtractor using a StringBasisMapping.

public PreComputedContextExtractor(BasisMapping<String,String> basis)
Constructs a new PreComputedContextExtractor using the given BasisMapping.

public void processDocument(BufferedReader document, Wordsi wordsi)
Processes the content of document and calls Wordsi.handleContextVector(java.lang.String, java.lang.String, edu.ucla.sspace.vector.SparseDoubleVector) for each context vector that can be extracted from document.
Specified by: processDocument in interface ContextExtractor

public int getVectorLength()
Returns the maximum number of dimensions used to represent any given context.
Specified by: getVectorLength in interface ContextExtractor

Copyright © 2012. All Rights Reserved.