public class PreComputedContextExtractor extends Object implements ContextExtractor
A ContextExtractor that assumes that the corpus has already been
pre-processed and that each document is a single line with the following format:
header score,feature(|score, feature)*
where header is an id that identifies the word being represented by this
context, score is a double, and feature is a string. Given a corpus in this
format, PreComputedContextExtractor obtains a dimension for each
feature and transforms the line into a SparseDoubleVector, passing it
to Wordsi for further processing.

Constructor and Description |
---|
PreComputedContextExtractor() Constructs a new PreComputedContextExtractor using a StringBasisMapping. |
PreComputedContextExtractor(BasisMapping<String,String> basis) Constructs a new PreComputedContextExtractor using the given BasisMapping. |

Modifier and Type | Method and Description |
---|---|
int | getVectorLength() Returns the maximum number of dimensions used to represent any given context. |
void | processDocument(BufferedReader document, Wordsi wordsi) Processes the content of document and calls Wordsi.handleContextVector(java.lang.String, java.lang.String, edu.ucla.sspace.vector.SparseDoubleVector) for each context vector that can be extracted from document. |

public PreComputedContextExtractor()
Constructs a new PreComputedContextExtractor using a StringBasisMapping.

public PreComputedContextExtractor(BasisMapping<String,String> basis)
Constructs a new PreComputedContextExtractor using the given BasisMapping.

public void processDocument(BufferedReader document, Wordsi wordsi)
Processes the content of document and calls Wordsi.handleContextVector(java.lang.String, java.lang.String, edu.ucla.sspace.vector.SparseDoubleVector) for each context vector that can be extracted from document.
Specified by: processDocument in interface ContextExtractor

public int getVectorLength()
Returns the maximum number of dimensions used to represent any given context.
Specified by: getVectorLength in interface ContextExtractor

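
The line format and the two methods documented above can be illustrated with a small, self-contained sketch. It is not the library's implementation: the class PreComputedLineSketch, the nested ParsedContext type, and the parseLine method are hypothetical names introduced here, a plain Map stands in for the BasisMapping, and a sorted Map stands in for the SparseDoubleVector. It also assumes that the header is separated from the features by whitespace and that whitespace around scores and features can be trimmed, which the format description does not state explicitly.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of parsing "header score,feature(|score,feature)*" lines. */
public class PreComputedLineSketch {

    /** Stand-in for a basis mapping (assumption): feature string -> dimension index. */
    private final Map<String, Integer> featureToDimension = new LinkedHashMap<>();

    /** One parsed line: the header plus a sparse dimension -> score map. */
    public static final class ParsedContext {
        public final String header;
        public final Map<Integer, Double> vector;

        public ParsedContext(String header, Map<Integer, Double> vector) {
            this.header = header;
            this.vector = vector;
        }
    }

    /** Parses a single pre-computed context line into a header and sparse vector. */
    public ParsedContext parseLine(String line) {
        // Assumption: the header is everything before the first run of whitespace.
        String[] headerAndRest = line.trim().split("\\s+", 2);
        if (headerAndRest.length < 2) {
            throw new IllegalArgumentException("No features found in line: " + line);
        }

        Map<Integer, Double> vector = new TreeMap<>();
        for (String pair : headerAndRest[1].split("\\|")) {
            // Split on the first comma only, so a feature may itself contain commas.
            String[] scoreAndFeature = pair.split(",", 2);
            if (scoreAndFeature.length < 2) {
                throw new IllegalArgumentException("Malformed score,feature pair: " + pair);
            }
            double score = Double.parseDouble(scoreAndFeature[0].trim());
            String feature = scoreAndFeature[1].trim();

            // Assign the next free dimension the first time a feature is seen.
            Integer dimension = featureToDimension.get(feature);
            if (dimension == null) {
                dimension = featureToDimension.size();
                featureToDimension.put(feature, dimension);
            }
            // A feature repeated within one line overwrites its earlier score.
            vector.put(dimension, score);
        }
        return new ParsedContext(headerAndRest[0], vector);
    }

    /** Number of distinct features seen so far, i.e. the dimensions in use. */
    public int getVectorLength() {
        return featureToDimension.size();
    }

    public static void main(String[] args) {
        PreComputedLineSketch sketch = new PreComputedLineSketch();
        ParsedContext first = sketch.parseLine("cat 0.5,purr|2.0,meow");
        ParsedContext second = sketch.parseLine("dog 1.5,bark|0.25,purr");
        System.out.println(first.header + " " + first.vector);
        System.out.println(second.header + " " + second.vector);
        System.out.println("dimensions: " + sketch.getVectorLength());
    }
}
```

In this sketch the feature purr keeps the same dimension across both lines, so contexts for different headers remain comparable, and the vector length grows only when a previously unseen feature appears.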
Copyright © 2012. All Rights Reserved.