public interface ContextGenerator

ContextGenerators are a subcomponent of ContextExtractors. As a ContextExtractor examines text and finds a word that is represented in the Wordsi space, it calls the ContextGenerator and passes the created context vector to a Wordsi implementation.

ContextGenerators are recommended to be made serializable. They serve as the core representation method of Wordsi implementations and can thus be reused in multiple evaluations. For example, after training a Wordsi model, it may need to be evaluated in a pseudo-word disambiguation task or a SemEval task. In both cases, the feature space must remain the same during training and evaluation.
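
As a rough illustration of why serializability matters, the sketch below saves a generator's feature space and restores it, so that each word keeps the same dimension afterwards. The class SerializableGeneratorSketch, its indexOf method, and the in-memory round trip are assumptions made for this example and are not part of the Wordsi API; a real generator would implement ContextGenerator and would typically be written to a file between training and evaluation.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;

/** Hypothetical stand-in for a generator; it only tracks the feature space. */
public class SerializableGeneratorSketch implements Serializable {
    private static final long serialVersionUID = 1L;

    // Maps each feature word to its dimension in the context vector.
    private final HashMap<String, Integer> featureIndex = new HashMap<>();

    /** Returns the dimension for a word, assigning the next free one if needed. */
    public int indexOf(String word) {
        Integer dim = featureIndex.get(word);
        if (dim == null) {
            dim = featureIndex.size();
            featureIndex.put(word, dim);
        }
        return dim;
    }

    /** Serializes the generator to bytes and reads it back (assumed helper). */
    public static SerializableGeneratorSketch roundTrip(SerializableGeneratorSketch g) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(g);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (SerializableGeneratorSketch) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Serialization round trip failed", e);
        }
    }

    public static void main(String[] args) {
        SerializableGeneratorSketch trained = new SerializableGeneratorSketch();
        trained.indexOf("cat");  // assigned dimension 0
        trained.indexOf("dog");  // assigned dimension 1

        SerializableGeneratorSketch restored = roundTrip(trained);
        System.out.println(restored.indexOf("dog"));   // prints 1
        System.out.println(restored.indexOf("bird"));  // prints 2
    }
}
```

Because the restored copy carries the same word-to-dimension mapping, vectors produced during evaluation line up with those produced during training.
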

For evaluation purposes, an added option is available: a read only mode. When in read only mode, ContextGenerators should not create any new features. If a co-occurring term does not exist in the feature space, it should be left out of the context vector; only features that already exist in the space should contribute to the context vector. In standard mode, the generator is permitted to decide which words should serve as features using any method.

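
The difference between standard mode and read only mode can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's implementation: ReadOnlySketch is a hypothetical class, it counts raw co-occurrences as its feature method, it reports the number of known features as its vector length, and a sorted Map<Integer, Double> stands in for SparseDoubleVector so that the example depends only on the JDK.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.TreeMap;

/** Hypothetical co-occurrence generator; a sorted map stands in for SparseDoubleVector. */
public class ReadOnlySketch {
    // Maps each feature word to its dimension in the context vector.
    private final Map<String, Integer> featureIndex = new HashMap<>();
    private boolean readOnly = false;

    public void setReadOnly(boolean readOnly) {
        this.readOnly = readOnly;
    }

    /** In this sketch, the vector length is the number of known features. */
    public int getVectorLength() {
        return featureIndex.size();
    }

    /** Counts how often each known feature occurs around the focus word. */
    public Map<Integer, Double> generateContext(Queue<String> prevWords,
                                                Queue<String> nextWords) {
        Map<Integer, Double> context = new TreeMap<>();
        addWords(prevWords, context);
        addWords(nextWords, context);
        return context;
    }

    private void addWords(Queue<String> words, Map<Integer, Double> context) {
        for (String word : words) {
            Integer dim = featureIndex.get(word);
            if (dim == null) {
                if (readOnly) {
                    continue;  // read only mode: unknown terms are left out
                }
                dim = featureIndex.size();  // standard mode: create a new feature
                featureIndex.put(word, dim);
            }
            context.merge(dim, 1.0, Double::sum);
        }
    }

    public static void main(String[] args) {
        ReadOnlySketch generator = new ReadOnlySketch();

        // Standard mode: "the", "quick", and "fox" become dimensions 0, 1, and 2.
        generator.generateContext(new ArrayDeque<>(List.of("the", "quick")),
                                  new ArrayDeque<>(List.of("fox")));

        // Read only mode: "lazy" and "jumps" are unknown, so they are skipped.
        generator.setReadOnly(true);
        Map<Integer, Double> context = generator.generateContext(
                new ArrayDeque<>(List.of("the", "lazy")),
                new ArrayDeque<>(List.of("fox", "jumps")));

        System.out.println(context);                      // prints {0=1.0, 2=1.0}
        System.out.println(generator.getVectorLength());  // prints 3
    }
}
```

Skipping unknown terms rather than failing keeps evaluation vectors in the same space as the training vectors, which is the property the read only mode is meant to preserve.
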
Modifier and Type | Method and Description |
---|---|
SparseDoubleVector | generateContext(Queue<String> prevWords, Queue<String> nextWords) Returns a SparseDoubleVector that represents the context composed of the set of prevWords before the focus word and the set of nextWords after the focus word. |
int | getVectorLength() Returns the maximum number of dimensions used to represent any given context. |
void | setReadOnly(boolean readOnly) Sets the read only mode of the ContextGenerator. |

SparseDoubleVector generateContext(Queue<String> prevWords, Queue<String> nextWords)
Returns a SparseDoubleVector that represents the context composed of the set of prevWords before the focus word and the set of nextWords after the focus word. Since sparse vectors are returned, if a second order vector is generated, it is recommended that the vector also be sparse or have very few dimensions.

int getVectorLength()
Returns the maximum number of dimensions used to represent any given context.

void setReadOnly(boolean readOnly)
Sets the read only mode of the ContextGenerator. While in read only mode, only features that previously existed will contribute to context vectors.

Copyright © 2012. All Rights Reserved.