public class DependencyVectorSpace extends Object implements DimensionallyInterpretableSemanticSpace<String>
A BasisFunction that maps the co-occurrence of a word at the end of a
path to a specific dimension. For example, a basis function may map each
occurrence of a word to a single dimension, or it might map each
occurrence to a different dimension specific to how the word was related.

A PathWeight for specifying how to value the co-occurrence. For example,
each occurrence may have the same value, or the weight could be based on
the length of the path that connects the two words.

A DependencyPathAcceptor that determines which paths are to be used in
counting co-occurrences. Padó and Lapata provide three templates to match
against: MinimumTemplateAcceptor, MediumTemplateAcceptor, and
MaximumTemplateAcceptor. Each acceptor matches every path accepted by the
next smaller acceptor, plus additional paths. See Padó and Lapata (2007)
for details.
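The interplay of the three components can be illustrated with a small, self-contained sketch. It does not use the S-Space classes: `Path`, `Acceptor`, `Weight`, and `Basis` are hypothetical stand-ins for a dependency path, `DependencyPathAcceptor`, `PathWeight`, and `BasisFunction`, and the sample paths and relation labels are invented for illustration only.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of the three pluggable components. None of these types are the S-Space API. */
class DependencyCountSketch {

    /** Hypothetical dependency path from a focus word to the word at its end. */
    static final class Path {
        final String focus;
        final String end;
        final List<String> relations;

        Path(String focus, String end, String... relations) {
            this.focus = focus;
            this.end = end;
            this.relations = Arrays.asList(relations);
        }

        int length() {
            return relations.size();
        }
    }

    /** Stand-in for DependencyPathAcceptor: decides whether a path is counted. */
    interface Acceptor { boolean accepts(Path path); }

    /** Stand-in for PathWeight: decides how much an accepted path contributes. */
    interface Weight { double weigh(Path path); }

    /** Stand-in for BasisFunction: names the dimension a path contributes to. */
    interface Basis { String dimension(Path path); }

    /** Accumulates weighted co-occurrences per focus word and dimension. */
    static Map<String, Map<String, Double>> count(
            List<Path> paths, Acceptor acceptor, Weight weight, Basis basis) {
        Map<String, Map<String, Double>> vectors = new LinkedHashMap<>();
        for (Path path : paths) {
            if (!acceptor.accepts(path)) {
                continue; // rejected paths do not count towards co-occurrences
            }
            vectors.computeIfAbsent(path.focus, k -> new LinkedHashMap<>())
                   .merge(basis.dimension(path), weight.weigh(path), Double::sum);
        }
        return vectors;
    }

    /** Invented sample data: three short paths and one path of length three. */
    static List<Path> samplePaths() {
        return Arrays.asList(
            new Path("dog", "barks", "SBJ"),
            new Path("dog", "barks", "SBJ"),
            new Path("dog", "loudly", "SBJ", "ADV"),
            new Path("dog", "night", "SBJ", "ADV", "PMOD"));
    }

    public static void main(String[] args) {
        Acceptor shortPaths = path -> path.length() <= 2;
        // Flat weight, one dimension per end word.
        System.out.println(count(samplePaths(), shortPaths, p -> 1.0, p -> p.end));
        // Inverse-length weight, one dimension per (end word, last relation) pair.
        System.out.println(count(samplePaths(), shortPaths, p -> 1.0 / p.length(),
                p -> p.end + ":" + p.relations.get(p.length() - 1)));
    }
}
```

Swapping any one of the three lambdas changes the resulting vectors without touching the counting loop, which is the kind of configurability the three components provide.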
"edu.ucla.sspace.dri.DependencyVectorSpace.pathAcceptor"
Default: MinimumTemplateAcceptor
The DependencyPathAcceptor to use for validating dependency paths. If a
path is rejected, it does not count towards co-occurrences.

PATH_WEIGHTING_PROPERTY
Default: FlatPathWeight

"edu.ucla.sspace.dri.DependencyVectorSpace.basisMapping"
Default: WordBasedBasisMapping
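A minimal sketch of preparing these options with `java.util.Properties` follows. The two property keys are copied from the text above; the helper class name is hypothetical, and passing bare class names as values is an assumption (the library may expect fully qualified class names). The constructor call appears only in a comment because it requires the S-Space library on the classpath.

```java
import java.util.Properties;

/** Hypothetical helper that builds a Properties object for DependencyVectorSpace. */
class DependencyVectorSpaceConfigSketch {

    // Property keys as quoted in the documentation above.
    static final String PATH_ACCEPTOR_KEY =
        "edu.ucla.sspace.dri.DependencyVectorSpace.pathAcceptor";
    static final String BASIS_MAPPING_KEY =
        "edu.ucla.sspace.dri.DependencyVectorSpace.basisMapping";

    /** Returns properties that spell out the documented defaults explicitly. */
    static Properties documentedDefaults() {
        Properties props = new Properties();
        // Assumption: values name the implementing class; the exact form
        // (simple vs. fully qualified name) is not stated in the documentation.
        props.setProperty(PATH_ACCEPTOR_KEY, "MinimumTemplateAcceptor");
        props.setProperty(BASIS_MAPPING_KEY, "WordBasedBasisMapping");
        return props;
    }

    public static void main(String[] args) {
        Properties props = documentedDefaults();
        props.forEach((key, value) -> System.out.println(key + " = " + value));
        // With the library available, the object would be handed to the constructor:
        // DependencyVectorSpace space = new DependencyVectorSpace(props);
    }
}
```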
processDocument. At any given point in processing, the getVector method
may be used to access the current semantics of a word. This allows callers
to track incremental changes to the semantics as the corpus is processed.
The processSpace method for this class does nothing.

See Also: StructuredVectorSpace, BasisFunction, PathWeight,
DependencyPathAcceptor
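The incremental access pattern described above (calling processDocument repeatedly and reading getVector in between) can be sketched with a toy stand-in. `IncrementalSpaceSketch` is hypothetical and is not the S-Space class: it counts plain same-line word co-occurrences instead of dependency paths, and `processText` is a convenience helper added for the example. Only the method names processDocument and getVector mirror the documented API.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

/** Toy space whose vectors can be read between documents. Not the S-Space API. */
class IncrementalSpaceSketch {

    private final Map<String, Map<String, Integer>> vectors = new HashMap<>();

    /** Updates counts immediately: each word co-occurs with the other words on its line. */
    void processDocument(BufferedReader document) throws IOException {
        String line;
        while ((line = document.readLine()) != null) {
            String[] words = line.trim().split("\\s+");
            for (String focus : words) {
                for (String other : words) {
                    if (!focus.equals(other)) {
                        vectors.computeIfAbsent(focus, k -> new HashMap<>())
                               .merge(other, 1, Integer::sum);
                    }
                }
            }
        }
    }

    /** Hypothetical convenience wrapper so callers can pass a String. */
    void processText(String text) {
        try {
            processDocument(new BufferedReader(new StringReader(text)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns the current counts for a word, or null if the word has not been seen. */
    Map<String, Integer> getVector(String word) {
        return vectors.get(word);
    }

    public static void main(String[] args) {
        IncrementalSpaceSketch space = new IncrementalSpaceSketch();
        space.processText("dog barks");
        System.out.println(space.getVector("dog")); // state after the first document
        space.processText("dog barks loudly");
        System.out.println(space.getVector("dog")); // state after the second document
    }
}
```

Because no separate post-processing step is needed before vectors become readable, a caller can snapshot a word's vector after each document and compare the snapshots.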
Modifier and Type | Field and Description |
---|---|
static String | BASIS_MAPPING_PROPERTY: The property for setting the basis mapping. |
static String | PATH_ACCEPTOR_PROPERTY: The property for setting the DependencyPathAcceptor. |
static String | PATH_WEIGHTING_PROPERTY: The property for setting the DependencyPathWeight. |
static String | PROPERTY_PREFIX: The base prefix for all DependencyVectorSpace properties. |
Constructor and Description |
---|
DependencyVectorSpace(): Creates and configures this DependencyVectorSpace with the default set of parameters. |
DependencyVectorSpace(Properties properties): Creates and configures this DependencyVectorSpace from the provided properties. |
DependencyVectorSpace(Properties properties, int pathLength): Creates and configures this DependencyVectorSpace from the provided properties and the given maximum path length. |
Modifier and Type | Method and Description |
---|---|
String | getDimensionDescription(int dimension): Returns a description of the dependency path feature to which the provided dimension is mapped. |
String | getSpaceName(): Returns "DependencyVectorSpace" plus this instance's configuration of a basis mapping, path weighting, and path acceptor. |
Vector | getVector(String term): Returns the semantic vector for the provided word. |
int | getVectorLength(): Returns the length of vectors in this semantic space. |
Set<String> | getWords(): Returns the set of words that are represented in this semantic space. |
void | processDocument(BufferedReader document): Extracts all the parsed sentences in the document and then updates the co-occurrence values for those paths matching the loaded set of templates, according to this instance's BasisFunction. |
void | processSpace(Properties properties): Does nothing. |
public static final String PROPERTY_PREFIX
The base prefix for all DependencyVectorSpace properties.

public static final String PATH_ACCEPTOR_PROPERTY
The property for setting the DependencyPathAcceptor.

public static final String PATH_WEIGHTING_PROPERTY
The property for setting the DependencyPathWeight.

public static final String BASIS_MAPPING_PROPERTY
The property for setting the basis mapping.

public DependencyVectorSpace()
Creates and configures this DependencyVectorSpace with the default set of
parameters. The default values are:
WordBasedBasisMapping is used for dimensions;
FlatPathWeight is used to weight accepted paths;
MinimumTemplateAcceptor is used to filter the paths in a sentence.
public DependencyVectorSpace(Properties properties)
Creates and configures this DependencyVectorSpace from the provided
properties. The default values are:
WordBasedBasisMapping is used for dimensions;
FlatPathWeight is used to weight accepted paths;
MinimumTemplateAcceptor is used to filter the paths in a sentence.
public DependencyVectorSpace(Properties properties, int pathLength)
Creates and configures this DependencyVectorSpace from the provided
properties and the given maximum path length. The default values are:
WordBasedBasisMapping is used for dimensions;
FlatPathWeight is used to weight accepted paths;
MinimumTemplateAcceptor is used to filter the paths in a sentence.
Parameters:
properties - The Properties setting the above options
pathLength - The maximum valid path length. Must be non-negative. If
zero, the maximum path length used by the DependencyPathAcceptor will be
used.

public String getDimensionDescription(int dimension)
Returns a description of the dependency path feature to which the
provided dimension is mapped.
Specified by: getDimensionDescription in interface
DimensionallyInterpretableSemanticSpace<String>
Parameters:
dimension - a dimension number

public Set<String> getWords()
Returns the set of words that are represented in this semantic space.
Specified by: getWords in interface SemanticSpace
public Vector getVector(String term)
Returns the semantic vector for the provided word.
Specified by: getVector in interface SemanticSpace
Parameters:
term - a word that may be in the semantic space
Returns: the Vector for the provided word, or null if the word was not in
the space.

public String getSpaceName()
Returns "DependencyVectorSpace" plus this instance's configuration of a
basis mapping, path weighting, and path acceptor.
Specified by: getSpaceName in interface SemanticSpace
public int getVectorLength()
Returns the length of vectors in this semantic space.
processSpace is called.
Specified by: getVectorLength in interface SemanticSpace
public void processDocument(BufferedReader document) throws IOException
Extracts all the parsed sentences in the document and then updates the
co-occurrence values for those paths matching the loaded set of templates,
according to this instance's BasisFunction. Path occurrences are weighted
using this instance's PathWeight.
Specified by: processDocument in interface SemanticSpace
Parameters:
document - a reader that allows access to the text of the document
Throws:
IOException - if any error occurs while reading the document

public void processSpace(Properties properties)
Does nothing.
Specified by: processSpace in interface SemanticSpace
Parameters:
properties - a set of properties and values that may be used to configure
any exposed parameters of the algorithm.

Copyright © 2012. All Rights Reserved.