public class DependencyVectorSpace extends Object implements DimensionallyInterpretableSemanticSpace<String>
BasisFunction that maps the co-occurrence of a word at
the end of a path to a specific dimension. For example, a basis
function may map each occurrence of a word to a single dimension, or
the function might map each occurrence to a different dimension
specific to how the word was related.
PathWeight for specifying how to value the co-occurrence.
For example, each occurrence may have the same value, or the weight
could be based on the length of the path that connects the words.
DependencyPathAcceptor that determines which paths are to be
used in counting co-occurrences. Padó and Lapata provide three
templates to match against: MinimumTemplateAcceptor, MediumTemplateAcceptor, and MaximumTemplateAcceptor. Each
acceptor matches every path accepted by the next smaller template, plus additional paths.
See Padó and Lapata (2007) for details.
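
As a rough illustration of how the three parameters above interact, the following self-contained sketch counts co-occurrences over a handful of paths. The types Path, Basis, Weight and Acceptor are hypothetical stand-ins written for this example; they are not the S-Space BasisFunction, PathWeight or DependencyPathAcceptor interfaces, whose signatures are not shown here.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of the three parameters. These types are illustrative, not the S-Space API. */
final class DependencyParameterSketch {

    /** Hypothetical dependency path: the relation labels linking a focus word to an end word. */
    static final class Path {
        final List<String> relations;
        final String endWord;

        Path(String endWord, String... relations) {
            this.endWord = endWord;
            this.relations = Arrays.asList(relations);
        }

        int length() {
            return relations.size();
        }
    }

    /** Hypothetical stand-ins for the basis function, path weight and path acceptor. */
    interface Basis { String dimensionFor(Path path); }
    interface Weight { double scoreFor(Path path); }
    interface Acceptor { boolean accepts(Path path); }

    // One dimension per co-occurring word, regardless of how it was related.
    static final Basis WORD_BASED = path -> path.endWord;
    // One dimension per (relation sequence, word) pair.
    static final Basis RELATION_BASED =
            path -> String.join("-", path.relations) + ":" + path.endWord;
    // Every accepted path contributes the same value.
    static final Weight FLAT = path -> 1.0;
    // Shorter paths contribute more; assumes every path has at least one relation.
    static final Weight INVERSE_LENGTH = path -> 1.0 / path.length();

    /** Accumulates the weighted score of each accepted path into its dimension. */
    static Map<String, Double> vectorFor(List<Path> paths, Acceptor acceptor,
                                         Basis basis, Weight weight) {
        Map<String, Double> vector = new HashMap<>();
        for (Path path : paths) {
            if (acceptor.accepts(path)) {
                vector.merge(basis.dimensionFor(path), weight.scoreFor(path), Double::sum);
            }
        }
        return vector;
    }

    public static void main(String[] args) {
        List<Path> paths = Arrays.asList(
                new Path("cat", "nsubj"),
                new Path("cat", "dobj", "amod"));
        // Word-based dimensions with flat weights, accepting every path.
        System.out.println(vectorFor(paths, p -> true, WORD_BASED, FLAT));
        // Relation-specific dimensions with length-based weights, accepting only short paths.
        System.out.println(vectorFor(paths, p -> p.length() <= 1, RELATION_BASED, INVERSE_LENGTH));
    }
}
```

With the word-based basis both paths fall into the same dimension, while the relation-based basis keeps them apart; the acceptor decides which paths are counted at all.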
"edu.ucla.sspace.dri.DependencyVectorSpace.pathAcceptor"
MinimumTemplateAcceptor
DependencyPathAcceptor to use for validating dependency paths. If a
path is rejected it will not count towards co-occurrences.
FlatPathWeight
"edu.ucla.sspace.dri.DependencyVectorSpace.basisMapping"
WordBasedBasisMapping
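
A minimal sketch of preparing a java.util.Properties object with the two property keys quoted above. The helper class DependencySpaceConfig is hypothetical, and the value format is an assumption: the library may expect fully qualified class names, and the packages of the acceptor and basis mapping classes are not given here, so the simple names below are placeholders.

```java
import java.util.Properties;

/** Hypothetical helper for this example; not part of the S-Space library. */
final class DependencySpaceConfig {

    // Property keys copied from the documentation above.
    static final String PATH_ACCEPTOR_KEY =
            "edu.ucla.sspace.dri.DependencyVectorSpace.pathAcceptor";
    static final String BASIS_MAPPING_KEY =
            "edu.ucla.sspace.dri.DependencyVectorSpace.basisMapping";

    /** Builds a Properties object that names the acceptor and basis mapping to use. */
    static Properties build(String acceptorName, String basisMappingName) {
        Properties props = new Properties();
        props.setProperty(PATH_ACCEPTOR_KEY, acceptorName);
        props.setProperty(BASIS_MAPPING_KEY, basisMappingName);
        return props;
    }

    public static void main(String[] args) {
        // Placeholder values (assumption): the real values may need package prefixes.
        Properties props = build("MediumTemplateAcceptor", "WordBasedBasisMapping");
        props.list(System.out);
        // The result would then be passed to the documented constructor:
        // new DependencyVectorSpace(props)
    }
}
```

Any key left unset would presumably fall back to the default listed for that property.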
processDocument. At any given point in
processing, the getVector method may be used
to access the current semantics of a word. This allows callers to track
incremental changes to the semantics as the corpus is processed.
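
The incremental access pattern described above can be sketched with a toy stand-in. ToyIncrementalSpace is hypothetical and much simpler than DependencyVectorSpace: it counts adjacent tokens instead of dependency paths and returns a Map rather than a Vector. It only illustrates querying getVector between calls to processDocument.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

/** Toy space for illustration only; it is not the S-Space implementation. */
final class ToyIncrementalSpace {

    private final Map<String, Map<String, Integer>> counts = new HashMap<>();

    /** Counts each pair of adjacent tokens on a line as a co-occurrence in both directions. */
    void processDocument(BufferedReader document) throws IOException {
        String line;
        while ((line = document.readLine()) != null) {
            String[] tokens = line.trim().split("\\s+");
            for (int i = 0; i + 1 < tokens.length; i++) {
                add(tokens[i], tokens[i + 1]);
                add(tokens[i + 1], tokens[i]);
            }
        }
    }

    /** Convenience wrapper that processes an in-memory string as one document. */
    void processText(String text) {
        try {
            processDocument(new BufferedReader(new StringReader(text)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private void add(String word, String neighbor) {
        counts.computeIfAbsent(word, k -> new HashMap<>()).merge(neighbor, 1, Integer::sum);
    }

    /** Returns a snapshot of the current counts for the word, or null if it has not been seen. */
    Map<String, Integer> getVector(String word) {
        Map<String, Integer> vector = counts.get(word);
        return vector == null ? null : new HashMap<>(vector);
    }

    public static void main(String[] args) {
        ToyIncrementalSpace space = new ToyIncrementalSpace();
        space.processText("the cat sleeps");
        System.out.println(space.getVector("cat")); // semantics after one document
        space.processText("the cat eats");
        System.out.println(space.getVector("cat")); // semantics after two documents
    }
}
```

Because getVector returns a copy in this sketch, a caller can keep earlier snapshots and compare them as more of the corpus is processed.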
The processSpace method for this class does nothing.

See Also:
StructuredVectorSpace,
BasisFunction,
PathWeight,
DependencyPathAcceptor

| Modifier and Type | Field and Description |
|---|---|
| static String | BASIS_MAPPING_PROPERTY: The property for setting the basis mapping, that is, the BasisFunction that maps co-occurrences to dimensions. |
| static String | PATH_ACCEPTOR_PROPERTY: The property for setting the DependencyPathAcceptor. |
| static String | PATH_WEIGHTING_PROPERTY: The property for setting the DependencyPathWeight. |
| static String | PROPERTY_PREFIX: The base prefix for all DependencyVectorSpace properties. |

| Constructor and Description |
|---|
| DependencyVectorSpace(): Creates and configures this DependencyVectorSpace with the default set of parameters. |
| DependencyVectorSpace(Properties properties): Creates and configures this DependencyVectorSpace with the default set of parameters. |
| DependencyVectorSpace(Properties properties, int pathLength): Creates and configures this DependencyVectorSpace with the default set of parameters. |

| Modifier and Type | Method and Description |
|---|---|
| String | getDimensionDescription(int dimension): Returns a description of the dependency path feature to which the provided dimension is mapped. |
| String | getSpaceName(): Returns "DependencyVectorSpace" plus this instance's configuration of a basis mapping, path weighting and path acceptor. |
| Vector | getVector(String term): Returns the semantic vector for the provided word. |
| int | getVectorLength(): Returns the length of vectors in this semantic space. |
| Set<String> | getWords(): Returns the set of words that are represented in this semantic space. |
| void | processDocument(BufferedReader document): Extracts all the parsed sentences in the document and then updates the co-occurrence values for those paths matching the loaded set of templates, according to this instance's BasisFunction. |
| void | processSpace(Properties properties): Does nothing. |

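public static final String PROPERTY_PREFIX
The base prefix for all DependencyVectorSpace properties.

public static final String PATH_ACCEPTOR_PROPERTY
The property for setting the DependencyPathAcceptor.

public static final String PATH_WEIGHTING_PROPERTY
The property for setting the DependencyPathWeight.

public static final String BASIS_MAPPING_PROPERTY
The property for setting the basis mapping, that is, the BasisFunction that maps co-occurrences to dimensions.

public DependencyVectorSpace()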
Creates and configures this DependencyVectorSpace with the
default set of parameters. The default values are:
WordBasedBasisMapping is used for dimensions;
FlatPathWeight is used to weight accepted paths;
MinimumTemplateAcceptor is used to filter the paths in a sentence.

public DependencyVectorSpace(Properties properties)
Creates and configures this DependencyVectorSpace with the
default set of parameters. The default values are:
WordBasedBasisMapping is used for dimensions;
FlatPathWeight is used to weight accepted paths;
MinimumTemplateAcceptor is used to filter the paths in a sentence.

public DependencyVectorSpace(Properties properties, int pathLength)
Creates and configures this DependencyVectorSpace with the
default set of parameters. The default values are:
WordBasedBasisMapping is used for dimensions;
FlatPathWeight is used to weight accepted paths;
MinimumTemplateAcceptor is used to filter the paths in a sentence.
Parameters:
properties - The Properties setting the above options
pathLength - The maximum valid path length. Must be non-negative.
If zero, the maximum path length used by the DependencyPathAcceptor will be used.

public String getDimensionDescription(int dimension)
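Returns a description of the dependency path feature to which the provided dimension is mapped.
Specified by:
getDimensionDescription in interface DimensionallyInterpretableSemanticSpace<String>
Parameters:
dimension - a dimension number

public Set<String> getWords()
Returns the set of words that are represented in this semantic space.
Specified by:
getWords in interface SemanticSpace

public Vector getVector(String term)
Returns the semantic vector for the provided word.
Specified by:
getVector in interface SemanticSpace
Parameters:
term - a word that may be in the semantic space
Returns:
the Vector for the provided word or null if the word was not in the space.

public String getSpaceName()
Returns "DependencyVectorSpace" plus this instance's
configuration of a basis mapping, path weighting and path acceptor.
Specified by:
getSpaceName in interface SemanticSpace

public int getVectorLength()
Returns the length of vectors in this semantic space. ... processSpace is called.
Specified by:
getVectorLength in interface SemanticSpace

public void processDocument(BufferedReader document) throws IOException
Extracts all the parsed sentences in the document and then updates the
co-occurrence values for those paths matching the loaded set of
templates, according to this instance's BasisFunction. Path
occurrences are weighted using this instance's PathWeight.
Specified by:
processDocument in interface SemanticSpace
Parameters:
document - a reader that allows access to the text of the document
Throws:
IOException - if any error occurs while reading the document

public void processSpace(Properties properties)
Does nothing.
Specified by:
processSpace in interface SemanticSpace
Parameters:
properties - a set of properties and values that may be used to
configure any exposed parameters of the algorithm.

Copyright © 2012. All Rights Reserved.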