| Package | Description |
|---|---|
| edu.ucla.sspace.dependency | |
| edu.ucla.sspace.mains | |
| edu.ucla.sspace.text | |
| edu.ucla.sspace.text.corpora | |

| Class | Description |
|---|---|
| Stemmer | An interface for classes that stem tokens. |
| TokenFilter | A utility for asserting what tokens are valid and invalid within a stream of tokens. |
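
As a rough illustration of how a token filter and a stemmer might be combined, the sketch below defines its own stand-in interfaces. `SketchStemmer`, `SketchTokenFilter`, `TokenPipelineSketch`, and all method names are assumptions made for this example; they are not taken from this page, and the actual `Stemmer` and `TokenFilter` signatures may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a stemming interface (assumed signature).
interface SketchStemmer {
    String stem(String token);
}

// Hypothetical stand-in for a token validity check (assumed signature).
interface SketchTokenFilter {
    boolean accept(String token);
}

final class TokenPipelineSketch {
    // Deliberately naive stemmer for illustration: drops one trailing "s"
    // from tokens longer than three characters.
    static final SketchStemmer PLURAL_STRIPPER =
        token -> (token.length() > 3 && token.endsWith("s"))
            ? token.substring(0, token.length() - 1)
            : token;

    // Builds a filter that rejects every token in the given stop list.
    static SketchTokenFilter excluding(Set<String> stopWords) {
        return token -> !stopWords.contains(token);
    }

    // Keeps only the tokens the filter accepts, then stems each survivor.
    static List<String> process(List<String> tokens,
                                SketchTokenFilter filter,
                                SketchStemmer stemmer) {
        List<String> result = new ArrayList<>();
        for (String token : tokens) {
            if (filter.accept(token)) {
                result.add(stemmer.stem(token));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Set<String> stopWords = new HashSet<>(Arrays.asList("the"));
        List<String> tokens = Arrays.asList("the", "cats", "chase", "mice");
        System.out.println(process(tokens, excluding(stopWords), PLURAL_STRIPPER));
    }
}
```

Filtering before stemming means the stop list is matched against the surface forms of the tokens; a pipeline that stems first would need a stemmed stop list instead.
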

| Class | Description |
|---|---|
| Document | An abstraction for a document that allows document processors to access text in a uniform manner. |
| TemporalDocument | An abstraction for a document that allows document processors to access time-annotated text in a uniform manner. |
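
To make "access text in a uniform manner" concrete, here is a minimal sketch of a processor that depends only on a document abstraction rather than on where the text is stored. The interfaces `SketchDocument` and `SketchTemporalDocument`, the methods `reader()` and `timeStamp()`, and the inheritance between the two are assumptions for this example, not the signatures documented for `Document` and `TemporalDocument`.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical stand-in: exposes the document text as a reader.
interface SketchDocument {
    BufferedReader reader();
}

// Hypothetical stand-in: the same text access plus a time annotation.
interface SketchTemporalDocument extends SketchDocument {
    long timeStamp();
}

// A document backed by an in-memory String (illustrative only).
final class StringBackedDocument implements SketchTemporalDocument {
    private final String text;
    private final long timeStamp;

    StringBackedDocument(String text, long timeStamp) {
        this.text = text;
        this.timeStamp = timeStamp;
    }

    @Override
    public BufferedReader reader() {
        return new BufferedReader(new StringReader(text));
    }

    @Override
    public long timeStamp() {
        return timeStamp;
    }
}

final class DocumentSketch {
    // Counts whitespace-separated tokens; works for any SketchDocument,
    // regardless of how the underlying text is stored.
    static int countTokens(SketchDocument doc) {
        int count = 0;
        try (BufferedReader br = doc.reader()) {
            String line;
            while ((line = br.readLine()) != null) {
                String trimmed = line.trim();
                if (!trimmed.isEmpty()) {
                    count += trimmed.split("\\s+").length;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    public static void main(String[] args) {
        SketchTemporalDocument doc =
            new StringBackedDocument("first line\nsecond", 1325376000000L);
        System.out.println(countTokens(doc) + " tokens at " + doc.timeStamp());
    }
}
```

Because `countTokens` only sees the reader, a file-backed or network-backed document could be swapped in without changing the processing code.
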

| Class | Description |
|---|---|
| BloglinesCorpusReader | A DirectoryCorpusReader for the bloglines corpus. |
| CorpusReader | A basic interface for setting up a CorpusReader, which reads uncleaned text from corpus files and transforms it into an appropriately cleaned Document instance. |
| DirectoryCorpusReader | An abstract base class for corpus reading iterators that need to traverse a large nested directory structure to find files containing text. |
| DirectoryCorpusReader.BaseFileIterator | |
| Document | An abstraction for a document that allows document processors to access text in a uniform manner. |
| DocumentPreprocessor | A class for preprocessing all types of documents. |
| LabeledDocument | An abstraction for a document that has an accompanying label or name. |
| LabeledParsedDocument | A union interface for a document that has been (or will be) dependency parsed to generate an accompanying parse tree of its contents and that has an accompanying label about its source or contents. |
| LabeledStringDocument | A LabeledDocument implementation backed by a String whose contents are used for the document text. |
| ParsedDocument | An abstraction for a document that has been (or will be) dependency parsed to generate an accompanying parse tree of its contents. |
| Stemmer | An interface for classes that stem tokens. |
| StringDocument | A Document implementation backed by a String whose contents are used for the document text. |
| TemporalDocument | An abstraction for a document that allows document processors to access time-annotated text in a uniform manner. |
| TokenFilter | A utility for asserting what tokens are valid and invalid within a stream of tokens. |
| UsenetCorpusReader | |

| Class | Description |
|---|---|
| CorpusReader | A basic interface for setting up a CorpusReader, which reads uncleaned text from corpus files and transforms it into an appropriately cleaned Document instance. |
| DirectoryCorpusReader | An abstract base class for corpus reading iterators that need to traverse a large nested directory structure to find files containing text. |
| DirectoryCorpusReader.BaseFileIterator | |
| Document | An abstraction for a document that allows document processors to access text in a uniform manner. |
| DocumentPreprocessor | A class for preprocessing all types of documents. |
| TemporalDocument | An abstraction for a document that allows document processors to access time-annotated text in a uniform manner. |

Copyright © 2012. All Rights Reserved.