Package | Description |
---|---|
edu.ucla.sspace.mains | |
edu.ucla.sspace.text | |
edu.ucla.sspace.text.corpora | |
Modifier and Type | Method and Description |
---|---|
protected Iterator<Document> |
GenericWordsiMain.getDocumentIterator() |
protected Iterator<Document> |
GenericMain.getDocumentIterator()
Returns the iterator for all of the documents specified on the command
line or throws an
Error if no documents are specified. |
protected Iterator<Document> |
GrefenstetteMain.getDocumentIterator() |
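The listing does not show how `getDocumentIterator()` combines several per-file iterators into the single iterator it returns. The following is a minimal sketch of one way to do that, not the library's implementation: `ChainedIterator` and `ChainedIteratorDemo` are hypothetical names, and plain strings stand in for `Document` instances. It mirrors the documented behavior of throwing an `Error` when no documents are specified.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical helper (not part of the S-Space API): chains iterators end to end. */
class ChainedIterator<T> implements Iterator<T> {
    private final Iterator<Iterator<T>> sources;
    private Iterator<T> current = null;

    ChainedIterator(Collection<Iterator<T>> iters) {
        // Assumption: an empty collection means no documents were specified.
        if (iters.isEmpty())
            throw new Error("no documents were specified");
        this.sources = new ArrayList<>(iters).iterator();
    }

    @Override
    public boolean hasNext() {
        // Skip over exhausted (or empty) sources until one has an element.
        while ((current == null || !current.hasNext()) && sources.hasNext())
            current = sources.next();
        return current != null && current.hasNext();
    }

    @Override
    public T next() {
        if (!hasNext())
            throw new NoSuchElementException();
        return current.next();
    }
}

class ChainedIteratorDemo {
    public static void main(String[] args) {
        List<Iterator<String>> docIters = new ArrayList<>();
        docIters.add(Arrays.asList("doc one", "doc two").iterator());
        docIters.add(Arrays.asList("doc three").iterator());
        Iterator<String> all = new ChainedIterator<>(docIters);
        while (all.hasNext())
            System.out.println(all.next());
    }
}
```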
Modifier and Type | Method and Description |
---|---|
protected void |
GenericMain.addCorpusReaderIterators(Collection<Iterator<Document>> docIters,
String[] fileNames)
Adds a corpus reader for each file listed.
|
protected void |
DependencyGenericMain.addDocIterators(Collection<Iterator<Document>> docIters,
String[] fileNames)
Adds a
DependencyFileDocumentIterator to docIters for
each file name provided. |
protected void |
DVWordsiMain.addDocIterators(Collection<Iterator<Document>> docIters,
String[] fileNames)
Adds DependencyFileDocumentIterator instances for each file name provided. |
protected void |
GenericMain.addDocIterators(Collection<Iterator<Document>> docIters,
String[] fileNames)
Adds a
OneLinePerDocumentIterator to docIters for each
file name provided. |
protected void |
DependencyGenericMain.addFileIterators(Collection<Iterator<Document>> docIters,
String[] fileNames)
Throws
UnsupportedOperationException . |
protected void |
DVWordsiMain.addFileIterators(Collection<Iterator<Document>> docIters,
String[] fileNames)
Throws
UnsupportedOperationException . |
protected void |
GenericMain.addFileIterators(Collection<Iterator<Document>> docIters,
String[] fileNames)
Adds a
FileListDocumentIterator to docIters for each file
name provided. |
protected void |
GenericMain.parseDocumentsMultiThreaded(SemanticSpace sspace,
Iterator<Document> docIter,
int numThreads)
Calls processDocument once for every document in docIter, using the
specified number of threads to call processSpace on the SemanticSpace instance. |
protected void |
GenericMain.parseDocumentsSingleThreaded(SemanticSpace sspace,
Iterator<Document> docIter)
Calls
processDocument once for every document in docIter using a
single thread to interact with the SemanticSpace instance. |
protected void |
GenericMain.processDocumentsAndSpace(SemanticSpace space,
Iterator<Document> docIter,
int numThreads,
Properties props)
Processes all the documents held by the iterator and then processes the space.
|
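The multi-threaded variant above shares one document iterator among several threads. As a rough illustration of that pattern, and not the actual `GenericMain` code, the sketch below uses a hypothetical `ParallelDocs.processAll` helper: each worker takes the next document under a lock and then processes it outside the lock. Strings stand in for `Document` instances and a `Consumer` stands in for the semantic space's `processDocument` call.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

/** Hypothetical helper (not part of the S-Space API). */
class ParallelDocs {
    /**
     * Passes every element of docIter to processDocument exactly once,
     * spreading the work over numThreads worker threads.
     */
    static <D> void processAll(Iterator<D> docIter, int numThreads,
                               Consumer<D> processDocument) {
        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < numThreads; i++) {
            Thread t = new Thread(() -> {
                while (true) {
                    D doc;
                    // Iterators are generally not thread-safe, so hasNext()
                    // and next() are performed as one step under a lock.
                    synchronized (docIter) {
                        if (!docIter.hasNext())
                            return;
                        doc = docIter.next();
                    }
                    // Processing happens outside the lock so workers overlap.
                    processDocument.accept(doc);
                }
            });
            workers.add(t);
            t.start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
    }
}

class ParallelDocsDemo {
    public static void main(String[] args) {
        AtomicInteger tokens = new AtomicInteger();
        List<String> docs = Arrays.asList("a b", "c", "d e f");
        ParallelDocs.processAll(docs.iterator(), 2,
                doc -> tokens.addAndGet(doc.split(" ").length));
        System.out.println("tokens seen: " + tokens.get());
    }
}
```

In this sketch the consumer must itself be safe to call from several threads, which is why the demo accumulates into an `AtomicInteger`.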
Modifier and Type | Interface and Description |
---|---|
interface |
CorpusReader<D extends Document>
A basic interface for setting up a
CorpusReader, which reads uncleaned
text from corpus files and transforms it into an appropriately
cleaned Document instance. |
class |
DirectoryCorpusReader<D extends Document>
An abstract base class for corpus reading iterators that need to traverse
through a large nested directory structure to find files containing text.
|
Modifier and Type | Interface and Description |
---|---|
interface |
AnnotatedDocument
An abstraction for a document that allows document processors to access text
in a uniform manner.
|
interface |
LabeledDocument
An abstraction for a document that has an accompanying label or name.
|
interface |
LabeledParsedDocument
A union interface for a document that has been (or will be) dependency parsed
to generate an accompanying parse tree of its contents and that has an
accompanying label about its source or contents.
|
interface |
ParsedDocument
An abstraction for a document that has been (or will be) dependency parsed to
generate an accompanying parse tree of its contents.
|
interface |
TemporalDocument
An abstraction for a document that allows document processors to access
time-annotated text in a uniform manner.
|
Modifier and Type | Class and Description |
---|---|
class |
FileDocument
A
Document implementation backed by a File whose contents are
used for the document text. |
class |
LabeledParsedStringDocument
An abstraction for a document that has been (or will be) dependency parsed to
generate an accompanying parse tree of its contents.
|
class |
LabeledStringDocument
A
LabeledDocument implementation backed by a String whose
contents are used for the document text. |
class |
StringDocument
A
Document implementation backed by a String whose contents
are used for the document text. |
class |
TemporalFileDocument
A
TemporalDocument implementation backed by a File whose
contents are used for the document text. |
class |
TemporalStringDocument
A
TemporalDocument implementation backed by a String whose
contents are used for the document text. |
Modifier and Type | Method and Description |
---|---|
protected Document |
UsenetCorpusReader.InnerIterator.advanceInDoc()
Iterates over the utterances in a file and appends the words to create a
new document.
|
protected Document |
BloglinesCorpusReader.InnerIterator.advanceInDoc()
Iterates over the utterances in a file and appends the words to create a
new document.
|
protected Document |
ChildesCorpusReader.InnerIterator.advanceInDoc()
Iterates over the utterances in a file and appends the words to create a
new document.
|
protected Document |
SenseEvalDependencyCorpusReader.InnerIterator.advanceInDoc()
Iterates over the instances in a file and appends the words to create a
new document.
|
Document |
DependencyFileDocumentIterator.next()
Returns the next document from the file.
|
Document |
WaCkypediaDocumentIterator.next()
Returns the next document from the file.
|
Document |
UkWacDependencyFileIterator.next()
Returns the next document from the file.
|
Document |
FileListDocumentIterator.next()
Returns the next document from the list.
|
Document |
OneLinePerDocumentIterator.next()
Returns the next document from the file.
|
Document |
LimitedOneLinePerDocumentIterator.next()
Returns the next document from the file.
|
Document |
BufferedFileListDocumentIterator.next()
Returns the next document from the list.
|
Modifier and Type | Method and Description |
---|---|
protected Iterator<Document> |
UsenetCorpusReader.corpusIterator(Iterator<File> fileIter) |
protected Iterator<Document> |
BloglinesCorpusReader.corpusIterator(Iterator<File> fileIter) |
protected Iterator<Document> |
ChildesCorpusReader.corpusIterator(Iterator<File> fileIter) |
protected Iterator<Document> |
SenseEvalDependencyCorpusReader.corpusIterator(Iterator<File> fileIter) |
Constructor and Description |
---|
LimitedOneLinePerDocumentIterator(Iterator<Document> iter,
int docLimit,
boolean useMultipleResets)
Constructs an
Iterator for the documents contained in the
provided file. |
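The constructor above wraps an existing iterator and takes a `docLimit`. A minimal sketch of the general idea of capping how many elements a wrapped iterator returns follows. `LimitedIterator` is a hypothetical name, strings stand in for `Document` instances, and the `useMultipleResets` flag is deliberately not modeled because its semantics are not described in this listing.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Hypothetical wrapper (not part of the S-Space API): stops after limit elements. */
class LimitedIterator<T> implements Iterator<T> {
    private final Iterator<T> base;
    private final int limit;
    private int returned = 0;

    LimitedIterator(Iterator<T> base, int limit) {
        if (limit < 0)
            throw new IllegalArgumentException("limit must be non-negative");
        this.base = base;
        this.limit = limit;
    }

    @Override
    public boolean hasNext() {
        // Exhausted when either the cap is reached or the base runs dry.
        return returned < limit && base.hasNext();
    }

    @Override
    public T next() {
        if (!hasNext())
            throw new NoSuchElementException();
        returned++;
        return base.next();
    }
}

class LimitedIteratorDemo {
    public static void main(String[] args) {
        Iterator<String> docs = Arrays.asList("d1", "d2", "d3", "d4").iterator();
        Iterator<String> firstTwo = new LimitedIterator<>(docs, 2);
        while (firstTwo.hasNext())
            System.out.println(firstTwo.next());
    }
}
```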
Modifier and Type | Method and Description |
---|---|
protected Document |
UsenetCorpusReader.UseNetIterator.advanceInDoc()
Iterates over the utterances in a file and appends the words to
create a new document.
|
protected Document |
BloglinesCorpusReader.BloglinesIterator.advanceInDoc()
Iterates over the utterances in a file and appends the words to
create a new document.
|
protected Document |
ChildesCorpusReader.ChildesFileIterator.advanceInDoc()
Iterates over the utterances in a file and appends the words to
create a new document.
|
Document |
PukWacCorpusReader.UkWacIterator.next() |
Document |
PukWacDependencyCorpusReader.UkWacIterator.next() |
Document |
SenseEvalDependencyCorpusReader.SenseEvalIterator.next() |
Modifier and Type | Method and Description |
---|---|
protected Iterator<Document> |
UsenetCorpusReader.corpusIterator(Iterator<File> files) |
protected Iterator<Document> |
BloglinesCorpusReader.corpusIterator(Iterator<File> files)
|
protected Iterator<Document> |
ChildesCorpusReader.corpusIterator(Iterator<File> files)
|
Iterator<Document> |
PukWacCorpusReader.read(File file)
Returns an
Iterator that traverses the documents contained in
the given file. |
Iterator<Document> |
SemEvalCorpusReader.read(File file)
Returns an
Iterator that traverses the documents contained in
the given file. |
Iterator<Document> |
PukWacDependencyCorpusReader.read(File file)
Returns an
Iterator that traverses the documents contained in
the given file. |
Iterator<Document> |
SemEvalLexSubReader.read(File file)
Returns an
Iterator that traverses the documents contained in
the given file. |
Iterator<Document> |
SenseEvalDependencyCorpusReader.read(File file)
Returns an
Iterator that traverses the documents contained in
the given file. |
Iterator<Document> |
PukWacCorpusReader.read(Reader baseReader)
Returns an
Iterator that traverses the documents contained in
baseReader. |
Iterator<Document> |
SemEvalCorpusReader.read(Reader reader)
Returns an
Iterator that traverses the documents contained in
reader. |
Iterator<Document> |
PukWacDependencyCorpusReader.read(Reader baseReader)
Returns an
Iterator that traverses the documents contained in
baseReader. |
Iterator<Document> |
SemEvalLexSubReader.read(Reader reader)
Returns an
Iterator that traverses the documents contained in
reader. |
Iterator<Document> |
SenseEvalDependencyCorpusReader.read(Reader docReader)
Returns an
Iterator that traverses the documents contained in
docReader. |
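The `read(Reader)` methods above each turn a character stream into an iterator of documents. As an illustration of that shape only, the sketch below assumes the simplest possible corpus format, one document per line, which is not the format of any of the corpora listed. `LineDocReader` is a hypothetical class and strings stand in for `Document` instances.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Hypothetical reader (not part of the S-Space API): one document per line. */
class LineDocReader {
    static Iterator<String> read(Reader baseReader) {
        final BufferedReader br = new BufferedReader(baseReader);
        return new Iterator<String>() {
            // Read one line ahead so hasNext() can answer without side effects.
            private String nextLine = advance();

            private String advance() {
                try {
                    return br.readLine();
                } catch (IOException e) {
                    // Iterator methods cannot throw checked exceptions.
                    throw new UncheckedIOException(e);
                }
            }

            @Override
            public boolean hasNext() {
                return nextLine != null;
            }

            @Override
            public String next() {
                if (nextLine == null)
                    throw new NoSuchElementException();
                String doc = nextLine;
                nextLine = advance();
                return doc;
            }
        };
    }
}

class LineDocReaderDemo {
    public static void main(String[] args) {
        Iterator<String> docs =
            LineDocReader.read(new StringReader("first document\nsecond document\n"));
        while (docs.hasNext())
            System.out.println(docs.next());
    }
}
```

A real corpus reader would additionally clean the raw text and wrap it in a `Document` implementation before returning it.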
Copyright © 2012. All Rights Reserved.