public class BloglinesCorpusReader extends DirectoryCorpusReader<Document>
A DirectoryCorpusReader for the bloglines corpus.
Modifier and Type | Class and Description |
---|---|
class | BloglinesCorpusReader.BloglinesIterator |
Nested classes inherited from class DirectoryCorpusReader: DirectoryCorpusReader.BaseFileIterator
Constructor and Description |
---|
BloglinesCorpusReader() Constructs a new BloglinesCorpusReader that uses no preprocessing before documents are returned. |
BloglinesCorpusReader(DocumentPreprocessor preprocessor) Constructs a new BloglinesCorpusReader that uses preprocessor to clean documents before they are returned. |
Modifier and Type | Method and Description |
---|---|
protected Iterator<Document> | corpusIterator(Iterator<File> files) |
Methods inherited from class DirectoryCorpusReader: initialize, read, read
public BloglinesCorpusReader()
Constructs a new BloglinesCorpusReader that uses no preprocessing before documents are returned.
public BloglinesCorpusReader(DocumentPreprocessor preprocessor)
Constructs a new BloglinesCorpusReader that uses preprocessor to clean documents before they are returned.
protected Iterator<Document> corpusIterator(Iterator<File> files)
Returns an Iterator over the documents contained in the File objects traversed by the files iterator. Sub-classes are encouraged to sub-class BaseFileIterator for the return value of this method.
corpusIterator in class DirectoryCorpusReader<Document>
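The pattern described above (an optional preprocessor supplied at construction time, and a corpusIterator method that turns an Iterator<File> into an Iterator of documents) can be illustrated with a minimal, self-contained sketch. Everything in it is a hypothetical stand-in, not the library's code: SketchDirectoryReader and SketchBlogReader are invented names, a java.util.function.UnaryOperator<String> plays the role of the DocumentPreprocessor, plain Strings play the role of Document, and each file's name is used in place of its parsed contents so the example runs without any corpus on disk.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.UnaryOperator;

/** Hypothetical stand-in for a directory-based corpus reader; not the library class. */
abstract class SketchDirectoryReader<D> {
    /** Subclasses decide how the traversed files become documents. */
    protected abstract Iterator<D> corpusIterator(Iterator<File> files);

    /** Illustrative entry point: iterate documents for an explicit list of files. */
    public Iterator<D> read(List<File> files) {
        return corpusIterator(files.iterator());
    }
}

/** Hypothetical reader that yields one String "document" per file. */
class SketchBlogReader extends SketchDirectoryReader<String> {
    // Assumption: a null preprocessor means "return documents unchanged".
    private final UnaryOperator<String> preprocessor;

    SketchBlogReader() {
        this(null);
    }

    SketchBlogReader(UnaryOperator<String> preprocessor) {
        this.preprocessor = preprocessor;
    }

    @Override
    protected Iterator<String> corpusIterator(final Iterator<File> files) {
        return new Iterator<String>() {
            @Override
            public boolean hasNext() {
                return files.hasNext();
            }

            @Override
            public String next() {
                if (!files.hasNext()) {
                    throw new NoSuchElementException();
                }
                // The file name stands in for the file's parsed contents.
                String text = files.next().getName();
                return preprocessor == null ? text : preprocessor.apply(text);
            }
        };
    }
}

public class CorpusIteratorSketch {
    /** Collects every remaining element of the iterator into a list. */
    static List<String> drain(Iterator<String> it) {
        List<String> out = new ArrayList<>();
        while (it.hasNext()) {
            out.add(it.next());
        }
        return out;
    }

    public static void main(String[] args) {
        List<File> files = Arrays.asList(new File("Post-One.xml"), new File("Post-Two.xml"));
        // No preprocessing: prints [Post-One.xml, Post-Two.xml]
        System.out.println(drain(new SketchBlogReader().read(files)));
        // Lower-casing preprocessor: prints [post-one.xml, post-two.xml]
        System.out.println(drain(new SketchBlogReader(String::toLowerCase).read(files)));
    }
}
```

Keeping the preprocessing step inside the returned iterator means documents are cleaned lazily, one at a time, as the caller consumes them. Whether the actual BloglinesIterator works this way is not stated in this page.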
Copyright © 2012. All Rights Reserved.