public abstract class DirectoryCorpusReader.BaseFileIterator extends Object implements Iterator<D>
| Constructor and Description |
|---|
| DirectoryCorpusReader.BaseFileIterator(Iterator<File> filesToExplore) Creates a new DirectoryCorpusReader.BaseFileIterator. |
| Modifier and Type | Method and Description |
|---|---|
| protected D | advance() Returns the next complete document that is accessible by this DirectoryCorpusReader. |
| protected abstract D | advanceInDoc() Returns a new complete document extracted from currentDoc. |
| protected String | cleanDoc(String document) Returns a cleaned version of the document if document processing is enabled; otherwise the document text is returned unmodified. |
| boolean | hasNext() |
| D | next() |
| void | remove() Throws UnsupportedOperationException if called. |
| protected abstract void | setupCurrentDoc(File currentDoc) Sets up any data members needed to process the current file. |
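
The summary above suggests a template-method design: subclasses supply setupCurrentDoc and advanceInDoc, while advance walks the file iterator. The following is a minimal, self-contained sketch of how those pieces might fit together. It does not use or reproduce the library's source: SketchFileIterator, LinePerDocIterator, and LinePerDocDemo are hypothetical names, and the convention that advanceInDoc returns null once the current file is exhausted is an assumption, not something the documentation states.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class LinePerDocDemo {

    /** Hypothetical stand-in for BaseFileIterator; not the library's implementation. */
    abstract static class SketchFileIterator<D> implements Iterator<D> {
        private final Iterator<File> filesToExplore;
        private D nextDoc;       // look-ahead buffer; null once the corpus is exhausted
        private boolean primed;  // first advance() is deferred until the subclass is built

        SketchFileIterator(Iterator<File> filesToExplore) {
            this.filesToExplore = filesToExplore;
        }

        /** ASSUMPTION: returns null when the current file has no more documents. */
        protected abstract D advanceInDoc();

        /** Prepares subclass state for reading documents out of currentDoc. */
        protected abstract void setupCurrentDoc(File currentDoc);

        /** Returns the next document, or null once all files have been traversed. */
        protected D advance() {
            D doc = advanceInDoc();
            while (doc == null && filesToExplore.hasNext()) {
                setupCurrentDoc(filesToExplore.next());
                doc = advanceInDoc();
            }
            return doc;
        }

        @Override
        public boolean hasNext() {
            if (!primed) {
                nextDoc = advance();
                primed = true;
            }
            return nextDoc != null;
        }

        @Override
        public D next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            D doc = nextDoc;
            nextDoc = advance();
            return doc;
        }

        @Override
        public void remove() {
            throw new UnsupportedOperationException();
        }
    }

    /** Hypothetical subclass that treats every line of a file as one document. */
    static class LinePerDocIterator extends SketchFileIterator<String> {
        private BufferedReader reader;  // null until a file has been set up

        LinePerDocIterator(Iterator<File> files) {
            super(files);
        }

        @Override
        protected void setupCurrentDoc(File currentDoc) {
            try {
                reader = Files.newBufferedReader(currentDoc.toPath());
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }

        @Override
        protected String advanceInDoc() {
            if (reader == null) {
                return null;  // no file open yet, or the previous one is finished
            }
            try {
                String line = reader.readLine();
                if (line == null) {  // end of file: release the reader
                    reader.close();
                    reader = null;
                }
                return line;
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    /** Writes two temporary files and collects every document the iterator yields. */
    static List<String> run() {
        try {
            File a = File.createTempFile("corpus-a", ".txt");
            File b = File.createTempFile("corpus-b", ".txt");
            a.deleteOnExit();
            b.deleteOnExit();
            Files.write(a.toPath(), Arrays.asList("first doc", "second doc"));
            Files.write(b.toPath(), Arrays.asList("third doc"));
            Iterator<String> docs = new LinePerDocIterator(Arrays.asList(a, b).iterator());
            List<String> seen = new ArrayList<>();
            while (docs.hasNext()) {
                seen.add(docs.next());
            }
            return seen;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run());  // prints [first doc, second doc, third doc]
    }
}
```

The sketch buffers one document ahead so that hasNext can answer without consuming anything, and it delays the first call to advance until hasNext or next is invoked so that subclass fields are initialized before advanceInDoc runs. Whether the real class primes its buffer eagerly or lazily is not stated in this page.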
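public DirectoryCorpusReader.BaseFileIterator(Iterator<File> filesToExplore)
Creates a new DirectoryCorpusReader.BaseFileIterator.

public void remove()
Throws UnsupportedOperationException if called.

protected abstract D advanceInDoc()
Returns a new complete document extracted from currentDoc.

protected abstract void setupCurrentDoc(File currentDoc)
Sets up any data members needed to process the current file.

protected String cleanDoc(String document)
Returns a cleaned version of the document if document processing is enabled; otherwise the document text is returned unmodified.
See also: DocumentPreprocessor.process(String)

protected D advance()
Returns the next complete document that is accessible by this DirectoryCorpusReader. If all files have been traversed, this returns null.

Copyright © 2012. All Rights Reserved.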