public abstract class DirectoryCorpusReader.BaseFileIterator extends Object implements Iterator<D>
Constructor and Description |
---|
DirectoryCorpusReader.BaseFileIterator(Iterator<File> filesToExplore): Creates a new DirectoryCorpusReader.BaseFileIterator. |
Modifier and Type | Method and Description |
---|---|
protected D | advance(): Returns the next complete document that is accessible by this DirectoryCorpusReader. |
protected abstract D | advanceInDoc(): Returns a new complete document extracted from currentDoc. |
protected String | cleanDoc(String document): Returns a cleaned version of the document if document processing is enabled; otherwise the document text is returned unmodified. |
boolean | hasNext() |
D | next() |
void | remove(): Throws UnsupportedOperationException if called. |
protected abstract void | setupCurrentDoc(File currentDoc): Sets up any data members needed to process the current file. |
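
The method summary suggests a template-method iterator: advance() walks the files, setupCurrentDoc(File) prepares each one, and advanceInDoc() pulls documents out of it. The following is a minimal sketch of that pattern, not the library's source. SketchFileIterator, LinePerDocIterator, and LinePerDocDemo are hypothetical names, and the lazy look-ahead and the convention that advanceInDoc() returns null when the current file is exhausted are assumptions made for illustration.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Small driver: writes two temporary files and prints one document per line.
class LinePerDocDemo {
    public static void main(String[] args) throws IOException {
        Path a = Files.createTempFile("corpus-a", ".txt");
        Path b = Files.createTempFile("corpus-b", ".txt");
        Files.write(a, Arrays.asList("first doc", "second doc"));
        Files.write(b, Arrays.asList("third doc"));
        Iterator<String> docs =
            new LinePerDocIterator(Arrays.asList(a.toFile(), b.toFile()).iterator());
        while (docs.hasNext()) System.out.println(docs.next());
    }
}

// Hypothetical stand-in that mirrors the documented method names.
// It is NOT the library's BaseFileIterator implementation.
abstract class SketchFileIterator<D> implements Iterator<D> {
    private final Iterator<File> filesToExplore;
    private D lookahead;      // next document to hand out; null once exhausted
    private boolean primed;   // assumption: the first document is fetched lazily
    private boolean fileOpen; // true after setupCurrentDoc has run at least once

    SketchFileIterator(Iterator<File> filesToExplore) {
        this.filesToExplore = filesToExplore;
    }

    // Returns the next document, or null once all files have been traversed.
    protected D advance() {
        D doc = fileOpen ? advanceInDoc() : null;
        while (doc == null && filesToExplore.hasNext()) {
            setupCurrentDoc(filesToExplore.next());
            fileOpen = true;
            doc = advanceInDoc();
        }
        return doc;
    }

    // Assumed convention: returns null when the current file has no more documents.
    protected abstract D advanceInDoc();

    protected abstract void setupCurrentDoc(File currentDoc);

    @Override
    public boolean hasNext() {
        if (!primed) {
            lookahead = advance();
            primed = true;
        }
        return lookahead != null;
    }

    @Override
    public D next() {
        if (!hasNext()) throw new NoSuchElementException();
        D doc = lookahead;
        lookahead = advance();
        return doc;
    }

    @Override
    public void remove() {
        throw new UnsupportedOperationException("read-only iterator");
    }
}

// Example subclass: every line of every file is treated as one document.
class LinePerDocIterator extends SketchFileIterator<String> {
    private BufferedReader reader; // null when no file is open

    LinePerDocIterator(Iterator<File> files) { super(files); }

    @Override
    protected void setupCurrentDoc(File currentDoc) {
        try {
            if (reader != null) reader.close();
            reader = Files.newBufferedReader(currentDoc.toPath(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override
    protected String advanceInDoc() {
        if (reader == null) return null;
        try {
            String line = reader.readLine();
            if (line == null) { reader.close(); reader = null; }
            return line;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In this sketch a subclass only decides how a single file is split into documents; moving from one file to the next and signalling the end with null stay in the base class.
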
public DirectoryCorpusReader.BaseFileIterator(Iterator<File> filesToExplore)
Creates a new DirectoryCorpusReader.BaseFileIterator.

public void remove()
Throws UnsupportedOperationException if called.

protected abstract D advanceInDoc()
Returns a new complete document extracted from currentDoc.

protected abstract void setupCurrentDoc(File currentDoc)
Sets up any data members needed to process the current file.

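protected String cleanDoc(String document)
Returns a cleaned version of the document if document processing is enabled; otherwise the document text is returned unmodified.
See Also: DocumentPreprocessor.process(String)
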
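
The cleanDoc contract (clean only when processing is enabled, otherwise pass the text through) can be illustrated with a small sketch. CleanDocSketch and CleanDocDemo are hypothetical names. The UnaryOperator stands in for DocumentPreprocessor.process(String), whose actual cleaning rules are not described here, and using a null processor to mean "processing disabled" is an assumption.

```java
import java.util.Locale;
import java.util.function.UnaryOperator;

// Small driver showing both the enabled and the disabled case.
class CleanDocDemo {
    public static void main(String[] args) {
        // Toy cleaner (assumption): trims, collapses whitespace, lowercases.
        UnaryOperator<String> toy =
            s -> s.trim().replaceAll("\\s+", " ").toLowerCase(Locale.ROOT);
        System.out.println(new CleanDocSketch(toy).cleanDoc("  Hello   World "));
        System.out.println(new CleanDocSketch(null).cleanDoc("  Hello   World "));
    }
}

// Hypothetical holder for the cleaning step; not the library's class.
class CleanDocSketch {
    // Stand-in for a document preprocessor; null means processing is disabled.
    private final UnaryOperator<String> processor;

    CleanDocSketch(UnaryOperator<String> processor) {
        this.processor = processor;
    }

    // Applies the processor when one is configured, otherwise returns the text unmodified.
    String cleanDoc(String document) {
        return (processor == null) ? document : processor.apply(document);
    }
}
```
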
protected D advance()
Returns the next complete document that is accessible by this DirectoryCorpusReader. If all files have been traversed then this will return null.

Copyright © 2012. All Rights Reserved.