public class PukWaCDocumentIterator extends Object implements Iterator<LabeledParsedDocument>
An Iterator in which each returned Document contains a single
dependency parsed sentence, read from a file in the CoNLL Format that
is embedded in the XML format provided in the WaCkypedia corpus. See the WaCky group's
website for more information on the PukWaC.
| Constructor and Description |
|---|
| PukWaCDocumentIterator(String documentsFile) - Creates an Iterator over the file where each document returned contains the sequence of dependency parsed words composing a sentence. |
| Modifier and Type | Method and Description |
|---|---|
| boolean | hasNext() - Returns true if there are more documents to return. |
| LabeledParsedDocument | next() - Returns the next document from the file. |
| void | remove() - Throws an UnsupportedOperationException if called. |
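
The contract summarized above (a read-ahead `hasNext()`, a `next()` that yields one sentence, and a `remove()` that always throws) can be illustrated with a small stand-alone sketch. This is not the library's implementation: `SentenceBlockIterator` is a hypothetical class, it returns plain lists of token lines rather than `LabeledParsedDocument` objects, and it assumes that sentences are delimited by `<s>` and `</s>` lines with one tab-separated token per line in between. Throwing `NoSuchElementException` when the input is exhausted follows the general `Iterator` contract and is likewise an assumption about this class.

```java
import java.io.BufferedReader;
import java.io.IOError;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/**
 * Hypothetical stand-in (not the S-Space class): yields one sentence at a
 * time as the list of token lines found between <s> and </s>.
 */
class SentenceBlockIterator implements Iterator<List<String>> {
    private final BufferedReader reader;
    private List<String> nextSentence;   // read-ahead buffer; null at end of input

    SentenceBlockIterator(BufferedReader reader) {
        this.reader = reader;
        this.nextSentence = advance();
    }

    /** Reads ahead to the next complete sentence, or returns null at end of input. */
    private List<String> advance() {
        try {
            List<String> tokens = null;
            String line;
            while ((line = reader.readLine()) != null) {
                line = line.trim();
                if (line.equals("<s>")) {
                    tokens = new ArrayList<>();          // a new sentence starts
                } else if (line.equals("</s>")) {
                    if (tokens != null) return tokens;   // the sentence is complete
                } else if (tokens != null && !line.isEmpty() && !line.startsWith("<")) {
                    tokens.add(line);                    // token line inside a sentence
                }
                // Any other line (for example an article-level tag) is skipped.
            }
            return null;
        } catch (IOException e) {
            // Mirrors the documented behavior of reporting read failures as IOError.
            throw new IOError(e);
        }
    }

    @Override
    public boolean hasNext() {
        return nextSentence != null;
    }

    @Override
    public List<String> next() {
        if (nextSentence == null) throw new NoSuchElementException();
        List<String> current = nextSentence;
        nextSentence = advance();
        return current;
    }

    @Override
    public void remove() {
        throw new UnsupportedOperationException("remove is not supported");
    }

    public static void main(String[] args) {
        // Illustrative input only; the tag attributes and labels are made up.
        String sample = "<text id=\"example\">\n<s>\n"
            + "The\tthe\tDT\t1\t2\tNMOD\ncat\tcat\tNN\t2\t3\tSBJ\n"
            + "sleeps\tsleep\tVVZ\t3\t0\tROOT\n</s>\n</text>\n";
        Iterator<List<String>> it =
            new SentenceBlockIterator(new BufferedReader(new StringReader(sample)));
        while (it.hasNext()) {
            System.out.println(it.next().size() + " tokens");
        }
    }
}
```

Reading one sentence ahead keeps `hasNext()` free of side effects: it only inspects the buffer, and all I/O happens in the constructor and in `next()`.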
public PukWaCDocumentIterator(String documentsFile)
Creates an Iterator over the file where each document returned
contains the sequence of dependency parsed words composing a sentence.
Parameters: documentsFile - the name of the PukWaC file containing dependency
parsed sentences in the CoNLL
Format separated by XML tags for the sentences and articles
from which they came
Throws: IOError - if any error occurs when reading documentsFile

public boolean hasNext()
Returns true if there are more documents to return.
Specified by: hasNext in interface Iterator<LabeledParsedDocument>

public LabeledParsedDocument next()
Returns the next document from the file.
Specified by: next in interface Iterator<LabeledParsedDocument>

public void remove()
Throws an UnsupportedOperationException if called.
Specified by: remove in interface Iterator<LabeledParsedDocument>

Copyright © 2012. All Rights Reserved.