public class SemEvalLexSubReader extends org.xml.sax.helpers.DefaultHandler implements CorpusReader<Document>
A CorpusReader that returns documents in the following format:
word_instance_id text ... ||| *focus_word* text ...
Note that this is implemented as a DefaultHandler for a SAXParser.
Modifier and Type | Class and Description |
---|---|
class | SemEvalLexSubReader.SemEvalHandler |
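
The document format described above can be illustrated with a plain SAX handler. The following is a minimal sketch, not the source of SemEvalLexSubReader or SemEvalHandler. The class name LexSubHandlerSketch and the XML element names (instance with an id attribute, context, head) are assumptions made for illustration; this page does not state the input schema. The page also does not say whether the asterisks around focus_word are literal, so the sketch omits them.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch: collects one "id text ||| focus text" string per instance.
class LexSubHandlerSketch extends DefaultHandler {
    private final List<String> documents = new ArrayList<>();
    private final StringBuilder current = new StringBuilder();
    private boolean inContext = false;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        if (qName.equals("instance")) {
            // Assumed: each instance carries its id as an attribute.
            current.setLength(0);
            current.append(attrs.getValue("id"));
        } else if (qName.equals("context")) {
            inContext = true;
            current.append(' ');
        } else if (qName.equals("head")) {
            // Assumed: the focus word is wrapped in a head element.
            current.append(" ||| ");
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // SAX may deliver text in several chunks, so append raw and normalize later.
        if (inContext) {
            current.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (qName.equals("context")) {
            inContext = false;
        } else if (qName.equals("instance")) {
            documents.add(current.toString().replaceAll("\\s+", " ").trim());
        }
    }

    // Parses an XML string and returns the collected document strings.
    static List<String> parse(String xml) {
        try {
            LexSubHandlerSketch handler = new LexSubHandlerSketch();
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), handler);
            return handler.documents;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse input", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<corpus><lexelt item=\"bright.a\"><instance id=\"1\">"
                + "<context>She has a <head>bright</head> future .</context>"
                + "</instance></lexelt></corpus>";
        for (String doc : parse(xml)) {
            System.out.println(doc); // prints: 1 She has a ||| bright future .
        }
    }
}
```

The sketch buffers all documents in a list for brevity; an iterator-based reader could instead hand them out one at a time.
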
Constructor and Description |
---|
SemEvalLexSubReader() |
Modifier and Type | Method and Description |
---|---|
Iterator<Document> | read(File file) Returns an Iterator that traverses the documents contained in the given file. |
Iterator<Document> | read(Reader reader) Returns an Iterator that traverses the documents contained in the given reader. |
Methods inherited from class org.xml.sax.helpers.DefaultHandler: characters, endDocument, endElement, endPrefixMapping, error, fatalError, ignorableWhitespace, notationDecl, processingInstruction, resolveEntity, setDocumentLocator, skippedEntity, startDocument, startElement, startPrefixMapping, unparsedEntityDecl, warning
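
A caller that iterates over the returned documents will usually need to split each one back into its parts. The following is a minimal sketch of that step, assuming the document text is available as a plain String in the format given in the class description. The class LexSubDocumentSketch and its field names are hypothetical and not part of this API. Because the page does not say whether the asterisks around the focus word are literal, the sketch strips them if present.

```java
// Hypothetical helper: splits "id text ... ||| focus text ..." into its parts.
class LexSubDocumentSketch {
    final String instanceId;
    final String leftContext;
    final String focusWord;
    final String rightContext;

    private LexSubDocumentSketch(String id, String left, String focus, String right) {
        this.instanceId = id;
        this.leftContext = left;
        this.focusWord = focus;
        this.rightContext = right;
    }

    static LexSubDocumentSketch parse(String document) {
        // Split once on the literal ||| separator.
        String[] halves = document.split("\\|\\|\\|", 2);
        if (halves.length != 2) {
            throw new IllegalArgumentException("missing ||| separator: " + document);
        }
        // The first token before the separator is the instance id.
        String[] left = halves[0].trim().split("\\s+", 2);
        // The first token after the separator is the focus word.
        String[] right = halves[1].trim().split("\\s+", 2);
        String focus = right[0].replaceAll("^\\*|\\*$", ""); // drop optional asterisks
        return new LexSubDocumentSketch(
                left[0],
                left.length > 1 ? left[1] : "",
                focus,
                right.length > 1 ? right[1] : "");
    }

    public static void main(String[] args) {
        LexSubDocumentSketch d = parse("1 She has a ||| *bright* future .");
        System.out.println(d.instanceId + " / " + d.focusWord); // prints: 1 / bright
    }
}
```
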
public Iterator<Document> read(Reader reader)
Returns an Iterator that traverses the documents contained in the given reader.
Specified by: read in interface CorpusReader<Document>
Parameters: reader - A Reader that will extract text from a data source, such as a URL, a File, a data stream, or any other source accessible via the Reader interface. Each CorpusReader should specify the expected text format, be it an XML schema or some other unique format.
public Iterator<Document> read(File file)
Returns an Iterator that traverses the documents contained in the given file.
Specified by: read in interface CorpusReader<Document>
Parameters: file - A text file holding documents in a format that is readable by a particular CorpusReader. This text file may have its own unique text structure or an XML format. Each CorpusReader should specify the expected text format.
Copyright © 2012. All Rights Reserved.