public class SemEvalLexSubReader extends org.xml.sax.helpers.DefaultHandler implements CorpusReader<Document>
CorpusReader returns documents in the following format:
word_instance_id text ... ||| *focus_word* text ...
Note that this is implemented as a DefaultHandler for a SAXParser.

| Modifier and Type | Class and Description |
|---|---|
| class | SemEvalLexSubReader.SemEvalHandler |
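
The note above says the reader is built as a DefaultHandler for a SAXParser. The following is a minimal sketch of that general pattern, not the actual SemEvalHandler: the class name `SaxHandlerSketch`, the `context` element name, and the sample XML are all hypothetical and chosen only for illustration. It uses only the standard `javax.xml.parsers` and `org.xml.sax` APIs.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects the text of every <context> element.
// The element name is an assumption, not the real SemEval schema.
class SaxHandlerSketch extends DefaultHandler {
    private final List<String> contexts = new ArrayList<>();
    private final StringBuilder buffer = new StringBuilder();
    private boolean inContext = false;

    @Override
    public void startElement(String uri, String localName,
                             String qName, Attributes atts) {
        if ("context".equals(qName)) {
            inContext = true;
            buffer.setLength(0); // start collecting a new document
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // SAX may deliver text in several chunks, so append rather than assign.
        if (inContext) {
            buffer.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("context".equals(qName)) {
            inContext = false;
            contexts.add(buffer.toString().trim());
        }
    }

    // Parses the given XML string and returns the collected element texts.
    static List<String> parse(String xml) throws Exception {
        SaxHandlerSketch handler = new SaxHandlerSketch();
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return handler.contexts;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample input.
        String xml = "<corpus><context>a bright day</context>"
                   + "<context>a side street</context></corpus>";
        for (String text : parse(xml)) {
            System.out.println(text);
        }
    }
}
```

A reader built this way can buffer the collected texts during parsing and hand them out through an Iterator afterwards, which matches the shape of the read methods listed below.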
| Constructor and Description |
|---|
| SemEvalLexSubReader() |
| Modifier and Type | Method and Description |
|---|---|
| Iterator<Document> | read(File file) Returns an Iterator that traverses the documents contained in the given file. |
| Iterator<Document> | read(Reader reader) Returns an Iterator that traverses the documents contained in the given reader. |
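
Each document returned by these iterators is described above as having the form `word_instance_id text ... ||| *focus_word* text ...`. The sketch below shows one way such a line could be picked apart once its text is available as a String. It assumes that the instance id is the first whitespace-separated token and that the focus word is the first token wrapped in asterisks after the `|||` separator; that reading of the format, the class name `LexSubLineSketch`, and the sample line are assumptions, not part of the library.

```java
// Hypothetical helper for lines shaped like:
//   word_instance_id text ... ||| *focus_word* text ...
class LexSubLineSketch {

    // Returns the first whitespace-separated token before the "|||" separator.
    static String instanceId(String line) {
        String head = line.split("\\|\\|\\|", 2)[0].trim();
        return head.split("\\s+", 2)[0];
    }

    // Returns the first token after "|||" that is wrapped in asterisks,
    // with the asterisks removed, or null if there is none.
    static String focusWord(String line) {
        String[] parts = line.split("\\|\\|\\|", 2);
        if (parts.length < 2) {
            return null; // no separator found
        }
        for (String token : parts[1].trim().split("\\s+")) {
            if (token.length() > 2 && token.startsWith("*") && token.endsWith("*")) {
                return token.substring(1, token.length() - 1);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Hypothetical sample line, not taken from the SemEval data.
        String line = "bright.a.1 words before ||| *bright* words after";
        System.out.println(instanceId(line));
        System.out.println(focusWord(line));
    }
}
```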
Methods inherited from class org.xml.sax.helpers.DefaultHandler:
characters, endDocument, endElement, endPrefixMapping, error, fatalError, ignorableWhitespace, notationDecl, processingInstruction, resolveEntity, setDocumentLocator, skippedEntity, startDocument, startElement, startPrefixMapping, unparsedEntityDecl, warning

public Iterator<Document> read(Reader reader)
Returns an Iterator that traverses the documents contained in the given reader.
Specified by: read in interface CorpusReader<Document>
Parameters: reader - A Reader that will extract text from a data source, such as a URL, a File, a data stream, or any other source accessible via the Reader interface. Each CorpusReader should specify the expected text format, be it an XML schema or some other unique format.

public Iterator<Document> read(File file)
Returns an Iterator that traverses the documents contained in the given file.
Specified by: read in interface CorpusReader<Document>
Parameters: file - A text file holding documents in a format that is readable by a particular CorpusReader. This text file may have its own unique text structure or an XML format. Each CorpusReader should specify the expected text format.

Copyright © 2012. All Rights Reserved.