gov.llnl.ontology.text.corpora
Class SemEval2010TrainDocumentReader

java.lang.Object
  extended by gov.llnl.ontology.text.corpora.SemEval2010TrainDocumentReader
All Implemented Interfaces:
DocumentReader
Direct Known Subclasses:
SemEval2010TestDocumentReader

public class SemEval2010TrainDocumentReader
extends Object
implements DocumentReader

A DocumentReader for the SemEval2010 test corpus. It uses the instance name as the key, and the title is just the keyterm. The id is the token index of the word that matches the title when both are stemmed. It does not generate any labels for a document.
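The id rule described above can be illustrated with a minimal sketch. This is not the class's actual implementation: the `StemMatchSketch` class, its `stem` helper (a toy suffix stripper), and the whitespace tokenization are all assumptions made for illustration, since the stemmer and tokenizer used by the reader are not documented here.

```java
import java.util.Locale;

class StemMatchSketch {

    // Hypothetical toy stemmer: lowercases the word and strips one common
    // English suffix. The real reader's stemmer is not specified here.
    static String stem(String word) {
        String w = word.toLowerCase(Locale.ROOT);
        for (String suffix : new String[] {"ing", "ed", "es", "s"}) {
            if (w.endsWith(suffix) && w.length() > suffix.length() + 2) {
                return w.substring(0, w.length() - suffix.length());
            }
        }
        return w;
    }

    // Returns the index of the first whitespace-separated token whose stem
    // equals the stem of the key term, or -1 if no token matches.
    static int findKeyIndex(String text, String keyTerm) {
        String target = stem(keyTerm);
        String[] tokens = text.trim().split("\\s+");
        for (int i = 0; i < tokens.length; i++) {
            if (stem(tokens[i]).equals(target)) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // "explained" and "explain" share the stem "explain", so this prints 2.
        System.out.println(
            findKeyIndex("the committee explained the new rules", "explain"));
    }
}
```

Comparing stems rather than surface forms lets an inflected occurrence such as "explained" match the keyterm "explain".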

This class is not thread safe.
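One common way to use a reader that is not thread safe from several threads is to give each thread its own instance. The sketch below shows that pattern with `ThreadLocal`. `StubReader` is a hypothetical stand-in for SemEval2010TrainDocumentReader so that the example compiles on its own; in real code the supplier would construct the actual reader instead.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class PerThreadReaderSketch {

    // Hypothetical stand-in for a reader with mutable internal state,
    // which is what makes sharing one instance across threads unsafe.
    static class StubReader {
        private final StringBuilder scratch = new StringBuilder();

        String read(String doc) {
            scratch.setLength(0);
            scratch.append(doc.trim());
            return scratch.toString();
        }
    }

    // Each thread lazily creates and then reuses its own reader.
    static final ThreadLocal<StubReader> READER =
        ThreadLocal.withInitial(StubReader::new);

    // Runs threadCount threads and reports how many distinct reader
    // instances they used in total.
    static int distinctReadersUsed(int threadCount) {
        Set<StubReader> seen = ConcurrentHashMap.newKeySet();
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            threads[i] = new Thread(() -> {
                StubReader reader = READER.get();
                reader.read("  some instance text  ");
                seen.add(reader);
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctReadersUsed(4));
    }
}
```

Creating a new reader per call would also work; the per-thread approach only avoids repeated construction.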

Author:
Keith Stevens

Field Summary
static String CORPUS_NAME
           
 
Constructor Summary
SemEval2010TrainDocumentReader()
          Constructs a new SemEval2010TrainDocumentReader.
 
Method Summary
 String corpusName()
          Returns CORPUS_NAME.
 Document readDocument(String doc)
          Returns a Document represented by the given string.
 Document readDocument(String doc, String corpusName)
          Returns a Document represented by the given string and uses corpusName as the corpus name for the returned Document.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

CORPUS_NAME

public static final String CORPUS_NAME
See Also:
Constant Field Values
Constructor Detail

SemEval2010TrainDocumentReader

public SemEval2010TrainDocumentReader()
Constructs a new SemEval2010TrainDocumentReader.

Method Detail

corpusName

public String corpusName()
Returns CORPUS_NAME.


readDocument

public Document readDocument(String doc)
Returns a Document represented by the given string.

Specified by:
readDocument in interface DocumentReader

readDocument

public Document readDocument(String doc,
                             String corpusName)
Returns a Document represented by the given string and uses corpusName as the corpus name for the returned Document.

Specified by:
readDocument in interface DocumentReader
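
The relationship between the two readDocument overloads can be sketched as follows. This is an illustration under assumptions, not the library's code: it assumes the one-argument overload falls back to the value returned by corpusName(), which this page does not state. `OverloadSketch`, `SimpleDocument`, and the value "example-corpus" are hypothetical names chosen for the example.

```java
class OverloadSketch {

    // Hypothetical value; the real constant's value is not given on this page.
    static final String CORPUS_NAME = "example-corpus";

    // Hypothetical minimal document type holding only what the sketch needs.
    static final class SimpleDocument {
        final String text;
        final String corpus;

        SimpleDocument(String text, String corpus) {
            this.text = text;
            this.corpus = corpus;
        }
    }

    String corpusName() {
        return CORPUS_NAME;
    }

    // Assumed behavior: use the default corpus name when none is supplied.
    SimpleDocument readDocument(String doc) {
        return readDocument(doc, corpusName());
    }

    // Uses the caller's corpus name for the returned document.
    SimpleDocument readDocument(String doc, String corpusName) {
        return new SimpleDocument(doc, corpusName);
    }

    public static void main(String[] args) {
        OverloadSketch reader = new OverloadSketch();
        System.out.println(reader.readDocument("text").corpus);
        System.out.println(reader.readDocument("text", "other").corpus);
    }
}
```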


Copyright © 2010-2011. All Rights Reserved.