public class DocumentPreprocessor extends Object
Constructor and Description |
---|
DocumentPreprocessor() Constructs a DocumentPreprocessor with an empty word list. |
DocumentPreprocessor(File wordList) Constructs a DocumentPreprocessor where the provided file contains the list of all valid words for the output documents. |
DocumentPreprocessor(String[] wordList) A constructor purely for test purposes. |
Modifier and Type | Method and Description |
---|---|
String | process(String document) Processes the provided document and returns the cleaned version of the document. |
String | process(String document, boolean removeWords) Processes the provided document and returns the cleaned version of the document. |
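
The effect of the removeWords flag can be illustrated with a small stand-in. This is a minimal sketch, not the real DocumentPreprocessor: the class name WordListFilterSketch is hypothetical, and it assumes that documents are split on whitespace and that word matching is exact and case-sensitive. The cleaning that process actually performs is not specified on this page and is not modeled here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for illustration only; NOT the real DocumentPreprocessor.
final class WordListFilterSketch {
    // Words allowed to appear in the output when filtering is enabled.
    private final Set<String> validWords;

    // Mirrors the shape of the String[] constructor listed above.
    WordListFilterSketch(String[] wordList) {
        this.validWords = new HashSet<>(Arrays.asList(wordList));
    }

    // Assumption: tokens are whitespace-separated and compared exactly.
    String process(String document, boolean removeWords) {
        List<String> kept = new ArrayList<>();
        for (String token : document.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // an empty document yields a single empty token
            }
            // Keep every token unless filtering is on and the word is unknown.
            if (!removeWords || validWords.contains(token)) {
                kept.add(token);
            }
        }
        return String.join(" ", kept);
    }

    public static void main(String[] args) {
        WordListFilterSketch sketch =
            new WordListFilterSketch(new String[] {"the", "cat", "sat"});
        // Words outside the list are dropped when removeWords is true.
        System.out.println(sketch.process("the cat sat on the mat", true));
        // With removeWords false, every token is kept.
        System.out.println(sketch.process("the cat sat on the mat", false));
    }
}
```

With the real class, the File constructor would supply the word list instead of the array. The file layout it expects is not described on this page.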
public DocumentPreprocessor()
Constructs a DocumentPreprocessor with an empty word list.

public DocumentPreprocessor(File wordList) throws IOException
Constructs a DocumentPreprocessor where the provided file contains the list of all valid words for the output documents.
Parameters:
wordList - a file containing a list of all valid words for outputting
Throws:
IOException

public DocumentPreprocessor(String[] wordList)

public String process(String document)
Parameters:
document - a document to process

public String process(String document, boolean removeWords)
Parameters:
document - a document to process
removeWords - If true, any word which is not found in the provided word list is removed from the cleaned document.

Copyright © 2012. All Rights Reserved.