public class CompoundWordIterator extends Object implements Iterator<String>
Note that, unlike most iterators, the next method is O(n) in complexity, where n
is the number of unique words that begin a compound token in the set of
compound tokens recognized by this iterator. This may result in a
noticeable performance penalty when a large set of compound words is used.
This class also provides a reset method to
allow resetting the token stream backing this iterator. If the recognized
set of compound words does not change, then this method is preferred over
creating a new CompoundWordIterator, as it avoids the initialization
overhead of building the underlying compound recognizer.
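The behavior described above can be illustrated with a minimal, self-contained sketch. The class CompoundSketchIterator below is hypothetical and is not the implementation behind CompoundWordIterator. It assumes compound words are given as space-separated strings and uses a plain linear scan over every compound with greedy longest-match merging, which is cruder than a recognizer keyed on first words. It is only meant to show why the cost of next grows with the compound set, and why reset can swap the token stream without rebuilding that set.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Set;

// Hypothetical sketch only; not the library's CompoundWordIterator.
class CompoundSketchIterator implements Iterator<String> {
    // Each compound split into its words; built once in the constructor.
    private final List<String[]> compounds = new ArrayList<>();
    private int maxLen = 1;                                   // longest compound, in words
    private final Deque<String> buffer = new ArrayDeque<>();  // lookahead window
    private Iterator<String> tokens;

    CompoundSketchIterator(Iterator<String> tokens, Set<String> compoundWords) {
        // Assumption: compounds are whitespace-separated word sequences.
        for (String c : compoundWords) {
            String[] parts = c.trim().split("\\s+");
            compounds.add(parts);
            maxLen = Math.max(maxLen, parts.length);
        }
        this.tokens = tokens;
    }

    // Swaps the token stream; the compound list is reused, not rebuilt.
    void reset(Iterator<String> newTokens) {
        this.tokens = newTokens;
        buffer.clear();
    }

    @Override
    public boolean hasNext() {
        return !buffer.isEmpty() || tokens.hasNext();
    }

    @Override
    public String next() {
        // Fill the lookahead window with up to maxLen tokens.
        while (buffer.size() < maxLen && tokens.hasNext()) {
            buffer.addLast(tokens.next());
        }
        if (buffer.isEmpty()) {
            throw new NoSuchElementException();
        }
        String[] window = buffer.toArray(new String[0]);
        // Linear scan over all compounds: this is the per-call cost that
        // grows with the size of the compound set.
        int best = 1;
        for (String[] c : compounds) {
            if (c.length > best && c.length <= window.length && startsWith(window, c)) {
                best = c.length;
            }
        }
        // Emit the longest match (or the single token) as one string.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < best; i++) {
            if (i > 0) sb.append(' ');
            sb.append(buffer.removeFirst());
        }
        return sb.toString();
    }

    @Override
    public void remove() {
        throw new UnsupportedOperationException();
    }

    private static boolean startsWith(String[] window, String[] compound) {
        for (int i = 0; i < compound.length; i++) {
            if (!compound[i].equals(window[i])) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        Set<String> compounds = new HashSet<>(Arrays.asList("white house", "new york"));
        CompoundSketchIterator it = new CompoundSketchIterator(
            Arrays.asList("the", "white", "house").iterator(), compounds);
        while (it.hasNext()) System.out.println(it.next()); // "the", then "white house"
        // Reuse the same compound set on a new token stream.
        it.reset(Arrays.asList("new", "york", "times").iterator());
        while (it.hasNext()) System.out.println(it.next()); // "new york", then "times"
    }
}
```

In this sketch the constructor does all the work of preparing the compound list, so calling reset on an existing instance only replaces the token source. That mirrors the trade-off the class description makes between resetting and constructing a new iterator.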
Constructor and Description |
---|
CompoundWordIterator(BufferedReader br, Set<String> compoundWords) |
CompoundWordIterator(Iterator<String> tokens, Set<String> compoundWords) |
CompoundWordIterator(String str, Set<String> compoundWords) |
Modifier and Type | Method and Description |
---|---|
boolean | hasNext() Returns true if there is another token to return. |
String | next() Returns the next token in the stream. |
void | remove() Throws an UnsupportedOperationException if called. |
void | reset(BufferedReader br) Resets the underlying token stream to point to the contents of the provided BufferedReader. |
void | reset(Iterator<String> tokens) Resets the underlying token stream to point to the tokens in the provided iterator. |
public CompoundWordIterator(BufferedReader br, Set<String> compoundWords)
public boolean hasNext()
Returns true if there is another token to return.

public String next()
Returns the next token in the stream. This method is O(n) in complexity, where n is the number of unique words that begin a compound token in the set of compound tokens recognized by this iterator.

public void remove()
Throws an UnsupportedOperationException if called.

public void reset(BufferedReader br)
Resets the underlying token stream to point to the contents of the provided BufferedReader. This does not change the set of accepted compound words.

Copyright © 2012. All Rights Reserved.