java.lang.Object
  org.apache.hadoop.mapreduce.RecordReader<org.apache.hadoop.hbase.io.ImmutableBytesWritable,org.apache.hadoop.io.Text>
    gov.llnl.ontology.text.hbase.XMLRecordReader
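public class XMLRecordReader
extends org.apache.hadoop.mapreduce.RecordReader<org.apache.hadoop.hbase.io.ImmutableBytesWritable,org.apache.hadoop.io.Text>
A RecordReader for processing gzipped tarballs of document files.
Each file in the tarball is assumed to be a single document, or to be
processed further by other stages.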
Field Summary | |
---|---|
static String | CONF_PREFIX |
static String | DELIMITER_TAG |
Constructor Summary | |
---|---|
XMLRecordReader() | Creates a new XMLRecordReader without gzipped files. |
XMLRecordReader(boolean useGzip) | Creates a new XMLRecordReader with useGzip set to true if the files are in a gzip format. |
Method Summary | |
---|---|
void | close() |
org.apache.hadoop.hbase.io.ImmutableBytesWritable | getCurrentKey() |
org.apache.hadoop.io.Text | getCurrentValue() |
float | getProgress() |
void | initialize(org.apache.hadoop.mapreduce.InputSplit isplit, org.apache.hadoop.mapreduce.TaskAttemptContext context) Extracts the Path for the file to be processed by this XMLRecordReader. |
boolean | nextKeyValue() Advances the reader one step to point to the next tarball file. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
public static final String CONF_PREFIX
public static final String DELIMITER_TAG
Constructor Detail |
---|
public XMLRecordReader()
Creates a new XMLRecordReader without gzipped files.
public XMLRecordReader(boolean useGzip)
Creates a new XMLRecordReader with useGzip set to true if the files are in a gzip format.
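The effect of a flag like useGzip can be illustrated with a small standalone sketch. This is not the XMLRecordReader source: the class GzipStreamSketch and its methods open, gzip, and readAll are hypothetical names, and the assumption is simply that the flag decides whether the raw input stream is wrapped in a java.util.zip.GZIPInputStream before reading. It uses only the Java standard library (Java 9 or later for readAllBytes).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical helper for illustration; not part of XMLRecordReader. */
final class GzipStreamSketch {

    /** Wraps the raw stream in a GZIPInputStream only when useGzip is true. */
    static InputStream open(InputStream raw, boolean useGzip) throws IOException {
        return useGzip ? new GZIPInputStream(raw) : raw;
    }

    /** Gzip-compresses text so the sketch has compressed input to read back. */
    static byte[] gzip(String text) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(buffer)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return buffer.toByteArray();
    }

    /** Reads the whole stream as UTF-8 text and closes it. */
    static String readAll(InputStream in) throws IOException {
        try (InputStream stream = in) {
            return new String(stream.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        String doc = "<doc>hello</doc>";

        // Compressed input: the flag must be true so the bytes are inflated.
        InputStream compressed = new ByteArrayInputStream(gzip(doc));
        System.out.println(readAll(open(compressed, true)));

        // Plain input: the flag is false and the stream is read as-is.
        InputStream plain = new ByteArrayInputStream(doc.getBytes(StandardCharsets.UTF_8));
        System.out.println(readAll(open(plain, false)));
    }
}
```

Passing true for input that is not gzip-compressed makes the GZIPInputStream constructor throw an IOException, so in this sketch the flag has to match the actual file format.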
Method Detail |
---|
public void initialize(org.apache.hadoop.mapreduce.InputSplit isplit, org.apache.hadoop.mapreduce.TaskAttemptContext context) throws IOException, InterruptedException
Extracts the Path for the file to be processed by this XMLRecordReader.
Specified by: initialize in class org.apache.hadoop.mapreduce.RecordReader<org.apache.hadoop.hbase.io.ImmutableBytesWritable,org.apache.hadoop.io.Text>
Throws: IOException, InterruptedException
public boolean nextKeyValue() throws IOException
Advances the reader one step to point to the next tarball file. Returns false when there are no more files in the tarball.
Specified by: nextKeyValue in class org.apache.hadoop.mapreduce.RecordReader<org.apache.hadoop.hbase.io.ImmutableBytesWritable,org.apache.hadoop.io.Text>
Throws: IOException
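The calling pattern implied by these methods (advance with nextKeyValue, then read the current key and value, then close) can be sketched without Hadoop on the classpath. Everything below is a hypothetical stand-in: ReaderLoopSketch and InMemoryReader are illustrative names, keys and values are plain String rather than ImmutableBytesWritable and Text, the key is assumed to be an entry name, and progress is assumed to be the fraction of entries consumed. None of this is confirmed as XMLRecordReader's behavior.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for the reader loop; not the Hadoop API. */
final class ReaderLoopSketch {

    /** In-memory reader that mirrors the method names documented above. */
    static final class InMemoryReader implements AutoCloseable {
        private final Iterator<Map.Entry<String, String>> entries;
        private final int total;
        private int consumed;
        private Map.Entry<String, String> current;

        InMemoryReader(Map<String, String> documents) {
            this.entries = documents.entrySet().iterator();
            this.total = documents.size();
        }

        /** Moves to the next entry; returns false once all entries are consumed. */
        boolean nextKeyValue() {
            if (!entries.hasNext()) {
                current = null;
                return false;
            }
            current = entries.next();
            consumed++;
            return true;
        }

        String getCurrentKey() {
            return current == null ? null : current.getKey();
        }

        String getCurrentValue() {
            return current == null ? null : current.getValue();
        }

        /** Assumed definition: fraction of entries consumed so far. */
        float getProgress() {
            return total == 0 ? 1.0f : (float) consumed / total;
        }

        @Override
        public void close() {
            current = null;
        }
    }

    /** Drives the reader the way a framework loop would, collecting key=value pairs. */
    static List<String> drain(Map<String, String> documents) {
        List<String> seen = new ArrayList<>();
        try (InMemoryReader reader = new InMemoryReader(documents)) {
            while (reader.nextKeyValue()) {
                seen.add(reader.getCurrentKey() + "=" + reader.getCurrentValue());
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        Map<String, String> documents = new LinkedHashMap<>();
        documents.put("a.xml", "<doc>A</doc>");
        documents.put("b.xml", "<doc>B</doc>");
        for (String pair : drain(documents)) {
            System.out.println(pair);
        }
    }
}
```

The loop only reads the current key and value after nextKeyValue has returned true, which is the order the method descriptions above suggest.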
public org.apache.hadoop.hbase.io.ImmutableBytesWritable getCurrentKey()
Specified by: getCurrentKey in class org.apache.hadoop.mapreduce.RecordReader<org.apache.hadoop.hbase.io.ImmutableBytesWritable,org.apache.hadoop.io.Text>
public org.apache.hadoop.io.Text getCurrentValue()
Specified by: getCurrentValue in class org.apache.hadoop.mapreduce.RecordReader<org.apache.hadoop.hbase.io.ImmutableBytesWritable,org.apache.hadoop.io.Text>
public float getProgress() throws IOException, InterruptedException
Specified by: getProgress in class org.apache.hadoop.mapreduce.RecordReader<org.apache.hadoop.hbase.io.ImmutableBytesWritable,org.apache.hadoop.io.Text>
Throws: IOException, InterruptedException
public void close() throws IOException
Specified by: close in interface Closeable
Specified by: close in class org.apache.hadoop.mapreduce.RecordReader<org.apache.hadoop.hbase.io.ImmutableBytesWritable,org.apache.hadoop.io.Text>
Throws: IOException