GzipTarInputFormat.GzipTarRecordReader (C-Cat 1.0 API)


gov.llnl.ontology.text.hbase
Class GzipTarInputFormat.GzipTarRecordReader

java.lang.Object
  org.apache.hadoop.mapreduce.RecordReader<org.apache.hadoop.hbase.io.ImmutableBytesWritable,org.apache.hadoop.io.Text>
      gov.llnl.ontology.text.hbase.GzipTarInputFormat.GzipTarRecordReader

All Implemented Interfaces: Closeable

Enclosing class: GzipTarInputFormat

public class GzipTarInputFormat.GzipTarRecordReader
extends org.apache.hadoop.mapreduce.RecordReader<org.apache.hadoop.hbase.io.ImmutableBytesWritable,org.apache.hadoop.io.Text>

A RecordReader for processing gzipped tarballs of document files. Each file in the tarball is assumed to be either a single document or input that later stages will process further.
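
The idea of treating each file in a gzipped tarball as one document can be illustrated with a minimal sketch that uses only the Java standard library. It is not the implementation of this class: the class name `GzipTarSketch`, the methods `readDocuments` and `sampleArchive`, and the choice of a file-name-to-text map are all hypothetical, and the sketch assumes a plain ustar-style layout (512-byte headers, octal size field at offset 124) without checksum verification or support for GNU long names and base-256 sizes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical illustration only; not the C-Cat implementation.
public class GzipTarSketch {
    static final int BLOCK = 512; // tar archives are made of 512-byte blocks

    /** Reads every regular-file entry of a gzipped tar stream into a name -> text map. */
    public static Map<String, String> readDocuments(InputStream gzippedTar) throws IOException {
        Map<String, String> docs = new LinkedHashMap<>();
        try (DataInputStream in = new DataInputStream(new GZIPInputStream(gzippedTar))) {
            byte[] header = new byte[BLOCK];
            while (true) {
                try {
                    in.readFully(header);
                } catch (EOFException e) {
                    break; // stream ended without an end-of-archive marker
                }
                if (header[0] == 0) {
                    break; // an empty name means we reached the zero-filled end marker
                }
                String name = field(header, 0, 100);
                long size = Long.parseLong(field(header, 124, 12).trim(), 8); // octal size
                byte[] body = new byte[(int) size];
                in.readFully(body);
                // Entry data is padded with zeros up to the next block boundary.
                int padding = (int) ((BLOCK - size % BLOCK) % BLOCK);
                in.readFully(new byte[padding]);
                // Type flag '0' (or NUL in old archives) marks a regular file.
                if (header[156] == '0' || header[156] == 0) {
                    docs.put(name, new String(body, StandardCharsets.UTF_8));
                }
            }
        }
        return docs;
    }

    /** Extracts a NUL-terminated ASCII field from a tar header. */
    private static String field(byte[] header, int offset, int length) {
        int end = offset;
        while (end < offset + length && header[end] != 0) {
            end++;
        }
        return new String(header, offset, end - offset, StandardCharsets.US_ASCII);
    }

    /** Demo helper: builds a one-entry gzipped tar in memory (checksum left blank). */
    static byte[] sampleArchive(String name, String text) throws IOException {
        byte[] body = text.getBytes(StandardCharsets.UTF_8);
        byte[] header = new byte[BLOCK];
        byte[] nameBytes = name.getBytes(StandardCharsets.US_ASCII);
        System.arraycopy(nameBytes, 0, header, 0, nameBytes.length);
        byte[] size = String.format("%011o", body.length).getBytes(StandardCharsets.US_ASCII);
        System.arraycopy(size, 0, header, 124, size.length);
        header[156] = '0'; // regular file
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bytes)) {
            gz.write(header);
            gz.write(body);
            gz.write(new byte[(BLOCK - body.length % BLOCK) % BLOCK]); // block padding
            gz.write(new byte[2 * BLOCK]); // end-of-archive marker
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] archive = sampleArchive("doc1.txt", "hello");
        System.out.println(readDocuments(new ByteArrayInputStream(archive)));
    }
}
```

The sketch loads the whole archive into a map for brevity. A record reader would instead keep the stream open and hand out one entry per call, which is the shape the methods below describe.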

Constructor Summary
`GzipTarInputFormat.GzipTarRecordReader()`

Method Summary
`void`	`close()`
`org.apache.hadoop.hbase.io.ImmutableBytesWritable`	`getCurrentKey()`
`org.apache.hadoop.io.Text`	`getCurrentValue()`
`float`	`getProgress()`
`void`	`initialize(org.apache.hadoop.mapreduce.InputSplit split, org.apache.hadoop.mapreduce.TaskAttemptContext context)` Extract the `Path` for the file to be processed by this `GzipTarInputFormat.GzipTarRecordReader`.
`boolean`	`nextKeyValue()` Advances the reader one step to point to the next file in the tarball.

Methods inherited from class java.lang.Object
`clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait`

Constructor Detail

GzipTarInputFormat.GzipTarRecordReader

public GzipTarInputFormat.GzipTarRecordReader()

Method Detail

initialize

public void initialize(org.apache.hadoop.mapreduce.InputSplit split,
                       org.apache.hadoop.mapreduce.TaskAttemptContext context)
                throws IOException,
                       InterruptedException

Extract the Path for the file to be processed by this GzipTarInputFormat.GzipTarRecordReader.

Specified by: initialize in class org.apache.hadoop.mapreduce.RecordReader<org.apache.hadoop.hbase.io.ImmutableBytesWritable,org.apache.hadoop.io.Text>

Throws: IOException; InterruptedException

nextKeyValue

public boolean nextKeyValue()
                     throws IOException

Advances the reader one step to point to the next file in the tarball. Returns false when there are no more files in the tarball.

Specified by: nextKeyValue in class org.apache.hadoop.mapreduce.RecordReader<org.apache.hadoop.hbase.io.ImmutableBytesWritable,org.apache.hadoop.io.Text>

Throws: IOException
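
The calling pattern implied by the boolean return type can be sketched as follows: call nextKeyValue() first, and read the current key and value only after it has reported that an entry is available. To stay runnable without Hadoop on the classpath, the sketch uses a hypothetical stand-in interface, `KeyValueReader`, that mirrors only the three method names involved, and a toy `ListReader` backed by an in-memory list. Neither type is part of C-Cat or Hadoop.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not the C-Cat implementation.
public class ReaderLoopSketch {
    /** Stand-in for the three RecordReader methods used in the loop below. */
    interface KeyValueReader<K, V> {
        boolean nextKeyValue() throws IOException;
        K getCurrentKey();
        V getCurrentValue();
    }

    /** Toy reader over an in-memory list of (file name, document text) entries. */
    static final class ListReader implements KeyValueReader<String, String> {
        private final Iterator<Map.Entry<String, String>> entries;
        private Map.Entry<String, String> current;

        ListReader(List<Map.Entry<String, String>> entries) {
            this.entries = entries.iterator();
        }

        @Override
        public boolean nextKeyValue() {
            if (!entries.hasNext()) {
                current = null; // nothing left: signal the end with false
                return false;
            }
            current = entries.next();
            return true;
        }

        @Override
        public String getCurrentKey() {
            return current == null ? null : current.getKey();
        }

        @Override
        public String getCurrentValue() {
            return current == null ? null : current.getValue();
        }
    }

    /** Advances the reader until it reports false, collecting "key=value" strings. */
    static <K, V> List<String> drain(KeyValueReader<K, V> reader) throws IOException {
        List<String> seen = new ArrayList<>();
        while (reader.nextKeyValue()) {
            seen.add(reader.getCurrentKey() + "=" + reader.getCurrentValue());
        }
        return seen;
    }

    public static void main(String[] args) throws IOException {
        KeyValueReader<String, String> reader = new ListReader(List.of(
                Map.entry("a.txt", "first document"),
                Map.entry("b.txt", "second document")));
        System.out.println(drain(reader));
    }
}
```

In a MapReduce job the framework normally drives this loop itself, so user code rarely calls these methods directly.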

getCurrentKey

public org.apache.hadoop.hbase.io.ImmutableBytesWritable getCurrentKey()

Specified by: getCurrentKey in class org.apache.hadoop.mapreduce.RecordReader<org.apache.hadoop.hbase.io.ImmutableBytesWritable,org.apache.hadoop.io.Text>

getCurrentValue

public org.apache.hadoop.io.Text getCurrentValue()

Specified by: getCurrentValue in class org.apache.hadoop.mapreduce.RecordReader<org.apache.hadoop.hbase.io.ImmutableBytesWritable,org.apache.hadoop.io.Text>

getProgress

public float getProgress()
                  throws IOException,
                         InterruptedException

Specified by: getProgress in class org.apache.hadoop.mapreduce.RecordReader<org.apache.hadoop.hbase.io.ImmutableBytesWritable,org.apache.hadoop.io.Text>

Throws: IOException; InterruptedException

close

public void close()
           throws IOException

Specified by: close in interface Closeable
Specified by: close in class org.apache.hadoop.mapreduce.RecordReader<org.apache.hadoop.hbase.io.ImmutableBytesWritable,org.apache.hadoop.io.Text>

Throws: IOException