Posted to user@hbase.apache.org by Something Something <ma...@gmail.com> on 2009/12/21 18:24:48 UTC

InputFormat related question...

In my application I have a file in this format:

The first line of the file contains the data to be processed, and *each* of
the remaining lines contains parameters that will be used to slice & dice the
data in various ways.  In other words, each mapper needs two lines: the first
line of the file, which contains the data, and one other line, which contains
parameters.
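
To make the requirement concrete, here is a rough sketch in plain Java of the
pairing I am after.  It uses no Hadoop classes; PairingSketch and
pairWithHeader are made-up names for illustration only.  It shows what each
mapper should end up seeing, not how an InputFormat would produce it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustration only: hypothetical names, plain Java, no Hadoop classes.
class PairingSketch {

    // Returns one {dataLine, parameterLine} pair per parameter line.
    // The first element of 'lines' is treated as the shared data line.
    static List<String[]> pairWithHeader(List<String> lines) {
        List<String[]> pairs = new ArrayList<>();
        if (lines.isEmpty()) {
            return pairs; // nothing to pair
        }
        String data = lines.get(0);
        for (String params : lines.subList(1, lines.size())) {
            pairs.add(new String[] { data, params });
        }
        return pairs;
    }

    public static void main(String[] args) {
        // Stand-in for the file: data line first, then parameter lines.
        List<String> file = Arrays.asList("the data line", "params A", "params B");
        for (String[] pair : pairWithHeader(file)) {
            System.out.println(pair[0] + " | " + pair[1]);
        }
    }
}
```

So a file with one data line and two parameter lines should lead to two map
inputs, each carrying the same data line plus a different parameter line.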

I looked at NLineInputFormat, which can be used for "parameter sweeps", but
it's not quite what I want.  I believe this format passes N consecutive
lines to each mapper, correct?

What's the best way to handle this case?  Do I have to write a custom
InputFormat class?  Please help.  Thanks.