Posted to mapreduce-user@hadoop.apache.org by Kunal Gupta <ku...@techlead-india.com> on 2009/12/02 14:45:34 UTC

issue using LineRecordReader's nextKeyValue() method

I am writing a custom InputFormat that reads N lines per map task.

For this I have extended the FileInputFormat and RecordReader classes.

In my RecordReader I use a LineRecordReader object to get the key/value
pairs.

If I call nextKeyValue() on the LineRecordReader object multiple times
while handling the same file split, it does advance to the next key/value
pair, but getCurrentKey() and getCurrentValue() return the same key and
value every time.

What could be the reason for this behavior?


Re: issue using LineRecordReader's nextKeyValue() method

Posted by Kunal Gupta <ku...@techlead-india.com>.
Thanks Aaron,

Your last point was absolutely correct; I was facing this problem because
I had not understood it. I later figured out the issue, allocated a new
object for each key and value, and got it resolved. Thanks for the
response.

Aaron, I was wondering whether I should post such 'thanks' messages to the
mapreduce-user mailing list or to the individual person. What do you
suggest?

-Kunal

On Mon, 2009-12-07 at 15:51 -0800, Aaron Kimball wrote:
> getCurrentKey() and getCurrentValue() do just that: they return the
> current (k, v) pair i.e. the current line. The notion of the current
> line is not changed by calling these methods.
> 
> nextKeyValue()'s contract is to advance the iterator and return true
> if more data is now available.
> 
> Note that even if the data held in the current key and value objects
> is updated by nextKeyValue(), the LineRR will not create new objects
> to hold those key and value pairs -- it recycles objects for
> efficiency.
> 
> - Aaron
> 
> 


Re: issue using LineRecordReader's nextKeyValue() method

Posted by Aaron Kimball <aa...@cloudera.com>.
getCurrentKey() and getCurrentValue() do just that: they return the current
(k, v) pair i.e. the current line. The notion of the current line is not
changed by calling these methods.

nextKeyValue()'s contract is to advance the iterator and return true if more
data is now available.

Note that even if the data held in the current key and value objects is
updated by nextKeyValue(), the LineRR will not create new objects to hold
those key and value pairs -- it recycles objects for efficiency.

- Aaron
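
To illustrate the object-recycling point above, here is a minimal,
self-contained sketch. It does not use Hadoop: ReusingLineReader is a
hypothetical stand-in for LineRecordReader, and a StringBuilder plays the
role of the mutable Text value. It only shows why storing the reference
returned by getCurrentValue() makes every stored entry look like the last
line read, and why copying the contents avoids that.

```java
import java.util.ArrayList;
import java.util.List;

public class ReuseDemo {

    /** Hypothetical stand-in for LineRecordReader (not the Hadoop class). */
    static final class ReusingLineReader {
        private final String[] lines;
        private int next = 0;
        // One mutable value object, refilled on every advance.
        private final StringBuilder value = new StringBuilder();

        ReusingLineReader(String... lines) {
            this.lines = lines;
        }

        /** Advances to the next line; returns false when no data is left. */
        boolean nextKeyValue() {
            if (next >= lines.length) {
                return false;
            }
            value.setLength(0);
            value.append(lines[next++]);
            return true;
        }

        /** Returns the same object on every call; it does not advance. */
        StringBuilder getCurrentValue() {
            return value;
        }
    }

    /** Buggy pattern: keeps references to the single recycled object. */
    static List<CharSequence> collectReferences(ReusingLineReader reader) {
        List<CharSequence> out = new ArrayList<>();
        while (reader.nextKeyValue()) {
            out.add(reader.getCurrentValue());
        }
        return out;
    }

    /** Fixed pattern: copies the contents into a new object per record. */
    static List<String> collectCopies(ReusingLineReader reader) {
        List<String> out = new ArrayList<>();
        while (reader.nextKeyValue()) {
            out.add(reader.getCurrentValue().toString());
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [c, c, c]
        System.out.println(collectReferences(new ReusingLineReader("a", "b", "c")));
        // prints [a, b, c]
        System.out.println(collectCopies(new ReusingLineReader("a", "b", "c")));
    }
}
```

In a real RecordReader that wraps LineRecordReader, the equivalent copy
would presumably be something like new Text(value) for the value and
new LongWritable(key.get()) for the key, taken after each nextKeyValue()
call and before the next one.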

