Posted to mapreduce-user@hadoop.apache.org by exception <ex...@taomee.com> on 2010/11/19 08:11:46 UTC

potential bug in InputSampler, hadoop 0.21.0

Hi all,

I think I have found a bug in InputSampler under Hadoop 0.21.0.
In the file InputSampler.java, in the package org.apache.hadoop.mapreduce.lib.partition, the function getSample creates a record reader but never initializes it. As a result, an exception is thrown when the record reader is used, because some of the objects it references have not been set up properly.

For example, near line 217:

    RecordReader<K,V> reader = inf.createRecordReader(splits.get(i),
        new TaskAttemptContextImpl(job.getConfiguration(),
                                   new TaskAttemptID()));
    while (reader.nextKeyValue()) {
      ...
    }

The reader should be initialized before calling "reader.nextKeyValue()".
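To illustrate the failure mode, here is a minimal, self-contained sketch. The MockRecordReader class below is a hypothetical stand-in, not the real Hadoop API: it mimics the fact that a RecordReader only wires up its input in initialize(), so calling nextKeyValue() first fails. In the actual Hadoop code, the fix would be a call along the lines of reader.initialize(splits.get(i), context), using the two-argument initialize(InputSplit, TaskAttemptContext) method of org.apache.hadoop.mapreduce.RecordReader, passing the same TaskAttemptContextImpl that was handed to createRecordReader.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical mock, not the real org.apache.hadoop.mapreduce.RecordReader.
// Like the real class, it only becomes usable after initialize() is called.
class MockRecordReader {
    private List<String> records; // wired up only by initialize()
    private int pos = -1;

    void initialize(List<String> split) {
        this.records = split;
        this.pos = -1;
    }

    boolean nextKeyValue() {
        if (records == null) {
            // Mirrors the exception seen in InputSampler.getSample when
            // the reader is used without being initialized.
            throw new IllegalStateException("RecordReader not initialized");
        }
        return ++pos < records.size();
    }
}

public class InputSamplerSketch {
    public static void main(String[] args) {
        List<String> split = Arrays.asList("a", "b", "c");
        MockRecordReader reader = new MockRecordReader();

        // Using the reader before initialize() fails, as reported:
        try {
            reader.nextKeyValue();
        } catch (IllegalStateException e) {
            System.out.println("caught: " + e.getMessage());
        }

        // The fix: initialize the reader first, then iterate.
        reader.initialize(split);
        int count = 0;
        while (reader.nextKeyValue()) {
            count++;
        }
        System.out.println("records read: " + count);
    }
}
```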







Cheers


Re: potential bug in InputSampler, hadoop 0.21.0

Posted by Eli Collins <el...@cloudera.com>.
File a jira?

On Thursday, November 18, 2010, exception <ex...@taomee.com> wrote:
> Hi all,
>
> I think I have found a bug in InputSampler under Hadoop 0.21.0.
>
> In the file InputSampler.java, in the package
> org.apache.hadoop.mapreduce.lib.partition, the function getSample
> creates a record reader but never initializes it. As a result, an
> exception is thrown when the record reader is used, because some of
> the objects it references have not been set up properly.
>
> For example, near line 217:
>
> RecordReader<K,V> reader = inf.createRecordReader(splits.get(i),
>     new TaskAttemptContextImpl(job.getConfiguration(),
>                                new TaskAttemptID()));
> while (reader.nextKeyValue()) {
>   ...
> }
>
> The reader should be initialized before calling "reader.nextKeyValue()".
>
> Cheers