Posted to mapreduce-dev@hadoop.apache.org by Harsh J <ha...@cloudera.com> on 2011/05/05 05:40:50 UTC

Re: Bug of InputSampler

Jeff,

Is this the same problem that
https://issues.apache.org/jira/browse/MAPREDUCE-1987 reports?

On Fri, Apr 8, 2011 at 2:03 PM, Jeff Zhang <zj...@gmail.com> wrote:
> Hi all,
>
> I found a probable bug in the writePartitionFile method of InputSampler:
>
>    for(int i = 1; i < numPartitions; ++i) {
>      int k = Math.round(stepSize * i);
>      while (last >= k && comparator.compare(samples[last], samples[k]) == 0) {
>        ++k;
>      }
>      writer.append(samples[k], nullValue);
>      last = k;
>    }
>
>
> At the line writer.append(samples[k], nullValue), k may already be past the
> end of the samples array. Maybe we should add a check for that before the
> append, as follows:
>    for(int i = 1; i < numPartitions; ++i) {
>      int k = Math.round(stepSize * i);
>      while (last >= k && comparator.compare(samples[last], samples[k]) == 0) {
>        ++k;
>      }
>      if (k >= samples.length)
>        break;
>
>      writer.append(samples[k], nullValue);
>      last = k;
>    }
>
>
> --
> Best Regards
>
> Jeff Zhang
>
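
For anyone who wants to reproduce the overrun outside Hadoop, here is a
minimal standalone sketch of the quoted loop. It is not the InputSampler
source: the class name SplitPointSketch, the method pickSplitPoints, and the
guard flag are made up for illustration, the writer is replaced by a plain
list, and the initial values of stepSize and last (samples.length /
numPartitions and -1) are assumptions, since the quoted snippet does not show
them. With three samples and four partitions, the duplicate-skipping loop
pushes k to samples.length on the last iteration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Comparator;

// Hypothetical standalone model of the quoted loop; not the Hadoop class.
class SplitPointSketch {

    // Collects split points from sorted samples. When guard is true, the
    // bounds check proposed in the quoted message is applied before the read.
    static <K> List<K> pickSplitPoints(K[] samples, int numPartitions,
                                       Comparator<K> comparator, boolean guard) {
        List<K> splits = new ArrayList<>();
        // Assumed initial values; the quoted snippet does not show them.
        float stepSize = samples.length / (float) numPartitions;
        int last = -1;
        for (int i = 1; i < numPartitions; ++i) {
            int k = Math.round(stepSize * i);
            // Skip forward past a key equal to the previous split point.
            // The short-circuit keeps samples[k] in bounds inside this loop,
            // but k can leave the loop equal to last + 1.
            while (last >= k && comparator.compare(samples[last], samples[k]) == 0) {
                ++k;
            }
            if (guard && k >= samples.length) {
                break; // proposed check: stop instead of reading past the end
            }
            splits.add(samples[k]); // stands in for writer.append(samples[k], nullValue)
            last = k;
        }
        return splits;
    }

    public static void main(String[] args) {
        Integer[] samples = {1, 2, 3};
        System.out.println(pickSplitPoints(samples, 4, Integer::compare, true));
        try {
            pickSplitPoints(samples, 4, Integer::compare, false);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("unguarded loop overran the array");
        }
    }
}
```

Note that breaking out early produces fewer than numPartitions - 1 split
points, so any code that reads the partition file and expects exactly that
many keys would still need to cope with the shorter list.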



-- 
Harsh J