Posted to user@hadoop.apache.org by Stephen Durfey <sj...@gmail.com> on 2019/09/04 21:06:16 UTC

NegativeArraySizeException during map segment merge

I ran into an issue and am struggling to find a way around it. I have a job
failing with the following output (version 2.7.0 of hadoop):

2019-09-04 13:20:30,026 DEBUG [main] org.apache.hadoop.mapred.MapRFsOutputBuffer: MapId=attempt_1567541971569_2612_m_003447_0 Reducer=133Spill =0(110526690,1099443164, 96132208)
2019-09-04 13:20:30,026 DEBUG [main] org.apache.hadoop.mapred.MapRFsOutputBuffer: MapId=attempt_1567541971569_2612_m_003447_0 Reducer=133Spill =1(4123,2, 31)
2019-09-04 13:20:30,026 INFO [main] org.apache.hadoop.mapred.Merger: Merging 2 sorted segments
2019-09-04 13:20:30,026 DEBUG [main] com.mapr.fs.jni.MapRClient: Open: path = /var/.../mapred/nodeManager/spill/job_1567541971569_2612/attempt_1567541971569_2612_m_003447_0/spill0.out
2019-09-04 13:20:30,039 DEBUG [main] com.mapr.fs.Inode: >Inode Open file: /var/.../mapred/nodeManager/spill/job_1567541971569_2612/attempt_1567541971569_2612_m_003447_0/spill0.out, size: 243373048, chunkSize: 268435456, fid: 315986.88557.30377956
2019-09-04 13:20:30,057 DEBUG [main] org.apache.hadoop.io.compress.CodecPool: Got recycled decompressor
2019-09-04 13:20:30,058 DEBUG [main] com.mapr.fs.jni.MapRClient: Open: path = /var/.../mapred/nodeManager/spill/job_1567541971569_2612/attempt_1567541971569_2612_m_003447_0/spill1.out
2019-09-04 13:20:30,058 DEBUG [main] com.mapr.fs.Inode: >Inode Open file: /var/.../mapred/nodeManager/spill/job_1567541971569_2612/attempt_1567541971569_2612_m_003447_0/spill1.out, size: 69143436, chunkSize: 268435456, fid: 315986.88362.30378046
2019-09-04 13:20:30,064 DEBUG [main] org.apache.hadoop.io.compress.CodecPool: Got recycled decompressor
2019-09-04 13:20:30,064 INFO [main] org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 96132217 bytes
2019-09-04 13:20:30,065 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.lang.NegativeArraySizeException
at org.apache.hadoop.mapred.IFile$Reader.nextRawValue(IFile.java:488)
at org.apache.hadoop.mapred.Merger$Segment.nextRawValue(Merger.java:341)
at org.apache.hadoop.mapred.Merger$Segment.getValue(Merger.java:323)
at org.apache.hadoop.mapred.Merger$MergeQueue.next(Merger.java:567)
at org.apache.hadoop.mapred.Merger.writeFile(Merger.java:209)
at org.apache.hadoop.mapred.MapRFsOutputBuffer.mergeParts(MapRFsOutputBuffer.java:1403)
at org.apache.hadoop.mapred.MapRFsOutputBuffer.flush(MapRFsOutputBuffer.java:1609)
at org.apache.hadoop.mapred.MapTask$NewOutputCollector.close(MapTask.java:732)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:802)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:346)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1595)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)

Looking at "MapRFsOutputBuffer: MapId=attempt_1567541971569_2612_m_003447_0
Reducer=133Spill =0(110526690,1099443164, 96132208)", the value
'1,099,443,164' is the raw length of the segment, during buffer allocation
in IFile$Reader#nextRawValue, when that is bit shifted, it causes the
integer overflow, and the exception I am seeing. At least, that is what it
looks like. I'm not sure what tuning options are at my disposal to try to
fix this issue, if any. I tried changing mapreduce.task.io.sort.mb to a
small number (240mb), but that still resulted the same issue. Any
help/suggestions would be appreciated :)
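
To make the overflow concrete, here is a tiny standalone sketch (not Hadoop
code, just the arithmetic; the variable name only mirrors the field used in
IFile$Reader): any length above Integer.MAX_VALUE / 2 goes negative when
doubled with a left shift, and passing that to new byte[...] throws
NegativeArraySizeException.

public class ShiftOverflowDemo {
  public static void main(String[] args) {
    // Same order of magnitude as the 1,099,443,164-byte raw length in the log above.
    int currentValueLength = 1_099_443_164;

    // The buffer is sized as currentValueLength << 1; anything above
    // Integer.MAX_VALUE / 2 (1,073,741,823) wraps around to a negative int.
    int bufferSize = currentValueLength << 1;
    System.out.println(bufferSize);          // prints -2096080968

    byte[] valBytes = new byte[bufferSize];  // throws NegativeArraySizeException
  }
}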

- Stephen

Re: NegativeArraySizeException during map segment merge

Posted by Prabhu Josephraj <pj...@cloudera.com.INVALID>.
1. Looking at IFile$Reader#nextRawValue, it is not clear why we create the
valBytes array with size 2 * currentValueLength even though it only reads
currentValueLength bytes of data. If there is no reason for the doubling,
fixing that allocation would fix this problem; one possible adjustment is
sketched after the snippet below.

public void nextRawValue(DataInputBuffer value) throws IOException {
  final byte[] valBytes = (value.getData().length < currentValueLength)
    ? new byte[currentValueLength << 1]
    : value.getData();
  int i = readData(valBytes, 0, currentValueLength);
  if (i != currentValueLength) {
    throw new IOException("Asked for " + currentValueLength + " Got: " + i);
  }
  // ... rest of method elided ...
}
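
If the doubling really is unnecessary, a rough sketch of the adjustment could
look like the following (illustration only, written against the 2.7.0 snippet
above, not an actual patch; the value.reset call stands in for the elided
remainder of the method):

public void nextRawValue(DataInputBuffer value) throws IOException {
  // Allocate only what readData() will actually consume, so value lengths up
  // to Integer.MAX_VALUE no longer wrap negative when sizing the buffer.
  final byte[] valBytes = (value.getData().length < currentValueLength)
    ? new byte[currentValueLength]          // was: currentValueLength << 1
    : value.getData();
  int i = readData(valBytes, 0, currentValueLength);
  if (i != currentValueLength) {
    throw new IOException("Asked for " + currentValueLength + " Got: " + i);
  }
  value.reset(valBytes, currentValueLength);
}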

2. Also, the stack trace shows mapreduce.job.map.output.collector.class is
set to MapRFsOutputBuffer, which is used on top of MapRFS. Can you test the
same job on HDFS with MapOutputBuffer to isolate the issue?
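
Overriding the collector on the job configuration should be enough for that
test; a minimal sketch (the job name and driver skeleton are placeholders,
the collector class name is the stock Hadoop 2.7 default):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class CollectorOverrideExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Force the stock in-memory collector instead of MapRFsOutputBuffer
    // for this run, to see whether the merge failure still reproduces.
    conf.set("mapreduce.job.map.output.collector.class",
        "org.apache.hadoop.mapred.MapTask$MapOutputBuffer");

    Job job = Job.getInstance(conf, "collector-isolation-test");
    // ... set mapper/reducer, input and output paths as in the failing job ...
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

If the driver goes through ToolRunner, passing
-Dmapreduce.job.map.output.collector.class=org.apache.hadoop.mapred.MapTask$MapOutputBuffer
on the command line does the same thing without a code change.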


