Posted to mapreduce-issues@hadoop.apache.org by "Billie Rinaldi (Jira)" <ji...@apache.org> on 2020/03/09 20:14:00 UTC

[jira] [Commented] (MAPREDUCE-7265) Buffer corruption with spill percent 1.0

    [ https://issues.apache.org/jira/browse/MAPREDUCE-7265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17055342#comment-17055342 ] 

Billie Rinaldi commented on MAPREDUCE-7265:
-------------------------------------------

Patch 01 contains one way of addressing this issue. It effectively moves the check for whether a spill is needed one record earlier: if the metadata for k/v pair _N_ will not fit in the buffer, the spill is initiated when k/v pair _N-1_'s data is written. Another option might be to disallow spill percent 1.0 (although I am concerned about the spill initiation code in the collect method and am thinking that code path should be avoided).
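
To illustrate the idea, here is a sketch only, not the actual patch: the names follow MapOutputBuffer conventions, but the serialization and spill machinery are stubbed out and the surrounding collect logic is omitted.

{code:java}
// Sketch only, not the MAPREDUCE-7265 patch: names mirror MapOutputBuffer
// conventions, but serialization and the spill machinery are stubbed out.
class EarlySpillCheckSketch {
    static final int METASIZE = 16; // bytes of metadata per k/v record
    int bufferRemaining = 10;       // hypothetical free bytes left after record N-1

    void afterRecordWritten() {
        // Reserve room for record N's metadata as soon as record N-1 has been
        // written; if it would not fit, spill now rather than waiting for
        // record N, so kvindex can never pass bufindex.
        if (bufferRemaining < METASIZE) {
            startSpill();
        }
    }

    void startSpill() {
        // Stub: the real method signals the spill thread to drain the buffer.
        System.out.println("spill initiated one record early");
    }
}
{code}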

The patch has some tests that exercise different aspects of the issue, though I have not been able to reproduce everything I have seen with custom key/value types. In the attached patch, if the change to MapTask is removed, testTwoSpillsBytesWritable will crash the test run and you won't be able to see the results of the other tests. If that test is also commented out, the other tests will produce some ArrayIndexOutOfBoundsExceptions and some failures to verify expected map output file contents.
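
For reference, a job ends up in the problematic configuration roughly as follows (hedged sketch: the class name and the 1 MB sort buffer are my own choices, there only to make the collect buffer fill quickly in a test; the essential setting is the spill percent of 1.0).

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class SpillPercentRepro {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.setInt("mapreduce.task.io.sort.mb", 1);             // tiny collect buffer (test convenience)
        conf.setFloat("mapreduce.map.sort.spill.percent", 1.0f); // spill only when 100% full
        Job job = Job.getInstance(conf, "spill-percent-1.0-repro");
        // ... set mapper/reducer and input/output paths, then job.waitForCompletion(true)
    }
}
{code}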

> Buffer corruption with spill percent 1.0
> ----------------------------------------
>
>                 Key: MAPREDUCE-7265
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7265
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Billie Rinaldi
>            Assignee: Billie Rinaldi
>            Priority: Minor
>         Attachments: MAPREDUCE-7265.01.patch
>
>
> I encountered a variety of issues on a cluster where the spill percent was set to 1.0. Under some conditions, MapTask will not detect that its in-memory spill buffer is already full and will keep collecting k/v pairs, causing corruption of the buffer.
> I have been able to track at least some of the problems to a condition where adding a key/value pair to the buffer fills the buffer with fewer than 16 bytes remaining (the kv metadata size). When this happens, the next metadata index (kvindex) passes over the data index (bufindex), which causes some of the index and length calculations to be incorrect in the collect and write methods. It can allow data to keep being written to the buffer even though it is already full, with data overwriting metadata in the buffer and vice versa. I have seen this manifest as the NegativeArraySizeException seen in MAPREDUCE-6907 as well as in ArrayIndexOutOfBoundsException and EOFException.
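
A simplified, non-circular model of the overlap condition described in the report is sketched below. This is only an illustration of the arithmetic, not MapOutputBuffer code; the real buffer is circular and tracks its indices relative to an equator.

{code:java}
// Simplified, non-circular model: record data grows up from one end and
// 16-byte metadata entries grow down from the other; corruption begins once
// the two cross.
public class SpillOverlapModel {
    public static void main(String[] args) {
        final int METASIZE = 16;   // bytes of metadata per k/v record
        final int capacity = 100;  // toy buffer size

        int bufindex = 0;          // next byte for serialized k/v data (grows up)
        int kvindex  = capacity;   // next metadata slot (grows down)

        // A record whose serialized data leaves fewer than METASIZE bytes free:
        bufindex += 90;            // only 10 bytes remain
        kvindex  -= METASIZE;      // next metadata slot would start at 84

        // kvindex has now "passed over" bufindex; from here on, record data can
        // overwrite metadata and vice versa, corrupting the buffer.
        System.out.println("overlapped = " + (kvindex < bufindex)); // true
    }
}
{code}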


