Posted to common-issues@hadoop.apache.org by "Kumar Ravi (Commented) (JIRA)" <ji...@apache.org> on 2012/03/23 20:51:28 UTC
[jira] [Commented] (HADOOP-8192) Fix unit test failures with IBM's JDK

    [ https://issues.apache.org/jira/browse/HADOOP-8192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13236990#comment-13236990 ] 

Kumar Ravi commented on HADOOP-8192:
------------------------------------

While debugging this issue, I observed that the order in which the racksToBlocks HashMap gets populated seems to matter. According to Robert Evans and Devaraj Das, the order should not matter by design.

Order plays a role here because getMoreSplits() stops iterating through the racks as soon as all the blocks are accounted for. Depending on which rack(s) each block is replicated on, and on when each rack is processed in the loop within getMoreSplits(), one can end up with different split counts, which makes the testcase fail in some situations.

Specifically, this testcase simulates 3 racks, each with a single datanode. Datanode 1 has replicas of all the blocks of all 3 files (file1, file2, and file3), Datanode 2 has all the blocks of file2 and file3, and Datanode 3 has all the blocks of file3 only. As soon as Rack 1 is processed, getMoreSplits() exits, and the split count equals the number of iterations the loop has made up to that point. So in this scenario, if Rack 1 is processed last, the split count is 3; if Rack 1 is processed first, the split count is 1. The testcase expects a return value of 3, which is the value returned when running on the Sun JVM, but a value of 1 or 2 may be returned depending on when Rack 1 gets processed.
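
The order dependence described above can be illustrated with a small, simplified model. This is only a sketch and not the actual CombineFileInputFormat code: the class SplitOrderDemo and the methods blocksOn() and countSplits() are hypothetical, each file is treated as a single block, and a LinkedHashMap stands in for the HashMap so that the iteration order can be chosen explicitly.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class SplitOrderDemo {

    // Hypothetical layout mirroring the testcase description:
    // rack1 holds all three files, rack2 holds file2 and file3, rack3 holds file3 only.
    // Assumption: each file is modeled as a single block.
    static List<String> blocksOn(String rack) {
        switch (rack) {
            case "rack1": return Arrays.asList("file1", "file2", "file3");
            case "rack2": return Arrays.asList("file2", "file3");
            case "rack3": return Arrays.asList("file3");
            default: throw new IllegalArgumentException("unknown rack: " + rack);
        }
    }

    // Simplified stand-in for the rack loop: visit racks in the given order,
    // create one split per rack that still has unassigned blocks, and stop
    // as soon as every block has been assigned.
    static int countSplits(String... rackOrder) {
        // LinkedHashMap preserves insertion order, so the caller controls
        // the iteration order that a plain HashMap would leave unspecified.
        Map<String, List<String>> racksToBlocks = new LinkedHashMap<>();
        for (String rack : rackOrder) {
            racksToBlocks.put(rack, blocksOn(rack));
        }

        Set<String> allBlocks = new HashSet<>();
        for (List<String> blocks : racksToBlocks.values()) {
            allBlocks.addAll(blocks);
        }

        Set<String> assigned = new HashSet<>();
        int splits = 0;
        for (Map.Entry<String, List<String>> entry : racksToBlocks.entrySet()) {
            if (assigned.size() == allBlocks.size()) {
                break; // all blocks accounted for, stop iterating
            }
            boolean tookAny = false;
            for (String block : entry.getValue()) {
                if (assigned.add(block)) { // true only for a not-yet-assigned block
                    tookAny = true;
                }
            }
            if (tookAny) {
                splits++;
            }
        }
        return splits;
    }

    public static void main(String[] args) {
        System.out.println(countSplits("rack3", "rack2", "rack1")); // rack1 last: 3
        System.out.println(countSplits("rack2", "rack1", "rack3")); // rack1 in the middle: 2
        System.out.println(countSplits("rack1", "rack2", "rack3")); // rack1 first: 1
    }
}
```

In this model the same data yields 3, 2, or 1 splits purely as a function of the iteration order, which is consistent with a test that hard-codes 3 passing on one JVM and failing on another whose HashMap happens to iterate in a different order.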

                
> Fix unit test failures with IBM's JDK
> -------------------------------------
>
>                 Key: HADOOP-8192
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8192
>             Project: Hadoop Common
>          Issue Type: Bug
>         Environment: java version "1.6.0"
> Java(TM) SE Runtime Environment (build pxi3260sr10-20111208_01(SR10))
> IBM J9 VM (build 2.4, JRE 1.6.0 IBM J9 2.4 Linux x86-32 jvmxi3260sr10-20111207_96808 (JIT enabled, AOT enabled)
> J9VM - 20111207_096808
> JIT  - r9_20111107_21307ifx1
> GC   - 20110519_AA)
> JCL  - 20111104_02
>            Reporter: Devaraj Das
>
> Some tests fail with IBM's JDK. They are org.apache.hadoop.mapred.lib.TestCombineFileInputFormat, org.apache.hadoop.mapreduce.lib.input.TestCombineFileInputFormat, org.apache.hadoop.streaming.TestStreamingBadRecords, org.apache.hadoop.mapred.TestCapacityScheduler. This jira is to track fixing these.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira