Posted to common-dev@hadoop.apache.org by "Amar Kamat (JIRA)" <ji...@apache.org> on 2009/06/16 10:38:07 UTC

[jira] Created: (HADOOP-6052) KeyFieldBasedPartitioner would lost data if specifed field not exist

KeyFieldBasedPartitioner would lost data if specifed field not exist
--------------------------------------------------------------------

                 Key: HADOOP-6052
                 URL: https://issues.apache.org/jira/browse/HADOOP-6052
             Project: Hadoop Core
          Issue Type: Bug
          Components: mapred
    Affects Versions: 0.20.0
            Reporter: Amar Kamat
            Assignee: Amar Kamat
             Fix For: 0.21.0


When using KeyFieldBasedPartitioner, if a record doesn't contain the specified field, endChar ends up equal to array.length, which throws an ArrayIndexOutOfBoundsException, and that record is lost!

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-6052) KeyFieldBasedPartitioner would lost data if specifed field not exist

Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-6052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amar Kamat updated HADOOP-6052:
-------------------------------

    Attachment: HADOOP-6052-v1.1.patch

Attaching a new patch incorporating Devaraj's comments. Running test-patch. Waiting for HADOOP-6076.



[jira] Commented: (HADOOP-6052) KeyFieldBasedPartitioner would lost data if specifed field not exist

Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12721055#action_12721055 ] 

Amar Kamat commented on HADOOP-6052:
------------------------------------

Opened HADOOP-6075 to address the TestTaskTrackerMemoryManager failure.



[jira] Commented: (HADOOP-6052) KeyFieldBasedPartitioner would lost data if specifed field not exist

Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720573#action_12720573 ] 

Amar Kamat commented on HADOOP-6052:
------------------------------------

Opened HADOOP-6065 to address the failure of TestRunningTaskLimits.



[jira] Commented: (HADOOP-6052) KeyFieldBasedPartitioner would lost data if specifed field not exist

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720668#action_12720668 ] 

Devaraj Das commented on HADOOP-6052:
-------------------------------------

Sorry for commenting so late on this one. Shouldn't the check for (startChar < 0) happen before endChar is evaluated? If startChar < 0, the field is missing and evaluating endChar is redundant.
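
To illustrate the ordering suggested above, here is a minimal, self-contained sketch. It is not the code from the attached patches: the class name and the helpers fieldStart, fieldEnd, and hashFields are hypothetical stand-ins, and the 31-based rolling hash is only an assumption chosen for illustration. The point is that the missing-field check on the start offset comes first, so the end offset is never computed (or used as an index) for a field the record does not contain.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch; names and hash function are assumptions, not Hadoop's API.
final class FieldHashSketch {

    // Byte offset where the 1-based field n starts, or -1 if the key has fewer fields.
    static int fieldStart(byte[] key, byte sep, int n) {
        int field = 1;
        int start = 0;
        for (int i = 0; i < key.length && field < n; i++) {
            if (key[i] == sep) {
                field++;
                start = i + 1;
            }
        }
        return field == n ? start : -1;
    }

    // Inclusive offset of the last byte of the field that begins at start.
    static int fieldEnd(byte[] key, byte sep, int start) {
        int i = start;
        while (i < key.length && key[i] != sep) {
            i++;
        }
        return i - 1;
    }

    // Folds bytes b[start..end] (inclusive) into the running hash h.
    static int hash(byte[] b, int start, int end, int h) {
        for (int i = start; i <= end; i++) {
            h = 31 * h + b[i];
        }
        return h;
    }

    // Hashes the requested fields, skipping any field the key does not contain.
    static int hashFields(String key, char sep, int... fields) {
        byte[] bytes = key.getBytes(StandardCharsets.UTF_8);
        int h = 0;
        for (int n : fields) {
            int start = fieldStart(bytes, (byte) sep, n);
            if (start < 0) {
                continue; // field missing: skip before the end offset is evaluated
            }
            int end = fieldEnd(bytes, (byte) sep, start);
            h = hash(bytes, start, end, h);
        }
        return h;
    }
}
```

With this ordering, a record that lacks the requested field contributes nothing to the hash and is still assigned a partition, instead of failing with an out-of-bounds index.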



[jira] Commented: (HADOOP-6052) KeyFieldBasedPartitioner would lost data if specifed field not exist

Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720541#action_12720541 ] 

Amar Kamat commented on HADOOP-6052:
------------------------------------

The following tests failed:
||Name||Type||Result||Resolution||
|org.apache.hadoop.mapred.TestReduceFetch|FAILED|Rerun also failed|HADOOP-6029|
|org.apache.hadoop.mapred.TestRunningTaskLimits|FAILED|Rerun passed|?|
|org.apache.hadoop.mapred.TestTaskLimits|FAILED (timeout)|Rerun also failed|HADOOP-5993/HADOOP-6061|


Looking at TestRunningTaskLimits, I see the following code:
{code}

    JobConf jobConf = createWaitJobConf(mr, "job1", 20, 20);
    jobConf.setRunningMapLimit(5);
    jobConf.setRunningReduceLimit(3);
    
    // Submit the job
    RunningJob rJob = (new JobClient(jobConf)).submitJob(jobConf);
    
    // Wait 20 seconds for it to start up
    UtilsForTests.waitFor(20000);
    
    // Check the number of running tasks
    JobTracker jobTracker = mr.getJobTrackerRunner().getJobTracker();
    JobInProgress jip = jobTracker.getJob(rJob.getID());
    assertEquals(5, jip.runningMaps());
    assertEquals(3, jip.runningReduces());
{code}
I don't think waiting a fixed 20 seconds is a good thing to do; the tasks may not all be scheduled by then. In the logs for the failed run, I see that only one reducer had been scheduled.
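
One possible alternative to a fixed sleep is to poll for the expected state with an upper time bound. The sketch below is only an illustration under stated assumptions: PollingWait and waitUntil are hypothetical names (not part of UtilsForTests or any Hadoop class), and it uses java.util.function.BooleanSupplier, which requires Java 8 or later.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper; not an existing Hadoop test utility.
final class PollingWait {

    // Re-checks the condition every intervalMs until it holds or timeoutMs elapses.
    // Returns true if the condition was observed, false on timeout or interruption.
    static boolean waitUntil(BooleanSupplier condition, long timeoutMs, long intervalMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(intervalMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return false;
            }
        }
    }
}
```

In the test quoted above, the fixed wait could then become something like assertTrue(PollingWait.waitUntil(() -> jip.runningMaps() == 5 && jip.runningReduces() == 3, 60000, 500)), where the 60-second bound and 500 ms interval are arbitrary illustrative values. A slow scheduler then only makes the test take longer, up to the bound, rather than fail.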

Contrib tests passed except:
||Name||Type||Result||Resolution||
|org.apache.hadoop.streaming.TestStreamingExitStatus|FAILED|Known issue|HADOOP-5906|
|org.apache.hadoop.streaming.TestStreamingStderr|FAILED (timeout)|Known issue|HADOOP-6062|
|org.apache.hadoop.mapred.TestCapacitySchedulerConf|FAILED|Second run passed after deleting capacity-scheduler.xml from conf|?|





[jira] Commented: (HADOOP-6052) KeyFieldBasedPartitioner would lost data if specifed field not exist

Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12721167#action_12721167 ] 

Amar Kamat commented on HADOOP-6052:
------------------------------------

Result of test-patch:
     [exec] +1 overall.
     [exec] 
     [exec]     +1 @author.  The patch does not contain any @author tags.
     [exec] 
     [exec]     +1 tests included.  The patch appears to include 3 new or modified tests.
     [exec] 
     [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
     [exec] 
     [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
     [exec] 
     [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
     [exec] 
     [exec]     +1 Eclipse classpath. The patch retains Eclipse classpath integrity.
     [exec] 
     [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.


Running ant test now.



[jira] Updated: (HADOOP-6052) KeyFieldBasedPartitioner would lost data if specifed field not exist

Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-6052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amar Kamat updated HADOOP-6052:
-------------------------------

    Attachment: HADOOP-6052-v1.0.patch

Attaching a fix that incorporates Jothi's comments from HADOOP-5779. Result of test-patch:
     [exec] +1 overall.
     [exec] 
     [exec]     +1 @author.  The patch does not contain any @author tags.
     [exec] 
     [exec]     +1 tests included.  The patch appears to include 3 new or modified tests.
     [exec] 
     [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
     [exec] 
     [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
     [exec] 
     [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
     [exec] 
     [exec]     +1 Eclipse classpath. The patch retains Eclipse classpath integrity.
     [exec] 
     [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.

Running ant test now.



[jira] Updated: (HADOOP-6052) KeyFieldBasedPartitioner would lost data if specifed field not exist

Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-6052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amar Kamat updated HADOOP-6052:
-------------------------------

    Attachment: HADOOP-6052-v1.0-branch0.20.patch

Attaching a patch for branch 0.20.
