Posted to mapreduce-issues@hadoop.apache.org by "Denny Ye (JIRA)" <ji...@apache.org> on 2011/03/15 05:42:29 UTC

[jira] Created: (MAPREDUCE-2384) Can MR make error response Immediately?

Can MR make error response Immediately?
---------------------------------------

                 Key: MAPREDUCE-2384
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2384
             Project: Hadoop Map/Reduce
          Issue Type: Improvement
          Components: job submission
    Affects Versions: 0.21.0
            Reporter: Denny Ye


While reading the MapReduce source code in Hadoop 0.21.0, I was sometimes confused about when errors are reported back to the client. For example:
        1. JobSubmitter output check. MapReduce requires that a job's output location must not already exist, to avoid overwriting data by mistake. In my opinion, MR should verify the output at the point where the client submits the job. In fact, it first copies the job's files to the target location and only then performs the check.
        2. JobTracker. Once a job has been submitted to the JobTracker, the JT first creates the "JIT" object (presumably JobInProgress), which is very "huge". Only in the next step does the JT verify the job queue authorization and memory requirements.
 
        Normally, client input should be verified first, with an error returned immediately if anything is wrong. The regular logic should run only after all inputs have passed validation.
        The current ordering in this code does not make sense to me. Is this only my personal opinion? I hope someone can help explain the details. Thanks!
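
The ordering suggested in point 1 can be sketched in plain Java as follows. This is only a minimal illustration under stated assumptions, not Hadoop's actual JobSubmitter: the class FailFastSubmitDemo, its submit method, and the job.xml file name are hypothetical, and the local filesystem stands in for the JobTracker's filesystem.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical stand-in for a job submitter; this is not Hadoop's JobSubmitter. */
public class FailFastSubmitDemo {

    /** Validates the output location first, then stages files; returns the staged file. */
    public static Path submit(Path outputDir, Path stagingDir) throws IOException {
        // 1. Cheap validation: the output directory must not exist yet.
        if (Files.exists(outputDir)) {
            throw new IllegalStateException("Output directory already exists: " + outputDir);
        }
        // 2. Only now do the comparatively expensive work of staging job files.
        Files.createDirectories(stagingDir);
        byte[] conf = "<configuration/>".getBytes(StandardCharsets.UTF_8);
        return Files.write(stagingDir.resolve("job.xml"), conf);
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("failfast");
        Path existingOutput = Files.createDirectory(root.resolve("out"));
        Path staging = root.resolve("staging");
        try {
            submit(existingOutput, staging);
        } catch (IllegalStateException e) {
            System.out.println("Rejected early: " + e.getMessage());
        }
        // Nothing was staged, because validation ran before any file was written.
        System.out.println("Staging dir created? " + Files.exists(staging));
    }
}
```

With this ordering, a job whose output directory already exists is rejected before anything is written to the staging area, so no cleanup is needed after the failure.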

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (MAPREDUCE-2384) Can MR make error response Immediately?

Posted by "Harsh J (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-2384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harsh J updated MAPREDUCE-2384:
-------------------------------

    Status: Patch Available  (was: Open)
    
> Can MR make error response Immediately?
> ---------------------------------------
>
>                 Key: MAPREDUCE-2384
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2384
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: job submission
>    Affects Versions: 0.21.0
>            Reporter: Denny Ye
>            Assignee: Harsh J
>             Fix For: 0.24.0
>
>         Attachments: MAPREDUCE-2384.r1.diff, MAPREDUCE-2384.r2.diff, MAPREDUCE-2384.r3.diff

[jira] [Commented] (MAPREDUCE-2384) Can MR make error response Immediately?

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-2384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037456#comment-13037456 ] 

Hadoop QA commented on MAPREDUCE-2384:
--------------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12480002/MAPREDUCE-2384.r1.diff
  against trunk revision 1125599.

    +1 @author.  The patch does not contain any @author tags.

    -1 tests included.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

    +1 system test framework.  The patch passed system test framework compile.

Test results: https://builds.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/289//testReport/
Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/289//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/289//console

This message is automatically generated.

> Can MR make error response Immediately?
> ---------------------------------------
>
>                 Key: MAPREDUCE-2384
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2384
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: job submission
>    Affects Versions: 0.21.0
>            Reporter: Denny Ye
>            Assignee: Harsh J Chouraria
>             Fix For: 0.23.0
>
>         Attachments: MAPREDUCE-2384.r1.diff

[jira] [Commented] (MAPREDUCE-2384) Can MR make error response Immediately?

Posted by "Harsh J (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-2384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13098776#comment-13098776 ] 

Harsh J commented on MAPREDUCE-2384:
------------------------------------

Hm, where does MR2's job submitter lie, in that case?

I have yet to get fully up to speed on MR2, but my understanding was that the code under {{hadoop-mapreduce-client/hadoop-mapreduce-client-core/}} is still maintained and used, and that was the reason for my last update here.

We can close this as not-a-problem if the default Hadoop job submitter doesn't require such checks, or already performs them. I think this one was tied to how the API works, so I am curious to know where the changes lie :)

> Can MR make error response Immediately?
> ---------------------------------------
>
>                 Key: MAPREDUCE-2384
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2384
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: job submission
>    Affects Versions: 0.21.0
>            Reporter: Denny Ye
>            Assignee: Harsh J
>             Fix For: 0.23.0
>
>         Attachments: MAPREDUCE-2384.r1.diff, MAPREDUCE-2384.r2.diff, MAPREDUCE-2384.r3.diff

[jira] [Updated] (MAPREDUCE-2384) Can MR make error response Immediately?

Posted by "Harsh J (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-2384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harsh J updated MAPREDUCE-2384:
-------------------------------

    Attachment: MAPREDUCE-2384.r2.diff

I figured I could definitely write a test for this as well. Having looked through the rest of the client-side submit process, I am quite positive this change will not affect anything else; the original ordering is just an accidentally misplaced method call.

{code}
    [junit] Running org.apache.hadoop.mapreduce.TestMRJobClient
    [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 66.872 sec
{code}

Changes:
- Same code patch with a test case method added to {{TestMRJobClient}}, which passes locally.
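
A regression check of this kind can be sketched in plain Java as follows. This is a hedged illustration only, not the test method actually added to {{TestMRJobClient}}: the class SubmitOrderRegressionDemo, its methods, the validateFirst flag, and the job.jar file name are all hypothetical, and the local filesystem stands in for the cluster filesystem.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical model of the regression check; this is not the real TestMRJobClient. */
public class SubmitOrderRegressionDemo {

    /** Toy submitter whose step order can be flipped to mimic the old behaviour. */
    public static void submit(Path outputDir, Path stagingDir, boolean validateFirst)
            throws IOException {
        if (validateFirst) {
            checkOutput(outputDir);
            stage(stagingDir);
        } else {
            stage(stagingDir);      // old order: files are written first
            checkOutput(outputDir); // and the job is only rejected afterwards
        }
    }

    static void checkOutput(Path outputDir) {
        if (Files.exists(outputDir)) {
            throw new IllegalStateException("Output directory already exists: " + outputDir);
        }
    }

    static void stage(Path stagingDir) throws IOException {
        Files.createDirectories(stagingDir);
        Files.createFile(stagingDir.resolve("job.jar"));
    }

    /** Returns true if a rejected submission left nothing in the staging area. */
    public static boolean rejectedJobLeavesNoFiles(boolean validateFirst) throws IOException {
        Path root = Files.createTempDirectory("submit-order");
        Path existingOutput = Files.createDirectory(root.resolve("out"));
        Path staging = root.resolve("staging");
        try {
            submit(existingOutput, staging, validateFirst);
            return false; // the submission should have been rejected
        } catch (IllegalStateException expected) {
            return !Files.exists(staging);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("validate-first passes check: " + rejectedJobLeavesNoFiles(true));
        System.out.println("stage-first passes check:    " + rejectedJobLeavesNoFiles(false));
    }
}
```

The check passes for the validate-first order and fails for the stage-first order, which is the property such a test needs in order to catch a later regression in the submit path.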

> Can MR make error response Immediately?
> ---------------------------------------
>
>                 Key: MAPREDUCE-2384
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2384
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: job submission
>    Affects Versions: 0.21.0
>            Reporter: Denny Ye
>            Assignee: Harsh J
>             Fix For: 0.23.0
>
>         Attachments: MAPREDUCE-2384.r1.diff, MAPREDUCE-2384.r2.diff

[jira] [Updated] (MAPREDUCE-2384) The job submitter should make sure to validate jobs before creation of necessary files

Posted by "Harsh J (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-2384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harsh J updated MAPREDUCE-2384:
-------------------------------

    Release Note:   (was: Submitter should fail on errors early, before transferring files.)
    
> The job submitter should make sure to validate jobs before creation of necessary files
> --------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-2384
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2384
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: job submission
>    Affects Versions: 0.21.0
>            Reporter: Denny Ye
>            Assignee: Harsh J
>         Attachments: MAPREDUCE-2384.r1.diff, MAPREDUCE-2384.r2.diff, MAPREDUCE-2384.r3.diff, MAPREDUCE-2384.r4.diff
>
>
> In 0.20.x/1.x, 0.21, and 0.22, the MapReduce job submitter writes some job-necessary files to the JT FS before checking output specs or other job validation items. This appears unnecessary.
> This has since been silently fixed in the rewrite of the MRApp (called MRv2) in the MAPREDUCE-279 dump that has now replaced the older MR (now MRv1). However, we can still do with a test case to prevent regressing again.
> Original description below:
> {quote}
> While reading the MapReduce source code in Hadoop 0.21.0, I was sometimes confused about when errors are reported back to the client. For example:
>         1. JobSubmitter output check. MapReduce requires that a job's output location must not already exist, to avoid overwriting data by mistake. In my opinion, MR should verify the output at the point where the client submits the job. In fact, it first copies the job's files to the target location and only then performs the check.
>         2. JobTracker. Once a job has been submitted to the JobTracker, the JT first creates the "JIT" object (presumably JobInProgress), which is very "huge". Only in the next step does the JT verify the job queue authorization and memory requirements.
>  
>         Normally, client input should be verified first, with an error returned immediately if anything is wrong. The regular logic should run only after all inputs have passed validation.
>         The current ordering in this code does not make sense to me. Is this only my personal opinion? I hope someone can help explain the details. Thanks!
> {quote}


[jira] [Commented] (MAPREDUCE-2384) The job submitter should make sure to validate jobs before creation of necessary files

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-2384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284825#comment-13284825 ] 

Hudson commented on MAPREDUCE-2384:
-----------------------------------

Integrated in Hadoop-Mapreduce-trunk #1094 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1094/])
    MAPREDUCE-2384. The job submitter should make sure to validate jobs before creation of necessary files. (harsh) (Revision 1343240)

     Result = SUCCESS
harsh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1343240
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/TestMRJobClient.java

                
> The job submitter should make sure to validate jobs before creation of necessary files
> --------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-2384
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2384
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: job submission, test
>    Affects Versions: 0.21.0
>            Reporter: Denny Ye
>            Assignee: Harsh J
>             Fix For: 3.0.0
>
>         Attachments: MAPREDUCE-2384.r1.diff, MAPREDUCE-2384.r2.diff, MAPREDUCE-2384.r3.diff, MAPREDUCE-2384.r4.diff

[jira] [Updated] (MAPREDUCE-2384) Can MR make error response Immediately?

Posted by "Harsh J Chouraria (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-2384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harsh J Chouraria updated MAPREDUCE-2384:
-----------------------------------------

    Fix Version/s: 0.23.0
     Release Note: Submitter should fail on errors early, before transferring files.
           Status: Patch Available  (was: Open)

As before, I do not think refactoring (2) is a good idea maintenance-wise. Here's a patch for just the reordering of (1). Some simple job submissions pass with the change. I believe existing test cases already cover this change, but let me know if not.

> Can MR make error response Immediately?
> ---------------------------------------
>
>                 Key: MAPREDUCE-2384
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2384
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: job submission
>    Affects Versions: 0.21.0
>            Reporter: Denny Ye
>            Assignee: Harsh J Chouraria
>             Fix For: 0.23.0
>
>         Attachments: MAPREDUCE-2384.r1.diff

[jira] [Commented] (MAPREDUCE-2384) Can MR make error response Immediately?

Posted by "Harsh J (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-2384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283176#comment-13283176 ] 

Harsh J commented on MAPREDUCE-2384:
------------------------------------

Since this is trivial, I'm going ahead with the commit by Monday unless someone raises an objection. The patch just adds a test that compiles fine, passes for now, and can help prevent regressions of this nature in future refactoring of the submit path. I validated it by running it against 1.x, where the behavior is indeed broken, though not critical in any way.

Thanks,
Harsh
                
> Can MR make error response Immediately?
> ---------------------------------------
>
>                 Key: MAPREDUCE-2384
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2384
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: job submission
>    Affects Versions: 0.21.0
>            Reporter: Denny Ye
>            Assignee: Harsh J
>         Attachments: MAPREDUCE-2384.r1.diff, MAPREDUCE-2384.r2.diff, MAPREDUCE-2384.r3.diff, MAPREDUCE-2384.r4.diff

[jira] [Assigned] (MAPREDUCE-2384) Can MR make error response Immediately?

Posted by "Harsh J Chouraria (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-2384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harsh J Chouraria reassigned MAPREDUCE-2384:
--------------------------------------------

    Assignee: Harsh J Chouraria

> Can MR make error response Immediately?
> ---------------------------------------
>
>                 Key: MAPREDUCE-2384
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2384
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: job submission
>    Affects Versions: 0.21.0
>            Reporter: Denny Ye
>            Assignee: Harsh J Chouraria

[jira] [Updated] (MAPREDUCE-2384) Can MR make error response Immediately?

Posted by "Harsh J (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-2384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harsh J updated MAPREDUCE-2384:
-------------------------------

    Attachment: MAPREDUCE-2384.r4.diff

New patch that just adds in a test to prevent regressions.

Test passes on the default trunk JobSubmitter logic.
                
> Can MR make error response Immediately?
> ---------------------------------------
>
>                 Key: MAPREDUCE-2384
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2384
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: job submission
>    Affects Versions: 0.21.0
>            Reporter: Denny Ye
>            Assignee: Harsh J
>         Attachments: MAPREDUCE-2384.r1.diff, MAPREDUCE-2384.r2.diff, MAPREDUCE-2384.r3.diff, MAPREDUCE-2384.r4.diff

[jira] [Commented] (MAPREDUCE-2384) Can MR make error response Immediately?

Posted by "Harsh J Chouraria (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-2384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13039150#comment-13039150 ] 

Harsh J Chouraria commented on MAPREDUCE-2384:
----------------------------------------------

Justification: Existing test cases already cover submissions. The change does not require a new one, IMO.

> Can MR make error response Immediately?
> ---------------------------------------
>
>                 Key: MAPREDUCE-2384
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2384
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: job submission
>    Affects Versions: 0.21.0
>            Reporter: Denny Ye
>            Assignee: Harsh J Chouraria
>             Fix For: 0.23.0
>
>         Attachments: MAPREDUCE-2384.r1.diff

[jira] [Commented] (MAPREDUCE-2384) Can MR make error response Immediately?

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-2384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13096623#comment-13096623 ] 

Hadoop QA commented on MAPREDUCE-2384:
--------------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12492888/MAPREDUCE-2384.r3.diff
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 3 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    -1 findbugs.  The patch appears to introduce 8 new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed unit tests in hadoop-mapreduce-project.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/597//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/597//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-mapreduce-client-hs.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/597//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-mapreduce-client-shuffle.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/597//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-mapreduce-client-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/597//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-mapreduce-client-app.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/597//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-mapreduce-client-jobclient.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/597//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-mapreduce-client-core.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/597//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/597//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/597//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-nodemanager.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/597//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-yarn-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/597//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-yarn-api.html
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/597//console

This message is automatically generated.

> Can MR make error response Immediately?
> ---------------------------------------
>
>                 Key: MAPREDUCE-2384
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2384
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: job submission
>    Affects Versions: 0.21.0
>            Reporter: Denny Ye
>            Assignee: Harsh J
>             Fix For: 0.23.0
>
>         Attachments: MAPREDUCE-2384.r1.diff, MAPREDUCE-2384.r2.diff, MAPREDUCE-2384.r3.diff
>
>

[jira] [Updated] (MAPREDUCE-2384) Can MR make error response Immediately?

Posted by "Harsh J (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-2384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harsh J updated MAPREDUCE-2384:
-------------------------------

    Attachment: MAPREDUCE-2384.r2.diff

> Can MR make error response Immediately?
> ---------------------------------------
>
>                 Key: MAPREDUCE-2384
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2384
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: job submission
>    Affects Versions: 0.21.0
>            Reporter: Denny Ye
>            Assignee: Harsh J
>             Fix For: 0.23.0
>
>         Attachments: MAPREDUCE-2384.r1.diff, MAPREDUCE-2384.r2.diff
>
>

[jira] [Updated] (MAPREDUCE-2384) Can MR make error response Immediately?

Posted by "Harsh J Chouraria (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-2384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harsh J Chouraria updated MAPREDUCE-2384:
-----------------------------------------

    Attachment: MAPREDUCE-2384.r1.diff

> Can MR make error response Immediately?
> ---------------------------------------
>
>                 Key: MAPREDUCE-2384
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2384
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: job submission
>    Affects Versions: 0.21.0
>            Reporter: Denny Ye
>            Assignee: Harsh J Chouraria
>         Attachments: MAPREDUCE-2384.r1.diff
>
>

[jira] [Commented] (MAPREDUCE-2384) Can MR make error response Immediately?

Posted by "Harsh J (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-2384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13188055#comment-13188055 ] 

Harsh J commented on MAPREDUCE-2384:
------------------------------------

Any update on this? It's harmless and saves a job ID.

Arun, if you are confident this is not required on MR2 at all, please move it to an appropriate resolved state.
                
> Can MR make error response Immediately?
> ---------------------------------------
>
>                 Key: MAPREDUCE-2384
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2384
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: job submission
>    Affects Versions: 0.21.0
>            Reporter: Denny Ye
>            Assignee: Harsh J
>             Fix For: 0.24.0
>
>         Attachments: MAPREDUCE-2384.r1.diff, MAPREDUCE-2384.r2.diff, MAPREDUCE-2384.r3.diff
>
>

[jira] [Updated] (MAPREDUCE-2384) Can MR make error response Immediately?

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-2384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy updated MAPREDUCE-2384:
-------------------------------------

    Status: Open  (was: Patch Available)

Sorry to come in late.

The patch has gone stale.

Given that this is not an issue with MRv2, should we still commit this? I'm happy to, but I'm not sure it's useful. Thanks.

> Can MR make error response Immediately?
> ---------------------------------------
>
>                 Key: MAPREDUCE-2384
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2384
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: job submission
>    Affects Versions: 0.21.0
>            Reporter: Denny Ye
>            Assignee: Harsh J
>             Fix For: 0.23.0
>
>         Attachments: MAPREDUCE-2384.r1.diff, MAPREDUCE-2384.r2.diff, MAPREDUCE-2384.r3.diff
>
>

[jira] [Updated] (MAPREDUCE-2384) Can MR make error response Immediately?

Posted by "Harsh J (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-2384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harsh J updated MAPREDUCE-2384:
-------------------------------

       Fix Version/s:     (was: 0.23.0)
    Target Version/s: 3.0.0
              Status: Patch Available  (was: Open)
    
> Can MR make error response Immediately?
> ---------------------------------------
>
>                 Key: MAPREDUCE-2384
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2384
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: job submission
>    Affects Versions: 0.21.0
>            Reporter: Denny Ye
>            Assignee: Harsh J
>         Attachments: MAPREDUCE-2384.r1.diff, MAPREDUCE-2384.r2.diff, MAPREDUCE-2384.r3.diff, MAPREDUCE-2384.r4.diff
>
>

[jira] [Commented] (MAPREDUCE-2384) The job submitter should make sure to validate jobs before creation of necessary files

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-2384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284747#comment-13284747 ] 

Hudson commented on MAPREDUCE-2384:
-----------------------------------

Integrated in Hadoop-Hdfs-trunk #1060 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1060/])
    MAPREDUCE-2384. The job submitter should make sure to validate jobs before creation of necessary files. (harsh) (Revision 1343240)

     Result = SUCCESS
harsh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1343240
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/TestMRJobClient.java

                
> The job submitter should make sure to validate jobs before creation of necessary files
> --------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-2384
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2384
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: job submission, test
>    Affects Versions: 0.21.0
>            Reporter: Denny Ye
>            Assignee: Harsh J
>             Fix For: 3.0.0
>
>         Attachments: MAPREDUCE-2384.r1.diff, MAPREDUCE-2384.r2.diff, MAPREDUCE-2384.r3.diff, MAPREDUCE-2384.r4.diff
>
>
> In 0.20.x/1.x, 0.21, and 0.22, the MapReduce job submitter writes some job-necessary files to the JT FS before checking output specs or other job validation items. This appears unnecessary.
> This has since been silently fixed in the rewrite of the MRApp (called MRv2) in the MAPREDUCE-279 dump that has now replaced the older MR (now MRv1). However, we can still do with a test case to prevent regressing again.
> Original description below:
> {quote}
> When I read the source code of MapReduce in Hadoop 0.21.0, the error-response behavior sometimes confused me. For example:
>         1. JobSubmitter checks the output for each job. MapReduce requires that a job's output must not already exist, to avoid accidental overwrites. In my opinion, MR should verify the output at the point of client submission. In fact, it first copies the related files to the specified target and only then does the verification.
>         2. JobTracker. After a job has been submitted to the JobTracker, the JT first creates a JIT object, which is very "huge". Only in the next step does the JT verify job queue authority and memory requirements.
>  
>         Normally, client input is verified first and an error response is returned immediately if anything is at fault. Regular logic runs only once all the inputs have passed.
>         This ordering in the code does not make sense to me. Is it only my personal opinion? I hope someone can help explain the details. Thanks!
> {quote}
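
The ordering the description above asks for, validating a job before writing any of its files, can be sketched roughly as follows. This is a minimal illustration in Python and not Hadoop code: `submit_job`, `check_output_spec`, `InvalidJobError`, and the staging layout are hypothetical names made up for this example, and a local temporary directory stands in for the JT file system.

```python
import tempfile
from pathlib import Path


class InvalidJobError(Exception):
    """Raised when a job fails validation before submission (hypothetical)."""


def check_output_spec(output_dir: Path) -> None:
    # Hypothetical stand-in for an output-spec check:
    # the job's output directory must not already exist.
    if output_dir.exists():
        raise InvalidJobError(f"output directory already exists: {output_dir}")


def submit_job(staging_dir: Path, output_dir: Path, job_files: dict) -> Path:
    # 1. Validate first, so an invalid job is rejected without side effects.
    check_output_spec(output_dir)

    # 2. Only after validation passes, write the job files to staging.
    staging_dir.mkdir(parents=True, exist_ok=True)
    for name, content in job_files.items():
        (staging_dir / name).write_text(content)
    return staging_dir


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        existing_output = root / "out"
        existing_output.mkdir()
        try:
            submit_job(root / "staging", existing_output, {"job.xml": "<configuration/>"})
        except InvalidJobError as err:
            print("rejected:", err)
        # The staging directory was never created, because validation
        # ran before any file was written.
        print("staging created:", (root / "staging").exists())
```

With the check placed after the file writes instead, the rejected job would leave staging files behind, which is the behavior the issue reports for the older submitter.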

[jira] [Commented] (MAPREDUCE-2384) Can MR make error response Immediately?

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-2384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13274000#comment-13274000 ] 

Hadoop QA commented on MAPREDUCE-2384:
--------------------------------------

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12526630/MAPREDUCE-2384.r4.diff
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 1 new or modified test files.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 eclipse:eclipse.  The patch built with eclipse:eclipse.

    +1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2383//testReport/
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2383//console

This message is automatically generated.
                
> Can MR make error response Immediately?
> ---------------------------------------
>
>                 Key: MAPREDUCE-2384
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2384
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: job submission
>    Affects Versions: 0.21.0
>            Reporter: Denny Ye
>            Assignee: Harsh J
>         Attachments: MAPREDUCE-2384.r1.diff, MAPREDUCE-2384.r2.diff, MAPREDUCE-2384.r3.diff, MAPREDUCE-2384.r4.diff
>
>

[jira] [Commented] (MAPREDUCE-2384) The job submitter should make sure to validate jobs before creation of necessary files

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-2384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284408#comment-13284408 ] 

Hudson commented on MAPREDUCE-2384:
-----------------------------------

Integrated in Hadoop-Mapreduce-trunk-Commit #2309 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2309/])
    MAPREDUCE-2384. The job submitter should make sure to validate jobs before creation of necessary files. (harsh) (Revision 1343240)

     Result = FAILURE
harsh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1343240
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/TestMRJobClient.java

                

[jira] [Updated] (MAPREDUCE-2384) Can MR make error response Immediately?

Posted by "Harsh J (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-2384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harsh J updated MAPREDUCE-2384:
-------------------------------

    Fix Version/s:     (was: 0.24.0)
                   0.23.0
           Status: Open  (was: Patch Available)

Checked trunk sources and this is fixed via MAPREDUCE-279 on trunk.

Is the test case still required?
                
> Can MR make error response Immediately?
> ---------------------------------------
>
>                 Key: MAPREDUCE-2384
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2384
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: job submission
>    Affects Versions: 0.21.0
>            Reporter: Denny Ye
>            Assignee: Harsh J
>             Fix For: 0.23.0
>
>         Attachments: MAPREDUCE-2384.r1.diff, MAPREDUCE-2384.r2.diff, MAPREDUCE-2384.r3.diff
>
>

[jira] [Updated] (MAPREDUCE-2384) Can MR make error response Immediately?

Posted by "Harsh J (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-2384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harsh J updated MAPREDUCE-2384:
-------------------------------

    Attachment:     (was: MAPREDUCE-2384.r2.diff)

> Can MR make error response Immediately?
> ---------------------------------------
>
>                 Key: MAPREDUCE-2384
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2384
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: job submission
>    Affects Versions: 0.21.0
>            Reporter: Denny Ye
>            Assignee: Harsh J
>             Fix For: 0.23.0
>
>         Attachments: MAPREDUCE-2384.r1.diff, MAPREDUCE-2384.r2.diff
>
>

[jira] [Commented] (MAPREDUCE-2384) The job submitter should make sure to validate jobs before creation of necessary files

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-2384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284398#comment-13284398 ] 

Hudson commented on MAPREDUCE-2384:
-----------------------------------

Integrated in Hadoop-Common-trunk-Commit #2290 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2290/])
    MAPREDUCE-2384. The job submitter should make sure to validate jobs before creation of necessary files. (harsh) (Revision 1343240)

     Result = SUCCESS
harsh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1343240
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/TestMRJobClient.java

                

[jira] [Commented] (MAPREDUCE-2384) Can MR make error response Immediately?

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-2384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13039197#comment-13039197 ] 

Todd Lipcon commented on MAPREDUCE-2384:
----------------------------------------

This seems like a reasonable change to me, but I'd like someone more familiar with the job submission code path to review it before commit, just in case I'm missing a subtlety. Anyone out there?

> Can MR make error response Immediately?
> ---------------------------------------
>
>                 Key: MAPREDUCE-2384
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2384
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: job submission
>    Affects Versions: 0.21.0
>            Reporter: Denny Ye
>            Assignee: Harsh J Chouraria
>             Fix For: 0.23.0
>
>         Attachments: MAPREDUCE-2384.r1.diff


[jira] [Updated] (MAPREDUCE-2384) The job submitter should make sure to validate jobs before creation of necessary files

Posted by "Harsh J (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-2384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harsh J updated MAPREDUCE-2384:
-------------------------------

    Component/s: test
    
> The job submitter should make sure to validate jobs before creation of necessary files
> --------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-2384
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2384
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: job submission, test
>    Affects Versions: 0.21.0
>            Reporter: Denny Ye
>            Assignee: Harsh J
>             Fix For: 3.0.0
>
>         Attachments: MAPREDUCE-2384.r1.diff, MAPREDUCE-2384.r2.diff, MAPREDUCE-2384.r3.diff, MAPREDUCE-2384.r4.diff


[jira] [Updated] (MAPREDUCE-2384) The job submitter should make sure to validate jobs before creation of necessary files

Posted by "Harsh J (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-2384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harsh J updated MAPREDUCE-2384:
-------------------------------

                Tags: test
          Resolution: Fixed
       Fix Version/s: 3.0.0
    Target Version/s:   (was: 3.0.0)
              Status: Resolved  (was: Patch Available)

Committed the test addition as revision 1343240 to trunk after re-validating the patch via {{mvn clean install -Dtest=TestMRJobClient}} and ensuring that the test passed and the build completed properly.

The patch's QA result is in the comments above if anyone needs it.
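
For illustration only, the kind of regression check described above could be sketched as follows. This is a minimal, self-contained sketch that uses plain java.nio instead of Hadoop's APIs. The class ValidateFirstSubmitter, its methods, the IllegalStateException it throws, and the job.xml file name are all hypothetical and are not taken from the committed TestMRJobClient change.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical stand-in for a job submitter; this is not Hadoop's JobSubmitter.
final class ValidateFirstSubmitter {

    /** Validates the output location first, then writes job files to staging. */
    static void submit(Path stagingDir, Path outputDir) throws IOException {
        // Step 1: validation with no side effects.
        if (Files.exists(outputDir)) {
            throw new IllegalStateException("Output directory already exists: " + outputDir);
        }
        // Step 2: only a job that passed validation touches the staging area.
        Files.createDirectories(stagingDir);
        Files.write(stagingDir.resolve("job.xml"),
                "<configuration/>".getBytes(StandardCharsets.UTF_8));
    }

    /** Regression check: a rejected job must leave no staging directory behind. */
    static boolean rejectedJobLeavesNoFiles() {
        try {
            Path root = Files.createTempDirectory("submit-check");
            Path staging = root.resolve("staging");
            Path output = Files.createDirectory(root.resolve("out")); // already exists
            try {
                submit(staging, output);
                return false; // validation should have rejected the job
            } catch (IllegalStateException expected) {
                return !Files.exists(staging);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Sanity check: a valid job does get its files written. */
    static boolean validJobWritesFiles() {
        try {
            Path root = Files.createTempDirectory("submit-check");
            Path staging = root.resolve("staging");
            submit(staging, root.resolve("out")); // output does not exist yet
            return Files.exists(staging.resolve("job.xml"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The point of the check is the ordering: if the validation in step 1 were moved after step 2, rejectedJobLeavesNoFiles() would return false, which is the regression the issue wants a test to catch.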
                
> The job submitter should make sure to validate jobs before creation of necessary files
> --------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-2384
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2384
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: job submission
>    Affects Versions: 0.21.0
>            Reporter: Denny Ye
>            Assignee: Harsh J
>             Fix For: 3.0.0
>
>         Attachments: MAPREDUCE-2384.r1.diff, MAPREDUCE-2384.r2.diff, MAPREDUCE-2384.r3.diff, MAPREDUCE-2384.r4.diff


[jira] [Updated] (MAPREDUCE-2384) The job submitter should make sure to validate jobs before creation of necessary files

Posted by "Harsh J (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-2384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harsh J updated MAPREDUCE-2384:
-------------------------------

    Description: 
In 0.20.x/1.x, 0.21, and 0.22, the MapReduce job submitter writes some job-necessary files to the JT FS before checking output specs or other job validation items. This appears unnecessary.

This has since been silently fixed in the rewrite of the MRApp (called MRv2) in the MAPREDUCE-279 dump that has now replaced the older MR (now called MRv1). However, we can still use a test case to prevent a regression.

Original description below:

{quote}
When I read the source code of MapReduce in Hadoop 0.21.0, sometimes it made me confused about error response. For example:
        1. JobSubmitter checking output for each job. MapReduce makes rule to limit that each job output must be not exist to avoid fault overwrite. In my opinion, MR should verify output at the point of client submitting. Actually, it copies related files to specified target and then, doing the verifying. 
        2. JobTracker.   Job has been submitted to JobTracker. In first step, JT create JIT object that is very "huge" . Next step, JT start to verify job queue authority and memory requirements.
 
        In normal case, verifying client input then response immediately if any cases in fault. Regular logic can be performed if all the inputs have passed.  
        It seems like that those code does not make sense for understanding. Is only my personal opinion? Wish someone help me to explain the details. Thanks!
{quote}

  was:
When I read the source code of MapReduce in Hadoop 0.21.0, sometimes it made me confused about error response. For example:
        1. JobSubmitter checking output for each job. MapReduce makes rule to limit that each job output must be not exist to avoid fault overwrite. In my opinion, MR should verify output at the point of client submitting. Actually, it copies related files to specified target and then, doing the verifying. 
        2. JobTracker.   Job has been submitted to JobTracker. In first step, JT create JIT object that is very "huge" . Next step, JT start to verify job queue authority and memory requirements.
 
        In normal case, verifying client input then response immediately if any cases in fault. Regular logic can be performed if all the inputs have passed.  
        It seems like that those code does not make sense for understanding. Is only my personal opinion? Wish someone help me to explain the details. Thanks!

        Summary: The job submitter should make sure to validate jobs before creation of necessary files  (was: Can MR make error response Immediately?)
    
> The job submitter should make sure to validate jobs before creation of necessary files
> --------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-2384
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2384
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: job submission
>    Affects Versions: 0.21.0
>            Reporter: Denny Ye
>            Assignee: Harsh J
>         Attachments: MAPREDUCE-2384.r1.diff, MAPREDUCE-2384.r2.diff, MAPREDUCE-2384.r3.diff, MAPREDUCE-2384.r4.diff


[jira] [Commented] (MAPREDUCE-2384) Can MR make error response Immediately?

Posted by "Harsh J Chouraria (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-2384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13030471#comment-13030471 ] 

Harsh J Chouraria commented on MAPREDUCE-2384:
----------------------------------------------

1. is an easy one to fix (basically, move the job spec check a step up). I have a patch for this in the pipeline.

2. as per the OP, is to not build the JIP until after the config checks. I think it is alright to keep it the way it is now, since checking before constructing the JIP would still require one extra lookup (the config props that are to be checked are also used elsewhere later). Besides, it's easier to read/maintain with the JIP methods, and I don't think the construction (a few property loads, some array declarations) would take much time.

What are your thoughts on 2.? Would we benefit enough to refactor the parts to not use the JIP (and construct it only after validity is verified)?
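
For readers weighing point 2, one possible shape of such a refactor is sketched below. It is only a sketch under stated assumptions: ValidateBeforeConstruct, HeavyJobState, the property keys "queue" and "memory.mb", and the limits are hypothetical and do not correspond to the JobTracker or JIP code. The validated values are read once and passed to the constructor, which is one way to avoid the extra lookup mentioned above.

```java
import java.util.Map;
import java.util.Set;

// Hypothetical names throughout; this is not the JobTracker or JIP API.
final class ValidateBeforeConstruct {

    /** Counts constructions so the ordering can be observed from outside. */
    static int heavyObjectsBuilt = 0;

    /** Stand-in for an expensive per-job object. */
    static final class HeavyJobState {
        final String queue;
        final int memoryMb;

        HeavyJobState(String queue, int memoryMb) {
            heavyObjectsBuilt++;
            this.queue = queue;
            this.memoryMb = memoryMb;
        }
    }

    /** Validates the submitted properties, then builds the state object. */
    static HeavyJobState accept(Map<String, String> conf,
                                Set<String> allowedQueues,
                                int maxMemoryMb) {
        // Read each property once; the same values feed the checks and the constructor.
        String queue = conf.getOrDefault("queue", "default");
        int memoryMb = Integer.parseInt(conf.getOrDefault("memory.mb", "1024"));

        if (!allowedQueues.contains(queue)) {
            throw new IllegalArgumentException("Queue not permitted: " + queue);
        }
        if (memoryMb > maxMemoryMb) {
            throw new IllegalArgumentException("Memory request too large: " + memoryMb);
        }
        // Construction happens only after every check has passed.
        return new HeavyJobState(queue, memoryMb);
    }
}
```

Whether this is worth the loss of the JIP helper methods is the trade-off raised in the comment; the sketch only shows that the ordering itself is simple to express.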

> Can MR make error response Immediately?
> ---------------------------------------
>
>                 Key: MAPREDUCE-2384
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2384
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: job submission
>    Affects Versions: 0.21.0
>            Reporter: Denny Ye
>            Assignee: Harsh J Chouraria


[jira] [Commented] (MAPREDUCE-2384) Can MR make error response Immediately?

Posted by "Harsh J (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-2384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073358#comment-13073358 ] 

Harsh J commented on MAPREDUCE-2384:
------------------------------------

FWIW, the only place this change could interfere is that OutputFormat#checkOutputSpecs() can no longer find distributed cache files on the JT FS. I don't think that really matters, since you can directly access files on HDFS/LocalFS as a workaround (as is the case with how the DC is loaded).

> Can MR make error response Immediately?
> ---------------------------------------
>
>                 Key: MAPREDUCE-2384
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2384
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: job submission
>    Affects Versions: 0.21.0
>            Reporter: Denny Ye
>            Assignee: Harsh J
>             Fix For: 0.23.0
>
>         Attachments: MAPREDUCE-2384.r1.diff, MAPREDUCE-2384.r2.diff


[jira] [Commented] (MAPREDUCE-2384) The job submitter should make sure to validate jobs before creation of necessary files

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-2384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284397#comment-13284397 ] 

Hudson commented on MAPREDUCE-2384:
-----------------------------------

Integrated in Hadoop-Hdfs-trunk-Commit #2363 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2363/])
    MAPREDUCE-2384. The job submitter should make sure to validate jobs before creation of necessary files. (harsh) (Revision 1343240)

     Result = SUCCESS
harsh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1343240
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/TestMRJobClient.java

                
> The job submitter should make sure to validate jobs before creation of necessary files
> --------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-2384
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2384
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: job submission, test
>    Affects Versions: 0.21.0
>            Reporter: Denny Ye
>            Assignee: Harsh J
>             Fix For: 3.0.0
>
>         Attachments: MAPREDUCE-2384.r1.diff, MAPREDUCE-2384.r2.diff, MAPREDUCE-2384.r3.diff, MAPREDUCE-2384.r4.diff


[jira] [Updated] (MAPREDUCE-2384) Can MR make error response Immediately?

Posted by "Harsh J (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-2384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harsh J updated MAPREDUCE-2384:
-------------------------------

    Attachment: MAPREDUCE-2384.r3.diff

Previous patch dupe, updated for 0.23/trunk.

> Can MR make error response Immediately?
> ---------------------------------------
>
>                 Key: MAPREDUCE-2384
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2384
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: job submission
>    Affects Versions: 0.21.0
>            Reporter: Denny Ye
>            Assignee: Harsh J
>             Fix For: 0.23.0
>
>         Attachments: MAPREDUCE-2384.r1.diff, MAPREDUCE-2384.r2.diff, MAPREDUCE-2384.r3.diff
