You are viewing a plain text version of this content. The canonical link for it is here.

Posted to common-dev@hadoop.apache.org by "Owen O'Malley (JIRA)" <ji...@apache.org> on 2009/05/15 18:59:45 UTC

[jira] Created: (HADOOP-5850) map/reduce doesn't run jobs with 0 maps

map/reduce doesn't run jobs with 0 maps
---------------------------------------

                 Key: HADOOP-5850
                 URL: https://issues.apache.org/jira/browse/HADOOP-5850
             Project: Hadoop Core
          Issue Type: Bug
          Components: mapred
    Affects Versions: 0.20.0
            Reporter: Owen O'Malley
            Priority: Critical
             Fix For: 0.20.1


Currently, the framework ignores jobs that have 0 maps. This is incorrect. Many pipelines need the job to run (if nothing else, to create the output directory!) so that subsequent jobs don't fail. Effectively, there will be no map tasks and the reduce tasks should immediately set up the Reducer and RecordWriter and then call close on both since there are no inputs to the reduce. I believe it should just work if we remove the check...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HADOOP-5850) map/reduce doesn't run jobs with 0 maps

Posted by "Ramya R (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-5850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12711564#action_12711564 ] 

Ramya R commented on HADOOP-5850:
---------------------------------

The patch seems to introduce a new error. The taskdetails.jsp page of job setup and cleanup tasks throws a NullPointerException. This is the case with all the jobs' setup and cleanup tasks. Below is the Exception seen on the UI:
java.lang.NullPointerException
at org.apache.hadoop.mapred.TaskInProgress.getSplitNodes(TaskInProgress.java:1034)
at org.apache.hadoop.mapred.taskdetails_jsp._jspService(taskdetails_jsp.java:288)
at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:97)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:502)
at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:363)
at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:4

> map/reduce doesn't run jobs with 0 maps
> ---------------------------------------
>
>                 Key: HADOOP-5850
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5850
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.20.0
>            Reporter: Owen O'Malley
>            Assignee: Vinod K V
>            Priority: Critical
>             Fix For: 0.20.1
>
>         Attachments: HADOOP-5850-20090519.1.txt, HADOOP-5850-20090519.txt, HADOOP-5850-20090520-svn-branch-20.v2.txt, HADOOP-5850-20090520-svn.1.txt
>
>
> Currently, the framework ignores jobs that have 0 maps. This is incorrect. Many pipelines need the job to run (if nothing else, to create the output directory!) so that subsequent jobs don't fail. Effectively, there will be no map tasks and the reduce tasks should immediately set up the Reducer and RecordWriter and then call close on both since there are no inputs to the reduce. I believe it should just work if we remove the check...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (HADOOP-5850) map/reduce doesn't run jobs with 0 maps

Posted by "Vinod K V (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-5850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod K V updated HADOOP-5850:
------------------------------

    Attachment: HADOOP-5850-20090522-branch-20-final.txt

Patch for branch-20.

> map/reduce doesn't run jobs with 0 maps
> ---------------------------------------
>
>                 Key: HADOOP-5850
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5850
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.20.0
>            Reporter: Owen O'Malley
>            Assignee: Vinod K V
>            Priority: Critical
>             Fix For: 0.20.1
>
>         Attachments: HADOOP-5850-20090519.1.txt, HADOOP-5850-20090519.txt, HADOOP-5850-20090520-svn-branch-20.v2.txt, HADOOP-5850-20090520-svn.1.txt, HADOOP-5850-20090522-branch-20-final.txt, HADOOP-5850-20090522.txt, screenshot-1.jpg
>
>
> Currently, the framework ignores jobs that have 0 maps. This is incorrect. Many pipelines need the job to run (if nothing else, to create the output directory!) so that subsequent jobs don't fail. Effectively, there will be no map tasks and the reduce tasks should immediately set up the Reducer and RecordWriter and then call close on both since there are no inputs to the reduce. I believe it should just work if we remove the check...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HADOOP-5850) map/reduce doesn't run jobs with 0 maps

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-5850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12711898#action_12711898 ] 

Hadoop QA commented on HADOOP-5850:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12408651/HADOOP-5850-20090520-svn-branch-20.v2.txt
  against trunk revision 777330.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 11 new or modified tests.

    -1 patch.  The patch command could not apply the patch.

Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/374/console

This message is automatically generated.

> map/reduce doesn't run jobs with 0 maps
> ---------------------------------------
>
>                 Key: HADOOP-5850
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5850
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.20.0
>            Reporter: Owen O'Malley
>            Assignee: Vinod K V
>            Priority: Critical
>             Fix For: 0.20.1
>
>         Attachments: HADOOP-5850-20090519.1.txt, HADOOP-5850-20090519.txt, HADOOP-5850-20090520-svn-branch-20.v2.txt, HADOOP-5850-20090520-svn.1.txt
>
>
> Currently, the framework ignores jobs that have 0 maps. This is incorrect. Many pipelines need the job to run (if nothing else, to create the output directory!) so that subsequent jobs don't fail. Effectively, there will be no map tasks and the reduce tasks should immediately set up the Reducer and RecordWriter and then call close on both since there are no inputs to the reduce. I believe it should just work if we remove the check...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (HADOOP-5850) map/reduce doesn't run jobs with 0 maps

Posted by "Vinod K V (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-5850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod K V updated HADOOP-5850:
------------------------------

    Status: Open  (was: Patch Available)

> map/reduce doesn't run jobs with 0 maps
> ---------------------------------------
>
>                 Key: HADOOP-5850
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5850
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.20.0
>            Reporter: Owen O'Malley
>            Assignee: Vinod K V
>            Priority: Critical
>             Fix For: 0.20.1
>
>         Attachments: HADOOP-5850-20090519.1.txt, HADOOP-5850-20090519.txt, HADOOP-5850-20090520-svn.1.txt
>
>
> Currently, the framework ignores jobs that have 0 maps. This is incorrect. Many pipelines need the job to run (if nothing else, to create the output directory!) so that subsequent jobs don't fail. Effectively, there will be no map tasks and the reduce tasks should immediately set up the Reducer and RecordWriter and then call close on both since there are no inputs to the reduce. I believe it should just work if we remove the check...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (HADOOP-5850) map/reduce doesn't run jobs with 0 maps

Posted by "Vinod K V (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-5850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod K V updated HADOOP-5850:
------------------------------

    Attachment: HADOOP-5850-20090519.1.txt

Attaching new patch with the suggested changes. The test now fails without the core changes and succeeds with.

> map/reduce doesn't run jobs with 0 maps
> ---------------------------------------
>
>                 Key: HADOOP-5850
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5850
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.20.0
>            Reporter: Owen O'Malley
>            Assignee: Vinod K V
>            Priority: Critical
>             Fix For: 0.20.1
>
>         Attachments: HADOOP-5850-20090519.1.txt, HADOOP-5850-20090519.txt
>
>
> Currently, the framework ignores jobs that have 0 maps. This is incorrect. Many pipelines need the job to run (if nothing else, to create the output directory!) so that subsequent jobs don't fail. Effectively, there will be no map tasks and the reduce tasks should immediately set up the Reducer and RecordWriter and then call close on both since there are no inputs to the reduce. I believe it should just work if we remove the check...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (HADOOP-5850) map/reduce doesn't run jobs with 0 maps

Posted by "Vinod K V (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-5850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod K V updated HADOOP-5850:
------------------------------

    Attachment: HADOOP-5850-20090520-svn.1.txt

Attaching final patch. The previous patch had some problems because of which TestSetupAndCleanupFailure failed.

This patch fixes those problems. It passed ant test-patch, core and contrib tests except the following which failed/timeout even without this patch and are unrelated to the changes in this patch.

Failed:
 - org.apache.hadoop.streaming.TestMultipleCachefiles
 - org.apache.hadoop.streaming.TestStreamingBadRecords
 - org.apache.hadoop.streaming.TestSymLink

Timedout:
 - org.apache.hadoop.mapred.TestJobInProgressListener FAILED (timeout)
 - org.apache.hadoop.mapred.TestQueueCapacities FAILED (timeout)

> map/reduce doesn't run jobs with 0 maps
> ---------------------------------------
>
>                 Key: HADOOP-5850
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5850
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.20.0
>            Reporter: Owen O'Malley
>            Assignee: Vinod K V
>            Priority: Critical
>             Fix For: 0.20.1
>
>         Attachments: HADOOP-5850-20090519.1.txt, HADOOP-5850-20090519.txt, HADOOP-5850-20090520-svn.1.txt
>
>
> Currently, the framework ignores jobs that have 0 maps. This is incorrect. Many pipelines need the job to run (if nothing else, to create the output directory!) so that subsequent jobs don't fail. Effectively, there will be no map tasks and the reduce tasks should immediately set up the Reducer and RecordWriter and then call close on both since there are no inputs to the reduce. I believe it should just work if we remove the check...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HADOOP-5850) map/reduce doesn't run jobs with 0 maps

Posted by "Vinod K V (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-5850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12712691#action_12712691 ] 

Vinod K V commented on HADOOP-5850:
-----------------------------------

This situation occurs if scheduler is invoked and it calls job.obtainNewMapTask() while the job-clean-up task of this job is still running. Discussed with this Devaraj who concurs that obtainNewMapTask()/obtainNewReduceTask() should return immediately, doing nothing, when job-cleanup is running. Will address this in a new JIRA.

> map/reduce doesn't run jobs with 0 maps
> ---------------------------------------
>
>                 Key: HADOOP-5850
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5850
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.20.0
>            Reporter: Owen O'Malley
>            Assignee: Vinod K V
>            Priority: Critical
>             Fix For: 0.20.1
>
>         Attachments: HADOOP-5850-20090519.1.txt, HADOOP-5850-20090519.txt, HADOOP-5850-20090520-svn-branch-20.v2.txt, HADOOP-5850-20090520-svn.1.txt, HADOOP-5850-20090522-branch-20-final.txt, HADOOP-5850-20090522.txt, screenshot-1.jpg
>
>
> Currently, the framework ignores jobs that have 0 maps. This is incorrect. Many pipelines need the job to run (if nothing else, to create the output directory!) so that subsequent jobs don't fail. Effectively, there will be no map tasks and the reduce tasks should immediately set up the Reducer and RecordWriter and then call close on both since there are no inputs to the reduce. I believe it should just work if we remove the check...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (HADOOP-5850) map/reduce doesn't run jobs with 0 maps

Posted by "Vinod K V (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-5850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod K V updated HADOOP-5850:
------------------------------

    Attachment: HADOOP-5850-20090522.txt

Attaching patch incorporating the review comments and fixing the issues.

> map/reduce doesn't run jobs with 0 maps
> ---------------------------------------
>
>                 Key: HADOOP-5850
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5850
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.20.0
>            Reporter: Owen O'Malley
>            Assignee: Vinod K V
>            Priority: Critical
>             Fix For: 0.20.1
>
>         Attachments: HADOOP-5850-20090519.1.txt, HADOOP-5850-20090519.txt, HADOOP-5850-20090520-svn-branch-20.v2.txt, HADOOP-5850-20090520-svn.1.txt, HADOOP-5850-20090522.txt, screenshot-1.jpg
>
>
> Currently, the framework ignores jobs that have 0 maps. This is incorrect. Many pipelines need the job to run (if nothing else, to create the output directory!) so that subsequent jobs don't fail. Effectively, there will be no map tasks and the reduce tasks should immediately set up the Reducer and RecordWriter and then call close on both since there are no inputs to the reduce. I believe it should just work if we remove the check...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HADOOP-5850) map/reduce doesn't run jobs with 0 maps

Posted by "Milind Bhandarkar (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-5850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12710930#action_12710930 ] 

Milind Bhandarkar commented on HADOOP-5850:
-------------------------------------------

Should the output directories be empty or should they produce zero-length part files ?

> map/reduce doesn't run jobs with 0 maps
> ---------------------------------------
>
>                 Key: HADOOP-5850
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5850
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.20.0
>            Reporter: Owen O'Malley
>            Assignee: Vinod K V
>            Priority: Critical
>             Fix For: 0.20.1
>
>         Attachments: HADOOP-5850-20090519.1.txt, HADOOP-5850-20090519.txt
>
>
> Currently, the framework ignores jobs that have 0 maps. This is incorrect. Many pipelines need the job to run (if nothing else, to create the output directory!) so that subsequent jobs don't fail. Effectively, there will be no map tasks and the reduce tasks should immediately set up the Reducer and RecordWriter and then call close on both since there are no inputs to the reduce. I believe it should just work if we remove the check...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (HADOOP-5850) map/reduce doesn't run jobs with 0 maps

Posted by "Vinod K V (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-5850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod K V updated HADOOP-5850:
------------------------------

    Status: Open  (was: Patch Available)

> map/reduce doesn't run jobs with 0 maps
> ---------------------------------------
>
>                 Key: HADOOP-5850
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5850
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.20.0
>            Reporter: Owen O'Malley
>            Assignee: Vinod K V
>            Priority: Critical
>             Fix For: 0.20.1
>
>         Attachments: HADOOP-5850-20090519.1.txt, HADOOP-5850-20090519.txt
>
>
> Currently, the framework ignores jobs that have 0 maps. This is incorrect. Many pipelines need the job to run (if nothing else, to create the output directory!) so that subsequent jobs don't fail. Effectively, there will be no map tasks and the reduce tasks should immediately set up the Reducer and RecordWriter and then call close on both since there are no inputs to the reduce. I believe it should just work if we remove the check...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HADOOP-5850) map/reduce doesn't run jobs with 0 maps

Posted by "Tom White (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-5850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12711069#action_12711069 ] 

Tom White commented on HADOOP-5850:
-----------------------------------

You should be able to use LazyOutputFormat if you wanted no part files to be produced.

> map/reduce doesn't run jobs with 0 maps
> ---------------------------------------
>
>                 Key: HADOOP-5850
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5850
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.20.0
>            Reporter: Owen O'Malley
>            Assignee: Vinod K V
>            Priority: Critical
>             Fix For: 0.20.1
>
>         Attachments: HADOOP-5850-20090519.1.txt, HADOOP-5850-20090519.txt
>
>
> Currently, the framework ignores jobs that have 0 maps. This is incorrect. Many pipelines need the job to run (if nothing else, to create the output directory!) so that subsequent jobs don't fail. Effectively, there will be no map tasks and the reduce tasks should immediately set up the Reducer and RecordWriter and then call close on both since there are no inputs to the reduce. I believe it should just work if we remove the check...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (HADOOP-5850) map/reduce doesn't run jobs with 0 maps

Posted by "Ramya R (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-5850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ramya R updated HADOOP-5850:
----------------------------

    Attachment: screenshot-1.jpg

Had offline discussion with Nigel. Attaching a screenshot for the Exception thrown in taskdetails.jsp page.

> map/reduce doesn't run jobs with 0 maps
> ---------------------------------------
>
>                 Key: HADOOP-5850
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5850
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.20.0
>            Reporter: Owen O'Malley
>            Assignee: Vinod K V
>            Priority: Critical
>             Fix For: 0.20.1
>
>         Attachments: HADOOP-5850-20090519.1.txt, HADOOP-5850-20090519.txt, HADOOP-5850-20090520-svn-branch-20.v2.txt, HADOOP-5850-20090520-svn.1.txt, screenshot-1.jpg
>
>
> Currently, the framework ignores jobs that have 0 maps. This is incorrect. Many pipelines need the job to run (if nothing else, to create the output directory!) so that subsequent jobs don't fail. Effectively, there will be no map tasks and the reduce tasks should immediately set up the Reducer and RecordWriter and then call close on both since there are no inputs to the reduce. I believe it should just work if we remove the check...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HADOOP-5850) map/reduce doesn't run jobs with 0 maps

Posted by "Vinod K V (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-5850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12712703#action_12712703 ] 

Vinod K V commented on HADOOP-5850:
-----------------------------------

bq. [...] Will address this in a new JIRA.
HADOOP-5908


> map/reduce doesn't run jobs with 0 maps
> ---------------------------------------
>
>                 Key: HADOOP-5850
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5850
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.20.0
>            Reporter: Owen O'Malley
>            Assignee: Vinod K V
>            Priority: Critical
>             Fix For: 0.20.1
>
>         Attachments: HADOOP-5850-20090519.1.txt, HADOOP-5850-20090519.txt, HADOOP-5850-20090520-svn-branch-20.v2.txt, HADOOP-5850-20090520-svn.1.txt, HADOOP-5850-20090522-branch-20-final.txt, HADOOP-5850-20090522.txt, screenshot-1.jpg
>
>
> Currently, the framework ignores jobs that have 0 maps. This is incorrect. Many pipelines need the job to run (if nothing else, to create the output directory!) so that subsequent jobs don't fail. Effectively, there will be no map tasks and the reduce tasks should immediately set up the Reducer and RecordWriter and then call close on both since there are no inputs to the reduce. I believe it should just work if we remove the check...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (HADOOP-5850) map/reduce doesn't run jobs with 0 maps

Posted by "Vinod K V (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-5850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod K V updated HADOOP-5850:
------------------------------

    Status: Patch Available  (was: Open)

> map/reduce doesn't run jobs with 0 maps
> ---------------------------------------
>
>                 Key: HADOOP-5850
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5850
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.20.0
>            Reporter: Owen O'Malley
>            Assignee: Vinod K V
>            Priority: Critical
>             Fix For: 0.20.1
>
>         Attachments: HADOOP-5850-20090519.1.txt, HADOOP-5850-20090519.txt
>
>
> Currently, the framework ignores jobs that have 0 maps. This is incorrect. Many pipelines need the job to run (if nothing else, to create the output directory!) so that subsequent jobs don't fail. Effectively, there will be no map tasks and the reduce tasks should immediately set up the Reducer and RecordWriter and then call close on both since there are no inputs to the reduce. I believe it should just work if we remove the check...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Resolved: (HADOOP-5850) map/reduce doesn't run jobs with 0 maps

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-5850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das resolved HADOOP-5850.
---------------------------------

      Resolution: Fixed
    Hadoop Flags: [Reviewed]

I just committed this. Thanks, Vinod!

> map/reduce doesn't run jobs with 0 maps
> ---------------------------------------
>
>                 Key: HADOOP-5850
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5850
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.20.0
>            Reporter: Owen O'Malley
>            Assignee: Vinod K V
>            Priority: Critical
>             Fix For: 0.20.1
>
>         Attachments: HADOOP-5850-20090519.1.txt, HADOOP-5850-20090519.txt, HADOOP-5850-20090520-svn-branch-20.v2.txt, HADOOP-5850-20090520-svn.1.txt, HADOOP-5850-20090522-branch-20-final.txt, HADOOP-5850-20090522.txt, screenshot-1.jpg
>
>
> Currently, the framework ignores jobs that have 0 maps. This is incorrect. Many pipelines need the job to run (if nothing else, to create the output directory!) so that subsequent jobs don't fail. Effectively, there will be no map tasks and the reduce tasks should immediately set up the Reducer and RecordWriter and then call close on both since there are no inputs to the reduce. I believe it should just work if we remove the check...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HADOOP-5850) map/reduce doesn't run jobs with 0 maps

Posted by "Vinod K V (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-5850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12711314#action_12711314 ] 

Vinod K V commented on HADOOP-5850:
-----------------------------------

Forgot to summarize. With this patch,
 - Jobs with zero maps will still run job setup and cleanup tasks.
 - If number of reduces is non-zero, reduces are run leaving behind the corresponding number of empty part-files in the output directory.
 - If the number of reduces is also zero, an empty output directory is left behind.
 - The map progress(and reduce progress if number of reduces is zero) is set to 1.0 once the job cleanup task finishes.

> map/reduce doesn't run jobs with 0 maps
> ---------------------------------------
>
>                 Key: HADOOP-5850
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5850
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.20.0
>            Reporter: Owen O'Malley
>            Assignee: Vinod K V
>            Priority: Critical
>             Fix For: 0.20.1
>
>         Attachments: HADOOP-5850-20090519.1.txt, HADOOP-5850-20090519.txt, HADOOP-5850-20090520-svn.1.txt
>
>
> Currently, the framework ignores jobs that have 0 maps. This is incorrect. Many pipelines need the job to run (if nothing else, to create the output directory!) so that subsequent jobs don't fail. Effectively, there will be no map tasks and the reduce tasks should immediately set up the Reducer and RecordWriter and then call close on both since there are no inputs to the reduce. I believe it should just work if we remove the check...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (HADOOP-5850) map/reduce doesn't run jobs with 0 maps

Posted by "Vinod K V (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-5850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod K V updated HADOOP-5850:
------------------------------

    Attachment: HADOOP-5850-20090519.txt

Attaching patch to fix this issue.
 - With this patch, jobs will 0 maps or no input still run JobSetUp, any number of reduces( which do nothing), and the JobCleanUp task.
 - Removed the 0-splits check in JobInProgress.initTasks() and added checks so that cleanup task doesn't launch before setup tasks when number of splits is zero.
 - Renamed TestEmptyJobWithDFS to TestEmptyJob, removed HDFS dependence to quicken the test, added checks for verifying the number of map and reduce tasks run for an empty-job.

> map/reduce doesn't run jobs with 0 maps
> ---------------------------------------
>
>                 Key: HADOOP-5850
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5850
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.20.0
>            Reporter: Owen O'Malley
>            Assignee: Vinod K V
>            Priority: Critical
>             Fix For: 0.20.1
>
>         Attachments: HADOOP-5850-20090519.txt
>
>
> Currently, the framework ignores jobs that have 0 maps. This is incorrect. Many pipelines need the job to run (if nothing else, to create the output directory!) so that subsequent jobs don't fail. Effectively, there will be no map tasks and the reduce tasks should immediately set up the Reducer and RecordWriter and then call close on both since there are no inputs to the reduce. I believe it should just work if we remove the check...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (HADOOP-5850) map/reduce doesn't run jobs with 0 maps

Posted by "Vinod K V (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-5850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod K V updated HADOOP-5850:
------------------------------

    Status: Patch Available  (was: Open)

> map/reduce doesn't run jobs with 0 maps
> ---------------------------------------
>
>                 Key: HADOOP-5850
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5850
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.20.0
>            Reporter: Owen O'Malley
>            Assignee: Vinod K V
>            Priority: Critical
>             Fix For: 0.20.1
>
>         Attachments: HADOOP-5850-20090519.txt
>
>
> Currently, the framework ignores jobs that have 0 maps. This is incorrect. Many pipelines need the job to run (if nothing else, to create the output directory!) so that subsequent jobs don't fail. Effectively, there will be no map tasks and the reduce tasks should immediately set up the Reducer and RecordWriter and then call close on both since there are no inputs to the reduce. I believe it should just work if we remove the check...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HADOOP-5850) map/reduce doesn't run jobs with 0 maps

Posted by "Ramya R (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-5850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12712686#action_12712686 ] 

Ramya R commented on HADOOP-5850:
---------------------------------

With the above fix, when a job (writing to DFS) with 0 maps and >0 reduces is submitted, the cluster hangs completely.
The TT logs show "INFO org.apache.hadoop.mapred.TaskTracker: Resending 'status' to <jt>' with reponseId 'ID" infinitely and the JT throws java.io.IOException: java.lang.ArithmeticException forever. 
Below is the stacktrace:
{noformat} 
2009-05-25 08:13:00,124 INFO org.apache.hadoop.ipc.Server: IPC Server handler 37 on <portno>, call heartbeat(org.apache.hadoop.mapred.TaskTrackerStatus@14d128c, false, false, true, 3231) from <ip>:<port> error: java.io.IOException:
 java.lang.ArithmeticException: / by zero
java.io.IOException: java.lang.ArithmeticException: / by zero
        at org.apache.hadoop.mapred.ResourceEstimator.getEstimatedMapOutputSize(ResourceEstimator.java:85)
        at org.apache.hadoop.mapred.JobInProgress.findNewMapTask(JobInProgress.java:1729)
        at org.apache.hadoop.mapred.JobInProgress.obtainNewMapTask(JobInProgress.java:978)
        at org.apache.hadoop.mapred.CapacityTaskScheduler$MapSchedulingMgr.obtainNewTask(CapacityTaskScheduler.java:572)
        at org.apache.hadoop.mapred.CapacityTaskScheduler$TaskSchedulingMgr.getTaskFromQueue(CapacityTaskScheduler.java:418)
        at org.apache.hadoop.mapred.CapacityTaskScheduler$TaskSchedulingMgr.assignTasks(CapacityTaskScheduler.java:498)
        at org.apache.hadoop.mapred.CapacityTaskScheduler$TaskSchedulingMgr.access$500(CapacityTaskScheduler.java:277)
        at org.apache.hadoop.mapred.CapacityTaskScheduler.assignTasks(CapacityTaskScheduler.java:977)
        at org.apache.hadoop.mapred.JobTracker.heartbeat(JobTracker.java:2605)
        at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)
{noformat} 
In such a case, all the jobs hang infinitely without progressing and the cluster is completely down.
This problem is solved only when the no-map job is killed. Once the job is killed the cluster is back running and the other jobs proceed smoothly.


> map/reduce doesn't run jobs with 0 maps
> ---------------------------------------
>
>                 Key: HADOOP-5850
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5850
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.20.0
>            Reporter: Owen O'Malley
>            Assignee: Vinod K V
>            Priority: Critical
>             Fix For: 0.20.1
>
>         Attachments: HADOOP-5850-20090519.1.txt, HADOOP-5850-20090519.txt, HADOOP-5850-20090520-svn-branch-20.v2.txt, HADOOP-5850-20090520-svn.1.txt, HADOOP-5850-20090522-branch-20-final.txt, HADOOP-5850-20090522.txt, screenshot-1.jpg
>
>
> Currently, the framework ignores jobs that have 0 maps. This is incorrect. Many pipelines need the job to run (if nothing else, to create the output directory!) so that subsequent jobs don't fail. Effectively, there will be no map tasks and the reduce tasks should immediately set up the Reducer and RecordWriter and then call close on both since there are no inputs to the reduce. I believe it should just work if we remove the check...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HADOOP-5850) map/reduce doesn't run jobs with 0 maps

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-5850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12712040#action_12712040 ] 

Amareshwari Sriramadasu commented on HADOOP-5850:
-------------------------------------------------

changes look fine to me

> map/reduce doesn't run jobs with 0 maps
> ---------------------------------------
>
>                 Key: HADOOP-5850
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5850
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.20.0
>            Reporter: Owen O'Malley
>            Assignee: Vinod K V
>            Priority: Critical
>             Fix For: 0.20.1
>
>         Attachments: HADOOP-5850-20090519.1.txt, HADOOP-5850-20090519.txt, HADOOP-5850-20090520-svn-branch-20.v2.txt, HADOOP-5850-20090520-svn.1.txt, HADOOP-5850-20090522.txt, screenshot-1.jpg
>
>
> Currently, the framework ignores jobs that have 0 maps. This is incorrect. Many pipelines need the job to run (if nothing else, to create the output directory!) so that subsequent jobs don't fail. Effectively, there will be no map tasks and the reduce tasks should immediately set up the Reducer and RecordWriter and then call close on both since there are no inputs to the reduce. I believe it should just work if we remove the check...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HADOOP-5850) map/reduce doesn't run jobs with 0 maps

Posted by "Vinod K V (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-5850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12711940#action_12711940 ] 

Vinod K V commented on HADOOP-5850:
-----------------------------------

Thank you Ramya for finding out a problem with the patch! I am working on fixing this and uploading a new patch.

> map/reduce doesn't run jobs with 0 maps
> ---------------------------------------
>
>                 Key: HADOOP-5850
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5850
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.20.0
>            Reporter: Owen O'Malley
>            Assignee: Vinod K V
>            Priority: Critical
>             Fix For: 0.20.1
>
>         Attachments: HADOOP-5850-20090519.1.txt, HADOOP-5850-20090519.txt, HADOOP-5850-20090520-svn-branch-20.v2.txt, HADOOP-5850-20090520-svn.1.txt, screenshot-1.jpg
>
>
> Currently, the framework ignores jobs that have 0 maps. This is incorrect. Many pipelines need the job to run (if nothing else, to create the output directory!) so that subsequent jobs don't fail. Effectively, there will be no map tasks and the reduce tasks should immediately set up the Reducer and RecordWriter and then call close on both since there are no inputs to the reduce. I believe it should just work if we remove the check...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HADOOP-5850) map/reduce doesn't run jobs with 0 maps

Posted by "Vinod K V (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-5850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12710701#action_12710701 ] 

Vinod K V commented on HADOOP-5850:
-----------------------------------

ant test-patch results:
{code}
     [exec] +1 overall.
     [exec]
     [exec]     +1 @author.  The patch does not contain any @author tags.
     [exec]
     [exec]     +1 tests included.  The patch appears to include 6 new or modified tests.
     [exec]
     [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
     [exec]
     [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
     [exec]
     [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
     [exec]
     [exec]     +1 Eclipse classpath. The patch retains Eclipse classpath integrity.
     [exec]
     [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.
     [exec]
     [exec]
     [exec]
     [exec]
     [exec] ======================================================================
     [exec] ======================================================================
     [exec]     Finished build.
     [exec] ======================================================================
     [exec] ======================================================================
{code}

> map/reduce doesn't run jobs with 0 maps
> ---------------------------------------
>
>                 Key: HADOOP-5850
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5850
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.20.0
>            Reporter: Owen O'Malley
>            Assignee: Vinod K V
>            Priority: Critical
>             Fix For: 0.20.1
>
>         Attachments: HADOOP-5850-20090519.1.txt, HADOOP-5850-20090519.txt
>
>
> Currently, the framework ignores jobs that have 0 maps. This is incorrect. Many pipelines need the job to run (if nothing else, to create the output directory!) so that subsequent jobs don't fail. Effectively, there will be no map tasks and the reduce tasks should immediately set up the Reducer and RecordWriter and then call close on both since there are no inputs to the reduce. I believe it should just work if we remove the check...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (HADOOP-5850) map/reduce doesn't run jobs with 0 maps

Posted by "Vinod K V (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-5850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod K V updated HADOOP-5850:
------------------------------

    Attachment: HADOOP-5850-20090520-svn-branch-20.v2.txt

Patch for branch 0.20.

> map/reduce doesn't run jobs with 0 maps
> ---------------------------------------
>
>                 Key: HADOOP-5850
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5850
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.20.0
>            Reporter: Owen O'Malley
>            Assignee: Vinod K V
>            Priority: Critical
>             Fix For: 0.20.1
>
>         Attachments: HADOOP-5850-20090519.1.txt, HADOOP-5850-20090519.txt, HADOOP-5850-20090520-svn-branch-20.v2.txt, HADOOP-5850-20090520-svn.1.txt
>
>
> Currently, the framework ignores jobs that have 0 maps. This is incorrect. Many pipelines need the job to run (if nothing else, to create the output directory!) so that subsequent jobs don't fail. Effectively, there will be no map tasks and the reduce tasks should immediately set up the Reducer and RecordWriter and then call close on both since there are no inputs to the reduce. I believe it should just work if we remove the check...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (HADOOP-5850) map/reduce doesn't run jobs with 0 maps

Posted by "Vinod K V (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-5850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod K V updated HADOOP-5850:
------------------------------

    Status: Open  (was: Patch Available)

> map/reduce doesn't run jobs with 0 maps
> ---------------------------------------
>
>                 Key: HADOOP-5850
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5850
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.20.0
>            Reporter: Owen O'Malley
>            Assignee: Vinod K V
>            Priority: Critical
>             Fix For: 0.20.1
>
>         Attachments: HADOOP-5850-20090519.1.txt, HADOOP-5850-20090519.txt, HADOOP-5850-20090520-svn-branch-20.v2.txt, HADOOP-5850-20090520-svn.1.txt, screenshot-1.jpg
>
>
> Currently, the framework ignores jobs that have 0 maps. This is incorrect. Many pipelines need the job to run (if nothing else, to create the output directory!) so that subsequent jobs don't fail. Effectively, there will be no map tasks and the reduce tasks should immediately set up the Reducer and RecordWriter and then call close on both since there are no inputs to the reduce. I believe it should just work if we remove the check...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HADOOP-5850) map/reduce doesn't run jobs with 0 maps

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-5850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12711522#action_12711522 ] 

Amareshwari Sriramadasu commented on HADOOP-5850:
-------------------------------------------------

Setting Map Progress and reduce progress to 1.0f should not be done in updateTaskStatus() method, it can be done when setup completes in completedTask().
Catching NullPointerException in testcase doesn't seem correct, just throw it out if there is any.


> map/reduce doesn't run jobs with 0 maps
> ---------------------------------------
>
>                 Key: HADOOP-5850
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5850
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.20.0
>            Reporter: Owen O'Malley
>            Assignee: Vinod K V
>            Priority: Critical
>             Fix For: 0.20.1
>
>         Attachments: HADOOP-5850-20090519.1.txt, HADOOP-5850-20090519.txt, HADOOP-5850-20090520-svn-branch-20.v2.txt, HADOOP-5850-20090520-svn.1.txt
>
>
> Currently, the framework ignores jobs that have 0 maps. This is incorrect. Many pipelines need the job to run (if nothing else, to create the output directory!) so that subsequent jobs don't fail. Effectively, there will be no map tasks and the reduce tasks should immediately set up the Reducer and RecordWriter and then call close on both since there are no inputs to the reduce. I believe it should just work if we remove the check...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Assigned: (HADOOP-5850) map/reduce doesn't run jobs with 0 maps

Posted by "Vinod K V (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-5850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod K V reassigned HADOOP-5850:
---------------------------------

    Assignee: Vinod K V

> map/reduce doesn't run jobs with 0 maps
> ---------------------------------------
>
>                 Key: HADOOP-5850
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5850
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.20.0
>            Reporter: Owen O'Malley
>            Assignee: Vinod K V
>            Priority: Critical
>             Fix For: 0.20.1
>
>
> Currently, the framework ignores jobs that have 0 maps. This is incorrect. Many pipelines need the job to run (if nothing else, to create the output directory!) so that subsequent jobs don't fail. Effectively, there will be no map tasks and the reduce tasks should immediately set up the Reducer and RecordWriter and then call close on both since there are no inputs to the reduce. I believe it should just work if we remove the check...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (HADOOP-5850) map/reduce doesn't run jobs with 0 maps

Posted by "Vinod K V (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-5850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod K V updated HADOOP-5850:
------------------------------

    Status: Patch Available  (was: Open)

> map/reduce doesn't run jobs with 0 maps
> ---------------------------------------
>
>                 Key: HADOOP-5850
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5850
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.20.0
>            Reporter: Owen O'Malley
>            Assignee: Vinod K V
>            Priority: Critical
>             Fix For: 0.20.1
>
>         Attachments: HADOOP-5850-20090519.1.txt, HADOOP-5850-20090519.txt, HADOOP-5850-20090520-svn.1.txt
>
>
> Currently, the framework ignores jobs that have 0 maps. This is incorrect. Many pipelines need the job to run (if nothing else, to create the output directory!) so that subsequent jobs don't fail. Effectively, there will be no map tasks and the reduce tasks should immediately set up the Reducer and RecordWriter and then call close on both since there are no inputs to the reduce. I believe it should just work if we remove the check...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HADOOP-5850) map/reduce doesn't run jobs with 0 maps

Posted by "Jothi Padmanabhan (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-5850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12710678#action_12710678 ] 

Jothi Padmanabhan commented on HADOOP-5850:
-------------------------------------------

Changes to JobInProgress looks fine.
Could you add an assertion to the test case to verify presence of empty output directories as well? That should fail without the patch and should pass with the patch.

> map/reduce doesn't run jobs with 0 maps
> ---------------------------------------
>
>                 Key: HADOOP-5850
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5850
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.20.0
>            Reporter: Owen O'Malley
>            Assignee: Vinod K V
>            Priority: Critical
>             Fix For: 0.20.1
>
>         Attachments: HADOOP-5850-20090519.txt
>
>
> Currently, the framework ignores jobs that have 0 maps. This is incorrect. Many pipelines need the job to run (if nothing else, to create the output directory!) so that subsequent jobs don't fail. Effectively, there will be no map tasks and the reduce tasks should immediately set up the Reducer and RecordWriter and then call close on both since there are no inputs to the reduce. I believe it should just work if we remove the check...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HADOOP-5850) map/reduce doesn't run jobs with 0 maps

Posted by "Vinod K V (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-5850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12711046#action_12711046 ] 

Vinod K V commented on HADOOP-5850:
-----------------------------------

bq. Should the output directories be empty or should they produce zero-length part files ? 
The job will run N number of reduces that the user intends and they should produce zero-length part files. The patch does the same.

> map/reduce doesn't run jobs with 0 maps
> ---------------------------------------
>
>                 Key: HADOOP-5850
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5850
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.20.0
>            Reporter: Owen O'Malley
>            Assignee: Vinod K V
>            Priority: Critical
>             Fix For: 0.20.1
>
>         Attachments: HADOOP-5850-20090519.1.txt, HADOOP-5850-20090519.txt
>
>
> Currently, the framework ignores jobs that have 0 maps. This is incorrect. Many pipelines need the job to run (if nothing else, to create the output directory!) so that subsequent jobs don't fail. Effectively, there will be no map tasks and the reduce tasks should immediately set up the Reducer and RecordWriter and then call close on both since there are no inputs to the reduce. I believe it should just work if we remove the check...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HADOOP-5850) map/reduce doesn't run jobs with 0 maps

Posted by "Vinod K V (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-5850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12712122#action_12712122 ] 

Vinod K V commented on HADOOP-5850:
-----------------------------------

ant test-patch and run-test-mapred targets passed with the patch.

> map/reduce doesn't run jobs with 0 maps
> ---------------------------------------
>
>                 Key: HADOOP-5850
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5850
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.20.0
>            Reporter: Owen O'Malley
>            Assignee: Vinod K V
>            Priority: Critical
>             Fix For: 0.20.1
>
>         Attachments: HADOOP-5850-20090519.1.txt, HADOOP-5850-20090519.txt, HADOOP-5850-20090520-svn-branch-20.v2.txt, HADOOP-5850-20090520-svn.1.txt, HADOOP-5850-20090522.txt, screenshot-1.jpg
>
>
> Currently, the framework ignores jobs that have 0 maps. This is incorrect. Many pipelines need the job to run (if nothing else, to create the output directory!) so that subsequent jobs don't fail. Effectively, there will be no map tasks and the reduce tasks should immediately set up the Reducer and RecordWriter and then call close on both since there are no inputs to the reduce. I believe it should just work if we remove the check...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.