Posted to dev@pig.apache.org by "Richard Ding (JIRA)" <ji...@apache.org> on 2011/02/18 01:41:12 UTC

[jira] Updated: (PIG-1829) "0" value seen in PigStat's map/reduce runtime, even when the job is successful

     [ https://issues.apache.org/jira/browse/PIG-1829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Richard Ding updated PIG-1829:
------------------------------

    Attachment: PIG-1829.patch

The attached patch polls for completed jobs while the batch is still running, rather than waiting for the batch execution to complete.

The output of test-patch:

{code}
     [exec] -1 overall.  
     [exec] 
     [exec]     +1 @author.  The patch does not contain any @author tags.
     [exec] 
     [exec]     -1 tests included.  The patch doesn't appear to include any new or modified tests.
     [exec]                         Please justify why no tests are needed for this patch.
     [exec] 
     [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
     [exec] 
     [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
     [exec] 
     [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
     [exec] 
     [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.
{code}

It's hard to test this with miniCluster. 

> "0" value seen in PigStat's map/reduce runtime, even when the job is successful
> -------------------------------------------------------------------------------
>
>                 Key: PIG-1829
>                 URL: https://issues.apache.org/jira/browse/PIG-1829
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: 0.8.0
>            Reporter: Thejas M Nair
>            Assignee: Richard Ding
>             Fix For: 0.9.0
>
>         Attachments: PIG-1829.patch
>
>
> The Pig runtime calls JobClient.getMapTaskReports(jobId) and JobClient.getReduceTaskReports(jobId) to get statistics about the number of maps/reducers, as well as the max/min/avg time of these tasks. But from time to time, these calls return empty lists. When that happens, Pig reports 0 values for the stats.
> The jobtracker keeps the stats information only for a limited duration, based on the configuration parameters mapred.jobtracker.completeuserjobs.maximum and mapred.job.tracker.retiredjobs.cache.size. Since Pig collects the stats after all jobs have finished running, it is possible that the stats for the initial jobs are no longer available. To have a better chance of getting the stats, they should be collected as soon as each job is over.
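
The idea in the description above (capture each job's stats as soon as it finishes, so a later retirement of the job cannot zero them out) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in PIG-1829.patch: EagerStatsCollector, TaskReportSource and RuntimeStats are hypothetical names, and TaskReportSource is a stand-in for the JobClient task-report calls so that the example runs without a cluster.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class EagerStatsCollector {

    /** Hypothetical stand-in for the JobClient task-report calls. */
    interface TaskReportSource {
        /** Map-task runtimes in ms; empty once the job has been retired. */
        List<Long> mapTaskRuntimes(String jobId);
    }

    /** Task count and min/max/avg runtime for one job (hypothetical type). */
    static final class RuntimeStats {
        final int tasks;
        final long min;
        final long max;
        final long avg;

        /** Expects a non-empty list; the collector below guarantees that. */
        RuntimeStats(List<Long> runtimes) {
            long lo = Long.MAX_VALUE, hi = Long.MIN_VALUE, sum = 0;
            for (long r : runtimes) {
                lo = Math.min(lo, r);
                hi = Math.max(hi, r);
                sum += r;
            }
            tasks = runtimes.size();
            min = lo;
            max = hi;
            avg = sum / runtimes.size();
        }
    }

    private final TaskReportSource source;
    private final Map<String, RuntimeStats> captured = new HashMap<>();

    EagerStatsCollector(TaskReportSource source) {
        this.source = source;
    }

    /** Call from the polling loop each time a job is observed to be complete. */
    void onJobCompleted(String jobId) {
        if (captured.containsKey(jobId)) {
            return; // stats already captured; do not overwrite with a later empty result
        }
        List<Long> runtimes = source.mapTaskRuntimes(jobId);
        if (!runtimes.isEmpty()) {
            captured.put(jobId, new RuntimeStats(runtimes));
        }
    }

    /** Stats captured earlier, or null if the reports were never available. */
    RuntimeStats statsFor(String jobId) {
        return captured.get(jobId);
    }

    public static void main(String[] args) {
        // In-memory map playing the role of the job tracker's report store.
        Map<String, List<Long>> tracker = new HashMap<>();
        tracker.put("job_1", Arrays.asList(100L, 300L, 200L));
        EagerStatsCollector collector = new EagerStatsCollector(
                id -> tracker.getOrDefault(id, Collections.emptyList()));

        collector.onJobCompleted("job_1"); // captured while the reports still exist
        tracker.remove("job_1");           // simulate the job being retired

        RuntimeStats s = collector.statsFor("job_1");
        System.out.println(s.tasks + " " + s.min + " " + s.max + " " + s.avg);
        // prints "3 100 300 200"
    }
}
```

Keeping the first non-empty result per job means that a later poll, made after the job has been retired, cannot replace real values with zeros.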

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira