Posted to common-dev@hadoop.apache.org by "Runping Qi (JIRA)" <ji...@apache.org> on 2008/03/09 00:47:46 UTC

[jira] Updated: (HADOOP-2978) JobHistory log format for COUNTER is ambiguous

     [ https://issues.apache.org/jira/browse/HADOOP-2978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Runping Qi updated HADOOP-2978:
-------------------------------

    Description: 

For the lines like: 

Job JOBID="job_200803072233_0001" FINISH_TIME="1204929332820" JOB_STATUS="SUCCESS" FINISHED_MAPS="24" FINISHED_REDUCES="15" FAILED_MAPS="0" FAILED_REDUCES="0" COUNTERS="Job Counters .Launched map tasks=24,Job Counters .Launched reduce tasks=15Map-Reduce Framework.Map input records=2894276,Map-Reduce Framework.Map output records=2894276,Map-Reduce Framework.Map input bytes=3227015845,Map-Reduce Framework.Map output bytes=3232268034,Map-Reduce Framework.Combine input records=0,Map-Reduce Framework.Combine output records=0,Map-Reduce Framework.Reduce input groups=2526981,Map-Reduce Framework.Reduce input records=2894276,Map-Reduce Framework.Reduce output records=2894276"

The extracted value for COUNTERS is 

Job Counters .Launched map tasks


which is clearly wrong.



  was:


For the lines like: 

Job JOBID="job_200803072233_0001" FINISH_TIME="1204929332820" JOB_STATUS="SUCCESS" FINISHED_MAPS="24" FINISHED_REDUCES="15" FAILED_MAPS="0" FAILED_REDUCES="0" COUNTERS="Job Counters .Launched map tasks=24,Job Counters .Launched reduce tasks=15Map-Reduce Framework.Map input records=2894276,Map-Reduce Framework.Map output records=2894276,Map-Reduce Framework.Map input bytes=3227015845,Map-Reduce Framework.Map output bytes=3232268034,Map-Reduce Framework.Combine input records=0,Map-Reduce Framework.Combine output records=0,Map-Reduce Framework.Reduce input groups=2526981,Map-Reduce Framework.Reduce input records=2894276,Map-Reduce Framework.Reduce output records=2894276"

The extracted value for COUNTERS is 

Job Counters .Launched map tasks


which is clearly wrong.



        Summary: JobHistory log format for COUNTER is ambiguous  (was: JobHistory parser cannot extract the value for COUNTERS)


Each item in a job history log line is a key and a quoted value separated by "=".
However, the value for the item "COUNTERS" itself contains "=", which causes the parser to misbehave.

For lines like: 

Job JOBID="job_200803072233_0001" FINISH_TIME="1204929332820" JOB_STATUS="SUCCESS" FINISHED_MAPS="24" FINISHED_REDUCES="15" FAILED_MAPS="0" FAILED_REDUCES="0" COUNTERS="Job Counters .Launched map tasks=24,Job Counters .Launched reduce tasks=15Map-Reduce Framework.Map input records=2894276,Map-Reduce Framework.Map output records=2894276,Map-Reduce Framework.Map input bytes=3227015845,Map-Reduce Framework.Map output bytes=3232268034,Map-Reduce Framework.Combine input records=0,Map-Reduce Framework.Combine output records=0,Map-Reduce Framework.Reduce input groups=2526981,Map-Reduce Framework.Reduce input records=2894276,Map-Reduce Framework.Reduce output records=2894276"

The extracted value for COUNTERS is 

Job Counters .Launched map tasks


which is clearly wrong.

The expected value is:

Job Counters .Launched map tasks=24,Job Counters .Launched reduce tasks=15Map-Reduce Framework.Map input records=2894276,Map-Reduce Framework.Map output records=2894276,Map-Reduce Framework.Map input bytes=3227015845,Map-Reduce Framework.Map output bytes=3232268034,Map-Reduce Framework.Combine input records=0,Map-Reduce Framework.Combine output records=0,Map-Reduce Framework.Reduce input groups=2526981,Map-Reduce Framework.Reduce input records=2894276,Map-Reduce Framework.Reduce output records=2894276

Clearly, the "=" characters inside the value caused the confusion.
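
To make the failure mode concrete, here is a minimal sketch. It is not the actual JobHistory parser code; the class CounterParseDemo and the methods naiveParse and groupParse are hypothetical names used only for illustration. It assumes a parser that finds KEY="VALUE" tuples with a regular expression and then splits each tuple on every "=", which is enough to reproduce the truncated value reported above. The second method reads the key and value from the regex capture groups instead, so "=" inside the quoted value is harmless.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CounterParseDemo {
    // Matches KEY="VALUE" tuples; the value is everything up to the next double quote.
    private static final Pattern TUPLE = Pattern.compile("(\\w+)=\"([^\"]*)\"");

    // Hypothetical naive parser: splits the whole tuple on every "=" and
    // keeps only the second piece, so a value containing "=" is truncated.
    public static Map<String, String> naiveParse(String data) {
        Map<String, String> out = new LinkedHashMap<>();
        Matcher m = TUPLE.matcher(data);
        while (m.find()) {
            String[] parts = m.group(0).split("=");
            out.put(parts[0], parts[1].replace("\"", ""));
        }
        return out;
    }

    // Alternative: take key and value from the capture groups, so the
    // contents of the quoted value are never split.
    public static Map<String, String> groupParse(String data) {
        Map<String, String> out = new LinkedHashMap<>();
        Matcher m = TUPLE.matcher(data);
        while (m.find()) {
            out.put(m.group(1), m.group(2));
        }
        return out;
    }

    public static void main(String[] args) {
        // Shortened version of the log line from this report.
        String data = "JOB_STATUS=\"SUCCESS\" COUNTERS=\"Job Counters .Launched map tasks=24,"
                + "Job Counters .Launched reduce tasks=15\"";
        // prints: Job Counters .Launched map tasks
        System.out.println(naiveParse(data).get("COUNTERS"));
        // prints: Job Counters .Launched map tasks=24,Job Counters .Launched reduce tasks=15
        System.out.println(groupParse(data).get("COUNTERS"));
    }
}
```

Either side can remove the ambiguity: the writer can avoid "=" inside values, or the reader can stop splitting inside the quotes.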

The "=" characters in the value are added by the makeCompactString method of the Counters class
as separators between a counter name and its value.

I suggest we use the colon character (":") as the separator instead.
I'll attach a patch shortly.
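
As a rough sketch of the proposed format (this is not the attached patch and not the real Counters class; CompactCounters and its makeCompactString signature are hypothetical stand-ins), the compact string could be built with a configurable separator. With ":" the COUNTERS value no longer contains "=", so even a parser that splits the tuple on "=" gets exactly a key and a value. This assumes counter names themselves never contain "=" or a double quote.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class CompactCounters {
    // Hypothetical stand-in for Counters.makeCompactString: joins
    // "name<sep>value" pairs with commas, using the given separator.
    public static String makeCompactString(Map<String, Long> counters, char sep) {
        StringBuilder sb = new StringBuilder();
        boolean first = true;
        for (Map.Entry<String, Long> e : counters.entrySet()) {
            if (!first) {
                sb.append(',');
            }
            first = false;
            sb.append(e.getKey()).append(sep).append(e.getValue());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, Long> counters = new LinkedHashMap<>();
        counters.put("Job Counters .Launched map tasks", 24L);
        counters.put("Job Counters .Launched reduce tasks", 15L);

        String compact = makeCompactString(counters, ':');
        // prints: COUNTERS="Job Counters .Launched map tasks:24,Job Counters .Launched reduce tasks:15"
        System.out.println("COUNTERS=\"" + compact + "\"");
    }
}
```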
 

> JobHistory log format for COUNTER is ambiguous 
> -----------------------------------------------
>
>                 Key: HADOOP-2978
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2978
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.16.0
>            Reporter: Runping Qi
>
> For the lines like: 
> Job JOBID="job_200803072233_0001" FINISH_TIME="1204929332820" JOB_STATUS="SUCCESS" FINISHED_MAPS="24" FINISHED_REDUCES="15" FAILED_MAPS="0" FAILED_REDUCES="0" COUNTERS="Job Counters .Launched map tasks=24,Job Counters .Launched reduce tasks=15Map-Reduce Framework.Map input records=2894276,Map-Reduce Framework.Map output records=2894276,Map-Reduce Framework.Map input bytes=3227015845,Map-Reduce Framework.Map output bytes=3232268034,Map-Reduce Framework.Combine input records=0,Map-Reduce Framework.Combine output records=0,Map-Reduce Framework.Reduce input groups=2526981,Map-Reduce Framework.Reduce input records=2894276,Map-Reduce Framework.Reduce output records=2894276"
> The extracted value for COUNTERS is 
> Job Counters .Launched map tasks
> which is clearly wrong.
