Posted to common-dev@hadoop.apache.org by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org> on 2008/09/10 13:38:45 UTC

[jira] Updated: (HADOOP-2403) JobHistory log files contain data that cannot be parsed by org.apache.hadoop.mapred.JobHistory

     [ https://issues.apache.org/jira/browse/HADOOP-2403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-2403:
--------------------------------------------

    Attachment: patch-2403.txt

The existing history parsing code already handles newlines inside values. The parsing problem occurs when the character *"* is followed by *\n*, because a value is not allowed to contain *"*. Since JobHistory looks for the *KEY="VALUE"* pattern when parsing keys and values, parsing fails if the value has *"* and *=* in it.

The attached patch escapes *"* and *=* in the value before logging it. The regular expression for VALUE is modified to allow any character other than a quote, while still allowing escaped quotes. After the value is parsed, both *"* and *=* are unescaped and the result is returned.
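
For illustration only, the escape/parse/unescape round trip described above could look roughly like the sketch below. This is not the code in patch-2403.txt: the class name HistoryEscapeSketch, the choice of backslash as the escape character, the decision to escape the backslash itself, and the exact regular expression are all assumptions made for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; names and escape scheme are assumptions, not the patch itself.
final class HistoryEscapeSketch {
    // Assumed escape character. The actual patch may use a different one.
    private static final char ESC = '\\';

    // KEY="VALUE": VALUE is any run of characters other than a quote or
    // backslash, or a backslash followed by any character (an escaped char).
    // DOTALL lets an escaped character be a newline as well.
    private static final Pattern PAIR = Pattern.compile(
            "(\\w+)=\"((?:[^\"\\\\]|\\\\.)*)\"", Pattern.DOTALL);

    // Escape quote and equals (and the escape character, so unescaping is unambiguous).
    public static String escape(String value) {
        StringBuilder sb = new StringBuilder(value.length() + 8);
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (c == ESC || c == '"' || c == '=') {
                sb.append(ESC);
            }
            sb.append(c);
        }
        return sb.toString();
    }

    // Drop each escape character and keep the character that follows it.
    public static String unescape(String value) {
        StringBuilder sb = new StringBuilder(value.length());
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (c == ESC && i + 1 < value.length()) {
                c = value.charAt(++i);
            }
            sb.append(c);
        }
        return sb.toString();
    }

    // Extract every KEY="VALUE" pair from one logged record, unescaping the values.
    public static Map<String, String> parseLine(String line) {
        Map<String, String> pairs = new LinkedHashMap<>();
        Matcher m = PAIR.matcher(line);
        while (m.find()) {
            pairs.put(m.group(1), unescape(m.group(2)));
        }
        return pairs;
    }

    public static void main(String[] args) {
        // An error message containing quotes, a newline, and an equals sign.
        String error = "Trouble with \"key\"\n attr=1";
        String line = "MapAttempt TASK_STATUS=\"FAILED\" ERROR=\"" + escape(error) + "\"";
        Map<String, String> parsed = parseLine(line);
        System.out.println(error.equals(parsed.get("ERROR")));
    }
}
```

In this sketch the writer escapes each value once before logging, and the reader unescapes only after the pattern has matched, so an embedded quote never terminates the VALUE early.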

> JobHistory log files contain data that cannot be parsed by org.apache.hadoop.mapred.JobHistory
> ----------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-2403
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2403
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Runping Qi
>            Assignee: Amareshwari Sriramadasu
>            Priority: Critical
>             Fix For: 0.19.0
>
>         Attachments: EncodeDecode.java, patch-2403.txt, patch-2403.txt
>
>
> When some tasks fail, the job tracker writes a line to the history file with the error message.
> However, the error message may break the history file format, choking the history parser. Here is an example:
> MapAttempt TASK_TYPE="MAP" TASKID="tip_200712102254_0001_m_000090" TASK_ATTEMPT_ID="task_200712102254_0001_m_000090_0" TASK_STATUS="FAILED" FINISH_TIME="1197327293253" HOSTNAME="XXXX:50050" ERROR="java.lang.IllegalArgumentException: Trouble to get key or value (<,> substituted by null 
> . Key XML-Ori:
>         <Root>

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.