Posted to common-dev@hadoop.apache.org by "Arun C Murthy (JIRA)" <ji...@apache.org> on 2008/04/15 03:01:06 UTC

[jira] Created: (HADOOP-3256) JobHistory file on HDFS should not use the 'job name'

JobHistory file on HDFS should not use the 'job name'
-----------------------------------------------------

                 Key: HADOOP-3256
                 URL: https://issues.apache.org/jira/browse/HADOOP-3256
             Project: Hadoop Core
          Issue Type: Bug
    Affects Versions: 0.17.0
            Reporter: Arun C Murthy
            Assignee: Arun C Murthy
            Priority: Blocker
             Fix For: 0.17.0


HADOOP-2178 introduced the feature of saving jobhistory logs on HDFS.

Unfortunately the following code:
{noformat}
        // setup the history log file for this job
        String logFileName = jobUniqueString +  
                             "_" + user+ "_" + jobName;
        if (logFileName.length() > MAX_FILENAME_SIZE) {
          logFileName = logFileName.substring(0, MAX_FILENAME_SIZE-1);
        }
{noformat}
is vulnerable to user-provided job names. 

Specifically I ran into 'URISyntaxException' with jobs whose names include a ":".

The easy fix is to build the file name from the jobid alone and drop the human-friendly job name.

The long term fix is to ensure that Path handles filenames with _any_ characters.
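
As an illustration of why a ":" is a problem, here is a minimal sketch that uses only java.net.URI. It assumes, as a simplification, that the text before the first ":" in a file name ends up being treated as a URI scheme and the rest as a path; the helper names and sample file names are hypothetical and this is not Hadoop's actual Path code.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class ColonInFileNameDemo {

    // Hypothetical stand-in for path parsing: text before the first ':' is
    // treated as a URI scheme, the remainder as the path.
    static URI toUri(String fileName) throws URISyntaxException {
        int colon = fileName.indexOf(':');
        String scheme = colon < 0 ? null : fileName.substring(0, colon);
        String path = colon < 0 ? fileName : fileName.substring(colon + 1);
        return new URI(scheme, null, path, null, null);
    }

    // Returns true when the name cannot be turned into a URI.
    static boolean isRejected(String fileName) {
        try {
            toUri(fileName);
            return false;
        } catch (URISyntaxException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(isRejected("jt1_job_0001_alice_wordcount"));   // false
        System.out.println(isRejected("jt1_job_0001_alice_etl:nightly")); // true
    }
}
```

With a scheme present, java.net.URI requires the path to be empty or to start with "/", so the second name is rejected with a URISyntaxException.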

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-3256) JobHistory file on HDFS should not use the 'job name'

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy updated HADOOP-3256:
----------------------------------

    Attachment: HADOOP-3256_1_20080415.patch

Updated patch to use URL{En|De}coder and to keep the old scheme of <jobtrackerHostname, jobId, username, jobName> for the log-file name...
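
A minimal sketch of the encode/decode idea, assuming the composed name is passed through java.net.URLEncoder as a whole. The class, the method names and the sample values are hypothetical and are not taken from the attached patch.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

final class HistoryFileNameCodec {

    // Composes <jobUniqueString>_<user>_<jobName> and URL-encodes the result,
    // so characters such as ':' never reach the file system path.
    static String encodeLogFileName(String jobUniqueString, String user, String jobName) {
        try {
            return URLEncoder.encode(jobUniqueString + "_" + user + "_" + jobName, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }

    // Reverses encodeLogFileName to recover the human-readable name.
    static String decodeLogFileName(String logFileName) {
        try {
            return URLDecoder.decode(logFileName, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }

    public static void main(String[] args) {
        String encoded = encodeLogFileName("jt1_job_0001", "alice", "etl:nightly run");
        System.out.println(encoded);                    // jt1_job_0001_alice_etl%3Anightly+run
        System.out.println(decodeLogFileName(encoded)); // jt1_job_0001_alice_etl:nightly run
    }
}
```

The stored name keeps all four parts of the old scheme, and readers that need the original job name decode it first.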


[jira] Updated: (HADOOP-3256) JobHistory file on HDFS should not use the 'job name'

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy updated HADOOP-3256:
----------------------------------

    Attachment: HADOOP-3256_2_20080415.patch

Fixed HistoryViewer too...


[jira] Commented: (HADOOP-3256) JobHistory file on HDFS should not use the 'job name'

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12589029#action_12589029 ] 

Hadoop QA commented on HADOOP-3256:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
http://issues.apache.org/jira/secure/attachment/12380155/HADOOP-3256_1_20080415.patch
against trunk revision 645773.

    @author +1.  The patch does not contain any @author tags.

    tests included -1.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no tests are needed for this patch.

    javadoc +1.  The javadoc tool did not generate any warning messages.

    javac +1.  The applied patch does not generate any new javac compiler warnings.

    release audit +1.  The applied patch does not generate any new release audit warnings.

    findbugs +1.  The patch does not introduce any new Findbugs warnings.

    core tests +1.  The patch passed core unit tests.

    contrib tests +1.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2237/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2237/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2237/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2237/console

This message is automatically generated.


[jira] Updated: (HADOOP-3256) JobHistory file on HDFS should not use the 'job name'

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy updated HADOOP-3256:
----------------------------------

    Attachment: HADOOP-3256_2_20080415.patch

Updated patch incorporating review feedback. Also fixed _all_ JSPs to re-encode URLs, so that the decoding done by the browser doesn't lead to the wrong logFile parameter being passed around.


[jira] Commented: (HADOOP-3256) JobHistory file on HDFS should not use the 'job name'

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12589403#action_12589403 ] 

Arun C Murthy commented on HADOOP-3256:
---------------------------------------

Amareshwari, as that comment says, we need to re-encode the filename since the browser will decode it automatically, so that change is necessary. I had to do the same in a lot of places in other JSPs too.
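
The re-encoding can be sketched as follows. This is a simplified model, assuming the file name stored on HDFS is already URL-encoded and that exactly one decoding pass happens between the link and the page code that reads the logFile parameter. The page name details.jsp and the helper names are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

final class LogFileLinkDemo {

    static String enc(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }

    static String dec(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }

    // Link as a page would emit it: the stored (already encoded) name is
    // encoded once more before it goes into the query string.
    static String linkFor(String storedFileName) {
        return "details.jsp?logFile=" + enc(storedFileName);
    }

    // Models the single decoding pass applied to the query parameter.
    static String receivedParameter(String link) {
        return dec(link.substring(link.indexOf('=') + 1));
    }

    public static void main(String[] args) {
        String stored = "jt1_job_0001_alice_etl%3Anightly";
        String link = linkFor(stored);
        System.out.println(link);                    // details.jsp?logFile=jt1_job_0001_alice_etl%253Anightly
        System.out.println(receivedParameter(link)); // jt1_job_0001_alice_etl%3Anightly
    }
}
```

Without the extra encoding step, the single decoding pass would turn %3A back into ":" and the parameter would no longer match the name of the file on disk.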


[jira] Updated: (HADOOP-3256) JobHistory file on HDFS should not use the 'job name'

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy updated HADOOP-3256:
----------------------------------

    Status: Open  (was: Patch Available)

Cancelling patch to incorporate Amareshwari's comments ...


[jira] Updated: (HADOOP-3256) JobHistory file on HDFS should not use the 'job name'

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy updated HADOOP-3256:
----------------------------------

    Status: Open  (was: Patch Available)

Fixing findbugs warnings...


[jira] Commented: (HADOOP-3256) JobHistory file on HDFS should not use the 'job name'

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12589571#action_12589571 ] 

Devaraj Das commented on HADOOP-3256:
-------------------------------------

Arun, I didn't understand the change you made in CopyFiles.java. Apart from that it looks good.


[jira] Commented: (HADOOP-3256) JobHistory file on HDFS should not use the 'job name'

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12588993#action_12588993 ] 

Amareshwari Sriramadasu commented on HADOOP-3256:
-------------------------------------------------

Some comments:
1. The code in jobhistory.jsp should call jobFile.getName().toString() to get the file name, instead of jobFile.toString(). That is,
{code}
+        decodedJobFileName = URLDecoder.decode(jobFile.toString(), "UTF-8");
{code}
should be
{code}
+        decodedJobFileName = URLDecoder.decode(jobFile.getName().toString(), "UTF-8");
{code}

2. printJob should be passed jobFile: jobFile is the complete path of the log file, it is used to load the job details in the following JSPs, and it should not be decoded.
So the code change
{code}
       printJob(trackerHostName, trackerStartTime, jobId,
-               jobName, user, jobFile.toString(), out) ; 
+               jobName, user, decodedJobFileName, out) ; 
{code}
is not necessary.

3. Other than the above two comments, there are indentation issues (two spaces vs. four spaces).
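
The distinction in comments 1 and 2 can be sketched with plain strings. This is only an illustration: the substring call stands in for jobFile.getName(), and the directory and file names are made up.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

final class DecodeNameOnlyDemo {

    static String dec(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }

    // Stand-in for jobFile.getName(): the last component of the path.
    static String fileNameOf(String fullPath) {
        return fullPath.substring(fullPath.lastIndexOf('/') + 1);
    }

    public static void main(String[] args) {
        String jobFile = "/history+logs/jt1_job_0001_alice_etl%3Anightly";

        // Decode only the file name when a readable name is needed.
        System.out.println(dec(fileNameOf(jobFile))); // jt1_job_0001_alice_etl:nightly

        // Decoding the whole path also rewrites the directory part.
        System.out.println(dec(jobFile));             // /history logs/jt1_job_0001_alice_etl:nightly

        // The undecoded path is what still identifies the file on disk.
        System.out.println(jobFile);
    }
}
```

Under these assumptions, the decoded value is only suitable for display or parsing, while the original jobFile string is the one to pass along for loading the log.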


[jira] Updated: (HADOOP-3256) JobHistory file on HDFS should not use the 'job name'

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy updated HADOOP-3256:
----------------------------------

    Attachment: HADOOP-3256_2_20080415.patch

Removed the debugging code in CopyFiles.java ... my bad.


[jira] Updated: (HADOOP-3256) JobHistory file on HDFS should not use the 'job name'

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy updated HADOOP-3256:
----------------------------------

    Status: Patch Available  (was: Open)


[jira] Commented: (HADOOP-3256) JobHistory file on HDFS should not use the 'job name'

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12589476#action_12589476 ] 

Hadoop QA commented on HADOOP-3256:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
http://issues.apache.org/jira/secure/attachment/12380247/HADOOP-3256_2_20080415.patch
against trunk revision 645773.

    @author +1.  The patch does not contain any @author tags.

    tests included -1.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no tests are needed for this patch.

    javadoc +1.  The javadoc tool did not generate any warning messages.

    javac +1.  The applied patch does not generate any new javac compiler warnings.

    release audit +1.  The applied patch does not generate any new release audit warnings.

    findbugs +1.  The patch does not introduce any new Findbugs warnings.

    core tests +1.  The patch passed core unit tests.

    contrib tests +1.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2250/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2250/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2250/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2250/console

This message is automatically generated.


[jira] Commented: (HADOOP-3256) JobHistory file on HDFS should not use the 'job name'

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12589411#action_12589411 ] 

Amareshwari Sriramadasu commented on HADOOP-3256:
-------------------------------------------------

Sorry, one more comment: decoding should also be done in HistoryViewer.java, at the place where the filename is split to extract the jobId.


[jira] Updated: (HADOOP-3256) JobHistory file on HDFS should not use the 'job name'

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy updated HADOOP-3256:
----------------------------------

    Component/s: mapred


[jira] Commented: (HADOOP-3256) JobHistory file on HDFS should not use the 'job name'

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12588917#action_12588917 ] 

Hadoop QA commented on HADOOP-3256:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
http://issues.apache.org/jira/secure/attachment/12380131/HADOOP-3256_0_20080414.patch
against trunk revision 645773.

    @author +1.  The patch does not contain any @author tags.

    tests included -1.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no tests are needed for this patch.

    javadoc +1.  The javadoc tool did not generate any warning messages.

    javac +1.  The applied patch does not generate any new javac compiler warnings.

    release audit +1.  The applied patch does not generate any new release audit warnings.

    findbugs +1.  The patch does not introduce any new Findbugs warnings.

    core tests -1.  The patch failed core unit tests.

    contrib tests +1.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2233/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2233/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2233/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2233/console

This message is automatically generated.



[jira] Commented: (HADOOP-3256) JobHistory file on HDFS should not use the 'job name'

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12588965#action_12588965 ] 

Arun C Murthy commented on HADOOP-3256:
---------------------------------------

Alternate solution: I guess we can URL{En|De}code the <jobtrackerHostname, jobId, username, jobName> quartet to ensure the resulting file names are legal URIs:
http://java.sun.com/j2se/1.5.0/docs/api/java/net/URLEncoder.html
http://java.sun.com/j2se/1.5.0/docs/api/java/net/URLDecoder.html

Thoughts?
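
For illustration, a minimal sketch of the URL{En|De}code idea using java.net.URLEncoder and java.net.URLDecoder. The class and method names here are hypothetical and are not taken from any attached patch; UTF-8 is an assumed encoding.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical helper, not part of Hadoop: escapes one component of the
// history file name so that characters such as ':' cannot reach the path.
class HistoryNameCodec {

  static String encode(String part) {
    try {
      return URLEncoder.encode(part, "UTF-8");
    } catch (UnsupportedEncodingException e) {
      // UTF-8 is always available on a compliant JVM.
      throw new IllegalStateException(e);
    }
  }

  static String decode(String part) {
    try {
      return URLDecoder.decode(part, "UTF-8");
    } catch (UnsupportedEncodingException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    String jobName = "word count: run 1";
    String encoded = encode(jobName);
    System.out.println(encoded);          // word+count%3A+run+1
    System.out.println(decode(encoded));  // word count: run 1
  }
}
```

The encoded form contains no ':' and decodes back to the original name, so the human-friendly job name can still be recovered from the file name.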



[jira] Issue Comment Edited: (HADOOP-3256) JobHistory file on HDFS should not use the 'job name'

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12588861#action_12588861 ] 

szetszwo edited comment on HADOOP-3256 at 4/14/08 6:08 PM:
-------------------------------------------------------------------------

Another easy fix is to use jobid + hash(other stuff).

      was (Author: szetszwo):
    Another easy is to use jobid + hash(other stuffs).
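
A rough sketch of the jobid + hash idea, assuming String.hashCode() as the hash and hypothetical class and method names (nothing here comes from a patch):

```java
// Hypothetical sketch: keep the jobid readable and replace the remaining,
// user-controlled parts of the name with a hex hash.
class HashedHistoryName {

  static String logFileName(String jobId, String user, String jobName) {
    String rest = user + "_" + jobName;
    // toHexString yields only the characters 0-9 and a-f, so no character
    // from the user-provided job name can reach the file name.
    return jobId + "_" + Integer.toHexString(rest.hashCode());
  }

  public static void main(String[] args) {
    System.out.println(
        logFileName("job_200804141200_0001", "arun", "word count: run 1"));
  }
}
```

Note that a hash is one-way: the username and job name cannot be read back from such a file name, so anything that parses them out of the name would need another source for that information.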
  


[jira] Commented: (HADOOP-3256) JobHistory file on HDFS should not use the 'job name'

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12589395#action_12589395 ] 

Hadoop QA commented on HADOOP-3256:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
http://issues.apache.org/jira/secure/attachment/12380227/HADOOP-3256_2_20080415.patch
against trunk revision 645773.

    @author +1.  The patch does not contain any @author tags.

    tests included -1.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no tests are needed for this patch.

    javadoc +1.  The javadoc tool did not generate any warning messages.

    javac +1.  The applied patch does not generate any new javac compiler warnings.

    release audit +1.  The applied patch does not generate any new release audit warnings.

    findbugs -1.  The patch appears to introduce 1 new Findbugs warnings.

    core tests +1.  The patch passed core unit tests.

    contrib tests +1.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2245/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2245/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2245/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2245/console

This message is automatically generated.



[jira] Commented: (HADOOP-3256) JobHistory file on HDFS should not use the 'job name'

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12589430#action_12589430 ] 

Amareshwari Sriramadasu commented on HADOOP-3256:
-------------------------------------------------

+1 for the patch



[jira] Updated: (HADOOP-3256) JobHistory file on HDFS should not use the 'job name'

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy updated HADOOP-3256:
----------------------------------

    Status: Patch Available  (was: Open)



[jira] Commented: (HADOOP-3256) JobHistory file on HDFS should not use the 'job name'

Posted by "Runping Qi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12588910#action_12588910 ] 

Runping Qi commented on HADOOP-3256:
------------------------------------


I think option 2 is preferred.

You may introduce encode/decode functions to escape the job name part.





[jira] Updated: (HADOOP-3256) JobHistory file on HDFS should not use the 'job name'

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy updated HADOOP-3256:
----------------------------------

    Status: Patch Available  (was: Open)



[jira] Updated: (HADOOP-3256) JobHistory file on HDFS should not use the 'job name'

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy updated HADOOP-3256:
----------------------------------

    Status: Open  (was: Patch Available)

I need to fix HistoryViewer too...



[jira] Commented: (HADOOP-3256) JobHistory file on HDFS should not use the 'job name'

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12589986#action_12589986 ] 

Hudson commented on HADOOP-3256:
--------------------------------

Integrated in Hadoop-trunk #463 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/463/])



[jira] Commented: (HADOOP-3256) JobHistory file on HDFS should not use the 'job name'

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12589401#action_12589401 ] 

Amareshwari Sriramadasu commented on HADOOP-3256:
-------------------------------------------------

The encoding you are doing in jobhistory is not needed, because you already encode the log file name in JobHistory.java before creating the file.
listpaths will then return the encoded name, so jobFile.toString() will give the correct path to pass to the jsps, right?
{code}
+      // Encode the logfile name again to cancel the decoding done by the browser
+      String encodedJobFileName = 
+          JobHistory.JobInfo.encodeJobHistoryFileName(jobFile.getName().toString());
{code}

Encoding the name as above and passing it to the jsps is not correct, since jobFile.getName() is already the encoded name. Thus the patch only needs to decode jobFile.getName() to extract the jobtracker hostname, start time, jobid, username and jobname.
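
To make the double-encoding concern concrete, here is a small sketch using only java.net.URLEncoder and java.net.URLDecoder. The class name is hypothetical, and the sketch only shows what a second encoding does to a name; it does not settle how the browser and the jsps actually treat the parameter.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical demo class: shows the effect of encoding an already
// encoded file name a second time.
class DoubleEncodingDemo {

  static String enc(String s) {
    try {
      return URLEncoder.encode(s, "UTF-8");
    } catch (UnsupportedEncodingException e) {
      throw new IllegalStateException(e);
    }
  }

  static String dec(String s) {
    try {
      return URLDecoder.decode(s, "UTF-8");
    } catch (UnsupportedEncodingException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    String once = enc("my:job");   // my%3Ajob   (the name stored on disk)
    String twice = enc(once);      // my%253Ajob ('%' itself becomes %25)
    System.out.println(once);
    System.out.println(twice);
    System.out.println(dec(twice)); // my%3Ajob: one decode undoes one encode
    System.out.println(dec(once));  // my:job
  }
}
```

Each decode cancels exactly one encode, so the number of encoding steps has to match the number of decoding steps between the file listing and the code that finally opens the file.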



[jira] Updated: (HADOOP-3256) JobHistory file on HDFS should not use the 'job name'

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy updated HADOOP-3256:
----------------------------------

    Attachment: HADOOP-3256_0_20080414.patch

Fixed to use jobId rather than jobName.



[jira] Updated: (HADOOP-3256) JobHistory file on HDFS should not use the 'job name'

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy updated HADOOP-3256:
----------------------------------

    Status: Patch Available  (was: Open)



[jira] Updated: (HADOOP-3256) JobHistory file on HDFS should not use the 'job name'

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy updated HADOOP-3256:
----------------------------------

    Status: Patch Available  (was: Open)



[jira] Commented: (HADOOP-3256) JobHistory file on HDFS should not use the 'job name'

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12588897#action_12588897 ] 

Amareshwari Sriramadasu commented on HADOOP-3256:
-------------------------------------------------

Changing the log file name to use only the jobId breaks history.
History jsps assume that the path name has the form jobtrackerHostname_jobId_username_jobName. Since there is no master index now, the history viewing jsps list the paths in the history directory and parse the path names to give the user more information about each log file.
(For more information see HADOOP-2178)
I could see the following options:
1. One solution could be to write this information as the first line of the history log, but then the jsps would have to read the first line of every history file to print the first page.
This is going to be a lot of reads.
2. Another solution could be to construct an acceptable path from the jobtrackerHostname, jobId, username and jobName.

Thoughts?
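
Option 2 could look roughly like the sketch below: escape each of the four fields, join them with "_", and have the viewer split and unescape. All names are hypothetical. Escaping "_" as %5F is an extra assumption of this sketch (URLEncoder leaves "_" untouched, and fields such as the jobId may contain it); it is not something stated in this thread.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical sketch of building and parsing a history file name of the
// form jobtrackerHostname_jobId_username_jobName with escaped fields.
class HistoryFileName {

  static String escape(String part) {
    try {
      // Escape '_' by hand so that it only ever appears as the separator.
      return URLEncoder.encode(part, "UTF-8").replace("_", "%5F");
    } catch (UnsupportedEncodingException e) {
      throw new IllegalStateException(e);
    }
  }

  static String unescape(String part) {
    try {
      return URLDecoder.decode(part, "UTF-8");
    } catch (UnsupportedEncodingException e) {
      throw new IllegalStateException(e);
    }
  }

  static String build(String host, String jobId, String user, String jobName) {
    return escape(host) + "_" + escape(jobId) + "_"
        + escape(user) + "_" + escape(jobName);
  }

  // Returns {hostname, jobId, username, jobName}.
  static String[] parse(String fileName) {
    String[] parts = fileName.split("_", -1);
    if (parts.length != 4) {
      throw new IllegalArgumentException("Unexpected file name: " + fileName);
    }
    for (int i = 0; i < parts.length; i++) {
      parts[i] = unescape(parts[i]);
    }
    return parts;
  }

  public static void main(String[] args) {
    String name = build("jt.example.com", "job_200804141200_0001",
                        "arun", "word count: a_b");
    System.out.println(name);
    for (String field : parse(name)) {
      System.out.println(field);
    }
  }
}
```

With this scheme the first page can still be rendered from a directory listing alone, without opening any history file.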

> JobHistory file on HDFS should not use the 'job name'
> -----------------------------------------------------
>
>                 Key: HADOOP-3256
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3256
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.17.0
>            Reporter: Arun C Murthy
>            Assignee: Arun C Murthy
>            Priority: Blocker
>             Fix For: 0.17.0
>
>         Attachments: HADOOP-3256_0_20080414.patch



[jira] Updated: (HADOOP-3256) JobHistory file on HDFS should not use the 'job name'

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das updated HADOOP-3256:
--------------------------------

      Resolution: Fixed
    Hadoop Flags: [Reviewed]
          Status: Resolved  (was: Patch Available)

I just committed this. Thanks, Arun!

> JobHistory file on HDFS should not use the 'job name'
> -----------------------------------------------------
>
>                 Key: HADOOP-3256
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3256
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.17.0
>            Reporter: Arun C Murthy
>            Assignee: Arun C Murthy
>            Priority: Blocker
>             Fix For: 0.17.0
>
>         Attachments: HADOOP-3256_0_20080414.patch, HADOOP-3256_1_20080415.patch, HADOOP-3256_2_20080415.patch, HADOOP-3256_2_20080415.patch, HADOOP-3256_2_20080415.patch, HADOOP-3256_2_20080415.patch



[jira] Commented: (HADOOP-3256) JobHistory file on HDFS should not use the 'job name'

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12588861#action_12588861 ] 

Tsz Wo (Nicholas), SZE commented on HADOOP-3256:
------------------------------------------------

Another easy fix is to use jobid + hash(other stuff).
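
One way to read this suggestion, as a sketch only: keep the jobid verbatim and append a hex hash of the user-controlled parts. String.hashCode is used here purely to keep the example short, and the class and method names are hypothetical.

```java
// Hypothetical helper, for illustration only; not part of Hadoop.
final class HashedHistoryFileNameSketch {

  // Keeps the job id readable and replaces the user-controlled parts with
  // a hex hash, so the suffix contains only the characters [0-9a-f].
  static String logFileName(String jobId, String user, String jobName) {
    int hash = (user + "_" + jobName).hashCode();
    return jobId + "_" + Integer.toHexString(hash);
  }

  public static void main(String[] args) {
    System.out.println(logFileName("job_200804150301_0001", "arun", "word:count"));
  }
}
```

Note that a hash is one-way, so the username and job name could no longer be recovered from the file name itself.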

> JobHistory file on HDFS should not use the 'job name'
> -----------------------------------------------------
>
>                 Key: HADOOP-3256
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3256
>             Project: Hadoop Core
>          Issue Type: Bug
>    Affects Versions: 0.17.0
>            Reporter: Arun C Murthy
>            Assignee: Arun C Murthy
>            Priority: Blocker
>             Fix For: 0.17.0



[jira] Commented: (HADOOP-3256) JobHistory file on HDFS should not use the 'job name'

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12589408#action_12589408 ] 

Amareshwari Sriramadasu commented on HADOOP-3256:
-------------------------------------------------

Then it makes sense.

+1 for the patch. I validated the patch with job names both with and without ':' in them.

> JobHistory file on HDFS should not use the 'job name'
> -----------------------------------------------------
>
>                 Key: HADOOP-3256
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3256
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.17.0
>            Reporter: Arun C Murthy
>            Assignee: Arun C Murthy
>            Priority: Blocker
>             Fix For: 0.17.0
>
>         Attachments: HADOOP-3256_0_20080414.patch, HADOOP-3256_1_20080415.patch, HADOOP-3256_2_20080415.patch, HADOOP-3256_2_20080415.patch



[jira] Updated: (HADOOP-3256) JobHistory file on HDFS should not use the 'job name'

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy updated HADOOP-3256:
----------------------------------

    Attachment: HADOOP-3256_2_20080415.patch

Updated patch to fix the findbugs warnings...

> JobHistory file on HDFS should not use the 'job name'
> -----------------------------------------------------
>
>                 Key: HADOOP-3256
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3256
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.17.0
>            Reporter: Arun C Murthy
>            Assignee: Arun C Murthy
>            Priority: Blocker
>             Fix For: 0.17.0
>
>         Attachments: HADOOP-3256_0_20080414.patch, HADOOP-3256_1_20080415.patch, HADOOP-3256_2_20080415.patch, HADOOP-3256_2_20080415.patch



[jira] Updated: (HADOOP-3256) JobHistory file on HDFS should not use the 'job name'

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy updated HADOOP-3256:
----------------------------------

    Status: Open  (was: Patch Available)

> JobHistory file on HDFS should not use the 'job name'
> -----------------------------------------------------
>
>                 Key: HADOOP-3256
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3256
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.17.0
>            Reporter: Arun C Murthy
>            Assignee: Arun C Murthy
>            Priority: Blocker
>             Fix For: 0.17.0
>
>         Attachments: HADOOP-3256_0_20080414.patch, HADOOP-3256_1_20080415.patch
