You are viewing a plain text version of this content. The canonical link for it is here.
Posted to common-dev@hadoop.apache.org by "Alejandro Abdelnur (JIRA)" <ji...@apache.org> on 2008/03/18 11:57:28 UTC

[jira] Created: (HADOOP-3041) Within a task, the value ofJobConf.getOutputPath() method is modified

Within a task, the value ofJobConf.getOutputPath() method is modified
---------------------------------------------------------------------

                 Key: HADOOP-3041
                 URL: https://issues.apache.org/jira/browse/HADOOP-3041
             Project: Hadoop Core
          Issue Type: Bug
          Components: mapred
    Affects Versions: 0.16.1
         Environment: all
            Reporter: Alejandro Abdelnur
            Priority: Blocker
             Fix For: 0.16.2


Until 0.16.0 the value of the getOutputPath() method, if queried within a task, pointed to the part file assigned to the task. 

For example: /user/foo/myoutput/part_00000

In 0.16.1, now it returns an internal hadoop for the task output temporary location.

For the above example: /user/foo/myoutput/_temporary/part_00000

This change breaks applications that use the getOutputPath() to compute other directories.

IMO, this has always being broken, Hadoop should not change the values of properties injected by the client, instead it should use private properties or internal helper methods. 


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3041) Within a task, the value ofJobConf.getOutputPath() method is modified

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12580447#action_12580447 ] 

Devaraj Das commented on HADOOP-3041:
-------------------------------------

bq. Indeed. Any applications that implementtheir own output format class depend on the current semantics of getOutputPath.

I guess the solution we are driving towards is that we will have an API called JobConf.getFinalOutputPath() and define a private job config variable that will store the dir what the user originally sets during job submission. This config variable is never updated except during job submission.

> Within a task, the value ofJobConf.getOutputPath() method is modified
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-3041
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3041
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.16.1
>         Environment: all
>            Reporter: Alejandro Abdelnur
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.16.2
>
>
> Until 0.16.0 the value of the getOutputPath() method, if queried within a task, pointed to the part file assigned to the task. 
> For example: /user/foo/myoutput/part_00000
> In 0.16.1, now it returns an internal hadoop for the task output temporary location.
> For the above example: /user/foo/myoutput/_temporary/part_00000
> This change breaks applications that use the getOutputPath() to compute other directories.
> IMO, this has always being broken, Hadoop should not change the values of properties injected by the client, instead it should use private properties or internal helper methods. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3041) Within a task, the value ofJobConf.getOutputPath() method is modified

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12580462#action_12580462 ] 

Devaraj Das commented on HADOOP-3041:
-------------------------------------

bq. Deprecate getOutputPath() and replace it with getCurrentOutputPath() and getFinalOutputPath() 
+1

> Within a task, the value ofJobConf.getOutputPath() method is modified
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-3041
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3041
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.16.1
>         Environment: all
>            Reporter: Alejandro Abdelnur
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.16.2
>
>
> Until 0.16.0 the value of the getOutputPath() method, if queried within a task, pointed to the part file assigned to the task. 
> For example: /user/foo/myoutput/part_00000
> In 0.16.1, now it returns an internal hadoop for the task output temporary location.
> For the above example: /user/foo/myoutput/_temporary/part_00000
> This change breaks applications that use the getOutputPath() to compute other directories.
> IMO, this has always being broken, Hadoop should not change the values of properties injected by the client, instead it should use private properties or internal helper methods. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3041) Within a task, the value ofJobConf.getOutputPath() method is modified

Posted by "Alejandro Abdelnur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12579899#action_12579899 ] 

Alejandro Abdelnur commented on HADOOP-3041:
--------------------------------------------

If there is a method returning the original path is OK. 

But, using the  rule of least surprise, wouldn't make more sense to have a getTaskOutputPath() that returns the path to the part file for the current task and leave the getOutputPath() with the user entered value?

Also the javadoc should not say 'Get the Path to the output directory for the map-reduce job' in its one line description then.




> Within a task, the value ofJobConf.getOutputPath() method is modified
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-3041
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3041
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.16.1
>         Environment: all
>            Reporter: Alejandro Abdelnur
>            Priority: Blocker
>             Fix For: 0.16.2
>
>
> Until 0.16.0 the value of the getOutputPath() method, if queried within a task, pointed to the part file assigned to the task. 
> For example: /user/foo/myoutput/part_00000
> In 0.16.1, now it returns an internal hadoop for the task output temporary location.
> For the above example: /user/foo/myoutput/_temporary/part_00000
> This change breaks applications that use the getOutputPath() to compute other directories.
> IMO, this has always being broken, Hadoop should not change the values of properties injected by the client, instead it should use private properties or internal helper methods. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-3041) Within a task, the value ofJobConf.getOutputPath() method is modified

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-3041:
--------------------------------------------

    Status: Patch Available  (was: Open)

> Within a task, the value ofJobConf.getOutputPath() method is modified
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-3041
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3041
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.16.1
>         Environment: all
>            Reporter: Alejandro Abdelnur
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.17.0
>
>         Attachments: patch-3041-0.16.2.txt, patch-3041.txt, patch-3041.txt, patch-3041.txt, patch-3041.txt, patch-3041.txt
>
>
> Until 0.16.0 the value of the getOutputPath() method, if queried within a task, pointed to the part file assigned to the task. 
> For example: /user/foo/myoutput/part_00000
> In 0.16.1, now it returns an internal hadoop for the task output temporary location.
> For the above example: /user/foo/myoutput/_temporary/part_00000
> This change breaks applications that use the getOutputPath() to compute other directories.
> IMO, this has always being broken, Hadoop should not change the values of properties injected by the client, instead it should use private properties or internal helper methods. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3041) Within a task, the value ofJobConf.getOutputPath() method is modified

Posted by "Alejandro Abdelnur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12581863#action_12581863 ] 

Alejandro Abdelnur commented on HADOOP-3041:
--------------------------------------------

IMO Owen's proposal is the correct one. 

I'm fine with reverting behavior to the 0.16.0 if it gets addressed in 0.17 as we can workaround it.

> Within a task, the value ofJobConf.getOutputPath() method is modified
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-3041
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3041
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.16.1
>         Environment: all
>            Reporter: Alejandro Abdelnur
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.16.2
>
>         Attachments: patch-3041-0.16.2.txt, patch-3041.txt
>
>
> Until 0.16.0 the value of the getOutputPath() method, if queried within a task, pointed to the part file assigned to the task. 
> For example: /user/foo/myoutput/part_00000
> In 0.16.1, now it returns an internal hadoop for the task output temporary location.
> For the above example: /user/foo/myoutput/_temporary/part_00000
> This change breaks applications that use the getOutputPath() to compute other directories.
> IMO, this has always being broken, Hadoop should not change the values of properties injected by the client, instead it should use private properties or internal helper methods. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3041) Within a task, the value ofJobConf.getOutputPath() method is modified

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12582892#action_12582892 ] 

Amareshwari Sriramadasu commented on HADOOP-3041:
-------------------------------------------------

Java doc warning is due to src/java/org/apache/hadoop/net/SocketInputStream.java, not related to this patch.

> Within a task, the value ofJobConf.getOutputPath() method is modified
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-3041
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3041
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.16.1
>         Environment: all
>            Reporter: Alejandro Abdelnur
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.17.0
>
>         Attachments: patch-3041-0.16.2.txt, patch-3041.txt, patch-3041.txt, patch-3041.txt
>
>
> Until 0.16.0 the value of the getOutputPath() method, if queried within a task, pointed to the part file assigned to the task. 
> For example: /user/foo/myoutput/part_00000
> In 0.16.1, now it returns an internal hadoop for the task output temporary location.
> For the above example: /user/foo/myoutput/_temporary/part_00000
> This change breaks applications that use the getOutputPath() to compute other directories.
> IMO, this has always being broken, Hadoop should not change the values of properties injected by the client, instead it should use private properties or internal helper methods. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3041) Within a task, the value ofJobConf.getOutputPath() method is modified

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12585647#action_12585647 ] 

Hadoop QA commented on HADOOP-3041:
-----------------------------------

+1 overall.  Here are the results of testing the latest attachment 
http://issues.apache.org/jira/secure/attachment/12379370/patch-3041.txt
against trunk revision 643282.

    @author +1.  The patch does not contain any @author tags.

    tests included +1.  The patch appears to include 114 new or modified tests.

    javadoc +1.  The javadoc tool did not generate any warning messages.

    javac +1.  The applied patch does not generate any new javac compiler warnings.

    release audit +1.  The applied patch does not generate any new release audit warnings.

    findbugs +1.  The patch does not introduce any new Findbugs warnings.

    core tests +1.  The patch passed core unit tests.

    contrib tests +1.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2163/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2163/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2163/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2163/console

This message is automatically generated.

> Within a task, the value ofJobConf.getOutputPath() method is modified
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-3041
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3041
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.16.1
>         Environment: all
>            Reporter: Alejandro Abdelnur
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.17.0
>
>         Attachments: patch-3041-0.16.2.txt, patch-3041.txt, patch-3041.txt, patch-3041.txt, patch-3041.txt, patch-3041.txt, patch-3041.txt
>
>
> Until 0.16.0 the value of the getOutputPath() method, if queried within a task, pointed to the part file assigned to the task. 
> For example: /user/foo/myoutput/part_00000
> In 0.16.1, now it returns an internal hadoop for the task output temporary location.
> For the above example: /user/foo/myoutput/_temporary/part_00000
> This change breaks applications that use the getOutputPath() to compute other directories.
> IMO, this has always being broken, Hadoop should not change the values of properties injected by the client, instead it should use private properties or internal helper methods. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3041) Within a task, the value ofJobConf.getOutputPath() method is modified

Posted by "Sameer Paranjpye (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12580452#action_12580452 ] 

Sameer Paranjpye commented on HADOOP-3041:
------------------------------------------

Silently changing public API semantics is bad, looks like we've done that here. How about we:
- Deprecate getOutputPath() and replace it with getCurrentOutputPath() and getFinalOutputPath()
- Have getOutputPath() return the same thing as getCurrentOutputPath() because that breaks the least amount of code


> Within a task, the value ofJobConf.getOutputPath() method is modified
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-3041
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3041
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.16.1
>         Environment: all
>            Reporter: Alejandro Abdelnur
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.16.2
>
>
> Until 0.16.0 the value of the getOutputPath() method, if queried within a task, pointed to the part file assigned to the task. 
> For example: /user/foo/myoutput/part_00000
> In 0.16.1, now it returns an internal hadoop for the task output temporary location.
> For the above example: /user/foo/myoutput/_temporary/part_00000
> This change breaks applications that use the getOutputPath() to compute other directories.
> IMO, this has always being broken, Hadoop should not change the values of properties injected by the client, instead it should use private properties or internal helper methods. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-3041) Within a task, the value ofJobConf.getOutputPath() method is modified

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-3041:
--------------------------------------------

    Status: Open  (was: Patch Available)

> Within a task, the value ofJobConf.getOutputPath() method is modified
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-3041
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3041
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.16.1
>         Environment: all
>            Reporter: Alejandro Abdelnur
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.17.0
>
>         Attachments: patch-3041-0.16.2.txt, patch-3041.txt, patch-3041.txt, patch-3041.txt, patch-3041.txt
>
>
> Until 0.16.0 the value of the getOutputPath() method, if queried within a task, pointed to the part file assigned to the task. 
> For example: /user/foo/myoutput/part_00000
> In 0.16.1, now it returns an internal hadoop for the task output temporary location.
> For the above example: /user/foo/myoutput/_temporary/part_00000
> This change breaks applications that use the getOutputPath() to compute other directories.
> IMO, this has always being broken, Hadoop should not change the values of properties injected by the client, instead it should use private properties or internal helper methods. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3041) Within a task, the value ofJobConf.getOutputPath() method is modified

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12579846#action_12579846 ] 

Devaraj Das commented on HADOOP-3041:
-------------------------------------

Alejandro, the reason for modifying the job's output dir is to let user apps transparently deal with things like creation of side files in the task's output directory, and, speculative tasks creating the same output files. Another reason is that the getOutputPath can be used (and is usually used) in the OutputFormat implementation. All user code could use getOutputPath and create task specific stuff there and the framework automatically promotes/discards these files upon successful/failed task completion. Look at the JavaDoc in JobConf.getOutputPath() to get a clear explanation of what i am trying to say (by the way this doc needs to be fixed to include _temporary).
You are facing the problem since you create a directory in the _same level_ as the _actual_ output directory of the job. One way to address your problem is to provide an additional API like JobConf.getConfiguredOutputPath that would internally do things like getOutputPath.getParent(), etc. and return you the actual configured directory. This will ensure that your apps don't break when the framework changes the directory structure of the output path, etc. Not the best solution but we have to arrive at a compromise between your requirement and what we already document and provide. Thoughts?

> Within a task, the value ofJobConf.getOutputPath() method is modified
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-3041
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3041
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.16.1
>         Environment: all
>            Reporter: Alejandro Abdelnur
>            Priority: Blocker
>             Fix For: 0.16.2
>
>
> Until 0.16.0 the value of the getOutputPath() method, if queried within a task, pointed to the part file assigned to the task. 
> For example: /user/foo/myoutput/part_00000
> In 0.16.1, now it returns an internal hadoop for the task output temporary location.
> For the above example: /user/foo/myoutput/_temporary/part_00000
> This change breaks applications that use the getOutputPath() to compute other directories.
> IMO, this has always being broken, Hadoop should not change the values of properties injected by the client, instead it should use private properties or internal helper methods. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3041) Within a task, the value ofJobConf.getOutputPath() method is modified

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12585963#action_12585963 ] 

Hudson commented on HADOOP-3041:
--------------------------------

Integrated in Hadoop-trunk #451 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/451/])

> Within a task, the value ofJobConf.getOutputPath() method is modified
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-3041
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3041
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.16.1
>         Environment: all
>            Reporter: Alejandro Abdelnur
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.17.0
>
>         Attachments: patch-3041-0.16.2.txt, patch-3041.txt, patch-3041.txt, patch-3041.txt, patch-3041.txt, patch-3041.txt, patch-3041.txt
>
>
> Until 0.16.0 the value of the getOutputPath() method, if queried within a task, pointed to the part file assigned to the task. 
> For example: /user/foo/myoutput/part_00000
> In 0.16.1, now it returns an internal hadoop for the task output temporary location.
> For the above example: /user/foo/myoutput/_temporary/part_00000
> This change breaks applications that use the getOutputPath() to compute other directories.
> IMO, this has always being broken, Hadoop should not change the values of properties injected by the client, instead it should use private properties or internal helper methods. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-3041) Within a task, the value ofJobConf.getOutputPath() method is modified

Posted by "Nigel Daley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nigel Daley updated HADOOP-3041:
--------------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

I just committed this.  Thanks Amareshwari!  Can we add a release note for this?

> Within a task, the value ofJobConf.getOutputPath() method is modified
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-3041
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3041
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.16.1
>         Environment: all
>            Reporter: Alejandro Abdelnur
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.17.0
>
>         Attachments: patch-3041-0.16.2.txt, patch-3041.txt, patch-3041.txt, patch-3041.txt, patch-3041.txt, patch-3041.txt, patch-3041.txt
>
>
> Until 0.16.0 the value of the getOutputPath() method, if queried within a task, pointed to the part file assigned to the task. 
> For example: /user/foo/myoutput/part_00000
> In 0.16.1, now it returns an internal hadoop for the task output temporary location.
> For the above example: /user/foo/myoutput/_temporary/part_00000
> This change breaks applications that use the getOutputPath() to compute other directories.
> IMO, this has always being broken, Hadoop should not change the values of properties injected by the client, instead it should use private properties or internal helper methods. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3041) Within a task, the value ofJobConf.getOutputPath() method is modified

Posted by "Owen O'Malley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12583999#action_12583999 ] 

Owen O'Malley commented on HADOOP-3041:
---------------------------------------

*Sigh*

One option is to move both get and set output path to FileInputFormat, which is where they actually belong. I only hate to do it because moving JobConf.setOutputPath will break 99% of map/reduce applications. In that case, it would look like:

{code}
FileInputFormat:
  getOutputPath
  setOutputPath
  getTaskOutputPath

JobConf:
  deprecate getOutputPath
  deprecate setOutputPath
{code}

> Within a task, the value ofJobConf.getOutputPath() method is modified
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-3041
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3041
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.16.1
>         Environment: all
>            Reporter: Alejandro Abdelnur
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.17.0
>
>         Attachments: patch-3041-0.16.2.txt, patch-3041.txt, patch-3041.txt, patch-3041.txt, patch-3041.txt
>
>
> Until 0.16.0 the value of the getOutputPath() method, if queried within a task, pointed to the part file assigned to the task. 
> For example: /user/foo/myoutput/part_00000
> In 0.16.1, now it returns an internal hadoop for the task output temporary location.
> For the above example: /user/foo/myoutput/_temporary/part_00000
> This change breaks applications that use the getOutputPath() to compute other directories.
> IMO, this has always being broken, Hadoop should not change the values of properties injected by the client, instead it should use private properties or internal helper methods. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-3041) Within a task, the value ofJobConf.getOutputPath() method is modified

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das updated HADOOP-3041:
--------------------------------

    Release Note: 
1. Deprecates JobConf.setOutputPath and JobConf.getOutputPath
JobConf.getOutputPath() still returns the same value that it used to return. 
2. Deprecates OutputFormatBase. Adds FileOutputFormat. Existing output formats extending OutputFormatBase, now extend FileOutputFormat.
3. Adds the following APIs in FileOutputFormat :
public static void setOutputPath(JobConf conf, Path outputDir); // sets mapred.output.dir
public static Path getOutputPath(JobConf conf) ; // gets mapred.output.dir
public static Path getWorkOutputPath(JobConf conf); // gets mapred.work.output.dir
4. static void setWorkOutputPath(JobConf conf, Path outputDir) is also added to FileOutputFormat. This is used by the framework to set mapred.work.output.dir as task's temporary output dir .

> Within a task, the value ofJobConf.getOutputPath() method is modified
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-3041
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3041
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.16.1
>         Environment: all
>            Reporter: Alejandro Abdelnur
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.17.0
>
>         Attachments: patch-3041-0.16.2.txt, patch-3041.txt, patch-3041.txt, patch-3041.txt, patch-3041.txt, patch-3041.txt, patch-3041.txt
>
>
> Until 0.16.0 the value of the getOutputPath() method, if queried within a task, pointed to the part file assigned to the task. 
> For example: /user/foo/myoutput/part_00000
> In 0.16.1, now it returns an internal hadoop for the task output temporary location.
> For the above example: /user/foo/myoutput/_temporary/part_00000
> This change breaks applications that use the getOutputPath() to compute other directories.
> IMO, this has always being broken, Hadoop should not change the values of properties injected by the client, instead it should use private properties or internal helper methods. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-3041) Within a task, the value ofJobConf.getOutputPath() method is modified

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-3041:
--------------------------------------------

    Attachment: patch-3041.txt

After the discussions, attaching a patch that does the following:

1. Deprecates JobConf.setOutputPath and JobConf.getOutputPath
JobConf.getOutputPath() still returns the same value that it used to return. 
2. Deprecates OutputFormatBase.  Adds FileOutputFormat. Existing  output formats extending OutputFormatBase,  are now extending FileOutputFormat.
3. Adds the following APIs in FileOutputFormat :
public static void setOutputPath(JobConf conf, Path outputDir); // sets mapred.output.dir
public static Path getOutputPath(JobConf conf) ; // gets mapred.output.dir
public static Path getWorkOutputPath(JobConf conf); // gets mapred.work.output.dir
4.  static void setWorkOutputPath(JobConf conf, Path outputDir)  is also added to FileOutputFormat. This is used by the framework to set mapred.work.output.dir as task's temporary output dir .


> Within a task, the value ofJobConf.getOutputPath() method is modified
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-3041
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3041
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.16.1
>         Environment: all
>            Reporter: Alejandro Abdelnur
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.17.0
>
>         Attachments: patch-3041-0.16.2.txt, patch-3041.txt, patch-3041.txt, patch-3041.txt, patch-3041.txt, patch-3041.txt
>
>
> Until 0.16.0 the value of the getOutputPath() method, if queried within a task, pointed to the part file assigned to the task. 
> For example: /user/foo/myoutput/part_00000
> In 0.16.1, now it returns an internal hadoop for the task output temporary location.
> For the above example: /user/foo/myoutput/_temporary/part_00000
> This change breaks applications that use the getOutputPath() to compute other directories.
> IMO, this has always being broken, Hadoop should not change the values of properties injected by the client, instead it should use private properties or internal helper methods. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-3041) Within a task, the value ofJobConf.getOutputPath() method is modified

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-3041:
--------------------------------------------

    Status: Patch Available  (was: Open)

> Within a task, the value ofJobConf.getOutputPath() method is modified
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-3041
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3041
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.16.1
>         Environment: all
>            Reporter: Alejandro Abdelnur
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.16.2
>
>         Attachments: patch-3041.txt
>
>
> Until 0.16.0 the value of the getOutputPath() method, if queried within a task, pointed to the part file assigned to the task. 
> For example: /user/foo/myoutput/part_00000
> In 0.16.1, now it returns an internal hadoop for the task output temporary location.
> For the above example: /user/foo/myoutput/_temporary/part_00000
> This change breaks applications that use the getOutputPath() to compute other directories.
> IMO, this has always being broken, Hadoop should not change the values of properties injected by the client, instead it should use private properties or internal helper methods. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-3041) Within a task, the value ofJobConf.getOutputPath() method is modified

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das updated HADOOP-3041:
--------------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

I just committed this. Thanks, Amareshwari!

> Within a task, the value ofJobConf.getOutputPath() method is modified
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-3041
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3041
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.16.1
>         Environment: all
>            Reporter: Alejandro Abdelnur
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.16.2
>
>         Attachments: patch-3041-0.16.2.txt, patch-3041.txt
>
>
> Until 0.16.0 the value of the getOutputPath() method, if queried within a task, pointed to the part file assigned to the task. 
> For example: /user/foo/myoutput/part_00000
> In 0.16.1, now it returns an internal hadoop for the task output temporary location.
> For the above example: /user/foo/myoutput/_temporary/part_00000
> This change breaks applications that use the getOutputPath() to compute other directories.
> IMO, this has always being broken, Hadoop should not change the values of properties injected by the client, instead it should use private properties or internal helper methods. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3041) Within a task, the value ofJobConf.getOutputPath() method is modified

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12581094#action_12581094 ] 

Devaraj Das commented on HADOOP-3041:
-------------------------------------

Could you please submit a patch for the 0.16 branch. This one doesn't apply cleanly. Thanks!

> Within a task, the value ofJobConf.getOutputPath() method is modified
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-3041
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3041
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.16.1
>         Environment: all
>            Reporter: Alejandro Abdelnur
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.16.2
>
>         Attachments: patch-3041.txt
>
>
> Until 0.16.0 the value of the getOutputPath() method, if queried within a task, pointed to the part file assigned to the task. 
> For example: /user/foo/myoutput/part_00000
> In 0.16.1, now it returns an internal hadoop for the task output temporary location.
> For the above example: /user/foo/myoutput/_temporary/part_00000
> This change breaks applications that use the getOutputPath() to compute other directories.
> IMO, this has always being broken, Hadoop should not change the values of properties injected by the client, instead it should use private properties or internal helper methods. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Assigned: (HADOOP-3041) Within a task, the value ofJobConf.getOutputPath() method is modified

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu reassigned HADOOP-3041:
-----------------------------------------------

    Assignee: Amareshwari Sriramadasu

> Within a task, the value ofJobConf.getOutputPath() method is modified
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-3041
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3041
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.16.1
>         Environment: all
>            Reporter: Alejandro Abdelnur
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.16.2
>
>
> Until 0.16.0 the value of the getOutputPath() method, if queried within a task, pointed to the part file assigned to the task. 
> For example: /user/foo/myoutput/part_00000
> In 0.16.1, now it returns an internal hadoop for the task output temporary location.
> For the above example: /user/foo/myoutput/_temporary/part_00000
> This change breaks applications that use the getOutputPath() to compute other directories.
> IMO, this has always being broken, Hadoop should not change the values of properties injected by the client, instead it should use private properties or internal helper methods. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3041) Within a task, the value ofJobConf.getOutputPath() method is modified

Posted by "Runping Qi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12580342#action_12580342 ] 

Runping Qi commented on HADOOP-3041:
------------------------------------


bq: Possibly. One thing that is of concern here is that apps potentially have been written using the getOutputPath API (that creates side files within it).. 

Indeed. Any applications that implementtheir own output format class depend on the current semantics of getOutputPath.
I have many of such applications.



> Within a task, the value ofJobConf.getOutputPath() method is modified
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-3041
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3041
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.16.1
>         Environment: all
>            Reporter: Alejandro Abdelnur
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.16.2
>
>
> Until 0.16.0 the value of the getOutputPath() method, if queried within a task, pointed to the part file assigned to the task. 
> For example: /user/foo/myoutput/part_00000
> In 0.16.1, now it returns an internal hadoop for the task output temporary location.
> For the above example: /user/foo/myoutput/_temporary/part_00000
> This change breaks applications that use the getOutputPath() to compute other directories.
> IMO, this has always being broken, Hadoop should not change the values of properties injected by the client, instead it should use private properties or internal helper methods. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3041) Within a task, the value ofJobConf.getOutputPath() method is modified

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12580991#action_12580991 ] 

Hadoop QA commented on HADOOP-3041:
-----------------------------------

+1 overall.  Here are the results of testing the latest attachment 
http://issues.apache.org/jira/secure/attachment/12378306/patch-3041.txt
against trunk revision 619744.

    @author +1.  The patch does not contain any @author tags.

    tests included +1.  The patch appears to include 15 new or modified tests.

    javadoc +1.  The javadoc tool did not generate any warning messages.

    javac +1.  The applied patch does not generate any new javac compiler warnings.

    release audit +1.  The applied patch does not generate any new release audit warnings.

    findbugs +1.  The patch does not introduce any new Findbugs warnings.

    core tests +1.  The patch passed core unit tests.

    contrib tests +1.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2016/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2016/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2016/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2016/console

This message is automatically generated.

> Within a task, the value ofJobConf.getOutputPath() method is modified
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-3041
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3041
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.16.1
>         Environment: all
>            Reporter: Alejandro Abdelnur
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.16.2
>
>         Attachments: patch-3041.txt
>
>
> Until 0.16.0 the value of the getOutputPath() method, if queried within a task, pointed to the part file assigned to the task. 
> For example: /user/foo/myoutput/part_00000
> In 0.16.1, now it returns an internal hadoop for the task output temporary location.
> For the above example: /user/foo/myoutput/_temporary/part_00000
> This change breaks applications that use the getOutputPath() to compute other directories.
> IMO, this has always being broken, Hadoop should not change the values of properties injected by the client, instead it should use private properties or internal helper methods. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Issue Comment Edited: (HADOOP-3041) Within a task, the value ofJobConf.getOutputPath() method is modified

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12581094#action_12581094 ] 

devaraj edited comment on HADOOP-3041 at 3/21/08 8:33 AM:
--------------------------------------------------------------

I committed the patch to trunk. Could you please submit a patch for the 0.16 branch. This one doesn't apply cleanly. Thanks!

      was (Author: devaraj):
    Could you please submit a patch for the 0.16 branch. This one doesn't apply cleanly. Thanks!
  
> Within a task, the value ofJobConf.getOutputPath() method is modified
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-3041
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3041
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.16.1
>         Environment: all
>            Reporter: Alejandro Abdelnur
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.16.2
>
>         Attachments: patch-3041.txt
>
>
> Until 0.16.0 the value of the getOutputPath() method, if queried within a task, pointed to the part file assigned to the task. 
> For example: /user/foo/myoutput/part_00000
> In 0.16.1, now it returns an internal hadoop for the task output temporary location.
> For the above example: /user/foo/myoutput/_temporary/part_00000
> This change breaks applications that use the getOutputPath() to compute other directories.
> IMO, this has always being broken, Hadoop should not change the values of properties injected by the client, instead it should use private properties or internal helper methods. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3041) Within a task, the value ofJobConf.getOutputPath() method is modified

Posted by "Alejandro Abdelnur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12584003#action_12584003 ] 

Alejandro Abdelnur commented on HADOOP-3041:
--------------------------------------------

Our applications, which which are running on a previous hadoop version, when migrating to 0.16.0+ are failing because we assumed the returned value was the path of output part without any temporary stuff in it. So we are broken as well.

As the Hadoop API gets refined changes like this will break things, for example FileSystem listPaths() now returns NULL instead an empty array when the dir does not exist. 

It is kind of painful but I would not deprecating methods with the right name because they were returning incorrect data.

IMO the right thing to do is to have the getOutputPath() with the configured value and getWorkingOutputPath() with the temporary dir.


> Within a task, the value ofJobConf.getOutputPath() method is modified
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-3041
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3041
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.16.1
>         Environment: all
>            Reporter: Alejandro Abdelnur
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.17.0
>
>         Attachments: patch-3041-0.16.2.txt, patch-3041.txt, patch-3041.txt, patch-3041.txt, patch-3041.txt
>
>
> Until 0.16.0 the value of the getOutputPath() method, if queried within a task, pointed to the part file assigned to the task. 
> For example: /user/foo/myoutput/part_00000
> In 0.16.1, now it returns an internal hadoop for the task output temporary location.
> For the above example: /user/foo/myoutput/_temporary/part_00000
> This change breaks applications that use the getOutputPath() to compute other directories.
> IMO, this has always being broken, Hadoop should not change the values of properties injected by the client, instead it should use private properties or internal helper methods. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-3041) Within a task, the value ofJobConf.getOutputPath() method is modified

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-3041:
--------------------------------------------

    Attachment: patch-3041.txt

Here is a patch that deprecates getOutputPath() and adds new apis getCurrentOutputPath() and getFinalOutputPath() (all in JobConf)

> Within a task, the value ofJobConf.getOutputPath() method is modified
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-3041
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3041
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.16.1
>         Environment: all
>            Reporter: Alejandro Abdelnur
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.16.2
>
>         Attachments: patch-3041.txt
>
>
> Until 0.16.0 the value of the getOutputPath() method, if queried within a task, pointed to the part file assigned to the task. 
> For example: /user/foo/myoutput/part_00000
> In 0.16.1, now it returns an internal hadoop for the task output temporary location.
> For the above example: /user/foo/myoutput/_temporary/part_00000
> This change breaks applications that use the getOutputPath() to compute other directories.
> IMO, this has always being broken, Hadoop should not change the values of properties injected by the client, instead it should use private properties or internal helper methods. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3041) Within a task, the value ofJobConf.getOutputPath() method is modified

Posted by "Owen O'Malley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12585317#action_12585317 ] 

Owen O'Malley commented on HADOOP-3041:
---------------------------------------

This looks like the right direction.

> Within a task, the value ofJobConf.getOutputPath() method is modified
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-3041
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3041
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.16.1
>         Environment: all
>            Reporter: Alejandro Abdelnur
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.17.0
>
>         Attachments: patch-3041-0.16.2.txt, patch-3041.txt, patch-3041.txt, patch-3041.txt, patch-3041.txt, patch-3041.txt
>
>
> Until 0.16.0 the value of the getOutputPath() method, if queried within a task, pointed to the part file assigned to the task. 
> For example: /user/foo/myoutput/part_00000
> In 0.16.1, now it returns an internal hadoop for the task output temporary location.
> For the above example: /user/foo/myoutput/_temporary/part_00000
> This change breaks applications that use the getOutputPath() to compute other directories.
> IMO, this has always being broken, Hadoop should not change the values of properties injected by the client, instead it should use private properties or internal helper methods. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-3041) Within a task, the value ofJobConf.getOutputPath() method is modified

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-3041:
--------------------------------------------

    Attachment: patch-3041-0.16.2.txt

Patch for 0.16 branch. It passed all the ant tests successfully on my machine on branch 0.16.

> Within a task, the value ofJobConf.getOutputPath() method is modified
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-3041
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3041
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.16.1
>         Environment: all
>            Reporter: Alejandro Abdelnur
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.16.2
>
>         Attachments: patch-3041-0.16.2.txt, patch-3041.txt
>
>
> Until 0.16.0 the value of the getOutputPath() method, if queried within a task, pointed to the part file assigned to the task. 
> For example: /user/foo/myoutput/part_00000
> In 0.16.1, now it returns an internal hadoop for the task output temporary location.
> For the above example: /user/foo/myoutput/_temporary/part_00000
> This change breaks applications that use the getOutputPath() to compute other directories.
> IMO, this has always being broken, Hadoop should not change the values of properties injected by the client, instead it should use private properties or internal helper methods. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3041) Within a task, the value ofJobConf.getOutputPath() method is modified

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12583620#action_12583620 ] 

Hadoop QA commented on HADOOP-3041:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
http://issues.apache.org/jira/secure/attachment/12378939/patch-3041.txt
against trunk revision 619744.

    @author +1.  The patch does not contain any @author tags.

    tests included +1.  The patch appears to include 6 new or modified tests.

    patch -1.  The patch command could not apply the patch.

Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2101/console

This message is automatically generated.

> Within a task, the value ofJobConf.getOutputPath() method is modified
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-3041
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3041
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.16.1
>         Environment: all
>            Reporter: Alejandro Abdelnur
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.17.0
>
>         Attachments: patch-3041-0.16.2.txt, patch-3041.txt, patch-3041.txt, patch-3041.txt, patch-3041.txt
>
>
> Until 0.16.0 the value of the getOutputPath() method, if queried within a task, pointed to the part file assigned to the task. 
> For example: /user/foo/myoutput/part_00000
> In 0.16.1, now it returns an internal hadoop for the task output temporary location.
> For the above example: /user/foo/myoutput/_temporary/part_00000
> This change breaks applications that use the getOutputPath() to compute other directories.
> IMO, this has always being broken, Hadoop should not change the values of properties injected by the client, instead it should use private properties or internal helper methods. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3041) Within a task, the value ofJobConf.getOutputPath() method is modified

Posted by "Runping Qi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12583682#action_12583682 ] 

Runping Qi commented on HADOOP-3041:
------------------------------------


My applications assume that getOutputPath() returns the temporary working dir.
Will my apps be broken by this patch?
I think many people are in  similar situations.


> Within a task, the value ofJobConf.getOutputPath() method is modified
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-3041
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3041
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.16.1
>         Environment: all
>            Reporter: Alejandro Abdelnur
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.17.0
>
>         Attachments: patch-3041-0.16.2.txt, patch-3041.txt, patch-3041.txt, patch-3041.txt, patch-3041.txt
>
>
> Until 0.16.0 the value of the getOutputPath() method, if queried within a task, pointed to the part file assigned to the task. 
> For example: /user/foo/myoutput/part_00000
> In 0.16.1, now it returns an internal hadoop for the task output temporary location.
> For the above example: /user/foo/myoutput/_temporary/part_00000
> This change breaks applications that use the getOutputPath() to compute other directories.
> IMO, this has always being broken, Hadoop should not change the values of properties injected by the client, instead it should use private properties or internal helper methods. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-3041) Within a task, the value ofJobConf.getOutputPath() method is modified

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das updated HADOOP-3041:
--------------------------------

    Hadoop Flags: [Reviewed]

+1

> Within a task, the value ofJobConf.getOutputPath() method is modified
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-3041
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3041
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.16.1
>         Environment: all
>            Reporter: Alejandro Abdelnur
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.17.0
>
>         Attachments: patch-3041-0.16.2.txt, patch-3041.txt, patch-3041.txt, patch-3041.txt, patch-3041.txt, patch-3041.txt, patch-3041.txt
>
>
> Until 0.16.0 the value of the getOutputPath() method, if queried within a task, pointed to the part file assigned to the task. 
> For example: /user/foo/myoutput/part_00000
> In 0.16.1, now it returns an internal hadoop for the task output temporary location.
> For the above example: /user/foo/myoutput/_temporary/part_00000
> This change breaks applications that use the getOutputPath() to compute other directories.
> IMO, this has always being broken, Hadoop should not change the values of properties injected by the client, instead it should use private properties or internal helper methods. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-3041) Within a task, the value ofJobConf.getOutputPath() method is modified

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-3041:
--------------------------------------------

    Attachment: patch-3041.txt

Patch has setWorkOutputDirectory code similar to setOutputPath.

> Within a task, the value ofJobConf.getOutputPath() method is modified
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-3041
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3041
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.16.1
>         Environment: all
>            Reporter: Alejandro Abdelnur
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.17.0
>
>         Attachments: patch-3041-0.16.2.txt, patch-3041.txt, patch-3041.txt, patch-3041.txt, patch-3041.txt
>
>
> Until 0.16.0 the value of the getOutputPath() method, if queried within a task, pointed to the part file assigned to the task. 
> For example: /user/foo/myoutput/part_00000
> In 0.16.1, now it returns an internal hadoop for the task output temporary location.
> For the above example: /user/foo/myoutput/_temporary/part_00000
> This change breaks applications that use the getOutputPath() to compute other directories.
> IMO, this has always being broken, Hadoop should not change the values of properties injected by the client, instead it should use private properties or internal helper methods. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Reopened: (HADOOP-3041) Within a task, the value ofJobConf.getOutputPath() method is modified

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das reopened HADOOP-3041:
---------------------------------


I reverted the patch

> Within a task, the value ofJobConf.getOutputPath() method is modified
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-3041
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3041
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.16.1
>         Environment: all
>            Reporter: Alejandro Abdelnur
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.16.2
>
>         Attachments: patch-3041-0.16.2.txt, patch-3041.txt
>
>
> Until 0.16.0 the value of the getOutputPath() method, if queried within a task, pointed to the part file assigned to the task. 
> For example: /user/foo/myoutput/part_00000
> In 0.16.1, now it returns an internal hadoop for the task output temporary location.
> For the above example: /user/foo/myoutput/_temporary/part_00000
> This change breaks applications that use the getOutputPath() to compute other directories.
> IMO, this has always being broken, Hadoop should not change the values of properties injected by the client, instead it should use private properties or internal helper methods. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3041) Within a task, the value ofJobConf.getOutputPath() method is modified

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12582290#action_12582290 ] 

Hudson commented on HADOOP-3041:
--------------------------------

Integrated in Hadoop-trunk #442 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/442/])

> Within a task, the value ofJobConf.getOutputPath() method is modified
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-3041
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3041
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.16.1
>         Environment: all
>            Reporter: Alejandro Abdelnur
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.17.0
>
>         Attachments: patch-3041-0.16.2.txt, patch-3041.txt
>
>
> Until 0.16.0 the value of the getOutputPath() method, if queried within a task, pointed to the part file assigned to the task. 
> For example: /user/foo/myoutput/part_00000
> In 0.16.1, now it returns an internal hadoop for the task output temporary location.
> For the above example: /user/foo/myoutput/_temporary/part_00000
> This change breaks applications that use the getOutputPath() to compute other directories.
> IMO, this has always being broken, Hadoop should not change the values of properties injected by the client, instead it should use private properties or internal helper methods. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3041) Within a task, the value ofJobConf.getOutputPath() method is modified

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12582708#action_12582708 ] 

Hadoop QA commented on HADOOP-3041:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
http://issues.apache.org/jira/secure/attachment/12378715/patch-3041.txt
against trunk revision 619744.

    @author +1.  The patch does not contain any @author tags.

    tests included +1.  The patch appears to include 6 new or modified tests.

    javadoc -1.  The javadoc tool appears to have generated 1 warning messages.

    javac +1.  The applied patch does not generate any new javac compiler warnings.

    release audit +1.  The applied patch does not generate any new release audit warnings.

    findbugs +1.  The patch does not introduce any new Findbugs warnings.

    core tests +1.  The patch passed core unit tests.

    contrib tests +1.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2077/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2077/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2077/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2077/console

This message is automatically generated.

> Within a task, the value ofJobConf.getOutputPath() method is modified
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-3041
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3041
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.16.1
>         Environment: all
>            Reporter: Alejandro Abdelnur
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.17.0
>
>         Attachments: patch-3041-0.16.2.txt, patch-3041.txt, patch-3041.txt, patch-3041.txt
>
>
> Until 0.16.0 the value of the getOutputPath() method, if queried within a task, pointed to the part file assigned to the task. 
> For example: /user/foo/myoutput/part_00000
> In 0.16.1, now it returns an internal hadoop for the task output temporary location.
> For the above example: /user/foo/myoutput/_temporary/part_00000
> This change breaks applications that use the getOutputPath() to compute other directories.
> IMO, this has always being broken, Hadoop should not change the values of properties injected by the client, instead it should use private properties or internal helper methods. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-3041) Within a task, the value ofJobConf.getOutputPath() method is modified

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-3041:
--------------------------------------------

    Attachment: patch-3041.txt

Patch in sync with the trunk

> Within a task, the value ofJobConf.getOutputPath() method is modified
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-3041
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3041
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.16.1
>         Environment: all
>            Reporter: Alejandro Abdelnur
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.17.0
>
>         Attachments: patch-3041-0.16.2.txt, patch-3041.txt, patch-3041.txt, patch-3041.txt, patch-3041.txt, patch-3041.txt, patch-3041.txt
>
>
> Until 0.16.0 the value of the getOutputPath() method, if queried within a task, pointed to the part file assigned to the task. 
> For example: /user/foo/myoutput/part_00000
> In 0.16.1, now it returns an internal hadoop for the task output temporary location.
> For the above example: /user/foo/myoutput/_temporary/part_00000
> This change breaks applications that use the getOutputPath() to compute other directories.
> IMO, this has always being broken, Hadoop should not change the values of properties injected by the client, instead it should use private properties or internal helper methods. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-3041) Within a task, the value ofJobConf.getOutputPath() method is modified

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-3041:
--------------------------------------------

    Status: Patch Available  (was: Open)

> Within a task, the value ofJobConf.getOutputPath() method is modified
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-3041
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3041
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.16.1
>         Environment: all
>            Reporter: Alejandro Abdelnur
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.17.0
>
>         Attachments: patch-3041-0.16.2.txt, patch-3041.txt, patch-3041.txt, patch-3041.txt
>
>
> Until 0.16.0 the value of the getOutputPath() method, if queried within a task, pointed to the part file assigned to the task. 
> For example: /user/foo/myoutput/part_00000
> In 0.16.1, now it returns an internal hadoop for the task output temporary location.
> For the above example: /user/foo/myoutput/_temporary/part_00000
> This change breaks applications that use the getOutputPath() to compute other directories.
> IMO, this has always being broken, Hadoop should not change the values of properties injected by the client, instead it should use private properties or internal helper methods. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-3041) Within a task, the value ofJobConf.getOutputPath() method is modified

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-3041:
--------------------------------------------

    Status: Patch Available  (was: Reopened)

> Within a task, the value ofJobConf.getOutputPath() method is modified
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-3041
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3041
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.16.1
>         Environment: all
>            Reporter: Alejandro Abdelnur
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.17.0
>
>         Attachments: patch-3041-0.16.2.txt, patch-3041.txt, patch-3041.txt
>
>
> Until 0.16.0 the value of the getOutputPath() method, if queried within a task, pointed to the part file assigned to the task. 
> For example: /user/foo/myoutput/part_00000
> In 0.16.1, now it returns an internal hadoop for the task output temporary location.
> For the above example: /user/foo/myoutput/_temporary/part_00000
> This change breaks applications that use the getOutputPath() to compute other directories.
> IMO, this has always being broken, Hadoop should not change the values of properties injected by the client, instead it should use private properties or internal helper methods. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-3041) Within a task, the value ofJobConf.getOutputPath() method is modified

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-3041:
--------------------------------------------

    Status: Open  (was: Patch Available)

Canceling patch to make setOutputPath and setWorkOutputDirectory have same code semantics.

> Within a task, the value ofJobConf.getOutputPath() method is modified
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-3041
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3041
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.16.1
>         Environment: all
>            Reporter: Alejandro Abdelnur
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.17.0
>
>         Attachments: patch-3041-0.16.2.txt, patch-3041.txt, patch-3041.txt, patch-3041.txt
>
>
> Until 0.16.0 the value of the getOutputPath() method, if queried within a task, pointed to the part file assigned to the task. 
> For example: /user/foo/myoutput/part_00000
> In 0.16.1, now it returns an internal hadoop for the task output temporary location.
> For the above example: /user/foo/myoutput/_temporary/part_00000
> This change breaks applications that use the getOutputPath() to compute other directories.
> IMO, this has always being broken, Hadoop should not change the values of properties injected by the client, instead it should use private properties or internal helper methods. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-3041) Within a task, the value ofJobConf.getOutputPath() method is modified

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-3041:
--------------------------------------------

    Status: Patch Available  (was: Open)

> Within a task, the value ofJobConf.getOutputPath() method is modified
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-3041
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3041
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.16.1
>         Environment: all
>            Reporter: Alejandro Abdelnur
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.17.0
>
>         Attachments: patch-3041-0.16.2.txt, patch-3041.txt, patch-3041.txt, patch-3041.txt, patch-3041.txt, patch-3041.txt, patch-3041.txt
>
>
> Until 0.16.0 the value of the getOutputPath() method, if queried within a task, pointed to the part file assigned to the task. 
> For example: /user/foo/myoutput/part_00000
> In 0.16.1, now it returns an internal hadoop for the task output temporary location.
> For the above example: /user/foo/myoutput/_temporary/part_00000
> This change breaks applications that use the getOutputPath() to compute other directories.
> IMO, this has always being broken, Hadoop should not change the values of properties injected by the client, instead it should use private properties or internal helper methods. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3041) Within a task, the value ofJobConf.getOutputPath() method is modified

Posted by "Owen O'Malley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12581570#action_12581570 ] 

Owen O'Malley commented on HADOOP-3041:
---------------------------------------

I think we need to revert this in 16.2. Breaking compatibility from 0.16.1 is worse that living with a recognized change from 0.16.0. Changing from getOutputPath makes the API inconsistent with setOutputPath, so for 0.17, I propose:

 * moving getOutputPath to be the final one
 * rename getCurrentOutputPath to getWorkOutputDirectory

> Within a task, the value ofJobConf.getOutputPath() method is modified
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-3041
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3041
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.16.1
>         Environment: all
>            Reporter: Alejandro Abdelnur
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.16.2
>
>         Attachments: patch-3041-0.16.2.txt, patch-3041.txt
>
>
> Until 0.16.0 the value of the getOutputPath() method, if queried within a task, pointed to the part file assigned to the task. 
> For example: /user/foo/myoutput/part_00000
> In 0.16.1, now it returns an internal hadoop for the task output temporary location.
> For the above example: /user/foo/myoutput/_temporary/part_00000
> This change breaks applications that use the getOutputPath() to compute other directories.
> IMO, this has always being broken, Hadoop should not change the values of properties injected by the client, instead it should use private properties or internal helper methods. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3041) Within a task, the value ofJobConf.getOutputPath() method is modified

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12582643#action_12582643 ] 

Hadoop QA commented on HADOOP-3041:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
http://issues.apache.org/jira/secure/attachment/12378708/patch-3041.txt
against trunk revision 619744.

    @author +1.  The patch does not contain any @author tags.

    tests included +1.  The patch appears to include 6 new or modified tests.

    javadoc -1.  The javadoc tool appears to have generated 1 warning messages.

    javac +1.  The applied patch does not generate any new javac compiler warnings.

    release audit +1.  The applied patch does not generate any new release audit warnings.

    findbugs +1.  The patch does not introduce any new Findbugs warnings.

    core tests +1.  The patch passed core unit tests.

    contrib tests +1.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2075/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2075/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2075/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2075/console

This message is automatically generated.

> Within a task, the value ofJobConf.getOutputPath() method is modified
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-3041
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3041
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.16.1
>         Environment: all
>            Reporter: Alejandro Abdelnur
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.17.0
>
>         Attachments: patch-3041-0.16.2.txt, patch-3041.txt, patch-3041.txt, patch-3041.txt
>
>
> Until 0.16.0 the value of the getOutputPath() method, if queried within a task, pointed to the part file assigned to the task. 
> For example: /user/foo/myoutput/part_00000
> In 0.16.1, now it returns an internal hadoop for the task output temporary location.
> For the above example: /user/foo/myoutput/_temporary/part_00000
> This change breaks applications that use the getOutputPath() to compute other directories.
> IMO, this has always being broken, Hadoop should not change the values of properties injected by the client, instead it should use private properties or internal helper methods. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3041) Within a task, the value ofJobConf.getOutputPath() method is modified

Posted by "Runping Qi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12583798#action_12583798 ] 

Runping Qi commented on HADOOP-3041:
------------------------------------


I think it is better to removing the getOutputPath()  from the api and replace it with something else.
That way, the application can detect the problem at compile time.
Otherwise, the apps will misbehave without any warnings.



> Within a task, the value ofJobConf.getOutputPath() method is modified
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-3041
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3041
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.16.1
>         Environment: all
>            Reporter: Alejandro Abdelnur
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.17.0
>
>         Attachments: patch-3041-0.16.2.txt, patch-3041.txt, patch-3041.txt, patch-3041.txt, patch-3041.txt
>
>
> Until 0.16.0 the value of the getOutputPath() method, if queried within a task, pointed to the part file assigned to the task. 
> For example: /user/foo/myoutput/part_00000
> In 0.16.1, now it returns an internal hadoop for the task output temporary location.
> For the above example: /user/foo/myoutput/_temporary/part_00000
> This change breaks applications that use the getOutputPath() to compute other directories.
> IMO, this has always being broken, Hadoop should not change the values of properties injected by the client, instead it should use private properties or internal helper methods. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3041) Within a task, the value ofJobConf.getOutputPath() method is modified

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12581342#action_12581342 ] 

Hudson commented on HADOOP-3041:
--------------------------------

Integrated in Hadoop-trunk #436 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/436/])

> Within a task, the value ofJobConf.getOutputPath() method is modified
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-3041
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3041
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.16.1
>         Environment: all
>            Reporter: Alejandro Abdelnur
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.16.2
>
>         Attachments: patch-3041-0.16.2.txt, patch-3041.txt
>
>
> Until 0.16.0 the value of the getOutputPath() method, if queried within a task, pointed to the part file assigned to the task. 
> For example: /user/foo/myoutput/part_00000
> In 0.16.1, now it returns an internal hadoop for the task output temporary location.
> For the above example: /user/foo/myoutput/_temporary/part_00000
> This change breaks applications that use the getOutputPath() to compute other directories.
> IMO, this has always being broken, Hadoop should not change the values of properties injected by the client, instead it should use private properties or internal helper methods. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3041) Within a task, the value ofJobConf.getOutputPath() method is modified

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12585335#action_12585335 ] 

Hadoop QA commented on HADOOP-3041:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
http://issues.apache.org/jira/secure/attachment/12379258/patch-3041.txt
against trunk revision 643282.

    @author +1.  The patch does not contain any @author tags.

    tests included +1.  The patch appears to include 114 new or modified tests.

    patch -1.  The patch command could not apply the patch.

Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2146/console

This message is automatically generated.

> Within a task, the value ofJobConf.getOutputPath() method is modified
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-3041
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3041
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.16.1
>         Environment: all
>            Reporter: Alejandro Abdelnur
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.17.0
>
>         Attachments: patch-3041-0.16.2.txt, patch-3041.txt, patch-3041.txt, patch-3041.txt, patch-3041.txt, patch-3041.txt
>
>
> Until 0.16.0 the value of the getOutputPath() method, if queried within a task, pointed to the part file assigned to the task. 
> For example: /user/foo/myoutput/part_00000
> In 0.16.1, now it returns an internal hadoop for the task output temporary location.
> For the above example: /user/foo/myoutput/_temporary/part_00000
> This change breaks applications that use the getOutputPath() to compute other directories.
> IMO, this has always being broken, Hadoop should not change the values of properties injected by the client, instead it should use private properties or internal helper methods. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3041) Within a task, the value ofJobConf.getOutputPath() method is modified

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12579938#action_12579938 ] 

Devaraj Das commented on HADOOP-3041:
-------------------------------------

bq. But, using the rule of least surprise, wouldn't make more sense to have a getTaskOutputPath() that returns the path to the part file for the current task and leave the getOutputPath() with the user entered value?

Possibly. One thing that is of concern here is that apps potentially have been written using the getOutputPath API (that creates side files within it).. Also, if the user really intends to create a side file in the output directory of the job, it is slightly unintuitive IMO to have the user invoke getTaskOutputPath. But yes I agree that getOutputPath returning the task's output path is unintuitive as well. I wish this was clearer. I am unhappy about it too..

bq. Also the javadoc should not say 'Get the Path to the output directory for the map-reduce job' in its one line description then.
Hmm.. 



> Within a task, the value ofJobConf.getOutputPath() method is modified
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-3041
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3041
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.16.1
>         Environment: all
>            Reporter: Alejandro Abdelnur
>            Priority: Blocker
>             Fix For: 0.16.2
>
>
> Until 0.16.0 the value of the getOutputPath() method, if queried within a task, pointed to the part file assigned to the task. 
> For example: /user/foo/myoutput/part_00000
> In 0.16.1, now it returns an internal hadoop for the task output temporary location.
> For the above example: /user/foo/myoutput/_temporary/part_00000
> This change breaks applications that use the getOutputPath() to compute other directories.
> IMO, this has always being broken, Hadoop should not change the values of properties injected by the client, instead it should use private properties or internal helper methods. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-3041) Within a task, the value ofJobConf.getOutputPath() method is modified

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-3041:
--------------------------------------------

    Status: Open  (was: Patch Available)

> Within a task, the value ofJobConf.getOutputPath() method is modified
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-3041
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3041
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.16.1
>         Environment: all
>            Reporter: Alejandro Abdelnur
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.17.0
>
>         Attachments: patch-3041-0.16.2.txt, patch-3041.txt, patch-3041.txt, patch-3041.txt
>
>
> Until 0.16.0 the value of the getOutputPath() method, if queried within a task, pointed to the part file assigned to the task. 
> For example: /user/foo/myoutput/part_00000
> In 0.16.1, now it returns an internal hadoop for the task output temporary location.
> For the above example: /user/foo/myoutput/_temporary/part_00000
> This change breaks applications that use the getOutputPath() to compute other directories.
> IMO, this has always being broken, Hadoop should not change the values of properties injected by the client, instead it should use private properties or internal helper methods. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-3041) Within a task, the value ofJobConf.getOutputPath() method is modified

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-3041:
--------------------------------------------

    Attachment: patch-3041.txt

Patch has a documentation change from the earlier.

> Within a task, the value ofJobConf.getOutputPath() method is modified
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-3041
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3041
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.16.1
>         Environment: all
>            Reporter: Alejandro Abdelnur
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.17.0
>
>         Attachments: patch-3041-0.16.2.txt, patch-3041.txt, patch-3041.txt, patch-3041.txt
>
>
> Until 0.16.0 the value of the getOutputPath() method, if queried within a task, pointed to the part file assigned to the task. 
> For example: /user/foo/myoutput/part_00000
> In 0.16.1, now it returns an internal hadoop for the task output temporary location.
> For the above example: /user/foo/myoutput/_temporary/part_00000
> This change breaks applications that use the getOutputPath() to compute other directories.
> IMO, this has always being broken, Hadoop should not change the values of properties injected by the client, instead it should use private properties or internal helper methods. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-3041) Within a task, the value ofJobConf.getOutputPath() method is modified

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-3041:
--------------------------------------------

    Status: Patch Available  (was: Open)

> Within a task, the value ofJobConf.getOutputPath() method is modified
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-3041
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3041
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.16.1
>         Environment: all
>            Reporter: Alejandro Abdelnur
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.17.0
>
>         Attachments: patch-3041-0.16.2.txt, patch-3041.txt, patch-3041.txt, patch-3041.txt, patch-3041.txt
>
>
> Until 0.16.0 the value of the getOutputPath() method, if queried within a task, pointed to the part file assigned to the task. 
> For example: /user/foo/myoutput/part_00000
> In 0.16.1, now it returns an internal hadoop for the task output temporary location.
> For the above example: /user/foo/myoutput/_temporary/part_00000
> This change breaks applications that use the getOutputPath() to compute other directories.
> IMO, this has always being broken, Hadoop should not change the values of properties injected by the client, instead it should use private properties or internal helper methods. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-3041) Within a task, the value ofJobConf.getOutputPath() method is modified

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das updated HADOOP-3041:
--------------------------------

    Fix Version/s:     (was: 0.16.2)
                   0.17.0

> Within a task, the value ofJobConf.getOutputPath() method is modified
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-3041
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3041
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.16.1
>         Environment: all
>            Reporter: Alejandro Abdelnur
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.17.0
>
>         Attachments: patch-3041-0.16.2.txt, patch-3041.txt
>
>
> Until 0.16.0 the value of the getOutputPath() method, if queried within a task, pointed to the part file assigned to the task. 
> For example: /user/foo/myoutput/part_00000
> In 0.16.1, now it returns an internal hadoop for the task output temporary location.
> For the above example: /user/foo/myoutput/_temporary/part_00000
> This change breaks applications that use the getOutputPath() to compute other directories.
> IMO, this has always being broken, Hadoop should not change the values of properties injected by the client, instead it should use private properties or internal helper methods. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-3041) Within a task, the value ofJobConf.getOutputPath() method is modified

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-3041:
--------------------------------------------

    Status: Open  (was: Patch Available)

> Within a task, the value ofJobConf.getOutputPath() method is modified
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-3041
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3041
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.16.1
>         Environment: all
>            Reporter: Alejandro Abdelnur
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.17.0
>
>         Attachments: patch-3041-0.16.2.txt, patch-3041.txt, patch-3041.txt, patch-3041.txt, patch-3041.txt, patch-3041.txt, patch-3041.txt
>
>
> Until 0.16.0 the value of the getOutputPath() method, if queried within a task, pointed to the part file assigned to the task. 
> For example: /user/foo/myoutput/part_00000
> In 0.16.1, now it returns an internal hadoop for the task output temporary location.
> For the above example: /user/foo/myoutput/_temporary/part_00000
> This change breaks applications that use the getOutputPath() to compute other directories.
> IMO, this has always being broken, Hadoop should not change the values of properties injected by the client, instead it should use private properties or internal helper methods. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3041) Within a task, the value ofJobConf.getOutputPath() method is modified

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12582120#action_12582120 ] 

Amareshwari Sriramadasu commented on HADOOP-3041:
-------------------------------------------------

bq.  so for 0.17, I propose:
    * moving getOutputPath to be the final one
    * rename getCurrentOutputPath to getWorkOutputDirectory 


Moving the getOutputPath to be the final one makes existing applications incompatible. ex. All the OutputFormat written by users has to use getWorkOutputDirectory instead of getOutputPath.

> Within a task, the value ofJobConf.getOutputPath() method is modified
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-3041
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3041
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.16.1
>         Environment: all
>            Reporter: Alejandro Abdelnur
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.17.0
>
>         Attachments: patch-3041-0.16.2.txt, patch-3041.txt
>
>
> Until 0.16.0 the value of the getOutputPath() method, if queried within a task, pointed to the part file assigned to the task. 
> For example: /user/foo/myoutput/part_00000
> In 0.16.1, now it returns an internal hadoop for the task output temporary location.
> For the above example: /user/foo/myoutput/_temporary/part_00000
> This change breaks applications that use the getOutputPath() to compute other directories.
> IMO, this has always being broken, Hadoop should not change the values of properties injected by the client, instead it should use private properties or internal helper methods. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-3041) Within a task, the value ofJobConf.getOutputPath() method is modified

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-3041:
--------------------------------------------

    Attachment: patch-3041.txt

Here is a patch adding 
1. getWorkOutputDirectory() returning task's temporary output directory which will be set by the framework.
2. changing getOutputPath to hold the value specified by setOutputPath always.

> Within a task, the value ofJobConf.getOutputPath() method is modified
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-3041
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3041
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.16.1
>         Environment: all
>            Reporter: Alejandro Abdelnur
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.17.0
>
>         Attachments: patch-3041-0.16.2.txt, patch-3041.txt, patch-3041.txt
>
>
> Until 0.16.0 the value of the getOutputPath() method, if queried within a task, pointed to the part file assigned to the task. 
> For example: /user/foo/myoutput/part_00000
> In 0.16.1, now it returns an internal hadoop for the task output temporary location.
> For the above example: /user/foo/myoutput/_temporary/part_00000
> This change breaks applications that use the getOutputPath() to compute other directories.
> IMO, this has always being broken, Hadoop should not change the values of properties injected by the client, instead it should use private properties or internal helper methods. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.