Posted to dev@pig.apache.org by "Cheolsoo Park (JIRA)" <ji...@apache.org> on 2015/02/04 18:48:35 UTC

[jira] [Updated] (PIG-4409) fs.defaultFS is overwritten in JobConf by replicated join at runtime

     [ https://issues.apache.org/jira/browse/PIG-4409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Cheolsoo Park updated PIG-4409:
-------------------------------
    Attachment: PIG-4409-1.patch

Uploading a patch that fixes the issue.

> fs.defaultFS is overwritten in JobConf by replicated join at runtime
> --------------------------------------------------------------------
>
>                 Key: PIG-4409
>                 URL: https://issues.apache.org/jira/browse/PIG-4409
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.14.0
>            Reporter: Cheolsoo Park
>            Assignee: Cheolsoo Park
>            Priority: Critical
>             Fix For: 0.15.0
>
>         Attachments: PIG-4409-1.patch
>
>
> This is a regression of PIG-4257.
> Pig accidentally overwrites {{fs.defaultFS}} in JobConf during a replicated join at runtime. This can cause various side effects because UDFs and store/load funcs might depend on the value of {{fs.defaultFS}} at runtime.
> Here is an example. I have a store func that does a 2-phase commit to S3. Each reducer writes its output to local disk first and copies it to the final destination on S3 during the task commit phase. Once it is done copying, the reducer writes a commit log to an HDFS location. During the job commit phase, the AM reads all the commit logs and updates the Hive metastore accordingly.
> This store func stopped working in 0.14 when there is a replicated join in the reduce phase. This is because {{fs.defaultFS}} is overwritten from HDFS to the local FS by the replicated join at runtime.
> The root cause is that PIG-4257 changed {{ConfigurationUtil.getLocalFSProperties()}} to return a reference to the JobConf instead of a copy.
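
The difference between returning a reference and returning a copy can be sketched as follows. This is not Pig code: it uses a plain java.util.Properties object as a stand-in for JobConf, and the class name, both method names, and the hdfs://namenode:8020 address are hypothetical, chosen only for illustration.

```java
import java.util.Properties;

// Illustrative sketch only; java.util.Properties stands in for JobConf.
class LocalFsPropertiesSketch {
    static final String FS_DEFAULT = "fs.defaultFS";

    // Aliasing variant (hypothetical): the returned object IS the job
    // configuration, so pointing it at the local FS changes the job too.
    public static Properties localFsPropertiesByReference(Properties jobConf) {
        jobConf.setProperty(FS_DEFAULT, "file:///");
        return jobConf;
    }

    // Copying variant (hypothetical): entries are copied into a new object
    // first, so the caller's configuration is left untouched.
    public static Properties localFsPropertiesByCopy(Properties jobConf) {
        Properties copy = new Properties();
        copy.putAll(jobConf);
        copy.setProperty(FS_DEFAULT, "file:///");
        return copy;
    }

    public static void main(String[] args) {
        Properties jobConf = new Properties();
        jobConf.setProperty(FS_DEFAULT, "hdfs://namenode:8020");

        // With a copy, the job configuration still points at HDFS.
        localFsPropertiesByCopy(jobConf);
        System.out.println("after copy:      " + jobConf.getProperty(FS_DEFAULT));

        // With a shared reference, the job configuration is silently changed.
        localFsPropertiesByReference(jobConf);
        System.out.println("after reference: " + jobConf.getProperty(FS_DEFAULT));
    }
}
```

In the aliasing variant, any code that later reads fs.defaultFS from the job configuration sees the local file system, which matches the symptom described for the store func above.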



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)