Posted to dev@pig.apache.org by "Evert Lammerts (JIRA)" <ji...@apache.org> on 2012/08/13 12:10:37 UTC

[jira] [Created] (PIG-2872) StoreFuncInterface.setStoreLocation gets a copy of a Configuration object

Evert Lammerts created PIG-2872:
-----------------------------------

             Summary: StoreFuncInterface.setStoreLocation gets a copy of a Configuration object
                 Key: PIG-2872
                 URL: https://issues.apache.org/jira/browse/PIG-2872
             Project: Pig
          Issue Type: Bug
          Components: impl
    Affects Versions: 0.11
         Environment: Pig trunk, Hadoop 0.20.205 with Kerberos, ElasticSearch trunk, Wonderdog trunk
            Reporter: Evert Lammerts


When an implementation of StoreFuncInterface.setStoreLocation is called from JobControlCompiler.getJob, it is passed a copy of the Configuration that will be used for the Job that will be submitted:

{code:title=JobControlCompiler.java}
sFunc.setStoreLocation(st.getSFile().getFileName(), new org.apache.hadoop.mapreduce.Job(nwJob.getConfiguration()));
{code}

As far as I know, constructing a new org.apache.hadoop.mapreduce.Job copies the Configuration object it is given. Anything that an implementation of setStoreLocation adds to that Configuration is therefore lost: it never reaches the Configuration of nwJob in JobControlCompiler.getJob.

I noticed this in Wonderdog, which needs to include the Elasticsearch configuration file in the DistributedCache. The file is added to mapred.cache.files in setStoreLocation, but the setting never makes it back into the Job returned by JobControlCompiler.getJob, so the file is never localized.

This might be intentional semantics within Pig, but I'm not familiar enough with StoreFuncs to know whether it is.
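
The copy behaviour described above can be sketched without Hadoop or Pig on the classpath. The classes below (Conf, Job, ConfCopyDemo) are hypothetical stand-ins, not Hadoop's or Pig's own; they only assume, as the report does, that constructing a Job copies the configuration it is handed. The cache file path is likewise made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins, NOT Hadoop's classes: they only mimic the
// copy-on-construction behaviour that the report assumes.
public class ConfCopyDemo {

    /** Minimal key/value configuration with a copy constructor. */
    static class Conf {
        private final Map<String, String> props = new HashMap<>();
        Conf() {}
        Conf(Conf other) { props.putAll(other.props); }
        void set(String key, String value) { props.put(key, value); }
        String get(String key) { return props.get(key); }
    }

    /** Stand-in for a Job whose constructor copies the Conf it is given. */
    static class Job {
        private final Conf conf;
        Job(Conf conf) { this.conf = new Conf(conf); }
        Conf getConfiguration() { return conf; }
    }

    /** Stand-in for a store function that registers a cache file (made-up path). */
    static void setStoreLocation(String location, Job job) {
        job.getConfiguration().set("mapred.cache.files", "/path/to/elasticsearch.yml");
    }

    /** Pattern from the report: the store function sees a Job wrapping a copy. */
    static String viaCopy() {
        Job nwJob = new Job(new Conf());
        setStoreLocation("out", new Job(nwJob.getConfiguration()));
        return nwJob.getConfiguration().get("mapred.cache.files");
    }

    /** For contrast only: the store function sees the Job that is submitted. */
    static String viaSameJob() {
        Job nwJob = new Job(new Conf());
        setStoreLocation("out", nwJob);
        return nwJob.getConfiguration().get("mapred.cache.files");
    }

    public static void main(String[] args) {
        System.out.println("copy:     " + viaCopy());     // prints "copy:     null"
        System.out.println("same job: " + viaSameJob());
    }
}
```

In this sketch the setting written through the copy is absent from nwJob, which matches the symptom described for mapred.cache.files. The viaSameJob variant is there only to show the contrast; it is not a proposed fix for Pig.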

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (PIG-2872) StoreFuncInterface.setStoreLocation gets a copy of a Configuration object

Posted by "Bill Graham (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433199#comment-13433199 ] 

Bill Graham commented on PIG-2872:
----------------------------------

I ran into issues with this as well. See PIG-2870 and the attached patch. Does Wonderdog work if you apply this patch?
                

        

[jira] [Commented] (PIG-2872) StoreFuncInterface.setStoreLocation gets a copy of a Configuration object

Posted by "Dmitriy V. Ryaboy (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13440910#comment-13440910 ] 

Dmitriy V. Ryaboy commented on PIG-2872:
----------------------------------------

Reverted PIG-2578. Is this still a problem, or does reverting that patch fix the issue?
                

        

[jira] [Commented] (PIG-2872) StoreFuncInterface.setStoreLocation gets a copy of a Configuration object

Posted by "Evert Lammerts (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433441#comment-13433441 ] 

Evert Lammerts commented on PIG-2872:
-------------------------------------

The patch gets rid of this exception, thanks! Will you ship it into 0.11?
                

        

[jira] [Commented] (PIG-2872) StoreFuncInterface.setStoreLocation gets a copy of a Configuration object

Posted by "Cheolsoo Park (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433459#comment-13433459 ] 

Cheolsoo Park commented on PIG-2872:
------------------------------------

In PIG-2821, it looks like Rohini is proposing to revert PIG-2578, which introduced this issue in the first place. Don't we need a coordinated fix for the regression from PIG-2578?

Thanks!
                

        

[jira] [Commented] (PIG-2872) StoreFuncInterface.setStoreLocation gets a copy of a Configuration object

Posted by "Bill Graham (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433713#comment-13433713 ] 

Bill Graham commented on PIG-2872:
----------------------------------

The patch in PIG-2870 works with PIG-2578, but I think we still need to think it through some more, since the patch basically forks the logic depending on whether it's a single-store or multi-store job. Let's continue the discussion at PIG-2870.

And yes, we will ship a fix to this problem one way or the other in Pig 0.11.
                