Posted to dev@pig.apache.org by Bhooshan Mogal <bh...@gmail.com> on 2013/03/29 02:10:22 UTC

Why does Pig not use default resources from the Configuration object?

Hi Folks,

I implemented the Hadoop FileSystem abstract class for a storage system
at work. The implementation uses config files that are similar in
structure to the Hadoop config files: a *-default.xml holding the default
properties and a *-site.xml in which users can override them. In the class
that implements FileSystem, I added these files as default resources in a
static block using Configuration.addDefaultResource("my-default.xml") and
Configuration.addDefaultResource("my-site.xml"). This worked fine: we could
run the Hadoop FileSystem CLI and map-reduce jobs against our storage
system. However, when we tried using this storage system in Pig scripts, we
saw errors indicating that our configuration parameters were not available.
On further debugging, we found that the config files had been registered
with the Configuration class, but only as part of defaultResources. In
Main.java in the Pig source, the Configuration object is created as
Configuration conf = new Configuration(false);, which sets loadDefaults to
false on the conf object. As a result, properties from the default
resources (including my config files) were not loaded and were therefore
unavailable.

We solved the problem by using Configuration.addResource instead of
Configuration.addDefaultResource, but we still could not figure out why Pig
does not use default resources.
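
A toy model may make the difference between the two calls easier to see.
The sketch below is not Hadoop's Configuration class: MiniConfiguration and
the my.storage.endpoint key are hypothetical names, and plain maps stand in
for the XML files that Hadoop would load from the classpath. It only mirrors
the behaviour described above, namely that default resources are shared
across instances but ignored when loadDefaults is false, while resources
added with addResource belong to one instance and are always read.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy stand-in for org.apache.hadoop.conf.Configuration (assumption: a
// simplified model for illustration, not the real implementation).
class MiniConfiguration {
    // Shared by every instance, like Hadoop's static list of default resources.
    private static final List<Map<String, String>> DEFAULT_RESOURCES = new ArrayList<>();

    private final boolean loadDefaults;
    // Resources attached to this instance only.
    private final List<Map<String, String>> resources = new ArrayList<>();

    MiniConfiguration(boolean loadDefaults) {
        this.loadDefaults = loadDefaults;
    }

    // Static: registers a resource for all instances that load defaults.
    static void addDefaultResource(Map<String, String> resource) {
        DEFAULT_RESOURCES.add(resource);
    }

    // Per instance: read regardless of the loadDefaults flag.
    void addResource(Map<String, String> resource) {
        resources.add(resource);
    }

    String get(String key) {
        Map<String, String> merged = new HashMap<>();
        if (loadDefaults) {
            for (Map<String, String> r : DEFAULT_RESOURCES) {
                merged.putAll(r);
            }
        }
        // Instance resources are applied last, so they win over defaults.
        for (Map<String, String> r : resources) {
            merged.putAll(r);
        }
        return merged.get(key);
    }
}

class MiniConfigurationDemo {
    public static void main(String[] args) {
        Map<String, String> myDefaults =
            Collections.singletonMap("my.storage.endpoint", "from my-default.xml");
        MiniConfiguration.addDefaultResource(myDefaults);

        MiniConfiguration hadoopStyle = new MiniConfiguration(true);  // like new Configuration()
        MiniConfiguration pigStyle = new MiniConfiguration(false);    // like new Configuration(false)

        System.out.println(hadoopStyle.get("my.storage.endpoint")); // from my-default.xml
        System.out.println(pigStyle.get("my.storage.endpoint"));    // null

        // The workaround: attach the resource to the instance itself.
        pigStyle.addResource(myDefaults);
        System.out.println(pigStyle.get("my.storage.endpoint"));    // from my-default.xml
    }
}
```

In the model, as in the report above, the instance built with loadDefaults
set to false only sees the property once the resource is added to that
instance directly.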

Could someone on the list explain why this is the case?

Thanks,
-- 
Bhooshan

Re: Why does Pig not use default resources from the Configuration object?

Posted by Prashant Kommireddi <pr...@gmail.com>.
Hi Bhooshan,

There is a patch that addresses what you need, and is part of 0.12
(unreleased). Take a look and see if you can apply the patch to the version
you are using.
https://issues.apache.org/jira/browse/PIG-3135

With this patch, the following property will allow you to override the
default and pass in your own configuration.
pig.use.overriden.hadoop.configs=true

