Posted to dev@spark.apache.org by Dean Chen <de...@ocirs.com> on 2015/04/13 07:41:41 UTC

How is hive-site.xml loaded?

The docs state that:
Configuration of Hive is done by placing your `hive-site.xml` file in
`conf/`.

I've searched the codebase for hive-site.xml and didn't find any code that
loads it explicitly, so is there some magic that autoloads *.xml files from
conf/? I've skimmed through HiveContext
<https://github.com/apache/spark/blob/master/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveContext.scala>
and didn't see anything obvious there.

The reason I'm asking is that I'm working on a feature that needs
configuration from hbase-site.xml to be available in the Spark context, and
I'd prefer to follow the convention set by hive-site.xml.
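
Roughly what I'm after, as an untested sketch (the manual addResource call is
just for illustration; the question is where the hive-site.xml-style magic
for this would live):

    import org.apache.spark.{SparkConf, SparkContext}

    val sc = new SparkContext(
      new SparkConf().setAppName("hbase-site-sketch").setMaster("local[*]"))

    // Today this can be done by hand; the file name is resolved via the
    // classpath, the same way HiveConf finds hive-site.xml.
    sc.hadoopConfiguration.addResource("hbase-site.xml")

    // Prints null unless the file was actually found and defines the key.
    println(sc.hadoopConfiguration.get("hbase.zookeeper.quorum"))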

--
Dean Chen

Re: How is hive-site.xml loaded?

Posted by Steve Loughran <st...@hortonworks.com>.
There's some magic in the process that is worth knowing about and being cautious of.

The special HdfsConfiguration, YarnConfiguration and HiveConf classes all do work in their static initializers: they call Configuration.addDefaultResource.

This puts their -default and -site XML files onto the list of default configuration resources. Hadoop then runs through the Configuration instances it is tracking in a WeakHashMap and, for those created with loadDefaults=true in their constructor, tells them to reload all their "default" config props (preserving anything set explicitly).

This means you can use (or abuse) this feature to force properties onto all Hadoop Configuration instances that asked for the default values, though it doesn't guarantee the changes will be picked up.

It's generally considered best practice for apps to create an instance of the configuration classes whose -default and -site files they want picked up as early as possible, even if you then discard the instance itself. The goal is to get those settings registered before default values get read elsewhere.
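
In code, the pattern looks roughly like this (untested sketch; the
hbase-site.xml line just shows how you'd force your own file in the same way):

    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.hdfs.HdfsConfiguration

    // Touching the class runs its static initializer, which registers
    // hdfs-default.xml and hdfs-site.xml as default resources; the
    // instance itself can be discarded.
    new HdfsConfiguration()

    // You can register extra files the same way.
    Configuration.addDefaultResource("hbase-site.xml")

    // Any Configuration created with loadDefaults = true (the no-arg
    // constructor does this) now loads those files, provided they are
    // on the classpath.
    val conf = new Configuration()
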
-steve

Re: How is hive-site.xml loaded?

Posted by Raunak Jhawar <ra...@gmail.com>.
The most obvious path is /etc/hive/conf, but this can be changed to look up
any other path.

--
Thanks,
Raunak Jhawar

Re: How is hive-site.xml loaded?

Posted by Dean Chen <de...@ocirs.com>.
Ah ok, thanks!

--
Dean Chen

Re: How is hive-site.xml loaded?

Posted by Reynold Xin <rx...@databricks.com>.
It is loaded by Hive's HiveConf, which simply searches for hive-site.xml on
the classpath.
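
You can verify that with a quick check (sketch); if Spark's conf/ directory is
on the classpath, this prints the file's location:

    // Same classpath lookup HiveConf relies on; prints null if the
    // file isn't visible to the class loader.
    val url = Thread.currentThread().getContextClassLoader.getResource("hive-site.xml")
    println(url)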

