Posted to user@hbase.apache.org by Henning Blohm <he...@zfabrik.de> on 2016/04/18 11:57:48 UTC

Best way to pass configuration properties to MRv2 jobs

Hi,

in our Hadoop 2.6.0 cluster, we need to pass some properties to all 
Hadoop processes so they can be referenced using ${...} syntax in 
configuration files. This works reasonably well using 
HADOOP_NAMENODE_OPTS and the like.
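To make this concrete, the pattern looks roughly like the following sketch (the property name my.data.root and the path are made-up examples, not Hadoop defaults):

```shell
# hadoop-env.sh: pass a JVM system property to the NameNode process
export HADOOP_NAMENODE_OPTS="-Dmy.data.root=/data/hadoop ${HADOOP_NAMENODE_OPTS}"
```

A *-site.xml file can then refer to it, e.g. <value>${my.data.root}/dfs/name</value>, since Hadoop's Configuration consults Java system properties when expanding ${...}.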

For Map/Reduce jobs, however, it is not enough to specify

mapred.child.java.opts

to pass system properties; in addition we need to set

yarn.app.mapreduce.am.command-opts

for anything that is referenced in Hadoop configuration files.
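So a job submission ends up carrying the same -D flag twice, once for the task JVMs and once for the MR application master. Roughly (assuming a Tool-based job so that GenericOptionsParser picks up the -D options; my-job.jar and my.data.root are placeholders):

```shell
# Sketch: the same system property has to reach task JVMs AND the MR AM
hadoop jar my-job.jar MyJob \
  -Dmapred.child.java.opts="-Dmy.data.root=/data" \
  -Dyarn.app.mapreduce.am.command-opts="-Dmy.data.root=/data"
```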

In the end, however, almost all of the properties we pass are available as 
environment variables as well.

Hence my question:

* Is it possible to reference environment variables in configuration 
files directly?
* Does anybody know of a simpler way to make sure some system properties 
are _always_ set for all YARN processes?
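For context on why the -D route works at all: as far as I can tell, Hadoop's Configuration expands ${...} against Java system properties before falling back to other configuration keys. The following is a minimal self-contained illustration of that lookup order, not Hadoop's actual code:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class VarExpansion {
    private static final Pattern VAR = Pattern.compile("\\$\\{([^}]+)}");

    // Expand ${name} using system properties first, then the config map,
    // mirroring (in spirit) what Hadoop's Configuration.get() does.
    static String expand(String value, Map<String, String> conf) {
        Matcher m = VAR.matcher(value);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            String name = m.group(1);
            String repl = System.getProperty(name);      // -D wins
            if (repl == null) repl = conf.get(name);     // then config keys
            if (repl == null) repl = m.group(0);         // else leave as-is
            m.appendReplacement(sb, Matcher.quoteReplacement(repl));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.setProperty("my.data.root", "/data");
        Map<String, String> conf = new HashMap<>();
        conf.put("dfs.name.dir", "${my.data.root}/dfs/name");
        System.out.println(expand(conf.get("dfs.name.dir"), conf));
    }
}
```

This is exactly why the property has to be on the command line of every JVM that reads the configuration file, hence the duplication across task JVMs and the AM.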

Thanks,
Henning

Re: Best way to pass configuration properties to MRv2 jobs

Posted by Henning Blohm <he...@zfabrik.de>.
How true!! ;-)

Thanks,
Henning

On 18.04.2016 19:53, Dima Spivak wrote:
> Probably better off asking on the Hadoop user mailing list (
> user@hadoop.apache.org) than the HBase one… :)
>
> -Dima


Re: Best way to pass configuration properties to MRv2 jobs

Posted by Dima Spivak <ds...@cloudera.com>.
Probably better off asking on the Hadoop user mailing list (
user@hadoop.apache.org) than the HBase one… :)

-Dima


Restore cluster from disk snapshot

Posted by Henning Blohm <he...@zfabrik.de>.
On a 10-node cluster running Hadoop 2.6 and HBase 1.0, we would like to 
be able to restore the whole cluster from disk snapshot backups.

Given that the snapshots of the data nodes and the name node may not be 
taken at the exact same point in time, is there a safe procedure that at 
least leaves the cluster consistent (except perhaps for some late edits)?

Thanks,
Henning