Posted to user@hadoop.apache.org by Pushparaj Motamari <pu...@gmail.com> on 2016/10/11 18:20:53 UTC

Connecting Hadoop HA cluster via java client

Hi,

I have two questions pertaining to accessing the Hadoop HA cluster from a
Java client.

1. Is it necessary to supply

conf.set("dfs.ha.automatic-failover.enabled",true);

and

conf.set("ha.zookeeper.quorum","zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181");

in addition to the other properties set in the code below?

private Configuration initHAConf(URI journalURI, Configuration conf) {
  conf.set(DFSConfigKeys.DFS_NAMENODE_SHARED_EDITS_DIR_KEY,
      journalURI.toString());

  String address1 = "127.0.0.1:" + NN1_IPC_PORT;
  String address2 = "127.0.0.1:" + NN2_IPC_PORT;
  conf.set(DFSUtil.addKeySuffixes(DFS_NAMENODE_RPC_ADDRESS_KEY,
      NAMESERVICE, NN1), address1);
  conf.set(DFSUtil.addKeySuffixes(DFS_NAMENODE_RPC_ADDRESS_KEY,
      NAMESERVICE, NN2), address2);
  conf.set(DFSConfigKeys.DFS_NAMESERVICES, NAMESERVICE);
  conf.set(DFSUtil.addKeySuffixes(DFS_HA_NAMENODES_KEY_PREFIX, NAMESERVICE),
      NN1 + "," + NN2);
  conf.set(DFS_CLIENT_FAILOVER_PROXY_PROVIDER_KEY_PREFIX + "." + NAMESERVICE,
      ConfiguredFailoverProxyProvider.class.getName());
  conf.set("fs.defaultFS", "hdfs://" + NAMESERVICE);

  return conf;
}
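
For context, a minimal sketch of how a client might use the resulting
configuration, assuming initHAConf, journalURI and the usual
org.apache.hadoop.fs imports are available in the same class:

// Sketch: obtain an HA-aware FileSystem from the configuration built above.
private void listRoot(URI journalURI) throws IOException {
  Configuration conf = initHAConf(journalURI, new Configuration());
  try (FileSystem fs = FileSystem.get(conf)) {
    // fs.defaultFS is hdfs://<nameservice>, so the failover proxy provider
    // directs each call to whichever NameNode is currently active.
    for (FileStatus status : fs.listStatus(new Path("/"))) {
      System.out.println(status.getPath());
    }
  }
}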

2. If we supply the ZooKeeper configuration details mentioned in question 1,
is it still necessary to set the active and standby NameNode addresses as in
the code above? Since we have given the ZooKeeper connection details, the
client should be able to figure out the active NameNode's connection details
on its own.


Regards

Pushparaj

Re: Connecting Hadoop HA cluster via java client

Posted by 권병창 <ma...@navercorp.com>.
Those configuration properties are needed when accessing the cluster via webhdfs://${nameservice}.
Try "hdfs dfs -ls webhdfs://${nameservice}/some/files".
 
 
Re: Connecting Hadoop HA cluster via java client

Posted by Rakesh Radhakrishnan <ra...@apache.org>.
Hi,

dfs.namenode.http-address is the fully-qualified HTTP address each NameNode
listens on. Similar to the rpc-address configuration, set this for both
NameNodes' HTTP servers (Web UI); you can then browse the status of the
Active/Standby NN in a web browser. HDFS also supports a secure HTTP server
address and port; use "dfs.namenode.https-address" for that.

For example, assuming the dfs.nameservices config item (the logical name for
your nameservice) is configured as "mycluster":

<property>
  <name>dfs.namenode.http-address.mycluster.nn1</name>
  <value>machine1.example.com:50070</value>
</property>
<property>
  <name>dfs.namenode.http-address.mycluster.nn2</name>
  <value>machine2.example.com:50070</value>
</property>
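
For a Java client in the style of the snippet in the original question, the
same properties can be set programmatically. A sketch, assuming the
"mycluster" nameservice above and the default ports (the https variant only
matters when the NameNodes serve HTTPS):

import org.apache.hadoop.conf.Configuration;

// Hypothetical helper mirroring the XML above for a programmatic client.
static Configuration withNamenodeHttpAddresses(Configuration conf) {
  conf.set("dfs.namenode.http-address.mycluster.nn1", "machine1.example.com:50070");
  conf.set("dfs.namenode.http-address.mycluster.nn2", "machine2.example.com:50070");
  // Optional: secure HTTP addresses (default port 50470), only used when
  // the cluster enables HTTPS for the NameNode web servers.
  conf.set("dfs.namenode.https-address.mycluster.nn1", "machine1.example.com:50470");
  conf.set("dfs.namenode.https-address.mycluster.nn2", "machine2.example.com:50470");
  return conf;
}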

Regards,
Rakesh

Re: Connecting Hadoop HA cluster via java client

Posted by Pushparaj Motamari <pu...@gmail.com>.
Hi,

The following are not required, I guess; I am able to connect to the cluster
without them. Is there any reason to include them?

dfs.namenode.http-address.${dfs.nameservices}.nn1

dfs.namenode.http-address.${dfs.nameservices}.nn2

Regards

Pushparaj



RE: Connecting Hadoop HA cluster via java client

Posted by 권병창 <ma...@navercorp.com>.
Hi.
 
1. The minimal configuration needed to connect to an HA NameNode is the set
of properties below; ZooKeeper information is not necessary on the client.
 
dfs.nameservices
dfs.ha.namenodes.${dfs.nameservices}
dfs.namenode.rpc-address.${dfs.nameservices}.nn1 
dfs.namenode.rpc-address.${dfs.nameservices}.nn2
dfs.namenode.http-address.${dfs.nameservices}.nn1 
dfs.namenode.http-address.${dfs.nameservices}.nn2
dfs.client.failover.proxy.provider.${dfs.nameservices}=org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider
 
 
2. The client selects the active NameNode by trying the configured NameNodes in a round-robin manner.
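
A minimal Java sketch of that property list, assuming a hypothetical
nameservice "mycluster" with NameNodes nn1/nn2 on machine1/machine2 and the
default ports (adjust names and ports to your cluster):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class HaClientExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    conf.set("fs.defaultFS", "hdfs://mycluster");
    conf.set("dfs.nameservices", "mycluster");
    conf.set("dfs.ha.namenodes.mycluster", "nn1,nn2");
    conf.set("dfs.namenode.rpc-address.mycluster.nn1", "machine1.example.com:8020");
    conf.set("dfs.namenode.rpc-address.mycluster.nn2", "machine2.example.com:8020");
    conf.set("dfs.namenode.http-address.mycluster.nn1", "machine1.example.com:50070");
    conf.set("dfs.namenode.http-address.mycluster.nn2", "machine2.example.com:50070");
    // The failover proxy provider is what lets the client switch between
    // nn1 and nn2; no ZooKeeper settings are needed on the client side.
    conf.set("dfs.client.failover.proxy.provider.mycluster",
        "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider");

    try (FileSystem fs = FileSystem.get(conf)) {
      System.out.println(fs.getUri()); // prints hdfs://mycluster
    }
  }
}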
 
 