Posted to issues@spark.apache.org by "Chen Xi (JIRA)" <ji...@apache.org> on 2018/11/01 08:06:00 UTC

[jira] [Commented] (SPARK-20608) Standby namenodes should be allowed to be included in yarn.spark.access.namenodes to support HDFS HA

    [ https://issues.apache.org/jira/browse/SPARK-20608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16671238#comment-16671238 ] 

Chen Xi commented on SPARK-20608:
---------------------------------

Hi, everyone!

I read through your discussion and tried to use an HDFS HA namespace in the spark.yarn.access.namenodes configuration to submit a Spark example. I ran into this exception while submitting the application to YARN:

{code:java}
2018-11-01 15:46:35 INFO Client:54 - Submitting application application_1534931995442_1739691 to ResourceManager
2018-11-01 15:46:36 INFO YarnClientImpl:251 - Submitted application application_1534931995442_1739691
2018-11-01 15:46:38 INFO Client:54 - Application report for application_1534931995442_1739691 (state: FAILED)
2018-11-01 15:46:38 INFO Client:54 - 
client token: N/A
diagnostics: Failed to renew token: Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:hadoop-foo-kylin, Ident: (token for bob_dp: HDFS_DELEGATION_TOKEN owner=bob_dp@HADOOP.COM, renewer=yarn, realUser=, issueDate=1541058391247, maxDate=1541663191247, sequenceNumber=6335, masterKeyId=68)
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: root.bob
start time: 1541058395976
final status: FAILED
tracking URL: http://xxxxxx:8808/proxy/application_1534931995442_1739691/
user: bob_dp
{code}
Here is my submit script; I tried both spark-2.1.1 and spark-2.3.2:

{code:java}
export HADOOP_CONF_DIR=/home/bob_dp/hadoop_conf && ./bin/spark-submit --class org.apache.spark.examples.JavaWordCount --conf spark.master=yarn --conf spark.submit.deployMode=cluster  --conf spark.shuffle.service.enabled=false --conf spark.yarn.access.namenodes=hdfs://hadoop-foo-kylin  examples/jars/spark-examples_2.11-2.3.2.jar hdfs://hadoop-foo-kylin/tmp/logs/README.md
{code}
The hdfs-site.xml in my hadoop_conf directory contains the HA configuration for the remote namenodes.
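For reference, my client-side HA configuration looks roughly like this (hostnames anonymized to nn1.example.com/nn2.example.com; the logical names nn1/nn2 and the RPC port are just what I use locally):

{code:xml}
<configuration>
  <!-- Logical nameservice ID used in hdfs:// URIs -->
  <property>
    <name>dfs.nameservices</name>
    <value>hadoop-foo-kylin</value>
  </property>
  <!-- The two namenodes behind the nameservice -->
  <property>
    <name>dfs.ha.namenodes.hadoop-foo-kylin</name>
    <value>nn1,nn2</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.hadoop-foo-kylin.nn1</name>
    <value>nn1.example.com:8020</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.hadoop-foo-kylin.nn2</name>
    <value>nn2.example.com:8020</value>
  </property>
  <!-- Lets the HDFS client fail over between nn1 and nn2 automatically -->
  <property>
    <name>dfs.client.failover.proxy.provider.hadoop-foo-kylin</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
</configuration>
{code}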

However, when I set spark.yarn.access.namenodes to list both the active and standby namenodes explicitly, the job ran successfully in spark-2.1.1 but failed in spark-2.3.2 with a StandbyException.

In short, whenever spark.yarn.access.namenodes points to an HA namespace, the job fails. Is this a bug in Spark? [~vanzin] [~charliechen] [~stevel@apache.org] [~liuml07]
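As a possible workaround, I am experimenting with resolving the active namenode before submitting, so that spark.yarn.access.namenodes can point at only the active one. This is just a sketch using each namenode's JMX NameNodeStatus bean over the web UI port; the hostnames are placeholders and 50070 is the Hadoop 2.x default HTTP port:

{code:python}
import json
import urllib.request


def parse_nn_state(jmx_json: str) -> str:
    """Extract the HA state ("active" or "standby") from the JSON returned by
    /jmx?qry=Hadoop:service=NameNode,name=NameNodeStatus."""
    beans = json.loads(jmx_json)["beans"]
    return beans[0]["State"]


def find_active_namenode(hosts, port=50070, timeout=5):
    """Probe each candidate namenode host and return the first reporting "active"."""
    for host in hosts:
        url = ("http://%s:%d/jmx"
               "?qry=Hadoop:service=NameNode,name=NameNodeStatus" % (host, port))
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                if parse_nn_state(resp.read().decode()) == "active":
                    return host
        except OSError:
            continue  # namenode unreachable: try the next candidate
    raise RuntimeError("no active namenode found")
{code}

With something like this, the submit script could set spark.yarn.access.namenodes to hdfs://<active-host>:8020 directly, at the cost of not surviving a failover that happens after submission.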


> Standby namenodes should be allowed to be included in yarn.spark.access.namenodes to support HDFS HA
> ----------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-20608
>                 URL: https://issues.apache.org/jira/browse/SPARK-20608
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Submit, YARN
>    Affects Versions: 2.0.1, 2.1.0
>            Reporter: Yuechen Chen
>            Priority: Minor
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> If a Spark application needs to access remote namenodes, yarn.spark.access.namenodes only needs to be configured in the spark-submit script, and the Spark client (on YARN) fetches HDFS credentials periodically.
> If a Hadoop cluster is configured for HA, there is one active namenode and at least one standby namenode.
> However, if yarn.spark.access.namenodes includes both active and standby namenodes, the Spark application fails because the standby namenode cannot be accessed by Spark, which raises org.apache.hadoop.ipc.StandbyException.
> I think configuring standby namenodes in yarn.spark.access.namenodes would cause no harm, and it would let my Spark application survive a namenode failover.
> HA Examples:
> Spark-submit script: yarn.spark.access.namenodes=hdfs://namenode01,hdfs://namenode02
> Spark Application Codes:
> dataframe.write.parquet(getActiveNameNode(...) + hdfsPath)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org