Posted to issues@spark.apache.org by "feiwang (JIRA)" <ji...@apache.org> on 2019/04/19 02:30:00 UTC

[jira] [Updated] (SPARK-27515) [Deploy] When application master retry after a long time running, the hdfs delegation token may be expired

     [ https://issues.apache.org/jira/browse/SPARK-27515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

feiwang updated SPARK-27515:
----------------------------
    Description: 
When a Spark application is submitted to YARN, we first create a container launch context and store the relevant tokens in it.
Each applicationMaster attempt is then started with those original tokens.
However, the original HDFS delegation tokens are passed along as well.
For a Spark Streaming application, the applicationMaster may fail after it has been running for a long time.
By then, the HDFS token stored in the container launch context may have expired.
When the new applicationMaster attempt runs prepareLocalResources, it accesses HDFS and fails because the token has expired.
This error occurred while we were performing a rolling upgrade of our cluster.
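
The failure mode described above can be illustrated with a small, self-contained simulation. This is only a sketch, not Spark or Hadoop code: the DelegationToken and LaunchContext classes and the start_attempt function are hypothetical stand-ins, and the 7-day limit assumes the HDFS default for dfs.namenode.delegation.token.max-lifetime. It shows a token captured once at submit time working for the first attempt and failing for a retry that starts 10 days later.

```python
from dataclasses import dataclass

DAY_MS = 24 * 60 * 60 * 1000
# Assumption: HDFS default for dfs.namenode.delegation.token.max-lifetime (7 days).
MAX_LIFETIME_MS = 7 * DAY_MS


@dataclass(frozen=True)
class DelegationToken:
    """Simplified stand-in for an HDFS delegation token (not the Hadoop class)."""
    issue_time_ms: int
    max_lifetime_ms: int = MAX_LIFETIME_MS

    def is_expired(self, now_ms: int) -> bool:
        # Past the max lifetime the token can no longer be renewed or used.
        return now_ms >= self.issue_time_ms + self.max_lifetime_ms


@dataclass(frozen=True)
class LaunchContext:
    """Hypothetical model of the launch context, built once at submit time."""
    hdfs_token: DelegationToken


def start_attempt(ctx: LaunchContext, now_ms: int) -> str:
    """Model one applicationMaster attempt that must read HDFS on startup."""
    if ctx.hdfs_token.is_expired(now_ms):
        return "FAILED: HDFS delegation token expired"
    return "OK"


submit_time = 0
ctx = LaunchContext(DelegationToken(issue_time_ms=submit_time))

# The first attempt starts right after submission; the retry reuses the
# same launch context (and therefore the same token) 10 days later.
first_attempt = start_attempt(ctx, now_ms=submit_time)
retry_attempt = start_attempt(ctx, now_ms=submit_time + 10 * DAY_MS)

print(first_attempt)
print(retry_attempt)
```

In this model the only way for the retry to succeed is for the launch context to carry a token issued more recently than the original submission, which is the gap this issue reports.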

  was:
When a Spark application is submitted to YARN, we first create a container launch context and store the relevant tokens in it.
Each applicationMaster attempt is then started with those original tokens so that it can connect to YARN.
However, the original HDFS delegation tokens are passed along as well.
For a Spark Streaming application, the applicationMaster may fail after it has been running for a long time.
By then, the HDFS token stored in the container launch context may have expired.
When the new applicationMaster attempt runs prepareLocalResources, it accesses HDFS and fails because the token has expired.
This error occurred while we were performing a rolling upgrade of our cluster.


> [Deploy] When application master retry after a long time running, the hdfs delegation token may be expired
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-27515
>                 URL: https://issues.apache.org/jira/browse/SPARK-27515
>             Project: Spark
>          Issue Type: Bug
>          Components: Deploy
>    Affects Versions: 2.3.2
>            Reporter: feiwang
>            Priority: Major
>
> When a Spark application is submitted to YARN, we first create a container launch context and store the relevant tokens in it.
> Each applicationMaster attempt is then started with those original tokens.
> However, the original HDFS delegation tokens are passed along as well.
> For a Spark Streaming application, the applicationMaster may fail after it has been running for a long time.
> By then, the HDFS token stored in the container launch context may have expired.
> When the new applicationMaster attempt runs prepareLocalResources, it accesses HDFS and fails because the token has expired.
> This error occurred while we were performing a rolling upgrade of our cluster.


