You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@spark.apache.org by "Marcelo Vanzin (JIRA)" <ji...@apache.org> on 2015/08/05 19:40:04 UTC

[jira] [Updated] (SPARK-9645) External shuffle service does not work with kerberos on

     [ https://issues.apache.org/jira/browse/SPARK-9645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marcelo Vanzin updated SPARK-9645:
----------------------------------
    Description: 
Lots of errors like this when running apps with the external shuffle service and kerberos enabled:

{noformat}
15/08/05 06:26:18 WARN TaskSetManager: Lost task 2.0 in stage 2.0 (TID 12, spark-nightly-2.vpc.cloudera.com): FetchFailed(BlockManagerId(2, spark-nightly-2.vpc.cloudera.com, 7337), shuffleId=0, mapId=0, reduceId=2, message=
org.apache.spark.shuffle.FetchFailedException: java.lang.RuntimeException: Failed to open file: /yarn/nm/usercache/systest/appcache/application_1438780049118_0008/blockmgr-7178b106-6902-4082-8792-1c3e34b80d15/38/shuffle_0_0_0.index
	at org.apache.spark.network.shuffle.ExternalShuffleBlockResolver.getSortBasedShuffleBlockData(ExternalShuffleBlockResolver.java:203)
	at org.apache.spark.network.shuffle.ExternalShuffleBlockResolver.getBlockData(ExternalShuffleBlockResolver.java:113)
	at org.apache.spark.network.shuffle.ExternalShuffleBlockHandler.handleMessage(ExternalShuffleBlockHandler.java:80)
	at org.apache.spark.network.shuffle.ExternalShuffleBlockHandler.receive(ExternalShuffleBlockHandler.java:68)
	at org.apache.spark.network.server.TransportRequestHandler.processRpcRequest(TransportRequestHandler.java:114)
{noformat}

This is caused by commit c4830598 (SPARK-6287), which modified the permissions of the directory storing the shuffle files.

  was:
Lots of errors like this when running apps with the external shuffle service and kerberos enabled:

{noformat}
15/08/05 06:26:18 WARN TaskSetManager: Lost task 2.0 in stage 2.0 (TID 12, spark-nightly-2.vpc.cloudera.com): FetchFailed(BlockManagerId(2, spark-nightly-2.vpc.cloudera.com, 7337), shuffleId=0, mapId=0, reduceId=2, message=
org.apache.spark.shuffle.FetchFailedException: java.lang.RuntimeException: Failed to open file: /yarn/nm/usercache/systest/appcache/application_1438780049118_0008/blockmgr-7178b106-6902-4082-8792-1c3e34b80d15/38/shuffle_0_0_0.index
	at org.apache.spark.network.shuffle.ExternalShuffleBlockResolver.getSortBasedShuffleBlockData(ExternalShuffleBlockResolver.java:203)
	at org.apache.spark.network.shuffle.ExternalShuffleBlockResolver.getBlockData(ExternalShuffleBlockResolver.java:113)
	at org.apache.spark.network.shuffle.ExternalShuffleBlockHandler.handleMessage(ExternalShuffleBlockHandler.java:80)
	at org.apache.spark.network.shuffle.ExternalShuffleBlockHandler.receive(ExternalShuffleBlockHandler.java:68)
	at org.apache.spark.network.server.TransportRequestHandler.processRpcRequest(TransportRequestHandler.java:114)
{noformat}

This is cause by commit c4830598 (SPARK-6287), which modified the permissions of the directory storing the shuffle files.


> External shuffle service does not work with kerberos on
> -------------------------------------------------------
>
>                 Key: SPARK-9645
>                 URL: https://issues.apache.org/jira/browse/SPARK-9645
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core, YARN
>    Affects Versions: 1.5.0
>            Reporter: Marcelo Vanzin
>            Priority: Blocker
>
> Lots of errors like this when running apps with the external shuffle service and kerberos enabled:
> {noformat}
> 15/08/05 06:26:18 WARN TaskSetManager: Lost task 2.0 in stage 2.0 (TID 12, spark-nightly-2.vpc.cloudera.com): FetchFailed(BlockManagerId(2, spark-nightly-2.vpc.cloudera.com, 7337), shuffleId=0, mapId=0, reduceId=2, message=
> org.apache.spark.shuffle.FetchFailedException: java.lang.RuntimeException: Failed to open file: /yarn/nm/usercache/systest/appcache/application_1438780049118_0008/blockmgr-7178b106-6902-4082-8792-1c3e34b80d15/38/shuffle_0_0_0.index
> 	at org.apache.spark.network.shuffle.ExternalShuffleBlockResolver.getSortBasedShuffleBlockData(ExternalShuffleBlockResolver.java:203)
> 	at org.apache.spark.network.shuffle.ExternalShuffleBlockResolver.getBlockData(ExternalShuffleBlockResolver.java:113)
> 	at org.apache.spark.network.shuffle.ExternalShuffleBlockHandler.handleMessage(ExternalShuffleBlockHandler.java:80)
> 	at org.apache.spark.network.shuffle.ExternalShuffleBlockHandler.receive(ExternalShuffleBlockHandler.java:68)
> 	at org.apache.spark.network.server.TransportRequestHandler.processRpcRequest(TransportRequestHandler.java:114)
> {noformat}
> This is caused by commit c4830598 (SPARK-6287), which modified the permissions of the directory storing the shuffle files.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org