You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@spark.apache.org by "Lily (Jira)" <ji...@apache.org> on 2022/04/18 09:52:00 UTC

[jira] [Created] (SPARK-38934) Provider TemporaryAWSCredentialsProvider has no credentials

Lily created SPARK-38934:
----------------------------

             Summary: Provider TemporaryAWSCredentialsProvider has no credentials
                 Key: SPARK-38934
                 URL: https://issues.apache.org/jira/browse/SPARK-38934
             Project: Spark
          Issue Type: Bug
          Components: Kubernetes, Spark Core
    Affects Versions: 3.2.1
            Reporter: Lily


 

We are using Jupyter Hub on K8s as a notebook based development environment and Spark on K8s as a backend cluster of Jupyter Hub on K8s with Spark 3.2.1 and Hadoop 3.3.1.

When we run a code like the one below in the Jupyter Hub on K8s,

 
{code:java}
val perm = ... // get AWS temporary credential by AWS STS from AWS assumed role

// set AWS temporary credential
spark.sparkContext.hadoopConfiguration.set("fs.s3a.aws.credentials.provider", "org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider")
spark.sparkContext.hadoopConfiguration.set("fs.s3a.access.key", perm.credential.accessKeyID)
spark.sparkContext.hadoopConfiguration.set("fs.s3a.secret.key", perm.credential.secretAccessKey)
spark.sparkContext.hadoopConfiguration.set("fs.s3a.session.token", perm.credential.sessionToken)

// execute simple Spark action
spark.read.format("parquet").load("s3a://<path>/*").show(1) {code}
 

 

we got warning like the one below in the first code execution, but we were able to get the proper result thanks to Spark task retry function. 
{code:java}
22/04/18 09:13:50 WARN TaskSetManager: Lost task 2.0 in stage 0.0 (TID 2) (10.197.5.15 executor 1): java.nio.file.AccessDeniedException: s3a://<path>/<file>.parquet: org.apache.hadoop.fs.s3a.CredentialInitializationException: Provider TemporaryAWSCredentialsProvider has no credentials
	at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:206)
	at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:117)
	at org.apache.hadoop.fs.s3a.S3AFileSystem.listStatus(S3AFileSystem.java:2810)
	at org.apache.spark.util.HadoopFSUtils$.listLeafFiles(HadoopFSUtils.scala:225)
	at org.apache.spark.util.HadoopFSUtils$.$anonfun$parallelListLeafFilesInternal$6(HadoopFSUtils.scala:136)
	at scala.collection.immutable.Stream.map(Stream.scala:418)
	at org.apache.spark.util.HadoopFSUtils$.$anonfun$parallelListLeafFilesInternal$4(HadoopFSUtils.scala:126)
	at org.apache.spark.rdd.RDD.$anonfun$mapPartitions$2(RDD.scala:863)
	at org.apache.spark.rdd.RDD.$anonfun$mapPartitions$2$adapted(RDD.scala:863)
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
	at org.apache.spark.scheduler.Task.run(Task.scala:131)
	at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:506)
	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1462)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:509)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: org.apache.hadoop.fs.s3a.CredentialInitializationException: Provider TemporaryAWSCredentialsProvider has no credentials
	at org.apache.hadoop.fs.s3a.auth.AbstractSessionCredentialsProvider.getCredentials(AbstractSessionCredentialsProvider.java:130)
	at org.apache.hadoop.fs.s3a.AWSCredentialProviderList.getCredentials(AWSCredentialProviderList.java:177)
	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.getCredentialsFromContext(AmazonHttpClient.java:1266)
	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.runBeforeRequestHandlers(AmazonHttpClient.java:842)
	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:792)
	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:779)
	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:753)
	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:713)
	at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:695)
	at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:559)
	at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:539)
	at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5445)
	at com.amazonaws.services.s3.AmazonS3Client.getBucketRegionViaHeadRequest(AmazonS3Client.java:6420)
	at com.amazonaws.services.s3.AmazonS3Client.fetchRegionFromCache(AmazonS3Client.java:6393)
	at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5430)
	at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5392)
	at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5386)
	at com.amazonaws.services.s3.AmazonS3Client.listObjectsV2(AmazonS3Client.java:971)
	at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$listObjects$7(S3AFileSystem.java:2116)
	at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:489)
	at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:412)
	at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:375)
	at org.apache.hadoop.fs.s3a.S3AFileSystem.listObjects(S3AFileSystem.java:2107)
	at org.apache.hadoop.fs.s3a.S3AFileSystem$ListingOperationCallbacksImpl.lambda$listObjectsAsync$0(S3AFileSystem.java:1750)
	at org.apache.hadoop.fs.s3a.impl.CallableSupplier.get(CallableSupplier.java:62)
	at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1700)
	... 3 more {code}
Would you explain why we are having this warning and tell us how we can prevent  experiencing this issue again?

Thank you in advance.

 



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org