You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@spark.apache.org by "Lily (Jira)" <ji...@apache.org> on 2022/04/18 09:52:00 UTC
[jira] [Created] (SPARK-38934) Provider TemporaryAWSCredentialsProvider has no credentials
Lily created SPARK-38934:
----------------------------
Summary: Provider TemporaryAWSCredentialsProvider has no credentials
Key: SPARK-38934
URL: https://issues.apache.org/jira/browse/SPARK-38934
Project: Spark
Issue Type: Bug
Components: Kubernetes, Spark Core
Affects Versions: 3.2.1
Reporter: Lily
We are using Jupyter Hub on K8s as a notebook based development environment and Spark on K8s as a backend cluster of Jupyter Hub on K8s with Spark 3.2.1 and Hadoop 3.3.1.
When we run a code like the one below in the Jupyter Hub on K8s,
{code:java}
val perm = ... // get AWS temporary credential by AWS STS from AWS assumed role
// set AWS temporary credential
spark.sparkContext.hadoopConfiguration.set("fs.s3a.aws.credentials.provider", "org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider")
spark.sparkContext.hadoopConfiguration.set("fs.s3a.access.key", perm.credential.accessKeyID)
spark.sparkContext.hadoopConfiguration.set("fs.s3a.secret.key", perm.credential.secretAccessKey)
spark.sparkContext.hadoopConfiguration.set("fs.s3a.session.token", perm.credential.sessionToken)
// execute simple Spark action
spark.read.format("parquet").load("s3a://<path>/*").show(1) {code}
we got warning like the one below in the first code execution, but we were able to get the proper result thanks to Spark task retry function.
{code:java}
22/04/18 09:13:50 WARN TaskSetManager: Lost task 2.0 in stage 0.0 (TID 2) (10.197.5.15 executor 1): java.nio.file.AccessDeniedException: s3a://<path>/<file>.parquet: org.apache.hadoop.fs.s3a.CredentialInitializationException: Provider TemporaryAWSCredentialsProvider has no credentials
at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:206)
at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:117)
at org.apache.hadoop.fs.s3a.S3AFileSystem.listStatus(S3AFileSystem.java:2810)
at org.apache.spark.util.HadoopFSUtils$.listLeafFiles(HadoopFSUtils.scala:225)
at org.apache.spark.util.HadoopFSUtils$.$anonfun$parallelListLeafFilesInternal$6(HadoopFSUtils.scala:136)
at scala.collection.immutable.Stream.map(Stream.scala:418)
at org.apache.spark.util.HadoopFSUtils$.$anonfun$parallelListLeafFilesInternal$4(HadoopFSUtils.scala:126)
at org.apache.spark.rdd.RDD.$anonfun$mapPartitions$2(RDD.scala:863)
at org.apache.spark.rdd.RDD.$anonfun$mapPartitions$2$adapted(RDD.scala:863)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
at org.apache.spark.scheduler.Task.run(Task.scala:131)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:506)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1462)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:509)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: org.apache.hadoop.fs.s3a.CredentialInitializationException: Provider TemporaryAWSCredentialsProvider has no credentials
at org.apache.hadoop.fs.s3a.auth.AbstractSessionCredentialsProvider.getCredentials(AbstractSessionCredentialsProvider.java:130)
at org.apache.hadoop.fs.s3a.AWSCredentialProviderList.getCredentials(AWSCredentialProviderList.java:177)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.getCredentialsFromContext(AmazonHttpClient.java:1266)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.runBeforeRequestHandlers(AmazonHttpClient.java:842)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:792)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:779)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:753)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:713)
at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:695)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:559)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:539)
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5445)
at com.amazonaws.services.s3.AmazonS3Client.getBucketRegionViaHeadRequest(AmazonS3Client.java:6420)
at com.amazonaws.services.s3.AmazonS3Client.fetchRegionFromCache(AmazonS3Client.java:6393)
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5430)
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5392)
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5386)
at com.amazonaws.services.s3.AmazonS3Client.listObjectsV2(AmazonS3Client.java:971)
at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$listObjects$7(S3AFileSystem.java:2116)
at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:489)
at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:412)
at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:375)
at org.apache.hadoop.fs.s3a.S3AFileSystem.listObjects(S3AFileSystem.java:2107)
at org.apache.hadoop.fs.s3a.S3AFileSystem$ListingOperationCallbacksImpl.lambda$listObjectsAsync$0(S3AFileSystem.java:1750)
at org.apache.hadoop.fs.s3a.impl.CallableSupplier.get(CallableSupplier.java:62)
at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1700)
... 3 more {code}
Would you explain why we are having this warning and tell us how we can prevent experiencing this issue again?
Thank you in advance.
--
This message was sent by Atlassian Jira
(v8.20.1#820001)
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org