Posted to user@spark.apache.org by mykidong <my...@gmail.com> on 2020/09/20 05:18:02 UTC

UnknownHostException is thrown when spark job whose jar files will be uploaded to s3 object storage via https is submitted to kubernetes


Hi,

I have already succeeded in submitting a Spark job to Kubernetes that accesses S3 object storage, in the following environment:
Spark: 3.0.0.
S3 Object Storage: Hadoop Ozone S3 object storage accessed via an HTTP endpoint containing an IP address, not a host name.
Resource Management: Kubernetes.
That Spark job worked fine.


Now, I want to replace unsecured HTTP with HTTPS to access the S3 object storage:
Spark: 3.0.0.
S3 Object Storage: MinIO S3 object storage accessed via an HTTPS endpoint containing a host name.
Resource Management: Kubernetes.

I have already installed cert-manager and an ingress controller on Kubernetes, and added my S3 endpoint host name to a public DNS server.
I have also tested the MinIO S3 object storage with the AWS CLI over HTTPS, and it works as I expected (roughly as in the sketch below).
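
The check I ran was roughly like this (a minimal sketch; the credential values are placeholders, and 'mykidong' is the bucket my Spark job uses):

export AWS_ACCESS_KEY_ID=<minio-access-key>
export AWS_SECRET_ACCESS_KEY=<minio-secret-key>
# Both of these succeed over HTTPS against the MinIO endpoint:
aws s3 ls --endpoint-url https://mykidong-tenant.minio.cloudchef-labs.com
aws s3 ls s3://mykidong/ --endpoint-url https://mykidong-tenant.minio.cloudchef-labs.com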

But the problem is that when I submit my Spark job to Kubernetes and the dependency jar files are uploaded to my MinIO S3 object storage, spark-submit cannot find my S3 endpoint because it resolves the WRONG HOST NAME.

Let's see my spark job submit:

export MASTER=k8s://https://10.0.4.5:6443;
export NAMESPACE=ai-developer;
export ENDPOINT=https://mykidong-tenant.minio.cloudchef-labs.com;

spark-submit \
--master $MASTER \
--deploy-mode cluster \
--name spark-thrift-server \
--class io.spongebob.hive.SparkThriftServerRunner \
--packages com.amazonaws:aws-java-sdk-s3:1.11.375,org.apache.hadoop:hadoop-aws:3.2.0 \
--conf "spark.executor.extraJavaOptions=-Dnetworkaddress.cache.ttl=60" \
--conf "spark.driver.extraJavaOptions=-Dnetworkaddress.cache.ttl=60" \
--conf spark.kubernetes.file.upload.path=s3a://mykidong/spark-thrift-server \
--conf spark.kubernetes.container.image.pullPolicy=Always \
--conf spark.kubernetes.namespace=$NAMESPACE \
--conf spark.kubernetes.container.image=mykidong/spark:v3.0.0 \
--conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
--conf spark.hadoop.hive.metastore.client.connect.retry.delay=5 \
--conf spark.hadoop.hive.metastore.client.socket.timeout=1800 \
--conf spark.hadoop.hive.metastore.uris=thrift://metastore.$NAMESPACE.svc.cluster.local:9083 \
--conf spark.hadoop.hive.server2.enable.doAs=false \
--conf spark.hadoop.hive.server2.thrift.http.port=10002 \
--conf spark.hadoop.hive.server2.thrift.port=10016 \
--conf spark.hadoop.hive.server2.transport.mode=binary \
--conf spark.hadoop.metastore.catalog.default=spark \
--conf spark.hadoop.hive.execution.engine=spark \
--conf spark.hadoop.hive.input.format=io.delta.hive.HiveInputFormat \
--conf spark.hadoop.hive.tez.input.format=io.delta.hive.HiveInputFormat \
--conf spark.sql.warehouse.dir=s3a://mykidong/apps/spark/warehouse \
--conf spark.hadoop.fs.defaultFS=s3a://mykidong \
--conf spark.hadoop.fs.s3a.access.key=bWluaW8= \
--conf spark.hadoop.fs.s3a.secret.key=bWluaW8xMjM= \
--conf spark.hadoop.fs.s3a.connection.ssl.enabled=true \
--conf spark.hadoop.fs.s3a.endpoint=$ENDPOINT \
--conf spark.hadoop.fs.s3a.impl=org.apache.hadoop.fs.s3a.S3AFileSystem \
--conf spark.hadoop.fs.s3a.fast.upload=true \
--conf spark.driver.extraJavaOptions="-Divy.cache.dir=/tmp -Divy.home=/tmp" \
--conf spark.executor.instances=4 \
--conf spark.executor.memory=2G \
--conf spark.executor.cores=2 \
--conf spark.driver.memory=1G \
--conf spark.jars=/home/pcp/delta-lake/connectors/dist/delta-core-shaded-assembly_2.12-0.1.0.jar,/home/pcp/delta-lake/connectors/dist/hive-delta_2.12-0.1.0.jar \
file:///home/pcp/spongebob/examples/spark-thrift-server/target/spark-thrift-server-1.0.0-SNAPSHOT-spark-job.jar;



After a little while, I got the following UnknownHostException:

20/09/20 03:29:23 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
20/09/20 03:29:24 INFO SparkKubernetesClientFactory: Auto-configuring K8S client using current context from users K8S config file
20/09/20 03:29:25 INFO KerberosConfDriverFeatureStep: You have not specified a krb5.conf file locally or via a ConfigMap. Make sure that you have the krb5.conf locally on the driver image.
20/09/20 03:29:26 WARN MetricsConfig: Cannot locate configuration: tried hadoop-metrics2-s3a-file-system.properties,hadoop-metrics2.properties
20/09/20 03:29:26 INFO MetricsSystemImpl: Scheduled Metric snapshot period at 10 second(s).
20/09/20 03:29:26 INFO MetricsSystemImpl: s3a-file-system metrics system started



Exception in thread "main" org.apache.spark.SparkException: Uploading file /home/pcp/delta-lake/connectors/dist/delta-core-shaded-assembly_2.12-0.1.0.jar failed...
        at org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileUri(KubernetesUtils.scala:289)
        at org.apache.spark.deploy.k8s.KubernetesUtils$.$anonfun$uploadAndTransformFileUris$1(KubernetesUtils.scala:248)
        at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:238)
        at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
        at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
        at scala.collection.TraversableLike.map(TraversableLike.scala:238)
        at scala.collection.TraversableLike.map$(TraversableLike.scala:231)
        at scala.collection.AbstractTraversable.map(Traversable.scala:108)
        at org.apache.spark.deploy.k8s.KubernetesUtils$.uploadAndTransformFileUris(KubernetesUtils.scala:247)
        at org.apache.spark.deploy.k8s.features.BasicDriverFeatureStep.$anonfun$getAdditionalPodSystemProperties$1(BasicDriverFeatureStep.scala:162)
        at scala.collection.immutable.List.foreach(List.scala:392)
        at org.apache.spark.deploy.k8s.features.BasicDriverFeatureStep.getAdditionalPodSystemProperties(BasicDriverFeatureStep.scala:160)
        at org.apache.spark.deploy.k8s.submit.KubernetesDriverBuilder.$anonfun$buildFromFeatures$3(KubernetesDriverBuilder.scala:60)
        at scala.collection.LinearSeqOptimized.foldLeft(LinearSeqOptimized.scala:126)
        at scala.collection.LinearSeqOptimized.foldLeft$(LinearSeqOptimized.scala:122)
        at scala.collection.immutable.List.foldLeft(List.scala:89)
        at org.apache.spark.deploy.k8s.submit.KubernetesDriverBuilder.buildFromFeatures(KubernetesDriverBuilder.scala:58)
        at org.apache.spark.deploy.k8s.submit.Client.run(KubernetesClientApplication.scala:98)
        at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$4(KubernetesClientApplication.scala:221)
        at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$4$adapted(KubernetesClientApplication.scala:215)
        at org.apache.spark.util.Utils$.tryWithResource(Utils.scala:2539)
        at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.run(KubernetesClientApplication.scala:215)
        at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.start(KubernetesClientApplication.scala:188)
        at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:928)
        at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
        at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
        at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
        at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1007)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1016)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: org.apache.hadoop.fs.s3a.AWSClientIOException: doesBucketExist on mykidong: com.amazonaws.SdkClientException: Unable to execute HTTP request: mykidong.mykidong-tenant.minio.cloudchef-labs.com: Unable to execute HTTP request: mykidong.mykidong-tenant.minio.cloudchef-labs.com
        at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:189)
        at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:111)
        at org.apache.hadoop.fs.s3a.Invoker.lambda$retry$3(Invoker.java:265)
        at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:322)
        at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:261)
        at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:236)
        at org.apache.hadoop.fs.s3a.S3AFileSystem.verifyBucketExists(S3AFileSystem.java:375)
        at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:311)
        at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3303)
        at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:124)
        at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3352)
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3320)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:479)
        at org.apache.spark.util.Utils$.getHadoopFileSystem(Utils.scala:1853)
        at org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileUri(KubernetesUtils.scala:280)
        ... 30 more
Caused by: com.amazonaws.SdkClientException: Unable to execute HTTP request: mykidong.mykidong-tenant.minio.cloudchef-labs.com
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleRetryableException(AmazonHttpClient.java:1116)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1066)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:743)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:717)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:699)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:667)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:649)
        at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:513)
        at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4368)
        at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4315)
        at com.amazonaws.services.s3.AmazonS3Client.headBucket(AmazonS3Client.java:1344)
        at com.amazonaws.services.s3.AmazonS3Client.doesBucketExist(AmazonS3Client.java:1284)
        at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$verifyBucketExists$1(S3AFileSystem.java:376)
        at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:109)
        ... 43 more
Caused by: java.net.UnknownHostException: mykidong.mykidong-tenant.minio.cloudchef-labs.com
        at java.net.InetAddress.getAllByName0(InetAddress.java:1281)
        at java.net.InetAddress.getAllByName(InetAddress.java:1193)
        at java.net.InetAddress.getAllByName(InetAddress.java:1127)
        at com.amazonaws.SystemDefaultDnsResolver.resolve(SystemDefaultDnsResolver.java:27)
        at com.amazonaws.http.DelegatingDnsResolver.resolve(DelegatingDnsResolver.java:38)
        at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:112)
        at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:373)
        at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at com.amazonaws.http.conn.ClientConnectionManagerFactory$Handler.invoke(ClientConnectionManagerFactory.java:76)
        at com.amazonaws.http.conn.$Proxy18.connect(Unknown Source)
        at org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:394)
        at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:237)
        at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:185)
        at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
        at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
        at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
        at com.amazonaws.http.apache.client.impl.SdkHttpClient.execute(SdkHttpClient.java:72)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1238)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1058)
        ... 55 more
20/09/20 03:41:25 INFO ShutdownHookManager: Shutdown hook called
20/09/20 03:41:25 INFO ShutdownHookManager: Deleting directory /tmp/spark-35671300-f6a6-45c1-8f90-ca001d76eec6


Take a look at the exception message.
Even though my S3 endpoint is https://mykidong-tenant.minio.cloudchef-labs.com, the UnknownHostException message shows mykidong.mykidong-tenant.minio.cloudchef-labs.com.
That is, the bucket name 'mykidong' has been prepended to the original endpoint host.
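
My guess is that the S3A client is using virtual-hosted-style bucket addressing here, i.e. it builds the request host as <bucket>.<endpoint-host> instead of putting the bucket in the URL path. Only the endpoint host itself is registered in my DNS, so the prefixed name cannot resolve. A rough sketch of what I mean (the nslookup commands are just my assumption of how to reproduce the lookup failure):

# Only the plain endpoint host is in DNS:
nslookup mykidong-tenant.minio.cloudchef-labs.com             # resolves
# The bucket-prefixed host built by virtual-hosted-style addressing is not:
nslookup mykidong.mykidong-tenant.minio.cloudchef-labs.com    # fails -> UnknownHostException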

To summarize:
- With an HTTP endpoint using an IP address, the Spark job works fine on Kubernetes.
- But with an HTTPS endpoint using a host name, spark-submit cannot find the S3 endpoint.


Any ideas?

Cheers,

- Kidong.






Re: UnknownHostException is thrown when spark job whose jar files will be uploaded to s3 object storage via https is submitted to kubernetes

Posted by mykidong <my...@gmail.com>.
Sorry, I had missed setting the S3A path-style access property in my submit.
After adding --conf spark.hadoop.fs.s3a.path.style.access=true to the spark-submit command, it works fine!
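
For anyone hitting the same problem, here is a minimal sketch of the endpoint-related options from my command with the missing property added (only the relevant flags, not the full submit):

spark-submit \
... \
--conf spark.hadoop.fs.s3a.endpoint=https://mykidong-tenant.minio.cloudchef-labs.com \
--conf spark.hadoop.fs.s3a.connection.ssl.enabled=true \
--conf spark.hadoop.fs.s3a.path.style.access=true \
...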

- Kidong.





Re: UnknownHostException is thrown when spark job whose jar files will be uploaded to s3 object storage via https is submitted to kubernetes

Posted by Hitesh Tiwari <hi...@gmail.com>.
Hi,

Not sure if this would be useful, but you can take a look:

https://github.com/fabric8io/kubernetes-client/issues/2168

Thanks & Regards,
Hitesh Tiwari
