Posted to issues@spark.apache.org by "Xuzhou Yin (Jira)" <ji...@apache.org> on 2020/09/02 00:17:00 UTC

[jira] [Created] (SPARK-32775) [k8s] Spark client dependency support ignores non-local paths

Xuzhou Yin created SPARK-32775:
----------------------------------

             Summary: [k8s] Spark client dependency support ignores non-local paths
                 Key: SPARK-32775
                 URL: https://issues.apache.org/jira/browse/SPARK-32775
             Project: Spark
          Issue Type: Bug
          Components: Kubernetes
    Affects Versions: 3.0.0
            Reporter: Xuzhou Yin


According to the logic at this line: https://github.com/apache/spark/blob/master/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicDriverFeatureStep.scala#L161, Spark filters out all paths that are not local (i.e. paths with no scheme or with the file:// scheme). As a result, non-local dependencies may not be loaded by the driver.

For example, when starting a Spark job with spark.jars=local:///local/path/1.jar,s3://s3/path/2.jar,file:///local/path/3.jar, it seems that this logic uploads file:///local/path/3.jar to S3 and resets spark.jars to only s3://upload/path/3.jar, completely ignoring local:///local/path/1.jar and s3://s3/path/2.jar.

We need to fix this logic so that Spark uploads local files to S3 and rewrites their paths, while keeping all other paths as they are.
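
To make the difference concrete, here is a minimal Python sketch of the two behaviors using the example spark.jars value from this report. It is not Spark's actual code (which is Scala): the function names filter_only, resolve_all, and upload_to_remote are hypothetical, the upload is simulated by computing a target URI only, and the upload base s3://upload/path is assumed from the example.

```python
from urllib.parse import urlparse


def is_client_local(uri: str) -> bool:
    """True for paths with no scheme or with the file scheme."""
    return urlparse(uri).scheme in ("", "file")


def upload_to_remote(uri: str, upload_base: str) -> str:
    """Hypothetical stand-in for the upload step: returns the target URI only."""
    file_name = urlparse(uri).path.rsplit("/", 1)[-1]
    return f"{upload_base.rstrip('/')}/{file_name}"


def filter_only(jars, upload_base):
    """Behavior described in this report: non-local entries are dropped."""
    return [upload_to_remote(j, upload_base) for j in jars if is_client_local(j)]


def resolve_all(jars, upload_base):
    """Requested behavior: upload local entries, keep every other entry as-is."""
    return [
        upload_to_remote(j, upload_base) if is_client_local(j) else j
        for j in jars
    ]


jars = [
    "local:///local/path/1.jar",
    "s3://s3/path/2.jar",
    "file:///local/path/3.jar",
]
print(filter_only(jars, "s3://upload/path"))
print(resolve_all(jars, "s3://upload/path"))
```

With the example input, filter_only returns only the uploaded copy of 3.jar, while resolve_all keeps 1.jar and 2.jar unchanged, in their original order, alongside it.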



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
