Posted to issues@spark.apache.org by "Manos Tsagkias (Jira)" <ji...@apache.org> on 2020/06/29 23:14:00 UTC

[jira] [Created] (SPARK-32134) YARN: archives rename with # doesn't work for https

Manos Tsagkias created SPARK-32134:
--------------------------------------

             Summary: YARN: archives rename with # doesn't work for https
                 Key: SPARK-32134
                 URL: https://issues.apache.org/jira/browse/SPARK-32134
             Project: Spark
          Issue Type: Bug
          Components: YARN
    Affects Versions: 2.3.0
            Reporter: Manos Tsagkias


This is related to SPARK-10858.

The YARN distributed cache lets you rename an archive passed with --archives by appending a # symbol followed by the new name. This does not work when the archive uses the http(s) scheme, for example:

{{--archives http://mirror.sfo12.us.leaseweb.net/centos/6.10/isos/i386/sha1sum.txt#sha1sum}}

This is because URLs can have fragments, so the # and everything after it is interpreted as the URL's fragment. We could use the same trick we already use for the other two schemes, file:// and hdfs://: first remove the last fragment, then parse the URL, and then reattach the fragment. The [code exists|https://github.com/apache/spark/pull/9035/files] but it is not applied to http(s) URLs.
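
To make the idea concrete, here is a minimal sketch in Python of the strip, parse, reattach approach described above. It is not Spark's code: the function names split_link_name and resolve_archive are hypothetical, and falling back to the file's basename when no name is given is an assumption made for illustration.

```python
import posixpath
from urllib.parse import urlsplit


def split_link_name(archive):
    """Split 'resource#name' at the LAST '#'.

    Returns (resource, name); name is None when there is no '#'.
    """
    resource, sep, name = archive.rpartition("#")
    if not sep:
        # No '#' at all: rpartition puts the whole string in `name`.
        return archive, None
    return resource, name


def resolve_archive(archive):
    """Return (resource_url, link_name) for an --archives style value.

    The '#name' suffix is removed before the URL is parsed, so a URL
    parser never sees it as a fragment. The name is handed back
    separately instead of being lost.
    """
    resource, name = split_link_name(archive)
    parts = urlsplit(resource)  # parsed without the '#name' suffix
    if not name:
        # Assumption for this sketch: default to the file's basename.
        name = posixpath.basename(parts.path)
    return resource, name


if __name__ == "__main__":
    value = ("http://mirror.sfo12.us.leaseweb.net/centos/6.10/isos/i386/"
             "sha1sum.txt#sha1sum")
    # A plain URL parse treats the suffix as the fragment:
    print(urlsplit(value).fragment)
    # Stripping it first keeps the URL and the name apart:
    print(resolve_archive(value))
```

Because the suffix is split off before parsing, the same code path works for any scheme, which is the point of applying the existing handling to http(s) as well.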

 


