Posted to issues@spark.apache.org by "Manos Tsagkias (Jira)" <ji...@apache.org> on 2020/06/29 23:14:00 UTC
[jira] [Created] (SPARK-32134) YARN: archives rename with # doesn't work for https
Manos Tsagkias created SPARK-32134:
--------------------------------------
Summary: YARN: archives rename with # doesn't work for https
Key: SPARK-32134
URL: https://issues.apache.org/jira/browse/SPARK-32134
Project: Spark
Issue Type: Bug
Components: YARN
Affects Versions: 2.3.0
Reporter: Manos Tsagkias
This is related to SPARK-10858.
The YARN distributed cache lets you rename an archive passed with --archives by appending a # symbol followed by the new name. This renaming does not work with the http(s) scheme:
{{--archives http://mirror.sfo12.us.leaseweb.net/centos/6.10/isos/i386/sha1sum.txt#sha1sum}}
This is because URLs can have fragments, so the # and the name after it are interpreted as the URL's fragment instead of as the new name for the archive. We could use a trick similar to the one already used for the other two schemes, file:// and hdfs://: first remove the last fragment, then parse the URL, and then reattach the fragment. The [code exists|https://github.com/apache/spark/pull/9035/files] but it is not applied to http(s) URLs.
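
The remove, parse, reattach idea could look roughly like the sketch below. It is written in Python purely for illustration and is not Spark's actual code. The names ArchiveSpec, parse_archive_spec and to_spec are hypothetical, and the sketch assumes the text after the last # is always the rename alias, never a real URL fragment.

```python
from typing import NamedTuple, Optional
from urllib.parse import urlsplit


class ArchiveSpec(NamedTuple):
    uri: str                   # where to fetch the archive from, alias removed
    link_name: Optional[str]   # alias requested after '#', or None


def parse_archive_spec(spec: str) -> ArchiveSpec:
    """Split 'scheme://...#alias' into the fetch URI and the alias."""
    # Cut at the LAST '#' before any URL parsing happens.
    base, sep, alias = spec.rpartition("#")
    if not sep:
        # No '#' present: rpartition puts the whole string in `alias`.
        base, alias = spec, ""

    # Parse only the alias-free part, so scheme-specific fragment
    # handling cannot swallow the alias.
    if not urlsplit(base).scheme:
        raise ValueError(f"archive URI needs a scheme: {spec!r}")

    return ArchiveSpec(uri=base, link_name=alias or None)


def to_spec(parsed: ArchiveSpec) -> str:
    """Reattach the alias, giving back the 'uri#alias' form."""
    if parsed.link_name is None:
        return parsed.uri
    return f"{parsed.uri}#{parsed.link_name}"
```

With the example from this report, parse_archive_spec would return the URL up to sha1sum.txt as uri and sha1sum as link_name, with no special case for http(s), file:// or hdfs://.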