Posted to issues@spark.apache.org by "Zikun (Jira)" <ji...@apache.org> on 2021/08/17 12:42:00 UTC
[jira] [Updated] (SPARK-36493) Skip Retrieving keytab with
SparkFiles.get if keytab found in the CWD of Yarn Container
[ https://issues.apache.org/jira/browse/SPARK-36493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Zikun updated SPARK-36493:
--------------------------
Summary: Skip Retrieving keytab with SparkFiles.get if keytab found in the CWD of Yarn Container (was: SparkFiles.get is not needed for the JDBC keytab provided by the "--files" option)
> Skip Retrieving keytab with SparkFiles.get if keytab found in the CWD of Yarn Container
> ---------------------------------------------------------------------------------------
>
> Key: SPARK-36493
> URL: https://issues.apache.org/jira/browse/SPARK-36493
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 3.1.0, 3.1.2
> Reporter: Zikun
> Priority: Major
> Fix For: 3.1.3
>
>
> Currently, the following logic handles a JDBC keytab provided through the "--files" option:
> if (keytabParam != null && FilenameUtils.getPath(keytabParam).isEmpty) {
>   val result = SparkFiles.get(keytabParam)
>   logDebug(s"Keytab path not found, assuming --files, file name used on executor: $result")
>   result
> } else {
>   logDebug("Keytab path found, assuming manual upload")
>   keytabParam
> }
> Spark already creates a symbolic link, in the working directory of the YARN container, for every file submitted with the "--files" option. For example:
> testusera1.keytab -> /var/opt/hadoop/temp/nm-local-dir/usercache/testusera1/appcache/application_1628584679772_0003/filecache/12/testusera1.keytab
>
> So there is no need to call SparkFiles.get to obtain the absolute path of the keytab file. We can use the variable `keytabParam` directly as the keytab file path.
>
> Moreover, SparkFiles.get returns the wrong keytab path for the driver in cluster mode. In cluster mode, the keytab is distributed to the following location for both the driver and the executors:
> /var/opt/hadoop/temp/nm-local-dir/usercache/testusera1/appcache/application_1628584679772_0003/container_1628584679772_0030_01_000001/testusera1.keytab
> but SparkFiles.get returns the following incorrect location for the driver:
> /var/opt/hadoop/temp/nm-local-dir/usercache/testusera1/appcache/application_1628584679772_0003/spark-8fb0f437-c842-4a9f-9612-39de40082e40/userFiles-5075388b-0928-4bc3-a498-7f6c84b27808/testusera1.keytab
>
>
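
The change described in the summary (skip SparkFiles.get when the keytab is already present in the container's working directory) could be sketched roughly as follows. This is only an illustration, not the actual Spark patch: `KeytabResolver`, `resolveKeytab`, and the `fallback` parameter are hypothetical names, `fallback` stands in for SparkFiles.get so the sketch runs without Spark, and `java.nio.file` is used in place of FilenameUtils.getPath to avoid the commons-io dependency.

```scala
import java.nio.file.{Files, Paths}

object KeytabResolver {
  /**
   * Hypothetical helper, not Spark's actual code.
   *
   * @param keytabParam the keytab option value as passed by the user
   * @param fallback    stand-in for SparkFiles.get (an assumption made so
   *                    that this sketch is runnable without Spark)
   */
  def resolveKeytab(keytabParam: String, fallback: String => String): String = {
    // A bare file name has no parent directory component.
    val isBareName = keytabParam != null && Paths.get(keytabParam).getParent == null

    if (isBareName && Files.isRegularFile(Paths.get(keytabParam))) {
      // The name resolves against the current working directory, for example
      // through the symbolic link created for "--files": use it as is.
      keytabParam
    } else if (isBareName) {
      // Bare name that is not present in the working directory:
      // keep the existing lookup behaviour.
      fallback(keytabParam)
    } else {
      // Null, or a path with a directory component: assume manual upload.
      keytabParam
    }
  }
}
```

Inside Spark, the call would presumably look like `KeytabResolver.resolveKeytab(keytabParam, SparkFiles.get)`. `Files.isRegularFile` follows symbolic links by default, so a link in the working directory that points into the YARN file cache counts as found.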