Posted to reviews@spark.apache.org by parente <gi...@git.apache.org> on 2017/01/06 05:27:51 UTC

[GitHub] spark pull request #16482: [SPARK-19038][YARN] Fix for misnamed keytab in ap...

GitHub user parente opened a pull request:

    https://github.com/apache/spark/pull/16482

    [SPARK-19038][YARN] Fix for misnamed keytab in app staging dir

    ## What changes were proposed in this pull request?
    
    Bug fix to respect the generated AM keytab name when copying the local keytab file to the app staging dir.
    
    Ref: https://issues.apache.org/jira/browse/SPARK-19038
    
    ## How was this patch tested?
    
    existing tests

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/parente/spark fix-keytab-remote-copy

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/16482.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #16482
    
----
commit e2c5672ea3017f1682f5f9cead339e9874d82b7f
Author: Peter Parente <pa...@cs.unc.edu>
Date:   2017-01-06T05:20:55Z

    Fix keytab misnamed in app staging dir

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #16482: [SPARK-19105][YARN] Fix for misnamed keytab in app stagi...

Posted by Tagar <gi...@git.apache.org>.
Github user Tagar commented on the issue:

    https://github.com/apache/spark/pull/16482
  
    I see this problem in my setup too. The same code worked fine in Spark 1.5 and Spark 1.6,
    and breaks with the same symptoms as in [SPARK-19038](https://issues.apache.org/jira/browse/SPARK-19038) from Spark 2.0 onwards.
    
    This problem is not related to ticket refresh, because it happens when you try to run the first query
    in a fresh (just created) Spark context. I believe the ticket refresh code runs only when the ticket
    is about to expire, not a few seconds after the Spark context has started, unless that
    changed in Spark 2.
    
    > py4j.protocol.Py4JJavaError: An error occurred while calling o61.sql.
    > : org.apache.spark.SparkException: Keytab file: svc_odiprd.keytab-a1b98b7c-79fa-45b0-a80d-11953879a810 specified in spark.yarn.keytab does not exist
    >         at org.apache.spark.sql.hive.client.HiveClientImpl.<init>(HiveClientImpl.scala:113)
    >         at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    >         at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    >         at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    >         at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    >         at org.apache.spark.sql.hive.client.IsolatedClientLoader.createClient(IsolatedClientLoader.scala:264)
    >         at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:354)
    >         at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:258)
    >         at org.apache.spark.sql.hive.HiveSharedState.metadataHive$lzycompute(HiveSharedState.scala:39)
    >         at org.apache.spark.sql.hive.HiveSharedState.metadataHive(HiveSharedState.scala:38)
    >         at org.apache.spark.sql.hive.HiveSharedState.externalCatalog$lzycompute(HiveSharedState.scala:46)
    >         at org.apache.spark.sql.hive.HiveSharedState.externalCatalog(HiveSharedState.scala:45)
    >         at org.apache.spark.sql.hive.HiveSessionState.catalog$lzycompute(HiveSessionState.scala:50)
    >         at org.apache.spark.sql.hive.HiveSessionState.catalog(HiveSessionState.scala:48)
    >         at org.apache.spark.sql.hive.HiveSessionState$$anon$1.<init>(HiveSessionState.scala:63)
    >         at org.apache.spark.sql.hive.HiveSessionState.analyzer$lzycompute(HiveSessionState.scala:63)
    >         at org.apache.spark.sql.hive.HiveSessionState.analyzer(HiveSessionState.scala:62)
    >         at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:49)
    >         at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:64)
    >         at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:582)
    >   . . . 
    
    We also set the keytab and principal in Python code that uses yarn-client mode:
    
    ```
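        from pyspark import SparkConf, SparkContext

        # kt_location and kt_principal are defined elsewhere in the application
        conf = (SparkConf()
                .setMaster('yarn-client')
                .set("spark.yarn.keytab", kt_location)
                .set("spark.yarn.principal", kt_principal))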
        sc = SparkContext(conf=conf)
    ```
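
The file name in the exception above is the local keytab name followed by a UUID suffix. As a diagnostic aid, here is a minimal sketch that recovers the original name from such a value. The helper name `strip_uuid_suffix` is hypothetical and not part of Spark, and the sketch assumes the suffix is a standard hyphenated UUID:

```python
import re

# Matches a trailing "-<uuid>" in the 8-4-4-4-12 hexadecimal layout.
_UUID_SUFFIX = re.compile(
    r"-[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}$"
)


def strip_uuid_suffix(name: str) -> str:
    """Return name without a trailing UUID suffix; unchanged if none is present."""
    return _UUID_SUFFIX.sub("", name)


if __name__ == "__main__":
    # File name copied from the exception message in this thread.
    print(strip_uuid_suffix("svc_odiprd.keytab-a1b98b7c-79fa-45b0-a80d-11953879a810"))
```

Comparing the stripped name with the path originally passed as the keytab may help confirm that the configuration value was rewritten after submission.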





[GitHub] spark pull request #16482: [SPARK-19105][YARN] Fix for misnamed keytab in ap...

Posted by parente <gi...@git.apache.org>.
Github user parente closed the pull request at:

    https://github.com/apache/spark/pull/16482




[GitHub] spark issue #16482: [SPARK-19038][YARN] Fix for misnamed keytab in app stagi...

Posted by parente <gi...@git.apache.org>.
Github user parente commented on the issue:

    https://github.com/apache/spark/pull/16482
  
    >  please check the yarn container's local cache:
    
    Without this fix, when I specify `--principal user@REALM` and `--keytab /some/path/user.keytab`, I see the following in my app staging directory on HDFS:
    
    ```
    Found 7 items
    -rw-r--r--   3 user supergroup         68 2017-01-06 03:59 user.keytab
    -rw-r--r--   3 user supergroup      73502 2017-01-06 03:59 __spark_conf__.zip
    -rw-r--r--   3 user supergroup  189767340 2017-01-06 03:59 __spark_libs__4440821503780683972.zip
    -rw-r--r--   3 user supergroup      91275 2017-01-06 03:59 py4j-0.10.3-src.zip
    -rw-r--r--   3 user supergroup     440385 2017-01-06 03:59 pyspark.zip
    ```
    
    Notice that the keytab has not been properly suffixed during the remote copy. It's not clear to me at all how the file in your example receives the suffix when the call to `copyFileToRemote` in `Client.distribute` does not pass the destination name at all. The other calls to `copyFileToRemote` to copy the spark conf and libs do indeed pass `destName` to rename to the underscored versions we see above.
    
    > Also please see the comment in HiveClientImpl; it already mentions the problem you met.
    
    Thanks for the pointer about this aspect. The linked JIRA issue describes the HiveClientImpl problem, but its first comment also notes that Kerberos ticket renewal is not occurring properly. I can open a separate JIRA issue explicitly about the keytab naming problem and the lack of re-ticketing, to separate the two issues.
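
The distinction discussed above, between a copy that keeps the source file name and one that is given an explicit destination name, can be sketched on a local file system. This is only a toy model: `copy_to_staging`, its `dest_name` parameter, and the local name `conf.zip` are hypothetical stand-ins for illustration, not Spark's `copyFileToRemote`.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Optional


def copy_to_staging(staging_dir: Path, src: Path, dest_name: Optional[str] = None) -> Path:
    """Copy src into staging_dir, keeping its own name unless dest_name is given."""
    target = staging_dir / (dest_name if dest_name is not None else src.name)
    shutil.copyfile(src, target)
    return target


def demo() -> list:
    """Stage one file under its own name and one under an explicit name."""
    with tempfile.TemporaryDirectory() as root:
        local_dir = Path(root, "local")
        staging_dir = Path(root, "staging")
        local_dir.mkdir()
        staging_dir.mkdir()

        keytab = local_dir / "user.keytab"
        keytab.write_bytes(b"placeholder, not a real keytab")
        conf_zip = local_dir / "conf.zip"  # hypothetical local name
        conf_zip.write_bytes(b"placeholder archive")

        copy_to_staging(staging_dir, keytab)  # no dest_name: name is unchanged
        copy_to_staging(staging_dir, conf_zip, dest_name="__spark_conf__.zip")
        return sorted(p.name for p in staging_dir.iterdir())


if __name__ == "__main__":
    print(demo())
```

In this model the keytab lands in the staging directory under its local name, which matches the HDFS listing above. Any suffixed name has to come from a later step.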




[GitHub] spark issue #16482: [SPARK-19105][YARN] Fix for misnamed keytab in app stagi...

Posted by jerryshao <gi...@git.apache.org>.
Github user jerryshao commented on the issue:

    https://github.com/apache/spark/pull/16482
  
    >The other calls to copyFileToRemote to copy the spark conf and libs do indeed pass destName to rename to the underscored versions we see above.
    
    The local names of the Spark conf and libs also start with an underscore.




[GitHub] spark issue #16482: [SPARK-19038][YARN] Fix for misnamed keytab in app stagi...

Posted by jerryshao <gi...@git.apache.org>.
Github user jerryshao commented on the issue:

    https://github.com/apache/spark/pull/16482
  
    I think there's no issue on the YARN side. Please check the YARN container's local cache:
    
    ```
    -rw------- 1 spark hadoop  200 Jan  6 05:59 container_tokens
    -rwx------ 1 spark hadoop 4773 Jan  6 05:59 launch_container.sh
    lrwxrwxrwx 1 spark hadoop   67 Jan  6 05:59 __spark_conf__ -> /hadoop/yarn/local/usercache/spark/filecache/118/__spark_conf__.zip
    lrwxrwxrwx 1 spark hadoop   61 Jan  6 05:59 spark.keytab-79c95ff9-254f-4a75-9ecb-4c4afae1d1b5 -> /hadoop/yarn/local/usercache/spark/filecache/119/spark.keytab
    lrwxrwxrwx 1 spark hadoop   86 Jan  6 05:59 __spark_libs__ -> /hadoop/yarn/local/usercache/spark/filecache/117/__spark_libs__1388803833471984400.zip
    drwxr-s--- 2 spark hadoop 4096 Jan  6 05:59 tmp
    ```
    
    and Spark Conf:
    
    ```
    spark.yarn.keytab=spark.keytab-79c95ff9-254f-4a75-9ecb-4c4afae1d1b5
    ```
    
    Though the keytab still uses its local name (without the UUID suffix), the YARN distributed cache correctly points the link "spark.keytab-79c95ff9-254f-4a75-9ecb-4c4afae1d1b5" to the right file.
    
    So `AMDelegationTokenRenewer` can still correctly get the keytab here.
    
    I guess the problem here is that you're running in yarn-client mode. `HiveClientImpl` runs on the driver side, and the driver runs outside the container, so the keytab "spark.keytab-79c95ff9-254f-4a75-9ecb-4c4afae1d1b5" cannot be found there.
    
    Also please see the comment in `HiveClientImpl`, which already mentions the problem you met.
    
    ```
        // Instead of using the spark conf of the current spark context, a new
        // instance of SparkConf is needed for the original value of spark.yarn.keytab
        // and spark.yarn.principal set in SparkSubmit, as yarn.Client resets the
        // keytab configuration for the link name in distributed cache
    ``` 
    
    So IMHO the fix here is not solid.
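
The behaviour described above can be illustrated with plain symlinks on a local file system. This is a minimal sketch under stated assumptions, not YARN's implementation: the directory names and the `localize` helper are hypothetical, and it assumes a platform where creating symlinks is permitted.

```python
import tempfile
import uuid
from pathlib import Path


def localize(cache_dir: Path, container_dir: Path, file_name: str, link_name: str) -> Path:
    """Store a file under its own name and expose it in the container via a symlink."""
    cached = cache_dir / file_name
    cached.write_bytes(b"placeholder, not a real keytab")
    link = container_dir / link_name
    link.symlink_to(cached)
    return link


def demo() -> tuple:
    """Return whether the link name resolves in the container and in the driver directory."""
    with tempfile.TemporaryDirectory() as root:
        cache_dir = Path(root, "filecache")
        container_dir = Path(root, "container")
        driver_dir = Path(root, "driver")  # stands in for a driver outside the container
        for directory in (cache_dir, container_dir, driver_dir):
            directory.mkdir()

        link_name = f"spark.keytab-{uuid.uuid4()}"
        localize(cache_dir, container_dir, "spark.keytab", link_name)

        found_in_container = (container_dir / link_name).exists()
        found_in_driver = (driver_dir / link_name).exists()
        return found_in_container, found_in_driver


if __name__ == "__main__":
    print(demo())
```

The suffixed name resolves only where the symlink was created. A process that reads the same name from another working directory, as a yarn-client driver would, does not find the file.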




[GitHub] spark issue #16482: [SPARK-19105][YARN] Fix for misnamed keytab in app stagi...

Posted by jerryshao <gi...@git.apache.org>.
Github user jerryshao commented on the issue:

    https://github.com/apache/spark/pull/16482
  
    >Notice that the keytab has not been properly suffixed during the remote copy. It's not clear to me at all how the file in your example receives the suffix when the call to copyFileToRemote in Client.distribute does not pass the destination name at all. 
    
    Please check AM container's local cache, rather than HDFS staging files.
    
    I'm sure that with the current mechanism, tokens can be re-issued by `AMCredentialRenewer` on the YARN side. The only possible issue is in `HiveClientImpl`: in yarn-client mode the driver runs outside the container, so there is potentially an issue there.
    
    





[GitHub] spark issue #16482: [SPARK-19038][YARN] Fix for misnamed keytab in app stagi...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16482
  
    Can one of the admins verify this patch?




[GitHub] spark issue #16482: [SPARK-19105][YARN] Fix for misnamed keytab in app stagi...

Posted by parente <gi...@git.apache.org>.
Github user parente commented on the issue:

    https://github.com/apache/spark/pull/16482
  
    You are correct that on the app master the symlink exists with the proper suffix:
    
    ```
    -rw------- 1 user yarn  202 Jan  6 07:11 container_tokens
    lrwxrwxrwx 1 user   63 Jan  6 07:11 user.keytab-da450fc5-d80d-4b39-927f-1068b1873614 -> /mnt/disk4/yarn/local/usercache/user/filecache/34/user.keytab
    ```
    
    Thanks for the clarification and sorry for the bother. Something else must be going wrong in my setup such that tickets are not being renewed using the keytab.




[GitHub] spark issue #16482: [SPARK-19105][YARN] Fix for misnamed keytab in app stagi...

Posted by jerryshao <gi...@git.apache.org>.
Github user jerryshao commented on the issue:

    https://github.com/apache/spark/pull/16482
  
    Also, with your fix, I don't think the local keytab file name is changed to "keytab-UUID"; it is still "keytab". So when `HiveClientImpl` tries to get the local keytab with "keytab-UUID", can it really find the keytab file?





[GitHub] spark issue #16482: [SPARK-19105][YARN] Fix for misnamed keytab in app stagi...

Posted by parente <gi...@git.apache.org>.
Github user parente commented on the issue:

    https://github.com/apache/spark/pull/16482
  
    > Please check AM container's local cache, rather than HDFS staging files.
    
    Ahhhh. I see the symlinks now in your example and understand that you listed the AM contents. I will check that.
    
    > I don't think local keytab file name is changed to "keytab-UUID", it is still "keytab", so in HiveClientImpl, it tries to get local keytab with "keytab-UUID", can it really find the keytab file?
    
    Correct. I opened a separate issue about the failure to renew tickets that I am seeing and linked this PR to it. The fact that HiveClientImpl now reads the updated config value, and so gets the keytab name as it appears on the AM instead of the local name, is the original, but separate, issue.

