Posted to dev@toree.apache.org by "Luciano Resende (Jira)" <ji...@apache.org> on 2020/03/30 21:58:00 UTC

[jira] [Commented] (TOREE-516) Kerberos error while working with Toree Kernel

    [ https://issues.apache.org/jira/browse/TOREE-516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17071306#comment-17071306 ] 

Luciano Resende commented on TOREE-516:
---------------------------------------

While the Toree kernel can be submitted in yarn-cluster mode, vanilla Jupyter Notebook expects kernels to run as local processes and cannot discover where in the YARN cluster the kernel was actually launched. The proper way to handle this is to run Jupyter Notebook -> Jupyter Enterprise Gateway -> Toree Kernel, as described in https://jupyter-enterprise-gateway.readthedocs.io/en/latest/kernel-yarn-cluster-mode.html
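
For reference, when a kernel is launched through Enterprise Gateway in yarn-cluster mode, the kernelspec delegates process management to a YARN process proxy instead of a local process. A minimal sketch of such a kernel.json, loosely modeled on the Enterprise Gateway sample kernelspecs (the paths, display name, and SPARK_OPTS value here are illustrative assumptions, and the exact argv launcher arguments may differ by Enterprise Gateway version):

```json
{
  "language": "scala",
  "display_name": "Spark - Scala (YARN Cluster Mode)",
  "metadata": {
    "process_proxy": {
      "class_name": "enterprise_gateway.services.processproxies.yarn.YarnClusterProcessProxy"
    }
  },
  "env": {
    "SPARK_HOME": "/usr/hdp/3.1.0.0-78/spark2",
    "SPARK_OPTS": "--master yarn --deploy-mode cluster",
    "LAUNCH_OPTS": ""
  },
  "argv": [
    "/usr/local/share/jupyter/kernels/spark_scala_yarn_cluster/bin/run.sh",
    "--RemoteProcessProxy.kernel-id",
    "{kernel_id}",
    "--RemoteProcessProxy.response-address",
    "{response_address}"
  ]
}
```

The process_proxy metadata is what lets the gateway track the kernel wherever YARN schedules it, which plain Jupyter Notebook cannot do.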

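A side note on the GSS "Failed to find any Kerberos tgt" error reported below: in yarn-cluster mode the driver runs inside the YARN cluster, where a ticket cache obtained via kinit on the JupyterHub host is not visible. One common mitigation is to let spark-submit perform the Kerberos login itself via its --principal and --keytab options. A sketch of the corresponding kernel.json env entry, with a placeholder principal and keytab path:

```json
"TOREE_SPARK_OPTS": "--master yarn --deploy-mode cluster --principal someuser@EXAMPLE.REALM --keytab /etc/security/keytabs/someuser.keytab"
```

Note that with --keytab, spark-submit also fetches delegation tokens for configured services at submission time, which is typically what remote executors need when talking to services such as HBase.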

> Kerberos error while working with Toree Kernel
> ----------------------------------------------
>
>                 Key: TOREE-516
>                 URL: https://issues.apache.org/jira/browse/TOREE-516
>             Project: TOREE
>          Issue Type: Bug
>          Components: Kernel
>    Affects Versions: 0.3.0
>         Environment: RHEL 7.7, HDP 3 , Spark 2.3.2
>            Reporter: Bharath
>            Priority: Critical
>              Labels: security
>
> We have a Kerberized HDP 3.1.0 cluster and have configured Jupyter/Toree. It throws a Kerberos error even though we have a valid Kerberos ticket in place. Here is the kernel.json:
> {
>   "argv": [
>     "/opt/jupyterhub/share/jupyter/kernels/apache_toree_scala/bin/run.sh",
>     "--profile",
>     "{connection_file}"
>   ],
>   "env": {
>     "DEFAULT_INTERPRETER": "Scala",
>     "TOREE_SPARK_OPTS": "--master yarn --deploy-mode cluster",
>     "TOREE_OPTS": "",
>     "HADOOP_CONF_DIR": "/etc/hadoop/conf",
>     "SPARK_HOME": "/usr/hdp/3.1.0.0-78/spark2",
>     "PYTHONPATH": "/opt/jupyterhub/bin/python:/opt/anaconda3/bin/python:/usr/hdp/3.1.0.0-78/spark2/python/lib/py4j-0.10.7-src.zip:/usr/hdp/3.1.0.0-78/spark2/python:/usr/hdp/3.1.0.0-78/spark2/python/lib",
>     "PYTHON_EXEC": "python",
>     "HBASE_CONF_DIR": "/etc/hbase/conf",
>     "HBASE_HOME": "/usr/hdp/current/hbase-client",
>     "JAVA_HOME": "/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.232.b09-0.el7_7.x86_64"
>   },
>   "display_name": "Toree - Scala-kernel",
>   "language": "scala",
>   "interrupt_mode": "signal",
>   "metadata": {}
> }
> Error Message:
> 20/03/26 18:32:28 INFO RpcRetryingCallerImpl: Call exception, tries=8, retries=36, started=18874 ms ago, cancelled=false, msg=Call to compute-4.datalake.ntt/xxx.xxx.xx.xxx:16020 failed on local exception: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)], details=row 'name_data:exchg_yearly,,99999999999999' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=compute-4.datalake.ntt
> With the above configuration, the kernel sometimes dies.
> Please advise whether the way we are passing configs in kernel.json is correct.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)