Posted to dev@toree.apache.org by "Ribamar Santarosa (JIRA)" <ji...@apache.org> on 2017/09/12 13:12:00 UTC

[jira] [Updated] (TOREE-438) CLONE - How to support Spark on Yarn model?

     [ https://issues.apache.org/jira/browse/TOREE-438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ribamar Santarosa updated TOREE-438:
------------------------------------
    Description: 
It looks like TOREE-97 (support for Spark on YARN) was closed without a definitive solution, or something went wrong along the way. Toree does support YARN, but it won't work unless users manually add the `HADOOP_CONF_DIR` env var to their kernel.json definition. Without that env var, Spark doesn't know what to do with the option `--master=yarn` (set in `__TOREE_SPARK_OPTS__`). It would be desirable to have it set by default, and this patch provides that.
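
To illustrate the point above, here is a minimal sketch (not the patch itself) of the `env` section that a kernel.json would need for `--master=yarn` to work. The directory paths and the helper name `yarn_kernel_env` are assumptions made up for this example; only the variable names come from this issue.

```python
import json

# Hypothetical locations; adjust to the local Hadoop/Spark installation.
HADOOP_CONF_DIR = "/etc/hadoop/conf"
SPARK_CONF_DIR = "/etc/spark/conf"


def yarn_kernel_env(hadoop_conf_dir, spark_conf_dir):
    """Build the "env" mapping of a kernel.json that targets YARN.

    Hypothetical helper for illustration; it is not part of Toree.
    """
    return {
        # Option mentioned in this issue; Spark needs HADOOP_CONF_DIR to use it.
        "__TOREE_SPARK_OPTS__": "--master=yarn",
        "HADOOP_CONF_DIR": hadoop_conf_dir,
        "SPARK_CONF_DIR": spark_conf_dir,
    }


env = yarn_kernel_env(HADOOP_CONF_DIR, SPARK_CONF_DIR)
# Print the fragment as it would appear inside kernel.json.
print(json.dumps({"env": env}, indent=2))
```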

This is probably not the nicest way to solve the problem, because it just hard-codes more vars into the JSON file. Ideally there would be an interface to add or remove env vars from those files. However, `HADOOP_CONF_DIR` and `SPARK_CONF_DIR` look basic enough to be exported by default. Even for a Spark Standalone deployment, `HADOOP_CONF_DIR` won't hurt. So here go our 2 cents to improve the situation a bit.
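
As a rough idea of what an interface for adding or removing env vars could look like, here is a small sketch that operates on a kernel spec already loaded as a dict. The function `update_kernel_env` and the example path are hypothetical; nothing like this exists in Toree as far as this issue states.

```python
import copy


def update_kernel_env(spec, add=None, remove=()):
    """Return a copy of a kernel spec dict with env vars added or removed.

    Hypothetical helper for illustration; the input spec is left untouched.
    """
    updated = copy.deepcopy(spec)
    env = updated.setdefault("env", {})  # create "env" if the spec lacks it
    env.update(add or {})
    for name in remove:
        env.pop(name, None)  # ignore names that are not set
    return updated


spec = {
    "display_name": "SparkOnYarn",
    "language": "scala",
    "env": {"__TOREE_SPARK_OPTS__": "--master=yarn"},
}
# The path below is an assumption; use the local Hadoop configuration directory.
patched = update_kernel_env(spec, add={"HADOOP_CONF_DIR": "/etc/hadoop/conf"})
reverted = update_kernel_env(patched, remove=["HADOOP_CONF_DIR"])
```

Working on a copy keeps the original spec intact, so a caller can compare both versions before writing anything back to disk.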

I cloned TOREE-97 into TOREE-438 to file this issue.

  was:
Hi all,
I am testing spark-kernel with the released IPython 3.0 and Spark in YARN mode. My kernel.json is as follows:
{code}
{
    "display_name": "SparkOnYarn",
    "language": "scala",
    "argv": [
        "/root/local/bin/sparkkernel",
        "--master",
        "yarn-client",
        "--profile",
        "{connection_file}"
    ],
    "codemirror_mode": "scala"
}
{code}
However, the kernel cannot be started.


> CLONE - How to support Spark on Yarn model?
> -------------------------------------------
>
>                 Key: TOREE-438
>                 URL: https://issues.apache.org/jira/browse/TOREE-438
>             Project: TOREE
>          Issue Type: Bug
>            Reporter: Ribamar Santarosa



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)