Posted to reviews@spark.apache.org by liancheng <gi...@git.apache.org> on 2014/08/09 03:13:29 UTC

[GitHub] spark pull request: [SPARK-2894][Core] Fixes spark-shell and pyspa...

GitHub user liancheng opened a pull request:

    https://github.com/apache/spark/pull/1864

    [SPARK-2894][Core] Fixes spark-shell and pyspark CLI options

    This PR tries to fix SPARK-2894 in a way different from #1825, namely by filtering all `spark-submit` options out of the user application options, since we have control over the option list of `spark-submit`. By extracting the filtering into a utility shell function (sketched below), fixing the other shell scripts becomes much easier, and adding new `spark-submit` options won't require much duplicated work. In the mid term, we still need a cleaner solution such as the one discussed in #1715.
    
    All modes `pyspark` supports (plain Python, IPython, and the deprecated `pyspark app.py` style) should work. @JoshRosen @davies It would be great if you could help review the PySpark-related code, thanks!
    
    An open issue is that the Windows scripts also need to be updated.
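
    As a rough illustration of the approach, here is a minimal sketch of such a utility function (illustrative only, with a shortened option whitelist and simplified error handling; it is not the exact code in this PR):

        # Split "$@" into spark-submit options and user application options.
        # Only a few option names are listed here; the real whitelist is longer.
        function gatherSparkSubmitOpts() {
          SUBMISSION_OPTS=()
          APPLICATION_OPTS=()
          while (($#)); do
            case "$1" in
              --master | --deploy-mode | --name | --conf | --driver-memory)
                # These spark-submit options take a value, so consume two arguments.
                SUBMISSION_OPTS+=("$1" "$2")
                shift 2
                ;;
              --verbose | -v)
                # Flag-style spark-submit options consume a single argument.
                SUBMISSION_OPTS+=("$1")
                shift
                ;;
              *)
                # Everything else belongs to the user application.
                APPLICATION_OPTS+=("$1")
                shift
                ;;
            esac
          done
        }

    A caller would then run `gatherSparkSubmitOpts "$@"` and pass `"${SUBMISSION_OPTS[@]}"` to `spark-submit` and `"${APPLICATION_OPTS[@]}"` to the application.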

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/liancheng/spark spark-2894

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/1864.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1864
    
----
commit e630d19c01f35a9d48decc6c60b01b9d99c311c0
Author: Cheng Lian <li...@gmail.com>
Date:   2014-08-08T17:15:45Z

    Fixing pyspark and spark-shell CLI options

commit 5afc584c9338bd2b5d7dff827a04ecdd07d6ac36
Author: Cheng Lian <li...@gmail.com>
Date:   2014-08-09T00:50:41Z

    Filter out spark-submit options when starting Python gateway

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-2894][Core] Fixes spark-shell and pyspa...

Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on the pull request:

    https://github.com/apache/spark/pull/1864#issuecomment-51688048
  
    @andrewor14 Sorry, I was busy attending the 1st Spark Meetup in Beijing today. Thanks to @sarutak, his most recently updated PR (#1825) fixes all the issues you've pointed out here. I'll test it locally to make sure.




[GitHub] spark pull request: [SPARK-2894][Core] Fixes spark-shell and pyspa...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1864#discussion_r16024844
  
    --- Diff: bin/utils.sh ---
    @@ -0,0 +1,56 @@
    +#!/usr/bin/env bash
    +
    +#
    +# Licensed to the Apache Software Foundation (ASF) under one or more
    +# contributor license agreements.  See the NOTICE file distributed with
    +# this work for additional information regarding copyright ownership.
    +# The ASF licenses this file to You under the Apache License, Version 2.0
    +# (the "License"); you may not use this file except in compliance with
    +# the License.  You may obtain a copy of the License at
    +#
    +#    http://www.apache.org/licenses/LICENSE-2.0
    +#
    +# Unless required by applicable law or agreed to in writing, software
    +# distributed under the License is distributed on an "AS IS" BASIS,
    +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    +# See the License for the specific language governing permissions and
    +# limitations under the License.
    +#
    +
    +# Gather all all spark-submit options into SUBMISSION_OPTS
    +function gatherSparkSubmitOpts() {
    +  SUBMISSION_OPTS=()
    +  APPLICATION_OPTS=()
    --- End diff --
    
    `SUBMISSION_OPTS` sounds a little strange to me. At the same time I realize we already have a `SPARK_SUBMIT_OPTS` elsewhere. How about calling these two `SPARK_SUBMIT_ARGS` and `SPARK_APPLICATION_ARGS` instead?




[GitHub] spark pull request: [SPARK-2894][Core] Fixes spark-shell and pyspa...

Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1864#discussion_r16026058
  
    --- Diff: bin/utils.sh ---
    @@ -0,0 +1,56 @@
    +#!/usr/bin/env bash
    +
    +#
    +# Licensed to the Apache Software Foundation (ASF) under one or more
    +# contributor license agreements.  See the NOTICE file distributed with
    +# this work for additional information regarding copyright ownership.
    +# The ASF licenses this file to You under the Apache License, Version 2.0
    +# (the "License"); you may not use this file except in compliance with
    +# the License.  You may obtain a copy of the License at
    +#
    +#    http://www.apache.org/licenses/LICENSE-2.0
    +#
    +# Unless required by applicable law or agreed to in writing, software
    +# distributed under the License is distributed on an "AS IS" BASIS,
    +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    +# See the License for the specific language governing permissions and
    +# limitations under the License.
    +#
    +
    +# Gather all all spark-submit options into SUBMISSION_OPTS
    +function gatherSparkSubmitOpts() {
    +  SUBMISSION_OPTS=()
    +  APPLICATION_OPTS=()
    +  while (($#)); do
    +    case $1 in
    +      --master | --deploy-mode | --class | --name | --jars | --py-files | --files)
    +        ;&
    +
    +      --conf | --properties-file | --driver-memory | --driver-java-options)
    +        ;&
    +
    +      --driver-library-path | --driver-class-path | --executor-memory | --driver-cores)
    +        ;&
    +
    +      --total-executor-cores | --executor-cores | --queue | --num-executors | --archives)
    --- End diff --
    
    I'm using Mavericks with a brew-installed Bash 4.3.8. And yes, #1825 works, thanks.




[GitHub] spark pull request: [SPARK-2894][Core] Fixes spark-shell and pyspa...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1864#discussion_r16024822
  
    --- Diff: bin/spark-shell ---
    @@ -46,11 +49,11 @@ function main(){
             # (see https://github.com/sbt/sbt/issues/562).
             stty -icanon min 1 -echo > /dev/null 2>&1
             export SPARK_SUBMIT_OPTS="$SPARK_SUBMIT_OPTS -Djline.terminal=unix"
    -        $FWDIR/bin/spark-submit --class org.apache.spark.repl.Main spark-shell "$@"
    +        $FWDIR/bin/spark-submit --class org.apache.spark.repl.Main ${SUBMISSION_OPTS[@]} spark-shell ${APPLICATION_OPTS[@]}
    --- End diff --
    
    Does this handle quoted strings, e.g. `--name "awesome app"`? You may need to put double quotes around the argument lists.
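
    For illustration, a small sketch of the quoting concern (hypothetical values, not the PR's code):

        # Without double quotes, a value containing spaces is re-split by the shell:
        SUBMISSION_OPTS=(--name "awesome app")

        printf '[%s] ' ${SUBMISSION_OPTS[@]}; echo     # [--name] [awesome] [app]

        # Quoting the expansion keeps each array element as a single argument:
        printf '[%s] ' "${SUBMISSION_OPTS[@]}"; echo   # [--name] [awesome app]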




[GitHub] spark pull request: [SPARK-2894][Core] Fixes spark-shell and pyspa...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1864#discussion_r16024830
  
    --- Diff: bin/utils.sh ---
    @@ -0,0 +1,56 @@
    +#!/usr/bin/env bash
    +
    +#
    +# Licensed to the Apache Software Foundation (ASF) under one or more
    +# contributor license agreements.  See the NOTICE file distributed with
    +# this work for additional information regarding copyright ownership.
    +# The ASF licenses this file to You under the Apache License, Version 2.0
    +# (the "License"); you may not use this file except in compliance with
    +# the License.  You may obtain a copy of the License at
    +#
    +#    http://www.apache.org/licenses/LICENSE-2.0
    +#
    +# Unless required by applicable law or agreed to in writing, software
    +# distributed under the License is distributed on an "AS IS" BASIS,
    +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    +# See the License for the specific language governing permissions and
    +# limitations under the License.
    +#
    +
    +# Gather all all spark-submit options into SUBMISSION_OPTS
    +function gatherSparkSubmitOpts() {
    +  SUBMISSION_OPTS=()
    +  APPLICATION_OPTS=()
    +  while (($#)); do
    +    case $1 in
    +      --master | --deploy-mode | --class | --name | --jars | --py-files | --files)
    +        ;&
    +
    +      --conf | --properties-file | --driver-memory | --driver-java-options)
    +        ;&
    +
    +      --driver-library-path | --driver-class-path | --executor-memory | --driver-cores)
    +        ;&
    +
    +      --total-executor-cores | --executor-cores | --queue | --num-executors | --archives)
    --- End diff --
    
    Any reason to spread this out over 4 cases? Why not just group them into 1? (You could use a backslash to escape the newline.)
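
    For example, the grouped form might look roughly like this (a sketch; as noted elsewhere in this thread, behavior can vary across Bash versions):

        # One case branch, with the pattern continued across lines via backslash
        # instead of four branches chained with the ";&" fall-through operator:
        case "$1" in
          --master | --deploy-mode | --class | --name | --jars | --py-files | --files | \
          --conf | --properties-file | --driver-memory | --driver-java-options | \
          --driver-library-path | --driver-class-path | --executor-memory | --driver-cores | \
          --total-executor-cores | --executor-cores | --queue | --num-executors | --archives)
            SUBMISSION_OPTS+=("$1" "$2")
            shift 2
            ;;
        esac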




[GitHub] spark pull request: [SPARK-2894][Core] Fixes spark-shell and pyspa...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1864#discussion_r16024855
  
    --- Diff: python/pyspark/java_gateway.py ---
    @@ -39,7 +39,7 @@ def launch_gateway():
             submit_args = os.environ.get("PYSPARK_SUBMIT_ARGS")
             submit_args = submit_args if submit_args is not None else ""
             submit_args = shlex.split(submit_args)
    -        command = [os.path.join(SPARK_HOME, script), "pyspark-shell"] + submit_args
    +        command = [os.path.join(SPARK_HOME, script)] + submit_args + ["pyspark-shell"]
    --- End diff --
    
    You could probably get them through `os.environ.get("SPARK_APPLICATION_ARGS")` here (or whatever you decide to call the environment variable)
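
    On the shell side, the handoff might look roughly like this (a sketch; `SPARK_APPLICATION_ARGS` is only the name suggested above, not an existing variable, and the PR may settle on something else):

        # Hypothetical: flatten the application options into one environment
        # variable that java_gateway.py could read back and shlex.split().
        SPARK_APPLICATION_ARGS=""
        for arg in "${APPLICATION_OPTS[@]}"; do
          SPARK_APPLICATION_ARGS="$SPARK_APPLICATION_ARGS \"$arg\""
        done
        export SPARK_APPLICATION_ARGS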




[GitHub] spark pull request: [SPARK-2894][Core] Fixes spark-shell and pyspa...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1864#discussion_r16024852
  
    --- Diff: python/pyspark/java_gateway.py ---
    @@ -39,7 +39,7 @@ def launch_gateway():
             submit_args = os.environ.get("PYSPARK_SUBMIT_ARGS")
             submit_args = submit_args if submit_args is not None else ""
             submit_args = shlex.split(submit_args)
    -        command = [os.path.join(SPARK_HOME, script), "pyspark-shell"] + submit_args
    +        command = [os.path.join(SPARK_HOME, script)] + submit_args + ["pyspark-shell"]
    --- End diff --
    
    I don't see how application args for the pyspark shell are handled here. Is this still WIP?




[GitHub] spark pull request: [SPARK-2894][Core] Fixes spark-shell and pyspa...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on the pull request:

    https://github.com/apache/spark/pull/1864#issuecomment-51679865
  
    Hi @liancheng, this looks pretty good. We should merge this quickly because the Spark shell on master is currently broken. However, I don't see a code path for handling application args for the pyspark shell. Do you intend to add that in this PR?




[GitHub] spark pull request: [SPARK-2894][Core] Fixes spark-shell and pyspa...

Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1864#discussion_r16024885
  
    --- Diff: bin/utils.sh ---
    @@ -0,0 +1,56 @@
    +#!/usr/bin/env bash
    +
    +#
    +# Licensed to the Apache Software Foundation (ASF) under one or more
    +# contributor license agreements.  See the NOTICE file distributed with
    +# this work for additional information regarding copyright ownership.
    +# The ASF licenses this file to You under the Apache License, Version 2.0
    +# (the "License"); you may not use this file except in compliance with
    +# the License.  You may obtain a copy of the License at
    +#
    +#    http://www.apache.org/licenses/LICENSE-2.0
    +#
    +# Unless required by applicable law or agreed to in writing, software
    +# distributed under the License is distributed on an "AS IS" BASIS,
    +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    +# See the License for the specific language governing permissions and
    +# limitations under the License.
    +#
    +
    +# Gather all all spark-submit options into SUBMISSION_OPTS
    +function gatherSparkSubmitOpts() {
    +  SUBMISSION_OPTS=()
    +  APPLICATION_OPTS=()
    --- End diff --
    
    I also used `SPARK_SUBMIT_ARGS` at first, but there is already a `SPARK_SUBMIT_OPTS` env var used in PySpark. The two could be confusing, which is why I settled on `SUBMISSION_OPTS` in the end.




[GitHub] spark pull request: [SPARK-2894][Core] Fixes spark-shell and pyspa...

Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1864#discussion_r16026126
  
    --- Diff: python/pyspark/java_gateway.py ---
    @@ -39,7 +39,7 @@ def launch_gateway():
             submit_args = os.environ.get("PYSPARK_SUBMIT_ARGS")
             submit_args = submit_args if submit_args is not None else ""
             submit_args = shlex.split(submit_args)
    -        command = [os.path.join(SPARK_HOME, script), "pyspark-shell"] + submit_args
    +        command = [os.path.join(SPARK_HOME, script)] + submit_args + ["pyspark-shell"]
    --- End diff --
    
    Currently, if we call something like `bin/pyspark app.py --name "awesome app"`, `bin/pyspark` delegates directly to `spark-submit` and doesn't go down this code path. This implies that `submit_args` can only contain `spark-submit` options.
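
    For context, the delegation in `bin/pyspark` looks roughly like this (a simplified sketch of the deprecated code path, not the verbatim script):

        # When the first argument is a Python file, the deprecated `pyspark app.py`
        # style hands everything straight to spark-submit, so launch_gateway()
        # (and PYSPARK_SUBMIT_ARGS) is never involved.
        if [[ "$1" == *.py ]]; then
          echo "WARNING: Running Python applications through ./bin/pyspark is deprecated." 1>&2
          exec "$FWDIR"/bin/spark-submit "$@"
        fi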




[GitHub] spark pull request: [SPARK-2894][Core] Fixes spark-shell and pyspa...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1864#issuecomment-51674246
  
    QA results for PR 1864:
    - This patch PASSES unit tests.
    - This patch merges cleanly.
    - This patch adds the following public classes (experimental):
      $FWDIR/bin/spark-submit --class org.apache.spark.repl.Main ${SUBMISSION_OPTS[@]} spark-shell ${APPLICATION_OPTS[@]}
      $FWDIR/bin/spark-submit --class org.apache.spark.repl.Main ${SUBMISSION_OPTS[@]} spark-shell ${APPLICATION_OPTS[@]}

    For more information see test output:
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18237/consoleFull


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-2894][Core] Fixes spark-shell and pyspa...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/1864




[GitHub] spark pull request: [SPARK-2894][Core] Fixes spark-shell and pyspa...

Posted by sarutak <gi...@git.apache.org>.
Github user sarutak commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1864#discussion_r16025891
  
    --- Diff: bin/utils.sh ---
    @@ -0,0 +1,56 @@
    +#!/usr/bin/env bash
    +
    +#
    +# Licensed to the Apache Software Foundation (ASF) under one or more
    +# contributor license agreements.  See the NOTICE file distributed with
    +# this work for additional information regarding copyright ownership.
    +# The ASF licenses this file to You under the Apache License, Version 2.0
    +# (the "License"); you may not use this file except in compliance with
    +# the License.  You may obtain a copy of the License at
    +#
    +#    http://www.apache.org/licenses/LICENSE-2.0
    +#
    +# Unless required by applicable law or agreed to in writing, software
    +# distributed under the License is distributed on an "AS IS" BASIS,
    +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    +# See the License for the specific language governing permissions and
    +# limitations under the License.
    +#
    +
    +# Gather all all spark-submit options into SUBMISSION_OPTS
    +function gatherSparkSubmitOpts() {
    +  SUBMISSION_OPTS=()
    +  APPLICATION_OPTS=()
    +  while (($#)); do
    +    case $1 in
    +      --master | --deploy-mode | --class | --name | --jars | --py-files | --files)
    +        ;&
    +
    +      --conf | --properties-file | --driver-memory | --driver-java-options)
    +        ;&
    +
    +      --driver-library-path | --driver-class-path | --executor-memory | --driver-cores)
    +        ;&
    +
    +      --total-executor-cores | --executor-cores | --queue | --num-executors | --archives)
    --- End diff --
    
    Let me make a few corrections.
    This code works on Bash 4.3.0 and 4.1.2 on CentOS 6 but doesn't work on 3.2.51 in Mac OS X Mavericks (the `;&` fall-through used here requires Bash 4.0 or later).

    #1825 still works on 4.3.0, 4.1.2, and 3.2.51.

    If #1825 doesn't work on 4.3.8, some workarounds may be needed.




[GitHub] spark pull request: [SPARK-2894][Core] Fixes spark-shell and pyspa...

Posted by sarutak <gi...@git.apache.org>.
Github user sarutak commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1864#discussion_r16025795
  
    --- Diff: bin/utils.sh ---
    @@ -0,0 +1,56 @@
    +#!/usr/bin/env bash
    +
    +#
    +# Licensed to the Apache Software Foundation (ASF) under one or more
    +# contributor license agreements.  See the NOTICE file distributed with
    +# this work for additional information regarding copyright ownership.
    +# The ASF licenses this file to You under the Apache License, Version 2.0
    +# (the "License"); you may not use this file except in compliance with
    +# the License.  You may obtain a copy of the License at
    +#
    +#    http://www.apache.org/licenses/LICENSE-2.0
    +#
    +# Unless required by applicable law or agreed to in writing, software
    +# distributed under the License is distributed on an "AS IS" BASIS,
    +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    +# See the License for the specific language governing permissions and
    +# limitations under the License.
    +#
    +
    +# Gather all all spark-submit options into SUBMISSION_OPTS
    +function gatherSparkSubmitOpts() {
    +  SUBMISSION_OPTS=()
    +  APPLICATION_OPTS=()
    +  while (($#)); do
    +    case $1 in
    +      --master | --deploy-mode | --class | --name | --jars | --py-files | --files)
    +        ;&
    +
    +      --conf | --properties-file | --driver-memory | --driver-java-options)
    +        ;&
    +
    +      --driver-library-path | --driver-class-path | --executor-memory | --driver-cores)
    +        ;&
    +
    +      --total-executor-cores | --executor-cores | --queue | --num-executors | --archives)
    --- End diff --
    
    This code didn't work in GNU Bash 4.3.0 or 4.1.2. I guess you use BSD, right? I think this is a BSD-specific issue.
    I re-PRed #1825 and modified it to use backslash-based line continuations, and that worked in GNU Bash 4.3.0 and 4.1.2.

    @liancheng can you check whether that works or not in 4.3.8?




[GitHub] spark pull request: [SPARK-2894][Core] Fixes spark-shell and pyspa...

Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1864#discussion_r16026217
  
    --- Diff: bin/spark-shell ---
    @@ -46,11 +49,11 @@ function main(){
             # (see https://github.com/sbt/sbt/issues/562).
             stty -icanon min 1 -echo > /dev/null 2>&1
             export SPARK_SUBMIT_OPTS="$SPARK_SUBMIT_OPTS -Djline.terminal=unix"
    -        $FWDIR/bin/spark-submit --class org.apache.spark.repl.Main spark-shell "$@"
    +        $FWDIR/bin/spark-submit --class org.apache.spark.repl.Main ${SUBMISSION_OPTS[@]} spark-shell ${APPLICATION_OPTS[@]}
    --- End diff --
    
    Confirmed it doesn't; we need to add logic similar to the `for` loop in `pyspark` to handle quoted arguments.
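
    For reference, a sketch of the kind of re-quoting loop `bin/pyspark` uses (illustrative; variable names and details may differ from the actual script):

        # Re-quote each submission option so values with spaces survive being
        # packed into a single environment variable (later undone by shlex.split
        # in java_gateway.py).
        PYSPARK_SUBMIT_ARGS=""
        for opt in "${SUBMISSION_OPTS[@]}"; do
          PYSPARK_SUBMIT_ARGS="$PYSPARK_SUBMIT_ARGS \"$opt\""
        done
        export PYSPARK_SUBMIT_ARGS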




[GitHub] spark pull request: [SPARK-2894][Core] Fixes spark-shell and pyspa...

Posted by sarutak <gi...@git.apache.org>.
Github user sarutak commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1864#discussion_r16025805
  
    --- Diff: bin/utils.sh ---
    @@ -0,0 +1,56 @@
    +#!/usr/bin/env bash
    +
    +#
    +# Licensed to the Apache Software Foundation (ASF) under one or more
    +# contributor license agreements.  See the NOTICE file distributed with
    +# this work for additional information regarding copyright ownership.
    +# The ASF licenses this file to You under the Apache License, Version 2.0
    +# (the "License"); you may not use this file except in compliance with
    +# the License.  You may obtain a copy of the License at
    +#
    +#    http://www.apache.org/licenses/LICENSE-2.0
    +#
    +# Unless required by applicable law or agreed to in writing, software
    +# distributed under the License is distributed on an "AS IS" BASIS,
    +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    +# See the License for the specific language governing permissions and
    +# limitations under the License.
    +#
    +
    +# Gather all all spark-submit options into SUBMISSION_OPTS
    +function gatherSparkSubmitOpts() {
    +  SUBMISSION_OPTS=()
    +  APPLICATION_OPTS=()
    +  while (($#)); do
    +    case $1 in
    +      --master | --deploy-mode | --class | --name | --jars | --py-files | --files)
    +        ;&
    +
    +      --conf | --properties-file | --driver-memory | --driver-java-options)
    +        ;&
    +
    +      --driver-library-path | --driver-class-path | --executor-memory | --driver-cores)
    +        ;&
    +
    +      --total-executor-cores | --executor-cores | --queue | --num-executors | --archives)
    +        if [[ $# -lt 2 ]]; then
    +          usage
    --- End diff --
    
    utils.sh expects the scripts that use it to implement usage(), but this requirement is implicit.
    In #1825, a new commit addresses this issue: the newer utils.sh forces scripts to implement a usage function.
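
    A sketch of what such an explicit check might look like (hypothetical; not necessarily the exact code in #1825):

        # Fail fast if the sourcing script hasn't defined a usage() function,
        # instead of failing later with "usage: command not found".
        if ! declare -F usage > /dev/null; then
          echo "Scripts sourcing utils.sh must define a usage() function." 1>&2
          exit 1
        fi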




[GitHub] spark pull request: [SPARK-2894][Core] Fixes spark-shell and pyspa...

Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1864#discussion_r16024934
  
    --- Diff: bin/utils.sh ---
    @@ -0,0 +1,56 @@
    +#!/usr/bin/env bash
    +
    +#
    +# Licensed to the Apache Software Foundation (ASF) under one or more
    +# contributor license agreements.  See the NOTICE file distributed with
    +# this work for additional information regarding copyright ownership.
    +# The ASF licenses this file to You under the Apache License, Version 2.0
    +# (the "License"); you may not use this file except in compliance with
    +# the License.  You may obtain a copy of the License at
    +#
    +#    http://www.apache.org/licenses/LICENSE-2.0
    +#
    +# Unless required by applicable law or agreed to in writing, software
    +# distributed under the License is distributed on an "AS IS" BASIS,
    +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    +# See the License for the specific language governing permissions and
    +# limitations under the License.
    +#
    +
    +# Gather all all spark-submit options into SUBMISSION_OPTS
    +function gatherSparkSubmitOpts() {
    +  SUBMISSION_OPTS=()
    +  APPLICATION_OPTS=()
    +  while (($#)); do
    +    case $1 in
    +      --master | --deploy-mode | --class | --name | --jars | --py-files | --files)
    +        ;&
    +
    +      --conf | --properties-file | --driver-memory | --driver-java-options)
    +        ;&
    +
    +      --driver-library-path | --driver-class-path | --executor-memory | --driver-cores)
    +        ;&
    +
    +      --total-executor-cores | --executor-cores | --queue | --num-executors | --archives)
    --- End diff --
    
    Just to keep the line from getting too long. I tried escaping the newline with a backslash; at least it doesn't work in Bash 4.3.8 :(




[GitHub] spark pull request: [SPARK-2894][Core] Fixes spark-shell and pyspa...

Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1864#discussion_r16026061
  
    --- Diff: bin/utils.sh ---
    @@ -0,0 +1,56 @@
    +#!/usr/bin/env bash
    +
    +#
    +# Licensed to the Apache Software Foundation (ASF) under one or more
    +# contributor license agreements.  See the NOTICE file distributed with
    +# this work for additional information regarding copyright ownership.
    +# The ASF licenses this file to You under the Apache License, Version 2.0
    +# (the "License"); you may not use this file except in compliance with
    +# the License.  You may obtain a copy of the License at
    +#
    +#    http://www.apache.org/licenses/LICENSE-2.0
    +#
    +# Unless required by applicable law or agreed to in writing, software
    +# distributed under the License is distributed on an "AS IS" BASIS,
    +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    +# See the License for the specific language governing permissions and
    +# limitations under the License.
    +#
    +
    +# Gather all all spark-submit options into SUBMISSION_OPTS
    +function gatherSparkSubmitOpts() {
    +  SUBMISSION_OPTS=()
    +  APPLICATION_OPTS=()
    +  while (($#)); do
    +    case $1 in
    +      --master | --deploy-mode | --class | --name | --jars | --py-files | --files)
    +        ;&
    +
    +      --conf | --properties-file | --driver-memory | --driver-java-options)
    +        ;&
    +
    +      --driver-library-path | --driver-class-path | --executor-memory | --driver-cores)
    +        ;&
    +
    +      --total-executor-cores | --executor-cores | --queue | --num-executors | --archives)
    +        if [[ $# -lt 2 ]]; then
    +          usage
    --- End diff --
    
    My fault, I forgot about the `usage` function at first...




[GitHub] spark pull request: [SPARK-2894][Core] Fixes spark-shell and pyspa...

Posted by sarutak <gi...@git.apache.org>.
Github user sarutak commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1864#discussion_r16025808
  
    --- Diff: python/pyspark/java_gateway.py ---
    @@ -39,7 +39,7 @@ def launch_gateway():
             submit_args = os.environ.get("PYSPARK_SUBMIT_ARGS")
             submit_args = submit_args if submit_args is not None else ""
             submit_args = shlex.split(submit_args)
    -        command = [os.path.join(SPARK_HOME, script), "pyspark-shell"] + submit_args
    +        command = [os.path.join(SPARK_HOME, script)] + submit_args + ["pyspark-shell"]
    --- End diff --
    
    I also modified this in #1825 .




[GitHub] spark pull request: [SPARK-2894][Core] Fixes spark-shell and pyspa...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1864#issuecomment-51672988
  
    QA tests have started for PR 1864. This patch merges cleanly.
    View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18237/consoleFull




[GitHub] spark pull request: [SPARK-2894][Core] Fixes spark-shell and pyspa...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1864#discussion_r16024832
  
    --- Diff: bin/utils.sh ---
    @@ -0,0 +1,56 @@
    +#!/usr/bin/env bash
    +
    +#
    +# Licensed to the Apache Software Foundation (ASF) under one or more
    +# contributor license agreements.  See the NOTICE file distributed with
    +# this work for additional information regarding copyright ownership.
    +# The ASF licenses this file to You under the Apache License, Version 2.0
    +# (the "License"); you may not use this file except in compliance with
    +# the License.  You may obtain a copy of the License at
    +#
    +#    http://www.apache.org/licenses/LICENSE-2.0
    +#
    +# Unless required by applicable law or agreed to in writing, software
    +# distributed under the License is distributed on an "AS IS" BASIS,
    +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    +# See the License for the specific language governing permissions and
    +# limitations under the License.
    +#
    +
    +# Gather all all spark-submit options into SUBMISSION_OPTS
    +function gatherSparkSubmitOpts() {
    +  SUBMISSION_OPTS=()
    +  APPLICATION_OPTS=()
    +  while (($#)); do
    +    case $1 in
    +      --master | --deploy-mode | --class | --name | --jars | --py-files | --files)
    +        ;&
    +
    +      --conf | --properties-file | --driver-memory | --driver-java-options)
    +        ;&
    +
    +      --driver-library-path | --driver-class-path | --executor-memory | --driver-cores)
    +        ;&
    +
    +      --total-executor-cores | --executor-cores | --queue | --num-executors | --archives)
    +        if [[ $# -lt 2 ]]; then
    +          usage
    --- End diff --
    
    Maybe I'm missing something, but where does `usage` come from?




[GitHub] spark pull request: [SPARK-2894][Core] Fixes spark-shell and pyspa...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1864#discussion_r16027252
  
    --- Diff: python/pyspark/java_gateway.py ---
    @@ -39,7 +39,7 @@ def launch_gateway():
             submit_args = os.environ.get("PYSPARK_SUBMIT_ARGS")
             submit_args = submit_args if submit_args is not None else ""
             submit_args = shlex.split(submit_args)
    -        command = [os.path.join(SPARK_HOME, script), "pyspark-shell"] + submit_args
    +        command = [os.path.join(SPARK_HOME, script)] + submit_args + ["pyspark-shell"]
    --- End diff --
    
    @liancheng Ah yes, you're right, this doesn't actually do anything, because the main class for the pyspark shell is `py4j.JavaGateway`, which is not interested in IPython arguments like `notebook`.




[GitHub] spark pull request: [SPARK-2894][Core] Fixes spark-shell and pyspa...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1864#discussion_r16027127
  
    --- Diff: python/pyspark/java_gateway.py ---
    @@ -39,7 +39,7 @@ def launch_gateway():
             submit_args = os.environ.get("PYSPARK_SUBMIT_ARGS")
             submit_args = submit_args if submit_args is not None else ""
             submit_args = shlex.split(submit_args)
    -        command = [os.path.join(SPARK_HOME, script), "pyspark-shell"] + submit_args
    +        command = [os.path.join(SPARK_HOME, script)] + submit_args + ["pyspark-shell"]
    --- End diff --
    
    @liancheng Yes, `bin/pyspark` with a Python file doesn't go down here, but this code path is mainly for the pyspark shell, where the user may specify IPython with arguments, e.g. `notebook` or `--pylab`. In this case we need to put these application arguments after `pyspark-shell`, otherwise they are never handled.

