Posted to reviews@spark.apache.org by liancheng <gi...@git.apache.org> on 2014/08/06 06:24:06 UTC

[GitHub] spark pull request: [SPARK-2874][SQL] Fixed usage messages of all ...

GitHub user liancheng opened a pull request:

    https://github.com/apache/spark/pull/1801

    [SPARK-2874][SQL] Fixed usage messages of all Spark SQL related scripts

    JIRA issue: [SPARK-2874](https://issues.apache.org/jira/browse/SPARK-2874)

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/liancheng/spark spark-2874

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/1801.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1801
    
----
commit daee105e56cde9969e162576aa6d18a446c00c1a
Author: Cheng Lian <li...@gmail.com>
Date:   2014-08-06T04:15:43Z

    Fixed usage messages of all Spark SQL related scripts

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-2678][Core][SQL] A workaround for SPARK...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1801#issuecomment-51303836
  
    QA results for PR 1801:
    - This patch PASSES unit tests.
    - This patch merges cleanly
    - This patch adds no public classes

    For more information see test output:
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18003/consoleFull




[GitHub] spark pull request: [SPARK-2678][Core][SQL] A workaround for SPARK...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on the pull request:

    https://github.com/apache/spark/pull/1801#issuecomment-51558315
  
    Yeah, I agree that this break in backward compatibility is pretty bad. At the same time we don't want to introduce some new config (e.g. the proposals in #1715) until we settle on a reasonably intuitive one. Otherwise we may have to support a clumsy way of passing arguments forever. Fortunately the documented way is not affected. It just so happens that our existing implementation is much more flexible than we intended it to be.




[GitHub] spark pull request: [SPARK-2874][SQL] Fixed usage messages of all ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1801#issuecomment-51298082
  
    QA tests have started for PR 1801. This patch merges cleanly.
    View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18003/consoleFull




[GitHub] spark pull request: [SPARK-2678][Core][SQL] A workaround for SPARK...

Posted by concretevitamin <gi...@git.apache.org>.
Github user concretevitamin commented on the pull request:

    https://github.com/apache/spark/pull/1801#issuecomment-51533306
  
    Oh, it's been reported by #1825.




[GitHub] spark pull request: [SPARK-2874][SQL] Fixed usage messages of all ...

Posted by chenghao-intel <gi...@git.apache.org>.
Github user chenghao-intel commented on the pull request:

    https://github.com/apache/spark/pull/1801#issuecomment-51294164
  
    @liancheng thanks for doing this. I don't have many comments on it, but I do have some other questions from the user's perspective:
    1) Shall we set the default log level to `WARN` instead of `INFO` when running `./bin/spark-sql`? The output currently seems too verbose for a normal user.
    2) Shall we also provide default `hive-site.xml.template` and `hive-log4j.properties.template` files under `conf/`?
    3) I also noticed that the package made by `make-distribution.sh` only contains the assembly jars under the `lib` folder. In my cluster, I'd like to use Spark SQL with a Hive metastore database installed on MySQL, so I need to put additional jars (the MySQL driver and DataNucleus) into `lib` manually. Currently it seems I also have to change `bin/compute-classpath.sh` to make those new jars effective on the classpath. I'm not sure whether there is a better way to configure this.
    
    Sorry, those questions are not related to this PR.




[GitHub] spark pull request: [SPARK-2678][Core][SQL] A workaround for SPARK...

Posted by concretevitamin <gi...@git.apache.org>.
Github user concretevitamin commented on the pull request:

    https://github.com/apache/spark/pull/1801#issuecomment-51533012
  
    @liancheng @andrewor14 @pwendell With this patch, things like `./bin/spark-shell --master local[2]` error out ("bad options: --master"). I had to work around this with an extra "--" before the flag. Is this intended?




[GitHub] spark pull request: [SPARK-2874][SQL] Fixed usage messages of all ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1801#issuecomment-51298396
  
    QA tests have started for PR 1801. This patch merges cleanly.
    View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18006/consoleFull




[GitHub] spark pull request: [SPARK-2678][Core][SQL] A workaround for SPARK...

Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on the pull request:

    https://github.com/apache/spark/pull/1801#issuecomment-51299909
  
    Hey @chenghao-intel, answers to your questions:
    
    1. Actually `bin/spark-sql` doesn't output lots of logs by default. Did you set something like `hive.root.logger=INFO,console` in your `hive-site.xml` or elsewhere (similar to what `shark-withinfo` does in Shark)?
    2. Hmm... I'm not very sure about this. Usually people just copy their `hive-site.xml` and `hive-log4j.properties` from their existing Hive installation. Maybe we can do it in another PR. One thing to note: [`hive-default.xml.template` in Hive 0.12 isn't valid XML](https://github.com/apache/hive/blob/release-0.12.0/conf/hive-default.xml.template#L2000) and needs some minor tweaks before being added.
    3. Need some more time to investigate this one. We can discuss this offline.




[GitHub] spark pull request: [SPARK-2678][Core][SQL] A workaround for SPARK...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the pull request:

    https://github.com/apache/spark/pull/1801#issuecomment-51545117
  
    This change broke backwards compatibility with existing spark-submit commands. e.g., this used to work fine and doesn't anymore:
    
        spark-submit myapp.jar --master yarn-client --class my.App
    
    You have to change it to:
    
        spark-submit --master yarn-client myapp.jar 
    
    I don't think that's acceptable. Lots of users will be tripped up by that, especially since the error message is less than helpful.




[GitHub] spark pull request: [SPARK-2678][Core][SQL] A workaround for SPARK...

Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on the pull request:

    https://github.com/apache/spark/pull/1801#issuecomment-51557390
  
    @vanzin I understand what you're worried about :) Actually in #1715 we've roughly reached a cleaner solution (see the comments) than this one, but some details haven't been quite nailed down yet, and it's already close to the 1.1 release. So this PR is just a not-so-elegant fix for SPARK-2678 that aims to be compatible with the *spec* at the cost of some clarity in the bash scripts. We'll do it again the right way, as Matei, Patrick and I discussed in #1715, once the details are nailed down.




[GitHub] spark pull request: [SPARK-2678][Core][SQL] A workaround for SPARK...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/1801




[GitHub] spark pull request: [SPARK-2678][Core][SQL] A workaround for SPARK...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1801#issuecomment-51362332
  
    QA tests have started for PR 1801. This patch merges cleanly.
    View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18028/consoleFull




[GitHub] spark pull request: [SPARK-2678][Core][SQL] A workaround for SPARK...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the pull request:

    https://github.com/apache/spark/pull/1801#issuecomment-51546605
  
    I see what you're saying, but what the docs say and what the code does are different things... well, let's see how many people are tripped up by this once this change goes out.
    
    A better error message, at least, would go a long way. Currently it says "Cannot load main class" and tells you to add "--verbose". But if you add "--verbose" at the end, you get the same error message... so nothing is clarified.
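
As an illustration of the kind of error message asked for above, here is a minimal Python sketch. It is not Spark code: the function name `misplaced_option_hint` and the option subset `SUBMIT_OPTIONS` are hypothetical, and the real `spark-submit` recognizes many more options.

```python
from typing import List, Optional

# Illustrative subset of spark-submit option names (an assumption for this
# sketch, not the full list the real script accepts).
SUBMIT_OPTIONS = {"--master", "--class", "--deploy-mode", "--verbose"}


def misplaced_option_hint(primary: str, app_args: List[str]) -> Optional[str]:
    """Return a hint if a spark-submit option appears after the primary resource."""
    misplaced = [arg for arg in app_args if arg in SUBMIT_OPTIONS]
    if not misplaced:
        return None
    return (
        f"Options {', '.join(misplaced)} appear after '{primary}' and are "
        "passed to the application, not to spark-submit. "
        f"Move them before '{primary}'."
    )
```

Called with `"myapp.jar"` and `["--master", "yarn-client", "--class", "my.App"]`, the function names both misplaced options in its hint; with ordinary application arguments it returns `None`.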




[GitHub] spark pull request: [SPARK-2874][SQL] Fixed usage messages of all ...

Posted by pwendell <gi...@git.apache.org>.
Github user pwendell commented on the pull request:

    https://github.com/apache/spark/pull/1801#issuecomment-51299486
  
    Thanks Cheng. This LGTM pending tests.




[GitHub] spark pull request: [SPARK-2678][Core][SQL] A workaround for SPARK...

Posted by pwendell <gi...@git.apache.org>.
Github user pwendell commented on the pull request:

    https://github.com/apache/spark/pull/1801#issuecomment-51384413
  
    Thanks Cheng! I'm gonna merge this.




[GitHub] spark pull request: [SPARK-2678][Core][SQL] A workaround for SPARK...

Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on the pull request:

    https://github.com/apache/spark/pull/1801#issuecomment-51546254
  
    @vanzin The point here is that the only valid way to call `spark-submit`, according to both our documentation and the usage message `spark-submit` prints, is
    
    ```
    spark-submit [options] <app jar | python file> [app options]
    ```
    
    And we decided to be compatible only with this spec we have at hand. Existing scripts that rely on this bug (i.e., putting `--master` after the primary resource) should be fixed, and we're not going to be compatible with them. We do plan to provide a cleaner `spark-submit` interface in future releases.
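
The usage spec quoted in this comment can be sketched as a small argument splitter. This is a minimal Python illustration, not the actual bash or Scala implementation: the function name `split_submit_args` and the `VALUE_OPTIONS` subset are assumptions made for the example.

```python
from typing import List, Tuple

# Illustrative subset of spark-submit options that take a value (an
# assumption for this sketch; the real script recognizes many more).
VALUE_OPTIONS = {"--master", "--class", "--name", "--deploy-mode"}


def split_submit_args(argv: List[str]) -> Tuple[List[str], str, List[str]]:
    """Split argv into (submit options, primary resource, app options)."""
    options: List[str] = []
    i = 0
    while i < len(argv):
        arg = argv[i]
        if arg in VALUE_OPTIONS:
            if i + 1 >= len(argv):
                raise ValueError(f"missing value for {arg}")
            options += [arg, argv[i + 1]]
            i += 2
        elif arg.startswith("--"):
            # Treated as a value-less flag such as --verbose.
            options.append(arg)
            i += 1
        else:
            # The first non-option token is the primary resource. Everything
            # after it belongs to the application, even if it looks like a flag.
            return options, arg, argv[i + 1:]
    raise ValueError("no primary resource given")
```

Under this reading, `myapp.jar --master yarn-client --class my.App` yields no submit options at all: `--master` and `--class` land in the application's own arguments, which matches the behavior change discussed in this thread.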




[GitHub] spark pull request: [SPARK-2678][Core][SQL] A workaround for SPARK...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1801#issuecomment-51303827
  
    QA results for PR 1801:
    - This patch PASSES unit tests.
    - This patch merges cleanly
    - This patch adds no public classes

    For more information see test output:
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18006/consoleFull

