Posted to reviews@spark.apache.org by srowen <gi...@git.apache.org> on 2014/08/18 15:02:17 UTC

[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

GitHub user srowen opened a pull request:

    https://github.com/apache/spark/pull/2014

    SPARK-3069 [DOCS] Build instructions in README are outdated

    Here's my crack at Bertrand's suggestion. The Github `README.md` contains build info that's outdated. It should just point to the current online docs, and reflect that Maven is the primary build now.
    
    (Incidentally, the stanza at the end about contributions of original work should go in https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark too. It won't hurt to be crystal clear about the agreement to license, given that ICLAs are not required of anyone here.)
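
    For reference, the build command in the updated `README.md` (taken from the diff in this pull request) is the standard Maven package step:

        # builds Spark and its example programs, skipping tests
        mvn -DskipTests clean package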

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/srowen/spark SPARK-3069

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/2014.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2014
    
----
commit 5c02e40097e9c93e86a0665c02f269a0e5106de8
Author: Sean Owen <sr...@gmail.com>
Date:   2014-08-18T12:57:53Z

    Refer to current online documentation for building, and remove slightly outdated copy in README.md

----




[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2014#issuecomment-52911412
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19051/consoleFull) for   PR 2014 at commit [`3a2bcad`](https://github.com/apache/spark/commit/3a2bcadffd59d2575695a6f62bbab49e4ef46273).
     * This patch **fails** unit tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `In multiclass classification, all `$2^`
      * `public final class JavaDecisionTree `





Re: [GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

Posted by Sean Owen <so...@cloudera.com>.
I imagine the new site hasn't been pushed yet. Yeah, the README.md has
the new links immediately, though. It's minor and temporary, since I
believe the site was going to be updated to fix that 1.1.0-SNAPSHOT ref
anyway.

On Tue, Sep 16, 2014 at 5:24 PM, nchammas <gi...@git.apache.org> wrote:
> Github user nchammas commented on the pull request:
>
>     https://github.com/apache/spark/pull/2014#issuecomment-55770066
>
>     FYI: This page is 404-ing: http://spark.apache.org/docs/latest/building-spark.html
>
>     Is that temporary?
>
>


[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

Posted by nchammas <gi...@git.apache.org>.
Github user nchammas commented on the pull request:

    https://github.com/apache/spark/pull/2014#issuecomment-55770066
  
    FYI: This page is 404-ing: http://spark.apache.org/docs/latest/building-spark.html
    
    Is that temporary?




[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2014#issuecomment-53148101
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19103/consoleFull) for   PR 2014 at commit [`5c6b814`](https://github.com/apache/spark/commit/5c6b8144765e4810b5d8995e06498c14ceba844d).
     * This patch merges cleanly.




[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

Posted by nchammas <gi...@git.apache.org>.
Github user nchammas commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2014#discussion_r16796890
  
    --- Diff: CONTRIBUTING.md ---
    @@ -0,0 +1,12 @@
    +## Contributing to Spark
    --- End diff --
    
    @srowen I think it's a good idea to have a CONTRIBUTING file here, for the reasons you explained elsewhere, like [this one](https://github.com/blog/1184-contributing-guidelines). I'd actually favor moving the [Contributing to Spark](https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark) page entirely out of the wiki and into here. 
    
    I believe Spark accepts contributions entirely through GitHub, so it makes sense to have the contributing instructions live here.




[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2014#issuecomment-55386674
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20226/consoleFull) for   PR 2014 at commit [`be82027`](https://github.com/apache/spark/commit/be8202712cca2429be3cf3fd2cfa27cf2a684b18).
     * This patch merges cleanly.




[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

Posted by pwendell <gi...@git.apache.org>.
Github user pwendell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2014#discussion_r17583504
  
    --- Diff: docs/_config.yml ---
    @@ -1,5 +1,7 @@
    -pygments: true
    +highlighter: pygments
     markdown: kramdown
    +gems:
    +  - jekyll-redirect-from
    --- End diff --
    
    Does this config mean that users don't need to install the gem manually?
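
    A hedged note on the question above: with the Jekyll versions in use at the time, the `gems:` entry in `_config.yml` only tells Jekyll which plugin gems to load; the gem itself still has to be available locally (installed directly or via Bundler), for example:

        # assumes a working Ruby/RubyGems setup; installs the redirect plugin the docs build loads
        gem install jekyll-redirect-from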




[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/2014




[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

Posted by nchammas <gi...@git.apache.org>.
Github user nchammas commented on the pull request:

    https://github.com/apache/spark/pull/2014#issuecomment-53775856
  
    > Is the net conclusion that README.md should use Maven if anything?
    
    Not sure. It sounds like Maven is indeed the official standard for building Spark, but we do want to document the `sbt` instructions somewhere. Dunno if the README is that place.
    
    > but then I can't remove the wiki page and it ends up duplicated again
    
    I think if Patrick / Reynold / Spark committers agree that the CONTRIBUTING file would be better placed on GitHub, they can have the wiki page removed in short order.




[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2014#issuecomment-53149241
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19103/consoleFull) for   PR 2014 at commit [`5c6b814`](https://github.com/apache/spark/commit/5c6b8144765e4810b5d8995e06498c14ceba844d).
     * This patch **passes** unit tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `In multiclass classification, all `$2^`
      * `public final class JavaDecisionTree `





[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

Posted by nchammas <gi...@git.apache.org>.
Github user nchammas commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2014#discussion_r17668529
  
    --- Diff: CONTRIBUTING.md ---
    @@ -0,0 +1,12 @@
    +## Contributing to Spark
    --- End diff --
    
    Having this is pretty nice! I like the banner you get when opening a new PR.




[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the pull request:

    https://github.com/apache/spark/pull/2014#issuecomment-53717258
  
    @nchammas @pwendell Is the net conclusion that `README.md` should use Maven if anything?
    I'd be happy to move the wiki into `CONTRIBUTING.md`, but then I can't remove the wiki page and it ends up duplicated again. Maybe it's fine as is, and the important change is getting the file in place to trigger the prompt on the PR screen. If so, then I think this is still ready for review/merge as you all see fit.




[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

Posted by pwendell <gi...@git.apache.org>.
Github user pwendell commented on the pull request:

    https://github.com/apache/spark/pull/2014#issuecomment-54103445
  
    @srowen what about adding a redirect? As a stupid-simple approach, we could just keep the old page there and do a JavaScript-based redirect by setting `window.location`. This is an important enough page that it could be worth some manual redirection for a few versions of Spark. There might also be Jekyll ways of doing redirection if you are feeling adventurous.
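
    As a rough sketch of the approach described above (the layout and title are illustrative, not taken from this PR), the old page could be kept as a stub whose only job is to send the browser to the new location:

        ---
        layout: global
        title: Building Spark with Maven
        ---
        <script>window.location = "building-spark.html";</script>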




[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2014#issuecomment-53147889
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19102/consoleFull) for   PR 2014 at commit [`7aa045e`](https://github.com/apache/spark/commit/7aa045e881cda1d51c90040e5917839aee085b05).
     * This patch merges cleanly.




[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

Posted by pwendell <gi...@git.apache.org>.
Github user pwendell commented on the pull request:

    https://github.com/apache/spark/pull/2014#issuecomment-55695975
  
    LGTM pending one minor comment.




[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

Posted by nchammas <gi...@git.apache.org>.
Github user nchammas commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2014#discussion_r17458717
  
    --- Diff: README.md ---
    @@ -13,16 +13,19 @@ and Spark Streaming.
     ## Online Documentation
     
     You can find the latest Spark documentation, including a programming
    -guide, on the project webpage at <http://spark.apache.org/documentation.html>.
    +guide, on the [project web page](http://spark.apache.org/documentation.html).
     This README file only contains basic setup instructions.
     
     ## Building Spark
     
    -Spark is built on Scala 2.10. To build Spark and its example programs, run:
    +Spark is built using [Apache Maven](http://maven.apache.org/).
    +To build Spark and its example programs, run:
     
    -    ./sbt/sbt assembly
    +    mvn -DskipTests clean package
    --- End diff --
    
    Yeah, we cleared this up in the [main discussion thread](https://github.com/apache/spark/pull/2014#issuecomment-53646263). I stand corrected.




[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2014#discussion_r17546414
  
    --- Diff: docs/building-spark.md ---
    @@ -159,4 +160,13 @@ then ship it over to the cluster. We are investigating the exact cause for this.
     
     The assembly jar produced by `mvn package` will, by default, include all of Spark's dependencies, including Hadoop and some of its ecosystem projects. On YARN deployments, this causes multiple versions of these to appear on executor classpaths: the version packaged in the Spark assembly and the version on each node, included with yarn.application.classpath.  The `hadoop-provided` profile builds the assembly without including Hadoop-ecosystem projects, like ZooKeeper and Hadoop itself. 
     
    +# Building with SBT
     
    +Maven is the official recommendation for packaging Spark, and is the "build of reference".
    +But SBT is supported for day-to-day development since it can provide much faster iterative
    +compilation. More advanced developers may wish to use SBT.
    +
    +The SBT build is derived from the Maven POM files, and so the same Maven profiles and variables
    +can be set to control the SBT build. For example:
    +
    +    sbt -Pyarn -Phadoop-2.3 compile
    --- End diff --
    
    I think the goal here is just to give a taste, assuming the advanced developer will understand and figure out the rest if needed. Happy to make further edits, though; for example, should we still suggest `./sbt/sbt` instead of a local `sbt`?
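
    For what it's worth, both forms should take the same Maven-derived profiles and properties; a minimal sketch (the profile choice is just the example already used in this thread):

        # the wrapper script shipped in the repo
        ./sbt/sbt -Pyarn -Phadoop-2.3 compile

        # or a locally installed sbt
        sbt -Pyarn -Phadoop-2.3 compile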




[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the pull request:

    https://github.com/apache/spark/pull/2014#issuecomment-52912040
  
    Ha, false positive. It picked up a line of text in `README.md` that contains a class declaration. Maybe the checker could skip files with known non-source extensions. The unit test failure can't be related, as this is just a doc change.
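
    To illustrate the suggestion above (purely hypothetical; this is not how the Jenkins checker actually works), the scan for new classes could be limited to source files by extension:

        # hypothetical filter: only look at Scala/Java sources in the PR diff, not .md files
        git diff --name-only master... | grep -E '\.(scala|java)$' | xargs grep -n 'class '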




[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

Posted by nchammas <gi...@git.apache.org>.
Github user nchammas commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2014#discussion_r16936253
  
    --- Diff: CONTRIBUTING.md ---
    @@ -0,0 +1,12 @@
    +## Contributing to Spark
    --- End diff --
    
    @pwendell Perhaps for a future PR: What do you think about removing the contributing guide from the wiki and having it live exclusively on GitHub? Seems like a better home for it since GitHub is the only way we accept contributions.




[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

Posted by pwendell <gi...@git.apache.org>.
Github user pwendell commented on the pull request:

    https://github.com/apache/spark/pull/2014#issuecomment-53977940
  
    Made a few comments inline. On the building docs, my favorite idea is to have the README link to the upstream docs, then rename the upstream doc to "Building Spark" and have it mostly focus on Maven, with a note on SBT at the end as a developer option. That would avoid duplication.




[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

Posted by pwendell <gi...@git.apache.org>.
Github user pwendell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2014#discussion_r16933031
  
    --- Diff: CONTRIBUTING.md ---
    @@ -0,0 +1,12 @@
    +## Contributing to Spark
    --- End diff --
    
    Yeah, seems fine to have this here. It might make it easier for people to find the contributing wiki page if they start on GitHub.




[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the pull request:

    https://github.com/apache/spark/pull/2014#issuecomment-55664791
  
    @markhamstra Nice one, change coming up...




[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2014#discussion_r17472638
  
    --- Diff: README.md ---
    @@ -66,78 +69,24 @@ Many of the example programs print usage help if no params are given.
     
     ## Running Tests
     
    -Testing first requires [building Spark](#building-spark). Once Spark is built, tests
    -can be run using:
    -
    -    ./dev/run-tests
    +Please see the guidance on how to 
    +[run all automated tests](https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark#ContributingtoSpark-AutomatedTesting)
     
     ## A Note About Hadoop Versions
     
     Spark uses the Hadoop core library to talk to HDFS and other Hadoop-supported
     storage systems. Because the protocols have changed in different versions of
     Hadoop, you must build Spark against the same version that your cluster runs.
    -You can change the version by setting `-Dhadoop.version` when building Spark.
    -
    -For Apache Hadoop versions 1.x, Cloudera CDH MRv1, and other Hadoop
    -versions without YARN, use:
    -
    -    # Apache Hadoop 1.2.1
    -    $ sbt/sbt -Dhadoop.version=1.2.1 assembly
    -
    -    # Cloudera CDH 4.2.0 with MapReduce v1
    -    $ sbt/sbt -Dhadoop.version=2.0.0-mr1-cdh4.2.0 assembly
    -
    -For Apache Hadoop 2.2.X, 2.1.X, 2.0.X, 0.23.x, Cloudera CDH MRv2, and other Hadoop versions
    -with YARN, also set `-Pyarn`:
    -
    -    # Apache Hadoop 2.0.5-alpha
    -    $ sbt/sbt -Dhadoop.version=2.0.5-alpha -Pyarn assembly
    -
    -    # Cloudera CDH 4.2.0 with MapReduce v2
    -    $ sbt/sbt -Dhadoop.version=2.0.0-cdh4.2.0 -Pyarn assembly
    -
    -    # Apache Hadoop 2.2.X and newer
    -    $ sbt/sbt -Dhadoop.version=2.2.0 -Pyarn assembly
    -
    -When developing a Spark application, specify the Hadoop version by adding the
    -"hadoop-client" artifact to your project's dependencies. For example, if you're
    -using Hadoop 1.2.1 and build your application using SBT, add this entry to
    -`libraryDependencies`:
    -
    -    "org.apache.hadoop" % "hadoop-client" % "1.2.1"
     
    -If your project is built with Maven, add this to your POM file's `<dependencies>` section:
    -
    -    <dependency>
    -      <groupId>org.apache.hadoop</groupId>
    -      <artifactId>hadoop-client</artifactId>
    -      <version>1.2.1</version>
    -    </dependency>
    -
    -
    -## A Note About Thrift JDBC server and CLI for Spark SQL
    -
    -Spark SQL supports Thrift JDBC server and CLI.
    -See sql-programming-guide.md for more information about using the JDBC server and CLI.
    -You can use those features by setting `-Phive` when building Spark as follows.
    -
    -    $ sbt/sbt -Phive  assembly
    +Please refer to the build documentation at
    +["Specifying the Hadoop Version"](http://spark.apache.org/docs/latest/building-with-maven.html#specifying-the-hadoop-version)
    --- End diff --
    
    Oops, thanks. I have fixed a few straggling occurrences. Yes, all links should point to building-spark now, even though building-with-maven is retained as a redirect to it.




[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

Posted by nchammas <gi...@git.apache.org>.
Github user nchammas commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2014#discussion_r16796412
  
    --- Diff: README.md ---
    @@ -66,78 +69,24 @@ Many of the example programs print usage help if no params are given.
     
     ## Running Tests
     
    -Testing first requires [building Spark](#building-spark). Once Spark is built, tests
    -can be run using:
    -
    -    ./dev/run-tests
    +Please see the guidance on how to 
    +[run all automated tests](https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark#ContributingtoSpark-AutomatedTesting)
     
     ## A Note About Hadoop Versions
     
     Spark uses the Hadoop core library to talk to HDFS and other Hadoop-supported
     storage systems. Because the protocols have changed in different versions of
     Hadoop, you must build Spark against the same version that your cluster runs.
    -You can change the version by setting `-Dhadoop.version` when building Spark.
    -
    -For Apache Hadoop versions 1.x, Cloudera CDH MRv1, and other Hadoop
    -versions without YARN, use:
    -
    -    # Apache Hadoop 1.2.1
    -    $ sbt/sbt -Dhadoop.version=1.2.1 assembly
    -
    -    # Cloudera CDH 4.2.0 with MapReduce v1
    -    $ sbt/sbt -Dhadoop.version=2.0.0-mr1-cdh4.2.0 assembly
    -
    -For Apache Hadoop 2.2.X, 2.1.X, 2.0.X, 0.23.x, Cloudera CDH MRv2, and other Hadoop versions
    -with YARN, also set `-Pyarn`:
    -
    -    # Apache Hadoop 2.0.5-alpha
    -    $ sbt/sbt -Dhadoop.version=2.0.5-alpha -Pyarn assembly
    -
    -    # Cloudera CDH 4.2.0 with MapReduce v2
    -    $ sbt/sbt -Dhadoop.version=2.0.0-cdh4.2.0 -Pyarn assembly
    -
    -    # Apache Hadoop 2.2.X and newer
    -    $ sbt/sbt -Dhadoop.version=2.2.0 -Pyarn assembly
    -
    -When developing a Spark application, specify the Hadoop version by adding the
    -"hadoop-client" artifact to your project's dependencies. For example, if you're
    -using Hadoop 1.2.1 and build your application using SBT, add this entry to
    -`libraryDependencies`:
    -
    -    "org.apache.hadoop" % "hadoop-client" % "1.2.1"
     
    -If your project is built with Maven, add this to your POM file's `<dependencies>` section:
    -
    -    <dependency>
    -      <groupId>org.apache.hadoop</groupId>
    -      <artifactId>hadoop-client</artifactId>
    -      <version>1.2.1</version>
    -    </dependency>
    -
    -
    -## A Note About Thrift JDBC server and CLI for Spark SQL
    -
    -Spark SQL supports Thrift JDBC server and CLI.
    -See sql-programming-guide.md for more information about using the JDBC server and CLI.
    -You can use those features by setting `-Phive` when building Spark as follows.
    -
    -    $ sbt/sbt -Phive  assembly
    +Please refer to the build documentation at
    +["Specifying the Hadoop Version"](http://spark.apache.org/docs/latest/building-with-maven.html#specifying-the-hadoop-version)
    --- End diff --
    
    Similarly here, I think `sbt` is the default way to do these things, and Maven is a supported alternative. 
    
    It's confusing that the [main site](http://spark.apache.org/docs/latest/building-with-maven.html) only lists Maven for building Spark, but since the page is titled "Building Spark with Maven" rather than just "Building Spark", I read it as a supported alternative way of building, rather than the main way.




[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

Posted by pwendell <gi...@git.apache.org>.
Github user pwendell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2014#discussion_r16933038
  
    --- Diff: README.md ---
    @@ -66,78 +69,24 @@ Many of the example programs print usage help if no params are given.
     
     ## Running Tests
     
    -Testing first requires [building Spark](#building-spark). Once Spark is built, tests
    -can be run using:
    -
    -    ./dev/run-tests
    +Please see the guidance on how to 
    +[run all automated tests](https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark#ContributingtoSpark-AutomatedTesting)
     
     ## A Note About Hadoop Versions
     
     Spark uses the Hadoop core library to talk to HDFS and other Hadoop-supported
     storage systems. Because the protocols have changed in different versions of
     Hadoop, you must build Spark against the same version that your cluster runs.
    -You can change the version by setting `-Dhadoop.version` when building Spark.
    -
    -For Apache Hadoop versions 1.x, Cloudera CDH MRv1, and other Hadoop
    -versions without YARN, use:
    -
    -    # Apache Hadoop 1.2.1
    -    $ sbt/sbt -Dhadoop.version=1.2.1 assembly
    -
    -    # Cloudera CDH 4.2.0 with MapReduce v1
    -    $ sbt/sbt -Dhadoop.version=2.0.0-mr1-cdh4.2.0 assembly
    -
    -For Apache Hadoop 2.2.X, 2.1.X, 2.0.X, 0.23.x, Cloudera CDH MRv2, and other Hadoop versions
    -with YARN, also set `-Pyarn`:
    -
    -    # Apache Hadoop 2.0.5-alpha
    -    $ sbt/sbt -Dhadoop.version=2.0.5-alpha -Pyarn assembly
    -
    -    # Cloudera CDH 4.2.0 with MapReduce v2
    -    $ sbt/sbt -Dhadoop.version=2.0.0-cdh4.2.0 -Pyarn assembly
    -
    -    # Apache Hadoop 2.2.X and newer
    -    $ sbt/sbt -Dhadoop.version=2.2.0 -Pyarn assembly
    -
    -When developing a Spark application, specify the Hadoop version by adding the
    -"hadoop-client" artifact to your project's dependencies. For example, if you're
    -using Hadoop 1.2.1 and build your application using SBT, add this entry to
    -`libraryDependencies`:
    -
    -    "org.apache.hadoop" % "hadoop-client" % "1.2.1"
     
    -If your project is built with Maven, add this to your POM file's `<dependencies>` section:
    -
    -    <dependency>
    -      <groupId>org.apache.hadoop</groupId>
    -      <artifactId>hadoop-client</artifactId>
    -      <version>1.2.1</version>
    -    </dependency>
    -
    -
    -## A Note About Thrift JDBC server and CLI for Spark SQL
    -
    -Spark SQL supports Thrift JDBC server and CLI.
    -See sql-programming-guide.md for more information about using the JDBC server and CLI.
    -You can use those features by setting `-Phive` when building Spark as follows.
    -
    -    $ sbt/sbt -Phive  assembly
    +Please refer to the build documentation at
    +["Specifying the Hadoop Version"](http://spark.apache.org/docs/latest/building-with-maven.html#specifying-the-hadoop-version)
    --- End diff --
    
    Yeah, maybe we should change the title of the doc to "Building Spark", and it might also be nice to have a quick note at the bottom that SBT is supported for developer builds but Maven is the reference build for all packaging. That might be the best way to do this overall.




[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2014#issuecomment-53565824
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19309/consoleFull) for   PR 2014 at commit [`3e4a303`](https://github.com/apache/spark/commit/3e4a303392e35706029b122fbfcb65d139edd8d0).
     * This patch merges cleanly.




[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2014#issuecomment-54147529
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19589/consoleFull) for   PR 2014 at commit [`7fb3674`](https://github.com/apache/spark/commit/7fb3674f327c6b3883ecdadb0754f5126089d942).
     * This patch **fails** unit tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2014#issuecomment-55662640
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20353/consoleFull) for   PR 2014 at commit [`db2bd97`](https://github.com/apache/spark/commit/db2bd97bb03a243a39320c472a85834bfbcb099c).
     * This patch merges cleanly.




[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

Posted by bbossy <gi...@git.apache.org>.
Github user bbossy commented on the pull request:

    https://github.com/apache/spark/pull/2014#issuecomment-52769500
  
    Yes, LGTM




[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2014#issuecomment-53148996
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19102/consoleFull) for   PR 2014 at commit [`7aa045e`](https://github.com/apache/spark/commit/7aa045e881cda1d51c90040e5917839aee085b05).
     * This patch **passes** unit tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2014#issuecomment-53992523
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19538/consoleFull) for   PR 2014 at commit [`13492d8`](https://github.com/apache/spark/commit/13492d833e9e7c08da28a083caa14ef2fad44eaa).
     * This patch merges cleanly.




[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

Posted by pwendell <gi...@git.apache.org>.
Github user pwendell commented on the pull request:

    https://github.com/apache/spark/pull/2014#issuecomment-57974280
  
    @nchammas that page won't appear until we actually update the live docs (something that happens for each release rather than when we push a PR).




[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

Posted by pwendell <gi...@git.apache.org>.
Github user pwendell commented on the pull request:

    https://github.com/apache/spark/pull/2014#issuecomment-53645224
  
    @nchammas there is a bit more color in this thread:
    http://apache-spark-developers-list.1001551.n3.nabble.com/Assorted-project-updates-tests-build-etc-td7063.html




[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

Posted by pwendell <gi...@git.apache.org>.
Github user pwendell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2014#discussion_r16968179
  
    --- Diff: docs/building-with-maven.md ---
    @@ -159,4 +159,8 @@ then ship it over to the cluster. We are investigating the exact cause for this.
     
     The assembly jar produced by `mvn package` will, by default, include all of Spark's dependencies, including Hadoop and some of its ecosystem projects. On YARN deployments, this causes multiple versions of these to appear on executor classpaths: the version packaged in the Spark assembly and the version on each node, included with yarn.application.classpath.  The `hadoop-provided` profile builds the assembly without including Hadoop-ecosystem projects, like ZooKeeper and Hadoop itself. 
     
    +# Building with SBT
     
    +Maven is the official recommendation for packaging Spark, and is the "build of reference".
    +But SBT is supported for day-to-day development since it can provide much faster iterative
    +compilation. More advanced developers may wish to use SBT.
    --- End diff --
    
    I'd describe how the sbt build is configured to use the existing Maven profiles and give an example:
    
    ```
    sbt -Pyarn -Phadoop-2.3 compile
    ```
    
    ... or something




[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2014#issuecomment-52907543
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19051/consoleFull) for   PR 2014 at commit [`3a2bcad`](https://github.com/apache/spark/commit/3a2bcadffd59d2575695a6f62bbab49e4ef46273).
     * This patch merges cleanly.




[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

Posted by nchammas <gi...@git.apache.org>.
Github user nchammas commented on the pull request:

    https://github.com/apache/spark/pull/2014#issuecomment-53638059
  
    > `make-distribution.sh` uses Maven.
    
    Oh, good point. Erm... I guess we need one of the project maintainers to step in then and clarify the place of `sbt` relative to Maven.
    
    > according to people, the sbt builds are not meant for external consumption.
    
    Makes more sense now that I see distributions are created using Maven.
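
    For context, a distribution build of that era went through this script, which drives Maven under the hood (the flags and profiles shown are illustrative):

        # package a runnable distribution as a .tgz; Maven profiles are passed straight through
        ./make-distribution.sh --tgz -Pyarn -Phadoop-2.3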




[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2014#issuecomment-55391607
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20226/consoleFull) for   PR 2014 at commit [`be82027`](https://github.com/apache/spark/commit/be8202712cca2429be3cf3fd2cfa27cf2a684b18).
     * This patch **passes** unit tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2014#discussion_r17458522
  
    --- Diff: README.md ---
    @@ -66,78 +69,24 @@ Many of the example programs print usage help if no params are given.
     
     ## Running Tests
     
    -Testing first requires [building Spark](#building-spark). Once Spark is built, tests
    -can be run using:
    -
    -    ./dev/run-tests
    +Please see the guidance on how to 
    +[run all automated tests](https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark#ContributingtoSpark-AutomatedTesting)
     
     ## A Note About Hadoop Versions
     
     Spark uses the Hadoop core library to talk to HDFS and other Hadoop-supported
     storage systems. Because the protocols have changed in different versions of
     Hadoop, you must build Spark against the same version that your cluster runs.
    -You can change the version by setting `-Dhadoop.version` when building Spark.
    -
    -For Apache Hadoop versions 1.x, Cloudera CDH MRv1, and other Hadoop
    -versions without YARN, use:
    -
    -    # Apache Hadoop 1.2.1
    -    $ sbt/sbt -Dhadoop.version=1.2.1 assembly
    -
    -    # Cloudera CDH 4.2.0 with MapReduce v1
    -    $ sbt/sbt -Dhadoop.version=2.0.0-mr1-cdh4.2.0 assembly
    -
    -For Apache Hadoop 2.2.X, 2.1.X, 2.0.X, 0.23.x, Cloudera CDH MRv2, and other Hadoop versions
    -with YARN, also set `-Pyarn`:
    -
    -    # Apache Hadoop 2.0.5-alpha
    -    $ sbt/sbt -Dhadoop.version=2.0.5-alpha -Pyarn assembly
    -
    -    # Cloudera CDH 4.2.0 with MapReduce v2
    -    $ sbt/sbt -Dhadoop.version=2.0.0-cdh4.2.0 -Pyarn assembly
    -
    -    # Apache Hadoop 2.2.X and newer
    -    $ sbt/sbt -Dhadoop.version=2.2.0 -Pyarn assembly
    -
    -When developing a Spark application, specify the Hadoop version by adding the
    -"hadoop-client" artifact to your project's dependencies. For example, if you're
    -using Hadoop 1.2.1 and build your application using SBT, add this entry to
    -`libraryDependencies`:
    -
    -    "org.apache.hadoop" % "hadoop-client" % "1.2.1"
     
    -If your project is built with Maven, add this to your POM file's `<dependencies>` section:
    -
    -    <dependency>
    -      <groupId>org.apache.hadoop</groupId>
    -      <artifactId>hadoop-client</artifactId>
    -      <version>1.2.1</version>
    -    </dependency>
    -
    -
    -## A Note About Thrift JDBC server and CLI for Spark SQL
    -
    -Spark SQL supports Thrift JDBC server and CLI.
    -See sql-programming-guide.md for more information about using the JDBC server and CLI.
    -You can use those features by setting `-Phive` when building Spark as follows.
    -
    -    $ sbt/sbt -Phive  assembly
    +Please refer to the build documentation at
    +["Specifying the Hadoop Version"](http://spark.apache.org/docs/latest/building-with-maven.html#specifying-the-hadoop-version)
    --- End diff --
    
    Isn't this link outdated, given the rest of the changes in this PR? It should be `building-spark.html`.




[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2014#issuecomment-55665680
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20354/consoleFull) for   PR 2014 at commit [`501507e`](https://github.com/apache/spark/commit/501507e0782b65f7bfa2aa057b83aa2870817e45).
     * This patch merges cleanly.




[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2014#issuecomment-55672401
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20354/consoleFull) for   PR 2014 at commit [`501507e`](https://github.com/apache/spark/commit/501507e0782b65f7bfa2aa057b83aa2870817e45).
     * This patch **passes** unit tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2014#issuecomment-53158753
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19108/consoleFull) for   PR 2014 at commit [`9b56494`](https://github.com/apache/spark/commit/9b564944fef59afb61f8dd4af2aaf5771bcd46e8).
     * This patch **passes** unit tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

Posted by markhamstra <gi...@git.apache.org>.
Github user markhamstra commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2014#discussion_r17570971
  
    --- Diff: docs/building-spark.md ---
    @@ -159,4 +160,21 @@ then ship it over to the cluster. We are investigating the exact cause for this.
     
     The assembly jar produced by `mvn package` will, by default, include all of Spark's dependencies, including Hadoop and some of its ecosystem projects. On YARN deployments, this causes multiple versions of these to appear on executor classpaths: the version packaged in the Spark assembly and the version on each node, included with yarn.application.classpath.  The `hadoop-provided` profile builds the assembly without including Hadoop-ecosystem projects, like ZooKeeper and Hadoop itself. 
     
    +# Building with SBT
     
    +Maven is the official recommendation for packaging Spark, and is the "build of reference".
    +But SBT is supported for day-to-day development since it can provide much faster iterative
    +compilation. More advanced developers may wish to use SBT.
    +
    +The SBT build is derived from the Maven POM files, and so the same Maven profiles and variables
    +can be set to control the SBT build. For example:
    +
    +    sbt/sbt -Pyarn -Phadoop-2.3 compile
    +
    +# Speeding up Compilation with Zinc
    +
    +[Zinc](https://github.com/typesafehub/zinc) is a long-running server version of SBT's incremental
    +compiler. When run locally as a background process, it speeds up builds of Scala-based projects
    +like Spark. Developers who regularly recompile Spark will be most interested in Zinc. The project
    --- End diff --
    
    Should probably be something like "...who regularly recompile Spark with Maven will be..."; else some may think that Zinc is something to be used with SBT -- especially given that this comment is placed immediately after "Building with SBT".




[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2014#discussion_r17458463
  
    --- Diff: README.md ---
    @@ -13,16 +13,19 @@ and Spark Streaming.
     ## Online Documentation
     
     You can find the latest Spark documentation, including a programming
    -guide, on the project webpage at <http://spark.apache.org/documentation.html>.
    +guide, on the [project web page](http://spark.apache.org/documentation.html).
     This README file only contains basic setup instructions.
     
     ## Building Spark
     
    -Spark is built on Scala 2.10. To build Spark and its example programs, run:
    +Spark is built using [Apache Maven](http://maven.apache.org/).
    +To build Spark and its example programs, run:
     
    -    ./sbt/sbt assembly
    +    mvn -DskipTests clean package
    --- End diff --
    
    Actually the officially documented way of building Spark is through maven: http://spark.apache.org/docs/latest/building-with-maven.html. We should keep this consistent with the docs. (The discussion you linked to refers to tests).
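    
    For reference, a minimal sketch of the Maven-based build that page describes (the `-Pyarn`/`-Phadoop-2.4` profile and the `hadoop.version` value below are illustrative, not prescriptive):
    
        # build Spark and its example programs, skipping tests
        mvn -DskipTests clean package
    
        # illustrative: build against YARN and a particular Hadoop version
        mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0 -DskipTests clean package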




[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

Posted by pwendell <gi...@git.apache.org>.
Github user pwendell commented on the pull request:

    https://github.com/apache/spark/pull/2014#issuecomment-55695994
  
    @srowen are you planning to add more to this or is it GTG from your perspective?




[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

Posted by pwendell <gi...@git.apache.org>.
Github user pwendell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2014#discussion_r16933034
  
    --- Diff: README.md ---
    @@ -66,78 +69,24 @@ Many of the example programs print usage help if no params are given.
     
     ## Running Tests
     
    -Testing first requires [building Spark](#building-spark). Once Spark is built, tests
    -can be run using:
    -
    -    ./dev/run-tests
    +Please see the guidance on how to 
    --- End diff --
    
    I think it's nice that we have the `tl;dr` here on how to run tests. The wiki link might not be stable (we are linking to an anchor tag), and the content there is only one or two sentences... not sure it adds much
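    
    For context, a sketch of the `tl;dr` being discussed, assuming the `./dev/run-tests` script shown in the diff above:
    
        # build first, then run the full test suite
        mvn -DskipTests clean package
        ./dev/run-tests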




[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the pull request:

    https://github.com/apache/spark/pull/2014#issuecomment-53981717
  
    OK, I will add back the "TL;DR" build instructions and keep the wiki link for completeness.
    
    @pwendell I'll update the "Building with Maven" doc too, but do you want me to change the file name? It would break external hyperlinks, I suppose, although otherwise it still reads "building-with-maven.html"




[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2014#issuecomment-53572164
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19309/consoleFull) for   PR 2014 at commit [`3e4a303`](https://github.com/apache/spark/commit/3e4a303392e35706029b122fbfcb65d139edd8d0).
     * This patch **passes** unit tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the pull request:

    https://github.com/apache/spark/pull/2014#issuecomment-55598514
  
    @andrewor14 @nchammas @pwendell Humble ping on this one, I think it's good to go, and probably helps head off some build questions going forward.




[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the pull request:

    https://github.com/apache/spark/pull/2014#issuecomment-55609958
  
    Yes, the build already warns if zinc is not being used.
    To keep this change scoped, I suggest handling that separately if more documentation about zinc is desired.




[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the pull request:

    https://github.com/apache/spark/pull/2014#issuecomment-55661991
  
    @pwendell I changed to `sbt/sbt`, and @markhamstra I took the liberty of adding a note on `zinc` while we're at it. 




[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2014#issuecomment-54797341
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19968/consoleFull) for   PR 2014 at commit [`91c921f`](https://github.com/apache/spark/commit/91c921f41f123df6818567f37cb33c6486895efa).
     * This patch **passes** unit tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2014#issuecomment-54790978
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19968/consoleFull) for   PR 2014 at commit [`91c921f`](https://github.com/apache/spark/commit/91c921f41f123df6818567f37cb33c6486895efa).
     * This patch merges cleanly.




[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

Posted by pwendell <gi...@git.apache.org>.
Github user pwendell commented on the pull request:

    https://github.com/apache/spark/pull/2014#issuecomment-55768975
  
    Okay, I can merge this. One thing, though: we've typically had less-than-smooth experiences with Jekyll and its dependencies. So if this feature causes issues for users, I'd propose we just maintain a tombstone page and write one line of JavaScript to redirect. Google, Yahoo, Bing, etc. can correctly index this behavior (I've done this a few times in the past to migrate from one site to another) and move to the new page.
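    
    A hypothetical sketch of that fallback, should the Jekyll plugin prove troublesome (the file name and target page below are illustrative):
    
        # keep a tombstone page at the old location whose only job is to redirect
        # to the new page with one line of JavaScript
        echo '<script>window.location.replace("building-spark.html");</script>' > docs/building-with-maven.html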




[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

Posted by nchammas <gi...@git.apache.org>.
Github user nchammas commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2014#discussion_r16796104
  
    --- Diff: README.md ---
    @@ -13,16 +13,19 @@ and Spark Streaming.
     ## Online Documentation
     
     You can find the latest Spark documentation, including a programming
    -guide, on the project webpage at <http://spark.apache.org/documentation.html>.
    +guide, on the [project web page](http://spark.apache.org/documentation.html).
     This README file only contains basic setup instructions.
     
     ## Building Spark
     
    -Spark is built on Scala 2.10. To build Spark and its example programs, run:
    +Spark is built using [Apache Maven](http://maven.apache.org/).
    +To build Spark and its example programs, run:
     
    -    ./sbt/sbt assembly
    +    mvn -DskipTests clean package
    --- End diff --
    
    Per [the related discussion here](https://github.com/apache/spark/commit/73b3089b8d2901dab11bb1ef6f46c29625b677fe#commitcomment-7552507), I don't think we want to change this. There are definitely two ways on offer to build Spark, but I think `sbt` is the default way to go and Maven is supported alongside it.
    
    Perhaps we should just clarify this here.




[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2014#issuecomment-53157117
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19108/consoleFull) for   PR 2014 at commit [`9b56494`](https://github.com/apache/spark/commit/9b564944fef59afb61f8dd4af2aaf5771bcd46e8).
     * This patch merges cleanly.




[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

Posted by markhamstra <gi...@git.apache.org>.
Github user markhamstra commented on the pull request:

    https://github.com/apache/spark/pull/2014#issuecomment-55611558
  
    Yes, I know that the scala-maven-plugin will throw warnings if zinc isn't being used. I also know that many users are either confused by those warnings or ignore them completely, in large measure because our build docs never mention zinc, even though zinc usage should be commonplace and arguably belongs in the default "how to build" instructions.
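    
    For anyone following along, a sketch of typical zinc usage alongside the Maven build (this assumes the `zinc` launcher is installed and on the PATH; the flags below are zinc's standard server commands):
    
        zinc -start                       # start the long-running incremental compiler
        mvn -DskipTests clean package     # Maven builds can then use the running server (the build warns when none is running)
        zinc -shutdown                    # stop the server when finished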




[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

Posted by pwendell <gi...@git.apache.org>.
Github user pwendell commented on the pull request:

    https://github.com/apache/spark/pull/2014#issuecomment-53644873
  
    Yeah, so our position on the builds is that we officially recommend Maven for packaging Spark, but we support sbt for day-to-day development since it can provide much faster iteration. So I would be inclined to point towards Maven in the docs and allow more advanced developers to use SBT. The reason why it's hard to "officially" support both is that they use different policies for dependency resolution; this is just a fundamental difference in the way that Maven and sbt/ivy work. That isn't typically an issue when doing normal development, but in deployments it can cause a huge dependency headache, and supporting both in this regard would be extremely difficult.




[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

Posted by pwendell <gi...@git.apache.org>.
Github user pwendell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2014#discussion_r17564257
  
    --- Diff: docs/building-spark.md ---
    @@ -159,4 +160,13 @@ then ship it over to the cluster. We are investigating the exact cause for this.
     
     The assembly jar produced by `mvn package` will, by default, include all of Spark's dependencies, including Hadoop and some of its ecosystem projects. On YARN deployments, this causes multiple versions of these to appear on executor classpaths: the version packaged in the Spark assembly and the version on each node, included with yarn.application.classpath.  The `hadoop-provided` profile builds the assembly without including Hadoop-ecosystem projects, like ZooKeeper and Hadoop itself. 
     
    +# Building with SBT
     
    +Maven is the official recommendation for packaging Spark, and is the "build of reference".
    +But SBT is supported for day-to-day development since it can provide much faster iterative
    +compilation. More advanced developers may wish to use SBT.
    +
    +The SBT build is derived from the Maven POM files, and so the same Maven profiles and variables
    +can be set to control the SBT build. For example:
    +
    +    sbt -Pyarn -Phadoop-2.3 compile
    --- End diff --
    
    Yeah I'd give `sbt/sbt` but other than that, seems fine to have just a small example.




[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

Posted by nchammas <gi...@git.apache.org>.
Github user nchammas commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2014#discussion_r17546110
  
    --- Diff: docs/building-spark.md ---
    @@ -159,4 +160,13 @@ then ship it over to the cluster. We are investigating the exact cause for this.
     
     The assembly jar produced by `mvn package` will, by default, include all of Spark's dependencies, including Hadoop and some of its ecosystem projects. On YARN deployments, this causes multiple versions of these to appear on executor classpaths: the version packaged in the Spark assembly and the version on each node, included with yarn.application.classpath.  The `hadoop-provided` profile builds the assembly without including Hadoop-ecosystem projects, like ZooKeeper and Hadoop itself. 
     
    +# Building with SBT
     
    +Maven is the official recommendation for packaging Spark, and is the "build of reference".
    +But SBT is supported for day-to-day development since it can provide much faster iterative
    +compilation. More advanced developers may wish to use SBT.
    +
    +The SBT build is derived from the Maven POM files, and so the same Maven profiles and variables
    +can be set to control the SBT build. For example:
    +
    +    sbt -Pyarn -Phadoop-2.3 compile
    --- End diff --
    
    Do we need to add a bit more color here about how to use `sbt`, to match what used to be in the GitHub README? Or is this sufficient?




[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the pull request:

    https://github.com/apache/spark/pull/2014#issuecomment-53626165
  
    Just wanted to point out that `make-distribution.sh` uses Maven.
    
    Also, we have features available in the Maven build (e.g. Guava shading) that were intentionally left out of the sbt build, since, according to people, the sbt builds are not meant for external consumption.
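    
    As a sketch of that Maven-backed path (the profile and version flags here are illustrative):
    
        # build a runnable distribution tarball, using Maven under the hood
        ./make-distribution.sh --tgz -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0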




[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

Posted by markhamstra <gi...@git.apache.org>.
Github user markhamstra commented on the pull request:

    https://github.com/apache/spark/pull/2014#issuecomment-55609759
  
    There really should be at least some mention of zinc (https://github.com/typesafehub/zinc) in our maven build instructions, since using zinc greatly improves the maven + scala experience.




[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the pull request:

    https://github.com/apache/spark/pull/2014#issuecomment-54147642
  
    (Test failure looks spurious -- it's in the Python code and no code was touched in this PR.)




[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2014#issuecomment-53994367
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19538/consoleFull) for   PR 2014 at commit [`13492d8`](https://github.com/apache/spark/commit/13492d833e9e7c08da28a083caa14ef2fad44eaa).
     * This patch **passes** unit tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

Posted by nchammas <gi...@git.apache.org>.
Github user nchammas commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2014#discussion_r17547220
  
    --- Diff: docs/building-spark.md ---
    @@ -159,4 +160,13 @@ then ship it over to the cluster. We are investigating the exact cause for this.
     
     The assembly jar produced by `mvn package` will, by default, include all of Spark's dependencies, including Hadoop and some of its ecosystem projects. On YARN deployments, this causes multiple versions of these to appear on executor classpaths: the version packaged in the Spark assembly and the version on each node, included with yarn.application.classpath.  The `hadoop-provided` profile builds the assembly without including Hadoop-ecosystem projects, like ZooKeeper and Hadoop itself. 
     
    +# Building with SBT
     
    +Maven is the official recommendation for packaging Spark, and is the "build of reference".
    +But SBT is supported for day-to-day development since it can provide much faster iterative
    +compilation. More advanced developers may wish to use SBT.
    +
    +The SBT build is derived from the Maven POM files, and so the same Maven profiles and variables
    +can be set to control the SBT build. For example:
    +
    +    sbt -Pyarn -Phadoop-2.3 compile
    --- End diff --
    
    Hmm, I don't know enough to make a recommendation; I'll leave that to others. Just wanted to call out the fact that we'd have less info on using `sbt` than before. Maybe that's a good thing.




[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2014#issuecomment-55670069
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20353/consoleFull) for   PR 2014 at commit [`db2bd97`](https://github.com/apache/spark/commit/db2bd97bb03a243a39320c472a85834bfbcb099c).
     * This patch **passes** unit tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2014#issuecomment-54141725
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19589/consoleFull) for   PR 2014 at commit [`7fb3674`](https://github.com/apache/spark/commit/7fb3674f327c6b3883ecdadb0754f5126089d942).
     * This patch merges cleanly.




[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the pull request:

    https://github.com/apache/spark/pull/2014#issuecomment-54141581
  
    @pwendell I added the SBT example to `README.md`. I also found there's a nice standard way to handle redirects (https://help.github.com/articles/redirects-on-github-pages). That's done; it entailed a new `jekyll` plugin and one config tweak, so I updated the site build docs too. The redirect works nicely for me locally with `jekyll serve`.
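    
    For reviewers, a rough sketch of what that setup typically involves, assuming the `jekyll-redirect-from` gem from the linked article (exact config keys can vary by Jekyll version):
    
        gem install jekyll-redirect-from
        # enable the plugin in docs/_config.yml, e.g.:
        #     gems: [jekyll-redirect-from]
        # declare the old URL in the renamed page's front matter, e.g.:
        #     redirect_from: building-with-maven.html
        jekyll serve                      # check the redirect locally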




[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

Posted by nchammas <gi...@git.apache.org>.
Github user nchammas commented on the pull request:

    https://github.com/apache/spark/pull/2014#issuecomment-53646263
  
    Ah, thanks for that. Clears things up for me about `sbt` vs. Maven.
    
    So if we want to stress Maven as the standard, where would we move the `sbt`-related documentation to? I imagine we want to keep that around as an "advanced" feature. Or is that pretty much why it's here on GitHub?




[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the pull request:

    https://github.com/apache/spark/pull/2014#issuecomment-55717325
  
    @pwendell No, I believe the user still has to install the gem; I did, at least. Yes, this is GTG from my end.

