Posted to reviews@spark.apache.org by mateiz <gi...@git.apache.org> on 2014/05/28 01:20:50 UTC

[GitHub] spark pull request: [SPARK-1566] consolidate programming guide, an...

GitHub user mateiz opened a pull request:

    https://github.com/apache/spark/pull/896

    [SPARK-1566] consolidate programming guide, and general doc updates

    This is a fairly large PR to clean up and update the docs for 1.0. The major changes are:
    
    * A unified programming guide for all languages replaces language-specific ones and shows language-specific info in tabs
    * New programming guide sections on key-value pairs, unit testing, input formats beyond text, migrating from 0.9, and passing functions to Spark
    * Spark-submit guide moved to a separate page and expanded slightly
    * Various cleanups of the menu system, security docs, and others
    * Updated look of title bar to differentiate the docs from previous Spark versions
    
    You can find the updated docs at http://people.apache.org/~matei/1.0-docs/_site/ and in particular http://people.apache.org/~matei/1.0-docs/_site/programming-guide.html.
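
    One of the new sections covers passing functions to Spark. A minimal
    Scala sketch of the two usual patterns (an illustration, not an excerpt
    from the guide; `sc` is an assumed SparkContext):

        import org.apache.spark.SparkContext

        object FunctionPassing {
          // A standalone function: safe to pass, since the closure shipped
          // to executors does not capture any enclosing object state.
          def addOne(x: Int): Int = x + 1

          def run(sc: SparkContext): Unit = {
            val rdd = sc.parallelize(1 to 10)
            val viaNamed = rdd.map(addOne)     // named function
            val viaAnon  = rdd.map(x => x * 2) // anonymous function literal
            println(viaNamed.count() + viaAnon.count())
          }
        }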

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/mateiz/spark 1.0-docs

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/896.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #896
    
----
commit 038d8feb700758f48d4c777a47a9b36c637a5372
Author: Matei Zaharia <ma...@databricks.com>
Date:   2014-05-26T00:30:51Z

    Change color of doc title bar to differentiate from 0.9.0

commit 4298ce9145585b0ad02809202b4bc22df79063a8
Author: Matei Zaharia <ma...@databricks.com>
Date:   2014-05-26T00:50:47Z

    More CSS tweaks

commit dec99f104c484d8ce5cffd8e1ed9db530e59374a
Author: Matei Zaharia <ma...@databricks.com>
Date:   2014-05-26T01:11:25Z

    More CSS tweaks

commit 6b618f6af881f7cdbf29b7cbdf0abe35c8d4e555
Author: Matei Zaharia <ma...@databricks.com>
Date:   2014-05-26T20:52:59Z

    First pass at updating programming guide to support all languages, plus
    other tweaks throughout

commit 08f0c861a881ebc3c5cd83619ce885431c478f0e
Author: Matei Zaharia <ma...@databricks.com>
Date:   2014-05-26T21:56:53Z

    Actually added programming guide to Git

commit 5d29d82b6c53fa0f0f14e60939b1369c4b3888b3
Author: Matei Zaharia <ma...@databricks.com>
Date:   2014-05-27T07:57:25Z

    New section on basics and function syntax

commit af747d8d78ee30ee8b1067ab44d1cab560be5d34
Author: Matei Zaharia <ma...@databricks.com>
Date:   2014-05-27T08:04:10Z

    tweaks

commit 12aa10c5485ea3855b6a1d59be5cc38c77a23f21
Author: Matei Zaharia <ma...@databricks.com>
Date:   2014-05-27T08:31:26Z

    Added key-value pairs section

commit bdf22a28a17955bbbe733b68ad1340eefa0cdcd5
Author: Matei Zaharia <ma...@databricks.com>
Date:   2014-05-27T08:37:41Z

    tweaks

commit fce1c797e73d1704e91807a23c7cc5b088edcb7a
Author: Matei Zaharia <ma...@databricks.com>
Date:   2014-05-27T18:29:27Z

    Add more API functions

commit cf22e27e186ec9c3981488ee2cd7c58b8b1b8599
Author: Matei Zaharia <ma...@databricks.com>
Date:   2014-05-27T18:49:59Z

    migration guide, remove old language guides

commit 64cb7c2ceb82483088156928bfac2fe53772c6be
Author: Matei Zaharia <ma...@databricks.com>
Date:   2014-05-27T21:11:58Z

    stuff

commit 59ef0289aa148e0b5b3ac198c09cc28fb5713fc4
Author: Matei Zaharia <ma...@databricks.com>
Date:   2014-05-27T21:12:28Z

    Moved submitting apps to separate doc

commit 09c57b252b62209fd92d35a4443ae65e637fc2fb
Author: Matei Zaharia <ma...@databricks.com>
Date:   2014-05-27T23:07:34Z

    miscellaneous changes

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: [SPARK-1566] consolidate programming guide, an...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/896#issuecomment-44371355
  
    
    Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/15245/



[GitHub] spark pull request: [SPARK-1566] consolidate programming guide, an...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/896#issuecomment-44348380
  
    Merged build started. 



[GitHub] spark pull request: [SPARK-1566] consolidate programming guide, an...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/896#issuecomment-44445558
  
     Merged build triggered. 



[GitHub] spark pull request: [SPARK-1566] consolidate programming guide, an...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/896#issuecomment-44351157
  
    Merged build finished. All automated tests passed.



[GitHub] spark pull request: [SPARK-1566] consolidate programming guide, an...

Posted by mateiz <gi...@git.apache.org>.
Github user mateiz commented on the pull request:

    https://github.com/apache/spark/pull/896#issuecomment-44483569
  
    @pwendell is it okay to merge this as is or do you want to look at it more?



[GitHub] spark pull request: [SPARK-1566] consolidate programming guide, an...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/896#issuecomment-44451149
  
    Merged build finished. All automated tests passed.



[GitHub] spark pull request: [SPARK-1566] consolidate programming guide, an...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/896#issuecomment-44348783
  
    Merged build finished. 



[GitHub] spark pull request: [SPARK-1566] consolidate programming guide, an...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/896#issuecomment-44445572
  
    Merged build started. 



[GitHub] spark pull request: [SPARK-1566] consolidate programming guide, an...

Posted by pwendell <gi...@git.apache.org>.
Github user pwendell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/896#discussion_r13116427
  
    --- Diff: docs/submitting-applications.md ---
    @@ -0,0 +1,153 @@
    +---
    +layout: global
    +title: Submitting Applications
    +---
    +
    +The `spark-submit` script in Spark's `bin` directory is used to launch applications on a cluster.
    +It can use all of Spark's supported [cluster managers](cluster-overview.html#cluster-manager-types)
    +through a uniform interface so you don't have to configure your application specially for each one.
    +
    +# Bundling Your Application's Dependencies
    +If your code depends on other projects, you will need to package them alongside
    +your application in order to distribute the code to a Spark cluster. To do this,
    +to create an assembly jar (or "uber" jar) containing your code and its dependencies. Both
    +[sbt](https://github.com/sbt/sbt-assembly) and
    +[Maven](http://maven.apache.org/plugins/maven-shade-plugin/)
    +have assembly plugins. When creating assembly jars, list Spark and Hadoop
    +as `provided` dependencies; these need not be bundled since they are provided by
    +the cluster manager at runtime. Once you have an assembled jar you can call the `bin/spark-submit`
    +script as shown here while passing your jar.
    +
    +For Python, you can use the `--py-files` argument of `spark-submit` to add `.py`, `.zip` or `.egg`
    +files to be distributed with your application. If you depend on multiple Python files we recommend
    +packaging them into a `.zip` or `.egg`.
    +
    +# Launching Applications with spark-submit
    +
    +Once a user application is bundled, it can be launched using the `bin/spark-submit` script.
    +This script takes care of setting up the classpath with Spark and its
    +dependencies, and can support different cluster managers and deploy modes that Spark supports:
    +
    +{% highlight bash %}
    +./bin/spark-submit \
    +  --class <main-class>
    +  --master <master-url> \
    +  --deploy-mode <deploy-mode> \
    +  ... # other options
    +  <application-jar> \
    +  [application-arguments]
    +{% endhighlight %}
    +
    +Some of the commonly used options are:
    +
    +* `--class`: The entry point for your application (e.g. `org.apache.spark.examples.SparkPi`)
    +* `--master`: The [master URL](#master-urls) for the cluster (e.g. `spark://23.195.26.187:7077`)
    +* `--deploy-mode`: Whether to deploy your driver program within the cluster or run it locally as an external client (either `cluster` or `client`)
    +* `application-jar`: Path to a bundled jar including your application and all dependencies. The URL must be globally visible inside of your cluster, for instance, an `hdfs://` path or a `file://` path that is present on all nodes.
    +* `application-arguments`: Arguments passed to the main method of your main class, if any
    +
    +For Python applications, simply pass a `.py` file in the place of `<application-jar>` instead of a JAR,
    +and add Python `.zip`, `.egg` or `.py` files to the search path with `--py-files`.
    +
    +To enumerate all options available to `spark-submit` run it with `--help`. Here are a few
    +examples of common options:
    +
    +{% highlight bash %}
    +# Run application locally on 8 cores
    +./bin/spark-submit \
    +  --class org.apache.spark.examples.SparkPi \
    +  --master local[8] \
    +  /path/to/examples.jar \
    +  100
    +
    +# Run on a Spark standalone cluster
    +./bin/spark-submit \
    +  --class org.apache.spark.examples.SparkPi \
    +  --master spark://207.184.161.138:7077 \
    +  --executor-memory 20G \
    +  --total-executor-cores 100 \
    +  /path/to/examples.jar \
    +  1000
    +
    +# Run on a YARN cluster
    +export HADOOP_CONF_DIR=XXX
    +./bin/spark-submit \
    +  --class org.apache.spark.examples.SparkPi \
    +  --master yarn-cluster \  # can also be `yarn-client` for client mode
    +  --executor-memory 20G \
    +  --num-executors 50 \
    +  /path/to/examples.jar \
    +  1000
    +
    +# Run a Python application on a cluster
    +./bin/spark-submit \
    +  --master spark://207.184.161.138:7077 \
    +  examples/src/main/python/pi.py \
    +  1000
    +{% endhighlight %}
    +
    +# Master URLs
    +
    +The master URL passed to Spark can be in one of the following formats:
    +
    +<table class="table">
    +<tr><th>Master URL</th><th>Meaning</th></tr>
    +<tr><td> local </td><td> Run Spark locally with one worker thread (i.e. no parallelism at all). </td></tr>
    +<tr><td> local[K] </td><td> Run Spark locally with K worker threads (ideally, set this to the number of cores on your machine). </td></tr>
    +<tr><td> local[*] </td><td> Run Spark locally with as many worker threads as logical cores on your machine.</td></tr>
    +<tr><td> spark://HOST:PORT </td><td> Connect to the given <a href="spark-standalone.html">Spark standalone
    +        cluster</a> master. The port must be whichever one your master is configured to use, which is 7077 by default.
    +</td></tr>
    +<tr><td> mesos://HOST:PORT </td><td> Connect to the given <a href="running-on-mesos.html">Mesos</a> cluster.
    +        The port must be whichever one your is configured to use, which is 5050 by default.
    +        Or, for a Mesos cluster using ZooKeeper, use <code>mesos://zk://...</code>.
    +</td></tr>
    +<tr><td> yarn-client </td><td> Connect to a <a href="running-on-yarn.html"> YARN </a> cluster in
    +client mode. The cluster location will be found based on the HADOOP_CONF_DIR variable.
    +</td></tr>
    +<tr><td> yarn-cluster </td><td> Connect to a <a href="running-on-yarn.html"> YARN </a> cluster in
    +cluster mode. The cluster location will be found based on HADOOP_CONF_DIR.
    +</td></tr>
    +</table>
    +
    +
    +# Loading Configuration from a File
    +
    +The `spark-submit` script can load default [Spark configuration values](configuration.html) from a
    +properties file and pass them on to your application. By default it will read options
    +from `conf/spark-defaults.conf` in the Spark directory. For more detail, see the section on
    +[loading default configurations](configuration.html#loading-default-configurations).
    +
    +Loading default Spark configurations this way can obviate the need for certain flags to
    +`spark-submit`. For instance, if the `spark.master` property is set, you can safely omit the
    +`--master` flag from `spark-submit`. In general, configuration values explicitly set on a
    +`SparkConf` take the highest precedence, then flags passed to `spark-submit`, then values in the
    +defaults file.
    +
    +If you are ever unclear where configuration options are coming from, you can print out fine-grained
    +debugging information by running `spark-submit` with the `--verbose` option.
    --- End diff --
    
    Yes.
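
    A minimal sketch of the precedence order described at the end of the
    quoted section (explicit SparkConf settings beat spark-submit flags,
    which beat conf/spark-defaults.conf); the app name and master values
    here are illustrative:

        import org.apache.spark.{SparkConf, SparkContext}

        // Suppose conf/spark-defaults.conf sets
        //   spark.master  spark://defaults-host:7077
        // and spark-submit is run with --master spark://flag-host:7077.
        // The explicit setMaster below still wins, because values set
        // programmatically on SparkConf take the highest precedence.
        val conf = new SparkConf()
          .setAppName("PrecedenceDemo")
          .setMaster("local[2]")

        val sc = new SparkContext(conf)
        println(sc.master) // local[2]
        sc.stop()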



[GitHub] spark pull request: [SPARK-1566] consolidate programming guide, an...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/896#issuecomment-44349520
  
    Merged build started. 



[GitHub] spark pull request: [SPARK-1566] consolidate programming guide, an...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/896#issuecomment-44351158
  
    All automated tests passed.
    Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/15237/



[GitHub] spark pull request: [SPARK-1566] consolidate programming guide, an...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/896#issuecomment-44352724
  
    All automated tests passed.
    Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/15239/



[GitHub] spark pull request: [SPARK-1566] consolidate programming guide, an...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/896#issuecomment-44481139
  
    Merged build finished. 



[GitHub] spark pull request: [SPARK-1566] consolidate programming guide, an...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/896#issuecomment-44350541
  
     Merged build triggered. 



[GitHub] spark pull request: [SPARK-1566] consolidate programming guide, an...

Posted by pwendell <gi...@git.apache.org>.
Github user pwendell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/896#discussion_r13109195
  
    --- Diff: docs/submitting-applications.md ---
    @@ -0,0 +1,153 @@
    +---
    +layout: global
    +title: Submitting Applications
    +---
    +
    +The `spark-submit` script in Spark's `bin` directory is used to launch applications on a cluster.
    +It can use all of Spark's supported [cluster managers](cluster-overview.html#cluster-manager-types)
    +through a uniform interface so you don't have to configure your application specially for each one.
    +
    +# Bundling Your Application's Dependencies
    +If your code depends on other projects, you will need to package them alongside
    +your application in order to distribute the code to a Spark cluster. To do this,
    +to create an assembly jar (or "uber" jar) containing your code and its dependencies. Both
    +[sbt](https://github.com/sbt/sbt-assembly) and
    +[Maven](http://maven.apache.org/plugins/maven-shade-plugin/)
    +have assembly plugins. When creating assembly jars, list Spark and Hadoop
    +as `provided` dependencies; these need not be bundled since they are provided by
    +the cluster manager at runtime. Once you have an assembled jar you can call the `bin/spark-submit`
    +script as shown here while passing your jar.
    +
    +For Python, you can use the `--py-files` argument of `spark-submit` to add `.py`, `.zip` or `.egg`
    +files to be distributed with your application. If you depend on multiple Python files we recommend
    +packaging them into a `.zip` or `.egg`.
    +
    +# Launching Applications with spark-submit
    +
    +Once a user application is bundled, it can be launched using the `bin/spark-submit` script
    +This script takes care of setting up the classpath with Spark and its
    +dependencies, and can support different cluster managers and deploy modes that Spark supports:
    +
    +{% highlight bash %}
    +./bin/spark-submit \
    +  --class <main-class>
    +  --master <master-url> \
    +  --deploy-mode <deploy-mode> \
    +  ... # other options
    +  <application-jar> \
    +  [application-arguments]
    +{% endhighlight %}
    +
    +Some of the commonly used options are:
    +
    +* `--class`: The entry point for your application (e.g. `org.apache.spark.examples.SparkPi`)
    +* `--master`: The [master URL](#master-urls) for the cluster (e.g. `spark://23.195.26.187:7077`)
    +* `--deploy-mode`: Whether to deploy your driver program within the cluster or run it locally as an external client (either `cluster` or `client`)
    +* `application-jar`: Path to a bundled jar including your application and all dependencies. The URL must be globally visible inside of your cluster, for instance, an `hdfs://` path or a `file://` path that is present on all nodes.
    +* `application-arguments`: Arguments passed to the main method of your main class, if any
    +
    +For Python applications, simply pass a `.py` file in the place of `<application-jar>` instead of a JAR,
    +and add Python `.zip`, `.egg` or `.py` files to the search path with `--py-files`.
    +
    +To enumerate all options available to `spark-submit` run it with `--help`. Here are a few
    +examples of common options:
    +
    +{% highlight bash %}
    +# Run application locally on 8 cores
    +./bin/spark-submit \
    +  --class org.apache.spark.examples.SparkPi
    --- End diff --
    
    This needs to have a backslash after it (it doesn't paste correctly into a shell), and the same applies in the cases below.
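
    On the bundling section quoted above: a minimal build.sbt sketch of
    listing Spark and Hadoop as `provided` (assumes the sbt-assembly
    plugin is configured; versions are illustrative):

        name := "my-spark-app"

        scalaVersion := "2.10.4"

        libraryDependencies ++= Seq(
          // "provided": compiled against, but excluded from the assembly
          // jar, since the cluster supplies Spark and Hadoop at runtime.
          "org.apache.spark"  %% "spark-core"    % "1.0.0" % "provided",
          "org.apache.hadoop" %  "hadoop-client" % "2.2.0" % "provided"
        )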



[GitHub] spark pull request: [SPARK-1566] consolidate programming guide, an...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/896#issuecomment-44348013
  
    Merged build started. 



[GitHub] spark pull request: [SPARK-1566] consolidate programming guide, an...

Posted by pwendell <gi...@git.apache.org>.
Github user pwendell commented on the pull request:

    https://github.com/apache/spark/pull/896#issuecomment-44622890
  
    Thanks Matei, I've merged this!



[GitHub] spark pull request: [SPARK-1566] consolidate programming guide, an...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/896#issuecomment-44352098
  
    All automated tests passed.
    Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/15238/



[GitHub] spark pull request: [SPARK-1566] consolidate programming guide, an...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/896#issuecomment-44478696
  
    Merged build started. 



[GitHub] spark pull request: [SPARK-1566] consolidate programming guide, an...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/896#issuecomment-44481140
  
    
    Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/15269/



[GitHub] spark pull request: [SPARK-1566] consolidate programming guide, an...

Posted by ash211 <gi...@git.apache.org>.
Github user ash211 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/896#discussion_r13116340
  
    --- Diff: docs/submitting-applications.md ---
    @@ -0,0 +1,153 @@
    +---
    +layout: global
    +title: Submitting Applications
    +---
    +
    +The `spark-submit` script in Spark's `bin` directory is used to launch applications on a cluster.
    +It can use all of Spark's supported [cluster managers](cluster-overview.html#cluster-manager-types)
    +through a uniform interface so you don't have to configure your application specially for each one.
    +
    +# Bundling Your Application's Dependencies
    +If your code depends on other projects, you will need to package them alongside
    +your application in order to distribute the code to a Spark cluster. To do this,
    +to create an assembly jar (or "uber" jar) containing your code and its dependencies. Both
    +[sbt](https://github.com/sbt/sbt-assembly) and
    +[Maven](http://maven.apache.org/plugins/maven-shade-plugin/)
    +have assembly plugins. When creating assembly jars, list Spark and Hadoop
    +as `provided` dependencies; these need not be bundled since they are provided by
    +the cluster manager at runtime. Once you have an assembled jar you can call the `bin/spark-submit`
    +script as shown here while passing your jar.
    +
    +For Python, you can use the `--py-files` argument of `spark-submit` to add `.py`, `.zip` or `.egg`
    +files to be distributed with your application. If you depend on multiple Python files we recommend
    +packaging them into a `.zip` or `.egg`.
    +
    +# Launching Applications with spark-submit
    +
    +Once a user application is bundled, it can be launched using the `bin/spark-submit` script.
    +This script takes care of setting up the classpath with Spark and its
    +dependencies, and can support different cluster managers and deploy modes that Spark supports:
    +
    +{% highlight bash %}
    +./bin/spark-submit \
    +  --class <main-class>
    +  --master <master-url> \
    +  --deploy-mode <deploy-mode> \
    +  ... # other options
    +  <application-jar> \
    +  [application-arguments]
    +{% endhighlight %}
    +
    +Some of the commonly used options are:
    +
    +* `--class`: The entry point for your application (e.g. `org.apache.spark.examples.SparkPi`)
    +* `--master`: The [master URL](#master-urls) for the cluster (e.g. `spark://23.195.26.187:7077`)
    +* `--deploy-mode`: Whether to deploy your driver program within the cluster or run it locally as an external client (either `cluster` or `client`)
    +* `application-jar`: Path to a bundled jar including your application and all dependencies. The URL must be globally visible inside of your cluster, for instance, an `hdfs://` path or a `file://` path that is present on all nodes.
    +* `application-arguments`: Arguments passed to the main method of your main class, if any
    +
    +For Python applications, simply pass a `.py` file in the place of `<application-jar>` instead of a JAR,
    +and add Python `.zip`, `.egg` or `.py` files to the search path with `--py-files`.
    +
    +To enumerate all options available to `spark-submit` run it with `--help`. Here are a few
    +examples of common options:
    +
    +{% highlight bash %}
    +# Run application locally on 8 cores
    +./bin/spark-submit \
    +  --class org.apache.spark.examples.SparkPi \
    +  --master local[8] \
    +  /path/to/examples.jar \
    +  100
    +
    +# Run on a Spark standalone cluster
    +./bin/spark-submit \
    +  --class org.apache.spark.examples.SparkPi \
    +  --master spark://207.184.161.138:7077 \
    +  --executor-memory 20G \
    +  --total-executor-cores 100 \
    +  /path/to/examples.jar \
    +  1000
    +
    +# Run on a YARN cluster
    +export HADOOP_CONF_DIR=XXX
    +./bin/spark-submit \
    +  --class org.apache.spark.examples.SparkPi \
    +  --master yarn-cluster \  # can also be `yarn-client` for client mode
    +  --executor-memory 20G \
    +  --num-executors 50 \
    +  /path/to/examples.jar \
    +  1000
    +
    +# Run a Python application on a cluster
    +./bin/spark-submit \
    +  --master spark://207.184.161.138:7077 \
    +  examples/src/main/python/pi.py \
    +  1000
    +{% endhighlight %}
    +
    +# Master URLs
    +
    +The master URL passed to Spark can be in one of the following formats:
    +
    +<table class="table">
    +<tr><th>Master URL</th><th>Meaning</th></tr>
    +<tr><td> local </td><td> Run Spark locally with one worker thread (i.e. no parallelism at all). </td></tr>
    +<tr><td> local[K] </td><td> Run Spark locally with K worker threads (ideally, set this to the number of cores on your machine). </td></tr>
    +<tr><td> local[*] </td><td> Run Spark locally with as many worker threads as logical cores on your machine.</td></tr>
    +<tr><td> spark://HOST:PORT </td><td> Connect to the given <a href="spark-standalone.html">Spark standalone
    +        cluster</a> master. The port must be whichever one your master is configured to use, which is 7077 by default.
    +</td></tr>
    +<tr><td> mesos://HOST:PORT </td><td> Connect to the given <a href="running-on-mesos.html">Mesos</a> cluster.
    +        The port must be whichever one your is configured to use, which is 5050 by default.
    +        Or, for a Mesos cluster using ZooKeeper, use <code>mesos://zk://...</code>.
    +</td></tr>
    +<tr><td> yarn-client </td><td> Connect to a <a href="running-on-yarn.html"> YARN </a> cluster in
    +client mode. The cluster location will be found based on the HADOOP_CONF_DIR variable.
    +</td></tr>
    +<tr><td> yarn-cluster </td><td> Connect to a <a href="running-on-yarn.html"> YARN </a> cluster in
    +cluster mode. The cluster location will be found based on HADOOP_CONF_DIR.
    +</td></tr>
    +</table>
    +
    +
    +# Loading Configuration from a File
    +
    +The `spark-submit` script can load default [Spark configuration values](configuration.html) from a
    +properties file and pass them on to your application. By default it will read options
    +from `conf/spark-defaults.conf` in the Spark directory. For more detail, see the section on
    +[loading default configurations](configuration.html#loading-default-configurations).
    +
    +Loading default Spark configurations this way can obviate the need for certain flags to
    +`spark-submit`. For instance, if the `spark.master` property is set, you can safely omit the
    +`--master` flag from `spark-submit`. In general, configuration values explicitly set on a
    +`SparkConf` take the highest precedence, then flags passed to `spark-submit`, then values in the
    +defaults file.
    +
    +If you are ever unclear where configuration options are coming from, you can print out fine-grained
    +debugging information by running `spark-submit` with the `--verbose` option.
    --- End diff --
    
    `spark-submit -h` doesn't show `--verbose` as an option, but `--verbose` does appear when running with invalid options. Should I file a bug to include it in spark-submit's normal help output?



[GitHub] spark pull request: [SPARK-1566] consolidate programming guide, an...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/896



[GitHub] spark pull request: [SPARK-1566] consolidate programming guide, an...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/896#issuecomment-44348007
  
     Merged build triggered. 



[GitHub] spark pull request: [SPARK-1566] consolidate programming guide, an...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/896#issuecomment-44350542
  
    Merged build started. 



[GitHub] spark pull request: [SPARK-1566] consolidate programming guide, an...

Posted by ash211 <gi...@git.apache.org>.
Github user ash211 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/896#discussion_r13116488
  
    --- Diff: docs/submitting-applications.md ---
    @@ -0,0 +1,153 @@
    +---
    +layout: global
    +title: Submitting Applications
    +---
    +
    +The `spark-submit` script in Spark's `bin` directory is used to launch applications on a cluster.
    +It can use all of Spark's supported [cluster managers](cluster-overview.html#cluster-manager-types)
    +through a uniform interface so you don't have to configure your application specially for each one.
    +
    +# Bundling Your Application's Dependencies
    +If your code depends on other projects, you will need to package them alongside
    +your application in order to distribute the code to a Spark cluster. To do this,
    +to create an assembly jar (or "uber" jar) containing your code and its dependencies. Both
    +[sbt](https://github.com/sbt/sbt-assembly) and
    +[Maven](http://maven.apache.org/plugins/maven-shade-plugin/)
    +have assembly plugins. When creating assembly jars, list Spark and Hadoop
    +as `provided` dependencies; these need not be bundled since they are provided by
    +the cluster manager at runtime. Once you have an assembled jar you can call the `bin/spark-submit`
    +script as shown here while passing your jar.
    +
    +For Python, you can use the `--py-files` argument of `spark-submit` to add `.py`, `.zip` or `.egg`
    +files to be distributed with your application. If you depend on multiple Python files we recommend
    +packaging them into a `.zip` or `.egg`.
    +
    +# Launching Applications with spark-submit
    +
    +Once a user application is bundled, it can be launched using the `bin/spark-submit` script.
    +This script takes care of setting up the classpath with Spark and its
    +dependencies, and can support different cluster managers and deploy modes that Spark supports:
    +
    +{% highlight bash %}
    +./bin/spark-submit \
    +  --class <main-class>
    +  --master <master-url> \
    +  --deploy-mode <deploy-mode> \
    +  ... # other options
    +  <application-jar> \
    +  [application-arguments]
    +{% endhighlight %}
    +
    +Some of the commonly used options are:
    +
    +* `--class`: The entry point for your application (e.g. `org.apache.spark.examples.SparkPi`)
    +* `--master`: The [master URL](#master-urls) for the cluster (e.g. `spark://23.195.26.187:7077`)
    +* `--deploy-mode`: Whether to deploy your driver program within the cluster or run it locally as an external client (either `cluster` or `client`)
    +* `application-jar`: Path to a bundled jar including your application and all dependencies. The URL must be globally visible inside of your cluster, for instance, an `hdfs://` path or a `file://` path that is present on all nodes.
    +* `application-arguments`: Arguments passed to the main method of your main class, if any
    +
    +For Python applications, simply pass a `.py` file in the place of `<application-jar>` instead of a JAR,
    +and add Python `.zip`, `.egg` or `.py` files to the search path with `--py-files`.
    +
    +To enumerate all options available to `spark-submit` run it with `--help`. Here are a few
    +examples of common options:
    +
    +{% highlight bash %}
    +# Run application locally on 8 cores
    +./bin/spark-submit \
    +  --class org.apache.spark.examples.SparkPi \
    +  --master local[8] \
    +  /path/to/examples.jar \
    +  100
    +
    +# Run on a Spark standalone cluster
    +./bin/spark-submit \
    +  --class org.apache.spark.examples.SparkPi \
    +  --master spark://207.184.161.138:7077 \
    +  --executor-memory 20G \
    +  --total-executor-cores 100 \
    +  /path/to/examples.jar \
    +  1000
    +
    +# Run on a YARN cluster
    +export HADOOP_CONF_DIR=XXX
    +./bin/spark-submit \
    +  --class org.apache.spark.examples.SparkPi \
    +  --master yarn-cluster \  # can also be `yarn-client` for client mode
    +  --executor-memory 20G \
    +  --num-executors 50 \
    +  /path/to/examples.jar \
    +  1000
    +
    +# Run a Python application on a cluster
    +./bin/spark-submit \
    +  --master spark://207.184.161.138:7077 \
    +  examples/src/main/python/pi.py \
    +  1000
    +{% endhighlight %}
    +
    +# Master URLs
    +
    +The master URL passed to Spark can be in one of the following formats:
    +
    +<table class="table">
    +<tr><th>Master URL</th><th>Meaning</th></tr>
    +<tr><td> local </td><td> Run Spark locally with one worker thread (i.e. no parallelism at all). </td></tr>
    +<tr><td> local[K] </td><td> Run Spark locally with K worker threads (ideally, set this to the number of cores on your machine). </td></tr>
    +<tr><td> local[*] </td><td> Run Spark locally with as many worker threads as logical cores on your machine.</td></tr>
    +<tr><td> spark://HOST:PORT </td><td> Connect to the given <a href="spark-standalone.html">Spark standalone
    +        cluster</a> master. The port must be whichever one your master is configured to use, which is 7077 by default.
    +</td></tr>
    +<tr><td> mesos://HOST:PORT </td><td> Connect to the given <a href="running-on-mesos.html">Mesos</a> cluster.
    +        The port must be whichever one your is configured to use, which is 5050 by default.
    +        Or, for a Mesos cluster using ZooKeeper, use <code>mesos://zk://...</code>.
    +</td></tr>
    +<tr><td> yarn-client </td><td> Connect to a <a href="running-on-yarn.html"> YARN </a> cluster in
    +client mode. The cluster location will be found based on the HADOOP_CONF_DIR variable.
    +</td></tr>
    +<tr><td> yarn-cluster </td><td> Connect to a <a href="running-on-yarn.html"> YARN </a> cluster in
    +cluster mode. The cluster location will be found based on HADOOP_CONF_DIR.
    +</td></tr>
    +</table>
    +
    +
    +# Loading Configuration from a File
    +
    +The `spark-submit` script can load default [Spark configuration values](configuration.html) from a
    +properties file and pass them on to your application. By default it will read options
    +from `conf/spark-defaults.conf` in the Spark directory. For more detail, see the section on
    +[loading default configurations](configuration.html#loading-default-configurations).
    +
    +Loading default Spark configurations this way can obviate the need for certain flags to
    +`spark-submit`. For instance, if the `spark.master` property is set, you can safely omit the
    +`--master` flag from `spark-submit`. In general, configuration values explicitly set on a
    +`SparkConf` take the highest precedence, then flags passed to `spark-submit`, then values in the
    +defaults file.
    +
    +If you are ever unclear where configuration options are coming from, you can print out fine-grained
    +debugging information by running `spark-submit` with the `--verbose` option.
    --- End diff --
    
    https://issues.apache.org/jira/browse/SPARK-1944



[GitHub] spark pull request: [SPARK-1566] consolidate programming guide, an...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/896#issuecomment-44369237
  
    Merged build started. 



[GitHub] spark pull request: [SPARK-1566] consolidate programming guide, an...

Posted by mateiz <gi...@git.apache.org>.
Github user mateiz commented on a diff in the pull request:

    https://github.com/apache/spark/pull/896#discussion_r13116882
  
    --- Diff: docs/running-on-mesos.md ---
    @@ -103,7 +103,7 @@ the `make-distribution.sh` script included in a Spark source tarball/checkout.
     ## Using a Mesos Master URL
     
     The Master URLs for Mesos are in the form `mesos://host:5050` for a single-master Mesos
    -cluster, or `zk://host:2181` for a multi-master Mesos cluster using ZooKeeper.
    +cluster, or `mesos://zk://host:2181` for a multi-master Mesos cluster using ZooKeeper.
    --- End diff --
    
    Yeah, I'd leave this for another time. In general, none of these are URLs anyway; perhaps they could be called URIs, but I don't even want to get into the difference ;)



[GitHub] spark pull request: [SPARK-1566] consolidate programming guide, an...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/896#issuecomment-44371354
  
    Merged build finished. 



[GitHub] spark pull request: [SPARK-1566] consolidate programming guide, an...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/896#issuecomment-44478690
  
     Merged build triggered. 



[GitHub] spark pull request: [SPARK-1566] consolidate programming guide, an...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/896#issuecomment-44451153
  
    All automated tests passed.
    Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/15259/



[GitHub] spark pull request: [SPARK-1566] consolidate programming guide, an...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/896#issuecomment-44349512
  
     Merged build triggered. 



[GitHub] spark pull request: [SPARK-1566] consolidate programming guide, an...

Posted by ash211 <gi...@git.apache.org>.
Github user ash211 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/896#discussion_r13116025
  
    --- Diff: docs/running-on-mesos.md ---
    @@ -103,7 +103,7 @@ the `make-distribution.sh` script included in a Spark source tarball/checkout.
     ## Using a Mesos Master URL
     
     The Master URLs for Mesos are in the form `mesos://host:5050` for a single-master Mesos
    -cluster, or `zk://host:2181` for a multi-master Mesos cluster using ZooKeeper.
    +cluster, or `mesos://zk://host:2181` for a multi-master Mesos cluster using ZooKeeper.
    --- End diff --
    
    I thought this change was wrong because the `MESOS_REGEX` at https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/SparkContext.scala#L1437 is `"""(mesos|zk)://.*""".r`, but after reading through the code, this change is actually correct.

    I'm not sure that `mesos://zk://host:2181` is a [valid URL](https://en.wikipedia.org/wiki/Uniform_resource_locator#Syntax), though. JDBC, for example, uses `jdbc:postgresql://` as in `jdbc:postgresql://localhost/test`, rather than `jdbc://postgresql://...`. An issue for a later time?
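
    The quoted pattern can be exercised on its own in the Scala REPL to
    see why both spellings match (a sketch of the regex only, not the
    surrounding dispatch logic in SparkContext):

        val MESOS_REGEX = """(mesos|zk)://.*""".r

        def scheme(master: String): Option[String] = master match {
          case MESOS_REGEX(prefix) => Some(prefix) // capture group 1
          case _                   => None
        }

        scheme("mesos://zk://host:2181") // Some(mesos) -- the corrected docs form
        scheme("zk://host:2181")         // Some(zk)    -- why the old form also worked
        scheme("spark://host:7077")      // None        -- handled by a different pattern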



[GitHub] spark pull request: [SPARK-1566] consolidate programming guide, an...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/896#issuecomment-44369235
  
     Merged build triggered. 



[GitHub] spark pull request: [SPARK-1566] consolidate programming guide, an...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/896#issuecomment-44348370
  
     Merged build triggered. 



[GitHub] spark pull request: [SPARK-1566] consolidate programming guide, an...

Posted by pwendell <gi...@git.apache.org>.
Github user pwendell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/896#discussion_r13108963
  
    --- Diff: docs/submitting-applications.md ---
    @@ -0,0 +1,153 @@
    +---
    +layout: global
    +title: Submitting Applications
    +---
    +
    +The `spark-submit` script in Spark's `bin` directory is used to launch applications on a cluster.
    +It can use all of Spark's supported [cluster managers](cluster-overview.html#cluster-manager-types)
    +through a uniform interface so you don't have to configure your application specially for each one.
    +
    +# Bundling Your Application's Dependencies
    +If your code depends on other projects, you will need to package them alongside
    +your application in order to distribute the code to a Spark cluster. To do this,
    +to create an assembly jar (or "uber" jar) containing your code and its dependencies. Both
    +[sbt](https://github.com/sbt/sbt-assembly) and
    +[Maven](http://maven.apache.org/plugins/maven-shade-plugin/)
    +have assembly plugins. When creating assembly jars, list Spark and Hadoop
    +as `provided` dependencies; these need not be bundled since they are provided by
    +the cluster manager at runtime. Once you have an assembled jar you can call the `bin/spark-submit`
    +script as shown here while passing your jar.
    +
    +For Python, you can use the `--py-files` argument of `spark-submit` to add `.py`, `.zip` or `.egg`
    +files to be distributed with your application. If you depend on multiple Python files we recommend
    +packaging them into a `.zip` or `.egg`.
    +
    +# Launching Applications with spark-submit
    +
    +Once a user application is bundled, it can be launched using the `bin/spark-submit` script
    --- End diff --
    
    This sentence needs a period. It wasn't part of your change, but I just noticed it.



[GitHub] spark pull request: [SPARK-1566] consolidate programming guide, an...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/896#issuecomment-44352723
  
    Merged build finished. All automated tests passed.



[GitHub] spark pull request: [SPARK-1566] consolidate programming guide, an...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/896#issuecomment-44348784
  
    
    Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/15236/



[GitHub] spark pull request: [SPARK-1566] consolidate programming guide, an...

Posted by pwendell <gi...@git.apache.org>.
Github user pwendell commented on the pull request:

    https://github.com/apache/spark/pull/896#issuecomment-44349180
  
    Hey @mateiz, looks great. I added two small comments, which I think were both typos unrelated to your patch.



[GitHub] spark pull request: [SPARK-1566] consolidate programming guide, an...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/896#issuecomment-44352097
  
    Merged build finished. All automated tests passed.

