Posted to commits@spark.apache.org by pw...@apache.org on 2013/09/12 04:31:01 UTC

[1/4] git commit: Updated Spark on Mesos documentation.

Updated Branches:
  refs/heads/branch-0.8 ef19bc6cb -> 2f5898eaf


Updated Spark on Mesos documentation.


Project: http://git-wip-us.apache.org/repos/asf/incubator-spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-spark/commit/a0f0c1be
Tree: http://git-wip-us.apache.org/repos/asf/incubator-spark/tree/a0f0c1be
Diff: http://git-wip-us.apache.org/repos/asf/incubator-spark/diff/a0f0c1be

Branch: refs/heads/branch-0.8
Commit: a0f0c1bed23d800c56e0b1637ef267ef94eb6103
Parents: 91a59e6
Author: Benjamin Hindman <be...@gmail.com>
Authored: Wed Sep 11 16:05:25 2013 -0700
Committer: Benjamin Hindman <be...@gmail.com>
Committed: Wed Sep 11 16:05:25 2013 -0700

----------------------------------------------------------------------
 docs/running-on-mesos.md | 33 ++++++++++++++++-----------------
 1 file changed, 16 insertions(+), 17 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-spark/blob/a0f0c1be/docs/running-on-mesos.md
----------------------------------------------------------------------
diff --git a/docs/running-on-mesos.md b/docs/running-on-mesos.md
index eee7a45..443350c 100644
--- a/docs/running-on-mesos.md
+++ b/docs/running-on-mesos.md
@@ -3,24 +3,23 @@ layout: global
 title: Running Spark on Mesos
 ---
 
-Spark can run on private clusters managed by the [Apache Mesos](http://incubator.apache.org/mesos/) resource manager. Follow the steps below to install Mesos and Spark:
-
-1. Download and build Spark using the instructions [here](index.html).
-2. Download Mesos {{site.MESOS_VERSION}} from a [mirror](http://www.apache.org/dyn/closer.cgi/incubator/mesos/mesos-{{site.MESOS_VERSION}}/).
-3. Configure Mesos using the `configure` script, passing the location of your `JAVA_HOME` using `--with-java-home`. Mesos comes with "template" configure scripts for different platforms, such as `configure.macosx`, that you can run. See the README file in Mesos for other options. **Note:** If you want to run Mesos without installing it into the default paths on your system (e.g. if you don't have administrative privileges to install it), you should also pass the `--prefix` option to `configure` to tell it where to install. For example, pass `--prefix=/home/user/mesos`. By default the prefix is `/usr/local`.
-4. Build Mesos using `make`, and then install it using `make install`.
-5. Create a file called `spark-env.sh` in Spark's `conf` directory, by copying `conf/spark-env.sh.template`, and add the following lines it:
-   * `export MESOS_NATIVE_LIBRARY=<path to libmesos.so>`. This path is usually `<prefix>/lib/libmesos.so` (where the prefix is `/usr/local` by default). Also, on Mac OS X, the library is called `libmesos.dylib` instead of `.so`.
-6. Copy Spark and Mesos to the _same_ paths on all the nodes in the cluster (or, for Mesos, `make install` on every node).
-7. Configure Mesos for deployment:
-   * On your master node, edit `<prefix>/var/mesos/deploy/masters` to list your master and `<prefix>/var/mesos/deploy/slaves` to list the slaves, where `<prefix>` is the prefix where you installed Mesos (`/usr/local` by default).
-   * On all nodes, edit `<prefix>/var/mesos/conf/mesos.conf` and add the line `master=HOST:5050`, where HOST is your master node.
-   * Run `<prefix>/sbin/mesos-start-cluster.sh` on your master to start Mesos. If all goes well, you should see Mesos's web UI on port 8080 of the master machine.
-   * See Mesos's README file for more information on deploying it.
-8. To run a Spark application against the cluster, when you create your `SparkContext`, pass the string `mesos://HOST:5050` as the first parameter, where `HOST` is the machine running your Mesos master. In addition, pass the location of Spark on your nodes as the third parameter, and a list of JAR files containing your JAR's code as the fourth (these will automatically get copied to the workers). For example:
+Spark can run on clusters managed by [Apache Mesos](http://mesos.apache.org/). Follow the steps below to install Mesos and Spark:
+
+1. Download and build Spark using the instructions [here](index.html). **Note:** Don't forget to consider what version of HDFS you might want to use!
+2. Download, build, install, and start Mesos {{site.MESOS_VERSION}} on your cluster. You can download the Mesos distribution from a [mirror](http://www.apache.org/dyn/closer.cgi/mesos/{{site.MESOS_VERSION}}/). See the Mesos [Getting Started](http://mesos.apache.org/gettingstarted) page for more information. **Note:** If you want to run Mesos without installing it into the default paths on your system (e.g., if you don't have administrative privileges to install it), you should also pass the `--prefix` option to `configure` to tell it where to install. For example, pass `--prefix=/home/user/mesos`. By default the prefix is `/usr/local`.
+3. Create a Spark "distribution" using `make-distribution.sh`.
+4. Rename the `dist` directory created from `make-distribution.sh` to `spark-{{site.SPARK_VERSION}}`.
+5. Create a `tar` archive: `tar czf spark-{{site.SPARK_VERSION}}.tar.gz spark-{{site.SPARK_VERSION}}`
+6. Upload this archive to your HDFS or another place accessible from Mesos via `http://`, e.g., [Amazon Simple Storage Service](http://aws.amazon.com/s3): `hadoop fs -put spark-{{site.SPARK_VERSION}}.tar.gz /path/to/spark-{{site.SPARK_VERSION}}.tar.gz`
+7. Create a file called `spark-env.sh` in Spark's `conf` directory, by copying `conf/spark-env.sh.template`, and add the following lines to it:
+   * `export MESOS_NATIVE_LIBRARY=<path to libmesos.so>`. This path is usually `<prefix>/lib/libmesos.so` (where the prefix is `/usr/local` by default, see above). Also, on Mac OS X, the library is called `libmesos.dylib` instead of `libmesos.so`.
+   * `export SPARK_EXECUTOR_URI=<path to spark-{{site.SPARK_VERSION}}.tar.gz uploaded above>`.
+   * `export MASTER=mesos://HOST:PORT` where HOST:PORT is the host and port (default: 5050) of your Mesos master (or `zk://...` if using Mesos with ZooKeeper).
+8. To run a Spark application against the cluster, when you create your `SparkContext`, pass the string `mesos://HOST:PORT` as the first parameter. In addition, you'll need to set the `spark.executor.uri` property. For example
 
 {% highlight scala %}
-new SparkContext("mesos://HOST:5050", "My App Name", "/home/user/spark", List("my-app.jar"))
+System.setProperty("spark.executor.uri", "<path to spark-{{site.SPARK_VERSION}}.tar.gz uploaded above>")
+val sc = new SparkContext("mesos://HOST:5050", "App Name", ...)
 {% endhighlight %}
 
 If you want to run Spark on Amazon EC2, you can use the Spark [EC2 launch scripts](ec2-scripts.html), which provide an easy way to launch a cluster with Mesos, Spark, and HDFS pre-configured. This will get you a cluster in about five minutes without any configuration on your part.
@@ -52,6 +51,6 @@ Again, this must be done *before* initializing a SparkContext.
 
 You can run Spark and Mesos alongside your existing Hadoop cluster by just launching them as a separate service on the machines. To access Hadoop data from Spark, just use a hdfs:// URL (typically `hdfs://<namenode>:9000/path`, but you can find the right URL on your Hadoop Namenode's web UI).
 
-In addition, it is possible to also run Hadoop MapReduce on Mesos, to get better resource isolation and sharing between the two. In this case, Mesos will act as a unified scheduler that assigns cores to either Hadoop or Spark, as opposed to having them share resources via the Linux scheduler on each node. Please refer to the Mesos wiki page on [Running Hadoop on Mesos](https://github.com/mesos/mesos/wiki/Running-Hadoop-on-Mesos).
+In addition, it is possible to also run Hadoop MapReduce on Mesos, to get better resource isolation and sharing between the two. In this case, Mesos will act as a unified scheduler that assigns cores to either Hadoop or Spark, as opposed to having them share resources via the Linux scheduler on each node. Please refer to [Hadoop on Mesos](https://github.com/mesos/hadoop).
 
 In either case, HDFS runs separately from Hadoop MapReduce, without going through Mesos.
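
Steps 4, 5, and 7 of the updated instructions can be sketched as a small shell helper. This is only an illustration, not part of the commit: the function names `package_spark_dist` and `spark_env_lines` and their arguments are hypothetical, and the script assumes it is run from the Spark source directory after `make-distribution.sh` has created `dist`. The upload in step 6 is left as a comment because it depends on where the archive is hosted.

```shell
#!/bin/sh
# Illustrative sketch only; function names and arguments are hypothetical.

# Steps 4 and 5: rename dist/ (created by make-distribution.sh) and archive it.
# $1: Spark version string, standing in for {{site.SPARK_VERSION}}.
package_spark_dist() {
  name="spark-$1"
  mv dist "$name" || return 1
  tar czf "$name.tar.gz" "$name" || return 1
  # Print the archive name so the caller can upload it.
  echo "$name.tar.gz"
}

# Step 6 (not run here): upload the archive, for example
#   hadoop fs -put spark-<version>.tar.gz /path/to/spark-<version>.tar.gz

# Step 7: print the lines to add to conf/spark-env.sh.
# $1: path to libmesos.so (libmesos.dylib on Mac OS X)
# $2: location of the archive uploaded in step 6
# $3: HOST:PORT of the Mesos master
spark_env_lines() {
  printf 'export MESOS_NATIVE_LIBRARY=%s\n' "$1"
  printf 'export SPARK_EXECUTOR_URI=%s\n' "$2"
  printf 'export MASTER=mesos://%s\n' "$3"
}
```

One possible use, with `0.8.0` as an example version: run `package_spark_dist 0.8.0`, upload the resulting archive as in step 6, then run `spark_env_lines /usr/local/lib/libmesos.so "<uploaded archive location>" HOST:5050 >> conf/spark-env.sh`. The library path shown is the default-prefix location mentioned in step 7.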

[3/4] git commit: Merge pull request #927 from benh/mesos-docs

Posted by pw...@apache.org.

Merge pull request #927 from benh/mesos-docs

Updated Spark on Mesos documentation.

Project: http://git-wip-us.apache.org/repos/asf/incubator-spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-spark/commit/58c7d8b1
Tree: http://git-wip-us.apache.org/repos/asf/incubator-spark/tree/58c7d8b1
Diff: http://git-wip-us.apache.org/repos/asf/incubator-spark/diff/58c7d8b1

Branch: refs/heads/branch-0.8
Commit: 58c7d8b13875536f1b091a8fdc462ab52d075e36
Parents: 91a59e6 8e2602d
Author: Matei Zaharia <ma...@gmail.com>
Authored: Wed Sep 11 17:33:42 2013 -0700
Committer: Matei Zaharia <ma...@gmail.com>
Committed: Wed Sep 11 17:33:42 2013 -0700

----------------------------------------------------------------------
 docs/running-on-mesos.md | 33 ++++++++++++++++-----------------
 1 file changed, 16 insertions(+), 17 deletions(-)
----------------------------------------------------------------------

[2/4] git commit: More updates to Spark on Mesos documentation.

Posted by pw...@apache.org.

More updates to Spark on Mesos documentation.


Project: http://git-wip-us.apache.org/repos/asf/incubator-spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-spark/commit/8e2602dd
Tree: http://git-wip-us.apache.org/repos/asf/incubator-spark/tree/8e2602dd
Diff: http://git-wip-us.apache.org/repos/asf/incubator-spark/diff/8e2602dd

Branch: refs/heads/branch-0.8
Commit: 8e2602dd7033deded36d225250f30d980bfa6ecd
Parents: a0f0c1b
Author: Benjamin Hindman <be...@gmail.com>
Authored: Wed Sep 11 16:08:54 2013 -0700
Committer: Benjamin Hindman <be...@gmail.com>
Committed: Wed Sep 11 16:08:54 2013 -0700

----------------------------------------------------------------------
 docs/running-on-mesos.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-spark/blob/8e2602dd/docs/running-on-mesos.md
----------------------------------------------------------------------
diff --git a/docs/running-on-mesos.md b/docs/running-on-mesos.md
index 443350c..322ff58 100644
--- a/docs/running-on-mesos.md
+++ b/docs/running-on-mesos.md
@@ -10,12 +10,12 @@ Spark can run on clusters managed by [Apache Mesos](http://mesos.apache.org/). F
 3. Create a Spark "distribution" using `make-distribution.sh`.
 4. Rename the `dist` directory created from `make-distribution.sh` to `spark-{{site.SPARK_VERSION}}`.
 5. Create a `tar` archive: `tar czf spark-{{site.SPARK_VERSION}}.tar.gz spark-{{site.SPARK_VERSION}}`
-6. Upload this archive to your HDFS or another place accessible from Mesos via `http://`, e.g., [Amazon Simple Storage Service](http://aws.amazon.com/s3): `hadoop fs -put spark-{{site.SPARK_VERSION}}.tar.gz /path/to/spark-{{site.SPARK_VERSION}}.tar.gz`
+6. Upload this archive to HDFS or another place accessible from Mesos via `http://`, e.g., [Amazon Simple Storage Service](http://aws.amazon.com/s3): `hadoop fs -put spark-{{site.SPARK_VERSION}}.tar.gz /path/to/spark-{{site.SPARK_VERSION}}.tar.gz`
 7. Create a file called `spark-env.sh` in Spark's `conf` directory, by copying `conf/spark-env.sh.template`, and add the following lines to it:
    * `export MESOS_NATIVE_LIBRARY=<path to libmesos.so>`. This path is usually `<prefix>/lib/libmesos.so` (where the prefix is `/usr/local` by default, see above). Also, on Mac OS X, the library is called `libmesos.dylib` instead of `libmesos.so`.
    * `export SPARK_EXECUTOR_URI=<path to spark-{{site.SPARK_VERSION}}.tar.gz uploaded above>`.
    * `export MASTER=mesos://HOST:PORT` where HOST:PORT is the host and port (default: 5050) of your Mesos master (or `zk://...` if using Mesos with ZooKeeper).
-8. To run a Spark application against the cluster, when you create your `SparkContext`, pass the string `mesos://HOST:PORT` as the first parameter. In addition, you'll need to set the `spark.executor.uri` property. For example
+8. To run a Spark application against the cluster, when you create your `SparkContext`, pass the string `mesos://HOST:PORT` as the first parameter. In addition, you'll need to set the `spark.executor.uri` property. For example:
 
 {% highlight scala %}
 System.setProperty("spark.executor.uri", "<path to spark-{{site.SPARK_VERSION}}.tar.gz uploaded above>")

[4/4] git commit: Merge remote-tracking branch 'origin/master' into branch-0.8

Posted by pw...@apache.org.

Merge remote-tracking branch 'origin/master' into branch-0.8


Project: http://git-wip-us.apache.org/repos/asf/incubator-spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-spark/commit/2f5898ea
Tree: http://git-wip-us.apache.org/repos/asf/incubator-spark/tree/2f5898ea
Diff: http://git-wip-us.apache.org/repos/asf/incubator-spark/diff/2f5898ea

Branch: refs/heads/branch-0.8
Commit: 2f5898eaf0853811849430cbc95cf1fc91e2708c
Parents: ef19bc6 58c7d8b
Author: Patrick Wendell <pw...@gmail.com>
Authored: Wed Sep 11 19:30:31 2013 -0700
Committer: Patrick Wendell <pw...@gmail.com>
Committed: Wed Sep 11 19:30:31 2013 -0700

----------------------------------------------------------------------
 docs/running-on-mesos.md | 33 ++++++++++++++++-----------------
 1 file changed, 16 insertions(+), 17 deletions(-)
----------------------------------------------------------------------