Posted to user@spark.apache.org by Yunjie Ji <jy...@163.com> on 2017/02/28 01:18:33 UTC
Run spark machine learning example on Yarn failed
After starting DFS, YARN and Spark, I ran this command from the root
directory of Spark on my master host:
`MASTER=yarn ./bin/run-example ml.LogisticRegressionExample
data/mllib/sample_libsvm_data.txt`
I took this command from Spark's README. Here is the source code of
LogisticRegressionExample on GitHub:
https://github.com/apache/spark/blob/master/examples/src/main/scala/org/apache/spark/examples/ml/LogisticRegressionExample.scala
Then an error occurs:
`Exception in thread "main" org.apache.spark.sql.AnalysisException: Path
does not exist:
hdfs://master:9000/user/root/data/mllib/sample_libsvm_data.txt;`
First, I don't know why the path is `hdfs://master:9000/user/root`. I did set
the namenode's address to `hdfs://master:9000`, but why did Spark choose the
directory `/user/root`?
Then I created a directory `/user/root/data/mllib/sample_libsvm_data.txt` on
every host of the cluster, hoping Spark would find the file. But the same
error occurs again.
--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Run-spark-machine-learning-example-on-Yarn-failed-tp28435.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscribe@spark.apache.org
Re: Run spark machine learning example on Yarn failed
Posted by Femi Anthony <fe...@gmail.com>.
Have you tried specifying an absolute path instead of a relative one?
Femi
Re: Run spark machine learning example on Yarn failed
Posted by Marco Mistroni <mm...@gmail.com>.
Or place the file in S3 and provide the S3 path.
Kr
Re: Run spark machine learning example on Yarn failed
Posted by Jörn Franke <jo...@gmail.com>.
You do not need to place it in a local directory on every node. Just use `hadoop fs -put` to put it on HDFS. Alternatively, as others suggested, use S3.
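For context, a relative path such as `data/mllib/sample_libsvm_data.txt` is resolved against the default filesystem and the user's HDFS home directory, which defaults to `/user/<username>`; that is why the error mentions `hdfs://master:9000/user/root`. A minimal sketch of the upload follows, assuming the `hadoop` CLI is on the PATH and the command is run from the Spark root directory as the same user that submits the job. The function name `upload_to_hdfs` is a hypothetical helper for illustration, not part of Hadoop or Spark.

```shell
# Hypothetical helper (not part of Hadoop or Spark): copies one local file
# into an HDFS directory, creating the directory first if needed.
# A relative HDFS path is resolved against the user's HDFS home directory
# (by default /user/<username>, e.g. /user/root when running as root).
upload_to_hdfs() {
  local_file="$1"   # path on the local filesystem
  hdfs_dir="$2"     # target directory on HDFS (relative or absolute)
  hadoop fs -mkdir -p "$hdfs_dir" &&
    hadoop fs -put "$local_file" "$hdfs_dir/"
}

# Example call, run from the Spark root directory:
# upload_to_hdfs data/mllib/sample_libsvm_data.txt data/mllib
```

After the upload, the original `run-example` command should be able to resolve `data/mllib/sample_libsvm_data.txt` on HDFS, assuming the job runs as the same user; `hadoop fs -ls data/mllib` can be used to check that the file is there.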