Posted to user@spark.apache.org by "Nick R. Katsipoulakis" <ka...@cs.pitt.edu> on 2014/07/09 18:27:57 UTC

Apache Spark, Hadoop 2.2.0 without Yarn Integration

Hello,

I am currently learning Apache Spark and I want to see how it integrates
with an existing Hadoop Cluster.

My current Hadoop installation is version 2.2.0 without Yarn. I have built
Apache Spark (v1.0.0) following the instructions in the README file, setting
only SPARK_HADOOP_VERSION=1.2.1. I also export HADOOP_CONF_DIR to point to
Hadoop's configuration directory.
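
For reference, the build steps described above might look like the following
sketch (paths are placeholders, and the sbt invocation follows the Spark 1.0
README). Note that SPARK_HADOOP_VERSION=1.2.1 produces a Hadoop 1.x client,
which cannot talk to a Hadoop 2.2.0 HDFS, so building against 2.2.0 is the
likely fix:

```shell
# Point Spark at the cluster's Hadoop configuration (adjust the path).
export HADOOP_CONF_DIR=/etc/hadoop/conf

# Build against the Hadoop version actually running on the cluster.
# 1.2.1 would yield a Hadoop 1.x client, incompatible with 2.2.0 HDFS.
SPARK_HADOOP_VERSION=2.2.0 sbt/sbt assembly
```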

My use-case is the Linear Least Squares MLlib example of Apache Spark
(link:
http://spark.apache.org/docs/latest/mllib-linear-methods.html#linear-least-squares-lasso-and-ridge-regression).
The only difference in my code is that I load the text file from HDFS.
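
For concreteness, a sketch of that example with the input moved onto HDFS,
against the Spark 1.0 MLlib API; the namenode address and file path below are
placeholders, not values from this thread:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.regression.{LabeledPoint, LinearRegressionWithSGD}

object LinearLeastSquaresOnHdfs {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("LinearLeastSquaresOnHdfs"))

    // Same parsing as the docs example, but the file lives on HDFS.
    // hdfs://namenode:9000/... is a placeholder for your cluster's namenode.
    val data = sc.textFile("hdfs://namenode:9000/user/nick/lpsa.data")
    val parsedData = data.map { line =>
      val parts = line.split(',')
      LabeledPoint(parts(0).toDouble,
        Vectors.dense(parts(1).split(' ').map(_.toDouble)))
    }.cache()

    // Train a linear least-squares model with SGD, as in the MLlib guide.
    val model = LinearRegressionWithSGD.train(parsedData, 100)

    // Evaluate training error.
    val mse = parsedData.map { p =>
      val err = p.label - model.predict(p.features)
      err * err
    }.mean()
    println("training Mean Squared Error = " + mse)

    sc.stop()
  }
}
```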

However, I get a "Runtime Exception: Error in configuring object."

So my question is the following:

Does Spark work with a Hadoop distribution without Yarn?
If yes, am I doing it right? If no, can I build Spark with
SPARK_HADOOP_VERSION=2.2.0 and with SPARK_YARN=false?

Thank you,
Nick

Re: Apache Spark, Hadoop 2.2.0 without Yarn Integration

Posted by Sandy Ryza <sa...@cloudera.com>.
Compiling with YARN set to true is not required for Spark to work with
Hadoop 2.2.0 in standalone mode.
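
Concretely, a non-YARN deployment might look like the following
standalone-mode sketch (the hostname is a placeholder; the scripts are the
standard ones shipped in Spark's sbin/ and bin/ directories):

```shell
# Start a standalone master, then workers listed in conf/slaves
# (no YARN involved at any point).
./sbin/start-master.sh
./sbin/start-slaves.sh

# Run against the standalone master; HDFS access still works as long
# as Spark was built against the cluster's Hadoop version.
./bin/spark-shell --master spark://master-host:7077
```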

-Sandy

On Fri, Jan 2, 2015 at 12:06 PM, Moep <ni...@googlemail.com> wrote:

> Well that's confusing. I have the same issue. So you're saying I have to
> compile Spark with Yarn set to true to make it work with Hadoop 2.2.0 in
> Standalone mode?

Re: Apache Spark, Hadoop 2.2.0 without Yarn Integration

Posted by Moep <ni...@googlemail.com>.
Well that's confusing. I have the same issue. So you're saying I have to
compile Spark with Yarn set to true to make it work with Hadoop 2.2.0 in
Standalone mode?



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Apache-Spark-Hadoop-2-2-0-without-Yarn-Integration-tp9202p20947.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org


Re: Apache Spark, Hadoop 2.2.0 without Yarn Integration

Posted by "Nick R. Katsipoulakis" <ka...@cs.pitt.edu>.
Krishna,

Ok, thank you. I just wanted to make sure that this can be done.

Cheers,
Nick


On Wed, Jul 9, 2014 at 3:30 PM, Krishna Sankar <ks...@gmail.com> wrote:

> Nick,
>    AFAIK, you can compile with yarn=true and still run spark in stand
> alone cluster mode.
> Cheers
> <k/>

Re: Apache Spark, Hadoop 2.2.0 without Yarn Integration

Posted by Krishna Sankar <ks...@gmail.com>.
Nick,
   AFAIK, you can compile with yarn=true and still run spark in stand alone
cluster mode.
Cheers
<k/>

