Posted to user@spark.apache.org by Konstantin Kudryavtsev <ku...@gmail.com> on 2014/07/04 14:16:44 UTC

Unable to run Spark 1.0 SparkPi on HDP 2.0

Hi all,

I'm stuck on an issue with running the Spark Pi example on HDP 2.0.

I downloaded the Spark 1.0 pre-built package (for HDP2) from
http://spark.apache.org/downloads.html
and ran the example from the Spark web site:
./bin/spark-submit --class org.apache.spark.examples.SparkPi     --master
yarn-cluster --num-executors 3 --driver-memory 2g --executor-memory 2g
--executor-cores 1 ./lib/spark-examples-1.0.0-hadoop2.2.0.jar 2

I got this error:
Application application_1404470405736_0044 failed 3 times due to AM
Container for appattempt_1404470405736_0044_000003 exited with exitCode: 1
due to: Exception from container-launch:
org.apache.hadoop.util.Shell$ExitCodeException:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
at org.apache.hadoop.util.Shell.run(Shell.java:379)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
at
org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
at
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283)
at
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
.Failing this attempt.. Failing the application.

Unknown/unsupported param List(--executor-memory, 2048,
--executor-cores, 1, --num-executors, 3)
Usage: org.apache.spark.deploy.yarn.ApplicationMaster [options]
Options:
  --jar JAR_PATH       Path to your application's JAR file (required)
  --class CLASS_NAME   Name of your application's main class (required)

...bla-bla-bla


Any ideas? How can I make it work?

Thank you,
Konstantin Kudryavtsev

Re: Unable to run Spark 1.0 SparkPi on HDP 2.0

Posted by vs <vi...@gmail.com>.
The Hortonworks Tech Preview of Spark is for Spark on YARN. It does not
require Spark to be installed manually on all nodes. The Spark assembly jar
you submit contains all of Spark's dependencies, and YARN will instantiate
the Spark Application Master and containers from that jar.
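
As a rough sketch of that flow (assuming the SPARK_JAR environment variable
from the Spark 1.0 YARN docs; the HDFS path below is just a placeholder):

  # put the assembly where YARN's distributed cache can localize it for every container
  hdfs dfs -put lib/spark-assembly-1.0.0-hadoop2.2.0.jar /user/spark/share/
  export SPARK_JAR=hdfs:///user/spark/share/spark-assembly-1.0.0-hadoop2.2.0.jar

  # the application jar (the SparkPi examples jar here) is shipped alongside the assembly
  ./bin/spark-submit --class org.apache.spark.examples.SparkPi \
      --master yarn-cluster \
      ./lib/spark-examples-1.0.0-hadoop2.2.0.jar 10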




Re: Unable to run Spark 1.0 SparkPi on HDP 2.0

Posted by Sean Owen <so...@cloudera.com>.
On Tue, Jul 8, 2014 at 2:01 AM, DB Tsai <db...@dbtsai.com> wrote:

> Actually, the one needed to install the jar to each individual node is
> standalone mode which works for both MR1 and MR2. Cloudera and
> Hortonworks currently support spark in this way as far as I know.
>

(CDH5 uses Spark on YARN.)

Re: Unable to run Spark 1.0 SparkPi on HDP 2.0

Posted by DB Tsai <db...@dbtsai.com>.
Actually, the mode that requires installing the jar on each individual node is
standalone mode, which works for both MR1 and MR2. Cloudera and
Hortonworks currently support Spark in this way, as far as I know.

For both yarn-cluster and yarn-client modes, Spark will distribute the jars
through the distributed cache, and each executor can find the jars there.
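
For example (a sketch only; the class name and jar paths are hypothetical),
extra application jars passed with --jars ride the same distributed-cache mechanism:

  # spark-submit uploads the assembly, the app jar and the --jars entries to the YARN
  # staging directory; NodeManagers then localize them for the AM and each executor
  ./bin/spark-submit --class com.example.MyApp \
      --master yarn-cluster \
      --jars /local/path/dep1.jar,/local/path/dep2.jar \
      /local/path/myapp.jar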

On Jul 7, 2014 6:23 AM, "Chester @work" <ch...@alpinenow.com> wrote:
>
> In YARN cluster mode, you can either have Spark on all the cluster nodes or supply the Spark jar yourself. In the 2nd case, you don't need to install Spark on the cluster at all, as you supply the Spark assembly as well as your app jar together.
>
> I hope this makes it clear
>
> Chester
>
> Sent from my iPhone

Re: Unable to run Spark 1.0 SparkPi on HDP 2.0

Posted by Konstantin Kudryavtsev <ku...@gmail.com>.
Hi Chester,

Thank you very much, it is clear now - just two different ways to support
Spark on a cluster.

Thank you,
Konstantin Kudryavtsev


On Mon, Jul 7, 2014 at 3:22 PM, Chester @work <ch...@alpinenow.com> wrote:

> In YARN cluster mode, you can either have Spark on all the cluster nodes
> or supply the Spark jar yourself. In the 2nd case, you don't need to install
> Spark on the cluster at all, as you supply the Spark assembly as well as
> your app jar together.
>
> I hope this makes it clear
>
> Chester
>
> Sent from my iPhone

Re: Unable to run Spark 1.0 SparkPi on HDP 2.0

Posted by "Chester @work" <ch...@alpinenow.com>.
In YARN cluster mode, you can either have Spark on all the cluster nodes or supply the Spark jar yourself. In the 2nd case, you don't need to install Spark on the cluster at all, as you supply the Spark assembly as well as your app jar together.

I hope this makes it clear

Chester

Sent from my iPhone

> On Jul 7, 2014, at 5:05 AM, Konstantin Kudryavtsev <ku...@gmail.com> wrote:
> 
> thank you Krishna!
> 
> Could you please explain why do I need install spark on each node if Spark official site said: If you have a Hadoop 2 cluster, you can run Spark without any installation needed
> 
> I have HDP 2 (YARN) and that's why I hope I don't need to install spark on each node 
> 
> Thank you,
> Konstantin Kudryavtsev

Re: Unable to run Spark 1.0 SparkPi on HDP 2.0

Posted by Konstantin Kudryavtsev <ku...@gmail.com>.
thank you Krishna!

Could you please explain why I need to install Spark on each node if the Spark
official site says: If you have a Hadoop 2 cluster, you can run Spark
without any installation needed

I have HDP 2 (YARN), and that's why I hope I don't need to install Spark on
each node

Thank you,
Konstantin Kudryavtsev


On Mon, Jul 7, 2014 at 1:57 PM, Krishna Sankar <ks...@gmail.com> wrote:

> Konstantin,
>
>    1. You need to install the hadoop rpms on all nodes. If it is Hadoop
>    2, the nodes would have hdfs & YARN.
>    2. Then you need to install Spark on all nodes. I haven't had
>    experience with HDP, but the tech preview might have installed Spark as
>    well.
>    3. In the end, one should have hdfs,yarn & spark installed on all the
>    nodes.
>    4. After installations, check the web console to make sure hdfs, yarn
>    & spark are running.
>    5. Then you are ready to start experimenting/developing spark
>    applications.
>
> HTH.
> Cheers
> <k/>

Re: Unable to run Spark 1.0 SparkPi on HDP 2.0

Posted by Krishna Sankar <ks...@gmail.com>.
Konstantin,

   1. You need to install the hadoop rpms on all nodes. If it is Hadoop 2,
   the nodes would have hdfs & YARN.
   2. Then you need to install Spark on all nodes. I haven't had experience
   with HDP, but the tech preview might have installed Spark as well.
   3. In the end, one should have hdfs,yarn & spark installed on all the
   nodes.
   4. After installations, check the web console to make sure hdfs, yarn &
   spark are running.
   5. Then you are ready to start experimenting/developing spark
   applications.

HTH.
Cheers
<k/>
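
A few command-line spot checks for steps 3-4, assuming stock Hadoop 2 defaults
(hostnames and ports will differ per cluster):

  yarn node -list          # NodeManagers registered with the ResourceManager
  hdfs dfsadmin -report    # DataNodes and HDFS capacity
  # ResourceManager web UI is usually at http://<rm-host>:8088,
  # NameNode web UI at http://<nn-host>:50070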


On Mon, Jul 7, 2014 at 2:34 AM, Konstantin Kudryavtsev <
kudryavtsev.konstantin@gmail.com> wrote:

> guys, I'm not talking about running spark on VM, I don have problem with
> it.
>
> I confused in the next:
> 1) Hortonworks describe installation process as RPMs on each node
> 2) spark home page said that everything I need is YARN
>
> And I'm in stucj with understanding what I need to do to run spark on yarn
> (do I need RPMs installations or only build spark on edge node?)
>
>
> Thank you,
> Konstantin Kudryavtsev
>

Re: Unable to run Spark 1.0 SparkPi on HDP 2.0

Posted by Konstantin Kudryavtsev <ku...@gmail.com>.
Guys, I'm not talking about running Spark on a VM, I don't have a problem with that.

What confuses me is the following:
1) Hortonworks describes the installation process as RPMs on each node
2) the Spark home page says that everything I need is YARN

And I'm stuck trying to understand what I need to do to run Spark on YARN
(do I need RPM installations, or only a Spark build on the edge node?)


Thank you,
Konstantin Kudryavtsev


On Mon, Jul 7, 2014 at 4:34 AM, Robert James <sr...@gmail.com> wrote:

> I can say from my experience that getting Spark to work with Hadoop 2
> is not for the beginner; after solving one problem after another
> (dependencies, scripts, etc.), I went back to Hadoop 1.
>
> Spark's Maven, ec2 scripts, and others all use Hadoop 1 - not sure
> why, but, given so, Hadoop 2 has too many bumps

Re: Unable to run Spark 1.0 SparkPi on HDP 2.0

Posted by Robert James <sr...@gmail.com>.
I can say from my experience that getting Spark to work with Hadoop 2
is not for the beginner; after solving one problem after another
(dependencies, scripts, etc.), I went back to Hadoop 1.

Spark's Maven build, ec2 scripts, and others all default to Hadoop 1 - not sure
why, but, given that, Hadoop 2 has too many bumps.
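
For what it's worth, the Hadoop version can be overridden at build time; roughly
along these lines for a YARN/Hadoop 2.2 build (per the Spark 1.0 build docs as I
recall - a sketch, not a recipe):

  # build Spark against Hadoop 2.2 with YARN support instead of the default Hadoop 1.x
  mvn -Pyarn -Dhadoop.version=2.2.0 -Dyarn.version=2.2.0 -DskipTests clean package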

On 7/6/14, Marco Shaw <ma...@gmail.com> wrote:
> That is confusing based on the context you provided.
>
> This might take more time than I can spare to try to understand.
>
> For sure, you need to add Spark to run it in/on the HDP 2.1 express VM.
>
> Cloudera's CDH 5 express VM includes Spark, but the service isn't running by
> default.
>
> I can't remember for MapR...
>
> Marco
>
>> On Jul 6, 2014, at 6:33 PM, Konstantin Kudryavtsev
>> <ku...@gmail.com> wrote:
>>
>> Marco,
>>
>> Hortonworks provides a Tech Preview of Spark 0.9.1 with HDP 2.1 that you
>> can try
>> from
>> http://hortonworks.com/wp-content/uploads/2014/05/SparkTechnicalPreview.pdf
>>  HDP 2.1 means YARN, at the same time they propose ti install rpm
>>
>> On other hand, http://spark.apache.org/ said "
>> Integrated with Hadoop
>> Spark can run on Hadoop 2's YARN cluster manager, and can read any
>> existing Hadoop data.
>>
>> If you have a Hadoop 2 cluster, you can run Spark without any installation
>> needed. "
>>
>> And this is confusing for me... do I need rpm installation on not?...
>>
>>
>> Thank you,
>> Konstantin Kudryavtsev
>>
>>
>>> On Sun, Jul 6, 2014 at 10:56 PM, Marco Shaw <ma...@gmail.com>
>>> wrote:
>>> Can you provide links to the sections that are confusing?
>>>
>>> My understanding, the HDP1 binaries do not need YARN, while the HDP2
>>> binaries do.
>>>
>>> Now, you can also install Hortonworks Spark RPM...
>>>
>>> For production, in my opinion, RPMs are better for manageability.
>>>
>>>> On Jul 6, 2014, at 5:39 PM, Konstantin Kudryavtsev
>>>> <ku...@gmail.com> wrote:
>>>>
>>>> Hello, thanks for your message... I'm confused, Hortonworhs suggest
>>>> install spark rpm on each node, but on Spark main page said that yarn
>>>> enough and I don't need to install it... What the difference?
>>>>
>>>> sent from my HTC
>>>>
>>>>> On Jul 6, 2014 8:34 PM, "vs" <vi...@gmail.com> wrote:
>>>>> Konstantin,
>>>>>
>>>>> HWRK provides a Tech Preview of Spark 0.9.1 with HDP 2.1 that you can
>>>>> try
>>>>> from
>>>>> http://hortonworks.com/wp-content/uploads/2014/05/SparkTechnicalPreview.pdf
>>>>>
>>>>> Let me know if you see issues with the tech preview.
>>>>>
>>>>> "spark PI example on HDP 2.0
>>>>>
>>>>> I downloaded spark 1.0 pre-build from
>>>>> http://spark.apache.org/downloads.html
>>>>> (for HDP2)
>>>>> The run example from spark web-site:
>>>>> ./bin/spark-submit --class org.apache.spark.examples.SparkPi
>>>>> --master
>>>>> yarn-cluster --num-executors 3 --driver-memory 2g --executor-memory 2g
>>>>> --executor-cores 1 ./lib/spark-examples-1.0.0-hadoop2.2.0.jar 2
>>>>>
>>>>> I got error:
>>>>> Application application_1404470405736_0044 failed 3 times due to AM
>>>>> Container for appattempt_1404470405736_0044_000003 exited with
>>>>> exitCode: 1
>>>>> due to: Exception from container-launch:
>>>>> org.apache.hadoop.util.Shell$ExitCodeException:
>>>>> at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
>>>>> at org.apache.hadoop.util.Shell.run(Shell.java:379)
>>>>> at
>>>>> org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
>>>>> at
>>>>> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
>>>>> at
>>>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283)
>>>>> at
>>>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79)
>>>>> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>>>>> at
>>>>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>> at
>>>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>> at java.lang.Thread.run(Thread.java:744)
>>>>> .Failing this attempt.. Failing the application.
>>>>>
>>>>> Unknown/unsupported param List(--executor-memory, 2048,
>>>>> --executor-cores, 1,
>>>>> --num-executors, 3)
>>>>> Usage: org.apache.spark.deploy.yarn.ApplicationMaster [options]
>>>>> Options:
>>>>>   --jar JAR_PATH       Path to your application's JAR file (required)
>>>>>   --class CLASS_NAME   Name of your application's main class
>>>>> (required)
>>>>> ...bla-bla-bla
>>>>> "
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> View this message in context:
>>>>> http://apache-spark-user-list.1001560.n3.nabble.com/Unable-to-run-Spark-1-0-SparkPi-on-HDP-2-0-tp8802p8873.html
>>>>> Sent from the Apache Spark User List mailing list archive at
>>>>> Nabble.com.
>>
>

Re: Unable to run Spark 1.0 SparkPi on HDP 2.0

Posted by Marco Shaw <ma...@gmail.com>.
That is confusing based on the context you provided. 

This might take more time than I can spare to try to understand. 

For sure, you need to add Spark to run it in/on the HDP 2.1 express VM. 

Cloudera's CDH 5 express VM includes Spark, but the service isn't running by default. 

I can't remember for MapR...

Marco

> On Jul 6, 2014, at 6:33 PM, Konstantin Kudryavtsev <ku...@gmail.com> wrote:
> 
> Marco,
> 
> Hortonworks provides a Tech Preview of Spark 0.9.1 with HDP 2.1 that you can try
> from
> http://hortonworks.com/wp-content/uploads/2014/05/SparkTechnicalPreview.pdf  HDP 2.1 means YARN, at the same time they propose ti install rpm
> 
> On other hand, http://spark.apache.org/ said "
> Integrated with Hadoop
> Spark can run on Hadoop 2's YARN cluster manager, and can read any existing Hadoop data.
> 
> If you have a Hadoop 2 cluster, you can run Spark without any installation needed. "
> 
> And this is confusing for me... do I need rpm installation on not?...
> 
> 
> Thank you,
> Konstantin Kudryavtsev

Re: Unable to run Spark 1.0 SparkPi on HDP 2.0

Posted by Konstantin Kudryavtsev <ku...@gmail.com>.
Marco,

Hortonworks provides a Tech Preview of Spark 0.9.1 with HDP 2.1 that you
can try
from
http://hortonworks.com/wp-content/uploads/2014/05/SparkTechnicalPreview.pdf
HDP 2.1 means YARN, but at the same time they propose to install an RPM.

On the other hand, http://spark.apache.org/ says "
Integrated with Hadoop

Spark can run on Hadoop 2's YARN cluster manager, and can read any existing
Hadoop data.
If you have a Hadoop 2 cluster, you can run Spark without any installation
needed. "

And this is confusing for me... do I need the RPM installation or not?...


Thank you,
Konstantin Kudryavtsev


On Sun, Jul 6, 2014 at 10:56 PM, Marco Shaw <ma...@gmail.com> wrote:

> Can you provide links to the sections that are confusing?
>
> My understanding, the HDP1 binaries do not need YARN, while the HDP2
> binaries do.
>
> Now, you can also install Hortonworks Spark RPM...
>
> For production, in my opinion, RPMs are better for manageability.

Re: Unable to run Spark 1.0 SparkPi on HDP 2.0

Posted by Marco Shaw <ma...@gmail.com>.
Can you provide links to the sections that are confusing?

My understanding is that the HDP1 binaries do not need YARN, while the HDP2 binaries do.

Now, you can also install the Hortonworks Spark RPM...

For production, in my opinion, RPMs are better for manageability. 
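
If you want to avoid per-node installs altogether, one middle ground is to put
the assembly on HDFS once and let YARN's distributed cache hand it to the
containers. A rough sketch -- the HDFS paths are made up, and the jar name is
assumed from the 1.0.0 Hadoop2 download, so adjust both to your setup:

# One-time upload of the assembly from the unpacked tarball.
hdfs dfs -mkdir -p /apps/spark
hdfs dfs -put lib/spark-assembly-1.0.0-hadoop2.2.0.jar /apps/spark/
# On any gateway, point the YARN client at the shared copy instead of a
# locally installed one before calling spark-submit.
export SPARK_JAR=hdfs:///apps/spark/spark-assembly-1.0.0-hadoop2.2.0.jar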

> On Jul 6, 2014, at 5:39 PM, Konstantin Kudryavtsev <ku...@gmail.com> wrote:
> 
> Hello, thanks for your message... I'm confused, Hortonworhs suggest install spark rpm on each node, but on Spark main page said that yarn enough and I don't need to install it... What the difference?
> 
> sent from my HTC
> 
>> On Jul 6, 2014 8:34 PM, "vs" <vi...@gmail.com> wrote:
>> Konstantin,
>> 
>> HWRK provides a Tech Preview of Spark 0.9.1 with HDP 2.1 that you can try
>> from
>> http://hortonworks.com/wp-content/uploads/2014/05/SparkTechnicalPreview.pdf
>> 
>> Let me know if you see issues with the tech preview.
>> 
>> "spark PI example on HDP 2.0
>> 
>> I downloaded spark 1.0 pre-build from http://spark.apache.org/downloads.html
>> (for HDP2)
>> The run example from spark web-site:
>> ./bin/spark-submit --class org.apache.spark.examples.SparkPi     --master
>> yarn-cluster --num-executors 3 --driver-memory 2g --executor-memory 2g
>> --executor-cores 1 ./lib/spark-examples-1.0.0-hadoop2.2.0.jar 2
>> 
>> I got error:
>> Application application_1404470405736_0044 failed 3 times due to AM
>> Container for appattempt_1404470405736_0044_000003 exited with exitCode: 1
>> due to: Exception from container-launch:
>> org.apache.hadoop.util.Shell$ExitCodeException:
>> at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
>> at org.apache.hadoop.util.Shell.run(Shell.java:379)
>> at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
>> at
>> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
>> at
>> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283)
>> at
>> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79)
>> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>> at
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>> at
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>> at java.lang.Thread.run(Thread.java:744)
>> .Failing this attempt.. Failing the application.
>> 
>> Unknown/unsupported param List(--executor-memory, 2048, --executor-cores, 1,
>> --num-executors, 3)
>> Usage: org.apache.spark.deploy.yarn.ApplicationMaster [options]
>> Options:
>>   --jar JAR_PATH       Path to your application's JAR file (required)
>>   --class CLASS_NAME   Name of your application's main class (required)
>> ...bla-bla-bla
>> "
>> 
>> 
>> 
>> --
>> View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Unable-to-run-Spark-1-0-SparkPi-on-HDP-2-0-tp8802p8873.html
>> Sent from the Apache Spark User List mailing list archive at Nabble.com.

Re: Unable to run Spark 1.0 SparkPi on HDP 2.0

Posted by Konstantin Kudryavtsev <ku...@gmail.com>.
Hello, thanks for your message... I'm confused: Hortonworks suggests installing
the Spark RPM on each node, but the Spark main page says that YARN is enough and
I don't need to install anything... What's the difference?

sent from my HTC
On Jul 6, 2014 8:34 PM, "vs" <vi...@gmail.com> wrote:

> Konstantin,
>
> HWRK provides a Tech Preview of Spark 0.9.1 with HDP 2.1 that you can try
> from
> http://hortonworks.com/wp-content/uploads/2014/05/SparkTechnicalPreview.pdf
>
> Let me know if you see issues with the tech preview.
>
> "spark PI example on HDP 2.0
>
> I downloaded spark 1.0 pre-build from
> http://spark.apache.org/downloads.html
> (for HDP2)
> The run example from spark web-site:
> ./bin/spark-submit --class org.apache.spark.examples.SparkPi     --master
> yarn-cluster --num-executors 3 --driver-memory 2g --executor-memory 2g
> --executor-cores 1 ./lib/spark-examples-1.0.0-hadoop2.2.0.jar 2
>
> I got error:
> Application application_1404470405736_0044 failed 3 times due to AM
> Container for appattempt_1404470405736_0044_000003 exited with exitCode: 1
> due to: Exception from container-launch:
> org.apache.hadoop.util.Shell$ExitCodeException:
> at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
> at org.apache.hadoop.util.Shell.run(Shell.java:379)
> at
> org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
> at
>
> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
> at
>
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283)
> at
>
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at
>
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at
>
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:744)
> .Failing this attempt.. Failing the application.
>
> Unknown/unsupported param List(--executor-memory, 2048, --executor-cores,
> 1,
> --num-executors, 3)
> Usage: org.apache.spark.deploy.yarn.ApplicationMaster [options]
> Options:
>   --jar JAR_PATH       Path to your application's JAR file (required)
>   --class CLASS_NAME   Name of your application's main class (required)
> ...bla-bla-bla
> "
>
>
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Unable-to-run-Spark-1-0-SparkPi-on-HDP-2-0-tp8802p8873.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>

Re: Unable to run Spark 1.0 SparkPi on HDP 2.0

Posted by vs <vi...@gmail.com>.
Konstantin,

HWRK provides a Tech Preview of Spark 0.9.1 with HDP 2.1 that you can try from
http://hortonworks.com/wp-content/uploads/2014/05/SparkTechnicalPreview.pdf

Let me know if you see issues with the tech preview.

"spark PI example on HDP 2.0

I downloaded spark 1.0 pre-build from http://spark.apache.org/downloads.html
(for HDP2)
The run example from spark web-site:
./bin/spark-submit --class org.apache.spark.examples.SparkPi     --master
yarn-cluster --num-executors 3 --driver-memory 2g --executor-memory 2g
--executor-cores 1 ./lib/spark-examples-1.0.0-hadoop2.2.0.jar 2

I got error:
Application application_1404470405736_0044 failed 3 times due to AM
Container for appattempt_1404470405736_0044_000003 exited with exitCode: 1
due to: Exception from container-launch:
org.apache.hadoop.util.Shell$ExitCodeException:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
at org.apache.hadoop.util.Shell.run(Shell.java:379)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
at
org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
at
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283)
at
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
.Failing this attempt.. Failing the application.

Unknown/unsupported param List(--executor-memory, 2048, --executor-cores, 1,
--num-executors, 3)
Usage: org.apache.spark.deploy.yarn.ApplicationMaster [options] 
Options:
  --jar JAR_PATH       Path to your application's JAR file (required)
  --class CLASS_NAME   Name of your application's main class (required)
...bla-bla-bla
"



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Unable-to-run-Spark-1-0-SparkPi-on-HDP-2-0-tp8802p8873.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.