Posted to user@giraph.apache.org by Stefan Beskow <St...@sas.com> on 2014/02/19 22:12:56 UTC

Problem running SimpleShortestPathsComputation on CDH5 with Hadoop 2.2.0

Hi.

I checked out and built Giraph for Cloudera CDH5 with Hadoop 2.2.0 using the following:
git clone git://git.apache.org/giraph.git snapshot_from_git
cd snapshot_from_git
mvn -Phadoop_yarn -Dhadoop.version=2.2.0 clean package -DskipTests

When I run the sample application org.apache.giraph.examples.SimpleShortestPathsComputation I get the following exception:

LogType: gam-stdout.log
Log Contents:
2014-02-19 15:00:48,751 INFO  [main] yarn.GiraphApplicationMaster (GiraphApplicationMaster.java:main(421)) - Starting GitaphAM
2014-02-19 15:00:49,746 INFO  [main] yarn.GiraphApplicationMaster (GiraphApplicationMaster.java:<init>(168)) - GiraphAM  for ContainerId container_1392713839733_0013_02_000001 ApplicationAttemptId appattempt_1392713839733_0013_000002
2014-02-19 15:00:49,782 INFO  [main] yarn.GiraphApplicationMaster (GiraphApplicationMaster.java:run(198)) - Forcefully terminating executors with done =:false
2014-02-19 15:00:49,783 INFO  [main] yarn.GiraphApplicationMaster (GiraphApplicationMaster.java:finish(212)) - Application completed. Stopping running containers
2014-02-19 15:00:49,783 ERROR [main] yarn.GiraphApplicationMaster (GiraphApplicationMaster.java:main(437)) - GiraphApplicationMaster caught a top-level exception in main.
java.lang.NullPointerException
        at org.apache.giraph.yarn.GiraphApplicationMaster.finish(GiraphApplicationMaster.java:213)
        at org.apache.giraph.yarn.GiraphApplicationMaster.run(GiraphApplicationMaster.java:201)
        at org.apache.giraph.yarn.GiraphApplicationMaster.main(GiraphApplicationMaster.java:433)

Here is the command I used to run the application:
hadoop jar giraph-1.1.0-SNAPSHOT-for-hadoop-2.2.0-jar-with-dependencies.jar org.apache.giraph.GiraphRunner org.apache.giraph.examples.SimpleShortestPathsComputation
-vif org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat
-vip /user/stbesk/input/tiny-graph.txt
-vof org.apache.giraph.io.formats.IdWithValueTextOutputFormat
-op /user/stbesk/output/shortestpaths
-w 2
-ca giraph.SplitMasterWorker=true
-ca giraph.zkList=localhost:2181
-yj giraph-1.1.0-SNAPSHOT-for-hadoop-2.2.0-jar-with-dependencies.jar,giraph-examples-1.1.0-SNAPSHOT-for-hadoop-2.2.0-jar-with-dependencies.jar

What am I doing wrong? I'd appreciate any suggestions for what to try next.

Thanks.
Stefan


RE: Problem running SimpleShortestPathsComputation on CDH5 with Hadoop 2.2.0

Posted by Stefan Beskow <St...@sas.com>.
Hi Kristen.

That sounds like a very good idea. However, our cluster administrator is not keen on adding third-party jar files to the hadoop/lib folder because of the parcel structure in CDH5, so he asked me to explore all other options first. Alexandre Fonseca is looking into this problem and indicated that he might have a patch for it later today.

Thanks.
Stefan

From: Kristen Hardwick [mailto:khardwick@spryinc.com]
Sent: Thursday, February 20, 2014 3:53 PM
To: user@giraph.apache.org
Subject: Re: Problem running SimpleShortestPathsComputation on CDH5 with Hadoop 2.2.0

Hi Stefan,

I'm not sure if it is correct, but I will share what I did to get past the classpath issue in my environment.

I tried the -yj parameter that has been suggested here, but that didn't work for me either. To fix it, I had to copy the jar files from giraph-core/target and giraph-examples/target into the Hadoop lib folder with the following commands:

cp giraph-core/target/*.jar /usr/lib/hadoop/lib
cp giraph-examples/target/*.jar /usr/lib/hadoop/lib

This put the required classes on the classpath and eliminated my ClassNotFound exceptions. Hopefully that will work for you too. Good luck!

Kristen Hardwick



Re: Problem running SimpleShortestPathsComputation on CDH5 with Hadoop 2.2.0

Posted by Kristen Hardwick <kh...@spryinc.com>.
Hi Stefan,

I'm not sure if it is correct, but I will share what I did to get past the
classpath issue in my environment.

I tried the -yj parameter that has been suggested here, but that didn't
work for me either. To fix it, I had to copy the jar files from
giraph-core/target
and giraph-examples/target into the Hadoop lib folder with the following
commands:

cp giraph-core/target/*.jar /usr/lib/hadoop/lib
cp giraph-examples/target/*.jar /usr/lib/hadoop/lib

This put the required classes on the classpath and eliminated my
ClassNotFound exceptions. Hopefully that will work for you too. Good luck!

Kristen Hardwick


On Thu, Feb 20, 2014 at 2:20 PM, Stefan Beskow <St...@sas.com> wrote:

> Hi Alex.
>
> I tried that, but unfortunately in my case it didn't work so I'm still
> getting this exception:
>
> Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/giraph/yarn/GiraphYarnTask
> Caused by: java.lang.ClassNotFoundException: org.apache.giraph.yarn.GiraphYarnTask
>         at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
>         at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
> Could not find the main class: org.apache.giraph.yarn.GiraphYarnTask.  Program will exit.
>
> In case I missed something here is what I did:
>
> 1. Copied giraph-examples-1.1.0-SNAPSHOT-for-hadoop-2.2.0-cdh5.0.0-beta-2-jar-with-dependencies.jar to a directory called /users/stbesk/snapshot_from_git/jars
> 2. Updated hadoop classpath: export HADOOP_CLASSPATH=/users/stbesk/snapshot_from_git/jars
> 3. Updated command to only use giraph-examples-1.1.0-SNAPSHOT-for-hadoop-2.2.0-cdh5.0.0-beta-2-jar-with-dependencies.jar since it contains all dependencies:
>
> hadoop jar /users/stbesk/snapshot_from_git/jars/giraph-examples-1.1.0-SNAPSHOT-for-hadoop-2.2.0-cdh5.0.0-beta-2-jar-with-dependencies.jar
> org.apache.giraph.GiraphRunner org.apache.giraph.examples.SimpleShortestPathsComputation
> -vif org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat
> -vip /user/stbesk/input/tiny-graph.txt
> -vof org.apache.giraph.io.formats.IdWithValueTextOutputFormat
> -op /user/stbesk/output/shortestpathsC2
> -w 2
> -ca giraph.SplitMasterWorker=true
> -ca giraph.zkList=localhost:2181
> -yj giraph-examples-1.1.0-SNAPSHOT-for-hadoop-2.2.0-cdh5.0.0-beta-2-jar-with-dependencies.jar
>
> Did I miss something?
>
> Thanks.
> Stefan

RE: Problem running SimpleShortestPathsComputation on CDH5 with Hadoop 2.2.0

Posted by Stefan Beskow <St...@sas.com>.
Hi Alex.

I tried that, but unfortunately it didn't work in my case, so I'm still getting this exception:

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/giraph/yarn/GiraphYarnTask
Caused by: java.lang.ClassNotFoundException: org.apache.giraph.yarn.GiraphYarnTask
        at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
Could not find the main class: org.apache.giraph.yarn.GiraphYarnTask.  Program will exit.

In case I missed something, here is what I did:

1. Copied giraph-examples-1.1.0-SNAPSHOT-for-hadoop-2.2.0-cdh5.0.0-beta-2-jar-with-dependencies.jar to a directory called /users/stbesk/snapshot_from_git/jars
2. Updated hadoop classpath: export HADOOP_CLASSPATH=/users/stbesk/snapshot_from_git/jars
3. Updated command to only use giraph-examples-1.1.0-SNAPSHOT-for-hadoop-2.2.0-cdh5.0.0-beta-2-jar-with-dependencies.jar since it contains all dependencies:

hadoop jar /users/stbesk/snapshot_from_git/jars/giraph-examples-1.1.0-SNAPSHOT-for-hadoop-2.2.0-cdh5.0.0-beta-2-jar-with-dependencies.jar 
org.apache.giraph.GiraphRunner org.apache.giraph.examples.SimpleShortestPathsComputation 
-vif org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat 
-vip /user/stbesk/input/tiny-graph.txt 
-vof org.apache.giraph.io.formats.IdWithValueTextOutputFormat 
-op /user/stbesk/output/shortestpathsC2 
-w 2 
-ca giraph.SplitMasterWorker=true 
-ca giraph.zkList=localhost:2181 
-yj giraph-examples-1.1.0-SNAPSHOT-for-hadoop-2.2.0-cdh5.0.0-beta-2-jar-with-dependencies.jar
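
One way to narrow down a ClassNotFoundException like the one above is to confirm that the jar passed to -yj really contains org.apache.giraph.yarn.GiraphYarnTask. The following is only a minimal sketch: the helper name is hypothetical (it is not part of Giraph or Hadoop), and it simply reads a jar listing on standard input, such as the output of `jar tf`. A positive result only rules out a build problem; it does not show that the jar actually reached the YARN containers.

```shell
# Hypothetical helper: reads a jar listing on stdin (for example from `jar tf`)
# and succeeds only if the Giraph YARN task class appears in it.
jar_listing_has_yarn_task() {
  grep -q 'org/apache/giraph/yarn/GiraphYarnTask\.class'
}

# Usage against the jar from the steps above (requires a JDK for `jar`):
#   JAR=/users/stbesk/snapshot_from_git/jars/giraph-examples-1.1.0-SNAPSHOT-for-hadoop-2.2.0-cdh5.0.0-beta-2-jar-with-dependencies.jar
#   jar tf "$JAR" | jar_listing_has_yarn_task && echo "class present" || echo "class missing"
```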

Did I miss something?

Thanks.
Stefan


-----Original Message-----
From: Alexandre Fonseca [mailto:alexandrejorgefonseca@gmail.com] 
Sent: Thursday, February 20, 2014 1:11 PM
To: user@giraph.apache.org
Subject: Re: Problem running SimpleShortestPathsComputation on CDH5 with Hadoop 2.2.0

The yarn version of Giraph is quite finicky when it comes to detecting the giraph jar. When it can't find the jar on the new containers you get that exception.

After much experimentation I've had great success with the following (substitute paths and filenames where needed):

HADOOP_CLASSPATH=/data/b.ajf/giraph hadoop jar /data/b.ajf/giraph/giraph.jar org.apache.giraph.GiraphRunner org.apache.giraph.examples.SimpleShortestPathsComputation -vip /user/b.ajf/ssp_input -vif org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat -op /user/b.ajf/ssp_output -vof org.apache.giraph.io.formats.IdWithValueTextOutputFormat -yj giraph.jar

The argument supplied to -yj must be only the name of the jar (no directories). The jar must be directly accessible from the current classpath or subdirectories (but it doesn't work if you give the full path of the jar, e.g, HADOOP_CLASSPATH=/data/b.ajf/giraph/giraph.jar).

Also, if you use giraph-examples-with-dependencies you don't need the other one as the examples-with-dependencies already contain the core.

Cheers,
Alex




Re: Problem running SimpleShortestPathsComputation on CDH5 with Hadoop 2.2.0

Posted by Alexandre Fonseca <al...@gmail.com>.
The YARN version of Giraph is quite finicky when it comes to detecting the
Giraph jar. When it can't find the jar on the new containers, you get that
exception.

After much experimentation I've had great success with the following 
(substitute paths and filenames where needed):

HADOOP_CLASSPATH=/data/b.ajf/giraph hadoop jar /data/b.ajf/giraph/giraph.jar 
org.apache.giraph.GiraphRunner 
org.apache.giraph.examples.SimpleShortestPathsComputation -vip 
/user/b.ajf/ssp_input -vif 
org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat -op 
/user/b.ajf/ssp_output -vof 
org.apache.giraph.io.formats.IdWithValueTextOutputFormat -yj giraph.jar

The argument supplied to -yj must be only the file name of the jar (no
directories). The jar must be reachable from a directory on the current
classpath or one of its subdirectories; it doesn't work if you put the full
path of the jar on the classpath, e.g., HADOOP_CLASSPATH=/data/b.ajf/giraph/giraph.jar.
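
The rule above can be sketched as a small shell helper. This is only an illustration: the function name is hypothetical (nothing like it ships with Giraph), and it just splits a local jar path into the directory to put on HADOOP_CLASSPATH and the bare file name to hand to -yj.

```shell
# Hypothetical helper (not part of Giraph): given the local path of the jar,
# print the directory for HADOOP_CLASSPATH and the bare file name for -yj.
giraph_yj_settings() {
  jar_path=$1
  printf 'HADOOP_CLASSPATH=%s\n' "$(dirname "$jar_path")"
  printf '%s %s\n' -yj "$(basename "$jar_path")"
}

# Example with the paths used in the command above.
giraph_yj_settings /data/b.ajf/giraph/giraph.jar
```

The first printed line is the environment setting (a directory, not the jar itself) and the second is the -yj argument (a file name with no directories).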

Also, if you use giraph-examples-with-dependencies you don't need the core
jar as well, since the examples-with-dependencies jar already contains the core.

Cheers,
Alex

On Thursday, February 20, 2014 05:42:45 PM Stefan Beskow wrote:
> Thanks Roman for getting back to me so quickly. I checked with Cloudera and
> they have a maven repository for CDH5 (version 2.2.0-cdh5.0.0-beta-2) as
> shown below:
> 
> <repository>
>       <id>cloudera</id>
>       <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
> </repository>
> 
> I added that to the pom.xml file and ran a new build using the following
> command:
> 
> mvn -Phadoop_yarn -Dhadoop.version=2.2.0-cdh5.0.0-beta-2 clean package -DskipTests
> 
> The program now gets further, but throws the following exception:
> 
> Container: container_1392713839733_0017_01_000002 on el01cn04.unx.sas.com_8041
> ================================================================================
> LogType: task-2-stderr.log
> LogLength: 637
> Log Contents:
> Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/giraph/yarn/GiraphYarnTask
> Caused by: java.lang.ClassNotFoundException: org.apache.giraph.yarn.GiraphYarnTask
>         at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
>         at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
> Could not find the main class: org.apache.giraph.yarn.GiraphYarnTask.  Program will exit.
> 
> Do you have any additional suggestions for what I should try?
> 
> Thanks for your help.
> Stefan

RE: Problem running SimpleShortestPathsComputation on CDH5 with Hadoop 2.2.0

Posted by Stefan Beskow <St...@sas.com>.
Thanks, Roman, for getting back to me so quickly. I checked with Cloudera and they have a Maven repository for CDH5 (version 2.2.0-cdh5.0.0-beta-2) as shown below:

<repository>
      <id>cloudera</id>
      <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
</repository>

I added that to the pom.xml file and ran a new build using the following command:

mvn -Phadoop_yarn -Dhadoop.version=2.2.0-cdh5.0.0-beta-2 clean package -DskipTests

The program now gets further, but throws the following exception:

Container: container_1392713839733_0017_01_000002 on el01cn04.unx.sas.com_8041
================================================================================
LogType: task-2-stderr.log
LogLength: 637
Log Contents:
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/giraph/yarn/GiraphYarnTask
Caused by: java.lang.ClassNotFoundException: org.apache.giraph.yarn.GiraphYarnTask
        at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
Could not find the main class: org.apache.giraph.yarn.GiraphYarnTask.  Program will exit.

Do you have any additional suggestions for what I should try?

Thanks for your help.
Stefan

-----Original Message-----
From: shaposhnik@gmail.com [mailto:shaposhnik@gmail.com] On Behalf Of Roman Shaposhnik
Sent: Wednesday, February 19, 2014 4:43 PM
To: user@giraph.apache.org
Subject: Re: Problem running SimpleShortestPathsComputation on CDH5 with Hadoop 2.2.0

On Wed, Feb 19, 2014 at 1:12 PM, Stefan Beskow <St...@sas.com> wrote:
> Hi.
>
> I checked out and built Giraph for Cloudera CDH5 with Hadoop 2.2.0 using the
> following:
>
> git clone git://git.apache.org/giraph.git snapshot_from_git
> cd snapshot_from_git
> mvn -Phadoop_yarn -Dhadoop.version=2.2.0 clean package -DskipTests

It may help to build against CDH5 directly by:
   * manually adding repository.cloudera.com to the set of repos
   * specifying -Dhadoop.version=2.2.0-cdh5.0.0-beta-2

> When I run the sample application
> org.apache.giraph.examples.SimpleShortestPathsComputation I get the 
> following exception:

You need to provide way more logs from the YARN side for us to make sense of it.

Thanks,
Roman.



Re: Problem running SimpleShortestPathsComputation on CDH5 with Hadoop 2.2.0

Posted by Roman Shaposhnik <ro...@shaposhnik.org>.
On Wed, Feb 19, 2014 at 1:12 PM, Stefan Beskow <St...@sas.com> wrote:
> Hi.
>
> I checked out and built Giraph for Cloudera CDH5 with Hadoop 2.2.0 using the
> following:
>
> git clone git://git.apache.org/giraph.git snapshot_from_git
> cd snapshot_from_git
> mvn -Phadoop_yarn -Dhadoop.version=2.2.0 clean package -DskipTests

It may help to build against CDH5 directly by:
   * manually adding repository.cloudera.com to the set of repos
   * specifying -Dhadoop.version=2.2.0-cdh5.0.0-beta-2

> When I run the sample application
> org.apache.giraph.examples.SimpleShortestPathsComputation I get the
> following exception:

You need to provide way more logs from the YARN side for us to
make sense of it.

Thanks,
Roman.
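
As a rough sketch of how the YARN-side logs requested above might be collected: assuming log aggregation is enabled on the cluster, `yarn logs -applicationId` prints the aggregated container logs for an application. The helper below is hypothetical (it is not part of Hadoop or Giraph); it derives the application ID from a container ID, relying on the container_<clusterTimestamp>_<appNumber>_<attempt>_<container> naming visible in the logs in this thread.

```shell
# Hypothetical helper: turn a container ID such as
# container_1392713839733_0013_02_000001 into application_1392713839733_0013.
app_id_from_container() {
  printf '%s\n' "$1" | sed -E 's/^container_([0-9]+)_([0-9]+)_.*$/application_\1_\2/'
}

# Container ID taken from the gam-stdout.log in the first message.
app_id_from_container container_1392713839733_0013_02_000001

# On a cluster node (assumes log aggregation is enabled):
#   yarn logs -applicationId "$(app_id_from_container container_1392713839733_0013_02_000001)"
```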