Posted to user@giraph.apache.org by Nicholas Karkoulias <ni...@gmail.com> on 2013/07/07 16:29:13 UTC

GiraphApplicationMaster not found (and other newbie questions)

Hello everyone.

This is my first message to an Apache mailing list, so please excuse 
(and correct) possibly incorrect usage or netiquette issues :)

I am a newbie to both Hadoop and Giraph (but not to Linux). After 
overcoming _several_ configuration-related hurdles, I have successfully 
built both (versions Hadoop 2.0.5-alpha and Giraph 1.1.0-SNAPSHOT, using 
the patch for issue #688), and seem to have a properly working 
HDFS/Hadoop installation in pseudo-distributed mode. I can run Hadoop 
code just fine.

However, Giraph fails during the computation (seemingly just before 
returning or writing the result – the time elapsed before the crash 
differs, depending on which example I try to run). See below for the 
error. I don't know whether it is a bug, or me doing something wrong.

I'm using the YARN-enabled version of Giraph (and, thus, an external 
ZooKeeper service), assuming that Giraph will completely move to YARN 
_eventually_. (Is that correct?)

Also, two other (somewhat unrelated) questions:
(1) When is the file conf/giraph-site.xml actually parsed/used? Is it 
read by Hadoop? Should it be copied somewhere? I tried setting the 
ZooKeeper host:port in that file (giraph.zkList), but it was ignored and 
I finally had to add the property in the command line shown below... Any 
relevant documentation? (In general, documentation for certain Hadoop 
features, such as the configuration files, seems to be lacking...)
(2) What's the correct process to submit a patch with a really simple 
typo correction in a string? (perhaps I should just contact the file 
author – it's nothing important)
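On question (1), one thing that might be worth trying is Hadoop's generic -conf option, which takes an XML configuration file on the command line. The sketch below assumes GiraphRunner handles -conf the same way it handled -D in the command shown in this message (both are Hadoop generic options); nothing in this thread confirms that. GIRAPH_JAR, GIRAPH_SITE and giraphrunner_conf are hypothetical names used only for illustration.

```shell
# Sketch only: GIRAPH_JAR and GIRAPH_SITE are hypothetical shell variables,
# not Giraph or Hadoop settings. The default jar path is the one used in this message.
GIRAPH_JAR="${GIRAPH_JAR:-/tmp/software/giraph/giraph-examples/target/giraph-examples-1.1.0-SNAPSHOT-for-hadoop-2.0.5-alpha-jar-with-dependencies.jar}"
GIRAPH_SITE="${GIRAPH_SITE:-conf/giraph-site.xml}"

# Like the giraphrunner function in this message, but passes the XML file
# through -conf instead of setting giraph.zkList with -D.
function giraphrunner_conf() {
  hadoop jar "$GIRAPH_JAR" org.apache.giraph.GiraphRunner -conf "$GIRAPH_SITE" "$@"
}
```

It would be called like giraphrunner, for example: giraphrunner_conf org.apache.giraph.examples.SimplePageRankComputation -vip /dir/tiny_graph.txt -w 1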

I'll append the shell commands I used (after $) and the output at the 
end of this message.

Thank you in advance,
Nicholas


$ function giraphrunner(){ hadoop jar \
    /tmp/software/giraph/giraph-examples/target/giraph-examples-1.1.0-SNAPSHOT-for-hadoop-2.0.5-alpha-jar-with-dependencies.jar \
    org.apache.giraph.GiraphRunner -Dgiraph.zkList=localhost:2181 "$@"; }

$ time giraphrunner org.apache.giraph.examples.SimplePageRankComputation \
    -vif org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat \
    -vip /dir/tiny_graph.txt \
    -of org.apache.giraph.io.formats.IdWithValueTextOutputFormat \
    -op /dir/simplepagerank -w 1
13/07/07 15:06:15 INFO utils.ConfigurationUtils: No edge input format specified. Ensure your InputFormat does not require one.
13/07/07 15:06:15 INFO yarn.GiraphYarnClient: Final output path is: hdfs://localhost:9000/dir/simplepagerank
13/07/07 15:06:15 INFO service.AbstractService: Service:org.apache.hadoop.yarn.client.YarnClientImpl is inited.
13/07/07 15:06:15 INFO service.AbstractService: Service:org.apache.hadoop.yarn.client.YarnClientImpl is started.
13/07/07 15:06:15 INFO yarn.GiraphYarnClient: Defaulting per-task heap size to 1024MB.
13/07/07 15:06:15 INFO yarn.GiraphYarnClient: Obtained new Application ID: application_1372875746593_0018
13/07/07 15:06:15 WARN conf.Configuration: mapred.job.id is deprecated. Instead, use mapreduce.job.id
13/07/07 15:06:15 WARN conf.Configuration: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
13/07/07 15:06:15 INFO yarn.YarnUtils: Registered file in LocalResources: giraph-conf.xml
13/07/07 15:06:16 INFO yarn.GiraphYarnClient: ApplicationSumbissionContext for GiraphApplicationMaster launch container is populated.
13/07/07 15:06:16 INFO client.YarnClientImpl: Submitted application application_1372875746593_0018 to ResourceManager at localhost/127.0.0.1:8032
13/07/07 15:06:16 INFO yarn.GiraphYarnClient: GiraphApplicationMaster container request was submitted to ResourceManager for job: Giraph: org.apache.giraph.examples.SimplePageRankComputation
13/07/07 15:06:17 INFO yarn.GiraphYarnClient: Giraph: org.apache.giraph.examples.SimplePageRankComputation, Elapsed: 0.85 secs
13/07/07 15:06:17 INFO yarn.GiraphYarnClient: appattempt_1372875746593_0018_000001, State: ACCEPTED, Containers used: 1
13/07/07 15:06:18 ERROR yarn.GiraphYarnClient: Giraph: org.apache.giraph.examples.SimplePageRankComputation reports FAILED state, diagnostics show: Application application_1372875746593_0018 failed 1 times due to AM Container for appattempt_1372875746593_0018_000001 exited with  exitCode: 1 due to:
.Failing this attempt.. Failing the application.
13/07/07 15:06:18 INFO yarn.GiraphYarnClient: Cleaning up HDFS distributed cache directory for Giraph job.
13/07/07 15:06:18 INFO yarn.GiraphYarnClient: Completed Giraph: org.apache.giraph.examples.SimplePageRankComputation: FAILED, total running time: 0 minutes, 1 seconds.

real    0m8.392s
user    0m8.825s
sys     0m1.492s

$ cat software/hadoop-2.0.5-alpha/logs/userlogs/application_1372875746593_0018/container_1372875746593_0018_01_000001/gam-stderr.log
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/giraph/yarn/GiraphApplicationMaster
Caused by: java.lang.ClassNotFoundException: org.apache.giraph.yarn.GiraphApplicationMaster
         at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
         at java.security.AccessController.doPrivileged(Native Method)
         at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
         at java.lang.ClassLoader.loadClass(ClassLoader.java:321)
         at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
         at java.lang.ClassLoader.loadClass(ClassLoader.java:266)
Could not find the main class: org.apache.giraph.yarn.GiraphApplicationMaster. Program will exit.
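
Since the error says the application master's main class cannot be loaded, a quick sanity check is to see whether that class was packaged into the jar at all. The following is a minimal sketch, not part of Giraph: has_class is a hypothetical helper that reads a jar listing (for example from unzip -l or jar tf) on standard input. A hit only shows that the class is in the jar; it does not prove the jar is on the container's classpath.

```shell
# Hypothetical helper: succeeds if the jar listing on stdin contains the
# .class entry for the fully qualified class name given as $1.
function has_class() {
  local entry
  # Turn org.example.Foo into org/example/Foo.class, the form used in jar listings.
  entry="$(printf '%s' "$1" | tr '.' '/').class"
  grep -qF -- "$entry"
}

# Usage (jar path as in the message above):
#   unzip -l /tmp/software/giraph/giraph-examples/target/giraph-examples-1.1.0-SNAPSHOT-for-hadoop-2.0.5-alpha-jar-with-dependencies.jar \
#     | has_class org.apache.giraph.yarn.GiraphApplicationMaster && echo present || echo missing
```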


Re: GiraphApplicationMaster not found (and other newbie questions)

Posted by Nicholas Karkoulias <ni...@gmail.com>.
Thank you very much for your response.

I had built Giraph with the following command:

mvn -e -Phadoop_yarn -Dhadoop.version=2.0.5-alpha -DskipTests clean install

The patch for issue 688 had already been applied manually, since it 
wasn't part of the Git repository yet. A few days later I fetched 
the latest trunk (which now includes the commit with the patch) and 
repeated the process, with the same result (I even deleted the contents 
of M2_REPO in between).

The order of the -P and -D flags shouldn't matter, right? (I remember 
reading https://issues.apache.org/jira/browse/GIRAPH-629, which places 
the -P flag first.)

Could it be a problem with the build profiles omitting certain classes? 
Unfortunately, I don't know enough about Maven to examine this myself. 
It could still be a user error. Should I submit a bug report?

Finally, advice regarding my previous question [see (1)] about 
conf/giraph-site.xml would be greatly appreciated.

Regards,
Nicholas

PS. For unrelated reasons, I'll have to reinstall the whole software 
stack, so I can't try anything again (or submit a report) right away.
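
On the build-profile question above, one way to check what Maven actually activates is the help:active-profiles goal of the Maven Help Plugin. The sketch below runs it with the same -P and -D switches in both orders so the two listings can be compared. giraph_profiles is a hypothetical helper name, and the default version is simply the one used in this thread.

```shell
# Hypothetical helper: lists the profiles Maven activates for a YARN build,
# once with -P first and once with -D first. Run it from the Giraph source root.
function giraph_profiles() {
  local hadoop_version="${1:-2.0.5-alpha}"
  mvn help:active-profiles -Phadoop_yarn -Dhadoop.version="$hadoop_version"
  mvn help:active-profiles -Dhadoop.version="$hadoop_version" -Phadoop_yarn
}
```

If hadoop_yarn is missing from either listing, the profile was not applied to that invocation.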


On 07/15/2013 09:02 PM, Eli Reisman wrote:
> Hi Nicholas,
>
> It looks like your build was not pulling in the Giraph yarn/ package
> classes. You probably built using something like:
>
> mvn -Phadoop_2.0.5 clean install
>
> Try instead:
>
> mvn -Dhadoop.version=2.0.5-alpha -Phadoop_yarn clean install
>
> (This assumes you have applied the patch for 2.0.5-alpha builds of Giraph,
> I think it's GIRAPH-688. Otherwise, the hadoop_yarn profile only builds
> against Hadoop 2.0.3-alpha.)


Re: GiraphApplicationMaster not found (and other newbie questions)

Posted by Eli Reisman <ap...@gmail.com>.
Hi Nicholas,

It looks like your build was not pulling in the Giraph yarn/ package classes.
You probably built using something like:

mvn -Phadoop_2.0.5 clean install

Try instead:

mvn -Dhadoop.version=2.0.5-alpha -Phadoop_yarn clean install

(This assumes you have applied the patch for 2.0.5-alpha builds of Giraph,
I think it's GIRAPH-688. Otherwise, the hadoop_yarn profile only builds
against Hadoop 2.0.3-alpha.)



