Posted to user@giraph.apache.org by Jonathan Ellithorpe <jd...@gmail.com> on 2013/04/04 18:23:05 UTC

Running SimpleShortestPathsVertex with ToolRunner

Hi,

I am using version 0.2 of giraph and version 0.20.203 of hadoop. I would
like to run the SimpleShortestPathsVertex example, and so I tried the
command in the documentation (
https://cwiki.apache.org/confluence/display/GIRAPH/Shortest+Paths+Example):

hadoop jar
~/giraph/giraph-examples/target/giraph-examples-0.2-SNAPSHOT-for-hadoop-0.20.203.0-jar-with-dependencies.jar
org.apache.giraph.examples.SimpleShortestPathsVertex graph output 0 3

but I get:

Exception in thread "main" java.lang.NoSuchMethodException:
org.apache.giraph.examples.SimpleShortestPathsVertex.main([Ljava.lang.String;)

This indicates that the class has no main method. The documentation (
https://cwiki.apache.org/confluence/display/GIRAPH/Shortest+Paths+Example)
in fact states that this is the expected behavior in giraph 0.2 because it
must be run with ToolRunner, but I don't see any instructions for executing
SimpleShortestPathsVertex using ToolRunner.

This is the first time that I am running hadoop and giraph, so I am
probably missing something here, and any help would be greatly appreciated.

Best,
Jonathan

Re: Running SimpleShortestPathsVertex with ToolRunner

Posted by Eli Reisman <ap...@gmail.com>.
Also, if you write your own Giraph apps, you can package them in the
examples repo for an easy build, or build your own jar, include it on the
classpath, and refer to your class where we used
org.apache.giraph.examples.SimpleShortestPathsVertex above.



On Fri, Apr 5, 2013 at 1:37 AM, Eli Reisman <ap...@gmail.com>wrote:

> Hey guys sorry for the confusion, we're busting our humps to get a release
> done and the docs on the wiki are next up after that (and a book in the
> works as well)
>
> So...here's the problem. Benchmarks are old code, and work great for us
> but run using the API you described.
>
> EXAMPLES, however, use a newer execution command. You want to run
> org.apache.giraph.GiraphRunner with the hadoop jar command. The first
> argument to it is the name of the example class you want to run
> (SimpleShortestPathsVertex in your case).
>
> Try something like (I'm simplifying here):
>
> hadoop --config etc/hadoop jar path/to/giraph/examples.jar
> org.apache.giraph.GiraphRunner
> org.apache.giraph.examples.SimpleShortestPathsVertex ...
>
> where "..." stands for the in/out dirs on HDFS, other command line
> switches like -w, and even optional hadoop/giraph config params with -ca
> key1=val1,key2=val2,... and others. Use the -h option with GiraphRunner (I
> think) to get a dump of the command line opts that are not specific to just
> one example. You will have to refer to the source code to see what custom
> options an application might have.
>
> Hope this helps, sorry for the frustration. Spread the word, docs are
> coming!

Re: Running SimpleShortestPathsVertex with ToolRunner

Posted by Eli Reisman <ap...@gmail.com>.
Hey guys sorry for the confusion, we're busting our humps to get a release
done and the docs on the wiki are next up after that (and a book in the
works as well)

So...here's the problem. Benchmarks are old code, and work great for us but
run using the API you described.

EXAMPLES, however, use a newer execution command. You want to run
org.apache.giraph.GiraphRunner with the hadoop jar command. The first
argument to it is the name of the example class you want to run
(SimpleShortestPathsVertex in your case).

Try something like (I'm simplifying here):

hadoop --config etc/hadoop jar path/to/giraph/examples.jar
org.apache.giraph.GiraphRunner
org.apache.giraph.examples.SimpleShortestPathsVertex ...

where "..." stands for the in/out dirs on HDFS, other command line switches
like -w, and even optional hadoop/giraph config params with -ca
key1=val1,key2=val2,... and others. Use the -h option with GiraphRunner (I
think) to get a dump of the command line opts that are not specific to just
one example. You will have to refer to the source code to see what custom
options an application might have.

Hope this helps, sorry for the frustration. Spread the word, docs are
coming!
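
As a rough sketch of the GiraphRunner invocation described above, the
snippet below only prints the command line so it can be checked before it
is run. The jar path and config directory are the placeholders from this
message, and the helper name giraph_runner_cmd is made up for illustration.
Note that the hadoop launcher script takes --config before the jar
subcommand. Which options GiraphRunner accepts may differ between
snapshots, so only -h and -w (both mentioned in this thread) appear here.

```shell
#!/bin/sh
# Placeholder values (assumptions): point these at your own build and config.
GIRAPH_JAR="path/to/giraph/examples.jar"
HADOOP_CONF="etc/hadoop"
EXAMPLE_CLASS="org.apache.giraph.examples.SimpleShortestPathsVertex"

# Hypothetical helper: print the GiraphRunner command line instead of
# running it. Extra arguments ("$@") are appended unchanged.
giraph_runner_cmd() {
  echo hadoop --config "$HADOOP_CONF" jar "$GIRAPH_JAR" \
    org.apache.giraph.GiraphRunner "$EXAMPLE_CLASS" "$@"
}

# Show the command that would ask GiraphRunner for its generic options.
giraph_runner_cmd -h
```

Once the printed line looks right, it can be pasted into a terminal with
the remaining arguments filled in from the -h output.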




On Thu, Apr 4, 2013 at 2:44 PM, David Boyd <db...@data-tactics-corp.com>wrote:

>  Jonathan:
>      The ShortestPathsBenchmark code you ran should be a good starting
> point for your task.  You will certainly have to write some code, depending
> on the format of your input and any specific weights you want to apply to
> edges.  If you can put your data into the same format as that
> benchmark/example, then you should be able to start from there.

Re: Running SimpleShortestPathsVertex with ToolRunner

Posted by David Boyd <db...@data-tactics-corp.com>.
Jonathan:
      The ShortestPathsBenchmark code you ran should be a good starting
point for your task.  You will certainly have to write some code, depending
on the format of your input and any specific weights you want to apply to
edges.  If you can put your data into the same format as that
benchmark/example, then you should be able to start from there.

On 4/4/2013 1:03 PM, Jonathan Ellithorpe wrote:
> David,
>
> Thank you very much for your prompt response, and also the tip about 
> the number of workers being no more than one less than the available 
> map slots.
>
> I tried the ShortestPathsBenchmark and that does indeed seem to work, 
> so that's good.
>
> My higher level goal is to use giraph to compute the shortest paths in 
> an input graph. Is giraph 0.2 capable of doing this out of the box, or 
> do I need to write my own code? Would switching to giraph 0.1 give me 
> this ability?
>
> Best,
> Jonathan


Re: Running SimpleShortestPathsVertex with ToolRunner

Posted by Jonathan Ellithorpe <jd...@gmail.com>.
David,

Thank you very much for your prompt response, and also the tip about the
number of workers being no more than one less than the available map slots.

I tried the ShortestPathsBenchmark and that does indeed seem to work, so
that's good.

My higher level goal is to use giraph to compute the shortest paths in an
input graph. Is giraph 0.2 capable of doing this out of the box, or do I
need to write my own code? Would switching to giraph 0.1 give me this
ability?

Best,
Jonathan


On Thu, Apr 4, 2013 at 9:35 AM, David Boyd <db...@data-tactics-corp.com>wrote:

>  All of the documentation for Giraph on both the GitHub wiki and the
> Apache site is out of date as it relates to the 0.2 version.   Currently I
> have tested the following benchmark examples from the core library:
>
> PageRankBenchmark
> ShortestPathsBenchmark
> AggregatorsBenchmark
> RandomMessageBenchmark
> Help for these examples is provided via the -h option as shown below:
>
> hadoop jar giraph-jar-with-dependencies.jar
> org.apache.giraph.benchmark.PageRankBenchmark -h
> usage: org.apache.giraph.benchmark.PageRankBenchmark [-e <arg>] [-h] [-s
> <arg>] [-v] [-V <arg>] [-w <arg>]
>  -e,--edgesPerVertex <arg>      Edges per vertex
>  -h,--help                      Help
>  -s,--supersteps <arg>          Supersteps to execute before finishing
>  -v,--verbose                   Verbose
>  -V,--aggregateVertices <arg>   Aggregate vertices
>  -w,--workers <arg>             Number of workers
> Just substitute the class name of the example you want help for in the
> command above.
>
> It is critical that the number of workers be NO MORE than one less than
> the available number of map slots on your cluster.
>
> The following was used to execute the PageRankBenchmark example on my
> cluster:
>
>  hadoop jar giraph-jar-with-dependencies.jar
> org.apache.giraph.benchmark.PageRankBenchmark
> -Dmapred.child.java.opts="-Xmx64g -Xms64g -XX:+UseConcMarkSweepGC
> -XX:-UseGCOverheadLimit" -Dgiraph.zkList=10.1.94.104:2181 -e 1 -s 3 -v -V
> 50000 -w 83

Re: Running SimpleShortestPathsVertex with ToolRunner

Posted by David Boyd <db...@data-tactics-corp.com>.
All of the documentation for Giraph on both the GitHub wiki and the
Apache site is out of date as it relates to the 0.2 version.   Currently
I have tested the following benchmark examples from the core library:

PageRankBenchmark
ShortestPathsBenchmark
AggregatorsBenchmark
RandomMessageBenchmark

Help for these examples is provided via the -h option as shown below:

hadoop jar giraph-jar-with-dependencies.jar 
org.apache.giraph.benchmark.PageRankBenchmark -h
usage: org.apache.giraph.benchmark.PageRankBenchmark [-e <arg>] [-h] [-s 
<arg>] [-v] [-V <arg>] [-w <arg>]
  -e,--edgesPerVertex <arg>      Edges per vertex
  -h,--help                      Help
  -s,--supersteps <arg>          Supersteps to execute before finishing
  -v,--verbose                   Verbose
  -V,--aggregateVertices <arg>   Aggregate vertices
  -w,--workers <arg>             Number of workers

Just substitute the class name of the example you want help for in the
command above.

It is critical that the number of workers be NO MORE than one less than 
the available number of map slots on your cluster.
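
The worker limit above can be worked through with a small shell sketch.
The helper name max_workers and the figure of 84 map slots are
illustrative assumptions, not values reported in this thread. The spare
slot is presumably needed because the Giraph master also runs as a map
task next to the workers.

```shell
#!/bin/sh
# Hypothetical helper: largest -w value for a given number of map slots,
# following the "no more than one less than the map slots" rule.
max_workers() {
  slots=$1
  if [ "$slots" -lt 2 ]; then
    echo "need at least 2 map slots" >&2
    return 1
  fi
  echo $(( slots - 1 ))
}

# Example: a cluster assumed to have 84 map slots allows up to 83 workers.
max_workers 84
```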

The following was used to execute the PageRankBenchmark example on my 
cluster:

  hadoop jar giraph-jar-with-dependencies.jar
org.apache.giraph.benchmark.PageRankBenchmark
-Dmapred.child.java.opts="-Xmx64g -Xms64g -XX:+UseConcMarkSweepGC
-XX:-UseGCOverheadLimit" -Dgiraph.zkList=10.1.94.104:2181 -e 1 -s 3 -v
-V 50000 -w 83
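
A small sketch of why the JVM flags in the command above are wrapped in
double quotes. It assumes the intended property is Hadoop's
mapred.child.java.opts and that every JVM flag starts with a dash. The
helper count_args is made up for illustration and only reports how many
arguments the shell passes along.

```shell
#!/bin/sh
# JVM flags for the child tasks, kept in one variable for readability.
CHILD_OPTS="-Xmx64g -Xms64g -XX:+UseConcMarkSweepGC -XX:-UseGCOverheadLimit"

# Hypothetical helper: print the number of arguments received.
count_args() {
  echo $#
}

# Quoted: the whole property, flags included, stays a single argument.
count_args "-Dmapred.child.java.opts=$CHILD_OPTS"

# Unquoted: the shell splits on spaces, so only the first flag remains
# attached to the property and the rest become separate arguments.
count_args -Dmapred.child.java.opts=$CHILD_OPTS
```

The first call reports 1 argument and the second reports 4, which is why
the quotes matter when several JVM flags are passed in one -D property.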



  On 4/4/2013 12:23 PM, Jonathan Ellithorpe wrote:
> Hi,
>
> I am using version 0.2 of giraph and version 0.20.203 of hadoop. I 
> would like to run the SimpleShortestPathsVertex example, and so I 
> tried the command in the documentation 
> (https://cwiki.apache.org/confluence/display/GIRAPH/Shortest+Paths+Example):
>
> hadoop jar 
> ~/giraph/giraph-examples/target/giraph-examples-0.2-SNAPSHOT-for-hadoop-0.20.203.0-jar-with-dependencies.jar 
> org.apache.giraph.examples.SimpleShortestPathsVertex graph output 0 3
>
> but I get:
>
> Exception in thread "main" java.lang.NoSuchMethodException: 
> org.apache.giraph.examples.SimpleShortestPathsVertex.main([Ljava.lang.String;)
>
> Indicating that the class has no main method. The documentation 
> (https://cwiki.apache.org/confluence/display/GIRAPH/Shortest+Paths+Example) 
> in fact states that this is the expected behavior in giraph 0.2 
> because it must be run with ToolRunner, but I don't see any 
> instructions for executing SimpleShortestPathsVertex using ToolRunner.
>
> This is the first time that I am running hadoop and giraph, so I am 
> probably missing something here, and any help would be greatly 
> appreciated.
>
> Best,
> Jonathan