Posted to user@giraph.apache.org by Phillip Rhodes <mo...@gmail.com> on 2015/03/12 04:54:10 UTC

[SOLVED] Re: Giraph job never ends

OK, this was easy enough to fix, once I understood what
was actually happening.  Since I'm running on EC2 nodes on
AWS, it is not the case that any given node can talk to any other
node on any port (at least not by default).  I had tried to
cherry-pick which ports to whitelist in the security group,
but I missed one or more that YARN needed for internal
communication.   I discovered this when examining the
resourcemanager logs.


For now, instead of trying to enumerate exactly which ports
to allow, I added a rule to allow "all traffic" for the address
range 10.0.0.0/24, and that solved the problem.
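
To narrow the rule down later, it can help to test which ports are actually
reachable from one node to another. The following is only a minimal sketch:
the port numbers are what I understand to be the Hadoop 2.x ResourceManager
defaults (an assumption, so check them against your own yarn-site.xml), and
the function names and the address in the usage comment are made up for
illustration.

```python
import socket

# Assumed Hadoop 2.x default ResourceManager ports; confirm in yarn-site.xml.
RM_PORTS = {
    "scheduler": 8030,
    "resource-tracker": 8031,
    "applications-manager": 8032,
    "admin": 8033,
    "webapp": 8088,
}


def port_is_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts (typical when a security
        # group silently drops packets) and name-resolution failures.
        return False


def blocked_ports(host, ports, timeout=2.0):
    """Return the sorted service names whose port could not be reached."""
    return sorted(
        name for name, port in ports.items()
        if not port_is_reachable(host, port, timeout)
    )


# Usage, run from a worker node ("10.0.0.10" is a placeholder for the
# ResourceManager's private address):
#     print(blocked_ports("10.0.0.10", RM_PORTS))
```

An empty list does not prove the security group is complete. As far as I
know, the NodeManager's own address defaults to an ephemeral port, which is
one reason enumerating ports by hand is error-prone.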


Cheers,


Phil


On Wed, Mar 11, 2015 at 1:39 PM, Phillip Rhodes
<mo...@gmail.com> wrote:
> Interesting... It totally did not work for me when built using the
> hadoop_2 profile, but with the hadoop_yarn profile everything at least
> starts up.  I'm pretty baffled right now... my cluster is essentially
> working, and I can run, for example, the WordCount example just fine.
> And the Giraph job starts and shows no apparent errors, but I get no
> output and it seems to run forever.
>
> It's probably some really small detail of my Hadoop configuration, or
> some environmental issue.  The problem is, I don't even know where to
> start looking right now.  :-(
>
>
> Phil
> This message optimized for indexing by NSA PRISM
>
>
> On Wed, Mar 11, 2015 at 3:16 AM, Martin Junghanns
> <ma...@gmx.net> wrote:
>> Hi Phillip,
>>
>> I am using Hadoop 2.5.2 with Giraph 1.1.0 and it runs fine with
>> -Phadoop2 (from scratch) and -Phadoop_yarn (after removing
>> STATIC_SASL_SYMBOL from munge.symbols in pom.xml).
>>
>> Maybe you can also try the stable Giraph
>> version and report your problem as an issue?
>>
>> Cheers,
>> Martin
>>
>> On 11.03.2015 04:03, Phillip Rhodes wrote:
>>> Giraph crew:
>>>
>>> I'm trying to run the SimpleShortestPathsComputation example using
>>> the latest Giraph code and Hadoop 2.5.2.  My command line looks
>>> like this:
>>>
>>> hadoop jar /home/prhodes/giraph/giraph-examples/target/giraph-examples-1.2.0-SNAPSHOT-for-hadoop-2.5.2-jar-with-dependencies.jar \
>>>   org.apache.giraph.GiraphRunner \
>>>   org.apache.giraph.examples.SimpleShortestPathsComputation \
>>>   -vif org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat \
>>>   -vip /user/prhodes/input/tiny_graph.txt \
>>>   -vof org.apache.giraph.io.formats.IdWithValueTextOutputFormat \
>>>   -op /user/prhodes/giraph_output/shortestpaths \
>>>   -w 4
>>>
>>>
>>> and the job appears to start OK.  But then it starts outputting
>>> these kinds of messages, and this just continues (seemingly)
>>> forever until you ctrl+c it.
>>>
>>> 15/03/11 02:54:31 INFO yarn.GiraphYarnClient: Giraph: org.apache.giraph.examples.SimpleShortestPathsComputation, Elapsed: 305.43 secs
>>> 15/03/11 02:54:31 INFO yarn.GiraphYarnClient: appattempt_1426041786848_0002_000001, State: ACCEPTED, Containers used: 1
>>> 15/03/11 02:54:35 INFO yarn.GiraphYarnClient: Giraph: org.apache.giraph.examples.SimpleShortestPathsComputation, Elapsed: 309.44 secs
>>> 15/03/11 02:54:35 INFO yarn.GiraphYarnClient: appattempt_1426041786848_0002_000001, State: ACCEPTED, Containers used: 1
>>> 15/03/11 02:54:39 INFO yarn.GiraphYarnClient: Giraph: org.apache.giraph.examples.SimpleShortestPathsComputation, Elapsed: 313.45 secs
>>> 15/03/11 02:54:39 INFO yarn.GiraphYarnClient: appattempt_1426041786848_0002_000001, State: ACCEPTED, Containers used: 1
>>> 15/03/11 02:54:43 INFO yarn.GiraphYarnClient: Giraph: org.apache.giraph.examples.SimpleShortestPathsComputation, Elapsed: 317.45 secs
>>> 15/03/11 02:54:43 INFO yarn.GiraphYarnClient: appattempt_1426041786848_0002_000001, State: ACCEPTED, Containers used: 1
>>> ^C15/03/11 02:54:47 INFO yarn.GiraphYarnClient: Giraph: org.apache.giraph.examples.SimpleShortestPathsComputation, Elapsed: 321.46 secs
>>> 15/03/11 02:54:47 INFO yarn.GiraphYarnClient: appattempt_1426041786848_0002_000001, State: ACCEPTED, Containers used: 1
>>>
>>> Any idea what is going on here?
>>>
>>>
>>> Thanks,
>>>
>>>
>>> Phil ---
>>>
>>>
>>> This message optimized for indexing by NSA PRISM
>>>
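
In the client output quoted above, the application attempt never leaves the
ACCEPTED state. In YARN that state means the scheduler has accepted the
application but it is not yet running. A rough sketch of spotting this
automatically in saved client output follows. The regular expression is
written against the log lines quoted in this thread only, and the function
names are invented for the example.

```python
import re

# Matches the per-attempt state lines printed by the client, e.g.
# "... appattempt_1426041786848_0002_000001, State: ACCEPTED, Containers used: 1"
STATE_LINE = re.compile(
    r"(?P<attempt>appattempt_\w+), State: (?P<state>\w+), "
    r"Containers used: (?P<used>\d+)"
)


def attempt_states(lines):
    """Return the (attempt, state) pairs found in the log lines, in order."""
    pairs = []
    for line in lines:
        match = STATE_LINE.search(line)
        if match:
            pairs.append((match.group("attempt"), match.group("state")))
    return pairs


def never_left_accepted(lines):
    """True if at least one state line was seen and all of them say ACCEPTED."""
    states = [state for _, state in attempt_states(lines)]
    return bool(states) and all(state == "ACCEPTED" for state in states)
```

If this returns True after several minutes of output, the ResourceManager and
container logs are the next place to look, which is how both problems in this
thread were eventually found.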

Re: [SOLVED] Re: Giraph job never ends

Posted by Steven Harenberg <sd...@ncsu.edu>.
Figured out the issue via the container log file:
container_1426433168188_0001_01_000001/gam-stdout.log. The container was
trying to use too much virtual memory (I am using a micro instance on EC2, so
there is not much to work with), which caused an "exitCode: 143". Apparently,
YARN limits a container's virtual memory based on its physical memory, but
you can disable this check by adding the following to yarn-site.xml:

<property>
  <name>yarn.nodemanager.vmem-check-enabled</name>
  <value>false</value>
  <description>Whether virtual memory limits will be enforced for containers.</description>
</property>

source:
http://stackoverflow.com/questions/14110428/am-container-is-running-beyond-virtual-memory-limits
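
For reference, the limit is the container's physical memory multiplied by
yarn.nodemanager.vmem-pmem-ratio, which defaults to 2.1. The sketch below
just works through that arithmetic. The function names and the memory figures
in the usage comment are illustrative, not values taken from this cluster.

```python
def vmem_limit_mb(container_pmem_mb, vmem_pmem_ratio=2.1):
    """Virtual-memory ceiling for a container, in MB.

    vmem_pmem_ratio mirrors yarn.nodemanager.vmem-pmem-ratio (default 2.1).
    """
    return container_pmem_mb * vmem_pmem_ratio


def exceeds_vmem_limit(used_vmem_mb, container_pmem_mb, vmem_pmem_ratio=2.1):
    """True if the container would be killed by the virtual-memory check."""
    return used_vmem_mb > vmem_limit_mb(container_pmem_mb, vmem_pmem_ratio)


# Usage with hypothetical numbers: a 1024 MB container gets a ceiling of
# 1024 * 2.1 = 2150.4 MB, so a JVM that maps 2500 MB of virtual memory
# would be killed while the check is enabled.
#     exceeds_vmem_limit(2500, 1024)
```

Exit code 143 is 128 + 15, meaning the process was terminated with SIGTERM,
which fits the NodeManager killing the container. Raising
yarn.nodemanager.vmem-pmem-ratio is a less drastic alternative to turning the
check off entirely.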

Everything seems to be working for me now.

On Fri, Mar 13, 2015 at 10:24 PM, Steven Harenberg <sd...@ncsu.edu>
wrote:

> Thanks Phil, I appreciate the help. Your posts over the past couple days
> have already been quite helpful.
>
> There were a few things I was going to play with as well, perhaps it is
> some configuration issue as you mentioned earlier. I had some issues with
> EC2 today and I will look at it again tomorrow.
>
> Thanks for letting me know about your talk, it sounds interesting. I will
> try and go as long as I can get there in time.
>
> --Steve

Re: [SOLVED] Re: Giraph job never ends

Posted by Steven Harenberg <sd...@ncsu.edu>.
Thanks Phil, I appreciate the help. Your posts over the past couple days
have already been quite helpful.

There were a few things I was going to play with as well; perhaps it is
some configuration issue, as you mentioned earlier. I had some issues with
EC2 today and I will look at it again tomorrow.

Thanks for letting me know about your talk; it sounds interesting. I will
try to go, as long as I can get there in time.

--Steve

On Fri, Mar 13, 2015 at 3:37 PM, Phillip Rhodes <mo...@gmail.com>
wrote:

> Steve:
>
> I'm not 100% sure what to tell you, and I don't have access to my
> cluster right this minute.  But later this evening I can log in and
> see if I can find anything that might be
> useful to you.
>
> Also, as an FYI, I'll be doing a presentation on Giraph at the
> Triangle Java User's Group meeting this coming Monday... if you're in
> the area (I see you have an @ncsu.edu address), and you can come by, I
> might be able to help you then.   Part of my presentation will be
> walking through how to setup a Giraph / YARN cluster, based on my
> experiences over the past few days...
>
>
> Phil
>
> This message optimized for indexing by NSA PRISM

Re: [SOLVED] Re: Giraph job never ends

Posted by Phillip Rhodes <mo...@gmail.com>.
Steve:

I'm not 100% sure what to tell you, and I don't have access to my
cluster right this minute.  But later this evening I can log in and
see if I can find anything that might be useful to you.

Also, as an FYI, I'll be doing a presentation on Giraph at the
Triangle Java User's Group meeting this coming Monday... if you're in
the area (I see you have an @ncsu.edu address), and you can come by, I
might be able to help you then.   Part of my presentation will be
walking through how to set up a Giraph / YARN cluster, based on my
experiences over the past few days...


Phil

This message optimized for indexing by NSA PRISM


On Fri, Mar 13, 2015 at 3:30 PM, Steven Harenberg <sd...@ncsu.edu> wrote:
> Hey Phil,
>
> I have been having the exact same problems as you (I am also setting up
> Giraph on EC2), but this solution did not work for me.
>
> Do you recall what error you saw in resourcemanager logs? I am also looking
> at these logs, but nothing is standing out to me. In fact, it almost seems
> like the application should have successfully finished. The log stops
> updating and I see a lot of "COMPLETED", "RESULT=SUCCESS", "FINISHED" at the
> end of the log. Though, it does look like one of the containers is not
> transitioning to these states.
>
> Thanks,
> Steve

Re: [SOLVED] Re: Giraph job never ends

Posted by Steven Harenberg <sd...@ncsu.edu>.
Hey Phil,

I have been having the exact same problems as you (I am also setting up
Giraph on EC2), but this solution did not work for me.

Do you recall what error you saw in the resourcemanager logs? I am also looking
at these logs, but nothing is standing out to me. In fact, it almost seems
like the application should have successfully finished. The log stops
updating and I see a lot of "COMPLETED", "RESULT=SUCCESS", "FINISHED" at
the end of the log. Though, it does look like one of the containers is not
transitioning to these states.

Thanks,
Steve

On Wed, Mar 11, 2015 at 11:54 PM, Phillip Rhodes <mo...@gmail.com>
wrote:

> OK, this was easy enough to fix, once I understood what
> was actually happening.  Since I'm running on EC2 nodes on
> AWS, it is not the case that any given node can talk to any other
> node on any port (at least not by default).  I had tried to
> cherry-pick which ports to whitelist in the security group,
> but I missed one or more that YARN needed for internal
> communication.   I discovered this when examining the
> resourcemanager logs.
>
>
> For now, instead of trying to enumerate exactly which ports
> to allow, I added a rule to allow "all traffic" for address 10.0.0.0/24
> and that solved this.
>
>
> Cheers,
>
>
> Phil