Posted to user@giraph.apache.org by Jyotirmoy Sundi <su...@gmail.com> on 2014/03/02 06:18:12 UTC

Error while running Giraph on YARN (was working in MR1)

Hi Folks,

    The job was working properly on MR1 without any issue. I am now trying
to run a simple CC sample Giraph job on YARN. I have attached the
stack trace and a few errors. Any pointers on the errors below would be
really helpful.

*1. BspServiceMaster (YARN profile) is FAILING this task, throwing
exception to end job run.*

*2. java.lang.IllegalStateException: Not enough healthy workers to
create input splits*




*StackTrace:*

2014-03-02 04:53:24,646 INFO org.apache.giraph.master.BspServiceMaster: logMissingWorkersOnSuperstep: No response from partition 2 (could be master)
2014-03-02 04:53:24,646 ERROR org.apache.giraph.master.BspServiceMaster: checkWorkers: Did not receive enough processes in time (only 1 of 2 required) after waiting 600000msecs).  This occurs if you do not have enough map tasks available simultaneously on your Hadoop instance to fulfill the number of requested workers.
2014-03-02 04:53:24,649 INFO org.apache.giraph.master.BspServiceMaster: setJobState: {"_stateKey":"FAILED","_applicationAttemptKey":-1,"_superstepKey":-1} on superstep -1
2014-03-02 04:53:24,653 FATAL org.apache.giraph.master.BspServiceMaster: failJob: Killing job job_201402281650_0019
2014-03-02 04:53:24,654 FATAL org.apache.giraph.master.BspServiceMaster: failJob: exception java.lang.IllegalStateException: Not enough healthy workers to create input splits
2014-03-02 04:53:24,654 ERROR org.apache.giraph.master.MasterThread: masterThread: Master algorithm failed with RuntimeException
java.lang.RuntimeException: BspServiceMaster (YARN profile) is FAILING this task, throwing exception to end job run.
	at org.apache.giraph.master.BspServiceMaster.failJob(BspServiceMaster.java:349)
	at org.apache.giraph.master.BspServiceMaster.setJobStateFailed(BspServiceMaster.java:297)
	at org.apache.giraph.master.BspServiceMaster.createInputSplits(BspServiceMaster.java:616)
	at org.apache.giraph.master.BspServiceMaster.createVertexInputSplits(BspServiceMaster.java:692)
	at org.apache.giraph.master.MasterThread.run(MasterThread.java:100)
Caused by: java.lang.IllegalStateException: Not enough healthy workers to create input splits
	... 4 more
2014-03-02 04:53:24,656 FATAL org.apache.giraph.graph.GraphMapper: uncaughtException: OverrideExceptionHandler on thread org.apache.giraph.master.MasterThread, msg = java.lang.RuntimeException: BspServiceMaster (YARN profile) is FAILING this task, throwing exception to end job run., exiting...
java.lang.IllegalStateException: java.lang.RuntimeException: BspServiceMaster (YARN profile) is FAILING this task, throwing exception to end job run.
	at org.apache.giraph.master.MasterThread.run(MasterThread.java:181)
Caused by: java.lang.RuntimeException: BspServiceMaster (YARN profile) is FAILING this task, throwing exception to end job run.
	at org.apache.giraph.master.BspServiceMaster.failJob(BspServiceMaster.java:349)
	at org.apache.giraph.master.BspServiceMaster.setJobStateFailed(BspServiceMaster.java:297)
	at org.apache.giraph.master.BspServiceMaster.createInputSplits(BspServiceMaster.java:616)
	at org.apache.giraph.master.BspServiceMaster.createVertexInputSplits(BspServiceMaster.java:692)
	at org.apache.giraph.master.MasterThread.run(MasterThread.java:100)
Caused by: java.lang.IllegalStateException: Not enough healthy workers to create input splits
	... 4 more
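
For anyone triaging a similar log, the checkWorkers line is the one that says how many processes actually checked in. Below is a minimal sketch, assuming the message keeps the wording shown in the log above, that pulls out the received count, the required count, and the wait time. The class name CheckWorkersLine and its parse method are hypothetical helpers written for this illustration and are not part of Giraph.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CheckWorkersLine {
    // Matches the "only X of Y required) after waiting Zmsecs" part of the
    // checkWorkers message. The wording is taken from the log above and is
    // an assumption about the message format, not a documented contract.
    private static final Pattern COUNTS = Pattern.compile(
        "only (\\d+) of (\\d+) required\\) after waiting (\\d+)msecs");

    /** Returns {received, required, waitedMsecs}, or null if the line does not match. */
    public static long[] parse(String line) {
        Matcher m = COUNTS.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new long[] {
            Long.parseLong(m.group(1)),
            Long.parseLong(m.group(2)),
            Long.parseLong(m.group(3))
        };
    }

    public static void main(String[] args) {
        String line = "checkWorkers: Did not receive enough processes in time "
            + "(only 1 of 2 required) after waiting 600000msecs).";
        long[] r = parse(line);
        // Prints: received=1 required=2 missing=1 waited=10min
        System.out.println("received=" + r[0] + " required=" + r[1]
            + " missing=" + (r[1] - r[0]) + " waited=" + (r[2] / 60000) + "min");
    }
}
```

For the log in this post, that reads as one process missing after a ten-minute wait.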

------------------------------

Re: Error while running Giraph on YARN (was working in MR1)

Posted by Eli Reisman <ap...@gmail.com>.
Correct, although some things have changed that I have not studied up on
since Giraph on YARN moved to Hadoop 2.2 compatibility. Eventually the smart
play will be to move the master into the "application master" process, but to
maintain easy compatibility with MRv1 Giraph I didn't implement it that way
when I put the original path together.


Good luck, hope that helps,

Eli


On Sun, Mar 2, 2014 at 11:02 PM, Jyotirmoy Sundi <su...@gmail.com> wrote:

> Hi Eli,
>        Please ignore my previous response.
> So for Giraph on YARN, are you saying that for a large graph, where MR1
> runs with one master plus multiple workers, YARN would run one
> Application Master, one Master, and multiple Workers?
>
> Thanks
> Sundi

Re: Error while running Giraph on YARN (was working in MR1)

Posted by Jyotirmoy Sundi <su...@gmail.com>.
Hi Eli,
       Please ignore my previous response.
So for Giraph on YARN, are you saying that for a large graph, where MR1
runs with one master plus multiple workers, YARN would run one
Application Master, one Master, and multiple Workers?

Thanks
Sundi






-- 
Best Regards,
Jyotirmoy Sundi

Re: Error while running Giraph on YARN (was working in MR1)

Posted by Jyotirmoy Sundi <su...@gmail.com>.
Hmmm,
I am running under Cloudera Manager. The numbers of application masters,
masters, and workers seem to match the config stats.

I get the following response in the master mapper while running on a small
graph:

MASTER_ONLY checkWorkers: Only found 2 responses of 3 needed to start superstep -1

When I go to the mapper running the master, I get the following log:

INFO org.apache.giraph.master.BspServiceMaster: logMissingWorkersOnSuperstep: No response from partition 3 (could be master)

Any idea what configuration issue this might be?


Thanks

Sundi




On Sun, Mar 2, 2014 at 4:56 PM, Eli Reisman <ap...@gmail.com> wrote:

> This looks like the YARN cluster is misconfigured. Alternatively, you need
> to configure it to allow a few more worker tasks. Giraph on YARN needs at
> minimum one Application Master, one Master, and one Worker (so 3 YARN
> containers). I have a feeling this could be the issue.


-- 
Best Regards,
Jyotirmoy Sundi

Re: Error while running Giraph on YARN (was working in MR1)

Posted by Eli Reisman <ap...@gmail.com>.
This looks like the YARN cluster is misconfigured. Alternatively, you need to
configure it to allow a few more worker tasks. Giraph on YARN needs at
minimum one Application Master, one Master, and one Worker (so 3 YARN
containers). I have a feeling this could be the issue.
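
To make the container arithmetic concrete, here is a small sketch of the count described above: one Application Master, one Master, and N Workers. The class name GiraphYarnContainers, its methods, and the availableContainers input are hypothetical names for illustration. They are not Giraph or YARN APIs, and the available count is something you would read off your own cluster's resource settings.

```java
public class GiraphYarnContainers {
    /**
     * Containers needed according to the rule of thumb in this reply:
     * 1 Application Master + 1 Master + N Workers.
     */
    public static int requiredContainers(int workers) {
        if (workers < 1) {
            throw new IllegalArgumentException("need at least one worker");
        }
        return 1 + 1 + workers;
    }

    /** True if the cluster can run that many containers at the same time. */
    public static boolean fits(int workers, int availableContainers) {
        return availableContainers >= requiredContainers(workers);
    }

    public static void main(String[] args) {
        System.out.println(requiredContainers(1)); // prints 3
        System.out.println(fits(2, 3));            // prints false: 2 workers need 4 containers
    }
}
```

If the cluster cannot run that many containers simultaneously, some tasks never start, which would be consistent with the master timing out while waiting for workers to check in.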

