Posted to dev@myriad.apache.org by "Sarjeet Singh (JIRA)" <ji...@apache.org> on 2015/09/11 23:55:46 UTC

[jira] [Created] (MYRIAD-135) NullPointerException in ResourceOffersEventHandler from the offer received from Mesos.

Sarjeet Singh created MYRIAD-135:
------------------------------------

             Summary: NullPointerException in ResourceOffersEventHandler from the offer received from Mesos.
                 Key: MYRIAD-135
                 URL: https://issues.apache.org/jira/browse/MYRIAD-135
             Project: Myriad
          Issue Type: Bug
          Components: Scheduler
    Affects Versions: Myriad 0.1.0
            Reporter: Sarjeet Singh


I hit a NullPointerException when myriad-scheduler received an offer from Mesos that was missing some resource information (e.g. cpu, memory, or ports).

The exception is thrown from the following code:

https://github.com/mesos/myriad/blob/phase1/myriad-scheduler/src/main/java/com/ebay/myriad/scheduler/event/handlers/ResourceOffersEventHandler.java#L150-L156

I observed the issue when I submitted a YARN job and the job ran on CGS NMs, not FGS NMs. On further debugging, I found the following exception in the RM log:

15/09/11 13:14:22 WARN handlers.StatusUpdateEventHandler: Task: value:
"yarn_container_e09_1442001795955_0002_01_000001"
 not found, status: TASK_FINISHED
15/09/11 13:14:23 INFO handlers.ResourceOffersEventHandler: Received offers 1
Sep 11, 2015 1:14:23 PM com.lmax.disruptor.FatalExceptionHandler handleEventException
SEVERE: Exception processing: 16 com.ebay.myriad.scheduler.event.ResourceOffersEvent@1256f6b6
java.lang.NullPointerException
        at com.ebay.myriad.scheduler.event.handlers.ResourceOffersEventHandler.matches(ResourceOffersEventHandler.java:154)
        at com.ebay.myriad.scheduler.event.handlers.ResourceOffersEventHandler.onEvent(ResourceOffersEventHandler.java:92)
        at com.ebay.myriad.scheduler.event.handlers.ResourceOffersEventHandler.onEvent(ResourceOffersEventHandler.java:55)
        at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:128)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)

15/09/11 13:14:23 ERROR yarn.YarnUncaughtExceptionHandler: Thread Thread[pool-2-thread-3,5,main] threw an Exception.
java.lang.RuntimeException: java.lang.NullPointerException
        at com.lmax.disruptor.FatalExceptionHandler.handleEventException(FatalExceptionHandler.java:45)
        at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:147)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException
        at com.ebay.myriad.scheduler.event.handlers.ResourceOffersEventHandler.matches(ResourceOffersEventHandler.java:154)
        at com.ebay.myriad.scheduler.event.handlers.ResourceOffersEventHandler.onEvent(ResourceOffersEventHandler.java:92)
        at com.ebay.myriad.scheduler.event.handlers.ResourceOffersEventHandler.onEvent(ResourceOffersEventHandler.java:55)
        at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:128)
        ... 3 more

Also, the RM logs show no further offer messages after the above exception, because the thread receiving offers exits when the exception is thrown.
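
To illustrate why offer processing stops after a single bad offer, here is a minimal sketch in plain Java. It does not use the Disruptor or Myriad classes; `OfferLoopSketch`, `runFailFast`, and `runIsolated` are hypothetical names, and integers stand in for offer events. It only models the general pattern seen in the trace, where the exception is rethrown out of the processing loop and the thread ends.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

final class OfferLoopSketch {

    // Fail-fast loop: the first exception escapes the loop, so later events are never handled.
    static List<Integer> runFailFast(List<Integer> events, Consumer<Integer> handler) {
        List<Integer> handled = new ArrayList<>();
        try {
            for (Integer event : events) {
                handler.accept(event);
                handled.add(event);
            }
        } catch (RuntimeException e) {
            // In a real worker thread this is where the thread would terminate.
            System.out.println("loop stopped: " + e);
        }
        return handled;
    }

    // Isolated loop: a failure is confined to the event that caused it.
    static List<Integer> runIsolated(List<Integer> events, Consumer<Integer> handler) {
        List<Integer> handled = new ArrayList<>();
        for (Integer event : events) {
            try {
                handler.accept(event);
                handled.add(event);
            } catch (RuntimeException e) {
                System.out.println("skipped event " + event + ": " + e);
            }
        }
        return handled;
    }

    public static void main(String[] args) {
        // Hypothetical handler: event 2 plays the role of the offer with a missing resource.
        Consumer<Integer> handler = event -> {
            if (event == 2) {
                throw new NullPointerException("offer " + event + " has no cpu");
            }
        };
        List<Integer> events = Arrays.asList(1, 2, 3);
        System.out.println(runFailFast(events, handler)); // prints [1]
        System.out.println(runIsolated(events, handler)); // prints [1, 3]
    }
}
```

Whether isolating failures per event is appropriate for the scheduler is a separate design question; the sketch only shows the difference in behavior.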



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Re: [jira] [Created] (MYRIAD-135) NullPointerException in ResourceOffersEventHandler from the offer received from Mesos.

Posted by Sarjeet Singh <sa...@maprtech.com>.
Darin,

Right. The offer had no 'cpu' resource. I tried many times to reproduce
the issue, but it shows up only rarely. Any idea how to re-trigger it?

Also, under what circumstances can an offer arrive with some of its
resources missing?

-Sarjeet

On Fri, Sep 11, 2015 at 4:10 PM, Darin Johnson <db...@gmail.com>
wrote:

> Looks like you hit a case where there was an offer with no cpu. The existing
> cpu check is historical: cpu was initialized to -1 and then we added to it.
> It would make more sense now to have `checkResource(cpu==null, "cpu")`, and
> the same for mem and ports.  I'm in the process of testing some other stuff
> now, so I can check and report back.

Re: [jira] [Created] (MYRIAD-135) NullPointerException in ResourceOffersEventHandler from the offer received from Mesos.

Posted by Darin Johnson <db...@gmail.com>.
Looks like you hit a case where there was an offer with no cpu. The existing
cpu check is historical: cpu was initialized to -1 and then we added to it.
It would make more sense now to have `checkResource(cpu==null, "cpu")`, and
the same for mem and ports.  I'm in the process of testing some other stuff
now, so I can check and report back.
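
A rough sketch of the null check suggested above, written as standalone Java rather than against the actual handler. Everything here is an assumption for illustration: `NullSafeOfferCheck`, `hasResource`, and `matches` are hypothetical names, the offer is modeled as a plain `Map<String, Double>`, and ports are reduced to a simple count. The point is only that a missing entry is detected before the value is unboxed, which is where a NullPointerException would otherwise occur.

```java
import java.util.HashMap;
import java.util.Map;

final class NullSafeOfferCheck {

    // Returns false (and logs) when the offer carries no entry for the named resource.
    static boolean hasResource(Map<String, Double> offered, String name) {
        if (offered.get(name) == null) {
            System.out.println("Offer is missing resource: " + name);
            return false;
        }
        return true;
    }

    // Hypothetical matcher: checks presence first, then compares against what is needed.
    static boolean matches(Map<String, Double> offered,
                           double neededCpu, double neededMem, double neededPorts) {
        if (!hasResource(offered, "cpu")
                || !hasResource(offered, "mem")
                || !hasResource(offered, "ports")) {
            return false; // decline the offer instead of unboxing a null
        }
        return offered.get("cpu") >= neededCpu
                && offered.get("mem") >= neededMem
                && offered.get("ports") >= neededPorts;
    }

    public static void main(String[] args) {
        Map<String, Double> full = new HashMap<>();
        full.put("cpu", 4.0);
        full.put("mem", 8192.0);
        full.put("ports", 2.0);

        // Same offer with the cpu entry removed, as in the reported case.
        Map<String, Double> noCpu = new HashMap<>(full);
        noCpu.remove("cpu");

        System.out.println(matches(full, 1.0, 1024.0, 1.0));  // prints true
        System.out.println(matches(noCpu, 1.0, 1024.0, 1.0)); // prints false after the log line
    }
}
```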
