Posted to user@ignite.apache.org by dstieglitz <ds...@stainlesscode.com> on 2016/04/06 02:10:52 UTC

Re: Behavior of init() for clustered singleton

Following up on this...

Sorry for the vague description of the problem, but we are experiencing
objects "going null" (as if they were garbage collected?) in our clustered
singleton.

We have an instance variable of an object that is initialized in the service
init() method. We have confirmed that on topology change, the object is
properly re-initialized. However, after some period of time, for example,
overnight, the object "goes null."

Are we doing this correctly? Should we store the object in the cluster?

The schedule class is here:
https://github.com/dstieglitz/grails-ignite/blob/v0.4.x/src/java/org/grails/ignite/DistributedSchedulerServiceImpl.java

The object in question is the "DistributedScheduledThreadPoolExecutor"



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Behavior-of-init-for-clustered-singleton-tp3819p3944.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: Behavior of init() for clustered singleton

Posted by Yakov Zhdanov <yz...@apache.org>.
Please provide app logs after the issue gets reproduced.

--Yakov

2016-04-08 19:20 GMT+03:00 dstieglitz <ds...@stainlesscode.com>:

> Ok I added the debug statements:
>
>
> https://github.com/dstieglitz/grails-ignite/blob/v0.4.x/src/java/org/grails/ignite/DistributedSchedulerServiceImpl.java
>
> Let me know if you want me to report anything from our application.

Re: Behavior of init() for clustered singleton

Posted by dstieglitz <ds...@stainlesscode.com>.
Ok I added the debug statements:

https://github.com/dstieglitz/grails-ignite/blob/v0.4.x/src/java/org/grails/ignite/DistributedSchedulerServiceImpl.java

Let me know if you want me to report anything from our application.



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Behavior-of-init-for-clustered-singleton-tp3819p4025.html

Re: Behavior of init() for clustered singleton

Posted by Yakov Zhdanov <yz...@apache.org>.
Guys,

It seems there can be a race condition between service method calls and
initialization
- org/apache/ignite/internal/processors/service/GridServiceProcessor.java:921

Alex G, Val, can you please check whether the service may be called prior to
its initialization?

Dan, can you please add the service instance's identity hash code to the
output in init() and the other service methods? Something like:
System.out.println("Inside service XXX method [thread=" + Thread.currentThread().getName() + ", hash=" + System.identityHashCode(this) + ']');
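
A minimal sketch of that logging, in case it helps. It is plain Java with no Ignite dependency; the class name ServiceDiagnostics and the describe() helper are hypothetical, and a real service would call the helper from init() and from each service method, passing this. If init() and a later call report different hash values, they ran on different instances.

```java
/** Hypothetical helper for illustration; it is not part of Ignite. */
final class ServiceDiagnostics {
    /** Builds the diagnostic line for the given service instance and method name. */
    static String describe(Object service, String method) {
        return "Inside service " + method + " method [thread="
            + Thread.currentThread().getName()
            + ", hash=" + System.identityHashCode(service) + ']';
    }

    public static void main(String[] args) {
        // Two plain objects stand in for two service instances.
        Object first = new Object();
        Object second = new Object();

        // Same instance: both lines carry the same hash value.
        System.out.println(describe(first, "init"));
        System.out.println(describe(first, "scheduleWithCron"));

        // A line from another instance, for comparison.
        System.out.println(describe(second, "scheduleWithCron"));
    }
}
```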

--Yakov

2016-04-08 13:01 GMT+03:00 dstieglitz <ds...@stainlesscode.com>:

> If you look at the line below:
>
>
> https://github.com/dstieglitz/grails-ignite/blob/v0.4.x/src/java/org/grails/ignite/IgniteCronDistributedRunnableScheduledFuture.java#L79
>
> We're seeing the string "NULL EXECUTOR" in our status. But based on the way
> the classes are initialized I don't think it's possible for that reference
> to be null. Also we've observed the scheduler working, so at this point I
> think our main issue was confusion caused by this seemingly null reference.
>
> I'm not sure exactly what is not serialized, all we see is this null
> evaluation return true.

Re: Behavior of init() for clustered singleton

Posted by dstieglitz <ds...@stainlesscode.com>.
If you look at the line below:

https://github.com/dstieglitz/grails-ignite/blob/v0.4.x/src/java/org/grails/ignite/IgniteCronDistributedRunnableScheduledFuture.java#L79

We're seeing the string "NULL EXECUTOR" in our status. But based on the way
the classes are initialized I don't think it's possible for that reference
to be null. Also we've observed the scheduler working, so at this point I
think our main issue was confusion caused by this seemingly null reference.

I'm not sure exactly what is not serialized; all we see is that this null
check evaluates to true.



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Behavior-of-init-for-clustered-singleton-tp3819p4021.html

Re: Behavior of init() for clustered singleton

Posted by vkulichenko <va...@gmail.com>.
Dan,

What exactly is not serialized? As Dmitry pointed out earlier, the service
state is not preserved when it's redeployed, so you should reinitialize it
in the init() method. If you still need to share the state, you can use the
cache.
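
A rough sketch of that lifecycle, under stated assumptions: SchedulerServiceSketch is a hypothetical stand-in written in plain Java, it does not implement Ignite's Service interface, and its init() and cancel() methods only play the role of the real callbacks. The point is that node-local state is created fresh in init() and never expected to survive a redeployment. Sharing state through a cache is not shown.

```java
import java.io.Serializable;
import java.util.concurrent.ScheduledThreadPoolExecutor;

/** Hypothetical stand-in for a service implementation; not an Ignite class. */
final class SchedulerServiceSketch implements Serializable {
    private static final long serialVersionUID = 1L;

    /** Node-local resource: never serialized, so it is rebuilt on every deployment. */
    private transient ScheduledThreadPoolExecutor executor;

    /** Plays the role of init(): always create fresh local state. */
    void init() {
        executor = new ScheduledThreadPoolExecutor(1);
    }

    /** Plays the role of cancel(): release local state. */
    void cancel() {
        if (executor != null) {
            executor.shutdownNow();
            executor = null;
        }
    }

    /** True only between init() and cancel(). */
    boolean isReady() {
        return executor != null && !executor.isShutdown();
    }

    public static void main(String[] args) {
        SchedulerServiceSketch svc = new SchedulerServiceSketch();
        System.out.println("before init: " + svc.isReady());
        svc.init();
        System.out.println("after init: " + svc.isReady());
        svc.cancel();
        System.out.println("after cancel: " + svc.isReady());
    }
}
```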

-Val



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Behavior-of-init-for-clustered-singleton-tp3819p4016.html

Re: Behavior of init() for clustered singleton

Posted by dstieglitz <ds...@stainlesscode.com>.
Hi guys:

So, we've investigated this a bit further and we think the service is
actually working, but the issue is that our debug display is showing null
for some objects. We think this is because the service and those objects
live on another node, and we're seeing null because they are not serializing
across the grid. 

Is that possible? If there are some objects in the service that don't
serialize and you try to access them from a different node would they just
print out as null?
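
One way this could happen, shown as a sketch rather than a diagnosis: with standard Java serialization, a transient field is skipped on write and comes back as null on read, even if it has an initializer. The Task class below is hypothetical, the "NULL EXECUTOR" string only mimics the status text we see, and Ignite may use a different marshaller, so this is an assumption about the mechanism.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

final class TransientFieldDemo {
    /** Hypothetical task object holding a reference that is not serialized. */
    static final class Task implements Serializable {
        private static final long serialVersionUID = 1L;

        final String name;

        /** Stand-in for an executor reference; the initializer does not run on deserialization. */
        transient Object executor = new Object();

        Task(String name) {
            this.name = name;
        }

        String status() {
            return executor == null ? "NULL EXECUTOR" : "executor present";
        }
    }

    /** Serializes the task to bytes and reads it back, as a copy sent to another JVM would be. */
    static Task roundTrip(Task task) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(task);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Task) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Task original = new Task("nightly");
        Task copy = roundTrip(original);
        System.out.println("original: " + original.status());
        System.out.println("copy:     " + copy.status());
    }
}
```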

Dan



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Behavior-of-init-for-clustered-singleton-tp3819p4013.html

Re: Behavior of init() for clustered singleton

Posted by vkulichenko <va...@gmail.com>.
Hi,

The try-catch block in scheduleWithCron method just wraps the original
exception into DistributedRunnableException and rethrows it. It is then
propagated to the node that invoked the service proxy.

Do you expect different behavior?
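
A small plain-Java sketch of what wrapping and rethrowing does to the chain. Everything here is hypothetical and uses only JDK classes: CauseChain and rootCause() are illustration names, and the outer RuntimeException stands in for whatever wrapper the grid applies. It also shows one plausible way a "null" message can appear, assuming the wrapper copies the message of its immediate cause: InvocationTargetException carries no message of its own.

```java
import java.lang.reflect.InvocationTargetException;

final class CauseChain {
    /** Walks getCause() links until the innermost exception is reached. */
    static Throwable rootCause(Throwable t) {
        Throwable cur = t;
        while (cur.getCause() != null && cur.getCause() != cur) {
            cur = cur.getCause();
        }
        return cur;
    }

    public static void main(String[] args) {
        // Innermost failure, with the message seen in the logs.
        Exception root = new IllegalArgumentException("invalid pattern: \"0 0 2 * * ?\"");

        // Wrap-and-rethrow keeps the message and the cause.
        Exception wrapped = new RuntimeException(root.getMessage(), root);

        // A reflective call adds a layer whose own message is null.
        InvocationTargetException reflective = new InvocationTargetException(wrapped);

        // A wrapper that copies its immediate cause's message ends up with null.
        RuntimeException outer = new RuntimeException(reflective.getMessage(), reflective);

        System.out.println("outer message: " + outer.getMessage());
        System.out.println("root message:  " + rootCause(outer).getMessage());
    }
}
```

On the calling side, catching the outer exception and inspecting the root cause is usually more informative than reading the outer message.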

-Val



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Behavior-of-init-for-clustered-singleton-tp3819p4009.html

Re: Behavior of init() for clustered singleton

Posted by dstieglitz <ds...@stainlesscode.com>.
We did find an exception in the remote job which seems to be related (stack
trace below).

There are try/catch blocks intended to catch this exception, but it always
appears in the logs. Is it possible to catch this?

I also noticed it is translated to a "null" IgniteException somewhere; I'm
not sure whether this is the correct behavior.

016-04-07 17:51:38,505 [sys-#30%hapnin-grid%] ERROR task.GridTaskWorker  -
Failed to obtain remote job result policy for result from
ComputeTask.result(..) method (will fail the whole task): GridJobResultImpl
[job=C2 [], sib=GridJobSiblingImpl
[sesId=45f99d1f351-193a6965-27a9-43f0-bb84-af7d67c5e55b,
jobId=55f99d1f351-4d30f145-aaa5-42b6-9194-27b6d20ac4bf,
nodeId=4d30f145-aaa5-42b6-9194-27b6d20ac4bf, isJobDone=false],
jobCtx=GridJobContextImpl
[jobId=55f99d1f351-4d30f145-aaa5-42b6-9194-27b6d20ac4bf, timeoutObj=null,
attrs={}], node=TcpDiscoveryNode [id=4d30f145-aaa5-42b6-9194-27b6d20ac4bf,
addrs=[0:0:0:0:0:0:0:1%lo, 127.0.0.1, 172.30.1.34],
sockAddrs=[dev2.localdomain/172.30.1.34:47500, /0:0:0:0:0:0:0:1%lo:47500,
/127.0.0.1:47500, /172.30.1.34:47500], discPort=47500, order=1, intOrder=1,
lastExchangeTime=1460051494963, loc=false, ver=1.5.0#20151229-sha1:f1f8cda2,
isClient=false], ex=class o.a.i.IgniteException: null, hasRes=true,
isCancelled=false, isOccupied=true]
class org.apache.ignite.IgniteException: Remote job threw user exception (override or implement ComputeTask.result(..) method if you would like to have automatic failover for this exception).
	at org.apache.ignite.compute.ComputeTaskAdapter.result(ComputeTaskAdapter.java:101)
	at org.apache.ignite.internal.processors.task.GridTaskWorker$3.apply(GridTaskWorker.java:909)
	at org.apache.ignite.internal.processors.task.GridTaskWorker$3.apply(GridTaskWorker.java:902)
	at org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:6429)
	at org.apache.ignite.internal.processors.task.GridTaskWorker.result(GridTaskWorker.java:902)
	at org.apache.ignite.internal.processors.task.GridTaskWorker.onResponse(GridTaskWorker.java:798)
	at org.apache.ignite.internal.processors.task.GridTaskProcessor.processJobExecuteResponse(GridTaskProcessor.java:995)
	at org.apache.ignite.internal.processors.task.GridTaskProcessor$JobMessageListener.onMessage(GridTaskProcessor.java:1219)
	at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:821)
	at org.apache.ignite.internal.managers.communication.GridIoManager.access$1600(GridIoManager.java:103)
	at org.apache.ignite.internal.managers.communication.GridIoManager$5.run(GridIoManager.java:784)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
Caused by: class org.apache.ignite.IgniteException: null
	at org.apache.ignite.internal.processors.closure.GridClosureProcessor$C2.execute(GridClosureProcessor.java:1792)
	at org.apache.ignite.internal.processors.job.GridJobWorker$2.call(GridJobWorker.java:509)
	at org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:6397)
	at org.apache.ignite.internal.processors.job.GridJobWorker.execute0(GridJobWorker.java:503)
	at org.apache.ignite.internal.processors.job.GridJobWorker.body(GridJobWorker.java:456)
	at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
	at org.apache.ignite.internal.processors.job.GridJobProcessor.processJobExecuteRequest(GridJobProcessor.java:1166)
	at org.apache.ignite.internal.processors.job.GridJobProcessor$JobExecutionListener.onMessage(GridJobProcessor.java:1770)
	... 6 more
Caused by: java.lang.reflect.InvocationTargetException
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.ignite.internal.processors.service.GridServiceProxy$ServiceProxyCallable.call(GridServiceProxy.java:382)
	at org.apache.ignite.internal.processors.closure.GridClosureProcessor$C2.execute(GridClosureProcessor.java:1789)
	... 13 more
Caused by: org.grails.ignite.DistributedRunnableException: invalid pattern: "0 0 2 * * ?"
	at org.grails.ignite.DistributedSchedulerServiceImpl.scheduleWithCron(DistributedSchedulerServiceImpl.java:162)
	... 19 more
Caused by: org.grails.ignite.DistributedRunnableException: invalid pattern: "0 0 2 * * ?"
	at org.grails.ignite.DistributedScheduledThreadPoolExecutor.scheduleWithCron(DistributedScheduledThreadPoolExecutor.java:67)
	at org.grails.ignite.DistributedSchedulerServiceImpl.scheduleWithCron(DistributedSchedulerServiceImpl.java:149)
	... 19 more
Caused by: it.sauronsoftware.cron4j.InvalidPatternException: invalid pattern: "0 0 2 * * ?"
	at it.sauronsoftware.cron4j.SchedulingPattern.<init>(Unknown Source)
	at it.sauronsoftware.cron4j.Scheduler.schedule(Unknown Source)
	at it.sauronsoftware.cron4j.Scheduler.schedule(Unknown Source)
	at org.grails.ignite.DistributedScheduledThreadPoolExecutor.scheduleWithCron(DistributedScheduledThreadPoolExecutor.java:61)
	... 20 more
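
A note on the innermost cause: the rejected pattern has six fields and ends with "?", which is Quartz-style syntax. cron4j patterns, as far as its documentation describes them, are UNIX crontab-like with five fields (minute, hour, day of month, month, day of week) and no "?". If the intent is 02:00 every day, the five-field form would be "0 2 * * *". The check below is only a rough pre-check under those assumptions: CronFieldCheck and looksLikeFiveFieldPattern() are hypothetical names, it is not cron4j's own validation, and it ignores cron4j's "|" syntax for combining patterns.

```java
/** Hypothetical pre-check for illustration; it does not call cron4j. */
final class CronFieldCheck {
    /** Returns true if the pattern has exactly five whitespace-separated fields and no '?'. */
    static boolean looksLikeFiveFieldPattern(String pattern) {
        String trimmed = pattern.trim();
        if (trimmed.isEmpty() || trimmed.indexOf('?') >= 0) {
            return false;
        }
        return trimmed.split("\\s+").length == 5;
    }

    public static void main(String[] args) {
        // Pattern from the stack trace: six fields and a '?'.
        System.out.println(looksLikeFiveFieldPattern("0 0 2 * * ?"));

        // Five-field form for 02:00 every day.
        System.out.println(looksLikeFiveFieldPattern("0 2 * * *"));
    }
}
```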



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Behavior-of-init-for-clustered-singleton-tp3819p4001.html

Re: Behavior of init() for clustered singleton

Posted by Yakov Zhdanov <yz...@apache.org>.
Your example seems correct to me.
1. What do you mean by "goes null"?
2. I do not see any assignments other than the instantiation in the init() method.
3. You confirm that the service worked OK on some node, but after some time with
no topology changes it starts to throw an NPE, correct? Can you please share
the stack trace? Maybe it can reveal some details we are missing now.

--Yakov

2016-04-06 3:10 GMT+03:00 dstieglitz <ds...@stainlesscode.com>:

> Following up on this...
>
> Sorry for the vague description of the problem, but we are experiencing
> objects "going null" (as if they were garbage collected?) in our clustered
> singleton.
>
> We have an instance variable of an object that is initialized in the
> service
> init() method. We have confirmed that on topology change, the object is
> properly re-initialized. However, after some period of time, for example,
> overnight, the object "goes null."
>
> Are we doing this correctly? Should we store the object in the cluster?
>
> The schedule class is here:
>
> https://github.com/dstieglitz/grails-ignite/blob/v0.4.x/src/java/org/grails/ignite/DistributedSchedulerServiceImpl.java
>
> The object in question is the "DistributedScheduledThreadPoolExecutor"