Posted to user@ode.apache.org by Chris Taylor <sa...@yahoo.com> on 2008/11/24 16:14:32 UTC

Re: Client calling retired process?

Some more information regarding this error:
 
We are still seeing this even with the ODE trunk 1.2.1 deployment. It occurs quite rarely, but the catalyst seems to be an OutOfMemoryError raised by ODE when a new request comes in:
java.lang.OutOfMemoryError
at org.apache.ode.bpel.engine.MyRoleMessageExchangeImpl$ResponseFuture.get(MyRoleMessageExchangeImpl.java:201)
at org.apache.ode.axis2.ODEService.onAxisMessageExchange(ODEService.java:149)
at org.apache.ode.axis2.hooks.ODEMessageReceiver.invokeBusinessLogic(ODEMessageReceiver.java:67)
at org.apache.ode.axis2.hooks.ODEMessageReceiver.invokeBusinessLogic(ODEMessageReceiver.java:50)
at org.apache.axis2.receivers.AbstractMessageReceiver.receive(AbstractMessageReceiver.java:96)
at org.apache.axis2.engine.AxisEngine.receive(AxisEngine.java:145)
at org.apache.axis2.transport.http.HTTPTransportUtils.processHTTPPostRequest(HTTPTransportUtils.java:275)
at org.apache.axis2.transport.http.AxisServlet.doPost(AxisServlet.java:120)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:763)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:856)
at com.ibm.ws.webcontainer.servlet.ServletWrapper.service(ServletWrapper.java:1075)
at com.ibm.ws.webcontainer.servlet.ServletWrapper.handleRequest(ServletWrapper.java:550)
at com.ibm.ws.wswebcontainer.servlet.ServletWrapper.handleRequest(ServletWrapper.java:478)
at com.ibm.ws.webcontainer.servlet.CacheServletWrapper.handleRequest(CacheServletWrapper.java:90)
at com.ibm.ws.webcontainer.WebContainer.handleRequest(WebContainer.java:744)
at com.ibm.ws.wswebcontainer.WebContainer.handleRequest(WebContainer.java:1455)
at com.ibm.ws.webcontainer.channel.WCChannelLink.ready(WCChannelLink.java:115)
at com.ibm.ws.http.channel.inbound.impl.HttpInboundLink.handleDiscrimination(HttpInboundLink.java:458)
at com.ibm.ws.http.channel.inbound.impl.HttpInboundLink.handleNewInformation(HttpInboundLink.java:387)
at com.ibm.ws.http.channel.inbound.impl.HttpICLReadCallback.complete(HttpICLReadCallback.java:102)
at com.ibm.ws.tcp.channel.impl.AioReadCompletionListener.futureCompleted(AioReadCompletionListener.java:165)
at com.ibm.io.async.AbstractAsyncFuture.invokeCallback(AbstractAsyncFuture.java:217)
at com.ibm.io.async.AsyncChannelFuture.fireCompletionActions(AsyncChannelFuture.java:161)
at com.ibm.io.async.AsyncFuture.completed(AsyncFuture.java:136)
at com.ibm.io.async.ResultHandler.complete(ResultHandler.java:195)
at com.ibm.io.async.ResultHandler.runEventProcessingLoop(ResultHandler.java:743)
at com.ibm.io.async.ResultHandler$2.run(ResultHandler.java:873)
at com.ibm.ws.util.ThreadPool$Worker.run(ThreadPool.java:1473)
After WebSphere recovers, and until we redeploy the process in question as a new version, ODE attempts to route subsequent requests to a retired version.
 [11/20/08 14:29:26:968 CST] 00000046 SystemOut O 14:29:26,967 ERROR [BpelEngineImpl] Scheduled job failed; jobDetail={type=INVOKE_INTERNAL, pid={http://eclipse.org/bpel/sample}AdminYNProcess-195, inmem=true, mexid=4611686018427387977}
org.apache.ode.bpel.runtime.InvalidProcessException: Process is retired.
at org.apache.ode.bpel.engine.PartnerLinkMyRoleImpl.invokeNewInstance(PartnerLinkMyRoleImpl.java:173)
at org.apache.ode.bpel.engine.BpelProcess.invokeProcess(BpelProcess.java:204)
at org.apache.ode.bpel.engine.BpelProcess.handleWorkEvent(BpelProcess.java:372)
at org.apache.ode.bpel.engine.BpelEngineImpl.onScheduledJob(BpelEngineImpl.java:326)
at org.apache.ode.bpel.engine.BpelServerImpl.onScheduledJob(BpelServerImpl.java:373)
at org.apache.ode.scheduler.simple.SimpleScheduler$4$1.call(SimpleScheduler.java:337)
at org.apache.ode.scheduler.simple.SimpleScheduler$4$1.call(SimpleScheduler.java:336)
at org.apache.ode.scheduler.simple.SimpleScheduler.execTransaction(SimpleScheduler.java:174)
at org.apache.ode.scheduler.simple.SimpleScheduler$4.call(SimpleScheduler.java:335)
at org.apache.ode.scheduler.simple.SimpleScheduler$4.call(SimpleScheduler.java:332)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:284)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:665)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:690)
at java.lang.Thread.run(Thread.java:810)

Attached is the Java core dump file from the time of the original OutOfMemoryError, showing that it was caused by excessive garbage collection.  The VM this runs under is allocated 1 GB of heap.
 
- Chris Taylor 



________________________________
From: Matthieu Riou <ma...@offthelip.org>
To: user@ode.apache.org
Cc: Dave Cecchi <da...@perficient.com>
Sent: Thursday, October 16, 2008 10:40:57 AM
Subject: Re: Client calling retired process?

On Wed, Oct 15, 2008 at 9:27 AM, Chris Taylor <sa...@yahoo.com> wrote:

> Matthieu, Yes would appreciate if you could put that latest built war
> somewhere.  We have attempted to build with buildr without success.
>

Here it is:

http://people.apache.org/~mriou/ode-axis2-war-1.2.1-SNAPSHOT.war

Let me know how it goes.

Cheers,
Matthieu


>
>
>
> ----- Original Message ----
> From: Matthieu Riou <ma...@offthelip.org>
> To: user@ode.apache.org
> Sent: Monday, October 13, 2008 1:30:56 PM
> Subject: Re: Client calling retired process?
>
> On Mon, Oct 13, 2008 at 10:55 AM, Chris Taylor <sa...@yahoo.com> wrote:
>
> > Thanks, Matthieu.  Some background:
> >
> > we're running ODE 1.2 on Websphere 6.1, with Oracle 10g as the process
> > store.
> >
> > This scenario consistently fails in the manner I described, but it seems
> > only for certain processes.
> >
> > So, for example, if i have the following:
> >
> > ProcessA-20
> > ProcessB-21
> > ProcessC-22
> >
> > deployed in my environment, the scenario would be that something causes
> > ProcessA-20 to hang - at which point it goes into recovery mode and
> spawns
> > an ode job to retry.  From this point on, new requests to (not just)
> > ProcessA get routed to the now-retired ProcessA-19, but also new requests
> to
> > ProcessB get routed to (now-retired) ProcessB-20!  The weird thing is,
> > ProcessC-22 is apparently unaffected.  It still gets calls legitimately
> > routed to its latest versioned deployment, ProcessC-22.
> >
> > I do not know if this happens under other scenarios unrelated to
> recovery.
> > I think I just do not have enough data points yet to say.
> >
> >
>
> If you have a reproducible test scenario, it would be great if you could
> try
> it with the current stable branch. I've fixed something related to what
> you're describing a couple of months ago. If doing a build is an issue for
> you, I can upload the WAR to a public place.
>
> Thanks,
> Matthieu
>
>
> >
> >
> >
> > ----- Original Message ----
> > From: Matthieu Riou <ma...@offthelip.org>
> > To: user@ode.apache.org
> > Sent: Monday, October 13, 2008 12:33:18 PM
> > Subject: Re: Client calling retired process?
> >
> > On Mon, Oct 13, 2008 at 8:17 AM, Chris Taylor <sa...@yahoo.com>
> wrote:
> >
> > > Thanks, Alexis, but i'm no closer to fully understanding why this
> occurs.
> > > It happens periodically now almost everyday with different deployed
> > > processes.  Although I don't understand it, I have done some research
> > into
> > > the behaviour.  Here's a scenario:
> > >
> > > we'll deploy ProcessA-19, then retire it with ProcessA-20 deployment.
> At
> > > some point it, or another, process will fail and attempt to go into
> > recovery
> > > mode (excuse me if I state this incorrectly),  at this point ODE will
> > create
> > > a scheduled job in an attempt to retry the service later.
> > >
> > > Here's where it gets screwy.  From then on, all new calls to ProcessA
> > will
> > > not route to ProcessA-20, but ode will attempt to route them to
> > ProcessA-19,
> > > which is of course retired. Ode does not recover from this.  It seems
> the
> > > only way to compensate is to redeploy ProcessA as ProcessA-21.  New
> > requests
> > > will then route correctly.
> > >
> > > Any idea here?
> > >
> >
> > I'll have to ask a few more questions to narrow it down and make sure I
> > understand correctly:
> >
> >  * Does the exact same scenario sometimes work and sometimes not?
> >  * Is it always happening in relation with recovery and retry or did you
> > see it happen in other situations as well?
> >  * Which version of ODE are you using? Have you tried with a recent 1.X
> > branch?
> >
> > Thanks,
> > Matthieu
> >
> >
> > >
> > >
> > >
> > > ----- Original Message ----
> > > From: Alexis Midon <mi...@intalio.com>
> > > To: user@ode.apache.org
> > > Sent: Wednesday, October 8, 2008 7:26:54 PM
> > > Subject: Re: Client calling retired process?
> > >
> > > Hi Chris,
> > >
> > > No new executions can be started on a retired process, but running
> > > instances
> > > can still finish their job. [1]
> > >
> > > I'm not really familiar with this part of the code, but after looking
> at
> > > it,
> > > it seems to me that the deployment of a new version is not atomic.
> > Meaning
> > > that a process could be flagged as retired while the creation of a new
> > > > instance is in progress, hence your exception.
> > >
> > > does it make sense regarding your scenario? is it possible that the
> > process
> > > gets retired while messages are coming in?
> > >
> > > [1] further details here:
> > > http://ode.apache.org/user-guide.html#UserGuide-Versioning
> > >
> > >
> > >
> > > On Wed, Oct 8, 2008 at 11:37 AM, Chris Taylor <sa...@yahoo.com>
> > wrote:
> > >
> > > > Okay, I've a deployment (called GetCodes) bundle that includes 5
> > > > processes.  4 of the processes make calls to the fifth (it's an
> > > abstraction
> > > > layer of process business logic).  When I deploy this "GetCodes"
> bundle
> > > > using the DeploymentService utility, I can see an incremented
> > deployment
> > > > (say, GetCodes-40) alongside previous iterations.
> > > >
> > > > Occasionally, I'll have a client making soap calls to one of the
> > > processes
> > > > under this logical bundle that will fail with the following error:
> > > >
> > > > InvalidProcessException: Process is retired.
> > > >
> > > > In the logs, it's clear that ODE is directing this client call to
> > > > GetCodes-39 - though the client isn't explicitly attempting to call a
> > > > specific version (is that even possible?).  Any clue why some clients
> > > > periodically - erroneously - are directed by ODE to a retired process
> > > > version?
> > > >
> > > >
> > > >
> > >
> > >
> > >
> > >
> > >
> >
> >
> >
> >
> >
>
>
>
>
>


      

Re: Client calling retired process?

Posted by Alex Boisvert <bo...@intalio.com>.
On Tue, Nov 25, 2008 at 2:01 PM, Chris Taylor <sa...@yahoo.com> wrote:

> Incidentally, in analyzing this issue it would seem that the process store
> was holding multiple "Active" versions of the same process (so, multiple
> PIDs of the same Type in the STORE_PROCESS were Active at the same time).
>
> I wonder if this development environment, wherein we are using the
> DeploymentService to deploy Deployment Units (and thus versioned deployment
> directories), combined with a couple different deployments of the ODE
> runtime engine was causing these "logically" retired processes to hang
> around as Active.
>
> Going out on a limb here, but does it make sense during times of recovery
> that the engine might get confused about which process Type to route to if
> there were multiple Active Process IDs of it at the same time?


I think that's entirely possible.   Matthieu could confirm since he's more
familiar with that part of the code.

alex

Re: Client calling retired process?

Posted by Matthieu Riou <ma...@offthelip.org>.
On Tue, Nov 25, 2008 at 2:01 PM, Chris Taylor <sa...@yahoo.com> wrote:

> Brilliant!  Thanks, Alex.
>
> Incidentally, in analyzing this issue it would seem that the process store
> was holding multiple "Active" versions of the same process (so, multiple
> PIDs of the same Type in the STORE_PROCESS were Active at the same time).
>
> I wonder if this development environment, wherein we are using the
> DeploymentService to deploy Deployment Units (and thus versioned deployment
> directories), combined with a couple different deployments of the ODE
> runtime engine was causing these "logically" retired processes to hang
> around as Active.
>
> Going out on a limb here, but does it make sense during times of recovery
> that the engine might get confused about which process Type to route to if
> there were multiple Active Process IDs of it at the same time?
>

Yeah, in that case the result is undefined. There's basically no way for the
engine to guess which one is the real destination.

Matthieu


>
>
>
>
>
> ________________________________
> From: Alex Boisvert <bo...@intalio.com>
> To: user@ode.apache.org
> Sent: Tuesday, November 25, 2008 2:50:47 PM
> Subject: Re: Client calling retired process?
>
> On Tue, Nov 25, 2008 at 12:46 PM, Chris Taylor <sa...@yahoo.com> wrote:
>
> > Oh, jeez.  Forget that last one, I obviously wasn't thinking about it
> > enough before I posted it.  We have deployed a couple of different
> versions
> > of the ODE runtime engine, and Websphere stomps on the previous each time
> we
> > do, so we lose the actual /processes/deployment folder for previous,
> retired
> > process versions in that scenario.
> >
> > Still, perhaps a good argument for an external, configurable processes
> > folder?
>
>
> I believe Ode uses the "ode-axis2.working.dir" property in
> ode-axis2.properties:
>
> e.g.
> ode-axis2.working.dir=${org.apache.ode.configDir}/../
>
> alex
>
>
>
>
>

Re: Client calling retired process?

Posted by Chris Taylor <sa...@yahoo.com>.
Brilliant!  Thanks, Alex.

Incidentally, in analyzing this issue it would seem that the process store was holding multiple "Active" versions of the same process (so, multiple PIDs of the same Type in the STORE_PROCESS were Active at the same time).

I wonder if this development environment, wherein we are using the DeploymentService to deploy Deployment Units (and thus versioned deployment directories), combined with a couple different deployments of the ODE runtime engine was causing these "logically" retired processes to hang around as Active.

Going out on a limb here, but does it make sense during times of recovery that the engine might get confused about which process Type to route to if there were multiple Active Process IDs of it at the same time?
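
A minimal sketch of the check described above, for anyone wanting to confirm the same condition: group process-store rows by Type and report every Type that has more than one Active PID.  The ProcessRow class, its field names and the "ACTIVE"/"RETIRED" strings are assumptions made for illustration; they are not ODE's actual schema or API, and the rows would have to be loaded from STORE_PROCESS by whatever means you have.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ActiveVersionCheck {

    /** One row of the process store. Field names are hypothetical, not ODE's schema. */
    public static final class ProcessRow {
        final String pid;   // e.g. "ProcessA-20"
        final String type;  // e.g. "ProcessA"
        final String state; // assumed values: "ACTIVE" or "RETIRED"

        public ProcessRow(String pid, String type, String state) {
            this.pid = pid;
            this.type = type;
            this.state = state;
        }
    }

    /** Maps each process type to its ACTIVE PIDs, keeping only types with two or more. */
    public static Map<String, List<String>> findAmbiguousTypes(List<ProcessRow> rows) {
        Map<String, List<String>> activeByType = new LinkedHashMap<>();
        for (ProcessRow row : rows) {
            if ("ACTIVE".equals(row.state)) {
                activeByType.computeIfAbsent(row.type, k -> new ArrayList<>()).add(row.pid);
            }
        }
        // A type with a single active PID is unambiguous, so drop it from the report.
        activeByType.values().removeIf(pids -> pids.size() < 2);
        return activeByType;
    }

    public static void main(String[] args) {
        // Sample data only; real rows would come from the process store.
        List<ProcessRow> rows = Arrays.asList(
                new ProcessRow("ProcessA-19", "ProcessA", "ACTIVE"),
                new ProcessRow("ProcessA-20", "ProcessA", "ACTIVE"),
                new ProcessRow("ProcessB-20", "ProcessB", "RETIRED"),
                new ProcessRow("ProcessB-21", "ProcessB", "ACTIVE"),
                new ProcessRow("ProcessC-22", "ProcessC", "ACTIVE"));
        System.out.println(findAmbiguousTypes(rows));
    }
}
```

Any Type that shows up in the result has more than one candidate destination, which is the situation I am asking about.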

 



________________________________
From: Alex Boisvert <bo...@intalio.com>
To: user@ode.apache.org
Sent: Tuesday, November 25, 2008 2:50:47 PM
Subject: Re: Client calling retired process?

On Tue, Nov 25, 2008 at 12:46 PM, Chris Taylor <sa...@yahoo.com> wrote:

> Oh, jeez.  Forget that last one, I obviously wasn't thinking about it
> enough before I posted it.  We have deployed a couple of different versions
> of the ODE runtime engine, and Websphere stomps on the previous each time we
> do, so we lose the actual /processes/deployment folder for previous, retired
> process versions in that scenario.
>
> Still, perhaps a good argument for an external, configurable processes
> folder?


I believe Ode uses the "ode-axis2.working.dir" property in
ode-axis2.properties:

e.g.
ode-axis2.working.dir=${org.apache.ode.configDir}/../

alex



      

Re: Client calling retired process?

Posted by Alex Boisvert <bo...@intalio.com>.
On Tue, Nov 25, 2008 at 12:46 PM, Chris Taylor <sa...@yahoo.com> wrote:

> Oh, jeez.  Forget that last one, I obviously wasn't thinking about it
> enough before I posted it.  We have deployed a couple of different versions
> of the ODE runtime engine, and Websphere stomps on the previous each time we
> do, so we lose the actual /processes/deployment folder for previous, retired
> process versions in that scenario.
>
> Still, perhaps a good argument for an external, configurable processes
> folder?


I believe Ode uses the "ode-axis2.working.dir" property in
ode-axis2.properties:

e.g.
ode-axis2.working.dir=${org.apache.ode.configDir}/../

alex

Re: Client calling retired process?

Posted by Chris Taylor <sa...@yahoo.com>.
Oh, jeez.  Forget that last one, I obviously wasn't thinking about it enough before I posted it.  We have deployed a couple of different versions of the ODE runtime engine, and Websphere stomps on the previous each time we do, so we lose the actual /processes/deployment folder for previous, retired process versions in that scenario.

Still, perhaps a good argument for an external, configurable processes folder?




________________________________
From: Chris Taylor <sa...@yahoo.com>
To: user@ode.apache.org
Sent: Tuesday, November 25, 2008 2:35:20 PM
Subject: Re: Client calling retired process?

One other thing - I have no idea if this is related or not:  We also use the DeploymentService for bpel deployments into this test environment (and all of our environments, actually).  Every time we start up now, we see errors like the following:
[11/25/08 14:12:04:463 CST] 00000060 SystemOut O 14:12:04,462 ERROR [ProcessStoreImpl] Error loading DU from store: GetProviderDetails-107
org.apache.ode.bpel.iapi.ContextException: Deployed directory null no longer there!
at org.apache.ode.store.ProcessStoreImpl.load(ProcessStoreImpl.java:606)
at org.apache.ode.store.ProcessStoreImpl$6.call(ProcessStoreImpl.java:461)
at org.apache.ode.store.ProcessStoreImpl$Callable.call(ProcessStoreImpl.java:701)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:284)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:665)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:690)
at java.lang.Thread.run(Thread.java:810)
which makes perfect sense, of course, since we are no longer on version GetProviderDetails-107 but on, let's say, GetProviderDetails-200.  But why would ODE continue to look for retired versions?  And now, given that we are many versions deep on many of these bpels, these errors take up several pages on startup.
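
To see ahead of startup which deployment units would trigger this error, something like the following sketch could compare the DU names recorded in the store against the directories actually on disk.  It assumes each DU lives in a directory named after it (for example GetProviderDetails-200) directly under the processes folder; the class and method names are made up for illustration and are not part of ODE.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class StaleDeploymentUnits {

    /** Returns the DU names that have no matching directory under deployDir. */
    public static List<String> missingOnDisk(List<String> duNamesInStore, Path deployDir) {
        List<String> missing = new ArrayList<>();
        for (String du : duNamesInStore) {
            if (!Files.isDirectory(deployDir.resolve(du))) {
                missing.add(du);
            }
        }
        return missing;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the processes folder; only the current version exists on disk.
        Path deployDir = Files.createTempDirectory("processes");
        Files.createDirectory(deployDir.resolve("GetProviderDetails-200"));

        // Stand-in for the DU names still recorded in the process store.
        List<String> inStore = Arrays.asList("GetProviderDetails-107", "GetProviderDetails-200");
        System.out.println(missingOnDisk(inStore, deployDir));
    }
}
```

Each name it returns is a DU that the store still references but whose directory is gone.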




________________________________
From: Matthieu Riou <ma...@offthelip.org>
To: user@ode.apache.org
Sent: Tuesday, November 25, 2008 10:20:07 AM
Subject: Re: Client calling retired process?

On Mon, Nov 24, 2008 at 7:14 AM, Chris Taylor <sa...@yahoo.com> wrote:

> Some more information regarding this error:
>
> we are still seeing this even with the ODE Trunk 1.2.1 deployment. It
> occurs quite rarely, but it seems the catalyst is an OutOfMemoryError raised
> by ODE when a new request comes in:
>

Reviewing the code again I couldn't spot anything that would produce this
behavior. The process or the process data aren't stored in structures that
would be sensitive to OOM. One thing that could help would be a debug log of
BpelEngineImpl when the problem occurs as routing to a given process from
the message happens in BpelEngineImpl.route(). So you could just set that
logger to debug and see the next time it happens.

Thanks,
Matthieu


>
>
> java.lang.OutOfMemoryError
> at org.apache.ode.bpel.engine.MyRoleMessageExchangeImpl$ResponseFuture.get(MyRoleMessageExchangeImpl.java:201)
> at org.apache.ode.axis2.ODEService.onAxisMessageExchange(ODEService.java:149)
> at org.apache.ode.axis2.hooks.ODEMessageReceiver.invokeBusinessLogic(ODEMessageReceiver.java:67)
> at org.apache.ode.axis2.hooks.ODEMessageReceiver.invokeBusinessLogic(ODEMessageReceiver.java:50)
> at org.apache.axis2.receivers.AbstractMessageReceiver.receive(AbstractMessageReceiver.java:96)
> at org.apache.axis2.engine.AxisEngine.receive(AxisEngine.java:145)
> at org.apache.axis2.transport.http.HTTPTransportUtils.processHTTPPostRequest(HTTPTransportUtils.java:275)
> at org.apache.axis2.transport.http.AxisServlet.doPost(AxisServlet.java:120)
> at javax.servlet.http.HttpServlet.service(HttpServlet.java:763)
> at javax.servlet.http.HttpServlet.service(HttpServlet.java:856)
> at com.ibm.ws.webcontainer.servlet.ServletWrapper.service(ServletWrapper.java:1075)
> at com.ibm.ws.webcontainer.servlet.ServletWrapper.handleRequest(ServletWrapper.java:550)
> at com.ibm.ws.wswebcontainer.servlet.ServletWrapper.handleRequest(ServletWrapper.java:478)
> at com.ibm.ws.webcontainer.servlet.CacheServletWrapper.handleRequest(CacheServletWrapper.java:90)
> at com.ibm.ws.webcontainer.WebContainer.handleRequest(WebContainer.java:744)
> at com.ibm.ws.wswebcontainer.WebContainer.handleRequest(WebContainer.java:1455)
> at com.ibm.ws.webcontainer.channel.WCChannelLink.ready(WCChannelLink.java:115)
> at com.ibm.ws.http.channel.inbound.impl.HttpInboundLink.handleDiscrimination(HttpInboundLink.java:458)
> at com.ibm.ws.http.channel.inbound.impl.HttpInboundLink.handleNewInformation(HttpInboundLink.java:387)
> at com.ibm.ws.http.channel.inbound.impl.HttpICLReadCallback.complete(HttpICLReadCallback.java:102)
> at com.ibm.ws.tcp.channel.impl.AioReadCompletionListener.futureCompleted(AioReadCompletionListener.java:165)
> at com.ibm.io.async.AbstractAsyncFuture.invokeCallback(AbstractAsyncFuture.java:217)
> at com.ibm.io.async.AsyncChannelFuture.fireCompletionActions(AsyncChannelFuture.java:161)
> at com.ibm.io.async.AsyncFuture.completed(AsyncFuture.java:136)
> at com.ibm.io.async.ResultHandler.complete(ResultHandler.java:195)
> at com.ibm.io.async.ResultHandler.runEventProcessingLoop(ResultHandler.java:743)
> at com.ibm.io.async.ResultHandler$2.run(ResultHandler.java:873)
> at com.ibm.ws.util.ThreadPool$Worker.run(ThreadPool.java:1473)
>
>
>
> After Websphere recovers, from this point on until we redeploy the process
> in question to a new version, ODE attempts to route subsequent requests to a
> retired version.
>
>
>
> [11/20/08 14:29:26:968 CST] 00000046 SystemOut O 14:29:26,967 ERROR
> [BpelEngineImpl] Scheduled job failed; jobDetail={type=INVOKE_INTERNAL,
> pid={http://eclipse.org/bpel/sample}AdminYNProcess-195, inmem=true, mexid=4611686018427387977}
>
> org.apache.ode.bpel.runtime.InvalidProcessException: Process is retired.
>
> at
> org.apache.ode.bpel.engine.PartnerLinkMyRoleImpl.invokeNewInstance(PartnerLinkMyRoleImpl.java:173)
>
> at
> org.apache.ode.bpel.engine.BpelProcess.invokeProcess(BpelProcess.java:204)
>
> at
> org.apache.ode.bpel.engine.BpelProcess.handleWorkEvent(BpelProcess.java:372)
>
> at
> org.apache.ode.bpel.engine.BpelEngineImpl.onScheduledJob(BpelEngineImpl.java:326)
>
> at
> org.apache.ode.bpel.engine.BpelServerImpl.onScheduledJob(BpelServerImpl.java:373)
>
> at
> org.apache.ode.scheduler.simple.SimpleScheduler$4$1.call(SimpleScheduler.java:337)
>
> at
> org.apache.ode.scheduler.simple.SimpleScheduler$4$1.call(SimpleScheduler.java:336)
>
> at
> org.apache.ode.scheduler.simple.SimpleScheduler.execTransaction(SimpleScheduler.java:174)
>
> at
> org.apache.ode.scheduler.simple.SimpleScheduler$4.call(SimpleScheduler.java:335)
>
> at
> org.apache.ode.scheduler.simple.SimpleScheduler$4.call(SimpleScheduler.java:332)
>
> at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:284)
>
> at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:665)
>
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:690)
>
> at java.lang.Thread.run(Thread.java:810)
>
> Attached is the Java core dump file from the time of the original
> OutOfMemoryError, showing that it was caused by excessive garbage
> collection.  the VM this runs under allocates 1 Gig of memory on the heap.
>
> - Chris Taylor
>
>  ------------------------------
> *From:* Matthieu Riou <ma...@offthelip.org>
> *To:* user@ode.apache.org
> *Cc:* Dave Cecchi <da...@perficient.com>
> *Sent:* Thursday, October 16, 2008 10:40:57 AM
> *Subject:* Re: Client calling retired process?
>
> On Wed, Oct 15, 2008 at 9:27 AM, Chris Taylor <sa...@yahoo.com> wrote:
>
> > Matthieu, Yes would appreciate if you could put that latest built war
> > somewhere.  We have attempted to build with buildr without success.
> >
>
> Here it is:
>
> http://people.apache.org/~mriou/ode-axis2-war-1.2.1-SNAPSHOT.war
>
> Let me know how it goes.
>
> Cheers,
> Matthieu
>
>
> >
> >
> >
> > ----- Original Message ----
> > From: Matthieu Riou <ma...@offthelip.org>
> > To: user@ode.apache.org
> > Sent: Monday, October 13, 2008 1:30:56 PM
> > Subject: Re: Client calling retired process?
> >
> > On Mon, Oct 13, 2008 at 10:55 AM, Chris Taylor <sa...@yahoo.com>
> wrote:
> >
> > > Thanks, Matthieu.  Some background:
> > >
> > > we're running ODE 1.2 on Websphere 6.1, with Oracle 10g as the process
> > > store.
> > >
> > > This scenario consistently fails in the manner I described, but it
> seems
> > > only for certain processes.
> > >
> > > So, for example, if i have the following:
> > >
> > > ProcessA-20
> > > ProcessB-21
> > > ProcessC-22
> > >
> > > deployed in my environment, the scenario would be that something causes
> > > ProcessA-20 to hang - at which point it goes into recovery mode and
> > spawns
> > > an ode job to retry.  From this point on, new requests to (not just)
> > > ProcessA get routed to the now-retired ProcessA-19, but also new
> requests
> > to
> > > ProcessB get routed to (now-retired) ProcessB-20!  The weird thing is,
> > > ProcessC-22 is apparently unaffected.  It still gets calls legitimately
> > > routed to its latest versioned deployment, ProcessC-22.
> > >
> > > I do not know if this happens under other scenarios unrelated to
> > recovery.
> > > I think I just do not have enough data points yet to say.
> > >
> > >
> >
> > If you have a reproducible test scenario, it would be great if you could
> > try
> > it with the current stable branch. I've fixed something related to what
> > you're describing a couple of months ago. If doing a build is an issue
> for
> > you, I can upload the WAR to a public place.
> >
> > Thanks,
> > Matthieu
> >
> >
> > >
> > >
> > >
> > > ----- Original Message ----
> > > From: Matthieu Riou <ma...@offthelip.org>
> > > To: user@ode.apache.org
> > > Sent: Monday, October 13, 2008 12:33:18 PM
> > > Subject: Re: Client calling retired process?
> > >
> > > On Mon, Oct 13, 2008 at 8:17 AM, Chris Taylor <sa...@yahoo.com>
> > wrote:
> > >
> > > > Thanks, Alexis, but i'm no closer to fully understanding why this
> > occurs.
> > > > It happens periodically now almost everyday with different deployed
> > > > processes.  Although I don't understand it, I have done some research
> > > into
> > > > the behaviour.  Here's a scenario:
> > > >
> > > > we'll deploy ProcessA-19, then retire it with ProcessA-20 deployment.
> > At
> > > > some point it, or another, process will fail and attempt to go into
> > > recovery
> > > > mode (excuse me if I state this incorrectly),  at this point ODE will
> > > create
> > > > a scheduled job in an attempt to retry the service later.
> > > >
> > > > Here's where it gets screwy.  From then on, all new calls to ProcessA
> > > will
> > > > not route to ProcessA-20, but ode will attempt to route them to
> > > ProcessA-19,
> > > > which is of course retired. Ode does not recover from this.  It seems
> > the
> > > > only way to compensate is to redeploy ProcessA as ProcessA-21.  New
> > > requests
> > > > will then route correctly.
> > > >
> > > > Any idea here?
> > > >
> > >
> > > I'll have to ask a few more questions to narrow it down and make sure I
> > > understand correctly:
> > >
> > > >  * Does the exact same scenario sometimes work and sometimes not?
> > >  * Is it always happening in relation with recovery and retry or did
> you
> > > see it happen in other situations as well?
> > >  * Which version of ODE are you using? Have you tried with a recent 1.X
> > > branch?
> > >
> > > Thanks,
> > > Matthieu
> > >
> > >
> > > >
> > > >
> > > >
> > > > ----- Original Message ----
> > > > From: Alexis Midon <mi...@intalio.com>
> > > > To: user@ode.apache.org
> > > > Sent: Wednesday, October 8, 2008 7:26:54 PM
> > > > Subject: Re: Client calling retired process?
> > > >
> > > > Hi Chris,
> > > >
> > > > No new executions can be started on a retired process, but running
> > > > instances
> > > > can still finish their job. [1]
> > > >
> > > > I'm not really familiar with this part of the code, but after looking
> > at
> > > > it,
> > > > it seems to me that the deployment of a new version is not atomic.
> > > Meaning
> > > > that a process could be flagged as retired while the creation of a
> new
> > > > > instance is in progress, hence your exception.
> > > >
> > > > does it make sense regarding your scenario? is it possible that the
> > > process
> > > > gets retired while messages are coming in?
> > > >
> > > > [1] further details here:
> > > > http://ode.apache.org/user-guide.html#UserGuide-Versioning
> > > >
> > > >
> > > >
> > > > On Wed, Oct 8, 2008 at 11:37 AM, Chris Taylor <sa...@yahoo.com>
> > > wrote:
> > > >
> > > > > Okay, I've a deployment (called GetCodes) bundle that includes 5
> > > > > processes.  4 of the processes make calls to the fifth (it's an
> > > > abstraction
> > > > > layer of process business logic).  When I deploy this "GetCodes"
> > bundle
> > > > > using the DeploymentService utility, I can see an incremented
> > > deployment
> > > > > (say, GetCodes-40) alongside previous iterations.
> > > > >
> > > > > Occasionally, I'll have a client making soap calls to one of the
> > > > processes
> > > > > under this logical bundle that will fail with the following error:
> > > > >
> > > > > InvalidProcessException: Process is retired.
> > > > >
> > > > > In the logs, it's clear that ODE is directing this client call to
> > > > > GetCodes-39 - though the client isn't explicitly attempting to call
> a
> > > > > specific version (is that even possible?).  Any clue why some
> clients
> > > > > periodically - erroneously - are directed by ODE to a retired
> process
> > > > > version?
> > > > >
> > > > >
> > > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > >
> > >
> > >
> > >
> > >
> >
> >
> >
> >
> >
>
>
>


      

Re: Client calling retired process?

Posted by Chris Taylor <sa...@yahoo.com>.
One other thing - I have no idea if this is related or not:  We also use the DeploymentService for bpel deployments into this test environment (and all of our environments, actually).  Every time we start up now, we see errors like the following:
[11/25/08 14:12:04:463 CST] 00000060 SystemOut O 14:12:04,462 ERROR [ProcessStoreImpl] Error loading DU from store: GetProviderDetails-107
org.apache.ode.bpel.iapi.ContextException: Deployed directory null no longer there!
at org.apache.ode.store.ProcessStoreImpl.load(ProcessStoreImpl.java:606)
at org.apache.ode.store.ProcessStoreImpl$6.call(ProcessStoreImpl.java:461)
at org.apache.ode.store.ProcessStoreImpl$Callable.call(ProcessStoreImpl.java:701)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:284)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:665)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:690)
at java.lang.Thread.run(Thread.java:810)
This makes perfect sense, of course, since we are no longer on version GetProviderDetails-107 but on, let's say, GetProviderDetails-200.  But why would ODE continue to look for retired versions?  Given that we are now many versions deep on many of these BPEL processes, these errors fill several pages of log output at startup.




________________________________
From: Matthieu Riou <ma...@offthelip.org>
To: user@ode.apache.org
Sent: Tuesday, November 25, 2008 10:20:07 AM
Subject: Re: Client calling retired process?

On Mon, Nov 24, 2008 at 7:14 AM, Chris Taylor <sa...@yahoo.com> wrote:

> Some more information regarding this error:
>
> we are still seeing this even with the ODE Trunk 1.2.1 deployment. It
> occurs quite rarely, but it seems the catalyst is an OutOfMemoryError raised
> by ODE when a new request comes in:
>

Reviewing the code again I couldn't spot anything that would produce this
behavior. The process or the process data aren't stored in structures that
would be sensitive to OOM. One thing that could help would be a debug log of
BpelEngineImpl when the problem occurs as routing to a given process from
the message happens in BpelEngineImpl.route(). So you could just set that
logger to debug and see the next time it happens.

Thanks,
Matthieu




      

Re: Client calling retired process?

Posted by Alex Boisvert <bo...@intalio.com>.
There were actually 2 issues.

First, the recovery mechanism was a relatively slow batch process and made
recovery longer than necessary.   We are fixing this by making it
incremental instead of batch (think traditional file systems with fsck
versus log-oriented file systems with concurrent repair).

Second, there was an issue that created duplicate jobs
(http://issues.apache.org/jira/browse/ODE-424) when a job failed.  On
systems where failures are frequent and repetitive, this could lead to a
significant number of outstanding jobs and therefore exacerbate the first
problem.
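
To illustrate why duplicated retries pile up so quickly, here is a minimal
sketch. It is not ODE's scheduler code: the class name, the method names and
the "retry is scheduled twice per failure" model are all assumptions made for
illustration only. It compares a queue that re-enqueues a failing job twice
with one that refuses a second pending entry for the same job id.

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Queue;
import java.util.Set;

/** Hypothetical illustration only; this is not ODE's scheduler implementation. */
public class RetrySketch {

    /** Assumed buggy behaviour: every failure enqueues the same job id twice. */
    static int outstandingWithDuplicates(int failingJobs, int rounds) {
        Queue<String> queue = new ArrayDeque<>();
        for (int i = 0; i < failingJobs; i++) {
            queue.add("job-" + i);
        }
        for (int r = 0; r < rounds; r++) {
            int n = queue.size();
            for (int i = 0; i < n; i++) {
                String id = queue.remove();
                // The job fails again and its retry is scheduled twice.
                queue.add(id);
                queue.add(id);
            }
        }
        return queue.size();
    }

    /** Guarded variant: a set of pending ids rejects a duplicate retry. */
    static int outstandingWithGuard(int failingJobs, int rounds) {
        Queue<String> queue = new ArrayDeque<>();
        Set<String> pending = new HashSet<>();
        for (int i = 0; i < failingJobs; i++) {
            String id = "job-" + i;
            queue.add(id);
            pending.add(id);
        }
        for (int r = 0; r < rounds; r++) {
            int n = queue.size();
            for (int i = 0; i < n; i++) {
                String id = queue.remove();
                pending.remove(id);
                // The job fails again; the retry is requested twice,
                // but only the first request is accepted.
                for (int attempt = 0; attempt < 2; attempt++) {
                    if (pending.add(id)) {
                        queue.add(id);
                    }
                }
            }
        }
        return queue.size();
    }

    public static void main(String[] args) {
        System.out.println(outstandingWithDuplicates(3, 5));
        System.out.println(outstandingWithGuard(3, 5));
    }
}
```

Under these assumptions the unguarded queue doubles on every round of
failures, while the guarded one stays at one entry per failing job.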

alex


On Tue, Nov 25, 2008 at 9:47 AM, Chris Taylor <sa...@yahoo.com> wrote:

> Thanks, Alex.  The problem description is a little confusing in this Jira,
> though.  What is it that happens, exactly?
>
>
>
>
> ________________________________
> From: Alex Boisvert <bo...@intalio.com>
> To: user@ode.apache.org
> Sent: Tuesday, November 25, 2008 11:22:49 AM
> Subject: Re: Client calling retired process?
>
> On Tue, Nov 25, 2008 at 9:06 AM, Chris Taylor <sa...@yahoo.com> wrote:
>
> > In the meantime, this is causing a secondary issue in that when we hit the
> > original OOM, we build up a lot of rescheduled jobs (sometimes well over a
> > hundred) apparently for requests that cannot be satisfied.  When the server
> > starts up again, it immediately pegs at full capacity trying to satisfy
> > these.  Other than deleting the rescheduled jobs from ODE_JOB, is there some
> > way to change the configuration of ODE to limit how many of these it
> > reschedules so as not to back it up?
>
>
> This was recently fixed on the Ode 1.x branch
> http://issues.apache.org/jira/browse/ODE-425
>
> but still needs to be ported to the trunk
> http://issues.apache.org/jira/browse/ODE-430
>
> I'm hoping it will happen in the next week or so.
>
> alex
>
>
>
>
>

Re: Client calling retired process?

Posted by Chris Taylor <sa...@yahoo.com>.
Thanks, Alex.  The problem description is a little confusing in this Jira, though.  What is it that happens, exactly?  




________________________________
From: Alex Boisvert <bo...@intalio.com>
To: user@ode.apache.org
Sent: Tuesday, November 25, 2008 11:22:49 AM
Subject: Re: Client calling retired process?

On Tue, Nov 25, 2008 at 9:06 AM, Chris Taylor <sa...@yahoo.com> wrote:

> In the meantime, this is causing a secondary issue in that when we hit the
> original OOM, we build up a lot of rescheduled jobs (sometimes well over a
> hundred) apparently for requests that cannot be satisfied.  When the server
> starts up again, it immediately pegs at full capacity trying to satisfy
> these.  Other than deleting the rescheduled jobs from ODE_JOB, is there some
> way to change the configuration of ODE to limit how many of these it
> reschedules so as not to back it up?


This was recently fixed on the Ode 1.x branch
http://issues.apache.org/jira/browse/ODE-425

but still needs to be ported to the trunk
http://issues.apache.org/jira/browse/ODE-430

I'm hoping it will happen in the next week or so.

alex



      

Re: Client calling retired process?

Posted by Alex Boisvert <bo...@intalio.com>.
On Tue, Nov 25, 2008 at 9:06 AM, Chris Taylor <sa...@yahoo.com> wrote:

> In the meantime, this is causing a secondary issue in that when we hit the
> original OOM, we build up a lot of rescheduled jobs (sometimes well over a
> hundred) apparently for requests that cannot be satisfied.  When the server
> starts up again, it immediately pegs at full capacity trying to satisfy
> these.  Other than deleting the rescheduled jobs from ODE_JOB, is there some
> way to change the configuration of ODE to limit how many of these it
> reschedules so as not to back it up?


This was recently fixed on the Ode 1.x branch
http://issues.apache.org/jira/browse/ODE-425

but still needs to be ported to the trunk
http://issues.apache.org/jira/browse/ODE-430

I'm hoping it will happen in the next week or so.

alex

Re: Client calling retired process?

Posted by Chris Taylor <sa...@yahoo.com>.
We are planning to change our ODE deployment so that it runs on a separate node from other application instances. When we do this, I'll change the logging configuration as you mentioned and capture what happens.

In the meantime, this is causing a secondary issue in that when we hit the original OOM, we build up a lot of rescheduled jobs (sometimes well over a hundred) apparently for requests that cannot be satisfied.  When the server starts up again, it immediately pegs at full capacity trying to satisfy these.  Other than deleting the rescheduled jobs from ODE_JOB, is there some way to change the configuration of ODE to limit how many of these it reschedules so as not to back it up?




________________________________
From: Matthieu Riou <ma...@offthelip.org>
To: user@ode.apache.org
Sent: Tuesday, November 25, 2008 10:20:07 AM
Subject: Re: Client calling retired process?

On Mon, Nov 24, 2008 at 7:14 AM, Chris Taylor <sa...@yahoo.com> wrote:

> Some more information regarding this error:
>
> we are still seeing this even with the ODE Trunk 1.2.1 deployment. It
> occurs quite rarely, but it seems the catalyst is an OutOfMemoryError raised
> by ODE when a new request comes in:
>

Reviewing the code again I couldn't spot anything that would produce this
behavior. The process or the process data aren't stored in structures that
would be sensitive to OOM. One thing that could help would be a debug log of
BpelEngineImpl when the problem occurs as routing to a given process from
the message happens in BpelEngineImpl.route(). So you could just set that
logger to debug and see the next time it happens.

Thanks,
Matthieu





      

Re: Client calling retired process?

Posted by Matthieu Riou <ma...@offthelip.org>.
On Mon, Nov 24, 2008 at 7:14 AM, Chris Taylor <sa...@yahoo.com> wrote:

> Some more information regarding this error:
>
> we are still seeing this even with the ODE Trunk 1.2.1 deployment. It
> occurs quite rarely, but it seems the catalyst is an OutOfMemoryError raised
> by ODE when a new request comes in:
>

Reviewing the code again I couldn't spot anything that would produce this
behavior. The process or the process data aren't stored in structures that
would be sensitive to OOM. One thing that could help would be a debug log of
BpelEngineImpl when the problem occurs as routing to a given process from
the message happens in BpelEngineImpl.route(). So you could just set that
logger to debug and see the next time it happens.
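
As a sketch of that logging change, assuming the ODE web application is
configured through a log4j 1.x properties file (the file location, for
example WEB-INF/classes/log4j.properties, is an assumption and may differ in
a WebSphere setup), the logger for the class could be raised to debug like
this:

```properties
# Assumption: logging goes through log4j 1.x and the logger is named
# after the fully qualified class seen in the stack traces.
log4j.logger.org.apache.ode.bpel.engine.BpelEngineImpl=DEBUG
```

The rest of the configuration can stay at its current level, so only the
routing decisions add to the log volume.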

Thanks,
Matthieu


>
>
> java.lang.OutOfMemoryError
> at org.apache.ode.bpel.engine.MyRoleMessageExchangeImpl$ResponseFuture.get(MyRoleMessageExchangeImpl.java:201)
> at org.apache.ode.axis2.ODEService.onAxisMessageExchange(ODEService.java:149)
> at org.apache.ode.axis2.hooks.ODEMessageReceiver.invokeBusinessLogic(ODEMessageReceiver.java:67)
> at org.apache.ode.axis2.hooks.ODEMessageReceiver.invokeBusinessLogic(ODEMessageReceiver.java:50)
> at org.apache.axis2.receivers.AbstractMessageReceiver.receive(AbstractMessageReceiver.java:96)
> at org.apache.axis2.engine.AxisEngine.receive(AxisEngine.java:145)
> at org.apache.axis2.transport.http.HTTPTransportUtils.processHTTPPostRequest(HTTPTransportUtils.java:275)
> at org.apache.axis2.transport.http.AxisServlet.doPost(AxisServlet.java:120)
> at javax.servlet.http.HttpServlet.service(HttpServlet.java:763)
> at javax.servlet.http.HttpServlet.service(HttpServlet.java:856)
> at com.ibm.ws.webcontainer.servlet.ServletWrapper.service(ServletWrapper.java:1075)
> at com.ibm.ws.webcontainer.servlet.ServletWrapper.handleRequest(ServletWrapper.java:550)
> at com.ibm.ws.wswebcontainer.servlet.ServletWrapper.handleRequest(ServletWrapper.java:478)
> at com.ibm.ws.webcontainer.servlet.CacheServletWrapper.handleRequest(CacheServletWrapper.java:90)
> at com.ibm.ws.webcontainer.WebContainer.handleRequest(WebContainer.java:744)
> at com.ibm.ws.wswebcontainer.WebContainer.handleRequest(WebContainer.java:1455)
> at com.ibm.ws.webcontainer.channel.WCChannelLink.ready(WCChannelLink.java:115)
> at com.ibm.ws.http.channel.inbound.impl.HttpInboundLink.handleDiscrimination(HttpInboundLink.java:458)
> at com.ibm.ws.http.channel.inbound.impl.HttpInboundLink.handleNewInformation(HttpInboundLink.java:387)
> at com.ibm.ws.http.channel.inbound.impl.HttpICLReadCallback.complete(HttpICLReadCallback.java:102)
> at com.ibm.ws.tcp.channel.impl.AioReadCompletionListener.futureCompleted(AioReadCompletionListener.java:165)
> at com.ibm.io.async.AbstractAsyncFuture.invokeCallback(AbstractAsyncFuture.java:217)
> at com.ibm.io.async.AsyncChannelFuture.fireCompletionActions(AsyncChannelFuture.java:161)
> at com.ibm.io.async.AsyncFuture.completed(AsyncFuture.java:136)
>
> at com.ibm.io <http://com.ibm.io.async.resulthandler.com/>
> .async.ResultHandler.complete(ResultHandler.java:195)
>
> at com.ibm.io <http://com.ibm.io.async.resulthandler.ru/>
> .async.ResultHandler.runEventProcessingLoop(ResultHandler.java:743)
>
> at com.ibm.io <http://com.ibm.io.async.re/>
> .async.ResultHandler$2.run(ResultHandler.java:873)
>
> at com.ibm.ws <http://com.ibm.ws.util.th/>
> .util.ThreadPool$Worker.run(ThreadPool.java:1473)
>
>
>
> After WebSphere recovers, and until we redeploy the process in question as
> a new version, ODE attempts to route subsequent requests to a retired
> version.
>
>
>
> [11/20/08 14:29:26:968 CST] 00000046 SystemOut O 14:29:26,967 ERROR [BpelEngineImpl] Scheduled job failed; jobDetail={type=INVOKE_INTERNAL, pid={http://eclipse.org/bpel/sample}AdminYNProcess-195, inmem=true, mexid=4611686018427387977}
> org.apache.ode.bpel.runtime.InvalidProcessException: Process is retired.
> at org.apache.ode.bpel.engine.PartnerLinkMyRoleImpl.invokeNewInstance(PartnerLinkMyRoleImpl.java:173)
> at org.apache.ode.bpel.engine.BpelProcess.invokeProcess(BpelProcess.java:204)
> at org.apache.ode.bpel.engine.BpelProcess.handleWorkEvent(BpelProcess.java:372)
> at org.apache.ode.bpel.engine.BpelEngineImpl.onScheduledJob(BpelEngineImpl.java:326)
> at org.apache.ode.bpel.engine.BpelServerImpl.onScheduledJob(BpelServerImpl.java:373)
> at org.apache.ode.scheduler.simple.SimpleScheduler$4$1.call(SimpleScheduler.java:337)
> at org.apache.ode.scheduler.simple.SimpleScheduler$4$1.call(SimpleScheduler.java:336)
> at org.apache.ode.scheduler.simple.SimpleScheduler.execTransaction(SimpleScheduler.java:174)
> at org.apache.ode.scheduler.simple.SimpleScheduler$4.call(SimpleScheduler.java:335)
> at org.apache.ode.scheduler.simple.SimpleScheduler$4.call(SimpleScheduler.java:332)
> at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:284)
> at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:665)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:690)
> at java.lang.Thread.run(Thread.java:810)
>
> Attached is the Java core dump file from the time of the original
> OutOfMemoryError, showing that it was caused by excessive garbage
> collection.  The VM this runs under is allocated 1 GB of heap memory.
>
> - Chris Taylor
>
>  ------------------------------
> *From:* Matthieu Riou <ma...@offthelip.org>
> *To:* user@ode.apache.org
> *Cc:* Dave Cecchi <da...@perficient.com>
> *Sent:* Thursday, October 16, 2008 10:40:57 AM
> *Subject:* Re: Client calling retired process?
>
> On Wed, Oct 15, 2008 at 9:27 AM, Chris Taylor <sa...@yahoo.com> wrote:
>
> > Matthieu, yes, I would appreciate it if you could put the latest built
> > WAR somewhere.  We have attempted to build with buildr without success.
> >
>
> Here it is:
>
> http://people.apache.org/~mriou/ode-axis2-war-1.2.1-SNAPSHOT.war
>
> Let me know how it goes.
>
> Cheers,
> Matthieu
>
>
> >
> >
> >
> > ----- Original Message ----
> > From: Matthieu Riou <ma...@offthelip.org>
> > To: user@ode.apache.org
> > Sent: Monday, October 13, 2008 1:30:56 PM
> > Subject: Re: Client calling retired process?
> >
> > On Mon, Oct 13, 2008 at 10:55 AM, Chris Taylor <sa...@yahoo.com> wrote:
> >
> > > Thanks, Matthieu.  Some background:
> > >
> > > We're running ODE 1.2 on WebSphere 6.1, with Oracle 10g as the process
> > > store.
> > >
> > > This scenario consistently fails in the manner I described, but it seems
> > > only for certain processes.
> > >
> > > So, for example, if I have the following:
> > >
> > > ProcessA-20
> > > ProcessB-21
> > > ProcessC-22
> > >
> > > deployed in my environment, the scenario would be that something causes
> > > ProcessA-20 to hang, at which point it goes into recovery mode and spawns
> > > an ODE job to retry.  From this point on, not only do new requests to
> > > ProcessA get routed to the now-retired ProcessA-19, but new requests to
> > > ProcessB also get routed to the now-retired ProcessB-20!  The weird thing
> > > is, ProcessC-22 is apparently unaffected.  Calls to it are still routed
> > > correctly to its latest versioned deployment, ProcessC-22.
> > >
> > > I do not know if this happens under other scenarios unrelated to recovery.
> > > I think I just do not have enough data points yet to say.
> > >
> > >
> >
> > If you have a reproducible test scenario, it would be great if you could
> > try it with the current stable branch. I fixed something related to what
> > you're describing a couple of months ago. If doing a build is an issue for
> > you, I can upload the WAR to a public place.
> >
> > Thanks,
> > Matthieu
> >
> >
> > >
> > >
> > >
> > > ----- Original Message ----
> > > From: Matthieu Riou <ma...@offthelip.org>
> > > To: user@ode.apache.org
> > > Sent: Monday, October 13, 2008 12:33:18 PM
> > > Subject: Re: Client calling retired process?
> > >
> > > On Mon, Oct 13, 2008 at 8:17 AM, Chris Taylor <sa...@yahoo.com> wrote:
> > >
> > > > Thanks, Alexis, but I'm no closer to fully understanding why this occurs.
> > > > It now happens periodically, almost every day, with different deployed
> > > > processes.  Although I don't understand it, I have done some research into
> > > > the behaviour.  Here's a scenario:
> > > >
> > > > We'll deploy ProcessA-19, then retire it by deploying ProcessA-20.  At
> > > > some point it, or another process, will fail and attempt to go into
> > > > recovery mode (excuse me if I state this incorrectly).  At this point ODE
> > > > will create a scheduled job in an attempt to retry the service later.
> > > >
> > > > Here's where it gets screwy.  From then on, all new calls to ProcessA will
> > > > not route to ProcessA-20; instead ODE will attempt to route them to
> > > > ProcessA-19, which is of course retired.  ODE does not recover from this.
> > > > It seems the only way to compensate is to redeploy ProcessA as ProcessA-21.
> > > > New requests will then route correctly.
> > > >
> > > > Any idea here?
> > > >
> > > > Any idea here?
> > > >
> > >
> > > I'll have to ask a few more questions to narrow it down and make sure I
> > > understand correctly:
> > >
> > >  * Does the exact same scenario sometimes work and sometimes not?
> > >  * Does it always happen in relation to recovery and retry, or did you
> > >    see it happen in other situations as well?
> > >  * Which version of ODE are you using? Have you tried with a recent 1.X
> > >    branch?
> > >
> > > Thanks,
> > > Matthieu
> > >
> > >
> > > >
> > > >
> > > >
> > > > ----- Original Message ----
> > > > From: Alexis Midon <mi...@intalio.com>
> > > > To: user@ode.apache.org
> > > > Sent: Wednesday, October 8, 2008 7:26:54 PM
> > > > Subject: Re: Client calling retired process?
> > > >
> > > > Hi Chris,
> > > >
> > > > No new executions can be started on a retired process, but running
> > > > instances can still finish their job. [1]
> > > >
> > > > I'm not really familiar with this part of the code, but after looking at
> > > > it, it seems to me that the deployment of a new version is not atomic,
> > > > meaning that a process could be flagged as retired while the creation of
> > > > a new instance is in progress, hence your exception.
> > > >
> > > > Does that make sense for your scenario? Is it possible that the process
> > > > gets retired while messages are coming in?
> > > >
> > > > [1] further details here:
> > > > http://ode.apache.org/user-guide.html#UserGuide-Versioning
> > > >
> > > >
> > > >
> > > > On Wed, Oct 8, 2008 at 11:37 AM, Chris Taylor <sa...@yahoo.com> wrote:
> > > >
> > > > > Okay, I have a deployment bundle (called GetCodes) that includes 5
> > > > > processes.  4 of the processes make calls to the fifth (it's an
> > > > > abstraction layer of process business logic).  When I deploy this
> > > > > "GetCodes" bundle using the DeploymentService utility, I can see an
> > > > > incremented deployment (say, GetCodes-40) alongside previous iterations.
> > > > >
> > > > > Occasionally, a client making SOAP calls to one of the processes
> > > > > under this logical bundle will fail with the following error:
> > > > >
> > > > > InvalidProcessException: Process is retired.
> > > > >
> > > > > In the logs, it's clear that ODE is directing this client call to
> > > > > GetCodes-39 - though the client isn't explicitly attempting to call a
> > > > > specific version (is that even possible?).  Any clue why some clients
> > > > > periodically - erroneously - are directed by ODE to a retired process
> > > > > version?
> > > > >
> > > > >
> > > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > >
> > >
> > >
> > >
> > >
> >
> >
> >
> >
> >
>
>
>