
Posted to user@ode.apache.org by Chris Taylor <sa...@yahoo.com> on 2008/11/24 16:14:32 UTC

Re: Client calling retired process?

Some more information regarding this error:

We are still seeing this even with the ODE Trunk 1.2.1 deployment. It occurs quite rarely, but the catalyst seems to be an OutOfMemoryError raised by ODE when a new request comes in:
java.lang.OutOfMemoryError
at org.apache.ode.bpel.engine.MyRoleMessageExchangeImpl$ResponseFuture.get(MyRoleMessageExchangeImpl.java:201)
at org.apache.ode.axis2.ODEService.onAxisMessageExchange(ODEService.java:149)
at org.apache.ode.axis2.hooks.ODEMessageReceiver.invokeBusinessLogic(ODEMessageReceiver.java:67)
at org.apache.ode.axis2.hooks.ODEMessageReceiver.invokeBusinessLogic(ODEMessageReceiver.java:50)
at org.apache.axis2.receivers.AbstractMessageReceiver.receive(AbstractMessageReceiver.java:96)
at org.apache.axis2.engine.AxisEngine.receive(AxisEngine.java:145)
at org.apache.axis2.transport.http.HTTPTransportUtils.processHTTPPostRequest(HTTPTransportUtils.java:275)
at org.apache.axis2.transport.http.AxisServlet.doPost(AxisServlet.java:120)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:763)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:856)
at com.ibm.ws.webcontainer.servlet.ServletWrapper.service(ServletWrapper.java:1075)
at com.ibm.ws.webcontainer.servlet.ServletWrapper.handleRequest(ServletWrapper.java:550)
at com.ibm.ws.wswebcontainer.servlet.ServletWrapper.handleRequest(ServletWrapper.java:478)
at com.ibm.ws.webcontainer.servlet.CacheServletWrapper.handleRequest(CacheServletWrapper.java:90)
at com.ibm.ws.webcontainer.WebContainer.handleRequest(WebContainer.java:744)
at com.ibm.ws.wswebcontainer.WebContainer.handleRequest(WebContainer.java:1455)
at com.ibm.ws.webcontainer.channel.WCChannelLink.ready(WCChannelLink.java:115)
at com.ibm.ws.http.channel.inbound.impl.HttpInboundLink.handleDiscrimination(HttpInboundLink.java:458)
at com.ibm.ws.http.channel.inbound.impl.HttpInboundLink.handleNewInformation(HttpInboundLink.java:387)
at com.ibm.ws.http.channel.inbound.impl.HttpICLReadCallback.complete(HttpICLReadCallback.java:102)
at com.ibm.ws.tcp.channel.impl.AioReadCompletionListener.futureCompleted(AioReadCompletionListener.java:165)
at com.ibm.io.async.AbstractAsyncFuture.invokeCallback(AbstractAsyncFuture.java:217)
at com.ibm.io.async.AsyncChannelFuture.fireCompletionActions(AsyncChannelFuture.java:161)
at com.ibm.io.async.AsyncFuture.completed(AsyncFuture.java:136)
at com.ibm.io.async.ResultHandler.complete(ResultHandler.java:195)
at com.ibm.io.async.ResultHandler.runEventProcessingLoop(ResultHandler.java:743)
at com.ibm.io.async.ResultHandler$2.run(ResultHandler.java:873)
at com.ibm.ws.util.ThreadPool$Worker.run(ThreadPool.java:1473)
After WebSphere recovers, ODE attempts to route subsequent requests to a retired version, and it keeps doing so until we redeploy the process in question as a new version.
 [11/20/08 14:29:26:968 CST] 00000046 SystemOut O 14:29:26,967 ERROR [BpelEngineImpl] Scheduled job failed; jobDetail={type=INVOKE_INTERNAL, pid={http://eclipse.org/bpel/sample}AdminYNProcess-195, inmem=true, mexid=4611686018427387977}
org.apache.ode.bpel.runtime.InvalidProcessException: Process is retired.
at org.apache.ode.bpel.engine.PartnerLinkMyRoleImpl.invokeNewInstance(PartnerLinkMyRoleImpl.java:173)
at org.apache.ode.bpel.engine.BpelProcess.invokeProcess(BpelProcess.java:204)
at org.apache.ode.bpel.engine.BpelProcess.handleWorkEvent(BpelProcess.java:372)
at org.apache.ode.bpel.engine.BpelEngineImpl.onScheduledJob(BpelEngineImpl.java:326)
at org.apache.ode.bpel.engine.BpelServerImpl.onScheduledJob(BpelServerImpl.java:373)
at org.apache.ode.scheduler.simple.SimpleScheduler$4$1.call(SimpleScheduler.java:337)
at org.apache.ode.scheduler.simple.SimpleScheduler$4$1.call(SimpleScheduler.java:336)
at org.apache.ode.scheduler.simple.SimpleScheduler.execTransaction(SimpleScheduler.java:174)
at org.apache.ode.scheduler.simple.SimpleScheduler$4.call(SimpleScheduler.java:335)
at org.apache.ode.scheduler.simple.SimpleScheduler$4.call(SimpleScheduler.java:332)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:284)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:665)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:690)
at java.lang.Thread.run(Thread.java:810)

Attached is the Java core dump file from the time of the original OutOfMemoryError, showing that it was caused by excessive garbage collection.  The VM this runs under allocates 1 GB of memory on the heap.

- Chris Taylor 

________________________________
From: Matthieu Riou <ma...@offthelip.org>
To: user@ode.apache.org
Cc: Dave Cecchi <da...@perficient.com>
Sent: Thursday, October 16, 2008 10:40:57 AM
Subject: Re: Client calling retired process?

On Wed, Oct 15, 2008 at 9:27 AM, Chris Taylor <sa...@yahoo.com> wrote:

> Matthieu, yes, I would appreciate it if you could put the latest built WAR
> somewhere.  We have attempted to build with buildr without success.
>

Here it is:

http://people.apache.org/~mriou/ode-axis2-war-1.2.1-SNAPSHOT.war

Let me know how it goes.

Cheers,
Matthieu

>
>
>
> ----- Original Message ----
> From: Matthieu Riou <ma...@offthelip.org>
> To: user@ode.apache.org
> Sent: Monday, October 13, 2008 1:30:56 PM
> Subject: Re: Client calling retired process?
>
> On Mon, Oct 13, 2008 at 10:55 AM, Chris Taylor <sa...@yahoo.com> wrote:
>
> > Thanks, Matthieu.  Some background:
> >
> > > We're running ODE 1.2 on WebSphere 6.1, with Oracle 10g as the process
> > store.
> >
> > This scenario consistently fails in the manner I described, but it seems
> > only for certain processes.
> >
> > So, for example, if i have the following:
> >
> > ProcessA-20
> > ProcessB-21
> > ProcessC-22
> >
> > > deployed in my environment, the scenario would be that something causes
> > > ProcessA-20 to hang - at which point it goes into recovery mode and
> > > spawns an ODE job to retry.  From this point on, not only do new requests
> > > to ProcessA get routed to the now-retired ProcessA-19, but new requests
> > > to ProcessB also get routed to (now-retired) ProcessB-20!  The weird
> > > thing is, ProcessC-22 is apparently unaffected.  It still gets calls
> > > legitimately routed to its latest versioned deployment, ProcessC-22.
> > >
> > > I do not know if this happens under other scenarios unrelated to
> > > recovery.  I think I just do not have enough data points yet to say.
> >
> >
>
> > If you have a reproducible test scenario, it would be great if you could
> > try it with the current stable branch. I fixed something related to what
> > you're describing a couple of months ago. If doing a build is an issue for
> > you, I can upload the WAR to a public place.
>
> Thanks,
> Matthieu
>
>
> >
> >
> >
> > ----- Original Message ----
> > From: Matthieu Riou <ma...@offthelip.org>
> > To: user@ode.apache.org
> > Sent: Monday, October 13, 2008 12:33:18 PM
> > Subject: Re: Client calling retired process?
> >
> > > On Mon, Oct 13, 2008 at 8:17 AM, Chris Taylor <sa...@yahoo.com> wrote:
> >
> > > > Thanks, Alexis, but I'm no closer to fully understanding why this
> > > > occurs.  It now happens periodically, almost every day, with different
> > > > deployed processes.  Although I don't understand it, I have done some
> > > > research into the behaviour.  Here's a scenario:
> > > >
> > > > We'll deploy ProcessA-19, then retire it with the ProcessA-20
> > > > deployment.  At some point it, or another process, will fail and attempt
> > > > to go into recovery mode (excuse me if I state this incorrectly).  At
> > > > this point ODE will create a scheduled job in an attempt to retry the
> > > > service later.
> > > >
> > > > Here's where it gets screwy.  From then on, all new calls to ProcessA
> > > > will not route to ProcessA-20; instead ODE will attempt to route them to
> > > > ProcessA-19, which is of course retired.  ODE does not recover from
> > > > this.  It seems the only way to compensate is to redeploy ProcessA as
> > > > ProcessA-21.  New requests will then route correctly.
> > >
> > > Any idea here?
> > >
> >
> > I'll have to ask a few more questions to narrow it down and make sure I
> > understand correctly:
> >
> >  * Does the exact same scenario sometimes works and sometimes doesn't?
> >  * Is it always happening in relation with recovery and retry or did you
> > see it happen in other situations as well?
> >  * Which version of ODE are you using? Have you tried with a recent 1.X
> > branch?
> >
> > Thanks,
> > Matthieu
> >
> >
> > >
> > >
> > >
> > > ----- Original Message ----
> > > From: Alexis Midon <mi...@intalio.com>
> > > To: user@ode.apache.org
> > > Sent: Wednesday, October 8, 2008 7:26:54 PM
> > > Subject: Re: Client calling retired process?
> > >
> > > Hi Chris,
> > >
> > > > No new executions can be started on a retired process, but running
> > > > instances can still finish their job. [1]
> > > >
> > > > I'm not really familiar with this part of the code, but after looking
> > > > at it, it seems to me that the deployment of a new version is not
> > > > atomic, meaning that a process could be flagged as retired while the
> > > > creation of a new instance is in progress, hence your exception.
> > > >
> > > > Does that make sense for your scenario? Is it possible that the process
> > > > gets retired while messages are coming in?
> > >
> > > [1] further details here:
> > > http://ode.apache.org/user-guide.html#UserGuide-Versioning
> > >
> > >
> > >
> > > > On Wed, Oct 8, 2008 at 11:37 AM, Chris Taylor <sa...@yahoo.com> wrote:
> > >
> > > > > Okay, I've a deployment bundle (called GetCodes) that includes 5
> > > > > processes.  4 of the processes make calls to the fifth (it's an
> > > > > abstraction layer of process business logic).  When I deploy this
> > > > > "GetCodes" bundle using the DeploymentService utility, I can see an
> > > > > incremented deployment (say, GetCodes-40) alongside previous
> > > > > iterations.
> > > > >
> > > > > Occasionally, I'll have a client making SOAP calls to one of the
> > > > > processes under this logical bundle that will fail with the following
> > > > > error:
> > > >
> > > > InvalidProcessException: Process is retired.
> > > >
> > > > In the logs, it's clear that ODE is directing this client call to
> > > > GetCodes-39 - though the client isn't explicitly attempting to call a
> > > > specific version (is that even possible?).  Any clue why some clients
> > > > periodically - erroneously - are directed by ODE to a retired process
> > > > version?
> > > >
> > > >
> > > >
> > >
> > >
> > >
> > >
> > >
> >
> >
> >
> >
> >
>
>
>
>
>
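
Alexis's suggestion in the quoted thread above (a process flagged as retired while a new instance is still being created) describes a check-then-act race. The following is only a minimal sketch of one way to avoid such a race, by publishing the routable version for each process type in a single atomic step. The class `VersionRouter` and its methods are hypothetical names made up for illustration; this is not ODE's code and says nothing about how ODE actually stores or routes versions.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical illustration only; not part of ODE.
class VersionRouter {
    // Process type -> the one version that should receive new requests.
    private final ConcurrentMap<String, Long> current = new ConcurrentHashMap<>();

    // Publishes a version atomically. Keeping the maximum means a stale
    // (older) deployment event can never move routing backwards.
    void deploy(String type, long version) {
        current.merge(type, version, Long::max);
    }

    // Returns the version that new requests for this type are routed to.
    long route(String type) {
        Long version = current.get(type);
        if (version == null) {
            throw new IllegalStateException("No deployed version for " + type);
        }
        return version;
    }

    public static void main(String[] args) {
        VersionRouter router = new VersionRouter();
        router.deploy("ProcessA", 19);
        router.deploy("ProcessA", 20);
        router.deploy("ProcessA", 19); // stale event, ignored by Long::max
        System.out.println("ProcessA routes to version " + router.route("ProcessA"));
    }
}
```

Because the lookup and the update go through the same concurrent map entry, a request either sees the old version or the new one, never a version that has been superseded by an update that already completed.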

Re: Client calling retired process?

Posted by Alex Boisvert <bo...@intalio.com>.

On Tue, Nov 25, 2008 at 2:01 PM, Chris Taylor <sa...@yahoo.com> wrote:

> Incidentally, in analyzing this issue it would seem that the process store
> was holding multiple "Active" versions of the same process (so, multiple
> PIDs of the same Type in the STORE_PROCESS were Active at the same time).
>
> I wonder if this development environment, wherein we are using the
> DeploymentService to deploy Deployment Units (and thus versioned deployment
> directories), combined with a couple different deployments of the ODE
> runtime engine was causing these "logically" retired processes to hang
> around as Active.
>
> Going out on a limb here, but does it make sense during times of recovery
> that the engine might get confused about which process Type to route to if
> there were multiple Active Process IDs of it at the same time?


I think that's entirely possible.   Matthieu could confirm since he's more
familiar with that part of the code.

alex
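
The condition Chris describes (several versions of the same process type marked Active at once) can be checked mechanically once the process store contents are loaded into memory. The sketch below is an assumption-laden illustration: `ProcessEntry` and `duplicateActives` are hypothetical names, and the sample data simply mirrors the ProcessA/ProcessB/ProcessC example from earlier in the thread. It does not read the real STORE_PROCESS table, whose column layout is not given here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; not part of ODE.
class ActiveVersionCheck {
    // One deployed version of a process, as it might be read from a store.
    static final class ProcessEntry {
        final String type;
        final long version;
        final boolean active;

        ProcessEntry(String type, long version, boolean active) {
            this.type = type;
            this.version = version;
            this.active = active;
        }
    }

    // Returns, per process type, the active versions for every type that
    // has more than one active version at the same time.
    static Map<String, List<Long>> duplicateActives(List<ProcessEntry> entries) {
        Map<String, List<Long>> activeByType = new TreeMap<>();
        for (ProcessEntry e : entries) {
            if (e.active) {
                activeByType.computeIfAbsent(e.type, k -> new ArrayList<>()).add(e.version);
            }
        }
        // Drop types with at most one active version; those are healthy.
        activeByType.values().removeIf(versions -> versions.size() < 2);
        return activeByType;
    }

    public static void main(String[] args) {
        List<ProcessEntry> entries = Arrays.asList(
            new ProcessEntry("ProcessA", 19, true),   // should have been retired
            new ProcessEntry("ProcessA", 20, true),
            new ProcessEntry("ProcessB", 20, false),
            new ProcessEntry("ProcessB", 21, true),
            new ProcessEntry("ProcessC", 22, true));
        System.out.println(duplicateActives(entries)); // prints {ProcessA=[19, 20]}
    }
}
```

Any type reported by such a check would be a candidate for the ambiguous routing discussed above, though whether that is what the engine actually trips over is still unconfirmed in this thread.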