Posted to dev@ode.apache.org by "Kurt Westerfeld (JIRA)" <ji...@apache.org> on 2010/10/29 17:32:22 UTC

[jira] Reopened: (ODE-894) BPEL with <pick> followed by <invoke> fails with NPE when using JPA persistence

     [ https://issues.apache.org/jira/browse/ODE-894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kurt Westerfeld reopened ODE-894:
---------------------------------


Hi, we found a way to reproduce this problem.

We stumbled on a workaround for our use case with ODE.  Initially, we needed only one BPEL process that used correlation and <pick> constructs; that process was marked <in-memory>false</in-memory>, with all others set to "true".  Our needs have changed, and we now need more than one such process.  When two such BPEL processes interact with one another through a pick/correlation, this persistence bug crops up.

So, we really need to get this fixed for our use case.  The change posted is the way to go--we tested it and it works great.

Do I need to file a testcase for this?  In all likelihood, the "dynamic partner" testcase from the ODE site would reproduce it, as it seems to use correlation keys.

My theory is that after an <invoke> with correlation, the recipient replies before the persisted message has made its way to the database, i.e. a race condition between threads.  I'm not certain, however.  It doesn't make sense that the caller is influenced by the receiver's in-memory setting; it would be much easier to understand if the receiver had this issue.

We also tried setting usePeer2Peer on the <invoke> in the process deployment settings, with no luck.

> BPEL with <pick> followed by <invoke> fails with NPE when using JPA persistence
> -------------------------------------------------------------------------------
>
>                 Key: ODE-894
>                 URL: https://issues.apache.org/jira/browse/ODE-894
>             Project: ODE
>          Issue Type: Bug
>          Components: BPEL Runtime
>    Affects Versions: 1.3.4
>         Environment: JBI distribution on servicemix 3.3.2
>            Reporter: Kurt Westerfeld
>
> We have a bpel process which contains a <pick> followed by a few <assign> and an <invoke> operation.  When running this process, we can resume the <pick> but soon afterwards an NPE occurs as in the following stack trace:
> 13:04:04,013 | ERROR | pool-5-thread-1 | SimpleScheduler          | .simple.SimpleScheduler$RunJob  545 | Error while processing a persisted job: Job hqejbhcnphr5nd9dfp0pnt time: 2010-10-05 13:04:00 EDT transacted: true persisted: true details: JobDetails( instanceId: null mexId: hqejbhcnphr5nd9dfp0pns processId: {(endpoint-name-removed)-0 type: INVOKE_INTERNAL channel: null correlatorId: null correlationKeySet: null retryCount: null inMem: false detailsExt: {})
> org.apache.ode.bpel.iapi.Scheduler$JobProcessorException: java.lang.NullPointerException
> 	at org.apache.ode.bpel.engine.BpelEngineImpl.onScheduledJob(BpelEngineImpl.java:478)
> 	at org.apache.ode.bpel.engine.BpelServerImpl.onScheduledJob(BpelServerImpl.java:450)
> 	at org.apache.ode.scheduler.simple.SimpleScheduler$RunJob$1.call(SimpleScheduler.java:518)
> 	at org.apache.ode.scheduler.simple.SimpleScheduler$RunJob$1.call(SimpleScheduler.java:513)
> 	at org.apache.ode.scheduler.simple.SimpleScheduler.execTransaction(SimpleScheduler.java:284)
> 	at org.apache.ode.scheduler.simple.SimpleScheduler.execTransaction(SimpleScheduler.java:239)
> 	at org.apache.ode.scheduler.simple.SimpleScheduler$RunJob.call(SimpleScheduler.java:512)
> 	at org.apache.ode.scheduler.simple.SimpleScheduler$RunJob.call(SimpleScheduler.java:496)
> 	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> 	at java.lang.Thread.run(Thread.java:619)
> Caused by: java.lang.NullPointerException
> 	at org.apache.ode.bpel.engine.MessageImpl.getMessage(MessageImpl.java:104)
> 	at org.apache.ode.bpel.engine.PartnerLinkMyRoleImpl.computeCorrelationKeys(PartnerLinkMyRoleImpl.java:294)
> 	at org.apache.ode.bpel.engine.PartnerLinkMyRoleImpl.findRoute(PartnerLinkMyRoleImpl.java:122)
> 	at org.apache.ode.bpel.engine.BpelProcess.invokeProcess(BpelProcess.java:233)
> 	at org.apache.ode.bpel.engine.BpelProcess.invokeProcess(BpelProcess.java:279)
> 	at org.apache.ode.bpel.engine.BpelProcess.handleJobDetails(BpelProcess.java:426)
> 	at org.apache.ode.bpel.engine.BpelEngineImpl.onScheduledJob(BpelEngineImpl.java:460)
> 	... 12 more
> This problem then persists for a while, as the job keeps being rescheduled and refired.
> While diagnosing the problem, we found that the dao-jpa persistence never persists the MessageDAOImpl along with the MEX that created it.  The NPE occurs when the internal <pick> resumes on a message exchange that is later scheduled and hydrated from persistence, but the hydrated MEX does not contain the "receive" message.
> Specifically, the class org.apache.ode.dao.jpa.MessageExchangeDAOImpl does not call getEM().persist() on the MessageDAOImpl, and there is no cascading one-to-many setup to cause the MessageDAOImpl to be persisted.  The hibernate dao does persist the message in this case, and testing with the hibernate back-end caused the issue to go away.  However, we are experiencing other "primary constraint" violations with the hibernate back-end, so we want to use the JPA back-end.
> Here is a minimal patch which we tested which fixes the issue:
>   Index: dao-jpa/src/main/java/org/apache/ode/dao/jpa/MessageExchangeDAOImpl.java
>   ===================================================================
>   --- dao-jpa/src/main/java/org/apache/ode/dao/jpa/MessageExchangeDAOImpl.java    (revision 997965)
>   +++ dao-jpa/src/main/java/org/apache/ode/dao/jpa/MessageExchangeDAOImpl.java    (working copy)
>   @@ -128,6 +128,7 @@
>        public MessageDAO createMessage(QName type) {
>            MessageDAOImpl ret = new MessageDAOImpl(type,this);
>   +        getEM().persist(ret);
>            return ret ;
>        }
> We do not know whether this patch is the right way to fix this, but it does seem to be similar to the hibernate back-end. One issue with this minimal change is potentially not having the MessageDAOImpl cleanup in place--not sure how best to approach that.
> Another approach considered could be to harden the PartnerLinkMyRoleImpl.computeCorrelationKeys method to ensure it doesn't get an NPE when this message component is missing.  Not sure which way is best.
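
To make the two options above concrete, here is a rough, self-contained sketch in plain Java.  It is only a toy model and does not use any ODE or JPA classes: Store, Mex, createExchange and correlationKey are hypothetical names invented for this illustration.  The persistMessage flag plays the role of the one-line getEM().persist(ret) change, and the null check in correlationKey plays the role of hardening computeCorrelationKeys.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of the failure; none of these types are ODE classes. */
public class MessagePersistenceSketch {

    /** Hypothetical stand-in for a persisted message exchange (MEX) row. */
    static final class Mex {
        final String id;
        final String requestMessageId; // reference to the message row

        Mex(String id, String requestMessageId) {
            this.id = id;
            this.requestMessageId = requestMessageId;
        }
    }

    /** Hypothetical stand-in for the database: only persisted objects survive. */
    static final class Store {
        final Map<String, Mex> mexes = new HashMap<>();
        final Map<String, String> messages = new HashMap<>();

        void persistMex(Mex mex) { mexes.put(mex.id, mex); }
        void persistMessage(String id, String body) { messages.put(id, body); }
        Mex loadMex(String id) { return mexes.get(id); }
        String loadMessage(String id) { return messages.get(id); } // null if never persisted
    }

    /** Creates a MEX and its request message; persistMessage toggles the "persist on create" fix. */
    static Mex createExchange(Store store, String mexId, String body, boolean persistMessage) {
        String messageId = mexId + "-request";
        if (persistMessage) {
            store.persistMessage(messageId, body);
        }
        Mex mex = new Mex(mexId, messageId);
        store.persistMex(mex);
        return mex;
    }

    /** Defensive alternative: report the missing message instead of failing with a bare NPE. */
    static String correlationKey(Store store, String mexId) {
        Mex mex = store.loadMex(mexId);
        if (mex == null) {
            throw new IllegalStateException("No message exchange " + mexId);
        }
        String body = store.loadMessage(mex.requestMessageId);
        if (body == null) {
            throw new IllegalStateException(
                "Message exchange " + mexId + " has no persisted request message");
        }
        return "key:" + body;
    }

    public static void main(String[] args) {
        Store store = new Store();

        // With the message persisted, the later lookup succeeds.
        createExchange(store, "mex-1", "order-42", true);
        System.out.println(correlationKey(store, "mex-1"));

        // Without it, the reloaded MEX points at a message that was never stored.
        createExchange(store, "mex-2", "order-43", false);
        try {
            correlationKey(store, "mex-2");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Note that in this model the guard only replaces the NPE with a clearer error; the job still cannot proceed because the message content is gone.  That is why persisting the message when it is created looks like the more complete of the two options, at least in this simplified picture.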
