Posted to dev@geronimo.apache.org by Florent Guillaume <fg...@nuxeo.com> on 2011/03/11 16:17:29 UTC

txmanager: shouldn't connection be removed from pool if it fails to enlist?

Hi David, all,

I have the following situation using txmanager (2.1.3) as a standalone
component in my application.

ConnectionFactory.getConnection
-> GenericConnectionManager.allocateConnection
-> TransactionEnlistingInterceptor.getConnection
-> TransactionImpl.enlistResource
-> xaRes.start

My XAResource fails in start for internal reasons (no network
resources available) and throws an XAException with error code
XAER_RMERR.
enlistResource catches this and returns false.
But the caller, TransactionEnlistingInterceptor.getConnection, ignores
the return value and assumes all went well. So the corrupted XAResource
stays in the pool and is still corrupted on the next try.

In my opinion it should return the connection to the pool with a DESTROY action.

There is a code path that does this when it catches SystemException,
but that exception is never raised here. I see two possible fixes:
1. make TransactionImpl.enlistResource throw SystemException, at least
when it gets XAER_RMERR,
2. make TransactionEnlistingInterceptor.getConnection check for a false
return value from enlistResource and, in that case, do a DESTROY as
well.
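
To make option 2 concrete, here is a minimal, self-contained sketch. It does not use the real Geronimo classes: EnlistSketch, Resource, PooledConnection, Pool and both getConnection variants are hypothetical stand-ins, and IllegalStateException stands in for whatever exception the real interceptor would raise. Only XAException and XAER_RMERR come from the JDK.

```java
import javax.transaction.xa.XAException;
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical model of the problem and of fix option 2.
// None of these types are the real txmanager/connector classes.
class EnlistSketch {

    /** Stand-in for an XAResource; only start() matters here. */
    interface Resource {
        void start() throws XAException;
    }

    /** Stand-in for a pooled connection wrapping one resource. */
    static final class PooledConnection {
        final Resource resource;
        PooledConnection(Resource resource) { this.resource = resource; }
    }

    /** Stand-in for the pool: a connection is either reused or destroyed. */
    static final class Pool {
        final Deque<PooledConnection> idle = new ArrayDeque<>();
        int destroyed = 0;

        void returnToPool(PooledConnection c) { idle.push(c); }
        PooledConnection take() { return idle.pop(); }
        void destroy(PooledConnection c) { destroyed++; } // never goes back to idle
    }

    /** Mirrors the described behaviour: an XAException is turned into "false". */
    static boolean enlist(Resource r) {
        try {
            r.start();
            return true;
        } catch (XAException e) {
            return false;
        }
    }

    /** Behaviour as described in the report: the return value is ignored. */
    static PooledConnection getConnectionUnchecked(Pool pool) {
        PooledConnection c = pool.take();
        enlist(c.resource); // failure goes unnoticed
        return c;
    }

    /** Option 2: check the return value and destroy the connection on failure. */
    static PooledConnection getConnectionChecked(Pool pool) {
        PooledConnection c = pool.take();
        if (!enlist(c.resource)) {
            pool.destroy(c);
            throw new IllegalStateException("could not enlist resource, connection destroyed");
        }
        return c;
    }

    public static void main(String[] args) {
        Resource broken = () -> { throw new XAException(XAException.XAER_RMERR); };
        Pool pool = new Pool();
        pool.returnToPool(new PooledConnection(broken));

        // Unchecked: the broken connection is handed out and later returned for reuse.
        pool.returnToPool(getConnectionUnchecked(pool));
        System.out.println("idle after unchecked call: " + pool.idle.size());

        // Checked: the broken connection is destroyed instead of being reused.
        try {
            getConnectionChecked(pool);
        } catch (IllegalStateException e) {
            System.out.println("checked call failed: " + e.getMessage());
        }
        System.out.println("idle: " + pool.idle.size() + ", destroyed: " + pool.destroyed);
    }
}
```

Option 1 would move the decision into enlist itself (throwing instead of returning false for XAER_RMERR), so that the existing SystemException path does the DESTROY; the sketch above only covers option 2.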

What do you think?
I can provide a JIRA ticket and a patch if needed.

Florent

-- 
Florent Guillaume, Director of R&D, Nuxeo
Open Source, Java EE based, Enterprise Content Management (ECM)
http://www.nuxeo.com   http://www.nuxeo.org   +33 1 40 33 79 87

Re: txmanager: shouldn't connection be removed from pool if it fails to enlist?

Posted by Florent Guillaume <fg...@nuxeo.com>.
https://issues.apache.org/jira/browse/GERONIMO-5870

On Wed, Mar 16, 2011 at 2:10 PM, Florent Guillaume <fg...@nuxeo.com> wrote:
> I'll try to write a patch when I find the time. In the meantime I've
> made my XAResource more robust to know how to reset its state when
> there's a problem.

Re: txmanager: shouldn't connection be removed from pool if it fails to enlist?

Posted by Florent Guillaume <fg...@nuxeo.com>.
I'll try to write a patch when I find the time. In the meantime I've
made my XAResource more robust, so that it knows how to reset its
state when there's a problem.

I've tried upgrading to 3.1, but there are API changes that I can't
immediately see how to work around.
Previously, when instantiating a GenericConnectionManager, I didn't
have to pass my ManagedConnectionFactory to it. So I could bind my
generic ConnectionManager in JNDI and later, when user setup code
needed a pooled connection factory, look it up in JNDI and then do
mcf.createConnectionFactory(cm).
Now it seems I have to bind in JNDI a ConnectionManager that's tied to
a given ManagedConnectionFactory. So I have to instantiate the
ManagedConnectionFactory much earlier and in a different part of the
setup. Is that the way to go? Not sure if I'm clear :)
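
In case the ordering issue is unclear, here is a rough sketch of the two setups. Everything in it is a hypothetical stand-in: a plain Map plays the role of JNDI, the two interfaces only mimic the shape of the JCA types, and GenericManager / TiedManager are invented names that model "manager built without an mcf" versus "manager tied to one mcf". It is not the actual 2.1.x or 3.1 API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins only; a Map replaces JNDI so the sketch runs anywhere.
class BindingOrderSketch {

    static final String NAME = "java:comp/ConnectionManager"; // made-up binding name

    interface ConnectionManager {
        String describe();
    }

    /** Mimics the shape of ManagedConnectionFactory.createConnectionFactory(cm). */
    interface ManagedConnectionFactory {
        Object createConnectionFactory(ConnectionManager cm);
    }

    /** Old style as described: built without any ManagedConnectionFactory. */
    static final class GenericManager implements ConnectionManager {
        public String describe() { return "generic manager"; }
    }

    /** New style as described: tied to one ManagedConnectionFactory at construction. */
    static final class TiedManager implements ConnectionManager {
        final ManagedConnectionFactory mcf;
        TiedManager(ManagedConnectionFactory mcf) { this.mcf = mcf; }
        public String describe() { return "manager tied to one factory"; }
    }

    /** Toy factory whose "connection factory" is just a descriptive string. */
    static ManagedConnectionFactory newMcf() {
        return cm -> "factory using " + cm.describe();
    }

    /** Old flow: bind a generic manager early, create the mcf only when needed. */
    static Object oldFlow(Map<String, Object> jndi) {
        jndi.put(NAME, new GenericManager());          // early, generic setup
        ManagedConnectionFactory mcf = newMcf();       // later, in user setup code
        ConnectionManager cm = (ConnectionManager) jndi.get(NAME);
        return mcf.createConnectionFactory(cm);
    }

    /** New flow: the mcf has to exist before the manager can be bound. */
    static Object newFlow(Map<String, Object> jndi) {
        ManagedConnectionFactory mcf = newMcf();       // now needed up front
        jndi.put(NAME, new TiedManager(mcf));
        TiedManager cm = (TiedManager) jndi.get(NAME);
        return cm.mcf.createConnectionFactory(cm);
    }

    public static void main(String[] args) {
        System.out.println(oldFlow(new HashMap<>()));
        System.out.println(newFlow(new HashMap<>()));
    }
}
```

The only point of the sketch is the order of the steps: in the second flow the mcf must be created before anything can be bound.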

Florent



On Fri, Mar 11, 2011 at 9:48 PM, David Jencks <da...@yahoo.com> wrote:
> Hi Florent,
>
> Thanks for finding this and figuring out what is going on!  A Jira would be great and a patch even better!
>
> BTW you might want to switch to a newer tm, as 2.2.x and 3.x have much better error handling and recovery when a resource is not available on startup or disappears midway through a commit. (The changes aren't going into 2.1.x since they involve an incompatible API change.)
>
> thanks
> david jencks

Re: txmanager: shouldn't connection be removed from pool if it fails to enlist?

Posted by David Jencks <da...@yahoo.com>.
Hi Florent,

Thanks for finding this and figuring out what is going on!  A Jira would be great and a patch even better!

BTW you might want to switch to a newer tm, as 2.2.x and 3.x have much better error handling and recovery when a resource is not available on startup or disappears midway through a commit. (The changes aren't going into 2.1.x since they involve an incompatible API change.)

thanks
david jencks
