You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@jackrabbit.apache.org by Olivier Dony <ol...@denali.be> on 2007/01/17 17:29:46 UTC

(JCR-645) introduces NPE -- Jackrabbit Release plan for 1.2.

Hi,

Apache's Jira seems to be down currently, but I wanted to provide  
some feedback about issue JCR-645 (DatabasePersistenceManager &  
DatabaseFileSystem: try to gracefully recover from connection loss).
We tested this fix inside 1.2RC2 with a MySQL database server.
The reconnection/retry mechanism in DatabasePersistenceManager seems  
to behave fine when the connection times out or is killed for some  
reason, but the DB server is in fact still running.

However there is a problem if the connection cannot be re-established  
directly, for example if a transient network outage lasts longer than  
the few reconnection attempts.
Inside DatabasePersistenceManager.reestablishConnection(),  
initConnection() will fail, and the preparedStatements map will stay  
empty.

This in turn will trigger a nasty NullPointerException (never caught)  
next time executeStmt() is called, because the map is still empty,  
and there is no check for that.
I guess a dumb fix would be to maintain some state in initConnection 
(), check it in executeStmt(), and maybe attempt to  
reestablishConnection() if it's down, just in case the database  
server has become available again. But I've haven't tested this yet,  
and may be missing something.

Would it be possible to still include a fix for this in 1.2?

Thanks a lot!


Olivier Dony


PS: I suppose I can provide a patch for this if it helps, but Stefan  
has a better view of this stuff.

Re: (JCR-645) introduces NPE -- Jackrabbit Release plan for 1.2.

Posted by Stefan Guggisberg <st...@gmail.com>.
On 1/19/07, David Johnson <db...@gmail.com> wrote:
> Did this make it into the 1.2.1 release?  Or since it was committed to the
> trunk will it have to wait for the 1.3 version?

no and no.

as jukka said it'll be part of a next patch release which will follow the first
(i.e. main) release 1.2.1 probably soon after.

cheers
stefan

>
> -Dave
>
> On 1/18/07, Stefan Guggisberg <st...@gmail.com> wrote:
> >
> > On 1/18/07, Olivier Dony <ol...@denali.be> wrote:
> > > On Jan 17, 2007, at 6:19 PM, Stefan Guggisberg wrote:
> > > > doh! probably the simplest fix would be to remove line 783 in
> > > > DatabasePersistenceManager.java and line 1010 in
> > > > DatabaseFileSystem.java,
> > > > i.e. the following stmt:
> > > >
> > > >        preparedStatements.clear();
> > > >
> > > > if you have a chance to successfully test this i'll commit the
> > > > change asap.
> > >
> > > Ah you're right, there's no real need to clear the map anyway, as the
> > > statements will all get replaced in initPreparedStatements() after a
> > > successful reconnection.
> > >
> > > Alright, so we've freshly rebuilt tag 1.2-rc2 with the lines you
> > > mention commented out.
> > > After testing a little bit, this fix seems to work fine for us, and
> > > it can now reconnect correctly even if the database server is down
> > > for a while!
> >
> > great, i committed the fix in trunk (r497392).
> >
> > cheers
> > stefan
> >
> > > I say ship it! ;-)
> > >
> > >
> > > >> Would it be possible to still include a fix for this in 1.2?
> > > >
> > > > i guess jukka will have to decide this.
> > >
> > > I do hope it gets included, as the change is really minor and makes
> > > the solution a lot more robust.
> > > (Apparently my boss would sleep better knowing we're using an
> > > official jackrabbit release instead of a modified version... go
> > > figure...)
> > >
> > > Thank you!
> > >
> > >
> > > Olivier
> > >
> >
>
>

Re: (JCR-645) introduces NPE -- Jackrabbit Release plan for 1.2.

Posted by David Johnson <db...@gmail.com>.
Thanks,

Looks good.

-Dave

On 1/19/07, Jukka Zitting <ju...@gmail.com> wrote:
>
> Hi,
>
> On 1/19/07, Jukka Zitting <ju...@gmail.com> wrote:
> > I'm planning to make a 1.2.2 patch release already in a few weeks with
> > this and a couple other lower priority fixes. See
> >
> https://issues.apache.org/jira/secure/IssueNavigator.jspa?pid=10591&fixfor=12312228
> > for the planned contents of the patch release.
>
> Hmm, the link apparently requires some session state to work. You can
> find the list of issues targeted for 1.2.2 by selecting the "Road Map"
> tab on https://issues.apache.org/jira/browse/JCR. It contains the
> lists of scheduled issues for the next planned releases.
>
> BR,
>
> Jukka Zitting
>

Re: (JCR-645) introduces NPE -- Jackrabbit Release plan for 1.2.

Posted by Jukka Zitting <ju...@gmail.com>.
Hi,

On 1/19/07, Jukka Zitting <ju...@gmail.com> wrote:
> I'm planning to make a 1.2.2 patch release already in a few weeks with
> this and a couple other lower priority fixes. See
> https://issues.apache.org/jira/secure/IssueNavigator.jspa?pid=10591&fixfor=12312228
> for the planned contents of the patch release.

Hmm, the link apparently requires some session state to work. You can
find the list of issues targeted for 1.2.2 by selecting the "Road Map"
tab on https://issues.apache.org/jira/browse/JCR. It contains the
lists of scheduled issues for the next planned releases.

BR,

Jukka Zitting

Re: (JCR-645) introduces NPE -- Jackrabbit Release plan for 1.2.

Posted by Jukka Zitting <ju...@gmail.com>.
Hi,

On 1/19/07, David Johnson <db...@gmail.com> wrote:
> Did this make it into the 1.2.1 release?  Or since it was committed to the
> trunk will it have to wait for the 1.3 version?

No. We had to skip the 1.2 release and relabel it as 1.2.1 as there
was a last minute blocker issue (JCR-707) that prevented the
originally planned release. I'm sorry for the confusion with the
version numbering this might cause.

I'm planning to make a 1.2.2 patch release already in a few weeks with
this and a couple other lower priority fixes. See
https://issues.apache.org/jira/secure/IssueNavigator.jspa?pid=10591&fixfor=12312228
for the planned contents of the patch release.

BR,

Jukka Zitting

Re: (JCR-645) introduces NPE -- Jackrabbit Release plan for 1.2.

Posted by David Johnson <db...@gmail.com>.
Did this make it into the 1.2.1 release?  Or since it was committed to the
trunk will it have to wait for the 1.3 version?

-Dave

On 1/18/07, Stefan Guggisberg <st...@gmail.com> wrote:
>
> On 1/18/07, Olivier Dony <ol...@denali.be> wrote:
> > On Jan 17, 2007, at 6:19 PM, Stefan Guggisberg wrote:
> > > doh! probably the simplest fix would be to remove line 783 in
> > > DatabasePersistenceManager.java and line 1010 in
> > > DatabaseFileSystem.java,
> > > i.e. the following stmt:
> > >
> > >        preparedStatements.clear();
> > >
> > > if you have a chance to successfully test this i'll commit the
> > > change asap.
> >
> > Ah you're right, there's no real need to clear the map anyway, as the
> > statements will all get replaced in initPreparedStatements() after a
> > successful reconnection.
> >
> > Alright, so we've freshly rebuilt tag 1.2-rc2 with the lines you
> > mention commented out.
> > After testing a little bit, this fix seems to work fine for us, and
> > it can now reconnect correctly even if the database server is down
> > for a while!
>
> great, i committed the fix in trunk (r497392).
>
> cheers
> stefan
>
> > I say ship it! ;-)
> >
> >
> > >> Would it be possible to still include a fix for this in 1.2?
> > >
> > > i guess jukka will have to decide this.
> >
> > I do hope it gets included, as the change is really minor and makes
> > the solution a lot more robust.
> > (Apparently my boss would sleep better knowing we're using an
> > official jackrabbit release instead of a modified version... go
> > figure...)
> >
> > Thank you!
> >
> >
> > Olivier
> >
>

Re: (JCR-645) introduces NPE -- Jackrabbit Release plan for 1.2.

Posted by Stefan Guggisberg <st...@gmail.com>.
On 1/18/07, Olivier Dony <ol...@denali.be> wrote:
> On Jan 17, 2007, at 6:19 PM, Stefan Guggisberg wrote:
> > doh! probably the simplest fix would be to remove line 783 in
> > DatabasePersistenceManager.java and line 1010 in
> > DatabaseFileSystem.java,
> > i.e. the following stmt:
> >
> >        preparedStatements.clear();
> >
> > if you have a chance to successfully test this i'll commit the
> > change asap.
>
> Ah you're right, there's no real need to clear the map anyway, as the
> statements will all get replaced in initPreparedStatements() after a
> successful reconnection.
>
> Alright, so we've freshly rebuilt tag 1.2-rc2 with the lines you
> mention commented out.
> After testing a little bit, this fix seems to work fine for us, and
> it can now reconnect correctly even if the database server is down
> for a while!

great, i committed the fix in trunk (r497392).

cheers
stefan

> I say ship it! ;-)
>
>
> >> Would it be possible to still include a fix for this in 1.2?
> >
> > i guess jukka will have to decide this.
>
> I do hope it gets included, as the change is really minor and makes
> the solution a lot more robust.
> (Apparently my boss would sleep better knowing we're using an
> official jackrabbit release instead of a modified version... go
> figure...)
>
> Thank you!
>
>
> Olivier
>

Re: (JCR-645) introduces NPE -- Jackrabbit Release plan for 1.2.

Posted by Olivier Dony <ol...@denali.be>.
On Jan 18, 2007, at 10:49 AM, Jukka Zitting wrote:
> Hi,
>
> On 1/18/07, Olivier Dony <ol...@denali.be> wrote:
>> I do hope it gets included, as the change is really minor and makes
>> the solution a lot more robust.
>
> I'll unfortunately not include this in the first 1.2 release. The
> release process has been complex enough without any more last minute
> changes, and this problem seems to affect only a specific deployment
> scenario.

Hmm, it's true that such database outages are not very common indeed,  
they just happen to be part of our test suite.


> Can you please still file a separate bug report about this so I can
> better track this for a patch release that I hope to make soon after
> the main 1.2 release is out.

Great! I've created JCR-710.

Thanks,


Olivier



Re: (JCR-645) introduces NPE -- Jackrabbit Release plan for 1.2.

Posted by Jukka Zitting <ju...@gmail.com>.
Hi,

On 1/18/07, Olivier Dony <ol...@denali.be> wrote:
> On Jan 17, 2007, at 6:19 PM, Stefan Guggisberg wrote:
> >> Would it be possible to still include a fix for this in 1.2?
> >
> > i guess jukka will have to decide this.
>
> I do hope it gets included, as the change is really minor and makes
> the solution a lot more robust.

I'll unfortunately not include this in the first 1.2 release. The
release process has been complex enough without any more last minute
changes, and this problem seems to affect only a specific deployment
scenario.

Can you please still file a separate bug report about this so I can
better track this for a patch release that I hope to make soon after
the main 1.2 release is out.

BR,

Jukka Zitting

Re: (JCR-645) introduces NPE -- Jackrabbit Release plan for 1.2.

Posted by Olivier Dony <ol...@denali.be>.
On Jan 17, 2007, at 6:19 PM, Stefan Guggisberg wrote:
> doh! probably the simplest fix would be to remove line 783 in
> DatabasePersistenceManager.java and line 1010 in
> DatabaseFileSystem.java,
> i.e. the following stmt:
>
>        preparedStatements.clear();
>
> if you have a chance to successfully test this i'll commit the  
> change asap.

Ah you're right, there's no real need to clear the map anyway, as the  
statements will all get replaced in initPreparedStatements() after a  
successful reconnection.

Alright, so we've freshly rebuilt tag 1.2-rc2 with the lines you  
mention commented out.
After testing a little bit, this fix seems to work fine for us, and  
it can now reconnect correctly even if the database server is down  
for a while!
I say ship it! ;-)


>> Would it be possible to still include a fix for this in 1.2?
>
> i guess jukka will have to decide this.

I do hope it gets included, as the change is really minor and makes  
the solution a lot more robust.
(Apparently my boss would sleep better knowing we're using an  
official jackrabbit release instead of a modified version... go  
figure...)

Thank you!


Olivier

Re: (JCR-645) introduces NPE -- Jackrabbit Release plan for 1.2.

Posted by Stefan Guggisberg <st...@gmail.com>.
hi olivier,

On 1/17/07, Olivier Dony <ol...@denali.be> wrote:
> Hi,
>
> Apache's Jira seems to be down currently, but I wanted to provide
> some feedback about issue JCR-645 (DatabasePersistenceManager &
> DatabaseFileSystem: try to gracefully recover from connection loss).
> We tested this fix inside 1.2RC2 with a MySQL database server.
> The reconnection/retry mechanism in DatabasePersistenceManager seems
> to behave fine when the connection times out or is killed for some
> reason, but the DB server is in fact still running.
>
> However there is a problem if the connection cannot be re-established
> directly, for example if a transient network outage lasts longer than
> the few reconnection attempts.
> Inside DatabasePersistenceManager.reestablishConnection(),
> initConnection() will fail, and the preparedStatements map will stay
> empty.
>
> This in turn will trigger a nasty NullPointerException (never caught)
> next time executeStmt() is called, because the map is still empty,
> and there is no check for that.

doh! probably the simplest fix would be to remove line 783 in
DatabasePersistenceManager.java and line 1010 in
DatabaseFileSystem.java,
i.e. the following stmt:

        preparedStatements.clear();

if you have a chance to successfully test this i'll commit the change asap.

> I guess a dumb fix would be to maintain some state in initConnection
> (), check it in executeStmt(), and maybe attempt to
> reestablishConnection() if it's down, just in case the database
> server has become available again. But I've haven't tested this yet,
> and may be missing something.
>
> Would it be possible to still include a fix for this in 1.2?

i guess jukka will have to decide this.

thanks for reporting this issue and for the excellent problem analysis.

cheers
stefan

>
> Thanks a lot!
>
>
> Olivier Dony
>
>
> PS: I suppose I can provide a patch for this if it helps, but Stefan
> has a better view of this stuff.
>