You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@jena.apache.org by Simon Helsen <sh...@ca.ibm.com> on 2013/01/28 16:59:13 UTC

records not strictly increasing

Hi guys,

in one of our regular test runs, a multi-threaded test barfed once (but 
then not again and we have never seen it before even though we run these 
tests regularly). I am not sure if we accidentally bumped into a true tdb 
bug or whether we are doing something unsafe on our side. The exception 
occurred inside a read transaction while iterating over a ResultSet. You 
can assume that othe threads were also in read and/or write transactions.

I have no idea how to produce a test case to replicate this, so my 
starting question will be if anyone can give a broad explanation of the 
meaning of the exception. Our tests were running Jena 2.7.4.  I may open a 
Jira issue as well.

thanks

Simon

Caused by: com.hp.hpl.jena.tdb.base.StorageException: RecordRangeIterator: 
records not strictly increasing: 
00000000000000af000000000003e9f90000000000250aaa000000000024ea3d // 
00000000000000af000000000003e93700000000002792790000000000277240
        at 
com.hp.hpl.jena.tdb.base.recordbuffer.RecordRangeIterator.hasNext(RecordRangeIterator.java:124)
        at org.openjena.atlas.iterator.Iter$4.hasNext(Iter.java:295)
        at 
com.hp.hpl.jena.tdb.sys.DatasetControlMRSW$IteratorCheckNotConcurrent.hasNext(DatasetControlMRSW.java:119)
        at org.openjena.atlas.iterator.Iter$3.hasNext(Iter.java:181)
        at org.openjena.atlas.iterator.Iter$6.hasNext(Iter.java:386)
        at org.openjena.atlas.iterator.Iter$3.hasNext(Iter.java:181)
        at org.openjena.atlas.iterator.Iter$4.hasNext(Iter.java:295)
        at org.openjena.atlas.iterator.Iter$3.hasNext(Iter.java:181)
        at org.openjena.atlas.iterator.Iter.hasNext(Iter.java:825)
        at 
org.openjena.atlas.iterator.RepeatApplyIterator.hasNext(RepeatApplyIterator.java:58)
        at 
org.openjena.atlas.iterator.RepeatApplyIterator.hasNext(RepeatApplyIterator.java:46)
        at 
org.openjena.atlas.iterator.RepeatApplyIterator.hasNext(RepeatApplyIterator.java:46)
        at org.openjena.atlas.iterator.Iter$4.hasNext(Iter.java:295)
        at 
com.hp.hpl.jena.sparql.engine.iterator.QueryIterPlainWrapper.hasNextBinding(QueryIterPlainWrapper.java:54)
        at 
com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:112)
        at 
com.hp.hpl.jena.sparql.engine.iterator.QueryIterConvert.hasNextBinding(QueryIterConvert.java:59)
        at 
com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:112)
        at 
com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
        at 
com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:112)
        at 
com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
        at 
com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:112)
        at 
com.hp.hpl.jena.sparql.engine.ResultSetStream.hasNext(ResultSetStream.java:72)
        at 
com.ibm.team.jfs.rdf.internal.jena.InternalResultSet.retrieveAllBindings(InternalResultSet.java:47)
        at 
com.ibm.team.jfs.rdf.internal.jena.InternalResultSet.retrieveFirstPage(InternalResultSet.java:54)
        at 
com.ibm.team.jfs.rdf.internal.jena.tdb.JenaTxTdbProvider.renderSelect(JenaTxTdbProvider.java:1887)
        at 
com.ibm.team.jfs.rdf.internal.jena.tdb.JenaTxTdbProvider.performSelect(JenaTxTdbProvider.java:1611)
        at 
com.ibm.team.jfs.rdf.internal.jena.tdb.JenaTxTdbProvider$21.run(JenaTxTdbProvider.java:1757)
        at 
com.ibm.team.jfs.rdf.internal.jena.tdb.JenaTxTdbProvider$21.run(JenaTxTdbProvider.java:1)
        at 
com.ibm.team.jfs.rdf.internal.jena.tdb.JenaTxTdbProvider.storeOperation(JenaTxTdbProvider.java:2208)
        ... 55 more

Re: records not strictly increasing

Posted by Simon Helsen <sh...@ca.ibm.com>.

Andy,

I didn't open an issue yet, because I don't know how to reproduce the 
problem. Unfortunately, it seems there is no way of knowing when the 
corruption is introduced in this case. There was only one JVM running, so, 
let's talk about 

"a previous crash (with no journal restore later - new file type in 
0.9.X)"

I did not run the test myself, but it is possible that the JVM process was 
killed at some point. If so, I know that countless queries did run ok 
before the exception, but corruptions can be subtle of course. Under what 
circumstances would a journal restore fail and if so, would that not be 
logged or even stop TDB from starting? Would it be possible to have some 
sort of safety mechanism to indicate whether a journal restore was 
unsuccessful? 

Simon

From:
Andy Seaborne <an...@apache.org>
To:
dev@jena.apache.org, 
Date:
01/28/2013 11:44 AM
Subject:
Re: records not strictly increasing

On 28/01/13 15:59, Simon Helsen wrote:
> Hi guys,
>
> in one of our regular test runs, a multi-threaded test barfed once (but
> then not again and we have never seen it before even though we run these
> tests regularly). I am not sure if we accidentally bumped into a true 
tdb
> bug or whether we are doing something unsafe on our side. The exception
> occurred inside a read transaction while iterating over a ResultSet. You
> can assume that othe threads were also in read and/or write 
transactions.
>
> I have no idea how to produce a test case to replicate this, so my
> starting question will be if anyone can give a broad explanation of the
> meaning of the exception. Our tests were running Jena 2.7.4.  I may open 
a
> Jira issue as well.
>
> thanks
>
> Simon
>
> Caused by: com.hp.hpl.jena.tdb.base.StorageException: 
RecordRangeIterator:
> records not strictly increasing:
> 00000000000000af000000000003e9f90000000000250aaa000000000024ea3d //
> 00000000000000af000000000003e93700000000002792790000000000277240
>          at
> 
com.hp.hpl.jena.tdb.base.recordbuffer.RecordRangeIterator.hasNext(RecordRangeIterator.java:124)
>          at org.openjena.atlas.iterator.Iter$4.hasNext(Iter.java:295)

Simon,

If it is not reproducible then there isn't anything that can be done.

The exception is detecting a bad database, not at the point in time when 
the corruption happens.  I suggest that when it happens you preserve the 
bad database and see what else might be broken in it.

It is unlikely to be due to concurrency in the same JVM - that could not 
cause this is 0.8.X either - and leads to different errors.

Either a previous crash (with no journal restore later - new file type 
in 0.9.X), or access from two JVMs are the only two possibilities that 
occur to me

                 Andy

Re: records not strictly increasing

Posted by Andy Seaborne <an...@apache.org>.

On 28/01/13 15:59, Simon Helsen wrote:
> Hi guys,
>
> in one of our regular test runs, a multi-threaded test barfed once (but
> then not again and we have never seen it before even though we run these
> tests regularly). I am not sure if we accidentally bumped into a true tdb
> bug or whether we are doing something unsafe on our side. The exception
> occurred inside a read transaction while iterating over a ResultSet. You
> can assume that othe threads were also in read and/or write transactions.
>
> I have no idea how to produce a test case to replicate this, so my
> starting question will be if anyone can give a broad explanation of the
> meaning of the exception. Our tests were running Jena 2.7.4.  I may open a
> Jira issue as well.
>
> thanks
>
> Simon
>
> Caused by: com.hp.hpl.jena.tdb.base.StorageException: RecordRangeIterator:
> records not strictly increasing:
> 00000000000000af000000000003e9f90000000000250aaa000000000024ea3d //
> 00000000000000af000000000003e93700000000002792790000000000277240
>          at
> com.hp.hpl.jena.tdb.base.recordbuffer.RecordRangeIterator.hasNext(RecordRangeIterator.java:124)
>          at org.openjena.atlas.iterator.Iter$4.hasNext(Iter.java:295)

Simon,

If it is not reproducible then there isn't anything that can be done.

The exception is detecting a bad database, not at the point in time when 
the corruption happens.  I suggest that when it happens you preserve the 
bad database and see what else might be broken in it.

It is unlikely to be due to concurrency in the same JVM - that could not 
cause this is 0.8.X either - and leads to different errors.

Either a previous crash (with no journal restore later - new file type 
in 0.9.X), or access from two JVMs are the only two possibilities that 
occur to me

	Andy