Posted to dev@jena.apache.org by Simon Helsen <sh...@ca.ibm.com> on 2013/01/28 16:59:13 UTC
records not strictly increasing
Hi guys,
in one of our regular test runs, a multi-threaded test barfed once (but
then not again and we have never seen it before even though we run these
tests regularly). I am not sure if we accidentally bumped into a true tdb
bug or whether we are doing something unsafe on our side. The exception
occurred inside a read transaction while iterating over a ResultSet. You
can assume that other threads were also in read and/or write transactions.
I have no idea how to produce a test case to replicate this, so my
starting question will be if anyone can give a broad explanation of the
meaning of the exception. Our tests were running Jena 2.7.4. I may open a
Jira issue as well.
thanks
Simon
Caused by: com.hp.hpl.jena.tdb.base.StorageException: RecordRangeIterator:
records not strictly increasing:
00000000000000af000000000003e9f90000000000250aaa000000000024ea3d //
00000000000000af000000000003e93700000000002792790000000000277240
at com.hp.hpl.jena.tdb.base.recordbuffer.RecordRangeIterator.hasNext(RecordRangeIterator.java:124)
at org.openjena.atlas.iterator.Iter$4.hasNext(Iter.java:295)
at com.hp.hpl.jena.tdb.sys.DatasetControlMRSW$IteratorCheckNotConcurrent.hasNext(DatasetControlMRSW.java:119)
at org.openjena.atlas.iterator.Iter$3.hasNext(Iter.java:181)
at org.openjena.atlas.iterator.Iter$6.hasNext(Iter.java:386)
at org.openjena.atlas.iterator.Iter$3.hasNext(Iter.java:181)
at org.openjena.atlas.iterator.Iter$4.hasNext(Iter.java:295)
at org.openjena.atlas.iterator.Iter$3.hasNext(Iter.java:181)
at org.openjena.atlas.iterator.Iter.hasNext(Iter.java:825)
at org.openjena.atlas.iterator.RepeatApplyIterator.hasNext(RepeatApplyIterator.java:58)
at org.openjena.atlas.iterator.RepeatApplyIterator.hasNext(RepeatApplyIterator.java:46)
at org.openjena.atlas.iterator.RepeatApplyIterator.hasNext(RepeatApplyIterator.java:46)
at org.openjena.atlas.iterator.Iter$4.hasNext(Iter.java:295)
at com.hp.hpl.jena.sparql.engine.iterator.QueryIterPlainWrapper.hasNextBinding(QueryIterPlainWrapper.java:54)
at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:112)
at com.hp.hpl.jena.sparql.engine.iterator.QueryIterConvert.hasNextBinding(QueryIterConvert.java:59)
at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:112)
at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:112)
at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:112)
at com.hp.hpl.jena.sparql.engine.ResultSetStream.hasNext(ResultSetStream.java:72)
at com.ibm.team.jfs.rdf.internal.jena.InternalResultSet.retrieveAllBindings(InternalResultSet.java:47)
at com.ibm.team.jfs.rdf.internal.jena.InternalResultSet.retrieveFirstPage(InternalResultSet.java:54)
at com.ibm.team.jfs.rdf.internal.jena.tdb.JenaTxTdbProvider.renderSelect(JenaTxTdbProvider.java:1887)
at com.ibm.team.jfs.rdf.internal.jena.tdb.JenaTxTdbProvider.performSelect(JenaTxTdbProvider.java:1611)
at com.ibm.team.jfs.rdf.internal.jena.tdb.JenaTxTdbProvider$21.run(JenaTxTdbProvider.java:1757)
at com.ibm.team.jfs.rdf.internal.jena.tdb.JenaTxTdbProvider$21.run(JenaTxTdbProvider.java:1)
at com.ibm.team.jfs.rdf.internal.jena.tdb.JenaTxTdbProvider.storeOperation(JenaTxTdbProvider.java:2208)
... 55 more
Re: records not strictly increasing
Posted by Simon Helsen <sh...@ca.ibm.com>.
Andy,
I didn't open an issue yet, because I don't know how to reproduce the
problem. Unfortunately, it seems there is no way of knowing when the
corruption is introduced in this case. There was only one JVM running, so,
let's talk about
"a previous crash (with no journal restore later - new file type in
0.9.X)"
I did not run the test myself, but it is possible that the JVM process was
killed at some point. If so, I know that countless queries did run ok
before the exception, but corruptions can be subtle of course. Under what
circumstances would a journal restore fail and if so, would that not be
logged or even stop TDB from starting? Would it be possible to have some
sort of safety mechanism to indicate whether a journal restore was
unsuccessful?
Simon
From: Andy Seaborne <an...@apache.org>
To: dev@jena.apache.org
Date: 01/28/2013 11:44 AM
Subject: Re: records not strictly increasing
On 28/01/13 15:59, Simon Helsen wrote:
> Hi guys,
>
> in one of our regular test runs, a multi-threaded test barfed once (but
> then not again and we have never seen it before even though we run these
> tests regularly). I am not sure if we accidentally bumped into a true tdb
> bug or whether we are doing something unsafe on our side. The exception
> occurred inside a read transaction while iterating over a ResultSet. You
> can assume that other threads were also in read and/or write transactions.
>
> I have no idea how to produce a test case to replicate this, so my
> starting question will be if anyone can give a broad explanation of the
> meaning of the exception. Our tests were running Jena 2.7.4. I may open a
> Jira issue as well.
>
> thanks
>
> Simon
>
> Caused by: com.hp.hpl.jena.tdb.base.StorageException: RecordRangeIterator:
> records not strictly increasing:
> 00000000000000af000000000003e9f90000000000250aaa000000000024ea3d //
> 00000000000000af000000000003e93700000000002792790000000000277240
> at com.hp.hpl.jena.tdb.base.recordbuffer.RecordRangeIterator.hasNext(RecordRangeIterator.java:124)
> at org.openjena.atlas.iterator.Iter$4.hasNext(Iter.java:295)
Simon,
If it is not reproducible then there isn't anything that can be done.
The exception detects a database that is already bad; it is not raised at
the point in time when the corruption happens. I suggest that when it
happens you preserve the bad database and see what else might be broken in it.
It is unlikely to be due to concurrency in the same JVM - that could not
cause this in 0.8.X either - and leads to different errors.
Either a previous crash (with no journal restore later - new file type
in 0.9.X), or access from two JVMs are the only two possibilities that
occur to me.
Andy
Re: records not strictly increasing
Posted by Andy Seaborne <an...@apache.org>.
On 28/01/13 15:59, Simon Helsen wrote:
> Hi guys,
>
> in one of our regular test runs, a multi-threaded test barfed once (but
> then not again and we have never seen it before even though we run these
> tests regularly). I am not sure if we accidentally bumped into a true tdb
> bug or whether we are doing something unsafe on our side. The exception
> occurred inside a read transaction while iterating over a ResultSet. You
> can assume that other threads were also in read and/or write transactions.
>
> I have no idea how to produce a test case to replicate this, so my
> starting question will be if anyone can give a broad explanation of the
> meaning of the exception. Our tests were running Jena 2.7.4. I may open a
> Jira issue as well.
>
> thanks
>
> Simon
>
> Caused by: com.hp.hpl.jena.tdb.base.StorageException: RecordRangeIterator:
> records not strictly increasing:
> 00000000000000af000000000003e9f90000000000250aaa000000000024ea3d //
> 00000000000000af000000000003e93700000000002792790000000000277240
> at com.hp.hpl.jena.tdb.base.recordbuffer.RecordRangeIterator.hasNext(RecordRangeIterator.java:124)
> at org.openjena.atlas.iterator.Iter$4.hasNext(Iter.java:295)
Simon,
If it is not reproducible then there isn't anything that can be done.
The exception detects a database that is already bad; it is not raised at
the point in time when the corruption happens. I suggest that when it
happens you preserve the bad database and see what else might be broken in it.
It is unlikely to be due to concurrency in the same JVM - that could not
cause this in 0.8.X either - and leads to different errors.
Either a previous crash (with no journal restore later - new file type
in 0.9.X), or access from two JVMs are the only two possibilities that
occur to me.
Andy
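
For anyone reading the thread later, the condition named in the message can be illustrated with a minimal sketch. It is not TDB's RecordRangeIterator code: it assumes records are compared byte by byte as unsigned values, and the class and method names (RecordOrderCheck, fromHex, compareUnsigned, firstOutOfOrder) are hypothetical. The two hex keys are the ones printed in the exception above.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: a stand-alone ordering check, not part of Jena or TDB.
final class RecordOrderCheck {

    // Decode a hex string (two characters per byte) into a byte array.
    static byte[] fromHex(String hex) {
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    // Lexicographic comparison that treats every byte as unsigned (assumed ordering).
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xFF;
            int y = b[i] & 0xFF;
            if (x != y) {
                return Integer.compare(x, y);
            }
        }
        return Integer.compare(a.length, b.length);
    }

    // Index of the first record that is not strictly greater than its
    // predecessor, or -1 if the whole sequence is strictly increasing.
    static int firstOutOfOrder(List<byte[]> records) {
        for (int i = 1; i < records.size(); i++) {
            if (compareUnsigned(records.get(i - 1), records.get(i)) >= 0) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        List<byte[]> records = new ArrayList<>();
        // The two keys from the exception message, in the order reported.
        records.add(fromHex("00000000000000af000000000003e9f90000000000250aaa000000000024ea3d"));
        records.add(fromHex("00000000000000af000000000003e93700000000002792790000000000277240"));
        System.out.println("first out-of-order index: " + firstOutOfOrder(records));
    }
}
```

Under that assumption the two keys first differ at the byte f9 versus 37, so the second key sorts before the first. A scan over an index that is expected to be sorted would see that as a violation, which fits the point that the exception reports damage already present in the database and does not mark the moment it was introduced.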