You are viewing a plain text version of this content. The canonical link for it is here.
Posted to derby-dev@db.apache.org by Rick Hillegas <Ri...@Sun.COM> on 2005/11/10 01:58:21 UTC
RowLocation lifetime
Hello Store experts,
How long is a RowLocation in a Heap good for? Provided that the row is
not deleted, can you count on its RowLocation reliably identifying the
row for the duration of a Statement, a Transaction, a Connection? Forever?
Thanks,
-Rick
Re: RowLocation lifetime
Posted by Andreas Korneliussen <An...@Sun.COM>.
Mike Matrigali wrote:
> null pointer is a bug, please report as a separate JIRA,not sure
> what is going on. Note that while it is convenient for testing
> purposes to use the SYSCS_UTIL.SYSCS_INPLACE_COMPRESS_TABLE
> to cause the purging, purging of rows can happen at any time
> after a delete has been committed. The timing depends on a
> number of factors and the timing may change in the future - so
> one should assume that as soon as a row is deleted and committed
> it may be purged.
>
> Andreas Korneliussen wrote:
>
>> Mike Matrigali wrote:
>>
>>> From the store point of view, 3 things can happen to
>>> RowLocations in heap tables:
>>> 1) It can be deleted (requires at least TABLE IX, ROW X locking)
>>> o internal to store it is just marked deleted
>>> o external to store requests for any operation on
>>> this row will fail. Note that your cursor can
>>> experience this no matter what locking you do, as
>>> it is always possible for another statement in your
>>> transaction to do the delete.
>>> 2) It can be purged (requires at least TABLE IX, ROW X locking)
>>> o all physical evidence of the row is removed from table,
>>> both internal and external operations on this row will
>>> fail. Only committed deleted rows are purged.
>>> Note this will never happen if you have some
>>> sort of lock on the row as the requested X lock is
>>> always requested in a system only transaction.
>>> o the actual RowLocation will not be reused while
>>> at least some sort of table level intent lock is held.
>>> 3) It can be reused (requires a table level X lock)
>>> o basically as part of a compress all rows can be shuffled
>>> in any way. A former RowLocation can now point to
>>> a completely different row.
>>>
>>> So as you point out, your implementation can have serious problems
>>> with cursors held over commit. This is why in current usage of
>>> cursors over commit the only safe thing to do is to ask for the
>>> next row location and use it.
>>>
>>> Please make sure to consider the delete/purge cases also. One case
>>> that often causes problems is a transaction deleting a row that is
>>> locked by it's own cursor from another statement in the same connection.
>>>
>> Yes, we need to consider those cases.
>>
>> It seems that the store is capable of graciously handle that the row
>> get deleted (i.e by its own transaction). If the transaction later
>> tries to update the deleted row using the resultset, the store call
>> will return false indicating that the row was not updated. The deleted
>> row will not be purged as long as the transaction is open.
>>
>> However in read-committed/read-uncommitted mode, a row read by the
>> cursor, can be deleted by another transaction, and then purged.
>> It seems that the store does not handle an update of a deleted+purged
>> record.
>>
>> On our prototype impl., I get a get a NullPointerException from the
>> store in this case. It comes in
>> GenericConglomerateController.replace(..)).
>>
>> I would think there are multiple ways of adressing this issue:
>>
>> 1 We could make the store graciously handle the situation if the
>> RowLocation points to a deleted+purged row, by returning false if the
>> RowLocation is invalid, (and from the caller we can give an exception)
>
> It seems like the ConglomerateController.replace() function should throw
> an exception (other than null pointer) if it is called with a
> non-existent RowLocation, but I could be convinced returning false
> is ok. The problem I have is that store really has no way to tell the
> difference between a BAD RowLocation input and one which was purged.
>
Yes, maybe the store should throw an exception if the RowLocation is
invalid. Seen from the user perspective, I think the update should fail
the same way whether the row has been deleted+purged or only deleted
Currently the function returns false if the row has been deleted.
I ran a check on all places in the code tree that the
ConglomerateCongtroller.replace(..) method is called. I found that the
return value is silently ignored all places, except for in the store
unit-tests.
This means that an updateRow() on a deleted record, would simply return
silently (using the optimistic concurrency approach). After the
transaction has committed, the row would remain deleted.
I will file a JIRA for the NullPointerException case.
--Andreas
Re: RowLocation lifetime
Posted by Mike Matrigali <mi...@sbcglobal.net>.
null pointer is a bug, please report as a separate JIRA,not sure
what is going on. Note that while it is convenient for testing
purposes to use the SYSCS_UTIL.SYSCS_INPLACE_COMPRESS_TABLE
to cause the purging, purging of rows can happen at any time
after a delete has been committed. The timing depends on a
number of factors and the timing may change in the future - so
one should assume that as soon as a row is deleted and committed
it may be purged.
Andreas Korneliussen wrote:
> Mike Matrigali wrote:
>
>> From the store point of view, 3 things can happen to
>> RowLocations in heap tables:
>> 1) It can be deleted (requires at least TABLE IX, ROW X locking)
>> o internal to store it is just marked deleted
>> o external to store requests for any operation on
>> this row will fail. Note that your cursor can
>> experience this no matter what locking you do, as
>> it is always possible for another statement in your
>> transaction to do the delete.
>> 2) It can be purged (requires at least TABLE IX, ROW X locking)
>> o all physical evidence of the row is removed from table,
>> both internal and external operations on this row will
>> fail. Only committed deleted rows are purged.
>> Note this will never happen if you have some
>> sort of lock on the row as the requested X lock is
>> always requested in a system only transaction.
>> o the actual RowLocation will not be reused while
>> at least some sort of table level intent lock is held.
>> 3) It can be reused (requires a table level X lock)
>> o basically as part of a compress all rows can be shuffled
>> in any way. A former RowLocation can now point to
>> a completely different row.
>>
>> So as you point out, your implementation can have serious problems
>> with cursors held over commit. This is why in current usage of
>> cursors over commit the only safe thing to do is to ask for the
>> next row location and use it.
>>
>> Please make sure to consider the delete/purge cases also. One case
>> that often causes problems is a transaction deleting a row that is
>> locked by it's own cursor from another statement in the same connection.
>>
> Yes, we need to consider those cases.
>
> It seems that the store is capable of graciously handle that the row get
> deleted (i.e by its own transaction). If the transaction later tries to
> update the deleted row using the resultset, the store call will return
> false indicating that the row was not updated. The deleted row will not
> be purged as long as the transaction is open.
>
> However in read-committed/read-uncommitted mode, a row read by the
> cursor, can be deleted by another transaction, and then purged.
> It seems that the store does not handle an update of a deleted+purged
> record.
>
> On our prototype impl., I get a get a NullPointerException from the
> store in this case. It comes in
> GenericConglomerateController.replace(..)).
>
> I would think there are multiple ways of adressing this issue:
>
> 1 We could make the store graciously handle the situation if the
> RowLocation points to a deleted+purged row, by returning false if the
> RowLocation is invalid, (and from the caller we can give an exception)
It seems like the ConglomerateController.replace() function should throw
an exception (other than null pointer) if it is called with a
non-existent RowLocation, but I could be convinced returning false
is ok. The problem I have is that store really has no way to tell the
difference between a BAD RowLocation input and one which was purged.
>
> 2 Or we could make all scrollable updatable resultsets set read-locks or
> updatelocks on every row, for all isolation levels (including
> read-uncommitted)
this would work and I guess the choice depends on optimistic vs.
pessimistic concurrency. If you want to guarantee not errors thrown
at update time then you need locks - though lock timeouts and lock
deadlocks are always a possibility. If you are allowed to throw errors
if you find the row does not exist at update time, I am sure the
store interfaces could be used to either return false, return an
exception that can be checked, or allow for an interface to check
and get a lock on an entity about to be updated.
>
> 3 Or we could make purging require a table level X lock, instead of row
> locks
This is just not an option, in active applications no space reclamation
would ever happen. Very early on this was how it worked and it just did
not work well with most applications. Given user input, the current row
level space reclamation was implemented.
>
> Below is output from the test:
>
> T1: Read next Tuple:(0,0,17)
> T1: Read next Tuple:(1,1,19)
> T1: Read next Tuple:(2,2,21)
> T1: Read next Tuple:(3,3,23)
> T1: Read next Tuple:(4,4,25)
> T1: Read next Tuple:(5,5,27)
> T1: Read next Tuple:(6,6,29)
> T1: Read next Tuple:(7,7,31)
> T1: Read next Tuple:(8,8,33)
> T1: Read next Tuple:(9,9,35)
> T2: Deleted Tuple:(0,0,17)
> T2: commit
> T3: call SYSCS_UTIL.SYSCS_INPLACE_COMPRESS_TABLE
> T3: purged deleted records
> T3: commit
> T1: Read first Tuple:(0,0,17)
> T1: updateInt(2, 3);
> T1: updateRow()
> java.lang.NullPointerException
> at
> org.apache.derby.impl.store.access.conglomerate.GenericConglomerateController.replace(GenericConglomerateController.java:465)
>
> at
> org.apache.derby.impl.sql.execute.RowChangerImpl.updateRow(RowChangerImpl.java:516)
>
> at
> org.apache.derby.impl.sql.execute.UpdateResultSet.collectAffectedRows(UpdateResultSet.java:577)
>
> at
> org.apache.derby.impl.sql.execute.UpdateResultSet.open(UpdateResultSet.java:276)
>
> at
> org.apache.derby.impl.sql.GenericPreparedStatement.execute(GenericPreparedStatement.java:368)
>
> at
> org.apache.derby.impl.jdbc.EmbedResultSet.updateRow(EmbedResultSet.java:3256)
>
> at
> resultsettests.ConcurrencyTest.testConcurrency7(ConcurrencyTest.java:306)
>
> -- Andreas
>
>
Re: RowLocation lifetime
Posted by Andreas Korneliussen <An...@Sun.COM>.
Mike Matrigali wrote:
> i get confused when speaking about isolation level, update/read only
> result sets, and underlying sql query of the result set. I don't
> know if one scrollable result sets are dependent on some sort of
> isolation level.
>
> With respect to straight embedded server execution of SQL, it is fine to
> run with
> read-uncommitted level - but any actual update done on a row will get
> an X lock held to end transaction. At least from this level an SQL
> operation is never failed dependent on the isolation level.
>
> I don't remember if U locks are requested in read uncommitted mode,
> but definitely X locks are requested when the actual update is done.
>
> Note that all discussions of locking should specify under which
> isolation level the system is running. I assumed read commited for
> the below discussion as it is the default.
>
The discussion was intended for read-committed and read-uncommitted,
since for other isolation levels, the rows for which we use the
RowLocation would be locked, and cannot be deleted or purged by another
transaction. Also, if we delete the row in our own transaction, it will
not be purged, since it is locked with an exclusive lock.
I think you are allowed to do updates in read-uncommitted, however when
you read, you do not set read-locks, so you can read uncommitted data.
Andreas
> Daniel John Debrunner wrote:
>
>>> Andreas Korneliussen wrote:
>>
>>
>>
>>>> 2 Or we could make all scrollable updatable resultsets set read-locks
>>>> or updatelocks on every row, for all isolation levels (including
>>>> read-uncommitted)
>>
>>
>>
>> I think updates are not allowed in read-uncommitted mode, so we should
>> not be getting locks in read-uncommitted.
>>
>> Dan.
>>
>>
>>
>>
>>
>
Re: RowLocation lifetime
Posted by Mike Matrigali <mi...@sbcglobal.net>.
i get confused when speaking about isolation level, update/read only
result sets, and underlying sql query of the result set. I don't
know if one scrollable result sets are dependent on some sort of
isolation level.
With respect to straight embedded server execution of SQL, it is fine to
run with
read-uncommitted level - but any actual update done on a row will get
an X lock held to end transaction. At least from this level an SQL
operation is never failed dependent on the isolation level.
I don't remember if U locks are requested in read uncommitted mode,
but definitely X locks are requested when the actual update is done.
Note that all discussions of locking should specify under which
isolation level the system is running. I assumed read commited for
the below discussion as it is the default.
Daniel John Debrunner wrote:
>>Andreas Korneliussen wrote:
>
>
>>>2 Or we could make all scrollable updatable resultsets set read-locks
>>>or updatelocks on every row, for all isolation levels (including
>>>read-uncommitted)
>
>
> I think updates are not allowed in read-uncommitted mode, so we should
> not be getting locks in read-uncommitted.
>
> Dan.
>
>
>
>
>
Re: RowLocation lifetime
Posted by Daniel John Debrunner <dj...@debrunners.com>.
> Andreas Korneliussen wrote:
>> 2 Or we could make all scrollable updatable resultsets set read-locks
>> or updatelocks on every row, for all isolation levels (including
>> read-uncommitted)
I think updates are not allowed in read-uncommitted mode, so we should
not be getting locks in read-uncommitted.
Dan.
Re: RowLocation lifetime
Posted by Andreas Korneliussen <An...@Sun.COM>.
Suresh Thalamati wrote:
> Andreas Korneliussen wrote:
>
>> Mike Matrigali wrote:
>>
>
> <snip>
>
>
>> I would think there are multiple ways of adressing this issue:
>>
>> 1 We could make the store graciously handle the situation if the
>> RowLocation points to a deleted+purged row, by returning false if the
>> RowLocation is invalid, (and from the caller we can give an exception)
>
>
>
> This may not be good option, because purged row location can
> potentially be used by an another insert from a different
> transaction.
>
Maybe I misunderstood, however I assumed the RowLocation would not be
reused as long as there is a table intent lock on the table. Therefore
the insert from a different transaction would need to use another
RowLocation.
Mike Matrigali wrote:
> 2) It can be purged (requires at least TABLE IX, ROW X locking)
> o all physical evidence of the row is removed from table,
> both internal and external operations on this row will
> fail. Only committed deleted rows are purged.
> Note this will never happen if you have some
> sort of lock on the row as the requested X lock is
> always requested in a system only transaction.
> o the actual RowLocation will not be reused while
> at least some sort of table level intent lock is held.
--Andreas
Re: RowLocation lifetime
Posted by Suresh Thalamati <su...@gmail.com>.
Andreas Korneliussen wrote:
> Mike Matrigali wrote:
>
<snip>
> I would think there are multiple ways of adressing this issue:
>
> 1 We could make the store graciously handle the situation if the
> RowLocation points to a deleted+purged row, by returning false if the
> RowLocation is invalid, (and from the caller we can give an exception)
This may not be good option, because purged row location can
potentially be used by an another insert from a different
transaction.
>
> 2 Or we could make all scrollable updatable resultsets set read-locks or
> updatelocks on every row, for all isolation levels (including
> read-uncommitted)
Sounds like a good idea.
>
> 3 Or we could make purging require a table level X lock, instead of row
> locks
No, this can lead to real concurrency issues, if application is doing
lot of deletes in parallel to other operations.
Thanks
-suresht
Re: RowLocation lifetime
Posted by Andreas Korneliussen <An...@Sun.COM>.
Mike Matrigali wrote:
> From the store point of view, 3 things can happen to
> RowLocations in heap tables:
> 1) It can be deleted (requires at least TABLE IX, ROW X locking)
> o internal to store it is just marked deleted
> o external to store requests for any operation on
> this row will fail. Note that your cursor can
> experience this no matter what locking you do, as
> it is always possible for another statement in your
> transaction to do the delete.
> 2) It can be purged (requires at least TABLE IX, ROW X locking)
> o all physical evidence of the row is removed from table,
> both internal and external operations on this row will
> fail. Only committed deleted rows are purged.
> Note this will never happen if you have some
> sort of lock on the row as the requested X lock is
> always requested in a system only transaction.
> o the actual RowLocation will not be reused while
> at least some sort of table level intent lock is held.
> 3) It can be reused (requires a table level X lock)
> o basically as part of a compress all rows can be shuffled
> in any way. A former RowLocation can now point to
> a completely different row.
>
> So as you point out, your implementation can have serious problems
> with cursors held over commit. This is why in current usage of
> cursors over commit the only safe thing to do is to ask for the
> next row location and use it.
>
> Please make sure to consider the delete/purge cases also. One case
> that often causes problems is a transaction deleting a row that is
> locked by it's own cursor from another statement in the same connection.
>
Yes, we need to consider those cases.
It seems that the store is capable of graciously handle that the row get
deleted (i.e by its own transaction). If the transaction later tries to
update the deleted row using the resultset, the store call will return
false indicating that the row was not updated. The deleted row will not
be purged as long as the transaction is open.
However in read-committed/read-uncommitted mode, a row read by the
cursor, can be deleted by another transaction, and then purged.
It seems that the store does not handle an update of a deleted+purged
record.
On our prototype impl., I get a get a NullPointerException from the
store in this case. It comes in GenericConglomerateController.replace(..)).
I would think there are multiple ways of adressing this issue:
1 We could make the store graciously handle the situation if the
RowLocation points to a deleted+purged row, by returning false if the
RowLocation is invalid, (and from the caller we can give an exception)
2 Or we could make all scrollable updatable resultsets set read-locks or
updatelocks on every row, for all isolation levels (including
read-uncommitted)
3 Or we could make purging require a table level X lock, instead of row
locks
Below is output from the test:
T1: Read next Tuple:(0,0,17)
T1: Read next Tuple:(1,1,19)
T1: Read next Tuple:(2,2,21)
T1: Read next Tuple:(3,3,23)
T1: Read next Tuple:(4,4,25)
T1: Read next Tuple:(5,5,27)
T1: Read next Tuple:(6,6,29)
T1: Read next Tuple:(7,7,31)
T1: Read next Tuple:(8,8,33)
T1: Read next Tuple:(9,9,35)
T2: Deleted Tuple:(0,0,17)
T2: commit
T3: call SYSCS_UTIL.SYSCS_INPLACE_COMPRESS_TABLE
T3: purged deleted records
T3: commit
T1: Read first Tuple:(0,0,17)
T1: updateInt(2, 3);
T1: updateRow()
java.lang.NullPointerException
at
org.apache.derby.impl.store.access.conglomerate.GenericConglomerateController.replace(GenericConglomerateController.java:465)
at
org.apache.derby.impl.sql.execute.RowChangerImpl.updateRow(RowChangerImpl.java:516)
at
org.apache.derby.impl.sql.execute.UpdateResultSet.collectAffectedRows(UpdateResultSet.java:577)
at
org.apache.derby.impl.sql.execute.UpdateResultSet.open(UpdateResultSet.java:276)
at
org.apache.derby.impl.sql.GenericPreparedStatement.execute(GenericPreparedStatement.java:368)
at
org.apache.derby.impl.jdbc.EmbedResultSet.updateRow(EmbedResultSet.java:3256)
at
resultsettests.ConcurrencyTest.testConcurrency7(ConcurrencyTest.java:306)
-- Andreas
Re: RowLocation lifetime
Posted by Mike Matrigali <mi...@sbcglobal.net>.
>From the store point of view, 3 things can happen to
RowLocations in heap tables:
1) It can be deleted (requires at least TABLE IX, ROW X locking)
o internal to store it is just marked deleted
o external to store requests for any operation on
this row will fail. Note that your cursor can
experience this no matter what locking you do, as
it is always possible for another statement in your
transaction to do the delete.
2) It can be purged (requires at least TABLE IX, ROW X locking)
o all physical evidence of the row is removed from table,
both internal and external operations on this row will
fail. Only committed deleted rows are purged.
Note this will never happen if you have some
sort of lock on the row as the requested X lock is
always requested in a system only transaction.
o the actual RowLocation will not be reused while
at least some sort of table level intent lock is held.
3) It can be reused (requires a table level X lock)
o basically as part of a compress all rows can be shuffled
in any way. A former RowLocation can now point to
a completely different row.
So as you point out, your implementation can have serious problems
with cursors held over commit. This is why in current usage of
cursors over commit the only safe thing to do is to ask for the
next row location and use it.
Please make sure to consider the delete/purge cases also. One case
that often causes problems is a transaction deleting a row that is
locked by it's own cursor from another statement in the same connection.
As you also discovered, even in read uncommitted Derby still gets
table level intent locks. This can cause unexpected behavior for users
not doing row level locking, but it seemed a reasonable way to avoid
a lot of ugly problems that can be caused by concurrent DDL with read
uncommitted operations.
Andreas Korneliussen wrote:
> Hi,
>
> We are planning on using RowLocation to position the cursor when doing
> scrollable updatable cursors (i.e when navigating backwards in the
> resultset) - see
> http://permalink.gmane.org/gmane.comp.apache.db.derby.devel/10028 for
> more details.
>
> The fact that the RowLocation could become invalid, do worry me a bit.
>
> I did a test on a simple select statement, using transaction level
> read-uncommitted.
>
> T1: select * from testtable
>
> Then (before doing commit) I called on another connection:
>
> T2: call SYSCS_UTIL.SYSCS_COMPRESS_TABLE("APP", "TESTTABLE", 0)
> T2: Got exception:The exception 'SQL Exception: A lock could not be
> obtained within the time requested' was thrown while evaluating an
> expression.
>
> So even in read-uncommitted mode, a lock intent level lock on the table
> is set (good), and it seems to be held until I close the resultset or
> commit the transaction T1.
>
> The problem I then see is for cursors that are held over commit
> (ResultSet.HOLD_CURSORS_OVER_COMMIT). Maybe we should not support it for
> scrollable updatable resultsets.
>
> Anyway , we would really appreciate to get some comments on the
> specification Dag sent out two days ago, to ensure that we are on the
> right track.
>
> Thanks
>
> -- Andreas
>
> Mike Matrigali wrote:
>
>> Assuming row is not deleted, the question can only be answered
>> knowing the isolation level. Basically the RowLocation can only
>> be counted on while a lock intent level lock is held on the table.
>> Intent table locks may be released as soon as a statement is
>> completed, or may be held to end of transaction depending on
>> the type of statement and type of isolation level.
>>
>> The thing that may move an existing row in a heap are the compress
>> table system procedures.
>>
>> If a row is deleted then there are other factors.
>>
>> Rick Hillegas wrote:
>>
>>> Hello Store experts,
>>>
>>> How long is a RowLocation in a Heap good for? Provided that the row
>>> is not deleted, can you count on its RowLocation reliably identifying
>>> the row for the duration of a Statement, a Transaction, a Connection?
>>> Forever?
>>>
>>> Thanks,
>>> -Rick
>>>
>>>
>>
>
>
Re: RowLocation lifetime
Posted by Andreas Korneliussen <An...@Sun.COM>.
Hi,
We are planning on using RowLocation to position the cursor when doing
scrollable updatable cursors (i.e when navigating backwards in the
resultset) - see
http://permalink.gmane.org/gmane.comp.apache.db.derby.devel/10028 for
more details.
The fact that the RowLocation could become invalid, do worry me a bit.
I did a test on a simple select statement, using transaction level
read-uncommitted.
T1: select * from testtable
Then (before doing commit) I called on another connection:
T2: call SYSCS_UTIL.SYSCS_COMPRESS_TABLE("APP", "TESTTABLE", 0)
T2: Got exception:The exception 'SQL Exception: A lock could not be
obtained within the time requested' was thrown while evaluating an
expression.
So even in read-uncommitted mode, a lock intent level lock on the table
is set (good), and it seems to be held until I close the resultset or
commit the transaction T1.
The problem I then see is for cursors that are held over commit
(ResultSet.HOLD_CURSORS_OVER_COMMIT). Maybe we should not support it for
scrollable updatable resultsets.
Anyway , we would really appreciate to get some comments on the
specification Dag sent out two days ago, to ensure that we are on the
right track.
Thanks
-- Andreas
Mike Matrigali wrote:
> Assuming row is not deleted, the question can only be answered
> knowing the isolation level. Basically the RowLocation can only
> be counted on while a lock intent level lock is held on the table.
> Intent table locks may be released as soon as a statement is
> completed, or may be held to end of transaction depending on
> the type of statement and type of isolation level.
>
> The thing that may move an existing row in a heap are the compress
> table system procedures.
>
> If a row is deleted then there are other factors.
>
> Rick Hillegas wrote:
>
>> Hello Store experts,
>>
>> How long is a RowLocation in a Heap good for? Provided that the row is
>> not deleted, can you count on its RowLocation reliably identifying the
>> row for the duration of a Statement, a Transaction, a Connection?
>> Forever?
>>
>> Thanks,
>> -Rick
>>
>>
>
Re: RowLocation lifetime
Posted by Mike Matrigali <mi...@sbcglobal.net>.
Øystein Grøvlen wrote:
>>>>>>"MM" == Mike Matrigali <mi...@sbcglobal.net> writes:
>
>
> MM> Øystein Grøvlen wrote:
>
> >>>>>>> "MM" == Mike Matrigali <mi...@sbcglobal.net> writes:
> >>
> >>
> MM> It is only stable while some sort of stable table intent lock is held.
> MM> Rows can move during a compress table operation.
> >>
> >> I understand, when a record is moved to another page, its RecordId
> >> will change. Is this the only case where a RecordId will change? If
> >> yes, I would think one could solve the problem for insensitive result
> >> sets by invalidating open result sets when an underlying table is
> >> compressed.
> >>
> >> Some questions to how compression works:
> >> - Will RecordIds ever be reused or will the sequence number continue to
> >> increase?
> MM> Derby now supports 2 different compression techniques, basically one
> MM> offline and one online.
>
> MM> SYSCS_UTIL.SYSCS_COMPRESS_TABLE() basically copies all rows from one
> MM> table to another, so recordId's may be reused. This requires table
> MM> level locking and so is effectively offline.
>
> Do not the new version of the table get a new container key? If yes,
> I would think that recordId's are not reused since container key is
> part of the record id.
That is an interesting point. I think the correct answer depends on how
one is using the row location. Currently the on disk format of the only
current implementation of a row location is done by HeapRowLocation and
it only stores a page and record number. While logically the record id
includes it's container, the current usage only requires the page and
record id. This was a significant space savings in indexes, but does
restrict all rows in a given index to point to a single container - not
a problem with current implementation. The in-memory versions of
HeapRowLocation do associate each with a single container for locking.
>
>
> MM> SYSCS_UTIL.SYSCS_INPLACE_COMPRESS_TABLE() moves records within the
> MM> same table, and will not reuse RecordId's, but a given record can
> MM> definitely change RecordId. This requires row locks
> MM> for purged/moved rows for most of it's work. Giving back space to
> MM> OS requires short time table level lock.
>
> Good, I was not aware that there was an online version of compress.
>
> >> - How do you ensure that the old RecordIds are not encountered
> >> during recovery? Does the compression include a checkpoint?
> MM> Neither really does anything special to stop old recordId's from
> MM> being encountered during recovery. With offline redo recovery of
> MM> old table is basically a drop table so either the table is there
> MM> and we drop it again, or the table is already dropped and we do
> MM> nothing. In online it is the normal case of redo or a record delete
> MM> or a record purge, in either case redo will either see a version of
> MM> the page where it can redo the delete or purge or it will see from
> MM> the version of the page that there is no work to do.
>
Re: RowLocation lifetime
Posted by Øystein Grøvlen <Oy...@Sun.COM>.
>>>>> "MM" == Mike Matrigali <mi...@sbcglobal.net> writes:
MM> Øystein Grøvlen wrote:
>>>>>>> "MM" == Mike Matrigali <mi...@sbcglobal.net> writes:
>>
>>
MM> It is only stable while some sort of stable table intent lock is held.
MM> Rows can move during a compress table operation.
>>
>> I understand, when a record is moved to another page, its RecordId
>> will change. Is this the only case where a RecordId will change? If
>> yes, I would think one could solve the problem for insensitive result
>> sets by invalidating open result sets when an underlying table is
>> compressed.
>>
>> Some questions to how compression works:
>> - Will RecordIds ever be reused or will the sequence number continue to
>> increase?
MM> Derby now supports 2 different compression techniques, basically one
MM> offline and one online.
MM> SYSCS_UTIL.SYSCS_COMPRESS_TABLE() basically copies all rows from one
MM> table to another, so recordId's may be reused. This requires table
MM> level locking and so is effectively offline.
Do not the new version of the table get a new container key? If yes,
I would think that recordId's are not reused since container key is
part of the record id.
MM> SYSCS_UTIL.SYSCS_INPLACE_COMPRESS_TABLE() moves records within the
MM> same table, and will not reuse RecordId's, but a given record can
MM> definitely change RecordId. This requires row locks
MM> for purged/moved rows for most of it's work. Giving back space to
MM> OS requires short time table level lock.
Good, I was not aware that there was an online version of compress.
>> - How do you ensure that the old RecordIds are not encountered
>> during recovery? Does the compression include a checkpoint?
MM> Neither really does anything special to stop old recordId's from
MM> being encountered during recovery. With offline redo recovery of
MM> old table is basically a drop table so either the table is there
MM> and we drop it again, or the table is already dropped and we do
MM> nothing. In online it is the normal case of redo or a record delete
MM> or a record purge, in either case redo will either see a version of
MM> the page where it can redo the delete or purge or it will see from
MM> the version of the page that there is no work to do.
--
Øystein
Re: RowLocation lifetime
Posted by Mike Matrigali <mi...@sbcglobal.net>.
Øystein Grøvlen wrote:
>>>>>>"MM" == Mike Matrigali <mi...@sbcglobal.net> writes:
>
>
> MM> It is only stable while some sort of stable table intent lock is held.
> MM> Rows can move during a compress table operation.
>
> I understand, when a record is moved to another page, its RecordId
> will change. Is this the only case where a RecordId will change? If
> yes, I would think one could solve the problem for insensitive result
> sets by invalidating open result sets when an underlying table is
> compressed.
>
> Some questions to how compression works:
> - Will RecordIds ever be reused or will the sequence number continue to
> increase?
Derby now supports 2 different compression techniques, basically one
offline and one online.
SYSCS_UTIL.SYSCS_COMPRESS_TABLE() basically copies all rows from one
table to another, so recordId's may be reused. This requires table
level locking and so is effectively offline.
SYSCS_UTIL.SYSCS_INPLACE_COMPRESS_TABLE() moves records within the
same table, and will not reuse RecordId's, but a given record can
definitely change RecordId. This requires row locks
for purged/moved rows for most of it's work. Giving back space to
OS requires short time table level lock.
> - How do you ensure that the old RecordIds are not encountered
> during recovery? Does the compression include a checkpoint?
Neither really does anything special to stop old recordId's from
being encountered during recovery. With offline redo recovery of
old table is basically a drop table so either the table is there
and we drop it again, or the table is already dropped and we do
nothing. In online it is the normal case of redo or a record delete
or a record purge, in either case redo will either see a version of
the page where it can redo the delete or purge or it will see from
the version of the page that there is no work to do.
SYSCS_UTIL.SYSCS_COMPRESS_TABLE():
This activity is basically all done above the store level, store
does not really know what is going on and there is no special
casing of recovery. The path is basically the same recovery path
as the fast create/load of a new table/indexes. As the last step
in the transaction the language code switches the mapping of the
user table to the underlying store conglomerates.
SYSCS_UTIL.SYSCS_INPLACE_COMPRESS_TABLE():
For the row movement portion of this, there is no special recovery
code. Every row movement is just a logged delete and insert, with
all the associated index updates in the same transaction. The row
movement portion, and row purging portion of this compress is
"online" in that it only needs short term page latches and short
row level locked transactions. The actual giving space back to the
OS still needs a table level lock, and does require a checkpoint.
> - It seems like the compression heavily depends on that no log
> records are generated during compression. Do you have any ideas
> of how to make compression on-line? I guess you would need a
> mapping between new and old RecordIds (i.e., every move would
> have to be logged.)
There is not requirement of no log records. As you say every move is
logged, which is why in it is recommended that users use offline
compression if that option is available to them as it uses way less
system resources and will finish quicker, but definitely is not as
concurrent.
>
Re: RowLocation lifetime
Posted by Suresh Thalamati <su...@gmail.com>.
Øystein Grøvlen wrote:
>>>>>>"MM" == Mike Matrigali <mi...@sbcglobal.net> writes:
>
>
> MM> It is only stable while some sort of stable table intent lock is held.
> MM> Rows can move during a compress table operation.
>
> I understand, when a record is moved to another page, its RecordId
> will change. Is this the only case where a RecordId will change?
It seems like that for the heap tables. By quick scan at the code I
find only btrees set derby.storage.reusableRecordId=true, and this
property is not supposed to be set by the users.
If a invalid handle is passed to store, in one fecth case I noticed it
throws the exception SQLState.RAWSTORE_RECORD_VANISHED ,
Thanks
-suresht
Re: RowLocation lifetime
Posted by Øystein Grøvlen <Oy...@Sun.COM>.
>>>>> "MM" == Mike Matrigali <mi...@sbcglobal.net> writes:
MM> It is only stable while some sort of stable table intent lock is held.
MM> Rows can move during a compress table operation.
I understand, when a record is moved to another page, its RecordId
will change. Is this the only case where a RecordId will change? If
yes, I would think one could solve the problem for insensitive result
sets by invalidating open result sets when an underlying table is
compressed.
Some questions to how compression works:
- Will RecordIds ever be reused or will the sequence number continue to
increase?
- How do you ensure that the old RecordIds are not encountered
during recovery? Does the compression include a checkpoint?
- It seems like the compression heavily depends on that no log
records are generated during compression. Do you have any ideas
of how to make compression on-line? I guess you would need a
mapping between new and old RecordIds (i.e., every move would
have to be logged.)
--
Øystein
Re: RowLocation lifetime
Posted by Øystein Grøvlen <Oy...@Sun.COM>.
>>>>> "MM" == Mike Matrigali <mi...@sbcglobal.net> writes:
MM> Not really.
MM> I don't think even a unique key constraint meets the requirements,
MM> as the row can be deleted and a different row can be created.
HeapRowLocation contains a RecordHandle which if I have understood
this correctly refers to a RecordId in this case. Looking at the
code, RecordId consists of pageId and a sequence number within each
page. It seems to me that RecordIds should be unique and not reused.
Could not this be used as a stable handle?
MM> Rick Hillegas wrote:
>> Hi Mike,
>> It appears that RowLocations are stable under some combinations of
>> isolation levels and lock granularities but that, in general, a
>> RowLocation isn't guaranteed to be stable even for the duration of a
>> statement. Is there some other, more stable handle on a heap row? By
>> "more stable" I mean something which uniquely identifies the row
>> across statements, transactions, or connections?
>> Thanks,
>> -Rick
>>
--
Øystein
Re: RowLocation lifetime
Posted by Mike Matrigali <mi...@sbcglobal.net>.
Not really.
I don't think even a unique key constraint meets the requirements,
as the row can be deleted and a different row can be created.
Rick Hillegas wrote:
> Hi Mike,
>
> It appears that RowLocations are stable under some combinations of
> isolation levels and lock granularities but that, in general, a
> RowLocation isn't guaranteed to be stable even for the duration of a
> statement. Is there some other, more stable handle on a heap row? By
> "more stable" I mean something which uniquely identifies the row across
> statements, transactions, or connections?
>
> Thanks,
> -Rick
>
>
Re: RowLocation lifetime
Posted by Rick Hillegas <Ri...@Sun.COM>.
Hi Mike,
It appears that RowLocations are stable under some combinations of
isolation levels and lock granularities but that, in general, a
RowLocation isn't guaranteed to be stable even for the duration of a
statement. Is there some other, more stable handle on a heap row? By
"more stable" I mean something which uniquely identifies the row across
statements, transactions, or connections?
Thanks,
-Rick
Re: RowLocation lifetime
Posted by Mike Matrigali <mi...@sbcglobal.net>.
Assuming row is not deleted, the question can only be answered
knowing the isolation level. Basically the RowLocation can only
be counted on while a lock intent level lock is held on the table.
Intent table locks may be released as soon as a statement is
completed, or may be held to end of transaction depending on
the type of statement and type of isolation level.
The thing that may move an existing row in a heap are the compress
table system procedures.
If a row is deleted then there are other factors.
Rick Hillegas wrote:
> Hello Store experts,
>
> How long is a RowLocation in a Heap good for? Provided that the row is
> not deleted, can you count on its RowLocation reliably identifying the
> row for the duration of a Statement, a Transaction, a Connection? Forever?
>
> Thanks,
> -Rick
>
>