Posted to derby-dev@db.apache.org by Rick Hillegas <Ri...@Sun.COM> on 2005/11/10 01:58:21 UTC

RowLocation lifetime

Hello Store experts,

How long is a RowLocation in a Heap good for? Provided that the row is 
not deleted, can you count on its RowLocation reliably identifying the 
row for the duration of a Statement, a Transaction, a Connection? Forever?

Thanks,
-Rick

Re: RowLocation lifetime

Posted by Andreas Korneliussen <An...@Sun.COM>.
Mike Matrigali wrote:
> null pointer is a bug, please report it as a separate JIRA; not sure
> what is going on.  Note that while it is convenient for testing
> purposes to use the SYSCS_UTIL.SYSCS_INPLACE_COMPRESS_TABLE
> to cause the purging, purging of rows can happen at any time
> after a delete has been committed.  The timing depends on a
> number of factors and the timing may change in the future - so
> one should assume that as soon as a row is deleted and committed
> it may be purged.
> 
> Andreas Korneliussen wrote:
> 
>> Mike Matrigali wrote:
>>
>>> From the store point of view, 3 things can happen to
>>> RowLocations in heap tables:
>>> 1) It can be deleted (requires at least TABLE IX, ROW X locking)
>>>        o internal to store it is just marked deleted
>>>        o external to store requests for any operation on
>>>          this row will fail.  Note that your cursor can
>>>          experience this no matter what locking you do, as
>>>          it is always possible for another statement in your
>>>          transaction to do the delete.
>>> 2) It can be purged (requires at least TABLE IX, ROW X locking)
>>>        o all physical evidence of the row is removed from table,
>>>          both internal and external operations on this row will
>>>          fail.  Only committed deleted rows are purged.
>>>          Note this will never happen if you have some
>>>          sort of lock on the row as the requested X lock is
>>>          always requested in a system only transaction.
>>>        o the actual RowLocation will not be reused while
>>>          at least some sort of table level intent lock is held.
>>> 3) It can be reused (requires a table level X lock)
>>>        o basically as part of a compress all rows can be shuffled
>>>          in any way.  A former RowLocation can now point to
>>>          a completely different row.
>>>
>>> So as you point out, your implementation can have serious problems
>>> with cursors held over commit.  This is why in current usage of
>>> cursors over commit the only safe thing to do is to ask for the
>>> next row location and use it.
>>>
>>> Please make sure to consider the delete/purge cases also.  One case
>>> that often causes problems is a transaction deleting a row that is
>>> locked by its own cursor from another statement in the same connection.
>>>
>> Yes, we need to consider those cases.
>>
>> It seems that the store is capable of gracefully handling the case 
>> where the row gets deleted (i.e. by its own transaction). If the 
>> transaction later tries to update the deleted row using the resultset, 
>> the store call will return false, indicating that the row was not 
>> updated. The deleted row will not be purged as long as the transaction 
>> is open.
>>
>> However, in read-committed/read-uncommitted mode, a row read by the 
>> cursor can be deleted by another transaction, and then purged.
>> It seems that the store does not handle an update of a deleted+purged 
>> record.
>>
>> On our prototype impl., I get a NullPointerException from the 
>> store in this case.  It comes from 
>> GenericConglomerateController.replace(..).
>>
>> I would think there are multiple ways of addressing this issue:
>>
>> 1 We could make the store gracefully handle the situation where the 
>> RowLocation points to a deleted+purged row, by returning false if the 
>> RowLocation is invalid (and from the caller we can give an exception)
> 
> It seems like the ConglomerateController.replace() function should throw
> an exception (other than null pointer) if it is called with a 
> non-existent RowLocation, but I could be convinced returning false
> is ok.  The problem I have is that the store really has no way to tell the 
> difference between a BAD RowLocation input and one which was purged.
> 

Yes, maybe the store should throw an exception if the RowLocation is 
invalid.  Seen from the user perspective, I think the update should fail 
the same way whether the row has been deleted+purged or only deleted. 
Currently the function returns false if the row has been deleted.

I ran a check on all places in the code tree where the 
ConglomerateController.replace(..) method is called. I found that the 
return value is silently ignored everywhere, except in the store 
unit tests.

This means that an updateRow() on a deleted record would simply return 
silently (using the optimistic concurrency approach). After the 
transaction has committed, the row would remain deleted.

I will file a JIRA for the NullPointerException case.

--Andreas
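Since the boolean from ConglomerateController.replace() is currently dropped on the floor by callers, a caller-side wrapper could surface it instead. The sketch below is a toy stand-in (a map-backed store; the RowStore class and its methods are hypothetical, not Derby's API) just to contrast the two policies under discussion: returning false versus raising an error.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a replace()-style store call; not Derby code.
class RowStore {
    private final Map<Integer, String> rows = new HashMap<>();

    void put(int loc, String row) { rows.put(loc, row); }

    // Mirrors the contract described above: false when the location no
    // longer identifies a row (deleted or purged).
    boolean replace(int loc, String newRow) {
        if (!rows.containsKey(loc)) return false;
        rows.put(loc, newRow);
        return true;
    }

    // Caller-side policy: turn the otherwise-ignored boolean into an error,
    // so an updateRow() on a vanished row does not succeed silently.
    void replaceOrFail(int loc, String newRow) {
        if (!replace(loc, newRow)) {
            throw new IllegalStateException("row at " + loc + " no longer exists");
        }
    }
}
```

Whichever layer ends up throwing, the point is the same: someone has to look at that return value.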





Re: RowLocation lifetime

Posted by Mike Matrigali <mi...@sbcglobal.net>.
null pointer is a bug, please report it as a separate JIRA; not sure
what is going on.  Note that while it is convenient for testing
purposes to use the SYSCS_UTIL.SYSCS_INPLACE_COMPRESS_TABLE
to cause the purging, purging of rows can happen at any time
after a delete has been committed.  The timing depends on a
number of factors and the timing may change in the future - so
one should assume that as soon as a row is deleted and committed
it may be purged.

Andreas Korneliussen wrote:
> Mike Matrigali wrote:
> 
>> From the store point of view, 3 things can happen to
>> RowLocations in heap tables:
>> 1) It can be deleted (requires at least TABLE IX, ROW X locking)
>>        o internal to store it is just marked deleted
>>        o external to store requests for any operation on
>>          this row will fail.  Note that your cursor can
>>          experience this no matter what locking you do, as
>>          it is always possible for another statement in your
>>          transaction to do the delete.
>> 2) It can be purged (requires at least TABLE IX, ROW X locking)
>>        o all physical evidence of the row is removed from table,
>>          both internal and external operations on this row will
>>          fail.  Only committed deleted rows are purged.
>>          Note this will never happen if you have some
>>          sort of lock on the row as the requested X lock is
>>          always requested in a system only transaction.
>>        o the actual RowLocation will not be reused while
>>          at least some sort of table level intent lock is held.
>> 3) It can be reused (requires a table level X lock)
>>        o basically as part of a compress all rows can be shuffled
>>          in any way.  A former RowLocation can now point to
>>          a completely different row.
>>
>> So as you point out, your implementation can have serious problems
>> with cursors held over commit.  This is why in current usage of
>> cursors over commit the only safe thing to do is to ask for the
>> next row location and use it.
>>
>> Please make sure to consider the delete/purge cases also.  One case
>> that often causes problems is a transaction deleting a row that is
>> locked by its own cursor from another statement in the same connection.
>>
> Yes, we need to consider those cases.
> 
> It seems that the store is capable of gracefully handling the case where 
> the row gets deleted (i.e. by its own transaction). If the transaction 
> later tries to update the deleted row using the resultset, the store call 
> will return false, indicating that the row was not updated. The deleted 
> row will not be purged as long as the transaction is open.
> 
> However, in read-committed/read-uncommitted mode, a row read by the 
> cursor can be deleted by another transaction, and then purged.
> It seems that the store does not handle an update of a deleted+purged 
> record.
> 
> On our prototype impl., I get a NullPointerException from the 
> store in this case.  It comes from 
> GenericConglomerateController.replace(..).
> 
> I would think there are multiple ways of addressing this issue:
> 
> 1 We could make the store gracefully handle the situation where the 
> RowLocation points to a deleted+purged row, by returning false if the 
> RowLocation is invalid (and from the caller we can give an exception)
It seems like the ConglomerateController.replace() function should throw
an exception (other than null pointer) if it is called with a 
non-existent RowLocation, but I could be convinced returning false
is ok.  The problem I have is that the store really has no way to tell the 
difference between a BAD RowLocation input and one which was purged.

> 
> 2 Or we could make all scrollable updatable resultsets set read locks or 
> update locks on every row, for all isolation levels (including 
> read-uncommitted)
This would work, and I guess the choice depends on optimistic vs. 
pessimistic concurrency.  If you want to guarantee that no errors are
thrown at update time then you need locks - though lock timeouts and
deadlocks are always a possibility.  If you are allowed to throw errors
when you find that the row does not exist at update time, I am sure the
store interfaces could be used to either return false, return an
exception that can be checked, or allow for an interface to check
and get a lock on an entity about to be updated.
> 
> 3 Or we could make purging require a table level X lock, instead of row 
> locks
This is just not an option; in active applications no space reclamation
would ever happen.  Very early on this was how it worked, and it just did
not work well with most applications.  Given user input, the current
row-level space reclamation was implemented.
> 
> Below is output from the test:
> 
> T1: Read next Tuple:(0,0,17)
> T1: Read next Tuple:(1,1,19)
> T1: Read next Tuple:(2,2,21)
> T1: Read next Tuple:(3,3,23)
> T1: Read next Tuple:(4,4,25)
> T1: Read next Tuple:(5,5,27)
> T1: Read next Tuple:(6,6,29)
> T1: Read next Tuple:(7,7,31)
> T1: Read next Tuple:(8,8,33)
> T1: Read next Tuple:(9,9,35)
> T2: Deleted Tuple:(0,0,17)
> T2: commit
> T3: call SYSCS_UTIL.SYSCS_INPLACE_COMPRESS_TABLE
> T3: purged deleted records
> T3: commit
> T1: Read first Tuple:(0,0,17)
> T1: updateInt(2, 3);
> T1: updateRow()
> java.lang.NullPointerException
>         at 
> org.apache.derby.impl.store.access.conglomerate.GenericConglomerateController.replace(GenericConglomerateController.java:465) 
> 
>         at 
> org.apache.derby.impl.sql.execute.RowChangerImpl.updateRow(RowChangerImpl.java:516) 
> 
>         at 
> org.apache.derby.impl.sql.execute.UpdateResultSet.collectAffectedRows(UpdateResultSet.java:577) 
> 
>         at 
> org.apache.derby.impl.sql.execute.UpdateResultSet.open(UpdateResultSet.java:276) 
> 
>         at 
> org.apache.derby.impl.sql.GenericPreparedStatement.execute(GenericPreparedStatement.java:368) 
> 
>         at 
> org.apache.derby.impl.jdbc.EmbedResultSet.updateRow(EmbedResultSet.java:3256) 
> 
>         at 
> resultsettests.ConcurrencyTest.testConcurrency7(ConcurrencyTest.java:306)
> 
> -- Andreas
> 
> 


Re: RowLocation lifetime

Posted by Andreas Korneliussen <An...@Sun.COM>.
Mike Matrigali wrote:
> I get confused when speaking about isolation levels, update/read-only
> result sets, and the underlying sql query of the result set.  I don't
> know if scrollable result sets are dependent on some sort of
> isolation level.
> 
> With respect to straight embedded server execution of SQL, it is fine to 
> run with
> read-uncommitted level - but any actual update done on a row will get
> an X lock held to end transaction.  At least from this level an SQL
> operation is never failed dependent on the isolation level.
> 
> I don't remember if U locks are requested in read uncommitted mode,
> but definitely X locks are requested when the actual update is done.
> 
> Note that all discussions of locking should specify which 
> isolation level the system is running under.  I assumed read committed
> for the discussion below, as it is the default.
> 

The discussion was intended for read-committed and read-uncommitted, 
since for other isolation levels, the rows for which we use the 
RowLocation would be locked, and cannot be deleted or purged by another 
transaction. Also, if we delete the row in our own transaction, it will 
not be purged, since it is locked with an exclusive lock.

I think you are allowed to do updates in read-uncommitted; however, when 
you read, you do not set read locks, so you can read uncommitted data.
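The locking behaviour described in this subthread boils down to a small policy table: reads take no row lock at read-uncommitted but do elsewhere, while an actual update gets an X lock held to end of transaction regardless of isolation level. A minimal sketch of that summary (enum and method names are mine, not Derby's lock manager):

```java
// Encodes only the behaviour described in the thread, nothing more.
enum Iso { READ_UNCOMMITTED, READ_COMMITTED, REPEATABLE_READ, SERIALIZABLE }
enum LockMode { NONE, SHARED, EXCLUSIVE }

class RowLockPolicy {
    // Reads set no read locks only at read-uncommitted.
    static LockMode readLock(Iso iso) {
        return iso == Iso.READ_UNCOMMITTED ? LockMode.NONE : LockMode.SHARED;
    }

    // Any actual update gets an X lock held to end of transaction,
    // independent of isolation level.
    static LockMode writeLock(Iso iso) {
        return LockMode.EXCLUSIVE;
    }
}
```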

Andreas


> Daniel John Debrunner wrote:
> 
>>> Andreas Korneliussen wrote:
>>
>>
>>
>>>> 2 Or we could make all scrollable updatable resultsets set read-locks
>>>> or  updatelocks on every row, for all isolation levels (including
>>>> read-uncommitted)
>>
>>
>>
>> I think updates are not allowed in read-uncommitted mode, so we should
>> not be getting locks in read-uncommitted.
>>
>> Dan.
>>
>>
>>
>>
>>
> 


Re: RowLocation lifetime

Posted by Mike Matrigali <mi...@sbcglobal.net>.
I get confused when speaking about isolation levels, update/read-only
result sets, and the underlying sql query of the result set.  I don't
know if scrollable result sets are dependent on some sort of
isolation level.

With respect to straight embedded server execution of SQL, it is fine to 
run with the read-uncommitted level - but any actual update done on a row 
will get an X lock held to end of transaction.  At least at this level, 
an SQL operation never fails based on the isolation level.

I don't remember if U locks are requested in read uncommitted mode,
but definitely X locks are requested when the actual update is done.

Note that all discussions of locking should specify which 
isolation level the system is running under.  I assumed read committed
for the discussion below, as it is the default.

Daniel John Debrunner wrote:
>>Andreas Korneliussen wrote:
> 
> 
>>>2 Or we could make all scrollable updatable resultsets set read-locks
>>>or  updatelocks on every row, for all isolation levels (including
>>>read-uncommitted)
> 
> 
> I think updates are not allowed in read-uncommitted mode, so we should
> not be getting locks in read-uncommitted.
> 
> Dan.
> 
> 
> 
> 
> 


Re: RowLocation lifetime

Posted by Daniel John Debrunner <dj...@debrunners.com>.
> Andreas Korneliussen wrote:

>> 2 Or we could make all scrollable updatable resultsets set read locks
>> or update locks on every row, for all isolation levels (including
>> read-uncommitted)

I think updates are not allowed in read-uncommitted mode, so we should
not be getting locks in read-uncommitted.

Dan.




Re: RowLocation lifetime

Posted by Andreas Korneliussen <An...@Sun.COM>.
Suresh Thalamati wrote:
> Andreas Korneliussen wrote:
> 
>> Mike Matrigali wrote:
>>
> 
> <snip>
> 
> 
>> I would think there are multiple ways of addressing this issue:
>>
>> 1 We could make the store gracefully handle the situation where the 
>> RowLocation points to a deleted+purged row, by returning false if the 
>> RowLocation is invalid (and from the caller we can give an exception)
> 
> 
> 
> This may not be a good option, because a purged row location can 
> potentially be reused by another insert from a different
> transaction.
> 

Maybe I misunderstood; however, I assumed the RowLocation would not be 
reused as long as there is a table intent lock on the table. Therefore 
an insert from a different transaction would need to use another 
RowLocation.

Mike Matrigali wrote:
> 2) It can be purged (requires at least TABLE IX, ROW X locking)
>        o all physical evidence of the row is removed from table,
>          both internal and external operations on this row will
>          fail.  Only committed deleted rows are purged.
>          Note this will never happen if you have some
>          sort of lock on the row as the requested X lock is
>          always requested in a system only transaction.
>        o the actual RowLocation will not be reused while
>          at least some sort of table level intent lock is held.

--Andreas
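The guarantee quoted above - a purged RowLocation is not reused while any table-level intent lock is held - can be pictured with a toy allocator. This is entirely hypothetical Java, a sketch of the quoted rule, not the store's real free-space logic:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy slot allocator: freed (purged) slots are only handed out again
// once no table-level intent lock remains.
class SlotAllocator {
    private final Deque<Integer> freed = new ArrayDeque<>();
    private int next = 0;
    private int intentLocks = 0;

    void acquireIntentLock() { intentLocks++; }
    void releaseIntentLock() { intentLocks--; }
    void free(int slot) { freed.push(slot); }

    int allocate() {
        // While any intent lock is held, never reuse a freed slot;
        // a concurrent insert must get a fresh location instead.
        if (intentLocks == 0 && !freed.isEmpty()) return freed.pop();
        return next++;
    }
}
```

Under this rule, an insert that runs while a scan still holds its table intent lock is forced onto a new location, which is exactly why the scan's remembered RowLocations stay unambiguous.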

Re: RowLocation lifetime

Posted by Suresh Thalamati <su...@gmail.com>.
Andreas Korneliussen wrote:
> Mike Matrigali wrote:
> 

<snip>


> I would think there are multiple ways of addressing this issue:
> 
> 1 We could make the store gracefully handle the situation where the 
> RowLocation points to a deleted+purged row, by returning false if the 
> RowLocation is invalid (and from the caller we can give an exception)


This may not be a good option, because a purged row location can 
potentially be reused by another insert from a different
transaction.


> 
> 2 Or we could make all scrollable updatable resultsets set read locks or 
> update locks on every row, for all isolation levels (including 
> read-uncommitted)

Sounds like a good idea.


> 
> 3 Or we could make purging require a table level X lock, instead of row 
> locks

No, this can lead to real concurrency issues if the application is doing 
a lot of deletes in parallel with other operations.



Thanks
-suresht

Re: RowLocation lifetime

Posted by Andreas Korneliussen <An...@Sun.COM>.
Mike Matrigali wrote:
> From the store point of view, 3 things can happen to
> RowLocations in heap tables:
> 1) It can be deleted (requires at least TABLE IX, ROW X locking)
>        o internal to store it is just marked deleted
>        o external to store requests for any operation on
>          this row will fail.  Note that your cursor can
>          experience this no matter what locking you do, as
>          it is always possible for another statement in your
>          transaction to do the delete.
> 2) It can be purged (requires at least TABLE IX, ROW X locking)
>        o all physical evidence of the row is removed from table,
>          both internal and external operations on this row will
>          fail.  Only committed deleted rows are purged.
>          Note this will never happen if you have some
>          sort of lock on the row as the requested X lock is
>          always requested in a system only transaction.
>        o the actual RowLocation will not be reused while
>          at least some sort of table level intent lock is held.
> 3) It can be reused (requires a table level X lock)
>        o basically as part of a compress all rows can be shuffled
>          in any way.  A former RowLocation can now point to
>          a completely different row.
> 
> So as you point out, your implementation can have serious problems
> with cursors held over commit.  This is why in current usage of
> cursors over commit the only safe thing to do is to ask for the
> next row location and use it.
> 
> Please make sure to consider the delete/purge cases also.  One case
> that often causes problems is a transaction deleting a row that is
> locked by its own cursor from another statement in the same connection.
>
Yes, we need to consider those cases.

It seems that the store is capable of gracefully handling the case where 
the row gets deleted (i.e. by its own transaction). If the transaction 
later tries to update the deleted row using the resultset, the store call 
will return false, indicating that the row was not updated. The deleted 
row will not be purged as long as the transaction is open.

However, in read-committed/read-uncommitted mode, a row read by the 
cursor can be deleted by another transaction, and then purged.
It seems that the store does not handle an update of a deleted+purged 
record.

On our prototype impl., I get a NullPointerException from the 
store in this case.  It comes from GenericConglomerateController.replace(..).

I would think there are multiple ways of addressing this issue:

1 We could make the store gracefully handle the situation where the 
RowLocation points to a deleted+purged row, by returning false if the 
RowLocation is invalid (and from the caller we can give an exception)

2 Or we could make all scrollable updatable resultsets set read locks or 
update locks on every row, for all isolation levels (including 
read-uncommitted)

3 Or we could make purging require a table level X lock, instead of row 
locks

Below is output from the test:

T1: Read next Tuple:(0,0,17)
T1: Read next Tuple:(1,1,19)
T1: Read next Tuple:(2,2,21)
T1: Read next Tuple:(3,3,23)
T1: Read next Tuple:(4,4,25)
T1: Read next Tuple:(5,5,27)
T1: Read next Tuple:(6,6,29)
T1: Read next Tuple:(7,7,31)
T1: Read next Tuple:(8,8,33)
T1: Read next Tuple:(9,9,35)
T2: Deleted Tuple:(0,0,17)
T2: commit
T3: call SYSCS_UTIL.SYSCS_INPLACE_COMPRESS_TABLE
T3: purged deleted records
T3: commit
T1: Read first Tuple:(0,0,17)
T1: updateInt(2, 3);
T1: updateRow()
java.lang.NullPointerException
         at 
org.apache.derby.impl.store.access.conglomerate.GenericConglomerateController.replace(GenericConglomerateController.java:465)
         at 
org.apache.derby.impl.sql.execute.RowChangerImpl.updateRow(RowChangerImpl.java:516)
         at 
org.apache.derby.impl.sql.execute.UpdateResultSet.collectAffectedRows(UpdateResultSet.java:577)
         at 
org.apache.derby.impl.sql.execute.UpdateResultSet.open(UpdateResultSet.java:276)
         at 
org.apache.derby.impl.sql.GenericPreparedStatement.execute(GenericPreparedStatement.java:368)
         at 
org.apache.derby.impl.jdbc.EmbedResultSet.updateRow(EmbedResultSet.java:3256)
         at 
resultsettests.ConcurrencyTest.testConcurrency7(ConcurrencyTest.java:306)

-- Andreas
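For reference, the scenario above can be sketched in plain JDBC roughly as follows. This is a reconstruction from the log, not the actual ConcurrencyTest code; the table name, the ID column, and the SYSCS_INPLACE_COMPRESS_TABLE argument list are assumptions, and running it needs a Derby driver on the classpath.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

class PurgedRowRepro {
    // t1, t2, t3 are assumed to be three connections to the same database.
    static void reproduce(Connection t1, Connection t2, Connection t3)
            throws SQLException {
        t1.setAutoCommit(false);
        t1.setTransactionIsolation(Connection.TRANSACTION_READ_UNCOMMITTED);
        Statement s1 = t1.createStatement(ResultSet.TYPE_SCROLL_INSENSITIVE,
                                          ResultSet.CONCUR_UPDATABLE);
        ResultSet rs = s1.executeQuery("SELECT * FROM TESTTABLE");
        while (rs.next()) { /* T1: read all tuples */ }

        // T2: delete a row T1 has seen (auto-commits).
        t2.createStatement().executeUpdate("DELETE FROM TESTTABLE WHERE ID = 0");

        // T3: purge committed-deleted rows (argument list assumed).
        t3.createStatement().execute(
            "CALL SYSCS_UTIL.SYSCS_INPLACE_COMPRESS_TABLE('APP', 'TESTTABLE', 1, 0, 0)");

        rs.first();            // reposition on the now-purged row
        rs.updateInt(2, 3);
        rs.updateRow();        // NullPointerException reported in the thread
    }
}
```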

Re: RowLocation lifetime

Posted by Mike Matrigali <mi...@sbcglobal.net>.
From the store point of view, 3 things can happen to
RowLocations in heap tables:
1) It can be deleted (requires at least TABLE IX, ROW X locking)
       o internal to store it is just marked deleted
       o external to store requests for any operation on
         this row will fail.  Note that your cursor can
         experience this no matter what locking you do, as
         it is always possible for another statement in your
         transaction to do the delete.
2) It can be purged (requires at least TABLE IX, ROW X locking)
       o all physical evidence of the row is removed from table,
         both internal and external operations on this row will
         fail.  Only committed deleted rows are purged.
         Note this will never happen if you have some
         sort of lock on the row as the requested X lock is
         always requested in a system only transaction.
       o the actual RowLocation will not be reused while
         at least some sort of table level intent lock is held.
3) It can be reused (requires a table level X lock)
       o basically as part of a compress all rows can be shuffled
         in any way.  A former RowLocation can now point to
         a completely different row.

So as you point out, your implementation can have serious problems
with cursors held over commit.  This is why in current usage of
cursors over commit the only safe thing to do is to ask for the
next row location and use it.

Please make sure to consider the delete/purge cases also.  One case
that often causes problems is a transaction deleting a row that is
locked by its own cursor from another statement in the same connection.

As you also discovered, even in read uncommitted Derby still gets
table level intent locks.  This can cause unexpected behavior for users
not doing row level locking, but it seemed a reasonable way to avoid
a lot of ugly problems that can be caused by concurrent DDL with read
uncommitted operations.
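The three outcomes above can be caricatured in a few lines of plain Java. ToyHeap below is not store code; it only models "delete marks, purge removes, compress shuffles" to show how a remembered location can end up pointing at a completely different row.

```java
import java.util.Map;
import java.util.TreeMap;

// Toy model of the three outcomes: a "RowLocation" is just a slot number.
class ToyHeap {
    private final Map<Integer, String> slots = new TreeMap<>();
    private final Map<Integer, Boolean> deleted = new TreeMap<>();
    private int nextSlot = 0;

    int insert(String row) {
        slots.put(nextSlot, row);
        deleted.put(nextSlot, false);
        return nextSlot++;
    }

    // 1) Delete: the row is only marked deleted; the slot survives.
    void delete(int loc) { deleted.put(loc, true); }

    // 2) Purge: physical removal, allowed only for deleted rows.
    void purge(int loc) {
        if (Boolean.TRUE.equals(deleted.get(loc))) {
            slots.remove(loc);
            deleted.remove(loc);
        }
    }

    // 3) Compress: survivors are repacked into the lowest slots, so an
    //    old location may now name a completely different row.
    void compress() {
        Map<Integer, String> packed = new TreeMap<>();
        int slot = 0;
        for (String row : slots.values()) packed.put(slot++, row);
        slots.clear();
        slots.putAll(packed);
        deleted.clear();
        for (Integer k : packed.keySet()) deleted.put(k, false);
        nextSlot = slot;
    }

    // Operations on a deleted or purged location fail (here: null).
    String fetch(int loc) {
        if (Boolean.TRUE.equals(deleted.get(loc))) return null;
        return slots.get(loc);
    }
}
```

After a purge plus compress, an old location silently resolves to a different row, which is the "serious problems with cursors held over commit" case.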

Andreas Korneliussen wrote:
> Hi,
> 
>  We are planning on using RowLocation to position the cursor when doing
> scrollable updatable cursors (i.e when navigating backwards in the
> resultset) - see
> http://permalink.gmane.org/gmane.comp.apache.db.derby.devel/10028 for
> more details.
> 
> The fact that the RowLocation could become invalid does worry me a bit.
> 
> I did a test on a simple select statement, using transaction level
> read-uncommitted.
> 
> T1: select * from testtable
> 
> Then (before doing commit) I called on another connection:
> 
> T2: call SYSCS_UTIL.SYSCS_COMPRESS_TABLE("APP", "TESTTABLE", 0)
> T2: Got exception:The exception 'SQL Exception: A lock could not be
> obtained within the time requested' was thrown while evaluating an
> expression.
> 
> So even in read-uncommitted mode, an intent-level lock on the table
> is set (good), and it seems to be held until I close the resultset or
> commit the transaction T1.
> 
> The problem I then see is for cursors that are held over commit
> (ResultSet.HOLD_CURSORS_OVER_COMMIT). Maybe we should not support it for
> scrollable updatable resultsets.
> 
> Anyway, we would really appreciate some comments on the
> specification Dag sent out two days ago, to ensure that we are on the
> right track.
> 
> Thanks
> 
> -- Andreas
> 
> Mike Matrigali wrote:
> 
>> Assuming the row is not deleted, the question can only be answered
>> knowing the isolation level.  Basically the RowLocation can only
>> be counted on while a lock intent level lock is held on the table.
>> Intent table locks may be released as soon as a statement is
>> completed, or may be held to end of transaction depending on
>> the type of statement and type of isolation level.
>>
>> The things that may move an existing row in a heap are the compress
>> table system procedures.
>>
>> If a row is deleted then there are other factors.
>>
>> Rick Hillegas wrote:
>>
>>> Hello Store experts,
>>>
>>> How long is a RowLocation in a Heap good for? Provided that the row
>>> is not deleted, can you count on its RowLocation reliably identifying
>>> the row for the duration of a Statement, a Transaction, a Connection?
>>> Forever?
>>>
>>> Thanks,
>>> -Rick
>>>
>>>
>>
> 
> 

Re: RowLocation lifetime

Posted by Andreas Korneliussen <An...@Sun.COM>.
Hi,

  We are planning on using RowLocation to position the cursor when doing 
scrollable updatable cursors (i.e. when navigating backwards in the 
resultset) - see 
http://permalink.gmane.org/gmane.comp.apache.db.derby.devel/10028 for 
more details.

The fact that the RowLocation could become invalid does worry me a bit.

I did a test on a simple select statement, using transaction level 
read-uncommitted.

T1: select * from testtable

Then (before doing commit) I called on another connection:

T2: call SYSCS_UTIL.SYSCS_COMPRESS_TABLE("APP", "TESTTABLE", 0)
T2: Got exception:The exception 'SQL Exception: A lock could not be 
obtained within the time requested' was thrown while evaluating an 
expression.

So even in read-uncommitted mode, an intent-level lock on the table 
is set (good), and it seems to be held until I close the resultset or 
commit the transaction T1.

The problem I then see is for cursors that are held over commit 
(ResultSet.HOLD_CURSORS_OVER_COMMIT). Maybe we should not support it for 
scrollable updatable resultsets.

Anyway, we would really appreciate some comments on the 
specification Dag sent out two days ago, to ensure that we are on the 
right track.

Thanks

-- Andreas

Mike Matrigali wrote:
> Assuming the row is not deleted, the question can only be answered
> knowing the isolation level.  Basically the RowLocation can only
> be counted on while a lock intent level lock is held on the table.
> Intent table locks may be released as soon as a statement is
> completed, or may be held to end of transaction depending on
> the type of statement and type of isolation level.
> 
> The things that may move an existing row in a heap are the compress
> table system procedures.
> 
> If a row is deleted then there are other factors.
> 
> Rick Hillegas wrote:
> 
>> Hello Store experts,
>>
>> How long is a RowLocation in a Heap good for? Provided that the row is 
>> not deleted, can you count on its RowLocation reliably identifying the 
>> row for the duration of a Statement, a Transaction, a Connection? 
>> Forever?
>>
>> Thanks,
>> -Rick
>>
>>
> 


Re: RowLocation lifetime

Posted by Mike Matrigali <mi...@sbcglobal.net>.

Øystein Grøvlen wrote:
>>>>>> "MM" == Mike Matrigali <mi...@sbcglobal.net> writes:
> 
> 
>     MM> Øystein Grøvlen wrote:
> 
>     >>>>>>> "MM" == Mike Matrigali <mi...@sbcglobal.net> writes:
>     >> 
>     >> 
>     MM> It is only stable while some sort of stable table intent lock is held.
>     MM> Rows can move during a compress table operation.
>     >> 
>     >> I understand, when a record is moved to another page, its RecordId
>     >> will change.  Is this the only case where a RecordId will change?  If
>     >> yes, I would think one could solve the problem for insensitive result
>     >> sets by invalidating open result sets when an underlying table is
>     >> compressed.
>     >> 
>     >> Some questions to how compression works:
>     >> - Will RecordIds ever be reused or will the sequence number continue to
>     >> increase?
>     MM> Derby now supports 2 different compression techniques, basically one
>     MM> offline and one online.
> 
>     MM> SYSCS_UTIL.SYSCS_COMPRESS_TABLE() basically copies all rows from one
>     MM> table to another, so recordId's may be reused.  This requires table
>     MM> level locking and so is effectively offline.
> 
> Do not the new version of the table get a new container key?  If yes,
> I would think that recordId's are not reused since container key is
> part of the record id.

That is an interesting point.  I think the correct answer depends on how 
one is using the row location.  Currently the on-disk format of the only 
current implementation of a row location is defined by HeapRowLocation, 
and it stores only a page and record number.  While logically the record
id includes its container, the current usage requires only the page and
record id.  This was a significant space saving in indexes, but it does
restrict all rows in a given index to point into a single container - not
a problem with the current implementation.  The in-memory versions of
HeapRowLocation do associate each with a single container for locking.
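A location value of the shape described - page number plus record number only, with the container implied by context - might look like the following. Names and layout are illustrative, not Derby's HeapRowLocation:

```java
// Minimal value type: page + record number, no container id
// (the container is implied by the owning conglomerate).
final class PageRecordLocation {
    final long pageNumber;
    final int recordNumber;

    PageRecordLocation(long pageNumber, int recordNumber) {
        this.pageNumber = pageNumber;
        this.recordNumber = recordNumber;
    }

    @Override public boolean equals(Object o) {
        if (!(o instanceof PageRecordLocation)) return false;
        PageRecordLocation other = (PageRecordLocation) o;
        return pageNumber == other.pageNumber
            && recordNumber == other.recordNumber;
    }

    @Override public int hashCode() {
        return Long.hashCode(pageNumber) * 31 + recordNumber;
    }
}
```

Omitting the container id is the space saving mentioned above, at the cost of tying every location in an index to one container.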
> 
> 
>     MM> SYSCS_UTIL.SYSCS_INPLACE_COMPRESS_TABLE() moves records within the
>     MM> same table, and will not reuse RecordId's, but a given record can
>     MM> definitely change RecordId.  This requires row locks
>     MM> for purged/moved rows for most of it's work.  Giving back space to
>     MM> OS requires short time table level lock.
> 
> Good, I was not aware that there was an online version of compress.
> 
>     >> - How do you ensure that the old RecordIds are not encountered
>     >> during recovery?  Does the compression include a checkpoint?
>     MM> Neither really does anything special to stop old recordId's from
>     MM> being encountered during recovery.   With offline, redo recovery of
>     MM> the old table is basically a drop table, so either the table is there
>     MM> and we drop it again, or the table is already dropped and we do
>     MM> nothing.  In online, it is the normal case of redo of a record delete
>     MM> or a record purge; in either case redo will either see a version of
>     MM> the page where it can redo the delete or purge, or it will see from
>     MM> the version of the page that there is no work to do.
> 


Re: RowLocation lifetime

Posted by Øystein Grøvlen <Oy...@Sun.COM>.
>>>>> "MM" == Mike Matrigali <mi...@sbcglobal.net> writes:

    MM> Øystein Grøvlen wrote:

    >>>>>>> "MM" == Mike Matrigali <mi...@sbcglobal.net> writes:
    >> 
    >> 
    MM> It is only stable while some sort of stable table intent lock is held.
    MM> Rows can move during a compress table operation.
    >> 
    >> I understand, when a record is moved to another page, its RecordId
    >> will change.  Is this the only case where a RecordId will change?  If
    >> yes, I would think one could solve the problem for insensitive result
    >> sets by invalidating open result sets when an underlying table is
    >> compressed.
    >> 
    >> Some questions to how compression works:
    >> - Will RecordIds ever be reused or will the sequence number continue to
    >> increase?
    MM> Derby now supports 2 different compression techniques, basically one
    MM> offline and one online.

    MM> SYSCS_UTIL.SYSCS_COMPRESS_TABLE() basically copies all rows from one
    MM> table to another, so recordId's may be reused.  This requires table
    MM> level locking and so is effectively offline.

Does the new version of the table not get a new container key?  If so,
I would think that recordIds are not reused, since the container key is
part of the record id.


    MM> SYSCS_UTIL.SYSCS_INPLACE_COMPRESS_TABLE() moves records within the
    MM> same table, and will not reuse RecordId's, but a given record can
    MM> definitely change RecordId.  This requires row locks
    MM> for purged/moved rows for most of its work.  Giving space back to the
    MM> OS requires a short table-level lock.

Good, I was not aware that there was an online version of compress.

    >> - How do you ensure that the old RecordIds are not encountered
    >> during recovery?  Does the compression include a checkpoint?
    MM> Neither really does anything special to stop old recordIds from
    MM> being encountered during recovery.  With offline compression, redo
    MM> recovery of the old table is basically a drop table: either the table
    MM> is there and we drop it again, or the table is already dropped and we
    MM> do nothing.  With online compression it is the normal case of redoing
    MM> a record delete or a record purge; in either case redo will either see
    MM> a version of the page where it can redo the delete or purge, or it
    MM> will see from the version of the page that there is no work to do.

-- 
Øystein


Re: RowLocation lifetime

Posted by Mike Matrigali <mi...@sbcglobal.net>.

Øystein Grøvlen wrote:

>>>>>> "MM" == Mike Matrigali <mi...@sbcglobal.net> writes:
> 
> 
>     MM> It is only stable while some sort of stable table intent lock is held.
>     MM> Rows can move during a compress table operation.
> 
> I understand, when a record is moved to another page, its RecordId
> will change.  Is this the only case where a RecordId will change?  If
> yes, I would think one could solve the problem for insensitive result
> sets by invalidating open result sets when an underlying table is
> compressed.
> 
> Some questions about how compression works:
>    - Will RecordIds ever be reused or will the sequence number continue to
>      increase?
Derby now supports 2 different compression techniques, basically one
offline and one online.

SYSCS_UTIL.SYSCS_COMPRESS_TABLE() basically copies all rows from one
table to another, so recordIds may be reused.  This requires
table-level locking and so is effectively offline.

SYSCS_UTIL.SYSCS_INPLACE_COMPRESS_TABLE() moves records within the
same table, and will not reuse RecordIds, but a given record can
definitely change RecordId.  This requires row locks
for purged/moved rows for most of its work.  Giving space back to the
OS requires a short table-level lock.

>    - How do you ensure that the old RecordIds are not encountered
>      during recovery?  Does the compression include a checkpoint?
Neither really does anything special to stop old recordIds from
being encountered during recovery.  With offline compression, redo
recovery of the old table is basically a drop table: either the table
is there and we drop it again, or the table is already dropped and we
do nothing.  With online compression it is the normal case of redoing
a record delete or a record purge; in either case redo will either see
a version of the page where it can redo the delete or purge, or it
will see from the version of the page that there is no work to do.

SYSCS_UTIL.SYSCS_COMPRESS_TABLE():
    This activity is basically all done above the store level, store
    does not really know what is going on and there is no special
    casing of recovery.  The path is basically the same recovery path
    as the fast create/load of a new table/indexes.  As the last step
    in the transaction the language code switches the mapping of the
    user table to the underlying store conglomerates.

SYSCS_UTIL.SYSCS_INPLACE_COMPRESS_TABLE():
    For the row movement portion of this, there is no special recovery
    code.  Every row movement is just a logged delete and insert, with
    all the associated index updates in the same transaction. The row
    movement portion, and row purging portion of this compress is
    "online" in that it only needs short term page latches and short
    row level locked transactions.  The actual giving space back to the
    OS still needs a table level lock, and does require a checkpoint.
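For concreteness, the two procedures are invoked as follows ('APP' and
'MYTABLE' are placeholder schema and table names; parameter meanings are
per the Derby reference manual):

```sql
-- Offline: copies all rows into a new conglomerate under a table-level lock.
-- The third argument (SEQUENTIAL) non-zero rebuilds indexes one at a time.
CALL SYSCS_UTIL.SYSCS_COMPRESS_TABLE('APP', 'MYTABLE', 1);

-- Online: purge committed-deleted rows, defragment remaining rows, then
-- truncate free space at the end (only the truncate needs a table-level lock).
CALL SYSCS_UTIL.SYSCS_INPLACE_COMPRESS_TABLE('APP', 'MYTABLE', 1, 1, 1);
```

Either way, any RowLocation cached across such a call should be treated
as stale.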

>    - It seems like the compression heavily depends on that no log
>      records are generated during compression.  Do you have any ideas
>      of how to make compression on-line?  I guess you would need a
>      mapping between new and old RecordIds (i.e., every move would
>      have to be logged.)

There is no requirement that log records not be generated.  As you say,
every move is logged, which is why it is recommended that users use
offline compression if that option is available to them: it uses far
fewer system resources and will finish quicker, but it is definitely
not as concurrent.
> 

Re: RowLocation lifetime

Posted by Suresh Thalamati <su...@gmail.com>.
Øystein Grøvlen wrote:
>>>>>> "MM" == Mike Matrigali <mi...@sbcglobal.net> writes:
> 
> 
>     MM> It is only stable while some sort of stable table intent lock is held.
>     MM> Rows can move during a compress table operation.
> 
> I understand, when a record is moved to another page, its RecordId
> will change.  Is this the only case where a RecordId will change? 

It seems like that is the case for heap tables.  From a quick scan of the
code I find that only btrees set derby.storage.reusableRecordId=true, and
this property is not supposed to be set by users.

If an invalid handle is passed to store, I noticed that in one fetch case
it throws the exception SQLState.RAWSTORE_RECORD_VANISHED.


Thanks
-suresht



Re: RowLocation lifetime

Posted by Øystein Grøvlen <Oy...@Sun.COM>.
>>>>> "MM" == Mike Matrigali <mi...@sbcglobal.net> writes:

    MM> It is only stable while some sort of stable table intent lock is held.
    MM> Rows can move during a compress table operation.

I understand, when a record is moved to another page, its RecordId
will change.  Is this the only case where a RecordId will change?  If
yes, I would think one could solve the problem for insensitive result
sets by invalidating open result sets when an underlying table is
compressed.

Some questions about how compression works:
   - Will RecordIds ever be reused or will the sequence number continue to
     increase?
   - How do you ensure that the old RecordIds are not encountered
     during recovery?  Does the compression include a checkpoint?
   - It seems like the compression heavily depends on that no log
     records are generated during compression.  Do you have any ideas
     of how to make compression on-line?  I guess you would need a
     mapping between new and old RecordIds (i.e., every move would
     have to be logged.)

-- 
Øystein


Re: RowLocation lifetime

Posted by Øystein Grøvlen <Oy...@Sun.COM>.
>>>>> "MM" == Mike Matrigali <mi...@sbcglobal.net> writes:

    MM> Not really.
    MM> I don't think even a unique key constraint meets the requirements,
    MM> as the row can be deleted and a different row with the same key
    MM> can be inserted.

HeapRowLocation contains a RecordHandle which if I have understood
this correctly refers to a RecordId in this case.  Looking at the
code, RecordId consists of pageId and a sequence number within each
page.  It seems to me that RecordIds should be unique and not reused.
Could this not be used as a stable handle?
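To make that structure concrete, here is a toy model of the idea (invented
names, not Derby's actual classes; the real ones are RecordId, wrapped by
HeapRowLocation): a record handle is a (page number, per-page sequence
number) pair, and moving a row during compress yields a different handle.

```java
// Hypothetical simplified model of a store record handle, for illustration
// only. A record id is identified by which page holds the row and a
// sequence (slot) number assigned within that page.
final class SimpleRecordId {
    final long pageNumber;      // page in the container that holds the row
    final int sequenceNumber;   // assigned per page; not reused on that page

    SimpleRecordId(long pageNumber, int sequenceNumber) {
        this.pageNumber = pageNumber;
        this.sequenceNumber = sequenceNumber;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof SimpleRecordId)) return false;
        SimpleRecordId other = (SimpleRecordId) o;
        return pageNumber == other.pageNumber
            && sequenceNumber == other.sequenceNumber;
    }

    @Override
    public int hashCode() {
        return Long.hashCode(pageNumber) * 31 + sequenceNumber;
    }

    // Relocating a row (as in-place compress may do) produces a new id,
    // so any previously cached handle no longer compares equal.
    SimpleRecordId movedTo(long newPageNumber, int newSequenceNumber) {
        return new SimpleRecordId(newPageNumber, newSequenceNumber);
    }
}
```

The sketch only illustrates why a handle built this way is unique while the
row stays put, yet goes stale as soon as compress moves the row.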

    MM> Rick Hillegas wrote:
    >> Hi Mike,
    >> It appears  that RowLocations are stable under  some combinations of
    >> isolation  levels and  lock granularities  but that,  in  general, a
    >> RowLocation isn't guaranteed to be stable even for the duration of a
    >> statement. Is there some other, more stable handle on a heap row? By
    >> "more  stable" I mean  something which  uniquely identifies  the row
    >> across statements, transactions, or connections?

    >> Thanks,

    >> -Rick
    >> 





-- 
Øystein


Re: RowLocation lifetime

Posted by Mike Matrigali <mi...@sbcglobal.net>.
Not really.

I don't think even a unique key constraint meets the requirements,
as the row can be deleted and a different row with the same key can
be inserted.


Rick Hillegas wrote:
> Hi Mike,
> 
> It appears that RowLocations are stable under some combinations of 
> isolation levels and lock granularities but that, in general, a 
> RowLocation isn't guaranteed to be stable even for the duration of a 
> statement. Is there some other, more stable handle on a heap row? By 
> "more stable" I mean something which uniquely identifies the row across 
> statements, transactions, or connections?
> 
> Thanks,
> -Rick
> 
> 


Re: RowLocation lifetime

Posted by Rick Hillegas <Ri...@Sun.COM>.
Hi Mike,

It appears that RowLocations are stable under some combinations of 
isolation levels and lock granularities but that, in general, a 
RowLocation isn't guaranteed to be stable even for the duration of a 
statement. Is there some other, more stable handle on a heap row? By 
"more stable" I mean something which uniquely identifies the row across 
statements, transactions, or connections?

Thanks,
-Rick

Re: RowLocation lifetime

Posted by Mike Matrigali <mi...@sbcglobal.net>.
Assuming the row is not deleted, the question can only be answered
knowing the isolation level.  Basically, the RowLocation can only
be counted on while an intent-level lock is held on the table.
Intent table locks may be released as soon as a statement is
completed, or may be held to end of transaction, depending on
the type of statement and the isolation level.

The only operations that may move an existing row in a heap are the
compress table system procedures.

If a row is deleted then there are other factors.

Rick Hillegas wrote:
> Hello Store experts,
> 
> How long is a RowLocation in a Heap good for? Provided that the row is 
> not deleted, can you count on its RowLocation reliably identifying the 
> row for the duration of a Statement, a Transaction, a Connection? Forever?
> 
> Thanks,
> -Rick
> 
>