Posted to derby-dev@db.apache.org by "Mike Matrigali (JIRA)" <de...@db.apache.org> on 2006/03/07 02:13:29 UTC

[jira] Resolved: (DERBY-670) improve space reclamation from deleted blob/clob columns which are bigger than a page

     [ http://issues.apache.org/jira/browse/DERBY-670?page=all ]
     
Mike Matrigali resolved DERBY-670:
----------------------------------

    Resolution: Fixed

This patch just uses the existing row and column header information to schedule deleted rows for
immediate post-commit reclamation if the row is long or has a long column.  The overhead of
checking just the in-memory part of the row on the current cached page was not much.
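To make the mechanism concrete, here is a minimal, self-contained sketch of the delete-time check described above. All names (ReclaimSketch, RecordHeader, postCommitQueue) are illustrative stand-ins, not Derby's actual internal classes; the real logic lives in classes like BasePage and StoredRecordHeader touched by this commit.

```java
import java.util.ArrayDeque;
import java.util.Queue;

public class ReclaimSketch {
    // Minimal stand-in for a stored record header: tracks whether the
    // whole row overflowed, and whether any individual column overflowed
    // into a long-column page chain. (Hypothetical model, not Derby code.)
    static class RecordHeader {
        final boolean rowOverflowed;
        final boolean hasLongColumn;
        RecordHeader(boolean rowOverflowed, boolean hasLongColumn) {
            this.rowOverflowed = rowOverflowed;
            this.hasLongColumn = hasLongColumn;
        }
    }

    // Work items scheduled for background space reclamation after commit.
    final Queue<RecordHeader> postCommitQueue = new ArrayDeque<>();

    // Delete path: only the in-memory header on the cached page is
    // inspected, so the extra cost per delete is a couple of flag checks.
    void deleteRecord(RecordHeader header) {
        if (header.rowOverflowed || header.hasLongColumn) {
            postCommitQueue.add(header); // reclaim chain pages post commit
        }
    }

    public static void main(String[] args) {
        ReclaimSketch store = new ReclaimSketch();
        store.deleteRecord(new RecordHeader(false, true));  // long column
        store.deleteRecord(new RecordHeader(false, false)); // short row
        System.out.println(store.postCommitQueue.size());   // 1 item queued
    }
}
```

The point of the sketch is that the long-row/long-column decision is made from header flags already in memory, rather than by walking the overflow chain itself.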

Committed:
m2_142:5>svn commit

Sending        java\engine\org\apache\derby\iapi\store\raw\Page.java
Sending        java\engine\org\apache\derby\impl\store\access\conglomerate\GenericConglomerateController.java
Sending        java\engine\org\apache\derby\impl\store\raw\data\BasePage.java
Sending        java\engine\org\apache\derby\impl\store\raw\data\StoredRecordHeader.java
Adding         java\testing\org\apache\derbyTesting\functionTests\master\st_reclaim_longcol.out
Sending        java\testing\org\apache\derbyTesting\functionTests\suites\storetests.runall
Sending        java\testing\org\apache\derbyTesting\functionTests\tests\store\BaseTest.java
Sending        java\testing\org\apache\derbyTesting\functionTests\tests\store\OnlineCompressTest.java
Adding         java\testing\org\apache\derbyTesting\functionTests\tests\storetests\st_reclaim_longcol.java
Transmitting file data .........
Committed revision 383663.

> improve space reclamation from deleted blob/clob columns which are bigger than a page
> -------------------------------------------------------------------------------------
>
>          Key: DERBY-670
>          URL: http://issues.apache.org/jira/browse/DERBY-670
>      Project: Derby
>         Type: Improvement
>   Components: Store
>     Versions: 10.1.1.0
>     Reporter: Mike Matrigali
>     Assignee: Mike Matrigali
>     Priority: Minor

>
> Currently, Derby space reclamation is initiated only after all the rows on a
> MAIN page have been deleted.  When blob/clobs larger than a page are involved,
> the row on the main page keeps only a pointer to an overflow page chain, so the
> main-page rows can be very small; many rows may need to be deleted before the
> space associated with the blob/clob is cleaned up and reused.
> In the extreme case of a table with only an int key and one blob column of
> N bytes, on a 32k page Derby probably stores more than 1000 rows.  If the app
> simply inserts and deletes a single row repeatedly, the table will grow to
> 1000 * N bytes, when from the user's point of view it should only be on the
> order of N bytes.
> It would seem reasonable to queue post-commit work for any delete that
> includes a blob/clob that has been chained.  This is in keeping with
> the current policy of queuing the work whenever it is possible we can
> reclaim an entire page.
> The problem is that there would be an extra cost at delete time to
> determine whether the row being deleted has a blob/clob page chain.  That
> information is stored in the field header of the particular column, so
> currently the only way to check would be to inspect every field header of
> every column in the deleted row.  From the store's point of view, any
> column can be "long" with a page chain -- the store doesn't know that only
> blob/clob datatypes can cause this behavior.
> Some options include:
> 1 At table create time, ask the language layer whether a long column is at
>   all possible, so the check is never done when it is not necessary.
> 2 Maintain a bit in the container header indicating whether any long row
>   exists -- either a simple 1/0 or a reference count.  Note that this
>   information is easily available at insert time.
> 3 Maintain a bit in the page header indicating whether any long rows exist.
> 4 Maintain a bit in the record header indicating whether any long columns
>   exist.  Note that the existing bit only indicates whether the whole
>   record is overflowed, not whether a single column is overflowed.
> Options 1-3 would then be used to perform the slow check at delete time
> only when necessary.
> I don't really like option 1 unless we change the storage interface to
> actually check/guarantee the behavior.  I lean toward option 4, but it is
> something of a row format change.  Given that the system has room reserved
> for this bit, I believe we can use it without any upgrade-time work --
> though I believe it can only be set after a hard upgrade, as there may be
> old code which does not expect it to be set.  Soft upgrades won't get the
> benefit, and existing data won't get the benefit.
> Any other ideas out there?
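Option 4 above amounts to reserving one spare bit in the record header status byte. A rough sketch of the idea follows; the bit values and method names here are made up for illustration, and Derby's actual StoredRecordHeader layout may differ.

```java
public class StatusBits {
    // Existing flag: the whole record overflowed to another page.
    static final int RECORD_OVERFLOW        = 0x01;
    // Proposed new flag (hypothetical value): some column has a page chain.
    static final int RECORD_HAS_LONG_COLUMN = 0x02;

    // Set at insert time, when the store already knows it built a chain.
    static int markLongColumn(int status) {
        return status | RECORD_HAS_LONG_COLUMN;
    }

    // Cheap delete-time test: one mask check instead of walking every
    // field header of every column in the row.
    static boolean needsPostCommitReclaim(int status) {
        return (status & (RECORD_OVERFLOW | RECORD_HAS_LONG_COLUMN)) != 0;
    }

    public static void main(String[] args) {
        int status = 0;
        status = markLongColumn(status);
        System.out.println(needsPostCommitReclaim(status)); // true
        System.out.println(needsPostCommitReclaim(0));      // false
    }
}
```

This also illustrates the upgrade concern in the message: old code reading a header with an unknown bit set must tolerate it, which is why the bit could only be set after a hard upgrade.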

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira