You are viewing a plain text version of this content. The canonical link for it is here.
Posted to derby-dev@db.apache.org by Øystein Grøvlen <Oy...@Sun.COM> on 2006/03/31 16:13:22 UTC

Does in-place compress really defragment?

I tried an experiment with on-line compress and it seems like no space
is freed unless I delete records at the end of the heap:

1. Deleted every third record of a table
2. Inline compress with purge&defragment. File size did not change
3. Deleted every second of the remaining records
4. Inline compress with purge&defragment. File size did not change
5. Deleted the last third of the remaining records
6. Inline compress with purge&defragment. File size reduced by 1/3. 
7. Deleted first half of the remaining records
8. Inline compress with purge&defragment. File size did not change

Is this how it is supposed to be?  I would have thought that each
compress would defragment the table and free space, but it seems like
only empty space at the end of a table is freed.  Trace of what I did
below.  (There are 1536 records in t.  The records have primary keys
in range [0,1535] and was inserted in sorted order on primary key.
For all records j==mod(i,3).)

--
Øystein

ij> create table t1 (i integer primary key, j integer, c varchar(300));
0 rows inserted/updated/deleted
ij> insert into t1 select * from t;
1536 rows inserted/updated/deleted
ij> delete from t1 where j=1;
512 rows inserted/updated/deleted
ij> CALL SYSCS_UTIL.SYSCS_INPLACE_COMPRESS_TABLE('APP', 'T1', 1, 1, 1);
0 rows inserted/updated/deleted
ij> delete from t1 where j=2;
512 rows inserted/updated/deleted
ij> CALL SYSCS_UTIL.SYSCS_INPLACE_COMPRESS_TABLE('APP', 'T1', 1, 1, 1);
0 rows inserted/updated/deleted
ij> delete from t1 where i > 1024;
170 rows inserted/updated/deleted
ij> CALL SYSCS_UTIL.SYSCS_INPLACE_COMPRESS_TABLE('APP', 'T1', 1, 1, 1);
0 rows inserted/updated/deleted
ij> delete from t1 where i < 512;
171 rows inserted/updated/deleted
ij> CALL SYSCS_UTIL.SYSCS_INPLACE_COMPRESS_TABLE('APP', 'T1', 1, 1, 1);
0 rows inserted/updated/deleted
ij> 


Re: Does in-place compress really defragment?

Posted by Oystein Grovlen - Sun Norway <Oy...@Sun.COM>.
Bryan Pendleton wrote:
> Øystein Grøvlen wrote:
>> I tried an experiment with on-line compress and it seems like no space
>> is freed unless I delete records at the end of the heap:
> 
> It does seem like the documentation allows for this:
> 
>           SYSCS_COMPRESS_TABLE is guaranteed to recover the maximum amount
>           of free space, at the cost of temporarily creating new tables
>           and indexes before the statement in committed. 
> SYSCS_INPLACE_COMPRESS
>           attempts to reclaim space within the same table, but cannot 
> guarantee
>           it will recover all available space.
> 
> Did you try your same experiment with full compress?
> 

No, since that is quite a different mechanism where effectively a new 
table is created, the data is moved, and the old table dropped.  I was 
interested in exploring interactions with in-place compress and holdable 
cursors.

According to the documentation for in-place compress, I would have 
assumed that I should have been able to release space regardless of 
where in the file there are free space when I specify both PURGE_ROWS 
and DEFRAGMENT_ROWS in addition to TRUNCATE_END.  From the documentation:

PURGE_ROWS
     If PURGE_ROWS is set to a non-zero value, then a single pass is 
made through the table which will purge committed deleted rows from the 
table. This space is then available for future inserted rows, but 
remains allocated to the table. As this option scans every page of the 
table, its performance is linearly related to the size of the table.
DEFRAGMENT_ROWS
     If DEFRAGMENT_ROWS is set to a non-zero value, then a single 
defragment pass is made which will move existing rows from the end of 
the table towards the front of the table. The goal of defragmentation is 
to empty a set of pages at the end of the table which can then be 
returned to the operating system by the TRUNCATE_END option. It is 
recommended to only run DEFRAGMENT_ROWS if also specifying the 
TRUNCATE_END option. The DEFRAGMENT_ROWS option scans the whole table 
and needs to update index entries for every base table row move, so the 
execution time is linearly related to the size of the table.
TRUNCATE_END
     If TRUNCATE_END is set to a non-zero value, then all contiguous 
pages at the end of the table will be returned to the operating system. 
Running the PURGE_ROWS and/or DEFRAGMENT_ROWS options may increase the 
number of pages affected. This option by itself performs no scans of the 
table.

My case indicates that the table is not defragmented.

-- 
Øystein

Re: Does in-place compress really defragment?

Posted by Bryan Pendleton <bp...@amberpoint.com>.
Øystein Grøvlen wrote:
> I tried an experiment with on-line compress and it seems like no space
> is freed unless I delete records at the end of the heap:

It does seem like the documentation allows for this:

           SYSCS_COMPRESS_TABLE is guaranteed to recover the maximum amount
           of free space, at the cost of temporarily creating new tables
           and indexes before the statement in committed. SYSCS_INPLACE_COMPRESS
           attempts to reclaim space within the same table, but cannot guarantee
           it will recover all available space.

Did you try your same experiment with full compress?

thanks,

bryan


Re: Does in-place compress really defragment?

Posted by Mike Matrigali <mi...@sbcglobal.net>.

Oystein Grovlen - Sun Norway wrote:

> 
> OK, here is a replay of just step 7 and 8 including queries of the space 
> table vti:
> 
> ij> create table t1 (i integer primary key, j integer, c varchar(300));
> 0 rows inserted/updated/deleted
> ij> insert into t1 select * from t;
> 1536 rows inserted/updated/deleted
> ij> select conglomeratename, isindex, numallocatedpages, numfreepages, 
> pagesize, estimspacesaving from new 
> org.apache.derby.diag.SpaceTable('T1') t order by conglomeratename;
> CONGLOMERATENAME
> |ISIND&|NUMALLOCATEDPAGES   |NUMFREEPAGES        |PAGESIZE 
> |ESTIMSPACESAVING
> ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ 
> 
> SQL060404033237400 
>                                                         |1     |10 
>              |0                   |4096       |0
> T1                                                         |0     |103 
>              |0                   |4096       |0
> 
> 2 rows selected
> ij> delete from t1 where i < 512;
> 512 rows inserted/updated/deleted
> ij> select conglomeratename, isindex, numallocatedpages, numfreepages, 
> pagesize, estimspacesaving from new 
> org.apache.derby.diag.SpaceTable('T1') t order by conglomeratename;
> CONGLOMERATENAME
> |ISIND&|NUMALLOCATEDPAGES   |NUMFREEPAGES        |PAGESIZE 
> |ESTIMSPACESAVING
> ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ 
> 
> SQL060404033237400 
>                                                         |1     |8 
>              |2                   |4096       |8192
> T1                                                         |0     |70 
>              |33                  |4096       |135168
> 
> 2 rows selected
> ij> CALL SYSCS_UTIL.SYSCS_INPLACE_COMPRESS_TABLE('APP', 'T1', 1, 1, 1);
> 0 rows inserted/updated/deleted
> ij> select conglomeratename, isindex, numallocatedpages, numfreepages, 
> pagesize, estimspacesaving from new 
> org.apache.derby.diag.SpaceTable('T1') t order by conglomeratename;
> CONGLOMERATENAME
> |ISIND&|NUMALLOCATEDPAGES   |NUMFREEPAGES        |PAGESIZE 
> |ESTIMSPACESAVING
> ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ 
> 
> SQL060404033237400 
>                                                         |1     |8 
>              |2                   |4096       |8192
> T1                                                         |0     |103 
>              |0                   |4096       |0
> 
> 2 rows selected
> 
> 
> 
> Looks like compress reuses the empty pages without freeing other pages.
> 

This looks like a bug, I will report and am actively looking at it.


Re: Does in-place compress really defragment?

Posted by Oystein Grovlen - Sun Norway <Oy...@Sun.COM>.
Mike Matrigali wrote:

> I am most concerned about 7 to 8, that should reduce space assuming
> the 171 rows don't all fit on a page.

Does this mean that it is only space in completely empty pages that will 
be reused when defragmenting?

> 
> one question, what sizes of rows are you using (basically are there
> actually 300 characters in your "c" column), for quick tests I always
> use char columns as they use whatever space is declared no matter
> the data.  Varchar uses variable length storage.  

All rows have 200 character strings for the "c" column.

> It would be 
> interesting to throw in a select from the space table vti after every
> statement to see what is going on - see
> opensource/java/testing/org/apache/derbyTesting/functionTests/tests/store/SpaceTable.sql 
> test for some usage.

OK, here is a replay of just step 7 and 8 including queries of the space 
table vti:

ij> create table t1 (i integer primary key, j integer, c varchar(300));
0 rows inserted/updated/deleted
ij> insert into t1 select * from t;
1536 rows inserted/updated/deleted
ij> select conglomeratename, isindex, numallocatedpages, numfreepages, 
pagesize, estimspacesaving from new 
org.apache.derby.diag.SpaceTable('T1') t order by conglomeratename;
CONGLOMERATENAME 
 
|ISIND&|NUMALLOCATEDPAGES   |NUMFREEPAGES        |PAGESIZE 
|ESTIMSPACESAVING
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL060404033237400 
                                                         |1     |10 
              |0                   |4096       |0
T1 
                                                         |0     |103 
              |0                   |4096       |0

2 rows selected
ij> delete from t1 where i < 512;
512 rows inserted/updated/deleted
ij> select conglomeratename, isindex, numallocatedpages, numfreepages, 
pagesize, estimspacesaving from new 
org.apache.derby.diag.SpaceTable('T1') t order by conglomeratename;
CONGLOMERATENAME 
 
|ISIND&|NUMALLOCATEDPAGES   |NUMFREEPAGES        |PAGESIZE 
|ESTIMSPACESAVING
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL060404033237400 
                                                         |1     |8 
              |2                   |4096       |8192
T1 
                                                         |0     |70 
              |33                  |4096       |135168

2 rows selected
ij> CALL SYSCS_UTIL.SYSCS_INPLACE_COMPRESS_TABLE('APP', 'T1', 1, 1, 1);
0 rows inserted/updated/deleted
ij> select conglomeratename, isindex, numallocatedpages, numfreepages, 
pagesize, estimspacesaving from new 
org.apache.derby.diag.SpaceTable('T1') t order by conglomeratename;
CONGLOMERATENAME 
 
|ISIND&|NUMALLOCATEDPAGES   |NUMFREEPAGES        |PAGESIZE 
|ESTIMSPACESAVING
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL060404033237400 
                                                         |1     |8 
              |2                   |4096       |8192
T1 
                                                         |0     |103 
              |0                   |4096       |0

2 rows selected



Looks like compress reuses the empty pages without freeing other pages.

-- 
Øystein

Re: Does in-place compress really defragment?

Posted by Mike Matrigali <mi...@sbcglobal.net>.
there is a lot more work that could be done on the inplace compress,
my guess is that the defragment part is not finding space on the 1/3
empty pages to move the rows to.  The defragment loop basically takes
rows at the "end" of the table and inserts them where the existing
technology tells them there is space (after being reset to look from
beginning rather than end).  It is true that after the defragment
run the only space that can be returned to the OS is the free pages
at the "end", but the defragment run tries to move rows from end to
beginning.

There are a lot of smarter thing that could be done, creating an
in memory data structure with efficient access to number of free
bytes per page could then be used to do a better job of filling
"half" empty pages.

I am most concerned about 7 to 8, that should reduce space assuming
the 171 rows don't all fit on a page.

one question, what sizes of rows are you using (basically are there
actually 300 characters in your "c" column), for quick tests I always
use char columns as they use whatever space is declared no matter
the data.  Varchar uses variable length storage.  It would be 
interesting to throw in a select from the space table vti after every
statement to see what is going on - see
opensource/java/testing/org/apache/derbyTesting/functionTests/tests/store/SpaceTable.sql 

test for some usage.

For major space changes in the table I would recommend the offline
compress, it guarantees to reclaim ALL possible space in tables and
indexes and has the benefit of probably getting better clustering
on disk.

Øystein Grøvlen wrote:
> I tried an experiment with on-line compress and it seems like no space
> is freed unless I delete records at the end of the heap:
> 
> 1. Deleted every third record of a table
> 2. Inline compress with purge&defragment. File size did not change
> 3. Deleted every second of the remaining records
> 4. Inline compress with purge&defragment. File size did not change
> 5. Deleted the last third of the remaining records
> 6. Inline compress with purge&defragment. File size reduced by 1/3. 
> 7. Deleted first half of the remaining records
> 8. Inline compress with purge&defragment. File size did not change
> 
> Is this how it is supposed to be?  I would have thought that each
> compress would defragment the table and free space, but it seems like
> only empty space at the end of a table is freed.  Trace of what I did
> below.  (There are 1536 records in t.  The records have primary keys
> in range [0,1535] and was inserted in sorted order on primary key.
> For all records j==mod(i,3).)
> 
> --
> Øystein
> 
> ij> create table t1 (i integer primary key, j integer, c varchar(300));
> 0 rows inserted/updated/deleted
> ij> insert into t1 select * from t;
> 1536 rows inserted/updated/deleted
> ij> delete from t1 where j=1;
> 512 rows inserted/updated/deleted
> ij> CALL SYSCS_UTIL.SYSCS_INPLACE_COMPRESS_TABLE('APP', 'T1', 1, 1, 1);
> 0 rows inserted/updated/deleted
> ij> delete from t1 where j=2;
> 512 rows inserted/updated/deleted
> ij> CALL SYSCS_UTIL.SYSCS_INPLACE_COMPRESS_TABLE('APP', 'T1', 1, 1, 1);
> 0 rows inserted/updated/deleted
> ij> delete from t1 where i > 1024;
> 170 rows inserted/updated/deleted
> ij> CALL SYSCS_UTIL.SYSCS_INPLACE_COMPRESS_TABLE('APP', 'T1', 1, 1, 1);
> 0 rows inserted/updated/deleted
> ij> delete from t1 where i < 512;
> 171 rows inserted/updated/deleted
> ij> CALL SYSCS_UTIL.SYSCS_INPLACE_COMPRESS_TABLE('APP', 'T1', 1, 1, 1);
> 0 rows inserted/updated/deleted
> ij> 
> 
> 
> 


Re: Does in-place compress really defragment?

Posted by Manjula G Kutty <ma...@gmail.com>.
Øystein Grøvlen wrote:

>I tried an experiment with on-line compress and it seems like no space
>is freed unless I delete records at the end of the heap:
>
>1. Deleted every third record of a table
>2. Inline compress with purge&defragment. File size did not change
>3. Deleted every second of the remaining records
>4. Inline compress with purge&defragment. File size did not change
>5. Deleted the last third of the remaining records
>6. Inline compress with purge&defragment. File size reduced by 1/3. 
>7. Deleted first half of the remaining records
>8. Inline compress with purge&defragment. File size did not change
>
>Is this how it is supposed to be?  I would have thought that each
>compress would defragment the table and free space, but it seems like
>only empty space at the end of a table is freed.  Trace of what I did
>below.  (There are 1536 records in t.  The records have primary keys
>in range [0,1535] and was inserted in sorted order on primary key.
>For all records j==mod(i,3).)
>
>--
>Øystein
>
>ij> create table t1 (i integer primary key, j integer, c varchar(300));
>0 rows inserted/updated/deleted
>ij> insert into t1 select * from t;
>1536 rows inserted/updated/deleted
>ij> delete from t1 where j=1;
>512 rows inserted/updated/deleted
>ij> CALL SYSCS_UTIL.SYSCS_INPLACE_COMPRESS_TABLE('APP', 'T1', 1, 1, 1);
>0 rows inserted/updated/deleted
>ij> delete from t1 where j=2;
>512 rows inserted/updated/deleted
>ij> CALL SYSCS_UTIL.SYSCS_INPLACE_COMPRESS_TABLE('APP', 'T1', 1, 1, 1);
>0 rows inserted/updated/deleted
>ij> delete from t1 where i > 1024;
>170 rows inserted/updated/deleted
>ij> CALL SYSCS_UTIL.SYSCS_INPLACE_COMPRESS_TABLE('APP', 'T1', 1, 1, 1);
>0 rows inserted/updated/deleted
>ij> delete from t1 where i < 512;
>171 rows inserted/updated/deleted
>ij> CALL SYSCS_UTIL.SYSCS_INPLACE_COMPRESS_TABLE('APP', 'T1', 1, 1, 1);
>0 rows inserted/updated/deleted
>ij> 
>
>
>  
>
Hi,

I have used inplace compression in my test , which is a java program,. I 
called inpleace compression every one hour intervel and I'm deleting 
rows randomly. So I'm sure that they are not getting deleted from the 
bottom of the table. I 'm seeing notable increase in the free space 
after using inplace compression.

Thanks
Manjula