You are viewing a plain text version of this content. The canonical link for it is here.
Posted to derby-user@db.apache.org by anthonyri <an...@inbox.com> on 2011/10/21 23:29:11 UTC

Database size larger than expected

Hi,

I have a Derby database with two records, the first has 5000 records (of
int, double, double, double, double - so approx 34 bytes per record), the
second has 1000 records (of int, varchar(10), varchar(40) - so approx 102
bytes per record).

5000 * 34 = 166k
1000 * 102 = 100k
Total = 266k (if it was in a text file)

What I don't understand is why the database (seg0 folder) is 1.2MB?

I have compacted the database using;

call SYSCS_UTIL.SYSCS_INPLACE_COMPRESS_TABLE('APP', 'mapObject', 1, 1, 1);
call SYSCS_UTIL.SYSCS_COMPRESS_TABLE('APP', 'MAPOBJECT', 1);

Any thoughts please?

Anthony
-- 
View this message in context: http://old.nabble.com/Database-size-larger-than-expected-tp32699233p32699233.html
Sent from the Apache Derby Users mailing list archive at Nabble.com.


Re: Database size larger than expected

Posted by Karl Weber <ka...@googlemail.com>.
Hi Mike,

Am Samstag, 22. Oktober 2011, 00:24:06 schrieb Mike Matrigali:
> There are a number of factors:
> 
[...]
> 
> 3) Each record has overhead and each field has overhead that is variable.

Could you elaborate a litte bit more on that? I already asked about this some 
while ago [1], in particular with reference to the article in [2]. 
Unfortunately I posted my original question to the wrong list, according to 
[2] I should have posted it to derby-dev, not to derby-user.

It would be really great if one could estimate the space requirements on disk 
for a given table or index. 

The other points you mentioned are also interesting. Do you think they could 
be added to [2]? Perhaps a particular example might also help to unterstand 
[2] a little bit better?

[1] http://old.nabble.com/Space-requirements-on-disk-to32618327.html

[2] http://db.apache.org/derby/papers/pageformats.html

Re: Database size larger than expected

Posted by anthonyri <an...@inbox.com>.
Mike,

Thank you for the explanation.

When I scale up to holding a million records, then the database size is more
in line with what I would expect. As per your explanation, small databases
already have an 'empty' size, which appears to be why it was larger than I
expected.

Anthony


Mike Matrigali wrote:
> 
> There are a number of factors:
> 
> 1) an "empty" derby database takes up 770k in 10.9.  This is 71 system 
> catalogs and indexes on those system catalogs.  Even empty tables still
> have inital empty pages allocated and written, in the default they take
> at lease 8k each.  Each file in seg0 represents either a table or an
> index.  I assume you meant you have a Derby database with 2 tables, so
> only 2 of the files in seg0 have data for your tables.  They are likely
> the 2 biggest files in seg0.
> 
> 2) By default the system also reserves empty space in pages for future 
> updates.  I think the default in 20% reserved.  There are properties to
> override this.
> 
> 3) Each record has overhead and each field has overhead that is variable.
> 
> 4) The tables will begin allocating 8 pages at a time when they need
> a new empty page.  So worst case could be 7 empty pages at the end.
> 
> 
> anthonyri wrote:
>> David,
>> 
>> No indexes I am afraid, should have mentioned that, sorry.
>> 
>> Anthony
>> 
>> 
>> David Zanter wrote:
>>> Do you have Indexes on those Tables?
>>>
>>>
>>> On Fri, Oct 21, 2011 at 5:29 PM, anthonyri <an...@inbox.com>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> I have a Derby database with two records, the first has 5000 records
>>>> (of
>>>> int, double, double, double, double - so approx 34 bytes per record),
>>>> the
>>>> second has 1000 records (of int, varchar(10), varchar(40) - so approx
>>>> 102
>>>> bytes per record).
>>>>
>>>> 5000 * 34 = 166k
>>>> 1000 * 102 = 100k
>>>> Total = 266k (if it was in a text file)
>>>>
>>>> What I don't understand is why the database (seg0 folder) is 1.2MB?
>>>>
>>>> I have compacted the database using;
>>>>
>>>> call SYSCS_UTIL.SYSCS_INPLACE_COMPRESS_TABLE('APP', 'mapObject', 1, 1,
>>>> 1);
>>>> call SYSCS_UTIL.SYSCS_COMPRESS_TABLE('APP', 'MAPOBJECT', 1);
>>>>
>>>> Any thoughts please?
>>>>
>>>> Anthony
>>>> --
>>>> View this message in context:
>>>> http://old.nabble.com/Database-size-larger-than-expected-tp32699233p32699233.html
>>>> Sent from the Apache Derby Users mailing list archive at Nabble.com.
>>>>
>>>>
>>>
>> 
> 
> 
> 

-- 
View this message in context: http://old.nabble.com/Database-size-larger-than-expected-tp32699233p32700588.html
Sent from the Apache Derby Users mailing list archive at Nabble.com.


Re: Database size larger than expected

Posted by Mike Matrigali <mi...@sbcglobal.net>.
There are a number of factors:

1) an "empty" derby database takes up 770k in 10.9.  This is 71 system 
catalogs and indexes on those system catalogs.  Even empty tables still
have inital empty pages allocated and written, in the default they take
at lease 8k each.  Each file in seg0 represents either a table or an
index.  I assume you meant you have a Derby database with 2 tables, so
only 2 of the files in seg0 have data for your tables.  They are likely
the 2 biggest files in seg0.

2) By default the system also reserves empty space in pages for future 
updates.  I think the default in 20% reserved.  There are properties to
override this.

3) Each record has overhead and each field has overhead that is variable.

4) The tables will begin allocating 8 pages at a time when they need
a new empty page.  So worst case could be 7 empty pages at the end.


anthonyri wrote:
> David,
> 
> No indexes I am afraid, should have mentioned that, sorry.
> 
> Anthony
> 
> 
> David Zanter wrote:
>> Do you have Indexes on those Tables?
>>
>>
>> On Fri, Oct 21, 2011 at 5:29 PM, anthonyri <an...@inbox.com> wrote:
>>
>>> Hi,
>>>
>>> I have a Derby database with two records, the first has 5000 records (of
>>> int, double, double, double, double - so approx 34 bytes per record), the
>>> second has 1000 records (of int, varchar(10), varchar(40) - so approx 102
>>> bytes per record).
>>>
>>> 5000 * 34 = 166k
>>> 1000 * 102 = 100k
>>> Total = 266k (if it was in a text file)
>>>
>>> What I don't understand is why the database (seg0 folder) is 1.2MB?
>>>
>>> I have compacted the database using;
>>>
>>> call SYSCS_UTIL.SYSCS_INPLACE_COMPRESS_TABLE('APP', 'mapObject', 1, 1,
>>> 1);
>>> call SYSCS_UTIL.SYSCS_COMPRESS_TABLE('APP', 'MAPOBJECT', 1);
>>>
>>> Any thoughts please?
>>>
>>> Anthony
>>> --
>>> View this message in context:
>>> http://old.nabble.com/Database-size-larger-than-expected-tp32699233p32699233.html
>>> Sent from the Apache Derby Users mailing list archive at Nabble.com.
>>>
>>>
>>
> 


Re: Database size larger than expected

Posted by anthonyri <an...@inbox.com>.
David,

No indexes I am afraid, should have mentioned that, sorry.

Anthony


David Zanter wrote:
> 
> Do you have Indexes on those Tables?
> 
> 
> On Fri, Oct 21, 2011 at 5:29 PM, anthonyri <an...@inbox.com> wrote:
> 
>>
>> Hi,
>>
>> I have a Derby database with two records, the first has 5000 records (of
>> int, double, double, double, double - so approx 34 bytes per record), the
>> second has 1000 records (of int, varchar(10), varchar(40) - so approx 102
>> bytes per record).
>>
>> 5000 * 34 = 166k
>> 1000 * 102 = 100k
>> Total = 266k (if it was in a text file)
>>
>> What I don't understand is why the database (seg0 folder) is 1.2MB?
>>
>> I have compacted the database using;
>>
>> call SYSCS_UTIL.SYSCS_INPLACE_COMPRESS_TABLE('APP', 'mapObject', 1, 1,
>> 1);
>> call SYSCS_UTIL.SYSCS_COMPRESS_TABLE('APP', 'MAPOBJECT', 1);
>>
>> Any thoughts please?
>>
>> Anthony
>> --
>> View this message in context:
>> http://old.nabble.com/Database-size-larger-than-expected-tp32699233p32699233.html
>> Sent from the Apache Derby Users mailing list archive at Nabble.com.
>>
>>
> 
> 

-- 
View this message in context: http://old.nabble.com/Database-size-larger-than-expected-tp32699233p32699352.html
Sent from the Apache Derby Users mailing list archive at Nabble.com.


Re: Database size larger than expected

Posted by David Zanter <dz...@gmail.com>.
Do you have Indexes on those Tables?


On Fri, Oct 21, 2011 at 5:29 PM, anthonyri <an...@inbox.com> wrote:

>
> Hi,
>
> I have a Derby database with two records, the first has 5000 records (of
> int, double, double, double, double - so approx 34 bytes per record), the
> second has 1000 records (of int, varchar(10), varchar(40) - so approx 102
> bytes per record).
>
> 5000 * 34 = 166k
> 1000 * 102 = 100k
> Total = 266k (if it was in a text file)
>
> What I don't understand is why the database (seg0 folder) is 1.2MB?
>
> I have compacted the database using;
>
> call SYSCS_UTIL.SYSCS_INPLACE_COMPRESS_TABLE('APP', 'mapObject', 1, 1, 1);
> call SYSCS_UTIL.SYSCS_COMPRESS_TABLE('APP', 'MAPOBJECT', 1);
>
> Any thoughts please?
>
> Anthony
> --
> View this message in context:
> http://old.nabble.com/Database-size-larger-than-expected-tp32699233p32699233.html
> Sent from the Apache Derby Users mailing list archive at Nabble.com.
>
>