You are viewing a plain text version of this content. The canonical link for it is here.
Posted to derby-user@db.apache.org by S <s....@gmail.com> on 2006/11/03 12:01:29 UTC

table size estimation

This is a follow up email to my earlier posting regarding storage. Based on
the query given,

select * from NEW org.apache.derby.diag.SpaceTable('SYS','SYSTABLES') AS x;


using it for my table, the estimated size of a record is coming to be 131
bytes. The table consists of 5 bigints, 3 smallints, 1 timestamp and a
varchar. The avg length of the varchar is 14chars. So, total size of the
record size should have been 5*8+3*2+14+(bytes for timestamp)+any additional
per field/record data all put together shouldn't be more than 70/80 bytes.
So, why am I getting 131 bytes?

Re: table size estimation

Posted by Mike Matrigali <mi...@sbcglobal.net>.
I have no idea if this applies, but one problem with using this method
is that the space table will report space that is allocated to deleted
rows.  reclamation of deleted rows is deferred so if you have deleted
rows, or aborted inserts, or attempted to insert rows but failed due
to duplicate key violation then the table may be using space to store
deleted rows.

Also by default tables reserve empty space in the page to allow for the
rows to expand during future updates:
http://db.apache.org/derby/docs/dev/tuning/rtunproper28026.html

To see the overheads for per row and per field see, note page size 
affects overhead size so you will need to know page size, don't forget
to account for page header and offset table overhead:
http://db.apache.org/derby/papers/pageformats.html#records

Also the last page in the table is not going to be filled so if you 
don't have a lot of rows in the table this may dominate the calculation.
So worst case would be 1 row in a 2k page would give ~2k row size where
only  ~100 bytes actually used.

timestamp is stored a 3 int's - so 12 bytes for the data portion
(date, time, and nanoseconds).

S wrote:
> This is a follow up email to my earlier posting regarding storage. Based 
> on the query given,
> 
> select * from NEW org.apache.derby.diag.SpaceTable('SYS','SYSTABLES') AS x;
> 
> 
> using it for my table, the estimated size of a record is coming to be 
> 131 bytes. The table consists of 5 bigints, 3 smallints, 1 timestamp and 
> a varchar. The avg length of the varchar is 14chars. So, total size of 
> the record size should have been 5*8+3*2+14+(bytes for timestamp)+any 
> additional per field/record data all put together shouldn't be more than 
> 70/80 bytes. So, why am I getting 131 bytes?


Re: table size estimation

Posted by Michael Segel <ms...@mycingular.blackberry.net>.
Just a hunch...
Some of your fields will get rounded up to the nearest word size. 
Assume word size to be 8 bytes.

So you would have 
5*8 for your big ints 
3*8 for your small ints
14 rounded to nearest word size = 24
Plus 16 bytes for the pointer to the rest of the varchar.
And 16 for the time stamp

That would give you 130 bytes.

But hey whatdo I know? I'm just shooting from the hip on my crackberry.

Sent via BlackBerry.

-Mike Segel
Principal
MSCC
312 952 8175


-----Original Message-----
From: S <s....@gmail.com>
Date: Fri, 3 Nov 2006 03:01:29 
To:derby-user@db.apache.org
Subject: table size estimation

This is a follow up email to my earlier posting regarding storage. Based on the query given,

select * from NEW org.apache.derby.diag.SpaceTable('SYS','SYSTABLES') AS x;
using it for my table, the estimated size of a record is coming to be 131 bytes. The table consists of 5 bigints, 3 smallints, 1 timestamp and a varchar. The avg length of the varchar is 14chars. So, total size of the record size should have been 5*8+3*2+14+(bytes for timestamp)+any additional per field/record data all put together shouldn't be more than 70/80 bytes. So, why am I getting 131 bytes?