You are viewing a plain text version of this content. The canonical link for it is here.
Posted to derby-user@db.apache.org by Leslie Software <le...@yahoo.com> on 2007/06/03 14:17:27 UTC

Should I leave nulls in a lookup table?

I have a lookup table that right now stores either 'T' or 'F' in columns that are char(1).  Eventually this table will have over 70 columns and approach 8000 rows.  Right now for my testing I have about 600 rows and four columns.  

Today it occurred to me that I do not need to store any 'F' characters just leaving nulls would be fine and I wondered if that would save me some storage space.  So I changed my table creation and loading and found that my database size actually grew.  I have seen changes in my database that should have shrunk its size (deletion of rows) actually make it bigger so I am unsure if this growth is because storing only 'T' and leaving nulls is actually more expensive or not.

An example of my original table creation statement (for storing both 'T' and 'F'):
create table ability_lookup (
id INT NOT NULL GENERATED ALWAYS AS IDENTITY CONSTRAINT ability_lookup_pk PRIMARY KEY,
card_id INT NOT NULL UNIQUE,
cantrip CHAR (1) DEFAULT 'F' NOT NULL CONSTRAINT cantrip_boolean_ck CHECK (cantrip = 'T' or cantrip = 'F'),
pumpable_offence CHAR (1) DEFAULT 'F' NOT NULL CONSTRAINT pumpable_offence_boolean_ck CHECK (pumpable_offence = 'T' or pumpable_offence = 'F'),
pumpable_defense CHAR (1) DEFAULT 'F' NOT NULL CONSTRAINT pumpable_defense_boolean_ck CHECK (pumpable_defense = 'T' or pumpable_defense = 'F'),
discard CHAR (1) DEFAULT 'F' NOT NULL CONSTRAINT discard_boolean_ck CHECK (discard = 'T' or discard = 'F')
)

An example of my original table creation statement (for storing just 'T'):

create table ability_lookup (
id INT NOT NULL GENERATED ALWAYS AS IDENTITY CONSTRAINT ability_lookup_pk PRIMARY KEY,
card_id INT NOT NULL UNIQUE,
cantrip CHAR (1) DEFAULT NULL CONSTRAINT cantrip_boolean_ck CHECK (cantrip = 'T' or cantrip IS NULL),
pumpable_offence CHAR (1) DEFAULT NULL CONSTRAINT pumpable_offence_boolean_ck CHECK (pumpable_offence = 'T' or pumpable_offence IS NULL),
pumpable_defense CHAR (1) DEFAULT NULL CONSTRAINT pumpable_defense_boolean_ck CHECK (pumpable_defense = 'T' or pumpable_defense IS NULL),
discard CHAR (1) DEFAULT NULL CONSTRAINT discard_boolean_ck CHECK (discard = 'T' or discard IS NULL)
)

The field card_id is for joining with a card table which has about 8000 rows in it right now.

So my question is: is storing only 'T' and leaving nulls any less expensive than storing both 'T' and 'F' in a lookup table?

Ian

Note: After making changes and before creating my read-only copy that
goes with my application I perform SYSCS_UTIL.SYSCS_COMPRESS_TABLE for
each table:

call SYSCS_UTIL.SYSCS_COMPRESS_TABLE('APP', 'ABILITY_LOOKUP', 1);

and ensure that the database has been shut down with the
"shutdown=true" string and restarted to allow any clean up work to be
done.  Performing these actions greatly reduced the size of the final database after I have made changes to its content.


--

Ian Leslie - Shareware Author (mailto:lesliesoftware@yahoo.com)





      Get news delivered with the All new Yahoo! Mail.  Enjoy RSS feeds right on your Mail page. Start today at http://mrd.mail.yahoo.com/try_beta?.intl=ca

Re: Should I leave nulls in a lookup table?

Posted by Bryan Pendleton <bp...@amberpoint.com>.
 > So my question is: is storing only 'T' and leaving nulls any
 > less expensive than storing both 'T' and 'F' in a lookup table?

I believe the CHAR(1) NOT NULL columns will be simpler and slightly
more efficient for the database to handle. NULL processing is
complex and adds both space and time overheads; for a column which
is boolean true/false my preference would be the CHAR(1) NOT NULL
implementation.

thanks,

bryan



Re: Should I leave nulls in a lookup table?

Posted by Craig L Russell <Cr...@Sun.COM>.
On Jun 3, 2007, at 8:14 AM, Peter Ondruska wrote:

> CHAR(1) will be same size no matter what value it holds.

But if you allow nulls, then the database has to store the null/not  
null flag somewhere, thereby increasing the size of the data.

Craig
>
> On 3.6.2007, at 14:17, Leslie Software wrote:
>
>> I have a lookup table that right now stores either 'T' or 'F' in  
>> columns that are char(1).  Eventually this table will have over 70  
>> columns and approach 8000 rows.  Right now for my testing I have  
>> about 600 rows and four columns.
>>
>> Today it occurred to me that I do not need to store any 'F'  
>> characters just leaving nulls would be fine and I wondered if that  
>> would save me some storage space.  So I changed my table creation  
>> and loading and found that my database size actually grew.  I have  
>> seen changes in my database that should have shrunk its size  
>> (deletion of rows) actually make it bigger so I am unsure if this  
>> growth is because storing only 'T' and leaving nulls is actually  
>> more expensive or not.
>>
>> An example of my original table creation statement (for storing  
>> both 'T' and 'F'):
>> create table ability_lookup (
>> id INT NOT NULL GENERATED ALWAYS AS IDENTITY CONSTRAINT  
>> ability_lookup_pk PRIMARY KEY,
>> card_id INT NOT NULL UNIQUE,
>> cantrip CHAR (1) DEFAULT 'F' NOT NULL CONSTRAINT  
>> cantrip_boolean_ck CHECK (cantrip = 'T' or cantrip = 'F'),
>> pumpable_offence CHAR (1) DEFAULT 'F' NOT NULL CONSTRAINT  
>> pumpable_offence_boolean_ck CHECK (pumpable_offence = 'T' or  
>> pumpable_offence = 'F'),
>> pumpable_defense CHAR (1) DEFAULT 'F' NOT NULL CONSTRAINT  
>> pumpable_defense_boolean_ck CHECK (pumpable_defense = 'T' or  
>> pumpable_defense = 'F'),
>> discard CHAR (1) DEFAULT 'F' NOT NULL CONSTRAINT  
>> discard_boolean_ck CHECK (discard = 'T' or discard = 'F')
>> )
>>
>> An example of my original table creation statement (for storing  
>> just 'T'):
>>
>> create table ability_lookup (
>> id INT NOT NULL GENERATED ALWAYS AS IDENTITY CONSTRAINT  
>> ability_lookup_pk PRIMARY KEY,
>> card_id INT NOT NULL UNIQUE,
>> cantrip CHAR (1) DEFAULT NULL CONSTRAINT cantrip_boolean_ck CHECK  
>> (cantrip = 'T' or cantrip IS NULL),
>> pumpable_offence CHAR (1) DEFAULT NULL CONSTRAINT  
>> pumpable_offence_boolean_ck CHECK (pumpable_offence = 'T' or  
>> pumpable_offence IS NULL),
>> pumpable_defense CHAR (1) DEFAULT NULL CONSTRAINT  
>> pumpable_defense_boolean_ck CHECK (pumpable_defense = 'T' or  
>> pumpable_defense IS NULL),
>> discard CHAR (1) DEFAULT NULL CONSTRAINT discard_boolean_ck CHECK  
>> (discard = 'T' or discard IS NULL)
>> )
>>
>> The field card_id is for joining with a card table which has about  
>> 8000 rows in it right now.
>>
>> So my question is: is storing only 'T' and leaving nulls any less  
>> expensive than storing both 'T' and 'F' in a lookup table?
>>
>> Ian
>>
>> Note: After making changes and before creating my read-only copy that
>> goes with my application I perform SYSCS_UTIL.SYSCS_COMPRESS_TABLE  
>> for
>> each table:
>>
>> call SYSCS_UTIL.SYSCS_COMPRESS_TABLE('APP', 'ABILITY_LOOKUP', 1);
>>
>> and ensure that the database has been shut down with the
>> "shutdown=true" string and restarted to allow any clean up work to be
>> done.  Performing these actions greatly reduced the size of the  
>> final database after I have made changes to its content.
>>
>>
>> --
>>
>> Ian Leslie - Shareware Author (mailto:lesliesoftware@yahoo.com)
>>
>>
>>
>>
>>
>>       Get news delivered with the All new Yahoo! Mail.  Enjoy RSS  
>> feeds right on your Mail page. Start today at http:// 
>> mrd.mail.yahoo.com/try_beta?.intl=ca
>

Craig Russell
Architect, Sun Java Enterprise System http://java.sun.com/products/jdo
408 276-5638 mailto:Craig.Russell@sun.com
P.S. A good JDO? O, Gasp!


Re: Should I leave nulls in a lookup table?

Posted by Peter Ondruska <pe...@gmail.com>.
CHAR(1) will be same size no matter what value it holds.

On 3.6.2007, at 14:17, Leslie Software wrote:

> I have a lookup table that right now stores either 'T' or 'F' in  
> columns that are char(1).  Eventually this table will have over 70  
> columns and approach 8000 rows.  Right now for my testing I have  
> about 600 rows and four columns.
>
> Today it occurred to me that I do not need to store any 'F'  
> characters just leaving nulls would be fine and I wondered if that  
> would save me some storage space.  So I changed my table creation  
> and loading and found that my database size actually grew.  I have  
> seen changes in my database that should have shrunk its size  
> (deletion of rows) actually make it bigger so I am unsure if this  
> growth is because storing only 'T' and leaving nulls is actually  
> more expensive or not.
>
> An example of my original table creation statement (for storing  
> both 'T' and 'F'):
> create table ability_lookup (
> id INT NOT NULL GENERATED ALWAYS AS IDENTITY CONSTRAINT  
> ability_lookup_pk PRIMARY KEY,
> card_id INT NOT NULL UNIQUE,
> cantrip CHAR (1) DEFAULT 'F' NOT NULL CONSTRAINT cantrip_boolean_ck  
> CHECK (cantrip = 'T' or cantrip = 'F'),
> pumpable_offence CHAR (1) DEFAULT 'F' NOT NULL CONSTRAINT  
> pumpable_offence_boolean_ck CHECK (pumpable_offence = 'T' or  
> pumpable_offence = 'F'),
> pumpable_defense CHAR (1) DEFAULT 'F' NOT NULL CONSTRAINT  
> pumpable_defense_boolean_ck CHECK (pumpable_defense = 'T' or  
> pumpable_defense = 'F'),
> discard CHAR (1) DEFAULT 'F' NOT NULL CONSTRAINT discard_boolean_ck  
> CHECK (discard = 'T' or discard = 'F')
> )
>
> An example of my original table creation statement (for storing  
> just 'T'):
>
> create table ability_lookup (
> id INT NOT NULL GENERATED ALWAYS AS IDENTITY CONSTRAINT  
> ability_lookup_pk PRIMARY KEY,
> card_id INT NOT NULL UNIQUE,
> cantrip CHAR (1) DEFAULT NULL CONSTRAINT cantrip_boolean_ck CHECK  
> (cantrip = 'T' or cantrip IS NULL),
> pumpable_offence CHAR (1) DEFAULT NULL CONSTRAINT  
> pumpable_offence_boolean_ck CHECK (pumpable_offence = 'T' or  
> pumpable_offence IS NULL),
> pumpable_defense CHAR (1) DEFAULT NULL CONSTRAINT  
> pumpable_defense_boolean_ck CHECK (pumpable_defense = 'T' or  
> pumpable_defense IS NULL),
> discard CHAR (1) DEFAULT NULL CONSTRAINT discard_boolean_ck CHECK  
> (discard = 'T' or discard IS NULL)
> )
>
> The field card_id is for joining with a card table which has about  
> 8000 rows in it right now.
>
> So my question is: is storing only 'T' and leaving nulls any less  
> expensive than storing both 'T' and 'F' in a lookup table?
>
> Ian
>
> Note: After making changes and before creating my read-only copy that
> goes with my application I perform SYSCS_UTIL.SYSCS_COMPRESS_TABLE for
> each table:
>
> call SYSCS_UTIL.SYSCS_COMPRESS_TABLE('APP', 'ABILITY_LOOKUP', 1);
>
> and ensure that the database has been shut down with the
> "shutdown=true" string and restarted to allow any clean up work to be
> done.  Performing these actions greatly reduced the size of the  
> final database after I have made changes to its content.
>
>
> --
>
> Ian Leslie - Shareware Author (mailto:lesliesoftware@yahoo.com)
>
>
>
>
>
>       Get news delivered with the All new Yahoo! Mail.  Enjoy RSS  
> feeds right on your Mail page. Start today at http:// 
> mrd.mail.yahoo.com/try_beta?.intl=ca