You are viewing a plain text version of this content. The canonical link for it is here.
Posted to derby-user@db.apache.org by Jim Newsham <jn...@referentia.com> on 2007/06/01 22:49:04 UTC

tuning/storage questions

 

I'm reading the tuning guide and trying to eke out a little more performance
for our app by tweaking storage parameters.  The properties our database
exhibits which seem potentially of important to storage tuning are:

 

-          tables with small records (typically 3 fields; usually fixed
size, but some have varchar)

-          few, but large (number of records per) tables

-          frequent inserts

-          once inserted, data will likely not change (it's possible for an
update operation, but in practice may never happen)

-          several indexes per table

-          queries usually involve joins and are usually selective

-          some queries return somewhat large result sets

-          queries currently take longer than we would like

 

1. Because updates will be very rare, I think it makes sense to set
pageReservedSpace to 0%.  If an update occurs which expands the size of a
row (unlikely but possible), there will be a performance hit (I assume
during both the update and subsequent reads of this record, because the
documentation implies that part of the row will be written to an overflow
page).  But my pages will hold 25% more data, which will be a big win for
the normal case.  Reasonable?

 

2. Page size is a little less straightforward.  The following imply page
size should be small:

 

-          not storing large objects

-          small columns/records

-          selective queries

 

The following imply page size should be large:

 

-          tables with large numbers of records

-          inserts only (?? The documentation says read-only apps should use
a large page size.  I'm not quite sure of the reasoning for this.  Would
this also apply to an app such as mine which only performs inserts but not
updates?)

 

The documentation says large page size for indexes is a big performance
gain.  I was wondering if it makes sense to have small pages for all tables,
and large pages for all indexes.  Does having heterogeneous page sizes have
any adverse effects on the page cache?  Does having mismatched page sizes
between a table and its indexes have any negative impact?

 

3.  What is the page cache eviction policy?   Is it least-frequently-used?
Do indexes get preferential treatment?  Since some queries pull a large
amount of data, I think this could potentially repopulate the entire cache
with data which may not be used again.  Does this mean it would be wasteful
to increase the page cache size?

 

Thanks,

Jim Newsham


RE: tuning/storage questions

Posted by Jim Newsham <jn...@referentia.com>.

> -----Original Message-----
> From: Mike Matrigali [mailto:mikem_app@sbcglobal.net]
> I'll answer below, but what I usually say in these cases is that you
> are starting too low.  I would not expect more than a 10% change for
> any of these low level stuff.  I would look more at the queries that
> are going to slow and see if the right indexes exist for these queries.
> Often you can get orders of magnitude changes in this area.
> 
> also you don't say if you are using embedded vs. network.  Returning
> large result sets is likely to go much faster in embedded vs. network.

Appreciated.  I got about 15x speedup in the last few days with some recent
changes in our db access API and some of the client code. :)  Some of these
involved ensuring efficient queries (particularly join order) and using
Derby's statement cache, while others involved eliminating inefficient or
repetitive queries or other poor usage patterns.  I estimate about 3x
speedup in the db layer, and 5x speedup in the client layer.

We are using embedded mode, as the database is just a storage mechanism for
our desktop app.

Thank you for your other replies, they were also helpful.  If I can free up
some time I'll perform some tests with different page sizes, and possibly
different page cache sizes.

Jim



Re: tuning/storage questions

Posted by Mike Matrigali <mi...@sbcglobal.net>.

Jim Newsham wrote:
>  
> 
> I’m reading the tuning guide and trying to eke out a little more 
> performance for our app by tweaking storage parameters.  The properties 
> our database exhibits which seem potentially of important to storage 
> tuning are:
> 
I'll answer below, but what I usually say in these cases is that you
are starting too low.  I would not expect more than a 10% change for
any of these low level stuff.  I would look more at the queries that
are going to slow and see if the right indexes exist for these queries.
Often you can get orders of magnitude changes in this area.

also you don't say if you are using embedded vs. network.  Returning 
large result sets is likely to go much faster in embedded vs. network.
>  
> 
> -          tables with small records (typically 3 fields; usually fixed 
> size, but some have varchar)
> 
> -          few, but large (number of records per) tables
> 
> -          frequent inserts
> 
> -          once inserted, data will likely not change (it’s possible for 
> an update operation, but in practice may never happen)
> 
> -          several indexes per table
> 
> -          queries usually involve joins and are usually selective
> 
> -          some queries return somewhat large result sets
> 
> -          queries currently take longer than we would like
> 
>  
> 
> 1. Because updates will be very rare, I think it makes sense to set 
> pageReservedSpace to 0%.  If an update occurs which expands the size of 
> a row (unlikely but possible), there will be a performance hit (I assume 
> during both the update and subsequent reads of this record, because the 
> documentation implies that part of the row will be written to an 
> overflow page).  But my pages will hold 25% more data, which will be a 
> big win for the normal case.  Reasonable?
should be completely reasonable for tables with only char colums.   And 
probably reasonable for varchar columns.  Your interpretation of impact 
is right on.  Derby, as zero-admin should already optimize the space for
all fixed length column tables, but it doesn't.  I just filed  	 
DERBY-2746 for this feature - you may want to vote on it or help do the 
work to implement it if you think it
would be useful.
> 
>  
> 
> 2. Page size is a little less straightforward.  The following imply page 
> size should be small:
> 
>  
> 
> -          not storing large objects
> 
> -          small columns/records
> 
> -          selective queries
> 
>  
> 
> The following imply page size should be large:
> 
>  
> 
> -          tables with large numbers of records
> 
> -          inserts only (?? The documentation says read-only apps should 
> use a large page size.  I’m not quite sure of the reasoning for this.  
> Would this also apply to an app such as mine which only performs inserts 
> but not updates?)
> 
>  
> 
> The documentation says large page size for indexes is a big performance 
> gain.  I was wondering if it makes sense to have small pages for all 
> tables, and large pages for all indexes.  Does having heterogeneous page 
> sizes have any adverse effects on the page cache?  Does having 
> mismatched page sizes between a table and its indexes have any negative 
> impact?
There may be a garbage collection impact on the cache.  The current 
cache handles heterogeneous page sizes fine, but when replacing a page
where the size is the same it reuses a page size array, but if it is 
different it throws away the old array and allocates a new one.
> 
>  
> 
> 3.  What is the page cache eviction policy?   Is it 
> least-frequently-used?  Do indexes get preferential treatment?  Since 
> some queries pull a large amount of data, I think this could potentially 
> repopulate the entire cache with data which may not be used again.  Does 
> this mean it would be wasteful to increase the page cache size?
it is basically lru, so if your queries are using indexes the index 
pages probably stay in cache.  But a scan not using the indexes that is
bigger than cache may throw out everything.
> 
>  
> 
> Thanks,
> 
> Jim Newsham
>