You are viewing a plain text version of this content. The canonical link for it is here.
Posted to derby-user@db.apache.org by De Meyer Tim <Ti...@cegeka.be> on 2011/08/05 16:35:37 UTC
Derby eating up disk space + how to trigger ReclaimSpace elegantly
Hi,
We're working on a java webapp and using a derby database (v10.5) in embedded mode.
The first version of the application went live about 2 months ago.
The application is used to make invoices and we also store the XML of an invoice document (input data for generating a PDF against a template).
Today, the database is about 1,7GB on disk.
During these 2 months, we've released some minor upgrades, including database migration scripts (like extra tables for new functionality).
Now we've noticed that, when we ran our latest upgrade, the database has suddenly shrunk to a size of 600MB.
This latest upgrade contained a migration script that dropped a no longer needed column on practically every table.
It's after executing this script that the shrink happened.
We did some debugging and hit suspend when the disk size started shrinking, it lead us to a Derby class called "ReclaimSpace".
The shrink was a bliss, because the customer was already complaining about the large size on disk :-)
We're afraid the database will start using up unnecessary space again soon, and of course, we're not going to have a similar migration script in every upgrade.
Is there an elegant way to configure derby to do this cleaning continuously, or to let our webapp instruct derby to do some cleaning?
We're launching the webapp from within a small java webstart app (we launch a Jetty and attach our war file), so it's even OK for us to write some java code to do it programmatically.
We've found this, but it's on a per table basis.
http://db.apache.org/derby/docs/10.1/adminguide/cadminspace21579.html
http://db.apache.org/derby/docs/10.5/ref/rrefaltertablecompress.html
By the way, we're running about 10 of these webapps on different desktop PC's.
Each webapp synchronizes its data to a server running a postgresql 9 database.
This means that this postgres database accumulates all the data of these 10 webapps (some data is shared, so it's not times ten).
The size of this postgres DB is less than 1GB, which we think is surprisingly small compared to the derby DB for one webapp.
Any help on all of this would be more than welcome.
Kind regards,
Tim De Meyer
Re: Derby eating up disk space + how to trigger ReclaimSpace elegantly
Posted by "Dag H. Wanvik" <da...@oracle.com>.
On 05.08.2011 16:35, De Meyer Tim wrote:
>
> Hi,
>
> We're working on a java webapp and using a derby database (v10.5) in
> embedded mode.
> The first version of the application went live about 2 months ago.
> The application is used to make invoices and we also store the XML of
> an invoice document (input data for generating a PDF against a template).
> Today, the database is about 1,7GB on disk.
>
> During these 2 months, we've released some minor upgrades, including
> database migration scripts (like extra tables for new functionality).
> Now we've noticed that, when we ran our latest upgrade, the database
> has suddenly shrunk to a size of 600MB.
> This latest upgrade contained a migration script that dropped a no
> longer needed column on practically every table.
> It's after executing this script that the shrink happened.
> We did some debugging and hit suspend when the disk size started
> shrinking, it lead us to a Derby class called "ReclaimSpace".
>
> The shrink was a bliss, because the customer was already complaining
> about the large size on disk :-)
> We're afraid the database will start using up unnecessary space again
> soon, and of course, we're not going to have a similar migration
> script in every upgrade.
> Is there an elegant way to configure derby to do this cleaning
> continuously, or to let our webapp instruct derby to do some cleaning?
> We're launching the webapp from within a small java webstart app (we
> launch a Jetty and attach our war file), so it's even OK for us to
> write some java code to do it programmatically.
>
> We've found this, but it's on a per table basis.
> http://db.apache.org/derby/docs/10.1/adminguide/cadminspace21579.html
> http://db.apache.org/derby/docs/10.5/ref/rrefaltertablecompress.htm
> <http://db.apache.org/derby/docs/10.5/ref/rrefaltertablecompress.html>
>
The SYSCS_UTIL.SYSCS_INPLACE_COMPRESS_TABLE and
SYSCS_UTIL.SYSCS_COMPRESS_TABLE are the normal two way to reclaim space
in Derby. The latter will also return space to the operating system, but
requires a table level lock. The fact that they procedures opererates on
a table at a time should't pose any big problem, you can use metadata to
iterate over the tables of interest.
Some users have apps that can't afford to wait for the cleaning and opt
for partitioning tables, say, one per month of data, and then dropping
the tables that are older than a certain threshold. Views and/or table
functions can be used to allows viewing the data as a whole in queries.
Thanks,
Dag
>
> By the way, we're running about 10 of these webapps on different
> desktop PC's.
> Each webapp synchronizes its data to a server running a postgresql 9
> database.
> This means that this postgres database accumulates all the data of
> these 10 webapps (some data is shared, so it's not times ten).
> The size of this postgres DB is less than 1GB, which we think is
> surprisingly small compared to the derby DB for one webapp.
>
> Any help on all of this would be more than welcome.
>
> Kind regards,
>
> Tim De Meyer
>
Re: Derby eating up disk space + how to trigger ReclaimSpace elegantly
Posted by Phil Bradley <ph...@tower.ie>.
Hi,
For what it's worth, I had an approximately similar experience
with approximately similar volumes of data on a desktop app using
Derby embedded. I introduced a daily compress of all tables using
the following command (for each table):
call SYSCS_UTIL.SYSCS_COMPRESS_TABLE('MYSCHEMA', 'MYTABLE', 0)
The schedule for this was handled at the application level rather
than the derby level. Specifically, we used quartz to do the
scheduling.
Also, as you mention, this is on a per table basis. I don't
believe there is a single command that you can use to compress
the whole database so we just use the database metadata to list
all tables and apply the procedure to each.
Regards,
Phil
On Fri, 05 Aug 2011 16:35 +0200, "De Meyer Tim"
<Ti...@cegeka.be> wrote:
Hi,
We're working on a java webapp and using a derby database
(v10.5) in embedded mode.
The first version of the application went live about 2 months
ago.
The application is used to make invoices and we also store the
XML of an invoice document (input data for generating a PDF
against a template).
Today, the database is about 1,7GB on disk.
During these 2 months, we've released some minor upgrades,
including database migration scripts (like extra tables for
new functionality).
Now we've noticed that, when we ran our latest upgrade, the
database has suddenly shrunk to a size of 600MB.
This latest upgrade contained a migration script that dropped
a no longer needed column on practically every table.
It's after executing this script that the shrink happened.
We did some debugging and hit suspend when the disk size
started shrinking, it lead us to a Derby class called
"ReclaimSpace".
The shrink was a bliss, because the customer was already
complaining about the large size on disk :-)
We're afraid the database will start using up unnecessary
space again soon, and of course, we're not going to have a
similar migration script in every upgrade.
Is there an elegant way to configure derby to do this cleaning
continuously, or to let our webapp instruct derby to do some
cleaning?
We're launching the webapp from within a small java webstart
app (we launch a Jetty and attach our war file), so it's even
OK for us to write some java code to do it programmatically.
We've found this, but it's on a per table basis.
[1]http://db.apache.org/derby/docs/10.1/adminguide/cadminspace
21579.html
[2]http://db.apache.org/derby/docs/10.5/ref/rrefaltertablecomp
ress.html
By the way, we're running about 10 of these webapps on
different desktop PC's.
Each webapp synchronizes its data to a server running a
postgresql 9 database.
This means that this postgres database accumulates all the
data of these 10 webapps (some data is shared, so it's not
times ten).
The size of this postgres DB is less than 1GB, which we think
is surprisingly small compared to the derby DB for one webapp.
Any help on all of this would be more than welcome.
Kind regards,
Tim De Meyer
References
1. http://db.apache.org/derby/docs/10.1/adminguide/cadminspace21579.html
2. http://db.apache.org/derby/docs/10.5/ref/rrefaltertablecompress.html