Posted to derby-user@db.apache.org by SOA Work <so...@web.de> on 2006/08/22 14:30:41 UTC

improve performance on large insert numbers

Hi there,

I'm currently using Derby to log values generated by my application.
One value consists of a string, a date object, and a double value. Not much data, but I want to insert a huge number of entries.

At the moment, inserting 10,000 entries takes about 2 seconds. Now I would like to hear your opinions on how to improve performance (if possible).

Here are the approaches I'm currently using (comment if you want):

1.) I use a table without indexes, primary keys and so on. 
"create table log(id varchar(50), value double, timestamp date)"

2.) I use the embedded mode

3.) I use a prepared statement "insert into log values (?, ?, ?)"

4.) I disabled auto-commit and use addBatch for all 10,000 entries. At the end I call executeBatch() and commit(). (A sketch of the whole setup follows below.)
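
For reference, here is a minimal sketch of how 2.) through 4.) fit together (the connection URL and sample values are made up, and it relies on JDBC 4 driver auto-loading; older JVMs need Class.forName on the embedded driver first):

    import java.sql.*;

    public class LogBatchInsert {
        public static void main(String[] args) throws SQLException {
            // Embedded mode (2.); the URL is a placeholder.
            Connection conn =
                DriverManager.getConnection("jdbc:derby:logdb;create=true");
            conn.setAutoCommit(false);  // 4.) no auto-commit
            // 3.) one prepared statement reused for every row
            PreparedStatement ps =
                conn.prepareStatement("insert into log values (?, ?, ?)");
            for (int i = 0; i < 10000; i++) {
                ps.setString(1, "id-" + i);                   // made-up data
                ps.setDouble(2, Math.random());
                ps.setDate(3, new Date(System.currentTimeMillis()));
                ps.addBatch();                                // 4.) batch up
            }
            ps.executeBatch();  // send all 10,000 inserts at once
            conn.commit();      // 4.) single commit at the end
            ps.close();
            conn.close();
        }
    }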

I would like to hear your suggestions or comments.
Best regards



Re: Where is the right place to make Derby feature requests?

Posted by Stanley Bradbury <St...@gmail.com>.
Terry Kilshaw wrote:
> Where is the right place to make Derby feature requests?
>
> I guess it's not this discussion group.
>
> Is that done on the developer discussion group or somewhere else?
>
> Thanks,
Hi -

File a JIRA entry for the feature.  The process is described here:
          http://db.apache.org/derby/DerbyBugGuidelines.html

HTH


Where is the right place to make Derby feature requests?

Posted by Terry Kilshaw <te...@quantechsoftware.com>.
Where is the right place to make Derby feature requests?

I guess it's not this discussion group.

Is that done on the developer discussion group or somewhere else?

Thanks,




RE: improve performance on large insert numbers

Posted by de...@segel.com.
And why do you want to use a database for this task?

I was waiting for someone to ask this, but since no one did, I guess I have
to do it... ;-)

What you've said is that you want to load a lot of records into a table
really, really fast.

You have no indexes, nor have you identified any query or extraction
requirements.

So why use a database for this function?

A general-purpose RDBMS will not be the fastest way to capture and store
this data.

What you may want to consider is capturing the initial data in a flat file,
then using a separate process to load that data into indexed database
table(s) for future use.

Since you didn't really go into detail, let me give you a hypothetical
example application that may parallel your problem.

Suppose your application is set up to log any and all TCP/IP traffic on your
server; depending on the server, that can be a lot of traffic. Your
application writes the traffic information to a file or set of files
(round-robin or something similar). A secondary application then tracks when
a file buffer is full and writes that buffer to the database. After
completion, the second program tells the first that the file buffer is
cleared for deletion, and the app can delete and reuse that buffer's name.

This way you have indexes on the tables for fast retrieval, and you can use
things like triggers to set up alerts based on the data being inserted.

In essence you're writing to a buffer pool prior to storage in the database.
(And in the event of a server crash, the data is safe(r) in the files of the
buffer pool than it would be in memory.)

Informix had an interesting take on this: the Financial Foundation, which
included its real-time loader. Instead of files, it used memory as its
buffer. Of course, those who implemented solutions would tee the data off to
a file as well, but that's another story.


HTH.

-G

> -----Original Message-----
> From: Mike Matrigali [mailto:mikem_app@sbcglobal.net]
> Sent: Tuesday, August 22, 2006 11:08 AM
> To: Derby Discussion
> Subject: Re: improve performance on large insert numbers
> 
> Is your machine CPU-bound during the insert? If not, putting the log
> on a separate disk may help.
> 
> Does your app always load a large number of entries into an empty table,
> or does it also load a large number of entries into an existing table
> that already has entries? If you are loading into an empty table, then
> using one of the import system routines with a VTI would avoid logging.
> 
> SOA Work wrote:
> > [original message snipped]




Re: improve performance on large insert numbers

Posted by Mike Matrigali <mi...@sbcglobal.net>.
Is your machine CPU-bound during the insert? If not, putting the log
on a separate disk may help.
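
For example, Derby's logDevice connection attribute places the transaction log on a different device; a minimal sketch, assuming placeholder paths:

    import java.sql.Connection;
    import java.sql.DriverManager;

    public class CreateWithSeparateLog {
        public static void main(String[] args) throws Exception {
            // logDevice is only honored when the database is created;
            // both paths here are placeholders.
            Connection conn = DriverManager.getConnection(
                "jdbc:derby:/data/logdb;create=true;logDevice=/otherdisk/derbylog");
            conn.close();
        }
    }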

Does your app always load a large number of entries into an empty table,
or does it also load a large number of entries into an existing table that
already has entries? If you are loading into an empty table, then using one
of the import system routines with a VTI (virtual table interface) would
avoid logging.
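
A sketch of what such an import call looks like (the file name and connection URL are made up):

    import java.sql.Connection;
    import java.sql.DriverManager;

    public class BulkImport {
        public static void main(String[] args) throws Exception {
            try (Connection conn =
                     DriverManager.getConnection("jdbc:derby:logdb")) {
                // Bulk-load a comma-separated file into the (empty) LOG table.
                conn.prepareCall(
                    "CALL SYSCS_UTIL.SYSCS_IMPORT_TABLE(" +
                    "null, 'LOG', 'entries.csv', null, null, null, 0)").execute();
            }
        }
    }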

SOA Work wrote:
> [original message snipped]