Posted to derby-user@db.apache.org by Tim Dugan <TD...@prospricing.com> on 2008/12/30 03:49:51 UTC

is there a lower-level (non-SQL) API for Derby?

I'm looking to see if Derby can be used similarly to Berkeley DB -- a lower-level API.  Can anyone tell me?



Maybe to the access area of the "Store Layer" which in some Derby documentation is described like this:

"The Store layer is split into two main areas, access and raw. The access layer presents a conglomerate (table or index)/row based interface to the SQL layer. It handles table scans, index scans, index lookups, indexing, sorting, locking policies, transactions, isolation levels."

Now that Derby is included in Java 6--I am having a really hard time finding Java documentation that talks about Derby.

Re: is there a lower-level (non-SQL) API for Derby?

Posted by Rick Hillegas <Ri...@Sun.COM>.

Hi Brian,

The last time the community discussed this, there was some concern that 
this effort would fall outside Derby's charter. One way forward might be 
to create an experimental branch in which to prototype a solution. If 
the prototype attracted enough interest, then we could take it to the 
Apache Incubator and attempt to create a new, related DB subproject.

Here, for instance, is something which we could prototype in a branch: 
We could create a small wrapper API around Derby. The jar file 
containing this small wrapper would be what the related Apache project 
ultimately delivers and you would use it together with the existing 
Derby jars. I'm thinking that for operations like creating/dropping a 
container (i.e., a table) you might not see a performance gain but for 
simple gets and puts you could see the 15-20% improvement. Of the 
benefits listed at the end of this message, this solution would deliver 
benefit (3) and to a smaller extent benefit (2). It would not deliver 
benefit (1): the code footprint of this solution would actually be a 
little larger than native Derby and memory and disk usage would be 
pretty much the same.

Would a solution like this be useful for you?

Thanks,
-Rick

Brian Peterson wrote:
> Hi Rick,
>
> I've tried following up with this because I'd be interested in using this
> lighter version. From what I've been able to find, it looks like you started
> to set up the goals for such an effort. Is this effort still moving forward?
>
> My chief need would be speed -- factoring out the overhead of the JDBC/SQL
> interface. I see someone else noted that this has been measured at 15-20%
> for lookups on simple tables. I would definitely use the subsystem to get a
> 20% improvement when using an embedded database.
>
> Brian 
>
> -----Original Message-----
> From: Richard.Hillegas@Sun.COM [mailto:Richard.Hillegas@Sun.COM] 
> Sent: Monday, January 05, 2009 9:49 AM
> To: Derby Discussion
> Cc: derby-dev@db.apache.org
> Subject: Re: is there a lower-level (non-SQL) API for Derby?
>
> Hi Tim,
>
> This question has come up before. For instance, you may find some 
> interesting discussion on the following email thread: 
> http://www.nabble.com/simpler-api-to-the-Derby-store-td18137499.html#a18137499
>
> The Derby storage layer is supposed to be an independent component. The 
> api is described in the javadoc for the 
> org.apache.derby.iapi.store.access package: 
> http://db.apache.org/derby/javadoc/engine/
>
> What would you say are your chief needs? Are you looking for a version 
> of Derby which is
>
> 1) smaller
> 2) faster
>
>   or
>
> 3) easier-to-use
>
> Hope this helps,
> -Rick
>
> Tim Dugan wrote:
>   
>>  
>> I'm looking to see if Derby can be used similarly to Berkeley DB -- a 
>> lower-level API.  Can anyone tell me?
>>
>>  
>>
>> Maybe to the access area of the "Store Layer" which in some Derby 
>> documentation is described like this:
>>
>>     "The Store layer is split into two main areas, access and raw. The
>>     access layer presents a conglomerate (table or index)/row based
>>     interface to the SQL layer. It handles table scans, index scans,
>>     index lookups, indexing, sorting, locking policies, transactions,
>>     isolation levels."
>>
>> Now that Derby is included in Java 6--I am having a really hard time 
>> finding Java documentation that talks about Derby.

RE: iterating over millions of rows

Posted by de...@segel.com.

Did you try using an index?

> -----Original Message-----
> From: Brian Peterson [mailto:publicayers@verizon.net]
> Sent: Monday, January 19, 2009 10:20 PM
> To: 'Derby Discussion'
> Subject: iterating over millions of rows
> 
> I have a big table, about 1 million rows, that I'm doing a simple "select *"
> over. The table is depressingly simple, basically a big VARCHAR for bit data
> that stores some serialized bytes. When I profile using VisualVM it seems
> that it is spending most of its time in
> 
> org.apache.derby.impl.store.raw.data.RAFContainer4.readFully
> 
> If I remember correctly, this gets invoked something like 3000 times. Is
> there anything I can do to speed up iterating over this table? It is taking
> about 30s to iterate over the 1 million records, but I could have up to 25
> million.
> 
> It is an embedded db using 10.4, JDK 1.6.0_07, running on a Windows XP SP2
> machine. I have page size set to the max, 32K, and the page cache size set to
> 6000 pages.
> 
> Is there anything I can do, or have I just run up against how fast Windows
> can read off of the disk?
> 
> Brian
> 
>

iterating over millions of rows

Posted by Brian Peterson <pu...@verizon.net>.

I have a big table, about 1 million rows, that I'm doing a simple "select *"
over. The table is depressingly simple, basically a big VARCHAR for bit data
that stores some serialized bytes. When I profile using VisualVM it seems
that it is spending most of its time in 

org.apache.derby.impl.store.raw.data.RAFContainer4.readFully

If I remember correctly, this gets invoked something like 3000 times. Is
there anything I can do to speed up iterating over this table? It is taking
about 30s to iterate over the 1 million records, but I could have up to 25
million.

It is an embedded db using 10.4, JDK 1.6.0_07, running on a Windows XP SP2
machine. I have page size set to the max, 32K, and the page cache size set to
6000 pages. 

Is there anything I can do, or have I just run up against how fast Windows
can read off of the disk?

Brian
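
As a rough illustration of the numbers in this message: at 30 s per 1 million rows, a linear extrapolation puts 25 million rows at about 750 s (12.5 minutes), assuming the scan rate stays constant, which it may not once the data no longer fits in the page cache. The sketch below shows that arithmetic together with a plain forward-only, read-only JDBC scan that can be timed. The class and method names are hypothetical, and nothing here is claimed to make the scan faster.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

final class ScanTimer {
    // Linear extrapolation: seconds needed for targetRows at the measured rate.
    static double estimateSeconds(double measuredSeconds, long measuredRows, long targetRows) {
        return measuredSeconds * targetRows / measuredRows;
    }

    // Walks a forward-only, read-only result set, reading the first column
    // of every row as bytes, and returns the number of rows seen.
    // The caller supplies the connection and the query text.
    static long scan(Connection conn, String sql) throws SQLException {
        long rows = 0;
        try (Statement st = conn.createStatement(
                 ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY);
             ResultSet rs = st.executeQuery(sql)) {
            while (rs.next()) {
                rs.getBytes(1);
                rows++;
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        // Figures from the message: 30 s for 1 million rows, up to 25 million rows.
        System.out.println(estimateSeconds(30.0, 1_000_000L, 25_000_000L));
    }
}
```

Timing scan() with System.nanoTime() on a real table and feeding the result to estimateSeconds() would give a first-order estimate to compare against raw disk throughput.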

RE: is there a lower-level (non-SQL) API for Derby?

Posted by Brian Peterson <pu...@verizon.net>.

Hi Rick,

I've tried following up with this because I'd be interested in using this
lighter version. From what I've been able to find, it looks like you started
to set up the goals for such an effort. Is this effort still moving forward?

My chief need would be speed -- factoring out the overhead of the JDBC/SQL
interface. I see someone else noted that this has been measured at 15-20%
for lookups on simple tables. I would definitely use the subsystem to get a
20% improvement when using an embedded database.

Brian 

-----Original Message-----
From: Richard.Hillegas@Sun.COM [mailto:Richard.Hillegas@Sun.COM] 
Sent: Monday, January 05, 2009 9:49 AM
To: Derby Discussion
Cc: derby-dev@db.apache.org
Subject: Re: is there a lower-level (non-SQL) API for Derby?

Hi Tim,

This question has come up before. For instance, you may find some 
interesting discussion on the following email thread: 
http://www.nabble.com/simpler-api-to-the-Derby-store-td18137499.html#a18137499

The Derby storage layer is supposed to be an independent component. The 
api is described in the javadoc for the 
org.apache.derby.iapi.store.access package: 
http://db.apache.org/derby/javadoc/engine/

What would you say are your chief needs? Are you looking for a version 
of Derby which is

1) smaller
2) faster

  or

3) easier-to-use

Hope this helps,
-Rick

Tim Dugan wrote:
>  
> I'm looking to see if Derby can be used similarly to Berkeley DB -- a 
> lower-level API.  Can anyone tell me?
>
>  
>
> Maybe to the access area of the "Store Layer" which in some Derby 
> documentation is described like this:
>
>     "The Store layer is split into two main areas, access and raw. The
>     access layer presents a conglomerate (table or index)/row based
>     interface to the SQL layer. It handles table scans, index scans,
>     index lookups, indexing, sorting, locking policies, transactions,
>     isolation levels."
>
> Now that Derby is included in Java 6--I am having a really hard time 
> finding Java documentation that talks about Derby.
>

Re: is there a lower-level (non-SQL) API for Derby?

Posted by Rick Hillegas <Ri...@Sun.COM>.

Hi Tim,

This question has come up before. For instance, you may find some 
interesting discussion on the following email thread: 
http://www.nabble.com/simpler-api-to-the-Derby-store-td18137499.html#a18137499

The Derby storage layer is supposed to be an independent component. The 
api is described in the javadoc for the 
org.apache.derby.iapi.store.access package: 
http://db.apache.org/derby/javadoc/engine/

What would you say are your chief needs? Are you looking for a version 
of Derby which is

1) smaller
2) faster

  or

3) easier-to-use

Hope this helps,
-Rick

Tim Dugan wrote:
>  
> I'm looking to see if Derby can be used similarly to Berkeley DB -- a 
> lower-level API.  Can anyone tell me?
>
>  
>
> Maybe to the access area of the "Store Layer" which in some Derby 
> documentation is described like this:
>
>     "The Store layer is split into two main areas, access and raw. The
>     access layer presents a conglomerate (table or index)/row based
>     interface to the SQL layer. It handles table scans, index scans,
>     index lookups, indexing, sorting, locking policies, transactions,
>     isolation levels."
>
> Now that Derby is included in Java 6--I am having a really hard time 
> finding Java documentation that talks about Derby.
>

Re: is there a lower-level (non-SQL) API for Derby?

Posted by "Dag H. Wanvik" <Da...@Sun.COM>.

Tim Dugan <TD...@prospricing.com> writes:

> I'm looking to see if Derby can be used similarly to Berkeley DB --
> a lower-level API.  Can anyone tell me?

No, there is no public lower level API.

> Maybe to the access area of the "Store Layer" which in some Derby
> documentation is described like this:
>
> "The Store layer is split into two main areas, access and raw. The
> access layer presents a conglomerate (table or index)/row based
> interface to the SQL layer. It handles table scans, index scans,
> index lookups, indexing, sorting, locking policies, transactions,
> isolation levels."

Right, there are internal layers in the architecture, but they do not
constitute public APIs at the present time. Note that when you use
Derby embedded combined with prepared statements, the SQL overhead is
typically low, we have measured in the order of 15-20% extra for read
operations of small records with a simple primary key (we used the
internal APIs to make a comparison), where data fit in the in-memory
cache. But YMMV, of course.
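
For readers following along, here is a minimal sketch of the access pattern described above: an embedded connection with one prepared statement reused for many primary-key reads. The table kv, its columns k and v, and the class name are hypothetical, invented for illustration. The JDBC URL is supplied by the caller; for embedded Derby it would look something like jdbc:derby:myDB, with derby.jar on the classpath and the table already created. This is not the code used to measure the 15-20% figure.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

final class KeyedLookup {
    // Hypothetical table (kv) and columns (k, v), for illustration only.
    static final String SELECT_SQL = "SELECT v FROM kv WHERE k = ?";

    // Reads one value by primary key; returns null when no row matches.
    static byte[] get(PreparedStatement ps, int key) throws SQLException {
        ps.setInt(1, key);
        try (ResultSet rs = ps.executeQuery()) {
            return rs.next() ? rs.getBytes(1) : null;
        }
    }

    public static void main(String[] args) throws SQLException {
        if (args.length == 0) {
            System.out.println("usage: java KeyedLookup <jdbc-url>");
            return;
        }
        // Prepare once, execute many times: the statement is compiled a
        // single time and only the parameter changes per lookup.
        try (Connection conn = DriverManager.getConnection(args[0]);
             PreparedStatement ps = conn.prepareStatement(SELECT_SQL)) {
            int found = 0;
            for (int key = 0; key < 1000; key++) {
                if (get(ps, key) != null) {
                    found++;
                }
            }
            System.out.println(found + " rows found");
        }
    }
}
```

Reusing the PreparedStatement is the point of the sketch: re-preparing the SQL text on every lookup would add compilation work that the comparison above does not include.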

>
> Now that Derby is included in Java 6--I am having a really hard
> time finding Java documentation that talks about Derby.

The Java DB docs are essentially the Derby docs rebundled. You can
find them here:

http://developers.sun.com/javadb/reference/docs/index.jsp

The Derby version is here:

http://db.apache.org/derby/manuals/index.html#docs_10.4

Thanks,
Dag