Posted to derby-user@db.apache.org by Kristian Waagan <Kr...@Sun.COM> on 2009/10/05 11:24:14 UTC
Re: Derby in-memory back end - where to go next?

Rick Hillegas wrote:
> Hi Kristian,
>
> Here's another piece of feedback: Last night I gave an overview of 
> Derby to the San Francisco Java User's Group. A developer asked 
> whether the growth of the in-memory database could be bounded. He had 
> a use case which we didn't explore in depth but which involved 
> periodically truncating the database. I asked him to bring his 
> requirements to the Derby user list so that we could feed them into 
> your spec effort. Here are my takeaways:
>
> * It would be great to be able to bound the growth of the in-memory db
>
> * It would be great if the memory occupied by deleted records could be 
> released

Hi Rick,

I'm not quite sure how easy it is to implement the second feature. I see 
a possibility if the in-memory back end recognizes the Derby binary 
format, but removing the deleted records at the storage layer will have 
performance implications. We also have to deal with the checksums 
somehow... Maybe a quick prototype could tell us if this is feasible at all.

In addition to the actual "early deletion" of records, which would 
probably happen after Derby has written out a complete page, a kind of 
sparse byte array implementation would be needed. An easier option is 
using
These two features could be implemented individually (the sparse byte 
array implementation first). Is perhaps java.util.zip.Deflater an option?
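
To make the Deflater question a bit more concrete, here is a minimal sketch of 
compressing a page image before it is kept in memory and inflating it again on 
read. The class name, the method names and the 4096-byte page size are 
hypothetical. Nothing here is taken from the existing in-memory back end, and 
it does not address how the page checksums would be handled.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper, not part of Derby.
public class PageCompressionSketch {

    /** Compresses one page image; BEST_SPEED trades ratio for lower CPU cost. */
    public static byte[] compress(byte[] page) {
        Deflater deflater = new Deflater(Deflater.BEST_SPEED);
        try {
            deflater.setInput(page);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[1024];
            while (!deflater.finished()) {
                int n = deflater.deflate(buf);
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } finally {
            deflater.end(); // release native memory held by the deflater
        }
    }

    /** Restores a page image of the given (assumed fixed) size. */
    public static byte[] decompress(byte[] compressed, int pageSize) {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(compressed);
            byte[] page = new byte[pageSize];
            int off = 0;
            while (off < pageSize && !inflater.finished()) {
                int n = inflater.inflate(page, off, pageSize - off);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalStateException("truncated or invalid page image");
                }
                off += n;
            }
            if (off != pageSize) {
                throw new IllegalStateException("unexpected page length: " + off);
            }
            return page;
        } catch (DataFormatException e) {
            throw new IllegalStateException("corrupt compressed page", e);
        } finally {
            inflater.end();
        }
    }

    public static void main(String[] args) {
        byte[] page = new byte[4096];          // hypothetical page size
        Arrays.fill(page, 0, 100, (byte) 7);   // a little data, the rest is empty
        byte[] packed = compress(page);
        byte[] restored = decompress(packed, page.length);
        System.out.println(page.length + " -> " + packed.length
                + " bytes, round trip ok: " + Arrays.equals(page, restored));
    }
}
```

A mostly empty page should shrink considerably this way, but every page read 
and write then pays for inflate/deflate, so this would need measuring before 
drawing any conclusions.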

Regarding the processing, we may be able to utilize the field storing 
the number of deleted records in the page header to decide if we need to 
scan the page for deleted records, and use the slot table to calculate 
the amount of free space on the page. I don't know which approach 
performs best: use the existing "meta-structures" to decide whether to 
scan, or just scan the byte array unconditionally.
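
The "meta-structure" variant could look roughly like the sketch below. It is 
only an illustration of the decision logic: the class, the method and all 
parameters are hypothetical, the values are assumed to have been read from the 
page header and slot table already, and the reclaim threshold is my own 
addition rather than anything Derby defines.

```java
// Hypothetical decision helper, not part of Derby and not tied to its page layout.
public class ScanDecisionSketch {

    /**
     * Decides whether a page is worth scanning for deleted records.
     *
     * @param deletedRecords      deleted-record count taken from the page header
     * @param pageSize            total size of the page in bytes
     * @param usedBytes           bytes still in use according to the slot table
     * @param minReclaimableBytes assumed threshold below which a scan is skipped
     */
    public static boolean shouldScan(int deletedRecords, int pageSize,
                                     int usedBytes, int minReclaimableBytes) {
        if (deletedRecords == 0) {
            return false; // header says nothing was deleted, skip the scan
        }
        int freeBytes = pageSize - usedBytes; // free space derived from the slot table
        return freeBytes >= minReclaimableBytes;
    }

    public static void main(String[] args) {
        System.out.println(shouldScan(0, 4096, 1000, 512));
        System.out.println(shouldScan(3, 4096, 1000, 512));
        System.out.println(shouldScan(3, 4096, 4000, 512));
    }
}
```

The unconditional variant would drop this check and always walk the byte 
array, which is simpler but touches every page.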

And finally, is this added complexity worth the gain?
Is there an easier way?


Regards,
-- 
Kristian

>
> Thanks,
> -Rick
>
> Kristian Waagan wrote:
>> Hello,
>>
>> In Derby 10.5 an in-memory back end, or storage engine, was included. 
>> It stores all the data in main memory, with the exception of 
>> derby.log. If this is news to you, and you want a quick intro to it, 
>> see [1] and [2].
>>
>> I'm trying to gather some feedback on whether the current 
>> implementation is found acceptable, or if there are additional 
>> features people would like to see. I expect some wishes to emerge, 
>> and I plan to record these on the wiki page [1]. The page can then be 
>> used to guide further work in this area.
>>
>> To start the discussion, I'll list some potential features and tasks. 
>> Feel free to comment on any one of them either by replying to this 
>> thread, or by adding your comments to [1]. It can be a +1 or -1 on 
>> the feature itself, a suggestion for a new feature, or details on 
>> what a feature should look like.
>>
>>
>> * Documentation
>> Must at least document the JDBC subsubprotocol, and also explain how 
>> to delete in-memory databases.
>> If new features are added, these must be documented as well.
>>
>> * Deletion of in-memory databases
>> Currently the only ways to delete an in-memory database are to 
>> restart the JVM or use a static method that isn't part of Derby's 
>> public API. A proper mechanism for deletion should be added.
>>
>> * Automatic deletion on database shutdown (or when last connection 
>> disconnects)
>>
>> * "Anonymous in-memory databases"
>> A database which only the connection creating it can access, and when 
>> the connection goes away the database goes away.
>>
>> * Automatic persistence
>> The database could be persisted to disk automatically based on 
>> certain criteria. The most obvious ones are perhaps on a fixed 
>> interval and on JVM shutdown.
>>
>> * Monitoring
>> The most basic information is how many in-memory databases exist in 
>> the current JVM, and how big they are. How should this information be 
>> presented? Should it be available to anyone having a connection to 
>> the current JVM?
>>
>> * No derby.log
>> Include a class in Derby that will discard everything written to 
>> derby.log.
>>
>>
>> Thank you for your feedback,
>