You are viewing a plain text version of this content. The canonical link for it is here.
Posted to java-user@lucene.apache.org by Johannes Christen <j....@insiders.de> on 2006/07/06 17:06:45 UTC

Berkeley DB JEDirectory Performance

Hi all.
 
I just want to share my experience with the Berkeley DB JEDirectory
implementation from the contrib. area. I spend two days evaluating and
testing it and found out that it does work, but has very bad performance
and very high disk requirements for medium size document volume. 
 
I indexed about 78000 documents (DPA news items) in the FSDirectory and
the JEDirectory, and here are the results:
 
Disk usage (index size):
FSDirectory: 322 MB
JEDirectory: 4650 MB
 
Indexing Performance:
FSDirectory: 84 minutes
JEDirectory: 402 minutes
 
Searching:
Initial opening of the JEDirectory took about 45 minutes.
The searching itself was ok, but still about 1.5 times slower than with
the FSDirectory.
 
Ok. I hope than helped people who consider using the Berkeley DB
directory implementation in their application. It may do a good job if
you want to use transactions in small environments, but if the amount of
documents is getting big I wouldn't recommend the JEDirectory
implementation.
 
Bye for now
 
    Jo Christen
 
 

Re: Berkeley DB JEDirectory Performance

Posted by Erick Erickson <er...@gmail.com>.
Thanks, I always appreciate someone else doing work for me <G>....

Best
Erick

Re: Berkeley DB JEDirectory Performance

Posted by Otis Gospodnetic <ot...@yahoo.com>.
Thanks Jo.  You may want to look for Andi Vajda's email with performance numbers, too.  I think he did send them out when he first contributed DbDirectory, and I don't recall the numbers being this bad.

Otis

----- Original Message ----
From: Johannes Christen <j....@insiders.de>
To: java-user@lucene.apache.org
Sent: Thursday, July 6, 2006 11:06:45 AM
Subject: Berkeley DB JEDirectory Performance

Hi all.
 
I just want to share my experience with the Berkeley DB JEDirectory
implementation from the contrib. area. I spend two days evaluating and
testing it and found out that it does work, but has very bad performance
and very high disk requirements for medium size document volume. 
 
I indexed about 78000 documents (DPA news items) in the FSDirectory and
the JEDirectory, and here are the results:
 
Disk usage (index size):
FSDirectory: 322 MB
JEDirectory: 4650 MB
 
Indexing Performance:
FSDirectory: 84 minutes
JEDirectory: 402 minutes
 
Searching:
Initial opening of the JEDirectory took about 45 minutes.
The searching itself was ok, but still about 1.5 times slower than with
the FSDirectory.
 
Ok. I hope than helped people who consider using the Berkeley DB
directory implementation in their application. It may do a good job if
you want to use transactions in small environments, but if the amount of
documents is getting big I wouldn't recommend the JEDirectory
implementation.
 
Bye for now
 
    Jo Christen
 
 




---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org