You are viewing a plain text version of this content. The canonical link for it is here.
Posted to java-user@lucene.apache.org by Dmitry Goldenberg <dm...@weblayers.com> on 2006/04/06 20:16:47 UTC

RE: Data structure of a Lucene Index

Ideally, I'd love to see an article explaining both in detail: the index structure as well as the merge algorithm...

________________________________

From: Prasenjit Mukherjee [mailto:prasenjitm@aol.com]
Sent: Tue 3/28/2006 11:57 PM
To: java-user@lucene.apache.org
Subject: Data structure of a Lucene Index



It seems to me that lucene doesn't use B-tree for its indexing storage.
Any paper/article which explains the theory behind data-structure of 
single index(segment).  I am not referring to the merge algorithm, I am
curious to know the storage structure of a single optimized lucene index.

Any pointer is greatly appreciated.
--Prasen

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org





Re: Data structure of a Lucene Index

Posted by Prasenjit Mukherjee <pr...@aol.com>.
I think Doug's paper ( specifically the Seek and Transfer section ) is 
the closest I could get. A little bit detailed explanation can be found 
in Yates' book on Information-Retreival.  I agree with Dimitry, a 
detailed explanation (or even pointers to some existing arcticle would 
be beneficial to all of us).

--prasen

------------------------------------------------------------


I talked about this a bit in a presentation at Haifa last year:

http://www.haifa.ibm.com/Workshops/ir2005/papers/DougCutting-Haifa05.pdf

See the section on "Seek versus Transfer".

Doug


Dmitry Goldenberg wrote:

>Ideally, I'd love to see an article explaining both in detail: the index structure as well as the merge algorithm...
>
>________________________________
>
>From: Prasenjit Mukherjee [mailto:prasenjitm@aol.com]
>Sent: Tue 3/28/2006 11:57 PM
>To: java-user@lucene.apache.org
>Subject: Data structure of a Lucene Index
>
>
>
>It seems to me that lucene doesn't use B-tree for its indexing storage.
>Any paper/article which explains the theory behind data-structure of 
>single index(segment).  I am not referring to the merge algorithm, I am
>curious to know the storage structure of a single optimized lucene index.
>
>Any pointer is greatly appreciated.
>--Prasen
>
>---------------------------------------------------------------------
>To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>For additional commands, e-mail: java-user-help@lucene.apache.org
>
>
>
>  
>
>------------------------------------------------------------------------
>
>---------------------------------------------------------------------
>To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>For additional commands, e-mail: java-user-help@lucene.apache.org
>  
>