You are viewing a plain text version of this content. The canonical link for it is here.
Posted to java-user@lucene.apache.org by Jason Polites <ja...@gmail.com> on 2006/08/12 18:12:16 UTC

Re: WIll storing docs affect lucene's search performance ?

IMO you should avoid storing any data in the index that you don't need for
display.  Lucene is an index (and a damn good one), not a database.  If you
find yourself storing large amounts of data in the index, this could be an
indication that you may need to re-think your architecture.

In its simplest case, data storage is for storing data. Lucene is for
indexing the data and searching it.

You will certainly see performance implications with storing data in the
index, particularly if you elect to have the data compressed by lucene.  The
lazy loading in the current trunk will help enormously with this (great work
by the dev team), but I would still encourage you to design a system in
which lucene is not the primary source of data.  That is, if you need to
re-index, get the data from its source location (or some interim location)
rather than relying on storing it in lucene.

I had all sorts of struggles with this very issue, and after several failed
attempts came to the conclusion that whilst Lucene often forms a critical
part of a good solution, it should only be used as an index/search tool..
not a database.


On 8/12/06, Grant Ingersoll <gs...@syr.edu> wrote:
>
>
> Large stored fields can affect performance when you are iterating
> over your hits (assuming you are not interested in the value of the
> stored field at that point in time) for a results display since all
> Fields are loaded when getting the Document.  The SVN trunk has a
> version of lazy loading that allows you to specify which fields are
> loaded and which ones are lazy, so you can avoid loading fields that
> a user will never view.
>
> -Grant
>
> On Aug 11, 2006, at 9:07 AM, Prasenjit Mukherjee wrote:
>
> > I have a requirement ( use highlighter) to  store the doc content
> > somewhere., and I am not allowed to use a RDBMS. I am thinking of
> > using Lucene's Field with (Field.Store.YES and Field.Index.NO) to
> > store the doc content. Will it have any negative affect on my
> > search performance ?
> > I think I have read somewhere that  Lucene shouldn't be used(or
> > misused)  to provide RDBMS like storage.
> >
> > --prasen
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: java-user-help@lucene.apache.org
> >
>
> --------------------------
> Grant Ingersoll
> Sr. Software Engineer
> Center for Natural Language Processing
> Syracuse University
> 335 Hinds Hall
> Syracuse, NY 13244
> http://www.cnlp.org
>
> Voice: 315-443-5484
> Skype: grant_ingersoll
> Fax: 315-443-6886
>
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>