You are viewing a plain text version of this content. The canonical link for it is here.
Posted to java-user@lucene.apache.org by Moti Nisenson <mo...@gmail.com> on 2007/04/16 09:59:48 UTC

Merging Indeces

Hi,

I was wondering if the addIndexes() method in IndexWriter can be used for
updating documents.

Specifically, I'd like to leave my primary index alone during the update
process. Instead, I want to use a separate index (on a RAMDirectory), and to
make the updateDocument() calls on it and finally to merge the changes to
the primary index. This leads to several questions:

1) If I make an updateDocument call and the document isn't present in the
separate index, will it fail?
2) After I merge the indeces, will the documents in the primary index be
updated (i.e. deleted and then added if existing, otherwise just added)?
3) If the merge does update the primary index, does it all occur in a single
transaction?

Thanks for the help!

Moti

Re: Merging Indeces

Posted by Lukas Vlcek <lu...@gmail.com>.
Hi,

try to look at Compass (http://www.opensymphony.com/compass/). It is built
on top of Lucene but provides additional concepts (transactions is one of
them). You might find this useful depending on your needs.

Regards,
Lukas

On 4/16/07, Erick Erickson <er...@gmail.com> wrote:
>
> See below.
>
> On 4/16/07, Moti Nisenson <mo...@gmail.com> wrote:
> >
> > Hi,
> >
> > I was wondering if the addIndexes() method in IndexWriter can be used
> for
> > updating documents.
> >
> > Specifically, I'd like to leave my primary index alone during the update
> > process. Instead, I want to use a separate index (on a RAMDirectory),
> and
> > to
> > make the updateDocument() calls on it and finally to merge the changes
> to
> > the primary index. This leads to several questions:
> >
> > 1) If I make an updateDocument call and the document isn't present in
> the
> > separate index, will it fail?
> > 2) After I merge the indeces, will the documents in the primary index be
> > updated (i.e. deleted and then added if existing, otherwise just added)?
>
>
> No, absolutely not. Lucene identifies documents by document ID. All the
> documents in your secondary index will get unique Lucene IDs and
> Lucene will NOT remove any documents from your original index.
>
> Lucene has no concept of "document identity" in that you can index
> the same document 15 times in a row and Lucene will have 15 entries.
>
>
> 3) If the merge does update the primary index, does it all occur in a
> single
> > transaction?
>
>
> "transaction" is really not a Lucene concept. What kind of behavior are
> you really looking for here?
>
>
> Thanks for the help!
> >
> > Moti
> >
>

Re: Merging Indeces

Posted by Erick Erickson <er...@gmail.com>.
See below.

On 4/16/07, Moti Nisenson <mo...@gmail.com> wrote:
>
> Hi,
>
> I was wondering if the addIndexes() method in IndexWriter can be used for
> updating documents.
>
> Specifically, I'd like to leave my primary index alone during the update
> process. Instead, I want to use a separate index (on a RAMDirectory), and
> to
> make the updateDocument() calls on it and finally to merge the changes to
> the primary index. This leads to several questions:
>
> 1) If I make an updateDocument call and the document isn't present in the
> separate index, will it fail?
> 2) After I merge the indeces, will the documents in the primary index be
> updated (i.e. deleted and then added if existing, otherwise just added)?


No, absolutely not. Lucene identifies documents by document ID. All the
documents in your secondary index will get unique Lucene IDs and
Lucene will NOT remove any documents from your original index.

Lucene has no concept of "document identity" in that you can index
the same document 15 times in a row and Lucene will have 15 entries.


3) If the merge does update the primary index, does it all occur in a single
> transaction?


"transaction" is really not a Lucene concept. What kind of behavior are
you really looking for here?


Thanks for the help!
>
> Moti
>