Posted to java-user@lucene.apache.org by Jochen <jo...@gmail.com> on 2011/11/02 11:09:20 UTC

Adding metadata to Lucene indexes?

Hi,

is it possible to add metadata to a Lucene index itself (not to the
individual Fields or Documents contained in the index)? We need to
periodically update an index by importing an XML document, and are
looking for a nice cozy place to store an import date and a checksum
that tells us if our input has changed.

Regards,
Jochen



---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Adding metadata to Lucene indexes?

Posted by "Francisco A. Lozano" <fl...@gmail.com>.
Thank you :) this is very useful.

Until today I maintained a "metadata" key=>value text file inside the
Lucene index directories, but this feature looks better.

Francisco A. Lozano
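
For comparison, the sidecar-file approach mentioned above could look roughly like the sketch below. It uses only the JDK (java.util.Properties); the class name, file name and key are hypothetical and not taken from the thread.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

// Hypothetical helper: keeps key=>value metadata in a text file next to the index.
class SidecarMetadata {
    // Illustrative file name; chosen so it does not look like a Lucene index file.
    static final String FILE_NAME = "index-metadata.properties";

    // Writes all entries to <indexDir>/index-metadata.properties.
    static void save(Path indexDir, Properties meta) {
        try (OutputStream out = Files.newOutputStream(indexDir.resolve(FILE_NAME))) {
            meta.store(out, "index metadata");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads the file back; returns empty Properties if it does not exist yet.
    static Properties load(Path indexDir) {
        Properties meta = new Properties();
        Path file = indexDir.resolve(FILE_NAME);
        if (Files.exists(file)) {
            try (InputStream in = Files.newInputStream(file)) {
                meta.load(in);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return meta;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("index");
        Properties meta = new Properties();
        meta.setProperty("import.checksum", "abc123");
        save(dir, meta);
        System.out.println(load(dir).getProperty("import.checksum"));
    }
}
```

One drawback of this design is that the file is written separately from the index commit, so the two can get out of sync if the process dies between them. Metadata attached to the commit itself does not have that gap.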



On Fri, Nov 4, 2011 at 08:39, Uwe Schindler <uw...@thetaphi.de> wrote:
> You can read the Map without opening an IndexReader just by a static method in IndexReader that simply reads the segments file.



RE: Adding metadata to Lucene indexes?

Posted by Uwe Schindler <uw...@thetaphi.de>.
Hi,

Sorry, correction: If you call commit() without a map, it should preserve the previous map (according to the docs).

You can read the Map without opening an IndexReader just by a static method in IndexReader that simply reads the segments file.

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de
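
Since the two answers in this thread differ on whether a plain commit() keeps the old map, a defensive option is to always pass the complete map: take the map read from the last commit, overlay the new values, and commit the result. The sketch below shows only that merge step with plain JDK collections; the class and method names are hypothetical, and reading the previous map and the commit call itself are left out.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: builds the full user-data map for the next commit.
class CommitDataMerge {
    // Copies the previous entries (if any) and overlays the updates on top.
    // Neither input map is modified.
    static Map<String, String> merge(Map<String, String> previous,
                                     Map<String, String> updates) {
        Map<String, String> merged = new HashMap<>();
        if (previous != null) {
            merged.putAll(previous);
        }
        merged.putAll(updates);
        return Collections.unmodifiableMap(merged);
    }

    public static void main(String[] args) {
        Map<String, String> previous = new HashMap<>();
        previous.put("import.date", "2011-11-02");
        previous.put("import.checksum", "old");

        Map<String, String> updates = new HashMap<>();
        updates.put("import.checksum", "new");

        // The date is carried over, the checksum is replaced.
        System.out.println(merge(previous, updates));
    }
}
```

Passing the merged map on every commit works the same way whichever of the two behaviours the Lucene version in use actually has.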




RE: Adding metadata to Lucene indexes?

Posted by Uwe Schindler <uw...@thetaphi.de>.
It must be passed to every commit.

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de



Re: Adding metadata to Lucene indexes?

Posted by "Francisco A. Lozano" <fl...@gmail.com>.
Does this metadata Map need to be written on every commit, or does a
plain commit() without the Map<> keep the old values?


Francisco A. Lozano





Re: Adding metadata to Lucene indexes?

Posted by Greg Bowyer <gb...@shopzilla.com>.
I would look at the metadata for this. The magic document is something
that I did previously for exactly this problem, and two weeks later we
removed it because so much of the code had to start checking whether a
document was the magic document.

The only catch with Lucene metadata is that Solr currently does not
expose it.


Re: Adding metadata to Lucene indexes?

Posted by Jochen <jo...@gmail.com>.
Thanks for the help. Following up on that, how can I create a document
that is not returned by "normal" searches, and still retrieve it when I
need access to my metadata? There seems to be no reliable
"document id" that I can use for this.

Regards,
Jochen



RE: Adding metadata to Lucene indexes?

Posted by Uwe Schindler <uw...@thetaphi.de>.
There is also commit user data (a String-Map). When you commit the index
writer you can attach that metadata. It's readable by IndexReader.

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de
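
Because the commit user data is a String-to-String map, the import date and the checksum from the original question both have to be stored as strings. Below is a minimal JDK-only sketch of building such a map and checking whether the input has changed. The class name, key names and the choice of SHA-256 are assumptions for illustration. In the 3.x releases current at the time of this thread, the map would be handed to IndexWriter.commit(Map<String,String>) and read back with the static IndexReader.getCommitUserData(Directory). Later releases changed these entry points, so check the Javadoc of the version in use.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: class, method and key names are illustrative only.
class ImportMetadata {
    static final String KEY_CHECKSUM = "import.checksum";
    static final String KEY_DATE = "import.date";

    // Hex-encoded SHA-256 of the raw XML bytes.
    static String checksum(byte[] xml) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest(xml)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 must be supported by every Java platform implementation.
            throw new IllegalStateException(e);
        }
    }

    // Builds the String-to-String map that would be attached to the commit.
    static Map<String, String> build(byte[] xml, Instant importedAt) {
        Map<String, String> data = new HashMap<>();
        data.put(KEY_CHECKSUM, checksum(xml));
        data.put(KEY_DATE, importedAt.toString()); // ISO-8601 timestamp
        return data;
    }

    // True if nothing was stored yet or the stored checksum differs.
    static boolean needsImport(Map<String, String> stored, byte[] xml) {
        if (stored == null) {
            return true;
        }
        return !checksum(xml).equals(stored.get(KEY_CHECKSUM));
    }

    public static void main(String[] args) {
        byte[] v1 = "<docs><doc id=\"1\"/></docs>".getBytes(StandardCharsets.UTF_8);
        byte[] v2 = "<docs><doc id=\"2\"/></docs>".getBytes(StandardCharsets.UTF_8);
        Map<String, String> stored = build(v1, Instant.now());
        System.out.println(needsImport(stored, v1)); // unchanged input
        System.out.println(needsImport(stored, v2)); // changed input
    }
}
```

On the next import run, the stored map would be read first and the XML only re-imported when needsImport returns true.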



Re: Adding metadata to Lucene indexes?

Posted by Ian Lea <ia...@gmail.com>.
You could add a dedicated document to the index storing whatever you
want.  There is no requirement for lucene docs to bear any relation to
each other.


--
Ian.

