Posted to solr-user@lucene.apache.org by Brian Whitman <br...@variogr.am> on 2007/07/07 18:07:08 UTC

history

I have been trying to plan out a history function for Solr. When I  
update a document with an existing unique key, I would like the older  
version to stay around and get tagged with the date and some metadata  
to indicate it's not "live." Any normal search would not touch  
history documents.

Is this best accomplished with a custom update request handler?

-Brian



RE: history

Posted by "Norskog, Lance" <la...@divvio.com>.
We have another use case. We would like to count the number of times a
document came up in any search, and the total number of times it was
read. If these counters are not indexed, it seems like an update would
be a simple integer poke into the index. 

Also, thanks for the spellcheck info.

-----Original Message-----
From: yseeley@gmail.com [mailto:yseeley@gmail.com] On Behalf Of Yonik
Seeley
Sent: Saturday, July 07, 2007 9:21 AM
To: solr-user@lucene.apache.org
Subject: Re: history

On 7/7/07, Brian Whitman <br...@variogr.am> wrote:
> I have been trying to plan out a history function for Solr. When I 
> update a document with an existing unique key, I would like the older 
> version to stay around and get tagged with the date and some metadata 
> to indicate it's not "live." Any normal search would not touch history
> documents.

Interesting...
One might be able to accomplish this with the update processors that
Ryan & I have been batting around for the last few days, in conjunction
with updateable documents, which is on-deck.

The first idea that comes to mind is that during an update, you could
change the id of the older document to be something like id_<timestamp>,
and reindex it with the addition of a live:false field.

For normal queries, use a filter of -live:false.
For all old versions of a document, use a prefix query id:mydocid_*
For all versions of a document, use query id:mydocid*

So if you can hold off a little bit, you shouldn't need a custom query
handler.  This will be a good use case to ensure that our request
processors and updateable documents are powerful enough.

-Yonik

Re: history

Posted by climbingrose <cl...@gmail.com>.
Coincidentally, I have a very similar use case. Thanks for the advice.

On 7/8/07, Yonik Seeley <yo...@apache.org> wrote:
>
> On 7/7/07, Brian Whitman <br...@variogr.am> wrote:
> > I have been trying to plan out a history function for Solr. When I
> > update a document with an existing unique key, I would like the older
> > version to stay around and get tagged with the date and some metadata
> > to indicate it's not "live." Any normal search would not touch
> > history documents.
>
> Interesting...
> One might be able to accomplish this with the update processors that
> Ryan & I have been batting around for the last few days, in
> conjunction with updateable documents, which is on-deck.
>
> The first idea that comes to mind is that during an update, you could
> change the id of the older document to be something like
> id_<timestamp>, and reindex it with the addition of a live:false
> field.
>
> For normal queries, use a filter of -live:false.
> For all old versions of a document, use a prefix query id:mydocid_*
> For all versions of a document, use query id:mydocid*
>
> So if you can hold off a little bit, you shouldn't need a custom query
> handler.  This will be a good use case to ensure that our request
> processors and updateable documents are powerful enough.
>
> -Yonik
>



-- 
Regards,

Cuong Hoang

Re: history

Posted by Yonik Seeley <yo...@apache.org>.
On 7/7/07, Brian Whitman <br...@variogr.am> wrote:
> I have been trying to plan out a history function for Solr. When I
> update a document with an existing unique key, I would like the older
> version to stay around and get tagged with the date and some metadata
> to indicate it's not "live." Any normal search would not touch
> history documents.

Interesting...
One might be able to accomplish this with the update processors that
Ryan & I have been batting around for the last few days, in
conjunction with updateable documents, which is on-deck.

The first idea that comes to mind is that during an update, you could
change the id of the older document to be something like
id_<timestamp>, and reindex it with the addition of a live:false
field.
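
[Editor's sketch, not part of the original message.] The renaming step
described above could look roughly like the following. It works on plain
Python dicts and does not call any Solr API. The function name
archive_copy, the timestamp format, and the boolean value used for the
live field are all assumptions made for illustration.

```python
from datetime import datetime, timezone


def archive_copy(old_doc, archived_at, id_field="id"):
    """Return a history copy of old_doc: id becomes id_<timestamp>, live=False.

    Hypothetical helper: the timestamp layout is an assumption, chosen so
    that the suffix contains no characters special to the query syntax.
    """
    stamp = archived_at.strftime("%Y%m%dT%H%M%SZ")
    history = dict(old_doc)  # shallow copy; the caller's document is untouched
    history[id_field] = f"{old_doc[id_field]}_{stamp}"
    history["live"] = False
    return history


old = {"id": "mydocid", "title": "first draft"}
when = datetime(2007, 7, 7, 16, 21, 0, tzinfo=timezone.utc)
archived = archive_copy(old, when)
print(archived["id"])  # mydocid_20070707T162100Z
```

The archived copy would then be reindexed alongside the new version,
which keeps the original id.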

For normal queries, use a filter of -live:false.
For all old versions of a document, use a prefix query id:mydocid_*
For all versions of a document, use query id:mydocid*
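
[Editor's sketch, not part of the original message.] The three queries
above could be assembled as plain strings, as below. The helper names are
hypothetical. Passing the filter through the fq request parameter is an
assumption about how a client would send it. Ids are assumed to contain no
characters that need escaping in the query syntax. Note also that a bare
prefix such as id:mydocid* matches any id that begins with those
characters (for example mydocid2), so this assumes ids are chosen so that
one id is never a prefix of another.

```python
def live_search_params(user_query):
    """Parameters for a normal search that skips history documents."""
    return {"q": user_query, "fq": "-live:false"}


def old_versions_query(doc_id):
    """Prefix query matching only archived copies (ids like mydocid_<timestamp>)."""
    return f"id:{doc_id}_*"


def all_versions_query(doc_id):
    """Prefix query matching the live document and all archived copies."""
    return f"id:{doc_id}*"


params = live_search_params("title:solr")
old_q = old_versions_query("mydocid")
all_q = all_versions_query("mydocid")
print(params, old_q, all_q)
```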

So if you can hold off a little bit, you shouldn't need a custom query
handler.  This will be a good use case to ensure that our request
processors and updateable documents are powerful enough.

-Yonik