Posted to solr-user@lucene.apache.org by Fredrik Rødland <so...@rodland.no> on 2013/05/23 13:53:16 UTC

hook to know when a DOC is committed.

I need to know when a document is committed in Solr, i.e. when it becomes searchable.

Does anyone have a solution for this?

I'm aware of three ways to create "hooks" that fire when a doc is added or a commit is performed, but the document id does not seem to be passed to the commit hooks (naturally, I guess):

A. subclass DirectUpdateHandler2 and override commit and/or addDoc
B. subclass UpdateRequestProcessor (and include it in the update-chain) and override processAdd and/or processCommit
C. implement SolrEventListener and implement postCommit and/or postSoftCommit
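
One way to picture combining these hooks: remember each id when a doc is added (as in B) and hand the remembered ids over when a commit callback fires (as in C). The following is a minimal Python sketch of that idea only, not Solr code. The class name CommitNotifier, the methods on_add and on_commit, and the publish callback are all hypothetical. It also assumes that every doc added before the commit callback fires is covered by that commit, which may not hold when updates arrive concurrently with a commit.

```python
import threading


class CommitNotifier:
    """Toy model: collect ids on add, publish them on commit.

    All names here are hypothetical; this is not part of any Solr API.
    """

    def __init__(self, publish):
        self._publish = publish          # callback that receives a sorted list of ids
        self._pending = set()            # ids added since the last commit
        self._lock = threading.Lock()

    def on_add(self, doc_id):
        # Would be called from the "doc added" hook.
        with self._lock:
            self._pending.add(doc_id)

    def on_commit(self):
        # Would be called from the "commit finished" hook.
        # Swap the pending set out under the lock, then publish outside it.
        with self._lock:
            batch, self._pending = self._pending, set()
        if batch:
            self._publish(sorted(batch))


if __name__ == "__main__":
    notifier = CommitNotifier(print)
    notifier.on_add("doc-2")
    notifier.on_add("doc-1")
    notifier.on_commit()   # publishes both ids once
    notifier.on_commit()   # nothing pending, so nothing is published
```

In a real setup the publish callback would be whatever sends a message to the rest of the system.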

The use case is to let other parts of a system know that a document is searchable without having to create a poller that has to keep state about when and how it polls.

Any ideas or tricks out there?


Fredrik


--
Fredrik Rødland               Mail:    fredrik.rodland@finn.no
FINN.no                       Cell:    +47 99 21 98 17
                              Twitter: @fredrikr
Oslo, NORWAY                  Web:     http://about.me/fmr


Re: hook to know when a DOC is committed.

Posted by Jack Krupansky <ja...@basetechnology.com>.
Yes, by definition, a poller retries. But by picking a sensible default for 
the initial poll and retry (possibly an initial delay tuned to match the 
average update/commit time), coupled with a traditional exponential backoff, 
that should not be a problem at all. In other words, an average request would 
not require a retry.
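
A minimal Python sketch of such a tuned poller, under stated assumptions: the function name poll_until_visible, its parameters, and the default delays are illustrative only, and is_visible stands for whatever check the client uses to decide that the doc is searchable. The sleep function is injectable so the timing can be tested without waiting.

```python
import time


def poll_until_visible(is_visible, initial_delay=1.0, retry_delay=0.25,
                       factor=2.0, max_attempts=6, sleep=time.sleep):
    """Return the number of checks that were needed, or None if the doc
    never became visible within max_attempts checks.

    initial_delay is meant to be tuned to the average update/commit time,
    so that the first check usually succeeds.
    """
    sleep(initial_delay)
    delay = retry_delay
    for attempt in range(1, max_attempts + 1):
        if is_visible():
            return attempt
        if attempt < max_attempts:
            sleep(delay)          # wait before the next check
            delay *= factor       # exponential backoff
    return None


if __name__ == "__main__":
    # Simulated check: not visible on the first try, visible on the second.
    answers = iter([False, True])
    print(poll_until_visible(lambda: next(answers),
                             initial_delay=0.01, retry_delay=0.01))
```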

Even so, do you feel that there is some sort of problem with retry? If so, 
please state what it is.

Again, if you utilize soft commit, the time to commit will be significantly 
reduced.

Or just go ahead and force a commit on every update where the delay of a poll 
request is not acceptable. But I'd recommend the tuned poller.

"would require a whole bunch of logic" - and you think the commit hooks and 
your push model implementation (on both Solr and client side) will be less 
logic?!!

-- Jack Krupansky



Re: hook to know when a DOC is committed.

Posted by Fredrik Rødland <so...@rodland.no>.
On 23. mai 2013, at 14:05, Jack Krupansky <ja...@basetechnology.com> wrote:

Hi Jack,

thanks for your answer.

> A poller really is the most sensible, practical, and easiest route to go. If you add the "versions=true" parameter to your update request and have the transaction log enabled the update response will have the version numbers for each document id, then the poller can also tell if an update has been committed as well.

The poller will still have to retry before advertising a doc as searchable - won't it?

> Do you have some other, unmentioned requirement that you feel is biasing you against a sensible poller? Clue us in as to the nature of such a requirement.

My plan was to link Solr with our already established high-volume messaging system, so that each time a document becomes searchable a message would be broadcast on a given channel.

Our system consists of approximately 10 indexes, each with 8 replicas, so keeping track of all of these with pollers would require a whole bunch of logic.  Having a push-based system would make it much easier to know where and when a document is searchable.



regards,


Fredrik




Re: hook to know when a DOC is committed.

Posted by Jack Krupansky <ja...@basetechnology.com>.
A poller really is the most sensible, practical, and easiest route to go. If 
you add the "versions=true" parameter to your update request and have the 
transaction log enabled, the update response will have the version numbers 
for each document id, so the poller can also tell whether a given update has 
been committed.
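
As a rough illustration of that check, here is a minimal Python sketch. It assumes the client has already taken the version for a document id from the update response, and that the poller then runs a query for that id which returns the id and _version_ fields as JSON. The function name is_committed is hypothetical, as are the assumptions that the unique key field is called "id" and that a later update of the same doc carries a larger version number.

```python
def is_committed(doc_id, expected_version, select_response):
    """Return True if the parsed JSON search response contains doc_id
    with a _version_ at least as new as the one the update returned.

    select_response is the already-parsed JSON body of a search request,
    e.g. {"response": {"docs": [{"id": "...", "_version_": 123}]}}.
    """
    docs = select_response.get("response", {}).get("docs", [])
    for doc in docs:
        if doc.get("id") == doc_id and doc.get("_version_", 0) >= expected_version:
            return True
    return False


if __name__ == "__main__":
    response = {"response": {"numFound": 1, "start": 0,
                             "docs": [{"id": "doc1", "_version_": 105}]}}
    print(is_committed("doc1", 105, response))
```

A poller would call a function like this after each search request and stop retrying once it returns True.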

Also, with soft commit, documents should be visible much more rapidly.

Do you have some other, unmentioned requirement that you feel is biasing you 
against a sensible poller? Clue us in as to the nature of such a 
requirement.

-- Jack Krupansky
