Posted to solr-user@lucene.apache.org by Martin Grotzke <ma...@googlemail.com> on 2010/12/13 02:04:37 UTC

Rebuild Spellchecker based on cron expression

Hi,

the spellchecker component already provides a buildOnCommit and
buildOnOptimize option.

Since we have several spellchecker indices, building on each commit is
not really what we want to do.
Building on optimize is not possible as index optimization is done on
the master and the slaves don't even run an optimize but only fetch
the optimized index.

Therefore I'm thinking about an extension of the spellchecker that
allows you to rebuild the spellchecker based on a cron-expression
(e.g. rebuild each night at 1 am).

What do you think about this? Is anybody else interested in it?

Regarding the lifecycle, is there already some executor "framework" or
any regularly running process in place, or would I have to spin up my
own thread? If so, how can I stop my thread when solr/tomcat is
shut down (I couldn't see any shutdown or destroy method in
SearchComponent)?
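
A minimal sketch of the "own thread" variant, using only the JDK and no
Solr classes. The class NightlyRebuildScheduler and its methods are
hypothetical names made up for illustration; the actual rebuild is passed
in as a Runnable, and whatever owns the scheduler would have to call
close() when the core or the webapp shuts down.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper, not part of Solr: runs a task once a day at a fixed time. */
final class NightlyRebuildScheduler implements AutoCloseable {

    private final ScheduledExecutorService executor =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "spellcheck-rebuild");
                t.setDaemon(true); // never keeps the JVM alive on its own
                return t;
            });

    /** Time from 'now' until the next occurrence of 'timeOfDay'. */
    static Duration delayUntil(LocalDateTime now, LocalTime timeOfDay) {
        LocalDateTime next = now.toLocalDate().atTime(timeOfDay);
        if (!next.isAfter(now)) {
            next = next.plusDays(1); // today's slot has passed, use tomorrow's
        }
        return Duration.between(now, next);
    }

    /** Schedules 'rebuild' for the next 'timeOfDay' and every 24 hours after that. */
    void start(LocalTime timeOfDay, Runnable rebuild) {
        long initialDelayMs = delayUntil(LocalDateTime.now(), timeOfDay).toMillis();
        executor.scheduleAtFixedRate(() -> {
            try {
                rebuild.run();
            } catch (RuntimeException e) {
                // An uncaught exception would cancel all further runs.
                e.printStackTrace();
            }
        }, initialDelayMs, TimeUnit.DAYS.toMillis(1), TimeUnit.MILLISECONDS);
    }

    /** Stops the scheduler thread; call this on shutdown. */
    @Override
    public void close() {
        executor.shutdownNow();
    }
}
```

This is a fixed 24-hour period rather than a real cron expression, so it
drifts by an hour across daylight-saving changes; a cron library would be
needed for anything more flexible.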

Thanx for your feedback,
cheers,
Martin

Re: Rebuild Spellchecker based on cron expression

Posted by Martin Grotzke <ma...@googlemail.com>.
On Mon, Dec 13, 2010 at 4:01 AM, Erick Erickson <er...@gmail.com> wrote:
> I'm shooting in the dark here, but according to this:
> http://wiki.apache.org/solr/SolrReplication
> after the slave pulls the index down, it issues a commit. So if your
> slave is configured to generate the dictionary on commit, will it
> "just happen"?

Our slaves' spellcheckers are not configured to buildOnCommit,
so it shouldn't just happen.

>
> But according to this: https://issues.apache.org/jira/browse/SOLR-866
> this is an open issue....

Thanx for the pointer! SOLR-866 is even better suited for us - after
reading SOLR-433 again I realized that it targets script-based
replication (which we're going to leave behind).

Cheers,
Martin





-- 
Martin Grotzke
http://www.javakaffee.de/blog/

Re: Rebuild Spellchecker based on cron expression

Posted by Erick Erickson <er...@gmail.com>.
I'm shooting in the dark here, but according to this:
http://wiki.apache.org/solr/SolrReplication
after the slave pulls the index down, it issues a commit. So if your
slave is configured to generate the dictionary on commit, will it
"just happen"?

But according to this: https://issues.apache.org/jira/browse/SOLR-866
this is an open issue....

Best
Erick

On Sun, Dec 12, 2010 at 8:30 PM, Martin Grotzke <
martin.grotzke@googlemail.com> wrote:

> On Mon, Dec 13, 2010 at 2:12 AM, Markus Jelsma
> <ma...@openindex.io> wrote:
> > Maybe you've overlooked the build parameter?
> > http://wiki.apache.org/solr/SpellCheckComponent#spellcheck.build
> I'm aware of this, but we don't want to maintain cron-jobs on all
> slaves for all spellcheckers for all cores.
> That's why I'm thinking about a more integrated solution. Or did I
> really overlook something?
>
> Cheers,
> Martin

Re: Rebuild Spellchecker based on cron expression

Posted by Martin Grotzke <ma...@googlemail.com>.
On Mon, Dec 13, 2010 at 2:12 AM, Markus Jelsma
<ma...@openindex.io> wrote:
> Maybe you've overlooked the build parameter?
> http://wiki.apache.org/solr/SpellCheckComponent#spellcheck.build
I'm aware of this, but we don't want to maintain cron-jobs on all
slaves for all spellcheckers for all cores.
That's why I'm thinking about a more integrated solution. Or did I
really overlook something?

Cheers,
Martin





-- 
Martin Grotzke
http://twitter.com/martin_grotzke

Re: Rebuild Spellchecker based on cron expression

Posted by Markus Jelsma <ma...@openindex.io>.
Maybe you've overlooked the build parameter?
http://wiki.apache.org/solr/SpellCheckComponent#spellcheck.build


Re: Rebuild Spellchecker based on cron expression

Posted by ilanh <il...@gmail.com>.
What command are you using in your cron job on the slave to rebuild only the
spellcheck index?
The only option I have found is to query the slave for a dummy string and
append "&spellcheck.build=true" as a URL parameter.
E.g.
slave-solr:8983/solr/my-index/spell/?q=helllo&spellcheck.build=true&wt=xml
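
A minimal sketch of issuing the same request from Java instead of a shell
cron job, using only the JDK. The class SpellcheckRebuildClient is a
hypothetical name, and the host, core and "/spell" handler path are simply
taken from the example URL above, so they depend on the local setup. The
optional spellcheck.dictionary parameter selects one of several configured
spellcheckers; whether it is needed is an assumption about the handler
configuration.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper, not part of Solr: triggers a spellcheck rebuild over HTTP. */
final class SpellcheckRebuildClient {

    /** Builds the rebuild URL; 'dictionary' may be null to use the handler's default. */
    static String rebuildUrl(String hostAndPort, String core, String dictionary) {
        StringBuilder url = new StringBuilder("http://")
                .append(hostAndPort).append("/solr/").append(core)
                // The dummy query only exists to make the handler run.
                .append("/spell/?q=helllo&spellcheck.build=true&wt=xml");
        if (dictionary != null) {
            url.append("&spellcheck.dictionary=")
               .append(URLEncoder.encode(dictionary, StandardCharsets.UTF_8));
        }
        return url.toString();
    }

    /** Sends the GET request and returns the HTTP status code. */
    static int trigger(String url) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setConnectTimeout(5_000);
        conn.setReadTimeout(600_000); // a rebuild can take a while
        try {
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }
}
```

A caller would loop over its slaves, cores and dictionaries and call
trigger(rebuildUrl(...)) for each combination, treating any status other
than 200 as a failed rebuild.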




Re: Rebuild Spellchecker based on cron expression

Posted by Martin Grotzke <ma...@googlemail.com>.
Hi Erick,

thanx for your advice! I'll check the options with our client and see
how we'll proceed. My spare time right now is already filled with other
open source stuff, otherwise it'd be fun contributing something to solr!
:-)

Cheers,
Martin


On Mon, Dec 13, 2010 at 2:46 PM, Erick Erickson <er...@gmail.com> wrote:
> ***
> Just wondering what's the reason that this patch receives that little
> interest. Anything wrong with it?
> ***
>
> Nobody got behind it and pushed, I suspect. And since it's been a long time
> since it was updated, there's no guarantee that it would apply cleanly any
> more, or that it will perform as intended.
>
> So, if you're really interested, I'd suggest you ping the dev list and ask
> whether this is valuable or if it's been superseded. If the feedback is that
> this would be valuable, you can see what you can do to make it happen.
>
> Once it's working to your satisfaction and you've submitted a patch, let
> people know it's ready and ask them to commit it or critique it. You might
> have to remind the committers after a few days that it's ready and get it
> applied to trunk and/or 3.x.
>
> But I really wouldn't start working with it until I got some feedback from
> the people who are actively working on Solr whether it's been superseded by
> other functionality first; sometimes bugs just aren't closed when something
> else makes it obsolete.
>
> Here's a place to start: http://wiki.apache.org/solr/HowToContribute
>
> Best
> Erick



-- 
Martin Grotzke
http://www.javakaffee.de/blog/

Re: Rebuild Spellchecker based on cron expression

Posted by Erick Erickson <er...@gmail.com>.
***
Just wondering what's the reason that this patch receives that little
interest. Anything wrong with it?
***

Nobody got behind it and pushed, I suspect. And since it's been a long time
since it was updated, there's no guarantee that it would apply cleanly any
more, or that it will perform as intended.

So, if you're really interested, I'd suggest you ping the dev list and ask
whether this is valuable or if it's been superseded. If the feedback is that
this would be valuable, you can see what you can do to make it happen.

Once it's working to your satisfaction and you've submitted a patch, let
people know it's ready and ask them to commit it or critique it. You might
have to remind the committers after a few days that it's ready and get it
applied to trunk and/or 3.x.

But I really wouldn't start working with it until I got some feedback from
the people who are actively working on Solr whether it's been superseded by
other functionality first; sometimes bugs just aren't closed when something
else makes it obsolete.

Here's a place to start: http://wiki.apache.org/solr/HowToContribute

Best
Erick

On Mon, Dec 13, 2010 at 2:58 AM, Martin Grotzke <
martin.grotzke@googlemail.com> wrote:

> Hi,
>
> Thinking further about it, it's clear that
>  https://issues.apache.org/jira/browse/SOLR-433
> would be even better - we could generate the spellchecker indices on
> commit/optimize on the master and replicate them to all slaves.
>
> Just wondering what's the reason that this patch receives that little
> interest. Anything wrong with it?
>
> Cheers,
> Martin

Re: Rebuild Spellchecker based on cron expression

Posted by Martin Grotzke <ma...@googlemail.com>.
Hi,

Thinking further about it, it's clear that
  https://issues.apache.org/jira/browse/SOLR-433
would be even better - we could generate the spellchecker indices on
commit/optimize on the master and replicate them to all slaves.

Just wondering what's the reason that this patch receives that little
interest. Anything wrong with it?

Cheers,
Martin





-- 
Martin Grotzke
http://www.javakaffee.de/blog/

Re: Rebuild Spellchecker based on cron expression

Posted by Peter Karich <pe...@yahoo.de>.
> Building on optimize is not possible as index optimization is done on
> the master and the slaves don't even run an optimize but only fetch
> the optimized index.

Isn't the spellcheck index replicated to the slaves too?

-- 
http://jetwick.com open twitter search