Posted to solr-user@lucene.apache.org by Grant Ingersoll <gs...@apache.org> on 2008/01/17 20:26:48 UTC

Integrated Spellchecking

Is it feasible to submit a query to any of the various handlers and  
have it bring back results and spelling suggestions all in one  
response?  Is this something the query components piece would handle,  
assuming one exists for the spell checker?

Thanks,
Grant

Re: Integrated Spellchecking

Posted by Yonik Seeley <yo...@apache.org>.
On Jan 17, 2008 2:33 PM, Ryan McKinley <ry...@gmail.com> wrote:
> Yes -- this is what search components are for!
>
> Depending on where you put it in the chain, it could only return spell
> checked results if there are too few results (or the top score is below
> some threshold)

Score thresholds are tricky in Lucene since scores across different
queries aren't that meaningful.
But a threshold on the number of results sounds like it might be a good idea.

Perhaps there could even be options to
- test if the suggestion actually matches any documents
- replace the original query with the suggestion before running the query
- add an additional DocList to the response for documents matching the
suggestion
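The three options above could be sketched roughly as follows. This is a toy illustration in plain Python, not Solr code: the in-memory DOCS table, the search() helper, the SpellOptions flags and the response keys are all hypothetical names made up for the example.

```python
from dataclasses import dataclass

# Toy stand-in for an index: document id -> text. Not Solr's API.
DOCS = {
    1: "solr search server",
    2: "lucene search library",
    3: "spelling suggestions for search",
}


def search(query):
    """Return ids of documents containing every query term (hypothetical helper)."""
    terms = query.lower().split()
    return sorted(
        doc_id
        for doc_id, text in DOCS.items()
        if all(term in text.split() for term in terms)
    )


@dataclass
class SpellOptions:
    verify_suggestion: bool = True  # test that the suggestion matches documents
    replace_query: bool = False     # run the suggestion instead of the original
    extra_doclist: bool = False     # also return documents matching the suggestion


def handle(query, suggestion, opts):
    """Build a response dict for one query and one candidate suggestion."""
    response = {}
    if opts.verify_suggestion and not search(suggestion):
        suggestion = None  # drop suggestions that match nothing
    if suggestion and opts.replace_query:
        response["docs"] = search(suggestion)
        response["queryUsed"] = suggestion
    else:
        response["docs"] = search(query)
        response["queryUsed"] = query
    if suggestion:
        response["suggestion"] = suggestion
        if opts.extra_doclist and not opts.replace_query:
            # Second document list, kept separate from the main results.
            response["suggestionDocs"] = search(suggestion)
    return response
```

Calling handle("lucine search", "lucene search", SpellOptions(extra_doclist=True)) keeps the empty main result and adds a separate list for the suggestion, while replace_query=True swaps the query before it runs.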


Thinking a little more on the threshold idea, it seems to have some issues.

One problem:
  In general, you want spell suggestions to be corpus-wide, so you
might be under a threshold just because the query is heavily filtered
(restrictive fqs), and the suggestion may not match anything under
those restrictions.  Getting the DocSet of the query only to check the
number of hits adds expense to the request.

But
- if not sorting by score, the cache would re-use the query DocSet
instead of going to the Lucene index
- one could add a call to Solr to retrieve the number of hits in the
base query, before filtering (but that could limit or complicate
future optimizations that move some of the filters into the base
query...)
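To make the filtering problem concrete, here is a small sketch in plain Python. It is not Solr code: the DOCS list, the single "cat" filter standing in for an fq, and the function names are assumptions for illustration only. It compares a threshold test on the filtered hit count with one on the base-query hit count.

```python
# Toy corpus: each document has text and a category used for filtering.
DOCS = [
    {"id": 1, "text": "running shoes", "cat": "mens"},
    {"id": 2, "text": "running shoes", "cat": "womens"},
    {"id": 3, "text": "running shorts", "cat": "womens"},
    {"id": 4, "text": "hiking boots", "cat": "kids"},
]


def base_hits(query):
    """Ids matching the query alone, before any filter is applied."""
    terms = query.split()
    return {d["id"] for d in DOCS if all(t in d["text"].split() for t in terms)}


def filtered_hits(query, cat):
    """Ids matching the query and a single category filter (like one fq)."""
    allowed = {d["id"] for d in DOCS if d["cat"] == cat}
    return base_hits(query) & allowed


def should_spellcheck(query, cat, threshold, use_base_count):
    """Decide whether to ask for suggestions, given a hit-count threshold."""
    if use_base_count:
        count = len(base_hits(query))
    else:
        count = len(filtered_hits(query, cat))
    return count < threshold
```

With this data, "running shoes" filtered to the "kids" category has no hits, so a threshold on the filtered count would trigger spellchecking for a correctly spelled query. The base-query count does not trigger it, at the cost of computing that count separately.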

Another issue is how big the spelling index is. If it's big enough,
best practice might be to have a separate spelling index that the
front-end client hits concurrently with the main index.  This also
sort of applies to distributed search (one may want a single separate
spelling index that isn't distributed).
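A client that hits both indexes concurrently might look something like the sketch below. The two query functions are stubs standing in for real requests to the main index and to a separate spelling index. Their names, return shapes and canned data are assumptions, not an actual Solr client API.

```python
from concurrent.futures import ThreadPoolExecutor


def query_main_index(q):
    """Stub for a request to the main search index."""
    if q == "lucine":
        return {"numFound": 0, "docs": []}
    return {"numFound": 1, "docs": [q]}


def query_spelling_index(q):
    """Stub for a request to a separate, non-distributed spelling index."""
    known = {"lucine": "lucene"}
    return known.get(q)  # None when there is no suggestion


def search_with_suggestion(q):
    """Issue both requests at once and merge them on the client side."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        main = pool.submit(query_main_index, q)
        spell = pool.submit(query_spelling_index, q)
        return {"results": main.result(), "suggestion": spell.result()}
```

Because the two requests run in parallel, the total latency is roughly that of the slower one, and the spelling index can be sized and hosted independently of the main index.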

-Yonik

Re: Integrated Spellchecking

Posted by Doug Steigerwald <ds...@mcclatchyinteractive.com>.
Allocating some time to this next week.  Need to try and remember what issues I was having when I 
stopped working on it.

doug

Matthew Runo wrote:
> I'd have to agree with this. I'd probably be able to put a bit of work 
> into it as well, as it's something we'd use for sure if it were available.
> 
> Thanks!

Re: Integrated Spellchecking

Posted by Matthew Runo <mr...@zappos.com>.
I'd have to agree with this. I'd probably be able to put a bit of work  
into it as well, as it's something we'd use for sure if it were  
available.

Thanks!

Matthew Runo
Software Developer
Zappos.com
702.943.7833

On Feb 18, 2008, at 6:09 AM, Grant Ingersoll wrote:

> Hey Doug,
>
> If you have permission to donate, perhaps you can just post the  
> patch anyway and state that it isn't quite ready to go.  This is  
> something I could use too, and so may have some cycles to work on  
> it.  I hate to replicate the work if you already have something that  
> is more or less working.  A half baked patch is better than no patch.
>
> -Grant

Re: Integrated Spellchecking

Posted by Grant Ingersoll <gs...@apache.org>.
Hey Doug,

If you have permission to donate, perhaps you can just post the patch  
anyway and state that it isn't quite ready to go.  This is something I  
could use too, and so may have some cycles to work on it.  I hate to  
replicate the work if you already have something that is more or less  
working.  A half baked patch is better than no patch.

-Grant


On Feb 15, 2008, at 12:45 PM, Doug Steigerwald wrote:

> That unfortunately got pushed aside to work on some of our higher  
> priority solr work since we already had it working one way.
>
> Hoping to revisit this after we push to production and start working  
> on new features and share what I've done for this and multicore/ 
> spellcheck replication (which we have working quite well in QA right  
> now).

Re: Integrated Spellchecking

Posted by Doug Steigerwald <ds...@mcclatchyinteractive.com>.
That unfortunately got pushed aside to work on some of our higher priority Solr work since we 
already had it working one way.

Hoping to revisit this after we push to production and start working on new features and share what 
I've done for this and multicore/spellcheck replication (which we have working quite well in QA 
right now).

Doug Steigerwald
Software Developer
McClatchy Interactive
dsteigerwald@mcclatchyinteractive.com
919.861.1287


oleg_gnatovskiy wrote:
> So have you succeeded in implementing this patch? I'd definitely like to use
> this functionality as a search suggestion.

Re: Integrated Spellchecking

Posted by oleg_gnatovskiy <ol...@citysearch.com>.


dsteiger wrote:
> 
> I've got a couple search components for automatic spell correction that
> I've been working on.
> 
> I've converted most of the SpellCheckerRequestHandler to a search
> component (hopefully will throw a 
> patch out soon for this).  Then another search component that will do auto
> correction for a query if 
> the search returns zero results.
> 
> We're hoping to see some performance improvements out of handling this in
> Solr instead of our Rails 
> service.
> 
> doug

So have you succeeded in implementing this patch? I'd definitely like to use
this functionality as a search suggestion.


Re: Integrated Spellchecking

Posted by Grant Ingersoll <gs...@apache.org>.
On Jan 17, 2008, at 3:01 PM, Doug Steigerwald wrote:

> I've got a couple search components for automatic spell correction  
> that I've been working on.
>
> I've converted most of the SpellCheckerRequestHandler to a search  
> component (hopefully will throw a patch out soon for this).  Then  
> another search component that will do auto correction for a query if  
> the search returns zero results.

If you need somebody to test, throw it up on a JIRA, as I would be  
happy to test.

-Grant

Re: Integrated Spellchecking

Posted by Doug Steigerwald <ds...@mcclatchyinteractive.com>.
I've got a couple search components for automatic spell correction that I've been working on.

I've converted most of the SpellCheckerRequestHandler to a search component (hopefully will throw a 
patch out soon for this).  Then another search component that will do auto correction for a query if 
the search returns zero results.
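As a rough illustration of the idea (not the actual patch, which is not shown in this thread), a chain of components with auto-correction on zero results could be modelled like this in plain Python. The INDEX and CORRECTIONS tables, the component functions and the context keys are all hypothetical.

```python
INDEX = {"lucene": [2, 5], "solr": [1]}             # term -> matching doc ids (toy data)
CORRECTIONS = {"lucine": "lucene", "sorl": "solr"}  # toy spelling dictionary


def query_component(ctx):
    """Run the current query and store the matching doc ids."""
    ctx["docs"] = INDEX.get(ctx["q"], [])


def spellcheck_component(ctx):
    """Attach a suggestion for the query, if the dictionary has one."""
    suggestion = CORRECTIONS.get(ctx["q"])
    if suggestion:
        ctx["suggestion"] = suggestion


def autocorrect_component(ctx):
    """Re-run the search with the suggestion, but only on zero results."""
    if not ctx["docs"] and "suggestion" in ctx:
        ctx["correctedFrom"] = ctx["q"]
        ctx["q"] = ctx["suggestion"]
        query_component(ctx)


# Order matters: the auto-correct step needs the results and the suggestion.
CHAIN = [query_component, spellcheck_component, autocorrect_component]


def handle(q):
    """Pass one request context through every component in order."""
    ctx = {"q": q}
    for component in CHAIN:
        component(ctx)
    return ctx
```

Here handle("lucine") ends up with the documents for "lucene" and records the original query under correctedFrom, while a query that already matches passes through untouched.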

We're hoping to see some performance improvements out of handling this in Solr instead of our Rails 
service.

doug


Ryan McKinley wrote:
> Yes -- this is what search components are for!
> 
> Depending on where you put it in the chain, it could only return spell 
> checked results if there are too few results (or the top score is below 
> some threshold)
> 
> ryan

Re: Integrated Spellchecking

Posted by Ryan McKinley <ry...@gmail.com>.
Yes -- this is what search components are for!

Depending on where you put it in the chain, it could only return spell 
checked results if there are too few results (or the top score is below 
some threshold)

ryan


Grant Ingersoll wrote:
> Is it feasible to submit a query to any of the various handlers and have 
> it bring back results and spelling suggestions all in one response?  Is 
> this something the query components piece would handle, assuming one 
> exists for the spell checker?
> 
> Thanks,
> Grant
>