Posted to solr-user@lucene.apache.org by SolrCarinthia <ta...@ymail.com> on 2012/11/08 14:18:10 UTC

Solr SpellCheck on Query Field

Is it possible to run a spellcheck on multiple fields? I am aware of using a
multivalued field for this
(http://lucene.472066.n3.nabble.com/spellcheck-on-multiple-fields-td1587327.html).

However, what I want is to return spellcheck alternatives based on the field
against which the query ran. So if I run a query against a field like
'FirstName', I want to be able to retrieve alternate query terms from the
values indexed in the 'FirstName' field only. Similarly, a search against a
field 'LastName' should return alternatives from the values indexed for this
field only. I don't think a multivalued field approach would work for me,
since it is actually an aggregation of indexed values from multiple fields.
When searching for First Name, I don't want to put forward suggestions that
are actually coming from tokens indexed from Last Name, Address City, etc.

To summarize my problem, I want to be able to choose the field against which
spellcheck alternatives should be provided at query time. Is this possible?



--
View this message in context: http://lucene.472066.n3.nabble.com/Solr-SpellCheck-on-Query-Field-tp4019036.html
Sent from the Solr - User mailing list archive at Nabble.com.

RE: Solr SpellCheck on Query Field

Posted by "Dyer, James" <Ja...@ingramcontent.com>.
What I'm saying is that if you specify "spellcheck.maxCollationTries", it will run the suggested query against the index for you and only return valid re-written queries.  That is, a misspelled firstname will be replaced with a valid firstname; a misspelled lastname will be replaced with a valid lastname, etc.  Also, all the collations returned will be guaranteed to give the user some hits (it can even tell you how many).

I also recommend using "spellcheck.alternativeTermCount" and "spellcheck.maxResultsForSuggest" in your situation.  Check the wiki entries for these parameters for more information.
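
A minimal sketch of such a request, in Python, assuming the query is written with explicit field qualifiers so that each corrected term is tested in its own field. The helper names, the sample terms and the value 5 are illustrative assumptions; "spellcheck.collateExtendedResults" is, as far as I understand it, the parameter that adds hit counts to each collation, so check the wiki before relying on it.

```python
from urllib.parse import urlencode


def field_query(terms_by_field):
    """Join field:term clauses so every term stays tied to its own field."""
    return " AND ".join(f"{field}:{term}" for field, term in terms_by_field.items())


def collation_request(terms_by_field, max_tries=5):
    """Build a query string that asks for collations verified against the index."""
    params = {
        "q": field_query(terms_by_field),
        "spellcheck": "true",
        "spellcheck.collate": "true",
        # Greater than zero, so candidate collations are run against the index.
        "spellcheck.maxCollationTries": max_tries,
        # Assumption: this switch adds a hit count to each returned collation.
        "spellcheck.collateExtendedResults": "true",
    }
    return urlencode(params)


# Hypothetical misspelled input for both fields.
request = collation_request({"FirstName": "jhon", "LastName": "smiht"})
print(request)
```

The resulting string would be appended to whatever spellcheck-enabled request handler you have configured.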

James Dyer
E-Commerce Systems
Ingram Content Group
(615) 213-4311


-----Original Message-----
From: SolrCarinthia [mailto:tandon.harsh@ymail.com] 
Sent: Friday, November 09, 2012 2:43 AM
To: solr-user@lucene.apache.org
Subject: RE: Solr SpellCheck on Query Field

Correct me if I am wrong, but wouldn't collation return alternate terms
against the master dictionary field?

So if I were to take a collated term and run a query for that term against a
specific field (say First Name), I am not guaranteed to get back results,
since that term could actually have been collated because of a token from
another field (say Last Name). To get guaranteed results as collate
promises, I would have to run the suggested query against the master dict
field. That's probably valid, since collation returns results in the context
of the master dict field, but this behavior is pretty much what I want to
avoid.



--
View this message in context: http://lucene.472066.n3.nabble.com/Solr-SpellCheck-on-Query-Field-tp4019036p4019232.html
Sent from the Solr - User mailing list archive at Nabble.com.



RE: Solr SpellCheck on Query Field

Posted by SolrCarinthia <ta...@ymail.com>.
Correct me if I am wrong, but wouldn't collation return alternate terms
against the master dictionary field?

So if I were to take a collated term and run a query for that term against a
specific field (say First Name), I am not guaranteed to get back results,
since that term could actually have been collated because of a token from
another field (say Last Name). To get guaranteed results as collate
promises, I would have to run the suggested query against the master dict
field. That's probably valid, since collation returns results in the context
of the master dict field, but this behavior is pretty much what I want to
avoid.

RE: Solr SpellCheck on Query Field

Posted by "Dyer, James" <Ja...@ingramcontent.com>.
This would be an awesome feature to have, wouldn't it?

For now, the best you can do is to create a master dictionary that contains all of the "FirstName"s and "LastName"s and use that as your dictionary's spellcheck field.  This is the <copyField> technique that you refer to in the linked post.  On its own, this won't work, because it might correct a misspelled "FirstName" with someone's "LastName" or vice versa, giving you absurd query corrections.

The workaround for this is to use "spellcheck.collate=true" and set "spellcheck.maxCollationTries" to a number greater than zero.  This will cause SpellCheckComponent to verify that the particular suggestions will actually return some hits before sending them back.  So every collation returned will represent a valid set of spelling corrections for the user's terms.
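
To make the effect concrete, here is a toy model in Python. It is not how SpellCheckComponent is implemented: the in-memory sets, the sample names and the use of difflib for fuzzy matching are all assumptions chosen for illustration. It only shows why checking each suggestion against the queried field filters out corrections that come from another field.

```python
from difflib import get_close_matches

# Hypothetical field values standing in for what is indexed.
INDEX = {
    "FirstName": {"john", "joan", "mary"},
    "LastName": {"johns", "smith"},
}

# The master dictionary mixes values from every field, like a copyField target.
MASTER = sorted(set().union(*INDEX.values()))


def suggest(term, count=5):
    """Return close matches from the master dictionary, whatever their field."""
    return get_close_matches(term, MASTER, n=count, cutoff=0.6)


def verified_suggestions(field, term, count=5):
    """Keep only the suggestions that would get a hit in the queried field."""
    return [s for s in suggest(term, count) if s in INDEX[field]]


print(suggest("jhon"))
print(verified_suggestions("FirstName", "jhon"))
```

In this toy data the unverified list can contain a last name, while the verified list for FirstName cannot.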

Another drawback to having a master dictionary is that by default, SpellCheckComponent will never suggest for words included in the dictionary.  So if somebody's misspelt FirstName happens to be in the dictionary because it is a valid LastName, SpellCheckComponent's default settings assume that it is indeed correctly spelled.  The way around this is to set "spellcheck.alternativeTermCount" to a non-zero value.  This is the number of suggestions to return for terms that are in the dictionary (you can use the same value as for "spellcheck.count", or a lower value if you want to try and tune this behavior).  You should also set "spellcheck.maxResultsForSuggest" to zero. (Use a higher value if you also want "did-you-mean"-style suggestions for low-hitcount queries.)
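
As a rough sketch, these settings could be put together as request parameters like this in Python. The function name and the default of 10 for "spellcheck.count" are assumptions for illustration; only the parameter names come from the description above.

```python
from urllib.parse import urlencode


def in_dictionary_params(count=10, alternative_term_count=None, max_results_for_suggest=0):
    """Build parameters that allow suggestions for terms already in the dictionary."""
    # Fall back to the same value as spellcheck.count when none is given.
    if alternative_term_count is None:
        alternative_term_count = count
    if alternative_term_count <= 0:
        raise ValueError("alternative_term_count must be non-zero to have any effect")
    return {
        "spellcheck": "true",
        "spellcheck.count": count,
        "spellcheck.alternativeTermCount": alternative_term_count,
        # Zero, as recommended above; raise it for "did-you-mean" on low-hitcount queries.
        "spellcheck.maxResultsForSuggest": max_results_for_suggest,
    }


params = in_dictionary_params()
print(urlencode(params))
```

Passing a lower alternative_term_count than count is the tuning knob mentioned above.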

I think these combinations will probably give you exactly what you want, at the expense of some overhead and configuration complexity.

For more information, see the wiki section beginning here:  http://wiki.apache.org/solr/SpellCheckComponent#spellcheck.count 

For an example, see the "/spell" request handler in the Solr Example:  http://svn.apache.org/repos/asf/lucene/dev/branches/branch_4x/solr/example/solr/collection1/conf/solrconfig.xml

James Dyer
E-Commerce Systems
Ingram Content Group
(615) 213-4311


-----Original Message-----
From: SolrCarinthia [mailto:tandon.harsh@ymail.com] 
Sent: Thursday, November 08, 2012 7:18 AM
To: solr-user@lucene.apache.org
Subject: Solr SpellCheck on Query Field

Is it possible to run a spellcheck on multiple fields? I am aware of using a
multivalued field for this
(http://lucene.472066.n3.nabble.com/spellcheck-on-multiple-fields-td1587327.html).

However, what I want is to return spellcheck alternatives based on the field
against which the query ran. So if I run a query against a field like
'FirstName', I want to be able to retrieve alternate query terms from the
values indexed in the 'FirstName' field only. Similarly, a search against a
field 'LastName' should return alternatives from the values indexed for this
field only. I don't think a multivalued field approach would work for me,
since it is actually an aggregation of indexed values from multiple fields.
When searching for First Name, I don't want to put forward suggestions that
are actually coming from tokens indexed from Last Name, Address City, etc.

To summarize my problem, I want to be able to choose the field against which
spellcheck alternatives should be provided at query time. Is this possible?