Posted to solr-user@lucene.apache.org by Developer <bb...@gmail.com> on 2014/02/11 22:01:55 UTC

Solr Autosuggest - Strange issue with leading numbers in query

I have a strange issue with Autosuggest.

Whenever I query for a keyword with leading numbers, it returns the
suggestion corresponding to the alphabets (ignoring the numbers). I was
under the assumption that it would return an empty result. I am not sure
what I am doing wrong. Can someone help?

*Query:*
/autocomplete?qt=/lucid&req_type=auto_complete&spellcheck.maxCollations=10&q="12342343243242ga"&spellcheck.count=10

*Result:*

<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">1</int>
</lst>
<lst name="spellcheck">
<lst name="suggestions">
<lst name="ga">
<int name="numFound">1</int>
<int name="startOffset">15</int>
<int name="endOffset">17</int>
<arr name="suggestion">
<str>galaxy</str>
</arr>
</lst>
<str name="collation">"12342343243242galaxy"</str>
</lst>
</lst>
</response>


*My field configuration is as below:*
    <fieldType class="solr.TextField" name="textSpell_word" positionIncrementGap="100">
      <analyzer>
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.StopFilterFactory" enablePositionIncrements="true"
                ignoreCase="true" words="stopwords_autosuggest.txt"/>
      </analyzer>
    </fieldType>

*SolrConfig.xml*	

    <searchComponent class="solr.SpellCheckComponent" name="autocomplete">
      <lst name="spellchecker">
        <str name="name">autocomplete</str>
        <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
        <str name="lookupImpl">org.apache.solr.spelling.suggest.tst.TSTLookup</str>
        <str name="field">autocomplete_word</str>
        <str name="storeDir">autocomplete</str>
        <str name="buildOnCommit">true</str>
        <float name="threshold">.005</float>
      </lst>
    </searchComponent>
    <requestHandler class="org.apache.solr.handler.component.SearchHandler"
                    name="/autocomplete">
      <lst name="defaults">
        <str name="spellcheck">true</str>
        <str name="spellcheck.dictionary">autocomplete</str>
        <str name="spellcheck.collate">true</str>
        <str name="spellcheck.count">10</str>
        <str name="spellcheck.onlyMorePopular">false</str>
      </lst>
      <arr name="components">
        <str>autocomplete</str>
      </arr>
    </requestHandler>



--
View this message in context: http://lucene.472066.n3.nabble.com/Solr-Autosuggest-Strange-issue-with-leading-numbers-in-query-tp4116751.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Solr Autosuggest - Strange issue with leading numbers in query

Posted by bbi123 <bb...@gmail.com>.
I tried almost every possible combination, but in vain.

Does anyone know if there is any logic built into the Suggester component to
ignore leading numbers?

/autocomplete?qt=/lucid&req_type=auto_complete&spellcheck.collate=false&q=34g

<response>
<lst name="spellcheck">
<lst name="suggestions">
<lst name="g">
<int name="numFound">1</int>
<int name="startOffset">2</int>
<int name="endOffset">3</int>
<arr name="suggestion">
<str>galaxy</str>
</arr>
</lst>
</lst>
</lst>
</response>


/autocomplete?qt=/lucid&req_type=auto_complete&spellcheck.collate=false&q=11123423432423243ip

<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">0</int>
</lst>
<lst name="spellcheck">
<lst name="suggestions">
<lst name="ip">
<int name="numFound">2</int>
*<int name="startOffset">17</int>*
<int name="endOffset">19</int>
<arr name="suggestion">
<str>ipad</str>
<str>iphone</str>
</arr>
</lst>
</lst>
</lst>
</response>
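
A quick way to check how the reported offsets relate to the query strings: the sketch below is not Solr code, just plain Python that locates the last run of ASCII letters in each query from this thread and reports its start and end positions. The helper name `trailing_token_offsets` is made up for illustration. The positions it finds match the startOffset/endOffset values in the responses, which suggests the offsets point at the letters-only token and that the leading digits were not part of the token handed to the lookup.

```python
import re


def trailing_token_offsets(q):
    """Return (token, start, end) for the last run of ASCII letters in q, or None."""
    matches = list(re.finditer(r"[A-Za-z]+", q))
    if not matches:
        return None
    last = matches[-1]
    return last.group(), last.start(), last.end()


# The three q values from this thread (the first one includes its double quotes).
for q in ['"12342343243242ga"', "34g", "11123423432423243ip"]:
    print(q, trailing_token_offsets(q))
# "12342343243242ga" ('ga', 15, 17)
# 34g ('g', 2, 3)
# 11123423432423243ip ('ip', 17, 19)
```

This only shows that the numbers line up; it says nothing about which part of Solr drops the digits.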




Re: Solr Autosuggest - Strange issue with leading numbers in query

Posted by Jason Hellman <jh...@innoventsolutions.com>.
Here’s a rather obvious question:  have you rebuilt your spell index recently?  Is it possible the offending numbers snuck into the spell dictionary?  The terms component will show you what’s in your current, searchable field…but not the dictionary.

If memory serves, collate=true allows this behavior, especially with onlyMorePopular set to false (setting it to true would ensure the resulting collation has a higher hit count than the current query).  Have you flipped onlyMorePopular to true to confirm?
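
For anyone wanting to try both suggestions in one request, here is a minimal sketch that builds the URL. The base URL and the helper name `suggest_url` are assumptions for illustration; `/autocomplete` is the handler from the original post, and `spellcheck.build=true` is the SpellCheckComponent parameter that asks for the dictionary to be rebuilt.

```python
from urllib.parse import urlencode


def suggest_url(base, q, only_more_popular=True, rebuild=False):
    """Build an /autocomplete request URL; base is a hypothetical Solr core URL."""
    params = {
        "q": q,
        "spellcheck": "true",
        "spellcheck.collate": "true",
        "spellcheck.onlyMorePopular": str(only_more_popular).lower(),
    }
    if rebuild:
        # Ask the spellcheck component to rebuild its dictionary on this request.
        params["spellcheck.build"] = "true"
    return base + "/autocomplete?" + urlencode(params)


# Hypothetical host and core name; adjust to the actual deployment.
print(suggest_url("http://localhost:8983/solr/collection1",
                  "12342343243242ga", only_more_popular=True, rebuild=True))
```

Comparing the response with onlyMorePopular true and false on the same query should show whether that flag plays any role here.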




On Feb 18, 2014, at 10:16 AM, bbi123 <bb...@gmail.com> wrote:

> Thanks a lot for your response, Erick.
> 
> I was trying to find out whether I have any suggestions starting with numbers
> using the terms component, but I couldn't find any. It's very strange!
> 
> Anyways, thanks again for your response.
> 
> 
> 


Re: Solr Autosuggest - Strange issue with leading numbers in query

Posted by bbi123 <bb...@gmail.com>.
Thanks a lot for your response, Erick.

I was trying to find out whether I have any suggestions starting with numbers
using the terms component, but I couldn't find any. It's very strange!

Anyways, thanks again for your response.




Re: Solr Autosuggest - Strange issue with leading numbers in query

Posted by Erick Erickson <er...@gmail.com>.
Ah, OK, I thought you were indexing things like 123412335ga, but not so.

Afraid I'm fresh out of ideas..... Although I might try using TermsComponent
to examine the index and see if, somehow, there _are_ terms with leading
numbers in the output.
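
A sketch of what such a TermsComponent check could look like, assuming a `/terms` request handler is configured (the thread does not show one) and using a hypothetical base URL and helper name. `terms.fl`, `terms.regex` and `terms.limit` are TermsComponent parameters; the regex is meant to select terms that begin with a digit.

```python
from urllib.parse import urlencode


def leading_digit_terms_url(base, field, limit=20):
    """Build a TermsComponent URL listing terms in `field` that start with a digit."""
    params = {
        "terms": "true",
        "terms.fl": field,
        "terms.regex": "[0-9].*",  # intended to match whole terms with a leading digit
        "terms.limit": str(limit),
    }
    return base + "/terms?" + urlencode(params)


# Hypothetical host and core; autocomplete_word is the field from the original config.
print(leading_digit_terms_url("http://localhost:8983/solr/collection1",
                              "autocomplete_word"))
```

An empty term list for that field would mean the index itself holds no digit-prefixed terms, though it would not show what the suggester's own lookup structure contains.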

It's also possible that numbers are stripped when building the FST that
is used, but I don't know one way or the other.

Best,
Erick


On Mon, Feb 17, 2014 at 11:30 AM, Developer <bb...@gmail.com> wrote:

> Hi Erick,
>
> Thanks a lot for your reply.
>
> I expect it to return zero suggestions, since the suggested keyword doesn't
> actually start with numbers.
>
> Expected results
> Searching for ga -> returns galaxy
> Searching for gal -> returns galaxy
> Searching for 12321312321312ga -> should not return any suggestion, since
> no such keyword (combination) exists in the index.
>
> Thanks
>
>
>
>

Re: Solr Autosuggest - Strange issue with leading numbers in query

Posted by Developer <bb...@gmail.com>.
Hi Erick,

Thanks a lot for your reply.

I expect it to return zero suggestions, since the suggested keyword doesn't
actually start with numbers.

Expected results
Searching for ga -> returns galaxy
Searching for gal -> returns galaxy
Searching for 12321312321312ga -> should not return any suggestion, since
no such keyword (combination) exists in the index.
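
The expected behaviour listed above can be written down as a strict prefix lookup over the whole query string. This is only a sketch of the expectation, not of how Solr's Suggester works; the function name `strict_prefix_suggest` is hypothetical and the term list is just the three suggestions seen in this thread.

```python
def strict_prefix_suggest(query, terms):
    """Return terms that start with the entire query string (case-insensitive)."""
    prefix = query.lower()
    return sorted(t for t in terms if t.lower().startswith(prefix))


# Hypothetical dictionary built from the suggestions that appear in this thread.
TERMS = ["galaxy", "ipad", "iphone"]

print(strict_prefix_suggest("ga", TERMS))                # ['galaxy']
print(strict_prefix_suggest("gal", TERMS))               # ['galaxy']
print(strict_prefix_suggest("12321312321312ga", TERMS))  # []
```

Under this reading the digits are part of the prefix, so no indexed term can match.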

Thanks





Re: Solr Autosuggest - Strange issue with leading numbers in query

Posted by Erick Erickson <er...@gmail.com>.
Hmmm, the example you posted seems correct to me; the returned
suggestion is really close to the term. What are you expecting here?

The example is inconsistent with
"it returns the suggestion corresponding to the alphabets (ignoring the
numbers)"

It looks like it's considering the numbers just fine, which is what makes
the returned suggestion close to the term I think.

Best,
Erick

