Posted to solr-user@lucene.apache.org by dabboo <ag...@sapient.com> on 2009/03/31 10:20:21 UTC

RE: Not getting the proper result.

Did you try re-creating your indexes after modifying schema.xml?

The way Solr search is, whatever string you have 

Radha C. wrote:
> 
> 
> Thanks Grant,
> 
> I used the analysis page in the example. The StandardTokenizerFactory does
> not split on dots; it passes L.I.C as is to solr.StandardFilterFactory,
> and that filter also did not split or remove the dots. I read on the wiki
> page ( http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters ) that
> StandardFilterFactory removes dots and produces LIC, but it is not working
> as described there. Am I missing anything here? Please tell me what is
> wrong with my schema.xml below.
>  
> 
> -----Original Message-----
> From: Grant Ingersoll [mailto:gsingers@apache.org] 
> Sent: Monday, March 30, 2009 7:46 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Not getting the proper result. 
> 
> The StandardTokenizer splits on punctuation, so L.I.C. is likely becoming
> 'l', 'i', 'c', while LIC -> lic.  One helpful tool is the Analysis page on
> the Solr admin:  http://localhost:8983/solr/admin/analysis.jsp
>   as it can help you figure out what is going on with analysis on both the
> query and indexing side.
> 
> HTH,
> Grant
> 
> On Mar 30, 2009, at 7:50 AM, Radha C. wrote:
> 
>> Hi,
>>
>> I have the following analyzer set up in schema.xml:
>>
>> <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
>>   <analyzer>
>>     <tokenizer class="solr.StandardTokenizerFactory"/>
>>     <filter class="solr.StandardFilterFactory"/>
>>     <filter class="solr.LowerCaseFilterFactory"/>
>>   </analyzer>
>> </fieldType>
>>
>> I am indexing a database field that contains L.I.C, and I am trying to
>> search the field as follows, but I get zero results:
>> http://localhost:8080/solr/select/?q=LIC&debugQuery=on and
>> http://localhost:8080/solr/select/?q=lic&debugQuery=on
>>
>> But it does return a result for q=L.I.C
>>
>> It does not treat L.I.C and lic as the same term. What is wrong here? Can
>> anyone help me?
>>
>> Thanks
> 
> 
> 

-- 
View this message in context: http://www.nabble.com/Not-getting-the-proper-result.-tp22781710p22800476.html
Sent from the Solr - User mailing list archive at Nabble.com.


RE: Not getting the proper result.

Posted by Chris Hostetter <ho...@fucit.org>.
StandardTokenizer is .... tricky.

it does a lot of kooky things that probably made sense when it was 
written. you'll note in your output that the "term type" is getting set to 
"HOST": StandardTokenizer has decided that L.I.C looks like a hostname, so 
it's not splitting on the periods.

: analyser output. Should I use WhitespaceTokenizerFactory with the
: StandardFilterFactory ? 

your mileage may vary ... but based on what little i know about your use 
case, WhitespaceTokenizerFactory with a WordDelimiterFilterFactory would 
probably work.
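
As a rough sketch of that suggestion, a field type along the following lines
might be a starting point. The name "text_acronym" is hypothetical, and the
WordDelimiterFilterFactory attribute values are assumptions chosen for
illustration rather than tested settings for this use case.

```xml
<!-- Hypothetical field type name; attribute values are illustrative, not tuned. -->
<fieldType name="text_acronym" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- Splits on whitespace only, so L.I.C reaches the next filter intact. -->
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- Splits on the dots (L, I, C) and, via catenateWords, also emits LIC. -->
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1"
            catenateWords="1"/>
    <!-- Lowercases every token so that LIC and lic compare equal. -->
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

After a schema change like this the data has to be reindexed, and the
Analysis page is a quick way to check what the chain actually emits for
L.I.C at index time and for LIC at query time.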

: org.apache.solr.analysis.StandardTokenizerFactory {}
: term position 	1
: term text 	L.I.C
: term type 	<HOST>
: source start,end 	0,5
: payload 	



-Hoss


RE: Not getting the proper result.

Posted by "Radha C." <cr...@ceiindia.com>.
Thanks for your reply. Yes, I restarted the servlet container and ran my
Java code to reindex, and I also tested with the analysis page.
StandardFilterFactory is not removing the dots in L.I.C. Below is the
analyzer output. Should I use WhitespaceTokenizerFactory with the
StandardFilterFactory?

The index Analyzer
-------------------------------------------------
org.apache.solr.analysis.StandardTokenizerFactory {}
term position 	1
term text 	L.I.C
term type 	<HOST>
source start,end 	0,5
payload 	
org.apache.solr.analysis.StandardFilterFactory {}
term position 	1
term text 	L.I.C
term type 	<HOST>
source start,end 	0,5
payload 	
org.apache.solr.analysis.LowerCaseFilterFactory {}
term position 	1
term text 	l.i.c
term type 	<HOST>
source start,end 	0,5
payload 	 
