Posted to solr-user@lucene.apache.org by "Sophie M." <so...@beezik.com> on 2010/06/23 14:56:39 UTC

Alphabetic range

Hello all,

I have been trying for several days to build an alphabetical range. I will
explain all the steps (I have the book Solr 1.4 Enterprise Search Server by
Smiley and Pugh).

I want to get all artists whose names begin with a given pair of letters. If
I request "mi", I want the response to contain "michael jackson" and every
other artist whose name begins with "mi".

I defined a field type similar to Smiley and Pugh's example on p. 148:

<fieldType name="bucketFirstTwoLetters" class="solr.TextField"
           sortMissingLast="true" omitNorms="true">
    <analyser type="index">
        <!-- the first two letters -->
        <tokenizer class="solr.PatternTokenizerFactory"
                   pattern="^([a-zA-Z])([a-zA-Z]).*" group="2"/>
    </analyser>
    <analyser type="query">
        <tokenizer class="solr.KeywordTokenizerFactory"/>
    </analyser>
</fieldType>
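
A quick way to see what this pattern captures, outside Solr: the sketch below applies the same regular expression with Python's `re` module, which should behave like `java.util.regex` for a pattern this simple. It assumes that `group="2"` makes the tokenizer emit the text of the second capture group. The helper names and the single-group alternative pattern are illustrative only and do not come from the book.

```python
import re

# Same regular expression as the index-time tokenizer in the field type above.
PATTERN = re.compile(r"^([a-zA-Z])([a-zA-Z]).*")

# Hypothetical alternative: a single group spanning both leading letters.
TWO_LETTERS = re.compile(r"^([a-zA-Z]{2}).*")


def group_token(name, group):
    """Return the text captured by `group`, or None when the name does not match."""
    match = PATTERN.match(name)
    return match.group(group) if match else None


def first_two_letters(name):
    """Return the two leading letters using the single-group alternative pattern."""
    match = TWO_LETTERS.match(name)
    return match.group(1) if match else None


for name in ["michael jackson", "mimika", "Yu"]:
    # Group 2 holds only the second letter; the alternative holds both letters.
    print(name, "->", group_token(name, 2), "|", first_two_letters(name))
```

If that assumption holds, group 2 would yield only the second letter of each name, so a pattern with one group covering both letters (used with `group="1"`) may be closer to the intent.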
	
I defined the field ArtistSort like this:

<field name="ArtistSort" type="bucketFirstTwoLetters" stored="true" multivalued="false"/>

For the request:

http://localhost:8983/solr/music/select?indent=on&q=yu&qt=standard&wt=standard&facet=on&facet.field=ArtistSort&facetsort=lex&facet.missing=on&facet.method=enum&fl=ArtistSort

I get:

http://lucene.472066.n3.nabble.com/file/n916716/select.xml

I don't understand why the pattern doesn't match exactly. For example, "An An
Yu" matches, but I only want artists whose names begin with "yu". I also know
that an artist named ReYu would match, because ReYu would be interpreted as
Re Yu (two words).

I also tried another type of query, like:

http://localhost:8983/solr/music/select?indent=on&version=2.2&q=ArtistSort:mi*&fq=&start=0&rows=10&fl=ArtistSort&qt=standard&wt=standard&explainOther=&hl.fl=

I get exactly what I want. I made several tries, and I get only artist names
that begin with the right first two letters.

But I get very few responses, see here:

<result name="response" numFound="6" start="0">
<doc>
<str name="ArtistSort">mike manne and tiger blues</str>
</doc>
<doc>
<str name="ArtistSort">mimika</str>
</doc>
<doc>
<str name="ArtistSort">miduno</str>
</doc>
<doc>
<str name="ArtistSort">milue macïro</str>
</doc>
<doc>
<str name="ArtistSort">mister pringle</str>
</doc>
<doc>
<str name="ArtistSort">mimmai</str>
</doc>
</result>


There are more than 80,000 artists in my index... I really don't understand
why I can't get more responses. I have been thinking about this problem for
days and days, and now my brain is freezing.

Thank you in advance.

Sophie
-- 
View this message in context: http://lucene.472066.n3.nabble.com/Alphabetic-range-tp916716p916716.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Alphabetic range

Posted by "Sophie M." <so...@beezik.com>.
Hello Otis,

this morning, instead of

http://localhost:8983/solr/music/select?indent=on&version=2.2&q=ArtistSort:mi*&fq=&start=0&rows=10&fl=ArtistSort&qt=standard&wt=standard&explainOther=&hl.fl=

I tried:

http://localhost:8983/solr/music/select?indent=on&version=2.2&q=ArtistSort:Mi*&fq=&start=0&rows=10&fl=ArtistSort&qt=standard&wt=standard&explainOther=&hl.fl=

and I get all the missing artists :) So all is well. Thank you for your
advice, because I still have problems with accents and the Analysis page will
surely help me.
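
One possible reading of the mi* versus Mi* difference, assuming the names are indexed with their original capitalisation and the wildcard term is compared exactly as typed: the match is a case-sensitive prefix test. The sketch below illustrates that idea in plain Python. The sample list is hypothetical (only the lowercase names appear in this thread), and the function names are made up for illustration.

```python
# Hypothetical sample of indexed names; only the lowercase ones come from the thread.
INDEXED_NAMES = ["Michael Jackson", "Mika", "mimika", "miduno", "mister pringle"]


def prefix_match(names, prefix):
    """Case-sensitive prefix test: the prefix is compared exactly as typed."""
    return [name for name in names if name.startswith(prefix)]


def prefix_match_folded(names, prefix):
    """Case-insensitive variant: lowercase both sides before comparing."""
    folded = prefix.lower()
    return [name for name in names if name.lower().startswith(folded)]


print(prefix_match(INDEXED_NAMES, "mi"))         # only names stored in lowercase
print(prefix_match(INDEXED_NAMES, "Mi"))         # only names stored capitalised
print(prefix_match_folded(INDEXED_NAMES, "mi"))  # both kinds of names
```

Under that assumption, lowercasing the names at index time and the prefix before querying would let a single query cover both spellings.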

Sophie

Re: Alphabetic range

Posted by Otis Gospodnetic <ot...@yahoo.com>.
Sophie,

Go to your Solr Admin page, look for the Analysis page link, go there, enter some artist names, enter the query, check the verbose checkboxes, and submit.  This will tell you what is going on with your analysis at index and at search time.
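
To give a rough idea of what such a stage-by-stage view shows, here is a small Python imitation of an analysis chain. It is only a sketch: the two stages (lowercasing and accent folding) and the chain names are assumptions chosen for illustration, and the real chains are whatever schema.xml declares for the field.

```python
import unicodedata


def lowercase(token):
    """Lowercase the whole token."""
    return token.lower()


def fold_accents(token):
    """Drop combining marks after NFD decomposition, so an accented letter loses its accent."""
    decomposed = unicodedata.normalize("NFD", token)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def analyze(text, stages):
    """Run `text` through each stage in order and record every intermediate token."""
    trace = [("input", text)]
    token = text
    for stage in stages:
        token = stage(token)
        trace.append((stage.__name__, token))
    return trace


# Hypothetical chains, not taken from any real schema.xml.
INDEX_CHAIN = [lowercase, fold_accents]
QUERY_CHAIN = [lowercase, fold_accents]

for label, text, chain in [("index", "Milue Macïro", INDEX_CHAIN),
                           ("query", "MI", QUERY_CHAIN)]:
    for stage_name, token in analyze(text, chain):
        print(f"{label:5} {stage_name:12} {token}")
```

Comparing the last token of the index trace with the last token of the query trace is the kind of check the verbose output makes easy: a query can only match when the two sides end up in the same form.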
 Otis
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/


