Posted to solr-user@lucene.apache.org by Tom Weber <to...@rtl.lu> on 2006/09/11 10:34:06 UTC

Strange Sorting results on a Text Field

Hello,

   I am getting a strange response from a query with sorting.

   I sort on a field defined as:

   <field name="testfield" type="text" indexed="true" stored="true" multiValued="true"/>

   This field mostly holds 32-byte md5 sums, usually a single entry per
document but sometimes up to 5.

   When I run a search like this:

   "+testfield:(fde34c51739462d9486140601dcfb7bf 63af20144c2cbae1ec4dc0bc2e9d2c2f 3cf8e32bf2b9384447d52318a72fd4b1) ;testfield asc"

   I get the following results:
	<arr name="testfield"><str>c10c9bf4ef3f1bc30aedf83b96a9ce16</str></arr>
	<arr name="testfield"><str>c10c9bf4ef3f1bc30aedf83b96a9ce16</str></arr>
	<arr name="testfield"><str>c10c9bf4ef3f1bc30aedf83b96a9ce16</str></arr>
	<arr name="testfield"><str>c10c9bf4ef3f1bc30aedf83b96a9ce16</str></arr>
	<arr name="testfield"><str>c10c9bf4ef3f1bc30aedf83b96a9ce16</str></arr>
	<arr name="testfield"><str>4302516b91b743a8972120f52d309a72</str></arr>
	<arr name="testfield"><str>c10c9bf4ef3f1bc30aedf83b96a9ce16</str></arr>
	<arr name="testfield"><str>c10c9bf4ef3f1bc30aedf83b96a9ce16</str></arr>
	<arr name="testfield"><str>c10c9bf4ef3f1bc30aedf83b96a9ce16</str></arr>
	<arr name="testfield"><str>c10c9bf4ef3f1bc30aedf83b96a9ce16</str></arr>

   I have no idea why the entry at position 6 shows up in these
results, because the XML entries are correct too.

   Any idea where I should look for the error?

   Also, does anybody have a link where the benefits of "multiValued"
are explained?

   Thanks,

   Tom



Re: Strange Sorting results on a Text Field

Posted by Yonik Seeley <yo...@apache.org>.
On 9/11/06, Tom Weber <to...@rtl.lu> wrote:
>    Thanks also for the "multiValued" explanation, this is useful for
> my current application. But then, if I use this field and I ask for
> sorting, how will the sorting be done, alphanumeric on the first
> entry for this field? Until now, I entered more than one entry by
> separating them with a space in the same field, like <field
> name="test">text1 text2 text3</field>.

Sorting is currently only supported when there is at most one value
(or token) per document.  This is a Lucene restriction.
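
One possible workaround, sketched under assumptions: keep the
multiValued field for searching and send one extra single-valued field
that is used only for sorting. The field name "testfield_sort" is
hypothetical; it would have to be declared in schema.xml as a
non-tokenized, single-valued field, and the client would decide which
of the values goes into it. The md5 values are taken from the query
earlier in this thread.

```xml
<add><doc>
  <!-- multiValued field used for searching: several values per document -->
  <field name="testfield">fde34c51739462d9486140601dcfb7bf</field>
  <field name="testfield">63af20144c2cbae1ec4dc0bc2e9d2c2f</field>
  <!-- hypothetical single-valued field used only for sorting: exactly one value -->
  <field name="testfield_sort">63af20144c2cbae1ec4dc0bc2e9d2c2f</field>
</doc></add>
```

The query would then search on testfield and sort with
";testfield_sort asc", so the sort only ever sees one value per
document.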

-Yonik

Re: Strange Sorting results on a Text Field

Posted by Tom Weber <to...@rtl.lu>.
Hello Yonik,

   You are right about the string type. When I turned on debugging a
few minutes ago, I saw that it splits the md5 sum into several parts,
each time a number follows a letter or the other way round.
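
This kind of splitting is consistent with a word-delimiter filter in
the analyzer chain of the "text" type. As a sketch only, assuming a
definition similar to the one shipped in the Solr example schema (the
schema.xml actually in use was not posted in this thread, so the filter
list and attribute values here are assumptions):

```xml
<fieldtype name="text" class="solr.TextField">
  <analyzer>
    <!-- split the input on whitespace first -->
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- splits each token at letter/digit transitions,
         so c10c9bf4 becomes c, 10, c, 9, bf, 4 -->
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldtype>
```

With a chain like this, a single md5 sum ends up as many short tokens
in the index, so the field no longer holds one term per value.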

   Thanks also for the "multiValued" explanation; it is useful for my
current application. But if I use this field and ask for sorting, how
will the sorting be done: alphanumerically on the first entry for this
field? Until now, I have stored more than one entry by separating them
with spaces in the same field, like
<field name="test">text1 text2 text3</field>.

   Thanks,

   tom




Re: Strange Sorting results on a Text Field

Posted by Yonik Seeley <yo...@apache.org>.
On 9/11/06, Tom Weber <to...@rtl.lu> wrote:
> Hello,
>
>    I am getting a strange response from a query with sorting.
>
>    I sort on a field defined as:
>
>    <field name="testfield" type="text" indexed="true" stored="true" multiValued="true"/>

I think you probably want type="string" instead.  Text fields go
through text analysis (stemming, lowercasing, word splitting, etc.) and
aren't meant for exact matching or sorting.
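
As a minimal sketch of that change, assuming schema.xml declares a
"string" type backed by solr.StrField as the Solr example schema does
(the sortMissingLast setting is an illustrative choice, not something
taken from the poster's schema):

```xml
<!-- non-tokenized type: the whole value is indexed as a single term -->
<fieldtype name="string" class="solr.StrField" sortMissingLast="true"/>

<!-- the field from the question, switched from type="text" to type="string" -->
<field name="testfield" type="string" indexed="true" stored="true" multiValued="true"/>
```

Documents would need to be re-indexed after the type change. Sorting
on this field is still only dependable for documents that hold a single
value in it.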

>    This field mostly holds 32-byte md5 sums, usually a single entry per
> document but sometimes up to 5.
>
>    When I run a search like this:
>
>    "+testfield:(fde34c51739462d9486140601dcfb7bf 63af20144c2cbae1ec4dc0bc2e9d2c2f 3cf8e32bf2b9384447d52318a72fd4b1) ;testfield asc"
>
>    I get the following results:
>         <arr name="testfield"><str>c10c9bf4ef3f1bc30aedf83b96a9ce16</str></arr>
>         <arr name="testfield"><str>c10c9bf4ef3f1bc30aedf83b96a9ce16</str></arr>
>         <arr name="testfield"><str>c10c9bf4ef3f1bc30aedf83b96a9ce16</str></arr>
>         <arr name="testfield"><str>c10c9bf4ef3f1bc30aedf83b96a9ce16</str></arr>
>         <arr name="testfield"><str>c10c9bf4ef3f1bc30aedf83b96a9ce16</str></arr>
>         <arr name="testfield"><str>4302516b91b743a8972120f52d309a72</str></arr>
>         <arr name="testfield"><str>c10c9bf4ef3f1bc30aedf83b96a9ce16</str></arr>
>         <arr name="testfield"><str>c10c9bf4ef3f1bc30aedf83b96a9ce16</str></arr>
>         <arr name="testfield"><str>c10c9bf4ef3f1bc30aedf83b96a9ce16</str></arr>
>         <arr name="testfield"><str>c10c9bf4ef3f1bc30aedf83b96a9ce16</str></arr>
>
>    I have no idea why the entry at position 6 shows up in these
> results, because the XML entries are correct too.
>
>    Any idea where I should look for the error?
>
>    Also, does anybody have a link where the benefits of "multiValued"
> are explained?

You can have multiple values for the field in a single document if
it's marked as multiValued:

<add><doc>
  <field name="f1">first val</field>
  <field name="f1">second val</field>
</doc></add>


-Yonik