Posted to solr-user@lucene.apache.org by Kevin Osborn <os...@yahoo.com> on 2010/01/12 23:32:09 UTC

LongField not stripping leading zeros

This is in Solr 1.3.

I have some text in our database in the form 0088698183939. The leading zeros are useless, but I want to be able to search it with no leading zeros or with several leading zeros. So I decided to index this as a long, expecting it to be stored as a number. Instead, I see this in the index:

<arr name="upcean">
   <long>0088698183939</long>
</arr>

Shouldn't the leading zeros be gone, since this is a number? When I search on it, I only get a hit if I include the two leading zeros. I could just clean everything at both index and search time, but could someone explain what is going on here? Thanks.



      

Re: LongField not stripping leading zeros

Posted by Chris Hostetter <ho...@fucit.org>.
: Thanks. Is there any performance penalty vs. LongField? I don't need to 

The other ones do normalization by converting to a Long internally -- I 
have no idea if you would see some micro performance benefit in doing 
the 0 stripping yourself.
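
To make "normalization by converting to a Long" concrete, here is a minimal sketch in plain Java. It is not Solr's code; the class and method names (LongNormalize, normalize) are made up for illustration. It only shows that round-tripping the text through a long drops the leading zeros, so every zero-padded spelling of the same number ends up as the same string:

```java
// Hypothetical helper, not part of Solr: round-trip a numeric string
// through a long so that zero-padded variants collapse to one form.
class LongNormalize {
    static String normalize(String raw) {
        // Long.parseLong reads base 10 and accepts leading zeros;
        // it throws NumberFormatException for non-numeric input.
        return Long.toString(Long.parseLong(raw.trim()));
    }

    public static void main(String[] args) {
        System.out.println(normalize("0088698183939")); // prints 88698183939
        System.out.println(normalize("88698183939"));   // prints 88698183939
    }
}
```

Applying the same normalization to the value at index time and to the query term at search time is what makes the two spellings match.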

Sorting a LongField should take less RAM than a SortableLongField (because 
the Sortable*Field types use String based FieldCaches to support sortMissing*) 
but if you aren't doing any sorting or range queries on the field that 
shouldn't matter.


-Hoss


Re: LongField not stripping leading zeros

Posted by Kevin Osborn <os...@yahoo.com>.
Thanks. Is there any performance penalty vs. LongField? I don't need to do any range queries on these values. I am basically treating them as numerical strings. I thought it would just be a shortcut to strip leading zeros, which I can easily do on my own.
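
For reference, stripping the zeros by hand could look roughly like the sketch below. This is plain Java with made-up names (LeadingZeros, strip), and it assumes the same cleanup is applied to values before indexing and to terms before querying. It works on the string directly, so it never overflows on very long digit strings:

```java
// Hypothetical helper: strip leading zeros from a digit string
// without parsing it as a number.
class LeadingZeros {
    static String strip(String raw) {
        int i = 0;
        // Stop one character before the end so "000" becomes "0", not "".
        while (i < raw.length() - 1 && raw.charAt(i) == '0') {
            i++;
        }
        return raw.substring(i);
    }

    public static void main(String[] args) {
        System.out.println(strip("0088698183939")); // prints 88698183939
        System.out.println(strip("000"));           // prints 0
    }
}
```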




________________________________
From: Chris Hostetter <ho...@fucit.org>
To: Solr <so...@lucene.apache.org>
Sent: Tue, January 12, 2010 3:16:13 PM
Subject: Re: LongField not stripping leading zeros

: I have some text in our database in the form 0088698183939. The leading 
: zeros are useless, but I want to be able to search it with no leading zeros 
: or several leading zeros. So, I decided to index this as a long, 
: expecting it to just store it as a number. But, instead, I see this in 
: the index:

Note the comments about LongField in the example schema...

      Plain numeric field types that store and index the text
      value verbatim (and hence don't support range queries, since the
      lexicographic ordering isn't equal to the numeric ordering)

...LongField, IntField, etc. all just index/store the exact value you 
put in -- the only distinctions between them and StrField are that they are 
rendered back as a numeric type (by the response writers) and they use the 
numerically typed FieldCache for sorting.

You should be using TrieLongField (or SortableLongField if you need 
sortMissing* type functionality)


-Hoss


      

Re: LongField not stripping leading zeros

Posted by Chris Hostetter <ho...@fucit.org>.
: I have some text in our database in the form 0088698183939. The leading 
: zeros are useless, but I want to be able to search it with no leading zeros 
: or several leading zeros. So, I decided to index this as a long, 
: expecting it to just store it as a number. But, instead, I see this in 
: the index:

Note the comments about LongField in the example schema...

      Plain numeric field types that store and index the text
      value verbatim (and hence don't support range queries, since the
      lexicographic ordering isn't equal to the numeric ordering)

...LongField, IntField, etc. all just index/store the exact value you 
put in -- the only distinctions between them and StrField are that they are 
rendered back as a numeric type (by the response writers) and they use the 
numerically typed FieldCache for sorting.
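
The schema comment's point about ordering can be seen outside Solr. The sketch below is plain Java with made-up names (OrderingDemo, sortedAsText, sortedAsNumbers), not Solr code. It sorts the same verbatim values once as text and once as numbers, and checks whether a verbatim term matches its unpadded form:

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical demo: verbatim text values compare character by
// character, which is neither numeric order nor numeric equality.
class OrderingDemo {
    static String[] sortedAsText(String[] values) {
        String[] copy = values.clone();
        Arrays.sort(copy); // natural (lexicographic) String order
        return copy;
    }

    static String[] sortedAsNumbers(String[] values) {
        String[] copy = values.clone();
        Arrays.sort(copy, Comparator.comparingLong((String s) -> Long.parseLong(s)));
        return copy;
    }

    public static void main(String[] args) {
        String[] values = {"9", "10", "0088"};
        System.out.println(Arrays.toString(sortedAsText(values)));    // [0088, 10, 9]
        System.out.println(Arrays.toString(sortedAsNumbers(values))); // [9, 10, 0088]
        // An exact-match lookup on verbatim text sees two different terms.
        System.out.println("0088698183939".equals("88698183939"));    // false
    }
}
```

The text order puts 10 before 9, which is why a range query over verbatim values would not behave numerically, and the failed equality check mirrors the search that only hits when both leading zeros are typed.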

You should be using TrieLongField (or SortableLongField if you need 
sortMissing* type functionality)


-Hoss