Posted to solr-user@lucene.apache.org by Juan Antonio Farré Basurte <ju...@reviewpro.com> on 2011/05/12 17:15:30 UTC

TrieIntField for "short" values

Hello,
I'm quite new to Solr and have many questions as I try to learn how everything works.
I have only a rough idea of how TrieFields work.
The thing is, I have an integer value that will always be in the range 0-1000. A short field would be enough for this, but there is no TrieShortField (nor even a SortableShortField), so I used a TrieIntField.
My question is what a suitable value for precisionStep would be in this case. If the field had only 1000 distinct values, but they were more or less uniformly distributed over the 32-bit int range, a large precisionStep would probably be suitable. But as my values are in the range 0 to 1000, I think (without much knowledge) that a low precisionStep, for example 2, would be more appropriate.
Can anybody please help me find a good configuration for this type? And, if possible, can anybody explain briefly and intuitively what the differences and tradeoffs are between smaller and larger precisionSteps?
Thanks a lot,

Juan

Re: TrieIntField for "short" values

Posted by Erick Erickson <er...@gmail.com>.
Nope, I'm afraid I can't, because I don't really understand it in
detail; the wizard from Germany (Uwe) put it in place....

But here's a great place to start if you want to dive deep:
https://hudson.apache.org/hudson/job/Lucene-trunk/javadoc/all/org/apache/lucene/search/NumericRangeQuery.html?is-external=true

But with only 1,000 distinct values, I don't think you really need to
pursue this very far, except for curiosity's sake. Just go with 4, as
the page above suggests....

Best
Erick
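
For curiosity's sake, here is a rough Python sketch of the trade-off.
It is a simplified port of the range-splitting idea described on the
NumericRangeQuery page, restricted to non-negative values; it is not
Lucene's actual code, and the names split_range and terms_visited are
made up for illustration. It counts how many indexed terms a range
query would have to visit for a given precisionStep.

```python
def split_range(lo, hi, precision_step, val_size=32):
    """Split lo..hi (non-negative ints) into (shift, min, max) sub-ranges.

    Each sub-range is matched against terms indexed at that shift,
    i.e. terms with the lowest `shift` bits dropped.
    """
    if precision_step < 1:
        raise ValueError("precision_step must be >= 1")
    ranges = []
    if lo > hi:
        return ranges
    shift = 0
    while True:
        diff = 1 << (shift + precision_step)
        mask = ((1 << precision_step) - 1) << shift
        has_lower = (lo & mask) != 0      # lower edge not aligned at this level
        has_upper = (hi & mask) != mask   # upper edge not aligned at this level
        next_lo = ((lo + diff) if has_lower else lo) & ~mask
        next_hi = ((hi - diff) if has_upper else hi) & ~mask
        if shift + precision_step >= val_size or next_lo > next_hi:
            # Coarsest usable level reached: cover what is left here.
            ranges.append((shift, lo, hi))
            break
        if has_lower:
            ranges.append((shift, lo, lo | mask))
        if has_upper:
            ranges.append((shift, hi & ~mask, hi))
        lo, hi = next_lo, next_hi
        shift += precision_step
    return ranges


def terms_visited(ranges):
    """Upper bound on distinct terms enumerated across all sub-ranges."""
    return sum(((hi - lo) >> shift) + 1 for shift, lo, hi in ranges)


if __name__ == "__main__":
    for step in (2, 4, 8, 32):
        print(step, terms_visited(split_range(100, 900, step)))
```

In this sketch, a query for 100 to 900 visits 36 terms with
precisionStep=4 and 801 terms when only full-precision terms exist
(step 32 or more). The price of the smaller step is more terms indexed
per value. With values limited to 0-1000, both counts are modest.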

Re: TrieIntField for "short" values

Posted by Juan Antonio Farré Basurte <ju...@reviewpro.com>.
Hi,

Thanks for your answer.

I am doing range queries on this field, yes; that's why I care about
how this whole trie thing works :)

If I use precisionStep=0, would it be equivalent to using, say, a
SortableIntField?

Could you explain, for example, the difference in how it would work
with precisionStep=0 versus precisionStep=Integer.MAX_VALUE?

Maybe that way I could get an idea of how it works. I've read as much
information as I've been able to find, but I didn't get a clear idea.

Thanks a lot,

Juan
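
As a rough illustration of the indexing side of this question, the
Python sketch below lists the terms a single 32-bit value would
produce for a given precisionStep: one term per shift, each dropping
more low-order bits. The name indexed_terms and the handling of
precisionStep <= 0 are assumptions made for illustration. As far as I
can tell from the Solr source, TrieField treats a precisionStep of 0
(or of 64 and above) as Integer.MAX_VALUE, which would make the two
settings asked about here behave the same: a single full-precision
term per value.

```python
def indexed_terms(value, precision_step, val_size=32):
    """Return (shift, prefix) pairs for one non-negative value.

    Assumption: precision_step <= 0 or >= 64 means "no extra
    precisions", mirroring how Solr is believed to normalise it.
    """
    if precision_step <= 0 or precision_step >= 64:
        precision_step = val_size
    # One term per shift: 0, step, 2*step, ... while shift < val_size.
    return [(shift, value >> shift)
            for shift in range(0, val_size, precision_step)]


if __name__ == "__main__":
    for step in (0, 4, 8, 2**31 - 1):
        print(step, indexed_terms(1000, step))
```

With precisionStep=4 the value 1000 yields eight terms (shifts 0, 4,
..., 28); with 8 it yields four; with 0 or Integer.MAX_VALUE it yields
only the full-precision term.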

Re: TrieIntField for "short" values

Posted by Erick Erickson <er...@gmail.com>.
Are you doing range queries on this field? Range queries are where
Trie shines, so worrying about precisionStep if you're NOT intending
to do range queries is a waste; just use precisionStep=0.

In fact, with only 1,000 values, I'd just go with precisionStep=0
(which is the int field).

Best
Erick
