Posted to solr-user@lucene.apache.org by cyang2010 <ys...@hotmail.com> on 2011/03/01 21:48:43 UTC

numeric or string type for non-sortable field?

I wonder whether I should use a Solr int or string type for a field with the
following requirements:

multi-value
facet needed
sort not needed


The field value is an id. Therefore, I can store it as either a numeric field
or just a string. Should I choose string for efficiency?

Thanks.

-- 
View this message in context: http://lucene.472066.n3.nabble.com/numberic-or-string-type-for-non-sortable-field-tp2606353p2606353.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: numeric or string type for non-sortable field?

Posted by Chris Hostetter <ho...@fucit.org>.
: Can I ask why?  I thought Solr was tuned for strings when no sorting or
: faceting by range query is needed.

"tuned for string" doesn't really mean anything to me, i'm not sure what 
that's in reference to.  nothing that i know of is particularly optimized 
for strings.  Almost anything can be indexed/stored/represented as a 
string (in some form or another) and that tends to work fine in solr, but 
some things are optimized for other more specialized datatypes.

the reason i suggested that using ints might (marginally) be better is 
because of the FieldCache and the fieldValueCache -- the int 
representation uses less memory than it would if it were holding strings 
representing the same ints.

worrying about that is really a premature optimization though -- model 
your data in the way that makes the most sense -- if your ids are 
inherently ints, model them as ints until you come up with a reason to 
model them otherwise and move on to the next problem.
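
As a rough sketch of what that modeling choice looks like in schema.xml, here are the two alternatives side by side. The field name `category_id` is made up for illustration, and the `int` and `string` type names assume fieldType definitions like those in the Solr example schema; adjust them to whatever your schema actually declares.

```xml
<!-- Sketch only: "category_id" is a hypothetical field name. -->

<!-- Option 1: ids modeled as ints (assumes an "int" fieldType backed by solr.TrieIntField). -->
<field name="category_id" type="int" indexed="true" stored="true" multiValued="true"/>

<!-- Option 2: the same ids modeled as strings (assumes a "string" fieldType backed by solr.StrField). -->
<!-- <field name="category_id" type="string" indexed="true" stored="true" multiValued="true"/> -->
```

Switching from one option to the other later is a one-attribute change, though the documents have to be reindexed afterwards.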


-Hoss

Re: numeric or string type for non-sortable field?

Posted by cyang2010 <ys...@hotmail.com>.
Can I ask why?  I thought Solr was tuned for strings when no sorting or
faceting by range query is needed.


Re: numeric or string type for non-sortable field?

Posted by Ahmet Arslan <io...@yahoo.com>.
> I will only facet on the field's values, not on range queries (the values
> are just ids in a multi-valued field), and I won't sort on the field either.
>
> In that case, is string more efficient for this requirement?

Hoss was suggesting an int field type without precision steps, such as:

<fieldType name="int" class="solr.TrieIntField" precisionStep="0" omitNorms="true" positionIncrementGap="0"/>

Re: numeric or string type for non-sortable field?

Posted by cyang2010 <ys...@hotmail.com>.
Sorry, I didn't make my question clear.

I will only facet on the field's values, not on range queries (the values
are just ids in a multi-valued field), and I won't sort on the field either.

In that case, is string more efficient for this requirement?
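
For reference, here is a minimal sketch of plain field faceting (as opposed to range faceting), written as request handler defaults in solrconfig.xml. The handler name `/facet-ids` and the field name `category_id` are hypothetical; the same parameters can also be passed directly on the request as `facet=true&facet.field=category_id`.

```xml
<!-- Sketch only: the handler name and field name are hypothetical. -->
<requestHandler name="/facet-ids" class="solr.SearchHandler">
  <lst name="defaults">
    <!-- Turn faceting on and count documents per distinct id value. -->
    <str name="facet">true</str>
    <str name="facet.field">category_id</str>
    <!-- A range-style facet would instead use a range query, e.g. facet.query=category_id:[1 TO 100] -->
  </lst>
</requestHandler>
```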


Re: numeric or string type for non-sortable field?

Posted by Chris Hostetter <ho...@fucit.org>.
: > The field value is an id. Therefore, I can store it as either a
: > numeric field or just a string. Should I choose string for
: > efficiency?
: 
: Trie based integer (tint) is preferred for faster faceting.

range faceting/filtering yes -- not for "field" faceting which is what i 
think he's asking about.

in that case int would still probably be more efficient, but you don't want 
precision steps (they introduce additional indexed terms)
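
To illustrate the difference, here is a sketch contrasting the two trie int types. The `int` definition is the one quoted elsewhere in this thread; the `tint` definition is what the Solr example schema of this era used as far as I recall, so treat its exact attributes as an assumption and check your own schema.xml.

```xml
<!-- No precision steps: one indexed term per value. Suited to plain field faceting. -->
<fieldType name="int" class="solr.TrieIntField" precisionStep="0" omitNorms="true" positionIncrementGap="0"/>

<!-- precisionStep="8" (assumed from the example schema): each value is also indexed at
     lower precisions, which adds terms to the index but speeds up range queries. -->
<fieldType name="tint" class="solr.TrieIntField" precisionStep="8" omitNorms="true" positionIncrementGap="0"/>
```

For a multi-valued id field that is only field-faceted, the extra terms from `tint` buy nothing, which is why the plain `int` type is the one to pick here.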

-Hoss

Re: numeric or string type for non-sortable field?

Posted by Ahmet Arslan <io...@yahoo.com>.
> I wonder whether I should use a Solr int or string type for a field
> with the following requirements:
>
> multi-value
> facet needed
> sort not needed
>
> The field value is an id. Therefore, I can store it as either a numeric
> field or just a string. Should I choose string for efficiency?

Trie based integer (tint) is preferred for faster faceting.