Posted to solr-user@lucene.apache.org by Will Milspec <wi...@gmail.com> on 2011/08/18 23:22:35 UTC
overhead of empty, unused fields
hi all,
What is the cost of unused field types?
Our application supports multiple languages. We envision separate
Lucene/Solr fields (and field types) per language (content_en, content_fr,
content_zh_CN, etc.).
We thought of a few options:
a) auto-generating the 'multilingual' portion of the schema based on the
application's languages,
b) including fields-and-types for all languages
In A, if an implementation only used French and Chinese, the schema would
only have content_fr and content_zh_CN fields-and-types.
In B, the implementation would have all field types, but a given document
would only have two fields.
A seems "more efficient", but is more work. The downside: if a user wants to
add a language, they would need to regenerate the schema (i.e. add
fields-and-types for "ja")
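Option A could be sketched roughly as below. This is only an illustration: the content_<lang> field names follow the naming above, while the text_<lang> type names, the indexed/stored attribute values, and the field_declarations helper are assumptions, and each referenced field type would still have to be defined in the schema.

```python
from xml.sax.saxutils import quoteattr


def field_declarations(languages, type_prefix="text_"):
    """Return one <field> declaration per language code.

    The content_<lang> / text_<lang> naming is a hypothetical convention,
    not something Solr requires.
    """
    lines = []
    for lang in languages:
        # quoteattr adds the surrounding quotes and escapes special characters
        name = quoteattr(f"content_{lang}")
        field_type = quoteattr(f"{type_prefix}{lang}")
        lines.append(
            f'<field name={name} type={field_type} indexed="true" stored="true"/>'
        )
    return lines


# Example: an installation that only uses French and Chinese
for line in field_declarations(["fr", "zh_CN"]):
    print(line)
```

Adding "ja" later would mean rerunning the generator with the extra language code and reloading the schema.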
How much do empty field types and fields cost? Do a dozen-or-so unused field
types hurt scalability of indexing or search?
thanks,
will
Re: overhead of empty, unused fields
Posted by Markus Jelsma <ma...@openindex.io>.
No problem. A document without a value for some field simply doesn't have an
entry in the inverted index.
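As a rough illustration of that point, here is a toy inverted index in Python. It is a simplified model, not Lucene's actual data structures; the build_index helper, the sample documents, and the whitespace tokenization are all assumptions made for the sketch.

```python
from collections import defaultdict


def build_index(docs):
    """Map field name -> term -> set of ids of documents containing the term."""
    index = defaultdict(lambda: defaultdict(set))
    for doc_id, fields in docs.items():
        # Only fields that are actually present on a document are visited,
        # so a declared-but-unused field never gets any postings.
        for field, text in fields.items():
            for term in text.lower().split():
                index[field][term].add(doc_id)
    return index


# Hypothetical documents: none of them has English or Japanese content
docs = {
    1: {"content_fr": "bonjour le monde"},
    2: {"content_fr": "bonjour", "content_zh_CN": "你好 世界"},
}
index = build_index(docs)

# Fields declared in the schema, whether or not any document uses them
declared_fields = ["content_en", "content_fr", "content_zh_CN", "content_ja"]
for field in declared_fields:
    terms = index.get(field, {})
    print(field, "distinct terms:", len(terms))
```

In this model the unused fields (content_en, content_ja) occupy no space in the index at all; they exist only as declarations.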