Posted to solr-user@lucene.apache.org by santamaria2 <ar...@contify.com> on 2012/08/08 10:16:33 UTC

Designing an index with multiple entity types, sharing field names across entity-types.

My question stems from a vague memory of reading somewhere that Solr's search
performance depends on the total number of 'terms' there are in a field that
is searched upon.

I'm setting up an index core for some autocomplete boxes on my site. There
is a search box for each facet group in my results page (suggestions for a
single entity-type), and a 'generic' search box on my header that will
display suggestions for multiple entity-types.

The entity types are: Books, Authors, Categories, Publishers.

Books, Authors --> over 100,000 of each type right now. Will grow larger.
Categories, Publishers --> around 500 of each type. Will grow slowly.

Books & Categories have 'descriptions' which I also want searchable -- with
lower boosts.

In my per-entity search boxes, for autocomplete suggestions for user input
"man", I'd do:
q=(name:man* OR description:man*^0.5)&fq=type:<my-entity>
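
A minimal sketch of how the per-entity parameters above could be assembled client-side. The function name is hypothetical, and it assumes the user input has already been sanitized for query-syntax characters:

```python
from urllib.parse import urlencode


def per_entity_params(prefix, entity_type):
    """Build Solr parameters for a single-entity autocomplete box.

    Hypothetical helper; assumes `prefix` is already sanitized.
    """
    # Prefix match on name, plus a lower-boosted prefix match on description.
    q = f"(name:{prefix}* OR description:{prefix}*^0.5)"
    # The filter query restricts the results to one entity type.
    return {"q": q, "fq": f"type:{entity_type}"}


params = per_entity_params("man", "book")
# urlencode percent-escapes the parentheses, colons, '*' and '^' for the URL.
query_string = urlencode(params)
```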


For my generic search box on top of my page, I would not have fq, but
instead I'd use &group=true&group.field=type.
(type --> {'book', 'author', 'category', 'publisher'})
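
Roughly, the generic box would send something like the parameters below. This is only a sketch: the function name and the per-group limit of 5 are assumptions for illustration, while group, group.field and group.limit are standard Solr result-grouping parameters.

```python
def generic_params(prefix, per_group_limit=5):
    """Build Solr parameters for the generic (multi-entity) autocomplete box.

    Hypothetical helper; the default limit of 5 is an illustrative assumption.
    """
    q = f"(name:{prefix}* OR description:{prefix}*^0.5)"
    return {
        "q": q,
        # No fq here: group the matches by entity type instead.
        "group": "true",
        "group.field": "type",
        # Number of suggestions returned per entity type.
        "group.limit": per_group_limit,
    }


params = generic_params("man")
```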

This seems okay, but I'm just wondering about what I said in my first
paragraph. The number of total terms of a field.

For a laaaarge index, would it be better to use more specific fields?
eg. Instead of a common field 'name', what if I do 'author_name',
'book_name', 'publisher_name', 'category_name', 'book_description',
'category_description'?

Would this be 'faster' to search on?
For my per-entity search boxes, the query changes in an obvious manner. But
this would complicate stuff for my generic-search-box query... for which I
haven't decided on how I'd go about designing a query, yet.
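
To make the "obvious manner" concrete, here is a sketch of the per-entity query with entity-specific fields. The mapping and function name are hypothetical; it assumes only Books and Categories carry a description field, as described earlier.

```python
# Hypothetical mapping from entity type to its (name, description) fields.
# Authors and Publishers have no description field in this layout.
ENTITY_FIELDS = {
    "book": ("book_name", "book_description"),
    "author": ("author_name", None),
    "category": ("category_name", "category_description"),
    "publisher": ("publisher_name", None),
}


def per_entity_query(prefix, entity_type):
    """Build the q string for one entity type using its own field names."""
    name_field, description_field = ENTITY_FIELDS[entity_type]
    clauses = [f"{name_field}:{prefix}*"]
    if description_field is not None:
        # Descriptions are searchable, but with a lower boost.
        clauses.append(f"{description_field}:{prefix}*^0.5")
    return "(" + " OR ".join(clauses) + ")"
```

If only book documents ever populate book_name, the fq on type may become redundant for these queries, though that depends on how the documents are indexed.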

What say thee?



--
View this message in context: http://lucene.472066.n3.nabble.com/Designing-an-index-with-multiple-entity-types-sharing-field-names-across-entity-types-tp3999727.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Designing an index with multiple entity types, sharing field names across entity-types.

Posted by Erick Erickson <er...@gmail.com>.
I'd recommend you ignore search speed for the time being.
First, your index isn't that large from what you've described.
I see clients with 40-50M documents on a single machine
(admittedly with some iron under the hood)...

Instead, I'd concentrate on designing the best user
experience I could, and worry about search speed only
when your testing shows you need to. The real question
isn't "will X or Y be faster" IMO, it's "Will what I want
be fast enough".

Using distinct fields and edismax is where I'd start.
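
As a rough sketch of what that could look like (not a tested configuration): the field names are the ones proposed in the original question, the 0.5 description boost is carried over from the original query, and the function name is made up for illustration. defType and qf are standard edismax parameters.

```python
NAME_FIELDS = ["book_name", "author_name", "publisher_name", "category_name"]
DESCRIPTION_FIELDS = ["book_description", "category_description"]


def edismax_params(user_input, description_boost=0.5):
    """Build edismax parameters that search all distinct fields at once.

    Hypothetical helper; field names and boost are taken from the question.
    """
    # qf lists every field to search; descriptions get a lower boost.
    qf = NAME_FIELDS + [f"{field}^{description_boost}" for field in DESCRIPTION_FIELDS]
    return {"defType": "edismax", "q": user_input, "qf": " ".join(qf)}


params = edismax_params("man")
```

With qf carrying the field list, the query string itself stays free of field names, which keeps the generic search box simple.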

Best
Erick

On Thu, Aug 9, 2012 at 2:04 AM, santamaria2 <ar...@contify.com> wrote:
> *civilized bump*

Re: Designing an index with multiple entity types, sharing field names across entity-types.

Posted by santamaria2 <ar...@contify.com>.
*civilized bump*

Re: Designing an index with multiple entity types, sharing field names across entity-types.

Posted by santamaria2 <ar...@contify.com>.
To clarify a wee bit more: I'm wondering about the performance impact on
single-entity queries if I use common field names.
eg. a 'name' field for all entity types. 'Author' & 'Book' together make up
200,000+ 'name' values. Will this affect anything if I search over
'Category'? Will using fq=type:category save me?