Posted to solr-user@lucene.apache.org by Jimmy Lin <ji...@gmail.com> on 2014/04/26 16:24:01 UTC

Optimal setup for multiple tools

Hello,

My team has been working with Solr for the last 2 years.  We have two main
indices:

1. documents
        -index and store main text
        -one record for each document
2. places (all of the geospatial places found in the documents above)
        -index but don't store main text
        -one record for each place.  A single document could have
thousands, but the ratio has come out to roughly 6:1 places to documents

We have several tools that query the above indices.  One is just a standard
search tool that returns documents filtered on keyword, temporal, and
geospatial filters.  Another is a geospatial tool that queries the places
collection.  We now have a requirement to provide document highlighting
when querying in the geospatial tool.

Does anyone have any suggestions/prior experience on how they would set up
two collections that are essentially different "views" of the data?  Also
any tips on how to ensure that these two collections are "in sync" (meaning
any documents indexed into the documents collection are also properly
indexed in places)?

Thanks a lot,

Jimmy Lin

Re: Optimal setup for multiple tools

Posted by Erick Erickson <er...@gmail.com>.
Have you considered putting them in the _same_ index? There's not much
penalty at all with having sparsely populated fields in a document, so
the fact that the two parts of your index had orthogonal fields
wouldn't cost you much and would solve the synchronization problem.

You can include a type field to distinguish between the two and just
include a filter query to keep them separate. Since that'll be cached,
your search performance should be fine.
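
A minimal sketch of the type-field idea, assuming a field named `type` with
the values `document` and `place` (both names are hypothetical, not taken
from the schema discussed here). It only builds the request parameters;
`q`, `fq`, and `wt` are standard Solr query parameters.

```python
from urllib.parse import urlencode


def build_select_params(user_query, record_type):
    """Return Solr /select parameters restricted to one record type."""
    return {
        "q": user_query,
        # fq is applied as a filter query; Solr caches its result set
        # separately from the main query, so reusing it is cheap.
        "fq": f"type:{record_type}",  # "type" is an assumed field name
        "wt": "json",
    }


# The geospatial tool only sees place records, the search tool only documents.
place_params = build_select_params("harbor", "place")
doc_params = build_select_params("harbor", "document")

# Query string that would be appended to a core's /select handler.
place_query_string = urlencode(place_params)
```

Each tool keeps its own fixed `fq`, so the two "views" never mix even though
they share one index.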

Otherwise you should include the fields you need to sort on in the
index you need to sort. Denormalizes the data, but...

About keeping the two in sync: that's really outside Solr. Your
indexing process has to manage that, I'd guess.
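
One way the indexing process could manage this, sketched under assumptions:
build the document record and all of its place records in a single function
and send them as one update batch, so neither can be indexed without the
other. The field names (`type`, `doc_id`, `place_name`, `location`) and the
id scheme are hypothetical, and the actual call that posts the batch to Solr
is left out.

```python
def build_records(doc_id, text, places):
    """Build one document record plus one record per place.

    `places` is a list of (name, latitude, longitude) tuples extracted
    from the document by some upstream step.
    """
    parent_id = f"doc-{doc_id}"
    records = [{"id": parent_id, "type": "document", "text": text}]
    for i, (name, lat, lon) in enumerate(places):
        records.append({
            "id": f"place-{doc_id}-{i}",
            "type": "place",
            "doc_id": parent_id,           # link back to the source document
            "place_name": name,
            "location": f"{lat},{lon}",    # "lat,lon" string form
        })
    return records


batch = build_records(
    "42",
    "Ships left Boston for Halifax.",
    [("Boston", 42.36, -71.06), ("Halifax", 44.65, -63.57)],
)
```

Because every place record carries the id of its document, a later check for
documents without places (or the reverse) becomes a simple lookup on that
field.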

Best,
Erick
