Posted to solr-user@lucene.apache.org by Garafola Timothy <ti...@gmail.com> on 2009/06/12 18:58:34 UTC

Efficient Sharding with date sorted queries

I have a solr index which is going to grow 3x in the near future.  I'm
considering using distributed search and was contemplating what would
be the best approach to splitting the index.  Since most of the
searches performed on the index are sorted by date descending, I'm
considering splitting the index based on the created date of the
documents.

From Yonik Seeley's blog post,
http://yonik.wordpress.com/2008/02/27/distributed-search-for-solr/,
I've read that there are two phases to sharding.  The first phase
collects matching ids and documents across the shards.  Then the
second phase collects the stored fields for the documents.  I'm
assuming that this second phase's execution is limited by the number
of rows requested and the number of results.

So let's say I have 2 shards.  The first shard has docs with creation
dates of this year.  The second shard contains documents from the
previous year.  I run a solr query requesting 10 rows sorted by date
and get 11 from the first shard and 3 from the second.  Will the
initial query only execute the first phase on the second shard?  If
so, that should result in better performance, right?
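
The scenario above can be modeled with a small sketch. It is a simplified
simulation, not Solr's code: the shard names, document ids and dates are all
hypothetical, and it assumes that phase one asks every shard for its top
`rows` ids plus sort values, and that phase two fetches stored fields only
from shards that own a winning document.

```python
from datetime import date

# Hypothetical shard contents: doc id -> (created date, stored fields).
SHARDS = {
    "shard_this_year": {
        f"a{i}": (date(2009, 1, 1 + i), {"title": f"doc a{i}"}) for i in range(11)
    },
    "shard_last_year": {
        f"b{i}": (date(2008, 6, 1 + i), {"title": f"doc b{i}"}) for i in range(3)
    },
}


def phase_one(shard_name, limit):
    """Return up to `limit` (date, id, shard) entries, newest first.

    Only ids and sort values are returned; stored fields are not touched.
    """
    docs = SHARDS[shard_name]
    entries = [(created, doc_id, shard_name) for doc_id, (created, _) in docs.items()]
    entries.sort(reverse=True)
    return entries[:limit]


def distributed_search(rows):
    # Phase 1: every shard is asked, including the one holding older docs.
    candidates = []
    for name in SHARDS:
        candidates.extend(phase_one(name, rows))
    winners = sorted(candidates, reverse=True)[:rows]

    # Phase 2: group the winning ids by owning shard and fetch stored fields
    # only from those shards (assumed behavior, see the note above).
    by_shard = {}
    for _, doc_id, name in winners:
        by_shard.setdefault(name, []).append(doc_id)
    results = [SHARDS[name][doc_id][1] for _, doc_id, name in winners]
    return results, sorted(by_shard)


results, phase_two_shards = distributed_search(rows=10)
print(len(results), phase_two_shards)
```

In this model the older shard still does the phase-one work of matching and
sorting, so any saving would come only from skipping its stored-field fetch.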


Thanks,
-Tim

Re: Efficient Sharding with date sorted queries

Posted by Shalin Shekhar Mangar <sh...@gmail.com>.
On Fri, Jun 12, 2009 at 10:28 PM, Garafola Timothy <ti...@gmail.com> wrote:

>
> So let's say I have 2 shards.  The first shard has docs with creation
> dates of this year.  The Second shard contains documents from the
> previous year.  I run a solr query requesting 10 rows sorted by date
> and get 11 from the first shard and 3 from the second.


No, you cannot request a specific number of results from a shard. Solr
manages that itself: it requests start+rows documents from each shard and
merges them to find the rows documents to be returned. If you really want
to get a specific number of results from a shard, make a query to that
shard alone.
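
A minimal sketch of why each shard is asked for start+rows documents rather
than just rows. It is an illustration under stated assumptions, not Solr's
implementation: the function names and the date values are made up, and
dates are kept as ISO strings so that plain string ordering matches date
ordering.

```python
def shard_top(dates, limit):
    """What one shard returns: its `limit` newest sort values."""
    return sorted(dates, reverse=True)[:limit]


def merged_page(shards, start, rows, per_shard_limit=None):
    """Merge per-shard results and return the requested page.

    By default every shard is asked for start + rows entries, as described
    above. `per_shard_limit` is a hypothetical knob used only to show what
    goes wrong with a smaller request.
    """
    if per_shard_limit is None:
        per_shard_limit = start + rows
    candidates = []
    for dates in shards:
        candidates.extend(shard_top(dates, per_shard_limit))
    candidates.sort(reverse=True)
    return candidates[start:start + rows]


# Hypothetical creation dates held by two shards.
shards = [
    ["2009-06-01", "2009-03-15", "2009-01-20"],
    ["2009-05-10", "2008-12-31", "2008-07-04"],
]

correct = merged_page(shards, start=2, rows=2)
too_few = merged_page(shards, start=2, rows=2, per_shard_limit=2)
print(correct)
print(too_few)
```

With only rows entries per shard, the page in this example skips a document
from the first shard that belongs in it. Any single shard could own the whole
page, so each one has to return start+rows candidates.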

-- 
Regards,
Shalin Shekhar Mangar.