Posted to solr-user@lucene.apache.org by Li Li <fa...@gmail.com> on 2010/07/08 13:44:14 UTC
Distributed Indexing
Are there any tools for "Distributed Indexing"? The wiki page
http://wiki.apache.org/solr/DistributedSearch mentions
KattaIntegration and ZooKeeperIntegration,
but they seem to be more concerned with error handling and
replication. I need a dispatcher that routes each doc by its
uniqueKey (such as url) to a particular machine, so that when a doc is
updated, the update is sent to the machine that already holds that url. I
also need the docs to be spread randomly across all the machines, so that
when I do a distributed search the idfs on the different machines are
similar, because the current distributed search computes idf locally.
RE: Distributed Indexing
Posted by Yuval Feinstein <yu...@answers.com>.
Li,
As far as I know, you still have to do this part yourself.
One possible way to shard is to number the shards from 0 to numShards-1,
calculate hash(uniqueKey) % numShards for each document,
and send the document to the shard with that number.
The result is deterministic for a given uniqueKey, so an updated document
goes to the same shard as before, and a well-mixed hash spreads documents
roughly uniformly across the shards.
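
A minimal sketch of the hash(uniqueKey) % numShards idea, written in Python for brevity. The names shard_for and route, the "url" field, and the shard URLs are hypothetical and not part of Solr. MD5 is used here only as one example of a hash that is stable across processes; Python's built-in hash() is salted per process for strings and would not be consistent between runs.

```python
import hashlib


def shard_for(unique_key: str, num_shards: int) -> int:
    """Return a shard number in [0, num_shards) for a document key."""
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    # Stable digest of the key, so every dispatcher process agrees.
    digest = hashlib.md5(unique_key.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an unsigned integer, then take the modulus.
    return int.from_bytes(digest[:8], "big") % num_shards


def route(docs, shard_urls):
    """Group documents by the shard URL they should be sent to.

    Assumes each doc is a dict whose "url" field is the uniqueKey.
    """
    batches = {url: [] for url in shard_urls}
    for doc in docs:
        index = shard_for(doc["url"], len(shard_urls))
        batches[shard_urls[index]].append(doc)
    return batches


if __name__ == "__main__":
    # Hypothetical shard endpoints, for illustration only.
    shards = ["http://host0:8983/solr", "http://host1:8983/solr"]
    docs = [{"url": "http://example.com/page%d" % i} for i in range(10)]
    for shard_url, batch in route(docs, shards).items():
        print(shard_url, len(batch))
```

Each batch would then be posted to its shard's update handler. Note that with plain modulo, changing numShards remaps most keys, so the number of shards has to be fixed up front or the documents reindexed.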
-- Yuval