Posted to solr-user@lucene.apache.org by Sascha Szott <sz...@zib.de> on 2010/06/17 19:43:28 UTC

federated / meta search

Hi folks,

If I'm seeing it right, Solr currently does not provide any support for
federated / meta searching. Has anyone already put effort into this
direction? Moreover, is federated / meta search considered a scenario
Solr should be able to deal with at all, or is it (far) beyond the
scope of Solr?

To be more precise, here is a short explanation of my requirements.
Assume there are a couple of Solr instances running at different
places. The documents stored in those instances are all from the same
domain (bibliographic records), but it cannot be guaranteed that the
schema definitions match 100%. But let's say there are at least some
index fields that are present in all instances (fields with the same
name and type definition). Now, I'd like to search all instances at
the same time (with the restriction that the query contains only
fields that are common to all schemas) and combine the results in a
reasonable way by using the score information associated with each
hit. Please note that, due to legal issues, it is not feasible to
build a single index that integrates the documents of all Solr
instances under consideration.

Thanks in advance,
Sascha


Re: federated / meta search

Posted by Lance Norskog <go...@gmail.com>.
Yes, you can do this. You need to have a common system for creating
unique ids for the documents.
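
As a rough illustration of the unique-id point, here is a minimal sketch
of one possible scheme: prefix each instance's local id with a short
label for that instance. The labels ("library_a", "library_b"), the ":"
separator, and both function names are assumptions made up for this
example, not anything Solr prescribes.

```python
SEPARATOR = ":"  # assumed separator; must never appear in a source label


def make_global_id(source: str, local_id: str) -> str:
    """Build an id that stays unique across instances by prefixing the source label."""
    if SEPARATOR in source:
        raise ValueError("source label must not contain the separator")
    return f"{source}{SEPARATOR}{local_id}"


def split_global_id(global_id: str) -> tuple[str, str]:
    """Recover (source, local_id); splits on the first separator only."""
    source, _, local_id = global_id.partition(SEPARATOR)
    return source, local_id


# Two hypothetical instances that both happen to use the local id "42".
id_a = make_global_id("library_a", "42")
id_b = make_global_id("library_b", "42")
print(id_a, id_b)
```

With a scheme like this, two records that share a local id in different
instances no longer collide when their results are merged.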

Also, there's an odd problem around relevance. Relevance scoring is
based on statistics over all of the terms in a field in the whole
index, and each index has its own "statistical fingerprint". With two
indexes from two sources, the terms in the documents will not have
the same "fingerprint", so a relevance score from one shard does not
mean the same thing as the same score from the other shard.
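
To make the "fingerprint" problem concrete, here is a small sketch
using a classic inverse document frequency of the form
1 + ln(N / (df + 1)). The formula is used for illustration only, and
the document counts for the two shards are invented; the point is just
that the same term gets a different weight in each index.

```python
import math


def classic_idf(num_docs: int, doc_freq: int) -> float:
    """Inverse document frequency: terms found in fewer documents weigh more."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))


# Hypothetical statistics for the same term in two independent indexes.
# In shard A the term is rare, in shard B it occurs in half the documents.
idf_a = classic_idf(num_docs=1000, doc_freq=10)
idf_b = classic_idf(num_docs=1000, doc_freq=500)

print(f"shard A idf: {idf_a:.3f}")
print(f"shard B idf: {idf_b:.3f}")
```

A document matching the term in shard A would receive a noticeably
higher score than an equally good match in shard B, purely because of
the local term statistics.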

There is a project to make this work in Solr, but it is not nearly finished.

Lance Norskog

On Fri, Jun 18, 2010 at 4:28 AM, Sascha Szott <sz...@zib.de> wrote:
> Hi Joe & Markus,
>
> sounds good! Maybe I should better add a note on the Wiki page on federated
> search [1].
>
> Thanks,
> Sascha
>
> [1] http://wiki.apache.org/solr/FederatedSearch
>
> Joe Calderon wrote:
>>
>> yes, you can use distributed search across shards with different
>> schemas as long as the query only references overlapping fields, i
>> usually test adding new fields or tokenizers on one shard and deploy
>> only after i verified its working properly
>>
>> On Thu, Jun 17, 2010 at 1:10 PM, Markus Jelsma<ma...@buyways.nl>
>>  wrote:
>>>
>>> Hi,
>>>
>>>
>>>
>>> Check out Solr sharding [1] capabilities. I never tested it with
>>> different schema's but if each node is queried with fields that it supports,
>>> it should return useful results.
>>>
>>>
>>>
>>> [1]: http://wiki.apache.org/solr/DistributedSearch
>>>
>>>
>>>
>>> Cheers.
>>>



-- 
Lance Norskog
goksron@gmail.com

Re: federated / meta search

Posted by Sascha Szott <sz...@zib.de>.
Hi Joe & Markus,

Sounds good! Maybe I should add a note to the wiki page on
federated search [1].

Thanks,
Sascha

[1] http://wiki.apache.org/solr/FederatedSearch

Joe Calderon wrote:
> yes, you can use distributed search across shards with different
> schemas as long as the query only references overlapping fields, i
> usually test adding new fields or tokenizers on one shard and deploy
> only after i verified its working properly
>
> On Thu, Jun 17, 2010 at 1:10 PM, Markus Jelsma<ma...@buyways.nl>  wrote:
>> Hi,
>>
>>
>>
>> Check out Solr sharding [1] capabilities. I never tested it with different schema's but if each node is queried with fields that it supports, it should return useful results.
>>
>>
>>
>> [1]: http://wiki.apache.org/solr/DistributedSearch
>>
>>
>>
>> Cheers.
>>


Re: federated / meta search

Posted by Joe Calderon <ca...@gmail.com>.
Yes, you can use distributed search across shards with different
schemas as long as the query only references overlapping fields. I
usually test new fields or tokenizers on one shard and deploy them
only after I've verified they work properly.
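
A minimal sketch of what such a request could look like, assuming
Solr's distributed search `shards` parameter and two made-up hosts
(host-a, host-b). The field names `id` and `title` are also
assumptions, standing in for fields that exist in every schema.

```python
from urllib.parse import urlencode


def build_distributed_query(coordinator: str, shards: list[str],
                            query: str, fields: list[str]) -> str:
    """Build a select URL that asks one Solr node to fan out to all shards."""
    params = {
        "q": query,                  # must only reference overlapping fields
        "shards": ",".join(shards),  # host:port/path entries, comma-separated
        "fl": ",".join(fields),      # fields to return, including the score
        "wt": "json",
    }
    return f"{coordinator}/select?{urlencode(params)}"


# Hypothetical hosts; either node can act as the coordinator.
url = build_distributed_query(
    coordinator="http://host-a:8983/solr",
    shards=["host-a:8983/solr", "host-b:8983/solr"],
    query="title:solr",
    fields=["id", "title", "score"],
)
print(url)
```

The sketch only builds the URL; sending it and reading the merged
response is left out.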

On Thu, Jun 17, 2010 at 1:10 PM, Markus Jelsma <ma...@buyways.nl> wrote:
> Hi,
>
>
>
> Check out Solr sharding [1] capabilities. I never tested it with different schema's but if each node is queried with fields that it supports, it should return useful results.
>
>
>
> [1]: http://wiki.apache.org/solr/DistributedSearch
>
>
>
> Cheers.
>

RE: federated / meta search

Posted by Markus Jelsma <ma...@buyways.nl>.
Hi,

Check out Solr's sharding [1] capabilities. I never tested it with different schemas, but if each node is queried with fields that it supports, it should return useful results.

[1]: http://wiki.apache.org/solr/DistributedSearch

Cheers.
 