Posted to solr-user@lucene.apache.org by Doug Steigerwald <ds...@mcclatchyinteractive.com> on 2009/03/04 15:20:40 UTC

MoreLikeThis filtering

Is it possible to filter similarities found by the MLT component/handler?
Something like mlt.fq=site_id:86?

We have 32 cores in our Solr install, and some of those cores have up
to 8 sites indexed in them.  Typically those cores will have one very
large site with a few hundred thousand indexed documents, and lots of
small sites with significantly fewer documents indexed.

We're looking to implement an MLT component for our sites but want the
similar stories to come only from a specific site (not all sites in the
core).

Is there a way to do something like this, or will we have to make mods  
(I'm not seeing anything jump out at me in the Solr 1.3.0 or Lucene  
2.4.0 code)?

/solr/dsteiger/mlt?q=story_id:188665+AND+site_id:86&mlt.fq=site_id:86

(We have all of our other defaults set up in the handler config.)

Thanks.
---
Doug Steigerwald
Software Developer
McClatchy Interactive
dsteigerwald@mcclatchyinteractive.com


Re: MoreLikeThis filtering

Posted by Doug Steigerwald <ds...@mcclatchyinteractive.com>.
'fq' seems to apply only to the original query that finds the source
document, not to the similar documents that come back.

Doug

On Mar 4, 2009, at 9:28 AM, Otis Gospodnetic wrote:

>
> Doug,
>
> does the good old 'fq' not work with MLT?  It should...
>
>
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>
>
>


Re: MoreLikeThis filtering

Posted by Doug Steigerwald <ds...@mcclatchyinteractive.com>.
Hah.  Sorry, I'm really out of it today.

The MoreLikeThisComponent doesn't seem to work for filtering using fq,  
but the MoreLikeThisHandler does.

Problem solved, we'll just use the handler instead of a component.

Doug

On Mar 4, 2009, at 11:02 AM, Doug Steigerwald wrote:

> Sorry.  The examples on the wiki aren't working with the 'fq' to  
> filter the similarities.  It just filters the actual queries.
>
> http://localhost:8983/solr/mlt?q=id:SP2514N&mlt.fl=manu,cat&mlt.mindf=1&mlt.mintf=1&fq=popularity:6&mlt.displayTerms=details&mlt=true
>
> The popularity of the doc found is 6, and trying to use  
> 'fq=popularity:6' brings back similarities with a popularity other  
> than 6.
>
> Doug
>


Re: MoreLikeThis filtering

Posted by Andrew Ingram <an...@andrewingram.net>.
I posted a while back with this problem and I've finally got it working 
using the following method:

in solrconfig.xml:

  <requestHandler name="mlt" class="solr.MoreLikeThisHandler">
    <lst name="defaults">
      <str name="mlt.fl">id,title</str>
      <int name="mlt.mintf">0</int>
    </lst>
  </requestHandler>


Then, when making the request, I do a normal search for the item with the
necessary filter query (in my case discontinued:false) and set qt=mlt so
that the request is routed to the MLT handler.

So in your case:

http://localhost:8983/solr/select?q=id:SP2514N&mlt.fl=manu,cat&mlt.mindf=1&mlt.mintf=1&fq=popularity:6&mlt.displayTerms=details&mlt=true&qt=mlt

I had to use qt because I'm using a library to access Solr and it
doesn't support selecting an alternative handler by URL path.
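
For anyone building that request in code, here is a minimal Python sketch of
the same URL. The function name and its arguments are made up for
illustration; the parameter names (q, fq, mlt.fl, mlt.mindf, mlt.mintf, qt)
are the ones used in this thread, and the base URL assumes the example Solr
instance on localhost. It only builds the URL, it does not send anything.

```python
from urllib.parse import urlencode


def build_mlt_url(base_url, doc_query, filter_query, mlt_fields, handler="mlt"):
    """Build a /select URL that routes to the MLT handler via qt.

    build_mlt_url is a hypothetical helper, not part of Solr or any client
    library. The filter query is sent as a normal fq parameter.
    """
    params = [
        ("q", doc_query),                  # query that finds the source document
        ("fq", filter_query),              # filter to apply, e.g. popularity:6
        ("mlt.fl", ",".join(mlt_fields)),  # fields used to compute similarity
        ("mlt.mindf", 1),
        ("mlt.mintf", 1),
        ("qt", handler),                   # handler name from solrconfig.xml
    ]
    return base_url.rstrip("/") + "/select?" + urlencode(params)


url = build_mlt_url(
    "http://localhost:8983/solr",  # assumed example instance
    "id:SP2514N",
    "popularity:6",
    ["manu", "cat"],
)
print(url)
```

urlencode percent-encodes the colons and commas, which Solr decodes back to
the same values.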

Regards,
Andrew Ingram


Doug Steigerwald wrote:
> Sorry.  The examples on the wiki aren't working with the 'fq' to 
> filter the similarities.  It just filters the actual queries.
>
> http://localhost:8983/solr/mlt?q=id:SP2514N&mlt.fl=manu,cat&mlt.mindf=1&mlt.mintf=1&fq=popularity:6&mlt.displayTerms=details&mlt=true 
>
>
> The popularity of the doc found is 6, and trying to use 
> 'fq=popularity:6' brings back similarities with a popularity other 
> than 6.
>
> Doug
>


Re: MoreLikeThis filtering

Posted by Doug Steigerwald <ds...@mcclatchyinteractive.com>.
Sorry.  The examples on the wiki aren't working with 'fq' to filter
the similarities.  It only filters the original query.

http://localhost:8983/solr/mlt?q=id:SP2514N&mlt.fl=manu,cat&mlt.mindf=1&mlt.mintf=1&fq=popularity:6&mlt.displayTerms=details&mlt=true

The popularity of the doc found is 6, and trying to use
'fq=popularity:6' brings back similarities with a popularity other than 6.
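
A quick way to confirm this kind of leak is to check the returned similar
docs against the filter on the client side. The sketch below is only an
illustration: it assumes the similar docs have already been parsed into a
list of dicts, and the sample data is hand-written, not real Solr output.

```python
def docs_violating_filter(similar_docs, field, expected):
    """Return the similar docs whose `field` does not equal `expected`."""
    return [doc for doc in similar_docs if doc.get(field) != expected]


# Hypothetical sample standing in for the parsed similar documents.
similar = [
    {"id": "A", "popularity": 6},
    {"id": "B", "popularity": 1},
    {"id": "C", "popularity": 6},
]

leaked = docs_violating_filter(similar, "popularity", 6)
print([doc["id"] for doc in leaked])  # prints ['B']
```

An empty result means every similar doc satisfied the filter.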

Doug

On Mar 4, 2009, at 10:39 AM, Doug Steigerwald wrote:

> Hm.  I checked out a clean Solr 1.3.0 and indexed the example docs
> and set up a simple MLT handler the example queries on the Wiki work
> fine (fq can filter out docs).  Our build has a slight change to
> QueryComponent so another query isn't done when we use
> localsolr+field collapsing, but that change doesn't look like it
> would make a difference.  It just conditionally sets
> rb.setNeedDocSet() to true or false.
>
> Will run some tests on a clean fresh build of Solr to see if it's  
> our build.
>
> Doug
>


Re: MoreLikeThis filtering

Posted by Doug Steigerwald <ds...@mcclatchyinteractive.com>.
Hm.  I checked out a clean Solr 1.3.0, indexed the example docs, and
set up a simple MLT handler.  The example queries on the Wiki work fine
(fq can filter out docs).  Our build has a slight change to
QueryComponent so another query isn't done when we use localsolr+field
collapsing, but that change doesn't look like it would make a
difference.  It just conditionally sets rb.setNeedDocSet() to true or
false.

Will run some tests on a clean build of Solr to see if it's our
build.

Doug

On Mar 4, 2009, at 9:28 AM, Otis Gospodnetic wrote:

>
> Doug,
>
> does the good old 'fq' not work with MLT?  It should...
>
>
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>
>
>


Re: MoreLikeThis filtering

Posted by Otis Gospodnetic <ot...@yahoo.com>.
Doug,

does the good old 'fq' not work with MLT?  It should...


Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



----- Original Message ----
> From: Doug Steigerwald <ds...@mcclatchyinteractive.com>
> To: solr-user@lucene.apache.org
> Sent: Wednesday, March 4, 2009 9:20:40 AM
> Subject: MoreLikeThis filtering
> 
> Is it possible to filter similarities found by the MLT component/handler?  
> Something like mlt.fq=site_id:86?
> 
> We have 32 cores in our Solr install, and some of those cores have up to 8 sites 
> indexed in them.  Typically those cores will have one very large site with a few 
> hundred thousand indexed documents, and lots of small sites with significantly 
> fewer documents indexed.
> 
> We're looking to implement a MLT component for our sites but want the similar 
> stories to be only for a specific site (not all sites in the core).
> 
> Is there a way to do something like this, or will we have to make mods (I'm not 
> seeing anything jump out at me in the Solr 1.3.0 or Lucene 2.4.0 code)?
> 
> /solr/dsteiger/mlt?q=story_id:188665+AND+site_id:86&mlt.fq=site_id:86
> 
> (We have all of our other defaults set up in the handler config.)
> 
> Thanks.
> ---
> Doug Steigerwald
> Software Developer
> McClatchy Interactive
> dsteigerwald@mcclatchyinteractive.com