Posted to solr-user@lucene.apache.org by Doug Steigerwald <ds...@mcclatchyinteractive.com> on 2009/03/04 15:20:40 UTC
MoreLikeThis filtering
Is it possible to filter the similar documents found by the MLT
component/handler? Something like mlt.fq=site_id:86?
We have 32 cores in our Solr install, and some of those cores have up
to 8 sites indexed in them. Typically those cores will have one very
large site with a few hundred thousand indexed documents, and lots of
small sites with significantly fewer documents indexed.
We're looking to implement an MLT component for our sites but want the
similar stories to be only for a specific site (not all sites in the
core).
Is there a way to do something like this, or will we have to make mods
(I'm not seeing anything jump out at me in the Solr 1.3.0 or Lucene
2.4.0 code)?
/solr/dsteiger/mlt?q=story_id:188665+AND+site_id:86&mlt.fq=site_id:86
(We have all of our other defaults set up in the handler config.)
Thanks.
---
Doug Steigerwald
Software Developer
McClatchy Interactive
dsteigerwald@mcclatchyinteractive.com
Re: MoreLikeThis filtering
Posted by Doug Steigerwald <ds...@mcclatchyinteractive.com>.
'fq' seems to apply only to the original query that finds the source
document, not to the similar documents that come back.
Doug
On Mar 4, 2009, at 9:28 AM, Otis Gospodnetic wrote:
>
> Doug,
>
> does the good old 'fq' not work with MLT? It should...
>
>
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
Re: MoreLikeThis filtering
Posted by Doug Steigerwald <ds...@mcclatchyinteractive.com>.
Hah. Sorry, I'm really out of it today.
The MoreLikeThisComponent doesn't seem to work for filtering using fq,
but the MoreLikeThisHandler does.
Problem solved, we'll just use the handler instead of a component.
Doug
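
For anyone following along, here is a minimal Python sketch of the handler-style request described above, using only the standard library. It assumes a MoreLikeThisHandler is mapped at /mlt on the core. The host and port are borrowed from the wiki example URL in this thread, combining them with the dsteiger core path is an assumption, and build_mlt_handler_url and the mlt.fl field names are hypothetical.

```python
from urllib.parse import urlencode


def build_mlt_handler_url(core_url, source_query, filter_query, extra_params=None):
    """Build a request URL for a MoreLikeThisHandler assumed to live at <core_url>/mlt.

    q selects the source document; fq is sent as an ordinary filter query,
    which (per this thread) the handler applies to the similar documents.
    """
    params = {"q": source_query, "fq": filter_query}
    params.update(extra_params or {})
    # Keep ":" and "," readable; spaces are still encoded as "+".
    return f"{core_url}/mlt?{urlencode(params, safe=':,')}"


url = build_mlt_handler_url(
    "http://localhost:8983/solr/dsteiger",   # assumed host + core from the thread
    "story_id:188665 AND site_id:86",
    "site_id:86",
    {"mlt.fl": "title,body"},                # hypothetical field names
)
print(url)
```

The filter goes in plain fq rather than an mlt.fq parameter, since the thread reports that the handler already honours fq for the similar documents.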
On Mar 4, 2009, at 11:02 AM, Doug Steigerwald wrote:
> Sorry. The examples on the wiki aren't working with the 'fq' to
> filter the similarities. It just filters the actual queries.
>
> http://localhost:8983/solr/mlt?q=id:SP2514N&mlt.fl=manu,cat&mlt.mindf=1&mlt.mintf=1&fq=popularity:6&mlt.displayTerms=details&mlt=true
>
> The popularity of the doc found is 6, and trying to use
> 'fq=popularity:6' brings back similarities with a popularity other
> than 6.
>
> Doug
Re: MoreLikeThis filtering
Posted by Andrew Ingram <an...@andrewingram.net>.
I posted a while back with this problem and I've finally got it working
using the following method:
In solrconfig.xml:
<requestHandler name="mlt" class="solr.MoreLikeThisHandler">
  <lst name="defaults">
    <str name="mlt.fl">id,title</str>
    <int name="mlt.mintf">0</int>
  </lst>
</requestHandler>
Then, when making the request, I do a normal search for the item with the
necessary filter query (in my case discontinued:false) and set qt=mlt to
activate the MLT handler.
So in your case:
http://localhost:8983/solr/select?q=id:SP2514N&mlt.fl=manu,cat&mlt.mindf=1&mlt.mintf=1&fq=popularity:6&mlt.displayTerms=details&mlt=true&qt=mlt
I had to use qt because I'm using a library to access Solr, and it
doesn't support selecting alternative handlers at the path level.
Regards,
Andrew Ingram
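
A rough Python sketch of the qt approach described in this message, for client code that can only talk to /select. It takes an existing select URL and sets qt to the handler name. It assumes the handler is registered under the name "mlt" as in the config above; route_to_handler is a hypothetical helper, not part of any Solr client library.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def route_to_handler(select_url, handler_name="mlt"):
    """Return select_url with qt=<handler_name>, replacing any existing qt."""
    parts = urlsplit(select_url)
    # Keep every parameter except a previous qt, preserving order.
    params = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "qt"
    ]
    params.append(("qt", handler_name))
    # Keep ":" and "," readable in the rebuilt query string.
    return urlunsplit(parts._replace(query=urlencode(params, safe=":,")))


url = route_to_handler(
    "http://localhost:8983/solr/select?q=id:SP2514N&fq=popularity:6&mlt.fl=manu,cat"
)
print(url)
```

Whether /select dispatches on qt depends on how the Solr instance is configured, so treat this as an illustration of the request shape rather than something guaranteed to work on every install.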
Doug Steigerwald wrote:
> Sorry. The examples on the wiki aren't working with the 'fq' to
> filter the similarities. It just filters the actual queries.
>
> http://localhost:8983/solr/mlt?q=id:SP2514N&mlt.fl=manu,cat&mlt.mindf=1&mlt.mintf=1&fq=popularity:6&mlt.displayTerms=details&mlt=true
>
>
> The popularity of the doc found is 6, and trying to use
> 'fq=popularity:6' brings back similarities with a popularity other
> than 6.
>
> Doug
Re: MoreLikeThis filtering
Posted by Doug Steigerwald <ds...@mcclatchyinteractive.com>.
Sorry. The examples on the wiki aren't working with the 'fq' to
filter the similarities. It just filters the actual queries.
http://localhost:8983/solr/mlt?q=id:SP2514N&mlt.fl=manu,cat&mlt.mindf=1&mlt.mintf=1&fq=popularity:6&mlt.displayTerms=details&mlt=true
The popularity of the doc found is 6, and trying to use
'fq=popularity:6' brings back similarities with a popularity other than 6.
Doug
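
One way to confirm this kind of leak on the client side is to check the returned similar documents against the filter value. A small Python sketch, assuming the response has already been parsed into a list of dicts; docs_outside_filter and the sample documents are hypothetical and not taken from real Solr output.

```python
def docs_outside_filter(docs, field, expected):
    """Return the docs whose `field` value differs from `expected`."""
    return [doc for doc in docs if doc.get(field) != expected]


# Hypothetical similar-document list, as it might look after parsing a response.
similar_docs = [
    {"id": "A", "popularity": 6},
    {"id": "B", "popularity": 1},
    {"id": "C", "popularity": 6},
]

leaks = docs_outside_filter(similar_docs, "popularity", 6)
print([doc["id"] for doc in leaks])  # ['B']
```

An empty result means every similar document satisfied the filter; anything else means fq was not applied to the similarities.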
On Mar 4, 2009, at 10:39 AM, Doug Steigerwald wrote:
> Hm. I checked out a clean Solr 1.3.0, indexed the example docs, and
> set up a simple MLT handler. The example queries on the wiki work
> fine (fq can filter out docs). Our build has a slight change to
> QueryComponent so another query isn't done when we use
> localsolr+field collapsing, but that change doesn't look like it
> would make a difference. It just conditionally sets
> rb.setNeedDocSet() to true or false.
>
> Will run some tests on a clean fresh build of Solr to see if it's
> our build.
>
> Doug
Re: MoreLikeThis filtering
Posted by Doug Steigerwald <ds...@mcclatchyinteractive.com>.
Hm. I checked out a clean Solr 1.3.0, indexed the example docs, and
set up a simple MLT handler. The example queries on the wiki work fine
(fq can filter out docs). Our build has a slight change to
QueryComponent so another query isn't done when we use localsolr+field
collapsing, but that change doesn't look like it would make a
difference. It just conditionally sets rb.setNeedDocSet() to true or
false.
Will run some tests on a clean fresh build of Solr to see if it's our
build.
Doug
On Mar 4, 2009, at 9:28 AM, Otis Gospodnetic wrote:
>
> Doug,
>
> does the good old 'fq' not work with MLT? It should...
>
>
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
Re: MoreLikeThis filtering
Posted by Otis Gospodnetic <ot...@yahoo.com>.
Doug,
does the good old 'fq' not work with MLT? It should...
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch