
Posted to solr-user@lucene.apache.org by Shamik Bandopadhyay <sh...@gmail.com> on 2016/04/06 01:29:05 UTC

MLT Query Parser

Hi,

  I'm trying to use the new MLT query parser in SolrCloud mode. As per
the documentation, the syntax is:

{!mlt qf=name}1

where "1" is the id.

What I'm trying to understand is whether "id" is a mandatory field for
making this work. Right now, I'm getting mlt documents based on a "keyword"
field. With the new query parser, I can't see a way to use any field
other than id. Is this a constraint, or is there a different syntax?

Any pointers will be appreciated.

Thanks,
Shamik

Re: MLT Query Parser

Posted by shamik <sh...@gmail.com>.

Thanks Shawn and Alessandro. I get why the id is needed. I was trying to
compare with the "mlt" request handler, which doesn't enforce such a
constraint. My previous example of title/keyword was not the right one, but I
do have fields which are unique to each document and can be used as a key to
extract similar content. I don't think we can always have a handle on the
document id in every scenario. In my case, it's a composite id and I don't
pass it back and forth as part of search results. For example, when I'm trying
to get similar content for a specific forum thread, I could very well use
the threadId field stored in Solr (unique to each document) to generate
similar content. This works great using the "mlt" request handler. I was
expecting the query parser to have a similar capability.
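
One possible client-side workaround, sketched here under stated assumptions
rather than as a supported feature: resolve the unique field to the document
id with an ordinary query first, then pass that id to the MLT query parser.
The helper names, the uniqueKey field name "id", the composite id
"forum!42" and the fake_search stand-in are all hypothetical; only the
{!term} and {!mlt} parser syntax comes from Solr itself.

```python
def lookup_params(key_field, key_value, id_field="id"):
    """Step 1: parameters for finding the uniqueKey of the document
    whose key_field holds key_value (exact term match, no escaping needed)."""
    return {"q": "{!term f=%s}%s" % (key_field, key_value),
            "fl": id_field,
            "rows": 1}


def mlt_by_key(search, key_field, key_value, similarity_field, id_field="id"):
    """Step 2: feed the id found in step 1 to the MLT query parser.

    `search` is any callable that takes a params dict and returns a list of
    document dicts; a real client would send the params to a search handler.
    """
    docs = search(lookup_params(key_field, key_value, id_field))
    if not docs:
        return []  # no document carries that key, so nothing to compare with
    mlt_query = "{!mlt qf=%s}%s" % (similarity_field, docs[0][id_field])
    return search({"q": mlt_query})


def fake_search(params):
    """Offline stand-in for a Solr client (assumption, for illustration)."""
    print("would send:", params["q"])
    if params["q"].startswith("{!term"):
        return [{"id": "forum!42"}]  # hypothetical composite id
    return []


mlt_by_key(fake_search, "threadId", "42", "keyword")
```

This costs one extra round trip per request, but the composite id never has
to travel through the search results shown to the user.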





Re: MLT Query Parser

Posted by Alessandro Benedetti <ab...@apache.org>.

As Shawn correctly stated, I see nothing strange in the input to More Like
This. To be honest, I could not think of a better input for this functionality.
You can potentially attach the MLT component to your search request
handler to run MLT on each search result, but I would discourage that;
I think it is better to run MLT only when needed.

Can you explain your use case to us? Maybe, as Shawn observed, you don't need
MLT at all.

Cheers

On Wed, Apr 6, 2016 at 6:07 PM, shamik <sh...@gmail.com> wrote:

> Thanks Alessandro, that answers my doubt. in a nutshell, to make MLT Query
> parser work, you need to know the document id. I'm just curious as why this
> constraint has been added. This will not work for a bulk of use cases. For
> e.g. if we are trying to generate MLT based on a text or a keyword, how
> would I ever use this API ? My initial impression was that this was
> designed
> to work on a distributed mode.
>
> Now, this adds up a follow-up question as in which one is the right
> approach
> in a solr cloud mode. "mlt" request handler is off the equation since it's
> not supported. That leaves with MoreLikeThisComponent which has a known
> issue with performance. Is that the only available solution then ?
>
>
>



-- 
--------------------------

Benedetti Alessandro
Visiting card : http://about.me/alessandro_benedetti

"Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?"

William Blake - Songs of Experience -1794 England

Re: MLT Query Parser

Posted by Shawn Heisey <ap...@elyograg.org>.

On 4/6/2016 11:07 AM, shamik wrote:
> Thanks Alessandro, that answers my doubt. in a nutshell, to make MLT Query
> parser work, you need to know the document id. I'm just curious as why this
> constraint has been added. This will not work for a bulk of use cases. For
> e.g. if we are trying to generate MLT based on a text or a keyword, how
> would I ever use this API ? My initial impression was that this was designed
> to work on a distributed mode.
>
> Now, this adds up a follow-up question as in which one is the right approach
> in a solr cloud mode. "mlt" request handler is off the equation since it's
> not supported. That leaves with MoreLikeThisComponent which has a known
> issue with performance. Is that the only available solution then ?

The feature "More Like This/These" is built around the premise that you
are seeing one or more existing documents in a search result, and you
want to find more documents that are very similar to those.  This is why
you need an ID -- to tell Solr which document(s) you want to use as a
basis for the query.

If you're trying to find documents by keyword/text, that's just a
regular query; you don't need MLT.

MLT was not originally designed for distributed indexes.  Distributed
MLT is a *relatively* recent addition, and when I tried it, it was very
slow.  I do not know if the feature has changed much since it was
introduced three years ago.

Thanks,
Shawn

Re: MLT Query Parser

Posted by shamik <sh...@gmail.com>.

Thanks Alessandro, that answers my question. In a nutshell, to make the MLT
query parser work, you need to know the document id. I'm just curious why
this constraint was added. It will not work for a bulk of use cases. For
example, if we are trying to generate MLT based on a text or a keyword, how
would I ever use this API? My initial impression was that this was designed
to work in distributed mode.

This raises a follow-up question: which is the right approach in SolrCloud
mode? The "mlt" request handler is out of the equation since it's not
supported. That leaves MoreLikeThisComponent, which has a known performance
issue. Is that the only available solution then?




Re: MLT Query Parser

Posted by Alessandro Benedetti <ab...@apache.org>.

Wait a second, let's avoid any confusion.
The More Like This request handler (if that is what you were using) accepts
two kinds of input:

1) the id of the document we want to find similar documents to
2) a bunch of text

Then there are many parameters that affect the MLT core.
Specifically, mlt.qf=name tells MLT to use the field "name" for the MLT
query and the input document.

Let's go back to the query parser...
Your query "{!mlt qf=name}1" means:
"give me documents similar to document 1, based only on the field "name"".

You wrote: "Right now,I'm getting mlt documents based on a "keyword" field".

In that case, I think the query you want is simply:
{!mlt qf=keyword}<id>

For the MLT query parser, the document id is the only input supported.
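
As a minimal sketch of how a client might assemble that request: the Python
below only builds and URL-encodes the parameters and does not contact Solr.
The helper name mlt_params and the rows default are illustrative
assumptions, not part of Solr; the {!mlt qf=...}<id> syntax is the one
discussed above.

```python
from urllib.parse import urlencode


def mlt_params(doc_id, field, rows=5):
    """Build request parameters for the MLT query parser.

    doc_id is the uniqueKey value of the source document; field is the
    field to base similarity on (the parser's qf local parameter).
    """
    return {"q": "{!mlt qf=%s}%s" % (field, doc_id), "rows": rows}


params = mlt_params("1", "keyword")
print(params["q"])        # {!mlt qf=keyword}1
print(urlencode(params))  # URL-encoded form, ready to append to a select URL
```

URL-encoding matters here because the braces, the exclamation mark and the
space inside the local parameters are not safe to send raw in a query string.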

Cheers

On Wed, Apr 6, 2016 at 12:29 AM, Shamik Bandopadhyay <sh...@gmail.com>
wrote:

> Hi,
>
>   I'm trying to use the new MLT query parser in a SolrCloud mode. As per
> the documentation, here's the syntax,
>
> {!mlt qf=name}1
>
> where "1" is the id.
>
> What I'm trying to understand is whether "id" is a mandatory field in
> making this work? Right now, I'm getting mlt documents based on a "keyword"
> field. With the new query parser, I'm not able to see a way to use another
> field except for id. Is this a constraint? Or there's a different syntax?
>
> Any pointers will be appreciated.
>
> Thanks,
> Shamik
>
