Posted to solr-user@lucene.apache.org by Doug Turnbull <dt...@opensourceconnections.com> on 2018/08/31 18:36:01 UTC

MLT in Cloud Mode - Not Returning Fields?

Hello,

We're working on a Solr More Like This project (Solr 6.6.2), using the More
Like This searchComponent. We've noticed that in standalone Solr, when we
request MLT through the search component, every More Like This document
comes back fully formed, with complete fields, in the moreLikeThis section.

In cloud mode, however, with the exact same query and config, we only get the
doc ids under "moreLikeThis", which requires us to fetch the fields for
each document separately.
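
For readers who want to reproduce the comparison, here is a minimal sketch of the kind of request involved. It only builds the query string; the field names (title, body) and the seed query are hypothetical placeholders, not our actual config. The parameters mlt, mlt.fl, mlt.count and fl are the standard MoreLikeThis component and common query parameters.

```python
from urllib.parse import urlencode


def build_mlt_params(query, mlt_fields, return_fields, count=5):
    """Build the query string for a /select request with the MLT component on.

    mlt_fields feed mlt.fl (fields mined for similar terms);
    return_fields feed fl (fields we expect back for each document).
    """
    return urlencode({
        "q": query,
        "mlt": "true",
        "mlt.fl": ",".join(mlt_fields),
        "mlt.count": count,
        "fl": ",".join(return_fields),
        "wt": "json",
    })


# Hypothetical seed document and field names, for illustration only.
params = build_mlt_params("id:doc1", ["title", "body"], ["id", "title"])
print(params)
```

Sending the same query string to a standalone core and to a SolrCloud collection is what shows the difference described above.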

I can't easily share an example due to confidentiality, but I want to check
whether we're missing something. The documentation doesn't mention any
limitations. The only relevant note I've found is this one, which points to a
potential difference in behavior:

> The Cloud MLT Query Parser uses the realtime get handler to retrieve the
> fields to be mined for keywords. Because of the way the realtime get
> handler is implemented, it does not return data for fields populated using
> copyField.

https://stackoverflow.com/a/46307140/8123

Any thoughts?

-Doug
-- 
CTO, OpenSource Connections
Author, Relevant Search
http://o19s.com/doug

Re: MLT in Cloud Mode - Not Returning Fields?

Posted by Erick Erickson <er...@gmail.com>.
If this is still a problem in master/7x, then a JIRA is in order.
On Fri, Sep 7, 2018 at 7:40 AM Doug Turnbull
<dt...@opensourceconnections.com> wrote:
>
> Looks like this is indeed a bug
>
> My colleague debugged this behavior and it turns out that Solr only
> requests id and score from the shards, and not the user-specified field
> list. You can see that on this line
>
> https://github.com/apache/lucene-solr/blob/branch_7_4/solr/core/src/java/org/apache/solr/handler/component/MoreLikeThisComponent.java#L342
>
> Happy to create a Jira ticket
>
> -Doug
>

Re: MLT in Cloud Mode - Not Returning Fields?

Posted by Doug Turnbull <dt...@opensourceconnections.com>.
Looks like this is indeed a bug.

My colleague debugged this behavior, and it turns out that in cloud mode Solr
only requests id and score from the shards, not the user-specified field
list. You can see that on this line:

https://github.com/apache/lucene-solr/blob/branch_7_4/solr/core/src/java/org/apache/solr/handler/component/MoreLikeThisComponent.java#L342

Happy to create a Jira ticket.
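
Until that is fixed, the client-side workaround is the second fetch mentioned earlier in the thread. Below is a rough sketch of it, not a definitive implementation. It assumes the JSON response renders "moreLikeThis" as a map from seed document id to a result object with a "docs" list, and that the unique key field is named id; both are assumptions and may differ with your json.nl setting and schema. The follow-up query uses the terms query parser.

```python
def mlt_doc_ids(response):
    """Collect ids listed under moreLikeThis, de-duplicated, in first-seen order.

    Assumes moreLikeThis is a dict: {seed_id: {"docs": [{"id": ...}, ...]}}.
    """
    ids = []
    for result in response.get("moreLikeThis", {}).values():
        for doc in result.get("docs", []):
            doc_id = doc.get("id")
            if doc_id is not None and doc_id not in ids:
                ids.append(doc_id)
    return ids


def build_followup_params(ids, return_fields):
    """Parameters for a second /select request that fetches the full fields."""
    return {
        "q": "{!terms f=id}" + ",".join(ids),
        "fl": ",".join(return_fields),
        "rows": len(ids),
    }


# Hypothetical cloud-mode response: only ids (and scores) come back.
sample = {
    "moreLikeThis": {
        "doc1": {"numFound": 2, "docs": [{"id": "doc7"}, {"id": "doc9"}]},
        "doc2": {"numFound": 1, "docs": [{"id": "doc7"}]},
    }
}
ids = mlt_doc_ids(sample)
followup = build_followup_params(ids, ["id", "title"])
```

The cost is one extra round trip per search request, which is why a fix in the component itself would be preferable.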

-Doug

-- 
CTO, OpenSource Connections
Author, Relevant Search
http://o19s.com/doug

Re: MLT in Cloud Mode - Not Returning Fields?

Posted by Doug Turnbull <dt...@opensourceconnections.com>.
Thanks Charlie, those are helpful.

I think at this point we will attach a debugger and see what shakes out.
Perhaps it's one of these cases you list. Perhaps we're missing something.
We'll report back.

-Doug

On Mon, Sep 3, 2018 at 5:23 AM Charlie Hull <ch...@flax.co.uk> wrote:

> Hey Doug,
>
> IIRC there wasn't a lot of support for MLT in cloud mode a few years
> ago, and there are certainly still a few open issues around cloud support:
> https://issues.apache.org/jira/browse/SOLR-4414
> https://issues.apache.org/jira/browse/SOLR-5480
> Maybe there are some hints in the ticket comments about different ways
> to do what you want.
>
> Cheers
>
> Charlie
-- 
CTO, OpenSource Connections
Author, Relevant Search
http://o19s.com/doug

Re: MLT in Cloud Mode - Not Returning Fields?

Posted by Charlie Hull <ch...@flax.co.uk>.
On 31/08/2018 19:36, Doug Turnbull wrote:
> Hello,
> 
> We're working on a Solr More Like This project (Solr 6.6.2), using the More
> Like This searchComponent. What we note is in standalone Solr, when we
> request MLT using the search component, we get every more like this
> document fully formed with complete fields in the moreLikeThis section.

Hey Doug,

IIRC there wasn't a lot of support for MLT in cloud mode a few years 
ago, and there are certainly still a few open issues around cloud support:
https://issues.apache.org/jira/browse/SOLR-4414
https://issues.apache.org/jira/browse/SOLR-5480
Maybe there are some hints in the ticket comments about different ways 
to do what you want.

Cheers

Charlie



-- 
Charlie Hull
Flax - Open Source Enterprise Search

tel/fax: +44 (0)8700 118334
mobile:  +44 (0)7767 825828
web: www.flax.co.uk