Posted to solr-user@lucene.apache.org by uyilmaz <uy...@vivaldi.net.INVALID> on 2020/10/13 21:38:05 UTC

Strange fetch streaming expression doesn't fetch fields sometimes?

Hi all,

I have a streaming expression that looks like this:

fetch(
  myAlias,
  top(
    n=3,
    ....various expressions here
    sort="count(*) desc"
  ),
  fl="username", on="userid=userid", batchSize=3
)

It fails to fetch the username field for the first result:

{
 "result-set":{
  "docs":[{
    "userid":"123123",
    "count(*)":58}
   ,{
    "userid":"123123123",
    "count(*)":32,
    "username":"Ayha"}
   ,{
    "userid":"12432423321323",
    "count(*)":30,
    "username":"MEHM"}
   ,{
    "EOF":true,
    "RESPONSE_TIME":34889}]}}
	
Strangely, when I change both n and batchSize to 2 and touch nothing else, fetch returns the first username correctly:

fetch(
  myAlias,
  top(
    n=2,
    ....various expressions here
    sort="count(*) desc"
  ),
  fl="username", on="userid=userid", batchSize=2
)

Result is:

{
 "result-set":{
  "docs":[{
    "userid":"123123",
    "count(*)":58,
    "username":"mura"}
   ,{
    "userid":"123123123",
    "count(*)":32,
    "username":"Ayha"}
   ,{
    "EOF":true,
    "RESPONSE_TIME":34889}]}}
	
What could the problem be?

Regards

~~ufuk

-- 
uyilmaz <uy...@vivaldi.net>

Re: Strange fetch streaming expression doesn't fetch fields sometimes?

Posted by uyilmaz <uy...@vivaldi.net.INVALID>.
Is it possible to duplicate its functionality using existing expressions?

In SQL, when grouping you can just say first(column) to get a value from a one-to-many relation if you don't care which one you get. Solr usually has only min, max, avg and similar aggregation functions. If it had a "first" function, I could just get userid and first(username) in a single expression. I sometimes use min(username) as a trick while faceting to get extra fields alongside the faceted results, but in streaming expressions max and min only accept numbers.
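
To make the idea concrete, here is a minimal Python sketch of what such a "first" aggregation would do over a list of documents. This is only an illustration: first_per_group and the sample documents are hypothetical, and it is not an existing Solr function.

```python
def first_per_group(docs, key, field):
    """For each distinct docs[key], keep the first value of `field` seen."""
    firsts = {}
    for doc in docs:
        if field in doc:
            # setdefault keeps the earliest value and ignores later ones
            firsts.setdefault(doc[key], doc[field])
    return firsts


# Hypothetical documents: several per userid (one-to-many)
docs = [
    {"userid": "123123", "username": "mura"},
    {"userid": "123123123", "username": "Ayha"},
    {"userid": "123123", "username": "mura"},
]
usernames = first_per_group(docs, key="userid", field="username")
```

Which value counts as "first" depends purely on the order the documents arrive in, which is fine when every document of a userid carries the same username.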

On Wed, 14 Oct 2020 20:47:28 -0400
Joel Bernstein <jo...@gmail.com> wrote:

> Yes, the docs mention one-to-one and many-to-one fetches, but one-to-many
> is not supported currently. I've never really been happy with fetch. It
> really needs to be replaced with a standard nested loop join that handles
> all scenarios.


-- 
uyilmaz <uy...@vivaldi.net>

Re: Strange fetch streaming expression doesn't fetch fields sometimes?

Posted by Joel Bernstein <jo...@gmail.com>.
Yes, the docs mention one-to-one and many-to-one fetches, but one-to-many
is not supported currently. I've never really been happy with fetch. It
really needs to be replaced with a standard nested loop join that handles
all scenarios.
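
As a rough illustration of the general technique (not Solr code, and not a proposed implementation), a nested loop join over two in-memory lists of tuples can be sketched in Python like this. The function name, its parameters, and the choice to pass unmatched left tuples through unchanged are all assumptions made for the example.

```python
def nested_loop_join(left, right, left_key, right_key, fields):
    """Join every left tuple with every matching right tuple.

    Emits one output tuple per match, so one-to-many relations produce
    several outputs. Left tuples without a match are passed through
    unchanged (an assumed left-outer behaviour).
    """
    for l in left:
        matched = False
        for r in right:
            if l[left_key] == r.get(right_key):
                matched = True
                extra = {f: r[f] for f in fields if f in r}
                yield {**l, **extra}
        if not matched:
            yield dict(l)


# Hypothetical sample data
left = [{"userid": "1", "count(*)": 2}, {"userid": "2", "count(*)": 1}]
right = [
    {"userid": "1", "username": "a"},
    {"userid": "1", "username": "b"},
]
joined = list(nested_loop_join(left, right, "userid", "userid", ["username"]))
```

The inner loop scans the whole right side for every left tuple, so the cost grows with the product of the two sizes; the point of the sketch is only that no match can be lost to a row limit.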


Joel Bernstein
http://joelsolr.blogspot.com/


On Tue, Oct 13, 2020 at 6:30 PM uyilmaz <uy...@vivaldi.net.invalid> wrote:

> I think I found the reason right after asking (facepalm), but it took me
> days to realize this.
>
> I think fetch performs a naive "in" query, something like:
>
> q="userid:(123123 123123123 12432423321323)&rows={batchSize}"
>
> When userid to document relation is one-to-many, it is possible that above
> query will result in documents consisting entirely of last two userid's
> documents, so the first one is left out, resulting in empty username. Docs
> state that one to many is not supported with fetch, but I didn't stumble
> onto this issue until recently so I just assumed it would work.
>
> Sorry to take your time, I hope this helps somebody later.
>
> Have a nice day.

Re: Strange fetch streaming expression doesn't fetch fields sometimes?

Posted by uyilmaz <uy...@vivaldi.net.INVALID>.
I think I found the reason right after asking (facepalm), but it took me days to realize this.

I think fetch performs a naive "in" query, something like:

q="userid:(123123 123123123 12432423321323)&rows={batchSize}"

When the userid-to-document relation is one-to-many, the query above can return documents that belong entirely to the last two userids, so the first one is left out and its username comes back empty. The docs state that one-to-many is not supported with fetch, but I hadn't run into this issue until recently, so I just assumed it would work.
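
A small Python model of this guess reproduces both results from my first mail. It only simulates the behaviour I suspect (an "in" query capped at batchSize rows); it is not taken from the Solr source, and the index contents, their order, and the function names are made up for the example.

```python
# Hypothetical index: the userid 12432423321323 has two documents (one-to-many)
INDEX = [
    {"userid": "123123123", "username": "Ayha"},
    {"userid": "12432423321323", "username": "MEHM"},
    {"userid": "12432423321323", "username": "MEHM"},
    {"userid": "123123", "username": "mura"},
]


def in_query(index, userids, rows):
    """Return at most `rows` documents whose userid is in `userids`."""
    wanted = set(userids)
    hits = [doc for doc in index if doc["userid"] in wanted]
    return hits[:rows]


def fetch_usernames(index, tuples, batch_size):
    """Attach a username to each tuple, looking up one batch at a time."""
    out = []
    for start in range(0, len(tuples), batch_size):
        batch = tuples[start:start + batch_size]
        hits = in_query(index, [t["userid"] for t in batch], rows=batch_size)
        by_id = {doc["userid"]: doc["username"] for doc in hits}
        for t in batch:
            joined = dict(t)
            if t["userid"] in by_id:
                joined["username"] = by_id[t["userid"]]
            out.append(joined)
    return out


top3 = [
    {"userid": "123123", "count(*)": 58},
    {"userid": "123123123", "count(*)": 32},
    {"userid": "12432423321323", "count(*)": 30},
]
with_three = fetch_usernames(INDEX, top3, batch_size=3)
with_two = fetch_usernames(INDEX, top3[:2], batch_size=2)
```

With three userids, the three returned rows are used up by the last two userids, so 123123 gets no username. With two userids, the duplicate documents are no longer part of the query and both usernames fit within the two rows.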

Sorry for taking up your time. I hope this helps somebody later.

Have a nice day.


-- 
uyilmaz <uy...@vivaldi.net>