Posted to issues@lucene.apache.org by "Jan Høydahl (Jira)" <ji...@apache.org> on 2019/10/21 08:21:00 UTC
[jira] [Commented] (SOLR-13125) Optimize Queries when sorting by router.field

    [ https://issues.apache.org/jira/browse/SOLR-13125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16955861#comment-16955861 ] 

Jan Høydahl commented on SOLR-13125:
------------------------------------

Hi, following up on this. [~gus] I have not yet looked into the code regarding a new SearchComponent hook or some other design. I think mosh may be correct that it would be better to decouple this from SearchHandler and make it more of a TRA feature.

> Optimize Queries when sorting by router.field
> ---------------------------------------------
>
>                 Key: SOLR-13125
>                 URL: https://issues.apache.org/jira/browse/SOLR-13125
>             Project: Solr
>          Issue Type: Sub-task
>            Reporter: mosh
>            Assignee: Gus Heck
>            Priority: Minor
>         Attachments: SOLR-13125-no-commit.patch, SOLR-13125.patch, SOLR-13125.patch, SOLR-13125.patch
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> We are currently testing TRA on Solr 7.7, with more than 300 shards in the alias and much more growth expected in the coming months.
> The "hot" data (in our case, the more recent data) will be stored on stronger nodes (SSD, more RAM, etc.).
> A proposal has emerged to optimize queries sorted by router.field (the field TRA uses to route data to the correct collection).
> Perhaps, for queries sorted by router.field, Solr could be smart enough to wait for the more recent collections first and, once the limit is reached, cancel the other queries (or simply not block and wait for their results)?
> For example:
> Consider querying a TRA with a filter on a field other than router.field, while sorting by router.field desc with limit=100.
> Since this is a TRA, Solr will issue queries to all the collections in the alias.
> But to optimize this particular type of query, Solr could wait for the most recent collection in the TRA and check whether its result set meets or exceeds the limit. If so, the response could be returned to the user without waiting for the rest of the shards. If not, the issuing node would block until the second query returns, and so forth, until the limit of the request is reached.
> This might also be useful for deep paging, querying each collection and only skipping to the next once there are no more results in the specified collection.
> Thoughts or inputs are always welcome.
> This is just my two cents, and I'm always happy to brainstorm.
> Thanks in advance.
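
To make the proposal in the description concrete, here is a minimal Python sketch of the "newest collection first, stop once the limit is reached" idea. It is not Solr code and does not reflect any of the attached patches. The function name `query_newest_first`, the `fetch` callback, and the `ts` field in the usage example are all hypothetical. The sketch assumes that collections are passed in ordered from most recent to oldest, that each collection returns its matches already sorted by router.field descending, and that Python 3.9+ is available (for `cancel_futures`).

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence


def query_newest_first(
    collections: Sequence[str],
    fetch: Callable[[str, int], List[dict]],
    limit: int,
    max_workers: int = 4,
) -> List[dict]:
    """Collect up to `limit` docs, consulting collections newest-first.

    Assumptions of this sketch: `collections` is ordered from most recent
    to oldest, and `fetch(collection, rows)` returns that collection's
    matches sorted by the router field, descending.
    """
    results: List[dict] = []
    pool = ThreadPoolExecutor(max_workers=max_workers)
    try:
        # Fan out to every collection, as a query against the alias would.
        futures = [pool.submit(fetch, name, limit) for name in collections]
        for future in futures:
            # Block only on the next-most-recent collection.
            results.extend(future.result())
            if len(results) >= limit:
                # Older collections hold only older documents, so they
                # cannot change the top `limit` results; stop waiting.
                break
    finally:
        # Drop queued requests and return without waiting for running
        # ones (already-running requests are not interrupted).
        pool.shutdown(wait=False, cancel_futures=True)
    return results[:limit]
```

A usage example with in-memory stand-ins for three time-routed collections:

```python
data = {
    "c3": [{"ts": 9}, {"ts": 8}],
    "c2": [{"ts": 6}, {"ts": 5}],
    "c1": [{"ts": 2}],
}
top = query_newest_first(["c3", "c2", "c1"], lambda name, rows: data[name][:rows], limit=3)
```

Concatenating per-collection results is only valid because a time-routed alias partitions documents by router.field, so every document in a newer collection sorts ahead of every document in an older one.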


