Posted to issues@lucene.apache.org by "Gus Heck (Jira)" <ji...@apache.org> on 2020/09/23 14:13:00 UTC

[jira] [Commented] (SOLR-8281) Add RollupMergeStream to Streaming API

    [ https://issues.apache.org/jira/browse/SOLR-8281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17200848#comment-17200848 ] 

Gus Heck commented on SOLR-8281:
--------------------------------

This seems related to something I wanted to do for a client. I had a reduce with group(), and I wanted to feed each group to an arbitrary streaming expression for further processing and have the result show up in the groups (the result would have been a matrix). The problem I got stuck on was how to express the stream that processes the group without giving it a source of its own (the source is the group).

> Add RollupMergeStream to Streaming API
> --------------------------------------
>
>                 Key: SOLR-8281
>                 URL: https://issues.apache.org/jira/browse/SOLR-8281
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Joel Bernstein
>            Assignee: Joel Bernstein
>            Priority: Major
>
> The RollupMergeStream merges the aggregate results emitted by the RollupStream on *worker* nodes.
> This is designed to be used in conjunction with the HashJoinStream to perform rollup Aggregations on the joined Tuples. The HashJoinStream will require the tuples to be partitioned on the Join keys. To avoid needing to repartition on the *group by* fields for the RollupStream, we can perform a merge of the rolled up Tuples coming from the workers.
> The construct would look like this:
> {code}
> mergeRollup(...
>     parallel(...
>         rollup(...
>             hashJoin(
>                 search(...),
>                 search(...),
>                 on="fieldA"
>             )
>         )
>     )
> )
> {code}
> The pseudo code above would push the *hashJoin* and *rollup* to the *worker* nodes. The emitted rolled up tuples would be merged by the mergeRollup.
>  
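
As a rough illustration of the merge step described in the issue (not Solr code), the sketch below combines partial aggregates for the same group that arrive from different workers. Everything in it is an assumption made for illustration: the function name `merge_rollups`, the tuple field names (`fieldB`, `count`, `sum`, `min`, `max`), and the choice of metrics. It relies on count, sum, min and max being mergeable from partial results, while an average has to be derived from the merged sum and count rather than by averaging the per-worker averages.

```python
def merge_rollups(worker_outputs, group_by):
    """Merge partial aggregates emitted by several workers.

    worker_outputs: list of lists of dicts, one inner list per worker.
    group_by: field names that identify a group (hypothetical names).
    Each dict is assumed to carry 'count', 'sum', 'min' and 'max'.
    """
    merged = {}
    for tuples in worker_outputs:
        for t in tuples:
            key = tuple(t[f] for f in group_by)
            acc = merged.get(key)
            if acc is None:
                # First time this group is seen: copy so inputs stay untouched.
                merged[key] = dict(t)
                continue
            # The same group was rolled up on another worker: combine partials.
            acc["count"] += t["count"]
            acc["sum"] += t["sum"]
            acc["min"] = min(acc["min"], t["min"])
            acc["max"] = max(acc["max"], t["max"])
    # Return groups in sorted key order for a deterministic result.
    return [merged[k] for k in sorted(merged)]


# Tuples were partitioned on the join key, so group "x" shows up on both workers.
worker_1 = [
    {"fieldB": "x", "count": 2, "sum": 10, "min": 3, "max": 7},
    {"fieldB": "y", "count": 1, "sum": 4, "min": 4, "max": 4},
]
worker_2 = [
    {"fieldB": "x", "count": 3, "sum": 9, "min": 1, "max": 5},
]

for row in merge_rollups([worker_1, worker_2], ["fieldB"]):
    # The average is computed only after merging sum and count.
    print(row, "avg =", row["sum"] / row["count"])
```

Because the workers are partitioned on the join key rather than on the group-by fields, the same group can be emitted by more than one worker. That is the case the merge handles without a second repartitioning pass.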


