Posted to dev@lucene.apache.org by "Joel Bernstein (JIRA)" <ji...@apache.org> on 2015/12/11 03:31:11 UTC
[jira] [Comment Edited] (SOLR-8337) Add ReduceOperation and wire it
into the ReducerStream
[ https://issues.apache.org/jira/browse/SOLR-8337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15025241#comment-15025241 ]
Joel Bernstein edited comment on SOLR-8337 at 12/11/15 2:30 AM:
----------------------------------------------------------------
Patch adds a single *reduce()* method for the ReduceOperation that returns a single Tuple, which is the final reduction.
The *operate(Tuple)* method will be called for each Tuple that is read by the *ReducerStream*.
The reduce() method will be called each time the group-by key changes. This gives the ReduceOperation a chance to finish the reduce algorithm and return a single Tuple. The ReduceOperation will also clear its internal memory after each call to reduce() to prepare for the next Tuple grouping.
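The calling pattern described above can be sketched roughly as follows. This is not the code from the patch: the ReduceOperation interface here is a hypothetical stand-in whose tuples are plain java.util.Map instances rather than Solr's Tuple class, and CountOperation, run(), tuple() and the "count" field name are invented for illustration. The sketch assumes the input is already sorted on the group-by field, so a key change marks the end of a group.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

class ReduceLoopSketch {

    // Hypothetical stand-in for the patch's ReduceOperation; tuples are plain maps.
    interface ReduceOperation {
        void operate(Map<String, Object> tuple);   // called once per tuple in a group
        Map<String, Object> reduce();              // called when the group ends
    }

    // Illustrative operation: counts the tuples in the current group.
    static class CountOperation implements ReduceOperation {
        private final String keyField;
        private Object key;
        private long count;

        CountOperation(String keyField) {
            this.keyField = keyField;
        }

        @Override
        public void operate(Map<String, Object> tuple) {
            key = tuple.get(keyField);
            count++;
        }

        @Override
        public Map<String, Object> reduce() {
            Map<String, Object> out = new HashMap<>();
            out.put(keyField, key);
            out.put("count", count);
            // Clear internal state so the next group starts fresh.
            key = null;
            count = 0;
            return out;
        }
    }

    // Drives the operation over input that is assumed to be sorted on keyField.
    static List<Map<String, Object>> run(List<Map<String, Object>> sorted,
                                         String keyField,
                                         ReduceOperation op) {
        List<Map<String, Object>> out = new ArrayList<>();
        Object currentKey = null;
        boolean inGroup = false;
        for (Map<String, Object> tuple : sorted) {
            Object key = tuple.get(keyField);
            if (inGroup && !Objects.equals(key, currentKey)) {
                out.add(op.reduce());   // key changed: emit the finished group
            }
            op.operate(tuple);
            currentKey = key;
            inGroup = true;
        }
        if (inGroup) {
            out.add(op.reduce());       // emit the last group at end of input
        }
        return out;
    }

    // Small helper for building a one-field tuple.
    static Map<String, Object> tuple(String field, Object value) {
        Map<String, Object> t = new HashMap<>();
        t.put(field, value);
        return t;
    }
}
```

Note that the final group has no following key change, so the loop in this sketch calls reduce() once more when the input is exhausted.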
> Add ReduceOperation and wire it into the ReducerStream
> ------------------------------------------------------
>
> Key: SOLR-8337
> URL: https://issues.apache.org/jira/browse/SOLR-8337
> Project: Solr
> Issue Type: Bug
> Reporter: Joel Bernstein
> Attachments: SOLR-8337.patch, SOLR-8337.patch, SOLR-8337.patch, SOLR-8337.patch, SOLR-8337.patch
>
>
> The current ReducerStream groups all documents that share the same key(s) into a list and emits a single Tuple that contains this list. There is no way to tell the ReducerStream to do something more interesting with groups, for example summing a column within a group, or joining tuples.
> This ticket adds a new type of operation called a ReduceOperation which is passed to the ReducerStream so that the reduce behavior can be specialized.
> The ReduceOperation has two methods:
> 1) operate(Tuple) : This is called once for each Tuple in a group. This method can be used to aggregate Tuples as they are added to a group.
> 2) reduce() : This is called when the group keys change. This method returns a single Tuple which is output by the ReducerStream. The ReduceOperation must also clear its internal structures when reduce() is called, to prepare for the next group.
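As an illustration of the "summing a column within a group" case mentioned in the description, a summing operation might look something like the sketch below. It is not taken from the attached patches: the interface is a hypothetical stand-in that uses plain java.util.Map tuples instead of Solr's Tuple class, and SumOperation, tuple(), and the "sum" output field are invented names.

```java
import java.util.HashMap;
import java.util.Map;

class SumOperationSketch {

    // Hypothetical stand-in for the two-method contract described in the ticket.
    interface ReduceOperation {
        void operate(Map<String, Object> tuple);
        Map<String, Object> reduce();
    }

    // Sums a numeric field across all tuples of the current group.
    static class SumOperation implements ReduceOperation {
        private final String keyField;
        private final String valueField;
        private Object key;
        private double sum;

        SumOperation(String keyField, String valueField) {
            this.keyField = keyField;
            this.valueField = valueField;
        }

        @Override
        public void operate(Map<String, Object> tuple) {
            // Aggregate incrementally instead of buffering the whole group.
            key = tuple.get(keyField);
            sum += ((Number) tuple.get(valueField)).doubleValue();
        }

        @Override
        public Map<String, Object> reduce() {
            Map<String, Object> out = new HashMap<>();
            out.put(keyField, key);
            out.put("sum", sum);
            // Reset internal structures to prepare for the next group.
            key = null;
            sum = 0.0;
            return out;
        }
    }

    // Small helper for building a two-field tuple.
    static Map<String, Object> tuple(String keyField, Object key,
                                     String valueField, Number value) {
        Map<String, Object> t = new HashMap<>();
        t.put(keyField, key);
        t.put(valueField, value);
        return t;
    }
}
```

Because operate() folds each value into a running total, this kind of operation only needs constant memory per group, in contrast to collecting every Tuple of the group into a list.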