Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/12/12 00:16:12 UTC

[GitHub] [iceberg] rdblue commented on pull request #6405: API: Add Aggregate expression evaluation

rdblue commented on PR #6405:
URL: https://github.com/apache/iceberg/pull/6405#issuecomment-1345703757

   @huaxingao, I was looking at #6252 and wanted to try implementing aggregation in the core or API modules, so that most of the logic can be shared instead of being reimplemented in every processing engine.
   
   Could you please take a look at this and see if it seems reasonable?
   
   The basic idea is to use `BoundAggregate` to do two things:
   1. Extract a value to aggregate in `eval(StructLike)` or `eval(DataFile)`, which is similar to how `eval` is used for other expressions
   2. Create an `Aggregator` that keeps track of the aggregate state
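
The two roles above can be illustrated with a rough, self-contained sketch. Everything in it (`BoundAggregateSketch`, `BoundMax`, `FileStats`, the map-based rows) is a hypothetical stand-in chosen for illustration, not the classes or signatures in this PR, and it ignores cases such as a file with no upper bound for the column.

```java
import java.util.Map;

// Hypothetical stand-ins only; these are NOT the Iceberg classes from the PR.
final class BoundAggregateSketch {
  /** Running state for a single aggregate. */
  interface Aggregator<T> {
    void update(T value);

    T result();
  }

  /** Stand-in for data file metadata: per-column upper bounds. */
  static final class FileStats {
    final Map<String, Integer> upperBounds;

    FileStats(Map<String, Integer> upperBounds) {
      this.upperBounds = upperBounds;
    }
  }

  /** A max aggregate already bound to one int column. */
  static final class BoundMax {
    private final String column;

    BoundMax(String column) {
      this.column = column;
    }

    // Role 1: extract the value to aggregate, from a row...
    Integer eval(Map<String, Integer> row) {
      return row.get(column);
    }

    // ...or from file-level metadata, without reading any rows.
    Integer eval(FileStats file) {
      return file.upperBounds.get(column);
    }

    // Role 2: create the object that tracks the aggregate state.
    Aggregator<Integer> newAggregator() {
      return new Aggregator<Integer>() {
        private Integer max = null;

        @Override
        public void update(Integer value) {
          // Assumption: null values are skipped.
          if (value != null && (max == null || value > max)) {
            max = value;
          }
        }

        @Override
        public Integer result() {
          return max;
        }
      };
    }
  }
}
```

In this sketch the same aggregator accepts values from either `eval` overload, so row-level and file-level inputs feed one piece of state.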
   
   This also adds `AggregateEvaluator`, which operates on a list of aggregate expressions:
   * `aggEval = AggregateEvaluator.create(tableSchema, expressions)` binds the expressions and creates aggregators for each one
   * `aggEval.update(StructLike)` and `aggEval.update(DataFile)` update each expression's aggregator
   * `aggEval.result()` returns a `StructLike` with the aggregated values
   * `aggEval.resultType()` returns a `StructType` for the aggregated values
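
The create/update/result flow listed above could look roughly like the following self-contained sketch. The names `AggregateEvaluatorSketch`, `Agg`, and `Op` are hypothetical, the rows are plain maps of `long` values, and binding is reduced to a column-name check; it leaves out `resultType()` and the `DataFile` path, and it is not the implementation in this PR.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in; NOT the AggregateEvaluator from the PR.
final class AggregateEvaluatorSketch {
  enum Op { COUNT, MAX }

  /** Unbound aggregate expression: an operation plus a column name. */
  static final class Agg {
    final Op op;
    final String column;

    Agg(Op op, String column) {
      this.op = op;
      this.column = column;
    }
  }

  private final List<Agg> aggs;
  private final Long[] state; // one slot of aggregate state per expression

  private AggregateEvaluatorSketch(List<Agg> aggs) {
    this.aggs = aggs;
    this.state = new Long[aggs.size()];
    for (int i = 0; i < state.length; i++) {
      // COUNT starts at 0; MAX has no value until it sees one.
      state[i] = aggs.get(i).op == Op.COUNT ? Long.valueOf(0L) : null;
    }
  }

  /** "Binds" each expression by checking that its column exists in the schema. */
  static AggregateEvaluatorSketch create(List<String> schemaColumns, List<Agg> expressions) {
    for (Agg agg : expressions) {
      if (!schemaColumns.contains(agg.column)) {
        throw new IllegalArgumentException("Cannot bind column: " + agg.column);
      }
    }
    return new AggregateEvaluatorSketch(new ArrayList<>(expressions));
  }

  /** Feeds one row to every aggregate. */
  void update(Map<String, Long> row) {
    for (int i = 0; i < aggs.size(); i++) {
      Long value = row.get(aggs.get(i).column);
      if (value == null) {
        continue; // assumption: nulls and missing values are ignored
      }
      switch (aggs.get(i).op) {
        case COUNT:
          state[i] = state[i] + 1;
          break;
        case MAX:
          state[i] = state[i] == null ? value : Math.max(state[i], value);
          break;
      }
    }
  }

  /** Returns the aggregated values, one per expression, in expression order. */
  List<Long> result() {
    return Arrays.asList(state.clone());
  }
}
```

A caller would build the evaluator once, call `update` for each row, and read `result()` at the end, which is the part an engine would otherwise have to reimplement.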
   
   This is based on #6252, but tries to keep as much logic as possible in core/API. What do you think? Could we incorporate this into #6252?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
For additional commands, e-mail: issues-help@iceberg.apache.org