Posted to commits@pinot.apache.org by "somandal (via GitHub)" <gi...@apache.org> on 2023/05/08 19:52:32 UTC

[GitHub] [pinot] somandal opened a new issue, #10745: [multistage] The multistage aggregate planner and runtime do not work with all v1 aggregation functions

somandal opened a new issue, #10745:
URL: https://github.com/apache/pinot/issues/10745

   The aggregation functions supported by the v1 engine are not fully supported in the v2 engine, and we should be able to leverage them without rewriting them for v2. Today, adding support for each new aggregation function in the multistage engine requires modifying the planner and runtime code. This approach does not scale in the long term and requires detailed knowledge of the Pinot engine. We should make the multistage engine's aggregation support generic and extensible.
   
   Some of the high-level changes we need to make: 
   
   - A generic way to register all existing and new aggregate functions dynamically in the SqlOperatorTable (using annotations, etc.), by overriding getOperatorList(). The optimization rule's matches() should also have an additional check to terminate early if an aggregation function doesn’t exist. This way, when a new aggregation function is introduced in Pinot, its author need not make heavy planner and runtime changes.
   - In onMatch(), the AggregationFunction should dictate the splitting logic. The planner optimization should not be in the business of determining the logic for each and every aggregation function.
   - AggregateOperator (in the intermediate stage) should be merged with, or at least call into, AggregationOperator (in the leaf stage). The accumulate logic should be common. Row versus column processing can be handled by a wrapper that converts wherever necessary.
   - AggregateOperator should not operate on RexExpression; expressions should be converted to ExpressionContext. The leaf AggregationFunction also currently takes in the QueryContext, which should be avoided as well.
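
The first two points above can be illustrated with a minimal, self-contained sketch. It does not use Calcite or Pinot classes: `SplittableAggregation`, `SimpleAggregation`, and `AggregationRegistry` are hypothetical names introduced only for illustration. The assumption is that each function declares which function runs in the leaf stage and which one merges partial results, so planner-like code can stay generic and bail out early for unregistered functions.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

/** Hypothetical contract: the function itself describes its leaf/intermediate split. */
interface SplittableAggregation {
  String name();

  /** Function evaluated in the leaf stage. */
  String leafFunction();

  /** Function that merges leaf results in the intermediate stage. */
  String mergeFunction();
}

/** Hypothetical value class so that new functions can be declared in one line. */
final class SimpleAggregation implements SplittableAggregation {
  private final String name;
  private final String leaf;
  private final String merge;

  SimpleAggregation(String name, String leaf, String merge) {
    this.name = name;
    this.leaf = leaf;
    this.merge = merge;
  }

  @Override public String name() { return name; }
  @Override public String leafFunction() { return leaf; }
  @Override public String mergeFunction() { return merge; }
}

/** Hypothetical registry standing in for a dynamically populated operator table. */
final class AggregationRegistry {
  private final Map<String, SplittableAggregation> byName = new LinkedHashMap<>();

  void register(SplittableAggregation fn) {
    byName.put(key(fn.name()), fn);
  }

  /** Analogue of the proposed early-termination check in matches(). */
  boolean isRegistered(String name) {
    return byName.containsKey(key(name));
  }

  Optional<SplittableAggregation> lookup(String name) {
    return Optional.ofNullable(byName.get(key(name)));
  }

  /** Generic split: no per-function branching in the planner-like code. */
  String describeSplit(String name) {
    return lookup(name)
        .map(fn -> fn.leafFunction() + " -> " + fn.mergeFunction())
        .orElseThrow(() -> new IllegalArgumentException("Unknown aggregation: " + name));
  }

  // Function names are matched case-insensitively in this sketch.
  private static String key(String name) {
    return name.toUpperCase(Locale.ROOT);
  }
}
```

With this shape, supporting a new function is a single `register(...)` call, for example `new SimpleAggregation("COUNT", "COUNT", "SUM")` for a count whose partial results are merged by summing; the registry and the split logic stay unchanged.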
   
   cc @siddharthteotia @vvivekiyer @walterddr @kishoreg 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] siddharthteotia commented on issue #10745: [multistage] The multistage aggregate planner and runtime do not work with all v1 aggregation functions

Posted by "siddharthteotia (via GitHub)" <gi...@apache.org>.
siddharthteotia commented on issue #10745:
URL: https://github.com/apache/pinot/issues/10745#issuecomment-1541143669

   FYI @kishoreg ^^


[GitHub] [pinot] walterddr commented on issue #10745: [multistage] The multistage aggregate planner and runtime do not work with all v1 aggregation functions

Posted by "walterddr (via GitHub)" <gi...@apache.org>.
walterddr commented on issue #10745:
URL: https://github.com/apache/pinot/issues/10745#issuecomment-1622036240

   Closed with #10845 and #10846.
   Creating a follow-up for the rest of the functions.


[GitHub] [pinot] siddharthteotia commented on issue #10745: [multistage] The multistage aggregate planner and runtime do not work with all v1 aggregation functions

Posted by "siddharthteotia (via GitHub)" <gi...@apache.org>.
siddharthteotia commented on issue #10745:
URL: https://github.com/apache/pinot/issues/10745#issuecomment-1540933864

   @vvivekiyer and @jasperjiaguo will pick this up, dividing the planner and runtime work between themselves.
   


[GitHub] [pinot] walterddr closed issue #10745: [multistage] The multistage aggregate planner and runtime do not work with all v1 aggregation functions

Posted by "walterddr (via GitHub)" <gi...@apache.org>.
walterddr closed issue #10745: [multistage] The multistage aggregate planner and runtime do not work with all v1 aggregation functions
URL: https://github.com/apache/pinot/issues/10745


[GitHub] [pinot] siddharthteotia commented on issue #10745: [multistage] The multistage aggregate planner and runtime do not work with all v1 aggregation functions

Posted by "siddharthteotia (via GitHub)" <gi...@apache.org>.
siddharthteotia commented on issue #10745:
URL: https://github.com/apache/pinot/issues/10745#issuecomment-1541143234

   We synced up. @somandal will help with the planner-side changes and @vvivekiyer will help with the runtime-side changes.


[GitHub] [pinot] kishoreg commented on issue #10745: [multistage] The multistage aggregate planner and runtime do not work with all v1 aggregation functions

Posted by "kishoreg (via GitHub)" <gi...@apache.org>.
kishoreg commented on issue #10745:
URL: https://github.com/apache/pinot/issues/10745#issuecomment-1539611189

   Thanks @somandal for summarizing. I forgot to bring up the issue with the return data type for aggregation functions.
   
   We should enhance the aggregation function to have methods that provide the result data type based on the input types.
   
   ```
     ColumnDataType getIntermediateResultColumnType();

     ColumnDataType getFinalResultColumnType();

     /** New method: defaults to the existing type-independent variant. */
     default ColumnDataType getIntermediateResultColumnType(ColumnDataType[] argTypes) {
       return getIntermediateResultColumnType();
     }

     /** New method: defaults to the existing type-independent variant. */
     default ColumnDataType getFinalResultColumnType(ColumnDataType[] argTypes) {
       return getFinalResultColumnType();
     }
   ```
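
As a rough illustration of how these default methods might be used, the sketch below shows one function that keeps the defaults and one that overrides them to derive its result type from the argument types. Everything here is an assumption made for illustration: the `ColumnDataType` enum is a local stand-in rather than Pinot's class, and `TypedAggregation`, `FixedTypeCount`, and `InputTypedSum` (including its widening rule) are hypothetical.

```java
/** Local stand-in for a column type enum; not Pinot's actual class. */
enum ColumnDataType { INT, LONG, FLOAT, DOUBLE }

/** Hypothetical interface carrying the four methods from the snippet above. */
interface TypedAggregation {
  ColumnDataType getIntermediateResultColumnType();

  ColumnDataType getFinalResultColumnType();

  default ColumnDataType getIntermediateResultColumnType(ColumnDataType[] argTypes) {
    return getIntermediateResultColumnType();
  }

  default ColumnDataType getFinalResultColumnType(ColumnDataType[] argTypes) {
    return getFinalResultColumnType();
  }
}

/** A function whose result type never depends on its input: defaults suffice. */
final class FixedTypeCount implements TypedAggregation {
  @Override public ColumnDataType getIntermediateResultColumnType() { return ColumnDataType.LONG; }
  @Override public ColumnDataType getFinalResultColumnType() { return ColumnDataType.LONG; }
}

/** A function that overrides the new methods (illustrative widening rule only). */
final class InputTypedSum implements TypedAggregation {
  @Override public ColumnDataType getIntermediateResultColumnType() { return ColumnDataType.DOUBLE; }
  @Override public ColumnDataType getFinalResultColumnType() { return ColumnDataType.DOUBLE; }

  @Override
  public ColumnDataType getIntermediateResultColumnType(ColumnDataType[] argTypes) {
    return widen(argTypes);
  }

  @Override
  public ColumnDataType getFinalResultColumnType(ColumnDataType[] argTypes) {
    return widen(argTypes);
  }

  // Integral input stays integral (LONG); anything else falls back to DOUBLE.
  private static ColumnDataType widen(ColumnDataType[] argTypes) {
    if (argTypes.length == 1
        && (argTypes[0] == ColumnDataType.INT || argTypes[0] == ColumnDataType.LONG)) {
      return ColumnDataType.LONG;
    }
    return ColumnDataType.DOUBLE;
  }
}
```

Because the new overloads are default methods, existing implementations such as `FixedTypeCount` compile unchanged, and only functions whose result type depends on the input need to override them.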


[GitHub] [pinot] somandal commented on issue #10745: [multistage] The multistage aggregate planner and runtime do not work with all v1 aggregation functions

Posted by "somandal (via GitHub)" <gi...@apache.org>.
somandal commented on issue #10745:
URL: https://github.com/apache/pinot/issues/10745#issuecomment-1555378367

   Please review this proposal, written by me and @vvivekiyer: https://docs.google.com/document/d/1Us6aBvTpNLMEy0ODo34OgTk73h_LVFFAH6q17689h1M/edit?usp=sharing

