Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2019/10/03 18:57:45 UTC

[GitHub] [incubator-pinot] npawar opened a new pull request #4602: First pass of GROUP BY with ORDER BY support

npawar opened a new pull request #4602: First pass of GROUP BY with ORDER BY support
URL: https://github.com/apache/incubator-pinot/pull/4602
 
 
   This PR implements ORDER BY support for GROUP BY queries.
   
   In this first pass, the changes apply from `CombineGroupByOrderByOperator` upwards. `AggregationGroupByOperator` is unchanged.
   
   `IndexedTable` is used wherever possible: first to merge results in `CombineGroupByOrderByOperator`, and then to reduce results across servers in `BrokerReduceService`.
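
   The merge-then-order idea can be sketched as follows. This is a conceptual illustration in Python, not the Java `IndexedTable` from this PR: the function names `merge_partials` and `order_and_trim`, the dict-based table, and the sample data are all assumptions made for the example.

```python
from typing import Callable, Dict, Iterable, List, Tuple

GroupKey = Tuple[str, ...]
Row = Tuple[GroupKey, float]


def merge_partials(
    partials: Iterable[Dict[GroupKey, float]],
    merge: Callable[[float, float], float],
) -> Dict[GroupKey, float]:
    """Merge partial group-by results into one table keyed by group key.

    Hypothetical helper: each partial stands for the result of one segment
    (or one server); values for the same key are combined with `merge`.
    """
    table: Dict[GroupKey, float] = {}
    for partial in partials:
        for key, value in partial.items():
            table[key] = merge(table[key], value) if key in table else value
    return table


def order_and_trim(
    table: Dict[GroupKey, float], limit: int, descending: bool = True
) -> List[Row]:
    """Order the merged rows by aggregation value and keep the first `limit`."""
    rows = sorted(table.items(), key=lambda kv: kv[1], reverse=descending)
    return rows[:limit]


# Made-up partial results for a SUM-style aggregation over (dim1, dim2).
segment_a = {("us", "web"): 3.0, ("us", "app"): 5.0}
segment_b = {("us", "web"): 4.0, ("eu", "web"): 1.0}

merged = merge_partials([segment_a, segment_b], lambda x, y: x + y)
top_rows = order_and_trim(merged, limit=2)
```

   The same two steps apply at both levels described above: once when combining results within a server, and again when reducing across servers.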
   
   `ResultTable` has been introduced as a standard way to return results to the client.
   
   Two `queryOptions` have been introduced:
   1. `groupByMode` (`pql` or `sql`): whether to execute the group by in PQL style (split all aggregations and ignore ORDER BY) or in standard SQL style.
   2. `responseFormat` (`pql` or `sql`): whether to present results as `List<AggregationResult>` (the PQL way) or as `ResultTable`, which is closer to the SQL way.

   By default, both options are `pql`.
   To get the ORDER BY results in `ResultTable`, set `groupByMode=sql` and `responseFormat=sql`.
   To get the ORDER BY results in `List<AggregationResult>`, set `groupByMode=sql` and `responseFormat=pql`.
   These options can be added to the JSON payload:
   `curl -H "Content-Type: application/json" -X POST -d '{"sql":"select count(*) from table group by dim1 order by dim1","queryOptions":"groupByMode=sql;responseFormat=sql"}' http://localhost:8099/query`
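
   For clients that build the request programmatically, the same payload can be assembled as in the sketch below. The helper names `build_query_payload` and `post_query` are hypothetical, the lowercase-only validation is an assumption, and the URL is the one from the curl command above; only the payload shape is taken from this PR.

```python
import json
import urllib.request

VALID_MODES = {"pql", "sql"}


def build_query_payload(sql, group_by_mode="pql", response_format="pql"):
    """Return the JSON request body with both query options set.

    Defaults mirror the PR description (pql, pql). Accepting only lowercase
    values is an assumption of this sketch.
    """
    for name, value in (("groupByMode", group_by_mode),
                        ("responseFormat", response_format)):
        if value not in VALID_MODES:
            raise ValueError(f"{name} must be 'pql' or 'sql', got {value!r}")
    options = f"groupByMode={group_by_mode};responseFormat={response_format}"
    return json.dumps({"sql": sql, "queryOptions": options})


def post_query(payload, url="http://localhost:8099/query"):
    """POST the payload to the broker; requires a running broker at `url`."""
    request = urllib.request.Request(
        url,
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


payload = build_query_payload(
    "select count(*) from table group by dim1 order by dim1",
    group_by_mode="sql",
    response_format="sql",
)
```

   Passing `group_by_mode="sql"` with `response_format="pql"` gives the third combination described above: ORDER BY semantics with the existing response shape.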
   
   Pending: benchmarking.
   `SELECT agg1 FROM table GROUP BY group1, group2 ORDER BY agg1 DESC` should be compared against the original `SELECT agg1 FROM table GROUP BY group1, group2`, since the results are expected to be identical.
   We can also compare `SELECT agg1,agg2... FROM table GROUP BY group1, group2 ORDER BY agg1 DESC` against the original `SELECT agg1,agg2... FROM table GROUP BY group1, group2`. The groups returned by the latter will differ, but the two queries are comparable in terms of result size.
   
   Next steps: push `IndexedTable` down into `AggregationGroupByOperator`. We can introduce a new operator for each strategy we're trying out (one `ConcurrentIndexedTable`, multiple `SimpleIndexedTable`s, etc.).
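
   The two strategies named above could look roughly like the sketch below. It is a conceptual Python illustration, not the planned Java operators: it assumes that "one concurrent table" means a single table shared by all worker threads, and that "multiple simple tables" means one unsynchronized table per segment, merged at the end. All function names and the sample data are hypothetical.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def shared_table_strategy(segments):
    """All workers update one shared table, guarded by a lock."""
    table = {}
    lock = threading.Lock()

    def process(segment):
        for key, value in segment:
            with lock:
                table[key] = table.get(key, 0) + value

    with ThreadPoolExecutor() as pool:
        list(pool.map(process, segments))  # drain to surface worker errors
    return table


def per_segment_tables_strategy(segments):
    """Each worker fills its own table; tables are merged afterwards."""

    def process(segment):
        local = {}
        for key, value in segment:
            local[key] = local.get(key, 0) + value
        return local

    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(process, segments))

    merged = {}
    for partial in partials:
        for key, value in partial.items():
            merged[key] = merged.get(key, 0) + value
    return merged


# Made-up (group key, value) rows for two segments.
segments = [
    [("a", 1), ("b", 2), ("a", 3)],
    [("b", 4), ("c", 5)],
]
shared_result = shared_table_strategy(segments)
merged_result = per_segment_tables_strategy(segments)
```

   Both strategies produce the same table; they differ in where the synchronization cost is paid, which is what the pending benchmarks would need to measure.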
