Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2019/07/19 21:08:56 UTC

[GitHub] [incubator-pinot] mayankshriv opened a new issue #4450: Support for SQL compliant Order By in PQL

mayankshriv opened a new issue #4450: Support for SQL compliant Order By in PQL
URL: https://github.com/apache/incubator-pinot/issues/4450

Pinot currently does not support `Order By` for group-by queries in PQL. The current behavior is:

1. Throw an exception for the following query:
`select sum(m1) from table where ... group by d1, d2 order by d1`

2. Silently ignore `order by` for the following query; the groups are instead sorted by their aggregation value.
`select sum(m1), d1, d2 from table where ... group by d1, d2 order by d1`
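To make the difference concrete, here is a minimal sketch in plain Python (not Pinot code) that contrasts ordering groups by aggregation value with ordering them by `d1`. The sample rows are made up, and the descending direction for the value-based ordering is an assumption of this sketch.

```python
from collections import defaultdict

# Hypothetical rows of (d1, d2, m1); column names follow the queries above.
rows = [
    ("b", "x", 5),
    ("a", "y", 1),
    ("a", "x", 7),
    ("b", "y", 2),
]

def group_sum(rows):
    """Compute sum(m1) for each (d1, d2) group."""
    sums = defaultdict(int)
    for d1, d2, m1 in rows:
        sums[(d1, d2)] += m1
    return sums

sums = group_sum(rows)

# Behavior described above: groups ordered by their aggregation value
# (descending is an assumption made for this sketch).
by_value = sorted(sums.items(), key=lambda kv: kv[1], reverse=True)

# SQL-compliant behavior: groups ordered by the `order by` column d1.
by_d1 = sorted(sums.items(), key=lambda kv: kv[0][0])

print(by_value)
print(by_d1)
```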

The proposal is to support SQL-compliant `Order By` in PQL for both of the queries above. Upon initial investigation, it appears that the existing group-by execution code has several performance-related optimizations that make it hard to support the new feature. So the initial proposal is as follows:

1. The existing `BrokerRequest` implementation has tied the `Order By` to `selection` queries only. In order to maintain backward compatibility, the proposal is to add a top-level `Order By` in the `request.thrift` that would initially apply to `group-by` only, and later drop the `SelectionSort` part.

2. In the presence of `Order By`, generate a new plan using a new operator, so that this feature has a parallel implementation. Once this code has matured (in functionality and performance), the existing functionality can also be moved to use the new code.

3. The new operator will work on tabular data, as opposed to a Map with a single group key and a value. This tabular structure would make it more generic and reusable for other queries (selection), and would allow for customized, column-based sort orders.

4. With SQL-compliant behavior, we will be able to perform segment-level trimming of results in the presence of multiple aggregation functions. This does not happen today because it would break the existing behavior: dropping a key for one aggregation function also drops it for all the others, which is not desired for the current non-SQL behavior.
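Points 3 and 4 can be illustrated with a rough sketch in plain Python. It is not the proposed Pinot operator: the function names (`aggregate`, `order_and_trim`, `merge_segments`), the sample data, and the two-aggregation table layout are all hypothetical. Each segment produces a table of rows `(d1, d2, sum(m1), sum(m2))`, sorts it by the order-by columns, and trims it before the per-segment tables are merged. The sketch orders by group-key columns only, in which case trimming each segment to the limit does not change the final answer; ordering by an aggregation value would need more care.

```python
import heapq
from collections import defaultdict
from itertools import chain

def aggregate(rows):
    """Group rows of (d1, d2, v1, v2) into a table of (d1, d2, sum(v1), sum(v2))."""
    acc = defaultdict(lambda: [0, 0])
    for d1, d2, v1, v2 in rows:
        acc[(d1, d2)][0] += v1
        acc[(d1, d2)][1] += v2
    return [(d1, d2, s1, s2) for (d1, d2), (s1, s2) in acc.items()]

def order_and_trim(table, order_by, limit):
    """Sort rows by the given column indexes (ascending) and keep `limit` rows.

    A trimmed row is dropped as a whole, so every aggregation column of a
    surviving group stays intact.
    """
    return heapq.nsmallest(
        limit, table, key=lambda row: tuple(row[i] for i in order_by)
    )

def merge_segments(segment_tables, order_by, limit):
    """Combine per-segment tables, re-aggregate matching groups, then trim."""
    merged = aggregate(chain.from_iterable(segment_tables))
    return order_and_trim(merged, order_by, limit)

# Hypothetical raw rows of (d1, d2, m1, m2) for two segments.
segment_a = [("a", "x", 1, 10), ("b", "x", 2, 20), ("c", "x", 3, 30), ("a", "x", 4, 40)]
segment_b = [("b", "x", 5, 50), ("a", "x", 6, 60), ("d", "x", 7, 70)]

order_by = [0, 1]  # order by d1, d2
limit = 2

trimmed = [
    order_and_trim(aggregate(seg), order_by, limit)
    for seg in (segment_a, segment_b)
]
result = merge_segments(trimmed, order_by, limit)
print(result)
```

Because a row is kept or dropped as a unit, trimming by the order-by columns never leaves a group with one aggregation present and another missing, which is the property point 4 relies on.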
