You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2021/11/16 20:20:43 UTC

[GitHub] [druid] gianm opened a new pull request #11930: Scan: Add "orderBy" parameter.

gianm opened a new pull request #11930:
URL: https://github.com/apache/druid/pull/11930


   This patch adds an API for requesting non-time orderings, although it
   does not actually add the ability to execute such queries.
   
   A key motivation is to enable ORDER BY tracking for INSERT described
   in #11929, which is why this patch is still useful even if it doesn't add the
   ability to execute Scan queries with non-time ordering. But I do think that
   would also be very useful, so I do think we will eventually add execution
   code for this parameter.
   
   The changes are done in such a way that no matter how Scan query objects
   are constructed, they will have a correct "getOrderBy". This will enable
   us to switch the execution to exclusively use "getOrderBy" later on when
   it's implemented.
   
   Scan queries are serialized such that they only include "order" (time
   order) if the ordering is time-based, and they only include "orderBy" if
   the ordering is non-time-based. This maximizes compatibility with
   the existing API while also providing a clean look for formatted queries.
   
   Because this patch does not include execution logic, if someone actually
   tries to run a query with non-time ordering, then they will get an error
   like "Cannot execute query with orderBy [quality ASC]".


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] clintropolis commented on a change in pull request #11930: Scan: Add "orderBy" parameter.

Posted by GitBox <gi...@apache.org>.
clintropolis commented on a change in pull request #11930:
URL: https://github.com/apache/druid/pull/11930#discussion_r752849195



##########
File path: processing/src/main/java/org/apache/druid/query/scan/ScanQueryEngine.java
##########
@@ -131,8 +131,8 @@
                     intervals.get(0),
                     query.getVirtualColumns(),
                     Granularities.ALL,
-                    query.getOrder().equals(ScanQuery.Order.DESCENDING) ||
-                    (query.getOrder().equals(ScanQuery.Order.NONE) && query.isDescending()),
+                    query.getTimeOrder().equals(ScanQuery.Order.DESCENDING) ||
+                    (query.getTimeOrder().equals(ScanQuery.Order.NONE) && query.isDescending()),

Review comment:
       I know this isn't new, but unless i'm missing something it looks like the scan query constructor hard codes `descending` to false, so the 2nd clause should never be true?

##########
File path: processing/src/main/java/org/apache/druid/query/scan/ScanQuery.java
##########
@@ -417,10 +542,92 @@ public String toString()
            ", limit=" + scanRowsLimit +
            ", dimFilter=" + dimFilter +
            ", columns=" + columns +
-           ", legacy=" + legacy +
+           (orderBys.isEmpty() ? "" : ", orderBy=" + orderBys) +
+           (legacy == null ? "" : ", legacy=" + legacy) +
+           ", context=" + getContext() +
            '}';
   }
 
+  /**
+   * Verify and reconcile the two ways of specifying ordering: "orderBy", which can refer to any column, and
+   * "order", which refers to the __time column.
+   *
+   * If only "order" is provided, it is returned as-is, along with an equivalent "orderBy".
+   *
+   * If only "orderBy" is provided, it is returned as-is. If it can be converted into an equivalent "order", then that
+   * equivalent "order" is also returned. Otherwise, "orderBy" is returned as-is and "order" is returned as NONE.
+   *
+   * If both "orderBy" and "order" are provided, this returns them as-is if they are compatible, or throws an
+   * exception if they are incompatible.
+   *
+   * @param orderByFromUser "orderBy" specified by the user (can refer to any column)
+   * @param orderFromUser   "order" specified by the user (refers to time order)
+   */
+  private static Pair<List<OrderBy>, Order> verifyAndReconcileOrdering(
+      @Nullable final List<OrderBy> orderByFromUser,
+      @Nullable final Order orderFromUser
+  )
+  {
+    final List<OrderBy> orderByRetVal;
+    final Order orderRetVal;
+
+    // Compute the returned orderBy.
+    if (orderByFromUser != null) {
+      orderByRetVal = orderByFromUser;
+    } else if (orderFromUser == null || orderFromUser == Order.NONE) {
+      orderByRetVal = Collections.emptyList();
+    } else {
+      orderByRetVal = Collections.singletonList(new OrderBy(ColumnHolder.TIME_COLUMN_NAME, orderFromUser));
+    }
+
+    // Compute the returned order.
+    orderRetVal = computeTimeOrderFromOrderBys(orderByRetVal);

Review comment:
       nit: it seems slightly funny to do this here, but also seems ok since there is no implementation yet of order by other than time. Mostly just wondering if it should be pushed down a bit further. I guess you didn't want to change the scan engine / query runner factory to work on `OrderBy` instead of `Order` and just do the checking there?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] gianm commented on pull request #11930: Scan: Add "orderBy" parameter.

Posted by GitBox <gi...@apache.org>.
gianm commented on pull request #11930:
URL: https://github.com/apache/druid/pull/11930#issuecomment-971098969


   > This pull request **introduces 1 alert** when merging [3368d8d](https://github.com/apache/druid/commit/3368d8da8edc3483066865dc49fb0f23211b4eda) into [1487f55](https://github.com/apache/druid/commit/1487f558b1261a751c0bbcd3f551278498913c44) - [view on LGTM.com](https://lgtm.com/projects/g/apache/druid/rev/pr-7a5ec5f3a80168a9af3fa6d13839584d7a0bfc6f)
   > 
   > **new alerts:**
   > 
   >     * 1 for Inconsistent equals and hashCode
   
   This is a false alarm due to the weird JsonIgnore API.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] gianm commented on a change in pull request #11930: Scan: Add "orderBy" parameter.

Posted by GitBox <gi...@apache.org>.
gianm commented on a change in pull request #11930:
URL: https://github.com/apache/druid/pull/11930#discussion_r753276862



##########
File path: processing/src/main/java/org/apache/druid/query/scan/ScanQuery.java
##########
@@ -417,10 +542,92 @@ public String toString()
            ", limit=" + scanRowsLimit +
            ", dimFilter=" + dimFilter +
            ", columns=" + columns +
-           ", legacy=" + legacy +
+           (orderBys.isEmpty() ? "" : ", orderBy=" + orderBys) +
+           (legacy == null ? "" : ", legacy=" + legacy) +
+           ", context=" + getContext() +
            '}';
   }
 
+  /**
+   * Verify and reconcile the two ways of specifying ordering: "orderBy", which can refer to any column, and
+   * "order", which refers to the __time column.
+   *
+   * If only "order" is provided, it is returned as-is, along with an equivalent "orderBy".
+   *
+   * If only "orderBy" is provided, it is returned as-is. If it can be converted into an equivalent "order", then that
+   * equivalent "order" is also returned. Otherwise, "orderBy" is returned as-is and "order" is returned as NONE.
+   *
+   * If both "orderBy" and "order" are provided, this returns them as-is if they are compatible, or throws an
+   * exception if they are incompatible.
+   *
+   * @param orderByFromUser "orderBy" specified by the user (can refer to any column)
+   * @param orderFromUser   "order" specified by the user (refers to time order)
+   */
+  private static Pair<List<OrderBy>, Order> verifyAndReconcileOrdering(
+      @Nullable final List<OrderBy> orderByFromUser,
+      @Nullable final Order orderFromUser
+  )
+  {
+    final List<OrderBy> orderByRetVal;
+    final Order orderRetVal;
+
+    // Compute the returned orderBy.
+    if (orderByFromUser != null) {
+      orderByRetVal = orderByFromUser;
+    } else if (orderFromUser == null || orderFromUser == Order.NONE) {
+      orderByRetVal = Collections.emptyList();
+    } else {
+      orderByRetVal = Collections.singletonList(new OrderBy(ColumnHolder.TIME_COLUMN_NAME, orderFromUser));
+    }
+
+    // Compute the returned order.
+    orderRetVal = computeTimeOrderFromOrderBys(orderByRetVal);

Review comment:
       > I guess you didn't want to change the scan engine / query runner factory to work on OrderBy instead of Order and just do the checking there?
   
   Yeah, I wanted to leave that logic as-is for now, but then go in and change it in a later patch.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] lgtm-com[bot] commented on pull request #11930: Scan: Add "orderBy" parameter.

Posted by GitBox <gi...@apache.org>.
lgtm-com[bot] commented on pull request #11930:
URL: https://github.com/apache/druid/pull/11930#issuecomment-973673986


   This pull request **introduces 1 alert** when merging 64de14449d2c13e83cf7a7fd10b12ac423e43e0d into a4353aa1f42a51a1d218547b3a033550e49e5025 - [view on LGTM.com](https://lgtm.com/projects/g/apache/druid/rev/pr-373d4bcf3f60d4de9e41a2e3c9d22a1651306b95)
   
   **new alerts:**
   
   * 1 for Inconsistent equals and hashCode


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] gianm merged pull request #11930: Scan: Add "orderBy" parameter.

Posted by GitBox <gi...@apache.org>.
gianm merged pull request #11930:
URL: https://github.com/apache/druid/pull/11930


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] lgtm-com[bot] commented on pull request #11930: Scan: Add "orderBy" parameter.

Posted by GitBox <gi...@apache.org>.
lgtm-com[bot] commented on pull request #11930:
URL: https://github.com/apache/druid/pull/11930#issuecomment-971225038


   This pull request **introduces 1 alert** when merging 97f3a5b2d2e2bb3e7920c8c38807261db0c47096 into d76e6467004094aef7279903e64532515fde1e95 - [view on LGTM.com](https://lgtm.com/projects/g/apache/druid/rev/pr-ef9c136226b58f0295ddea4b8ac25055bd711b81)
   
   **new alerts:**
   
   * 1 for Inconsistent equals and hashCode


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] gianm commented on a change in pull request #11930: Scan: Add "orderBy" parameter.

Posted by GitBox <gi...@apache.org>.
gianm commented on a change in pull request #11930:
URL: https://github.com/apache/druid/pull/11930#discussion_r753277198



##########
File path: processing/src/main/java/org/apache/druid/query/scan/ScanQueryEngine.java
##########
@@ -131,8 +131,8 @@
                     intervals.get(0),
                     query.getVirtualColumns(),
                     Granularities.ALL,
-                    query.getOrder().equals(ScanQuery.Order.DESCENDING) ||
-                    (query.getOrder().equals(ScanQuery.Order.NONE) && query.isDescending()),
+                    query.getTimeOrder().equals(ScanQuery.Order.DESCENDING) ||
+                    (query.getTimeOrder().equals(ScanQuery.Order.NONE) && query.isDescending()),

Review comment:
       I think you're right, which is good, because thinking about orderBy + a separate "descending" flag is definitely weird. I think it'd make sense to remove this branch when the logic is updated to be orderBy-focused in a later patch.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] lgtm-com[bot] commented on pull request #11930: Scan: Add "orderBy" parameter.

Posted by GitBox <gi...@apache.org>.
lgtm-com[bot] commented on pull request #11930:
URL: https://github.com/apache/druid/pull/11930#issuecomment-970936771


   This pull request **introduces 1 alert** when merging 3368d8da8edc3483066865dc49fb0f23211b4eda into 1487f558b1261a751c0bbcd3f551278498913c44 - [view on LGTM.com](https://lgtm.com/projects/g/apache/druid/rev/pr-7a5ec5f3a80168a9af3fa6d13839584d7a0bfc6f)
   
   **new alerts:**
   
   * 1 for Inconsistent equals and hashCode


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] lgtm-com[bot] commented on pull request #11930: Scan: Add "orderBy" parameter.

Posted by GitBox <gi...@apache.org>.
lgtm-com[bot] commented on pull request #11930:
URL: https://github.com/apache/druid/pull/11930#issuecomment-971170552


   This pull request **introduces 1 alert** when merging 8f4c5af81446532ece7980fb9ead18570de8c28f into d76e6467004094aef7279903e64532515fde1e95 - [view on LGTM.com](https://lgtm.com/projects/g/apache/druid/rev/pr-651dd22718afba551f24abe597d5a1b7ebe1315e)
   
   **new alerts:**
   
   * 1 for Inconsistent equals and hashCode


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org