You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2021/03/29 17:30:22 UTC

[GitHub] [incubator-pinot] dongxiaoman opened a new issue #6724: Query do not return enough results with filter of "group by , having"

dongxiaoman opened a new issue #6724:
URL: https://github.com/apache/incubator-pinot/issues/6724


   we have a query like below, created_at is the time column
   ```
   SELECT transaction_id, count(*) as cnt FROM point_transaction where created_at >=1616760000 and created_at < 1616846400 group by transaction_id having cnt > 1 limit 400
   ```
   It is returning random results, sometimes 38 items, sometimes 20 items. In our dataset it should have much more than 400.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] kishoreg commented on issue #6724: Query do not return enough results with filter of "group by , having"

Posted by GitBox <gi...@apache.org>.
kishoreg commented on issue #6724:
URL: https://github.com/apache/incubator-pinot/issues/6724#issuecomment-809617998


   @Jackie-Jiang should we do automatic query rewrite for this?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] yupeng9 commented on issue #6724: Query do not return enough results with filter of "group by , having"

Posted by GitBox <gi...@apache.org>.
yupeng9 commented on issue #6724:
URL: https://github.com/apache/incubator-pinot/issues/6724#issuecomment-809625031


   @dongxiaoman @Jackie-Jiang I think we shall reopen this issue, as the original query is a valid one but the returned result is incorrect. I have seen similar reports too. 
   
   User wouldn't understand the implementation reasons, and they will simply conclude that pinot gives incorrect results.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] dongxiaoman commented on issue #6724: Query do not return enough results with filter of "group by , having"

Posted by GitBox <gi...@apache.org>.
dongxiaoman commented on issue #6724:
URL: https://github.com/apache/incubator-pinot/issues/6724#issuecomment-809597583


   thanks for explaining; makes sense
   final statement is like below:
   ```
   SELECT transaction_id, count(*) as cnt FROM point_transaction where created_at >=1616760000 and created_at < 1616846400 group by transaction_id  having cnt > 1 order by cnt desc  limit 400
   ```
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] Jackie-Jiang commented on issue #6724: Query do not return enough results with filter of "group by , having"

Posted by GitBox <gi...@apache.org>.
Jackie-Jiang commented on issue #6724:
URL: https://github.com/apache/incubator-pinot/issues/6724#issuecomment-809572378


   This is a known limitation for `HAVING` clause because the having is computed on broker side only. We don't transfer results from all the groups from servers to broker for performance concern, so in order to get the correct result, you should sort the groups: `SELECT transaction_id, count(*) as cnt FROM point_transaction where created_at >=1616760000 and created_at < 1616846400 group by transaction_id order by cnt desc having cnt > 1 limit 400`


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] dongxiaoman closed issue #6724: Query do not return enough results with filter of "group by , having"

Posted by GitBox <gi...@apache.org>.
dongxiaoman closed issue #6724:
URL: https://github.com/apache/incubator-pinot/issues/6724


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] Jackie-Jiang commented on issue #6724: Query do not return enough results with filter of "group by , having"

Posted by GitBox <gi...@apache.org>.
Jackie-Jiang commented on issue #6724:
URL: https://github.com/apache/incubator-pinot/issues/6724#issuecomment-809683795


   @kishoreg We can do automatic query rewrite for queries without ordering, but it won't work for the following scenarios:
   - User wants different ordering, e.g. `select ... having col > 10 order by col asc`
   - Nested having clause, e.g. `select ... having col1 > 10 and col2 > 5`
   - Having clause asks for a range, e.g. `select ... having col between 5 and 10
   
   To give accurate result, we need to gather all groups (like common SQL databases), but that would be super expensive.
   
   @dongxiaoman @yupeng9 We can keep this issue open, but currently I don't see a good solution yet. Currently `having` should only be used to simplify the client code of filtering on the group-by results.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org