Posted to commits@druid.apache.org by "pussyWang (via GitHub)" <gi...@apache.org> on 2023/05/05 06:49:07 UTC

[GitHub] [druid] pussyWang opened a new issue, #14213: Incorrect result returned by join query

pussyWang opened a new issue, #14213:
URL: https://github.com/apache/druid/issues/14213

   Druid version: 24.0.1
   
   SQL query:
   `SELECT t1.id, t1.time, t1.count FROM my_table t1 JOIN (SELECT id, MAX(time) AS latest_time FROM my_table WHERE id IN ('fwa12s3-dfwr2sg6-9k7g6', '123asd1-sdfqwe23-wqr23') GROUP BY id) t2 ON t1.id = t2.id AND t1.time = t2.latest_time`
   
   The expected result is one row per id, at that id's maximum time.
   However, the result set contains rows at two different times for a single id, as shown below:
   ![image](https://user-images.githubusercontent.com/16094315/236393140-3539794d-e52c-4484-afef-aabbcecd350e.png)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] abhishekagarwal87 commented on issue #14213: Incorrect result returned by join query

Posted by "abhishekagarwal87 (via GitHub)" <gi...@apache.org>.
abhishekagarwal87 commented on issue #14213:
URL: https://github.com/apache/druid/issues/14213#issuecomment-1540001273

   Ah, I see now. I think it should be fixed by https://github.com/apache/druid/pull/14151/files. In any case, can you share your native query plan? You can get it with "Explain query" in the web console. The fix will be available in 26.0.


[GitHub] [druid] abhishekagarwal87 commented on issue #14213: Incorrect result returned by join query

Posted by "abhishekagarwal87 (via GitHub)" <gi...@apache.org>.
abhishekagarwal87 commented on issue #14213:
URL: https://github.com/apache/druid/issues/14213#issuecomment-1536337287

   And you don't have multiple entries for that (id, time) combination in your raw table?


[GitHub] [druid] abhishekagarwal87 commented on issue #14213: Incorrect result returned by join query

Posted by "abhishekagarwal87 (via GitHub)" <gi...@apache.org>.
abhishekagarwal87 commented on issue #14213:
URL: https://github.com/apache/druid/issues/14213#issuecomment-1540021799

   We are converting the join condition in this query to the wrong filter. The filter that the condition converts to is
   ```
   (user_id IN (#en.wikipedia, #sv.wikipedia) && __time IN (1466989200000, 1466992800000))
   ```
   however it should be
   ```
   (user_id = #en.wikipedia && __time = 1466989200000) || (user_id = #sv.wikipedia && __time = 1466992800000)
   ```
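
To make the difference between the two filters concrete, here is a minimal Python sketch. It is not Druid code: the four rows and the helper names `in_filter` and `pair_filter` are invented for illustration, and only the two ids and two timestamps are taken from the filters above. The first filter tests each column on its own, so it accepts every combination of a listed id with a listed time; the second requires the (id, time) pair to match as a unit.

```python
# Hypothetical rows as (user_id, __time); each id has a row at both timestamps.
rows = [
    ("#en.wikipedia", 1466989200000),
    ("#en.wikipedia", 1466992800000),
    ("#sv.wikipedia", 1466989200000),
    ("#sv.wikipedia", 1466992800000),
]

# The (id, latest_time) pairs that the right-hand side of the join produces.
pairs = {
    ("#en.wikipedia", 1466989200000),
    ("#sv.wikipedia", 1466992800000),
}


def in_filter(row):
    """Wrong filter: checks each column independently (a cross product)."""
    ids = {p[0] for p in pairs}
    times = {p[1] for p in pairs}
    return row[0] in ids and row[1] in times


def pair_filter(row):
    """Intended filter: the id and the time must match as one pair."""
    return row in pairs


wrong = [r for r in rows if in_filter(r)]
right = [r for r in rows if pair_filter(r)]

print(len(wrong), "rows pass the IN/IN filter")
print(len(right), "rows pass the pairwise filter")
```

With this made-up data, the extra rows that pass only the first filter are exactly the mismatched combinations, such as `#en.wikipedia` at the second timestamp.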


[GitHub] [druid] pussyWang commented on issue #14213: Incorrect result returned by join query

Posted by "pussyWang (via GitHub)" <gi...@apache.org>.
pussyWang commented on issue #14213:
URL: https://github.com/apache/druid/issues/14213#issuecomment-1539296306

   I agree that the join will produce that many results. My problem is that the result should only contain data for id1-time1 and id2-time2; there should be no data for id1-time2.


[GitHub] [druid] rohangarg closed issue #14213: Incorrect result returned by join query

Posted by "rohangarg (via GitHub)" <gi...@apache.org>.
rohangarg closed issue #14213: Incorrect result returned by join query
URL: https://github.com/apache/druid/issues/14213


[GitHub] [druid] pussyWang commented on issue #14213: Incorrect result returned by join query

Posted by "pussyWang (via GitHub)" <gi...@apache.org>.
pussyWang commented on issue #14213:
URL: https://github.com/apache/druid/issues/14213#issuecomment-1537035263

   > and you don't have multiple entries for that (id, time) combination in your raw table?
   
   
   There are, but according to my query conditions, a single id should not return results for more than one time.


[GitHub] [druid] pussyWang commented on issue #14213: Incorrect result returned by join query

Posted by "pussyWang (via GitHub)" <gi...@apache.org>.
pussyWang commented on issue #14213:
URL: https://github.com/apache/druid/issues/14213#issuecomment-1541535203

   > We are converting the join condition in this query to the wrong filter. The filter that the condition converts to is
   > 
   > ```
   > (user_id IN (#en.wikipedia, #sv.wikipedia) && __time IN (1466989200000, 1466992800000))
   > ```
   > 
   > however it should be
   > 
   > ```
   > (user_id = #en.wikipedia && __time = 1466989200000) || (user_id = #sv.wikipedia && __time = 1466992800000)
   > ```
   
   As I thought. Thank you very much for your prompt response to this issue.


[GitHub] [druid] abhishekagarwal87 commented on issue #14213: Incorrect result returned by join query

Posted by "abhishekagarwal87 (via GitHub)" <gi...@apache.org>.
abhishekagarwal87 commented on issue #14213:
URL: https://github.com/apache/druid/issues/14213#issuecomment-1540016826

   Scratch that. It's a bug in our join implementation. As a workaround, if you include t2.latest_time in your select, the query will return the correct results. We will fix it.


[GitHub] [druid] pussyWang commented on issue #14213: Incorrect result returned by join query

Posted by "pussyWang (via GitHub)" <gi...@apache.org>.
pussyWang commented on issue #14213:
URL: https://github.com/apache/druid/issues/14213#issuecomment-1535859500

   This inner join query is being treated as equivalent to:
   `SELECT id, time, count FROM my_table WHERE id IN ('fwa12s3-dfwr2sg6-9k7g6', '123asd1-sdfqwe23-wqr23') AND time IN (1682438400000, 1682352000000)`
   
   That is wrong and produces the unexpected query results.
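
The gap between the two queries can be reproduced in a small sketch that uses Python's built-in `sqlite3` module as a stand-in engine, not Druid. Everything about the data is an assumption made for the example: the three rows, the short ids `a` and `b`, and the column names `ts` and `cnt` (used in place of `time` and `count` to avoid quoting). Only the two timestamp values come from the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id TEXT, ts INTEGER, cnt INTEGER)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?)",
    [
        ("a", 1682352000000, 1),
        ("a", 1682438400000, 2),  # latest row for id 'a'
        ("b", 1682352000000, 3),  # latest (and only) row for id 'b'
    ],
)

# The join from the issue: keep only each id's row at its own maximum time.
join_sql = """
SELECT t1.id, t1.ts, t1.cnt
FROM my_table t1
JOIN (SELECT id, MAX(ts) AS latest_ts FROM my_table
      WHERE id IN ('a', 'b') GROUP BY id) t2
  ON t1.id = t2.id AND t1.ts = t2.latest_ts
ORDER BY t1.id
"""

# The rewritten form: ids and times are filtered independently.
in_sql = """
SELECT id, ts, cnt FROM my_table
WHERE id IN ('a', 'b') AND ts IN (1682438400000, 1682352000000)
ORDER BY id, ts
"""

join_rows = conn.execute(join_sql).fetchall()
in_rows = conn.execute(in_sql).fetchall()

print("join: ", join_rows)
print("IN/IN:", in_rows)
```

On this data the join returns one row per id, while the IN/IN form also returns the older row for `a`, because that row's time happens to be the maximum time of `b`.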


[GitHub] [druid] abhishekagarwal87 commented on issue #14213: Incorrect result returned by join query

Posted by "abhishekagarwal87 (via GitHub)" <gi...@apache.org>.
abhishekagarwal87 commented on issue #14213:
URL: https://github.com/apache/druid/issues/14213#issuecomment-1538135253

   Why would that be? If t1 has multiple entries and t2 has one entry, the join will produce as many results as there are in t1.

