Posted to github@arrow.apache.org by "stuartcarnie (via GitHub)" <gi...@apache.org> on 2023/04/12 05:12:44 UTC

[GitHub] [arrow-datafusion] stuartcarnie opened a new issue, #5970: UNION ALL with ORDER BY results are inconsistent

stuartcarnie opened a new issue, #5970:
URL: https://github.com/apache/arrow-datafusion/issues/5970

   ### Describe the bug
   
   ### DataFusion CLI (18.0)
   
   Results were consistently sorted correctly
   
   ```sql
   WITH
     m0(time, tag0, f64) AS (VALUES (1667181600000000000, 'val00', 10.1), (1667181610000000000, 'val00', 21.2), (1667181620000000000, 'val00', 11.2), (1667181630000000000, 'val00', 19.2), (1667181600000000000, 'val01', 11.3), (1667181600000000000, 'val02', 10.4), (1667181610000000000, 'val00', 18.9)),
   
     m1(time, tag0, f64) AS (VALUES (1667181600000000000, 'val00', 100.5), (1667181610000000000, 'val00', 200.6), (1667181600000000000, 'val01', 101.7)),
   
     t AS (
       SELECT 'm0' as "iox::measurement", tag0, 0::timestamp as time, COUNT(f64), SUM(f64), stddev(f64) FROM m0 GROUP BY 1, 2, 3), 
   
     u AS (
       SELECT 'm1' as "iox::measurement", tag0, 0::timestamp as time, COUNT(f64), SUM(f64), stddev(f64) FROM m1 GROUP BY 1, 2, 3)
   SELECT * FROM t
   UNION ALL
   SELECT * FROM u
   ORDER BY 1, 2, 3;
   ```
   
   Output is correct:
   
   ```text
   +------------------+-------+---------------------+------------+----------+-------------------+
   | iox::measurement | tag0  | time                | COUNT(f64) | SUM(f64) | STDDEV(f64)       |
   +------------------+-------+---------------------+------------+----------+-------------------+
   | m0               | val00 | 1970-01-01T00:00:00 | 5          | 80.6     | 5.085961069453836 |
   | m0               | val01 | 1970-01-01T00:00:00 | 1          | 11.3     |                   |
   | m0               | val02 | 1970-01-01T00:00:00 | 1          | 10.4     |                   |
   | m1               | val00 | 1970-01-01T00:00:00 | 2          | 301.1    | 70.7813887967734  |
   | m1               | val01 | 1970-01-01T00:00:00 | 1          | 101.7    |                   |
   +------------------+-------+---------------------+------------+----------+-------------------+
   ```
   
   
   ### datafusion-cli (20.0)
   
   Results were not consistent
   
   
   
   Output is sometimes incorrect:
   
   
   ```text
   +------------------+-------+---------------------+------------+----------+-------------------+
   | iox::measurement | tag0  | time                | COUNT(f64) | SUM(f64) | STDDEV(f64)       |
   +------------------+-------+---------------------+------------+----------+-------------------+
   | m0               | val00 | 1970-01-01T00:00:00 | 5          | 80.6     | 5.085961069453836 |
   | m0               | val01 | 1970-01-01T00:00:00 | 1          | 11.3     |                   |
   | m1               | val00 | 1970-01-01T00:00:00 | 2          | 301.1    | 70.7813887967734  |
   | m0               | val02 | 1970-01-01T00:00:00 | 1          | 10.4     |                   |
   | m1               | val01 | 1970-01-01T00:00:00 | 1          | 101.7    |                   |
   +------------------+-------+---------------------+------------+----------+-------------------+
   ```
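
The difference between the two outputs can be checked mechanically. The following is a minimal sketch, not part of the original report: it copies the three `ORDER BY` columns from the two tables above and compares adjacent rows. The helper name `first_unsorted_index` is hypothetical, and the check assumes the default ascending order with plain byte-wise string comparison, which matches these ASCII values.

```python
from typing import List, Tuple

# One row = the three ORDER BY columns: ("iox::measurement", tag0, time).
Row = Tuple[str, str, str]


def first_unsorted_index(rows: List[Row]) -> int:
    """Return the index of the first row that sorts before its predecessor,
    or -1 if the rows are in ascending order (hypothetical helper)."""
    for i in range(1, len(rows)):
        # Tuples compare column by column, like ORDER BY 1, 2, 3 ascending.
        if rows[i] < rows[i - 1]:
            return i
    return -1


TS = "1970-01-01T00:00:00"

# Key columns copied from the 18.0 output above.
rows_18 = [
    ("m0", "val00", TS),
    ("m0", "val01", TS),
    ("m0", "val02", TS),
    ("m1", "val00", TS),
    ("m1", "val01", TS),
]

# Key columns copied from the 20.0 output above.
rows_20 = [
    ("m0", "val00", TS),
    ("m0", "val01", TS),
    ("m1", "val00", TS),
    ("m0", "val02", TS),
    ("m1", "val01", TS),
]

print(first_unsorted_index(rows_18))
print(first_unsorted_index(rows_20))
```

In the 20.0 output, the `m0 | val02` row appears after an `m1` row, so the first column alone is already out of order there.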
   
   ### To Reproduce
   
   _No response_
   
   ### Expected behavior
   
   _No response_
   
   ### Additional context
   
   _No response_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow-datafusion] mingmwang commented on issue #5970: UNION ALL with ORDER BY results are inconsistent

Posted by "mingmwang (via GitHub)" <gi...@apache.org>.

mingmwang commented on issue #5970:
URL: https://github.com/apache/arrow-datafusion/issues/5970#issuecomment-1508262463

   My suggestion is to make the partition-aware optimization a configuration option rather than removing the code.
   By default, we would turn it off, so that UnionExec behaves like a plain union and does not keep the partition info.



[GitHub] [arrow-datafusion] alamb closed issue #5970: UNION ALL with ORDER BY results are inconsistent

Posted by "alamb (via GitHub)" <gi...@apache.org>.

alamb closed issue #5970: UNION ALL with ORDER BY results are inconsistent
URL: https://github.com/apache/arrow-datafusion/issues/5970

