Posted to github@arrow.apache.org by "yukkit (via GitHub)" <gi...@apache.org> on 2023/04/14 03:33:14 UTC
[GitHub] [arrow-datafusion] yukkit opened a new issue, #6001: Incorrect column pruning in sql with window operations
yukkit opened a new issue, #6001:
URL: https://github.com/apache/arrow-datafusion/issues/6001
### Describe the bug
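As the title says: when a query contains a window function, column pruning is not applied correctly. The `TableScan` projection in the plan below lists columns that the query never references.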
### To Reproduce
```sql
❯ explain select sum(case when latitude < 50.0 then latitude else 0 end) over (partition by name) from readings;
+---------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| plan_type | plan |
+---------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| logical_plan | Projection: SUM(CASE WHEN readings.latitude < Float64(50) THEN readings.latitude ELSE Int64(0) END) PARTITION BY [readings.name] ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING |
| | WindowAggr: windowExpr=[[SUM(CASE WHEN readings.latitude < Float64(50) THEN readings.latitude ELSE Float64(0) END) PARTITION BY [readings.name] ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING AS SUM(CASE WHEN readings.latitude < Float64(50) THEN readings.latitude ELSE Int64(0) END) PARTITION BY [readings.name] ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING]] |
| | TableScan: readings projection=[time, name, fleet, driver, model, device_version, latitude, longitude, elevation, velocity, heading, grade, fuel_consumption, load_capacity, fuel_capacity, nominal_fuel_consumption] |
| physical_plan | ProjectionExec: expr=[SUM(CASE WHEN readings.latitude < Float64(50) THEN readings.latitude ELSE Int64(0) END) PARTITION BY [readings.name] ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING@16 as SUM(CASE WHEN readings.latitude < Float64(50) THEN readings.latitude ELSE Int64(0) END) PARTITION BY [readings.name] ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING] |
| | WindowAggExec: wdw=[SUM(CASE WHEN readings.latitude < Float64(50) THEN readings.latitude ELSE Int64(0) END) PARTITION BY [readings.name] ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING: Ok(Field { name: "SUM(CASE WHEN readings.latitude < Float64(50) THEN readings.latitude ELSE Int64(0) END) PARTITION BY [readings.name] ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING", data_type: Float64, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }), frame: WindowFrame { units: Rows, start_bound: Preceding(UInt64(NULL)), end_bound: Following(UInt64(NULL)) }] |
| | SortExec: expr=[name@1 ASC NULLS LAST] |
| | CoalesceBatchesExec: target_batch_size=8192 |
| | RepartitionExec: partitioning=Hash([Column { name: "name", index: 1 }], 8), input_partitions=1 |
| | ParquetExec: limit=None, partitions={1 group: [[Users/yukkit/Documents/tmp/data/parquet/part-297.parquet]]}, projection=[time, name, fleet, driver, model, device_version, latitude, longitude, elevation, velocity, heading, grade, fuel_consumption, load_capacity, fuel_capacity, nominal_fuel_consumption] |
| | |
+---------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
2 rows in set. Query took 0.026 seconds.
```
### Expected behavior
Only `latitude` and `name`, the two columns the query references, should be pushed down to the `TableScan` projection.
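
To make the expectation concrete, here is a minimal sketch of the pruning rule in Python. It is not DataFusion's optimizer or API: `WindowExpr` and `pruned_projection` are hypothetical names used only for illustration, and the schema is copied from the `TableScan` line in the plan above. The idea is to collect every column a window expression reads (aggregate argument, `PARTITION BY`, `ORDER BY`) and keep only those in the scan.

```python
from dataclasses import dataclass, field
from typing import List


# Hypothetical, simplified stand-in for a window expression;
# this is NOT a DataFusion type.
@dataclass
class WindowExpr:
    arg_columns: List[str]                                # columns read by the aggregate argument
    partition_by: List[str]                               # columns named in PARTITION BY
    order_by: List[str] = field(default_factory=list)     # columns named in ORDER BY


def pruned_projection(schema: List[str], window_exprs: List[WindowExpr]) -> List[str]:
    """Return the scan columns the window expressions need, in schema order."""
    needed = set()
    for w in window_exprs:
        needed.update(w.arg_columns, w.partition_by, w.order_by)
    return [c for c in schema if c in needed]


# Column list as reported by the TableScan in the issue.
schema = [
    "time", "name", "fleet", "driver", "model", "device_version",
    "latitude", "longitude", "elevation", "velocity", "heading", "grade",
    "fuel_consumption", "load_capacity", "fuel_capacity",
    "nominal_fuel_consumption",
]

# SUM(CASE WHEN latitude < 50.0 THEN latitude ELSE 0 END) OVER (PARTITION BY name)
window = WindowExpr(arg_columns=["latitude"], partition_by=["name"])

projection = pruned_projection(schema, [window])
print(projection)  # ['name', 'latitude']
```

Under these assumptions the scan would read 2 of the 16 columns instead of all of them.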
### Additional context
_No response_
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow-datafusion] jackwener closed issue #6001: Incorrect column pruning in sql with window operations
Posted by "jackwener (via GitHub)" <gi...@apache.org>.
jackwener closed issue #6001: Incorrect column pruning in sql with window operations
URL: https://github.com/apache/arrow-datafusion/issues/6001