Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/10/24 14:09:39 UTC

[GitHub] [arrow-datafusion] alamb opened a new issue, #3938: Predicate still has cast when comparing Timestamp(Nano, None) to a timestamp literal, so can't be pushed down or used for pruning

alamb opened a new issue, #3938:
URL: https://github.com/apache/arrow-datafusion/issues/3938

   **Describe the bug**
   Comparing a `Timestamp(Nanosecond, None)` column to a timestamp literal is important for IOx: such a predicate can be pushed down to scans and can potentially prune significant amounts of data.
   
   Specifically, to filter data relative to one hour ago you can use a predicate like:
   
   ```sql
   col_ts_nano_none < (now() - interval '1 hour')
   ```
   
   Which should be evaluated to something like
   
   ```rust
   test.col_ts_nano_none < TimestampNanosecond(1666612093000000000, Some("UTC"))
   ```
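   
   As a rough illustration of the folding step, the sketch below reduces `now() - interval '1 hour'` to a single nanosecond value. The function name `fold_now_minus_hours` and the constant are hypothetical and are not DataFusion APIs; the sketch only shows the arithmetic behind a literal such as the one above.
   
   ```rust
   /// Nanoseconds in one hour.
   const NANOS_PER_HOUR: i64 = 3_600 * 1_000_000_000;
   
   /// Hypothetical helper (not a DataFusion function): folds
   /// `now() - interval 'N hour'` into one nanosecond timestamp.
   /// `now_nanos` is the query start time in nanoseconds since the Unix epoch.
   /// Returns `None` on arithmetic overflow instead of panicking.
   fn fold_now_minus_hours(now_nanos: i64, hours: i64) -> Option<i64> {
       hours
           .checked_mul(NANOS_PER_HOUR)
           .and_then(|delta| now_nanos.checked_sub(delta))
   }
   ```
   
   For example, `fold_now_minus_hours(1_666_615_693_000_000_000, 1)` yields `Some(1_666_612_093_000_000_000)`, the same value as the literal used in this issue.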
   
   
   However, today DataFusion can't get rid of the cast:
   ```sql
   CAST(test.col_ts_nano_none AS Timestamp(Nanosecond, Some("UTC"))) < TimestampNanosecond(1666612093000000000, Some("UTC"))
   ```
   
   **To Reproduce**
   Add this test to the optimizer-integration test:
   
   ```rust
   #[test]
   fn timestamp_nano_ts_none_predicates() -> Result<()> {
       let sql = "SELECT col_int32
           FROM test
           WHERE col_ts_nano_none < (now() - interval '1 hour')";
       let plan = test_sql(sql)?;
       // a scan should have the now()... predicate folded to a single
       // constant and compared to the column without a cast so it can be
       // pushed down / pruned
       let expected = "Projection: test.col_int32\n  Filter: test.col_ts_nano_none < TimestampNanosecond(1666612093000000000, Some(\"UTC\"))\
                       \n    TableScan: test projection=[col_int32, col_ts_nano_none]";
       assert_eq!(expected, format!("{:?}", plan));
       Ok(())
   }
   ```
   
   It will fail like:
   
   ```
   ---- timestamp_nano_ts_none_predicates stdout ----
   thread 'timestamp_nano_ts_none_predicates' panicked at 'assertion failed: `(left == right)`
     left: `"Projection: test.col_int32\n  Filter: test.col_ts_nano_none < TimestampNanosecond(1666612093000000000, Some(\"UTC\"))\n    TableScan: test projection=[col_int32, col_ts_nano_none]"`,
    right: `"Projection: test.col_int32\n  Filter: CAST(test.col_ts_nano_none AS Timestamp(Nanosecond, Some(\"UTC\"))) < TimestampNanosecond(1666612093000000000, Some(\"UTC\"))\n    TableScan: test projection=[col_int32, col_ts_nano_none]"`', datafusion/optimizer/tests/integration-test.rs:242:5
   stack backtrace:
   ```
   
   The test fails because the predicate still contains the cast:
   
   ```
   CAST(test.col_ts_nano_none AS Timestamp(Nanosecond, Some("UTC"))) < TimestampNanosecond(1666612093000000000, Some("UTC"))
   ```
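   
   To illustrate why the leftover cast matters for pruning, here is a toy model of min/max pruning for a `<` predicate. The `ColumnStats` and `Operand` types and the `can_skip` function are hypothetical and much simpler than DataFusion's real pruning logic; the sketch assumes a pruner that only reasons about a bare column and conservatively keeps everything else.
   
   ```rust
   /// Hypothetical min/max statistics for one file or row group,
   /// in nanoseconds since the Unix epoch.
   #[allow(dead_code)]
   struct ColumnStats {
       min: i64,
       max: i64,
   }
   
   /// Left-hand side of a `lhs < literal` predicate in this toy model.
   enum Operand {
       Column(&'static str),
       Cast(Box<Operand>),
   }
   
   /// Returns true when no row in the container can satisfy `lhs < literal`,
   /// so the container can be skipped without reading it.
   fn can_skip(lhs: &Operand, literal: i64, stats: &ColumnStats) -> bool {
       match lhs {
           // Every value is >= min, so if min >= literal nothing matches.
           Operand::Column(_) => stats.min >= literal,
           // Assumption of this sketch: the statistics describe the raw
           // column, not the cast result, so be conservative and keep it.
           Operand::Cast(_) => false,
       }
   }
   ```
   
   With statistics `min = 200`, `max = 300` and a literal of `100`, the bare column lets the container be skipped, while the same column wrapped in a cast does not.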
   
   
   
   **Expected behavior**
   Test should pass (though the actual timestamp value may need to be adjusted)
   
   Specifically, the filter should be something like:
   ```
   test.col_ts_nano_none < TimestampNanosecond(1666612093000000000, Some("UTC"))
   ```
   
   
   
   **Additional context**
   See related IOx issue here: https://github.com/influxdata/influxdb_iox/issues/5875
   
   It is not clear to me whether the better fix would be for IOx to provide `Timestamp(Nanos, UTC)` rather than `Timestamp(Nanos, None)` in this case. However, DataFusion could definitely do a better job of unwrapping the cast.
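   
   A minimal sketch of what "unwrapping the cast" could look like follows. It uses a toy expression type, not DataFusion's `Expr`, and the names `TsType`, `Expr`, and `unwrap_cast_in_lt` are made up for illustration. It also assumes that casting between the two timestamp types leaves the underlying nanosecond value unchanged, and it chooses to retype the literal to the column's type, which is only one possible way to remove the cast.
   
   ```rust
   /// Toy stand-ins for Timestamp(Nanosecond, None) and
   /// Timestamp(Nanosecond, Some("UTC")).
   #[derive(Debug, Clone, Copy, PartialEq)]
   enum TsType {
       NanoNone,
       NanoUtc,
   }
   
   /// Toy expression tree (hypothetical, not DataFusion's `Expr`).
   #[derive(Debug, Clone, PartialEq)]
   enum Expr {
       Column { name: String, ty: TsType },
       Literal { nanos: i64, ty: TsType },
       Cast { expr: Box<Expr>, to: TsType },
       Lt(Box<Expr>, Box<Expr>),
   }
   
   /// Rewrites `CAST(col AS T) < literal(T)` into `col < literal(col_type)`.
   /// Assumes the cast does not change the stored i64 value; any other
   /// expression shape is returned unchanged.
   fn unwrap_cast_in_lt(expr: Expr) -> Expr {
       match expr {
           Expr::Lt(lhs, rhs) => match (*lhs, *rhs) {
               (Expr::Cast { expr: inner, to }, Expr::Literal { nanos, ty }) if to == ty => {
                   match *inner {
                       // Move the type change from the column to the literal.
                       Expr::Column { name, ty: col_ty } => Expr::Lt(
                           Box::new(Expr::Column { name, ty: col_ty }),
                           Box::new(Expr::Literal { nanos, ty: col_ty }),
                       ),
                       // Cast of something other than a column: leave as is.
                       other => Expr::Lt(
                           Box::new(Expr::Cast { expr: Box::new(other), to }),
                           Box::new(Expr::Literal { nanos, ty }),
                       ),
                   }
               }
               (l, r) => Expr::Lt(Box::new(l), Box::new(r)),
           },
           other => other,
       }
   }
   ```
   
   Applied to the failing predicate in this issue, the rewrite leaves a bare column on the left-hand side, which is the shape that pushdown and pruning need.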


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb commented on issue #3938: Predicate still has cast when comparing Timestamp(Nano, None) to a timestamp literal, so can't be pushed down or used for pruning

Posted by GitBox <gi...@apache.org>.
alamb commented on issue #3938:
URL: https://github.com/apache/arrow-datafusion/issues/3938#issuecomment-1307340947

   It turns out I accidentally marked this as completed when it is still broken




[GitHub] [arrow-datafusion] alamb closed issue #3938: Predicate still has cast when comparing Timestamp(Nano, None) to a timestamp literal, so can't be pushed down or used for pruning

Posted by GitBox <gi...@apache.org>.
alamb closed issue #3938: Predicate still has cast when comparing Timestamp(Nano, None) to a timestamp literal, so can't be pushed down or used for pruning
URL: https://github.com/apache/arrow-datafusion/issues/3938




[GitHub] [arrow-datafusion] andygrove closed issue #3938: Predicate still has cast when comparing Timestamp(Nano, None) to a timestamp literal, so can't be pushed down or used for pruning

Posted by GitBox <gi...@apache.org>.
andygrove closed issue #3938: Predicate still has cast when comparing Timestamp(Nano, None) to a timestamp literal, so can't be pushed down or used for pruning
URL: https://github.com/apache/arrow-datafusion/issues/3938

