You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by "comphead (via GitHub)" <gi...@apache.org> on 2023/02/01 00:56:06 UTC

[GitHub] [arrow-datafusion] comphead opened a new pull request, #5140: Date to Timestamp cast

comphead opened a new pull request, #5140:
URL: https://github.com/apache/arrow-datafusion/pull/5140

   # Which issue does this PR close?
   
   <!--
   We generally require a GitHub issue to be filed for all bug fixes and enhancements and this helps us generate change logs for our releases. You can link an issue to this PR using the GitHub syntax. For example `Closes #123` indicates that this PR will close issue #123.
   -->
   
   Closes #4761 #4644 .
   
   # Rationale for this change
   Introduce casting between dates and timestamps datatypes. Currently DF just panics on such kind of queries
   
   <!--
    Why are you proposing this change? If this is already explained clearly in the issue then this section is not needed.
    Explaining clearly why changes are proposed helps reviewers understand your changes and offer better suggestions for fixes.  
   -->
   
   # What changes are included in this PR?
   This PR includes planner changes to support date to timestamp cast. 
   <!--
   There is no need to duplicate the description in the issue here but it is sometimes worth providing a summary of the individual changes in this PR.
   -->
   
   # Are these changes tested?
   Yes. Tests enclosed
   <!--
   We typically require tests for all PRs in order to:
   1. Prevent the code from being accidentally broken by subsequent changes
   2. Serve as another way to document the expected behavior of the code
   
   If tests are not included in your PR, please explain why (for example, are they covered by existing tests)?
   -->
   
   # Are there any user-facing changes?
   No
   <!--
   If there are user-facing changes then we may require documentation to be updated before approving the PR.
   -->
   
   <!--
   If there are any breaking changes to public APIs, please add the `api change` label.
   -->


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb commented on pull request #5140: Date to Timestamp cast

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on PR #5140:
URL: https://github.com/apache/arrow-datafusion/pull/5140#issuecomment-1414087483

   > We need another cast to add Utf <-> Timestamp https://github.com/apache/arrow-datafusion/pull/5117 probably addresses this cast. If not I'll create another ticket
   
   I am going to merge this PR and test using https://github.com/apache/arrow-datafusion/pull/5117 and if it doesn't work I'll file another ticket
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #5140: Date to Timestamp cast

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on code in PR #5140:
URL: https://github.com/apache/arrow-datafusion/pull/5140#discussion_r1094840085


##########
datafusion/expr/src/type_coercion/binary.rs:
##########
@@ -541,33 +541,30 @@ fn is_time_with_valid_unit(datatype: DataType) -> bool {
 fn temporal_coercion(lhs_type: &DataType, rhs_type: &DataType) -> Option<DataType> {
     use arrow::datatypes::DataType::*;
     match (lhs_type, rhs_type) {
-        (Date64, Date32) => Some(Date64),
-        (Date32, Date64) => Some(Date64),
-        (Utf8, Date32) => Some(Date32),
-        (Date32, Utf8) => Some(Date32),
-        (Utf8, Date64) => Some(Date64),
-        (Date64, Utf8) => Some(Date64),
-        (Utf8, Time32(unit)) => match is_time_with_valid_unit(Time32(unit.clone())) {
-            false => None,
-            true => Some(Time32(unit.clone())),
-        },
-        (Time32(unit), Utf8) => match is_time_with_valid_unit(Time32(unit.clone())) {
-            false => None,
-            true => Some(Time32(unit.clone())),
-        },
-        (Utf8, Time64(unit)) => match is_time_with_valid_unit(Time64(unit.clone())) {
-            false => None,
-            true => Some(Time64(unit.clone())),
-        },
-        (Time64(unit), Utf8) => match is_time_with_valid_unit(Time64(unit.clone())) {
-            false => None,
-            true => Some(Time64(unit.clone())),
-        },
-        (Timestamp(_, tz), Utf8) => Some(Timestamp(TimeUnit::Nanosecond, tz.clone())),
-        (Utf8, Timestamp(_, tz)) => Some(Timestamp(TimeUnit::Nanosecond, tz.clone())),
-        // TODO: need to investigate the result type for the comparison between timestamp and date
-        (Timestamp(_, _), Date32) => Some(Date32),
-        (Timestamp(_, _), Date64) => Some(Date64),
+        (Date64, Date32) | (Date32, Date64) => Some(Date64),
+        (Utf8, Date32) | (Date32, Utf8) => Some(Date32),
+        (Utf8, Date64) | (Date64, Utf8) => Some(Date64),
+        (Utf8, Time32(unit)) | (Time32(unit), Utf8) => {
+            match is_time_with_valid_unit(Time32(unit.clone())) {
+                false => None,
+                true => Some(Time32(unit.clone())),
+            }
+        }
+        (Utf8, Time64(unit)) | (Time64(unit), Utf8) => {
+            match is_time_with_valid_unit(Time64(unit.clone())) {
+                false => None,
+                true => Some(Time64(unit.clone())),
+            }
+        }
+        (Timestamp(_, tz), Utf8) | (Utf8, Timestamp(_, tz)) => {
+            Some(Timestamp(TimeUnit::Nanosecond, tz.clone()))
+        }

Review Comment:
   Even after https://github.com/apache/arrow-datafusion/pull/5117 this still fails. I will file a follow on ticket
   
   ❯ select '2000-01-01T00:00:00'::timestamp::timestamptz = '2000-01-01T00:00:00';
   Internal("The type of Timestamp(Nanosecond, Some(\"+00:00\")) Eq Utf8 of binary physical should be same")



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] comphead commented on a diff in pull request #5140: Date to Timestamp cast

Posted by "comphead (via GitHub)" <gi...@apache.org>.
comphead commented on code in PR #5140:
URL: https://github.com/apache/arrow-datafusion/pull/5140#discussion_r1093485681


##########
datafusion/core/tests/sql/timestamp.rs:
##########
@@ -1742,5 +1741,55 @@ async fn test_ts_dt_binary_ops() -> Result<()> {
 
     assert_batches_eq!(expected, &results);
 
+    //test cast path timestamp date using literals
+    let sql = "select '2000-01-01'::timestamp >= '2000-01-01'::date";

Review Comment:
   Thanks @alamb I will create a separate ticket to move all timestamp tests from timestamp.rs to sqllogictest, all that can be moved of course



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] waitingkuo commented on a diff in pull request #5140: Date to Timestamp cast

Posted by "waitingkuo (via GitHub)" <gi...@apache.org>.
waitingkuo commented on code in PR #5140:
URL: https://github.com/apache/arrow-datafusion/pull/5140#discussion_r1094229758


##########
datafusion/expr/src/type_coercion/binary.rs:
##########
@@ -541,33 +541,30 @@ fn is_time_with_valid_unit(datatype: DataType) -> bool {
 fn temporal_coercion(lhs_type: &DataType, rhs_type: &DataType) -> Option<DataType> {
     use arrow::datatypes::DataType::*;
     match (lhs_type, rhs_type) {
-        (Date64, Date32) => Some(Date64),
-        (Date32, Date64) => Some(Date64),
-        (Utf8, Date32) => Some(Date32),
-        (Date32, Utf8) => Some(Date32),
-        (Utf8, Date64) => Some(Date64),
-        (Date64, Utf8) => Some(Date64),
-        (Utf8, Time32(unit)) => match is_time_with_valid_unit(Time32(unit.clone())) {
-            false => None,
-            true => Some(Time32(unit.clone())),
-        },
-        (Time32(unit), Utf8) => match is_time_with_valid_unit(Time32(unit.clone())) {
-            false => None,
-            true => Some(Time32(unit.clone())),
-        },
-        (Utf8, Time64(unit)) => match is_time_with_valid_unit(Time64(unit.clone())) {
-            false => None,
-            true => Some(Time64(unit.clone())),
-        },
-        (Time64(unit), Utf8) => match is_time_with_valid_unit(Time64(unit.clone())) {
-            false => None,
-            true => Some(Time64(unit.clone())),
-        },
-        (Timestamp(_, tz), Utf8) => Some(Timestamp(TimeUnit::Nanosecond, tz.clone())),
-        (Utf8, Timestamp(_, tz)) => Some(Timestamp(TimeUnit::Nanosecond, tz.clone())),
-        // TODO: need to investigate the result type for the comparison between timestamp and date
-        (Timestamp(_, _), Date32) => Some(Date32),
-        (Timestamp(_, _), Date64) => Some(Date64),
+        (Date64, Date32) | (Date32, Date64) => Some(Date64),
+        (Utf8, Date32) | (Date32, Utf8) => Some(Date32),
+        (Utf8, Date64) | (Date64, Utf8) => Some(Date64),
+        (Utf8, Time32(unit)) | (Time32(unit), Utf8) => {
+            match is_time_with_valid_unit(Time32(unit.clone())) {
+                false => None,
+                true => Some(Time32(unit.clone())),
+            }
+        }
+        (Utf8, Time64(unit)) | (Time64(unit), Utf8) => {
+            match is_time_with_valid_unit(Time64(unit.clone())) {
+                false => None,
+                true => Some(Time64(unit.clone())),
+            }
+        }
+        (Timestamp(_, tz), Utf8) | (Utf8, Timestamp(_, tz)) => {
+            Some(Timestamp(TimeUnit::Nanosecond, tz.clone()))
+        }
+        (Timestamp(_, None), Date32) | (Date32, Timestamp(_, None)) => {
+            Some(Timestamp(TimeUnit::Nanosecond, None))
+        }
+        (Timestamp(_, _tz), Date32) | (Date32, Timestamp(_, _tz)) => {
+            Some(Timestamp(TimeUnit::Nanosecond, None))
+        }

Review Comment:
   does it mean that we convert Timestamptz to Timestamp and then compare?



##########
datafusion/expr/src/type_coercion/binary.rs:
##########
@@ -541,33 +541,30 @@ fn is_time_with_valid_unit(datatype: DataType) -> bool {
 fn temporal_coercion(lhs_type: &DataType, rhs_type: &DataType) -> Option<DataType> {
     use arrow::datatypes::DataType::*;
     match (lhs_type, rhs_type) {
-        (Date64, Date32) => Some(Date64),
-        (Date32, Date64) => Some(Date64),
-        (Utf8, Date32) => Some(Date32),
-        (Date32, Utf8) => Some(Date32),
-        (Utf8, Date64) => Some(Date64),
-        (Date64, Utf8) => Some(Date64),
-        (Utf8, Time32(unit)) => match is_time_with_valid_unit(Time32(unit.clone())) {
-            false => None,
-            true => Some(Time32(unit.clone())),
-        },
-        (Time32(unit), Utf8) => match is_time_with_valid_unit(Time32(unit.clone())) {
-            false => None,
-            true => Some(Time32(unit.clone())),
-        },
-        (Utf8, Time64(unit)) => match is_time_with_valid_unit(Time64(unit.clone())) {
-            false => None,
-            true => Some(Time64(unit.clone())),
-        },
-        (Time64(unit), Utf8) => match is_time_with_valid_unit(Time64(unit.clone())) {
-            false => None,
-            true => Some(Time64(unit.clone())),
-        },
-        (Timestamp(_, tz), Utf8) => Some(Timestamp(TimeUnit::Nanosecond, tz.clone())),
-        (Utf8, Timestamp(_, tz)) => Some(Timestamp(TimeUnit::Nanosecond, tz.clone())),
-        // TODO: need to investigate the result type for the comparison between timestamp and date
-        (Timestamp(_, _), Date32) => Some(Date32),
-        (Timestamp(_, _), Date64) => Some(Date64),
+        (Date64, Date32) | (Date32, Date64) => Some(Date64),
+        (Utf8, Date32) | (Date32, Utf8) => Some(Date32),
+        (Utf8, Date64) | (Date64, Utf8) => Some(Date64),
+        (Utf8, Time32(unit)) | (Time32(unit), Utf8) => {
+            match is_time_with_valid_unit(Time32(unit.clone())) {
+                false => None,
+                true => Some(Time32(unit.clone())),
+            }
+        }
+        (Utf8, Time64(unit)) | (Time64(unit), Utf8) => {
+            match is_time_with_valid_unit(Time64(unit.clone())) {
+                false => None,
+                true => Some(Time64(unit.clone())),
+            }
+        }
+        (Timestamp(_, tz), Utf8) | (Utf8, Timestamp(_, tz)) => {
+            Some(Timestamp(TimeUnit::Nanosecond, tz.clone()))
+        }

Review Comment:
   i tried this, but it didn't work
   ```bash
   ❯ select '2000-01-01T00:00:00'::timestamp::timestamptz = '2000-01-01T00:00:00';
   Internal("The type of Timestamp(Nanosecond, Some(\"+00:00\")) Eq Utf8 of binary physical should be same")
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] comphead commented on a diff in pull request #5140: Date to Timestamp cast

Posted by "comphead (via GitHub)" <gi...@apache.org>.
comphead commented on code in PR #5140:
URL: https://github.com/apache/arrow-datafusion/pull/5140#discussion_r1094785121


##########
datafusion/expr/src/type_coercion/binary.rs:
##########
@@ -541,33 +541,30 @@ fn is_time_with_valid_unit(datatype: DataType) -> bool {
 fn temporal_coercion(lhs_type: &DataType, rhs_type: &DataType) -> Option<DataType> {
     use arrow::datatypes::DataType::*;
     match (lhs_type, rhs_type) {
-        (Date64, Date32) => Some(Date64),
-        (Date32, Date64) => Some(Date64),
-        (Utf8, Date32) => Some(Date32),
-        (Date32, Utf8) => Some(Date32),
-        (Utf8, Date64) => Some(Date64),
-        (Date64, Utf8) => Some(Date64),
-        (Utf8, Time32(unit)) => match is_time_with_valid_unit(Time32(unit.clone())) {
-            false => None,
-            true => Some(Time32(unit.clone())),
-        },
-        (Time32(unit), Utf8) => match is_time_with_valid_unit(Time32(unit.clone())) {
-            false => None,
-            true => Some(Time32(unit.clone())),
-        },
-        (Utf8, Time64(unit)) => match is_time_with_valid_unit(Time64(unit.clone())) {
-            false => None,
-            true => Some(Time64(unit.clone())),
-        },
-        (Time64(unit), Utf8) => match is_time_with_valid_unit(Time64(unit.clone())) {
-            false => None,
-            true => Some(Time64(unit.clone())),
-        },
-        (Timestamp(_, tz), Utf8) => Some(Timestamp(TimeUnit::Nanosecond, tz.clone())),
-        (Utf8, Timestamp(_, tz)) => Some(Timestamp(TimeUnit::Nanosecond, tz.clone())),
-        // TODO: need to investigate the result type for the comparison between timestamp and date
-        (Timestamp(_, _), Date32) => Some(Date32),
-        (Timestamp(_, _), Date64) => Some(Date64),
+        (Date64, Date32) | (Date32, Date64) => Some(Date64),
+        (Utf8, Date32) | (Date32, Utf8) => Some(Date32),
+        (Utf8, Date64) | (Date64, Utf8) => Some(Date64),
+        (Utf8, Time32(unit)) | (Time32(unit), Utf8) => {
+            match is_time_with_valid_unit(Time32(unit.clone())) {
+                false => None,
+                true => Some(Time32(unit.clone())),
+            }
+        }
+        (Utf8, Time64(unit)) | (Time64(unit), Utf8) => {
+            match is_time_with_valid_unit(Time64(unit.clone())) {
+                false => None,
+                true => Some(Time64(unit.clone())),
+            }
+        }
+        (Timestamp(_, tz), Utf8) | (Utf8, Timestamp(_, tz)) => {
+            Some(Timestamp(TimeUnit::Nanosecond, tz.clone()))
+        }

Review Comment:
   We need another cast to add Utf <-> Timestamp #5117 probably addresses this cast. If not I'll create another ticket



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] comphead commented on a diff in pull request #5140: Date to Timestamp cast

Posted by "comphead (via GitHub)" <gi...@apache.org>.
comphead commented on code in PR #5140:
URL: https://github.com/apache/arrow-datafusion/pull/5140#discussion_r1094911548


##########
datafusion/core/tests/sql/timestamp.rs:
##########
@@ -1742,5 +1741,55 @@ async fn test_ts_dt_binary_ops() -> Result<()> {
 
     assert_batches_eq!(expected, &results);
 
+    //test cast path timestamp date using literals
+    let sql = "select '2000-01-01'::timestamp >= '2000-01-01'::date";

Review Comment:
   Filed https://github.com/apache/arrow-datafusion/issues/5165



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb merged pull request #5140: Date to Timestamp cast

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb merged PR #5140:
URL: https://github.com/apache/arrow-datafusion/pull/5140


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] comphead commented on pull request #5140: Date to Timestamp cast

Posted by "comphead (via GitHub)" <gi...@apache.org>.
comphead commented on PR #5140:
URL: https://github.com/apache/arrow-datafusion/pull/5140#issuecomment-1411286032

   @liukun4515 @crepererum @alamb please check this one.
   
   The downside for now is additional cast created sometimes for timestamp, to convert `Timestamp(_, Some("00:00"))` to `Timestamp(_, None)`. This can be solved in separate PR if you dont mind as it might again require changes in arrow-rs


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb commented on pull request #5140: Date to Timestamp cast

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on PR #5140:
URL: https://github.com/apache/arrow-datafusion/pull/5140#issuecomment-1413811107

   > I think we could consider adding this coercion as well as it works in postgrseql (perhaps in another tickets)
   (Utf8, Date)
   
   I agree -- I can do so. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] comphead commented on a diff in pull request #5140: Date to Timestamp cast

Posted by "comphead (via GitHub)" <gi...@apache.org>.
comphead commented on code in PR #5140:
URL: https://github.com/apache/arrow-datafusion/pull/5140#discussion_r1094788742


##########
datafusion/expr/src/type_coercion/binary.rs:
##########
@@ -541,33 +541,30 @@ fn is_time_with_valid_unit(datatype: DataType) -> bool {
 fn temporal_coercion(lhs_type: &DataType, rhs_type: &DataType) -> Option<DataType> {
     use arrow::datatypes::DataType::*;
     match (lhs_type, rhs_type) {
-        (Date64, Date32) => Some(Date64),
-        (Date32, Date64) => Some(Date64),
-        (Utf8, Date32) => Some(Date32),
-        (Date32, Utf8) => Some(Date32),
-        (Utf8, Date64) => Some(Date64),
-        (Date64, Utf8) => Some(Date64),
-        (Utf8, Time32(unit)) => match is_time_with_valid_unit(Time32(unit.clone())) {
-            false => None,
-            true => Some(Time32(unit.clone())),
-        },
-        (Time32(unit), Utf8) => match is_time_with_valid_unit(Time32(unit.clone())) {
-            false => None,
-            true => Some(Time32(unit.clone())),
-        },
-        (Utf8, Time64(unit)) => match is_time_with_valid_unit(Time64(unit.clone())) {
-            false => None,
-            true => Some(Time64(unit.clone())),
-        },
-        (Time64(unit), Utf8) => match is_time_with_valid_unit(Time64(unit.clone())) {
-            false => None,
-            true => Some(Time64(unit.clone())),
-        },
-        (Timestamp(_, tz), Utf8) => Some(Timestamp(TimeUnit::Nanosecond, tz.clone())),
-        (Utf8, Timestamp(_, tz)) => Some(Timestamp(TimeUnit::Nanosecond, tz.clone())),
-        // TODO: need to investigate the result type for the comparison between timestamp and date
-        (Timestamp(_, _), Date32) => Some(Date32),
-        (Timestamp(_, _), Date64) => Some(Date64),
+        (Date64, Date32) | (Date32, Date64) => Some(Date64),
+        (Utf8, Date32) | (Date32, Utf8) => Some(Date32),
+        (Utf8, Date64) | (Date64, Utf8) => Some(Date64),
+        (Utf8, Time32(unit)) | (Time32(unit), Utf8) => {
+            match is_time_with_valid_unit(Time32(unit.clone())) {
+                false => None,
+                true => Some(Time32(unit.clone())),
+            }
+        }
+        (Utf8, Time64(unit)) | (Time64(unit), Utf8) => {
+            match is_time_with_valid_unit(Time64(unit.clone())) {
+                false => None,
+                true => Some(Time64(unit.clone())),
+            }
+        }
+        (Timestamp(_, tz), Utf8) | (Utf8, Timestamp(_, tz)) => {
+            Some(Timestamp(TimeUnit::Nanosecond, tz.clone()))
+        }
+        (Timestamp(_, None), Date32) | (Date32, Timestamp(_, None)) => {
+            Some(Timestamp(TimeUnit::Nanosecond, None))
+        }
+        (Timestamp(_, _tz), Date32) | (Date32, Timestamp(_, _tz)) => {
+            Some(Timestamp(TimeUnit::Nanosecond, None))
+        }

Review Comment:
   Thats correct, we lose TZ for now, we need some extra efforts to make arrow-rs work with TimestampTZ,planning to do so just after this PR merged.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #5140: Date to Timestamp cast

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on code in PR #5140:
URL: https://github.com/apache/arrow-datafusion/pull/5140#discussion_r1094826439


##########
datafusion/core/tests/sql/timestamp.rs:
##########
@@ -1742,5 +1741,55 @@ async fn test_ts_dt_binary_ops() -> Result<()> {
 
     assert_batches_eq!(expected, &results);
 
+    //test cast path timestamp date using literals
+    let sql = "select '2000-01-01'::timestamp >= '2000-01-01'::date";

Review Comment:
   Thank you very much -- that would be great



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #5140: Date to Timestamp cast

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on code in PR #5140:
URL: https://github.com/apache/arrow-datafusion/pull/5140#discussion_r1093136598


##########
datafusion/core/tests/sql/timestamp.rs:
##########
@@ -1742,5 +1741,55 @@ async fn test_ts_dt_binary_ops() -> Result<()> {
 
     assert_batches_eq!(expected, &results);
 
+    //test cast path timestamp date using literals
+    let sql = "select '2000-01-01'::timestamp >= '2000-01-01'::date";

Review Comment:
   You might be able to add these tests to sqllogictest if you wanted -- https://github.com/apache/arrow-datafusion/blob/master/datafusion/core/tests/sqllogictests/test_files/timestamps.slt 
   
   Not required but I find the sqllogictest framework much easier to work with



##########
datafusion/core/tests/sql/timestamp.rs:
##########
@@ -1742,5 +1741,55 @@ async fn test_ts_dt_binary_ops() -> Result<()> {
 
     assert_batches_eq!(expected, &results);
 
+    //test cast path timestamp date using literals
+    let sql = "select '2000-01-01'::timestamp >= '2000-01-01'::date";
+    let df = ctx.sql(sql).await.unwrap();
+
+    let plan = df.explain(true, false)?.collect().await?;
+    let batch = &plan[0];
+    let mut res: Option<String> = None;
+    for row in 0..batch.num_rows() {
+        if &array_value_to_string(batch.column(0), row)?
+            == "logical_plan after type_coercion"
+        {
+            res = Some(array_value_to_string(batch.column(1), row)?);
+            break;
+        }
+    }
+    assert_eq!(res, Some("Projection: CAST(Utf8(\"2000-01-01\") AS Timestamp(Nanosecond, None)) >= CAST(CAST(Utf8(\"2000-01-01\") AS Date32) AS Timestamp(Nanosecond, None))\n  EmptyRelation".to_string()));
+
+    //test cast path timestamp date using function
+    let sql = "select now() >= '2000-01-01'::date";
+    let df = ctx.sql(sql).await.unwrap();
+
+    let plan = df.explain(true, false)?.collect().await?;
+    let batch = &plan[0];
+    let mut res: Option<String> = None;
+    for row in 0..batch.num_rows() {
+        if &array_value_to_string(batch.column(0), row)?
+            == "logical_plan after type_coercion"
+        {
+            res = Some(array_value_to_string(batch.column(1), row)?);
+            break;
+        }
+    }
+    assert_eq!(res, Some("Projection: CAST(now() AS Timestamp(Nanosecond, None)) >= CAST(CAST(Utf8(\"2000-01-01\") AS Date32) AS Timestamp(Nanosecond, None))\n  EmptyRelation".to_string()));
+
+    let sql = "select now() = current_date()";
+    let df = ctx.sql(sql).await.unwrap();
+
+    let plan = df.explain(true, false)?.collect().await?;
+    let batch = &plan[0];
+    let mut res: Option<String> = None;
+    for row in 0..batch.num_rows() {
+        if &array_value_to_string(batch.column(0), row)?
+            == "logical_plan after type_coercion"
+        {
+            res = Some(array_value_to_string(batch.column(1), row)?);
+            break;
+        }
+    }
+    assert_eq!(res, Some("Projection: CAST(now() AS Timestamp(Nanosecond, None)) = CAST(currentdate() AS Timestamp(Nanosecond, None))\n  EmptyRelation".to_string()));

Review Comment:
   👍 



##########
datafusion/expr/src/type_coercion/functions.rs:
##########
@@ -174,8 +174,8 @@ pub fn can_coerce_from(type_into: &DataType, type_from: &DataType) -> bool {
                 | Float64
                 | Decimal128(_, _)
         ),
-        Timestamp(TimeUnit::Nanosecond, None) => {
-            matches!(type_from, Null | Timestamp(_, None))
+        Timestamp(TimeUnit::Nanosecond, _) => {
+            matches!(type_from, Null | Timestamp(_, _) | Date32)

Review Comment:
   👍  (I have a draft PR in the works to support coercing from `utf8` as well https://github.com/apache/arrow-datafusion/pull/5117)
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb commented on pull request #5140: Date to Timestamp cast

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on PR #5140:
URL: https://github.com/apache/arrow-datafusion/pull/5140#issuecomment-1411963160

   cc @crepererum  and @liukun4515  who filed the tickets this closes


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #5140: Date to Timestamp cast

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on code in PR #5140:
URL: https://github.com/apache/arrow-datafusion/pull/5140#discussion_r1094847517


##########
datafusion/expr/src/type_coercion/binary.rs:
##########
@@ -541,33 +541,30 @@ fn is_time_with_valid_unit(datatype: DataType) -> bool {
 fn temporal_coercion(lhs_type: &DataType, rhs_type: &DataType) -> Option<DataType> {
     use arrow::datatypes::DataType::*;
     match (lhs_type, rhs_type) {
-        (Date64, Date32) => Some(Date64),
-        (Date32, Date64) => Some(Date64),
-        (Utf8, Date32) => Some(Date32),
-        (Date32, Utf8) => Some(Date32),
-        (Utf8, Date64) => Some(Date64),
-        (Date64, Utf8) => Some(Date64),
-        (Utf8, Time32(unit)) => match is_time_with_valid_unit(Time32(unit.clone())) {
-            false => None,
-            true => Some(Time32(unit.clone())),
-        },
-        (Time32(unit), Utf8) => match is_time_with_valid_unit(Time32(unit.clone())) {
-            false => None,
-            true => Some(Time32(unit.clone())),
-        },
-        (Utf8, Time64(unit)) => match is_time_with_valid_unit(Time64(unit.clone())) {
-            false => None,
-            true => Some(Time64(unit.clone())),
-        },
-        (Time64(unit), Utf8) => match is_time_with_valid_unit(Time64(unit.clone())) {
-            false => None,
-            true => Some(Time64(unit.clone())),
-        },
-        (Timestamp(_, tz), Utf8) => Some(Timestamp(TimeUnit::Nanosecond, tz.clone())),
-        (Utf8, Timestamp(_, tz)) => Some(Timestamp(TimeUnit::Nanosecond, tz.clone())),
-        // TODO: need to investigate the result type for the comparison between timestamp and date
-        (Timestamp(_, _), Date32) => Some(Date32),
-        (Timestamp(_, _), Date64) => Some(Date64),
+        (Date64, Date32) | (Date32, Date64) => Some(Date64),
+        (Utf8, Date32) | (Date32, Utf8) => Some(Date32),
+        (Utf8, Date64) | (Date64, Utf8) => Some(Date64),
+        (Utf8, Time32(unit)) | (Time32(unit), Utf8) => {
+            match is_time_with_valid_unit(Time32(unit.clone())) {
+                false => None,
+                true => Some(Time32(unit.clone())),
+            }
+        }
+        (Utf8, Time64(unit)) | (Time64(unit), Utf8) => {
+            match is_time_with_valid_unit(Time64(unit.clone())) {
+                false => None,
+                true => Some(Time64(unit.clone())),
+            }
+        }
+        (Timestamp(_, tz), Utf8) | (Utf8, Timestamp(_, tz)) => {
+            Some(Timestamp(TimeUnit::Nanosecond, tz.clone()))
+        }

Review Comment:
   Filed https://github.com/apache/arrow-datafusion/issues/5164



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb commented on pull request #5140: Date to Timestamp cast

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on PR #5140:
URL: https://github.com/apache/arrow-datafusion/pull/5140#issuecomment-1414111470

   Filed https://github.com/apache/arrow-datafusion/issues/5164


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb commented on pull request #5140: Date to Timestamp cast

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on PR #5140:
URL: https://github.com/apache/arrow-datafusion/pull/5140#issuecomment-1412143332

   > The downside for now is additional cast created sometimes for timestamp, to convert Timestamp(_, Some("00:00")) to Timestamp(_, None). This can be solved in separate PR if you dont mind as it might again require changes in arrow-rs
   
   I think this is correct (and needed) if the timestamps in question have different timezones. It should not really effect runtime performance as I expect both to be evaluated at plan time


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb commented on pull request #5140: Date to Timestamp cast

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on PR #5140:
URL: https://github.com/apache/arrow-datafusion/pull/5140#issuecomment-1412819646

   I plan to merge this PR tomorrow unless there are other comments


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] ursabot commented on pull request #5140: Date to Timestamp cast

Posted by "ursabot (via GitHub)" <gi...@apache.org>.
ursabot commented on PR #5140:
URL: https://github.com/apache/arrow-datafusion/pull/5140#issuecomment-1414101269

   Benchmark runs are scheduled for baseline = f0c095802f292dbee891712f31bfd20d32b54336 and contender = e41c4dffa8699165c90dcb5127cd2909869e2f58. e41c4dffa8699165c90dcb5127cd2909869e2f58 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
   Conbench compare runs links:
   [Skipped :warning: Benchmarking of arrow-datafusion-commits is not supported on ec2-t3-xlarge-us-east-2] [ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/36b335104ea84eae9cd035cf4962dc79...c44ac284586948cbac0cddd140b0816a/)
   [Skipped :warning: Benchmarking of arrow-datafusion-commits is not supported on test-mac-arm] [test-mac-arm](https://conbench.ursa.dev/compare/runs/1763ea57388d4ca889e97cf1f29d7507...e0c8cf78b2d846458363319f96321f32/)
   [Skipped :warning: Benchmarking of arrow-datafusion-commits is not supported on ursa-i9-9960x] [ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/4fd1450fc0434575a63106f626e1bfc0...708644587d24415c8bfe0ecf7d9ac792/)
   [Skipped :warning: Benchmarking of arrow-datafusion-commits is not supported on ursa-thinkcentre-m75q] [ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/31d10ac0a820424c98d6777f1fded120...c8d93bccb943477e997da96b7d7da0ec/)
   Buildkite builds:
   Supported benchmarks:
   ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
   test-mac-arm: Supported benchmark langs: C++, Python, R
   ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
   ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org