You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by "Weijun-H (via GitHub)" <gi...@apache.org> on 2023/06/12 21:33:11 UTC

[GitHub] [arrow-datafusion] Weijun-H opened a new pull request, #6654: fix: 'datatrunc' return inconsistent type

Weijun-H opened a new pull request, #6654:
URL: https://github.com/apache/arrow-datafusion/pull/6654

   # Which issue does this PR close?
   
   <!--
   We generally require a GitHub issue to be filed for all bug fixes and enhancements and this helps us generate change logs for our releases. You can link an issue to this PR using the GitHub syntax. For example `Closes #123` indicates that this PR will close issue #123.
   -->
   
   Closes #6653
   
   # Rationale for this change
   
   <!--
    Why are you proposing this change? If this is already explained clearly in the issue then this section is not needed.
    Explaining clearly why changes are proposed helps reviewers understand your changes and offer better suggestions for fixes.  
   -->
   
   # What changes are included in this PR?
   
   <!--
   There is no need to duplicate the description in the issue here but it is sometimes worth providing a summary of the individual changes in this PR.
   -->
   
   # Are these changes tested?
   
   <!--
   We typically require tests for all PRs in order to:
   1. Prevent the code from being accidentally broken by subsequent changes
   2. Serve as another way to document the expected behavior of the code
   
   If tests are not included in your PR, please explain why (for example, are they covered by existing tests)?
   -->
   
   # Are there any user-facing changes?
   
   <!--
   If there are user-facing changes then we may require documentation to be updated before approving the PR.
   -->
   
   <!--
   If there are any breaking changes to public APIs, please add the `api change` label.
   -->


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] viirya commented on a diff in pull request #6654: fix: 'datatrunc' return inconsistent type

Posted by "viirya (via GitHub)" <gi...@apache.org>.
viirya commented on code in PR #6654:
URL: https://github.com/apache/arrow-datafusion/pull/6654#discussion_r1227352646


##########
datafusion/physical-expr/src/datetime_expressions.rs:
##########
@@ -294,16 +294,24 @@ pub fn date_trunc(args: &[ColumnarValue]) -> Result<ColumnarValue> {
                 }
                 "second" => {
                     // trunc to second
-                    let mill = ScalarValue::TimestampNanosecond(

Review Comment:
   And the `signature` of `DateTrunc` should be updated, I think. Currently the second argument is only `Timestamp(Nanosecond, None)`. But I guess that this change wants to make output type as same as input type. So you probably want to add more input types to `signature`. Otherwise you cannot get the real datatype of second parameter but just the coerced type which is always `Timestamp(Nanosecond, None)`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb commented on pull request #6654: fix: 'datatrunc' return inconsistent type

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on PR #6654:
URL: https://github.com/apache/arrow-datafusion/pull/6654#issuecomment-1591931971

   marking this one as a draft -- I think if it is merged to main the tests will fail


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] viirya commented on a diff in pull request #6654: Make 'date_trunc' returns the same type as its input

Posted by "viirya (via GitHub)" <gi...@apache.org>.
viirya commented on code in PR #6654:
URL: https://github.com/apache/arrow-datafusion/pull/6654#discussion_r1233097208


##########
datafusion/physical-expr/src/datetime_expressions.rs:
##########
@@ -262,6 +262,65 @@ fn date_trunc_single(granularity: &str, value: i64) -> Result<i64> {
     Ok(value.unwrap().timestamp_nanos())
 }
 
+fn _date_trunc(
+    tu: TimeUnit,
+    value: &Option<i64>,
+    granularity: &str,
+    f: impl Fn(Option<i64>) -> Result<Option<i64>>,
+    tz_opt: &Option<Arc<str>>,
+) -> Result<ColumnarValue, DataFusionError> {
+    let scale = match tu {
+        TimeUnit::Second => 1_000_000_000,
+        TimeUnit::Millisecond => 1_000_000,
+        TimeUnit::Microsecond => 1_000,
+        TimeUnit::Nanosecond => 1,
+    };
+    let nano = (f)(Some(value.unwrap() * scale))?;
+    let result = match tu {
+        TimeUnit::Second => {
+            let value = match granularity {
+                "minute" => Some(nano.unwrap() / 1_000_000_000),
+                "second" => Some(nano.unwrap() / 1_000_000_000),
+                "millisecond" => Some(nano.unwrap() / 1_000_000),
+                "microsecond" => Some(nano.unwrap()),
+                _ => Some(nano.unwrap() / 1_000_000_000),
+            };
+            ScalarValue::TimestampSecond(value, tz_opt.clone())
+        }
+        TimeUnit::Millisecond => {
+            let value = match granularity {
+                "minute" => Some(nano.unwrap() / 1_000_000_000 * 1_000),
+                "second" => Some(nano.unwrap() / 1_000_000_000),

Review Comment:
   ```suggestion
                   "second" => Some(nano.unwrap() / 1_000_000_000 * 1_000),
   ```



##########
datafusion/physical-expr/src/datetime_expressions.rs:
##########
@@ -262,6 +262,65 @@ fn date_trunc_single(granularity: &str, value: i64) -> Result<i64> {
     Ok(value.unwrap().timestamp_nanos())
 }
 
+fn _date_trunc(
+    tu: TimeUnit,
+    value: &Option<i64>,
+    granularity: &str,
+    f: impl Fn(Option<i64>) -> Result<Option<i64>>,
+    tz_opt: &Option<Arc<str>>,
+) -> Result<ColumnarValue, DataFusionError> {
+    let scale = match tu {
+        TimeUnit::Second => 1_000_000_000,
+        TimeUnit::Millisecond => 1_000_000,
+        TimeUnit::Microsecond => 1_000,
+        TimeUnit::Nanosecond => 1,
+    };
+    let nano = (f)(Some(value.unwrap() * scale))?;
+    let result = match tu {
+        TimeUnit::Second => {
+            let value = match granularity {
+                "minute" => Some(nano.unwrap() / 1_000_000_000),
+                "second" => Some(nano.unwrap() / 1_000_000_000),
+                "millisecond" => Some(nano.unwrap() / 1_000_000),
+                "microsecond" => Some(nano.unwrap()),
+                _ => Some(nano.unwrap() / 1_000_000_000),
+            };
+            ScalarValue::TimestampSecond(value, tz_opt.clone())
+        }
+        TimeUnit::Millisecond => {
+            let value = match granularity {
+                "minute" => Some(nano.unwrap() / 1_000_000_000 * 1_000),
+                "second" => Some(nano.unwrap() / 1_000_000_000),
+                "millisecond" => Some(nano.unwrap() / 1_000_000),
+                "microsecond" => Some(nano.unwrap() / 1_000),
+                _ => Some(nano.unwrap() / 1_000_000),
+            };
+            ScalarValue::TimestampMillisecond(value, tz_opt.clone())
+        }
+        TimeUnit::Microsecond => {
+            let value = match granularity {
+                "minute" => Some(nano.unwrap() / 1_000_000_000 * 1_000_000),
+                "second" => Some(nano.unwrap() / 1_000_000 * 1_000),
+                "millisecond" => Some(nano.unwrap() / 1_000 * 1_000),
+                "microsecond" => Some(nano.unwrap() / 1_000),
+                _ => Some(nano.unwrap() / 1_000),
+            };
+            ScalarValue::TimestampMicrosecond(value, tz_opt.clone())
+        }
+        _ => {
+            let value = match granularity {
+                "minute" => Some(nano.unwrap() / 1_000_000_000 * 1_000_000_000),
+                "second" => Some(nano.unwrap() / 1_000_000 * 1_000_000),
+                "millisecond" => Some(nano.unwrap() / 1_000 * 1_000),

Review Comment:
   ```suggestion
                   "millisecond" => Some(nano.unwrap() / 1_000_000 * 1_000_000),
   ```



##########
datafusion/physical-expr/src/datetime_expressions.rs:
##########
@@ -262,6 +262,65 @@ fn date_trunc_single(granularity: &str, value: i64) -> Result<i64> {
     Ok(value.unwrap().timestamp_nanos())
 }
 
+fn _date_trunc(
+    tu: TimeUnit,
+    value: &Option<i64>,
+    granularity: &str,
+    f: impl Fn(Option<i64>) -> Result<Option<i64>>,
+    tz_opt: &Option<Arc<str>>,
+) -> Result<ColumnarValue, DataFusionError> {
+    let scale = match tu {
+        TimeUnit::Second => 1_000_000_000,
+        TimeUnit::Millisecond => 1_000_000,
+        TimeUnit::Microsecond => 1_000,
+        TimeUnit::Nanosecond => 1,
+    };
+    let nano = (f)(Some(value.unwrap() * scale))?;
+    let result = match tu {
+        TimeUnit::Second => {
+            let value = match granularity {
+                "minute" => Some(nano.unwrap() / 1_000_000_000),
+                "second" => Some(nano.unwrap() / 1_000_000_000),
+                "millisecond" => Some(nano.unwrap() / 1_000_000),
+                "microsecond" => Some(nano.unwrap()),
+                _ => Some(nano.unwrap() / 1_000_000_000),
+            };
+            ScalarValue::TimestampSecond(value, tz_opt.clone())

Review Comment:
   ```suggestion
               let value = Some(nano.unwrap() / 1_000_000_000);
               ScalarValue::TimestampSecond(value, tz_opt.clone())
   ```
   
   As output unit is second, smaller granularity can be truncated simply.



##########
datafusion/physical-expr/src/datetime_expressions.rs:
##########
@@ -262,6 +262,65 @@ fn date_trunc_single(granularity: &str, value: i64) -> Result<i64> {
     Ok(value.unwrap().timestamp_nanos())
 }
 
+fn _date_trunc(
+    tu: TimeUnit,
+    value: &Option<i64>,
+    granularity: &str,
+    f: impl Fn(Option<i64>) -> Result<Option<i64>>,
+    tz_opt: &Option<Arc<str>>,
+) -> Result<ColumnarValue, DataFusionError> {
+    let scale = match tu {
+        TimeUnit::Second => 1_000_000_000,
+        TimeUnit::Millisecond => 1_000_000,
+        TimeUnit::Microsecond => 1_000,
+        TimeUnit::Nanosecond => 1,
+    };
+    let nano = (f)(Some(value.unwrap() * scale))?;
+    let result = match tu {
+        TimeUnit::Second => {
+            let value = match granularity {
+                "minute" => Some(nano.unwrap() / 1_000_000_000),
+                "second" => Some(nano.unwrap() / 1_000_000_000),
+                "millisecond" => Some(nano.unwrap() / 1_000_000),
+                "microsecond" => Some(nano.unwrap()),
+                _ => Some(nano.unwrap() / 1_000_000_000),
+            };
+            ScalarValue::TimestampSecond(value, tz_opt.clone())
+        }
+        TimeUnit::Millisecond => {
+            let value = match granularity {
+                "minute" => Some(nano.unwrap() / 1_000_000_000 * 1_000),
+                "second" => Some(nano.unwrap() / 1_000_000_000),
+                "millisecond" => Some(nano.unwrap() / 1_000_000),
+                "microsecond" => Some(nano.unwrap() / 1_000),
+                _ => Some(nano.unwrap() / 1_000_000),
+            };
+            ScalarValue::TimestampMillisecond(value, tz_opt.clone())
+        }
+        TimeUnit::Microsecond => {
+            let value = match granularity {
+                "minute" => Some(nano.unwrap() / 1_000_000_000 * 1_000_000),
+                "second" => Some(nano.unwrap() / 1_000_000 * 1_000),

Review Comment:
   ```suggestion
                   "second" => Some(nano.unwrap() / 1_000_000_000 * 1_000_000),
   ```



##########
datafusion/physical-expr/src/datetime_expressions.rs:
##########
@@ -262,6 +262,65 @@ fn date_trunc_single(granularity: &str, value: i64) -> Result<i64> {
     Ok(value.unwrap().timestamp_nanos())
 }
 
+fn _date_trunc(
+    tu: TimeUnit,
+    value: &Option<i64>,
+    granularity: &str,
+    f: impl Fn(Option<i64>) -> Result<Option<i64>>,
+    tz_opt: &Option<Arc<str>>,
+) -> Result<ColumnarValue, DataFusionError> {
+    let scale = match tu {
+        TimeUnit::Second => 1_000_000_000,
+        TimeUnit::Millisecond => 1_000_000,
+        TimeUnit::Microsecond => 1_000,
+        TimeUnit::Nanosecond => 1,
+    };
+    let nano = (f)(Some(value.unwrap() * scale))?;
+    let result = match tu {
+        TimeUnit::Second => {
+            let value = match granularity {
+                "minute" => Some(nano.unwrap() / 1_000_000_000),
+                "second" => Some(nano.unwrap() / 1_000_000_000),
+                "millisecond" => Some(nano.unwrap() / 1_000_000),
+                "microsecond" => Some(nano.unwrap()),
+                _ => Some(nano.unwrap() / 1_000_000_000),
+            };
+            ScalarValue::TimestampSecond(value, tz_opt.clone())
+        }
+        TimeUnit::Millisecond => {
+            let value = match granularity {
+                "minute" => Some(nano.unwrap() / 1_000_000_000 * 1_000),
+                "second" => Some(nano.unwrap() / 1_000_000_000),
+                "millisecond" => Some(nano.unwrap() / 1_000_000),
+                "microsecond" => Some(nano.unwrap() / 1_000),
+                _ => Some(nano.unwrap() / 1_000_000),

Review Comment:
   ditto. For granularity smaller than millisecond can be truncated simply.
   
   ```suggestion
                   _ => Some(nano.unwrap() / 1_000_000),
   ```



##########
datafusion/physical-expr/src/datetime_expressions.rs:
##########
@@ -262,6 +262,65 @@ fn date_trunc_single(granularity: &str, value: i64) -> Result<i64> {
     Ok(value.unwrap().timestamp_nanos())
 }
 
+fn _date_trunc(
+    tu: TimeUnit,
+    value: &Option<i64>,
+    granularity: &str,
+    f: impl Fn(Option<i64>) -> Result<Option<i64>>,
+    tz_opt: &Option<Arc<str>>,
+) -> Result<ColumnarValue, DataFusionError> {
+    let scale = match tu {
+        TimeUnit::Second => 1_000_000_000,
+        TimeUnit::Millisecond => 1_000_000,
+        TimeUnit::Microsecond => 1_000,
+        TimeUnit::Nanosecond => 1,
+    };
+    let nano = (f)(Some(value.unwrap() * scale))?;
+    let result = match tu {
+        TimeUnit::Second => {
+            let value = match granularity {
+                "minute" => Some(nano.unwrap() / 1_000_000_000),
+                "second" => Some(nano.unwrap() / 1_000_000_000),
+                "millisecond" => Some(nano.unwrap() / 1_000_000),
+                "microsecond" => Some(nano.unwrap()),
+                _ => Some(nano.unwrap() / 1_000_000_000),
+            };
+            ScalarValue::TimestampSecond(value, tz_opt.clone())
+        }
+        TimeUnit::Millisecond => {
+            let value = match granularity {
+                "minute" => Some(nano.unwrap() / 1_000_000_000 * 1_000),
+                "second" => Some(nano.unwrap() / 1_000_000_000),
+                "millisecond" => Some(nano.unwrap() / 1_000_000),
+                "microsecond" => Some(nano.unwrap() / 1_000),
+                _ => Some(nano.unwrap() / 1_000_000),
+            };
+            ScalarValue::TimestampMillisecond(value, tz_opt.clone())
+        }
+        TimeUnit::Microsecond => {
+            let value = match granularity {
+                "minute" => Some(nano.unwrap() / 1_000_000_000 * 1_000_000),
+                "second" => Some(nano.unwrap() / 1_000_000 * 1_000),
+                "millisecond" => Some(nano.unwrap() / 1_000 * 1_000),
+                "microsecond" => Some(nano.unwrap() / 1_000),
+                _ => Some(nano.unwrap() / 1_000),
+            };
+            ScalarValue::TimestampMicrosecond(value, tz_opt.clone())
+        }
+        _ => {
+            let value = match granularity {
+                "minute" => Some(nano.unwrap() / 1_000_000_000 * 1_000_000_000),
+                "second" => Some(nano.unwrap() / 1_000_000 * 1_000_000),
+                "millisecond" => Some(nano.unwrap() / 1_000 * 1_000),
+                "microsecond" => Some(nano.unwrap()),

Review Comment:
   ```suggestion
                   "microsecond" => Some(nano.unwrap() / 1_000 * 1_000),
   ```



##########
datafusion/physical-expr/src/datetime_expressions.rs:
##########
@@ -262,6 +262,65 @@ fn date_trunc_single(granularity: &str, value: i64) -> Result<i64> {
     Ok(value.unwrap().timestamp_nanos())
 }
 
+fn _date_trunc(
+    tu: TimeUnit,
+    value: &Option<i64>,
+    granularity: &str,
+    f: impl Fn(Option<i64>) -> Result<Option<i64>>,
+    tz_opt: &Option<Arc<str>>,
+) -> Result<ColumnarValue, DataFusionError> {
+    let scale = match tu {
+        TimeUnit::Second => 1_000_000_000,
+        TimeUnit::Millisecond => 1_000_000,
+        TimeUnit::Microsecond => 1_000,
+        TimeUnit::Nanosecond => 1,
+    };
+    let nano = (f)(Some(value.unwrap() * scale))?;
+    let result = match tu {
+        TimeUnit::Second => {
+            let value = match granularity {
+                "minute" => Some(nano.unwrap() / 1_000_000_000),
+                "second" => Some(nano.unwrap() / 1_000_000_000),
+                "millisecond" => Some(nano.unwrap() / 1_000_000),
+                "microsecond" => Some(nano.unwrap()),
+                _ => Some(nano.unwrap() / 1_000_000_000),
+            };
+            ScalarValue::TimestampSecond(value, tz_opt.clone())
+        }
+        TimeUnit::Millisecond => {
+            let value = match granularity {
+                "minute" => Some(nano.unwrap() / 1_000_000_000 * 1_000),
+                "second" => Some(nano.unwrap() / 1_000_000_000),
+                "millisecond" => Some(nano.unwrap() / 1_000_000),
+                "microsecond" => Some(nano.unwrap() / 1_000),
+                _ => Some(nano.unwrap() / 1_000_000),
+            };
+            ScalarValue::TimestampMillisecond(value, tz_opt.clone())
+        }
+        TimeUnit::Microsecond => {
+            let value = match granularity {
+                "minute" => Some(nano.unwrap() / 1_000_000_000 * 1_000_000),
+                "second" => Some(nano.unwrap() / 1_000_000 * 1_000),
+                "millisecond" => Some(nano.unwrap() / 1_000 * 1_000),
+                "microsecond" => Some(nano.unwrap() / 1_000),
+                _ => Some(nano.unwrap() / 1_000),
+            };
+            ScalarValue::TimestampMicrosecond(value, tz_opt.clone())
+        }
+        _ => {
+            let value = match granularity {
+                "minute" => Some(nano.unwrap() / 1_000_000_000 * 1_000_000_000),
+                "second" => Some(nano.unwrap() / 1_000_000 * 1_000_000),

Review Comment:
   ```suggestion
                   "second" => Some(nano.unwrap() / 1_000_000_000 * 1_000_000_000),
   ```



##########
datafusion/physical-expr/src/datetime_expressions.rs:
##########
@@ -262,6 +262,65 @@ fn date_trunc_single(granularity: &str, value: i64) -> Result<i64> {
     Ok(value.unwrap().timestamp_nanos())
 }
 
+fn _date_trunc(
+    tu: TimeUnit,
+    value: &Option<i64>,
+    granularity: &str,
+    f: impl Fn(Option<i64>) -> Result<Option<i64>>,
+    tz_opt: &Option<Arc<str>>,
+) -> Result<ColumnarValue, DataFusionError> {
+    let scale = match tu {
+        TimeUnit::Second => 1_000_000_000,
+        TimeUnit::Millisecond => 1_000_000,
+        TimeUnit::Microsecond => 1_000,
+        TimeUnit::Nanosecond => 1,
+    };
+    let nano = (f)(Some(value.unwrap() * scale))?;
+    let result = match tu {
+        TimeUnit::Second => {
+            let value = match granularity {
+                "minute" => Some(nano.unwrap() / 1_000_000_000),
+                "second" => Some(nano.unwrap() / 1_000_000_000),
+                "millisecond" => Some(nano.unwrap() / 1_000_000),
+                "microsecond" => Some(nano.unwrap()),
+                _ => Some(nano.unwrap() / 1_000_000_000),
+            };
+            ScalarValue::TimestampSecond(value, tz_opt.clone())
+        }
+        TimeUnit::Millisecond => {
+            let value = match granularity {
+                "minute" => Some(nano.unwrap() / 1_000_000_000 * 1_000),
+                "second" => Some(nano.unwrap() / 1_000_000_000),
+                "millisecond" => Some(nano.unwrap() / 1_000_000),
+                "microsecond" => Some(nano.unwrap() / 1_000),
+                _ => Some(nano.unwrap() / 1_000_000),
+            };
+            ScalarValue::TimestampMillisecond(value, tz_opt.clone())
+        }
+        TimeUnit::Microsecond => {
+            let value = match granularity {
+                "minute" => Some(nano.unwrap() / 1_000_000_000 * 1_000_000),
+                "second" => Some(nano.unwrap() / 1_000_000 * 1_000),
+                "millisecond" => Some(nano.unwrap() / 1_000 * 1_000),

Review Comment:
   ```suggestion
                   "millisecond" => Some(nano.unwrap() / 1_000_000 * 1_000),
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] viirya commented on a diff in pull request #6654: Make 'date_trunc' returns the same type as its input

Posted by "viirya (via GitHub)" <gi...@apache.org>.
viirya commented on code in PR #6654:
URL: https://github.com/apache/arrow-datafusion/pull/6654#discussion_r1244380096


##########
datafusion/physical-expr/src/datetime_expressions.rs:
##########
@@ -215,13 +215,17 @@ fn quarter_month(date: &NaiveDateTime) -> u32 {
 }
 
 fn date_trunc_single(granularity: &str, value: i64) -> Result<i64> {
+    if granularity == "millisecond" || granularity == "microsecond" {
+        return Ok(value);
+    }

Review Comment:
   Hmm, is it? https://github.com/apache/arrow-datafusion/pull/6654#discussion_r1242712961 notes about the truncation of timestamps to seconds. But my question is why pulling "millisecond" and "microsecond" earlier which basically avoids `with_nanosecond(0)` only, compared with previous code.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] Weijun-H commented on a diff in pull request #6654: Make 'date_trunc' returns the same type as its input

Posted by "Weijun-H (via GitHub)" <gi...@apache.org>.
Weijun-H commented on code in PR #6654:
URL: https://github.com/apache/arrow-datafusion/pull/6654#discussion_r1244855952


##########
datafusion/physical-expr/src/datetime_expressions.rs:
##########
@@ -215,13 +215,17 @@ fn quarter_month(date: &NaiveDateTime) -> u32 {
 }
 
 fn date_trunc_single(granularity: &str, value: i64) -> Result<i64> {
+    if granularity == "millisecond" || granularity == "microsecond" {
+        return Ok(value);
+    }

Review Comment:
   Due to the utilization of `with_nanosecond(0)`, the timestamp experiences a reduction in granularity at the nanosecond level. But for "millisecond" and "microsecond", we need to keep this granularity and truncate it later.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] Weijun-H commented on a diff in pull request #6654: fix: 'datatrunc' return inconsistent type

Posted by "Weijun-H (via GitHub)" <gi...@apache.org>.
Weijun-H commented on code in PR #6654:
URL: https://github.com/apache/arrow-datafusion/pull/6654#discussion_r1231375293


##########
datafusion/physical-expr/src/datetime_expressions.rs:
##########
@@ -282,54 +341,64 @@ pub fn date_trunc(args: &[ColumnarValue]) -> Result<ColumnarValue> {
 
     Ok(match array {
         ColumnarValue::Scalar(ScalarValue::TimestampNanosecond(v, tz_opt)) => {
-            let nano = (f)(*v)?;
-            match granularity.as_str() {
-                "minute" => {
-                    // trunc to minute
-                    let second = ScalarValue::TimestampNanosecond(
-                        Some(nano.unwrap() / 1_000_000_000 * 1_000_000_000),
-                        tz_opt.clone(),
-                    );
-                    ColumnarValue::Scalar(second)
+            _date_trunc(TimeUnit::Nanosecond, v, granularity.as_str(), f, tz_opt)?
+        }
+        ColumnarValue::Scalar(ScalarValue::TimestampMicrosecond(v, tz_opt)) => {
+            _date_trunc(TimeUnit::Microsecond, v, granularity.as_str(), f, tz_opt)?
+        }
+        ColumnarValue::Scalar(ScalarValue::TimestampMillisecond(v, tz_opt)) => {
+            _date_trunc(TimeUnit::Millisecond, v, granularity.as_str(), f, tz_opt)?
+        }
+        ColumnarValue::Scalar(ScalarValue::TimestampSecond(v, tz_opt)) => {
+            _date_trunc(TimeUnit::Second, v, granularity.as_str(), f, tz_opt)?
+        }
+        ColumnarValue::Array(array) => {

Review Comment:
   For Array, should it return the values based on the input type or just return `TimestampNanoSecond`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #6654: Make 'date_trunc' returns the same type as its input

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on code in PR #6654:
URL: https://github.com/apache/arrow-datafusion/pull/6654#discussion_r1242772016


##########
datafusion/physical-expr/src/datetime_expressions.rs:
##########
@@ -827,7 +912,7 @@ mod tests {
             (
                 "2020-09-08T13:42:29.190855Z",
                 "second",
-                "2020-09-08T13:42:29.000000Z",
+                "2020-09-08T13:42:29.190855Z",

Review Comment:
   Perhaps then we can update the test to call `_date_trunc`
   
   ```
               let result = date_trunc_single(granularity, left).unwrap();
   ``` 
   
   to 
   ```
               let result = _date_trunc(granularity, left).unwrap();
   ```
   
   As written is seems to have the wrong answer



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] viirya commented on a diff in pull request #6654: Make 'date_trunc' returns the same type as its input

Posted by "viirya (via GitHub)" <gi...@apache.org>.
viirya commented on code in PR #6654:
URL: https://github.com/apache/arrow-datafusion/pull/6654#discussion_r1244324968


##########
datafusion/physical-expr/src/datetime_expressions.rs:
##########
@@ -215,13 +215,17 @@ fn quarter_month(date: &NaiveDateTime) -> u32 {
 }
 
 fn date_trunc_single(granularity: &str, value: i64) -> Result<i64> {
+    if granularity == "millisecond" || granularity == "microsecond" {
+        return Ok(value);
+    }

Review Comment:
   Hm? Why this change?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #6654: Make 'date_trunc' returns the same type as its input

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on code in PR #6654:
URL: https://github.com/apache/arrow-datafusion/pull/6654#discussion_r1245581308


##########
datafusion/physical-expr/src/datetime_expressions.rs:
##########
@@ -215,13 +215,17 @@ fn quarter_month(date: &NaiveDateTime) -> u32 {
 }
 
 fn date_trunc_single(granularity: &str, value: i64) -> Result<i64> {
+    if granularity == "millisecond" || granularity == "microsecond" {
+        return Ok(value);
+    }

Review Comment:
   > You mean with_nanosecond also cleans up millisecond and microsecond in the timestamp? Otherwise nanosecond is smaller granularity than millisecond and microsecond. Why reduction at nanosecond level will affect millisecond and microsecond?
   
   Yes, I think that is correct. 
   
   I played around with it and we can simplify the logic like this which I think may make it easier to understand (the test still pass). If you like this better (I do) I can make it a PR (follow on to https://github.com/apache/arrow-datafusion/pull/6783)
   
   ```diff
   diff --git a/datafusion/physical-expr/src/datetime_expressions.rs b/datafusion/physical-expr/src/datetime_expressions.rs
   index 3a36e8a489..89ec508024 100644
   --- a/datafusion/physical-expr/src/datetime_expressions.rs
   +++ b/datafusion/physical-expr/src/datetime_expressions.rs
   @@ -215,42 +215,49 @@ fn quarter_month(date: &NaiveDateTime) -> u32 {
    }
    
    fn date_trunc_single(granularity: &str, value: i64) -> Result<i64> {
   -    if granularity == "millisecond" || granularity == "microsecond" {
   -        return Ok(value);
   -    }
   +    let value = timestamp_ns_to_datetime(value).ok_or_else(|| {
   +        DataFusionError::Execution(format!("Timestamp {value} out of range"))
   +    })?;
   +
   +    let value = Some(value);
    
   -    let value = timestamp_ns_to_datetime(value)
   -        .ok_or_else(|| {
   -            DataFusionError::Execution(format!("Timestamp {value} out of range"))
   -        })?
   -        .with_nanosecond(0);
        let value = match granularity {
   -        "second" => value,
   -        "minute" => value.and_then(|d| d.with_second(0)),
   +        "millisecond" => value,
   +        "microsecond" => value,
   +        "second" => value.and_then(|d| d.with_nanosecond(0)),
   +        "minute" => value
   +            .and_then(|d| d.with_nanosecond(0))
   +            .and_then(|d| d.with_second(0)),
            "hour" => value
   +            .and_then(|d| d.with_nanosecond(0))
                .and_then(|d| d.with_second(0))
                .and_then(|d| d.with_minute(0)),
            "day" => value
   +            .and_then(|d| d.with_nanosecond(0))
                .and_then(|d| d.with_second(0))
                .and_then(|d| d.with_minute(0))
                .and_then(|d| d.with_hour(0)),
            "week" => value
   +            .and_then(|d| d.with_nanosecond(0))
                .and_then(|d| d.with_second(0))
                .and_then(|d| d.with_minute(0))
                .and_then(|d| d.with_hour(0))
                .map(|d| d - Duration::seconds(60 * 60 * 24 * d.weekday() as i64)),
            "month" => value
   +            .and_then(|d| d.with_nanosecond(0))
                .and_then(|d| d.with_second(0))
                .and_then(|d| d.with_minute(0))
                .and_then(|d| d.with_hour(0))
                .and_then(|d| d.with_day0(0)),
            "quarter" => value
   +            .and_then(|d| d.with_nanosecond(0))
                .and_then(|d| d.with_second(0))
                .and_then(|d| d.with_minute(0))
                .and_then(|d| d.with_hour(0))
                .and_then(|d| d.with_day0(0))
                .and_then(|d| d.with_month(quarter_month(&d))),
            "year" => value
   +            .and_then(|d| d.with_nanosecond(0))
                .and_then(|d| d.with_second(0))
                .and_then(|d| d.with_minute(0))
                .and_then(|d| d.with_hour(0))
   
   ```
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #6654: Make 'date_trunc' returns the same type as its input

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on code in PR #6654:
URL: https://github.com/apache/arrow-datafusion/pull/6654#discussion_r1242685928


##########
datafusion/physical-expr/src/datetime_expressions.rs:
##########
@@ -827,7 +912,7 @@ mod tests {
             (
                 "2020-09-08T13:42:29.190855Z",
                 "second",
-                "2020-09-08T13:42:29.000000Z",
+                "2020-09-08T13:42:29.190855Z",

Review Comment:
   This change doesn't seem correct to me -- the original value was correct (truncated to microsecond). Do you know why it was changed? As written it still has microsecond precision I think



##########
datafusion/physical-expr/src/datetime_expressions.rs:
##########
@@ -262,6 +268,53 @@ fn date_trunc_single(granularity: &str, value: i64) -> Result<i64> {
     Ok(value.unwrap().timestamp_nanos())
 }
 
+fn _date_trunc(
+    tu: TimeUnit,
+    value: &Option<i64>,
+    granularity: &str,
+    f: impl Fn(Option<i64>) -> Result<Option<i64>>,
+    // tz_opt: &Option<Arc<str>>,
+) -> Result<Option<i64>, DataFusionError> {
+    let scale = match tu {
+        TimeUnit::Second => 1_000_000_000,
+        TimeUnit::Millisecond => 1_000_000,
+        TimeUnit::Microsecond => 1_000,
+        TimeUnit::Nanosecond => 1,
+    };
+
+    if value.is_none() {
+        return Ok(None);
+    }

Review Comment:
   This works fine but I think you can simplify the logic with something like the following (and then avoid all the unwraps)
   
   ```suggestion
       let Some(value) = value else {
           return Ok(None);
       }
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] Weijun-H commented on a diff in pull request #6654: fix: 'datatrunc' return inconsistent type

Posted by "Weijun-H (via GitHub)" <gi...@apache.org>.
Weijun-H commented on code in PR #6654:
URL: https://github.com/apache/arrow-datafusion/pull/6654#discussion_r1231375293


##########
datafusion/physical-expr/src/datetime_expressions.rs:
##########
@@ -282,54 +341,64 @@ pub fn date_trunc(args: &[ColumnarValue]) -> Result<ColumnarValue> {
 
     Ok(match array {
         ColumnarValue::Scalar(ScalarValue::TimestampNanosecond(v, tz_opt)) => {
-            let nano = (f)(*v)?;
-            match granularity.as_str() {
-                "minute" => {
-                    // trunc to minute
-                    let second = ScalarValue::TimestampNanosecond(
-                        Some(nano.unwrap() / 1_000_000_000 * 1_000_000_000),
-                        tz_opt.clone(),
-                    );
-                    ColumnarValue::Scalar(second)
+            _date_trunc(TimeUnit::Nanosecond, v, granularity.as_str(), f, tz_opt)?
+        }
+        ColumnarValue::Scalar(ScalarValue::TimestampMicrosecond(v, tz_opt)) => {
+            _date_trunc(TimeUnit::Microsecond, v, granularity.as_str(), f, tz_opt)?
+        }
+        ColumnarValue::Scalar(ScalarValue::TimestampMillisecond(v, tz_opt)) => {
+            _date_trunc(TimeUnit::Millisecond, v, granularity.as_str(), f, tz_opt)?
+        }
+        ColumnarValue::Scalar(ScalarValue::TimestampSecond(v, tz_opt)) => {
+            _date_trunc(TimeUnit::Second, v, granularity.as_str(), f, tz_opt)?
+        }
+        ColumnarValue::Array(array) => {

Review Comment:
   For Array, should it return the values based on the input type or just return `TimestampNanoSecond` consistently 🤔?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #6654: fix: 'datatrunc' return inconsistent type

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on code in PR #6654:
URL: https://github.com/apache/arrow-datafusion/pull/6654#discussion_r1232397820


##########
datafusion/physical-expr/src/datetime_expressions.rs:
##########
@@ -282,54 +341,64 @@ pub fn date_trunc(args: &[ColumnarValue]) -> Result<ColumnarValue> {
 
     Ok(match array {
         ColumnarValue::Scalar(ScalarValue::TimestampNanosecond(v, tz_opt)) => {
-            let nano = (f)(*v)?;
-            match granularity.as_str() {
-                "minute" => {
-                    // trunc to minute
-                    let second = ScalarValue::TimestampNanosecond(
-                        Some(nano.unwrap() / 1_000_000_000 * 1_000_000_000),
-                        tz_opt.clone(),
-                    );
-                    ColumnarValue::Scalar(second)
+            _date_trunc(TimeUnit::Nanosecond, v, granularity.as_str(), f, tz_opt)?
+        }
+        ColumnarValue::Scalar(ScalarValue::TimestampMicrosecond(v, tz_opt)) => {
+            _date_trunc(TimeUnit::Microsecond, v, granularity.as_str(), f, tz_opt)?
+        }
+        ColumnarValue::Scalar(ScalarValue::TimestampMillisecond(v, tz_opt)) => {
+            _date_trunc(TimeUnit::Millisecond, v, granularity.as_str(), f, tz_opt)?
+        }
+        ColumnarValue::Scalar(ScalarValue::TimestampSecond(v, tz_opt)) => {
+            _date_trunc(TimeUnit::Second, v, granularity.as_str(), f, tz_opt)?
+        }
+        ColumnarValue::Array(array) => {

Review Comment:
   > For Array, should it return the values based on the input type
   
   It should do the same as is done for the Scalar version.
   
   My understanding is that this PR changes the Scalar version to return a type based on the input types (not TimestampNanoSecond) so the array version should as well



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] viirya commented on pull request #6654: Make 'date_trunc' returns the same type as its input

Posted by "viirya (via GitHub)" <gi...@apache.org>.
viirya commented on PR #6654:
URL: https://github.com/apache/arrow-datafusion/pull/6654#issuecomment-1595347142

   I will take another look today or tomorrow.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] Weijun-H commented on pull request #6654: fix: 'datatrunc' return inconsistent type

Posted by "Weijun-H (via GitHub)" <gi...@apache.org>.
Weijun-H commented on PR #6654:
URL: https://github.com/apache/arrow-datafusion/pull/6654#issuecomment-1592022285

   I am confused with this error after the return value type change
   ```
   Running "timestamps.slt"
   External error: query failed: DataFusion error: Internal error: could not cast value to arrow_array::array::primitive_array::PrimitiveArray<arrow_array::types::TimestampNanosecondType>. This was likely caused by a bug in DataFusion's code and we would welcome that you file an bug report in our issue tracker
   [SQL] SELECT 'ts_data_nanos', DATE_TRUNC('day', ts) FROM ts_data_nanos
    UNION ALL
   SELECT 'ts_data_micros', DATE_TRUNC('day', ts) FROM ts_data_micros
    UNION ALL
   SELECT 'ts_data_millis', DATE_TRUNC('day', ts) FROM ts_data_millis
    UNION ALL
   SELECT 'ts_data_secs', DATE_TRUNC('day', ts) FROM ts_data_secs
   at tests/sqllogictests/test_files/timestamps.slt:911
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] viirya commented on pull request #6654: Make 'date_trunc' returns the same type as its input

Posted by "viirya (via GitHub)" <gi...@apache.org>.
viirya commented on PR #6654:
URL: https://github.com/apache/arrow-datafusion/pull/6654#issuecomment-1595820549

   I'm wondering maybe I misunderstand this function? Because the current change looks not correct mostly to me but all tests passed. 🤔 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] viirya commented on a diff in pull request #6654: Make 'date_trunc' returns the same type as its input

Posted by "viirya (via GitHub)" <gi...@apache.org>.
viirya commented on code in PR #6654:
URL: https://github.com/apache/arrow-datafusion/pull/6654#discussion_r1245435778


##########
datafusion/physical-expr/src/datetime_expressions.rs:
##########
@@ -215,13 +215,17 @@ fn quarter_month(date: &NaiveDateTime) -> u32 {
 }
 
 fn date_trunc_single(granularity: &str, value: i64) -> Result<i64> {
+    if granularity == "millisecond" || granularity == "microsecond" {
+        return Ok(value);
+    }

Review Comment:
   You mean `with_nanosecond` also cleans up millisecond and microsecond in the timestamp?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #6654: Make 'date_trunc' returns the same type as its input

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on code in PR #6654:
URL: https://github.com/apache/arrow-datafusion/pull/6654#discussion_r1245624498


##########
datafusion/physical-expr/src/datetime_expressions.rs:
##########
@@ -215,13 +215,17 @@ fn quarter_month(date: &NaiveDateTime) -> u32 {
 }
 
 fn date_trunc_single(granularity: &str, value: i64) -> Result<i64> {
+    if granularity == "millisecond" || granularity == "microsecond" {
+        return Ok(value);
+    }

Review Comment:
   https://github.com/apache/arrow-datafusion/pull/6789



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] Weijun-H commented on pull request #6654: Make 'date_trunc' returns the same type as its input

Posted by "Weijun-H (via GitHub)" <gi...@apache.org>.
Weijun-H commented on PR #6654:
URL: https://github.com/apache/arrow-datafusion/pull/6654#issuecomment-1595844918

   > I'm wondering maybe I misunderstand this function? Because the current change looks not correct mostly to me but all tests passed. 🤔
   
   Seem tests does not cover all cases 🤔


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] viirya commented on a diff in pull request #6654: Make 'date_trunc' returns the same type as its input

Posted by "viirya (via GitHub)" <gi...@apache.org>.
viirya commented on code in PR #6654:
URL: https://github.com/apache/arrow-datafusion/pull/6654#discussion_r1245435778


##########
datafusion/physical-expr/src/datetime_expressions.rs:
##########
@@ -215,13 +215,17 @@ fn quarter_month(date: &NaiveDateTime) -> u32 {
 }
 
 fn date_trunc_single(granularity: &str, value: i64) -> Result<i64> {
+    if granularity == "millisecond" || granularity == "microsecond" {
+        return Ok(value);
+    }

Review Comment:
   You mean `with_nanosecond` also cleans up millisecond and microsecond in the timestamp? Otherwise nanosecond is smaller granularity than millisecond and microsecond. Why reduction at nanosecond level will affect millisecond and microsecond? 
   
   E.g, timestamp is 1.001000001 (1s + 1 ms + 1 ns). After `with_nanosecond(0)`, it will be 1.000000000 instead of 1.001000000?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb merged pull request #6654: Make 'date_trunc' returns the same type as its input

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb merged PR #6654:
URL: https://github.com/apache/arrow-datafusion/pull/6654


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #6654: Make 'date_trunc' returns the same type as its input

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on code in PR #6654:
URL: https://github.com/apache/arrow-datafusion/pull/6654#discussion_r1244330697


##########
datafusion/physical-expr/src/datetime_expressions.rs:
##########
@@ -215,13 +215,17 @@ fn quarter_month(date: &NaiveDateTime) -> u32 {
 }
 
 fn date_trunc_single(granularity: &str, value: i64) -> Result<i64> {
+    if granularity == "millisecond" || granularity == "microsecond" {
+        return Ok(value);
+    }

Review Comment:
   I think https://github.com/apache/arrow-datafusion/pull/6654#discussion_r1242712961 is the answer -- perhaps I can remove this to make it easier to understand



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] Weijun-H commented on a diff in pull request #6654: fix: 'datatrunc' return inconsistent type

Posted by "Weijun-H (via GitHub)" <gi...@apache.org>.
Weijun-H commented on code in PR #6654:
URL: https://github.com/apache/arrow-datafusion/pull/6654#discussion_r1231370472


##########
datafusion/physical-expr/src/datetime_expressions.rs:
##########
@@ -282,54 +341,64 @@ pub fn date_trunc(args: &[ColumnarValue]) -> Result<ColumnarValue> {
 
     Ok(match array {
         ColumnarValue::Scalar(ScalarValue::TimestampNanosecond(v, tz_opt)) => {
-            let nano = (f)(*v)?;
-            match granularity.as_str() {
-                "minute" => {
-                    // trunc to minute
-                    let second = ScalarValue::TimestampNanosecond(
-                        Some(nano.unwrap() / 1_000_000_000 * 1_000_000_000),
-                        tz_opt.clone(),
-                    );
-                    ColumnarValue::Scalar(second)
+            _date_trunc(TimeUnit::Nanosecond, v, granularity.as_str(), f, tz_opt)?
+        }
+        ColumnarValue::Scalar(ScalarValue::TimestampMicrosecond(v, tz_opt)) => {
+            _date_trunc(TimeUnit::Microsecond, v, granularity.as_str(), f, tz_opt)?
+        }
+        ColumnarValue::Scalar(ScalarValue::TimestampMillisecond(v, tz_opt)) => {
+            _date_trunc(TimeUnit::Millisecond, v, granularity.as_str(), f, tz_opt)?
+        }
+        ColumnarValue::Scalar(ScalarValue::TimestampSecond(v, tz_opt)) => {
+            _date_trunc(TimeUnit::Second, v, granularity.as_str(), f, tz_opt)?
+        }
+        ColumnarValue::Array(array) => {
+            let array_type = array.data_type();

Review Comment:
   I use `f` function pointer, which uses `granularity ` as parameter



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #6654: fix: 'datatrunc' return inconsistent type

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on code in PR #6654:
URL: https://github.com/apache/arrow-datafusion/pull/6654#discussion_r1231005390


##########
datafusion/expr/src/built_in_function.rs:
##########
@@ -884,8 +885,13 @@ impl BuiltinScalarFunction {
                 ],
                 self.volatility(),
             ),
-            BuiltinScalarFunction::DateTrunc => Signature::exact(
-                vec![Utf8, Timestamp(Nanosecond, None)],
+            BuiltinScalarFunction::DateTrunc => Signature::one_of(

Review Comment:
   this clause seems unreachable given the change above to match `BuiltinScalarFunction::DateTrunc | BuiltinScalarFunction::DateBin => {`
   
   I would personally suggest keeping this change and removing the `BuiltinScalarFunction::DateTrunc | BuiltinScalarFunction::DateBin => {` change as it make the code more explicy



##########
datafusion/physical-expr/src/datetime_expressions.rs:
##########
@@ -282,54 +341,64 @@ pub fn date_trunc(args: &[ColumnarValue]) -> Result<ColumnarValue> {
 
     Ok(match array {
         ColumnarValue::Scalar(ScalarValue::TimestampNanosecond(v, tz_opt)) => {
-            let nano = (f)(*v)?;
-            match granularity.as_str() {
-                "minute" => {
-                    // trunc to minute
-                    let second = ScalarValue::TimestampNanosecond(
-                        Some(nano.unwrap() / 1_000_000_000 * 1_000_000_000),
-                        tz_opt.clone(),
-                    );
-                    ColumnarValue::Scalar(second)
+            _date_trunc(TimeUnit::Nanosecond, v, granularity.as_str(), f, tz_opt)?
+        }
+        ColumnarValue::Scalar(ScalarValue::TimestampMicrosecond(v, tz_opt)) => {
+            _date_trunc(TimeUnit::Microsecond, v, granularity.as_str(), f, tz_opt)?
+        }
+        ColumnarValue::Scalar(ScalarValue::TimestampMillisecond(v, tz_opt)) => {
+            _date_trunc(TimeUnit::Millisecond, v, granularity.as_str(), f, tz_opt)?
+        }
+        ColumnarValue::Scalar(ScalarValue::TimestampSecond(v, tz_opt)) => {
+            _date_trunc(TimeUnit::Second, v, granularity.as_str(), f, tz_opt)?
+        }
+        ColumnarValue::Array(array) => {
+            let array_type = array.data_type();

Review Comment:
   I must be missing something here -- it seems like `granularity` is not used for the array variants
   
   So like `date_trunc(x, 'minutes')` will return the same value as `date_trunc(x, 'seconds')` which doesn't seem right
   
   Maybe that indicates missing test coverage 🤔 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] viirya commented on a diff in pull request #6654: Make 'date_trunc' returns the same type as its input

Posted by "viirya (via GitHub)" <gi...@apache.org>.
viirya commented on code in PR #6654:
URL: https://github.com/apache/arrow-datafusion/pull/6654#discussion_r1245593915


##########
datafusion/physical-expr/src/datetime_expressions.rs:
##########
@@ -215,13 +215,17 @@ fn quarter_month(date: &NaiveDateTime) -> u32 {
 }
 
 fn date_trunc_single(granularity: &str, value: i64) -> Result<i64> {
+    if granularity == "millisecond" || granularity == "microsecond" {
+        return Ok(value);
+    }

Review Comment:
   Yea, I think it looks clearer.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #6654: Make 'date_trunc' returns the same type as its input

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on code in PR #6654:
URL: https://github.com/apache/arrow-datafusion/pull/6654#discussion_r1244365300


##########
datafusion/physical-expr/src/datetime_expressions.rs:
##########
@@ -215,13 +215,17 @@ fn quarter_month(date: &NaiveDateTime) -> u32 {
 }
 
 fn date_trunc_single(granularity: &str, value: i64) -> Result<i64> {
+    if granularity == "millisecond" || granularity == "microsecond" {
+        return Ok(value);
+    }

Review Comment:
   https://github.com/apache/arrow-datafusion/pull/6783#



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] Weijun-H commented on a diff in pull request #6654: Make 'date_trunc' returns the same type as its input

Posted by "Weijun-H (via GitHub)" <gi...@apache.org>.
Weijun-H commented on code in PR #6654:
URL: https://github.com/apache/arrow-datafusion/pull/6654#discussion_r1242864616


##########
datafusion/physical-expr/src/datetime_expressions.rs:
##########
@@ -827,7 +912,7 @@ mod tests {
             (
                 "2020-09-08T13:42:29.190855Z",
                 "second",
-                "2020-09-08T13:42:29.000000Z",
+                "2020-09-08T13:42:29.190855Z",

Review Comment:
   I modified `date_trunc_single`, so it will truncate the timestamp to second when the granularity is second.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #6654: Make 'date_trunc' returns the same type as its input

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on code in PR #6654:
URL: https://github.com/apache/arrow-datafusion/pull/6654#discussion_r1232403202


##########
datafusion/physical-expr/src/datetime_expressions.rs:
##########
@@ -282,54 +341,64 @@ pub fn date_trunc(args: &[ColumnarValue]) -> Result<ColumnarValue> {
 
     Ok(match array {
         ColumnarValue::Scalar(ScalarValue::TimestampNanosecond(v, tz_opt)) => {
-            let nano = (f)(*v)?;
-            match granularity.as_str() {
-                "minute" => {
-                    // trunc to minute
-                    let second = ScalarValue::TimestampNanosecond(
-                        Some(nano.unwrap() / 1_000_000_000 * 1_000_000_000),
-                        tz_opt.clone(),
-                    );
-                    ColumnarValue::Scalar(second)
+            _date_trunc(TimeUnit::Nanosecond, v, granularity.as_str(), f, tz_opt)?
+        }
+        ColumnarValue::Scalar(ScalarValue::TimestampMicrosecond(v, tz_opt)) => {
+            _date_trunc(TimeUnit::Microsecond, v, granularity.as_str(), f, tz_opt)?
+        }
+        ColumnarValue::Scalar(ScalarValue::TimestampMillisecond(v, tz_opt)) => {
+            _date_trunc(TimeUnit::Millisecond, v, granularity.as_str(), f, tz_opt)?
+        }
+        ColumnarValue::Scalar(ScalarValue::TimestampSecond(v, tz_opt)) => {
+            _date_trunc(TimeUnit::Second, v, granularity.as_str(), f, tz_opt)?
+        }
+        ColumnarValue::Array(array) => {
+            let array_type = array.data_type();
+            match array_type {
+                DataType::Timestamp(TimeUnit::Second, _) => {
+                    let array = as_timestamp_second_array(array)?;
+                    let array = array
+                        .iter()
+                        .map(|x| {
+                            f(Some(x.unwrap() * 1_000_000_000))

Review Comment:
   I see this PR doesn't introduce the problem, but it appears that `date_trunc` panic's on `unwrap()` 🤔 
   
   ```
   ❯ select date_trunc('hour', '2021-02-01T12:01:02');
   +------------------------------------------------------+
   | date_trunc(Utf8("hour"),Utf8("2021-02-01T12:01:02")) |
   +------------------------------------------------------+
   | 2021-02-01T12:00:00                                  |
   +------------------------------------------------------+
   1 row in set. Query took 0.074 seconds.
   ❯ select date_trunc('hour', null);
   thread 'main' panicked at 'called `Option::unwrap()` on a `None` value', /Users/alamb/Software/arrow-datafusion/datafusion/physical-expr/src/datetime_expressions.rs:314:35
   note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
   ```
   
   I will file a ticket: https://github.com/apache/arrow-datafusion/issues/6701



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #6654: fix: 'datatrunc' return inconsistent type

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on code in PR #6654:
URL: https://github.com/apache/arrow-datafusion/pull/6654#discussion_r1232403202


##########
datafusion/physical-expr/src/datetime_expressions.rs:
##########
@@ -282,54 +341,64 @@ pub fn date_trunc(args: &[ColumnarValue]) -> Result<ColumnarValue> {
 
     Ok(match array {
         ColumnarValue::Scalar(ScalarValue::TimestampNanosecond(v, tz_opt)) => {
-            let nano = (f)(*v)?;
-            match granularity.as_str() {
-                "minute" => {
-                    // trunc to minute
-                    let second = ScalarValue::TimestampNanosecond(
-                        Some(nano.unwrap() / 1_000_000_000 * 1_000_000_000),
-                        tz_opt.clone(),
-                    );
-                    ColumnarValue::Scalar(second)
+            _date_trunc(TimeUnit::Nanosecond, v, granularity.as_str(), f, tz_opt)?
+        }
+        ColumnarValue::Scalar(ScalarValue::TimestampMicrosecond(v, tz_opt)) => {
+            _date_trunc(TimeUnit::Microsecond, v, granularity.as_str(), f, tz_opt)?
+        }
+        ColumnarValue::Scalar(ScalarValue::TimestampMillisecond(v, tz_opt)) => {
+            _date_trunc(TimeUnit::Millisecond, v, granularity.as_str(), f, tz_opt)?
+        }
+        ColumnarValue::Scalar(ScalarValue::TimestampSecond(v, tz_opt)) => {
+            _date_trunc(TimeUnit::Second, v, granularity.as_str(), f, tz_opt)?
+        }
+        ColumnarValue::Array(array) => {
+            let array_type = array.data_type();
+            match array_type {
+                DataType::Timestamp(TimeUnit::Second, _) => {
+                    let array = as_timestamp_second_array(array)?;
+                    let array = array
+                        .iter()
+                        .map(|x| {
+                            f(Some(x.unwrap() * 1_000_000_000))

Review Comment:
   I see this PR doesn't introduce the problem, but it appears that `date_trunc` panic's on `unwrap()` 🤔 
   
   ```
   ❯ select date_trunc('hour', '2021-02-01T12:01:02');
   +------------------------------------------------------+
   | date_trunc(Utf8("hour"),Utf8("2021-02-01T12:01:02")) |
   +------------------------------------------------------+
   | 2021-02-01T12:00:00                                  |
   +------------------------------------------------------+
   1 row in set. Query took 0.074 seconds.
   ❯ select date_trunc('hour', null);
   thread 'main' panicked at 'called `Option::unwrap()` on a `None` value', /Users/alamb/Software/arrow-datafusion/datafusion/physical-expr/src/datetime_expressions.rs:314:35
   note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
   ```
   
   I will file a ticket



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] viirya commented on a diff in pull request #6654: Make 'date_trunc' returns the same type as its input

Posted by "viirya (via GitHub)" <gi...@apache.org>.
viirya commented on code in PR #6654:
URL: https://github.com/apache/arrow-datafusion/pull/6654#discussion_r1244325250


##########
datafusion/physical-expr/src/datetime_expressions.rs:
##########
@@ -262,6 +266,55 @@ fn date_trunc_single(granularity: &str, value: i64) -> Result<i64> {
     Ok(value.unwrap().timestamp_nanos())
 }
 
+fn _date_trunc(
+    tu: TimeUnit,
+    value: &Option<i64>,
+    granularity: &str,
+    f: impl Fn(Option<i64>) -> Result<Option<i64>>,
+) -> Result<Option<i64>, DataFusionError> {
+    let scale = match tu {
+        TimeUnit::Second => 1_000_000_000,
+        TimeUnit::Millisecond => 1_000_000,
+        TimeUnit::Microsecond => 1_000,
+        TimeUnit::Nanosecond => 1,
+    };
+
+    let Some(value) = value else {
+        return Ok(None);
+    };
+
+    // convert to nanoseconds
+    let Some(nano) = (f)(Some(value * scale))? else {
+        return Ok(None);
+    };
+
+    let result = match tu {
+        TimeUnit::Second => match granularity {
+            "minute" => Some(nano / 1_000_000_000 / 60 * 60),
+            _ => Some(nano / 1_000_000_000),
+        },
+        TimeUnit::Millisecond => match granularity {
+            "minute" => Some(nano / 1_000_000 / 1_000 / 60 * 1_000 * 60),
+            "second" => Some(nano / 1_000_000 / 1_000 * 1_000),
+            _ => Some(nano / 1_000_000),
+        },
+        TimeUnit::Microsecond => match granularity {
+            "minute" => Some(nano / 1_000 / 1_000_000 / 60 * 60 * 1_000_000),
+            "second" => Some(nano / 1_000 / 1_000_000 * 1_000_000),
+            "millisecond" => Some(nano / 1_000 / 1_000 * 1_000),
+            _ => Some(nano / 1_000),
+        },
+        _ => match granularity {
+            "minute" => Some(nano / 1_000_000_000 / 60 * 1_000_000_000 * 60),
+            "second" => Some(nano / 1_000_000_000 * 1_000_000_000),
+            "millisecond" => Some(nano / 1_000_000 * 1_000_000),
+            "microsecond" => Some(nano / 1_000 * 1_000),
+            _ => Some(nano),
+        },

Review Comment:
   Looks correct.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #6654: Make 'date_trunc' returns the same type as its input

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on code in PR #6654:
URL: https://github.com/apache/arrow-datafusion/pull/6654#discussion_r1244340997


##########
datafusion/physical-expr/src/datetime_expressions.rs:
##########
@@ -215,13 +215,17 @@ fn quarter_month(date: &NaiveDateTime) -> u32 {
 }
 
 fn date_trunc_single(granularity: &str, value: i64) -> Result<i64> {
+    if granularity == "millisecond" || granularity == "microsecond" {
+        return Ok(value);
+    }

Review Comment:
   I find this very confusing -- I think it is because for those granularities, the conversion to chrono::DateTime and back is unecessary (as only sec/min/hour/etc are non uniform in width) -- I will try and make it clearer



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] viirya commented on a diff in pull request #6654: Make 'date_trunc' returns the same type as its input

Posted by "viirya (via GitHub)" <gi...@apache.org>.
viirya commented on code in PR #6654:
URL: https://github.com/apache/arrow-datafusion/pull/6654#discussion_r1244381902


##########
datafusion/physical-expr/src/datetime_expressions.rs:
##########
@@ -215,13 +215,17 @@ fn quarter_month(date: &NaiveDateTime) -> u32 {
 }
 
 fn date_trunc_single(granularity: &str, value: i64) -> Result<i64> {
+    if granularity == "millisecond" || granularity == "microsecond" {
+        return Ok(value);
+    }

Review Comment:
   I think for granularity as "second" | "millisecond" | "microsecond" cases, `with_nanosecond(0)` looks correct in previous code.
   
   It is okay to deferr the logic to the `_date_trunc`, but why only "millisecond" and "microsecond" are treated specially?
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] Weijun-H commented on a diff in pull request #6654: Make 'date_trunc' returns the same type as its input

Posted by "Weijun-H (via GitHub)" <gi...@apache.org>.
Weijun-H commented on code in PR #6654:
URL: https://github.com/apache/arrow-datafusion/pull/6654#discussion_r1242712961


##########
datafusion/physical-expr/src/datetime_expressions.rs:
##########
@@ -827,7 +912,7 @@ mod tests {
             (
                 "2020-09-08T13:42:29.190855Z",
                 "second",
-                "2020-09-08T13:42:29.000000Z",
+                "2020-09-08T13:42:29.190855Z",

Review Comment:
   https://github.com/apache/arrow-datafusion/blob/b17cf0710938eed1c9f7794d5695bfac89bff580/datafusion/physical-expr/src/datetime_expressions.rs#L217-L223
   
   Previously, the truncation of timestamps to seconds occurred prematurely in `date_trunc_single`, resulting in inaccurate results. Therefore, I have made the necessary adjustments and deferred the logic to the `_date_trunc` function. By implementing this change, the granularity of the timestamps is preserved, and truncation is performed at a later stage to ensure accuracy.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] Weijun-H commented on a diff in pull request #6654: Make 'date_trunc' returns the same type as its input

Posted by "Weijun-H (via GitHub)" <gi...@apache.org>.
Weijun-H commented on code in PR #6654:
URL: https://github.com/apache/arrow-datafusion/pull/6654#discussion_r1242864616


##########
datafusion/physical-expr/src/datetime_expressions.rs:
##########
@@ -827,7 +912,7 @@ mod tests {
             (
                 "2020-09-08T13:42:29.190855Z",
                 "second",
-                "2020-09-08T13:42:29.000000Z",
+                "2020-09-08T13:42:29.190855Z",

Review Comment:
   I modified `date_trunc_single`, so it will truncate the timestamp to second when the granularity is second. Now the test make sense.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] viirya commented on a diff in pull request #6654: fix: 'datatrunc' return inconsistent type

Posted by "viirya (via GitHub)" <gi...@apache.org>.
viirya commented on code in PR #6654:
URL: https://github.com/apache/arrow-datafusion/pull/6654#discussion_r1227352646


##########
datafusion/physical-expr/src/datetime_expressions.rs:
##########
@@ -294,16 +294,24 @@ pub fn date_trunc(args: &[ColumnarValue]) -> Result<ColumnarValue> {
                 }
                 "second" => {
                     // trunc to second
-                    let mill = ScalarValue::TimestampNanosecond(

Review Comment:
   And the `signature` of `DateTrunc` should be updated, I think. Currently the second argument is only `Timestamp(Nanosecond, None)`. But I guess that this change wants to make output type as same as input type. So you probably want to add more input types to `signature`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #6654: fix: 'datatrunc' return inconsistent type

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on code in PR #6654:
URL: https://github.com/apache/arrow-datafusion/pull/6654#discussion_r1227334439


##########
datafusion/physical-expr/src/datetime_expressions.rs:
##########
@@ -294,16 +294,24 @@ pub fn date_trunc(args: &[ColumnarValue]) -> Result<ColumnarValue> {
                 }
                 "second" => {
                     // trunc to second
-                    let mill = ScalarValue::TimestampNanosecond(

Review Comment:
   I think we would have to change the values returned by https://github.com/apache/arrow-datafusion/pull/6654/files#diff-67ae8808785b2e651767d7ff67dd7c53be04ca098857b52c82ed19927e071cdaL322 as well as the type returned by `BuiltInScalarFunction::return_type`
   
   otherwise we'll be in the situation again where the scalar implementation does something different than the array implementation
   
   However, I was surprised that there were no tests that failed which reveals a gap in our test coverage. I wrote the test in https://github.com/apache/arrow-datafusion/pull/6655 which does pass on `main` but fails on this PR



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #6654: Make 'date_trunc' returns the same type as its input

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on code in PR #6654:
URL: https://github.com/apache/arrow-datafusion/pull/6654#discussion_r1244340997


##########
datafusion/physical-expr/src/datetime_expressions.rs:
##########
@@ -215,13 +215,17 @@ fn quarter_month(date: &NaiveDateTime) -> u32 {
 }
 
 fn date_trunc_single(granularity: &str, value: i64) -> Result<i64> {
+    if granularity == "millisecond" || granularity == "microsecond" {
+        return Ok(value);
+    }

Review Comment:
   I find this very confusing -- I think it is because for those granularities, the conversion to chrono::DateTime and back is unecessary (as only sec/min/hour/etc are non uniform in width)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org