You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/01/30 16:20:20 UTC

[GitHub] [arrow] ovr opened a new pull request #9373: ARROW-10816: [Rust][DataFushion] Initial support for Interval express…

ovr opened a new pull request #9373:
URL: https://github.com/apache/arrow/pull/9373


   …ions


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] ovr commented on a change in pull request #9373: ARROW-10816: [Rust][DF] Initial support for Interval expressions

Posted by GitBox <gi...@apache.org>.
ovr commented on a change in pull request #9373:
URL: https://github.com/apache/arrow/pull/9373#discussion_r570054053



##########
File path: rust/datafusion/src/sql/planner.rs
##########
@@ -894,6 +908,131 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> {
             ))),
         }
     }
+
+    fn sql_interval_to_literal(
+        &self,
+        value: &String,
+        leading_field: &Option<DateTimeField>,
+        leading_precision: &Option<u64>,
+        last_field: &Option<DateTimeField>,
+        fractional_seconds_precision: &Option<u64>,
+    ) -> Result<Expr> {
+        if leading_field.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with leading_field {:?}",
+                leading_field
+            )));
+        }
+
+        if leading_precision.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with leading_precision {:?}",
+                leading_precision
+            )));
+        }
+
+        if last_field.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with last_field {:?}",
+                last_field
+            )));
+        }
+
+        if fractional_seconds_precision.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with fractional_seconds_precision {:?}",
+                fractional_seconds_precision
+            )));
+        }
+
+        // Should we resort parts on overflow?
+        let resort_parts = |days: i32, seconds: f32| -> (i32, f32) { (days, seconds) };
+
+        let calculate_from_part =
+            |interval_period_str: &str, interval_type: &str| -> Result<(i32, f32)> {
+                let interval_period = match interval_period_str.parse::<f32>() {

Review comment:
       I dont have any ideas, how it should be correct, but anyway using `0.5 DAY` or `0.5 HOUR` will be rly usefull and works as expected.
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] ovr commented on a change in pull request #9373: ARROW-10816: [Rust][DF] Initial support for Interval expressions

Posted by GitBox <gi...@apache.org>.
ovr commented on a change in pull request #9373:
URL: https://github.com/apache/arrow/pull/9373#discussion_r570156779



##########
File path: rust/datafusion/src/sql/planner.rs
##########
@@ -894,6 +908,131 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> {
             ))),
         }
     }
+
+    fn sql_interval_to_literal(
+        &self,
+        value: &String,
+        leading_field: &Option<DateTimeField>,
+        leading_precision: &Option<u64>,
+        last_field: &Option<DateTimeField>,
+        fractional_seconds_precision: &Option<u64>,
+    ) -> Result<Expr> {
+        if leading_field.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with leading_field {:?}",
+                leading_field
+            )));
+        }
+
+        if leading_precision.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with leading_precision {:?}",
+                leading_precision
+            )));
+        }
+
+        if last_field.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with last_field {:?}",
+                last_field
+            )));
+        }
+
+        if fractional_seconds_precision.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with fractional_seconds_precision {:?}",
+                fractional_seconds_precision
+            )));
+        }
+
+        // Should we resort parts on overflow?
+        let resort_parts = |days: i32, seconds: f32| -> (i32, f32) { (days, seconds) };
+
+        let calculate_from_part =
+            |interval_period_str: &str, interval_type: &str| -> Result<(i32, f32)> {
+                let interval_period = match interval_period_str.parse::<f32>() {

Review comment:
       @Dandandan
   
   > for example the number of milliseconds to be off for non-trivial fractions?
   
   It's better to use Decimal anyway.
   
    👍  I can place a comment in this place with // @todo, when I will finish with https://github.com/apache/arrow/pull/9232 I can reuse Decimal128 in this place. Is it ok? Thanks




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] ovr edited a comment on pull request #9373: ARROW-10816: [Rust][DF] Initial support for Interval expressions

Posted by GitBox <gi...@apache.org>.
ovr edited a comment on pull request #9373:
URL: https://github.com/apache/arrow/pull/9373#issuecomment-770445969


   Wow, I found that it's possible to write
   
   ```
   select INTERVAL '1 hour 1 second 1 month';
   ```
   
   Start to work on support for it. 
   
   And I found how It's implemented in C++
   
   https://github.com/apache/arrow/blob/master/cpp/src/arrow/type.h#L1238
   
   Started to rework this branch, marked as DRAFT.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] Dandandan commented on a change in pull request #9373: ARROW-10816: [Rust][DF] Initial support for Interval expressions

Posted by GitBox <gi...@apache.org>.
Dandandan commented on a change in pull request #9373:
URL: https://github.com/apache/arrow/pull/9373#discussion_r570153622



##########
File path: rust/datafusion/src/sql/planner.rs
##########
@@ -894,6 +908,131 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> {
             ))),
         }
     }
+
+    fn sql_interval_to_literal(
+        &self,
+        value: &String,
+        leading_field: &Option<DateTimeField>,
+        leading_precision: &Option<u64>,
+        last_field: &Option<DateTimeField>,
+        fractional_seconds_precision: &Option<u64>,
+    ) -> Result<Expr> {
+        if leading_field.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with leading_field {:?}",
+                leading_field
+            )));
+        }
+
+        if leading_precision.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with leading_precision {:?}",
+                leading_precision
+            )));
+        }
+
+        if last_field.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with last_field {:?}",
+                last_field
+            )));
+        }
+
+        if fractional_seconds_precision.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with fractional_seconds_precision {:?}",
+                fractional_seconds_precision
+            )));
+        }
+
+        // Should we resort parts on overflow?
+        let resort_parts = |days: i32, seconds: f32| -> (i32, f32) { (days, seconds) };
+
+        let calculate_from_part =
+            |interval_period_str: &str, interval_type: &str| -> Result<(i32, f32)> {
+                let interval_period = match interval_period_str.parse::<f32>() {

Review comment:
       I am not sure whether rounding errors are going to be a problem, I could imagine some calculation might cause for example the number of milliseconds to be off for non-trivial fractions?
   In that case we should use some decimal type instead of `f32` in order to avoid those rounding errors.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] alamb commented on a change in pull request #9373: ARROW-10816: [Rust][DF] Initial support for Interval expressions

Posted by GitBox <gi...@apache.org>.
alamb commented on a change in pull request #9373:
URL: https://github.com/apache/arrow/pull/9373#discussion_r570403114



##########
File path: rust/datafusion/src/sql/planner.rs
##########
@@ -982,6 +996,171 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> {
             ))),
         }
     }
+
+    fn sql_interval_to_literal(
+        &self,
+        value: &str,
+        leading_field: &Option<DateTimeField>,
+        leading_precision: &Option<u64>,
+        last_field: &Option<DateTimeField>,
+        fractional_seconds_precision: &Option<u64>,
+    ) -> Result<Expr> {
+        if leading_field.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with leading_field {:?}",
+                leading_field
+            )));
+        }
+
+        if leading_precision.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with leading_precision {:?}",
+                leading_precision
+            )));
+        }
+
+        if last_field.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with last_field {:?}",
+                last_field
+            )));
+        }
+
+        if fractional_seconds_precision.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with fractional_seconds_precision {:?}",
+                fractional_seconds_precision
+            )));
+        }
+
+        const SECONDS_PER_HOUR: f32 = 3_600_f32;
+        const MILLIS_PER_SECOND: f32 = 1_000_f32;
+
+        // We are storing parts as integers, it's why we need to align parts fractional
+        // INTERVAL '0.5 MONTH' = 15 days, INTERVAL '1.5 MONTH' = 1 month 15 days

Review comment:
       I found it strange that the example of`1.5 months` is not supported whereas `0.5` and `1` are supported
   
   ```
   >  select (interval '0.5 month');
   +-------------------------------------------------+
   | IntervalDayTime("64424509440")                  |
   +-------------------------------------------------+
   | 0 years 0 mons 15 days 0 hours 0 mins 0.00 secs |
   +-------------------------------------------------+
   1 rows in set. Query took 0 seconds.
   >  select (interval '1 month');
   +------------------------------------------------+
   | IntervalYearMonth("1")                         |
   +------------------------------------------------+
   | 0 years 1 mons 0 days 0 hours 0 mins 0.00 secs |
   +------------------------------------------------+
   1 rows in set. Query took 0 seconds.
   >  select (interval '1.5 month');
   NotImplemented("DF doesnt support complex intervals: \"1.1 month\"")
   
   ```

##########
File path: rust/arrow/src/util/display.rs
##########
@@ -44,6 +44,66 @@ macro_rules! make_string {
     }};
 }
 
+macro_rules! make_string_interval_year_month {
+    ($column: ident, $row: ident) => {{
+        let array = $column
+            .as_any()
+            .downcast_ref::<array::IntervalYearMonthArray>()
+            .unwrap();
+
+        let s = if array.is_null($row) {
+            "0 years 0 mons 0 days 0 hours 0 mins 0.00 secs".to_string()
+        } else {
+            let interval = array.value($row) as f64;
+            let years = (interval / 12_f64).floor();
+            let month = interval - (years * 12_f64);
+
+            format!(
+                "{} years {} mons 0 days 0 hours 0 mins 0.00 secs",
+                years, month,
+            )
+        };
+
+        Ok(s)
+    }};
+}
+
+macro_rules! make_string_interval_day_time {
+    ($column: ident, $row: ident) => {{
+        let array = $column
+            .as_any()
+            .downcast_ref::<array::IntervalDayTimeArray>()
+            .unwrap();
+
+        let s = if array.is_null($row) {
+            "0 years 0 mons 0 days 0 hours 0 mins 0.00 secs".to_string()

Review comment:
       I wonder if `""` would be a more appropriate representation for `NULL`, to conform with the handling of null strings:
   
   https://github.com/apache/arrow/pull/9373/files#diff-821be3a8a9239cdd12ab2b3c979e174f7b936aa6bfad4b9bc76489b9923cf747R164
   
   I also think `"0 years 0 mons 0 days 0 hours 0 mins 0.00 secs"` might be the same representation if `value = 0` (so you couldn't distinguish between `0` and `NULL`(

##########
File path: rust/arrow/src/util/display.rs
##########
@@ -44,6 +44,66 @@ macro_rules! make_string {
     }};
 }
 
+macro_rules! make_string_interval_year_month {
+    ($column: ident, $row: ident) => {{
+        let array = $column
+            .as_any()
+            .downcast_ref::<array::IntervalYearMonthArray>()
+            .unwrap();
+
+        let s = if array.is_null($row) {
+            "0 years 0 mons 0 days 0 hours 0 mins 0.00 secs".to_string()

Review comment:
       same comment as below about NULL vs 0 handling

##########
File path: rust/datafusion/src/sql/planner.rs
##########
@@ -982,6 +996,171 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> {
             ))),
         }
     }
+
+    fn sql_interval_to_literal(
+        &self,
+        value: &str,
+        leading_field: &Option<DateTimeField>,
+        leading_precision: &Option<u64>,
+        last_field: &Option<DateTimeField>,
+        fractional_seconds_precision: &Option<u64>,
+    ) -> Result<Expr> {
+        if leading_field.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with leading_field {:?}",
+                leading_field
+            )));
+        }
+
+        if leading_precision.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with leading_precision {:?}",
+                leading_precision
+            )));
+        }
+
+        if last_field.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with last_field {:?}",
+                last_field
+            )));
+        }
+
+        if fractional_seconds_precision.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with fractional_seconds_precision {:?}",
+                fractional_seconds_precision
+            )));
+        }
+
+        const SECONDS_PER_HOUR: f32 = 3_600_f32;
+        const MILLIS_PER_SECOND: f32 = 1_000_f32;
+
+        // We are storing parts as integers, it's why we need to align parts fractional
+        // INTERVAL '0.5 MONTH' = 15 days, INTERVAL '1.5 MONTH' = 1 month 15 days

Review comment:
       I can understand why we might not support fractional months (as depending on the month the length of a month can not necessarily be expressed as an integral number of days or months).
   
   But it seems inconsitent to me to say `0.5 month` is `15 days` but `1.5 month` is not `45 days`
   
   I think the easiest thing for me to reason about is if they both error. 

##########
File path: rust/datafusion/src/sql/planner.rs
##########
@@ -982,6 +996,171 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> {
             ))),
         }
     }
+
+    fn sql_interval_to_literal(
+        &self,
+        value: &str,
+        leading_field: &Option<DateTimeField>,
+        leading_precision: &Option<u64>,
+        last_field: &Option<DateTimeField>,
+        fractional_seconds_precision: &Option<u64>,
+    ) -> Result<Expr> {
+        if leading_field.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with leading_field {:?}",
+                leading_field
+            )));
+        }
+
+        if leading_precision.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with leading_precision {:?}",
+                leading_precision
+            )));
+        }
+
+        if last_field.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with last_field {:?}",
+                last_field
+            )));
+        }
+
+        if fractional_seconds_precision.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with fractional_seconds_precision {:?}",
+                fractional_seconds_precision
+            )));
+        }
+
+        const SECONDS_PER_HOUR: f32 = 3_600_f32;
+        const MILLIS_PER_SECOND: f32 = 1_000_f32;
+
+        // We are storing parts as integers, it's why we need to align parts fractional
+        // INTERVAL '0.5 MONTH' = 15 days, INTERVAL '1.5 MONTH' = 1 month 15 days
+        // INTERVAL '0.5 DAY' = 12 hours, INTERVAL '1.5 DAY' = 1 day 12 hours
+        let align_interval_parts = |month_part: f32,
+                                    mut day_part: f32,
+                                    mut milles_part: f32|
+         -> (i32, i32, f32) {
+            // Convert fractional month to days, It's not supported by Arrow types, but anyway
+            day_part += (month_part - (month_part as i32) as f32) * 30_f32;
+
+            // Convert fractional days to hours
+            milles_part += (day_part - ((day_part as i32) as f32))
+                * 24_f32
+                * SECONDS_PER_HOUR
+                * MILLIS_PER_SECOND;
+
+            (month_part as i32, day_part as i32, milles_part)
+        };
+
+        let calculate_from_part = |interval_period_str: &str,
+                                   interval_type: &str|
+         -> Result<(i32, i32, f32)> {
+            let interval_period = match f32::from_str(interval_period_str) {
+                Ok(n) => n,
+                Err(_) => {
+                    return Err(DataFusionError::SQL(ParserError(format!(
+                        "Unsupported Interval Expression with value {:?}",
+                        value
+                    ))))
+                }
+            };
+
+            if interval_period > (i32::MAX as f32) {
+                return Err(DataFusionError::NotImplemented(format!(
+                    "Interval field value out of range: {:?}",
+                    value
+                )));
+            }
+
+            match interval_type.to_lowercase().as_str() {
+                "year" => Ok(align_interval_parts(interval_period * 12_f32, 0.0, 0.0)),
+                "month" => Ok(align_interval_parts(interval_period, 0.0, 0.0)),
+                "day" | "days" => Ok(align_interval_parts(0.0, interval_period, 0.0)),
+                "hour" | "hours" => {
+                    Ok((0, 0, interval_period * SECONDS_PER_HOUR * MILLIS_PER_SECOND))
+                }
+                "minutes" | "minute" => {
+                    Ok((0, 0, interval_period * 60_f32 * MILLIS_PER_SECOND))
+                }
+                "seconds" | "second" => Ok((0, 0, interval_period * MILLIS_PER_SECOND)),
+                "milliseconds" | "millisecond" => Ok((0, 0, interval_period)),
+                _ => Err(DataFusionError::NotImplemented(format!(
+                    "Invalid input syntax for type interval: {:?}",
+                    value
+                ))),
+            }
+        };
+
+        let mut result_month: i64 = 0;
+        let mut result_days: i64 = 0;
+        let mut result_millis: i64 = 0;
+
+        let mut parts = value.split_whitespace();
+
+        loop {
+            let interval_period_str = parts.next();
+            if interval_period_str.is_none() {
+                break;
+            }
+
+            let (diff_month, diff_days, diff_millis) = calculate_from_part(
+                interval_period_str.unwrap(),
+                parts.next().unwrap_or("second"),
+            )?;
+
+            result_month += diff_month as i64;
+
+            if result_month > (i32::MAX as i64) {
+                return Err(DataFusionError::NotImplemented(format!(
+                    "Interval field value out of range: {:?}",
+                    value
+                )));
+            }
+
+            result_days += diff_days as i64;
+
+            if result_days > (i32::MAX as i64) {
+                return Err(DataFusionError::NotImplemented(format!(
+                    "Interval field value out of range: {:?}",
+                    value
+                )));
+            }
+
+            result_millis += diff_millis as i64;
+
+            if result_millis > (i32::MAX as i64) {
+                return Err(DataFusionError::NotImplemented(format!(
+                    "Interval field value out of range: {:?}",
+                    value
+                )));
+            }
+        }
+
+        // Interval is tricky thing
+        // 1 day is not 24 hours because timezones, 1 year != 365/364! 30 days != 1 month
+        // The true way to store and calculate intervals is to store it as it defined
+        // Due the fact that Arrow supports only two types YearMonth (month) and DayTime (day, time)
+        // It's not possible to store complex intervals
+        // It's possible to do select (NOW() + INTERVAL '1 year') + INTERVAL '1 day'; as workaround
+        if result_month != 0 && (result_days != 0 || result_millis != 0) {
+            return Err(DataFusionError::NotImplemented(format!(
+                "DF doesnt support complex intervals: {:?}",

Review comment:
       ```suggestion
                   "DF does not support intervals that have both a Year/Month part as well as Days/Hours/Mins/Seconds: {:?}. Hint: try breaking the interval into two parts, one with Year/Month and the other with Days/Hours/Mins/Seconds - e.g. (NOW() + INTERVAL '1 year') + INTERVAL '1 day'",
   ```
   
   Maybe too much in an error message but as I user of DataFusion I would appreciate having this context :)

##########
File path: rust/datafusion/src/sql/planner.rs
##########
@@ -982,6 +996,171 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> {
             ))),
         }
     }
+
+    fn sql_interval_to_literal(
+        &self,
+        value: &str,
+        leading_field: &Option<DateTimeField>,
+        leading_precision: &Option<u64>,
+        last_field: &Option<DateTimeField>,
+        fractional_seconds_precision: &Option<u64>,
+    ) -> Result<Expr> {
+        if leading_field.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with leading_field {:?}",
+                leading_field
+            )));
+        }
+
+        if leading_precision.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with leading_precision {:?}",
+                leading_precision
+            )));
+        }
+
+        if last_field.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with last_field {:?}",
+                last_field
+            )));
+        }
+
+        if fractional_seconds_precision.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with fractional_seconds_precision {:?}",
+                fractional_seconds_precision
+            )));
+        }
+
+        const SECONDS_PER_HOUR: f32 = 3_600_f32;
+        const MILLIS_PER_SECOND: f32 = 1_000_f32;
+
+        // We are storing parts as integers, it's why we need to align parts fractional
+        // INTERVAL '0.5 MONTH' = 15 days, INTERVAL '1.5 MONTH' = 1 month 15 days
+        // INTERVAL '0.5 DAY' = 12 hours, INTERVAL '1.5 DAY' = 1 day 12 hours
+        let align_interval_parts = |month_part: f32,
+                                    mut day_part: f32,
+                                    mut milles_part: f32|
+         -> (i32, i32, f32) {
+            // Convert fractional month to days, It's not supported by Arrow types, but anyway
+            day_part += (month_part - (month_part as i32) as f32) * 30_f32;
+
+            // Convert fractional days to hours
+            milles_part += (day_part - ((day_part as i32) as f32))
+                * 24_f32
+                * SECONDS_PER_HOUR
+                * MILLIS_PER_SECOND;
+
+            (month_part as i32, day_part as i32, milles_part)
+        };
+
+        let calculate_from_part = |interval_period_str: &str,
+                                   interval_type: &str|
+         -> Result<(i32, i32, f32)> {
+            let interval_period = match f32::from_str(interval_period_str) {
+                Ok(n) => n,
+                Err(_) => {
+                    return Err(DataFusionError::SQL(ParserError(format!(
+                        "Unsupported Interval Expression with value {:?}",
+                        value
+                    ))))
+                }
+            };
+
+            if interval_period > (i32::MAX as f32) {
+                return Err(DataFusionError::NotImplemented(format!(
+                    "Interval field value out of range: {:?}",
+                    value
+                )));
+            }
+
+            match interval_type.to_lowercase().as_str() {
+                "year" => Ok(align_interval_parts(interval_period * 12_f32, 0.0, 0.0)),
+                "month" => Ok(align_interval_parts(interval_period, 0.0, 0.0)),
+                "day" | "days" => Ok(align_interval_parts(0.0, interval_period, 0.0)),
+                "hour" | "hours" => {
+                    Ok((0, 0, interval_period * SECONDS_PER_HOUR * MILLIS_PER_SECOND))
+                }
+                "minutes" | "minute" => {
+                    Ok((0, 0, interval_period * 60_f32 * MILLIS_PER_SECOND))
+                }
+                "seconds" | "second" => Ok((0, 0, interval_period * MILLIS_PER_SECOND)),
+                "milliseconds" | "millisecond" => Ok((0, 0, interval_period)),
+                _ => Err(DataFusionError::NotImplemented(format!(
+                    "Invalid input syntax for type interval: {:?}",
+                    value
+                ))),
+            }
+        };
+
+        let mut result_month: i64 = 0;
+        let mut result_days: i64 = 0;
+        let mut result_millis: i64 = 0;
+
+        let mut parts = value.split_whitespace();
+
+        loop {
+            let interval_period_str = parts.next();
+            if interval_period_str.is_none() {
+                break;
+            }
+
+            let (diff_month, diff_days, diff_millis) = calculate_from_part(
+                interval_period_str.unwrap(),
+                parts.next().unwrap_or("second"),
+            )?;
+
+            result_month += diff_month as i64;
+
+            if result_month > (i32::MAX as i64) {
+                return Err(DataFusionError::NotImplemented(format!(
+                    "Interval field value out of range: {:?}",
+                    value
+                )));
+            }
+
+            result_days += diff_days as i64;
+
+            if result_days > (i32::MAX as i64) {
+                return Err(DataFusionError::NotImplemented(format!(
+                    "Interval field value out of range: {:?}",
+                    value
+                )));
+            }
+
+            result_millis += diff_millis as i64;
+
+            if result_millis > (i32::MAX as i64) {
+                return Err(DataFusionError::NotImplemented(format!(
+                    "Interval field value out of range: {:?}",
+                    value
+                )));
+            }
+        }
+
+        // Interval is tricky thing
+        // 1 day is not 24 hours because timezones, 1 year != 365/364! 30 days != 1 month
+        // The true way to store and calculate intervals is to store it as it defined

Review comment:
       💯  agree with this comment.
   
   I think an additional restriction we should perhaps consider is 'only support integral numbers of months + years' as the `YearMonthInterval` interval type can not capture precisely fractional parts 
   
   https://github.com/apache/arrow/blob/master/format/Schema.fbs#L253-L254




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] codecov-io edited a comment on pull request #9373: ARROW-10816: [Rust][DF] Initial support for Interval expressions

Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on pull request #9373:
URL: https://github.com/apache/arrow/pull/9373#issuecomment-770264240


   # [Codecov](https://codecov.io/gh/apache/arrow/pull/9373?src=pr&el=h1) Report
   > Merging [#9373](https://codecov.io/gh/apache/arrow/pull/9373?src=pr&el=desc) (30f9559) into [master](https://codecov.io/gh/apache/arrow/commit/77ae93d6ecaac8fb5f4a18ca5287b7456cd88784?el=desc) (77ae93d) will **decrease** coverage by `0.02%`.
   > The diff coverage is `57.95%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/arrow/pull/9373/graphs/tree.svg?width=650&height=150&src=pr&token=LpTCFbqVT1)](https://codecov.io/gh/apache/arrow/pull/9373?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #9373      +/-   ##
   ==========================================
   - Coverage   82.00%   81.98%   -0.03%     
   ==========================================
     Files         230      227       -3     
     Lines       53487    53564      +77     
   ==========================================
   + Hits        43864    43914      +50     
   - Misses       9623     9650      +27     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/arrow/pull/9373?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [rust/datafusion/src/sql/planner.rs](https://codecov.io/gh/apache/arrow/pull/9373/diff?src=pr&el=tree#diff-cnVzdC9kYXRhZnVzaW9uL3NyYy9zcWwvcGxhbm5lci5ycw==) | `81.78% <47.36%> (-2.41%)` | :arrow_down: |
   | [rust/datafusion/src/scalar.rs](https://codecov.io/gh/apache/arrow/pull/9373/diff?src=pr&el=tree#diff-cnVzdC9kYXRhZnVzaW9uL3NyYy9zY2FsYXIucnM=) | `56.28% <63.15%> (+0.43%)` | :arrow_up: |
   | [rust/arrow/src/util/display.rs](https://codecov.io/gh/apache/arrow/pull/9373/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvdXRpbC9kaXNwbGF5LnJz) | `77.14% <100.00%> (+1.38%)` | :arrow_up: |
   | [rust/datafusion/tests/sql.rs](https://codecov.io/gh/apache/arrow/pull/9373/diff?src=pr&el=tree#diff-cnVzdC9kYXRhZnVzaW9uL3Rlc3RzL3NxbC5ycw==) | `99.84% <100.00%> (+<0.01%)` | :arrow_up: |
   | [rust/parquet/src/record/api.rs](https://codecov.io/gh/apache/arrow/pull/9373/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9yZWNvcmQvYXBpLnJz) | `92.63% <0.00%> (-5.39%)` | :arrow_down: |
   | [rust/parquet/src/encodings/encoding.rs](https://codecov.io/gh/apache/arrow/pull/9373/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9lbmNvZGluZ3MvZW5jb2RpbmcucnM=) | `94.86% <0.00%> (-0.20%)` | :arrow_down: |
   | [rust/parquet/src/bin/parquet-read.rs](https://codecov.io/gh/apache/arrow/pull/9373/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9iaW4vcGFycXVldC1yZWFkLnJz) | | |
   | [rust/parquet/src/bin/parquet-schema.rs](https://codecov.io/gh/apache/arrow/pull/9373/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9iaW4vcGFycXVldC1zY2hlbWEucnM=) | | |
   | [rust/parquet/src/bin/parquet-rowcount.rs](https://codecov.io/gh/apache/arrow/pull/9373/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9iaW4vcGFycXVldC1yb3djb3VudC5ycw==) | | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/arrow/pull/9373?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/arrow/pull/9373?src=pr&el=footer). Last update [77ae93d...30f9559](https://codecov.io/gh/apache/arrow/pull/9373?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] ovr edited a comment on pull request #9373: ARROW-10816: [Rust][DF] Initial support for Interval expressions

Posted by GitBox <gi...@apache.org>.
ovr edited a comment on pull request #9373:
URL: https://github.com/apache/arrow/pull/9373#issuecomment-770445969


   Wow, I found that it's possible to write
   
   ```
   select INTERVAL '1 hour 1 second 1 month';
   ```
   
   Start to work on support for it. And I found how It's implemented https://github.com/apache/arrow/blob/master/cpp/src/arrow/type.h#L1238


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] ovr commented on a change in pull request #9373: ARROW-10816: [Rust][DF] Initial support for Interval expressions

Posted by GitBox <gi...@apache.org>.
ovr commented on a change in pull request #9373:
URL: https://github.com/apache/arrow/pull/9373#discussion_r569727025



##########
File path: rust/datafusion/src/sql/planner.rs
##########
@@ -894,6 +908,131 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> {
             ))),
         }
     }
+
+    fn sql_interval_to_literal(
+        &self,
+        value: &String,
+        leading_field: &Option<DateTimeField>,
+        leading_precision: &Option<u64>,
+        last_field: &Option<DateTimeField>,
+        fractional_seconds_precision: &Option<u64>,
+    ) -> Result<Expr> {
+        if leading_field.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with leading_field {:?}",
+                leading_field
+            )));
+        }
+
+        if leading_precision.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with leading_precision {:?}",
+                leading_precision
+            )));
+        }
+
+        if last_field.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with last_field {:?}",
+                last_field
+            )));
+        }
+
+        if fractional_seconds_precision.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with fractional_seconds_precision {:?}",
+                fractional_seconds_precision
+            )));
+        }
+
+        // Should we resort parts on overflow?
+        let resort_parts = |days: i32, seconds: f32| -> (i32, f32) { (days, seconds) };
+
+        let calculate_from_part =
+            |interval_period_str: &str, interval_type: &str| -> Result<(i32, f32)> {
+                let interval_period = match interval_period_str.parse::<f32>() {

Review comment:
       I am using `f32` for compability with PostgreSQL dialect:
   
   ```
   SELECT INTERVAL '0.5 day';
   ```
   
   ```
   0 years 0 mons 0 days 12 hours 0 mins 0.00 secs
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] ovr edited a comment on pull request #9373: ARROW-10816: [Rust][DF] Initial support for Interval expressions

Posted by GitBox <gi...@apache.org>.
ovr edited a comment on pull request #9373:
URL: https://github.com/apache/arrow/pull/9373#issuecomment-773220218


   I think this PR is now ready for review/merge.
   
   In next PR I am planing to work on:
   
   - BinaryExpressions for Dates and Intervals
   - ISO8601 intervals
   
   Thanks


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] codecov-io edited a comment on pull request #9373: ARROW-10816: [Rust][DF] Initial support for Interval expressions

Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on pull request #9373:
URL: https://github.com/apache/arrow/pull/9373#issuecomment-770264240


   # [Codecov](https://codecov.io/gh/apache/arrow/pull/9373?src=pr&el=h1) Report
   > Merging [#9373](https://codecov.io/gh/apache/arrow/pull/9373?src=pr&el=desc) (7cf5df7) into [master](https://codecov.io/gh/apache/arrow/commit/77ae93d6ecaac8fb5f4a18ca5287b7456cd88784?el=desc) (77ae93d) will **decrease** coverage by `0.02%`.
   > The diff coverage is `56.81%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/arrow/pull/9373/graphs/tree.svg?width=650&height=150&src=pr&token=LpTCFbqVT1)](https://codecov.io/gh/apache/arrow/pull/9373?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #9373      +/-   ##
   ==========================================
   - Coverage   82.00%   81.98%   -0.03%     
   ==========================================
     Files         230      227       -3     
     Lines       53487    53564      +77     
   ==========================================
   + Hits        43864    43914      +50     
   - Misses       9623     9650      +27     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/arrow/pull/9373?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [rust/datafusion/src/sql/planner.rs](https://codecov.io/gh/apache/arrow/pull/9373/diff?src=pr&el=tree#diff-cnVzdC9kYXRhZnVzaW9uL3NyYy9zcWwvcGxhbm5lci5ycw==) | `81.67% <45.61%> (-2.52%)` | :arrow_down: |
   | [rust/datafusion/src/scalar.rs](https://codecov.io/gh/apache/arrow/pull/9373/diff?src=pr&el=tree#diff-cnVzdC9kYXRhZnVzaW9uL3NyYy9zY2FsYXIucnM=) | `56.28% <63.15%> (+0.43%)` | :arrow_up: |
   | [rust/arrow/src/util/display.rs](https://codecov.io/gh/apache/arrow/pull/9373/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvdXRpbC9kaXNwbGF5LnJz) | `77.14% <100.00%> (+1.38%)` | :arrow_up: |
   | [rust/datafusion/tests/sql.rs](https://codecov.io/gh/apache/arrow/pull/9373/diff?src=pr&el=tree#diff-cnVzdC9kYXRhZnVzaW9uL3Rlc3RzL3NxbC5ycw==) | `99.84% <100.00%> (+<0.01%)` | :arrow_up: |
   | [rust/parquet/src/record/api.rs](https://codecov.io/gh/apache/arrow/pull/9373/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9yZWNvcmQvYXBpLnJz) | `92.63% <0.00%> (-5.39%)` | :arrow_down: |
   | [rust/parquet/src/bin/parquet-schema.rs](https://codecov.io/gh/apache/arrow/pull/9373/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9iaW4vcGFycXVldC1zY2hlbWEucnM=) | | |
   | [rust/parquet/src/bin/parquet-read.rs](https://codecov.io/gh/apache/arrow/pull/9373/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9iaW4vcGFycXVldC1yZWFkLnJz) | | |
   | [rust/parquet/src/bin/parquet-rowcount.rs](https://codecov.io/gh/apache/arrow/pull/9373/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9iaW4vcGFycXVldC1yb3djb3VudC5ycw==) | | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/arrow/pull/9373?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/arrow/pull/9373?src=pr&el=footer). Last update [77ae93d...30f9559](https://codecov.io/gh/apache/arrow/pull/9373?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] ovr commented on a change in pull request #9373: ARROW-10816: [Rust][DF] Initial support for Interval expressions

Posted by GitBox <gi...@apache.org>.
ovr commented on a change in pull request #9373:
URL: https://github.com/apache/arrow/pull/9373#discussion_r570054053



##########
File path: rust/datafusion/src/sql/planner.rs
##########
@@ -894,6 +908,131 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> {
             ))),
         }
     }
+
+    fn sql_interval_to_literal(
+        &self,
+        value: &String,
+        leading_field: &Option<DateTimeField>,
+        leading_precision: &Option<u64>,
+        last_field: &Option<DateTimeField>,
+        fractional_seconds_precision: &Option<u64>,
+    ) -> Result<Expr> {
+        if leading_field.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with leading_field {:?}",
+                leading_field
+            )));
+        }
+
+        if leading_precision.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with leading_precision {:?}",
+                leading_precision
+            )));
+        }
+
+        if last_field.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with last_field {:?}",
+                last_field
+            )));
+        }
+
+        if fractional_seconds_precision.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with fractional_seconds_precision {:?}",
+                fractional_seconds_precision
+            )));
+        }
+
+        // Should we resort parts on overflow?
+        let resort_parts = |days: i32, seconds: f32| -> (i32, f32) { (days, seconds) };
+
+        let calculate_from_part =
+            |interval_period_str: &str, interval_type: &str| -> Result<(i32, f32)> {
+                let interval_period = match interval_period_str.parse::<f32>() {

Review comment:
       I dont have any ideas, how it should be correct, but anyway using `0.5 DAY` or `0.5 HOUR` will be rly usefull and works as expected.
   
   Can you point my attention how it should works?
   
   Thanks




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] alamb commented on pull request #9373: ARROW-10816: [Rust][DF] Initial support for Interval expressions

Posted by GitBox <gi...@apache.org>.
alamb commented on pull request #9373:
URL: https://github.com/apache/arrow/pull/9373#issuecomment-773244774


   Thanks @ovr  -- I plan to review this carefully later today


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] ovr edited a comment on pull request #9373: ARROW-10816: [Rust][DF] Initial support for Interval expressions

Posted by GitBox <gi...@apache.org>.
ovr edited a comment on pull request #9373:
URL: https://github.com/apache/arrow/pull/9373#issuecomment-773220218






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] ovr edited a comment on pull request #9373: ARROW-10816: [Rust][DF] Initial support for Interval expressions

Posted by GitBox <gi...@apache.org>.
ovr edited a comment on pull request #9373:
URL: https://github.com/apache/arrow/pull/9373#issuecomment-773220218


   I think this PR is now ready for review/merge.
   
   In next PR I am planing to work on:
   
   - BinaryExpressions for Dates and Intervals
   - ISO8601 intervals
   - Introduce `ArrowIntervalType` trait
   
   Thanks


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] alamb commented on a change in pull request #9373: ARROW-10816: [Rust][DF] Initial support for Interval expressions

Posted by GitBox <gi...@apache.org>.
alamb commented on a change in pull request #9373:
URL: https://github.com/apache/arrow/pull/9373#discussion_r570403114



##########
File path: rust/datafusion/src/sql/planner.rs
##########
@@ -982,6 +996,171 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> {
             ))),
         }
     }
+
+    fn sql_interval_to_literal(
+        &self,
+        value: &str,
+        leading_field: &Option<DateTimeField>,
+        leading_precision: &Option<u64>,
+        last_field: &Option<DateTimeField>,
+        fractional_seconds_precision: &Option<u64>,
+    ) -> Result<Expr> {
+        if leading_field.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with leading_field {:?}",
+                leading_field
+            )));
+        }
+
+        if leading_precision.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with leading_precision {:?}",
+                leading_precision
+            )));
+        }
+
+        if last_field.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with last_field {:?}",
+                last_field
+            )));
+        }
+
+        if fractional_seconds_precision.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with fractional_seconds_precision {:?}",
+                fractional_seconds_precision
+            )));
+        }
+
+        const SECONDS_PER_HOUR: f32 = 3_600_f32;
+        const MILLIS_PER_SECOND: f32 = 1_000_f32;
+
+        // We are storing parts as integers, it's why we need to align parts fractional
+        // INTERVAL '0.5 MONTH' = 15 days, INTERVAL '1.5 MONTH' = 1 month 15 days

Review comment:
       I found it strange that the example of`1.5 months` is not supported whereas `0.5` and `1` are supported
   
   ```
   >  select (interval '0.5 month');
   +-------------------------------------------------+
   | IntervalDayTime("64424509440")                  |
   +-------------------------------------------------+
   | 0 years 0 mons 15 days 0 hours 0 mins 0.00 secs |
   +-------------------------------------------------+
   1 rows in set. Query took 0 seconds.
   >  select (interval '1 month');
   +------------------------------------------------+
   | IntervalYearMonth("1")                         |
   +------------------------------------------------+
   | 0 years 1 mons 0 days 0 hours 0 mins 0.00 secs |
   +------------------------------------------------+
   1 rows in set. Query took 0 seconds.
   >  select (interval '1.5 month');
   NotImplemented("DF doesnt support complex intervals: \"1.1 month\"")
   
   ```

##########
File path: rust/arrow/src/util/display.rs
##########
@@ -44,6 +44,66 @@ macro_rules! make_string {
     }};
 }
 
+macro_rules! make_string_interval_year_month {
+    ($column: ident, $row: ident) => {{
+        let array = $column
+            .as_any()
+            .downcast_ref::<array::IntervalYearMonthArray>()
+            .unwrap();
+
+        let s = if array.is_null($row) {
+            "0 years 0 mons 0 days 0 hours 0 mins 0.00 secs".to_string()
+        } else {
+            let interval = array.value($row) as f64;
+            let years = (interval / 12_f64).floor();
+            let month = interval - (years * 12_f64);
+
+            format!(
+                "{} years {} mons 0 days 0 hours 0 mins 0.00 secs",
+                years, month,
+            )
+        };
+
+        Ok(s)
+    }};
+}
+
+macro_rules! make_string_interval_day_time {
+    ($column: ident, $row: ident) => {{
+        let array = $column
+            .as_any()
+            .downcast_ref::<array::IntervalDayTimeArray>()
+            .unwrap();
+
+        let s = if array.is_null($row) {
+            "0 years 0 mons 0 days 0 hours 0 mins 0.00 secs".to_string()

Review comment:
       I wonder if `""` would be a more appropriate representation for `NULL`, to conform with the handling of null strings:
   
   https://github.com/apache/arrow/pull/9373/files#diff-821be3a8a9239cdd12ab2b3c979e174f7b936aa6bfad4b9bc76489b9923cf747R164
   
   I also think `"0 years 0 mons 0 days 0 hours 0 mins 0.00 secs"` might be the same representation if `value = 0` (so you couldn't distinguish between `0` and `NULL`(

##########
File path: rust/arrow/src/util/display.rs
##########
@@ -44,6 +44,66 @@ macro_rules! make_string {
     }};
 }
 
+macro_rules! make_string_interval_year_month {
+    ($column: ident, $row: ident) => {{
+        let array = $column
+            .as_any()
+            .downcast_ref::<array::IntervalYearMonthArray>()
+            .unwrap();
+
+        let s = if array.is_null($row) {
+            "0 years 0 mons 0 days 0 hours 0 mins 0.00 secs".to_string()

Review comment:
       same comment as below about NULL vs 0 handling

##########
File path: rust/datafusion/src/sql/planner.rs
##########
@@ -982,6 +996,171 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> {
             ))),
         }
     }
+
+    fn sql_interval_to_literal(
+        &self,
+        value: &str,
+        leading_field: &Option<DateTimeField>,
+        leading_precision: &Option<u64>,
+        last_field: &Option<DateTimeField>,
+        fractional_seconds_precision: &Option<u64>,
+    ) -> Result<Expr> {
+        if leading_field.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with leading_field {:?}",
+                leading_field
+            )));
+        }
+
+        if leading_precision.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with leading_precision {:?}",
+                leading_precision
+            )));
+        }
+
+        if last_field.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with last_field {:?}",
+                last_field
+            )));
+        }
+
+        if fractional_seconds_precision.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with fractional_seconds_precision {:?}",
+                fractional_seconds_precision
+            )));
+        }
+
+        const SECONDS_PER_HOUR: f32 = 3_600_f32;
+        const MILLIS_PER_SECOND: f32 = 1_000_f32;
+
+        // We are storing parts as integers, it's why we need to align parts fractional
+        // INTERVAL '0.5 MONTH' = 15 days, INTERVAL '1.5 MONTH' = 1 month 15 days

Review comment:
       I can understand why we might not support fractional months (as depending on the month the length of a month can not necessarily be expressed as an integral number of days or months).
   
   But it seems inconsitent to me to say `0.5 month` is `15 days` but `1.5 month` is not `45 days`
   
   I think the easiest thing for me to reason about is if they both error. 

##########
File path: rust/datafusion/src/sql/planner.rs
##########
@@ -982,6 +996,171 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> {
             ))),
         }
     }
+
+    fn sql_interval_to_literal(
+        &self,
+        value: &str,
+        leading_field: &Option<DateTimeField>,
+        leading_precision: &Option<u64>,
+        last_field: &Option<DateTimeField>,
+        fractional_seconds_precision: &Option<u64>,
+    ) -> Result<Expr> {
+        if leading_field.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with leading_field {:?}",
+                leading_field
+            )));
+        }
+
+        if leading_precision.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with leading_precision {:?}",
+                leading_precision
+            )));
+        }
+
+        if last_field.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with last_field {:?}",
+                last_field
+            )));
+        }
+
+        if fractional_seconds_precision.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with fractional_seconds_precision {:?}",
+                fractional_seconds_precision
+            )));
+        }
+
+        const SECONDS_PER_HOUR: f32 = 3_600_f32;
+        const MILLIS_PER_SECOND: f32 = 1_000_f32;
+
+        // We are storing parts as integers, it's why we need to align parts fractional
+        // INTERVAL '0.5 MONTH' = 15 days, INTERVAL '1.5 MONTH' = 1 month 15 days
+        // INTERVAL '0.5 DAY' = 12 hours, INTERVAL '1.5 DAY' = 1 day 12 hours
+        let align_interval_parts = |month_part: f32,
+                                    mut day_part: f32,
+                                    mut milles_part: f32|
+         -> (i32, i32, f32) {
+            // Convert fractional month to days, It's not supported by Arrow types, but anyway
+            day_part += (month_part - (month_part as i32) as f32) * 30_f32;
+
+            // Convert fractional days to hours
+            milles_part += (day_part - ((day_part as i32) as f32))
+                * 24_f32
+                * SECONDS_PER_HOUR
+                * MILLIS_PER_SECOND;
+
+            (month_part as i32, day_part as i32, milles_part)
+        };
+
+        let calculate_from_part = |interval_period_str: &str,
+                                   interval_type: &str|
+         -> Result<(i32, i32, f32)> {
+            let interval_period = match f32::from_str(interval_period_str) {
+                Ok(n) => n,
+                Err(_) => {
+                    return Err(DataFusionError::SQL(ParserError(format!(
+                        "Unsupported Interval Expression with value {:?}",
+                        value
+                    ))))
+                }
+            };
+
+            if interval_period > (i32::MAX as f32) {
+                return Err(DataFusionError::NotImplemented(format!(
+                    "Interval field value out of range: {:?}",
+                    value
+                )));
+            }
+
+            match interval_type.to_lowercase().as_str() {
+                "year" => Ok(align_interval_parts(interval_period * 12_f32, 0.0, 0.0)),
+                "month" => Ok(align_interval_parts(interval_period, 0.0, 0.0)),
+                "day" | "days" => Ok(align_interval_parts(0.0, interval_period, 0.0)),
+                "hour" | "hours" => {
+                    Ok((0, 0, interval_period * SECONDS_PER_HOUR * MILLIS_PER_SECOND))
+                }
+                "minutes" | "minute" => {
+                    Ok((0, 0, interval_period * 60_f32 * MILLIS_PER_SECOND))
+                }
+                "seconds" | "second" => Ok((0, 0, interval_period * MILLIS_PER_SECOND)),
+                "milliseconds" | "millisecond" => Ok((0, 0, interval_period)),
+                _ => Err(DataFusionError::NotImplemented(format!(
+                    "Invalid input syntax for type interval: {:?}",
+                    value
+                ))),
+            }
+        };
+
+        let mut result_month: i64 = 0;
+        let mut result_days: i64 = 0;
+        let mut result_millis: i64 = 0;
+
+        let mut parts = value.split_whitespace();
+
+        loop {
+            let interval_period_str = parts.next();
+            if interval_period_str.is_none() {
+                break;
+            }
+
+            let (diff_month, diff_days, diff_millis) = calculate_from_part(
+                interval_period_str.unwrap(),
+                parts.next().unwrap_or("second"),
+            )?;
+
+            result_month += diff_month as i64;
+
+            if result_month > (i32::MAX as i64) {
+                return Err(DataFusionError::NotImplemented(format!(
+                    "Interval field value out of range: {:?}",
+                    value
+                )));
+            }
+
+            result_days += diff_days as i64;
+
+            if result_days > (i32::MAX as i64) {
+                return Err(DataFusionError::NotImplemented(format!(
+                    "Interval field value out of range: {:?}",
+                    value
+                )));
+            }
+
+            result_millis += diff_millis as i64;
+
+            if result_millis > (i32::MAX as i64) {
+                return Err(DataFusionError::NotImplemented(format!(
+                    "Interval field value out of range: {:?}",
+                    value
+                )));
+            }
+        }
+
+        // Interval is tricky thing
+        // 1 day is not 24 hours because timezones, 1 year != 365/364! 30 days != 1 month
+        // The true way to store and calculate intervals is to store it as it defined
+        // Due the fact that Arrow supports only two types YearMonth (month) and DayTime (day, time)
+        // It's not possible to store complex intervals
+        // It's possible to do select (NOW() + INTERVAL '1 year') + INTERVAL '1 day'; as workaround
+        if result_month != 0 && (result_days != 0 || result_millis != 0) {
+            return Err(DataFusionError::NotImplemented(format!(
+                "DF doesnt support complex intervals: {:?}",

Review comment:
       ```suggestion
                   "DF does not support intervals that have both a Year/Month part as well as Days/Hours/Mins/Seconds: {:?}. Hint: try breaking the interval into two parts, one with Year/Month and the other with Days/Hours/Mins/Seconds - e.g. (NOW() + INTERVAL '1 year') + INTERVAL '1 day'",
   ```
   
   Maybe too much in an error message but as I user of DataFusion I would appreciate having this context :)

##########
File path: rust/datafusion/src/sql/planner.rs
##########
@@ -982,6 +996,171 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> {
             ))),
         }
     }
+
+    fn sql_interval_to_literal(
+        &self,
+        value: &str,
+        leading_field: &Option<DateTimeField>,
+        leading_precision: &Option<u64>,
+        last_field: &Option<DateTimeField>,
+        fractional_seconds_precision: &Option<u64>,
+    ) -> Result<Expr> {
+        if leading_field.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with leading_field {:?}",
+                leading_field
+            )));
+        }
+
+        if leading_precision.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with leading_precision {:?}",
+                leading_precision
+            )));
+        }
+
+        if last_field.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with last_field {:?}",
+                last_field
+            )));
+        }
+
+        if fractional_seconds_precision.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with fractional_seconds_precision {:?}",
+                fractional_seconds_precision
+            )));
+        }
+
+        const SECONDS_PER_HOUR: f32 = 3_600_f32;
+        const MILLIS_PER_SECOND: f32 = 1_000_f32;
+
+        // We are storing parts as integers, it's why we need to align parts fractional
+        // INTERVAL '0.5 MONTH' = 15 days, INTERVAL '1.5 MONTH' = 1 month 15 days
+        // INTERVAL '0.5 DAY' = 12 hours, INTERVAL '1.5 DAY' = 1 day 12 hours
+        let align_interval_parts = |month_part: f32,
+                                    mut day_part: f32,
+                                    mut milles_part: f32|
+         -> (i32, i32, f32) {
+            // Convert fractional month to days, It's not supported by Arrow types, but anyway
+            day_part += (month_part - (month_part as i32) as f32) * 30_f32;
+
+            // Convert fractional days to hours
+            milles_part += (day_part - ((day_part as i32) as f32))
+                * 24_f32
+                * SECONDS_PER_HOUR
+                * MILLIS_PER_SECOND;
+
+            (month_part as i32, day_part as i32, milles_part)
+        };
+
+        let calculate_from_part = |interval_period_str: &str,
+                                   interval_type: &str|
+         -> Result<(i32, i32, f32)> {
+            let interval_period = match f32::from_str(interval_period_str) {
+                Ok(n) => n,
+                Err(_) => {
+                    return Err(DataFusionError::SQL(ParserError(format!(
+                        "Unsupported Interval Expression with value {:?}",
+                        value
+                    ))))
+                }
+            };
+
+            if interval_period > (i32::MAX as f32) {
+                return Err(DataFusionError::NotImplemented(format!(
+                    "Interval field value out of range: {:?}",
+                    value
+                )));
+            }
+
+            match interval_type.to_lowercase().as_str() {
+                "year" => Ok(align_interval_parts(interval_period * 12_f32, 0.0, 0.0)),
+                "month" => Ok(align_interval_parts(interval_period, 0.0, 0.0)),
+                "day" | "days" => Ok(align_interval_parts(0.0, interval_period, 0.0)),
+                "hour" | "hours" => {
+                    Ok((0, 0, interval_period * SECONDS_PER_HOUR * MILLIS_PER_SECOND))
+                }
+                "minutes" | "minute" => {
+                    Ok((0, 0, interval_period * 60_f32 * MILLIS_PER_SECOND))
+                }
+                "seconds" | "second" => Ok((0, 0, interval_period * MILLIS_PER_SECOND)),
+                "milliseconds" | "millisecond" => Ok((0, 0, interval_period)),
+                _ => Err(DataFusionError::NotImplemented(format!(
+                    "Invalid input syntax for type interval: {:?}",
+                    value
+                ))),
+            }
+        };
+
+        let mut result_month: i64 = 0;
+        let mut result_days: i64 = 0;
+        let mut result_millis: i64 = 0;
+
+        let mut parts = value.split_whitespace();
+
+        loop {
+            let interval_period_str = parts.next();
+            if interval_period_str.is_none() {
+                break;
+            }
+
+            let (diff_month, diff_days, diff_millis) = calculate_from_part(
+                interval_period_str.unwrap(),
+                parts.next().unwrap_or("second"),
+            )?;
+
+            result_month += diff_month as i64;
+
+            if result_month > (i32::MAX as i64) {
+                return Err(DataFusionError::NotImplemented(format!(
+                    "Interval field value out of range: {:?}",
+                    value
+                )));
+            }
+
+            result_days += diff_days as i64;
+
+            if result_days > (i32::MAX as i64) {
+                return Err(DataFusionError::NotImplemented(format!(
+                    "Interval field value out of range: {:?}",
+                    value
+                )));
+            }
+
+            result_millis += diff_millis as i64;
+
+            if result_millis > (i32::MAX as i64) {
+                return Err(DataFusionError::NotImplemented(format!(
+                    "Interval field value out of range: {:?}",
+                    value
+                )));
+            }
+        }
+
+        // Interval is tricky thing
+        // 1 day is not 24 hours because timezones, 1 year != 365/364! 30 days != 1 month
+        // The true way to store and calculate intervals is to store it as it defined

Review comment:
       💯  agree with this comment.
   
   I think an additional restriction we should perhaps consider is 'only support integral numbers of months + years' as the `YearMonthInterval` interval type can not capture precisely fractional parts 
   
   https://github.com/apache/arrow/blob/master/format/Schema.fbs#L253-L254




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] ovr commented on pull request #9373: ARROW-10816: [Rust][DF] Initial support for Interval expressions

Posted by GitBox <gi...@apache.org>.
ovr commented on pull request #9373:
URL: https://github.com/apache/arrow/pull/9373#issuecomment-773220218


   I think this PR is now ready for review/merge.
   
   Thanks


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] ovr commented on a change in pull request #9373: ARROW-10816: [Rust][DF] Initial support for Interval expressions

Posted by GitBox <gi...@apache.org>.
ovr commented on a change in pull request #9373:
URL: https://github.com/apache/arrow/pull/9373#discussion_r569727025



##########
File path: rust/datafusion/src/sql/planner.rs
##########
@@ -894,6 +908,131 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> {
             ))),
         }
     }
+
+    fn sql_interval_to_literal(
+        &self,
+        value: &String,
+        leading_field: &Option<DateTimeField>,
+        leading_precision: &Option<u64>,
+        last_field: &Option<DateTimeField>,
+        fractional_seconds_precision: &Option<u64>,
+    ) -> Result<Expr> {
+        if leading_field.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with leading_field {:?}",
+                leading_field
+            )));
+        }
+
+        if leading_precision.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with leading_precision {:?}",
+                leading_precision
+            )));
+        }
+
+        if last_field.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with last_field {:?}",
+                last_field
+            )));
+        }
+
+        if fractional_seconds_precision.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with fractional_seconds_precision {:?}",
+                fractional_seconds_precision
+            )));
+        }
+
+        // Should we resort parts on overflow?
+        let resort_parts = |days: i32, seconds: f32| -> (i32, f32) { (days, seconds) };
+
+        let calculate_from_part =
+            |interval_period_str: &str, interval_type: &str| -> Result<(i32, f32)> {
+                let interval_period = match interval_period_str.parse::<f32>() {

Review comment:
       I am using integer to support:
   
   ```
   SELECT INTERVAL '0.5 day';
   ```
   
   ```
   0 years 0 mons 0 days 12 hours 0 mins 0.00 secs
   ```

##########
File path: rust/datafusion/src/sql/planner.rs
##########
@@ -894,6 +908,131 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> {
             ))),
         }
     }
+
+    fn sql_interval_to_literal(
+        &self,
+        value: &String,
+        leading_field: &Option<DateTimeField>,
+        leading_precision: &Option<u64>,
+        last_field: &Option<DateTimeField>,
+        fractional_seconds_precision: &Option<u64>,
+    ) -> Result<Expr> {
+        if leading_field.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with leading_field {:?}",
+                leading_field
+            )));
+        }
+
+        if leading_precision.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with leading_precision {:?}",
+                leading_precision
+            )));
+        }
+
+        if last_field.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with last_field {:?}",
+                last_field
+            )));
+        }
+
+        if fractional_seconds_precision.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with fractional_seconds_precision {:?}",
+                fractional_seconds_precision
+            )));
+        }
+
+        // Should we resort parts on overflow?
+        let resort_parts = |days: i32, seconds: f32| -> (i32, f32) { (days, seconds) };
+
+        let calculate_from_part =
+            |interval_period_str: &str, interval_type: &str| -> Result<(i32, f32)> {
+                let interval_period = match interval_period_str.parse::<f32>() {

Review comment:
       I am using `f32` to support:
   
   ```
   SELECT INTERVAL '0.5 day';
   ```
   
   ```
   0 years 0 mons 0 days 12 hours 0 mins 0.00 secs
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] github-actions[bot] commented on pull request #9373: ARROW-10816: [Rust][DataFushion] Initial support for Interval express…

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #9373:
URL: https://github.com/apache/arrow/pull/9373#issuecomment-770237770


   https://issues.apache.org/jira/browse/ARROW-10816


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] alamb closed pull request #9373: ARROW-10816: [Rust][DataFusion] Initial support for Interval expressions

Posted by GitBox <gi...@apache.org>.
alamb closed pull request #9373:
URL: https://github.com/apache/arrow/pull/9373


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] alamb commented on pull request #9373: ARROW-10816: [Rust][DataFusion] Initial support for Interval expressions

Posted by GitBox <gi...@apache.org>.
alamb commented on pull request #9373:
URL: https://github.com/apache/arrow/pull/9373#issuecomment-774466420


   I'll plan to merge this tomorrow (Sunday) or Monday unless anyone else has any comments


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] ovr commented on pull request #9373: ARROW-10816: [Rust][DF] Initial support for Interval expressions

Posted by GitBox <gi...@apache.org>.
ovr commented on pull request #9373:
URL: https://github.com/apache/arrow/pull/9373#issuecomment-770445969


   Wow, I found that it's possible to write
   
   ```
   select INTERVAL '1 hour 1 second 1 month';
   ```
   
   Start to work on support for it.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] ovr commented on a change in pull request #9373: ARROW-10816: [Rust][DF] Initial support for Interval expressions

Posted by GitBox <gi...@apache.org>.
ovr commented on a change in pull request #9373:
URL: https://github.com/apache/arrow/pull/9373#discussion_r570054053



##########
File path: rust/datafusion/src/sql/planner.rs
##########
@@ -894,6 +908,131 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> {
             ))),
         }
     }
+
+    fn sql_interval_to_literal(
+        &self,
+        value: &String,
+        leading_field: &Option<DateTimeField>,
+        leading_precision: &Option<u64>,
+        last_field: &Option<DateTimeField>,
+        fractional_seconds_precision: &Option<u64>,
+    ) -> Result<Expr> {
+        if leading_field.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with leading_field {:?}",
+                leading_field
+            )));
+        }
+
+        if leading_precision.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with leading_precision {:?}",
+                leading_precision
+            )));
+        }
+
+        if last_field.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with last_field {:?}",
+                last_field
+            )));
+        }
+
+        if fractional_seconds_precision.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with fractional_seconds_precision {:?}",
+                fractional_seconds_precision
+            )));
+        }
+
+        // Should we resort parts on overflow?
+        let resort_parts = |days: i32, seconds: f32| -> (i32, f32) { (days, seconds) };
+
+        let calculate_from_part =
+            |interval_period_str: &str, interval_type: &str| -> Result<(i32, f32)> {
+                let interval_period = match interval_period_str.parse::<f32>() {

Review comment:
       I dont have any ideas, how it should be correct, but anyway using `0.5 DAY` or `0.5 HOUR` will be rly usefull and works as expected.
   

##########
File path: rust/datafusion/src/sql/planner.rs
##########
@@ -894,6 +908,131 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> {
             ))),
         }
     }
+
+    fn sql_interval_to_literal(
+        &self,
+        value: &String,
+        leading_field: &Option<DateTimeField>,
+        leading_precision: &Option<u64>,
+        last_field: &Option<DateTimeField>,
+        fractional_seconds_precision: &Option<u64>,
+    ) -> Result<Expr> {
+        if leading_field.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with leading_field {:?}",
+                leading_field
+            )));
+        }
+
+        if leading_precision.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with leading_precision {:?}",
+                leading_precision
+            )));
+        }
+
+        if last_field.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with last_field {:?}",
+                last_field
+            )));
+        }
+
+        if fractional_seconds_precision.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with fractional_seconds_precision {:?}",
+                fractional_seconds_precision
+            )));
+        }
+
+        // Should we resort parts on overflow?
+        let resort_parts = |days: i32, seconds: f32| -> (i32, f32) { (days, seconds) };
+
+        let calculate_from_part =
+            |interval_period_str: &str, interval_type: &str| -> Result<(i32, f32)> {
+                let interval_period = match interval_period_str.parse::<f32>() {

Review comment:
       I dont have any ideas, how it should be correct, but anyway using `0.5 DAY` or `0.5 HOUR` will be rly usefull and works as expected.
   
   Can you point my attention how it should works?
   
   Thanks

##########
File path: rust/datafusion/src/sql/planner.rs
##########
@@ -894,6 +908,131 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> {
             ))),
         }
     }
+
+    fn sql_interval_to_literal(
+        &self,
+        value: &String,
+        leading_field: &Option<DateTimeField>,
+        leading_precision: &Option<u64>,
+        last_field: &Option<DateTimeField>,
+        fractional_seconds_precision: &Option<u64>,
+    ) -> Result<Expr> {
+        if leading_field.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with leading_field {:?}",
+                leading_field
+            )));
+        }
+
+        if leading_precision.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with leading_precision {:?}",
+                leading_precision
+            )));
+        }
+
+        if last_field.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with last_field {:?}",
+                last_field
+            )));
+        }
+
+        if fractional_seconds_precision.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with fractional_seconds_precision {:?}",
+                fractional_seconds_precision
+            )));
+        }
+
+        // Should we resort parts on overflow?
+        let resort_parts = |days: i32, seconds: f32| -> (i32, f32) { (days, seconds) };
+
+        let calculate_from_part =
+            |interval_period_str: &str, interval_type: &str| -> Result<(i32, f32)> {
+                let interval_period = match interval_period_str.parse::<f32>() {

Review comment:
       @Dandandan 👍  I can place a comment in this place with // @todo, when I will finish with https://github.com/apache/arrow/pull/9232 I can reuse Decimal128 in this place. Is it ok? Thanks

##########
File path: rust/datafusion/src/sql/planner.rs
##########
@@ -894,6 +908,131 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> {
             ))),
         }
     }
+
+    fn sql_interval_to_literal(
+        &self,
+        value: &String,
+        leading_field: &Option<DateTimeField>,
+        leading_precision: &Option<u64>,
+        last_field: &Option<DateTimeField>,
+        fractional_seconds_precision: &Option<u64>,
+    ) -> Result<Expr> {
+        if leading_field.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with leading_field {:?}",
+                leading_field
+            )));
+        }
+
+        if leading_precision.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with leading_precision {:?}",
+                leading_precision
+            )));
+        }
+
+        if last_field.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with last_field {:?}",
+                last_field
+            )));
+        }
+
+        if fractional_seconds_precision.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with fractional_seconds_precision {:?}",
+                fractional_seconds_precision
+            )));
+        }
+
+        // Should we resort parts on overflow?
+        let resort_parts = |days: i32, seconds: f32| -> (i32, f32) { (days, seconds) };
+
+        let calculate_from_part =
+            |interval_period_str: &str, interval_type: &str| -> Result<(i32, f32)> {
+                let interval_period = match interval_period_str.parse::<f32>() {

Review comment:
       @Dandandan
   
   > for example the number of milliseconds to be off for non-trivial fractions?
   
   It's better to use Decimal anyway.
   
    👍  I can place a comment in this place with // @todo, when I will finish with https://github.com/apache/arrow/pull/9232 I can reuse Decimal128 in this place. Is it ok? Thanks

##########
File path: rust/datafusion/src/sql/planner.rs
##########
@@ -982,6 +996,171 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> {
             ))),
         }
     }
+
+    fn sql_interval_to_literal(
+        &self,
+        value: &str,
+        leading_field: &Option<DateTimeField>,
+        leading_precision: &Option<u64>,
+        last_field: &Option<DateTimeField>,
+        fractional_seconds_precision: &Option<u64>,
+    ) -> Result<Expr> {
+        if leading_field.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with leading_field {:?}",
+                leading_field
+            )));
+        }
+
+        if leading_precision.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with leading_precision {:?}",
+                leading_precision
+            )));
+        }
+
+        if last_field.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with last_field {:?}",
+                last_field
+            )));
+        }
+
+        if fractional_seconds_precision.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with fractional_seconds_precision {:?}",
+                fractional_seconds_precision
+            )));
+        }
+
+        const SECONDS_PER_HOUR: f32 = 3_600_f32;
+        const MILLIS_PER_SECOND: f32 = 1_000_f32;
+
+        // We are storing parts as integers, it's why we need to align parts fractional
+        // INTERVAL '0.5 MONTH' = 15 days, INTERVAL '1.5 MONTH' = 1 month 15 days

Review comment:
       I am trying to achieve similar experience as PostgreSQL in this place. Users who will use this, should know about this thing.
   
   > NotImplemented("DF doesnt support complex intervals: \"1.1 month\"")
   
   I think, will be great to create an RFC for Apache Arrow to introduce support for `MonthDayTime` interval type, which will help to represent any interval.

##########
File path: rust/arrow/src/util/display.rs
##########
@@ -44,6 +44,66 @@ macro_rules! make_string {
     }};
 }
 
+macro_rules! make_string_interval_year_month {
+    ($column: ident, $row: ident) => {{
+        let array = $column
+            .as_any()
+            .downcast_ref::<array::IntervalYearMonthArray>()
+            .unwrap();
+
+        let s = if array.is_null($row) {
+            "0 years 0 mons 0 days 0 hours 0 mins 0.00 secs".to_string()

Review comment:
       👍 
   
   Yes, it's better to represent NULL as string instead of it. Thanks

##########
File path: rust/arrow/src/util/display.rs
##########
@@ -44,6 +44,66 @@ macro_rules! make_string {
     }};
 }
 
+macro_rules! make_string_interval_year_month {
+    ($column: ident, $row: ident) => {{
+        let array = $column
+            .as_any()
+            .downcast_ref::<array::IntervalYearMonthArray>()
+            .unwrap();
+
+        let s = if array.is_null($row) {
+            "0 years 0 mons 0 days 0 hours 0 mins 0.00 secs".to_string()
+        } else {
+            let interval = array.value($row) as f64;
+            let years = (interval / 12_f64).floor();
+            let month = interval - (years * 12_f64);
+
+            format!(
+                "{} years {} mons 0 days 0 hours 0 mins 0.00 secs",
+                years, month,
+            )
+        };
+
+        Ok(s)
+    }};
+}
+
+macro_rules! make_string_interval_day_time {
+    ($column: ident, $row: ident) => {{
+        let array = $column
+            .as_any()
+            .downcast_ref::<array::IntervalDayTimeArray>()
+            .unwrap();
+
+        let s = if array.is_null($row) {
+            "0 years 0 mons 0 days 0 hours 0 mins 0.00 secs".to_string()

Review comment:
       fixed.

##########
File path: rust/datafusion/src/sql/planner.rs
##########
@@ -982,6 +996,171 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> {
             ))),
         }
     }
+
+    fn sql_interval_to_literal(
+        &self,
+        value: &str,
+        leading_field: &Option<DateTimeField>,
+        leading_precision: &Option<u64>,
+        last_field: &Option<DateTimeField>,
+        fractional_seconds_precision: &Option<u64>,
+    ) -> Result<Expr> {
+        if leading_field.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with leading_field {:?}",
+                leading_field
+            )));
+        }
+
+        if leading_precision.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with leading_precision {:?}",
+                leading_precision
+            )));
+        }
+
+        if last_field.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with last_field {:?}",
+                last_field
+            )));
+        }
+
+        if fractional_seconds_precision.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with fractional_seconds_precision {:?}",
+                fractional_seconds_precision
+            )));
+        }
+
+        const SECONDS_PER_HOUR: f32 = 3_600_f32;
+        const MILLIS_PER_SECOND: f32 = 1_000_f32;
+
+        // We are storing parts as integers, it's why we need to align parts fractional
+        // INTERVAL '0.5 MONTH' = 15 days, INTERVAL '1.5 MONTH' = 1 month 15 days

Review comment:
       fixed.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] alamb commented on pull request #9373: ARROW-10816: [Rust][DF] Initial support for Interval expressions

Posted by GitBox <gi...@apache.org>.
alamb commented on pull request #9373:
URL: https://github.com/apache/arrow/pull/9373#issuecomment-773244774


   Thanks @ovr  -- I plan to review this carefully later today


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] Dandandan commented on a change in pull request #9373: ARROW-10816: [Rust][DF] Initial support for Interval expressions

Posted by GitBox <gi...@apache.org>.
Dandandan commented on a change in pull request #9373:
URL: https://github.com/apache/arrow/pull/9373#discussion_r569733252



##########
File path: rust/datafusion/src/sql/planner.rs
##########
@@ -894,6 +908,131 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> {
             ))),
         }
     }
+
+    fn sql_interval_to_literal(
+        &self,
+        value: &String,
+        leading_field: &Option<DateTimeField>,
+        leading_precision: &Option<u64>,
+        last_field: &Option<DateTimeField>,
+        fractional_seconds_precision: &Option<u64>,
+    ) -> Result<Expr> {
+        if leading_field.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with leading_field {:?}",
+                leading_field
+            )));
+        }
+
+        if leading_precision.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with leading_precision {:?}",
+                leading_precision
+            )));
+        }
+
+        if last_field.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with last_field {:?}",
+                last_field
+            )));
+        }
+
+        if fractional_seconds_precision.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with fractional_seconds_precision {:?}",
+                fractional_seconds_precision
+            )));
+        }
+
+        // Should we resort parts on overflow?
+        let resort_parts = |days: i32, seconds: f32| -> (i32, f32) { (days, seconds) };
+
+        let calculate_from_part =
+            |interval_period_str: &str, interval_type: &str| -> Result<(i32, f32)> {
+                let interval_period = match interval_period_str.parse::<f32>() {

Review comment:
       Didn't realize that was possible. What about rounding errors (and rounding in general)?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] Dandandan commented on a change in pull request #9373: ARROW-10816: [Rust][DF] Initial support for Interval expressions

Posted by GitBox <gi...@apache.org>.
Dandandan commented on a change in pull request #9373:
URL: https://github.com/apache/arrow/pull/9373#discussion_r570153622



##########
File path: rust/datafusion/src/sql/planner.rs
##########
@@ -894,6 +908,131 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> {
             ))),
         }
     }
+
+    fn sql_interval_to_literal(
+        &self,
+        value: &String,
+        leading_field: &Option<DateTimeField>,
+        leading_precision: &Option<u64>,
+        last_field: &Option<DateTimeField>,
+        fractional_seconds_precision: &Option<u64>,
+    ) -> Result<Expr> {
+        if leading_field.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with leading_field {:?}",
+                leading_field
+            )));
+        }
+
+        if leading_precision.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with leading_precision {:?}",
+                leading_precision
+            )));
+        }
+
+        if last_field.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with last_field {:?}",
+                last_field
+            )));
+        }
+
+        if fractional_seconds_precision.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with fractional_seconds_precision {:?}",
+                fractional_seconds_precision
+            )));
+        }
+
+        // Should we resort parts on overflow?
+        let resort_parts = |days: i32, seconds: f32| -> (i32, f32) { (days, seconds) };
+
+        let calculate_from_part =
+            |interval_period_str: &str, interval_type: &str| -> Result<(i32, f32)> {
+                let interval_period = match interval_period_str.parse::<f32>() {

Review comment:
       I am not sure whether rounding errors are going to be a problem, I could imagine some calculation might cause for example the number of milliseconds to be off for non-trivial fractions?
   In that case we should use some decimal type instead of `f32` in order to avoid those rounding errors.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] ovr commented on a change in pull request #9373: ARROW-10816: [Rust][DF] Initial support for Interval expressions

Posted by GitBox <gi...@apache.org>.
ovr commented on a change in pull request #9373:
URL: https://github.com/apache/arrow/pull/9373#discussion_r570156779



##########
File path: rust/datafusion/src/sql/planner.rs
##########
@@ -894,6 +908,131 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> {
             ))),
         }
     }
+
+    fn sql_interval_to_literal(
+        &self,
+        value: &String,
+        leading_field: &Option<DateTimeField>,
+        leading_precision: &Option<u64>,
+        last_field: &Option<DateTimeField>,
+        fractional_seconds_precision: &Option<u64>,
+    ) -> Result<Expr> {
+        if leading_field.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with leading_field {:?}",
+                leading_field
+            )));
+        }
+
+        if leading_precision.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with leading_precision {:?}",
+                leading_precision
+            )));
+        }
+
+        if last_field.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with last_field {:?}",
+                last_field
+            )));
+        }
+
+        if fractional_seconds_precision.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with fractional_seconds_precision {:?}",
+                fractional_seconds_precision
+            )));
+        }
+
+        // Should we resort parts on overflow?
+        let resort_parts = |days: i32, seconds: f32| -> (i32, f32) { (days, seconds) };
+
+        let calculate_from_part =
+            |interval_period_str: &str, interval_type: &str| -> Result<(i32, f32)> {
+                let interval_period = match interval_period_str.parse::<f32>() {

Review comment:
       @Dandandan 👍  I can place a comment in this place with // @todo, when I will finish with https://github.com/apache/arrow/pull/9232 I can reuse Decimal128 in this place. Is it ok? Thanks




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] ovr commented on pull request #9373: ARROW-10816: [Rust][DF] Initial support for Interval expressions

Posted by GitBox <gi...@apache.org>.
ovr commented on pull request #9373:
URL: https://github.com/apache/arrow/pull/9373#issuecomment-773220218


   I think this PR is now ready for review/merge.
   
   Thanks


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] ovr commented on a change in pull request #9373: ARROW-10816: [Rust][DF] Initial support for Interval expressions

Posted by GitBox <gi...@apache.org>.
ovr commented on a change in pull request #9373:
URL: https://github.com/apache/arrow/pull/9373#discussion_r569745071



##########
File path: rust/datafusion/tests/sql.rs
##########
@@ -1941,6 +1941,59 @@ async fn boolean_expressions() -> Result<()> {
     Ok(())
 }
 
+#[tokio::test]
+async fn interval_expressions() -> Result<()> {
+    let mut ctx = ExecutionContext::new();
+    let sql = "SELECT
+        (interval '1') as interval_1,
+        (interval '1 second') as interval_2,
+        (interval '500 milliseconds') as interval_3,
+        (interval '5 second') as interval_4,
+        (interval '1 minute') as interval_5,
+        (interval '0.5 minute') as interval_6,
+        (interval '.5 minute') as interval_7,
+        (interval '5 minute') as interval_8,
+        (interval '5 minute 1 second') as interval_9,
+        (interval '1 hour') as interval_10,
+        (interval '5 hour') as interval_11,
+        (interval '1 day') as interval_12,
+        (interval '5 day') as interval_13,
+        (interval '5 day 4 hours 3 minutes 2 seconds 100 milliseconds') as interval_14,
+        (interval '1 month') as interval_15,
+        (interval '5 month') as interval_16,
+        (interval '13 month') as interval_17,
+        (interval '1 year') as interval_18,
+        (interval '2 year') as interval_19,
+        (interval '1 day 1') as interval_20

Review comment:
       Thanks, added.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] alamb commented on a change in pull request #9373: ARROW-10816: [Rust][DataFusion] Initial support for Interval expressions

Posted by GitBox <gi...@apache.org>.
alamb commented on a change in pull request #9373:
URL: https://github.com/apache/arrow/pull/9373#discussion_r570887560



##########
File path: rust/datafusion/src/sql/planner.rs
##########
@@ -982,6 +996,171 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> {
             ))),
         }
     }
+
+    fn sql_interval_to_literal(
+        &self,
+        value: &str,
+        leading_field: &Option<DateTimeField>,
+        leading_precision: &Option<u64>,
+        last_field: &Option<DateTimeField>,
+        fractional_seconds_precision: &Option<u64>,
+    ) -> Result<Expr> {
+        if leading_field.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with leading_field {:?}",
+                leading_field
+            )));
+        }
+
+        if leading_precision.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with leading_precision {:?}",
+                leading_precision
+            )));
+        }
+
+        if last_field.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with last_field {:?}",
+                last_field
+            )));
+        }
+
+        if fractional_seconds_precision.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with fractional_seconds_precision {:?}",
+                fractional_seconds_precision
+            )));
+        }
+
+        const SECONDS_PER_HOUR: f32 = 3_600_f32;
+        const MILLIS_PER_SECOND: f32 = 1_000_f32;
+
+        // We are storing parts as integers, it's why we need to align parts fractional
+        // INTERVAL '0.5 MONTH' = 15 days, INTERVAL '1.5 MONTH' = 1 month 15 days

Review comment:
       If this is consistent with postgres then it is good with me




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] alamb commented on a change in pull request #9373: ARROW-10816: [Rust][DF] Initial support for Interval expressions

Posted by GitBox <gi...@apache.org>.
alamb commented on a change in pull request #9373:
URL: https://github.com/apache/arrow/pull/9373#discussion_r569818683



##########
File path: rust/datafusion/src/sql/planner.rs
##########
@@ -894,6 +908,131 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> {
             ))),
         }
     }
+
+    fn sql_interval_to_literal(
+        &self,
+        value: &String,
+        leading_field: &Option<DateTimeField>,
+        leading_precision: &Option<u64>,
+        last_field: &Option<DateTimeField>,
+        fractional_seconds_precision: &Option<u64>,
+    ) -> Result<Expr> {
+        if leading_field.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with leading_field {:?}",
+                leading_field
+            )));
+        }
+
+        if leading_precision.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with leading_precision {:?}",
+                leading_precision
+            )));
+        }
+
+        if last_field.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with last_field {:?}",
+                last_field
+            )));
+        }
+
+        if fractional_seconds_precision.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with fractional_seconds_precision {:?}",
+                fractional_seconds_precision
+            )));
+        }
+
+        // Should we resort parts on overflow?
+        let resort_parts = |days: i32, seconds: f32| -> (i32, f32) { (days, seconds) };
+
+        let calculate_from_part =
+            |interval_period_str: &str, interval_type: &str| -> Result<(i32, f32)> {
+                let interval_period = match interval_period_str.parse::<f32>() {
+                    Ok(n) => n,
+                    Err(_) => {
+                        return Err(DataFusionError::SQL(ParserError(format!(
+                            "Unsupported Interval Expression with value {:?}",
+                            value
+                        ))))
+                    }
+                };
+
+                if interval_period > (i32::MAX as f32) {
+                    return Err(DataFusionError::NotImplemented(format!(
+                        "Interval field value out of range: {:?}",
+                        value
+                    )));
+                }
+
+                const SECONDS_PER_HOUR: f32 = 3_600_f32;
+                const MILLIS_PER_SECOND: f32 = 1_000_f32;
+
+                match interval_type.to_lowercase().as_str() {
+                    "year" => Ok(((interval_period * 365_f32) as i32, 0.0)),
+                    "month" => Ok(((interval_period * 30_f32) as i32, 0.0)),
+                    "day" | "days" => Ok((interval_period as i32, 0.0)),
+                    "hour" | "hours" => Ok(resort_parts(
+                        0,
+                        interval_period * SECONDS_PER_HOUR * MILLIS_PER_SECOND,
+                    )),
+                    "minutes" | "minute" => Ok(resort_parts(
+                        0,
+                        interval_period * 60_f32 * MILLIS_PER_SECOND,
+                    )),
+                    "seconds" | "second" => {
+                        Ok(resort_parts(0, interval_period * MILLIS_PER_SECOND))
+                    }
+                    "milliseconds" | "millisecond" => Ok((0, interval_period)),
+                    _ => {
+                        return Err(DataFusionError::NotImplemented(format!(
+                            "Invalid input syntax for type interval: {:?}",
+                            value
+                        )))
+                    }
+                }
+            };
+
+        let mut parts = value.split_whitespace();
+        let mut result_days: i64 = 0;
+        let mut result_millis: i64 = 0;
+
+        loop {

Review comment:
       @ovr  -- I see what you are saying (that the iterator may need to consume an extra item during the loop). Thank you for trying -- I am sorry if you wasted much time on that




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] codecov-io edited a comment on pull request #9373: ARROW-10816: [Rust][DF] Initial support for Interval expressions

Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on pull request #9373:
URL: https://github.com/apache/arrow/pull/9373#issuecomment-770264240


   # [Codecov](https://codecov.io/gh/apache/arrow/pull/9373?src=pr&el=h1) Report
   > Merging [#9373](https://codecov.io/gh/apache/arrow/pull/9373?src=pr&el=desc) (3e10d6d) into [master](https://codecov.io/gh/apache/arrow/commit/3d4c2bb29bb89dfc7e24bd1295f81849664e46ef?el=desc) (3d4c2bb) will **decrease** coverage by `0.05%`.
   > The diff coverage is `57.85%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/arrow/pull/9373/graphs/tree.svg?width=650&height=150&src=pr&token=LpTCFbqVT1)](https://codecov.io/gh/apache/arrow/pull/9373?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #9373      +/-   ##
   ==========================================
   - Coverage   82.10%   82.05%   -0.06%     
   ==========================================
     Files         230      230              
     Lines       54043    54164     +121     
   ==========================================
   + Hits        44374    44442      +68     
   - Misses       9669     9722      +53     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/arrow/pull/9373?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [rust/arrow/src/datatypes.rs](https://codecov.io/gh/apache/arrow/pull/9373/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvZGF0YXR5cGVzLnJz) | `78.33% <ø> (ø)` | |
   | [rust/datafusion/src/sql/planner.rs](https://codecov.io/gh/apache/arrow/pull/9373/diff?src=pr&el=tree#diff-cnVzdC9kYXRhZnVzaW9uL3NyYy9zcWwvcGxhbm5lci5ycw==) | `82.70% <51.11%> (-3.01%)` | :arrow_down: |
   | [rust/datafusion/src/scalar.rs](https://codecov.io/gh/apache/arrow/pull/9373/diff?src=pr&el=tree#diff-cnVzdC9kYXRhZnVzaW9uL3NyYy9zY2FsYXIucnM=) | `56.28% <63.15%> (+0.43%)` | :arrow_up: |
   | [rust/arrow/src/util/display.rs](https://codecov.io/gh/apache/arrow/pull/9373/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvdXRpbC9kaXNwbGF5LnJz) | `77.14% <100.00%> (+1.38%)` | :arrow_up: |
   | [rust/datafusion/tests/sql.rs](https://codecov.io/gh/apache/arrow/pull/9373/diff?src=pr&el=tree#diff-cnVzdC9kYXRhZnVzaW9uL3Rlc3RzL3NxbC5ycw==) | `99.85% <100.00%> (+<0.01%)` | :arrow_up: |
   | [rust/arrow/src/array/equal/utils.rs](https://codecov.io/gh/apache/arrow/pull/9373/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvZXF1YWwvdXRpbHMucnM=) | `75.49% <0.00%> (-0.99%)` | :arrow_down: |
   | [rust/parquet/src/encodings/encoding.rs](https://codecov.io/gh/apache/arrow/pull/9373/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9lbmNvZGluZ3MvZW5jb2RpbmcucnM=) | `94.86% <0.00%> (-0.20%)` | :arrow_down: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/arrow/pull/9373?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/arrow/pull/9373?src=pr&el=footer). Last update [3d4c2bb...7e55dd4](https://codecov.io/gh/apache/arrow/pull/9373?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] codecov-io edited a comment on pull request #9373: ARROW-10816: [Rust][DF] Initial support for Interval expressions

Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on pull request #9373:
URL: https://github.com/apache/arrow/pull/9373#issuecomment-770264240


   # [Codecov](https://codecov.io/gh/apache/arrow/pull/9373?src=pr&el=h1) Report
   > Merging [#9373](https://codecov.io/gh/apache/arrow/pull/9373?src=pr&el=desc) (fc1b71b) into [master](https://codecov.io/gh/apache/arrow/commit/77ae93d6ecaac8fb5f4a18ca5287b7456cd88784?el=desc) (77ae93d) will **decrease** coverage by `0.10%`.
   > The diff coverage is `0.00%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/arrow/pull/9373/graphs/tree.svg?width=650&height=150&src=pr&token=LpTCFbqVT1)](https://codecov.io/gh/apache/arrow/pull/9373?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #9373      +/-   ##
   ==========================================
   - Coverage   82.00%   81.89%   -0.11%     
   ==========================================
     Files         230      227       -3     
     Lines       53487    53557      +70     
   ==========================================
   - Hits        43864    43863       -1     
   - Misses       9623     9694      +71     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/arrow/pull/9373?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [rust/arrow/src/util/display.rs](https://codecov.io/gh/apache/arrow/pull/9373/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvdXRpbC9kaXNwbGF5LnJz) | `71.42% <0.00%> (-4.33%)` | :arrow_down: |
   | [rust/datafusion/src/scalar.rs](https://codecov.io/gh/apache/arrow/pull/9373/diff?src=pr&el=tree#diff-cnVzdC9kYXRhZnVzaW9uL3NyYy9zY2FsYXIucnM=) | `52.51% <0.00%> (-3.34%)` | :arrow_down: |
   | [rust/datafusion/src/sql/planner.rs](https://codecov.io/gh/apache/arrow/pull/9373/diff?src=pr&el=tree#diff-cnVzdC9kYXRhZnVzaW9uL3NyYy9zcWwvcGxhbm5lci5ycw==) | `78.69% <0.00%> (-5.50%)` | :arrow_down: |
   | [rust/datafusion/tests/sql.rs](https://codecov.io/gh/apache/arrow/pull/9373/diff?src=pr&el=tree#diff-cnVzdC9kYXRhZnVzaW9uL3Rlc3RzL3NxbC5ycw==) | `99.76% <0.00%> (-0.08%)` | :arrow_down: |
   | [rust/parquet/src/record/api.rs](https://codecov.io/gh/apache/arrow/pull/9373/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9yZWNvcmQvYXBpLnJz) | `92.63% <0.00%> (-5.39%)` | :arrow_down: |
   | [rust/parquet/src/encodings/encoding.rs](https://codecov.io/gh/apache/arrow/pull/9373/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9lbmNvZGluZ3MvZW5jb2RpbmcucnM=) | `94.86% <0.00%> (-0.20%)` | :arrow_down: |
   | [rust/parquet/src/bin/parquet-rowcount.rs](https://codecov.io/gh/apache/arrow/pull/9373/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9iaW4vcGFycXVldC1yb3djb3VudC5ycw==) | | |
   | [rust/parquet/src/bin/parquet-schema.rs](https://codecov.io/gh/apache/arrow/pull/9373/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9iaW4vcGFycXVldC1zY2hlbWEucnM=) | | |
   | [rust/parquet/src/bin/parquet-read.rs](https://codecov.io/gh/apache/arrow/pull/9373/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9iaW4vcGFycXVldC1yZWFkLnJz) | | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/arrow/pull/9373?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/arrow/pull/9373?src=pr&el=footer). Last update [77ae93d...30f9559](https://codecov.io/gh/apache/arrow/pull/9373?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] ovr commented on a change in pull request #9373: ARROW-10816: [Rust][DataFusion] Initial support for Interval expressions

Posted by GitBox <gi...@apache.org>.
ovr commented on a change in pull request #9373:
URL: https://github.com/apache/arrow/pull/9373#discussion_r570532111



##########
File path: rust/datafusion/src/sql/planner.rs
##########
@@ -982,6 +996,171 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> {
             ))),
         }
     }
+
+    fn sql_interval_to_literal(
+        &self,
+        value: &str,
+        leading_field: &Option<DateTimeField>,
+        leading_precision: &Option<u64>,
+        last_field: &Option<DateTimeField>,
+        fractional_seconds_precision: &Option<u64>,
+    ) -> Result<Expr> {
+        if leading_field.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with leading_field {:?}",
+                leading_field
+            )));
+        }
+
+        if leading_precision.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with leading_precision {:?}",
+                leading_precision
+            )));
+        }
+
+        if last_field.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with last_field {:?}",
+                last_field
+            )));
+        }
+
+        if fractional_seconds_precision.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with fractional_seconds_precision {:?}",
+                fractional_seconds_precision
+            )));
+        }
+
+        const SECONDS_PER_HOUR: f32 = 3_600_f32;
+        const MILLIS_PER_SECOND: f32 = 1_000_f32;
+
+        // We are storing parts as integers, it's why we need to align parts fractional
+        // INTERVAL '0.5 MONTH' = 15 days, INTERVAL '1.5 MONTH' = 1 month 15 days

Review comment:
       I am trying to achieve similar experience as PostgreSQL in this place. Users who will use this, should know about this thing.
   
   > NotImplemented("DF doesnt support complex intervals: \"1.1 month\"")
   
   I think, will be great to create an RFC for Apache Arrow to introduce support for `MonthDayTime` interval type, which will help to represent any interval.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] Dandandan commented on a change in pull request #9373: ARROW-10816: [Rust][DF] Initial support for Interval expressions

Posted by GitBox <gi...@apache.org>.
Dandandan commented on a change in pull request #9373:
URL: https://github.com/apache/arrow/pull/9373#discussion_r569597857



##########
File path: rust/datafusion/src/sql/planner.rs
##########
@@ -894,6 +908,131 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> {
             ))),
         }
     }
+
+    fn sql_interval_to_literal(
+        &self,
+        value: &String,
+        leading_field: &Option<DateTimeField>,
+        leading_precision: &Option<u64>,
+        last_field: &Option<DateTimeField>,
+        fractional_seconds_precision: &Option<u64>,
+    ) -> Result<Expr> {
+        if leading_field.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with leading_field {:?}",
+                leading_field
+            )));
+        }
+
+        if leading_precision.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with leading_precision {:?}",
+                leading_precision
+            )));
+        }
+
+        if last_field.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with last_field {:?}",
+                last_field
+            )));
+        }
+
+        if fractional_seconds_precision.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with fractional_seconds_precision {:?}",
+                fractional_seconds_precision
+            )));
+        }
+
+        // Should we resort parts on overflow?
+        let resort_parts = |days: i32, seconds: f32| -> (i32, f32) { (days, seconds) };
+
+        let calculate_from_part =
+            |interval_period_str: &str, interval_type: &str| -> Result<(i32, f32)> {
+                let interval_period = match interval_period_str.parse::<f32>() {
+                    Ok(n) => n,
+                    Err(_) => {
+                        return Err(DataFusionError::SQL(ParserError(format!(
+                            "Unsupported Interval Expression with value {:?}",
+                            value
+                        ))))
+                    }
+                };
+
+                if interval_period > (i32::MAX as f32) {
+                    return Err(DataFusionError::NotImplemented(format!(
+                        "Interval field value out of range: {:?}",
+                        value
+                    )));
+                }
+
+                const SECONDS_PER_HOUR: f32 = 3_600_f32;

Review comment:
       I think using integers would be more correct?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] ovr edited a comment on pull request #9373: ARROW-10816: [Rust][DF] Initial support for Interval expressions

Posted by GitBox <gi...@apache.org>.
ovr edited a comment on pull request #9373:
URL: https://github.com/apache/arrow/pull/9373#issuecomment-770445969


   Wow, I found that it's possible to write
   
   ```
   select INTERVAL '1 hour 1 second 1 month';
   select INTERVAL '250 year 1 month 1 day 100 milliseconds';
   ```
   
   Start to work on support for it. 
   
   And I found how It's implemented in C++
   
   https://github.com/apache/arrow/blob/master/cpp/src/arrow/type.h#L1238
   
   Started to rework this branch, marked as DRAFT.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] Dandandan commented on a change in pull request #9373: ARROW-10816: [Rust][DF] Initial support for Interval expressions

Posted by GitBox <gi...@apache.org>.
Dandandan commented on a change in pull request #9373:
URL: https://github.com/apache/arrow/pull/9373#discussion_r569600040



##########
File path: rust/datafusion/src/sql/planner.rs
##########
@@ -894,6 +908,131 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> {
             ))),
         }
     }
+
+    fn sql_interval_to_literal(
+        &self,
+        value: &String,
+        leading_field: &Option<DateTimeField>,
+        leading_precision: &Option<u64>,
+        last_field: &Option<DateTimeField>,
+        fractional_seconds_precision: &Option<u64>,
+    ) -> Result<Expr> {
+        if leading_field.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with leading_field {:?}",
+                leading_field
+            )));
+        }
+
+        if leading_precision.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with leading_precision {:?}",
+                leading_precision
+            )));
+        }
+
+        if last_field.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with last_field {:?}",
+                last_field
+            )));
+        }
+
+        if fractional_seconds_precision.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with fractional_seconds_precision {:?}",
+                fractional_seconds_precision
+            )));
+        }
+
+        // Should we resort parts on overflow?
+        let resort_parts = |days: i32, seconds: f32| -> (i32, f32) { (days, seconds) };
+
+        let calculate_from_part =
+            |interval_period_str: &str, interval_type: &str| -> Result<(i32, f32)> {
+                let interval_period = match interval_period_str.parse::<f32>() {

Review comment:
       I think this should be `i32` as well as the other types `f32`?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] alamb commented on pull request #9373: ARROW-10816: [Rust][DataFusion] Initial support for Interval expressions

Posted by GitBox <gi...@apache.org>.
alamb commented on pull request #9373:
URL: https://github.com/apache/arrow/pull/9373#issuecomment-774656720


   Thanks again for sticking with this @ovr  -- nice work.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] ovr commented on a change in pull request #9373: ARROW-10816: [Rust][DF] Initial support for Interval expressions

Posted by GitBox <gi...@apache.org>.
ovr commented on a change in pull request #9373:
URL: https://github.com/apache/arrow/pull/9373#discussion_r568170411



##########
File path: rust/datafusion/src/sql/planner.rs
##########
@@ -894,6 +908,131 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> {
             ))),
         }
     }
+
+    fn sql_interval_to_literal(
+        &self,
+        value: &String,
+        leading_field: &Option<DateTimeField>,
+        leading_precision: &Option<u64>,
+        last_field: &Option<DateTimeField>,
+        fractional_seconds_precision: &Option<u64>,
+    ) -> Result<Expr> {
+        if leading_field.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with leading_field {:?}",
+                leading_field
+            )));
+        }
+
+        if leading_precision.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with leading_precision {:?}",
+                leading_precision
+            )));
+        }
+
+        if last_field.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with last_field {:?}",
+                last_field
+            )));
+        }
+
+        if fractional_seconds_precision.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with fractional_seconds_precision {:?}",
+                fractional_seconds_precision
+            )));
+        }
+
+        // Should we resort parts on overflow?
+        let resort_parts = |days: i32, seconds: f32| -> (i32, f32) { (days, seconds) };
+
+        let calculate_from_part =
+            |interval_period_str: &str, interval_type: &str| -> Result<(i32, f32)> {
+                let interval_period = match interval_period_str.parse::<f32>() {
+                    Ok(n) => n,
+                    Err(_) => {
+                        return Err(DataFusionError::SQL(ParserError(format!(
+                            "Unsupported Interval Expression with value {:?}",
+                            value
+                        ))))
+                    }
+                };
+
+                if interval_period > (i32::MAX as f32) {
+                    return Err(DataFusionError::NotImplemented(format!(
+                        "Interval field value out of range: {:?}",
+                        value
+                    )));
+                }
+
+                const SECONDS_PER_HOUR: f32 = 3_600_f32;
+                const MILLIS_PER_SECOND: f32 = 1_000_f32;
+
+                match interval_type.to_lowercase().as_str() {
+                    "year" => Ok(((interval_period * 365_f32) as i32, 0.0)),
+                    "month" => Ok(((interval_period * 30_f32) as i32, 0.0)),
+                    "day" | "days" => Ok((interval_period as i32, 0.0)),
+                    "hour" | "hours" => Ok(resort_parts(
+                        0,
+                        interval_period * SECONDS_PER_HOUR * MILLIS_PER_SECOND,
+                    )),
+                    "minutes" | "minute" => Ok(resort_parts(
+                        0,
+                        interval_period * 60_f32 * MILLIS_PER_SECOND,
+                    )),
+                    "seconds" | "second" => {
+                        Ok(resort_parts(0, interval_period * MILLIS_PER_SECOND))
+                    }
+                    "milliseconds" | "millisecond" => Ok((0, interval_period)),
+                    _ => {
+                        return Err(DataFusionError::NotImplemented(format!(
+                            "Invalid input syntax for type interval: {:?}",
+                            value
+                        )))
+                    }
+                }
+            };
+
+        let mut parts = value.split_whitespace();
+        let mut result_days: i64 = 0;
+        let mut result_millis: i64 = 0;
+
+        loop {

Review comment:
       Awesome, thank you! I will try.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] ovr commented on a change in pull request #9373: ARROW-10816: [Rust][DF] Initial support for Interval expressions

Posted by GitBox <gi...@apache.org>.
ovr commented on a change in pull request #9373:
URL: https://github.com/apache/arrow/pull/9373#discussion_r569794350



##########
File path: rust/datafusion/src/sql/planner.rs
##########
@@ -894,6 +908,131 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> {
             ))),
         }
     }
+
+    fn sql_interval_to_literal(
+        &self,
+        value: &String,
+        leading_field: &Option<DateTimeField>,
+        leading_precision: &Option<u64>,
+        last_field: &Option<DateTimeField>,
+        fractional_seconds_precision: &Option<u64>,
+    ) -> Result<Expr> {
+        if leading_field.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with leading_field {:?}",
+                leading_field
+            )));
+        }
+
+        if leading_precision.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with leading_precision {:?}",
+                leading_precision
+            )));
+        }
+
+        if last_field.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with last_field {:?}",
+                last_field
+            )));
+        }
+
+        if fractional_seconds_precision.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with fractional_seconds_precision {:?}",
+                fractional_seconds_precision
+            )));
+        }
+
+        // Should we resort parts on overflow?
+        let resort_parts = |days: i32, seconds: f32| -> (i32, f32) { (days, seconds) };
+
+        let calculate_from_part =
+            |interval_period_str: &str, interval_type: &str| -> Result<(i32, f32)> {
+                let interval_period = match interval_period_str.parse::<f32>() {
+                    Ok(n) => n,
+                    Err(_) => {
+                        return Err(DataFusionError::SQL(ParserError(format!(
+                            "Unsupported Interval Expression with value {:?}",
+                            value
+                        ))))
+                    }
+                };
+
+                if interval_period > (i32::MAX as f32) {
+                    return Err(DataFusionError::NotImplemented(format!(
+                        "Interval field value out of range: {:?}",
+                        value
+                    )));
+                }
+
+                const SECONDS_PER_HOUR: f32 = 3_600_f32;
+                const MILLIS_PER_SECOND: f32 = 1_000_f32;
+
+                match interval_type.to_lowercase().as_str() {
+                    "year" => Ok(((interval_period * 365_f32) as i32, 0.0)),
+                    "month" => Ok(((interval_period * 30_f32) as i32, 0.0)),
+                    "day" | "days" => Ok((interval_period as i32, 0.0)),
+                    "hour" | "hours" => Ok(resort_parts(
+                        0,
+                        interval_period * SECONDS_PER_HOUR * MILLIS_PER_SECOND,
+                    )),
+                    "minutes" | "minute" => Ok(resort_parts(
+                        0,
+                        interval_period * 60_f32 * MILLIS_PER_SECOND,
+                    )),
+                    "seconds" | "second" => {
+                        Ok(resort_parts(0, interval_period * MILLIS_PER_SECOND))
+                    }
+                    "milliseconds" | "millisecond" => Ok((0, interval_period)),
+                    _ => {
+                        return Err(DataFusionError::NotImplemented(format!(
+                            "Invalid input syntax for type interval: {:?}",
+                            value
+                        )))
+                    }
+                }
+            };
+
+        let mut parts = value.split_whitespace();
+        let mut result_days: i64 = 0;
+        let mut result_millis: i64 = 0;
+
+        loop {

Review comment:
       So, there is a restriction to use it:
   
   ```
   1084 |         let mut parts = value.split_whitespace();
        |             --------- move occurs because `parts` has type `SplitWhitespace<'_>`, which does not implement the `Copy` trait
   1085 | 
   1086 |         for interval_period_str in parts {
        |                                    -----
        |                                    |
        |                                    `parts` moved due to this implicit call to `.into_iter()`
        |                                    help: consider borrowing to avoid moving into the for loop: `&parts`
   ...
   1089 |                 parts.next().unwrap_or("second"),
        |                 ^^^^^ value borrowed here after move
   ```
   
   ![image](https://user-images.githubusercontent.com/572096/106817177-90bf3400-6687-11eb-897c-e180ca156afc.png)
   
   It's needed to support
   
   ```
   SELECT (INTERVAL '1')
   SELECT (ITNERVAL '1 day 1')
   ```
   
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] ovr edited a comment on pull request #9373: ARROW-10816: [Rust][DF] Initial support for Interval expressions

Posted by GitBox <gi...@apache.org>.
ovr edited a comment on pull request #9373:
URL: https://github.com/apache/arrow/pull/9373#issuecomment-773220218


   I think this PR is now ready for review/merge.
   
   In the next PRs I am planing to work on:
   
   - BinaryExpressions for Dates and Intervals
   - ISO8601 intervals
   - Introduce `ArrowIntervalType` trait
   
   Thanks


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] ovr commented on a change in pull request #9373: ARROW-10816: [Rust][DataFusion] Initial support for Interval expressions

Posted by GitBox <gi...@apache.org>.
ovr commented on a change in pull request #9373:
URL: https://github.com/apache/arrow/pull/9373#discussion_r570533219



##########
File path: rust/arrow/src/util/display.rs
##########
@@ -44,6 +44,66 @@ macro_rules! make_string {
     }};
 }
 
+macro_rules! make_string_interval_year_month {
+    ($column: ident, $row: ident) => {{
+        let array = $column
+            .as_any()
+            .downcast_ref::<array::IntervalYearMonthArray>()
+            .unwrap();
+
+        let s = if array.is_null($row) {
+            "0 years 0 mons 0 days 0 hours 0 mins 0.00 secs".to_string()

Review comment:
       👍 
   
   Yes, it's better to represent NULL as string instead of it. Thanks




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] alamb edited a comment on pull request #9373: ARROW-10816: [Rust][DataFusion] Initial support for Interval expressions

Posted by GitBox <gi...@apache.org>.
alamb edited a comment on pull request #9373:
URL: https://github.com/apache/arrow/pull/9373#issuecomment-774656720


   Thanks again for sticking with this @ovr  -- nice work and thank you


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] ovr commented on pull request #9373: ARROW-10816: [Rust][DF] Initial support for Interval expressions

Posted by GitBox <gi...@apache.org>.
ovr commented on pull request #9373:
URL: https://github.com/apache/arrow/pull/9373#issuecomment-770444221


   @alamb @andygrove @Dandandan @jorgecarleitao It's initial support for intervals, I've got some questions:
   
   - Is it correct how I store it?
   - What about rang overflows, should it be implemented?
   
   P.S I next PR i will add binary operations for intervals and support for DateTime + Intervals.
   
   Thanks


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] ovr edited a comment on pull request #9373: ARROW-10816: [Rust][DF] Initial support for Interval expressions

Posted by GitBox <gi...@apache.org>.
ovr edited a comment on pull request #9373:
URL: https://github.com/apache/arrow/pull/9373#issuecomment-770444221


   @alamb @andygrove @Dandandan @jorgecarleitao It's initial support for intervals, I've got some questions:
   
   > EDITED
   
   - Is it correct how I store it? (I found in C++ source, starting to work on correct implementation)
   - What about rang overflows, should it be implemented?
   
   P.S I next PR i will add binary operations for intervals and support for DateTime + Intervals.
   
   Thanks


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] codecov-io commented on pull request #9373: ARROW-10816: [Rust][DF] Initial support for Interval expressions

Posted by GitBox <gi...@apache.org>.
codecov-io commented on pull request #9373:
URL: https://github.com/apache/arrow/pull/9373#issuecomment-770264240


   # [Codecov](https://codecov.io/gh/apache/arrow/pull/9373?src=pr&el=h1) Report
   > Merging [#9373](https://codecov.io/gh/apache/arrow/pull/9373?src=pr&el=desc) (1779819) into [master](https://codecov.io/gh/apache/arrow/commit/f05b49bb08c0a4cc0cbfcfb07103dcf374c7fd38?el=desc) (f05b49b) will **decrease** coverage by `0.04%`.
   > The diff coverage is `56.04%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/arrow/pull/9373/graphs/tree.svg?width=650&height=150&src=pr&token=LpTCFbqVT1)](https://codecov.io/gh/apache/arrow/pull/9373?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #9373      +/-   ##
   ==========================================
   - Coverage   81.94%   81.90%   -0.05%     
   ==========================================
     Files         231      231              
     Lines       53374    53465      +91     
   ==========================================
   + Hits        43739    43789      +50     
   - Misses       9635     9676      +41     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/arrow/pull/9373?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [...tafusion/src/physical\_plan/expressions/coercion.rs](https://codecov.io/gh/apache/arrow/pull/9373/diff?src=pr&el=tree#diff-cnVzdC9kYXRhZnVzaW9uL3NyYy9waHlzaWNhbF9wbGFuL2V4cHJlc3Npb25zL2NvZXJjaW9uLnJz) | `76.92% <0.00%> (-1.73%)` | :arrow_down: |
   | [rust/datafusion/src/sql/planner.rs](https://codecov.io/gh/apache/arrow/pull/9373/diff?src=pr&el=tree#diff-cnVzdC9kYXRhZnVzaW9uL3NyYy9zcWwvcGxhbm5lci5ycw==) | `81.63% <45.61%> (-2.53%)` | :arrow_down: |
   | [rust/datafusion/src/scalar.rs](https://codecov.io/gh/apache/arrow/pull/9373/diff?src=pr&el=tree#diff-cnVzdC9kYXRhZnVzaW9uL3NyYy9zY2FsYXIucnM=) | `56.28% <63.15%> (+0.43%)` | :arrow_up: |
   | [rust/arrow/src/util/display.rs](https://codecov.io/gh/apache/arrow/pull/9373/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvdXRpbC9kaXNwbGF5LnJz) | `77.14% <100.00%> (+1.38%)` | :arrow_up: |
   | [...datafusion/src/physical\_plan/expressions/binary.rs](https://codecov.io/gh/apache/arrow/pull/9373/diff?src=pr&el=tree#diff-cnVzdC9kYXRhZnVzaW9uL3NyYy9waHlzaWNhbF9wbGFuL2V4cHJlc3Npb25zL2JpbmFyeS5ycw==) | `85.26% <100.00%> (+0.04%)` | :arrow_up: |
   | [rust/datafusion/tests/sql.rs](https://codecov.io/gh/apache/arrow/pull/9373/diff?src=pr&el=tree#diff-cnVzdC9kYXRhZnVzaW9uL3Rlc3RzL3NxbC5ycw==) | `99.84% <100.00%> (+<0.01%)` | :arrow_up: |
   | [rust/parquet/src/encodings/encoding.rs](https://codecov.io/gh/apache/arrow/pull/9373/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9lbmNvZGluZ3MvZW5jb2RpbmcucnM=) | `94.86% <0.00%> (-0.20%)` | :arrow_down: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/arrow/pull/9373?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/arrow/pull/9373?src=pr&el=footer). Last update [f05b49b...1779819](https://codecov.io/gh/apache/arrow/pull/9373?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] alamb commented on a change in pull request #9373: ARROW-10816: [Rust][DF] Initial support for Interval expressions

Posted by GitBox <gi...@apache.org>.
alamb commented on a change in pull request #9373:
URL: https://github.com/apache/arrow/pull/9373#discussion_r568152557



##########
File path: rust/arrow/src/util/display.rs
##########
@@ -44,6 +44,66 @@ macro_rules! make_string {
     }};
 }
 
+macro_rules! make_string_interval_year_month {
+    ($column: ident, $row: ident) => {{
+        let array = $column
+            .as_any()
+            .downcast_ref::<array::IntervalYearMonthArray>()
+            .unwrap();
+
+        let s = if array.is_null($row) {
+            "0 years 0 mons 0 days 0 hours 0 mins 0.00 secs".to_string()
+        } else {
+            let interval = array.value($row) as f64;
+            let years = (interval / 12_f64).floor();
+            let month = interval - (years * 12_f64);
+
+            format!(
+                "{} years {} mons 0 days 0 hours 0 mins 0.00 secs",
+                years, month,
+            )
+        };
+
+        Ok(s)
+    }};
+}
+
+macro_rules! make_string_interval_day_time {
+    ($column: ident, $row: ident) => {{
+        let array = $column
+            .as_any()
+            .downcast_ref::<array::IntervalDayTimeArray>()
+            .unwrap();
+
+        let s = if array.is_null($row) {
+            "0 years 0 mons 0 days 0 hours 0 mins 0.00 secs".to_string()
+        } else {
+            let value: u64 = array.value($row) as u64;
+
+            let days_parts: i32 = ((value & 0xFFFFFFFF00000000) >> 32) as i32;
+            let seconds_part: i32 = (value & 0xFFFFFFFF) as i32;

Review comment:
       given the comments, this might be more clearly named `milliseconds_part` as I think that is what the low 32 bits represent. 

##########
File path: rust/datafusion/tests/sql.rs
##########
@@ -1941,6 +1941,59 @@ async fn boolean_expressions() -> Result<()> {
     Ok(())
 }
 
+#[tokio::test]
+async fn interval_expressions() -> Result<()> {
+    let mut ctx = ExecutionContext::new();
+    let sql = "SELECT
+        (interval '1') as interval_1,
+        (interval '1 second') as interval_2,
+        (interval '500 milliseconds') as interval_3,
+        (interval '5 second') as interval_4,
+        (interval '1 minute') as interval_5,
+        (interval '0.5 minute') as interval_6,
+        (interval '.5 minute') as interval_7,
+        (interval '5 minute') as interval_8,
+        (interval '5 minute 1 second') as interval_9,
+        (interval '1 hour') as interval_10,
+        (interval '5 hour') as interval_11,
+        (interval '1 day') as interval_12,
+        (interval '5 day') as interval_13,
+        (interval '5 day 4 hours 3 minutes 2 seconds 100 milliseconds') as interval_14,
+        (interval '1 month') as interval_15,
+        (interval '5 month') as interval_16,
+        (interval '13 month') as interval_17,
+        (interval '1 year') as interval_18,
+        (interval '2 year') as interval_19,
+        (interval '1 day 1') as interval_20

Review comment:
       might be worth testing with a fractional interval as well `interval '2.3' years` as I think your implementation parses the numbers as a `f32`

##########
File path: rust/datafusion/src/sql/planner.rs
##########
@@ -894,6 +908,131 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> {
             ))),
         }
     }
+
+    fn sql_interval_to_literal(
+        &self,
+        value: &String,
+        leading_field: &Option<DateTimeField>,
+        leading_precision: &Option<u64>,
+        last_field: &Option<DateTimeField>,
+        fractional_seconds_precision: &Option<u64>,
+    ) -> Result<Expr> {
+        if leading_field.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with leading_field {:?}",
+                leading_field
+            )));
+        }
+
+        if leading_precision.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with leading_precision {:?}",
+                leading_precision
+            )));
+        }
+
+        if last_field.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with last_field {:?}",
+                last_field
+            )));
+        }
+
+        if fractional_seconds_precision.is_some() {
+            return Err(DataFusionError::NotImplemented(format!(
+                "Unsupported Interval Expression with fractional_seconds_precision {:?}",
+                fractional_seconds_precision
+            )));
+        }
+
+        // Should we resort parts on overflow?
+        let resort_parts = |days: i32, seconds: f32| -> (i32, f32) { (days, seconds) };
+
+        let calculate_from_part =
+            |interval_period_str: &str, interval_type: &str| -> Result<(i32, f32)> {
+                let interval_period = match interval_period_str.parse::<f32>() {
+                    Ok(n) => n,
+                    Err(_) => {
+                        return Err(DataFusionError::SQL(ParserError(format!(
+                            "Unsupported Interval Expression with value {:?}",
+                            value
+                        ))))
+                    }
+                };
+
+                if interval_period > (i32::MAX as f32) {
+                    return Err(DataFusionError::NotImplemented(format!(
+                        "Interval field value out of range: {:?}",
+                        value
+                    )));
+                }
+
+                const SECONDS_PER_HOUR: f32 = 3_600_f32;
+                const MILLIS_PER_SECOND: f32 = 1_000_f32;
+
+                match interval_type.to_lowercase().as_str() {
+                    "year" => Ok(((interval_period * 365_f32) as i32, 0.0)),
+                    "month" => Ok(((interval_period * 30_f32) as i32, 0.0)),
+                    "day" | "days" => Ok((interval_period as i32, 0.0)),
+                    "hour" | "hours" => Ok(resort_parts(
+                        0,
+                        interval_period * SECONDS_PER_HOUR * MILLIS_PER_SECOND,
+                    )),
+                    "minutes" | "minute" => Ok(resort_parts(
+                        0,
+                        interval_period * 60_f32 * MILLIS_PER_SECOND,
+                    )),
+                    "seconds" | "second" => {
+                        Ok(resort_parts(0, interval_period * MILLIS_PER_SECOND))
+                    }
+                    "milliseconds" | "millisecond" => Ok((0, interval_period)),
+                    _ => {
+                        return Err(DataFusionError::NotImplemented(format!(
+                            "Invalid input syntax for type interval: {:?}",
+                            value
+                        )))
+                    }
+                }
+            };
+
+        let mut parts = value.split_whitespace();
+        let mut result_days: i64 = 0;
+        let mut result_millis: i64 = 0;
+
+        loop {

Review comment:
       If you wanted to make this more "idomatic" rust you could write something like:
   
   ```
   for interval_period_str in parts {
   ...
   }
   ```
   
   And if you really wanted to get your functional programming groove on you could do something like this with `fold` (untested):
   
   ```
   let (result_days, result_millis) = value.split_whitespace()
     .fold((0,0), |(result_days, result_millis),interval_period_string| {
       // calculate the diff
       (result_days + diff_days, result_millis + diff_millis)
   });
   ```
   
   I don't think there is any real performance difference but I figured I would point it out if you wanted to try

##########
File path: rust/datafusion/src/sql/planner.rs
##########
@@ -894,6 +908,131 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> {
             ))),
         }
     }
+
+    fn sql_interval_to_literal(

Review comment:
       👍  is really nice




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org