You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by "junjunjd (via GitHub)" <gi...@apache.org> on 2023/10/22 08:39:24 UTC

[PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

junjunjd opened a new pull request, #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901

   ## Which issue does this PR close?
   
   It removes the majority of panics in datafusion-common::scalar https://github.com/apache/arrow-datafusion/issues/3313.
   
   ## Rationale for this change
   
   Important move towards closing #3313
   
   ## What changes are included in this PR?
   
   Replace most of the panics in datafusion-common::scalar by `internal_err` or `not_impl_err`
   
   ## Are these changes tested?
   
   Yes
   
   ## Are there any user-facing changes?
   
   No


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1379704451


##########
datafusion/common/src/scalar.rs:
##########
@@ -1562,7 +1597,11 @@ impl ScalarValue {
                     DataType::UInt16 => dict_from_values::<UInt16Type>(values)?,
                     DataType::UInt32 => dict_from_values::<UInt32Type>(values)?,
                     DataType::UInt64 => dict_from_values::<UInt64Type>(values)?,
-                    _ => unreachable!("Invalid dictionary keys type: {:?}", key_type),

Review Comment:
   I believe this error is reachable if the user passes dictionaries with non-integer keys to the function. It is recoverable so I replaced unreachable! with an internal error.
   This does not require any code change in client side, since `iter_to_array` already returns a Result.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1378446881


##########
datafusion/common/src/scalar.rs:
##########
@@ -1309,27 +1338,33 @@ impl ScalarValue {
 
         macro_rules! build_array_list_primitive {
             ($ARRAY_TY:ident, $SCALAR_TY:ident, $NATIVE_TYPE:ident) => {{
-                Arc::new(ListArray::from_iter_primitive::<$ARRAY_TY, _, _>(
-                    scalars.into_iter().map(|x| match x {
-                        ScalarValue::List(arr) => {
-                            if arr.as_any().downcast_ref::<NullArray>().is_some() {
-                                None
-                            } else {
-                                let list_arr = as_list_array(&arr);
-                                let primitive_arr =
-                                    list_arr.values().as_primitive::<$ARRAY_TY>();
-                                Some(
-                                    primitive_arr.into_iter().collect::<Vec<Option<_>>>(),
-                                )
+                Ok::<ArrayRef, DataFusionError>(Arc::new(ListArray::from_iter_primitive::<$ARRAY_TY, _, _>(
+                    scalars
+                        .into_iter()
+                        .map(|x| match x {
+                            ScalarValue::List(arr) => {
+                                if arr.as_any().downcast_ref::<NullArray>().is_some() {
+                                    Ok(None)
+                                } else {
+                                    let list_arr = as_list_array(&arr);
+                                    let primitive_arr =
+                                        list_arr.values().as_primitive::<$ARRAY_TY>();
+                                    Ok(Some(
+                                        primitive_arr
+                                            .into_iter()
+                                            .collect::<Vec<Option<_>>>(),
+                                    ))
+                                }
                             }
-                        }
-                        sv => panic!(
-                            "Inconsistent types in ScalarValue::iter_to_array. \
+                            sv => _internal_err!(
+                                "Inconsistent types in ScalarValue::iter_to_array. \

Review Comment:
   It appears that unreachable errors are handled inconsistently in `iter_to_array`. This error should be [unreachable](https://github.com/apache/arrow-datafusion/blob/main/datafusion/common/src/scalar.rs#L1277). However, all the other `build_*` macros defined in `iter_to_array` return an internal error (for example [`build_array_list_string`](https://github.com/apache/arrow-datafusion/blob/main/datafusion/common/src/scalar.rs#L1253), [`build_array_primitive`](https://github.com/apache/arrow-datafusion/blob/main/datafusion/common/src/scalar.rs#L1145). I removed `panic` here to align with other `build_*` macros.
   This does not require any code change in client side, since `iter_to_array` already returns a Result.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1381197592


##########
datafusion/common/src/scalar.rs:
##########
@@ -1834,30 +1883,31 @@ impl ScalarValue {
                 }
 
                 let arr = Decimal128Array::from(vals)
-                    .with_precision_and_scale(*precision, *scale)
-                    .unwrap();

Review Comment:
   Similar to previous [comment](https://github.com/apache/arrow-datafusion/pull/7901/files#r1381174929), this error is reachable because [ScalarValue::try_new_decimal128](https://github.com/apache/arrow-datafusion/blob/main/datafusion/common/src/scalar.rs#L660-L661) allows a decimal with precision 0, while [arrow-array](https://github.com/apache/arrow-rs/blob/master/arrow-array/src/types.rs#L1234-L1235) does not support that.
   We can update `try_new_decimal128` to disallow decimal with precision 0 or establish an invariant here so this error becomes unreachable and we can panic here.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1381223223


##########
datafusion/common/src/scalar.rs:
##########
@@ -1947,14 +1997,14 @@ impl ScalarValue {
                         repeat(Some(value.as_slice())).take(size),
                         *s,
                     )
-                    .unwrap(),

Review Comment:
   I agree this should be unreachable. I will change this back to `unwrap`.



##########
datafusion/common/src/scalar.rs:
##########
@@ -1947,14 +1997,14 @@ impl ScalarValue {
                         repeat(Some(value.as_slice())).take(size),
                         *s,
                     )
-                    .unwrap(),
+                    .map_err(DataFusionError::ArrowError)?,
                 ),
                 None => Arc::new(
                     FixedSizeBinaryArray::try_from_sparse_iter_with_size(
                         repeat(None::<&[u8]>).take(size),
                         *s,
                     )
-                    .unwrap(),

Review Comment:
   Same as above.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1381247137


##########
datafusion/common/src/scalar.rs:
##########
@@ -2481,119 +2529,136 @@ impl ScalarValue {
                     v.as_ref(),
                     *precision,
                     *scale,
-                )
-                .unwrap()
+                )?
             }
             ScalarValue::Boolean(val) => {
-                eq_array_primitive!(array, index, BooleanArray, val)
+                eq_array_primitive!(array, index, BooleanArray, val)?
             }
             ScalarValue::Float32(val) => {
-                eq_array_primitive!(array, index, Float32Array, val)
+                eq_array_primitive!(array, index, Float32Array, val)?
             }
             ScalarValue::Float64(val) => {
-                eq_array_primitive!(array, index, Float64Array, val)
+                eq_array_primitive!(array, index, Float64Array, val)?
+            }
+            ScalarValue::Int8(val) => eq_array_primitive!(array, index, Int8Array, val)?,
+            ScalarValue::Int16(val) => {
+                eq_array_primitive!(array, index, Int16Array, val)?
+            }
+            ScalarValue::Int32(val) => {
+                eq_array_primitive!(array, index, Int32Array, val)?
+            }
+            ScalarValue::Int64(val) => {
+                eq_array_primitive!(array, index, Int64Array, val)?
+            }
+            ScalarValue::UInt8(val) => {
+                eq_array_primitive!(array, index, UInt8Array, val)?
             }
-            ScalarValue::Int8(val) => eq_array_primitive!(array, index, Int8Array, val),
-            ScalarValue::Int16(val) => eq_array_primitive!(array, index, Int16Array, val),
-            ScalarValue::Int32(val) => eq_array_primitive!(array, index, Int32Array, val),
-            ScalarValue::Int64(val) => eq_array_primitive!(array, index, Int64Array, val),
-            ScalarValue::UInt8(val) => eq_array_primitive!(array, index, UInt8Array, val),
             ScalarValue::UInt16(val) => {
-                eq_array_primitive!(array, index, UInt16Array, val)
+                eq_array_primitive!(array, index, UInt16Array, val)?
             }
             ScalarValue::UInt32(val) => {
-                eq_array_primitive!(array, index, UInt32Array, val)
+                eq_array_primitive!(array, index, UInt32Array, val)?
             }
             ScalarValue::UInt64(val) => {
-                eq_array_primitive!(array, index, UInt64Array, val)
+                eq_array_primitive!(array, index, UInt64Array, val)?
+            }
+            ScalarValue::Utf8(val) => {
+                eq_array_primitive!(array, index, StringArray, val)?
             }
-            ScalarValue::Utf8(val) => eq_array_primitive!(array, index, StringArray, val),
             ScalarValue::LargeUtf8(val) => {
-                eq_array_primitive!(array, index, LargeStringArray, val)
+                eq_array_primitive!(array, index, LargeStringArray, val)?
             }
             ScalarValue::Binary(val) => {
-                eq_array_primitive!(array, index, BinaryArray, val)
+                eq_array_primitive!(array, index, BinaryArray, val)?
             }
             ScalarValue::FixedSizeBinary(_, val) => {
-                eq_array_primitive!(array, index, FixedSizeBinaryArray, val)
+                eq_array_primitive!(array, index, FixedSizeBinaryArray, val)?
             }
             ScalarValue::LargeBinary(val) => {
-                eq_array_primitive!(array, index, LargeBinaryArray, val)
+                eq_array_primitive!(array, index, LargeBinaryArray, val)?
             }
-            ScalarValue::Fixedsizelist(..) => unimplemented!(),
-            ScalarValue::List(_) => unimplemented!("ListArr"),

Review Comment:
   See previous [comment](https://github.com/apache/arrow-datafusion/pull/7901/files#r1381231326).



##########
datafusion/common/src/scalar.rs:
##########
@@ -2481,119 +2529,136 @@ impl ScalarValue {
                     v.as_ref(),
                     *precision,
                     *scale,
-                )
-                .unwrap()
+                )?
             }
             ScalarValue::Boolean(val) => {
-                eq_array_primitive!(array, index, BooleanArray, val)
+                eq_array_primitive!(array, index, BooleanArray, val)?
             }
             ScalarValue::Float32(val) => {
-                eq_array_primitive!(array, index, Float32Array, val)
+                eq_array_primitive!(array, index, Float32Array, val)?
             }
             ScalarValue::Float64(val) => {
-                eq_array_primitive!(array, index, Float64Array, val)
+                eq_array_primitive!(array, index, Float64Array, val)?
+            }
+            ScalarValue::Int8(val) => eq_array_primitive!(array, index, Int8Array, val)?,
+            ScalarValue::Int16(val) => {
+                eq_array_primitive!(array, index, Int16Array, val)?
+            }
+            ScalarValue::Int32(val) => {
+                eq_array_primitive!(array, index, Int32Array, val)?
+            }
+            ScalarValue::Int64(val) => {
+                eq_array_primitive!(array, index, Int64Array, val)?
+            }
+            ScalarValue::UInt8(val) => {
+                eq_array_primitive!(array, index, UInt8Array, val)?
             }
-            ScalarValue::Int8(val) => eq_array_primitive!(array, index, Int8Array, val),
-            ScalarValue::Int16(val) => eq_array_primitive!(array, index, Int16Array, val),
-            ScalarValue::Int32(val) => eq_array_primitive!(array, index, Int32Array, val),
-            ScalarValue::Int64(val) => eq_array_primitive!(array, index, Int64Array, val),
-            ScalarValue::UInt8(val) => eq_array_primitive!(array, index, UInt8Array, val),
             ScalarValue::UInt16(val) => {
-                eq_array_primitive!(array, index, UInt16Array, val)
+                eq_array_primitive!(array, index, UInt16Array, val)?
             }
             ScalarValue::UInt32(val) => {
-                eq_array_primitive!(array, index, UInt32Array, val)
+                eq_array_primitive!(array, index, UInt32Array, val)?
             }
             ScalarValue::UInt64(val) => {
-                eq_array_primitive!(array, index, UInt64Array, val)
+                eq_array_primitive!(array, index, UInt64Array, val)?
+            }
+            ScalarValue::Utf8(val) => {
+                eq_array_primitive!(array, index, StringArray, val)?
             }
-            ScalarValue::Utf8(val) => eq_array_primitive!(array, index, StringArray, val),
             ScalarValue::LargeUtf8(val) => {
-                eq_array_primitive!(array, index, LargeStringArray, val)
+                eq_array_primitive!(array, index, LargeStringArray, val)?
             }
             ScalarValue::Binary(val) => {
-                eq_array_primitive!(array, index, BinaryArray, val)
+                eq_array_primitive!(array, index, BinaryArray, val)?
             }
             ScalarValue::FixedSizeBinary(_, val) => {
-                eq_array_primitive!(array, index, FixedSizeBinaryArray, val)
+                eq_array_primitive!(array, index, FixedSizeBinaryArray, val)?
             }
             ScalarValue::LargeBinary(val) => {
-                eq_array_primitive!(array, index, LargeBinaryArray, val)
+                eq_array_primitive!(array, index, LargeBinaryArray, val)?
             }
-            ScalarValue::Fixedsizelist(..) => unimplemented!(),
-            ScalarValue::List(_) => unimplemented!("ListArr"),
+            ScalarValue::Fixedsizelist(..) => {
+                return _not_impl_err!("FixedSizeList is not supported yet")
+            }
+            ScalarValue::List(_) => return _not_impl_err!("List is not supported yet"),
             ScalarValue::Date32(val) => {
-                eq_array_primitive!(array, index, Date32Array, val)
+                eq_array_primitive!(array, index, Date32Array, val)?
             }
             ScalarValue::Date64(val) => {
-                eq_array_primitive!(array, index, Date64Array, val)
+                eq_array_primitive!(array, index, Date64Array, val)?
             }
             ScalarValue::Time32Second(val) => {
-                eq_array_primitive!(array, index, Time32SecondArray, val)
+                eq_array_primitive!(array, index, Time32SecondArray, val)?
             }
             ScalarValue::Time32Millisecond(val) => {
-                eq_array_primitive!(array, index, Time32MillisecondArray, val)
+                eq_array_primitive!(array, index, Time32MillisecondArray, val)?
             }
             ScalarValue::Time64Microsecond(val) => {
-                eq_array_primitive!(array, index, Time64MicrosecondArray, val)
+                eq_array_primitive!(array, index, Time64MicrosecondArray, val)?
             }
             ScalarValue::Time64Nanosecond(val) => {
-                eq_array_primitive!(array, index, Time64NanosecondArray, val)
+                eq_array_primitive!(array, index, Time64NanosecondArray, val)?
             }
             ScalarValue::TimestampSecond(val, _) => {
-                eq_array_primitive!(array, index, TimestampSecondArray, val)
+                eq_array_primitive!(array, index, TimestampSecondArray, val)?
             }
             ScalarValue::TimestampMillisecond(val, _) => {
-                eq_array_primitive!(array, index, TimestampMillisecondArray, val)
+                eq_array_primitive!(array, index, TimestampMillisecondArray, val)?
             }
             ScalarValue::TimestampMicrosecond(val, _) => {
-                eq_array_primitive!(array, index, TimestampMicrosecondArray, val)
+                eq_array_primitive!(array, index, TimestampMicrosecondArray, val)?
             }
             ScalarValue::TimestampNanosecond(val, _) => {
-                eq_array_primitive!(array, index, TimestampNanosecondArray, val)
+                eq_array_primitive!(array, index, TimestampNanosecondArray, val)?
             }
             ScalarValue::IntervalYearMonth(val) => {
-                eq_array_primitive!(array, index, IntervalYearMonthArray, val)
+                eq_array_primitive!(array, index, IntervalYearMonthArray, val)?
             }
             ScalarValue::IntervalDayTime(val) => {
-                eq_array_primitive!(array, index, IntervalDayTimeArray, val)
+                eq_array_primitive!(array, index, IntervalDayTimeArray, val)?
             }
             ScalarValue::IntervalMonthDayNano(val) => {
-                eq_array_primitive!(array, index, IntervalMonthDayNanoArray, val)
+                eq_array_primitive!(array, index, IntervalMonthDayNanoArray, val)?
             }
             ScalarValue::DurationSecond(val) => {
-                eq_array_primitive!(array, index, DurationSecondArray, val)
+                eq_array_primitive!(array, index, DurationSecondArray, val)?
             }
             ScalarValue::DurationMillisecond(val) => {
-                eq_array_primitive!(array, index, DurationMillisecondArray, val)
+                eq_array_primitive!(array, index, DurationMillisecondArray, val)?
             }
             ScalarValue::DurationMicrosecond(val) => {
-                eq_array_primitive!(array, index, DurationMicrosecondArray, val)
+                eq_array_primitive!(array, index, DurationMicrosecondArray, val)?
             }
             ScalarValue::DurationNanosecond(val) => {
-                eq_array_primitive!(array, index, DurationNanosecondArray, val)
+                eq_array_primitive!(array, index, DurationNanosecondArray, val)?
+            }
+            ScalarValue::Struct(_, _) => {
+                return _not_impl_err!("Struct is not supported yet")
             }
-            ScalarValue::Struct(_, _) => unimplemented!(),

Review Comment:
   See previous [comment](https://github.com/apache/arrow-datafusion/pull/7901/files#r1381231326).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar by making more operations return `Result` [arrow-datafusion]

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#issuecomment-1804284247

   Marking as draft as it needs conflicts resolved prior to being mergable. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "Weijun-H (via GitHub)" <gi...@apache.org>.
Weijun-H commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1372788072


##########
datafusion/common/src/scalar.rs:
##########
@@ -3468,22 +3541,30 @@ mod tests {
         }
 
         // decimal scalar to array
-        let array = decimal_value.to_array();
+        let array = decimal_value
+            .to_array()
+            .expect("Failed to convert to array");

Review Comment:
   ```suggestion
           let array = decimal_value
               .to_array()?;
   ```



##########
datafusion/common/src/scalar.rs:
##########
@@ -3419,8 +3492,8 @@ mod tests {
     {
         let scalar_result = left.add_checked(&right);
 
-        let left_array = left.to_array();
-        let right_array = right.to_array();
+        let left_array = left.to_array().expect("Failed to convert to array");
+        let right_array = right.to_array().expect("Failed to convert to array");

Review Comment:
   ```suggestion
           let right_array = right.to_array().expect("Failed to convert to array in add overflow check");
   ```



##########
datafusion/common/src/scalar.rs:
##########
@@ -3468,22 +3541,30 @@ mod tests {
         }
 
         // decimal scalar to array
-        let array = decimal_value.to_array();
+        let array = decimal_value
+            .to_array()
+            .expect("Failed to convert to array");
         let array = as_decimal128_array(&array)?;
         assert_eq!(1, array.len());
         assert_eq!(DataType::Decimal128(10, 1), array.data_type().clone());
         assert_eq!(123i128, array.value(0));
 
         // decimal scalar to array with size
-        let array = decimal_value.to_array_of_size(10);
+        let array = decimal_value
+            .to_array_of_size(10)
+            .expect("Failed to convert to array of size");
         let array_decimal = as_decimal128_array(&array)?;
         assert_eq!(10, array.len());
         assert_eq!(DataType::Decimal128(10, 1), array.data_type().clone());
         assert_eq!(123i128, array_decimal.value(0));
         assert_eq!(123i128, array_decimal.value(9));
         // test eq array
-        assert!(decimal_value.eq_array(&array, 1));
-        assert!(decimal_value.eq_array(&array, 5));
+        assert!(decimal_value
+            .eq_array(&array, 1)
+            .expect("Failed to compare arrays"));
+        assert!(decimal_value
+            .eq_array(&array, 5)
+            .expect("Failed to compare arrays"));

Review Comment:
   ```suggestion
           assert!(decimal_value
               .eq_array(&array, 1)?);
           assert!(decimal_value
               .eq_array(&array, 5)?);
   ```



##########
datafusion/common/src/scalar.rs:
##########
@@ -3419,8 +3492,8 @@ mod tests {
     {
         let scalar_result = left.add_checked(&right);
 
-        let left_array = left.to_array();
-        let right_array = right.to_array();
+        let left_array = left.to_array().expect("Failed to convert to array");

Review Comment:
   ```suggestion
           let left_array = left.to_array().expect("Failed to convert to array in add overflow check");
   ```



##########
datafusion/common/src/scalar.rs:
##########
@@ -3468,22 +3541,30 @@ mod tests {
         }
 
         // decimal scalar to array
-        let array = decimal_value.to_array();
+        let array = decimal_value
+            .to_array()
+            .expect("Failed to convert to array");
         let array = as_decimal128_array(&array)?;
         assert_eq!(1, array.len());
         assert_eq!(DataType::Decimal128(10, 1), array.data_type().clone());
         assert_eq!(123i128, array.value(0));
 
         // decimal scalar to array with size
-        let array = decimal_value.to_array_of_size(10);
+        let array = decimal_value
+            .to_array_of_size(10)
+            .expect("Failed to convert to array of size");

Review Comment:
   ```suggestion
           let array = decimal_value
               .to_array_of_size(10)?;
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1378463124


##########
datafusion/common/src/scalar.rs:
##########
@@ -1646,10 +1691,14 @@ impl ScalarValue {
         let array = scalars
             .into_iter()
             .map(|element: ScalarValue| match element {
-                ScalarValue::Decimal256(v1, _, _) => v1,
-                _ => unreachable!(),
+                ScalarValue::Decimal256(v1, _, _) => Ok(v1),
+                s => {
+                    _internal_err!(
+                        "Expected ScalarValue::Decimal256 element. Received {s:?}"
+                    )
+                }

Review Comment:
   Same as above.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1378446881


##########
datafusion/common/src/scalar.rs:
##########
@@ -1309,27 +1338,33 @@ impl ScalarValue {
 
         macro_rules! build_array_list_primitive {
             ($ARRAY_TY:ident, $SCALAR_TY:ident, $NATIVE_TYPE:ident) => {{
-                Arc::new(ListArray::from_iter_primitive::<$ARRAY_TY, _, _>(
-                    scalars.into_iter().map(|x| match x {
-                        ScalarValue::List(arr) => {
-                            if arr.as_any().downcast_ref::<NullArray>().is_some() {
-                                None
-                            } else {
-                                let list_arr = as_list_array(&arr);
-                                let primitive_arr =
-                                    list_arr.values().as_primitive::<$ARRAY_TY>();
-                                Some(
-                                    primitive_arr.into_iter().collect::<Vec<Option<_>>>(),
-                                )
+                Ok::<ArrayRef, DataFusionError>(Arc::new(ListArray::from_iter_primitive::<$ARRAY_TY, _, _>(
+                    scalars
+                        .into_iter()
+                        .map(|x| match x {
+                            ScalarValue::List(arr) => {
+                                if arr.as_any().downcast_ref::<NullArray>().is_some() {
+                                    Ok(None)
+                                } else {
+                                    let list_arr = as_list_array(&arr);
+                                    let primitive_arr =
+                                        list_arr.values().as_primitive::<$ARRAY_TY>();
+                                    Ok(Some(
+                                        primitive_arr
+                                            .into_iter()
+                                            .collect::<Vec<Option<_>>>(),
+                                    ))
+                                }
                             }
-                        }
-                        sv => panic!(
-                            "Inconsistent types in ScalarValue::iter_to_array. \
+                            sv => _internal_err!(
+                                "Inconsistent types in ScalarValue::iter_to_array. \

Review Comment:
   This error is reachable when the `ScalarValue`s in the iterator [are not all the same type](https://github.com/apache/arrow-datafusion/blob/main/datafusion/common/src/scalar.rs#L1093-L1094). All the other `build_*` macros defined in `iter_to_array` return an internal error (for example [`build_array_list_string`](https://github.com/apache/arrow-datafusion/blob/main/datafusion/common/src/scalar.rs#L1253), [`build_array_primitive`](https://github.com/apache/arrow-datafusion/blob/main/datafusion/common/src/scalar.rs#L1145). Using panic here is inconsistent with the rest of the function.
   This does not require any code change in client side, since `iter_to_array` already returns a `Result`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar by making more operations return `Result` [arrow-datafusion]

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb merged PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1379688238


##########
datafusion/common/src/scalar.rs:
##########
@@ -676,13 +696,13 @@ macro_rules! build_values_list {
                     ScalarValue::$SCALAR_TY(None) => {
                         builder.values().append_null();
                     }
-                    _ => panic!("Incompatible ScalarValue for list"),

Review Comment:
   I agree this error can be unreachable if invariant is established that the `data_type` and `values` arguments passed to [`new_list`](https://github.com/apache/arrow-datafusion/blob/main/datafusion/common/src/scalar.rs#L1664) align with each other. 
   I will change this back to `panic` and thus making `new_list` panic instead of returning `Result`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1378373755


##########
datafusion/common/src/scalar.rs:
##########
@@ -330,9 +330,9 @@ impl PartialOrd for ScalarValue {
                         let arr2 = list_arr2.value(i);
 
                         let lt_res =
-                            arrow::compute::kernels::cmp::lt(&arr1, &arr2).unwrap();
+                            arrow::compute::kernels::cmp::lt(&arr1, &arr2).ok()?;

Review Comment:
   This panic is reachable, for example if `arr1` and `arr2` have different data type, [`arrow::compute::kernels::cmp::lt`](https://github.com/apache/arrow-rs/blob/master/arrow-ord/src/cmp.rs#L201-L204) will panic.
   It makes sense to return `None` here instead of panicking and exiting since user just performs a partial order comparison. 
   This does not require any code change in client side.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1378387479


##########
datafusion/common/src/scalar.rs:
##########
@@ -579,24 +581,44 @@ fn dict_from_values<K: ArrowDictionaryKeyType>(
 
 macro_rules! typed_cast_tz {
     ($array:expr, $index:expr, $ARRAYTYPE:ident, $SCALAR:ident, $TZ:expr) => {{
-        let array = $array.as_any().downcast_ref::<$ARRAYTYPE>().unwrap();
-        ScalarValue::$SCALAR(
+        use std::any::type_name;
+        let array = $array
+            .as_any()
+            .downcast_ref::<$ARRAYTYPE>()
+            .ok_or_else(|| {
+                DataFusionError::Internal(format!(
+                    "could not cast value to {}",
+                    type_name::<$ARRAYTYPE>()
+                ))
+            })?;

Review Comment:
   `typed_cast_tz` macro is called by [`try_from_array`](https://github.com/apache/arrow-datafusion/blob/main/datafusion/common/src/scalar.rs#L2021-L2022). `try_from_array` is a public function.  Depending on what `array` value the user passes to the function, downcasting the array to a timestamp array can fail. 
   This error is reachable and recoverable. It makes sense to  return an internal error here. This aligns with how downcast error is handled in the [downcast_value](https://github.com/apache/arrow-datafusion/blob/main/datafusion/common/src/lib.rs#L73) macro.
   This does not require any code change in client side. `typed_cast_tz` is only called in `try_from_array`, which already returns a Result.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1381195593


##########
datafusion/common/src/scalar.rs:
##########
@@ -1562,7 +1597,11 @@ impl ScalarValue {
                     DataType::UInt16 => dict_from_values::<UInt16Type>(values)?,
                     DataType::UInt32 => dict_from_values::<UInt32Type>(values)?,
                     DataType::UInt64 => dict_from_values::<UInt64Type>(values)?,
-                    _ => unreachable!("Invalid dictionary keys type: {:?}", key_type),

Review Comment:
   I agree it makes more sense to keep `unreachable!` here to align with how `arrow-rs` deals with illegal dictionary key types.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1379681340


##########
datafusion/common/src/scalar.rs:
##########
@@ -579,24 +581,44 @@ fn dict_from_values<K: ArrowDictionaryKeyType>(
 
 macro_rules! typed_cast_tz {
     ($array:expr, $index:expr, $ARRAYTYPE:ident, $SCALAR:ident, $TZ:expr) => {{
-        let array = $array.as_any().downcast_ref::<$ARRAYTYPE>().unwrap();

Review Comment:
   `typed_cast_tz` macro is called by [`try_from_array`](https://github.com/apache/arrow-datafusion/blob/main/datafusion/common/src/scalar.rs#L2021-L2022). `try_from_array` is a public function. Depending on what array value the user passes to the function, downcasting the array to a timestamp array can fail.
   This error is reachable and recoverable. I think it makes sense to return an internal error here. This aligns with how downcast error is handled in the [downcast_value](https://github.com/apache/arrow-datafusion/blob/main/datafusion/common/src/lib.rs#L73) macro.
   This does not require any change in client code, since `typed_cast_tz` is only used in `try_from_array`, which is already a `Result`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1381217818


##########
datafusion/common/src/scalar.rs:
##########
@@ -1801,29 +1850,29 @@ impl ScalarValue {
     ///
     /// assert_eq!(result, &expected);
     /// ```
-    pub fn new_list(values: &[ScalarValue], data_type: &DataType) -> ArrayRef {
-        Arc::new(match data_type {
-            DataType::Boolean => build_values_list!(BooleanBuilder, Boolean, values, 1),
-            DataType::Int8 => build_values_list!(Int8Builder, Int8, values, 1),
-            DataType::Int16 => build_values_list!(Int16Builder, Int16, values, 1),
-            DataType::Int32 => build_values_list!(Int32Builder, Int32, values, 1),
-            DataType::Int64 => build_values_list!(Int64Builder, Int64, values, 1),
-            DataType::UInt8 => build_values_list!(UInt8Builder, UInt8, values, 1),
-            DataType::UInt16 => build_values_list!(UInt16Builder, UInt16, values, 1),
-            DataType::UInt32 => build_values_list!(UInt32Builder, UInt32, values, 1),
-            DataType::UInt64 => build_values_list!(UInt64Builder, UInt64, values, 1),
-            DataType::Utf8 => build_values_list!(StringBuilder, Utf8, values, 1),
+    pub fn new_list(values: &[ScalarValue], data_type: &DataType) -> Result<ArrayRef> {
+        Ok(Arc::new(match data_type {
+            DataType::Boolean => build_values_list!(BooleanBuilder, Boolean, values, 1)?,
+            DataType::Int8 => build_values_list!(Int8Builder, Int8, values, 1)?,
+            DataType::Int16 => build_values_list!(Int16Builder, Int16, values, 1)?,
+            DataType::Int32 => build_values_list!(Int32Builder, Int32, values, 1)?,
+            DataType::Int64 => build_values_list!(Int64Builder, Int64, values, 1)?,
+            DataType::UInt8 => build_values_list!(UInt8Builder, UInt8, values, 1)?,
+            DataType::UInt16 => build_values_list!(UInt16Builder, UInt16, values, 1)?,
+            DataType::UInt32 => build_values_list!(UInt32Builder, UInt32, values, 1)?,
+            DataType::UInt64 => build_values_list!(UInt64Builder, UInt64, values, 1)?,
+            DataType::Utf8 => build_values_list!(StringBuilder, Utf8, values, 1)?,
             DataType::LargeUtf8 => {
-                build_values_list!(LargeStringBuilder, LargeUtf8, values, 1)
+                build_values_list!(LargeStringBuilder, LargeUtf8, values, 1)?
             }
-            DataType::Float32 => build_values_list!(Float32Builder, Float32, values, 1),
-            DataType::Float64 => build_values_list!(Float64Builder, Float64, values, 1),
+            DataType::Float32 => build_values_list!(Float32Builder, Float32, values, 1)?,
+            DataType::Float64 => build_values_list!(Float64Builder, Float64, values, 1)?,
             DataType::Timestamp(unit, tz) => {
                 let values = Some(values);
                 build_timestamp_list!(unit.clone(), tz.clone(), values, 1)
             }
             DataType::List(_) | DataType::Struct(_) => {
-                ScalarValue::iter_to_array_list_without_nulls(values, data_type).unwrap()

Review Comment:
   Similar to previous [comment](https://github.com/apache/arrow-datafusion/pull/7901/files#r1381174929), this error is reachable if `values` are decimals with precision 0. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#issuecomment-1780426607

   > Epic work @junjunjd thanks I'm thinking if we can get rid of `.expect` ? Or at least provide more details in expect message?
   
   I removed the `.expect` in [`get_min_max_values`](https://github.com/apache/arrow-datafusion/blob/4b6180768f6ce38ab95ba1ddcc817f5bcfbd1ed7/datafusion/core/src/datasource/physical_plan/parquet/row_groups.rs#L204) and [`get_null_count_values`](https://github.com/apache/arrow-datafusion/blob/4b6180768f6ce38ab95ba1ddcc817f5bcfbd1ed7/datafusion/core/src/datasource/physical_plan/parquet/row_groups.rs#L224) macros. 
   The rest of the `.expect` calls exist in tests or examples. It makes sense to use `.expect` and panic there.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar by making more operations return `Result` [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#issuecomment-1806743151

   @alamb Thank you for the review! I rebased the MR. It is ready for final review/merge.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar by making more operations return `Result` [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1381231958


##########
datafusion/common/src/scalar.rs:
##########
@@ -1970,13 +2020,14 @@ impl ScalarValue {
                 ),
             },
             ScalarValue::Fixedsizelist(..) => {
-                unimplemented!("FixedSizeList is not supported yet")
+                return _not_impl_err!("FixedSizeList is not supported yet")
             }
             ScalarValue::List(arr) => {
                 let arrays = std::iter::repeat(arr.as_ref())
                     .take(size)
                     .collect::<Vec<_>>();
-                arrow::compute::concat(arrays.as_slice()).unwrap()

Review Comment:
   I agree this should be unreachable. I will change this back to unwrap.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#issuecomment-1776041923

   The CI appears to be failing. Marking as draft until they are passing. If there are specific questions about this PR, please let us know. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1378379142


##########
datafusion/common/src/scalar.rs:
##########
@@ -512,19 +516,19 @@ impl std::hash::Hash for ScalarValue {
 pub fn get_dict_value<K: ArrowDictionaryKeyType>(
     array: &dyn Array,
     index: usize,
-) -> (&ArrayRef, Option<usize>) {
-    let dict_array = as_dictionary_array::<K>(array).unwrap();
-    (dict_array.values(), dict_array.key(index))
+) -> Result<(&ArrayRef, Option<usize>)> {
+    let dict_array = as_dictionary_array::<K>(array)?;

Review Comment:
   This is a public function and `as_dictionary_array` can fail depending on what `array` value the user passes to the function. The [`downcast_value`](https://github.com/apache/arrow-datafusion/blob/main/datafusion/common/src/lib.rs#L73) macro called by `as_dictionary_array` returns an internal error if the cast is impossible. 
   I think returning a `Result` here makes sense because this error is reachable and recoverable (user can handle the error and only passes in arrays that can be downcast to `DictionaryArray`). If panic is preferred here, we can establish an invariant that the `array` value passed to the function should be able to  downcast to `DictionaryArray`.
   `get_dict_value` is only called in `try_from_array`, which already returns a `Result`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1378379142


##########
datafusion/common/src/scalar.rs:
##########
@@ -512,19 +516,19 @@ impl std::hash::Hash for ScalarValue {
 pub fn get_dict_value<K: ArrowDictionaryKeyType>(
     array: &dyn Array,
     index: usize,
-) -> (&ArrayRef, Option<usize>) {
-    let dict_array = as_dictionary_array::<K>(array).unwrap();
-    (dict_array.values(), dict_array.key(index))
+) -> Result<(&ArrayRef, Option<usize>)> {
+    let dict_array = as_dictionary_array::<K>(array)?;

Review Comment:
   This is a public function and `as_dictionary_array` can fail depending on what `array` value the user passes to the function. The [`downcast_value`](https://github.com/apache/arrow-datafusion/blob/main/datafusion/common/src/lib.rs#L73) macro called by `as_dictionary_array` returns an internal error if the cast is impossible. 
   
   Returning a `Result` here makes sense because this error is recoverable (user can instead pass in an array that is possible to downcast to `DictionaryArray`).
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1381209821


##########
datafusion/common/src/scalar.rs:
##########
@@ -2259,71 +2316,63 @@ impl ScalarValue {
 
                 ScalarValue::List(arr)
             }
-            DataType::Date32 => {
-                typed_cast!(array, index, Date32Array, Date32)
-            }
-            DataType::Date64 => {
-                typed_cast!(array, index, Date64Array, Date64)
-            }
+            DataType::Date32 => typed_cast!(array, index, Date32Array, Date32)?,
+            DataType::Date64 => typed_cast!(array, index, Date64Array, Date64)?,
             DataType::Time32(TimeUnit::Second) => {
-                typed_cast!(array, index, Time32SecondArray, Time32Second)
+                typed_cast!(array, index, Time32SecondArray, Time32Second)?
             }
             DataType::Time32(TimeUnit::Millisecond) => {
-                typed_cast!(array, index, Time32MillisecondArray, Time32Millisecond)
+                typed_cast!(array, index, Time32MillisecondArray, Time32Millisecond)?
             }
             DataType::Time64(TimeUnit::Microsecond) => {
-                typed_cast!(array, index, Time64MicrosecondArray, Time64Microsecond)
+                typed_cast!(array, index, Time64MicrosecondArray, Time64Microsecond)?
             }
             DataType::Time64(TimeUnit::Nanosecond) => {
-                typed_cast!(array, index, Time64NanosecondArray, Time64Nanosecond)
-            }
-            DataType::Timestamp(TimeUnit::Second, tz_opt) => {
-                typed_cast_tz!(
-                    array,
-                    index,
-                    TimestampSecondArray,
-                    TimestampSecond,
-                    tz_opt
-                )
-            }
-            DataType::Timestamp(TimeUnit::Millisecond, tz_opt) => {
-                typed_cast_tz!(
-                    array,
-                    index,
-                    TimestampMillisecondArray,
-                    TimestampMillisecond,
-                    tz_opt
-                )
-            }
-            DataType::Timestamp(TimeUnit::Microsecond, tz_opt) => {
-                typed_cast_tz!(
-                    array,
-                    index,
-                    TimestampMicrosecondArray,
-                    TimestampMicrosecond,
-                    tz_opt
-                )
-            }
-            DataType::Timestamp(TimeUnit::Nanosecond, tz_opt) => {
-                typed_cast_tz!(
-                    array,
-                    index,
-                    TimestampNanosecondArray,
-                    TimestampNanosecond,
-                    tz_opt
-                )
+                typed_cast!(array, index, Time64NanosecondArray, Time64Nanosecond)?
             }
+            DataType::Timestamp(TimeUnit::Second, tz_opt) => typed_cast_tz!(
+                array,
+                index,
+                TimestampSecondArray,
+                TimestampSecond,
+                tz_opt
+            )?,
+            DataType::Timestamp(TimeUnit::Millisecond, tz_opt) => typed_cast_tz!(
+                array,
+                index,
+                TimestampMillisecondArray,
+                TimestampMillisecond,
+                tz_opt
+            )?,
+            DataType::Timestamp(TimeUnit::Microsecond, tz_opt) => typed_cast_tz!(
+                array,
+                index,
+                TimestampMicrosecondArray,
+                TimestampMicrosecond,
+                tz_opt
+            )?,
+            DataType::Timestamp(TimeUnit::Nanosecond, tz_opt) => typed_cast_tz!(
+                array,
+                index,
+                TimestampNanosecondArray,
+                TimestampNanosecond,
+                tz_opt
+            )?,
             DataType::Dictionary(key_type, _) => {
                 let (values_array, values_index) = match key_type.as_ref() {
-                    DataType::Int8 => get_dict_value::<Int8Type>(array, index),
-                    DataType::Int16 => get_dict_value::<Int16Type>(array, index),
-                    DataType::Int32 => get_dict_value::<Int32Type>(array, index),
-                    DataType::Int64 => get_dict_value::<Int64Type>(array, index),
-                    DataType::UInt8 => get_dict_value::<UInt8Type>(array, index),
-                    DataType::UInt16 => get_dict_value::<UInt16Type>(array, index),
-                    DataType::UInt32 => get_dict_value::<UInt32Type>(array, index),
-                    DataType::UInt64 => get_dict_value::<UInt64Type>(array, index),
-                    _ => unreachable!("Invalid dictionary keys type: {:?}", key_type),

Review Comment:
   See previous `[comment](https://github.com/apache/arrow-datafusion/pull/7901/files#r1381195593)`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1381209743


##########
datafusion/common/src/scalar.rs:
##########
@@ -2089,19 +2140,23 @@ impl ScalarValue {
             ScalarValue::Dictionary(key_type, v) => {
                 // values array is one element long (the value)
                 match key_type.as_ref() {
-                    DataType::Int8 => dict_from_scalar::<Int8Type>(v, size),
-                    DataType::Int16 => dict_from_scalar::<Int16Type>(v, size),
-                    DataType::Int32 => dict_from_scalar::<Int32Type>(v, size),
-                    DataType::Int64 => dict_from_scalar::<Int64Type>(v, size),
-                    DataType::UInt8 => dict_from_scalar::<UInt8Type>(v, size),
-                    DataType::UInt16 => dict_from_scalar::<UInt16Type>(v, size),
-                    DataType::UInt32 => dict_from_scalar::<UInt32Type>(v, size),
-                    DataType::UInt64 => dict_from_scalar::<UInt64Type>(v, size),
-                    _ => unreachable!("Invalid dictionary keys type: {:?}", key_type),

Review Comment:
   See previous [comment](https://github.com/apache/arrow-datafusion/pull/7901/files#r1381195593).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1381177149


##########
datafusion/common/src/scalar.rs:
##########
@@ -1768,12 +1817,12 @@ impl ScalarValue {
         precision: u8,
         scale: i8,
         size: usize,
-    ) -> Decimal256Array {
+    ) -> Result<Decimal256Array> {
         std::iter::repeat(value)
             .take(size)
             .collect::<Decimal256Array>()
             .with_precision_and_scale(precision, scale)
-            .unwrap()

Review Comment:
   Same as above.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1372602940


##########
datafusion/common/src/scalar.rs:
##########
@@ -247,6 +247,10 @@ impl PartialEq for ScalarValue {
 }
 
 // manual implementation of `PartialOrd`
+//
+// # Panics
+//
+// Panics if there is an error when comparing kernels for arrays

Review Comment:
   Updated to remove unwrap.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1374142052


##########
datafusion/common/src/scalar.rs:
##########
@@ -3419,8 +3492,8 @@ mod tests {
     {
         let scalar_result = left.add_checked(&right);
 
-        let left_array = left.to_array();
-        let right_array = right.to_array();
+        let left_array = left.to_array().expect("Failed to convert to array");

Review Comment:
   I believe the backtrace will provide information on the test function. For example, the following context is provided on failure
   `thread 'scalar::tests::scalar_add_overflow_test' panicked at datafusion/common/src/scalar.rs:3495:9:`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1378379142


##########
datafusion/common/src/scalar.rs:
##########
@@ -512,19 +516,19 @@ impl std::hash::Hash for ScalarValue {
 pub fn get_dict_value<K: ArrowDictionaryKeyType>(
     array: &dyn Array,
     index: usize,
-) -> (&ArrayRef, Option<usize>) {
-    let dict_array = as_dictionary_array::<K>(array).unwrap();
-    (dict_array.values(), dict_array.key(index))
+) -> Result<(&ArrayRef, Option<usize>)> {
+    let dict_array = as_dictionary_array::<K>(array)?;

Review Comment:
   This is a public function and `as_dictionary_array` can fail depending on what `array` value the user passes to the function. The [`downcast_value`](https://github.com/apache/arrow-datafusion/blob/main/datafusion/common/src/lib.rs#L73) macro called by `as_dictionary_array` returns an internal error if the cast is impossible. 
   
   Returning a `Result` here makes sense because this error is reachable and recoverable (user can handle the error and only passes in arrays that be downcast to `DictionaryArray`).
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1378463032


##########
datafusion/common/src/scalar.rs:
##########
@@ -1630,10 +1673,12 @@ impl ScalarValue {
         let array = scalars
             .into_iter()
             .map(|element: ScalarValue| match element {
-                ScalarValue::Decimal128(v1, _, _) => v1,
-                _ => unreachable!(),
+                ScalarValue::Decimal128(v1, _, _) => Ok(v1),
+                s => {
+                    _internal_err!("Expected ScalarValue::Null element. Received {s:?}")
+                }

Review Comment:
   Same as above.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1379698814


##########
datafusion/common/src/scalar.rs:
##########
@@ -739,12 +759,21 @@ macro_rules! build_timestamp_array_from_option {
 
 macro_rules! eq_array_primitive {
     ($array:expr, $index:expr, $ARRAYTYPE:ident, $VALUE:expr) => {{
-        let array = $array.as_any().downcast_ref::<$ARRAYTYPE>().unwrap();

Review Comment:
   This error can be unreachable if an invariant is established that the `array` passed to [`eq_array`](https://github.com/apache/arrow-datafusion/blob/main/datafusion/common/src/scalar.rs#L1664) always has the same data type, precision and scale with `self`.
   I will change this back to panic and thus making `eq_array` panic instead of returning Result.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1378379142


##########
datafusion/common/src/scalar.rs:
##########
@@ -512,19 +516,19 @@ impl std::hash::Hash for ScalarValue {
 pub fn get_dict_value<K: ArrowDictionaryKeyType>(
     array: &dyn Array,
     index: usize,
-) -> (&ArrayRef, Option<usize>) {
-    let dict_array = as_dictionary_array::<K>(array).unwrap();
-    (dict_array.values(), dict_array.key(index))
+) -> Result<(&ArrayRef, Option<usize>)> {
+    let dict_array = as_dictionary_array::<K>(array)?;

Review Comment:
   This is a public function and `as_dictionary_array` can fail depending on what `array` value the user passes to the function. The [`downcast_value`](https://github.com/apache/arrow-datafusion/blob/main/datafusion/common/src/lib.rs#L73) macro called by `as_dictionary_array` returns an internal error if the cast is impossible. 
   
   Returning a `Result` here makes sense because this error is recoverable.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1381174929


##########
datafusion/common/src/scalar.rs:
##########
@@ -1748,17 +1797,17 @@ impl ScalarValue {
         precision: u8,
         scale: i8,
         size: usize,
-    ) -> Decimal128Array {
+    ) -> Result<Decimal128Array> {
         match value {
             Some(val) => Decimal128Array::from(vec![val; size])
                 .with_precision_and_scale(precision, scale)
-                .unwrap(),

Review Comment:
   Since [`ScalarValue::try_new_decimal128`](https://github.com/apache/arrow-datafusion/blob/main/datafusion/common/src/scalar.rs#L660-L661) allows a decimal with precision 0, this error is reachable when user creates a decimal with precision 0 and [`arrow-array`](https://github.com/apache/arrow-rs/blob/master/arrow-array/src/types.rs#L1234-L1235) does not support 0 precision decimal.
   Would it make sense to disallow a decimal with precision 0 in `try_new_decimal128` so that this error becomes unreachable?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar by making more operations return `Result` [arrow-datafusion]

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1381768829


##########
datafusion/common/src/scalar.rs:
##########
@@ -330,9 +330,9 @@ impl PartialOrd for ScalarValue {
                         let arr2 = list_arr2.value(i);
 
                         let lt_res =
-                            arrow::compute::kernels::cmp::lt(&arr1, &arr2).unwrap();
+                            arrow::compute::kernels::cmp::lt(&arr1, &arr2).ok()?;

Review Comment:
   The potential downside of returning None rather than panic'ing is that it may mask a real bug and make it harder to track down -- comparing scalars of different types likely means they should have been coerced before



##########
datafusion/common/src/scalar.rs:
##########
@@ -1970,13 +2020,14 @@ impl ScalarValue {
                 ),
             },
             ScalarValue::Fixedsizelist(..) => {
-                unimplemented!("FixedSizeList is not supported yet")

Review Comment:
   I agree returning NotYetImplemented is a better choice



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar by making more operations return `Result` [arrow-datafusion]

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#issuecomment-1798612480

   @junjunjd  can you please merge and resolve the conflicts in this PR? Then we can merge it in


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1378458769


##########
datafusion/common/src/scalar.rs:
##########
@@ -1562,7 +1597,11 @@ impl ScalarValue {
                     DataType::UInt16 => dict_from_values::<UInt16Type>(values)?,
                     DataType::UInt32 => dict_from_values::<UInt32Type>(values)?,
                     DataType::UInt64 => dict_from_values::<UInt64Type>(values)?,
-                    _ => unreachable!("Invalid dictionary keys type: {:?}", key_type),
+                    _ => {
+                        return _internal_err!(
+                            "Invalid dictionary keys type: {key_type:?}"
+                        )
+                    }

Review Comment:
   I believe this error is reachable if the user passes dictionaries with non-integer keys to the function. It is recoverable so I replaced `unreachable!` with an internal error.
   This does not require any code change in client side, since `iter_to_array` already returns a Result.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1378373755


##########
datafusion/common/src/scalar.rs:
##########
@@ -330,9 +330,9 @@ impl PartialOrd for ScalarValue {
                         let arr2 = list_arr2.value(i);
 
                         let lt_res =
-                            arrow::compute::kernels::cmp::lt(&arr1, &arr2).unwrap();
+                            arrow::compute::kernels::cmp::lt(&arr1, &arr2).ok()?;

Review Comment:
   This panic is reachable, for example if `arr1` and `arr2` have different data type, [`arrow::compute::kernels::cmp::lt`](https://github.com/apache/arrow-rs/blob/master/arrow-ord/src/cmp.rs#L201-L204) will panic.
   
   It makes sense to return `None` here instead of panicking and exiting since user just performs a partial order comparison. 
   
   This does not require any code change in client side.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1378387479


##########
datafusion/common/src/scalar.rs:
##########
@@ -579,24 +581,44 @@ fn dict_from_values<K: ArrowDictionaryKeyType>(
 
 macro_rules! typed_cast_tz {
     ($array:expr, $index:expr, $ARRAYTYPE:ident, $SCALAR:ident, $TZ:expr) => {{
-        let array = $array.as_any().downcast_ref::<$ARRAYTYPE>().unwrap();
-        ScalarValue::$SCALAR(
+        use std::any::type_name;
+        let array = $array
+            .as_any()
+            .downcast_ref::<$ARRAYTYPE>()
+            .ok_or_else(|| {
+                DataFusionError::Internal(format!(
+                    "could not cast value to {}",
+                    type_name::<$ARRAYTYPE>()
+                ))
+            })?;

Review Comment:
   `typed_cast_tz` macro is called by [`try_from_array`](https://github.com/apache/arrow-datafusion/blob/main/datafusion/common/src/scalar.rs#L2021-L2022). `try_from_array` is a public function.  Depending on what `array` value the user passes to the function, downcasting the array to a timestamp array can fail. 
   This error is reachable and recoverable. It makes sense to  return an internal error here. This aligns with how downcast error is handled in the [downcast_value](https://github.com/apache/arrow-datafusion/blob/main/datafusion/common/src/lib.rs#L73) macro.
   This does not require any code change in client side. `typed_cast_tz` is only called in `try_from_array`, which already returns a Result.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1379681918


##########
datafusion/common/src/scalar.rs:
##########
@@ -579,24 +581,44 @@ fn dict_from_values<K: ArrowDictionaryKeyType>(
 
 macro_rules! typed_cast_tz {
     ($array:expr, $index:expr, $ARRAYTYPE:ident, $SCALAR:ident, $TZ:expr) => {{
-        let array = $array.as_any().downcast_ref::<$ARRAYTYPE>().unwrap();
-        ScalarValue::$SCALAR(
+        use std::any::type_name;
+        let array = $array
+            .as_any()
+            .downcast_ref::<$ARRAYTYPE>()
+            .ok_or_else(|| {
+                DataFusionError::Internal(format!(
+                    "could not cast value to {}",
+                    type_name::<$ARRAYTYPE>()
+                ))
+            })?;
+        Ok::<ScalarValue, DataFusionError>(ScalarValue::$SCALAR(
             match array.is_null($index) {
                 true => None,
                 false => Some(array.value($index).into()),
             },
             $TZ.clone(),
-        )
+        ))
     }};
 }
 
 macro_rules! typed_cast {
     ($array:expr, $index:expr, $ARRAYTYPE:ident, $SCALAR:ident) => {{
-        let array = $array.as_any().downcast_ref::<$ARRAYTYPE>().unwrap();

Review Comment:
   Same as above.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1379704451


##########
datafusion/common/src/scalar.rs:
##########
@@ -1562,7 +1597,11 @@ impl ScalarValue {
                     DataType::UInt16 => dict_from_values::<UInt16Type>(values)?,
                     DataType::UInt32 => dict_from_values::<UInt32Type>(values)?,
                     DataType::UInt64 => dict_from_values::<UInt64Type>(values)?,
-                    _ => unreachable!("Invalid dictionary keys type: {:?}", key_type),

Review Comment:
   I believe this error is reachable if the user passes dictionaries with non-integer keys to the function. It is recoverable so I replaced unreachable! with an internal error.
   This does not require any code change in client side, since `iter_to_array` already returns a Result.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1381209743


##########
datafusion/common/src/scalar.rs:
##########
@@ -2089,19 +2140,23 @@ impl ScalarValue {
             ScalarValue::Dictionary(key_type, v) => {
                 // values array is one element long (the value)
                 match key_type.as_ref() {
-                    DataType::Int8 => dict_from_scalar::<Int8Type>(v, size),
-                    DataType::Int16 => dict_from_scalar::<Int16Type>(v, size),
-                    DataType::Int32 => dict_from_scalar::<Int32Type>(v, size),
-                    DataType::Int64 => dict_from_scalar::<Int64Type>(v, size),
-                    DataType::UInt8 => dict_from_scalar::<UInt8Type>(v, size),
-                    DataType::UInt16 => dict_from_scalar::<UInt16Type>(v, size),
-                    DataType::UInt32 => dict_from_scalar::<UInt32Type>(v, size),
-                    DataType::UInt64 => dict_from_scalar::<UInt64Type>(v, size),
-                    _ => unreachable!("Invalid dictionary keys type: {:?}", key_type),

Review Comment:
   See previous `[comment](https://github.com/apache/arrow-datafusion/pull/7901/files#r1381195593)`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1381231326


##########
datafusion/common/src/scalar.rs:
##########
@@ -1970,13 +2020,14 @@ impl ScalarValue {
                 ),
             },
             ScalarValue::Fixedsizelist(..) => {
-                unimplemented!("FixedSizeList is not supported yet")

Review Comment:
   What would be the preferred way to handle unimplemented errors in `datafusion`? There are many places where a `NotImplemented` error is returned instead of using `unimplemented!` and panicking. IMO returning an error makes more sense as user can choose to ignore unimplemented errors instead of panicking and exiting.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1381231958


##########
datafusion/common/src/scalar.rs:
##########
@@ -1970,13 +2020,14 @@ impl ScalarValue {
                 ),
             },
             ScalarValue::Fixedsizelist(..) => {
-                unimplemented!("FixedSizeList is not supported yet")
+                return _not_impl_err!("FixedSizeList is not supported yet")
             }
             ScalarValue::List(arr) => {
                 let arrays = std::iter::repeat(arr.as_ref())
                     .take(size)
                     .collect::<Vec<_>>();
-                arrow::compute::concat(arrays.as_slice()).unwrap()

Review Comment:
   I agree this should be unreachable. I will change this back to unwrap.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1378373755


##########
datafusion/common/src/scalar.rs:
##########
@@ -330,9 +330,9 @@ impl PartialOrd for ScalarValue {
                         let arr2 = list_arr2.value(i);
 
                         let lt_res =
-                            arrow::compute::kernels::cmp::lt(&arr1, &arr2).unwrap();
+                            arrow::compute::kernels::cmp::lt(&arr1, &arr2).ok()?;

Review Comment:
   This panic is reachable, for example if `arr1` and `arr2` have different data type, [`arrow::compute::kernels::cmp::lt`](https://github.com/apache/arrow-rs/blob/master/arrow-ord/src/cmp.rs#L201-L204) will panic.
   
   It makes sense to return `None` here instead of panicking and exiting since user just performs a partial order comparison.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1378379142


##########
datafusion/common/src/scalar.rs:
##########
@@ -512,19 +516,19 @@ impl std::hash::Hash for ScalarValue {
 pub fn get_dict_value<K: ArrowDictionaryKeyType>(
     array: &dyn Array,
     index: usize,
-) -> (&ArrayRef, Option<usize>) {
-    let dict_array = as_dictionary_array::<K>(array).unwrap();
-    (dict_array.values(), dict_array.key(index))
+) -> Result<(&ArrayRef, Option<usize>)> {
+    let dict_array = as_dictionary_array::<K>(array)?;

Review Comment:
   This is a public function and `as_dictionary_array` can fail depending on what `array` value the user passes to the function. The [`downcast_value`](https://github.com/apache/arrow-datafusion/blob/main/datafusion/common/src/lib.rs#L73) macro called by `as_dictionary_array` returns an internal error if the cast is impossible. 
   Returning a `Result` here makes sense because this error is reachable and recoverable (user can handle the error and only passes in arrays that can be downcast to `DictionaryArray`).
   This does not require any code change in client side. `get_dict_value` is only called in `try_from_array`, which already returns a `Result`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1378387479


##########
datafusion/common/src/scalar.rs:
##########
@@ -579,24 +581,44 @@ fn dict_from_values<K: ArrowDictionaryKeyType>(
 
 macro_rules! typed_cast_tz {
     ($array:expr, $index:expr, $ARRAYTYPE:ident, $SCALAR:ident, $TZ:expr) => {{
-        let array = $array.as_any().downcast_ref::<$ARRAYTYPE>().unwrap();
-        ScalarValue::$SCALAR(
+        use std::any::type_name;
+        let array = $array
+            .as_any()
+            .downcast_ref::<$ARRAYTYPE>()
+            .ok_or_else(|| {
+                DataFusionError::Internal(format!(
+                    "could not cast value to {}",
+                    type_name::<$ARRAYTYPE>()
+                ))
+            })?;

Review Comment:
   `typed_cast_tz` macro is called by [`try_from_array`](https://github.com/apache/arrow-datafusion/blob/main/datafusion/common/src/scalar.rs#L2021-L2022). `try_from_array` is a public function.  Depending on what `array` value the user passes to the function, downcasting the array to a timestamp array can fail. 
   This error is reachable and recoverable. It makes sense to  return an internal error here. This aligns with how downcast error is handled in the [downcast_value](https://github.com/apache/arrow-datafusion/blob/main/datafusion/common/src/lib.rs#L73) macro.
   This does not require any code change in client side. get_dict_value is only called in try_from_array, which already returns a Result.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#issuecomment-1776042385

   Thank you for the work @junjunjd 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#issuecomment-1780476370

   @alamb @Weijun-H @comphead I addressed the comments. This is ready for another review.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1374142052


##########
datafusion/common/src/scalar.rs:
##########
@@ -3419,8 +3492,8 @@ mod tests {
     {
         let scalar_result = left.add_checked(&right);
 
-        let left_array = left.to_array();
-        let right_array = right.to_array();
+        let left_array = left.to_array().expect("Failed to convert to array");

Review Comment:
   I believe the backtrace will provide all the necessary information on the test function. For example, the following context is provided on failure
   `thread 'scalar::tests::scalar_add_overflow_test' panicked at datafusion/common/src/scalar.rs:3495:9:`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1378379142


##########
datafusion/common/src/scalar.rs:
##########
@@ -512,19 +516,19 @@ impl std::hash::Hash for ScalarValue {
 pub fn get_dict_value<K: ArrowDictionaryKeyType>(
     array: &dyn Array,
     index: usize,
-) -> (&ArrayRef, Option<usize>) {
-    let dict_array = as_dictionary_array::<K>(array).unwrap();
-    (dict_array.values(), dict_array.key(index))
+) -> Result<(&ArrayRef, Option<usize>)> {
+    let dict_array = as_dictionary_array::<K>(array)?;

Review Comment:
   This is a public function and `as_dictionary_array` can fail depending on what `array` value the user passes to the function. The [`downcast_value`](https://github.com/apache/arrow-datafusion/blob/main/datafusion/common/src/lib.rs#L73) macro called by `as_dictionary_array` returns an internal error if the cast is impossible. 
   
   Returning a `Result` here makes sense because this error is reachable and recoverable (user can instead pass in an array that is possible to downcast to `DictionaryArray`).
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1378447915


##########
datafusion/common/src/scalar.rs:
##########
@@ -1539,7 +1574,7 @@ impl ScalarValue {
                             if &inner_key_type == key_type {
                                 Ok(*scalar)
                             } else {
-                                panic!("Expected inner key type of {key_type} but found: {inner_key_type}, value was ({scalar:?})");
+                                _internal_err!("Expected inner key type of {key_type} but found: {inner_key_type}, value was ({scalar:?})")

Review Comment:
   Same as above.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1381223455


##########
datafusion/common/src/scalar.rs:
##########
@@ -1834,30 +1883,31 @@ impl ScalarValue {
                 }
 
                 let arr = Decimal128Array::from(vals)
-                    .with_precision_and_scale(*precision, *scale)
-                    .unwrap();
+                    .with_precision_and_scale(*precision, *scale)?;
                 wrap_into_list_array(Arc::new(arr))
             }
 
             DataType::Null => {
                 let arr = new_null_array(&DataType::Null, values.len());
                 wrap_into_list_array(arr)
             }
-            _ => panic!(
-                "Unsupported data type {:?} for ScalarValue::list_to_array",
-                data_type
-            ),
-        })
+            _ => {
+                return _internal_err!(
+                    "Unsupported data type {:?} for ScalarValue::list_to_array",
+                    data_type
+                )
+            }
+        }))
     }
 
     /// Converts a scalar value into an array of `size` rows.
-    pub fn to_array_of_size(&self, size: usize) -> ArrayRef {
-        match self {
+    pub fn to_array_of_size(&self, size: usize) -> Result<ArrayRef> {
+        Ok(match self {
             ScalarValue::Decimal128(e, precision, scale) => Arc::new(
-                ScalarValue::build_decimal_array(*e, *precision, *scale, size),
+                ScalarValue::build_decimal_array(*e, *precision, *scale, size)?,
             ),
             ScalarValue::Decimal256(e, precision, scale) => Arc::new(
-                ScalarValue::build_decimal256_array(*e, *precision, *scale, size),

Review Comment:
   See previous [comment](https://github.com/apache/arrow-datafusion/pull/7901/files#r1381174929).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1381233526


##########
datafusion/common/src/scalar.rs:
##########
@@ -2350,7 +2399,9 @@ impl ScalarValue {
                 let array = as_fixed_size_binary_array(array)?;
                 let size = match array.data_type() {
                     DataType::FixedSizeBinary(size) => *size,
-                    _ => unreachable!(),

Review Comment:
    I agree this should be unreachable. I will change this back to `unreachable!`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1378458769


##########
datafusion/common/src/scalar.rs:
##########
@@ -1562,7 +1597,11 @@ impl ScalarValue {
                     DataType::UInt16 => dict_from_values::<UInt16Type>(values)?,
                     DataType::UInt32 => dict_from_values::<UInt32Type>(values)?,
                     DataType::UInt64 => dict_from_values::<UInt64Type>(values)?,
-                    _ => unreachable!("Invalid dictionary keys type: {:?}", key_type),
+                    _ => {
+                        return _internal_err!(
+                            "Invalid dictionary keys type: {key_type:?}"
+                        )
+                    }

Review Comment:
   I believe this error is reachable if the user passes dictionaries with non-integer keys to the function. It is recoverable so I replaced `unreachable!` with an internal error.
   This does not require any code change in client side, since `iter_to_array` already returns a Result.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#issuecomment-1792050626

   @alamb Thanks for the review. I have added comments to the panics I removed in [scalar.rs](https://github.com/apache/arrow-datafusion/pull/7901/files#diff-3ed49bf2d745740a87441ae1fcdec44fe853f4809ee671fe294818982e020da6). To summarize, these panics can be categorized into five types in general:
   
   - [panics](https://github.com/apache/arrow-datafusion/pull/7901/files#r1378446881) generated in `iter_to_array` when the `ScalarValue`s in the iterator [are not all the same type](https://github.com/apache/arrow-datafusion/blob/main/datafusion/common/src/scalar.rs#L1093-L1094). I believe these errors are reachable. Most of the [`build_*` macros](https://github.com/apache/arrow-datafusion/blob/main/datafusion/common/src/scalar.rs#L1145) defined in the function return an internal error instead of panicking. IMO it makes sense to remove these panics to align with other internal errors returned.
   - [`typed_cast_*` macros](https://github.com/apache/arrow-datafusion/pull/7901/files#r1379681340) called in `try_from_array`. Since `try_from_array` is a public function, downcasting the array to certain types can fail depending on what `array` value the user passes to the function.
   I think it makes sense to return an internal error as this error is reachable and recoverable. This aligns with how downcast error is handled in the [downcast_value](https://github.com/apache/arrow-datafusion/blob/main/datafusion/common/src/lib.rs#L73) macro. `try_from_array` already returns a `Result`, so this does not require any change in client code.
   - [panics](https://github.com/apache/arrow-datafusion/pull/7901/files#r1381174929) generated when `with_precision_and_scale` is called on decimal arrays. I think these errors are reachable because [ScalarValue::try_new_decimal128](https://github.com/apache/arrow-datafusion/blob/main/datafusion/common/src/scalar.rs#L660-L661) allows decimals with precision 0 while [arrow-array](https://github.com/apache/arrow-rs/blob/master/arrow-array/src/types.rs#L1234-L1235) does not support that. We can update `try_new_decimal128` to disallow decimal with precision 0 and establish some other invariants to [`new_list`](https://github.com/apache/arrow-datafusion/pull/7901/files#r1379688238), [`eq_array`](https://github.com/apache/arrow-datafusion/pull/7901/files#r1379698814) and [`to_array_of_size`](https://github.com/apache/arrow-datafusion/pull/7901/files#r1381221881) so that these error becomes unreachable and these functions can panic. This should remove the impacts on client code.
   - the unimplemented errors. In many other `datafusion` code, a `NotImplemented` error is returned instead of using `unimplemented!`. IMHO returning an error makes more sense as user can choose whether to ignore the unimplemented errors instead of panicking and exiting. Would appreciate your thoughts on this.
   - All the `ArrowError`s and the "Invalid dictionary keys type" errors should be unreachable. I will change these back to panics.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar by making more operations return `Result` [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1379698814


##########
datafusion/common/src/scalar.rs:
##########
@@ -739,12 +759,21 @@ macro_rules! build_timestamp_array_from_option {
 
 macro_rules! eq_array_primitive {
     ($array:expr, $index:expr, $ARRAYTYPE:ident, $VALUE:expr) => {{
-        let array = $array.as_any().downcast_ref::<$ARRAYTYPE>().unwrap();

Review Comment:
   This error can be unreachable if an invariant is established that the `array` passed to [`eq_array`](https://github.com/apache/arrow-datafusion/blob/main/datafusion/common/src/scalar.rs#L1664) always has the same data type, precision and scale with `self`.
   With the above invariant established, we can change this back to panic and thus making `eq_array` panic instead of returning Result.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1374109734


##########
datafusion/common/src/scalar.rs:
##########
@@ -3468,22 +3541,30 @@ mod tests {
         }
 
         // decimal scalar to array
-        let array = decimal_value.to_array();
+        let array = decimal_value
+            .to_array()
+            .expect("Failed to convert to array");
         let array = as_decimal128_array(&array)?;
         assert_eq!(1, array.len());
         assert_eq!(DataType::Decimal128(10, 1), array.data_type().clone());
         assert_eq!(123i128, array.value(0));
 
         // decimal scalar to array with size
-        let array = decimal_value.to_array_of_size(10);
+        let array = decimal_value
+            .to_array_of_size(10)
+            .expect("Failed to convert to array of size");

Review Comment:
   Same as above.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1374142935


##########
datafusion/common/src/scalar.rs:
##########
@@ -3419,8 +3492,8 @@ mod tests {
     {
         let scalar_result = left.add_checked(&right);
 
-        let left_array = left.to_array();
-        let right_array = right.to_array();
+        let left_array = left.to_array().expect("Failed to convert to array");
+        let right_array = right.to_array().expect("Failed to convert to array");

Review Comment:
   Same as above.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1381175085


##########
datafusion/common/src/scalar.rs:
##########
@@ -1748,17 +1797,17 @@ impl ScalarValue {
         precision: u8,
         scale: i8,
         size: usize,
-    ) -> Decimal128Array {
+    ) -> Result<Decimal128Array> {
         match value {
             Some(val) => Decimal128Array::from(vec![val; size])
                 .with_precision_and_scale(precision, scale)
-                .unwrap(),
+                .map_err(DataFusionError::ArrowError),
             None => {
                 let mut builder = Decimal128Array::builder(size)
                     .with_precision_and_scale(precision, scale)
-                    .unwrap();

Review Comment:
   Same as above.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1378373755


##########
datafusion/common/src/scalar.rs:
##########
@@ -330,9 +330,9 @@ impl PartialOrd for ScalarValue {
                         let arr2 = list_arr2.value(i);
 
                         let lt_res =
-                            arrow::compute::kernels::cmp::lt(&arr1, &arr2).unwrap();
+                            arrow::compute::kernels::cmp::lt(&arr1, &arr2).ok()?;

Review Comment:
   This panic is reachable, for example if `arr1` and `arr2` have different data type, [`arrow::compute::kernels::cmp::lt`](https://github.com/apache/arrow-rs/blob/master/arrow-ord/src/cmp.rs#L201-L204) will panic.
   I think it makes sense to return `None` here instead of panicking and exiting since user just performs a partial order comparison. 
   This does not require any code change in client side.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1381244300


##########
datafusion/common/src/scalar.rs:
##########
@@ -2462,17 +2511,16 @@ impl ScalarValue {
     /// This function has a few narrow usescases such as hash table key
     /// comparisons where comparing a single row at a time is necessary.
     #[inline]
-    pub fn eq_array(&self, array: &ArrayRef, index: usize) -> bool {
-        match self {
+    pub fn eq_array(&self, array: &ArrayRef, index: usize) -> Result<bool> {
+        Ok(match self {
             ScalarValue::Decimal128(v, precision, scale) => {
                 ScalarValue::eq_array_decimal(
                     array,
                     index,
                     v.as_ref(),
                     *precision,
                     *scale,
-                )
-                .unwrap()

Review Comment:
   See previous [comment](https://github.com/apache/arrow-datafusion/pull/7901/files#r1379698814).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1381244656


##########
datafusion/common/src/scalar.rs:
##########
@@ -2481,119 +2529,136 @@ impl ScalarValue {
                     v.as_ref(),
                     *precision,
                     *scale,
-                )
-                .unwrap()

Review Comment:
   See previous [comment](https://github.com/apache/arrow-datafusion/pull/7901/files#r1379698814).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1381221881


##########
datafusion/common/src/scalar.rs:
##########
@@ -1834,30 +1883,31 @@ impl ScalarValue {
                 }
 
                 let arr = Decimal128Array::from(vals)
-                    .with_precision_and_scale(*precision, *scale)
-                    .unwrap();
+                    .with_precision_and_scale(*precision, *scale)?;
                 wrap_into_list_array(Arc::new(arr))
             }
 
             DataType::Null => {
                 let arr = new_null_array(&DataType::Null, values.len());
                 wrap_into_list_array(arr)
             }
-            _ => panic!(
-                "Unsupported data type {:?} for ScalarValue::list_to_array",
-                data_type
-            ),
-        })
+            _ => {
+                return _internal_err!(
+                    "Unsupported data type {:?} for ScalarValue::list_to_array",
+                    data_type
+                )
+            }
+        }))
     }
 
     /// Converts a scalar value into an array of `size` rows.
-    pub fn to_array_of_size(&self, size: usize) -> ArrayRef {
-        match self {
+    pub fn to_array_of_size(&self, size: usize) -> Result<ArrayRef> {
+        Ok(match self {
             ScalarValue::Decimal128(e, precision, scale) => Arc::new(
-                ScalarValue::build_decimal_array(*e, *precision, *scale, size),

Review Comment:
   See previous [comment](https://github.com/apache/arrow-datafusion/pull/7901/files#r1381174929).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1381220530


##########
datafusion/common/src/scalar.rs:
##########
@@ -1834,30 +1883,31 @@ impl ScalarValue {
                 }
 
                 let arr = Decimal128Array::from(vals)
-                    .with_precision_and_scale(*precision, *scale)
-                    .unwrap();
+                    .with_precision_and_scale(*precision, *scale)?;
                 wrap_into_list_array(Arc::new(arr))
             }
 
             DataType::Null => {
                 let arr = new_null_array(&DataType::Null, values.len());
                 wrap_into_list_array(arr)
             }
-            _ => panic!(
-                "Unsupported data type {:?} for ScalarValue::list_to_array",
-                data_type
-            ),

Review Comment:
   By updating `try_new_decimal128` to disallow decimal with precision 0 and establishing some invariants, errors in `new_list` should be unreachable. I will change `new_list` back to panic to avoid impact on client code.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1378373755


##########
datafusion/common/src/scalar.rs:
##########
@@ -330,9 +330,9 @@ impl PartialOrd for ScalarValue {
                         let arr2 = list_arr2.value(i);
 
                         let lt_res =
-                            arrow::compute::kernels::cmp::lt(&arr1, &arr2).unwrap();
+                            arrow::compute::kernels::cmp::lt(&arr1, &arr2).ok()?;

Review Comment:
   This panic is reachable, for example if `arr1` and `arr2` have different data type, [`arrow::compute::kernels::cmp::lt`](https://github.com/apache/arrow-rs/blob/master/arrow-ord/src/cmp.rs#L201-L204) will panic.
   
   It makes sense to return `None` here instead of panicking and exiting since user just performs a partial order comparison. This does not require client code change.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1379691127


##########
datafusion/common/src/scalar.rs:
##########
@@ -701,13 +721,13 @@ macro_rules! build_values_list_tz {
                     ScalarValue::$SCALAR_TY(None, _) => {
                         builder.values().append_null();
                     }
-                    _ => panic!("Incompatible ScalarValue for list"),
+                    _ => return _internal_err!("Incompatible ScalarValue for list"),

Review Comment:
   Same as above.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "houqp (via GitHub)" <gi...@apache.org>.
houqp commented on PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#issuecomment-1781478234

   @Weijun-H @comphead using unwrap and expect in tests is actually the preferred practice, see https://github.com/influxdata/influxdb/blob/main/docs/style_guide.md#dont-return-result-from-test-functions. It makes the test failure easier to parse for a human and the test framework will already provide all  the necessary context.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#issuecomment-1778716889

   @alamb CI is fixed. This MR is ready for final review.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1378387479


##########
datafusion/common/src/scalar.rs:
##########
@@ -579,24 +581,44 @@ fn dict_from_values<K: ArrowDictionaryKeyType>(
 
 macro_rules! typed_cast_tz {
     ($array:expr, $index:expr, $ARRAYTYPE:ident, $SCALAR:ident, $TZ:expr) => {{
-        let array = $array.as_any().downcast_ref::<$ARRAYTYPE>().unwrap();
-        ScalarValue::$SCALAR(
+        use std::any::type_name;
+        let array = $array
+            .as_any()
+            .downcast_ref::<$ARRAYTYPE>()
+            .ok_or_else(|| {
+                DataFusionError::Internal(format!(
+                    "could not cast value to {}",
+                    type_name::<$ARRAYTYPE>()
+                ))
+            })?;

Review Comment:
   `typed_cast_tz` macro is called by [`try_from_array`](https://github.com/apache/arrow-datafusion/blob/main/datafusion/common/src/scalar.rs#L2021-L2022). `try_from_array` is a public function.  Depending on what `array` value the user passes to the function, downcasting the array to a timestamp array can fail. 
   
   This error is reachable and recoverable. It makes sense to  return an internal error here. This aligns with how downcast error is handled in the [downcast_value](https://github.com/apache/arrow-datafusion/blob/main/datafusion/common/src/lib.rs#L73) macro.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1378387786


##########
datafusion/common/src/scalar.rs:
##########
@@ -579,24 +581,44 @@ fn dict_from_values<K: ArrowDictionaryKeyType>(
 
 macro_rules! typed_cast_tz {
     ($array:expr, $index:expr, $ARRAYTYPE:ident, $SCALAR:ident, $TZ:expr) => {{
-        let array = $array.as_any().downcast_ref::<$ARRAYTYPE>().unwrap();
-        ScalarValue::$SCALAR(
+        use std::any::type_name;
+        let array = $array
+            .as_any()
+            .downcast_ref::<$ARRAYTYPE>()
+            .ok_or_else(|| {
+                DataFusionError::Internal(format!(
+                    "could not cast value to {}",
+                    type_name::<$ARRAYTYPE>()
+                ))
+            })?;
+        Ok::<ScalarValue, DataFusionError>(ScalarValue::$SCALAR(
             match array.is_null($index) {
                 true => None,
                 false => Some(array.value($index).into()),
             },
             $TZ.clone(),
-        )
+        ))
     }};
 }
 
 macro_rules! typed_cast {
     ($array:expr, $index:expr, $ARRAYTYPE:ident, $SCALAR:ident) => {{
-        let array = $array.as_any().downcast_ref::<$ARRAYTYPE>().unwrap();
-        ScalarValue::$SCALAR(match array.is_null($index) {
-            true => None,
-            false => Some(array.value($index).into()),
-        })
+        use std::any::type_name;
+        let array = $array
+            .as_any()
+            .downcast_ref::<$ARRAYTYPE>()
+            .ok_or_else(|| {
+                DataFusionError::Internal(format!(
+                    "could not cast value to {}",
+                    type_name::<$ARRAYTYPE>()
+                ))
+            })?;

Review Comment:
   Same as above.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1379681340


##########
datafusion/common/src/scalar.rs:
##########
@@ -579,24 +581,44 @@ fn dict_from_values<K: ArrowDictionaryKeyType>(
 
 macro_rules! typed_cast_tz {
     ($array:expr, $index:expr, $ARRAYTYPE:ident, $SCALAR:ident, $TZ:expr) => {{
-        let array = $array.as_any().downcast_ref::<$ARRAYTYPE>().unwrap();

Review Comment:
   typed_cast_tz macro is called by [try_from_array](https://github.com/apache/arrow-datafusion/blob/main/datafusion/common/src/scalar.rs#L2021-L2022). try_from_array is a public function. Depending on what array value the user passes to the function, downcasting the array to a timestamp array can fail.
   This error is reachable and recoverable. It makes sense to return an internal error here. This aligns with how downcast error is handled in the [downcast_value](https://github.com/apache/arrow-datafusion/blob/main/datafusion/common/src/lib.rs#L73) macro.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1381233526


##########
datafusion/common/src/scalar.rs:
##########
@@ -2350,7 +2399,9 @@ impl ScalarValue {
                 let array = as_fixed_size_binary_array(array)?;
                 let size = match array.data_type() {
                     DataType::FixedSizeBinary(size) => *size,
-                    _ => unreachable!(),

Review Comment:
    I agree this should be unreachable. I will change this back to unwrap.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1381212662


##########
datafusion/common/src/scalar.rs:
##########
@@ -2481,119 +2529,136 @@ impl ScalarValue {
                     v.as_ref(),
                     *precision,
                     *scale,
-                )
-                .unwrap()
+                )?
             }
             ScalarValue::Boolean(val) => {
-                eq_array_primitive!(array, index, BooleanArray, val)
+                eq_array_primitive!(array, index, BooleanArray, val)?
             }
             ScalarValue::Float32(val) => {
-                eq_array_primitive!(array, index, Float32Array, val)
+                eq_array_primitive!(array, index, Float32Array, val)?
             }
             ScalarValue::Float64(val) => {
-                eq_array_primitive!(array, index, Float64Array, val)
+                eq_array_primitive!(array, index, Float64Array, val)?
+            }
+            ScalarValue::Int8(val) => eq_array_primitive!(array, index, Int8Array, val)?,
+            ScalarValue::Int16(val) => {
+                eq_array_primitive!(array, index, Int16Array, val)?
+            }
+            ScalarValue::Int32(val) => {
+                eq_array_primitive!(array, index, Int32Array, val)?
+            }
+            ScalarValue::Int64(val) => {
+                eq_array_primitive!(array, index, Int64Array, val)?
+            }
+            ScalarValue::UInt8(val) => {
+                eq_array_primitive!(array, index, UInt8Array, val)?
             }
-            ScalarValue::Int8(val) => eq_array_primitive!(array, index, Int8Array, val),
-            ScalarValue::Int16(val) => eq_array_primitive!(array, index, Int16Array, val),
-            ScalarValue::Int32(val) => eq_array_primitive!(array, index, Int32Array, val),
-            ScalarValue::Int64(val) => eq_array_primitive!(array, index, Int64Array, val),
-            ScalarValue::UInt8(val) => eq_array_primitive!(array, index, UInt8Array, val),
             ScalarValue::UInt16(val) => {
-                eq_array_primitive!(array, index, UInt16Array, val)
+                eq_array_primitive!(array, index, UInt16Array, val)?
             }
             ScalarValue::UInt32(val) => {
-                eq_array_primitive!(array, index, UInt32Array, val)
+                eq_array_primitive!(array, index, UInt32Array, val)?
             }
             ScalarValue::UInt64(val) => {
-                eq_array_primitive!(array, index, UInt64Array, val)
+                eq_array_primitive!(array, index, UInt64Array, val)?
+            }
+            ScalarValue::Utf8(val) => {
+                eq_array_primitive!(array, index, StringArray, val)?
             }
-            ScalarValue::Utf8(val) => eq_array_primitive!(array, index, StringArray, val),
             ScalarValue::LargeUtf8(val) => {
-                eq_array_primitive!(array, index, LargeStringArray, val)
+                eq_array_primitive!(array, index, LargeStringArray, val)?
             }
             ScalarValue::Binary(val) => {
-                eq_array_primitive!(array, index, BinaryArray, val)
+                eq_array_primitive!(array, index, BinaryArray, val)?
             }
             ScalarValue::FixedSizeBinary(_, val) => {
-                eq_array_primitive!(array, index, FixedSizeBinaryArray, val)
+                eq_array_primitive!(array, index, FixedSizeBinaryArray, val)?
             }
             ScalarValue::LargeBinary(val) => {
-                eq_array_primitive!(array, index, LargeBinaryArray, val)
+                eq_array_primitive!(array, index, LargeBinaryArray, val)?
             }
-            ScalarValue::Fixedsizelist(..) => unimplemented!(),
-            ScalarValue::List(_) => unimplemented!("ListArr"),
+            ScalarValue::Fixedsizelist(..) => {
+                return _not_impl_err!("FixedSizeList is not supported yet")
+            }
+            ScalarValue::List(_) => return _not_impl_err!("List is not supported yet"),
             ScalarValue::Date32(val) => {
-                eq_array_primitive!(array, index, Date32Array, val)
+                eq_array_primitive!(array, index, Date32Array, val)?
             }
             ScalarValue::Date64(val) => {
-                eq_array_primitive!(array, index, Date64Array, val)
+                eq_array_primitive!(array, index, Date64Array, val)?
             }
             ScalarValue::Time32Second(val) => {
-                eq_array_primitive!(array, index, Time32SecondArray, val)
+                eq_array_primitive!(array, index, Time32SecondArray, val)?
             }
             ScalarValue::Time32Millisecond(val) => {
-                eq_array_primitive!(array, index, Time32MillisecondArray, val)
+                eq_array_primitive!(array, index, Time32MillisecondArray, val)?
             }
             ScalarValue::Time64Microsecond(val) => {
-                eq_array_primitive!(array, index, Time64MicrosecondArray, val)
+                eq_array_primitive!(array, index, Time64MicrosecondArray, val)?
             }
             ScalarValue::Time64Nanosecond(val) => {
-                eq_array_primitive!(array, index, Time64NanosecondArray, val)
+                eq_array_primitive!(array, index, Time64NanosecondArray, val)?
             }
             ScalarValue::TimestampSecond(val, _) => {
-                eq_array_primitive!(array, index, TimestampSecondArray, val)
+                eq_array_primitive!(array, index, TimestampSecondArray, val)?
             }
             ScalarValue::TimestampMillisecond(val, _) => {
-                eq_array_primitive!(array, index, TimestampMillisecondArray, val)
+                eq_array_primitive!(array, index, TimestampMillisecondArray, val)?
             }
             ScalarValue::TimestampMicrosecond(val, _) => {
-                eq_array_primitive!(array, index, TimestampMicrosecondArray, val)
+                eq_array_primitive!(array, index, TimestampMicrosecondArray, val)?
             }
             ScalarValue::TimestampNanosecond(val, _) => {
-                eq_array_primitive!(array, index, TimestampNanosecondArray, val)
+                eq_array_primitive!(array, index, TimestampNanosecondArray, val)?
             }
             ScalarValue::IntervalYearMonth(val) => {
-                eq_array_primitive!(array, index, IntervalYearMonthArray, val)
+                eq_array_primitive!(array, index, IntervalYearMonthArray, val)?
             }
             ScalarValue::IntervalDayTime(val) => {
-                eq_array_primitive!(array, index, IntervalDayTimeArray, val)
+                eq_array_primitive!(array, index, IntervalDayTimeArray, val)?
             }
             ScalarValue::IntervalMonthDayNano(val) => {
-                eq_array_primitive!(array, index, IntervalMonthDayNanoArray, val)
+                eq_array_primitive!(array, index, IntervalMonthDayNanoArray, val)?
             }
             ScalarValue::DurationSecond(val) => {
-                eq_array_primitive!(array, index, DurationSecondArray, val)
+                eq_array_primitive!(array, index, DurationSecondArray, val)?
             }
             ScalarValue::DurationMillisecond(val) => {
-                eq_array_primitive!(array, index, DurationMillisecondArray, val)
+                eq_array_primitive!(array, index, DurationMillisecondArray, val)?
             }
             ScalarValue::DurationMicrosecond(val) => {
-                eq_array_primitive!(array, index, DurationMicrosecondArray, val)
+                eq_array_primitive!(array, index, DurationMicrosecondArray, val)?
             }
             ScalarValue::DurationNanosecond(val) => {
-                eq_array_primitive!(array, index, DurationNanosecondArray, val)
+                eq_array_primitive!(array, index, DurationNanosecondArray, val)?
+            }
+            ScalarValue::Struct(_, _) => {
+                return _not_impl_err!("Struct is not supported yet")
             }
-            ScalarValue::Struct(_, _) => unimplemented!(),
             ScalarValue::Dictionary(key_type, v) => {
                 let (values_array, values_index) = match key_type.as_ref() {
-                    DataType::Int8 => get_dict_value::<Int8Type>(array, index),
-                    DataType::Int16 => get_dict_value::<Int16Type>(array, index),
-                    DataType::Int32 => get_dict_value::<Int32Type>(array, index),
-                    DataType::Int64 => get_dict_value::<Int64Type>(array, index),
-                    DataType::UInt8 => get_dict_value::<UInt8Type>(array, index),
-                    DataType::UInt16 => get_dict_value::<UInt16Type>(array, index),
-                    DataType::UInt32 => get_dict_value::<UInt32Type>(array, index),
-                    DataType::UInt64 => get_dict_value::<UInt64Type>(array, index),
-                    _ => unreachable!("Invalid dictionary keys type: {:?}", key_type),

Review Comment:
   See previous [comment](https://github.com/apache/arrow-datafusion/pull/7901/files#r1381195593).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1378379142


##########
datafusion/common/src/scalar.rs:
##########
@@ -512,19 +516,19 @@ impl std::hash::Hash for ScalarValue {
 pub fn get_dict_value<K: ArrowDictionaryKeyType>(
     array: &dyn Array,
     index: usize,
-) -> (&ArrayRef, Option<usize>) {
-    let dict_array = as_dictionary_array::<K>(array).unwrap();
-    (dict_array.values(), dict_array.key(index))
+) -> Result<(&ArrayRef, Option<usize>)> {
+    let dict_array = as_dictionary_array::<K>(array)?;

Review Comment:
   This is a public function and `as_dictionary_array` can fail depending on what `array` value the user passes to the function. The [`downcast_value`](https://github.com/apache/arrow-datafusion/blob/main/datafusion/common/src/lib.rs#L73) macro called by `as_dictionary_array` returns an internal error if the cast is impossible. 
   Returning a `Result` here makes sense because this error is reachable and recoverable (user can handle the error and only passes in arrays that can be downcast to `DictionaryArray`).
   This does not require any code change in client side.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1378446881


##########
datafusion/common/src/scalar.rs:
##########
@@ -1309,27 +1338,33 @@ impl ScalarValue {
 
         macro_rules! build_array_list_primitive {
             ($ARRAY_TY:ident, $SCALAR_TY:ident, $NATIVE_TYPE:ident) => {{
-                Arc::new(ListArray::from_iter_primitive::<$ARRAY_TY, _, _>(
-                    scalars.into_iter().map(|x| match x {
-                        ScalarValue::List(arr) => {
-                            if arr.as_any().downcast_ref::<NullArray>().is_some() {
-                                None
-                            } else {
-                                let list_arr = as_list_array(&arr);
-                                let primitive_arr =
-                                    list_arr.values().as_primitive::<$ARRAY_TY>();
-                                Some(
-                                    primitive_arr.into_iter().collect::<Vec<Option<_>>>(),
-                                )
+                Ok::<ArrayRef, DataFusionError>(Arc::new(ListArray::from_iter_primitive::<$ARRAY_TY, _, _>(
+                    scalars
+                        .into_iter()
+                        .map(|x| match x {
+                            ScalarValue::List(arr) => {
+                                if arr.as_any().downcast_ref::<NullArray>().is_some() {
+                                    Ok(None)
+                                } else {
+                                    let list_arr = as_list_array(&arr);
+                                    let primitive_arr =
+                                        list_arr.values().as_primitive::<$ARRAY_TY>();
+                                    Ok(Some(
+                                        primitive_arr
+                                            .into_iter()
+                                            .collect::<Vec<Option<_>>>(),
+                                    ))
+                                }
                             }
-                        }
-                        sv => panic!(
-                            "Inconsistent types in ScalarValue::iter_to_array. \
+                            sv => _internal_err!(
+                                "Inconsistent types in ScalarValue::iter_to_array. \

Review Comment:
   It appears that unreachable errors are handled inconsistently in `iter_to_array`. This error should be [unreachable](https://github.com/apache/arrow-datafusion/blob/main/datafusion/common/src/scalar.rs#L1277). However, all the other `build_*` macros defined in `iter_to_array` return an internal error (for example [`build_array_list_string`](https://github.com/apache/arrow-datafusion/blob/main/datafusion/common/src/scalar.rs#L1253), [`build_array_primitive`](https://github.com/apache/arrow-datafusion/blob/main/datafusion/common/src/scalar.rs#L1145). I removed `panic` here to align with other `build_*` macros.
   This does not require any code change in client side, since `iter_to_array` already returns a `Result`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar by making more operations return `Result` [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1379688238


##########
datafusion/common/src/scalar.rs:
##########
@@ -676,13 +696,13 @@ macro_rules! build_values_list {
                     ScalarValue::$SCALAR_TY(None) => {
                         builder.values().append_null();
                     }
-                    _ => panic!("Incompatible ScalarValue for list"),

Review Comment:
   This error can be unreachable if an invariant is established that the `data_type` and `values` arguments passed to [`new_list`](https://github.com/apache/arrow-datafusion/blob/main/datafusion/common/src/scalar.rs#L1664) align with each other. 
   With the above invariant established, we can change this back to `panic` and thus making `new_list` panic instead of returning `Result`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "Weijun-H (via GitHub)" <gi...@apache.org>.
Weijun-H commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1371382927


##########
datafusion/common/src/scalar.rs:
##########
@@ -247,6 +247,10 @@ impl PartialEq for ScalarValue {
 }
 
 // manual implementation of `PartialOrd`
+//
+// # Panics
+//
+// Panics if there is an error when comparing kernels for arrays

Review Comment:
   ```suggestion
   ```



##########
datafusion/common/src/scalar.rs:
##########
@@ -247,6 +247,10 @@ impl PartialEq for ScalarValue {
 }
 
 // manual implementation of `PartialOrd`
+//
+// # Panics
+//
+// Panics if there is an error when comparing kernels for arrays

Review Comment:
   Could we handle unwarp like this here 🤔
   ```rust
   let lt_res = if let Ok(lt_res) =
       arrow::compute::kernels::cmp::lt(&arr1, &arr2)
   {
       lt_res
   } else {
       return None;
   };
   
   let eq_res = if let Ok(eq_res) =
       arrow::compute::kernels::cmp::eq(&arr1, &arr2)
   {
       eq_res
   } else {
       return None;
   };
   ```



##########
datafusion/common/src/scalar.rs:
##########
@@ -247,6 +247,10 @@ impl PartialEq for ScalarValue {
 }
 
 // manual implementation of `PartialOrd`
+//
+// # Panics
+//
+// Panics if there is an error when comparing kernels for arrays

Review Comment:
   Could we handle unwarp like this here 🤔
   ```rust
   let lt_res = if let Ok(lt_res) =
       arrow::compute::kernels::cmp::lt(&arr1, &arr2)
   {
       lt_res
   } else {
       return None;
   };
   
   let eq_res = if let Ok(eq_res) =
       arrow::compute::kernels::cmp::eq(&arr1, &arr2)
   {
       eq_res
   } else {
       return None;
   };
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1374109902


##########
datafusion/common/src/scalar.rs:
##########
@@ -3468,22 +3541,30 @@ mod tests {
         }
 
         // decimal scalar to array
-        let array = decimal_value.to_array();
+        let array = decimal_value
+            .to_array()
+            .expect("Failed to convert to array");
         let array = as_decimal128_array(&array)?;
         assert_eq!(1, array.len());
         assert_eq!(DataType::Decimal128(10, 1), array.data_type().clone());
         assert_eq!(123i128, array.value(0));
 
         // decimal scalar to array with size
-        let array = decimal_value.to_array_of_size(10);
+        let array = decimal_value
+            .to_array_of_size(10)
+            .expect("Failed to convert to array of size");
         let array_decimal = as_decimal128_array(&array)?;
         assert_eq!(10, array.len());
         assert_eq!(DataType::Decimal128(10, 1), array.data_type().clone());
         assert_eq!(123i128, array_decimal.value(0));
         assert_eq!(123i128, array_decimal.value(9));
         // test eq array
-        assert!(decimal_value.eq_array(&array, 1));
-        assert!(decimal_value.eq_array(&array, 5));
+        assert!(decimal_value
+            .eq_array(&array, 1)
+            .expect("Failed to compare arrays"));
+        assert!(decimal_value
+            .eq_array(&array, 5)
+            .expect("Failed to compare arrays"));

Review Comment:
   Same as above.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1374109555


##########
datafusion/common/src/scalar.rs:
##########
@@ -3468,22 +3541,30 @@ mod tests {
         }
 
         // decimal scalar to array
-        let array = decimal_value.to_array();
+        let array = decimal_value
+            .to_array()
+            .expect("Failed to convert to array");

Review Comment:
   I think it makes sense to use `.expect` here so that we can have more information on test failure. See my other comment https://github.com/apache/arrow-datafusion/pull/7901#issuecomment-1780426607



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1378458769


##########
datafusion/common/src/scalar.rs:
##########
@@ -1562,7 +1597,11 @@ impl ScalarValue {
                     DataType::UInt16 => dict_from_values::<UInt16Type>(values)?,
                     DataType::UInt32 => dict_from_values::<UInt32Type>(values)?,
                     DataType::UInt64 => dict_from_values::<UInt64Type>(values)?,
-                    _ => unreachable!("Invalid dictionary keys type: {:?}", key_type),
+                    _ => {
+                        return _internal_err!(
+                            "Invalid dictionary keys type: {key_type:?}"
+                        )
+                    }

Review Comment:
   I believe this error is reachable if the user passes dictionaries with non-integer keys to the function. It is recoverable so I replaced `unreachable!` with an internal error.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1378373879


##########
datafusion/common/src/scalar.rs:
##########
@@ -330,9 +330,9 @@ impl PartialOrd for ScalarValue {
                         let arr2 = list_arr2.value(i);
 
                         let lt_res =
-                            arrow::compute::kernels::cmp::lt(&arr1, &arr2).unwrap();
+                            arrow::compute::kernels::cmp::lt(&arr1, &arr2).ok()?;
                         let eq_res =
-                            arrow::compute::kernels::cmp::eq(&arr1, &arr2).unwrap();
+                            arrow::compute::kernels::cmp::eq(&arr1, &arr2).ok()?;

Review Comment:
   Same as above.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1378462091


##########
datafusion/common/src/scalar.rs:
##########
@@ -1611,15 +1650,19 @@ impl ScalarValue {
         Ok(array)
     }
 
-    fn iter_to_null_array(scalars: impl IntoIterator<Item = ScalarValue>) -> ArrayRef {
-        let length =
-            scalars
-                .into_iter()
-                .fold(0usize, |r, element: ScalarValue| match element {
-                    ScalarValue::Null => r + 1,
-                    _ => unreachable!(),
-                });
-        new_null_array(&DataType::Null, length)
+    fn iter_to_null_array(
+        scalars: impl IntoIterator<Item = ScalarValue>,
+    ) -> Result<ArrayRef> {
+        let length = scalars.into_iter().try_fold(
+            0usize,
+            |r, element: ScalarValue| match element {
+                ScalarValue::Null => Ok::<usize, DataFusionError>(r + 1),
+                s => {
+                    _internal_err!("Expected ScalarValue::Null element. Received {s:?}")

Review Comment:
   Similar to the previous [comment](https://github.com/apache/arrow-datafusion/pull/7901/files#r1378446881) I made to the `build_*` macros in iter_to_array, this error is reachable. Returning an internal error here aligns with the rest of the function.
   This does not require any code change in client side, since [iter_to_array](url) already returns a Result.



##########
datafusion/common/src/scalar.rs:
##########
@@ -1611,15 +1650,19 @@ impl ScalarValue {
         Ok(array)
     }
 
-    fn iter_to_null_array(scalars: impl IntoIterator<Item = ScalarValue>) -> ArrayRef {
-        let length =
-            scalars
-                .into_iter()
-                .fold(0usize, |r, element: ScalarValue| match element {
-                    ScalarValue::Null => r + 1,
-                    _ => unreachable!(),
-                });
-        new_null_array(&DataType::Null, length)
+    fn iter_to_null_array(
+        scalars: impl IntoIterator<Item = ScalarValue>,
+    ) -> Result<ArrayRef> {
+        let length = scalars.into_iter().try_fold(
+            0usize,
+            |r, element: ScalarValue| match element {
+                ScalarValue::Null => Ok::<usize, DataFusionError>(r + 1),
+                s => {
+                    _internal_err!("Expected ScalarValue::Null element. Received {s:?}")

Review Comment:
   Similar to the previous [comment](https://github.com/apache/arrow-datafusion/pull/7901/files#r1378446881) I made to the `build_*` macros in `iter_to_array`, this error is reachable. Returning an internal error here aligns with the rest of the function.
   This does not require any code change in client side, since [iter_to_array](url) already returns a Result.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1378447915


##########
datafusion/common/src/scalar.rs:
##########
@@ -1539,7 +1574,7 @@ impl ScalarValue {
                             if &inner_key_type == key_type {
                                 Ok(*scalar)
                             } else {
-                                panic!("Expected inner key type of {key_type} but found: {inner_key_type}, value was ({scalar:?})");
+                                _internal_err!("Expected inner key type of {key_type} but found: {inner_key_type}, value was ({scalar:?})")

Review Comment:
   Same as above. This matching branch also has inconsistent handling of errors. Internal error is returned [here](https://github.com/apache/arrow-datafusion/blob/main/datafusion/common/src/scalar.rs#L1438-L1442) while panic is returned [here](https://github.com/apache/arrow-datafusion/blob/main/datafusion/common/src/scalar.rs#L1435)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1378387786


##########
datafusion/common/src/scalar.rs:
##########
@@ -579,24 +581,44 @@ fn dict_from_values<K: ArrowDictionaryKeyType>(
 
 macro_rules! typed_cast_tz {
     ($array:expr, $index:expr, $ARRAYTYPE:ident, $SCALAR:ident, $TZ:expr) => {{
-        let array = $array.as_any().downcast_ref::<$ARRAYTYPE>().unwrap();
-        ScalarValue::$SCALAR(
+        use std::any::type_name;
+        let array = $array
+            .as_any()
+            .downcast_ref::<$ARRAYTYPE>()
+            .ok_or_else(|| {
+                DataFusionError::Internal(format!(
+                    "could not cast value to {}",
+                    type_name::<$ARRAYTYPE>()
+                ))
+            })?;
+        Ok::<ScalarValue, DataFusionError>(ScalarValue::$SCALAR(
             match array.is_null($index) {
                 true => None,
                 false => Some(array.value($index).into()),
             },
             $TZ.clone(),
-        )
+        ))
     }};
 }
 
 macro_rules! typed_cast {
     ($array:expr, $index:expr, $ARRAYTYPE:ident, $SCALAR:ident) => {{
-        let array = $array.as_any().downcast_ref::<$ARRAYTYPE>().unwrap();
-        ScalarValue::$SCALAR(match array.is_null($index) {
-            true => None,
-            false => Some(array.value($index).into()),
-        })
+        use std::any::type_name;
+        let array = $array
+            .as_any()
+            .downcast_ref::<$ARRAYTYPE>()
+            .ok_or_else(|| {
+                DataFusionError::Internal(format!(
+                    "could not cast value to {}",
+                    type_name::<$ARRAYTYPE>()
+                ))
+            })?;

Review Comment:
   Same as above.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1378446881


##########
datafusion/common/src/scalar.rs:
##########
@@ -1309,27 +1338,33 @@ impl ScalarValue {
 
         macro_rules! build_array_list_primitive {
             ($ARRAY_TY:ident, $SCALAR_TY:ident, $NATIVE_TYPE:ident) => {{
-                Arc::new(ListArray::from_iter_primitive::<$ARRAY_TY, _, _>(
-                    scalars.into_iter().map(|x| match x {
-                        ScalarValue::List(arr) => {
-                            if arr.as_any().downcast_ref::<NullArray>().is_some() {
-                                None
-                            } else {
-                                let list_arr = as_list_array(&arr);
-                                let primitive_arr =
-                                    list_arr.values().as_primitive::<$ARRAY_TY>();
-                                Some(
-                                    primitive_arr.into_iter().collect::<Vec<Option<_>>>(),
-                                )
+                Ok::<ArrayRef, DataFusionError>(Arc::new(ListArray::from_iter_primitive::<$ARRAY_TY, _, _>(
+                    scalars
+                        .into_iter()
+                        .map(|x| match x {
+                            ScalarValue::List(arr) => {
+                                if arr.as_any().downcast_ref::<NullArray>().is_some() {
+                                    Ok(None)
+                                } else {
+                                    let list_arr = as_list_array(&arr);
+                                    let primitive_arr =
+                                        list_arr.values().as_primitive::<$ARRAY_TY>();
+                                    Ok(Some(
+                                        primitive_arr
+                                            .into_iter()
+                                            .collect::<Vec<Option<_>>>(),
+                                    ))
+                                }
                             }
-                        }
-                        sv => panic!(
-                            "Inconsistent types in ScalarValue::iter_to_array. \
+                            sv => _internal_err!(
+                                "Inconsistent types in ScalarValue::iter_to_array. \

Review Comment:
   This error should be reachable when the `ScalarValue`s in the iterator [are not all the same type](https://github.com/apache/arrow-datafusion/blob/main/datafusion/common/src/scalar.rs#L1093-L1094). All the other `build_*` macros defined in `iter_to_array` return an internal error (for example [`build_array_list_string`](https://github.com/apache/arrow-datafusion/blob/main/datafusion/common/src/scalar.rs#L1253), [`build_array_primitive`](https://github.com/apache/arrow-datafusion/blob/main/datafusion/common/src/scalar.rs#L1145). Using panic here is inconsistent with the rest of the function.
   This does not require any code change in client side, since `iter_to_array` already returns a `Result`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1378447915


##########
datafusion/common/src/scalar.rs:
##########
@@ -1539,7 +1574,7 @@ impl ScalarValue {
                             if &inner_key_type == key_type {
                                 Ok(*scalar)
                             } else {
-                                panic!("Expected inner key type of {key_type} but found: {inner_key_type}, value was ({scalar:?})");
+                                _internal_err!("Expected inner key type of {key_type} but found: {inner_key_type}, value was ({scalar:?})")

Review Comment:
   Same as above. This matching branch also has inconsistent handling of unreachable errors. Internal error is returned [here](https://github.com/apache/arrow-datafusion/blob/main/datafusion/common/src/scalar.rs#L1438-L1442) while panic is returned [here](https://github.com/apache/arrow-datafusion/blob/main/datafusion/common/src/scalar.rs#L1435)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1381231326


##########
datafusion/common/src/scalar.rs:
##########
@@ -1970,13 +2020,14 @@ impl ScalarValue {
                 ),
             },
             ScalarValue::Fixedsizelist(..) => {
-                unimplemented!("FixedSizeList is not supported yet")

Review Comment:
   @alamb What would be the preferred way to handle unimplemented errors in `datafusion`? There are many places where a `NotImplemented` error is returned instead of using `unimplemented!` and panicking. IMO returning an error makes more sense as user can choose to ignore unimplemented errors instead of panicking and exiting.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1378379142


##########
datafusion/common/src/scalar.rs:
##########
@@ -512,19 +516,19 @@ impl std::hash::Hash for ScalarValue {
 pub fn get_dict_value<K: ArrowDictionaryKeyType>(
     array: &dyn Array,
     index: usize,
-) -> (&ArrayRef, Option<usize>) {
-    let dict_array = as_dictionary_array::<K>(array).unwrap();
-    (dict_array.values(), dict_array.key(index))
+) -> Result<(&ArrayRef, Option<usize>)> {
+    let dict_array = as_dictionary_array::<K>(array)?;

Review Comment:
   This is a public function and `as_dictionary_array` can fail depending on what `array` value the user passes to the function. The [`downcast_value`](https://github.com/apache/arrow-datafusion/blob/main/datafusion/common/src/lib.rs#L73) macro called by `as_dictionary_array` returns an internal error if the cast is impossible. 
   I think returning a `Result` here makes sense because this error is reachable and recoverable (user can handle the error and only passes in arrays that can be downcast to `DictionaryArray`).
   `get_dict_value` is only called in `try_from_array`, which already returns a `Result`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1381209821


##########
datafusion/common/src/scalar.rs:
##########
@@ -2259,71 +2316,63 @@ impl ScalarValue {
 
                 ScalarValue::List(arr)
             }
-            DataType::Date32 => {
-                typed_cast!(array, index, Date32Array, Date32)
-            }
-            DataType::Date64 => {
-                typed_cast!(array, index, Date64Array, Date64)
-            }
+            DataType::Date32 => typed_cast!(array, index, Date32Array, Date32)?,
+            DataType::Date64 => typed_cast!(array, index, Date64Array, Date64)?,
             DataType::Time32(TimeUnit::Second) => {
-                typed_cast!(array, index, Time32SecondArray, Time32Second)
+                typed_cast!(array, index, Time32SecondArray, Time32Second)?
             }
             DataType::Time32(TimeUnit::Millisecond) => {
-                typed_cast!(array, index, Time32MillisecondArray, Time32Millisecond)
+                typed_cast!(array, index, Time32MillisecondArray, Time32Millisecond)?
             }
             DataType::Time64(TimeUnit::Microsecond) => {
-                typed_cast!(array, index, Time64MicrosecondArray, Time64Microsecond)
+                typed_cast!(array, index, Time64MicrosecondArray, Time64Microsecond)?
             }
             DataType::Time64(TimeUnit::Nanosecond) => {
-                typed_cast!(array, index, Time64NanosecondArray, Time64Nanosecond)
-            }
-            DataType::Timestamp(TimeUnit::Second, tz_opt) => {
-                typed_cast_tz!(
-                    array,
-                    index,
-                    TimestampSecondArray,
-                    TimestampSecond,
-                    tz_opt
-                )
-            }
-            DataType::Timestamp(TimeUnit::Millisecond, tz_opt) => {
-                typed_cast_tz!(
-                    array,
-                    index,
-                    TimestampMillisecondArray,
-                    TimestampMillisecond,
-                    tz_opt
-                )
-            }
-            DataType::Timestamp(TimeUnit::Microsecond, tz_opt) => {
-                typed_cast_tz!(
-                    array,
-                    index,
-                    TimestampMicrosecondArray,
-                    TimestampMicrosecond,
-                    tz_opt
-                )
-            }
-            DataType::Timestamp(TimeUnit::Nanosecond, tz_opt) => {
-                typed_cast_tz!(
-                    array,
-                    index,
-                    TimestampNanosecondArray,
-                    TimestampNanosecond,
-                    tz_opt
-                )
+                typed_cast!(array, index, Time64NanosecondArray, Time64Nanosecond)?
             }
+            DataType::Timestamp(TimeUnit::Second, tz_opt) => typed_cast_tz!(
+                array,
+                index,
+                TimestampSecondArray,
+                TimestampSecond,
+                tz_opt
+            )?,
+            DataType::Timestamp(TimeUnit::Millisecond, tz_opt) => typed_cast_tz!(
+                array,
+                index,
+                TimestampMillisecondArray,
+                TimestampMillisecond,
+                tz_opt
+            )?,
+            DataType::Timestamp(TimeUnit::Microsecond, tz_opt) => typed_cast_tz!(
+                array,
+                index,
+                TimestampMicrosecondArray,
+                TimestampMicrosecond,
+                tz_opt
+            )?,
+            DataType::Timestamp(TimeUnit::Nanosecond, tz_opt) => typed_cast_tz!(
+                array,
+                index,
+                TimestampNanosecondArray,
+                TimestampNanosecond,
+                tz_opt
+            )?,
             DataType::Dictionary(key_type, _) => {
                 let (values_array, values_index) = match key_type.as_ref() {
-                    DataType::Int8 => get_dict_value::<Int8Type>(array, index),
-                    DataType::Int16 => get_dict_value::<Int16Type>(array, index),
-                    DataType::Int32 => get_dict_value::<Int32Type>(array, index),
-                    DataType::Int64 => get_dict_value::<Int64Type>(array, index),
-                    DataType::UInt8 => get_dict_value::<UInt8Type>(array, index),
-                    DataType::UInt16 => get_dict_value::<UInt16Type>(array, index),
-                    DataType::UInt32 => get_dict_value::<UInt32Type>(array, index),
-                    DataType::UInt64 => get_dict_value::<UInt64Type>(array, index),
-                    _ => unreachable!("Invalid dictionary keys type: {:?}", key_type),

Review Comment:
   See previous [comment](https://github.com/apache/arrow-datafusion/pull/7901/files#r1381195593).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

Posted by "junjunjd (via GitHub)" <gi...@apache.org>.
junjunjd commented on code in PR #7901:
URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1379688238


##########
datafusion/common/src/scalar.rs:
##########
@@ -676,13 +696,13 @@ macro_rules! build_values_list {
                     ScalarValue::$SCALAR_TY(None) => {
                         builder.values().append_null();
                     }
-                    _ => panic!("Incompatible ScalarValue for list"),

Review Comment:
   This error can be unreachable if invariant is established that the `data_type` and `values` arguments passed to [`new_list`](https://github.com/apache/arrow-datafusion/blob/main/datafusion/common/src/scalar.rs#L1664) align with each other. 
   I will change this back to `panic` and thus making `new_list` panic instead of returning `Result`.



##########
datafusion/common/src/scalar.rs:
##########
@@ -676,13 +696,13 @@ macro_rules! build_values_list {
                     ScalarValue::$SCALAR_TY(None) => {
                         builder.values().append_null();
                     }
-                    _ => panic!("Incompatible ScalarValue for list"),

Review Comment:
   This error can be unreachable if an invariant is established that the `data_type` and `values` arguments passed to [`new_list`](https://github.com/apache/arrow-datafusion/blob/main/datafusion/common/src/scalar.rs#L1664) align with each other. 
   I will change this back to `panic` and thus making `new_list` panic instead of returning `Result`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org