Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/04/03 17:55:49 UTC

[GitHub] [arrow] alamb commented on a change in pull request #9682: ARROW-7364: [Rust][DataFusion] Add cast options to cast kernel and TRY_CAST to DataFusion

alamb commented on a change in pull request #9682:
URL: https://github.com/apache/arrow/pull/9682#discussion_r606691319



##########
File path: rust/arrow/src/compute/kernels/cast.rs
##########
@@ -937,114 +1048,169 @@ where
         from.as_any()
             .downcast_ref::<GenericStringArray<Offset>>()
             .unwrap(),
-    )))
+        cast_options,
+    )?))
 }
 
 fn string_to_numeric_cast<T, Offset: StringOffsetSizeTrait>(
     from: &GenericStringArray<Offset>,
-) -> PrimitiveArray<T>
+    cast_options: &CastOptions,
+) -> Result<PrimitiveArray<T>>
 where
     T: ArrowNumericType,
     <T as ArrowPrimitiveType>::Native: lexical_core::FromLexical,
 {
-    let iter = (0..from.len()).map(|i| {
-        if from.is_null(i) {
-            None
-        } else {
-            lexical_core::parse(from.value(i).as_bytes()).ok()
-        }
-    });
+    let vec = (0..from.len())
+        .map(|i| {
+            if from.is_null(i) {
+                Ok(None)
+            } else {
+                let string = from.value(i);
+                let result = lexical_core::parse(string.as_bytes());
+                if cast_options.safe {
+                    Ok(result.ok())
+                } else {
+                    Some(result.map_err(|_| {
+                        ArrowError::CastError(format!(
+                            "Cannot cast string '{}' to value of {} type",
+                            string,
+                            std::any::type_name::<T>()
+                        ))
+                    }))
+                    .transpose()
+                }
+            }
+        })
+        .collect::<Result<Vec<Option<_>>>>()?;

Review comment:
       I wonder if we need the `collect` here (and in the few places below)? It seems like we might be able to use the iterator directly in the case where `cast_options.safe == true`.
   
   I worry that unsafe casts may get slower if we always create a `Vec` for them.
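
A minimal sketch of the idea, using only `std` types so it stands on its own. Everything here is an assumption for illustration: the `CastOptions` struct is a stand-in with just the `safe` flag, `build_array` is a hypothetical placeholder for constructing the output array from an iterator, and `FromStr` replaces `lexical_core::parse`. It is not the code in the PR.

```rust
use std::str::FromStr;

/// Hypothetical stand-in for the kernel's cast options (only the `safe` flag).
pub struct CastOptions {
    pub safe: bool,
}

/// Hypothetical stand-in for building the output array from an iterator.
/// A plain `Vec<Option<T>>` plays the role of the primitive array here.
fn build_array<T, I>(iter: I) -> Vec<Option<T>>
where
    I: Iterator<Item = Option<T>>,
{
    iter.collect()
}

pub fn string_to_numeric_cast<T>(
    from: &[Option<&str>],
    cast_options: &CastOptions,
) -> Result<Vec<Option<T>>, String>
where
    T: FromStr,
{
    if cast_options.safe {
        // Safe cast: a parse failure becomes a null, so no element can
        // produce an error and the iterator feeds the builder directly.
        let iter = from
            .iter()
            .copied()
            .map(|v| v.and_then(|s| s.parse::<T>().ok()));
        Ok(build_array(iter))
    } else {
        // Non-safe cast: the first parse failure must abort the cast, so the
        // items are collected into a `Result` before the array is built.
        let vec = from
            .iter()
            .copied()
            .map(|v| {
                v.map(|s| {
                    s.parse::<T>().map_err(|_| {
                        format!(
                            "Cannot cast string '{}' to value of {} type",
                            s,
                            std::any::type_name::<T>()
                        )
                    })
                })
                .transpose()
            })
            .collect::<Result<Vec<Option<T>>, String>>()?;
        Ok(build_array(vec.into_iter()))
    }
}
```

With this split, only the branch that can actually fail pays for the intermediate `Vec`; the `safe == true` branch keeps the single pass it had before the change.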
   



