Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/08/25 12:13:55 UTC

[GitHub] [arrow-rs] tustvold commented on a diff in pull request #2589: Validate dictionary key in TypedDictionaryArray (#2578)

tustvold commented on code in PR #2589:
URL: https://github.com/apache/arrow-rs/pull/2589#discussion_r954889181


##########
arrow/src/array/array_dictionary.rs:
##########
@@ -491,21 +490,30 @@ where
     K: ArrowPrimitiveType,
     V: Sync + Send,
     &'a V: ArrayAccessor,
+    <&'a V as ArrayAccessor>::Item: Default,
 {
     type Item = <&'a V as ArrayAccessor>::Item;
 
     fn value(&self, index: usize) -> Self::Item {
-        assert!(self.dictionary.is_valid(index), "{}", index);
-        let value_idx = self.dictionary.keys.value(index).to_usize().unwrap();
-        // Dictionary indexes should be valid
-        unsafe { self.values.value_unchecked(value_idx) }
+        assert!(
+            index < self.len(),
+            "Trying to access an element at index {} from a TypedDictionaryArray of length {}",
+            index,
+            self.len()
+        );
+        unsafe { self.value_unchecked(index) }
     }
 
     unsafe fn value_unchecked(&self, index: usize) -> Self::Item {
         let val = self.dictionary.keys.value_unchecked(index);
         let value_idx = val.to_usize().unwrap();
-        // Dictionary indexes should be valid
-        self.values.value_unchecked(value_idx)
+
+        // As dictionary keys are only verified for non-null indexes
+        // we must check the value is within bounds
+        match value_idx < self.dictionary.len() {

Review Comment:
   As @crepererum pointed out, the fact that the value is dictionary-encoded in the first place suggests that operations on it are non-trivial (i.e. it is probably a string). The cost of this bounds check is therefore likely irrelevant, especially since, because of #1918, we already do a checked conversion to usize anyway.
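
   To illustrate the shape of the check, here is a minimal standalone sketch. It does not use the arrow crate: `ToyDictionary` and its fields are hypothetical stand-ins, and falling back to an empty string for a negative or out-of-range key is an assumption made for the example, not necessarily what this PR does.

```rust
/// Hypothetical stand-in for a dictionary-encoded string array.
/// Not the arrow-rs type; the names here are illustrative only.
struct ToyDictionary {
    /// One key per slot. Keys in null slots are never validated,
    /// so they may hold arbitrary values.
    keys: Vec<i32>,
    /// `true` where the slot is non-null.
    validity: Vec<bool>,
    /// The dictionary values that the keys index into.
    values: Vec<String>,
}

impl ToyDictionary {
    fn len(&self) -> usize {
        self.keys.len()
    }

    fn is_valid(&self, index: usize) -> bool {
        self.validity[index]
    }

    /// Returns the value at `index`, or "" if the key does not point
    /// into `values` (assumed fallback, e.g. for a null slot).
    fn value(&self, index: usize) -> &str {
        assert!(
            index < self.len(),
            "index {} out of bounds for length {}",
            index,
            self.len()
        );
        // Checked conversion: a negative key yields None instead of wrapping.
        let value_idx = usize::try_from(self.keys[index]).ok();
        // Bounds check on the dictionary values: `get` returns None when the
        // key is out of range, and we fall back to the empty string.
        value_idx
            .and_then(|i| self.values.get(i))
            .map(String::as_str)
            .unwrap_or("")
    }
}
```

   With `keys = [1, 0, 99]`, `values = ["a", "b"]` and only the last slot null, `value(0)` is "b", `value(1)` is "a", and `value(2)` is the empty fallback rather than an out-of-bounds read. The extra comparison is one branch next to a string access, which is why its cost is unlikely to matter.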


