Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/08/07 12:22:45 UTC

[GitHub] [arrow-rs] alamb opened a new issue #672: Add a way to get the dictionary index and values array reference

alamb opened a new issue #672:
URL: https://github.com/apache/arrow-rs/issues/672


   **Is your feature request related to a problem or challenge? Please describe what you are trying to do.**
   To work with the value of a dictionary array at a particular index, the following pattern shows up many times in the arrow-rs and datafusion codebase:
   
   ```rust
       fn do_something<K: ArrowDictionaryKeyType>(
           array: &ArrayRef,
           index: usize,
       ) -> Result<Self> {
           let dict_array = array.as_any().downcast_ref::<DictionaryArray<K>>().unwrap();
   
           // look up the index in the values dictionary
           let keys_col = dict_array.keys();
           let values_index = keys_col.value(index).to_usize().ok_or_else(|| {
               DataFusionError::Internal(format!(
                   "Can not convert index to usize in dictionary of type creating group by value {:?}",
                   keys_col.data_type()
               ))
           })?;
           
           // do actual work with dict_array.values() and values_index
           // ...
       }
   ```
   Repeating this "find the index" code is tedious.
   
   **Describe the solution you'd like**
   
   Add a function such as the following to `DictionaryArray` (would love suggestions for better names):
   
   ```rust
   impl<K: ArrowDictionaryKeyType> DictionaryArray<K> {
       // return the index into the dictionary values for the element at
       // `index`, as well as the dictionary values
       #[inline]
       fn dict_value(&self, index: usize) -> Result<(&ArrayRef, usize)> {
           // look up the index in the values dictionary
           let keys_col = self.keys();
           let values_index = keys_col.value(index).to_usize().ok_or_else(|| {
               DataFusionError::Internal(format!(
                   "Can not convert index to usize in dictionary of type creating group by value {:?}",
                   keys_col.data_type()
               ))
           })?;

           Ok((self.values(), values_index))
       }
   }
   ```
   
   **Describe alternatives you've considered**
   
   **Additional context**
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] alamb commented on issue #672: Add a way to get the dictionary index and values array reference

Posted by GitBox <gi...@apache.org>.
alamb commented on issue #672:
URL: https://github.com/apache/arrow-rs/issues/672#issuecomment-895198607


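   Note the bug in the above code -- it does not handle NULLs correctly! If the key at `index` is NULL, `keys_col.value(index)` still returns a value, and that value is then treated as a valid index into the dictionary values.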
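
   One way a NULL-aware lookup could look is sketched below. This is only an illustration under stated assumptions: `DictModel` is a hypothetical std-only stand-in (not the arrow-rs `DictionaryArray`), NULL keys are modelled as `None`, and errors are plain `String`s rather than `DataFusionError`. The point is the return shape: a NULL slot becomes `Ok(None)` instead of a made-up index.

   ```rust
   use std::convert::TryFrom;

   /// Hypothetical stand-in for a dictionary-encoded array (std types only).
   /// `keys[i]` is `None` when slot `i` is NULL, otherwise an index into `values`.
   struct DictModel {
       keys: Vec<Option<i32>>,
       values: Vec<String>,
   }

   impl DictModel {
       /// Returns `Ok(None)` for a NULL slot, `Ok(Some((values, values_index)))`
       /// for a valid slot, and `Err` if `index` is out of bounds or the key
       /// cannot be used as an index into `values`.
       fn dict_value(&self, index: usize) -> Result<Option<(&[String], usize)>, String> {
           let slot = match self.keys.get(index) {
               Some(slot) => *slot,
               None => {
                   return Err(format!(
                       "index {} out of bounds for length {}",
                       index,
                       self.keys.len()
                   ))
               }
           };

           // NULL slot: there is no meaningful key, so report the NULL explicitly
           let key = match slot {
               Some(key) => key,
               None => return Ok(None),
           };

           // negative keys cannot be converted to usize
           let values_index = usize::try_from(key)
               .map_err(|_| format!("Can not convert key {} to usize", key))?;

           // guard against keys that point past the end of the dictionary
           if values_index >= self.values.len() {
               return Err(format!(
                   "key {} out of bounds for {} dictionary values",
                   values_index,
                   self.values.len()
               ));
           }

           Ok(Some((self.values.as_slice(), values_index)))
       }
   }
   ```

   With this shape, callers have to decide what a NULL means for them (skip it, emit a NULL, and so on) rather than silently reading whatever key happens to be stored in the NULL slot.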

