Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/03/08 21:21:43 UTC

[GitHub] [arrow-rs] jorgecarleitao commented on a change in pull request #1407: Add dictionary support for C data interface

jorgecarleitao commented on a change in pull request #1407:
URL: https://github.com/apache/arrow-rs/pull/1407#discussion_r822023775



##########
File path: arrow/src/ffi.rs
##########
@@ -555,18 +594,22 @@ pub trait ArrowArrayRef {
     // for variable-sized buffers, such as the second buffer of a stringArray, we need
     // to fetch offset buffer's len to build the second buffer.
     fn buffer_len(&self, i: usize) -> Result<usize> {
-        // Inner type is not important for buffer length.
-        let data_type = &self.data_type()?;
+        // Special handling for dictionary type as we only care about the key type in the case.
+        let data_type = match &self.data_type()? {
+            DataType::Dictionary(key_data_type, _) => key_data_type.as_ref().clone(),

Review comment:
       Could we avoid the clone here by returning a reference instead?
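
       A minimal sketch of the borrowing variant suggested above. The `DataType` enum below is a simplified stand-in rather than arrow's real type, and `buffer_layout_type` is a hypothetical helper name that is not part of the PR. Because `self.data_type()?` in the diff yields an owned value, the caller would presumably need to bind that value to a local first and then borrow from it.

```rust
// Simplified stand-in for arrow's `DataType` (assumption: only the
// variants needed for this sketch are modelled).
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int8,
    Utf8,
    Dictionary(Box<DataType>, Box<DataType>),
}

/// Hypothetical helper: returns the key type for a dictionary and the
/// type itself otherwise. The result borrows from `data_type`, so no
/// `DataType` is cloned.
fn buffer_layout_type(data_type: &DataType) -> &DataType {
    match data_type {
        DataType::Dictionary(key_data_type, _) => key_data_type.as_ref(),
        other => other,
    }
}
```

       With this shape, `buffer_len` could keep the owned `DataType` in a local binding and match on the borrowed result, at the cost of one extra `let`.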

##########
File path: arrow/src/array/ffi.rs
##########
@@ -127,4 +128,14 @@ mod tests {
         let data = array.data();
         test_round_trip(data)
     }
+
+    #[test]
+    fn test_dictionary() -> Result<()> {

Review comment:
       Would it be worth having an example with validity (nulls in both the keys and the values), so that we cover the most complex use case?
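
       To illustrate why nulls in both places form the most complex case, here is a small model of how a dictionary-encoded array resolves its logical values. It uses plain slices of `Option` instead of arrow's array types, and `decode_dictionary` is a hypothetical name chosen for this sketch. A slot is null when its key is null or when the dictionary value it points to is null.

```rust
/// Hypothetical model of dictionary decoding with validity on both sides.
/// `keys[i]` is `None` for a null key; `values[k]` is `None` for a null
/// dictionary entry. An out-of-range key is treated as null here.
fn decode_dictionary<'a>(
    keys: &[Option<i8>],
    values: &[Option<&'a str>],
) -> Vec<Option<&'a str>> {
    keys.iter()
        .map(|key| key.and_then(|k| values.get(k as usize).copied().flatten()))
        .collect()
}
```

       For keys `[Some(0), None, Some(1), Some(0)]` and values `[Some("a"), None]`, the second slot is null because of the key and the third because of the value. A round-trip test would need to preserve both kinds of null.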

##########
File path: arrow-pyarrow-integration-testing/tests/test_sql.py
##########
@@ -263,3 +256,13 @@ def test_decimal_python():
     assert a == b
     del a
     del b
+
+def test_dictionary_python():
+    """
+    Python -> Rust -> Python
+    """
+    a = pa.array(["a", "b", "a"], type=pa.dictionary(pa.int8(), pa.string()))

Review comment:
       Same here: could this test also include validity (nulls in both the keys and the values)?



