Posted to github@arrow.apache.org by "wjones127 (via GitHub)" <gi...@apache.org> on 2023/02/25 18:56:53 UTC

[GitHub] [arrow-rs] wjones127 opened a new issue, #3764: Common trait for RecordBatch and StructArray

wjones127 opened a new issue, #3764:
URL: https://github.com/apache/arrow-rs/issues/3764

   **Is your feature request related to a problem or challenge? Please describe what you are trying to do.**
   
   I wanted to write some code that was generic over `StructArray` and `RecordBatch`, but it appears they don't have the same traits (although they are easily convertible).
   
   It seems like there are many shared methods that could be brought into a common trait:
   
    * `column()`
    * `column_by_name()`
    * `columns()`
    * `num_columns()`
    * `num_rows()`
   
   **Describe the solution you'd like**
   
   I'd propose a trait `ArrowTabular` that has these methods.
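   
   A minimal sketch of what such a trait might look like. To keep it compilable without the arrow crate, `ColumnRef`, `ToyBatch`, and `ToyStruct` are hypothetical stand-ins for `ArrayRef`, `RecordBatch`, and `StructArray`; the method signatures are assumptions, not the arrow-rs API.
   
   ```rust
   use std::sync::Arc;

   // Stand-in for arrow's `ArrayRef` (assumption: a nullable string column).
   type ColumnRef = Arc<Vec<Option<String>>>;

   /// Hypothetical trait collecting the shared accessors listed above.
   trait ArrowTabular {
       fn columns(&self) -> &[ColumnRef];
       fn column_by_name(&self, name: &str) -> Option<&ColumnRef>;
       fn num_rows(&self) -> usize;

       // Accessors that can be derived from `columns()` get default bodies.
       fn column(&self, index: usize) -> &ColumnRef {
           &self.columns()[index]
       }
       fn num_columns(&self) -> usize {
           self.columns().len()
       }
   }

   /// Toy stand-in for `RecordBatch`.
   struct ToyBatch {
       names: Vec<String>,
       columns: Vec<ColumnRef>,
       num_rows: usize,
   }

   /// Toy stand-in for `StructArray`.
   struct ToyStruct {
       names: Vec<String>,
       columns: Vec<ColumnRef>,
       len: usize,
   }

   impl ArrowTabular for ToyBatch {
       fn columns(&self) -> &[ColumnRef] {
           &self.columns
       }
       fn column_by_name(&self, name: &str) -> Option<&ColumnRef> {
           let index = self.names.iter().position(|n| n == name)?;
           self.columns.get(index)
       }
       fn num_rows(&self) -> usize {
           self.num_rows
       }
   }

   impl ArrowTabular for ToyStruct {
       fn columns(&self) -> &[ColumnRef] {
           &self.columns
       }
       fn column_by_name(&self, name: &str) -> Option<&ColumnRef> {
           let index = self.names.iter().position(|n| n == name)?;
           self.columns.get(index)
       }
       fn num_rows(&self) -> usize {
           self.len
       }
   }

   /// Generic over either container: reads one cell as an optional string.
   fn get_str<T: ArrowTabular>(table: &T, col: usize, row: usize) -> Option<&str> {
       table.column(col)[row].as_deref()
   }

   /// Sample data shared by both toy containers (hypothetical column names).
   fn sample_columns() -> (Vec<String>, Vec<ColumnRef>) {
       let names = vec!["catalog".to_string(), "schema".to_string()];
       let columns = vec![
           Arc::new(vec![Some("main".to_string()), None]),
           Arc::new(vec![Some("public".to_string()), Some("temp".to_string())]),
       ];
       (names, columns)
   }
   ```
   
   With this shape, `get_str` accepts a `&ToyBatch` or a `&ToyStruct` interchangeably, which is the kind of generic code the proposal is meant to enable.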
   
   **Describe alternatives you've considered**
   
   I'm open to alternatives, but couldn't think of any. In any case, I'm implementing a trait for my own use case and if folks are open to this I could upstream it into arrow-rs.
   
   **Additional context**
   
   For context, the thing I am trying to implement is:
   
   ```rust
   use arrow::array::{as_string_array, Array};

   /// Represents a row in an [`ArrowTabular`] (the proposed trait).
   #[derive(Debug, Clone, Copy)]
   struct RowReference<'a, T: ArrowTabular + 'a> {
       pub inner: &'a T,
       pub pos: usize,
   }
   
   impl<'a, T: ArrowTabular + 'a> RowReference<'a, T> {
       /// Returns the string in column `col_i` for this row, or `None` if it is null.
       /// Panics if the column is not a string array.
       fn get_str(&'a self, col_i: usize) -> Option<&'a str> {
           let column = as_string_array(self.inner.column(col_i));
           if column.is_null(self.pos) {
               None
           } else {
               Some(column.value(self.pos))
           }
       }
   }
   ```
   
   This is to help navigate the highly nested schema in [ADBC's GetObjects output](https://github.com/apache/arrow-adbc/blob/e79edafdb02339664ea735097f8c1edc0ea052de/adbc.h#L815), where an entry might be a row of a record batch or a row in a struct array, depending on which level you are at.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-rs] tustvold closed issue #3764: Common trait for RecordBatch and StructArray

Posted by "tustvold (via GitHub)" <gi...@apache.org>.
tustvold closed issue #3764: Common trait for RecordBatch and StructArray
URL: https://github.com/apache/arrow-rs/issues/3764




[GitHub] [arrow-rs] tustvold commented on issue #3764: Common trait for RecordBatch and StructArray

Posted by "tustvold (via GitHub)" <gi...@apache.org>.
tustvold commented on issue #3764:
URL: https://github.com/apache/arrow-rs/issues/3764#issuecomment-1445303228

   The way that the parquet reader, etc. handle this is to just use `StructArray` and convert to/from `RecordBatch` where necessary. Perhaps that would work for your use case without adding complexity? Generics, especially recursive ones, get painful quickly 😅
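   
   A rough sketch of why converting once at the top level avoids the generics. The types below are hypothetical toy stand-ins (not the arrow-rs types), and the field names are made up for illustration; the point is only that after one conversion a single non-generic recursive function covers every nesting level.
   
   ```rust
   /// Toy column: a leaf of optional strings or a nested struct.
   /// Stand-in for arrow arrays (assumption for illustration only).
   enum ToyColumn {
       Utf8(Vec<Option<String>>),
       Struct(ToyStruct),
   }

   /// Toy stand-in for `StructArray`: named child columns.
   struct ToyStruct {
       fields: Vec<(String, ToyColumn)>,
   }

   /// Toy stand-in for `RecordBatch`: the top-level container.
   struct ToyBatch {
       fields: Vec<(String, ToyColumn)>,
   }

   /// Convert once at the top level, mirroring a batch-to-struct conversion.
   impl From<ToyBatch> for ToyStruct {
       fn from(batch: ToyBatch) -> Self {
           ToyStruct { fields: batch.fields }
       }
   }

   /// One non-generic recursive walk that collects dotted paths of leaf columns.
   fn leaf_paths(node: &ToyStruct, prefix: &str, out: &mut Vec<String>) {
       for (name, column) in &node.fields {
           let path = if prefix.is_empty() {
               name.clone()
           } else {
               format!("{}.{}", prefix, name)
           };
           match column {
               ToyColumn::Utf8(_) => out.push(path),
               ToyColumn::Struct(child) => leaf_paths(child, &path, out),
           }
       }
   }

   /// Builds a small nested batch with hypothetical field names.
   fn sample_batch() -> ToyBatch {
       let schema_level = ToyStruct {
           fields: vec![(
               "schema".to_string(),
               ToyColumn::Utf8(vec![Some("public".to_string())]),
           )],
       };
       ToyBatch {
           fields: vec![
               (
                   "catalog".to_string(),
                   ToyColumn::Utf8(vec![Some("main".to_string())]),
               ),
               ("schemas".to_string(), ToyColumn::Struct(schema_level)),
           ],
       }
   }
   ```
   
   Calling `leaf_paths` on `ToyStruct::from(sample_batch())` visits the top level and the nested level through the same code path, with no trait bound needed. With the real types, arrow-rs provides `From` conversions between `RecordBatch` and `StructArray` for the top-level step.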




[GitHub] [arrow-rs] wjones127 commented on issue #3764: Common trait for RecordBatch and StructArray

Posted by "wjones127 (via GitHub)" <gi...@apache.org>.
wjones127 commented on issue #3764:
URL: https://github.com/apache/arrow-rs/issues/3764#issuecomment-1445408036

   I see. If I convert the top-level to `StructArray`, then it's `StructArray` all the way down! That seems like a reasonable simplification. 👍 

