Posted to github@arrow.apache.org by "wjones127 (via GitHub)" <gi...@apache.org> on 2023/05/10 01:10:38 UTC

[GitHub] [arrow] wjones127 commented on issue #33986: [Python][Rust] Create extension point in python for Dataset/Scanner

wjones127 commented on issue #33986:
URL: https://github.com/apache/arrow/issues/33986#issuecomment-1541113978

   I'm suddenly rather interested in seeing this through. I've also had a change of heart and now think that either what Chang is proposing (an ABC) or what Joris is proposing (a protocol) is the way to go. An ABC seems straightforward, but I'm eager to chat with Joris about why a protocol like the DataFrame protocol might make more sense. (Or maybe both could be combined, with the protocol returning something that subclasses the ABC?)
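
To make the three options concrete, here is a rough sketch in plain Python. Every name in it (`DatasetABC`, `SupportsArrowDataset`, `__arrow_dataset__`, `InMemoryDataset`, `ThirdPartyTable`, `read_rows`) is hypothetical and chosen only for illustration. None of it is existing PyArrow API, and rows are modeled as plain dicts rather than Arrow record batches to keep the example self-contained.

```python
from abc import ABC, abstractmethod
from typing import Iterator, List, Optional, Protocol, runtime_checkable


class DatasetABC(ABC):
    """Option 1 (hypothetical): producers subclass a shared abstract base class."""

    @abstractmethod
    def scan(self, columns: Optional[List[str]] = None) -> Iterator[dict]:
        """Yield rows as dicts, optionally restricted to `columns`."""


@runtime_checkable
class SupportsArrowDataset(Protocol):
    """Option 2 (hypothetical): producers only need to expose a dunder method."""

    def __arrow_dataset__(self) -> DatasetABC:
        # Combined option: the protocol hands back something that
        # subclasses the ABC.
        ...


class InMemoryDataset(DatasetABC):
    """A concrete ABC subclass backed by a list of dict rows."""

    def __init__(self, rows):
        self._rows = list(rows)

    def scan(self, columns=None):
        for row in self._rows:
            yield row if columns is None else {name: row[name] for name in columns}


class ThirdPartyTable:
    """Stand-in for another library's table type; it inherits from nothing."""

    def __init__(self, rows):
        self._rows = rows

    def __arrow_dataset__(self) -> DatasetABC:
        return InMemoryDataset(self._rows)


def read_rows(source, columns=None):
    """Consumer that accepts an ABC instance or any protocol implementer."""
    if isinstance(source, DatasetABC):
        dataset = source
    elif isinstance(source, SupportsArrowDataset):
        dataset = source.__arrow_dataset__()
    else:
        raise TypeError(f"unsupported dataset source: {type(source).__name__}")
    return list(dataset.scan(columns))
```

In this sketch the consumer only ever programs against the ABC, while a producer such as `ThirdPartyTable` can opt in without importing or subclassing anything, which is roughly the trade-off between the two proposals.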
   
   I've started thinking about this in a Google doc: [Making Arrow dataset into a protocol](https://docs.google.com/document/d/1r56nt5Un2E7yPrZO9YPknBN4EDtptpx-tqOZReHvq1U/edit?usp=sharing)
   
   I also wrote up another doc to share the perspective of delta-rs/deltalake on PyArrow Datasets: [PyArrow Datasets and Python deltalake](https://docs.google.com/document/d/1XGg1pf9Nep9GHlSdvO65Ao1kyQ_Z_g55uyHuTYVyeT0/edit?usp=sharing)

