Posted to github@arrow.apache.org by "jorisvandenbossche (via GitHub)" <gi...@apache.org> on 2023/02/27 11:42:24 UTC

[GitHub] [arrow] jorisvandenbossche commented on pull request #34234: GH-34235: [Python] Add `join_asof` binding

jorisvandenbossche commented on PR #34234:
URL: https://github.com/apache/arrow/pull/34234#issuecomment-1446181964

   @judahrand thanks for working on this!
   
   Before diving into the details, I have two general comments:
   
   - We are in the middle of refactoring how we expose the Acero / ExecPlan features, and I have a PR that exposes the Declaration object and the ExecNodeOptions subclasses in Python (https://github.com/apache/arrow/pull/34102). Once that is merged, the goal should be to expose the asof join as well by adding an `AsofJoinNodeOptions` class in pyarrow.
     Ideally, I would prefer that we do this directly, but I know that the mentioned PR isn't merged yet (I hope it can be merged in the coming days, though).
   
   - I think we should try to do a better job of explaining in the docstring what exactly an "asof" join is and does, since I think this is generally not a very well-known join type: what is the difference with a normal join? What is the difference between the "on" and "by" join keys? (I also noticed that AsOfJoin is missing from the C++ user guide, although it has some reference docs; I will open an issue about that.)
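
To make the "on" versus "by" question concrete, here is a minimal pure-Python sketch of a backward asof join. It only illustrates the general semantics and is not the Acero implementation or the binding proposed in this PR; the `asof_join` helper, the column names, and the sample data are all hypothetical. The assumption is that the "on" key is an ordered column (such as a timestamp) matched to the nearest earlier value within a tolerance, while the "by" key must match exactly, as in a normal equi-join.

```python
from bisect import bisect_right
from collections import defaultdict


def asof_join(left, right, on, by, tolerance):
    """Hypothetical helper: backward asof join over lists of dict rows.

    For each left row, attach the most recent right row that has the same
    `by` value and an `on` value <= the left row's `on` value, provided the
    gap does not exceed `tolerance`. Unmatched rows get None.
    """
    # Columns carried over from the right side (everything except the keys).
    right_cols = [c for c in (right[0] if right else {}) if c not in (on, by)]

    # "by" key: exact match, so group the right rows by it.
    # "on" key: ordered match, so keep each group sorted by it.
    groups = defaultdict(list)
    for row in sorted(right, key=lambda r: r[on]):
        groups[row[by]].append(row)

    result = []
    for lrow in left:
        candidates = groups.get(lrow[by], [])
        keys = [r[on] for r in candidates]
        # Index of the last right row with on-value <= the left on-value.
        idx = bisect_right(keys, lrow[on]) - 1
        match = None
        if idx >= 0 and lrow[on] - keys[idx] <= tolerance:
            match = candidates[idx]
        merged = dict(lrow)
        for col in right_cols:
            merged[col] = match[col] if match is not None else None
        result.append(merged)
    return result


# Illustrative data only: trades joined to the latest quote per ticker.
trades = [
    {"time": 3, "ticker": "A", "qty": 10},
    {"time": 7, "ticker": "B", "qty": 5},
    {"time": 9, "ticker": "A", "qty": 2},
]
quotes = [
    {"time": 1, "ticker": "A", "price": 100.0},
    {"time": 2, "ticker": "B", "price": 50.0},
    {"time": 6, "ticker": "A", "price": 101.0},
    {"time": 8, "ticker": "B", "price": 51.0},
]

joined = asof_join(trades, quotes, on="time", by="ticker", tolerance=4)
for row in joined:
    print(row)
```

In this sketch, the trade at time 7 for ticker "B" gets no price: the nearest earlier quote at time 6 belongs to ticker "A" (the "by" key differs), and the latest "B" quote at time 2 is further away than the tolerance. A normal join on `time` would instead match none of these rows, because no trade shares an exact timestamp with a quote.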


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
