
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/04/11 04:04:02 UTC

[GitHub] [arrow] jorgecarleitao commented on pull request #9976: ARROW-12290: [Rust][DataFusion] Add input_file_name function [WIP]

jorgecarleitao commented on pull request #9976:
URL: https://github.com/apache/arrow/pull/9976#issuecomment-817243506


   Isn't the filename that a column came from uniquely identified by the logical plan? If two physical plans arrive at different conclusions about a column's provenance, then those physical plans are using two different data origins, which implies different semantics.
   
   This is the rationale I was using to recommend addressing this at the DAG level. I do agree that the logical plan may not have all the information about how the source is partitioned and its exact names (as that may even change over time), but I would expect that to be resolved as part of the query execution (just like we resolve the physical plan when we run `SHOW PLAN`).
   
   @seddonm1, I think that a physical expression does not need to be a `ScalarFunctionExpr`: `ScalarFunctionExpr` is useful for the cases where the physical operation can be described by a simple function and signature. Check, e.g., how the binary operators are defined: they have their own custom `struct` that implements `PhysicalExpr`.
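
To illustrate the shape of that idea, here is a minimal, self-contained sketch. It does not use DataFusion's real types: the `PhysicalExpr` trait below is a simplified stand-in with a made-up `evaluate` signature, and `Batch`, `source_file`, and `InputFileNameExpr` are hypothetical names chosen only for this example. It assumes the batch somehow carries the name of the file it was read from, which is exactly the part this PR would still need to solve.

```rust
use std::fmt;

/// Hypothetical stand-in for a record batch that remembers its origin.
/// (Assumption: the real batch type carries no such field today.)
#[derive(Debug, Clone)]
struct Batch {
    source_file: String,
    num_rows: usize,
}

/// Simplified stand-in for a physical expression trait.
/// (Assumption: the real `PhysicalExpr` trait has a richer signature.)
trait PhysicalExpr: fmt::Debug {
    /// Produce one output value per row of the input batch.
    fn evaluate(&self, batch: &Batch) -> Vec<String>;
}

/// A dedicated struct for the expression, instead of routing it through
/// a generic scalar-function wrapper. It takes no arguments.
#[derive(Debug)]
struct InputFileNameExpr;

impl PhysicalExpr for InputFileNameExpr {
    fn evaluate(&self, batch: &Batch) -> Vec<String> {
        // Repeat the batch's file name once per row.
        vec![batch.source_file.clone(); batch.num_rows]
    }
}

fn main() {
    let batch = Batch {
        source_file: "part-0.parquet".to_string(),
        num_rows: 3,
    };
    // The planner would hold the expression behind a trait object.
    let expr: Box<dyn PhysicalExpr> = Box::new(InputFileNameExpr);
    println!("{:?}", expr.evaluate(&batch));
}
```

The point of the dedicated struct is that the expression can read context from the batch (here, its origin) that a plain function over column arguments would never see.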

