You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by "LouisGariepy (via GitHub)" <gi...@apache.org> on 2023/05/25 02:16:36 UTC

[GitHub] [arrow-datafusion] LouisGariepy opened a new issue, #6437: Inconsistencies in `Expr` operator code.

LouisGariepy opened a new issue, #6437:
URL: https://github.com/apache/arrow-datafusion/issues/6437

   # Problem
   
   Some operations are implemented both directly on `Expr` and as built-in traits ([trait impl](https://github.com/apache/arrow-datafusion/blob/32144501be634253b254aca34660b4b3e81dbc1f/datafusion/expr/src/operator.rs#L299-L305C2) compared with [type impl](https://github.com/apache/arrow-datafusion/blob/32144501be634253b254aca34660b4b3e81dbc1f/datafusion/expr/src/expr.rs#LL744C1-L746C6)).
   
   As far as I can tell, these implementations could (and should) use the same underlying logic (the trait impl should use the type impl, or vice-versa). Instead, the logic is duplicated at both sites (as you can see in from the snippets I linked). This increases the potential for bugs as it would be very easy to change say the type impl and not the trait impl or vice-versa.
   
   Another inconsistency is that the type methods don't have the same name as the corresponding trait method, so we end up in a situation where, for example, `bitwise_and` (type impl), `bitand` and `&` (trait impl) all do the same thing. This can cause confusion for users since it's not immediately clear that these are in fact intended to be the same.
   
   Finally, the `Not` trait implementation is located in the `expr` module ([here](https://github.com/apache/arrow-datafusion/blob/32144501be634253b254aca34660b4b3e81dbc1f/datafusion/expr/src/expr.rs#L912-L938)), while all other operator trait impls are located in the `operator` module.
   
   # Potential solution
   In my opinion, the `Expr` impl should not duplicate built-in operator trait methods. The duplicated methods should be removed from `Expr`, and only the trait impls be kept. This is more in line with idiomatic Rust code, removes duplication, and reduces potential user confusion.
   
   Likewise, the `Not` trait impl should be moved to the `operator` module and the corresponding type method be removed.
   
   # Alternatives
   If the duplicated type impls won't be removed, then the trait impls should at least be derived from the `Expr` methods.
   
   # Discussion
   Please let me know what you think. It's possible that I misunderstood why the code is laid out this way.
   
   If there is in fact a problem, then I'm prepared to make a PR to solve this after we've discussed of the appropriate solution.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb closed issue #6437: Inconsistencies in `Expr` operator code.

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb closed issue #6437: Inconsistencies in `Expr` operator code.
URL: https://github.com/apache/arrow-datafusion/issues/6437


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] jdye64 commented on issue #6437: Inconsistencies in `Expr` operator code.

Posted by "jdye64 (via GitHub)" <gi...@apache.org>.
jdye64 commented on issue #6437:
URL: https://github.com/apache/arrow-datafusion/issues/6437#issuecomment-1563332127

   @alamb thanks for checking. @LouisGariepy proposal will not cause any issue on the Python bindings side (however good to know so we can expect and adjust for the change). However I can't speak for other non-rust bindings.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb commented on issue #6437: Inconsistencies in `Expr` operator code.

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on issue #6437:
URL: https://github.com/apache/arrow-datafusion/issues/6437#issuecomment-1563296056

   > In my opinion, the Expr impl should not duplicate built-in operator trait methods. The duplicated methods should be removed from Expr, and only the trait impls be kept. This is more in line with idiomatic Rust code, removes duplication, and reduces potential user confusion.
   
   I agree a single Rust idomatic pattern sounds good to me. 
   
   I wonder if it might cause issues for python bindings, or other non rust users -- @andygrove  or @jdye64  do you have any insight?
   
   Thank you for working on cleaning this up 🙏 
   
   > Likewise, the Not trait impl should be moved to the operator module and the corresponding type method be removed.
   
   I agree


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org