You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/05/20 12:40:24 UTC

[GitHub] [arrow-datafusion] alamb commented on pull request #2549: feat: support for AnyExpression

alamb commented on PR #2549:
URL: https://github.com/apache/arrow-datafusion/pull/2549#issuecomment-1132855695

   Thanks -- I'll try and find some more time to review this PR.
   
   On Wed, May 18, 2022 at 5:54 PM Dmitry Patsura ***@***.***>
   wrote:
   
   > ***@***.**** commented on this pull request.
   > ------------------------------
   >
   > In datafusion/physical-expr/src/expressions/any.rs
   > <https://github.com/apache/arrow-datafusion/pull/2549#discussion_r876413237>
   > :
   >
   > > +    fn evaluate(&self, batch: &RecordBatch) -> Result<ColumnarValue> {
   >
   > +        let value = match self.value.evaluate(batch)? {
   >
   > +            ColumnarValue::Array(array) => array,
   >
   > +            ColumnarValue::Scalar(scalar) => scalar.to_array(),
   >
   > +        };
   >
   >
   > I haven't reviewed this code super carefully but I wonder what you think
   > about reusing more of the code for InList?
   >
   > Probably, the nearest operator will be ALL, but it's not implemented
   > right now. I tried to reuse the code of InList, but come to the decision
   > that it will be easier/correct to combine it with ANY in the future.
   >
   > It seems like the implementations of x IN (, ) and x ANY () are almost
   > exactly the same except for the fact that the the x = ANY() has the list as
   > a single ?
   >
   > Nope, because it supports more operators: <, <=, >, >= (it's not
   > implemented in this PR, but anyway).
   >
   > I mention this as it is likely the IN / NOT IN are more optimized (use a
   > hashset for testing, for example) as well as already written.
   >
   > Yes, but IN supports only the vector of scalars in DF. It's not the same
   > as ANY, because it supports List (real column) or scalar.
   >
   > For example SELECT left, values FROM table:
   >
   > left | right LIst(Int64) - another logic is required
   >
   > 1.    | [1,2,3] - ArrayRef = PrimitiveArray<Int64> - another logic is requied
   >
   > 2.   |. [2,3,4]
   >
   > 3.   |. [4,5,6]
   >
   > 4.   |  [4,5,6]
   >
   > 5.   |. [4,5,6]
   >
   >
   > In the case of using the whole column, it is required to use another
   > handling of interacting on values ☝️ . Probably here, will be more
   > correct to go by steps:
   >
   >    - Support more operators
   >    - Introduce ALL operator + extract the same logic with ANY.
   >
   > Thanks
   >
   > —
   > Reply to this email directly, view it on GitHub
   > <https://github.com/apache/arrow-datafusion/pull/2549#discussion_r876413237>,
   > or unsubscribe
   > <https://github.com/notifications/unsubscribe-auth/AADXZMLLPQHWVVSZ2KIMKULVKVRKVANCNFSM5WCC6J2Q>
   > .
   > You are receiving this because you were mentioned.Message ID:
   > ***@***.***>
   >
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org