Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/03/04 15:19:28 UTC

[GitHub] [arrow-datafusion] jonmmease opened a new pull request #1925: Towards fixing ambiguous reference error 1411

jonmmease opened a new pull request #1925:
URL: https://github.com/apache/arrow-datafusion/pull/1925


   # Which issue does this PR close?
   Closes #1411.  
   
   cc @kszucs
   
   # Rationale for this change
   This PR adds an initially failing test in a060b40f14f6a28945d0e15c2ae893136a9d55b9 that reproduces the behavior described in #1411.
   
   The change in 2042fbb8e16cd5908563c0f8faafe000f02ec029 may not be correct overall, but it addresses this particular failure.
   
   The core issue seems to be that the filter optimization examines fields across all plans and then fails because it finds an ambiguous reference. At least in the DataFrame context, I would expect no ambiguity, since only one of the columns is projected prior to filtering.
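
The ambiguity described above can be illustrated with a minimal sketch. The `Field`, `LookupError`, `field`, and `resolve_unqualified` names are hypothetical stand-ins invented for illustration, not DataFusion's actual types or functions. The sketch only assumes that an unqualified name is matched against every field visible to the lookup:

```rust
// Hypothetical, simplified stand-ins; these are NOT DataFusion's real types.
#[derive(Debug, Clone, PartialEq)]
struct Field {
    qualifier: Option<String>,
    name: String,
}

#[derive(Debug, PartialEq)]
enum LookupError {
    NotFound(String),
    Ambiguous(String),
}

// Small helper for building fields in examples.
fn field(qualifier: Option<&str>, name: &str) -> Field {
    Field {
        qualifier: qualifier.map(str::to_string),
        name: name.to_string(),
    }
}

// Resolve an unqualified column name against a flat list of fields.
// More than one match is reported as an ambiguity.
fn resolve_unqualified<'a>(fields: &'a [Field], name: &str) -> Result<&'a Field, LookupError> {
    let mut matches = fields.iter().filter(|f| f.name == name);
    match (matches.next(), matches.next()) {
        (None, _) => Err(LookupError::NotFound(name.to_string())),
        (Some(found), None) => Ok(found),
        (Some(_), Some(_)) => Err(LookupError::Ambiguous(name.to_string())),
    }
}
```

Under these assumptions, looking up `a` against the fields of every plan (`t1.a`, `t1.b`, `t2.a`) reports an ambiguity, while looking it up against only the projected field (`t1.a`) succeeds.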
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] jonmmease commented on pull request #1925: Fix ambiguous reference error in filter plan

Posted by GitBox <gi...@apache.org>.
jonmmease commented on pull request #1925:
URL: https://github.com/apache/arrow-datafusion/pull/1925#issuecomment-1062172300


   Thanks @alamb, not sure how that got in there :facepalm: It should be reverted now.





[GitHub] [arrow-datafusion] alamb merged pull request #1925: Fix ambiguous reference error in filter plan

Posted by GitBox <gi...@apache.org>.
alamb merged pull request #1925:
URL: https://github.com/apache/arrow-datafusion/pull/1925


   





[GitHub] [arrow-datafusion] alamb commented on pull request #1925: Fix ambiguous reference error in filter plan

Posted by GitBox <gi...@apache.org>.
alamb commented on pull request #1925:
URL: https://github.com/apache/arrow-datafusion/pull/1925#issuecomment-1062162361


   > Maybe something flaky?
   
   
   I think this is related to the fact that somehow this PR has changes to the `testing` repo which I suspect was not intended:
   
   ![Screen Shot 2022-03-08 at 3 06 16 PM](https://user-images.githubusercontent.com/490673/157316327-c191412e-9a8b-4fb3-9b67-14a1c3f7eca2.png)
   





[GitHub] [arrow-datafusion] jonmmease edited a comment on pull request #1925: Fix ambiguous reference error in filter plan

Posted by GitBox <gi...@apache.org>.
jonmmease edited a comment on pull request #1925:
URL: https://github.com/apache/arrow-datafusion/pull/1925#issuecomment-1062158715


   Thanks for taking a look @alamb and @houqp! I made the proposed change.
   
   This error popped up in https://github.com/apache/arrow-datafusion/runs/5470235871?check_suite_focus=true:
   
   ```
    failures:
   
   ---- datasource::file_format::avro::tests::test stdout ----
   thread 'datasource::file_format::avro::tests::test' panicked at 'Local file metadata: Os { code: 2, kind: NotFound, message: "No such file or directory" }', datafusion/src/datasource/object_store/local.rs:178:40
   
   
   failures:
       datasource::file_format::avro::tests::test
   ```
   
   Maybe something flaky?





[GitHub] [arrow-datafusion] alamb edited a comment on pull request #1925: Fix ambiguous reference error in filter plan

Posted by GitBox <gi...@apache.org>.
alamb edited a comment on pull request #1925:
URL: https://github.com/apache/arrow-datafusion/pull/1925#issuecomment-1062162361


   > Maybe something flaky?
   
   
   I think this is related to the fact that somehow this PR has changes to the `testing` submodule which I suspect was not intended:
   
   ![Screen Shot 2022-03-08 at 3 06 16 PM](https://user-images.githubusercontent.com/490673/157316327-c191412e-9a8b-4fb3-9b67-14a1c3f7eca2.png)
   





[GitHub] [arrow-datafusion] alamb commented on pull request #1925: Towards fixing ambiguous reference error in filter plan

Posted by GitBox <gi...@apache.org>.
alamb commented on pull request #1925:
URL: https://github.com/apache/arrow-datafusion/pull/1925#issuecomment-1061147239


   (thank you @jonmmease  for raising this PR)





[GitHub] [arrow-datafusion] houqp commented on pull request #1925: Towards fixing ambiguous reference error in filter plan

Posted by GitBox <gi...@apache.org>.
houqp commented on pull request #1925:
URL: https://github.com/apache/arrow-datafusion/pull/1925#issuecomment-1061486092


   Thank you @jonmmease for taking this on. I agree with @alamb that the order you outlined in your comment sounds good.





[GitHub] [arrow-datafusion] jonmmease commented on pull request #1925: Fix ambiguous reference error in filter plan

Posted by GitBox <gi...@apache.org>.
jonmmease commented on pull request #1925:
URL: https://github.com/apache/arrow-datafusion/pull/1925#issuecomment-1062158715


   This error popped up in https://github.com/apache/arrow-datafusion/runs/5470235871?check_suite_focus=true:
   
   ```
    failures:
   
   ---- datasource::file_format::avro::tests::test stdout ----
   thread 'datasource::file_format::avro::tests::test' panicked at 'Local file metadata: Os { code: 2, kind: NotFound, message: "No such file or directory" }', datafusion/src/datasource/object_store/local.rs:178:40
   
   
   failures:
       datasource::file_format::avro::tests::test
   ```
   
   Maybe something flaky?





[GitHub] [arrow-datafusion] jonmmease commented on pull request #1925: Fix ambiguous reference error in filter plan

Posted by GitBox <gi...@apache.org>.
jonmmease commented on pull request #1925:
URL: https://github.com/apache/arrow-datafusion/pull/1925#issuecomment-1062223698


   tests now passing





[GitHub] [arrow-datafusion] houqp commented on pull request #1925: Fix ambiguous reference error in filter plan

Posted by GitBox <gi...@apache.org>.
houqp commented on pull request #1925:
URL: https://github.com/apache/arrow-datafusion/pull/1925#issuecomment-1062532739


   Thanks @jonmmease 





[GitHub] [arrow-datafusion] alamb commented on pull request #1925: Towards fixing ambiguous reference error in filter plan

Posted by GitBox <gi...@apache.org>.
alamb commented on pull request #1925:
URL: https://github.com/apache/arrow-datafusion/pull/1925#issuecomment-1061146813


   > I'm not sure what the correct behavior here is. Should the projected schema take precedence to resolve ambiguity and then fall back to the combined schema if a column reference is not found?
   
   That sounds reasonable to me at a high level
   
   @houqp wrote up the desired output name semantics here: https://arrow.apache.org/datafusion/specification/output-field-name-semantic.html
   
   which might help guide you on this question





[GitHub] [arrow-datafusion] jonmmease commented on pull request #1925: Towards fixing ambiguous reference error in filter plan

Posted by GitBox <gi...@apache.org>.
jonmmease commented on pull request #1925:
URL: https://github.com/apache/arrow-datafusion/pull/1925#issuecomment-1059266283


   Unsurprisingly, this change causes a failure in the `execution::context::tests::unprojected_filter` test.
   
   I'm not sure what the correct behavior here is.  Should the projected schema take precedence to resolve ambiguity and then fall back to the combined schema if a column reference is not found?
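
One way to picture the precedence rule asked about above is the sketch below. It is not the change made in this PR: `Field`, `LookupError`, `field`, `lookup`, and `resolve_with_precedence` are hypothetical names, and the sketch assumes that an ambiguity inside the projected schema itself should still be an error.

```rust
// Hypothetical, simplified stand-ins; these are NOT DataFusion's real types.
#[derive(Debug, Clone, PartialEq)]
struct Field {
    qualifier: Option<String>,
    name: String,
}

#[derive(Debug, PartialEq)]
enum LookupError {
    NotFound(String),
    Ambiguous(String),
}

// Small helper for building fields in examples.
fn field(qualifier: Option<&str>, name: &str) -> Field {
    Field {
        qualifier: qualifier.map(str::to_string),
        name: name.to_string(),
    }
}

// Look up an unqualified name in one schema; several matches are ambiguous.
fn lookup<'a>(fields: &'a [Field], name: &str) -> Result<&'a Field, LookupError> {
    let mut matches = fields.iter().filter(|f| f.name == name);
    match (matches.next(), matches.next()) {
        (None, _) => Err(LookupError::NotFound(name.to_string())),
        (Some(found), None) => Ok(found),
        (Some(_), Some(_)) => Err(LookupError::Ambiguous(name.to_string())),
    }
}

// Try the projected schema first. Fall back to the combined schema only
// when the name is missing from the projection (assumption: an ambiguity
// inside the projection is still returned as an error).
fn resolve_with_precedence<'a>(
    projected: &'a [Field],
    combined: &'a [Field],
    name: &str,
) -> Result<&'a Field, LookupError> {
    match lookup(projected, name) {
        Err(LookupError::NotFound(_)) => lookup(combined, name),
        other => other,
    }
}
```

With a projection of `t1.a` over combined fields `t1.a`, `t1.b`, `t2.a`, this resolves `a` to the projected field and still finds the unprojected `b` through the fallback, which is the kind of case the `unprojected_filter` test appears to exercise.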

