Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/10/31 15:59:37 UTC

[GitHub] [arrow-datafusion] alamb opened a new issue, #4046: Another Internal error when parquet predicate pushdown is enabled "Error evaluating filter predicate:

alamb opened a new issue, #4046:
URL: https://github.com/apache/arrow-datafusion/issues/4046

   **Describe the bug**
   DataFusion generates an error for some predicates when predicate pushdown is enabled. 
   
   NOTE: This is the same symptom as reported in https://github.com/apache/arrow-datafusion/issues/4006 but with a different predicate.
   
   NOTE: Pushdown filtering is not enabled by default (as we are still working on it), so this issue is unlikely to affect users.
   
   **To Reproduce**
   1. Download data from [repro.zip](https://github.com/apache/arrow-datafusion/files/9902718/repro.zip)
   2. Run datafusion CLI
   
   The query run is
   ```sql
   select count(*) from foo where request_method != 'GET' OR response_status = 400 OR service = 'backend';
   ```
   
   I tested this using master at https://github.com/apache/arrow-datafusion/commit/35f786bb6ce33cbd58db3e16a46958b58f7676f4, which includes the fix for #4006 in https://github.com/apache/arrow-datafusion/commit/5cf090a13391501c0ce7707ac7a1e50e18517b79
   
   
   ```shell
   $ git status
   Your branch is up to date with 'apache/master'.
   
   nothing to commit, working tree clean
   $ git rev-parse HEAD
   5cf090a13391501c0ce7707ac7a1e50e18517b79
   ```
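   
   The contents of `script.sql` (the file passed to `datafusion-cli -f`) are not shown in this report. A minimal sketch of what it might look like, and of running it with and without pushdown filtering, follows. The `CREATE EXTERNAL TABLE` statement and the `DATA_PATH` location are assumptions for illustration, not taken from the original script; point `DATA_PATH` at wherever `repro.zip` was extracted.
   
   ```shell
   # Hypothetical location of the extracted parquet data; adjust as needed.
   DATA_PATH="${DATA_PATH:-./repro}"
   
   # Write script.sql: an assumed table registration followed by the query from this report.
   printf '%s\n' \
     "CREATE EXTERNAL TABLE foo STORED AS PARQUET LOCATION '${DATA_PATH}';" \
     "select count(*) from foo where request_method != 'GET' OR response_status = 400 OR service = 'backend';" \
     > script.sql
   
   # Run the script twice, if datafusion-cli is installed:
   # first with default settings, then with parquet pushdown filtering enabled.
   if command -v datafusion-cli >/dev/null 2>&1; then
     datafusion-cli -f script.sql
     DATAFUSION_EXECUTION_PARQUET_PUSHDOWN_FILTERS=true datafusion-cli -f script.sql
   fi
   ```
   
   Both runs should print the same count; a difference, or an error on the second run only, points at the pushdown path.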
   
   **Expected behavior**
   The same answer should be produced with and without row filtering enabled. However, with row filtering enabled an error is produced. Without it enabled:
   
   ```shell
   datafusion-cli -f script.sql
   +-----------------+
   | COUNT(UInt8(1)) |
   +-----------------+
   | 53819           |
   +-----------------+
   1 row in set. Query took 0.006 seconds.
   ```
   
   With it enabled:
   
   ```shell
   DATAFUSION_EXECUTION_PARQUET_PUSHDOWN_FILTERS=true datafusion-cli -f script.sql
   ...
   1 row in set. Query took 0.021 seconds.
   ArrowError(ExternalError(Execution("Arrow error: External error: Arrow: underlying Arrow error: Compute error: Error evaluating filter predicate: Internal(\"Cannot evaluate binary expression NotEq with types UInt16 and Utf8\")")))
   ```
   
   **Additional context**
   Found by the test here https://github.com/apache/arrow-datafusion/pull/3976
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb closed issue #4046: Another Internal error when parquet predicate pushdown is enabled "Error evaluating filter predicate:

Posted by GitBox <gi...@apache.org>.
alamb closed issue #4046: Another Internal error when parquet predicate pushdown is enabled "Error evaluating filter predicate:
URL: https://github.com/apache/arrow-datafusion/issues/4046

