Posted to issues-all@impala.apache.org by "LiPenglin (Jira)" <ji...@apache.org> on 2022/05/08 09:40:00 UTC
[jira] [Comment Edited] (IMPALA-11243) Improve predicate pushdown to Iceberg
[ https://issues.apache.org/jira/browse/IMPALA-11243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17533192#comment-17533192 ]
LiPenglin edited comment on IMPALA-11243 at 5/8/22 9:39 AM:
------------------------------------------------------------
Hi [~boroknagyz]
I have run into a problem. I have confirmed that the NOT_NULL predicate is pushed down to Iceberg, but _org.apache.iceberg.BaseTableScan#filter(org.apache.iceberg.expressions.Expression)_ does not appear to have any effect. The same happens when a NOT_IN predicate is pushed down.
---
Luckily, I found the cause of the problem.
1 `value_counts` is null in the Iceberg metadata:
{code:java}
{"status":1,"snapshot_id":{"long":5383986197020147226},"data_file":{"file_path":"hdfs://localhost:20500/test-warehouse/tdb.db/ice_is_null_pred_pd/data/col_i=8/c94417272af4dd76-6820892500000000_918326216_data.0.parq","file_format":"PARQUET","partition":{"col_i":{"int":8}},"record_count":1,"file_size_in_bytes":2581,"block_size_in_bytes":67108864,"column_sizes":{"array":[{"key":1,"value":47},{"key":2,"value":51},{"key":3,"value":51},{"key":4,"value":66},{"key":5,"value":51},{"key":6,"value":47},{"key":7,"value":47},{"key":8,"value":51},{"key":9,"value":39}]},"value_counts":null,"null_value_counts":{"array":[{"key":1,"value":0},{"key":2,"value":0},{"key":3,"value":0},{"key":4,"value":0},{"key":5,"value":0},{"key":6,"value":0},{"key":7,"value":0},{"key":8,"value":0},{"key":9,"value":1}]},"nan_value_counts":null,"lower_bounds":{"array":[{"key":1,"value":"\b\u0000\u0000\u0000"},{"key":2,"value":"<\u001CÜß\u0002\u0000\u0000\u0000"},{"key":3,"value":"<90>÷ª<95>\t¿\u0005@"},{"key":4,"value":"1700-01-01 00:00:00"},{"key":5,"value":"<80>ZûÁ¨<84>Öÿ"},{"key":6,"value":"\u001Cðýÿ"},{"key":7,"value":"\u0001â:"},{"key":8,"value":"\u0001\u001Fqû\u0004È"}]},"upper_bounds":{"array":[{"key":1,"value":"\b\u0000\u0000\u0000"},{"key":2,"value":"<\u001CÜß\u0002\u0000\u0000\u0000"},{"key":3,"value":"<90>÷ª<95>\t¿\u0005@"},{"key":4,"value":"1700-01-01 00:00:00"},{"key":5,"value":"<80>ZûÁ¨<84>Öÿ"},{"key":6,"value":"\u001Cðýÿ"},{"key":7,"value":"\u0001â:"},{"key":8,"value":"\u0001\u001Fqû\u0004È"}]},"key_metadata":null,"split_offsets":null,"sort_order_id":{"int":0}}}
{code}
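2 According to the logic at `https://github.com/apache/iceberg/blob/b521f40d9f897ffc0d3d30511cfdff35797b5894/api/src/main/java/org/apache/iceberg/expressions/InclusiveMetricsEvaluator.java#L457`, a file can only be ruled out (`ROWS_CANNOT_MATCH`) for NOT_NULL when `value_counts` is available to compare against `null_value_counts`. When `value_counts` is null, the evaluator falls back to `ROWS_MIGHT_MATCH`.
3 As a result, the predicate is pushed down but has no effect: the file is not skipped.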
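The effect of a missing `value_counts` map can be illustrated with a minimal sketch. This is a simplified model written for this comment, not the Iceberg class itself: the class name `NotNullPruningSketch`, the method signatures, and the boolean constants are assumptions, and only the field names and the counts for column 9 come from the metadata above. It assumes the evaluator can prove "this column holds only nulls" only when both `value_counts` and `null_value_counts` have an entry for the column.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model of a metrics-based NOT NULL check.
// It is NOT org.apache.iceberg.expressions.InclusiveMetricsEvaluator.
final class NotNullPruningSketch {
  static final boolean ROWS_MIGHT_MATCH = true;
  static final boolean ROWS_CANNOT_MATCH = false;

  // A column is provably all-null only if both maps have an entry for it
  // and the value count equals the null count.
  static boolean containsNullsOnly(Map<Integer, Long> valueCounts,
                                   Map<Integer, Long> nullCounts,
                                   int fieldId) {
    return valueCounts != null && valueCounts.containsKey(fieldId)
        && nullCounts != null && nullCounts.containsKey(fieldId)
        && valueCounts.get(fieldId) - nullCounts.get(fieldId) == 0;
  }

  // NOT NULL can rule a file out only when the column is provably all-null.
  static boolean notNull(Map<Integer, Long> valueCounts,
                         Map<Integer, Long> nullCounts,
                         int fieldId) {
    return containsNullsOnly(valueCounts, nullCounts, fieldId)
        ? ROWS_CANNOT_MATCH
        : ROWS_MIGHT_MATCH;
  }

  public static void main(String[] args) {
    // Column 9 from the metadata above: one row, one null.
    Map<Integer, Long> nullCounts = new HashMap<>();
    nullCounts.put(9, 1L);

    // value_counts is null, as in the metadata: the file cannot be skipped.
    System.out.println("without value_counts, file kept: "
        + notNull(null, nullCounts, 9));

    // Assumed value_counts entry (1 value for column 9): the file can be skipped.
    Map<Integer, Long> valueCounts = new HashMap<>();
    valueCounts.put(9, 1L);
    System.out.println("with value_counts, file kept: "
        + notNull(valueCounts, nullCounts, 9));
  }
}
```

Under this model, writing `value_counts` into the metadata would be enough for the NOT_NULL pushdown to skip the file shown above.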
was (Author: lipenglin):
Hi [~boroknagyz]
I have run into a problem. I have confirmed that the NOT_NULL predicate is pushed down to Iceberg, but _org.apache.iceberg.BaseTableScan#filter(org.apache.iceberg.expressions.Expression)_ does not appear to have any effect. The same happens when a NOT_IN predicate is pushed down.
> Improve predicate pushdown to Iceberg
> -------------------------------------
>
> Key: IMPALA-11243
> URL: https://issues.apache.org/jira/browse/IMPALA-11243
> Project: IMPALA
> Issue Type: Improvement
> Components: Frontend
> Reporter: Zoltán Borók-Nagy
> Assignee: LiPenglin
> Priority: Major
> Labels: impala-iceberg
>
> Iceberg provides a rich API to push down predicates, e.g. we could push down complex predicates with OR, NOT, etc.
> Also, currently we only push down predicates in the form:
> {noformat}
> COL <bin-operator> LITERAL_EXPR
> E.g.:
> col_ts <= '2021-01-01 12:01:00'
> {noformat}
> Instead of only allowing literal expressions, we could evaluate any constant expression and push down the result to Iceberg.