Posted to issues-all@impala.apache.org by "Tim Armstrong (JIRA)" <ji...@apache.org> on 2018/11/16 22:28:00 UTC
[jira] [Commented] (IMPALA-4864) Speed up binary predicates against
dictionary encoded Parquet data by converting the predicates to their
codewords
[ https://issues.apache.org/jira/browse/IMPALA-4864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16690073#comment-16690073 ]
Tim Armstrong commented on IMPALA-4864:
---------------------------------------
IMPALA-4123 makes this a bit more interesting since it means we're smarter about decoding dictionary-encoded data.
For repeated runs, we'd only need to evaluate the predicate once per run, which reduces the cost a lot. For literal runs, we could make this more efficient by pushing the predicate evaluation down into the batched bit-unpacking loop, the same way we did with the dictionary lookup.
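To illustrate the difference between the two run types, here is a minimal sketch in C++. It is not Impala's decoder: the Run struct, the code_matches table and CountMatches are hypothetical names, and literal runs are modelled as already-unpacked codes rather than being evaluated inside the bit-unpacking loop.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical, simplified model of one run of dictionary codes.
// These types are illustrative only and do not mirror Impala's classes.
struct Run {
  bool repeated;                // true: `code` is repeated `count` times
  uint32_t code;                // only meaningful for repeated runs
  uint32_t count;               // only meaningful for repeated runs
  std::vector<uint32_t> codes;  // only meaningful for literal runs
};

// code_matches[c] is nonzero if dictionary entry c satisfies the predicate.
// Returns the number of rows whose value satisfies the predicate.
int64_t CountMatches(const std::vector<Run>& runs,
                     const std::vector<uint8_t>& code_matches) {
  int64_t matched = 0;
  for (const Run& run : runs) {
    if (run.repeated) {
      // One lookup covers the entire run, however long it is.
      if (code_matches[run.code]) matched += run.count;
    } else {
      // Literal run: one lookup per value, but still no comparison
      // against the decoded column value.
      for (uint32_t code : run.codes) {
        if (code_matches[code]) ++matched;
      }
    }
  }
  return matched;
}
```

In this sketch the cost of a repeated run is constant regardless of its length, which is where most of the saving would come from on well-clustered data.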
> Speed up binary predicates against dictionary encoded Parquet data by converting the predicates to their codewords
> ------------------------------------------------------------------------------------------------------------------
>
> Key: IMPALA-4864
> URL: https://issues.apache.org/jira/browse/IMPALA-4864
> Project: IMPALA
> Issue Type: Improvement
> Components: Backend
> Affects Versions: Impala 2.9.0
> Reporter: Mostafa Mokhtar
> Priority: Major
> Labels: performance
>
> Selective binary predicates against dictionary-encoded columns can be sped up by converting the original predicates on the column type to predicates on the dictionary codewords. This should help avoid expensive comparisons.
> This is similar to what Kudu does:
> https://kudu.apache.org/2016/09/16/predicate-pushdown.html
> https://github.com/cloudera/kudu/commit/c0f37278cb09a7781d9073279ea54b08db6e2010
> https://github.com/cloudera/kudu/commit/ec80fdb37be44d380046a823b5e6d8e2241ec3da
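As a rough sketch of the conversion described in the issue (not Impala's or Kudu's actual code), the predicate can be evaluated once per distinct dictionary value to build a per-codeword match table, after which rows are filtered with integer lookups only. BuildCodeMatchTable and SelectRows are hypothetical names, and a string column is assumed purely for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

// Evaluate the predicate once per dictionary entry. The (potentially
// expensive) comparison on the column type happens only here.
std::vector<uint8_t> BuildCodeMatchTable(
    const std::vector<std::string>& dictionary,
    const std::function<bool(const std::string&)>& predicate) {
  std::vector<uint8_t> code_matches(dictionary.size(), 0);
  for (std::size_t code = 0; code < dictionary.size(); ++code) {
    code_matches[code] = predicate(dictionary[code]) ? 1 : 0;
  }
  return code_matches;
}

// Return the indexes of rows whose codeword passes the predicate.
// No column values are decoded or compared in this loop.
std::vector<std::size_t> SelectRows(const std::vector<uint32_t>& codes,
                                    const std::vector<uint8_t>& code_matches) {
  std::vector<std::size_t> selected;
  for (std::size_t row = 0; row < codes.size(); ++row) {
    if (code_matches[codes[row]]) selected.push_back(row);
  }
  return selected;
}
```

The same table works for any binary predicate (=, <, >= and so on), since only the per-entry evaluation changes. If no dictionary entry matches, the whole set of dictionary-encoded values could be skipped without looking at any codes.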
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)