Posted to issues-all@impala.apache.org by "Tim Armstrong (JIRA)" <ji...@apache.org> on 2018/11/16 22:28:00 UTC

[jira] [Commented] (IMPALA-4864) Speed up binary predicates against dictionary encoded Parquet data by converting the predicates to their codewords

    [ https://issues.apache.org/jira/browse/IMPALA-4864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16690073#comment-16690073 ] 

Tim Armstrong commented on IMPALA-4864:
---------------------------------------

IMPALA-4123 makes this a bit more interesting since it means we're smarter about decoding dictionary-encoded data. 

For repeated runs, we'd only need to evaluate the predicate once per run, which reduces the cost a lot. For literal runs, we could make this more efficient by pushing the predicate evaluation down into the batched bit-unpacking loop, the same as we did with the dictionary lookup.
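
To make the two cases concrete, here is a rough Python sketch. It is not Impala's decoder: the run representation, the function name count_matches, and the per-codeword matches table are all hypothetical, chosen only to show where the savings come from. It assumes the column has already been split into repeated runs (one codeword plus a count) and literal runs (a list of unpacked codewords).

```python
from typing import Callable, List, Sequence, Tuple, Union

# Hypothetical run model (an assumption for this sketch):
#   ("repeated", codeword, count)  - one codeword repeated `count` times
#   ("literal", [codewords])       - individually bit-packed codewords
Run = Union[Tuple[str, int, int], Tuple[str, List[int]]]


def count_matches(
    dictionary: Sequence[str],
    runs: Sequence[Run],
    predicate: Callable[[str], bool],
) -> int:
    """Count the rows whose decoded value satisfies `predicate`."""
    # Evaluate the predicate once per dictionary entry, not once per row.
    matches = [predicate(value) for value in dictionary]

    total = 0
    for run in runs:
        if run[0] == "repeated":
            _, code, count = run
            # One table lookup decides the whole run.
            if matches[code]:
                total += count
        else:
            _, codes = run
            # Literal run: one cheap table lookup per unpacked codeword.
            # A real decoder could fuse this into the bit-unpacking loop.
            total += sum(1 for code in codes if matches[code])
    return total


dictionary = ["apple", "banana", "cherry"]
runs = [("repeated", 1, 1000), ("literal", [0, 2, 1, 2]), ("repeated", 0, 500)]
print(count_matches(dictionary, runs, lambda v: v == "banana"))  # prints 1001
```

In this sketch the predicate itself runs three times (once per dictionary entry), while the 1000-row repeated run costs a single lookup.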

> Speed up binary predicates against dictionary encoded Parquet data by converting the predicates to their codewords
> ------------------------------------------------------------------------------------------------------------------
>
>                 Key: IMPALA-4864
>                 URL: https://issues.apache.org/jira/browse/IMPALA-4864
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Backend
>    Affects Versions: Impala 2.9.0
>            Reporter: Mostafa Mokhtar
>            Priority: Major
>              Labels: performance
>
> Selective binary predicates against dictionary-encoded columns can be sped up by converting the original predicates on the column type into predicates on the dictionary codewords. This should help avoid expensive comparisons.
> Similar to Kudu:
> https://kudu.apache.org/2016/09/16/predicate-pushdown.html
> https://github.com/cloudera/kudu/commit/c0f37278cb09a7781d9073279ea54b08db6e2010
> https://github.com/cloudera/kudu/commit/ec80fdb37be44d380046a823b5e6d8e2241ec3da
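
The conversion described in the issue can be sketched for the simplest case, an equality predicate. This is a minimal illustration in Python, not code from Impala or Kudu; the function names equality_to_codeword and filter_equal are hypothetical, and the sketch assumes each dictionary value appears only once.

```python
from typing import List, Optional, Sequence


def equality_to_codeword(dictionary: Sequence[str], literal: str) -> Optional[int]:
    """Translate `col = literal` into `code = k`, or None if no codeword exists."""
    for code, value in enumerate(dictionary):
        if value == literal:  # the only comparisons on the column type
            return code
    return None


def filter_equal(
    dictionary: Sequence[str], codes: Sequence[int], literal: str
) -> List[int]:
    """Return the row indexes whose decoded value equals `literal`."""
    target = equality_to_codeword(dictionary, literal)
    if target is None:
        # The literal is not in the dictionary, so no row can match.
        return []
    # Per-row work is now an integer comparison on the codeword.
    return [row for row, code in enumerate(codes) if code == target]


dictionary = ["lisbon", "oslo", "paris"]
codes = [2, 0, 1, 2, 2, 0]
print(filter_equal(dictionary, codes, "paris"))  # prints [0, 3, 4]
print(filter_equal(dictionary, codes, "rome"))   # prints []
```

The string comparisons are bounded by the dictionary size rather than the row count, and a literal that is missing from the dictionary lets the scan skip the encoded data entirely.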



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
