Posted to issues@impala.apache.org by "Max Mizikar (Jira)" <ji...@apache.org> on 2020/02/26 17:12:00 UTC

[jira] [Created] (IMPALA-9429) Unioned partition columns break partition pruning

Max Mizikar created IMPALA-9429:
-----------------------------------

             Summary: Unioned partition columns break partition pruning
                 Key: IMPALA-9429
                 URL: https://issues.apache.org/jira/browse/IMPALA-9429
             Project: IMPALA
          Issue Type: Bug
    Affects Versions: Impala 3.2.0
            Reporter: Max Mizikar


Our landing tables and our compacted tables are partitioned at different granularities, and we use a view to union the two. After upgrading from CDH 5.15 (Impala 2.12.0) to CDH 6.3 (Impala 3.2.0), queries against these unioned views started failing. The following is the smallest example I have found that reproduces the failure.
{code:java}
[:21000] debug> create table debug_with_partition (col1 int) partitioned by (col2 int, col3 int);
Query: create table debug_with_partition (col1 int) partitioned by (col2 int, col3 int)
+-------------------------+
| summary                 |
+-------------------------+
| Table has been created. |
+-------------------------+
Fetched 1 row(s) in 0.09s
[:21000] debug> create table debug_without_partition (col1 int) partitioned by (col2 int);
Query: create table debug_without_partition (col1 int) partitioned by (col2 int)
+-------------------------+
| summary                 |
+-------------------------+
| Table has been created. |
+-------------------------+
Fetched 1 row(s) in 0.03s
[:21000] debug> create view debug as select col1, col2, col3 from debug_with_partition union all select col1, col2, null from debug_without_partition;
Query: create view debug as select col1, col2, col3 from debug_with_partition union all select col1, col2, null from debug_without_partition
Query submitted at: 2020-02-26 17:04:58 (Coordinator: :25000)
Query progress can be monitored at: :25000/query_plan?query_id=28453bdf5f919fe9:66fef22200000000
+------------------------+
| summary                |
+------------------------+
| View has been created. |
+------------------------+
Fetched 1 row(s) in 5.65s
[:21000] debug> select * from debug where col2 = 0 or col3 = 0;
Query: select * from debug where col2 = 0 or col3 = 0
Query submitted at: 2020-02-26 17:05:21 (Coordinator: t:25000)
ERROR: IllegalStateException: null
{code}
Here is what I find in the log:
{code:java}
I0226 17:05:21.099532 129442 jni-util.cc:256] c34e2a72018579fe:3d7388e100000000] java.lang.IllegalStateException
                                at com.google.common.base.Preconditions.checkState(Preconditions.java:133)
                                at org.apache.impala.planner.HdfsPartitionPruner.canEvalUsingPartitionMd(HdfsPartitionPruner.java:196)
                                at org.apache.impala.planner.HdfsPartitionPruner.canEvalUsingPartitionMd(HdfsPartitionPruner.java:211)
                                at org.apache.impala.planner.HdfsPartitionPruner.prunePartitions(HdfsPartitionPruner.java:131)
                                at org.apache.impala.planner.SingleNodePlanner.createHdfsScanPlan(SingleNodePlanner.java:1257)
                                at org.apache.impala.planner.SingleNodePlanner.createScanNode(SingleNodePlanner.java:1348)
                                at org.apache.impala.planner.SingleNodePlanner.createTableRefNode(SingleNodePlanner.java:1535)
                                at org.apache.impala.planner.SingleNodePlanner.createTableRefsPlan(SingleNodePlanner.java:814)
                                at org.apache.impala.planner.SingleNodePlanner.createSelectPlan(SingleNodePlanner.java:650)
                                at org.apache.impala.planner.SingleNodePlanner.createQueryPlan(SingleNodePlanner.java:258)
                                at org.apache.impala.planner.SingleNodePlanner.createUnionPlan(SingleNodePlanner.java:1584)
                                at org.apache.impala.planner.SingleNodePlanner.createUnionPlan(SingleNodePlanner.java:1651)
                                at org.apache.impala.planner.SingleNodePlanner.createQueryPlan(SingleNodePlanner.java:280)
                                at org.apache.impala.planner.SingleNodePlanner.createInlineViewPlan(SingleNodePlanner.java:1088)
                                at org.apache.impala.planner.SingleNodePlanner.createTableRefNode(SingleNodePlanner.java:1546)
                                at org.apache.impala.planner.SingleNodePlanner.createTableRefsPlan(SingleNodePlanner.java:814)
                                at org.apache.impala.planner.SingleNodePlanner.createSelectPlan(SingleNodePlanner.java:650)
                                at org.apache.impala.planner.SingleNodePlanner.createQueryPlan(SingleNodePlanner.java:258)
                                at org.apache.impala.planner.SingleNodePlanner.createSingleNodePlan(SingleNodePlanner.java:148)
                                at org.apache.impala.planner.Planner.createPlan(Planner.java:103)
                                at org.apache.impala.service.Frontend.createExecRequest(Frontend.java:1171)
                                at org.apache.impala.service.Frontend.getPlannedExecRequest(Frontend.java:1466)
                                at org.apache.impala.service.Frontend.doCreateExecRequest(Frontend.java:1345)
                                at org.apache.impala.service.Frontend.getTExecRequest(Frontend.java:1252)
                                at org.apache.impala.service.Frontend.createExecRequest(Frontend.java:1222)
                                at org.apache.impala.service.JniFrontend.createExecRequest(JniFrontend.java:167)
I0226 17:05:21.099617 129442 status.cc:124] c34e2a72018579fe:3d7388e100000000] IllegalStateException: null
    @           0xb4c459
    @          0x114fe2e
    @          0x102ab53
    @          0x1052ba2
    @          0x105e88c
    @          0x109e5be
    @          0x138fee4
    @          0x138f39c
    @           0xb18169
    @           0xf2d1d8
    @           0xf23c4e
    @           0xf24ae1
    @          0x11c5e0f
    @          0x11c69b9
    @          0x1840569
    @     0x7f2ef82926b9
    @     0x7f2ef7fc841c
{code}
Debugging further from the shell, I found that the following variations work.
Querying on just the null-filled column:
{code:java}
[:21000] debug> select * from debug where col3 = 0;
Query: select * from debug where col3 = 0
Query submitted at: 2020-02-26 17:07:07 (Coordinator: :25000)
Query progress can be monitored at: :25000/query_plan?query_id=1b44157731b6f5ff:d052c2c600000000
Fetched 0 row(s) in 0.11s
{code}
Querying with an AND on the null-filled column:
{code:java}
[:21000] debug> select * from debug where col2 = 0 and col3 = 0;
Query: select * from debug where col2 = 0 and col3 = 0
Query submitted at: 2020-02-26 17:07:27 (Coordinator: :25000)
Query progress can be monitored at: :25000/query_plan?query_id=334f7fbf2367a558:6ebe4d6100000000
Fetched 0 row(s) in 0.11s
{code}
Casting the null-filled column:
{code:java}
[:21000] debug> select * from debug where col2 = 0 or cast(col3 as int) = 0;
Query: select * from debug where col2 = 0 or cast(col3 as int) = 0
Query submitted at: 2020-02-26 17:08:26 (Coordinator: :25000)
Query progress can be monitored at: :25000/query_plan?query_id=1a4d43d8fc9fc45d:662922b900000000
Fetched 0 row(s) in 0.11s
{code}
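For anyone who wants to re-run the cases above in one pass, here is a minimal sketch of a driver script. The SQL statements and the reported outcomes are copied from this report; everything else is an assumption made for illustration: the helper names (build_queries, run_query), the impalad address localhost:21000, and a database named debug (taken from the shell prompt). It relies only on the impala-shell options -i, -d and -q, and falls back to printing the statements when impala-shell is not on the PATH.

```python
import shutil
import subprocess

# Setup statements copied from the report above.
SETUP = [
    "create table debug_with_partition (col1 int) partitioned by (col2 int, col3 int)",
    "create table debug_without_partition (col1 int) partitioned by (col2 int)",
    "create view debug as select col1, col2, col3 from debug_with_partition "
    "union all select col1, col2, null from debug_without_partition",
]

# (WHERE predicate, outcome reported above on Impala 3.2.0)
CASES = [
    ("col2 = 0 or col3 = 0", "IllegalStateException"),
    ("col3 = 0", "ok"),
    ("col2 = 0 and col3 = 0", "ok"),
    ("col2 = 0 or cast(col3 as int) = 0", "ok"),
]


def build_queries():
    """Return one SELECT against the view per predicate in CASES."""
    return [f"select * from debug where {pred}" for pred, _ in CASES]


def run_query(query, impalad="localhost:21000", database="debug"):
    """Run one statement through impala-shell.

    The impalad address and database name are assumptions; adjust them
    for your cluster. Returns (exit code, combined stdout and stderr).
    """
    proc = subprocess.run(
        ["impala-shell", "-i", impalad, "-d", database, "-q", query],
        capture_output=True,
        text=True,
    )
    return proc.returncode, proc.stdout + proc.stderr


if __name__ == "__main__":
    if shutil.which("impala-shell") is None:
        # No client available: just print the statements to paste manually.
        for stmt in SETUP + build_queries():
            print(stmt + ";")
    else:
        for stmt in SETUP:
            run_query(stmt)
        for (pred, reported), query in zip(CASES, build_queries()):
            code, output = run_query(query)
            if "IllegalStateException" in output:
                observed = "IllegalStateException"
            elif code == 0:
                observed = "ok"
            else:
                observed = "other error"
            print(f"{pred!r}: reported={reported}, observed={observed}")
```

Comparing the reported and observed columns across versions should make it easy to confirm whether only the OR predicate on the uncast null-filled column fails.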
Please let me know if there is anything else I can do to help!


