Posted to issues@impala.apache.org by "Max Mizikar (Jira)" <ji...@apache.org> on 2020/02/26 17:12:00 UTC
[jira] [Created] (IMPALA-9429) Unioned partition columns break partition pruning
Max Mizikar created IMPALA-9429:
-----------------------------------
Summary: Unioned partition columns break partition pruning
Key: IMPALA-9429
URL: https://issues.apache.org/jira/browse/IMPALA-9429
Project: IMPALA
Issue Type: Bug
Affects Versions: Impala 3.2.0
Reporter: Max Mizikar
Our landing tables and our compacted tables are partitioned at different granularities, and we use a view to union the two. After upgrading from cdh5.15 (Impala v2.12.0) to cdh6.3 (Impala 3.2.0), we started having issues with these unioned views. This is the smallest example I have found that reproduces the problem.
{code:java}
[:21000] debug> create table debug_with_partition (col1 int) partitioned by (col2 int, col3 int);
Query: create table debug_with_partition (col1 int) partitioned by (col2 int, col3 int)
+-------------------------+
| summary |
+-------------------------+
| Table has been created. |
+-------------------------+
Fetched 1 row(s) in 0.09s
[:21000] debug> create table debug_without_partition (col1 int) partitioned by (col2 int);
Query: create table debug_without_partition (col1 int) partitioned by (col2 int)
+-------------------------+
| summary |
+-------------------------+
| Table has been created. |
+-------------------------+
Fetched 1 row(s) in 0.03s
[:21000] debug> create view debug as select col1, col2, col3 from debug_with_partition union all select col1, col2, null from debug_without_partition;
Query: create view debug as select col1, col2, col3 from debug_with_partition union all select col1, col2, null from debug_without_partition
Query submitted at: 2020-02-26 17:04:58 (Coordinator: :25000)
Query progress can be monitored at: :25000/query_plan?query_id=28453bdf5f919fe9:66fef22200000000
+------------------------+
| summary |
+------------------------+
| View has been created. |
+------------------------+
Fetched 1 row(s) in 5.65s
[:21000] debug> select * from debug where col2 = 0 or col3 = 0;
Query: select * from debug where col2 = 0 or col3 = 0
Query submitted at: 2020-02-26 17:05:21 (Coordinator: t:25000)
ERROR: IllegalStateException: null
{code}
Here is what I find in the log:
{code:java}
I0226 17:05:21.099532 129442 jni-util.cc:256] c34e2a72018579fe:3d7388e100000000] java.lang.IllegalStateException
at com.google.common.base.Preconditions.checkState(Preconditions.java:133)
at org.apache.impala.planner.HdfsPartitionPruner.canEvalUsingPartitionMd(HdfsPartitionPruner.java:196)
at org.apache.impala.planner.HdfsPartitionPruner.canEvalUsingPartitionMd(HdfsPartitionPruner.java:211)
at org.apache.impala.planner.HdfsPartitionPruner.prunePartitions(HdfsPartitionPruner.java:131)
at org.apache.impala.planner.SingleNodePlanner.createHdfsScanPlan(SingleNodePlanner.java:1257)
at org.apache.impala.planner.SingleNodePlanner.createScanNode(SingleNodePlanner.java:1348)
at org.apache.impala.planner.SingleNodePlanner.createTableRefNode(SingleNodePlanner.java:1535)
at org.apache.impala.planner.SingleNodePlanner.createTableRefsPlan(SingleNodePlanner.java:814)
at org.apache.impala.planner.SingleNodePlanner.createSelectPlan(SingleNodePlanner.java:650)
at org.apache.impala.planner.SingleNodePlanner.createQueryPlan(SingleNodePlanner.java:258)
at org.apache.impala.planner.SingleNodePlanner.createUnionPlan(SingleNodePlanner.java:1584)
at org.apache.impala.planner.SingleNodePlanner.createUnionPlan(SingleNodePlanner.java:1651)
at org.apache.impala.planner.SingleNodePlanner.createQueryPlan(SingleNodePlanner.java:280)
at org.apache.impala.planner.SingleNodePlanner.createInlineViewPlan(SingleNodePlanner.java:1088)
at org.apache.impala.planner.SingleNodePlanner.createTableRefNode(SingleNodePlanner.java:1546)
at org.apache.impala.planner.SingleNodePlanner.createTableRefsPlan(SingleNodePlanner.java:814)
at org.apache.impala.planner.SingleNodePlanner.createSelectPlan(SingleNodePlanner.java:650)
at org.apache.impala.planner.SingleNodePlanner.createQueryPlan(SingleNodePlanner.java:258)
at org.apache.impala.planner.SingleNodePlanner.createSingleNodePlan(SingleNodePlanner.java:148)
at org.apache.impala.planner.Planner.createPlan(Planner.java:103)
at org.apache.impala.service.Frontend.createExecRequest(Frontend.java:1171)
at org.apache.impala.service.Frontend.getPlannedExecRequest(Frontend.java:1466)
at org.apache.impala.service.Frontend.doCreateExecRequest(Frontend.java:1345)
at org.apache.impala.service.Frontend.getTExecRequest(Frontend.java:1252)
at org.apache.impala.service.Frontend.createExecRequest(Frontend.java:1222)
at org.apache.impala.service.JniFrontend.createExecRequest(JniFrontend.java:167)
I0226 17:05:21.099617 129442 status.cc:124] c34e2a72018579fe:3d7388e100000000] IllegalStateException: null
@ 0xb4c459
@ 0x114fe2e
@ 0x102ab53
@ 0x1052ba2
@ 0x105e88c
@ 0x109e5be
@ 0x138fee4
@ 0x138f39c
@ 0xb18169
@ 0xf2d1d8
@ 0xf23c4e
@ 0xf24ae1
@ 0x11c5e0f
@ 0x11c69b9
@ 0x1840569
@ 0x7f2ef82926b9
@ 0x7f2ef7fc841c
{code}
I've done some debugging from the shell and found that the following queries work.
Querying on just the null-filled column:
{code:java}
[:21000] debug> select * from debug where col3 = 0;
Query: select * from debug where col3 = 0
Query submitted at: 2020-02-26 17:07:07 (Coordinator: :25000)
Query progress can be monitored at: :25000/query_plan?query_id=1b44157731b6f5ff:d052c2c600000000
Fetched 0 row(s) in 0.11s
{code}
Querying with an AND on the null-filled column:
{code:java}
[:21000] debug> select * from debug where col2 = 0 and col3 = 0;
Query: select * from debug where col2 = 0 and col3 = 0
Query submitted at: 2020-02-26 17:07:27 (Coordinator: :25000)
Query progress can be monitored at: :25000/query_plan?query_id=334f7fbf2367a558:6ebe4d6100000000
Fetched 0 row(s) in 0.11s
{code}
Casting the null-filled column:
{code:java}
[:21000] debug> select * from debug where col2 = 0 or cast(col3 as int) = 0;
Query: select * from debug where col2 = 0 or cast(col3 as int) = 0
Query submitted at: 2020-02-26 17:08:26 (Coordinator: :25000)
Query progress can be monitored at: :25000/query_plan?query_id=1a4d43d8fc9fc45d:662922b900000000
Fetched 0 row(s) in 0.11s
{code}
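In case it makes reproducing this easier, here are the same statements from the transcripts above collected into one script. Nothing in it is new. The only assumptions are that you run it in an empty scratch database and that you save it under a name of your choosing (for example a hypothetical repro.sql passed to impala-shell -f). The failing query is placed last so the earlier statements are not skipped when it errors out.
{code:sql}
-- Setup: two tables with different partition granularity, and a view over both.
-- The second branch of the union fills the missing partition column with null.
create table debug_with_partition (col1 int) partitioned by (col2 int, col3 int);
create table debug_without_partition (col1 int) partitioned by (col2 int);
create view debug as
  select col1, col2, col3 from debug_with_partition
  union all
  select col1, col2, null from debug_without_partition;

-- These three queries succeed for me on Impala 3.2.0.
select * from debug where col3 = 0;
select * from debug where col2 = 0 and col3 = 0;
select * from debug where col2 = 0 or cast(col3 as int) = 0;

-- This query fails for me on Impala 3.2.0 with "IllegalStateException: null".
select * from debug where col2 = 0 or col3 = 0;
{code}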
Please let me know if there is anything else I can do to help!