You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hive.apache.org by "Hui Huang (JIRA)" <ji...@apache.org> on 2018/08/03 09:22:00 UTC
[jira] [Created] (HIVE-20304) When hive.optimize.skewjoin and
hive.auto.convert.join are both set to true, and the execution engine is
mr, same stage may launch twice due to the wrong generated plan
Hui Huang created HIVE-20304:
--------------------------------
Summary: When hive.optimize.skewjoin and hive.auto.convert.join are both set to true, and the execution engine is mr, same stage may launch twice due to the wrong generated plan
Key: HIVE-20304
URL: https://issues.apache.org/jira/browse/HIVE-20304
Project: Hive
Issue Type: Bug
Components: CLI
Affects Versions: 1.2.1, 2.3.3
Reporter: Hui Huang
Assignee: Hui Huang
When hive.optimize.skewjoin and hive.auto.convert.join are both set to true, and the execution engine is set to mr, same stage may launch twice due to the wrong generated plan. If hive.exec.parallel is also true, the job will failed due to the first completed stage clear the map.xml/reduce.xml.
use following sql to reproduce the issue:
{code:java}
CREATE TABLE `tbl1`(
`fence` string);
CREATE TABLE `tbl2`(
`order_id` string,
`phone` string,
`search_id` string
)
PARTITIONED BY (
`dt` string);
CREATE TABLE `tbl3`(
`order_id` string,
`platform` string)
PARTITIONED BY (
`dt` string);
CREATE TABLE `tbl4`(
`groupname` string,
`phone` string)
PARTITIONED BY (
`dt` string);
CREATE TABLE `tbl5`(
`search_id` string,
`fence` string)
PARTITIONED BY (
`dt` string);
SET hive.exec.parallel = TRUE;
SET hive.auto.convert.join = TRUE;
SET hive.optimize.skewjoin = TRUE;
SELECT dt,
platform,
groupname,
count(1) as cnt
FROM
(SELECT dt,
platform,
groupname
FROM
(SELECT fence
FROM tbl1)ta
JOIN
(SELECT a0.dt,
a1.platform,
a2.groupname,
a3.fence
FROM
(SELECT dt,
order_id,
phone,
search_id
FROM tbl2
WHERE dt =20180703 )a0
JOIN
(SELECT order_id,
platform,
dt
FROM tbl3
WHERE dt =20180703 )a1 ON a0.order_id = a1.order_id
INNER JOIN
(SELECT groupname,
phone,
dt
FROM tbl4
WHERE dt =20180703 )a2 ON a0.phone = a2.phone
LEFT JOIN
(SELECT search_id,
fence,
dt
FROM tbl5
WHERE dt =20180703)a3 ON a0.search_id = a3.search_id)t0 ON ta.fence = t0.fence)t11
GROUP BY dt,
platform,
groupname;
DROP TABLE tbl1;
DROP TABLE tbl2;
DROP TABLE tbl3;
DROP TABLE tbl4;
DROP TABLE tbl5;
{code}
you will get error message like this:
Examining task ID: task_1531284442065_3637_m_000000 (and more) from job job_1531284442065_3637
Task with the most failures(4):
-----
Task ID:
task_1531284442065_3637_m_000000
URL:
http://0.0.0.0:8088/taskdetails.jsp?jobid=job_1531284442065_3637&tipid=task_1531284442065_3637_m_000000
-----
Diagnostic Messages for this Task:
File does not exist: hdfs://test/tmp/hive-hadoop/hadoop/fe5efa94-abb1-420f-b6ba-ec782e7b79ad/hive_2018-08-03_17-00-17_707_592882314975289971-5/-mr-10045/757eb1f7-7a37-4a7e-abc0-4a3b8b06510c/reduce.xml
java.io.FileNotFoundException: File does not exist: hdfs://test/tmp/hive-hadoop/hadoop/fe5efa94-abb1-420f-b6ba-ec782e7b79ad/hive_2018-08-03_17-00-17_707_592882314975289971-5/-mr-10045/757eb1f7-7a37-4a7e-abc0-4a3b8b06510c/reduce.xml
When I check the plan by executing explain, I found that the Stage-4 and Stage-5 can reached from multi root tasks, it is the reason to this issue.
{code:java}
Explain
STAGE DEPENDENCIES:
Stage-21 is a root stage , consists of Stage-34, Stage-5
Stage-34 has a backup stage: Stage-5
Stage-20 depends on stages: Stage-34
Stage-17 depends on stages: Stage-5, Stage-18, Stage-20 , consists of Stage-32, Stage-33, Stage-1
Stage-32 has a backup stage: Stage-1
Stage-15 depends on stages: Stage-32
Stage-10 depends on stages: Stage-1, Stage-15, Stage-16 , consists of Stage-31, Stage-2
Stage-31
Stage-9 depends on stages: Stage-31
Stage-2 depends on stages: Stage-9
Stage-33 has a backup stage: Stage-1
Stage-16 depends on stages: Stage-33
Stage-1
Stage-5
Stage-27 is a root stage , consists of Stage-37, Stage-38, Stage-4
Stage-37 has a backup stage: Stage-4
Stage-25 depends on stages: Stage-37
Stage-12 depends on stages: Stage-4, Stage-22, Stage-23, Stage-25, Stage-26 , consists of Stage-36, Stage-5
Stage-36
Stage-11 depends on stages: Stage-36
Stage-19 depends on stages: Stage-11 , consists of Stage-35, Stage-5
Stage-35 has a backup stage: Stage-5
Stage-18 depends on stages: Stage-35
Stage-38 has a backup stage: Stage-4
Stage-26 depends on stages: Stage-38
Stage-4
Stage-30 is a root stage , consists of Stage-42, Stage-43, Stage-3
Stage-42 has a backup stage: Stage-3
Stage-28 depends on stages: Stage-42
Stage-14 depends on stages: Stage-3, Stage-28, Stage-29 , consists of Stage-41, Stage-4
Stage-41
Stage-13 depends on stages: Stage-41
Stage-24 depends on stages: Stage-13 , consists of Stage-39, Stage-40, Stage-4
Stage-39 has a backup stage: Stage-4
Stage-22 depends on stages: Stage-39
Stage-40 has a backup stage: Stage-4
Stage-23 depends on stages: Stage-40
Stage-43 has a backup stage: Stage-3
Stage-29 depends on stages: Stage-43
Stage-3
Stage-0 depends on stages: Stage-2
{code}
workaround: do not set hive.optimize.skewjoin and hive.auto.convert.join to true at the same time.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)