Posted to dev@hive.apache.org by wan kun <31...@qq.com> on 2017/03/09 05:03:52 UTC
Re: Review Request 57444: Get the right parent schema when splitting
operator plan into MR jobs
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/57444/
-----------------------------------------------------------
(Updated March 9, 2017, 5:03 a.m.)
Review request for hive, Yongqiang He and namit jain.
Bugs: HIVE-15944
https://issues.apache.org/jira/browse/HIVE-15944
Repository: hive-git
Description
-------
Get the right parent schema when splitting operator plan into MR jobs
Diffs
-----
ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 24d1681
ql/src/test/queries/clientpositive/with_column_pruner.q PRE-CREATION
ql/src/test/results/clientpositive/with_column_pruner.q.out PRE-CREATION
Diff: https://reviews.apache.org/r/57444/diff/1/
Testing
-------
In the semantic analysis stage, the plan tree is:
````
TS[0]-FIL[17]-SEL[2]-LIM[3]-RS[4] -SEL[5]-LIM[6]-RS[11] -JOIN[14]-SEL[15]-FS[16]
TS[7]-FIL[18]-SEL[9]-RS[13] -JOIN[14]
````
However, when the plan is compiled into MR jobs, Hive adds FIL[19] and TS[20] between the LIM[6] and RS[11] operators.
FIL[19] takes its schema from LIM[6], but the schema on LIM[6] does not list the right output columns.
I see two ways to solve this problem:
1. When generating the MR jobs, use an if condition to get the right file sink schema.
2. Update the LIM operator's schema during the semantic optimization stage.
For now I have tried to fix the bug the first way.
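To illustrate the idea behind the first approach, here is a minimal toy sketch in Java. It does not use Hive's real classes and is not the actual patch: `Op`, `schemaForBoundary`, and the column names are hypothetical, and the rule "skip a LIM parent and take the schema from its nearest non-LIM ancestor" is only an assumption about what such an if condition might check.

```java
import java.util.Arrays;
import java.util.List;

public class ParentSchemaSketch {

    /** Toy operator with a name, an output schema, and one parent. Not Hive's Operator class. */
    static final class Op {
        final String name;
        final List<String> schema;
        final Op parent;

        Op(String name, List<String> schema, Op parent) {
            this.name = name;
            this.schema = schema;
            this.parent = parent;
        }
    }

    /**
     * Picks the schema for an operator inserted below {@code parent} at a job boundary.
     * Assumption: a LIM operator may carry a stale schema, so walk up past it.
     */
    static List<String> schemaForBoundary(Op parent) {
        Op current = parent;
        while (current.name.startsWith("LIM") && current.parent != null) {
            current = current.parent;
        }
        return current.schema;
    }

    public static void main(String[] args) {
        // Hypothetical columns: SEL[5] outputs only "key", LIM[6] still lists the old columns.
        Op sel5 = new Op("SEL[5]", Arrays.asList("key"), null);
        Op lim6 = new Op("LIM[6]", Arrays.asList("key", "value"), sel5);
        System.out.println(schemaForBoundary(lim6));
    }
}
```

In this toy model the operator added after LIM[6] would receive the schema of SEL[5] instead of the stale one held by LIM[6]. The second approach would instead correct the schema stored on LIM[6] itself, so no special case would be needed at the split point.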
Thanks,
wan kun