
Posted to dev@hive.apache.org by wan kun <31...@qq.com> on 2017/03/09 05:03:52 UTC

Re: Review Request 57444: Get the right parent schema when splitting operator plan into MR jobs

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/57444/
-----------------------------------------------------------

(Updated March 9, 2017, 5:03 a.m.)


Review request for hive, Yongqiang He and namit jain.


Bugs: HIVE-15944
    https://issues.apache.org/jira/browse/HIVE-15944


Repository: hive-git


Description
-------

Get the right parent schema when splitting the operator plan into MR jobs


Diffs
-----

  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 24d1681 
  ql/src/test/queries/clientpositive/with_column_pruner.q PRE-CREATION 
  ql/src/test/results/clientpositive/with_column_pruner.q.out PRE-CREATION 


Diff: https://reviews.apache.org/r/57444/diff/1/


Testing
-------

In the semantic analysis stage, the plan tree is:
````
TS[0]-FIL[17]-SEL[2]-LIM[3]-RS[4] -SEL[5]-LIM[6]-RS[11] -JOIN[14]-SEL[15]-FS[16]
TS[7]-FIL[18]-SEL[9]-RS[13] -JOIN[14]
````
But when the plan is compiled into MR jobs, Hive adds FIL[19] and TS[20] between the LIM[6] and RS[11] operators.
FIL[19] takes its schema from LIM[6], but the schema on LIM[6] does not hold the right output columns.
I see two ways to solve this problem:
1. When generating the MR jobs, use an if condition to get the right file sink schema.
2. Update the LIM operator's schema during semantic optimization.
For now I have tried to fix this bug the first way.
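
As a rough illustration of the first approach, here is a toy sketch in Java. It is not the actual change to GenMapRedUtils.java and does not use Hive's operator classes: the Op class, the schemaForSplit method, and the column names "key" and "value" are all made up for this example. It only shows the idea of checking the parent operator's type at split time and, if it is a LIM, taking the schema from the nearest non-LIM ancestor instead.

```java
import java.util.Arrays;
import java.util.List;

// Toy model only: this is NOT Hive's Operator/RowSchema API.
final class Op {
    final String name;          // label such as "LIM[6]"
    final String type;          // operator kind such as "LIM" or "SEL"
    final List<String> schema;  // output column names recorded on the operator
    final Op parent;            // single parent, null for the root

    Op(String name, String type, List<String> schema, Op parent) {
        this.name = name;
        this.type = type;
        this.schema = schema;
        this.parent = parent;
    }
}

final class ParentSchemaSketch {
    // Hypothetical helper: choose the schema to use when the plan is split
    // after 'parent'. A LIM operator's recorded schema is assumed to be
    // possibly stale, so walk up to the nearest non-LIM ancestor.
    static List<String> schemaForSplit(Op parent) {
        Op cur = parent;
        while ("LIM".equals(cur.type) && cur.parent != null) {
            cur = cur.parent;
        }
        return cur.schema;
    }

    public static void main(String[] args) {
        // SEL[5] outputs only "key"; LIM[6] still records both columns.
        Op sel5 = new Op("SEL[5]", "SEL", Arrays.asList("key"), null);
        Op lim6 = new Op("LIM[6]", "LIM", Arrays.asList("key", "value"), sel5);
        System.out.println(schemaForSplit(lim6)); // prints [key]
    }
}
```

The second approach would instead correct the schema stored on the LIM operator itself, so that no special case is needed at split time.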


Thanks,

wan kun