Posted to user@hive.apache.org by Mahender Sarangam <Ma...@outlook.com> on 2016/12/23 15:40:24 UTC

predicate push down on hive join.

Hi,

We are joining large tables, followed by a couple of LEFT JOINs of 3-4 tables against the result of the large-table join. Our question: is it better to keep a predicate in the JOIN ON condition or in the WHERE clause? I was going through the Apache site and found the content below.

I couldn't understand the meaning of "Pushed" and "Not Pushed". Can anyone shed some light on it?

[Inline image from the Apache site; not included in this plain text version]


  *   Another question: in the case of an inner join, is it better to keep the predicate as part of the JOIN ON condition or in the WHERE clause? We have seen that when we do the join and put the predicate in the WHERE clause, the query takes too much time. When we move the predicate logic to the JOIN ON condition, it executes fast. Both tables are large. Is this expected? Below is our table join condition, in both forms:

Table2Detail T2
JOIN Table1Summary T1
ON T2.Nbr = T1.Nbr
AND T2.Year = T1.Year
AND T2.Month = T1.Month
AND T1.Col1 = T2.Col1
AND T2.Col2 = T1.Col2
AND T1.Col2 = 'XYZ'
AND T2.Col2 = 'XYZ'


or

Table2Detail T2
JOIN Table1Summary T1
ON T2.Nbr = T1.Nbr
AND T2.Year = T1.Year
AND T2.Month = T1.Month
AND T1.Col1 = T2.Col1
AND T2.Col2 = T1.Col2

WHERE T1.Col2 = 'XYZ' AND T2.Col2 = 'XYZ'
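
For reference, one way to compare the two forms is to look at the plans Hive generates with EXPLAIN. This is only a sketch: the SELECT COUNT(*) wrapper is an assumption added to turn the fragments above into complete queries, and it assumes both tables already exist with the columns used in the join. If predicate pushdown applies to both forms, we would expect the filter on Col2 = 'XYZ' to show up next to each table scan, before the join, in both plans.

```sql
-- hive.optimize.ppd controls predicate pushdown; setting it explicitly
-- here only makes the assumption visible.
SET hive.optimize.ppd=true;

-- Form 1: filters kept inside the JOIN ON condition.
-- SELECT COUNT(*) is a hypothetical wrapper, not our real select list.
EXPLAIN
SELECT COUNT(*)
FROM Table2Detail T2
JOIN Table1Summary T1
  ON  T2.Nbr   = T1.Nbr
  AND T2.Year  = T1.Year
  AND T2.Month = T1.Month
  AND T1.Col1  = T2.Col1
  AND T2.Col2  = T1.Col2
  AND T1.Col2  = 'XYZ'
  AND T2.Col2  = 'XYZ';

-- Form 2: same join keys, filters moved to the WHERE clause.
EXPLAIN
SELECT COUNT(*)
FROM Table2Detail T2
JOIN Table1Summary T1
  ON  T2.Nbr   = T1.Nbr
  AND T2.Year  = T1.Year
  AND T2.Month = T1.Month
  AND T1.Col1  = T2.Col1
  AND T2.Col2  = T1.Col2
WHERE T1.Col2 = 'XYZ'
  AND T2.Col2 = 'XYZ';
```

Comparing where the filter on Col2 sits relative to the join in each plan should show whether the WHERE version is really filtering after the join.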