Posted to user@hive.apache.org by canan chen <cc...@gmail.com> on 2015/03/13 04:19:09 UTC

Question on hive query correlation optimization

I ran the following SQL with the MR engine and found that it launches 3 MR
jobs. As I understand it, the join and the group by could be done in the
same MR job, since they use the same key (s2.name). So I am not sure why
there are still 3 MR jobs. Does anyone know why? Thanks.



select s2.name, count(1) as cnt
from student_s2 s2
join student_s3 s3 on s2.name = s3.name
group by s2.name
order by cnt
limit 3;
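
One way to see how the query is split into stages is to run it under EXPLAIN. The sketch below assumes that the hive.optimize.correlation setting (Hive's correlation optimizer switch, which is off by default) is what controls merging the join and the group by. Whether it actually reduces the job count for this query is not confirmed here.

```sql
-- Assumption: the correlation optimizer is disabled by default
-- and has to be enabled explicitly for the session.
set hive.optimize.correlation=true;

-- Print the stage plan; each map-reduce stage listed in the
-- output corresponds to one MR job.
explain
select s2.name, count(1) as cnt
from student_s2 s2
join student_s3 s3 on s2.name = s3.name
group by s2.name
order by cnt
limit 3;
```

The order by sorts on cnt, a different key from the one shared by the join and the group by, so it would be expected to need its own job in any case.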