Posted to user@hive.apache.org by Harsha HN <99...@gmail.com> on 2015/04/15 15:24:19 UTC

Question on MAPJOIN vs. JOIN performance

Hi All,



I went through the Facebook engineering page linked below:

https://www.facebook.com/notes/facebook-engineering/join-optimization-in-apache-hive/470667928919



I set the following to enable automatic conversion of joins:

set hive.auto.convert.join=true;

set hive.mapjoin.smalltable.filesize=1000000000;    (1GB)



I observed that some queries ran 2x faster with map join than with
common join, but I also saw cases where map join was 3x slower than
common join.
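
For reference, this is roughly how the two join strategies can be compared on the same query. It is only a sketch: the tables big_fact and small_dim and their columns are hypothetical placeholders, not my actual schema.

```sql
-- Hypothetical tables: big_fact (large) and small_dim (small).

-- Plan with automatic map join conversion enabled.
set hive.auto.convert.join=true;
set hive.mapjoin.smalltable.filesize=1000000000;

EXPLAIN
SELECT f.id, d.name
FROM big_fact f
JOIN small_dim d ON f.dim_id = d.id;

-- Plan with conversion disabled, which forces the common (reduce-side) join.
set hive.auto.convert.join=false;

EXPLAIN
SELECT f.id, d.name
FROM big_fact f
JOIN small_dim d ON f.dim_id = d.id;
```

With conversion enabled, the plan should contain a Map Join Operator if the small table is under the size threshold. With it disabled, the join should appear in the reduce stage instead.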



Any thoughts on what might be slowing down map join in some cases?



I have a 40-node cluster, so there is plenty of RAM available.



Thanks,

Harsha