Posted to user@pig.apache.org by Adam Silberstein <ad...@trifacta.com> on 2014/06/27 23:50:46 UTC

running pig against yarn cluster

Hi,
I am running Pig against a CDH5 YARN cluster. I am not running Pig from the command line; instead I run it programmatically from Java code.

The server I am running it from does NOT have Pig, Hadoop, etc. installed on it, so there is no mapred-site.xml, yarn-site.xml, etc. All configuration is done through a set of properties set at launch time from Java.

I need some help figuring out what properties to set.  Currently I have these properties set as pig properties:
yarn.resourcemanager.hostname
yarn.resourcemanager.address
mapreduce.framework.name
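
To make the setup concrete, here is a minimal sketch of how properties like these might be assembled in Java before being handed to Pig. The class name, method name, host name, and property values are assumptions for illustration only (8032 is the stock Hadoop default port for yarn.resourcemanager.address), not the exact code in use.

```java
import java.util.Properties;

// Hypothetical helper: the class name, method name, and host are illustrative only.
final class YarnPigProps {
    // Stock Hadoop 2 default port for yarn.resourcemanager.address (assumed unchanged).
    static final int RM_PORT = 8032;

    // Builds the three properties listed above for a given ResourceManager host.
    static Properties build(String rmHost) {
        Properties props = new Properties();
        props.setProperty("yarn.resourcemanager.hostname", rmHost);
        props.setProperty("yarn.resourcemanager.address", rmHost + ":" + RM_PORT);
        // Assumed value: "yarn" selects the YARN runtime rather than the local runner.
        props.setProperty("mapreduce.framework.name", "yarn");
        return props;
    }

    public static void main(String[] args) {
        // "rm.example.com" is a placeholder host, not a real cluster.
        Properties props = build("rm.example.com");
        // The properties would then be passed to Pig at launch time, for example via
        // new PigServer(ExecType.MAPREDUCE, props); that step is an assumption and is
        // omitted here because it needs the Pig jars on the classpath.
        props.list(System.out);
    }
}
```

This only covers the three properties named above; whether they are sufficient is exactly the open question.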

When I run like this from Java, Pig doesn’t actually fail but falls back to running in local mode. Of course I don’t want that; it just made it harder to spot what had happened.


High-level summary of the question: what Pig properties should I set to run against an external YARN cluster when I cannot rely on any local Hadoop config that Pig might normally pick up?

Thanks!
Adam