Posted to user@pig.apache.org by Marko Kaar <Ma...@nortal.com> on 2016/08/12 09:16:40 UTC

Hive parameters used by Pig

Hello,

In our current project we have an Oozie workflow with Pig actions that write entries to Hive after data propagation. This requires Hive properties to be defined in the workflow. Currently we reference a copy of Hive's configuration XML file, containing all possible Hive properties, through the <job-xml> element. Our Oozie workflow XML looks roughly like this (pseudocode):
<workflow-app ...>
    ...
    <action ...>
        <pig>
            <job-tracker/>
            <name-node/>
            <job-xml>/workflows/.../hive-conf.xml</job-xml>
            <configuration>
                <property/>
                <property/>
                <property/>
            </configuration>
            <script>/pigscript.pig</script>
            <argument/>
            <argument/>
            <argument/>
        </pig>
        <ok to/>
        <error to/>
    </action>
    ...
</workflow-app>

The hive-conf.xml contains essentially every possible Hive property, from authentication properties to file footer inclusion and so on. So the question is: which configuration parameters does Pig actually use to communicate with Hive, and which of them should we include in our <configuration> element to make the workflow a bit cleaner?
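
To illustrate the kind of trimmed-down configuration we are hoping for, here is a rough sketch. It assumes the Pig script reaches Hive through HCatalog and the Hive metastore, which may not match how the properties are actually consumed. The host name, port, and principal are placeholders, and the choice of properties is a guess rather than a confirmed minimal set.

```xml
<!-- Hypothetical trimmed <configuration> for the Pig action. -->
<!-- Assumption: only metastore connection settings are needed. -->
<configuration>
    <property>
        <!-- Thrift endpoint of the Hive metastore (placeholder host and port) -->
        <name>hive.metastore.uris</name>
        <value>thrift://metastore.example.com:9083</value>
    </property>
    <property>
        <!-- Only relevant if the metastore is secured with Kerberos -->
        <name>hive.metastore.sasl.enabled</name>
        <value>true</value>
    </property>
    <property>
        <!-- Placeholder principal; only relevant on a Kerberized cluster -->
        <name>hive.metastore.kerberos.principal</name>
        <value>hive/_HOST@EXAMPLE.COM</value>
    </property>
</configuration>
```

If something like this turned out to be sufficient, the <job-xml> reference to the full hive-conf.xml could presumably be dropped.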

Thanks in advance,
Marko