Posted to dev@oozie.apache.org by "Raghavi Ravi (JIRA)" <ji...@apache.org> on 2018/02/14 07:00:00 UTC
[jira] [Commented] (OOZIE-3057) Custom Partitioner not working in Oozie Mapreduce action

    [ https://issues.apache.org/jira/browse/OOZIE-3057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16363563#comment-16363563 ] 

Raghavi Ravi commented on OOZIE-3057:
-------------------------------------

[~gezapeti]

Attaching the Oozie logs, the Partitioner class, and workflow.xml. The workflow.xml contains a single MapReduce action that reads RCFiles and writes text files.

[^Logs.zip] [^PonRankPartitioner.java]

> Custom Partitioner not working in Oozie Mapreduce action
> --------------------------------------------------------
>
>                 Key: OOZIE-3057
>                 URL: https://issues.apache.org/jira/browse/OOZIE-3057
>             Project: Oozie
>          Issue Type: Bug
>          Components: action, workflow
>    Affects Versions: 4.1.0
>         Environment: Red Hat Enterprise Linux Server release 7.2 (Maipo)
> Linux version 3.10.0-327.10.1.el7.x86_64 (mockbuild@x86-021.build.eng.bos.redhat.com) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-4) (GCC) ) #1 SMP Sat Jan 23 04:54:55 EST 2016
> oozie version - 4.1.0
> cdh version - 5.10.1
> Hue™ 3.11 - The Hadoop UI
>            Reporter: Raghavi Ravi
>            Priority: Critical
>         Attachments: Logs.zip, PonRankPartitioner.java
>
>
> I implemented secondary sort in MapReduce using the old API (org.apache.hadoop.mapred.*) and am trying to execute it using Oozie (from Hue).
> Although I have set the partitioner class in the properties, the partitioner is not being executed, so I am not getting the expected output.
> The same code runs fine when run with the hadoop command from the CLI.
> Here is my workflow.xml:
> <workflow-app name="MyTriplets" xmlns="uri:oozie:workflow:0.5">
> <start to="mapreduce-598d"/>
> <kill name="Kill">
>     <message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
> </kill>
> <action name="mapreduce-598d">
>     <map-reduce>
>         <job-tracker>${jobTracker}</job-tracker>
>         <name-node>${nameNode}</name-node>
>         <configuration>
>             <property>
>                 <name>mapred.output.dir</name>
>                 <value>/test_1109_3</value>
>             </property>
>             <property>
>                 <name>mapred.input.dir</name>
>                 <value>/apps/hive/warehouse/7360_0609_rx/day=06-09-2017/hour=13/quarter=2/,/apps/hive/warehouse/7360_0609_tx/day=06-09-2017/hour=13/quarter=2/,/apps/hive/warehouse/7360_0509_util/day=05-09-2017/hour=16/quarter=1/</value>
>             </property>
>             <property>
>                 <name>mapred.input.format.class</name>
>                 <value>org.apache.hadoop.hive.ql.io.RCFileInputFormat</value>
>             </property>
>             <property>
>                 <name>mapred.mapper.class</name>
>                 <value>PonRankMapper</value>
>             </property>
>             <property>
>                 <name>mapred.reducer.class</name>
>                 <value>PonRankReducer</value>
>             </property>
>             <property>
>                 <name>mapred.output.value.comparator.class</name>
>                 <value>PonRankGroupingComparator</value>
>             </property>
>             <property>
>                 <name>mapred.mapoutput.key.class</name>
>                 <value>PonRankPair</value>
>             </property>
>             <property>
>                 <name>mapred.mapoutput.value.class</name>
>                 <value>org.apache.hadoop.io.Text</value>
>             </property>
>             <property>
>                 <name>mapred.reduce.output.key.class</name>
>                 <value>org.apache.hadoop.io.NullWritable</value>
>             </property>
>             <property>
>                 <name>mapred.reduce.output.value.class</name>
>                 <value>org.apache.hadoop.io.Text</value>
>             </property>
>             <property>
>                 <name>mapred.reduce.tasks</name>
>                 <value>1</value>
>             </property>
>             <property>
>                 <name>mapred.partitioner.class</name>
>                 <value>PonRankPartitioner</value>
>             </property>
>             <property>
>                 <name>mapred.mapper.new-api</name>
>                 <value>False</value>
>             </property>
>         </configuration>
>     </map-reduce>
>     <ok to="End"/>
>     <error to="Kill"/>
> </action>
> <end name="End"/>
> </workflow-app>
> When running with the hadoop jar command, I set the partitioner class using the JobConf.setPartitionerClass API.
> With the old API, the partitioner is not executed in Oozie, in spite of adding this property:
>             <property>
>                 <name>mapred.partitioner.class</name>
>                 <value>PonRankPartitioner</value>
>             </property>
> I executed the same logic using the new API (org.apache.hadoop.mapreduce) and added the mapreduce.partitioner.class property to the workflow.
> The partitioner was executed and the desired outcome was seen.
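
For readers unfamiliar with the role a partitioner plays in secondary sort, here is a minimal, self-contained sketch of the usual idea: route records by the natural key only, so that all records sharing a natural key reach the same reducer regardless of the secondary field. This is not the attached PonRankPartitioner.java. The class name, the CompositeKey fields, and the hashing scheme are all assumptions made for illustration, and the sketch deliberately avoids Hadoop types so it runs on a plain JDK.

```java
import java.util.Objects;

/** Illustrative only; a hypothetical stand-in, not the attached PonRankPartitioner. */
public final class NaturalKeyPartitionSketch {

    /** Hypothetical composite key: a natural key plus a secondary sort field. */
    public static final class CompositeKey {
        public final String naturalKey;
        public final long rank;

        public CompositeKey(String naturalKey, long rank) {
            this.naturalKey = Objects.requireNonNull(naturalKey, "naturalKey");
            this.rank = rank;
        }
    }

    /**
     * Picks a partition from the natural key alone, ignoring the secondary field.
     * Masking with Integer.MAX_VALUE keeps the hash non-negative before the modulo.
     */
    public static int partitionFor(CompositeKey key, int numPartitions) {
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be positive");
        }
        return (key.naturalKey.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    public static void main(String[] args) {
        CompositeKey a = new CompositeKey("pon-1", 5L);
        CompositeKey b = new CompositeKey("pon-1", 9L);
        // Same natural key, different rank: both land in the same partition.
        System.out.println(partitionFor(a, 4) == partitionFor(b, 4)); // prints "true"
        // With a single partition, every key maps to partition 0.
        System.out.println(partitionFor(a, 1)); // prints "0"
    }
}
```

In a real job, logic of this kind would sit inside a class implementing the Hadoop Partitioner interface for whichever API the job uses. Note that with a single partition, as with mapred.reduce.tasks set to 1 in the workflow above, a function of this shape can only ever return 0.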



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)