Posted to dev@oozie.apache.org by "houman babai (Jira)" <ji...@apache.org> on 2020/07/15 19:03:00 UTC
[jira] [Updated] (OOZIE-3603) Oozie Launcher & Map-Reduce Action
Complete Successfully However Oozie Still Fails the Action
[ https://issues.apache.org/jira/browse/OOZIE-3603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
houman babai updated OOZIE-3603:
--------------------------------
Description:
I am using oozie 5.1.0-cdh6.3.1
In my workflow I have a mapreduce action that generates over 300 counters. The oozie launcher and the mapreduce job both complete successfully; however, oozie reports:
{code}
Error Code: LimitExceededException
LimitExceededException: Too many counters: 121 max=120
{code}
I have updated mapred-site.xml.
The log for the mapreduce job reports success; in fact, I can see all the counters and the actual output of the mapreduce job on hdfs.
In the oozie launcher log I can see:
* mapreduce.job.counters.max : 8192
* mapreduce.job.counters.groups.max : 100
Furthermore, the oozie launcher log ends with:
{code:java}
--------------------
Submitting Oozie action Map-Reduce job
=======================
<<< Invocation of Main class completed <<<
Oozie Launcher, propagating new Hadoop job id to Oozie
=======================
job_1594765755382_0035
=======================
Oozie Launcher, uploading action data to HDFS sequence file: hdfs://HDFS/user/MY-ID/oozie-oozi/0000012-200714223028181-oozie-oozi-W/ACTION-NAME--map-reduce/action-data.seq
Stopping AM
Callback notification attempts left 0
Callback notification trying http://OOZIE-URL:11000/oozie/callback?id=0000012-200714223028181-oozie-oozi-W@ACTION-NAME&status=RUNNING
Callback notification to http://OOZIE-URL:11000/oozie/callback?id=0000012-200714223028181-oozie-oozi-W@ACTION-NAME&status=RUNNING succeeded
Callback notification succeeded
{code}
I dug out the following from the oozie logs:
{code}
114108 2020-07-15 17:57:02,253 TRACE org.apache.oozie.command.wf.ActionEndXCommand: SERVER[hadoopcn1-ers-sinsnap2-nvan.dev-globalrelay.net] USER[MY-NAME] GROUP[-] TOKEN[] APP[APP-NAME] JOB[0000012-200714223028181-oozie-oozi-W] ACTION[0000012-200714223028181-oozie-oozi-W@ACTION-NAME] Precondition check for command [action.end] key [0000012-200714223028181-oozie-oozi-W]
114109 2020-07-15 17:57:02,253 DEBUG org.apache.oozie.command.wf.ActionEndXCommand: SERVER[hadoopcn1-ers-sinsnap2-nvan.dev-globalrelay.net] USER[MY-NAME] GROUP[-] TOKEN[] APP[APP-NAME] JOB[0000012-200714223028181-oozie-oozi-W] ACTION[0000012-200714223028181-oozie-oozi-W@ACTION-NAME] Execute command [action.end] key [0000012-200714223028181-oozie-oozi-W]
114110 2020-07-15 17:57:02,253 DEBUG org.apache.oozie.command.wf.ActionEndXCommand: SERVER[hadoopcn1-ers-sinsnap2-nvan.dev-globalrelay.net] USER[MY-NAME] GROUP[-] TOKEN[] APP[APP-NAME] JOB[0000012-200714223028181-oozie-oozi-W] ACTION[0000012-200714223028181-oozie-oozi-W@ACTION-NAME] STARTED ActionEndXCommand for action 0000012-200714223028181-oozie-oozi-W@ACTION-NAME
114111 2020-07-15 17:57:02,259 DEBUG org.apache.oozie.command.wf.ActionEndXCommand: SERVER[hadoopcn1-ers-sinsnap2-nvan.dev-globalrelay.net] USER[MY-NAME] GROUP[-] TOKEN[] APP[APP-NAME] JOB[0000012-200714223028181-oozie-oozi-W] ACTION[0000012-200714223028181-oozie-oozi-W@ACTION-NAME] End, name [ACTION-NAME] type [map-reduce] status[DONE] external status [SUCCEEDED] signal value [null]
114112 2020-07-15 17:57:02,260 INFO org.apache.oozie.action.hadoop.MapReduceActionExecutor: SERVER[hadoopcn1-ers-sinsnap2-nvan.dev-globalrelay.net] USER[MY-NAME] GROUP[-] TOKEN[] APP[APP-NAME] JOB[0000012-200714223028181-oozie-oozi-W] ACTION[0000012-200714223028181-oozie-oozi-W@ACTION-NAME] Action ended with external status [SUCCEEDED]
114113 2020-07-15 17:57:02,260 DEBUG org.apache.oozie.service.HadoopAccessorService: SERVER[hadoopcn1-ers-sinsnap2-nvan.dev-globalrelay.net] USER[MY-NAME] GROUP[-] TOKEN[] APP[APP-NAME] JOB[0000012-200714223028181-oozie-oozi-W] ACTION[0000012-200714223028181-oozie-oozi-W@ACTION-NAME] Checking if filesystem hdfs is supported
114114 2020-07-15 17:57:02,261 DEBUG org.apache.oozie.service.HadoopAccessorService: SERVER[hadoopcn1-ers-sinsnap2-nvan.dev-globalrelay.net] USER[MY-NAME] GROUP[-] TOKEN[] APP[APP-NAME] JOB[0000012-200714223028181-oozie-oozi-W] ACTION[0000012-200714223028181-oozie-oozi-W@ACTION-NAME] Checking if filesystem hdfs is supported
114115 2020-07-15 17:57:02,340 WARN org.apache.oozie.command.wf.ActionEndXCommand: SERVER[hadoopcn1-ers-sinsnap2-nvan.dev-globalrelay.net] USER[MY-NAME] GROUP[-] TOKEN[] APP[APP-NAME] JOB[0000012-200714223028181-oozie-oozi-W] ACTION[0000012-200714223028181-oozie-oozi-W@ACTION-NAME] Error ending action [ACTION-NAME]. ErrorType [ERROR], ErrorCode [LimitExceededException], Message [LimitExceededException: Too many counters: 121 max=120]
114116 2020-07-15 17:57:02,341 WARN org.apache.oozie.command.wf.ActionEndXCommand: SERVER[hadoopcn1-ers-sinsnap2-nvan.dev-globalrelay.net] USER[MY-NAME] GROUP[-] TOKEN[] APP[APP-NAME] JOB[0000012-200714223028181-oozie-oozi-W] ACTION[0000012-200714223028181-oozie-oozi-W@ACTION-NAME] Setting Action Status to [ERROR]
{code}
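A quick sanity check on the numbers above, as a sketch only: the helper name parse_counter_limit is made up for illustration, and the values are copied from the error message and the launcher log. The max=120 in the exception matches the Hadoop default for mapreduce.job.counters.max rather than the 8192 shown in the launcher log, which suggests (but does not prove) that whichever process raised the exception was not reading the updated setting.

```python
import re

# Values copied from this report; nothing here talks to Oozie or Hadoop.
ERROR_MESSAGE = "LimitExceededException: Too many counters: 121 max=120"
LAUNCHER_LOG_CONF = {
    "mapreduce.job.counters.max": 8192,
    "mapreduce.job.counters.groups.max": 100,
}


def parse_counter_limit(message):
    """Hypothetical helper: return (counters_seen, limit_in_effect) or None."""
    match = re.search(r"Too many counters: (\d+) max=(\d+)", message)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


seen, limit = parse_counter_limit(ERROR_MESSAGE)
configured = LAUNCHER_LOG_CONF["mapreduce.job.counters.max"]

# True when the limit that was enforced differs from the one the launcher logged.
mismatch = limit != configured
print(f"seen={seen} enforced_limit={limit} launcher_limit={configured} mismatch={mismatch}")
```

If the mismatch flag is true, the enforced limit did not come from the configuration the launcher printed, so the next thing to check would be which configuration files the failing component actually loads.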
was:
I am using oozie 5.1.0-cdh6.3.1
In my workflow I have a mapreduce action that generates over 300 counters. The oozie launcher and the mapreduce job both complete successfully; however, oozie reports:
{code}
Error Code: LimitExceededException
LimitExceededException: Too many counters: 121 max=120
{code}
I have updated mapred-site.xml.
The log for the mapreduce job reports success; in fact, I can see all the counters and the actual output of the mapreduce job on hdfs.
In the oozie launcher log I can see:
* mapreduce.job.counters.max : 8192
* mapreduce.job.counters.groups.max : 100
Furthermore, the oozie launcher log ends with:
{code:java}
--------------------
Submitting Oozie action Map-Reduce job
=======================
<<< Invocation of Main class completed <<<
Oozie Launcher, propagating new Hadoop job id to Oozie
=======================
job_1594765755382_0035
=======================
Oozie Launcher, uploading action data to HDFS sequence file: hdfs://HDFS/user/MY-ID/oozie-oozi/0000012-200714223028181-oozie-oozi-W/ACTION-NAME--map-reduce/action-data.seq
Stopping AM
Callback notification attempts left 0
Callback notification trying http://OOZIE-URL:11000/oozie/callback?id=0000012-200714223028181-oozie-oozi-W@ACTION-NAME&status=RUNNING
Callback notification to http://OOZIE-URL:11000/oozie/callback?id=0000012-200714223028181-oozie-oozi-W@ACTION-NAME&status=RUNNING succeeded
Callback notification succeeded
{code}
> Oozie Launcher & Map-Reduce Action Complete Successfully However Oozie Still Fails the Action
> --------------------------------------------------------------------------------------------
>
> Key: OOZIE-3603
> URL: https://issues.apache.org/jira/browse/OOZIE-3603
> Project: Oozie
> Issue Type: Bug
> Components: core
> Affects Versions: 5.1.0
> Environment: Oozie Version 5.1.0-CDH6.3.1
> Reporter: houman babai
> Priority: Major
>
> I am using oozie 5.1.0-cdh6.3.1
> In my workflow I have a mapreduce action that generates over 300 counters. The oozie launcher and the mapreduce job both complete successfully; however, oozie reports:
> {code}
> Error Code: LimitExceededException
> LimitExceededException: Too many counters: 121 max=120
> {code}
> I have updated mapred-site.xml.
> The log for the mapreduce job reports success; in fact, I can see all the counters and the actual output of the mapreduce job on hdfs.
> In the oozie launcher log I can see:
> * mapreduce.job.counters.max : 8192
> * mapreduce.job.counters.groups.max : 100
> Furthermore, the oozie launcher log ends with:
> {code:java}
> --------------------
> Submitting Oozie action Map-Reduce job
> =======================
> <<< Invocation of Main class completed <<<
> Oozie Launcher, propagating new Hadoop job id to Oozie
> =======================
> job_1594765755382_0035
> =======================
> Oozie Launcher, uploading action data to HDFS sequence file: hdfs://HDFS/user/MY-ID/oozie-oozi/0000012-200714223028181-oozie-oozi-W/ACTION-NAME--map-reduce/action-data.seq
> Stopping AM
> Callback notification attempts left 0
> Callback notification trying http://OOZIE-URL:11000/oozie/callback?id=0000012-200714223028181-oozie-oozi-W@ACTION-NAME&status=RUNNING
> Callback notification to http://OOZIE-URL:11000/oozie/callback?id=0000012-200714223028181-oozie-oozi-W@ACTION-NAME&status=RUNNING succeeded
> Callback notification succeeded
> {code}
> I dug out the following from the oozie logs:
> {code}
> 114108 2020-07-15 17:57:02,253 TRACE org.apache.oozie.command.wf.ActionEndXCommand: SERVER[hadoopcn1-ers-sinsnap2-nvan.dev-globalrelay.net] USER[MY-NAME] GROUP[-] TOKEN[] APP[APP-NAME] JOB[0000012-200714223028181-oozie-oozi-W] ACTION[0000012-200714223028181-oozie-oozi-W@ACTION-NAME] Precondition check for command [action.end] key [0000012-200714223028181-oozie-oozi-W]
> 114109 2020-07-15 17:57:02,253 DEBUG org.apache.oozie.command.wf.ActionEndXCommand: SERVER[hadoopcn1-ers-sinsnap2-nvan.dev-globalrelay.net] USER[MY-NAME] GROUP[-] TOKEN[] APP[APP-NAME] JOB[0000012-200714223028181-oozie-oozi-W] ACTION[0000012-200714223028181-oozie-oozi-W@ACTION-NAME] Execute command [action.end] key [0000012-200714223028181-oozie-oozi-W]
> 114110 2020-07-15 17:57:02,253 DEBUG org.apache.oozie.command.wf.ActionEndXCommand: SERVER[hadoopcn1-ers-sinsnap2-nvan.dev-globalrelay.net] USER[MY-NAME] GROUP[-] TOKEN[] APP[APP-NAME] JOB[0000012-200714223028181-oozie-oozi-W] ACTION[0000012-200714223028181-oozie-oozi-W@ACTION-NAME] STARTED ActionEndXCommand for action 0000012-200714223028181-oozie-oozi-W@ACTION-NAME
> 114111 2020-07-15 17:57:02,259 DEBUG org.apache.oozie.command.wf.ActionEndXCommand: SERVER[hadoopcn1-ers-sinsnap2-nvan.dev-globalrelay.net] USER[MY-NAME] GROUP[-] TOKEN[] APP[APP-NAME] JOB[0000012-200714223028181-oozie-oozi-W] ACTION[0000012-200714223028181-oozie-oozi-W@ACTION-NAME] End, name [ACTION-NAME] type [map-reduce] status[DONE] external status [SUCCEEDED] signal value [null]
> 114112 2020-07-15 17:57:02,260 INFO org.apache.oozie.action.hadoop.MapReduceActionExecutor: SERVER[hadoopcn1-ers-sinsnap2-nvan.dev-globalrelay.net] USER[MY-NAME] GROUP[-] TOKEN[] APP[APP-NAME] JOB[0000012-200714223028181-oozie-oozi-W] ACTION[0000012-200714223028181-oozie-oozi-W@ACTION-NAME] Action ended with external status [SUCCEEDED]
> 114113 2020-07-15 17:57:02,260 DEBUG org.apache.oozie.service.HadoopAccessorService: SERVER[hadoopcn1-ers-sinsnap2-nvan.dev-globalrelay.net] USER[MY-NAME] GROUP[-] TOKEN[] APP[APP-NAME] JOB[0000012-200714223028181-oozie-oozi-W] ACTION[0000012-200714223028181-oozie-oozi-W@ACTION-NAME] Checking if filesystem hdfs is supported
> 114114 2020-07-15 17:57:02,261 DEBUG org.apache.oozie.service.HadoopAccessorService: SERVER[hadoopcn1-ers-sinsnap2-nvan.dev-globalrelay.net] USER[MY-NAME] GROUP[-] TOKEN[] APP[APP-NAME] JOB[0000012-200714223028181-oozie-oozi-W] ACTION[0000012-200714223028181-oozie-oozi-W@ACTION-NAME] Checking if filesystem hdfs is supported
> 114115 2020-07-15 17:57:02,340 WARN org.apache.oozie.command.wf.ActionEndXCommand: SERVER[hadoopcn1-ers-sinsnap2-nvan.dev-globalrelay.net] USER[MY-NAME] GROUP[-] TOKEN[] APP[APP-NAME] JOB[0000012-200714223028181-oozie-oozi-W] ACTION[0000012-200714223028181-oozie-oozi-W@ACTION-NAME] Error ending action [ACTION-NAME]. ErrorType [ERROR], ErrorCode [LimitExceededException], Message [LimitExceededException: Too many counters: 121 max=120]
> 114116 2020-07-15 17:57:02,341 WARN org.apache.oozie.command.wf.ActionEndXCommand: SERVER[hadoopcn1-ers-sinsnap2-nvan.dev-globalrelay.net] USER[MY-NAME] GROUP[-] TOKEN[] APP[APP-NAME] JOB[0000012-200714223028181-oozie-oozi-W] ACTION[0000012-200714223028181-oozie-oozi-W@ACTION-NAME] Setting Action Status to [ERROR]
> {code}
>