Posted to dev@oozie.apache.org by "houman babai (Jira)" <ji...@apache.org> on 2020/07/15 19:03:00 UTC

[jira] [Updated] (OOZIE-3603) Oozie Launcher & Map-Reduce Action Complete Successfully However Oozie Still Fails the Action

     [ https://issues.apache.org/jira/browse/OOZIE-3603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

houman babai updated OOZIE-3603:
--------------------------------
    Description: 
I am using Oozie 5.1.0-cdh6.3.1.

In my workflow I have a MapReduce action that generates over 300 counters. The Oozie launcher and the MapReduce job both complete successfully; however, Oozie reports:
{code}
Error Code: LimitExceededException
LimitExceededException: Too many counters: 121 max=120
{code}

I have updated mapred-site.xml.

The log for the MapReduce job reports success; in fact, I can see all the counters and the actual output of the MapReduce job on HDFS.

In the Oozie launcher log I can see:
 * mapreduce.job.counters.max : 8192 
 * mapreduce.job.counters.groups.max : 100
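
For context, the numbers in the error are what a plain size-versus-limit check would produce. The sketch below is a simplified model written for illustration, not Hadoop's or Oozie's actual code: the names LimitExceededError, check_counters and try_check are hypothetical, and it assumes that the max=120 in the message is the limit in effect wherever the failing check ran, while 8192 is the value the launcher log shows.

```python
class LimitExceededError(Exception):
    """Hypothetical stand-in for the LimitExceededException in the report."""


def check_counters(count: int, limit: int) -> None:
    """Raise if `count` counters exceed `limit` (simplified model only)."""
    if count > limit:
        raise LimitExceededError(f"Too many counters: {count} max={limit}")


def try_check(count: int, limit: int) -> str:
    """Return 'ok' or the error message, so both cases are easy to compare."""
    try:
        check_counters(count, limit)
        return "ok"
    except LimitExceededError as exc:
        return str(exc)


# Limit taken from the error message (assumed to be the one in effect
# wherever the failing check ran).
print(try_check(121, 120))
# Limit shown in the launcher log for mapreduce.job.counters.max.
print(try_check(300, 8192))
```

If this model is right, the failing check ran with a limit of 120 rather than the 8192 seen by the launcher, which would suggest that the two are not reading the same configuration.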

Furthermore, the Oozie launcher log ends with:
{code:java}

--------------------

Submitting Oozie action Map-Reduce job

=======================


<<< Invocation of Main class completed <<<


Oozie Launcher, propagating new Hadoop job id to Oozie
=======================
job_1594765755382_0035
=======================

Oozie Launcher, uploading action data to HDFS sequence file: hdfs://HDFS/user/MY-ID/oozie-oozi/0000012-200714223028181-oozie-oozi-W/ACTION-NAME--map-reduce/action-data.seq
Stopping AM
Callback notification attempts left 0
Callback notification trying http://OOZIE-URL:11000/oozie/callback?id=0000012-200714223028181-oozie-oozi-W@ACTION-NAME&status=RUNNING
Callback notification to http://OOZIE-URL:11000/oozie/callback?id=0000012-200714223028181-oozie-oozi-W@ACTION-NAME&status=RUNNING succeeded
Callback notification succeeded
 {code}

I dug out the following from the Oozie logs:
{code}
114108 2020-07-15 17:57:02,253 TRACE org.apache.oozie.command.wf.ActionEndXCommand: SERVER[hadoopcn1-ers-sinsnap2-nvan.dev-globalrelay.net] USER[MY-NAME] GROUP[-] TOKEN[] APP[APP-NAME] JOB[0000012-200714223028181-oozie-oozi-W] ACTION[0000012-200714223028181-oozie-oozi-W@ACTION-NAME] Precondition check for command [action.end] key [0000012-200714223028181-oozie-oozi-W]
114109 2020-07-15 17:57:02,253 DEBUG org.apache.oozie.command.wf.ActionEndXCommand: SERVER[hadoopcn1-ers-sinsnap2-nvan.dev-globalrelay.net] USER[MY-NAME] GROUP[-] TOKEN[] APP[APP-NAME] JOB[0000012-200714223028181-oozie-oozi-W] ACTION[0000012-200714223028181-oozie-oozi-W@ACTION-NAME] Execute command [action.end] key [0000012-200714223028181-oozie-oozi-W]
114110 2020-07-15 17:57:02,253 DEBUG org.apache.oozie.command.wf.ActionEndXCommand: SERVER[hadoopcn1-ers-sinsnap2-nvan.dev-globalrelay.net] USER[MY-NAME] GROUP[-] TOKEN[] APP[APP-NAME] JOB[0000012-200714223028181-oozie-oozi-W] ACTION[0000012-200714223028181-oozie-oozi-W@ACTION-NAME] STARTED ActionEndXCommand for action 0000012-200714223028181-oozie-oozi-W@ACTION-NAME
114111 2020-07-15 17:57:02,259 DEBUG org.apache.oozie.command.wf.ActionEndXCommand: SERVER[hadoopcn1-ers-sinsnap2-nvan.dev-globalrelay.net] USER[MY-NAME] GROUP[-] TOKEN[] APP[APP-NAME] JOB[0000012-200714223028181-oozie-oozi-W] ACTION[0000012-200714223028181-oozie-oozi-W@ACTION-NAME] End, name [ACTION-NAME] type [map-reduce] status[DONE] external status [SUCCEEDED] signal value [null]
114112 2020-07-15 17:57:02,260 INFO org.apache.oozie.action.hadoop.MapReduceActionExecutor: SERVER[hadoopcn1-ers-sinsnap2-nvan.dev-globalrelay.net] USER[MY-NAME] GROUP[-] TOKEN[] APP[APP-NAME] JOB[0000012-200714223028181-oozie-oozi-W] ACTION[0000012-200714223028181-oozie-oozi-W@ACTION-NAME] Action ended with external status [SUCCEEDED]
114113 2020-07-15 17:57:02,260 DEBUG org.apache.oozie.service.HadoopAccessorService: SERVER[hadoopcn1-ers-sinsnap2-nvan.dev-globalrelay.net] USER[MY-NAME] GROUP[-] TOKEN[] APP[APP-NAME] JOB[0000012-200714223028181-oozie-oozi-W] ACTION[0000012-200714223028181-oozie-oozi-W@ACTION-NAME] Checking if filesystem hdfs is supported
114114 2020-07-15 17:57:02,261 DEBUG org.apache.oozie.service.HadoopAccessorService: SERVER[hadoopcn1-ers-sinsnap2-nvan.dev-globalrelay.net] USER[MY-NAME] GROUP[-] TOKEN[] APP[APP-NAME] JOB[0000012-200714223028181-oozie-oozi-W] ACTION[0000012-200714223028181-oozie-oozi-W@ACTION-NAME] Checking if filesystem hdfs is supported
114115 2020-07-15 17:57:02,340 WARN org.apache.oozie.command.wf.ActionEndXCommand: SERVER[hadoopcn1-ers-sinsnap2-nvan.dev-globalrelay.net] USER[MY-NAME] GROUP[-] TOKEN[] APP[APP-NAME] JOB[0000012-200714223028181-oozie-oozi-W] ACTION[0000012-200714223028181-oozie-oozi-W@ACTION-NAME] Error ending action [ACTION-NAME]. ErrorType [ERROR], ErrorCode [LimitExceededException], Message [LimitExceededException: Too many counters: 121 max=120]
114116 2020-07-15 17:57:02,341 WARN org.apache.oozie.command.wf.ActionEndXCommand: SERVER[hadoopcn1-ers-sinsnap2-nvan.dev-globalrelay.net] USER[MY-NAME] GROUP[-] TOKEN[] APP[APP-NAME] JOB[0000012-200714223028181-oozie-oozi-W] ACTION[0000012-200714223028181-oozie-oozi-W@ACTION-NAME] Setting Action Status to [ERROR]
{code}
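
To find the failing entry in a large log more quickly, the error fields can be pulled out of the WARN line with a regular expression. This is only a sketch: extract_error is a hypothetical helper, the pattern is inferred from the single "Error ending action" line above, and other Oozie versions may format the line differently.

```python
import re
from typing import Dict, Optional

# Pattern inferred from the one WARN line in this report (an assumption,
# not a documented Oozie log format).
ERROR_RE = re.compile(
    r"ErrorType \[(?P<type>[^\]]+)\], "
    r"ErrorCode \[(?P<code>[^\]]+)\], "
    r"Message \[(?P<message>.*)\]\s*$"
)


def extract_error(line: str) -> Optional[Dict[str, str]]:
    """Return the error type, code and message, or None if the line has none."""
    match = ERROR_RE.search(line)
    return match.groupdict() if match else None


# Abbreviated copy of the WARN line from the log above.
sample = (
    "WARN org.apache.oozie.command.wf.ActionEndXCommand: "
    "Error ending action [ACTION-NAME]. ErrorType [ERROR], "
    "ErrorCode [LimitExceededException], "
    "Message [LimitExceededException: Too many counters: 121 max=120]"
)

print(extract_error(sample))
```

Lines without these fields, such as the DEBUG entries, return None, so the helper can be used to filter a whole log file.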
 


> Oozie Launcher & Map-Reduce Action Complete Successfully However Oozie Still Fails the Action
> ---------------------------------------------------------------------------------------------
>
>                 Key: OOZIE-3603
>                 URL: https://issues.apache.org/jira/browse/OOZIE-3603
>             Project: Oozie
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 5.1.0
>         Environment: Oozie Version 5.1.0-CDH6.3.1
>            Reporter: houman babai
>            Priority: Major
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)