Posted to issues@tez.apache.org by "Jeff Zhang (JIRA)" <ji...@apache.org> on 2015/06/01 08:42:17 UTC

[jira] [Issue Comment Deleted] (TEZ-2507) mapreduce.{map|reduce}.java.opts should override tez.task.launch.cmd-opts

     [ https://issues.apache.org/jira/browse/TEZ-2507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeff Zhang updated TEZ-2507:
----------------------------
    Comment: was deleted

(was: 2 things need to be done for this issue:
* Make tez.task.launch.cmd-opts a vertex-level configuration. Currently it is a DAG-level configuration.
* Override tez.task.launch.cmd-opts when the user has set mapreduce.{map|reduce}.java.opts while translating a MapReduce job to a Tez DAG. )
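The precedence described in the deleted comment could be sketched as follows. This is a minimal illustration of the intended override, not actual Tez translation code; the helper name `resolve_launch_opts` and the plain-dict configuration are assumptions for the sketch.

```python
# Sketch of the intended precedence when translating an MR job to a Tez DAG:
# a user-set mapreduce.{map|reduce}.java.opts should win over the generic
# tez.task.launch.cmd-opts default. Hypothetical helper, not Tez's real code.

def resolve_launch_opts(conf, vertex_type):
    """Pick the JVM opts for a 'map' or 'reduce' vertex."""
    mr_key = {
        "map": "mapreduce.map.java.opts",
        "reduce": "mapreduce.reduce.java.opts",
    }[vertex_type]
    # A user-set MR property takes precedence over the Tez-wide default.
    mr_opts = conf.get(mr_key)
    if mr_opts:
        return mr_opts
    return conf.get("tez.task.launch.cmd-opts", "")

conf = {
    "tez.task.launch.cmd-opts": "-XX:+UseParallelGC -Xmx512m",
    "mapreduce.map.java.opts":
        "-Djava.net.preferIPv4Stack=true -XX:+UseG1GC -Xmx825955249",
}
print(resolve_launch_opts(conf, "map"))     # user-set MR value wins
print(resolve_launch_opts(conf, "reduce"))  # falls back to the Tez default
```

With this precedence the user's opts replace, rather than get appended to, the Tez default, so no conflicting GC flags end up on one command line.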

> mapreduce.{map|reduce}.java.opts should override tez.task.launch.cmd-opts
> -------------------------------------------------------------------------
>
>                 Key: TEZ-2507
>                 URL: https://issues.apache.org/jira/browse/TEZ-2507
>             Project: Apache Tez
>          Issue Type: Improvement
>            Reporter: Jeff Zhang
>            Assignee: Jeff Zhang
>
> Otherwise JVM option conflicts may occur. Here's the issue reported by [~r7raul]:
> {noformat}
>  I changed the value of mapreduce.map.java.opts from -Djava.net.preferIPv4Stack=true -Xmx825955249 to -Djava.net.preferIPv4Stack=true -XX:+UseG1GC -Xmx825955249
> Then I ran a query with Hive 1.1.0 + Tez 0.5.3 on Hadoop 2.5.0:
> set mapreduce.framework.name=yarn-tez; 
> set hive.execution.engine=tez; 
> select userid,count(*) from u_data group by userid order by userid;
> The query returned an error. In the logs I found:
> 2015-05-29 16:02:39,064 WARN [AsyncDispatcher event handler] container.AMContainerImpl: Container container_1432885077153_0004_01_000005 finished with diagnostics set to [Container failed. Exception from container-launch. 
> Container id: container_1432885077153_0004_01_000005 
> Exit code: 1 
> Stack trace: ExitCodeException exitCode=1: 
> at org.apache.hadoop.util.Shell.runCommand(Shell.java:538) 
> at org.apache.hadoop.util.Shell.run(Shell.java:455) 
> at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:702) 
> at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:196) 
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:299) 
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81) 
> at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
> at java.lang.Thread.run(Thread.java:745) 
> But when I try the same query on MR, it succeeds:
> hive> set hive.execution.engine=mr; 
> hive> set mapreduce.framework.name=yarn; 
> hive> select userid,count(*) from u_data group by userid order by userid limit 1; 
> Query ID = hdfs_20150529160606_d550bca4-0341-4eb0-aace-a9018bfbb7a9 
> Total jobs = 2 
> Launching Job 1 out of 2 
> Number of reduce tasks not specified. Estimated from input data size: 1 
> In order to change the average load for a reducer (in bytes): 
> set hive.exec.reducers.bytes.per.reducer=<number> 
> In order to limit the maximum number of reducers: 
> set hive.exec.reducers.max=<number> 
> In order to set a constant number of reducers: 
> set mapreduce.job.reduces=<number> 
> Starting Job = job_1432885077153_0005, Tracking URL = http://localhost:8088/proxy/application_1432885077153_0005/ 
> Kill Command = /opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hadoop/bin/hadoop job -kill job_1432885077153_0005 
> Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1 
> 2015-05-29 16:06:34,863 Stage-1 map = 0%, reduce = 0% 
> 2015-05-29 16:06:40,066 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.72 sec 
> 2015-05-29 16:06:48,366 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 2.96 sec 
> MapReduce Total cumulative CPU time: 2 seconds 960 msec 
> Ended Job = job_1432885077153_0005 
> Launching Job 2 out of 2 
> Number of reduce tasks determined at compile time: 1 
> In order to change the average load for a reducer (in bytes): 
> set hive.exec.reducers.bytes.per.reducer=<number> 
> In order to limit the maximum number of reducers: 
> set hive.exec.reducers.max=<number> 
> In order to set a constant number of reducers: 
> set mapreduce.job.reduces=<number> 
> Starting Job = job_1432885077153_0006, Tracking URL = http://localhost:8088/proxy/application_1432885077153_0006/ 
> Kill Command = /opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hadoop/bin/hadoop job -kill job_1432885077153_0006 
> Hadoop job information for Stage-2: number of mappers: 1; number of reducers: 1 
> 2015-05-29 16:07:03,333 Stage-2 map = 0%, reduce = 0% 
> 2015-05-29 16:07:07,485 Stage-2 map = 100%, reduce = 0%, Cumulative CPU 1.2 sec 
> 2015-05-29 16:07:15,739 Stage-2 map = 100%, reduce = 100%, Cumulative CPU 2.35 sec 
> MapReduce Total cumulative CPU time: 2 seconds 350 msec 
> Ended Job = job_1432885077153_0006 
> MapReduce Jobs Launched: 
> Stage-Stage-1: Map: 1 Reduce: 1 Cumulative CPU: 2.96 sec HDFS Read: 1985399 HDFS Write: 20068 SUCCESS 
> Stage-Stage-2: Map: 1 Reduce: 1 Cumulative CPU: 2.35 sec HDFS Read: 24481 HDFS Write: 6 SUCCESS 
> Total MapReduce CPU Time Spent: 5 seconds 310 msec 
> {noformat}
> This is due to a JVM option conflict: -XX:+UseG1GC conflicts with Tez's default setting, -XX:+UseParallelGC.
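A minimal sketch of why the error occurs: appending the user's opts to Tez's default produces a single JVM command line that selects two garbage collectors, which HotSpot rejects at startup (hence the container exiting with code 1 in the log above). The flag set below is illustrative, not exhaustive, and the check is a simplification of what the JVM itself does.

```python
# Illustrative check: two GC selectors on one JVM command line conflict,
# making the container JVM fail at launch (exit code 1, as in the log).
GC_SELECTORS = {
    "-XX:+UseG1GC",
    "-XX:+UseParallelGC",
    "-XX:+UseSerialGC",
    "-XX:+UseConcMarkSweepGC",
}

def gc_conflict(opts):
    """Return True if more than one GC selector appears in the opts string."""
    selected = [flag for flag in opts.split() if flag in GC_SELECTORS]
    return len(selected) > 1

tez_default = "-XX:+UseParallelGC -Xmx512m"
user_opts = "-Djava.net.preferIPv4Stack=true -XX:+UseG1GC -Xmx825955249"

merged = tez_default + " " + user_opts  # naive concatenation of both sources
print(gc_conflict(merged))     # True: UseParallelGC and UseG1GC clash
print(gc_conflict(user_opts))  # False: only one collector selected
```

This is why overriding (replacing) the default opts, rather than concatenating them, is the safe translation behavior.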



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)