Posted to issues@tez.apache.org by "Jeff Zhang (JIRA)" <ji...@apache.org> on 2015/06/01 08:42:17 UTC
[jira] [Issue Comment Deleted] (TEZ-2507)
mapreduce.{map|reduce}.java.opts should override tez.task.launch.cmd-opts
[ https://issues.apache.org/jira/browse/TEZ-2507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jeff Zhang updated TEZ-2507:
----------------------------
Comment: was deleted
(was: Two things need to be done for this issue:
* Make tez.task.launch.cmd-opts a vertex-level configuration. Currently it is a DAG-level configuration.
* Override tez.task.launch.cmd-opts if the user sets mapreduce.{map|reduce}.java.opts when we translate a MapReduce job to a Tez DAG. )
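The second bullet could be sketched roughly as below. This is a hypothetical illustration, not the actual Tez translation code: the configuration key name is real, but the `TaskLaunchOptsResolver` class, the `resolveTaskLaunchOpts` helper, and the exact precedence rule are assumptions.

```java
// Sketch: when translating an MR job to a Tez DAG, let an explicitly set
// mapreduce.{map|reduce}.java.opts override tez.task.launch.cmd-opts.
public class TaskLaunchOptsResolver {

    static final String TEZ_CMD_OPTS = "tez.task.launch.cmd-opts";

    /**
     * mrJavaOpts: value of mapreduce.map.java.opts (or mapreduce.reduce.java.opts),
     * or null if the user did not set it; tezCmdOpts: value of tez.task.launch.cmd-opts.
     * Returns the opts string to use for the task JVM.
     */
    static String resolveTaskLaunchOpts(String mrJavaOpts, String tezCmdOpts) {
        if (mrJavaOpts != null && !mrJavaOpts.trim().isEmpty()) {
            return mrJavaOpts.trim();   // user's MR setting wins outright
        }
        return tezCmdOpts == null ? "" : tezCmdOpts.trim();
    }
}
```

With this precedence, a user-set `-XX:+UseG1GC` would replace, rather than be appended to, Tez's default opts, so the two GC flags never meet on one command line.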
> mapreduce.{map|reduce}.java.opts should override tez.task.launch.cmd-opts
> -------------------------------------------------------------------------
>
> Key: TEZ-2507
> URL: https://issues.apache.org/jira/browse/TEZ-2507
> Project: Apache Tez
> Issue Type: Improvement
> Reporter: Jeff Zhang
> Assignee: Jeff Zhang
>
> Otherwise it may cause JVM option conflicts. Here's the issue reported by [~r7raul]:
> {noformat}
> I changed the value of mapreduce.map.java.opts from -Djava.net.preferIPv4Stack=true -Xmx825955249 to -Djava.net.preferIPv4Stack=true -XX:+UseG1GC -Xmx825955249
> When I run a query with Hive 1.1.0 + Tez 0.53 on Hadoop 2.5.0:
> set mapreduce.framework.name=yarn-tez;
> set hive.execution.engine=tez;
> select userid,count(*) from u_data group by userid order by userid;
> The query returns an error.
> I found this error:
> 2015-05-29 16:02:39,064 WARN [AsyncDispatcher event handler] container.AMContainerImpl: Container container_1432885077153_0004_01_000005 finished with diagnostics set to [Container failed. Exception from container-launch.
> Container id: container_1432885077153_0004_01_000005
> Exit code: 1
> Stack trace: ExitCodeException exitCode=1:
> at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
> at org.apache.hadoop.util.Shell.run(Shell.java:455)
> at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:702)
> at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:196)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:299)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> But when I try the MR engine instead, the query succeeds:
> hive> set hive.execution.engine=mr;
> hive> set mapreduce.framework.name=yarn;
> hive> select userid,count(*) from u_data group by userid order by userid limit 1;
> Query ID = hdfs_20150529160606_d550bca4-0341-4eb0-aace-a9018bfbb7a9
> Total jobs = 2
> Launching Job 1 out of 2
> Number of reduce tasks not specified. Estimated from input data size: 1
> In order to change the average load for a reducer (in bytes):
> set hive.exec.reducers.bytes.per.reducer=<number>
> In order to limit the maximum number of reducers:
> set hive.exec.reducers.max=<number>
> In order to set a constant number of reducers:
> set mapreduce.job.reduces=<number>
> Starting Job = job_1432885077153_0005, Tracking URL = http://localhost:8088/proxy/application_1432885077153_0005/
> Kill Command = /opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hadoop/bin/hadoop job -kill job_1432885077153_0005
> Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
> 2015-05-29 16:06:34,863 Stage-1 map = 0%, reduce = 0%
> 2015-05-29 16:06:40,066 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.72 sec
> 2015-05-29 16:06:48,366 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 2.96 sec
> MapReduce Total cumulative CPU time: 2 seconds 960 msec
> Ended Job = job_1432885077153_0005
> Launching Job 2 out of 2
> Number of reduce tasks determined at compile time: 1
> In order to change the average load for a reducer (in bytes):
> set hive.exec.reducers.bytes.per.reducer=<number>
> In order to limit the maximum number of reducers:
> set hive.exec.reducers.max=<number>
> In order to set a constant number of reducers:
> set mapreduce.job.reduces=<number>
> Starting Job = job_1432885077153_0006, Tracking URL = http://localhost:8088/proxy/application_1432885077153_0006/
> Kill Command = /opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hadoop/bin/hadoop job -kill job_1432885077153_0006
> Hadoop job information for Stage-2: number of mappers: 1; number of reducers: 1
> 2015-05-29 16:07:03,333 Stage-2 map = 0%, reduce = 0%
> 2015-05-29 16:07:07,485 Stage-2 map = 100%, reduce = 0%, Cumulative CPU 1.2 sec
> 2015-05-29 16:07:15,739 Stage-2 map = 100%, reduce = 100%, Cumulative CPU 2.35 sec
> MapReduce Total cumulative CPU time: 2 seconds 350 msec
> Ended Job = job_1432885077153_0006
> MapReduce Jobs Launched:
> Stage-Stage-1: Map: 1 Reduce: 1 Cumulative CPU: 2.96 sec HDFS Read: 1985399 HDFS Write: 20068 SUCCESS
> Stage-Stage-2: Map: 1 Reduce: 1 Cumulative CPU: 2.35 sec HDFS Read: 24481 HDFS Write: 6 SUCCESS
> Total MapReduce CPU Time Spent: 5 seconds 310 msec
> {noformat}
> This is due to a JVM option conflict: -XX:+UseG1GC conflicts with Tez's default setting -XX:+UseParallelGC
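The failure mode above is that both GC selectors end up on one java command line (Tez's default followed by the user's appended opts), which the JVM rejects at startup. A minimal check for that condition could look like the following; the class name and the flag list are illustrative assumptions, not Tez code, and the list is not exhaustive.

```java
import java.util.Arrays;

// Sketch: detect mutually exclusive GC selectors on a merged JVM option line.
public class GcConflictCheck {

    // Common HotSpot collector-selection flags; illustrative, not exhaustive.
    private static final String[] GC_FLAGS = {
        "-XX:+UseParallelGC", "-XX:+UseG1GC",
        "-XX:+UseConcMarkSweepGC", "-XX:+UseSerialGC"
    };

    static boolean hasGcConflict(String javaOpts) {
        long selectors = Arrays.stream(GC_FLAGS)
                .filter(javaOpts::contains)
                .count();
        return selectors > 1;   // more than one collector requested
    }
}
```

Against the merged opts from this report (`-XX:+UseParallelGC ... -XX:+UseG1GC ...`), such a check would flag the conflict before container launch instead of letting the JVM exit with code 1.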
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)