You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@hive.apache.org by "Prasanth Jayachandran (JIRA)" <ji...@apache.org> on 2015/11/07 03:49:11 UTC

[jira] [Commented] (HIVE-12364) Distcp job fails when run under Tez

    [ https://issues.apache.org/jira/browse/HIVE-12364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994949#comment-14994949 ] 

Prasanth Jayachandran commented on HIVE-12364:
----------------------------------------------

[~ashutoshc] Can you plz take a look at this patch? I will put up patch for master shortly.

> Distcp job fails when run under Tez
> -----------------------------------
>
>                 Key: HIVE-12364
>                 URL: https://issues.apache.org/jira/browse/HIVE-12364
>             Project: Hive
>          Issue Type: Bug
>          Components: Tez
>    Affects Versions: 1.3.0, 2.0.0
>            Reporter: Prasanth Jayachandran
>            Assignee: Prasanth Jayachandran
>            Priority: Critical
>         Attachments: HIVE-12364-branch-1.patch
>
>
> PROBLEM:
> insert into/overwrite directory '/path' invokes distcp for moveTask and fails
> query when execution engine is Tez 
> set hive.exec.copyfile.maxsize=40000;
> insert overwrite into '/tmp/testinser' select * from customer;
> failed at moveTask
> hive client log:
> {code}
> 2015-11-05 16:02:53,254 INFO  [main]: exec.FileSinkOperator (Utilities.java:mvFileToFinalPath(1882)) - Moving tmp dir: hdfs://hdpsecehdfs/tmp/testindir/.hive-staging_hive_2015-11-05_15-59-44_557_1429894387987411483-1/_tmp.-ext-10000 to: hdfs://hdpsecehdfs/tmp/testindir/.hive-staging_hive_2015-11-05_15-59-44_557_1429894387987411483-1/-ext-10000
> 2015-11-05 16:02:53,611 INFO  [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(121)) - <PERFLOG method=task.DEPENDENCY_COLLECTION.Stage-2 from=org.apache.hadoop.hive.ql.Driver>
> 2015-11-05 16:02:53,612 INFO  [main]: ql.Driver (Driver.java:launchTask(1653)) - Starting task [Stage-2:DEPENDENCY_COLLECTION] in serial mode
> 2015-11-05 16:02:53,612 INFO  [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(121)) - <PERFLOG method=task.MOVE.Stage-0 from=org.apache.hadoop.hive.ql.Driver>
> 2015-11-05 16:02:53,612 INFO  [main]: ql.Driver (Driver.java:launchTask(1653)) - Starting task [Stage-0:MOVE] in serial mode
> 2015-11-05 16:02:53,612 INFO  [main]: exec.Task (SessionState.java:printInfo(951)) - Moving data to: /tmp/testindir from hdfs://hdpsecehdfs/tmp/testindir/.hive-staging_hive_2015-11-05_15-59-44_557_1429894387987411483-1/-ext-10000
> 2015-11-05 16:02:53,637 INFO  [main]: common.FileUtils (FileUtils.java:copy(551)) - Source is 491763261 bytes. (MAX: 40000)
> 2015-11-05 16:02:53,638 INFO  [main]: common.FileUtils (FileUtils.java:copy(552)) - Launch distributed copy (distcp) job.
> 2015-11-05 16:03:03,924 INFO  [main]: impl.TimelineClientImpl (TimelineClientImpl.java:serviceInit(296)) - Timeline service address: http://hdpsece02.sece.hwxsup.com:8188/ws/v1/timeline/
> 2015-11-05 16:03:04,081 INFO  [main]: impl.TimelineClientImpl (TimelineClientImpl.java:serviceInit(296)) - Timeline service address: http://hdpsece02.sece.hwxsup.com:8188/ws/v1/timeline/
> 2015-11-05 16:03:20,210 INFO  [main]: hdfs.DFSClient (DFSClient.java:getDelegationToken(1047)) - Created HDFS_DELEGATION_TOKEN token 1069 for haha on ha-hdfs:hdpsecehdfs
> 2015-11-05 16:03:20,249 INFO  [main]: security.TokenCache (TokenCache.java:obtainTokensForNamenodesInternal(125)) - Got dt for hdfs://hdpsecehdfs; Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:hdpsecehdfs, Ident: (HDFS_DELEGATION_TOKEN token 1069 for haha)
> 2015-11-05 16:03:20,250 WARN  [main]: token.Token (Token.java:getClassForIdentifier(121)) - Cannot find class for token kind kms-dt
> 2015-11-05 16:03:20,250 INFO  [main]: security.TokenCache (TokenCache.java:obtainTokensForNamenodesInternal(125)) - Got dt for hdfs://hdpsecehdfs; Kind: kms-dt, Service: 172.25.17.102:9292, Ident: 00 04 68 61 68 61 02 72 6d 00 8a 01 50 da 1a ca 29 8a 01 50 fe 27 4e 29 03 02
> 2015-11-05 16:03:22,561 INFO  [main]: Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(1173)) - io.sort.mb is deprecated. Instead, use mapreduce.task.io.sort.mb
> 2015-11-05 16:03:22,562 INFO  [main]: Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(1173)) - io.sort.factor is deprecated. Instead, use mapreduce.task.io.sort.factor
> 2015-11-05 16:03:33,733 ERROR [main]: exec.Task (SessionState.java:printError(960)) - Failed with exception Unable to move source hdfs://hdpsecehdfs/tmp/testindir/.hive-staging_hive_2015-11-05_15-59-44_557_1429894387987411483-1/-ext-10000 to destination /tmp/testindir
> org.apache.hadoop.hive.ql.metadata.HiveException: Unable to move source hdfs://hdpsecehdfs/tmp/testindir/.hive-staging_hive_2015-11-05_15-59-44_557_1429894387987411483-1/-ext-10000 to destination /tmp/testindir
>         at org.apache.hadoop.hive.ql.metadata.Hive.moveFile(Hive.java:2665)
>         at org.apache.hadoop.hive.ql.exec.MoveTask.moveFile(MoveTask.java:105)
>         at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:222)
>         at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
>         at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:89)
>         at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1655)
>         at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1414)
>         at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1195)
>         at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
>         at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
>         at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213)
>         at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
>         at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
>         at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:736)
>         at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
>         at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:497)
>         at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
>         at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> Caused by: java.io.IOException: Cannot execute DistCp process: java.io.IOException: mapreduce.job.inputformat.class is incompatible with map compatability mode.
>         at org.apache.hadoop.hive.shims.Hadoop23Shims.runDistCp(Hadoop23Shims.java:1156)
>         at org.apache.hadoop.hive.common.FileUtils.copy(FileUtils.java:553)
>         at org.apache.hadoop.hive.ql.metadata.Hive.moveFile(Hive.java:2647)
>         ... 21 more
> Caused by: java.io.IOException: mapreduce.job.inputformat.class is incompatible with map compatability mode.
>         at org.apache.hadoop.mapreduce.Job.ensureNotSet(Job.java:1194)
>         at org.apache.hadoop.mapreduce.Job.setUseNewAPI(Job.java:1229)
>         at org.apache.hadoop.mapreduce.Job.submit(Job.java:1283)
>         at org.apache.hadoop.tools.DistCp.createAndSubmitJob(DistCp.java:183)
>         at org.apache.hadoop.tools.DistCp.execute(DistCp.java:153)
>         at org.apache.hadoop.hive.shims.Hadoop23Shims.runDistCp(Hadoop23Shims.java:1153)
>         ... 23 more
> 2015-11-05 16:03:33,734 INFO  [main]: hooks.ATSHook (ATSHook.java:<init>(84)) - Created ATS Hook
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)