You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hive.apache.org by "Prasanth Jayachandran (JIRA)" <ji...@apache.org> on 2015/11/07 03:05:10 UTC

[jira] [Created] (HIVE-12364) Distcp job fails when run under Tez

Prasanth Jayachandran created HIVE-12364:
--------------------------------------------

             Summary: Distcp job fails when run under Tez
                 Key: HIVE-12364
                 URL: https://issues.apache.org/jira/browse/HIVE-12364
             Project: Hive
          Issue Type: Bug
          Components: Tez
    Affects Versions: 1.3.0, 2.0.0
            Reporter: Prasanth Jayachandran
            Assignee: Prasanth Jayachandran
            Priority: Critical


PROBLEM:

insert into/overwrite directory '/path' invokes distcp for moveTask and fails
query when execution engine is Tez 

set hive.exec.copyfile.maxsize=40000;
insert overwrite into '/tmp/testinser' select * from customer;
failed at moveTask
hive client log:
{code}
2015-11-05 16:02:53,254 INFO  [main]: exec.FileSinkOperator (Utilities.java:mvFileToFinalPath(1882)) - Moving tmp dir: hdfs://hdpsecehdfs/tmp/testindir/.hive-staging_hive_2015-11-05_15-59-44_557_1429894387987411483-1/_tmp.-ext-10000 to: hdfs://hdpsecehdfs/tmp/testindir/.hive-staging_hive_2015-11-05_15-59-44_557_1429894387987411483-1/-ext-10000
2015-11-05 16:02:53,611 INFO  [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(121)) - <PERFLOG method=task.DEPENDENCY_COLLECTION.Stage-2 from=org.apache.hadoop.hive.ql.Driver>
2015-11-05 16:02:53,612 INFO  [main]: ql.Driver (Driver.java:launchTask(1653)) - Starting task [Stage-2:DEPENDENCY_COLLECTION] in serial mode
2015-11-05 16:02:53,612 INFO  [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(121)) - <PERFLOG method=task.MOVE.Stage-0 from=org.apache.hadoop.hive.ql.Driver>
2015-11-05 16:02:53,612 INFO  [main]: ql.Driver (Driver.java:launchTask(1653)) - Starting task [Stage-0:MOVE] in serial mode
2015-11-05 16:02:53,612 INFO  [main]: exec.Task (SessionState.java:printInfo(951)) - Moving data to: /tmp/testindir from hdfs://hdpsecehdfs/tmp/testindir/.hive-staging_hive_2015-11-05_15-59-44_557_1429894387987411483-1/-ext-10000
2015-11-05 16:02:53,637 INFO  [main]: common.FileUtils (FileUtils.java:copy(551)) - Source is 491763261 bytes. (MAX: 40000)
2015-11-05 16:02:53,638 INFO  [main]: common.FileUtils (FileUtils.java:copy(552)) - Launch distributed copy (distcp) job.
2015-11-05 16:03:03,924 INFO  [main]: impl.TimelineClientImpl (TimelineClientImpl.java:serviceInit(296)) - Timeline service address: http://hdpsece02.sece.hwxsup.com:8188/ws/v1/timeline/
2015-11-05 16:03:04,081 INFO  [main]: impl.TimelineClientImpl (TimelineClientImpl.java:serviceInit(296)) - Timeline service address: http://hdpsece02.sece.hwxsup.com:8188/ws/v1/timeline/
2015-11-05 16:03:20,210 INFO  [main]: hdfs.DFSClient (DFSClient.java:getDelegationToken(1047)) - Created HDFS_DELEGATION_TOKEN token 1069 for haha on ha-hdfs:hdpsecehdfs
2015-11-05 16:03:20,249 INFO  [main]: security.TokenCache (TokenCache.java:obtainTokensForNamenodesInternal(125)) - Got dt for hdfs://hdpsecehdfs; Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:hdpsecehdfs, Ident: (HDFS_DELEGATION_TOKEN token 1069 for haha)
2015-11-05 16:03:20,250 WARN  [main]: token.Token (Token.java:getClassForIdentifier(121)) - Cannot find class for token kind kms-dt
2015-11-05 16:03:20,250 INFO  [main]: security.TokenCache (TokenCache.java:obtainTokensForNamenodesInternal(125)) - Got dt for hdfs://hdpsecehdfs; Kind: kms-dt, Service: 172.25.17.102:9292, Ident: 00 04 68 61 68 61 02 72 6d 00 8a 01 50 da 1a ca 29 8a 01 50 fe 27 4e 29 03 02
2015-11-05 16:03:22,561 INFO  [main]: Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(1173)) - io.sort.mb is deprecated. Instead, use mapreduce.task.io.sort.mb
2015-11-05 16:03:22,562 INFO  [main]: Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(1173)) - io.sort.factor is deprecated. Instead, use mapreduce.task.io.sort.factor
2015-11-05 16:03:33,733 ERROR [main]: exec.Task (SessionState.java:printError(960)) - Failed with exception Unable to move source hdfs://hdpsecehdfs/tmp/testindir/.hive-staging_hive_2015-11-05_15-59-44_557_1429894387987411483-1/-ext-10000 to destination /tmp/testindir
org.apache.hadoop.hive.ql.metadata.HiveException: Unable to move source hdfs://hdpsecehdfs/tmp/testindir/.hive-staging_hive_2015-11-05_15-59-44_557_1429894387987411483-1/-ext-10000 to destination /tmp/testindir
        at org.apache.hadoop.hive.ql.metadata.Hive.moveFile(Hive.java:2665)
        at org.apache.hadoop.hive.ql.exec.MoveTask.moveFile(MoveTask.java:105)
        at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:222)
        at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
        at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:89)
        at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1655)
        at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1414)
        at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1195)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
        at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213)
        at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
        at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:736)
        at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:497)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.io.IOException: Cannot execute DistCp process: java.io.IOException: mapreduce.job.inputformat.class is incompatible with map compatability mode.
        at org.apache.hadoop.hive.shims.Hadoop23Shims.runDistCp(Hadoop23Shims.java:1156)
        at org.apache.hadoop.hive.common.FileUtils.copy(FileUtils.java:553)
        at org.apache.hadoop.hive.ql.metadata.Hive.moveFile(Hive.java:2647)
        ... 21 more
Caused by: java.io.IOException: mapreduce.job.inputformat.class is incompatible with map compatability mode.
        at org.apache.hadoop.mapreduce.Job.ensureNotSet(Job.java:1194)
        at org.apache.hadoop.mapreduce.Job.setUseNewAPI(Job.java:1229)
        at org.apache.hadoop.mapreduce.Job.submit(Job.java:1283)
        at org.apache.hadoop.tools.DistCp.createAndSubmitJob(DistCp.java:183)
        at org.apache.hadoop.tools.DistCp.execute(DistCp.java:153)
        at org.apache.hadoop.hive.shims.Hadoop23Shims.runDistCp(Hadoop23Shims.java:1153)
        ... 23 more

2015-11-05 16:03:33,734 INFO  [main]: hooks.ATSHook (ATSHook.java:<init>(84)) - Created ATS Hook

{code}




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)