Posted to issues@hive.apache.org by "Daniel Dai (JIRA)" <ji...@apache.org> on 2015/08/03 07:04:04 UTC

[jira] [Updated] (HIVE-11438) Join a ACID table with non-ACID table fail with MR on 1.0.0

     [ https://issues.apache.org/jira/browse/HIVE-11438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated HIVE-11438:
------------------------------
    Attachment: HIVE-7951.1.patch.txt

The issue only occurs when I create/insert the table first, then run the join statement in a separate script. It seems "create table orc_update_table" contaminates the HiveConf (setting hive.doing.acid=true), so the join takes a different execution path and the stack does not appear. This might be a different issue.

The test case plays a trick that overrides hive.doing.acid, so it throws the exception in a single script without the attached fix.
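A minimal sketch of that single-script reproduction, assuming (not confirmed from the committed test) that explicitly resetting the internal hive.doing.acid flag undoes the contamination left by the ACID DDL:

{code}
-- Hypothetical reproduction in one script. The first SET is my assumption:
-- it forces the flag back to false so the join takes the failing code path
-- even though the ACID table was created in the same session.
SET hive.doing.acid=false;
SET hive.execution.engine=mr;
SET hive.auto.convert.join=false;
SET hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat;
SELECT t1.*, t2.* FROM orc_table t1
JOIN orc_update_table t2 ON t1.k1=t2.k1 ORDER BY t1.k1;
{code}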

> Join a ACID table with non-ACID table fail with MR on 1.0.0
> -----------------------------------------------------------
>
>                 Key: HIVE-11438
>                 URL: https://issues.apache.org/jira/browse/HIVE-11438
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor, Transactions
>    Affects Versions: 1.0.0
>            Reporter: Daniel Dai
>            Assignee: Daniel Dai
>             Fix For: 1.0.1
>
>         Attachments: HIVE-7951.1.patch.txt
>
>
> The following script fails in MR mode:
> Preparation:
> {code}
> CREATE TABLE orc_update_table (k1 INT, f1 STRING, op_code STRING) 
> CLUSTERED BY (k1) INTO 2 BUCKETS 
> STORED AS ORC TBLPROPERTIES("transactional"="true"); 
> INSERT INTO TABLE orc_update_table VALUES (1, 'a', 'I');
> CREATE TABLE orc_table (k1 INT, f1 STRING) 
> CLUSTERED BY (k1) SORTED BY (k1) INTO 2 BUCKETS 
> STORED AS ORC; 
> INSERT OVERWRITE TABLE orc_table VALUES (1, 'x');
> {code}
> Then run the following script:
> {code}
> SET hive.execution.engine=mr; 
> SET hive.auto.convert.join=false; 
> SET hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat;
> SELECT t1.*, t2.* FROM orc_table t1 
> JOIN orc_update_table t2 ON t1.k1=t2.k1 ORDER BY t1.k1;
> {code}
> Stack:
> {code}
> java.lang.NullPointerException
> 	at org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:265)
> 	at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getCombineSplits(CombineHiveInputFormat.java:272)
> 	at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:509)
> 	at org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:624)
> 	at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:616)
> 	at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:492)
> 	at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1296)
> 	at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1293)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:415)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
> 	at org.apache.hadoop.mapreduce.Job.submit(Job.java:1293)
> 	at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:585)
> 	at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:580)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:415)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
> 	at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:580)
> 	at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:571)
> 	at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:429)
> 	at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:137)
> 	at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
> 	at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
> 	at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1606)
> 	at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1367)
> 	at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1179)
> 	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1006)
> 	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:996)
> 	at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:247)
> 	at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:199)
> 	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:410)
> 	at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:783)
> 	at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:677)
> 	at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:616)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:606)
> 	at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
> 	at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> Job Submission failed with exception 'java.lang.NullPointerException(null)'
> FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
> {code}
> Note the query is the same as in HIVE-11422, but on 1.0.0, for this Jira, it throws a different exception.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)