Posted to issues@impala.apache.org by "Sahil Takiar (Jira)" <ji...@apache.org> on 2019/11/27 01:09:00 UTC

[jira] [Resolved] (IMPALA-9188) Dataload is failing when USE_CDP_HIVE=true

     [ https://issues.apache.org/jira/browse/IMPALA-9188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sahil Takiar resolved IMPALA-9188.
----------------------------------
    Fix Version/s: Impala 3.4.0
       Resolution: Fixed

USE_CDP_HIVE=true builds are now passing dataload.

> Dataload is failing when USE_CDP_HIVE=true
> ------------------------------------------
>
>                 Key: IMPALA-9188
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9188
>             Project: IMPALA
>          Issue Type: Bug
>            Reporter: Sahil Takiar
>            Assignee: Anurag Mantripragada
>            Priority: Critical
>             Fix For: Impala 3.4.0
>
>
> When USE_CDP_HIVE=true, Impala builds are failing during dataload when creating tables with PK/FK constraints.
> The error is:
> {code:java}
> ERROR: CREATE EXTERNAL TABLE IF NOT EXISTS functional_seq_record_snap.child_table (
> seq int, id int, year string, a int, primary key(seq) DISABLE NOVALIDATE RELY, foreign key
> (id, year) references functional_seq_record_snap.parent_table(id, year) DISABLE NOVALIDATE
> RELY, foreign key(a) references functional_seq_record_snap.parent_table_2(a) DISABLE
> NOVALIDATE RELY)
> row format delimited fields terminated by ','
> LOCATION '/test-warehouse/child_table'
> Traceback (most recent call last):
>   File "Impala/bin/load-data.py", line 208, in exec_impala_query_from_file
>     result = impala_client.execute(query)
>   File "Impala/tests/beeswax/impala_beeswax.py", line 187, in execute
>     handle = self.__execute_query(query_string.strip(), user=user)
>   File "Impala/tests/beeswax/impala_beeswax.py", line 362, in __execute_query
>     handle = self.execute_query_async(query_string, user=user)
>   File "Impala/tests/beeswax/impala_beeswax.py", line 356, in execute_query_async
>     handle = self.__do_rpc(lambda: self.imp_service.query(query,))
>   File "Impala/tests/beeswax/impala_beeswax.py", line 519, in __do_rpc
>     raise ImpalaBeeswaxException(self.__build_error_message(b), b)
> ImpalaBeeswaxException: ImpalaBeeswaxException:
>  INNER EXCEPTION: <class 'beeswaxd.ttypes.BeeswaxException'>
>  MESSAGE: ImpalaRuntimeException: Error making 'createTable' RPC to Hive Metastore:
> CAUSED BY: MetaException: Foreign key references id:int;year:string; but no corresponding primary key or unique key exists. Possible keys: [year:string;id:int;]{code}
> The corresponding error in HMS is:
> {code:java}
> 2019-11-22T06:36:59,937  INFO [pool-10-thread-13] metastore.HiveMetaStore: 18: source:127.0.0.1 create_table_req: Table(tableName:child_table, dbName:functional_seq_record_gzip, owner:jenkins, createTime:0, lastAccessTime:0, retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:seq, type:int, comment:null), FieldSchema(name:id, type:int, comment:null), FieldSchema(name:year, type:string, comment:null), FieldSchema(name:a, type:int, comment:null)], location:hdfs://localhost:20500/test-warehouse/child_table, inputFormat:org.apache.hadoop.mapred.TextInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, compressed:false, numBuckets:0, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, parameters:{serialization.format=,, field.delim=,}), bucketCols:null, sortCols:null, parameters:null), partitionKeys:[], parameters:{EXTERNAL=TRUE, OBJCAPABILITIES=EXTREAD,EXTWRITE}, viewOriginalText:null, viewExpandedText:null, tableType:EXTERNAL_TABLE, catName:hive, ownerType:USER, accessType:8)
> 2019-11-22T06:36:59,937  INFO [pool-10-thread-13] HiveMetaStore.audit: ugi=jenkins      ip=127.0.0.1    cmd=source:127.0.0.1 create_table_req: Table(tableName:child_table, dbName:functional_seq_record_gzip, owner:jenkins, createTime:0, lastAccessTime:0, retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:seq, type:int, comment:null), FieldSchema(name:id, type:int, comment:null), FieldSchema(name:year, type:string, comment:null), FieldSchema(name:a, type:int, comment:null)], location:hdfs://localhost:20500/test-warehouse/child_table, inputFormat:org.apache.hadoop.mapred.TextInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, compressed:false, numBuckets:0, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, parameters:{serialization.format=,, field.delim=,}), bucketCols:null, sortCols:null, parameters:null), partitionKeys:[], parameters:{EXTERNAL=TRUE, OBJCAPABILITIES=EXTREAD,EXTWRITE}, viewOriginalText:null, viewExpandedText:null, tableType:EXTERNAL_TABLE, catName:hive, ownerType:USER, accessType:8)
> 2019-11-22T06:36:59,937  INFO [pool-10-thread-13] metastore.MetastoreDefaultTransformer: Starting translation for CreateTable for processor Impala3.4.0-SNAPSHOT@localhost with [EXTWRITE, EXTREAD, HIVEMANAGEDINSERTREAD, HIVEMANAGEDINSERTWRITE, HIVESQL, HIVEMQT, HIVEBUCKET2] on table child_table
> 2019-11-22T06:36:59,937  INFO [pool-10-thread-13] metastore.MetastoreDefaultTransformer: Table to be created is of type EXTERNAL_TABLE but not MANAGED_TABLE
> 2019-11-22T06:36:59,937  INFO [pool-10-thread-13] metastore.MetastoreDefaultTransformer: Transformer returning table:Table(tableName:child_table, dbName:functional_seq_record_gzip, owner:jenkins, createTime:0, lastAccessTime:0, retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:seq, type:int, comment:null), FieldSchema(name:id, type:int, comment:null), FieldSchema(name:year, type:string, comment:null), FieldSchema(name:a, type:int, comment:null)], location:hdfs://localhost:20500/test-warehouse/child_table, inputFormat:org.apache.hadoop.mapred.TextInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, compressed:false, numBuckets:0, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, parameters:{serialization.format=,, field.delim=,}), bucketCols:null, sortCols:null, parameters:null), partitionKeys:[], parameters:{EXTERNAL=TRUE, OBJCAPABILITIES=EXTREAD,EXTWRITE}, viewOriginalText:null, viewExpandedText:null, tableType:EXTERNAL_TABLE, catName:hive, ownerType:USER, accessType:8)
> 2019-11-22T06:36:59,945 ERROR [pool-10-thread-13] metastore.RetryingHMSHandler: MetaException(message:Foreign key references id:int;year:string; but no corresponding primary key or unique key exists. Possible keys: [year:string;id:int;])
>         at org.apache.hadoop.hive.metastore.ObjectStore.addForeignKeys(ObjectStore.java:4968)
>         at org.apache.hadoop.hive.metastore.ObjectStore.createTableWithConstraints(ObjectStore.java:1289)
>         at sun.reflect.GeneratedMethodAccessor71.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:498)
>         at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:97)
>         at com.sun.proxy.$Proxy27.createTableWithConstraints(Unknown Source)
>         at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_core(HiveMetaStore.java:2220)
>         at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_req(HiveMetaStore.java:2404)
>         at sun.reflect.GeneratedMethodAccessor24.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:498)
>         at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
>         at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108)
>         at com.sun.proxy.$Proxy34.create_table_req(Unknown Source)
>         at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_table_req.getResult(ThriftHiveMetastore.java:16107)
>         at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_table_req.getResult(ThriftHiveMetastore.java:16091)
>         at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>         at org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:111)
>         at org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:107)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:422)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1876)
>         at org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:119)
>         at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:748)
> {code}
> Looks like this was caused by IMPALA-9104.
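> The MetaException above suggests the CDP HMS compares the referenced column list against the parent's declared key columns in an order-sensitive way. A minimal sketch of the mismatch follows, assuming the parent table declares its composite primary key as (year, id); the parent schema here is illustrative, not copied from the actual dataload DDL:
> {code:sql}
> -- Assumed parent schema: composite primary key declared as (year, id).
> CREATE EXTERNAL TABLE parent_table (
>   id int,
>   year string,
>   PRIMARY KEY (year, id) DISABLE NOVALIDATE RELY)
> LOCATION '/test-warehouse/parent_table';
>
> -- Fails the HMS check: the foreign key references (id, year), which does not
> -- match the declared key order (year, id), hence "no corresponding primary
> -- key or unique key exists".
> CREATE EXTERNAL TABLE child_table (
>   seq int,
>   id int,
>   year string,
>   PRIMARY KEY (seq) DISABLE NOVALIDATE RELY,
>   FOREIGN KEY (id, year) REFERENCES parent_table (id, year) DISABLE NOVALIDATE RELY)
> LOCATION '/test-warehouse/child_table';
>
> -- Presumed workaround: list the columns in the parent's declared key order, e.g.
> --   FOREIGN KEY (year, id) REFERENCES parent_table (year, id) DISABLE NOVALIDATE RELY
> -- which matches the "Possible keys: [year:string;id:int;]" reported by HMS.
> {code}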



--
This message was sent by Atlassian Jira
(v8.3.4#803005)