Posted to issues@hive.apache.org by "Robbie Zhang (Jira)" <ji...@apache.org> on 2021/06/05 06:05:00 UTC

[jira] [Commented] (HIVE-23489) dynamic partition failed use insert overwrite

    [ https://issues.apache.org/jira/browse/HIVE-23489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17357764#comment-17357764 ] 

Robbie Zhang commented on HIVE-23489:
-------------------------------------

I can reproduce this issue on the same version by setting hive.metastore.dml.events to true. An INSERT OVERWRITE into a partition that doesn't exist yet should not capture the new files as part of the insert event; otherwise HMS reports this error because the partition doesn't exist at the time the event is fired. This bug was fixed by HIVE-15642.
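
For reference, here is a minimal reproduction sketch in HiveQL (the table and column names are hypothetical, not from the original report), assuming a pre-HIVE-15642 build with DML events enabled:

{noformat}
-- Enable DML event capture so HMS fires an InsertEvent on load;
-- this is the setting that triggers the bug.
SET hive.metastore.dml.events=true;
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

CREATE TABLE repro_target (c1 STRING) PARTITIONED BY (p STRING) STORED AS PARQUET;

-- Overwriting into a partition that does not yet exist makes InsertEvent
-- call get_partition before the partition has been created, so HMS throws
-- NoSuchObjectException on affected versions.
INSERT OVERWRITE TABLE repro_target PARTITION (p) SELECT 'x', 'new_part';
{noformat}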

 

> dynamic partition failed use insert overwrite
> ---------------------------------------------
>
>                 Key: HIVE-23489
>                 URL: https://issues.apache.org/jira/browse/HIVE-23489
>             Project: Hive
>          Issue Type: Bug
>          Components: Metastore
>    Affects Versions: 2.1.1
>            Reporter: shining
>            Assignee: Robbie Zhang
>            Priority: Major
>
> SQL: insert overwrite table dzdz_fpxx_dzfp partition(nian) select * from test.dzdz_fpxx_dzfp
> The tables are defined as follows:
> {noformat}
> create table dzdz_fpxx_dzfp (
>     FPDM string,
>     FPHM string,
>     KPRQ timestamp,
>     FPLY string)
> partitioned by (nian string)
> stored as parquet;
>
> create table test.dzdz_fpxx_dzfp (
>     FPDM string,
>     FPHM string,
>     KPRQ timestamp,
>     FPLY string,
>     nian string)
> stored as textfile
> location "/origin/data/dzfp_origin/";
> {noformat}
> Then execute the insert SQL:
> SQL: insert overwrite table dzdz_fpxx_dzfp partition(nian) select * from test.dzdz_fpxx_dzfp
> {noformat}
> INFO  : Compiling command(queryId=hive_20200519100412_12f2b39c-45f4-4f4c-9261-32ca86fa28db): insert overwrite table dzdz_fpxx_dzfp partition(nian) select * from test.dzdz_fpxx_dzfp
> INFO  : Semantic Analysis Completed
> INFO  : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:dzdz_fpxx_dzfp.fpdm, type:string, comment:null), FieldSchema(name:dzdz_fpxx_dzfp.fphm, type:string, comment:null), FieldSchema(name:dzdz_fpxx_dzfp.kprq, type:timestamp, comment:null), FieldSchema(name:dzdz_fpxx_dzfp.tspz_dm, type:string, comment:null), FieldSchema(name:dzdz_fpxx_dzfp.fpzt_bz, type:string, comment:null), FieldSchema(name:dzdz_fpxx_dzfp.fply, type:string, comment:null), FieldSchema(name:dzdz_fpxx_dzfp.nian, type:string, comment:null)], properties:null)
> INFO  : Completed compiling command(queryId=hive_20200519100412_12f2b39c-45f4-4f4c-9261-32ca86fa28db); Time taken: 1.719 seconds
> INFO  : Executing command(queryId=hive_20200519100412_12f2b39c-45f4-4f4c-9261-32ca86fa28db): insert overwrite table dzdz_fpxx_dzfp partition(nian) select * from test.dzdz_fpxx_dzfp
> WARN  :
> INFO  : Query ID = hive_20200519100412_12f2b39c-45f4-4f4c-9261-32ca86fa28db
> INFO  : Total jobs = 3
> INFO  : Launching Job 1 out of 3
> INFO  : Starting task [Stage-1:MAPRED] in serial mode
> INFO  : Number of reduce tasks is set to 0 since there's no reduce operator
> INFO  : number of splits:1
> INFO  : Submitting tokens for job: job_1589451904439_0049
> INFO  : Executing with tokens: []
> INFO  : The url to track the job: http://qcj37.hde.com:8088/proxy/application_1589451904439_0049/
> INFO  : Starting Job = job_1589451904439_0049, Tracking URL = http://qcj37.hde.com:8088/proxy/application_1589451904439_0049/
> INFO  : Kill Command = /usr/hdp/current/hadoop-client/bin/hadoop job  -kill job_1589451904439_0049
> INFO  : Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
> INFO  : 2020-05-19 10:06:56,823 Stage-1 map = 0%,  reduce = 0%
> INFO  : 2020-05-19 10:07:06,317 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 4.3 sec
> INFO  : MapReduce Total cumulative CPU time: 4 seconds 300 msec
> INFO  : Ended Job = job_1589451904439_0049
> INFO  : Starting task [Stage-7:CONDITIONAL] in serial mode
> INFO  : Stage-4 is selected by condition resolver.
> INFO  : Stage-3 is filtered out by condition resolver.
> INFO  : Stage-5 is filtered out by condition resolver.
> INFO  : Starting task [Stage-4:MOVE] in serial mode
> INFO  : Moving data to directory hdfs://mycluster/warehouse/tablespace/managed/hive/dzdz_fpxx_dzfp/.hive-staging_hive_2020-05-19_10-04-12_468_7695595367555279265-3/-ext-10000 from hdfs://mycluster/warehouse/tablespace/managed/hive/dzdz_fpxx_dzfp/.hive-staging_hive_2020-05-19_10-04-12_468_7695595367555279265-3/-ext-10002
> INFO  : Starting task [Stage-0:MOVE] in serial mode
> INFO  : Loading data to table default.dzdz_fpxx_dzfp partition (nian=null) from hdfs://mycluster/warehouse/tablespace/managed/hive/dzdz_fpxx_dzfp/.hive-staging_hive_2020-05-19_10-04-12_468_7695595367555279265-3/-ext-10000
> INFO  :
> ERROR : FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask. Exception when loading 2 in table dzdz_fpxx_dzfp with loadPath=hdfs://mycluster/warehouse/tablespace/managed/hive/dzdz_fpxx_dzfp/.hive-staging_hive_2020-05-19_10-04-12_468_7695595367555279265-3/-ext-10000
> INFO  : MapReduce Jobs Launched:
> INFO  : Stage-Stage-1: Map: 1   Cumulative CPU: 4.3 sec   HDFS Read: 8015 HDFS Write: 2511 SUCCESS
> INFO  : Total MapReduce CPU Time Spent: 4 seconds 300 msec
> INFO  : Completed executing command(queryId=hive_20200519100412_12f2b39c-45f4-4f4c-9261-32ca86fa28db); Time taken: 297.35 seconds
> 2020-05-19 10:11:54,635 DEBUG transport.TSaslTransport (TSaslTransport.java:flush(496)) - writing data length: 117
> 2020-05-19 10:11:54,638 DEBUG transport.TSaslTransport (TSaslTransport.java:readFrame(457)) - CLIENT: reading data length: 96
> 2020-05-19 10:11:54,649 DEBUG transport.TSaslTransport (TSaslTransport.java:flush(496)) - writing data length: 96
> 2020-05-19 10:11:54,714 DEBUG transport.TSaslTransport (TSaslTransport.java:readFrame(457)) - CLIENT: reading data length: 42
> Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask. Exception when loading 2 in table dzdz_fpxx_dzfp with loadPath=hdfs://mycluster/warehouse/tablespace/managed/hive/dzdz_fpxx_dzfp/.hive-staging_hive_2020-05-19_10-04-12_468_7695595367555279265-3/-ext-10000 (state=08S01,code=1)
> {noformat}
> I found the following error in the Hive metastore log:
> {noformat}
> 2020-05-19 08:37:23,957 ERROR metastore.RetryingHMSHandler (RetryingHMSHandler.java:invokeInternal(175)) - NoSuchObjectException(message:partition values=[ujm])
>         at org.apache.hadoop.hive.metastore.ObjectStore.getPartition(ObjectStore.java:2003)
>         at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:498)
>         at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:101)
>         at com.sun.proxy.$Proxy26.getPartition(Unknown Source)
>         at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partition(HiveMetaStore.java:3553)
>         at org.apache.hadoop.hive.metastore.events.InsertEvent.<init>(InsertEvent.java:62)
>         at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.fire_listener_event(HiveMetaStore.java:6737)
>         at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:498)
>         at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:140)
>         at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:99)
>         at com.sun.proxy.$Proxy28.fire_listener_event(Unknown Source)
>         at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$fire_listener_event.getResult(ThriftHiveMetastore.java:14208)
>         at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$fire_listener_event.getResult(ThriftHiveMetastore.java:14193)
>         at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>         at org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:110)
>         at org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:106)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:422)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
>         at org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:118)
>         at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:748)
> 2020-05-19 08:37:23,957 DEBUG log.PerfLogger (PerfLogger.java:PerfLogEnd(179)) - </PERFLOG method=fire_listener_event start=1589848643184 end=1589848643957 duration=773 from=org.apache.hadoop.hive.metastore.RetryingHMSHandler threadId=56 retryCount=-1 error=true>
> 2020-05-19 08:37:23,958 ERROR thrift.ProcessFunction (ProcessFunction.java:process(41)) - Internal error processing fire_listener_event
> org.apache.hadoop.hive.metastore.api.NoSuchObjectException: partition values=[ujm]
>         at org.apache.hadoop.hive.metastore.ObjectStore.getPartition(ObjectStore.java:2003) ~[hive-exec-2.1.1-cdh6.2.0.jar:2.1.1-cdh6.2.0]
>         at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source) ~[?:?]
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0-internal]
>         at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0-internal]
>         at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:101) ~[hive-exec-2.1.1-cdh6.2.0.jar:2.1.1-cdh6.2.0]
>         at com.sun.proxy.$Proxy26.getPartition(Unknown Source) ~[?:?]
>         at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partition(HiveMetaStore.java:3553) ~[hive-exec-2.1.1-cdh6.2.0.jar:2.1.1-cdh6.2.0]
>         at org.apache.hadoop.hive.metastore.events.InsertEvent.<init>(InsertEvent.java:62) ~[hive-exec-2.1.1-cdh6.2.0.jar:2.1.1-cdh6.2.0]
>         at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.fire_listener_event(HiveMetaStore.java:6737) ~[hive-exec-2.1.1-cdh6.2.0.jar:2.1.1-cdh6.2.0]
>         at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source) ~[?:?]
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0-internal]
>         at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0-internal]
>         at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:140) ~[hive-exec-2.1.1-cdh6.2.0.jar:2.1.1-cdh6.2.0]
>         at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:99) ~[hive-exec-2.1.1-cdh6.2.0.jar:2.1.1-cdh6.2.0]
>         at com.sun.proxy.$Proxy28.fire_listener_event(Unknown Source) ~[?:?]
>         at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$fire_listener_event.getResult(ThriftHiveMetastore.java:14208) ~[hive-exec-2.1.1-cdh6.2.0.jar:2.1.1-cdh6.2.0]
>         at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$fire_listener_event.getResult(ThriftHiveMetastore.java:14193) ~[hive-exec-2.1.1-cdh6.2.0.jar:2.1.1-cdh6.2.0]
>         at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) [hive-exec-2.1.1-cdh6.2.0.jar:2.1.1-cdh6.2.0]
>         at org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:110) [hive-exec-2.1.1-cdh6.2.0.jar:2.1.1-cdh6.2.0]
>         at org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:106) [hive-exec-2.1.1-cdh6.2.0.jar:2.1.1-cdh6.2.0]
>         at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0-internal]
>         at javax.security.auth.Subject.doAs(Subject.java:422) [?:1.8.0-internal]
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875) [hadoop-common-3.0.0-cdh6.2.0.jar:?]
>         at org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:118) [hive-exec-2.1.1-cdh6.2.0.jar:2.1.1-cdh6.2.0]
>         at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286) [hive-exec-2.1.1-cdh6.2.0.jar:2.1.1-cdh6.2.0]
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0-internal]
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0-internal]
>         at java.lang.Thread.run(Thread.java:748) [?:1.8.0-internal]
> 2020-05-19 08:37:23,976 DEBUG ipc.Client (ClientCache.java:stopClient(97)) - stopping client from cache: org.apache.hadoop.ipc.Client@2c66e678
> {noformat}
> I guess the metastore could not find the partition and dropped the connection.
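>
> A possible workaround on affected versions (my assumption, not the actual fix in HIVE-15642) is to disable DML event capture for the session, so HMS never tries to look up the not-yet-created partition:
> {noformat}
> -- Hypothetical workaround: suppress insert events for this session.
> -- Note that replication and any other listeners relying on these
> -- events will not see this insert.
> SET hive.metastore.dml.events=false;
> INSERT OVERWRITE TABLE dzdz_fpxx_dzfp PARTITION (nian) SELECT * FROM test.dzdz_fpxx_dzfp;
> {noformat}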



--
This message was sent by Atlassian Jira
(v8.3.4#803005)