Posted to user@hive.apache.org by 明浩 冯 <qi...@hotmail.com> on 2016/08/15 06:43:58 UTC

hive throws ConcurrentModificationException when executing insert overwrite table

Hi everyone,


When I run the following SQL in beeline, Hive just throws a ConcurrentModificationException. Does anybody know what's wrong with my Hive setup, or can you give me some ideas for locating the problem?


INSERT OVERWRITE TABLE kylin_intermediate_prom_group_by_ws_name_cur_cube_19700101010000_20100101010000 SELECT TBL_HIS_UWIP_SCAN_PROM.ORDER_NAME FROM TESTMES.TBL_HIS_UWIP_SCAN_PROM as TBL_HIS_UWIP_SCAN_PROM  WHERE (TBL_HIS_UWIP_SCAN_PROM.START_TIME >= '1970-01-01 01:00:00' AND TBL_HIS_UWIP_SCAN_PROM.START_TIME < '2010-01-01 01:00:00') DISTRIBUTE BY RAND();


My environment:

a 12-node cluster with

Hadoop 2.7.2

Spark 1.6.2

Zookeeper 3.4.6

Hbase 1.2.2

Hive 2.1.0

Kylin 1.5.3


I also list some settings from hive-site.xml that may be helpful for analyzing the problem:

hive.support.concurrency=true

hive.lock.manager=org.apache.hadoop.hive.ql.lockmgr.zookeeper.ZooKeeperHiveLockManager

hive.execution.engine=spark

hive.server2.transport.mode=http

hive.server2.authentication=NONE


Actually, this is one step of building a Kylin cube. The SELECT query returns about 3,000,000 rows. Here is the log I got from hive.log:


2016-08-12T18:43:07,473 INFO  [HiveServer2-Background-Pool: Thread-83]: status.SparkJobMonitor (:()) - 2016-08-12 18:43:07,472  Stage-0_0: 58/58 Finished       Stage-1_0: 13/13 Finished
2016-08-12T18:43:07,476 INFO  [HiveServer2-Background-Pool: Thread-83]: status.SparkJobMonitor (:()) - Status: Finished successfully in 264.96 seconds
2016-08-12T18:43:07,488 INFO  [HiveServer2-Background-Pool: Thread-83]: spark.SparkTask (:()) - =====Spark Job[85a00425-c044-4e22-b54a-f2c12feb4e82] statistics=====
2016-08-12T18:43:07,488 INFO  [HiveServer2-Background-Pool: Thread-83]: spark.SparkTask (:()) - Spark Job[85a00425-c044-4e22-b54a-f2c12feb4e82] Metrics
2016-08-12T18:43:07,488 INFO  [HiveServer2-Background-Pool: Thread-83]: spark.SparkTask (:()) -         ExecutorDeserializeTime: 157772
2016-08-12T18:43:07,488 INFO  [HiveServer2-Background-Pool: Thread-83]: spark.SparkTask (:()) -         ExecutorRunTime: 4102583
2016-08-12T18:43:07,488 INFO  [HiveServer2-Background-Pool: Thread-83]: spark.SparkTask (:()) -         ResultSize: 149069
2016-08-12T18:43:07,488 INFO  [HiveServer2-Background-Pool: Thread-83]: spark.SparkTask (:()) -         JvmGCTime: 234246
2016-08-12T18:43:07,488 INFO  [HiveServer2-Background-Pool: Thread-83]: spark.SparkTask (:()) -         ResultSerializationTime: 23
2016-08-12T18:43:07,488 INFO  [HiveServer2-Background-Pool: Thread-83]: spark.SparkTask (:()) -         MemoryBytesSpilled: 0
2016-08-12T18:43:07,488 INFO  [HiveServer2-Background-Pool: Thread-83]: spark.SparkTask (:()) -         DiskBytesSpilled: 0
2016-08-12T18:43:07,488 INFO  [HiveServer2-Background-Pool: Thread-83]: spark.SparkTask (:()) -         BytesRead: 6831052047
2016-08-12T18:43:07,488 INFO  [HiveServer2-Background-Pool: Thread-83]: spark.SparkTask (:()) -         RemoteBlocksFetched: 702
2016-08-12T18:43:07,489 INFO  [HiveServer2-Background-Pool: Thread-83]: spark.SparkTask (:()) -         LocalBlocksFetched: 52
2016-08-12T18:43:07,489 INFO  [HiveServer2-Background-Pool: Thread-83]: spark.SparkTask (:()) -         TotalBlocksFetched: 754
2016-08-12T18:43:07,489 INFO  [HiveServer2-Background-Pool: Thread-83]: spark.SparkTask (:()) -         FetchWaitTime: 12
2016-08-12T18:43:07,489 INFO  [HiveServer2-Background-Pool: Thread-83]: spark.SparkTask (:()) -         RemoteBytesRead: 2611264054
2016-08-12T18:43:07,489 INFO  [HiveServer2-Background-Pool: Thread-83]: spark.SparkTask (:()) -         ShuffleBytesWritten: 2804791500
2016-08-12T18:43:07,489 INFO  [HiveServer2-Background-Pool: Thread-83]: spark.SparkTask (:()) -         ShuffleWriteTime: 56641742751
2016-08-12T18:43:07,489 INFO  [HiveServer2-Background-Pool: Thread-83]: spark.SparkTask (:()) - HIVE
2016-08-12T18:43:07,489 INFO  [HiveServer2-Background-Pool: Thread-83]: spark.SparkTask (:()) -         CREATED_FILES: 13
2016-08-12T18:43:07,489 INFO  [HiveServer2-Background-Pool: Thread-83]: spark.SparkTask (:()) -         RECORDS_OUT_1_default.kylin_intermediate_prom_group_by_ws_name_cur_cube_19700101010000_20100101010000: 271942413
2016-08-12T18:43:07,489 INFO  [HiveServer2-Background-Pool: Thread-83]: spark.SparkTask (:()) -         RECORDS_IN: 1076808610
2016-08-12T18:43:07,489 INFO  [HiveServer2-Background-Pool: Thread-83]: spark.SparkTask (:()) -         RECORDS_OUT_INTERMEDIATE: 271942413
2016-08-12T18:43:07,489 INFO  [HiveServer2-Background-Pool: Thread-83]: spark.SparkTask (:()) -         DESERIALIZE_ERRORS: 0
2016-08-12T18:43:07,489 INFO  [HiveServer2-Background-Pool: Thread-83]: spark.SparkTask (:()) - Execution completed successfully
2016-08-12T18:43:07,521 INFO  [HiveServer2-Background-Pool: Thread-83]: exec.FileSinkOperator (:()) - Moving tmp dir: hdfs://bigdata/kylin/kylin_metadata/kylin-38250257-1649-4530-8ccb-975469aa6d22/kylin_intermediate_prom_group_by_ws_name_cur_cube_19700101010000_20100101010000/.hive-staging_hive_2016-08-12_18-37-50_610_2817977227856616745-2/_tmp.-ext-10000 to: hdfs://bigdata/kylin/kylin_metadata/kylin-38250257-1649-4530-8ccb-975469aa6d22/kylin_intermediate_prom_group_by_ws_name_cur_cube_19700101010000_20100101010000/.hive-staging_hive_2016-08-12_18-37-50_610_2817977227856616745-2/-ext-10000
2016-08-12T18:43:07,740 INFO  [HiveServer2-Background-Pool: Thread-83]: ql.Driver (:()) - Starting task [Stage-0:MOVE] in serial mode
2016-08-12T18:43:07,741 INFO  [HiveServer2-Background-Pool: Thread-83]: hive.metastore (:()) - Closed a connection to metastore, current connections: 1
2016-08-12T18:43:07,742 INFO  [HiveServer2-Background-Pool: Thread-83]: exec.Task (:()) - Loading data to table default.kylin_intermediate_prom_group_by_ws_name_cur_cube_19700101010000_20100101010000 from hdfs://bigdata/kylin/kylin_metadata/kylin-38250257-1649-4530-8ccb-975469aa6d22/kylin_intermediate_prom_group_by_ws_name_cur_cube_19700101010000_20100101010000/.hive-staging_hive_2016-08-12_18-37-50_610_2817977227856616745-2/-ext-10000
2016-08-12T18:43:07,743 INFO  [HiveServer2-Background-Pool: Thread-83]: hive.metastore (:()) - Trying to connect to metastore with URI thrift://bigdata-master:9083
2016-08-12T18:43:07,744 INFO  [HiveServer2-Background-Pool: Thread-83]: hive.metastore (:()) - Opened a connection to metastore, current connections: 2
2016-08-12T18:43:07,769 INFO  [HiveServer2-Background-Pool: Thread-83]: hive.metastore (:()) - Connected to metastore.
2016-08-12T18:43:08,110 INFO  [Delete-Thread-1]: fs.TrashPolicyDefault (:()) - Namenode trash configuration: Deletion interval = 0 minutes, Emptier interval = 0 minutes.
2016-08-12T18:43:08,110 INFO  [Delete-Thread-12]: fs.TrashPolicyDefault (:()) - Namenode trash configuration: Deletion interval = 0 minutes, Emptier interval = 0 minutes.
2016-08-12T18:43:08,110 INFO  [Delete-Thread-0]: fs.TrashPolicyDefault (:()) - Namenode trash configuration: Deletion interval = 0 minutes, Emptier interval = 0 minutes.
2016-08-12T18:43:08,110 INFO  [Delete-Thread-7]: fs.TrashPolicyDefault (:()) - Namenode trash configuration: Deletion interval = 0 minutes, Emptier interval = 0 minutes.
2016-08-12T18:43:08,110 INFO  [Delete-Thread-4]: fs.TrashPolicyDefault (:()) - Namenode trash configuration: Deletion interval = 0 minutes, Emptier interval = 0 minutes.
2016-08-12T18:43:08,110 INFO  [Delete-Thread-8]: fs.TrashPolicyDefault (:()) - Namenode trash configuration: Deletion interval = 0 minutes, Emptier interval = 0 minutes.
2016-08-12T18:43:08,111 INFO  [Delete-Thread-2]: fs.TrashPolicyDefault (:()) - Namenode trash configuration: Deletion interval = 0 minutes, Emptier interval = 0 minutes.
2016-08-12T18:43:08,111 INFO  [Delete-Thread-9]: fs.TrashPolicyDefault (:()) - Namenode trash configuration: Deletion interval = 0 minutes, Emptier interval = 0 minutes.
2016-08-12T18:43:08,112 INFO  [Delete-Thread-10]: fs.TrashPolicyDefault (:()) - Namenode trash configuration: Deletion interval = 0 minutes, Emptier interval = 0 minutes.
2016-08-12T18:43:08,112 INFO  [Delete-Thread-3]: fs.TrashPolicyDefault (:()) - Namenode trash configuration: Deletion interval = 0 minutes, Emptier interval = 0 minutes.
2016-08-12T18:43:08,112 INFO  [Delete-Thread-5]: fs.TrashPolicyDefault (:()) - Namenode trash configuration: Deletion interval = 0 minutes, Emptier interval = 0 minutes.
2016-08-12T18:43:08,113 INFO  [Delete-Thread-6]: fs.TrashPolicyDefault (:()) - Namenode trash configuration: Deletion interval = 0 minutes, Emptier interval = 0 minutes.
2016-08-12T18:43:08,113 INFO  [Delete-Thread-11]: fs.TrashPolicyDefault (:()) - Namenode trash configuration: Deletion interval = 0 minutes, Emptier interval = 0 minutes.
2016-08-12T18:43:08,164 INFO  [HiveServer2-Background-Pool: Thread-83]: common.FileUtils (:()) - Creating directory if it doesn't exist: hdfs://bigdata/kylin/kylin_metadata/kylin-38250257-1649-4530-8ccb-975469aa6d22/kylin_intermediate_prom_group_by_ws_name_cur_cube_19700101010000_20100101010000
2016-08-12T18:43:08,177 ERROR [HiveServer2-Background-Pool: Thread-83]: hdfs.KeyProviderCache (:()) - Could not find uri with key [dfs.encryption.key.provider.uri] to create a keyProvider !!
2016-08-12T18:43:08,285 ERROR [HiveServer2-Background-Pool: Thread-83]: exec.Task (:()) - Failed with exception java.util.ConcurrentModificationException
org.apache.hadoop.hive.ql.metadata.HiveException: java.util.ConcurrentModificationException
        at org.apache.hadoop.hive.ql.metadata.Hive.moveFile(Hive.java:2942)
        at org.apache.hadoop.hive.ql.metadata.Hive.replaceFiles(Hive.java:3198)
        at org.apache.hadoop.hive.ql.metadata.Hive.loadTable(Hive.java:1805)
        at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:355)
        at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197)
        at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
        at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1858)
        at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1562)
        at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1313)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1084)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1077)
        at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:235)
        at org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:90)
        at org.apache.hive.service.cli.operation.SQLOperation$2$1.run(SQLOperation.java:299)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
        at org.apache.hive.service.cli.operation.SQLOperation$2.run(SQLOperation.java:312)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.util.ConcurrentModificationException
        at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:859)
        at java.util.ArrayList$Itr.next(ArrayList.java:831)
        at org.apache.hadoop.hdfs.protocolPB.PBHelper.convertAclEntryProto(PBHelper.java:2325)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.setAcl(ClientNamenodeProtocolTranslatorPB.java:1325)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
        at com.sun.proxy.$Proxy28.setAcl(Unknown Source)
        at org.apache.hadoop.hdfs.DFSClient.setAcl(DFSClient.java:3242)
        at org.apache.hadoop.hdfs.DistributedFileSystem$43.doCall(DistributedFileSystem.java:2052)
        at org.apache.hadoop.hdfs.DistributedFileSystem$43.doCall(DistributedFileSystem.java:2049)
        at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at org.apache.hadoop.hdfs.DistributedFileSystem.setAcl(DistributedFileSystem.java:2049)
        at org.apache.hadoop.hive.io.HdfsUtils.setFullFileStatus(HdfsUtils.java:126)
        at org.apache.hadoop.hive.ql.metadata.Hive$3.call(Hive.java:2919)
        at org.apache.hadoop.hive.ql.metadata.Hive$3.call(Hive.java:2911)
        ... 4 more

2016-08-12T18:43:08,286 ERROR [HiveServer2-Background-Pool: Thread-83]: ql.Driver (:()) - FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask. java.util.ConcurrentModificationException
2016-08-12T18:43:08,286 INFO  [HiveServer2-Background-Pool: Thread-83]: ql.Driver (:()) - Completed executing command(queryId=hadoop_20160812183750_2f4560e7-7a07-4443-8937-cd0ec03ee887); Time taken: 267.439 seconds
2016-08-12T18:43:08,664 ERROR [HiveServer2-Background-Pool: Thread-83]: operation.Operation (:()) - Error running hive query:
org.apache.hive.service.cli.HiveSQLException: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask. java.util.ConcurrentModificationException
        at org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:387)
        at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:237)
        at org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:90)
        at org.apache.hive.service.cli.operation.SQLOperation$2$1.run(SQLOperation.java:299)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
        at org.apache.hive.service.cli.operation.SQLOperation$2.run(SQLOperation.java:312)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.util.ConcurrentModificationException
        at org.apache.hadoop.hive.ql.metadata.Hive.moveFile(Hive.java:2942)
        at org.apache.hadoop.hive.ql.metadata.Hive.replaceFiles(Hive.java:3198)
        at org.apache.hadoop.hive.ql.metadata.Hive.loadTable(Hive.java:1805)
        at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:355)
        at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197)
        at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
        at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1858)
        at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1562)
        at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1313)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1084)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1077)
        at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:235)
        ... 11 more
Caused by: java.util.ConcurrentModificationException
        at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:859)
        at java.util.ArrayList$Itr.next(ArrayList.java:831)
        at org.apache.hadoop.hdfs.protocolPB.PBHelper.convertAclEntryProto(PBHelper.java:2325)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.setAcl(ClientNamenodeProtocolTranslatorPB.java:1325)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
        at com.sun.proxy.$Proxy28.setAcl(Unknown Source)
        at org.apache.hadoop.hdfs.DFSClient.setAcl(DFSClient.java:3242)
        at org.apache.hadoop.hdfs.DistributedFileSystem$43.doCall(DistributedFileSystem.java:2052)
        at org.apache.hadoop.hdfs.DistributedFileSystem$43.doCall(DistributedFileSystem.java:2049)
        at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at org.apache.hadoop.hdfs.DistributedFileSystem.setAcl(DistributedFileSystem.java:2049)
        at org.apache.hadoop.hive.io.HdfsUtils.setFullFileStatus(HdfsUtils.java:126)
        at org.apache.hadoop.hive.ql.metadata.Hive$3.call(Hive.java:2919)
        at org.apache.hadoop.hive.ql.metadata.Hive$3.call(Hive.java:2911)
        ... 4 more

An interesting thing is that if I narrow down the WHERE clause so that the SELECT query only returns about 300,000 rows, the INSERT completes successfully.

Thanks,
Mh F


Re: hive throws ConcurrentModificationException when executing insert overwrite table

Posted by Stephen Sprague <sp...@gmail.com>.
Indeed, +1 to Gopal on that explanation! That was huge.

On Wed, Aug 17, 2016 at 12:58 AM, 明浩 冯 <qi...@hotmail.com> wrote:

> [quoted message trimmed; it appears in full below in this thread]

Re: hive throws ConcurrentModificationException when executing insert overwrite table

Posted by 明浩 冯 <qi...@hotmail.com>.
Hi Gopal,


It works now that I've disabled dfs.namenode.acls.enabled.

As for the data loss, it doesn't affect me much for now, but I will track the issue in Kylin.

Thank you very much for your detailed explanation and solution. You saved me!


Best Regards,

Minghao Feng

________________________________
From: Gopal Vijayaraghavan <go...@hortonworks.com> on behalf of Gopal Vijayaraghavan <go...@apache.org>
Sent: Wednesday, August 17, 2016 1:18:54 PM
To: user@hive.apache.org
Subject: Re: hive throws ConcurrentModificationException when executing insert overwrite table


[quoted message trimmed; Gopal's reply appears in full below in this thread]



Re: hive throws ConcurrentModificationException when executing insert overwrite table

Posted by Gopal Vijayaraghavan <go...@apache.org>.
> Yes, Kylin generated the query. I'm using Kylin 1.5.3.

I would report a bug to Kylin about DISTRIBUTE BY RAND().

This is what happens when a node which ran a Map task fails and the whole
task is retried.

Assume that the first attempt of the Map task0 wrote value1 into
reducer-99, because RAND() returned 99.

Now the map task succeeds and the reducers start; reducer-0 runs
successfully and writes its output file 0000_0.

But before reducer-99 runs, the node which ran Map task0 crashes.

So the engine re-runs Map task0 on another node. But because RAND() is
completely random, this time it may return 0 for "value1".

The reducer-0 shuffle output from Map task0 now contains "value1", but
reducer-0 has already run, so no task will ever read it out or write it
out.

In short, the output of the table will not contain "value1", despite the
input and the shuffle outputs containing "value1".

I would replace the DISTRIBUTE BY RAND() with SORT BY 0, for a random
distribution without data loss.
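
To make the failure mode concrete, here is a minimal, self-contained Java
sketch (hypothetical code, not Hive's actual shuffle path) of why a
partition that is re-rolled on every task attempt can strand a record:

    import java.util.Random;

    // Hypothetical sketch -- not Hive's shuffle code -- of why a retried
    // map task strands records when the partition key is RAND().
    class RandDistributeSketch {
        static final int NUM_REDUCERS = 100;

        // With DISTRIBUTE BY RAND(), the target reducer is NOT a function
        // of the record; it is re-rolled on every task attempt.
        static int partitionOf(String record) {
            return new Random().nextInt(NUM_REDUCERS);
        }

        public static void main(String[] args) {
            int first = partitionOf("value1"); // attempt 1: say it returns 99
            // reducer-0 runs and commits 0000_0; then the node running map
            // task0 dies, and the task is retried elsewhere:
            int retry = partitionOf("value1"); // attempt 2: say it returns 0
            // reducer-0 has already finished, so no task ever reads the new
            // copy of "value1" -- the row silently vanishes from the table.
            System.out.println("first=" + first + ", retry=" + retry);
        }
    }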

> But I'm still not sure how I can fix the problem. I'm a beginner with
> Hive and Kylin. Can the problem be fixed by just changing Hive or Kylin
> settings?

If you're just experimenting with Kylin right now, I recommend just
disabling the ACL settings in HDFS (this is not permissions btw, ACLs are
permissions++).

Set dfs.namenode.acls.enabled=false in core-site.xml and wherever else in
your /etc/hadoop/conf it shows up and you should be good to avoid the race
condition.
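
For reference, the entry in those XML files takes the usual Hadoop
property form:

    <property>
      <name>dfs.namenode.acls.enabled</name>
      <value>false</value>
    </property>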

Cheers,
Gopal



Re: hive throws ConcurrentModificationException when executing insert overwrite table

Posted by 明浩 冯 <qi...@hotmail.com>.
Hi Gopal,


Thanks for your comment.

Yes, Kylin generated the query. I'm using Kylin 1.5.3.

But I'm still not sure how I can fix the problem. I'm a beginner with Hive and Kylin. Can the problem be fixed by just changing Hive or Kylin settings?

The total data is about 1 billion rows. I'm trying to build a cube as the base and then deal with the daily increment. Should I separate the 1 billion rows into hundreds of pieces and then build the cube?


Thanks,

Minghao Feng

________________________________
From: Gopal Vijayaraghavan <go...@hortonworks.com> on behalf of Gopal Vijayaraghavan <go...@apache.org>
Sent: Wednesday, August 17, 2016 11:10:45 AM
To: user@hive.apache.org
Subject: Re: hive throws ConcurrentModificationException when executing insert overwrite table


[quoted message trimmed; Gopal's reply appears in full below in this thread]


Re: hive throws ConcurrentModificationException when executing insert overwrite table

Posted by Gopal Vijayaraghavan <go...@apache.org>.
> This problem has blocked me for a whole week; does anybody have any ideas?

There might be a race condition here:

<https://github.com/apache/hive/blob/master/shims/common/src/main/java/org/apache/hadoop/hive/io/HdfsUtils.java#L68>


aclStatus.getEntries() is being modified without being copied (oddly,
with Kerberos it might be okay).
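
For illustration, a minimal self-contained Java sketch (hypothetical, not
the actual Hive/HDFS code) of the fail-fast iterator tripping on an
in-place modification, and the defensive-copy fix:

    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.List;

    // Hypothetical sketch -- not the real code -- of the comodification
    // failure mode and the usual defensive-copy fix.
    class AclRaceSketch {
        public static void main(String[] args) {
            List<String> entries =
                new ArrayList<>(Arrays.asList("user::rwx", "group::r-x"));

            // Structurally modifying the live list while iterating it (done
            // here in one thread for determinism; in Hive the modification
            // comes from a concurrent file-move thread) trips the iterator's
            // fail-fast check:
            try {
                for (String e : entries) {
                    entries.add("other::" + e);
                }
            } catch (java.util.ConcurrentModificationException ex) {
                System.out.println("caught: " + ex);
            }

            // The usual fix is to hand the iterator a defensive copy, e.g.
            // new ArrayList<>(aclStatus.getEntries()), so edits to the
            // shared list are never observed mid-iteration.
            for (String e : new ArrayList<>(entries)) {
                System.out.println(e);
            }
        }
    }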


>> >= '1970-01-01 01:00:00' AND TBL_HIS_UWIP_SCAN_PROM.START_TIME <
>>'2010-01-01 01:00:00') DISTRIBUTE BY RAND();

Did Kylin generate this query? This pattern is known to cause data loss
at runtime.

DISTRIBUTE BY RAND() loses data when map tasks fail.

>        at org.apache.hadoop.hdfs.DFSClient.setAcl(DFSClient.java:3242)
...
>        at 
>org.apache.hadoop.hive.io.HdfsUtils.setFullFileStatus(HdfsUtils.java:126)

> An interesting thing is that if I narrow down the WHERE clause so that
> the SELECT query only returns about 300,000 rows, the INSERT completes
> successfully.

Producing exactly 1 file will fix the issue (with a single output file,
only one thread moves files and sets ACLs, so there is nothing to race
with).

Cheers,
Gopal


Re: hive throws ConcurrentModificationException when executing insert overwrite table

Posted by 明浩 冯 <qi...@hotmail.com>.
Hi,


This problem has blocked me for a whole week; does anybody have any ideas?

Many thanks.


Mh F

________________________________
From: 明浩 冯 <qi...@hotmail.com>
Sent: Monday, August 15, 2016 2:43:58 PM
To: user@hive.apache.org
Subject: hive throws ConcurrentModificationException when executing insert overwrite table


[quoted original message trimmed; it appears in full at the top of this thread]