Posted to dev@hive.apache.org by "Shashank Pedamallu (Jira)" <ji...@apache.org> on 2020/05/19 16:55:00 UTC

[jira] [Created] (HIVE-23509) MapJoin AssertionError: Capacity must be power of 2

Shashank Pedamallu created HIVE-23509:
-----------------------------------------

             Summary: MapJoin AssertionError: Capacity must be power of 2
                 Key: HIVE-23509
                 URL: https://issues.apache.org/jira/browse/HIVE-23509
             Project: Hive
          Issue Type: Bug
         Environment: Hive-2.3.6
            Reporter: Shashank Pedamallu
            Assignee: Shashank Pedamallu


AssertionErrors were observed in a Hive query when the rowCount for the join has the form (2^x)+(2^(x+1)), i.e. 3*(2^x), which is not a power of two.
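
To illustrate why a value of this form trips a power-of-two check, here is a minimal, self-contained sketch. It is not Hive's code: the class CapacityCheck and both helper methods are hypothetical names introduced for illustration, and the sketch assumes the capacity is derived directly from such a row count. Rounding up to the next power of two is shown only as one possible way to obtain a valid capacity, not as the fix adopted for this issue.

```java
final class CapacityCheck {
    /** True when exactly one bit is set, i.e. the value is a power of two. */
    static boolean isPowerOfTwo(long capacity) {
        return capacity > 0 && Long.bitCount(capacity) == 1;
    }

    /** Smallest power of two >= value; value is assumed to be in [1, 2^62]. */
    static long nextPowerOfTwo(long value) {
        long highest = Long.highestOneBit(value);
        return highest == value ? value : highest << 1;
    }

    public static void main(String[] args) {
        int x = 10; // arbitrary exponent chosen for the example
        long rowCount = (1L << x) + (1L << (x + 1)); // 1024 + 2048 = 3072
        // 3072 = 3 * 2^10 has two bits set, so it is not a power of two.
        System.out.println(rowCount + " power of two? " + isPowerOfTwo(rowCount));
        // Rounding up gives 4096, which would pass such a check.
        System.out.println("rounded up: " + nextPowerOfTwo(rowCount));
    }
}
```

Any value of the form (2^x)+(2^(x+1)) has exactly two bits set, so a bit-count check rejects it for every x.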

Following is the stacktrace:
{noformat}
[2020-05-11 05:43:12,135] {base_task_runner.py:95} INFO - Subtask: ERROR : Vertex failed, vertexName=Map 4, vertexId=vertex_1588729523139_51702_1_06, diagnostics=[Task failed, taskId=task_1588729523139_51702_1_06_001286, diagnostics=[TaskAttempt 0 failed, info=[Error: Error while running task ( failure ) : attempt_1588729523139_51702_1_06_001286_0:java.lang.RuntimeException: java.lang.AssertionError: Capacity must be a power of two
[2020-05-11 05:43:12,136] {base_task_runner.py:95} INFO - Subtask: at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:211)
[2020-05-11 05:43:12,136] {base_task_runner.py:95} INFO - Subtask: at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:168)
[2020-05-11 05:43:12,136] {base_task_runner.py:95} INFO - Subtask: at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
[2020-05-11 05:43:12,136] {base_task_runner.py:95} INFO - Subtask: at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
[2020-05-11 05:43:12,136] {base_task_runner.py:95} INFO - Subtask: at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
[2020-05-11 05:43:12,136] {base_task_runner.py:95} INFO - Subtask: at java.security.AccessController.doPrivileged(Native Method)
[2020-05-11 05:43:12,136] {base_task_runner.py:95} INFO - Subtask: at javax.security.auth.Subject.doAs(Subject.java:422)
[2020-05-11 05:43:12,136] {base_task_runner.py:95} INFO - Subtask: at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1893)
[2020-05-11 05:43:12,136] {base_task_runner.py:95} INFO - Subtask: at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
[2020-05-11 05:43:12,136] {base_task_runner.py:95} INFO - Subtask: at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
[2020-05-11 05:43:12,136] {base_task_runner.py:95} INFO - Subtask: at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
[2020-05-11 05:43:12,136] {base_task_runner.py:95} INFO - Subtask: at java.util.concurrent.FutureTask.run(FutureTask.java:266)
[2020-05-11 05:43:12,136] {base_task_runner.py:95} INFO - Subtask: at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
[2020-05-11 05:43:12,136] {base_task_runner.py:95} INFO - Subtask: at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
[2020-05-11 05:43:12,137] {base_task_runner.py:95} INFO - Subtask: at java.lang.Thread.run(Thread.java:748)
[2020-05-11 05:43:12,137] {base_task_runner.py:95} INFO - Subtask: Caused by: java.lang.AssertionError: Capacity must be a power of two
[2020-05-11 05:43:12,137] {base_task_runner.py:95} INFO - Subtask: at org.apache.hadoop.hive.ql.exec.persistence.BytesBytesMultiHashMap.validateCapacity(BytesBytesMultiHashMap.java:552)
[2020-05-11 05:43:12,137] {base_task_runner.py:95} INFO - Subtask: at org.apache.hadoop.hive.ql.exec.persistence.BytesBytesMultiHashMap.expandAndRehashImpl(BytesBytesMultiHashMap.java:731)
[2020-05-11 05:43:12,137] {base_task_runner.py:95} INFO - Subtask: at org.apache.hadoop.hive.ql.exec.persistence.BytesBytesMultiHashMap.expandAndRehashToTarget(BytesBytesMultiHashMap.java:545)
[2020-05-11 05:43:12,137] {base_task_runner.py:95} INFO - Subtask: at org.apache.hadoop.hive.ql.exec.persistence.HybridHashTableContainer$HashPartition.getHashMapFromDisk(HybridHashTableContainer.java:183)
[2020-05-11 05:43:12,137] {base_task_runner.py:95} INFO - Subtask: at org.apache.hadoop.hive.ql.exec.MapJoinOperator.reloadHashTable(MapJoinOperator.java:641)
[2020-05-11 05:43:12,137] {base_task_runner.py:95} INFO - Subtask: at org.apache.hadoop.hive.ql.exec.MapJoinOperator.continueProcess(MapJoinOperator.java:603)
[2020-05-11 05:43:12,137] {base_task_runner.py:95} INFO - Subtask: at org.apache.hadoop.hive.ql.exec.MapJoinOperator.closeOp(MapJoinOperator.java:539)
[2020-05-11 05:43:12,137] {base_task_runner.py:95} INFO - Subtask: at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:697)
[2020-05-11 05:43:12,137] {base_task_runner.py:95} INFO - Subtask: at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:711)
[2020-05-11 05:43:12,137] {base_task_runner.py:95} INFO - Subtask: at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.close(MapRecordProcessor.java:477)
[2020-05-11 05:43:12,137] {base_task_runner.py:95} INFO - Subtask: at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:199)
[2020-05-11 05:43:12,137] {base_task_runner.py:95} INFO - Subtask: ... 14 more
[2020-05-11 05:43:12,138] {base_task_runner.py:95} INFO - Subtask: ], TaskAttempt 1 failed, info=[Error: Error while running task ( failure ) : attempt_1588729523139_51702_1_06_001286_1:java.lang.RuntimeException: java.lang.AssertionError: Capacity must be a power of two
[2020-05-11 05:43:12,138] {base_task_runner.py:95} INFO - Subtask: at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:211)
[2020-05-11 05:43:12,138] {base_task_runner.py:95} INFO - Subtask: at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:168)
[2020-05-11 05:43:12,138] {base_task_runner.py:95} INFO - Subtask: at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
[2020-05-11 05:43:12,138] {base_task_runner.py:95} INFO - Subtask: at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
[2020-05-11 05:43:12,138] {base_task_runner.py:95} INFO - Subtask: at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
[2020-05-11 05:43:12,138] {base_task_runner.py:95} INFO - Subtask: at java.security.AccessController.doPrivileged(Native Method)
[2020-05-11 05:43:12,138] {base_task_runner.py:95} INFO - Subtask: at javax.security.auth.Subject.doAs(Subject.java:422)
[2020-05-11 05:43:12,138] {base_task_runner.py:95} INFO - Subtask: at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1893)
[2020-05-11 05:43:12,138] {base_task_runner.py:95} INFO - Subtask: at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
[2020-05-11 05:43:12,138] {base_task_runner.py:95} INFO - Subtask: at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
[2020-05-11 05:43:12,138] {base_task_runner.py:95} INFO - Subtask: at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
[2020-05-11 05:43:12,138] {base_task_runner.py:95} INFO - Subtask: at java.util.concurrent.FutureTask.run(FutureTask.java:266)
[2020-05-11 05:43:12,138] {base_task_runner.py:95} INFO - Subtask: at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
[2020-05-11 05:43:12,139] {base_task_runner.py:95} INFO - Subtask: at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
[2020-05-11 05:43:12,139] {base_task_runner.py:95} INFO - Subtask: at java.lang.Thread.run(Thread.java:748)
[2020-05-11 05:43:12,139] {base_task_runner.py:95} INFO - Subtask: Caused by: java.lang.AssertionError: Capacity must be a power of two
[2020-05-11 05:43:12,139] {base_task_runner.py:95} INFO - Subtask: at org.apache.hadoop.hive.ql.exec.persistence.BytesBytesMultiHashMap.validateCapacity(BytesBytesMultiHashMap.java:552)
[2020-05-11 05:43:12,139] {base_task_runner.py:95} INFO - Subtask: at org.apache.hadoop.hive.ql.exec.persistence.BytesBytesMultiHashMap.expandAndRehashImpl(BytesBytesMultiHashMap.java:731)
[2020-05-11 05:43:12,139] {base_task_runner.py:95} INFO - Subtask: at org.apache.hadoop.hive.ql.exec.persistence.BytesBytesMultiHashMap.expandAndRehashToTarget(BytesBytesMultiHashMap.java:545)
[2020-05-11 05:43:12,139] {base_task_runner.py:95} INFO - Subtask: at org.apache.hadoop.hive.ql.exec.persistence.HybridHashTableContainer$HashPartition.getHashMapFromDisk(HybridHashTableContainer.java:183)
[2020-05-11 05:43:12,139] {base_task_runner.py:95} INFO - Subtask: at org.apache.hadoop.hive.ql.exec.MapJoinOperator.reloadHashTable(MapJoinOperator.java:641)
[2020-05-11 05:43:12,139] {base_task_runner.py:95} INFO - Subtask: at org.apache.hadoop.hive.ql.exec.MapJoinOperator.continueProcess(MapJoinOperator.java:603)
[2020-05-11 05:43:12,139] {base_task_runner.py:95} INFO - Subtask: at org.apache.hadoop.hive.ql.exec.MapJoinOperator.closeOp(MapJoinOperator.java:539)
[2020-05-11 05:43:12,139] {base_task_runner.py:95} INFO - Subtask: at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:697)
[2020-05-11 05:43:12,139] {base_task_runner.py:95} INFO - Subtask: at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:711)
[2020-05-11 05:43:12,139] {base_task_runner.py:95} INFO - Subtask: at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.close(MapRecordProcessor.java:477)
[2020-05-11 05:43:12,140] {base_task_runner.py:95} INFO - Subtask: at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:199)
[2020-05-11 05:43:12,140] {base_task_runner.py:95} INFO - Subtask: ... 14 more
[2020-05-11 05:43:12,140] {base_task_runner.py:95} INFO - Subtask: ], TaskAttempt 2 failed, info=[Error: Error while running task ( failure ) : attempt_1588729523139_51702_1_06_001286_2:java.lang.RuntimeException: java.lang.AssertionError: Capacity must be a power of two
[2020-05-11 05:43:12,140] {base_task_runner.py:95} INFO - Subtask: at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:211)
[2020-05-11 05:43:12,140] {base_task_runner.py:95} INFO - Subtask: at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:168)
[2020-05-11 05:43:12,140] {base_task_runner.py:95} INFO - Subtask: at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
[2020-05-11 05:43:12,140] {base_task_runner.py:95} INFO - Subtask: at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
[2020-05-11 05:43:12,140] {base_task_runner.py:95} INFO - Subtask: at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
[2020-05-11 05:43:12,140] {base_task_runner.py:95} INFO - Subtask: at java.security.AccessController.doPrivileged(Native Method)
[2020-05-11 05:43:12,140] {base_task_runner.py:95} INFO - Subtask: at javax.security.auth.Subject.doAs(Subject.java:422)
[2020-05-11 05:43:12,140] {base_task_runner.py:95} INFO - Subtask: at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1893)
[2020-05-11 05:43:12,140] {base_task_runner.py:95} INFO - Subtask: at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
[2020-05-11 05:43:12,140] {base_task_runner.py:95} INFO - Subtask: at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
[2020-05-11 05:43:12,140] {base_task_runner.py:95} INFO - Subtask: at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
[2020-05-11 05:43:12,140] {base_task_runner.py:95} INFO - Subtask: at java.util.concurrent.FutureTask.run(FutureTask.java:266)
[2020-05-11 05:43:12,141] {base_task_runner.py:95} INFO - Subtask: at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
[2020-05-11 05:43:12,141] {base_task_runner.py:95} INFO - Subtask: at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
[2020-05-11 05:43:12,141] {base_task_runner.py:95} INFO - Subtask: at java.lang.Thread.run(Thread.java:748)
[2020-05-11 05:43:12,141] {base_task_runner.py:95} INFO - Subtask: Caused by: java.lang.AssertionError: Capacity must be a power of two
[2020-05-11 05:43:12,141] {base_task_runner.py:95} INFO - Subtask: at org.apache.hadoop.hive.ql.exec.persistence.BytesBytesMultiHashMap.validateCapacity(BytesBytesMultiHashMap.java:552)
[2020-05-11 05:43:12,141] {base_task_runner.py:95} INFO - Subtask: at org.apache.hadoop.hive.ql.exec.persistence.BytesBytesMultiHashMap.expandAndRehashImpl(BytesBytesMultiHashMap.java:731)
[2020-05-11 05:43:12,141] {base_task_runner.py:95} INFO - Subtask: at org.apache.hadoop.hive.ql.exec.persistence.BytesBytesMultiHashMap.expandAndRehashToTarget(BytesBytesMultiHashMap.java:545)
[2020-05-11 05:43:12,141] {base_task_runner.py:95} INFO - Subtask: at org.apache.hadoop.hive.ql.exec.persistence.HybridHashTableContainer$HashPartition.getHashMapFromDisk(HybridHashTableContainer.java:183)
[2020-05-11 05:43:12,141] {base_task_runner.py:95} INFO - Subtask: at org.apache.hadoop.hive.ql.exec.MapJoinOperator.reloadHashTable(MapJoinOperator.java:641)
[2020-05-11 05:43:12,141] {base_task_runner.py:95} INFO - Subtask: at org.apache.hadoop.hive.ql.exec.MapJoinOperator.continueProcess(MapJoinOperator.java:603)
[2020-05-11 05:43:12,141] {base_task_runner.py:95} INFO - Subtask: at org.apache.hadoop.hive.ql.exec.MapJoinOperator.closeOp(MapJoinOperator.java:539)
[2020-05-11 05:43:12,141] {base_task_runner.py:95} INFO - Subtask: at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:697)
[2020-05-11 05:43:12,141] {base_task_runner.py:95} INFO - Subtask: at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:711)
[2020-05-11 05:43:12,141] {base_task_runner.py:95} INFO - Subtask: at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.close(MapRecordProcessor.java:477)
[2020-05-11 05:43:12,142] {base_task_runner.py:95} INFO - Subtask: at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:199)
[2020-05-11 05:43:12,142] {base_task_runner.py:95} INFO - Subtask: ... 14 more
[2020-05-11 05:43:12,142] {base_task_runner.py:95} INFO - Subtask: ], TaskAttempt 3 failed, info=[Error: Error while running task ( failure ) : attempt_1588729523139_51702_1_06_001286_3:java.lang.RuntimeException: java.lang.AssertionError: Capacity must be a power of two
[2020-05-11 05:43:12,142] {base_task_runner.py:95} INFO - Subtask: at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:211)
[2020-05-11 05:43:12,142] {base_task_runner.py:95} INFO - Subtask: at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:168)
[2020-05-11 05:43:12,142] {base_task_runner.py:95} INFO - Subtask: at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
[2020-05-11 05:43:12,142] {base_task_runner.py:95} INFO - Subtask: at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
[2020-05-11 05:43:12,142] {base_task_runner.py:95} INFO - Subtask: at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
[2020-05-11 05:43:12,142] {base_task_runner.py:95} INFO - Subtask: at java.security.AccessController.doPrivileged(Native Method)
[2020-05-11 05:43:12,142] {base_task_runner.py:95} INFO - Subtask: at javax.security.auth.Subject.doAs(Subject.java:422)
[2020-05-11 05:43:12,142] {base_task_runner.py:95} INFO - Subtask: at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1893)
[2020-05-11 05:43:12,142] {base_task_runner.py:95} INFO - Subtask: at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
[2020-05-11 05:43:12,142] {base_task_runner.py:95} INFO - Subtask: at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
[2020-05-11 05:43:12,142] {base_task_runner.py:95} INFO - Subtask: at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
[2020-05-11 05:43:12,143] {base_task_runner.py:95} INFO - Subtask: at java.util.concurrent.FutureTask.run(FutureTask.java:266)
[2020-05-11 05:43:12,143] {base_task_runner.py:95} INFO - Subtask: at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
[2020-05-11 05:43:12,143] {base_task_runner.py:95} INFO - Subtask: at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
[2020-05-11 05:43:12,143] {base_task_runner.py:95} INFO - Subtask: at java.lang.Thread.run(Thread.java:748)
[2020-05-11 05:43:12,143] {base_task_runner.py:95} INFO - Subtask: Caused by: java.lang.AssertionError: Capacity must be a power of two
[2020-05-11 05:43:12,143] {base_task_runner.py:95} INFO - Subtask: at org.apache.hadoop.hive.ql.exec.persistence.BytesBytesMultiHashMap.validateCapacity(BytesBytesMultiHashMap.java:552)
[2020-05-11 05:43:12,143] {base_task_runner.py:95} INFO - Subtask: at org.apache.hadoop.hive.ql.exec.persistence.BytesBytesMultiHashMap.expandAndRehashImpl(BytesBytesMultiHashMap.java:731)
[2020-05-11 05:43:12,143] {base_task_runner.py:95} INFO - Subtask: at org.apache.hadoop.hive.ql.exec.persistence.BytesBytesMultiHashMap.expandAndRehashToTarget(BytesBytesMultiHashMap.java:545)
[2020-05-11 05:43:12,143] {base_task_runner.py:95} INFO - Subtask: at org.apache.hadoop.hive.ql.exec.persistence.HybridHashTableContainer$HashPartition.getHashMapFromDisk(HybridHashTableContainer.java:183)
[2020-05-11 05:43:12,143] {base_task_runner.py:95} INFO - Subtask: at org.apache.hadoop.hive.ql.exec.MapJoinOperator.reloadHashTable(MapJoinOperator.java:641)
[2020-05-11 05:43:12,143] {base_task_runner.py:95} INFO - Subtask: at org.apache.hadoop.hive.ql.exec.MapJoinOperator.continueProcess(MapJoinOperator.java:603)
[2020-05-11 05:43:12,143] {base_task_runner.py:95} INFO - Subtask: at org.apache.hadoop.hive.ql.exec.MapJoinOperator.closeOp(MapJoinOperator.java:539)
[2020-05-11 05:43:12,143] {base_task_runner.py:95} INFO - Subtask: at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:697)
[2020-05-11 05:43:12,143] {base_task_runner.py:95} INFO - Subtask: at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:711)
[2020-05-11 05:43:12,144] {base_task_runner.py:95} INFO - Subtask: at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.close(MapRecordProcessor.java:477)
[2020-05-11 05:43:12,144] {base_task_runner.py:95} INFO - Subtask: at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:199)
[2020-05-11 05:43:12,144] {base_task_runner.py:95} INFO - Subtask: ... 14 more
{noformat}


