Posted to issues@hive.apache.org by "Eugene Koifman (JIRA)" <ji...@apache.org> on 2017/02/21 18:12:44 UTC
[jira] [Commented] (HIVE-15756) Update/deletes on ACID table throws ArrayIndexOutOfBoundsException
[ https://issues.apache.org/jira/browse/HIVE-15756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15876413#comment-15876413 ]
Eugene Koifman commented on HIVE-15756:
---------------------------------------
This is caused by setting hive.enforce.bucketing=false, which is not supported for ACID tables. In fact, this property doesn't even exist in Hive 2.2.
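For reference, updates and deletes on an ACID table require bucketing to be enforced at write time. A minimal sketch of a working setup (the session properties come from the Hive transactions documentation; the table and column names here are illustrative, not taken from this issue):

{noformat}
-- On Hive 1.x, bucketing enforcement must stay enabled for ACID writes;
-- the property was removed in Hive 2.x, where bucketing is always enforced.
SET hive.enforce.bucketing=true;
SET hive.support.concurrency=true;
SET hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;

-- ACID tables must be bucketed, stored as ORC, and marked transactional.
CREATE TABLE customer_acid (
  c_custkey INT,
  c_comment STRING
)
CLUSTERED BY (c_custkey) INTO 8 BUCKETS
STORED AS ORC
TBLPROPERTIES ('transactional'='true');
{noformat}

With enforcement disabled, insert-time data can land in a bucket layout that the update/delete path does not expect, which is consistent with the ArrayIndexOutOfBoundsException in FileSinkOperator below.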
> Update/deletes on ACID table throws ArrayIndexOutOfBoundsException
> ------------------------------------------------------------------
>
> Key: HIVE-15756
> URL: https://issues.apache.org/jira/browse/HIVE-15756
> Project: Hive
> Issue Type: Bug
> Components: Transactions
> Affects Versions: 2.0.0
> Reporter: Kavan Suresh
> Assignee: Eugene Koifman
> Priority: Critical
>
> Update and delete queries on ACID tables fail throwing ArrayIndexOutOfBoundsException.
> {noformat}
> hive> update customer_acid set c_comment = 'foo bar' where c_custkey % 100 = 1;
> Query ID = cstm-hdfs_20170128005823_efa1cdb7-2ad2-4371-ac80-0e35868ad17c
> Total jobs = 1
> Launching Job 1 out of 1
> Tez session was closed. Reopening...
> Session re-established.
> Status: Running (Executing on YARN cluster with App id application_1485331877667_0036)
> --------------------------------------------------------------------------------
> VERTICES STATUS TOTAL COMPLETED RUNNING PENDING FAILED KILLED
> --------------------------------------------------------------------------------
> Map 1 .......... SUCCEEDED 14 14 0 0 0 0
> Reducer 2 FAILED 1 0 0 1 1 0
> --------------------------------------------------------------------------------
> VERTICES: 01/02 [========================>>--] 93% ELAPSED TIME: 23.68 s
> --------------------------------------------------------------------------------
> Status: Failed
> Vertex failed, vertexName=Reducer 2, vertexId=vertex_1485331877667_0036_1_01, diagnostics=[Task failed, taskId=task_1485331877667_0036_1_01_000000, diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {"key":{"reducesinkkey0":{"transactionid":72,"bucketid":1,"rowid":0}},"value":{"_col0":103601,"_col1":"Customer#000103601","_col2":"3cYSrJtAA36vth35 emuIk","_col3":20,"_col4":"30-526-248-3190","_col5":8047.21,"_col6":"MACHINERY "}}
> at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:173)
> at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:139)
> at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:347)
> at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:194)
> at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:185)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
> at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:185)
> at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:181)
> at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {"key":{"reducesinkkey0":{"transactionid":72,"bucketid":1,"rowid":0}},"value":{"_col0":103601,"_col1":"Customer#000103601","_col2":"3cYSrJtAA36vth35 emuIk","_col3":20,"_col4":"30-526-248-3190","_col5":8047.21,"_col6":"MACHINERY "}}
> at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:284)
> at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:252)
> at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:150)
> ... 14 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {"key":{"reducesinkkey0":{"transactionid":72,"bucketid":1,"rowid":0}},"value":{"_col0":103601,"_col1":"Customer#000103601","_col2":"3cYSrJtAA36vth35 emuIk","_col3":20,"_col4":"30-526-248-3190","_col5":8047.21,"_col6":"MACHINERY "}}
> at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:352)
> at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:274)
> ... 16 more
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 1
> at org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:780)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:838)
> at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:88)
> at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:343)
> ... 17 more
> ]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:0, Vertex vertex_1485331877667_0036_1_01 [Reducer 2] killed/failed due to:OWN_TASK_FAILURE]
> DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:0
> FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Reducer 2, vertexId=vertex_1485331877667_0036_1_01, diagnostics=[Task failed, taskId=task_1485331877667_0036_1_01_000000, diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {"key":{"reducesinkkey0":{"transactionid":72,"bucketid":1,"rowid":0}},"value":{"_col0":103601,"_col1":"Customer#000103601","_col2":"3cYSrJtAA36vth35 emuIk","_col3":20,"_col4":"30-526-248-3190","_col5":8047.21,"_col6":"MACHINERY "}}
> at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:173)
> at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:139)
> at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:347)
> at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:194)
> at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:185)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
> at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:185)
> at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:181)
> at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {"key":{"reducesinkkey0":{"transactionid":72,"bucketid":1,"rowid":0}},"value":{"_col0":103601,"_col1":"Customer#000103601","_col2":"3cYSrJtAA36vth35 emuIk","_col3":20,"_col4":"30-526-248-3190","_col5":8047.21,"_col6":"MACHINERY "}}
> at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:284)
> at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:252)
> at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:150)
> ... 14 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {"key":{"reducesinkkey0":{"transactionid":72,"bucketid":1,"rowid":0}},"value":{"_col0":103601,"_col1":"Customer#000103601","_col2":"3cYSrJtAA36vth35 emuIk","_col3":20,"_col4":"30-526-248-3190","_col5":8047.21,"_col6":"MACHINERY "}}
> at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:352)
> at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:274)
> ... 16 more
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 1
> at org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:780)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:838)
> at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:88)
> at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:343)
> ... 17 more
> ]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:0, Vertex vertex_1485331877667_0036_1_01 [Reducer 2] killed/failed due to:OWN_TASK_FAILURE]DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:0
> {noformat}
> {noformat}
> hive> explain extended update customer_acid set c_comment = 'foo bar' where c_custkey % 100 = 1;
> OK
> ABSTRACT SYNTAX TREE:
>
> TOK_UPDATE_TABLE
> TOK_TABNAME
> customer_acid
> TOK_SET_COLUMNS_CLAUSE
> =
> TOK_TABLE_OR_COL
> c_comment
> 'foo bar'
> TOK_WHERE
> =
> %
> TOK_TABLE_OR_COL
> c_custkey
> 100
> 1
> STAGE DEPENDENCIES:
> Stage-1 is a root stage
> Stage-2 depends on stages: Stage-1
> Stage-0 depends on stages: Stage-2
> Stage-3 depends on stages: Stage-0
> STAGE PLANS:
> Stage: Stage-1
> Tez
> DagId: cstm-hdfs_20170128012834_4d41e184-1e40-443c-9990-147cfdc6ea15:5
> Edges:
> Reducer 2 <- Map 1 (SIMPLE_EDGE)
> DagName:
> Vertices:
> Map 1
> Map Operator Tree:
> TableScan
> alias: customer_acid
> filterExpr: ((c_custkey % 100) = 1) (type: boolean)
> Statistics: Num rows: 25219 Data size: 8700894 Basic stats: COMPLETE Column stats: NONE
> GatherStats: false
> Filter Operator
> isSamplingPred: false
> predicate: ((c_custkey % 100) = 1) (type: boolean)
> Statistics: Num rows: 12609 Data size: 4350274 Basic stats: COMPLETE Column stats: NONE
> Select Operator
> expressions: ROW__ID (type: struct<transactionid:bigint,bucketid:int,rowid:bigint>), c_custkey (type: int), c_name (type: string), c_address (type: string), c_nationkey (type: int), c_phone (type: char(15)), c_acctbal (type: decimal(15,2)), c_mktsegment (type: char(10))
> outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7
> Statistics: Num rows: 12609 Data size: 4350274 Basic stats: COMPLETE Column stats: NONE
> Reduce Output Operator
> key expressions: _col0 (type: struct<transactionid:bigint,bucketid:int,rowid:bigint>)
> sort order: +
> Statistics: Num rows: 12609 Data size: 4350274 Basic stats: COMPLETE Column stats: NONE
> tag: -1
> value expressions: _col1 (type: int), _col2 (type: string), _col3 (type: string), _col4 (type: int), _col5 (type: char(15)), _col6 (type: decimal(15,2)), _col7 (type: char(10))
> auto parallelism: true
> Path -> Alias:
> hdfs://hive-acid-upgrade-issue-5.openstacklocal:8020/apps/hive/warehouse/tpch.db/customer_acid [customer_acid]
> Path -> Partition:
> hdfs://hive-acid-upgrade-issue-5.openstacklocal:8020/apps/hive/warehouse/tpch.db/customer_acid
> Partition
> base file name: customer_acid
> input format: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
> output format: org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
> properties:
> bucket_count 8
> bucket_field_name c_custkey
> columns c_custkey,c_name,c_address,c_nationkey,c_phone,c_acctbal,c_mktsegment,c_comment
> columns.comments
> columns.types int:string:string:int:char(15):decimal(15,2):char(10):string
> file.inputformat org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
> file.outputformat org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
> location hdfs://hive-acid-upgrade-issue-5.openstacklocal:8020/apps/hive/warehouse/tpch.db/customer_acid
> name tpch.customer_acid
> numFiles 12
> numRows 0
> rawDataSize 0
> serialization.ddl struct customer_acid { i32 c_custkey, string c_name, string c_address, i32 c_nationkey, char(15) c_phone, decimal(15,2) c_acctbal, char(10) c_mktsegment, string c_comment}
> serialization.format 1
> serialization.lib org.apache.hadoop.hive.ql.io.orc.OrcSerde
> totalSize 8700894
> transactional true
> transient_lastDdlTime 1485548417
> serde: org.apache.hadoop.hive.ql.io.orc.OrcSerde
>
> input format: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
> output format: org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
> properties:
> bucket_count 8
> bucket_field_name c_custkey
> columns c_custkey,c_name,c_address,c_nationkey,c_phone,c_acctbal,c_mktsegment,c_comment
> columns.comments
> columns.types int:string:string:int:char(15):decimal(15,2):char(10):string
> file.inputformat org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
> file.outputformat org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
> location hdfs://hive-acid-upgrade-issue-5.openstacklocal:8020/apps/hive/warehouse/tpch.db/customer_acid
> name tpch.customer_acid
> numFiles 12
> numRows 0
> rawDataSize 0
> serialization.ddl struct customer_acid { i32 c_custkey, string c_name, string c_address, i32 c_nationkey, char(15) c_phone, decimal(15,2) c_acctbal, char(10) c_mktsegment, string c_comment}
> serialization.format 1
> serialization.lib org.apache.hadoop.hive.ql.io.orc.OrcSerde
> totalSize 8700894
> transactional true
> transient_lastDdlTime 1485548417
> serde: org.apache.hadoop.hive.ql.io.orc.OrcSerde
> name: tpch.customer_acid
> name: tpch.customer_acid
> Truncated Path -> Alias:
> /tpch.db/customer_acid [customer_acid]
> Reducer 2
> Needs Tagging: false
> Reduce Operator Tree:
> Select Operator
> expressions: KEY.reducesinkkey0 (type: struct<transactionid:bigint,bucketid:int,rowid:bigint>), VALUE._col0 (type: int), VALUE._col1 (type: string), VALUE._col2 (type: string), VALUE._col3 (type: int), VALUE._col4 (type: char(15)), VALUE._col5 (type: decimal(15,2)), VALUE._col6 (type: char(10)), 'foo bar' (type: string)
> outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8
> Statistics: Num rows: 12609 Data size: 4350274 Basic stats: COMPLETE Column stats: NONE
> File Output Operator
> compressed: false
> GlobalTableId: 1
> directory: hdfs://hive-acid-upgrade-issue-5.openstacklocal:8020/apps/hive/warehouse/tpch.db/customer_acid/.hive-staging_hive_2017-01-28_01-28-34_547_5091220054599015088-1/-ext-10000
> NumFilesPerFileSink: 1
> Statistics: Num rows: 12609 Data size: 4350274 Basic stats: COMPLETE Column stats: NONE
> Stats Publishing Key Prefix: hdfs://hive-acid-upgrade-issue-5.openstacklocal:8020/apps/hive/warehouse/tpch.db/customer_acid/.hive-staging_hive_2017-01-28_01-28-34_547_5091220054599015088-1/-ext-10000/
> table:
> input format: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
> output format: org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
> properties:
> bucket_count 8
> bucket_field_name c_custkey
> columns c_custkey,c_name,c_address,c_nationkey,c_phone,c_acctbal,c_mktsegment,c_comment
> columns.comments
> columns.types int:string:string:int:char(15):decimal(15,2):char(10):string
> file.inputformat org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
> file.outputformat org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
> location hdfs://hive-acid-upgrade-issue-5.openstacklocal:8020/apps/hive/warehouse/tpch.db/customer_acid
> name tpch.customer_acid
> numFiles 12
> numRows 0
> rawDataSize 0
> serialization.ddl struct customer_acid { i32 c_custkey, string c_name, string c_address, i32 c_nationkey, char(15) c_phone, decimal(15,2) c_acctbal, char(10) c_mktsegment, string c_comment}
> serialization.format 1
> serialization.lib org.apache.hadoop.hive.ql.io.orc.OrcSerde
> totalSize 8700894
> transactional true
> transient_lastDdlTime 1485548417
> serde: org.apache.hadoop.hive.ql.io.orc.OrcSerde
> name: tpch.customer_acid
> TotalFiles: 1
> GatherStats: true
> MultiFileSpray: false
> Stage: Stage-2
> Dependency Collection
> Stage: Stage-0
> Move Operator
> tables:
> replace: false
> source: hdfs://hive-acid-upgrade-issue-5.openstacklocal:8020/apps/hive/warehouse/tpch.db/customer_acid/.hive-staging_hive_2017-01-28_01-28-34_547_5091220054599015088-1/-ext-10000
> table:
> input format: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
> output format: org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
> properties:
> bucket_count 8
> bucket_field_name c_custkey
> columns c_custkey,c_name,c_address,c_nationkey,c_phone,c_acctbal,c_mktsegment,c_comment
> columns.comments
> columns.types int:string:string:int:char(15):decimal(15,2):char(10):string
> file.inputformat org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
> file.outputformat org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
> location hdfs://hive-acid-upgrade-issue-5.openstacklocal:8020/apps/hive/warehouse/tpch.db/customer_acid
> name tpch.customer_acid
> numFiles 12
> numRows 0
> rawDataSize 0
> serialization.ddl struct customer_acid { i32 c_custkey, string c_name, string c_address, i32 c_nationkey, char(15) c_phone, decimal(15,2) c_acctbal, char(10) c_mktsegment, string c_comment}
> serialization.format 1
> serialization.lib org.apache.hadoop.hive.ql.io.orc.OrcSerde
> totalSize 8700894
> transactional true
> transient_lastDdlTime 1485548417
> serde: org.apache.hadoop.hive.ql.io.orc.OrcSerde
> name: tpch.customer_acid
> Stage: Stage-3
> Stats-Aggr Operator
> Stats Aggregation Key Prefix: hdfs://hive-acid-upgrade-issue-5.openstacklocal:8020/apps/hive/warehouse/tpch.db/customer_acid/.hive-staging_hive_2017-01-28_01-28-34_547_5091220054599015088-1/-ext-10000/
> Time taken: 0.422 seconds, Fetched: 189 row(s)
> {noformat}
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)