Posted to issues@hive.apache.org by "zengxl (Jira)" <ji...@apache.org> on 2023/05/05 11:44:00 UTC

[jira] [Comment Edited] (HIVE-22318) Java.io.exception:Two readers for

    [ https://issues.apache.org/jira/browse/HIVE-22318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17719814#comment-17719814 ] 

zengxl edited comment on HIVE-22318 at 5/5/23 11:43 AM:
--------------------------------------------------------

I have the same problem.
{code:java}
MERGE INTO WH_OFR.ITV_ACTIVATE_DAY_BUCKET_TEST WDM
USING WH_OFR.ITV_ACTIVATE_DAY_TEMP1_BUCKET_TEST IDM
ON (WDM.PROD_ID = IDM.PROD_ID)
WHEN MATCHED THEN UPDATE SET
  PROD_ID       = IDM.PROD_ID,
  PLATFORM_NAME = IDM.PLATFORM_NAME,
  ACCOUNT       = IDM.ACCOUNT,
  ACTIVE_DATE   = IDM.ACTIVE_DATE,
  FILE_CYCLE    = '2023-05-03',
  FILE_NBR      = 1
WHEN NOT MATCHED THEN INSERT VALUES (
  IDM.PROD_ID,
  IDM.PLATFORM_NAME,
  IDM.ACCOUNT,
  IDM.ACTIVE_DATE,
  '2023-05-03',
  1);

CREATE TABLE `WH_OFRST.ITV_ACTIVATE_DAY_BUCKET_TEST`(
  `prod_id` bigint, 
  `platform_name` string, 
  `account` string, 
  `active_date` timestamp, 
  `file_cycle` timestamp, 
  `file_nbr` bigint)
ROW FORMAT SERDE 
  'org.apache.hadoop.hive.ql.io.orc.OrcSerde' 
STORED AS INPUTFORMAT 
  'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat' 
OUTPUTFORMAT 
  'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
LOCATION
  'viewfs://xxx/UserData/wh_ofrst/itv_activate_day_bucket_test'
TBLPROPERTIES (
  'bucketing_version'='2', 
  'transactional'='true', 
  'transactional_properties'='default', 
  'transient_lastDdlTime'='1679341838');

CREATE TABLE `WH_OFRST.ITV_ACTIVATE_DAY_TEMP1_BUCKET_TEST`(
  `prod_id` bigint, 
  `platform_name` string, 
  `account` string, 
  `active_date` timestamp)
ROW FORMAT SERDE 
  'org.apache.hadoop.hive.ql.io.orc.OrcSerde' 
STORED AS INPUTFORMAT 
  'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat' 
OUTPUTFORMAT 
  'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
LOCATION
  'viewfs://xxx/UserData/tmp/wh_ofrst.itv_activate_day_temp1_bucket_test'
TBLPROPERTIES (
  'bucketing_version'='2', 
  'transactional'='true', 
  'transactional_properties'='default', 
  'transient_lastDdlTime'='1683236932') {code}
exception:
{code:java}
Error: java.io.IOException: java.io.IOException: Two readers for {originalWriteId: 4, 536870912(1.0.0), row: 730862, currentWriteId 8}: new [key={originalWriteId: 4, 536870912(1.0.0), row: 730862, currentWriteId 8}, nextRecord={2, 4, 536870912, 730862, 8, null}, reader=Hive ORC Reader(viewfs://xxx/UserData/wh_ofrst/itv_activate_day/delete_delta_0000005_0000008/bucket_00001, 9223372036854775807)], old [key={originalWriteId: 4, 536870912(1.0.0), row: 730862, currentWriteId 8}, nextRecord={2, 4, 536870912, 730862, 8, null}, reader=Hive ORC Reader(viewfs://xxx/UserData/wh_ofrst/itv_activate_day/delete_delta_0000005_0000008/bucket_00000, 9223372036854775807)]
    at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
    at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:420)
    at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:702)
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.<init>(MapTask.java:176)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:445)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:350)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:178)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1878)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:172)
Caused by: java.io.IOException: Two readers for {originalWriteId: 4, 536870912(1.0.0), row: 730862, currentWriteId 8}: new [key={originalWriteId: 4, 536870912(1.0.0), row: 730862, currentWriteId 8}, nextRecord={2, 4, 536870912, 730862, 8, null}, reader=Hive ORC Reader(viewfs://xxx/UserData/ffcs_edw/wh_ofrst/itv_activate_day/delete_delta_0000005_0000008/bucket_00001, 9223372036854775807)], old [key={originalWriteId: 4, 536870912(1.0.0), row: 730862, currentWriteId 8}, nextRecord={2, 4, 536870912, 730862, 8, null}, reader=Hive ORC Reader(viewfs://xxx/UserData/ffcs_edw/wh_ofrst/itv_activate_day/delete_delta_0000005_0000008/bucket_00000, 9223372036854775807)]
    at org.apache.hadoop.hive.ql.io.orc.OrcRawRecordMerger.ensurePutReader(OrcRawRecordMerger.java:1191)
    at org.apache.hadoop.hive.ql.io.orc.OrcRawRecordMerger.<init>(OrcRawRecordMerger.java:1146)
    at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getReader(OrcInputFormat.java:2110)
    at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getRecordReader(OrcInputFormat.java:2008)
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:417)
 {code}
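To illustrate the failure mode (this is a simplified sketch, not Hive source code): OrcRawRecordMerger keeps at most one delete-delta reader registered per record key (originalWriteId, bucket, rowId), and ensurePutReader raises the "Two readers" IOException when a second file yields a key that is already registered. That is exactly what happens if bucket_00000 and bucket_00001 under delete_delta contain identical rows. The class and helper names below are hypothetical stand-ins:
{code:java}
```java
import java.util.TreeMap;

// Hypothetical simplification of the duplicate-key check in
// OrcRawRecordMerger.ensurePutReader. Real Hive uses a RecordIdentifier
// object; a string key is enough to show the mechanism.
public class TwoReadersSketch {

    // Stand-in for Hive's RecordIdentifier: (originalWriteId, bucket, rowId).
    static String key(long originalWriteId, int bucket, long rowId) {
        return originalWriteId + "/" + bucket + "/" + rowId;
    }

    // One registered reader (here: its file path) per record key.
    static final TreeMap<String, String> readers = new TreeMap<>();

    // Registering the same key twice is a hard error, mirroring the
    // "Two readers for ..." IOException in the stack trace above.
    static void ensurePutReader(String key, String file) {
        String old = readers.put(key, file);
        if (old != null) {
            throw new IllegalStateException(
                "Two readers for " + key + ": new " + file + ", old " + old);
        }
    }

    public static void main(String[] args) {
        // bucket_00000 registers key (4, 536870912, 730862) first...
        ensurePutReader(key(4, 536870912, 730862), "delete_delta/bucket_00000");
        try {
            // ...then bucket_00001 holds an identical row, so the key repeats.
            ensurePutReader(key(4, 536870912, 730862), "delete_delta/bucket_00001");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```
{code}
So the exception is a symptom: the real question is why several delete-delta bucket files ended up containing the same rows in the first place.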
 


> Java.io.exception:Two readers for
> ---------------------------------
>
>                 Key: HIVE-22318
>                 URL: https://issues.apache.org/jira/browse/HIVE-22318
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive, HiveServer2
>    Affects Versions: 3.1.0
>            Reporter: max_c
>            Priority: Major
>         Attachments: hiveserver2 for exception.log
>
>
> I created an ACID table in ORC format:
>  
> {noformat}
> CREATE TABLE `some.TableA`( 
>    ....
>    )                                                                   
>  ROW FORMAT SERDE                                   
>    'org.apache.hadoop.hive.ql.io.orc.OrcSerde'      
>  STORED AS INPUTFORMAT                              
>    'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'  
>  OUTPUTFORMAT                                       
>    'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'  
>  TBLPROPERTIES (                                    
>    'bucketing_version'='2',                         
>    'orc.compress'='SNAPPY',                         
>    'transactional'='true',                          
>    'transactional_properties'='default'){noformat}
> After executing a MERGE INTO operation:
> {noformat}
> MERGE INTO some.TableA AS a USING (SELECT vend_no FROM some.TableB UNION ALL SELECT vend_no FROM some.TableC) AS b ON a.vend_no=b.vend_no WHEN MATCHED THEN DELETE
> {noformat}
> the problem happened (when selecting from TableA, the exception happens too):
> {noformat}
> java.io.IOException: java.io.IOException: Two readers for {originalWriteId: 4, bucket: 536870912(1.0.0), row: 2434, currentWriteId 25}: new [key={originalWriteId: 4, bucket: 536870912(1.0.0), row: 2434, currentWriteId 25}, nextRecord={2, 4, 536870912, 2434, 25, null}, reader=Hive ORC Reader(hdfs://hdpprod/warehouse/tablespace/managed/hive/some.db/tableA/delete_delta_0000015_0000026/bucket_00001, 9223372036854775807)], old [key={originalWriteId: 4, bucket: 536870912(1.0.0), row: 2434, currentWriteId 25}, nextRecord={2, 4, 536870912, 2434, 25, null}, reader=Hive ORC Reader(hdfs://hdpprod/warehouse/tablespace/managed/hive/some.db/tableA/delete_delta_0000015_0000026/bucket_00000{noformat}
> Using orc-tools I scanned all the files (bucket_00000, bucket_00001, bucket_00002) under delete_delta and found that the rows in every file are identical. I think this causes the same key (RecordIdentifier) to be seen again when bucket_00001 is scanned after bucket_00000, but I don't know why all the rows are the same across these bucket files.
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)