Posted to issues@hive.apache.org by "Karen Coppage (Jira)" <ji...@apache.org> on 2020/07/31 16:16:00 UTC

[jira] [Updated] (HIVE-23966) Minor query-based compaction always results in delta dirs with minWriteId=1

     [ https://issues.apache.org/jira/browse/HIVE-23966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karen Coppage updated HIVE-23966:
---------------------------------
    Description: 
Minor compaction run after a major compaction or an insert overwrite (IOW) results in directories that look like:
 * base_z_v
 * delta_1_y_v
 * delete_delta_1_y_v

Should be:
 * base_z_v
 * delta_(z+1)_y_v
 * delete_delta_(z+1)_y_v
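
The expected naming can be sketched as follows. This is only an illustration, not Hive's implementation: the function name expected_minor_compaction_dirs is hypothetical, and the 7-digit zero padding is an assumption inferred from the paths in the error message below (for example delta_0000001_0000006_v0001058).

```python
def expected_minor_compaction_dirs(base_write_id, max_write_id, visibility_txn_id):
    """Return the (delta, delete_delta) directory names that a minor
    compaction should produce on top of base_<base_write_id>.

    Hypothetical helper for illustration; the padding width is assumed.
    """
    # The compacted deltas should start right after the base: z + 1, not 1.
    min_write_id = base_write_id + 1
    suffix = f"{min_write_id:07d}_{max_write_id:07d}_v{visibility_txn_id:07d}"
    return f"delta_{suffix}", f"delete_delta_{suffix}"
```

With z=1, y=6 and v=1058 this gives delta_0000002_0000006_v0001058 rather than the delta_0000001_0000006_v0001058 seen in the failure.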

Issues this causes:

For example, after running insert overwrite and then minor compaction, a subsequent major compaction will fail with the following error:
{noformat}
Found 2 equal splits: OrcSplit [hdfs://.../warehouse/tablespace/managed/hive/bucketed/delta_0000001_0000006_v0001058/bucket_00004,
start=0, length=722, isOriginal=false, fileLength=722, hasFooter=false, hasBase=true, deltas=1] and OrcSplit 
[hdfs://.../warehouse/tablespace/managed/hive/bucketed/base_0000001/bucket_00004_0,
start=0, length=811, isOriginal=false, fileLength=811, hasFooter=false, hasBase=true, deltas=1]
{noformat}

or it can fail with:
{noformat}
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Wrong sort order of Acid rows detected for the rows: org.apache.hadoop.hive.ql.udf.generic.GenericUDFValidateAcidSortOrder$WriteIdRowId@201be62b and org.apache.hadoop.hive.ql.udf.generic.GenericUDFValidateAcidSortOrder$WriteIdRowId@5f97bd3f
{noformat}
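
Both failures involve a delta directory whose write ID range starts at or below the write ID of the base. A rough way to spot such a directory from its name is sketched below. The helper delta_overlaps_base and its regular expressions are hypothetical and simplified (they ignore, for instance, any other suffixes a directory name may carry); they are not part of Hive.

```python
import re

# Simplified name patterns, assumed from the directory names in this issue.
_BASE = re.compile(r"^base_(\d+)(?:_v\d+)?$")
_DELTA = re.compile(r"^(?:delete_)?delta_(\d+)_(\d+)(?:_v\d+)?$")


def delta_overlaps_base(base_dir, delta_dir):
    """Return True if the delta's minWriteId is not above the base's write ID."""
    base = _BASE.match(base_dir)
    delta = _DELTA.match(delta_dir)
    if base is None or delta is None:
        raise ValueError(f"unrecognized directory name: {base_dir!r} / {delta_dir!r}")
    base_write_id = int(base.group(1))
    delta_min_write_id = int(delta.group(1))
    # A correctly named delta starts at base_write_id + 1 or later.
    return delta_min_write_id <= base_write_id
```

For the pair from the first error, delta_overlaps_base("base_0000001", "delta_0000001_0000006_v0001058") is True, while a delta named delta_0000002_0000006_v0001058 would not be flagged.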

  was:
Minor compaction after major/IOW will result in directories that look like:
 * base_z_v
 * delta_1_y_v
 * delete_delta_1_y_v

Should be:
 * base_z_v
 * delta_(z+1)_y_v
 * delete_delta_(z+1)_y_v


> Minor query-based compaction always results in delta dirs with minWriteId=1
> ---------------------------------------------------------------------------
>
>                 Key: HIVE-23966
>                 URL: https://issues.apache.org/jira/browse/HIVE-23966
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Karen Coppage
>            Assignee: Karen Coppage
>            Priority: Major
>
> Minor compaction run after a major compaction or an insert overwrite (IOW) results in directories that look like:
>  * base_z_v
>  * delta_1_y_v
>  * delete_delta_1_y_v
> Should be:
>  * base_z_v
>  * delta_(z+1)_y_v
>  * delete_delta_(z+1)_y_v
> Issues this causes:
> For example, after running insert overwrite and then minor compaction, a subsequent major compaction will fail with the following error:
> {noformat}
> Found 2 equal splits: OrcSplit [hdfs://.../warehouse/tablespace/managed/hive/bucketed/delta_0000001_0000006_v0001058/bucket_00004,
> start=0, length=722, isOriginal=false, fileLength=722, hasFooter=false, hasBase=true, deltas=1] and OrcSplit 
> [hdfs://.../warehouse/tablespace/managed/hive/bucketed/base_0000001/bucket_00004_0,
> start=0, length=811, isOriginal=false, fileLength=811, hasFooter=false, hasBase=true, deltas=1]
> {noformat}
> or it can fail with:
> {noformat}
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Wrong sort order of Acid rows detected for the rows: org.apache.hadoop.hive.ql.udf.generic.GenericUDFValidateAcidSortOrder$WriteIdRowId@201be62b and org.apache.hadoop.hive.ql.udf.generic.GenericUDFValidateAcidSortOrder$WriteIdRowId@5f97bd3f
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)