Posted to issues@hive.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2022/11/14 00:26:00 UTC
[jira] [Work logged] (HIVE-26734) Iceberg: Add an option to allow positional delete files without actual row data
[ https://issues.apache.org/jira/browse/HIVE-26734?focusedWorklogId=825613&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-825613 ]
ASF GitHub Bot logged work on HIVE-26734:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 14/Nov/22 00:25
Start Date: 14/Nov/22 00:25
Worklog Time Spent: 10m
Work Description: ayushtkn opened a new pull request, #3758:
URL: https://github.com/apache/hive/pull/3758
### What changes were proposed in this pull request?
Make writing the actual row data into positional delete files optional.
### Why are the changes needed?
To avoid the cost of reading and writing the actual row data when reading and writing delete files.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Added unit tests to verify the functionality. Also checked the file contents and operations in a real environment.
**Output (with config enabled):**
```
[root@ayushsaxena-2 ~]# sudo -u hive hive --orcfiledump -d hdfs://cluster:8020/warehouse/tablespace/external/hive/ice02/data/00000-0-delete-hive_20221112171847_aba7a24a-68ec-46fa-bed4-3588f4a92e74-job_16680670596081_0052-1-00001.orc
Processing data file hdfs://cluster:8020/warehouse/tablespace/external/hive/ice02/data/00000-0-delete-hive_20221112171847_aba7a24a-68ec-46fa-bed4-3588f4a92e74-job_16680670596081_0052-1-00001.orc [length: 1075]
{"file_path":"hdfs:\/\/cluster:8020\/warehouse\/tablespace\/external\/hive\/ice02\/data\/00000-0-data-hive_20221112171758_ff81cb9a-d455-455d-8080-9b1507f33bd8-job_16680670596080_0052-2-00001.orc","pos":3}
```
**Output (with config disabled):**
```
[root@ayushsaxena-2 ~]# sudo -u hive hive --orcfiledump -d hdfs://cluster:8020/warehouse/tablespace/external/hive/ice01/data/00000-0-delete-hive_20221112171823_8817d25c-3178-4f4b-9908-be794029cbce-job_16680670596081_0052-1-00001.orc
Processing data file hdfs://cluster:8020/warehouse/tablespace/external/hive/ice01/data/00000-0-delete-hive_20221112171823_8817d25c-3178-4f4b-9908-be794029cbce-job_16680670596081_0052-1-00001.orc [length: 1178]
{"file_path":"hdfs:\/\/cluster:8020\/warehouse\/tablespace\/external\/hive\/ice01\/data\/00000-0-data-hive_20221112171727_18104d4c-f5a2-4c47-8f6e-b8e4f17a18b3-job_16680670596080_0052-1-00001.orc","pos":3,"row":{"id":4}}
```
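The two dumps differ only in the optional `row` field of each delete record. As a rough way to check this mechanically, here is a minimal sketch that assumes the `--orcfiledump -d` output prints one JSON object per record, as in the dumps above. The helper names `has_row_data` and `count_records_with_row` are hypothetical, and the file paths in the sample lines are shortened placeholders, not the real ones.

```python
import json


def has_row_data(record_line: str) -> bool:
    """Return True if a dumped positional delete record carries the deleted row."""
    record = json.loads(record_line)
    # file_path and pos are always present; "row" is the optional part.
    return record.get("row") is not None


def count_records_with_row(dump_lines):
    """Count JSON records in a dump that include row data, skipping non-JSON lines."""
    json_lines = [line for line in dump_lines if line.lstrip().startswith("{")]
    return sum(1 for line in json_lines if has_row_data(line))


# Sample lines shaped like the dumps above (placeholder paths, assumed for illustration).
dump_without_row = [
    "Processing data file hdfs://cluster:8020/path/to/delete.orc [length: 1075]",
    r'{"file_path":"hdfs:\/\/cluster:8020\/path\/to\/data.orc","pos":3}',
]
dump_with_row = [
    "Processing data file hdfs://cluster:8020/path/to/delete.orc [length: 1178]",
    r'{"file_path":"hdfs:\/\/cluster:8020\/path\/to\/data.orc","pos":3,"row":{"id":4}}',
]

print(count_records_with_row(dump_without_row))  # 0
print(count_records_with_row(dump_with_row))     # 1
```

In the dumps above, the file written without row data is also slightly smaller (1075 bytes versus 1178 bytes) for a single delete record.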
Issue Time Tracking
-------------------
Worklog Id: (was: 825613)
Remaining Estimate: 0h
Time Spent: 10m
> Iceberg: Add an option to allow positional delete files without actual row data
> -------------------------------------------------------------------------------
>
> Key: HIVE-26734
> URL: https://issues.apache.org/jira/browse/HIVE-26734
> Project: Hive
> Issue Type: Improvement
> Reporter: Ayush Saxena
> Assignee: Ayush Saxena
> Priority: Major
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Add an option that makes the actual row data in the Iceberg PositionalDelete file optional, to avoid reading and writing large amounts of row data during query execution.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)