Posted to issues@hive.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2022/05/13 10:52:00 UTC
[jira] [Work logged] (HIVE-25976) Cleaner may remove files being accessed from a fetch-task-converted reader
[ https://issues.apache.org/jira/browse/HIVE-25976?focusedWorklogId=770124&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-770124 ]
ASF GitHub Bot logged work on HIVE-25976:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 13/May/22 10:51
Start Date: 13/May/22 10:51
Worklog Time Spent: 10m
Work Description: veghlaci05 opened a new pull request, #3289:
URL: https://github.com/apache/hive/pull/3289
### What changes were proposed in this pull request?
This PR changes when Fetch tasks are committed. From now on, these tasks are committed only when the driver is closed.
### Why are the changes needed?
Fetch tasks were committed inside the org.apache.hadoop.hive.ql.Driver#run(java.lang.String) call, which was too early. The actual reading happens only after that call returns, so the read can fail if the table changes while it is still in progress.
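
The timing problem can be illustrated with a small, self-contained toy model. This is only a sketch and not Hive code: `ToyTable`, `ToyDriver`, `runCleaner` and the `commitOnClose` flag are hypothetical names invented for illustration, and the real locking and cleaning logic in Hive is considerably more involved. The model assumes the cleaner skips its work while at least one reader is still registered, and compares committing inside `run()` with committing in `close()`.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class FetchCommitTimingSketch {

    /** Hypothetical stand-in for a table directory plus its registered readers. */
    static class ToyTable {
        final Set<String> files = new HashSet<>(List.of("delta_1", "delta_2"));
        int openReaders = 0;

        /** Toy cleaner: removes files only when no reader is registered. */
        boolean runCleaner() {
            if (openReaders > 0) {
                return false; // a reader is still protected, nothing is removed
            }
            files.clear();
            return true;
        }
    }

    /** Hypothetical stand-in for a driver running a fetch-task-converted query. */
    static class ToyDriver implements AutoCloseable {
        private final ToyTable table;
        private final boolean commitOnClose;
        private boolean committed = false;

        ToyDriver(ToyTable table, boolean commitOnClose) {
            this.table = table;
            this.commitOnClose = commitOnClose;
        }

        /** Compiles the query; no rows are read here. */
        void run() {
            table.openReaders++;
            if (!commitOnClose) {
                commit(); // old behaviour in this model: protection ends before any read
            }
        }

        /** Reads one file directly from the table directory. */
        String fetch(String file) {
            if (!table.files.contains(file)) {
                throw new IllegalStateException("missing file: " + file);
            }
            return "rows from " + file;
        }

        private void commit() {
            if (!committed) {
                committed = true;
                table.openReaders--;
            }
        }

        @Override
        public void close() {
            commit(); // new behaviour in this model: protection ends with the driver
        }
    }

    /** Returns true if both fetches succeed although the cleaner runs in between. */
    static boolean readSurvivesCleaner(boolean commitOnClose) {
        ToyTable table = new ToyTable();
        try (ToyDriver driver = new ToyDriver(table, commitOnClose)) {
            driver.run();
            driver.fetch("delta_1");
            table.runCleaner(); // cleaner wakes up while the client is idle
            driver.fetch("delta_2");
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("commit in run():   " + readSurvivesCleaner(false));
        System.out.println("commit on close(): " + readSurvivesCleaner(true));
    }
}
```

In this toy model the second fetch fails when the commit happens inside `run()`, and succeeds when the commit is deferred to `close()`, because the cleaner sees the reader as still active for the whole fetch phase.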
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Manually and through unit tests.
Issue Time Tracking
-------------------
Worklog Id: (was: 770124)
Remaining Estimate: 0h
Time Spent: 10m
> Cleaner may remove files being accessed from a fetch-task-converted reader
> --------------------------------------------------------------------------
>
> Key: HIVE-25976
> URL: https://issues.apache.org/jira/browse/HIVE-25976
> Project: Hive
> Issue Type: Bug
> Reporter: Zoltan Haindrich
> Assignee: László Végh
> Priority: Major
> Attachments: fetch_task_conv_compactor_test.patch
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> In a nutshell, the following happens:
> * the query is compiled in fetch-task-converted mode
> * no real execution happens, but the locks are released
> * HS2 communicates with the client and uses the fetch task to get the rows, which in this case reads files directly from the table's directory
> * the client sleeps between reads, so there is ample time for other events
> * the cleaner wakes up and removes some files
> * on the next read, the fetch task encounters a read error
--
This message was sent by Atlassian Jira
(v8.20.7#820007)