Posted to issues@hive.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2022/05/13 10:52:00 UTC

[jira] [Work logged] (HIVE-25976) Cleaner may remove files being accessed from a fetch-task-converted reader

     [ https://issues.apache.org/jira/browse/HIVE-25976?focusedWorklogId=770124&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-770124 ]

ASF GitHub Bot logged work on HIVE-25976:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 13/May/22 10:51
            Start Date: 13/May/22 10:51
    Worklog Time Spent: 10m 
      Work Description: veghlaci05 opened a new pull request, #3289:
URL: https://github.com/apache/hive/pull/3289

   
   ### What changes were proposed in this pull request?
   This PR changes when fetch tasks are committed. From now on, these tasks are committed only when the driver is closed.
   
   ### Why are the changes needed?
   Fetch tasks were committed inside the org.apache.hadoop.hive.ql.Driver#run(java.lang.String) call, which was too early. The actual reading happens only after this point, which can cause issues if the table changes during the read.
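
To illustrate the change in commit timing, here is a minimal sketch. It does not use Hive's classes: `SketchDriver`, `run()`, `fetch()` and the event names are all hypothetical stand-ins, and the only behavior modeled is that a fetch-task query is committed when the driver is closed rather than when `run()` returns.

```java
import java.util.ArrayList;
import java.util.List;

class DeferredCommitSketch {
    /** Hypothetical stand-in for a query driver; this is NOT Hive's Driver class. */
    static class SketchDriver implements AutoCloseable {
        final List<String> events = new ArrayList<>();
        private final boolean fetchTask;
        private boolean committed = false;

        SketchDriver(boolean fetchTask) {
            this.fetchTask = fetchTask;
        }

        void run() {
            events.add("run");
            // Assumption: only non-fetch queries are committed at the end of run().
            if (!fetchTask) {
                commit();
            }
        }

        void fetch() {
            // Record whether the read happened before or after the commit.
            events.add(committed ? "fetch-after-commit" : "fetch-in-txn");
        }

        private void commit() {
            if (!committed) {
                committed = true;
                events.add("commit");
            }
        }

        @Override
        public void close() {
            // Fetch-task queries are committed here, after all reads are done.
            commit();
            events.add("close");
        }
    }

    /** Runs a fetch-task style query and returns the recorded event order. */
    static List<String> runFetchQuery() {
        SketchDriver driver = new SketchDriver(true);
        try {
            driver.run();
            driver.fetch();
            driver.fetch();
        } finally {
            driver.close();
        }
        return driver.events;
    }

    public static void main(String[] args) {
        System.out.println(runFetchQuery());
    }
}
```

In this sketch both reads are recorded as `fetch-in-txn`, and `commit` appears only right before `close`.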
   
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   ### How was this patch tested?
   Manually, and through unit tests




Issue Time Tracking
-------------------

            Worklog Id:     (was: 770124)
    Remaining Estimate: 0h
            Time Spent: 10m

> Cleaner may remove files being accessed from a fetch-task-converted reader
> --------------------------------------------------------------------------
>
>                 Key: HIVE-25976
>                 URL: https://issues.apache.org/jira/browse/HIVE-25976
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Zoltan Haindrich
>            Assignee: László Végh
>            Priority: Major
>         Attachments: fetch_task_conv_compactor_test.patch
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> In a nutshell, the following happens:
> * The query is compiled in fetch-task-converted mode.
> * No real execution happens, but the locks are released.
> * HS2 communicates with the client and uses the fetch task to get the rows, which in this case reads files directly from the table's directory.
> * The client sleeps between reads, so there is ample time for other events.
> * The cleaner wakes up and removes some files.
> * On the next read, the fetch task encounters a read error.
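
The sequence above can be reproduced in miniature with a deterministic, single-threaded simulation. This is only a sketch under stated assumptions: the temporary directory stands in for the table's directory, a boolean stands in for the query's lock, and `FetchRaceSketch`, `secondReadFails` and the file names are hypothetical. Nothing here is Hive's actual cleaner or lock manager.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.List;

class FetchRaceSketch {
    /**
     * Simulates two reads with a cleaner run in between.
     * Returns true if the second read fails because its file was removed.
     */
    static boolean secondReadFails(boolean holdLockUntilClose) {
        try {
            // Hypothetical "table directory" with two data files.
            Path dir = Files.createTempDirectory("table_dir");
            Path first = Files.write(dir.resolve("file_1"), List.of("row1"));
            Path second = Files.write(dir.resolve("file_2"), List.of("row2"));
            try {
                // The query has "run"; the lock survives only if it is held until close.
                boolean lockHeld = holdLockUntilClose;

                // First fetch: the client gets some rows, then sleeps.
                Files.readAllLines(first);

                // The cleaner wakes up; it removes files only when no lock is held.
                if (!lockHeld) {
                    Files.deleteIfExists(second);
                }

                // Next fetch: the reader goes straight to the file.
                try {
                    Files.readAllLines(second);
                    return false;
                } catch (NoSuchFileException e) {
                    return true;
                }
            } finally {
                // Clean up the temporary files and directory.
                Files.deleteIfExists(first);
                Files.deleteIfExists(second);
                Files.deleteIfExists(dir);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("locks released early, read fails: " + secondReadFails(false));
        System.out.println("locks held until close, read fails: " + secondReadFails(true));
    }
}
```

With the lock released early, the second read is expected to fail; with the lock held until the reader is done, the cleaner step leaves the file in place and the read succeeds.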



--
This message was sent by Atlassian Jira
(v8.20.7#820007)