Posted to issues@hive.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2022/09/08 09:28:00 UTC

[jira] [Work logged] (HIVE-25790) Make managed table copies handle updates (FileUtils)

     [ https://issues.apache.org/jira/browse/HIVE-25790?focusedWorklogId=806943&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-806943 ]

ASF GitHub Bot logged work on HIVE-25790:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 08/Sep/22 09:27
            Start Date: 08/Sep/22 09:27
    Worklog Time Spent: 10m 
      Work Description: pudidic opened a new pull request, #3582:
URL: https://github.com/apache/hive/pull/3582

   ### What changes were proposed in this pull request?
   Changed FileUtils.copy() to skip files that are already identical in the destination directory, which improves copy performance. Previously, FileUtils.copy() simply removed and recreated the destination directory. With this change, it compares each file and directory and deletes only those that differ.
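
The compare-then-copy idea can be sketched roughly as follows. This is a minimal illustration using only java.nio.file on a local filesystem, not the code from the pull request: the class IncrementalCopySketch and its methods are hypothetical names, and the description does not say how two files are judged identical, so the sketch assumes a size check followed by a byte-for-byte comparison.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical sketch of an incremental copy; not Hive's FileUtils. */
final class IncrementalCopySketch {

  /** Makes dst mirror src and returns how many files were actually copied. */
  static int sync(Path src, Path dst) throws IOException {
    if (Files.isDirectory(src)) {
      if (Files.exists(dst) && !Files.isDirectory(dst)) {
        Files.delete(dst); // a file sits where a directory is expected
      }
      Files.createDirectories(dst);
      // Delete destination entries that have no counterpart in the source.
      for (Path entry : children(dst)) {
        if (!Files.exists(src.resolve(entry.getFileName()))) {
          deleteRecursively(entry);
        }
      }
      int copied = 0;
      for (Path child : children(src)) {
        copied += sync(child, dst.resolve(child.getFileName()));
      }
      return copied;
    }
    // src is a regular file from here on.
    if (Files.isDirectory(dst)) {
      deleteRecursively(dst); // a directory sits where a file is expected
    }
    if (Files.isRegularFile(dst) && sameContent(src, dst)) {
      return 0; // already copied, skip it
    }
    Files.copy(src, dst, StandardCopyOption.REPLACE_EXISTING);
    return 1;
  }

  /** Assumed identity check: equal size, then equal bytes (reads both files into memory). */
  static boolean sameContent(Path a, Path b) throws IOException {
    return Files.size(a) == Files.size(b)
        && Arrays.equals(Files.readAllBytes(a), Files.readAllBytes(b));
  }

  /** Lists the direct children of a directory, closing the stream before returning. */
  static List<Path> children(Path dir) throws IOException {
    try (Stream<Path> entries = Files.list(dir)) {
      return entries.collect(Collectors.toList());
    }
  }

  /** Deletes a file or directory tree, children before parents. */
  static void deleteRecursively(Path root) throws IOException {
    List<Path> paths;
    try (Stream<Path> walk = Files.walk(root)) {
      paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
    }
    for (Path p : paths) {
      Files.delete(p);
    }
  }
}
```

Calling IncrementalCopySketch.sync(src, dst) a second time after a successful run copies nothing, which is the behavior a retry would rely on. A real implementation against a distributed filesystem would more likely compare cheaper metadata such as length or checksums than read whole files.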
   
   ### Why are the changes needed?
   In an optimized replication bootstrap scenario, Hive copies many files, potentially thousands, from source to destination. If the copy fails partway through, it is retried. At that point some files have already been copied, but the current implementation removes them and copies everything again. It should skip the files that were already copied.
   
   ### Does this PR introduce _any_ user-facing change?
   No.
   
   ### How was this patch tested?
   This patch adds a few JUnit test scenarios to TestFileUtils and TestCopyUtils. It will also be run against the automated regression test suites on the test server.




Issue Time Tracking
-------------------

            Worklog Id:     (was: 806943)
    Remaining Estimate: 0h
            Time Spent: 10m

> Make managed table copies handle updates (FileUtils)
> ----------------------------------------------------
>
>                 Key: HIVE-25790
>                 URL: https://issues.apache.org/jira/browse/HIVE-25790
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Haymant Mangla
>            Assignee: Teddy Choi
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)