Posted to common-issues@hadoop.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2021/08/04 15:36:00 UTC

[jira] [Work logged] (HADOOP-11452) Make FileSystem.rename(path, path, options) public, specified, tested

     [ https://issues.apache.org/jira/browse/HADOOP-11452?focusedWorklogId=633657&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-633657 ]

ASF GitHub Bot logged work on HADOOP-11452:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 04/Aug/21 15:35
            Start Date: 04/Aug/21 15:35
    Worklog Time Spent: 10m 
      Work Description: steveloughran commented on pull request #2735:
URL: https://github.com/apache/hadoop/pull/2735#issuecomment-892760289


   I have stopped working on this. Feel free to take it up.
   
   I originally thought "hey, we could just make this public and there'd be a good rename", but as usual the challenge becomes one of strictly implementing the preconditions. FileContext does that, though non-atomically; factoring out all that policy at least makes things consistent.
   
   
   But, having dealt with other rename-related trouble recently, I'm thinking that what I'd really want is a new builder-based rename:
   
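   ```
   // proposed API: build the rename, with atomicity as a mandatory option
   Future<RenameOutcome> foutcome = FS.renamePath(source, dest)
      .must("fs.opt.rename.atomic", true)
      .build();
   
   // block until the rename completes
   RenameOutcome outcome = foutcome.get();
   
   
   class RenameOutcome implements IOStatisticsSource {
   
   }
   ```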
   
   Why this?
   * Allows stores which count the IO costs of renames to report them
   * You need some kind of return type for Java futures
   * Builder options would let you say whether you MUST have atomic rename, in which case
     there is no s3a, wasb or gcs rename for you.
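A minimal sketch of how the mandatory-option check and the outcome type described above might fit together. Everything here is hypothetical: `RenameSketch`, `ToyStore`, `RenameBuilder`, the statistic name `store_rename_requests` and the in-memory map are stand-ins for illustration, not Hadoop classes, and the plain `Map` of counters only approximates what an `IOStatisticsSource` would report. The one assumption worth calling out is that a store which cannot rename atomically rejects the request in `build()`, before any work starts.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CompletableFuture;

/** Hypothetical sketch of a builder-based rename; none of these types exist in Hadoop. */
final class RenameSketch {

  /** Stand-in for a result type which could also expose IO statistics. */
  static final class RenameOutcome {
    final String source;
    final String dest;
    final Map<String, Long> statistics; // e.g. number of store requests issued

    RenameOutcome(String source, String dest, Map<String, Long> statistics) {
      this.source = source;
      this.dest = dest;
      this.statistics = statistics;
    }
  }

  /** Toy in-memory store which declares whether its rename is atomic. */
  static final class ToyStore {
    final Map<String, String> files = new HashMap<>();
    final boolean atomicRename;

    ToyStore(boolean atomicRename) {
      this.atomicRename = atomicRename;
    }

    RenameBuilder renamePath(String source, String dest) {
      return new RenameBuilder(this, source, dest);
    }
  }

  static final class RenameBuilder {
    private static final String ATOMIC = "fs.opt.rename.atomic";
    private final ToyStore store;
    private final String source;
    private final String dest;
    private final Map<String, Boolean> mandatory = new HashMap<>();

    RenameBuilder(ToyStore store, String source, String dest) {
      this.store = store;
      this.source = source;
      this.dest = dest;
    }

    /** Declare an option the store MUST honour; otherwise build() fails. */
    RenameBuilder must(String key, boolean value) {
      mandatory.put(key, value);
      return this;
    }

    /** Validate the mandatory options, then run the rename asynchronously. */
    CompletableFuture<RenameOutcome> build() {
      for (String key : mandatory.keySet()) {
        if (!ATOMIC.equals(key)) {
          throw new IllegalArgumentException("unknown mandatory option " + key);
        }
      }
      if (mandatory.getOrDefault(ATOMIC, false) && !store.atomicRename) {
        throw new UnsupportedOperationException("store cannot rename atomically");
      }
      return CompletableFuture.supplyAsync(() -> {
        String data = store.files.remove(source);
        if (data == null) {
          throw new IllegalStateException("source not found: " + source);
        }
        store.files.put(dest, data);
        Map<String, Long> stats = new HashMap<>();
        stats.put("store_rename_requests", 1L);
        return new RenameOutcome(source, dest, stats);
      });
    }
  }
}
```

With this shape, a caller that needs atomicity fails fast against a `ToyStore(false)`, while a caller that omits the `must(...)` call accepts whatever rename the store offers.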
   
   
   Why async?
   * So slow stores can be obviously slow about it
   * Lets you pass in a progressable. DistCp could do this to stop tasks failing during the rename of big files on non-direct uploads to S3; the same goes for FileOutputCommitter
   
   ```
   Future<RenameOutcome> foutcome = FS.renamePath(source, dest)
      .withProgress(reporter)
      .build();
   ```
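To make the progress idea concrete, here is a small sketch under stated assumptions. `ProgressRenameSketch`, `SlowRenameBuilder` and the local `Progressable` interface are hypothetical stand-ins (the interface only mimics the shape of a progress callback), and the "rename" is modelled as a block-by-block copy of the kind an object store might perform. The point is only that the callback fires once per block, so a caller such as a task can keep signalling liveness while the future is pending.

```java
import java.util.concurrent.CompletableFuture;

/** Hypothetical sketch of an async rename which reports progress; not a Hadoop API. */
final class ProgressRenameSketch {

  /** Local stand-in for a progress callback. */
  interface Progressable {
    void progress();
  }

  /** Builder for a rename modelled as a slow, block-by-block copy. */
  static final class SlowRenameBuilder {
    private final int blocks;
    private Progressable reporter = () -> { }; // default: report nothing

    SlowRenameBuilder(int blocks) {
      this.blocks = blocks;
    }

    SlowRenameBuilder withProgress(Progressable reporter) {
      this.reporter = reporter;
      return this;
    }

    /** Start the copy in the background; the future yields the number of blocks copied. */
    CompletableFuture<Integer> build() {
      final Progressable callback = reporter;
      return CompletableFuture.supplyAsync(() -> {
        int copied = 0;
        for (int i = 0; i < blocks; i++) {
          copied++;            // a real store would copy one block here
          callback.progress(); // tell the caller the rename is still alive
        }
        return copied;
      });
    }
  }
}
```

A caller could pass something like a counter's `incrementAndGet` as the reporter and then wait on the returned future; the callback runs on the background thread, so it should be cheap and thread-safe.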
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 633657)
    Time Spent: 4h 10m  (was: 4h)

> Make FileSystem.rename(path, path, options) public, specified, tested
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-11452
>                 URL: https://issues.apache.org/jira/browse/HADOOP-11452
>             Project: Hadoop Common
>          Issue Type: Task
>          Components: fs
>    Affects Versions: 2.7.3
>            Reporter: Yi Liu
>            Assignee: Steve Loughran
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: HADOOP-11452-001.patch, HADOOP-11452-002.patch, HADOOP-14452-004.patch, HADOOP-14452-branch-2-003.patch
>
>          Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> Currently in {{FileSystem}}, {{rename}} with _Rename options_ is protected and carries a _deprecated_ annotation, and the default implementation is not atomic.
> So this method cannot be used from outside. On the other hand, HDFS has a good, atomic implementation. (Interestingly, in {{DFSClient}} the _deprecated_ annotations for these two methods are the opposite way round.)
> It makes sense to make {{rename}} with _Rename options_ public, since it is atomic for rename+overwrite and also saves RPC calls when the user wants rename+overwrite.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
