Posted to common-issues@hadoop.apache.org by "Steve Loughran (JIRA)" <ji...@apache.org> on 2014/06/29 13:57:24 UTC

[jira] [Commented] (HADOOP-5353) add progress callback feature to the slow FileUtil operations with ability to cancel the work

    [ https://issues.apache.org/jira/browse/HADOOP-5353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14047106#comment-14047106 ] 

Steve Loughran commented on HADOOP-5353:
----------------------------------------

Well, that was a long time ago, wasn't it?

From a quick look at the patch, the core design looks good:
* {{Progress}} should be {{IOProgress}} in case we add more types in future
* {{StopProgress}} should be {{StopProgressException}}, and take a text string

I see that in this design a remote listener is expected to poll for status. I'd envisaged some kind of callback instead, but a progress counter would actually be more loosely coupled. Polling can be inefficient, so we should recommend that tools sleep between polls rather than poll continuously.

One of the fun things would actually be implementing a cancel operation: most bulk single-threaded IO operations don't support one, and the HTTP-based object stores do all their work in close(), so stopping the write isn't enough...
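
As a rough illustration of the polled-counter idea with a cancel flag, here is a minimal sketch. It is not taken from the patch: {{PolledCopySketch}}, the methods on {{IOProgress}}, the {{copy}} helper and the 100 ms poll interval are all assumptions made for the example, and only the {{IOProgress}} and {{StopProgressException}} names follow the renames suggested above. It copies between in-memory streams, so it says nothing about the close()-time work of the object stores.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch; the API shape here is assumed, not the patch's. */
class PolledCopySketch {

  /** Raised by the copy loop when a cancel request is seen; takes a text string. */
  static class StopProgressException extends IOException {
    StopProgressException(String message) {
      super(message);
    }
  }

  /** Shared state: the worker updates it, an observer polls it. */
  static class IOProgress {
    private final AtomicLong bytesWritten = new AtomicLong();
    private volatile boolean cancelRequested;

    long getBytesWritten() { return bytesWritten.get(); }
    void addBytesWritten(long n) { bytesWritten.addAndGet(n); }
    void requestCancel() { cancelRequested = true; }
    boolean isCancelRequested() { return cancelRequested; }
  }

  /** Copies in fixed-size chunks, checking for cancellation before each read. */
  static void copy(InputStream in, OutputStream out, IOProgress progress, int chunkSize)
      throws IOException {
    byte[] buffer = new byte[chunkSize];
    while (true) {
      if (progress.isCancelRequested()) {
        throw new StopProgressException(
            "copy cancelled after " + progress.getBytesWritten() + " bytes");
      }
      int n = in.read(buffer);
      if (n < 0) {
        break; // end of input
      }
      out.write(buffer, 0, n);
      progress.addBytesWritten(n);
    }
  }

  public static void main(String[] args) throws Exception {
    IOProgress progress = new IOProgress();
    byte[] data = new byte[1 << 20];
    Thread worker = new Thread(() -> {
      try {
        copy(new ByteArrayInputStream(data), new ByteArrayOutputStream(), progress, 4096);
      } catch (IOException e) {
        System.err.println("stopped: " + e.getMessage());
      }
    });
    worker.start();
    // sleep+poll: read the counter, then sleep, instead of polling in a tight loop
    while (worker.isAlive()) {
      System.out.println("bytes written so far: " + progress.getBytesWritten());
      Thread.sleep(100);
    }
    System.out.println("final count: " + progress.getBytesWritten());
  }
}
```

A GUI would call {{requestCancel()}} from its cancel button; the worker only notices at the next chunk boundary, which is one reason a real cancel is harder than it looks.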

> add progress callback feature to the slow FileUtil operations with ability to cancel the work
> ---------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-5353
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5353
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs
>    Affects Versions: 0.21.0
>            Reporter: Steve Loughran
>            Assignee: Lei (Eddy) Xu
>            Priority: Minor
>         Attachments: HADOOP-5353.000.patch
>
>
> This is only of relevance to people writing front ends to FS operations, and as they could take the code in FSUtil and add something with this feature, it's a blocker to none of them.
> Currently, FileUtil.copy can take a long time to move large files around, but there is no progress indicator for GUIs, nor a way to cancel the operation mid-way, short of interrupting the thread or closing the filesystem.
> I propose a FileIOProgress interface to the copy ops, one that has a single method to notify listeners of bytes read and written, and the number of files handled.
> {code}
> interface FileIOProgress {
>   // Return true to continue the operation, false to stop the copy.
>   boolean progress(int files, long bytesRead, long bytesWritten);
> }
> {code}
> The return value would be true to continue the operation, or false to stop the copy and leave the FS in whatever incomplete state it is in currently. 
> It could even be fancier: have beginFileOperation and endFileOperation callbacks to pass in the name of the current file being worked on, though I don't have a personal need for that.
> GUIs could show progress bars and cancel buttons, other tools could use the interface to pass any cancellation notice upstream.
> The FileUtil.copy operations would call this interface (blocking) after every block copy, so the frequency of invocation would depend on block size and network/disk speeds. That is also why I don't propose any percentage-done indicators: it's too hard to predict the percentage of time done for distributed file IO with any degree of accuracy.
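
For comparison, the callback proposed in the issue description could drive a copy loop roughly as sketched below. Only the {{FileIOProgress}} interface and its method signature come from the description; {{CallbackCopySketch}}, {{copyWithProgress}}, the block size parameter and the choice to report {{files}} as 1 for a single-file copy are assumptions for illustration, not existing FileUtil code.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/** Hypothetical sketch of a block-wise copy that reports to a FileIOProgress listener. */
class CallbackCopySketch {

  /** The interface as proposed in the issue description. */
  interface FileIOProgress {
    /** Returns true to continue the operation, false to stop the copy. */
    boolean progress(int files, long bytesRead, long bytesWritten);
  }

  /**
   * Copies one stream block by block, calling the listener (blocking) after each block.
   * Returns true if the copy completed, false if the listener asked to stop,
   * in which case the output is left in whatever incomplete state it reached.
   */
  static boolean copyWithProgress(InputStream in, OutputStream out,
      int blockSize, FileIOProgress listener) throws IOException {
    byte[] block = new byte[blockSize];
    long bytesRead = 0;
    long bytesWritten = 0;
    int n;
    while ((n = in.read(block)) >= 0) {
      bytesRead += n;
      out.write(block, 0, n);
      bytesWritten += n;
      // Assumption: "files" counts the single file being copied here.
      if (!listener.progress(1, bytesRead, bytesWritten)) {
        return false;
      }
    }
    return true;
  }
}
```

Because the listener is invoked on the copying thread, a slow listener slows the copy; that is the coupling a polled counter avoids.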



--
This message was sent by Atlassian JIRA
(v6.2#6252)