You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@zeppelin.apache.org by sanjaydasgupta <gi...@git.apache.org> on 2018/05/22 17:22:17 UTC

[GitHub] zeppelin pull request #2978: [ZEPPELIN-3467] two-step, atomic configuration ...

GitHub user sanjaydasgupta opened a pull request:

    https://github.com/apache/zeppelin/pull/2978

    [ZEPPELIN-3467] two-step, atomic configuration file writes

    ### What is this PR for?
    This PR prevents the total loss of configuration information that happens when zeppelin attempts to write out modified configuration information into an already full file-system. The approach taken here is to first write the modified information to a temporary file (in the same directory), and then to rename the successfully written file to its eventual name. 
    
    The rename operation is not attempted if there is an error while writing the temporary file, thus leaving the pre-existing configuration file untouched.
    
    ### What type of PR is it?
    [Improvement]
    
    ### Todos
    
    ### What is the Jira issue?
    https://issues.apache.org/jira/browse/ZEPPELIN-3467
    
    ### How should this be tested?
    CI pass
    
    ### Screenshots (if appropriate)
    
    ### Questions:
    * Does the licenses files need update? No
    * Is there breaking changes for older versions? No
    * Does this needs documentation? No


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/sanjaydasgupta/zeppelin zeppelin-3467-atomic-config-file-writes

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/zeppelin/pull/2978.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2978
    
----
commit 8d7bde9cb354e6c6dea34e26376b1fd64cd2594c
Author: Sanjay Dasgupta <sa...@...>
Date:   2018-05-22T15:17:50Z

    zeppelin-3467-atomic-config-file-writes initial commit

----


---

[GitHub] zeppelin issue #2978: [ZEPPELIN-3467] two-step, atomic configuration file wr...

Posted by sanjaydasgupta <gi...@git.apache.org>.
Github user sanjaydasgupta commented on the issue:

    https://github.com/apache/zeppelin/pull/2978
  
    @felixcheung have made the suggested changes. But am not sure how to add tests for this. Any hints will be useful.
    
    @zjffdu, @Tagar do you recommend similar treatment for FileSystemConfigStorage also?


---

[GitHub] zeppelin issue #2978: [ZEPPELIN-3467] two-step, atomic configuration file wr...

Posted by Tagar <gi...@git.apache.org>.
Github user Tagar commented on the issue:

    https://github.com/apache/zeppelin/pull/2978
  
    @sanjaydasgupta 
    
    > do you recommend similar treatment for FileSystemConfigStorage also?
    
    yep, HDFS although is not a posix filesystem, but renaming a file is still implemented as an atomic operation. 
    
    might want to consider moving atomicWriteToFile() to ConfigStorage so both LocalConfigStorage  and FileSystemConfigStorage can use it.


---

[GitHub] zeppelin issue #2978: [ZEPPELIN-3467] two-step, atomic configuration file wr...

Posted by zjffdu <gi...@git.apache.org>.
Github user zjffdu commented on the issue:

    https://github.com/apache/zeppelin/pull/2978
  
    LocalConfigStorage is the old implantation of config storage just for backward compatibility. FileSystemConfigStorage is for hadoop compatible filesystem which is new in 0.8 


---

[GitHub] zeppelin pull request #2978: [ZEPPELIN-3467] two-step, atomic configuration ...

Posted by sanjaydasgupta <gi...@git.apache.org>.
Github user sanjaydasgupta closed the pull request at:

    https://github.com/apache/zeppelin/pull/2978


---

[GitHub] zeppelin issue #2978: [ZEPPELIN-3467] two-step, atomic configuration file wr...

Posted by sanjaydasgupta <gi...@git.apache.org>.
Github user sanjaydasgupta commented on the issue:

    https://github.com/apache/zeppelin/pull/2978
  
    Added code to delete the temp file if the write fails. Re-throwing the exception serves two purposes:
    
    1) Ensure that the caller sees the same exception when the write fails, and
    2) Prevent calling the subsequent rename operation (which would fail)
    
    Thanks for pointing out this case @felixcheung 


---

[GitHub] zeppelin issue #2978: [ZEPPELIN-3467] two-step, atomic configuration file wr...

Posted by sanjaydasgupta <gi...@git.apache.org>.
Github user sanjaydasgupta commented on the issue:

    https://github.com/apache/zeppelin/pull/2978
  
    Done. Also used more formal way of creating the temporary file. 
    
    I've tested the integrated code on Ubuntu 16.04 - success case only :-) but am not sure how things will work on Windows. Much of the confusion over atomicity seems based on the different platform-specific behaviors (UNIX vs Windows) of the _rename_ function in the underlying C Standard library.
    
    Thanks for the insight @Tagar 


---

[GitHub] zeppelin issue #2978: [ZEPPELIN-3467] two-step, atomic configuration file wr...

Posted by Tagar <gi...@git.apache.org>.
Github user Tagar commented on the issue:

    https://github.com/apache/zeppelin/pull/2978
  
    Thanks @sanjaydasgupta. I will give this a try today.


---

[GitHub] zeppelin issue #2978: [ZEPPELIN-3467] two-step, atomic configuration file wr...

Posted by sanjaydasgupta <gi...@git.apache.org>.
Github user sanjaydasgupta commented on the issue:

    https://github.com/apache/zeppelin/pull/2978
  
    Hi @zjffdu 
    
    While reading the code of `FileSystemConfigStorage.java`, I noticed that the method `save(...)` delegates its work to `FileSystemStorage.writeFile(...)`. Interestingly, this `writeFile(...)` method has an _unused_ boolean parameter named `writeTempFileFirst`. 
    
    My guess is that this unused boolean parameter `writeTempFileFirst` is in fact a _ToDo_ that has been left here with the intention of adding the same kind of protective logic that we are discussing here.
    
    Please let me know if that is the case, and if so whether I can go ahead and implement the required logic in the `FileSystemStorage.writeFile(...)` method.
    
    Thanks for your help.


---

[GitHub] zeppelin issue #2978: [ZEPPELIN-3467] two-step, atomic configuration file wr...

Posted by zjffdu <gi...@git.apache.org>.
Github user zjffdu commented on the issue:

    https://github.com/apache/zeppelin/pull/2978
  
    There's nothing needs to be done in FileSystemStorage which has already use tmp file 


---

[GitHub] zeppelin issue #2978: [ZEPPELIN-3467] two-step, atomic configuration file wr...

Posted by sanjaydasgupta <gi...@git.apache.org>.
Github user sanjaydasgupta commented on the issue:

    https://github.com/apache/zeppelin/pull/2978
  
    Hi @felixcheung, @Tagar 
    
    Any suggestions on how to push this forward from here? 
    
    Thanks


---

[GitHub] zeppelin issue #2978: [ZEPPELIN-3467] two-step, atomic configuration file wr...

Posted by Tagar <gi...@git.apache.org>.
Github user Tagar commented on the issue:

    https://github.com/apache/zeppelin/pull/2978
  
    @sanjaydasgupta, `writeFile` in `FileSystemStorage` drops file first, and then renames temp file. 
    It's not an atomic write, and also leaves a chance to loosing file that is being written. 
    I'd say approach in this PR is more formalized and a better way to go.
    
    https://github.com/apache/zeppelin/blob/master/zeppelin-zengine/src/main/java/org/apache/zeppelin/notebook/FileSystemStorage.java#L155


---

[GitHub] zeppelin issue #2978: [ZEPPELIN-3467] two-step, atomic configuration file wr...

Posted by sanjaydasgupta <gi...@git.apache.org>.
Github user sanjaydasgupta commented on the issue:

    https://github.com/apache/zeppelin/pull/2978
  
    There is a problem in my repo that needs correction.
    
    I am closing this PR now, and will reopen it later if possible, or return with a new PR.


---

[GitHub] zeppelin issue #2978: [ZEPPELIN-3467] two-step, atomic configuration file wr...

Posted by Tagar <gi...@git.apache.org>.
Github user Tagar commented on the issue:

    https://github.com/apache/zeppelin/pull/2978
  
    What's difference between LocalConfigStorage and FileSystemConfigStorage?
    I don't know Zeppelin internals that well.
    Wonder if this has to be fixed FileSystemConfigStorage.java as well?
    
    https://github.com/sanjaydasgupta/zeppelin/blob/36698a9a6a45610b409fa456f94255b32b746db8/zeppelin-zengine/src/main/java/org/apache/zeppelin/storage/FileSystemConfigStorage.java#L66 



---

[GitHub] zeppelin issue #2978: [ZEPPELIN-3467] two-step, atomic configuration file wr...

Posted by sanjaydasgupta <gi...@git.apache.org>.
Github user sanjaydasgupta commented on the issue:

    https://github.com/apache/zeppelin/pull/2978
  
    Hi @zjffdu, can you please confirm if anything more remains to be done in [FileSystemStorage.writeFile](https://github.com/apache/zeppelin/blob/master/zeppelin-zengine/src/main/java/org/apache/zeppelin/notebook/FileSystemStorage.java#L146)?
    
    Thanks for your insight.


---