Posted to issues@ignite.apache.org by "Ivan Veselovsky (JIRA)" <ji...@apache.org> on 2015/10/29 19:11:27 UTC

[jira] [Updated] (IGNITE-1697) IGFS: implement reliable Igfs failover logic

     [ https://issues.apache.org/jira/browse/IGNITE-1697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ivan Veselovsky updated IGNITE-1697:
------------------------------------
    Assignee: Vladimir Ozerov  (was: Ivan Veselovsky)

> IGFS: implement reliable Igfs failover logic 
> ---------------------------------------------
>
>                 Key: IGNITE-1697
>                 URL: https://issues.apache.org/jira/browse/IGNITE-1697
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Ivan Veselovsky
>            Assignee: Vladimir Ozerov
>             Fix For: 1.5
>
>
> Problems to solve:
> 1) Currently a write lock on a file may stay held forever if a node takes the lock and then crashes.
> 2) Currently the blocks of file content are not written with plain dataCache.put() operations, but are sent using ad-hoc async messages. This was done earlier to improve performance. However, to implement reliable failover we need to drop that mechanism and use simple put() or asyncPut() cache operations.
> Solution plan:
> 1) Use async put to write file data blocks.
> 2) Write using the scheme "lock" -> "reserve space" -> "write" -> "commit" -> "release lock".
> 3) The id of the node that locked a file should be readable from the lock id.
> 4) Upon taking a file lock, the following procedure should be performed:
> if the file is already locked, read the id of the node that holds the lock, then ask DiscoveryProcessor whether that node is alive. If it is not (the node has left the topology), perform the cleanup procedure: delete all data blocks in the reserved data range, then delete the lock.
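
The solution plan above can be illustrated with a minimal in-memory sketch. This is not Ignite code and not the actual fix: IgfsLockSketch, LockId, FileInfo, the aliveNodes set (a stand-in for asking DiscoveryProcessor) and the plain maps (stand-ins for the meta and data caches) are all hypothetical names introduced for illustration only. It assumes the reserved range is tracked as a block count following the committed blocks.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical in-memory model of the proposed scheme; none of these types are Ignite classes. */
public class IgfsLockSketch {
    /** Lock id that carries the id of the node that took the lock (plan item 3). */
    static final class LockId {
        final UUID nodeId;
        final UUID token = UUID.randomUUID(); // makes every lock instance unique

        LockId(UUID nodeId) { this.nodeId = nodeId; }
    }

    /** Per-file metadata: current lock plus committed and reserved block counts. */
    static final class FileInfo {
        LockId lock;
        long committed;
        long reserved;
    }

    final Map<String, FileInfo> meta = new HashMap<>();          // stand-in for the meta cache
    final Map<String, byte[]> blocks = new HashMap<>();          // stand-in for the data cache
    final Set<UUID> aliveNodes = ConcurrentHashMap.newKeySet();  // stand-in for discovery

    static String blockKey(String path, long idx) { return path + "#" + idx; }

    /** Plan item 4: take the lock, cleaning up after a dead owner. Returns null if a live node holds it. */
    synchronized LockId tryLock(String path, UUID nodeId) {
        FileInfo info = meta.computeIfAbsent(path, p -> new FileInfo());

        if (info.lock != null) {
            if (aliveNodes.contains(info.lock.nodeId))
                return null; // Owner is alive, so the lock is valid.

            // Owner has left the topology: delete blocks of the reserved range, then the lock.
            for (long i = info.committed; i < info.committed + info.reserved; i++)
                blocks.remove(blockKey(path, i));

            info.reserved = 0;
            info.lock = null;
        }

        info.lock = new LockId(nodeId);
        return info.lock;
    }

    /** "reserve space": extend the uncommitted range by the given number of blocks. */
    synchronized void reserve(String path, LockId lock, long count) {
        owned(path, lock).reserved += count;
    }

    /** "write": a plain put of one block, allowed only inside the reserved range. */
    synchronized void write(String path, LockId lock, long idx, byte[] bytes) {
        FileInfo info = owned(path, lock);

        if (idx < info.committed || idx >= info.committed + info.reserved)
            throw new IllegalArgumentException("Block " + idx + " is outside the reserved range");

        blocks.put(blockKey(path, idx), bytes);
    }

    /** "commit": the reserved range becomes part of the file. */
    synchronized void commit(String path, LockId lock) {
        FileInfo info = owned(path, lock);
        info.committed += info.reserved;
        info.reserved = 0;
    }

    /** "release lock". */
    synchronized void unlock(String path, LockId lock) {
        owned(path, lock).lock = null;
    }

    /** Returns the file info if the caller holds the lock, otherwise throws. */
    private FileInfo owned(String path, LockId lock) {
        FileInfo info = meta.get(path);

        if (lock == null || info == null || info.lock != lock)
            throw new IllegalStateException("Lock is not held by the caller: " + path);

        return info;
    }
}
```

In this sketch a writer calls tryLock, reserve, write, commit and unlock in that order. If the writer's node id disappears from aliveNodes before commit, the next tryLock from another node removes the uncommitted blocks and takes over the lock, which addresses problem 1. Using ordinary map puts in place of ad-hoc messages mirrors the intent of problem 2, since the cleanup can then find and delete every block in the reserved range.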



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)