Posted to issues@ignite.apache.org by "Ivan Veselovsky (JIRA)" <ji...@apache.org> on 2015/10/15 18:11:05 UTC

[jira] [Created] (IGNITE-1697) IGFS: implement reliable Igfs failover logic

Ivan Veselovsky created IGNITE-1697:
---------------------------------------

             Summary: IGFS: implement reliable Igfs failover logic 
                 Key: IGNITE-1697
                 URL: https://issues.apache.org/jira/browse/IGNITE-1697
             Project: Ignite
          Issue Type: Bug
            Reporter: Ivan Veselovsky
            Assignee: Ivan Veselovsky
             Fix For: 1.5


Problems to solve:
1) Currently, a write lock on a file may stay taken forever if a node takes the lock and then crashes.
2) Currently, blocks of file content are not written as plain dataCache.put() operations but are sent using ad-hoc async messages. This was done earlier to improve performance, but in order to implement reliable failover we need to get rid of that mechanism and use simple put() or asyncPut() cache operations.

Solution plan:
1) Use async put to write file data blocks.
2) Perform writes using the scheme "lock" -> "reserve space" -> "write" -> "commit" -> "release lock".
3) The id of the node that locked a file should be readable from the lock id.
4) Upon taking a file lock, the following procedure should be performed:
If the file is already locked, take the id of the node that locked it, then ask DiscoveryProcessor whether that node is alive. If it is not (the node has left the topology), perform the cleanup procedure: delete all data blocks in the reserved data range, then delete the lock.
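
The plan above could be modelled roughly as follows. This is only an in-memory sketch to illustrate the intended control flow, not Ignite code: the class IgfsFailoverSketch, its maps, the "nodeId:seq" lock id format and the "alive" set (a stand-in for asking DiscoveryProcessor) are all hypothetical names chosen for the example. In a real implementation the check-and-cleanup step would presumably have to run atomically against the metadata cache, which this sketch does not attempt.

```java
import java.util.Map;
import java.util.Set;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical in-memory model of the plan; none of these names are Ignite APIs. */
class IgfsFailoverSketch {
    final Map<String, String> locks = new ConcurrentHashMap<>();    // path -> lock id
    final Map<String, long[]> reserved = new ConcurrentHashMap<>(); // path -> {first, last} block index
    final Map<String, byte[]> blocks = new ConcurrentHashMap<>();   // "path#idx" -> block data
    final Set<UUID> alive = ConcurrentHashMap.newKeySet();          // stand-in for discovery
    private final AtomicLong seq = new AtomicLong();

    /** Plan item 3: the owner node id is readable from the lock id ("nodeId:seq"). */
    static UUID ownerOf(String lockId) {
        return UUID.fromString(lockId.substring(0, lockId.indexOf(':')));
    }

    /** Plan item 4: returns a new lock id, or null if a live node holds the lock. */
    String tryLock(String path, UUID nodeId) {
        String cur = locks.get(path);
        if (cur != null) {
            if (alive.contains(ownerOf(cur)))
                return null;            // owner is alive: the lock is legitimately held
            cleanup(path, cur);         // owner left the topology: roll back its work
        }
        String id = nodeId + ":" + seq.incrementAndGet();
        return locks.putIfAbsent(path, id) == null ? id : null;
    }

    /** "reserve space": remember which block range the writer intends to fill. */
    void reserve(String path, String lockId, long first, long last) {
        checkOwner(path, lockId);
        reserved.put(path, new long[] {first, last});
    }

    /** "write": a plain put of one block, allowed only inside the reserved range. */
    void write(String path, String lockId, long idx, byte[] data) {
        checkOwner(path, lockId);
        long[] r = reserved.get(path);
        if (r == null || idx < r[0] || idx > r[1])
            throw new IllegalStateException("Block " + idx + " is outside the reserved range");
        blocks.put(path + "#" + idx, data);
    }

    /** "commit": the written blocks become part of the file; the reservation is dropped. */
    void commit(String path, String lockId) {
        checkOwner(path, lockId);
        reserved.remove(path);
    }

    /** "release lock": removes the lock only if it is still the caller's lock. */
    void unlock(String path, String lockId) {
        locks.remove(path, lockId);
    }

    /** Cleanup: delete all blocks of the reserved range, then delete the stale lock. */
    private void cleanup(String path, String staleLockId) {
        long[] r = reserved.remove(path);
        if (r != null)
            for (long i = r[0]; i <= r[1]; i++)
                blocks.remove(path + "#" + i);
        locks.remove(path, staleLockId);
    }

    private void checkOwner(String path, String lockId) {
        if (!lockId.equals(locks.get(path)))
            throw new IllegalStateException("Lock not held: " + lockId);
    }
}
```

In this model, a writer that crashes between "reserve space" and "commit" leaves a reservation behind, so the next node to request the lock can tell exactly which blocks to delete. A writer that crashes after "commit" leaves only the lock, which is simply removed.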




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)