Posted to common-dev@hadoop.apache.org by "Konstantin Shvachko (JIRA)" <ji...@apache.org> on 2006/11/09 04:21:37 UTC

[jira] Created: (HADOOP-702) DFS Upgrade Proposal

DFS Upgrade Proposal
--------------------

                 Key: HADOOP-702
                 URL: http://issues.apache.org/jira/browse/HADOOP-702
             Project: Hadoop
          Issue Type: New Feature
          Components: dfs
            Reporter: Konstantin Shvachko


Currently the DFS cluster upgrade procedure is manual.
http://wiki.apache.org/lucene-hadoop/Hadoop_Upgrade
It is rather complicated and does not guarantee data recoverability in case of software errors or administrator mistakes.
This is a description of utilities that make the upgrade process almost automatic and minimize the chance of losing or corrupting data.
Please see the attached HTML file for details.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] Updated: (HADOOP-702) DFS Upgrade Proposal

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Konstantin Shvachko updated HADOOP-702:
---------------------------------------

    Attachment: FSStateTransition5.htm

This is the updated document: FSStateTransition5.htm
I tried to combine the two design documents into one.


> DFS Upgrade Proposal
> --------------------
>
>                 Key: HADOOP-702
>                 URL: https://issues.apache.org/jira/browse/HADOOP-702
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Konstantin Shvachko
>         Assigned To: Konstantin Shvachko
>         Attachments: DFSUpgradeProposal.html, DFSUpgradeProposal2.html, DFSUpgradeProposal3.html, FSStateTransition.html, FSStateTransition5.htm, TestPlan-HdfsUpgrade.html, TestPlan-HdfsUpgrade.html
>
>
> Currently the DFS cluster upgrade procedure is manual.
> http://wiki.apache.org/lucene-hadoop/Hadoop_Upgrade
> It is rather complicated and does not guarantee data recoverability in case of software errors or administrator mistakes.
> This is a description of utilities that make the upgrade process almost automatic and minimize the chance of losing or corrupting data.
> Please see the attached html file for details.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-702) DFS Upgrade Proposal

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
     [ http://issues.apache.org/jira/browse/HADOOP-702?page=all ]

Konstantin Shvachko updated HADOOP-702:
---------------------------------------

    Attachment: DFSUpgradeProposal2.html

I substantially modified the upgrade proposal.

- There will be more changes in the data layout.
Please let me know if something is not satisfactory. E.g. changing file or directory
names will be very hard with backward compatibility and up-/de-gradability in mind.

- Given the massive layout changes, and given that upgrades with rollback are
hard to support across versions with different directory structures, I propose to keep the automatic
data conversion (as we have had until now) from the current version to the next one.
There will be no going back (rollback) and no force (upgrade) between the current and the next version.
Upgrades will be supported once the data is converted to the new format.
Hope that makes sense.

- The proposal contains more details to cover different failure scenarios,
for example a data-node that started the upgrade or discard process but crashed before completing it.

- Discard and rollback are not commands but rather server startup
options. We do not have administrative authorization, and this should make it harder to
do an upgrade, discard or rollback by mistake.
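
To illustrate the last point, here is a minimal sketch of how a server could pick its mode from startup options instead of accepting runtime commands. The enum StartupMode, the method fromArgs and the option spellings "-upgrade", "-discard" and "-rollback" are assumptions made for this example; they are not taken from the proposal or from the Hadoop code.

```java
// Hypothetical startup modes; names and option spellings are illustrative only.
enum StartupMode {
    REGULAR, UPGRADE, DISCARD, ROLLBACK;

    /** Chooses the mode from the server's startup arguments; at most one option is accepted. */
    static StartupMode fromArgs(String[] args) {
        StartupMode mode = REGULAR;
        for (String arg : args) {
            StartupMode requested;
            switch (arg) {
                case "-upgrade":  requested = UPGRADE;  break;
                case "-discard":  requested = DISCARD;  break;
                case "-rollback": requested = ROLLBACK; break;
                default:
                    throw new IllegalArgumentException("Unknown startup option: " + arg);
            }
            if (mode != REGULAR) {
                // Two state-changing options at once are ambiguous, so refuse to start.
                throw new IllegalArgumentException("Conflicting startup options: " + mode + " and " + requested);
            }
            mode = requested;
        }
        return mode;
    }
}
```

Because the mode is fixed when the process starts, changing it requires a deliberate restart, which is the property the bullet above relies on.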



[jira] Updated: (HADOOP-702) DFS Upgrade Proposal

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Konstantin Shvachko updated HADOOP-702:
---------------------------------------

    Attachment:     (was: FSStateTransition.patch)



[jira] Commented: (HADOOP-702) DFS Upgrade Proposal

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12483232 ] 

Raghu Angadi commented on HADOOP-702:
-------------------------------------

+1 for the patch. I reviewed it and did some basic testing.




[jira] Commented: (HADOOP-702) DFS Upgrade Proposal

Posted by "Yoram Arnon (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/HADOOP-702?page=comments#action_12448617 ] 
            
Yoram Arnon commented on HADOOP-702:
------------------------------------

There are two things you can do with a snapshot: view individual files or roll back the entire FS.
We currently plan to support only the latter, more extreme option, which is typically used only in case of disaster and is typically non-reversible, though it would be nice to allow the former as well.

Snapshots are immutable; otherwise you get into the business of managing diverging branches, which I'd recommend against.
Rolling back with the option to roll forward implies your entire FS is read-only, limiting its usefulness, even for testing. The job tracker, for example, won't start on a read-only DFS, let alone execute jobs.

I'd recommend staying the course with non-reversible rollbacks, while keeping in mind the desire for full snapshot functionality in the future, which will allow read-only viewing of individual files or directories.


[jira] Updated: (HADOOP-702) DFS Upgrade Proposal

Posted by "Nigel Daley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nigel Daley updated HADOOP-702:
-------------------------------

    Fix Version/s: 0.13.0



[jira] Updated: (HADOOP-702) DFS Upgrade Proposal

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Konstantin Shvachko updated HADOOP-702:
---------------------------------------

    Attachment:     (was: DFSUpgradeProposal.html)



[jira] Updated: (HADOOP-702) DFS Upgrade Proposal

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Konstantin Shvachko updated HADOOP-702:
---------------------------------------

    Status: Patch Available  (was: Open)



[jira] Commented: (HADOOP-702) DFS Upgrade Proposal

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12486503 ] 

Hadoop QA commented on HADOOP-702:
----------------------------------

+1, because http://issues.apache.org/jira/secure/attachment/12354886/FSStateTransitionApr03.patch applied and successfully tested against trunk revision http://svn.apache.org/repos/asf/lucene/hadoop/trunk/525268. Results are at http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch



[jira] Commented: (HADOOP-702) DFS Upgrade Proposal

Posted by "Sameer Paranjpye (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12466814 ] 

Sameer Paranjpye commented on HADOOP-702:
-----------------------------------------

After some discussion with Konstantin, Milind, Owen and Nigel it feels like we need some amendments to the design for upgrade and rollbacks. The most significant delta is in the area of keeping multiple snapshots with different FSSIDs.

The fundamental problem with allowing multiple FSSIDs each representing a different filesystem state is that these 'snapshots' decay over time unless they are actively managed. There is no monitoring and replication of blocks in a snapshot. Datanodes going down can cause bit rot and data loss. Data corruption also goes undetected since clients never read from snapshots. Allowing multiple FSSIDs also causes the number of states the filesystem can be in to grow significantly and the number of corner cases that need to be handled to explode (particularly on the datanodes). Further, the primary motivation for this design is to protect filesystem data in the face of software upgrades and rollbacks. Snapshots were a side-effect of the design but they don't feel like a hard requirement at this point.

The other important change is much tighter integration of the Namenode and Datanodes. The new design requires that the Namenode and Datanodes be running the same software version. This is a much stricter requirement than having them speaking the same protocol versions. But given that replication and layout can change with software revisions it seems reasonable to enforce. Note that this does *not* affect HDFS clients, which continue to require protocol compatibility only.
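
The difference between the two checks can be sketched as follows. This is only an illustration: the class VersionGate, its method names, and the representation of a software version as a build string and of a protocol version as an integer are assumptions for the example, not the actual implementation.

```java
// Hypothetical admission checks contrasting the two compatibility rules.
final class VersionGate {
    private VersionGate() {}

    /** Datanodes must run exactly the same software build as the Namenode. */
    static boolean dataNodeMayJoin(String nameNodeBuild, String dataNodeBuild) {
        return nameNodeBuild.equals(dataNodeBuild);
    }

    /** HDFS clients only need to speak the same protocol version. */
    static boolean clientMayConnect(int serverProtocolVersion, int clientProtocolVersion) {
        return serverProtocolVersion == clientProtocolVersion;
    }
}
```

Under these assumptions, a datanode built from a different revision is refused even when its protocol version matches, while a client from another build is accepted as long as the protocol versions agree.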

Konstantin will be publishing an updated document shortly.





[jira] Commented: (HADOOP-702) DFS Upgrade Proposal

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12482962 ] 

Konstantin Shvachko commented on HADOOP-702:
--------------------------------------------

There are at least 2 separate issues related to Owen's comment.
See HADOOP-1063 and HADOOP-1075.
I think the comment is based solely on the description I posted.
I'd prefer to see a more thorough review.



[jira] Commented: (HADOOP-702) DFS Upgrade Proposal

Posted by "Owen O'Malley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12486511 ] 

Owen O'Malley commented on HADOOP-702:
--------------------------------------

My objections have been addressed and I think this should be committed. There are a couple of things that I'd like cleaned up eventually, but they shouldn't block the patch at this point, in my opinion.
  1. The UpgradeUtilities in test should be merged with mini-dfs cluster.
  2. The static class in FileUtils for HardLink seems unnecessary.
  3. FileUtils.HardLink.createLink handles InterruptedException badly.



[jira] Commented: (HADOOP-702) DFS Upgrade Proposal

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/HADOOP-702?page=comments#action_12448580 ] 
            
Doug Cutting commented on HADOOP-702:
-------------------------------------

Should there also be a '-list' option, that lists all known FSSIDs?

Also, must rollback always remove the newer version?  If changes were made there they will be lost.  Someone might want to rollback to revert to an old version to test something, or even to find a deleted file, then switch back to the newer version.  In effect these are filesystem checkpoints.  We probably don't want to encourage use of them as checkpoints right off, but we also shouldn't do things that prohibit it, like removing versions whenever we switch versions.  Thoughts?


[jira] Commented: (HADOOP-702) DFS Upgrade Proposal

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12468818 ] 

Konstantin Shvachko commented on HADOOP-702:
--------------------------------------------

> Data-nodes automatically "catching up" on a missed previous upgrade/discard seems like a good thing, isn't it?

Yes, but!
We are trying to protect the system from human mistakes. Now suppose that adminK started the overnight upgrade
of the system before going home, planning to come back in the morning and check whether the upgrade was successful
or not. But another admin, adminY, comes to work earlier and, not knowing about adminK's actions last night, starts the upgrade again.
The data-nodes will automatically discard the "previous" fs state before upgrading because they can store only one backup per node.
So the system can automatically discard the last working state of the file system if the upgraded software had bugs affecting the namespace.
I see this as the main problem with our new approach.
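
The scenario can be reproduced with a toy model of a node that keeps a single backup. The class SingleBackupNode and its string-valued states are hypothetical and exist only to show the ordering problem; they do not model the real data-node storage.

```java
// Toy model: a node that can hold only one "previous" state at a time.
final class SingleBackupNode {
    private String current;
    private String previous; // null when there is no backup

    SingleBackupNode(String initialState) {
        this.current = initialState;
    }

    /** Upgrading overwrites whatever backup existed before. */
    void upgrade(String newState) {
        previous = current; // the older backup, if any, is silently lost here
        current = newState;
    }

    /** Rolls back to the only backup the node has. */
    void rollback() {
        if (previous == null) {
            throw new IllegalStateException("No previous state to roll back to");
        }
        current = previous;
        previous = null;
    }

    String currentState() {
        return current;
    }
}
```

If adminK upgrades from a good state to a buggy one and adminY then upgrades again, a rollback returns the node to the buggy state, because the good state was the backup that the second upgrade discarded.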



[jira] Commented: (HADOOP-702) DFS Upgrade Proposal

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12467205 ] 

Konstantin Shvachko commented on HADOOP-702:
--------------------------------------------

I'd like to emphasize the changes in the design of the upgrade and the behavior of the system in general.
People expressed different opinions during the previous discussion, so if anybody sees problems with the new
approach, now would be a good time to speak up.

- No FSSIDs means that there will be no possibility to create multiple snapshots of the fs.
There will be only one snapshot at any given time.
Something like what Doug calls "filesystem checkpoints" above will not be possible any more.

- The requirement of an exact release version match means that administrators will have no option
to stop the name-node (without stopping the data-nodes) and restart it with updated software, even if no
changes to the data layout or the data-node protocol have been made.

- Another important issue in the new design is that data-nodes will decide on their own whether to upgrade
or discard the old fs state, based on a comparison of the local data layout version and the name-node LV.
That is, even if you start the name-node in regular mode, some data-nodes that missed previous upgrade(s)
or discard(s) can decide to perform them on their own.
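
The third point can be sketched as a decision made by each data-node at startup. Everything here is an assumption for illustration: the class LayoutDecision, the Action names, and in particular the convention that a numerically larger layout version is newer, which may not match the real encoding. The rule for choosing a discard is not spelled out in this thread, so the sketch covers only the upgrade case.

```java
// Hypothetical startup decision of a data-node based on layout versions (LV).
final class LayoutDecision {
    enum Action { START_NORMALLY, UPGRADE_LOCAL_STORAGE, REFUSE_TO_START }

    private LayoutDecision() {}

    /**
     * Compares the node's local LV with the LV reported by the name-node.
     * Assumption: a larger number means a newer layout.
     */
    static Action decide(int localLayoutVersion, int nameNodeLayoutVersion) {
        if (localLayoutVersion == nameNodeLayoutVersion) {
            return Action.START_NORMALLY;
        }
        if (localLayoutVersion < nameNodeLayoutVersion) {
            // The node missed an earlier upgrade and catches up on its own.
            return Action.UPGRADE_LOCAL_STORAGE;
        }
        // Local data is newer than what the name-node understands.
        return Action.REFUSE_TO_START;
    }
}
```

Note that the decision does not depend on how the name-node was started, which is exactly why a data-node can upgrade itself while the name-node runs in regular mode.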

I wrote a test that creates hard links of block files in a new directory. On my machine a hard link creation
takes about 10 milliseconds, which is 6,000 blocks per minute.
Depending on your data-node size you can calculate the cluster startup delay.
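
A rough version of such a measurement, together with the delay arithmetic, could look like the sketch below. It is not the test mentioned above; the class HardLinkTiming, the file names and the count of 1,000 empty files are made up for the example. It relies on java.nio.file.Files.createLink, which can throw UnsupportedOperationException on file systems without hard-link support.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical timing sketch: hard-link every file of one directory into another.
final class HardLinkTiming {
    private HardLinkTiming() {}

    /** Creates a hard link in snapshotDir for each regular file in blockDir; returns the link count. */
    static int linkAll(Path blockDir, Path snapshotDir) throws IOException {
        Files.createDirectories(snapshotDir);
        int linked = 0;
        try (DirectoryStream<Path> blocks = Files.newDirectoryStream(blockDir)) {
            for (Path block : blocks) {
                if (Files.isRegularFile(block)) {
                    Files.createLink(snapshotDir.resolve(block.getFileName()), block);
                    linked++;
                }
            }
        }
        return linked;
    }

    /** Minutes needed to link blockCount files at millisPerLink each. */
    static double minutesFor(long blockCount, double millisPerLink) {
        return blockCount * millisPerLink / 60_000.0;
    }

    public static void main(String[] args) throws IOException {
        Path blockDir = Files.createTempDirectory("blocks");
        for (int i = 0; i < 1_000; i++) {
            Files.createFile(blockDir.resolve("blk_" + i)); // empty stand-ins for block files
        }
        Path snapshotDir = blockDir.resolveSibling(blockDir.getFileName() + "-previous");

        long start = System.nanoTime();
        int linked = linkAll(blockDir, snapshotDir);
        double millisPerLink = (System.nanoTime() - start) / 1e6 / linked;

        System.out.println("Measured ms per link: " + millisPerLink);
        System.out.println("Estimated minutes for 100000 blocks: " + minutesFor(100_000, millisPerLink));
    }
}
```

At the 10 milliseconds per link quoted above, minutesFor(6000, 10.0) gives one minute, so a node holding 100,000 blocks would spend roughly 17 minutes creating links.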



[jira] Commented: (HADOOP-702) DFS Upgrade Proposal

Posted by "Nigel Daley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12462664 ] 

Nigel Daley commented on HADOOP-702:
------------------------------------

Updated the test plan to reflect the latest design.


[jira] Updated: (HADOOP-702) DFS Upgrade Proposal

Posted by "Nigel Daley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nigel Daley updated HADOOP-702:
-------------------------------

    Attachment: TestPlan-HdfsUpgrade.html

Attached updated test plan for the latest design doc.  There are certainly errors in the "expected response" sections of this document, largely due to missing details in the design doc.

> DFS Upgrade Proposal
> --------------------
>
>                 Key: HADOOP-702
>                 URL: https://issues.apache.org/jira/browse/HADOOP-702
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Konstantin Shvachko
>         Assigned To: Konstantin Shvachko
>         Attachments: DFSUpgradeProposal.html, DFSUpgradeProposal2.html, DFSUpgradeProposal3.html, FSStateTransition.html, FSStateTransition5.htm, TestPlan-HdfsUpgrade.html, TestPlan-HdfsUpgrade.html, TestPlan-HdfsUpgrade.html
>
>
> Currently the DFS cluster upgrade procedure is manual.
> http://wiki.apache.org/lucene-hadoop/Hadoop_Upgrade
> It is rather complicated and does not guarantee data recoverability in case of software errors or administrator mistakes.
> This is a description of utilities that make the upgrade process almost automatic and minimize chance of loosing or corrupting data.
> Please see the attached html file for details.



[jira] Commented: (HADOOP-702) DFS Upgrade Proposal

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/HADOOP-702?page=comments#action_12453689 ] 
            
Raghu Angadi commented on HADOOP-702:
-------------------------------------


The latest proposal includes <BuildVersion> in the directory name and also in the "VERSION" file. I am not sure what BuildVersion would be used for. Each backed-up directory is uniquely identified by its FSSID. If we also include the build version in its name, it gives the impression that the directory is somehow connected to the build version, but it is not. The build version will change often, and including it increases the number of things to consider in the code and in error handling.


> DFS Upgrade Proposal
> --------------------
>
>                 Key: HADOOP-702
>                 URL: http://issues.apache.org/jira/browse/HADOOP-702
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Konstantin Shvachko
>         Attachments: DFSUpgradeProposal.html, DFSUpgradeProposal2.html
>
>
> Currently the DFS cluster upgrade procedure is manual.
> http://wiki.apache.org/lucene-hadoop/Hadoop_Upgrade
> It is rather complicated and does not guarantee data recoverability in case of software errors or administrator mistakes.
> This is a description of utilities that make the upgrade process almost automatic and minimize chance of loosing or corrupting data.
> Please see the attached html file for details.


[jira] Commented: (HADOOP-702) DFS Upgrade Proposal

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/HADOOP-702?page=comments#action_12448628 ] 
            
Konstantin Shvachko commented on HADOOP-702:
--------------------------------------------

Doug,
Do you mean a shell -list option to list FSSIDs?
Sure, if we display them in the web UI, why not report the same via a shell command?

You actually can do a rollback and preserve the current version.
You need to upgrade the current version to some new FSSID first, then roll back to the old version.
But I agree this is not very convenient.
So do we want to separate taking a snapshot into a special function?
It will still be a part of upgrade; we are just adding an API to call it independently?

Yoram,
I don't think we should require snapshots to be immutable.
The hard link scheme lets different versions coexist and be modified independently of each other.
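
As a rough illustration of the hard link scheme mentioned above, the following Python sketch snapshots a directory by hard-linking every file into a second directory. The directory and file names (current, previous, blk_1, blk_2) are hypothetical and are not taken from the proposal. The sketch also assumes that files are deleted or replaced rather than rewritten in place, since an in-place write would be visible through both links.

```python
import os
import tempfile


def snapshot_with_hard_links(src_dir, dst_dir):
    """Create dst_dir and hard-link every regular file from src_dir into it."""
    os.makedirs(dst_dir)
    for name in os.listdir(src_dir):
        src = os.path.join(src_dir, name)
        if os.path.isfile(src):
            os.link(src, os.path.join(dst_dir, name))


def demo():
    """Snapshot a directory, change the original, and list both directories."""
    root = tempfile.mkdtemp()
    current = os.path.join(root, "current")    # hypothetical name
    previous = os.path.join(root, "previous")  # hypothetical name
    os.makedirs(current)
    with open(os.path.join(current, "blk_1"), "w") as f:
        f.write("old data")

    snapshot_with_hard_links(current, previous)

    # Changes made after the snapshot: delete one file, add another.
    os.remove(os.path.join(current, "blk_1"))
    with open(os.path.join(current, "blk_2"), "w") as f:
        f.write("new data")

    return sorted(os.listdir(current)), sorted(os.listdir(previous))
```

Calling demo() returns the file listings of the two directories, so one can see that removing a file from one directory leaves the other directory's link in place.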

> DFS Upgrade Proposal
> --------------------
>
>                 Key: HADOOP-702
>                 URL: http://issues.apache.org/jira/browse/HADOOP-702
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Konstantin Shvachko
>         Attachments: DFSUpgradeProposal.html
>
>
> Currently the DFS cluster upgrade procedure is manual.
> http://wiki.apache.org/lucene-hadoop/Hadoop_Upgrade
> It is rather complicated and does not guarantee data recoverability in case of software errors or administrator mistakes.
> This is a description of utilities that make the upgrade process almost automatic and minimize chance of loosing or corrupting data.
> Please see the attached html file for details.


[jira] Updated: (HADOOP-702) DFS Upgrade Proposal

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Konstantin Shvachko updated HADOOP-702:
---------------------------------------

    Attachment:     (was: DFSUpgradeProposal2.html)

> DFS Upgrade Proposal
> --------------------
>
>                 Key: HADOOP-702
>                 URL: https://issues.apache.org/jira/browse/HADOOP-702
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Konstantin Shvachko
>         Assigned To: Konstantin Shvachko
>             Fix For: 0.13.0
>
>         Attachments: DFSUpgradeProposal3.html, FSStateTransition.html, FSStateTransition.patch, FSStateTransition5.htm, FSStateTransition6.htm, TestPlan-HdfsUpgrade.html, TestPlan-HdfsUpgrade.html, TestPlan-HdfsUpgrade.html
>
>
> Currently the DFS cluster upgrade procedure is manual.
> http://wiki.apache.org/lucene-hadoop/Hadoop_Upgrade
> It is rather complicated and does not guarantee data recoverability in case of software errors or administrator mistakes.
> This is a description of utilities that make the upgrade process almost automatic and minimize chance of loosing or corrupting data.
> Please see the attached html file for details.



[jira] Commented: (HADOOP-702) DFS Upgrade Proposal

Posted by "Yoram Arnon (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/HADOOP-702?page=comments#action_12448632 ] 
            
Yoram Arnon commented on HADOOP-702:
------------------------------------

Read-write snapshots are possible, maybe even 'neat' from a technology standpoint.
They're just hard to manage. I know of no file systems that implement them, though they're common in revision control systems.
I recommend against them in HDFS.

> DFS Upgrade Proposal
> --------------------
>
>                 Key: HADOOP-702
>                 URL: http://issues.apache.org/jira/browse/HADOOP-702
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Konstantin Shvachko
>         Attachments: DFSUpgradeProposal.html
>
>
> Currently the DFS cluster upgrade procedure is manual.
> http://wiki.apache.org/lucene-hadoop/Hadoop_Upgrade
> It is rather complicated and does not guarantee data recoverability in case of software errors or administrator mistakes.
> This is a description of utilities that make the upgrade process almost automatic and minimize chance of loosing or corrupting data.
> Please see the attached html file for details.


[jira] Updated: (HADOOP-702) DFS Upgrade Proposal

Posted by "Nigel Daley (JIRA)" <ji...@apache.org>.
     [ http://issues.apache.org/jira/browse/HADOOP-702?page=all ]

Nigel Daley updated HADOOP-702:
-------------------------------

    Attachment: TestPlan-HdfsUpgrade.html

A test plan for DFS upgrades.  Review comments welcome.

> DFS Upgrade Proposal
> --------------------
>
>                 Key: HADOOP-702
>                 URL: http://issues.apache.org/jira/browse/HADOOP-702
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Konstantin Shvachko
>         Attachments: DFSUpgradeProposal.html, DFSUpgradeProposal2.html, TestPlan-HdfsUpgrade.html
>
>
> Currently the DFS cluster upgrade procedure is manual.
> http://wiki.apache.org/lucene-hadoop/Hadoop_Upgrade
> It is rather complicated and does not guarantee data recoverability in case of software errors or administrator mistakes.
> This is a description of utilities that make the upgrade process almost automatic and minimize chance of loosing or corrupting data.
> Please see the attached html file for details.


[jira] Commented: (HADOOP-702) DFS Upgrade Proposal

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12482969 ] 

Raghu Angadi commented on HADOOP-702:
-------------------------------------

+1.

> DFS Upgrade Proposal
> --------------------
>
>                 Key: HADOOP-702
>                 URL: https://issues.apache.org/jira/browse/HADOOP-702
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Konstantin Shvachko
>         Assigned To: Konstantin Shvachko
>             Fix For: 0.13.0
>
>         Attachments: DFSUpgradeProposal3.html, FSStateTransition.patch, FSStateTransition6.htm, TestPlan-HdfsUpgrade.html, TestPlan-HdfsUpgrade.html, TestPlan-HdfsUpgrade.html
>
>
> Currently the DFS cluster upgrade procedure is manual.
> http://wiki.apache.org/lucene-hadoop/Hadoop_Upgrade
> It is rather complicated and does not guarantee data recoverability in case of software errors or administrator mistakes.
> This is a description of utilities that make the upgrade process almost automatic and minimize chance of loosing or corrupting data.
> Please see the attached html file for details.



Re: [jira] Commented: (HADOOP-702) DFS Upgrade Proposal

Posted by Nigel Daley <nd...@yahoo-inc.com>.
To Owen's -1:

I dislike adding sleep statements too, but 2 of these very small  
sleeps are a temporary necessary evil while we wait for the issues  
Konstantin mentioned plus HADOOP-1085.  The 3rd sleep (in  
TestDFSFinalize) is necessary due to the architecture of the feature  
being tested.  Yes, these can and should be revisited once the other  
patches are committed, but I don't think they're blockers for  
committing this patch.

What may be a blocker is the fact that the 5 new unit tests (totaling  
134 test cases) in this patch add 19 minutes to the unit test run.

On Mar 21, 2007, at 4:22 PM, Owen O'Malley (JIRA) wrote:

>
>     [ https://issues.apache.org/jira/browse/HADOOP-702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12482961 ]
>
> Owen O'Malley commented on HADOOP-702:
> --------------------------------------
>
> -1
>
> I strongly dislike adding sleep statements to the test. Please add  
> proper synchronization to remove the need for sleeping.
>
>> DFS Upgrade Proposal
>> --------------------
>>
>>                 Key: HADOOP-702
>>                 URL: https://issues.apache.org/jira/browse/HADOOP-702
>>             Project: Hadoop
>>          Issue Type: New Feature
>>          Components: dfs
>>            Reporter: Konstantin Shvachko
>>         Assigned To: Konstantin Shvachko
>>             Fix For: 0.13.0
>>
>>         Attachments: DFSUpgradeProposal3.html, FSStateTransition.patch, FSStateTransition6.htm, TestPlan-HdfsUpgrade.html, TestPlan-HdfsUpgrade.html, TestPlan-HdfsUpgrade.html
>>
>>
>> Currently the DFS cluster upgrade procedure is manual.
>> http://wiki.apache.org/lucene-hadoop/Hadoop_Upgrade
>> It is rather complicated and does not guarantee data  
>> recoverability in case of software errors or administrator mistakes.
>> This is a description of utilities that make the upgrade process  
>> almost automatic and minimize chance of loosing or corrupting data.
>> Please see the attached html file for details.


[jira] Commented: (HADOOP-702) DFS Upgrade Proposal

Posted by "Owen O'Malley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12482961 ] 

Owen O'Malley commented on HADOOP-702:
--------------------------------------

-1

I strongly dislike adding sleep statements to the test. Please add proper synchronization to remove the need for sleeping.
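
One common way to avoid fixed sleeps in a test is to have the component signal readiness and have the test wait on that signal with a timeout. The sketch below shows the idea with Python's threading.Event. The FakeDataNode class is a hypothetical stand-in and is not part of the patch under discussion.

```python
import threading


class FakeDataNode:
    """Hypothetical stand-in for a service that starts in the background."""

    def __init__(self):
        self.ready = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def _run(self):
        # Startup work would happen here; then signal that it has finished.
        self.ready.set()

    def wait_until_ready(self, timeout):
        """Block until the node is ready; return False if the timeout expires."""
        return self.ready.wait(timeout)


def start_and_wait(timeout=5.0):
    """Start a node and wait for its readiness signal instead of sleeping."""
    node = FakeDataNode()
    node.start()
    return node.wait_until_ready(timeout)
```

The test proceeds as soon as the signal arrives, and the timeout only bounds the worst case, so the wait is never longer than it needs to be.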

> DFS Upgrade Proposal
> --------------------
>
>                 Key: HADOOP-702
>                 URL: https://issues.apache.org/jira/browse/HADOOP-702
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Konstantin Shvachko
>         Assigned To: Konstantin Shvachko
>             Fix For: 0.13.0
>
>         Attachments: DFSUpgradeProposal3.html, FSStateTransition.patch, FSStateTransition6.htm, TestPlan-HdfsUpgrade.html, TestPlan-HdfsUpgrade.html, TestPlan-HdfsUpgrade.html
>
>
> Currently the DFS cluster upgrade procedure is manual.
> http://wiki.apache.org/lucene-hadoop/Hadoop_Upgrade
> It is rather complicated and does not guarantee data recoverability in case of software errors or administrator mistakes.
> This is a description of utilities that make the upgrade process almost automatic and minimize chance of loosing or corrupting data.
> Please see the attached html file for details.



[jira] Updated: (HADOOP-702) DFS Upgrade Proposal

Posted by "Nigel Daley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nigel Daley updated HADOOP-702:
-------------------------------

    Status: Open  (was: Patch Available)

I'll be reworking some of the tests to address Owen's concerns.
The patch also needs to be updated for the latest trunk.
I'll create a test-nightly target to run 4 of the new tests which take a while to run.  The target will run all tests that start with the word "Nightly".

> DFS Upgrade Proposal
> --------------------
>
>                 Key: HADOOP-702
>                 URL: https://issues.apache.org/jira/browse/HADOOP-702
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Konstantin Shvachko
>         Assigned To: Konstantin Shvachko
>             Fix For: 0.13.0
>
>         Attachments: DFSUpgradeProposal3.html, FSStateTransition.patch, FSStateTransition6.htm, TestPlan-HdfsUpgrade.html, TestPlan-HdfsUpgrade.html, TestPlan-HdfsUpgrade.html
>
>
> Currently the DFS cluster upgrade procedure is manual.
> http://wiki.apache.org/lucene-hadoop/Hadoop_Upgrade
> It is rather complicated and does not guarantee data recoverability in case of software errors or administrator mistakes.
> This is a description of utilities that make the upgrade process almost automatic and minimize chance of loosing or corrupting data.
> Please see the attached html file for details.



[jira] Commented: (HADOOP-702) DFS Upgrade Proposal

Posted by "Nigel Daley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12482957 ] 

Nigel Daley commented on HADOOP-702:
------------------------------------

+1  This passes unit tests against trunk revision 520995

> DFS Upgrade Proposal
> --------------------
>
>                 Key: HADOOP-702
>                 URL: https://issues.apache.org/jira/browse/HADOOP-702
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Konstantin Shvachko
>         Assigned To: Konstantin Shvachko
>             Fix For: 0.13.0
>
>         Attachments: DFSUpgradeProposal3.html, FSStateTransition.patch, FSStateTransition6.htm, TestPlan-HdfsUpgrade.html, TestPlan-HdfsUpgrade.html, TestPlan-HdfsUpgrade.html
>
>
> Currently the DFS cluster upgrade procedure is manual.
> http://wiki.apache.org/lucene-hadoop/Hadoop_Upgrade
> It is rather complicated and does not guarantee data recoverability in case of software errors or administrator mistakes.
> This is a description of utilities that make the upgrade process almost automatic and minimize chance of loosing or corrupting data.
> Please see the attached html file for details.



[jira] Commented: (HADOOP-702) DFS Upgrade Proposal

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/HADOOP-702?page=comments#action_12450530 ] 
            
Raghu Angadi commented on HADOOP-702:
-------------------------------------


With the manual rollback on each of the nodes in the cluster, I think we will need a way to know whether a datanode is connecting to the namenode with the wrong fs version, because there will be some datanodes that did not run the rollback procedure. One indirect way for the namenode to recognize such nodes is to check whether the "latest stored version" on the datanode is "later" than the namenode's. What should the namenode do if it notices such a datanode? Since rollback is supposed to be rare, it could 'fail fast' and let the admin fix such nodes.
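
A minimal sketch of the 'fail fast' check described above might look like the following. It assumes versions are plain integers where a larger value means "later"; the actual ordering and representation in the design may differ, and the function and exception names are hypothetical.

```python
class IncompatibleVersionError(Exception):
    """Raised when a datanode's stored version is later than the namenode's."""


def check_datanode_version(namenode_version, datanode_stored_version):
    """Fail fast if the datanode appears to have missed a rollback.

    Assumption: versions are integers and a larger value means "later".
    """
    if datanode_stored_version > namenode_version:
        raise IncompatibleVersionError(
            "datanode stored version %d is later than namenode version %d; "
            "rollback was probably not run on this node"
            % (datanode_stored_version, namenode_version)
        )
    return True
```

Raising an exception rather than silently ignoring the node leaves the decision to the admin, which matches the suggestion that such nodes be fixed by hand.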

> DFS Upgrade Proposal
> --------------------
>
>                 Key: HADOOP-702
>                 URL: http://issues.apache.org/jira/browse/HADOOP-702
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Konstantin Shvachko
>         Attachments: DFSUpgradeProposal.html
>
>
> Currently the DFS cluster upgrade procedure is manual.
> http://wiki.apache.org/lucene-hadoop/Hadoop_Upgrade
> It is rather complicated and does not guarantee data recoverability in case of software errors or administrator mistakes.
> This is a description of utilities that make the upgrade process almost automatic and minimize chance of loosing or corrupting data.
> Please see the attached html file for details.


[jira] Commented: (HADOOP-702) DFS Upgrade Proposal

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12467721 ] 

Raghu Angadi commented on HADOOP-702:
-------------------------------------

I vote against requiring a strict build version match between datanodes and the namenode, especially if it results in datanodes being marked dead in case of a mismatch, unless there is an easy way to disable it:

1) In practice it's hard to make sure every node is running the same software version ALL the time. Pretty soon we might have a case where we forgot, or rsync failed, to push to half the nodes, and the namenode suddenly loses a large chunk of data as a result.

2) As Konstantin mentioned, even in a test cluster this makes active testing hard. If I am working on a small namenode feature on a not-so-small test cluster, it would require me to push new software and restart the whole cluster many times, also increasing the possibility of (1).


> DFS Upgrade Proposal
> --------------------
>
>                 Key: HADOOP-702
>                 URL: https://issues.apache.org/jira/browse/HADOOP-702
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Konstantin Shvachko
>         Assigned To: Konstantin Shvachko
>         Attachments: DFSUpgradeProposal.html, DFSUpgradeProposal2.html, DFSUpgradeProposal3.html, FSStateTransition.html, TestPlan-HdfsUpgrade.html, TestPlan-HdfsUpgrade.html
>
>
> Currently the DFS cluster upgrade procedure is manual.
> http://wiki.apache.org/lucene-hadoop/Hadoop_Upgrade
> It is rather complicated and does not guarantee data recoverability in case of software errors or administrator mistakes.
> This is a description of utilities that make the upgrade process almost automatic and minimize chance of loosing or corrupting data.
> Please see the attached html file for details.



[jira] Commented: (HADOOP-702) DFS Upgrade Proposal

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12462312 ] 

Konstantin Shvachko commented on HADOOP-702:
--------------------------------------------

The current proposal is now in DFSUpgradeProposal3.html,
and FSStateTransition.html contains more detailed algorithms.


> DFS Upgrade Proposal
> --------------------
>
>                 Key: HADOOP-702
>                 URL: https://issues.apache.org/jira/browse/HADOOP-702
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Konstantin Shvachko
>         Assigned To: Konstantin Shvachko
>         Attachments: DFSUpgradeProposal.html, DFSUpgradeProposal2.html, DFSUpgradeProposal3.html, FSStateTransition.html, TestPlan-HdfsUpgrade.html
>
>
> Currently the DFS cluster upgrade procedure is manual.
> http://wiki.apache.org/lucene-hadoop/Hadoop_Upgrade
> It is rather complicated and does not guarantee data recoverability in case of software errors or administrator mistakes.
> This is a description of utilities that make the upgrade process almost automatic and minimize chance of loosing or corrupting data.
> Please see the attached html file for details.


[jira] Updated: (HADOOP-702) DFS Upgrade Proposal

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Konstantin Shvachko updated HADOOP-702:
---------------------------------------

    Attachment:     (was: FSStateTransition.html)

> DFS Upgrade Proposal
> --------------------
>
>                 Key: HADOOP-702
>                 URL: https://issues.apache.org/jira/browse/HADOOP-702
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Konstantin Shvachko
>         Assigned To: Konstantin Shvachko
>             Fix For: 0.13.0
>
>         Attachments: DFSUpgradeProposal3.html, FSStateTransition.patch, FSStateTransition5.htm, FSStateTransition6.htm, TestPlan-HdfsUpgrade.html, TestPlan-HdfsUpgrade.html, TestPlan-HdfsUpgrade.html
>
>
> Currently the DFS cluster upgrade procedure is manual.
> http://wiki.apache.org/lucene-hadoop/Hadoop_Upgrade
> It is rather complicated and does not guarantee data recoverability in case of software errors or administrator mistakes.
> This is a description of utilities that make the upgrade process almost automatic and minimize chance of loosing or corrupting data.
> Please see the attached html file for details.



[jira] Commented: (HADOOP-702) DFS Upgrade Proposal

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12480532 ] 

Raghu Angadi commented on HADOOP-702:
-------------------------------------


How about enforcing the buildVersion match only when we are rolling back (and maybe while upgrading and finalizing)?


> DFS Upgrade Proposal
> --------------------
>
>                 Key: HADOOP-702
>                 URL: https://issues.apache.org/jira/browse/HADOOP-702
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Konstantin Shvachko
>         Assigned To: Konstantin Shvachko
>         Attachments: DFSUpgradeProposal.html, DFSUpgradeProposal2.html, DFSUpgradeProposal3.html, FSStateTransition.html, FSStateTransition5.htm, TestPlan-HdfsUpgrade.html, TestPlan-HdfsUpgrade.html, TestPlan-HdfsUpgrade.html
>
>
> Currently the DFS cluster upgrade procedure is manual.
> http://wiki.apache.org/lucene-hadoop/Hadoop_Upgrade
> It is rather complicated and does not guarantee data recoverability in case of software errors or administrator mistakes.
> This is a description of utilities that make the upgrade process almost automatic and minimize chance of loosing or corrupting data.
> Please see the attached html file for details.



[jira] Updated: (HADOOP-702) DFS Upgrade Proposal

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Cutting updated HADOOP-702:
--------------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

I just committed this.  Thanks, Konstantin!

> DFS Upgrade Proposal
> --------------------
>
>                 Key: HADOOP-702
>                 URL: https://issues.apache.org/jira/browse/HADOOP-702
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Konstantin Shvachko
>         Assigned To: Konstantin Shvachko
>             Fix For: 0.13.0
>
>         Attachments: DFSUpgradeProposal3.html, FSStateTransition6.htm, FSStateTransitionApr03.patch, TestPlan-HdfsUpgrade.html, TestPlan-HdfsUpgrade.html, TestPlan-HdfsUpgrade.html
>
>
> Currently the DFS cluster upgrade procedure is manual.
> http://wiki.apache.org/lucene-hadoop/Hadoop_Upgrade
> It is rather complicated and does not guarantee data recoverability in case of software errors or administrator mistakes.
> This is a description of utilities that make the upgrade process almost automatic and minimize chance of loosing or corrupting data.
> Please see the attached html file for details.



[jira] Updated: (HADOOP-702) DFS Upgrade Proposal

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Konstantin Shvachko updated HADOOP-702:
---------------------------------------

    Status: Patch Available  (was: Open)

> DFS Upgrade Proposal
> --------------------
>
>                 Key: HADOOP-702
>                 URL: https://issues.apache.org/jira/browse/HADOOP-702
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Konstantin Shvachko
>         Assigned To: Konstantin Shvachko
>             Fix For: 0.13.0
>
>         Attachments: DFSUpgradeProposal3.html, FSStateTransition6.htm, FSStateTransitionApr03.patch, TestPlan-HdfsUpgrade.html, TestPlan-HdfsUpgrade.html, TestPlan-HdfsUpgrade.html
>
>
> Currently the DFS cluster upgrade procedure is manual.
> http://wiki.apache.org/lucene-hadoop/Hadoop_Upgrade
> It is rather complicated and does not guarantee data recoverability in case of software errors or administrator mistakes.
> This is a description of utilities that make the upgrade process almost automatic and minimize chance of loosing or corrupting data.
> Please see the attached html file for details.



[jira] Updated: (HADOOP-702) DFS Upgrade Proposal

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Konstantin Shvachko updated HADOOP-702:
---------------------------------------

    Attachment: FSStateTransition6.htm
                FSStateTransition.patch

This is the patch that fully implements the design in the updated document.
I updated three versions: ClientProtocol, DatanodeProtocol, and LAYOUT_VERSION,
which used to be called DFS_CURRENT_VERSION.

The new code enforces stricter version checking: if a data-node's build version differs from the
name-node's, the data-node fails, even if the layout and protocol versions are the same.
The build version is checked during the handshake, a new RPC call that happens before registration.
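
The stricter check can be sketched roughly as follows. VersionInfo and handshake_ok are hypothetical names, not the patch's actual classes or methods. The sketch also treats a layout mismatch as a failure, which is an assumption, and the protocol versions are left out for brevity.

```python
from collections import namedtuple

# Hypothetical record of what each side reports during the handshake.
VersionInfo = namedtuple("VersionInfo", ["layout_version", "build_version"])


def handshake_ok(namenode, datanode):
    """Return True only if the layout and build versions both match."""
    if namenode.layout_version != datanode.layout_version:
        return False
    # Strict check: an identical layout is not enough; builds must match too.
    return namenode.build_version == datanode.build_version
```

Under this rule a data-node built from a different revision is rejected before registration, even when its on-disk layout is compatible.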

The -upgrade feature can be used immediately, although it is not mandatory.
The expected behavior is that the old fs layout will first be converted into the new layout and then
saved in the directory "previous". The "current" directory will contain the new file system state.
All old files (in "previous") will remain unmodified and can be restored in case of failure.
As pointed out in the design doc, the rollback will not restore the pre-upgrade layout.
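
The roles of the "current" and "previous" directories can be illustrated with a small Python sketch. This is a simplification under stated assumptions, not the patch's implementation: a plain copy stands in for whatever mechanism preserves the old files, layout conversion is omitted, and the function names and the fsimage file name are hypothetical.

```python
import os
import shutil
import tempfile


def upgrade(storage_dir):
    """Keep the pre-upgrade state in "previous" and continue in "current"."""
    current = os.path.join(storage_dir, "current")
    previous = os.path.join(storage_dir, "previous")
    # A plain copy stands in for the real preservation mechanism.
    shutil.copytree(current, previous)


def rollback(storage_dir):
    """Discard "current" and restore the state saved in "previous"."""
    current = os.path.join(storage_dir, "current")
    previous = os.path.join(storage_dir, "previous")
    shutil.rmtree(current)
    os.rename(previous, current)


def finalize(storage_dir):
    """Give up the ability to roll back by deleting "previous"."""
    shutil.rmtree(os.path.join(storage_dir, "previous"))


def demo():
    """Upgrade, make a post-upgrade change, then roll it back."""
    storage = tempfile.mkdtemp()
    os.makedirs(os.path.join(storage, "current"))
    image = os.path.join(storage, "current", "fsimage")  # hypothetical file
    with open(image, "w") as f:
        f.write("v1")
    upgrade(storage)
    with open(image, "w") as f:
        f.write("v2")  # change made after the upgrade
    rollback(storage)
    with open(image) as f:
        return f.read(), os.listdir(storage)
```

Here demo() returns the restored file contents and the remaining directory names, which makes it easy to check that the post-upgrade change is gone after rollback and that "previous" no longer exists.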

After applying the upgrade patch I recommend actually upgrading:
- start the cluster with the -upgrade option
- run fsck and some tests
- bin/hadoop dfsadmin -finalizeUpgrade
If something fails during conversion or later on, I do not recommend using rollback as a recovery procedure.
To recover the pre-upgrade state and layout from the "previous" directory, one should manually rename files, namely:
for NameNode
    mv previous/edits ../
    rm previous/VERSION
    mv previous image
    rm -r current
for DataNode
    mv previous/storage ../
    rm previous/VERSION
    mv previous data
    rm -r current

Other changes and future work.
- The name-node image file format has not been changed, and it still contains the layout version and the namespace ID,
  which are redundant now. The reason for that is that it would make failure during the conversion unrecoverable.
  If the image is converted but the name-node fails before writing down the version file, the namespace id and the LV will be lost.
  The image file format should be changed sometimes later.
- I deprecated some methods. Most of then will need to be removed in a subsequent patch.
- Name-node is locking the storage directory now, the same as data-nodes, so no one can start
  two name-nodes in the same directory from now on.
- I removed unused code in FSEditLog and SecondaryNameNode. This is related to HADOOP-1076  (2)
- In FSEditLog I replaced 4 arrays by one and eliminated duplicate code.
- I changed MiniDFSCluster to sleep for 2 seconds before starting each data-node.
  Otherwise many tests were failing, because data-nodes were rolling ports.
  This is not a good fix; we will need to find out why this is happening.
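
As an illustration of the storage directory locking mentioned in the list above, here is a minimal Python sketch of an exclusive, non-blocking lock on a file inside the directory. It is not the code from the patch: the lock file name is hypothetical, and the sketch relies on fcntl.flock, so it runs on Unix-like systems only.

```python
import fcntl
import os
import tempfile


class StorageLock:
    """Exclusive, non-blocking lock on a file inside a storage directory."""

    def __init__(self, storage_dir: str, lock_name: str = "storage.lock"):
        self.path = os.path.join(storage_dir, lock_name)  # hypothetical file name
        self.fd = None

    def acquire(self) -> bool:
        """Return True if the lock was taken, False if another holder has it."""
        fd = os.open(self.path, os.O_CREAT | os.O_RDWR, 0o644)
        try:
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError:
            os.close(fd)
            return False
        self.fd = fd
        return True

    def release(self) -> None:
        if self.fd is not None:
            fcntl.flock(self.fd, fcntl.LOCK_UN)
            os.close(self.fd)
            self.fd = None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as storage_dir:
        first = StorageLock(storage_dir)
        second = StorageLock(storage_dir)
        print(first.acquire())   # first "node" takes the directory
        print(second.acquire())  # second "node" is refused while the first holds it
        first.release()
```

A node that fails to acquire the lock should refuse to start rather than wait, so that two nodes never write to the same directory.
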

Thanks Raghu for reviewing the code and helping with testing.
Thanks Nigel for testing and for creating a comprehensive junit test that covers at least 134 test cases
related to the new functionality.


> DFS Upgrade Proposal
> --------------------
>
>                 Key: HADOOP-702
>                 URL: https://issues.apache.org/jira/browse/HADOOP-702
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Konstantin Shvachko
>         Assigned To: Konstantin Shvachko
>             Fix For: 0.13.0
>
>         Attachments: DFSUpgradeProposal.html, DFSUpgradeProposal2.html, DFSUpgradeProposal3.html, FSStateTransition.html, FSStateTransition.patch, FSStateTransition5.htm, FSStateTransition6.htm, TestPlan-HdfsUpgrade.html, TestPlan-HdfsUpgrade.html, TestPlan-HdfsUpgrade.html


[jira] Updated: (HADOOP-702) DFS Upgrade Proposal

Posted by "Nigel Daley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nigel Daley updated HADOOP-702:
-------------------------------

    Attachment: Manual-TestCases-HdfsConversion.txt

Attaching a writeup of the manual tests I ran to test the directory structure conversion from pre-0.13 to 0.13.

> DFS Upgrade Proposal
> --------------------
>
>                 Key: HADOOP-702
>                 URL: https://issues.apache.org/jira/browse/HADOOP-702
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Konstantin Shvachko
>         Assigned To: Konstantin Shvachko
>             Fix For: 0.13.0
>
>         Attachments: DFSUpgradeProposal3.html, FSStateTransition6.htm, FSStateTransitionApr03.patch, Manual-TestCases-HdfsConversion.txt, TestPlan-HdfsUpgrade.html, TestPlan-HdfsUpgrade.html, TestPlan-HdfsUpgrade.html


[jira] Updated: (HADOOP-702) DFS Upgrade Proposal

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
     [ http://issues.apache.org/jira/browse/HADOOP-702?page=all ]

Konstantin Shvachko updated HADOOP-702:
---------------------------------------

    Attachment: DFSUpgradeProposal.html

> DFS Upgrade Proposal
> --------------------
>
>                 Key: HADOOP-702
>                 URL: http://issues.apache.org/jira/browse/HADOOP-702
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Konstantin Shvachko
>         Attachments: DFSUpgradeProposal.html

[jira] Updated: (HADOOP-702) DFS Upgrade Proposal

Posted by "Nigel Daley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nigel Daley updated HADOOP-702:
-------------------------------

    Attachment: TestPlan-HdfsUpgrade.html

> DFS Upgrade Proposal
> --------------------
>
>                 Key: HADOOP-702
>                 URL: https://issues.apache.org/jira/browse/HADOOP-702
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Konstantin Shvachko
>         Assigned To: Konstantin Shvachko
>         Attachments: DFSUpgradeProposal.html, DFSUpgradeProposal2.html, DFSUpgradeProposal3.html, FSStateTransition.html, TestPlan-HdfsUpgrade.html, TestPlan-HdfsUpgrade.html

[jira] Commented: (HADOOP-702) DFS Upgrade Proposal

Posted by "Yoram Arnon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12467224 ] 

Yoram Arnon commented on HADOOP-702:
------------------------------------

Losing the capability for multiple snapshots is regrettable, but if the snapshots aren't maintained and are just left there to rot, then perhaps it's not such a bad thing. A full snapshot solution will need to wait until it's done, well, fully.

Not being able to restart just the name node with an upgraded version is regrettable too, since we've seen cases where a tiny namenode bug is fixed and it's *much* simpler to update just one node than to update the entire cluster. Can that limitation be relaxed?

Datanodes automatically "catching up" on a missed previous upgrade/discard seems like a good thing, doesn't it?

The benchmark - was it executed on a machine with a single disk or several? How fast can links be created (and deleted) on a machine with several disks?

> DFS Upgrade Proposal
> --------------------
>
>                 Key: HADOOP-702
>                 URL: https://issues.apache.org/jira/browse/HADOOP-702
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Konstantin Shvachko
>         Assigned To: Konstantin Shvachko
>         Attachments: DFSUpgradeProposal.html, DFSUpgradeProposal2.html, DFSUpgradeProposal3.html, FSStateTransition.html, TestPlan-HdfsUpgrade.html, TestPlan-HdfsUpgrade.html


[jira] Commented: (HADOOP-702) DFS Upgrade Proposal

Posted by "Yoram Arnon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12467727 ] 

Yoram Arnon commented on HADOOP-702:
------------------------------------

The last comment leads me to think about online upgrades. While this is a ways off, at some point we'll want to upgrade the dfs gradually, without bringing it down, especially for minor changes. I envision upgrading the namenode, which is backwards compatible with the previous version of the datanodes, and having the datanodes upgrade gradually later.
That would require allowing a version mismatch between namenode and datanodes.

> DFS Upgrade Proposal
> --------------------
>
>                 Key: HADOOP-702
>                 URL: https://issues.apache.org/jira/browse/HADOOP-702
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Konstantin Shvachko
>         Assigned To: Konstantin Shvachko
>         Attachments: DFSUpgradeProposal.html, DFSUpgradeProposal2.html, DFSUpgradeProposal3.html, FSStateTransition.html, TestPlan-HdfsUpgrade.html, TestPlan-HdfsUpgrade.html


[jira] Updated: (HADOOP-702) DFS Upgrade Proposal

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Konstantin Shvachko updated HADOOP-702:
---------------------------------------

    Attachment: FSStateTransition.html
                DFSUpgradeProposal3.html

> DFS Upgrade Proposal
> --------------------
>
>                 Key: HADOOP-702
>                 URL: https://issues.apache.org/jira/browse/HADOOP-702
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Konstantin Shvachko
>         Assigned To: Sameer Paranjpye
>         Attachments: DFSUpgradeProposal.html, DFSUpgradeProposal2.html, DFSUpgradeProposal3.html, FSStateTransition.html, TestPlan-HdfsUpgrade.html

[jira] Assigned: (HADOOP-702) DFS Upgrade Proposal

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Konstantin Shvachko reassigned HADOOP-702:
------------------------------------------

    Assignee: Konstantin Shvachko  (was: Sameer Paranjpye)

> DFS Upgrade Proposal
> --------------------
>
>                 Key: HADOOP-702
>                 URL: https://issues.apache.org/jira/browse/HADOOP-702
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Konstantin Shvachko
>         Assigned To: Konstantin Shvachko
>         Attachments: DFSUpgradeProposal.html, DFSUpgradeProposal2.html, DFSUpgradeProposal3.html, FSStateTransition.html, TestPlan-HdfsUpgrade.html

[jira] Commented: (HADOOP-702) DFS Upgrade Proposal

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12486632 ] 

Hadoop QA commented on HADOOP-702:
----------------------------------

Integrated in Hadoop-Nightly #47 (See http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Nightly/47/)

> DFS Upgrade Proposal
> --------------------
>
>                 Key: HADOOP-702
>                 URL: https://issues.apache.org/jira/browse/HADOOP-702
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Konstantin Shvachko
>         Assigned To: Konstantin Shvachko
>             Fix For: 0.13.0
>
>         Attachments: DFSUpgradeProposal3.html, FSStateTransition6.htm, FSStateTransitionApr03.patch, TestPlan-HdfsUpgrade.html, TestPlan-HdfsUpgrade.html, TestPlan-HdfsUpgrade.html


[jira] Updated: (HADOOP-702) DFS Upgrade Proposal

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Konstantin Shvachko updated HADOOP-702:
---------------------------------------

    Attachment: FSStateTransitionApr03.patch

Multiple patches have been committed that substantially improved junit tests.
With those patches the upgrade tests now run for about 3 minutes total, so the "Nightly" target is no longer necessary.
Sleeps in MiniDFSCluster or upgrade unit tests are not required any more.


> DFS Upgrade Proposal
> --------------------
>
>                 Key: HADOOP-702
>                 URL: https://issues.apache.org/jira/browse/HADOOP-702
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Konstantin Shvachko
>         Assigned To: Konstantin Shvachko
>             Fix For: 0.13.0
>
>         Attachments: DFSUpgradeProposal3.html, FSStateTransition6.htm, FSStateTransitionApr03.patch, TestPlan-HdfsUpgrade.html, TestPlan-HdfsUpgrade.html, TestPlan-HdfsUpgrade.html


[jira] Updated: (HADOOP-702) DFS Upgrade Proposal

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Konstantin Shvachko updated HADOOP-702:
---------------------------------------

    Attachment:     (was: FSStateTransition5.htm)

> DFS Upgrade Proposal
> --------------------
>
>                 Key: HADOOP-702
>                 URL: https://issues.apache.org/jira/browse/HADOOP-702
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Konstantin Shvachko
>         Assigned To: Konstantin Shvachko
>             Fix For: 0.13.0
>
>         Attachments: DFSUpgradeProposal3.html, FSStateTransition.patch, FSStateTransition6.htm, TestPlan-HdfsUpgrade.html, TestPlan-HdfsUpgrade.html, TestPlan-HdfsUpgrade.html