Posted to hdfs-dev@hadoop.apache.org by Suresh Srinivas <su...@hortonworks.com> on 2012/03/03 03:15:59 UTC

Merge Namenode HA feature to 0.23

The Namenode HA (HDFS-1623) feature is now in trunk (yay!). It adds support for
namenode high availability with active and standby namenodes, with manual
failover. I propose merging this feature to the 0.23 branch. Recently Nicholas
merged protocol buffers/wire compatibility changes to 0.23. Merging
Namenode HA into 0.23 will make it a significant release.

I plan to complete this by next week after running some nightly/regression
tests. Early next week, I plan to attach a 0.23 merge patch to the HDFS-1623
jira for people interested in reviewing the patch.

Please vote.

Regards,
Suresh

Re: Merge Namenode HA feature to 0.23

Posted by Arun Murthy <ac...@hortonworks.com>.
Awesome! +1!

Sent from my iPhone

On Mar 2, 2012, at 6:16 PM, Suresh Srinivas <su...@hortonworks.com> wrote:

> Namenode HA (HDFS-1623) feature is now in trunk (yay!). It adds support for
> namenode high availability with active and standby namenodes, with manual
> failover. I propose merging this feature to 0.23 branch. Recently Nicholas
> merged protocol buffers/wire compatibility changes to 0.23. Merging
> Namenode HA into 0.23 will make it a significant release.
>
> I plan to complete this by next week after running some nightly/regressions
> tests. Early next week, I plan to attach a 0.23 merge patch to HDFS-1623
> jira for people interested in reviewing the patch.
>
> Please vote.
>
> Regards,
> Suresh

Re: Merge Namenode HA feature to 0.23

Posted by "Aaron T. Myers" <at...@cloudera.com>.
+1, this'll be great. The sooner the better. Please let me know if I can be
of any help.

Thanks a lot for volunteering to do this, Suresh.

--
Aaron T. Myers
Software Engineer, Cloudera



On Fri, Mar 2, 2012 at 6:15 PM, Suresh Srinivas <su...@hortonworks.com> wrote:

> Namenode HA (HDFS-1623) feature is now in trunk (yay!). It adds support for
> namenode high availability with active and standby namenodes, with manual
> failover. I propose merging this feature to 0.23 branch. Recently Nicholas
> merged protocol buffers/wire compatibility changes to 0.23. Merging
> Namenode HA into 0.23 will make it a significant release.
>
> I plan to complete this by next week after running some nightly/regressions
> tests. Early next week, I plan to attach a 0.23 merge patch to HDFS-1623
> jira for people interested in reviewing the patch.
>
> Please vote.
>
> Regards,
> Suresh
>

Re: Merge Namenode HA feature to 0.23

Posted by Eli Collins <el...@cloudera.com>.
+1.  Thanks Suresh, those lgtm


On Wednesday, March 7, 2012, Suresh Srinivas <su...@hortonworks.com> wrote:
> Thanks Eli for the list. I have merged the bugs you identified to 0.23. I
> am in the process of merging the HA change. I have several conflicts, stemming
> from changes that are not in 0.23. Pushing the following jiras to 0.23 will
> make the merge easier. I believe these are mostly straightforward changes:
>
> HADOOP-7557. Make IPC header be extensible.
> HDFS-3003. Remove getHostPortString() from NameNode, replace it with
> NetUtils.getHostPortString().
> HADOOP-8108. Move method getHostPortString() from NameNode to NetUtils.
> HDFS-2764. TestBackupNode is racy.
> HDFS-2430. The number of failed or low-resource volumes the NN can tolerate
> should be configurable.
> HADOOP-7358. Improve log levels when exceptions caught in RPC handler.
> HDFS-2410. Further cleanup of hardcoded configuration keys and values.
> HDFS-2285. BackupNode should reject requests to modify namespace.
> HADOOP-7729. Send back valid HTTP response if user hits IPC port with HTTP
> GET.
> HADOOP-7717. Move handling of concurrent client fail-overs to
> RetryInvocationHandler
>
> If no one gets back to me in a day, I will start merging these to 0.23.
>
> On Sun, Mar 4, 2012 at 9:43 AM, Eli Collins <el...@cloudera.com> wrote:
>
>> Hey Suresh,
>>
>> +1  Sounds great. Thanks for volunteering!
>>
>> You'll probably want to merge HDFS-1580,  HDFS-1765, HDFS-2158,
>> HDFS-2188, HDFS-2334, HDFS-2476, HDFS-2477, and HDFS-2495 to branch-23
>> first as these conflict and the patch will contain a bunch of non-HA
>> stuff.
>>
>> Thanks,
>> Eli
>>
>> On Fri, Mar 2, 2012 at 6:15 PM, Suresh Srinivas <su...@hortonworks.com> wrote:
>> > Namenode HA (HDFS-1623) feature is now in trunk (yay!). It adds support for
>> > namenode high availability with active and standby namenodes, with manual
>> > failover. I propose merging this feature to 0.23 branch. Recently Nicholas
>> > merged protocol buffers/wire compatibility changes to 0.23. Merging
>> > Namenode HA into 0.23 will make it a significant release.
>> >
>> > I plan to complete this by next week after running some nightly/regressions
>> > tests. Early next week, I plan to attach a 0.23 merge patch to HDFS-1623
>> > jira for people interested in reviewing the patch.
>> >
>> > Please vote.
>> >
>> > Regards,
>> > Suresh
>>
>

Re: Merge Namenode HA feature to 0.23

Posted by "Aaron T. Myers" <at...@cloudera.com>.
Likewise, +1 to merging all of those. HADOOP-7717 and HDFS-2430 are hard
requirements for the HA work, and the rest of those seem like strict
improvements to me. If they make the merge easier, then by all means do it.

I don't think you need to wait a day, either. I doubt anyone will object to
merging these to 0.23.

Thanks a lot, Suresh.

--
Aaron T. Myers
Software Engineer, Cloudera



On Wed, Mar 7, 2012 at 1:12 AM, Suresh Srinivas <su...@hortonworks.com> wrote:

> Thanks Eli for the list. I have merged the bugs you identified to 0.23. I
> am in the process of merging the HA change. I have several conflicts, stemming
> from changes that are not in 0.23. Pushing the following jiras to 0.23 will
> make the merge easier. I believe these are mostly straightforward changes:
>
> HADOOP-7557. Make IPC header be extensible.
> HDFS-3003. Remove getHostPortString() from NameNode, replace it with
> NetUtils.getHostPortString().
> HADOOP-8108. Move method getHostPortString() from NameNode to NetUtils.
> HDFS-2764. TestBackupNode is racy.
> HDFS-2430. The number of failed or low-resource volumes the NN can tolerate
> should be configurable.
> HADOOP-7358. Improve log levels when exceptions caught in RPC handler.
> HDFS-2410. Further cleanup of hardcoded configuration keys and values.
> HDFS-2285. BackupNode should reject requests to modify namespace.
> HADOOP-7729. Send back valid HTTP response if user hits IPC port with HTTP
> GET.
> HADOOP-7717. Move handling of concurrent client fail-overs to
> RetryInvocationHandler
>
> If no one gets back to me in a day, I will start merging these to 0.23.
>
> On Sun, Mar 4, 2012 at 9:43 AM, Eli Collins <el...@cloudera.com> wrote:
>
> > Hey Suresh,
> >
> > +1  Sounds great. Thanks for volunteering!
> >
> > You'll probably want to merge HDFS-1580,  HDFS-1765, HDFS-2158,
> > HDFS-2188, HDFS-2334, HDFS-2476, HDFS-2477, and HDFS-2495 to branch-23
> > first as these conflict and the patch will contain a bunch of non-HA
> > stuff.
> >
> > Thanks,
> > Eli
> >
> > On Fri, Mar 2, 2012 at 6:15 PM, Suresh Srinivas <su...@hortonworks.com> wrote:
> > > Namenode HA (HDFS-1623) feature is now in trunk (yay!). It adds support for
> > > namenode high availability with active and standby namenodes, with manual
> > > failover. I propose merging this feature to 0.23 branch. Recently Nicholas
> > > merged protocol buffers/wire compatibility changes to 0.23. Merging
> > > Namenode HA into 0.23 will make it a significant release.
> > >
> > > I plan to complete this by next week after running some nightly/regressions
> > > tests. Early next week, I plan to attach a 0.23 merge patch to HDFS-1623
> > > jira for people interested in reviewing the patch.
> > >
> > > Please vote.
> > >
> > > Regards,
> > > Suresh
> >
>

Re: Merge Namenode HA feature to 0.23

Posted by Eli Collins <el...@cloudera.com>.
Hey Suresh,

Forgot to ask, when do you plan to check this into branch-23?  Will
delay any merges that might conflict with this until you've committed
it.

Thanks,
Eli

On Fri, Mar 9, 2012 at 10:08 AM, Eli Collins <el...@cloudera.com> wrote:
> Hey Suresh,
>
> Went through the patch, looks good to me. Put a +1 on jira.
>
> Thanks,
> Eli
>
> On Wed, Mar 7, 2012 at 9:26 PM, Suresh Srinivas <su...@hortonworks.com> wrote:
>> I have merged the change required for merging Namenode HA. I have also
>> attached a release 23 patch in the jira HDFS-1623. Please take a look at the
>> attached patch and let me know if that looks good.
>>
>> Regards,
>> Suresh

Re: Merge Namenode HA feature to 0.23

Posted by Eli Collins <el...@cloudera.com>.
Hey Suresh,

Went through the patch, looks good to me. Put a +1 on jira.

Thanks,
Eli

On Wed, Mar 7, 2012 at 9:26 PM, Suresh Srinivas <su...@hortonworks.com> wrote:
> I have merged the change required for merging Namenode HA. I have also
> attached a release 23 patch in the jira HDFS-1623. Please take a look at the
> attached patch and let me know if that looks good.
>
> Regards,
> Suresh

Re: Merge Namenode HA feature to 0.23

Posted by "Aaron T. Myers" <at...@cloudera.com>.
+1

I applied the patch to branch-0.23. It compiles just fine. I built a
distribution tar, deployed it to a 4-node cluster, and ran some smoke tests
with HA enabled. All seemed good.

I also ran the following unit tests, which should exercise the relevant HA
code:

TestOfflineEditsViewer, TestHDFSConcat, TestEditLogRace,
TestNameEditsConfigs, TestSaveNamespace, TestEditLogFileOutputStream,
TestFileJournalManager, TestCheckpoint, TestEditLog, TestFSEditLogLoader,
TestFsLimits, TestSecurityTokenEditLog, TestStorageRestore, TestBackupNode,
TestEditLogJournalFailures, TestEditLogTailer, TestEditLogsDuringFailover,
TestFailureToReadEdits, TestHASafeMode, TestHAStateTransitions,
TestFailureOfSharedDir, TestDNFencing, TestStandbyIsHot,
TestGenericJournalConf, TestCheckPointForSecurityTokens,
TestNNStorageRetentionManager, TestPersistBlocks, TestPBHelper,
TestNNLeaseRecovery

All of these passed except for TestOfflineEditsViewer and
TestPersistBlocks. Those two failed because the patch doesn't include
changes to a few binary files which the tests rely on. Assuming that when
you merge to branch-0.23 you do an actual svn merge, and don't just apply
the patch, these won't be a problem.
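
For anyone who wants to repeat a targeted run like the one described above,
here is a minimal sketch. It assumes the branch is built with Maven and that
tests are selected through the Surefire -Dtest property, which accepts a
comma-separated list of class names. The message does not say how the tests
were actually invoked, so the command shape and the helper name
surefire_command are assumptions, and only a few of the listed tests are
shown.

```python
# Hypothetical helper: builds a Maven command line that runs only the
# named test classes. Not taken from the thread; illustration only.

# A small subset of the test classes named in the message above.
HA_TESTS = [
    "TestEditLogTailer",
    "TestHASafeMode",
    "TestHAStateTransitions",
    "TestStandbyIsHot",
]


def surefire_command(tests):
    """Return the argument list for a Maven run limited to `tests`."""
    if not tests:
        raise ValueError("at least one test class name is required")
    # Surefire's -Dtest property takes a comma-separated list of classes.
    return ["mvn", "test", "-Dtest=" + ",".join(tests)]


if __name__ == "__main__":
    # Print the command instead of executing it, so the sketch has no
    # side effects and can be inspected before use.
    print(" ".join(surefire_command(HA_TESTS)))
```

The sketch only prints the command; running it against a real checkout is
left to the reader, since the module layout of the branch is not described
in this thread.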

--
Aaron T. Myers
Software Engineer, Cloudera



On Wed, Mar 7, 2012 at 9:26 PM, Suresh Srinivas <su...@hortonworks.com> wrote:

> I have merged the change required for merging Namenode HA. I have also
> attached a release 23 patch in the jira HDFS-1623. Please take a look at the
> attached patch and let me know if that looks good.
>
> Regards,
> Suresh
>

Re: Merge Namenode HA feature to 0.23

Posted by Suresh Srinivas <su...@hortonworks.com>.
I have merged the change required for merging Namenode HA. I have also
attached a release 23 patch in the jira HDFS-1623. Please take a look at the
attached patch and let me know if that looks good.

Regards,
Suresh

Re: Merge Namenode HA feature to 0.23

Posted by Todd Lipcon <to...@cloudera.com>.
+1 on merging all of those

On Wed, Mar 7, 2012 at 1:12 AM, Suresh Srinivas <su...@hortonworks.com> wrote:
> Thanks Eli for the list. I have merged the bugs you identified to 0.23. I
> am in the process of merging the HA change. I have several conflicts, stemming
> from changes that are not in 0.23. Pushing the following jiras to 0.23 will
> make the merge easier. I believe these are mostly straightforward changes:
>
> HADOOP-7557. Make IPC header be extensible.
> HDFS-3003. Remove getHostPortString() from NameNode, replace it with
> NetUtils.getHostPortString().
> HADOOP-8108. Move method getHostPortString() from NameNode to NetUtils.
> HDFS-2764. TestBackupNode is racy.
> HDFS-2430. The number of failed or low-resource volumes the NN can tolerate
> should be configurable.
> HADOOP-7358. Improve log levels when exceptions caught in RPC handler.
> HDFS-2410. Further cleanup of hardcoded configuration keys and values.
> HDFS-2285. BackupNode should reject requests to modify namespace.
> HADOOP-7729. Send back valid HTTP response if user hits IPC port with HTTP
> GET.
> HADOOP-7717. Move handling of concurrent client fail-overs to
> RetryInvocationHandler
>
> If no one gets back to me in a day, I will start merging these to 0.23.
>
> On Sun, Mar 4, 2012 at 9:43 AM, Eli Collins <el...@cloudera.com> wrote:
>
>> Hey Suresh,
>>
>> +1  Sounds great. Thanks for volunteering!
>>
>> You'll probably want to merge HDFS-1580,  HDFS-1765, HDFS-2158,
>> HDFS-2188, HDFS-2334, HDFS-2476, HDFS-2477, and HDFS-2495 to branch-23
>> first as these conflict and the patch will contain a bunch of non-HA
>> stuff.
>>
>> Thanks,
>> Eli
>>
>> On Fri, Mar 2, 2012 at 6:15 PM, Suresh Srinivas <su...@hortonworks.com> wrote:
>> > Namenode HA (HDFS-1623) feature is now in trunk (yay!). It adds support for
>> > namenode high availability with active and standby namenodes, with manual
>> > failover. I propose merging this feature to 0.23 branch. Recently Nicholas
>> > merged protocol buffers/wire compatibility changes to 0.23. Merging
>> > Namenode HA into 0.23 will make it a significant release.
>> >
>> > I plan to complete this by next week after running some nightly/regressions
>> > tests. Early next week, I plan to attach a 0.23 merge patch to HDFS-1623
>> > jira for people interested in reviewing the patch.
>> >
>> > Please vote.
>> >
>> > Regards,
>> > Suresh
>>



-- 
Todd Lipcon
Software Engineer, Cloudera

Re: Merge Namenode HA feature to 0.23

Posted by Suresh Srinivas <su...@hortonworks.com>.
Thanks Eli for the list. I have merged the bugs you identified to 0.23. I
am in the process of merging the HA change. I have several conflicts, stemming
from changes that are not in 0.23. Pushing the following jiras to 0.23 will
make the merge easier. I believe these are mostly straightforward changes:

HADOOP-7557. Make IPC header be extensible.
HDFS-3003. Remove getHostPortString() from NameNode, replace it with
NetUtils.getHostPortString().
HADOOP-8108. Move method getHostPortString() from NameNode to NetUtils.
HDFS-2764. TestBackupNode is racy.
HDFS-2430. The number of failed or low-resource volumes the NN can tolerate
should be configurable.
HADOOP-7358. Improve log levels when exceptions caught in RPC handler.
HDFS-2410. Further cleanup of hardcoded configuration keys and values.
HDFS-2285. BackupNode should reject requests to modify namespace.
HADOOP-7729. Send back valid HTTP response if user hits IPC port with HTTP
GET.
HADOOP-7717. Move handling of concurrent client fail-overs to
RetryInvocationHandler

If no one gets back to me in a day, I will start merging these to 0.23.

On Sun, Mar 4, 2012 at 9:43 AM, Eli Collins <el...@cloudera.com> wrote:

> Hey Suresh,
>
> +1  Sounds great. Thanks for volunteering!
>
> You'll probably want to merge HDFS-1580,  HDFS-1765, HDFS-2158,
> HDFS-2188, HDFS-2334, HDFS-2476, HDFS-2477, and HDFS-2495 to branch-23
> first as these conflict and the patch will contain a bunch of non-HA
> stuff.
>
> Thanks,
> Eli
>
> On Fri, Mar 2, 2012 at 6:15 PM, Suresh Srinivas <su...@hortonworks.com> wrote:
> > Namenode HA (HDFS-1623) feature is now in trunk (yay!). It adds support for
> > namenode high availability with active and standby namenodes, with manual
> > failover. I propose merging this feature to 0.23 branch. Recently Nicholas
> > merged protocol buffers/wire compatibility changes to 0.23. Merging
> > Namenode HA into 0.23 will make it a significant release.
> >
> > I plan to complete this by next week after running some nightly/regressions
> > tests. Early next week, I plan to attach a 0.23 merge patch to HDFS-1623
> > jira for people interested in reviewing the patch.
> >
> > Please vote.
> >
> > Regards,
> > Suresh
>

Re: Merge Namenode HA feature to 0.23

Posted by Eli Collins <el...@cloudera.com>.
Hey Suresh,

+1  Sounds great. Thanks for volunteering!

You'll probably want to merge HDFS-1580,  HDFS-1765, HDFS-2158,
HDFS-2188, HDFS-2334, HDFS-2476, HDFS-2477, and HDFS-2495 to branch-23
first as these conflict and the patch will contain a bunch of non-HA
stuff.

Thanks,
Eli

On Fri, Mar 2, 2012 at 6:15 PM, Suresh Srinivas <su...@hortonworks.com> wrote:
> Namenode HA (HDFS-1623) feature is now in trunk (yay!). It adds support for
> namenode high availability with active and standby namenodes, with manual
> failover. I propose merging this feature to 0.23 branch. Recently Nicholas
> merged protocol buffers/wire compatibility changes to 0.23. Merging
> Namenode HA into 0.23 will make it a significant release.
>
> I plan to complete this by next week after running some nightly/regressions
> tests. Early next week, I plan to attach a 0.23 merge patch to HDFS-1623
> jira for people interested in reviewing the patch.
>
> Please vote.
>
> Regards,
> Suresh