Posted to hdfs-dev@hadoop.apache.org by Suresh Srinivas <su...@yahoo-inc.com> on 2011/04/22 18:48:30 UTC

[Discuss] Merge federation branch HDFS-1052 into trunk

A few weeks ago, I sent an email about the progress of HDFS federation development in the HDFS-1052 branch. I am happy to announce that all the tasks related to this feature development are complete and the branch is ready to be integrated into trunk.

I have a merge patch attached to the HDFS-1052 jira. All Hudson tests pass except for two test failures; we will fix these unit test failures in trunk, post merge. I plan on completing the merge to trunk early next week. I would like to do this ASAP to avoid having to keep the patch up to date (which has been time consuming). This also avoids the need for re-merging due to the SVN changes proposed by Nigel, scheduled for late next week. Comments are welcome.

Regards,
Suresh

Re: [Discuss] Merge federation branch HDFS-1052 into trunk

Posted by Sanjay Radia <sr...@yahoo-inc.com>.
On Apr 26, 2011, at 10:40 PM, Konstantin Boudnik wrote:

> Oops, the message came out garbled. I meant to say
>
> I assume the outlined changes won't prevent an earlier version of  
> HDFS from
> upgrades to the federation version, right?


Yes, absolutely. We have tested upgrades.
Besides our ops will throw us out of the window if we even hint that  
there isn't an
automatic upgrade for the next release :-)

sanjay
>
> Thanks in advance,
>  Cos
>
> On Tue, Apr 26, 2011 at 17:59, Konstantin Boudnik <co...@apache.org>  
> wrote:
>> Sanjay,
>>
>> I assume the outlined changes won't an earlier version of HDFS from
>> upgrads to the federation version, right?
>>
>> Cos
>>
>> On Tue, Apr 26, 2011 at 17:26, Sanjay Radia <sr...@yahoo-inc.com>  
>> wrote:
>>>
>>> Changes to the code base
>>>  - The fundamental code change is to extend the notion of block id  
>>> to now
>>> include a block pool id.
>>> - The  NN had little change, the protocols did change to include  
>>> the block
>>> pool id.
>>> - The DN code did change. Each data structure is now indexed by  
>>> the block
>>> pool id -- while this is a code change, it is architecturally very  
>>> simple
>>> and low risk.
>>> - We also did a fair amount of cleanup of threads used to send  
>>> block reports
>>> - while it was not strictly necessary to do the cleanup we took  
>>> the extra
>>> effort to pay the technical debt. As Dhruba recently noted, adding  
>>> support
>>> to send block reports to primary and secondary NN for HA will be  
>>> now much
>>> easier to do.
>>


Re: [Discuss] Merge federation branch HDFS-1052 into trunk

Posted by suresh srinivas <sr...@gmail.com>.
Upgrades from earlier versions are supported. The existing configuration
should run without any change.

On Tue, Apr 26, 2011 at 10:40 PM, Konstantin Boudnik <co...@apache.org> wrote:

> Oops, the message came out garbled. I meant to say
>
> I assume the outlined changes won't prevent an earlier version of HDFS from
> upgrades to the federation version, right?
>
> Thanks in advance,
>  Cos
>
>

Re: [Discuss] Merge federation branch HDFS-1052 into trunk

Posted by Konstantin Boudnik <co...@apache.org>.
Oops, the message came out garbled. I meant to say

I assume the outlined changes won't prevent an earlier version of HDFS from
upgrades to the federation version, right?

Thanks in advance,
  Cos

On Tue, Apr 26, 2011 at 17:59, Konstantin Boudnik <co...@apache.org> wrote:
> Sanjay,
>
> I assume the outlined changes won't an earlier version of HDFS from
> upgrads to the federation version, right?
>
> Cos
>
> On Tue, Apr 26, 2011 at 17:26, Sanjay Radia <sr...@yahoo-inc.com> wrote:
>>
>> Changes to the code base
>>  - The fundamental code change is to extend the notion of block id to now
>> include a block pool id.
>> - The  NN had little change, the protocols did change to include the block
>> pool id.
>> - The DN code did change. Each data structure is now indexed by the block
>> pool id -- while this is a code change, it is architecturally very simple
>> and low risk.
>> - We also did a fair amount of cleanup of threads used to send block reports
>> - while it was not strictly necessary to do the cleanup we took the extra
>> effort to pay the technical debt. As Dhruba recently noted, adding support
>> to send block reports to primary and secondary NN for HA will be now much
>> easier to do.
>

Re: [Discuss] Merge federation branch HDFS-1052 into trunk

Posted by Konstantin Shvachko <sh...@gmail.com>.
Dhruba,

It would be very valuable for the community to share your experience
if you performed any independent testing of the federation branch.

Thanks,
--Konstantin

On Tue, Apr 26, 2011 at 9:27 PM, Dhruba Borthakur <dh...@gmail.com> wrote:

> I feel that making the datanode talk to multiple namenodes is very
> valuable,
> especially when there is plenty of storage available on a single datanode
> machine (think 24 TB to 36 TB) and a single namenode does not have enough
> memory to hold all file metadata for such a large cluster in memory.
>
> This is a feature that we are in dire need of, and could put it to good use
> starting "yesterday"!
>
> thanks,
> dhruba
>
> On Tue, Apr 26, 2011 at 5:59 PM, Konstantin Boudnik <co...@apache.org>
> wrote:
>
> > Sanjay,
> >
> > I assume the outlined changes won't an earlier version of HDFS from
> > upgrads to the federation version, right?
> >
> > Cos
> >
> > On Tue, Apr 26, 2011 at 17:26, Sanjay Radia <sr...@yahoo-inc.com>
> wrote:
> > >
> > > Changes to the code base
> > >  - The fundamental code change is to extend the notion of block id to
> now
> > > include a block pool id.
> > > - The  NN had little change, the protocols did change to include the
> > block
> > > pool id.
> > > - The DN code did change. Each data structure is now indexed by the
> block
> > > pool id -- while this is a code change, it is architecturally very
> simple
> > > and low risk.
> > > - We also did a fair amount of cleanup of threads used to send block
> > reports
> > > - while it was not strictly necessary to do the cleanup we took the
> extra
> > > effort to pay the technical debt. As Dhruba recently noted, adding
> > support
> > > to send block reports to primary and secondary NN for HA will be now
> much
> > > easier to do.
> >
>
>
>
> --
> Connect to me at http://www.facebook.com/dhruba
>

Re: [Discuss] Merge federation branch HDFS-1052 into trunk

Posted by "Tsz Wo (Nicholas), Sze" <s2...@yahoo.com>.
Agree. It is a step toward a distributed namespace.

Regards,
Nicholas





________________________________
From: Dhruba Borthakur <dh...@gmail.com>
To: hdfs-dev@hadoop.apache.org
Cc: sradia@yahoo-inc.com; Doug Cutting <cu...@apache.org>
Sent: Wed, April 27, 2011 12:27:30 AM
Subject: Re: [Discuss] Merge federation branch HDFS-1052 into trunk

I feel that making the datanode talk to multiple namenodes is very valuable,
especially when there is plenty of storage available on a single datanode
machine (think 24 TB to 36 TB) and a single namenode does not have enough
memory to hold all file metadata for such a large cluster in memory.

This is a feature that we are in dire need of, and could put it to good use
starting "yesterday"!

thanks,
dhruba

On Tue, Apr 26, 2011 at 5:59 PM, Konstantin Boudnik <co...@apache.org> wrote:

> Sanjay,
>
> I assume the outlined changes won't an earlier version of HDFS from
> upgrads to the federation version, right?
>
> Cos
>
> On Tue, Apr 26, 2011 at 17:26, Sanjay Radia <sr...@yahoo-inc.com> wrote:
> >
> > Changes to the code base
> >  - The fundamental code change is to extend the notion of block id to now
> > include a block pool id.
> > - The  NN had little change, the protocols did change to include the
> block
> > pool id.
> > - The DN code did change. Each data structure is now indexed by the block
> > pool id -- while this is a code change, it is architecturally very simple
> > and low risk.
> > - We also did a fair amount of cleanup of threads used to send block
> reports
> > - while it was not strictly necessary to do the cleanup we took the extra
> > effort to pay the technical debt. As Dhruba recently noted, adding
> support
> > to send block reports to primary and secondary NN for HA will be now much
> > easier to do.
>



-- 
Connect to me at http://www.facebook.com/dhruba

Re: [Discuss] Merge federation branch HDFS-1052 into trunk

Posted by Dhruba Borthakur <dh...@gmail.com>.
I feel that making the datanode talk to multiple namenodes is very valuable,
especially when there is plenty of storage available on a single datanode
machine (think 24 TB to 36 TB) and a single namenode does not have enough
memory to hold all file metadata for such a large cluster in memory.

This is a feature that we are in dire need of, and could put it to good use
starting "yesterday"!

thanks,
dhruba

On Tue, Apr 26, 2011 at 5:59 PM, Konstantin Boudnik <co...@apache.org> wrote:

> Sanjay,
>
> I assume the outlined changes won't an earlier version of HDFS from
> upgrads to the federation version, right?
>
> Cos
>
> On Tue, Apr 26, 2011 at 17:26, Sanjay Radia <sr...@yahoo-inc.com> wrote:
> >
> > Changes to the code base
> >  - The fundamental code change is to extend the notion of block id to now
> > include a block pool id.
> > - The  NN had little change, the protocols did change to include the
> block
> > pool id.
> > - The DN code did change. Each data structure is now indexed by the block
> > pool id -- while this is a code change, it is architecturally very simple
> > and low risk.
> > - We also did a fair amount of cleanup of threads used to send block
> reports
> > - while it was not strictly necessary to do the cleanup we took the extra
> > effort to pay the technical debt. As Dhruba recently noted, adding
> support
> > to send block reports to primary and secondary NN for HA will be now much
> > easier to do.
>



-- 
Connect to me at http://www.facebook.com/dhruba

Re: [Discuss] Merge federation branch HDFS-1052 into trunk

Posted by Konstantin Boudnik <co...@apache.org>.
Sanjay,

I assume the outlined changes won't an earlier version of HDFS from
upgrads to the federation version, right?

Cos

On Tue, Apr 26, 2011 at 17:26, Sanjay Radia <sr...@yahoo-inc.com> wrote:
>
> Changes to the code base
>  - The fundamental code change is to extend the notion of block id to now
> include a block pool id.
> - The  NN had little change, the protocols did change to include the block
> pool id.
> - The DN code did change. Each data structure is now indexed by the block
> pool id -- while this is a code change, it is architecturally very simple
> and low risk.
> - We also did a fair amount of cleanup of threads used to send block reports
> - while it was not strictly necessary to do the cleanup we took the extra
> effort to pay the technical debt. As Dhruba recently noted, adding support
> to send block reports to primary and secondary NN for HA will be now much
> easier to do.

Re: [Discuss] Merge federation branch HDFS-1052 into trunk

Posted by Sanjay Radia <sr...@yahoo-inc.com>.
On Apr 25, 2011, at 2:36 PM, Doug Cutting wrote:
>
> A couple of questions:
>
> 1. Can you please describe the significant advantages this approach  
> has
> over a symlink-based approach?
> It seems to me that one could run multiple namenodes on separate boxes
> and run multiple datanode processes per storage box configured with
> something like:
> .......

Doug,

There are two separate issues;  your email seems to suggest that these  
are joined.
(1) creating (or not ) a unified namespace
(2) sharing the storage and the block storage layer across NameNodes -  
the architecture document covers this layering in great detail.
This separation reflects the architecture of HDFS (derived from GFS),
where the namespace layer is separate from the block storage layer
(although the HDFS implementation violates the layering in many places).


HDFS-1052 deals with (2) - allowing multiple NameNodes to share the  
block storage layer.

As far as (1), creating a unified namespace,  federation does NOT  
dictate how you create a unified namespace or whether you even create  
a unified namespace in the first place. Indeed you may want to share  
the physical storage but want independent namespaces. For example, you  
may want to run a private namespace for HBase files within the same  
Hadoop cluster. Two different tenants sharing a cluster may choose to  
have their independent namespaces for isolation.

Of course, in many situations one wants to create a unified namespace.
One could create a unified namespace using symbolic links, as you
suggest. The federation work has also added client-side mount tables
(HDFS-1053), implemented as a FileSystem and an AbstractFileSystem.
They offer advantages over symbolic links, but this is separable and
you can use symbolic links if you like. HDFS-1053 (client-side mount
tables) makes no changes to any existing file system.

Now getting to (2),  sharing the physical storage and the block  
storage layer.
The approach you describe (running multiple DNs on the same machine,
essentially multiple superimposed HDFS clusters) is the most common
reaction to this work, and one which we also explored.
Unfortunately this approach runs into several issues, and when you
start exploring the details you realize that it is essentially a hack:
- Extra processes: running a DN per namenode on the same machine takes
precious memory away from MR tasks.
- Independent pools of threads for each DN.
- Not being able to schedule disk operations across multiple DNs.
- Not being able to provide a unified view of balancing or
decommissioning. For example, one could run multiple balancers, but
this gives you less control of the bandwidth used for balancing.
- The disk-fail-in-place work and the
balance-disks-on-introducing-a-new-disk work would become more
complicated to coordinate across DNs.
- Federation allows the cluster to be managed as a unit rather than as
a bunch of overlapping HDFS clusters. Overlapping HDFS clusters would
be operationally taxing.

On the other hand, the new architecture generalizes the block storage
layer and allows us to evolve it to address new needs. For example, it
will allow us to address issues like offering tmp storage for
intermediate MR output: one can allocate a block pool for MR tmp
storage on each DN. HBase could also use the block storage layer
directly without going through a namenode.
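The sharing of datanodes across namenodes can be sketched as a single
configuration consumed by every datanode. The property keys below
resemble what the federation work introduced; treat the exact names,
hosts, and ports as illustrative assumptions:

```xml
<!-- Illustrative hdfs-site.xml sketch of a federated deployment: one
     set of datanodes serving two namenodes. Key names and hosts are
     assumptions for illustration only. -->
<configuration>
  <property>
    <name>dfs.nameservices</name>
    <value>ns1,ns2</value>
  </property>
  <!-- Each datanode reads this one config, registers with both
       namenodes, and keeps a separate block pool for each. -->
  <property>
    <name>dfs.namenode.rpc-address.ns1</name>
    <value>nn1.example.com:8020</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.ns2</name>
    <value>nn2.example.com:8020</value>
  </property>
</configuration>
```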

>
> 2. ....  The patch modifies much
> of the logic of Hadoop's central component, upon which the performance
> and reliability of most other components of the ecosystem depend.

Changes to the code base
  - The fundamental code change is to extend the notion of block id to
now include a block pool id.
- The NN had little change; the protocols did change to include the
block pool id.
- The DN code did change. Each data structure is now indexed by the
block pool id -- while this is a code change, it is architecturally
very simple and low risk.
- We also did a fair amount of cleanup of the threads used to send
block reports. While it was not strictly necessary to do the cleanup,
we took the extra effort to pay down the technical debt. As Dhruba
recently noted, adding support to send block reports to primary and
secondary NNs for HA will now be much easier.

The write and read pipelines - which are performance critical - have
NOT changed.
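The extended block id described above can be pictured as a small value
type: the classic numeric block id paired with a block pool id. The
sketch below is illustrative only (the class and field names are
hypothetical, not the actual branch source):

```java
// Illustrative sketch of a block id extended with a block pool id.
// Names are hypothetical; this is not the actual HDFS source.
import java.util.Objects;

public class PoolBlockId {
    private final String blockPoolId; // identifies the owning namenode's pool
    private final long blockId;       // the classic numeric block id

    public PoolBlockId(String blockPoolId, long blockId) {
        this.blockPoolId = blockPoolId;
        this.blockId = blockId;
    }

    // Two blocks are equal only if both the pool id and the numeric id
    // match, so different namenodes can reuse numeric ids without
    // colliding on a shared datanode.
    @Override
    public boolean equals(Object o) {
        if (!(o instanceof PoolBlockId)) return false;
        PoolBlockId other = (PoolBlockId) o;
        return blockId == other.blockId
            && blockPoolId.equals(other.blockPoolId);
    }

    @Override
    public int hashCode() {
        return Objects.hash(blockPoolId, blockId);
    }

    @Override
    public String toString() {
        return blockPoolId + ":blk_" + blockId;
    }
}
```

A datanode data structure keyed by such a value is naturally
partitioned by pool, which is the "indexed by the block pool id"
change mentioned above.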

>  It seems to me that such an invasive change should be well tested  
> before it
> is merged to trunk.  Can you please tell me how this has been tested
> beyond unit tests?


Risk, Quality & Testing
Besides the amount of code change, one has to ask the fundamental
questions: how good is the design, and how is the project managed?
Conceptually, federation is very simple: pools of blocks are owned by
a service (a NN in this case) and the block id is extended by an
identifier called the block-pool id.
First and foremost, we wrote a very extensive architecture document -
more comprehensive than any other document in Hadoop in the past.
This was published very early: version 1 in March 2010 and version 5
in April 2010, based on feedback we received from the community. We
sought and incorporated feedback from HDFS developers outside of
Yahoo.

The project was managed as a separate branch rather than introducing
the code to trunk incrementally.
The branch has also been tested as a separate unit by us - this
ensures that it does not destabilize trunk.

More details on testing:
The same QA process that drove and tested key stable Apache Hadoop
releases (16, 17, 18, 20, 20-security) is being used for testing the
federation feature. We have been running integrated tests with
federation for a few months and continue to do so.
We will not deploy a Hadoop release with the federation feature on
Yahoo clusters until we are confident that it is stable and reliable
for our clusters. Indeed, the level of testing is significantly more
than in previous releases.

Hopefully the above addresses your concerns.

regards
sanjay


Re: [Discuss] Merge federation branch HDFS-1052 into trunk

Posted by suresh srinivas <sr...@gmail.com>.
As Eli suggested, I have uploaded a new patch to the jira. Merging the new
trunk changes and testing them took several hours! It passes all the tests
except for two unit test failures. These failures do not happen on my
machine; if they are real failures we will address them after merging the
patch to trunk.

Please review the patch and post your comments on jira.

Regards,
Suresh

Re: [Discuss] Merge federation branch HDFS-1052 into trunk

Posted by Konstantin Boudnik <co...@boudnik.org>.
+1. Having an open QE process would be a tremendous value-add to the
overall quality of the feature. Append was an exemplary development in
this sense. Would it be possible to have the federation test plan (if
one exists) published along with the specs on the JIRA (similar to
HDFS-265), at least for reference?

Cos

On Wed, Apr 27, 2011 at 21:56, Konstantin Shvachko <sh...@gmail.com> wrote:
> Yes, I can talk about append as an example.
> Some differences with federation project are:
> - append had a comprehensive test plan document, which was designed and
> executed;
> - append was independently evaluated by HBase guys;
> - it introduced new benchmark for append;
> - We ran both DFSIO and NNThroughput. DFSIO was executed on a relatively
> small cluster. I couldn't find where I posted the results, my bad. But you
> may be able to find these tasks in our scrum records.
>
> --Konstantin
>
>
> On Tue, Apr 26, 2011 at 11:55 PM, suresh srinivas <sr...@gmail.com>wrote:
>
>> Konstantin,
>>
>> Could you provide me link to how this was done on a big feature, like say
>> append and how benchmark info was captured? I am planning to run dfsio
>> tests, btw.
>>
>> Regards,
>> Suresh
>>
>> On Tue, Apr 26, 2011 at 11:34 PM, suresh srinivas <srini30005@gmail.com
>> >wrote:
>>
>> > Konstantin,
>> >
>> > On Tue, Apr 26, 2011 at 10:26 PM, Konstantin Shvachko <
>> > shv.hadoop@gmail.com> wrote:
>> >
>> >> Suresh, Sanjay.
>> >>
>> >> 1. I asked for benchmarks many times over the course of different
>> >> discussions on the topic.
>> >> I don't see any numbers attached to jira, and I was getting the same
>> >> response,
>> >> Doug just got from you, guys: which is "why would the performance be
>> >> worse".
>> >> And this is not an argument for me.
>> >>
>> >
>> > We had done testing earlier and had found that performance had not
>> > degraded. We are waiting for our performance team to publish the official
>> > numbers to post it to the jira. Unfortunately they are busy qualifying
>> 2xx
>> > releases currently. I will get the perf numbers and post them.
>> >
>> >
>> >>
>> >> 2. I assume that merging requires a vote. I am sure people who know
>> bylaws
>> >> better than I do will correct me if it is not true.
>> >> Did I miss the vote?
>> >>
>> >
>> >
>> > As regards to voting, since I was not sure about the procedure, I had
>> > consulted Owen about it. He had indicated that voting is not necessary.
>> If
>> > the right procedure is to call for voting, I will do so. Owen any
>> comments?
>> >
>> >
>> >>
>> >> It feels like you are rushing this and are not doing what you would
>> expect
>> >> others to
>> >> do in the same position, and what has been done in the past for such
>> large
>> >> projects.
>> >>
>> >
>> > I am not trying to rush here and not follow the procedure required. I am
>> > not sure about what the procedure is. Any pointers to it is appreciated.
>> >
>> >
>> >>
>> >> Thanks,
>> >> --Konstantin
>> >>
>> >>
>> >> On Tue, Apr 26, 2011 at 9:43 PM, Doug Cutting <cu...@apache.org>
>> wrote:
>> >>
>> >> > Suresh, Sanjay,
>> >> >
>> >> > Thank you very much for addressing my questions.
>> >> >
>> >> > Cheers,
>> >> >
>> >> > Doug
>> >> >
>> >> > On 04/26/2011 10:29 AM, suresh srinivas wrote:
>> >> > > Doug,
>> >> > >
>> >> > >
>> >> > >> 1. Can you please describe the significant advantages this approach
>> >> has
>> >> > >> over a symlink-based approach?
>> >> > >
>> >> > > Federation is complementary with symlink approach. You could choose
>> to
>> >> > > provide integrated namespace using symlinks. However, client side
>> >> mount
>> >> > > tables seems a better approach for many reasons:
>> >> > > # Unlike symbolic links, client side mount tables can choose to go
>> to
>> >> > right
>> >> > > namenode based on configuration. This avoids unnecessary RPCs to the
>> >> > > namenodes to discover the target of the symlink.
>> >> > > # The unavailability of a namenode where a symbolic link is
>> configured
>> >> > does
>> >> > > not affect reaching the symlink target.
>> >> > > # Symbolic links need not be configured on every namenode in the
>> >> cluster
>> >> > and
>> >> > > future changes to symlinks need not be propagated to multiple
>> >> namenodes.
>> >> > In
>> >> > > client side mount tables, this information is in a central
>> >> configuration.
>> >> > >
>> >> > > If a deployment still wants to use symbolic link, federation does
>> not
>> >> > > preclude it.
>> >> > >
>> >> > >> It seems to me that one could run multiple namenodes on separate
>> >> boxes
>> >> > > and run multiple datanode processes per storage box
>> >> > >
>> >> > > There are several advantages to using a single datanode:
>> >> > > # When you have large number of namenodes (say 20), the cost of
>> >> running
>> >> > > separate datanodes in terms of process resources such as memory is
>> >> huge.
>> >> > > # The disk i/o management and storage utilization using a single
>> >> datanode
>> >> > is
>> >> > > much better, as it has a complete view of the storage.
>> >> > > # In the approach you are proposing, you have several clusters to
>> >> manage.
>> >> > > However with federation, all datanodes are in a single cluster; with
>> >> > single
>> >> > > configuration and operationally easier to manage.
>> >> > >
>> >> > >> The patch modifies much of the logic of Hadoop's central component,
>> >> upon
>> >> > > which the performance and reliability of most other components of
>> the
>> >> > > ecosystem depend.
>> >> > > That is not true.
>> >> > >
>> >> > > # Namenode is mostly unchanged in this feature.
>> >> > > # Read/write pipelines are unchanged.
>> >> > > # The changes are mainly in datanode:
>> >> > > #* the storage, FSDataset, Directory and Disk scanners now have
>> >> another
>> >> > > level to incorporate block pool ID into the hierarchy. This is not a
>> >> > > significant change that should cause performance or stability
>> >> concerns.
>> >> > > #* datanodes use a separate thread per NN, just like the existing
>> >> thread
>> >> > > that communicates with NN.
>> >> > >
>> >> > >> Can you please tell me how this has been tested beyond unit tests?
>> >> > > As regards to testing, we have passed 600+ tests. In hadoop, these
>> >>  tests
>> >> > > are mostly integration tests and not pure unit tests.
>> >> > >
>> >> > > While these tests have been extensive, we have also been testing
>> this
>> >> > branch
>> >> > > for last 4 months, with QA validation that reflects our production
>> >> > > environment. We have found the system to be stable, performing well
>> >> and
>> >> > have
>> >> > > not found any blockers with the branch so far.
>> >> > >
>> >> > > HDFS-1052 has been open more than a year now. I had also sent an
>> email
>> >> > about
>> >> > > this merge around 2 months ago. There are 90 subtasks that have been
>> >> > worked
>> >> > > on last couple of months under HDFS-1052. Given that there was
>> enough
>> >> > time
>> >> > > to ask these questions, your email a day before I am planning to
>> merge
>> >> > the
>> >> > > branch into trunk seems late!
>> >> > >
>> >> >
>> >>
>> >
>> >
>> >
>> > --
>> > Regards,
>> > Suresh
>> >
>> >
>>
>>
>> --
>> Regards,
>> Suresh
>>
>

Re: [Discuss] Merge federation branch HDFS-1052 into trunk

Posted by Konstantin Shvachko <sh...@gmail.com>.
Yes, I can talk about append as an example.
Some differences with federation project are:
- append had a comprehensive test plan document, which was designed and
executed;
- append was independently evaluated by HBase guys;
- it introduced new benchmark for append;
- We ran both DFSIO and NNThroughput. DFSIO was executed on a relatively
small cluster. I couldn't find where I posted the results, my bad. But you
may be able to find these tasks in our scrum records.

--Konstantin


On Tue, Apr 26, 2011 at 11:55 PM, suresh srinivas <sr...@gmail.com>wrote:

> Konstantin,
>
> Could you provide me link to how this was done on a big feature, like say
> append and how benchmark info was captured? I am planning to run dfsio
> tests, btw.
>
> Regards,
> Suresh
>
> On Tue, Apr 26, 2011 at 11:34 PM, suresh srinivas <srini30005@gmail.com
> >wrote:
>
> > Konstantin,
> >
> > On Tue, Apr 26, 2011 at 10:26 PM, Konstantin Shvachko <
> > shv.hadoop@gmail.com> wrote:
> >
> >> Suresh, Sanjay.
> >>
> >> 1. I asked for benchmarks many times over the course of different
> >> discussions on the topic.
> >> I don't see any numbers attached to jira, and I was getting the same
> >> response,
> >> Doug just got from you, guys: which is "why would the performance be
> >> worse".
> >> And this is not an argument for me.
> >>
> >
> > We had done testing earlier and had found that performance had not
> > degraded. We are waiting for our performance team to publish the official
> > numbers to post it to the jira. Unfortunately they are busy qualifying
> 2xx
> > releases currently. I will get the perf numbers and post them.
> >
> >
> >>
> >> 2. I assume that merging requires a vote. I am sure people who know
> bylaws
> >> better than I do will correct me if it is not true.
> >> Did I miss the vote?
> >>
> >
> >
> > As regards to voting, since I was not sure about the procedure, I had
> > consulted Owen about it. He had indicated that voting is not necessary.
> If
> > the right procedure is to call for voting, I will do so. Owen any
> comments?
> >
> >
> >>
> >> It feels like you are rushing this and are not doing what you would
> expect
> >> others to
> >> do in the same position, and what has been done in the past for such
> large
> >> projects.
> >>
> >
> > I am not trying to rush here and not follow the procedure required. I am
> > not sure about what the procedure is. Any pointers to it is appreciated.
> >
> >
> >>
> >> Thanks,
> >> --Konstantin
> >>
> >>
> >> On Tue, Apr 26, 2011 at 9:43 PM, Doug Cutting <cu...@apache.org>
> wrote:
> >>
> >> > Suresh, Sanjay,
> >> >
> >> > Thank you very much for addressing my questions.
> >> >
> >> > Cheers,
> >> >
> >> > Doug
> >> >
> >> > On 04/26/2011 10:29 AM, suresh srinivas wrote:
> >> > > Doug,
> >> > >
> >> > >
> >> > >> 1. Can you please describe the significant advantages this approach
> >> has
> >> > >> over a symlink-based approach?
> >> > >
> >> > > Federation is complementary with symlink approach. You could choose
> to
> >> > > provide integrated namespace using symlinks. However, client side
> >> mount
> >> > > tables seems a better approach for many reasons:
> >> > > # Unlike symbolic links, client side mount tables can choose to go
> to
> >> > right
> >> > > namenode based on configuration. This avoids unnecessary RPCs to the
> >> > > namenodes to discover the target of the symlink.
> >> > > # The unavailability of a namenode where a symbolic link is
> configured
> >> > does
> >> > > not affect reaching the symlink target.
> >> > > # Symbolic links need not be configured on every namenode in the
> >> cluster
> >> > and
> >> > > future changes to symlinks need not be propagated to multiple
> >> namenodes.
> >> > In
> >> > > client side mount tables, this information is in a central
> >> configuration.
> >> > >
> >> > > If a deployment still wants to use symbolic link, federation does
> not
> >> > > preclude it.
> >> > >
> >> > >> It seems to me that one could run multiple namenodes on separate
> >> boxes
> >> > > and run multiple datanode processes per storage box
> >> > >
> >> > > There are several advantages to using a single datanode:
> >> > > # When you have large number of namenodes (say 20), the cost of
> >> running
> >> > > separate datanodes in terms of process resources such as memory is
> >> huge.
> >> > > # The disk i/o management and storage utilization using a single
> >> datanode
> >> > is
> >> > > much better, as it has a complete view of the storage.
> >> > > # In the approach you are proposing, you have several clusters to
> >> manage.
> >> > > However with federation, all datanodes are in a single cluster; with
> >> > single
> >> > > configuration and operationally easier to manage.
> >> > >
> >> > >> The patch modifies much of the logic of Hadoop's central component,
> >> upon
> >> > > which the performance and reliability of most other components of
> the
> >> > > ecosystem depend.
> >> > > That is not true.
> >> > >
> >> > > # Namenode is mostly unchanged in this feature.
> >> > > # Read/write pipelines are unchanged.
> >> > > # The changes are mainly in datanode:
> >> > > #* the storage, FSDataset, Directory and Disk scanners now have
> >> another
> >> > > level to incorporate block pool ID into the hierarchy. This is not a
> >> > > significant change that should cause performance or stability
> >> concerns.
> >> > > #* datanodes use a separate thread per NN, just like the existing
> >> thread
> >> > > that communicates with NN.
> >> > >
> >> > >> Can you please tell me how this has been tested beyond unit tests?
> >> > > As regards to testing, we have passed 600+ tests. In hadoop, these
> >>  tests
> >> > > are mostly integration tests and not pure unit tests.
> >> > >
> >> > > While these tests have been extensive, we have also been testing
> this
> >> > branch
> >> > > for last 4 months, with QA validation that reflects our production
> >> > > environment. We have found the system to be stable, performing well
> >> and
> >> > have
> >> > > not found any blockers with the branch so far.
> >> > >
> >> > > HDFS-1052 has been open more than a year now. I had also sent an
> email
> >> > about
> >> > > this merge around 2 months ago. There are 90 subtasks that have been
> >> > worked
> >> > > on last couple of months under HDFS-1052. Given that there was
> enough
> >> > time
> >> > > to ask these questions, your email a day before I am planning to
> merge
> >> > the
> >> > > branch into trunk seems late!
> >> > >
> >> >
> >>
> >
> >
> >
> > --
> > Regards,
> > Suresh
> >
> >
>
>
> --
> Regards,
> Suresh
>

Re: [Discuss] Merge federation branch HDFS-1052 into trunk

Posted by Konstantin Shvachko <sh...@gmail.com>.
Suresh,
Showing no degradation in performance on a one-node cluster is a good start
for benchmarking.
You still have a dev cluster to run benchmarks, don't you?
--Konstantin

On Wed, Apr 27, 2011 at 2:36 PM, suresh srinivas <sr...@gmail.com>wrote:

> I ran these tests on my laptop. I would like to use this data to emphasize
> that there is no regression in performance. I am not sure with just the
> tests that I ran we could conclude there is a huge gain in performance with
> federation. When our performance test team runs tests at scale we will get
> a clearer picture.
>
>
>
> On Wed, Apr 27, 2011 at 10:41 AM, Konstantin Boudnik <cos@boudnik.org
> >wrote:
>
> > Interesting... while the read performance has only marginally improved
> > <4% (still a good thing) the write performance shows significantly
> > better improvements >10%. Very interesting asymmetry, indeed.
> >
> > Suresh, what was the size of the cluster in the testing?
> >  Cos
> >
> > On Wed, Apr 27, 2011 at 10:02, suresh srinivas <sr...@gmail.com>
> > wrote:
> > > I posted the TestDFSIO comparison with and without federation to
> > HDFS-1052.
> > > Please let me know if it addresses your concern. I am also adding it
> > here:
> > >
> > > TestDFSIO read tests
> > > *Without federation:*
> > > ----- TestDFSIO ----- : read
> > >           Date & time: Wed Apr 27 02:04:24 PDT 2011
> > >       Number of files: 1000
> > > Total MBytes processed: 30000.0
> > >     Throughput mb/sec: 43.62329251162561
> > > Average IO rate mb/sec: 44.619869232177734
> > >  IO rate std deviation: 5.060306158158443
> > >    Test exec time sec: 959.943
> > >
> > > *With federation:*
> > > ----- TestDFSIO ----- : read
> > >           Date & time: Wed Apr 27 02:43:10 PDT 2011
> > >       Number of files: 1000
> > > Total MBytes processed: 30000.0
> > >     Throughput mb/sec: 45.657513857055456
> > > Average IO rate mb/sec: 46.72107696533203
> > >  IO rate std deviation: 5.455125923399539
> > >    Test exec time sec: 924.922
> > >
> > > TestDFSIO write tests
> > > *Without federation:*
> > > ----- TestDFSIO ----- : write
> > >           Date & time: Wed Apr 27 01:47:50 PDT 2011
> > >       Number of files: 1000
> > > Total MBytes processed: 30000.0
> > >     Throughput mb/sec: 35.940755259031015
> > > Average IO rate mb/sec: 38.236236572265625
> > >  IO rate std deviation: 5.929484960036511
> > >    Test exec time sec: 1266.624
> > >
> > > *With federation:*
> > > ----- TestDFSIO ----- : write
> > >           Date & time: Wed Apr 27 02:27:12 PDT 2011
> > >       Number of files: 1000
> > > Total MBytes processed: 30000.0
> > >     Throughput mb/sec: 42.17884674597227
> > > Average IO rate mb/sec: 43.11423873901367
> > >  IO rate std deviation: 5.357057259968647
> > >    Test exec time sec: 1135.298
> > > {noformat}
> > >
> > >
> > > On Tue, Apr 26, 2011 at 11:55 PM, suresh srinivas <
> srini30005@gmail.com
> > >wrote:
> > >
> > >> Konstantin,
> > >>
> > >> Could you provide me link to how this was done on a big feature, like
> > say
> > >> append and how benchmark info was captured? I am planning to run dfsio
> > >> tests, btw.
> > >>
> > >> Regards,
> > >> Suresh
> > >>
> > >>
> > >> On Tue, Apr 26, 2011 at 11:34 PM, suresh srinivas <
> srini30005@gmail.com
> > >wrote:
> > >>
> > >>> Konstantin,
> > >>>
> > >>> On Tue, Apr 26, 2011 at 10:26 PM, Konstantin Shvachko <
> > >>> shv.hadoop@gmail.com> wrote:
> > >>>
> > >>>> Suresh, Sanjay.
> > >>>>
> > >>>> 1. I asked for benchmarks many times over the course of different
> > >>>> discussions on the topic.
> > >>>> I don't see any numbers attached to jira, and I was getting the same
> > >>>> response,
> > >>>> Doug just got from you, guys: which is "why would the performance be
> > >>>> worse".
> > >>>> And this is not an argument for me.
> > >>>>
> > >>>
> > >>> We had done testing earlier and had found that performance had not
> > >>> degraded. We are waiting for our performance team to publish the
> > official
> > >>> numbers to post it to the jira. Unfortunately they are busy
> qualifying
> > 2xx
> > >>> releases currently. I will get the perf numbers and post them.
> > >>>
> > >>>
> > >>>>
> > >>>> 2. I assume that merging requires a vote. I am sure people who know
> > >>>> bylaws
> > >>>> better than I do will correct me if it is not true.
> > >>>> Did I miss the vote?
> > >>>>
> > >>>
> > >>>
> > >>> As regards to voting, since I was not sure about the procedure, I had
> > >>> consulted Owen about it. He had indicated that voting is not
> necessary.
> > If
> > >>> the right procedure is to call for voting, I will do so. Owen any
> > comments?
> > >>>
> > >>>
> > >>>>
> > >>>> It feels like you are rushing this and are not doing what you would
> > >>>> expect
> > >>>> others to
> > >>>> do in the same position, and what has been done in the past for such
> > >>>> large
> > >>>> projects.
> > >>>>
> > >>>
> > >>> I am not trying to rush here and not follow the procedure required. I
> > am
> > >>> not sure about what the procedure is. Any pointers to it are
> > appreciated.
> > >>>
> > >>>
> > >>>>
> > >>>> Thanks,
> > >>>> --Konstantin
> > >>>>
> > >>>>
> > >>>> On Tue, Apr 26, 2011 at 9:43 PM, Doug Cutting <cu...@apache.org>
> > >>>> wrote:
> > >>>>
> > >>>> > Suresh, Sanjay,
> > >>>> >
> > >>>> > Thank you very much for addressing my questions.
> > >>>> >
> > >>>> > Cheers,
> > >>>> >
> > >>>> > Doug
> > >>>> >
> > >>>> > On 04/26/2011 10:29 AM, suresh srinivas wrote:
> > >>>> > > Doug,
> > >>>> > >
> > >>>> > >
> > >>>> > >> 1. Can you please describe the significant advantages this
> > approach
> > >>>> has
> > >>>> > >> over a symlink-based approach?
> > >>>> > >
> > >>>> > > Federation is complementary with symlink approach. You could
> > choose
> > >>>> to
> > >>>> > > provide integrated namespace using symlinks. However, client
> side
> > >>>> mount
> > >>>> > > tables seems a better approach for many reasons:
> > >>>> > > # Unlike symbolic links, client side mount tables can choose to
> go
> > to
> > >>>> > right
> > >>>> > > namenode based on configuration. This avoids unnecessary RPCs to
> > the
> > >>>> > > namenodes to discover the target of the symlink.
> > >>>> > > # The unavailability of a namenode where a symbolic link is
> > >>>> configured
> > >>>> > does
> > >>>> > > not affect reaching the symlink target.
> > >>>> > > # Symbolic links need not be configured on every namenode in the
> > >>>> cluster
> > >>>> > and
> > >>>> > > future changes to symlinks need not be propagated to multiple
> > >>>> namenodes.
> > >>>> > In
> > >>>> > > client side mount tables, this information is in a central
> > >>>> configuration.
> > >>>> > >
> > >>>> > > If a deployment still wants to use symbolic link, federation
> does
> > not
> > >>>> > > preclude it.
> > >>>> > >
> > >>>> > >> It seems to me that one could run multiple namenodes on
> separate
> > >>>> boxes
> > >>>> > > and run multiple datanode processes per storage box
> > >>>> > >
> > >>>> > > There are several advantages to using a single datanode:
> > >>>> > > # When you have large number of namenodes (say 20), the cost of
> > >>>> running
> > >>>> > > separate datanodes in terms of process resources such as memory
> is
> > >>>> huge.
> > >>>> > > # The disk i/o management and storage utilization using a single
> > >>>> datanode
> > >>>> > is
> > >>>> > > much better, as it has a complete view of the storage.
> > >>>> > > # In the approach you are proposing, you have several clusters
> to
> > >>>> manage.
> > >>>> > > However with federation, all datanodes are in a single cluster;
> > with
> > >>>> > single
> > >>>> > > configuration and operationally easier to manage.
> > >>>> > >
> > >>>> > >> The patch modifies much of the logic of Hadoop's central
> > component,
> > >>>> upon
> > >>>> > > which the performance and reliability of most other components
> of
> > the
> > >>>> > > ecosystem depend.
> > >>>> > > That is not true.
> > >>>> > >
> > >>>> > > # Namenode is mostly unchanged in this feature.
> > >>>> > > # Read/write pipelines are unchanged.
> > >>>> > > # The changes are mainly in datanode:
> > >>>> > > #* the storage, FSDataset, Directory and Disk scanners now have
> > >>>> another
> > >>>> > > level to incorporate block pool ID into the hierarchy. This is
> not
> > a
> > >>>> > > significant change that should cause performance or stability
> > >>>> concerns.
> > >>>> > > #* datanodes use a separate thread per NN, just like the
> existing
> > >>>> thread
> > >>>> > > that communicates with NN.
> > >>>> > >
> > >>>> > >> Can you please tell me how this has been tested beyond unit
> > tests?
> > >>>> > > As regards to testing, we have passed 600+ tests. In hadoop,
> these
> > >>>>  tests
> > >>>> > > are mostly integration tests and not pure unit tests.
> > >>>> > >
> > >>>> > > While these tests have been extensive, we have also been testing
> > this
> > >>>> > branch
> > >>>> > > for last 4 months, with QA validation that reflects our
> production
> > >>>> > > environment. We have found the system to be stable, performing
> > well
> > >>>> and
> > >>>> > have
> > >>>> > > not found any blockers with the branch so far.
> > >>>> > >
> > >>>> > > HDFS-1052 has been open more than a year now. I had also sent an
> > >>>> email
> > >>>> > about
> > >>>> > > this merge around 2 months ago. There are 90 subtasks that have
> > been
> > >>>> > worked
> > >>>> > > on last couple of months under HDFS-1052. Given that there was
> > enough
> > >>>> > time
> > >>>> > > to ask these questions, your email a day before I am planning to
> > >>>> merge
> > >>>> > the
> > >>>> > > branch into trunk seems late!
> > >>>> > >
> > >>>> >
> > >>>>
> > >>>
> > >>>
> > >>>
> > >>> --
> > >>> Regards,
> > >>> Suresh
> > >>>
> > >>>
> > >>
> > >>
> > >> --
> > >> Regards,
> > >> Suresh
> > >>
> > >>
> > >
> > >
> > > --
> > > Regards,
> > > Suresh
> > >
> >
>
>
>
> --
> Regards,
> Suresh
>

Re: [Discuss] Merge federation branch HDFS-1052 into trunk

Posted by suresh srinivas <sr...@gmail.com>.
I ran these tests on my laptop. I would like to use this data to emphasize
that there is no regression in performance. I am not sure with just the
tests that I ran we could conclude there is a huge gain in performance with
federation. When our performance test team runs tests at scale we will get
a clearer picture.



On Wed, Apr 27, 2011 at 10:41 AM, Konstantin Boudnik <co...@boudnik.org>wrote:

> Interesting... while the read performance has only marginally improved
> <4% (still a good thing) the write performance shows significantly
> better improvements >10%. Very interesting asymmetry, indeed.
>
> Suresh, what was the size of the cluster in the testing?
>  Cos
>
> On Wed, Apr 27, 2011 at 10:02, suresh srinivas <sr...@gmail.com>
> wrote:
> > I posted the TestDFSIO comparison with and without federation to
> HDFS-1052.
> > Please let me know if it addresses your concern. I am also adding it
> here:
> >
> > TestDFSIO read tests
> > *Without federation:*
> > ----- TestDFSIO ----- : read
> >           Date & time: Wed Apr 27 02:04:24 PDT 2011
> >       Number of files: 1000
> > Total MBytes processed: 30000.0
> >     Throughput mb/sec: 43.62329251162561
> > Average IO rate mb/sec: 44.619869232177734
> >  IO rate std deviation: 5.060306158158443
> >    Test exec time sec: 959.943
> >
> > *With federation:*
> > ----- TestDFSIO ----- : read
> >           Date & time: Wed Apr 27 02:43:10 PDT 2011
> >       Number of files: 1000
> > Total MBytes processed: 30000.0
> >     Throughput mb/sec: 45.657513857055456
> > Average IO rate mb/sec: 46.72107696533203
> >  IO rate std deviation: 5.455125923399539
> >    Test exec time sec: 924.922
> >
> > TestDFSIO write tests
> > *Without federation:*
> > ----- TestDFSIO ----- : write
> >           Date & time: Wed Apr 27 01:47:50 PDT 2011
> >       Number of files: 1000
> > Total MBytes processed: 30000.0
> >     Throughput mb/sec: 35.940755259031015
> > Average IO rate mb/sec: 38.236236572265625
> >  IO rate std deviation: 5.929484960036511
> >    Test exec time sec: 1266.624
> >
> > *With federation:*
> > ----- TestDFSIO ----- : write
> >           Date & time: Wed Apr 27 02:27:12 PDT 2011
> >       Number of files: 1000
> > Total MBytes processed: 30000.0
> >     Throughput mb/sec: 42.17884674597227
> > Average IO rate mb/sec: 43.11423873901367
> >  IO rate std deviation: 5.357057259968647
> >    Test exec time sec: 1135.298
> > {noformat}
> >
> >
> > On Tue, Apr 26, 2011 at 11:55 PM, suresh srinivas <srini30005@gmail.com
> >wrote:
> >
> >> Konstantin,
> >>
> >> Could you provide me link to how this was done on a big feature, like
> say
> >> append and how benchmark info was captured? I am planning to run dfsio
> >> tests, btw.
> >>
> >> Regards,
> >> Suresh
> >>
> >>
> >> On Tue, Apr 26, 2011 at 11:34 PM, suresh srinivas <srini30005@gmail.com
> >wrote:
> >>
> >>> Konstantin,
> >>>
> >>> On Tue, Apr 26, 2011 at 10:26 PM, Konstantin Shvachko <
> >>> shv.hadoop@gmail.com> wrote:
> >>>
> >>>> Suresh, Sanjay.
> >>>>
> >>>> 1. I asked for benchmarks many times over the course of different
> >>>> discussions on the topic.
> >>>> I don't see any numbers attached to jira, and I was getting the same
> >>>> response,
> >>>> Doug just got from you, guys: which is "why would the performance be
> >>>> worse".
> >>>> And this is not an argument for me.
> >>>>
> >>>
> >>> We had done testing earlier and had found that performance had not
> >>> degraded. We are waiting for our performance team to publish the
> official
> >>> numbers to post it to the jira. Unfortunately they are busy qualifying
> 2xx
> >>> releases currently. I will get the perf numbers and post them.
> >>>
> >>>
> >>>>
> >>>> 2. I assume that merging requires a vote. I am sure people who know
> >>>> bylaws
> >>>> better than I do will correct me if it is not true.
> >>>> Did I miss the vote?
> >>>>
> >>>
> >>>
> >>> As regards to voting, since I was not sure about the procedure, I had
> >>> consulted Owen about it. He had indicated that voting is not necessary.
> If
> >>> the right procedure is to call for voting, I will do so. Owen any
> comments?
> >>>
> >>>
> >>>>
> >>>> It feels like you are rushing this and are not doing what you would
> >>>> expect
> >>>> others to
> >>>> do in the same position, and what has been done in the past for such
> >>>> large
> >>>> projects.
> >>>>
> >>>
> >>> I am not trying to rush here and not follow the procedure required. I
> am
> >>> not sure about what the procedure is. Any pointers to it are
> appreciated.
> >>>
> >>>
> >>>>
> >>>> Thanks,
> >>>> --Konstantin
> >>>>
> >>>>
> >>>> On Tue, Apr 26, 2011 at 9:43 PM, Doug Cutting <cu...@apache.org>
> >>>> wrote:
> >>>>
> >>>> > Suresh, Sanjay,
> >>>> >
> >>>> > Thank you very much for addressing my questions.
> >>>> >
> >>>> > Cheers,
> >>>> >
> >>>> > Doug
> >>>> >
> >>>> > On 04/26/2011 10:29 AM, suresh srinivas wrote:
> >>>> > > Doug,
> >>>> > >
> >>>> > >
> >>>> > >> 1. Can you please describe the significant advantages this
> approach
> >>>> has
> >>>> > >> over a symlink-based approach?
> >>>> > >
> >>>> > > Federation is complementary with symlink approach. You could
> choose
> >>>> to
> >>>> > > provide integrated namespace using symlinks. However, client side
> >>>> mount
> >>>> > > tables seems a better approach for many reasons:
> >>>> > > # Unlike symbolic links, client side mount tables can choose to go
> to
> >>>> > right
> >>>> > > namenode based on configuration. This avoids unnecessary RPCs to
> the
> >>>> > > namenodes to discover the target of the symlink.
> >>>> > > # The unavailability of a namenode where a symbolic link is
> >>>> configured
> >>>> > does
> >>>> > > not affect reaching the symlink target.
> >>>> > > # Symbolic links need not be configured on every namenode in the
> >>>> cluster
> >>>> > and
> >>>> > > future changes to symlinks need not be propagated to multiple
> >>>> namenodes.
> >>>> > In
> >>>> > > client side mount tables, this information is in a central
> >>>> configuration.
> >>>> > >
> >>>> > > If a deployment still wants to use symbolic link, federation does
> not
> >>>> > > preclude it.
> >>>> > >
> >>>> > >> It seems to me that one could run multiple namenodes on separate
> >>>> boxes
> >>>> > > and run multiple datanode processes per storage box
> >>>> > >
> >>>> > > There are several advantages to using a single datanode:
> >>>> > > # When you have large number of namenodes (say 20), the cost of
> >>>> running
> >>>> > > separate datanodes in terms of process resources such as memory is
> >>>> huge.
> >>>> > > # The disk i/o management and storage utilization using a single
> >>>> datanode
> >>>> > is
> >>>> > > much better, as it has a complete view of the storage.
> >>>> > > # In the approach you are proposing, you have several clusters to
> >>>> manage.
> >>>> > > However with federation, all datanodes are in a single cluster;
> with
> >>>> > single
> >>>> > > configuration and operationally easier to manage.
> >>>> > >
> >>>> > >> The patch modifies much of the logic of Hadoop's central
> component,
> >>>> upon
> >>>> > > which the performance and reliability of most other components of
> the
> >>>> > > ecosystem depend.
> >>>> > > That is not true.
> >>>> > >
> >>>> > > # Namenode is mostly unchanged in this feature.
> >>>> > > # Read/write pipelines are unchanged.
> >>>> > > # The changes are mainly in datanode:
> >>>> > > #* the storage, FSDataset, Directory and Disk scanners now have
> >>>> another
> >>>> > > level to incorporate block pool ID into the hierarchy. This is not
> a
> >>>> > > significant change that should cause performance or stability
> >>>> concerns.
> >>>> > > #* datanodes use a separate thread per NN, just like the existing
> >>>> thread
> >>>> > > that communicates with NN.
> >>>> > >
> >>>> > >> Can you please tell me how this has been tested beyond unit
> tests?
> >>>> > > As regards to testing, we have passed 600+ tests. In hadoop, these
> >>>>  tests
> >>>> > > are mostly integration tests and not pure unit tests.
> >>>> > >
> >>>> > > While these tests have been extensive, we have also been testing
> this
> >>>> > branch
> >>>> > > for last 4 months, with QA validation that reflects our production
> >>>> > > environment. We have found the system to be stable, performing
> well
> >>>> and
> >>>> > have
> >>>> > > not found any blockers with the branch so far.
> >>>> > >
> >>>> > > HDFS-1052 has been open more than a year now. I had also sent an
> >>>> email
> >>>> > about
> >>>> > > this merge around 2 months ago. There are 90 subtasks that have
> been
> >>>> > worked
> >>>> > > on last couple of months under HDFS-1052. Given that there was
> enough
> >>>> > time
> >>>> > > to ask these questions, your email a day before I am planning to
> >>>> merge
> >>>> > the
> >>>> > > branch into trunk seems late!
> >>>> > >
> >>>> >
> >>>>
> >>>
> >>>
> >>>
> >>> --
> >>> Regards,
> >>> Suresh
> >>>
> >>>
> >>
> >>
> >> --
> >> Regards,
> >> Suresh
> >>
> >>
> >
> >
> > --
> > Regards,
> > Suresh
> >
>



-- 
Regards,
Suresh

Re: [Discuss] Merge federation branch HDFS-1052 into trunk

Posted by Konstantin Boudnik <co...@boudnik.org>.
Interesting... while the read performance has only marginally improved
<4% (still a good thing) the write performance shows significantly
better improvements >10%. Very interesting asymmetry, indeed.

Suresh, what was the size of the cluster in the testing?
  Cos
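
[Editor's note: the read/write asymmetry described above can be checked directly from the exec times in the TestDFSIO numbers quoted below. A minimal sketch — the figures are copied from Suresh's email; the helper function name is illustrative, not part of any Hadoop API:]

```python
def time_reduction(before_sec: float, after_sec: float) -> float:
    """Percent reduction in TestDFSIO exec time (without -> with federation)."""
    return (before_sec - after_sec) / before_sec * 100

# Exec times from the runs quoted in this thread.
read_gain = time_reduction(959.943, 924.922)     # read: 959.943s -> 924.922s
write_gain = time_reduction(1266.624, 1135.298)  # write: 1266.624s -> 1135.298s

print(f"read exec time reduced by  {read_gain:.1f}%")   # ~3.6%
print(f"write exec time reduced by {write_gain:.1f}%")  # ~10.4%
```

The read improvement is under 4% while the write improvement is over 10%, which matches the asymmetry noted here.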

On Wed, Apr 27, 2011 at 10:02, suresh srinivas <sr...@gmail.com> wrote:
> I posted the TestDFSIO comparison with and without federation to HDFS-1052.
> Please let me know if it addresses your concern. I am also adding it here:
>
> TestDFSIO read tests
> *Without federation:*
> ----- TestDFSIO ----- : read
>           Date & time: Wed Apr 27 02:04:24 PDT 2011
>       Number of files: 1000
> Total MBytes processed: 30000.0
>     Throughput mb/sec: 43.62329251162561
> Average IO rate mb/sec: 44.619869232177734
>  IO rate std deviation: 5.060306158158443
>    Test exec time sec: 959.943
>
> *With federation:*
> ----- TestDFSIO ----- : read
>           Date & time: Wed Apr 27 02:43:10 PDT 2011
>       Number of files: 1000
> Total MBytes processed: 30000.0
>     Throughput mb/sec: 45.657513857055456
> Average IO rate mb/sec: 46.72107696533203
>  IO rate std deviation: 5.455125923399539
>    Test exec time sec: 924.922
>
> TestDFSIO write tests
> *Without federation:*
> ----- TestDFSIO ----- : write
>           Date & time: Wed Apr 27 01:47:50 PDT 2011
>       Number of files: 1000
> Total MBytes processed: 30000.0
>     Throughput mb/sec: 35.940755259031015
> Average IO rate mb/sec: 38.236236572265625
>  IO rate std deviation: 5.929484960036511
>    Test exec time sec: 1266.624
>
> *With federation:*
> ----- TestDFSIO ----- : write
>           Date & time: Wed Apr 27 02:27:12 PDT 2011
>       Number of files: 1000
> Total MBytes processed: 30000.0
>     Throughput mb/sec: 42.17884674597227
> Average IO rate mb/sec: 43.11423873901367
>  IO rate std deviation: 5.357057259968647
>    Test exec time sec: 1135.298
> {noformat}
>
>
> On Tue, Apr 26, 2011 at 11:55 PM, suresh srinivas <sr...@gmail.com>wrote:
>
>> Konstantin,
>>
>> Could you provide me link to how this was done on a big feature, like say
>> append and how benchmark info was captured? I am planning to run dfsio
>> tests, btw.
>>
>> Regards,
>> Suresh
>>
>>
>> On Tue, Apr 26, 2011 at 11:34 PM, suresh srinivas <sr...@gmail.com>wrote:
>>
>>> Konstantin,
>>>
>>> On Tue, Apr 26, 2011 at 10:26 PM, Konstantin Shvachko <
>>> shv.hadoop@gmail.com> wrote:
>>>
>>>> Suresh, Sanjay.
>>>>
>>>> 1. I asked for benchmarks many times over the course of different
>>>> discussions on the topic.
>>>> I don't see any numbers attached to jira, and I was getting the same
>>>> response,
>>>> Doug just got from you, guys: which is "why would the performance be
>>>> worse".
>>>> And this is not an argument for me.
>>>>
>>>
>>> We had done testing earlier and had found that performance had not
>>> degraded. We are waiting for our performance team to publish the official
>>> numbers to post it to the jira. Unfortunately they are busy qualifying 2xx
>>> releases currently. I will get the perf numbers and post them.
>>>
>>>
>>>>
>>>> 2. I assume that merging requires a vote. I am sure people who know
>>>> bylaws
>>>> better than I do will correct me if it is not true.
>>>> Did I miss the vote?
>>>>
>>>
>>>
>>> As regards to voting, since I was not sure about the procedure, I had
>>> consulted Owen about it. He had indicated that voting is not necessary. If
>>> the right procedure is to call for voting, I will do so. Owen any comments?
>>>
>>>
>>>>
>>>> It feels like you are rushing this and are not doing what you would
>>>> expect
>>>> others to
>>>> do in the same position, and what has been done in the past for such
>>>> large
>>>> projects.
>>>>
>>>
>>> I am not trying to rush here and not follow the procedure required. I am
>>> not sure about what the procedure is. Any pointers to it are appreciated.
>>>
>>>
>>>>
>>>> Thanks,
>>>> --Konstantin
>>>>
>>>>
>>>> On Tue, Apr 26, 2011 at 9:43 PM, Doug Cutting <cu...@apache.org>
>>>> wrote:
>>>>
>>>> > Suresh, Sanjay,
>>>> >
>>>> > Thank you very much for addressing my questions.
>>>> >
>>>> > Cheers,
>>>> >
>>>> > Doug
>>>> >
>>>> > On 04/26/2011 10:29 AM, suresh srinivas wrote:
>>>> > > Doug,
>>>> > >
>>>> > >
>>>> > >> 1. Can you please describe the significant advantages this approach
>>>> has
>>>> > >> over a symlink-based approach?
>>>> > >
>>>> > > Federation is complementary with symlink approach. You could choose
>>>> to
>>>> > > provide integrated namespace using symlinks. However, client side
>>>> mount
>>>> > > tables seems a better approach for many reasons:
>>>> > > # Unlike symbolic links, client side mount tables can choose to go to
>>>> > right
>>>> > > namenode based on configuration. This avoids unnecessary RPCs to the
>>> > > namenodes to discover the target of the symlink.
>>>> > > # The unavailability of a namenode where a symbolic link is
>>>> configured
>>>> > does
>>>> > > not affect reaching the symlink target.
>>>> > > # Symbolic links need not be configured on every namenode in the
>>>> cluster
>>>> > and
>>>> > > future changes to symlinks need not be propagated to multiple
>>>> namenodes.
>>>> > In
>>>> > > client side mount tables, this information is in a central
>>>> configuration.
>>>> > >
>>>> > > If a deployment still wants to use symbolic link, federation does not
>>>> > > preclude it.
>>>> > >
>>>> > >> It seems to me that one could run multiple namenodes on separate
>>>> boxes
>>> > > and run multiple datanode processes per storage box
>>>> > >
>>>> > > There are several advantages to using a single datanode:
>>>> > > # When you have large number of namenodes (say 20), the cost of
>>>> running
>>>> > > separate datanodes in terms of process resources such as memory is
>>>> huge.
>>>> > > # The disk i/o management and storage utilization using a single
>>>> datanode
>>>> > is
>>> > > much better, as it has a complete view of the storage.
>>>> > > # In the approach you are proposing, you have several clusters to
>>>> manage.
>>>> > > However with federation, all datanodes are in a single cluster; with
>>>> > single
>>>> > > configuration and operationally easier to manage.
>>>> > >
>>>> > >> The patch modifies much of the logic of Hadoop's central component,
>>>> upon
>>>> > > which the performance and reliability of most other components of the
>>>> > > ecosystem depend.
>>>> > > That is not true.
>>>> > >
>>>> > > # Namenode is mostly unchanged in this feature.
>>>> > > # Read/write pipelines are unchanged.
>>>> > > # The changes are mainly in datanode:
>>>> > > #* the storage, FSDataset, Directory and Disk scanners now have
>>>> another
>>>> > > level to incorporate block pool ID into the hierarchy. This is not a
>>>> > > significant change that should cause performance or stability
>>>> concerns.
>>>> > > #* datanodes use a separate thread per NN, just like the existing
>>>> thread
>>>> > > that communicates with NN.
>>>> > >
>>>> > >> Can you please tell me how this has been tested beyond unit tests?
>>>> > > As regards to testing, we have passed 600+ tests. In hadoop, these
>>>>  tests
>>>> > > are mostly integration tests and not pure unit tests.
>>>> > >
>>>> > > While these tests have been extensive, we have also been testing this
>>>> > branch
>>>> > > for last 4 months, with QA validation that reflects our production
>>>> > > environment. We have found the system to be stable, performing well
>>>> and
>>>> > have
>>>> > > not found any blockers with the branch so far.
>>>> > >
>>>> > > HDFS-1052 has been open more than a year now. I had also sent an
>>>> email
>>>> > about
>>>> > > this merge around 2 months ago. There are 90 subtasks that have been
>>>> > worked
>>>> > > on last couple of months under HDFS-1052. Given that there was enough
>>>> > time
>>>> > > to ask these questions, your email a day before I am planning to
>>>> merge
>>>> > the
>>>> > > branch into trunk seems late!
>>>> > >
>>>> >
>>>>
>>>
>>>
>>>
>>> --
>>> Regards,
>>> Suresh
>>>
>>>
>>
>>
>> --
>> Regards,
>> Suresh
>>
>>
>
>
> --
> Regards,
> Suresh
>

Re: [Discuss] Merge federation branch HDFS-1052 into trunk

Posted by "Tsz Wo (Nicholas), Sze" <s2...@yahoo.com>.
It is not a surprise that the performance of Federation is better than trunk 
since, as Suresh mentioned previously, we improved some components of HDFS when 
we were developing Federation.

Regards,
Nicholas





________________________________
From: suresh srinivas <sr...@gmail.com>
To: hdfs-dev@hadoop.apache.org
Sent: Wed, April 27, 2011 10:02:32 AM
Subject: Re: [Discuss] Merge federation branch HDFS-1052 into trunk

I posted the TestDFSIO comparison with and without federation to HDFS-1052.
Please let me know if it addresses your concern. I am also adding it here:

TestDFSIO read tests
*Without federation:*
----- TestDFSIO ----- : read
           Date & time: Wed Apr 27 02:04:24 PDT 2011
       Number of files: 1000
Total MBytes processed: 30000.0
     Throughput mb/sec: 43.62329251162561
Average IO rate mb/sec: 44.619869232177734
IO rate std deviation: 5.060306158158443
    Test exec time sec: 959.943

*With federation:*
----- TestDFSIO ----- : read
           Date & time: Wed Apr 27 02:43:10 PDT 2011
       Number of files: 1000
Total MBytes processed: 30000.0
     Throughput mb/sec: 45.657513857055456
Average IO rate mb/sec: 46.72107696533203
IO rate std deviation: 5.455125923399539
    Test exec time sec: 924.922

TestDFSIO write tests
*Without federation:*
----- TestDFSIO ----- : write
           Date & time: Wed Apr 27 01:47:50 PDT 2011
       Number of files: 1000
Total MBytes processed: 30000.0
     Throughput mb/sec: 35.940755259031015
Average IO rate mb/sec: 38.236236572265625
IO rate std deviation: 5.929484960036511
    Test exec time sec: 1266.624

*With federation:*
----- TestDFSIO ----- : write
           Date & time: Wed Apr 27 02:27:12 PDT 2011
       Number of files: 1000
Total MBytes processed: 30000.0
     Throughput mb/sec: 42.17884674597227
Average IO rate mb/sec: 43.11423873901367
IO rate std deviation: 5.357057259968647
    Test exec time sec: 1135.298
{noformat}


On Tue, Apr 26, 2011 at 11:55 PM, suresh srinivas <sr...@gmail.com>wrote:

> Konstantin,
>
> Could you provide me link to how this was done on a big feature, like say
> append and how benchmark info was captured? I am planning to run dfsio
> tests, btw.
>
> Regards,
> Suresh
>
>
> On Tue, Apr 26, 2011 at 11:34 PM, suresh srinivas <sr...@gmail.com>wrote:
>
>> Konstantin,
>>
>> On Tue, Apr 26, 2011 at 10:26 PM, Konstantin Shvachko <
>> shv.hadoop@gmail.com> wrote:
>>
>>> Suresh, Sanjay.
>>>
>>> 1. I asked for benchmarks many times over the course of different
>>> discussions on the topic.
>>> I don't see any numbers attached to jira, and I was getting the same
>>> response,
>>> Doug just got from you, guys: which is "why would the performance be
>>> worse".
>>> And this is not an argument for me.
>>>
>>
>> We had done testing earlier and had found that performance had not
>> degraded. We are waiting for our performance team to publish the official
>> numbers to post it to the jira. Unfortunately they are busy qualifying 2xx
>> releases currently. I will get the perf numbers and post them.
>>
>>
>>>
>>> 2. I assume that merging requires a vote. I am sure people who know
>>> bylaws
>>> better than I do will correct me if it is not true.
>>> Did I miss the vote?
>>>
>>
>>
>> As regards to voting, since I was not sure about the procedure, I had
>> consulted Owen about it. He had indicated that voting is not necessary. If
>> the right procedure is to call for voting, I will do so. Owen any comments?
>>
>>
>>>
>>> It feels like you are rushing this and are not doing what you would
>>> expect
>>> others to
>>> do in the same position, and what has been done in the past for such
>>> large
>>> projects.
>>>
>>
>> I am not trying to rush here and not follow the procedure required. I am
>> not sure about what the procedure is. Any pointers to it are appreciated.
>>
>>
>>>
>>> Thanks,
>>> --Konstantin
>>>
>>>
>>> On Tue, Apr 26, 2011 at 9:43 PM, Doug Cutting <cu...@apache.org>
>>> wrote:
>>>
>>> > Suresh, Sanjay,
>>> >
>>> > Thank you very much for addressing my questions.
>>> >
>>> > Cheers,
>>> >
>>> > Doug
>>> >
>>> > On 04/26/2011 10:29 AM, suresh srinivas wrote:
>>> > > Doug,
>>> > >
>>> > >
>>> > >> 1. Can you please describe the significant advantages this approach
>>> has
>>> > >> over a symlink-based approach?
>>> > >
>>> > > Federation is complementary with symlink approach. You could choose
>>> to
>>> > > provide integrated namespace using symlinks. However, client side
>>> mount
>>> > > tables seems a better approach for many reasons:
>>> > > # Unlike symbolic links, client side mount tables can choose to go to
>>> > right
>>> > > namenode based on configuration. This avoids unnecessary RPCs to the
>>> > > namenodes to discover the target of the symlink.
>>> > > # The unavailability of a namenode where a symbolic link is
>>> configured
>>> > does
>>> > > not affect reaching the symlink target.
>>> > > # Symbolic links need not be configured on every namenode in the
>>> cluster
>>> > and
>>> > > future changes to symlinks need not be propagated to multiple
>>> namenodes.
>>> > In
>>> > > client side mount tables, this information is in a central
>>> configuration.
>>> > >
>>> > > If a deployment still wants to use symbolic link, federation does not
>>> > > preclude it.
>>> > >
>>> > >> It seems to me that one could run multiple namenodes on separate
>>> boxes
>>> > > and run multiple datanode processes per storage box
>>> > >
>>> > > There are several advantages to using a single datanode:
>>> > > # When you have a large number of namenodes (say 20), the cost of
>>> running
>>> > > separate datanodes in terms of process resources such as memory is
>>> huge.
>>> > > # The disk i/o management and storage utilization using a single
>>> datanode
>>> > is
>>> > > much better, as it has a complete view of the storage.
>>> > > # In the approach you are proposing, you have several clusters to
>>> manage.
>>> > > However with federation, all datanodes are in a single cluster; with
>>> > single
>>> > > configuration and operationally easier to manage.
>>> > >
>>> > >> The patch modifies much of the logic of Hadoop's central component,
>>> upon
>>> > > which the performance and reliability of most other components of the
>>> > > ecosystem depend.
>>> > > That is not true.
>>> > >
>>> > > # Namenode is mostly unchanged in this feature.
>>> > > # Read/write pipelines are unchanged.
>>> > > # The changes are mainly in datanode:
>>> > > #* the storage, FSDataset, Directory and Disk scanners now have
>>> another
>>> > > level to incorporate block pool ID into the hierarchy. This is not a
>>> > > significant change that should cause performance or stability
>>> concerns.
>>> > > #* datanodes use a separate thread per NN, just like the existing
>>> thread
>>> > > that communicates with NN.
>>> > >
>>> > >> Can you please tell me how this has been tested beyond unit tests?
>>> > > As regards to testing, we have passed 600+ tests. In hadoop, these
>>>  tests
>>> > > are mostly integration tests and not pure unit tests.
>>> > >
>>> > > While these tests have been extensive, we have also been testing this
>>> > branch
>>> > > for last 4 months, with QA validation that reflects our production
>>> > > environment. We have found the system to be stable, performing well
>>> and
>>> > have
>>> > > not found any blockers with the branch so far.
>>> > >
>>> > > HDFS-1052 has been open more than a year now. I had also sent an
>>> email
>>> > about
>>> > > this merge around 2 months ago. There are 90 subtasks that have been
>>> > worked
>>> > > on last couple of months under HDFS-1052. Given that there was enough
>>> > time
>>> > > to ask these questions, your email a day before I am planning to
>>> merge
>>> > the
>>> > > branch into trunk seems late!
>>> > >
>>> >
>>>
>>
>>
>>
>> --
>> Regards,
>> Suresh
>>
>>
>
>
> --
> Regards,
> Suresh
>
>


-- 
Regards,
Suresh
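The client-side mount table argued for in the quoted reply can be sketched as a longest-prefix lookup. This is a toy illustration, not the actual HDFS client code; the table entries and namenode addresses are made up:

```python
# Toy client-side mount table: the client maps a path prefix to the
# namenode that owns that part of the namespace, so no RPC is needed
# to discover where a path lives (unlike following a symlink).
MOUNT_TABLE = {
    "/user": "hdfs://nn1:8020",
    "/data": "hdfs://nn2:8020",
    "/tmp":  "hdfs://nn3:8020",
}

def resolve(path):
    """Return (namenode, path) for the longest matching mount prefix."""
    best = ""
    for prefix in MOUNT_TABLE:
        if path == prefix or path.startswith(prefix + "/"):
            if len(prefix) > len(best):
                best = prefix
    if not best:
        raise ValueError("no mount point for " + path)
    return MOUNT_TABLE[best], path

print(resolve("/user/suresh/file"))
```

Because the lookup happens entirely in the client configuration, an unavailable namenode only affects the paths mounted on it, which matches the argument above.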

Re: [Discuss] Merge federation branch HDFS-1052 into trunk

Posted by Hairong <ku...@gmail.com>.
Nice performance data! The federation branch definitely adds code
complexity to HDFS, but this is a long-awaited feature to improve HDFS
scalability and a step forward in separating namespace management
from storage management. I am for merging this to trunk.

Hairong

On 4/27/11 10:02 AM, "suresh srinivas" <sr...@gmail.com> wrote:

>I posted the TestDFSIO comparison with and without federation to
>HDFS-1052.
>Please let me know if it addresses your concern. I am also adding it here:
>
>TestDFSIO read tests
>*Without federation:*
>----- TestDFSIO ----- : read
>           Date & time: Wed Apr 27 02:04:24 PDT 2011
>       Number of files: 1000
>Total MBytes processed: 30000.0
>     Throughput mb/sec: 43.62329251162561
>Average IO rate mb/sec: 44.619869232177734
> IO rate std deviation: 5.060306158158443
>    Test exec time sec: 959.943
>
>*With federation:*
>----- TestDFSIO ----- : read
>           Date & time: Wed Apr 27 02:43:10 PDT 2011
>       Number of files: 1000
>Total MBytes processed: 30000.0
>     Throughput mb/sec: 45.657513857055456
>Average IO rate mb/sec: 46.72107696533203
> IO rate std deviation: 5.455125923399539
>    Test exec time sec: 924.922
>
>TestDFSIO write tests
>*Without federation:*
>----- TestDFSIO ----- : write
>           Date & time: Wed Apr 27 01:47:50 PDT 2011
>       Number of files: 1000
>Total MBytes processed: 30000.0
>     Throughput mb/sec: 35.940755259031015
>Average IO rate mb/sec: 38.236236572265625
> IO rate std deviation: 5.929484960036511
>    Test exec time sec: 1266.624
>
>*With federation:*
>----- TestDFSIO ----- : write
>           Date & time: Wed Apr 27 02:27:12 PDT 2011
>       Number of files: 1000
>Total MBytes processed: 30000.0
>     Throughput mb/sec: 42.17884674597227
>Average IO rate mb/sec: 43.11423873901367
> IO rate std deviation: 5.357057259968647
>    Test exec time sec: 1135.298
>
>
>On Tue, Apr 26, 2011 at 11:55 PM, suresh srinivas
><sr...@gmail.com>wrote:
>
>> Konstantin,
>>
>> Could you provide me a link to how this was done on a big feature, like
>>say
>> append and how benchmark info was captured? I am planning to run dfsio
>> tests, btw.
>>
>> Regards,
>> Suresh
>>
>>
>> On Tue, Apr 26, 2011 at 11:34 PM, suresh srinivas
>><sr...@gmail.com>wrote:
>>
>>> Konstantin,
>>>
>>> On Tue, Apr 26, 2011 at 10:26 PM, Konstantin Shvachko <
>>> shv.hadoop@gmail.com> wrote:
>>>
>>>> Suresh, Sanjay.
>>>>
>>>> 1. I asked for benchmarks many times over the course of different
>>>> discussions on the topic.
>>>> I don't see any numbers attached to jira, and I was getting the same
>>>> response,
>>>> Doug just got from you, guys: which is "why would the performance be
>>>> worse".
>>>> And this is not an argument for me.
>>>>
>>>
>>> We had done testing earlier and had found that performance had not
>>> degraded. We are waiting for our performance team to publish the
>>>official
>>> numbers to post it to the jira. Unfortunately they are busy qualifying
>>>2xx
>>> releases currently. I will get the perf numbers and post them.
>>>
>>>
>>>>
>>>> 2. I assume that merging requires a vote. I am sure people who know
>>>> bylaws
>>>> better than I do will correct me if it is not true.
>>>> Did I miss the vote?
>>>>
>>>
>>>
>>> As regards to voting, since I was not sure about the procedure, I had
>>> consulted Owen about it. He had indicated that voting is not
>>>necessary. If
>>> the right procedure is to call for voting, I will do so. Owen any
>>>comments?
>>>
>>>
>>>>
>>>> It feels like you are rushing this and are not doing what you would
>>>> expect
>>>> others to
>>>> do in the same position, and what has been done in the past for such
>>>> large
>>>> projects.
>>>>
>>>
>>> I am not trying to rush this or to skip the required procedure. I am
>>> not sure what the procedure is. Any pointers to it are appreciated.
>>>
>>>
>>>>
>>>> Thanks,
>>>> --Konstantin
>>>>
>>>>
>>>> On Tue, Apr 26, 2011 at 9:43 PM, Doug Cutting <cu...@apache.org>
>>>> wrote:
>>>>
>>>> > Suresh, Sanjay,
>>>> >
>>>> > Thank you very much for addressing my questions.
>>>> >
>>>> > Cheers,
>>>> >
>>>> > Doug
>>>> >
>>>> > On 04/26/2011 10:29 AM, suresh srinivas wrote:
>>>> > > Doug,
>>>> > >
>>>> > >
>>>> > >> 1. Can you please describe the significant advantages this
>>>>approach
>>>> has
>>>> > >> over a symlink-based approach?
>>>> > >
>>>> > > Federation is complementary with the symlink approach. You could
>>>>choose
>>>> to
>>>> > > provide integrated namespace using symlinks. However, client side
>>>> mount
>>>> > > tables seems a better approach for many reasons:
>>>> > > # Unlike symbolic links, client side mount tables can choose to
>>>>go to
>>>> > right
>>>> > > namenode based on configuration. This avoids unnecessary RPCs to
>>>>the
>>>> > > namenodes to discover the target of the symlink.
>>>> > > # The unavailability of a namenode where a symbolic link is
>>>> configured
>>>> > does
>>>> > > not affect reaching the symlink target.
>>>> > > # Symbolic links need not be configured on every namenode in the
>>>> cluster
>>>> > and
>>>> > > future changes to symlinks need not be propagated to multiple
>>>> namenodes.
>>>> > In
>>>> > > client side mount tables, this information is in a central
>>>> configuration.
>>>> > >
>>>> > > If a deployment still wants to use symbolic link, federation does
>>>>not
>>>> > > preclude it.
>>>> > >
>>>> > >> It seems to me that one could run multiple namenodes on separate
>>>> boxes
>>>> > > and run multiple datanode processes per storage box
>>>> > >
>>>> > > There are several advantages to using a single datanode:
>>>> > > # When you have a large number of namenodes (say 20), the cost of
>>>> running
>>>> > > separate datanodes in terms of process resources such as memory is
>>>> huge.
>>>> > > # The disk i/o management and storage utilization using a single
>>>> datanode
>>>> > is
>>>> > > much better, as it has a complete view of the storage.
>>>> > > # In the approach you are proposing, you have several clusters to
>>>> manage.
>>>> > > However with federation, all datanodes are in a single cluster;
>>>>with
>>>> > single
>>>> > > configuration and operationally easier to manage.
>>>> > >
>>>> > >> The patch modifies much of the logic of Hadoop's central
>>>>component,
>>>> upon
>>>> > > which the performance and reliability of most other components of
>>>>the
>>>> > > ecosystem depend.
>>>> > > That is not true.
>>>> > >
>>>> > > # Namenode is mostly unchanged in this feature.
>>>> > > # Read/write pipelines are unchanged.
>>>> > > # The changes are mainly in datanode:
>>>> > > #* the storage, FSDataset, Directory and Disk scanners now have
>>>> another
>>>> > > level to incorporate block pool ID into the hierarchy. This is
>>>>not a
>>>> > > significant change that should cause performance or stability
>>>> concerns.
>>>> > > #* datanodes use a separate thread per NN, just like the existing
>>>> thread
>>>> > > that communicates with NN.
>>>> > >
>>>> > >> Can you please tell me how this has been tested beyond unit
>>>>tests?
>>>> > > As regards to testing, we have passed 600+ tests. In hadoop, these
>>>>  tests
>>>> > > are mostly integration tests and not pure unit tests.
>>>> > >
>>>> > > While these tests have been extensive, we have also been testing
>>>>this
>>>> > branch
>>>> > > for last 4 months, with QA validation that reflects our production
>>>> > > environment. We have found the system to be stable, performing
>>>>well
>>>> and
>>>> > have
>>>> > > not found any blockers with the branch so far.
>>>> > >
>>>> > > HDFS-1052 has been open more than a year now. I had also sent an
>>>> email
>>>> > about
>>>> > > this merge around 2 months ago. There are 90 subtasks that have
>>>>been
>>>> > worked
>>>> > > on last couple of months under HDFS-1052. Given that there was
>>>>enough
>>>> > time
>>>> > > to ask these questions, your email a day before I am planning to
>>>> merge
>>>> > the
>>>> > > branch into trunk seems late!
>>>> > >
>>>> >
>>>>
>>>
>>>
>>>
>>> --
>>> Regards,
>>> Suresh
>>>
>>>
>>
>>
>> --
>> Regards,
>> Suresh
>>
>>
>
>
>-- 
>Regards,
>Suresh



Re: [Discuss] Merge federation branch HDFS-1052 into trunk

Posted by Devaraj Das <dd...@yahoo-inc.com>.
Good to see the performance improvements with federation. Curious to know whether they come from the associated refactoring?


On 4/27/11 10:02 AM, "suresh srinivas" <sr...@gmail.com> wrote:

I posted the TestDFSIO comparison with and without federation to HDFS-1052.
Please let me know if it addresses your concern. I am also adding it here:

TestDFSIO read tests
*Without federation:*
----- TestDFSIO ----- : read
           Date & time: Wed Apr 27 02:04:24 PDT 2011
       Number of files: 1000
Total MBytes processed: 30000.0
     Throughput mb/sec: 43.62329251162561
Average IO rate mb/sec: 44.619869232177734
 IO rate std deviation: 5.060306158158443
    Test exec time sec: 959.943

*With federation:*
----- TestDFSIO ----- : read
           Date & time: Wed Apr 27 02:43:10 PDT 2011
       Number of files: 1000
Total MBytes processed: 30000.0
     Throughput mb/sec: 45.657513857055456
Average IO rate mb/sec: 46.72107696533203
 IO rate std deviation: 5.455125923399539
    Test exec time sec: 924.922

TestDFSIO write tests
*Without federation:*
----- TestDFSIO ----- : write
           Date & time: Wed Apr 27 01:47:50 PDT 2011
       Number of files: 1000
Total MBytes processed: 30000.0
     Throughput mb/sec: 35.940755259031015
Average IO rate mb/sec: 38.236236572265625
 IO rate std deviation: 5.929484960036511
    Test exec time sec: 1266.624

*With federation:*
----- TestDFSIO ----- : write
           Date & time: Wed Apr 27 02:27:12 PDT 2011
       Number of files: 1000
Total MBytes processed: 30000.0
     Throughput mb/sec: 42.17884674597227
Average IO rate mb/sec: 43.11423873901367
 IO rate std deviation: 5.357057259968647
    Test exec time sec: 1135.298


On Tue, Apr 26, 2011 at 11:55 PM, suresh srinivas <sr...@gmail.com>wrote:

> Konstantin,
>
> Could you provide me a link to how this was done on a big feature, like say
> append and how benchmark info was captured? I am planning to run dfsio
> tests, btw.
>
> Regards,
> Suresh
>
>
> On Tue, Apr 26, 2011 at 11:34 PM, suresh srinivas <sr...@gmail.com>wrote:
>
>> Konstantin,
>>
>> On Tue, Apr 26, 2011 at 10:26 PM, Konstantin Shvachko <
>> shv.hadoop@gmail.com> wrote:
>>
>>> Suresh, Sanjay.
>>>
>>> 1. I asked for benchmarks many times over the course of different
>>> discussions on the topic.
>>> I don't see any numbers attached to jira, and I was getting the same
>>> response,
>>> Doug just got from you, guys: which is "why would the performance be
>>> worse".
>>> And this is not an argument for me.
>>>
>>
>> We had done testing earlier and had found that performance had not
>> degraded. We are waiting for our performance team to publish the official
>> numbers to post it to the jira. Unfortunately they are busy qualifying 2xx
>> releases currently. I will get the perf numbers and post them.
>>
>>
>>>
>>> 2. I assume that merging requires a vote. I am sure people who know
>>> bylaws
>>> better than I do will correct me if it is not true.
>>> Did I miss the vote?
>>>
>>
>>
>> As regards to voting, since I was not sure about the procedure, I had
>> consulted Owen about it. He had indicated that voting is not necessary. If
>> the right procedure is to call for voting, I will do so. Owen any comments?
>>
>>
>>>
>>> It feels like you are rushing this and are not doing what you would
>>> expect
>>> others to
>>> do in the same position, and what has been done in the past for such
>>> large
>>> projects.
>>>
>>
>> I am not trying to rush this or to skip the required procedure. I am
>> not sure what the procedure is. Any pointers to it are appreciated.
>>
>>
>>>
>>> Thanks,
>>> --Konstantin
>>>
>>>
>>> On Tue, Apr 26, 2011 at 9:43 PM, Doug Cutting <cu...@apache.org>
>>> wrote:
>>>
>>> > Suresh, Sanjay,
>>> >
>>> > Thank you very much for addressing my questions.
>>> >
>>> > Cheers,
>>> >
>>> > Doug
>>> >
>>> > On 04/26/2011 10:29 AM, suresh srinivas wrote:
>>> > > Doug,
>>> > >
>>> > >
>>> > >> 1. Can you please describe the significant advantages this approach
>>> has
>>> > >> over a symlink-based approach?
>>> > >
>>> > > Federation is complementary with the symlink approach. You could choose
>>> to
>>> > > provide integrated namespace using symlinks. However, client side
>>> mount
>>> > > tables seems a better approach for many reasons:
>>> > > # Unlike symbolic links, client side mount tables can choose to go to
>>> > right
>>> > > namenode based on configuration. This avoids unnecessary RPCs to the
>>> > > namenodes to discover the target of the symlink.
>>> > > # The unavailability of a namenode where a symbolic link is
>>> configured
>>> > does
>>> > > not affect reaching the symlink target.
>>> > > # Symbolic links need not be configured on every namenode in the
>>> cluster
>>> > and
>>> > > future changes to symlinks need not be propagated to multiple
>>> namenodes.
>>> > In
>>> > > client side mount tables, this information is in a central
>>> configuration.
>>> > >
>>> > > If a deployment still wants to use symbolic link, federation does not
>>> > > preclude it.
>>> > >
>>> > >> It seems to me that one could run multiple namenodes on separate
>>> boxes
>>> > > and run multiple datanode processes per storage box
>>> > >
>>> > > There are several advantages to using a single datanode:
>>> > > # When you have a large number of namenodes (say 20), the cost of
>>> running
>>> > > separate datanodes in terms of process resources such as memory is
>>> huge.
>>> > > # The disk i/o management and storage utilization using a single
>>> datanode
>>> > is
>>> > > much better, as it has a complete view of the storage.
>>> > > # In the approach you are proposing, you have several clusters to
>>> manage.
>>> > > However with federation, all datanodes are in a single cluster; with
>>> > single
>>> > > configuration and operationally easier to manage.
>>> > >
>>> > >> The patch modifies much of the logic of Hadoop's central component,
>>> upon
>>> > > which the performance and reliability of most other components of the
>>> > > ecosystem depend.
>>> > > That is not true.
>>> > >
>>> > > # Namenode is mostly unchanged in this feature.
>>> > > # Read/write pipelines are unchanged.
>>> > > # The changes are mainly in datanode:
>>> > > #* the storage, FSDataset, Directory and Disk scanners now have
>>> another
>>> > > level to incorporate block pool ID into the hierarchy. This is not a
>>> > > significant change that should cause performance or stability
>>> concerns.
>>> > > #* datanodes use a separate thread per NN, just like the existing
>>> thread
>>> > > that communicates with NN.
>>> > >
>>> > >> Can you please tell me how this has been tested beyond unit tests?
>>> > > As regards to testing, we have passed 600+ tests. In hadoop, these
>>>  tests
>>> > > are mostly integration tests and not pure unit tests.
>>> > >
>>> > > While these tests have been extensive, we have also been testing this
>>> > branch
>>> > > for last 4 months, with QA validation that reflects our production
>>> > > environment. We have found the system to be stable, performing well
>>> and
>>> > have
>>> > > not found any blockers with the branch so far.
>>> > >
>>> > > HDFS-1052 has been open more than a year now. I had also sent an
>>> email
>>> > about
>>> > > this merge around 2 months ago. There are 90 subtasks that have been
>>> > worked
>>> > > on last couple of months under HDFS-1052. Given that there was enough
>>> > time
>>> > > to ask these questions, your email a day before I am planning to
>>> merge
>>> > the
>>> > > branch into trunk seems late!
>>> > >
>>> >
>>>
>>
>>
>>
>> --
>> Regards,
>> Suresh
>>
>>
>
>
> --
> Regards,
> Suresh
>
>


--
Regards,
Suresh


Re: [Discuss] Merge federation branch HDFS-1052 into trunk

Posted by suresh srinivas <sr...@gmail.com>.
I posted the TestDFSIO comparison with and without federation to HDFS-1052.
Please let me know if it addresses your concern. I am also adding it here:

TestDFSIO read tests
*Without federation:*
----- TestDFSIO ----- : read
           Date & time: Wed Apr 27 02:04:24 PDT 2011
       Number of files: 1000
Total MBytes processed: 30000.0
     Throughput mb/sec: 43.62329251162561
Average IO rate mb/sec: 44.619869232177734
 IO rate std deviation: 5.060306158158443
    Test exec time sec: 959.943

*With federation:*
----- TestDFSIO ----- : read
           Date & time: Wed Apr 27 02:43:10 PDT 2011
       Number of files: 1000
Total MBytes processed: 30000.0
     Throughput mb/sec: 45.657513857055456
Average IO rate mb/sec: 46.72107696533203
 IO rate std deviation: 5.455125923399539
    Test exec time sec: 924.922

TestDFSIO write tests
*Without federation:*
----- TestDFSIO ----- : write
           Date & time: Wed Apr 27 01:47:50 PDT 2011
       Number of files: 1000
Total MBytes processed: 30000.0
     Throughput mb/sec: 35.940755259031015
Average IO rate mb/sec: 38.236236572265625
 IO rate std deviation: 5.929484960036511
    Test exec time sec: 1266.624

*With federation:*
----- TestDFSIO ----- : write
           Date & time: Wed Apr 27 02:27:12 PDT 2011
       Number of files: 1000
Total MBytes processed: 30000.0
     Throughput mb/sec: 42.17884674597227
Average IO rate mb/sec: 43.11423873901367
 IO rate std deviation: 5.357057259968647
    Test exec time sec: 1135.298
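The throughput deltas in the reports above can be computed mechanically; a small helper (hypothetical, assuming only the report format shown here) makes the write-test comparison explicit:

```python
import re

# One of the TestDFSIO write reports quoted above, trimmed to the
# fields the helper actually reads.
SAMPLE = """\
----- TestDFSIO ----- : write
       Number of files: 1000
Total MBytes processed: 30000.0
     Throughput mb/sec: 35.940755259031015
Average IO rate mb/sec: 38.236236572265625
"""

def throughput(report):
    """Pull the 'Throughput mb/sec' figure out of a TestDFSIO report."""
    match = re.search(r"Throughput mb/sec:\s*([\d.]+)", report)
    return float(match.group(1))

def pct_change(before, after):
    """Relative change in percent; positive means 'after' is faster."""
    return (after - before) / before * 100.0

# Write-test comparison from the numbers above (without vs. with federation).
print("%.1f%% change" % pct_change(throughput(SAMPLE), 42.17884674597227))
```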


On Tue, Apr 26, 2011 at 11:55 PM, suresh srinivas <sr...@gmail.com>wrote:

> Konstantin,
>
> Could you provide me a link to how this was done on a big feature, like say
> append and how benchmark info was captured? I am planning to run dfsio
> tests, btw.
>
> Regards,
> Suresh
>
>
> On Tue, Apr 26, 2011 at 11:34 PM, suresh srinivas <sr...@gmail.com>wrote:
>
>> Konstantin,
>>
>> On Tue, Apr 26, 2011 at 10:26 PM, Konstantin Shvachko <
>> shv.hadoop@gmail.com> wrote:
>>
>>> Suresh, Sanjay.
>>>
>>> 1. I asked for benchmarks many times over the course of different
>>> discussions on the topic.
>>> I don't see any numbers attached to jira, and I was getting the same
>>> response,
>>> Doug just got from you, guys: which is "why would the performance be
>>> worse".
>>> And this is not an argument for me.
>>>
>>
>> We had done testing earlier and had found that performance had not
>> degraded. We are waiting for our performance team to publish the official
>> numbers to post it to the jira. Unfortunately they are busy qualifying 2xx
>> releases currently. I will get the perf numbers and post them.
>>
>>
>>>
>>> 2. I assume that merging requires a vote. I am sure people who know
>>> bylaws
>>> better than I do will correct me if it is not true.
>>> Did I miss the vote?
>>>
>>
>>
>> As regards to voting, since I was not sure about the procedure, I had
>> consulted Owen about it. He had indicated that voting is not necessary. If
>> the right procedure is to call for voting, I will do so. Owen any comments?
>>
>>
>>>
>>> It feels like you are rushing this and are not doing what you would
>>> expect
>>> others to
>>> do in the same position, and what has been done in the past for such
>>> large
>>> projects.
>>>
>>
>> I am not trying to rush this or to skip the required procedure. I am
>> not sure what the procedure is. Any pointers to it are appreciated.
>>
>>
>>>
>>> Thanks,
>>> --Konstantin
>>>
>>>
>>> On Tue, Apr 26, 2011 at 9:43 PM, Doug Cutting <cu...@apache.org>
>>> wrote:
>>>
>>> > Suresh, Sanjay,
>>> >
>>> > Thank you very much for addressing my questions.
>>> >
>>> > Cheers,
>>> >
>>> > Doug
>>> >
>>> > On 04/26/2011 10:29 AM, suresh srinivas wrote:
>>> > > Doug,
>>> > >
>>> > >
>>> > >> 1. Can you please describe the significant advantages this approach
>>> has
>>> > >> over a symlink-based approach?
>>> > >
>>> > > Federation is complementary with the symlink approach. You could choose
>>> to
>>> > > provide integrated namespace using symlinks. However, client side
>>> mount
>>> > > tables seems a better approach for many reasons:
>>> > > # Unlike symbolic links, client side mount tables can choose to go to
>>> > right
>>> > > namenode based on configuration. This avoids unnecessary RPCs to the
>>> > > namenodes to discover the target of the symlink.
>>> > > # The unavailability of a namenode where a symbolic link is
>>> configured
>>> > does
>>> > > not affect reaching the symlink target.
>>> > > # Symbolic links need not be configured on every namenode in the
>>> cluster
>>> > and
>>> > > future changes to symlinks need not be propagated to multiple
>>> namenodes.
>>> > In
>>> > > client side mount tables, this information is in a central
>>> configuration.
>>> > >
>>> > > If a deployment still wants to use symbolic link, federation does not
>>> > > preclude it.
>>> > >
>>> > >> It seems to me that one could run multiple namenodes on separate
>>> boxes
>>> > > and run multiple datanode processes per storage box
>>> > >
>>> > > There are several advantages to using a single datanode:
>>> > > # When you have a large number of namenodes (say 20), the cost of
>>> running
>>> > > separate datanodes in terms of process resources such as memory is
>>> huge.
>>> > > # The disk i/o management and storage utilization using a single
>>> datanode
>>> > is
>>> > > much better, as it has a complete view of the storage.
>>> > > # In the approach you are proposing, you have several clusters to
>>> manage.
>>> > > However with federation, all datanodes are in a single cluster; with
>>> > single
>>> > > configuration and operationally easier to manage.
>>> > >
>>> > >> The patch modifies much of the logic of Hadoop's central component,
>>> upon
>>> > > which the performance and reliability of most other components of the
>>> > > ecosystem depend.
>>> > > That is not true.
>>> > >
>>> > > # Namenode is mostly unchanged in this feature.
>>> > > # Read/write pipelines are unchanged.
>>> > > # The changes are mainly in datanode:
>>> > > #* the storage, FSDataset, Directory and Disk scanners now have
>>> another
>>> > > level to incorporate block pool ID into the hierarchy. This is not a
>>> > > significant change that should cause performance or stability
>>> concerns.
>>> > > #* datanodes use a separate thread per NN, just like the existing
>>> thread
>>> > > that communicates with NN.
>>> > >
>>> > >> Can you please tell me how this has been tested beyond unit tests?
>>> > > As regards to testing, we have passed 600+ tests. In hadoop, these
>>>  tests
>>> > > are mostly integration tests and not pure unit tests.
>>> > >
>>> > > While these tests have been extensive, we have also been testing this
>>> > branch
>>> > > for last 4 months, with QA validation that reflects our production
>>> > > environment. We have found the system to be stable, performing well
>>> and
>>> > have
>>> > > not found any blockers with the branch so far.
>>> > >
>>> > > HDFS-1052 has been open more than a year now. I had also sent an
>>> email
>>> > about
>>> > > this merge around 2 months ago. There are 90 subtasks that have been
>>> > worked
>>> > > on last couple of months under HDFS-1052. Given that there was enough
>>> > time
>>> > > to ask these questions, your email a day before I am planning to
>>> merge
>>> > the
>>> > > branch into trunk seems late!
>>> > >
>>> >
>>>
>>
>>
>>
>> --
>> Regards,
>> Suresh
>>
>>
>
>
> --
> Regards,
> Suresh
>
>


-- 
Regards,
Suresh
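The extra block-pool level that federation adds to the datanode storage hierarchy (discussed in the quoted reply) can be illustrated with simple path construction. This is a simplified sketch; the exact directory names are assumptions rather than the precise on-disk format:

```python
import posixpath

def block_path(storage_dir, block_id, block_pool_id=None):
    """Locate a block file: pre-federation the pool is implicit; with
    federation each namespace's blocks live under their block-pool id."""
    if block_pool_id is None:  # single-namenode layout
        return posixpath.join(storage_dir, "current", "blk_%d" % block_id)
    # federated layout: one subtree per block pool
    return posixpath.join(storage_dir, "current", block_pool_id,
                          "current", "finalized", "blk_%d" % block_id)

print(block_path("/data/dfs", 42))
print(block_path("/data/dfs", 42, "BP-1-10.0.0.1-1303900000000"))
```

Keeping each namespace's blocks in its own subtree is what lets a single datanode serve several namenodes while keeping their storage cleanly separated on disk.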

Re: [Discuss] Merge federation branch HDFS-1052 into trunk

Posted by suresh srinivas <sr...@gmail.com>.
Konstantin,

Could you provide me a link to how this was done on a big feature, like say
append and how benchmark info was captured? I am planning to run dfsio
tests, btw.

Regards,
Suresh

On Tue, Apr 26, 2011 at 11:34 PM, suresh srinivas <sr...@gmail.com>wrote:

> Konstantin,
>
> On Tue, Apr 26, 2011 at 10:26 PM, Konstantin Shvachko <
> shv.hadoop@gmail.com> wrote:
>
>> Suresh, Sanjay.
>>
>> 1. I asked for benchmarks many times over the course of different
>> discussions on the topic.
>> I don't see any numbers attached to jira, and I was getting the same
>> response,
>> Doug just got from you, guys: which is "why would the performance be
>> worse".
>> And this is not an argument for me.
>>
>
> We had done testing earlier and had found that performance had not
> degraded. We are waiting for our performance team to publish the official
> numbers to post it to the jira. Unfortunately they are busy qualifying 2xx
> releases currently. I will get the perf numbers and post them.
>
>
>>
>> 2. I assume that merging requires a vote. I am sure people who know bylaws
>> better than I do will correct me if it is not true.
>> Did I miss the vote?
>>
>
>
> As regards to voting, since I was not sure about the procedure, I had
> consulted Owen about it. He had indicated that voting is not necessary. If
> the right procedure is to call for voting, I will do so. Owen any comments?
>
>
>>
>> It feels like you are rushing this and are not doing what you would expect
>> others to
>> do in the same position, and what has been done in the past for such large
>> projects.
>>
>
> I am not trying to rush this or to skip the required procedure. I am
> not sure what the procedure is. Any pointers to it are appreciated.
>
>
>>
>> Thanks,
>> --Konstantin
>>
>>
>> On Tue, Apr 26, 2011 at 9:43 PM, Doug Cutting <cu...@apache.org> wrote:
>>
>> > Suresh, Sanjay,
>> >
>> > Thank you very much for addressing my questions.
>> >
>> > Cheers,
>> >
>> > Doug
>> >
>> > On 04/26/2011 10:29 AM, suresh srinivas wrote:
>> > > Doug,
>> > >
>> > >
>> > >> 1. Can you please describe the significant advantages this approach
>> has
>> > >> over a symlink-based approach?
>> > >
>> > > Federation is complementary with the symlink approach. You could choose to
>> > > provide integrated namespace using symlinks. However, client side
>> mount
>> > > tables seems a better approach for many reasons:
>> > > # Unlike symbolic links, client side mount tables can choose to go to
>> > right
>> > > namenode based on configuration. This avoids unnecessary RPCs to the
>> > > namenodes to discover the target of the symlink.
>> > > # The unavailability of a namenode where a symbolic link is configured
>> > does
>> > > not affect reaching the symlink target.
>> > > # Symbolic links need not be configured on every namenode in the
>> cluster
>> > and
>> > > future changes to symlinks need not be propagated to multiple
>> namenodes.
>> > In
>> > > client side mount tables, this information is in a central
>> configuration.
>> > >
>> > > If a deployment still wants to use symbolic link, federation does not
>> > > preclude it.
>> > >
>> > >> It seems to me that one could run multiple namenodes on separate
>> boxes
>> > > and run multiple datanode processes per storage box
>> > >
>> > > There are several advantages to using a single datanode:
>> > > # When you have a large number of namenodes (say 20), the cost of
>> running
>> > > separate datanodes in terms of process resources such as memory is
>> huge.
>> > > # The disk i/o management and storage utilization using a single
>> datanode
>> > is
>> > > much better, as it has a complete view of the storage.
>> > > # In the approach you are proposing, you have several clusters to
>> manage.
>> > > However with federation, all datanodes are in a single cluster; with
>> > single
>> > > configuration and operationally easier to manage.
>> > >
>> > >> The patch modifies much of the logic of Hadoop's central component,
>> upon
>> > > which the performance and reliability of most other components of the
>> > > ecosystem depend.
>> > > That is not true.
>> > >
>> > > # Namenode is mostly unchanged in this feature.
>> > > # Read/write pipelines are unchanged.
>> > > # The changes are mainly in datanode:
>> > > #* the storage, FSDataset, Directory and Disk scanners now have
>> another
>> > > level to incorporate block pool ID into the hierarchy. This is not a
>> > > significant change that should cause performance or stability
>> concerns.
>> > > #* datanodes use a separate thread per NN, just like the existing
>> thread
>> > > that communicates with NN.
>> > >
>> > >> Can you please tell me how this has been tested beyond unit tests?
>> > > As regards to testing, we have passed 600+ tests. In hadoop, these
>>  tests
>> > > are mostly integration tests and not pure unit tests.
>> > >
>> > > While these tests have been extensive, we have also been testing this
>> > branch
>> > > for last 4 months, with QA validation that reflects our production
>> > > environment. We have found the system to be stable, performing well
>> and
>> > have
>> > > not found any blockers with the branch so far.
>> > >
>> > > HDFS-1052 has been open more than a year now. I had also sent an email
>> > about
>> > > this merge around 2 months ago. There are 90 subtasks that have been
>> > worked
>> > > on last couple of months under HDFS-1052. Given that there was enough
>> > time
>> > > to ask these questions, your email a day before I am planning to merge
>> > the
>> > > branch into trunk seems late!
>> > >
>> >
>>
>
>
>
> --
> Regards,
> Suresh
>
>


-- 
Regards,
Suresh

Re: [Discuss] Merge federation branch HDFS-1052 into trunk

Posted by suresh srinivas <sr...@gmail.com>.
If there are no further issues by tonight, I will merge the branch into
trunk.

Regards,
Suresh

On Wed, Apr 27, 2011 at 1:53 PM, Owen O'Malley <om...@apache.org> wrote:

> On Apr 26, 2011, at 11:34 PM, suresh srinivas wrote:
>
> >> 2. I assume that merging requires a vote. I am sure people who know
> bylaws
> >> better than I do will correct me if it is not true.
> >> Did I miss the vote?
> >>
> >
> >
> > As regards to voting, since I was not sure about the procedure, I had
> > consulted Owen about it. He had indicated that voting is not necessary.
> If
> > the right procedure is to call for voting, I will do so. Owen any
> comments?
>
> Merging a branch back in doesn't require an explicit vote. It is just a
> code commit. This discussion thread is enough to establish that there is
> consensus in the dev community.
>
> -- Owen




-- 
Regards,
Suresh

Re: [Discuss] Merge federation branch HDFS-1052 into trunk

Posted by suresh srinivas <sr...@gmail.com>.
We have been testing federation regularly with MapReduce on the yahoo-merge
branches. With trunk we missed the contrib (raid). The dependencies created by
the project splits have been crazy. Not sure how large changes can keep on top of
all these things.

I am working on fixing the raid contrib.

On Mon, May 2, 2011 at 2:44 PM, Todd Lipcon <to...@cloudera.com> wrote:

> Apparently this merge wasn't tested against MapReduce trunk at all -- MR
> trunk has been failing to compile for several days. Please see
> MAPREDUCE-2465. I attempted to fix it myself but don't have enough
> background in the new federation code or in RAID.
>
> -Todd
>
> On Thu, Apr 28, 2011 at 11:30 PM, Konstantin Shvachko
> <sh...@gmail.com>wrote:
>
> > Thanks for clarifying, Owen.
> > Should we have the bylaws somewhere on wiki?
> > --Konstantin
> >
> >
> > On Thu, Apr 28, 2011 at 1:33 PM, Owen O'Malley <om...@apache.org>
> wrote:
> >
> > > On Apr 27, 2011, at 10:12 PM, Konstantin Shvachko wrote:
> > >
> > > > The question is whether this is a
> > > > * Code Change,
> > > > which requires Lazy consensus of active committers or a
> > > > * Adoption of New Codebase,
> > > > which needs Lazy 2/3 majority of PMC members
> > >
> > > This is a code change, just like all of our jiras. The standard rules
> of
> > at
> > > least one +1 on the jira and no -1's apply.
> > >
> > > Adoption of new codebase is adopting a new subproject or completely
> > > replacing trunk.
> > >
> > > > Lazy consensus requires 3 binding +1 votes and no binding vetoes.
> > >
> > > This was clarified in the bylaws back in November.
> > >
> > >
> > >
> >
> http://mail-archives.apache.org/mod_mbox/hadoop-general/201011.mbox/%3C159E99C4-B71C-437E-9640-AA24C50D636E@apache.org%3E
> > >
> > > Where it was modified to:
> > >
> > > Lazy consensus of active committers, but with a minimum of
> > > one +1. The code can be committed after the first +1.
> > >
> > > -- Owen
> >
>
>
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
>



-- 
Regards,
Suresh

Re: [Discuss] Merge federation branch HDFS-1052 into trunk

Posted by Todd Lipcon <to...@cloudera.com>.
Apparently this merge wasn't tested against MapReduce trunk at all -- MR
trunk has been failing to compile for several days. Please see
MAPREDUCE-2465. I attempted to fix it myself but don't have enough
background in the new federation code or in RAID.

-Todd

On Thu, Apr 28, 2011 at 11:30 PM, Konstantin Shvachko
<sh...@gmail.com>wrote:

> Thanks for clarifying, Owen.
> Should we have the bylaws somewhere on wiki?
> --Konstantin
>
>
> On Thu, Apr 28, 2011 at 1:33 PM, Owen O'Malley <om...@apache.org> wrote:
>
> > On Apr 27, 2011, at 10:12 PM, Konstantin Shvachko wrote:
> >
> > > The question is whether this is a
> > > * Code Change,
> > > which requires Lazy consensus of active committers or a
> > > * Adoption of New Codebase,
> > > which needs Lazy 2/3 majority of PMC members
> >
> > This is a code change, just like all of our jiras. The standard rules of
> at
> > least one +1 on the jira and no -1's apply.
> >
> > Adoption of new codebase is adopting a new subproject or completely
> > replacing trunk.
> >
> > > Lazy consensus requires 3 binding +1 votes and no binding vetoes.
> >
> > This was clarified in the bylaws back in November.
> >
> >
> >
> http://mail-archives.apache.org/mod_mbox/hadoop-general/201011.mbox/%3C159E99C4-B71C-437E-9640-AA24C50D636E@apache.org%3E
> >
> > Where it was modified to:
> >
> > Lazy consensus of active committers, but with a minimum of
> > one +1. The code can be committed after the first +1.
> >
> > -- Owen
>



-- 
Todd Lipcon
Software Engineer, Cloudera

Re: [Discuss] Merge federation branch HDFS-1052 into trunk

Posted by Konstantin Shvachko <sh...@gmail.com>.
Thanks for clarifying, Owen.
Should we have the bylaws somewhere on wiki?
--Konstantin


On Thu, Apr 28, 2011 at 1:33 PM, Owen O'Malley <om...@apache.org> wrote:

> On Apr 27, 2011, at 10:12 PM, Konstantin Shvachko wrote:
>
> > The question is whether this is a
> > * Code Change,
> > which requires Lazy consensus of active committers or a
> > * Adoption of New Codebase,
> > which needs Lazy 2/3 majority of PMC members
>
> This is a code change, just like all of our jiras. The standard rules of at
> least one +1 on the jira and no -1's apply.
>
> Adoption of new codebase is adopting a new subproject or completely
> replacing trunk.
>
> > Lazy consensus requires 3 binding +1 votes and no binding vetoes.
>
> This was clarified in the bylaws back in November.
>
>
> http://mail-archives.apache.org/mod_mbox/hadoop-general/201011.mbox/%3C159E99C4-B71C-437E-9640-AA24C50D636E@apache.org%3E
>
> Where it was modified to:
>
> Lazy consensus of active committers, but with a minimum of
> one +1. The code can be committed after the first +1.
>
> -- Owen

Re: [Discuss] Merge federation branch HDFS-1052 into trunk

Posted by suresh srinivas <sr...@gmail.com>.
Owen, thanks for clarification.

I have attached the patch to the jira HDFS-1052. Please use the jira to cast
your vote or post objections. If you have objections, please be specific about
how I can address them and move forward with this issue.

Regards,
Suresh

On Thu, Apr 28, 2011 at 1:33 PM, Owen O'Malley <om...@apache.org> wrote:

> On Apr 27, 2011, at 10:12 PM, Konstantin Shvachko wrote:
>
> > The question is whether this is a
> > * Code Change,
> > which requires Lazy consensus of active committers or a
> > * Adoption of New Codebase,
> > which needs Lazy 2/3 majority of PMC members
>
> This is a code change, just like all of our jiras. The standard rules of at
> least one +1 on the jira and no -1's apply.
>
> Adoption of new codebase is adopting a new subproject or completely
> replacing trunk.
>
> > Lazy consensus requires 3 binding +1 votes and no binding vetoes.
>
> This was clarified in the bylaws back in November.
>
>
> http://mail-archives.apache.org/mod_mbox/hadoop-general/201011.mbox/%3C159E99C4-B71C-437E-9640-AA24C50D636E@apache.org%3E
>
> Where it was modified to:
>
> Lazy consensus of active committers, but with a minimum of
> one +1. The code can be committed after the first +1.
>
> -- Owen




-- 
Regards,
Suresh

Re: [Discuss] Merge federation branch HDFS-1052 into trunk

Posted by Owen O'Malley <om...@apache.org>.
On Apr 27, 2011, at 10:12 PM, Konstantin Shvachko wrote:

> The question is whether this is a
> * Code Change,
> which requires Lazy consensus of active committers or a
> * Adoption of New Codebase,
> which needs Lazy 2/3 majority of PMC members

This is a code change, just like all of our jiras. The standard rules of at least one +1 on the jira and no -1's apply.

Adoption of new codebase is adopting a new subproject or completely replacing trunk.

> Lazy consensus requires 3 binding +1 votes and no binding vetoes.

This was clarified in the bylaws back in November.

http://mail-archives.apache.org/mod_mbox/hadoop-general/201011.mbox/%3C159E99C4-B71C-437E-9640-AA24C50D636E@apache.org%3E

Where it was modified to:

Lazy consensus of active committers, but with a minimum of
one +1. The code can be committed after the first +1.

-- Owen

Re: [Discuss] Merge federation branch HDFS-1052 into trunk

Posted by Konstantin Shvachko <sh...@gmail.com>.
Owen,

The question is whether this is a
* Code Change,
which requires Lazy consensus of active committers or a
* Adoption of New Codebase,
which needs Lazy 2/3 majority of PMC members

Lazy consensus requires 3 binding +1 votes and no binding vetoes.

Looking at the current bylaws, they tell me this needs a vote.
Did I miss anything?

Konstantin


On Wed, Apr 27, 2011 at 1:53 PM, Owen O'Malley <om...@apache.org> wrote:

> On Apr 26, 2011, at 11:34 PM, suresh srinivas wrote:
>
> >> 2. I assume that merging requires a vote. I am sure people who know
> bylaws
> >> better than I do will correct me if it is not true.
> >> Did I miss the vote?
> >>
> >
> >
> > As regards to voting, since I was not sure about the procedure, I had
> > consulted Owen about it. He had indicated that voting is not necessary.
> If
> > the right procedure is to call for voting, I will do so. Owen any
> comments?
>
> Merging a branch back in doesn't require an explicit vote. It is just a
> code commit. This discussion thread is enough to establish that there is
> consensus in the dev community.
>
> -- Owen

Re: [Discuss] Merge federation branch HDFS-1052 into trunk

Posted by Owen O'Malley <om...@apache.org>.
On Apr 26, 2011, at 11:34 PM, suresh srinivas wrote:

>> 2. I assume that merging requires a vote. I am sure people who know bylaws
>> better than I do will correct me if it is not true.
>> Did I miss the vote?
>> 
> 
> 
> As regards to voting, since I was not sure about the procedure, I had
> consulted Owen about it. He had indicated that voting is not necessary. If
> the right procedure is to call for voting, I will do so. Owen any comments?

Merging a branch back in doesn't require an explicit vote. It is just a code commit. This discussion thread is enough to establish that there is consensus in the dev community.

-- Owen

Re: [Discuss] Merge federation branch HDFS-1052 into trunk

Posted by suresh srinivas <sr...@gmail.com>.
Konstantin,

On Tue, Apr 26, 2011 at 10:26 PM, Konstantin Shvachko
<sh...@gmail.com>wrote:

> Suresh, Sanjay.
>
> 1. I asked for benchmarks many times over the course of different
> discussions on the topic.
> I don't see any numbers attached to jira, and I was getting the same
> response,
> Doug just got from you, guys: which is "why would the performance be
> worse".
> And this is not an argument for me.
>

We had done testing earlier and found that performance had not degraded.
We are waiting for our performance team to publish the official numbers so I can
post them to the jira. Unfortunately they are busy qualifying 2xx releases
currently. I will get the perf numbers and post them.


>
> 2. I assume that merging requires a vote. I am sure people who know bylaws
> better than I do will correct me if it is not true.
> Did I miss the vote?
>


As regards to voting, since I was not sure about the procedure, I had
consulted Owen about it. He had indicated that voting is not necessary. If
the right procedure is to call for voting, I will do so. Owen any comments?


>
> It feels like you are rushing this and are not doing what you would expect
> others to
> do in the same position, and what has been done in the past for such large
> projects.
>

I am not trying to rush here or to skip the required procedure. I am just not
sure what the procedure is. Any pointers to it are appreciated.


>
> Thanks,
> --Konstantin
>
>
> On Tue, Apr 26, 2011 at 9:43 PM, Doug Cutting <cu...@apache.org> wrote:
>
> > Suresh, Sanjay,
> >
> > Thank you very much for addressing my questions.
> >
> > Cheers,
> >
> > Doug
> >
> > On 04/26/2011 10:29 AM, suresh srinivas wrote:
> > > Doug,
> > >
> > >
> > >> 1. Can you please describe the significant advantages this approach
> has
> > >> over a symlink-based approach?
> > >
> > > Federation is complementary with symlink approach. You could choose to
> > > provide integrated namespace using symlinks. However, client side mount
> > > tables seems a better approach for many reasons:
> > > # Unlike symbolic links, client side mount tables can choose to go to
> > right
> > > namenode based on configuration. This avoids unnecessary RPCs to the
> > > namenodes to discover the target of the symlink.
> > > # The unavailability of a namenode where a symbolic link is configured
> > does
> > > not affect reaching the symlink target.
> > > # Symbolic links need not be configured on every namenode in the
> cluster
> > and
> > > future changes to symlinks need not be propagated to multiple
> namenodes.
> > In
> > > client side mount tables, this information is in a central
> configuration.
> > >
> > > If a deployment still wants to use symbolic link, federation does not
> > > preclude it.
> > >
> > >> It seems to me that one could run multiple namenodes on separate boxes
> > > and run multiple datanode processes per storage box
> > >
> > > There are several advantages to using a single datanode:
> > > # When you have large number of namenodes (say 20), the cost of running
> > > separate datanodes in terms of process resources such as memory is
> huge.
> > > # The disk i/o management and storage utilization using a single
> datanode
> > is
> > > much better, as it has a complete view of the storage.
> > > # In the approach you are proposing, you have several clusters to
> manage.
> > > However with federation, all datanodes are in a single cluster; with
> > single
> > > configuration and operationally easier to manage.
> > >
> > >> The patch modifies much of the logic of Hadoop's central component,
> upon
> > > which the performance and reliability of most other components of the
> > > ecosystem depend.
> > > That is not true.
> > >
> > > # Namenode is mostly unchanged in this feature.
> > > # Read/write pipelines are unchanged.
> > > # The changes are mainly in datanode:
> > > #* the storage, FSDataset, Directory and Disk scanners now have another
> > > level to incorporate block pool ID into the hierarchy. This is not a
> > > significant change that should cause performance or stability concerns.
> > > #* datanodes use a separate thread per NN, just like the existing
> thread
> > > that communicates with NN.
> > >
> > >> Can you please tell me how this has been tested beyond unit tests?
> > > As regards to testing, we have passed 600+ tests. In hadoop, these
>  tests
> > > are mostly integration tests and not pure unit tests.
> > >
> > > While these tests have been extensive, we have also been testing this
> > branch
> > > for last 4 months, with QA validation that reflects our production
> > > environment. We have found the system to be stable, performing well and
> > have
> > > not found any blockers with the branch so far.
> > >
> > > HDFS-1052 has been open more than a year now. I had also sent an email
> > about
> > > this merge around 2 months ago. There are 90 subtasks that have been
> > worked
> > > on last couple of months under HDFS-1052. Given that there was enough
> > time
> > > to ask these questions, your email a day before I am planning to merge
> > the
> > > branch into trunk seems late!
> > >
> >
>



-- 
Regards,
Suresh

Re: [Discuss] Merge federation branch HDFS-1052 into trunk

Posted by Konstantin Shvachko <sh...@gmail.com>.
Suresh, Sanjay.

1. I asked for benchmarks many times over the course of different
discussions on the topic.
I don't see any numbers attached to the jira, and I was getting the same
response Doug just got from you guys: "why would the performance be worse".
And this is not an argument for me.

2. I assume that merging requires a vote. I am sure people who know bylaws
better than I do will correct me if it is not true.
Did I miss the vote?

It feels like you are rushing this and are not doing what you would expect
others to
do in the same position, and what has been done in the past for such large
projects.

Thanks,
--Konstantin


On Tue, Apr 26, 2011 at 9:43 PM, Doug Cutting <cu...@apache.org> wrote:

> Suresh, Sanjay,
>
> Thank you very much for addressing my questions.
>
> Cheers,
>
> Doug
>
> On 04/26/2011 10:29 AM, suresh srinivas wrote:
> > Doug,
> >
> >
> >> 1. Can you please describe the significant advantages this approach has
> >> over a symlink-based approach?
> >
> > Federation is complementary with symlink approach. You could choose to
> > provide integrated namespace using symlinks. However, client side mount
> > tables seems a better approach for many reasons:
> > # Unlike symbolic links, client side mount tables can choose to go to
> right
> > namenode based on configuration. This avoids unnecessary RPCs to the
> > namenodes to discover the target of the symlink.
> > # The unavailability of a namenode where a symbolic link is configured
> does
> > not affect reaching the symlink target.
> > # Symbolic links need not be configured on every namenode in the cluster
> and
> > future changes to symlinks need not be propagated to multiple namenodes.
> In
> > client side mount tables, this information is in a central configuration.
> >
> > If a deployment still wants to use symbolic link, federation does not
> > preclude it.
> >
> >> It seems to me that one could run multiple namenodes on separate boxes
> > and run multiple datanode processes per storage box
> >
> > There are several advantages to using a single datanode:
> > # When you have large number of namenodes (say 20), the cost of running
> > separate datanodes in terms of process resources such as memory is huge.
> > # The disk i/o management and storage utilization using a single datanode
> is
> > much better, as it has a complete view of the storage.
> > # In the approach you are proposing, you have several clusters to manage.
> > However with federation, all datanodes are in a single cluster; with
> single
> > configuration and operationally easier to manage.
> >
> >> The patch modifies much of the logic of Hadoop's central component, upon
> > which the performance and reliability of most other components of the
> > ecosystem depend.
> > That is not true.
> >
> > # Namenode is mostly unchanged in this feature.
> > # Read/write pipelines are unchanged.
> > # The changes are mainly in datanode:
> > #* the storage, FSDataset, Directory and Disk scanners now have another
> > level to incorporate block pool ID into the hierarchy. This is not a
> > significant change that should cause performance or stability concerns.
> > #* datanodes use a separate thread per NN, just like the existing thread
> > that communicates with NN.
> >
> >> Can you please tell me how this has been tested beyond unit tests?
> > As regards to testing, we have passed 600+ tests. In hadoop, these  tests
> > are mostly integration tests and not pure unit tests.
> >
> > While these tests have been extensive, we have also been testing this
> branch
> > for last 4 months, with QA validation that reflects our production
> > environment. We have found the system to be stable, performing well and
> have
> > not found any blockers with the branch so far.
> >
> > HDFS-1052 has been open more than a year now. I had also sent an email
> about
> > this merge around 2 months ago. There are 90 subtasks that have been
> worked
> > on last couple of months under HDFS-1052. Given that there was enough
> time
> > to ask these questions, your email a day before I am planning to merge
> the
> > branch into trunk seems late!
> >
>

Re: [Discuss] Merge federation branch HDFS-1052 into trunk

Posted by Doug Cutting <cu...@apache.org>.
Suresh, Sanjay,

Thank you very much for addressing my questions.

Cheers,

Doug

On 04/26/2011 10:29 AM, suresh srinivas wrote:
> Doug,
> 
> 
>> 1. Can you please describe the significant advantages this approach has
>> over a symlink-based approach?
> 
> Federation is complementary with symlink approach. You could choose to
> provide integrated namespace using symlinks. However, client side mount
> tables seems a better approach for many reasons:
> # Unlike symbolic links, client side mount tables can choose to go to right
> namenode based on configuration. This avoids unnecessary RPCs to the
> namenodes to discover the target of the symlink.
> # The unavailability of a namenode where a symbolic link is configured does
> not affect reaching the symlink target.
> # Symbolic links need not be configured on every namenode in the cluster and
> future changes to symlinks need not be propagated to multiple namenodes. In
> client side mount tables, this information is in a central configuration.
> 
> If a deployment still wants to use symbolic link, federation does not
> preclude it.
> 
>> It seems to me that one could run multiple namenodes on separate boxes
> and run multiple datanode processes per storage box
> 
> There are several advantages to using a single datanode:
> # When you have large number of namenodes (say 20), the cost of running
> separate datanodes in terms of process resources such as memory is huge.
> # The disk i/o management and storage utilization using a single datanode is
> much better, as it has a complete view of the storage.
> # In the approach you are proposing, you have several clusters to manage.
> However with federation, all datanodes are in a single cluster; with single
> configuration and operationally easier to manage.
> 
>> The patch modifies much of the logic of Hadoop's central component, upon
> which the performance and reliability of most other components of the
> ecosystem depend.
> That is not true.
> 
> # Namenode is mostly unchanged in this feature.
> # Read/write pipelines are unchanged.
> # The changes are mainly in datanode:
> #* the storage, FSDataset, Directory and Disk scanners now have another
> level to incorporate block pool ID into the hierarchy. This is not a
> significant change that should cause performance or stability concerns.
> #* datanodes use a separate thread per NN, just like the existing thread
> that communicates with NN.
> 
>> Can you please tell me how this has been tested beyond unit tests?
> As regards to testing, we have passed 600+ tests. In hadoop, these  tests
> are mostly integration tests and not pure unit tests.
> 
> While these tests have been extensive, we have also been testing this branch
> for last 4 months, with QA validation that reflects our production
> environment. We have found the system to be stable, performing well and have
> not found any blockers with the branch so far.
> 
> HDFS-1052 has been open more than a year now. I had also sent an email about
> this merge around 2 months ago. There are 90 subtasks that have been worked
> on last couple of months under HDFS-1052. Given that there was enough time
> to ask these questions, your email a day before I am planning to merge the
> branch into trunk seems late!
> 

Re: [Discuss] Merge federation branch HDFS-1052 into trunk

Posted by suresh srinivas <sr...@gmail.com>.
Doug, please reply back. I am planning to commit this by tonight, as I would
like to avoid unnecessary merge work and also avoid having to redo the merge
if SVN is re-organized.

On Tue, Apr 26, 2011 at 10:29 AM, suresh srinivas <sr...@gmail.com>wrote:

> Doug,
>
>
>> 1. Can you please describe the significant advantages this approach has
>> over a symlink-based approach?
>
> Federation is complementary with symlink approach. You could choose to
> provide integrated namespace using symlinks. However, client side mount
> tables seems a better approach for many reasons:
> # Unlike symbolic links, client side mount tables can choose to go to right
> namenode based on configuration. This avoids unnecessary RPCs to the
> namenodes to discover the target of the symlink.
> # The unavailability of a namenode where a symbolic link is configured does
> not affect reaching the symlink target.
> # Symbolic links need not be configured on every namenode in the cluster
> and future changes to symlinks need not be propagated to multiple namenodes.
> In client side mount tables, this information is in a central configuration.
>
> If a deployment still wants to use symbolic link, federation does not
> preclude it.
>
>
> > It seems to me that one could run multiple namenodes on separate boxes
> and run multiple datanode processes per storage box
>
> There are several advantages to using a single datanode:
> # When you have large number of namenodes (say 20), the cost of running
> separate datanodes in terms of process resources such as memory is huge.
> # The disk i/o management and storage utilization using a single datanode
> is much better, as it has a complete view of the storage.
> # In the approach you are proposing, you have several clusters to manage.
> However with federation, all datanodes are in a single cluster; with single
> configuration and operationally easier to manage.
>
> > The patch modifies much of the logic of Hadoop's central component, upon
> which the performance and reliability of most other components of the
> ecosystem depend.
> That is not true.
>
> # Namenode is mostly unchanged in this feature.
> # Read/write pipelines are unchanged.
> # The changes are mainly in datanode:
> #* the storage, FSDataset, Directory and Disk scanners now have another
> level to incorporate block pool ID into the hierarchy. This is not a
> significant change that should cause performance or stability concerns.
> #* datanodes use a separate thread per NN, just like the existing thread
> that communicates with NN.
>
> > Can you please tell me how this has been tested beyond unit tests?
> As regards to testing, we have passed 600+ tests. In hadoop, these  tests
> are mostly integration tests and not pure unit tests.
>
> While these tests have been extensive, we have also been testing this
> branch for last 4 months, with QA validation that reflects our production
> environment. We have found the system to be stable, performing well and have
> not found any blockers with the branch so far.
>
> HDFS-1052 has been open more than a year now. I had also sent an email
> about this merge around 2 months ago. There are 90 subtasks that have been
> worked on last couple of months under HDFS-1052. Given that there was enough
> time to ask these questions, your email a day before I am planning to merge
> the branch into trunk seems late!
>
> --
> Regards,
> Suresh
>
>


-- 
Regards,
Suresh

Re: [Discuss] Merge federation branch HDFS-1052 into trunk

Posted by suresh srinivas <sr...@gmail.com>.
Doug,


> 1. Can you please describe the significant advantages this approach has
> over a symlink-based approach?

Federation is complementary with the symlink approach. You could choose to
provide an integrated namespace using symlinks. However, client side mount
tables seem a better approach for many reasons:
# Unlike symbolic links, client side mount tables can go to the right
namenode based on configuration. This avoids unnecessary RPCs to the
namenodes to discover the target of the symlink.
# The unavailability of a namenode where a symbolic link is configured does
not affect reaching the symlink target.
# Symbolic links need not be configured on every namenode in the cluster and
future changes to symlinks need not be propagated to multiple namenodes. With
client side mount tables, this information is in a central configuration.

If a deployment still wants to use symbolic links, federation does not
preclude them.
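To make the resolution difference concrete, here is a minimal sketch of how a client-side mount table can pick the right namenode by longest-prefix match over a static configuration, with no RPC involved. The mount points and namenode URIs below are made-up examples for illustration, not the actual ViewFs implementation:

```python
# Illustrative sketch of client-side mount-table resolution: the client
# picks a namenode by longest-prefix match over a static mount table,
# so no RPC is needed to discover where a path lives (unlike a symlink,
# whose target is only learned by asking a namenode).
# The mount points and namenode URIs are hypothetical examples.

MOUNT_TABLE = {
    "/user": "hdfs://nn1",
    "/data": "hdfs://nn2",
    "/data/archive": "hdfs://nn3",
}

def resolve(path):
    """Return (namenode, remaining path) for the longest matching mount point."""
    best = None
    for mount, nn in MOUNT_TABLE.items():
        if path == mount or path.startswith(mount + "/"):
            if best is None or len(mount) > len(best[0]):
                best = (mount, nn)
    if best is None:
        raise ValueError("no mount point covers " + path)
    mount, nn = best
    return nn, path[len(mount):] or "/"

print(resolve("/data/archive/2011/part-0"))  # longest prefix wins, so nn3
print(resolve("/user/suresh"))
```

Note that when two mount points overlap (here /data and /data/archive), the longest prefix decides, which is what lets a central configuration carve one namespace across namenodes.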

> It seems to me that one could run multiple namenodes on separate boxes
and run multiple datanode processes per storage box

There are several advantages to using a single datanode:
# When you have a large number of namenodes (say 20), the cost of running
separate datanodes in terms of process resources such as memory is huge.
# The disk i/o management and storage utilization using a single datanode is
much better, as it has a complete view of the storage.
# In the approach you are proposing, you have several clusters to manage.
However, with federation, all datanodes are in a single cluster, with a single
configuration, which is operationally easier to manage.

> The patch modifies much of the logic of Hadoop's central component, upon
which the performance and reliability of most other components of the
ecosystem depend.
That is not true.

# Namenode is mostly unchanged in this feature.
# Read/write pipelines are unchanged.
# The changes are mainly in datanode:
#* the storage, FSDataset, Directory and Disk scanners now have another
level to incorporate block pool ID into the hierarchy. This is not a
significant change that should cause performance or stability concerns.
#* datanodes use a separate thread per NN, just like the existing thread
that communicates with NN.
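To illustrate the extra level mentioned above: each configured datanode storage directory gains a per-block-pool subtree, roughly as sketched below. This is an illustrative layout only; the exact directory names are an assumption, not taken from this thread:

```
<dfs.data.dir>/
  current/
    BP-<blockpool-id-1>/      # one subtree per namenode's block pool
      current/
        finalized/            # blk_<id> and blk_<id>.meta replica files
        rbw/                  # replicas being written
    BP-<blockpool-id-2>/
      current/
        ...
```

The block pool ID keys everything one level up, so the same disks can hold replicas for many namespaces without the per-namespace datanode processes Doug describes.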

> Can you please tell me how this has been tested beyond unit tests?
As regards testing, we have passed 600+ tests. In Hadoop, these tests
are mostly integration tests and not pure unit tests.

While these tests have been extensive, we have also been testing this branch
for the last 4 months, with QA validation that reflects our production
environment. We have found the system to be stable and performing well, and
we have not found any blockers in the branch so far.

HDFS-1052 has been open for more than a year now. I had also sent an email about
this merge around 2 months ago. There are 90 subtasks that have been worked
on over the last couple of months under HDFS-1052. Given that there was enough time
to ask these questions, your email a day before I am planning to merge the
branch into trunk seems late!

-- 
Regards,
Suresh

Re: [Discuss] Merge federation branch HDFS-1052 into trunk

Posted by Doug Cutting <cu...@apache.org>.
On 04/22/2011 09:48 AM, Suresh Srinivas wrote:
> A few weeks ago, I had sent an email about the progress of HDFS
> federation development in HDFS-1052 branch. I am happy to announce
> that all the tasks related to this feature development is complete
> and it is ready to be integrated into trunk.

A couple of questions:

1. Can you please describe the significant advantages this approach has
over a symlink-based approach?

It seems to me that one could run multiple namenodes on separate boxes
and run multiple datanode processes per storage box, configured with
something like:

first datanode process configuration
  fs.default.name = hdfs://nn1/
  dfs.data.dir = /drive1/nn1/,drive2/nn1/...

second datanode process configuration
  fs.default.name = hdfs://nn2/
  dfs.data.dir = /drive1/nn2/,drive2/nn2/...

...

Then symlinks could be used between nn1, nn2, etc to provide a
reasonably unified namespace.  From the benefits listed in the design
document it is not clear to me what the clear, substantial benefits are
over such a configuration.

2. How much testing has been performed on this?  The patch modifies much
of the logic of Hadoop's central component, upon which the performance
and reliability of most other components of the ecosystem depend.  It
seems to me that such an invasive change should be well tested before it
is merged to trunk.  Can you please tell me how this has been tested
beyond unit tests?

Thanks!

Doug

Re: [Discuss] Merge federation branch HDFS-1052 into trunk

Posted by Dhruba Borthakur <dh...@gmail.com>.
Given that we will be re-organizing the svn tree very soon and the fact that
the design and most of the implementation is complete, let's merge it into
trunk!

-dhruba

On Fri, Apr 22, 2011 at 9:48 AM, Suresh Srinivas <su...@yahoo-inc.com>wrote:

> A few weeks ago, I had sent an email about the progress of HDFS federation
> development in HDFS-1052 branch. I am happy to announce that all the tasks
> related to this feature development is complete and it is ready to be
> integrated into trunk.
>
> I have a merge patch attached to HDFS-1052 jira. All Hudson tests pass
> except for two test failures. We will fix these unit test failures in trunk,
> post merge. I plan on completing merge to trunk early next week. I would
> like to do this ASAP to avoid having to keep the patch up to date (which has
> been time consuming). This also avoids need for re-merging, due to SVN
> changes proposed by Nigel, scheduled late next week. Comments are welcome.
>
> Regards,
> Suresh
>



-- 
Connect to me at http://www.facebook.com/dhruba

Re: [Discuss] Merge federation branch HDFS-1052 into trunk

Posted by suresh srinivas <sr...@gmail.com>.
Thanks Eli.

The merge of the latest changes in trunk is not straightforward. I will get it
done tonight and post a new patch. That means the earliest the merge can
happen is tomorrow.


On Wed, Apr 27, 2011 at 2:36 PM, Eli Collins <el...@cloudera.com> wrote:

> Hey Suresh,
>
> Do you plan to update the patch on HDFS-1052 soon?  Trunk has moved on
> a little bit since the last patch.  I assume we vote on the patch
> there. I think additional review feedback (beyond what's already been
> done) can be handled after the code is merged, I know what a pain it
> is to keep a patch out of mainline.  What I've looked at so far looks
> great btw.
>
> For those of you who missed the design doc you should check it out:
>
> https://issues.apache.org/jira/secure/attachment/12442372/Mulitple+Namespaces5.pdf
>
> Thanks,
> Eli
>
> On Fri, Apr 22, 2011 at 9:48 AM, Suresh Srinivas <su...@yahoo-inc.com>
> wrote:
> > A few weeks ago, I had sent an email about the progress of HDFS
> federation development in HDFS-1052 branch. I am happy to announce that all
> the tasks related to this feature development is complete and it is ready to
> be integrated into trunk.
> >
> > I have a merge patch attached to HDFS-1052 jira. All Hudson tests pass
> except for two test failures. We will fix these unit test failures in trunk,
> post merge. I plan on completing merge to trunk early next week. I would
> like to do this ASAP to avoid having to keep the patch up to date (which has
> been time consuming). This also avoids need for re-merging, due to SVN
> changes proposed by Nigel, scheduled late next week. Comments are welcome.
> >
> > Regards,
> > Suresh
> >
>



-- 
Regards,
Suresh

Re: [Discuss] Merge federation branch HDFS-1052 into trunk

Posted by Eli Collins <el...@cloudera.com>.
Hey Suresh,

Do you plan to update the patch on HDFS-1052 soon?  Trunk has moved on
a little bit since the last patch.  I assume we vote on the patch
there. I think additional review feedback (beyond what's already been
done) can be handled after the code is merged, I know what a pain it
is to keep a patch out of mainline.  What I've looked at so far looks
great btw.

For those of you who missed the design doc you should check it out:
https://issues.apache.org/jira/secure/attachment/12442372/Mulitple+Namespaces5.pdf

Thanks,
Eli

On Fri, Apr 22, 2011 at 9:48 AM, Suresh Srinivas <su...@yahoo-inc.com> wrote:
> A few weeks ago, I had sent an email about the progress of HDFS federation development in HDFS-1052 branch. I am happy to announce that all the tasks related to this feature development is complete and it is ready to be integrated into trunk.
>
> I have a merge patch attached to HDFS-1052 jira. All Hudson tests pass except for two test failures. We will fix these unit test failures in trunk, post merge. I plan on completing merge to trunk early next week. I would like to do this ASAP to avoid having to keep the patch up to date (which has been time consuming). This also avoids need for re-merging, due to SVN changes proposed by Nigel, scheduled late next week. Comments are welcome.
>
> Regards,
> Suresh
>