Posted to common-dev@hadoop.apache.org by Zhe Zhang <zh...@cloudera.com> on 2015/09/23 00:40:59 UTC

[VOTE] Merge HDFS-7285 (erasure coding) branch to trunk

Hi,

I'd like to propose a vote to merge the HDFS-7285 feature branch back to
trunk. Since November 2014 we have been designing and developing this
feature under the umbrella JIRAs HDFS-7285 and HADOOP-11264, and have
committed approximately 210 patches.

The HDFS-7285 feature branch was created to support the first phase of HDFS
erasure coding (HDFS-EC). The objective of HDFS-EC is to significantly
reduce storage space usage in HDFS clusters. Instead of always creating 3
replicas of each block with 200% storage space overhead, HDFS-EC provides
data durability through parity data blocks. With most EC configurations,
the storage overhead is no more than 50%. Based on profiling results of
production clusters, we decided to support EC with the striped block layout
in the first phase, so that small files can be better handled. This means
dividing each logical HDFS file block into smaller units (striping cells)
and spreading them on a set of DataNodes in round-robin fashion. Parity
cells are generated for each stripe of original data cells. We have made
changes to NameNode, client, and DataNode to generalize the block concept
and handle the mapping between a logical file block and its internal
storage blocks. For further details please see the design doc on HDFS-7285.
HADOOP-11264 focuses on providing flexible and high-performance codec
calculation support.
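
As a rough illustration of the striped layout and the overhead figures
described above, here is a minimal Python sketch. It is not taken from the
branch: the RS(6,3) schema, the 64 KiB cell size, and all names
(StripedLayout, locate, replication_overhead) are assumptions made purely
for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class StripedLayout:
    """Hypothetical model of a striped EC block group (not the HDFS API)."""
    data_units: int    # data cells per stripe (assumed 6)
    parity_units: int  # parity cells per stripe (assumed 3)
    cell_size: int     # bytes per striping cell (assumed 64 KiB)

    def storage_overhead(self) -> float:
        # Extra bytes stored per byte of user data.
        return self.parity_units / self.data_units

    def locate(self, offset: int) -> Tuple[int, int, int]:
        """Map a logical byte offset to (stripe, data block index, offset in block).

        Cells are assigned to the data blocks in round-robin order.
        """
        cell_index = offset // self.cell_size
        stripe = cell_index // self.data_units
        block_index = cell_index % self.data_units
        offset_in_block = stripe * self.cell_size + offset % self.cell_size
        return stripe, block_index, offset_in_block


def replication_overhead(replicas: int) -> float:
    # With n replicas, n - 1 copies are pure overhead.
    return float(replicas - 1)


layout = StripedLayout(data_units=6, parity_units=3, cell_size=64 * 1024)
print(replication_overhead(3))       # 3-way replication: 2.0, i.e. 200%
print(layout.storage_overhead())     # assumed RS(6,3): 0.5, i.e. 50%
print(layout.locate(6 * 64 * 1024))  # first byte of the second stripe
```

Under these assumptions a small file still spreads across several DataNodes
after only a few cells, which is why striping suits small files better than
a contiguous layout would.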

The nightly Jenkins job of the branch has reported several successful runs,
and doesn't show new flaky tests compared with trunk. We have posted
several versions of the test plan including both unit testing and cluster
testing, and have executed most tests in the plan. The most basic
functionalities have been extensively tested and verified in several real
clusters with different hardware configurations; results have been very
stable. We have created follow-on tasks for more advanced error handling
and optimization under the umbrella HDFS-8031. We also plan to implement or
harden the integration of EC with existing features such as WebHDFS,
snapshot, append, truncate, hflush, hsync, and so forth.

Development of this feature has been a collaboration across many companies
and institutions. I'd like to thank J. Andreina, Takanobu Asanuma,
Vinayakumar B, Li Bo, Takuya Fukudome, Uma Maheswara Rao G, Rui Li, Yi Liu,
Colin McCabe, Xinwei Qin, Rakesh R, Gao Rui, Kai Sasaki, Walter Su, Tsz Wo
Nicholas Sze, Andrew Wang, Yong Zhang, Jing Zhao, Hui Zheng and Kai Zheng
for their code contributions and reviews. Andrew and Kai Zheng also made
fundamental contributions to the initial design. Rui Li, Gao Rui, Kai
Sasaki, Kai Zheng and many other contributors have made great efforts in
system testing. Many thanks go to Weihua Jiang for proposing the JIRA, and
ATM, Todd Lipcon, Silvius Rus, Suresh, as well as many others for providing
helpful feedback.

Following the community convention, this vote will last for 7 days (ending
September 29th). Votes from Hadoop committers are binding but non-binding
votes are very welcome as well. And here's my non-binding +1.

Thanks,
---
Zhe Zhang

RE: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk

Posted by "Liu, Yi A" <yi...@intel.com>.
+1 (non-binding)
I have been involved in the development and code review on the feature branch. It's a great feature and I think it's ready to be merged into trunk.

Thanks, all, for the contributions.

Regards,
Yi Liu


-----Original Message-----
From: Vinayakumar B [mailto:vinayakumarb@apache.org] 
Sent: Friday, September 25, 2015 12:21 PM
To: hdfs-dev@hadoop.apache.org
Subject: Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk

+1,

I've been involved from the start of the design and development of ErasureCoding. I think phase 1 of this development is ready to be merged to trunk.
It has come a long way to its current state, with significant effort from many contributors and reviewers on both design and code.

Thanks, everyone, for the efforts.

Regards,
Vinay

On Wed, Sep 23, 2015 at 10:53 PM, Jing Zhao <ji...@apache.org> wrote:

> +1
>
> I've been involved in both development and review on the branch, and I 
> believe it's now ready to get merged into trunk. Many thanks to all 
> the contributors and reviewers!
>
> Thanks,
> -Jing
>
> On Tue, Sep 22, 2015 at 6:17 PM, Zheng, Kai <ka...@intel.com> wrote:
>
> > Non-binding +1
> >
> > According to our extensive performance tests, erasure coding based on
> > striping + the ISA-L coder not only saves storage, but can also increase
> > the throughput of a client or a cluster. It will be a great addition to
> > HDFS and its users. Based on the latest branch code, we also observed
> > that it's very reliable in the concurrent tests. We'll provide the perf
> > test report after it's sorted out and hope it helps.
> > Thanks!
> >
> > Regards,
> > Kai
> >
> > -----Original Message-----
> > From: Gangumalla, Uma [mailto:uma.gangumalla@intel.com]
> > Sent: Wednesday, September 23, 2015 8:50 AM
> > To: hdfs-dev@hadoop.apache.org; common-dev@hadoop.apache.org
> > Subject: Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk
> >
> > +1
> >
> > Great addition to HDFS. Thanks all contributors for the nice work.
> >
> > Regards,
> > Uma
> >

Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk

Posted by Steve Loughran <st...@hortonworks.com>.
> On 17 Oct 2015, at 03:04, Vinayakumar B <vi...@gmail.com> wrote:
> 
> Does anyone else also think that the feature is ready to go to branch-2 as well?
> 
> It's been > 2 weeks since EC landed on trunk. IMO it looks quite stable since then and
> ready to go into branch-2.


Two weeks on Jenkins, given Jenkins's persistent instability, isn't a great metric. We've had lots of things passing tests for two weeks that turned out to break in the real world, S3A being an example: HADOOP-11571. The tests worked; out in the field the problems surface, and then it's a rush to get those fixes into the next Hadoop release and/or backport them.

We know that encryption at rest broke on HBase; it's those downstream things, rather than the Hadoop compatibility suites, which are the deal-breakers, especially given that "loses data" is the thing we really want to avoid in HDFS.

What full stack integration testing has been done so far?

RE: Erasure coding in branch-2 [Was Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk]

Posted by "Zheng, Kai" <ka...@intel.com>.
Thanks, Andrew, for pointing this out. It sounds good. Yes, we have umbrella JIRAs for the follow-on tasks: HDFS-8031 for the HDFS side and HADOOP-11842 for the Hadoop side.

-----Original Message-----
From: Andrew Wang [mailto:andrew.wang@cloudera.com] 
Sent: Tuesday, November 03, 2015 8:49 AM
To: hdfs-dev@hadoop.apache.org
Subject: Re: Erasure coding in branch-2 [Was Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk]

If we use an umbrella JIRA to categorize all the ongoing EC work, that will let us more easily change the target version later. For instance, if we decide to bump Phase II out of 2.9, then we just need to change the target version of the Phase II umbrella rather than all the subtasks.

On Mon, Nov 2, 2015 at 4:26 PM, Zheng, Kai <ka...@intel.com> wrote:

> Yeah, so for the issues we recently resolved on trunk and are
> addressing as follow-on tasks in Phase I, we would label them with "erasure coding"
> and maybe also set the target version to "2.9" for convenience?
>
> -----Original Message-----
> From: Jing Zhao [mailto:jing9@apache.org]
> Sent: Tuesday, November 03, 2015 8:04 AM
> To: hdfs-dev@hadoop.apache.org
> Subject: Re: Erasure coding in branch-2 [Was Re: [VOTE] Merge 
> HDFS-7285 (erasure coding) branch to trunk]
>
> +1 for the plan about Phase I & II.
>
> BTW, maybe out of the scope of this thread, just want to mention we
> should either move the JIRA under HDFS-8031 or set the JIRA
> component to "erasure-coding" when making further improvements or
> fixing bugs in EC. This way it will be easier to backport EC to 2.9 later.
>
> On Mon, Nov 2, 2015 at 3:48 PM, Vinayakumar B <vinayakumarb.apache@gmail.com> wrote:
>
> > +1 for the idea.
> > On Nov 3, 2015 07:22, "Zheng, Kai" <ka...@intel.com> wrote:
> >
> > > Sounds good to me. When it's determined to include EC in the 2.9
> > > release, it may be good to have a rough release date, as Zhe asked,
> > > so that the scope of EC can be discussed accordingly. We still have
> > > quite a few Phase I follow-on tasks to do before EC
> > > can be deployed in a production system. Phase II, to develop
> > > non-striping EC for cold data, would possibly be
> > > started after that. We might consider including only Phase I and
> > > leaving Phase II for the next release, depending on the rough release date.
> > >
> > > Regards,
> > > Kai
> > >
> > > -----Original Message-----
> > > From: Gangumalla, Uma [mailto:uma.gangumalla@intel.com]
> > > Sent: Tuesday, November 03, 2015 5:41 AM
> > > To: hdfs-dev@hadoop.apache.org
> > > Subject: Re: Erasure coding in branch-2 [Was Re: [VOTE] Merge
> > > HDFS-7285 (erasure coding) branch to trunk]
> > >
> > > +1 for EC to go into 2.9. Yes, 3.x would be a long way off when we
> > > plan to have 2.8 and 2.9 releases.
> > >
> > > Regards,
> > > Uma
> > >
> > > On 11/2/15, 11:49 AM, "Vinod Vavilapalli" <vi...@hortonworks.com> wrote:
> > >
> > > >Forking the thread. Started looking at the 2.8 list, various
> > > >features' status and arrived here.
> > > >
> > > >While I understand the pervasive nature of EC and a need for a
> > > >significant bake-in, moving this to a 3.x release is not a good idea.
> > > >We will surely get a 2.8 out this year and, as needed, I can even
> > > >spend time getting started on a 2.9. OTOH, 3.x is a long way off,
> > > >and given all the incompatibilities there, it would be a while
> > > >before users can get their hands on EC if it were to be only on
> > > >3.x. At best, this may force sites that want EC to backport the
> > > >entire EC feature to older releases; at worst, this will repeat
> > > >the mess of the 0.20 security release forks.
> > > >
> > > >If we think adding this to 2.8 (even if it is switched off) is too
> > > >much risk per our original plan, let's move this to 2.9, thereby
> > > >leaving enough time for stability, integration testing and
> > > >bake-in, and a realistic chance of having it end up on users' clusters soonish.
> > > >
> > > >+Vinod
> > > >
> > > >> On Oct 19, 2015, at 1:44 PM, Andrew Wang <an...@cloudera.com> wrote:
> > > >>
> > > >> I think our plan thus far has been to target this for 3.0. I'm
> > > >> okay with putting it in branch-2 if we've given a hard look at
> > > >> compatibility, but I'll note that 2.8 is already looking
> > > >> like quite a large release, and our release bandwidth has been
> > > >> focused on the 2.6 and 2.7 maintenance releases. Adding another
> > > >> several hundred JIRAs to 2.8 might make it too unwieldy to get
> > > >> out the door. If we bump EC past that, 3.0 might very well be
> > > >> our next release vehicle. I do plan to revive the 3.0 schedule
> > > >> some time next year. With EC and JDK8 in a good spot, the only
> > > >> big feature remaining is classpath isolation.
> > > >>
> > > >> EC is also a pretty fundamental change to HDFS. Even if it's
> > > >> compatible, in terms of size and impact it might best belong in
> > > >> a new major release.
> > > >>
> > > >> Best,
> > > >> Andrew
> > > >>
> > > >> On Fri, Oct 16, 2015 at 7:04 PM, Vinayakumar B <vinayakumarb.apache@gmail.com> wrote:
> > > >>
> > > >>> Does anyone else also think that the feature is ready to go to
> > > >>> branch-2 as well?
> > > >>>
> > > >>> It's been > 2 weeks since EC landed on trunk. IMO it looks quite stable
> > > >>> since then and ready to go into branch-2.
> > > >>>
> > > >>> -Vinay
> > > >>> On Oct 6, 2015 12:51 AM, "Zhe Zhang" <zh...@cloudera.com> wrote:
> > > >>>
> > > >>>> Thanks Vinay for capturing the issue and Uma for offering the help.
> > > >>>>
> > > >>>> ---
> > > >>>> Zhe Zhang
> > > >>>>
> > > >>>> On Mon, Oct 5, 2015 at 12:19 PM, Gangumalla, Uma <uma.gangumalla@intel.com> wrote:
> > > >>>>> Vinay,
> > > >>>>>
> > > >>>>>
> > > >>>>> I would merge them as part of HDFS-9182.
> > > >>>>>
> > > >>>>> Thanks,
> > > >>>>> Uma
> > > >>>>>
> > > >>>>>
> > > >>>>>
> > > >>>>> On 10/5/15, 12:48 AM, "Vinayakumar B" <vi...@apache.org> wrote:
> > > >>>>>
> > > >>>>>> Hi Andrew,
> > > >>>>>> I see CHANGES.txt entries not yet merged from CHANGES-HDFS-EC-7285.txt.
> > > >>>>>>
> > > >>>>>> Was this intentional?
> > > >>>>>>
> > > >>>>>> Regards,
> > > >>>>>> Vinay
> > > >>>>>>
> > > >>>>>> On Wed, Sep 30, 2015 at 9:15 PM, Andrew Wang <andrew.wang@cloudera.com> wrote:
> > > >>>>>>
> > > >>>>>>> Branch has been merged to trunk, thanks again to everyone
> > > >>>>>>> who worked on the feature!
> > > >>>>>>>
> > > >>>>>>> On Tue, Sep 29, 2015 at 10:44 PM, Zhe Zhang <zh...@cloudera.com> wrote:
> > > >>>>>>>
> > > >>>>>>>> Thanks everyone who has participated in this discussion.
> > > >>>>>>>>
> > > >>>>>>>> With 7 +1's (5 binding and 2 non-binding), and no -1, this vote
> > > >>>>>>>> has passed. I will do a final 'git merge' with trunk and work with
> > > >>>>>>>> Andrew to merge the branch to trunk. I'll update on this thread
> > > >>>>>>>> when the merge is done.
> > > >>>>>>>> ---
> > > >>>>>>>> Zhe Zhang
> > > >>>>>>>>
> > > >>>>>>>> On Thu, Sep 24, 2015 at 11:08 PM, Liu, Yi A <yi...@intel.com> wrote:
> > > >>>>>>>>
> > > >>>>>>>>> (Change it to binding.)
> > > >>>>>>>>>
> > > >>>>>>>>> +1
> > > >>>>>>>>> I have been involved in the development and code review on the
> > > >>>>>>>>> feature branch. It's a great feature and I think it's ready to
> > > >>>>>>>>> be merged into trunk.
> > > >>>>>>>>>
> > > >>>>>>>>> Thanks all for the contribution.
> > > >>>>>>>>>
> > > >>>>>>>>> Regards,
> > > >>>>>>>>> Yi Liu
> > > >>>>>>>>>

Re: Erasure coding in branch-2 [Was Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk]

Posted by Andrew Wang <an...@cloudera.com>.
If we use an umbrella JIRA to categorize all the ongoing EC work, that will
let us more easily change the target version later. For instance, if we
decide to bump Phase II out of 2.9, then we just need to change the target
version of the Phase II umbrella rather than all the subtasks.

On Mon, Nov 2, 2015 at 4:26 PM, Zheng, Kai <ka...@intel.com> wrote:

> Yeah, so for the issues we recently resolved on trunk and are addressing
> as follow-on tasks in Phase I, we would label them with "erasure coding"
> and maybe also set the target version to "2.9" for convenience?
>
> -----Original Message-----
> From: Jing Zhao [mailto:jing9@apache.org]
> Sent: Tuesday, November 03, 2015 8:04 AM
> To: hdfs-dev@hadoop.apache.org
> Subject: Re: Erasure coding in branch-2 [Was Re: [VOTE] Merge HDFS-7285
> (erasure coding) branch to trunk]
>
> +1 for the plan about Phase I & II.
>
> BTW, maybe out of the scope of this thread, just want to mention we should
> either move the JIRA under HDFS-8031 or set the JIRA component to
> "erasure-coding" when making further improvements or fixing bugs in EC.
> This way it will be easier to backport EC to 2.9 later.
>
> On Mon, Nov 2, 2015 at 3:48 PM, Vinayakumar B <vinayakumarb.apache@gmail.com> wrote:
>
> > +1 for the idea.
> > On Nov 3, 2015 07:22, "Zheng, Kai" <ka...@intel.com> wrote:
> >
> > > Sounds good to me. When it's determined to include EC in the 2.9
> > > release, it may be good to have a rough release date, as Zhe asked,
> > > so that the scope of EC can be discussed accordingly. We still have
> > > quite a few Phase I follow-on tasks to do before EC can
> > > be deployed in a production system. Phase II, to develop non-striping
> > > EC for cold data, would possibly be
> > > started after that. We might consider including only Phase I and
> > > leaving Phase II for the next release, depending on the rough release date.
> > >
> > > Regards,
> > > Kai
> > >
> > > -----Original Message-----
> > > From: Gangumalla, Uma [mailto:uma.gangumalla@intel.com]
> > > Sent: Tuesday, November 03, 2015 5:41 AM
> > > To: hdfs-dev@hadoop.apache.org
> > > Subject: Re: Erasure coding in branch-2 [Was Re: [VOTE] Merge
> > > HDFS-7285 (erasure coding) branch to trunk]
> > >
> > > +1 for EC to go into 2.9. Yes, 3.x would be a long way off when we
> > > plan to have 2.8 and 2.9 releases.
> > >
> > > Regards,
> > > Uma
> > >
> > > On 11/2/15, 11:49 AM, "Vinod Vavilapalli" <vi...@hortonworks.com> wrote:
> > >
> > > >Forking the thread. Started looking at the 2.8 list, various
> > > >features' status and arrived here.
> > > >
> > > >While I understand the pervasive nature of EC and a need for a
> > > >significant bake-in, moving this to a 3.x release is not a good idea.
> > > >We will surely get a 2.8 out this year and, as needed, I can even
> > > >spend time getting started on a 2.9. OTOH, 3.x is a long way off,
> > > >and given all the incompatibilities there, it would be a while
> > > >before users can get their hands on EC if it were to be only on
> > > >3.x. At best, this may force sites that want EC to backport the
> > > >entire EC feature to older releases; at worst, this will repeat
> > > >the mess of the 0.20 security release forks.
> > > >
> > > >If we think adding this to 2.8 (even if it is switched off) is too
> > > >much risk per our original plan, let's move this to 2.9, thereby
> > > >leaving enough time for stability, integration testing and bake-in,
> > > >and a realistic chance of having it end up on users' clusters soonish.
> > > >
> > > >+Vinod
> > > >
> > > >> On Oct 19, 2015, at 1:44 PM, Andrew Wang
> > > >><an...@cloudera.com>
> > > >>wrote:
> > > >>
> > > >> I think our plan thus far has been to target this for 3.0. I'm
> > > >>okay with putting it in branch-2 if we've given a hard look at
> > > >>compatibility, but I'll note though that 2.8 is already looking
> > > >>like quite a large release, and our release bandwidth has been
> > > >>focused on the 2.6 and 2.7 maintenance releases. Adding another
> > > >>multi-hundred JIRAs to 2.8 might make it too unwieldy to get out
> > > >>the door. If we bump EC past that, 3.0 might very well be our
> > > >>next release vehicle. I do plan to revive the 3.0 schedule some
> > > >>time next year. With EC and JDK8 in a good spot, the only big
> > > >>feature remaining is classpath isolation.
> > > >>
> > > >> EC is also a pretty fundamental change to HDFS. Even if it's
> > > >>compatible, in terms of size and impact it might best belong in a
> > > >>new major release.
> > > >>
> > > >> Best,
> > > >> Andrew
> > > >>
> > > >> On Fri, Oct 16, 2015 at 7:04 PM, Vinayakumar B <
> > > >> vinayakumarb.apache@gmail.com> wrote:
> > > >>
> > > >>> Does anyone else also think that the feature is ready to go to
> > > >>>branch-2 as well?
> > > >>>
> > > >>> It's been > 2 weeks since EC landed on trunk. IMO it looks quite
> > > >>>stable since then and ready to go in branch-2.
> > > >>>
> > > >>> -Vinay
> > > >>> On Oct 6, 2015 12:51 AM, "Zhe Zhang" <zh...@cloudera.com>
> wrote:
> > > >>>
> > > >>>> Thanks Vinay for capturing the issue and Uma for offering the
> help.
> > > >>>>
> > > >>>> ---
> > > >>>> Zhe Zhang
> > > >>>>
> > > >>>> On Mon, Oct 5, 2015 at 12:19 PM, Gangumalla, Uma <
> > > >>> uma.gangumalla@intel.com
> > > >>>>>
> > > >>>> wrote:
> > > >>>>
> > > >>>>> Vinay,
> > > >>>>>
> > > >>>>>
> > > >>>>> I would merge them as part of HDFS-9182.
> > > >>>>>
> > > >>>>> Thanks,
> > > >>>>> Uma
> > > >>>>>
> > > >>>>>
> > > >>>>>
> > > >>>>> On 10/5/15, 12:48 AM, "Vinayakumar B"
> > > >>>>><vi...@apache.org>
> > > >>>>>wrote:
> > > >>>>>
> > > >>>>>> Hi Andrew,
> > > >>>>>> I see CHANGES.txt entries not yet merged from
> > > >>> CHANGES-HDFS-EC-7285.txt.
> > > >>>>>>
> > > >>>>>> Was this intentional?
> > > >>>>>>
> > > >>>>>> Regards,
> > > >>>>>> Vinay
> > > >>>>>>
> > > >>>>>> On Wed, Sep 30, 2015 at 9:15 PM, Andrew Wang <
> > > >>> andrew.wang@cloudera.com>
> > > >>>>>> wrote:
> > > >>>>>>
> > > >>>>>>> Branch has been merged to trunk, thanks again to everyone
> > > >>>>>>>who worked
> > > >>>> on
> > > >>>>>>> the
> > > >>>>>>> feature!
> > > >>>>>>>
> > > >>>>>>> On Tue, Sep 29, 2015 at 10:44 PM, Zhe Zhang
> > > >>>>>>> <zh...@cloudera.com>
> > > >>>>>>> wrote:
> > > >>>>>>>
> > > >>>>>>>> Thanks everyone who has participated in this discussion.
> > > >>>>>>>>
> > > >>>>>>>> With 7 +1's (5 binding and 2 non-binding), and no -1, this
> > > >>>>>>>> vote
> > > >>> has
> > > >>>>>>> passed.
> > > >>>>>>>> I will do a final 'git merge' with trunk and work with
> > > >>>>>>>> Andrew to
> > > >>>> merge
> > > >>>>>>> the
> > > >>>>>>>> branch to trunk. I'll update on this thread when the merge
> > > >>>>>>>> is
> > > >>> done.
> > > >>>>>>>>
> > > >>>>>>>> ---
> > > >>>>>>>> Zhe Zhang
> > > >>>>>>>>
> > > >>>>>>>> On Thu, Sep 24, 2015 at 11:08 PM, Liu, Yi A
> > > >>>>>>>> <yi...@intel.com>
> > > >>>>>>> wrote:
> > > >>>>>>>>
> > > >>>>>>>>> (Change it to binding.)
> > > >>>>>>>>>
> > > >>>>>>>>> +1
> > > >>>>>>>>> I have been involved in the development and code review on
> > > >>>>>>>>> the
> > > >>>>>>> feature
> > > >>>>>>>>> branch. It's a great feature and I think it's ready to
> > > >>>>>>>>> merge it
> > > >>>> into
> > > >>>>>>>> trunk.
> > > >>>>>>>>>
> > > >>>>>>>>> Thanks all for the contribution.
> > > >>>>>>>>>
> > > >>>>>>>>> Regards,
> > > >>>>>>>>> Yi Liu
> > > >>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>> -----Original Message-----
> > > >>>>>>>>> From: Liu, Yi A
> > > >>>>>>>>> Sent: Friday, September 25, 2015 1:51 PM
> > > >>>>>>>>> To: hdfs-dev@hadoop.apache.org
> > > >>>>>>>>> Subject: RE: [VOTE] Merge HDFS-7285 (erasure coding)
> > > >>>>>>>>> branch to
> > > >>>> trunk
> > > >>>>>>>>>
> > > >>>>>>>>> +1 (non-binding)
> > > >>>>>>>>> I have been involved in the development and code review on
> > > >>>>>>>>> the
> > > >>>>>>> feature
> > > >>>>>>>>> branch. It's a great feature and I think it's ready to
> > > >>>>>>>>> merge it
> > > >>>> into
> > > >>>>>>>> trunk.
> > > >>>>>>>>>
> > > >>>>>>>>> Thanks all for the contribution.
> > > >>>>>>>>>
> > > >>>>>>>>> Regards,
> > > >>>>>>>>> Yi Liu
> > > >>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>> -----Original Message-----
> > > >>>>>>>>> From: Vinayakumar B [mailto:vinayakumarb@apache.org]
> > > >>>>>>>>> Sent: Friday, September 25, 2015 12:21 PM
> > > >>>>>>>>> To: hdfs-dev@hadoop.apache.org
> > > >>>>>>>>> Subject: Re: [VOTE] Merge HDFS-7285 (erasure coding)
> > > >>>>>>>>> branch to
> > > >>>> trunk
> > > >>>>>>>>>
> > > >>>>>>>>> +1,
> > > >>>>>>>>>
> > > >>>>>>>>> I've been involved starting from design and development of
> > > >>>>>>> ErasureCoding.
> > > >>>>>>>>> I think phase 1 of this development is ready to be merged
> > > >>>>>>>>> to
> > > >>>> trunk.
> > > >>>>>>>>> It had come a long way to the current state with
> > > >>>>>>>>> significant
> > > >>>> effort
> > > >>>>>>> of
> > > >>>>>>>>> many Contributors and Reviewers for both design and code.
> > > >>>>>>>>>
> > > >>>>>>>>> Thanks Everyone for the efforts.
> > > >>>>>>>>>
> > > >>>>>>>>> Regards,
> > > >>>>>>>>> Vinay
> > > >>>>>>>>>
> > > >>>>>>>>> On Wed, Sep 23, 2015 at 10:53 PM, Jing Zhao
> > > >>>>>>>>> <ji...@apache.org>
> > > >>>>>>> wrote:
> > > >>>>>>>>>
> > > >>>>>>>>>> +1
> > > >>>>>>>>>>
> > > >>>>>>>>>> I've been involved in both development and review on the
> > > >>> branch,
> > > >>>>>>> and
> > > >>>>>>> I
> > > >>>>>>>>>> believe it's now ready to get merged into trunk. Many
> > > >>>>>>>>>> thanks
> > > >>> to
> > > >>>>>>> all
> > > >>>>>>>>>> the contributors and reviewers!
> > > >>>>>>>>>>
> > > >>>>>>>>>> Thanks,
> > > >>>>>>>>>> -Jing
> > > >>>>>>>>>>
> > > >>>>>>>>>> On Tue, Sep 22, 2015 at 6:17 PM, Zheng, Kai <
> > > >>>> kai.zheng@intel.com>
> > > >>>>>>>> wrote:
> > > >>>>>>>>>>
> > > >>>>>>>>>>> Non-binding +1
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> According to our extensive performance tests, striping +
> > > >>> ISA-L
> > > >>>>>>> coder
> > > >>>>>>>>>> based
> > > >>>>>>>>>>> erasure coding not only can save storage, but also can
> > > >>>> increase
> > > >>>>>>> the
> > > >>>>>>>>>>> throughput of a client or a cluster. It will be a great
> > > >>>>>>> addition to
> > > >>>>>>>>>>> HDFS and its users. Based on the latest branch codes, we
> > > >>> also
> > > >>>>>>>>>>> observed it's
> > > >>>>>>>>>> very
> > > >>>>>>>>>>> reliable in the concurrent tests. We'll provide the perf
> > > >>> test
> > > >>>>>>> report
> > > >>>>>>>>>> after
> > > >>>>>>>>>>> it's sorted out and hope it helps.
> > > >>>>>>>>>>> Thanks!
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> Regards,
> > > >>>>>>>>>>> Kai
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> -----Original Message-----
> > > >>>>>>>>>>> From: Gangumalla, Uma [mailto:uma.gangumalla@intel.com]
> > > >>>>>>>>>>> Sent: Wednesday, September 23, 2015 8:50 AM
> > > >>>>>>>>>>> To: hdfs-dev@hadoop.apache.org;
> > > >>> common-dev@hadoop.apache.org
> > > >>>>>>>>>>> Subject: Re: [VOTE] Merge HDFS-7285 (erasure coding)
> > > >>>>>>>>>>> branch
> > > >>> to
> > > >>>>>>> trunk
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> +1
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> Great addition to HDFS. Thanks all contributors for the
> > > >>>>>>>>>>> nice
> > > >>>>>>> work.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> Regards,
> > > >>>>>>>>>>> Uma
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> On 9/22/15, 3:40 PM, "Zhe Zhang" <zh...@cloudera.com>
> > > >>>> wrote:
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> Hi,
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> I'd like to propose a vote to merge the HDFS-7285
> > > >>>>>>>>>>>> feature
> > > >>>>>>> branch
> > > >>>>>>>>>>>> back to trunk. Since November 2014 we have been
> > > >>>>>>>>>>>> designing
> > > >>> and
> > > >>>>>>>>>>>> developing this feature under the umbrella JIRAs
> > > >>>>>>>>>>>> HDFS-7285
> > > >>>> and
> > > >>>>>>>>>>>> HADOOP-11264, and have committed approximately 210
> patches.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> The HDFS-7285 feature branch was created to support the
> > > >>> first
> > > >>>>>>> phase
> > > >>>>>>>>>>>> of HDFS erasure coding (HDFS-EC). The objective of
> > > >>>>>>>>>>>> HDFS-EC
> > > >>> is
> > > >>>>>>> to
> > > >>>>>>>>>>>> significantly reduce storage space usage in HDFS clusters.
> > > >>>>>>> Instead
> > > >>>>>>>>>>>> of always creating 3 replicas of each block with 200%
> > > >>> storage
> > > >>>>>>> space
> > > >>>>>>>>>>>> overhead, HDFS-EC provides data durability through
> > > >>>>>>>>>>>> parity
> > > >>>> data
> > > >>>>>>>> blocks.
> > > >>>>>>>>>>>> With most EC configurations, the storage overhead is no
> > > >>> more
> > > >>>>>>> than
> > > >>>>>>>> 50%.
> > > >>>>>>>>>>>> Based on profiling results of production clusters, we
> > > >>> decided
> > > >>>>>>> to
> > > >>>>>>>>>>>> support EC with the striped block layout in the first
> > > >>> phase,
> > > >>>> so
> > > >>>>>>>>>>>> that small files can be better handled. This means
> > > >>>>>>>>>>>> dividing
> > > >>>>>>> each
> > > >>>>>>>>>>>> logical HDFS file block into smaller units (striping
> > > >>>>>>>>>>>> cells)
> > > >>>> and
> > > >>>>>>>>>>>> spreading them on a set of DataNodes in round-robin
> > > >>> fashion.
> > > >>>>>>> Parity
> > > >>>>>>>>>>>> cells are generated for each stripe of original data
> cells.
> > > >>>> We
> > > >>>>>>> have
> > > >>>>>>>>>>>> made changes to NameNode, client, and DataNode to
> > > >>> generalize
> > > >>>>>>> the
> > > >>>>>>>>>>>> block concept and handle the mapping between a logical
> > > >>>>>>>>>>>> file
> > > >>>>>>> block
> > > >>>>>>>>>>>> and its internal storage blocks. For further details
> > > >>>>>>>>>>>> please
> > > >>>> see
> > > >>>>>>> the
> > > >>>>>>>>>>>> design doc on HDFS-7285.
> > > >>>>>>>>>>>> HADOOP-11264 focuses on providing flexible and
> > > >>>> high-performance
> > > >>>>>>>>>>>> codec calculation support.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> The nightly Jenkins job of the branch has reported
> > > >>>>>>>>>>>> several successful runs, and doesn't show new flaky
> > > >>>>>>>>>>>> tests compared
> > > >>>> with
> > > >>>>>>>>>>>> trunk. We have posted several versions of the test plan
> > > >>>>>>> including
> > > >>>>>>>>>>>> both unit testing and cluster testing, and have
> > > >>>>>>>>>>>> executed
> > > >>> most
> > > >>>>>>> tests
> > > >>>>>>>>>>>> in the plan. The most basic functionalities have been
> > > >>>>>>> extensively
> > > >>>>>>>>>>>> tested and verified in several real clusters with
> > > >>>>>>>>>>>> different hardware configurations; results have been
> > > >>>>>>>>>>>> very stable. We
> > > >>>> have
> > > >>>>>>>>>>>> created follow-on tasks for more advanced error
> > > >>>>>>>>>>>> handling
> > > >>> and
> > > >>>>>>>>> optimization under the umbrella HDFS-8031.
> > > >>>>>>>>>>>> We also plan to implement or harden the integration of
> > > >>>>>>>>>>>> EC
> > > >>>> with
> > > >>>>>>>>>>>> existing features such as WebHDFS, snapshot, append,
> > > >>>> truncate,
> > > >>>>>>>>>>>> hflush, hsync, and so forth.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> Development of this feature has been a collaboration
> > > >>>>>>>>>>>> across
> > > >>>>>>> many
> > > >>>>>>>>>>>> companies and institutions. I'd like to thank J.
> > > >>>>>>>>>>>> Andreina,
> > > >>>>>>> Takanobu
> > > >>>>>>>>>>>> Asanuma, Vinayakumar B, Li Bo, Takuya Fukudome, Uma
> > > >>> Maheswara
> > > >>>>>>> Rao
> > > >>>>>>>>>>>> G, Rui Li, Yi Liu, Colin McCabe, Xinwei Qin, Rakesh R,
> > > >>>>>>>>>>>> Gao
> > > >>>> Rui,
> > > >>>>>>> Kai
> > > >>>>>>>>>>>> Sasaki, Walter Su, Tsz Wo Nicholas Sze, Andrew Wang,
> > > >>>>>>>>>>>> Yong
> > > >>>>>>> Zhang,
> > > >>>>>>>>>>>> Jing Zhao, Hui Zheng and Kai Zheng for their code
> > > >>>> contributions
> > > >>>>>>> and
> > > >>>>>>>>> reviews.
> > > >>>>>>>>>>>> Andrew and Kai Zheng also made fundamental
> > > >>>>>>>>>>>> contributions to
> > > >>>> the
> > > >>>>>>>>>>>> initial design. Rui Li, Gao Rui, Kai Sasaki, Kai Zheng
> > > >>>>>>>>>>>> and
> > > >>>> many
> > > >>>>>>>>>>>> other contributors have made great efforts in system
> > > >>> testing.
> > > >>>>>>> Many
> > > >>>>>>>>>>>> thanks go to Weihua Jiang for proposing the JIRA, and
> > > >>>>>>>>>>>> ATM,
> > > >>>> Todd
> > > >>>>>>>>>>>> Lipcon, Silvius Rus, Suresh, as well as many others for
> > > >>>>>>> providing
> > > >>>>>>>>> helpful feedback.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> Following the community convention, this vote will last
> > > >>> for 7
> > > >>>>>>> days
> > > >>>>>>>>>>>> (ending September 29th). Votes from Hadoop committers
> > > >>>>>>>>>>>> are
> > > >>>>>>> binding
> > > >>>>>>>>>>>> but non-binding votes are very welcome as well. And
> > > >>>>>>>>>>>> here's
> > > >>> my
> > > >>>>>>>>>>>> non-binding
> > > >>>>>>>>>> +1.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> Thanks,
> > > >>>>>>>>>>>> ---
> > > >>>>>>>>>>>> Zhe Zhang
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>
> > > >>>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>
> > > >>>>>
> > > >>>>>
> > > >>>>
> > > >>>
> > > >
> > >
> > >
> >
>

Re: Erasure coding in branch-2 [Was Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk]

Posted by Zhe Zhang <zh...@apache.org>.
Sorry for getting back to the thread late.

Since Ming/Daryn have not raised further concerns, I'm +0 on including EC
in 2.9. The upside is that we can perhaps position EC in 2.9 as a "beta"
feature to gain some production experience, and graduate it from beta in
3.0. The downside, as Elliot stated, is that this will delay the upgrade
to 2.9 for some customers.

On Tue, Dec 8, 2015 at 2:54 PM, Zhe Zhang <zh...@cloudera.com> wrote:

> Hi Vinod,
>
> Thanks for the update. After seeing the comment from Elliot I checked with
> Ming Ma and Daryn Sharp offline, to get feedback based on their large scale
> deployments (Twitter and Yahoo).
>
> Based on my reading, Ming doesn't have a strong preference between 2.9 and
> 3.0, and Daryn still has some questions about the stability of the EC
> code.
>
> Maybe we should give Ming/Daryn some more time to reply to this thread?
>
> I think we should also address Elliot's comments above regarding major vs.
> minor releases, how that impacts the 2.9 upgrade timing etc.
>
> Thanks,
> ---
> Zhe Zhang
>
> On Tue, Dec 8, 2015 at 2:04 PM, Vinod Kumar Vavilapalli <
> vinodkv@apache.org> wrote:
>
>> Forgot to update this thread. I branched off 2.8 last week. So, we can
>> now go ahead and do a merge of HDFS-7285 into branch-2 (version 2.9) like
>> we discussed before.
>>
>> Thanks
>> +Vinod
>>
>>
>> > On Nov 3, 2015, at 4:40 PM, Vinod Kumar Vavilapalli <
>> vinodkv@hortonworks.com> wrote:
>> >
>> > That makes sense.
>> >
>> > Thanks for the discussion everyone, let’s stick to this tentative plan
>> of EC for 2.9.
>> >
>> > I just updated the Roadmap wiki to reflect the same.
>> >
>> > +Vinod
>> >
>> >
>> >> On Nov 2, 2015, at 4:26 PM, Zheng, Kai <ka...@intel.com> wrote:
>> >>
>> >> Yeah, so for the issues we recently resolved on trunk and are
>> addressing as follow-on tasks in Phase I, we would label them with "erasure
>> coding" and maybe also set the target version as "2.9" for the convenience?
>> >>
>> >> -----Original Message-----
>> >> From: Jing Zhao [mailto:jing9@apache.org]
>> >> Sent: Tuesday, November 03, 2015 8:04 AM
>> >> To: hdfs-dev@hadoop.apache.org
>> >> Subject: Re: Erasure coding in branch-2 [Was Re: [VOTE] Merge
>> HDFS-7285 (erasure coding) branch to trunk]
>> >>
>> >> +1 for the plan about Phase I & II.
>> >>
>> >> BTW, maybe out of the scope of this thread, just want to mention we
>> should either move the jira under HDFS-8031 or update the jira component as
>> "erasure-coding" when making further improvement or fixing bugs in EC. In
>> this way it will be easier for later backporting EC to 2.9.
>> >>
>> >> On Mon, Nov 2, 2015 at 3:48 PM, Vinayakumar B <
>> vinayakumarb.apache@gmail.com
>> >>> wrote:
>> >>
>> >>> +1 for the idea.
>> >>> On Nov 3, 2015 07:22, "Zheng, Kai" <ka...@intel.com> wrote:
>> >>>
>> >>>> Sounds good to me. When it's determined to include EC in 2.9
>> >>>> release, it may be good to have a rough release date as Zhe asked,
>> >>>> so accordingly the scope of EC can be discussed out. We still have
>> >>>> quite a few of things as Phase I follow-on tasks to do before EC can
>> >>>> be deployed in a production system. Phase II to develop non-striping
>> >>>> EC for cold data would possibly
>> >>> be
>> >>>> started after that. We might consider to include only Phase I and
>> >>>> leave Phase II for next release according to the rough release date.
>> >>>>
>> >>>> Regards,
>> >>>> Kai
>> >>>>
>> >>>> -----Original Message-----
>> >>>> From: Gangumalla, Uma [mailto:uma.gangumalla@intel.com]
>> >>>> Sent: Tuesday, November 03, 2015 5:41 AM
>> >>>> To: hdfs-dev@hadoop.apache.org
>> >>>> Subject: Re: Erasure coding in branch-2 [Was Re: [VOTE] Merge
>> >>>> HDFS-7285 (erasure coding) branch to trunk]
>> >>>>
>> >>>> +1 for EC to go into 2.9. Yes, 3.x would be a long way to go when we
>> >>>> plan to have 2.8 and 2.9 releases.
>> >>>>
>> >>>> Regards,
>> >>>> Uma
>> >>>>
>> >>>> On 11/2/15, 11:49 AM, "Vinod Vavilapalli" <vi...@hortonworks.com>
>> >>> wrote:
>> >>>>
>> >>>>> Forking the thread. Started looking at the 2.8 list, various
>> >>>>> features' status and arrived here.
>> >>>>>
>> >>>>> While I understand the pervasive nature of EC and a need for a
>> >>>>> significant bake-in, moving this to a 3.x release is not a good
>> idea.
>> >>>>> We will surely get a 2.8 out this year and, as needed, I can even
>> >>>>> spend time getting started on a 2.9. OTOH, 3.x is a long way off,
>> >>>>> and given all the incompatibilities there, it would be a while
>> >>>>> before users can get their hands on EC if it were to be only on
>> >>>>> 3.x. At best, this may force sites that want EC to backport the
>> >>>>> entire EC feature to older releases; at worst, this will repeat
>> >>>>> the mess of the 0.20 security release forks.
>> >>>>>
>> >>>>> If we think adding this to 2.8 (even if it is switched off) is too
>> >>>>> much risk per our original plan, let's move this to 2.9, thereby
>> >>>>> leaving enough time for stability, integration testing and bake-in,
>> >>>>> and a realistic chance of having it end up on users' clusters soonish.
>> >>>>>
>> >>>>> +Vinod
>> >>>>>
>> >>>>>> On Oct 19, 2015, at 1:44 PM, Andrew Wang
>> >>>>>> <an...@cloudera.com>
>> >>>>>> wrote:
>> >>>>>>
>> >>>>>> I think our plan thus far has been to target this for 3.0. I'm
>> >>>>>> okay with putting it in branch-2 if we've given a hard look at
>> >>>>>> compatibility, but I'll note though that 2.8 is already looking
>> >>>>>> like quite a large release, and our release bandwidth has been
>> >>>>>> focused on the 2.6 and 2.7 maintenance releases. Adding another
>> >>>>>> multi-hundred JIRAs to 2.8 might make it too unwieldy to get out
>> >>>>>> the door. If we bump EC past that, 3.0 might very well be our
>> >>>>>> next release vehicle. I do plan to revive the 3.0 schedule some
>> >>>>>> time next year. With EC and JDK8 in a good spot, the only big
>> >>>>>> feature remaining is classpath isolation.
>> >>>>>>
>> >>>>>> EC is also a pretty fundamental change to HDFS. Even if it's
>> >>>>>> compatible, in terms of size and impact it might best belong in a
>> >>>>>> new major release.
>> >>>>>>
>> >>>>>> Best,
>> >>>>>> Andrew
>> >>>>>>
>> >>>>>> On Fri, Oct 16, 2015 at 7:04 PM, Vinayakumar B <
>> >>>>>> vinayakumarb.apache@gmail.com> wrote:
>> >>>>>>
>> >>>>>>> Does anyone else also think that the feature is ready to go to
>> >>>>>>> branch-2 as well?
>> >>>>>>>
>> >>>>>>> It's been > 2 weeks since EC landed on trunk. IMO it looks quite
>> >>>>>>> stable since then and ready to go in branch-2.
>> >>>>>>>
>> >>>>>>> -Vinay
>> >>>>>>> On Oct 6, 2015 12:51 AM, "Zhe Zhang" <zh...@cloudera.com>
>> wrote:
>> >>>>>>>
>> >>>>>>>> Thanks Vinay for capturing the issue and Uma for offering the
>> help.
>> >>>>>>>>
>> >>>>>>>> ---
>> >>>>>>>> Zhe Zhang
>> >>>>>>>>
>> >>>>>>>> On Mon, Oct 5, 2015 at 12:19 PM, Gangumalla, Uma <
>> >>>>>>> uma.gangumalla@intel.com
>> >>>>>>>>>
>> >>>>>>>> wrote:
>> >>>>>>>>
>> >>>>>>>>> Vinay,
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>> I would merge them as part of HDFS-9182.
>> >>>>>>>>>
>> >>>>>>>>> Thanks,
>> >>>>>>>>> Uma
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>> On 10/5/15, 12:48 AM, "Vinayakumar B"
>> >>>>>>>>> <vi...@apache.org>
>> >>>>>>>>> wrote:
>> >>>>>>>>>
>> >>>>>>>>>> Hi Andrew,
>> >>>>>>>>>> I see CHANGES.txt entries not yet merged from
>> >>>>>>> CHANGES-HDFS-EC-7285.txt.
>> >>>>>>>>>>
>> >>>>>>>>>> Was this intentional?
>> >>>>>>>>>>
>> >>>>>>>>>> Regards,
>> >>>>>>>>>> Vinay
>> >>>>>>>>>>
>> >>>>>>>>>> On Wed, Sep 30, 2015 at 9:15 PM, Andrew Wang <
>> >>>>>>> andrew.wang@cloudera.com>
>> >>>>>>>>>> wrote:
>> >>>>>>>>>>
>> >>>>>>>>>>> Branch has been merged to trunk, thanks again to everyone
>> >>>>>>>>>>> who worked
>> >>>>>>>> on
>> >>>>>>>>>>> the
>> >>>>>>>>>>> feature!
>> >>>>>>>>>>>
>> >>>>>>>>>>> On Tue, Sep 29, 2015 at 10:44 PM, Zhe Zhang
>> >>>>>>>>>>> <zh...@cloudera.com>
>> >>>>>>>>>>> wrote:
>> >>>>>>>>>>>
>> >>>>>>>>>>>> Thanks everyone who has participated in this discussion.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> With 7 +1's (5 binding and 2 non-binding), and no -1, this
>> >>>>>>>>>>>> vote
>> >>>>>>> has
>> >>>>>>>>>>> passed.
>> >>>>>>>>>>>> I will do a final 'git merge' with trunk and work with
>> >>>>>>>>>>>> Andrew to
>> >>>>>>>> merge
>> >>>>>>>>>>> the
>> >>>>>>>>>>>> branch to trunk. I'll update on this thread when the merge
>> >>>>>>>>>>>> is
>> >>>>>>> done.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> ---
>> >>>>>>>>>>>> Zhe Zhang
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> On Thu, Sep 24, 2015 at 11:08 PM, Liu, Yi A
>> >>>>>>>>>>>> <yi...@intel.com>
>> >>>>>>>>>>> wrote:
>> >>>>>>>>>>>>
>> >>>>>>>>>>>>> (Change it to binding.)
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> +1
>> >>>>>>>>>>>>> I have been involved in the development and code review on
>> >>>>>>>>>>>>> the
>> >>>>>>>>>>> feature
>> >>>>>>>>>>>>> branch. It's a great feature and I think it's ready to
>> >>>>>>>>>>>>> merge it
>> >>>>>>>> into
>> >>>>>>>>>>>> trunk.
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> Thanks all for the contribution.
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> Regards,
>> >>>>>>>>>>>>> Yi Liu
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> -----Original Message-----
>> >>>>>>>>>>>>> From: Liu, Yi A
>> >>>>>>>>>>>>> Sent: Friday, September 25, 2015 1:51 PM
>> >>>>>>>>>>>>> To: hdfs-dev@hadoop.apache.org
>> >>>>>>>>>>>>> Subject: RE: [VOTE] Merge HDFS-7285 (erasure coding)
>> >>>>>>>>>>>>> branch to
>> >>>>>>>> trunk
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> +1 (non-binding)
>> >>>>>>>>>>>>> I have been involved in the development and code review on
>> >>>>>>>>>>>>> the
>> >>>>>>>>>>> feature
>> >>>>>>>>>>>>> branch. It's a great feature and I think it's ready to
>> >>>>>>>>>>>>> merge it
>> >>>>>>>> into
>> >>>>>>>>>>>> trunk.
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> Thanks all for the contribution.
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> Regards,
>> >>>>>>>>>>>>> Yi Liu
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> -----Original Message-----
>> >>>>>>>>>>>>> From: Vinayakumar B [mailto:vinayakumarb@apache.org]
>> >>>>>>>>>>>>> Sent: Friday, September 25, 2015 12:21 PM
>> >>>>>>>>>>>>> To: hdfs-dev@hadoop.apache.org
>> >>>>>>>>>>>>> Subject: Re: [VOTE] Merge HDFS-7285 (erasure coding)
>> >>>>>>>>>>>>> branch to
>> >>>>>>>> trunk
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> +1,
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> I've been involved starting from design and development of
>> >>>>>>>>>>> ErasureCoding.
>> >>>>>>>>>>>>> I think phase 1 of this development is ready to be merged
>> >>>>>>>>>>>>> to
>> >>>>>>>> trunk.
>> >>>>>>>>>>>>> It had come a long way to the current state with
>> >>>>>>>>>>>>> significant
>> >>>>>>>> effort
>> >>>>>>>>>>> of
>> >>>>>>>>>>>>> many Contributors and Reviewers for both design and code.
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> Thanks Everyone for the efforts.
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> Regards,
>> >>>>>>>>>>>>> Vinay
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> On Wed, Sep 23, 2015 at 10:53 PM, Jing Zhao
>> >>>>>>>>>>>>> <ji...@apache.org>
>> >>>>>>>>>>> wrote:
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>> +1
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>> I've been involved in both development and review on the
>> >>>>>>> branch,
>> >>>>>>>>>>> and
>> >>>>>>>>>>> I
>> >>>>>>>>>>>>>> believe it's now ready to get merged into trunk. Many
>> >>>>>>>>>>>>>> thanks
>> >>>>>>> to
>> >>>>>>>>>>> all
>> >>>>>>>>>>>>>> the contributors and reviewers!
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>> Thanks,
>> >>>>>>>>>>>>>> -Jing
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>> On Tue, Sep 22, 2015 at 6:17 PM, Zheng, Kai <
>> >>>>>>>> kai.zheng@intel.com>
>> >>>>>>>>>>>> wrote:
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>> Non-binding +1
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>> According to our extensive performance tests, striping +
>> >>>>>>> ISA-L
>> >>>>>>>>>>> coder
>> >>>>>>>>>>>>>> based
>> >>>>>>>>>>>>>>> erasure coding not only can save storage, but also can
>> >>>>>>>> increase
>> >>>>>>>>>>> the
>> >>>>>>>>>>>>>>> throughput of a client or a cluster. It will be a great
>> >>>>>>>>>>> addition to
>> >>>>>>>>>>>>>>> HDFS and its users. Based on the latest branch codes, we
>> >>>>>>> also
>> >>>>>>>>>>>>>>> observed it's
>> >>>>>>>>>>>>>> very
>> >>>>>>>>>>>>>>> reliable in the concurrent tests. We'll provide the perf
>> >>>>>>> test
>> >>>>>>>>>>> report
>> >>>>>>>>>>>>>> after
>> >>>>>>>>>>>>>>> it's sorted out and hope it helps.
>> >>>>>>>>>>>>>>> Thanks!
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>> Regards,
>> >>>>>>>>>>>>>>> Kai
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>> -----Original Message-----
>> >>>>>>>>>>>>>>> From: Gangumalla, Uma [mailto:uma.gangumalla@intel.com]
>> >>>>>>>>>>>>>>> Sent: Wednesday, September 23, 2015 8:50 AM
>> >>>>>>>>>>>>>>> To: hdfs-dev@hadoop.apache.org;
>> >>>>>>> common-dev@hadoop.apache.org
>> >>>>>>>>>>>>>>> Subject: Re: [VOTE] Merge HDFS-7285 (erasure coding)
>> >>>>>>>>>>>>>>> branch
>> >>>>>>> to
>> >>>>>>>>>>> trunk
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>> +1
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>> Great addition to HDFS. Thanks all contributors for the
>> >>>>>>>>>>>>>>> nice
>> >>>>>>>>>>> work.
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>> Regards,
>> >>>>>>>>>>>>>>> Uma
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>> On 9/22/15, 3:40 PM, "Zhe Zhang" <zh...@cloudera.com>
>> >>>>>>>> wrote:
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>> Hi,
>> >>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>> I'd like to propose a vote to merge the HDFS-7285
>> >>>>>>>>>>>>>>>> feature
>> >>>>>>>>>>> branch
>> >>>>>>>>>>>>>>>> back to trunk. Since November 2014 we have been
>> >>>>>>>>>>>>>>>> designing
>> >>>>>>> and
>> >>>>>>>>>>>>>>>> developing this feature under the umbrella JIRAs
>> >>>>>>>>>>>>>>>> HDFS-7285
>> >>>>>>>> and
>> >>>>>>>>>>>>>>>> HADOOP-11264, and have committed approximately 210
>> patches.
>> >>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>> The HDFS-7285 feature branch was created to support the
>> >>>>>>> first
>> >>>>>>>>>>> phase
>> >>>>>>>>>>>>>>>> of HDFS erasure coding (HDFS-EC). The objective of
>> >>>>>>>>>>>>>>>> HDFS-EC
>> >>>>>>> is
>> >>>>>>>>>>> to
>> >>>>>>>>>>>>>>>> significantly reduce storage space usage in HDFS
>> clusters.
>> >>>>>>>>>>> Instead
>> >>>>>>>>>>>>>>>> of always creating 3 replicas of each block with 200%
>> >>>>>>> storage
>> >>>>>>>>>>> space
>> >>>>>>>>>>>>>>>> overhead, HDFS-EC provides data durability through
>> >>>>>>>>>>>>>>>> parity
>> >>>>>>>> data
>> >>>>>>>>>>>> blocks.
>> >>>>>>>>>>>>>>>> With most EC configurations, the storage overhead is no
>> >>>>>>> more
>> >>>>>>>>>>> than
>> >>>>>>>>>>>> 50%.
>> >>>>>>>>>>>>>>>> Based on profiling results of production clusters, we
>> >>>>>>> decided
>> >>>>>>>>>>> to
>> >>>>>>>>>>>>>>>> support EC with the striped block layout in the first
>> >>>>>>> phase,
>> >>>>>>>> so
>> >>>>>>>>>>>>>>>> that small files can be better handled. This means
>> >>>>>>>>>>>>>>>> dividing
>> >>>>>>>>>>> each
>> >>>>>>>>>>>>>>>> logical HDFS file block into smaller units (striping
>> >>>>>>>>>>>>>>>> cells)
>> >>>>>>>> and
>> >>>>>>>>>>>>>>>> spreading them on a set of DataNodes in round-robin
>> >>>>>>> fashion.
>> >>>>>>>>>>> Parity
>> >>>>>>>>>>>>>>>> cells are generated for each stripe of original data
>> cells.
>> >>>>>>>> We
>> >>>>>>>>>>> have
>> >>>>>>>>>>>>>>>> made changes to NameNode, client, and DataNode to
>> >>>>>>> generalize
>> >>>>>>>>>>> the
>> >>>>>>>>>>>>>>>> block concept and handle the mapping between a logical
>> >>>>>>>>>>>>>>>> file
>> >>>>>>>>>>> block
>> >>>>>>>>>>>>>>>> and its internal storage blocks. For further details
>> >>>>>>>>>>>>>>>> please
>> >>>>>>>> see
>> >>>>>>>>>>> the
>> >>>>>>>>>>>>>>>> design doc on HDFS-7285.
>> >>>>>>>>>>>>>>>> HADOOP-11264 focuses on providing flexible and
>> >>>>>>>> high-performance
>> >>>>>>>>>>>>>>>> codec calculation support.
>> >>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>> The nightly Jenkins job of the branch has reported
>> >>>>>>>>>>>>>>>> several successful runs, and doesn't show new flaky
>> >>>>>>>>>>>>>>>> tests compared
>> >>>>>>>> with
>> >>>>>>>>>>>>>>>> trunk. We have posted several versions of the test plan
>> >>>>>>>>>>> including
>> >>>>>>>>>>>>>>>> both unit testing and cluster testing, and have
>> >>>>>>>>>>>>>>>> executed
>> >>>>>>> most
>> >>>>>>>>>>> tests
>> >>>>>>>>>>>>>>>> in the plan. The most basic functionalities have been
>> >>>>>>>>>>> extensively
>> >>>>>>>>>>>>>>>> tested and verified in several real clusters with
>> >>>>>>>>>>>>>>>> different hardware configurations; results have been
>> >>>>>>>>>>>>>>>> very stable. We
>> >>>>>>>> have
>> >>>>>>>>>>>>>>>> created follow-on tasks for more advanced error
>> >>>>>>>>>>>>>>>> handling
>> >>>>>>> and
>> >>>>>>>>>>>>> optimization under the umbrella HDFS-8031.
>> >>>>>>>>>>>>>>>> We also plan to implement or harden the integration of
>> >>>>>>>>>>>>>>>> EC
>> >>>>>>>> with
>> >>>>>>>>>>>>>>>> existing features such as WebHDFS, snapshot, append,
>> >>>>>>>> truncate,
>> >>>>>>>>>>>>>>>> hflush, hsync, and so forth.
>> >>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>> Development of this feature has been a collaboration
>> >>>>>>>>>>>>>>>> across
>> >>>>>>>>>>> many
>> >>>>>>>>>>>>>>>> companies and institutions. I'd like to thank J.
>> >>>>>>>>>>>>>>>> Andreina,
>> >>>>>>>>>>> Takanobu
>> >>>>>>>>>>>>>>>> Asanuma, Vinayakumar B, Li Bo, Takuya Fukudome, Uma
>> >>>>>>> Maheswara
>> >>>>>>>>>>> Rao
>> >>>>>>>>>>>>>>>> G, Rui Li, Yi Liu, Colin McCabe, Xinwei Qin, Rakesh R,
>> >>>>>>>>>>>>>>>> Gao
>> >>>>>>>> Rui,
>> >>>>>>>>>>> Kai
>> >>>>>>>>>>>>>>>> Sasaki, Walter Su, Tsz Wo Nicholas Sze, Andrew Wang,
>> >>>>>>>>>>>>>>>> Yong
>> >>>>>>>>>>> Zhang,
>> >>>>>>>>>>>>>>>> Jing Zhao, Hui Zheng and Kai Zheng for their code
>> >>>>>>>> contributions
>> >>>>>>>>>>> and
>> >>>>>>>>>>>>> reviews.
>> >>>>>>>>>>>>>>>> Andrew and Kai Zheng also made fundamental
>> >>>>>>>>>>>>>>>> contributions to
>> >>>>>>>> the
>> >>>>>>>>>>>>>>>> initial design. Rui Li, Gao Rui, Kai Sasaki, Kai Zheng
>> >>>>>>>>>>>>>>>> and
>> >>>>>>>> many
>> >>>>>>>>>>>>>>>> other contributors have made great efforts in system
>> >>>>>>> testing.
>> >>>>>>>>>>> Many
>> >>>>>>>>>>>>>>>> thanks go to Weihua Jiang for proposing the JIRA, and
>> >>>>>>>>>>>>>>>> ATM,
>> >>>>>>>> Todd
>> >>>>>>>>>>>>>>>> Lipcon, Silvius Rus, Suresh, as well as many others for
>> >>>>>>>>>>> providing
>> >>>>>>>>>>>>> helpful feedback.
>> >>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>> Following the community convention, this vote will last
>> >>>>>>> for 7
>> >>>>>>>>>>> days
>> >>>>>>>>>>>>>>>> (ending September 29th). Votes from Hadoop committers
>> >>>>>>>>>>>>>>>> are
>> >>>>>>>>>>> binding
>> >>>>>>>>>>>>>>>> but non-binding votes are very welcome as well. And
>> >>>>>>>>>>>>>>>> here's
>> >>>>>>> my
>> >>>>>>>>>>>>>>>> non-binding
>> >>>>>>>>>>>>>> +1.
>> >>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>> Thanks,
>> >>>>>>>>>>>>>>>> ---
>> >>>>>>>>>>>>>>>> Zhe Zhang

Re: Erasure coding in branch-2 [Was Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk]

Posted by Zhe Zhang <zh...@cloudera.com>.
Hi Vinod,

Thanks for the update. After seeing the comment from Elliot, I checked with
Ming Ma and Daryn Sharp offline to get feedback based on their large-scale
deployments (Twitter and Yahoo).

Based on my reading, Ming doesn't have a strong preference between 2.9 and
3.0, and Daryn still has some questions about the stability of the EC
code.

Maybe we should give Ming/Daryn some more time to reply to this thread?

I think we should also address Elliot's comments above regarding major vs.
minor releases, how that impacts the 2.9 upgrade timing, etc.

Thanks,
---
Zhe Zhang

On Tue, Dec 8, 2015 at 2:04 PM, Vinod Kumar Vavilapalli <vi...@apache.org>
wrote:

> Forgot to update this thread. I branched off 2.8 last week. So, we can now
> go ahead and do a merge of HDFS-7285 into branch-2 (version 2.9) like we
> discussed before.
>
> Thanks
> +Vinod

Re: Erasure coding in branch-2 [Was Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk]

Posted by Vinod Kumar Vavilapalli <vi...@apache.org>.
Forgot to update this thread. I branched off 2.8 last week. So, we can now go ahead and do a merge of HDFS-7285 into branch-2 (version 2.9) like we discussed before.

Thanks
+Vinod


> On Nov 3, 2015, at 4:40 PM, Vinod Kumar Vavilapalli <vi...@hortonworks.com> wrote:
> 
> That makes sense.
> 
> Thanks for the discussion everyone, let’s stick to this tentative plan of EC for 2.9.
> 
> I just updated the Roadmap wiki to reflect the same.
> 
> +Vinod
> 
> 
>> On Nov 2, 2015, at 4:26 PM, Zheng, Kai <ka...@intel.com> wrote:
>> 
>> Yeah, so for the issues we recently resolved on trunk and are addressing as follow-on tasks in Phase I, we would label them with "erasure coding" and maybe also set the target version to "2.9" for convenience?
>> 
>> -----Original Message-----
>> From: Jing Zhao [mailto:jing9@apache.org] 
>> Sent: Tuesday, November 03, 2015 8:04 AM
>> To: hdfs-dev@hadoop.apache.org
>> Subject: Re: Erasure coding in branch-2 [Was Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk]
>> 
>> +1 for the plan about Phase I & II.
>> 
>> BTW, maybe out of the scope of this thread, I just want to mention that we should either move the jira under HDFS-8031 or set the jira component to "erasure-coding" when making further improvements or fixing bugs in EC. This way it will be easier to backport EC to 2.9 later.
>> 
>> On Mon, Nov 2, 2015 at 3:48 PM, Vinayakumar B <vinayakumarb.apache@gmail.com> wrote:
>> 
>>> +1 for the idea.
>>> On Nov 3, 2015 07:22, "Zheng, Kai" <ka...@intel.com> wrote:
>>> 
>>>> Sounds good to me. When it's determined to include EC in the 2.9
>>>> release, it may be good to have a rough release date as Zhe asked,
>>>> so the scope of EC can be discussed accordingly. We still have
>>>> quite a few things to do as Phase I follow-on tasks before EC can
>>>> be deployed in a production system. Phase II, to develop non-striping
>>>> EC for cold data, would possibly be started after that. We might
>>>> consider including only Phase I and leaving Phase II for the next
>>>> release, according to the rough release date.
>>>> 
>>>> Regards,
>>>> Kai
>>>> 
>>>> -----Original Message-----
>>>> From: Gangumalla, Uma [mailto:uma.gangumalla@intel.com]
>>>> Sent: Tuesday, November 03, 2015 5:41 AM
>>>> To: hdfs-dev@hadoop.apache.org
>>>> Subject: Re: Erasure coding in branch-2 [Was Re: [VOTE] Merge 
>>>> HDFS-7285 (erasure coding) branch to trunk]
>>>> 
>>>> +1 for EC to go into 2.9. Yes, 3.x would be a long way to go when we
>>>> plan to have 2.8 and 2.9 releases.
>>>> 
>>>> Regards,
>>>> Uma
>>>> 
>>>> On 11/2/15, 11:49 AM, "Vinod Vavilapalli" <vi...@hortonworks.com> wrote:
>>>> 
>>>>> Forking the thread. Started looking at the 2.8 list, various
>>>>> features' status and arrived here.
>>>>> 
>>>>> While I understand the pervasive nature of EC and a need for a 
>>>>> significant bake-in, moving this to a 3.x release is not a good idea.
>>>>> We will surely get a 2.8 out this year and, as needed, I can even 
>>>>> spend time getting started on a 2.9. OTOH, 3.x is long ways off, 
>>>>> and given all the incompatibilities there, it would be a while 
>>>>> before users can get their hands on EC if it were to be only on 
>>>>> 3.x. At best, this may force sites that want EC to backport the
>>>>> entire EC feature to older releases; at worst, this will repeat
>>>>> the mess of the 0.20 security release forks.
>>>>> 
>>>>> If we think adding this to 2.8 (even if it is switched off) is too
>>>>> much risk per our original plan, let's move this to 2.9, thereby
>>>>> leaving enough time for stability, integration testing and bake-in,
>>>>> and a realistic chance of having it end up on users' clusters soonish.
>>>>> 
>>>>> +Vinod
>>>>> 
>>>>>> On Oct 19, 2015, at 1:44 PM, Andrew Wang <an...@cloudera.com> wrote:
>>>>>> 
>>>>>> I think our plan thus far has been to target this for 3.0. I'm
>>>>>> okay with putting it in branch-2 if we've given a hard look at
>>>>>> compatibility, but I'll note though that 2.8 is already looking
>>>>>> like quite a large release, and our release bandwidth has been
>>>>>> focused on the 2.6 and 2.7 maintenance releases. Adding another
>>>>>> multi-hundred JIRAs to 2.8 might make it too unwieldy to get out
>>>>>> the door. If we bump EC past that, 3.0 might very well be our
>>>>>> next release vehicle. I do plan to revive the 3.0 schedule some
>>>>>> time next year. With EC and JDK8 in a good spot, the only big
>>>>>> feature remaining is classpath isolation.
>>>>>>
>>>>>> EC is also a pretty fundamental change to HDFS. Even if it's
>>>>>> compatible, in terms of size and impact it might best belong in a
>>>>>> new major release.
>>>>>> 
>>>>>> Best,
>>>>>> Andrew
>>>>>> 
>>>>>> On Fri, Oct 16, 2015 at 7:04 PM, Vinayakumar B <vinayakumarb.apache@gmail.com> wrote:
>>>>>> 
>>>>>>> Does anyone else also think that the feature is ready to go to
>>>>>>> branch-2 as well?
>>>>>>>
>>>>>>> It's been more than 2 weeks since EC landed on trunk. IMO it looks
>>>>>>> quite stable since then and ready to go into branch-2.
>>>>>>> 
>>>>>>> -Vinay
>>>>>>> On Oct 6, 2015 12:51 AM, "Zhe Zhang" <zh...@cloudera.com> wrote:
>>>>>>> 
>>>>>>>> Thanks Vinay for capturing the issue and Uma for offering the help.
>>>>>>>> 
>>>>>>>> ---
>>>>>>>> Zhe Zhang
>>>>>>>> 
>>>>>>>> On Mon, Oct 5, 2015 at 12:19 PM, Gangumalla, Uma <uma.gangumalla@intel.com> wrote:
>>>>>>>> 
>>>>>>>>> Vinay,
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> I would merge them as part of HDFS-9182.
>>>>>>>>> 
>>>>>>>>> Thanks,
>>>>>>>>> Uma
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On 10/5/15, 12:48 AM, "Vinayakumar B" <vi...@apache.org> wrote:
>>>>>>>>> 
>>>>>>>>>> Hi Andrew,
>>>>>>>>>> I see CHANGES.txt entries not yet merged from CHANGES-HDFS-EC-7285.txt.
>>>>>>>>>> 
>>>>>>>>>> Was this intentional?
>>>>>>>>>> 
>>>>>>>>>> Regards,
>>>>>>>>>> Vinay
>>>>>>>>>> 
>>>>>>>>>> On Wed, Sep 30, 2015 at 9:15 PM, Andrew Wang <andrew.wang@cloudera.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Branch has been merged to trunk, thanks again to everyone
>>>>>>>>>>> who worked on the feature!
>>>>>>>>>>> 
>>>>>>>>>>> On Tue, Sep 29, 2015 at 10:44 PM, Zhe Zhang <zh...@cloudera.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Thanks everyone who has participated in this discussion.
>>>>>>>>>>>>
>>>>>>>>>>>> With 7 +1's (5 binding and 2 non-binding), and no -1, this vote
>>>>>>>>>>>> has passed. I will do a final 'git merge' with trunk and work
>>>>>>>>>>>> with Andrew to merge the branch to trunk. I'll update on this
>>>>>>>>>>>> thread when the merge is done.
>>>>>>>>>>>> 
>>>>>>>>>>>> ---
>>>>>>>>>>>> Zhe Zhang
>>>>>>>>>>>> 
>>>>>>>>>>>> On Thu, Sep 24, 2015 at 11:08 PM, Liu, Yi A <yi...@intel.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> (Change it to binding.)
>>>>>>>>>>>>>
>>>>>>>>>>>>> +1
>>>>>>>>>>>>> I have been involved in the development and code review on the
>>>>>>>>>>>>> feature branch. It's a great feature and I think it's ready to
>>>>>>>>>>>>> merge it into trunk.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Thanks all for the contribution.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>> Yi Liu
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>> From: Liu, Yi A
>>>>>>>>>>>>> Sent: Friday, September 25, 2015 1:51 PM
>>>>>>>>>>>>> To: hdfs-dev@hadoop.apache.org
>>>>>>>>>>>>> Subject: RE: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk
>>>>>>>>>>>>>
>>>>>>>>>>>>> +1 (non-binding)
>>>>>>>>>>>>> I have been involved in the development and code review on the
>>>>>>>>>>>>> feature branch. It's a great feature and I think it's ready to
>>>>>>>>>>>>> merge it into trunk.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Thanks all for the contribution.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>> Yi Liu
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>> From: Vinayakumar B [mailto:vinayakumarb@apache.org]
>>>>>>>>>>>>> Sent: Friday, September 25, 2015 12:21 PM
>>>>>>>>>>>>> To: hdfs-dev@hadoop.apache.org
>>>>>>>>>>>>> Subject: Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk
>>>>>>>>>>>>>
>>>>>>>>>>>>> +1,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I've been involved starting from design and development of
>>>>>>>>>>>>> ErasureCoding. I think phase 1 of this development is ready to
>>>>>>>>>>>>> be merged to trunk. It has come a long way to the current state
>>>>>>>>>>>>> with significant effort of many Contributors and Reviewers for
>>>>>>>>>>>>> both design and code.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Thanks Everyone for the efforts.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>> Vinay
>>>>>>>>>>>>> 
>>>>>>>>>>>>> On Wed, Sep 23, 2015 at 10:53 PM, Jing Zhao <ji...@apache.org> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> +1
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I've been involved in both development and review on the branch,
>>>>>>>>>>>>>> and I believe it's now ready to get merged into trunk. Many
>>>>>>>>>>>>>> thanks to all the contributors and reviewers!
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> -Jing
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On Tue, Sep 22, 2015 at 6:17 PM, Zheng, Kai <kai.zheng@intel.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Non-binding +1
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> According to our extensive performance tests, striping + ISA-L
>>>>>>>>>>>>>>> coder based erasure coding not only can save storage, but also
>>>>>>>>>>>>>>> can increase the throughput of a client or a cluster. It will
>>>>>>>>>>>>>>> be a great addition to HDFS and its users. Based on the latest
>>>>>>>>>>>>>>> branch code, we also observed it's very reliable in the
>>>>>>>>>>>>>>> concurrent tests. We'll provide the perf test report after
>>>>>>>>>>>>>>> it's sorted out and hope it helps.
>>>>>>>>>>>>>>> Thanks!
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>> Kai
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>>>> From: Gangumalla, Uma [mailto:uma.gangumalla@intel.com]
>>>>>>>>>>>>>>> Sent: Wednesday, September 23, 2015 8:50 AM
>>>>>>>>>>>>>>> To: hdfs-dev@hadoop.apache.org; common-dev@hadoop.apache.org
>>>>>>>>>>>>>>> Subject: Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> +1
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Great addition to HDFS. Thanks all contributors for the nice work.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>> Uma
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> On 9/22/15, 3:40 PM, "Zhe Zhang" <zh...@cloudera.com> wrote:
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> I'd like to propose a vote to merge the HDFS-7285 feature
>>>>>>>>>>>>>>>> branch back to trunk. Since November 2014 we have been
>>>>>>>>>>>>>>>> designing and developing this feature under the umbrella JIRAs
>>>>>>>>>>>>>>>> HDFS-7285 and HADOOP-11264, and have committed approximately
>>>>>>>>>>>>>>>> 210 patches.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> The HDFS-7285 feature branch was created to support the first
>>>>>>>>>>>>>>>> phase of HDFS erasure coding (HDFS-EC). The objective of
>>>>>>>>>>>>>>>> HDFS-EC is to significantly reduce storage space usage in HDFS
>>>>>>>>>>>>>>>> clusters. Instead of always creating 3 replicas of each block
>>>>>>>>>>>>>>>> with 200% storage space overhead, HDFS-EC provides data
>>>>>>>>>>>>>>>> durability through parity data blocks. With most EC
>>>>>>>>>>>>>>>> configurations, the storage overhead is no more than 50%.
>>>>>>>>>>>>>>>> Based on profiling results of production clusters, we decided
>>>>>>>>>>>>>>>> to support EC with the striped block layout in the first
>>>>>>>>>>>>>>>> phase, so that small files can be better handled. This means
>>>>>>>>>>>>>>>> dividing each logical HDFS file block into smaller units
>>>>>>>>>>>>>>>> (striping cells) and spreading them on a set of DataNodes in
>>>>>>>>>>>>>>>> round-robin fashion. Parity cells are generated for each
>>>>>>>>>>>>>>>> stripe of original data cells. We have made changes to
>>>>>>>>>>>>>>>> NameNode, client, and DataNode to generalize the block concept
>>>>>>>>>>>>>>>> and handle the mapping between a logical file block and its
>>>>>>>>>>>>>>>> internal storage blocks. For further details please see the
>>>>>>>>>>>>>>>> design doc on HDFS-7285.
>>>>>>>>>>>>>>>> HADOOP-11264 focuses on providing flexible and
>>>>>>>>>>>>>>>> high-performance codec calculation support.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> The nightly Jenkins job of the branch has reported several
>>>>>>>>>>>>>>>> successful runs, and doesn't show new flaky tests compared
>>>>>>>>>>>>>>>> with trunk. We have posted several versions of the test plan
>>>>>>>>>>>>>>>> including both unit testing and cluster testing, and have
>>>>>>>>>>>>>>>> executed most tests in the plan. The most basic
>>>>>>>>>>>>>>>> functionalities have been extensively tested and verified in
>>>>>>>>>>>>>>>> several real clusters with different hardware configurations;
>>>>>>>>>>>>>>>> results have been very stable. We have created follow-on tasks
>>>>>>>>>>>>>>>> for more advanced error handling and optimization under the
>>>>>>>>>>>>>>>> umbrella HDFS-8031.
>>>>>>>>>>>>>>>> We also plan to implement or harden the integration of EC with
>>>>>>>>>>>>>>>> existing features such as WebHDFS, snapshot, append, truncate,
>>>>>>>>>>>>>>>> hflush, hsync, and so forth.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Development of this feature has been a collaboration 
>>>>>>>>>>>>>>>> across
>>>>>>>>>>> many
>>>>>>>>>>>>>>>> companies and institutions. I'd like to thank J. 
>>>>>>>>>>>>>>>> Andreina,
>>>>>>>>>>> Takanobu
>>>>>>>>>>>>>>>> Asanuma, Vinayakumar B, Li Bo, Takuya Fukudome, Uma
>>>>>>> Maheswara
>>>>>>>>>>> Rao
>>>>>>>>>>>>>>>> G, Rui Li, Yi Liu, Colin McCabe, Xinwei Qin, Rakesh R, 
>>>>>>>>>>>>>>>> Gao
>>>>>>>> Rui,
>>>>>>>>>>> Kai
>>>>>>>>>>>>>>>> Sasaki, Walter Su, Tsz Wo Nicholas Sze, Andrew Wang, 
>>>>>>>>>>>>>>>> Yong
>>>>>>>>>>> Zhang,
>>>>>>>>>>>>>>>> Jing Zhao, Hui Zheng and Kai Zheng for their code
>>>>>>>> contributions
>>>>>>>>>>> and
>>>>>>>>>>>>> reviews.
>>>>>>>>>>>>>>>> Andrew and Kai Zheng also made fundamental 
>>>>>>>>>>>>>>>> contributions to
>>>>>>>> the
>>>>>>>>>>>>>>>> initial design. Rui Li, Gao Rui, Kai Sasaki, Kai Zheng 
>>>>>>>>>>>>>>>> and
>>>>>>>> many
>>>>>>>>>>>>>>>> other contributors have made great efforts in system
>>>>>>> testing.
>>>>>>>>>>> Many
>>>>>>>>>>>>>>>> thanks go to Weihua Jiang for proposing the JIRA, and 
>>>>>>>>>>>>>>>> ATM,
>>>>>>>> Todd
>>>>>>>>>>>>>>>> Lipcon, Silvius Rus, Suresh, as well as many others for
>>>>>>>>>>> providing
>>>>>>>>>>>>> helpful feedbacks.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Following the community convention, this vote will last
>>>>>>> for 7
>>>>>>>>>>> days
>>>>>>>>>>>>>>>> (ending September 29th). Votes from Hadoop committers 
>>>>>>>>>>>>>>>> are
>>>>>>>>>>> binding
>>>>>>>>>>>>>>>> but non-binding votes are very welcome as well. And 
>>>>>>>>>>>>>>>> here's
>>>>>>> my
>>>>>>>>>>>>>>>> non-binding
>>>>>>>>>>>>>> +1.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>> ---
>>>>>>>>>>>>>>>> Zhe Zhang
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>> 
>>>> 
>>>> 
>>> 
> 


Re: Erasure coding in branch-2 [Was Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk]

Posted by Zhe Zhang <zh...@cloudera.com>.
To add to the above discussion on umbrella JIRA: the COMMON side changes of
EC have been tracked under HADOOP-11264 (before merge) and HADOOP-11842
(after merge).

---
Zhe Zhang

On Tue, Nov 3, 2015 at 4:40 PM, Vinod Vavilapalli <vi...@hortonworks.com>
wrote:

> That makes sense.
>
> Thanks for the discussion everyone, let’s stick to this tentative plan of
> EC for 2.9.
>
> I just updated the Roadmap wiki to reflect the same.
>
> +Vinod
>
>
> > On Nov 2, 2015, at 4:26 PM, Zheng, Kai <ka...@intel.com> wrote:
> >
> > Yeah, so for the issues we recently resolved on trunk and are addressing
> > as follow-on tasks in Phase I, we would label them with "erasure coding"
> > and maybe also set the target version as "2.9" for convenience?
> >
> > -----Original Message-----
> > From: Jing Zhao [mailto:jing9@apache.org]
> > Sent: Tuesday, November 03, 2015 8:04 AM
> > To: hdfs-dev@hadoop.apache.org
> > Subject: Re: Erasure coding in branch-2 [Was Re: [VOTE] Merge HDFS-7285
> > (erasure coding) branch to trunk]
> >
> > +1 for the plan about Phase I & II.
> >
> > BTW, maybe out of the scope of this thread, I just want to mention that we
> > should either move the JIRA under HDFS-8031 or set the JIRA component to
> > "erasure-coding" when making further improvements or fixing bugs in EC.
> > This way it will be easier to backport EC to 2.9 later.
> >
> > On Mon, Nov 2, 2015 at 3:48 PM, Vinayakumar B <vinayakumarb.apache@gmail.com>
> > wrote:
> >
> >> +1 for the idea.
> >> On Nov 3, 2015 07:22, "Zheng, Kai" <ka...@intel.com> wrote:
> >>
> >>> Sounds good to me. Once it's decided to include EC in the 2.9
> >>> release, it may be good to have a rough release date as Zhe asked,
> >>> so the scope of EC can be discussed accordingly. We still have
> >>> quite a few things to do as Phase I follow-on tasks before EC can
> >>> be deployed in a production system. Phase II, to develop non-striping
> >>> EC for cold data, would possibly be started after that. We might
> >>> consider including only Phase I and leaving Phase II for the next
> >>> release, depending on the rough release date.
> >>>
> >>> Regards,
> >>> Kai
> >>>
> >>> -----Original Message-----
> >>> From: Gangumalla, Uma [mailto:uma.gangumalla@intel.com]
> >>> Sent: Tuesday, November 03, 2015 5:41 AM
> >>> To: hdfs-dev@hadoop.apache.org
> >>> Subject: Re: Erasure coding in branch-2 [Was Re: [VOTE] Merge
> >>> HDFS-7285 (erasure coding) branch to trunk]
> >>>
> >>> +1 for EC to go into 2.9. Yes, 3.x would be a long way off when we
> >>> plan to have 2.8 and 2.9 releases.
> >>>
> >>> Regards,
> >>> Uma
> >>>
> >>> On 11/2/15, 11:49 AM, "Vinod Vavilapalli" <vi...@hortonworks.com> wrote:
> >>>
> >>>> Forking the thread. Started looking at the 2.8 list and various
> >>>> features' status, and arrived here.
> >>>>
> >>>> While I understand the pervasive nature of EC and the need for a
> >>>> significant bake-in, moving this to a 3.x release is not a good idea.
> >>>> We will surely get a 2.8 out this year and, as needed, I can even
> >>>> spend time getting started on a 2.9. OTOH, 3.x is a long way off,
> >>>> and given all the incompatibilities there, it would be a while
> >>>> before users can get their hands on EC if it were to be only on
> >>>> 3.x. At best, this may force sites that want EC to backport the
> >>>> entire EC feature to older releases; at worst, this will repeat
> >>>> the mess of the 0.20 security release forks.
> >>>>
> >>>> If we think adding this to 2.8 (even if it is switched off) is too
> >>>> much risk per our original plan, let's move this to 2.9, thereby
> >>>> leaving enough time for stability, integration testing and bake-in,
> >>>> and a realistic chance of having it end up on users' clusters soonish.
> >>>>
> >>>> +Vinod
> >>>>
> >>>>> On Oct 19, 2015, at 1:44 PM, Andrew Wang
> >>>>> <an...@cloudera.com>
> >>>>> wrote:
> >>>>>
> >>>>> I think our plan thus far has been to target this for 3.0. I'm
> >>>>> okay with putting it in branch-2 if we've given a hard look at
> >>>>> compatibility, but I'll note though that 2.8 is already looking
> >>>>> like quite a large release, and our release bandwidth has been
> >>>>> focused on the 2.6 and 2.7 maintenance releases. Adding another
> >>>>> multi-hundred JIRAs to 2.8 might make it too unwieldy to get out
> >>>>> the door. If we bump EC past that, 3.0 might very well be our
> >>>>> next release vehicle. I do plan to revive the 3.0 schedule some
> >>>>> time next year. With EC and JDK8 in a good spot, the only big
> >>>>> feature remaining is classpath isolation.
> >>>>>
> >>>>> EC is also a pretty fundamental change to HDFS. Even if it's
> >>>>> compatible, in terms of size and impact it might best belong in a
> >>>>> new major release.
> >>>>>
> >>>>> Best,
> >>>>> Andrew
> >>>>>
> >>>>> On Fri, Oct 16, 2015 at 7:04 PM, Vinayakumar B <
> >>>>> vinayakumarb.apache@gmail.com> wrote:
> >>>>>
> >>>>>> Does anyone else also think that the feature is ready to go to
> >>>>>> branch-2 as well?
> >>>>>>
> >>>>>> It's been > 2 weeks since EC landed on trunk. IMO it looks quite
> >>>>>> stable since then and ready to go into branch-2.
> >>>>>>
> >>>>>> -Vinay
> >>>>>> On Oct 6, 2015 12:51 AM, "Zhe Zhang" <zh...@cloudera.com> wrote:
> >>>>>>
> >>>>>>> Thanks Vinay for capturing the issue and Uma for offering the help.
> >>>>>>>
> >>>>>>> ---
> >>>>>>> Zhe Zhang
> >>>>>>>
> >>>>>>> On Mon, Oct 5, 2015 at 12:19 PM, Gangumalla, Uma
> >>>>>>> <uma.gangumalla@intel.com> wrote:
> >>>>>>>
> >>>>>>>> Vinay,
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> I would merge them as part of HDFS-9182.
> >>>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>> Uma
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On 10/5/15, 12:48 AM, "Vinayakumar B"
> >>>>>>>> <vi...@apache.org>
> >>>>>>>> wrote:
> >>>>>>>>
> >>>>>>>>> Hi Andrew,
> >>>>>>>>> I see CHANGES.txt entries not yet merged from
> >>>>>>>>> CHANGES-HDFS-EC-7285.txt.
> >>>>>>>>>
> >>>>>>>>> Was this intentional?
> >>>>>>>>>
> >>>>>>>>> Regards,
> >>>>>>>>> Vinay
> >>>>>>>>>
> >>>>>>>>> On Wed, Sep 30, 2015 at 9:15 PM, Andrew Wang
> >>>>>>>>> <andrew.wang@cloudera.com> wrote:
> >>>>>>>>>
> >>>>>>>>>> Branch has been merged to trunk, thanks again to everyone
> >>>>>>>>>> who worked on the feature!
> >>>>>>>>>>
> >>>>>>>>>> On Tue, Sep 29, 2015 at 10:44 PM, Zhe Zhang
> >>>>>>>>>> <zh...@cloudera.com>
> >>>>>>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Thanks everyone who has participated in this discussion.
> >>>>>>>>>>>
> >>>>>>>>>>> With 7 +1's (5 binding and 2 non-binding), and no -1, this
> >>>>>>>>>>> vote has passed.
> >>>>>>>>>>> I will do a final 'git merge' with trunk and work with
> >>>>>>>>>>> Andrew to merge the branch to trunk. I'll update on this
> >>>>>>>>>>> thread when the merge is done.
> >>>>>>>>>>>
> >>>>>>>>>>> ---
> >>>>>>>>>>> Zhe Zhang
> >>>>>>>>>>>
> >>>>>>>>>>> On Thu, Sep 24, 2015 at 11:08 PM, Liu, Yi A
> >>>>>>>>>>> <yi...@intel.com> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> (Change it to binding.)
> >>>>>>>>>>>>
> >>>>>>>>>>>> +1
> >>>>>>>>>>>> I have been involved in the development and code review on
> >>>>>>>>>>>> the feature branch. It's a great feature and I think it's
> >>>>>>>>>>>> ready to merge into trunk.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Thanks all for the contribution.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Regards,
> >>>>>>>>>>>> Yi Liu
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> -----Original Message-----
> >>>>>>>>>>>> From: Liu, Yi A
> >>>>>>>>>>>> Sent: Friday, September 25, 2015 1:51 PM
> >>>>>>>>>>>> To: hdfs-dev@hadoop.apache.org
> >>>>>>>>>>>> Subject: RE: [VOTE] Merge HDFS-7285 (erasure coding)
> >>>>>>>>>>>> branch to trunk
> >>>>>>>>>>>>
> >>>>>>>>>>>> +1 (non-binding)
> >>>>>>>>>>>> I have been involved in the development and code review on
> >>>>>>>>>>>> the feature branch. It's a great feature and I think it's
> >>>>>>>>>>>> ready to merge into trunk.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Thanks all for the contribution.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Regards,
> >>>>>>>>>>>> Yi Liu
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> -----Original Message-----
> >>>>>>>>>>>> From: Vinayakumar B [mailto:vinayakumarb@apache.org]
> >>>>>>>>>>>> Sent: Friday, September 25, 2015 12:21 PM
> >>>>>>>>>>>> To: hdfs-dev@hadoop.apache.org
> >>>>>>>>>>>> Subject: Re: [VOTE] Merge HDFS-7285 (erasure coding)
> >>>>>>>>>>>> branch to trunk
> >>>>>>>>>>>>
> >>>>>>>>>>>> +1,
> >>>>>>>>>>>>
> >>>>>>>>>>>> I've been involved starting from design and development of
> >>>>>>>>>>>> ErasureCoding.
> >>>>>>>>>>>> I think phase 1 of this development is ready to be merged
> >>>>>>>>>>>> to trunk.
> >>>>>>>>>>>> It has come a long way to the current state with
> >>>>>>>>>>>> significant effort of many Contributors and Reviewers for
> >>>>>>>>>>>> both design and code.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Thanks Everyone for the efforts.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Regards,
> >>>>>>>>>>>> Vinay
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Wed, Sep 23, 2015 at 10:53 PM, Jing Zhao
> >>>>>>>>>>>> <ji...@apache.org> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>>> +1
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I've been involved in both development and review on the
> >>>>>>>>>>>>> branch, and I believe it's now ready to get merged into
> >>>>>>>>>>>>> trunk. Many thanks to all the contributors and reviewers!
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>> -Jing
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Tue, Sep 22, 2015 at 6:17 PM, Zheng, Kai
> >>>>>>>>>>>>> <kai.zheng@intel.com> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> Non-binding +1
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> According to our extensive performance tests, striping +
> >>>>>>>>>>>>>> ISA-L coder based erasure coding not only can save
> >>>>>>>>>>>>>> storage, but also can increase the throughput of a client
> >>>>>>>>>>>>>> or a cluster. It will be a great addition to HDFS and its
> >>>>>>>>>>>>>> users. Based on the latest branch codes, we also observed
> >>>>>>>>>>>>>> it's very reliable in the concurrent tests. We'll provide
> >>>>>>>>>>>>>> the perf test report after it's sorted out and hope it
> >>>>>>>>>>>>>> helps.
> >>>>>>>>>>>>>> Thanks!
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Regards,
> >>>>>>>>>>>>>> Kai
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> -----Original Message-----
> >>>>>>>>>>>>>> From: Gangumalla, Uma [mailto:uma.gangumalla@intel.com]
> >>>>>>>>>>>>>> Sent: Wednesday, September 23, 2015 8:50 AM
> >>>>>>>>>>>>>> To: hdfs-dev@hadoop.apache.org; common-dev@hadoop.apache.org
> >>>>>>>>>>>>>> Subject: Re: [VOTE] Merge HDFS-7285 (erasure coding)
> >>>>>>>>>>>>>> branch to trunk
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> +1
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Great addition to HDFS. Thanks all contributors for the
> >>>>>>>>>>>>>> nice work.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Regards,
> >>>>>>>>>>>>>> Uma
> >>>>>>>>>>>>>>

Re: Erasure coding in branch-2 [Was Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk]

Posted by Vinod Vavilapalli <vi...@hortonworks.com>.
That makes sense.

Thanks for the discussion everyone, let’s stick to this tentative plan of EC for 2.9.

I just updated the Roadmap wiki to reflect the same.

+Vinod


>>>>>>>>>> Takanobu
>>>>>>>>>>>>>>> Asanuma, Vinayakumar B, Li Bo, Takuya Fukudome, Uma
>>>>>> Maheswara
>>>>>>>>>> Rao
>>>>>>>>>>>>>>> G, Rui Li, Yi Liu, Colin McCabe, Xinwei Qin, Rakesh R, 
>>>>>>>>>>>>>>> Gao
>>>>>>> Rui,
>>>>>>>>>> Kai
>>>>>>>>>>>>>>> Sasaki, Walter Su, Tsz Wo Nicholas Sze, Andrew Wang, 
>>>>>>>>>>>>>>> Yong
>>>>>>>>>> Zhang,
>>>>>>>>>>>>>>> Jing Zhao, Hui Zheng and Kai Zheng for their code
>>>>>>> contributions
>>>>>>>>>> and
>>>>>>>>>>>> reviews.
>>>>>>>>>>>>>>> Andrew and Kai Zheng also made fundamental 
>>>>>>>>>>>>>>> contributions to
>>>>>>> the
>>>>>>>>>>>>>>> initial design. Rui Li, Gao Rui, Kai Sasaki, Kai Zheng 
>>>>>>>>>>>>>>> and
>>>>>>> many
>>>>>>>>>>>>>>> other contributors have made great efforts in system
>>>>>> testing.
>>>>>>>>>> Many
>>>>>>>>>>>>>>> thanks go to Weihua Jiang for proposing the JIRA, and 
>>>>>>>>>>>>>>> ATM,
>>>>>>> Todd
>>>>>>>>>>>>>>> Lipcon, Silvius Rus, Suresh, as well as many others for
>>>>>>>>>> providing
>>>>>>>>>>>> helpful feedbacks.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Following the community convention, this vote will last
>>>>>> for 7
>>>>>>>>>> days
>>>>>>>>>>>>>>> (ending September 29th). Votes from Hadoop committers 
>>>>>>>>>>>>>>> are
>>>>>>>>>> binding
>>>>>>>>>>>>>>> but non-binding votes are very welcome as well. And 
>>>>>>>>>>>>>>> here's
>>>>>> my
>>>>>>>>>>>>>>> non-binding
>>>>>>>>>>>>> +1.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> ---
>>>>>>>>>>>>>>> Zhe Zhang
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>> 
>>> 
>>> 
>> 


RE: Erasure coding in branch-2 [Was Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk]

Posted by "Zheng, Kai" <ka...@intel.com>.
Yeah. So for the issues we recently resolved on trunk and those we are addressing as follow-on tasks in Phase I, should we label them with "erasure coding" and maybe also set the target version to "2.9" for convenience?

-----Original Message-----
From: Jing Zhao [mailto:jing9@apache.org] 
Sent: Tuesday, November 03, 2015 8:04 AM
To: hdfs-dev@hadoop.apache.org
Subject: Re: Erasure coding in branch-2 [Was Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk]

+1 for the plan about Phase I & II.

BTW, this may be out of the scope of this thread, but I just want to mention that we should either move each JIRA under HDFS-8031 or set the JIRA component to "erasure-coding" when making further improvements or fixing bugs in EC. That way it will be easier to backport EC to 2.9 later.

On Mon, Nov 2, 2015 at 3:48 PM, Vinayakumar B <vinayakumarb.apache@gmail.com> wrote:

> +1 for the idea.
> On Nov 3, 2015 07:22, "Zheng, Kai" <ka...@intel.com> wrote:
>
> > Sounds good to me. When it's determined to include EC in the 2.9
> > release, it may be good to have a rough release date as Zhe asked,
> > so the scope of EC can be discussed accordingly. We still have
> > quite a few Phase I follow-on tasks to do before EC can
> > be deployed in a production system. Phase II, to develop non-striping
> > EC for cold data, would possibly be started after that. We might
> > consider including only Phase I and leaving Phase II for the next
> > release, according to the rough release date.
> >
> > Regards,
> > Kai
> >
> > -----Original Message-----
> > From: Gangumalla, Uma [mailto:uma.gangumalla@intel.com]
> > Sent: Tuesday, November 03, 2015 5:41 AM
> > To: hdfs-dev@hadoop.apache.org
> > Subject: Re: Erasure coding in branch-2 [Was Re: [VOTE] Merge 
> > HDFS-7285 (erasure coding) branch to trunk]
> >
> > +1 for EC to go into 2.9. Yes, 3.x would be a long way to go when we
> > plan to have 2.8 and 2.9 releases.
> >
> > Regards,
> > Uma
> >
> > On 11/2/15, 11:49 AM, "Vinod Vavilapalli" <vi...@hortonworks.com> wrote:
> >
> > >Forking the thread. Started looking at the 2.8 list, various
> > >features' status and arrived here.
> > >
> > >While I understand the pervasive nature of EC and a need for a
> > >significant bake-in, moving this to a 3.x release is not a good idea.
> > >We will surely get a 2.8 out this year and, as needed, I can even
> > >spend time getting started on a 2.9. OTOH, 3.x is a long way off,
> > >and given all the incompatibilities there, it would be a while
> > >before users can get their hands on EC if it were to be only on
> > >3.x. At best, this may force sites that want EC to backport the
> > >entire EC feature to older releases; at worst, this will repeat
> > >the mess of the 0.20 security release forks.
> > >
> > >If we think adding this to 2.8 (even if it is switched off) is too
> > >much risk per our original plan, let's move this to 2.9, thereby
> > >leaving enough time for stability, integration testing and bake-in,
> > >and a realistic chance of having it end up on users' clusters soonish.
> > >
> > >+Vinod
> > >
> > >> On Oct 19, 2015, at 1:44 PM, Andrew Wang 
> > >><an...@cloudera.com>
> > >>wrote:
> > >>
> > >> I think our plan thus far has been to target this for 3.0. I'm
> > >> okay with putting it in branch-2 if we've given a hard look at
> > >> compatibility, but I'll note though that 2.8 is already looking
> > >> like quite a large release, and our release bandwidth has been
> > >> focused on the 2.6 and 2.7 maintenance releases. Adding another
> > >> multi-hundred JIRAs to 2.8 might make it too unwieldy to get out
> > >> the door. If we bump EC past that, 3.0 might very well be our
> > >> next release vehicle. I do plan to revive the 3.0 schedule some
> > >> time next year. With EC and JDK8 in a good spot, the only big
> > >> feature remaining is classpath isolation.
> > >>
> > >> EC is also a pretty fundamental change to HDFS. Even if it's
> > >> compatible, in terms of size and impact it might best belong in a
> > >> new major release.
> > >>
> > >> Best,
> > >> Andrew
> > >>
> > >> On Fri, Oct 16, 2015 at 7:04 PM, Vinayakumar B < 
> > >> vinayakumarb.apache@gmail.com> wrote:
> > >>
> > >>> Does anyone else also think the feature is ready to go to
> > >>> branch-2 as well?
> > >>>
> > >>> It's been > 2 weeks since EC landed on trunk. IMO it looks quite
> > >>> stable since then and ready to go in branch-2.
> > >>>
> > >>> -Vinay
> > >>> On Oct 6, 2015 12:51 AM, "Zhe Zhang" <zh...@cloudera.com> wrote:
> > >>>
> > >>>> Thanks Vinay for capturing the issue and Uma for offering the help.
> > >>>>
> > >>>> ---
> > >>>> Zhe Zhang
> > >>>>
> > >>>> On Mon, Oct 5, 2015 at 12:19 PM, Gangumalla, Uma <
> > >>> uma.gangumalla@intel.com
> > >>>>>
> > >>>> wrote:
> > >>>>
> > >>>>> Vinay,
> > >>>>>
> > >>>>>
> > >>>>> I would merge them as part of HDFS-9182.
> > >>>>>
> > >>>>> Thanks,
> > >>>>> Uma
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>>> On 10/5/15, 12:48 AM, "Vinayakumar B" 
> > >>>>><vi...@apache.org>
> > >>>>>wrote:
> > >>>>>
> > >>>>>> Hi Andrew,
> > >>>>>> I see CHANGES.txt entries not yet merged from
> > >>> CHANGES-HDFS-EC-7285.txt.
> > >>>>>>
> > >>>>>> Was this intentional?
> > >>>>>>
> > >>>>>> Regards,
> > >>>>>> Vinay
> > >>>>>>
> > >>>>>> On Wed, Sep 30, 2015 at 9:15 PM, Andrew Wang <
> > >>> andrew.wang@cloudera.com>
> > >>>>>> wrote:
> > >>>>>>
> > >>>>>>> Branch has been merged to trunk, thanks again to everyone 
> > >>>>>>>who worked
> > >>>> on
> > >>>>>>> the
> > >>>>>>> feature!
> > >>>>>>>
> > >>>>>>> On Tue, Sep 29, 2015 at 10:44 PM, Zhe Zhang 
> > >>>>>>> <zh...@cloudera.com>
> > >>>>>>> wrote:
> > >>>>>>>
> > >>>>>>>> Thanks everyone who has participated in this discussion.
> > >>>>>>>>
> > >>>>>>>> With 7 +1's (5 binding and 2 non-binding), and no -1, this 
> > >>>>>>>> vote
> > >>> has
> > >>>>>>> passed.
> > >>>>>>>> I will do a final 'git merge' with trunk and work with 
> > >>>>>>>> Andrew to
> > >>>> merge
> > >>>>>>> the
> > >>>>>>>> branch to trunk. I'll update on this thread when the merge 
> > >>>>>>>> is
> > >>> done.
> > >>>>>>>>
> > >>>>>>>> ---
> > >>>>>>>> Zhe Zhang
> > >>>>>>>>
> > >>>>>>>> On Thu, Sep 24, 2015 at 11:08 PM, Liu, Yi A 
> > >>>>>>>> <yi...@intel.com>
> > >>>>>>> wrote:
> > >>>>>>>>
> > >>>>>>>>> (Change it to binding.)
> > >>>>>>>>>
> > >>>>>>>>> +1
> > >>>>>>>>> I have been involved in the development and code review on 
> > >>>>>>>>> the
> > >>>>>>> feature
> > >>>>>>>>> branch. It's a great feature and I think it's ready to 
> > >>>>>>>>> merge it
> > >>>> into
> > >>>>>>>> trunk.
> > >>>>>>>>>
> > >>>>>>>>> Thanks all for the contribution.
> > >>>>>>>>>
> > >>>>>>>>> Regards,
> > >>>>>>>>> Yi Liu
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>> -----Original Message-----
> > >>>>>>>>> From: Liu, Yi A
> > >>>>>>>>> Sent: Friday, September 25, 2015 1:51 PM
> > >>>>>>>>> To: hdfs-dev@hadoop.apache.org
> > >>>>>>>>> Subject: RE: [VOTE] Merge HDFS-7285 (erasure coding) 
> > >>>>>>>>> branch to
> > >>>> trunk
> > >>>>>>>>>
> > >>>>>>>>> +1 (non-binding)
> > >>>>>>>>> I have been involved in the development and code review on 
> > >>>>>>>>> the
> > >>>>>>> feature
> > >>>>>>>>> branch. It's a great feature and I think it's ready to 
> > >>>>>>>>> merge it
> > >>>> into
> > >>>>>>>> trunk.
> > >>>>>>>>>
> > >>>>>>>>> Thanks all for the contribution.
> > >>>>>>>>>
> > >>>>>>>>> Regards,
> > >>>>>>>>> Yi Liu
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>> -----Original Message-----
> > >>>>>>>>> From: Vinayakumar B [mailto:vinayakumarb@apache.org]
> > >>>>>>>>> Sent: Friday, September 25, 2015 12:21 PM
> > >>>>>>>>> To: hdfs-dev@hadoop.apache.org
> > >>>>>>>>> Subject: Re: [VOTE] Merge HDFS-7285 (erasure coding) 
> > >>>>>>>>> branch to
> > >>>> trunk
> > >>>>>>>>>
> > >>>>>>>>> +1,
> > >>>>>>>>>
> > >>>>>>>>> I've been involved starting from design and development of
> > >>>>>>> ErasureCoding.
> > >>>>>>>>> I think phase 1 of this development is ready to be merged 
> > >>>>>>>>> to
> > >>>> trunk.
> > >>>>>>>>> It had come a long way to the current state with 
> > >>>>>>>>> significant
> > >>>> effort
> > >>>>>>> of
> > >>>>>>>>> many Contributors and Reviewers for both design and code.
> > >>>>>>>>>
> > >>>>>>>>> Thanks Everyone for the efforts.
> > >>>>>>>>>
> > >>>>>>>>> Regards,
> > >>>>>>>>> Vinay
> > >>>>>>>>>
> > >>>>>>>>> On Wed, Sep 23, 2015 at 10:53 PM, Jing Zhao 
> > >>>>>>>>> <ji...@apache.org>
> > >>>>>>> wrote:
> > >>>>>>>>>
> > >>>>>>>>>> +1
> > >>>>>>>>>>
> > >>>>>>>>>> I've been involved in both development and review on the
> > >>> branch,
> > >>>>>>> and
> > >>>>>>> I
> > >>>>>>>>>> believe it's now ready to get merged into trunk. Many 
> > >>>>>>>>>> thanks
> > >>> to
> > >>>>>>> all
> > >>>>>>>>>> the contributors and reviewers!
> > >>>>>>>>>>
> > >>>>>>>>>> Thanks,
> > >>>>>>>>>> -Jing
> > >>>>>>>>>>
> > >>>>>>>>>> On Tue, Sep 22, 2015 at 6:17 PM, Zheng, Kai <
> > >>>> kai.zheng@intel.com>
> > >>>>>>>> wrote:
> > >>>>>>>>>>
> > >>>>>>>>>>> Non-binding +1
> > >>>>>>>>>>>
> > >>>>>>>>>>> According to our extensive performance tests, striping +
> > >>> ISA-L
> > >>>>>>> coder
> > >>>>>>>>>> based
> > >>>>>>>>>>> erasure coding not only can save storage, but also can
> > >>>> increase
> > >>>>>>> the
> > >>>>>>>>>>> throughput of a client or a cluster. It will be a great
> > >>>>>>> addition to
> > >>>>>>>>>>> HDFS and its users. Based on the latest branch codes, we
> > >>> also
> > >>>>>>>>>>> observed it's
> > >>>>>>>>>> very
> > >>>>>>>>>>> reliable in the concurrent tests. We'll provide the perf
> > >>> test
> > >>>>>>> report
> > >>>>>>>>>> after
> > >>>>>>>>>>> it's sorted out and hope it helps.
> > >>>>>>>>>>> Thanks!
> > >>>>>>>>>>>
> > >>>>>>>>>>> Regards,
> > >>>>>>>>>>> Kai
> > >>>>>>>>>>>
> > >>>>>>>>>>> -----Original Message-----
> > >>>>>>>>>>> From: Gangumalla, Uma [mailto:uma.gangumalla@intel.com]
> > >>>>>>>>>>> Sent: Wednesday, September 23, 2015 8:50 AM
> > >>>>>>>>>>> To: hdfs-dev@hadoop.apache.org;
> > >>> common-dev@hadoop.apache.org
> > >>>>>>>>>>> Subject: Re: [VOTE] Merge HDFS-7285 (erasure coding) 
> > >>>>>>>>>>> branch
> > >>> to
> > >>>>>>> trunk
> > >>>>>>>>>>>
> > >>>>>>>>>>> +1
> > >>>>>>>>>>>
> > >>>>>>>>>>> Great addition to HDFS. Thanks all contributors for the 
> > >>>>>>>>>>> nice
> > >>>>>>> work.
> > >>>>>>>>>>>
> > >>>>>>>>>>> Regards,
> > >>>>>>>>>>> Uma
> > >>>>>>>>>>>
> > >>>>>>>>>>> On 9/22/15, 3:40 PM, "Zhe Zhang" <zh...@cloudera.com>
> > >>>> wrote:
> > >>>>>>>>>>>
> > >>>>>>>>>>>> Hi,
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> I'd like to propose a vote to merge the HDFS-7285 
> > >>>>>>>>>>>> feature
> > >>>>>>> branch
> > >>>>>>>>>>>> back to trunk. Since November 2014 we have been 
> > >>>>>>>>>>>> designing
> > >>> and
> > >>>>>>>>>>>> developing this feature under the umbrella JIRAs 
> > >>>>>>>>>>>> HDFS-7285
> > >>>> and
> > >>>>>>>>>>>> HADOOP-11264, and have committed approximately 210 patches.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> The HDFS-7285 feature branch was created to support the
> > >>> first
> > >>>>>>> phase
> > >>>>>>>>>>>> of HDFS erasure coding (HDFS-EC). The objective of 
> > >>>>>>>>>>>> HDFS-EC
> > >>> is
> > >>>>>>> to
> > >>>>>>>>>>>> significantly reduce storage space usage in HDFS clusters.
> > >>>>>>> Instead
> > >>>>>>>>>>>> of always creating 3 replicas of each block with 200%
> > >>> storage
> > >>>>>>> space
> > >>>>>>>>>>>> overhead, HDFS-EC provides data durability through 
> > >>>>>>>>>>>> parity
> > >>>> data
> > >>>>>>>> blocks.
> > >>>>>>>>>>>> With most EC configurations, the storage overhead is no
> > >>> more
> > >>>>>>> than
> > >>>>>>>> 50%.
> > >>>>>>>>>>>> Based on profiling results of production clusters, we
> > >>> decided
> > >>>>>>> to
> > >>>>>>>>>>>> support EC with the striped block layout in the first
> > >>> phase,
> > >>>> so
> > >>>>>>>>>>>> that small files can be better handled. This means 
> > >>>>>>>>>>>> dividing
> > >>>>>>> each
> > >>>>>>>>>>>> logical HDFS file block into smaller units (striping 
> > >>>>>>>>>>>> cells)
> > >>>> and
> > >>>>>>>>>>>> spreading them on a set of DataNodes in round-robin
> > >>> fashion.
> > >>>>>>> Parity
> > >>>>>>>>>>>> cells are generated for each stripe of original data cells.
> > >>>> We
> > >>>>>>> have
> > >>>>>>>>>>>> made changes to NameNode, client, and DataNode to
> > >>> generalize
> > >>>>>>> the
> > >>>>>>>>>>>> block concept and handle the mapping between a logical 
> > >>>>>>>>>>>> file
> > >>>>>>> block
> > >>>>>>>>>>>> and its internal storage blocks. For further details 
> > >>>>>>>>>>>> please
> > >>>> see
> > >>>>>>> the
> > >>>>>>>>>>>> design doc on HDFS-7285.
> > >>>>>>>>>>>> HADOOP-11264 focuses on providing flexible and
> > >>>> high-performance
> > >>>>>>>>>>>> codec calculation support.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> The nightly Jenkins job of the branch has reported 
> > >>>>>>>>>>>> several successful runs, and doesn't show new flaky 
> > >>>>>>>>>>>> tests compared
> > >>>> with
> > >>>>>>>>>>>> trunk. We have posted several versions of the test plan
> > >>>>>>> including
> > >>>>>>>>>>>> both unit testing and cluster testing, and have 
> > >>>>>>>>>>>> executed
> > >>> most
> > >>>>>>> tests
> > >>>>>>>>>>>> in the plan. The most basic functionalities have been
> > >>>>>>> extensively
> > >>>>>>>>>>>> tested and verified in several real clusters with 
> > >>>>>>>>>>>> different hardware configurations; results have been 
> > >>>>>>>>>>>> very stable. We
> > >>>> have
> > >>>>>>>>>>>> created follow-on tasks for more advanced error 
> > >>>>>>>>>>>> handling
> > >>> and
> > >>>>>>>>> optimization under the umbrella HDFS-8031.
> > >>>>>>>>>>>> We also plan to implement or harden the integration of 
> > >>>>>>>>>>>> EC
> > >>>> with
> > >>>>>>>>>>>> existing features such as WebHDFS, snapshot, append,
> > >>>> truncate,
> > >>>>>>>>>>>> hflush, hsync, and so forth.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Development of this feature has been a collaboration 
> > >>>>>>>>>>>> across
> > >>>>>>> many
> > >>>>>>>>>>>> companies and institutions. I'd like to thank J. 
> > >>>>>>>>>>>> Andreina,
> > >>>>>>> Takanobu
> > >>>>>>>>>>>> Asanuma, Vinayakumar B, Li Bo, Takuya Fukudome, Uma
> > >>> Maheswara
> > >>>>>>> Rao
> > >>>>>>>>>>>> G, Rui Li, Yi Liu, Colin McCabe, Xinwei Qin, Rakesh R, 
> > >>>>>>>>>>>> Gao
> > >>>> Rui,
> > >>>>>>> Kai
> > >>>>>>>>>>>> Sasaki, Walter Su, Tsz Wo Nicholas Sze, Andrew Wang, 
> > >>>>>>>>>>>> Yong
> > >>>>>>> Zhang,
> > >>>>>>>>>>>> Jing Zhao, Hui Zheng and Kai Zheng for their code
> > >>>> contributions
> > >>>>>>> and
> > >>>>>>>>> reviews.
> > >>>>>>>>>>>> Andrew and Kai Zheng also made fundamental 
> > >>>>>>>>>>>> contributions to
> > >>>> the
> > >>>>>>>>>>>> initial design. Rui Li, Gao Rui, Kai Sasaki, Kai Zheng 
> > >>>>>>>>>>>> and
> > >>>> many
> > >>>>>>>>>>>> other contributors have made great efforts in system
> > >>> testing.
> > >>>>>>> Many
> > >>>>>>>>>>>> thanks go to Weihua Jiang for proposing the JIRA, and 
> > >>>>>>>>>>>> ATM,
> > >>>> Todd
> > >>>>>>>>>>>> Lipcon, Silvius Rus, Suresh, as well as many others for
> > >>>>>>> providing
> > >>>>>>>>> helpful feedbacks.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Following the community convention, this vote will last
> > >>> for 7
> > >>>>>>> days
> > >>>>>>>>>>>> (ending September 29th). Votes from Hadoop committers 
> > >>>>>>>>>>>> are
> > >>>>>>> binding
> > >>>>>>>>>>>> but non-binding votes are very welcome as well. And 
> > >>>>>>>>>>>> here's
> > >>> my
> > >>>>>>>>>>>> non-binding
> > >>>>>>>>>> +1.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Thanks,
> > >>>>>>>>>>>> ---
> > >>>>>>>>>>>> Zhe Zhang
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>
> > >>>>>>>
> > >>>>>
> > >>>>>
> > >>>>
> > >>>
> > >
> >
> >
>

Re: Erasure coding in branch-2 [Was Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk]

Posted by Jing Zhao <ji...@apache.org>.
+1 for the plan about Phase I & II.

BTW, this may be out of the scope of this thread, but I just want to mention
that we should either move each JIRA under HDFS-8031 or set the JIRA
component to "erasure-coding" when making further improvements or fixing
bugs in EC. That way it will be easier to backport EC to 2.9 later.

On Mon, Nov 2, 2015 at 3:48 PM, Vinayakumar B <vinayakumarb.apache@gmail.com> wrote:

> +1 for the idea.
> On Nov 3, 2015 07:22, "Zheng, Kai" <ka...@intel.com> wrote:
>
> > Sounds good to me. When it's determined to include EC in the 2.9 release,
> > it may be good to have a rough release date as Zhe asked, so the scope of
> > EC can be discussed accordingly. We still have quite a few Phase I
> > follow-on tasks to do before EC can be deployed in a production system.
> > Phase II, to develop non-striping EC for cold data, would possibly be
> > started after that. We might consider including only Phase I and leaving
> > Phase II for the next release, according to the rough release date.
> >
> > Regards,
> > Kai
> >
> > -----Original Message-----
> > From: Gangumalla, Uma [mailto:uma.gangumalla@intel.com]
> > Sent: Tuesday, November 03, 2015 5:41 AM
> > To: hdfs-dev@hadoop.apache.org
> > Subject: Re: Erasure coding in branch-2 [Was Re: [VOTE] Merge HDFS-7285
> > (erasure coding) branch to trunk]
> >
> > +1 for EC to go into 2.9. Yes, 3.x would be a long way to go when we
> > plan to have 2.8 and 2.9 releases.
> >
> > Regards,
> > Uma
> >
> > On 11/2/15, 11:49 AM, "Vinod Vavilapalli" <vi...@hortonworks.com> wrote:
> >
> > >Forking the thread. Started looking at the 2.8 list, various features'
> > >status and arrived here.
> > >
> > >While I understand the pervasive nature of EC and a need for a
> > >significant bake-in, moving this to a 3.x release is not a good idea.
> > >We will surely get a 2.8 out this year and, as needed, I can even spend
> > >time getting started on a 2.9. OTOH, 3.x is a long way off, and given
> > >all the incompatibilities there, it would be a while before users can
> > >get their hands on EC if it were to be only on 3.x. At best, this may
> > >force sites that want EC to backport the entire EC feature to older
> > >releases; at worst, this will repeat the mess of the 0.20 security
> > >release forks.
> > >
> > >If we think adding this to 2.8 (even if it is switched off) is too much
> > >risk per our original plan, let's move this to 2.9, thereby leaving
> > >enough time for stability, integration testing and bake-in, and a
> > >realistic chance of having it end up on users' clusters soonish.
> > >
> > >+Vinod
> > >
> > >> On Oct 19, 2015, at 1:44 PM, Andrew Wang <an...@cloudera.com>
> > >>wrote:
> > >>
> > >> I think our plan thus far has been to target this for 3.0. I'm okay
> > >> with putting it in branch-2 if we've given a hard look at
> > >> compatibility, but I'll note though that 2.8 is already looking like
> > >> quite a large release, and our release bandwidth has been focused on
> > >> the 2.6 and 2.7 maintenance releases. Adding another multi-hundred
> > >> JIRAs to 2.8 might make it too unwieldy to get out the door. If we
> > >> bump EC past that, 3.0 might very well be our next release vehicle. I
> > >> do plan to revive the 3.0 schedule some time next year. With EC and
> > >> JDK8 in a good spot, the only big feature remaining is classpath
> > >> isolation.
> > >>
> > >> EC is also a pretty fundamental change to HDFS. Even if it's
> > >> compatible, in terms of size and impact it might best belong in a new
> > >> major release.
> > >>
> > >> Best,
> > >> Andrew
> > >>
> > >> On Fri, Oct 16, 2015 at 7:04 PM, Vinayakumar B <
> > >> vinayakumarb.apache@gmail.com> wrote:
> > >>
> > >>> Does anyone else also think the feature is ready to go to branch-2
> > >>> as well?
> > >>>
> > >>> It's been > 2 weeks since EC landed on trunk. IMO it looks quite
> > >>> stable since then and ready to go in branch-2.
> > >>>
> > >>> -Vinay
> > >>> On Oct 6, 2015 12:51 AM, "Zhe Zhang" <zh...@cloudera.com> wrote:
> > >>>
> > >>>> Thanks Vinay for capturing the issue and Uma for offering the help.
> > >>>>
> > >>>> ---
> > >>>> Zhe Zhang
> > >>>>
> > >>>> On Mon, Oct 5, 2015 at 12:19 PM, Gangumalla, Uma <
> > >>> uma.gangumalla@intel.com
> > >>>>>
> > >>>> wrote:
> > >>>>
> > >>>>> Vinay,
> > >>>>>
> > >>>>>
> > >>>>> I would merge them as part of HDFS-9182.
> > >>>>>
> > >>>>> Thanks,
> > >>>>> Uma
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>>> On 10/5/15, 12:48 AM, "Vinayakumar B" <vi...@apache.org>
> > >>>>>wrote:
> > >>>>>
> > >>>>>> Hi Andrew,
> > >>>>>> I see CHANGES.txt entries not yet merged from
> > >>> CHANGES-HDFS-EC-7285.txt.
> > >>>>>>
> > >>>>>> Was this intentional?
> > >>>>>>
> > >>>>>> Regards,
> > >>>>>> Vinay
> > >>>>>>
> > >>>>>> On Wed, Sep 30, 2015 at 9:15 PM, Andrew Wang <
> > >>> andrew.wang@cloudera.com>
> > >>>>>> wrote:
> > >>>>>>
> > >>>>>>> Branch has been merged to trunk, thanks again to everyone who
> > >>>>>>>worked
> > >>>> on
> > >>>>>>> the
> > >>>>>>> feature!
> > >>>>>>>
> > >>>>>>> On Tue, Sep 29, 2015 at 10:44 PM, Zhe Zhang
> > >>>>>>> <zh...@cloudera.com>
> > >>>>>>> wrote:
> > >>>>>>>
> > >>>>>>>> Thanks everyone who has participated in this discussion.
> > >>>>>>>>
> > >>>>>>>> With 7 +1's (5 binding and 2 non-binding), and no -1, this vote
> > >>> has
> > >>>>>>> passed.
> > >>>>>>>> I will do a final 'git merge' with trunk and work with Andrew
> > >>>>>>>> to
> > >>>> merge
> > >>>>>>> the
> > >>>>>>>> branch to trunk. I'll update on this thread when the merge is
> > >>> done.
> > >>>>>>>>
> > >>>>>>>> ---
> > >>>>>>>> Zhe Zhang
> > >>>>>>>>
> > >>>>>>>> On Thu, Sep 24, 2015 at 11:08 PM, Liu, Yi A
> > >>>>>>>> <yi...@intel.com>
> > >>>>>>> wrote:
> > >>>>>>>>
> > >>>>>>>>> (Change it to binding.)
> > >>>>>>>>>
> > >>>>>>>>> +1
> > >>>>>>>>> I have been involved in the development and code review on the
> > >>>>>>> feature
> > >>>>>>>>> branch. It's a great feature and I think it's ready to merge
> > >>>>>>>>> it
> > >>>> into
> > >>>>>>>> trunk.
> > >>>>>>>>>
> > >>>>>>>>> Thanks all for the contribution.
> > >>>>>>>>>
> > >>>>>>>>> Regards,
> > >>>>>>>>> Yi Liu
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>> -----Original Message-----
> > >>>>>>>>> From: Liu, Yi A
> > >>>>>>>>> Sent: Friday, September 25, 2015 1:51 PM
> > >>>>>>>>> To: hdfs-dev@hadoop.apache.org
> > >>>>>>>>> Subject: RE: [VOTE] Merge HDFS-7285 (erasure coding) branch to
> > >>>> trunk
> > >>>>>>>>>
> > >>>>>>>>> +1 (non-binding)
> > >>>>>>>>> I have been involved in the development and code review on the
> > >>>>>>> feature
> > >>>>>>>>> branch. It's a great feature and I think it's ready to merge
> > >>>>>>>>> it
> > >>>> into
> > >>>>>>>> trunk.
> > >>>>>>>>>
> > >>>>>>>>> Thanks all for the contribution.
> > >>>>>>>>>
> > >>>>>>>>> Regards,
> > >>>>>>>>> Yi Liu
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>> -----Original Message-----
> > >>>>>>>>> From: Vinayakumar B [mailto:vinayakumarb@apache.org]
> > >>>>>>>>> Sent: Friday, September 25, 2015 12:21 PM
> > >>>>>>>>> To: hdfs-dev@hadoop.apache.org
> > >>>>>>>>> Subject: Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to
> > >>>> trunk
> > >>>>>>>>>
> > >>>>>>>>> +1,
> > >>>>>>>>>
> > >>>>>>>>> I've been involved starting from design and development of
> > >>>>>>> ErasureCoding.
> > >>>>>>>>> I think phase 1 of this development is ready to be merged to
> > >>>> trunk.
> > >>>>>>>>> It had come a long way to the current state with significant
> > >>>> effort
> > >>>>>>> of
> > >>>>>>>>> many Contributors and Reviewers for both design and code.
> > >>>>>>>>>
> > >>>>>>>>> Thanks Everyone for the efforts.
> > >>>>>>>>>
> > >>>>>>>>> Regards,
> > >>>>>>>>> Vinay
> > >>>>>>>>>
> > >>>>>>>>> On Wed, Sep 23, 2015 at 10:53 PM, Jing Zhao <ji...@apache.org>
> > >>>>>>> wrote:
> > >>>>>>>>>
> > >>>>>>>>>> +1
> > >>>>>>>>>>
> > >>>>>>>>>> I've been involved in both development and review on the
> > >>> branch,
> > >>>>>>> and
> > >>>>>>> I
> > >>>>>>>>>> believe it's now ready to get merged into trunk. Many thanks
> > >>> to
> > >>>>>>> all
> > >>>>>>>>>> the contributors and reviewers!
> > >>>>>>>>>>
> > >>>>>>>>>> Thanks,
> > >>>>>>>>>> -Jing
> > >>>>>>>>>>
> > >>>>>>>>>> On Tue, Sep 22, 2015 at 6:17 PM, Zheng, Kai <
> > >>>> kai.zheng@intel.com>
> > >>>>>>>> wrote:
> > >>>>>>>>>>
> > >>>>>>>>>>> Non-binding +1
> > >>>>>>>>>>>
> > >>>>>>>>>>> According to our extensive performance tests, striping +
> > >>> ISA-L
> > >>>>>>> coder
> > >>>>>>>>>> based
> > >>>>>>>>>>> erasure coding not only can save storage, but also can
> > >>>> increase
> > >>>>>>> the
> > >>>>>>>>>>> throughput of a client or a cluster. It will be a great
> > >>>>>>> addition to
> > >>>>>>>>>>> HDFS and its users. Based on the latest branch codes, we
> > >>> also
> > >>>>>>>>>>> observed it's
> > >>>>>>>>>> very
> > >>>>>>>>>>> reliable in the concurrent tests. We'll provide the perf
> > >>> test
> > >>>>>>> report
> > >>>>>>>>>> after
> > >>>>>>>>>>> it's sorted out and hope it helps.
> > >>>>>>>>>>> Thanks!
> > >>>>>>>>>>>
> > >>>>>>>>>>> Regards,
> > >>>>>>>>>>> Kai
> > >>>>>>>>>>>
> > >>>>>>>>>>> -----Original Message-----
> > >>>>>>>>>>> From: Gangumalla, Uma [mailto:uma.gangumalla@intel.com]
> > >>>>>>>>>>> Sent: Wednesday, September 23, 2015 8:50 AM
> > >>>>>>>>>>> To: hdfs-dev@hadoop.apache.org;
> > >>> common-dev@hadoop.apache.org
> > >>>>>>>>>>> Subject: Re: [VOTE] Merge HDFS-7285 (erasure coding) branch
> > >>> to
> > >>>>>>> trunk
> > >>>>>>>>>>>
> > >>>>>>>>>>> +1
> > >>>>>>>>>>>
> > >>>>>>>>>>> Great addition to HDFS. Thanks all contributors for the nice
> > >>>>>>> work.
> > >>>>>>>>>>>
> > >>>>>>>>>>> Regards,
> > >>>>>>>>>>> Uma
> > >>>>>>>>>>>
> > >>>>>>>>>>> On 9/22/15, 3:40 PM, "Zhe Zhang" <zh...@cloudera.com>
> > >>>> wrote:
> > >>>>>>>>>>>
> > >>>>>>>>>>>> Hi,
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> I'd like to propose a vote to merge the HDFS-7285 feature
> > >>>>>>> branch
> > >>>>>>>>>>>> back to trunk. Since November 2014 we have been designing
> > >>> and
> > >>>>>>>>>>>> developing this feature under the umbrella JIRAs HDFS-7285
> > >>>> and
> > >>>>>>>>>>>> HADOOP-11264, and have committed approximately 210 patches.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> The HDFS-7285 feature branch was created to support the
> > >>> first
> > >>>>>>> phase
> > >>>>>>>>>>>> of HDFS erasure coding (HDFS-EC). The objective of HDFS-EC
> > >>> is
> > >>>>>>> to
> > >>>>>>>>>>>> significantly reduce storage space usage in HDFS clusters.
> > >>>>>>> Instead
> > >>>>>>>>>>>> of always creating 3 replicas of each block with 200%
> > >>> storage
> > >>>>>>> space
> > >>>>>>>>>>>> overhead, HDFS-EC provides data durability through parity
> > >>>> data
> > >>>>>>>> blocks.
> > >>>>>>>>>>>> With most EC configurations, the storage overhead is no
> > >>> more
> > >>>>>>> than
> > >>>>>>>> 50%.
> > >>>>>>>>>>>> Based on profiling results of production clusters, we
> > >>> decided
> > >>>>>>> to
> > >>>>>>>>>>>> support EC with the striped block layout in the first
> > >>> phase,
> > >>>> so
> > >>>>>>>>>>>> that small files can be better handled. This means dividing
> > >>>>>>> each
> > >>>>>>>>>>>> logical HDFS file block into smaller units (striping cells)
> > >>>> and
> > >>>>>>>>>>>> spreading them on a set of DataNodes in round-robin
> > >>> fashion.
> > >>>>>>> Parity
> > >>>>>>>>>>>> cells are generated for each stripe of original data cells.
> > >>>> We
> > >>>>>>> have
> > >>>>>>>>>>>> made changes to NameNode, client, and DataNode to
> > >>> generalize
> > >>>>>>> the
> > >>>>>>>>>>>> block concept and handle the mapping between a logical file
> > >>>>>>> block
> > >>>>>>>>>>>> and its internal storage blocks. For further details please
> > >>>> see
> > >>>>>>> the
> > >>>>>>>>>>>> design doc on HDFS-7285.
> > >>>>>>>>>>>> HADOOP-11264 focuses on providing flexible and
> > >>>> high-performance
> > >>>>>>>>>>>> codec calculation support.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> The nightly Jenkins job of the branch has reported several
> > >>>>>>>>>>>> successful runs, and doesn't show new flaky tests compared
> > >>>> with
> > >>>>>>>>>>>> trunk. We have posted several versions of the test plan
> > >>>>>>> including
> > >>>>>>>>>>>> both unit testing and cluster testing, and have executed
> > >>> most
> > >>>>>>> tests
> > >>>>>>>>>>>> in the plan. The most basic functionalities have been
> > >>>>>>> extensively
> > >>>>>>>>>>>> tested and verified in several real clusters with different
> > >>>>>>>>>>>> hardware configurations; results have been very stable. We
> > >>>> have
> > >>>>>>>>>>>> created follow-on tasks for more advanced error handling
> > >>> and
> > >>>>>>>>> optimization under the umbrella HDFS-8031.
> > >>>>>>>>>>>> We also plan to implement or harden the integration of EC
> > >>>> with
> > >>>>>>>>>>>> existing features such as WebHDFS, snapshot, append,
> > >>>> truncate,
> > >>>>>>>>>>>> hflush, hsync, and so forth.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Development of this feature has been a collaboration across
> > >>>>>>> many
> > >>>>>>>>>>>> companies and institutions. I'd like to thank J. Andreina,
> > >>>>>>> Takanobu
> > >>>>>>>>>>>> Asanuma, Vinayakumar B, Li Bo, Takuya Fukudome, Uma
> > >>> Maheswara
> > >>>>>>> Rao
> > >>>>>>>>>>>> G, Rui Li, Yi Liu, Colin McCabe, Xinwei Qin, Rakesh R, Gao
> > >>>> Rui,
> > >>>>>>> Kai
> > >>>>>>>>>>>> Sasaki, Walter Su, Tsz Wo Nicholas Sze, Andrew Wang, Yong
> > >>>>>>> Zhang,
> > >>>>>>>>>>>> Jing Zhao, Hui Zheng and Kai Zheng for their code
> > >>>> contributions
> > >>>>>>> and
> > >>>>>>>>> reviews.
> > >>>>>>>>>>>> Andrew and Kai Zheng also made fundamental contributions to
> > >>>> the
> > >>>>>>>>>>>> initial design. Rui Li, Gao Rui, Kai Sasaki, Kai Zheng and
> > >>>> many
> > >>>>>>>>>>>> other contributors have made great efforts in system
> > >>> testing.
> > >>>>>>> Many
> > >>>>>>>>>>>> thanks go to Weihua Jiang for proposing the JIRA, and ATM,
> > >>>> Todd
> > >>>>>>>>>>>> Lipcon, Silvius Rus, Suresh, as well as many others for
> > >>>>>>> providing
> > >>>>>>>>> helpful feedbacks.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Following the community convention, this vote will last
> > >>> for 7
> > >>>>>>> days
> > >>>>>>>>>>>> (ending September 29th). Votes from Hadoop committers are
> > >>>>>>> binding
> > >>>>>>>>>>>> but non-binding votes are very welcome as well. And here's
> > >>> my
> > >>>>>>>>>>>> non-binding
> > >>>>>>>>>> +1.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Thanks,
> > >>>>>>>>>>>> ---
> > >>>>>>>>>>>> Zhe Zhang
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>
> > >>>>>>>
> > >>>>>
> > >>>>>
> > >>>>
> > >>>
> > >
> >
> >
>

RE: Erasure coding in branch-2 [Was Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk]

Posted by Vinayakumar B <vi...@gmail.com>.
+1 for the idea.
On Nov 3, 2015 07:22, "Zheng, Kai" <ka...@intel.com> wrote:

> Sounds good to me. Once it's decided to include EC in the 2.9 release, it
> may be good to have a rough release date, as Zhe asked, so that the scope
> of EC can be worked out accordingly. We still have quite a few Phase I
> follow-on tasks to do before EC can be deployed in a production system.
> Phase II, developing non-striping EC for cold data, would possibly start
> after that. We might consider including only Phase I and leaving Phase II
> for the next release, depending on the rough release date.
>
> Regards,
> Kai
>
> -----Original Message-----
> From: Gangumalla, Uma [mailto:uma.gangumalla@intel.com]
> Sent: Tuesday, November 03, 2015 5:41 AM
> To: hdfs-dev@hadoop.apache.org
> Subject: Re: Erasure coding in branch-2 [Was Re: [VOTE] Merge HDFS-7285
> (erasure coding) branch to trunk]
>
> +1 for EC to go into 2.9. Yes, 3.x would be a long way to go when we plan to
> have 2.8 and 2.9 releases.
>
> Regards,
> Uma
>
> On 11/2/15, 11:49 AM, "Vinod Vavilapalli" <vi...@hortonworks.com> wrote:
>
> >Forking the thread. Started looking at the 2.8 list, various features'
> >status and arrived here.
> >
> >While I understand the pervasive nature of EC and a need for a
> >significant bake-in, moving this to a 3.x release is not a good idea.
> >We will surely get a 2.8 out this year and, as needed, I can even spend
> >time getting started on a 2.9. OTOH, 3.x is long ways off, and given
> >all the incompatibilities there, it would be a while before users can
> >get their hands on EC if it were to be only on 3.x. At best, this may
> >force sites that want EC to backport the entire EC feature to older
> >releases, at worst this will repeat the mess of 0.20 security release
> >forks.
> >
> >If we think adding this to 2.8 (even if it's switched off) is too much
> >risk per our original plan, let's move this to 2.9, thereby leaving
> >enough time for stability, integration testing and bake-in, and a
> >realistic chance of having it end up on users' clusters soonish.
> >
> >+Vinod
> >
> >> On Oct 19, 2015, at 1:44 PM, Andrew Wang <an...@cloudera.com>
> >>wrote:
> >>
> >> I think our plan thus far has been to target this for 3.0. I'm okay
> >>with  putting it in branch-2 if we've given a hard look at
> >>compatibility, but  I'll note though that 2.8 is already looking like
> >>quite a large release,  and our release bandwidth has been focused on
> >>the 2.6 and 2.7 maintenance  releases. Adding another multi-hundred
> >>JIRAs to 2.8 might make it too  unwieldy to get out the door. If we
> >>bump EC past that, 3.0 might very well  be our next release vehicle. I
> >>do plan to revive the 3.0 schedule some time  next year. With EC and
> >>JDK8 in a good spot, the only big feature remaining  is classpath
> >>isolation.
> >>
> >> EC is also a pretty fundamental change to HDFS. Even if it's
> >>compatible, in  terms of size and impact it might best belong in a new
> >>major release.
> >>
> >> Best,
> >> Andrew
> >>
> >> On Fri, Oct 16, 2015 at 7:04 PM, Vinayakumar B <
> >> vinayakumarb.apache@gmail.com> wrote:
> >>
> >>> Does anyone else also think that the feature is ready to go to branch-2
> >>>as well?
> >>>
> >>> It's been > 2 weeks since EC landed on trunk. IMO it looks quite stable since
> >>>then and ready to go in branch-2.
> >>>
> >>> -Vinay
> >>> On Oct 6, 2015 12:51 AM, "Zhe Zhang" <zh...@cloudera.com> wrote:
> >>>
> >>>> Thanks Vinay for capturing the issue and Uma for offering the help.
> >>>>
> >>>> ---
> >>>> Zhe Zhang
> >>>>
> >>>> On Mon, Oct 5, 2015 at 12:19 PM, Gangumalla, Uma <
> >>> uma.gangumalla@intel.com
> >>>>>
> >>>> wrote:
> >>>>
> >>>>> Vinay,
> >>>>>
> >>>>>
> >>>>> I would merge them as part of HDFS-9182.
> >>>>>
> >>>>> Thanks,
> >>>>> Uma
> >>>>>
> >>>>>
> >>>>>
> >>>>> On 10/5/15, 12:48 AM, "Vinayakumar B" <vi...@apache.org>
> >>>>>wrote:
> >>>>>
> >>>>>> Hi Andrew,
> >>>>>> I see CHANGES.txt entries not yet merged from
> >>> CHANGES-HDFS-EC-7285.txt.
> >>>>>>
> >>>>>> Was this intentional?
> >>>>>>
> >>>>>> Regards,
> >>>>>> Vinay
> >>>>>>
> >>>>>> On Wed, Sep 30, 2015 at 9:15 PM, Andrew Wang <
> >>> andrew.wang@cloudera.com>
> >>>>>> wrote:
> >>>>>>
> >>>>>>> Branch has been merged to trunk, thanks again to everyone who
> >>>>>>>worked
> >>>> on
> >>>>>>> the
> >>>>>>> feature!
> >>>>>>>
> >>>>>>> On Tue, Sep 29, 2015 at 10:44 PM, Zhe Zhang
> >>>>>>> <zh...@cloudera.com>
> >>>>>>> wrote:
> >>>>>>>
> >>>>>>>> Thanks everyone who has participated in this discussion.
> >>>>>>>>
> >>>>>>>> With 7 +1's (5 binding and 2 non-binding), and no -1, this vote
> >>> has
> >>>>>>> passed.
> >>>>>>>> I will do a final 'git merge' with trunk and work with Andrew
> >>>>>>>> to
> >>>> merge
> >>>>>>> the
> >>>>>>>> branch to trunk. I'll update on this thread when the merge is
> >>> done.
> >>>>>>>>
> >>>>>>>> ---
> >>>>>>>> Zhe Zhang
> >>>>>>>>
> >>>>>>>> On Thu, Sep 24, 2015 at 11:08 PM, Liu, Yi A
> >>>>>>>> <yi...@intel.com>
> >>>>>>> wrote:
> >>>>>>>>
> >>>>>>>>> (Change it to binding.)
> >>>>>>>>>
> >>>>>>>>> +1
> >>>>>>>>> I have been involved in the development and code review on the
> >>>>>>> feature
> >>>>>>>>> branch. It's a great feature and I think it's ready to merge
> >>>>>>>>> it
> >>>> into
> >>>>>>>> trunk.
> >>>>>>>>>
> >>>>>>>>> Thanks all for the contribution.
> >>>>>>>>>
> >>>>>>>>> Regards,
> >>>>>>>>> Yi Liu
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> -----Original Message-----
> >>>>>>>>> From: Liu, Yi A
> >>>>>>>>> Sent: Friday, September 25, 2015 1:51 PM
> >>>>>>>>> To: hdfs-dev@hadoop.apache.org
> >>>>>>>>> Subject: RE: [VOTE] Merge HDFS-7285 (erasure coding) branch to
> >>>> trunk
> >>>>>>>>>
> >>>>>>>>> +1 (non-binding)
> >>>>>>>>> I have been involved in the development and code review on the
> >>>>>>> feature
> >>>>>>>>> branch. It's a great feature and I think it's ready to merge
> >>>>>>>>> it
> >>>> into
> >>>>>>>> trunk.
> >>>>>>>>>
> >>>>>>>>> Thanks all for the contribution.
> >>>>>>>>>
> >>>>>>>>> Regards,
> >>>>>>>>> Yi Liu
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> -----Original Message-----
> >>>>>>>>> From: Vinayakumar B [mailto:vinayakumarb@apache.org]
> >>>>>>>>> Sent: Friday, September 25, 2015 12:21 PM
> >>>>>>>>> To: hdfs-dev@hadoop.apache.org
> >>>>>>>>> Subject: Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to
> >>>> trunk
> >>>>>>>>>
> >>>>>>>>> +1,
> >>>>>>>>>
> >>>>>>>>> I've been involved starting from design and development of
> >>>>>>> ErasureCoding.
> >>>>>>>>> I think phase 1 of this development is ready to be merged to
> >>>> trunk.
> >>>>>>>>> It had come a long way to the current state with significant
> >>>> effort
> >>>>>>> of
> >>>>>>>>> many Contributors and Reviewers for both design and code.
> >>>>>>>>>
> >>>>>>>>> Thanks Everyone for the efforts.
> >>>>>>>>>
> >>>>>>>>> Regards,
> >>>>>>>>> Vinay
> >>>>>>>>>
> >>>>>>>>> On Wed, Sep 23, 2015 at 10:53 PM, Jing Zhao <ji...@apache.org>
> >>>>>>> wrote:
> >>>>>>>>>
> >>>>>>>>>> +1
> >>>>>>>>>>
> >>>>>>>>>> I've been involved in both development and review on the
> >>> branch,
> >>>>>>> and
> >>>>>>> I
> >>>>>>>>>> believe it's now ready to get merged into trunk. Many thanks
> >>> to
> >>>>>>> all
> >>>>>>>>>> the contributors and reviewers!
> >>>>>>>>>>
> >>>>>>>>>> Thanks,
> >>>>>>>>>> -Jing
> >>>>>>>>>>
> >>>>>>>>>> On Tue, Sep 22, 2015 at 6:17 PM, Zheng, Kai <
> >>>> kai.zheng@intel.com>
> >>>>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Non-binding +1
> >>>>>>>>>>>
> >>>>>>>>>>> According to our extensive performance tests, striping +
> >>> ISA-L
> >>>>>>> coder
> >>>>>>>>>> based
> >>>>>>>>>>> erasure coding not only can save storage, but also can
> >>>> increase
> >>>>>>> the
> >>>>>>>>>>> throughput of a client or a cluster. It will be a great
> >>>>>>> addition to
> >>>>>>>>>>> HDFS and its users. Based on the latest branch codes, we
> >>> also
> >>>>>>>>>>> observed it's
> >>>>>>>>>> very
> >>>>>>>>>>> reliable in the concurrent tests. We'll provide the perf
> >>> test
> >>>>>>> report
> >>>>>>>>>> after
> >>>>>>>>>>> it's sorted out and hope it helps.
> >>>>>>>>>>> Thanks!
> >>>>>>>>>>>
> >>>>>>>>>>> Regards,
> >>>>>>>>>>> Kai
> >>>>>>>>>>>
> >>>>>>>>>>> -----Original Message-----
> >>>>>>>>>>> From: Gangumalla, Uma [mailto:uma.gangumalla@intel.com]
> >>>>>>>>>>> Sent: Wednesday, September 23, 2015 8:50 AM
> >>>>>>>>>>> To: hdfs-dev@hadoop.apache.org;
> >>> common-dev@hadoop.apache.org
> >>>>>>>>>>> Subject: Re: [VOTE] Merge HDFS-7285 (erasure coding) branch
> >>> to
> >>>>>>> trunk
> >>>>>>>>>>>
> >>>>>>>>>>> +1
> >>>>>>>>>>>
> >>>>>>>>>>> Great addition to HDFS. Thanks all contributors for the nice
> >>>>>>> work.
> >>>>>>>>>>>
> >>>>>>>>>>> Regards,
> >>>>>>>>>>> Uma
> >>>>>>>>>>>
> >>>>>>>>>>> On 9/22/15, 3:40 PM, "Zhe Zhang" <zh...@cloudera.com>
> >>>> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> Hi,
> >>>>>>>>>>>>
> >>>>>>>>>>>> I'd like to propose a vote to merge the HDFS-7285 feature
> >>>>>>> branch
> >>>>>>>>>>>> back to trunk. Since November 2014 we have been designing
> >>> and
> >>>>>>>>>>>> developing this feature under the umbrella JIRAs HDFS-7285
> >>>> and
> >>>>>>>>>>>> HADOOP-11264, and have committed approximately 210 patches.
> >>>>>>>>>>>>
> >>>>>>>>>>>> The HDFS-7285 feature branch was created to support the
> >>> first
> >>>>>>> phase
> >>>>>>>>>>>> of HDFS erasure coding (HDFS-EC). The objective of HDFS-EC
> >>> is
> >>>>>>> to
> >>>>>>>>>>>> significantly reduce storage space usage in HDFS clusters.
> >>>>>>> Instead
> >>>>>>>>>>>> of always creating 3 replicas of each block with 200%
> >>> storage
> >>>>>>> space
> >>>>>>>>>>>> overhead, HDFS-EC provides data durability through parity
> >>>> data
> >>>>>>>> blocks.
> >>>>>>>>>>>> With most EC configurations, the storage overhead is no
> >>> more
> >>>>>>> than
> >>>>>>>> 50%.
> >>>>>>>>>>>> Based on profiling results of production clusters, we
> >>> decided
> >>>>>>> to
> >>>>>>>>>>>> support EC with the striped block layout in the first
> >>> phase,
> >>>> so
> >>>>>>>>>>>> that small files can be better handled. This means dividing
> >>>>>>> each
> >>>>>>>>>>>> logical HDFS file block into smaller units (striping cells)
> >>>> and
> >>>>>>>>>>>> spreading them on a set of DataNodes in round-robin
> >>> fashion.
> >>>>>>> Parity
> >>>>>>>>>>>> cells are generated for each stripe of original data cells.
> >>>> We
> >>>>>>> have
> >>>>>>>>>>>> made changes to NameNode, client, and DataNode to
> >>> generalize
> >>>>>>> the
> >>>>>>>>>>>> block concept and handle the mapping between a logical file
> >>>>>>> block
> >>>>>>>>>>>> and its internal storage blocks. For further details please
> >>>> see
> >>>>>>> the
> >>>>>>>>>>>> design doc on HDFS-7285.
> >>>>>>>>>>>> HADOOP-11264 focuses on providing flexible and
> >>>> high-performance
> >>>>>>>>>>>> codec calculation support.
> >>>>>>>>>>>>
> >>>>>>>>>>>> The nightly Jenkins job of the branch has reported several
> >>>>>>>>>>>> successful runs, and doesn't show new flaky tests compared
> >>>> with
> >>>>>>>>>>>> trunk. We have posted several versions of the test plan
> >>>>>>> including
> >>>>>>>>>>>> both unit testing and cluster testing, and have executed
> >>> most
> >>>>>>> tests
> >>>>>>>>>>>> in the plan. The most basic functionalities have been
> >>>>>>> extensively
> >>>>>>>>>>>> tested and verified in several real clusters with different
> >>>>>>>>>>>> hardware configurations; results have been very stable. We
> >>>> have
> >>>>>>>>>>>> created follow-on tasks for more advanced error handling
> >>> and
> >>>>>>>>> optimization under the umbrella HDFS-8031.
> >>>>>>>>>>>> We also plan to implement or harden the integration of EC
> >>>> with
> >>>>>>>>>>>> existing features such as WebHDFS, snapshot, append,
> >>>> truncate,
> >>>>>>>>>>>> hflush, hsync, and so forth.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Development of this feature has been a collaboration across
> >>>>>>> many
> >>>>>>>>>>>> companies and institutions. I'd like to thank J. Andreina,
> >>>>>>> Takanobu
> >>>>>>>>>>>> Asanuma, Vinayakumar B, Li Bo, Takuya Fukudome, Uma
> >>> Maheswara
> >>>>>>> Rao
> >>>>>>>>>>>> G, Rui Li, Yi Liu, Colin McCabe, Xinwei Qin, Rakesh R, Gao
> >>>> Rui,
> >>>>>>> Kai
> >>>>>>>>>>>> Sasaki, Walter Su, Tsz Wo Nicholas Sze, Andrew Wang, Yong
> >>>>>>> Zhang,
> >>>>>>>>>>>> Jing Zhao, Hui Zheng and Kai Zheng for their code
> >>>> contributions
> >>>>>>> and
> >>>>>>>>> reviews.
> >>>>>>>>>>>> Andrew and Kai Zheng also made fundamental contributions to
> >>>> the
> >>>>>>>>>>>> initial design. Rui Li, Gao Rui, Kai Sasaki, Kai Zheng and
> >>>> many
> >>>>>>>>>>>> other contributors have made great efforts in system
> >>> testing.
> >>>>>>> Many
> >>>>>>>>>>>> thanks go to Weihua Jiang for proposing the JIRA, and ATM,
> >>>> Todd
> >>>>>>>>>>>> Lipcon, Silvius Rus, Suresh, as well as many others for
> >>>>>>> providing
> >>>>>>>>> helpful feedbacks.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Following the community convention, this vote will last
> >>> for 7
> >>>>>>> days
> >>>>>>>>>>>> (ending September 29th). Votes from Hadoop committers are
> >>>>>>> binding
> >>>>>>>>>>>> but non-binding votes are very welcome as well. And here's
> >>> my
> >>>>>>>>>>>> non-binding
> >>>>>>>>>> +1.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Thanks,
> >>>>>>>>>>>> ---
> >>>>>>>>>>>> Zhe Zhang
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>
> >>>>>
> >>>>
> >>>
> >
>
>

RE: Erasure coding in branch-2 [Was Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk]

Posted by "Zheng, Kai" <ka...@intel.com>.
Sounds good to me. Once it's decided to include EC in the 2.9 release, it may be good to have a rough release date, as Zhe asked, so that the scope of EC can be worked out accordingly. We still have quite a few Phase I follow-on tasks to do before EC can be deployed in a production system. Phase II, developing non-striping EC for cold data, would possibly start after that. We might consider including only Phase I and leaving Phase II for the next release, depending on the rough release date.

Regards,
Kai

-----Original Message-----
From: Gangumalla, Uma [mailto:uma.gangumalla@intel.com] 
Sent: Tuesday, November 03, 2015 5:41 AM
To: hdfs-dev@hadoop.apache.org
Subject: Re: Erasure coding in branch-2 [Was Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk]

+1 for EC to go into 2.9. Yes, 3.x would be a long way to go when we plan to
have 2.8 and 2.9 releases.

Regards,
Uma

On 11/2/15, 11:49 AM, "Vinod Vavilapalli" <vi...@hortonworks.com> wrote:

>Forking the thread. Started looking at the 2.8 list, various features'
>status and arrived here.
>
>While I understand the pervasive nature of EC and a need for a 
>significant bake-in, moving this to a 3.x release is not a good idea. 
>We will surely get a 2.8 out this year and, as needed, I can even spend 
>time getting started on a 2.9. OTOH, 3.x is long ways off, and given 
>all the incompatibilities there, it would be a while before users can 
>get their hands on EC if it were to be only on 3.x. At best, this may 
>force sites that want EC to backport the entire EC feature to older 
>releases, at worst this will repeat the mess of 0.20 security release forks.
>
>If we think adding this to 2.8 (even if it's switched off) is too much
>risk per our original plan, let's move this to 2.9, thereby leaving
>enough time for stability, integration testing and bake-in, and a
>realistic chance of having it end up on users' clusters soonish.
>
>+Vinod
>
>> On Oct 19, 2015, at 1:44 PM, Andrew Wang <an...@cloudera.com>
>>wrote:
>> 
>> I think our plan thus far has been to target this for 3.0. I'm okay 
>>with  putting it in branch-2 if we've given a hard look at 
>>compatibility, but  I'll note though that 2.8 is already looking like 
>>quite a large release,  and our release bandwidth has been focused on 
>>the 2.6 and 2.7 maintenance  releases. Adding another multi-hundred 
>>JIRAs to 2.8 might make it too  unwieldy to get out the door. If we 
>>bump EC past that, 3.0 might very well  be our next release vehicle. I 
>>do plan to revive the 3.0 schedule some time  next year. With EC and 
>>JDK8 in a good spot, the only big feature remaining  is classpath 
>>isolation.
>> 
>> EC is also a pretty fundamental change to HDFS. Even if it's 
>>compatible, in  terms of size and impact it might best belong in a new 
>>major release.
>> 
>> Best,
>> Andrew
>> 
>> On Fri, Oct 16, 2015 at 7:04 PM, Vinayakumar B < 
>> vinayakumarb.apache@gmail.com> wrote:
>> 
>>> Does anyone else also think that the feature is ready to go to branch-2
>>>as well?
>>>
>>> It's been > 2 weeks since EC landed on trunk. IMO it looks quite stable since
>>>then and ready to go in branch-2.
>>> 
>>> -Vinay
>>> On Oct 6, 2015 12:51 AM, "Zhe Zhang" <zh...@cloudera.com> wrote:
>>> 
>>>> Thanks Vinay for capturing the issue and Uma for offering the help.
>>>> 
>>>> ---
>>>> Zhe Zhang
>>>> 
>>>> On Mon, Oct 5, 2015 at 12:19 PM, Gangumalla, Uma <
>>> uma.gangumalla@intel.com
>>>>> 
>>>> wrote:
>>>> 
>>>>> Vinay,
>>>>> 
>>>>> 
>>>>> I would merge them as part of HDFS-9182.
>>>>> 
>>>>> Thanks,
>>>>> Uma
>>>>> 
>>>>> 
>>>>> 
>>>>> On 10/5/15, 12:48 AM, "Vinayakumar B" <vi...@apache.org>
>>>>>wrote:
>>>>> 
>>>>>> Hi Andrew,
>>>>>> I see CHANGES.txt entries not yet merged from
>>> CHANGES-HDFS-EC-7285.txt.
>>>>>> 
>>>>>> Was this intentional?
>>>>>> 
>>>>>> Regards,
>>>>>> Vinay
>>>>>> 
>>>>>> On Wed, Sep 30, 2015 at 9:15 PM, Andrew Wang <
>>> andrew.wang@cloudera.com>
>>>>>> wrote:
>>>>>> 
>>>>>>> Branch has been merged to trunk, thanks again to everyone who 
>>>>>>>worked
>>>> on
>>>>>>> the
>>>>>>> feature!
>>>>>>> 
>>>>>>> On Tue, Sep 29, 2015 at 10:44 PM, Zhe Zhang 
>>>>>>> <zh...@cloudera.com>
>>>>>>> wrote:
>>>>>>> 
>>>>>>>> Thanks everyone who has participated in this discussion.
>>>>>>>> 
>>>>>>>> With 7 +1's (5 binding and 2 non-binding), and no -1, this vote
>>> has
>>>>>>> passed.
>>>>>>>> I will do a final 'git merge' with trunk and work with Andrew 
>>>>>>>> to
>>>> merge
>>>>>>> the
>>>>>>>> branch to trunk. I'll update on this thread when the merge is
>>> done.
>>>>>>>> 
>>>>>>>> ---
>>>>>>>> Zhe Zhang
>>>>>>>> 
>>>>>>>> On Thu, Sep 24, 2015 at 11:08 PM, Liu, Yi A 
>>>>>>>> <yi...@intel.com>
>>>>>>> wrote:
>>>>>>>> 
>>>>>>>>> (Change it to binding.)
>>>>>>>>> 
>>>>>>>>> +1
>>>>>>>>> I have been involved in the development and code review on the
>>>>>>> feature
>>>>>>>>> branch. It's a great feature and I think it's ready to merge 
>>>>>>>>> it
>>>> into
>>>>>>>> trunk.
>>>>>>>>> 
>>>>>>>>> Thanks all for the contribution.
>>>>>>>>> 
>>>>>>>>> Regards,
>>>>>>>>> Yi Liu
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: Liu, Yi A
>>>>>>>>> Sent: Friday, September 25, 2015 1:51 PM
>>>>>>>>> To: hdfs-dev@hadoop.apache.org
>>>>>>>>> Subject: RE: [VOTE] Merge HDFS-7285 (erasure coding) branch to
>>>> trunk
>>>>>>>>> 
>>>>>>>>> +1 (non-binding)
>>>>>>>>> I have been involved in the development and code review on the
>>>>>>> feature
>>>>>>>>> branch. It's a great feature and I think it's ready to merge 
>>>>>>>>> it
>>>> into
>>>>>>>> trunk.
>>>>>>>>> 
>>>>>>>>> Thanks all for the contribution.
>>>>>>>>> 
>>>>>>>>> Regards,
>>>>>>>>> Yi Liu
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: Vinayakumar B [mailto:vinayakumarb@apache.org]
>>>>>>>>> Sent: Friday, September 25, 2015 12:21 PM
>>>>>>>>> To: hdfs-dev@hadoop.apache.org
>>>>>>>>> Subject: Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to
>>>> trunk
>>>>>>>>> 
>>>>>>>>> +1,
>>>>>>>>> 
>>>>>>>>> I've been involved starting from design and development of
>>>>>>> ErasureCoding.
>>>>>>>>> I think phase 1 of this development is ready to be merged to
>>>> trunk.
>>>>>>>>> It had come a long way to the current state with significant
>>>> effort
>>>>>>> of
>>>>>>>>> many Contributors and Reviewers for both design and code.
>>>>>>>>> 
>>>>>>>>> Thanks Everyone for the efforts.
>>>>>>>>> 
>>>>>>>>> Regards,
>>>>>>>>> Vinay
>>>>>>>>> 
>>>>>>>>> On Wed, Sep 23, 2015 at 10:53 PM, Jing Zhao <ji...@apache.org>
>>>>>>> wrote:
>>>>>>>>> 
>>>>>>>>>> +1
>>>>>>>>>> 
>>>>>>>>>> I've been involved in both development and review on the
>>> branch,
>>>>>>> and
>>>>>>> I
>>>>>>>>>> believe it's now ready to get merged into trunk. Many thanks
>>> to
>>>>>>> all
>>>>>>>>>> the contributors and reviewers!
>>>>>>>>>> 
>>>>>>>>>> Thanks,
>>>>>>>>>> -Jing
>>>>>>>>>> 
>>>>>>>>>> On Tue, Sep 22, 2015 at 6:17 PM, Zheng, Kai <
>>>> kai.zheng@intel.com>
>>>>>>>> wrote:
>>>>>>>>>> 
>>>>>>>>>>> Non-binding +1
>>>>>>>>>>> 
>>>>>>>>>>> According to our extensive performance tests, striping +
>>> ISA-L
>>>>>>> coder
>>>>>>>>>> based
>>>>>>>>>>> erasure coding not only can save storage, but also can
>>>> increase
>>>>>>> the
>>>>>>>>>>> throughput of a client or a cluster. It will be a great
>>>>>>> addition to
>>>>>>>>>>> HDFS and its users. Based on the latest branch codes, we
>>> also
>>>>>>>>>>> observed it's
>>>>>>>>>> very
>>>>>>>>>>> reliable in the concurrent tests. We'll provide the perf
>>> test
>>>>>>> report
>>>>>>>>>> after
>>>>>>>>>>> it's sorted out and hope it helps.
>>>>>>>>>>> Thanks!
>>>>>>>>>>> 
>>>>>>>>>>> Regards,
>>>>>>>>>>> Kai
>>>>>>>>>>> 
>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>> From: Gangumalla, Uma [mailto:uma.gangumalla@intel.com]
>>>>>>>>>>> Sent: Wednesday, September 23, 2015 8:50 AM
>>>>>>>>>>> To: hdfs-dev@hadoop.apache.org;
>>> common-dev@hadoop.apache.org
>>>>>>>>>>> Subject: Re: [VOTE] Merge HDFS-7285 (erasure coding) branch
>>> to
>>>>>>> trunk
>>>>>>>>>>> 
>>>>>>>>>>> +1
>>>>>>>>>>> 
>>>>>>>>>>> Great addition to HDFS. Thanks all contributors for the nice
>>>>>>> work.
>>>>>>>>>>> 
>>>>>>>>>>> Regards,
>>>>>>>>>>> Uma
>>>>>>>>>>> 
>>>>>>>>>>> On 9/22/15, 3:40 PM, "Zhe Zhang" <zh...@cloudera.com>
>>>> wrote:
>>>>>>>>>>> 
>>>>>>>>>>>> Hi,
>>>>>>>>>>>> 
>>>>>>>>>>>> I'd like to propose a vote to merge the HDFS-7285 feature
>>>>>>> branch
>>>>>>>>>>>> back to trunk. Since November 2014 we have been designing
>>> and
>>>>>>>>>>>> developing this feature under the umbrella JIRAs HDFS-7285
>>>> and
>>>>>>>>>>>> HADOOP-11264, and have committed approximately 210 patches.
>>>>>>>>>>>> 
>>>>>>>>>>>> The HDFS-7285 feature branch was created to support the
>>> first
>>>>>>> phase
>>>>>>>>>>>> of HDFS erasure coding (HDFS-EC). The objective of HDFS-EC
>>> is
>>>>>>> to
>>>>>>>>>>>> significantly reduce storage space usage in HDFS clusters.
>>>>>>> Instead
>>>>>>>>>>>> of always creating 3 replicas of each block with 200%
>>> storage
>>>>>>> space
>>>>>>>>>>>> overhead, HDFS-EC provides data durability through parity
>>>> data
>>>>>>>> blocks.
>>>>>>>>>>>> With most EC configurations, the storage overhead is no
>>> more
>>>>>>> than
>>>>>>>> 50%.
>>>>>>>>>>>> Based on profiling results of production clusters, we
>>> decided
>>>>>>> to
>>>>>>>>>>>> support EC with the striped block layout in the first
>>> phase,
>>>> so
>>>>>>>>>>>> that small files can be better handled. This means dividing
>>>>>>> each
>>>>>>>>>>>> logical HDFS file block into smaller units (striping cells)
>>>> and
>>>>>>>>>>>> spreading them on a set of DataNodes in round-robin
>>> fashion.
>>>>>>> Parity
>>>>>>>>>>>> cells are generated for each stripe of original data cells.
>>>> We
>>>>>>> have
>>>>>>>>>>>> made changes to NameNode, client, and DataNode to
>>> generalize
>>>>>>> the
>>>>>>>>>>>> block concept and handle the mapping between a logical file
>>>>>>> block
>>>>>>>>>>>> and its internal storage blocks. For further details please
>>>> see
>>>>>>> the
>>>>>>>>>>>> design doc on HDFS-7285.
>>>>>>>>>>>> HADOOP-11264 focuses on providing flexible and
>>>> high-performance
>>>>>>>>>>>> codec calculation support.
>>>>>>>>>>>> 
>>>>>>>>>>>> The nightly Jenkins job of the branch has reported several 
>>>>>>>>>>>> successful runs, and doesn't show new flaky tests compared
>>>> with
>>>>>>>>>>>> trunk. We have posted several versions of the test plan
>>>>>>> including
>>>>>>>>>>>> both unit testing and cluster testing, and have executed
>>> most
>>>>>>> tests
>>>>>>>>>>>> in the plan. The most basic functionalities have been
>>>>>>> extensively
>>>>>>>>>>>> tested and verified in several real clusters with different 
>>>>>>>>>>>> hardware configurations; results have been very stable. We
>>>> have
>>>>>>>>>>>> created follow-on tasks for more advanced error handling
>>> and
>>>>>>>>> optimization under the umbrella HDFS-8031.
>>>>>>>>>>>> We also plan to implement or harden the integration of EC
>>>> with
>>>>>>>>>>>> existing features such as WebHDFS, snapshot, append,
>>>> truncate,
>>>>>>>>>>>> hflush, hsync, and so forth.
>>>>>>>>>>>> 
>>>>>>>>>>>> Development of this feature has been a collaboration across
>>>>>>> many
>>>>>>>>>>>> companies and institutions. I'd like to thank J. Andreina,
>>>>>>> Takanobu
>>>>>>>>>>>> Asanuma, Vinayakumar B, Li Bo, Takuya Fukudome, Uma
>>> Maheswara
>>>>>>> Rao
>>>>>>>>>>>> G, Rui Li, Yi Liu, Colin McCabe, Xinwei Qin, Rakesh R, Gao
>>>> Rui,
>>>>>>> Kai
>>>>>>>>>>>> Sasaki, Walter Su, Tsz Wo Nicholas Sze, Andrew Wang, Yong
>>>>>>> Zhang,
>>>>>>>>>>>> Jing Zhao, Hui Zheng and Kai Zheng for their code
>>>> contributions
>>>>>>> and
>>>>>>>>> reviews.
>>>>>>>>>>>> Andrew and Kai Zheng also made fundamental contributions to
>>>> the
>>>>>>>>>>>> initial design. Rui Li, Gao Rui, Kai Sasaki, Kai Zheng and
>>>> many
>>>>>>>>>>>> other contributors have made great efforts in system
>>> testing.
>>>>>>> Many
>>>>>>>>>>>> thanks go to Weihua Jiang for proposing the JIRA, and ATM,
>>>> Todd
>>>>>>>>>>>> Lipcon, Silvius Rus, Suresh, as well as many others for
>>>>>>> providing
>>>>>>>>> helpful feedbacks.
>>>>>>>>>>>> 
>>>>>>>>>>>> Following the community convention, this vote will last
>>> for 7
>>>>>>> days
>>>>>>>>>>>> (ending September 29th). Votes from Hadoop committers are
>>>>>>> binding
>>>>>>>>>>>> but non-binding votes are very welcome as well. And here's
>>> my
>>>>>>>>>>>> non-binding
>>>>>>>>>> +1.
>>>>>>>>>>>> 
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> ---
>>>>>>>>>>>> Zhe Zhang
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>> 
>>>>> 
>>>> 
>>> 
>


Re: Erasure coding in branch-2 [Was Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk]

Posted by Elliott Clark <ec...@apache.org>.
I don't really see the difference between 2.9 with a ton of scary changes
(EC is a lot more NN stuff than a usual release) and a 3.0.
What's the downside of getting a major version? It relaxes the compat a
little bit. It would allow some bake time before it's stable.

Put another way.  I'm upgrading our prod clusters to 2.7 now and 2.8 when
it comes out. That's because those releases are normal dot releases.
I would hold off a much longer time for a 2.9 with EC.

That to me really signals that it's a different type of release and that
should be messaged with a major version rev.

On Wed, Nov 4, 2015 at 4:06 PM, Andrew Purtell <an...@gmail.com>
wrote:

> Yes, we can most likely help you. Please come over to dev@bigtop.
>
>
> > On Nov 4, 2015, at 12:40 PM, Andrew Wang <an...@cloudera.com>
> wrote:
> >
> > We used to get help from Bigtop when it comes to integration testing. Do
> we
> > think that's possible for 2.8?
> >
> > On Wed, Nov 4, 2015 at 10:08 AM, Steve Loughran <st...@hortonworks.com>
> > wrote:
> >
> >>
> >>>> On 2 Nov 2015, at 23:11, Vinod Vavilapalli <vi...@hortonworks.com>
> >>> wrote:
> >>>
> >>> Yes, I’ve already started looking at 2.8.0, that is exactly how I ended
> >> up with this discussion on the state of EC.
> >>>
> >>> +Vinod
> >>>
> >>>
> >>>> On Nov 2, 2015, at 3:02 PM, Haohui Mai <ricetons@gmail.com> wrote:
> >>>
> >>> Is it a good time to start the discussion on the issues of releasing
> 2.8?
> >>
> >> Before rushing to release 2.8, people should be trying in downstream
> apps
> >> today. As well as identifying hdfs-client related issues, I've just
> >> discovered that the MiniYARNCluster has added lots of stack traces
> >> (YARN-4330), and I'm sure there are other regressions.
> >>
> >> It's generally not that hard to take a downstream project and try to
> build
> >> with a hadoop version of 2.8.0-SNAPSHOT; compilation and classpath
> problems
> >> will show up immediately; unit test regressions can be at least
> identified
> >> by switching between 2.7.1 and 2.8.0-SNAPSHOT.
>

Re: Erasure coding in branch-2 [Was Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk]

Posted by Andrew Purtell <an...@gmail.com>.
Yes, we can most likely help you. Please come over to dev@bigtop.


> On Nov 4, 2015, at 12:40 PM, Andrew Wang <an...@cloudera.com> wrote:
> 
> We used to get help from Bigtop when it comes to integration testing. Do we
> think that's possible for 2.8?
> 
> On Wed, Nov 4, 2015 at 10:08 AM, Steve Loughran <st...@hortonworks.com>
> wrote:
> 
>> 
>>>> On 2 Nov 2015, at 23:11, Vinod Vavilapalli <vi...@hortonworks.com>
>>> wrote:
>>> 
>>> Yes, I’ve already started looking at 2.8.0, that is exactly how I ended
>> up with this discussion on the state of EC.
>>> 
>>> +Vinod
>>> 
>>> 
>>>> On Nov 2, 2015, at 3:02 PM, Haohui Mai <ricetons@gmail.com> wrote:
>>> 
>>> Is it a good time to start the discussion on the issues of releasing 2.8?
>> 
>> Before rushing to release 2.8, people should be trying in downstream apps
>> today. As well as identifying hdfs-client related issues, I've just
>> discovered that the MiniYARNCluster has added lots of stack traces
>> (YARN-4330), and I'm sure there are other regressions.
>> 
>> It's generally not that hard to take a downstream project and try to build
>> with a hadoop version of 2.8.0-SNAPSHOT; compilation and classpath problems
>> will show up immediately; unit test regressions can be at least identified
>> by switching between 2.7.1 and 2.8.0-SNAPSHOT.

Re: Erasure coding in branch-2 [Was Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk]

Posted by Andrew Wang <an...@cloudera.com>.
We used to get help from Bigtop when it comes to integration testing. Do we
think that's possible for 2.8?

On Wed, Nov 4, 2015 at 10:08 AM, Steve Loughran <st...@hortonworks.com>
wrote:

>
> > On 2 Nov 2015, at 23:11, Vinod Vavilapalli <vi...@hortonworks.com> wrote:
> >
> > Yes, I’ve already started looking at 2.8.0, that is exactly how I ended
> > up with this discussion on the state of EC.
> >
> > +Vinod
> >
> >
> > On Nov 2, 2015, at 3:02 PM, Haohui Mai <ricetons@gmail.com> wrote:
> >
> > Is it a good time to start the discussion on the issues of releasing 2.8?
> >
>
> Before rushing to release 2.8, people should be trying it in downstream apps
> today. As well as identifying hdfs-client related issues, I've just
> discovered that the MiniYARNCluster has added lots of stack traces
> (YARN-4330), and I'm sure there are other regressions.
>
> It's generally not that hard to take a downstream project and try to build
> with a hadoop version of 2.8.0-SNAPSHOT; compilation and classpath problems
> will show up immediately; unit test regressions can at least be identified
> by switching between 2.7.1 and 2.8.0-SNAPSHOT.

Re: Erasure coding in branch-2 [Was Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk]

Posted by Steve Loughran <st...@hortonworks.com>.
> On 2 Nov 2015, at 23:11, Vinod Vavilapalli <vi...@hortonworks.com> wrote:
> 
> Yes, I’ve already started looking at 2.8.0, that is exactly how I ended up with this discussion on the state of EC.
> 
> +Vinod
> 
> 
> On Nov 2, 2015, at 3:02 PM, Haohui Mai <ri...@gmail.com> wrote:
> 
> Is it a good time to start the discussion on the issues of releasing 2.8?
> 

Before rushing to release 2.8, people should be trying it in downstream apps today. As well as identifying hdfs-client related issues, I've just discovered that the MiniYARNCluster has added lots of stack traces (YARN-4330), and I'm sure there are other regressions.

It's generally not that hard to take a downstream project and try to build it with a Hadoop version of 2.8.0-SNAPSHOT; compilation and classpath problems will show up immediately, and unit test regressions can at least be identified by switching between 2.7.1 and 2.8.0-SNAPSHOT.
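
As a rough illustration of that version switch, here is a minimal Python sketch that only assembles the Maven command lines for each Hadoop version to compare. It assumes the downstream project's POM reads its Hadoop dependency version from a Maven property; the property name `hadoop.version` and the helper `maven_test_commands` are hypothetical, and the 2.8.0-SNAPSHOT artifacts are assumed to be resolvable (for example, installed into the local repository from a branch-2 build).

```python
from typing import List


def maven_test_commands(versions: List[str],
                        version_property: str = "hadoop.version") -> List[List[str]]:
    """Return one `mvn clean test` argv per Hadoop version.

    `version_property` is an assumed, project-specific Maven property name;
    check the downstream POM for the real one.
    """
    return [
        ["mvn", "clean", "test", f"-D{version_property}={version}"]
        for version in versions
    ]


if __name__ == "__main__":
    # Print the commands; run them by hand (or via subprocess) in the
    # downstream project's checkout and compare the test reports.
    for argv in maven_test_commands(["2.7.1", "2.8.0-SNAPSHOT"]):
        print(" ".join(argv))
```

Tests that pass against 2.7.1 but fail against 2.8.0-SNAPSHOT are the candidates worth reporting as regressions.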

Re: Erasure coding in branch-2 [Was Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk]

Posted by Vinod Vavilapalli <vi...@hortonworks.com>.
Yes, I’ve already started looking at 2.8.0; that is exactly how I ended up with this discussion on the state of EC.

+Vinod


On Nov 2, 2015, at 3:02 PM, Haohui Mai <ri...@gmail.com> wrote:

Is it a good time to start the discussion on the issues of releasing 2.8?


Re: Erasure coding in branch-2 [Was Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk]

Posted by Haohui Mai <ri...@gmail.com>.
+1 on putting EC on 2.9.

Is it a good time to start the discussion on the issues of releasing 2.8?

~Haohui

On Mon, Nov 2, 2015 at 1:40 PM, Gangumalla, Uma
<um...@intel.com> wrote:
> +1 for EC to go into 2.9. Yes, 3.x would be a long way off, given that we
> plan to have 2.8 and 2.9 releases.
>
> Regards,
> Uma
>
> On 11/2/15, 11:49 AM, "Vinod Vavilapalli" <vi...@hortonworks.com> wrote:
>
>>Forking the thread. Started looking at the 2.8 list, various features'
>>status and arrived here.
>>
>>While I understand the pervasive nature of EC and a need for a
>>significant bake-in, moving this to a 3.x release is not a good idea. We
>>will surely get a 2.8 out this year and, as needed, I can even spend time
>>getting started on a 2.9. OTOH, 3.x is a long way off, and given all the
>>incompatibilities there, it would be a while before users can get their
>>hands on EC if it were to be only on 3.x. At best, this may force sites
>>that want EC to backport the entire EC feature to older releases; at
>>worst this will repeat the mess of the 0.20 security release forks.
>>
>>If we think adding this to 2.8 (even if it is switched off) is too much risk
>>per our original plan, let's move this to 2.9, thereby leaving enough
>>time for stability, integration testing and bake-in, and a realistic
>>chance of having it end up on users' clusters soonish.
>>
>>+Vinod
>>
>>> On Oct 19, 2015, at 1:44 PM, Andrew Wang <an...@cloudera.com> wrote:
>>>
>>> I think our plan thus far has been to target this for 3.0. I'm okay with
>>> putting it in branch-2 if we've given a hard look at compatibility, but
>>> I'll note though that 2.8 is already looking like quite a large release,
>>> and our release bandwidth has been focused on the 2.6 and 2.7 maintenance
>>> releases. Adding another multi-hundred JIRAs to 2.8 might make it too
>>> unwieldy to get out the door. If we bump EC past that, 3.0 might very well
>>> be our next release vehicle. I do plan to revive the 3.0 schedule some time
>>> next year. With EC and JDK8 in a good spot, the only big feature remaining
>>> is classpath isolation.
>>>
>>> EC is also a pretty fundamental change to HDFS. Even if it's compatible, in
>>> terms of size and impact it might best belong in a new major release.
>>>
>>> Best,
>>> Andrew
>>>
>>> On Fri, Oct 16, 2015 at 7:04 PM, Vinayakumar B <
>>> vinayakumarb.apache@gmail.com> wrote:
>>>
>>>> Does anyone else also think that the feature is ready to go to branch-2
>>>> as well?
>>>>
>>>> It's been > 2 weeks since EC landed on trunk. IMO it looks quite stable
>>>> since then and ready to go into branch-2.
>>>>
>>>> -Vinay
>>>> On Oct 6, 2015 12:51 AM, "Zhe Zhang" <zh...@cloudera.com> wrote:
>>>>
>>>>> Thanks Vinay for capturing the issue and Uma for offering the help.
>>>>>
>>>>> ---
>>>>> Zhe Zhang
>>>>>
>>>>> On Mon, Oct 5, 2015 at 12:19 PM, Gangumalla, Uma <
>>>> uma.gangumalla@intel.com
>>>>>>
>>>>> wrote:
>>>>>
>>>>>> Vinay,
>>>>>>
>>>>>>
>>>>>> I would merge them as part of HDFS-9182.
>>>>>>
>>>>>> Thanks,
>>>>>> Uma
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 10/5/15, 12:48 AM, "Vinayakumar B" <vi...@apache.org>
>>>>>>wrote:
>>>>>>
>>>>>>> Hi Andrew,
>>>>>>> I see CHANGES.txt entries not yet merged from
>>>> CHANGES-HDFS-EC-7285.txt.
>>>>>>>
>>>>>>> Was this intentional?
>>>>>>>
>>>>>>> Regards,
>>>>>>> Vinay
>>>>>>>
>>>>>>> On Wed, Sep 30, 2015 at 9:15 PM, Andrew Wang <
>>>> andrew.wang@cloudera.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Branch has been merged to trunk, thanks again to everyone who
>>>>>>>>worked
>>>>> on
>>>>>>>> the
>>>>>>>> feature!
>>>>>>>>
>>>>>>>> On Tue, Sep 29, 2015 at 10:44 PM, Zhe Zhang <zh...@cloudera.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Thanks everyone who has participated in this discussion.
>>>>>>>>>
>>>>>>>>> With 7 +1's (5 binding and 2 non-binding), and no -1, this vote
>>>> has
>>>>>>>> passed.
>>>>>>>>> I will do a final 'git merge' with trunk and work with Andrew to
>>>>> merge
>>>>>>>> the
>>>>>>>>> branch to trunk. I'll update on this thread when the merge is
>>>> done.
>>>>>>>>>
>>>>>>>>> ---
>>>>>>>>> Zhe Zhang
>>>>>>>>>
>>>>>>>>> On Thu, Sep 24, 2015 at 11:08 PM, Liu, Yi A <yi...@intel.com>
>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> (Change it to binding.)
>>>>>>>>>>
>>>>>>>>>> +1
>>>>>>>>>> I have been involved in the development and code review on the
>>>>>>>> feature
>>>>>>>>>> branch. It's a great feature and I think it's ready to merge it
>>>>> into
>>>>>>>>> trunk.
>>>>>>>>>>
>>>>>>>>>> Thanks all for the contribution.
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>> Yi Liu
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> -----Original Message-----
>>>>>>>>>> From: Liu, Yi A
>>>>>>>>>> Sent: Friday, September 25, 2015 1:51 PM
>>>>>>>>>> To: hdfs-dev@hadoop.apache.org
>>>>>>>>>> Subject: RE: [VOTE] Merge HDFS-7285 (erasure coding) branch to
>>>>> trunk
>>>>>>>>>>
>>>>>>>>>> +1 (non-binding)
>>>>>>>>>> I have been involved in the development and code review on the
>>>>>>>> feature
>>>>>>>>>> branch. It's a great feature and I think it's ready to merge it
>>>>> into
>>>>>>>>> trunk.
>>>>>>>>>>
>>>>>>>>>> Thanks all for the contribution.
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>> Yi Liu
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> -----Original Message-----
>>>>>>>>>> From: Vinayakumar B [mailto:vinayakumarb@apache.org]
>>>>>>>>>> Sent: Friday, September 25, 2015 12:21 PM
>>>>>>>>>> To: hdfs-dev@hadoop.apache.org
>>>>>>>>>> Subject: Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to
>>>>> trunk
>>>>>>>>>>
>>>>>>>>>> +1,
>>>>>>>>>>
>>>>>>>>>> I've been involved starting from design and development of
>>>>>>>> ErasureCoding.
>>>>>>>>>> I think phase 1 of this development is ready to be merged to
>>>>> trunk.
>>>>>>>>>> It had come a long way to the current state with significant
>>>>> effort
>>>>>>>> of
>>>>>>>>>> many Contributors and Reviewers for both design and code.
>>>>>>>>>>
>>>>>>>>>> Thanks Everyone for the efforts.
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>> Vinay
>>>>>>>>>>
>>>>>>>>>> On Wed, Sep 23, 2015 at 10:53 PM, Jing Zhao <ji...@apache.org>
>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> +1
>>>>>>>>>>>
>>>>>>>>>>> I've been involved in both development and review on the
>>>> branch,
>>>>>>>> and
>>>>>>>> I
>>>>>>>>>>> believe it's now ready to get merged into trunk. Many thanks
>>>> to
>>>>>>>> all
>>>>>>>>>>> the contributors and reviewers!
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> -Jing
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Sep 22, 2015 at 6:17 PM, Zheng, Kai <
>>>>> kai.zheng@intel.com>
>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Non-binding +1
>>>>>>>>>>>>
>>>>>>>>>>>> According to our extensive performance tests, striping +
>>>> ISA-L
>>>>>>>> coder
>>>>>>>>>>> based
>>>>>>>>>>>> erasure coding not only can save storage, but also can
>>>>> increase
>>>>>>>> the
>>>>>>>>>>>> throughput of a client or a cluster. It will be a great
>>>>>>>> addition to
>>>>>>>>>>>> HDFS and its users. Based on the latest branch codes, we
>>>> also
>>>>>>>>>>>> observed it's
>>>>>>>>>>> very
>>>>>>>>>>>> reliable in the concurrent tests. We'll provide the perf
>>>> test
>>>>>>>> report
>>>>>>>>>>> after
>>>>>>>>>>>> it's sorted out and hope it helps.
>>>>>>>>>>>> Thanks!
>>>>>>>>>>>>
>>>>>>>>>>>> Regards,
>>>>>>>>>>>> Kai
>>>>>>>>>>>>
>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>> From: Gangumalla, Uma [mailto:uma.gangumalla@intel.com]
>>>>>>>>>>>> Sent: Wednesday, September 23, 2015 8:50 AM
>>>>>>>>>>>> To: hdfs-dev@hadoop.apache.org;
>>>> common-dev@hadoop.apache.org
>>>>>>>>>>>> Subject: Re: [VOTE] Merge HDFS-7285 (erasure coding) branch
>>>> to
>>>>>>>> trunk
>>>>>>>>>>>>
>>>>>>>>>>>> +1
>>>>>>>>>>>>
>>>>>>>>>>>> Great addition to HDFS. Thanks all contributors for the nice
>>>>>>>> work.
>>>>>>>>>>>>
>>>>>>>>>>>> Regards,
>>>>>>>>>>>> Uma
>>>>>>>>>>>>
>>>>>>>>>>>> On 9/22/15, 3:40 PM, "Zhe Zhang" <zh...@cloudera.com>
>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'd like to propose a vote to merge the HDFS-7285 feature
>>>>>>>> branch
>>>>>>>>>>>>> back to trunk. Since November 2014 we have been designing
>>>> and
>>>>>>>>>>>>> developing this feature under the umbrella JIRAs HDFS-7285
>>>>> and
>>>>>>>>>>>>> HADOOP-11264, and have committed approximately 210 patches.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The HDFS-7285 feature branch was created to support the
>>>> first
>>>>>>>> phase
>>>>>>>>>>>>> of HDFS erasure coding (HDFS-EC). The objective of HDFS-EC
>>>> is
>>>>>>>> to
>>>>>>>>>>>>> significantly reduce storage space usage in HDFS clusters.
>>>>>>>> Instead
>>>>>>>>>>>>> of always creating 3 replicas of each block with 200%
>>>> storage
>>>>>>>> space
>>>>>>>>>>>>> overhead, HDFS-EC provides data durability through parity
>>>>> data
>>>>>>>>> blocks.
>>>>>>>>>>>>> With most EC configurations, the storage overhead is no
>>>> more
>>>>>>>> than
>>>>>>>>> 50%.
>>>>>>>>>>>>> Based on profiling results of production clusters, we
>>>> decided
>>>>>>>> to
>>>>>>>>>>>>> support EC with the striped block layout in the first
>>>> phase,
>>>>> so
>>>>>>>>>>>>> that small files can be better handled. This means dividing
>>>>>>>> each
>>>>>>>>>>>>> logical HDFS file block into smaller units (striping cells)
>>>>> and
>>>>>>>>>>>>> spreading them on a set of DataNodes in round-robin
>>>> fashion.
>>>>>>>> Parity
>>>>>>>>>>>>> cells are generated for each stripe of original data cells.
>>>>> We
>>>>>>>> have
>>>>>>>>>>>>> made changes to NameNode, client, and DataNode to
>>>> generalize
>>>>>>>> the
>>>>>>>>>>>>> block concept and handle the mapping between a logical file
>>>>>>>> block
>>>>>>>>>>>>> and its internal storage blocks. For further details please
>>>>> see
>>>>>>>> the
>>>>>>>>>>>>> design doc on HDFS-7285.
>>>>>>>>>>>>> HADOOP-11264 focuses on providing flexible and
>>>>> high-performance
>>>>>>>>>>>>> codec calculation support.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The nightly Jenkins job of the branch has reported several
>>>>>>>>>>>>> successful runs, and doesn't show new flaky tests compared
>>>>> with
>>>>>>>>>>>>> trunk. We have posted several versions of the test plan
>>>>>>>> including
>>>>>>>>>>>>> both unit testing and cluster testing, and have executed
>>>> most
>>>>>>>> tests
>>>>>>>>>>>>> in the plan. The most basic functionalities have been
>>>>>>>> extensively
>>>>>>>>>>>>> tested and verified in several real clusters with different
>>>>>>>>>>>>> hardware configurations; results have been very stable. We
>>>>> have
>>>>>>>>>>>>> created follow-on tasks for more advanced error handling
>>>> and
>>>>>>>>>> optimization under the umbrella HDFS-8031.
>>>>>>>>>>>>> We also plan to implement or harden the integration of EC
>>>>> with
>>>>>>>>>>>>> existing features such as WebHDFS, snapshot, append,
>>>>> truncate,
>>>>>>>>>>>>> hflush, hsync, and so forth.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Development of this feature has been a collaboration across
>>>>>>>> many
>>>>>>>>>>>>> companies and institutions. I'd like to thank J. Andreina,
>>>>>>>> Takanobu
>>>>>>>>>>>>> Asanuma, Vinayakumar B, Li Bo, Takuya Fukudome, Uma
>>>> Maheswara
>>>>>>>> Rao
>>>>>>>>>>>>> G, Rui Li, Yi Liu, Colin McCabe, Xinwei Qin, Rakesh R, Gao
>>>>> Rui,
>>>>>>>> Kai
>>>>>>>>>>>>> Sasaki, Walter Su, Tsz Wo Nicholas Sze, Andrew Wang, Yong
>>>>>>>> Zhang,
>>>>>>>>>>>>> Jing Zhao, Hui Zheng and Kai Zheng for their code
>>>>> contributions
>>>>>>>> and
>>>>>>>>>> reviews.
>>>>>>>>>>>>> Andrew and Kai Zheng also made fundamental contributions to
>>>>> the
>>>>>>>>>>>>> initial design. Rui Li, Gao Rui, Kai Sasaki, Kai Zheng and
>>>>> many
>>>>>>>>>>>>> other contributors have made great efforts in system
>>>> testing.
>>>>>>>> Many
>>>>>>>>>>>>> thanks go to Weihua Jiang for proposing the JIRA, and ATM,
>>>>> Todd
>>>>>>>>>>>>> Lipcon, Silvius Rus, Suresh, as well as many others for
>>>>>>>> providing
>>>>>>>>>> helpful feedbacks.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Following the community convention, this vote will last
>>>> for 7
>>>>>>>> days
>>>>>>>>>>>>> (ending September 29th). Votes from Hadoop committers are
>>>>>>>> binding
>>>>>>>>>>>>> but non-binding votes are very welcome as well. And here's
>>>> my
>>>>>>>>>>>>> non-binding
>>>>>>>>>>> +1.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> ---
>>>>>>>>>>>>> Zhe Zhang

Re: Erasure coding in branch-2 [Was Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk]

Posted by "Gangumalla, Uma" <um...@intel.com>.
+1 for EC to go into 2.9. Yes, 3.x would be a long way off, given that we
plan to have 2.8 and 2.9 releases.

Regards,
Uma

On 11/2/15, 11:49 AM, "Vinod Vavilapalli" <vi...@hortonworks.com> wrote:

>Forking the thread. Started looking at the 2.8 list, various features'
>status and arrived here.
>
>While I understand the pervasive nature of EC and a need for a
>significant bake-in, moving this to a 3.x release is not a good idea. We
>will surely get a 2.8 out this year and, as needed, I can even spend time
>getting started on a 2.9. OTOH, 3.x is a long way off, and given all the
>incompatibilities there, it would be a while before users can get their
>hands on EC if it were to be only on 3.x. At best, this may force sites
>that want EC to backport the entire EC feature to older releases; at
>worst this will repeat the mess of the 0.20 security release forks.
>
>If we think adding this to 2.8 (even if it is switched off) is too much risk
>per our original plan, let's move this to 2.9, thereby leaving enough
>time for stability, integration testing and bake-in, and a realistic
>chance of having it end up on users' clusters soonish.
>
>+Vinod
>
>> On Oct 19, 2015, at 1:44 PM, Andrew Wang <an...@cloudera.com> wrote:
>>
>> I think our plan thus far has been to target this for 3.0. I'm okay with
>> putting it in branch-2 if we've given a hard look at compatibility, but
>> I'll note though that 2.8 is already looking like quite a large release,
>> and our release bandwidth has been focused on the 2.6 and 2.7 maintenance
>> releases. Adding another multi-hundred JIRAs to 2.8 might make it too
>> unwieldy to get out the door. If we bump EC past that, 3.0 might very well
>> be our next release vehicle. I do plan to revive the 3.0 schedule some time
>> next year. With EC and JDK8 in a good spot, the only big feature remaining
>> is classpath isolation.
>>
>> EC is also a pretty fundamental change to HDFS. Even if it's compatible, in
>> terms of size and impact it might best belong in a new major release.
>> 
>> Best,
>> Andrew
>> 
>> On Fri, Oct 16, 2015 at 7:04 PM, Vinayakumar B <
>> vinayakumarb.apache@gmail.com> wrote:
>> 
>>> Does anyone else also think that the feature is ready to go to branch-2
>>> as well?
>>>
>>> It's been > 2 weeks since EC landed on trunk. IMO it looks quite stable
>>> since then and ready to go into branch-2.
>>> 
>>> -Vinay
>>> On Oct 6, 2015 12:51 AM, "Zhe Zhang" <zh...@cloudera.com> wrote:
>>> 
>>>> Thanks Vinay for capturing the issue and Uma for offering the help.
>>>> 
>>>> ---
>>>> Zhe Zhang
>>>> 
>>>> On Mon, Oct 5, 2015 at 12:19 PM, Gangumalla, Uma <
>>> uma.gangumalla@intel.com
>>>>> 
>>>> wrote:
>>>> 
>>>>> Vinay,
>>>>> 
>>>>> 
>>>>> I would merge them as part of HDFS-9182.
>>>>> 
>>>>> Thanks,
>>>>> Uma
>>>>> 
>>>>> 
>>>>> 
>>>>> On 10/5/15, 12:48 AM, "Vinayakumar B" <vi...@apache.org>
>>>>>wrote:
>>>>> 
>>>>>> Hi Andrew,
>>>>>> I see CHANGES.txt entries not yet merged from
>>> CHANGES-HDFS-EC-7285.txt.
>>>>>> 
>>>>>> Was this intentional?
>>>>>> 
>>>>>> Regards,
>>>>>> Vinay
>>>>>> 
>>>>>> On Wed, Sep 30, 2015 at 9:15 PM, Andrew Wang <
>>> andrew.wang@cloudera.com>
>>>>>> wrote:
>>>>>> 
>>>>>>> Branch has been merged to trunk, thanks again to everyone who
>>>>>>>worked
>>>> on
>>>>>>> the
>>>>>>> feature!
>>>>>>> 
>>>>>>> On Tue, Sep 29, 2015 at 10:44 PM, Zhe Zhang <zh...@cloudera.com>
>>>>>>> wrote:
>>>>>>> 
>>>>>>>> Thanks everyone who has participated in this discussion.
>>>>>>>> 
>>>>>>>> With 7 +1's (5 binding and 2 non-binding), and no -1, this vote
>>> has
>>>>>>> passed.
>>>>>>>> I will do a final 'git merge' with trunk and work with Andrew to
>>>> merge
>>>>>>> the
>>>>>>>> branch to trunk. I'll update on this thread when the merge is
>>> done.
>>>>>>>> 
>>>>>>>> ---
>>>>>>>> Zhe Zhang
>>>>>>>> 
>>>>>>>> On Thu, Sep 24, 2015 at 11:08 PM, Liu, Yi A <yi...@intel.com>
>>>>>>> wrote:
>>>>>>>> 
>>>>>>>>> (Change it to binding.)
>>>>>>>>> 
>>>>>>>>> +1
>>>>>>>>> I have been involved in the development and code review on the
>>>>>>> feature
>>>>>>>>> branch. It's a great feature and I think it's ready to merge it
>>>> into
>>>>>>>> trunk.
>>>>>>>>> 
>>>>>>>>> Thanks all for the contribution.
>>>>>>>>> 
>>>>>>>>> Regards,
>>>>>>>>> Yi Liu
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: Liu, Yi A
>>>>>>>>> Sent: Friday, September 25, 2015 1:51 PM
>>>>>>>>> To: hdfs-dev@hadoop.apache.org
>>>>>>>>> Subject: RE: [VOTE] Merge HDFS-7285 (erasure coding) branch to
>>>> trunk
>>>>>>>>> 
>>>>>>>>> +1 (non-binding)
>>>>>>>>> I have been involved in the development and code review on the
>>>>>>> feature
>>>>>>>>> branch. It's a great feature and I think it's ready to merge it
>>>> into
>>>>>>>> trunk.
>>>>>>>>> 
>>>>>>>>> Thanks all for the contribution.
>>>>>>>>> 
>>>>>>>>> Regards,
>>>>>>>>> Yi Liu
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: Vinayakumar B [mailto:vinayakumarb@apache.org]
>>>>>>>>> Sent: Friday, September 25, 2015 12:21 PM
>>>>>>>>> To: hdfs-dev@hadoop.apache.org
>>>>>>>>> Subject: Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to
>>>> trunk
>>>>>>>>> 
>>>>>>>>> +1,
>>>>>>>>> 
>>>>>>>>> I've been involved starting from design and development of
>>>>>>> ErasureCoding.
>>>>>>>>> I think phase 1 of this development is ready to be merged to
>>>> trunk.
>>>>>>>>> It had come a long way to the current state with significant
>>>> effort
>>>>>>> of
>>>>>>>>> many Contributors and Reviewers for both design and code.
>>>>>>>>> 
>>>>>>>>> Thanks Everyone for the efforts.
>>>>>>>>> 
>>>>>>>>> Regards,
>>>>>>>>> Vinay
>>>>>>>>> 
>>>>>>>>> On Wed, Sep 23, 2015 at 10:53 PM, Jing Zhao <ji...@apache.org>
>>>>>>> wrote:
>>>>>>>>> 
>>>>>>>>>> +1
>>>>>>>>>> 
>>>>>>>>>> I've been involved in both development and review on the
>>> branch,
>>>>>>> and
>>>>>>> I
>>>>>>>>>> believe it's now ready to get merged into trunk. Many thanks
>>> to
>>>>>>> all
>>>>>>>>>> the contributors and reviewers!
>>>>>>>>>> 
>>>>>>>>>> Thanks,
>>>>>>>>>> -Jing
>>>>>>>>>> 
>>>>>>>>>> On Tue, Sep 22, 2015 at 6:17 PM, Zheng, Kai <
>>>> kai.zheng@intel.com>
>>>>>>>> wrote:
>>>>>>>>>> 
>>>>>>>>>>> Non-binding +1
>>>>>>>>>>> 
>>>>>>>>>>> According to our extensive performance tests, striping +
>>> ISA-L
>>>>>>> coder
>>>>>>>>>> based
>>>>>>>>>>> erasure coding not only can save storage, but also can
>>>> increase
>>>>>>> the
>>>>>>>>>>> throughput of a client or a cluster. It will be a great
>>>>>>> addition to
>>>>>>>>>>> HDFS and its users. Based on the latest branch codes, we
>>> also
>>>>>>>>>>> observed it's
>>>>>>>>>> very
>>>>>>>>>>> reliable in the concurrent tests. We'll provide the perf
>>> test
>>>>>>> report
>>>>>>>>>> after
>>>>>>>>>>> it's sorted out and hope it helps.
>>>>>>>>>>> Thanks!
>>>>>>>>>>> 
>>>>>>>>>>> Regards,
>>>>>>>>>>> Kai
>>>>>>>>>>> 
>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>> From: Gangumalla, Uma [mailto:uma.gangumalla@intel.com]
>>>>>>>>>>> Sent: Wednesday, September 23, 2015 8:50 AM
>>>>>>>>>>> To: hdfs-dev@hadoop.apache.org;
>>> common-dev@hadoop.apache.org
>>>>>>>>>>> Subject: Re: [VOTE] Merge HDFS-7285 (erasure coding) branch
>>> to
>>>>>>> trunk
>>>>>>>>>>> 
>>>>>>>>>>> +1
>>>>>>>>>>> 
>>>>>>>>>>> Great addition to HDFS. Thanks all contributors for the nice
>>>>>>> work.
>>>>>>>>>>> 
>>>>>>>>>>> Regards,
>>>>>>>>>>> Uma
>>>>>>>>>>> 
>>>>>>>>>>> On 9/22/15, 3:40 PM, "Zhe Zhang" <zh...@cloudera.com>
>>>> wrote:
>>>>>>>>>>> 
>>>>>>>>>>>> Hi,
>>>>>>>>>>>> 
>>>>>>>>>>>> I'd like to propose a vote to merge the HDFS-7285 feature
>>>>>>> branch
>>>>>>>>>>>> back to trunk. Since November 2014 we have been designing
>>> and
>>>>>>>>>>>> developing this feature under the umbrella JIRAs HDFS-7285
>>>> and
>>>>>>>>>>>> HADOOP-11264, and have committed approximately 210 patches.
>>>>>>>>>>>> 
>>>>>>>>>>>> The HDFS-7285 feature branch was created to support the
>>> first
>>>>>>> phase
>>>>>>>>>>>> of HDFS erasure coding (HDFS-EC). The objective of HDFS-EC
>>> is
>>>>>>> to
>>>>>>>>>>>> significantly reduce storage space usage in HDFS clusters.
>>>>>>> Instead
>>>>>>>>>>>> of always creating 3 replicas of each block with 200%
>>> storage
>>>>>>> space
>>>>>>>>>>>> overhead, HDFS-EC provides data durability through parity
>>>> data
>>>>>>>> blocks.
>>>>>>>>>>>> With most EC configurations, the storage overhead is no
>>> more
>>>>>>> than
>>>>>>>> 50%.
>>>>>>>>>>>> Based on profiling results of production clusters, we
>>> decided
>>>>>>> to
>>>>>>>>>>>> support EC with the striped block layout in the first
>>> phase,
>>>> so
>>>>>>>>>>>> that small files can be better handled. This means dividing
>>>>>>> each
>>>>>>>>>>>> logical HDFS file block into smaller units (striping cells)
>>>> and
>>>>>>>>>>>> spreading them on a set of DataNodes in round-robin
>>> fashion.
>>>>>>> Parity
>>>>>>>>>>>> cells are generated for each stripe of original data cells.
>>>> We
>>>>>>> have
>>>>>>>>>>>> made changes to NameNode, client, and DataNode to
>>> generalize
>>>>>>> the
>>>>>>>>>>>> block concept and handle the mapping between a logical file
>>>>>>> block
>>>>>>>>>>>> and its internal storage blocks. For further details please
>>>> see
>>>>>>> the
>>>>>>>>>>>> design doc on HDFS-7285.
>>>>>>>>>>>> HADOOP-11264 focuses on providing flexible and
>>>> high-performance
>>>>>>>>>>>> codec calculation support.
>>>>>>>>>>>> 
>>>>>>>>>>>> The nightly Jenkins job of the branch has reported several
>>>>>>>>>>>> successful runs, and doesn't show new flaky tests compared
>>>> with
>>>>>>>>>>>> trunk. We have posted several versions of the test plan
>>>>>>> including
>>>>>>>>>>>> both unit testing and cluster testing, and have executed
>>> most
>>>>>>> tests
>>>>>>>>>>>> in the plan. The most basic functionalities have been
>>>>>>> extensively
>>>>>>>>>>>> tested and verified in several real clusters with different
>>>>>>>>>>>> hardware configurations; results have been very stable. We
>>>> have
>>>>>>>>>>>> created follow-on tasks for more advanced error handling
>>> and
>>>>>>>>> optimization under the umbrella HDFS-8031.
>>>>>>>>>>>> We also plan to implement or harden the integration of EC
>>>> with
>>>>>>>>>>>> existing features such as WebHDFS, snapshot, append,
>>>> truncate,
>>>>>>>>>>>> hflush, hsync, and so forth.
>>>>>>>>>>>> 
>>>>>>>>>>>> Development of this feature has been a collaboration across
>>>>>>> many
>>>>>>>>>>>> companies and institutions. I'd like to thank J. Andreina,
>>>>>>> Takanobu
>>>>>>>>>>>> Asanuma, Vinayakumar B, Li Bo, Takuya Fukudome, Uma
>>> Maheswara
>>>>>>> Rao
>>>>>>>>>>>> G, Rui Li, Yi Liu, Colin McCabe, Xinwei Qin, Rakesh R, Gao
>>>> Rui,
>>>>>>> Kai
>>>>>>>>>>>> Sasaki, Walter Su, Tsz Wo Nicholas Sze, Andrew Wang, Yong
>>>>>>> Zhang,
>>>>>>>>>>>> Jing Zhao, Hui Zheng and Kai Zheng for their code
>>>> contributions
>>>>>>> and
>>>>>>>>> reviews.
>>>>>>>>>>>> Andrew and Kai Zheng also made fundamental contributions to
>>>> the
>>>>>>>>>>>> initial design. Rui Li, Gao Rui, Kai Sasaki, Kai Zheng and
>>>> many
>>>>>>>>>>>> other contributors have made great efforts in system
>>> testing.
>>>>>>> Many
>>>>>>>>>>>> thanks go to Weihua Jiang for proposing the JIRA, and ATM,
>>>> Todd
>>>>>>>>>>>> Lipcon, Silvius Rus, Suresh, as well as many others for
>>>>>>> providing
>>>>>>>>> helpful feedbacks.
>>>>>>>>>>>> 
>>>>>>>>>>>> Following the community convention, this vote will last
>>> for 7
>>>>>>> days
>>>>>>>>>>>> (ending September 29th). Votes from Hadoop committers are
>>>>>>> binding
>>>>>>>>>>>> but non-binding votes are very welcome as well. And here's
>>> my
>>>>>>>>>>>> non-binding
>>>>>>>>>> +1.
>>>>>>>>>>>> 
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> ---
>>>>>>>>>>>> Zhe Zhang


Re: Erasure coding in branch-2 [Was Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk]

Posted by Zhe Zhang <zh...@cloudera.com>.
Thanks Vinod for the proposal and Andrew/Jing for the comments. 2.9 sounds
like a good plan. As Andrew pointed out, 2.8 is already quite big. And even
when disabled, EC logic has been baked into the NN pretty deeply.

Do we have a tentative date or estimate for 2.9?

---
Zhe Zhang

On Mon, Nov 2, 2015 at 1:22 PM, Jing Zhao <ji...@apache.org> wrote:

> Thanks for the discussion, Vinod and Andrew. Backporting EC to 2.9 sounds
> good to me.
>
> On Mon, Nov 2, 2015 at 12:06 PM, Andrew Wang <an...@cloudera.com>
> wrote:
>
> > Thanks for forking the thread Vinod,
> >
> > SGTM, though I really do recommend waiting for 2.9 given the current size
> > of 2.8. I'm not a fan of an "off by default" half-measure, since it doesn't
> > change our compatibility requirements, and there's some major NN surgery
> > that can't really be disabled.
> >
> > If we do find a major user who's backported this to their own branch-2
> > fork, I agree that's motivation to get it in an upstream release quicker. I
> > haven't heard anything along these lines though.
> >
> > On Mon, Nov 2, 2015 at 11:49 AM, Vinod Vavilapalli <vinodkv@hortonworks.com>
> > wrote:
> >
> > > Forking the thread. Started looking at the 2.8 list, various features’
> > > status and arrived here.
> > >
> > > While I understand the pervasive nature of EC and a need for a significant
> > > bake-in, moving this to a 3.x release is not a good idea. We will surely
> > > get a 2.8 out this year and, as needed, I can even spend time getting
> > > started on a 2.9. OTOH, 3.x is a long way off, and given all the
> > > incompatibilities there, it would be a while before users can get their
> > > hands on EC if it were to be only on 3.x. At best, this may force sites
> > > that want EC to backport the entire EC feature to older releases; at worst
> > > this will repeat the mess of the 0.20 security release forks.
> > >
> > > If we think adding this to 2.8 (even if it is switched off) is too much risk
> > > per our original plan, let’s move this to 2.9, thereby leaving enough time
> > > for stability, integration testing and bake-in, and a realistic chance of
> > > having it end up on users’ clusters soonish.
> > >
> > > +Vinod
> > >
> > > > On Oct 19, 2015, at 1:44 PM, Andrew Wang <an...@cloudera.com> wrote:
> > > >
> > > > I think our plan thus far has been to target this for 3.0. I'm okay with
> > > > putting it in branch-2 if we've given a hard look at compatibility, but
> > > > I'll note though that 2.8 is already looking like quite a large release,
> > > > and our release bandwidth has been focused on the 2.6 and 2.7 maintenance
> > > > releases. Adding another multi-hundred JIRAs to 2.8 might make it too
> > > > unwieldy to get out the door. If we bump EC past that, 3.0 might very well
> > > > be our next release vehicle. I do plan to revive the 3.0 schedule some time
> > > > next year. With EC and JDK8 in a good spot, the only big feature remaining
> > > > is classpath isolation.
> > > >
> > > > EC is also a pretty fundamental change to HDFS. Even if it's compatible, in
> > > > terms of size and impact it might best belong in a new major release.
> > > >
> > > > Best,
> > > > Andrew
> > > >
> > > > On Fri, Oct 16, 2015 at 7:04 PM, Vinayakumar B <
> > > > vinayakumarb.apache@gmail.com> wrote:
> > > >
> > > >> Does anyone else also think that the feature is ready to go to branch-2
> > > >> as well?
> > > >>
> > > >> It's been > 2 weeks since EC landed on trunk. IMO it looks quite stable
> > > >> since then and ready to go into branch-2.
> > > >>
> > > >> -Vinay
> > > >> On Oct 6, 2015 12:51 AM, "Zhe Zhang" <zh...@cloudera.com> wrote:
> > > >>
> > > >>> Thanks Vinay for capturing the issue and Uma for offering the help.
> > > >>>
> > > >>> ---
> > > >>> Zhe Zhang
> > > >>>
> > > >>> On Mon, Oct 5, 2015 at 12:19 PM, Gangumalla, Uma <
> > > >> uma.gangumalla@intel.com
> > > >>>>
> > > >>> wrote:
> > > >>>
> > > >>>> Vinay,
> > > >>>>
> > > >>>>
> > > >>>> I would merge them as part of HDFS-9182.
> > > >>>>
> > > >>>> Thanks,
> > > >>>> Uma
> > > >>>>
> > > >>>>
> > > >>>>
> > > >>>> On 10/5/15, 12:48 AM, "Vinayakumar B" <vi...@apache.org>
> > > wrote:
> > > >>>>
> > > >>>>> Hi Andrew,
> > > >>>>> I see CHANGES.txt entries not yet merged from
> > > >> CHANGES-HDFS-EC-7285.txt.
> > > >>>>>
> > > >>>>> Was this intentional?
> > > >>>>>
> > > >>>>> Regards,
> > > >>>>> Vinay
> > > >>>>>
> > > >>>>> On Wed, Sep 30, 2015 at 9:15 PM, Andrew Wang <
> > > >> andrew.wang@cloudera.com>
> > > >>>>> wrote:
> > > >>>>>
> > > >>>>>> Branch has been merged to trunk, thanks again to everyone who
> > worked
> > > >>> on
> > > >>>>>> the
> > > >>>>>> feature!
> > > >>>>>>
> > > >>>>>> On Tue, Sep 29, 2015 at 10:44 PM, Zhe Zhang <
> > zhezhang@cloudera.com>
> > > >>>>>> wrote:
> > > >>>>>>
> > > >>>>>>> Thanks everyone who has participated in this discussion.
> > > >>>>>>>
> > > >>>>>>> With 7 +1's (5 binding and 2 non-binding), and no -1, this vote
> > > >> has
> > > >>>>>> passed.
> > > >>>>>>> I will do a final 'git merge' with trunk and work with Andrew
> to
> > > >>> merge
> > > >>>>>> the
> > > >>>>>>> branch to trunk. I'll update on this thread when the merge is
> > > >> done.
> > > >>>>>>>
> > > >>>>>>> ---
> > > >>>>>>> Zhe Zhang
> > > >>>>>>>
> > > >>>>>>> On Thu, Sep 24, 2015 at 11:08 PM, Liu, Yi A <
> yi.a.liu@intel.com>
> > > >>>>>> wrote:
> > > >>>>>>>
> > > >>>>>>>> (Change it to binding.)
> > > >>>>>>>>
> > > >>>>>>>> +1
> > > >>>>>>>> I have been involved in the development and code review on the
> > > >>>>>> feature
> > > >>>>>>>> branch. It's a great feature and I think it's ready to merge
> it
> > > >>> into
> > > >>>>>>> trunk.
> > > >>>>>>>>
> > > >>>>>>>> Thanks all for the contribution.
> > > >>>>>>>>
> > > >>>>>>>> Regards,
> > > >>>>>>>> Yi Liu
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> -----Original Message-----
> > > >>>>>>>> From: Liu, Yi A
> > > >>>>>>>> Sent: Friday, September 25, 2015 1:51 PM
> > > >>>>>>>> To: hdfs-dev@hadoop.apache.org
> > > >>>>>>>> Subject: RE: [VOTE] Merge HDFS-7285 (erasure coding) branch to
> > > >>> trunk
> > > >>>>>>>>
> > > >>>>>>>> +1 (non-binding)
> > > >>>>>>>> I have been involved in the development and code review on the
> > > >>>>>> feature
> > > >>>>>>>> branch. It's a great feature and I think it's ready to merge
> it
> > > >>> into
> > > >>>>>>> trunk.
> > > >>>>>>>>
> > > >>>>>>>> Thanks all for the contribution.
> > > >>>>>>>>
> > > >>>>>>>> Regards,
> > > >>>>>>>> Yi Liu
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> -----Original Message-----
> > > >>>>>>>> From: Vinayakumar B [mailto:vinayakumarb@apache.org]
> > > >>>>>>>> Sent: Friday, September 25, 2015 12:21 PM
> > > >>>>>>>> To: hdfs-dev@hadoop.apache.org
> > > >>>>>>>> Subject: Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to
> > > >>> trunk
> > > >>>>>>>>
> > > >>>>>>>> +1,
> > > >>>>>>>>
> > > >>>>>>>> I've been involved starting from design and development of
> > > >>>>>> ErasureCoding.
> > > >>>>>>>> I think phase 1 of this development is ready to be merged to
> > > >>> trunk.
> > > >>>>>>>> It had come a long way to the current state with significant
> > > >>> effort
> > > >>>>>> of
> > > >>>>>>>> many Contributors and Reviewers for both design and code.
> > > >>>>>>>>
> > > >>>>>>>> Thanks Everyone for the efforts.
> > > >>>>>>>>
> > > >>>>>>>> Regards,
> > > >>>>>>>> Vinay
> > > >>>>>>>>
> > > >>>>>>>> On Wed, Sep 23, 2015 at 10:53 PM, Jing Zhao <jing9@apache.org
> >
> > > >>>>>> wrote:
> > > >>>>>>>>
> > > >>>>>>>>> +1
> > > >>>>>>>>>
> > > >>>>>>>>> I've been involved in both development and review on the
> > > >> branch,
> > > >>>>>> and
> > > >>>>>> I
> > > >>>>>>>>> believe it's now ready to get merged into trunk. Many thanks
> > > >> to
> > > >>>>>> all
> > > >>>>>>>>> the contributors and reviewers!
> > > >>>>>>>>>
> > > >>>>>>>>> Thanks,
> > > >>>>>>>>> -Jing
> > > >>>>>>>>>
> > > >>>>>>>>> On Tue, Sep 22, 2015 at 6:17 PM, Zheng, Kai <
> > > >>> kai.zheng@intel.com>
> > > >>>>>>> wrote:
> > > >>>>>>>>>
> > > >>>>>>>>>> Non-binding +1
> > > >>>>>>>>>>
> > > >>>>>>>>>> According to our extensive performance tests, striping +
> > > >> ISA-L
> > > >>>>>> coder
> > > >>>>>>>>> based
> > > >>>>>>>>>> erasure coding not only can save storage, but also can
> > > >>> increase
> > > >>>>>> the
> > > >>>>>>>>>> throughput of a client or a cluster. It will be a great
> > > >>>>>> addition to
> > > >>>>>>>>>> HDFS and its users. Based on the latest branch codes, we
> > > >> also
> > > >>>>>>>>>> observed it's
> > > >>>>>>>>> very
> > > >>>>>>>>>> reliable in the concurrent tests. We'll provide the perf
> > > >> test
> > > >>>>>> report
> > > >>>>>>>>> after
> > > >>>>>>>>>> it's sorted out and hope it helps.
> > > >>>>>>>>>> Thanks!
> > > >>>>>>>>>>
> > > >>>>>>>>>> Regards,
> > > >>>>>>>>>> Kai
> > > >>>>>>>>>>
> > > >>>>>>>>>> -----Original Message-----
> > > >>>>>>>>>> From: Gangumalla, Uma [mailto:uma.gangumalla@intel.com]
> > > >>>>>>>>>> Sent: Wednesday, September 23, 2015 8:50 AM
> > > >>>>>>>>>> To: hdfs-dev@hadoop.apache.org;
> > > >> common-dev@hadoop.apache.org
> > > >>>>>>>>>> Subject: Re: [VOTE] Merge HDFS-7285 (erasure coding) branch
> > > >> to
> > > >>>>>> trunk
> > > >>>>>>>>>>
> > > >>>>>>>>>> +1
> > > >>>>>>>>>>
> > > >>>>>>>>>> Great addition to HDFS. Thanks all contributors for the nice
> > > >>>>>> work.
> > > >>>>>>>>>>
> > > >>>>>>>>>> Regards,
> > > >>>>>>>>>> Uma
> > > >>>>>>>>>>
> > > >>>>>>>>>> On 9/22/15, 3:40 PM, "Zhe Zhang" <zh...@cloudera.com>
> > > >>> wrote:
> > > >>>>>>>>>>
> > > >>>>>>>>>>> Hi,
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> I'd like to propose a vote to merge the HDFS-7285 feature
> > > >>>>>> branch
> > > >>>>>>>>>>> back to trunk. Since November 2014 we have been designing
> > > >> and
> > > >>>>>>>>>>> developing this feature under the umbrella JIRAs HDFS-7285
> > > >>> and
> > > >>>>>>>>>>> HADOOP-11264, and have committed approximately 210 patches.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> The HDFS-7285 feature branch was created to support the
> > > >> first
> > > >>>>>> phase
> > > >>>>>>>>>>> of HDFS erasure coding (HDFS-EC). The objective of HDFS-EC
> > > >> is
> > > >>>>>> to
> > > >>>>>>>>>>> significantly reduce storage space usage in HDFS clusters.
> > > >>>>>> Instead
> > > >>>>>>>>>>> of always creating 3 replicas of each block with 200%
> > > >> storage
> > > >>>>>> space
> > > >>>>>>>>>>> overhead, HDFS-EC provides data durability through parity
> > > >>> data
> > > >>>>>>> blocks.
> > > >>>>>>>>>>> With most EC configurations, the storage overhead is no
> > > >> more
> > > >>>>>> than
> > > >>>>>>> 50%.
> > > >>>>>>>>>>> Based on profiling results of production clusters, we
> > > >> decided
> > > >>>>>> to
> > > >>>>>>>>>>> support EC with the striped block layout in the first
> > > >> phase,
> > > >>> so
> > > >>>>>>>>>>> that small files can be better handled. This means dividing
> > > >>>>>> each
> > > >>>>>>>>>>> logical HDFS file block into smaller units (striping cells)
> > > >>> and
> > > >>>>>>>>>>> spreading them on a set of DataNodes in round-robin
> > > >> fashion.
> > > >>>>>> Parity
> > > >>>>>>>>>>> cells are generated for each stripe of original data cells.
> > > >>> We
> > > >>>>>> have
> > > >>>>>>>>>>> made changes to NameNode, client, and DataNode to
> > > >> generalize
> > > >>>>>> the
> > > >>>>>>>>>>> block concept and handle the mapping between a logical file
> > > >>>>>> block
> > > >>>>>>>>>>> and its internal storage blocks. For further details please
> > > >>> see
> > > >>>>>> the
> > > >>>>>>>>>>> design doc on HDFS-7285.
> > > >>>>>>>>>>> HADOOP-11264 focuses on providing flexible and
> > > >>> high-performance
> > > >>>>>>>>>>> codec calculation support.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> The nightly Jenkins job of the branch has reported several
> > > >>>>>>>>>>> successful runs, and doesn't show new flaky tests compared
> > > >>> with
> > > >>>>>>>>>>> trunk. We have posted several versions of the test plan
> > > >>>>>> including
> > > >>>>>>>>>>> both unit testing and cluster testing, and have executed
> > > >> most
> > > >>>>>> tests
> > > >>>>>>>>>>> in the plan. The most basic functionalities have been
> > > >>>>>> extensively
> > > >>>>>>>>>>> tested and verified in several real clusters with different
> > > >>>>>>>>>>> hardware configurations; results have been very stable. We
> > > >>> have
> > > >>>>>>>>>>> created follow-on tasks for more advanced error handling
> > > >> and
> > > >>>>>>>> optimization under the umbrella HDFS-8031.
> > > >>>>>>>>>>> We also plan to implement or harden the integration of EC
> > > >>> with
> > > >>>>>>>>>>> existing features such as WebHDFS, snapshot, append,
> > > >>> truncate,
> > > >>>>>>>>>>> hflush, hsync, and so forth.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> Development of this feature has been a collaboration across
> > > >>>>>> many
> > > >>>>>>>>>>> companies and institutions. I'd like to thank J. Andreina,
> > > >>>>>> Takanobu
> > > >>>>>>>>>>> Asanuma, Vinayakumar B, Li Bo, Takuya Fukudome, Uma
> > > >> Maheswara
> > > >>>>>> Rao
> > > >>>>>>>>>>> G, Rui Li, Yi Liu, Colin McCabe, Xinwei Qin, Rakesh R, Gao
> > > >>> Rui,
> > > >>>>>> Kai
> > > >>>>>>>>>>> Sasaki, Walter Su, Tsz Wo Nicholas Sze, Andrew Wang, Yong
> > > >>>>>> Zhang,
> > > >>>>>>>>>>> Jing Zhao, Hui Zheng and Kai Zheng for their code
> > > >>> contributions
> > > >>>>>> and
> > > >>>>>>>> reviews.
> > > >>>>>>>>>>> Andrew and Kai Zheng also made fundamental contributions to
> > > >>> the
> > > >>>>>>>>>>> initial design. Rui Li, Gao Rui, Kai Sasaki, Kai Zheng and
> > > >>> many
> > > >>>>>>>>>>> other contributors have made great efforts in system
> > > >> testing.
> > > >>>>>> Many
> > > >>>>>>>>>>> thanks go to Weihua Jiang for proposing the JIRA, and ATM,
> > > >>> Todd
> > > >>>>>>>>>>> Lipcon, Silvius Rus, Suresh, as well as many others for
> > > >>>>>> providing
> > > >>>>>>>> helpful feedbacks.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> Following the community convention, this vote will last
> > > >> for 7
> > > >>>>>> days
> > > >>>>>>>>>>> (ending September 29th). Votes from Hadoop committers are
> > > >>>>>> binding
> > > >>>>>>>>>>> but non-binding votes are very welcome as well. And here's
> > > >> my
> > > >>>>>>>>>>> non-binding
> > > >>>>>>>>> +1.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> Thanks,
> > > >>>>>>>>>>> ---
> > > >>>>>>>>>>> Zhe Zhang
> > > >>>>>>>>>>
> > > >>>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>
> > > >>>>>>
> > > >>>>
> > > >>>>
> > > >>>
> > > >>
> > >
> > >
> >
>

Re: Erasure coding in branch-2 [Was Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk]

Posted by Jing Zhao <ji...@apache.org>.
Thanks for the discussion, Vinod and Andrew. Backporting EC to 2.9 sounds
good to me.

On Mon, Nov 2, 2015 at 12:06 PM, Andrew Wang <an...@cloudera.com>
wrote:

> Thanks for forking the thread Vinod,
>
> SGTM, though I really do recommend waiting for 2.9 given the current size
> of 2.8. I'm not a fan of an "off by default" half-measure, since it doesn't
> change our compatibility requirements, and there's some major NN surgery
> that can't really be disabled.
>
> If we do find a major user who's backported this to their own branch-2
> fork, I agree that's motivation to get it in an upstream release quicker. I
> haven't heard anything along these lines though.
>
> On Mon, Nov 2, 2015 at 11:49 AM, Vinod Vavilapalli <
> vinodkv@hortonworks.com>
> wrote:
>
> > Forking the thread. Started looking at the 2.8 list, various features’
> > status and arrived here.
> >
> > While I understand the pervasive nature of EC and a need for a significant
> > bake-in, moving this to a 3.x release is not a good idea. We will surely
> > get a 2.8 out this year and, as needed, I can even spend time getting
> > started on a 2.9. OTOH, 3.x is a long way off, and given all the
> > incompatibilities there, it would be a while before users can get their
> > hands on EC if it were to be only on 3.x. At best, this may force sites
> > that want EC to backport the entire EC feature to older releases; at worst,
> > this will repeat the mess of 0.20 security release forks.
> >
> > If we think adding this to 2.8 (even if it is switched off) is too much risk
> > per our original plan, let’s move this to 2.9, thereby leaving enough time
> > for stability, integration testing and bake-in, and a realistic chance of
> > having it end up on users’ clusters soonish.
> >
> > +Vinod
> >
> > > On Oct 19, 2015, at 1:44 PM, Andrew Wang <an...@cloudera.com> wrote:
> > >
> > > I think our plan thus far has been to target this for 3.0. I'm okay with
> > > putting it in branch-2 if we've given a hard look at compatibility, but
> > > I'll note though that 2.8 is already looking like quite a large release,
> > > and our release bandwidth has been focused on the 2.6 and 2.7 maintenance
> > > releases. Adding another multi-hundred JIRAs to 2.8 might make it too
> > > unwieldy to get out the door. If we bump EC past that, 3.0 might very well
> > > be our next release vehicle. I do plan to revive the 3.0 schedule some time
> > > next year. With EC and JDK8 in a good spot, the only big feature remaining
> > > is classpath isolation.
> > >
> > > EC is also a pretty fundamental change to HDFS. Even if it's compatible, in
> > > terms of size and impact it might best belong in a new major release.
> > >
> > > Best,
> > > Andrew
> > >
> > > On Fri, Oct 16, 2015 at 7:04 PM, Vinayakumar B <
> > > vinayakumarb.apache@gmail.com> wrote:
> > >
> > >> Does anyone else also think that the feature is ready to go to branch-2
> > >> as well?
> > >>
> > >> It's been > 2 weeks since EC landed on trunk. IMO it looks quite stable
> > >> since then and ready to go in branch-2.
> > >>
> > >> -Vinay
> > >> On Oct 6, 2015 12:51 AM, "Zhe Zhang" <zh...@cloudera.com> wrote:
> > >>
> > >>> Thanks Vinay for capturing the issue and Uma for offering the help.
> > >>>
> > >>> ---
> > >>> Zhe Zhang
> > >>>
> > >>> On Mon, Oct 5, 2015 at 12:19 PM, Gangumalla, Uma <
> > >> uma.gangumalla@intel.com
> > >>>>
> > >>> wrote:
> > >>>
> > >>>> Vinay,
> > >>>>
> > >>>>
> > >>>> I would merge them as part of HDFS-9182.
> > >>>>
> > >>>> Thanks,
> > >>>> Uma
> > >>>>
> > >>>>
> > >>>>
> > >>>> On 10/5/15, 12:48 AM, "Vinayakumar B" <vi...@apache.org>
> > wrote:
> > >>>>
> > >>>>> Hi Andrew,
> > >>>>> I see CHANGES.txt entries not yet merged from
> > >> CHANGES-HDFS-EC-7285.txt.
> > >>>>>
> > >>>>> Was this intentional?
> > >>>>>
> > >>>>> Regards,
> > >>>>> Vinay
> > >>>>>
> > >>>>> On Wed, Sep 30, 2015 at 9:15 PM, Andrew Wang <
> > >> andrew.wang@cloudera.com>
> > >>>>> wrote:
> > >>>>>
> > >>>>>> Branch has been merged to trunk, thanks again to everyone who
> worked
> > >>> on
> > >>>>>> the
> > >>>>>> feature!
> > >>>>>>
> > >>>>>> On Tue, Sep 29, 2015 at 10:44 PM, Zhe Zhang <
> zhezhang@cloudera.com>
> > >>>>>> wrote:
> > >>>>>>
> > >>>>>>> Thanks everyone who has participated in this discussion.
> > >>>>>>>
> > >>>>>>> With 7 +1's (5 binding and 2 non-binding), and no -1, this vote
> > >> has
> > >>>>>> passed.
> > >>>>>>> I will do a final 'git merge' with trunk and work with Andrew to
> > >>> merge
> > >>>>>> the
> > >>>>>>> branch to trunk. I'll update on this thread when the merge is
> > >> done.
> > >>>>>>>
> > >>>>>>> ---
> > >>>>>>> Zhe Zhang
> > >>>>>>>
> > >>>>>>> On Thu, Sep 24, 2015 at 11:08 PM, Liu, Yi A <yi...@intel.com>
> > >>>>>> wrote:
> > >>>>>>>
> > >>>>>>>> (Change it to binding.)
> > >>>>>>>>
> > >>>>>>>> +1
> > >>>>>>>> I have been involved in the development and code review on the
> > >>>>>> feature
> > >>>>>>>> branch. It's a great feature and I think it's ready to merge it
> > >>> into
> > >>>>>>> trunk.
> > >>>>>>>>
> > >>>>>>>> Thanks all for the contribution.
> > >>>>>>>>
> > >>>>>>>> Regards,
> > >>>>>>>> Yi Liu
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> -----Original Message-----
> > >>>>>>>> From: Liu, Yi A
> > >>>>>>>> Sent: Friday, September 25, 2015 1:51 PM
> > >>>>>>>> To: hdfs-dev@hadoop.apache.org
> > >>>>>>>> Subject: RE: [VOTE] Merge HDFS-7285 (erasure coding) branch to
> > >>> trunk
> > >>>>>>>>
> > >>>>>>>> +1 (non-binding)
> > >>>>>>>> I have been involved in the development and code review on the
> > >>>>>> feature
> > >>>>>>>> branch. It's a great feature and I think it's ready to merge it
> > >>> into
> > >>>>>>> trunk.
> > >>>>>>>>
> > >>>>>>>> Thanks all for the contribution.
> > >>>>>>>>
> > >>>>>>>> Regards,
> > >>>>>>>> Yi Liu
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> -----Original Message-----
> > >>>>>>>> From: Vinayakumar B [mailto:vinayakumarb@apache.org]
> > >>>>>>>> Sent: Friday, September 25, 2015 12:21 PM
> > >>>>>>>> To: hdfs-dev@hadoop.apache.org
> > >>>>>>>> Subject: Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to
> > >>> trunk
> > >>>>>>>>
> > >>>>>>>> +1,
> > >>>>>>>>
> > >>>>>>>> I've been involved starting from design and development of
> > >>>>>> ErasureCoding.
> > >>>>>>>> I think phase 1 of this development is ready to be merged to
> > >>> trunk.
> > >>>>>>>> It had come a long way to the current state with significant
> > >>> effort
> > >>>>>> of
> > >>>>>>>> many Contributors and Reviewers for both design and code.
> > >>>>>>>>
> > >>>>>>>> Thanks Everyone for the efforts.
> > >>>>>>>>
> > >>>>>>>> Regards,
> > >>>>>>>> Vinay
> > >>>>>>>>
> > >>>>>>>> On Wed, Sep 23, 2015 at 10:53 PM, Jing Zhao <ji...@apache.org>
> > >>>>>> wrote:
> > >>>>>>>>
> > >>>>>>>>> +1
> > >>>>>>>>>
> > >>>>>>>>> I've been involved in both development and review on the
> > >> branch,
> > >>>>>> and
> > >>>>>> I
> > >>>>>>>>> believe it's now ready to get merged into trunk. Many thanks
> > >> to
> > >>>>>> all
> > >>>>>>>>> the contributors and reviewers!
> > >>>>>>>>>
> > >>>>>>>>> Thanks,
> > >>>>>>>>> -Jing
> > >>>>>>>>>
> > >>>>>>>>> On Tue, Sep 22, 2015 at 6:17 PM, Zheng, Kai <
> > >>> kai.zheng@intel.com>
> > >>>>>>> wrote:
> > >>>>>>>>>
> > >>>>>>>>>> Non-binding +1
> > >>>>>>>>>>
> > >>>>>>>>>> According to our extensive performance tests, striping +
> > >> ISA-L
> > >>>>>> coder
> > >>>>>>>>> based
> > >>>>>>>>>> erasure coding not only can save storage, but also can
> > >>> increase
> > >>>>>> the
> > >>>>>>>>>> throughput of a client or a cluster. It will be a great
> > >>>>>> addition to
> > >>>>>>>>>> HDFS and its users. Based on the latest branch codes, we
> > >> also
> > >>>>>>>>>> observed it's
> > >>>>>>>>> very
> > >>>>>>>>>> reliable in the concurrent tests. We'll provide the perf
> > >> test
> > >>>>>> report
> > >>>>>>>>> after
> > >>>>>>>>>> it's sorted out and hope it helps.
> > >>>>>>>>>> Thanks!
> > >>>>>>>>>>
> > >>>>>>>>>> Regards,
> > >>>>>>>>>> Kai
> > >>>>>>>>>>
> > >>>>>>>>>> -----Original Message-----
> > >>>>>>>>>> From: Gangumalla, Uma [mailto:uma.gangumalla@intel.com]
> > >>>>>>>>>> Sent: Wednesday, September 23, 2015 8:50 AM
> > >>>>>>>>>> To: hdfs-dev@hadoop.apache.org;
> > >> common-dev@hadoop.apache.org
> > >>>>>>>>>> Subject: Re: [VOTE] Merge HDFS-7285 (erasure coding) branch
> > >> to
> > >>>>>> trunk
> > >>>>>>>>>>
> > >>>>>>>>>> +1
> > >>>>>>>>>>
> > >>>>>>>>>> Great addition to HDFS. Thanks all contributors for the nice
> > >>>>>> work.
> > >>>>>>>>>>
> > >>>>>>>>>> Regards,
> > >>>>>>>>>> Uma
> > >>>>>>>>>>
> > >>>>>>>>>> On 9/22/15, 3:40 PM, "Zhe Zhang" <zh...@cloudera.com>
> > >>> wrote:
> > >>>>>>>>>>
> > >>>>>>>>>>> Hi,
> > >>>>>>>>>>>
> > >>>>>>>>>>> I'd like to propose a vote to merge the HDFS-7285 feature
> > >>>>>> branch
> > >>>>>>>>>>> back to trunk. Since November 2014 we have been designing
> > >> and
> > >>>>>>>>>>> developing this feature under the umbrella JIRAs HDFS-7285
> > >>> and
> > >>>>>>>>>>> HADOOP-11264, and have committed approximately 210 patches.
> > >>>>>>>>>>>
> > >>>>>>>>>>> The HDFS-7285 feature branch was created to support the
> > >> first
> > >>>>>> phase
> > >>>>>>>>>>> of HDFS erasure coding (HDFS-EC). The objective of HDFS-EC
> > >> is
> > >>>>>> to
> > >>>>>>>>>>> significantly reduce storage space usage in HDFS clusters.
> > >>>>>> Instead
> > >>>>>>>>>>> of always creating 3 replicas of each block with 200%
> > >> storage
> > >>>>>> space
> > >>>>>>>>>>> overhead, HDFS-EC provides data durability through parity
> > >>> data
> > >>>>>>> blocks.
> > >>>>>>>>>>> With most EC configurations, the storage overhead is no
> > >> more
> > >>>>>> than
> > >>>>>>> 50%.
> > >>>>>>>>>>> Based on profiling results of production clusters, we
> > >> decided
> > >>>>>> to
> > >>>>>>>>>>> support EC with the striped block layout in the first
> > >> phase,
> > >>> so
> > >>>>>>>>>>> that small files can be better handled. This means dividing
> > >>>>>> each
> > >>>>>>>>>>> logical HDFS file block into smaller units (striping cells)
> > >>> and
> > >>>>>>>>>>> spreading them on a set of DataNodes in round-robin
> > >> fashion.
> > >>>>>> Parity
> > >>>>>>>>>>> cells are generated for each stripe of original data cells.
> > >>> We
> > >>>>>> have
> > >>>>>>>>>>> made changes to NameNode, client, and DataNode to
> > >> generalize
> > >>>>>> the
> > >>>>>>>>>>> block concept and handle the mapping between a logical file
> > >>>>>> block
> > >>>>>>>>>>> and its internal storage blocks. For further details please
> > >>> see
> > >>>>>> the
> > >>>>>>>>>>> design doc on HDFS-7285.
> > >>>>>>>>>>> HADOOP-11264 focuses on providing flexible and
> > >>> high-performance
> > >>>>>>>>>>> codec calculation support.
> > >>>>>>>>>>>
> > >>>>>>>>>>> The nightly Jenkins job of the branch has reported several
> > >>>>>>>>>>> successful runs, and doesn't show new flaky tests compared
> > >>> with
> > >>>>>>>>>>> trunk. We have posted several versions of the test plan
> > >>>>>> including
> > >>>>>>>>>>> both unit testing and cluster testing, and have executed
> > >> most
> > >>>>>> tests
> > >>>>>>>>>>> in the plan. The most basic functionalities have been
> > >>>>>> extensively
> > >>>>>>>>>>> tested and verified in several real clusters with different
> > >>>>>>>>>>> hardware configurations; results have been very stable. We
> > >>> have
> > >>>>>>>>>>> created follow-on tasks for more advanced error handling
> > >> and
> > >>>>>>>> optimization under the umbrella HDFS-8031.
> > >>>>>>>>>>> We also plan to implement or harden the integration of EC
> > >>> with
> > >>>>>>>>>>> existing features such as WebHDFS, snapshot, append,
> > >>> truncate,
> > >>>>>>>>>>> hflush, hsync, and so forth.
> > >>>>>>>>>>>
> > >>>>>>>>>>> Development of this feature has been a collaboration across
> > >>>>>> many
> > >>>>>>>>>>> companies and institutions. I'd like to thank J. Andreina,
> > >>>>>> Takanobu
> > >>>>>>>>>>> Asanuma, Vinayakumar B, Li Bo, Takuya Fukudome, Uma
> > >> Maheswara
> > >>>>>> Rao
> > >>>>>>>>>>> G, Rui Li, Yi Liu, Colin McCabe, Xinwei Qin, Rakesh R, Gao
> > >>> Rui,
> > >>>>>> Kai
> > >>>>>>>>>>> Sasaki, Walter Su, Tsz Wo Nicholas Sze, Andrew Wang, Yong
> > >>>>>> Zhang,
> > >>>>>>>>>>> Jing Zhao, Hui Zheng and Kai Zheng for their code
> > >>> contributions
> > >>>>>> and
> > >>>>>>>> reviews.
> > >>>>>>>>>>> Andrew and Kai Zheng also made fundamental contributions to
> > >>> the
> > >>>>>>>>>>> initial design. Rui Li, Gao Rui, Kai Sasaki, Kai Zheng and
> > >>> many
> > >>>>>>>>>>> other contributors have made great efforts in system
> > >> testing.
> > >>>>>> Many
> > >>>>>>>>>>> thanks go to Weihua Jiang for proposing the JIRA, and ATM,
> > >>> Todd
> > >>>>>>>>>>> Lipcon, Silvius Rus, Suresh, as well as many others for
> > >>>>>> providing
> > >>>>>>>> helpful feedbacks.
> > >>>>>>>>>>>
> > >>>>>>>>>>> Following the community convention, this vote will last
> > >> for 7
> > >>>>>> days
> > >>>>>>>>>>> (ending September 29th). Votes from Hadoop committers are
> > >>>>>> binding
> > >>>>>>>>>>> but non-binding votes are very welcome as well. And here's
> > >> my
> > >>>>>>>>>>> non-binding
> > >>>>>>>>> +1.
> > >>>>>>>>>>>
> > >>>>>>>>>>> Thanks,
> > >>>>>>>>>>> ---
> > >>>>>>>>>>> Zhe Zhang

Re: Erasure coding in branch-2 [Was Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk]

Posted by Andrew Wang <an...@cloudera.com>.
Thanks for forking the thread Vinod,

SGTM, though I really do recommend waiting for 2.9 given the current size
of 2.8. I'm not a fan of an "off by default" half-measure, since it doesn't
change our compatibility requirements, and there's some major NN surgery
that can't really be disabled.

If we do find a major user who's backported this to their own branch-2
fork, I agree that's motivation to get it in an upstream release quicker. I
haven't heard anything along these lines though.

On Mon, Nov 2, 2015 at 11:49 AM, Vinod Vavilapalli <vi...@hortonworks.com>
wrote:

> Forking the thread. Started looking at the 2.8 list, various features’
> status and arrived here.
>
> While I understand the pervasive nature of EC and a need for a significant
> bake-in, moving this to a 3.x release is not a good idea. We will surely
> get a 2.8 out this year and, as needed, I can even spend time getting
> started on a 2.9. OTOH, 3.x is a long way off, and given all the
> incompatibilities there, it would be a while before users can get their
> hands on EC if it were to be only on 3.x. At best, this may force sites
> that want EC to backport the entire EC feature to older releases; at worst,
> this will repeat the mess of 0.20 security release forks.
>
> If we think adding this to 2.8 (even if it is switched off) is too much risk
> per our original plan, let’s move this to 2.9, thereby leaving enough time
> for stability, integration testing and bake-in, and a realistic chance of
> having it end up on users’ clusters soonish.
>
> +Vinod
>
> > On Oct 19, 2015, at 1:44 PM, Andrew Wang <an...@cloudera.com> wrote:
> >
> > I think our plan thus far has been to target this for 3.0. I'm okay with
> > putting it in branch-2 if we've given a hard look at compatibility, but
> > I'll note though that 2.8 is already looking like quite a large release,
> > and our release bandwidth has been focused on the 2.6 and 2.7 maintenance
> > releases. Adding another multi-hundred JIRAs to 2.8 might make it too
> > unwieldy to get out the door. If we bump EC past that, 3.0 might very well
> > be our next release vehicle. I do plan to revive the 3.0 schedule some time
> > next year. With EC and JDK8 in a good spot, the only big feature remaining
> > is classpath isolation.
> >
> > EC is also a pretty fundamental change to HDFS. Even if it's compatible, in
> > terms of size and impact it might best belong in a new major release.
> >
> > Best,
> > Andrew
> >
> > On Fri, Oct 16, 2015 at 7:04 PM, Vinayakumar B <
> > vinayakumarb.apache@gmail.com> wrote:
> >
> >> Does anyone else also think that the feature is ready to go to branch-2
> >> as well?
> >>
> >> It's been > 2 weeks since EC landed on trunk. IMO it looks quite stable
> >> since then and ready to go in branch-2.
> >>
> >> -Vinay
> >> On Oct 6, 2015 12:51 AM, "Zhe Zhang" <zh...@cloudera.com> wrote:
> >>
> >>> Thanks Vinay for capturing the issue and Uma for offering the help.
> >>>
> >>> ---
> >>> Zhe Zhang
> >>>
> >>> On Mon, Oct 5, 2015 at 12:19 PM, Gangumalla, Uma <
> >> uma.gangumalla@intel.com
> >>>>
> >>> wrote:
> >>>
> >>>> Vinay,
> >>>>
> >>>>
> >>>> I would merge them as part of HDFS-9182.
> >>>>
> >>>> Thanks,
> >>>> Uma
> >>>>
> >>>>
> >>>>
> >>>> On 10/5/15, 12:48 AM, "Vinayakumar B" <vi...@apache.org>
> wrote:
> >>>>
> >>>>> Hi Andrew,
> >>>>> I see CHANGES.txt entries not yet merged from
> >> CHANGES-HDFS-EC-7285.txt.
> >>>>>
> >>>>> Was this intentional?
> >>>>>
> >>>>> Regards,
> >>>>> Vinay
> >>>>>
> >>>>> On Wed, Sep 30, 2015 at 9:15 PM, Andrew Wang <
> >> andrew.wang@cloudera.com>
> >>>>> wrote:
> >>>>>
> >>>>>> Branch has been merged to trunk, thanks again to everyone who worked
> >>> on
> >>>>>> the
> >>>>>> feature!
> >>>>>>
> >>>>>> On Tue, Sep 29, 2015 at 10:44 PM, Zhe Zhang <zh...@cloudera.com>
> >>>>>> wrote:
> >>>>>>
> >>>>>>> Thanks everyone who has participated in this discussion.
> >>>>>>>
> >>>>>>> With 7 +1's (5 binding and 2 non-binding), and no -1, this vote
> >> has
> >>>>>> passed.
> >>>>>>> I will do a final 'git merge' with trunk and work with Andrew to
> >>> merge
> >>>>>> the
> >>>>>>> branch to trunk. I'll update on this thread when the merge is
> >> done.
> >>>>>>>
> >>>>>>> ---
> >>>>>>> Zhe Zhang
> >>>>>>>
> >>>>>>> On Thu, Sep 24, 2015 at 11:08 PM, Liu, Yi A <yi...@intel.com>
> >>>>>> wrote:
> >>>>>>>
> >>>>>>>> (Change it to binding.)
> >>>>>>>>
> >>>>>>>> +1
> >>>>>>>> I have been involved in the development and code review on the
> >>>>>> feature
> >>>>>>>> branch. It's a great feature and I think it's ready to merge it
> >>> into
> >>>>>>> trunk.
> >>>>>>>>
> >>>>>>>> Thanks all for the contribution.
> >>>>>>>>
> >>>>>>>> Regards,
> >>>>>>>> Yi Liu
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> -----Original Message-----
> >>>>>>>> From: Liu, Yi A
> >>>>>>>> Sent: Friday, September 25, 2015 1:51 PM
> >>>>>>>> To: hdfs-dev@hadoop.apache.org
> >>>>>>>> Subject: RE: [VOTE] Merge HDFS-7285 (erasure coding) branch to
> >>> trunk
> >>>>>>>>
> >>>>>>>> +1 (non-binding)
> >>>>>>>> I have been involved in the development and code review on the
> >>>>>> feature
> >>>>>>>> branch. It's a great feature and I think it's ready to merge it
> >>> into
> >>>>>>> trunk.
> >>>>>>>>
> >>>>>>>> Thanks all for the contribution.
> >>>>>>>>
> >>>>>>>> Regards,
> >>>>>>>> Yi Liu
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> -----Original Message-----
> >>>>>>>> From: Vinayakumar B [mailto:vinayakumarb@apache.org]
> >>>>>>>> Sent: Friday, September 25, 2015 12:21 PM
> >>>>>>>> To: hdfs-dev@hadoop.apache.org
> >>>>>>>> Subject: Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to
> >>> trunk
> >>>>>>>>
> >>>>>>>> +1,
> >>>>>>>>
> >>>>>>>> I've been involved starting from design and development of
> >>>>>> ErasureCoding.
> >>>>>>>> I think phase 1 of this development is ready to be merged to
> >>> trunk.
> >>>>>>>> It had come a long way to the current state with significant
> >>> effort
> >>>>>> of
> >>>>>>>> many Contributors and Reviewers for both design and code.
> >>>>>>>>
> >>>>>>>> Thanks Everyone for the efforts.
> >>>>>>>>
> >>>>>>>> Regards,
> >>>>>>>> Vinay
> >>>>>>>>
> >>>>>>>> On Wed, Sep 23, 2015 at 10:53 PM, Jing Zhao <ji...@apache.org>
> >>>>>> wrote:
> >>>>>>>>
> >>>>>>>>> +1
> >>>>>>>>>
> >>>>>>>>> I've been involved in both development and review on the
> >> branch,
> >>>>>> and
> >>>>>> I
> >>>>>>>>> believe it's now ready to get merged into trunk. Many thanks
> >> to
> >>>>>> all
> >>>>>>>>> the contributors and reviewers!
> >>>>>>>>>
> >>>>>>>>> Thanks,
> >>>>>>>>> -Jing
> >>>>>>>>>
> >>>>>>>>> On Tue, Sep 22, 2015 at 6:17 PM, Zheng, Kai <
> >>> kai.zheng@intel.com>
> >>>>>>> wrote:
> >>>>>>>>>
> >>>>>>>>>> Non-binding +1
> >>>>>>>>>>
> >>>>>>>>>> According to our extensive performance tests, striping +
> >> ISA-L
> >>>>>> coder
> >>>>>>>>> based
> >>>>>>>>>> erasure coding not only can save storage, but also can
> >>> increase
> >>>>>> the
> >>>>>>>>>> throughput of a client or a cluster. It will be a great
> >>>>>> addition to
> >>>>>>>>>> HDFS and its users. Based on the latest branch codes, we
> >> also
> >>>>>>>>>> observed it's
> >>>>>>>>> very
> >>>>>>>>>> reliable in the concurrent tests. We'll provide the perf
> >> test
> >>>>>> report
> >>>>>>>>> after
> >>>>>>>>>> it's sorted out and hope it helps.
> >>>>>>>>>> Thanks!
> >>>>>>>>>>
> >>>>>>>>>> Regards,
> >>>>>>>>>> Kai
> >>>>>>>>>>
> >>>>>>>>>> -----Original Message-----
> >>>>>>>>>> From: Gangumalla, Uma [mailto:uma.gangumalla@intel.com]
> >>>>>>>>>> Sent: Wednesday, September 23, 2015 8:50 AM
> >>>>>>>>>> To: hdfs-dev@hadoop.apache.org;
> >> common-dev@hadoop.apache.org
> >>>>>>>>>> Subject: Re: [VOTE] Merge HDFS-7285 (erasure coding) branch
> >> to
> >>>>>> trunk
> >>>>>>>>>>
> >>>>>>>>>> +1
> >>>>>>>>>>
> >>>>>>>>>> Great addition to HDFS. Thanks all contributors for the nice
> >>>>>> work.
> >>>>>>>>>>
> >>>>>>>>>> Regards,
> >>>>>>>>>> Uma
> >>>>>>>>>>
> >>>>>>>>>> On 9/22/15, 3:40 PM, "Zhe Zhang" <zh...@cloudera.com>
> >>> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Hi,
> >>>>>>>>>>>
> >>>>>>>>>>> I'd like to propose a vote to merge the HDFS-7285 feature
> >>>>>> branch
> >>>>>>>>>>> back to trunk. Since November 2014 we have been designing
> >> and
> >>>>>>>>>>> developing this feature under the umbrella JIRAs HDFS-7285
> >>> and
> >>>>>>>>>>> HADOOP-11264, and have committed approximately 210 patches.
> >>>>>>>>>>>
> >>>>>>>>>>> The HDFS-7285 feature branch was created to support the
> >> first
> >>>>>> phase
> >>>>>>>>>>> of HDFS erasure coding (HDFS-EC). The objective of HDFS-EC
> >> is
> >>>>>> to
> >>>>>>>>>>> significantly reduce storage space usage in HDFS clusters.
> >>>>>> Instead
> >>>>>>>>>>> of always creating 3 replicas of each block with 200%
> >> storage
> >>>>>> space
> >>>>>>>>>>> overhead, HDFS-EC provides data durability through parity
> >>> data
> >>>>>>> blocks.
> >>>>>>>>>>> With most EC configurations, the storage overhead is no
> >> more
> >>>>>> than
> >>>>>>> 50%.
> >>>>>>>>>>> Based on profiling results of production clusters, we
> >> decided
> >>>>>> to
> >>>>>>>>>>> support EC with the striped block layout in the first
> >> phase,
> >>> so
> >>>>>>>>>>> that small files can be better handled. This means dividing
> >>>>>> each
> >>>>>>>>>>> logical HDFS file block into smaller units (striping cells)
> >>> and
> >>>>>>>>>>> spreading them on a set of DataNodes in round-robin
> >> fashion.
> >>>>>> Parity
> >>>>>>>>>>> cells are generated for each stripe of original data cells.
> >>> We
> >>>>>> have
> >>>>>>>>>>> made changes to NameNode, client, and DataNode to
> >> generalize
> >>>>>> the
> >>>>>>>>>>> block concept and handle the mapping between a logical file
> >>>>>> block
> >>>>>>>>>>> and its internal storage blocks. For further details please
> >>> see
> >>>>>> the
> >>>>>>>>>>> design doc on HDFS-7285.
> >>>>>>>>>>> HADOOP-11264 focuses on providing flexible and
> >>> high-performance
> >>>>>>>>>>> codec calculation support.
> >>>>>>>>>>>
> >>>>>>>>>>> The nightly Jenkins job of the branch has reported several
> >>>>>>>>>>> successful runs, and doesn't show new flaky tests compared
> >>> with
> >>>>>>>>>>> trunk. We have posted several versions of the test plan
> >>>>>> including
> >>>>>>>>>>> both unit testing and cluster testing, and have executed
> >> most
> >>>>>> tests
> >>>>>>>>>>> in the plan. The most basic functionalities have been
> >>>>>> extensively
> >>>>>>>>>>> tested and verified in several real clusters with different
> >>>>>>>>>>> hardware configurations; results have been very stable. We
> >>> have
> >>>>>>>>>>> created follow-on tasks for more advanced error handling
> >> and
> >>>>>>>> optimization under the umbrella HDFS-8031.
> >>>>>>>>>>> We also plan to implement or harden the integration of EC
> >>> with
> >>>>>>>>>>> existing features such as WebHDFS, snapshot, append,
> >>> truncate,
> >>>>>>>>>>> hflush, hsync, and so forth.
> >>>>>>>>>>>
> >>>>>>>>>>> Development of this feature has been a collaboration across
> >>>>>> many
> >>>>>>>>>>> companies and institutions. I'd like to thank J. Andreina,
> >>>>>> Takanobu
> >>>>>>>>>>> Asanuma, Vinayakumar B, Li Bo, Takuya Fukudome, Uma
> >> Maheswara
> >>>>>> Rao
> >>>>>>>>>>> G, Rui Li, Yi Liu, Colin McCabe, Xinwei Qin, Rakesh R, Gao
> >>> Rui,
> >>>>>> Kai
> >>>>>>>>>>> Sasaki, Walter Su, Tsz Wo Nicholas Sze, Andrew Wang, Yong
> >>>>>> Zhang,
> >>>>>>>>>>> Jing Zhao, Hui Zheng and Kai Zheng for their code
> >>> contributions
> >>>>>> and
> >>>>>>>> reviews.
> >>>>>>>>>>> Andrew and Kai Zheng also made fundamental contributions to
> >>> the
> >>>>>>>>>>> initial design. Rui Li, Gao Rui, Kai Sasaki, Kai Zheng and
> >>> many
> >>>>>>>>>>> other contributors have made great efforts in system
> >> testing.
> >>>>>> Many
> >>>>>>>>>>> thanks go to Weihua Jiang for proposing the JIRA, and ATM,
> >>> Todd
> >>>>>>>>>>> Lipcon, Silvius Rus, Suresh, as well as many others for
> >>>>>> providing
> >>>>>>>> helpful feedbacks.
> >>>>>>>>>>>
> >>>>>>>>>>> Following the community convention, this vote will last
> >> for 7
> >>>>>> days
> >>>>>>>>>>> (ending September 29th). Votes from Hadoop committers are
> >>>>>> binding
> >>>>>>>>>>> but non-binding votes are very welcome as well. And here's
> >> my
> >>>>>>>>>>> non-binding
> >>>>>>>>> +1.
> >>>>>>>>>>>
> >>>>>>>>>>> Thanks,
> >>>>>>>>>>> ---
> >>>>>>>>>>> Zhe Zhang
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>
> >>>>
> >>>
> >>
>
>

Erasure coding in branch-2 [Was Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk]

Posted by Vinod Vavilapalli <vi...@hortonworks.com>.
Forking the thread. I started looking at the 2.8 list and the status of various features, and arrived here.

While I understand the pervasive nature of EC and the need for a significant bake-in, moving this to a 3.x release is not a good idea. We will surely get a 2.8 out this year and, as needed, I can even spend time getting started on a 2.9. OTOH, 3.x is a long way off, and given all the incompatibilities there, it would be a while before users can get their hands on EC if it were to be only on 3.x. At best, this may force sites that want EC to backport the entire EC feature to older releases; at worst, this will repeat the mess of the 0.20 security release forks.

If we think adding this to 2.8 (even if it is switched off) is too much risk per our original plan, let’s move this to 2.9, thereby leaving enough time for stability, integration testing and bake-in, and a realistic chance of having it end up on users’ clusters soonish.

+Vinod

> On Oct 19, 2015, at 1:44 PM, Andrew Wang <an...@cloudera.com> wrote:
> 
> I think our plan thus far has been to target this for 3.0. I'm okay with
> putting it in branch-2 if we've given a hard look at compatibility, but
> I'll note though that 2.8 is already looking like quite a large release,
> and our release bandwidth has been focused on the 2.6 and 2.7 maintenance
> releases. Adding another multi-hundred JIRAs to 2.8 might make it too
> unwieldy to get out the door. If we bump EC past that, 3.0 might very well
> be our next release vehicle. I do plan to revive the 3.0 schedule some time
> next year. With EC and JDK8 in a good spot, the only big feature remaining
> is classpath isolation.
> 
> EC is also a pretty fundamental change to HDFS. Even if it's compatible, in
> terms of size and impact it might best belong in a new major release.
> 
> Best,
> Andrew
> 


Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk

Posted by Andrew Wang <an...@cloudera.com>.
I think our plan thus far has been to target this for 3.0. I'm okay with
putting it in branch-2 if we've given a hard look at compatibility, but
I'll note though that 2.8 is already looking like quite a large release,
and our release bandwidth has been focused on the 2.6 and 2.7 maintenance
releases. Adding another multi-hundred JIRAs to 2.8 might make it too
unwieldy to get out the door. If we bump EC past that, 3.0 might very well
be our next release vehicle. I do plan to revive the 3.0 schedule some time
next year. With EC and JDK8 in a good spot, the only big feature remaining
is classpath isolation.

EC is also a pretty fundamental change to HDFS. Even if it's compatible, in
terms of size and impact it might best belong in a new major release.

Best,
Andrew

On Fri, Oct 16, 2015 at 7:04 PM, Vinayakumar B <vinayakumarb.apache@gmail.com> wrote:

> Does anyone else think this feature is ready to go to branch-2 as well?
>
> It has been more than 2 weeks since EC landed on trunk. IMO it looks quite
> stable since then and ready to go into branch-2.
>
> -Vinay

Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk

Posted by Vinayakumar B <vi...@gmail.com>.
Does anyone else think this feature is ready to go to branch-2 as well?

It has been more than 2 weeks since EC landed on trunk. IMO it looks quite
stable since then and ready to go into branch-2.

-Vinay
On Oct 6, 2015 12:51 AM, "Zhe Zhang" <zh...@cloudera.com> wrote:

> Thanks Vinay for capturing the issue and Uma for offering the help.
>
> ---
> Zhe Zhang
>
> On Mon, Oct 5, 2015 at 12:19 PM, Gangumalla, Uma <uma.gangumalla@intel.com> wrote:
>
> > Vinay,
> >
> >
> >  I would merge them as part of HDFS-9182.
> >
> > Thanks,
> > Uma
> >
> >
> >
> > On 10/5/15, 12:48 AM, "Vinayakumar B" <vi...@apache.org> wrote:
> >
> > >Hi Andrew,
> > > I see CHANGES.txt entries not yet merged from CHANGES-HDFS-EC-7285.txt.
> > >
> > > Was this intentional?
> > >
> > >Regards,
> > >Vinay
> > >
> > >On Wed, Sep 30, 2015 at 9:15 PM, Andrew Wang <an...@cloudera.com>
> > >wrote:
> > >
> > >> Branch has been merged to trunk, thanks again to everyone who worked on the
> > >> feature!
> > >>
> > >> On Tue, Sep 29, 2015 at 10:44 PM, Zhe Zhang <zh...@cloudera.com>
> > >>wrote:
> > >>
> > >> > Thanks everyone who has participated in this discussion.
> > >> >
> > >> > With 7 +1's (5 binding and 2 non-binding), and no -1, this vote has passed.
> > >> > I will do a final 'git merge' with trunk and work with Andrew to merge the
> > >> > branch to trunk. I'll update on this thread when the merge is done.
> > >> >
> > >> > ---
> > >> > Zhe Zhang
> > >> >
> > >> > On Thu, Sep 24, 2015 at 11:08 PM, Liu, Yi A <yi...@intel.com>
> > >>wrote:
> > >> >
> > >> > > (Change it to binding.)
> > >> > >
> > >> > > +1
> > >> > > I have been involved in the development and code review on the feature
> > >> > > branch. It's a great feature and I think it's ready to merge it into
> > >> > > trunk.
> > >> > >
> > >> > > Thanks all for the contribution.
> > >> > >
> > >> > > Regards,
> > >> > > Yi Liu
> > >> > >
> > >> > >
> > >> > > -----Original Message-----
> > >> > > From: Liu, Yi A
> > >> > > Sent: Friday, September 25, 2015 1:51 PM
> > >> > > To: hdfs-dev@hadoop.apache.org
> > >> > > Subject: RE: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk
> > >> > >
> > >> > > +1 (non-binding)
> > >> > > I have been involved in the development and code review on the feature
> > >> > > branch. It's a great feature and I think it's ready to merge it into
> > >> > > trunk.
> > >> > >
> > >> > > Thanks all for the contribution.
> > >> > >
> > >> > > Regards,
> > >> > > Yi Liu
> > >> > >
> > >> > >
> > >> > > -----Original Message-----
> > >> > > From: Vinayakumar B [mailto:vinayakumarb@apache.org]
> > >> > > Sent: Friday, September 25, 2015 12:21 PM
> > >> > > To: hdfs-dev@hadoop.apache.org
> > >> > > Subject: Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk
> > >> > >
> > >> > > +1,
> > >> > >
> > >> > > I've been involved starting from design and development of ErasureCoding.
> > >> > > I think phase 1 of this development is ready to be merged to trunk.
> > >> > > It had come a long way to the current state with significant effort of
> > >> > > many Contributors and Reviewers for both design and code.
> > >> > >
> > >> > > Thanks Everyone for the efforts.
> > >> > >
> > >> > > Regards,
> > >> > > Vinay
> > >> > >
> > >> > > On Wed, Sep 23, 2015 at 10:53 PM, Jing Zhao <ji...@apache.org>
> > >>wrote:
> > >> > >
> > >> > > > +1
> > >> > > >
> > >> > > > I've been involved in both development and review on the branch, and I
> > >> > > > believe it's now ready to get merged into trunk. Many thanks to all
> > >> > > > the contributors and reviewers!
> > >> > > >
> > >> > > > Thanks,
> > >> > > > -Jing
> > >> > > >
> > >> > > > On Tue, Sep 22, 2015 at 6:17 PM, Zheng, Kai <kai.zheng@intel.com> wrote:
> > >> > > >
> > >> > > > > Non-binding +1
> > >> > > > >
> > >> > > > > According to our extensive performance tests, striping + ISA-L coder based
> > >> > > > > erasure coding not only can save storage, but also can increase the
> > >> > > > > throughput of a client or a cluster. It will be a great addition to
> > >> > > > > HDFS and its users. Based on the latest branch codes, we also
> > >> > > > > observed it's very reliable in the concurrent tests. We'll provide
> > >> > > > > the perf test report after it's sorted out and hope it helps.
> > >> > > > > Thanks!
> > >> > > > >
> > >> > > > > Regards,
> > >> > > > > Kai
> > >> > > > >
> > >> > > > > -----Original Message-----
> > >> > > > > From: Gangumalla, Uma [mailto:uma.gangumalla@intel.com]
> > >> > > > > Sent: Wednesday, September 23, 2015 8:50 AM
> > >> > > > > To: hdfs-dev@hadoop.apache.org; common-dev@hadoop.apache.org
> > >> > > > > Subject: Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk
> > >> > > > >
> > >> > > > > +1
> > >> > > > >
> > >> > > > > Great addition to HDFS. Thanks all contributors for the nice work.
> > >> > > > >
> > >> > > > > Regards,
> > >> > > > > Uma
> > >> > > > >
> > >> > > > > On 9/22/15, 3:40 PM, "Zhe Zhang" <zh...@cloudera.com> wrote:
> > >> > > > >
> > >> > > > > >Hi,
> > >> > > > > >
> > >> > > > > >I'd like to propose a vote to merge the HDFS-7285 feature branch
> > >> > > > > >back to trunk. Since November 2014 we have been designing and
> > >> > > > > >developing this feature under the umbrella JIRAs HDFS-7285 and
> > >> > > > > >HADOOP-11264, and have committed approximately 210 patches.
> > >> > > > > >
> > >> > > > > >The HDFS-7285 feature branch was created to support the first phase
> > >> > > > > >of HDFS erasure coding (HDFS-EC). The objective of HDFS-EC is to
> > >> > > > > >significantly reduce storage space usage in HDFS clusters. Instead
> > >> > > > > >of always creating 3 replicas of each block with 200% storage space
> > >> > > > > >overhead, HDFS-EC provides data durability through parity data blocks.
> > >> > > > > >With most EC configurations, the storage overhead is no more than 50%.
> > >> > > > > >Based on profiling results of production clusters, we decided to
> > >> > > > > >support EC with the striped block layout in the first phase, so
> > >> > > > > >that small files can be better handled. This means dividing each
> > >> > > > > >logical HDFS file block into smaller units (striping cells) and
> > >> > > > > >spreading them on a set of DataNodes in round-robin fashion. Parity
> > >> > > > > >cells are generated for each stripe of original data cells. We have
> > >> > > > > >made changes to NameNode, client, and DataNode to generalize the
> > >> > > > > >block concept and handle the mapping between a logical file block
> > >> > > > > >and its internal storage blocks. For further details please see the
> > >> > > > > >design doc on HDFS-7285.
> > >> > > > > >HADOOP-11264 focuses on providing flexible and high-performance
> > >> > > > > >codec calculation support.
> > >> > > > > >
> > >> > > > > >The nightly Jenkins job of the branch has reported several
> > >> > > > > >successful runs, and doesn't show new flaky tests compared
> with
> > >> > > > > >trunk. We have posted several versions of the test plan
> > >>including
> > >> > > > > >both unit testing and cluster testing, and have executed most
> > >> tests
> > >> > > > > >in the plan. The most basic functionalities have been
> > >>extensively
> > >> > > > > >tested and verified in several real clusters with different
> > >> > > > > >hardware configurations; results have been very stable. We
> have
> > >> > > > > >created follow-on tasks for more advanced error handling and
> > >> > > optimization under the umbrella HDFS-8031.
> > >> > > > > >We also plan to implement or harden the integration of EC
> with
> > >> > > > > >existing features such as WebHDFS, snapshot, append,
> truncate,
> > >> > > > > >hflush, hsync, and so forth.
> > >> > > > > >
> > >> > > > > >Development of this feature has been a collaboration across
> > >>many
> > >> > > > > >companies and institutions. I'd like to thank J. Andreina,
> > >> Takanobu
> > >> > > > > >Asanuma, Vinayakumar B, Li Bo, Takuya Fukudome, Uma Maheswara
> > >>Rao
> > >> > > > > >G, Rui Li, Yi Liu, Colin McCabe, Xinwei Qin, Rakesh R, Gao
> Rui,
> > >> Kai
> > >> > > > > >Sasaki, Walter Su, Tsz Wo Nicholas Sze, Andrew Wang, Yong
> > >>Zhang,
> > >> > > > > >Jing Zhao, Hui Zheng and Kai Zheng for their code
> contributions
> > >> and
> > >> > > reviews.
> > >> > > > > >Andrew and Kai Zheng also made fundamental contributions to
> the
> > >> > > > > >initial design. Rui Li, Gao Rui, Kai Sasaki, Kai Zheng and
> many
> > >> > > > > >other contributors have made great efforts in system testing.
> > >>Many
> > >> > > > > >thanks go to Weihua Jiang for proposing the JIRA, and ATM,
> Todd
> > >> > > > > >Lipcon, Silvius Rus, Suresh, as well as many others for
> > >>providing
> > >> > > helpful feedbacks.
> > >> > > > > >
> > >> > > > > >Following the community convention, this vote will last for 7
> > >>days
> > >> > > > > >(ending September 29th). Votes from Hadoop committers are
> > >>binding
> > >> > > > > >but non-binding votes are very welcome as well. And here's my
> > >> > > > > >non-binding
> > >> > > > +1.
> > >> > > > > >
> > >> > > > > >Thanks,
> > >> > > > > >---
> > >> > > > > >Zhe Zhang
> > >> > > > >
> > >> > > > >
> > >> > > >
> > >> > >
> > >> >
> > >>
> >
> >
>

Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk

Posted by Zhe Zhang <zh...@cloudera.com>.
Thanks Vinay for capturing the issue and Uma for offering the help.

---
Zhe Zhang

On Mon, Oct 5, 2015 at 12:19 PM, Gangumalla, Uma <um...@intel.com>
wrote:

> Vinay,
>
>
>  I would merge them as part of HDFS-9182.
>
> Thanks,
> Uma
>
>
>
> On 10/5/15, 12:48 AM, "Vinayakumar B" <vi...@apache.org> wrote:
>
> >Hi Andrew,
> > I see CHANGES.txt entries not yet merged from CHANGES-HDFS-EC-7285.txt.
> >
> > Was this intentional?
> >
> >Regards,
> >Vinay

Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk

Posted by "Gangumalla, Uma" <um...@intel.com>.
Vinay,

I would merge them as part of HDFS-9182.

Thanks,
Uma



On 10/5/15, 12:48 AM, "Vinayakumar B" <vi...@apache.org> wrote:

>Hi Andrew,
> I see CHANGES.txt entries not yet merged from CHANGES-HDFS-EC-7285.txt.
>
> Was this intentional?
>
>Regards,
>Vinay


Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk

Posted by Vinayakumar B <vi...@apache.org>.
Hi Andrew,

I see that the entries from CHANGES-HDFS-EC-7285.txt have not yet been merged
into CHANGES.txt.

Was this intentional?

Regards,
Vinay

On Wed, Sep 30, 2015 at 9:15 PM, Andrew Wang <an...@cloudera.com>
wrote:

> Branch has been merged to trunk, thanks again to everyone who worked on the
> feature!

Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk

Posted by Andrew Wang <an...@cloudera.com>.
The branch has been merged to trunk. Thanks again to everyone who worked on the
feature!

On Tue, Sep 29, 2015 at 10:44 PM, Zhe Zhang <zh...@cloudera.com> wrote:

> Thanks everyone who has participated in this discussion.
>
> With 7 +1's (5 binding and 2 non-binding), and no -1, this vote has passed.
> I will do a final 'git merge' with trunk and work with Andrew to merge the
> branch to trunk. I'll update on this thread when the merge is done.
>
> ---
> Zhe Zhang
>
> > > > On 9/22/15, 3:40 PM, "Zhe Zhang" <zh...@cloudera.com> wrote:
> > > >
> > > > >Hi,
> > > > >
> > > > >I'd like to propose a vote to merge the HDFS-7285 feature branch
> > > > >back to trunk. Since November 2014 we have been designing and
> > > > >developing this feature under the umbrella JIRAs HDFS-7285 and
> > > > >HADOOP-11264, and have committed approximately 210 patches.
> > > > >
> > > > >The HDFS-7285 feature branch was created to support the first phase
> > > > >of HDFS erasure coding (HDFS-EC). The objective of HDFS-EC is to
> > > > >significantly reduce storage space usage in HDFS clusters. Instead
> > > > >of always creating 3 replicas of each block with 200% storage space
> > > > >overhead, HDFS-EC provides data durability through parity data blocks.
> > > > >With most EC configurations, the storage overhead is no more than 50%.
> > > > >Based on profiling results of production clusters, we decided to
> > > > >support EC with the striped block layout in the first phase, so
> > > > >that small files can be better handled. This means dividing each
> > > > >logical HDFS file block into smaller units (striping cells) and
> > > > >spreading them on a set of DataNodes in round-robin fashion. Parity
> > > > >cells are generated for each stripe of original data cells. We have
> > > > >made changes to NameNode, client, and DataNode to generalize the
> > > > >block concept and handle the mapping between a logical file block
> > > > >and its internal storage blocks. For further details please see the
> > > > >design doc on HDFS-7285.
> > > > >HADOOP-11264 focuses on providing flexible and high-performance
> > > > >codec calculation support.
> > > > >
> > > > >The nightly Jenkins job of the branch has reported several
> > > > >successful runs, and doesn't show new flaky tests compared with
> > > > >trunk. We have posted several versions of the test plan including
> > > > >both unit testing and cluster testing, and have executed most tests
> > > > >in the plan. The most basic functionalities have been extensively
> > > > >tested and verified in several real clusters with different
> > > > >hardware configurations; results have been very stable. We have
> > > > >created follow-on tasks for more advanced error handling and
> > > > >optimization under the umbrella HDFS-8031.
> > > > >We also plan to implement or harden the integration of EC with
> > > > >existing features such as WebHDFS, snapshot, append, truncate,
> > > > >hflush, hsync, and so forth.
> > > > >
> > > > >Development of this feature has been a collaboration across many
> > > > >companies and institutions. I'd like to thank J. Andreina, Takanobu
> > > > >Asanuma, Vinayakumar B, Li Bo, Takuya Fukudome, Uma Maheswara Rao
> > > > >G, Rui Li, Yi Liu, Colin McCabe, Xinwei Qin, Rakesh R, Gao Rui, Kai
> > > > >Sasaki, Walter Su, Tsz Wo Nicholas Sze, Andrew Wang, Yong Zhang,
> > > > >Jing Zhao, Hui Zheng and Kai Zheng for their code contributions and
> > > > >reviews.
> > > > >Andrew and Kai Zheng also made fundamental contributions to the
> > > > >initial design. Rui Li, Gao Rui, Kai Sasaki, Kai Zheng and many
> > > > >other contributors have made great efforts in system testing. Many
> > > > >thanks go to Weihua Jiang for proposing the JIRA, and ATM, Todd
> > > > >Lipcon, Silvius Rus, Suresh, as well as many others for providing
> > > > >helpful feedbacks.
> > > > >
> > > > >Following the community convention, this vote will last for 7 days
> > > > >(ending September 29th). Votes from Hadoop committers are binding
> > > > >but non-binding votes are very welcome as well. And here's my
> > > > >non-binding +1.
> > > > >
> > > > >Thanks,
> > > > >---
> > > > >Zhe Zhang
> > > >
> > > >
> > >
> >
>

Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk

Posted by Zhe Zhang <zh...@cloudera.com>.
Thanks everyone who has participated in this discussion.

With 7 +1's (5 binding and 2 non-binding), and no -1, this vote has passed.
I will do a final 'git merge' with trunk and work with Andrew to merge the
branch to trunk. I'll update on this thread when the merge is done.

---
Zhe Zhang

On Thu, Sep 24, 2015 at 11:08 PM, Liu, Yi A <yi...@intel.com> wrote:

> (Change it to binding.)
>
> +1
> I have been involved in the development and code review on the feature
> branch. It's a great feature and I think it's ready to merge it into trunk.
>
> Thanks all for the contribution.
>
> Regards,
> Yi Liu
>
>
> -----Original Message-----
> From: Liu, Yi A
> Sent: Friday, September 25, 2015 1:51 PM
> To: hdfs-dev@hadoop.apache.org
> Subject: RE: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk
>
> +1 (non-binding)
> I have been involved in the development and code review on the feature
> branch. It's a great feature and I think it's ready to merge it into trunk.
>
> Thanks all for the contribution.
>
> Regards,
> Yi Liu
>
>
> -----Original Message-----
> From: Vinayakumar B [mailto:vinayakumarb@apache.org]
> Sent: Friday, September 25, 2015 12:21 PM
> To: hdfs-dev@hadoop.apache.org
> Subject: Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk
>
> +1,
>
> I've been involved starting from design and development of ErasureCoding.
> I think phase 1 of this development is ready to be merged to trunk.
> It had come a long way to the current state with significant effort of
> many Contributors and Reviewers for both design and code.
>
> Thanks Everyone for the efforts.
>
> Regards,
> Vinay
>
> On Wed, Sep 23, 2015 at 10:53 PM, Jing Zhao <ji...@apache.org> wrote:
>
> > +1
> >
> > I've been involved in both development and review on the branch, and I
> > believe it's now ready to get merged into trunk. Many thanks to all
> > the contributors and reviewers!
> >
> > Thanks,
> > -Jing
> >
> > On Tue, Sep 22, 2015 at 6:17 PM, Zheng, Kai <ka...@intel.com> wrote:
> >
> > > Non-binding +1
> > >
> > > According to our extensive performance tests, striping + ISA-L coder
> > > based erasure coding not only can save storage, but also can increase the
> > > throughput of a client or a cluster. It will be a great addition to
> > > HDFS and its users. Based on the latest branch codes, we also
> > > observed it's very reliable in the concurrent tests. We'll provide the
> > > perf test report after it's sorted out and hope it helps.
> > > Thanks!
> > >
> > > Regards,
> > > Kai
> > >
> > > -----Original Message-----
> > > From: Gangumalla, Uma [mailto:uma.gangumalla@intel.com]
> > > Sent: Wednesday, September 23, 2015 8:50 AM
> > > To: hdfs-dev@hadoop.apache.org; common-dev@hadoop.apache.org
> > > Subject: Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk
> > >
> > > +1
> > >
> > > Great addition to HDFS. Thanks all contributors for the nice work.
> > >
> > > Regards,
> > > Uma
> > >
> > > On 9/22/15, 3:40 PM, "Zhe Zhang" <zh...@cloudera.com> wrote:
> > >
> > > >Hi,
> > > >
> > > >I'd like to propose a vote to merge the HDFS-7285 feature branch
> > > >back to trunk. Since November 2014 we have been designing and
> > > >developing this feature under the umbrella JIRAs HDFS-7285 and
> > > >HADOOP-11264, and have committed approximately 210 patches.
> > > >
> > > >The HDFS-7285 feature branch was created to support the first phase
> > > >of HDFS erasure coding (HDFS-EC). The objective of HDFS-EC is to
> > > >significantly reduce storage space usage in HDFS clusters. Instead
> > > >of always creating 3 replicas of each block with 200% storage space
> > > >overhead, HDFS-EC provides data durability through parity data blocks.
> > > >With most EC configurations, the storage overhead is no more than 50%.
> > > >Based on profiling results of production clusters, we decided to
> > > >support EC with the striped block layout in the first phase, so
> > > >that small files can be better handled. This means dividing each
> > > >logical HDFS file block into smaller units (striping cells) and
> > > >spreading them on a set of DataNodes in round-robin fashion. Parity
> > > >cells are generated for each stripe of original data cells. We have
> > > >made changes to NameNode, client, and DataNode to generalize the
> > > >block concept and handle the mapping between a logical file block
> > > >and its internal storage blocks. For further details please see the
> > > >design doc on HDFS-7285.
> > > >HADOOP-11264 focuses on providing flexible and high-performance
> > > >codec calculation support.
> > > >
> > > >The nightly Jenkins job of the branch has reported several
> > > >successful runs, and doesn't show new flaky tests compared with
> > > >trunk. We have posted several versions of the test plan including
> > > >both unit testing and cluster testing, and have executed most tests
> > > >in the plan. The most basic functionalities have been extensively
> > > >tested and verified in several real clusters with different
> > > >hardware configurations; results have been very stable. We have
> > > >created follow-on tasks for more advanced error handling and
> > > >optimization under the umbrella HDFS-8031.
> > > >We also plan to implement or harden the integration of EC with
> > > >existing features such as WebHDFS, snapshot, append, truncate,
> > > >hflush, hsync, and so forth.
> > > >
> > > >Development of this feature has been a collaboration across many
> > > >companies and institutions. I'd like to thank J. Andreina, Takanobu
> > > >Asanuma, Vinayakumar B, Li Bo, Takuya Fukudome, Uma Maheswara Rao
> > > >G, Rui Li, Yi Liu, Colin McCabe, Xinwei Qin, Rakesh R, Gao Rui, Kai
> > > >Sasaki, Walter Su, Tsz Wo Nicholas Sze, Andrew Wang, Yong Zhang,
> > > >Jing Zhao, Hui Zheng and Kai Zheng for their code contributions and
> > > >reviews.
> > > >Andrew and Kai Zheng also made fundamental contributions to the
> > > >initial design. Rui Li, Gao Rui, Kai Sasaki, Kai Zheng and many
> > > >other contributors have made great efforts in system testing. Many
> > > >thanks go to Weihua Jiang for proposing the JIRA, and ATM, Todd
> > > >Lipcon, Silvius Rus, Suresh, as well as many others for providing
> > > >helpful feedbacks.
> > > >
> > > >Following the community convention, this vote will last for 7 days
> > > >(ending September 29th). Votes from Hadoop committers are binding
> > > >but non-binding votes are very welcome as well. And here's my
> > > >non-binding +1.
> > > >
> > > >Thanks,
> > > >---
> > > >Zhe Zhang
> > >
> > >
> >
>

RE: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk

Posted by "Liu, Yi A" <yi...@intel.com>.
(Change it to binding.)

+1 
I have been involved in the development and code review on the feature branch. It's a great feature and I think it's ready to merge it into trunk.

Thanks all for the contribution.

Regards,
Yi Liu


-----Original Message-----
From: Liu, Yi A 
Sent: Friday, September 25, 2015 1:51 PM
To: hdfs-dev@hadoop.apache.org
Subject: RE: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk

+1 (non-binding)
I have been involved in the development and code review on the feature branch. It's a great feature and I think it's ready to merge it into trunk.

Thanks all for the contribution.

Regards,
Yi Liu


-----Original Message-----
From: Vinayakumar B [mailto:vinayakumarb@apache.org]
Sent: Friday, September 25, 2015 12:21 PM
To: hdfs-dev@hadoop.apache.org
Subject: Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk

+1,

I've been involved starting from design and development of ErasureCoding. I think phase 1 of this development is ready to be merged to trunk.
It had come a long way to the current state with significant effort of many Contributors and Reviewers for both design and code.

Thanks Everyone for the efforts.

Regards,
Vinay

On Wed, Sep 23, 2015 at 10:53 PM, Jing Zhao <ji...@apache.org> wrote:

> +1
>
> I've been involved in both development and review on the branch, and I 
> believe it's now ready to get merged into trunk. Many thanks to all 
> the contributors and reviewers!
>
> Thanks,
> -Jing
>
> On Tue, Sep 22, 2015 at 6:17 PM, Zheng, Kai <ka...@intel.com> wrote:
>
> > Non-binding +1
> >
> > According to our extensive performance tests, striping + ISA-L coder
> > based erasure coding not only can save storage, but also can increase the
> > throughput of a client or a cluster. It will be a great addition to
> > HDFS and its users. Based on the latest branch codes, we also
> > observed it's very reliable in the concurrent tests. We'll provide the
> > perf test report after it's sorted out and hope it helps.
> > Thanks!
> >
> > Regards,
> > Kai
> >
> > -----Original Message-----
> > From: Gangumalla, Uma [mailto:uma.gangumalla@intel.com]
> > Sent: Wednesday, September 23, 2015 8:50 AM
> > To: hdfs-dev@hadoop.apache.org; common-dev@hadoop.apache.org
> > Subject: Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk
> >
> > +1
> >
> > Great addition to HDFS. Thanks all contributors for the nice work.
> >
> > Regards,
> > Uma
> >
> > On 9/22/15, 3:40 PM, "Zhe Zhang" <zh...@cloudera.com> wrote:
> >
> > >Hi,
> > >
> > >I'd like to propose a vote to merge the HDFS-7285 feature branch 
> > >back to trunk. Since November 2014 we have been designing and 
> > >developing this feature under the umbrella JIRAs HDFS-7285 and 
> > >HADOOP-11264, and have committed approximately 210 patches.
> > >
> > >The HDFS-7285 feature branch was created to support the first phase 
> > >of HDFS erasure coding (HDFS-EC). The objective of HDFS-EC is to 
> > >significantly reduce storage space usage in HDFS clusters. Instead 
> > >of always creating 3 replicas of each block with 200% storage space 
> > >overhead, HDFS-EC provides data durability through parity data blocks.
> > >With most EC configurations, the storage overhead is no more than 50%.
> > >Based on profiling results of production clusters, we decided to 
> > >support EC with the striped block layout in the first phase, so 
> > >that small files can be better handled. This means dividing each 
> > >logical HDFS file block into smaller units (striping cells) and 
> > >spreading them on a set of DataNodes in round-robin fashion. Parity 
> > >cells are generated for each stripe of original data cells. We have 
> > >made changes to NameNode, client, and DataNode to generalize the 
> > >block concept and handle the mapping between a logical file block 
> > >and its internal storage blocks. For further details please see the 
> > >design doc on HDFS-7285.
> > >HADOOP-11264 focuses on providing flexible and high-performance 
> > >codec calculation support.
> > >
> > >The nightly Jenkins job of the branch has reported several 
> > >successful runs, and doesn't show new flaky tests compared with 
> > >trunk. We have posted several versions of the test plan including 
> > >both unit testing and cluster testing, and have executed most tests 
> > >in the plan. The most basic functionalities have been extensively 
> > >tested and verified in several real clusters with different 
> > >hardware configurations; results have been very stable. We have 
> > >created follow-on tasks for more advanced error handling and optimization under the umbrella HDFS-8031.
> > >We also plan to implement or harden the integration of EC with 
> > >existing features such as WebHDFS, snapshot, append, truncate, 
> > >hflush, hsync, and so forth.
> > >
> > >Development of this feature has been a collaboration across many 
> > >companies and institutions. I'd like to thank J. Andreina, Takanobu 
> > >Asanuma, Vinayakumar B, Li Bo, Takuya Fukudome, Uma Maheswara Rao 
> > >G, Rui Li, Yi Liu, Colin McCabe, Xinwei Qin, Rakesh R, Gao Rui, Kai 
> > >Sasaki, Walter Su, Tsz Wo Nicholas Sze, Andrew Wang, Yong Zhang, 
> > >Jing Zhao, Hui Zheng and Kai Zheng for their code contributions and reviews.
> > >Andrew and Kai Zheng also made fundamental contributions to the 
> > >initial design. Rui Li, Gao Rui, Kai Sasaki, Kai Zheng and many 
> > >other contributors have made great efforts in system testing. Many 
> > >thanks go to Weihua Jiang for proposing the JIRA, and ATM, Todd 
> > >Lipcon, Silvius Rus, Suresh, as well as many others for providing helpful feedbacks.
> > >
> > >Following the community convention, this vote will last for 7 days 
> > >(ending September 29th). Votes from Hadoop committers are binding 
> > >but non-binding votes are very welcome as well. And here's my 
> > >non-binding +1.
> > >
> > >Thanks,
> > >---
> > >Zhe Zhang
> >
> >
>

Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk

Posted by Vinayakumar B <vi...@apache.org>.
+1,

I've been involved starting from design and development of ErasureCoding. I
think phase 1 of this development is ready to be merged to trunk.
It had come a long way to the current state with significant effort of many
Contributors and Reviewers for both design and code.

Thanks Everyone for the efforts.

Regards,
Vinay

On Wed, Sep 23, 2015 at 10:53 PM, Jing Zhao <ji...@apache.org> wrote:

> +1
>
> I've been involved in both development and review on the branch, and I
> believe it's now ready to get merged into trunk. Many thanks to all the
> contributors and reviewers!
>
> Thanks,
> -Jing
>
> On Tue, Sep 22, 2015 at 6:17 PM, Zheng, Kai <ka...@intel.com> wrote:
>
> > Non-binding +1
> >
> > According to our extensive performance tests, striping + ISA-L coder
> > based erasure coding not only can save storage, but also can increase the
> > throughput of a client or a cluster. It will be a great addition to HDFS
> > and its users. Based on the latest branch codes, we also observed it's
> > very reliable in the concurrent tests. We'll provide the perf test report
> > after it's sorted out and hope it helps.
> > Thanks!
> >
> > Regards,
> > Kai
> >
> > -----Original Message-----
> > From: Gangumalla, Uma [mailto:uma.gangumalla@intel.com]
> > Sent: Wednesday, September 23, 2015 8:50 AM
> > To: hdfs-dev@hadoop.apache.org; common-dev@hadoop.apache.org
> > Subject: Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk
> >
> > +1
> >
> > Great addition to HDFS. Thanks all contributors for the nice work.
> >
> > Regards,
> > Uma
> >
> > On 9/22/15, 3:40 PM, "Zhe Zhang" <zh...@cloudera.com> wrote:
> >
> > >Hi,
> > >
> > >I'd like to propose a vote to merge the HDFS-7285 feature branch back
> > >to trunk. Since November 2014 we have been designing and developing
> > >this feature under the umbrella JIRAs HDFS-7285 and HADOOP-11264, and
> > >have committed approximately 210 patches.
> > >
> > >The HDFS-7285 feature branch was created to support the first phase of
> > >HDFS erasure coding (HDFS-EC). The objective of HDFS-EC is to
> > >significantly reduce storage space usage in HDFS clusters. Instead of
> > >always creating 3 replicas of each block with 200% storage space
> > >overhead, HDFS-EC provides data durability through parity data blocks.
> > >With most EC configurations, the storage overhead is no more than 50%.
> > >Based on profiling results of production clusters, we decided to
> > >support EC with the striped block layout in the first phase, so that
> > >small files can be better handled. This means dividing each logical
> > >HDFS file block into smaller units (striping cells) and spreading them
> > >on a set of DataNodes in round-robin fashion. Parity cells are
> > >generated for each stripe of original data cells. We have made changes
> > >to NameNode, client, and DataNode to generalize the block concept and
> > >handle the mapping between a logical file block and its internal
> > >storage blocks. For further details please see the design doc on
> > >HDFS-7285.
> > >HADOOP-11264 focuses on providing flexible and high-performance codec
> > >calculation support.
> > >
> > >The nightly Jenkins job of the branch has reported several successful
> > >runs, and doesn't show new flaky tests compared with trunk. We have
> > >posted several versions of the test plan including both unit testing
> > >and cluster testing, and have executed most tests in the plan. The most
> > >basic functionalities have been extensively tested and verified in
> > >several real clusters with different hardware configurations; results
> > >have been very stable. We have created follow-on tasks for more
> > >advanced error handling and optimization under the umbrella HDFS-8031.
> > >We also plan to implement or harden the integration of EC with existing
> > >features such as WebHDFS, snapshot, append, truncate, hflush, hsync,
> > >and so forth.
> > >
> > >Development of this feature has been a collaboration across many
> > >companies and institutions. I'd like to thank J. Andreina, Takanobu
> > >Asanuma, Vinayakumar B, Li Bo, Takuya Fukudome, Uma Maheswara Rao G,
> > >Rui Li, Yi Liu, Colin McCabe, Xinwei Qin, Rakesh R, Gao Rui, Kai
> > >Sasaki, Walter Su, Tsz Wo Nicholas Sze, Andrew Wang, Yong Zhang, Jing
> > >Zhao, Hui Zheng and Kai Zheng for their code contributions and reviews.
> > >Andrew and Kai Zheng also made fundamental contributions to the initial
> > >design. Rui Li, Gao Rui, Kai Sasaki, Kai Zheng and many other
> > >contributors have made great efforts in system testing. Many thanks go
> > >to Weihua Jiang for proposing the JIRA, and ATM, Todd Lipcon, Silvius
> > >Rus, Suresh, as well as many others for providing helpful feedbacks.
> > >
> > >Following the community convention, this vote will last for 7 days
> > >(ending September 29th). Votes from Hadoop committers are binding but
> > >non-binding votes are very welcome as well. And here's my non-binding
> > >+1.
> > >
> > >Thanks,
> > >---
> > >Zhe Zhang
> >
> >
>

Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk

Posted by Jing Zhao <ji...@apache.org>.
+1

I've been involved in both development and review on the branch, and I
believe it's now ready to get merged into trunk. Many thanks to all the
contributors and reviewers!

Thanks,
-Jing

On Tue, Sep 22, 2015 at 6:17 PM, Zheng, Kai <ka...@intel.com> wrote:

> Non-binding +1
>
> According to our extensive performance tests, striping + ISA-L coder based
> erasure coding not only can save storage, but also can increase the
> throughput of a client or a cluster. It will be a great addition to HDFS
> and its users. Based on the latest branch codes, we also observed it's very
> reliable in the concurrent tests. We'll provide the perf test report after
> it's sorted out and hope it helps.
> Thanks!
>
> Regards,
> Kai
>
> -----Original Message-----
> From: Gangumalla, Uma [mailto:uma.gangumalla@intel.com]
> Sent: Wednesday, September 23, 2015 8:50 AM
> To: hdfs-dev@hadoop.apache.org; common-dev@hadoop.apache.org
> Subject: Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk
>
> +1
>
> Great addition to HDFS. Thanks all contributors for the nice work.
>
> Regards,
> Uma
>
> On 9/22/15, 3:40 PM, "Zhe Zhang" <zh...@cloudera.com> wrote:
>
> >Hi,
> >
> >I'd like to propose a vote to merge the HDFS-7285 feature branch back
> >to trunk. Since November 2014 we have been designing and developing
> >this feature under the umbrella JIRAs HDFS-7285 and HADOOP-11264, and
> >have committed approximately 210 patches.
> >
> >The HDFS-7285 feature branch was created to support the first phase of
> >HDFS erasure coding (HDFS-EC). The objective of HDFS-EC is to
> >significantly reduce storage space usage in HDFS clusters. Instead of
> >always creating 3 replicas of each block with 200% storage space
> >overhead, HDFS-EC provides data durability through parity data blocks.
> >With most EC configurations, the storage overhead is no more than 50%.
> >Based on profiling results of production clusters, we decided to
> >support EC with the striped block layout in the first phase, so that
> >small files can be better handled. This means dividing each logical
> >HDFS file block into smaller units (striping cells) and spreading them
> >on a set of DataNodes in round-robin fashion. Parity cells are
> >generated for each stripe of original data cells. We have made changes
> >to NameNode, client, and DataNode to generalize the block concept and
> >handle the mapping between a logical file block and its internal
> >storage blocks. For further details please see the design doc on
> >HDFS-7285.
> >HADOOP-11264 focuses on providing flexible and high-performance codec
> >calculation support.
> >
> >The nightly Jenkins job of the branch has reported several successful
> >runs, and doesn't show new flaky tests compared with trunk. We have
> >posted several versions of the test plan including both unit testing
> >and cluster testing, and have executed most tests in the plan. The most
> >basic functionalities have been extensively tested and verified in
> >several real clusters with different hardware configurations; results
> >have been very stable. We have created follow-on tasks for more
> >advanced error handling and optimization under the umbrella HDFS-8031.
> >We also plan to implement or harden the integration of EC with existing
> >features such as WebHDFS, snapshot, append, truncate, hflush, hsync,
> >and so forth.
> >
> >Development of this feature has been a collaboration across many
> >companies and institutions. I'd like to thank J. Andreina, Takanobu
> >Asanuma, Vinayakumar B, Li Bo, Takuya Fukudome, Uma Maheswara Rao G,
> >Rui Li, Yi Liu, Colin McCabe, Xinwei Qin, Rakesh R, Gao Rui, Kai
> >Sasaki, Walter Su, Tsz Wo Nicholas Sze, Andrew Wang, Yong Zhang, Jing
> >Zhao, Hui Zheng and Kai Zheng for their code contributions and reviews.
> >Andrew and Kai Zheng also made fundamental contributions to the initial
> >design. Rui Li, Gao Rui, Kai Sasaki, Kai Zheng and many other
> >contributors have made great efforts in system testing. Many thanks go
> >to Weihua Jiang for proposing the JIRA, and ATM, Todd Lipcon, Silvius
> >Rus, Suresh, as well as many others for providing helpful feedbacks.
> >
> >Following the community convention, this vote will last for 7 days
> >(ending September 29th). Votes from Hadoop committers are binding but
> >non-binding votes are very welcome as well. And here's my non-binding +1.
> >
> >Thanks,
> >---
> >Zhe Zhang
>
>

RE: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk

Posted by "Zheng, Kai" <ka...@intel.com>.
Non-binding +1

According to our extensive performance tests, striping + ISA-L coder based erasure coding not only can save storage, but also can increase the throughput of a client or a cluster. It will be a great addition to HDFS and its users. Based on the latest branch codes, we also observed it's very reliable in the concurrent tests. We'll provide the perf test report after it's sorted out and hope it helps. 
Thanks!

Regards,
Kai

-----Original Message-----
From: Gangumalla, Uma [mailto:uma.gangumalla@intel.com] 
Sent: Wednesday, September 23, 2015 8:50 AM
To: hdfs-dev@hadoop.apache.org; common-dev@hadoop.apache.org
Subject: Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk

+1

Great addition to HDFS. Thanks all contributors for the nice work.

Regards,
Uma

On 9/22/15, 3:40 PM, "Zhe Zhang" <zh...@cloudera.com> wrote:

>Hi,
>
>I'd like to propose a vote to merge the HDFS-7285 feature branch back 
>to trunk. Since November 2014 we have been designing and developing 
>this feature under the umbrella JIRAs HDFS-7285 and HADOOP-11264, and 
>have committed approximately 210 patches.
>
>The HDFS-7285 feature branch was created to support the first phase of 
>HDFS erasure coding (HDFS-EC). The objective of HDFS-EC is to 
>significantly reduce storage space usage in HDFS clusters. Instead of 
>always creating 3 replicas of each block with 200% storage space 
>overhead, HDFS-EC provides data durability through parity data blocks. 
>With most EC configurations, the storage overhead is no more than 50%. 
>Based on profiling results of production clusters, we decided to 
>support EC with the striped block layout in the first phase, so that 
>small files can be better handled. This means dividing each logical 
>HDFS file block into smaller units (striping cells) and spreading them 
>on a set of DataNodes in round-robin fashion. Parity cells are 
>generated for each stripe of original data cells. We have made changes 
>to NameNode, client, and DataNode to generalize the block concept and 
>handle the mapping between a logical file block and its internal 
>storage blocks. For further details please see the design doc on 
>HDFS-7285.
>HADOOP-11264 focuses on providing flexible and high-performance codec 
>calculation support.
>
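As a rough illustration of the overhead figures and the round-robin cell layout described in the quoted paragraph above, here is a minimal Python sketch. The 6 data + 3 parity layout, the 64 KB cell size, and all function names are assumptions chosen for illustration; none of them is stated in this message, and this is not the HDFS implementation.

```python
from typing import Tuple


def replication_overhead(replicas: int) -> float:
    """Extra storage, as a fraction of the logical data size, for N-way replication."""
    return float(replicas - 1)


def ec_overhead(data_units: int, parity_units: int) -> float:
    """Extra storage, as a fraction of the logical data size, for an EC layout."""
    return parity_units / data_units


def locate_cell(offset: int, cell_size: int, data_units: int) -> Tuple[int, int, int]:
    """Map a byte offset in a logical block to its striped location.

    Returns (stripe index, internal data block index, offset inside that
    internal block), assuming cells are placed round-robin across the
    data blocks, one cell per block per stripe.
    """
    cell_index = offset // cell_size        # which striping cell holds the byte
    stripe = cell_index // data_units       # which stripe that cell belongs to
    block = cell_index % data_units         # round-robin choice of internal block
    offset_in_block = stripe * cell_size + offset % cell_size
    return stripe, block, offset_in_block


if __name__ == "__main__":
    CELL = 64 * 1024  # assumed cell size, for illustration only
    print("3x replication overhead:", replication_overhead(3))   # 2.0, i.e. 200%
    print("6+3 EC overhead:", ec_overhead(6, 3))                 # 0.5, i.e. 50%
    print("byte at 6 cells + 10:", locate_cell(6 * CELL + 10, CELL, 6))
```

With these assumed parameters, the seventh cell of a logical block wraps around to the first internal data block and starts the second stripe, which is the round-robin behaviour the paragraph describes.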
>The nightly Jenkins job of the branch has reported several successful 
>runs, and doesn't show new flaky tests compared with trunk. We have 
>posted several versions of the test plan including both unit testing 
>and cluster testing, and have executed most tests in the plan. The most 
>basic functionalities have been extensively tested and verified in 
>several real clusters with different hardware configurations; results 
>have been very stable. We have created follow-on tasks for more 
>advanced error handling and optimization under the umbrella HDFS-8031. 
>We also plan to implement or harden the integration of EC with existing 
>features such as WebHDFS, snapshot, append, truncate, hflush, hsync, 
>and so forth.
>
>Development of this feature has been a collaboration across many 
>companies and institutions. I'd like to thank J. Andreina, Takanobu 
>Asanuma, Vinayakumar B, Li Bo, Takuya Fukudome, Uma Maheswara Rao G, 
>Rui Li, Yi Liu, Colin McCabe, Xinwei Qin, Rakesh R, Gao Rui, Kai 
>Sasaki, Walter Su, Tsz Wo Nicholas Sze, Andrew Wang, Yong Zhang, Jing 
>Zhao, Hui Zheng and Kai Zheng for their code contributions and reviews. 
>Andrew and Kai Zheng also made fundamental contributions to the initial 
>design. Rui Li, Gao Rui, Kai Sasaki, Kai Zheng and many other 
>contributors have made great efforts in system testing. Many thanks go 
>to Weihua Jiang for proposing the JIRA, and ATM, Todd Lipcon, Silvius 
>Rus, Suresh, as well as many others for providing helpful feedbacks.
>
>Following the community convention, this vote will last for 7 days 
>(ending September 29th). Votes from Hadoop committers are binding but 
>non-binding votes are very welcome as well. And here's my non-binding +1.
>
>Thanks,
>---
>Zhe Zhang


Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk

Posted by "Gangumalla, Uma" <um...@intel.com>.
+1 

Great addition to HDFS. Thanks to all contributors for the nice work.

Regards,
Uma

On 9/22/15, 3:40 PM, "Zhe Zhang" <zh...@cloudera.com> wrote:

>Hi,
>
>I'd like to propose a vote to merge the HDFS-7285 feature branch back to
>trunk. Since November 2014 we have been designing and developing this
>feature under the umbrella JIRAs HDFS-7285 and HADOOP-11264, and have
>committed approximately 210 patches.
>
>The HDFS-7285 feature branch was created to support the first phase of HDFS
>erasure coding (HDFS-EC). The objective of HDFS-EC is to significantly
>reduce storage space usage in HDFS clusters. Instead of always creating 3
>replicas of each block with 200% storage space overhead, HDFS-EC provides
>data durability through parity data blocks. With most EC configurations,
>the storage overhead is no more than 50%. Based on profiling results of
>production clusters, we decided to support EC with the striped block layout
>in the first phase, so that small files can be better handled. This means
>dividing each logical HDFS file block into smaller units (striping cells)
>and spreading them on a set of DataNodes in round-robin fashion. Parity
>cells are generated for each stripe of original data cells. We have made
>changes to NameNode, client, and DataNode to generalize the block concept
>and handle the mapping between a logical file block and its internal
>storage blocks. For further details please see the design doc on
>HDFS-7285.
>HADOOP-11264 focuses on providing flexible and high-performance codec
>calculation support.
>
>The nightly Jenkins job of the branch has reported several successful runs,
>and doesn't show new flaky tests compared with trunk. We have posted
>several versions of the test plan including both unit testing and cluster
>testing, and have executed most tests in the plan. The most basic
>functionalities have been extensively tested and verified in several real
>clusters with different hardware configurations; results have been very
>stable. We have created follow-on tasks for more advanced error handling
>and optimization under the umbrella HDFS-8031. We also plan to implement or
>harden the integration of EC with existing features such as WebHDFS,
>snapshot, append, truncate, hflush, hsync, and so forth.
>
>Development of this feature has been a collaboration across many companies
>and institutions. I'd like to thank J. Andreina, Takanobu Asanuma,
>Vinayakumar B, Li Bo, Takuya Fukudome, Uma Maheswara Rao G, Rui Li, Yi Liu,
>Colin McCabe, Xinwei Qin, Rakesh R, Gao Rui, Kai Sasaki, Walter Su, Tsz Wo
>Nicholas Sze, Andrew Wang, Yong Zhang, Jing Zhao, Hui Zheng and Kai Zheng
>for their code contributions and reviews. Andrew and Kai Zheng also made
>fundamental contributions to the initial design. Rui Li, Gao Rui, Kai
>Sasaki, Kai Zheng and many other contributors have made great efforts in
>system testing. Many thanks go to Weihua Jiang for proposing the JIRA, and
>ATM, Todd Lipcon, Silvius Rus, Suresh, as well as many others for providing
>helpful feedbacks.
>
>Following the community convention, this vote will last for 7 days (ending
>September 29th). Votes from Hadoop committers are binding but non-binding
>votes are very welcome as well. And here's my non-binding +1.
>
>Thanks,
>---
>Zhe Zhang


Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk

Posted by Andrew Wang <an...@cloudera.com>.
+1

I've been involved in design discussions and reviews on the branch, and I
think we're ready to merge to trunk. Stability has looked good in cluster
testing.

Many thanks to all the contributors for their hard work!

Best,
Andrew

On Tue, Sep 22, 2015 at 3:40 PM, Zhe Zhang <zh...@cloudera.com> wrote:

> Hi,
>
> I'd like to propose a vote to merge the HDFS-7285 feature branch back to
> trunk. Since November 2014 we have been designing and developing this
> feature under the umbrella JIRAs HDFS-7285 and HADOOP-11264, and have
> committed approximately 210 patches.
>
> The HDFS-7285 feature branch was created to support the first phase of HDFS
> erasure coding (HDFS-EC). The objective of HDFS-EC is to significantly
> reduce storage space usage in HDFS clusters. Instead of always creating 3
> replicas of each block with 200% storage space overhead, HDFS-EC provides
> data durability through parity data blocks. With most EC configurations,
> the storage overhead is no more than 50%. Based on profiling results of
> production clusters, we decided to support EC with the striped block layout
> in the first phase, so that small files can be better handled. This means
> dividing each logical HDFS file block into smaller units (striping cells)
> and spreading them on a set of DataNodes in round-robin fashion. Parity
> cells are generated for each stripe of original data cells. We have made
> changes to NameNode, client, and DataNode to generalize the block concept
> and handle the mapping between a logical file block and its internal
> storage blocks. For further details please see the design doc on HDFS-7285.
> HADOOP-11264 focuses on providing flexible and high-performance codec
> calculation support.
>
> The nightly Jenkins job of the branch has reported several successful runs,
> and doesn't show new flaky tests compared with trunk. We have posted
> several versions of the test plan including both unit testing and cluster
> testing, and have executed most tests in the plan. The most basic
> functionalities have been extensively tested and verified in several real
> clusters with different hardware configurations; results have been very
> stable. We have created follow-on tasks for more advanced error handling
> and optimization under the umbrella HDFS-8031. We also plan to implement or
> harden the integration of EC with existing features such as WebHDFS,
> snapshot, append, truncate, hflush, hsync, and so forth.
>
> Development of this feature has been a collaboration across many companies
> and institutions. I'd like to thank J. Andreina, Takanobu Asanuma,
> Vinayakumar B, Li Bo, Takuya Fukudome, Uma Maheswara Rao G, Rui Li, Yi Liu,
> Colin McCabe, Xinwei Qin, Rakesh R, Gao Rui, Kai Sasaki, Walter Su, Tsz Wo
> Nicholas Sze, Andrew Wang, Yong Zhang, Jing Zhao, Hui Zheng and Kai Zheng
> for their code contributions and reviews. Andrew and Kai Zheng also made
> fundamental contributions to the initial design. Rui Li, Gao Rui, Kai
> Sasaki, Kai Zheng and many other contributors have made great efforts in
> system testing. Many thanks go to Weihua Jiang for proposing the JIRA, and
> ATM, Todd Lipcon, Silvius Rus, Suresh, as well as many others for providing
> helpful feedbacks.
>
> Following the community convention, this vote will last for 7 days (ending
> September 29th). Votes from Hadoop committers are binding but non-binding
> votes are very welcome as well. And here's my non-binding +1.
>
> Thanks,
> ---
> Zhe Zhang
>
