Posted to dev@ozone.apache.org by Uma gangumalla <um...@apache.org> on 2022/04/06 05:58:43 UTC

[VOTE] Merge Ozone Erasure Coding branch (HDDS-3816-ec) into master

Dear Ozone Devs,

As you may know, we have been actively developing Ozone Erasure Coding
support in a separate branch HDDS-3816-ec.

We have finished the development of EC key write and read functionality.
Support for offline recovery (recovering replicas after node loss) will be
part of the second phase of work.

Since the code has grown and we are increasingly running into merge
complications, we would like to propose merging the current EC branch into
master.

We have filed a new JIRA (HDDS-6462) for the second phase of work and will
continue the offline recovery work there.

Details on Changes:

   - Most of the EC core logic went into newly extended classes. The key
     changes are in the EC*OutputStream and EC*InputStream classes, for
     write and read respectively. Based on the replication type,
     ECPipelineProvider is chosen for creating EC pipelines.

   - Since the existing replication factor cannot represent EC replication,
     we have introduced ECReplicationConfig. The ReplicationConfig interface
     is already in master, so it is not a new idea coming in through this
     branch merge. What is new here is the ECReplicationConfig class, which
     is used to express an EC replication configuration.

   - We wanted to support enabling EC at the bucket level. To reduce
     complexity, we have moved the default replication configuration from
     the client to the server.

   - The client-side replication type and replication factor have been
     removed from the configuration files, and
     ozone.server.default.replication and
     ozone.server.default.replication.type have been introduced. We continue
     to respect a replication setting that is configured explicitly on the
     client side or passed through the APIs; otherwise the server-side
     bucket-level properties, or failing that the server-side default
     configuration, take effect.

   - Other than this change, the rest of the EC code should not impact any
     of the existing code flows.
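
The precedence described in the list above (explicit client setting, then
bucket-level setting, then server-side default) can be sketched roughly as
follows. This is only an illustration and not the code on the branch: the
ReplicationResolver class, its resolve method, and the string values used for
the settings are all assumptions made for this example.

```java
import java.util.Optional;

/** Illustration only: a hypothetical helper, not a class from the EC branch. */
final class ReplicationResolver {
    private ReplicationResolver() {
    }

    /**
     * Returns the replication setting that takes effect for a key write.
     * An explicit client setting wins; otherwise the bucket-level setting
     * is used; otherwise the server-side default applies.
     */
    static String resolve(Optional<String> clientSetting,
                          Optional<String> bucketSetting,
                          String serverDefault) {
        return clientSetting.orElseGet(() -> bucketSetting.orElse(serverDefault));
    }

    public static void main(String[] args) {
        // Placeholder values (assumptions). A real server would derive its
        // default from ozone.server.default.replication.type and
        // ozone.server.default.replication.
        String serverDefault = "RATIS/THREE";
        Optional<String> ecBucket = Optional.of("EC/3:2");

        // No client setting, EC bucket: the bucket-level setting applies.
        System.out.println(resolve(Optional.empty(), ecBucket, serverDefault));
        // Explicit client setting: it overrides the bucket-level setting.
        System.out.println(resolve(Optional.of("EC/6:3"), ecBucket, serverDefault));
        // Neither client nor bucket setting: the server default applies.
        System.out.println(resolve(Optional.empty(), Optional.empty(), serverDefault));
    }
}
```

Keeping the fallback chain in one place is a design choice of this sketch; it
makes it easy to see that a client which sets nothing simply inherits whatever
the bucket or the server specifies.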


We have completed the documentation JIRA (HDDS-6172) covering this feature,
and we will continue to improve the documentation in master.

Git Branch Name : HDDS-3816-ec

JIRAs: HDDS-3816 and HDDS-5351

Completed tasks: ~ 142

In addition, we are covering the following two mandatory JIRAs:

1. HDDS-6209: EC: [Forward compatibility issue] New client to older server
could fail due to the unavailability for client default replication config

2. HDDS-5909: EC: Onboard EC into upgrade framework.

The PR reviews are in progress and are expected to close in a day or two.

A few other JIRAs under HDDS-3816 are still open, but I believe they are not
blockers for the merge.

In short, here is what you can do now with this feature:

   - You can enable EC at the bucket level and at the cluster level. To
     enable it at the bucket level, create the bucket with the EC
     replication options.
   - You can create EC keys and read them back.
   - You should be able to continue writing even when the chosen nodes are
     failing. (Of course, at least Data+Parity live nodes must be available
     in the cluster to complete the write.)
   - You should be able to read a file back even if a few nodes in the same
     EC block group have failed. (The number of failures must not exceed
     the number of parity nodes.)
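
The two conditions above (at least Data+Parity live nodes for a write, and no
more than Parity failures in a block group for a read) can be written down as
a small check. This is a sketch for illustration only: EcScheme and its
methods are hypothetical names and do not correspond to ECReplicationConfig
or any other class on the branch.

```java
/** Illustration only: a hypothetical model of an EC scheme such as 3:2 or 6:3. */
final class EcScheme {
    private final int data;
    private final int parity;

    EcScheme(int data, int parity) {
        if (data <= 0 || parity <= 0) {
            throw new IllegalArgumentException("data and parity must be positive");
        }
        this.data = data;
        this.parity = parity;
    }

    /** Nodes needed to place one full block group: one per data and parity replica. */
    int requiredNodes() {
        return data + parity;
    }

    /** A write can complete only if at least data + parity nodes are live. */
    boolean canCompleteWrite(int liveNodes) {
        return liveNodes >= requiredNodes();
    }

    /** A read succeeds as long as failures in the block group do not exceed parity. */
    boolean canRead(int failedNodesInGroup) {
        return failedNodesInGroup >= 0 && failedNodesInGroup <= parity;
    }

    public static void main(String[] args) {
        EcScheme scheme = new EcScheme(3, 2);
        // 10 live nodes are available and a 3:2 scheme needs 5 of them.
        System.out.println(scheme.canCompleteWrite(10));
        // 2 failed nodes in the block group, which equals the parity count.
        System.out.println(scheme.canRead(2));
        // 3 failed nodes in the block group, which exceeds the parity count.
        System.out.println(scheme.canRead(3));
    }
}
```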

What is pending? Offline recovery of lost/missing EC containers. As
mentioned above, this work is tracked separately in HDDS-6462 and will
continue after this branch is merged.


Automated acceptance test cases have already been added (HDDS-6231).

In addition to that, we have also performed basic acceptance testing on a
physical cluster:

   1. Installed a 10-node cluster and created an EC bucket (3:2). Uploaded
      a 10GB key, downloaded the same key, and checked the md5sum.
   2. Uploaded an 8GB key, downloaded the same key, and checked the md5sum.
   3. Uploaded a 3MB key, downloaded it, and verified the md5sum.
   4. Changed the bucket to (6:3), uploaded an 8GB key, and downloaded it.
      Also verified that the new key uses the 6:3 policy and that the old
      keys are still 3:2.

We also verified key writes and reads with several different key sizes.
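
For anyone who wants to repeat the upload/download comparison, the checksum
step could look roughly like the following. This is a sketch under the
assumption that the uploaded source file and the downloaded copy are both
available as local files; Md5Check and its method names are made up for this
example, and the upload and download themselves are not shown.

```java
import java.io.IOException;
import java.io.InputStream;
import java.math.BigInteger;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Illustration only: compares the MD5 of an uploaded file and its downloaded copy. */
final class Md5Check {
    private Md5Check() {
    }

    /** Streams the file through MD5 so that multi-GB keys are not loaded into memory. */
    static String md5Hex(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("MD5");
        byte[] buffer = new byte[1 << 16];
        try (InputStream in = Files.newInputStream(file)) {
            int read;
            while ((read = in.read(buffer)) != -1) {
                md.update(buffer, 0, read);
            }
        }
        // Zero-pad to 32 hex characters, since BigInteger drops leading zeros.
        return String.format("%032x", new BigInteger(1, md.digest()));
    }

    /** True when both files have the same MD5 digest. */
    static boolean sameContent(Path uploaded, Path downloaded)
            throws IOException, NoSuchAlgorithmException {
        return md5Hex(uploaded).equals(md5Hex(downloaded));
    }

    public static void main(String[] args) throws IOException, NoSuchAlgorithmException {
        if (args.length != 2) {
            System.err.println("usage: Md5Check <uploaded-file> <downloaded-file>");
            return;
        }
        boolean same = sameContent(Paths.get(args[0]), Paths.get(args[1]));
        System.out.println(same ? "md5 match" : "md5 MISMATCH");
    }
}
```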

Merge checklist items assessment is here:
https://cwiki.apache.org/confluence/display/OZONE/Ozone+EC+Branch%28HDDS-3816-ec%29+Phase-1+%3A+Merge+Checklist

Big shoutout to Stephen O'Donnell <so...@cloudera.com> and Istvan Fajth
<pi...@cloudera.com> for their great efforts in core development, and
thanks a lot to Sammi, Mingchao Zhao, Mark Gui, and Kaijie for collaborating
on some of the EC tasks.

Thanks to Marton for the design discussions and for some of the dev tasks as
well.

Thanks to the many others who were involved in design discussions: Arpit,
Sidd, Jitendra, Mukul, Sanjay, Karthik, Bharat, Nanda, Shashi, Prashanth,
Rakesh, and Yiqun Lin.
Sorry if I have missed anyone here; your efforts are much appreciated.
Without your tremendous help, we would not have reached this point yet.

If there are no objections to the merge, I will start the official vote
later.

Regards,

EC Branch Devs

Re: [VOTE] Merge Ozone Erasure Coding branch (HDDS-3816-ec) into master

Posted by Neil Joshi <ne...@gmail.com>.
Thanks for the great work.  +1 vote to merge.

Regards,
Neil
Neil Joshi

On Fri, Apr 8, 2022 at 5:46 PM Shashikant Banerjee <sh...@gmail.com>
wrote:

> +1 for the merge
>
> Thanks
> Shashi
>
> On Sat, Apr 9, 2022, 12:01 AM Hanisha Koneru <hkoneru@cloudera.com.invalid>
> wrote:
>
> > +1 for merge. Thanks for the great work on this.
> >
> > Thanks
> > Hanisha
> >
> > > On Apr 7, 2022, at 2:18 AM, jackson yao <ja...@gmail.com> wrote:
> > >
> > > thanks for the great work! i am +1 for merging (non-binding)
> > >
> > > mingchao zhao <ca...@apache.org> wrote on Thu, Apr 7, 2022 at 14:18:
> > >
> > >> +1 for the merge. Thanks
> > >>
> > >> Mukul Kumar Singh <mk...@gmail.com> wrote on Thu, Apr 7, 2022 at 14:05:
> > >>
> > >>> +1 for the merge.
> > >>>
> > >>>
> > >>> Thanks Lokesh
> > >>>
> > >>> On 07/04/22 11:29 am, Lokesh Jain wrote:
> > >>>> +1 for merge
> > >>>>
> > >>>> Thanks
> > >>>> Lokesh
> > >>>>
> > >>>>> On 07-Apr-2022, at 11:06 AM, Tsz Wo Sze <sz...@gmail.com> wrote:
> > >>>>>
> > >>>>> +1
> > >>>>> We should merge it so that more people can try it.  We can work on the
> > >>>>> remaining tasks in the master branch.  Thanks a lot!
> > >>>>>
> > >>>>> Tsz-Wo
> > >>>>>
> > >>>>> On Thu, Apr 7, 2022 at 1:17 PM Aravindan Vijayan
> > >>>>> <av...@cloudera.com.invalid> wrote:
> > >>>>>
> > >>>>>> +1 for the merge. Thanks for the great work!
> > >>>>>>
> > >>>>>>
> > >>>>>>
> > >>>>>> On Wed, Apr 6, 2022 at 9:45 PM Prashant Pogde <ppogde@cloudera.com.invalid>
> > >>>>>> wrote:
> > >>>>>>
> > >>>>>>> +1 for the EC branch merge.
> > >>>>>>>
> > >>>>>>> Regards,
> > >>>>>>> Prashant
> > >>>>>>>
> > >>>>>>>> On Apr 6, 2022, at 8:20 PM, Siddharth Wagle <sw...@apache.org> wrote:
> > >>>>>>>>
> > >>>>>>>> +1 for the EC branch merge.
> > >>>>>>>>
> > >>>>>>>> Best,
> > >>>>>>>> Sid
> > >>>>>>>>
> > >>>>>>>> On Wed, Apr 6, 2022 at 8:05 PM guimark <gu...@126.com> wrote:
> > >>>>>>>>
> > >>>>>>>>> Great news!
> > >>>>>>>>> +1 to merge.
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>> At 2022-04-06 22:18:31, "Stephen O'Donnell" <sodonnell@cloudera.com.INVALID>
> > >>>>>>>>> wrote:
> > >>>>>>>>>> I have been working on the code on this branch for some time, and I
> > >>>>>>>>>> believe it is in a good state to merge now. It is mostly new code, and
> > >>>>>>>>>> if nothing attempts to use EC, none of the EC code paths will be
> > >>>>>>>>>> executed.
> > >>>>>>>>>>
> > >>>>>>>>>> +1 to merge from me.
> > >>>>>>>>>>
> > >>>>>>>>>> Stephen.
> > >>>>>>>>>>
> > >>>>>>>>>> On Wed, Apr 6, 2022 at 7:11 AM Uma gangumalla <umamahesh@apache.org>
> > >>>>>>>>>> wrote:
> > >>>>>>>>>>> =====Few Edits Below===================
> > >>>>>>>>>>>
> > >>>>>>>>>>> Dear Ozone Devs,
> > >>>>>>>>>>>
> > >>>>>>>>>>> As you may know, we have been actively developing Ozone
> Erasure
> > >>>>>> Coding
> > >>>>>>>>>>> support in a separate branch HDDS-3816-ec.
> > >>>>>>>>>>>
> > >>>>>>>>>>> We have finished the development of EC key write and read
> > >>>>>>> functionality.
> > >>>>>>>>>>> The support of offline recovery( Recovering replica from node
> > >>> loss)
> > >>>>>>>>> will be
> > >>>>>>>>>>> part of second phase work.
> > >>>>>>>>>>>
> > >>>>>>>>>>> Since the code has already grown and increasingly started
> > seeing
> > >>>>>> merge
> > >>>>>>>>>>> complications, we would like to merge the current EC branch
> > into
> > >>>>>>> master.
> > >>>>>>>>>>> We filed the new JIRA(HDDS-6462) for the second phase of work
> > >> and
> > >>>>>>>>> continued
> > >>>>>>>>>>> the offline recovery work there. (we have uploaded the design
> > >> doc
> > >>>>>>> there)
> > >>>>>>>>>>> Details on Changes:
> > >>>>>>>>>>>
> > >>>>>>>>>>>  -
> > >>>>>>>>>>>
> > >>>>>>>>>>>  Most of the EC core logic went to newly extended classes.
> Key
> > >>>>>>> changes
> > >>>>>>>>>>>  went into EC*OutputStream and EC*InputStream classes for
> write
> > >>> and
> > >>>>>>>>> read
> > >>>>>>>>>>>  respectively. Based on replication type, ECPipelineProvider
> > >> will
> > >>>>>> be
> > >>>>>>>>>>> chosen
> > >>>>>>>>>>>  for creating EC pipelines.
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>>  -
> > >>>>>>>>>>>
> > >>>>>>>>>>>  Since we cannot represent the EC replication in the existing
> > >>>>>>>>> replication
> > >>>>>>>>>>>  factor, we have introduced ECReplicationConfig. The
> > >>>>>>> ReplicationConfig
> > >>>>>>>>>>>  interface is already pushed to master, so it’s not a new
> idea
> > >>>>>> coming
> > >>>>>>>>>>>  through this branch merge now. What is newly coming here is
> > >> the
> > >>>>>>>>>>>  ECReplicationConfig class which can be used to express EC
> > >>>>>>> replication
> > >>>>>>>>>>>  configuration.
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>>  -
> > >>>>>>>>>>>
> > >>>>>>>>>>>  We wanted to provide the support to enable EC at bucket
> level.
> > >>> To
> > >>>>>>>>>>>  simplify some complications, we have moved the default
> > >>> replication
> > >>>>>>>>>>>  configurations from client to server.
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>>  -
> > >>>>>>>>>>>
> > >>>>>>>>>>>  Client side replication type and replication factor removed
> > >> from
> > >>>>>> the
> > >>>>>>>>>>>  configuration files and introduced the
> > >>>>>>>>> ozone.server.default.replication
> > >>>>>>>>>>>  and ozone.server.default.replication.type.We would continue
> to
> > >>>>>>>>> respect
> > >>>>>>>>>>> if
> > >>>>>>>>>>>  one configures at client side explicitly or passed through
> > >> APIs,
> > >>>>>>>>>>> otherwise
> > >>>>>>>>>>>  server side bucket level properties or server side default
> > >>>>>>>>> configuration
> > >>>>>>>>>>>  would take effect.
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>>  -
> > >>>>>>>>>>>
> > >>>>>>>>>>>  Other than this change, the rest of EC side code should not
> > >>> impact
> > >>>>>>>>> any
> > >>>>>>>>>>>  of the existing code flows.
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>> We have finished documentation JIRA(HDDS-6172) for covering
> > this
> > >>>>>>> feature
> > >>>>>>>>>>> and we will continue to improve further in master.
> > >>>>>>>>>>>
> > >>>>>>>>>>> Git Branch Name : HDDS-3816-ec
> > >>>>>>>>>>>
> > >>>>>>>>>>> JIRAs: HDDS-3816 and HDDS-5351
> > >>>>>>>>>>>
> > >>>>>>>>>>> Completed tasks: ~ 142
> > >>>>>>>>>>>
> > >>>>>>>>>>> + We are covering the following two mandatory JIRAs to come
> in:
> > >>>>>>>>>>>
> > >>>>>>>>>>> 1. HDDS-6209: EC: [Forward compatibility issue] New client to
> > >>> older
> > >>>>>>>>> server
> > >>>>>>>>>>> could fail due to the unavailability for client default
> > >>> replication
> > >>>>>>>>> config
> > >>>>>>>>>>> 2. HDDS-5909: EC: Onboard EC into upgrade framework.
> > >>>>>>>>>>>
> > >>>>>>>>>>> PRs reviews in-progress and expected to close in a day or
> two.
> > >>>>>>>>>>>
> > >>>>>>>>>>> Few other JIRAs in HDDS-3816 are still open but I believe
> > >> they're
> > >>>>>> not
> > >>>>>>>>>>> blockers for merge.
> > >>>>>>>>>>>
> > >>>>>>>>>>> In short what you can do now with this feature:
> > >>>>>>>>>>>
> > >>>>>>>>>>>  -
> > >>>>>>>>>>>
> > >>>>>>>>>>>  You can enable EC at bucket level and cluster level.
> > >>>>>>>>>>>
> > >>>>>>>>>>> How to enable it at bucket level? Just create the bucket by
> > >>> passing
> > >>>>>>> the
> > >>>>>>>>> ec
> > >>>>>>>>>>> replication options.
> > >>>>>>>>>>>
> > >>>>>>>>>>>  -
> > >>>>>>>>>>>
> > >>>>>>>>>>>  You can create EC keys and read the same back.
> > >>>>>>>>>>>  -
> > >>>>>>>>>>>
> > >>>>>>>>>>>  You should be able to continue writing even when chosen
> nodes
> > >>> are
> > >>>>>>>>>>>  failing. (Of Course minimum of Data+Parity live nodes should
> > >> be
> > >>>>>>>>>>> available
> > >>>>>>>>>>>  in cluster for complete the write)
> > >>>>>>>>>>>  -
> > >>>>>>>>>>>
> > >>>>>>>>>>>  You should be able to read the file back even if a few nodes
> > >>>>>> failed
> > >>>>>>>>> in
> > >>>>>>>>>>>  the same ec block group(Failures should not be more than
> > >> parity
> > >>>>>>>>> number
> > >>>>>>>>>>> of
> > >>>>>>>>>>>  nodes.).
> > >>>>>>>>>>>
> > >>>>>>>>>>> What is pending? Offline recovery of lost/missing EC
> > containers.
> > >>> As
> > >>>>>>>>>>> mentioned above, post merge of this branch, I will create a
> > >>> separate
> > >>>>>>>>> JIRA
> > >>>>>>>>>>> for starting the work for OfflineRecovery.
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>> There are automated acceptance test cases already added.
> > >> HDDS-6231
> > >>>>>>>>>>>
> > >>>>>>>>>>> In addition to that, we have also performed basic Acceptance
> > >>> Testing
> > >>>>>>> in
> > >>>>>>>>>>> physical cluster:
> > >>>>>>>>>>>
> > >>>>>>>>>>>  1.
> > >>>>>>>>>>>
> > >>>>>>>>>>>  Installed 10 nodes cluster and created EC bucket (3:2).
> > >>>>>>>>>>>
> > >>>>>>>>>>> Uploaded 10GB key.
> > >>>>>>>>>>>
> > >>>>>>>>>>> Downloaded the same key and checked the md5sum.
> > >>>>>>>>>>>
> > >>>>>>>>>>>  1.
> > >>>>>>>>>>>
> > >>>>>>>>>>>  Uploaded 8GB key.
> > >>>>>>>>>>>
> > >>>>>>>>>>> Downloaded the same key and checked the md5sum.
> > >>>>>>>>>>>
> > >>>>>>>>>>>  1.
> > >>>>>>>>>>>
> > >>>>>>>>>>>  Uploaded 3MB key
> > >>>>>>>>>>>
> > >>>>>>>>>>> Downloaded the same and verified md5sum.
> > >>>>>>>>>>>
> > >>>>>>>>>>>  1.
> > >>>>>>>>>>>
> > >>>>>>>>>>>  Changed bucket to (6:3)
> > >>>>>>>>>>>
> > >>>>>>>>>>> Uploaded 8GB key
> > >>>>>>>>>>>
> > >>>>>>>>>>> Download the same.
> > >>>>>>>>>>>
> > >>>>>>>>>>> Also verified the new key should be in 6:3 policy and old
> keys
> > >>> must
> > >>>>>> be
> > >>>>>>>>>>> 3:2.Verified
> > >>>>>>>>>>> with several different size key writes and reads.
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>> Since the merge discussion thread, we have well stabilized code and
> > >>>>>>>>>>> fixed several bugs.
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>> Merge checklist items assessment is here:
> > >>>>>>>>>>> https://cwiki.apache.org/confluence/display/OZONE/Ozone+EC+Branch%28HDDS-3816-ec%29+Phase-1+%3A+Merge+Checklist
> > >>>>>>>>>>> Big shoutout to Stephen O'Donnell <so...@cloudera.com>,
> > >>> Istvan
> > >>>>>>>>> Fajth
> > >>>>>>>>>>> <pi...@cloudera.com> for great efforts in core development
> and
> > >>> also
> > >>>>>>>>> thanks
> > >>>>>>>>>>> a lot  to Sammi, Mingchao Zhao, Mark Gui, Kaijie, Attila for
> > >>>>>>>>> collaborating
> > >>>>>>>>>>> on some of the EC tasks.
> > >>>>>>>>>>>
> > >>>>>>>>>>> Thanks to Marton for design discussion and on some dev tasks
> as
> > >>>>>> well.
> > >>>>>>>>>>> Thanks to many others who were involved in design
> discussions,
> > >>>>>> Arpit,
> > >>>>>>>>> Sidd,
> > >>>>>>>>>>> Jitendra, Mukul, Sanjay, Karthik, Bharat, Nanda, Shashi,
> > >>> Prashanth,
> > >>>>>>>>> Rakesh,
> > >>>>>>>>>>> Yiqun Lin.
> > >>>>>>>>>>> Sorry if I miss anyone here, but your efforts are much
> > >>> appreciated.
> > >>>>>>>>> Without
> > >>>>>>>>>>> your tremendous help, we would have not reached this position
> > >> yet.
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>> To start with, here is my +1
> > >>>>>>>>>>>
> > >>>>>>>>>>> The vote will run for 5 days.
> > >>>>>>>>>>>
> > >>>>>>>>>>> Regards,
> > >>>>>>>>>>> Uma
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>
> > >>>>>>>
> > >> ---------------------------------------------------------------------
> > >>>>>>> To unsubscribe, e-mail: dev-unsubscribe@ozone.apache.org
> > >>>>>>> For additional commands, e-mail: dev-help@ozone.apache.org
> > >>>>>>>
> > >>>>>>>
> > >>>>>> --
> > >>>>>> Thanks & Regards,
> > >>>>>> Aravindan
> > >>>>>>


-- 
NJ

Re: [VOTE] Merge Ozone Erasure Coding branch (HDDS-3816-ec) into master

Posted by Shashikant Banerjee <sh...@gmail.com>.
+1 for the merge

Thanks
Shashi

On Sat, Apr 9, 2022, 12:01 AM Hanisha Koneru <hk...@cloudera.com.invalid>
wrote:

> +1 for merge. Thanks for the great work on this.
>
> Thanks
> Hanisha
>
> > On Apr 7, 2022, at 2:18 AM, jackson yao <ja...@gmail.com> wrote:
> >
> > thanks for the great work! i am +1 for merging (non-binding)
> >
> > mingchao zhao <ca...@apache.org> wrote on Thu, Apr 7, 2022 at 14:18:
> >
> >> +1 for the merge. Thanks
> >>
> >> Mukul Kumar Singh <mk...@gmail.com> wrote on Thu, Apr 7, 2022 at 14:05:
> >>
> >>> +1 for the merge.
> >>>
> >>>
> >>> Thanks Lokesh
> >>>
> >>> On 07/04/22 11:29 am, Lokesh Jain wrote:
> >>>> +1 for merge
> >>>>
> >>>> Thanks
> >>>> Lokesh
> >>>>
> >>>>> On 07-Apr-2022, at 11:06 AM, Tsz Wo Sze <sz...@gmail.com> wrote:
> >>>>>
> >>>>> +1
> >>>>> We should merge it so that more people can try it.  We can work on the
> >>>>> remaining tasks in the master branch.  Thanks a lot!
> >>>>>
> >>>>> Tsz-Wo
> >>>>>
> >>>>> On Thu, Apr 7, 2022 at 1:17 PM Aravindan Vijayan
> >>>>> <av...@cloudera.com.invalid> wrote:
> >>>>>
> >>>>>> +1 for the merge. Thanks for the great work!
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On Wed, Apr 6, 2022 at 9:45 PM Prashant Pogde <ppogde@cloudera.com.invalid>
> >>>>>> wrote:
> >>>>>>
> >>>>>>> +1 for the EC branch merge.
> >>>>>>>
> >>>>>>> Regards,
> >>>>>>> Prashant
> >>>>>>>
> >>>>>>>> On Apr 6, 2022, at 8:20 PM, Siddharth Wagle <sw...@apache.org> wrote:
> >>>>>>>>
> >>>>>>>> +1 for the EC branch merge.
> >>>>>>>>
> >>>>>>>> Best,
> >>>>>>>> Sid
> >>>>>>>>
> >>>>>>>> On Wed, Apr 6, 2022 at 8:05 PM guimark <gu...@126.com> wrote:
> >>>>>>>>
> >>>>>>>>> Great news!
> >>>>>>>>> +1 to merge.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> At 2022-04-06 22:18:31, "Stephen O'Donnell" <sodonnell@cloudera.com.INVALID>
> >>>>>>>>> wrote:
> >>>>>>>>>> I have been working on the code on this branch for some time, and I
> >>>>>>>>>> believe it is in a good state to merge now. It is mostly new code, and
> >>>>>>>>>> if nothing attempts to use EC, none of the EC code paths will be
> >>>>>>>>>> executed.
> >>>>>>>>>>
> >>>>>>>>>> +1 to merge from me.
> >>>>>>>>>>
> >>>>>>>>>> Stephen.
> >>>>>>>>>>
> >>>>>>>>>> On Wed, Apr 6, 2022 at 7:11 AM Uma gangumalla <umamahesh@apache.org>
> >>>>>>>>>> wrote:
> >>>>>>>>>>> =====Few Edits Below===================
> >>>>>>>>>>>
> >>>>>>>>>>> Dear Ozone Devs,
> >>>>>>>>>>>
> >>>>>>>>>>> As you may know, we have been actively developing Ozone Erasure
> >>>>>> Coding
> >>>>>>>>>>> support in a separate branch HDDS-3816-ec.
> >>>>>>>>>>>
> >>>>>>>>>>> We have finished the development of EC key write and read
> >>>>>>> functionality.
> >>>>>>>>>>> The support of offline recovery( Recovering replica from node
> >>> loss)
> >>>>>>>>> will be
> >>>>>>>>>>> part of second phase work.
> >>>>>>>>>>>
> >>>>>>>>>>> Since the code has already grown and increasingly started
> seeing
> >>>>>> merge
> >>>>>>>>>>> complications, we would like to merge the current EC branch
> into
> >>>>>>> master.
> >>>>>>>>>>> We filed the new JIRA(HDDS-6462) for the second phase of work
> >> and
> >>>>>>>>> continued
> >>>>>>>>>>> the offline recovery work there. (we have uploaded the design
> >> doc
> >>>>>>> there)
> >>>>>>>>>>> Details on Changes:
> >>>>>>>>>>>
> >>>>>>>>>>>  -
> >>>>>>>>>>>
> >>>>>>>>>>>  Most of the EC core logic went to newly extended classes. Key
> >>>>>>> changes
> >>>>>>>>>>>  went into EC*OutputStream and EC*InputStream classes for write
> >>> and
> >>>>>>>>> read
> >>>>>>>>>>>  respectively. Based on replication type, ECPipelineProvider
> >> will
> >>>>>> be
> >>>>>>>>>>> chosen
> >>>>>>>>>>>  for creating EC pipelines.
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>  -
> >>>>>>>>>>>
> >>>>>>>>>>>  Since we cannot represent the EC replication in the existing
> >>>>>>>>> replication
> >>>>>>>>>>>  factor, we have introduced ECReplicationConfig. The
> >>>>>>> ReplicationConfig
> >>>>>>>>>>>  interface is already pushed to master, so it’s not a new idea
> >>>>>> coming
> >>>>>>>>>>>  through this branch merge now. What is newly coming here is
> >> the
> >>>>>>>>>>>  ECReplicationConfig class which can be used to express EC
> >>>>>>> replication
> >>>>>>>>>>>  configuration.
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>  -
> >>>>>>>>>>>
> >>>>>>>>>>>  We wanted to provide the support to enable EC at bucket level.
> >>> To
> >>>>>>>>>>>  simplify some complications, we have moved the default
> >>> replication
> >>>>>>>>>>>  configurations from client to server.
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>  -
> >>>>>>>>>>>
> >>>>>>>>>>>  Client side replication type and replication factor removed
> >> from
> >>>>>> the
> >>>>>>>>>>>  configuration files and introduced the
> >>>>>>>>> ozone.server.default.replication
> >>>>>>>>>>>  and ozone.server.default.replication.type.We would continue to
> >>>>>>>>> respect
> >>>>>>>>>>> if
> >>>>>>>>>>>  one configures at client side explicitly or passed through
> >> APIs,
> >>>>>>>>>>> otherwise
> >>>>>>>>>>>  server side bucket level properties or server side default
> >>>>>>>>> configuration
> >>>>>>>>>>>  would take effect.
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>  -
> >>>>>>>>>>>
> >>>>>>>>>>>  Other than this change, the rest of EC side code should not
> >>> impact
> >>>>>>>>> any
> >>>>>>>>>>>  of the existing code flows.
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> We have finished documentation JIRA(HDDS-6172) for covering this
> >>>>>>>>>>> feature and we will continue to improve further in master.
> >>>>>>>>>>>
> >>>>>>>>>>> Git Branch Name : HDDS-3816-ec
> >>>>>>>>>>>
> >>>>>>>>>>> JIRAs: HDDS-3816 and HDDS-5351
> >>>>>>>>>>>
> >>>>>>>>>>> Completed tasks: ~ 142
> >>>>>>>>>>>
> >>>>>>>>>>> + We are covering the following two mandatory JIRAs to come in:
> >>>>>>>>>>>
> >>>>>>>>>>> 1. HDDS-6209: EC: [Forward compatibility issue] New client to older
> >>>>>>>>>>> server could fail due to the unavailability for client default
> >>>>>>>>>>> replication config
> >>>>>>>>>>> 2. HDDS-5909: EC: Onboard EC into upgrade framework.
> >>>>>>>>>>>
> >>>>>>>>>>> PRs reviews in-progress and expected to close in a day or two.
> >>>>>>>>>>>
> >>>>>>>>>>> Few other JIRAs in HDDS-3816 are still open but I believe they're
> >>>>>>>>>>> not blockers for merge.
> >>>>>>>>>>>
> >>>>>>>>>>> In short what you can do now with this feature:
> >>>>>>>>>>>
> >>>>>>>>>>>  -
> >>>>>>>>>>>
> >>>>>>>>>>>  You can enable EC at bucket level and cluster level.
> >>>>>>>>>>>
> >>>>>>>>>>> How to enable it at bucket level? Just create the bucket by passing
> >>>>>>>>>>> the ec replication options.
> >>>>>>>>>>>
> >>>>>>>>>>>  -
> >>>>>>>>>>>
> >>>>>>>>>>>  You can create EC keys and read the same back.
> >>>>>>>>>>>  -
> >>>>>>>>>>>
> >>>>>>>>>>>  You should be able to continue writing even when chosen nodes are
> >>>>>>>>>>>  failing. (Of Course minimum of Data+Parity live nodes should be
> >>>>>>>>>>>  available in cluster for complete the write)
> >>>>>>>>>>>  -
> >>>>>>>>>>>
> >>>>>>>>>>>  You should be able to read the file back even if a few nodes failed
> >>>>>>>>>>>  in the same ec block group(Failures should not be more than parity
> >>>>>>>>>>>  number of nodes.).
> >>>>>>>>>>>
> >>>>>>>>>>> What is pending? Offline recovery of lost/missing EC containers. As
> >>>>>>>>>>> mentioned above, post merge of this branch, I will create a separate
> >>>>>>>>>>> JIRA for starting the work for OfflineRecovery.
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> There are automated acceptance test cases already added. HDDS-6231
> >>>>>>>>>>>
> >>>>>>>>>>> In addition to that, we have also performed basic Acceptance Testing
> >>>>>>>>>>> in physical cluster:
> >>>>>>>>>>>
> >>>>>>>>>>>  1.
> >>>>>>>>>>>
> >>>>>>>>>>>  Installed 10 nodes cluster and created EC bucket (3:2).
> >>>>>>>>>>>
> >>>>>>>>>>> Uploaded 10GB key.
> >>>>>>>>>>>
> >>>>>>>>>>> Downloaded the same key and checked the md5sum.
> >>>>>>>>>>>
> >>>>>>>>>>>  2.
> >>>>>>>>>>>
> >>>>>>>>>>>  Uploaded 8GB key.
> >>>>>>>>>>>
> >>>>>>>>>>> Downloaded the same key and checked the md5sum.
> >>>>>>>>>>>
> >>>>>>>>>>>  3.
> >>>>>>>>>>>
> >>>>>>>>>>>  Uploaded 3MB key
> >>>>>>>>>>>
> >>>>>>>>>>> Downloaded the same and verified md5sum.
> >>>>>>>>>>>
> >>>>>>>>>>>  4.
> >>>>>>>>>>>
> >>>>>>>>>>>  Changed bucket to (6:3)
> >>>>>>>>>>>
> >>>>>>>>>>> Uploaded 8GB key
> >>>>>>>>>>>
> >>>>>>>>>>> Download the same.
> >>>>>>>>>>>
> >>>>>>>>>>> Also verified the new key should be in 6:3 policy and old keys must
> >>>>>>>>>>> be 3:2. Verified with several different size key writes and reads.
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> Since the merge discussion thread, we have well stabilized code and
> >>>>>>>>>>> fixed several bugs.
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> Merge checklist items assessment is here:
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> https://cwiki.apache.org/confluence/display/OZONE/Ozone+EC+Branch%28HDDS-3816-ec%29+Phase-1+%3A+Merge+Checklist
> >>>>>>>>>>> Big shoutout to Stephen O'Donnell <so...@cloudera.com>, Istvan Fajth
> >>>>>>>>>>> <pi...@cloudera.com> for great efforts in core development and also
> >>>>>>>>>>> thanks a lot to Sammi, Mingchao Zhao, Mark Gui, Kaijie, Attila for
> >>>>>>>>>>> collaborating on some of the EC tasks.
> >>>>>>>>>>>
> >>>>>>>>>>> Thanks to Marton for design discussion and on some dev tasks as well.
> >>>>>>>>>>> Thanks to many others who were involved in design discussions, Arpit,
> >>>>>>>>>>> Sidd, Jitendra, Mukul, Sanjay, Karthik, Bharat, Nanda, Shashi,
> >>>>>>>>>>> Prashanth, Rakesh, Yiqun Lin.
> >>>>>>>>>>> Sorry if I miss anyone here, but your efforts are much appreciated.
> >>>>>>>>>>> Without your tremendous help, we would have not reached this position
> >>>>>>>>>>> yet.
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> To start with, here is my +1
> >>>>>>>>>>>
> >>>>>>>>>>> The vote will run for 5 days.
> >>>>>>>>>>>
> >>>>>>>>>>> Regards,
> >>>>>>>>>>> Uma
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>
> >>>>>>>
> >> ---------------------------------------------------------------------
> >>>>>>> To unsubscribe, e-mail: dev-unsubscribe@ozone.apache.org
> >>>>>>> For additional commands, e-mail: dev-help@ozone.apache.org
> >>>>>>>
> >>>>>>>
> >>>>>> --
> >>>>>> Thanks & Regards,
> >>>>>> Aravindan
> >>>>>>
>
>
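The replication precedence described in the quoted announcement (an explicit client-side setting or API argument is respected first, otherwise the server-side bucket-level property applies, otherwise the server-side default) can be sketched roughly as follows. This is only an illustration in Python, not Ozone's implementation: the function name, parameter names, and the placeholder strings such as "ec-3-2" are assumptions made for the example.

```python
from typing import Optional


def resolve_replication(client_value: Optional[str],
                        bucket_default: Optional[str],
                        server_default: str) -> str:
    """Pick the replication setting that takes effect for a new key.

    Hypothetical helper: the order mirrors the announcement, where an
    explicit client/API value wins, then the bucket-level property,
    then the server-side default.
    """
    if client_value is not None:
        return client_value
    if bucket_default is not None:
        return bucket_default
    return server_default


if __name__ == "__main__":
    # Placeholder values, not real Ozone configuration strings.
    print(resolve_replication(None, "ec-3-2", "replicated-3"))
```

In this sketch a key written without any client-side setting into a bucket created with an EC policy picks up the bucket's policy, which matches the bucket-level behaviour the announcement describes.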

Re: [VOTE] Merge Ozone Erasure Coding branch (HDDS-3816-ec) into master

Posted by Hanisha Koneru <hk...@cloudera.com.INVALID>.
+1 for merge. Thanks for the great work on this.

Thanks
Hanisha

> On Apr 7, 2022, at 2:18 AM, jackson yao <ja...@gmail.com> wrote:
> 
> thanks for the great work! i am +1 for merging (non-binding)
> 
> mingchao zhao <ca...@apache.org> wrote on Thu, Apr 7, 2022 at 14:18:
> 
>> +1 for the merge. Thanks
>> 
>> Mukul Kumar Singh <mk...@gmail.com> wrote on Thu, Apr 7, 2022 at 14:05:
>> 
>>> +1 for the merge.
>>> 
>>> 
>>> Thanks Lokesh
>>> 
>>> On 07/04/22 11:29 am, Lokesh Jain wrote:
>>>> +1 for merge
>>>> 
>>>> Thanks
>>>> Lokesh
>>>> 
>>>>> On 07-Apr-2022, at 11:06 AM, Tsz Wo Sze <sz...@gmail.com> wrote:
>>>>> 
>>>>> +1
>>>>> We should merge it so that more people can try it.  We can work on the
>>>>> remaining tasks in the master branch.  Thanks a lot!
>>>>> 
>>>>> Tsz-Wo
>>>>> 
>>>>> On Thu, Apr 7, 2022 at 1:17 PM Aravindan Vijayan
>>>>> <av...@cloudera.com.invalid> wrote:
>>>>> 
>>>>>> +1 for the merge. Thanks for the great work!
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On Wed, Apr 6, 2022 at 9:45 PM Prashant Pogde <ppogde@cloudera.com.invalid>
>>>>>> wrote:
>>>>>> 
>>>>>>> +1 for the EC branch merge.
>>>>>>> 
>>>>>>> Regards,
>>>>>>> Prashant
>>>>>>> 
>>>>>>>> On Apr 6, 2022, at 8:20 PM, Siddharth Wagle <sw...@apache.org> wrote:
>>>>>>>> 
>>>>>>>> +1 for the EC branch merge.
>>>>>>>> 
>>>>>>>> Best,
>>>>>>>> Sid
>>>>>>>> 
>>>>>>>> On Wed, Apr 6, 2022 at 8:05 PM guimark <gu...@126.com> wrote:
>>>>>>>> 
>>>>>>>>> Great news!
>>>>>>>>> +1 to merge.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> At 2022-04-06 22:18:31, "Stephen O'Donnell" <sodonnell@cloudera.com.INVALID>
>>>>>>>>> wrote:
>>>>>>>>>> I have been working on the code on this branch for some time, and I
>>>>>>>>>> believe it is in a good state to merge now. It is mostly new code, and
>>>>>>>>>> if nothing attempts to use EC, none of the EC code paths will be
>>>>>>>>>> executed.
>>>>>>>>>> 
>>>>>>>>>> +1 to merge from me.
>>>>>>>>>> 
>>>>>>>>>> Stephen.
>>>>>>>>>> 
>>>>>>>>>> On Wed, Apr 6, 2022 at 7:11 AM Uma gangumalla <umamahesh@apache.org>
>>>>>>>>>> wrote:
>>>>>>>>>>> =====Few Edits Below===================
>>>>>>>>>>> 
>>>>>>>>>>> Dear Ozone Devs,
>>>>>>>>>>> 
>>>>>>>>>>> As you may know, we have been actively developing Ozone Erasure Coding
>>>>>>>>>>> support in a separate branch HDDS-3816-ec.
>>>>>>>>>>>
>>>>>>>>>>> We have finished the development of EC key write and read functionality.
>>>>>>>>>>> The support of offline recovery( Recovering replica from node loss) will
>>>>>>>>>>> be part of second phase work.
>>>>>>>>>>>
>>>>>>>>>>> Since the code has already grown and increasingly started seeing merge
>>>>>>>>>>> complications, we would like to merge the current EC branch into master.
>>>>>>>>>>>
>>>>>>>>>>> We filed the new JIRA(HDDS-6462) for the second phase of work and
>>>>>>>>>>> continued the offline recovery work there. (we have uploaded the design
>>>>>>>>>>> doc there)
>>>>>>>>>>>
>>>>>>>>>>> Details on Changes:
>>>>>>>>>>> 
>>>>>>>>>>>  -
>>>>>>>>>>> 
>>>>>>>>>>>  Most of the EC core logic went to newly extended classes. Key changes
>>>>>>>>>>>  went into EC*OutputStream and EC*InputStream classes for write and
>>>>>>>>>>>  read respectively. Based on replication type, ECPipelineProvider will
>>>>>>>>>>>  be chosen for creating EC pipelines.
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>>  -
>>>>>>>>>>> 
>>>>>>>>>>>  Since we cannot represent the EC replication in the existing
>>>>>>>>>>>  replication factor, we have introduced ECReplicationConfig. The
>>>>>>>>>>>  ReplicationConfig interface is already pushed to master, so it’s
>>>>>>>>>>>  not a new idea coming through this branch merge now. What is newly
>>>>>>>>>>>  coming here is the ECReplicationConfig class which can be used to
>>>>>>>>>>>  express EC replication configuration.
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>>  -
>>>>>>>>>>> 
>>>>>>>>>>>  We wanted to provide the support to enable EC at bucket level. To
>>>>>>>>>>>  simplify some complications, we have moved the default replication
>>>>>>>>>>>  configurations from client to server.
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>>  -
>>>>>>>>>>> 
>>>>>>>>>>>  Client side replication type and replication factor removed from
>>>>>>>>>>>  the configuration files and introduced the
>>>>>>>>>>>  ozone.server.default.replication and
>>>>>>>>>>>  ozone.server.default.replication.type. We would continue to respect
>>>>>>>>>>>  if one configures at client side explicitly or passed through APIs,
>>>>>>>>>>>  otherwise server side bucket level properties or server side
>>>>>>>>>>>  default configuration would take effect.
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>>  -
>>>>>>>>>>> 
>>>>>>>>>>>  Other than this change, the rest of EC side code should not impact
>>>>>>>>>>>  any of the existing code flows.
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> We have finished documentation JIRA(HDDS-6172) for covering this
>>>>>>>>>>> feature and we will continue to improve further in master.
>>>>>>>>>>> 
>>>>>>>>>>> Git Branch Name : HDDS-3816-ec
>>>>>>>>>>> 
>>>>>>>>>>> JIRAs: HDDS-3816 and HDDS-5351
>>>>>>>>>>> 
>>>>>>>>>>> Completed tasks: ~ 142
>>>>>>>>>>> 
>>>>>>>>>>> + We are covering the following two mandatory JIRAs to come in:
>>>>>>>>>>> 
>>>>>>>>>>> 1. HDDS-6209: EC: [Forward compatibility issue] New client to older
>>>>>>>>>>> server could fail due to the unavailability for client default
>>>>>>>>>>> replication config
>>>>>>>>>>> 2. HDDS-5909: EC: Onboard EC into upgrade framework.
>>>>>>>>>>> 
>>>>>>>>>>> PRs reviews in-progress and expected to close in a day or two.
>>>>>>>>>>> 
>>>>>>>>>>> Few other JIRAs in HDDS-3816 are still open but I believe they're
>>>>>>>>>>> not blockers for merge.
>>>>>>>>>>> 
>>>>>>>>>>> In short what you can do now with this feature:
>>>>>>>>>>> 
>>>>>>>>>>>  -
>>>>>>>>>>> 
>>>>>>>>>>>  You can enable EC at bucket level and cluster level.
>>>>>>>>>>> 
>>>>>>>>>>> How to enable it at bucket level? Just create the bucket by passing
>>>>>>>>>>> the ec replication options.
>>>>>>>>>>> 
>>>>>>>>>>>  -
>>>>>>>>>>> 
>>>>>>>>>>>  You can create EC keys and read the same back.
>>>>>>>>>>>  -
>>>>>>>>>>> 
>>>>>>>>>>>  You should be able to continue writing even when chosen nodes are
>>>>>>>>>>>  failing. (Of Course minimum of Data+Parity live nodes should be
>>>>>>>>>>>  available in cluster for complete the write)
>>>>>>>>>>>  -
>>>>>>>>>>> 
>>>>>>>>>>>  You should be able to read the file back even if a few nodes failed
>>>>>>>>>>>  in the same ec block group(Failures should not be more than parity
>>>>>>>>>>>  number of nodes.).
>>>>>>>>>>> 
>>>>>>>>>>> What is pending? Offline recovery of lost/missing EC containers. As
>>>>>>>>>>> mentioned above, post merge of this branch, I will create a separate
>>>>>>>>>>> JIRA for starting the work for OfflineRecovery.
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> There are automated acceptance test cases already added. HDDS-6231
>>>>>>>>>>>
>>>>>>>>>>> In addition to that, we have also performed basic Acceptance Testing
>>>>>>>>>>> in physical cluster:
>>>>>>>>>>> 
>>>>>>>>>>>  1.
>>>>>>>>>>> 
>>>>>>>>>>>  Installed 10 nodes cluster and created EC bucket (3:2).
>>>>>>>>>>> 
>>>>>>>>>>> Uploaded 10GB key.
>>>>>>>>>>> 
>>>>>>>>>>> Downloaded the same key and checked the md5sum.
>>>>>>>>>>> 
>>>>>>>>>>>  2.
>>>>>>>>>>>
>>>>>>>>>>>  Uploaded 8GB key.
>>>>>>>>>>>
>>>>>>>>>>> Downloaded the same key and checked the md5sum.
>>>>>>>>>>>
>>>>>>>>>>>  3.
>>>>>>>>>>>
>>>>>>>>>>>  Uploaded 3MB key
>>>>>>>>>>>
>>>>>>>>>>> Downloaded the same and verified md5sum.
>>>>>>>>>>>
>>>>>>>>>>>  4.
>>>>>>>>>>> 
>>>>>>>>>>>  Changed bucket to (6:3)
>>>>>>>>>>> 
>>>>>>>>>>> Uploaded 8GB key
>>>>>>>>>>> 
>>>>>>>>>>> Download the same.
>>>>>>>>>>> 
>>>>>>>>>>> Also verified the new key should be in 6:3 policy and old keys must
>>>>>>>>>>> be 3:2. Verified with several different size key writes and reads.
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> Since the merge discussion thread, we have well stabilized code and
>>>>>>>>>>> fixed several bugs.
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> Merge checklist items assessment is here:
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> https://cwiki.apache.org/confluence/display/OZONE/Ozone+EC+Branch%28HDDS-3816-ec%29+Phase-1+%3A+Merge+Checklist
>>>>>>>>>>> Big shoutout to Stephen O'Donnell <so...@cloudera.com>, Istvan Fajth
>>>>>>>>>>> <pi...@cloudera.com> for great efforts in core development and also
>>>>>>>>>>> thanks a lot to Sammi, Mingchao Zhao, Mark Gui, Kaijie, Attila for
>>>>>>>>>>> collaborating on some of the EC tasks.
>>>>>>>>>>> 
>>>>>>>>>>> Thanks to Marton for design discussion and on some dev tasks as well.
>>>>>>>>>>> Thanks to many others who were involved in design discussions, Arpit,
>>>>>>>>>>> Sidd, Jitendra, Mukul, Sanjay, Karthik, Bharat, Nanda, Shashi,
>>>>>>>>>>> Prashanth, Rakesh, Yiqun Lin.
>>>>>>>>>>> Sorry if I miss anyone here, but your efforts are much appreciated.
>>>>>>>>>>> Without your tremendous help, we would have not reached this position
>>>>>>>>>>> yet.
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> To start with, here is my +1
>>>>>>>>>>> 
>>>>>>>>>>> The vote will run for 5 days.
>>>>>>>>>>> 
>>>>>>>>>>> Regards,
>>>>>>>>>>> Uma
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>> --
>>>>>> Thanks & Regards,
>>>>>> Aravindan
>>>>>> 
>> 


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@ozone.apache.org
For additional commands, e-mail: dev-help@ozone.apache.org


Re: [VOTE] Merge Ozone Erasure Coding branch (HDDS-3816-ec) into master

Posted by jackson yao <ja...@gmail.com>.
Thanks for the great work! I am +1 for merging (non-binding).


Re: [VOTE] Merge Ozone Erasure Coding branch (HDDS-3816-ec) into master

Posted by mingchao zhao <ca...@apache.org>.
+1 for the merge. Thanks

Mukul Kumar Singh <mk...@gmail.com> wrote on Thu, Apr 7, 2022 at 14:05:

> +1 for the merge.
>
>
> Thanks Lokesh
>
> On 07/04/22 11:29 am, Lokesh Jain wrote:
> > +1 for merge
> >
> > Thanks
> > Lokesh
> >
> >> On 07-Apr-2022, at 11:06 AM, Tsz Wo Sze <sz...@gmail.com> wrote:
> >>
> >> +1
> >> We should merge it so that more people can try it.  We can work on the
> >> remaining tasks in the master branch.  Thanks a lot!
> >>
> >> Tsz-Wo
> >>
> >> On Thu, Apr 7, 2022 at 1:17 PM Aravindan Vijayan
> >> <av...@cloudera.com.invalid> wrote:
> >>
> >>> +1 for the merge. Thanks for the great work!
> >>>
> >>>
> >>>
> >>> On Wed, Apr 6, 2022 at 9:45 PM Prashant Pogde <ppogde@cloudera.com.invalid>
> >>> wrote:
> >>>
> >>>> +1 for the EC branch merge.
> >>>>
> >>>> Regards,
> >>>> Prashant
> >>>>
> >>>>> On Apr 6, 2022, at 8:20 PM, Siddharth Wagle <sw...@apache.org> wrote:
> >>>>>
> >>>>> +1 for the EC branch merge.
> >>>>>
> >>>>> Best,
> >>>>> Sid
> >>>>>
> >>>>> On Wed, Apr 6, 2022 at 8:05 PM guimark <gu...@126.com> wrote:
> >>>>>
> >>>>>> Great news!
> >>>>>> +1 to merge.
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> At 2022-04-06 22:18:31, "Stephen O'Donnell" <sodonnell@cloudera.com.INVALID>
> >>>>>> wrote:
> >>>>>>> I have been working on the code on this branch for some time, and I
> >>>>>>> believe it is in a good state to merge now. It is mostly new code, and if
> >>>>>>> nothing attempts to use EC, none of the EC code paths will be executed.
> >>>>>>>
> >>>>>>> +1 to merge from me.
> >>>>>>>
> >>>>>>> Stephen.
> >>>>>>>
> >>>>>>> On Wed, Apr 6, 2022 at 7:11 AM Uma gangumalla <umamahesh@apache.org>
> >>>>>>> wrote:
> >>>>>>>> =====Few Edits Below===================
> >>>>>>>>
> >>>>>>>> Dear Ozone Devs,
> >>>>>>>>
> >>>>>>>> As you may know, we have been actively developing Ozone Erasure Coding
> >>>>>>>> support in a separate branch HDDS-3816-ec.
> >>>>>>>>
> >>>>>>>> We have finished the development of EC key write and read functionality.
> >>>>>>>> The support of offline recovery (recovering replica from node loss) will
> >>>>>>>> be part of second phase work.
> >>>>>>>>
> >>>>>>>> Since the code has already grown and increasingly started seeing merge
> >>>>>>>> complications, we would like to merge the current EC branch into master.
> >>>>>>>> We filed the new JIRA (HDDS-6462) for the second phase of work and
> >>>>>>>> continued the offline recovery work there. (We have uploaded the design
> >>>>>>>> doc there.)
> >>>>>>>>
> >>>>>>>> Details on Changes:
> >>>>>>>>
> >>>>>>>>   -
> >>>>>>>>
> >>>>>>>>   Most of the EC core logic went to newly extended classes. Key changes
> >>>>>>>>   went into EC*OutputStream and EC*InputStream classes for write and
> >>>>>>>>   read respectively. Based on replication type, ECPipelineProvider will
> >>>>>>>>   be chosen for creating EC pipelines.
> >>>>>>>>
> >>>>>>>>   -
> >>>>>>>>
> >>>>>>>>   Since we cannot represent the EC replication in the existing
> >>>>>>>>   replication factor, we have introduced ECReplicationConfig. The
> >>>>>>>>   ReplicationConfig interface is already pushed to master, so it’s not a
> >>>>>>>>   new idea coming through this branch merge now. What is newly coming
> >>>>>>>>   here is the ECReplicationConfig class which can be used to express EC
> >>>>>>>>   replication configuration.
> >>>>>>>>
> >>>>>>>>   -
> >>>>>>>>
> >>>>>>>>   We wanted to provide the support to enable EC at bucket level. To
> >>>>>>>>   simplify some complications, we have moved the default replication
> >>>>>>>>   configurations from client to server.
> >>>>>>>>
> >>>>>>>>   -
> >>>>>>>>
> >>>>>>>>   Client side replication type and replication factor removed from the
> >>>>>>>>   configuration files and introduced the ozone.server.default.replication
> >>>>>>>>   and ozone.server.default.replication.type. We would continue to respect
> >>>>>>>>   if one configures at client side explicitly or passed through APIs,
> >>>>>>>>   otherwise server side bucket level properties or server side default
> >>>>>>>>   configuration would take effect.
> >>>>>>>>
> >>>>>>>>   -
> >>>>>>>>
> >>>>>>>>   Other than this change, the rest of EC side code should not impact any
> >>>>>>>>   of the existing code flows.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> We have finished documentation JIRA (HDDS-6172) for covering this
> >>>>>>>> feature and we will continue to improve further in master.
> >>>>>>>>
> >>>>>>>> Git Branch Name : HDDS-3816-ec
> >>>>>>>>
> >>>>>>>> JIRAs: HDDS-3816 and HDDS-5351
> >>>>>>>>
> >>>>>>>> Completed tasks: ~ 142
> >>>>>>>>
> >>>>>>>> + We are covering the following two mandatory JIRAs to come in:
> >>>>>>>>
> >>>>>>>> 1. HDDS-6209: EC: [Forward compatibility issue] New client to older
> >>>>>>>> server could fail due to the unavailability for client default
> >>>>>>>> replication config
> >>>>>>>> 2. HDDS-5909: EC: Onboard EC into upgrade framework.
> >>>>>>>>
> >>>>>>>> PRs reviews in-progress and expected to close in a day or two.
> >>>>>>>>
> >>>>>>>> Few other JIRAs in HDDS-3816 are still open but I believe they're not
> >>>>>>>> blockers for merge.
> >>>>>>>>
> >>>>>>>> In short what you can do now with this feature:
> >>>>>>>>
> >>>>>>>>   -
> >>>>>>>>
> >>>>>>>>   You can enable EC at bucket level and cluster level.
> >>>>>>>>
> >>>>>>>> How to enable it at bucket level? Just create the bucket by passing the
> >>>>>>>> EC replication options.
> >>>>>>>>
> >>>>>>>>   -
> >>>>>>>>
> >>>>>>>>   You can create EC keys and read the same back.
> >>>>>>>>   -
> >>>>>>>>
> >>>>>>>>   You should be able to continue writing even when chosen nodes are
> >>>>>>>>   failing. (Of course, a minimum of Data+Parity live nodes should be
> >>>>>>>>   available in the cluster to complete the write.)
> >>>>>>>>   -
> >>>>>>>>
> >>>>>>>>   You should be able to read the file back even if a few nodes failed in
> >>>>>>>>   the same EC block group. (Failures should not be more than the parity
> >>>>>>>>   number of nodes.)
> >>>>>>>>
> >>>>>>>> What is pending? Offline recovery of lost/missing EC containers. As
> >>>>>>>> mentioned above, post merge of this branch, I will create a separate
> >>>>>>>> JIRA for starting the work for OfflineRecovery.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> There are automated acceptance test cases already added. HDDS-6231
> >>>>>>>>
> >>>>>>>> In addition to that, we have also performed basic Acceptance Testing in
> >>>>>>>> physical cluster:
> >>>>>>>>
> >>>>>>>>   1.
> >>>>>>>>
> >>>>>>>>   Installed 10 nodes cluster and created EC bucket (3:2).
> >>>>>>>>
> >>>>>>>> Uploaded 10GB key.
> >>>>>>>>
> >>>>>>>> Downloaded the same key and checked the md5sum.
> >>>>>>>>
> >>>>>>>>   1.
> >>>>>>>>
> >>>>>>>>   Uploaded 8GB key.
> >>>>>>>>
> >>>>>>>> Downloaded the same key and checked the md5sum.
> >>>>>>>>
> >>>>>>>>   1.
> >>>>>>>>
> >>>>>>>>   Uploaded 3MB key
> >>>>>>>>
> >>>>>>>> Downloaded the same and verified md5sum.
> >>>>>>>>
> >>>>>>>>   1.
> >>>>>>>>
> >>>>>>>>   Changed bucket to (6:3)
> >>>>>>>>
> >>>>>>>> Uploaded 8GB key
> >>>>>>>>
> >>>>>>>> Download the same.
> >>>>>>>>
> >>>>>>>> Also verified the new key should be in 6:3 policy and old keys must be
> >>>>>>>> 3:2. Verified with several different size key writes and reads.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Since the merge discussion thread, we have well stabilized code and
> >>>>>>>> fixed several bugs.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Merge checklist items assessment is here:
> >>>>>>>>
> >>>>>>>> https://cwiki.apache.org/confluence/display/OZONE/Ozone+EC+Branch%28HDDS-3816-ec%29+Phase-1+%3A+Merge+Checklist
> >>>>>>>>
> >>>>>>>> Big shoutout to Stephen O'Donnell <so...@cloudera.com>, Istvan Fajth
> >>>>>>>> <pi...@cloudera.com> for great efforts in core development and also
> >>>>>>>> thanks a lot to Sammi, Mingchao Zhao, Mark Gui, Kaijie, Attila for
> >>>>>>>> collaborating on some of the EC tasks.
> >>>>>>>>
> >>>>>>>> Thanks to Marton for design discussion and on some dev tasks as well.
> >>>>>>>> Thanks to many others who were involved in design discussions, Arpit,
> >>>>>>>> Sidd, Jitendra, Mukul, Sanjay, Karthik, Bharat, Nanda, Shashi, Prashanth,
> >>>>>>>> Rakesh, Yiqun Lin.
> >>>>>>>> Sorry if I miss anyone here, but your efforts are much appreciated.
> >>>>>>>> Without your tremendous help, we would have not reached this position
> >>>>>>>> yet.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> To start with, here is my +1
> >>>>>>>>
> >>>>>>>> The vote will run for 5 days.
> >>>>>>>>
> >>>>>>>> Regards,
> >>>>>>>> Uma
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Tue, Apr 5, 2022 at 10:58 PM Uma gangumalla <umamahesh@apache.org>
> >>>>>>>> wrote:
> >>>>>>>>
> >>>>>>>>> Dear Ozone Devs,
> >>>>>>>>>
> >>>>>>>>> As you may know, we have been actively developing Ozone Erasure Coding
> >>>>>>>>> support in a separate branch HDDS-3816-ec.
> >>>>>>>>>
> >>>>>>>>> We have finished the development of EC key write and read
> >>>>>>>>> functionality. The support of offline recovery (recovering replica from
> >>>>>>>>> node loss) will be part of second phase work.
> >>>>>>>>>
> >>>>>>>>> Since the code has already grown and increasingly started seeing merge
> >>>>>>>>> complications, we would like to propose to merge the current EC branch
> >>>>>>>>> into master.
> >>>>>>>>>
> >>>>>>>>> We filed the new JIRA (HDDS-6462) for the second phase of work and
> >>>>>>>>> continued the offline recovery work there.
> >>>>>>>>>
> >>>>>>>>> Details on Changes:
> >>>>>>>>>
> >>>>>>>>>   -
> >>>>>>>>>
> >>>>>>>>>   Most of the EC core logic went to newly extended classes. Key changes
> >>>>>>>>>   went into EC*OutputStream and EC*InputStream classes for write and
> >>>>>>>>>   read respectively. Based on replication type, ECPipelineProvider will
> >>>>>>>>>   be chosen for creating EC pipelines.
> >>>>>>>>>
> >>>>>>>>>   -
> >>>>>>>>>
> >>>>>>>>>   Since we cannot represent the EC replication in the existing
> >>>>>>>>>   replication factor, we have introduced ECReplicationConfig. The
> >>>>>>>>>   ReplicationConfig interface is already pushed to master, so it’s not a
> >>>>>>>>>   new idea coming through this branch merge now. What is newly coming
> >>>>>>>>>   here is the ECReplicationConfig class which can be used to express EC
> >>>>>>>>>   replication configuration.
> >>>>>>>>>
> >>>>>>>>>   -
> >>>>>>>>>
> >>>>>>>>>   We wanted to provide the support to enable EC at bucket level. To
> >>>>>>>>>   simplify some complications, we have moved the default replication
> >>>>>>>>>   configurations from client to server.
> >>>>>>>>>
> >>>>>>>>>   -
> >>>>>>>>>
> >>>>>>>>>   Client side replication type and replication factor removed from the
> >>>>>>>>>   configuration files and introduced the ozone.server.default.replication
> >>>>>>>>>   and ozone.server.default.replication.type. We would continue to respect
> >>>>>>>>>   if one configures at client side explicitly or passed through APIs,
> >>>>>>>>>   otherwise server side bucket level properties or server side default
> >>>>>>>>>   configuration would take effect.
> >>>>>>>>>
> >>>>>>>>>   -
> >>>>>>>>>
> >>>>>>>>>   Other than this change, the rest of EC side code should not impact any
> >>>>>>>>>   of the existing code flows.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> We have finished documentation JIRA (HDDS-6172) for covering this
> >>>>>>>>> feature and we will continue to improve further in master.
> >>>>>>>>>
> >>>>>>>>> Git Branch Name : HDDS-3816-ec
> >>>>>>>>>
> >>>>>>>>> JIRAs: HDDS-3816 and HDDS-5351
> >>>>>>>>>
> >>>>>>>>> Completed tasks: ~ 142
> >>>>>>>>>
> >>>>>>>>> + We are covering the following two mandatory JIRAs:
> >>>>>>>>>
> >>>>>>>>> 1. HDDS-6209: EC: [Forward compatibility issue] New client to older
> >>>>>>>>> server could fail due to the unavailability for client default
> >>>>>>>>> replication config
> >>>>>>>>>
> >>>>>>>>> 2. HDDS-5909: EC: Onboard EC into upgrade framework.
> >>>>>>>>>
> >>>>>>>>> PRs reviews in-progress and expected to close in a day or two.
> >>>>>>>>>
> >>>>>>>>> Few other JIRAs in HDDS-3816 are still open but I believe they're not
> >>>>>>>>> blockers for merge.
> >>>>>>>>>
> >>>>>>>>> In short what you can do now with this feature:
> >>>>>>>>>
> >>>>>>>>>   -
> >>>>>>>>>
> >>>>>>>>>   You can enable EC at bucket level and cluster level.
> >>>>>>>>>
> >>>>>>>>> How to enable it at bucket level? Just create the bucket by passing the ec
> >>>>>>>>> replication options.
> >>>>>>>>>
> >>>>>>>>>   -
> >>>>>>>>>
> >>>>>>>>>   You can create EC keys and read the same back.
> >>>>>>>>>   -
> >>>>>>>>>
> >>>>>>>>>   You should be able to continue writing even when chosen nodes are
> >>>>>>>>>   failing. (Of Course minimum of Data+Parity live nodes should be available
> >>>>>>>>>   in cluster for complete the write)
> >>>>>>>>>   -
> >>>>>>>>>
> >>>>>>>>>   You should be able to read the file back even if a few nodes failed in
> >>>>>>>>>   the same ec block group(Failures should not be more than parity number of
> >>>>>>>>>   nodes.).
> >>>>>>>>>
> >>>>>>>>> What is pending? Offline recovery of lost/missing EC containers. As
> >>>>>>>>> mentioned above, post merge of this branch, I will create a separate JIRA
> >>>>>>>>> for starting the work for OfflineRecovery.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> There are automated acceptance test cases already added. HDDS-6231
> >>>>>>>>>
> >>>>>>>>> In addition to that, we have also performed basic Acceptance Testing in
> >>>>>>>>> physical cluster:
> >>>>>>>>>
> >>>>>>>>>   1. Installed 10 nodes cluster and created EC bucket (3:2).
> >>>>>>>>>      Uploaded 10GB key.
> >>>>>>>>>      Downloaded the same key and checked the md5sum.
> >>>>>>>>>
> >>>>>>>>>   2. Uploaded 8GB key.
> >>>>>>>>>      Downloaded the same key and checked the md5sum.
> >>>>>>>>>
> >>>>>>>>>   3. Uploaded 3MB key.
> >>>>>>>>>      Downloaded the same and verified md5sum.
> >>>>>>>>>
> >>>>>>>>>   4. Changed bucket to (6:3).
> >>>>>>>>>      Uploaded 8GB key.
> >>>>>>>>>      Downloaded the same.
> >>>>>>>>>
> >>>>>>>>> Also verified the new key should be in 6:3 policy and old keys must be
> >>>>>>>>> 3:2. Verified with several different size key writes and reads.
> >>>>>>>>>
> >>>>>>>>> Merge checklist items assessment is here:
> >>>>>>>>>
> >>>>>>>>> https://cwiki.apache.org/confluence/display/OZONE/Ozone+EC+Branch%28HDDS-3816-ec%29+Phase-1+%3A+Merge+Checklist
> >>>>>>>>> Big shoutout to Stephen O'Donnell <so...@cloudera.com>, Istvan Fajth
> >>>>>>>>> <pi...@cloudera.com> for great efforts in core development and also
> >>>>>>>>> thanks a lot to Sammi, Mingchao Zhao, Mark Gui, Kaijie for collaborating
> >>>>>>>>> on some of the EC tasks.
> >>>>>>>>>
> >>>>>>>>> Thanks to Marton for design discussion and on some dev tasks as well.
> >>>>>>>>> Thanks to many others who were involved in design discussions, Arpit,
> >>>>>>>>> Sidd, Jitendra, Mukul, Sanjay, Karthik, Bharat, Nanda, Shashi, Prashanth,
> >>>>>>>>> Rakesh, Yiqun Lin.
> >>>>>>>>> Sorry if I miss anyone here, but your efforts are much appreciated.
> >>>>>>>>> Without your tremendous help, we would have not reached this position yet.
> >>>>>>>>> If there are no objections for the merge, I will start the official vote
> >>>>>>>>> later.
> >>>>>>>>>
> >>>>>>>>> Regards,
> >>>>>>>>>
> >>>>>>>>> EC Branch Devs
> >>>>>>>>>
> >>>>
> >>>>
> >>>>
> >>> --
> >>> Thanks & Regards,
> >>> Aravindan
> >>>
> >
> >
>
>
>

Re: [VOTE] Merge Ozone Erasure Coding branch (HDDS-3816-ec) into master

Posted by Mukul Kumar Singh <mk...@gmail.com>.
+1 for the merge.


Thanks Lokesh

On 07/04/22 11:29 am, Lokesh Jain wrote:
> +1 for merge
>
> Thanks
> Lokesh
>
>> On 07-Apr-2022, at 11:06 AM, Tsz Wo Sze <sz...@gmail.com> wrote:
>>
>> +1
>> We should merge it so that more people can try it.  We can work on the
>> remaining tasks in the master branch.  Thanks a lot!
>>
>> Tsz-Wo
>>
>> On Thu, Apr 7, 2022 at 1:17 PM Aravindan Vijayan
>> <av...@cloudera.com.invalid> wrote:
>>
>>> +1 for the merge. Thanks for the great work!
>>>
>>>
>>>
>>> On Wed, Apr 6, 2022 at 9:45 PM Prashant Pogde <ppogde@cloudera.com.invalid> wrote:
>>>
>>>> +1 for the EC branch merge.
>>>>
>>>> Regards,
>>>> Prashant
>>>>
>>>>> On Apr 6, 2022, at 8:20 PM, Siddharth Wagle <sw...@apache.org> wrote:
>>>>>
>>>>> +1 for the EC branch merge.
>>>>>
>>>>> Best,
>>>>> Sid
>>>>>
>>>>> On Wed, Apr 6, 2022 at 8:05 PM guimark <gu...@126.com> wrote:
>>>>>
>>>>>> Great news!
>>>>>> +1 to merge.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> At 2022-04-06 22:18:31, "Stephen O'Donnell" <sodonnell@cloudera.com.INVALID> wrote:
>>>>>>> I have been working on the code on this branch for some time, and I believe
>>>>>>> it is in a good state to merge now. It is mostly new code, and if nothing
>>>>>>> attempts to use EC, none of the EC code paths will be executed.
>>>>>>>
>>>>>>> +1 to merge from me.
>>>>>>>
>>>>>>> Stephen.
>>>>>>>
>>>>>>> On Wed, Apr 6, 2022 at 7:11 AM Uma gangumalla <um...@apache.org> wrote:
>>>>>>>> =====Few Edits Below===================
>>>>>>>>
>>>>>>>> Dear Ozone Devs,
>>>>>>>>
>>>>>>>> As you may know, we have been actively developing Ozone Erasure Coding
>>>>>>>> support in a separate branch HDDS-3816-ec.
>>>>>>>>
>>>>>>>> We have finished the development of EC key write and read functionality.
>>>>>>>> The support of offline recovery( Recovering replica from node loss) will be
>>>>>>>> part of second phase work.
>>>>>>>>
>>>>>>>> Since the code has already grown and increasingly started seeing merge
>>>>>>>> complications, we would like to merge the current EC branch into master.
>>>>>>>>
>>>>>>>> We filed the new JIRA(HDDS-6462) for the second phase of work and continued
>>>>>>>> the offline recovery work there. (we have uploaded the design doc there)
>>>>>>>>
>>>>>>>> Details on Changes:
>>>>>>>>
>>>>>>>>   -
>>>>>>>>
>>>>>>>>   Most of the EC core logic went to newly extended classes. Key changes
>>>>>>>>   went into EC*OutputStream and EC*InputStream classes for write and read
>>>>>>>>   respectively. Based on replication type, ECPipelineProvider will be chosen
>>>>>>>>   for creating EC pipelines.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>   -
>>>>>>>>
>>>>>>>>   Since we cannot represent the EC replication in the existing replication
>>>>>>>>   factor, we have introduced ECReplicationConfig. The ReplicationConfig
>>>>>>>>   interface is already pushed to master, so it’s not a new idea coming
>>>>>>>>   through this branch merge now. What is newly coming here is the
>>>>>>>>   ECReplicationConfig class which can be used to express EC replication
>>>>>>>>   configuration.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>   -
>>>>>>>>
>>>>>>>>   We wanted to provide the support to enable EC at bucket level. To
>>>>>>>>   simplify some complications, we have moved the default replication
>>>>>>>>   configurations from client to server.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>   -
>>>>>>>>
>>>>>>>>   Client side replication type and replication factor removed from the
>>>>>>>>   configuration files and introduced the ozone.server.default.replication
>>>>>>>>   and ozone.server.default.replication.type. We would continue to respect if
>>>>>>>>   one configures at client side explicitly or passed through APIs, otherwise
>>>>>>>>   server side bucket level properties or server side default configuration
>>>>>>>>   would take effect.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>   -
>>>>>>>>
>>>>>>>>   Other than this change, the rest of EC side code should not impact any
>>>>>>>>   of the existing code flows.
>>>>>>>>
>>>>>>>>
>>>>>>>> We have finished documentation JIRA(HDDS-6172) for covering this
>>>> feature
>>>>>>>> and we will continue to improve further in master.
>>>>>>>>
>>>>>>>> Git Branch Name : HDDS-3816-ec
>>>>>>>>
>>>>>>>> JIRAs: HDDS-3816 and HDDS-5351
>>>>>>>>
>>>>>>>> Completed tasks: ~ 142
>>>>>>>>
>>>>>>>> + We are covering the following two mandatory JIRAs to come in:
>>>>>>>>
>>>>>>>> 1. HDDS-6209: EC: [Forward compatibility issue] New client to older server
>>>>>>>> could fail due to the unavailability for client default replication config
>>>>>>>>
>>>>>>>> 2. HDDS-5909: EC: Onboard EC into upgrade framework.
>>>>>>>>
>>>>>>>> PRs reviews in-progress and expected to close in a day or two.
>>>>>>>>
>>>>>>>> Few other JIRAs in HDDS-3816 are still open but I believe they're
>>> not
>>>>>>>> blockers for merge.
>>>>>>>>
>>>>>>>> In short what you can do now with this feature:
>>>>>>>>
>>>>>>>>   -
>>>>>>>>
>>>>>>>>   You can enable EC at bucket level and cluster level.
>>>>>>>>
>>>>>>>> How to enable it at bucket level? Just create the bucket by passing
>>>> the
>>>>>> ec
>>>>>>>> replication options.
>>>>>>>>
>>>>>>>>   -
>>>>>>>>
>>>>>>>>   You can create EC keys and read the same back.
>>>>>>>>   -
>>>>>>>>
>>>>>>>>   You should be able to continue writing even when chosen nodes are
>>>>>>>>   failing. (Of Course minimum of Data+Parity live nodes should be
>>>>>>>> available
>>>>>>>>   in cluster for complete the write)
>>>>>>>>   -
>>>>>>>>
>>>>>>>>   You should be able to read the file back even if a few nodes
>>> failed
>>>>>> in
>>>>>>>>   the same ec block group(Failures should not be more than parity
>>>>>> number
>>>>>>>> of
>>>>>>>>   nodes.).
>>>>>>>>
>>>>>>>> What is pending? Offline recovery of lost/missing EC containers. As
>>>>>>>> mentioned above, post merge of this branch, I will create a separate
>>>>>> JIRA
>>>>>>>> for starting the work for OfflineRecovery.
>>>>>>>>
>>>>>>>>
>>>>>>>> There are automated acceptance test cases already added. HDDS-6231
>>>>>>>>
>>>>>>>> In addition to that, we have also performed basic Acceptance Testing
>>>> in
>>>>>>>> physical cluster:
>>>>>>>>
>>>>>>>>   1.
>>>>>>>>
>>>>>>>>   Installed 10 nodes cluster and created EC bucket (3:2).
>>>>>>>>
>>>>>>>> Uploaded 10GB key.
>>>>>>>>
>>>>>>>> Downloaded the same key and checked the md5sum.
>>>>>>>>
>>>>>>>>   1.
>>>>>>>>
>>>>>>>>   Uploaded 8GB key.
>>>>>>>>
>>>>>>>> Downloaded the same key and checked the md5sum.
>>>>>>>>
>>>>>>>>   1.
>>>>>>>>
>>>>>>>>   Uploaded 3MB key
>>>>>>>>
>>>>>>>> Downloaded the same and verified md5sum.
>>>>>>>>
>>>>>>>>   1.
>>>>>>>>
>>>>>>>>   Changed bucket to (6:3)
>>>>>>>>
>>>>>>>> Uploaded 8GB key
>>>>>>>>
>>>>>>>> Download the same.
>>>>>>>>
>>>>>>>> Also verified the new key should be in 6:3 policy and old keys must
>>> be
>>>>>>>> 3:2.Verified
>>>>>>>> with several different size key writes and reads.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Since the merge discussion thread, we have well stabilized code and
>>>>>> fixed
>>>>>>>> several bugs.
>>>>>>>>
>>>>>>>>
>>>>>>>> Merge checklist items assessment is here:
>>>>>>>>
>>>>>>>> https://cwiki.apache.org/confluence/display/OZONE/Ozone+EC+Branch%28HDDS-3816-ec%29+Phase-1+%3A+Merge+Checklist
>>>>>>>> Big shoutout to Stephen O'Donnell <so...@cloudera.com>, Istvan
>>>>>> Fajth
>>>>>>>> <pi...@cloudera.com> for great efforts in core development and also
>>>>>> thanks
>>>>>>>> a lot  to Sammi, Mingchao Zhao, Mark Gui, Kaijie, Attila for
>>>>>> collaborating
>>>>>>>> on some of the EC tasks.
>>>>>>>>
>>>>>>>> Thanks to Marton for design discussion and on some dev tasks as
>>> well.
>>>>>>>> Thanks to many others who were involved in design discussions,
>>> Arpit,
>>>>>> Sidd,
>>>>>>>> Jitendra, Mukul, Sanjay, Karthik, Bharat, Nanda, Shashi, Prashanth,
>>>>>> Rakesh,
>>>>>>>> Yiqun Lin.
>>>>>>>> Sorry if I miss anyone here, but your efforts are much appreciated.
>>>>>> Without
>>>>>>>> your tremendous help, we would have not reached this position yet.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> To start with, here is my +1
>>>>>>>>
>>>>>>>> The vote will run for 5 days.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Uma
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, Apr 5, 2022 at 10:58 PM Uma gangumalla <umamahesh@apache.org> wrote:
>>>>>>>>
>>>>>>>>> Dear Ozone Devs,
>>>>>>>>>
>>>>>>>>> As you may know, we have been actively developing Ozone Erasure
>>>> Coding
>>>>>>>>> support in a separate branch HDDS-3816-ec.
>>>>>>>>>
>>>>>>>>> We have finished the development of EC key write and read
>>>>>> functionality.
>>>>>>>>> The support of offline recovery( Recovering replica from node loss)
>>>>>> will
>>>>>>>> be
>>>>>>>>> part of second phase work.
>>>>>>>>>
>>>>>>>>> Since the code has already grown and increasingly started seeing
>>>> merge
>>>>>>>>> complications, we would like to propose to merge the current EC
>>>> branch
>>>>>>>> into
>>>>>>>>> master.
>>>>>>>>>
>>>>>>>>> We filed the new JIRA(HDDS-6462) for the second phase of work and
>>>>>>>>> continued the offline recovery work there.
>>>>>>>>>
>>>>>>>>> Details on Changes:
>>>>>>>>>
>>>>>>>>>   -
>>>>>>>>>
>>>>>>>>>   Most of the EC core logic went to newly extended classes. Key
>>>>>> changes
>>>>>>>>>   went into EC*OutputStream and EC*InputStream classes for write
>>> and
>>>>>>>> read
>>>>>>>>>   respectively. Based on replication type, ECPipelineProvider will
>>> be
>>>>>>>> chosen
>>>>>>>>>   for creating EC pipelines.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>   -
>>>>>>>>>
>>>>>>>>>   Since we cannot represent the EC replication in the existing
>>>>>>>>>   replication factor, we have introduced ECReplicationConfig. The
>>>>>>>>>   ReplicationConfig interface is already pushed to master, so it’s
>>>>>> not
>>>>>>>> a new
>>>>>>>>>   idea coming through this branch merge now. What is newly coming
>>>>>> here
>>>>>>>> is the
>>>>>>>>>   ECReplicationConfig class which can be used to express EC
>>>>>> replication
>>>>>>>>>   configuration.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>   -
>>>>>>>>>
>>>>>>>>>   We wanted to provide the support to enable EC at bucket level. To
>>>>>>>>>   simplify some complications, we have moved the default
>>> replication
>>>>>>>>>   configurations from client to server.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>   -
>>>>>>>>>
>>>>>>>>>   Client side replication type and replication factor removed from
>>>>>> the
>>>>>>>>>   configuration files and introduced the
>>>>>>>> ozone.server.default.replication
>>>>>>>>>   and ozone.server.default.replication.type.We would continue to
>>>>>>>> respect if
>>>>>>>>>   one configures at client side explicitly or passed through APIs,
>>>>>>>> otherwise
>>>>>>>>>   server side bucket level properties or server side default
>>>>>>>> configuration
>>>>>>>>>   would take effect.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>   -
>>>>>>>>>
>>>>>>>>>   Other than this change, the rest of EC side code should not
>>> impact
>>>>>> any
>>>>>>>>>   of the existing code flows.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> We have finished documentation JIRA(HDDS-6172) for covering this
>>>>>> feature
>>>>>>>>> and we will continue to improve further in master.
>>>>>>>>>
>>>>>>>>> Git Branch Name : HDDS-3816-ec
>>>>>>>>>
>>>>>>>>> JIRAs: HDDS-3816 and HDDS-5351
>>>>>>>>>
>>>>>>>>> Completed tasks: ~ 142
>>>>>>>>>
>>>>>>>>> + We are covering the following two mandatory JIRAs:
>>>>>>>>>
>>>>>>>>> 1. HDDS-6209: EC: [Forward compatibility issue] New client to older
>>>>>>>>> server could fail due to the unavailability for client default
>>>>>>>> replication
>>>>>>>>> config
>>>>>>>>>
>>>>>>>>> 2. HDDS-5909: EC: Onboard EC into upgrade framework.
>>>>>>>>>
>>>>>>>>> PRs reviews in-progress and expected to close in a day or two.
>>>>>>>>>
>>>>>>>>> Few other JIRAs in HDDS-3816 are still open but I believe they're
>>> not
>>>>>>>>> blockers for merge.
>>>>>>>>>
>>>>>>>>> In short what you can do now with this feature:
>>>>>>>>>
>>>>>>>>>   -
>>>>>>>>>
>>>>>>>>>   You can enable EC at bucket level and cluster level.
>>>>>>>>>
>>>>>>>>> How to enable it at bucket level? Just create the bucket by passing
>>>>>> the
>>>>>>>> ec
>>>>>>>>> replication options.
>>>>>>>>>
>>>>>>>>>   -
>>>>>>>>>
>>>>>>>>>   You can create EC keys and read the same back.
>>>>>>>>>   -
>>>>>>>>>
>>>>>>>>>   You should be able to continue writing even when chosen nodes are
>>>>>>>>>   failing. (Of Course minimum of Data+Parity live nodes should be
>>>>>>>> available
>>>>>>>>>   in cluster for complete the write)
>>>>>>>>>   -
>>>>>>>>>
>>>>>>>>>   You should be able to read the file back even if a few nodes
>>>>>> failed in
>>>>>>>>>   the same ec block group(Failures should not be more than parity
>>>>>>>> number of
>>>>>>>>>   nodes.).
>>>>>>>>>
>>>>>>>>> What is pending? Offline recovery of lost/missing EC containers. As
>>>>>>>>> mentioned above, post merge of this branch, I will create a
>>> separate
>>>>>> JIRA
>>>>>>>>> for starting the work for OfflineRecovery.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> There are automated acceptance test cases already added. HDDS-6231
>>>>>>>>>
>>>>>>>>> In addition to that, we have also performed basic Acceptance
>>> Testing
>>>>>> in
>>>>>>>>> physical cluster:
>>>>>>>>>
>>>>>>>>>   1.
>>>>>>>>>
>>>>>>>>>   Installed 10 nodes cluster and created EC bucket (3:2).
>>>>>>>>>
>>>>>>>>> Uploaded 10GB key.
>>>>>>>>>
>>>>>>>>> Downloaded the same key and checked the md5sum.
>>>>>>>>>
>>>>>>>>>   1.
>>>>>>>>>
>>>>>>>>>   Uploaded 8GB key.
>>>>>>>>>
>>>>>>>>> Downloaded the same key and checked the md5sum.
>>>>>>>>>
>>>>>>>>>   1.
>>>>>>>>>
>>>>>>>>>   Uploaded 3MB key
>>>>>>>>>
>>>>>>>>> Downloaded the same and verified md5sum.
>>>>>>>>>
>>>>>>>>>   1.
>>>>>>>>>
>>>>>>>>>   Changed bucket to (6:3)
>>>>>>>>>
>>>>>>>>> Uploaded 8GB key
>>>>>>>>>
>>>>>>>>> Download the same.
>>>>>>>>>
>>>>>>>>> Also verified the new key should be in 6:3 policy and old keys must
>>>> be
>>>>>>>> 3:2.Verified
>>>>>>>>> with several different size key writes and reads.
>>>>>>>>>
>>>>>>>>> Merge checklist items assessment is here:
>>>>>>>>>
>>>>>>>>> https://cwiki.apache.org/confluence/display/OZONE/Ozone+EC+Branch%28HDDS-3816-ec%29+Phase-1+%3A+Merge+Checklist
>>>>>>>>> Big shoutout to Stephen O'Donnell <so...@cloudera.com>, Istvan
>>>>>> Fajth
>>>>>>>>> <pi...@cloudera.com> for great efforts in core development and
>>> also
>>>>>>>>> thanks a lot  to Sammi, Mingchao Zhao, Mark Gui, Kaijie for
>>>>>> collaborating
>>>>>>>>> on some of the EC tasks.
>>>>>>>>>
>>>>>>>>> Thanks to Marton for design discussion and on some dev tasks as
>>> well.
>>>>>>>>> Thanks to many others who were involved in design discussions,
>>> Arpit,
>>>>>>>>> Sidd, Jitendra, Mukul, Sanjay, Karthik, Bharat, Nanda, Shashi,
>>>>>> Prashanth,
>>>>>>>>> Rakesh, Yiqun Lin.
>>>>>>>>> Sorry if I miss anyone here, but your efforts are much appreciated.
>>>>>>>>> Without your tremendous help, we would have not reached this
>>> position
>>>>>>>> yet.
>>>>>>>>> If there are no objections for the merge, I will start the official
>>>>>> vote
>>>>>>>>> later.
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>>
>>>>>>>>> EC Branch Devs
>>>>>>>>>
>>>>
>>>>
>>>>
>>> --
>>> Thanks & Regards,
>>> Aravindan
>>>
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@ozone.apache.org
For additional commands, e-mail: dev-help@ozone.apache.org


Re: [VOTE] Merge Ozone Erasure Coding branch (HDDS-3816-ec) into master

Posted by Lokesh Jain <lj...@apache.org>.
+1 for merge

Thanks
Lokesh

> On 07-Apr-2022, at 11:06 AM, Tsz Wo Sze <sz...@gmail.com> wrote:
> 
> +1
> We should merge it so that more people can try it.  We can work on the
> remaining tasks in the master branch.  Thanks a lot!
> 
> Tsz-Wo
> 
> On Thu, Apr 7, 2022 at 1:17 PM Aravindan Vijayan
> <av...@cloudera.com.invalid> wrote:
> 
>> +1 for the merge. Thanks for the great work!
>> 
>> 
>> 
>> On Wed, Apr 6, 2022 at 9:45 PM Prashant Pogde <ppogde@cloudera.com.invalid> wrote:
>> 
>>> +1 for the EC branch merge.
>>> 
>>> Regards,
>>> Prashant
>>> 
>>>> On Apr 6, 2022, at 8:20 PM, Siddharth Wagle <sw...@apache.org> wrote:
>>>> 
>>>> +1 for the EC branch merge.
>>>> 
>>>> Best,
>>>> Sid
>>>> 
>>>> On Wed, Apr 6, 2022 at 8:05 PM guimark <gu...@126.com> wrote:
>>>> 
>>>>> Great news!
>>>>> +1 to merge.
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> At 2022-04-06 22:18:31, "Stephen O'Donnell" <sodonnell@cloudera.com.INVALID> wrote:
>>>>>> I have been working on the code on this branch for some time, and I believe
>>>>>> it is in a good state to merge now. It is mostly new code, and if nothing
>>>>>> attempts to use EC, none of the EC code paths will be executed.
>>>>>> 
>>>>>> +1 to merge from me.
>>>>>> 
>>>>>> Stephen.
>>>>>> 
>>>>>> On Wed, Apr 6, 2022 at 7:11 AM Uma gangumalla <um...@apache.org> wrote:
>>>>>> 
>>>>>>> =====Few Edits Below===================
>>>>>>> 
>>>>>>> Dear Ozone Devs,
>>>>>>> 
>>>>>>> As you may know, we have been actively developing Ozone Erasure
>> Coding
>>>>>>> support in a separate branch HDDS-3816-ec.
>>>>>>> 
>>>>>>> We have finished the development of EC key write and read
>>> functionality.
>>>>>>> The support of offline recovery( Recovering replica from node loss)
>>>>> will be
>>>>>>> part of second phase work.
>>>>>>> 
>>>>>>> Since the code has already grown and increasingly started seeing
>> merge
>>>>>>> complications, we would like to merge the current EC branch into
>>> master.
>>>>>>> 
>>>>>>> We filed the new JIRA(HDDS-6462) for the second phase of work and
>>>>> continued
>>>>>>> the offline recovery work there. (we have uploaded the design doc
>>> there)
>>>>>>> 
>>>>>>> Details on Changes:
>>>>>>> 
>>>>>>>  -
>>>>>>> 
>>>>>>>  Most of the EC core logic went to newly extended classes. Key
>>> changes
>>>>>>>  went into EC*OutputStream and EC*InputStream classes for write and
>>>>> read
>>>>>>>  respectively. Based on replication type, ECPipelineProvider will
>> be
>>>>>>> chosen
>>>>>>>  for creating EC pipelines.
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>>  -
>>>>>>> 
>>>>>>>  Since we cannot represent the EC replication in the existing
>>>>> replication
>>>>>>>  factor, we have introduced ECReplicationConfig. The
>>> ReplicationConfig
>>>>>>>  interface is already pushed to master, so it’s not a new idea
>> coming
>>>>>>>  through this branch merge now. What is newly coming here is the
>>>>>>>  ECReplicationConfig class which can be used to express EC
>>> replication
>>>>>>>  configuration.
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>>  -
>>>>>>> 
>>>>>>>  We wanted to provide the support to enable EC at bucket level. To
>>>>>>>  simplify some complications, we have moved the default replication
>>>>>>>  configurations from client to server.
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>>  -
>>>>>>> 
>>>>>>>  Client side replication type and replication factor removed from
>> the
>>>>>>>  configuration files and introduced the
>>>>> ozone.server.default.replication
>>>>>>>  and ozone.server.default.replication.type.We would continue to
>>>>> respect
>>>>>>> if
>>>>>>>  one configures at client side explicitly or passed through APIs,
>>>>>>> otherwise
>>>>>>>  server side bucket level properties or server side default
>>>>> configuration
>>>>>>>  would take effect.
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>>  -
>>>>>>> 
>>>>>>>  Other than this change, the rest of EC side code should not impact
>>>>> any
>>>>>>>  of the existing code flows.
>>>>>>> 
>>>>>>> 
>>>>>>> We have finished documentation JIRA(HDDS-6172) for covering this
>>> feature
>>>>>>> and we will continue to improve further in master.
>>>>>>> 
>>>>>>> Git Branch Name : HDDS-3816-ec
>>>>>>> 
>>>>>>> JIRAs: HDDS-3816 and HDDS-5351
>>>>>>> 
>>>>>>> Completed tasks: ~ 142
>>>>>>> 
>>>>>>> + We are covering the following two mandatory JIRAs to come in:
>>>>>>> 
>>>>>>> 1. HDDS-6209: EC: [Forward compatibility issue] New client to older
>>>>> server
>>>>>>> could fail due to the unavailability for client default replication
>>>>> config
>>>>>>> 
>>>>>>> 2. HDDS-5909: EC: Onboard EC into upgrade framework.
>>>>>>> 
>>>>>>> PRs reviews in-progress and expected to close in a day or two.
>>>>>>> 
>>>>>>> Few other JIRAs in HDDS-3816 are still open but I believe they're
>> not
>>>>>>> blockers for merge.
>>>>>>> 
>>>>>>> In short what you can do now with this feature:
>>>>>>> 
>>>>>>>  -
>>>>>>> 
>>>>>>>  You can enable EC at bucket level and cluster level.
>>>>>>> 
>>>>>>> How to enable it at bucket level? Just create the bucket by passing
>>> the
>>>>> ec
>>>>>>> replication options.
>>>>>>> 
>>>>>>>  -
>>>>>>> 
>>>>>>>  You can create EC keys and read the same back.
>>>>>>>  -
>>>>>>> 
>>>>>>>  You should be able to continue writing even when chosen nodes are
>>>>>>>  failing. (Of Course minimum of Data+Parity live nodes should be
>>>>>>> available
>>>>>>>  in cluster for complete the write)
>>>>>>>  -
>>>>>>> 
>>>>>>>  You should be able to read the file back even if a few nodes
>> failed
>>>>> in
>>>>>>>  the same ec block group(Failures should not be more than parity
>>>>> number
>>>>>>> of
>>>>>>>  nodes.).
>>>>>>> 
>>>>>>> What is pending? Offline recovery of lost/missing EC containers. As
>>>>>>> mentioned above, post merge of this branch, I will create a separate
>>>>> JIRA
>>>>>>> for starting the work for OfflineRecovery.
>>>>>>> 
>>>>>>> 
>>>>>>> There are automated acceptance test cases already added. HDDS-6231
>>>>>>> 
>>>>>>> In addition to that, we have also performed basic Acceptance Testing
>>> in
>>>>>>> physical cluster:
>>>>>>> 
>>>>>>>  1.
>>>>>>> 
>>>>>>>  Installed 10 nodes cluster and created EC bucket (3:2).
>>>>>>> 
>>>>>>> Uploaded 10GB key.
>>>>>>> 
>>>>>>> Downloaded the same key and checked the md5sum.
>>>>>>> 
>>>>>>>  1.
>>>>>>> 
>>>>>>>  Uploaded 8GB key.
>>>>>>> 
>>>>>>> Downloaded the same key and checked the md5sum.
>>>>>>> 
>>>>>>>  1.
>>>>>>> 
>>>>>>>  Uploaded 3MB key
>>>>>>> 
>>>>>>> Downloaded the same and verified md5sum.
>>>>>>> 
>>>>>>>  1.
>>>>>>> 
>>>>>>>  Changed bucket to (6:3)
>>>>>>> 
>>>>>>> Uploaded 8GB key
>>>>>>> 
>>>>>>> Download the same.
>>>>>>> 
>>>>>>> Also verified the new key should be in 6:3 policy and old keys must
>> be
>>>>>>> 3:2.Verified
>>>>>>> with several different size key writes and reads.
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> Since the merge discussion thread, we have well stabilized code and
>>>>> fixed
>>>>>>> several bugs.
>>>>>>> 
>>>>>>> 
>>>>>>> Merge checklist items assessment is here:
>>>>>>> 
>>>>>>> 
>>>>>>> https://cwiki.apache.org/confluence/display/OZONE/Ozone+EC+Branch%28HDDS-3816-ec%29+Phase-1+%3A+Merge+Checklist
>>>>>>> 
>>>>>>> Big shoutout to Stephen O'Donnell <so...@cloudera.com>, Istvan
>>>>> Fajth
>>>>>>> <pi...@cloudera.com> for great efforts in core development and also
>>>>> thanks
>>>>>>> a lot  to Sammi, Mingchao Zhao, Mark Gui, Kaijie, Attila for
>>>>> collaborating
>>>>>>> on some of the EC tasks.
>>>>>>> 
>>>>>>> Thanks to Marton for design discussion and on some dev tasks as
>> well.
>>>>>>> 
>>>>>>> Thanks to many others who were involved in design discussions,
>> Arpit,
>>>>> Sidd,
>>>>>>> Jitendra, Mukul, Sanjay, Karthik, Bharat, Nanda, Shashi, Prashanth,
>>>>> Rakesh,
>>>>>>> Yiqun Lin.
>>>>>>> Sorry if I miss anyone here, but your efforts are much appreciated.
>>>>> Without
>>>>>>> your tremendous help, we would have not reached this position yet.
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> To start with, here is my +1
>>>>>>> 
>>>>>>> The vote will run for 5 days.
>>>>>>> 
>>>>>>> Regards,
>>>>>>> Uma
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> On Tue, Apr 5, 2022 at 10:58 PM Uma gangumalla <umamahesh@apache.org> wrote:
>>>>>>> 
>>>>>>>> Dear Ozone Devs,
>>>>>>>> 
>>>>>>>> As you may know, we have been actively developing Ozone Erasure
>>> Coding
>>>>>>>> support in a separate branch HDDS-3816-ec.
>>>>>>>> 
>>>>>>>> We have finished the development of EC key write and read
>>>>> functionality.
>>>>>>>> The support of offline recovery( Recovering replica from node loss)
>>>>> will
>>>>>>> be
>>>>>>>> part of second phase work.
>>>>>>>> 
>>>>>>>> Since the code has already grown and increasingly started seeing
>>> merge
>>>>>>>> complications, we would like to propose to merge the current EC
>>> branch
>>>>>>> into
>>>>>>>> master.
>>>>>>>> 
>>>>>>>> We filed the new JIRA(HDDS-6462) for the second phase of work and
>>>>>>>> continued the offline recovery work there.
>>>>>>>> 
>>>>>>>> Details on Changes:
>>>>>>>> 
>>>>>>>>  -
>>>>>>>> 
>>>>>>>>  Most of the EC core logic went to newly extended classes. Key
>>>>> changes
>>>>>>>>  went into EC*OutputStream and EC*InputStream classes for write
>> and
>>>>>>> read
>>>>>>>>  respectively. Based on replication type, ECPipelineProvider will
>> be
>>>>>>> chosen
>>>>>>>>  for creating EC pipelines.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>>  -
>>>>>>>> 
>>>>>>>>  Since we cannot represent the EC replication in the existing
>>>>>>>>  replication factor, we have introduced ECReplicationConfig. The
>>>>>>>>  ReplicationConfig interface is already pushed to master, so it’s
>>>>> not
>>>>>>> a new
>>>>>>>>  idea coming through this branch merge now. What is newly coming
>>>>> here
>>>>>>> is the
>>>>>>>>  ECReplicationConfig class which can be used to express EC
>>>>> replication
>>>>>>>>  configuration.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>>  -
>>>>>>>> 
>>>>>>>>  We wanted to provide the support to enable EC at bucket level. To
>>>>>>>>  simplify some complications, we have moved the default
>> replication
>>>>>>>>  configurations from client to server.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>>  -
>>>>>>>> 
>>>>>>>>  Client side replication type and replication factor removed from
>>>>> the
>>>>>>>>  configuration files and introduced the
>>>>>>> ozone.server.default.replication
>>>>>>>>  and ozone.server.default.replication.type.We would continue to
>>>>>>> respect if
>>>>>>>>  one configures at client side explicitly or passed through APIs,
>>>>>>> otherwise
>>>>>>>>  server side bucket level properties or server side default
>>>>>>> configuration
>>>>>>>>  would take effect.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>>  -
>>>>>>>> 
>>>>>>>>  Other than this change, the rest of EC side code should not
>> impact
>>>>> any
>>>>>>>>  of the existing code flows.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> We have finished documentation JIRA(HDDS-6172) for covering this
>>>>> feature
>>>>>>>> and we will continue to improve further in master.
>>>>>>>> 
>>>>>>>> Git Branch Name : HDDS-3816-ec
>>>>>>>> 
>>>>>>>> JIRAs: HDDS-3816 and HDDS-5351
>>>>>>>> 
>>>>>>>> Completed tasks: ~ 142
>>>>>>>> 
>>>>>>>> + We are covering the following two mandatory JIRAs:
>>>>>>>> 
>>>>>>>> 1. HDDS-6209: EC: [Forward compatibility issue] New client to older
>>>>>>>> server could fail due to the unavailability for client default
>>>>>>>> replication config
>>>>>>>>
>>>>>>>> 2. HDDS-5909: EC: Onboard EC into upgrade framework.
>>>>>>>>
>>>>>>>> PRs reviews in-progress and expected to close in a day or two.
>>>>>>>>
>>>>>>>> Few other JIRAs in HDDS-3816 are still open but I believe they're not
>>>>>>>> blockers for merge.
>>>>>>>> 
>>>>>>>> In short what you can do now with this feature:
>>>>>>>> 
>>>>>>>>  -
>>>>>>>> 
>>>>>>>>  You can enable EC at bucket level and cluster level.
>>>>>>>> 
>>>>>>>> How to enable it at bucket level? Just create the bucket by passing the
>>>>>>>> ec replication options.
>>>>>>>> 
>>>>>>>>  -
>>>>>>>> 
>>>>>>>>  You can create EC keys and read the same back.
>>>>>>>>  -
>>>>>>>> 
>>>>>>>>  You should be able to continue writing even when chosen nodes are
>>>>>>>>  failing. (Of Course minimum of Data+Parity live nodes should be
>>>>>>>>  available in cluster for complete the write)
>>>>>>>>  -
>>>>>>>>
>>>>>>>>  You should be able to read the file back even if a few nodes failed in
>>>>>>>>  the same ec block group(Failures should not be more than parity number
>>>>>>>>  of nodes.).
>>>>>>>>
>>>>>>>> What is pending? Offline recovery of lost/missing EC containers. As
>>>>>>>> mentioned above, post merge of this branch, I will create a separate
>>>>>>>> JIRA for starting the work for OfflineRecovery.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> There are automated acceptance test cases already added. HDDS-6231
>>>>>>>> 
>>>>>>>> In addition to that, we have also performed basic Acceptance Testing in
>>>>>>>> physical cluster:
>>>>>>>> 
>>>>>>>>  1.
>>>>>>>> 
>>>>>>>>  Installed 10 nodes cluster and created EC bucket (3:2).
>>>>>>>> 
>>>>>>>> Uploaded 10GB key.
>>>>>>>> 
>>>>>>>> Downloaded the same key and checked the md5sum.
>>>>>>>> 
>>>>>>>>  1.
>>>>>>>> 
>>>>>>>>  Uploaded 8GB key.
>>>>>>>> 
>>>>>>>> Downloaded the same key and checked the md5sum.
>>>>>>>> 
>>>>>>>>  1.
>>>>>>>> 
>>>>>>>>  Uploaded 3MB key
>>>>>>>> 
>>>>>>>> Downloaded the same and verified md5sum.
>>>>>>>> 
>>>>>>>>  1.
>>>>>>>> 
>>>>>>>>  Changed bucket to (6:3)
>>>>>>>> 
>>>>>>>> Uploaded 8GB key
>>>>>>>> 
>>>>>>>> Download the same.
>>>>>>>> 
>>>>>>>> Also verified the new key should be in 6:3 policy and old keys must be
>>>>>>>> 3:2. Verified with several different size key writes and reads.
>>>>>>>>
>>>>>>>> Merge checklist items assessment is here:
>>>>>>>>
>>>>>>>> https://cwiki.apache.org/confluence/display/OZONE/Ozone+EC+Branch%28HDDS-3816-ec%29+Phase-1+%3A+Merge+Checklist
>>>>>>>> 
>>>>>>>> Big shoutout to Stephen O'Donnell <so...@cloudera.com>, Istvan Fajth
>>>>>>>> <pi...@cloudera.com> for great efforts in core development and also
>>>>>>>> thanks a lot to Sammi, Mingchao Zhao, Mark Gui, Kaijie for collaborating
>>>>>>>> on some of the EC tasks.
>>>>>>>>
>>>>>>>> Thanks to Marton for design discussion and on some dev tasks as well.
>>>>>>>>
>>>>>>>> Thanks to many others who were involved in design discussions, Arpit,
>>>>>>>> Sidd, Jitendra, Mukul, Sanjay, Karthik, Bharat, Nanda, Shashi, Prashanth,
>>>>>>>> Rakesh, Yiqun Lin.
>>>>>>>> Sorry if I miss anyone here, but your efforts are much appreciated.
>>>>>>>> Without your tremendous help, we would have not reached this position
>>>>>>>> yet.
>>>>>>>>
>>>>>>>> If there are no objections for the merge, I will start the official vote
>>>>>>>> later.
>>>>>>>> 
>>>>>>>> Regards,
>>>>>>>> 
>>>>>>>> EC Branch Devs
>>>>>>>> 
>>>>>>> 
>>>>> 
>>> 
>>> 
>>> 
>>> 
>> 
>> --
>> Thanks & Regards,
>> Aravindan
>> 


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@ozone.apache.org
For additional commands, e-mail: dev-help@ozone.apache.org
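
The precedence described in the proposal quoted above (an explicit client-side setting or API argument is respected; otherwise the server-side bucket-level property or the server-side default takes effect) can be sketched as follows. This is a minimal illustration and not Ozone's implementation: the function name resolve_replication, the dictionary-based configuration, and the example values such as "rs-3-2-1024k" are assumptions made for this sketch. Only the two ozone.server.default.replication* property names come from the thread. Consulting the bucket-level property before the server default is also an assumption that the proposal implies but does not spell out.

```python
from typing import Mapping, Optional, Tuple

# Property names as given in the proposal; everything else is hypothetical.
SERVER_TYPE_KEY = "ozone.server.default.replication.type"
SERVER_REPLICATION_KEY = "ozone.server.default.replication"


def resolve_replication(
    client_explicit: Optional[Tuple[str, str]],
    bucket_default: Optional[Tuple[str, str]],
    server_conf: Mapping[str, str],
) -> Tuple[str, Tuple[str, str]]:
    """Return (source, (type, replication)) for a new key.

    Assumed order: explicit client/API value, then the bucket-level
    property, then the server-side default.
    """
    if client_explicit is not None:
        return "client", client_explicit
    if bucket_default is not None:
        return "bucket", bucket_default
    return "server", (server_conf[SERVER_TYPE_KEY],
                      server_conf[SERVER_REPLICATION_KEY])


# Illustrative values only; they are not taken from the thread.
server_conf = {
    SERVER_TYPE_KEY: "RATIS",
    SERVER_REPLICATION_KEY: "3",
}

print(resolve_replication(None, None, server_conf))
print(resolve_replication(None, ("EC", "rs-3-2-1024k"), server_conf))
print(resolve_replication(("EC", "rs-6-3-1024k"),
                          ("EC", "rs-3-2-1024k"), server_conf))
```

Returning the source alongside the value makes it easy to see which level supplied the setting when an unexpected replication type shows up on a key.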


Re: [VOTE] Merge Ozone Erasure Coding branch (HDDS-3816-ec) into master

Posted by Ayush Saxena <ay...@gmail.com>.
+1 for merge

-Ayush

> On 07-Apr-2022, at 11:19 AM, Bharat Viswanadham <vi...@gmail.com> wrote:
> 
> +1 for merge.
> 
> Thanks,
> Bharat
> 
> 
>> On Wed, Apr 6, 2022 at 10:36 PM Tsz Wo Sze <sz...@gmail.com> wrote:
>> 
>> +1
>> We should merge it so that more people can try it.  We can work on the
>> remaining tasks in the master branch.  Thanks a lot!
>> 
>> Tsz-Wo
>> 
>> On Thu, Apr 7, 2022 at 1:17 PM Aravindan Vijayan
>> <av...@cloudera.com.invalid> wrote:
>> 
>>> +1 for the merge. Thanks for the great work!
>>> 
>>> 
>>> 
>>> On Wed, Apr 6, 2022 at 9:45 PM Prashant Pogde <ppogde@cloudera.com.invalid>
>>> wrote:
>>> 
>>>> +1 for the EC branch merge.
>>>> 
>>>> Regards,
>>>> Prashant
>>>> 
>>>>> On Apr 6, 2022, at 8:20 PM, Siddharth Wagle <sw...@apache.org> wrote:
>>>>> 
>>>>> +1 for the EC branch merge.
>>>>> 
>>>>> Best,
>>>>> Sid
>>>>> 
>>>>> On Wed, Apr 6, 2022 at 8:05 PM guimark <gu...@126.com> wrote:
>>>>> 
>>>>>> Great news!
>>>>>> +1 to merge.
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> At 2022-04-06 22:18:31, "Stephen O'Donnell" <sodonnell@cloudera.com.INVALID>
>>>>>> wrote:
>>>>>>> I have been working on the code on this branch for some time, and I
>>>>>>> believe it is in a good state to merge now. It is mostly new code, and if
>>>>>>> nothing attempts to use EC, none of the EC code paths will be executed.
>>>>>>> 
>>>>>>> +1 to merge from me.
>>>>>>> 
>>>>>>> Stephen.
>>>>>>> 
>>>>>>> On Wed, Apr 6, 2022 at 7:11 AM Uma gangumalla <umamahesh@apache.org>
>>>>>>> wrote:
>>>>>>> 
>>>>>>>> =====Few Edits Below===================
>>>>>>>> 
>>>>>>>> Dear Ozone Devs,
>>>>>>>> 
>>>>>>>> As you may know, we have been actively developing Ozone Erasure Coding
>>>>>>>> support in a separate branch HDDS-3816-ec.
>>>>>>>>
>>>>>>>> We have finished the development of EC key write and read functionality.
>>>>>>>> The support of offline recovery( Recovering replica from node loss) will
>>>>>>>> be part of second phase work.
>>>>>>>>
>>>>>>>> Since the code has already grown and increasingly started seeing merge
>>>>>>>> complications, we would like to merge the current EC branch into master.
>>>>>>>>
>>>>>>>> We filed the new JIRA(HDDS-6462) for the second phase of work and
>>>>>>>> continued the offline recovery work there. (we have uploaded the design
>>>>>>>> doc there)
>>>>>>>> 
>>>>>>>> Details on Changes:
>>>>>>>> 
>>>>>>>>  -
>>>>>>>> 
>>>>>>>>  Most of the EC core logic went to newly extended classes. Key changes
>>>>>>>>  went into EC*OutputStream and EC*InputStream classes for write and read
>>>>>>>>  respectively. Based on replication type, ECPipelineProvider will be
>>>>>>>>  chosen for creating EC pipelines.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>>  -
>>>>>>>> 
>>>>>>>>  Since we cannot represent the EC replication in the existing replication
>>>>>>>>  factor, we have introduced ECReplicationConfig. The ReplicationConfig
>>>>>>>>  interface is already pushed to master, so it’s not a new idea coming
>>>>>>>>  through this branch merge now. What is newly coming here is the
>>>>>>>>  ECReplicationConfig class which can be used to express EC replication
>>>>>>>>  configuration.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>>  -
>>>>>>>> 
>>>>>>>>  We wanted to provide the support to enable EC at bucket level. To
>>>>>>>>  simplify some complications, we have moved the default replication
>>>>>>>>  configurations from client to server.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>>  -
>>>>>>>> 
>>>>>>>>  Client side replication type and replication factor removed from the
>>>>>>>>  configuration files and introduced the ozone.server.default.replication
>>>>>>>>  and ozone.server.default.replication.type. We would continue to respect
>>>>>>>>  if one configures at client side explicitly or passed through APIs,
>>>>>>>>  otherwise server side bucket level properties or server side default
>>>>>>>>  configuration would take effect.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>>  -
>>>>>>>> 
>>>>>>>>  Other than this change, the rest of EC side code should not impact any
>>>>>>>>  of the existing code flows.
>>>>>>>>
>>>>>>>>
>>>>>>>> We have finished documentation JIRA(HDDS-6172) for covering this feature
>>>>>>>> and we will continue to improve further in master.
>>>>>>>> 
>>>>>>>> Git Branch Name : HDDS-3816-ec
>>>>>>>> 
>>>>>>>> JIRAs: HDDS-3816 and HDDS-5351
>>>>>>>> 
>>>>>>>> Completed tasks: ~ 142
>>>>>>>> 
>>>>>>>> + We are covering the following two mandatory JIRAs to come in:
>>>>>>>> 
>>>>>>>> 1. HDDS-6209: EC: [Forward compatibility issue] New client to older
>>>>>>>> server could fail due to the unavailability for client default
>>>>>>>> replication config
>>>>>>>>
>>>>>>>> 2. HDDS-5909: EC: Onboard EC into upgrade framework.
>>>>>>>>
>>>>>>>> PRs reviews in-progress and expected to close in a day or two.
>>>>>>>>
>>>>>>>> Few other JIRAs in HDDS-3816 are still open but I believe they're not
>>>>>>>> blockers for merge.
>>>>>>>> 
>>>>>>>> In short what you can do now with this feature:
>>>>>>>> 
>>>>>>>>  -
>>>>>>>> 
>>>>>>>>  You can enable EC at bucket level and cluster level.
>>>>>>>> 
>>>>>>>> How to enable it at bucket level? Just create the bucket by passing the
>>>>>>>> ec replication options.
>>>>>>>> 
>>>>>>>>  -
>>>>>>>> 
>>>>>>>>  You can create EC keys and read the same back.
>>>>>>>>  -
>>>>>>>> 
>>>>>>>>  You should be able to continue writing even when chosen nodes are
>>>>>>>>  failing. (Of Course minimum of Data+Parity live nodes should be
>>>>>>>>  available in cluster for complete the write)
>>>>>>>>  -
>>>>>>>>
>>>>>>>>  You should be able to read the file back even if a few nodes failed in
>>>>>>>>  the same ec block group(Failures should not be more than parity number
>>>>>>>>  of nodes.).
>>>>>>>>
>>>>>>>> What is pending? Offline recovery of lost/missing EC containers. As
>>>>>>>> mentioned above, post merge of this branch, I will create a separate
>>>>>>>> JIRA for starting the work for OfflineRecovery.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> There are automated acceptance test cases already added. HDDS-6231
>>>>>>>> 
>>>>>>>> In addition to that, we have also performed basic Acceptance Testing in
>>>>>>>> physical cluster:
>>>>>>>> 
>>>>>>>>  1.
>>>>>>>> 
>>>>>>>>  Installed 10 nodes cluster and created EC bucket (3:2).
>>>>>>>> 
>>>>>>>> Uploaded 10GB key.
>>>>>>>> 
>>>>>>>> Downloaded the same key and checked the md5sum.
>>>>>>>> 
>>>>>>>>  1.
>>>>>>>> 
>>>>>>>>  Uploaded 8GB key.
>>>>>>>> 
>>>>>>>> Downloaded the same key and checked the md5sum.
>>>>>>>> 
>>>>>>>>  1.
>>>>>>>> 
>>>>>>>>  Uploaded 3MB key
>>>>>>>> 
>>>>>>>> Downloaded the same and verified md5sum.
>>>>>>>> 
>>>>>>>>  1.
>>>>>>>> 
>>>>>>>>  Changed bucket to (6:3)
>>>>>>>> 
>>>>>>>> Uploaded 8GB key
>>>>>>>> 
>>>>>>>> Download the same.
>>>>>>>> 
>>>>>>>> Also verified the new key should be in 6:3 policy and old keys must be
>>>>>>>> 3:2. Verified with several different size key writes and reads.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Since the merge discussion thread, we have well stabilized code and
>>>>>>>> fixed several bugs.
>>>>>>>>
>>>>>>>>
>>>>>>>> Merge checklist items assessment is here:
>>>>>>>>
>>>>>>>> https://cwiki.apache.org/confluence/display/OZONE/Ozone+EC+Branch%28HDDS-3816-ec%29+Phase-1+%3A+Merge+Checklist
>>>>>>>> 
>>>>>>>> Big shoutout to Stephen O'Donnell <so...@cloudera.com>, Istvan Fajth
>>>>>>>> <pi...@cloudera.com> for great efforts in core development and also
>>>>>>>> thanks a lot to Sammi, Mingchao Zhao, Mark Gui, Kaijie, Attila for
>>>>>>>> collaborating on some of the EC tasks.
>>>>>>>>
>>>>>>>> Thanks to Marton for design discussion and on some dev tasks as well.
>>>>>>>>
>>>>>>>> Thanks to many others who were involved in design discussions, Arpit,
>>>>>>>> Sidd, Jitendra, Mukul, Sanjay, Karthik, Bharat, Nanda, Shashi, Prashanth,
>>>>>>>> Rakesh, Yiqun Lin.
>>>>>>>> Sorry if I miss anyone here, but your efforts are much appreciated.
>>>>>>>> Without your tremendous help, we would have not reached this position
>>>>>>>> yet.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> To start with, here is my +1
>>>>>>>> 
>>>>>>>> The vote will run for 5 days.
>>>>>>>> 
>>>>>>>> Regards,
>>>>>>>> Uma
>>>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>> 
>>> --
>>> Thanks & Regards,
>>> Aravindan
>>> 
>> 

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@ozone.apache.org
For additional commands, e-mail: dev-help@ozone.apache.org
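
The availability conditions stated in the quoted proposal reduce to simple arithmetic on the data and parity counts. The following is a minimal sketch, assuming a hypothetical ECScheme helper (it is not an Ozone class): a write needs at least data + parity live nodes in the cluster, and a read succeeds as long as the failed nodes within one EC block group do not exceed the parity count. The storage-overhead figure is standard erasure-coding arithmetic rather than something stated in the thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ECScheme:
    """Hypothetical helper describing a data:parity layout such as 3:2."""
    data: int
    parity: int

    def can_write(self, live_nodes: int) -> bool:
        # The proposal requires at least data + parity live nodes.
        return live_nodes >= self.data + self.parity

    def can_read(self, failed_in_group: int) -> bool:
        # Reads tolerate up to `parity` failed nodes in one block group.
        return 0 <= failed_in_group <= self.parity

    def storage_overhead(self) -> float:
        # Raw bytes stored per byte of user data.
        return (self.data + self.parity) / self.data


# The two layouts exercised in the acceptance testing: 3:2 and 6:3,
# checked against the 10-node cluster size mentioned there.
for scheme in (ECScheme(3, 2), ECScheme(6, 3)):
    print(scheme,
          scheme.can_write(10),
          scheme.can_read(scheme.parity),
          scheme.can_read(scheme.parity + 1),
          round(scheme.storage_overhead(), 2))
```

Under these assumptions a 10-node cluster can host both layouts, since 6:3 needs nine live nodes and 3:2 needs five.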


Re: [VOTE] Merge Ozone Erasure Coding branch (HDDS-3816-ec) into master

Posted by Bharat Viswanadham <vi...@gmail.com>.
+1 for merge.

Thanks,
Bharat


On Wed, Apr 6, 2022 at 10:36 PM Tsz Wo Sze <sz...@gmail.com> wrote:

> +1
> We should merge it so that more people can try it.  We can work on the
> remaining tasks in the master branch.  Thanks a lot!
>
> Tsz-Wo
>
> On Thu, Apr 7, 2022 at 1:17 PM Aravindan Vijayan
> <av...@cloudera.com.invalid> wrote:
>
> > +1 for the merge. Thanks for the great work!
> >
> >
> >
> > On Wed, Apr 6, 2022 at 9:45 PM Prashant Pogde <ppogde@cloudera.com.invalid>
> > wrote:
> >
> > > +1 for the EC branch merge.
> > >
> > > Regards,
> > > Prashant
> > >
> > > > On Apr 6, 2022, at 8:20 PM, Siddharth Wagle <sw...@apache.org> wrote:
> > > >
> > > > +1 for the EC branch merge.
> > > >
> > > > Best,
> > > > Sid
> > > >
> > > > On Wed, Apr 6, 2022 at 8:05 PM guimark <gu...@126.com> wrote:
> > > >
> > > >> Great news!
> > > >> +1 to merge.
> > > >>
> > > >>
> > > >>
> > > >>
> > > >> At 2022-04-06 22:18:31, "Stephen O'Donnell" <sodonnell@cloudera.com.INVALID>
> > > >> wrote:
> > > >>> I have been working on the code on this branch for some time, and I
> > > >>> believe it is in a good state to merge now. It is mostly new code, and
> > > >>> if nothing attempts to use EC, none of the EC code paths will be
> > > >>> executed.
> > > >>>
> > > >>> +1 to merge from me.
> > > >>>
> > > >>> Stephen.
> > > >>>
> > > >>> On Wed, Apr 6, 2022 at 7:11 AM Uma gangumalla <umamahesh@apache.org>
> > > >>> wrote:
> > > >>>
> > > >>>> =====Few Edits Below===================
> > > >>>>
> > > >>>> Dear Ozone Devs,
> > > >>>>
> > > >>>> As you may know, we have been actively developing Ozone Erasure
> > > >>>> Coding support in a separate branch HDDS-3816-ec.
> > > >>>>
> > > >>>> We have finished the development of EC key write and read
> > > >>>> functionality. The support of offline recovery( Recovering replica
> > > >>>> from node loss) will be part of second phase work.
> > > >>>>
> > > >>>> Since the code has already grown and increasingly started seeing
> > > >>>> merge complications, we would like to merge the current EC branch
> > > >>>> into master.
> > > >>>>
> > > >>>> We filed the new JIRA(HDDS-6462) for the second phase of work and
> > > >>>> continued the offline recovery work there. (we have uploaded the
> > > >>>> design doc there)
> > > >>>>
> > > >>>> Details on Changes:
> > > >>>>
> > > >>>>   -
> > > >>>>
> > > >>>>   Most of the EC core logic went to newly extended classes. Key
> > > >>>>   changes went into EC*OutputStream and EC*InputStream classes for
> > > >>>>   write and read respectively. Based on replication type,
> > > >>>>   ECPipelineProvider will be chosen for creating EC pipelines.
> > > >>>>
> > > >>>>
> > > >>>>
> > > >>>>   -
> > > >>>>
> > > >>>>   Since we cannot represent the EC replication in the existing
> > > >>>>   replication factor, we have introduced ECReplicationConfig. The
> > > >>>>   ReplicationConfig interface is already pushed to master, so it’s
> > > >>>>   not a new idea coming through this branch merge now. What is newly
> > > >>>>   coming here is the ECReplicationConfig class which can be used to
> > > >>>>   express EC replication configuration.
> > > >>>>
> > > >>>>
> > > >>>>
> > > >>>>   -
> > > >>>>
> > > >>>>   We wanted to provide the support to enable EC at bucket level. To
> > > >>>>   simplify some complications, we have moved the default replication
> > > >>>>   configurations from client to server.
> > > >>>>
> > > >>>>
> > > >>>>
> > > >>>>   -
> > > >>>>
> > > >>>>   Client side replication type and replication factor removed from
> > > >>>>   the configuration files and introduced the
> > > >>>>   ozone.server.default.replication and
> > > >>>>   ozone.server.default.replication.type. We would continue to
> > > >>>>   respect if one configures at client side explicitly or passed
> > > >>>>   through APIs, otherwise server side bucket level properties or
> > > >>>>   server side default configuration would take effect.
> > > >>>>
> > > >>>>
> > > >>>>
> > > >>>>   -
> > > >>>>
> > > >>>>   Other than this change, the rest of EC side code should not impact
> > > >>>>   any of the existing code flows.
> > > >>>>
> > > >>>>
> > > >>>> We have finished documentation JIRA(HDDS-6172) for covering this
> > > >>>> feature and we will continue to improve further in master.
> > > >>>>
> > > >>>> Git Branch Name : HDDS-3816-ec
> > > >>>>
> > > >>>> JIRAs: HDDS-3816 and HDDS-5351
> > > >>>>
> > > >>>> Completed tasks: ~ 142
> > > >>>>
> > > >>>> + We are covering the following two mandatory JIRAs to come in:
> > > >>>>
> > > >>>> 1. HDDS-6209: EC: [Forward compatibility issue] New client to older
> > > >>>> server could fail due to the unavailability for client default
> > > >>>> replication config
> > > >>>>
> > > >>>> 2. HDDS-5909: EC: Onboard EC into upgrade framework.
> > > >>>>
> > > >>>> PRs reviews in-progress and expected to close in a day or two.
> > > >>>>
> > > >>>> Few other JIRAs in HDDS-3816 are still open but I believe they're
> > > >>>> not blockers for merge.
> > > >>>>
> > > >>>> In short what you can do now with this feature:
> > > >>>>
> > > >>>>   -
> > > >>>>
> > > >>>>   You can enable EC at bucket level and cluster level.
> > > >>>>
> > > >>>> How to enable it at bucket level? Just create the bucket by passing
> > > >>>> the ec replication options.
> > > >>>>
> > > >>>>   -
> > > >>>>
> > > >>>>   You can create EC keys and read the same back.
> > > >>>>   -
> > > >>>>
> > > >>>>   You should be able to continue writing even when chosen nodes are
> > > >>>>   failing. (Of Course minimum of Data+Parity live nodes should be
> > > >>>>   available in cluster for complete the write)
> > > >>>>   -
> > > >>>>
> > > >>>>   You should be able to read the file back even if a few nodes failed
> > > >>>>   in the same ec block group(Failures should not be more than parity
> > > >>>>   number of nodes.).
> > > >>>>
> > > >>>> What is pending? Offline recovery of lost/missing EC containers. As
> > > >>>> mentioned above, post merge of this branch, I will create a separate
> > > >>>> JIRA for starting the work for OfflineRecovery.
> > > >>>>
> > > >>>>
> > > >>>> There are automated acceptance test cases already added. HDDS-6231
> > > >>>>
> > > >>>> In addition to that, we have also performed basic Acceptance Testing
> > > >>>> in physical cluster:
> > > >>>>
> > > >>>>   1.
> > > >>>>
> > > >>>>   Installed 10 nodes cluster and created EC bucket (3:2).
> > > >>>>
> > > >>>> Uploaded 10GB key.
> > > >>>>
> > > >>>> Downloaded the same key and checked the md5sum.
> > > >>>>
> > > >>>>   1.
> > > >>>>
> > > >>>>   Uploaded 8GB key.
> > > >>>>
> > > >>>> Downloaded the same key and checked the md5sum.
> > > >>>>
> > > >>>>   1.
> > > >>>>
> > > >>>>   Uploaded 3MB key
> > > >>>>
> > > >>>> Downloaded the same and verified md5sum.
> > > >>>>
> > > >>>>   1.
> > > >>>>
> > > >>>>   Changed bucket to (6:3)
> > > >>>>
> > > >>>> Uploaded 8GB key
> > > >>>>
> > > >>>> Download the same.
> > > >>>>
> > > >>>> Also verified the new key should be in 6:3 policy and old keys must
> > > >>>> be 3:2. Verified with several different size key writes and reads.
> > > >>>>
> > > >>>>
> > > >>>>
> > > >>>> Since the merge discussion thread, we have well stabilized code and
> > > >>>> fixed several bugs.
> > > >>>>
> > > >>>>
> > > >>>> Merge checklist items assessment is here:
> > > >>>>
> > > >>>> https://cwiki.apache.org/confluence/display/OZONE/Ozone+EC+Branch%28HDDS-3816-ec%29+Phase-1+%3A+Merge+Checklist
> > > >>>>
> > > >>>> Big shoutout to Stephen O'Donnell <so...@cloudera.com>, Istvan Fajth
> > > >>>> <pi...@cloudera.com> for great efforts in core development and also
> > > >>>> thanks a lot to Sammi, Mingchao Zhao, Mark Gui, Kaijie, Attila for
> > > >>>> collaborating on some of the EC tasks.
> > > >>>>
> > > >>>> Thanks to Marton for design discussion and on some dev tasks as well.
> > > >>>>
> > > >>>> Thanks to many others who were involved in design discussions, Arpit,
> > > >>>> Sidd, Jitendra, Mukul, Sanjay, Karthik, Bharat, Nanda, Shashi,
> > > >>>> Prashanth, Rakesh, Yiqun Lin.
> > > >>>> Sorry if I miss anyone here, but your efforts are much appreciated.
> > > >>>> Without your tremendous help, we would have not reached this position
> > > >>>> yet.
> > > >>>>
> > > >>>>
> > > >>>>
> > > >>>> To start with, here is my +1
> > > >>>>
> > > >>>> The vote will run for 5 days.
> > > >>>>
> > > >>>> Regards,
> > > >>>> Uma
> > > >>>>
> > > >>>>
> > > >>>>
> > > >>>> On Tue, Apr 5, 2022 at 10:58 PM Uma gangumalla <umamahesh@apache.org>
> > > >>>> wrote:
> > > >>>>
> > > >>>>> Dear Ozone Devs,
> > > >>>>>
> > > >>>>> As you may know, we have been actively developing Ozone Erasure
> > > >>>>> Coding support in a separate branch HDDS-3816-ec.
> > > >>>>>
> > > >>>>> We have finished the development of EC key write and read
> > > >>>>> functionality. The support of offline recovery( Recovering replica
> > > >>>>> from node loss) will be part of second phase work.
> > > >>>>>
> > > >>>>> Since the code has already grown and increasingly started seeing
> > > >>>>> merge complications, we would like to propose to merge the current
> > > >>>>> EC branch into master.
> > > >>>>>
> > > >>>>> We filed the new JIRA(HDDS-6462) for the second phase of work and
> > > >>>>> continued the offline recovery work there.
> > > >>>>>
> > > >>>>> Details on Changes:
> > > >>>>>
> > > >>>>>   -
> > > >>>>>
> > > >>>>>   Most of the EC core logic went to newly extended classes. Key
> > > >>>>>   changes went into EC*OutputStream and EC*InputStream classes for
> > > >>>>>   write and read respectively. Based on replication type,
> > > >>>>>   ECPipelineProvider will be chosen for creating EC pipelines.
> > > >>>>>
> > > >>>>>
> > > >>>>>
> > > >>>>>   -
> > > >>>>>
> > > >>>>>   Since we cannot represent the EC replication in the existing
> > > >>>>>   replication factor, we have introduced ECReplicationConfig. The
> > > >>>>>   ReplicationConfig interface is already pushed to master, so it’s
> > > >>>>>   not a new idea coming through this branch merge now. What is newly
> > > >>>>>   coming here is the ECReplicationConfig class which can be used to
> > > >>>>>   express EC replication configuration.
> > > >>>>>
> > > >>>>>
> > > >>>>>
> > > >>>>>   -
> > > >>>>>
> > > >>>>>   We wanted to provide the support to enable EC at bucket level. To
> > > >>>>>   simplify some complications, we have moved the default replication
> > > >>>>>   configurations from client to server.
> > > >>>>>
> > > >>>>>
> > > >>>>>
> > > >>>>>   -
> > > >>>>>
> > > >>>>>   Client side replication type and replication factor removed from
> > > >>>>>   the configuration files and introduced the
> > > >>>>>   ozone.server.default.replication and
> > > >>>>>   ozone.server.default.replication.type. We would continue to
> > > >>>>>   respect if one configures at client side explicitly or passed
> > > >>>>>   through APIs, otherwise server side bucket level properties or
> > > >>>>>   server side default configuration would take effect.
> > > >>>>>
> > > >>>>>
> > > >>>>>
> > > >>>>>   -
> > > >>>>>
> > > >>>>>   Other than this change, the rest of EC side code should not impact
> > > >>>>>   any of the existing code flows.
> > > >>>>>
> > > >>>>>
> > > >>>>> We have finished documentation JIRA(HDDS-6172) for covering this
> > > >>>>> feature and we will continue to improve further in master.
> > > >>>>>
> > > >>>>> Git Branch Name : HDDS-3816-ec
> > > >>>>>
> > > >>>>> JIRAs: HDDS-3816 and HDDS-5351
> > > >>>>>
> > > >>>>> Completed tasks: ~ 142
> > > >>>>>
> > > >>>>> + We are covering the following two mandatory JIRAs:
> > > >>>>>
> > > >>>>> 1. HDDS-6209: EC: [Forward compatibility issue] New client to
> older
> > > >>>>> server could fail due to the unavailability for client default
> > > >>>> replication
> > > >>>>> config
> > > >>>>>
> > > >>>>> 2. HDDS-5909: EC: Onboard EC into upgrade framework.
> > > >>>>>
> > > >>>>> PRs reviews in-progress and expected to close in a day or two.
> > > >>>>>
> > > >>>>> Few other JIRAs in HDDS-3816 are still open but I believe they're
> > not
> > > >>>>> blockers for merge.
> > > >>>>>
> > > >>>>> In short what you can do now with this feature:
> > > >>>>>
> > > >>>>>   -
> > > >>>>>
> > > >>>>>   You can enable EC at bucket level and cluster level.
> > > >>>>>
> > > >>>>> How to enable it at bucket level? Just create the bucket by
> passing
> > > >> the
> > > >>>> ec
> > > >>>>> replication options.
> > > >>>>>
> > > >>>>>   -
> > > >>>>>
> > > >>>>>   You can create EC keys and read the same back.
> > > >>>>>   -
> > > >>>>>
> > > >>>>>   You should be able to continue writing even when chosen nodes
> are
> > > >>>>>   failing. (Of Course minimum of Data+Parity live nodes should be
> > > >>>> available
> > > >>>>>   in cluster for complete the write)
> > > >>>>>   -
> > > >>>>>
> > > >>>>>   You should be able to read the file back even if a few nodes
> > > >> failed in
> > > >>>>>   the same ec block group(Failures should not be more than parity
> > > >>>> number of
> > > >>>>>   nodes.).
> > > >>>>>
> > > >>>>> What is pending? Offline recovery of lost/missing EC containers.
> As
> > > >>>>> mentioned above, post merge of this branch, I will create a
> > separate
> > > >> JIRA
> > > >>>>> for starting the work for OfflineRecovery.
> > > >>>>>
> > > >>>>>
> > > >>>>> There are automated acceptance test cases already added.
> HDDS-6231
> > > >>>>>
> > > >>>>> In addition to that, we have also performed basic Acceptance
> > Testing
> > > >> in
> > > >>>>> physical cluster:
> > > >>>>>
> > > >>>>>   1.
> > > >>>>>
> > > >>>>>   Installed 10 nodes cluster and created EC bucket (3:2).
> > > >>>>>
> > > >>>>> Uploaded 10GB key.
> > > >>>>>
> > > >>>>> Downloaded the same key and checked the md5sum.
> > > >>>>>
> > > >>>>>   1.
> > > >>>>>
> > > >>>>>   Uploaded 8GB key.
> > > >>>>>
> > > >>>>> Downloaded the same key and checked the md5sum.
> > > >>>>>
> > > >>>>>   1.
> > > >>>>>
> > > >>>>>   Uploaded 3MB key
> > > >>>>>
> > > >>>>> Downloaded the same and verified md5sum.
> > > >>>>>
> > > >>>>>   1.
> > > >>>>>
> > > >>>>>   Changed bucket to (6:3)
> > > >>>>>
> > > >>>>> Uploaded 8GB key
> > > >>>>>
> > > >>>>> Download the same.
> > > >>>>>
> > > >>>>> Also verified the new key should be in 6:3 policy and old keys
> must
> > > be
> > > >>>> 3:2.Verified
> > > >>>>> with several different size key writes and reads.
> > > >>>>>
> > > >>>>> Merge checklist items assessment is here:
> > > >>>>>
> > > >>>>
> > > >>
> > >
> >
> https://cwiki.apache.org/confluence/display/OZONE/Ozone+EC+Branch%28HDDS-3816-ec%29+Phase-1+%3A+Merge+Checklist
> > > >>>>>
> > > >>>>> Big shoutout to Stephen O'Donnell <so...@cloudera.com>,
> Istvan
> > > >> Fajth
> > > >>>>> <pi...@cloudera.com> for great efforts in core development and
> > also
> > > >>>>> thanks a lot  to Sammi, Mingchao Zhao, Mark Gui, Kaijie for
> > > >> collaborating
> > > >>>>> on some of the EC tasks.
> > > >>>>>
> > > >>>>> Thanks to Marton for design discussion and on some dev tasks as
> > well.
> > > >>>>>
> > > >>>>> Thanks to many others who were involved in design discussions,
> > Arpit,
> > > >>>>> Sidd, Jitendra, Mukul, Sanjay, Karthik, Bharat, Nanda, Shashi,
> > > >> Prashanth,
> > > >>>>> Rakesh, Yiqun Lin.
> > > >>>>> Sorry if I miss anyone here, but your efforts are much
> appreciated.
> > > >>>>> Without your tremendous help, we would have not reached this
> > position
> > > >>>> yet.
> > > >>>>>
> > > >>>>> If there are no objections for the merge, I will start the
> official
> > > >> vote
> > > >>>>> later.
> > > >>>>>
> > > >>>>> Regards,
> > > >>>>>
> > > >>>>> EC Branch Devs
> > > >>>>>
> > > >>>>
> > > >>
> > >
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: dev-unsubscribe@ozone.apache.org
> > > For additional commands, e-mail: dev-help@ozone.apache.org
> > >
> > >
> >
> > --
> > Thanks & Regards,
> > Aravindan
> >
>

Re: [VOTE] Merge Ozone Erasure Coding branch (HDDS-3816-ec) into master

Posted by Tsz Wo Sze <sz...@gmail.com>.
+1
We should merge it so that more people can try it.  We can work on the
remaining tasks in the master branch.  Thanks a lot!

Tsz-Wo

On Thu, Apr 7, 2022 at 1:17 PM Aravindan Vijayan
<av...@cloudera.com.invalid> wrote:

> +1 for the merge. Thanks for the great work!
>
>
>
> On Wed, Apr 6, 2022 at 9:45 PM Prashant Pogde <ppogde@cloudera.com.invalid>
> wrote:
>
> > +1 for the EC branch merge.
> >
> > Regards,
> > Prashant
> >
> > > On Apr 6, 2022, at 8:20 PM, Siddharth Wagle <sw...@apache.org> wrote:
> > >
> > > +1 for the EC branch merge.
> > >
> > > Best,
> > > Sid
> > >
> > > On Wed, Apr 6, 2022 at 8:05 PM guimark <gu...@126.com> wrote:
> > >
> > >> Great news!
> > >> +1 to merge.
> > >>
> > >>
> > >>
> > >>
> > >> At 2022-04-06 22:18:31, "Stephen O'Donnell" <sodonnell@cloudera.com.INVALID>
> > >> wrote:
> > >>> I have been working on the code on this branch for some time, and I believe
> > >>> it is in a good state to merge now. It is mostly new code, and if nothing
> > >>> attempts to use EC, none of the EC code paths will be executed.
> > >>>
> > >>> +1 to merge from me.
> > >>>
> > >>> Stephen.
> > >>>
> > >>> On Wed, Apr 6, 2022 at 7:11 AM Uma gangumalla <um...@apache.org>
> > >>> wrote:
> > >>>
> > >>>> =====Few Edits Below===================
> > >>>>
> > >>>> Dear Ozone Devs,
> > >>>>
> > >>>> As you may know, we have been actively developing Ozone Erasure Coding
> > >>>> support in a separate branch HDDS-3816-ec.
> > >>>>
> > >>>> We have finished the development of EC key write and read functionality.
> > >>>> The support of offline recovery (recovering replica from node loss) will be
> > >>>> part of second phase work.
> > >>>>
> > >>>> Since the code has already grown and increasingly started seeing merge
> > >>>> complications, we would like to merge the current EC branch into master.
> > >>>>
> > >>>> We filed the new JIRA (HDDS-6462) for the second phase of work and continued
> > >>>> the offline recovery work there. (we have uploaded the design doc there)
> > >>>>
> > >>>> Details on Changes:
> > >>>>
> > >>>>   -
> > >>>>
> > >>>>   Most of the EC core logic went to newly extended classes. Key changes
> > >>>>   went into EC*OutputStream and EC*InputStream classes for write and read
> > >>>>   respectively. Based on replication type, ECPipelineProvider will be chosen
> > >>>>   for creating EC pipelines.
> > >>>>
> > >>>>
> > >>>>
> > >>>>   -
> > >>>>
> > >>>>   Since we cannot represent the EC replication in the existing replication
> > >>>>   factor, we have introduced ECReplicationConfig. The ReplicationConfig
> > >>>>   interface is already pushed to master, so it’s not a new idea coming
> > >>>>   through this branch merge now. What is newly coming here is the
> > >>>>   ECReplicationConfig class which can be used to express EC replication
> > >>>>   configuration.
> > >>>>
> > >>>>
> > >>>>
> > >>>>   -
> > >>>>
> > >>>>   We wanted to provide the support to enable EC at bucket level. To
> > >>>>   simplify some complications, we have moved the default replication
> > >>>>   configurations from client to server.
> > >>>>
> > >>>>
> > >>>>
> > >>>>   -
> > >>>>
> > >>>>   Client side replication type and replication factor are removed from the
> > >>>>   configuration files, and ozone.server.default.replication and
> > >>>>   ozone.server.default.replication.type are introduced. We would continue
> > >>>>   to respect a replication config that is set explicitly at client side or
> > >>>>   passed through APIs; otherwise server side bucket level properties or
> > >>>>   the server side default configuration would take effect.
> > >>>>
> > >>>>
> > >>>>
> > >>>>   -
> > >>>>
> > >>>>   Other than this change, the rest of EC side code should not impact any
> > >>>>   of the existing code flows.
> > >>>>
> > >>>>
> > >>>> We have finished documentation JIRA (HDDS-6172) for covering this feature
> > >>>> and we will continue to improve further in master.
> > >>>>
> > >>>> Git Branch Name : HDDS-3816-ec
> > >>>>
> > >>>> JIRAs: HDDS-3816 and HDDS-5351
> > >>>>
> > >>>> Completed tasks: ~ 142
> > >>>>
> > >>>> + We are covering the following two mandatory JIRAs to come in:
> > >>>>
> > >>>> 1. HDDS-6209: EC: [Forward compatibility issue] New client to older server
> > >>>> could fail due to the unavailability for client default replication config
> > >>>>
> > >>>> 2. HDDS-5909: EC: Onboard EC into upgrade framework.
> > >>>>
> > >>>> PR reviews are in progress and expected to close in a day or two.
> > >>>>
> > >>>> A few other JIRAs in HDDS-3816 are still open but I believe they're not
> > >>>> blockers for merge.
> > >>>>
> > >>>> In short what you can do now with this feature:
> > >>>>
> > >>>>   -
> > >>>>
> > >>>>   You can enable EC at bucket level and cluster level.
> > >>>>
> > >>>> How to enable it at bucket level? Just create the bucket by passing the EC
> > >>>> replication options.
> > >>>>
> > >>>>   -
> > >>>>
> > >>>>   You can create EC keys and read the same back.
> > >>>>   -
> > >>>>
> > >>>>   You should be able to continue writing even when chosen nodes are
> > >>>>   failing. (Of course, a minimum of Data+Parity live nodes should be
> > >>>>   available in the cluster to complete the write.)
> > >>>>   -
> > >>>>
> > >>>>   You should be able to read the file back even if a few nodes failed in
> > >>>>   the same EC block group. (Failures should not be more than the parity
> > >>>>   number of nodes.)
> > >>>>
> > >>>> What is pending? Offline recovery of lost/missing EC containers. As
> > >>>> mentioned above, post merge of this branch, I will create a separate JIRA
> > >>>> for starting the work for OfflineRecovery.
> > >>>>
> > >>>>
> > >>>> There are automated acceptance test cases already added. HDDS-6231
> > >>>>
> > >>>> In addition to that, we have also performed basic Acceptance Testing in
> > >>>> physical cluster:
> > >>>>
> > >>>>   1.
> > >>>>
> > >>>>   Installed 10 nodes cluster and created EC bucket (3:2).
> > >>>>
> > >>>> Uploaded 10GB key.
> > >>>>
> > >>>> Downloaded the same key and checked the md5sum.
> > >>>>
> > >>>>   1.
> > >>>>
> > >>>>   Uploaded 8GB key.
> > >>>>
> > >>>> Downloaded the same key and checked the md5sum.
> > >>>>
> > >>>>   1.
> > >>>>
> > >>>>   Uploaded 3MB key
> > >>>>
> > >>>> Downloaded the same and verified md5sum.
> > >>>>
> > >>>>   1.
> > >>>>
> > >>>>   Changed bucket to (6:3)
> > >>>>
> > >>>> Uploaded 8GB key
> > >>>>
> > >>>> Download the same.
> > >>>>
> > >>>> Also verified the new key should be in 6:3 policy and old keys must be
> > >>>> 3:2. Verified with several different size key writes and reads.
> > >>>>
> > >>>>
> > >>>>
> > >>>> Since the merge discussion thread, we have well stabilized code and fixed
> > >>>> several bugs.
> > >>>>
> > >>>>
> > >>>> Merge checklist items assessment is here:
> > >>>>
> > >>>>
> > >>>> https://cwiki.apache.org/confluence/display/OZONE/Ozone+EC+Branch%28HDDS-3816-ec%29+Phase-1+%3A+Merge+Checklist
> > >>>>
> > >>>> Big shoutout to Stephen O'Donnell <so...@cloudera.com>, Istvan Fajth
> > >>>> <pi...@cloudera.com> for great efforts in core development and also thanks
> > >>>> a lot to Sammi, Mingchao Zhao, Mark Gui, Kaijie, Attila for collaborating
> > >>>> on some of the EC tasks.
> > >>>>
> > >>>> Thanks to Marton for design discussion and on some dev tasks as well.
> > >>>>
> > >>>> Thanks to many others who were involved in design discussions, Arpit, Sidd,
> > >>>> Jitendra, Mukul, Sanjay, Karthik, Bharat, Nanda, Shashi, Prashanth, Rakesh,
> > >>>> Yiqun Lin.
> > >>>> Sorry if I miss anyone here, but your efforts are much appreciated. Without
> > >>>> your tremendous help, we would have not reached this position yet.
> > >>>>
> > >>>>
> > >>>>
> > >>>> To start with, here is my +1
> > >>>>
> > >>>> The vote will run for 5 days.
> > >>>>
> > >>>> Regards,
> > >>>> Uma
> > >>>>
> > >>>>
> > >>>>
> > >>>> On Tue, Apr 5, 2022 at 10:58 PM Uma gangumalla <umamahesh@apache.org>
> > >>>> wrote:
> > >>>>
> > >>>>> Dear Ozone Devs,
> > >>>>>
> > >>>>> As you may know, we have been actively developing Ozone Erasure
> > Coding
> > >>>>> support in a separate branch HDDS-3816-ec.
> > >>>>>
> > >>>>> We have finished the development of EC key write and read
> > >> functionality.
> > >>>>> The support of offline recovery( Recovering replica from node loss)
> > >> will
> > >>>> be
> > >>>>> part of second phase work.
> > >>>>>
> > >>>>> Since the code has already grown and increasingly started seeing
> > merge
> > >>>>> complications, we would like to propose to merge the current EC
> > branch
> > >>>> into
> > >>>>> master.
> > >>>>>
> > >>>>> We filed the new JIRA(HDDS-6462) for the second phase of work and
> > >>>>> continued the offline recovery work there.
> > >>>>>
> > >>>>> Details on Changes:
> > >>>>>
> > >>>>>   -
> > >>>>>
> > >>>>>   Most of the EC core logic went to newly extended classes. Key
> > >> changes
> > >>>>>   went into EC*OutputStream and EC*InputStream classes for write
> and
> > >>>> read
> > >>>>>   respectively. Based on replication type, ECPipelineProvider will
> be
> > >>>> chosen
> > >>>>>   for creating EC pipelines.
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>>>   -
> > >>>>>
> > >>>>>   Since we cannot represent the EC replication in the existing
> > >>>>>   replication factor, we have introduced ECReplicationConfig. The
> > >>>>>   ReplicationConfig interface is already pushed to master, so it’s
> > >> not
> > >>>> a new
> > >>>>>   idea coming through this branch merge now. What is newly coming
> > >> here
> > >>>> is the
> > >>>>>   ECReplicationConfig class which can be used to express EC
> > >> replication
> > >>>>>   configuration.
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>>>   -
> > >>>>>
> > >>>>>   We wanted to provide the support to enable EC at bucket level. To
> > >>>>>   simplify some complications, we have moved the default
> replication
> > >>>>>   configurations from client to server.
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>>>   -
> > >>>>>
> > >>>>>   Client side replication type and replication factor removed from
> > >> the
> > >>>>>   configuration files and introduced the
> > >>>> ozone.server.default.replication
> > >>>>>   and ozone.server.default.replication.type.We would continue to
> > >>>> respect if
> > >>>>>   one configures at client side explicitly or passed through APIs,
> > >>>> otherwise
> > >>>>>   server side bucket level properties or server side default
> > >>>> configuration
> > >>>>>   would take effect.
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>>>   -
> > >>>>>
> > >>>>>   Other than this change, the rest of EC side code should not
> impact
> > >> any
> > >>>>>   of the existing code flows.
> > >>>>>
> > >>>>>
> > >>>>> We have finished documentation JIRA(HDDS-6172) for covering this
> > >> feature
> > >>>>> and we will continue to improve further in master.
> > >>>>>
> > >>>>> Git Branch Name : HDDS-3816-ec
> > >>>>>
> > >>>>> JIRAs: HDDS-3816 and HDDS-5351
> > >>>>>
> > >>>>> Completed tasks: ~ 142
> > >>>>>
> > >>>>> + We are covering the following two mandatory JIRAs:
> > >>>>>
> > >>>>> 1. HDDS-6209: EC: [Forward compatibility issue] New client to older
> > >>>>> server could fail due to the unavailability for client default
> > >>>> replication
> > >>>>> config
> > >>>>>
> > >>>>> 2. HDDS-5909: EC: Onboard EC into upgrade framework.
> > >>>>>
> > >>>>> PRs reviews in-progress and expected to close in a day or two.
> > >>>>>
> > >>>>> Few other JIRAs in HDDS-3816 are still open but I believe they're
> not
> > >>>>> blockers for merge.
> > >>>>>
> > >>>>> In short what you can do now with this feature:
> > >>>>>
> > >>>>>   -
> > >>>>>
> > >>>>>   You can enable EC at bucket level and cluster level.
> > >>>>>
> > >>>>> How to enable it at bucket level? Just create the bucket by passing
> > >> the
> > >>>> ec
> > >>>>> replication options.
> > >>>>>
> > >>>>>   -
> > >>>>>
> > >>>>>   You can create EC keys and read the same back.
> > >>>>>   -
> > >>>>>
> > >>>>>   You should be able to continue writing even when chosen nodes are
> > >>>>>   failing. (Of Course minimum of Data+Parity live nodes should be
> > >>>> available
> > >>>>>   in cluster for complete the write)
> > >>>>>   -
> > >>>>>
> > >>>>>   You should be able to read the file back even if a few nodes
> > >> failed in
> > >>>>>   the same ec block group(Failures should not be more than parity
> > >>>> number of
> > >>>>>   nodes.).
> > >>>>>
> > >>>>> What is pending? Offline recovery of lost/missing EC containers. As
> > >>>>> mentioned above, post merge of this branch, I will create a
> separate
> > >> JIRA
> > >>>>> for starting the work for OfflineRecovery.
> > >>>>>
> > >>>>>
> > >>>>> There are automated acceptance test cases already added. HDDS-6231
> > >>>>>
> > >>>>> In addition to that, we have also performed basic Acceptance
> Testing
> > >> in
> > >>>>> physical cluster:
> > >>>>>
> > >>>>>   1.
> > >>>>>
> > >>>>>   Installed 10 nodes cluster and created EC bucket (3:2).
> > >>>>>
> > >>>>> Uploaded 10GB key.
> > >>>>>
> > >>>>> Downloaded the same key and checked the md5sum.
> > >>>>>
> > >>>>>   1.
> > >>>>>
> > >>>>>   Uploaded 8GB key.
> > >>>>>
> > >>>>> Downloaded the same key and checked the md5sum.
> > >>>>>
> > >>>>>   1.
> > >>>>>
> > >>>>>   Uploaded 3MB key
> > >>>>>
> > >>>>> Downloaded the same and verified md5sum.
> > >>>>>
> > >>>>>   1.
> > >>>>>
> > >>>>>   Changed bucket to (6:3)
> > >>>>>
> > >>>>> Uploaded 8GB key
> > >>>>>
> > >>>>> Download the same.
> > >>>>>
> > >>>>> Also verified the new key should be in 6:3 policy and old keys must
> > be
> > >>>> 3:2.Verified
> > >>>>> with several different size key writes and reads.
> > >>>>>
> > >>>>> Merge checklist items assessment is here:
> > >>>>>
> > >>>>
> > >>
> >
> https://cwiki.apache.org/confluence/display/OZONE/Ozone+EC+Branch%28HDDS-3816-ec%29+Phase-1+%3A+Merge+Checklist
> > >>>>>
> > >>>>> Big shoutout to Stephen O'Donnell <so...@cloudera.com>, Istvan
> > >> Fajth
> > >>>>> <pi...@cloudera.com> for great efforts in core development and
> also
> > >>>>> thanks a lot  to Sammi, Mingchao Zhao, Mark Gui, Kaijie for
> > >> collaborating
> > >>>>> on some of the EC tasks.
> > >>>>>
> > >>>>> Thanks to Marton for design discussion and on some dev tasks as
> well.
> > >>>>>
> > >>>>> Thanks to many others who were involved in design discussions,
> Arpit,
> > >>>>> Sidd, Jitendra, Mukul, Sanjay, Karthik, Bharat, Nanda, Shashi,
> > >> Prashanth,
> > >>>>> Rakesh, Yiqun Lin.
> > >>>>> Sorry if I miss anyone here, but your efforts are much appreciated.
> > >>>>> Without your tremendous help, we would have not reached this
> position
> > >>>> yet.
> > >>>>>
> > >>>>> If there are no objections for the merge, I will start the official
> > >> vote
> > >>>>> later.
> > >>>>>
> > >>>>> Regards,
> > >>>>>
> > >>>>> EC Branch Devs
> > >>>>>
> > >>>>
> > >>
> >
> >
> >
> >
>
> --
> Thanks & Regards,
> Aravindan
>

Re: [VOTE] Merge Ozone Erasure Coding branch (HDDS-3816-ec) into master

Posted by Aravindan Vijayan <av...@cloudera.com.INVALID>.
+1 for the merge. Thanks for the great work!



On Wed, Apr 6, 2022 at 9:45 PM Prashant Pogde <pp...@cloudera.com.invalid>
wrote:

> +1 for the EC branch merge.
>
> Regards,
> Prashant
>
> > On Apr 6, 2022, at 8:20 PM, Siddharth Wagle <sw...@apache.org> wrote:
> >
> > +1 for the EC branch merge.
> >
> > Best,
> > Sid
> >
> > On Wed, Apr 6, 2022 at 8:05 PM guimark <gu...@126.com> wrote:
> >
> >> Great news!
> >> +1 to merge.
> >>
> >>
> >>
> >>
> >> At 2022-04-06 22:18:31, "Stephen O'Donnell" <sodonnell@cloudera.com.INVALID>
> >> wrote:
> >>> I have been working on the code on this branch for some time, and I believe
> >>> it is in a good state to merge now. It is mostly new code, and if nothing
> >>> attempts to use EC, none of the EC code paths will be executed.
> >>>
> >>> +1 to merge from me.
> >>>
> >>> Stephen.
> >>>
> >>> On Wed, Apr 6, 2022 at 7:11 AM Uma gangumalla <um...@apache.org>
> >>> wrote:
> >>>
> >>>> =====Few Edits Below===================
> >>>>
> >>>> Dear Ozone Devs,
> >>>>
> >>>> As you may know, we have been actively developing Ozone Erasure Coding
> >>>> support in a separate branch HDDS-3816-ec.
> >>>>
> >>>> We have finished the development of EC key write and read functionality.
> >>>> The support of offline recovery (recovering replica from node loss) will be
> >>>> part of second phase work.
> >>>>
> >>>> Since the code has already grown and increasingly started seeing merge
> >>>> complications, we would like to merge the current EC branch into master.
> >>>>
> >>>> We filed the new JIRA (HDDS-6462) for the second phase of work and continued
> >>>> the offline recovery work there. (we have uploaded the design doc there)
> >>>>
> >>>> Details on Changes:
> >>>>
> >>>>   -
> >>>>
> >>>>   Most of the EC core logic went to newly extended classes. Key changes
> >>>>   went into EC*OutputStream and EC*InputStream classes for write and read
> >>>>   respectively. Based on replication type, ECPipelineProvider will be chosen
> >>>>   for creating EC pipelines.
> >>>>
> >>>>
> >>>>
> >>>>   -
> >>>>
> >>>>   Since we cannot represent the EC replication in the existing replication
> >>>>   factor, we have introduced ECReplicationConfig. The ReplicationConfig
> >>>>   interface is already pushed to master, so it’s not a new idea coming
> >>>>   through this branch merge now. What is newly coming here is the
> >>>>   ECReplicationConfig class which can be used to express EC replication
> >>>>   configuration.
> >>>>
> >>>>
> >>>>
> >>>>   -
> >>>>
> >>>>   We wanted to provide the support to enable EC at bucket level. To
> >>>>   simplify some complications, we have moved the default replication
> >>>>   configurations from client to server.
> >>>>
> >>>>
> >>>>
> >>>>   -
> >>>>
> >>>>   Client side replication type and replication factor are removed from the
> >>>>   configuration files, and ozone.server.default.replication and
> >>>>   ozone.server.default.replication.type are introduced. We would continue
> >>>>   to respect a replication config that is set explicitly at client side or
> >>>>   passed through APIs; otherwise server side bucket level properties or
> >>>>   the server side default configuration would take effect.
> >>>>
> >>>>
> >>>>
> >>>>   -
> >>>>
> >>>>   Other than this change, the rest of EC side code should not impact any
> >>>>   of the existing code flows.
> >>>>
> >>>>
> >>>> We have finished documentation JIRA (HDDS-6172) for covering this feature
> >>>> and we will continue to improve further in master.
> >>>>
> >>>> Git Branch Name : HDDS-3816-ec
> >>>>
> >>>> JIRAs: HDDS-3816 and HDDS-5351
> >>>>
> >>>> Completed tasks: ~ 142
> >>>>
> >>>> + We are covering the following two mandatory JIRAs to come in:
> >>>>
> >>>> 1. HDDS-6209: EC: [Forward compatibility issue] New client to older server
> >>>> could fail due to the unavailability for client default replication config
> >>>>
> >>>> 2. HDDS-5909: EC: Onboard EC into upgrade framework.
> >>>>
> >>>> PR reviews are in progress and expected to close in a day or two.
> >>>>
> >>>> A few other JIRAs in HDDS-3816 are still open but I believe they're not
> >>>> blockers for merge.
> >>>>
> >>>> In short what you can do now with this feature:
> >>>>
> >>>>   -
> >>>>
> >>>>   You can enable EC at bucket level and cluster level.
> >>>>
> >>>> How to enable it at bucket level? Just create the bucket by passing the EC
> >>>> replication options.
> >>>>
> >>>>   -
> >>>>
> >>>>   You can create EC keys and read the same back.
> >>>>   -
> >>>>
> >>>>   You should be able to continue writing even when chosen nodes are
> >>>>   failing. (Of course, a minimum of Data+Parity live nodes should be
> >>>>   available in the cluster to complete the write.)
> >>>>   -
> >>>>
> >>>>   You should be able to read the file back even if a few nodes failed in
> >>>>   the same EC block group. (Failures should not be more than the parity
> >>>>   number of nodes.)
> >>>>
> >>>> What is pending? Offline recovery of lost/missing EC containers. As
> >>>> mentioned above, post merge of this branch, I will create a separate JIRA
> >>>> for starting the work for OfflineRecovery.
> >>>>
> >>>>
> >>>> There are automated acceptance test cases already added. HDDS-6231
> >>>>
> >>>> In addition to that, we have also performed basic Acceptance Testing in
> >>>> physical cluster:
> >>>>
> >>>>   1.
> >>>>
> >>>>   Installed 10 nodes cluster and created EC bucket (3:2).
> >>>>
> >>>> Uploaded 10GB key.
> >>>>
> >>>> Downloaded the same key and checked the md5sum.
> >>>>
> >>>>   1.
> >>>>
> >>>>   Uploaded 8GB key.
> >>>>
> >>>> Downloaded the same key and checked the md5sum.
> >>>>
> >>>>   1.
> >>>>
> >>>>   Uploaded 3MB key
> >>>>
> >>>> Downloaded the same and verified md5sum.
> >>>>
> >>>>   1.
> >>>>
> >>>>   Changed bucket to (6:3)
> >>>>
> >>>> Uploaded 8GB key
> >>>>
> >>>> Download the same.
> >>>>
> >>>> Also verified the new key should be in 6:3 policy and old keys must be
> >>>> 3:2. Verified with several different size key writes and reads.
> >>>>
> >>>>
> >>>>
> >>>> Since the merge discussion thread, we have well stabilized code and fixed
> >>>> several bugs.
> >>>>
> >>>>
> >>>> Merge checklist items assessment is here:
> >>>>
> >>>>
> >>>> https://cwiki.apache.org/confluence/display/OZONE/Ozone+EC+Branch%28HDDS-3816-ec%29+Phase-1+%3A+Merge+Checklist
> >>>>
> >>>> Big shoutout to Stephen O'Donnell <so...@cloudera.com>, Istvan Fajth
> >>>> <pi...@cloudera.com> for great efforts in core development and also thanks
> >>>> a lot to Sammi, Mingchao Zhao, Mark Gui, Kaijie, Attila for collaborating
> >>>> on some of the EC tasks.
> >>>>
> >>>> Thanks to Marton for design discussion and on some dev tasks as well.
> >>>>
> >>>> Thanks to many others who were involved in design discussions, Arpit, Sidd,
> >>>> Jitendra, Mukul, Sanjay, Karthik, Bharat, Nanda, Shashi, Prashanth, Rakesh,
> >>>> Yiqun Lin.
> >>>> Sorry if I miss anyone here, but your efforts are much appreciated. Without
> >>>> your tremendous help, we would have not reached this position yet.
> >>>>
> >>>>
> >>>>
> >>>> To start with, here is my +1
> >>>>
> >>>> The vote will run for 5 days.
> >>>>
> >>>> Regards,
> >>>> Uma
> >>>>
> >>>>
> >>>>
> >>>> On Tue, Apr 5, 2022 at 10:58 PM Uma gangumalla <um...@apache.org>
> >>>> wrote:
> >>>>
> >>>>> Dear Ozone Devs,
> >>>>>
> >>>>> As you may know, we have been actively developing Ozone Erasure
> Coding
> >>>>> support in a separate branch HDDS-3816-ec.
> >>>>>
> >>>>> We have finished the development of EC key write and read
> >> functionality.
> >>>>> The support of offline recovery( Recovering replica from node loss)
> >> will
> >>>> be
> >>>>> part of second phase work.
> >>>>>
> >>>>> Since the code has already grown and increasingly started seeing
> merge
> >>>>> complications, we would like to propose to merge the current EC
> branch
> >>>> into
> >>>>> master.
> >>>>>
> >>>>> We filed the new JIRA(HDDS-6462) for the second phase of work and
> >>>>> continued the offline recovery work there.
> >>>>>
> >>>>> Details on Changes:
> >>>>>
> >>>>>   -
> >>>>>
> >>>>>   Most of the EC core logic went to newly extended classes. Key
> >> changes
> >>>>>   went into EC*OutputStream and EC*InputStream classes for write and
> >>>> read
> >>>>>   respectively. Based on replication type, ECPipelineProvider will be
> >>>> chosen
> >>>>>   for creating EC pipelines.
> >>>>>
> >>>>>
> >>>>>
> >>>>>   -
> >>>>>
> >>>>>   Since we cannot represent the EC replication in the existing
> >>>>>   replication factor, we have introduced ECReplicationConfig. The
> >>>>>   ReplicationConfig interface is already pushed to master, so it’s
> >> not
> >>>> a new
> >>>>>   idea coming through this branch merge now. What is newly coming
> >> here
> >>>> is the
> >>>>>   ECReplicationConfig class which can be used to express EC
> >> replication
> >>>>>   configuration.
> >>>>>
> >>>>>
> >>>>>
> >>>>>   -
> >>>>>
> >>>>>   We wanted to provide the support to enable EC at bucket level. To
> >>>>>   simplify some complications, we have moved the default replication
> >>>>>   configurations from client to server.
> >>>>>
> >>>>>
> >>>>>
> >>>>>   -
> >>>>>
> >>>>>   Client side replication type and replication factor removed from
> >> the
> >>>>>   configuration files and introduced the
> >>>> ozone.server.default.replication
> >>>>>   and ozone.server.default.replication.type.We would continue to
> >>>> respect if
> >>>>>   one configures at client side explicitly or passed through APIs,
> >>>> otherwise
> >>>>>   server side bucket level properties or server side default
> >>>> configuration
> >>>>>   would take effect.

-- 
Thanks & Regards,
Aravindan

Re: [VOTE] Merge Ozone Erasure Coding branch (HDDS-3816-ec) into master

Posted by Prashant Pogde <pp...@cloudera.com.INVALID>.
+1 for the EC branch merge.

Regards,
Prashant

> On Apr 6, 2022, at 8:20 PM, Siddharth Wagle <sw...@apache.org> wrote:
> 
> +1 for the EC branch merge.
> 
> Best,
> Sid
> 
> On Wed, Apr 6, 2022 at 8:05 PM guimark <gu...@126.com> wrote:
> 
>> Great news!
>> +1 to merge.
>> 
>> 
>> 
>> 
>> At 2022-04-06 22:18:31, "Stephen O'Donnell" <so...@cloudera.com.INVALID>
>> wrote:
>>> I have been working on the code on this branch for some time, and I believe
>>> it is in a good state to merge now. It is mostly new code, and if nothing
>>> attempts to use EC, none of the EC code paths will be executed.
>>> 
>>> +1 to merge from me.
>>> 
>>> Stephen.
>>> 
>>> On Wed, Apr 6, 2022 at 7:11 AM Uma gangumalla <um...@apache.org> wrote:
>>> 
>>>> =====Few Edits Below===================
>>>> 
>>>> Dear Ozone Devs,
>>>> 
>>>> As you may know, we have been actively developing Ozone Erasure Coding
>>>> support in a separate branch HDDS-3816-ec.
>>>> 
>>>> We have finished the development of EC key write and read functionality.
>>>> The support of offline recovery (recovering replicas after node loss) will be
>>>> part of the second phase of work.
>>>> 
>>>> Since the code has already grown and increasingly started seeing merge
>>>> complications, we would like to merge the current EC branch into master.
>>>> 
>>>> We filed a new JIRA (HDDS-6462) for the second phase of work and have continued
>>>> the offline recovery work there. (We have uploaded the design doc there.)
>>>> 
>>>> Details on Changes:
>>>> 
>>>>   -
>>>> 
>>>>   Most of the EC core logic went to newly extended classes. Key changes
>>>>   went into the EC*OutputStream and EC*InputStream classes for write and read
>>>>   respectively. Based on the replication type, ECPipelineProvider will be chosen
>>>>   for creating EC pipelines.
>>>> 
>>>> 
>>>> 
>>>>   -
>>>> 
>>>>   Since we cannot represent EC replication with the existing replication
>>>>   factor, we have introduced ECReplicationConfig. The ReplicationConfig
>>>>   interface is already in master, so it is not a new idea arriving
>>>>   with this branch merge. What is new here is the
>>>>   ECReplicationConfig class, which can be used to express an EC replication
>>>>   configuration.
>>>> 
>>>> 
>>>> 
>>>>   -
>>>> 
>>>>   We wanted to provide the support to enable EC at bucket level. To
>>>>   simplify some complications, we have moved the default replication
>>>>   configurations from client to server.
>>>> 
>>>> 
>>>> 
>>>>   -
>>>> 
>>>>   The client-side replication type and replication factor have been removed
>>>>   from the configuration files, and ozone.server.default.replication and
>>>>   ozone.server.default.replication.type have been introduced. We continue to
>>>>   respect a value that is configured explicitly on the client side or passed
>>>>   through the APIs; otherwise the server-side bucket-level properties or the
>>>>   server-side default configuration take effect.
>>>> 
>>>> 
>>>> 
>>>>   -
>>>> 
>>>>   Other than this change, the rest of the EC code should not impact any
>>>>   of the existing code flows.
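
A minimal sketch of the precedence described in the list above, assuming the bucket-level setting is consulted before the cluster-wide server default (the message names both but does not spell out their order). The function name and the plain-string settings are hypothetical; this is not an Ozone API.

```python
from typing import Optional


def resolve_replication(client_explicit: Optional[str],
                        bucket_default: Optional[str],
                        server_default: str) -> str:
    """Pick the replication setting for a new key.

    Hypothetical helper, not an Ozone API. Assumed order:
    1. a value set explicitly by the client (config or API call),
    2. the bucket-level default, if the bucket has one,
    3. the server-side default (ozone.server.default.replication*).
    """
    for candidate in (client_explicit, bucket_default):
        if candidate is not None:
            return candidate
    return server_default
```

With this shape, a client that sets nothing inherits whatever the bucket or the server defines, which is what makes a bucket-level EC default possible.
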
>>>> 
>>>> 
>>>> We have finished the documentation JIRA (HDDS-6172) covering this feature
>>>> and we will continue to improve it further in master.
>>>> 
>>>> Git Branch Name : HDDS-3816-ec
>>>> 
>>>> JIRAs: HDDS-3816 and HDDS-5351
>>>> 
>>>> Completed tasks: ~ 142
>>>> 
>>>> + We are covering the following two mandatory JIRAs to come in:
>>>> 
>>>> 1. HDDS-6209: EC: [Forward compatibility issue] New client to older server
>>>> could fail due to the unavailability for client default replication config
>>>>
>>>> 2. HDDS-5909: EC: Onboard EC into upgrade framework.
>>>>
>>>> PR reviews are in progress and are expected to close in a day or two.
>>>>
>>>> A few other JIRAs in HDDS-3816 are still open, but I believe they are not
>>>> blockers for the merge.
>>>> 
>>>> In short what you can do now with this feature:
>>>> 
>>>>   -
>>>> 
>>>>   You can enable EC at bucket level and cluster level.
>>>> 
>>>> How to enable it at bucket level? Just create the bucket by passing the EC
>>>> replication options.
>>>> 
>>>>   -
>>>> 
>>>>   You can create EC keys and read the same back.
>>>>   -
>>>> 
>>>>   You should be able to continue writing even when chosen nodes are
>>>>   failing. (Of course, a minimum of Data+Parity live nodes should be
>>>>   available in the cluster to complete the write.)
>>>>   -
>>>>
>>>>   You should be able to read the file back even if a few nodes failed in
>>>>   the same EC block group. (Failures should not exceed the parity number
>>>>   of nodes.)
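
The two limits in the list above can be written down as a small sketch. The class and method names are hypothetical and only restate the rules from the message: a write needs at least Data+Parity live nodes, and a read tolerates at most Parity failures within one block group. The storage overhead figure is standard erasure-coding arithmetic and is not taken from the message.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EcScheme:
    """Hypothetical model of an EC layout such as 3:2 or 6:3."""
    data: int
    parity: int

    @property
    def group_size(self) -> int:
        # Number of nodes that hold one block group.
        return self.data + self.parity

    def can_write(self, live_nodes: int) -> bool:
        # A write needs at least Data+Parity live nodes in the cluster.
        return live_nodes >= self.group_size

    def can_read(self, failed_in_group: int) -> bool:
        # A read succeeds while failures do not exceed the parity count.
        return 0 <= failed_in_group <= self.parity

    @property
    def storage_overhead(self) -> float:
        # Raw bytes stored per byte of user data.
        return self.group_size / self.data
```

For the 3:2 layout used in the testing notes, `EcScheme(3, 2)` needs 5 live nodes to write and still reads with 2 nodes of a group down; `EcScheme(6, 3)` needs 9 nodes and tolerates 3 failures.
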
>>>> 
>>>> What is pending? Offline recovery of lost/missing EC containers. As
>>>> mentioned above, post merge of this branch, I will create a separate JIRA
>>>> for starting the work for OfflineRecovery.
>>>> 
>>>> 
>>>> Automated acceptance test cases have already been added (HDDS-6231).
>>>>
>>>> In addition to that, we have also performed basic acceptance testing on a
>>>> physical cluster:
>>>>
>>>>   1. Installed a 10-node cluster and created an EC bucket (3:2).
>>>>      Uploaded a 10GB key.
>>>>      Downloaded the same key and checked the md5sum.
>>>>
>>>>   2. Uploaded an 8GB key.
>>>>      Downloaded the same key and checked the md5sum.
>>>>
>>>>   3. Uploaded a 3MB key.
>>>>      Downloaded the same key and verified the md5sum.
>>>>
>>>>   4. Changed the bucket to (6:3).
>>>>      Uploaded an 8GB key.
>>>>      Downloaded the same key.
>>>>
>>>> Also verified that the new key uses the 6:3 policy and that the old keys
>>>> remain 3:2. Verified with several different key sizes for writes and reads.
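
The md5sum comparison in the steps above could be scripted roughly as follows. This is only a sketch: it assumes the uploaded source file and the downloaded copy are both available as local files, and the function names are made up for illustration. Reading in chunks keeps memory use flat for keys of 8GB or 10GB.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def round_trip_matches(uploaded_path: str, downloaded_path: str) -> bool:
    """True when the downloaded copy has the same MD5 as the source."""
    return md5_of_file(uploaded_path) == md5_of_file(downloaded_path)
```
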
>>>> 
>>>> 
>>>> 
>>>> Since the merge discussion thread, we have stabilized the code well and
>>>> fixed several bugs.
>>>> 
>>>> 
>>>> Merge checklist items assessment is here:
>>>> 
>>>> 
>>>> https://cwiki.apache.org/confluence/display/OZONE/Ozone+EC+Branch%28HDDS-3816-ec%29+Phase-1+%3A+Merge+Checklist
>>>> 
>>>> Big shoutout to Stephen O'Donnell <so...@cloudera.com> and Istvan Fajth
>>>> <pi...@cloudera.com> for great efforts in core development, and also thanks
>>>> a lot to Sammi, Mingchao Zhao, Mark Gui, Kaijie, Attila for collaborating
>>>> on some of the EC tasks.
>>>>
>>>> Thanks to Marton for design discussion and on some dev tasks as well.
>>>>
>>>> Thanks to many others who were involved in design discussions: Arpit, Sidd,
>>>> Jitendra, Mukul, Sanjay, Karthik, Bharat, Nanda, Shashi, Prashanth, Rakesh,
>>>> Yiqun Lin.
>>>> Sorry if I miss anyone here, but your efforts are much appreciated. Without
>>>> your tremendous help, we would not have reached this position yet.
>>>> 
>>>> 
>>>> 
>>>> To start with, here is my +1
>>>> 
>>>> The vote will run for 5 days.
>>>> 
>>>> Regards,
>>>> Uma


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@ozone.apache.org
For additional commands, e-mail: dev-help@ozone.apache.org


Re: Re: [VOTE] Merge Ozone Erasure Coding branch (HDDS-3816-ec) into master

Posted by Siddharth Wagle <sw...@apache.org>.
+1 for the EC branch merge.

Best,
Sid

On Wed, Apr 6, 2022 at 8:05 PM guimark <gu...@126.com> wrote:

> Great news!
> +1 to merge.
>
>
>
>
> At 2022-04-06 22:18:31, "Stephen O'Donnell" <so...@cloudera.com.INVALID>
> wrote:
> >I have been working on the code on this branch for some time, and I believe
> >it is in a good state to merge now. It is mostly new code, and if nothing
> >attempts to use EC, none of the EC code paths will be executed.
> >
> >+1 to merge from me.
> >
> >Stephen.
> >

Re:Re: [VOTE] Merge Ozone Erasure Coding branch (HDDS-3816-ec) into master

Posted by guimark <gu...@126.com>.
Great news!
+1 to merge.




At 2022-04-06 22:18:31, "Stephen O'Donnell" <so...@cloudera.com.INVALID> wrote:
>I have been working on the code on this branch for some time, and I believe
>it is in a good state to merge now. It is mostly new code, and if nothing
>attempts to use EC, none of the EC code paths will be executed.
>
>+1 to merge from me.
>
>Stephen.
>
>On Wed, Apr 6, 2022 at 7:11 AM Uma gangumalla <um...@apache.org> wrote:
>
>> =====Few Edits Below===================
>>
>> Dear Ozone Devs,
>>
>> As you may know, we have been actively developing Ozone Erasure Coding
>> support in a separate branch HDDS-3816-ec.
>>
>> We have finished the development of EC key write and read functionality.
>> The support of offline recovery (recovering replicas after node loss) will be
>> part of the second phase of work.
>>
>> Since the code has already grown and increasingly started seeing merge
>> complications, we would like to merge the current EC branch into master.
>>
>> We filed a new JIRA (HDDS-6462) for the second phase of work and have continued
>> the offline recovery work there. (We have uploaded the design doc there.)
>>
>> Details on Changes:
>>
>>    -
>>
>>    Most of the EC core logic went to newly extended classes. Key changes
>>    went into the EC*OutputStream and EC*InputStream classes for write and read
>>    respectively. Based on the replication type, ECPipelineProvider will be chosen
>>    for creating EC pipelines.
>>
>>
>>
>>    -
>>
>>    Since we cannot represent the EC replication in the existing replication
>>    factor, we have introduced ECReplicationConfig. The ReplicationConfig
>>    interface is already pushed to master, so it’s not a new idea coming
>>    through this branch merge now. What is newly coming here is the
>>    ECReplicationConfig class which can be used to express EC replication
>>    configuration.
>>
>>
>>
>>    -
>>
>>    We wanted to provide the support to enable EC at bucket level. To
>>    simplify some complications, we have moved the default replication
>>    configurations from client to server.
>>
>>
>>
>>    -
>>
>>    The client-side replication type and replication factor have been removed
>>    from the configuration files, and ozone.server.default.replication and
>>    ozone.server.default.replication.type have been introduced. We continue to
>>    respect a value that is configured explicitly on the client side or passed
>>    through the APIs; otherwise the server-side bucket-level properties or the
>>    server-side default configuration take effect.
>>
>>
>>
>>    -
>>
>>    Other than this change, the rest of EC side code should not impact any
>>    of the existing code flows.
>>
>>
>> We have finished documentation JIRA(HDDS-6172) for covering this feature
>> and we will continue to improve further in master.
>>
>> Git Branch Name : HDDS-3816-ec
>>
>> JIRAs: HDDS-3816 and HDDS-5351
>>
>> Completed tasks: ~ 142
>>
>> + We are covering the following two mandatory JIRAs to come in:
>>
>> 1. HDDS-6209: EC: [Forward compatibility issue] New client to older server
>> could fail due to the unavailability for client default replication config
>>
>> 2. HDDS-5909: EC: Onboard EC into upgrade framework.
>>
>> PR reviews are in progress and are expected to close in a day or two.
>>
>> A few other JIRAs in HDDS-3816 are still open, but I believe they are not
>> blockers for the merge.
>>
>> In short what you can do now with this feature:
>>
>>    -
>>
>>    You can enable EC at bucket level and cluster level.
>>
>> How to enable it at bucket level? Just create the bucket by passing the ec
>> replication options.
>>
>>    -
>>
>>    You can create EC keys and read the same back.
>>    -
>>
>>    You should be able to continue writing even when chosen nodes are
>>    failing. (Of Course minimum of Data+Parity live nodes should be
>>    available in cluster for complete the write)
>>    -
>>
>>    You should be able to read the file back even if a few nodes failed in
>>    the same ec block group(Failures should not be more than parity number
>>    of nodes.).
>>
>> What is pending? Offline recovery of lost/missing EC containers. As
>> mentioned above, post merge of this branch, I will create a separate JIRA
>> for starting the work for OfflineRecovery.
>>
>>
>> There are automated acceptance test cases already added. HDDS-6231
>>
>> In addition to that, we have also performed basic Acceptance Testing in
>> physical cluster:
>>
>>    1.
>>
>>    Installed 10 nodes cluster and created EC bucket (3:2).
>>
>> Uploaded 10GB key.
>>
>> Downloaded the same key and checked the md5sum.
>>
>>    1.
>>
>>    Uploaded 8GB key.
>>
>> Downloaded the same key and checked the md5sum.
>>
>>    1.
>>
>>    Uploaded 3MB key
>>
>> Downloaded the same and verified md5sum.
>>
>>    1.
>>
>>    Changed bucket to (6:3)
>>
>> Uploaded 8GB key
>>
>> Download the same.
>>
>> Also verified the new key should be in 6:3 policy and old keys must be
>> 3:2. Verified with several different size key writes and reads.
>>
>>
>>
>> Since the merge discussion thread, we have well stabilized code and fixed
>> several bugs.
>>
>>
>> Merge checklist items assessment is here:
>>
>> https://cwiki.apache.org/confluence/display/OZONE/Ozone+EC+Branch%28HDDS-3816-ec%29+Phase-1+%3A+Merge+Checklist
>>
>> Big shoutout to Stephen O'Donnell <so...@cloudera.com>, Istvan Fajth
>> <pi...@cloudera.com> for great efforts in core development and also thanks
>> a lot  to Sammi, Mingchao Zhao, Mark Gui, Kaijie, Attila for collaborating
>> on some of the EC tasks.
>>
>> Thanks to Marton for design discussion and on some dev tasks as well.
>>
>> Thanks to many others who were involved in design discussions, Arpit, Sidd,
>> Jitendra, Mukul, Sanjay, Karthik, Bharat, Nanda, Shashi, Prashanth, Rakesh,
>> Yiqun Lin.
>> Sorry if I miss anyone here, but your efforts are much appreciated. Without
>> your tremendous help, we would have not reached this position yet.
>>
>>
>>
>> To start with, here is my +1
>>
>> The vote will run for 5 days.
>>
>> Regards,
>> Uma
>>
>>
>>

Re: [VOTE] Merge Ozone Erasure Coding branch (HDDS-3816-ec) into master

Posted by Stephen O'Donnell <so...@cloudera.com.INVALID>.
I have been working on the code on this branch for some time, and I believe
it is in a good state to merge now. It is mostly new code, and if nothing
attempts to use EC, none of the EC code paths will be executed.

+1 to merge from me.

Stephen.


Re: [VOTE] Merge Ozone Erasure Coding branch (HDDS-3816-ec) into master

Posted by Uma gangumalla <um...@apache.org>.
Thanks everyone for voting.

The vote has passed with the following stats:
 +1's  : 15 ( Stephen, Mark, Siddharth, Prashant, Aravindan, Nicholas,
Bharat, Ayush, Lokesh, Mukul, Mingchao, Jackson, Hanisha, Shashikant, Neil,
Sammi, Dinesh, Janus )
 No -1s.

As promised, HDDS-5909 and HDDS-6209 have been committed to the branch, so
the EC branch now covers the compatibility issues.

We are also tracking the merge PR at
https://github.com/apache/ozone/pull/3301 to make sure CI is green before
the merge. CI is now green on a run that includes all expected changes in
the branch.

I will go ahead and merge the branch soon.


Regards,
Uma





Re: [VOTE] Merge Ozone Erasure Coding branch (HDDS-3816-ec) into master

Posted by Uma gangumalla <um...@apache.org>.
=====Few Edits Below===================

Dear Ozone Devs,

As you may know, we have been actively developing Ozone Erasure Coding
support in a separate branch HDDS-3816-ec.

We have finished the development of EC key write and read functionality.
Support for offline recovery (recovering replicas after node loss) will be
part of the second phase of work.

Since the code has grown considerably and we are increasingly seeing merge
complications, we would like to merge the current EC branch into master.

We have filed a new JIRA (HDDS-6462) for the second phase of work and are
continuing the offline recovery work there. (We have uploaded the design
doc there.)

Details on Changes:

   - Most of the EC core logic went into newly extended classes. Key changes
     went into the EC*OutputStream and EC*InputStream classes for write and
     read respectively. Based on the replication type, ECPipelineProvider is
     chosen for creating EC pipelines.

   - Since we cannot represent EC replication with the existing replication
     factor, we have introduced ECReplicationConfig. The ReplicationConfig
     interface has already been pushed to master, so it is not a new idea
     coming through this branch merge. What is new here is the
     ECReplicationConfig class, which can be used to express an EC
     replication configuration.

   - We wanted to support enabling EC at the bucket level. To simplify some
     complications, we have moved the default replication configuration from
     the client to the server.

   - The client-side replication type and replication factor have been
     removed from the configuration files, and
     ozone.server.default.replication and
     ozone.server.default.replication.type have been introduced. We continue
     to respect a replication setting that is configured explicitly on the
     client side or passed through the APIs; otherwise the server-side
     bucket-level properties or the server-side default configuration take
     effect.

   - Other than this change, the rest of the EC code should not impact any
     of the existing code flows.
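
The precedence described in the list above (explicit client setting, then
bucket-level property, then server-side default) can be sketched roughly as
follows. This is only an illustration and not the code in the branch: the
class ReplicationResolver, the resolve method, and the setting strings are
hypothetical names chosen for the example.

```java
import java.util.Optional;

public class ReplicationResolver {

    /**
     * Returns the effective replication setting: an explicit client-side
     * value wins, then the bucket-level property, then the server default.
     * (Hypothetical helper, not an Ozone API.)
     */
    static <T> T resolve(Optional<T> clientSetting,
                         Optional<T> bucketSetting,
                         T serverDefault) {
        if (clientSetting.isPresent()) {
            return clientSetting.get();
        }
        return bucketSetting.orElse(serverDefault);
    }

    public static void main(String[] args) {
        // Illustrative values only; the real configuration format may differ.
        String serverDefault = "RATIS/THREE";
        Optional<String> ecBucket = Optional.of("EC/rs-3-2-1024k");

        // No client setting, EC bucket: the bucket-level property applies.
        System.out.println(resolve(Optional.empty(), ecBucket, serverDefault));
        // Nothing set on client or bucket: the server default applies.
        System.out.println(resolve(Optional.<String>empty(), Optional.<String>empty(), serverDefault));
        // Explicit client setting: it is respected over everything else.
        System.out.println(resolve(Optional.of("RATIS/ONE"), ecBucket, serverDefault));
    }
}
```

Keeping the default on the server side means a client that sets nothing
picks up whatever the bucket or cluster specifies, which is what makes
bucket-level EC possible.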


We have finished the documentation JIRA (HDDS-6172) covering this feature
and will continue to improve it in master.

Git Branch Name : HDDS-3816-ec

JIRAs: HDDS-3816 and HDDS-5351

Completed tasks: ~ 142

+ We are covering the following two mandatory JIRAs to come in:

1. HDDS-6209: EC: [Forward compatibility issue] New client to older server
could fail due to the unavailability for client default replication config

2. HDDS-5909: EC: Onboard EC into upgrade framework.

PR reviews are in progress and are expected to close in a day or two.

A few other JIRAs in HDDS-3816 are still open, but I believe they are not
blockers for the merge.

In short, what you can do now with this feature:

   - You can enable EC at the bucket level and the cluster level.
     How to enable it at the bucket level? Just create the bucket by passing
     the EC replication options.

   - You can create EC keys and read them back.

   - You should be able to continue writing even when chosen nodes are
     failing. (Of course, at least Data+Parity live nodes must be available
     in the cluster to complete the write.)

   - You should be able to read the file back even if a few nodes have
     failed in the same EC block group. (The number of failed nodes must not
     exceed the parity count.)
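
As a rough illustration of the two conditions above, here is a small
self-contained sketch. EcSchemeSketch and EcScheme are hypothetical names
used only for this example (they are not the ECReplicationConfig class), and
the overhead figure is plain arithmetic rather than a measured result.

```java
public class EcSchemeSketch {

    /** Hypothetical model of an EC scheme with data and parity block counts. */
    static final class EcScheme {
        final int data;
        final int parity;

        EcScheme(int data, int parity) {
            this.data = data;
            this.parity = parity;
        }

        /** A write needs one live node per data block and per parity block. */
        int requiredNodesForWrite() {
            return data + parity;
        }

        boolean canWrite(int liveNodes) {
            return liveNodes >= requiredNodesForWrite();
        }

        /** A block group stays readable while failures do not exceed parity. */
        boolean canRead(int failedNodesInGroup) {
            return failedNodesInGroup <= parity;
        }

        /** Raw bytes stored per byte of user data. */
        double storageOverhead() {
            return (double) (data + parity) / data;
        }
    }

    public static void main(String[] args) {
        EcScheme rs32 = new EcScheme(3, 2);
        EcScheme rs63 = new EcScheme(6, 3);

        System.out.println(rs32.requiredNodesForWrite()); // 5
        System.out.println(rs32.canWrite(10));            // true
        System.out.println(rs32.canRead(2));              // true
        System.out.println(rs32.canRead(3));              // false
        System.out.println(rs63.storageOverhead());       // 1.5
    }
}
```

For comparison, three-way replication stores 3 bytes per byte of user data,
while the 6:3 scheme above stores 1.5.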

What is pending? Offline recovery of lost/missing EC containers. As
mentioned above, post merge of this branch, I will create a separate JIRA
for starting the work for OfflineRecovery.


Automated acceptance test cases have already been added (HDDS-6231).

In addition to that, we have also performed basic acceptance testing on a
physical cluster:

   1. Installed a 10-node cluster and created an EC bucket (3:2).
      Uploaded a 10GB key.
      Downloaded the same key and checked the md5sum.

   2. Uploaded an 8GB key.
      Downloaded the same key and checked the md5sum.

   3. Uploaded a 3MB key.
      Downloaded the same key and verified the md5sum.

   4. Changed the bucket to (6:3).
      Uploaded an 8GB key.
      Downloaded the same key.

Also verified that the new key uses the 6:3 policy while the old keys remain
3:2. Verified with key writes and reads of several different sizes.
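
The checksum comparison used in these manual tests could be automated along
the following lines. This is a minimal sketch using only the JDK; Md5Check
and md5Hex are hypothetical names, and the in-memory byte arrays stand in
for the uploaded and downloaded files (for real keys one would pass
Files.newInputStream(path) instead).

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class Md5Check {

    /** Streams the input through MD5 so multi-GB keys need not fit in memory. */
    static String md5Hex(InputStream in) throws IOException {
        MessageDigest md;
        try {
            md = MessageDigest.getInstance("MD5");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
        byte[] buf = new byte[8192];
        int n;
        while ((n = in.read(buf)) != -1) {
            md.update(buf, 0, n);
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest()) {
            hex.append(String.format("%02x", b & 0xff));
        }
        return hex.toString();
    }

    /** Convenience overload for small in-memory payloads. */
    static String md5Hex(byte[] data) {
        try {
            return md5Hex(new ByteArrayInputStream(data));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Stand-ins for the key as uploaded and as read back from the bucket.
        byte[] uploaded = "example key contents".getBytes(StandardCharsets.UTF_8);
        byte[] downloaded = uploaded.clone();

        // Matching digests indicate the round trip preserved the data.
        System.out.println(md5Hex(uploaded).equals(md5Hex(downloaded)));
    }
}
```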



Since the merge discussion thread, we have stabilized the code well and fixed
several bugs.


Merge checklist items assessment is here:
https://cwiki.apache.org/confluence/display/OZONE/Ozone+EC+Branch%28HDDS-3816-ec%29+Phase-1+%3A+Merge+Checklist

Big shoutout to Stephen O'Donnell <so...@cloudera.com> and Istvan Fajth
<pi...@cloudera.com> for their great efforts in core development, and also
thanks a lot to Sammi, Mingchao Zhao, Mark Gui, Kaijie, and Attila for
collaborating on some of the EC tasks.

Thanks to Marton for the design discussions and for some dev tasks as well.

Thanks to the many others who were involved in design discussions: Arpit,
Sidd, Jitendra, Mukul, Sanjay, Karthik, Bharat, Nanda, Shashi, Prashanth,
Rakesh, Yiqun Lin.
Sorry if I missed anyone here, but your efforts are much appreciated. Without
your tremendous help, we would not have reached this position yet.



To start with, here is my +1

The vote will run for 5 days.

Regards,
Uma




Re: [VOTE] Merge Ozone Erasure Coding branch (HDDS-3816-ec) into master

Posted by Janus Chow <yi...@gmail.com>.
+1. Thanks for the great work.


Thanks
Symious

On Fri, Apr 15, 2022 at 21:36 Dinesh Chitlangia <di...@apache.org> wrote:

> +1 for merge, it will be good to have people use it.
>
>
> Cheers,
> Dinesh
>
> On Wed, Apr 13, 2022, 10:51 AM Mukul Kumar Singh <mksingh.apache@gmail.com
> >
> wrote:
>
> > +1
> > Thanks for all the work on this feature.
> >
> > Thanks,
> >
> > Mukul
> >
> > On 13/04/22 9:01 am, Sammi Chen wrote:
> > > +1 for the merge.
> > >
> > > Thanks a lot to Uma and all contributors working on EC.  Great work!
> > >
> > > It's so great that we have EC in Ozone now.
> > >
> > > Bests,
> > > Sammi
> > >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscribe@ozone.apache.org
> > For additional commands, e-mail: dev-help@ozone.apache.org
> >
> >
>

Re: [VOTE] Merge Ozone Erasure Coding branch (HDDS-3816-ec) into master

Posted by Dinesh Chitlangia <di...@apache.org>.
+1 for merge, it will be good to have people use it.


Cheers,
Dinesh

On Wed, Apr 13, 2022, 10:51 AM Mukul Kumar Singh <mk...@gmail.com>
wrote:

> +1
> Thanks for all the work on this feature.
>
> Thanks,
>
> Mukul
>
> On 13/04/22 9:01 am, Sammi Chen wrote:
> > +1 for the merge.
> >
> > Thanks a lot to Uma and all contributors working on EC.  Great work!
> >
> > It's so great that we have EC in Ozone now.
> >
> > Bests,
> > Sammi
> >
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@ozone.apache.org
> For additional commands, e-mail: dev-help@ozone.apache.org
>
>

Re: [VOTE] Merge Ozone Erasure Coding branch (HDDS-3816-ec) into master

Posted by Mukul Kumar Singh <mk...@gmail.com>.
+1
Thanks for all the work on this feature.

Thanks,

Mukul

On 13/04/22 9:01 am, Sammi Chen wrote:
> +1 for the merge.
>
> Thanks a lot to Uma and all contributors working on EC.  Great work!
>
> It's so great that we have EC in Ozone now.
>
> Bests,
> Sammi
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@ozone.apache.org
For additional commands, e-mail: dev-help@ozone.apache.org


Re: [VOTE] Merge Ozone Erasure Coding branch (HDDS-3816-ec) into master

Posted by Sammi Chen <sa...@apache.org>.
+1 for the merge.

Thanks a lot to Uma and all contributors working on EC.  Great work!

It's so great that we have EC in Ozone now.

Bests,
Sammi

On Wed, 6 Apr 2022 at 13:59, Uma gangumalla <um...@apache.org> wrote:

> Dear Ozone Devs,
>
> As you may know, we have been actively developing Ozone Erasure Coding
> support in a separate branch HDDS-3816-ec.
>
> We have finished the development of EC key write and read functionality.
> The support of offline recovery( Recovering replica from node loss) will be
> part of second phase work.
>
> Since the code has already grown and increasingly started seeing merge
> complications, we would like to propose to merge the current EC branch into
> master.
>
> We filed the new JIRA(HDDS-6462) for the second phase of work and continued
> the offline recovery work there.
>
> Details on Changes:
>
>    -
>
>    Most of the EC core logic went to newly extended classes. Key changes
>    went into EC*OutputStream and EC*InputStream classes for write and read
>    respectively. Based on replication type, ECPipelineProvider will be
> chosen
>    for creating EC pipelines.
>
>
>
>    -
>
>    Since we cannot represent the EC replication in the existing replication
>    factor, we have introduced ECReplicationConfig. The ReplicationConfig
>    interface is already pushed to master, so it’s not a new idea coming
>    through this branch merge now. What is newly coming here is the
>    ECReplicationConfig class which can be used to express EC replication
>    configuration.
>
>
>
>    -
>
>    We wanted to provide the support to enable EC at bucket level. To
>    simplify some complications, we have moved the default replication
>    configurations from client to server.
>
>
>
>    -
>
>    Client-side replication type and replication factor have been removed
>    from the configuration files, and ozone.server.default.replication and
>    ozone.server.default.replication.type have been introduced. We continue
>    to respect replication settings configured explicitly on the client side
>    or passed through APIs; otherwise the server-side bucket-level
>    properties or the server-side default configuration take effect.
>
>
>
>    -
>
>    Other than this change, the rest of EC side code should not impact any
>    of the existing code flows.
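
As a rough illustration of the precedence described in the quoted bullets (an explicit client setting first, then the bucket-level property, then the server-side default), here is a minimal Python sketch. The function name, its parameters, and the example strings are hypothetical and are not Ozone APIs or configuration values; only the ordering is taken from the message.

```python
from typing import Optional


def resolve_replication(
    client_config: Optional[str],
    bucket_config: Optional[str],
    server_default: str,
) -> str:
    """Return the replication config that takes effect for a new key.

    Hypothetical helper, not an Ozone API. Precedence as described in the
    quoted message: explicit client setting, then bucket-level property,
    then the server-side default.
    """
    if client_config is not None:
        return client_config
    if bucket_config is not None:
        return bucket_config
    return server_default


# Illustrative placeholder strings, not real Ozone configuration values.
print(resolve_replication(None, "EC(3:2)", "RATIS/THREE"))
```

In this sketch, a key written without any client-side setting into a bucket configured for EC picks up the bucket's configuration, and the server default applies only when neither the client nor the bucket specifies one.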
>
>
> We have finished the documentation JIRA (HDDS-6172) covering this feature
> and will continue to improve it in master.
>
> Git Branch Name : HDDS-3816-ec
>
> JIRAs: HDDS-3816 and HDDS-5351
>
> Completed tasks: ~ 142
>
> + We are covering the following two mandatory JIRAs:
>
> 1. HDDS-6209: EC: [Forward compatibility issue] New client to older server
> could fail due to the unavailability for client default replication config
>
> 2. HDDS-5909: EC: Onboard EC into upgrade framework.
>
> PR reviews are in progress and expected to close in a day or two.
>
> A few other JIRAs in HDDS-3816 are still open, but I believe they are not
> blockers for the merge.
>
> In short what you can do now with this feature:
>
>    -
>
>    You can enable EC at bucket level and cluster level.
>
> How to enable it at bucket level? Just create the bucket by passing the ec
> replication options.
>
>    -
>
>    You can create EC keys and read the same back.
>    -
>
>    You should be able to continue writing even when chosen nodes are
>    failing. (Of course, a minimum of Data+Parity live nodes must be
>    available in the cluster to complete the write.)
>    -
>
>    You should be able to read the file back even if a few nodes failed in
>    the same EC block group. (Failures should not exceed the number of
>    parity nodes.)
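
The two conditions in the quoted list can be written down as simple arithmetic. The sketch below assumes only what the message states: a write needs at least Data+Parity live nodes, and a read from a block group succeeds as long as the failures in that group do not exceed the parity count. The class and method names are hypothetical and do not correspond to Ozone classes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ECScheme:
    """Hypothetical model of an EC layout such as 3:2 or 6:3."""

    data: int
    parity: int

    def can_write(self, live_nodes: int) -> bool:
        # A write needs one live node per data and parity replica.
        return live_nodes >= self.data + self.parity

    def can_read(self, failed_in_group: int) -> bool:
        # A read tolerates at most `parity` failed nodes in a block group.
        return 0 <= failed_in_group <= self.parity


scheme = ECScheme(data=3, parity=2)
print(scheme.can_write(live_nodes=5), scheme.can_read(failed_in_group=2))
```

For a 3:2 layout this means five live nodes are enough to write, and a block group stays readable with up to two failed nodes.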
>
> What is pending? Offline recovery of lost/missing EC containers. As
> mentioned above, post merge of this branch, I will create a separate JIRA
> for starting the work for OfflineRecovery.
>
>
> Automated acceptance test cases have already been added (HDDS-6231).
>
> In addition to that, we have also performed basic acceptance testing on a
> physical cluster:
>
>    1. Installed a 10-node cluster and created an EC bucket (3:2).
>       Uploaded a 10GB key.
>       Downloaded the same key and checked the md5sum.
>
>    2. Uploaded an 8GB key.
>       Downloaded the same key and checked the md5sum.
>
>    3. Uploaded a 3MB key.
>       Downloaded the same key and verified the md5sum.
>
>    4. Changed the bucket to (6:3).
>       Uploaded an 8GB key.
>       Downloaded the same key.
>
> Also verified that the new key is in the 6:3 policy and the old keys remain
> 3:2. Verified with several different size key writes and reads.
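
The upload/download checks above come down to comparing checksums of the original file and the downloaded copy. A minimal sketch of that comparison in Python follows, assuming both files are available on local disk; the function names are hypothetical, and the actual upload and download steps (done with the Ozone client in the quoted test) are outside its scope.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Compute the MD5 hex digest of a file, reading it in chunks
    so that multi-gigabyte keys do not need to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(uploaded_path: str, downloaded_path: str) -> bool:
    """Return True when both files have the same MD5 digest."""
    return md5_of_file(uploaded_path) == md5_of_file(downloaded_path)
```

This mirrors what running md5sum on both files and comparing the output does.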
>
> Merge checklist items assessment is here:
>
> https://cwiki.apache.org/confluence/display/OZONE/Ozone+EC+Branch%28HDDS-3816-ec%29+Phase-1+%3A+Merge+Checklist
>
> Big shoutout to Stephen O'Donnell <so...@cloudera.com>, Istvan Fajth
> <pi...@cloudera.com> for great efforts in core development and also thanks
> a lot  to Sammi, Mingchao Zhao, Mark Gui, Kaijie for collaborating on some
> of the EC tasks.
>
> Thanks to Marton for design discussion and on some dev tasks as well.
>
> Thanks to many others who were involved in design discussions, Arpit, Sidd,
> Jitendra, Mukul, Sanjay, Karthik, Bharat, Nanda, Shashi, Prashanth, Rakesh,
> Yiqun Lin.
> Sorry if I missed anyone here, but your efforts are much appreciated. Without
> your tremendous help, we would not have reached this point.
>
> If there are no objections to the merge, I will start the official vote
> later.
>
> Regards,
>
> EC Branch Devs
>