Posted to dev@ozone.apache.org by Uma gangumalla <um...@apache.org> on 2021/03/11 17:52:25 UTC

Deprecating ReplicationFactor and introducing ReplicationConfig

Hi Devs,

Currently, the client APIs pass the ReplicationFactor and ReplicationType
parameters, and based on those, SCM picks the corresponding PipelineProvider
and chooses the datanodes.

With erasure coding (EC), it's hard to pass all of the EC-related
parameters (for example, the number of data blocks, the number of parity
blocks, etc.) in the single class ReplicationFactor, and it's not a good
idea to pass multiple parameters in a class whose name ends with "Factor".

There is a proposal to introduce the ReplicationConfig: *HDDS-4882*: Introduce
the ReplicationConfig and modify the proto files

I would like to bring this topic to the dev list to get everyone's
attention and feedback. The proposal is to deprecate the existing
ReplicationFactor and introduce ReplicationConfig.

Each replication type would get its own ReplicationConfig implementation,
for example RatisReplicationConfig for Ratis and ECReplicationConfig for EC.
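
To make the shape of the proposal concrete, here is a minimal sketch of
how such a hierarchy could look. It is only an illustration, not the code
from HDDS-4882: the wrapper class, the enum, the getRequiredNodes() helper
and the constructor arguments are all assumptions made for this example.

```java
/** Illustrative sketch only; these are not the actual Ozone classes. */
final class ReplicationConfigSketch {

  /** Replication types discussed in this thread (enum name is assumed). */
  enum ReplicationType { RATIS, STANDALONE, EC }

  /** Common type that client APIs could accept instead of a bare factor. */
  interface ReplicationConfig {
    ReplicationType getReplicationType();

    /** Assumed helper: how many datanodes a pipeline for this config needs. */
    int getRequiredNodes();
  }

  /** Ratis replication is fully described by one number, the factor. */
  static final class RatisReplicationConfig implements ReplicationConfig {
    private final int factor;

    RatisReplicationConfig(int factor) {
      if (factor < 1) {
        throw new IllegalArgumentException("factor must be positive");
      }
      this.factor = factor;
    }

    @Override
    public ReplicationType getReplicationType() {
      return ReplicationType.RATIS;
    }

    @Override
    public int getRequiredNodes() {
      return factor;
    }
  }

  /** EC needs several parameters, which a single "factor" cannot carry. */
  static final class ECReplicationConfig implements ReplicationConfig {
    private final int data;
    private final int parity;

    ECReplicationConfig(int data, int parity) {
      if (data < 1 || parity < 1) {
        throw new IllegalArgumentException("data and parity must be positive");
      }
      this.data = data;
      this.parity = parity;
    }

    int getData() {
      return data;
    }

    int getParity() {
      return parity;
    }

    @Override
    public ReplicationType getReplicationType() {
      return ReplicationType.EC;
    }

    /** Assumes every data and parity block is placed on its own datanode. */
    @Override
    public int getRequiredNodes() {
      return data + parity;
    }
  }
}
```

With this shape, a caller only handles ReplicationConfig, and an EC
config built with 6 data and 3 parity blocks would report 9 required nodes
without any EC-specific parameter leaking into the client API.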

Thanks @Elek, Marton <el...@apache.org> for the JIRA and proposal. It
would be great if you could put them into some class diagrams, which
would help everyone understand the proposal better.

If we all agree on the proposal, we can push the ReplicationConfig and
RatisReplicationConfig changes to master itself, and later add only the
ECReplicationConfig changes to the EC branch. This will help reduce
merge conflicts later.

Thoughts?

Regards,
Uma

[DISCUSS] Re: Deprecating ReplicationFactor and introducing ReplicationConfig

Posted by Uma gangumalla <um...@apache.org>.
Hi All,

Please raise any concerns you have. Otherwise, we will move forward
with this approach.

Regards,
Uma

On Wed, Mar 17, 2021 at 5:12 AM Elek, Marton <el...@apache.org> wrote:

>
> Thanks for the feedback, Arpit,
>
> We don't need to replace the old replicationFactor field; we can keep it
> for compatibility reasons. If the old field is used by any client, it can
> be used during de-/serialization instead of the new object.
>
> With this approach we don't need to be worried about compatibility issues.
>
> An alternative approach that I considered is using the
> ReplicationConfiguration only in the Java interfaces, but without
> introducing a new subclass for the existing RATIS/STANDALONE:
>
> In proto use:
>
> replicationFactor
>    OR
> ECReplicationConfigProto
>
> in Java class:
> The ReplicationConfig field can have two or three types of
> implementations/values:
>
>   * ECReplicationConfig (extends ReplicationConfig), de/serialized
>     from/to ECReplicationConfigProto
>   * RatisReplicationConfig, de/serialized from/to the existing
>     replicationFactor proto field
>   * StandaloneReplicationConfig, de/serialized from/to the existing
>     replicationFactor proto field
>
> It's slightly easier, but the API is not as consistent (and it doesn't
> allow adding additional configuration for RATIS/STANDALONE replication).
>
> That's the reason why I started with the more generic approach in PR#1973.
>
> Thanks,
> Marton

Re: Deprecating ReplicationFactor and introducing ReplicationConfig

Posted by "Elek, Marton" <el...@apache.org>.
Thanks for the feedback, Arpit,

We don't need to replace the old replicationFactor field; we can keep it
for compatibility reasons. If the old field is used by any client, it can
be used during de-/serialization instead of the new object.

With this approach we don't need to be worried about compatibility issues.
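
As a rough illustration of that fallback, the sketch below decodes a
message that may carry either the old field or the new object. Everything
in it is hypothetical: WireMessage is a plain-Java stand-in for a protobuf
message, null means "field not set", the old factor is assumed to mean
RATIS replication, and preferring the new object when both are set is my
assumption, not something decided in this thread.

```java
/** Illustrative sketch only; WireMessage stands in for a protobuf message. */
final class LegacyFieldFallbackSketch {

  interface ReplicationConfig {
    String describe();
  }

  static final class RatisReplicationConfig implements ReplicationConfig {
    private final int factor;

    RatisReplicationConfig(int factor) {
      this.factor = factor;
    }

    @Override
    public String describe() {
      return "RATIS/" + factor;
    }
  }

  static final class ECReplicationConfig implements ReplicationConfig {
    private final int data;
    private final int parity;

    ECReplicationConfig(int data, int parity) {
      this.data = data;
      this.parity = parity;
    }

    @Override
    public String describe() {
      return "EC/" + data + "+" + parity;
    }
  }

  /** Hypothetical message with both fields optional; null means "not set". */
  static final class WireMessage {
    final Integer replicationFactor;            // old field, kept for old clients
    final ReplicationConfig replicationConfig;  // new object

    WireMessage(Integer replicationFactor, ReplicationConfig replicationConfig) {
      this.replicationFactor = replicationFactor;
      this.replicationConfig = replicationConfig;
    }
  }

  /** Use the new object when present, otherwise fall back to the old field. */
  static ReplicationConfig decode(WireMessage msg) {
    if (msg.replicationConfig != null) {
      return msg.replicationConfig;
    }
    if (msg.replicationFactor != null) {
      // Assumption for brevity: a bare factor from an old client means RATIS.
      return new RatisReplicationConfig(msg.replicationFactor);
    }
    throw new IllegalArgumentException("no replication information set");
  }
}
```

An old client that only sets replicationFactor still decodes to a usable
config, while a new client can send an EC config through the new object.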

An alternative approach that I considered is using the
ReplicationConfiguration only in the Java interfaces, but without
introducing a new subclass for the existing RATIS/STANDALONE:

In proto use:

replicationFactor
   OR
ECReplicationConfigProto

in Java class:
The ReplicationConfig field can have two or three types of
implementations/values:

  * ECReplicationConfig (extends ReplicationConfig), de/serialized
    from/to ECReplicationConfigProto
  * RatisReplicationConfig, de/serialized from/to the existing
    replicationFactor proto field
  * StandaloneReplicationConfig, de/serialized from/to the existing
    replicationFactor proto field

It's slightly easier, but the API is not as consistent (and it doesn't
allow adding additional configuration for RATIS/STANDALONE replication).
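
A sketch of how that alternative mapping could round-trip is below. The
Wire and ECProto classes are hypothetical plain-Java stand-ins for the
proto messages, and the enum and method names are assumptions made for
this example only.

```java
/** Illustrative sketch only; Wire and ECProto stand in for proto messages. */
final class AlternativeMappingSketch {

  enum ReplicationType { RATIS, STANDALONE, EC }

  /** Hypothetical stand-in for ECReplicationConfigProto. */
  static final class ECProto {
    final int data;
    final int parity;

    ECProto(int data, int parity) {
      this.data = data;
      this.parity = parity;
    }
  }

  /** Hypothetical wire form: a type plus replicationFactor OR an EC payload. */
  static final class Wire {
    final ReplicationType type;
    final Integer replicationFactor;  // existing field, null for EC
    final ECProto ec;                 // null for RATIS/STANDALONE

    Wire(ReplicationType type, Integer replicationFactor, ECProto ec) {
      this.type = type;
      this.replicationFactor = replicationFactor;
      this.ec = ec;
    }
  }

  abstract static class ReplicationConfig {
    abstract Wire toWire();

    /** Picks the Java implementation based on which wire field applies. */
    static ReplicationConfig fromWire(Wire w) {
      switch (w.type) {
        case RATIS:
          return new RatisReplicationConfig(w.replicationFactor);
        case STANDALONE:
          return new StandaloneReplicationConfig(w.replicationFactor);
        case EC:
          return new ECReplicationConfig(w.ec.data, w.ec.parity);
        default:
          throw new IllegalArgumentException("unknown type: " + w.type);
      }
    }
  }

  static final class RatisReplicationConfig extends ReplicationConfig {
    final int factor;

    RatisReplicationConfig(int factor) {
      this.factor = factor;
    }

    @Override
    Wire toWire() {
      return new Wire(ReplicationType.RATIS, factor, null);
    }
  }

  static final class StandaloneReplicationConfig extends ReplicationConfig {
    final int factor;

    StandaloneReplicationConfig(int factor) {
      this.factor = factor;
    }

    @Override
    Wire toWire() {
      return new Wire(ReplicationType.STANDALONE, factor, null);
    }
  }

  static final class ECReplicationConfig extends ReplicationConfig {
    final int data;
    final int parity;

    ECReplicationConfig(int data, int parity) {
      this.data = data;
      this.parity = parity;
    }

    @Override
    Wire toWire() {
      return new Wire(ReplicationType.EC, null, new ECProto(data, parity));
    }
  }
}
```

The sketch also shows the limitation mentioned above: the RATIS and
STANDALONE configs have nowhere on the wire to put anything beyond the
factor.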

That's the reason why I started with the more generic approach in PR#1973.

Thanks,
Marton

On 3/16/21 10:39 PM, Arpit Agarwal wrote:
> The idea sounds reasonable. Compatibility is an open question. We will have to bring this change through the upgrade framework since there will be changes to OM on-disk formats.
> 
> Upgrades and downgrades will be tricky IAC. It needs more thought.
> 
> Thanks,
> Arpit

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@ozone.apache.org
For additional commands, e-mail: dev-help@ozone.apache.org


Re: Deprecating ReplicationFactor and introducing ReplicationConfig

Posted by Arpit Agarwal <aa...@cloudera.com.INVALID>.
The idea sounds reasonable. Compatibility is an open question. We will have to bring this change through the upgrade framework since there will be changes to OM on-disk formats.

Upgrades and downgrades will be tricky IAC. It needs more thought.

Thanks,
Arpit




