You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hudi.apache.org by tison <wa...@gmail.com> on 2020/04/21 06:32:34 UTC

Generic Types of HoodieRecordPayload

Hi developers,

Recently when I went through Hudi codebase, I found some annoying warning
'raw use of
parameterized class' and some other generic type related.

It seems the root of this wide usage of raw type is HoodieRecordPayload
where

  HoodieRecordPayload<T extends HoodieRecordPayload>

should have been

  HoodieRecordPayload<T extends HoodieRecordPayload<T>>

I'm not sure what community think of this wide usage of raw type and given
it affect most of our
code I'd like to start this thread for advice. Personally explicit type
signature benefits from strong
type system but it might cause some signature breaking changes as well as
require a huge
engineering effort.

Shall Hudi live with these raw types? Or we will have a plan migrate to
explicit generic type?

Best,
tison.

Re: Generic Types of HoodieRecordPayload

Posted by Vinoth Chandar <vi...@apache.org>.
Hi Tison,

But custom payloads are commonly used to plugin logic to merge base and
delta records. So, yes, its already out there :/

Let me take a closer pass at this and circle back.

Thanks
Vinoth

On Thu, Apr 23, 2020 at 7:44 AM tison <wa...@gmail.com> wrote:

> Thanks for your reply Vinoth.
>
> First of all I'd like to know about "custom payloads". Previously
> I tend to regard such payloads as internal classes. If we already
> exposing it to users, it is possibly we can do little thing to change
> anything about the interface. A reasonable way would be we
> deprecate HoodieRecordPayload which lacks of type design
> and do adapt logic with a new payload class. Although, I think it
> would be then useless to do it and we have to live with current
> HoodieRecordPayload. Given it is one of primary class and already
> user facing, we might live with it and the raw usage forever.
>
> Best,
> tison.
>
>
> Vinoth Chandar <vi...@apache.org> 于2020年4月23日周四 下午10:31写道:
>
> > Thanks Tison! One consideration we need to have is that we cannot have a
> > breaking non-backwards compatible change for existing users with custom
> > payloads.
> > It will be a rough experience.. Guessing some effort would be needed to
> > upgrade existing payload classes right?
> >
> > On Wed, Apr 22, 2020 at 10:14 PM tison <wa...@gmail.com> wrote:
> >
> > > Thanks for your feedback!
> > >
> > > Here is a quick summary about the type parameter of
> HoodieRecordPayload.
> > >
> > > 1. The type parameter is used to refer "self type" which is generally
> > good
> > > for
> > > access subclass methods/fields without unchecked casting.
> > >
> > > 2. However, the concrete payload is accessed by
> > `combineAndGetUpdateValue`
> > > or `getInsertValue` which doesn't get benefit from the type parameter.
> > BTW,
> > > the
> > > return type of these two methods suffers from unchecked casting also,
> > which
> > > is
> > > filed as HUDI-834[1].
> > >
> > > 3. So, the only usage of the type parameter(self type) is `preCombine`
> > > method.
> > > Though, at the real use point we pass raw typed HoodieRecordPayload as
> > > the parameter[2]. I'm no sure the behavior when casting failed and want
> > to
> > > ask for advice. So far, we cast at runtime and a cast exception will be
> > > thrown
> > > if failed.
> > >
> > > Given the cases above, I tend to change the definition
> > > of HoodieRecordPayload
> > > as
> > >
> > >   public interface HoodieRecordPayload { }
> > >
> > > if we can well-define how to do in `preCombine` when type mismatch. The
> > > alternative,
> > > i.e., a full paramterized self type type parameter will affect quite a
> > wide
> > > range of code
> > > where we have to cascade change other raw usages such as
> > > HoodieTable/HoodieRecord/HoodieWriteHandle and so on; while the self
> type
> > > parameter
> > > doesn't bring a lot benefit.
> > >
> > > Best,
> > > tison.
> > >
> > > [1] https://issues.apache.org/jira/browse/HUDI-834
> > > [2]
> > org/apache/hudi/common/table/log/HoodieMergedLogRecordScanner.java:114
> > >
> > >
> > >
> > > Vinoth Chandar <vi...@apache.org> 于2020年4月23日周四 上午12:04写道:
> > >
> > > > +1 raising a JIRA and summarizing some findings would be great.
> > > >
> > > > On Wed, Apr 22, 2020 at 8:38 AM leesf <le...@gmail.com> wrote:
> > > >
> > > > > hi tison,
> > > > >
> > > > > It is always better to make the codebase more clear, so it would be
> > > great
> > > > > if you would do an investigation.
> > > > >
> > > > > tison <wa...@gmail.com> 于2020年4月22日周三 下午2:42写道:
> > > > >
> > > > > > Hi Vinoth,
> > > > > >
> > > > > > >much of the code actually does not depend on the templatized
> type
> > at
> > > > all
> > > > > >
> > > > > > Agree.
> > > > > >
> > > > > > Let's say I'm ok with untyped HoodieRecordPayload since all
> payload
> > > is
> > > > > > effectively untyped(Object) which we deal with type and cast at
> > > > runtime.
> > > > > >
> > > > > > Though, it is noisy we implement it in a half-generic form.
> > > Meanwhile,
> > > > > > wildcard doesn't work for every case since <? capture
> > > > > HoodieRecordPayload>
> > > > > > is NOT compatible with <? capture HoodieRecordPayload> and any
> > > > > > exact T extends HoodieRecordPayload.
> > > > > >
> > > > > > We don't actually use the type parameter heavily, so it is an
> > > > alternative
> > > > > > that we define HoodieRecordPayload just
> > > > > >
> > > > > >   public class HoodieRecordPayload { }
> > > > > >
> > > > > > If the community think it is a worth effort, I'm glad to do more
> > > > > > investigation
> > > > > > and evaluate its impact, also find a practical way.
> > > > > >
> > > > > > Best,
> > > > > > tison.
> > > > > >
> > > > > >
> > > > > > Vinoth Chandar <vi...@apache.org> 于2020年4月22日周三 下午2:22写道:
> > > > > >
> > > > > > > Hi Tison,
> > > > > > >
> > > > > > > Thanks for raising this.. In most places doing a HoodieTable<?>
> > > > > wildcard
> > > > > > > should be totally acceptable, since much of the code actually
> > does
> > > > not
> > > > > > > depend on the templatized type at all..
> > > > > > >
> > > > > > > Def, worth taking another look holistically and see if we can
> > > address
> > > > > > > this..
> > > > > > >
> > > > > > > My 2c.
> > > > > > > Vinoth
> > > > > > >
> > > > > > >
> > > > > > > On Mon, Apr 20, 2020 at 11:32 PM tison <wa...@gmail.com>
> > > wrote:
> > > > > > >
> > > > > > > > Hi developers,
> > > > > > > >
> > > > > > > > Recently when I went through Hudi codebase, I found some
> > annoying
> > > > > > warning
> > > > > > > > 'raw use of
> > > > > > > > parameterized class' and some other generic type related.
> > > > > > > >
> > > > > > > > It seems the root of this wide usage of raw type is
> > > > > HoodieRecordPayload
> > > > > > > > where
> > > > > > > >
> > > > > > > >   HoodieRecordPayload<T extends HoodieRecordPayload>
> > > > > > > >
> > > > > > > > should have been
> > > > > > > >
> > > > > > > >   HoodieRecordPayload<T extends HoodieRecordPayload<T>>
> > > > > > > >
> > > > > > > > I'm not sure what community think of this wide usage of raw
> > type
> > > > and
> > > > > > > given
> > > > > > > > it affect most of our
> > > > > > > > code I'd like to start this thread for advice. Personally
> > > explicit
> > > > > type
> > > > > > > > signature benefits from strong
> > > > > > > > type system but it might cause some signature breaking
> changes
> > as
> > > > > well
> > > > > > as
> > > > > > > > require a huge
> > > > > > > > engineering effort.
> > > > > > > >
> > > > > > > > Shall Hudi live with these raw types? Or we will have a plan
> > > > migrate
> > > > > to
> > > > > > > > explicit generic type?
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > tison.
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: Generic Types of HoodieRecordPayload

Posted by tison <wa...@gmail.com>.
Thanks for your reply Vinoth.

First of all I'd like to know about "custom payloads". Previously
I tend to regard such payloads as internal classes. If we already
exposing it to users, it is possibly we can do little thing to change
anything about the interface. A reasonable way would be we
deprecate HoodieRecordPayload which lacks of type design
and do adapt logic with a new payload class. Although, I think it
would be then useless to do it and we have to live with current
HoodieRecordPayload. Given it is one of primary class and already
user facing, we might live with it and the raw usage forever.

Best,
tison.


Vinoth Chandar <vi...@apache.org> 于2020年4月23日周四 下午10:31写道:

> Thanks Tison! One consideration we need to have is that we cannot have a
> breaking non-backwards compatible change for existing users with custom
> payloads.
> It will be a rough experience.. Guessing some effort would be needed to
> upgrade existing payload classes right?
>
> On Wed, Apr 22, 2020 at 10:14 PM tison <wa...@gmail.com> wrote:
>
> > Thanks for your feedback!
> >
> > Here is a quick summary about the type parameter of HoodieRecordPayload.
> >
> > 1. The type parameter is used to refer "self type" which is generally
> good
> > for
> > access subclass methods/fields without unchecked casting.
> >
> > 2. However, the concrete payload is accessed by
> `combineAndGetUpdateValue`
> > or `getInsertValue` which doesn't get benefit from the type parameter.
> BTW,
> > the
> > return type of these two methods suffers from unchecked casting also,
> which
> > is
> > filed as HUDI-834[1].
> >
> > 3. So, the only usage of the type parameter(self type) is `preCombine`
> > method.
> > Though, at the real use point we pass raw typed HoodieRecordPayload as
> > the parameter[2]. I'm no sure the behavior when casting failed and want
> to
> > ask for advice. So far, we cast at runtime and a cast exception will be
> > thrown
> > if failed.
> >
> > Given the cases above, I tend to change the definition
> > of HoodieRecordPayload
> > as
> >
> >   public interface HoodieRecordPayload { }
> >
> > if we can well-define how to do in `preCombine` when type mismatch. The
> > alternative,
> > i.e., a full paramterized self type type parameter will affect quite a
> wide
> > range of code
> > where we have to cascade change other raw usages such as
> > HoodieTable/HoodieRecord/HoodieWriteHandle and so on; while the self type
> > parameter
> > doesn't bring a lot benefit.
> >
> > Best,
> > tison.
> >
> > [1] https://issues.apache.org/jira/browse/HUDI-834
> > [2]
> org/apache/hudi/common/table/log/HoodieMergedLogRecordScanner.java:114
> >
> >
> >
> > Vinoth Chandar <vi...@apache.org> 于2020年4月23日周四 上午12:04写道:
> >
> > > +1 raising a JIRA and summarizing some findings would be great.
> > >
> > > On Wed, Apr 22, 2020 at 8:38 AM leesf <le...@gmail.com> wrote:
> > >
> > > > hi tison,
> > > >
> > > > It is always better to make the codebase more clear, so it would be
> > great
> > > > if you would do an investigation.
> > > >
> > > > tison <wa...@gmail.com> 于2020年4月22日周三 下午2:42写道:
> > > >
> > > > > Hi Vinoth,
> > > > >
> > > > > >much of the code actually does not depend on the templatized type
> at
> > > all
> > > > >
> > > > > Agree.
> > > > >
> > > > > Let's say I'm ok with untyped HoodieRecordPayload since all payload
> > is
> > > > > effectively untyped(Object) which we deal with type and cast at
> > > runtime.
> > > > >
> > > > > Though, it is noisy we implement it in a half-generic form.
> > Meanwhile,
> > > > > wildcard doesn't work for every case since <? capture
> > > > HoodieRecordPayload>
> > > > > is NOT compatible with <? capture HoodieRecordPayload> and any
> > > > > exact T extends HoodieRecordPayload.
> > > > >
> > > > > We don't actually use the type parameter heavily, so it is an
> > > alternative
> > > > > that we define HoodieRecordPayload just
> > > > >
> > > > >   public class HoodieRecordPayload { }
> > > > >
> > > > > If the community think it is a worth effort, I'm glad to do more
> > > > > investigation
> > > > > and evaluate its impact, also find a practical way.
> > > > >
> > > > > Best,
> > > > > tison.
> > > > >
> > > > >
> > > > > Vinoth Chandar <vi...@apache.org> 于2020年4月22日周三 下午2:22写道:
> > > > >
> > > > > > Hi Tison,
> > > > > >
> > > > > > Thanks for raising this.. In most places doing a HoodieTable<?>
> > > > wildcard
> > > > > > should be totally acceptable, since much of the code actually
> does
> > > not
> > > > > > depend on the templatized type at all..
> > > > > >
> > > > > > Def, worth taking another look holistically and see if we can
> > address
> > > > > > this..
> > > > > >
> > > > > > My 2c.
> > > > > > Vinoth
> > > > > >
> > > > > >
> > > > > > On Mon, Apr 20, 2020 at 11:32 PM tison <wa...@gmail.com>
> > wrote:
> > > > > >
> > > > > > > Hi developers,
> > > > > > >
> > > > > > > Recently when I went through Hudi codebase, I found some
> annoying
> > > > > warning
> > > > > > > 'raw use of
> > > > > > > parameterized class' and some other generic type related.
> > > > > > >
> > > > > > > It seems the root of this wide usage of raw type is
> > > > HoodieRecordPayload
> > > > > > > where
> > > > > > >
> > > > > > >   HoodieRecordPayload<T extends HoodieRecordPayload>
> > > > > > >
> > > > > > > should have been
> > > > > > >
> > > > > > >   HoodieRecordPayload<T extends HoodieRecordPayload<T>>
> > > > > > >
> > > > > > > I'm not sure what community think of this wide usage of raw
> type
> > > and
> > > > > > given
> > > > > > > it affect most of our
> > > > > > > code I'd like to start this thread for advice. Personally
> > explicit
> > > > type
> > > > > > > signature benefits from strong
> > > > > > > type system but it might cause some signature breaking changes
> as
> > > > well
> > > > > as
> > > > > > > require a huge
> > > > > > > engineering effort.
> > > > > > >
> > > > > > > Shall Hudi live with these raw types? Or we will have a plan
> > > migrate
> > > > to
> > > > > > > explicit generic type?
> > > > > > >
> > > > > > > Best,
> > > > > > > tison.
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: Generic Types of HoodieRecordPayload

Posted by Vinoth Chandar <vi...@apache.org>.
Thanks Tison! One consideration we need to have is that we cannot have a
breaking non-backwards compatible change for existing users with custom
payloads.
It will be a rough experience.. Guessing some effort would be needed to
upgrade existing payload classes right?

On Wed, Apr 22, 2020 at 10:14 PM tison <wa...@gmail.com> wrote:

> Thanks for your feedback!
>
> Here is a quick summary about the type parameter of HoodieRecordPayload.
>
> 1. The type parameter is used to refer "self type" which is generally good
> for
> access subclass methods/fields without unchecked casting.
>
> 2. However, the concrete payload is accessed by `combineAndGetUpdateValue`
> or `getInsertValue` which doesn't get benefit from the type parameter. BTW,
> the
> return type of these two methods suffers from unchecked casting also, which
> is
> filed as HUDI-834[1].
>
> 3. So, the only usage of the type parameter(self type) is `preCombine`
> method.
> Though, at the real use point we pass raw typed HoodieRecordPayload as
> the parameter[2]. I'm no sure the behavior when casting failed and want to
> ask for advice. So far, we cast at runtime and a cast exception will be
> thrown
> if failed.
>
> Given the cases above, I tend to change the definition
> of HoodieRecordPayload
> as
>
>   public interface HoodieRecordPayload { }
>
> if we can well-define how to do in `preCombine` when type mismatch. The
> alternative,
> i.e., a full paramterized self type type parameter will affect quite a wide
> range of code
> where we have to cascade change other raw usages such as
> HoodieTable/HoodieRecord/HoodieWriteHandle and so on; while the self type
> parameter
> doesn't bring a lot benefit.
>
> Best,
> tison.
>
> [1] https://issues.apache.org/jira/browse/HUDI-834
> [2] org/apache/hudi/common/table/log/HoodieMergedLogRecordScanner.java:114
>
>
>
> Vinoth Chandar <vi...@apache.org> 于2020年4月23日周四 上午12:04写道:
>
> > +1 raising a JIRA and summarizing some findings would be great.
> >
> > On Wed, Apr 22, 2020 at 8:38 AM leesf <le...@gmail.com> wrote:
> >
> > > hi tison,
> > >
> > > It is always better to make the codebase more clear, so it would be
> great
> > > if you would do an investigation.
> > >
> > > tison <wa...@gmail.com> 于2020年4月22日周三 下午2:42写道:
> > >
> > > > Hi Vinoth,
> > > >
> > > > >much of the code actually does not depend on the templatized type at
> > all
> > > >
> > > > Agree.
> > > >
> > > > Let's say I'm ok with untyped HoodieRecordPayload since all payload
> is
> > > > effectively untyped(Object) which we deal with type and cast at
> > runtime.
> > > >
> > > > Though, it is noisy we implement it in a half-generic form.
> Meanwhile,
> > > > wildcard doesn't work for every case since <? capture
> > > HoodieRecordPayload>
> > > > is NOT compatible with <? capture HoodieRecordPayload> and any
> > > > exact T extends HoodieRecordPayload.
> > > >
> > > > We don't actually use the type parameter heavily, so it is an
> > alternative
> > > > that we define HoodieRecordPayload just
> > > >
> > > >   public class HoodieRecordPayload { }
> > > >
> > > > If the community think it is a worth effort, I'm glad to do more
> > > > investigation
> > > > and evaluate its impact, also find a practical way.
> > > >
> > > > Best,
> > > > tison.
> > > >
> > > >
> > > > Vinoth Chandar <vi...@apache.org> 于2020年4月22日周三 下午2:22写道:
> > > >
> > > > > Hi Tison,
> > > > >
> > > > > Thanks for raising this.. In most places doing a HoodieTable<?>
> > > wildcard
> > > > > should be totally acceptable, since much of the code actually does
> > not
> > > > > depend on the templatized type at all..
> > > > >
> > > > > Def, worth taking another look holistically and see if we can
> address
> > > > > this..
> > > > >
> > > > > My 2c.
> > > > > Vinoth
> > > > >
> > > > >
> > > > > On Mon, Apr 20, 2020 at 11:32 PM tison <wa...@gmail.com>
> wrote:
> > > > >
> > > > > > Hi developers,
> > > > > >
> > > > > > Recently when I went through Hudi codebase, I found some annoying
> > > > warning
> > > > > > 'raw use of
> > > > > > parameterized class' and some other generic type related.
> > > > > >
> > > > > > It seems the root of this wide usage of raw type is
> > > HoodieRecordPayload
> > > > > > where
> > > > > >
> > > > > >   HoodieRecordPayload<T extends HoodieRecordPayload>
> > > > > >
> > > > > > should have been
> > > > > >
> > > > > >   HoodieRecordPayload<T extends HoodieRecordPayload<T>>
> > > > > >
> > > > > > I'm not sure what community think of this wide usage of raw type
> > and
> > > > > given
> > > > > > it affect most of our
> > > > > > code I'd like to start this thread for advice. Personally
> explicit
> > > type
> > > > > > signature benefits from strong
> > > > > > type system but it might cause some signature breaking changes as
> > > well
> > > > as
> > > > > > require a huge
> > > > > > engineering effort.
> > > > > >
> > > > > > Shall Hudi live with these raw types? Or we will have a plan
> > migrate
> > > to
> > > > > > explicit generic type?
> > > > > >
> > > > > > Best,
> > > > > > tison.
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: Generic Types of HoodieRecordPayload

Posted by tison <wa...@gmail.com>.
Thanks for your feedback!

Here is a quick summary about the type parameter of HoodieRecordPayload.

1. The type parameter is used to refer "self type" which is generally good
for
access subclass methods/fields without unchecked casting.

2. However, the concrete payload is accessed by `combineAndGetUpdateValue`
or `getInsertValue` which doesn't get benefit from the type parameter. BTW,
the
return type of these two methods suffers from unchecked casting also, which
is
filed as HUDI-834[1].

3. So, the only usage of the type parameter(self type) is `preCombine`
method.
Though, at the real use point we pass raw typed HoodieRecordPayload as
the parameter[2]. I'm no sure the behavior when casting failed and want to
ask for advice. So far, we cast at runtime and a cast exception will be
thrown
if failed.

Given the cases above, I tend to change the definition
of HoodieRecordPayload
as

  public interface HoodieRecordPayload { }

if we can well-define how to do in `preCombine` when type mismatch. The
alternative,
i.e., a full paramterized self type type parameter will affect quite a wide
range of code
where we have to cascade change other raw usages such as
HoodieTable/HoodieRecord/HoodieWriteHandle and so on; while the self type
parameter
doesn't bring a lot benefit.

Best,
tison.

[1] https://issues.apache.org/jira/browse/HUDI-834
[2] org/apache/hudi/common/table/log/HoodieMergedLogRecordScanner.java:114



Vinoth Chandar <vi...@apache.org> 于2020年4月23日周四 上午12:04写道:

> +1 raising a JIRA and summarizing some findings would be great.
>
> On Wed, Apr 22, 2020 at 8:38 AM leesf <le...@gmail.com> wrote:
>
> > hi tison,
> >
> > It is always better to make the codebase more clear, so it would be great
> > if you would do an investigation.
> >
> > tison <wa...@gmail.com> 于2020年4月22日周三 下午2:42写道:
> >
> > > Hi Vinoth,
> > >
> > > >much of the code actually does not depend on the templatized type at
> all
> > >
> > > Agree.
> > >
> > > Let's say I'm ok with untyped HoodieRecordPayload since all payload is
> > > effectively untyped(Object) which we deal with type and cast at
> runtime.
> > >
> > > Though, it is noisy we implement it in a half-generic form. Meanwhile,
> > > wildcard doesn't work for every case since <? capture
> > HoodieRecordPayload>
> > > is NOT compatible with <? capture HoodieRecordPayload> and any
> > > exact T extends HoodieRecordPayload.
> > >
> > > We don't actually use the type parameter heavily, so it is an
> alternative
> > > that we define HoodieRecordPayload just
> > >
> > >   public class HoodieRecordPayload { }
> > >
> > > If the community think it is a worth effort, I'm glad to do more
> > > investigation
> > > and evaluate its impact, also find a practical way.
> > >
> > > Best,
> > > tison.
> > >
> > >
> > > Vinoth Chandar <vi...@apache.org> 于2020年4月22日周三 下午2:22写道:
> > >
> > > > Hi Tison,
> > > >
> > > > Thanks for raising this.. In most places doing a HoodieTable<?>
> > wildcard
> > > > should be totally acceptable, since much of the code actually does
> not
> > > > depend on the templatized type at all..
> > > >
> > > > Def, worth taking another look holistically and see if we can address
> > > > this..
> > > >
> > > > My 2c.
> > > > Vinoth
> > > >
> > > >
> > > > On Mon, Apr 20, 2020 at 11:32 PM tison <wa...@gmail.com> wrote:
> > > >
> > > > > Hi developers,
> > > > >
> > > > > Recently when I went through Hudi codebase, I found some annoying
> > > warning
> > > > > 'raw use of
> > > > > parameterized class' and some other generic type related.
> > > > >
> > > > > It seems the root of this wide usage of raw type is
> > HoodieRecordPayload
> > > > > where
> > > > >
> > > > >   HoodieRecordPayload<T extends HoodieRecordPayload>
> > > > >
> > > > > should have been
> > > > >
> > > > >   HoodieRecordPayload<T extends HoodieRecordPayload<T>>
> > > > >
> > > > > I'm not sure what community think of this wide usage of raw type
> and
> > > > given
> > > > > it affect most of our
> > > > > code I'd like to start this thread for advice. Personally explicit
> > type
> > > > > signature benefits from strong
> > > > > type system but it might cause some signature breaking changes as
> > well
> > > as
> > > > > require a huge
> > > > > engineering effort.
> > > > >
> > > > > Shall Hudi live with these raw types? Or we will have a plan
> migrate
> > to
> > > > > explicit generic type?
> > > > >
> > > > > Best,
> > > > > tison.
> > > > >
> > > >
> > >
> >
>

Re: Generic Types of HoodieRecordPayload

Posted by Vinoth Chandar <vi...@apache.org>.
+1 raising a JIRA and summarizing some findings would be great.

On Wed, Apr 22, 2020 at 8:38 AM leesf <le...@gmail.com> wrote:

> hi tison,
>
> It is always better to make the codebase more clear, so it would be great
> if you would do an investigation.
>
> tison <wa...@gmail.com> 于2020年4月22日周三 下午2:42写道:
>
> > Hi Vinoth,
> >
> > >much of the code actually does not depend on the templatized type at all
> >
> > Agree.
> >
> > Let's say I'm ok with untyped HoodieRecordPayload since all payload is
> > effectively untyped(Object) which we deal with type and cast at runtime.
> >
> > Though, it is noisy we implement it in a half-generic form. Meanwhile,
> > wildcard doesn't work for every case since <? capture
> HoodieRecordPayload>
> > is NOT compatible with <? capture HoodieRecordPayload> and any
> > exact T extends HoodieRecordPayload.
> >
> > We don't actually use the type parameter heavily, so it is an alternative
> > that we define HoodieRecordPayload just
> >
> >   public class HoodieRecordPayload { }
> >
> > If the community think it is a worth effort, I'm glad to do more
> > investigation
> > and evaluate its impact, also find a practical way.
> >
> > Best,
> > tison.
> >
> >
> > Vinoth Chandar <vi...@apache.org> 于2020年4月22日周三 下午2:22写道:
> >
> > > Hi Tison,
> > >
> > > Thanks for raising this.. In most places doing a HoodieTable<?>
> wildcard
> > > should be totally acceptable, since much of the code actually does not
> > > depend on the templatized type at all..
> > >
> > > Def, worth taking another look holistically and see if we can address
> > > this..
> > >
> > > My 2c.
> > > Vinoth
> > >
> > >
> > > On Mon, Apr 20, 2020 at 11:32 PM tison <wa...@gmail.com> wrote:
> > >
> > > > Hi developers,
> > > >
> > > > Recently when I went through Hudi codebase, I found some annoying
> > warning
> > > > 'raw use of
> > > > parameterized class' and some other generic type related.
> > > >
> > > > It seems the root of this wide usage of raw type is
> HoodieRecordPayload
> > > > where
> > > >
> > > >   HoodieRecordPayload<T extends HoodieRecordPayload>
> > > >
> > > > should have been
> > > >
> > > >   HoodieRecordPayload<T extends HoodieRecordPayload<T>>
> > > >
> > > > I'm not sure what community think of this wide usage of raw type and
> > > given
> > > > it affect most of our
> > > > code I'd like to start this thread for advice. Personally explicit
> type
> > > > signature benefits from strong
> > > > type system but it might cause some signature breaking changes as
> well
> > as
> > > > require a huge
> > > > engineering effort.
> > > >
> > > > Shall Hudi live with these raw types? Or we will have a plan migrate
> to
> > > > explicit generic type?
> > > >
> > > > Best,
> > > > tison.
> > > >
> > >
> >
>

Re: Generic Types of HoodieRecordPayload

Posted by leesf <le...@gmail.com>.
hi tison,

It is always better to make the codebase more clear, so it would be great
if you would do an investigation.

tison <wa...@gmail.com> 于2020年4月22日周三 下午2:42写道:

> Hi Vinoth,
>
> >much of the code actually does not depend on the templatized type at all
>
> Agree.
>
> Let's say I'm ok with untyped HoodieRecordPayload since all payload is
> effectively untyped(Object) which we deal with type and cast at runtime.
>
> Though, it is noisy we implement it in a half-generic form. Meanwhile,
> wildcard doesn't work for every case since <? capture HoodieRecordPayload>
> is NOT compatible with <? capture HoodieRecordPayload> and any
> exact T extends HoodieRecordPayload.
>
> We don't actually use the type parameter heavily, so it is an alternative
> that we define HoodieRecordPayload just
>
>   public class HoodieRecordPayload { }
>
> If the community think it is a worth effort, I'm glad to do more
> investigation
> and evaluate its impact, also find a practical way.
>
> Best,
> tison.
>
>
> Vinoth Chandar <vi...@apache.org> 于2020年4月22日周三 下午2:22写道:
>
> > Hi Tison,
> >
> > Thanks for raising this.. In most places doing a HoodieTable<?> wildcard
> > should be totally acceptable, since much of the code actually does not
> > depend on the templatized type at all..
> >
> > Def, worth taking another look holistically and see if we can address
> > this..
> >
> > My 2c.
> > Vinoth
> >
> >
> > On Mon, Apr 20, 2020 at 11:32 PM tison <wa...@gmail.com> wrote:
> >
> > > Hi developers,
> > >
> > > Recently when I went through Hudi codebase, I found some annoying
> warning
> > > 'raw use of
> > > parameterized class' and some other generic type related.
> > >
> > > It seems the root of this wide usage of raw type is HoodieRecordPayload
> > > where
> > >
> > >   HoodieRecordPayload<T extends HoodieRecordPayload>
> > >
> > > should have been
> > >
> > >   HoodieRecordPayload<T extends HoodieRecordPayload<T>>
> > >
> > > I'm not sure what community think of this wide usage of raw type and
> > given
> > > it affect most of our
> > > code I'd like to start this thread for advice. Personally explicit type
> > > signature benefits from strong
> > > type system but it might cause some signature breaking changes as well
> as
> > > require a huge
> > > engineering effort.
> > >
> > > Shall Hudi live with these raw types? Or we will have a plan migrate to
> > > explicit generic type?
> > >
> > > Best,
> > > tison.
> > >
> >
>

Re: Generic Types of HoodieRecordPayload

Posted by tison <wa...@gmail.com>.
Hi Vinoth,

>much of the code actually does not depend on the templatized type at all

Agree.

Let's say I'm ok with untyped HoodieRecordPayload since all payload is
effectively untyped(Object) which we deal with type and cast at runtime.

Though, it is noisy we implement it in a half-generic form. Meanwhile,
wildcard doesn't work for every case since <? capture HoodieRecordPayload>
is NOT compatible with <? capture HoodieRecordPayload> and any
exact T extends HoodieRecordPayload.

We don't actually use the type parameter heavily, so it is an alternative
that we define HoodieRecordPayload just

  public class HoodieRecordPayload { }

If the community think it is a worth effort, I'm glad to do more
investigation
and evaluate its impact, also find a practical way.

Best,
tison.


Vinoth Chandar <vi...@apache.org> 于2020年4月22日周三 下午2:22写道:

> Hi Tison,
>
> Thanks for raising this.. In most places doing a HoodieTable<?> wildcard
> should be totally acceptable, since much of the code actually does not
> depend on the templatized type at all..
>
> Def, worth taking another look holistically and see if we can address
> this..
>
> My 2c.
> Vinoth
>
>
> On Mon, Apr 20, 2020 at 11:32 PM tison <wa...@gmail.com> wrote:
>
> > Hi developers,
> >
> > Recently when I went through Hudi codebase, I found some annoying warning
> > 'raw use of
> > parameterized class' and some other generic type related.
> >
> > It seems the root of this wide usage of raw type is HoodieRecordPayload
> > where
> >
> >   HoodieRecordPayload<T extends HoodieRecordPayload>
> >
> > should have been
> >
> >   HoodieRecordPayload<T extends HoodieRecordPayload<T>>
> >
> > I'm not sure what community think of this wide usage of raw type and
> given
> > it affect most of our
> > code I'd like to start this thread for advice. Personally explicit type
> > signature benefits from strong
> > type system but it might cause some signature breaking changes as well as
> > require a huge
> > engineering effort.
> >
> > Shall Hudi live with these raw types? Or we will have a plan migrate to
> > explicit generic type?
> >
> > Best,
> > tison.
> >
>

Re: Generic Types of HoodieRecordPayload

Posted by Vinoth Chandar <vi...@apache.org>.
Hi Tison,

Thanks for raising this.. In most places doing a HoodieTable<?> wildcard
should be totally acceptable, since much of the code actually does not
depend on the templatized type at all..

Def, worth taking another look holistically and see if we can address
this..

My 2c.
Vinoth


On Mon, Apr 20, 2020 at 11:32 PM tison <wa...@gmail.com> wrote:

> Hi developers,
>
> Recently when I went through Hudi codebase, I found some annoying warning
> 'raw use of
> parameterized class' and some other generic type related.
>
> It seems the root of this wide usage of raw type is HoodieRecordPayload
> where
>
>   HoodieRecordPayload<T extends HoodieRecordPayload>
>
> should have been
>
>   HoodieRecordPayload<T extends HoodieRecordPayload<T>>
>
> I'm not sure what community think of this wide usage of raw type and given
> it affect most of our
> code I'd like to start this thread for advice. Personally explicit type
> signature benefits from strong
> type system but it might cause some signature breaking changes as well as
> require a huge
> engineering effort.
>
> Shall Hudi live with these raw types? Or we will have a plan migrate to
> explicit generic type?
>
> Best,
> tison.
>