You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@kafka.apache.org by Cyrus Vafadari <cy...@confluent.io> on 2019/08/04 02:01:26 UTC

[DISCUSS] KIP-502: Connect SinkTask.put(...) to specify ArrayList in Signature

Hi all,

I've written a KIP to update the SinkTask abstract class to specify that
the `put` method will take ArrayList. I think this will greatly simplify
connector development, so you aren't limited to the simplest iterations. It
will also harden the ordering guarantees of the API.

Looking forward to the feedback!

https://cwiki.apache.org/confluence/display/KAFKA/KIP-502%3A+Connect+SinkTask.put%28...%29+to+specify+ArrayList%3CSinkRecord%3E+in+Signature

Cyrus

Re: [DISCUSS] KIP-502: Connect SinkTask.put(...) to specify ArrayList in Signature

Posted by Tom Bentley <tb...@redhat.com>.
Hi Cyrus,

Thanks for the KIP.

In the motivation I can see how having a guaranteed iteration order might
be useful for some connectors, but I couldn't see how a Connector developer
would benefit from using List.get(int). Could you elaborate? Can you offer
any numbers to show that the cost of letting the connector instantiate and
sort it's own list is prohibitive (rather than being merely annoying)?

On its own changing the type to List doesn't actually provide the semantics
you want. All a List tells you, the implementer, is that whatever ordering
was used to build the collection/list is maintained. It does not in itself
define what that ordering is. So I think you'd have to define the ordering
as part of the contract. But that's problematic because the iteration order
of the List ultimately depends on the iteration order for ConsumerRecords
and AFAIK that's not defined as part of any public API.

Also, I don't think what is proposed will be binary compatible with
existing connectors. Adding an abstract method in the base class will mean
the JVM will throw AbstractMethodError when a caller tries to invoke the
method on existing connectors (compiled with the old base class). I think
your put(List) would have to be concrete and delegate to put(Collection) in
order to provide binary compatibility.

Kind regards,

Tom



On Tue, Sep 24, 2019 at 9:31 PM Cyrus Vafadari <cv...@gmail.com> wrote:

> Almog, I think that's a great point -- I will update the KIP to reflect
> your suggestion!
>
> On Tue, Sep 24, 2019 at 12:46 PM Almog Gavra <al...@confluent.io> wrote:
>
> > Thanks Cyrus! I think this change is a good step in hardening the API. I
> do
> > believe that APIs should be defined by functionality and not performance
> > characteristics, so I'd prefer using List<> over ArrayList<> (the
> > alternative you mention in rejected). That also gives us leeway in the
> > future to swap implementations without breaking compatibility (e.g. we
> want
> > to pass an ImmutableList or Collections.unmodifiableList instead of
> > ArrayList or discover some unknown more efficient implementation than
> > ArrayList).
> >
> > Almog
> >
> > On Sat, Aug 3, 2019 at 7:01 PM Cyrus Vafadari <cy...@confluent.io>
> wrote:
> >
> > > Hi all,
> > >
> > > I've written a KIP to update the SinkTask abstract class to specify
> that
> > > the `put` method will take ArrayList. I think this will greatly
> simplify
> > > connector development, so you aren't limited to the simplest
> iterations.
> > It
> > > will also harden the ordering guarantees of the API.
> > >
> > > Looking forward to the feedback!
> > >
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-502%3A+Connect+SinkTask.put%28...%29+to+specify+ArrayList%3CSinkRecord%3E+in+Signature
> > >
> > > Cyrus
> > >
> >
>

Re: [DISCUSS] KIP-502: Connect SinkTask.put(...) to specify ArrayList in Signature

Posted by Cyrus Vafadari <cv...@gmail.com>.
Almog, I think that's a great point -- I will update the KIP to reflect
your suggestion!

On Tue, Sep 24, 2019 at 12:46 PM Almog Gavra <al...@confluent.io> wrote:

> Thanks Cyrus! I think this change is a good step in hardening the API. I do
> believe that APIs should be defined by functionality and not performance
> characteristics, so I'd prefer using List<> over ArrayList<> (the
> alternative you mention in rejected). That also gives us leeway in the
> future to swap implementations without breaking compatibility (e.g. we want
> to pass an ImmutableList or Collections.unmodifiableList instead of
> ArrayList or discover some unknown more efficient implementation than
> ArrayList).
>
> Almog
>
> On Sat, Aug 3, 2019 at 7:01 PM Cyrus Vafadari <cy...@confluent.io> wrote:
>
> > Hi all,
> >
> > I've written a KIP to update the SinkTask abstract class to specify that
> > the `put` method will take ArrayList. I think this will greatly simplify
> > connector development, so you aren't limited to the simplest iterations.
> It
> > will also harden the ordering guarantees of the API.
> >
> > Looking forward to the feedback!
> >
> >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-502%3A+Connect+SinkTask.put%28...%29+to+specify+ArrayList%3CSinkRecord%3E+in+Signature
> >
> > Cyrus
> >
>

Re: [DISCUSS] KIP-502: Connect SinkTask.put(...) to specify ArrayList in Signature

Posted by Almog Gavra <al...@confluent.io>.
Thanks Cyrus! I think this change is a good step in hardening the API. I do
believe that APIs should be defined by functionality and not performance
characteristics, so I'd prefer using List<> over ArrayList<> (the
alternative you mention in rejected). That also gives us leeway in the
future to swap implementations without breaking compatibility (e.g. we want
to pass an ImmutableList or Collections.unmodifiableList instead of
ArrayList or discover some unknown more efficient implementation than
ArrayList).

Almog

On Sat, Aug 3, 2019 at 7:01 PM Cyrus Vafadari <cy...@confluent.io> wrote:

> Hi all,
>
> I've written a KIP to update the SinkTask abstract class to specify that
> the `put` method will take ArrayList. I think this will greatly simplify
> connector development, so you aren't limited to the simplest iterations. It
> will also harden the ordering guarantees of the API.
>
> Looking forward to the feedback!
>
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-502%3A+Connect+SinkTask.put%28...%29+to+specify+ArrayList%3CSinkRecord%3E+in+Signature
>
> Cyrus
>