You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@kafka.apache.org by Ewen Cheslack-Postava <ew...@confluent.io> on 2017/02/25 22:25:15 UTC

[DISCUSS] KIP-128: Add ByteArrayConverter for Kafka Connect

Hi all,

I've added a pretty trivial KIP for adding a pass-through Converter for
Kafka Connect, similar to ByteArraySerializer/Deserializer.

https://cwiki.apache.org/confluence/display/KAFKA/KIP-128%3A+Add+ByteArrayConverter+for+Kafka+Connect

This wasn't added with the framework originally because the idea was to
deal with structured data for the most part. However, we've seen a couple
of use cases arise as the framework got more traction and I think it makes
sense to provide this out of the box now so people stop reinventing the
wheel (and using a different fully-qualified class name) for each connector
that needs this functionality.

I imagine this will be a rather uncontroversial addition, so if I don't see
any comments in the next day or two I'll just start the vote thread.

-Ewen

Re: [DISCUSS] KIP-128: Add ByteArrayConverter for Kafka Connect

Posted by Guozhang Wang <wa...@gmail.com>.
Generally I'd prefer not duplicating functional logic as sometimes you may
miss to sync one of them when you change the other. I understand for this
specific case such scenario may never happen as the logic is quite simple
and static, but still sounds a good coding practice to me?


Guozhang


On Tue, Feb 28, 2017 at 4:31 PM, Ewen Cheslack-Postava <ew...@confluent.io>
wrote:

> Guozhang,
>
> Did you look at the PR? I'm fine with doing that if we really think it's
> better, but since this is a config-less passthrough, it's actually just
> adding overhead to do that...
>
> -Ewen
>
> On Mon, Feb 27, 2017 at 11:47 AM, Guozhang Wang <wa...@gmail.com>
> wrote:
>
> > Thanks Ewen,
> >
> > "use the corresponding serializer internally and just add in the extra
> > conversion
> > steps for the data API" sounds good to me.
> >
> > Guozhang
> >
> >
> > On Mon, Feb 27, 2017 at 8:24 AM, Ewen Cheslack-Postava <
> ewen@confluent.io>
> > wrote:
> >
> > > It's a different interface that's being implemented. The functionality
> is
> > > the same (since it's just a simple pass through), but we intentionally
> > > named Converters differently than Serializers since they do more work
> > than
> > > Serializers (besides the normal serialization they also need to convert
> > > between <serialization format> and the Connect Data API.
> > >
> > > We could certainly reuse/extend that class instead, though I'm not sure
> > > there's much benefit in that and since they implement different
> > interfaces
> > > and this is Connect-specific, it will probably be clearer to have it
> > under
> > > a Connect package. Note that for other Converters the pattern we've
> used
> > is
> > > to use the corresponding serializer internally and just add in the
> extra
> > > conversion steps for the data API.
> > >
> > > -Ewen
> > >
> > > On Sat, Feb 25, 2017 at 6:52 PM, Guozhang Wang <wa...@gmail.com>
> > wrote:
> > >
> > > > I'm wondering why we can't just use ByteArarySerde in o.a.k.common?
> > > >
> > > > Guozhang
> > > >
> > > > On Sat, Feb 25, 2017 at 2:25 PM, Ewen Cheslack-Postava <
> > > ewen@confluent.io>
> > > > wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > I've added a pretty trivial KIP for adding a pass-through Converter
> > for
> > > > > Kafka Connect, similar to ByteArraySerializer/Deserializer.
> > > > >
> > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > > > 128%3A+Add+ByteArrayConverter+for+Kafka+Connect
> > > > >
> > > > > This wasn't added with the framework originally because the idea
> was
> > to
> > > > > deal with structured data for the most part. However, we've seen a
> > > couple
> > > > > of use cases arise as the framework got more traction and I think
> it
> > > > makes
> > > > > sense to provide this out of the box now so people stop reinventing
> > the
> > > > > wheel (and using a different fully-qualified class name) for each
> > > > connector
> > > > > that needs this functionality.
> > > > >
> > > > > I imagine this will be a rather uncontroversial addition, so if I
> > don't
> > > > see
> > > > > any comments in the next day or two I'll just start the vote
> thread.
> > > > >
> > > > > -Ewen
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > -- Guozhang
> > > >
> > >
> >
> >
> >
> > --
> > -- Guozhang
> >
>



-- 
-- Guozhang

Re: [DISCUSS] KIP-128: Add ByteArrayConverter for Kafka Connect

Posted by Ewen Cheslack-Postava <ew...@confluent.io>.
Guozhang,

Did you look at the PR? I'm fine with doing that if we really think it's
better, but since this is a config-less passthrough, it's actually just
adding overhead to do that...

-Ewen

On Mon, Feb 27, 2017 at 11:47 AM, Guozhang Wang <wa...@gmail.com> wrote:

> Thanks Ewen,
>
> "use the corresponding serializer internally and just add in the extra
> conversion
> steps for the data API" sounds good to me.
>
> Guozhang
>
>
> On Mon, Feb 27, 2017 at 8:24 AM, Ewen Cheslack-Postava <ew...@confluent.io>
> wrote:
>
> > It's a different interface that's being implemented. The functionality is
> > the same (since it's just a simple pass through), but we intentionally
> > named Converters differently than Serializers since they do more work
> than
> > Serializers (besides the normal serialization they also need to convert
> > between <serialization format> and the Connect Data API.
> >
> > We could certainly reuse/extend that class instead, though I'm not sure
> > there's much benefit in that and since they implement different
> interfaces
> > and this is Connect-specific, it will probably be clearer to have it
> under
> > a Connect package. Note that for other Converters the pattern we've used
> is
> > to use the corresponding serializer internally and just add in the extra
> > conversion steps for the data API.
> >
> > -Ewen
> >
> > On Sat, Feb 25, 2017 at 6:52 PM, Guozhang Wang <wa...@gmail.com>
> wrote:
> >
> > > I'm wondering why we can't just use ByteArarySerde in o.a.k.common?
> > >
> > > Guozhang
> > >
> > > On Sat, Feb 25, 2017 at 2:25 PM, Ewen Cheslack-Postava <
> > ewen@confluent.io>
> > > wrote:
> > >
> > > > Hi all,
> > > >
> > > > I've added a pretty trivial KIP for adding a pass-through Converter
> for
> > > > Kafka Connect, similar to ByteArraySerializer/Deserializer.
> > > >
> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > > 128%3A+Add+ByteArrayConverter+for+Kafka+Connect
> > > >
> > > > This wasn't added with the framework originally because the idea was
> to
> > > > deal with structured data for the most part. However, we've seen a
> > couple
> > > > of use cases arise as the framework got more traction and I think it
> > > makes
> > > > sense to provide this out of the box now so people stop reinventing
> the
> > > > wheel (and using a different fully-qualified class name) for each
> > > connector
> > > > that needs this functionality.
> > > >
> > > > I imagine this will be a rather uncontroversial addition, so if I
> don't
> > > see
> > > > any comments in the next day or two I'll just start the vote thread.
> > > >
> > > > -Ewen
> > > >
> > >
> > >
> > >
> > > --
> > > -- Guozhang
> > >
> >
>
>
>
> --
> -- Guozhang
>

Re: [DISCUSS] KIP-128: Add ByteArrayConverter for Kafka Connect

Posted by Guozhang Wang <wa...@gmail.com>.
Thanks Ewen,

"use the corresponding serializer internally and just add in the extra
conversion
steps for the data API" sounds good to me.

Guozhang


On Mon, Feb 27, 2017 at 8:24 AM, Ewen Cheslack-Postava <ew...@confluent.io>
wrote:

> It's a different interface that's being implemented. The functionality is
> the same (since it's just a simple pass through), but we intentionally
> named Converters differently than Serializers since they do more work than
> Serializers (besides the normal serialization they also need to convert
> between <serialization format> and the Connect Data API.
>
> We could certainly reuse/extend that class instead, though I'm not sure
> there's much benefit in that and since they implement different interfaces
> and this is Connect-specific, it will probably be clearer to have it under
> a Connect package. Note that for other Converters the pattern we've used is
> to use the corresponding serializer internally and just add in the extra
> conversion steps for the data API.
>
> -Ewen
>
> On Sat, Feb 25, 2017 at 6:52 PM, Guozhang Wang <wa...@gmail.com> wrote:
>
> > I'm wondering why we can't just use ByteArarySerde in o.a.k.common?
> >
> > Guozhang
> >
> > On Sat, Feb 25, 2017 at 2:25 PM, Ewen Cheslack-Postava <
> ewen@confluent.io>
> > wrote:
> >
> > > Hi all,
> > >
> > > I've added a pretty trivial KIP for adding a pass-through Converter for
> > > Kafka Connect, similar to ByteArraySerializer/Deserializer.
> > >
> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > 128%3A+Add+ByteArrayConverter+for+Kafka+Connect
> > >
> > > This wasn't added with the framework originally because the idea was to
> > > deal with structured data for the most part. However, we've seen a
> couple
> > > of use cases arise as the framework got more traction and I think it
> > makes
> > > sense to provide this out of the box now so people stop reinventing the
> > > wheel (and using a different fully-qualified class name) for each
> > connector
> > > that needs this functionality.
> > >
> > > I imagine this will be a rather uncontroversial addition, so if I don't
> > see
> > > any comments in the next day or two I'll just start the vote thread.
> > >
> > > -Ewen
> > >
> >
> >
> >
> > --
> > -- Guozhang
> >
>



-- 
-- Guozhang

Re: [DISCUSS] KIP-128: Add ByteArrayConverter for Kafka Connect

Posted by Ewen Cheslack-Postava <ew...@confluent.io>.
It's a different interface that's being implemented. The functionality is
the same (since it's just a simple pass through), but we intentionally
named Converters differently than Serializers since they do more work than
Serializers (besides the normal serialization they also need to convert
between <serialization format> and the Connect Data API.

We could certainly reuse/extend that class instead, though I'm not sure
there's much benefit in that and since they implement different interfaces
and this is Connect-specific, it will probably be clearer to have it under
a Connect package. Note that for other Converters the pattern we've used is
to use the corresponding serializer internally and just add in the extra
conversion steps for the data API.

-Ewen

On Sat, Feb 25, 2017 at 6:52 PM, Guozhang Wang <wa...@gmail.com> wrote:

> I'm wondering why we can't just use ByteArarySerde in o.a.k.common?
>
> Guozhang
>
> On Sat, Feb 25, 2017 at 2:25 PM, Ewen Cheslack-Postava <ew...@confluent.io>
> wrote:
>
> > Hi all,
> >
> > I've added a pretty trivial KIP for adding a pass-through Converter for
> > Kafka Connect, similar to ByteArraySerializer/Deserializer.
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > 128%3A+Add+ByteArrayConverter+for+Kafka+Connect
> >
> > This wasn't added with the framework originally because the idea was to
> > deal with structured data for the most part. However, we've seen a couple
> > of use cases arise as the framework got more traction and I think it
> makes
> > sense to provide this out of the box now so people stop reinventing the
> > wheel (and using a different fully-qualified class name) for each
> connector
> > that needs this functionality.
> >
> > I imagine this will be a rather uncontroversial addition, so if I don't
> see
> > any comments in the next day or two I'll just start the vote thread.
> >
> > -Ewen
> >
>
>
>
> --
> -- Guozhang
>

Re: [DISCUSS] KIP-128: Add ByteArrayConverter for Kafka Connect

Posted by Guozhang Wang <wa...@gmail.com>.
I'm wondering why we can't just use ByteArarySerde in o.a.k.common?

Guozhang

On Sat, Feb 25, 2017 at 2:25 PM, Ewen Cheslack-Postava <ew...@confluent.io>
wrote:

> Hi all,
>
> I've added a pretty trivial KIP for adding a pass-through Converter for
> Kafka Connect, similar to ByteArraySerializer/Deserializer.
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> 128%3A+Add+ByteArrayConverter+for+Kafka+Connect
>
> This wasn't added with the framework originally because the idea was to
> deal with structured data for the most part. However, we've seen a couple
> of use cases arise as the framework got more traction and I think it makes
> sense to provide this out of the box now so people stop reinventing the
> wheel (and using a different fully-qualified class name) for each connector
> that needs this functionality.
>
> I imagine this will be a rather uncontroversial addition, so if I don't see
> any comments in the next day or two I'll just start the vote thread.
>
> -Ewen
>



-- 
-- Guozhang

Re: [DISCUSS] KIP-128: Add ByteArrayConverter for Kafka Connect

Posted by Guozhang Wang <wa...@gmail.com>.
Sounds good. Thanks.

On Wed, Mar 1, 2017 at 8:21 PM, Gwen Shapira <gw...@confluent.io> wrote:

> Yeah, you are right, it is best not to make converters actively break the
> data structures :)
>
> On Wed, Mar 1, 2017 at 4:55 PM, Ewen Cheslack-Postava <ew...@confluent.io>
> wrote:
>
> > Guozhang,
> >
> > I'm fine w/ adjusting if people want to, but it ends up being more code
> > since we also need to convert SerializationExceptions to DataExceptions
> and
> > the only thing the toConnectData method even does is specific to Connect
> > (adding the SchemaAndValue).
> >
> > Gwen -- isn't that an SMT? ExtractField?
> >
> > -Ewen
> >
> > On Wed, Mar 1, 2017 at 1:58 PM, Gwen Shapira <gw...@confluent.io> wrote:
> >
> > > Hi Ewen,
> > >
> > > Thanks for the KIP, I think it will be useful :)
> > >
> > > I'm just wondering if we can add support not just for bytes schema,
> > > but also for a struct that contains bytes? I'm thinking of the
> > > scenario of using a connector to grab BLOBs out of a DB - I think you
> > > end up with this structure if you use a JDBC connector and custom
> > > query...
> > >
> > > Maybe even support Maps with generic objects using Java's default
> > > serialization? I'm not sure if this is useful.
> > >
> > > Gwen
> > >
> > >
> > >
> > > On Sat, Feb 25, 2017 at 2:25 PM, Ewen Cheslack-Postava
> > > <ew...@confluent.io> wrote:
> > > > Hi all,
> > > >
> > > > I've added a pretty trivial KIP for adding a pass-through Converter
> for
> > > > Kafka Connect, similar to ByteArraySerializer/Deserializer.
> > > >
> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > 128%3A+Add+ByteArrayConverter+for+Kafka+Connect
> > > >
> > > > This wasn't added with the framework originally because the idea was
> to
> > > > deal with structured data for the most part. However, we've seen a
> > couple
> > > > of use cases arise as the framework got more traction and I think it
> > > makes
> > > > sense to provide this out of the box now so people stop reinventing
> the
> > > > wheel (and using a different fully-qualified class name) for each
> > > connector
> > > > that needs this functionality.
> > > >
> > > > I imagine this will be a rather uncontroversial addition, so if I
> don't
> > > see
> > > > any comments in the next day or two I'll just start the vote thread.
> > > >
> > > > -Ewen
> > >
> > >
> > >
> > > --
> > > Gwen Shapira
> > > Product Manager | Confluent
> > > 650.450.2760 | @gwenshap
> > > Follow us: Twitter | blog
> > >
> >
>
>
>
> --
> *Gwen Shapira*
> Product Manager | Confluent
> 650.450.2760 | @gwenshap
> Follow us: Twitter <https://twitter.com/ConfluentInc> | blog
> <http://www.confluent.io/blog>
>



-- 
-- Guozhang

Re: [DISCUSS] KIP-128: Add ByteArrayConverter for Kafka Connect

Posted by Gwen Shapira <gw...@confluent.io>.
Yeah, you are right, it is best not to make converters actively break the
data structures :)

On Wed, Mar 1, 2017 at 4:55 PM, Ewen Cheslack-Postava <ew...@confluent.io>
wrote:

> Guozhang,
>
> I'm fine w/ adjusting if people want to, but it ends up being more code
> since we also need to convert SerializationExceptions to DataExceptions and
> the only thing the toConnectData method even does is specific to Connect
> (adding the SchemaAndValue).
>
> Gwen -- isn't that an SMT? ExtractField?
>
> -Ewen
>
> On Wed, Mar 1, 2017 at 1:58 PM, Gwen Shapira <gw...@confluent.io> wrote:
>
> > Hi Ewen,
> >
> > Thanks for the KIP, I think it will be useful :)
> >
> > I'm just wondering if we can add support not just for bytes schema,
> > but also for a struct that contains bytes? I'm thinking of the
> > scenario of using a connector to grab BLOBs out of a DB - I think you
> > end up with this structure if you use a JDBC connector and custom
> > query...
> >
> > Maybe even support Maps with generic objects using Java's default
> > serialization? I'm not sure if this is useful.
> >
> > Gwen
> >
> >
> >
> > On Sat, Feb 25, 2017 at 2:25 PM, Ewen Cheslack-Postava
> > <ew...@confluent.io> wrote:
> > > Hi all,
> > >
> > > I've added a pretty trivial KIP for adding a pass-through Converter for
> > > Kafka Connect, similar to ByteArraySerializer/Deserializer.
> > >
> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > 128%3A+Add+ByteArrayConverter+for+Kafka+Connect
> > >
> > > This wasn't added with the framework originally because the idea was to
> > > deal with structured data for the most part. However, we've seen a
> couple
> > > of use cases arise as the framework got more traction and I think it
> > makes
> > > sense to provide this out of the box now so people stop reinventing the
> > > wheel (and using a different fully-qualified class name) for each
> > connector
> > > that needs this functionality.
> > >
> > > I imagine this will be a rather uncontroversial addition, so if I don't
> > see
> > > any comments in the next day or two I'll just start the vote thread.
> > >
> > > -Ewen
> >
> >
> >
> > --
> > Gwen Shapira
> > Product Manager | Confluent
> > 650.450.2760 | @gwenshap
> > Follow us: Twitter | blog
> >
>



-- 
*Gwen Shapira*
Product Manager | Confluent
650.450.2760 | @gwenshap
Follow us: Twitter <https://twitter.com/ConfluentInc> | blog
<http://www.confluent.io/blog>

Re: [DISCUSS] KIP-128: Add ByteArrayConverter for Kafka Connect

Posted by Ewen Cheslack-Postava <ew...@confluent.io>.
Guozhang,

I'm fine w/ adjusting if people want to, but it ends up being more code
since we also need to convert SerializationExceptions to DataExceptions and
the only thing the toConnectData method even does is specific to Connect
(adding the SchemaAndValue).

Gwen -- isn't that an SMT? ExtractField?

-Ewen

On Wed, Mar 1, 2017 at 1:58 PM, Gwen Shapira <gw...@confluent.io> wrote:

> Hi Ewen,
>
> Thanks for the KIP, I think it will be useful :)
>
> I'm just wondering if we can add support not just for bytes schema,
> but also for a struct that contains bytes? I'm thinking of the
> scenario of using a connector to grab BLOBs out of a DB - I think you
> end up with this structure if you use a JDBC connector and custom
> query...
>
> Maybe even support Maps with generic objects using Java's default
> serialization? I'm not sure if this is useful.
>
> Gwen
>
>
>
> On Sat, Feb 25, 2017 at 2:25 PM, Ewen Cheslack-Postava
> <ew...@confluent.io> wrote:
> > Hi all,
> >
> > I've added a pretty trivial KIP for adding a pass-through Converter for
> > Kafka Connect, similar to ByteArraySerializer/Deserializer.
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> 128%3A+Add+ByteArrayConverter+for+Kafka+Connect
> >
> > This wasn't added with the framework originally because the idea was to
> > deal with structured data for the most part. However, we've seen a couple
> > of use cases arise as the framework got more traction and I think it
> makes
> > sense to provide this out of the box now so people stop reinventing the
> > wheel (and using a different fully-qualified class name) for each
> connector
> > that needs this functionality.
> >
> > I imagine this will be a rather uncontroversial addition, so if I don't
> see
> > any comments in the next day or two I'll just start the vote thread.
> >
> > -Ewen
>
>
>
> --
> Gwen Shapira
> Product Manager | Confluent
> 650.450.2760 | @gwenshap
> Follow us: Twitter | blog
>

Re: [DISCUSS] KIP-128: Add ByteArrayConverter for Kafka Connect

Posted by Gwen Shapira <gw...@confluent.io>.
Hi Ewen,

Thanks for the KIP, I think it will be useful :)

I'm just wondering if we can add support not just for bytes schema,
but also for a struct that contains bytes? I'm thinking of the
scenario of using a connector to grab BLOBs out of a DB - I think you
end up with this structure if you use a JDBC connector and custom
query...

Maybe even support Maps with generic objects using Java's default
serialization? I'm not sure if this is useful.

Gwen



On Sat, Feb 25, 2017 at 2:25 PM, Ewen Cheslack-Postava
<ew...@confluent.io> wrote:
> Hi all,
>
> I've added a pretty trivial KIP for adding a pass-through Converter for
> Kafka Connect, similar to ByteArraySerializer/Deserializer.
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-128%3A+Add+ByteArrayConverter+for+Kafka+Connect
>
> This wasn't added with the framework originally because the idea was to
> deal with structured data for the most part. However, we've seen a couple
> of use cases arise as the framework got more traction and I think it makes
> sense to provide this out of the box now so people stop reinventing the
> wheel (and using a different fully-qualified class name) for each connector
> that needs this functionality.
>
> I imagine this will be a rather uncontroversial addition, so if I don't see
> any comments in the next day or two I'll just start the vote thread.
>
> -Ewen



-- 
Gwen Shapira
Product Manager | Confluent
650.450.2760 | @gwenshap
Follow us: Twitter | blog