You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@kafka.apache.org by Ewen Cheslack-Postava <ew...@confluent.io> on 2017/02/25 22:25:15 UTC
[DISCUSS] KIP-128: Add ByteArrayConverter for Kafka Connect
Hi all,
I've added a pretty trivial KIP for adding a pass-through Converter for
Kafka Connect, similar to ByteArraySerializer/Deserializer.
https://cwiki.apache.org/confluence/display/KAFKA/KIP-128%3A+Add+ByteArrayConverter+for+Kafka+Connect
This wasn't added with the framework originally because the idea was to
deal with structured data for the most part. However, we've seen a couple
of use cases arise as the framework got more traction and I think it makes
sense to provide this out of the box now so people stop reinventing the
wheel (and using a different fully-qualified class name) for each connector
that needs this functionality.
I imagine this will be a rather uncontroversial addition, so if I don't see
any comments in the next day or two I'll just start the vote thread.
-Ewen
Re: [DISCUSS] KIP-128: Add ByteArrayConverter for Kafka Connect
Posted by Guozhang Wang <wa...@gmail.com>.
Generally I'd prefer not duplicating functional logic as sometimes you may
miss to sync one of them when you change the other. I understand for this
specific case such scenario may never happen as the logic is quite simple
and static, but still sounds a good coding practice to me?
Guozhang
On Tue, Feb 28, 2017 at 4:31 PM, Ewen Cheslack-Postava <ew...@confluent.io>
wrote:
> Guozhang,
>
> Did you look at the PR? I'm fine with doing that if we really think it's
> better, but since this is a config-less passthrough, it's actually just
> adding overhead to do that...
>
> -Ewen
>
> On Mon, Feb 27, 2017 at 11:47 AM, Guozhang Wang <wa...@gmail.com>
> wrote:
>
> > Thanks Ewen,
> >
> > "use the corresponding serializer internally and just add in the extra
> > conversion
> > steps for the data API" sounds good to me.
> >
> > Guozhang
> >
> >
> > On Mon, Feb 27, 2017 at 8:24 AM, Ewen Cheslack-Postava <
> ewen@confluent.io>
> > wrote:
> >
> > > It's a different interface that's being implemented. The functionality
> is
> > > the same (since it's just a simple pass through), but we intentionally
> > > named Converters differently than Serializers since they do more work
> > than
> > > Serializers (besides the normal serialization they also need to convert
> > > between <serialization format> and the Connect Data API.
> > >
> > > We could certainly reuse/extend that class instead, though I'm not sure
> > > there's much benefit in that and since they implement different
> > interfaces
> > > and this is Connect-specific, it will probably be clearer to have it
> > under
> > > a Connect package. Note that for other Converters the pattern we've
> used
> > is
> > > to use the corresponding serializer internally and just add in the
> extra
> > > conversion steps for the data API.
> > >
> > > -Ewen
> > >
> > > On Sat, Feb 25, 2017 at 6:52 PM, Guozhang Wang <wa...@gmail.com>
> > wrote:
> > >
> > > > I'm wondering why we can't just use ByteArarySerde in o.a.k.common?
> > > >
> > > > Guozhang
> > > >
> > > > On Sat, Feb 25, 2017 at 2:25 PM, Ewen Cheslack-Postava <
> > > ewen@confluent.io>
> > > > wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > I've added a pretty trivial KIP for adding a pass-through Converter
> > for
> > > > > Kafka Connect, similar to ByteArraySerializer/Deserializer.
> > > > >
> > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > > > 128%3A+Add+ByteArrayConverter+for+Kafka+Connect
> > > > >
> > > > > This wasn't added with the framework originally because the idea
> was
> > to
> > > > > deal with structured data for the most part. However, we've seen a
> > > couple
> > > > > of use cases arise as the framework got more traction and I think
> it
> > > > makes
> > > > > sense to provide this out of the box now so people stop reinventing
> > the
> > > > > wheel (and using a different fully-qualified class name) for each
> > > > connector
> > > > > that needs this functionality.
> > > > >
> > > > > I imagine this will be a rather uncontroversial addition, so if I
> > don't
> > > > see
> > > > > any comments in the next day or two I'll just start the vote
> thread.
> > > > >
> > > > > -Ewen
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > -- Guozhang
> > > >
> > >
> >
> >
> >
> > --
> > -- Guozhang
> >
>
--
-- Guozhang
Re: [DISCUSS] KIP-128: Add ByteArrayConverter for Kafka Connect
Posted by Ewen Cheslack-Postava <ew...@confluent.io>.
Guozhang,
Did you look at the PR? I'm fine with doing that if we really think it's
better, but since this is a config-less passthrough, it's actually just
adding overhead to do that...
-Ewen
On Mon, Feb 27, 2017 at 11:47 AM, Guozhang Wang <wa...@gmail.com> wrote:
> Thanks Ewen,
>
> "use the corresponding serializer internally and just add in the extra
> conversion
> steps for the data API" sounds good to me.
>
> Guozhang
>
>
> On Mon, Feb 27, 2017 at 8:24 AM, Ewen Cheslack-Postava <ew...@confluent.io>
> wrote:
>
> > It's a different interface that's being implemented. The functionality is
> > the same (since it's just a simple pass through), but we intentionally
> > named Converters differently than Serializers since they do more work
> than
> > Serializers (besides the normal serialization they also need to convert
> > between <serialization format> and the Connect Data API.
> >
> > We could certainly reuse/extend that class instead, though I'm not sure
> > there's much benefit in that and since they implement different
> interfaces
> > and this is Connect-specific, it will probably be clearer to have it
> under
> > a Connect package. Note that for other Converters the pattern we've used
> is
> > to use the corresponding serializer internally and just add in the extra
> > conversion steps for the data API.
> >
> > -Ewen
> >
> > On Sat, Feb 25, 2017 at 6:52 PM, Guozhang Wang <wa...@gmail.com>
> wrote:
> >
> > > I'm wondering why we can't just use ByteArarySerde in o.a.k.common?
> > >
> > > Guozhang
> > >
> > > On Sat, Feb 25, 2017 at 2:25 PM, Ewen Cheslack-Postava <
> > ewen@confluent.io>
> > > wrote:
> > >
> > > > Hi all,
> > > >
> > > > I've added a pretty trivial KIP for adding a pass-through Converter
> for
> > > > Kafka Connect, similar to ByteArraySerializer/Deserializer.
> > > >
> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > > 128%3A+Add+ByteArrayConverter+for+Kafka+Connect
> > > >
> > > > This wasn't added with the framework originally because the idea was
> to
> > > > deal with structured data for the most part. However, we've seen a
> > couple
> > > > of use cases arise as the framework got more traction and I think it
> > > makes
> > > > sense to provide this out of the box now so people stop reinventing
> the
> > > > wheel (and using a different fully-qualified class name) for each
> > > connector
> > > > that needs this functionality.
> > > >
> > > > I imagine this will be a rather uncontroversial addition, so if I
> don't
> > > see
> > > > any comments in the next day or two I'll just start the vote thread.
> > > >
> > > > -Ewen
> > > >
> > >
> > >
> > >
> > > --
> > > -- Guozhang
> > >
> >
>
>
>
> --
> -- Guozhang
>
Re: [DISCUSS] KIP-128: Add ByteArrayConverter for Kafka Connect
Posted by Guozhang Wang <wa...@gmail.com>.
Thanks Ewen,
"use the corresponding serializer internally and just add in the extra
conversion
steps for the data API" sounds good to me.
Guozhang
On Mon, Feb 27, 2017 at 8:24 AM, Ewen Cheslack-Postava <ew...@confluent.io>
wrote:
> It's a different interface that's being implemented. The functionality is
> the same (since it's just a simple pass through), but we intentionally
> named Converters differently than Serializers since they do more work than
> Serializers (besides the normal serialization they also need to convert
> between <serialization format> and the Connect Data API.
>
> We could certainly reuse/extend that class instead, though I'm not sure
> there's much benefit in that and since they implement different interfaces
> and this is Connect-specific, it will probably be clearer to have it under
> a Connect package. Note that for other Converters the pattern we've used is
> to use the corresponding serializer internally and just add in the extra
> conversion steps for the data API.
>
> -Ewen
>
> On Sat, Feb 25, 2017 at 6:52 PM, Guozhang Wang <wa...@gmail.com> wrote:
>
> > I'm wondering why we can't just use ByteArarySerde in o.a.k.common?
> >
> > Guozhang
> >
> > On Sat, Feb 25, 2017 at 2:25 PM, Ewen Cheslack-Postava <
> ewen@confluent.io>
> > wrote:
> >
> > > Hi all,
> > >
> > > I've added a pretty trivial KIP for adding a pass-through Converter for
> > > Kafka Connect, similar to ByteArraySerializer/Deserializer.
> > >
> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > 128%3A+Add+ByteArrayConverter+for+Kafka+Connect
> > >
> > > This wasn't added with the framework originally because the idea was to
> > > deal with structured data for the most part. However, we've seen a
> couple
> > > of use cases arise as the framework got more traction and I think it
> > makes
> > > sense to provide this out of the box now so people stop reinventing the
> > > wheel (and using a different fully-qualified class name) for each
> > connector
> > > that needs this functionality.
> > >
> > > I imagine this will be a rather uncontroversial addition, so if I don't
> > see
> > > any comments in the next day or two I'll just start the vote thread.
> > >
> > > -Ewen
> > >
> >
> >
> >
> > --
> > -- Guozhang
> >
>
--
-- Guozhang
Re: [DISCUSS] KIP-128: Add ByteArrayConverter for Kafka Connect
Posted by Ewen Cheslack-Postava <ew...@confluent.io>.
It's a different interface that's being implemented. The functionality is
the same (since it's just a simple pass through), but we intentionally
named Converters differently than Serializers since they do more work than
Serializers (besides the normal serialization they also need to convert
between <serialization format> and the Connect Data API.
We could certainly reuse/extend that class instead, though I'm not sure
there's much benefit in that and since they implement different interfaces
and this is Connect-specific, it will probably be clearer to have it under
a Connect package. Note that for other Converters the pattern we've used is
to use the corresponding serializer internally and just add in the extra
conversion steps for the data API.
-Ewen
On Sat, Feb 25, 2017 at 6:52 PM, Guozhang Wang <wa...@gmail.com> wrote:
> I'm wondering why we can't just use ByteArarySerde in o.a.k.common?
>
> Guozhang
>
> On Sat, Feb 25, 2017 at 2:25 PM, Ewen Cheslack-Postava <ew...@confluent.io>
> wrote:
>
> > Hi all,
> >
> > I've added a pretty trivial KIP for adding a pass-through Converter for
> > Kafka Connect, similar to ByteArraySerializer/Deserializer.
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > 128%3A+Add+ByteArrayConverter+for+Kafka+Connect
> >
> > This wasn't added with the framework originally because the idea was to
> > deal with structured data for the most part. However, we've seen a couple
> > of use cases arise as the framework got more traction and I think it
> makes
> > sense to provide this out of the box now so people stop reinventing the
> > wheel (and using a different fully-qualified class name) for each
> connector
> > that needs this functionality.
> >
> > I imagine this will be a rather uncontroversial addition, so if I don't
> see
> > any comments in the next day or two I'll just start the vote thread.
> >
> > -Ewen
> >
>
>
>
> --
> -- Guozhang
>
Re: [DISCUSS] KIP-128: Add ByteArrayConverter for Kafka Connect
Posted by Guozhang Wang <wa...@gmail.com>.
I'm wondering why we can't just use ByteArarySerde in o.a.k.common?
Guozhang
On Sat, Feb 25, 2017 at 2:25 PM, Ewen Cheslack-Postava <ew...@confluent.io>
wrote:
> Hi all,
>
> I've added a pretty trivial KIP for adding a pass-through Converter for
> Kafka Connect, similar to ByteArraySerializer/Deserializer.
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> 128%3A+Add+ByteArrayConverter+for+Kafka+Connect
>
> This wasn't added with the framework originally because the idea was to
> deal with structured data for the most part. However, we've seen a couple
> of use cases arise as the framework got more traction and I think it makes
> sense to provide this out of the box now so people stop reinventing the
> wheel (and using a different fully-qualified class name) for each connector
> that needs this functionality.
>
> I imagine this will be a rather uncontroversial addition, so if I don't see
> any comments in the next day or two I'll just start the vote thread.
>
> -Ewen
>
--
-- Guozhang
Re: [DISCUSS] KIP-128: Add ByteArrayConverter for Kafka Connect
Posted by Guozhang Wang <wa...@gmail.com>.
Sounds good. Thanks.
On Wed, Mar 1, 2017 at 8:21 PM, Gwen Shapira <gw...@confluent.io> wrote:
> Yeah, you are right, it is best not to make converters actively break the
> data structures :)
>
> On Wed, Mar 1, 2017 at 4:55 PM, Ewen Cheslack-Postava <ew...@confluent.io>
> wrote:
>
> > Guozhang,
> >
> > I'm fine w/ adjusting if people want to, but it ends up being more code
> > since we also need to convert SerializationExceptions to DataExceptions
> and
> > the only thing the toConnectData method even does is specific to Connect
> > (adding the SchemaAndValue).
> >
> > Gwen -- isn't that an SMT? ExtractField?
> >
> > -Ewen
> >
> > On Wed, Mar 1, 2017 at 1:58 PM, Gwen Shapira <gw...@confluent.io> wrote:
> >
> > > Hi Ewen,
> > >
> > > Thanks for the KIP, I think it will be useful :)
> > >
> > > I'm just wondering if we can add support not just for bytes schema,
> > > but also for a struct that contains bytes? I'm thinking of the
> > > scenario of using a connector to grab BLOBs out of a DB - I think you
> > > end up with this structure if you use a JDBC connector and custom
> > > query...
> > >
> > > Maybe even support Maps with generic objects using Java's default
> > > serialization? I'm not sure if this is useful.
> > >
> > > Gwen
> > >
> > >
> > >
> > > On Sat, Feb 25, 2017 at 2:25 PM, Ewen Cheslack-Postava
> > > <ew...@confluent.io> wrote:
> > > > Hi all,
> > > >
> > > > I've added a pretty trivial KIP for adding a pass-through Converter
> for
> > > > Kafka Connect, similar to ByteArraySerializer/Deserializer.
> > > >
> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > 128%3A+Add+ByteArrayConverter+for+Kafka+Connect
> > > >
> > > > This wasn't added with the framework originally because the idea was
> to
> > > > deal with structured data for the most part. However, we've seen a
> > couple
> > > > of use cases arise as the framework got more traction and I think it
> > > makes
> > > > sense to provide this out of the box now so people stop reinventing
> the
> > > > wheel (and using a different fully-qualified class name) for each
> > > connector
> > > > that needs this functionality.
> > > >
> > > > I imagine this will be a rather uncontroversial addition, so if I
> don't
> > > see
> > > > any comments in the next day or two I'll just start the vote thread.
> > > >
> > > > -Ewen
> > >
> > >
> > >
> > > --
> > > Gwen Shapira
> > > Product Manager | Confluent
> > > 650.450.2760 | @gwenshap
> > > Follow us: Twitter | blog
> > >
> >
>
>
>
> --
> *Gwen Shapira*
> Product Manager | Confluent
> 650.450.2760 | @gwenshap
> Follow us: Twitter <https://twitter.com/ConfluentInc> | blog
> <http://www.confluent.io/blog>
>
--
-- Guozhang
Re: [DISCUSS] KIP-128: Add ByteArrayConverter for Kafka Connect
Posted by Gwen Shapira <gw...@confluent.io>.
Yeah, you are right, it is best not to make converters actively break the
data structures :)
On Wed, Mar 1, 2017 at 4:55 PM, Ewen Cheslack-Postava <ew...@confluent.io>
wrote:
> Guozhang,
>
> I'm fine w/ adjusting if people want to, but it ends up being more code
> since we also need to convert SerializationExceptions to DataExceptions and
> the only thing the toConnectData method even does is specific to Connect
> (adding the SchemaAndValue).
>
> Gwen -- isn't that an SMT? ExtractField?
>
> -Ewen
>
> On Wed, Mar 1, 2017 at 1:58 PM, Gwen Shapira <gw...@confluent.io> wrote:
>
> > Hi Ewen,
> >
> > Thanks for the KIP, I think it will be useful :)
> >
> > I'm just wondering if we can add support not just for bytes schema,
> > but also for a struct that contains bytes? I'm thinking of the
> > scenario of using a connector to grab BLOBs out of a DB - I think you
> > end up with this structure if you use a JDBC connector and custom
> > query...
> >
> > Maybe even support Maps with generic objects using Java's default
> > serialization? I'm not sure if this is useful.
> >
> > Gwen
> >
> >
> >
> > On Sat, Feb 25, 2017 at 2:25 PM, Ewen Cheslack-Postava
> > <ew...@confluent.io> wrote:
> > > Hi all,
> > >
> > > I've added a pretty trivial KIP for adding a pass-through Converter for
> > > Kafka Connect, similar to ByteArraySerializer/Deserializer.
> > >
> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > 128%3A+Add+ByteArrayConverter+for+Kafka+Connect
> > >
> > > This wasn't added with the framework originally because the idea was to
> > > deal with structured data for the most part. However, we've seen a
> couple
> > > of use cases arise as the framework got more traction and I think it
> > makes
> > > sense to provide this out of the box now so people stop reinventing the
> > > wheel (and using a different fully-qualified class name) for each
> > connector
> > > that needs this functionality.
> > >
> > > I imagine this will be a rather uncontroversial addition, so if I don't
> > see
> > > any comments in the next day or two I'll just start the vote thread.
> > >
> > > -Ewen
> >
> >
> >
> > --
> > Gwen Shapira
> > Product Manager | Confluent
> > 650.450.2760 | @gwenshap
> > Follow us: Twitter | blog
> >
>
--
*Gwen Shapira*
Product Manager | Confluent
650.450.2760 | @gwenshap
Follow us: Twitter <https://twitter.com/ConfluentInc> | blog
<http://www.confluent.io/blog>
Re: [DISCUSS] KIP-128: Add ByteArrayConverter for Kafka Connect
Posted by Ewen Cheslack-Postava <ew...@confluent.io>.
Guozhang,
I'm fine w/ adjusting if people want to, but it ends up being more code
since we also need to convert SerializationExceptions to DataExceptions and
the only thing the toConnectData method even does is specific to Connect
(adding the SchemaAndValue).
Gwen -- isn't that an SMT? ExtractField?
-Ewen
On Wed, Mar 1, 2017 at 1:58 PM, Gwen Shapira <gw...@confluent.io> wrote:
> Hi Ewen,
>
> Thanks for the KIP, I think it will be useful :)
>
> I'm just wondering if we can add support not just for bytes schema,
> but also for a struct that contains bytes? I'm thinking of the
> scenario of using a connector to grab BLOBs out of a DB - I think you
> end up with this structure if you use a JDBC connector and custom
> query...
>
> Maybe even support Maps with generic objects using Java's default
> serialization? I'm not sure if this is useful.
>
> Gwen
>
>
>
> On Sat, Feb 25, 2017 at 2:25 PM, Ewen Cheslack-Postava
> <ew...@confluent.io> wrote:
> > Hi all,
> >
> > I've added a pretty trivial KIP for adding a pass-through Converter for
> > Kafka Connect, similar to ByteArraySerializer/Deserializer.
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> 128%3A+Add+ByteArrayConverter+for+Kafka+Connect
> >
> > This wasn't added with the framework originally because the idea was to
> > deal with structured data for the most part. However, we've seen a couple
> > of use cases arise as the framework got more traction and I think it
> makes
> > sense to provide this out of the box now so people stop reinventing the
> > wheel (and using a different fully-qualified class name) for each
> connector
> > that needs this functionality.
> >
> > I imagine this will be a rather uncontroversial addition, so if I don't
> see
> > any comments in the next day or two I'll just start the vote thread.
> >
> > -Ewen
>
>
>
> --
> Gwen Shapira
> Product Manager | Confluent
> 650.450.2760 | @gwenshap
> Follow us: Twitter | blog
>
Re: [DISCUSS] KIP-128: Add ByteArrayConverter for Kafka Connect
Posted by Gwen Shapira <gw...@confluent.io>.
Hi Ewen,
Thanks for the KIP, I think it will be useful :)
I'm just wondering if we can add support not just for bytes schema,
but also for a struct that contains bytes? I'm thinking of the
scenario of using a connector to grab BLOBs out of a DB - I think you
end up with this structure if you use a JDBC connector and custom
query...
Maybe even support Maps with generic objects using Java's default
serialization? I'm not sure if this is useful.
Gwen
On Sat, Feb 25, 2017 at 2:25 PM, Ewen Cheslack-Postava
<ew...@confluent.io> wrote:
> Hi all,
>
> I've added a pretty trivial KIP for adding a pass-through Converter for
> Kafka Connect, similar to ByteArraySerializer/Deserializer.
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-128%3A+Add+ByteArrayConverter+for+Kafka+Connect
>
> This wasn't added with the framework originally because the idea was to
> deal with structured data for the most part. However, we've seen a couple
> of use cases arise as the framework got more traction and I think it makes
> sense to provide this out of the box now so people stop reinventing the
> wheel (and using a different fully-qualified class name) for each connector
> that needs this functionality.
>
> I imagine this will be a rather uncontroversial addition, so if I don't see
> any comments in the next day or two I'll just start the vote thread.
>
> -Ewen
--
Gwen Shapira
Product Manager | Confluent
650.450.2760 | @gwenshap
Follow us: Twitter | blog