Posted to dev@kafka.apache.org by "brandon@bbrownsound.com" <br...@bbrownsound.com> on 2020/10/01 13:10:24 UTC

[VOTE] KIP-665 Kafka Connect Hash SMT

Hey Kafka Developers,

I’ve created the following KIP and updated it based on feedback from Mickael. I was wondering if we could get a vote on my proposal and move forward with the proposed PR.

Thanks so much!
-Brandon

Re: [VOTE] KIP-665 Kafka Connect Hash SMT

Posted by Brandon Brown <br...@bbrownsound.com>.
I’d like to give this one another friendly bump. If there are no disagreements, I can update my existing PR with the latest KIP changes.

Thanks,
-Brandon 

Brandon Brown

Re: [VOTE] KIP-665 Kafka Connect Hash SMT

Posted by Brandon Brown <br...@bbrownsound.com>.
I’ve updated the KIP with suggestions from Gunnar. I’d like to bring this up for a vote.

Brandon Brown

Re: [VOTE] KIP-665 Kafka Connect Hash SMT

Posted by Brandon Brown <br...@bbrownsound.com>.
Hey Gunnar,

Those are great questions!

1) I went with selecting only top-level fields since that seems to be how most of the out-of-the-box SMTs work; however, I could see a lot of value in supporting nested fields.
2) I had not thought about adding salt but I think that would be a valid option as well. 
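To make the salt option concrete, here is a minimal sketch of a salted digest built on MessageDigest. The class name, the method name, and the choice to prepend the salt to the value are assumptions made for illustration; the thread does not say how the KIP would combine salt and value.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper for illustration; not code from the KIP or the PR.
final class SaltedHashSketch {

    // Hashes salt followed by value and returns the digest as lowercase hex.
    // Prepending the salt is an assumption; other combinations are possible.
    static String saltedHash(String algorithm, String salt, String value) {
        try {
            MessageDigest digest = MessageDigest.getInstance(algorithm);
            digest.update(salt.getBytes(StandardCharsets.UTF_8));
            byte[] bytes = digest.digest(value.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(bytes.length * 2);
            for (byte b : bytes) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalArgumentException("Unsupported hash algorithm: " + algorithm, e);
        }
    }

    public static void main(String[] args) {
        // The same value hashed with two different salts yields two different digests.
        System.out.println(saltedHash("SHA-256", "salt-a", "user@example.com"));
        System.out.println(saltedHash("SHA-256", "salt-b", "user@example.com"));
    }
}
```

With a fixed salt the output stays deterministic, so hashed values can still be compared or joined downstream, while precomputed lookup tables for the unsalted algorithm no longer apply directly.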

I think I’ll update the KIP to reflect those suggestions. One more question: do you think this should allow a regex for fields, or stick with explicit naming of the fields?
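The two selection styles in that question can be compared with a small sketch. The class and method names are hypothetical, and requiring the regex to match the whole field name is an assumption, not something the KIP specifies.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper for illustration; not code from the KIP or the PR.
final class FieldSelectionSketch {

    // Explicit naming: only fields listed in the configuration are selected.
    static List<String> selectExplicit(List<String> fieldNames, Set<String> configured) {
        return fieldNames.stream()
                .filter(configured::contains)
                .collect(Collectors.toList());
    }

    // Regex selection: every field whose full name matches the pattern is selected.
    static List<String> selectByRegex(List<String> fieldNames, String regex) {
        Pattern pattern = Pattern.compile(regex);
        return fieldNames.stream()
                .filter(name -> pattern.matcher(name).matches())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> fields = Arrays.asList("id", "email", "backup_email", "name");
        System.out.println(selectExplicit(fields, new HashSet<>(Arrays.asList("email"))));
        System.out.println(selectByRegex(fields, ".*email"));
    }
}
```

Explicit naming makes it obvious which fields get hashed. A regex also picks up fields added to the schema later, which may or may not be what the operator intended.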

Thanks for the great feedback

Brandon Brown


Re: [VOTE] KIP-665 Kafka Connect Hash SMT

Posted by Gunnar Morling <gu...@googlemail.com.INVALID>.
Hey Brandon,

I think that's an interesting idea; we have something similar as a built-in
connector feature in Debezium, too [1]. Two questions:

* Can "field" select nested fields, e.g. "after.email"?
* Did you consider an option for specifying salt for the hash functions?
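As a rough illustration of the first question, a dotted path such as "after.email" could be resolved against a nested record along these lines. This sketch assumes a schemaless value modelled as nested Maps and a hypothetical resolve helper; it is not the KIP's or Debezium's implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration; not code from the KIP, the PR, or Debezium.
final class NestedFieldSketch {

    // Walks a dotted path through nested maps.
    // Returns null if any segment is missing or is not itself a map.
    static Object resolve(Map<String, Object> record, String path) {
        Object current = record;
        for (String part : path.split("\\.")) {
            if (!(current instanceof Map)) {
                return null;
            }
            current = ((Map<?, ?>) current).get(part);
        }
        return current;
    }

    public static void main(String[] args) {
        Map<String, Object> after = new HashMap<>();
        after.put("email", "user@example.com");
        Map<String, Object> record = new HashMap<>();
        record.put("after", after);

        System.out.println(resolve(record, "after.email")); // nested field
        System.out.println(resolve(record, "email"));       // no such top-level field
    }
}
```

One open design point with this approach is a field whose own name contains a dot, which a plain split on "." cannot tell apart from nesting.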

--Gunnar

[1]
https://debezium.io/documentation/reference/connectors/mysql.html#mysql-property-column-mask-hash




Re: [VOTE] KIP-665 Kafka Connect Hash SMT

Posted by Brandon Brown <br...@bbrownsound.com>.
Gonna give this another little bump. :)

Brandon Brown


Re: [VOTE] KIP-665 Kafka Connect Hash SMT

Posted by Brandon Brown <br...@bbrownsound.com>.
As I mentioned in the KIP, this transformer is slightly different from the current MaskField SMT. 

> Currently there exists a MaskField SMT, but it completely removes the value by setting it to an equivalent null value. One problem with this is that, in the case of, say, a password going through the mask transform, the value would become "", which could mean either that no password was present in the message or that it was removed. This hash transformer would remove that ambiguity. The proposed hash functions are MD5, SHA1, and SHA256, which are all supported via MessageDigest.

Given this take on things, do you still think there would be value in this SMT?
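A minimal sketch of the hashing step described above, using java.security.MessageDigest. The class and method names are hypothetical and this is not the code from the proposed PR. Note that the standard MessageDigest algorithm names are written with hyphens ("SHA-1", "SHA-256").

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper for illustration; not code from the KIP or the PR.
final class HashFieldSketch {

    // Returns the lowercase hex digest of value, or null when the field is absent.
    static String hash(String algorithm, String value) {
        if (value == null) {
            return null; // an absent field stays absent instead of turning into ""
        }
        try {
            MessageDigest digest = MessageDigest.getInstance(algorithm);
            byte[] bytes = digest.digest(value.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(bytes.length * 2);
            for (byte b : bytes) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalArgumentException("Unsupported hash algorithm: " + algorithm, e);
        }
    }

    public static void main(String[] args) {
        for (String algorithm : new String[] {"MD5", "SHA-1", "SHA-256"}) {
            System.out.println(algorithm + ": " + hash(algorithm, "secret"));
        }
        System.out.println("absent field: " + hash("SHA-256", null));
    }
}
```

Under these assumptions a present value always maps to a non-empty digest and an absent value stays null, so the two cases remain distinguishable after the transform.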


Brandon Brown 

Re: [VOTE] KIP-665 Kafka Connect Hash SMT

Posted by Ning Zhang <ni...@gmail.com>.
Hello, I think this SMT feature is parallel to the transforms documented at https://docs.confluent.io/current/connect/transforms/index.html


Re: [VOTE] KIP-665 Kafka Connect Hash SMT

Posted by Brandon Brown <br...@bbrownsound.com>.
Bumping this thread.
Please take a look at the KIP and vote or let me know if you have any feedback.

KIP: https://cwiki.apache.org/confluence/display/KAFKA/KIP-665%3A+Kafka+Connect+Hash+SMT

Proposed: https://github.com/apache/kafka/pull/9057

Thanks

Brandon Brown


Re: [VOTE] KIP-665 Kafka Connect Hash SMT

Posted by Brandon Brown <br...@bbrownsound.com>.
Just wanted to give another bump on this and see if anyone had any comments. 

Thanks!

Brandon Brown
