Posted to dev@kafka.apache.org by br...@bbrownsound.com on 2020/08/27 20:42:54 UTC

[DISCUSS] KIP-665 Kafka Connect Hash SMT

https://cwiki.apache.org/confluence/display/KAFKA/KIP-665%3A+Kafka+Connect+Hash+SMT

The current PR with the proposed changes is
https://github.com/apache/kafka/pull/9057, and the original third-party
contribution that initiated this change is
https://github.com/aiven/aiven-kafka-connect-transforms/issues/9#issuecomment-662378057.

I'm interested in any suggestions for ways to improve this as I think  
it would make a nice addition to the existing SMTs provided by Kafka  
Connect out of the box.

Thanks,
Brandon




Re: [DISCUSS] KIP-665 Kafka Connect Hash SMT

Posted by "brandon@bbrownsound.com" <br...@bbrownsound.com>.
It’s been a while, so I thought I’d give this one another friendly bump. 

> On Sep 21, 2020, at 9:38 AM, Brandon Brown <br...@bbrownsound.com> wrote:
> 
> Hi Tom,
> 
> The reason I went with a fixed set was to simplify the configuration: for example, you can say sha256 instead of having to remember that it’s SHA-256. Admittedly, if other algorithms are implemented later, this would need updating as well.
> 
> I’m flexible on changing it to a string and letting it be configured with the exact name. What do you think, Mickael?
> 
> Brandon Brown


Re: [DISCUSS] KIP-665 Kafka Connect Hash SMT

Posted by Brandon Brown <br...@bbrownsound.com>.
Hi Tom,

The reason I went with a fixed set was to simplify the configuration: for example, you can say sha256 instead of having to remember that it’s SHA-256. Admittedly, if other algorithms are implemented later, this would need updating as well.

I’m flexible on changing it to a string and letting it be configured with the exact name. What do you think, Mickael?
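
To illustrate the trade-off, here is a minimal sketch that accepts both the short aliases and exact JVM algorithm names. It is not code from the KIP or the PR: the class name HashFunctionNames, the alias spellings, and the pass-through behavior are all assumptions made for the example.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper for illustration; not taken from the KIP or the PR.
final class HashFunctionNames {
    // Assumed short aliases mapped to the standard JCA algorithm names.
    private static final Map<String, String> ALIASES = Map.of(
            "md5", "MD5",
            "sha1", "SHA-1",
            "sha256", "SHA-256");

    // Resolves a configured value to a JCA name: known aliases are translated,
    // anything else is passed through unchanged as an exact algorithm name.
    static String resolve(String configured) {
        return ALIASES.getOrDefault(configured.toLowerCase(Locale.ROOT), configured);
    }

    // Looks up the digest, turning an unknown name into an unchecked error.
    static MessageDigest digestFor(String configured) {
        try {
            return MessageDigest.getInstance(resolve(configured));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalArgumentException("Unsupported hash function: " + configured, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(digestFor("sha256").getAlgorithm());  // alias
        System.out.println(digestFor("SHA-256").getAlgorithm()); // exact name
    }
}
```

With this shape, adding a new alias is a one-line change, while any algorithm the JVM already knows remains usable by its exact name.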

Brandon Brown

> On Sep 21, 2020, at 3:42 AM, Tom Bentley <tb...@redhat.com> wrote:
> 
> Hi Brandon and Mickael,
> 
> Is it necessary to fix the supported digest? We could just support whatever
> the JVM's MessageDigest supports?
> 
> Kind regards,
> 
> Tom

Re: [DISCUSS] KIP-665 Kafka Connect Hash SMT

Posted by Tom Bentley <tb...@redhat.com>.
Hi Brandon and Mickael,

Is it necessary to fix the supported digest? We could just support whatever
the JVM's MessageDigest supports?
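
As a rough sketch of what delegating to the JVM could look like (not code from the PR; the class name JvmDigests and its methods are hypothetical), the standard java.security API can both list the registered digests and compute one by its exact name:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.security.Security;
import java.util.Set;

// Hypothetical helper for illustration; not taken from the KIP or the PR.
final class JvmDigests {
    // Names of all MessageDigest algorithms registered with the installed providers.
    static Set<String> supported() {
        return Security.getAlgorithms("MessageDigest");
    }

    // Hashes the UTF-8 bytes of the value and returns lowercase hex.
    static String hexDigest(String algorithm, String value) {
        try {
            MessageDigest md = MessageDigest.getInstance(algorithm);
            byte[] hash = md.digest(value.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(hash.length * 2);
            for (byte b : hash) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalArgumentException("Unsupported digest: " + algorithm, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(supported());
        System.out.println(hexDigest("SHA-256", "example"));
    }
}
```

The set returned by supported() depends on the JVM and its security providers, so the list of valid configuration values would vary by deployment.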

Kind regards,

Tom

On Fri, Sep 18, 2020 at 6:00 PM Brandon Brown <br...@bbrownsound.com>
wrote:

> Thanks Mickael! So the proposed hash functions would be MD5, SHA1, and SHA256.
>
> I can expand the motivation in the KIP, but here’s where my head is at.
> MaskField completely removes the value by setting it to an equivalent
> null value. One problem with this is ambiguity: a password passing through
> the mask transform becomes “”, which could mean either that no password
> was present in the message or that it was removed. The hash transform
> would remove this ambiguity, if that makes sense.
>
> Do you think there are other hash functions that should be supported as
> well?
>
> Thanks,
> Brandon Brown

Re: [DISCUSS] KIP-665 Kafka Connect Hash SMT

Posted by Brandon Brown <br...@bbrownsound.com>.
Thanks Mickael! So the proposed hash functions would be MD5, SHA1, and SHA256.

I can expand the motivation in the KIP, but here’s where my head is at. MaskField completely removes the value by setting it to an equivalent null value. One problem with this is ambiguity: a password passing through the mask transform becomes “”, which could mean either that no password was present in the message or that it was removed. The hash transform would remove this ambiguity, if that makes sense.
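
The ambiguity can be sketched in a few lines. This is a simplified stand-in, not the actual MaskField or proposed SMT code: the class MaskVersusHash and both methods are hypothetical, and SHA-256 is used only as an example digest.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical comparison for illustration; not taken from the KIP or the PR.
final class MaskVersusHash {
    // Simplified stand-in for masking a string field: every input becomes "".
    static String mask(String value) {
        return "";
    }

    // Hashes the UTF-8 bytes of the value with SHA-256 and returns 64 hex characters.
    static String sha256Hex(String value) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] hash = md.digest(value.getBytes(StandardCharsets.UTF_8));
            return String.format("%064x", new BigInteger(1, hash));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is not available on this JVM", e);
        }
    }

    public static void main(String[] args) {
        // Masking: an empty password and a real one are indistinguishable afterwards.
        System.out.println(mask("").equals(mask("hunter2")));
        // Hashing: the two inputs produce different digests.
        System.out.println(sha256Hex("").equals(sha256Hex("hunter2")));
    }
}
```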

Do you think there are other hash functions that should be supported as well?

Thanks,
Brandon Brown

> On Sep 18, 2020, at 12:00 PM, Mickael Maison <mi...@gmail.com> wrote:
> 
> Thanks Brandon for the KIP.
> 
> There's already a built-in transformation (MaskField) that can
> obfuscate fields. In the motivation section, it would be nice to
> explain the use cases when MaskField is not suitable and when users
> would need the proposed transformation.
> 
> The KIP exposes a "function" configuration to select the hash function
> to use. Which hash functions do you propose supporting?

Re: [DISCUSS] KIP-665 Kafka Connect Hash SMT

Posted by Mickael Maison <mi...@gmail.com>.
Thanks Brandon for the KIP.

There's already a built-in transformation (MaskField) that can
obfuscate fields. In the motivation section, it would be nice to
explain the use cases when MaskField is not suitable and when users
would need the proposed transformation.

The KIP exposes a "function" configuration to select the hash function
to use. Which hash functions do you propose supporting?

On Thu, Aug 27, 2020 at 10:43 PM <br...@bbrownsound.com> wrote:
>
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-665%3A+Kafka+Connect+Hash+SMT
>
> The current PR with the proposed changes is
> https://github.com/apache/kafka/pull/9057, and the original third-party
> contribution that initiated this change is
> https://github.com/aiven/aiven-kafka-connect-transforms/issues/9#issuecomment-662378057.
>
> I'm interested in any suggestions for ways to improve this as I think
> it would make a nice addition to the existing SMTs provided by Kafka
> Connect out of the box.
>
> Thanks,
> Brandon
>
>
>