You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@kafka.apache.org by Dana Powers <da...@gmail.com> on 2016/05/01 04:57:08 UTC

Re: KIP-57: Interoperable LZ4 Framing

On Fri, Apr 29, 2016 at 6:29 PM, Ewen Cheslack-Postava
<ew...@confluent.io> wrote:
> Two questions:
>
> 1. My understanding based on KIP-35 is that this won't be a problem for
> clients that want to support older broker versions since they will use v0
> produce requests with broken checksum to send to those, and any broker
> advertising support for v1 produce requests will also support valid
> checksums? In other words, the KIP is structured in terms of Java client
> versions, but I'd like to make sure we have the compatibility path for
> non-Java clients cleanly mapped out. (And think we do, especially given
> that Dana is proposing, but still would like an ack on that.)

Yes, I'm treating these as the same:

broker/client <= 0.9
messages == v0
Fetch api version <= 1
Produce api version <= 1

broker/client >= 0.10
messages >= v1
Fetch api version >= 2
Produce api version >= 2

I dont think there will be any problem for clients that want to
support both encodings.

> 2. We're completely disabling checksumming of the compressed payload on
> consumption. Normally you'd want to validate each level of framing for
> correct end-to-end validation. You could still do this (albeit more weakly)
> by validating the checksum is one of the two potentially valid values
> (correct checksum or old, incorrect checksum). This obviously has
> computational cost. Are we sure the tradeoff we're going with makes sense?

Yes, to be honest, not validating on consumption is mostly because I just
haven't dug into the bowels of the java client compressor / memory records
call chains. It seems non-trivial to switch validation based on the message
version in the consumer code. I did not opt for the weak validation that you
suggest because I think the broker should always validate v1 messages on
produce, and that piece shares the same code path within the lz4 java classes.
I suppose we could make the default to raise an error on checksums that fail
weak validation, and then switch to strong validation in the broker.
Alternately,
if you have suggestions on how to wire up the consumer code to switch lz4
behavior based on message version, I would be happy to run with that.

-Dana

Re: KIP-57: Interoperable LZ4 Framing

Posted by Dana Powers <da...@gmail.com>.
Ok -- removed Public Interfaces discussion. It should be up to date w/
PR review comments now.

-Dana

On Fri, May 6, 2016 at 2:15 PM, Ismael Juma <is...@juma.me.uk> wrote:
> One more suggestion Dana, I would remove the "Public interfaces" section as
> those classes are not actually public (only the classes with Javadoc are
> public: https://kafka.apache.org/090/javadoc/index.html) and the
> information in the KIP is a bit stale when compared to the PR.
>
> Ismael
>
> On Fri, May 6, 2016 at 10:12 PM, Ismael Juma <is...@juma.me.uk> wrote:
>
>> On Sun, May 1, 2016 at 3:57 AM, Dana Powers <da...@gmail.com> wrote:
>>>
>>> > 2. We're completely disabling checksumming of the compressed payload on
>>> > consumption. Normally you'd want to validate each level of framing for
>>> > correct end-to-end validation. You could still do this (albeit more
>>> weakly)
>>> > by validating the checksum is one of the two potentially valid values
>>> > (correct checksum or old, incorrect checksum). This obviously has
>>> > computational cost. Are we sure the tradeoff we're going with makes
>>> sense?
>>>
>>> Yes, to be honest, not validating on consumption is mostly because I just
>>> haven't dug into the bowels of the java client compressor / memory records
>>> call chains. It seems non-trivial to switch validation based on the
>>> message
>>> version in the consumer code. I did not opt for the weak validation that
>>> you
>>> suggest because I think the broker should always validate v1 messages on
>>> produce, and that piece shares the same code path within the lz4 java
>>> classes.
>>> I suppose we could make the default to raise an error on checksums that
>>> fail
>>> weak validation, and then switch to strong validation in the broker.
>>> Alternately,
>>> if you have suggestions on how to wire up the consumer code to switch lz4
>>> behavior based on message version, I would be happy to run with that.
>>
>>
>> The lack of checksum validation on consumption was a concern I had as well
>> (and Jun too, when I checked with him) so I helped Dana with this and the
>> PR now includes consumer validation for V1 messages. Dana, can you please
>> update the KIP?
>>
>> Ismael
>>

Re: KIP-57: Interoperable LZ4 Framing

Posted by Ismael Juma <is...@juma.me.uk>.
One more suggestion Dana, I would remove the "Public interfaces" section as
those classes are not actually public (only the classes with Javadoc are
public: https://kafka.apache.org/090/javadoc/index.html) and the
information in the KIP is a bit stale when compared to the PR.

Ismael

On Fri, May 6, 2016 at 10:12 PM, Ismael Juma <is...@juma.me.uk> wrote:

> On Sun, May 1, 2016 at 3:57 AM, Dana Powers <da...@gmail.com> wrote:
>>
>> > 2. We're completely disabling checksumming of the compressed payload on
>> > consumption. Normally you'd want to validate each level of framing for
>> > correct end-to-end validation. You could still do this (albeit more
>> weakly)
>> > by validating the checksum is one of the two potentially valid values
>> > (correct checksum or old, incorrect checksum). This obviously has
>> > computational cost. Are we sure the tradeoff we're going with makes
>> sense?
>>
>> Yes, to be honest, not validating on consumption is mostly because I just
>> haven't dug into the bowels of the java client compressor / memory records
>> call chains. It seems non-trivial to switch validation based on the
>> message
>> version in the consumer code. I did not opt for the weak validation that
>> you
>> suggest because I think the broker should always validate v1 messages on
>> produce, and that piece shares the same code path within the lz4 java
>> classes.
>> I suppose we could make the default to raise an error on checksums that
>> fail
>> weak validation, and then switch to strong validation in the broker.
>> Alternately,
>> if you have suggestions on how to wire up the consumer code to switch lz4
>> behavior based on message version, I would be happy to run with that.
>
>
> The lack of checksum validation on consumption was a concern I had as well
> (and Jun too, when I checked with him) so I helped Dana with this and the
> PR now includes consumer validation for V1 messages. Dana, can you please
> update the KIP?
>
> Ismael
>

Re: KIP-57: Interoperable LZ4 Framing

Posted by Gwen Shapira <gw...@confluent.io>.
Agree.

On Fri, May 6, 2016 at 3:14 PM, Ismael Juma <is...@juma.me.uk> wrote:
> On Fri, May 6, 2016 at 11:07 PM, Dana Powers <da...@gmail.com> wrote:
>
>> Updated:
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-57+-+Interoperable+LZ4+Framing
>>
>> Should I restart the vote?
>>
>
> I think the update is small enough that we don't need to restart the vote.
>
> Ismael

Re: KIP-57: Interoperable LZ4 Framing

Posted by Ismael Juma <is...@juma.me.uk>.
On Fri, May 6, 2016 at 11:07 PM, Dana Powers <da...@gmail.com> wrote:

> Updated:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-57+-+Interoperable+LZ4+Framing
>
> Should I restart the vote?
>

I think the update is small enough that we don't need to restart the vote.

Ismael

Re: KIP-57: Interoperable LZ4 Framing

Posted by Dana Powers <da...@gmail.com>.
Updated: https://cwiki.apache.org/confluence/display/KAFKA/KIP-57+-+Interoperable+LZ4+Framing

Should I restart the vote?

-Dana

On Fri, May 6, 2016 at 2:12 PM, Ismael Juma <is...@juma.me.uk> wrote:
> On Sun, May 1, 2016 at 3:57 AM, Dana Powers <da...@gmail.com> wrote:
>>
>> > 2. We're completely disabling checksumming of the compressed payload on
>> > consumption. Normally you'd want to validate each level of framing for
>> > correct end-to-end validation. You could still do this (albeit more
>> weakly)
>> > by validating the checksum is one of the two potentially valid values
>> > (correct checksum or old, incorrect checksum). This obviously has
>> > computational cost. Are we sure the tradeoff we're going with makes
>> sense?
>>
>> Yes, to be honest, not validating on consumption is mostly because I just
>> haven't dug into the bowels of the java client compressor / memory records
>> call chains. It seems non-trivial to switch validation based on the message
>> version in the consumer code. I did not opt for the weak validation that
>> you
>> suggest because I think the broker should always validate v1 messages on
>> produce, and that piece shares the same code path within the lz4 java
>> classes.
>> I suppose we could make the default to raise an error on checksums that
>> fail
>> weak validation, and then switch to strong validation in the broker.
>> Alternately,
>> if you have suggestions on how to wire up the consumer code to switch lz4
>> behavior based on message version, I would be happy to run with that.
>
>
> The lack of checksum validation on consumption was a concern I had as well
> (and Jun too, when I checked with him) so I helped Dana with this and the
> PR now includes consumer validation for V1 messages. Dana, can you please
> update the KIP?
>
> Ismael

Re: KIP-57: Interoperable LZ4 Framing

Posted by Ismael Juma <is...@juma.me.uk>.
On Sun, May 1, 2016 at 3:57 AM, Dana Powers <da...@gmail.com> wrote:
>
> > 2. We're completely disabling checksumming of the compressed payload on
> > consumption. Normally you'd want to validate each level of framing for
> > correct end-to-end validation. You could still do this (albeit more
> weakly)
> > by validating the checksum is one of the two potentially valid values
> > (correct checksum or old, incorrect checksum). This obviously has
> > computational cost. Are we sure the tradeoff we're going with makes
> sense?
>
> Yes, to be honest, not validating on consumption is mostly because I just
> haven't dug into the bowels of the java client compressor / memory records
> call chains. It seems non-trivial to switch validation based on the message
> version in the consumer code. I did not opt for the weak validation that
> you
> suggest because I think the broker should always validate v1 messages on
> produce, and that piece shares the same code path within the lz4 java
> classes.
> I suppose we could make the default to raise an error on checksums that
> fail
> weak validation, and then switch to strong validation in the broker.
> Alternately,
> if you have suggestions on how to wire up the consumer code to switch lz4
> behavior based on message version, I would be happy to run with that.


The lack of checksum validation on consumption was a concern I had as well
(and Jun too, when I checked with him) so I helped Dana with this and the
PR now includes consumer validation for V1 messages. Dana, can you please
update the KIP?

Ismael