Posted to dev@kafka.apache.org by Vahid S Hashemian <va...@us.ibm.com> on 2017/01/03 20:12:19 UTC

Re: [DISCUSS] KIP 88: OffsetFetch Protocol Update

Asking once more for feedback on this KIP, which had to go through some
more changes after approval:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-88%3A+OffsetFetch+Protocol+Update

If there are no concerns, I can start the vote, in the hope that it makes
it into the 0.10.2.0 release.

Thanks.
 
Regards,
--Vahid




From:   Vahid S Hashemian/Silicon Valley/IBM@IBMUS
To:     dev@kafka.apache.org
Date:   12/19/2016 01:25 PM
Subject:        Re: [DISCUSS] KIP 88: OffsetFetch Protocol Update



Happy Monday,

Jason, thanks for further explaining the issue.

I have updated the KIP to reflect the recent discussions:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-88%3A+OffsetFetch+Protocol+Update

You can also see the modifications to the KIP compared to the approved 
version here: 
https://cwiki.apache.org/confluence/pages/diffpagesbyversion.action?pageId=66849788&selectedPageVersions=26&selectedPageVersions=25


Feedback and comments are welcome.

Thanks.
--Vahid



From:   Jason Gustafson <ja...@confluent.io>
To:     dev@kafka.apache.org
Date:   12/16/2016 11:11 AM
Subject:        Re: [DISCUSS] KIP 88: OffsetFetch Protocol Update



Thanks Vahid. To clarify the impact of this issue, since we have no way to
send an error code in the OffsetFetchResponse when requesting all offsets,
we cannot detect when the coordinator has moved to another broker or when
it is still in the process of loading the offsets. This means we cannot
tell if there was an error or if there were just no offsets stored for
the group. We've considered a few options:

1. Include an error code at the top level of the response. This seems like
the cleanest approach. The downside is that clients need to look for
errors in two locations in the response. One small benefit is that many
OffsetFetch errors are group-level, so in that case we no longer need to
return responses for all the requested partitions.
2. Sort of hacky, but we could insert a "dummy" partition into the
response so that we have somewhere to return an error code.
3. Include no error code, but use a null array in the response to indicate
that there was some error. If there was no error, and the group simply had
no partitions, then we return an empty array. I guess in this case, if the
client receives a null array in the response, it should assume the worst,
rediscover the coordinator, and try again.

My preference is the first one. Not sure if there are any other ideas?
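
To make the first option above concrete, here is a minimal sketch of how a
client might check a top-level error code before looking at per-partition
results. This is not the actual Kafka client code. The class names, the
handle_response function, the returned action strings, and the numeric
error values are all assumptions made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

# Illustrative error codes only (assumed values, not copied from the
# protocol definition).
NONE = 0
GROUP_LOAD_IN_PROGRESS = 14
NOT_COORDINATOR_FOR_GROUP = 16

TopicPartition = Tuple[str, int]


@dataclass
class PartitionOffset:
    """Hypothetical per-partition entry in the response."""
    offset: int
    error_code: int = NONE


@dataclass
class OffsetFetchResponse:
    """Hypothetical response shape with a top-level error code (option 1)."""
    partitions: Dict[TopicPartition, PartitionOffset] = field(default_factory=dict)
    error_code: int = NONE  # group-level error, independent of any partition


def handle_response(resp: OffsetFetchResponse):
    """Return (action, offsets) for a response to an 'all offsets' request."""
    # Group-level errors are checked first; they apply even when the
    # partition list is empty, which removes the ambiguity described above.
    if resp.error_code == NOT_COORDINATOR_FOR_GROUP:
        return ("rediscover_coordinator", {})
    if resp.error_code == GROUP_LOAD_IN_PROGRESS:
        return ("retry", {})
    if resp.error_code != NONE:
        return ("fail", {})
    # No group-level error: an empty result now really means "no offsets".
    offsets = {
        tp: entry.offset
        for tp, entry in resp.partitions.items()
        if entry.error_code == NONE
    }
    return ("ok", offsets)
```

With this shape, an empty partition list with error_code NONE means the
group has no stored offsets, while an empty list with a non-zero error_code
tells the client to retry or rediscover the coordinator. The cost, as noted
in option 1, is that the client still has to inspect the per-partition
error codes as a second location.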

-Jason

On Thu, Dec 15, 2016 at 3:02 PM, Vahid S Hashemian <
vahidhashemian@us.ibm.com> wrote:

> Hi all,
>
> Even though KIP-88 was recently approved, due to a limitation that comes
> with the proposed protocol change in KIP-88 I'll have to re-open it to
> address the problem.
> I'd like to thank Jason Gustafson for catching this issue.
>
> I'll explain this in the KIP as well, but to summarize, KIP-88 suggests
> adding the option of passing a "null" array in the OffsetFetch request to
> query all existing offsets for a consumer group. It does not suggest any
> modification to the OffsetFetch response.
>
> In the existing protocol, group or coordinator related errors are
> reported along with each partition in the OffsetFetch response.
>
> If there are partitions in the request, they are guaranteed to appear in
> the response (there could be an error code associated with each). So if
> there is an error, it is reported back by being attached to some
> partition in the request.
> If an empty array is passed, no error is reported (no matter what the
> group or coordinator status is). The response comes back with an empty
> list.
>
> With the proposed change in KIP-88 we could have a scenario in which a
> null array is sent in the OffsetFetch request, and due to some error (for
> example, if the coordinator just started and hasn't caught up yet with
> the offsets topic), an empty list is returned in the OffsetFetch response
> (the group may or may not actually be empty). The issue is that in
> situations like this no error can be returned in the response because
> there is no partition to attach the error to.
>
> I'll update the KIP with more details and propose to add to OffsetFetch
> response schema an "error_code" at the top level that can be used to
> report group related errors (instead of reporting those errors with each
> individual partition).
>
> I apologize if this causes any inconvenience.
>
> Feedback and comments are always welcome.
>
> Thanks.
> --Vahid
>
>









Re: [DISCUSS] KIP 88: OffsetFetch Protocol Update

Posted by Ewen Cheslack-Postava <ew...@confluent.io>.
Vahid,

This looks reasonable to me and fits well with the changes made for
metadata requests.

-Ewen
