Posted to dev@kafka.apache.org by Jorge Esteban Quilcate Otoya <qu...@gmail.com> on 2017/02/08 02:43:16 UTC

KIP-122: Add a tool to Reset Consumer Group Offsets

Hi all,

I would like to propose a KIP to Add a tool to Reset Consumer Group Offsets.

https://cwiki.apache.org/confluence/display/KAFKA/KIP-122%3A+Add+a+tool+to+Reset+Consumer+Group+Offsets

Please, take a look at the proposal and share your feedback.

Thanks,
Jorge.

Re: KIP-122: Add a tool to Reset Consumer Group Offsets

Posted by Vahid S Hashemian <va...@us.ibm.com>.
Thanks for this useful KIP and the nice write-up.

Just a minor suggestion. Would it make sense to avoid repeating the term 
"reset" in the arguments?
We already use the argument "reset-offset", so we may not need to repeat 
the term in the follow-on argument.

For example, instead of

kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-datetime 
2017-01-01T00:00:00.000

we could use

kafka-consumer-groups.sh --reset-offset --group cg1 --to-datetime 
2017-01-01T00:00:00.000


Similarly, we could replace the other suggested arguments (or any others
that are eventually approved) like this:

--reset-to-period -> --to-period
--reset-to-earliest -> --to-earliest
--reset-to-latest -> --to-latest
--reset-minus -> --to-minus
--reset-plus -> --to-plus
--reset-to -> --to

Thanks.
--Vahid



From:   Jorge Esteban Quilcate Otoya <qu...@gmail.com>
To:     dev@kafka.apache.org, Users <us...@kafka.apache.org>
Date:   02/08/2017 02:23 PM
Subject:        Re: KIP-122: Add a tool to Reset Consumer Group Offsets



Great. I think I got the idea. What about these options:

Scenarios:

1. Current status

´kafka-consumer-groups.sh --reset-offset --group cg1´

2. To Datetime

´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-datetime
2017-01-01T00:00:00.000´

3. To Period

´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-period 
P2D´

4. To Earliest

´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-earliest´

5. To Latest

´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-latest´

6. Minus 'n' offsets

´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-minus n´

7. Plus 'n' offsets

´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-plus n´

8. To specific offset

´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to x´
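
As a rough illustration of the position-based scenarios above (earliest, latest, minus 'n', plus 'n', specific offset), the sketch below computes the new offset for a single partition. The function name, the scenario labels, and the clamping of the result to the partition's [earliest, latest] range are assumptions made for this example; none of them come from the KIP. The datetime and period scenarios are left out because they need a timestamp lookup on the broker.

```python
def target_offset(scenario, current, earliest, latest, value=None):
    """Return the offset one partition would be reset to (illustrative only)."""
    if scenario == "earliest":
        new = earliest
    elif scenario == "latest":
        new = latest
    elif scenario == "minus":
        new = current - value
    elif scenario == "plus":
        new = current + value
    elif scenario == "to":
        new = value
    else:
        raise ValueError(f"unknown scenario: {scenario}")
    # Assumption: keep the result inside the range the log still holds.
    return max(earliest, min(new, latest))
```

For example, `target_offset("minus", 50, 10, 100, 5)` rewinds by five messages, while a request that would fall before the earliest available offset is pinned to that earliest offset.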

Scopes:

a. All topics used by Consumer Group

Don't specify --topics

b. Specific List of Topics

Add list of values in --topics t1,t2,tn

c. One Topic, all Partitions

Add one topic and no partitions values: --topic t1

d. One Topic, List of Partitions

Add one topic and partitions values: --topic t1 --partitions 0,1,2

About Reset Plan (JSON file):

I think it is still valid to have the option to persist the reset
configuration as a file, but I agree we should give the option to run the
tool without going down to the JSON file.

Execution options:

1. Without execution argument (No args):

Print out results (reset plan)

2. With --execute argument:

Run reset process

3. With --output argument:

Save result in a JSON format.

4. Only with --execute option and --reset-file (path to JSON)

Reset based on file

5. Only with --verify option and --reset-file (path to JSON)

Verify file values with current offsets
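
One way to read the execution options listed above is as a small decision table. The sketch below is only that reading: the function name and the returned action labels are hypothetical, and combinations the list does not describe (for example --execute together with --output) are rejected here only because the proposal does not say what they should do.

```python
def choose_action(execute=False, verify=False, output=None, reset_file=None):
    """Map the proposed flags to one action name (illustrative only)."""
    if reset_file is not None:
        # File-driven modes: exactly one of --execute / --verify is expected.
        if execute and not verify and output is None:
            return "reset-from-file"
        if verify and not execute and output is None:
            return "verify-file"
        raise ValueError("--reset-file needs exactly one of --execute or --verify")
    if verify:
        raise ValueError("--verify requires --reset-file")
    if execute and output is not None:
        raise ValueError("combination not described in the proposal")
    if execute:
        return "run-reset"
    if output is not None:
        return "save-plan-as-json"
    # No execution argument: only print the reset plan.
    return "print-plan"
```

Keeping "print the plan" as the default means a command without extra flags never changes any committed offset.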

I think we can remove --generate-and-execute because it is a bit clumsy.

With these options we will be able to execute with a manual JSON
configuration.


On Wed, Feb 8, 2017 at 22:43, Ben Stopford (<be...@confluent.io>)
wrote:

> Yes - using a tool like this to skip a set of consumer groups over a
> corrupt/bad message is definitely appealing.
>
> B
>
> On Wed, Feb 8, 2017 at 9:37 PM Gwen Shapira <gw...@confluent.io> wrote:
>
> > I like the --reset-to-earliest and --reset-to-latest. In general,
> > since the JSON route is the most challenging for users, we want to
> > provide a lot of ways to do useful things without going there.
> >
> > Two things that can help:
> >
> > 1. A lot of times, users want to skip a few messages that cause issues
> > and continue. Maybe just specifying the topic, partition and delta
> > will be better than having to find the offset and write a JSON and
> > validate the JSON etc.
> >
> > 2. Thinking if there are other common use-cases that we can make easy
> > rather than just one generic but not very usable method.
> >
> > Gwen
> >
> > On Wed, Feb 8, 2017 at 3:25 AM, Jorge Esteban Quilcate Otoya
> > <qu...@gmail.com> wrote:
> > > Thanks for the feedback!
> > >
> > > @Onur, @Gwen:
> > >
> > > Agreed. Actually, in the first draft I considered having it inside
> > > ´kafka-consumer-groups.sh´, but I decided to propose it as a standalone
> > > tool to describe it clearly and focus it on the reset functionality.
> > >
> > > But now that you mention it, it does make sense to have it in
> > > ´kafka-consumer-groups.sh´. What would be a consistent way to introduce
> > > it?
> > >
> > > Maybe something like this:
> > >
> > > ´kafka-consumer-groups.sh --reset-offset --generate --group cg1
> > > --topics t1 --reset-from 2017-01-01T00:00:00.000 --output plan.json´
> > >
> > > ´kafka-consumer-groups.sh --reset-offset --verify --reset-json-file
> > > plan.json´
> > >
> > > ´kafka-consumer-groups.sh --reset-offset --execute --reset-json-file
> > > plan.json´
> > >
> > > ´kafka-consumer-groups.sh --reset-offset --generate-and-execute
> > > --group cg1 --topics t1 --reset-from 2017-01-01T00:00:00.000´
> > >
> > > @Gwen:
> > >
> > >> It looks exactly like the replica assignment tool
> > >
> > > It was influenced by it ;-) I use the generate-verify-execute process
> > > here to make sure the user is aware of the result of this operation.
> > > At the beginning we considered only adding a couple of options to the
> > > Consumer Group Command:
> > >
> > > --rewind-to-timestamp and --rewind-to-period
> > >
> > > @Onur:
> > >
> > >> You can actually get away with overriding while members of the group
> > >> are live with method 2 by using group information from
> > >> DescribeGroupsRequest.
> > >
> > > Does this mean that we need to have the Consumer Group stopped before
> > > executing, and start a new consumer internally to do this? And
> > > therefore we won't be able to consider executing a reset while the
> > > ConsumerGroup is active? (Trying to relate it to @Dong's 5th question.)
> > >
> > > @Dong:
> > >
> > >> Should we allow user to use wildcard to reset offset of all groups
> > >> for a given topic as well?
> > >
> > > I haven't thought about this scenario. It could be interesting.
> > > Following the recommendation to add it to the Consumer Group Command,
> > > in this case the Group argument would be optional if there is only 1
> > > topic. I think for multiple topics it won't be that useful.
> > >
> > >> Should we allow user to specify timestamp per topic partition in the
> > >> json file as well?
> > >
> > > I don't think this would be a valid option from the tool, but if a
> > > Reset Plan is generated and the user wants to set the offset for a
> > > specific partition to another offset (possibly based on another
> > > timestamp) and execute it, that will be up to her/him.
> > >
> > >> Should the script take some credential file to make sure that this
> > >> operation is authenticated given the potential impact of this
> > >> operation?
> > >
> > > Haven't tried to secure brokers yet, but the tool should support
> > > authorization if it's enabled in the broker.
> > >
> > >> Should we provide constant to reset committed offset to
> > >> earliest/latest offset of a partition, e.g. -1 indicates earliest
> > >> offset and -2 indicates latest offset.
> > >
> > > I will go for something like ´--reset-to-earliest´ and
> > > ´--reset-to-latest´
> > >
> > >> Should we allow dynamic change of the committed offset when consumers
> > >> are running, such that consumer will seek to the newly committed
> > >> offset and start consuming from there?
> > >
> > > Not sure about this. I would recommend keeping it simple and asking
> > > the user to stop consumers first. But I would consider it if the
> > > trade-offs are clear.
> > >
> > > @Matthias
> > >
> > > Added :). And thanks a lot for your help to define this KIP!
> > >
> > >
> > >
> > > On Wed, Feb 8, 2017 at 7:47, Gwen Shapira (<gw...@confluent.io>)
> > > wrote:
> > >
> > >> As long as the CLI is a bit consistent? Like, not just adding 3
> > >> arguments and a JSON parser to the existing tool, right?
> > >>
> > >> On Tue, Feb 7, 2017 at 10:29 PM, Onur Karaman
> > >> <on...@gmail.com> wrote:
> > >> > I think it makes sense to just add the feature to
> > >> > kafka-consumer-groups.sh
> > >> >
> > >> > On Tue, Feb 7, 2017 at 10:24 PM, Gwen Shapira <gw...@confluent.io>
> > >> > wrote:
> > >> >
> > >> >> Thanks for the KIP. I'm super happy about adding the capability.
> > >> >>
> > >> >> I hate the interface, though. It looks exactly like the replica
> > >> >> assignment tool. A tool everyone loves so much that there are
> > >> >> multiple projects, open and closed, that try to fix it.
> > >> >>
> > >> >> Can we swap it with something that looks a bit more like the
> > >> >> consumer group tool? or the kafka streams reset tool? Consistency
> > >> >> is helpful in such cases. I spent some time learning existing
> > >> >> tools and learning yet another one is a deterrent.
> > >> >>
> > >> >> Gwen
> > >> >>
> > >> >>
> > >> >>
> > >> >> On Tue, Feb 7, 2017 at 6:43 PM, Jorge Esteban Quilcate Otoya
> > >> >> <qu...@gmail.com> wrote:
> > >> >> > Hi all,
> > >> >> >
> > >> >> > I would like to propose a KIP to Add a tool to Reset Consumer
> > >> >> > Group Offsets.
> > >> >> >
> > >> >> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-122%3A+Add+a+tool+to+Reset+Consumer+Group+Offsets
> > >> >> >
> > >> >> > Please, take a look at the proposal and share your feedback.
> > >> >> >
> > >> >> > Thanks,
> > >> >> > Jorge.
> > >> >>
> > >> >>
> > >> >>
> > >> >> --
> > >> >> Gwen Shapira
> > >> >> Product Manager | Confluent
> > >> >> 650.450.2760 | @gwenshap
> > >> >> Follow us: Twitter | blog
> > >> >>
> > >>
> > >>
> > >>
> > >>
> >
> >
> >
> >
>





Re: KIP-122: Add a tool to Reset Consumer Group Offsets

Posted by "Matthias J. Sax" <ma...@confluent.io>.
There is a voting thread on dev list. Please put your vote there. Thx.


-Matthias

On 2/23/17 8:15 PM, Mahendra Kariya wrote:
> +1 for such a tool. It would be of great help in a lot of use cases.
> 
> On Thu, Feb 23, 2017 at 11:44 PM, Matthias J. Sax <ma...@confluent.io>
> wrote:
> 
>> \cc from dev
>>
>>
>> -------- Forwarded Message --------
>> Subject: Re: KIP-122: Add a tool to Reset Consumer Group Offsets
>> Date: Thu, 23 Feb 2017 10:13:39 -0800
>> From: Matthias J. Sax <ma...@confluent.io>
>> Organization: Confluent Inc
>> To: dev@kafka.apache.org
>>
>> So you suggest to merge "scope options" --topics, --topic, and
>> --partitions into a single option? Sounds good to me.
>>
>> I like the compact way to express it, i.e., topicname:list-of-partitions
>> with "all partitions" if no partitions are specified. It's quite
>> intuitive to use.
>>
>> Just wondering if we could get rid of the repeated --topic option; it's
>> somewhat verbose. I have no good idea how to improve it, though.
>>
>> If you concatenate multiple topics, we need one more character that is
>> not allowed in topic names to separate the topics:
>>
>>> invalidChars = {'/', '\\', ',', '\u0000', ':', '"', '\'', ';', '*',
>>> '?', ' ', '\t', '\r', '\n', '='};
>>
>> maybe
>>
>> --topics t1=1,2,3:t2:t3=3
>>
>> use '=' to specify partitions (instead of ':' as you proposed) and ':'
>> to separate topics? All other characters seem to be worse to use to me.
>> But maybe you have a better idea.
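
A minimal sketch of how the single-option form `t1=1,2,3:t2:t3=3` could be split, assuming ':' separates topics and '=' introduces a comma-separated partition list, as suggested above. The function name and the use of `None` to mean "all partitions" are choices made for this example only.

```python
def parse_topics(spec):
    """Parse 't1=1,2,3:t2:t3=3' into {topic: [partitions] or None}."""
    scopes = {}
    for entry in spec.split(":"):
        topic, sep, partitions = entry.partition("=")
        if not topic:
            raise ValueError(f"missing topic name in {entry!r}")
        # No '=' means no partition list, i.e. all partitions of the topic.
        scopes[topic] = [int(p) for p in partitions.split(",")] if sep else None
    return scopes
```

Because both separators are in the list of characters not allowed in topic names, splitting on them cannot cut a valid topic name in two.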
>>
>>
>>
>> -Matthias
>>
>>
>> On 2/23/17 3:15 AM, Jorge Esteban Quilcate Otoya wrote:
>>> @Matthias about the point 9:
>>>
>>> What about keeping only the --topic option, and support this format:
>>>
>>> `--topic t1:0,1,2 --topic t2 --topic t3:2`
>>>
>>> In this case topics t1, t2, and t3 will be selected: topic t1 with
>>> partitions 0, 1 and 2; topic t2 with all its partitions; and topic t3
>>> with only partition 2.
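
To make the repeated `--topic` form concrete, here is a small sketch using Python's `argparse` with an appending option. It is not how the actual tool is implemented; the function name and the `None` marker for "all partitions" are assumptions for the example.

```python
import argparse

def parse_topic_args(argv):
    """Collect repeated --topic values such as 't1:0,1,2' (illustrative only)."""
    parser = argparse.ArgumentParser()
    # action="append" gathers every occurrence of --topic into one list.
    parser.add_argument("--topic", action="append", default=[])
    args = parser.parse_args(argv)
    scopes = {}
    for value in args.topic:
        topic, sep, partitions = value.partition(":")
        # A bare topic name (no ':') selects all of its partitions.
        scopes[topic] = [int(p) for p in partitions.split(",")] if sep else None
    return scopes
```

Called with `["--topic", "t1:0,1,2", "--topic", "t2", "--topic", "t3:2"]`, it selects partitions 0, 1 and 2 of t1, every partition of t2, and partition 2 of t3.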
>>>
>>> Jorge.
>>>
>>> On Tue, Feb 21, 2017 at 11:11, Jorge Esteban Quilcate Otoya (<
>>> quilcate.jorge@gmail.com>) wrote:
>>>
>>>> Thanks for the feedback Matthias.
>>>>
>>>> * 1. You're right. I'll reorder the scenarios.
>>>>
>>>> * 2. Agree. I'll update the KIP.
>>>>
>>>> * 3. I like it, updating to `reset-offsets`
>>>>
>>>> * 4. Agree, removing the `reset-` part
>>>>
>>>> * 5. Yes, the 1.e option without --execute or --export will print out
>>>> the current offset and the new offset, which will be the same. The
>>>> use case for this option is mostly to combine it with --export and
>>>> have a current 'checkpoint' to reset to later. I will add to the KIP
>>>> what the output should look like.
>>>>
>>>> * 6. Considering 4., I will update it to `--to-offset`
>>>>
>>>> * 7. I like the idea to unify these options (plus, minus).
>>>> `shift-offsets-by` is a good option, but I would like some more feedback
>>>> here about the name. I will update the KIP in the meantime.
>>>>
>>>> * 8. Yes, discussed in 9.
>>>>
>>>> * 9. Agree. I'd love some feedback here. `topic` is already used by
>>>> `delete`, and we can add `--all-topics` to consider all
>>>> topics/partitions assigned to a group. How could we define specific
>>>> topics/partitions?
>>>>
>>>> * 10. Haven't thought about it, but it makes sense.
>>>> <topic>,<partition>,<offset> would be enough.
>>>>
>>>> * 11. Agree. Solved with 10.
>>>>
>>>> Also, I have a couple of changes to mention:
>>>>
>>>> 1. I have added a reference to the branch where I'm working on this KIP.
>>>>
>>>> 2. About the period scenario `--to-period`: I will change it to
>>>> `--to-duration`, given that Duration (
>>>> https://docs.oracle.com/javase/8/docs/api/java/time/Duration.html)
>>>> follows the format 'PnDTnHnMnS' and does not consider daylight saving
>>>> effects.
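
For readers unfamiliar with the 'PnDTnHnMnS' format, the sketch below parses a subset of it in Python and subtracts the result from a reference time to get the timestamp to reset to. It stands in for Java's `Duration.parse` only for illustration: it accepts whole, non-negative numbers only, and the function names are hypothetical. A day is treated as exactly 24 hours, which is why daylight saving plays no role.

```python
import re
from datetime import datetime, timedelta

# Subset of the 'PnDTnHnMnS' pattern: whole, non-negative numbers only.
_DURATION = re.compile(
    r"P(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?)?"
)

def parse_duration(text):
    """Turn e.g. 'P2D' or 'PT1H30M' into a timedelta (illustrative only)."""
    match = _DURATION.fullmatch(text)
    if match is None or text.endswith("T"):
        raise ValueError(f"not a PnDTnHnMnS duration: {text!r}")
    parts = {name: int(num) for name, num in match.groupdict().items() if num}
    if not parts:
        raise ValueError(f"duration has no components: {text!r}")
    return timedelta(**parts)

def reset_timestamp(now, duration_text):
    """Timestamp that lies the given duration before 'now'."""
    return now - parse_duration(duration_text)
```

With `P2D`, a reference time of 2017-02-21 11:11 gives 2017-02-19 11:11 as the point to reset to.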
>>>>
>>>>
>>>>
>>>> On Tue, Feb 21, 2017 at 2:47, Matthias J. Sax (<matthias@confluent.io>)
>>>> wrote:
>>>>
>>>> Hi,
>>>>
>>>> thanks for updating the KIP. Couple of follow up comments:
>>>>
>>>> * Nit: Why is "Reset to Earliest" and "Reset to Latest" a "reset by
>>>> time" option -- IMHO it belongs to "reset by position"?
>>>>
>>>>
>>>> * Nit: Description of "Reset to Earliest"
>>>>
>>>>> using Kafka Consumer's `auto.offset.reset` to `earliest`
>>>>
>>>> I think this is strictly speaking not correct (as auto.offset.reset is
>>>> only triggered if no valid offset is found, but this tool explicitly
>>>> modifies the committed offset), and should be phrased as
>>>>
>>>>> using Kafka Consumer's #seekToBeginning()
>>>>
>>>> -> similar issue for description of "Reset to Latest"
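
The distinction made here can be shown with a simplified model of where a consumer starts reading a partition. This is not Kafka client code; the function and parameter names are hypothetical. It only illustrates that the `auto.offset.reset` policy is consulted when there is no valid committed offset, whereas a reset tool changes the committed offset itself.

```python
def starting_offset(committed, policy, earliest, latest):
    """Model where a consumer starts on one partition (illustrative only)."""
    if committed is not None and earliest <= committed <= latest:
        # A valid committed offset wins; the policy is never consulted.
        return committed
    if policy == "earliest":
        return earliest
    if policy == "latest":
        return latest
    raise LookupError("no valid committed offset and policy is 'none'")
```

With a committed offset of 42 the policy makes no difference, so describing "Reset to Earliest" in terms of `auto.offset.reset` would be misleading.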
>>>>
>>>>
>>>> * Main option: rename to --reset-offsets (plural instead of singular)
>>>>
>>>>
>>>> * Scenario Options: I would remove "reset" from all options, because the
>>>> main argument "--reset-offset" says already what to do:
>>>>
>>>>> bin/kafka-consumer-groups.sh --reset-offset --reset-to-datetime XXX
>>>>
>>>> better (IMHO):
>>>>
>>>>> bin/kafka-consumer-groups.sh --reset-offsets --to-datetime XXX
>>>>
>>>>
>>>>
>>>> * Option 1.e ("print and export current offset") is not intuitive to use
>>>> IMHO. The main option is "--reset-offset" but nothing happens if no
>>>> scenario is specified. It is also not specified what the output should
>>>> look like.
>>>>
>>>> Furthermore, --describe should actually show currently committed offset
>>>> for a group. So it seems to be redundant to have the same option in
>>>> --reset-offsets
>>>>
>>>>
>>>> * Option 2.a: I would rename to "--reset-to-offset" (or considering the
>>>> comment above to "--to-offset")
>>>>
>>>>
>>>> * Option 2.b and 2.c: I would unify to "--shift-offsets-by" (or similar)
>>>> and accept positive/negative values
>>>>
>>>>
>>>> * About Scope "all": maybe it's better to have an option "--all-topics"
>>>> (or similar). IMHO explicit arguments are preferable over implicit
>>>> settings to guard against accidental misuse of the tool.
>>>>
>>>>
>>>> * Scope: I also think, that "--topic" (singular) and "--topics" (plural)
>>>> are too similar and easy to use in a wrong way (ie, mix up) -- maybe we
>>>> can have two options that are easier to distinguish.
>>>>
>>>>
>>>> * I still think that JSON is not the best format (it's too verbose/hard
>>>> to write for humans from scratch). A simple CSV format with implicit
>>>> schema (topic,partition,offset) would be sufficient.
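
A minimal sketch of reading such a CSV reset plan, assuming one `topic,partition,offset` row per line and no header. The function name and the dictionary layout are made up for this example.

```python
import csv
import io

def read_reset_plan(text):
    """Parse 'topic,partition,offset' rows into {(topic, partition): offset}."""
    plan = {}
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # skip blank lines
        topic, partition, offset = row
        plan[(topic, int(partition))] = int(offset)
    return plan
```

A file with the three rows `t1,0,42`, `t1,1,17` and `t2,0,5` is short enough to write by hand, which is the point being made about CSV versus JSON.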
>>>>
>>>>
>>>> * Why does the JSON contain a "group_id" field -- there is a parameter
>>>> "--group" to specify the group ID. Would one overwrite the other (in
>>>> what order) or would there be an error if "--group" is used in
>>>> combination with "--reset-from-file"?
>>>>
>>>>
>>>>
>>>> -Matthias
>>>>
>>>>
>>>>
>>>>
>>>> On 2/17/17 6:43 AM, Jorge Esteban Quilcate Otoya wrote:
>>>>> Hi,
>>>>>
>>>>> according to the feedback, I've updated the KIP:
>>>>>
>>>>> - We have added and ordered the scenarios, scopes and executions of the
>>>>> Reset Offset tool.
>>>>> - Consider it as an extension to the current `ConsumerGroupCommand` tool
>>>>> - Execution will be possible without generating JSON files.
>>>>>
>>>>>
>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-122%3A+Add+Reset+Consumer+Group+Offsets+tooling
>>>>>
>>>>> Looking forward to your feedback!
>>>>>
>>>>> Jorge.
> 


Re: Re: KIP-122: Add a tool to Reset Consumer Group Offsets

Posted by Mahendra Kariya <ma...@go-jek.com>.
+1 for such a tool. It would be of great help in a lot of use cases.

On Thu, Feb 23, 2017 at 11:44 PM, Matthias J. Sax <ma...@confluent.io>
wrote:

> \cc from dev
>
>
> -------- Forwarded Message --------
> Subject: Re: KIP-122: Add a tool to Reset Consumer Group Offsets
> Date: Thu, 23 Feb 2017 10:13:39 -0800
> From: Matthias J. Sax <ma...@confluent.io>
> Organization: Confluent Inc
> To: dev@kafka.apache.org
>
> So you suggest to merge "scope options" --topics, --topic, and
> --partitions into a single option? Sound good to me.
>
> I like the compact way to express it, ie, topicname:list-of-partitions
> with "all partitions" if not partitions are specified. It's quite
> intuitive to use.
>
> Just wondering, if we could get rid of the repeated --topic option; it's
> somewhat verbose. Have no good idea though who to improve it.
>
> If you concatenate multiple topic, we need one more character that is
> not allowed in topic names to separate the topics:
>
> > invalidChars = {'/', '\\', ',', '\u0000', ':', '"', '\'', ';', '*',
> '?', ' ', '\t', '\r', '\n', '='};
>
> maybe
>
> --topics t1=1,2,3:t2:t3=3
>
> use '=' to specify partitions (instead of ':' as you proposed) and ':'
> to separate topics? All other characters seem to be worse to use to me.
> But maybe you have a better idea.
>
>
>
> -Matthias
>
>
> On 2/23/17 3:15 AM, Jorge Esteban Quilcate Otoya wrote:
> > @Matthias about the point 9:
> >
> > What about keeping only the --topic option, and support this format:
> >
> > `--topic t1:0,1,2 --topic t2 --topic t3:2`
> >
> > In this case topics t1, t2, and t3 will be selected: topic t1 with
> > partitions 0,1 and 2; topic t2 with all its partitions; and topic t3,
> with
> > only partition 2.
> >
> > Jorge.
> >
> > El mar., 21 feb. 2017 a las 11:11, Jorge Esteban Quilcate Otoya (<
> > quilcate.jorge@gmail.com>) escribió:
> >
> >> Thanks for the feedback Matthias.
> >>
> >> * 1. You're right. I'll reorder the scenarios.
> >>
> >> * 2. Agree. I'll update the KIP.
> >>
> >> * 3. I like it, updating to `reset-offsets`
> >>
> >> * 4. Agree, removing the `reset-` part
> >>
> >> * 5. Yes, 1.e option without --execute or --export will print out the
> >> current offset and the new offset, which will be the same. The use-case
> >> of this option is mostly to use it in combination with --export and have
> >> a current 'checkpoint' to reset to later. I will add to the KIP what the
> >> output should look like.
> >>
> >> * 6. Considering 4., I will update it to `--to-offset`
> >>
> >> * 7. I like the idea to unify these options (plus, minus).
> >> `shift-offsets-by` is a good option, but I would like some more feedback
> >> here about the name. I will update the KIP in the meantime.
> >>
> >> * 8. Yes, discussed in 9.
> >>
> >> * 9. Agree. I'd love some feedback here. `topic` is already used by
> >> `delete`, and we can add `--all-topics` to consider all topics/partitions
> >> assigned to a group. How could we define specific topics/partitions?
> >>
> >> * 10. Haven't thought about it, but it makes sense.
> >> <topic>,<partition>,<offset> would be enough.
> >>
> >> * 11. Agree. Solved with 10.
> >>
> >> Also, I have a couple of changes to mention:
> >>
> >> 1. I have added a reference to the branch where I'm working on this KIP.
> >>
> >> 2. About the period scenario `--to-period`. I will change it to
> >> `--to-duration` given that duration (
> >> https://docs.oracle.com/javase/8/docs/api/java/time/Duration.html)
> >> follows this format: 'PnDTnHnMnS' and does not consider daylight saving
> >> effects.
> >>
> >>
> >>
> >> On Tue, Feb 21, 2017 at 2:47, Matthias J. Sax (<matthias@confluent.io>)
> >> wrote:
> >>
> >> Hi,
> >>
> >> thanks for updating the KIP. Couple of follow up comments:
> >>
> >> * Nit: Why is "Reset to Earliest" and "Reset to Latest" a "reset by
> >> time" option -- IMHO it belongs to "reset by position"?
> >>
> >>
> >> * Nit: Description of "Reset to Earliest"
> >>
> >>> using Kafka Consumer's `auto.offset.reset` to `earliest`
> >>
> >> I think this is strictly speaking not correct (as auto.offset.reset is only
> >> triggered if no valid offset is found, but this tool explicitly modifies
> >> the committed offset), and should be phrased as
> >>
> >>> using Kafka Consumer's #seekToBeginning()
> >>
> >> -> similar issue for description of "Reset to Latest"
> >>
> >>
> >> * Main option: rename to --reset-offsets (plural instead of singular)
> >>
> >>
> >> * Scenario Options: I would remove "reset" from all options, because the
> >> main argument "--reset-offset" says already what to do:
> >>
> >>> bin/kafka-consumer-groups.sh --reset-offset --reset-to-datetime XXX
> >>
> >> better (IMHO):
> >>
> >>> bin/kafka-consumer-groups.sh --reset-offsets --to-datetime XXX
> >>
> >>
> >>
> >> * Option 1.e ("print and export current offset") is not intuitive to use
> >> IMHO. The main option is "--reset-offset" but nothing happens if no
> >> scenario is specified. It is also not specified, what the output should
> >> look like?
> >>
> >> Furthermore, --describe should actually show currently committed offset
> >> for a group. So it seems to be redundant to have the same option in
> >> --reset-offsets
> >>
> >>
> >> * Option 2.a: I would rename to "--reset-to-offset" (or considering the
> >> comment above to "--to-offset")
> >>
> >>
> >> * Option 2.b and 2.c: I would unify to "--shift-offsets-by" (or similar)
> >> and accept positive/negative values
> >>
> >>
> >> * About Scope "all": maybe it's better to have an option "--all-topics"
> >> (or similar). IMHO explicit arguments are preferable over implicit
> >> settings to guard against accidental misuse of the tool.
> >>
> >>
> >> * Scope: I also think, that "--topic" (singular) and "--topics" (plural)
> >> are too similar and easy to use in a wrong way (ie, mix up) -- maybe we
> >> can have two options that are easier to distinguish.
> >>
> >>
> >> * I still think that JSON is not the best format (it's too verbose/hard
> >> to write for humans from scratch). A simple CSV format with implicit
> >> schema (topic,partition,offset) would be sufficient.
> >>
> >>
> >> * Why does the JSON contain "group_id" field -- there is parameter
> >> "--group" to specify the group ID. Would one overwrite the other (what
> >> order) or would there be an error if "--group" is used in combination
> >> with "--reset-from-file"?
> >>
> >>
> >>
> >> -Matthias
> >>
> >>
> >>
> >>
> >> On 2/17/17 6:43 AM, Jorge Esteban Quilcate Otoya wrote:
> >>> Hi,
> >>>
> >>> according to the feedback, I've updated the KIP:
> >>>
> >>> - We have added and ordered the scenarios, scopes and executions of the
> >>> Reset Offset tool.
> >>> - Consider it as an extension to the current `ConsumerGroupCommand`
> tool
> >>> - Execution will be possible without generating JSON files.
> >>>
> >>>
> >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-122%3A+Add+Reset+Consumer+Group+Offsets+tooling
> >>>
> >>> Looking forward to your feedback!
> >>>
> >>> Jorge.
> >>>
> >>> On Wed, Feb 8, 2017 at 23:23, Jorge Esteban Quilcate Otoya (<
> >>> quilcate.jorge@gmail.com>) wrote:
> >>>
> >>>> Great. I think I got the idea. What about these options:
> >>>>
> >>>> Scenarios:
> >>>>
> >>>> 1. Current status
> >>>>
> >>>> ´kafka-consumer-groups.sh --reset-offset --group cg1´
> >>>>
> >>>> 2. To Datetime
> >>>>
> >>>> ´kafka-consumer-groups.sh --reset-offset --group cg1
> --reset-to-datetime
> >>>> 2017-01-01T00:00:00.000´
> >>>>
> >>>> 3. To Period
> >>>>
> >>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-period
> >> P2D´
> >>>>
> >>>> 4. To Earliest
> >>>>
> >>>> ´kafka-consumer-groups.sh --reset-offset --group cg1
> >> --reset-to-earliest´
> >>>>
> >>>> 5. To Latest
> >>>>
> >>>> ´kafka-consumer-groups.sh --reset-offset --group cg1
> --reset-to-latest´
> >>>>
> >>>> 6. Minus 'n' offsets
> >>>>
> >>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-minus n´
> >>>>
> >>>> 7. Plus 'n' offsets
> >>>>
> >>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-plus n´
> >>>>
> >>>> 8. To specific offset
> >>>>
> >>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to x´
> >>>>
> >>>> Scopes:
> >>>>
> >>>> a. All topics used by Consumer Group
> >>>>
> >>>> Don't specify --topics
> >>>>
> >>>> b. Specific List of Topics
> >>>>
> >>>> Add list of values in --topics t1,t2,tn
> >>>>
> >>>> c. One Topic, all Partitions
> >>>>
> >>>> Add one topic and no partitions values: --topic t1
> >>>>
> >>>> d. One Topic, List of Partitions
> >>>>
> >>>> Add one topic and partitions values: --topic t1 --partitions 0,1,2
> >>>>
> >>>> About Reset Plan (JSON file):
> >>>>
> >>>> I think is still valid to have the option to persist reset
> configuration
> >>>> as a file, but I agree to give the option to run the tool without
> going
> >>>> down to the JSON file.
> >>>>
> >>>> Execution options:
> >>>>
> >>>> 1. Without execution argument (No args):
> >>>>
> >>>> Print out results (reset plan)
> >>>>
> >>>> 2. With --execute argument:
> >>>>
> >>>> Run reset process
> >>>>
> >>>> 3. With --output argument:
> >>>>
> >>>> Save result in a JSON format.
> >>>>
> >>>> 4. Only with --execute option and --reset-file (path to JSON)
> >>>>
> >>>> Reset based on file
> >>>>
> >>>> 5. Only with --verify option and --reset-file (path to JSON)
> >>>>
> >>>> Verify file values with current offsets
> >>>>
> >>>> I think we can remove --generate-and-execute because is a bit clumsy.
> >>>>
> >>>> With this options we will be able to execute with manual JSON
> >>>> configuration.
> >>>>
> >>>>
> >>>> El mié., 8 feb. 2017 a las 22:43, Ben Stopford (<be...@confluent.io>)
> >>>> escribió:
> >>>>
> >>>> Yes - using a tool like this to skip a set of consumer groups over a
> >>>> corrupt/bad message is definitely appealing.
> >>>>
> >>>> B
> >>>>
> >>>> On Wed, Feb 8, 2017 at 9:37 PM Gwen Shapira <gw...@confluent.io>
> wrote:
> >>>>
> >>>>> I like the --reset-to-earliest and --reset-to-latest. In general,
> >>>>> since the JSON route is the most challenging for users, we want to
> >>>>> provide a lot of ways to do useful things without going there.
> >>>>>
> >>>>> Two things that can help:
> >>>>>
> >>>>> 1. A lot of times, users want to skip few messages that cause issues
> >>>>> and continue. maybe just specifying the topic, partition and delta
> >>>>> will be better than having to find the offset and write a JSON and
> >>>>> validate the JSON etc.
> >>>>>
> >>>>> 2. Thinking if there are other common use-cases that we can make easy
> >>>>> rather than just one generic but not very usable method.
> >>>>>
> >>>>> Gwen
> >>>>>
> >>>>> On Wed, Feb 8, 2017 at 3:25 AM, Jorge Esteban Quilcate Otoya
> >>>>> <qu...@gmail.com> wrote:
> >>>>>> Thanks for the feedback!
> >>>>>>
> >>>>>> @Onur, @Gwen:
> >>>>>>
> >>>>>> Agree. Actually at the first draft I considered to have it inside
> >>>>>> ´kafka-consumer-groups.sh´, but I decide to propose it as a
> standalone
> >>>>> tool
> >>>>>> to describe it clearly and focus it on reset functionality.
> >>>>>>
> >>>>>> But now that you mentioned, it does make sense to have it in
> >>>>>> ´kafka-consumer-groups.sh´. How would be a consistent way to
> introduce
> >>>>> it?
> >>>>>>
> >>>>>> Maybe something like this:
> >>>>>>
> >>>>>> ´kafka-consumer-groups.sh --reset-offset --generate --group cg1
> >>>> --topics
> >>>>> t1
> >>>>>> --reset-from 2017-01-01T00:00:00.000 --output plan.json´
> >>>>>>
> >>>>>> ´kafka-consumer-groups.sh --reset-offset --verify --reset-json-file
> >>>>>> plan.json´
> >>>>>>
> >>>>>> ´kafka-consumer-groups.sh --reset-offset --execute --reset-json-file
> >>>>>> plan.json´
> >>>>>>
> >>>>>> ´kafka-consumer-groups.sh --reset-offset --generate-and-execute
> >> --group
> >>>>> cg1
> >>>>>> --topics t1 --reset-from 2017-01-01T00:00:00.000´
> >>>>>>
> >>>>>> @Gwen:
> >>>>>>
> >>>>>>> It looks exactly like the replica assignment tool
> >>>>>>
> >>>>>> It was influenced by ;-) I use the generate-verify-execute process
> >> here
> >>>>> to
> >>>>>> make sure user will be aware of the result of this operation. At the
> >>>>>> beginning we considered only add a couple of options to Consumer
> Group
> >>>>>> Command:
> >>>>>>
> >>>>>> --rewind-to-timestamp and --rewind-to-period
> >>>>>>
> >>>>>> @Onur:
> >>>>>>
> >>>>>>> You can actually get away with overriding while members of the
> group
> >>>>> are live
> >>>>>> with method 2 by using group information from DescribeGroupsRequest.
> >>>>>>
> >>>>>> This means that we need to have Consumer Group stopped before
> >> executing
> >>>>> and
> >>>>>> start a new consumer internally to do this? Therefore, we won't be
> >> able
> >>>>> to
> >>>>>> consider executing reset when ConsumerGroup is active? (trying to
> >>>> relate
> >>>>> it
> >>>>>> with @Dong 5th question)
> >>>>>>
> >>>>>> @Dong:
> >>>>>>
> >>>>>>> Should we allow user to use wildcard to reset offset of all groups
> >>>> for a
> >>>>>> given topic as well?
> >>>>>>
> >>>>>> I haven't thought about this scenario. Could be interesting.
> Following
> >>>>> the
> >>>>>> recommendation to add it into Consumer Group Command, in this case
> >>>> Group
> >>>>>> argument will be optional if there are only 1 topic. I think for
> >>>> multiple
> >>>>>> topic won't be that useful.
> >>>>>>
> >>>>>>> Should we allow user to specify timestamp per topic partition in
> the
> >>>>> json
> >>>>>> file as well?
> >>>>>>
> >>>>>> Don't think this could be a valid from the tool, but if Reset Plan
> is
> >>>>>> generated, and user want to set the offset for a specific partition
> to
> >>>>>> other offset (eventually based on another timestamp), and execute
> it,
> >>>> it
> >>>>>> will be up to her/him.
> >>>>>>
> >>>>>>> Should the script take some credential file to make sure that this
> >>>>>> operation is authenticated given the potential impact of this
> >>>> operation?
> >>>>>>
> >>>>>> Haven't tried to secure brokers yet, but the tool should support
> >>>>>> authorization if it's enabled in the broker.
> >>>>>>
> >>>>>>> Should we provide constant to reset committed offset to
> >>>> earliest/latest
> >>>>>> offset of a partition, e.g. -1 indicates earliest offset and -2
> >>>> indicates
> >>>>>> latest offset.
> >>>>>>
> >>>>>> I will go for something like ´--reset-to-earliest´ and
> >>>>> ´--reset-to-latest´
> >>>>>>
> >>>>>>> Should we allow dynamic change of the comitted offset when consumer
> >>>> are
> >>>>>> running, such that consumer will seek to the newly committed offset
> >> and
> >>>>>> start consuming from there?
> >>>>>>
> >>>>>> Not sure about this. I will recommend to keep it simple and ask user
> >> to
> >>>>>> stop consumers first. But I would considered it if the trade-offs
> are
> >>>>>> clear.
> >>>>>>
> >>>>>> @Matthias
> >>>>>>
> >>>>>> Added :). And thanks a lot for your help to define this KIP!
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> El mié., 8 feb. 2017 a las 7:47, Gwen Shapira (<gw...@confluent.io>)
> >>>>>> escribió:
> >>>>>>
> >>>>>>> As long as the CLI is a bit consistent? Like, not just adding 3
> >>>>>>> arguments and a JSON parser to the existing tool, right?
> >>>>>>>
> >>>>>>> On Tue, Feb 7, 2017 at 10:29 PM, Onur Karaman
> >>>>>>> <on...@gmail.com> wrote:
> >>>>>>>> I think it makes sense to just add the feature to
> >>>>>>> kafka-consumer-groups.sh
> >>>>>>>>
> >>>>>>>> On Tue, Feb 7, 2017 at 10:24 PM, Gwen Shapira <gw...@confluent.io>
> >>>>> wrote:
> >>>>>>>>
> >>>>>>>>> Thanks for the KIP. I'm super happy about adding the capability.
> >>>>>>>>>
> >>>>>>>>> I hate the interface, though. It looks exactly like the replica
> >>>>>>>>> assignment tool. A tool everyone loves so much that there are
> >>>>> multiple
> >>>>>>>>> projects, open and closed, that try to fix it.
> >>>>>>>>>
> >>>>>>>>> Can we swap it with something that looks a bit more like the
> >>>> consumer
> >>>>>>>>> group tool? or the kafka streams reset tool? Consistency is
> helpful
> >>>>> in
> >>>>>>>>> such cases. I spent some time learning existing tools and
> learning
> >>>>> yet
> >>>>>>>>> another one is a deterrent.
> >>>>>>>>>
> >>>>>>>>> Gwen
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On Tue, Feb 7, 2017 at 6:43 PM, Jorge Esteban Quilcate Otoya
> >>>>>>>>> <qu...@gmail.com> wrote:
> >>>>>>>>>> Hi all,
> >>>>>>>>>>
> >>>>>>>>>> I would like to propose a KIP to Add a tool to Reset Consumer
> >>>> Group
> >>>>>>>>> Offsets.
> >>>>>>>>>>
> >>>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> >>>>>>>>> 122%3A+Add+a+tool+to+Reset+Consumer+Group+Offsets
> >>>>>>>>>>
> >>>>>>>>>> Please, take a look at the proposal and share your feedback.
> >>>>>>>>>>
> >>>>>>>>>> Thanks,
> >>>>>>>>>> Jorge.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> --
> >>>>>>>>> Gwen Shapira
> >>>>>>>>> Product Manager | Confluent
> >>>>>>>>> 650.450.2760 | @gwenshap
> >>>>>>>>> Follow us: Twitter | blog
> >>>>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> --
> >>>>>>> Gwen Shapira
> >>>>>>> Product Manager | Confluent
> >>>>>>> 650.450.2760 | @gwenshap
> >>>>>>> Follow us: Twitter | blog
> >>>>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>> --
> >>>>> Gwen Shapira
> >>>>> Product Manager | Confluent
> >>>>> 650.450.2760 | @gwenshap
> >>>>> Follow us: Twitter | blog
> >>>>>
> >>>>
> >>>>
> >>>
> >>
> >>
> >
>
>
>
>
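
As an illustration of the `--to-duration` scenario quoted above, here is a
minimal Python sketch of how a 'PnDTnHnMnS' value might be turned into a
target timestamp. The function names and the regular expression are
assumptions for illustration only (the tool itself would presumably rely on
java.time.Duration), and only non-negative whole-number components are
handled.

```python
import re
from datetime import datetime, timedelta

# Simplified pattern for 'PnDTnHnMnS' (assumption: no signs, no fractions).
_DURATION = re.compile(
    r"^P(?:(?P<days>\d+)D)?"
    r"(?:T(?=\d)(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?)?$"
)


def parse_duration(text):
    """Convert a 'PnDTnHnMnS' string into a timedelta."""
    match = _DURATION.match(text)
    if not match or not any(match.groupdict().values()):
        raise ValueError("not a PnDTnHnMnS duration: %r" % text)
    parts = {name: int(value) for name, value in match.groupdict().items() if value}
    return timedelta(**parts)


def reset_timestamp(now, duration_text):
    """Timestamp to look offsets up for: 'now' minus the given duration."""
    return now - parse_duration(duration_text)


target = reset_timestamp(datetime(2017, 2, 23, 10, 0, 0), "P2D")
print(target.isoformat())  # 2017-02-21T10:00:00
```

Because a duration is a fixed amount of time (a day is always 24 hours), the
result does not depend on daylight saving changes, which is the reason given
above for preferring it over a period.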

Fwd: Re: KIP-122: Add a tool to Reset Consumer Group Offsets

Posted by "Matthias J. Sax" <ma...@confluent.io>.
\cc from dev


-------- Forwarded Message --------
Subject: Re: KIP-122: Add a tool to Reset Consumer Group Offsets
Date: Thu, 23 Feb 2017 10:13:39 -0800
From: Matthias J. Sax <ma...@confluent.io>
Organization: Confluent Inc
To: dev@kafka.apache.org

So you suggest merging the "scope options" --topics, --topic, and
--partitions into a single option? Sounds good to me.

I like the compact way to express it, ie, topicname:list-of-partitions
with "all partitions" if no partitions are specified. It's quite
intuitive to use.

Just wondering if we could get rid of the repeated --topic option; it's
somewhat verbose. I have no good idea how to improve it, though.

If you concatenate multiple topics, we need one more character that is
not allowed in topic names to separate the topics:

> invalidChars = {'/', '\\', ',', '\u0000', ':', '"', '\'', ';', '*', '?', ' ', '\t', '\r', '\n', '='};

maybe

--topics t1=1,2,3:t2:t3=3

use '=' to specify partitions (instead of ':' as you proposed) and ':'
to separate topics? All other characters seem to be worse to use to me.
But maybe you have a better idea.
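
A minimal sketch of how the compact form suggested above might be parsed, in
Python for brevity. `parse_topics` is a hypothetical helper, not part of any
Kafka tool; it assumes ':' separates topics and '=' introduces a
comma-separated partition list, with no partition list meaning all
partitions.

```python
def parse_topics(spec):
    """Parse 't1=1,2,3:t2:t3=3' into {topic: [partitions] or None}.

    None stands for "all partitions" of that topic.
    """
    selection = {}
    for entry in spec.split(":"):
        topic, sep, partitions = entry.partition("=")
        if not topic:
            raise ValueError("missing topic name in %r" % spec)
        if sep and not partitions:
            raise ValueError("missing partition list for topic %r" % topic)
        # int() rejects anything that is not a whole number.
        selection[topic] = [int(p) for p in partitions.split(",")] if sep else None
    return selection


print(parse_topics("t1=1,2,3:t2:t3=3"))
# {'t1': [1, 2, 3], 't2': None, 't3': [3]}
```

Since neither ':' nor '=' can appear in a topic name, splitting on them is
unambiguous.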



-Matthias


On 2/23/17 3:15 AM, Jorge Esteban Quilcate Otoya wrote:
> @Matthias about the point 9:
> 
> What about keeping only the --topic option, and support this format:
> 
> `--topic t1:0,1,2 --topic t2 --topic t3:2`
> 
> In this case topics t1, t2, and t3 will be selected: topic t1 with
> partitions 0,1 and 2; topic t2 with all its partitions; and topic t3, with
> only partition 2.
> 
> Jorge.
> 
> On Tue, Feb 21, 2017 at 11:11, Jorge Esteban Quilcate Otoya (<
> quilcate.jorge@gmail.com>) wrote:
> 
>> Thanks for the feedback Matthias.
>>
>> * 1. You're right. I'll reorder the scenarios.
>>
>> * 2. Agree. I'll update the KIP.
>>
>> * 3. I like it, updating to `reset-offsets`
>>
>> * 4. Agree, removing the `reset-` part
>>
>> * 5. Yes, 1.e option without --execute or --export will print out the
>> current offset and the new offset, which will be the same. The use-case of
>> this option is mostly to use it in combination with --export and have a
>> current 'checkpoint' to reset to later. I will add to the KIP what the
>> output should look like.
>>
>> * 6. Considering 4., I will update it to `--to-offset`
>>
>> * 7. I like the idea to unify these options (plus, minus).
>> `shift-offsets-by` is a good option, but I would like some more feedback
>> here about the name. I will update the KIP in the meantime.
>>
>> * 8. Yes, discussed in 9.
>>
>> * 9. Agree. I'd love some feedback here. `topic` is already used by
>> `delete`, and we can add `--all-topics` to consider all topics/partitions
>> assigned to a group. How could we define specific topics/partitions?
>>
>> * 10. Haven't thought about it, but it makes sense.
>> <topic>,<partition>,<offset> would be enough.
>>
>> * 11. Agree. Solved with 10.
>>
>> Also, I have a couple of changes to mention:
>>
>> 1. I have added a reference to the branch where I'm working on this KIP.
>>
>> 2. About the period scenario `--to-period`. I will change it to
>> `--to-duration` given that duration (
>> https://docs.oracle.com/javase/8/docs/api/java/time/Duration.html)
>> follows this format: 'PnDTnHnMnS' and does not consider daylight saving
>> effects.
>>
>>
>>
>> On Tue, Feb 21, 2017 at 2:47, Matthias J. Sax (<ma...@confluent.io>)
>> wrote:
>>
>> Hi,
>>
>> thanks for updating the KIP. Couple of follow up comments:
>>
>> * Nit: Why is "Reset to Earliest" and "Reset to Latest" a "reset by
>> time" option -- IMHO it belongs to "reset by position"?
>>
>>
>> * Nit: Description of "Reset to Earliest"
>>
>>> using Kafka Consumer's `auto.offset.reset` to `earliest`
>>
>> I think this is strictly speaking not correct (as auto.offset.reset is only
>> triggered if no valid offset is found, but this tool explicitly modifies
>> the committed offset), and should be phrased as
>>
>>> using Kafka Consumer's #seekToBeginning()
>>
>> -> similar issue for description of "Reset to Latest"
>>
>>
>> * Main option: rename to --reset-offsets (plural instead of singular)
>>
>>
>> * Scenario Options: I would remove "reset" from all options, because the
>> main argument "--reset-offset" says already what to do:
>>
>>> bin/kafka-consumer-groups.sh --reset-offset --reset-to-datetime XXX
>>
>> better (IMHO):
>>
>>> bin/kafka-consumer-groups.sh --reset-offsets --to-datetime XXX
>>
>>
>>
>> * Option 1.e ("print and export current offset") is not intuitive to use
>> IMHO. The main option is "--reset-offset" but nothing happens if no
>> scenario is specified. It is also not specified, what the output should
>> look like?
>>
>> Furthermore, --describe should actually show currently committed offset
>> for a group. So it seems to be redundant to have the same option in
>> --reset-offsets
>>
>>
>> * Option 2.a: I would rename to "--reset-to-offset" (or considering the
>> comment above to "--to-offset")
>>
>>
>> * Option 2.b and 2.c: I would unify to "--shift-offsets-by" (or similar)
>> and accept positive/negative values
>>
>>
>> * About Scope "all": maybe it's better to have an option "--all-topics"
>> (or similar). IMHO explicit arguments are preferable over implicit
>> settings to guard against accidental misuse of the tool.
>>
>>
>> * Scope: I also think, that "--topic" (singular) and "--topics" (plural)
>> are too similar and easy to use in a wrong way (ie, mix up) -- maybe we
>> can have two options that are easier to distinguish.
>>
>>
>> * I still think that JSON is not the best format (it's too verbose/hard
>> to write for humans from scratch). A simple CSV format with implicit
>> schema (topic,partition,offset) would be sufficient.
>>
>>
>> * Why does the JSON contain "group_id" field -- there is parameter
>> "--group" to specify the group ID. Would one overwrite the other (what
>> order) or would there be an error if "--group" is used in combination
>> with "--reset-from-file"?
>>
>>
>>
>> -Matthias
>>
>>
>>
>>
>> On 2/17/17 6:43 AM, Jorge Esteban Quilcate Otoya wrote:
>>> Hi,
>>>
>>> according to the feedback, I've updated the KIP:
>>>
>>> - We have added and ordered the scenarios, scopes and executions of the
>>> Reset Offset tool.
>>> - Consider it as an extension to the current `ConsumerGroupCommand` tool
>>> - Execution will be possible without generating JSON files.
>>>
>>>
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-122%3A+Add+Reset+Consumer+Group+Offsets+tooling
>>>
>>> Looking forward to your feedback!
>>>
>>> Jorge.
>>>
>>> On Wed, Feb 8, 2017 at 23:23, Jorge Esteban Quilcate Otoya (<
>>> quilcate.jorge@gmail.com>) wrote:
>>>
>>>> Great. I think I got the idea. What about these options:
>>>>
>>>> Scenarios:
>>>>
>>>> 1. Current status
>>>>
>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1´
>>>>
>>>> 2. To Datetime
>>>>
>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-datetime
>>>> 2017-01-01T00:00:00.000´
>>>>
>>>> 3. To Period
>>>>
>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-period P2D´
>>>>
>>>> 4. To Earliest
>>>>
>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-earliest´
>>>>
>>>> 5. To Latest
>>>>
>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-latest´
>>>>
>>>> 6. Minus 'n' offsets
>>>>
>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-minus n´
>>>>
>>>> 7. Plus 'n' offsets
>>>>
>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-plus n´
>>>>
>>>> 8. To specific offset
>>>>
>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to x´
>>>>
>>>> Scopes:
>>>>
>>>> a. All topics used by Consumer Group
>>>>
>>>> Don't specify --topics
>>>>
>>>> b. Specific List of Topics
>>>>
>>>> Add list of values in --topics t1,t2,tn
>>>>
>>>> c. One Topic, all Partitions
>>>>
>>>> Add one topic and no partitions values: --topic t1
>>>>
>>>> d. One Topic, List of Partitions
>>>>
>>>> Add one topic and partitions values: --topic t1 --partitions 0,1,2
>>>>
>>>> About Reset Plan (JSON file):
>>>>
>>>> I think it is still valid to have the option to persist the reset
>>>> configuration as a file, but I agree to give the option to run the tool
>>>> without going down to the JSON file.
>>>>
>>>> Execution options:
>>>>
>>>> 1. Without execution argument (No args):
>>>>
>>>> Print out results (reset plan)
>>>>
>>>> 2. With --execute argument:
>>>>
>>>> Run reset process
>>>>
>>>> 3. With --output argument:
>>>>
>>>> Save result in a JSON format.
>>>>
>>>> 4. Only with --execute option and --reset-file (path to JSON)
>>>>
>>>> Reset based on file
>>>>
>>>> 5. Only with --verify option and --reset-file (path to JSON)
>>>>
>>>> Verify file values with current offsets
>>>>
>>>> I think we can remove --generate-and-execute because it is a bit clumsy.
>>>>
>>>> With these options we will be able to execute with manual JSON
>>>> configuration.
>>>>
>>>>
>>>> On Wed, Feb 8, 2017 at 22:43, Ben Stopford (<be...@confluent.io>)
>>>> wrote:
>>>>
>>>> Yes - using a tool like this to skip a set of consumer groups over a
>>>> corrupt/bad message is definitely appealing.
>>>>
>>>> B
>>>>
>>>> On Wed, Feb 8, 2017 at 9:37 PM Gwen Shapira <gw...@confluent.io> wrote:
>>>>
>>>>> I like the --reset-to-earliest and --reset-to-latest. In general,
>>>>> since the JSON route is the most challenging for users, we want to
>>>>> provide a lot of ways to do useful things without going there.
>>>>>
>>>>> Two things that can help:
>>>>>
>>>>> 1. A lot of times, users want to skip few messages that cause issues
>>>>> and continue. maybe just specifying the topic, partition and delta
>>>>> will be better than having to find the offset and write a JSON and
>>>>> validate the JSON etc.
>>>>>
>>>>> 2. Thinking if there are other common use-cases that we can make easy
>>>>> rather than just one generic but not very usable method.
>>>>>
>>>>> Gwen
>>>>>
>>>>> On Wed, Feb 8, 2017 at 3:25 AM, Jorge Esteban Quilcate Otoya
>>>>> <qu...@gmail.com> wrote:
>>>>>> Thanks for the feedback!
>>>>>>
>>>>>> @Onur, @Gwen:
>>>>>>
>>>>>> Agree. Actually, in the first draft I considered having it inside
>>>>>> ´kafka-consumer-groups.sh´, but I decided to propose it as a standalone
>>>>>> tool to describe it clearly and focus it on reset functionality.
>>>>>>
>>>>>> But now that you mention it, it does make sense to have it in
>>>>>> ´kafka-consumer-groups.sh´. What would be a consistent way to introduce
>>>>>> it?
>>>>>>
>>>>>> Maybe something like this:
>>>>>>
>>>>>> ´kafka-consumer-groups.sh --reset-offset --generate --group cg1 --topics t1 --reset-from 2017-01-01T00:00:00.000 --output plan.json´
>>>>>>
>>>>>> ´kafka-consumer-groups.sh --reset-offset --verify --reset-json-file plan.json´
>>>>>>
>>>>>> ´kafka-consumer-groups.sh --reset-offset --execute --reset-json-file plan.json´
>>>>>>
>>>>>> ´kafka-consumer-groups.sh --reset-offset --generate-and-execute --group cg1 --topics t1 --reset-from 2017-01-01T00:00:00.000´
>>>>>>
>>>>>> @Gwen:
>>>>>>
>>>>>>> It looks exactly like the replica assignment tool
>>>>>>
>>>>>> It was influenced by it ;-) I use the generate-verify-execute process
>>>>>> here to make sure the user will be aware of the result of this
>>>>>> operation. At the beginning we considered only adding a couple of
>>>>>> options to Consumer Group Command:
>>>>>>
>>>>>> --rewind-to-timestamp and --rewind-to-period
>>>>>>
>>>>>> @Onur:
>>>>>>
>>>>>>> You can actually get away with overriding while members of the group
>>>>>>> are live with method 2 by using group information from
>>>>>>> DescribeGroupsRequest.
>>>>>>
>>>>>> Does this mean that we need to have the Consumer Group stopped before
>>>>>> executing, and start a new consumer internally to do this? Therefore, we
>>>>>> won't be able to consider executing a reset when the ConsumerGroup is
>>>>>> active? (trying to relate it to @Dong's 5th question)
>>>>>>
>>>>>> @Dong:
>>>>>>
>>>>>>> Should we allow user to use wildcard to reset offset of all groups
>>>>>>> for a given topic as well?
>>>>>>
>>>>>> I haven't thought about this scenario. Could be interesting. Following
>>>>>> the recommendation to add it into Consumer Group Command, in this case
>>>>>> the Group argument would be optional if there is only 1 topic. I think
>>>>>> for multiple topics it won't be that useful.
>>>>>>
>>>>>>> Should we allow user to specify timestamp per topic partition in the
>>>>>>> json file as well?
>>>>>>
>>>>>> I don't think this would be valid from the tool, but if a Reset Plan is
>>>>>> generated, and the user wants to set the offset for a specific partition
>>>>>> to another offset (possibly based on another timestamp), and execute it,
>>>>>> it will be up to her/him.
>>>>>>
>>>>>>> Should the script take some credential file to make sure that this
>>>>>>> operation is authenticated given the potential impact of this
>>>>>>> operation?
>>>>>>
>>>>>> Haven't tried to secure brokers yet, but the tool should support
>>>>>> authorization if it's enabled in the broker.
>>>>>>
>>>>>>> Should we provide constant to reset committed offset to
>>>>>>> earliest/latest offset of a partition, e.g. -1 indicates earliest
>>>>>>> offset and -2 indicates latest offset.
>>>>>>
>>>>>> I will go for something like ´--reset-to-earliest´ and
>>>>>> ´--reset-to-latest´
>>>>>>
>>>>>>> Should we allow dynamic change of the committed offset when consumers
>>>>>>> are running, such that consumer will seek to the newly committed offset
>>>>>>> and start consuming from there?
>>>>>>
>>>>>> Not sure about this. I would recommend keeping it simple and asking the
>>>>>> user to stop consumers first. But I would consider it if the trade-offs
>>>>>> are clear.
>>>>>>
>>>>>> @Matthias
>>>>>>
>>>>>> Added :). And thanks a lot for your help to define this KIP!
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Wed, Feb 8, 2017 at 7:47, Gwen Shapira (<gw...@confluent.io>)
>>>>>> wrote:
>>>>>>
>>>>>>> As long as the CLI is a bit consistent? Like, not just adding 3
>>>>>>> arguments and a JSON parser to the existing tool, right?
>>>>>>>
>>>>>>> On Tue, Feb 7, 2017 at 10:29 PM, Onur Karaman
>>>>>>> <on...@gmail.com> wrote:
>>>>>>>> I think it makes sense to just add the feature to
>>>>>>>> kafka-consumer-groups.sh
>>>>>>>>
>>>>>>>> On Tue, Feb 7, 2017 at 10:24 PM, Gwen Shapira <gw...@confluent.io> wrote:
>>>>>>>>
>>>>>>>>> Thanks for the KIP. I'm super happy about adding the capability.
>>>>>>>>>
>>>>>>>>> I hate the interface, though. It looks exactly like the replica
>>>>>>>>> assignment tool. A tool everyone loves so much that there are
>>>>>>>>> multiple projects, open and closed, that try to fix it.
>>>>>>>>>
>>>>>>>>> Can we swap it with something that looks a bit more like the
>>>>>>>>> consumer group tool? or the kafka streams reset tool? Consistency is
>>>>>>>>> helpful in such cases. I spent some time learning existing tools and
>>>>>>>>> learning yet another one is a deterrent.
>>>>>>>>>
>>>>>>>>> Gwen
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Tue, Feb 7, 2017 at 6:43 PM, Jorge Esteban Quilcate Otoya
>>>>>>>>> <qu...@gmail.com> wrote:
>>>>>>>>>> Hi all,
>>>>>>>>>>
>>>>>>>>>> I would like to propose a KIP to Add a tool to Reset Consumer Group
>>>>>>>>>> Offsets.
>>>>>>>>>>
>>>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-122%3A+Add+a+tool+to+Reset+Consumer+Group+Offsets
>>>>>>>>>>
>>>>>>>>>> Please, take a look at the proposal and share your feedback.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Jorge.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Gwen Shapira
>>>>>>>>> Product Manager | Confluent
>>>>>>>>> 650.450.2760 | @gwenshap
>>>>>>>>> Follow us: Twitter | blog
>>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Gwen Shapira
>>>>>>> Product Manager | Confluent
>>>>>>> 650.450.2760 | @gwenshap
>>>>>>> Follow us: Twitter | blog
>>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Gwen Shapira
>>>>> Product Manager | Confluent
>>>>> 650.450.2760 | @gwenshap
>>>>> Follow us: Twitter | blog
>>>>>
>>>>
>>>>
>>>
>>
>>
> 
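
The CSV reset plan with implicit schema (topic,partition,offset) and the
unified shift option discussed in the thread above could look roughly like
this in Python. All function names are hypothetical, and clamping the shifted
offset to the partition's earliest/latest offsets is an assumption, not
something the thread specifies.

```python
import csv
import io


def read_plan(text):
    """Parse 'topic,partition,offset' lines into {(topic, partition): offset}."""
    plan = {}
    for row in csv.reader(io.StringIO(text)):
        if not row:  # skip blank lines
            continue
        topic, partition, offset = row
        plan[(topic, int(partition))] = int(offset)
    return plan


def shift_offsets(plan, delta, earliest, latest):
    """Shift every offset by delta (positive or negative).

    Assumption: results are clamped to [earliest, latest] per partition.
    """
    return {
        tp: max(earliest[tp], min(latest[tp], offset + delta))
        for tp, offset in plan.items()
    }


def write_plan(plan):
    """Serialize a plan back to 'topic,partition,offset' lines."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for (topic, partition), offset in sorted(plan.items()):
        writer.writerow([topic, partition, offset])
    return out.getvalue()


current = read_plan("t1,0,100\nt1,1,5\n")
earliest = {("t1", 0): 0, ("t1", 1): 0}
latest = {("t1", 0): 120, ("t1", 1): 50}
print(write_plan(shift_offsets(current, -10, earliest, latest)), end="")
# t1,0,90
# t1,1,0
```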




Re: KIP-122: Add a tool to Reset Consumer Group Offsets

Posted by Jorge Esteban Quilcate Otoya <qu...@gmail.com>.
Hi,

I have incorporated the latest revisions into the KIP and created a PR to
check the implementation details.
If there are no more issues, the VOTE thread has already started.

Looking forward to your comments.

Jorge.

On Tue, Feb 28, 2017 at 19:46, Vahid S Hashemian (<
vahidhashemian@us.ibm.com>) wrote:

Thanks Jorge for addressing my suggestions. Looks good to me.

--Vahid



From:   Jorge Esteban Quilcate Otoya <qu...@gmail.com>
To:     dev@kafka.apache.org
Date:   02/27/2017 01:57 AM
Subject:        Re: KIP-122: Add a tool to Reset Consumer Group Offsets



@Vahid: it makes sense to add "new lag" info IMO, I will update the KIP.
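
For reference, lag is the partition's log end offset minus the group's
committed offset, so the "new lag" follows directly from the planned offset.
A small Python sketch, where the function and the CURRENT-LAG / NEW-LAG field
names are illustrative assumptions rather than the tool's actual output:

```python
def plan_row(topic, partition, log_end_offset, current_offset, new_offset):
    """Build one row of a reset plan, with lag before and after the reset."""
    return {
        "TOPIC": topic,
        "PARTITION": partition,
        "CURRENT-OFFSET": current_offset,
        "NEW-OFFSET": new_offset,
        "CURRENT-LAG": log_end_offset - current_offset,
        "NEW-LAG": log_end_offset - new_offset,
    }


# A fully caught-up group (lag 0) rewound by 30 messages.
row = plan_row("t1", 0, log_end_offset=120, current_offset=120, new_offset=90)
print(row["CURRENT-LAG"], row["NEW-LAG"])  # 0 30
```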

@Becket:

1. About deleting, I think ConsumerGroupCommand already has an option to
delete Group information by topic. From the delete docs: "Pass in groups to
delete topic partition offsets and ownership information over the entire
consumer group.". Let me know if this is enough for your case, or we can
consider adding something to the Reset Offsets tool.

2. Yes, for instance in the case of active consumers, the tool will
validate that there are no active consumers to avoid race conditions. I
have added some code snippets to the wiki, thanks for pointing that out.

On Sat, Feb 25, 2017 at 0:29, Becket Qin (<be...@gmail.com>)
wrote:

> Thanks for the KIP Jorge. I think this is a useful KIP. I haven't read the
> KIP in detail yet, some comments from a quick review:
>
> 1. At a glance, it seems that there is no delete option. At LinkedIn we
> identified some cases where users want to delete the committed offset of a
> group. It would be good to include that as well.
>
> 2. It seems the KIP is missing some necessary implementation key points,
> e.g. how would the tool commit offsets for a consumer group, and does the
> broker need to know this is a special tool instead of an active consumer in
> the group (the generation check will be made on offset commit)? They are
> probably in your proof of concept code. Could you add them to the wiki as
> well?
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
> On Fri, Feb 24, 2017 at 1:19 PM, Vahid S Hashemian <
> vahidhashemian@us.ibm.com> wrote:
>
> > Thanks Jorge for addressing my question/suggestion.
> >
> > One last thing: I noticed that in the example you have for the "plan"
> > option
> > (
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-122%3A+Add+Reset+Consumer+Group+Offsets+tooling#KIP-122:AddResetConsumerGroupOffsetstooling-ExecutionOptions
> > )
> > under the "Description" column, you put 0 for lag. So I assume that is the
> > current lag being reported, and not the new lag. It might be helpful to
> > explicitly specify that (i.e. CURRENT-LAG) in the column header.
> > The other option is to report both current and new lags, but I understand
> > if we don't want to do that since it's rather redundant info.
> >
> > Thanks again.
> > --Vahid
> >
> >
> >
> > From:   Jorge Esteban Quilcate Otoya <qu...@gmail.com>
> > To:     dev@kafka.apache.org
> > Date:   02/24/2017 12:47 PM
> > Subject:        Re: KIP-122: Add a tool to Reset Consumer Group Offsets
> >
> >
> >
> > Hi Vahid,
> >
> > Thanks for your comments. Check my answers below:
> >
> > On Fri, Feb 24, 2017 at 19:41, Vahid S Hashemian (<
> > vahidhashemian@us.ibm.com>) wrote:
> >
> > > Hi Jorge,
> > >
> > > Thanks for the useful KIP.
> > >
> > > I have a question regarding the proposed "plan" option.
> > > The "current offset" and "lag" values of a topic partition are meaningful
> > > within a consumer group. In other words, different consumer groups could
> > > have different values for these properties of each topic partition.
> > > I don't see that reflected in the discussion around the "plan" option,
> > > unless we are assuming a "--group" option is also provided by the user
> > > (it is not clear from the KIP whether that is the case).
> > >
> >
> > I have added a comment stating that these options will require
> > a "group" argument.
> > The tool is intended to affect only one Consumer Group at a time.
> >
> >
> > >
> > > Also, I was wondering if you can provide at least one full command example
> > > for each of the "plan", "execute", and "export" options. They would
> > > definitely help in understanding some of the details.
> > >
> > >
> > Added to the KIP.
> >
> >
> > > Sorry for the delayed question/suggestion. I hope they make sense.
> > >
> > > Thanks.
> > > --Vahid
> > >
> > >
> > >
> > > From:   Jorge Esteban Quilcate Otoya <qu...@gmail.com>
> > > To:     dev@kafka.apache.org
> > > Date:   02/24/2017 09:51 AM
> > > Subject:        Re: KIP-122: Add a tool to Reset Consumer Group Offsets
> > >
> > >
> > >
> > > Great! KIP updated.
> > >
> > >
> > >
> > > On Fri, Feb 24, 2017 at 18:22, Matthias J. Sax
> > > (<ma...@confluent.io>)
> > > wrote:
> > >
> > > > I like this!
> > > >
> > > > --by-duration and --shift-by
> > > >
> > > >
> > > > -Matthias
> > > >
> > > > On 2/24/17 12:57 AM, Jorge Esteban Quilcate Otoya wrote:
> > > > > Renaming to --by-duration LGTM
> > > > >
> > > > > > Not sure about changing it to --shift-by-duration because we could end up
> > > > > > with the same redundancy as before with reset: --reset-offsets
> > > > > > --reset-to-*.
> > > > > >
> > > > > > Maybe changing --shift-offset-by to --shift-by 'n' could make it consistent
> > > > > > enough?
> > > > >
> > > > >
> > > > > > On Fri, Feb 24, 2017 at 6:39, Matthias J. Sax (<matthias@confluent.io>)
> > > > > > wrote:
> > > > >
> > > > > >> I just read the updated KIP once more.
> > > > > >>
> > > > > >> I would suggest renaming --to-duration to --by-duration
> > > > > >>
> > > > > >> Or, as a second idea, rename --to-duration to --shift-by-duration and at
> > > > > >> the same time rename --shift-offset-by to --shift-by-offset
> > > > > >>
> > > > > >> Not sure what the best option is, but naming would be more consistent IMHO.
> > > > >>
> > > > >>
> > > > >>
> > > > >> -Matthias
> > > > >>
> > > > >> On 2/23/17 4:42 PM, Jorge Esteban Quilcate Otoya wrote:
> > > > >>> Hi All,
> > > > >>>
> > > > > >>> If there are no more concerns, I'd like to start the vote for this KIP.
> > > > >>>
> > > > >>> Thanks!
> > > > >>> Jorge.
> > > > >>>
> > > > > >>> On Thu, Feb 23, 2017 at 22:50, Jorge Esteban Quilcate Otoya (<
> > > > > >>> quilcate.jorge@gmail.com>) wrote:
> > > > >>>
> > > > >>>> Oh ok :)
> > > > >>>>
> > > > >>>> So, we can keep `--topic t1:1,2,3`
> > > > >>>>
> > > > > >>>> I think with this one we have most of the feedback applied. I will update
> > > > > >>>> the KIP with this change.
> > > > >>>>
> > > > > >>>> On Thu, Feb 23, 2017 at 22:38, Matthias J. Sax (<matthias@confluent.io>)
> > > > > >>>> wrote:
> > > > >>>>
> > > > >>>> Sounds reasonable.
> > > > >>>>
> > > > > >>>> If we have multiple --topic arguments, it also does not matter whether we use
> > > > > >>>> t1:1,2 or t2=1,2
> > > > > >>>>
> > > > > >>>> I just suggested '=' because I wanted to use ':' to chain multiple topics.
> > > > >>>>
> > > > >>>>
> > > > >>>> -Matthias
> > > > >>>>
> > > > >>>> On 2/23/17 10:49 AM, Jorge Esteban Quilcate Otoya wrote:
> > > > > >>>>> Yep, `--topic t1=1,2` LGTM
> > > > > >>>>>
> > > > > >>>>> I don't have an idea either about getting rid of the repeated --topic, but
> > > > > >>>>> --group is also repeated in the case of deletion, so it could be OK to have
> > > > > >>>>> repeated --topic arguments.
> > > > >>>>>
> > > > > >>>>> On Thu, Feb 23, 2017 at 19:14, Matthias J. Sax (<matthias@confluent.io>)
> > > > > >>>>> wrote:
> > > > >>>>>
> > > > > >>>>>> So you suggest merging the "scope options" --topics, --topic, and
> > > > > >>>>>> --partitions into a single option? Sounds good to me.
> > > > > >>>>>>
> > > > > >>>>>> I like the compact way to express it, i.e., topicname:list-of-partitions
> > > > > >>>>>> with "all partitions" if no partitions are specified. It's quite
> > > > > >>>>>> intuitive to use.
> > > > > >>>>>>
> > > > > >>>>>> Just wondering if we could get rid of the repeated --topic option; it's
> > > > > >>>>>> somewhat verbose. I have no good idea how to improve it, though.
> > > > > >>>>>>
> > > > > >>>>>> If you concatenate multiple topics, we need one more character that is
> > > > > >>>>>> not allowed in topic names to separate the topics:
> > > > > >>>>>>
> > > > > >>>>>>> invalidChars = {'/', '\\', ',', '\u0000', ':', '"', '\'', ';', '*',
> > > > > >>>>>>> '?', ' ', '\t', '\r', '\n', '='};
> > > > > >>>>>>
> > > > > >>>>>> maybe
> > > > > >>>>>>
> > > > > >>>>>> --topics t1=1,2,3:t2:t3=3
> > > > > >>>>>>
> > > > > >>>>>> use '=' to specify partitions (instead of ':' as you proposed) and ':'
> > > > > >>>>>> to separate topics? All other characters seem to be worse to use to me.
> > > > > >>>>>> But maybe you have a better idea.
> > > > >>>>>>
> > > > >>>>>>
> > > > >>>>>>
> > > > >>>>>> -Matthias
> > > > >>>>>>
> > > > >>>>>>
> > > > >>>>>> On 2/23/17 3:15 AM, Jorge Esteban Quilcate Otoya wrote:
> > > > >>>>>>> @Matthias about the point 9:
> > > > >>>>>>>
> > > > > >>>>>>> What about keeping only the --topic option and supporting this format:
> > > > > >>>>>>>
> > > > > >>>>>>> `--topic t1:0,1,2 --topic t2 --topic t3:2`
> > > > > >>>>>>>
> > > > > >>>>>>> In this case topics t1, t2, and t3 will be selected: topic t1 with
> > > > > >>>>>>> partitions 0, 1 and 2; topic t2 with all its partitions; and topic t3 with
> > > > > >>>>>>> only partition 2.
> > > > >>>>>>>
> > > > >>>>>>> Jorge.
> > > > >>>>>>>
> > > > > >>>>>>> On Tue, Feb 21, 2017 at 11:11, Jorge Esteban Quilcate Otoya (<
> > > > > >>>>>>> quilcate.jorge@gmail.com>) wrote:
> > > > >>>>>>>
> > > > >>>>>>>> Thanks for the feedback Matthias.
> > > > >>>>>>>>
> > > > >>>>>>>> * 1. You're right. I'll reorder the scenarios.
> > > > >>>>>>>>
> > > > >>>>>>>> * 2. Agree. I'll update the KIP.
> > > > >>>>>>>>
> > > > >>>>>>>> * 3. I like it, updating to `reset-offsets`
> > > > >>>>>>>>
> > > > >>>>>>>> * 4. Agree, removing the `reset-` part
> > > > >>>>>>>>
> > > > >>>>>>>> * 5. Yes, 1.e option without --execute or --export will
> print
> > > out
> > > > >>>>>> current
> > > > >>>>>>>> offset, and the new offset, that will be the same. The
> > use-case
> > > of
> > > > >>>> this
> > > > >>>>>>>> option is to use it in combination with --export mostly
and
> > > have a
> > > > >>>>>> current
> > > > >>>>>>>> 'checkpoint' to reset later. I will add to the KIP how
the
> > > output
> > > > >>>> should
> > > > >>>>>>>> looks like.
> > > > >>>>>>>>
> > > > >>>>>>>> * 6. Considering 4., I will update it to `--to-offset`
> > > > >>>>>>>>
> > > > >>>>>>>> * 7. I like the idea to unify these options (plus,
minus).
> > > > >>>>>>>> `shift-offsets-by` is a good option, but I will like some
> > more
> > > > >>>> feedback
> > > > >>>>>>>> here about the name. I will update the KIP in the
meantime.
> > > > >>>>>>>>
> > > > >>>>>>>> * 8. Yes, discussed in 9.
> > > > >>>>>>>>
> > > > >>>>>>>> * 9. Agree. I'll love some feedback here. `topic` is
already
> > > used
> > > > by
> > > > >>>>>>>> `delete`, and we can add `--all-topics` to consider all
> > > > >>>>>> topics/partitions
> > > > >>>>>>>> assigned to a group. How could we define specific
> > > > topics/partitions?
> > > > >>>>>>>>
> > > > >>>>>>>> * 10. Haven't thought about it, but make sense.
> > > > >>>>>>>> <topic>,<partition>,<offset> would be enough.
> > > > >>>>>>>>
> > > > >>>>>>>> * 11. Agree. Solved with 10.
> > > > >>>>>>>>
> > > > >>>>>>>> Also, I have a couple of changes to mention:
> > > > >>>>>>>>
> > > > >>>>>>>> 1. I have add a reference to the branch where I'm working
on
> > > this
> > > > >> KIP.
> > > > >>>>>>>>
> > > > >>>>>>>> 2. About the period scenario `--to-period`. I will change
it
> > to
> > > > >>>>>>>> `--to-duration` given that duration (
> > > > >>>>>>>>
> > > https://docs.oracle.com/javase/8/docs/api/java/time/Duration.html
> > > > )
> > > > >>>>>>>> follows this format: 'PnDTnHnMnS' and does not consider
> > > daylight
> > > > >>>> saving
> > > > >>>>>>>> efects.
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> El mar., 21 feb. 2017 a las 2:47, Matthias J. Sax (<
> > > > >>>>>> matthias@confluent.io>)
> > > > >>>>>>>> escribió:
> > > > >>>>>>>>
> > > > >>>>>>>> Hi,
> > > > >>>>>>>>
> > > > >>>>>>>> thanks for updating the KIP. Couple of follow up
comments:
> > > > >>>>>>>>
> > > > >>>>>>>> * Nit: Why is "Reset to Earliest" and "Reset to Latest" a
> > > "reset
> > > > by
> > > > >>>>>>>> time" option -- IMHO it belongs to "reset by position"?
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> * Nit: Description of "Reset to Earliest"
> > > > >>>>>>>>
> > > > >>>>>>>>> using Kafka Consumer's `auto.offset.reset` to `earliest`
> > > > >>>>>>>>
> > > > >>>>>>>> I think this is strictly speaking not correct (as
> > > > auto.offset.reset
> > > > >>>> only
> > > > >>>>>>>> triggered if no valid offset is found, but this tool
> > explicitly
> > > > >>>> modified
> > > > >>>>>>>> committed offset), and should be phrased as
> > > > >>>>>>>>
> > > > >>>>>>>>> using Kafka Consumer's #seekToBeginning()
> > > > >>>>>>>>
> > > > >>>>>>>> -> similar issue for description of "Reset to Latest"
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> * Main option: rename to --reset-offsets (plural instead
of
> > > > >> singular)
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> * Scenario Options: I would remove "reset" from all
options,
> > > > because
> > > > >>>> the
> > > > >>>>>>>> main argument "--reset-offset" says already what to do:
> > > > >>>>>>>>
> > > > >>>>>>>>> bin/kafka-consumer-groups.sh --reset-offset
> > > --reset-to-datetime
> > > > XXX
> > > > >>>>>>>>
> > > > >>>>>>>> better (IMHO):
> > > > >>>>>>>>
> > > > >>>>>>>>> bin/kafka-consumer-groups.sh --reset-offsets
--to-datetime
> > XXX
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> * Option 1.e ("print and export current offset") is not
> > > intuitive
> > > > to
> > > > >>>> use
> > > > >>>>>>>> IMHO. The main option is "--reset-offset" but nothing
> happens
> > > if
> > > > no
> > > > >>>>>>>> scenario is specified. It is also not specified, what the
> > > output
> > > > >>>> should
> > > > >>>>>>>> look like?
> > > > >>>>>>>>
> > > > >>>>>>>> Furthermore, --describe should actually show currently
> > > committed
> > > > >>>> offset
> > > > >>>>>>>> for a group. So it seems to be redundant to have the same
> > > option
> > > > in
> > > > >>>>>>>> --reset-offsets
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> * Option 2.a: I would rename to "--reset-to-offset" (or
> > > > considering
> > > > >>>> the
> > > > >>>>>>>> comment above to "--to-offset")
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> * Option 2.b and 2.c: I would unify to
"--shift-offsets-by"
> > (or
> > > > >>>> similar)
> > > > >>>>>>>> and accept positive/negative values
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> * About Scope "all": maybe it's better to have an option
> > > > >>>> "--all-topics"
> > > > >>>>>>>> (or similar). IMHO explicit arguments are preferable over
> > > implicit
> > > > >>>>>>>> setting to guard again accidental miss use of the tool.
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> * Scope: I also think, that "--topic" (singular) and
> > "--topics"
> > > > >>>> (plural)
> > > > >>>>>>>> are too similar and easy to use in a wrong way (ie, mix
up)
> > --
> > > > maybe
> > > > >>>> we
> > > > >>>>>>>> can have two options that are easier to distinguish.
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> * I still think that JSON is not the best format (it's
too
> > > > >>>> verbose/hard
> > > > >>>>>>>> to write for humans from scratch). A simple CSV format
with
> > > > implicit
> > > > >>>>>>>> schema (topic,partition,offset) would be sufficient.
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> * Why does the JSON contain "group_id" field -- there is
> > > parameter
> > > > >>>>>>>> "--group" to specify the group ID. Would one overwrite
the
> > > other
> > > > >> (what
> > > > >>>>>>>> order) or would there be an error if "--group" is used in
> > > > >> combination
> > > > >>>>>>>> with "--reset-from-file"?
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> -Matthias
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> On 2/17/17 6:43 AM, Jorge Esteban Quilcate Otoya wrote:
> > > > >>>>>>>>> Hi,
> > > > >>>>>>>>>
> > > > >>>>>>>>> according to the feedback, I've updated the KIP:
> > > > >>>>>>>>>
> > > > >>>>>>>>> - We have added and ordered the scenarios, scopes and
> > > executions
> > > > of
> > > > >>>> the
> > > > >>>>>>>>> Reset Offset tool.
> > > > >>>>>>>>> - Consider it as an extension to the current
> > > > `ConsumerGroupCommand`
> > > > >>>>>> tool
> > > > >>>>>>>>> - Execution will be possible without generating JSON
files.
> > > > >>>>>>>>>
> > > > >>>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>
> > > > >>>>
> > > > >>
> > > >
> > >
> > >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > 122%3A+Add+Reset+Consumer+Group+Offsets+tooling
> >
> > >
> > > > >>>>>>>>>
> > > > >>>>>>>>> Looking forward to your feedback!
> > > > >>>>>>>>>
> > > > >>>>>>>>> Jorge.
> > > > >>>>>>>>>
> > > > >>>>>>>>> El mié., 8 feb. 2017 a las 23:23, Jorge Esteban Quilcate
> > Otoya
> > > (<
> > > > >>>>>>>>> quilcate.jorge@gmail.com>) escribió:
> > > > >>>>>>>>>
> > > > >>>>>>>>>> Great. I think I got the idea. What about this options:
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Scenarios:
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> 1. Current status
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1´
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> 2. To Datetime
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1
> > > > >>>>>> --reset-to-datetime
> > > > >>>>>>>>>> 2017-01-01T00:00:00.000´
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> 3. To Period
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1
> > > > >>>> --reset-to-period
> > > > >>>>>>>> P2D´
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> 4. To Earliest
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1
> > > > >>>>>>>> --reset-to-earliest´
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> 5. To Latest
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1
> > > > >>>>>> --reset-to-latest´
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> 6. Minus 'n' offsets
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1
> > > > --reset-minus
> > > > >>>> n´
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> 7. Plus 'n' offsets
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1
> > > > --reset-plus
> > > > >> n´
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> 8. To specific offset
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1
> > > --reset-to
> > > > x´
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Scopes:
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> a. All topics used by Consumer Group
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Don't specify --topics
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> b. Specific List of Topics
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Add list of values in --topics t1,t2,tn
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> c. One Topic, all Partitions
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Add one topic and no partitions values: --topic t1
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> d. One Topic, List of Partitions
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Add one topic and partitions values: --topic t1
> > --partitions
> > > > 0,1,2
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> About Reset Plan (JSON file):
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> I think is still valid to have the option to persist
reset
> > > > >>>>>> configuration
> > > > >>>>>>>>>> as a file, but I agree to give the option to run the
tool
> > > > without
> > > > >>>>>> going
> > > > >>>>>>>>>> down to the JSON file.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Execution options:
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> 1. Without execution argument (No args):
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Print out results (reset plan)
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> 2. With --execute argument:
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Run reset process
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> 3. With --output argument:
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Save result in a JSON format.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> 4. Only with --execute option and --reset-file (path to
> > JSON)
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Reset based on file
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> 4. Only with --verify option and --reset-file (path to
> > JSON)
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Verify file values with current offsets
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> I think we can remove --generate-and-execute because is
a
> > bit
> > > > >>>> clumsy.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> With this options we will be able to execute with
manual
> > JSON
> > > > >>>>>>>>>> configuration.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> El mié., 8 feb. 2017 a las 22:43, Ben Stopford (<
> > > > ben@confluent.io
> > > > >>> )
> > > > >>>>>>>>>> escribió:
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Yes - using a tool like this to skip a set of consumer
> > groups
> > > > >> over a
> > > > >>>>>>>>>> corrupt/bad message is definitely appealing.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> B
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> On Wed, Feb 8, 2017 at 9:37 PM Gwen Shapira
> > > <gw...@confluent.io>
> > > > >>>>>> wrote:
> > > > >>>>>>>>>>
> > > > >>>>>>>>>>> I like the --reset-to-earliest and --reset-to-latest.
In
> > > > general,
> > > > >>>>>>>>>>> since the JSON route is the most challenging for
users,
> we
> > > want
> > > > >> to
> > > > >>>>>>>>>>> provide a lot of ways to do useful things without
going
> > > there.
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> Two things that can help:
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> 1. A lot of times, users want to skip few messages
that
> > > cause
> > > > >>>> issues
> > > > >>>>>>>>>>> and continue. maybe just specifying the topic,
partition
> > and
> > > > >> delta
> > > > >>>>>>>>>>> will be better than having to find the offset and
write a
> > > JSON
> > > > >> and
> > > > >>>>>>>>>>> validate the JSON etc.
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> 2. Thinking if there are other common use-cases that
we
> > can
> > > > make
> > > > >>>> easy
> > > > >>>>>>>>>>> rather than just one generic but not very usable
method.
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> Gwen
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> On Wed, Feb 8, 2017 at 3:25 AM, Jorge Esteban Quilcate
> > Otoya
> > > > >>>>>>>>>>> <qu...@gmail.com> wrote:
> > > > >>>>>>>>>>>> Thanks for the feedback!
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> @Onur, @Gwen:
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> Agree. Actually at the first draft I considered to
have
> > it
> > > > >> inside
> > > > >>>>>>>>>>>> ´kafka-consumer-groups.sh´, but I decide to propose
it
> as
> > a
> > > > >>>>>> standalone
> > > > >>>>>>>>>>> tool
> > > > >>>>>>>>>>>> to describe it clearly and focus it on reset
> > functionality.
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> But now that you mentioned, it does make sense to
have
> it
> > > in
> > > > >>>>>>>>>>>> ´kafka-consumer-groups.sh´. How would be a consistent
> way
> > > to
> > > > >>>>>> introduce
> > > > >>>>>>>>>>> it?
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> Maybe something like this:
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --generate
> > --group
> > > > cg1
> > > > >>>>>>>>>> --topics
> > > > >>>>>>>>>>> t1
> > > > >>>>>>>>>>>> --reset-from 2017-01-01T00:00:00.000 --output
plan.json´
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --verify
> > > > >>>> --reset-json-file
> > > > >>>>>>>>>>>> plan.json´
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --execute
> > > > >>>> --reset-json-file
> > > > >>>>>>>>>>>> plan.json´
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset
> > > > --generate-and-execute
> > > > >>>>>>>> --group
> > > > >>>>>>>>>>> cg1
> > > > >>>>>>>>>>>> --topics t1 --reset-from 2017-01-01T00:00:00.000´
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> @Gwen:
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>>> It looks exactly like the replica assignment tool
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> It was influenced by ;-) I use the
> > generate-verify-execute
> > > > >> process
> > > > >>>>>>>> here
> > > > >>>>>>>>>>> to
> > > > >>>>>>>>>>>> make sure user will be aware of the result of this
> > > operation.
> > > > At
> > > > >>>> the
> > > > >>>>>>>>>>>> beginning we considered only add a couple of options
to
> > > > Consumer
> > > > >>>>>> Group
> > > > >>>>>>>>>>>> Command:
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> --rewind-to-timestamp and --rewind-to-period
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> @Onur:
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>>> You can actually get away with overriding while
members
> > of
> > > > the
> > > > >>>>>> group
> > > > >>>>>>>>>>> are live
> > > > >>>>>>>>>>>> with method 2 by using group information from
> > > > >>>> DescribeGroupsRequest.
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> This means that we need to have Consumer Group
stopped
> > > before
> > > > >>>>>>>> executing
> > > > >>>>>>>>>>> and
> > > > >>>>>>>>>>>> start a new consumer internally to do this?
Therefore,
> we
> > > > won't
> > > > >> be
> > > > >>>>>>>> able
> > > > >>>>>>>>>>> to
> > > > >>>>>>>>>>>> consider executing reset when ConsumerGroup is
active?
> > > (trying
> > > > >> to
> > > > >>>>>>>>>> relate
> > > > >>>>>>>>>>> it
> > > > >>>>>>>>>>>> with @Dong 5th question)
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> @Dong:
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>>> Should we allow user to use wildcard to reset offset
of
> > > all
> > > > >>>> groups
> > > > >>>>>>>>>> for a
> > > > >>>>>>>>>>>> given topic as well?
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> I haven't thought about this scenario. Could be
> > > interesting.
> > > > >>>>>> Following
> > > > >>>>>>>>>>> the
> > > > >>>>>>>>>>>> recommendation to add it into Consumer Group Command,
in
> > > this
> > > > >> case
> > > > >>>>>>>>>> Group
> > > > >>>>>>>>>>>> argument will be optional if there are only 1 topic.
I
> > > think
> > > > for
> > > > >>>>>>>>>> multiple
> > > > >>>>>>>>>>>> topic won't be that useful.
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>>> Should we allow user to specify timestamp per topic
> > > partition
> > > > >> in
> > > > >>>>>> the
> > > > >>>>>>>>>>> json
> > > > >>>>>>>>>>>> file as well?
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> Don't think this could be a valid from the tool, but
if
> > > Reset
> > > > >> Plan
> > > > >>>>>> is
> > > > >>>>>>>>>>>> generated, and user want to set the offset for a
> specific
> > > > >>>> partition
> > > > >>>>>> to
> > > > >>>>>>>>>>>> other offset (eventually based on another timestamp),
> and
> > > > >> execute
> > > > >>>>>> it,
> > > > >>>>>>>>>> it
> > > > >>>>>>>>>>>> will be up to her/him.
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>>> Should the script take some credential file to make
> sure
> > > that
> > > > >>>> this
> > > > >>>>>>>>>>>> operation is authenticated given the potential impact
of
> > > this
> > > > >>>>>>>>>> operation?
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> Haven't tried to secure brokers yet, but the tool
should
> > > > support
> > > > >>>>>>>>>>>> authorization if it's enabled in the broker.
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>>> Should we provide constant to reset committed offset
to
> > > > >>>>>>>>>> earliest/latest
> > > > >>>>>>>>>>>> offset of a partition, e.g. -1 indicates earliest
offset
> > > and
> > > > -2
> > > > >>>>>>>>>> indicates
> > > > >>>>>>>>>>>> latest offset.
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> I will go for something like ´--reset-to-earliest´
and
> > > > >>>>>>>>>>> ´--reset-to-latest´
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>>> Should we allow dynamic change of the comitted
offset
> > when
> > > > >>>> consumer
> > > > >>>>>>>>>> are
> > > > >>>>>>>>>>>> running, such that consumer will seek to the newly
> > > committed
> > > > >>>> offset
> > > > >>>>>>>> and
> > > > >>>>>>>>>>>> start consuming from there?
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> Not sure about this. I will recommend to keep it
simple
> > and
> > > > ask
> > > > >>>> user
> > > > >>>>>>>> to
> > > > >>>>>>>>>>>> stop consumers first. But I would considered it if
the
> > > > >> trade-offs
> > > > >>>>>> are
> > > > >>>>>>>>>>>> clear.
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> @Matthias
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> Added :). And thanks a lot for your help to define
this
> > > KIP!
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> El mié., 8 feb. 2017 a las 7:47, Gwen Shapira (<
> > > > >> gwen@confluent.io
> > > > >>>>> )
> > > > >>>>>>>>>>>> escribió:
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>>> As long as the CLI is a bit consistent? Like, not
just
> > > > adding 3
> > > > >>>>>>>>>>>>> arguments and a JSON parser to the existing tool,
> right?
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>> On Tue, Feb 7, 2017 at 10:29 PM, Onur Karaman
> > > > >>>>>>>>>>>>> <on...@gmail.com> wrote:
> > > > >>>>>>>>>>>>>> I think it makes sense to just add the feature to
> > > > >>>>>>>>>>>>> kafka-consumer-groups.sh
> > > > >>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>> On Tue, Feb 7, 2017 at 10:24 PM, Gwen Shapira <
> > > > >>>> gwen@confluent.io>
> > > > >>>>>>>>>>> wrote:
> > > > >>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> Thanks for the KIP. I'm super happy about adding
the
> > > > >>>> capability.
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> I hate the interface, though. It looks exactly
like
> > the
> > > > >> replica
> > > > >>>>>>>>>>>>>>> assignment tool. A tool everyone loves so much
that
> > > there
> > > > are
> > > > >>>>>>>>>>> multiple
> > > > >>>>>>>>>>>>>>> projects, open and closed, that try to fix it.
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> Can we swap it with something that looks a bit
more
> > like
> > > > the
> > > > >>>>>>>>>> consumer
> > > > >>>>>>>>>>>>>>> group tool? or the kafka streams reset tool?
> > Consistency
> > > is
> > > > >>>>>> helpful
> > > > >>>>>>>>>>> in
> > > > >>>>>>>>>>>>>>> such cases. I spent some time learning existing
tools
> > > and
> > > > >>>>>> learning
> > > > >>>>>>>>>>> yet
> > > > >>>>>>>>>>>>>>> another one is a deterrent.
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> Gwen
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> On Tue, Feb 7, 2017 at 6:43 PM, Jorge Esteban
> Quilcate
> > > > Otoya
> > > > >>>>>>>>>>>>>>> <qu...@gmail.com> wrote:
> > > > >>>>>>>>>>>>>>>> Hi all,
> > > > >>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>> I would like to propose a KIP to Add a tool to
Reset
> > > > >> Consumer
> > > > >>>>>>>>>> Group
> > > > >>>>>>>>>>>>>>> Offsets.
> > > > >>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > > >>>>>>>>>>>>>>> 122%3A+Add+a+tool+to+Reset+Consumer+Group+Offsets
> > > > >>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>> Please, take a look at the proposal and share
your
> > > > feedback.
> > > > >>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>> Thanks,
> > > > >>>>>>>>>>>>>>>> Jorge.
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> --
> > > > >>>>>>>>>>>>>>> Gwen Shapira
> > > > >>>>>>>>>>>>>>> Product Manager | Confluent
> > > > >>>>>>>>>>>>>>> 650.450.2760 <(650)%20450-2760> <(650)%20450-2760>
<(650)%20450-2760>
> <(650)%20450-2760>
> > > <(650)%20450-2760>
> > > > <(650)%20450-2760>
> > > > >> <(650)%20450-2760>
> > > > >>>> <(650)%20450-2760>
> > > > >>>>>> <(650)%20450-2760>
> > > > >>>>>>>> <(650)%20450-2760>
> > > > >>>>>>>>>> <(650)%20450-2760> | @gwenshap
> > > > >>>>>>>>>>>>>>> Follow us: Twitter | blog
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>> --
> > > > >>>>>>>>>>>>> Gwen Shapira
> > > > >>>>>>>>>>>>> Product Manager | Confluent
> > > > >>>>>>>>>>>>> 650.450.2760 <(650)%20450-2760> <(650)%20450-2760>
<(650)%20450-2760>
> <(650)%20450-2760>
> > > <(650)%20450-2760>
> > > > <(650)%20450-2760>
> > > > >> <(650)%20450-2760>
> > > > >>>> <(650)%20450-2760>
> > > > >>>>>> <(650)%20450-2760>
> > > > >>>>>>>> <(650)%20450-2760>
> > > > >>>>>>>>>> <(650)%20450-2760> | @gwenshap
> > > > >>>>>>>>>>>>> Follow us: Twitter | blog
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> --
> > > > >>>>>>>>>>> Gwen Shapira
> > > > >>>>>>>>>>> Product Manager | Confluent
> > > > >>>>>>>>>>> 650.450.2760 <(650)%20450-2760> <(650)%20450-2760>
<(650)%20450-2760>
> <(650)%20450-2760>
> > > <(650)%20450-2760>
> > > > <(650)%20450-2760>
> > > > >> <(650)%20450-2760>
> > > > >>>> <(650)%20450-2760>
> > > > >>>>>> <(650)%20450-2760> <(650)%20450-2760>
> > > > >>>>>>>> | @gwenshap
> > > > >>>>>>>>>>> Follow us: Twitter | blog
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>
> > > > >>>>>>>>>>
> > > > >>>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>
> > > > >>>>>>
> > > > >>>>>>
> > > > >>>>>
> > > > >>>>
> > > > >>>>
> > > > >>>
> > > > >>
> > > > >>
> > > > >
> > > >
> > > >
> > >
> > >
> > >
> > >
> > >
> >
> >
> >
> >
> >
>

Re: KIP-122: Add a tool to Reset Consumer Group Offsets

Posted by Vahid S Hashemian <va...@us.ibm.com>.
Thanks Jorge for addressing my suggestions. Looks good to me.

--Vahid



From:   Jorge Esteban Quilcate Otoya <qu...@gmail.com>
To:     dev@kafka.apache.org
Date:   02/27/2017 01:57 AM
Subject:        Re: KIP-122: Add a tool to Reset Consumer Group Offsets



@Vahid: it makes sense to add "new lag" info IMO; I will update the KIP.
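
As a rough illustration of what reporting both lags in the plan output could mean: lag is the partition's log end offset minus the group's offset, so the new lag follows directly from the new offset. The sketch below is only an assumption about the shape of that output; `PlanRow`, `format_plan`, and every column name other than CURRENT-LAG are hypothetical and not taken from the KIP.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlanRow:
    """One line of a hypothetical reset plan for a single topic partition."""
    topic: str
    partition: int
    current_offset: int   # offset currently committed by the group
    new_offset: int       # offset the reset would commit
    log_end_offset: int   # latest offset available in the partition

    @property
    def current_lag(self) -> int:
        # Lag before the reset: messages between the committed offset and the log end.
        return self.log_end_offset - self.current_offset

    @property
    def new_lag(self) -> int:
        # Lag the group would have right after the reset is executed.
        return self.log_end_offset - self.new_offset


def format_plan(rows):
    """Render plan rows as space-separated text with explicit lag columns."""
    header = ("TOPIC", "PARTITION", "CURRENT-OFFSET", "NEW-OFFSET",
              "CURRENT-LAG", "NEW-LAG")
    lines = [" ".join(header)]
    for r in rows:
        values = (r.topic, r.partition, r.current_offset, r.new_offset,
                  r.current_lag, r.new_lag)
        lines.append(" ".join(str(v) for v in values))
    return "\n".join(lines)
```

With a group that is fully caught up at offset 100 and a reset back to offset 40, the row would report a current lag of 0 and a new lag of 60.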

@Becket:

1. About deleting, I think ConsumerGroupCommand already has an option to
delete Group information by topic. From delete docs: "Pass in groups to
delete topic partition offsets and ownership information over the entire
consumer group.". Let me know if this is enough for your case, or we can
consider adding something to the Reset Offsets tool.

2. Yes, for instance in the case of active consumers, the tool will
validate that there are no active consumers to avoid race conditions. I
have added some code snippets to the wiki, thanks for pointing that out.

On Sat, Feb 25, 2017 at 0:29, Becket Qin (<be...@gmail.com>) wrote:

> Thanks for the KIP Jorge. I think this is a useful KIP. I haven't read the
> KIP in detail yet, some comments from a quick review:
>
> 1. At a glance, it seems that there is no delete option. At LinkedIn we
> identified some cases where users want to delete the committed offset of a
> group. It would be good to include that as well.
>
> 2. It seems the KIP is missing some necessary implementation key points,
> e.g. how would the tool commit offsets for a consumer group? Does the
> broker need to know this is a special tool instead of an active consumer in
> the group (the generation check will be made on offset commit)? They are
> probably in your proof of concept code. Could you add them to the wiki as
> well?
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
> On Fri, Feb 24, 2017 at 1:19 PM, Vahid S Hashemian <
> vahidhashemian@us.ibm.com> wrote:
>
> > Thanks Jorge for addressing my question/suggestion.
> >
> > One last thing. I noticed that in the example you have for the "plan"
> > option
> > (
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-122%3A+Add+Reset+Consumer+Group+Offsets+tooling#KIP-122:AddResetConsumerGroupOffsetstooling-ExecutionOptions
> > )
> > under "Description" column, you put 0 for lag. So I assume that is the
> > current lag being reported, and not the new lag. Might be helpful to
> > explicitly specify that (i.e. CURRENT-LAG) in the column header.
> > The other option is to report both current and new lags, but I understand
> > if we don't want to do that since it's rather redundant info.
> >
> > Thanks again.
> > --Vahid
> >
> >
> >
> > From:   Jorge Esteban Quilcate Otoya <qu...@gmail.com>
> > To:     dev@kafka.apache.org
> > Date:   02/24/2017 12:47 PM
> > Subject:        Re: KIP-122: Add a tool to Reset Consumer Group Offsets
> >
> >
> >
> > Hi Vahid,
> >
> > Thanks for your comments. Check my answers below:
> >
> > On Fri, Feb 24, 2017 at 19:41, Vahid S Hashemian (<vahidhashemian@us.ibm.com>) wrote:
> >
> > > Hi Jorge,
> > >
> > > Thanks for the useful KIP.
> > >
> > > I have a question regarding the proposed "plan" option.
> > > The "current offset" and "lag" values of a topic partition are meaningful
> > > within a consumer group. In other words, different consumer groups could
> > > have different values for these properties of each topic partition.
> > > I don't see that reflected in the discussion around the "plan" option.
> > > Unless we are assuming a "--group" option is also provided by user (which
> > > is not clear from the KIP if that is the case).
> > >
> >
> > I have added an additional comment to state that these options will require
> > a "group" argument.
> > It is considered to affect only one Consumer Group.
> >
> >
> > >
> > > Also, I was wondering if you can provide at least one full command example
> > > for each of the "plan", "execute", and "export" options. They would
> > > definitely help in understanding some of the details.
> > >
> > >
> > Added to the KIP.
> >
> >
> > > Sorry for the delayed question/suggestion. I hope they make sense.
> > >
> > > Thanks.
> > > --Vahid
> > >
> > >
> > >
> > > From:   Jorge Esteban Quilcate Otoya <qu...@gmail.com>
> > > To:     dev@kafka.apache.org
> > > Date:   02/24/2017 09:51 AM
> > > Subject:        Re: KIP-122: Add a tool to Reset Consumer Group Offsets
> > >
> > >
> > >
> > > Great! KIP updated.
> > >
> > >
> > >
> > > On Fri, Feb 24, 2017 at 18:22, Matthias J. Sax (<ma...@confluent.io>) wrote:
> > >
> > > > I like this!
> > > >
> > > > --by-duration and --shift-by
> > > >
> > > >
> > > > -Matthias
> > > >
> > > > On 2/24/17 12:57 AM, Jorge Esteban Quilcate Otoya wrote:
> > > > > Renaming to --by-duration LGTM
> > > > >
> > > > > Not sure about changing it to --shift-by-duration because we could end up
> > > > > with the same redundancy as before with reset: --reset-offsets
> > > > > --reset-to-*.
> > > > >
> > > > > Maybe changing --shift-offset-by to --shift-by 'n' could make it consistent
> > > > > enough?
> > > > >
> > > > >
> > > > > On Fri, Feb 24, 2017 at 6:39, Matthias J. Sax (<matthias@confluent.io>) wrote:
> > > > >
> > > > >> I just read the updated KIP once more.
> > > > >>
> > > > >> I would suggest renaming --to-duration to --by-duration
> > > > >>
> > > > >> Or as a second idea, rename --to-duration to --shift-by-duration and at
> > > > >> the same time rename --shift-offset-by to --shift-by-offset
> > > > >>
> > > > >> Not sure what the best option is, but naming would be more consistent IMHO.
> > > > >>
> > > > >>
> > > > >>
> > > > >> -Matthias
> > > > >>
> > > > >> On 2/23/17 4:42 PM, Jorge Esteban Quilcate Otoya wrote:
> > > > >>> Hi All,
> > > > >>>
> > > > >>> If there are no more concerns, I'd like to start the vote for this KIP.
> > > > >>>
> > > > >>> Thanks!
> > > > >>> Jorge.
> > > > >>>
> > > > >>> On Thu, Feb 23, 2017 at 22:50, Jorge Esteban Quilcate Otoya (<
> > > > >>> quilcate.jorge@gmail.com>) wrote:
> > > > >>>
> > > > >>>> Oh ok :)
> > > > >>>>
> > > > >>>> So, we can keep `--topic t1:1,2,3`
> > > > >>>>
> > > > >>>> I think with this one we have most of the feedback applied. I will update
> > > > >>>> the KIP with this change.
> > > > >>>>
> > > > >>>> On Thu, Feb 23, 2017 at 22:38, Matthias J. Sax (<matthias@confluent.io>) wrote:
> > > > >>>>
> > > > >>>> Sounds reasonable.
> > > > >>>>
> > > > >>>> If we have multiple --topic arguments, it also does not matter if we use
> > > > >>>> t1:1,2 or t2=1,2
> > > > >>>>
> > > > >>>> I just suggested '=' because I wanted to use ':' to chain multiple topics.
> > > > >>>>
> > > > >>>>
> > > > >>>> -Matthias
> > > > >>>>
> > > > >>>> On 2/23/17 10:49 AM, Jorge Esteban Quilcate Otoya wrote:
> > > > >>>>> Yeap, `--topic t1=1,2` LGTM
> > > > >>>>>
> > > > >>>>> I don't have an idea either about getting rid of repeated --topic, but --group
> > > > >>>>> is also repeated in the case of deletion, so it could be ok to have
> > > > >>>>> repeated --topic arguments.
> > > > >>>>>
> > > > >>>>> On Thu, Feb 23, 2017 at 19:14, Matthias J. Sax (<matthias@confluent.io>) wrote:
> > > > >>>>>
> > > > >>>>>> So you suggest to merge "scope options" --topics, --topic, and
> > > > >>>>>> --partitions into a single option? Sounds good to me.
> > > > >>>>>>
> > > > >>>>>> I like the compact way to express it, i.e., topicname:list-of-partitions
> > > > >>>>>> with "all partitions" if no partitions are specified. It's quite
> > > > >>>>>> intuitive to use.
> > > > >>>>>>
> > > > >>>>>> Just wondering if we could get rid of the repeated --topic option; it's
> > > > >>>>>> somewhat verbose. I have no good idea though how to improve it.
> > > > >>>>>>
> > > > >>>>>> If you concatenate multiple topics, we need one more character that is
> > > > >>>>>> not allowed in topic names to separate the topics:
> > > > >>>>>>
> > > > >>>>>>> invalidChars = {'/', '\\', ',', '\u0000', ':', '"', '\'', ';', '*',
> > > > >>>>>>> '?', ' ', '\t', '\r', '\n', '='};
> > > > >>>>>>
> > > > >>>>>> maybe
> > > > >>>>>>
> > > > >>>>>> --topics t1=1,2,3:t2:t3=3
> > > > >>>>>>
> > > > >>>>>> use '=' to specify partitions (instead of ':' as you proposed) and ':'
> > > > >>>>>> to separate topics? All other characters seem to be worse to use to me.
> > > > >>>>>> But maybe you have a better idea.
> > > > >>>>>>
> > > > >>>>>>
> > > > >>>>>>
> > > > >>>>>> -Matthias
> > > > >>>>>>
> > > > >>>>>>
> > > > >>>>>> On 2/23/17 3:15 AM, Jorge Esteban Quilcate Otoya wrote:
> > > > >>>>>>> @Matthias about the point 9:
> > > > >>>>>>>
> > > > >>>>>>> What about keeping only the --topic option, and supporting this format:
> > > > >>>>>>>
> > > > >>>>>>> `--topic t1:0,1,2 --topic t2 --topic t3:2`
> > > > >>>>>>>
> > > > >>>>>>> In this case topics t1, t2, and t3 will be selected: topic t1 with
> > > > >>>>>>> partitions 0, 1 and 2; topic t2 with all its partitions; and topic t3
> > > > >>>>>>> with only partition 2.
> > > > >>>>>>>
> > > > >>>>>>> Jorge.
> > > > >>>>>>>
> > > > >>>>>>> On Tue, Feb 21, 2017 at 11:11, Jorge Esteban Quilcate Otoya (<
> > > > >>>>>>> quilcate.jorge@gmail.com>) wrote:
> > > > >>>>>>>
> > > > >>>>>>>> Thanks for the feedback Matthias.
> > > > >>>>>>>>
> > > > >>>>>>>> * 1. You're right. I'll reorder the scenarios.
> > > > >>>>>>>>
> > > > >>>>>>>> * 2. Agree. I'll update the KIP.
> > > > >>>>>>>>
> > > > >>>>>>>> * 3. I like it, updating to `reset-offsets`
> > > > >>>>>>>>
> > > > >>>>>>>> * 4. Agree, removing the `reset-` part
> > > > >>>>>>>>
> > > > >>>>>>>> * 5. Yes, option 1.e without --execute or --export will print out the
> > > > >>>>>>>> current offset and the new offset, which will be the same. The use-case of
> > > > >>>>>>>> this option is mostly to use it in combination with --export and have a
> > > > >>>>>>>> current 'checkpoint' to reset to later. I will add to the KIP what the
> > > > >>>>>>>> output should look like.
> > > > >>>>>>>>
> > > > >>>>>>>> * 6. Considering 4., I will update it to `--to-offset`
> > > > >>>>>>>>
> > > > >>>>>>>> * 7. I like the idea to unify these options (plus, minus).
> > > > >>>>>>>> `shift-offsets-by` is a good option, but I would like some more feedback
> > > > >>>>>>>> here about the name. I will update the KIP in the meantime.
> > > > >>>>>>>>
> > > > >>>>>>>> * 8. Yes, discussed in 9.
> > > > >>>>>>>>
> > > > >>>>>>>> * 9. Agree. I'd love some feedback here. `topic` is already used by
> > > > >>>>>>>> `delete`, and we can add `--all-topics` to consider all topics/partitions
> > > > >>>>>>>> assigned to a group. How could we define specific topics/partitions?
> > > > >>>>>>>>
> > > > >>>>>>>> * 10. Haven't thought about it, but it makes sense.
> > > > >>>>>>>> <topic>,<partition>,<offset> would be enough.
> > > > >>>>>>>>
> > > > >>>>>>>> * 11. Agree. Solved with 10.
> > > > >>>>>>>>
> > > > >>>>>>>> Also, I have a couple of changes to mention:
> > > > >>>>>>>>
> > > > >>>>>>>> 1. I have added a reference to the branch where I'm working on this KIP.
> > > > >>>>>>>>
> > > > >>>>>>>> 2. About the period scenario `--to-period`. I will change it to
> > > > >>>>>>>> `--to-duration` given that duration (
> > > > >>>>>>>> https://docs.oracle.com/javase/8/docs/api/java/time/Duration.html )
> > > > >>>>>>>> follows this format: 'PnDTnHnMnS' and does not consider daylight saving
> > > > >>>>>>>> effects.
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> On Tue, Feb 21, 2017 at 2:47, Matthias J. Sax (<matthias@confluent.io>) wrote:
> > > > >>>>>>>>
> > > > >>>>>>>> Hi,
> > > > >>>>>>>>
> > > > >>>>>>>> thanks for updating the KIP. Couple of follow up comments:
> > > > >>>>>>>>
> > > > >>>>>>>> * Nit: Why is "Reset to Earliest" and "Reset to Latest" a "reset by
> > > > >>>>>>>> time" option -- IMHO it belongs to "reset by position"?
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> * Nit: Description of "Reset to Earliest"
> > > > >>>>>>>>
> > > > >>>>>>>>> using Kafka Consumer's `auto.offset.reset` to `earliest`
> > > > >>>>>>>>
> > > > >>>>>>>> I think this is strictly speaking not correct (as auto.offset.reset is only
> > > > >>>>>>>> triggered if no valid offset is found, but this tool explicitly modifies the
> > > > >>>>>>>> committed offset), and should be phrased as
> > > > >>>>>>>>
> > > > >>>>>>>>> using Kafka Consumer's #seekToBeginning()
> > > > >>>>>>>>
> > > > >>>>>>>> -> similar issue for description of "Reset to Latest"
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> * Main option: rename to --reset-offsets (plural instead of singular)
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> * Scenario Options: I would remove "reset" from all options, because the
> > > > >>>>>>>> main argument "--reset-offset" says already what to do:
> > > > >>>>>>>>
> > > > >>>>>>>>> bin/kafka-consumer-groups.sh --reset-offset --reset-to-datetime XXX
> > > > >>>>>>>>
> > > > >>>>>>>> better (IMHO):
> > > > >>>>>>>>
> > > > >>>>>>>>> bin/kafka-consumer-groups.sh --reset-offsets --to-datetime XXX
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> * Option 1.e ("print and export current offset") is not intuitive to use
> > > > >>>>>>>> IMHO. The main option is "--reset-offset" but nothing happens if no
> > > > >>>>>>>> scenario is specified. It is also not specified what the output should
> > > > >>>>>>>> look like.
> > > > >>>>>>>>
> > > > >>>>>>>> Furthermore, --describe should actually show the currently committed offset
> > > > >>>>>>>> for a group. So it seems to be redundant to have the same option in
> > > > >>>>>>>> --reset-offsets
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> * Option 2.a: I would rename to "--reset-to-offset" (or considering the
> > > > >>>>>>>> comment above to "--to-offset")
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> * Option 2.b and 2.c: I would unify to "--shift-offsets-by" (or similar)
> > > > >>>>>>>> and accept positive/negative values
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> * About Scope "all": maybe it's better to have an option "--all-topics"
> > > > >>>>>>>> (or similar). IMHO explicit arguments are preferable over implicit
> > > > >>>>>>>> setting to guard against accidental misuse of the tool.
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> * Scope: I also think that "--topic" (singular) and "--topics" (plural)
> > > > >>>>>>>> are too similar and easy to use in a wrong way (i.e., mix up) -- maybe we
> > > > >>>>>>>> can have two options that are easier to distinguish.
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> * I still think that JSON is not the best format (it's too verbose/hard
> > > > >>>>>>>> to write for humans from scratch). A simple CSV format with implicit
> > > > >>>>>>>> schema (topic,partition,offset) would be sufficient.
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> * Why does the JSON contain "group_id" field -- there is parameter
> > > > >>>>>>>> "--group" to specify the group ID. Would one overwrite the other (what
> > > > >>>>>>>> order) or would there be an error if "--group" is used in combination
> > > > >>>>>>>> with "--reset-from-file"?
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> -Matthias
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> On 2/17/17 6:43 AM, Jorge Esteban Quilcate Otoya wrote:
> > > > >>>>>>>>> Hi,
> > > > >>>>>>>>>
> > > > >>>>>>>>> according to the feedback, I've updated the KIP:
> > > > >>>>>>>>>
> > > > >>>>>>>>> - We have added and ordered the scenarios, scopes and executions of the
> > > > >>>>>>>>> Reset Offset tool.
> > > > >>>>>>>>> - Consider it as an extension to the current `ConsumerGroupCommand` tool
> > > > >>>>>>>>> - Execution will be possible without generating JSON files.
> > > > >>>>>>>>>
> > > > >>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-122%3A+Add+Reset+Consumer+Group+Offsets+tooling
> > > > >>>>>>>>>
> > > > >>>>>>>>> Looking forward to your feedback!
> > > > >>>>>>>>>
> > > > >>>>>>>>> Jorge.
> > > > >>>>>>>>>
> > > > >>>>>>>>> On Wed, Feb 8, 2017 at 23:23, Jorge Esteban Quilcate Otoya (<
> > > > >>>>>>>>> quilcate.jorge@gmail.com>) wrote:
> > > > >>>>>>>>>
> > > > >>>>>>>>>> Great. I think I got the idea. What about these options:
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Scenarios:
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> 1. Current status
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1´
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> 2. To Datetime
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-datetime
> > > > >>>>>>>>>> 2017-01-01T00:00:00.000´
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> 3. To Period
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-period P2D´
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> 4. To Earliest
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-earliest´
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> 5. To Latest
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-latest´
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> 6. Minus 'n' offsets
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-minus n´
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> 7. Plus 'n' offsets
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-plus n´
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> 8. To specific offset
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to x´
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Scopes:
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> a. All topics used by Consumer Group
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Don't specify --topics
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> b. Specific List of Topics
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Add list of values in --topics t1,t2,tn
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> c. One Topic, all Partitions
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Add one topic and no partitions values: --topic t1
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> d. One Topic, List of Partitions
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Add one topic and partitions values: --topic t1 --partitions 0,1,2
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> About Reset Plan (JSON file):
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> I think it is still valid to have the option to persist the reset
> > > > >>>>>>>>>> configuration as a file, but I agree to give the option to run the tool
> > > > >>>>>>>>>> without going down to the JSON file.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Execution options:
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> 1. Without execution argument (No args):
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Print out results (reset plan)
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> 2. With --execute argument:
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Run reset process
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> 3. With --output argument:
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Save result in a JSON format.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> 4. Only with --execute option and --reset-file (path to JSON)
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Reset based on file
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> 5. Only with --verify option and --reset-file (path to JSON)
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Verify file values with current offsets
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> I think we can remove --generate-and-execute because it is a bit clumsy.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> With these options we will be able to execute with manual JSON
> > > > >>>>>>>>>> configuration.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> On Wed, Feb 8, 2017 at 22:43, Ben Stopford (<ben@confluent.io>) wrote:
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Yes - using a tool like this to skip a set of consumer groups over a
> > > > >>>>>>>>>> corrupt/bad message is definitely appealing.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> B
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> On Wed, Feb 8, 2017 at 9:37 PM Gwen Shapira <gw...@confluent.io> wrote:
> > > > >>>>>>>>>>
> > > > >>>>>>>>>>> I like the --reset-to-earliest and --reset-to-latest. In general,
> > > > >>>>>>>>>>> since the JSON route is the most challenging for users, we want to
> > > > >>>>>>>>>>> provide a lot of ways to do useful things without going there.
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> Two things that can help:
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> 1. A lot of times, users want to skip a few messages that cause issues
> > > > >>>>>>>>>>> and continue. Maybe just specifying the topic, partition and delta
> > > > >>>>>>>>>>> will be better than having to find the offset and write a JSON and
> > > > >>>>>>>>>>> validate the JSON etc.
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> 2. Thinking if there are other common use-cases that we can make easy
> > > > >>>>>>>>>>> rather than just one generic but not very usable method.
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> Gwen
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> On Wed, Feb 8, 2017 at 3:25 AM, Jorge Esteban Quilcate Otoya
> > > > >>>>>>>>>>> <qu...@gmail.com> wrote:
> > > > >>>>>>>>>>>> Thanks for the feedback!
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> @Onur, @Gwen:
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> Agree. Actually in the first draft I considered having it inside
> > > > >>>>>>>>>>>> ´kafka-consumer-groups.sh´, but I decided to propose it as a standalone
> > > > >>>>>>>>>>>> tool to describe it clearly and focus it on reset functionality.
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> But now that you mention it, it does make sense to have it in
> > > > >>>>>>>>>>>> ´kafka-consumer-groups.sh´. What would be a consistent way to introduce
> > > > >>>>>>>>>>>> it?
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> Maybe something like this:
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --generate --group cg1 --topics t1
> > > > >>>>>>>>>>>> --reset-from 2017-01-01T00:00:00.000 --output plan.json´
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --verify --reset-json-file
> > > > >>>>>>>>>>>> plan.json´
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --execute --reset-json-file
> > > > >>>>>>>>>>>> plan.json´
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --generate-and-execute --group cg1
> > > > >>>>>>>>>>>> --topics t1 --reset-from 2017-01-01T00:00:00.000´
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> @Gwen:
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>>> It looks exactly like the replica assignment tool
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> It was influenced by it ;-) I use the generate-verify-execute process here
> > > > >>>>>>>>>>>> to make sure the user will be aware of the result of this operation. At the
> > > > >>>>>>>>>>>> beginning we considered only adding a couple of options to the Consumer
> > > > >>>>>>>>>>>> Group Command:
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> --rewind-to-timestamp and --rewind-to-period
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> @Onur:
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>>> You can actually get away with overriding while members of the group are
> > > > >>>>>>>>>>>>> live with method 2 by using group information from DescribeGroupsRequest.
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> Does this mean that we need to have the Consumer Group stopped before
> > > > >>>>>>>>>>>> executing and start a new consumer internally to do this? Therefore, we
> > > > >>>>>>>>>>>> won't be able to consider executing reset when the ConsumerGroup is active?
> > > > >>>>>>>>>>>> (trying to relate it with @Dong's 5th question)
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> @Dong:
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>>> Should we allow user to use wildcard to reset offset of all groups for a
> > > > >>>>>>>>>>>>> given topic as well?
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> I haven't thought about this scenario. Could be interesting. Following the
> > > > >>>>>>>>>>>> recommendation to add it into Consumer Group Command, in this case the Group
> > > > >>>>>>>>>>>> argument will be optional if there is only 1 topic. I think for multiple
> > > > >>>>>>>>>>>> topics it won't be that useful.
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>>> Should we allow user to specify timestamp per topic partition in the json
> > > > >>>>>>>>>>>>> file as well?
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> I don't think this would be valid from the tool, but if the Reset Plan is
> > > > >>>>>>>>>>>> generated, and the user wants to set the offset for a specific partition to
> > > > >>>>>>>>>>>> another offset (eventually based on another timestamp), and execute it, it
> > > > >>>>>>>>>>>> will be up to her/him.
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>>> Should the script take some credential file to make sure that this
> > > > >>>>>>>>>>>>> operation is authenticated given the potential impact of this operation?
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> Haven't tried to secure brokers yet, but the tool should support
> > > > >>>>>>>>>>>> authorization if it's enabled in the broker.
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>>> Should we provide constant to reset committed offset to earliest/latest
> > > > >>>>>>>>>>>>> offset of a partition, e.g. -1 indicates earliest offset and -2 indicates
> > > > >>>>>>>>>>>>> latest offset.
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> I will go for something like ´--reset-to-earliest´ and ´--reset-to-latest´
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>>> Should we allow dynamic change of the committed offset when consumers are
> > > > >>>>>>>>>>>>> running, such that consumer will seek to the newly committed offset and
> > > > >>>>>>>>>>>>> start consuming from there?
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> Not sure about this. I would recommend keeping it simple and asking the user
> > > > >>>>>>>>>>>> to stop consumers first. But I would consider it if the trade-offs are
> > > > >>>>>>>>>>>> clear.
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> @Matthias
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> Added :). And thanks a lot for your help defining this KIP!
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> On Wed, Feb 8, 2017 at 7:47, Gwen Shapira (<gwen@confluent.io>) wrote:
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>>> As long as the CLI is a bit consistent? Like, not just adding 3
> > > > >>>>>>>>>>>>> arguments and a JSON parser to the existing tool, right?
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>> On Tue, Feb 7, 2017 at 10:29 PM, Onur Karaman
> > > > >>>>>>>>>>>>> <on...@gmail.com> wrote:
> > > > >>>>>>>>>>>>>> I think it makes sense to just add the feature to
> > > > >>>>>>>>>>>>>> kafka-consumer-groups.sh
> > > > >>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>> On Tue, Feb 7, 2017 at 10:24 PM, Gwen Shapira <gwen@confluent.io> wrote:
> > > > >>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> Thanks for the KIP. I'm super happy about adding the capability.
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> I hate the interface, though. It looks exactly like the replica
> > > > >>>>>>>>>>>>>>> assignment tool. A tool everyone loves so much that there are multiple
> > > > >>>>>>>>>>>>>>> projects, open and closed, that try to fix it.
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> Can we swap it with something that looks a bit more like the consumer
> > > > >>>>>>>>>>>>>>> group tool? or the kafka streams reset tool? Consistency is helpful in
> > > > >>>>>>>>>>>>>>> such cases. I spent some time learning existing tools and learning yet
> > > > >>>>>>>>>>>>>>> another one is a deterrent.
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> Gwen
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> On Tue, Feb 7, 2017 at 6:43 PM, Jorge Esteban Quilcate Otoya
> > > > >>>>>>>>>>>>>>> <qu...@gmail.com> wrote:
> > > > >>>>>>>>>>>>>>>> Hi all,
> > > > >>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>> I would like to propose a KIP to Add a tool to Reset Consumer Group
> > > > >>>>>>>>>>>>>>>> Offsets.
> > > > >>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-122%3A+Add+a+tool+to+Reset+Consumer+Group+Offsets
> > > > >>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>> Please, take a look at the proposal and share your feedback.
> > > > >>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>> Thanks,
> > > > >>>>>>>>>>>>>>>> Jorge.
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> --
> > > > >>>>>>>>>>>>>>> Gwen Shapira
> > > > >>>>>>>>>>>>>>> Product Manager | Confluent
> > > > >>>>>>>>>>>>>>> 650.450.2760 | @gwenshap
> > > > >>>>>>>>>>>>>>> Follow us: Twitter | blog
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>> --
> > > > >>>>>>>>>>>>> Gwen Shapira
> > > > >>>>>>>>>>>>> Product Manager | Confluent
> > > > >>>>>>>>>>>>> 650.450.2760 | @gwenshap
> > > > >>>>>>>>>>>>> Follow us: Twitter | blog
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> --
> > > > >>>>>>>>>>> Gwen Shapira
> > > > >>>>>>>>>>> Product Manager | Confluent
> > > > >>>>>>>>>>> 650.450.2760 | @gwenshap
> > > > >>>>>>>>>>> Follow us: Twitter | blog
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>
> > > > >>>>>>>>>>
> > > > >>>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>
> > > > >>>>>>
> > > > >>>>>>
> > > > >>>>>
> > > > >>>>
> > > > >>>>
> > > > >>>
> > > > >>
> > > > >>
> > > > >
> > > >
> > > >
> > >
> > >
> > >
> > >
> > >
> >
> >
> >
> >
> >
>





Re: KIP-122: Add a tool to Reset Consumer Group Offsets

Posted by Jorge Esteban Quilcate Otoya <qu...@gmail.com>.
@Vahid: it makes sense to add "new lag" info IMO; I will update the KIP.
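
For reference, a minimal sketch of how a plan row could report both values, assuming lag is the partition's log end offset minus the group's offset. The function and field names below are hypothetical and are not taken from the KIP or from ConsumerGroupCommand:

```python
def lag(log_end_offset: int, offset: int) -> int:
    """Lag on one partition: messages between the given offset and the log end."""
    return max(log_end_offset - offset, 0)


def plan_row(topic, partition, log_end_offset, current_offset, new_offset):
    """Build one row of a hypothetical reset plan with current and new lag."""
    return {
        "topic": topic,
        "partition": partition,
        "current_offset": current_offset,
        "new_offset": new_offset,
        "current_lag": lag(log_end_offset, current_offset),
        "new_lag": lag(log_end_offset, new_offset),
    }


# A group that is fully caught up (lag 0) and would be rewound to offset 40.
row = plan_row("t1", 0, log_end_offset=100, current_offset=100, new_offset=40)
print(row["current_lag"], row["new_lag"])
```

Showing both columns makes it obvious that a current lag of 0 in the plan does not describe the state after the reset.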

@Becket:

1. About deleting: I think ConsumerGroupCommand already has an option to
delete group information by topic. From the delete docs: "Pass in groups to
delete topic partition offsets and ownership information over the entire
consumer group.". Let me know if this is enough for your case, or we
can consider adding something to the Reset Offsets tool.

2. Yes. For instance, to handle the case of active consumers, the tool will
validate that there are no active consumers in the group, to avoid race
conditions. I have added some code snippets to the wiki, thanks for pointing
that out.
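
The check in point 2 can be sketched roughly as below. This is only an illustration and not the snippet from the wiki: the function name is hypothetical, and treating "Empty" and "Dead" as the inactive group states is an assumption.

```python
# Group states in which no member is expected to be committing offsets
# (assumption for this sketch).
INACTIVE_STATES = {"Empty", "Dead"}


def ensure_group_inactive(group_id: str, state: str, member_count: int) -> None:
    """Raise if the group looks active, so a reset cannot race with live commits."""
    if state not in INACTIVE_STATES or member_count > 0:
        raise RuntimeError(
            f"Offsets can only be reset if group '{group_id}' is inactive, "
            f"but its state is {state} with {member_count} member(s)."
        )


ensure_group_inactive("cg1", "Empty", 0)  # inactive group: passes silently

try:
    ensure_group_inactive("cg1", "Stable", 2)  # live members: rejected
except RuntimeError as err:
    print(err)
```

Failing fast here keeps the tool simple: the user stops the consumers first, then runs the reset.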

On Sat, Feb 25, 2017 at 0:29, Becket Qin (<be...@gmail.com>) wrote:

> Thanks for the KIP Jorge. I think this is a useful KIP. I haven't read the
> KIP in detail yet, some comments from a quick review:
>
> 1. At a glance, it seems that there is no delete option. At LinkedIn we
> identified some cases where users want to delete the committed offsets of a
> group. It would be good to include that as well.
>
> 2. It seems the KIP is missing some key implementation points,
> e.g. how would the tool commit offsets for a consumer group, and does the
> broker need to know this is a special tool instead of an active consumer in
> the group (the generation check will be made on offset commit)? They are
> probably in your proof of concept code. Could you add them to the wiki as
> well?
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
> On Fri, Feb 24, 2017 at 1:19 PM, Vahid S Hashemian <
> vahidhashemian@us.ibm.com> wrote:
>
> > Thanks Jorge for addressing my question/suggestion.
> >
> > One last thing: I noticed that in the example you have for the "plan"
> > option
> > (
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-122%3A+Add+Reset+Consumer+Group+Offsets+tooling#KIP-122:AddResetConsumerGroupOffsetstooling-ExecutionOptions
> > )
> > under the "Description" column, you put 0 for lag. So I assume that is the
> > current lag being reported, and not the new lag. It might be helpful to
> > explicitly specify that (i.e. CURRENT-LAG) in the column header.
> > The other option is to report both current and new lags, but I understand
> > if we don't want to do that since it's rather redundant info.
> >
> > Thanks again.
> > --Vahid
> >
> >
> >
> > From:   Jorge Esteban Quilcate Otoya <qu...@gmail.com>
> > To:     dev@kafka.apache.org
> > Date:   02/24/2017 12:47 PM
> > Subject:        Re: KIP-122: Add a tool to Reset Consumer Group Offsets
> >
> >
> >
> > Hi Vahid,
> >
> > Thanks for your comments. Check my answers below:
> >
> > On Fri, Feb 24, 2017 at 19:41, Vahid S Hashemian (<vahidhashemian@us.ibm.com>) wrote:
> >
> > > Hi Jorge,
> > >
> > > Thanks for the useful KIP.
> > >
> > > I have a question regarding the proposed "plan" option.
> > > The "current offset" and "lag" values of a topic partition are meaningful
> > > within a consumer group. In other words, different consumer groups could
> > > have different values for these properties of each topic partition.
> > > I don't see that reflected in the discussion around the "plan" option,
> > > unless we are assuming a "--group" option is also provided by the user
> > > (it is not clear from the KIP whether that is the case).
> > >
> >
> > I have added an additional comment to state that these options will
> > require a "group" argument.
> > It is intended to affect only one Consumer Group.
> >
> >
> > >
> > > Also, I was wondering if you can provide at least one full command
> > > example for each of the "plan", "execute", and "export" options. They
> > > would definitely help in understanding some of the details.
> > >
> > >
> > Added to the KIP.
> >
> >
> > > Sorry for the delayed question/suggestion. I hope they make sense.
> > >
> > > Thanks.
> > > --Vahid
> > >
> > >
> > >
> > > From:   Jorge Esteban Quilcate Otoya <qu...@gmail.com>
> > > To:     dev@kafka.apache.org
> > > Date:   02/24/2017 09:51 AM
> > > Subject:        Re: KIP-122: Add a tool to Reset Consumer Group Offsets
> > >
> > >
> > >
> > > Great! KIP updated.
> > >
> > >
> > >
> > > On Fri, Feb 24, 2017 at 18:22, Matthias J. Sax (<ma...@confluent.io>) wrote:
> > >
> > > > I like this!
> > > >
> > > > --by-duration and --shift-by
> > > >
> > > >
> > > > -Matthias
> > > >
> > > > On 2/24/17 12:57 AM, Jorge Esteban Quilcate Otoya wrote:
> > > > > Renaming to --by-duration LGTM
> > > > >
> > > > > > Not sure about changing it to --shift-by-duration because we could
> > > > > > end up with the same redundancy as before with reset: --reset-offsets
> > > > > > --reset-to-*.
> > > > > >
> > > > > > Maybe changing --shift-offset-by to --shift-by 'n' could make it
> > > > > > consistent enough?
> > > > >
> > > > >
> > > > > > On Fri, Feb 24, 2017 at 6:39, Matthias J. Sax (<matthias@confluent.io>) wrote:
> > > > >
> > > > >> I just read the updated KIP once more.
> > > > >>
> > > > >> I would suggest renaming --to-duration to --by-duration
> > > > >>
> > > > >> Or, as a second idea, rename --to-duration to --shift-by-duration and at
> > > > >> the same time rename --shift-offset-by to --shift-by-offset
> > > > >>
> > > > >> Not sure what the best option is, but the naming would be more consistent
> > > > >> IMHO.
> > > > >>
> > > > >>
> > > > >>
> > > > >> -Matthias
> > > > >>
> > > > >> On 2/23/17 4:42 PM, Jorge Esteban Quilcate Otoya wrote:
> > > > >>> Hi All,
> > > > >>>
> > > > >>> If there are no more concerns, I'd like to start the vote for this KIP.
> > > > >>>
> > > > >>> Thanks!
> > > > >>> Jorge.
> > > > >>>
> > > > >>> On Thu, Feb 23, 2017 at 22:50, Jorge Esteban Quilcate Otoya (<quilcate.jorge@gmail.com>) wrote:
> > > > >>>
> > > > >>>> Oh ok :)
> > > > >>>>
> > > > >>>> So, we can keep `--topic t1:1,2,3`
> > > > >>>>
> > > > >>>> I think with this one we have most of the feedback applied. I will update
> > > > >>>> the KIP with this change.
> > > > >>>>
> > > > >>>> On Thu, Feb 23, 2017 at 22:38, Matthias J. Sax (<matthias@confluent.io>) wrote:
> > > > >>>>
> > > > >>>> Sounds reasonable.
> > > > >>>>
> > > > >>>> If we have multiple --topic arguments, it also does not matter if we use
> > > > >>>> t1:1,2 or t2=1,2
> > > > >>>>
> > > > >>>> I just suggested '=' because I wanted to use ':' to chain multiple topics.
> > > > >>>>
> > > > >>>>
> > > > >>>> -Matthias
> > > > >>>>
> > > > >>>> On 2/23/17 10:49 AM, Jorge Esteban Quilcate Otoya wrote:
> > > > >>>>> Yep, `--topic t1=1,2` LGTM
> > > > >>>>>
> > > > >>>>> I don't have an idea either about getting rid of the repeated --topic, but
> > > > >>>>> --group is also repeated in the case of deletion, so it could be OK to have
> > > > >>>>> repeated --topic arguments.
> > > > >>>>>
> > > > >>>>> On Thu, Feb 23, 2017 at 19:14, Matthias J. Sax (<matthias@confluent.io>) wrote:
> > > > >>>>>
> > > > >>>>>> So you suggest merging the "scope options" --topics, --topic, and
> > > > >>>>>> --partitions into a single option? Sounds good to me.
> > > > >>>>>>
> > > > >>>>>> I like the compact way to express it, i.e., topicname:list-of-partitions
> > > > >>>>>> with "all partitions" if no partitions are specified. It's quite
> > > > >>>>>> intuitive to use.
> > > > >>>>>>
> > > > >>>>>> Just wondering if we could get rid of the repeated --topic option; it's
> > > > >>>>>> somewhat verbose. I have no good idea how to improve it, though.
> > > > >>>>>>
> > > > >>>>>> If you concatenate multiple topics, we need one more character that is
> > > > >>>>>> not allowed in topic names to separate the topics:
> > > > >>>>>>
> > > > >>>>>>> invalidChars = {'/', '\\', ',', '\u0000', ':', '"', '\'', ';', '*',
> > > > >>>>>>> '?', ' ', '\t', '\r', '\n', '='};
> > > > >>>>>>
> > > > >>>>>> maybe
> > > > >>>>>>
> > > > >>>>>> --topics t1=1,2,3:t2:t3=3
> > > > >>>>>>
> > > > >>>>>> use '=' to specify partitions (instead of ':' as you proposed) and ':'
> > > > >>>>>> to separate topics? All other characters seem worse to me.
> > > > >>>>>> But maybe you have a better idea.
> > > > >>>>>>
> > > > >>>>>>
> > > > >>>>>>
> > > > >>>>>> -Matthias
> > > > >>>>>>
> > > > >>>>>>
> > > > >>>>>> On 2/23/17 3:15 AM, Jorge Esteban Quilcate Otoya wrote:
> > > > >>>>>>> @Matthias about the point 9:
> > > > >>>>>>>
> > > > >>>>>>> What about keeping only the --topic option, and supporting this format:
> > > > >>>>>>>
> > > > >>>>>>> `--topic t1:0,1,2 --topic t2 --topic t3:2`
> > > > >>>>>>>
> > > > >>>>>>> In this case topics t1, t2, and t3 will be selected: topic t1 with
> > > > >>>>>>> partitions 0, 1 and 2; topic t2 with all its partitions; and topic t3
> > > > >>>>>>> with only partition 2.
> > > > >>>>>>>
> > > > >>>>>>> Jorge.
> > > > >>>>>>>
> > > > >>>>>>> On Tue, Feb 21, 2017 at 11:11, Jorge Esteban Quilcate Otoya (<quilcate.jorge@gmail.com>) wrote:
> > > > >>>>>>>
> > > > >>>>>>>> Thanks for the feedback Matthias.
> > > > >>>>>>>>
> > > > >>>>>>>> * 1. You're right. I'll reorder the scenarios.
> > > > >>>>>>>>
> > > > >>>>>>>> * 2. Agree. I'll update the KIP.
> > > > >>>>>>>>
> > > > >>>>>>>> * 3. I like it, updating to `reset-offsets`
> > > > >>>>>>>>
> > > > >>>>>>>> * 4. Agree, removing the `reset-` part
> > > > >>>>>>>>
> > > > >>>>>>>> * 5. Yes, the 1.e option without --execute or --export will print out the
> > > > >>>>>>>> current offset and the new offset, which will be the same. The use-case of
> > > > >>>>>>>> this option is mostly to use it in combination with --export and have a
> > > > >>>>>>>> current 'checkpoint' to reset to later. I will add to the KIP what the
> > > > >>>>>>>> output should look like.
> > > > >>>>>>>>
> > > > >>>>>>>> * 6. Considering 4., I will update it to `--to-offset`
> > > > >>>>>>>>
> > > > >>>>>>>> * 7. I like the idea of unifying these options (plus, minus).
> > > > >>>>>>>> `shift-offsets-by` is a good option, but I would like some more feedback
> > > > >>>>>>>> here about the name. I will update the KIP in the meantime.
> > > > >>>>>>>>
> > > > >>>>>>>> * 8. Yes, discussed in 9.
> > > > >>>>>>>>
> > > > >>>>>>>> * 9. Agree. I'd love some feedback here. `topic` is already used by
> > > > >>>>>>>> `delete`, and we can add `--all-topics` to consider all topics/partitions
> > > > >>>>>>>> assigned to a group. How could we define specific topics/partitions?
> > > > >>>>>>>>
> > > > >>>>>>>> * 10. I hadn't thought about it, but it makes sense.
> > > > >>>>>>>> <topic>,<partition>,<offset> would be enough.
> > > > >>>>>>>>
> > > > >>>>>>>> * 11. Agree. Solved with 10.
> > > > >>>>>>>>
> > > > >>>>>>>> Also, I have a couple of changes to mention:
> > > > >>>>>>>>
> > > > >>>>>>>> 1. I have added a reference to the branch where I'm working on this KIP.
> > > > >>>>>>>>
> > > > >>>>>>>> 2. About the period scenario `--to-period`: I will change it to
> > > > >>>>>>>> `--to-duration` given that Duration
> > > > >>>>>>>> (https://docs.oracle.com/javase/8/docs/api/java/time/Duration.html)
> > > > >>>>>>>> follows this format: 'PnDTnHnMnS' and does not consider daylight saving
> > > > >>>>>>>> effects.
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> On Tue, Feb 21, 2017 at 2:47, Matthias J. Sax (<matthias@confluent.io>) wrote:
> > > > >>>>>>>>
> > > > >>>>>>>> Hi,
> > > > >>>>>>>>
> > > > >>>>>>>> thanks for updating the KIP. Couple of follow up comments:
> > > > >>>>>>>>
> > > > >>>>>>>> * Nit: Why are "Reset to Earliest" and "Reset to Latest" a "reset by
> > > > >>>>>>>> time" option -- IMHO they belong to "reset by position"?
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> * Nit: Description of "Reset to Earliest"
> > > > >>>>>>>>
> > > > >>>>>>>>> using Kafka Consumer's `auto.offset.reset` to `earliest`
> > > > >>>>>>>>
> > > > >>>>>>>> I think this is strictly speaking not correct (as auto.offset.reset is only
> > > > >>>>>>>> triggered if no valid offset is found, but this tool explicitly modifies the
> > > > >>>>>>>> committed offset), and should be phrased as
> > > > >>>>>>>>
> > > > >>>>>>>>> using Kafka Consumer's #seekToBeginning()
> > > > >>>>>>>>
> > > > >>>>>>>> -> similar issue for description of "Reset to Latest"
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> * Main option: rename to --reset-offsets (plural instead of singular)
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> * Scenario Options: I would remove "reset" from all options, because the
> > > > >>>>>>>> main argument "--reset-offset" already says what to do:
> > > > >>>>>>>>
> > > > >>>>>>>>> bin/kafka-consumer-groups.sh --reset-offset --reset-to-datetime XXX
> > > > >>>>>>>>
> > > > >>>>>>>> better (IMHO):
> > > > >>>>>>>>
> > > > >>>>>>>>> bin/kafka-consumer-groups.sh --reset-offsets --to-datetime XXX
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> * Option 1.e ("print and export current offset") is not intuitive to use
> > > > >>>>>>>> IMHO. The main option is "--reset-offset" but nothing happens if no
> > > > >>>>>>>> scenario is specified. It is also not specified what the output should
> > > > >>>>>>>> look like.
> > > > >>>>>>>>
> > > > >>>>>>>> Furthermore, --describe should actually show the currently committed offset
> > > > >>>>>>>> for a group. So it seems redundant to have the same option in
> > > > >>>>>>>> --reset-offsets
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> * Option 2.a: I would rename to "--reset-to-offset" (or, considering the
> > > > >>>>>>>> comment above, to "--to-offset")
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> * Option 2.b and 2.c: I would unify to "--shift-offsets-by" (or similar)
> > > > >>>>>>>> and accept positive/negative values
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> * About Scope "all": maybe it's better to have an option "--all-topics"
> > > > >>>>>>>> (or similar). IMHO explicit arguments are preferable over implicit
> > > > >>>>>>>> settings to guard against accidental misuse of the tool.
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> * Scope: I also think that "--topic" (singular) and "--topics" (plural)
> > > > >>>>>>>> are too similar and easy to use in a wrong way (i.e., mix up) -- maybe we
> > > > >>>>>>>> can have two options that are easier to distinguish.
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> * I still think that JSON is not the best format (it's too verbose/hard
> > > > >>>>>>>> to write for humans from scratch). A simple CSV format with an implicit
> > > > >>>>>>>> schema (topic,partition,offset) would be sufficient.
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> * Why does the JSON contain a "group_id" field -- there is the parameter
> > > > >>>>>>>> "--group" to specify the group ID. Would one overwrite the other (in what
> > > > >>>>>>>> order) or would there be an error if "--group" is used in combination
> > > > >>>>>>>> with "--reset-from-file"?
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> -Matthias
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> On 2/17/17 6:43 AM, Jorge Esteban Quilcate Otoya wrote:
> > > > >>>>>>>>> Hi,
> > > > >>>>>>>>>
> > > > >>>>>>>>> according to the feedback, I've updated the KIP:
> > > > >>>>>>>>>
> > > > >>>>>>>>> - We have added and ordered the scenarios, scopes and executions of the
> > > > >>>>>>>>> Reset Offset tool.
> > > > >>>>>>>>> - Consider it as an extension to the current `ConsumerGroupCommand` tool
> > > > >>>>>>>>> - Execution will be possible without generating JSON files.
> > > > >>>>>>>>>
> > > > >>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-122%3A+Add+Reset+Consumer+Group+Offsets+tooling
> > > > >>>>>>>>>
> > > > >>>>>>>>> Looking forward to your feedback!
> > > > >>>>>>>>>
> > > > >>>>>>>>> Jorge.
> > > > >>>>>>>>>
> > > > >>>>>>>>> On Wed, Feb 8, 2017 at 23:23, Jorge Esteban Quilcate Otoya (<quilcate.jorge@gmail.com>) wrote:
> > > > >>>>>>>>>
> > > > >>>>>>>>>> Great. I think I got the idea. What about these options:
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Scenarios:
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> 1. Current status
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1´
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> 2. To Datetime
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-datetime
> > > > >>>>>>>>>> 2017-01-01T00:00:00.000´
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> 3. To Period
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-period P2D´
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> 4. To Earliest
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-earliest´
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> 5. To Latest
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-latest´
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> 6. Minus 'n' offsets
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-minus n´
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> 7. Plus 'n' offsets
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-plus n´
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> 8. To specific offset
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to x´
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Scopes:
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> a. All topics used by Consumer Group
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Don't specify --topics
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> b. Specific List of Topics
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Add list of values in --topics t1,t2,tn
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> c. One Topic, all Partitions
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Add one topic and no partitions values: --topic t1
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> d. One Topic, List of Partitions
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Add one topic and partitions values: --topic t1 --partitions 0,1,2
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> About Reset Plan (JSON file):
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> I think it is still valid to have the option to persist the reset
> > > > >>>>>>>>>> configuration as a file, but I agree to give the option to run the tool
> > > > >>>>>>>>>> without going down to the JSON file.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Execution options:
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> 1. Without execution argument (No args):
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Print out results (reset plan)
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> 2. With --execute argument:
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Run reset process
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> 3. With --output argument:
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Save result in a JSON format.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> 4. Only with --execute option and --reset-file (path to JSON)
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Reset based on file
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> 5. Only with --verify option and --reset-file (path to JSON)
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Verify file values with current offsets
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> I think we can remove --generate-and-execute because it is a bit clumsy.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> With these options we will be able to execute with manual JSON
> > > > >>>>>>>>>> configuration.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> On Wed, Feb 8, 2017 at 22:43, Ben Stopford (<ben@confluent.io>) wrote:
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Yes - using a tool like this to skip a set of consumer groups over a
> > > > >>>>>>>>>> corrupt/bad message is definitely appealing.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> B
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> On Wed, Feb 8, 2017 at 9:37 PM Gwen Shapira <gw...@confluent.io> wrote:
> > > > >>>>>>>>>>
> > > > >>>>>>>>>>> I like the --reset-to-earliest and --reset-to-latest. In general,
> > > > >>>>>>>>>>> since the JSON route is the most challenging for users, we want to
> > > > >>>>>>>>>>> provide a lot of ways to do useful things without going there.
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> Two things that can help:
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> 1. A lot of times, users want to skip a few messages that cause issues
> > > > >>>>>>>>>>> and continue. Maybe just specifying the topic, partition and delta
> > > > >>>>>>>>>>> will be better than having to find the offset and write a JSON and
> > > > >>>>>>>>>>> validate the JSON etc.
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> 2. Thinking about whether there are other common use-cases that we can
> > > > >>>>>>>>>>> make easy rather than just one generic but not very usable method.
> > > > >>>>>>>>>>> Gwen
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> On Wed, Feb 8, 2017 at 3:25 AM, Jorge Esteban Quilcate Otoya
> > > > >>>>>>>>>>> <qu...@gmail.com> wrote:
> > > > >>>>>>>>>>>> Thanks for the feedback!
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> @Onur, @Gwen:
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> Agree. Actually, in the first draft I considered having it inside
> > > > >>>>>>>>>>>> ´kafka-consumer-groups.sh´, but I decided to propose it as a standalone
> > > > >>>>>>>>>>>> tool to describe it clearly and focus it on reset functionality.
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> But now that you mention it, it does make sense to have it in
> > > > >>>>>>>>>>>> ´kafka-consumer-groups.sh´. What would be a consistent way to introduce
> > > > >>>>>>>>>>>> it?
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> Maybe something like this:
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --generate --group cg1 --topics t1
> > > > >>>>>>>>>>>> --reset-from 2017-01-01T00:00:00.000 --output plan.json´
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --verify --reset-json-file
> > > > >>>>>>>>>>>> plan.json´
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --execute --reset-json-file
> > > > >>>>>>>>>>>> plan.json´
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --generate-and-execute --group cg1
> > > > >>>>>>>>>>>> --topics t1 --reset-from 2017-01-01T00:00:00.000´
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> @Gwen:
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>>> It looks exactly like the replica assignment tool
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> It was influenced by it ;-) I use the generate-verify-execute process here
> > > > >>>>>>>>>>>> to make sure the user will be aware of the result of this operation. At the
> > > > >>>>>>>>>>>> beginning we considered only adding a couple of options to Consumer Group
> > > > >>>>>>>>>>>> Command:
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> --rewind-to-timestamp and --rewind-to-period
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> @Onur:
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>>> You can actually get away with overriding while members of the group
> > > > >>>>>>>>>>>>> are live with method 2 by using group information from
> > > > >>>>>>>>>>>>> DescribeGroupsRequest.
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> Does this mean that we need to have the Consumer Group stopped before
> > > > >>>>>>>>>>>> executing, and start a new consumer internally to do this? Therefore, we
> > > > >>>>>>>>>>>> won't be able to consider executing a reset when the ConsumerGroup is
> > > > >>>>>>>>>>>> active? (trying to relate it to @Dong's 5th question)
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> @Dong:
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>>> Should we allow user to use wildcard to reset offset of all groups
> > > > >>>>>>>>>>>>> for a given topic as well?
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> I haven't thought about this scenario. Could be interesting. Following
> > > > >>>>>>>>>>>> the recommendation to add it into Consumer Group Command, in this case the
> > > > >>>>>>>>>>>> Group argument will be optional if there is only 1 topic. I think for
> > > > >>>>>>>>>>>> multiple topics it won't be that useful.
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>>> Should we allow user to specify timestamp per topic partition in the
> > > > >>>>>>>>>>>>> json file as well?
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> I don't think this would be valid from the tool, but if a Reset Plan is
> > > > >>>>>>>>>>>> generated, and the user wants to set the offset for a specific partition to
> > > > >>>>>>>>>>>> another offset (eventually based on another timestamp), and execute it,
> > > > >>>>>>>>>>>> it will be up to her/him.
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>>> Should the script take some credential file to make sure that this
> > > > >>>>>>>>>>>>> operation is authenticated given the potential impact of this operation?
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> Haven't tried to secure brokers yet, but the tool should support
> > > > >>>>>>>>>>>> authorization if it's enabled in the broker.
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>>> Should we provide constant to reset committed offset to earliest/latest
> > > > >>>>>>>>>>>>> offset of a partition, e.g. -1 indicates earliest offset and -2 indicates
> > > > >>>>>>>>>>>>> latest offset.
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> I will go for something like ´--reset-to-earliest´ and ´--reset-to-latest´
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>>> Should we allow dynamic change of the committed offset when consumers are
> > > > >>>>>>>>>>>>> running, such that consumer will seek to the newly committed offset and
> > > > >>>>>>>>>>>>> start consuming from there?
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> Not sure about this. I would recommend keeping it simple and asking the
> > > > >>>>>>>>>>>> user to stop consumers first. But I would consider it if the trade-offs
> > > > >>>>>>>>>>>> are clear.
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> @Matthias
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> Added :). And thanks a lot for your help to define this KIP!
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> On Wed, Feb 8, 2017 at 7:47, Gwen Shapira (<gwen@confluent.io>) wrote:
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>>> As long as the CLI is a bit consistent? Like, not just adding 3
> > > > >>>>>>>>>>>>> arguments and a JSON parser to the existing tool, right?
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>> On Tue, Feb 7, 2017 at 10:29 PM, Onur Karaman
> > > > >>>>>>>>>>>>> <on...@gmail.com> wrote:
> > > > >>>>>>>>>>>>>> I think it makes sense to just add the feature to
> > > > >>>>>>>>>>>>> kafka-consumer-groups.sh
> > > > >>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>> On Tue, Feb 7, 2017 at 10:24 PM, Gwen Shapira <gwen@confluent.io> wrote:
> > > > >>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> Thanks for the KIP. I'm super happy about adding the capability.
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> I hate the interface, though. It looks exactly like the replica
> > > > >>>>>>>>>>>>>>> assignment tool. A tool everyone loves so much that there are multiple
> > > > >>>>>>>>>>>>>>> projects, open and closed, that try to fix it.
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> Can we swap it with something that looks a bit more like the consumer
> > > > >>>>>>>>>>>>>>> group tool? or the kafka streams reset tool? Consistency is helpful in
> > > > >>>>>>>>>>>>>>> such cases. I spent some time learning existing tools and learning yet
> > > > >>>>>>>>>>>>>>> another one is a deterrent.
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> Gwen
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> On Tue, Feb 7, 2017 at 6:43 PM, Jorge Esteban Quilcate Otoya
> > > > >>>>>>>>>>>>>>> <qu...@gmail.com> wrote:
> > > > >>>>>>>>>>>>>>>> Hi all,
> > > > >>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>> I would like to propose a KIP to Add a tool to Reset Consumer Group
> > > > >>>>>>>>>>>>>>>> Offsets.
> > > > >>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-122%3A+Add+a+tool+to+Reset+Consumer+Group+Offsets
> > > > >>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>> Please, take a look at the proposal and share your feedback.
> > > > >>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>> Thanks,
> > > > >>>>>>>>>>>>>>>> Jorge.
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> --
> > > > >>>>>>>>>>>>>>> Gwen Shapira
> > > > >>>>>>>>>>>>>>> Product Manager | Confluent
> > > > >>>>>>>>>>>>>>> 650.450.2760 | @gwenshap
> > > > >>>>>>>>>>>>>>> Follow us: Twitter | blog
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>> --
> > > > >>>>>>>>>>>>> Gwen Shapira
> > > > >>>>>>>>>>>>> Product Manager | Confluent
> > > > >>>>>>>>>>>>> 650.450.2760 | @gwenshap
> > > > >>>>>>>>>>>>> Follow us: Twitter | blog
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> --
> > > > >>>>>>>>>>> Gwen Shapira
> > > > >>>>>>>>>>> Product Manager | Confluent
> > > > >>>>>>>>>>> 650.450.2760 | @gwenshap
> > > > >>>>>>>>>>> Follow us: Twitter | blog
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>
> > > > >>>>>>>>>>
> > > > >>>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>
> > > > >>>>>>
> > > > >>>>>>
> > > > >>>>>
> > > > >>>>
> > > > >>>>
> > > > >>>
> > > > >>
> > > > >>
> > > > >
> > > >
> > > >
> > >
> > >
> > >
> > >
> > >
> >
> >
> >
> >
> >
>

Re: KIP-122: Add a tool to Reset Consumer Group Offsets

Posted by Becket Qin <be...@gmail.com>.
Thanks for the KIP, Jorge. I think this is a useful KIP. I haven't read the
KIP in detail yet, but here are some comments from a quick review:

1. At a glance, there seems to be no delete option. At LinkedIn we
identified some cases where users want to delete the committed offsets of a
group. It would be good to include that as well.

2. The KIP seems to be missing some key implementation points, e.g. how
would the tool commit offsets for a consumer group? Does the broker need to
know that this is a special tool rather than an active consumer in the group
(the generation check will be made on offset commit)? These are probably
covered in your proof-of-concept code. Could you add them to the wiki as
well?

Thanks,

Jiangjie (Becket) Qin
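
The generation check mentioned in point 2 can be illustrated with a toy
model. This is only a sketch under stated assumptions, not broker code: the
class, method, and constant names are hypothetical, and the rule that a
non-member commit (generation -1) is accepted only while the group has no
active members is an assumption of this sketch.

```java
import java.util.Collections;
import java.util.Set;

// Toy model of a coordinator-side check on offset commits (hypothetical names).
class OffsetCommitCheck {
    enum Result { ACCEPTED, UNKNOWN_MEMBER_ID, ILLEGAL_GENERATION }

    // Assumption of this sketch: a committer that is not a group member sends -1.
    static final int NO_GENERATION = -1;

    static Result check(Set<String> activeMembers, int groupGeneration,
                        String memberId, int generationId) {
        // A standalone committer (such as a reset tool) is accepted only
        // while nobody in the group is actively consuming.
        if (generationId == NO_GENERATION && activeMembers.isEmpty()) {
            return Result.ACCEPTED;
        }
        // Otherwise the committer must be a known member of the group...
        if (!activeMembers.contains(memberId)) {
            return Result.UNKNOWN_MEMBER_ID;
        }
        // ...and must commit with the group's current generation.
        if (generationId != groupGeneration) {
            return Result.ILLEGAL_GENERATION;
        }
        return Result.ACCEPTED;
    }

    public static void main(String[] args) {
        Set<String> nobody = Collections.<String>emptySet();
        Set<String> live = Collections.singleton("consumer-1");
        // Tool commits while the group is stopped.
        System.out.println(check(nobody, 5, "", NO_GENERATION));
        // Tool commits while a consumer is still active.
        System.out.println(check(live, 5, "", NO_GENERATION));
        // Active member commits with a stale generation.
        System.out.println(check(live, 5, "consumer-1", 4));
    }
}
```

Under this model, a reset tool that is not a group member could only commit
while the group is inactive, which is why the question about how the tool
identifies itself to the broker matters.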

On Fri, Feb 24, 2017 at 1:19 PM, Vahid S Hashemian <
vahidhashemian@us.ibm.com> wrote:

> Thanks Jorge for addressing my question/suggestion.
>
> One last thing I noticed is that in the example you have for the "plan"
> option
> (
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> 122%3A+Add+Reset+Consumer+Group+Offsets+tooling#KIP-122:
> AddResetConsumerGroupOffsetstooling-ExecutionOptions
> )
> under "Description" column, you put 0 for lag. So I assume that is the
> current lag being reported, and not the new lag. Might be helpful to
> explicitly specify that (i.e. CURRENT-LAG) in the column header.
> The other option is to report both current and new lags, but I understand
> if we don't want to do that since it's rather redundant info.
>
> Thanks again.
> --Vahid
>
>
>
> From:   Jorge Esteban Quilcate Otoya <qu...@gmail.com>
> To:     dev@kafka.apache.org
> Date:   02/24/2017 12:47 PM
> Subject:        Re: KIP-122: Add a tool to Reset Consumer Group Offsets
>
>
>
> Hi Vahid,
>
> Thanks for your comments. Check my answers below:
>
> On Fri, Feb 24, 2017 at 19:41, Vahid S Hashemian (<
> vahidhashemian@us.ibm.com>) wrote:
>
> > Hi Jorge,
> >
> > Thanks for the useful KIP.
> >
> > I have a question regarding the proposed "plan" option.
> > The "current offset" and "lag" values of a topic partition are
> meaningful
> > within a consumer group. In other words, different consumer groups could
> > have different values for these properties of each topic partition.
> > I don't see that reflected in the discussion around the "plan" option.
> > Unless we are assuming a "--group" option is also provided by user
> (which
> > is not clear from the KIP if that is the case).
> >
>
> I have added a comment stating that these options will require
> a "group" argument.
> It is intended to affect only one Consumer Group.
>
>
> >
> > Also, I was wondering if you can provide at least one full command
> example
> > for each of the "plan", "execute", and "export" options. They would
> > definitely help in understanding some of the details.
> >
> >
> Added to the KIP.
>
>
> > Sorry for the delayed question/suggestion. I hope they make sense.
> >
> > Thanks.
> > --Vahid
> >
> >
> >
> > From:   Jorge Esteban Quilcate Otoya <qu...@gmail.com>
> > To:     dev@kafka.apache.org
> > Date:   02/24/2017 09:51 AM
> > Subject:        Re: KIP-122: Add a tool to Reset Consumer Group Offsets
> >
> >
> >
> > Great! KIP updated.
> >
> >
> >
> > On Fri, Feb 24, 2017 at 18:22, Matthias J. Sax
> > (<ma...@confluent.io>)
> > wrote:
> >
> > > I like this!
> > >
> > > --by-duration and --shift-by
> > >
> > >
> > > -Matthias
> > >
> > > On 2/24/17 12:57 AM, Jorge Esteban Quilcate Otoya wrote:
> > > > Renaming to --by-duration LGTM
> > > >
> > > > Not sure about changing it to --shift-by-duration because we could
> end
> > up
> > > > with the same redundancy as before with reset: --reset-offsets
> > > > --reset-to-*.
> > > >
> > > > Maybe changing --shift-offset-by to --shift-by 'n' could make it
> > > consistent
> > > > enough?
> > > >
> > > >
> > > > On Fri, Feb 24, 2017 at 6:39, Matthias J. Sax (<matthias@confluent.io>)
> > > > wrote:
> > > >
> > > > >> I just read the updated KIP once more.
> > > > >>
> > > > >> I would suggest renaming --to-duration to --by-duration
> > > > >>
> > > > >> Or, as a second idea, rename --to-duration to --shift-by-duration and
> > > > >> at the same time rename --shift-offset-by to --shift-by-offset
> > > > >>
> > > > >> Not sure what the best option is, but the naming would be more
> > > > >> consistent IMHO.
> > > >>
> > > >>
> > > >>
> > > >> -Matthias
> > > >>
> > > >> On 2/23/17 4:42 PM, Jorge Esteban Quilcate Otoya wrote:
> > > >>> Hi All,
> > > >>>
> > > > >>> If there are no more concerns, I'd like to start the vote on this KIP.
> > > >>>
> > > >>> Thanks!
> > > >>> Jorge.
> > > >>>
> > > > >>> On Thu, Feb 23, 2017 at 22:50, Jorge Esteban Quilcate Otoya (<
> > > > >>> quilcate.jorge@gmail.com>) wrote:
> > > >>>
> > > >>>> Oh ok :)
> > > >>>>
> > > >>>> So, we can keep `--topic t1:1,2,3`
> > > >>>>
> > > >>>> I think with this one we have most of the feedback applied. I
> will
> > > >> update
> > > >>>> the KIP with this change.
> > > >>>>
> > > > >>>> On Thu, Feb 23, 2017 at 22:38, Matthias J. Sax (<matthias@confluent.io>)
> > > > >>>> wrote:
> > > >>>>
> > > >>>> Sounds reasonable.
> > > >>>>
> > > > >>>> If we have multiple --topic arguments, it also does not matter whether
> > > > >>>> we use t1:1,2 or t2=1,2
> > > > >>>>
> > > > >>>> I just suggested '=' because I wanted to use ':' to chain multiple
> > > > >>>> topics.
> > > >>>>
> > > >>>>
> > > >>>> -Matthias
> > > >>>>
> > > >>>> On 2/23/17 10:49 AM, Jorge Esteban Quilcate Otoya wrote:
> > > > >>>>> Yep, `--topic t1=1,2` LGTM
> > > > >>>>>
> > > > >>>>> I don't have a good idea either for getting rid of the repeated --topic,
> > > > >>>>> but --group is also repeated in the case of deletion, so it could be ok
> > > > >>>>> to have repeated --topic arguments.
> > > >>>>>
> > > > >>>>> On Thu, Feb 23, 2017 at 19:14, Matthias J. Sax (<matthias@confluent.io>)
> > > > >>>>> wrote:
> > > >>>>>
> > > > >>>>>> So you suggest merging the "scope options" --topics, --topic, and
> > > > >>>>>> --partitions into a single option? Sounds good to me.
> > > > >>>>>>
> > > > >>>>>> I like the compact way to express it, i.e.,
> > > > >>>>>> topicname:list-of-partitions, meaning "all partitions" if no
> > > > >>>>>> partitions are specified. It's quite intuitive to use.
> > > > >>>>>>
> > > > >>>>>> Just wondering if we could get rid of the repeated --topic option;
> > > > >>>>>> it's somewhat verbose. I have no good idea how to improve it, though.
> > > > >>>>>>
> > > > >>>>>> If you concatenate multiple topics, we need one more character that
> > > > >>>>>> is not allowed in topic names to separate the topics:
> > > >>>>>>
> > > >>>>>>> invalidChars = {'/', '\\', ',', '\u0000', ':', '"', '\'', ';',
> > '*',
> > > >>>>>> '?', ' ', '\t', '\r', '\n', '='};
> > > >>>>>>
> > > >>>>>> maybe
> > > >>>>>>
> > > >>>>>> --topics t1=1,2,3:t2:t3=3
> > > >>>>>>
> > > >>>>>> use '=' to specify partitions (instead of ':' as you proposed)
> > and
> > > ':'
> > > >>>>>> to separate topics? All other characters seem to be worse to
> use
> > to
> > > >> me.
> > > >>>>>> But maybe you have a better idea.
> > > >>>>>>
> > > >>>>>>
> > > >>>>>>
> > > >>>>>> -Matthias
> > > >>>>>>
> > > >>>>>>
> > > >>>>>> On 2/23/17 3:15 AM, Jorge Esteban Quilcate Otoya wrote:
> > > >>>>>>> @Matthias about the point 9:
> > > >>>>>>>
> > > >>>>>>> What about keeping only the --topic option, and support this
> > > format:
> > > >>>>>>>
> > > >>>>>>> `--topic t1:0,1,2 --topic t2 --topic t3:2`
> > > >>>>>>>
> > > >>>>>>> In this case topics t1, t2, and t3 will be selected: topic t1
> > with
> > > >>>>>>> partitions 0,1 and 2; topic t2 with all its partitions; and
> > topic
> > > t3,
> > > >>>>>> with
> > > >>>>>>> only partition 2.
> > > >>>>>>>
> > > >>>>>>> Jorge.
> > > >>>>>>>
> > > > >>>>>>> On Tue, Feb 21, 2017 at 11:11, Jorge Esteban Quilcate Otoya (<
> > > > >>>>>>> quilcate.jorge@gmail.com>) wrote:
> > > >>>>>>>
> > > >>>>>>>> Thanks for the feedback Matthias.
> > > >>>>>>>>
> > > >>>>>>>> * 1. You're right. I'll reorder the scenarios.
> > > >>>>>>>>
> > > >>>>>>>> * 2. Agree. I'll update the KIP.
> > > >>>>>>>>
> > > >>>>>>>> * 3. I like it, updating to `reset-offsets`
> > > >>>>>>>>
> > > >>>>>>>> * 4. Agree, removing the `reset-` part
> > > >>>>>>>>
> > > > >>>>>>>> * 5. Yes, option 1.e without --execute or --export will print out the
> > > > >>>>>>>> current offset and the new offset, which will be the same. The use case
> > > > >>>>>>>> of this option is mostly to combine it with --export and keep a
> > > > >>>>>>>> 'checkpoint' of the current offsets to reset to later. I will add to
> > > > >>>>>>>> the KIP what the output should look like.
> > > >>>>>>>>
> > > >>>>>>>> * 6. Considering 4., I will update it to `--to-offset`
> > > >>>>>>>>
> > > > >>>>>>>> * 7. I like the idea of unifying these options (plus, minus).
> > > > >>>>>>>> `shift-offsets-by` is a good option, but I would like some more
> > > > >>>>>>>> feedback here about the name. I will update the KIP in the meantime.
> > > > >>>>>>>>
> > > > >>>>>>>> * 8. Yes, discussed in 9.
> > > > >>>>>>>>
> > > > >>>>>>>> * 9. Agree. I'd love some feedback here. `topic` is already used by
> > > > >>>>>>>> `delete`, and we can add `--all-topics` to consider all
> > > > >>>>>>>> topics/partitions assigned to a group. How could we define specific
> > > > >>>>>>>> topics/partitions?
> > > > >>>>>>>>
> > > > >>>>>>>> * 10. I haven't thought about it, but it makes sense.
> > > > >>>>>>>> <topic>,<partition>,<offset> would be enough.
> > > >>>>>>>>
> > > >>>>>>>> * 11. Agree. Solved with 10.
> > > >>>>>>>>
> > > >>>>>>>> Also, I have a couple of changes to mention:
> > > >>>>>>>>
> > > > >>>>>>>> 1. I have added a reference to the branch where I'm working on this
> > > > >>>>>>>> KIP.
> > > > >>>>>>>>
> > > > >>>>>>>> 2. About the period scenario `--to-period`: I will change it to
> > > > >>>>>>>> `--to-duration`, given that duration
> > > > >>>>>>>> (https://docs.oracle.com/javase/8/docs/api/java/time/Duration.html)
> > > > >>>>>>>> follows this format: 'PnDTnHnMnS' and does not consider daylight
> > > > >>>>>>>> saving effects.
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>>
> > > > >>>>>>>> On Tue, Feb 21, 2017 at 2:47, Matthias J. Sax (<matthias@confluent.io>)
> > > > >>>>>>>> wrote:
> > > >>>>>>>>
> > > >>>>>>>> Hi,
> > > >>>>>>>>
> > > >>>>>>>> thanks for updating the KIP. Couple of follow up comments:
> > > >>>>>>>>
> > > >>>>>>>> * Nit: Why is "Reset to Earliest" and "Reset to Latest" a
> > "reset
> > > by
> > > >>>>>>>> time" option -- IMHO it belongs to "reset by position"?
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> * Nit: Description of "Reset to Earliest"
> > > >>>>>>>>
> > > >>>>>>>>> using Kafka Consumer's `auto.offset.reset` to `earliest`
> > > >>>>>>>>
> > > > >>>>>>>> I think this is, strictly speaking, not correct (auto.offset.reset is
> > > > >>>>>>>> only triggered if no valid offset is found, whereas this tool explicitly
> > > > >>>>>>>> modifies the committed offset), and it should be phrased as
> > > >>>>>>>>
> > > >>>>>>>>> using Kafka Consumer's #seekToBeginning()
> > > >>>>>>>>
> > > >>>>>>>> -> similar issue for description of "Reset to Latest"
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> * Main option: rename to --reset-offsets (plural instead of
> > > >> singular)
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> * Scenario Options: I would remove "reset" from all options,
> > > because
> > > >>>> the
> > > >>>>>>>> main argument "--reset-offset" says already what to do:
> > > >>>>>>>>
> > > >>>>>>>>> bin/kafka-consumer-groups.sh --reset-offset
> > --reset-to-datetime
> > > XXX
> > > >>>>>>>>
> > > >>>>>>>> better (IMHO):
> > > >>>>>>>>
> > > >>>>>>>>> bin/kafka-consumer-groups.sh --reset-offsets --to-datetime
> XXX
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> * Option 1.e ("print and export current offset") is not
> > intuitive
> > > to
> > > >>>> use
> > > >>>>>>>> IMHO. The main option is "--reset-offset" but nothing happens
> > if
> > > no
> > > >>>>>>>> scenario is specified. It is also not specified, what the
> > output
> > > >>>> should
> > > >>>>>>>> look like?
> > > >>>>>>>>
> > > >>>>>>>> Furthermore, --describe should actually show currently
> > committed
> > > >>>> offset
> > > >>>>>>>> for a group. So it seems to be redundant to have the same
> > option
> > > in
> > > >>>>>>>> --reset-offsets
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> * Option 2.a: I would rename to "--reset-to-offset" (or
> > > considering
> > > >>>> the
> > > >>>>>>>> comment above to "--to-offset")
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> * Option 2.b and 2.c: I would unify to "--shift-offsets-by"
> (or
> > > >>>> similar)
> > > >>>>>>>> and accept positive/negative values
> > > >>>>>>>>
> > > >>>>>>>>
> > > > >>>>>>>> * About Scope "all": maybe it's better to have an option "--all-topics"
> > > > >>>>>>>> (or similar). IMHO explicit arguments are preferable to implicit
> > > > >>>>>>>> settings, to guard against accidental misuse of the tool.
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> * Scope: I also think, that "--topic" (singular) and
> "--topics"
> > > >>>> (plural)
> > > >>>>>>>> are too similar and easy to use in a wrong way (ie, mix up)
> --
> > > maybe
> > > >>>> we
> > > >>>>>>>> can have two options that are easier to distinguish.
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> * I still think that JSON is not the best format (it's too
> > > >>>> verbose/hard
> > > >>>>>>>> to write for humans from scratch). A simple CSV format with
> > > implicit
> > > >>>>>>>> schema (topic,partition,offset) would be sufficient.
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> * Why does the JSON contain "group_id" field -- there is
> > parameter
> > > >>>>>>>> "--group" to specify the group ID. Would one overwrite the
> > other
> > > >> (what
> > > >>>>>>>> order) or would there be an error if "--group" is used in
> > > >> combination
> > > >>>>>>>> with "--reset-from-file"?
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> -Matthias
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> On 2/17/17 6:43 AM, Jorge Esteban Quilcate Otoya wrote:
> > > >>>>>>>>> Hi,
> > > >>>>>>>>>
> > > >>>>>>>>> according to the feedback, I've updated the KIP:
> > > >>>>>>>>>
> > > >>>>>>>>> - We have added and ordered the scenarios, scopes and
> > executions
> > > of
> > > >>>> the
> > > >>>>>>>>> Reset Offset tool.
> > > >>>>>>>>> - Consider it as an extension to the current
> > > `ConsumerGroupCommand`
> > > >>>>>> tool
> > > >>>>>>>>> - Execution will be possible without generating JSON files.
> > > >>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>
> > > >>>>>>
> > > >>>>
> > > >>
> > >
> >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> 122%3A+Add+Reset+Consumer+Group+Offsets+tooling
>
> >
> > > >>>>>>>>>
> > > >>>>>>>>> Looking forward to your feedback!
> > > >>>>>>>>>
> > > >>>>>>>>> Jorge.
> > > >>>>>>>>>
> > > > >>>>>>>>> On Wed, Feb 8, 2017 at 23:23, Jorge Esteban Quilcate Otoya (<
> > > > >>>>>>>>> quilcate.jorge@gmail.com>) wrote:
> > > >>>>>>>>>
> > > > >>>>>>>>>> Great. I think I got the idea. What about these options:
> > > >>>>>>>>>>
> > > >>>>>>>>>> Scenarios:
> > > >>>>>>>>>>
> > > >>>>>>>>>> 1. Current status
> > > >>>>>>>>>>
> > > >>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1´
> > > >>>>>>>>>>
> > > >>>>>>>>>> 2. To Datetime
> > > >>>>>>>>>>
> > > >>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1
> > > >>>>>> --reset-to-datetime
> > > >>>>>>>>>> 2017-01-01T00:00:00.000´
> > > >>>>>>>>>>
> > > >>>>>>>>>> 3. To Period
> > > >>>>>>>>>>
> > > >>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1
> > > >>>> --reset-to-period
> > > >>>>>>>> P2D´
> > > >>>>>>>>>>
> > > >>>>>>>>>> 4. To Earliest
> > > >>>>>>>>>>
> > > >>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1
> > > >>>>>>>> --reset-to-earliest´
> > > >>>>>>>>>>
> > > >>>>>>>>>> 5. To Latest
> > > >>>>>>>>>>
> > > >>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1
> > > >>>>>> --reset-to-latest´
> > > >>>>>>>>>>
> > > >>>>>>>>>> 6. Minus 'n' offsets
> > > >>>>>>>>>>
> > > >>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1
> > > --reset-minus
> > > >>>> n´
> > > >>>>>>>>>>
> > > >>>>>>>>>> 7. Plus 'n' offsets
> > > >>>>>>>>>>
> > > >>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1
> > > --reset-plus
> > > >> n´
> > > >>>>>>>>>>
> > > >>>>>>>>>> 8. To specific offset
> > > >>>>>>>>>>
> > > >>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1
> > --reset-to
> > > x´
> > > >>>>>>>>>>
> > > >>>>>>>>>> Scopes:
> > > >>>>>>>>>>
> > > >>>>>>>>>> a. All topics used by Consumer Group
> > > >>>>>>>>>>
> > > >>>>>>>>>> Don't specify --topics
> > > >>>>>>>>>>
> > > >>>>>>>>>> b. Specific List of Topics
> > > >>>>>>>>>>
> > > >>>>>>>>>> Add list of values in --topics t1,t2,tn
> > > >>>>>>>>>>
> > > >>>>>>>>>> c. One Topic, all Partitions
> > > >>>>>>>>>>
> > > >>>>>>>>>> Add one topic and no partitions values: --topic t1
> > > >>>>>>>>>>
> > > >>>>>>>>>> d. One Topic, List of Partitions
> > > >>>>>>>>>>
> > > >>>>>>>>>> Add one topic and partitions values: --topic t1
> --partitions
> > > 0,1,2
> > > >>>>>>>>>>
> > > >>>>>>>>>> About Reset Plan (JSON file):
> > > >>>>>>>>>>
> > > > >>>>>>>>>> I think it is still valid to have the option to persist the reset
> > > > >>>>>>>>>> configuration as a file, but I agree to give the option to run the tool
> > > > >>>>>>>>>> without going down to the JSON file.
> > > >>>>>>>>>>
> > > >>>>>>>>>> Execution options:
> > > >>>>>>>>>>
> > > >>>>>>>>>> 1. Without execution argument (No args):
> > > >>>>>>>>>>
> > > >>>>>>>>>> Print out results (reset plan)
> > > >>>>>>>>>>
> > > >>>>>>>>>> 2. With --execute argument:
> > > >>>>>>>>>>
> > > >>>>>>>>>> Run reset process
> > > >>>>>>>>>>
> > > >>>>>>>>>> 3. With --output argument:
> > > >>>>>>>>>>
> > > >>>>>>>>>> Save result in a JSON format.
> > > >>>>>>>>>>
> > > >>>>>>>>>> 4. Only with --execute option and --reset-file (path to
> JSON)
> > > >>>>>>>>>>
> > > >>>>>>>>>> Reset based on file
> > > >>>>>>>>>>
> > > > >>>>>>>>>> 5. Only with --verify option and --reset-file (path to JSON)
> > > >>>>>>>>>>
> > > >>>>>>>>>> Verify file values with current offsets
> > > >>>>>>>>>>
> > > > >>>>>>>>>> I think we can remove --generate-and-execute because it is a bit
> > > > >>>>>>>>>> clumsy.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> With these options we will be able to execute with a manual JSON
> > > > >>>>>>>>>> configuration.
> > > >>>>>>>>>>
> > > >>>>>>>>>>
> > > > >>>>>>>>>> On Wed, Feb 8, 2017 at 22:43, Ben Stopford (<ben@confluent.io>)
> > > > >>>>>>>>>> wrote:
> > > >>>>>>>>>>
> > > >>>>>>>>>> Yes - using a tool like this to skip a set of consumer
> groups
> > > >> over a
> > > >>>>>>>>>> corrupt/bad message is definitely appealing.
> > > >>>>>>>>>>
> > > >>>>>>>>>> B
> > > >>>>>>>>>>
> > > >>>>>>>>>> On Wed, Feb 8, 2017 at 9:37 PM Gwen Shapira
> > <gw...@confluent.io>
> > > >>>>>> wrote:
> > > >>>>>>>>>>
> > > >>>>>>>>>>> I like the --reset-to-earliest and --reset-to-latest. In
> > > general,
> > > >>>>>>>>>>> since the JSON route is the most challenging for users, we
> > want
> > > >> to
> > > >>>>>>>>>>> provide a lot of ways to do useful things without going
> > there.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> Two things that can help:
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> 1. A lot of times, users want to skip few messages that
> > cause
> > > >>>> issues
> > > >>>>>>>>>>> and continue. maybe just specifying the topic, partition
> and
> > > >> delta
> > > >>>>>>>>>>> will be better than having to find the offset and write a
> > JSON
> > > >> and
> > > >>>>>>>>>>> validate the JSON etc.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> 2. Thinking if there are other common use-cases that we
> can
> > > make
> > > >>>> easy
> > > >>>>>>>>>>> rather than just one generic but not very usable method.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> Gwen
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> On Wed, Feb 8, 2017 at 3:25 AM, Jorge Esteban Quilcate
> Otoya
> > > >>>>>>>>>>> <qu...@gmail.com> wrote:
> > > >>>>>>>>>>>> Thanks for the feedback!
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> @Onur, @Gwen:
> > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> Agree. Actually, in the first draft I considered having it inside
> > > > >>>>>>>>>>>> ´kafka-consumer-groups.sh´, but I decided to propose it as a standalone
> > > > >>>>>>>>>>>> tool to describe it clearly and keep it focused on reset functionality.
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> But now that you mention it, it does make sense to have it in
> > > > >>>>>>>>>>>> ´kafka-consumer-groups.sh´. What would be a consistent way to
> > > > >>>>>>>>>>>> introduce it?
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> Maybe something like this:
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --generate
> --group
> > > cg1
> > > >>>>>>>>>> --topics
> > > >>>>>>>>>>> t1
> > > >>>>>>>>>>>> --reset-from 2017-01-01T00:00:00.000 --output plan.json´
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --verify
> > > >>>> --reset-json-file
> > > >>>>>>>>>>>> plan.json´
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --execute
> > > >>>> --reset-json-file
> > > >>>>>>>>>>>> plan.json´
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset
> > > --generate-and-execute
> > > >>>>>>>> --group
> > > >>>>>>>>>>> cg1
> > > >>>>>>>>>>>> --topics t1 --reset-from 2017-01-01T00:00:00.000´
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> @Gwen:
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>> It looks exactly like the replica assignment tool
> > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> It was influenced by it ;-) I use the generate-verify-execute process
> > > > >>>>>>>>>>>> here to make sure the user is aware of the result of this operation. At
> > > > >>>>>>>>>>>> the beginning we considered only adding a couple of options to the
> > > > >>>>>>>>>>>> Consumer Group Command:
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> --rewind-to-timestamp and --rewind-to-period
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> @Onur:
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>> You can actually get away with overriding while members
> of
> > > the
> > > >>>>>> group
> > > >>>>>>>>>>> are live
> > > >>>>>>>>>>>> with method 2 by using group information from
> > > >>>> DescribeGroupsRequest.
> > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> Does this mean that the Consumer Group must be stopped before executing,
> > > > >>>>>>>>>>>> and that a new consumer is started internally to do this? If so, we
> > > > >>>>>>>>>>>> won't be able to execute a reset while the Consumer Group is active?
> > > > >>>>>>>>>>>> (I'm trying to relate this to @Dong's 5th question.)
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> @Dong:
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>> Should we allow user to use wildcard to reset offset of
> > all
> > > >>>> groups
> > > >>>>>>>>>> for a
> > > >>>>>>>>>>>> given topic as well?
> > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> I haven't thought about this scenario. It could be interesting.
> > > > >>>>>>>>>>>> Following the recommendation to add it to the Consumer Group Command,
> > > > >>>>>>>>>>>> in this case the Group argument would be optional if there is only one
> > > > >>>>>>>>>>>> topic. I think for multiple topics it won't be that useful.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>> Should we allow user to specify timestamp per topic
> > partition
> > > >> in
> > > >>>>>> the
> > > >>>>>>>>>>> json
> > > >>>>>>>>>>>> file as well?
> > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> I don't think this would be a valid option for the tool itself, but if
> > > > >>>>>>>>>>>> a Reset Plan is generated and the user wants to set the offset of a
> > > > >>>>>>>>>>>> specific partition to another offset (possibly based on another
> > > > >>>>>>>>>>>> timestamp) and execute it, that will be up to her/him.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>> Should the script take some credential file to make sure
> > that
> > > >>>> this
> > > >>>>>>>>>>>> operation is authenticated given the potential impact of
> > this
> > > >>>>>>>>>> operation?
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> Haven't tried to secure brokers yet, but the tool should
> > > support
> > > >>>>>>>>>>>> authorization if it's enabled in the broker.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>> Should we provide constant to reset committed offset to
> > > >>>>>>>>>> earliest/latest
> > > >>>>>>>>>>>> offset of a partition, e.g. -1 indicates earliest offset
> > and
> > > -2
> > > >>>>>>>>>> indicates
> > > >>>>>>>>>>>> latest offset.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> I will go for something like ´--reset-to-earliest´ and
> > > >>>>>>>>>>> ´--reset-to-latest´
> > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>>> Should we allow dynamic change of the committed offset when
> > > > >>>>>>>>>>>> consumers are running, such that a consumer will seek to the newly
> > > > >>>>>>>>>>>> committed offset and start consuming from there?
> > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> Not sure about this. I would recommend keeping it simple and asking
> > > > >>>>>>>>>>>> users to stop consumers first. But I would consider it if the
> > > > >>>>>>>>>>>> trade-offs are clear.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> @Matthias
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> Added :). And thanks a lot for your help to define this
> > KIP!
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> On Wed, Feb 8, 2017 at 7:47, Gwen Shapira (<gwen@confluent.io>)
> > > > >>>>>>>>>>>> wrote:
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>> As long as the CLI is a bit consistent? Like, not just
> > > adding 3
> > > >>>>>>>>>>>>> arguments and a JSON parser to the existing tool, right?
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> On Tue, Feb 7, 2017 at 10:29 PM, Onur Karaman
> > > >>>>>>>>>>>>> <on...@gmail.com> wrote:
> > > >>>>>>>>>>>>>> I think it makes sense to just add the feature to
> > > >>>>>>>>>>>>> kafka-consumer-groups.sh
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> On Tue, Feb 7, 2017 at 10:24 PM, Gwen Shapira <
> > > >>>> gwen@confluent.io>
> > > >>>>>>>>>>> wrote:
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> Thanks for the KIP. I'm super happy about adding the
> > > >>>> capability.
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> I hate the interface, though. It looks exactly like
> the
> > > >> replica
> > > >>>>>>>>>>>>>>> assignment tool. A tool everyone loves so much that
> > there
> > > are
> > > >>>>>>>>>>> multiple
> > > >>>>>>>>>>>>>>> projects, open and closed, that try to fix it.
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> Can we swap it with something that looks a bit more
> like
> > > the
> > > >>>>>>>>>> consumer
> > > >>>>>>>>>>>>>>> group tool? or the kafka streams reset tool?
> Consistency
> > is
> > > >>>>>> helpful
> > > >>>>>>>>>>> in
> > > >>>>>>>>>>>>>>> such cases. I spent some time learning existing tools
> > and
> > > >>>>>> learning
> > > >>>>>>>>>>> yet
> > > >>>>>>>>>>>>>>> another one is a deterrent.
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> Gwen
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> On Tue, Feb 7, 2017 at 6:43 PM, Jorge Esteban Quilcate
> > > Otoya
> > > >>>>>>>>>>>>>>> <qu...@gmail.com> wrote:
> > > >>>>>>>>>>>>>>>> Hi all,
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> I would like to propose a KIP to Add a tool to Reset
> > > >> Consumer
> > > >>>>>>>>>> Group
> > > >>>>>>>>>>>>>>> Offsets.
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > >>>>>>>>>>>>>>> 122%3A+Add+a+tool+to+Reset+Consumer+Group+Offsets
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> Please, take a look at the proposal and share your
> > > feedback.
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> Thanks,
> > > >>>>>>>>>>>>>>>> Jorge.
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> --
> > > >>>>>>>>>>>>>>> Gwen Shapira
> > > >>>>>>>>>>>>>>> Product Manager | Confluent
> > > >>>>>>>>>>>>>>> 650.450.2760 <(650)%20450-2760> <(650)%20450-2760>
> > <(650)%20450-2760>
> > > <(650)%20450-2760>
> > > >> <(650)%20450-2760>
> > > >>>> <(650)%20450-2760>
> > > >>>>>> <(650)%20450-2760>
> > > >>>>>>>> <(650)%20450-2760>
> > > >>>>>>>>>> <(650)%20450-2760> | @gwenshap
> > > >>>>>>>>>>>>>>> Follow us: Twitter | blog
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> --
> > > >>>>>>>>>>>>> Gwen Shapira
> > > >>>>>>>>>>>>> Product Manager | Confluent
> > > >>>>>>>>>>>>> 650.450.2760 <(650)%20450-2760> <(650)%20450-2760>
> > <(650)%20450-2760>
> > > <(650)%20450-2760>
> > > >> <(650)%20450-2760>
> > > >>>> <(650)%20450-2760>
> > > >>>>>> <(650)%20450-2760>
> > > >>>>>>>> <(650)%20450-2760>
> > > >>>>>>>>>> <(650)%20450-2760> | @gwenshap
> > > >>>>>>>>>>>>> Follow us: Twitter | blog
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> --
> > > >>>>>>>>>>> Gwen Shapira
> > > >>>>>>>>>>> Product Manager | Confluent
> > > >>>>>>>>>>> 650.450.2760 | @gwenshap
> > > >>>>>>>>>>> Follow us: Twitter | blog
> > > >>>>>>>>>>>
> > > >>>>>>>>>>
> > > >>>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>
> > > >>>>>>
> > > >>>>>>
> > > >>>>>
> > > >>>>
> > > >>>>
> > > >>>
> > > >>
> > > >>
> > > >
> > >
> > >
> >
> >
> >
> >
> >
>
>
>
>
>

Re: KIP-122: Add a tool to Reset Consumer Group Offsets

Posted by Vahid S Hashemian <va...@us.ibm.com>.
Thanks Jorge for addressing my question/suggestion.

One last thing: I noticed that in the example you have for the "plan" option
(
https://cwiki.apache.org/confluence/display/KAFKA/KIP-122%3A+Add+Reset+Consumer+Group+Offsets+tooling#KIP-122:AddResetConsumerGroupOffsetstooling-ExecutionOptions
)
under the "Description" column, you put 0 for lag. So I assume that is the
current lag being reported, and not the new lag. It might be helpful to
explicitly specify that (i.e. CURRENT-LAG) in the column header.
The other option is to report both current and new lags, but I understand
if we don't want to do that since it's rather redundant info.
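
As a rough illustration of the two values, the sketch below computes both a current and a new lag for one plan row. It assumes lag is the partition's log-end offset minus the group's committed offset; the `PlanRow` type and its field names are hypothetical and are not taken from the KIP.

```python
from dataclasses import dataclass


@dataclass
class PlanRow:
    """One hypothetical row of a reset plan for a single topic partition."""
    topic: str
    partition: int
    log_end_offset: int
    current_offset: int
    new_offset: int

    @property
    def current_lag(self) -> int:
        # Lag before the reset: records the group has not consumed yet.
        return self.log_end_offset - self.current_offset

    @property
    def new_lag(self) -> int:
        # Lag the group would have once the reset is applied.
        return self.log_end_offset - self.new_offset


# Example values are made up for illustration only.
row = PlanRow("t1", 0, log_end_offset=100, current_offset=100, new_offset=40)
print(row.topic, row.partition, row.current_lag, row.new_lag)
```

With a row like this, a header such as CURRENT-LAG makes it unambiguous which of the two numbers is being shown.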

Thanks again.
--Vahid



From:   Jorge Esteban Quilcate Otoya <qu...@gmail.com>
To:     dev@kafka.apache.org
Date:   02/24/2017 12:47 PM
Subject:        Re: KIP-122: Add a tool to Reset Consumer Group Offsets



Hi Vahid,

Thanks for your comments. Check my answers below:

On Fri, Feb 24, 2017 at 19:41, Vahid S Hashemian (<
vahidhashemian@us.ibm.com>) wrote:

> Hi Jorge,
>
> Thanks for the useful KIP.
>
> I have a question regarding the proposed "plan" option.
> The "current offset" and "lag" values of a topic partition are meaningful
> within a consumer group. In other words, different consumer groups could
> have different values for these properties of each topic partition.
> I don't see that reflected in the discussion around the "plan" option.
> Unless we are assuming a "--group" option is also provided by user (which
> is not clear from the KIP if that is the case).
>

I have added an additional comment to state that these options will require
a "group" argument.
The tool is intended to affect only one Consumer Group at a time.


>
> Also, I was wondering if you can provide at least one full command example
> for each of the "plan", "execute", and "export" options. They would
> definitely help in understanding some of the details.
>
>
Added to the KIP.


> Sorry for the delayed question/suggestion. I hope they make sense.
>
> Thanks.
> --Vahid
>
>
>
> From:   Jorge Esteban Quilcate Otoya <qu...@gmail.com>
> To:     dev@kafka.apache.org
> Date:   02/24/2017 09:51 AM
> Subject:        Re: KIP-122: Add a tool to Reset Consumer Group Offsets
>
>
>
> Great! KIP updated.
>
>
>
> On Fri, Feb 24, 2017 at 18:22, Matthias J. Sax
> (<ma...@confluent.io>)
> wrote:
>
> > I like this!
> >
> > --by-duration and --shift-by
> >
> >
> > -Matthias
> >
> > On 2/24/17 12:57 AM, Jorge Esteban Quilcate Otoya wrote:
> > > Renaming to --by-duration LGTM
> > >
> > > > Not sure about changing it to --shift-by-duration because we could end up
> > > > with the same redundancy as before with reset: --reset-offsets
> > > > --reset-to-*.
> > > >
> > > > Maybe changing --shift-offset-by to --shift-by 'n' could make it consistent
> > > > enough?
> > >
> > >
> > > On Fri, Feb 24, 2017 at 6:39, Matthias J. Sax (<
> > matthias@confluent.io>)
> > > wrote:
> > >
> > > >> I just read the updated KIP once more.
> > > >>
> > > >> I would suggest renaming --to-duration to --by-duration
> > > >>
> > > >> Or as a second idea, rename --to-duration to --shift-by-duration and at
> > > >> the same time rename --shift-offset-by to --shift-by-offset
> > > >>
> > > >> Not sure what the best option is, but naming would be more consistent IMHO.
> > >>
> > >>
> > >>
> > >> -Matthias
> > >>
> > >> On 2/23/17 4:42 PM, Jorge Esteban Quilcate Otoya wrote:
> > >>> Hi All,
> > >>>
> > >>> If there are no more concerns, I'd like to start vote for this 
KIP.
> > >>>
> > >>> Thanks!
> > >>> Jorge.
> > >>>
> > > >>> On Thu, Feb 23, 2017 at 22:50, Jorge Esteban Quilcate Otoya (<
> > > >>> quilcate.jorge@gmail.com>) wrote:
> > >>>
> > >>>> Oh ok :)
> > >>>>
> > >>>> So, we can keep `--topic t1:1,2,3`
> > >>>>
> > >>>> I think with this one we have most of the feedback applied. I 
will
> > >> update
> > >>>> the KIP with this change.
> > >>>>
> > > >>>> On Thu, Feb 23, 2017 at 22:38, Matthias J. Sax (<
> > > >> matthias@confluent.io>)
> > > >>>> wrote:
> > >>>>
> > >>>> Sounds reasonable.
> > >>>>
> > >>>> If we have multiple --topic arguments, it does also not matter if
> we
> > use
> > >>>> t1:1,2 or t2=1,2
> > >>>>
> > >>>> I just suggested '=' because I wanted use ':' to chain multiple
> > topics.
> > >>>>
> > >>>>
> > >>>> -Matthias
> > >>>>
> > >>>> On 2/23/17 10:49 AM, Jorge Esteban Quilcate Otoya wrote:
> > > >>>>> Yep, `--topic t1=1,2` LGTM
> > > >>>>>
> > > >>>>> I don't have an idea either for getting rid of the repeated --topic, but
> > > >>>>> --group is also repeated in the case of deletion, so it could be OK to have
> > > >>>>> repeated --topic arguments.
> > >>>>>
> > > >>>>> On Thu, Feb 23, 2017 at 19:14, Matthias J. Sax (<
> > > >>>> matthias@confluent.io>)
> > > >>>>> wrote:
> > >>>>>
> > >>>>>> So you suggest to merge "scope options" --topics, --topic, and
> > >>>>>> --partitions into a single option? Sound good to me.
> > >>>>>>
> > > >>>>>> I like the compact way to express it, i.e. topicname:list-of-partitions,
> > > >>>>>> with "all partitions" if no partitions are specified. It's quite
> > > >>>>>> intuitive to use.
> > > >>>>>>
> > > >>>>>> Just wondering if we could get rid of the repeated --topic option; it's
> > > >>>>>> somewhat verbose. I have no good idea how to improve it, though.
> > >>>>>>
> > >>>>>> If you concatenate multiple topic, we need one more character
> that
> > is
> > >>>>>> not allowed in topic names to separate the topics:
> > >>>>>>
> > >>>>>>> invalidChars = {'/', '\\', ',', '\u0000', ':', '"', '\'', ';',
> '*',
> > >>>>>> '?', ' ', '\t', '\r', '\n', '='};
> > >>>>>>
> > >>>>>> maybe
> > >>>>>>
> > >>>>>> --topics t1=1,2,3:t2:t3=3
> > >>>>>>
> > >>>>>> use '=' to specify partitions (instead of ':' as you proposed)
> and
> > ':'
> > >>>>>> to separate topics? All other characters seem to be worse to 
use
> to
> > >> me.
> > >>>>>> But maybe you have a better idea.
> > >>>>>>
> > >>>>>>
> > >>>>>>
> > >>>>>> -Matthias
> > >>>>>>
> > >>>>>>
> > >>>>>> On 2/23/17 3:15 AM, Jorge Esteban Quilcate Otoya wrote:
> > >>>>>>> @Matthias about the point 9:
> > >>>>>>>
> > >>>>>>> What about keeping only the --topic option, and support this
> > format:
> > >>>>>>>
> > >>>>>>> `--topic t1:0,1,2 --topic t2 --topic t3:2`
> > >>>>>>>
> > >>>>>>> In this case topics t1, t2, and t3 will be selected: topic t1
> with
> > >>>>>>> partitions 0,1 and 2; topic t2 with all its partitions; and
> topic
> > t3,
> > >>>>>> with
> > >>>>>>> only partition 2.
> > >>>>>>>
> > >>>>>>> Jorge.
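
The repeated `--topic` format discussed above can be sketched as a small parser. This is only an illustration of the `topic[:p1,p2,...]` syntax from the thread, not the tool's actual implementation; the function name `parse_topic_args` and the use of `None` to mean "all partitions" are assumptions made for this example. Splitting on the first ':' is safe because ':' is among the characters not allowed in topic names.

```python
from typing import Dict, List, Optional


def parse_topic_args(specs: List[str]) -> Dict[str, Optional[List[int]]]:
    """Map each topic to its listed partitions, or None meaning all partitions."""
    scope: Dict[str, Optional[List[int]]] = {}
    for spec in specs:
        # Everything before the first ':' is the topic name.
        topic, sep, partitions = spec.partition(":")
        if not topic:
            raise ValueError(f"missing topic name in {spec!r}")
        if not sep:
            # No partition list given: select every partition of the topic.
            scope[topic] = None
        else:
            # int() raises ValueError on empty or non-numeric partition ids.
            scope[topic] = [int(p) for p in partitions.split(",")]
    return scope


# Values as they would arrive from: --topic t1:0,1,2 --topic t2 --topic t3:2
selected = parse_topic_args(["t1:0,1,2", "t2", "t3:2"])
print(selected)
```

Keeping one topic per `--topic` argument means no extra separator character between topics is needed, which is the trade-off weighed in the replies above.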
> > >>>>>>>
> > > >>>>>>> On Tue, Feb 21, 2017 at 11:11, Jorge Esteban Quilcate Otoya (<
> > > >>>>>>> quilcate.jorge@gmail.com>) wrote:
> > >>>>>>>
> > >>>>>>>> Thanks for the feedback Matthias.
> > >>>>>>>>
> > >>>>>>>> * 1. You're right. I'll reorder the scenarios.
> > >>>>>>>>
> > >>>>>>>> * 2. Agree. I'll update the KIP.
> > >>>>>>>>
> > >>>>>>>> * 3. I like it, updating to `reset-offsets`
> > >>>>>>>>
> > >>>>>>>> * 4. Agree, removing the `reset-` part
> > >>>>>>>>
> > >>>>>>>> * 5. Yes, 1.e option without --execute or --export will print
> out
> > >>>>>> current
> > >>>>>>>> offset, and the new offset, that will be the same. The 
use-case
> of
> > >>>> this
> > >>>>>>>> option is to use it in combination with --export mostly and
> have a
> > >>>>>> current
> > >>>>>>>> 'checkpoint' to reset later. I will add to the KIP how the
> output
> > >>>> should
> > >>>>>>>> looks like.
> > >>>>>>>>
> > >>>>>>>> * 6. Considering 4., I will update it to `--to-offset`
> > >>>>>>>>
> > >>>>>>>> * 7. I like the idea to unify these options (plus, minus).
> > >>>>>>>> `shift-offsets-by` is a good option, but I will like some 
more
> > >>>> feedback
> > >>>>>>>> here about the name. I will update the KIP in the meantime.
> > >>>>>>>>
> > >>>>>>>> * 8. Yes, discussed in 9.
> > >>>>>>>>
> > >>>>>>>> * 9. Agree. I'll love some feedback here. `topic` is already
> used
> > by
> > >>>>>>>> `delete`, and we can add `--all-topics` to consider all
> > >>>>>> topics/partitions
> > >>>>>>>> assigned to a group. How could we define specific
> > topics/partitions?
> > >>>>>>>>
> > >>>>>>>> * 10. Haven't thought about it, but make sense.
> > >>>>>>>> <topic>,<partition>,<offset> would be enough.
> > >>>>>>>>
> > >>>>>>>> * 11. Agree. Solved with 10.
> > >>>>>>>>
> > >>>>>>>> Also, I have a couple of changes to mention:
> > >>>>>>>>
> > > >>>>>>>> 1. I have added a reference to the branch where I'm working on this KIP.
> > > >>>>>>>>
> > > >>>>>>>> 2. About the period scenario `--to-period`: I will change it to
> > > >>>>>>>> `--to-duration`, given that Duration
> > > >>>>>>>> (https://docs.oracle.com/javase/8/docs/api/java/time/Duration.html)
> > > >>>>>>>> follows the format 'PnDTnHnMnS' and does not consider daylight saving
> > > >>>>>>>> effects.
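
To illustrate the duration-based scenario, here is a minimal sketch that parses a simplified subset of the 'PnDTnHnMnS' format and derives the timestamp to reset to. It assumes whole-number components only and treats a day as exactly 24 hours, in line with the remark about daylight saving; the helper names `parse_duration` and `reset_target` are hypothetical, and the sketch covers less than Java's `Duration.parse` (no signs, no fractional seconds).

```python
import re
from datetime import datetime, timedelta, timezone

# Simplified subset of the 'PnDTnHnMnS' format: whole numbers only.
_DURATION = re.compile(
    r"^P(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?)?$"
)


def parse_duration(text: str) -> timedelta:
    """Parse a duration such as 'P2D' or 'P1DT2H30M' into a timedelta."""
    match = _DURATION.match(text)
    # Reject strings with no components at all, or a dangling 'T'.
    if match is None or not any(match.groupdict().values()) or text.endswith("T"):
        raise ValueError(f"unsupported duration: {text!r}")
    parts = {name: int(value) for name, value in match.groupdict().items() if value}
    return timedelta(**parts)


def reset_target(now: datetime, duration: str) -> datetime:
    """Timestamp to look offsets up by: the current time minus the duration."""
    return now - parse_duration(duration)


now = datetime(2017, 2, 21, tzinfo=timezone.utc)
print(reset_target(now, "P2D").isoformat())
```

Because the arithmetic is done on a fixed-length timedelta in UTC, the result does not shift when a daylight saving transition falls inside the interval.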
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > > >>>>>>>> On Tue, Feb 21, 2017 at 2:47, Matthias J. Sax (<
> > > >>>>>> matthias@confluent.io>)
> > > >>>>>>>> wrote:
> > >>>>>>>>
> > >>>>>>>> Hi,
> > >>>>>>>>
> > >>>>>>>> thanks for updating the KIP. Couple of follow up comments:
> > >>>>>>>>
> > >>>>>>>> * Nit: Why is "Reset to Earliest" and "Reset to Latest" a
> "reset
> > by
> > >>>>>>>> time" option -- IMHO it belongs to "reset by position"?
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> * Nit: Description of "Reset to Earliest"
> > >>>>>>>>
> > >>>>>>>>> using Kafka Consumer's `auto.offset.reset` to `earliest`
> > >>>>>>>>
> > > >>>>>>>> I think this is, strictly speaking, not correct (auto.offset.reset is only
> > > >>>>>>>> triggered if no valid offset is found, but this tool explicitly modifies
> > > >>>>>>>> the committed offset), and should be phrased as
> > >>>>>>>>
> > >>>>>>>>> using Kafka Consumer's #seekToBeginning()
> > >>>>>>>>
> > >>>>>>>> -> similar issue for description of "Reset to Latest"
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> * Main option: rename to --reset-offsets (plural instead of
> > >> singular)
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> * Scenario Options: I would remove "reset" from all options,
> > because
> > >>>> the
> > >>>>>>>> main argument "--reset-offset" says already what to do:
> > >>>>>>>>
> > >>>>>>>>> bin/kafka-consumer-groups.sh --reset-offset
> --reset-to-datetime
> > XXX
> > >>>>>>>>
> > >>>>>>>> better (IMHO):
> > >>>>>>>>
> > >>>>>>>>> bin/kafka-consumer-groups.sh --reset-offsets --to-datetime 
XXX
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> * Option 1.e ("print and export current offset") is not
> intuitive
> > to
> > >>>> use
> > >>>>>>>> IMHO. The main option is "--reset-offset" but nothing happens
> if
> > no
> > >>>>>>>> scenario is specified. It is also not specified, what the
> output
> > >>>> should
> > >>>>>>>> look like?
> > >>>>>>>>
> > >>>>>>>> Furthermore, --describe should actually show currently
> committed
> > >>>> offset
> > >>>>>>>> for a group. So it seems to be redundant to have the same
> option
> > in
> > >>>>>>>> --reset-offsets
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> * Option 2.a: I would rename to "--reset-to-offset" (or
> > considering
> > >>>> the
> > >>>>>>>> comment above to "--to-offset")
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> * Option 2.b and 2.c: I would unify to "--shift-offsets-by" 
(or
> > >>>> similar)
> > >>>>>>>> and accept positive/negative values
> > >>>>>>>>
> > >>>>>>>>
> > > >>>>>>>> * About Scope "all": maybe it's better to have an option "--all-topics"
> > > >>>>>>>> (or similar). IMHO explicit arguments are preferable over implicit
> > > >>>>>>>> settings, to guard against accidental misuse of the tool.
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> * Scope: I also think, that "--topic" (singular) and 
"--topics"
> > >>>> (plural)
> > >>>>>>>> are too similar and easy to use in a wrong way (ie, mix up) 
--
> > maybe
> > >>>> we
> > >>>>>>>> can have two options that are easier to distinguish.
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> * I still think that JSON is not the best format (it's too
> > >>>> verbose/hard
> > >>>>>>>> to write for humans from scratch). A simple CSV format with
> > implicit
> > >>>>>>>> schema (topic,partition,offset) would be sufficient.
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> * Why does the JSON contain "group_id" field -- there is
> parameter
> > >>>>>>>> "--group" to specify the group ID. Would one overwrite the
> other
> > >> (what
> > >>>>>>>> order) or would there be an error if "--group" is used in
> > >> combination
> > >>>>>>>> with "--reset-from-file"?
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> -Matthias
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> On 2/17/17 6:43 AM, Jorge Esteban Quilcate Otoya wrote:
> > >>>>>>>>> Hi,
> > >>>>>>>>>
> > >>>>>>>>> according to the feedback, I've updated the KIP:
> > >>>>>>>>>
> > >>>>>>>>> - We have added and ordered the scenarios, scopes and
> executions
> > of
> > >>>> the
> > >>>>>>>>> Reset Offset tool.
> > >>>>>>>>> - Consider it as an extension to the current
> > `ConsumerGroupCommand`
> > >>>>>> tool
> > >>>>>>>>> - Execution will be possible without generating JSON files.
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>
> > >>>>>>
> > >>>>
> > >>
> >
>
> 
https://cwiki.apache.org/confluence/display/KAFKA/KIP-122%3A+Add+Reset+Consumer+Group+Offsets+tooling

>
> > >>>>>>>>>
> > >>>>>>>>> Looking forward to your feedback!
> > >>>>>>>>>
> > >>>>>>>>> Jorge.
> > >>>>>>>>>
> > > >>>>>>>>> On Wed, Feb 8, 2017 at 23:23, Jorge Esteban Quilcate Otoya (<
> > > >>>>>>>>> quilcate.jorge@gmail.com>) wrote:
> > >>>>>>>>>
> > >>>>>>>>>> Great. I think I got the idea. What about this options:
> > >>>>>>>>>>
> > >>>>>>>>>> Scenarios:
> > >>>>>>>>>>
> > >>>>>>>>>> 1. Current status
> > >>>>>>>>>>
> > >>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1´
> > >>>>>>>>>>
> > >>>>>>>>>> 2. To Datetime
> > >>>>>>>>>>
> > >>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1
> > >>>>>> --reset-to-datetime
> > >>>>>>>>>> 2017-01-01T00:00:00.000´
> > >>>>>>>>>>
> > >>>>>>>>>> 3. To Period
> > >>>>>>>>>>
> > >>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1
> > >>>> --reset-to-period
> > >>>>>>>> P2D´
> > >>>>>>>>>>
> > >>>>>>>>>> 4. To Earliest
> > >>>>>>>>>>
> > >>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1
> > >>>>>>>> --reset-to-earliest´
> > >>>>>>>>>>
> > >>>>>>>>>> 5. To Latest
> > >>>>>>>>>>
> > >>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1
> > >>>>>> --reset-to-latest´
> > >>>>>>>>>>
> > >>>>>>>>>> 6. Minus 'n' offsets
> > >>>>>>>>>>
> > >>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1
> > --reset-minus
> > >>>> n´
> > >>>>>>>>>>
> > >>>>>>>>>> 7. Plus 'n' offsets
> > >>>>>>>>>>
> > >>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1
> > --reset-plus
> > >> n´
> > >>>>>>>>>>
> > >>>>>>>>>> 8. To specific offset
> > >>>>>>>>>>
> > >>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1
> --reset-to
> > x´
> > >>>>>>>>>>
> > >>>>>>>>>> Scopes:
> > >>>>>>>>>>
> > >>>>>>>>>> a. All topics used by Consumer Group
> > >>>>>>>>>>
> > >>>>>>>>>> Don't specify --topics
> > >>>>>>>>>>
> > >>>>>>>>>> b. Specific List of Topics
> > >>>>>>>>>>
> > >>>>>>>>>> Add list of values in --topics t1,t2,tn
> > >>>>>>>>>>
> > >>>>>>>>>> c. One Topic, all Partitions
> > >>>>>>>>>>
> > >>>>>>>>>> Add one topic and no partitions values: --topic t1
> > >>>>>>>>>>
> > >>>>>>>>>> d. One Topic, List of Partitions
> > >>>>>>>>>>
> > >>>>>>>>>> Add one topic and partitions values: --topic t1 
--partitions
> > 0,1,2
> > >>>>>>>>>>
> > >>>>>>>>>> About Reset Plan (JSON file):
> > >>>>>>>>>>
> > >>>>>>>>>> I think is still valid to have the option to persist reset
> > >>>>>> configuration
> > >>>>>>>>>> as a file, but I agree to give the option to run the tool
> > without
> > >>>>>> going
> > >>>>>>>>>> down to the JSON file.
> > >>>>>>>>>>
> > >>>>>>>>>> Execution options:
> > >>>>>>>>>>
> > >>>>>>>>>> 1. Without execution argument (No args):
> > >>>>>>>>>>
> > >>>>>>>>>> Print out results (reset plan)
> > >>>>>>>>>>
> > >>>>>>>>>> 2. With --execute argument:
> > >>>>>>>>>>
> > >>>>>>>>>> Run reset process
> > >>>>>>>>>>
> > >>>>>>>>>> 3. With --output argument:
> > >>>>>>>>>>
> > >>>>>>>>>> Save result in a JSON format.
> > >>>>>>>>>>
> > >>>>>>>>>> 4. Only with --execute option and --reset-file (path to 
JSON)
> > >>>>>>>>>>
> > >>>>>>>>>> Reset based on file
> > >>>>>>>>>>
> > > >>>>>>>>>> 5. Only with --verify option and --reset-file (path to JSON)
> > >>>>>>>>>>
> > >>>>>>>>>> Verify file values with current offsets
> > >>>>>>>>>>
> > >>>>>>>>>> I think we can remove --generate-and-execute because is a 
bit
> > >>>> clumsy.
> > >>>>>>>>>>
> > >>>>>>>>>> With this options we will be able to execute with manual 
JSON
> > >>>>>>>>>> configuration.
> > >>>>>>>>>>
> > >>>>>>>>>>
> > > >>>>>>>>>> On Wed, Feb 8, 2017 at 22:43, Ben Stopford (<ben@confluent.io>)
> > > >>>>>>>>>> wrote:
> > >>>>>>>>>>
> > >>>>>>>>>> Yes - using a tool like this to skip a set of consumer 
groups
> > >> over a
> > >>>>>>>>>> corrupt/bad message is definitely appealing.
> > >>>>>>>>>>
> > >>>>>>>>>> B
> > >>>>>>>>>>
> > >>>>>>>>>> On Wed, Feb 8, 2017 at 9:37 PM Gwen Shapira
> <gw...@confluent.io>
> > >>>>>> wrote:
> > >>>>>>>>>>
> > >>>>>>>>>>> I like the --reset-to-earliest and --reset-to-latest. In
> > general,
> > >>>>>>>>>>> since the JSON route is the most challenging for users, we
> want
> > >> to
> > >>>>>>>>>>> provide a lot of ways to do useful things without going
> there.
> > >>>>>>>>>>>
> > >>>>>>>>>>> Two things that can help:
> > >>>>>>>>>>>
> > >>>>>>>>>>> 1. A lot of times, users want to skip few messages that
> cause
> > >>>> issues
> > >>>>>>>>>>> and continue. maybe just specifying the topic, partition 
and
> > >> delta
> > >>>>>>>>>>> will be better than having to find the offset and write a
> JSON
> > >> and
> > >>>>>>>>>>> validate the JSON etc.
> > >>>>>>>>>>>
> > >>>>>>>>>>> 2. Thinking if there are other common use-cases that we 
can
> > make
> > >>>> easy
> > >>>>>>>>>>> rather than just one generic but not very usable method.
> > >>>>>>>>>>>
> > >>>>>>>>>>> Gwen
> > >>>>>>>>>>>
> > >>>>>>>>>>> On Wed, Feb 8, 2017 at 3:25 AM, Jorge Esteban Quilcate 
Otoya
> > >>>>>>>>>>> <qu...@gmail.com> wrote:
> > >>>>>>>>>>>> Thanks for the feedback!
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> @Onur, @Gwen:
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Agree. Actually at the first draft I considered to have 
it
> > >> inside
> > >>>>>>>>>>>> ´kafka-consumer-groups.sh´, but I decide to propose it as 
a
> > >>>>>> standalone
> > >>>>>>>>>>> tool
> > >>>>>>>>>>>> to describe it clearly and focus it on reset 
functionality.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> But now that you mentioned, it does make sense to have it
> in
> > >>>>>>>>>>>> ´kafka-consumer-groups.sh´. How would be a consistent way
> to
> > >>>>>> introduce
> > >>>>>>>>>>> it?
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Maybe something like this:
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --generate 
--group
> > cg1
> > >>>>>>>>>> --topics
> > >>>>>>>>>>> t1
> > >>>>>>>>>>>> --reset-from 2017-01-01T00:00:00.000 --output plan.json´
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --verify
> > >>>> --reset-json-file
> > >>>>>>>>>>>> plan.json´
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --execute
> > >>>> --reset-json-file
> > >>>>>>>>>>>> plan.json´
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset
> > --generate-and-execute
> > >>>>>>>> --group
> > >>>>>>>>>>> cg1
> > >>>>>>>>>>>> --topics t1 --reset-from 2017-01-01T00:00:00.000´
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> @Gwen:
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>> It looks exactly like the replica assignment tool
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> It was influenced by ;-) I use the 
generate-verify-execute
> > >> process
> > >>>>>>>> here
> > >>>>>>>>>>> to
> > >>>>>>>>>>>> make sure user will be aware of the result of this
> operation.
> > At
> > >>>> the
> > >>>>>>>>>>>> beginning we considered only add a couple of options to
> > Consumer
> > >>>>>> Group
> > >>>>>>>>>>>> Command:
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> --rewind-to-timestamp and --rewind-to-period
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> @Onur:
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>> You can actually get away with overriding while members 
of
> > the
> > >>>>>> group
> > >>>>>>>>>>> are live
> > >>>>>>>>>>>> with method 2 by using group information from
> > >>>> DescribeGroupsRequest.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> This means that we need to have Consumer Group stopped
> before
> > >>>>>>>> executing
> > >>>>>>>>>>> and
> > >>>>>>>>>>>> start a new consumer internally to do this? Therefore, we
> > won't
> > >> be
> > >>>>>>>> able
> > >>>>>>>>>>> to
> > >>>>>>>>>>>> consider executing reset when ConsumerGroup is active?
> (trying
> > >> to
> > >>>>>>>>>> relate
> > >>>>>>>>>>> it
> > >>>>>>>>>>>> with @Dong 5th question)
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> @Dong:
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>> Should we allow user to use wildcard to reset offset of
> all
> > >>>> groups
> > >>>>>>>>>> for a
> > >>>>>>>>>>>> given topic as well?
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> I haven't thought about this scenario. Could be
> interesting.
> > >>>>>> Following
> > >>>>>>>>>>> the
> > >>>>>>>>>>>> recommendation to add it into Consumer Group Command, in
> this
> > >> case
> > >>>>>>>>>> Group
> > >>>>>>>>>>>> argument will be optional if there are only 1 topic. I
> think
> > for
> > >>>>>>>>>> multiple
> > >>>>>>>>>>>> topic won't be that useful.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>> Should we allow user to specify timestamp per topic
> partition
> > >> in
> > >>>>>> the
> > >>>>>>>>>>> json
> > >>>>>>>>>>>> file as well?
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Don't think this could be a valid from the tool, but if
> Reset
> > >> Plan
> > >>>>>> is
> > >>>>>>>>>>>> generated, and user want to set the offset for a specific
> > >>>> partition
> > >>>>>> to
> > >>>>>>>>>>>> other offset (eventually based on another timestamp), and
> > >> execute
> > >>>>>> it,
> > >>>>>>>>>> it
> > >>>>>>>>>>>> will be up to her/him.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>> Should the script take some credential file to make sure
> that
> > >>>> this
> > >>>>>>>>>>>> operation is authenticated given the potential impact of
> this
> > >>>>>>>>>> operation?
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Haven't tried to secure brokers yet, but the tool should
> > support
> > >>>>>>>>>>>> authorization if it's enabled in the broker.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>> Should we provide constant to reset committed offset to
> > >>>>>>>>>> earliest/latest
> > >>>>>>>>>>>> offset of a partition, e.g. -1 indicates earliest offset
> and
> > -2
> > >>>>>>>>>> indicates
> > >>>>>>>>>>>> latest offset.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> I will go for something like ´--reset-to-earliest´ and
> > >>>>>>>>>>> ´--reset-to-latest´
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>> Should we allow dynamic change of the comitted offset 
when
> > >>>> consumer
> > >>>>>>>>>> are
> > >>>>>>>>>>>> running, such that consumer will seek to the newly
> committed
> > >>>> offset
> > >>>>>>>> and
> > >>>>>>>>>>>> start consuming from there?
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Not sure about this. I will recommend to keep it simple 
and
> > ask
> > >>>> user
> > >>>>>>>> to
> > >>>>>>>>>>>> stop consumers first. But I would considered it if the
> > >> trade-offs
> > >>>>>> are
> > >>>>>>>>>>>> clear.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> @Matthias
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Added :). And thanks a lot for your help to define this
> KIP!
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>
> > > >>>>>>>>>>>> On Wed, Feb 8, 2017 at 7:47, Gwen Shapira (<gwen@confluent.io>)
> > > >>>>>>>>>>>> wrote:
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>> As long as the CLI is a bit consistent? Like, not just
> > adding 3
> > >>>>>>>>>>>>> arguments and a JSON parser to the existing tool, right?
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> On Tue, Feb 7, 2017 at 10:29 PM, Onur Karaman
> > >>>>>>>>>>>>> <on...@gmail.com> wrote:
> > >>>>>>>>>>>>>> I think it makes sense to just add the feature to
> > >>>>>>>>>>>>> kafka-consumer-groups.sh
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> On Tue, Feb 7, 2017 at 10:24 PM, Gwen Shapira <
> > >>>> gwen@confluent.io>
> > >>>>>>>>>>> wrote:
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> Thanks for the KIP. I'm super happy about adding the
> > >>>> capability.
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> I hate the interface, though. It looks exactly like 
the
> > >> replica
> > >>>>>>>>>>>>>>> assignment tool. A tool everyone loves so much that
> there
> > are
> > >>>>>>>>>>> multiple
> > >>>>>>>>>>>>>>> projects, open and closed, that try to fix it.
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> Can we swap it with something that looks a bit more 
like
> > the
> > >>>>>>>>>> consumer
> > >>>>>>>>>>>>>>> group tool? or the kafka streams reset tool? 
Consistency
> is
> > >>>>>> helpful
> > >>>>>>>>>>> in
> > >>>>>>>>>>>>>>> such cases. I spent some time learning existing tools
> and
> > >>>>>> learning
> > >>>>>>>>>>> yet
> > >>>>>>>>>>>>>>> another one is a deterrent.
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> Gwen
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> On Tue, Feb 7, 2017 at 6:43 PM, Jorge Esteban Quilcate
> > Otoya
> > >>>>>>>>>>>>>>> <qu...@gmail.com> wrote:
> > >>>>>>>>>>>>>>>> Hi all,
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> I would like to propose a KIP to Add a tool to Reset
> > >> Consumer
> > >>>>>>>>>> Group
> > >>>>>>>>>>>>>>> Offsets.
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> 
https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > >>>>>>>>>>>>>>> 122%3A+Add+a+tool+to+Reset+Consumer+Group+Offsets
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> Please, take a look at the proposal and share your
> > feedback.
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> Thanks,
> > >>>>>>>>>>>>>>>> Jorge.
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> --
> > >>>>>>>>>>>>>>> Gwen Shapira
> > >>>>>>>>>>>>>>> Product Manager | Confluent
> > >>>>>>>>>>>>>>> 650.450.2760 <(650)%20450-2760> <(650)%20450-2760>
> <(650)%20450-2760>
> > <(650)%20450-2760>
> > >> <(650)%20450-2760>
> > >>>> <(650)%20450-2760>
> > >>>>>> <(650)%20450-2760>
> > >>>>>>>> <(650)%20450-2760>
> > >>>>>>>>>> <(650)%20450-2760> | @gwenshap
> > >>>>>>>>>>>>>>> Follow us: Twitter | blog
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> --
> > >>>>>>>>>>>>> Gwen Shapira
> > >>>>>>>>>>>>> Product Manager | Confluent
> > >>>>>>>>>>>>> 650.450.2760 <(650)%20450-2760> <(650)%20450-2760>
> <(650)%20450-2760>
> > <(650)%20450-2760>
> > >> <(650)%20450-2760>
> > >>>> <(650)%20450-2760>
> > >>>>>> <(650)%20450-2760>
> > >>>>>>>> <(650)%20450-2760>
> > >>>>>>>>>> <(650)%20450-2760> | @gwenshap
> > >>>>>>>>>>>>> Follow us: Twitter | blog
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>> --
> > >>>>>>>>>>> Gwen Shapira
> > >>>>>>>>>>> Product Manager | Confluent
> > >>>>>>>>>>> 650.450.2760 <(650)%20450-2760> <(650)%20450-2760>
> <(650)%20450-2760>
> > <(650)%20450-2760>
> > >> <(650)%20450-2760>
> > >>>> <(650)%20450-2760>
> > >>>>>> <(650)%20450-2760> <(650)%20450-2760>
> > >>>>>>>> | @gwenshap
> > >>>>>>>>>>> Follow us: Twitter | blog
> > >>>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>
> > >>>>>>
> > >>>>>>
> > >>>>>
> > >>>>
> > >>>>
> > >>>
> > >>
> > >>
> > >
> >
> >
>
>
>
>
>





Re: KIP-122: Add a tool to Reset Consumer Group Offsets

Posted by Jorge Esteban Quilcate Otoya <qu...@gmail.com>.
Hi Vahid,

Thanks for your comments. Check my answers below:

On Fri, Feb 24, 2017 at 19:41, Vahid S Hashemian (<
vahidhashemian@us.ibm.com>) wrote:

> Hi Jorge,
>
> Thanks for the useful KIP.
>
> I have a question regarding the proposed "plan" option.
> The "current offset" and "lag" values of a topic partition are meaningful
> within a consumer group. In other words, different consumer groups could
> have different values for these properties of each topic partition.
> I don't see that reflected in the discussion around the "plan" option.
> Unless we are assuming a "--group" option is also provided by user (which
> is not clear from the KIP if that is the case).
>

I have added an additional comment to state that these options will require
a "group" argument.
The tool is intended to affect only one Consumer Group at a time.


>
> Also, I was wondering if you can provide at least one full command example
> for each of the "plan", "execute", and "export" options. They would
> definitely help in understanding some of the details.
>
>
Added to the KIP.


> Sorry for the delayed question/suggestion. I hope they make sense.
>
> Thanks.
> --Vahid
>
>
>
> From:   Jorge Esteban Quilcate Otoya <qu...@gmail.com>
> To:     dev@kafka.apache.org
> Date:   02/24/2017 09:51 AM
> Subject:        Re: KIP-122: Add a tool to Reset Consumer Group Offsets
>
>
>
> Great! KIP updated.
>
>
>
> On Fri, Feb 24, 2017 at 18:22, Matthias J. Sax
> (<ma...@confluent.io>)
> wrote:
>
> > I like this!
> >
> > --by-duration and --shift-by
> >
> >
> > -Matthias
> >
> > On 2/24/17 12:57 AM, Jorge Esteban Quilcate Otoya wrote:
> > > Renaming to --by-duration LGTM
> > >
> > > > Not sure about changing it to --shift-by-duration because we could end up
> > > > with the same redundancy as before with reset: --reset-offsets
> > > > --reset-to-*.
> > > >
> > > > Maybe changing --shift-offset-by to --shift-by 'n' could make it consistent
> > > > enough?
> > >
> > >
> > > On Fri, Feb 24, 2017 at 6:39, Matthias J. Sax (<
> > matthias@confluent.io>)
> > > wrote:
> > >
> > > >> I just read the updated KIP once more.
> > > >>
> > > >> I would suggest renaming --to-duration to --by-duration
> > > >>
> > > >> Or as a second idea, rename --to-duration to --shift-by-duration and at
> > > >> the same time rename --shift-offset-by to --shift-by-offset
> > > >>
> > > >> Not sure what the best option is, but naming would be more consistent IMHO.
> > >>
> > >>
> > >>
> > >> -Matthias
> > >>
> > >> On 2/23/17 4:42 PM, Jorge Esteban Quilcate Otoya wrote:
> > >>> Hi All,
> > >>>
> > >>> If there are no more concerns, I'd like to start vote for this KIP.
> > >>>
> > >>> Thanks!
> > >>> Jorge.
> > >>>
> > > >>> On Thu, Feb 23, 2017 at 22:50, Jorge Esteban Quilcate Otoya (<
> > > >>> quilcate.jorge@gmail.com>) wrote:
> > >>>
> > >>>> Oh ok :)
> > >>>>
> > >>>> So, we can keep `--topic t1:1,2,3`
> > >>>>
> > >>>> I think with this one we have most of the feedback applied. I will
> > >> update
> > >>>> the KIP with this change.
> > >>>>
> > > >>>> On Thu, Feb 23, 2017 at 22:38, Matthias J. Sax (<
> > > >> matthias@confluent.io>)
> > > >>>> wrote:
> > >>>>
> > >>>> Sounds reasonable.
> > >>>>
> > >>>> If we have multiple --topic arguments, it does also not matter if
> we
> > use
> > >>>> t1:1,2 or t2=1,2
> > >>>>
> > >>>> I just suggested '=' because I wanted use ':' to chain multiple
> > topics.
> > >>>>
> > >>>>
> > >>>> -Matthias
> > >>>>
> > >>>> On 2/23/17 10:49 AM, Jorge Esteban Quilcate Otoya wrote:
> > > >>>>> Yep, `--topic t1=1,2` LGTM
> > > >>>>>
> > > >>>>> I don't have an idea either for getting rid of the repeated --topic, but
> > > >>>>> --group is also repeated in the case of deletion, so it could be OK to have
> > > >>>>> repeated --topic arguments.
> > >>>>>
> > >>>>> El jue., 23 feb. 2017 a las 19:14, Matthias J. Sax (<
> > >>>> matthias@confluent.io>)
> > >>>>> escribió:
> > >>>>>
> > >>>>>> So you suggest to merge "scope options" --topics, --topic, and
> > >>>>>> --partitions into a single option? Sounds good to me.
> > >>>>>>
> > >>>>>> I like the compact way to express it, i.e., topicname:list-of-partitions
> > >>>>>> with "all partitions" if no partitions are specified. It's quite
> > >>>>>> intuitive to use.
> > >>>>>>
> > >>>>>> Just wondering if we could get rid of the repeated --topic option; it's
> > >>>>>> somewhat verbose. I have no good idea how to improve it, though.
> > >>>>>>
> > >>>>>> If you concatenate multiple topics, we need one more character that is
> > >>>>>> not allowed in topic names to separate the topics:
> > >>>>>>
> > >>>>>>> invalidChars = {'/', '\\', ',', '\u0000', ':', '"', '\'', ';', '*',
> > >>>>>>> '?', ' ', '\t', '\r', '\n', '='};
> > >>>>>>
> > >>>>>> maybe
> > >>>>>>
> > >>>>>> --topics t1=1,2,3:t2:t3=3
> > >>>>>>
> > >>>>>> use '=' to specify partitions (instead of ':' as you proposed) and ':'
> > >>>>>> to separate topics? All other characters seem worse to me.
> > >>>>>> But maybe you have a better idea.
> > >>>>>>
> > >>>>>>
> > >>>>>>
> > >>>>>> -Matthias
> > >>>>>>
> > >>>>>>
> > >>>>>> On 2/23/17 3:15 AM, Jorge Esteban Quilcate Otoya wrote:
> > >>>>>>> @Matthias about the point 9:
> > >>>>>>>
> > >>>>>>> What about keeping only the --topic option, and supporting this format:
> > >>>>>>>
> > >>>>>>> `--topic t1:0,1,2 --topic t2 --topic t3:2`
> > >>>>>>>
> > >>>>>>> In this case topics t1, t2, and t3 will be selected: topic t1 with
> > >>>>>>> partitions 0, 1, and 2; topic t2 with all its partitions; and topic t3
> > >>>>>>> with only partition 2.
> > >>>>>>>
> > >>>>>>> Jorge.
> > >>>>>>>
> > >>>>>>> On Tue, Feb 21, 2017 at 11:11, Jorge Esteban Quilcate Otoya
> > >>>>>>> (<quilcate.jorge@gmail.com>) wrote:
> > >>>>>>>
> > >>>>>>>> Thanks for the feedback, Matthias.
> > >>>>>>>>
> > >>>>>>>> * 1. You're right. I'll reorder the scenarios.
> > >>>>>>>>
> > >>>>>>>> * 2. Agree. I'll update the KIP.
> > >>>>>>>>
> > >>>>>>>> * 3. I like it, updating to `reset-offsets`
> > >>>>>>>>
> > >>>>>>>> * 4. Agree, removing the `reset-` part
> > >>>>>>>>
> > >>>>>>>> * 5. Yes, the 1.e option without --execute or --export will print out the
> > >>>>>>>> current offset and the new offset, which will be the same. The use case of
> > >>>>>>>> this option is mostly to use it in combination with --export and have a
> > >>>>>>>> current 'checkpoint' to reset to later. I will add to the KIP what the
> > >>>>>>>> output should look like.
> > >>>>>>>>
> > >>>>>>>> * 6. Considering 4., I will update it to `--to-offset`
> > >>>>>>>>
> > >>>>>>>> * 7. I like the idea to unify these options (plus, minus).
> > >>>>>>>> `shift-offsets-by` is a good option, but I would like some more feedback
> > >>>>>>>> here about the name. I will update the KIP in the meantime.
> > >>>>>>>>
> > >>>>>>>> * 8. Yes, discussed in 9.
> > >>>>>>>>
> > >>>>>>>> * 9. Agree. I'd love some feedback here. `topic` is already used by
> > >>>>>>>> `delete`, and we can add `--all-topics` to consider all topics/partitions
> > >>>>>>>> assigned to a group. How could we define specific topics/partitions?
> > >>>>>>>>
> > >>>>>>>> * 10. Haven't thought about it, but it makes sense.
> > >>>>>>>> <topic>,<partition>,<offset> would be enough.
> > >>>>>>>>
> > >>>>>>>> * 11. Agree. Solved with 10.
> > >>>>>>>>
> > >>>>>>>> Also, I have a couple of changes to mention:
> > >>>>>>>>
> > >>>>>>>> 1. I have added a reference to the branch where I'm working on this KIP.
> > >>>>>>>>
> > >>>>>>>> 2. About the period scenario `--to-period`. I will change it to
> > >>>>>>>> `--to-duration` given that duration (
> > >>>>>>>> https://docs.oracle.com/javase/8/docs/api/java/time/Duration.html )
> > >>>>>>>> follows this format: 'PnDTnHnMnS' and does not consider daylight saving
> > >>>>>>>> effects.
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> On Tue, Feb 21, 2017 at 2:47, Matthias J. Sax (<matthias@confluent.io>)
> > >>>>>>>> wrote:
> > >>>>>>>>
> > >>>>>>>> Hi,
> > >>>>>>>>
> > >>>>>>>> thanks for updating the KIP. Couple of follow up comments:
> > >>>>>>>>
> > >>>>>>>> * Nit: Why are "Reset to Earliest" and "Reset to Latest" "reset by
> > >>>>>>>> time" options -- IMHO they belong to "reset by position"?
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> * Nit: Description of "Reset to Earliest"
> > >>>>>>>>
> > >>>>>>>>> using Kafka Consumer's `auto.offset.reset` to `earliest`
> > >>>>>>>>
> > >>>>>>>> I think this is strictly speaking not correct (as auto.offset.reset is
> > >>>>>>>> only triggered if no valid offset is found, but this tool explicitly
> > >>>>>>>> modifies the committed offset), and should be phrased as
> > >>>>>>>>
> > >>>>>>>>> using Kafka Consumer's #seekToBeginning()
> > >>>>>>>>
> > >>>>>>>> -> similar issue for description of "Reset to Latest"
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> * Main option: rename to --reset-offsets (plural instead of singular)
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> * Scenario Options: I would remove "reset" from all options, because the
> > >>>>>>>> main argument "--reset-offset" already says what to do:
> > >>>>>>>>
> > >>>>>>>>> bin/kafka-consumer-groups.sh --reset-offset --reset-to-datetime XXX
> > >>>>>>>>
> > >>>>>>>> better (IMHO):
> > >>>>>>>>
> > >>>>>>>>> bin/kafka-consumer-groups.sh --reset-offsets --to-datetime XXX
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> * Option 1.e ("print and export current offset") is not intuitive to use
> > >>>>>>>> IMHO. The main option is "--reset-offset" but nothing happens if no
> > >>>>>>>> scenario is specified. It is also not specified what the output should
> > >>>>>>>> look like.
> > >>>>>>>>
> > >>>>>>>> Furthermore, --describe should actually show the currently committed
> > >>>>>>>> offset for a group. So it seems redundant to have the same option in
> > >>>>>>>> --reset-offsets
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> * Option 2.a: I would rename to "--reset-to-offset" (or, considering the
> > >>>>>>>> comment above, to "--to-offset")
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> * Option 2.b and 2.c: I would unify to "--shift-offsets-by" (or similar)
> > >>>>>>>> and accept positive/negative values
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> * About Scope "all": maybe it's better to have an option "--all-topics"
> > >>>>>>>> (or similar). IMHO explicit arguments are preferable over implicit
> > >>>>>>>> settings to guard against accidental misuse of the tool.
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> * Scope: I also think that "--topic" (singular) and "--topics" (plural)
> > >>>>>>>> are too similar and easy to use in a wrong way (i.e., mix up) -- maybe we
> > >>>>>>>> can have two options that are easier to distinguish.
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> * I still think that JSON is not the best format (it's too verbose/hard
> > >>>>>>>> to write for humans from scratch). A simple CSV format with implicit
> > >>>>>>>> schema (topic,partition,offset) would be sufficient.
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> * Why does the JSON contain a "group_id" field -- there is a parameter
> > >>>>>>>> "--group" to specify the group ID. Would one overwrite the other (in what
> > >>>>>>>> order) or would there be an error if "--group" is used in combination
> > >>>>>>>> with "--reset-from-file"?
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> -Matthias
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> On 2/17/17 6:43 AM, Jorge Esteban Quilcate Otoya wrote:
> > >>>>>>>>> Hi,
> > >>>>>>>>>
> > >>>>>>>>> according to the feedback, I've updated the KIP:
> > >>>>>>>>>
> > >>>>>>>>> - We have added and ordered the scenarios, scopes and executions of the
> > >>>>>>>>> Reset Offset tool.
> > >>>>>>>>> - Consider it as an extension to the current `ConsumerGroupCommand` tool
> > >>>>>>>>> - Execution will be possible without generating JSON files.
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-122%3A+Add+Reset+Consumer+Group+Offsets+tooling
> > >>>>>>>>>
> > >>>>>>>>> Looking forward to your feedback!
> > >>>>>>>>>
> > >>>>>>>>> Jorge.
> > >>>>>>>>>
> > >>>>>>>>> On Wed, Feb 8, 2017 at 23:23, Jorge Esteban Quilcate Otoya
> > >>>>>>>>> (<quilcate.jorge@gmail.com>) wrote:
> > >>>>>>>>>
> > >>>>>>>>>> Great. I think I got the idea. What about these options:
> > >>>>>>>>>>
> > >>>>>>>>>> Scenarios:
> > >>>>>>>>>>
> > >>>>>>>>>> 1. Current status
> > >>>>>>>>>>
> > >>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1´
> > >>>>>>>>>>
> > >>>>>>>>>> 2. To Datetime
> > >>>>>>>>>>
> > >>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-datetime
> > >>>>>>>>>> 2017-01-01T00:00:00.000´
> > >>>>>>>>>>
> > >>>>>>>>>> 3. To Period
> > >>>>>>>>>>
> > >>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-period P2D´
> > >>>>>>>>>>
> > >>>>>>>>>> 4. To Earliest
> > >>>>>>>>>>
> > >>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-earliest´
> > >>>>>>>>>>
> > >>>>>>>>>> 5. To Latest
> > >>>>>>>>>>
> > >>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-latest´
> > >>>>>>>>>>
> > >>>>>>>>>> 6. Minus 'n' offsets
> > >>>>>>>>>>
> > >>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-minus n´
> > >>>>>>>>>>
> > >>>>>>>>>> 7. Plus 'n' offsets
> > >>>>>>>>>>
> > >>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-plus n´
> > >>>>>>>>>>
> > >>>>>>>>>> 8. To specific offset
> > >>>>>>>>>>
> > >>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to x´
> > >>>>>>>>>>
> > >>>>>>>>>> Scopes:
> > >>>>>>>>>>
> > >>>>>>>>>> a. All topics used by Consumer Group
> > >>>>>>>>>>
> > >>>>>>>>>> Don't specify --topics
> > >>>>>>>>>>
> > >>>>>>>>>> b. Specific List of Topics
> > >>>>>>>>>>
> > >>>>>>>>>> Add list of values in --topics t1,t2,tn
> > >>>>>>>>>>
> > >>>>>>>>>> c. One Topic, all Partitions
> > >>>>>>>>>>
> > >>>>>>>>>> Add one topic and no partitions values: --topic t1
> > >>>>>>>>>>
> > >>>>>>>>>> d. One Topic, List of Partitions
> > >>>>>>>>>>
> > >>>>>>>>>> Add one topic and partitions values: --topic t1 --partitions 0,1,2
> > >>>>>>>>>>
> > >>>>>>>>>> About Reset Plan (JSON file):
> > >>>>>>>>>>
> > >>>>>>>>>> I think it is still valid to have the option to persist the reset
> > >>>>>>>>>> configuration as a file, but I agree to give the option to run the tool
> > >>>>>>>>>> without going down to the JSON file.
> > >>>>>>>>>>
> > >>>>>>>>>> Execution options:
> > >>>>>>>>>>
> > >>>>>>>>>> 1. Without execution argument (No args):
> > >>>>>>>>>>
> > >>>>>>>>>> Print out results (reset plan)
> > >>>>>>>>>>
> > >>>>>>>>>> 2. With --execute argument:
> > >>>>>>>>>>
> > >>>>>>>>>> Run reset process
> > >>>>>>>>>>
> > >>>>>>>>>> 3. With --output argument:
> > >>>>>>>>>>
> > >>>>>>>>>> Save result in a JSON format.
> > >>>>>>>>>>
> > >>>>>>>>>> 4. Only with --execute option and --reset-file (path to JSON)
> > >>>>>>>>>>
> > >>>>>>>>>> Reset based on file
> > >>>>>>>>>>
> > >>>>>>>>>> 5. Only with --verify option and --reset-file (path to JSON)
> > >>>>>>>>>>
> > >>>>>>>>>> Verify file values with current offsets
> > >>>>>>>>>>
> > >>>>>>>>>> I think we can remove --generate-and-execute because it is a bit clumsy.
> > >>>>>>>>>>
> > >>>>>>>>>> With these options we will be able to execute with manual JSON
> > >>>>>>>>>> configuration.
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>> On Wed, Feb 8, 2017 at 22:43, Ben Stopford (<ben@confluent.io>)
> > >>>>>>>>>> wrote:
> > >>>>>>>>>>
> > >>>>>>>>>> Yes - using a tool like this to skip a set of consumer groups over a
> > >>>>>>>>>> corrupt/bad message is definitely appealing.
> > >>>>>>>>>>
> > >>>>>>>>>> B
> > >>>>>>>>>>
> > >>>>>>>>>> On Wed, Feb 8, 2017 at 9:37 PM Gwen Shapira <gw...@confluent.io> wrote:
> > >>>>>>>>>>
> > >>>>>>>>>>> I like the --reset-to-earliest and --reset-to-latest. In general,
> > >>>>>>>>>>> since the JSON route is the most challenging for users, we want to
> > >>>>>>>>>>> provide a lot of ways to do useful things without going there.
> > >>>>>>>>>>>
> > >>>>>>>>>>> Two things that can help:
> > >>>>>>>>>>>
> > >>>>>>>>>>> 1. A lot of times, users want to skip a few messages that cause issues
> > >>>>>>>>>>> and continue. Maybe just specifying the topic, partition and delta
> > >>>>>>>>>>> will be better than having to find the offset and write a JSON and
> > >>>>>>>>>>> validate the JSON etc.
> > >>>>>>>>>>>
> > >>>>>>>>>>> 2. Thinking if there are other common use-cases that we can make easy
> > >>>>>>>>>>> rather than just one generic but not very usable method.
> > >>>>>>>>>>>
> > >>>>>>>>>>> Gwen
> > >>>>>>>>>>>
> > >>>>>>>>>>> On Wed, Feb 8, 2017 at 3:25 AM, Jorge Esteban Quilcate Otoya
> > >>>>>>>>>>> <qu...@gmail.com> wrote:
> > >>>>>>>>>>>> Thanks for the feedback!
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> @Onur, @Gwen:
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Agree. Actually, in the first draft I considered having it inside
> > >>>>>>>>>>>> ´kafka-consumer-groups.sh´, but I decided to propose it as a standalone
> > >>>>>>>>>>>> tool to describe it clearly and focus it on reset functionality.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> But now that you mention it, it does make sense to have it in
> > >>>>>>>>>>>> ´kafka-consumer-groups.sh´. What would be a consistent way to introduce
> > >>>>>>>>>>>> it?
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Maybe something like this:
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --generate --group cg1 --topics t1
> > >>>>>>>>>>>> --reset-from 2017-01-01T00:00:00.000 --output plan.json´
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --verify --reset-json-file
> > >>>>>>>>>>>> plan.json´
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --execute --reset-json-file
> > >>>>>>>>>>>> plan.json´
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --generate-and-execute --group cg1
> > >>>>>>>>>>>> --topics t1 --reset-from 2017-01-01T00:00:00.000´
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> @Gwen:
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>> It looks exactly like the replica assignment tool
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> It was influenced by it ;-) I use the generate-verify-execute process
> > >>>>>>>>>>>> here to make sure the user will be aware of the result of this operation.
> > >>>>>>>>>>>> At the beginning we considered only adding a couple of options to the
> > >>>>>>>>>>>> Consumer Group Command:
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> --rewind-to-timestamp and --rewind-to-period
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> @Onur:
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>> You can actually get away with overriding while members of the group
> > >>>>>>>>>>>>> are live with method 2 by using group information from
> > >>>>>>>>>>>>> DescribeGroupsRequest.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Does this mean that we need to have the Consumer Group stopped before
> > >>>>>>>>>>>> executing, and start a new consumer internally to do this? Therefore, we
> > >>>>>>>>>>>> won't be able to consider executing a reset when the ConsumerGroup is
> > >>>>>>>>>>>> active? (Trying to relate it with @Dong's 5th question.)
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> @Dong:
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>> Should we allow user to use wildcard to reset offset of all groups for
> > >>>>>>>>>>>>> a given topic as well?
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> I haven't thought about this scenario. Could be interesting. Following
> > >>>>>>>>>>>> the recommendation to add it into Consumer Group Command, in this case
> > >>>>>>>>>>>> the Group argument will be optional if there is only 1 topic. I think
> > >>>>>>>>>>>> for multiple topics it won't be that useful.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>> Should we allow user to specify timestamp per topic partition in the
> > >>>>>>>>>>>>> json file as well?
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> I don't think this could be valid from the tool, but if a Reset Plan is
> > >>>>>>>>>>>> generated, and the user wants to set the offset for a specific partition
> > >>>>>>>>>>>> to another offset (eventually based on another timestamp) and execute
> > >>>>>>>>>>>> it, it will be up to her/him.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>> Should the script take some credential file to make sure that this
> > >>>>>>>>>>>>> operation is authenticated given the potential impact of this
> > >>>>>>>>>>>>> operation?
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Haven't tried to secure brokers yet, but the tool should support
> > >>>>>>>>>>>> authorization if it's enabled in the broker.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>> Should we provide constant to reset committed offset to earliest/latest
> > >>>>>>>>>>>>> offset of a partition, e.g. -1 indicates earliest offset and -2
> > >>>>>>>>>>>>> indicates latest offset.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> I will go for something like ´--reset-to-earliest´ and
> > >>>>>>>>>>>> ´--reset-to-latest´
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>> Should we allow dynamic change of the committed offset when consumers
> > >>>>>>>>>>>>> are running, such that consumer will seek to the newly committed offset
> > >>>>>>>>>>>>> and start consuming from there?
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Not sure about this. I would recommend keeping it simple and asking the
> > >>>>>>>>>>>> user to stop consumers first. But I would consider it if the trade-offs
> > >>>>>>>>>>>> are clear.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> @Matthias
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Added :). And thanks a lot for your help to define this KIP!
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> On Wed, Feb 8, 2017 at 7:47, Gwen Shapira (<gwen@confluent.io>)
> > >>>>>>>>>>>> wrote:
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>> As long as the CLI is a bit consistent? Like, not just adding 3
> > >>>>>>>>>>>>> arguments and a JSON parser to the existing tool, right?
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> On Tue, Feb 7, 2017 at 10:29 PM, Onur Karaman
> > >>>>>>>>>>>>> <on...@gmail.com> wrote:
> > >>>>>>>>>>>>>> I think it makes sense to just add the feature to
> > >>>>>>>>>>>>>> kafka-consumer-groups.sh
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> On Tue, Feb 7, 2017 at 10:24 PM, Gwen Shapira <gwen@confluent.io> wrote:
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> Thanks for the KIP. I'm super happy about adding the capability.
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> I hate the interface, though. It looks exactly like the replica
> > >>>>>>>>>>>>>>> assignment tool. A tool everyone loves so much that there are multiple
> > >>>>>>>>>>>>>>> projects, open and closed, that try to fix it.
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> Can we swap it with something that looks a bit more like the consumer
> > >>>>>>>>>>>>>>> group tool? or the kafka streams reset tool? Consistency is helpful in
> > >>>>>>>>>>>>>>> such cases. I spent some time learning existing tools and learning yet
> > >>>>>>>>>>>>>>> another one is a deterrent.
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> Gwen
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> On Tue, Feb 7, 2017 at 6:43 PM, Jorge Esteban Quilcate Otoya
> > >>>>>>>>>>>>>>> <qu...@gmail.com> wrote:
> > >>>>>>>>>>>>>>>> Hi all,
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> I would like to propose a KIP to Add a tool to Reset Consumer Group
> > >>>>>>>>>>>>>>>> Offsets.
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-122%3A+Add+a+tool+to+Reset+Consumer+Group+Offsets
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> Please, take a look at the proposal and share your feedback.
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> Thanks,
> > >>>>>>>>>>>>>>>> Jorge.
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> --
> > >>>>>>>>>>>>>>> Gwen Shapira
> > >>>>>>>>>>>>>>> Product Manager | Confluent
> > >>>>>>>>>>>>>>> 650.450.2760 | @gwenshap
> > >>>>>>>>>>>>>>> Follow us: Twitter | blog
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> --
> > >>>>>>>>>>>>> Gwen Shapira
> > >>>>>>>>>>>>> Product Manager | Confluent
> > >>>>>>>>>>>>> 650.450.2760 | @gwenshap
> > >>>>>>>>>>>>> Follow us: Twitter | blog
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>> --
> > >>>>>>>>>>> Gwen Shapira
> > >>>>>>>>>>> Product Manager | Confluent
> > >>>>>>>>>>> 650.450.2760 | @gwenshap
> > >>>>>>>>>>> Follow us: Twitter | blog
> > >>>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>
> > >>>>>>
> > >>>>>>
> > >>>>>
> > >>>>
> > >>>>
> > >>>
> > >>
> > >>
> > >
> >
> >
>
>
>
>
>
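
The `--topic t1:0,1,2 --topic t2 --topic t3:2` scope format discussed in the quoted thread above can be sketched roughly as follows. This is only an illustration in Python, not the tool's implementation: the function name `parse_topic_args` and the use of `None` for "all partitions" are assumptions made for this example.

```python
from typing import Dict, List, Optional


def parse_topic_args(topic_args: List[str]) -> Dict[str, Optional[List[int]]]:
    """Map each topic to its partition list; None stands for "all partitions"."""
    scope: Dict[str, Optional[List[int]]] = {}
    for arg in topic_args:
        # ':' is not allowed in topic names, so the first ':' splits topic from partitions.
        topic, sep, partitions = arg.partition(":")
        if not topic:
            raise ValueError(f"missing topic name in {arg!r}")
        if not sep:
            scope[topic] = None  # no partition list given: select every partition
        elif not partitions:
            raise ValueError(f"empty partition list in {arg!r}")
        else:
            scope[topic] = [int(p) for p in partitions.split(",")]
    return scope


if __name__ == "__main__":
    # One value per repeated --topic argument.
    print(parse_topic_args(["t1:0,1,2", "t2", "t3:2"]))
```

Because each topic arrives as its own argument, no extra separator character between topics is needed, which is the point made in the thread in favour of repeating --topic.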

Re: KIP-122: Add a tool to Reset Consumer Group Offsets

Posted by Vahid S Hashemian <va...@us.ibm.com>.
Hi Jorge,

Thanks for the useful KIP.

I have a question regarding the proposed "plan" option.
The "current offset" and "lag" values of a topic partition are meaningful 
within a consumer group. In other words, different consumer groups could 
have different values for these properties of each topic partition.
I don't see that reflected in the discussion around the "plan" option, 
unless we are assuming a "--group" option is also provided by the user (it 
is not clear from the KIP whether that is the case).
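
To illustrate the point about group-scoped values, here is a minimal Python sketch. It assumes lag is computed as the log end offset minus the group's committed offset; the function name `lag_per_group` and the sample numbers are made up for this example and do not come from the KIP.

```python
from typing import Dict, Tuple

TopicPartition = Tuple[str, int]


def lag_per_group(
    log_end_offsets: Dict[TopicPartition, int],
    committed: Dict[str, Dict[TopicPartition, int]],
) -> Dict[str, Dict[TopicPartition, int]]:
    """Compute lag per group: the same partition yields a different lag for each group."""
    return {
        group: {tp: log_end_offsets[tp] - offset for tp, offset in offsets.items()}
        for group, offsets in committed.items()
    }


if __name__ == "__main__":
    log_end = {("t1", 0): 100}
    # Two groups have committed different offsets for the same topic partition.
    committed = {"cg1": {("t1", 0): 90}, "cg2": {("t1", 0): 40}}
    print(lag_per_group(log_end, committed))
```

The log end offset belongs to the partition, while the committed offset and the lag only exist relative to a group, which is why a plan without a group would be ambiguous.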

Also, I was wondering if you can provide at least one full command example 
for each of the "plan", "execute", and "export" options. They would 
definitely help in understanding some of the details.

Sorry for the delayed question/suggestion. I hope they make sense.

Thanks.
--Vahid
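
As background for the "export" option, the quoted thread below settles on a CSV format with the implicit schema topic,partition,offset. A rough Python sketch of such a round trip follows; the function names `export_offsets` and `import_offsets` are hypothetical, and the exact file layout of the final tool may differ.

```python
import csv
import io
from typing import Dict, Tuple

TopicPartition = Tuple[str, int]


def export_offsets(offsets: Dict[TopicPartition, int]) -> str:
    """Write one 'topic,partition,offset' row per partition, with no header row."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    for (topic, partition), offset in sorted(offsets.items()):
        writer.writerow([topic, partition, offset])
    return buf.getvalue()


def import_offsets(text: str) -> Dict[TopicPartition, int]:
    """Read the same format back into a {(topic, partition): offset} mapping."""
    result: Dict[TopicPartition, int] = {}
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # tolerate blank lines
        topic, partition, offset = row
        result[(topic, int(partition))] = int(offset)
    return result


if __name__ == "__main__":
    plan = {("t1", 1): 7, ("t1", 0): 42}
    text = export_offsets(plan)
    print(text, end="")
    print(import_offsets(text) == plan)
```

Note that the group is not part of each row in this sketch; it would have to come from "--group", which matches the question raised in the thread about the "group_id" field.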



From:   Jorge Esteban Quilcate Otoya <qu...@gmail.com>
To:     dev@kafka.apache.org
Date:   02/24/2017 09:51 AM
Subject:        Re: KIP-122: Add a tool to Reset Consumer Group Offsets



Great! KIP updated.
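
For readers following the naming discussion, here is a rough Python sketch of what --shift-by and --by-duration could mean. Several things are assumptions for illustration only: the helper names, clamping the shifted offset to the partition's earliest/latest offsets, subtracting the duration from the current time, and a parser that accepts only a simplified subset of the 'PnDTnHnMnS' format (whole, unsigned numbers).

```python
import re
from datetime import datetime, timedelta, timezone

# Simplified subset of the 'PnDTnHnMnS' form: whole, unsigned numbers only.
_DURATION = re.compile(r"^P(?:(\d+)D)?(?:T(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?)?$")


def parse_duration(text: str) -> timedelta:
    """Parse strings such as 'P2D' or 'PT1H30M' into a timedelta."""
    match = _DURATION.match(text)
    # Reject strings without a trailing component, such as "P", "PT" or "P1DT".
    if match is None or text.endswith(("P", "T")):
        raise ValueError(f"not a PnDTnHnMnS duration: {text!r}")
    days, hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return timedelta(days=days, hours=hours, minutes=minutes, seconds=seconds)


def shift_offset(current: int, shift: int, earliest: int, latest: int) -> int:
    """--shift-by n: move the offset by n (positive or negative), kept within bounds."""
    return max(earliest, min(latest, current + shift))


def target_timestamp(now: datetime, duration: timedelta) -> datetime:
    """--by-duration: the point in time whose offsets would be looked up."""
    return now - duration


if __name__ == "__main__":
    print(shift_offset(current=100, shift=-30, earliest=0, latest=500))
    now = datetime(2017, 2, 24, tzinfo=timezone.utc)
    print(target_timestamp(now, parse_duration("P2D")))
```

A single signed --shift-by replaces the earlier separate plus and minus options, and a fixed-length duration avoids daylight saving ambiguity, as noted in the quoted thread.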



On Fri, Feb 24, 2017 at 18:22, Matthias J. Sax (<ma...@confluent.io>) wrote:

> I like this!
>
> --by-duration and --shift-by
>
>
> -Matthias
>
> On 2/24/17 12:57 AM, Jorge Esteban Quilcate Otoya wrote:
> > Renaming to --by-duration LGTM
> >
> > Not sure about changing it to --shift-by-duration because we could end up
> > with the same redundancy as before with reset: --reset-offsets
> > --reset-to-*.
> >
> > Maybe changing --shift-offset-by to --shift-by 'n' could make it consistent
> > enough?
> >
> >
> > On Fri, Feb 24, 2017 at 6:39, Matthias J. Sax (<matthias@confluent.io>)
> > wrote:
> >
> >> I just read the updated KIP once more.
> >>
> >> I would suggest renaming --to-duration to --by-duration
> >>
> >> Or, as a second idea, rename --to-duration to --shift-by-duration and at
> >> the same time rename --shift-offset-by to --shift-by-offset
> >>
> >> Not sure what the best option is, but naming would be more consistent IMHO.
> >>
> >>
> >>
> >> -Matthias
> >>
> >> On 2/23/17 4:42 PM, Jorge Esteban Quilcate Otoya wrote:
> >>> Hi All,
> >>>
> >>> If there are no more concerns, I'd like to start vote for this KIP.
> >>>
> >>> Thanks!
> >>> Jorge.
> >>>
> >>> On Thu, Feb 23, 2017 at 22:50, Jorge Esteban Quilcate Otoya
> >>> (<quilcate.jorge@gmail.com>) wrote:
> >>>
> >>>> Oh ok :)
> >>>>
> >>>> So, we can keep `--topic t1:1,2,3`
> >>>>
> >>>> I think with this one we have most of the feedback applied. I will update
> >>>> the KIP with this change.
> >>>>
> >>>> On Thu, Feb 23, 2017 at 22:38, Matthias J. Sax (<matthias@confluent.io>)
> >>>> wrote:
> >>>>
> >>>> Sounds reasonable.
> >>>>
> >>>> If we have multiple --topic arguments, it also does not matter whether we
> >>>> use t1:1,2 or t2=1,2
> >>>>
> >>>> I just suggested '=' because I wanted to use ':' to chain multiple topics.
> >>>>
> >>>>
> >>>> -Matthias
> >>>>
> >>>> On 2/23/17 10:49 AM, Jorge Esteban Quilcate Otoya wrote:
> >>>>> Yep, `--topic t1=1,2` LGTM
> >>>>>
> >>>>> I don't have an idea either for getting rid of the repeated --topic, but
> >>>>> --group is also repeated in the case of deletion, so it could be ok to
> >>>>> have repeated --topic arguments.
> >>>>>
> >>>>> On Thu, Feb 23, 2017 at 19:14, Matthias J. Sax (<matthias@confluent.io>)
> >>>>> wrote:
> >>>>>
> >>>>>> So you suggest to merge "scope options" --topics, --topic, and
> >>>>>> --partitions into a single option? Sounds good to me.
> >>>>>>
> >>>>>> I like the compact way to express it, i.e., topicname:list-of-partitions
> >>>>>> with "all partitions" if no partitions are specified. It's quite
> >>>>>> intuitive to use.
> >>>>>>
> >>>>>> Just wondering if we could get rid of the repeated --topic option; it's
> >>>>>> somewhat verbose. I have no good idea how to improve it, though.
> >>>>>>
> >>>>>> If you concatenate multiple topics, we need one more character that is
> >>>>>> not allowed in topic names to separate the topics:
> >>>>>>
> >>>>>>> invalidChars = {'/', '\\', ',', '\u0000', ':', '"', '\'', ';', '*',
> >>>>>>> '?', ' ', '\t', '\r', '\n', '='};
> >>>>>>
> >>>>>> maybe
> >>>>>>
> >>>>>> --topics t1=1,2,3:t2:t3=3
> >>>>>>
> >>>>>> use '=' to specify partitions (instead of ':' as you proposed) and ':'
> >>>>>> to separate topics? All other characters seem worse to me.
> >>>>>> But maybe you have a better idea.
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> -Matthias
> >>>>>>
> >>>>>>
> >>>>>> On 2/23/17 3:15 AM, Jorge Esteban Quilcate Otoya wrote:
> >>>>>>> @Matthias about the point 9:
> >>>>>>>
> >>>>>>> What about keeping only the --topic option, and supporting this format:
> >>>>>>>
> >>>>>>> `--topic t1:0,1,2 --topic t2 --topic t3:2`
> >>>>>>>
> >>>>>>> In this case topics t1, t2, and t3 will be selected: topic t1 with
> >>>>>>> partitions 0, 1, and 2; topic t2 with all its partitions; and topic t3
> >>>>>>> with only partition 2.
> >>>>>>>
> >>>>>>> Jorge.
> >>>>>>>
> >>>>>>> On Tue, Feb 21, 2017 at 11:11, Jorge Esteban Quilcate Otoya
> >>>>>>> (<quilcate.jorge@gmail.com>) wrote:
> >>>>>>>
> >>>>>>>> Thanks for the feedback, Matthias.
> >>>>>>>>
> >>>>>>>> * 1. You're right. I'll reorder the scenarios.
> >>>>>>>>
> >>>>>>>> * 2. Agree. I'll update the KIP.
> >>>>>>>>
> >>>>>>>> * 3. I like it, updating to `reset-offsets`
> >>>>>>>>
> >>>>>>>> * 4. Agree, removing the `reset-` part
> >>>>>>>>
> >>>>>>>> * 5. Yes, the 1.e option without --execute or --export will print out the
> >>>>>>>> current offset and the new offset, which will be the same. The use case of
> >>>>>>>> this option is mostly to use it in combination with --export and have a
> >>>>>>>> current 'checkpoint' to reset to later. I will add to the KIP what the
> >>>>>>>> output should look like.
> >>>>>>>>
> >>>>>>>> * 6. Considering 4., I will update it to `--to-offset`
> >>>>>>>>
> >>>>>>>> * 7. I like the idea to unify these options (plus, minus).
> >>>>>>>> `shift-offsets-by` is a good option, but I would like some more feedback
> >>>>>>>> here about the name. I will update the KIP in the meantime.
> >>>>>>>>
> >>>>>>>> * 8. Yes, discussed in 9.
> >>>>>>>>
> >>>>>>>> * 9. Agree. I'd love some feedback here. `topic` is already used by
> >>>>>>>> `delete`, and we can add `--all-topics` to consider all topics/partitions
> >>>>>>>> assigned to a group. How could we define specific topics/partitions?
> >>>>>>>>
> >>>>>>>> * 10. Haven't thought about it, but it makes sense.
> >>>>>>>> <topic>,<partition>,<offset> would be enough.
> >>>>>>>>
> >>>>>>>> * 11. Agree. Solved with 10.
> >>>>>>>>
> >>>>>>>> Also, I have a couple of changes to mention:
> >>>>>>>>
> >>>>>>>> 1. I have added a reference to the branch where I'm working on this KIP.
> >>>>>>>>
> >>>>>>>> 2. About the period scenario `--to-period`. I will change it to
> >>>>>>>> `--to-duration` given that duration (
> >>>>>>>> https://docs.oracle.com/javase/8/docs/api/java/time/Duration.html )
> >>>>>>>> follows this format: 'PnDTnHnMnS' and does not consider daylight saving
> >>>>>>>> effects.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Tue, Feb 21, 2017 at 2:47, Matthias J. Sax (<matthias@confluent.io>)
> >>>>>>>> wrote:
> >>>>>>>>
> >>>>>>>> Hi,
> >>>>>>>>
> >>>>>>>> thanks for updating the KIP. Couple of follow up comments:
> >>>>>>>>
> >>>>>>>> * Nit: Why are "Reset to Earliest" and "Reset to Latest" "reset by
> >>>>>>>> time" options -- IMHO they belong to "reset by position"?
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> * Nit: Description of "Reset to Earliest"
> >>>>>>>>
> >>>>>>>>> using Kafka Consumer's `auto.offset.reset` to `earliest`
> >>>>>>>>
> >>>>>>>> I think this is strictly speaking not correct (as auto.offset.reset is
> >>>>>>>> only triggered if no valid offset is found, but this tool explicitly
> >>>>>>>> modifies the committed offset), and should be phrased as
> >>>>>>>>
> >>>>>>>>> using Kafka Consumer's #seekToBeginning()
> >>>>>>>>
> >>>>>>>> -> similar issue for description of "Reset to Latest"
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> * Main option: rename to --reset-offsets (plural instead of singular)
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> * Scenario Options: I would remove "reset" from all options, because the
> >>>>>>>> main argument "--reset-offset" already says what to do:
> >>>>>>>>
> >>>>>>>>> bin/kafka-consumer-groups.sh --reset-offset --reset-to-datetime XXX
> >>>>>>>>
> >>>>>>>> better (IMHO):
> >>>>>>>>
> >>>>>>>>> bin/kafka-consumer-groups.sh --reset-offsets --to-datetime XXX
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> * Option 1.e ("print and export current offset") is not intuitive to use
> >>>>>>>> IMHO. The main option is "--reset-offset" but nothing happens if no
> >>>>>>>> scenario is specified. It is also not specified what the output should
> >>>>>>>> look like.
> >>>>>>>>
> >>>>>>>> Furthermore, --describe should actually show the currently committed
> >>>>>>>> offset for a group. So it seems redundant to have the same option in
> >>>>>>>> --reset-offsets
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> * Option 2.a: I would rename to "--reset-to-offset" (or, considering
> >>>>>>>> the comment above, to "--to-offset")
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> * Option 2.b and 2.c: I would unify to "--shift-offsets-by" (or
> >>>>>>>> similar) and accept positive/negative values
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> * About Scope "all": maybe it's better to have an option
> >>>>>>>> "--all-topics" (or similar). IMHO explicit arguments are preferable
> >>>>>>>> over implicit settings to guard against accidental misuse of the
> >>>>>>>> tool.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> * Scope: I also think that "--topic" (singular) and "--topics"
> >>>>>>>> (plural) are too similar and easy to use in a wrong way (i.e., mix
> >>>>>>>> up) -- maybe we can have two options that are easier to distinguish.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> * I still think that JSON is not the best format (it's too
> >>>>>>>> verbose/hard to write for humans from scratch). A simple CSV format
> >>>>>>>> with implicit schema (topic,partition,offset) would be sufficient.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> * Why does the JSON contain a "group_id" field -- there is a
> >>>>>>>> parameter "--group" to specify the group ID. Would one overwrite the
> >>>>>>>> other (in what order) or would there be an error if "--group" is
> >>>>>>>> used in combination with "--reset-from-file"?
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> -Matthias
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On 2/17/17 6:43 AM, Jorge Esteban Quilcate Otoya wrote:
> >>>>>>>>> Hi,
> >>>>>>>>>
> >>>>>>>>> according to the feedback, I've updated the KIP:
> >>>>>>>>>
> >>>>>>>>> - We have added and ordered the scenarios, scopes and executions of
> >>>>>>>>> the Reset Offset tool.
> >>>>>>>>> - Consider it as an extension to the current `ConsumerGroupCommand`
> >>>>>>>>> tool
> >>>>>>>>> - Execution will be possible without generating JSON files.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-122%3A+Add+Reset+Consumer+Group+Offsets+tooling
> >>>>>>>>>
> >>>>>>>>> Looking forward to your feedback!
> >>>>>>>>>
> >>>>>>>>> Jorge.
> >>>>>>>>>
> >>>>>>>>> On Wed, Feb 8, 2017 at 23:23, Jorge Esteban Quilcate Otoya
> >>>>>>>>> (<quilcate.jorge@gmail.com>) wrote:
> >>>>>>>>>
> >>>>>>>>>> Great. I think I got the idea. What about these options:
> >>>>>>>>>>
> >>>>>>>>>> Scenarios:
> >>>>>>>>>>
> >>>>>>>>>> 1. Current status
> >>>>>>>>>>
> >>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1´
> >>>>>>>>>>
> >>>>>>>>>> 2. To Datetime
> >>>>>>>>>>
> >>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1
> >>>>>>>>>> --reset-to-datetime 2017-01-01T00:00:00.000´
> >>>>>>>>>>
> >>>>>>>>>> 3. To Period
> >>>>>>>>>>
> >>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-period P2D´
> >>>>>>>>>>
> >>>>>>>>>> 4. To Earliest
> >>>>>>>>>>
> >>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-earliest´
> >>>>>>>>>>
> >>>>>>>>>> 5. To Latest
> >>>>>>>>>>
> >>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-latest´
> >>>>>>>>>>
> >>>>>>>>>> 6. Minus 'n' offsets
> >>>>>>>>>>
> >>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-minus n´
> >>>>>>>>>>
> >>>>>>>>>> 7. Plus 'n' offsets
> >>>>>>>>>>
> >>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-plus n´
> >>>>>>>>>>
> >>>>>>>>>> 8. To specific offset
> >>>>>>>>>>
> >>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to x´
> >>>>>>>>>>
> >>>>>>>>>> Scopes:
> >>>>>>>>>>
> >>>>>>>>>> a. All topics used by Consumer Group
> >>>>>>>>>>
> >>>>>>>>>> Don't specify --topics
> >>>>>>>>>>
> >>>>>>>>>> b. Specific List of Topics
> >>>>>>>>>>
> >>>>>>>>>> Add list of values in --topics t1,t2,tn
> >>>>>>>>>>
> >>>>>>>>>> c. One Topic, all Partitions
> >>>>>>>>>>
> >>>>>>>>>> Add one topic and no partitions values: --topic t1
> >>>>>>>>>>
> >>>>>>>>>> d. One Topic, List of Partitions
> >>>>>>>>>>
> >>>>>>>>>> Add one topic and partitions values: --topic t1 --partitions 0,1,2
> >>>>>>>>>>
> >>>>>>>>>> About Reset Plan (JSON file):
> >>>>>>>>>>
> >>>>>>>>>> I think it is still valid to have the option to persist the reset
> >>>>>>>>>> configuration as a file, but I agree to give the option to run the
> >>>>>>>>>> tool without going down to the JSON file.
> >>>>>>>>>>
> >>>>>>>>>> Execution options:
> >>>>>>>>>>
> >>>>>>>>>> 1. Without execution argument (No args):
> >>>>>>>>>>
> >>>>>>>>>> Print out results (reset plan)
> >>>>>>>>>>
> >>>>>>>>>> 2. With --execute argument:
> >>>>>>>>>>
> >>>>>>>>>> Run reset process
> >>>>>>>>>>
> >>>>>>>>>> 3. With --output argument:
> >>>>>>>>>>
> >>>>>>>>>> Save result in a JSON format.
> >>>>>>>>>>
> >>>>>>>>>> 4. Only with --execute option and --reset-file (path to JSON)
> >>>>>>>>>>
> >>>>>>>>>> Reset based on file
> >>>>>>>>>>
> >>>>>>>>>> 5. Only with --verify option and --reset-file (path to JSON)
> >>>>>>>>>>
> >>>>>>>>>> Verify file values with current offsets
> >>>>>>>>>>
> >>>>>>>>>> I think we can remove --generate-and-execute because it is a bit
> >>>>>>>>>> clumsy.
> >>>>>>>>>>
> >>>>>>>>>> With these options we will be able to execute with manual JSON
> >>>>>>>>>> configuration.
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> On Wed, Feb 8, 2017 at 22:43, Ben Stopford (<ben@confluent.io>)
> >>>>>>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>> Yes - using a tool like this to skip a set of consumer groups over a
> >>>>>>>>>> corrupt/bad message is definitely appealing.
> >>>>>>>>>>
> >>>>>>>>>> B
> >>>>>>>>>>
> >>>>>>>>>> On Wed, Feb 8, 2017 at 9:37 PM Gwen Shapira <gw...@confluent.io>
> >>>>>>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> I like the --reset-to-earliest and --reset-to-latest. In general,
> >>>>>>>>>>> since the JSON route is the most challenging for users, we want to
> >>>>>>>>>>> provide a lot of ways to do useful things without going there.
> >>>>>>>>>>>
> >>>>>>>>>>> Two things that can help:
> >>>>>>>>>>>
> >>>>>>>>>>> 1. A lot of times, users want to skip a few messages that cause
> >>>>>>>>>>> issues and continue. Maybe just specifying the topic, partition and
> >>>>>>>>>>> delta will be better than having to find the offset and write a JSON
> >>>>>>>>>>> and validate the JSON etc.
> >>>>>>>>>>>
> >>>>>>>>>>> 2. Thinking if there are other common use-cases that we can make
> >>>>>>>>>>> easy rather than just one generic but not very usable method.
> >>>>>>>>>>>
> >>>>>>>>>>> Gwen
> >>>>>>>>>>>
> >>>>>>>>>>> On Wed, Feb 8, 2017 at 3:25 AM, Jorge Esteban Quilcate Otoya
> >>>>>>>>>>> <qu...@gmail.com> wrote:
> >>>>>>>>>>>> Thanks for the feedback!
> >>>>>>>>>>>>
> >>>>>>>>>>>> @Onur, @Gwen:
> >>>>>>>>>>>>
> >>>>>>>>>>>> Agree. Actually, in the first draft I considered having it inside
> >>>>>>>>>>>> ´kafka-consumer-groups.sh´, but I decided to propose it as a
> >>>>>>>>>>>> standalone tool to describe it clearly and focus it on reset
> >>>>>>>>>>>> functionality.
> >>>>>>>>>>>>
> >>>>>>>>>>>> But now that you mention it, it does make sense to have it in
> >>>>>>>>>>>> ´kafka-consumer-groups.sh´. What would be a consistent way to
> >>>>>>>>>>>> introduce it?
> >>>>>>>>>>>>
> >>>>>>>>>>>> Maybe something like this:
> >>>>>>>>>>>>
> >>>>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --generate --group cg1
> >>>>>>>>>>>> --topics t1 --reset-from 2017-01-01T00:00:00.000 --output plan.json´
> >>>>>>>>>>>>
> >>>>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --verify --reset-json-file
> >>>>>>>>>>>> plan.json´
> >>>>>>>>>>>>
> >>>>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --execute --reset-json-file
> >>>>>>>>>>>> plan.json´
> >>>>>>>>>>>>
> >>>>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --generate-and-execute
> >>>>>>>>>>>> --group cg1 --topics t1 --reset-from 2017-01-01T00:00:00.000´
> >>>>>>>>>>>>
> >>>>>>>>>>>> @Gwen:
> >>>>>>>>>>>>
> >>>>>>>>>>>>> It looks exactly like the replica assignment tool
> >>>>>>>>>>>>
> >>>>>>>>>>>> It was influenced by it ;-) I use the generate-verify-execute
> >>>>>>>>>>>> process here to make sure the user will be aware of the result of
> >>>>>>>>>>>> this operation. At the beginning we considered only adding a couple
> >>>>>>>>>>>> of options to Consumer Group Command:
> >>>>>>>>>>>>
> >>>>>>>>>>>> --rewind-to-timestamp and --rewind-to-period
> >>>>>>>>>>>>
> >>>>>>>>>>>> @Onur:
> >>>>>>>>>>>>
> >>>>>>>>>>>>> You can actually get away with overriding while members of the
> >>>>>>>>>>>>> group are live with method 2 by using group information from
> >>>>>>>>>>>>> DescribeGroupsRequest.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Does this mean that we need to have the Consumer Group stopped
> >>>>>>>>>>>> before executing, and start a new consumer internally to do this?
> >>>>>>>>>>>> And therefore we won't be able to consider executing a reset when
> >>>>>>>>>>>> the ConsumerGroup is active? (trying to relate it with @Dong's 5th
> >>>>>>>>>>>> question)
> >>>>>>>>>>>>
> >>>>>>>>>>>> @Dong:
> >>>>>>>>>>>>
> >>>>>>>>>>>>> Should we allow user to use wildcard to reset offset of all
> >>>>>>>>>>>>> groups for a given topic as well?
> >>>>>>>>>>>>
> >>>>>>>>>>>> I haven't thought about this scenario. Could be interesting.
> >>>>>>>>>>>> Following the recommendation to add it into Consumer Group Command,
> >>>>>>>>>>>> in this case the Group argument will be optional if there is only 1
> >>>>>>>>>>>> topic. I think for multiple topics it won't be that useful.
> >>>>>>>>>>>>
> >>>>>>>>>>>>> Should we allow user to specify timestamp per topic partition in
> >>>>>>>>>>>>> the json file as well?
> >>>>>>>>>>>>
> >>>>>>>>>>>> I don't think this would be valid from the tool, but if a Reset
> >>>>>>>>>>>> Plan is generated, and the user wants to set the offset for a
> >>>>>>>>>>>> specific partition to another offset (eventually based on another
> >>>>>>>>>>>> timestamp), and execute it, it will be up to her/him.
> >>>>>>>>>>>>
> >>>>>>>>>>>>> Should the script take some credential file to make sure that
> >>>>>>>>>>>>> this operation is authenticated given the potential impact of this
> >>>>>>>>>>>>> operation?
> >>>>>>>>>>>>
> >>>>>>>>>>>> Haven't tried to secure brokers yet, but the tool should support
> >>>>>>>>>>>> authorization if it's enabled in the broker.
> >>>>>>>>>>>>
> >>>>>>>>>>>>> Should we provide constant to reset committed offset to
> >>>>>>>>>>>>> earliest/latest offset of a partition, e.g. -1 indicates earliest
> >>>>>>>>>>>>> offset and -2 indicates latest offset.
> >>>>>>>>>>>>
> >>>>>>>>>>>> I will go for something like ´--reset-to-earliest´ and
> >>>>>>>>>>>> ´--reset-to-latest´
> >>>>>>>>>>>>
> >>>>>>>>>>>>> Should we allow dynamic change of the committed offset when
> >>>>>>>>>>>>> consumers are running, such that consumers will seek to the newly
> >>>>>>>>>>>>> committed offset and start consuming from there?
> >>>>>>>>>>>>
> >>>>>>>>>>>> Not sure about this. I would recommend keeping it simple and asking
> >>>>>>>>>>>> users to stop consumers first. But I would consider it if the
> >>>>>>>>>>>> trade-offs are clear.
> >>>>>>>>>>>>
> >>>>>>>>>>>> @Matthias
> >>>>>>>>>>>>
> >>>>>>>>>>>> Added :). And thanks a lot for your help to define this KIP!
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Wed, Feb 8, 2017 at 7:47, Gwen Shapira (<gwen@confluent.io>)
> >>>>>>>>>>>> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>>> As long as the CLI is a bit consistent? Like, not just adding 3
> >>>>>>>>>>>>> arguments and a JSON parser to the existing tool, right?
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Tue, Feb 7, 2017 at 10:29 PM, Onur Karaman
> >>>>>>>>>>>>> <on...@gmail.com> wrote:
> >>>>>>>>>>>>>> I think it makes sense to just add the feature to
> >>>>>>>>>>>>>> kafka-consumer-groups.sh
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> On Tue, Feb 7, 2017 at 10:24 PM, Gwen Shapira <gwen@confluent.io>
> >>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Thanks for the KIP. I'm super happy about adding the capability.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> I hate the interface, though. It looks exactly like the replica
> >>>>>>>>>>>>>>> assignment tool. A tool everyone loves so much that there are
> >>>>>>>>>>>>>>> multiple projects, open and closed, that try to fix it.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Can we swap it with something that looks a bit more like the
> >>>>>>>>>>>>>>> consumer group tool? or the kafka streams reset tool? Consistency
> >>>>>>>>>>>>>>> is helpful in such cases. I spent some time learning existing
> >>>>>>>>>>>>>>> tools and learning yet another one is a deterrent.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Gwen
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On Tue, Feb 7, 2017 at 6:43 PM, Jorge Esteban Quilcate Otoya
> >>>>>>>>>>>>>>> <qu...@gmail.com> wrote:
> >>>>>>>>>>>>>>>> Hi all,
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> I would like to propose a KIP to Add a tool to Reset Consumer
> >>>>>>>>>>>>>>>> Group Offsets.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-122%3A+Add+a+tool+to+Reset+Consumer+Group+Offsets
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Please, take a look at the proposal and share your feedback.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>>>>> Jorge.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> --
> >>>>>>>>>>>>>>> Gwen Shapira
> >>>>>>>>>>>>>>> Product Manager | Confluent
> >>>>>>>>>>>>>>> 650.450.2760 | @gwenshap
> >>>>>>>>>>>>>>> Follow us: Twitter | blog
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> --
> >>>>>>>>>>>>> Gwen Shapira
> >>>>>>>>>>>>> Product Manager | Confluent
> >>>>>>>>>>>>> 650.450.2760 | @gwenshap
> >>>>>>>>>>>>> Follow us: Twitter | blog
> >>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> --
> >>>>>>>>>>> Gwen Shapira
> >>>>>>>>>>> Product Manager | Confluent
> >>>>>>>>>>> 650.450.2760 | @gwenshap
> >>>>>>>>>>> Follow us: Twitter | blog
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>>
> >>>
> >>
> >>
> >
>
>





Re: KIP-122: Add a tool to Reset Consumer Group Offsets

Posted by Jorge Esteban Quilcate Otoya <qu...@gmail.com>.
Great! KIP updated.



On Fri, Feb 24, 2017 at 18:22, Matthias J. Sax (<ma...@confluent.io>)
wrote:

> I like this!
>
> --by-duration and --shift-by
>
>
> -Matthias
>
> On 2/24/17 12:57 AM, Jorge Esteban Quilcate Otoya wrote:
> > Renaming to --by-duration LGTM
> >
> > Not sure about changing it to --shift-by-duration because we could end up
> > with the same redundancy as before with reset: --reset-offsets
> > --reset-to-*.
> >
> > Maybe changing --shift-offset-by to --shift-by 'n' could make it
> > consistent enough?
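
To illustrate what a single signed --shift-by 'n' argument could mean, here is a minimal sketch in Python. It is not the tool's implementation: the function name `shift_offset` is hypothetical, and clamping the result to the partition's earliest/latest offsets is an assumption that the thread does not specify.

```python
def shift_offset(current: int, shift_by: int, earliest: int, latest: int) -> int:
    """Return the offset reached by moving `current` by `shift_by` positions.

    A negative `shift_by` rewinds and a positive one skips ahead.
    Assumption (not stated in the thread): the result is clamped to the
    range [earliest, latest] so it always points at a valid position.
    """
    if earliest > latest:
        raise ValueError("earliest offset must not exceed latest offset")
    return max(earliest, min(latest, current + shift_by))


# Rewind by 10 from offset 100 on a partition covering offsets 0..200.
print(shift_offset(100, -10, earliest=0, latest=200))
```

One signed argument covers both the earlier "minus" and "plus" scenarios, which is the unification suggested in the discussion.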
> >
> >
> > On Fri, Feb 24, 2017 at 6:39, Matthias J. Sax (<matthias@confluent.io>)
> > wrote:
> >
> >> I just read the updated KIP once more.
> >>
> >> I would suggest renaming --to-duration to --by-duration
> >>
> >> Or, as a second idea, rename --to-duration to --shift-by-duration and at
> >> the same time rename --shift-offset-by to --shift-by-offset
> >>
> >> Not sure what the best option is, but naming would be more consistent
> >> IMHO.
> >>
> >>
> >>
> >> -Matthias
> >>
> >> On 2/23/17 4:42 PM, Jorge Esteban Quilcate Otoya wrote:
> >>> Hi All,
> >>>
> >>> If there are no more concerns, I'd like to start vote for this KIP.
> >>>
> >>> Thanks!
> >>> Jorge.
> >>>
> >>> On Thu, Feb 23, 2017 at 22:50, Jorge Esteban Quilcate Otoya
> >>> (<quilcate.jorge@gmail.com>) wrote:
> >>>
> >>>> Oh ok :)
> >>>>
> >>>> So, we can keep `--topic t1:1,2,3`
> >>>>
> >>>> I think with this one we have most of the feedback applied. I will
> >>>> update the KIP with this change.
> >>>>
> >>>> On Thu, Feb 23, 2017 at 22:38, Matthias J. Sax (<matthias@confluent.io>)
> >>>> wrote:
> >>>>
> >>>> Sounds reasonable.
> >>>>
> >>>> If we have multiple --topic arguments, it also does not matter if we
> >>>> use t1:1,2 or t2=1,2
> >>>>
> >>>> I just suggested '=' because I wanted to use ':' to chain multiple
> >>>> topics.
> >>>>
> >>>>
> >>>> -Matthias
> >>>>
> >>>> On 2/23/17 10:49 AM, Jorge Esteban Quilcate Otoya wrote:
> >>>>> Yeap, `--topic t1=1,2` LGTM
> >>>>>
> >>>>> I don't have an idea about getting rid of the repeated --topic either,
> >>>>> but --group is also repeated in the case of deletion, so it could be
> >>>>> ok to have repeated --topic arguments.
> >>>>>
> >>>>> On Thu, Feb 23, 2017 at 19:14, Matthias J. Sax (<matthias@confluent.io>)
> >>>>> wrote:
> >>>>>
> >>>>>> So you suggest merging the "scope options" --topics, --topic, and
> >>>>>> --partitions into a single option? Sounds good to me.
> >>>>>>
> >>>>>> I like the compact way to express it, i.e.,
> >>>>>> topicname:list-of-partitions with "all partitions" if no partitions
> >>>>>> are specified. It's quite intuitive to use.
> >>>>>>
> >>>>>> Just wondering if we could get rid of the repeated --topic option;
> >>>>>> it's somewhat verbose. I have no good idea how to improve it, though.
> >>>>>>
> >>>>>> If you concatenate multiple topics, we need one more character that
> >>>>>> is not allowed in topic names to separate the topics:
> >>>>>>
> >>>>>>> invalidChars = {'/', '\\', ',', '\u0000', ':', '"', '\'', ';', '*',
> >>>>>>> '?', ' ', '\t', '\r', '\n', '='};
> >>>>>>
> >>>>>> maybe
> >>>>>>
> >>>>>> --topics t1=1,2,3:t2:t3=3
> >>>>>>
> >>>>>> use '=' to specify partitions (instead of ':' as you proposed) and ':'
> >>>>>> to separate topics? All other characters seem to be worse to use to
> >>>>>> me. But maybe you have a better idea.
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> -Matthias
> >>>>>>
> >>>>>>
> >>>>>> On 2/23/17 3:15 AM, Jorge Esteban Quilcate Otoya wrote:
> >>>>>>> @Matthias about the point 9:
> >>>>>>>
> >>>>>>> What about keeping only the --topic option, and supporting this
> >>>>>>> format:
> >>>>>>>
> >>>>>>> `--topic t1:0,1,2 --topic t2 --topic t3:2`
> >>>>>>>
> >>>>>>> In this case topics t1, t2, and t3 will be selected: topic t1 with
> >>>>>>> partitions 0, 1 and 2; topic t2 with all its partitions; and topic
> >>>>>>> t3 with only partition 2.
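
The `topic:partitions` scope format described above could be parsed roughly as follows. This is only a sketch in Python under stated assumptions: `parse_topic_args` is a hypothetical helper, not part of the actual tool, and representing "all partitions" as `None` is an illustrative choice.

```python
from typing import Dict, List, Optional


def parse_topic_args(values: List[str]) -> Dict[str, Optional[List[int]]]:
    """Parse repeated --topic values such as 't1:0,1,2' or 't2'.

    Returns a mapping of topic name to its list of partitions, or None
    when no partitions were given (interpreted here as "all partitions").
    """
    scope: Dict[str, Optional[List[int]]] = {}
    for value in values:
        topic, separator, partitions = value.partition(":")
        if not topic:
            raise ValueError(f"missing topic name in {value!r}")
        if not separator:
            # No ':' at all, so the whole topic is in scope.
            scope[topic] = None
            continue
        if not partitions:
            raise ValueError(f"missing partition list in {value!r}")
        scope[topic] = [int(p) for p in partitions.split(",")]
    return scope


# The example from the thread: --topic t1:0,1,2 --topic t2 --topic t3:2
print(parse_topic_args(["t1:0,1,2", "t2", "t3:2"]))
```

Because ':' and ',' are not allowed in topic names, splitting on the first ':' is unambiguous for this format.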
> >>>>>>>
> >>>>>>> Jorge.
> >>>>>>>
> >>>>>>> On Tue, Feb 21, 2017 at 11:11, Jorge Esteban Quilcate Otoya
> >>>>>>> (<quilcate.jorge@gmail.com>) wrote:
> >>>>>>>
> >>>>>>>> Thanks for the feedback Matthias.
> >>>>>>>>
> >>>>>>>> * 1. You're right. I'll reorder the scenarios.
> >>>>>>>>
> >>>>>>>> * 2. Agree. I'll update the KIP.
> >>>>>>>>
> >>>>>>>> * 3. I like it, updating to `reset-offsets`
> >>>>>>>>
> >>>>>>>> * 4. Agree, removing the `reset-` part
> >>>>>>>>
> >>>>>>>> * 5. Yes, the 1.e option without --execute or --export will print out
> >>>>>>>> the current offset and the new offset, which will be the same. The
> >>>>>>>> use-case of this option is mostly to use it in combination with
> >>>>>>>> --export and have a current 'checkpoint' to reset to later. I will
> >>>>>>>> add to the KIP what the output should look like.
> >>>>>>>>
> >>>>>>>> * 6. Considering 4., I will update it to `--to-offset`
> >>>>>>>>
> >>>>>>>> * 7. I like the idea to unify these options (plus, minus).
> >>>>>>>> `shift-offsets-by` is a good option, but I would like some more
> >>>>>>>> feedback here about the name. I will update the KIP in the meantime.
> >>>>>>>>
> >>>>>>>> * 8. Yes, discussed in 9.
> >>>>>>>>
> >>>>>>>> * 9. Agree. I'd love some feedback here. `topic` is already used by
> >>>>>>>> `delete`, and we can add `--all-topics` to consider all
> >>>>>>>> topics/partitions assigned to a group. How could we define specific
> >>>>>>>> topics/partitions?
> >>>>>>>>
> >>>>>>>> * 10. Haven't thought about it, but it makes sense.
> >>>>>>>> <topic>,<partition>,<offset> would be enough.
> >>>>>>>>
> >>>>>>>> * 11. Agree. Solved with 10.
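
As a rough illustration of the `<topic>,<partition>,<offset>` CSV idea from point 10, the following Python sketch reads such a file body into a lookup table. The function name `read_reset_plan` and the dictionary shape are assumptions made for this example only; the KIP defines the real behaviour.

```python
import csv
import io
from typing import Dict, Tuple


def read_reset_plan(text: str) -> Dict[Tuple[str, int], int]:
    """Read 'topic,partition,offset' rows (no header line) into a dict
    keyed by (topic, partition) with the target offset as the value."""
    plan: Dict[Tuple[str, int], int] = {}
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # csv.reader yields an empty list for blank lines
        if len(row) != 3:
            raise ValueError(f"expected topic,partition,offset but got {row!r}")
        topic, partition, offset = (field.strip() for field in row)
        plan[(topic, int(partition))] = int(offset)
    return plan


print(read_reset_plan("t1,0,42\nt1,1,17\nt2,0,5\n"))
```

With an implicit schema there is no "group_id" column, so the group would come only from `--group`, which also sidesteps the question raised in point 11.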
> >>>>>>>>
> >>>>>>>> Also, I have a couple of changes to mention:
> >>>>>>>>
> >>>>>>>> 1. I have added a reference to the branch where I'm working on this
> >>>>>>>> KIP.
> >>>>>>>>
> >>>>>>>> 2. About the period scenario `--to-period`. I will change it to
> >>>>>>>> `--to-duration` given that duration (
> >>>>>>>> https://docs.oracle.com/javase/8/docs/api/java/time/Duration.html
> >>>>>>>> ) follows this format: 'PnDTnHnMnS' and does not consider daylight
> >>>>>>>> saving effects.
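
To make the 'PnDTnHnMnS' format concrete, here is a simplified Python sketch that parses such a string into a fixed-length `timedelta` and subtracts it from a reference time. It is an approximation for illustration, not a port of `java.time.Duration`: `parse_duration` is a hypothetical helper that accepts only non-negative whole numbers and ignores fractional seconds and signs.

```python
import re
from datetime import datetime, timedelta, timezone

# Simplified pattern for 'PnDTnHnMnS' (whole, non-negative numbers only).
_DURATION = re.compile(
    r"^P(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?)?$"
)


def parse_duration(text: str) -> timedelta:
    """Parse a 'PnDTnHnMnS' string into a timedelta.

    A day is always exactly 24 hours here, so the result does not depend
    on daylight saving transitions.
    """
    match = _DURATION.match(text)
    parts = {k: int(v) for k, v in match.groupdict().items() if v is not None} if match else {}
    if not parts or text.endswith("T"):
        raise ValueError(f"not a valid duration: {text!r}")
    return timedelta(**parts)


# Target timestamp "two days before now", as a --to-duration P2D style reset
# might compute it (the reset semantics are an assumption of this sketch).
target = datetime.now(timezone.utc) - parse_duration("P2D")
print(target.isoformat())
```

The resulting timestamp would then be used to look up the offsets to reset to, in the same way as the datetime scenario.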
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Tue, Feb 21, 2017 at 2:47, Matthias J. Sax (<matthias@confluent.io>)
> >>>>>>>> wrote:
> >>>>>>>>
> >>>>>>>> Hi,
> >>>>>>>>
> >>>>>>>> thanks for updating the KIP. Couple of follow up comments:
> >>>>>>>>
> >>>>>>>> * Nit: Why are "Reset to Earliest" and "Reset to Latest" "reset by
> >>>>>>>> time" options -- IMHO they belong to "reset by position"?
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> * Nit: Description of "Reset to Earliest"
> >>>>>>>>
> >>>>>>>>> using Kafka Consumer's `auto.offset.reset` to `earliest`
> >>>>>>>>
> >>>>>>>> I think this is strictly speaking not correct (as auto.offset.reset
> >>>>>>>> is only triggered if no valid offset is found, but this tool
> >>>>>>>> explicitly modifies the committed offset), and should be phrased as
> >>>>>>>>
> >>>>>>>>> using Kafka Consumer's #seekToBeginning()
> >>>>>>>>
> >>>>>>>> -> similar issue for description of "Reset to Latest"
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> * Main option: rename to --reset-offsets (plural instead of singular)
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> * Scenario Options: I would remove "reset" from all options, because
> >>>>>>>> the main argument "--reset-offset" already says what to do:
> >>>>>>>>
> >>>>>>>>> bin/kafka-consumer-groups.sh --reset-offset --reset-to-datetime XXX
> >>>>>>>>
> >>>>>>>> better (IMHO):
> >>>>>>>>
> >>>>>>>>> bin/kafka-consumer-groups.sh --reset-offsets --to-datetime XXX
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> * Option 1.e ("print and export current offset") is not intuitive to
> >>>>>>>> use IMHO. The main option is "--reset-offset" but nothing happens if
> >>>>>>>> no scenario is specified. It is also not specified what the output
> >>>>>>>> should look like.
> >>>>>>>>
> >>>>>>>> Furthermore, --describe should actually show the currently committed
> >>>>>>>> offset for a group. So it seems to be redundant to have the same
> >>>>>>>> option in --reset-offsets
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> * Option 2.a: I would rename to "--reset-to-offset" (or, considering
> >>>>>>>> the comment above, to "--to-offset")
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> * Option 2.b and 2.c: I would unify to "--shift-offsets-by" (or
> >>>>>>>> similar) and accept positive/negative values
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> * About Scope "all": maybe it's better to have an option
> >>>>>>>> "--all-topics" (or similar). IMHO explicit arguments are preferable
> >>>>>>>> over implicit settings to guard against accidental misuse of the
> >>>>>>>> tool.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> * Scope: I also think that "--topic" (singular) and "--topics"
> >>>>>>>> (plural) are too similar and easy to use in a wrong way (i.e., mix
> >>>>>>>> up) -- maybe we can have two options that are easier to distinguish.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> * I still think that JSON is not the best format (it's too
> >>>>>>>> verbose/hard to write for humans from scratch). A simple CSV format
> >>>>>>>> with implicit schema (topic,partition,offset) would be sufficient.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> * Why does the JSON contain a "group_id" field -- there is a
> >>>>>>>> parameter "--group" to specify the group ID. Would one overwrite the
> >>>>>>>> other (in what order) or would there be an error if "--group" is
> >>>>>>>> used in combination with "--reset-from-file"?
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> -Matthias
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On 2/17/17 6:43 AM, Jorge Esteban Quilcate Otoya wrote:
> >>>>>>>>> Hi,
> >>>>>>>>>
> >>>>>>>>> according to the feedback, I've updated the KIP:
> >>>>>>>>>
> >>>>>>>>> - We have added and ordered the scenarios, scopes and executions of
> >>>>>>>>> the Reset Offset tool.
> >>>>>>>>> - Consider it as an extension to the current `ConsumerGroupCommand`
> >>>>>>>>> tool
> >>>>>>>>> - Execution will be possible without generating JSON files.
> >>>>>>>>> - Execution will be possible without generating JSON files.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-122%3A+Add+Reset+Consumer+Group+Offsets+tooling
> >>>>>>>>>
> >>>>>>>>> Looking forward to your feedback!
> >>>>>>>>>
> >>>>>>>>> Jorge.
> >>>>>>>>>
> >>>>>>>>> On Wed, Feb 8, 2017 at 23:23, Jorge Esteban Quilcate Otoya
> >>>>>>>>> (<quilcate.jorge@gmail.com>) wrote:
> >>>>>>>>>
> >>>>>>>>>> Great. I think I got the idea. What about these options:
> >>>>>>>>>>
> >>>>>>>>>> Scenarios:
> >>>>>>>>>>
> >>>>>>>>>> 1. Current status
> >>>>>>>>>>
> >>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1´
> >>>>>>>>>>
> >>>>>>>>>> 2. To Datetime
> >>>>>>>>>>
> >>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1
> >>>>>>>>>> --reset-to-datetime 2017-01-01T00:00:00.000´
> >>>>>>>>>>
> >>>>>>>>>> 3. To Period
> >>>>>>>>>>
> >>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-period P2D´
> >>>>>>>>>>
> >>>>>>>>>> 4. To Earliest
> >>>>>>>>>>
> >>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-earliest´
> >>>>>>>>>>
> >>>>>>>>>> 5. To Latest
> >>>>>>>>>>
> >>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-latest´
> >>>>>>>>>>
> >>>>>>>>>> 6. Minus 'n' offsets
> >>>>>>>>>>
> >>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-minus n´
> >>>>>>>>>>
> >>>>>>>>>> 7. Plus 'n' offsets
> >>>>>>>>>>
> >>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-plus n´
> >>>>>>>>>>
> >>>>>>>>>> 8. To specific offset
> >>>>>>>>>>
> >>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to x´
> >>>>>>>>>>
> >>>>>>>>>> Scopes:
> >>>>>>>>>>
> >>>>>>>>>> a. All topics used by Consumer Group
> >>>>>>>>>>
> >>>>>>>>>> Don't specify --topics
> >>>>>>>>>>
> >>>>>>>>>> b. Specific List of Topics
> >>>>>>>>>>
> >>>>>>>>>> Add list of values in --topics t1,t2,tn
> >>>>>>>>>>
> >>>>>>>>>> c. One Topic, all Partitions
> >>>>>>>>>>
> >>>>>>>>>> Add one topic and no partitions values: --topic t1
> >>>>>>>>>>
> >>>>>>>>>> d. One Topic, List of Partitions
> >>>>>>>>>>
> >>>>>>>>>> Add one topic and partitions values: --topic t1 --partitions 0,1,2
> >>>>>>>>>>
> >>>>>>>>>> About Reset Plan (JSON file):
> >>>>>>>>>>
> >>>>>>>>>> I think it is still valid to have the option to persist the reset
> >>>>>>>>>> configuration as a file, but I agree to give the option to run the
> >>>>>>>>>> tool without going down to the JSON file.
> >>>>>>>>>>
> >>>>>>>>>> Execution options:
> >>>>>>>>>>
> >>>>>>>>>> 1. Without execution argument (No args):
> >>>>>>>>>>
> >>>>>>>>>> Print out results (reset plan)
> >>>>>>>>>>
> >>>>>>>>>> 2. With --execute argument:
> >>>>>>>>>>
> >>>>>>>>>> Run reset process
> >>>>>>>>>>
> >>>>>>>>>> 3. With --output argument:
> >>>>>>>>>>
> >>>>>>>>>> Save result in a JSON format.
> >>>>>>>>>>
> >>>>>>>>>> 4. Only with --execute option and --reset-file (path to JSON)
> >>>>>>>>>>
> >>>>>>>>>> Reset based on file
> >>>>>>>>>>
> >>>>>>>>>> 5. Only with --verify option and --reset-file (path to JSON)
> >>>>>>>>>>
> >>>>>>>>>> Verify file values with current offsets
> >>>>>>>>>>
> >>>>>>>>>> I think we can remove --generate-and-execute because it is a bit
> >>>>>>>>>> clumsy.
> >>>>>>>>>>
> >>>>>>>>>> With these options we will be able to execute with manual JSON
> >>>>>>>>>> configuration.
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> On Wed, Feb 8, 2017 at 22:43, Ben Stopford (<ben@confluent.io>)
> >>>>>>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>> Yes - using a tool like this to skip a set of consumer groups over a
> >>>>>>>>>> corrupt/bad message is definitely appealing.
> >>>>>>>>>>
> >>>>>>>>>> B
> >>>>>>>>>>
> >>>>>>>>>> On Wed, Feb 8, 2017 at 9:37 PM Gwen Shapira <gw...@confluent.io>
> >>>>>>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> I like the --reset-to-earliest and --reset-to-latest. In general,
> >>>>>>>>>>> since the JSON route is the most challenging for users, we want to
> >>>>>>>>>>> provide a lot of ways to do useful things without going there.
> >>>>>>>>>>>
> >>>>>>>>>>> Two things that can help:
> >>>>>>>>>>>
> >>>>>>>>>>> 1. A lot of times, users want to skip a few messages that cause
> >>>>>>>>>>> issues and continue. Maybe just specifying the topic, partition and
> >>>>>>>>>>> delta will be better than having to find the offset and write a JSON
> >>>>>>>>>>> and validate the JSON etc.
> >>>>>>>>>>>
> >>>>>>>>>>> 2. Thinking if there are other common use-cases that we can make
> >>>>>>>>>>> easy rather than just one generic but not very usable method.
> >>>>>>>>>>>
> >>>>>>>>>>> Gwen
> >>>>>>>>>>>
> >>>>>>>>>>> On Wed, Feb 8, 2017 at 3:25 AM, Jorge Esteban Quilcate Otoya
> >>>>>>>>>>> <qu...@gmail.com> wrote:
> >>>>>>>>>>>> Thanks for the feedback!
> >>>>>>>>>>>>
> >>>>>>>>>>>> @Onur, @Gwen:
> >>>>>>>>>>>>
> >>>>>>>>>>>> Agree. Actually, in the first draft I considered having it inside
> >>>>>>>>>>>> ´kafka-consumer-groups.sh´, but I decided to propose it as a
> >>>>>>>>>>>> standalone tool to describe it clearly and focus it on reset
> >>>>>>>>>>>> functionality.
> >>>>>>>>>>>>
> >>>>>>>>>>>> But now that you mention it, it does make sense to have it in
> >>>>>>>>>>>> ´kafka-consumer-groups.sh´. What would be a consistent way to
> >>>>>>>>>>>> introduce it?
> >>>>>>>>>>>>
> >>>>>>>>>>>> Maybe something like this:
> >>>>>>>>>>>>
> >>>>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --generate --group cg1
> >>>>>>>>>>>> --topics t1 --reset-from 2017-01-01T00:00:00.000 --output plan.json´
> >>>>>>>>>>>>
> >>>>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --verify --reset-json-file
> >>>>>>>>>>>> plan.json´
> >>>>>>>>>>>>
> >>>>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --execute --reset-json-file
> >>>>>>>>>>>> plan.json´
> >>>>>>>>>>>>
> >>>>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --generate-and-execute
> >>>>>>>>>>>> --group cg1 --topics t1 --reset-from 2017-01-01T00:00:00.000´
> >>>>>>>>>>>>
> >>>>>>>>>>>> @Gwen:
> >>>>>>>>>>>>
> >>>>>>>>>>>>> It looks exactly like the replica assignment tool
> >>>>>>>>>>>>
> >>>>>>>>>>>> It was influenced by it ;-) I use the generate-verify-execute
> >>>>>>>>>>>> process here to make sure the user is aware of the result of this
> >>>>>>>>>>>> operation. At the beginning we considered adding only a couple of
> >>>>>>>>>>>> options to the Consumer Group Command:
> >>>>>>>>>>>>
> >>>>>>>>>>>> --rewind-to-timestamp and --rewind-to-period
> >>>>>>>>>>>>
> >>>>>>>>>>>> @Onur:
> >>>>>>>>>>>>
> >>>>>>>>>>>>> You can actually get away with overriding while members of the
> >>>>>>>>>>>>> group are live with method 2 by using group information from
> >>>>>>>>>>>>> DescribeGroupsRequest.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Does this mean that we need to have the Consumer Group stopped
> >>>>>>>>>>>> before executing, and start a new consumer internally to do this?
> >>>>>>>>>>>> Therefore, we won't be able to consider executing a reset while the
> >>>>>>>>>>>> ConsumerGroup is active? (Trying to relate it to @Dong's 5th
> >>>>>>>>>>>> question.)
> >>>>>>>>>>>>
> >>>>>>>>>>>> @Dong:
> >>>>>>>>>>>>
> >>>>>>>>>>>>> Should we allow user to use wildcard to reset offset of all
> >>>>>>>>>>>>> groups for a given topic as well?
> >>>>>>>>>>>>
> >>>>>>>>>>>> I haven't thought about this scenario. It could be interesting.
> >>>>>>>>>>>> Following the recommendation to add it to the Consumer Group
> >>>>>>>>>>>> Command, in this case the Group argument would be optional if there
> >>>>>>>>>>>> is only one topic. I think for multiple topics it won't be that
> >>>>>>>>>>>> useful.
> >>>>>>>>>>>>
> >>>>>>>>>>>>> Should we allow user to specify timestamp per topic partition in
> >>>>>>>>>>>>> the json file as well?
> >>>>>>>>>>>>
> >>>>>>>>>>>> I don't think this would be a valid option in the tool, but if a
> >>>>>>>>>>>> Reset Plan is generated and the user wants to set the offset for a
> >>>>>>>>>>>> specific partition to another offset (possibly based on another
> >>>>>>>>>>>> timestamp) and execute it, that will be up to her/him.
> >>>>>>>>>>>>
> >>>>>>>>>>>>> Should the script take some credential file to make sure that
> >>>>>>>>>>>>> this operation is authenticated given the potential impact of
> >>>>>>>>>>>>> this operation?
> >>>>>>>>>>>>
> >>>>>>>>>>>> I haven't tried it against secured brokers yet, but the tool should
> >>>>>>>>>>>> support authorization if it's enabled in the broker.
> >>>>>>>>>>>>
> >>>>>>>>>>>>> Should we provide constant to reset committed offset to
> >>>>>>>>>>>>> earliest/latest offset of a partition, e.g. -1 indicates earliest
> >>>>>>>>>>>>> offset and -2 indicates latest offset.
> >>>>>>>>>>>>
> >>>>>>>>>>>> I will go for something like ´--reset-to-earliest´ and
> >>>>>>>>>>>> ´--reset-to-latest´
> >>>>>>>>>>>>
> >>>>>>>>>>>>> Should we allow dynamic change of the committed offset when
> >>>>>>>>>>>>> consumers are running, such that consumers will seek to the newly
> >>>>>>>>>>>>> committed offset and start consuming from there?
> >>>>>>>>>>>>
> >>>>>>>>>>>> Not sure about this. I would recommend keeping it simple and asking
> >>>>>>>>>>>> the user to stop consumers first. But I would consider it if the
> >>>>>>>>>>>> trade-offs are clear.
> >>>>>>>>>>>>
> >>>>>>>>>>>> @Matthias
> >>>>>>>>>>>>
> >>>>>>>>>>>> Added :). And thanks a lot for your help to define this KIP!
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Wed, Feb 8, 2017 at 7:47, Gwen Shapira (<gwen@confluent.io>) wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>>> As long as the CLI is a bit consistent? Like, not just adding 3
> >>>>>>>>>>>>> arguments and a JSON parser to the existing tool, right?
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Tue, Feb 7, 2017 at 10:29 PM, Onur Karaman
> >>>>>>>>>>>>> <on...@gmail.com> wrote:
> >>>>>>>>>>>>>> I think it makes sense to just add the feature to
> >>>>>>>>>>>>> kafka-consumer-groups.sh
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> On Tue, Feb 7, 2017 at 10:24 PM, Gwen Shapira <gwen@confluent.io> wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Thanks for the KIP. I'm super happy about adding the capability.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> I hate the interface, though. It looks exactly like the replica
> >>>>>>>>>>>>>>> assignment tool. A tool everyone loves so much that there are
> >>>>>>>>>>>>>>> multiple projects, open and closed, that try to fix it.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Can we swap it with something that looks a bit more like the
> >>>>>>>>>>>>>>> consumer group tool? or the kafka streams reset tool? Consistency
> >>>>>>>>>>>>>>> is helpful in such cases. I spent some time learning existing
> >>>>>>>>>>>>>>> tools and learning yet another one is a deterrent.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Gwen
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On Tue, Feb 7, 2017 at 6:43 PM, Jorge Esteban Quilcate Otoya
> >>>>>>>>>>>>>>> <qu...@gmail.com> wrote:
> >>>>>>>>>>>>>>>> Hi all,
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> I would like to propose a KIP to Add a tool to Reset Consumer
> >>>>>>>>>>>>>>>> Group Offsets.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-122%3A+Add+a+tool+to+Reset+Consumer+Group+Offsets
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Please, take a look at the proposal and share your feedback.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>>>>> Jorge.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> --
> >>>>>>>>>>>>>>> Gwen Shapira
> >>>>>>>>>>>>>>> Product Manager | Confluent
> >>>>>>>>>>>>>>> 650.450.2760 | @gwenshap
> >>>>>>>>>>>>>>> Follow us: Twitter | blog
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> --
> >>>>>>>>>>>>> Gwen Shapira
> >>>>>>>>>>>>> Product Manager | Confluent
> >>>>>>>>>>>>> 650.450.2760 | @gwenshap
> >>>>>>>>>>>>> Follow us: Twitter | blog
> >>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> --
> >>>>>>>>>>> Gwen Shapira
> >>>>>>>>>>> Product Manager | Confluent
> >>>>>>>>>>> 650.450.2760 | @gwenshap
> >>>>>>>>>>> Follow us: Twitter | blog

Re: KIP-122: Add a tool to Reset Consumer Group Offsets

Posted by "Matthias J. Sax" <ma...@confluent.io>.
I like this!

--by-duration and --shift-by


-Matthias

On 2/24/17 12:57 AM, Jorge Esteban Quilcate Otoya wrote:
> Renaming to --by-duration LGTM
> 
> Not sure about changing it to --shift-by-duration because we could end up
> with the same redundancy as before with reset: --reset-offsets
> --reset-to-*.
> 
> Maybe changing --shift-offset-by to --shift-by 'n' could make it consistent
> enough?
> 
> 
> On Fri, Feb 24, 2017 at 6:39, Matthias J. Sax (<ma...@confluent.io>) wrote:
> 
>> I just read the updated KIP once more.
>>
>> I would suggest renaming --to-duration to --by-duration
>>
>> Or, as a second idea, rename --to-duration to --shift-by-duration and at
>> the same time rename --shift-offset-by to --shift-by-offset
>>
>> Not sure what the best option is, but naming would be more consistent IMHO.
>>
>>
>>
>> -Matthias
>>
>> On 2/23/17 4:42 PM, Jorge Esteban Quilcate Otoya wrote:
>>> Hi All,
>>>
>>> If there are no more concerns, I'd like to start the vote for this KIP.
>>>
>>> Thanks!
>>> Jorge.
>>>
>>> On Thu, Feb 23, 2017 at 22:50, Jorge Esteban Quilcate Otoya (<quilcate.jorge@gmail.com>) wrote:
>>>
>>>> Oh ok :)
>>>>
>>>> So, we can keep `--topic t1:1,2,3`
>>>>
>>>> I think with this one we have most of the feedback applied. I will
>>>> update the KIP with this change.
>>>>
>>>> On Thu, Feb 23, 2017 at 22:38, Matthias J. Sax (<matthias@confluent.io>) wrote:
>>>>
>>>> Sounds reasonable.
>>>>
>>>> If we have multiple --topic arguments, it also does not matter if we use
>>>> t1:1,2 or t2=1,2
>>>>
>>>> I just suggested '=' because I wanted to use ':' to chain multiple topics.
>>>>
>>>>
>>>> -Matthias
>>>>
>>>> On 2/23/17 10:49 AM, Jorge Esteban Quilcate Otoya wrote:
>>>>> Yep, `--topic t1=1,2` LGTM
>>>>>
>>>>> I don't have an idea either about getting rid of the repeated --topic,
>>>>> but --group is also repeated in the case of deletion, so it could be ok
>>>>> to have repeated --topic arguments.
>>>>>
>>>>> On Thu, Feb 23, 2017 at 19:14, Matthias J. Sax (<matthias@confluent.io>) wrote:
>>>>>
>>>>>> So you suggest merging the "scope options" --topics, --topic, and
>>>>>> --partitions into a single option? Sounds good to me.
>>>>>>
>>>>>> I like the compact way to express it, ie, topicname:list-of-partitions
>>>>>> with "all partitions" if no partitions are specified. It's quite
>>>>>> intuitive to use.
>>>>>>
>>>>>> Just wondering if we could get rid of the repeated --topic option; it's
>>>>>> somewhat verbose. I have no good idea how to improve it, though.
>>>>>>
>>>>>> If you concatenate multiple topics, we need one more character that is
>>>>>> not allowed in topic names to separate the topics:
>>>>>>
>>>>>>> invalidChars = {'/', '\\', ',', '\u0000', ':', '"', '\'', ';', '*', '?', ' ', '\t', '\r', '\n', '='};
>>>>>>
>>>>>> maybe
>>>>>>
>>>>>> --topics t1=1,2,3:t2:t3=3
>>>>>>
>>>>>> use '=' to specify partitions (instead of ':' as you proposed) and ':'
>>>>>> to separate topics? All other characters seem to be worse to use to me.
>>>>>> But maybe you have a better idea.
>>>>>>
>>>>>>
>>>>>>
>>>>>> -Matthias
>>>>>>
>>>>>>
>>>>>> On 2/23/17 3:15 AM, Jorge Esteban Quilcate Otoya wrote:
>>>>>>> @Matthias about the point 9:
>>>>>>>
>>>>>>> What about keeping only the --topic option, and support this format:
>>>>>>>
>>>>>>> `--topic t1:0,1,2 --topic t2 --topic t3:2`
>>>>>>>
>>>>>>> In this case topics t1, t2, and t3 will be selected: topic t1 with
>>>>>>> partitions 0, 1 and 2; topic t2 with all its partitions; and topic t3
>>>>>>> with only partition 2.
>>>>>>>
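To illustrate the `--topic` format proposed above, here is a minimal Python sketch of how repeated `--topic` values could be turned into a topic/partition scope. The function name `parse_topic_args` and the use of `None` for "all partitions" are assumptions made for this example only; they are not part of the KIP or of Kafka's code.

```python
from typing import Dict, List, Optional


def parse_topic_args(values: List[str]) -> Dict[str, Optional[List[int]]]:
    """Map each topic to its partition list, or to None for all partitions."""
    scope: Dict[str, Optional[List[int]]] = {}
    for value in values:
        # ':' is listed in this thread as invalid in topic names, so the
        # first ':' can separate the topic from its partition list.
        topic, sep, partitions = value.partition(":")
        if not topic:
            raise ValueError(f"missing topic name in {value!r}")
        if not sep:
            scope[topic] = None  # no partition list given: all partitions
        else:
            scope[topic] = [int(p) for p in partitions.split(",")]
    return scope
```

With the values from the example above, `parse_topic_args(["t1:0,1,2", "t2", "t3:2"])` selects partitions 0, 1 and 2 of t1, all partitions of t2, and partition 2 of t3.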
>>>>>>> Jorge.
>>>>>>>
>>>>>>> On Tue, Feb 21, 2017 at 11:11, Jorge Esteban Quilcate Otoya (<quilcate.jorge@gmail.com>) wrote:
>>>>>>>
>>>>>>>> Thanks for the feedback Matthias.
>>>>>>>>
>>>>>>>> * 1. You're right. I'll reorder the scenarios.
>>>>>>>>
>>>>>>>> * 2. Agree. I'll update the KIP.
>>>>>>>>
>>>>>>>> * 3. I like it, updating to `reset-offsets`
>>>>>>>>
>>>>>>>> * 4. Agree, removing the `reset-` part
>>>>>>>>
>>>>>>>> * 5. Yes, the 1.e option without --execute or --export will print out
>>>>>>>> the current offset and the new offset, which will be the same. The
>>>>>>>> use case for this option is mostly to combine it with --export and
>>>>>>>> have a current 'checkpoint' to reset to later. I will add to the KIP
>>>>>>>> what the output should look like.
>>>>>>>>
>>>>>>>> * 6. Considering 4., I will update it to `--to-offset`
>>>>>>>>
>>>>>>>> * 7. I like the idea to unify these options (plus, minus).
>>>>>>>> `shift-offsets-by` is a good option, but I would like some more
>>>>>>>> feedback here about the name. I will update the KIP in the meantime.
>>>>>>>>
>>>>>>>> * 8. Yes, discussed in 9.
>>>>>>>>
>>>>>>>> * 9. Agree. I'd love some feedback here. `topic` is already used by
>>>>>>>> `delete`, and we can add `--all-topics` to consider all
>>>>>>>> topics/partitions assigned to a group. How could we define specific
>>>>>>>> topics/partitions?
>>>>>>>>
>>>>>>>> * 10. Haven't thought about it, but it makes sense.
>>>>>>>> <topic>,<partition>,<offset> would be enough.
>>>>>>>>
>>>>>>>> * 11. Agree. Solved with 10.
>>>>>>>>
>>>>>>>> Also, I have a couple of changes to mention:
>>>>>>>>
>>>>>>>> 1. I have added a reference to the branch where I'm working on this
>>>>>>>> KIP.
>>>>>>>>
>>>>>>>> 2. About the period scenario `--to-period`. I will change it to
>>>>>>>> `--to-duration` given that duration (
>>>>>>>> https://docs.oracle.com/javase/8/docs/api/java/time/Duration.html)
>>>>>>>> follows this format: 'PnDTnHnMnS' and does not consider daylight
>>>>>>>> saving effects.
>>>>>>>>
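As a rough illustration of the 'PnDTnHnMnS' format mentioned above, the following Python sketch parses such a string and derives a reset point from it. This is not the tool's implementation (which would presumably rely on `java.time.Duration`); the helper names are hypothetical, only whole-number components are handled, and the idea that the reset point is "now minus the duration" is an assumption about the option's semantics.

```python
import re
from datetime import datetime, timedelta

# Days plus an optional time part, mirroring the 'PnDTnHnMnS' shape.
_DURATION = re.compile(
    r"^P(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?)?$"
)


def parse_duration(text: str) -> timedelta:
    """Parse a 'PnDTnHnMnS' string such as 'P2D' or 'PT1H30M'."""
    match = _DURATION.match(text)
    parts = {k: int(v) for k, v in match.groupdict().items() if v} if match else {}
    # Reject strings with no components ('P', 'PT') or a dangling 'T' ('P1DT').
    if not parts or text.endswith("T"):
        raise ValueError(f"not a PnDTnHnMnS duration: {text!r}")
    return timedelta(**parts)


def reset_timestamp(now: datetime, duration: str) -> datetime:
    """Assumed semantics: the reset point is 'now' minus the duration."""
    return now - parse_duration(duration)
```

Because the duration is a fixed amount of time, 'P2D' always means exactly 48 hours here, regardless of any daylight saving change in between.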
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, Feb 21, 2017 at 2:47, Matthias J. Sax (<matthias@confluent.io>) wrote:
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> thanks for updating the KIP. Couple of follow up comments:
>>>>>>>>
>>>>>>>> * Nit: Why is "Reset to Earliest" and "Reset to Latest" a "reset by
>>>>>>>> time" option -- IMHO it belongs to "reset by position"?
>>>>>>>>
>>>>>>>>
>>>>>>>> * Nit: Description of "Reset to Earliest"
>>>>>>>>
>>>>>>>>> using Kafka Consumer's `auto.offset.reset` to `earliest`
>>>>>>>>
>>>>>>>> I think this is strictly speaking not correct (as auto.offset.reset is
>>>>>>>> only triggered if no valid offset is found, but this tool explicitly
>>>>>>>> modifies the committed offset), and should be phrased as
>>>>>>>>
>>>>>>>>> using Kafka Consumer's #seekToBeginning()
>>>>>>>>
>>>>>>>> -> similar issue for description of "Reset to Latest"
>>>>>>>>
>>>>>>>>
>>>>>>>> * Main option: rename to --reset-offsets (plural instead of
>> singular)
>>>>>>>>
>>>>>>>>
>>>>>>>> * Scenario Options: I would remove "reset" from all options, because
>>>> the
>>>>>>>> main argument "--reset-offset" says already what to do:
>>>>>>>>
>>>>>>>>> bin/kafka-consumer-groups.sh --reset-offset --reset-to-datetime XXX
>>>>>>>>
>>>>>>>> better (IMHO):
>>>>>>>>
>>>>>>>>> bin/kafka-consumer-groups.sh --reset-offsets --to-datetime XXX
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> * Option 1.e ("print and export current offset") is not intuitive to
>>>>>>>> use IMHO. The main option is "--reset-offset" but nothing happens if no
>>>>>>>> scenario is specified. It is also not specified what the output should
>>>>>>>> look like.
>>>>>>>>
>>>>>>>> Furthermore, --describe should actually show currently committed offset
>>>>>>>> for a group. So it seems to be redundant to have the same option in
>>>>>>>> --reset-offsets
>>>>>>>>
>>>>>>>>
>>>>>>>> * Option 2.a: I would rename to "--reset-to-offset" (or considering the
>>>>>>>> comment above to "--to-offset")
>>>>>>>>
>>>>>>>>
>>>>>>>> * Option 2.b and 2.c: I would unify to "--shift-offsets-by" (or similar)
>>>>>>>> and accept positive/negative values
>>>>>>>>
>>>>>>>>
>>>>>>>> * About Scope "all": maybe it's better to have an option "--all-topics"
>>>>>>>> (or similar). IMHO explicit arguments are preferable over implicit
>>>>>>>> setting to guard against accidental misuse of the tool.
>>>>>>>>
>>>>>>>>
>>>>>>>> * Scope: I also think that "--topic" (singular) and "--topics" (plural)
>>>>>>>> are too similar and easy to use in a wrong way (ie, mix up) -- maybe we
>>>>>>>> can have two options that are easier to distinguish.
>>>>>>>>
>>>>>>>>
>>>>>>>> * I still think that JSON is not the best format (it's too verbose/hard
>>>>>>>> to write for humans from scratch). A simple CSV format with implicit
>>>>>>>> schema (topic,partition,offset) would be sufficient.
>>>>>>>>
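To make the CSV suggestion above concrete, here is a minimal Python sketch that reads a reset plan with the implicit schema topic,partition,offset. The function name `parse_reset_plan` and the choice of a dictionary keyed by (topic, partition) are assumptions for illustration; the thread does not define how the tool would represent the plan internally.

```python
import csv
import io
from typing import Dict, Tuple


def parse_reset_plan(text: str) -> Dict[Tuple[str, int], int]:
    """Read 'topic,partition,offset' rows into {(topic, partition): offset}."""
    plan: Dict[Tuple[str, int], int] = {}
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # skip blank lines
        if len(row) != 3:
            raise ValueError(f"expected topic,partition,offset but got {row!r}")
        topic, partition, offset = (field.strip() for field in row)
        plan[(topic, int(partition))] = int(offset)
    return plan
```

A three-line plan such as "t1,0,42", "t1,1,17", "t2,0,0" is short enough to write by hand, which is the point made above about CSV versus JSON.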
>>>>>>>>
>>>>>>>> * Why does the JSON contain "group_id" field -- there is parameter
>>>>>>>> "--group" to specify the group ID. Would one overwrite the other (in
>>>>>>>> what order) or would there be an error if "--group" is used in
>>>>>>>> combination with "--reset-from-file"?
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> -Matthias
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2/17/17 6:43 AM, Jorge Esteban Quilcate Otoya wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> according to the feedback, I've updated the KIP:
>>>>>>>>>
>>>>>>>>> - We have added and ordered the scenarios, scopes and executions of the
>>>>>>>>> Reset Offset tool.
>>>>>>>>> - Consider it as an extension to the current `ConsumerGroupCommand` tool
>>>>>>>>> - Execution will be possible without generating JSON files.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-122%3A+Add+Reset+Consumer+Group+Offsets+tooling
>>>>>>>>>
>>>>>>>>> Looking forward to your feedback!
>>>>>>>>>
>>>>>>>>> Jorge.
>>>>>>>>>
>>>>>>>>> On Wed, Feb 8, 2017 at 23:23, Jorge Esteban Quilcate Otoya (<quilcate.jorge@gmail.com>) wrote:
>>>>>>>>>
>>>>>>>>>> Great. I think I got the idea. What about these options:
>>>>>>>>>>
>>>>>>>>>> Scenarios:
>>>>>>>>>>
>>>>>>>>>> 1. Current status
>>>>>>>>>>
>>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1´
>>>>>>>>>>
>>>>>>>>>> 2. To Datetime
>>>>>>>>>>
>>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1
>>>>>>>>>> --reset-to-datetime 2017-01-01T00:00:00.000´
>>>>>>>>>>
>>>>>>>>>> 3. To Period
>>>>>>>>>>
>>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-period P2D´
>>>>>>>>>>
>>>>>>>>>> 4. To Earliest
>>>>>>>>>>
>>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-earliest´
>>>>>>>>>>
>>>>>>>>>> 5. To Latest
>>>>>>>>>>
>>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-latest´
>>>>>>>>>>
>>>>>>>>>> 6. Minus 'n' offsets
>>>>>>>>>>
>>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-minus n´
>>>>>>>>>>
>>>>>>>>>> 7. Plus 'n' offsets
>>>>>>>>>>
>>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-plus n´
>>>>>>>>>>
>>>>>>>>>> 8. To specific offset
>>>>>>>>>>
>>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to x´
>>>>>>>>>>
>>>>>>>>>> Scopes:
>>>>>>>>>>
>>>>>>>>>> a. All topics used by Consumer Group
>>>>>>>>>>
>>>>>>>>>> Don't specify --topics
>>>>>>>>>>
>>>>>>>>>> b. Specific List of Topics
>>>>>>>>>>
>>>>>>>>>> Add list of values in --topics t1,t2,tn
>>>>>>>>>>
>>>>>>>>>> c. One Topic, all Partitions
>>>>>>>>>>
>>>>>>>>>> Add one topic and no partitions values: --topic t1
>>>>>>>>>>
>>>>>>>>>> d. One Topic, List of Partitions
>>>>>>>>>>
>>>>>>>>>> Add one topic and partitions values: --topic t1 --partitions 0,1,2
>>>>>>>>>>
>>>>>>>>>> About Reset Plan (JSON file):
>>>>>>>>>>
>>>>>>>>>> I think it is still valid to have the option to persist the reset
>>>>>>>>>> configuration as a file, but I agree to give the option to run the tool
>>>>>>>>>> without going down to the JSON file.
>>>>>>>>>>
>>>>>>>>>> Execution options:
>>>>>>>>>>
>>>>>>>>>> 1. Without execution argument (No args):
>>>>>>>>>>
>>>>>>>>>> Print out results (reset plan)
>>>>>>>>>>
>>>>>>>>>> 2. With --execute argument:
>>>>>>>>>>
>>>>>>>>>> Run reset process
>>>>>>>>>>
>>>>>>>>>> 3. With --output argument:
>>>>>>>>>>
>>>>>>>>>> Save result in a JSON format.
>>>>>>>>>>
>>>>>>>>>> 4. Only with --execute option and --reset-file (path to JSON)
>>>>>>>>>>
>>>>>>>>>> Reset based on file
>>>>>>>>>>
>>>>>>>>>> 5. Only with --verify option and --reset-file (path to JSON)
>>>>>>>>>>
>>>>>>>>>> Verify file values with current offsets
>>>>>>>>>>
>>>>>>>>>> I think we can remove --generate-and-execute because it is a bit clumsy.
>>>>>>>>>>
>>>>>>>>>> With these options we will be able to execute with manual JSON
>>>>>>>>>> configuration.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Wed, Feb 8, 2017 at 22:43, Ben Stopford (<ben@confluent.io>) wrote:
>>>>>>>>>>
>>>>>>>>>> Yes - using a tool like this to skip a set of consumer groups over a
>>>>>>>>>> corrupt/bad message is definitely appealing.
>>>>>>>>>>
>>>>>>>>>> B
>>>>>>>>>>
>>>>>>>>>> On Wed, Feb 8, 2017 at 9:37 PM Gwen Shapira <gw...@confluent.io> wrote:
>>>>>>>>>>
>>>>>>>>>>> I like the --reset-to-earliest and --reset-to-latest. In general,
>>>>>>>>>>> since the JSON route is the most challenging for users, we want to
>>>>>>>>>>> provide a lot of ways to do useful things without going there.
>>>>>>>>>>>
>>>>>>>>>>> Two things that can help:
>>>>>>>>>>>
>>>>>>>>>>> 1. A lot of times, users want to skip a few messages that cause
>>>>>>>>>>> issues and continue. Maybe just specifying the topic, partition,
>>>>>>>>>>> and delta will be better than having to find the offset, write a
>>>>>>>>>>> JSON file, validate the JSON, etc.
>>>>>>>>>>>
>>>>>>>>>>> 2. Think about whether there are other common use cases that we
>>>>>>>>>>> can make easy, rather than offering just one generic but not very
>>>>>>>>>>> usable method.
>>>>>>>>>>>
>>>>>>>>>>> Gwen
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Feb 8, 2017 at 3:25 AM, Jorge Esteban Quilcate Otoya
>>>>>>>>>>> <qu...@gmail.com> wrote:
>>>>>>>>>>>> Thanks for the feedback!
>>>>>>>>>>>>
>>>>>>>>>>>> @Onur, @Gwen:
>>>>>>>>>>>>
>>>>>>>>>>>> Agreed. In the first draft I considered putting it inside
>>>>>>>>>>>> ´kafka-consumer-groups.sh´, but I decided to propose it as a
>>>>>>>>>>>> standalone tool to describe it clearly and keep it focused on
>>>>>>>>>>>> reset functionality.
>>>>>>>>>>>>
>>>>>>>>>>>> But now that you mention it, it does make sense to have it in
>>>>>>>>>>>> ´kafka-consumer-groups.sh´. What would be a consistent way to
>>>>>>>>>>>> introduce it?
>>>>>>>>>>>>
>>>>>>>>>>>> Maybe something like this:
>>>>>>>>>>>>
>>>>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --generate --group cg1
>>>>>>>>>>>> --topics t1 --reset-from 2017-01-01T00:00:00.000 --output plan.json´
>>>>>>>>>>>>
>>>>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --verify --reset-json-file
>>>>>>>>>>>> plan.json´
>>>>>>>>>>>>
>>>>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --execute --reset-json-file
>>>>>>>>>>>> plan.json´
>>>>>>>>>>>>
>>>>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --generate-and-execute
>>>>>>>>>>>> --group cg1 --topics t1 --reset-from 2017-01-01T00:00:00.000´
>>>>>>>>>>>>
>>>>>>>>>>>> @Gwen:
>>>>>>>>>>>>
>>>>>>>>>>>>> It looks exactly like the replica assignment tool
>>>>>>>>>>>>
>>>>>>>>>>>> It was influenced by it ;-) I use the generate-verify-execute
>>>>>>>>>>>> process here to make sure the user is aware of the result of this
>>>>>>>>>>>> operation. At the beginning we considered adding only a couple of
>>>>>>>>>>>> options to the Consumer Group Command:
>>>>>>>>>>>>
>>>>>>>>>>>> --rewind-to-timestamp and --rewind-to-period
>>>>>>>>>>>>
>>>>>>>>>>>> @Onur:
>>>>>>>>>>>>
>>>>>>>>>>>>> You can actually get away with overriding while members of the
>>>>>>>>>>>>> group are live with method 2 by using group information from
>>>>>>>>>>>>> DescribeGroupsRequest.
>>>>>>>>>>>>
>>>>>>>>>>>> Does this mean that we need to have the Consumer Group stopped
>>>>>>>>>>>> before executing, and start a new consumer internally to do this?
>>>>>>>>>>>> Therefore, we won't be able to consider executing a reset while the
>>>>>>>>>>>> ConsumerGroup is active? (Trying to relate it to @Dong's 5th
>>>>>>>>>>>> question.)
>>>>>>>>>>>>
>>>>>>>>>>>> @Dong:
>>>>>>>>>>>>
>>>>>>>>>>>>> Should we allow user to use wildcard to reset offset of all
>>>>>>>>>>>>> groups for a given topic as well?
>>>>>>>>>>>>
>>>>>>>>>>>> I haven't thought about this scenario. It could be interesting.
>>>>>>>>>>>> Following the recommendation to add it to the Consumer Group
>>>>>>>>>>>> Command, in this case the Group argument would be optional if there
>>>>>>>>>>>> is only one topic. I think for multiple topics it won't be that
>>>>>>>>>>>> useful.
>>>>>>>>>>>>
>>>>>>>>>>>>> Should we allow user to specify timestamp per topic partition in
>>>>>>>>>>>>> the json file as well?
>>>>>>>>>>>>
>>>>>>>>>>>> I don't think this would be a valid option in the tool, but if a
>>>>>>>>>>>> Reset Plan is generated and the user wants to set the offset for a
>>>>>>>>>>>> specific partition to another offset (possibly based on another
>>>>>>>>>>>> timestamp) and execute it, that will be up to her/him.
>>>>>>>>>>>>
>>>>>>>>>>>>> Should the script take some credential file to make sure that
>>>>>>>>>>>>> this operation is authenticated given the potential impact of
>>>>>>>>>>>>> this operation?
>>>>>>>>>>>>
>>>>>>>>>>>> I haven't tried it against secured brokers yet, but the tool should
>>>>>>>>>>>> support authorization if it's enabled in the broker.
>>>>>>>>>>>>
>>>>>>>>>>>>> Should we provide constant to reset committed offset to
>>>>>>>>>>>>> earliest/latest offset of a partition, e.g. -1 indicates earliest
>>>>>>>>>>>>> offset and -2 indicates latest offset.
>>>>>>>>>>>>
>>>>>>>>>>>> I will go for something like ´--reset-to-earliest´ and
>>>>>>>>>>>> ´--reset-to-latest´
>>>>>>>>>>>>
>>>>>>>>>>>>> Should we allow dynamic change of the committed offset when
>>>>>>>>>>>>> consumers are running, such that consumers will seek to the newly
>>>>>>>>>>>>> committed offset and start consuming from there?
>>>>>>>>>>>>
>>>>>>>>>>>> Not sure about this. I would recommend keeping it simple and asking
>>>>>>>>>>>> the user to stop consumers first. But I would consider it if the
>>>>>>>>>>>> trade-offs are clear.
>>>>>>>>>>>>
>>>>>>>>>>>> @Matthias
>>>>>>>>>>>>
>>>>>>>>>>>> Added :). And thanks a lot for your help to define this KIP!
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Feb 8, 2017 at 7:47, Gwen Shapira (<gwen@confluent.io>) wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> As long as the CLI is a bit consistent? Like, not just adding 3
>>>>>>>>>>>>> arguments and a JSON parser to the existing tool, right?
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Tue, Feb 7, 2017 at 10:29 PM, Onur Karaman
>>>>>>>>>>>>> <on...@gmail.com> wrote:
>>>>>>>>>>>>>> I think it makes sense to just add the feature to
>>>>>>>>>>>>> kafka-consumer-groups.sh
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Tue, Feb 7, 2017 at 10:24 PM, Gwen Shapira <gwen@confluent.io> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks for the KIP. I'm super happy about adding the capability.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I hate the interface, though. It looks exactly like the replica
>>>>>>>>>>>>>>> assignment tool. A tool everyone loves so much that there are
>>>>>>>>>>>>>>> multiple projects, open and closed, that try to fix it.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Can we swap it with something that looks a bit more like the
>>>>>>>>>>>>>>> consumer group tool? or the kafka streams reset tool? Consistency
>>>>>>>>>>>>>>> is helpful in such cases. I spent some time learning existing
>>>>>>>>>>>>>>> tools and learning yet another one is a deterrent.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Gwen
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Tue, Feb 7, 2017 at 6:43 PM, Jorge Esteban Quilcate Otoya
>>>>>>>>>>>>>>> <qu...@gmail.com> wrote:
>>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I would like to propose a KIP to Add a tool to Reset Consumer
>>>>>>>>>>>>>>>> Group Offsets.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-122%3A+Add+a+tool+to+Reset+Consumer+Group+Offsets
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Please, take a look at the proposal and share your feedback.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>> Jorge.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>> Gwen Shapira
>>>>>>>>>>>>>>> Product Manager | Confluent
>>>>>>>>>>>>>>> 650.450.2760 | @gwenshap
>>>>>>>>>>>>>>> Follow us: Twitter | blog
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> --
>>>>>>>>>>>>> Gwen Shapira
>>>>>>>>>>>>> Product Manager | Confluent
>>>>>>>>>>>>> 650.450.2760 | @gwenshap
>>>>>>>>>>>>> Follow us: Twitter | blog
>>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Gwen Shapira
>>>>>>>>>>> Product Manager | Confluent
>>>>>>>>>>> 650.450.2760 | @gwenshap
>>>>>>>>>>> Follow us: Twitter | blog


Re: KIP-122: Add a tool to Reset Consumer Group Offsets

Posted by Jorge Esteban Quilcate Otoya <qu...@gmail.com>.
Renaming to --by-duration LGTM

Not sure about changing it to --shift-by-duration because we could end up
with the same redundancy as before with reset: --reset-offsets
--reset-to-*.

Maybe changing --shift-offset-by to --shift-by 'n' could make it consistent
enough?
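
As a small illustration of what a unified `--shift-by n` could mean for a single partition, here is a Python sketch. The function name `shift_offset` is hypothetical, and clamping the result to the partition's earliest and latest offsets is an assumption made for the example; how out-of-range shifts should behave has not been settled in this thread.

```python
def shift_offset(current: int, delta: int, earliest: int, latest: int) -> int:
    """Shift an offset by delta (negative rewinds, positive skips ahead).

    Assumption: the result is clamped to the range [earliest, latest].
    """
    return max(earliest, min(latest, current + delta))
```

For example, `shift_offset(100, -30, 0, 500)` rewinds to 70, which covers the former minus case, while a positive `n` covers the former plus case with the same option.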


On Fri, Feb 24, 2017 at 6:39, Matthias J. Sax (<ma...@confluent.io>) wrote:

> I just read the updated KIP once more.
>
> I would suggest renaming --to-duration to --by-duration
>
> Or, as a second idea, rename --to-duration to --shift-by-duration and at
> the same time rename --shift-offset-by to --shift-by-offset
>
> Not sure what the best option is, but naming would be more consistent IMHO.
>
>
>
> -Matthias
>
> On 2/23/17 4:42 PM, Jorge Esteban Quilcate Otoya wrote:
> > Hi All,
> >
> > If there are no more concerns, I'd like to start the vote for this KIP.
> >
> > Thanks!
> > Jorge.
> >
> > On Thu, Feb 23, 2017 at 22:50, Jorge Esteban Quilcate Otoya (<quilcate.jorge@gmail.com>) wrote:
> >
> >> Oh ok :)
> >>
> >> So, we can keep `--topic t1:1,2,3`
> >>
> >> I think with this one we have most of the feedback applied. I will
> >> update the KIP with this change.
> >>
> >> On Thu, Feb 23, 2017 at 22:38, Matthias J. Sax (<matthias@confluent.io>) wrote:
> >>
> >> Sounds reasonable.
> >>
> >> If we have multiple --topic arguments, it also does not matter if we use
> >> t1:1,2 or t2=1,2
> >>
> >> I just suggested '=' because I wanted to use ':' to chain multiple topics.
> >>
> >>
> >> -Matthias
> >>
> >> On 2/23/17 10:49 AM, Jorge Esteban Quilcate Otoya wrote:
> >>> Yep, `--topic t1=1,2` LGTM
> >>>
> >>> I don't have an idea either about getting rid of the repeated --topic,
> >>> but --group is also repeated in the case of deletion, so it could be ok
> >>> to have repeated --topic arguments.
> >>>
> >>> On Thu, Feb 23, 2017 at 19:14, Matthias J. Sax (<matthias@confluent.io>) wrote:
> >>>
> >>>> So you suggest merging the "scope options" --topics, --topic, and
> >>>> --partitions into a single option? Sounds good to me.
> >>>>
> >>>> I like the compact way to express it, ie, topicname:list-of-partitions
> >>>> with "all partitions" if no partitions are specified. It's quite
> >>>> intuitive to use.
> >>>>
> >>>> Just wondering if we could get rid of the repeated --topic option; it's
> >>>> somewhat verbose. I have no good idea how to improve it, though.
> >>>>
> >>>> If you concatenate multiple topics, we need one more character that is
> >>>> not allowed in topic names to separate the topics:
> >>>>
> >>>>> invalidChars = {'/', '\\', ',', '\u0000', ':', '"', '\'', ';', '*',
> >>>> '?', ' ', '\t', '\r', '\n', '='};
> >>>>
> >>>> maybe
> >>>>
> >>>> --topics t1=1,2,3:t2:t3=3
> >>>>
> >>>> use '=' to specify partitions (instead of ':' as you proposed) and ':'
> >>>> to separate topics? All other characters seem to be worse to use to
> me.
> >>>> But maybe you have a better idea.
> >>>>
> >>>>
> >>>>
> >>>> -Matthias
> >>>>
> >>>>
> >>>> On 2/23/17 3:15 AM, Jorge Esteban Quilcate Otoya wrote:
> >>>>> @Matthias about the point 9:
> >>>>>
> >>>>> What about keeping only the --topic option, and support this format:
> >>>>>
> >>>>> `--topic t1:0,1,2 --topic t2 --topic t3:2`
> >>>>>
> >>>>> In this case topics t1, t2, and t3 will be selected: topic t1 with
> >>>>> partitions 0,1 and 2; topic t2 with all its partitions; and topic t3,
> >>>> with
> >>>>> only partition 2.
> >>>>>
> >>>>> Jorge.
> >>>>>
> >>>>> On Tue, Feb 21, 2017 at 11:11, Jorge Esteban Quilcate Otoya (<
> >>>>> quilcate.jorge@gmail.com>) wrote:
> >>>>>
> >>>>>> Thanks for the feedback Matthias.
> >>>>>>
> >>>>>> * 1. You're right. I'll reorder the scenarios.
> >>>>>>
> >>>>>> * 2. Agree. I'll update the KIP.
> >>>>>>
> >>>>>> * 3. I like it, updating to `reset-offsets`
> >>>>>>
> >>>>>> * 4. Agree, removing the `reset-` part
> >>>>>>
> >>>>>> * 5. Yes, 1.e option without --execute or --export will print out
> >>>> current
> >>>>>> offset, and the new offset, that will be the same. The use-case of
> >> this
> >>>>>> option is to use it in combination with --export mostly and have a
> >>>> current
> >>>>>> 'checkpoint' to reset later. I will add to the KIP how the output
> >> should
> >>>>>> looks like.
> >>>>>>
> >>>>>> * 6. Considering 4., I will update it to `--to-offset`
> >>>>>>
> >>>>>> * 7. I like the idea to unify these options (plus, minus).
> >>>>>> `shift-offsets-by` is a good option, but I will like some more
> >> feedback
> >>>>>> here about the name. I will update the KIP in the meantime.
> >>>>>>
> >>>>>> * 8. Yes, discussed in 9.
> >>>>>>
> >>>>>> * 9. Agree. I'll love some feedback here. `topic` is already used by
> >>>>>> `delete`, and we can add `--all-topics` to consider all
> >>>> topics/partitions
> >>>>>> assigned to a group. How could we define specific topics/partitions?
> >>>>>>
> >>>>>> * 10. Haven't thought about it, but make sense.
> >>>>>> <topic>,<partition>,<offset> would be enough.
> >>>>>>
> >>>>>> * 11. Agree. Solved with 10.
> >>>>>>
> >>>>>> Also, I have a couple of changes to mention:
> >>>>>>
> >>>>>> 1. I have add a reference to the branch where I'm working on this
> KIP.
> >>>>>>
> >>>>>> 2. About the period scenario `--to-period`. I will change it to
> >>>>>> `--to-duration` given that duration (
> >>>>>> https://docs.oracle.com/javase/8/docs/api/java/time/Duration.html)
> >>>>>> follows this format: 'PnDTnHnMnS' and does not consider daylight
> >> saving
> >>>>>> efects.
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On Tue, Feb 21, 2017 at 2:47, Matthias J. Sax (<matthias@confluent.io>)
> >>>>>> wrote:
> >>>>>>
> >>>>>> Hi,
> >>>>>>
> >>>>>> thanks for updating the KIP. Couple of follow up comments:
> >>>>>>
> >>>>>> * Nit: Why is "Reset to Earliest" and "Reset to Latest" a "reset by
> >>>>>> time" option -- IMHO it belongs to "reset by position"?
> >>>>>>
> >>>>>>
> >>>>>> * Nit: Description of "Reset to Earliest"
> >>>>>>
> >>>>>>> using Kafka Consumer's `auto.offset.reset` to `earliest`
> >>>>>>
> >>>>>> I think this is strictly speaking not correct (as auto.offset.reset
> >> only
> >>>>>> triggered if no valid offset is found, but this tool explicitly
> >> modified
> >>>>>> committed offset), and should be phrased as
> >>>>>>
> >>>>>>> using Kafka Consumer's #seekToBeginning()
> >>>>>>
> >>>>>> -> similar issue for description of "Reset to Latest"
> >>>>>>
> >>>>>>
> >>>>>> * Main option: rename to --reset-offsets (plural instead of
> singular)
> >>>>>>
> >>>>>>
> >>>>>> * Scenario Options: I would remove "reset" from all options, because
> >> the
> >>>>>> main argument "--reset-offset" says already what to do:
> >>>>>>
> >>>>>>> bin/kafka-consumer-groups.sh --reset-offset --reset-to-datetime XXX
> >>>>>>
> >>>>>> better (IMHO):
> >>>>>>
> >>>>>>> bin/kafka-consumer-groups.sh --reset-offsets --to-datetime XXX
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> * Option 1.e ("print and export current offset") is not intuitive to
> >> use
> >>>>>> IMHO. The main option is "--reset-offset" but nothing happens if no
> >>>>>> scenario is specified. It is also not specified, what the output
> >> should
> >>>>>> look like?
> >>>>>>
> >>>>>> Furthermore, --describe should actually show currently committed
> >> offset
> >>>>>> for a group. So it seems to be redundant to have the same option in
> >>>>>> --reset-offsets
> >>>>>>
> >>>>>>
> >>>>>> * Option 2.a: I would rename to "--reset-to-offset" (or considering
> >> the
> >>>>>> comment above to "--to-offset")
> >>>>>>
> >>>>>>
> >>>>>> * Option 2.b and 2.c: I would unify to "--shift-offsets-by" (or
> >> similar)
> >>>>>> and accept positive/negative values
> >>>>>>
> >>>>>>
> >>>>>> * About Scope "all": maybe it's better to have an option
> >> "--all-topics"
> >>>>>> (or similar). IMHO explicit arguments are preferable over implicit
> >>>>>> setting to guard again accidental miss use of the tool.
> >>>>>>
> >>>>>>
> >>>>>> * Scope: I also think, that "--topic" (singular) and "--topics"
> >> (plural)
> >>>>>> are too similar and easy to use in a wrong way (ie, mix up) -- maybe
> >> we
> >>>>>> can have two options that are easier to distinguish.
> >>>>>>
> >>>>>>
> >>>>>> * I still think that JSON is not the best format (it's too
> >> verbose/hard
> >>>>>> to write for humans from scratch). A simple CSV format with implicit
> >>>>>> schema (topic,partition,offset) would be sufficient.
> >>>>>>
> >>>>>>
> >>>>>> * Why does the JSON contain "group_id" field -- there is parameter
> >>>>>> "--group" to specify the group ID. Would one overwrite the other
> (what
> >>>>>> order) or would there be an error if "--group" is used in
> combination
> >>>>>> with "--reset-from-file"?
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> -Matthias
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On 2/17/17 6:43 AM, Jorge Esteban Quilcate Otoya wrote:
> >>>>>>> Hi,
> >>>>>>>
> >>>>>>> according to the feedback, I've updated the KIP:
> >>>>>>>
> >>>>>>> - We have added and ordered the scenarios, scopes and executions of
> >> the
> >>>>>>> Reset Offset tool.
> >>>>>>> - Consider it as an extension to the current `ConsumerGroupCommand`
> >>>> tool
> >>>>>>> - Execution will be possible without generating JSON files.
> >>>>>>>
> >>>>>>>
> >>>>>>
> >>>>
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-122%3A+Add+Reset+Consumer+Group+Offsets+tooling
> >>>>>>>
> >>>>>>> Looking forward to your feedback!
> >>>>>>>
> >>>>>>> Jorge.
> >>>>>>>
> >>>>>>> On Wed, Feb 8, 2017 at 23:23, Jorge Esteban Quilcate Otoya (<
> >>>>>>> quilcate.jorge@gmail.com>) wrote:
> >>>>>>>
> >>>>>>>> Great. I think I got the idea. What about this options:
> >>>>>>>>
> >>>>>>>> Scenarios:
> >>>>>>>>
> >>>>>>>> 1. Current status
> >>>>>>>>
> >>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1´
> >>>>>>>>
> >>>>>>>> 2. To Datetime
> >>>>>>>>
> >>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1
> >>>> --reset-to-datetime
> >>>>>>>> 2017-01-01T00:00:00.000´
> >>>>>>>>
> >>>>>>>> 3. To Period
> >>>>>>>>
> >>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1
> >> --reset-to-period
> >>>>>> P2D´
> >>>>>>>>
> >>>>>>>> 4. To Earliest
> >>>>>>>>
> >>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1
> >>>>>> --reset-to-earliest´
> >>>>>>>>
> >>>>>>>> 5. To Latest
> >>>>>>>>
> >>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1
> >>>> --reset-to-latest´
> >>>>>>>>
> >>>>>>>> 6. Minus 'n' offsets
> >>>>>>>>
> >>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-minus
> >> n´
> >>>>>>>>
> >>>>>>>> 7. Plus 'n' offsets
> >>>>>>>>
> >>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-plus
> n´
> >>>>>>>>
> >>>>>>>> 8. To specific offset
> >>>>>>>>
> >>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to x´
> >>>>>>>>
> >>>>>>>> Scopes:
> >>>>>>>>
> >>>>>>>> a. All topics used by Consumer Group
> >>>>>>>>
> >>>>>>>> Don't specify --topics
> >>>>>>>>
> >>>>>>>> b. Specific List of Topics
> >>>>>>>>
> >>>>>>>> Add list of values in --topics t1,t2,tn
> >>>>>>>>
> >>>>>>>> c. One Topic, all Partitions
> >>>>>>>>
> >>>>>>>> Add one topic and no partitions values: --topic t1
> >>>>>>>>
> >>>>>>>> d. One Topic, List of Partitions
> >>>>>>>>
> >>>>>>>> Add one topic and partitions values: --topic t1 --partitions 0,1,2
> >>>>>>>>
> >>>>>>>> About Reset Plan (JSON file):
> >>>>>>>>
> >>>>>>>> I think is still valid to have the option to persist reset
> >>>> configuration
> >>>>>>>> as a file, but I agree to give the option to run the tool without
> >>>> going
> >>>>>>>> down to the JSON file.
> >>>>>>>>
> >>>>>>>> Execution options:
> >>>>>>>>
> >>>>>>>> 1. Without execution argument (No args):
> >>>>>>>>
> >>>>>>>> Print out results (reset plan)
> >>>>>>>>
> >>>>>>>> 2. With --execute argument:
> >>>>>>>>
> >>>>>>>> Run reset process
> >>>>>>>>
> >>>>>>>> 3. With --output argument:
> >>>>>>>>
> >>>>>>>> Save result in a JSON format.
> >>>>>>>>
> >>>>>>>> 4. Only with --execute option and --reset-file (path to JSON)
> >>>>>>>>
> >>>>>>>> Reset based on file
> >>>>>>>>
> >>>>>>>> 4. Only with --verify option and --reset-file (path to JSON)
> >>>>>>>>
> >>>>>>>> Verify file values with current offsets
> >>>>>>>>
> >>>>>>>> I think we can remove --generate-and-execute because is a bit
> >> clumsy.
> >>>>>>>>
> >>>>>>>> With this options we will be able to execute with manual JSON
> >>>>>>>> configuration.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Wed, Feb 8, 2017 at 22:43, Ben Stopford (<ben@confluent.io>)
> >>>>>>>> wrote:
> >>>>>>>>
> >>>>>>>> Yes - using a tool like this to skip a set of consumer groups
> over a
> >>>>>>>> corrupt/bad message is definitely appealing.
> >>>>>>>>
> >>>>>>>> B
> >>>>>>>>
> >>>>>>>> On Wed, Feb 8, 2017 at 9:37 PM Gwen Shapira <gw...@confluent.io>
> >>>> wrote:
> >>>>>>>>
> >>>>>>>>> I like the --reset-to-earliest and --reset-to-latest. In general,
> >>>>>>>>> since the JSON route is the most challenging for users, we want
> to
> >>>>>>>>> provide a lot of ways to do useful things without going there.
> >>>>>>>>>
> >>>>>>>>> Two things that can help:
> >>>>>>>>>
> >>>>>>>>> 1. A lot of times, users want to skip few messages that cause
> >> issues
> >>>>>>>>> and continue. maybe just specifying the topic, partition and
> delta
> >>>>>>>>> will be better than having to find the offset and write a JSON
> and
> >>>>>>>>> validate the JSON etc.
> >>>>>>>>>
> >>>>>>>>> 2. Thinking if there are other common use-cases that we can make
> >> easy
> >>>>>>>>> rather than just one generic but not very usable method.
> >>>>>>>>>
> >>>>>>>>> Gwen
> >>>>>>>>>
> >>>>>>>>> On Wed, Feb 8, 2017 at 3:25 AM, Jorge Esteban Quilcate Otoya
> >>>>>>>>> <qu...@gmail.com> wrote:
> >>>>>>>>>> Thanks for the feedback!
> >>>>>>>>>>
> >>>>>>>>>> @Onur, @Gwen:
> >>>>>>>>>>
> >>>>>>>>>> Agree. Actually at the first draft I considered to have it
> inside
> >>>>>>>>>> ´kafka-consumer-groups.sh´, but I decide to propose it as a
> >>>> standalone
> >>>>>>>>> tool
> >>>>>>>>>> to describe it clearly and focus it on reset functionality.
> >>>>>>>>>>
> >>>>>>>>>> But now that you mentioned, it does make sense to have it in
> >>>>>>>>>> ´kafka-consumer-groups.sh´. How would be a consistent way to
> >>>> introduce
> >>>>>>>>> it?
> >>>>>>>>>>
> >>>>>>>>>> Maybe something like this:
> >>>>>>>>>>
> >>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --generate --group cg1
> >>>>>>>> --topics
> >>>>>>>>> t1
> >>>>>>>>>> --reset-from 2017-01-01T00:00:00.000 --output plan.json´
> >>>>>>>>>>
> >>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --verify
> >> --reset-json-file
> >>>>>>>>>> plan.json´
> >>>>>>>>>>
> >>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --execute
> >> --reset-json-file
> >>>>>>>>>> plan.json´
> >>>>>>>>>>
> >>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --generate-and-execute
> >>>>>> --group
> >>>>>>>>> cg1
> >>>>>>>>>> --topics t1 --reset-from 2017-01-01T00:00:00.000´
> >>>>>>>>>>
> >>>>>>>>>> @Gwen:
> >>>>>>>>>>
> >>>>>>>>>>> It looks exactly like the replica assignment tool
> >>>>>>>>>>
> >>>>>>>>>> It was influenced by ;-) I use the generate-verify-execute
> process
> >>>>>> here
> >>>>>>>>> to
> >>>>>>>>>> make sure user will be aware of the result of this operation. At
> >> the
> >>>>>>>>>> beginning we considered only add a couple of options to Consumer
> >>>> Group
> >>>>>>>>>> Command:
> >>>>>>>>>>
> >>>>>>>>>> --rewind-to-timestamp and --rewind-to-period
> >>>>>>>>>>
> >>>>>>>>>> @Onur:
> >>>>>>>>>>
> >>>>>>>>>>> You can actually get away with overriding while members of the
> >>>> group
> >>>>>>>>> are live
> >>>>>>>>>> with method 2 by using group information from
> >> DescribeGroupsRequest.
> >>>>>>>>>>
> >>>>>>>>>> This means that we need to have Consumer Group stopped before
> >>>>>> executing
> >>>>>>>>> and
> >>>>>>>>>> start a new consumer internally to do this? Therefore, we won't
> be
> >>>>>> able
> >>>>>>>>> to
> >>>>>>>>>> consider executing reset when ConsumerGroup is active? (trying
> to
> >>>>>>>> relate
> >>>>>>>>> it
> >>>>>>>>>> with @Dong 5th question)
> >>>>>>>>>>
> >>>>>>>>>> @Dong:
> >>>>>>>>>>
> >>>>>>>>>>> Should we allow user to use wildcard to reset offset of all
> >> groups
> >>>>>>>> for a
> >>>>>>>>>> given topic as well?
> >>>>>>>>>>
> >>>>>>>>>> I haven't thought about this scenario. Could be interesting.
> >>>> Following
> >>>>>>>>> the
> >>>>>>>>>> recommendation to add it into Consumer Group Command, in this
> case
> >>>>>>>> Group
> >>>>>>>>>> argument will be optional if there are only 1 topic. I think for
> >>>>>>>> multiple
> >>>>>>>>>> topic won't be that useful.
> >>>>>>>>>>
> >>>>>>>>>>> Should we allow user to specify timestamp per topic partition
> in
> >>>> the
> >>>>>>>>> json
> >>>>>>>>>> file as well?
> >>>>>>>>>>
> >>>>>>>>>> Don't think this could be a valid from the tool, but if Reset
> Plan
> >>>> is
> >>>>>>>>>> generated, and user want to set the offset for a specific
> >> partition
> >>>> to
> >>>>>>>>>> other offset (eventually based on another timestamp), and
> execute
> >>>> it,
> >>>>>>>> it
> >>>>>>>>>> will be up to her/him.
> >>>>>>>>>>
> >>>>>>>>>>> Should the script take some credential file to make sure that
> >> this
> >>>>>>>>>> operation is authenticated given the potential impact of this
> >>>>>>>> operation?
> >>>>>>>>>>
> >>>>>>>>>> Haven't tried to secure brokers yet, but the tool should support
> >>>>>>>>>> authorization if it's enabled in the broker.
> >>>>>>>>>>
> >>>>>>>>>>> Should we provide constant to reset committed offset to
> >>>>>>>> earliest/latest
> >>>>>>>>>> offset of a partition, e.g. -1 indicates earliest offset and -2
> >>>>>>>> indicates
> >>>>>>>>>> latest offset.
> >>>>>>>>>>
> >>>>>>>>>> I will go for something like ´--reset-to-earliest´ and
> >>>>>>>>> ´--reset-to-latest´
> >>>>>>>>>>
> >>>>>>>>>>> Should we allow dynamic change of the comitted offset when
> >> consumer
> >>>>>>>> are
> >>>>>>>>>> running, such that consumer will seek to the newly committed
> >> offset
> >>>>>> and
> >>>>>>>>>> start consuming from there?
> >>>>>>>>>>
> >>>>>>>>>> Not sure about this. I will recommend to keep it simple and ask
> >> user
> >>>>>> to
> >>>>>>>>>> stop consumers first. But I would considered it if the
> trade-offs
> >>>> are
> >>>>>>>>>> clear.
> >>>>>>>>>>
> >>>>>>>>>> @Matthias
> >>>>>>>>>>
> >>>>>>>>>> Added :). And thanks a lot for your help to define this KIP!
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> On Wed, Feb 8, 2017 at 7:47, Gwen Shapira (<gwen@confluent.io>)
> >>>>>>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> As long as the CLI is a bit consistent? Like, not just adding 3
> >>>>>>>>>>> arguments and a JSON parser to the existing tool, right?
> >>>>>>>>>>>
> >>>>>>>>>>> On Tue, Feb 7, 2017 at 10:29 PM, Onur Karaman
> >>>>>>>>>>> <on...@gmail.com> wrote:
> >>>>>>>>>>>> I think it makes sense to just add the feature to
> >>>>>>>>>>> kafka-consumer-groups.sh
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Tue, Feb 7, 2017 at 10:24 PM, Gwen Shapira <
> >> gwen@confluent.io>
> >>>>>>>>> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>>> Thanks for the KIP. I'm super happy about adding the
> >> capability.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I hate the interface, though. It looks exactly like the
> replica
> >>>>>>>>>>>>> assignment tool. A tool everyone loves so much that there are
> >>>>>>>>> multiple
> >>>>>>>>>>>>> projects, open and closed, that try to fix it.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Can we swap it with something that looks a bit more like the
> >>>>>>>> consumer
> >>>>>>>>>>>>> group tool? or the kafka streams reset tool? Consistency is
> >>>> helpful
> >>>>>>>>> in
> >>>>>>>>>>>>> such cases. I spent some time learning existing tools and
> >>>> learning
> >>>>>>>>> yet
> >>>>>>>>>>>>> another one is a deterrent.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Gwen
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Tue, Feb 7, 2017 at 6:43 PM, Jorge Esteban Quilcate Otoya
> >>>>>>>>>>>>> <qu...@gmail.com> wrote:
> >>>>>>>>>>>>>> Hi all,
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> I would like to propose a KIP to Add a tool to Reset
> Consumer
> >>>>>>>> Group
> >>>>>>>>>>>>> Offsets.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> >>>>>>>>>>>>> 122%3A+Add+a+tool+to+Reset+Consumer+Group+Offsets
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Please, take a look at the proposal and share your feedback.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>>> Jorge.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> --
> >>>>>>>>>>>>> Gwen Shapira
> >>>>>>>>>>>>> Product Manager | Confluent
> >>>>>>>>>>>>> 650.450.2760 | @gwenshap
> >>>>>>>>>>>>> Follow us: Twitter | blog
> >>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> --
> >>>>>>>>>>> Gwen Shapira
> >>>>>>>>>>> Product Manager | Confluent
> >>>>>>>>>>> 650.450.2760 | @gwenshap
> >>>>>>>>>>> Follow us: Twitter | blog
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> --
> >>>>>>>>> Gwen Shapira
> >>>>>>>>> Product Manager | Confluent
> >>>>>>>>> 650.450.2760 | @gwenshap
> >>>>>>>>> Follow us: Twitter | blog
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>>
> >>>
> >>
> >>
> >
>
>

Re: KIP-122: Add a tool to Reset Consumer Group Offsets

Posted by "Matthias J. Sax" <ma...@confluent.io>.
I just read the updated KIP once more.

I would suggest renaming --to-duration to --by-duration.

Or, as a second idea, rename --to-duration to --shift-by-duration and at
the same time rename --shift-offset-by to --shift-by-offset.

Not sure what the best option is, but the naming would be more consistent IMHO.



-Matthias

On 2/23/17 4:42 PM, Jorge Esteban Quilcate Otoya wrote:
> Hi All,
> 
> If there are no more concerns, I'd like to start the vote for this KIP.
> 
> Thanks!
> Jorge.
> 
> On Thu, Feb 23, 2017 at 22:50, Jorge Esteban Quilcate Otoya (<
> quilcate.jorge@gmail.com>) wrote:
> 
>> Oh ok :)
>>
>> So, we can keep `--topic t1:1,2,3`
>>
>> I think with this one we have most of the feedback applied. I will update
>> the KIP with this change.
>>
>> On Thu, Feb 23, 2017 at 22:38, Matthias J. Sax (<ma...@confluent.io>)
>> wrote:
>>
>> Sounds reasonable.
>>
>> If we have multiple --topic arguments, it also does not matter whether we use
>> t1:1,2 or t2=1,2
>>
>> I just suggested '=' because I wanted to use ':' to chain multiple topics.
>>
>>
>> -Matthias
>>
>> On 2/23/17 10:49 AM, Jorge Esteban Quilcate Otoya wrote:
>>> Yep, `--topic t1=1,2` LGTM
>>>
>>> I don't have an idea either for getting rid of the repeated --topic, but --group
>>> is also repeated in the case of deletion, so it could be OK to have
>>> repeated --topic arguments.
>>>
>>> On Thu, Feb 23, 2017 at 19:14, Matthias J. Sax (<matthias@confluent.io>)
>>> wrote:
>>>
>>>> So you suggest merging the "scope options" --topics, --topic, and
>>>> --partitions into a single option? Sounds good to me.
>>>>
>>>> I like the compact way to express it, i.e., topicname:list-of-partitions
>>>> with "all partitions" if no partitions are specified. It's quite
>>>> intuitive to use.
>>>>
>>>> Just wondering if we could get rid of the repeated --topic option; it's
>>>> somewhat verbose. I have no good idea, though, how to improve it.
>>>>
>>>> If you concatenate multiple topics, we need one more character that is
>>>> not allowed in topic names to separate the topics:
>>>>
>>>>> invalidChars = {'/', '\\', ',', '\u0000', ':', '"', '\'', ';', '*',
>>>> '?', ' ', '\t', '\r', '\n', '='};
>>>>
>>>> maybe
>>>>
>>>> --topics t1=1,2,3:t2:t3=3
>>>>
>>>> use '=' to specify partitions (instead of ':' as you proposed) and ':'
>>>> to separate topics? All other characters seem to be worse to use to me.
>>>> But maybe you have a better idea.
>>>>
>>>>
>>>>
>>>> -Matthias
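
The chained form discussed above (`--topics t1=1,2,3:t2:t3=3`) can be illustrated with a minimal Python sketch. This is not code from the KIP; the function name `parse_chained_topics` and the use of `None` to mean "all partitions" are assumptions made only for this illustration.

```python
def parse_chained_topics(spec):
    """Parse a chained scope such as 't1=1,2,3:t2:t3=3'.

    ':' separates topics and '=' introduces a comma-separated partition list.
    Hypothetical helper: None stands for "all partitions of this topic".
    """
    scopes = {}
    for entry in spec.split(":"):
        topic, sep, parts = entry.partition("=")
        if not topic:
            raise ValueError(f"empty topic name in {spec!r}")
        # A topic without '=' selects all of its partitions.
        scopes[topic] = [int(p) for p in parts.split(",")] if sep else None
    return scopes


print(parse_chained_topics("t1=1,2,3:t2:t3=3"))
```

Both ':' and '=' appear in the invalidChars list quoted above, which is why they are safe to use as separators in this sketch.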
>>>>
>>>>
>>>> On 2/23/17 3:15 AM, Jorge Esteban Quilcate Otoya wrote:
>>>>> @Matthias about the point 9:
>>>>>
>>>>> What about keeping only the --topic option, and support this format:
>>>>>
>>>>> `--topic t1:0,1,2 --topic t2 --topic t3:2`
>>>>>
>>>>> In this case topics t1, t2, and t3 will be selected: topic t1 with
>>>>> partitions 0, 1, and 2; topic t2 with all its partitions; and topic t3
>>>>> with only partition 2.
>>>>>
>>>>> Jorge.
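
For the repeated-option form (`--topic t1:0,1,2 --topic t2 --topic t3:2`), a rough Python sketch with `argparse` shows how the scope could be collected. The real tool is part of Kafka's `ConsumerGroupCommand`; the parser below, its `prog` name and the helper `parse_topic_scope` are hypothetical and only illustrate the format.

```python
import argparse


def parse_topic_scope(value):
    """Turn 'topic' or 'topic:p0,p1,...' into (topic, partitions or None)."""
    topic, sep, parts = value.partition(":")
    if not topic:
        raise argparse.ArgumentTypeError(f"missing topic name in {value!r}")
    # None means "all partitions of this topic" (assumption of this sketch).
    partitions = [int(p) for p in parts.split(",")] if sep else None
    return topic, partitions


parser = argparse.ArgumentParser(prog="reset-offsets-sketch")
# action="append" lets --topic be given several times.
parser.add_argument("--topic", action="append", type=parse_topic_scope, default=[])

args = parser.parse_args(["--topic", "t1:0,1,2", "--topic", "t2", "--topic", "t3:2"])
print(args.topic)
```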
>>>>>
>>>>> On Tue, Feb 21, 2017 at 11:11, Jorge Esteban Quilcate Otoya (<
>>>>> quilcate.jorge@gmail.com>) wrote:
>>>>>
>>>>>> Thanks for the feedback Matthias.
>>>>>>
>>>>>> * 1. You're right. I'll reorder the scenarios.
>>>>>>
>>>>>> * 2. Agree. I'll update the KIP.
>>>>>>
>>>>>> * 3. I like it, updating to `reset-offsets`
>>>>>>
>>>>>> * 4. Agree, removing the `reset-` part
>>>>>>
>>>>>> * 5. Yes, option 1.e without --execute or --export will print out the
>>>>>> current offset and the new offset, which will be the same. The use case
>>>>>> for this option is mostly to combine it with --export and keep a current
>>>>>> 'checkpoint' to reset to later. I will add to the KIP what the output
>>>>>> should look like.
>>>>>>
>>>>>> * 6. Considering 4., I will update it to `--to-offset`
>>>>>>
>>>>>> * 7. I like the idea of unifying these options (plus, minus).
>>>>>> `shift-offsets-by` is a good option, but I would like some more feedback
>>>>>> here about the name. I will update the KIP in the meantime.
>>>>>>
>>>>>> * 8. Yes, discussed in 9.
>>>>>>
>>>>>> * 9. Agree. I'd love some feedback here. `topic` is already used by
>>>>>> `delete`, and we can add `--all-topics` to consider all topics/partitions
>>>>>> assigned to a group. How could we define specific topics/partitions?
>>>>>>
>>>>>> * 10. Haven't thought about it, but it makes sense.
>>>>>> <topic>,<partition>,<offset> would be enough.
>>>>>>
>>>>>> * 11. Agree. Solved with 10.
>>>>>>
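
The `<topic>,<partition>,<offset>` CSV layout from point 10 can be read with a few lines of Python. This is only a sketch under stated assumptions: one record per line, no header row, and a hypothetical helper name `read_reset_plan`; the KIP itself does not prescribe this code.

```python
import csv
import io


def read_reset_plan(text):
    """Read 'topic,partition,offset' rows into {(topic, partition): offset}."""
    plan = {}
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # skip blank lines
        topic, partition, offset = row
        plan[(topic, int(partition))] = int(offset)
    return plan


sample = "t1,0,42\nt1,1,17\nt2,0,0\n"
print(read_reset_plan(sample))
```

Compared with JSON, a file in this shape is easy to write by hand, which is the motivation given in the thread.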
>>>>>> Also, I have a couple of changes to mention:
>>>>>>
>>>>>> 1. I have added a reference to the branch where I'm working on this KIP.
>>>>>>
>>>>>> 2. About the period scenario `--to-period`: I will change it to
>>>>>> `--to-duration`, given that Duration (
>>>>>> https://docs.oracle.com/javase/8/docs/api/java/time/Duration.html)
>>>>>> follows the format 'PnDTnHnMnS' and does not consider daylight saving
>>>>>> effects.
>>>>>>
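
To make the 'PnDTnHnMnS' format concrete, here is a small Python sketch that parses such a duration and subtracts it from a reference time. It assumes the option means "now minus the duration", handles only whole, non-negative days, hours, minutes and seconds, and is not the Java `Duration.parse` implementation; `parse_duration` is a hypothetical helper.

```python
import re
from datetime import datetime, timedelta, timezone

# Simplified pattern for 'PnDTnHnMnS' (integers only, no sign, no fractions).
_DURATION = re.compile(
    r"^P(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?)?$"
)


def parse_duration(text):
    """Convert e.g. 'P2D' or 'PT1H30M' into a timedelta."""
    match = _DURATION.match(text)
    parts = {k: int(v) for k, v in match.groupdict().items() if v is not None} if match else {}
    if not parts or text.endswith("T"):
        raise ValueError(f"unsupported duration: {text!r}")
    return timedelta(**parts)


# A fixed reference time keeps the example deterministic.
now = datetime(2017, 2, 23, 12, 0, tzinfo=timezone.utc)
target = now - parse_duration("P2D")
print(target.isoformat())
```

A day is treated as exactly 24 hours here, which matches the point above about ignoring daylight saving effects.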
>>>>>>
>>>>>>
>>>>>> On Tue, Feb 21, 2017 at 2:47, Matthias J. Sax (<matthias@confluent.io>)
>>>>>> wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> thanks for updating the KIP. Couple of follow up comments:
>>>>>>
>>>>>> * Nit: Why is "Reset to Earliest" and "Reset to Latest" a "reset by
>>>>>> time" option -- IMHO it belongs to "reset by position"?
>>>>>>
>>>>>>
>>>>>> * Nit: Description of "Reset to Earliest"
>>>>>>
>>>>>>> using Kafka Consumer's `auto.offset.reset` to `earliest`
>>>>>>
>>>>>> I think this is, strictly speaking, not correct (auto.offset.reset is only
>>>>>> triggered if no valid offset is found, but this tool explicitly modifies
>>>>>> the committed offset), and it should be phrased as
>>>>>>
>>>>>>> using Kafka Consumer's #seekToBeginning()
>>>>>>
>>>>>> -> similar issue for description of "Reset to Latest"
>>>>>>
>>>>>>
>>>>>> * Main option: rename to --reset-offsets (plural instead of singular)
>>>>>>
>>>>>>
>>>>>> * Scenario Options: I would remove "reset" from all options, because the
>>>>>> main argument "--reset-offset" already says what to do:
>>>>>>
>>>>>>> bin/kafka-consumer-groups.sh --reset-offset --reset-to-datetime XXX
>>>>>>
>>>>>> better (IMHO):
>>>>>>
>>>>>>> bin/kafka-consumer-groups.sh --reset-offsets --to-datetime XXX
>>>>>>
>>>>>>
>>>>>>
>>>>>> * Option 1.e ("print and export current offset") is not intuitive to use
>>>>>> IMHO. The main option is "--reset-offset" but nothing happens if no
>>>>>> scenario is specified. It is also not specified what the output should
>>>>>> look like.
>>>>>>
>>>>>> Furthermore, --describe should actually show the currently committed offset
>>>>>> for a group. So it seems redundant to have the same option in
>>>>>> --reset-offsets
>>>>>>
>>>>>>
>>>>>> * Option 2.a: I would rename to "--reset-to-offset" (or, considering the
>>>>>> comment above, to "--to-offset")
>>>>>>
>>>>>>
>>>>>> * Option 2.b and 2.c: I would unify to "--shift-offsets-by" (or similar)
>>>>>> and accept positive/negative values
>>>>>>
>>>>>>
>>>>>> * About Scope "all": maybe it's better to have an option "--all-topics"
>>>>>> (or similar). IMHO explicit arguments are preferable over implicit
>>>>>> settings, to guard against accidental misuse of the tool.
>>>>>>
>>>>>>
>>>>>> * Scope: I also think that "--topic" (singular) and "--topics" (plural)
>>>>>> are too similar and easy to use in a wrong way (i.e., mix up) -- maybe we
>>>>>> can have two options that are easier to distinguish.
>>>>>>
>>>>>>
>>>>>> * I still think that JSON is not the best format (it's too verbose/hard
>>>>>> to write for humans from scratch). A simple CSV format with implicit
>>>>>> schema (topic,partition,offset) would be sufficient.
>>>>>>
>>>>>>
>>>>>> * Why does the JSON contain a "group_id" field -- there is a parameter
>>>>>> "--group" to specify the group ID. Would one overwrite the other (in what
>>>>>> order), or would there be an error if "--group" is used in combination
>>>>>> with "--reset-from-file"?
>>>>>>
>>>>>>
>>>>>>
>>>>>> -Matthias
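
The unified "--shift-offsets-by" idea (one option that accepts positive or negative values) can be sketched as follows. Clamping the result to the partition's earliest and latest offsets is an assumption of this illustration and is not stated in the thread; `shift_offset` is a hypothetical helper.

```python
def shift_offset(current, shift_by, earliest, latest):
    """Move a committed offset by a signed amount.

    Assumption of this sketch: the result is clamped to [earliest, latest]
    so it always points at a position that exists in the partition.
    """
    return max(earliest, min(latest, current + shift_by))


print(shift_offset(100, -30, 0, 500))    # rewind by 30 messages
print(shift_offset(100, 1000, 0, 500))   # forward shift past the end is clamped
```

A single signed argument covers both the former "minus n" and "plus n" scenarios.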
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 2/17/17 6:43 AM, Jorge Esteban Quilcate Otoya wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> according to the feedback, I've updated the KIP:
>>>>>>>
>>>>>>> - We have added and ordered the scenarios, scopes and executions of the
>>>>>>> Reset Offset tool.
>>>>>>> - Consider it as an extension to the current `ConsumerGroupCommand` tool
>>>>>>> - Execution will be possible without generating JSON files.
>>>>>>>
>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-122%3A+Add+Reset+Consumer+Group+Offsets+tooling
>>>>>>>
>>>>>>> Looking forward to your feedback!
>>>>>>>
>>>>>>> Jorge.
>>>>>>>
>>>>>>> On Wed, Feb 8, 2017 at 23:23, Jorge Esteban Quilcate Otoya (<
>>>>>>> quilcate.jorge@gmail.com>) wrote:
>>>>>>>
>>>>>>>> Great. I think I got the idea. What about these options:
>>>>>>>>
>>>>>>>> Scenarios:
>>>>>>>>
>>>>>>>> 1. Current status
>>>>>>>>
>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1´
>>>>>>>>
>>>>>>>> 2. To Datetime
>>>>>>>>
>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-datetime
>>>>>>>> 2017-01-01T00:00:00.000´
>>>>>>>>
>>>>>>>> 3. To Period
>>>>>>>>
>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-period P2D´
>>>>>>>>
>>>>>>>> 4. To Earliest
>>>>>>>>
>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-earliest´
>>>>>>>>
>>>>>>>> 5. To Latest
>>>>>>>>
>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-latest´
>>>>>>>>
>>>>>>>> 6. Minus 'n' offsets
>>>>>>>>
>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-minus n´
>>>>>>>>
>>>>>>>> 7. Plus 'n' offsets
>>>>>>>>
>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-plus n´
>>>>>>>>
>>>>>>>> 8. To specific offset
>>>>>>>>
>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to x´
>>>>>>>>
>>>>>>>> Scopes:
>>>>>>>>
>>>>>>>> a. All topics used by Consumer Group
>>>>>>>>
>>>>>>>> Don't specify --topics
>>>>>>>>
>>>>>>>> b. Specific List of Topics
>>>>>>>>
>>>>>>>> Add list of values in --topics t1,t2,tn
>>>>>>>>
>>>>>>>> c. One Topic, all Partitions
>>>>>>>>
>>>>>>>> Add one topic and no partitions values: --topic t1
>>>>>>>>
>>>>>>>> d. One Topic, List of Partitions
>>>>>>>>
>>>>>>>> Add one topic and partitions values: --topic t1 --partitions 0,1,2
>>>>>>>>
>>>>>>>> About Reset Plan (JSON file):
>>>>>>>>
>>>>>>>> I think it is still valid to have the option to persist the reset
>>>>>>>> configuration as a file, but I agree to give the option to run the tool
>>>>>>>> without going down to the JSON file.
>>>>>>>>
>>>>>>>> Execution options:
>>>>>>>>
>>>>>>>> 1. Without execution argument (No args):
>>>>>>>>
>>>>>>>> Print out results (reset plan)
>>>>>>>>
>>>>>>>> 2. With --execute argument:
>>>>>>>>
>>>>>>>> Run reset process
>>>>>>>>
>>>>>>>> 3. With --output argument:
>>>>>>>>
>>>>>>>> Save result in a JSON format.
>>>>>>>>
>>>>>>>> 4. Only with --execute option and --reset-file (path to JSON)
>>>>>>>>
>>>>>>>> Reset based on file
>>>>>>>>
>>>>>>>> 5. Only with --verify option and --reset-file (path to JSON)
>>>>>>>>
>>>>>>>> Verify file values with current offsets
>>>>>>>>
>>>>>>>> I think we can remove --generate-and-execute because it is a bit clumsy.
>>>>>>>>
>>>>>>>> With these options we will be able to execute with manual JSON
>>>>>>>> configuration.
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Feb 8, 2017 at 22:43, Ben Stopford (<be...@confluent.io>)
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>> Yes - using a tool like this to skip a set of consumer groups over a
>>>>>>>> corrupt/bad message is definitely appealing.
>>>>>>>>
>>>>>>>> B
>>>>>>>>
>>>>>>>> On Wed, Feb 8, 2017 at 9:37 PM Gwen Shapira <gw...@confluent.io> wrote:
>>>>>>>>
>>>>>>>>> I like the --reset-to-earliest and --reset-to-latest. In general,
>>>>>>>>> since the JSON route is the most challenging for users, we want to
>>>>>>>>> provide a lot of ways to do useful things without going there.
>>>>>>>>>
>>>>>>>>> Two things that can help:
>>>>>>>>>
>>>>>>>>> 1. A lot of times, users want to skip a few messages that cause issues
>>>>>>>>> and continue. Maybe just specifying the topic, partition and delta
>>>>>>>>> will be better than having to find the offset and write a JSON and
>>>>>>>>> validate the JSON etc.
>>>>>>>>>
>>>>>>>>> 2. Thinking if there are other common use-cases that we can make easy
>>>>>>>>> rather than just one generic but not very usable method.
>>>>>>>>>
>>>>>>>>> Gwen
>>>>>>>>>
>>>>>>>>> On Wed, Feb 8, 2017 at 3:25 AM, Jorge Esteban Quilcate Otoya
>>>>>>>>> <qu...@gmail.com> wrote:
>>>>>>>>>> Thanks for the feedback!
>>>>>>>>>>
>>>>>>>>>> @Onur, @Gwen:
>>>>>>>>>>
>>>>>>>>>> Agree. Actually, in the first draft I considered having it inside
>>>>>>>>>> ´kafka-consumer-groups.sh´, but I decided to propose it as a standalone
>>>>>>>>>> tool to describe it clearly and focus it on the reset functionality.
>>>>>>>>>>
>>>>>>>>>> But now that you mention it, it does make sense to have it in
>>>>>>>>>> ´kafka-consumer-groups.sh´. What would be a consistent way to introduce
>>>>>>>>>> it?
>>>>>>>>>>
>>>>>>>>>> Maybe something like this:
>>>>>>>>>>
>>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --generate --group cg1
>>>>>>>>>> --topics t1 --reset-from 2017-01-01T00:00:00.000 --output plan.json´
>>>>>>>>>>
>>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --verify --reset-json-file
>>>>>>>>>> plan.json´
>>>>>>>>>>
>>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --execute --reset-json-file
>>>>>>>>>> plan.json´
>>>>>>>>>>
>>>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --generate-and-execute
>>>>>>>>>> --group cg1 --topics t1 --reset-from 2017-01-01T00:00:00.000´
>>>>>>>>>>
>>>>>>>>>> @Gwen:
>>>>>>>>>>
>>>>>>>>>>> It looks exactly like the replica assignment tool
>>>>>>>>>>
>>>>>>>>>> It was influenced by it ;-) I use the generate-verify-execute
>>>>>>>>>> process here to make sure the user will be aware of the result of
>>>>>>>>>> this operation. At the beginning we considered only adding a couple
>>>>>>>>>> of options to Consumer Group Command:
>>>>>>>>>>
>>>>>>>>>> --rewind-to-timestamp and --rewind-to-period
>>>>>>>>>>
>>>>>>>>>> @Onur:
>>>>>>>>>>
>>>>>>>>>>> You can actually get away with overriding while members of the
>>>>>>>>>>> group are live with method 2 by using group information from
>>>>>>>>>>> DescribeGroupsRequest.
>>>>>>>>>>
>>>>>>>>>> Does this mean that we need to have the Consumer Group stopped
>>>>>>>>>> before executing, and start a new consumer internally to do this?
>>>>>>>>>> Therefore, we won't be able to consider executing a reset while the
>>>>>>>>>> Consumer Group is active? (Trying to relate it to @Dong's 5th
>>>>>>>>>> question.)
>>>>>>>>>>
>>>>>>>>>> @Dong:
>>>>>>>>>>
>>>>>>>>>>> Should we allow user to use wildcard to reset offset of all
>>>>>>>>>>> groups for a given topic as well?
>>>>>>>>>>
>>>>>>>>>> I haven't thought about this scenario. Could be interesting.
>>>>>>>>>> Following the recommendation to add it into Consumer Group Command,
>>>>>>>>>> in this case the Group argument will be optional if there is only one
>>>>>>>>>> topic. I think for multiple topics it won't be that useful.
>>>>>>>>>>
>>>>>>>>>>> Should we allow user to specify timestamp per topic partition in
>>>>>>>>>>> the json file as well?
>>>>>>>>>>
>>>>>>>>>> I don't think this would be a valid option for the tool itself, but
>>>>>>>>>> if a Reset Plan is generated and the user wants to set the offset for
>>>>>>>>>> a specific partition to another offset (possibly based on another
>>>>>>>>>> timestamp) and execute it, that will be up to her/him.
>>>>>>>>>>
>>>>>>>>>>> Should the script take some credential file to make sure that
>>>>>>>>>>> this operation is authenticated given the potential impact of this
>>>>>>>>>>> operation?
>>>>>>>>>>
>>>>>>>>>> Haven't tried to secure brokers yet, but the tool should support
>>>>>>>>>> authorization if it's enabled in the broker.
>>>>>>>>>>
>>>>>>>>>>> Should we provide constant to reset committed offset to
>>>>>>>>>>> earliest/latest offset of a partition, e.g. -1 indicates earliest
>>>>>>>>>>> offset and -2 indicates latest offset.
>>>>>>>>>>
>>>>>>>>>> I will go for something like ´--reset-to-earliest´ and
>>>>>>>>>> ´--reset-to-latest´
>>>>>>>>>>
>>>>>>>>>>> Should we allow dynamic change of the committed offset when
>>>>>>>>>>> consumers are running, such that consumer will seek to the newly
>>>>>>>>>>> committed offset and start consuming from there?
>>>>>>>>>>
>>>>>>>>>> Not sure about this. I would recommend keeping it simple and asking
>>>>>>>>>> the user to stop consumers first. But I would consider it if the
>>>>>>>>>> trade-offs are clear.
>>>>>>>>>>
>>>>>>>>>> @Matthias
>>>>>>>>>>
>>>>>>>>>> Added :). And thanks a lot for your help to define this KIP!
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Wed, Feb 8, 2017 at 7:47, Gwen Shapira (<gwen@confluent.io>)
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> As long as the CLI is a bit consistent? Like, not just adding 3
>>>>>>>>>>> arguments and a JSON parser to the existing tool, right?
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Feb 7, 2017 at 10:29 PM, Onur Karaman
>>>>>>>>>>> <on...@gmail.com> wrote:
>>>>>>>>>>>> I think it makes sense to just add the feature to
>>>>>>>>>>>> kafka-consumer-groups.sh
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Feb 7, 2017 at 10:24 PM, Gwen Shapira <gwen@confluent.io>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks for the KIP. I'm super happy about adding the capability.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I hate the interface, though. It looks exactly like the replica
>>>>>>>>>>>>> assignment tool. A tool everyone loves so much that there are
>>>>>>>>>>>>> multiple projects, open and closed, that try to fix it.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Can we swap it with something that looks a bit more like the
>>>>>>>>>>>>> consumer group tool? Or the kafka streams reset tool? Consistency is
>>>>>>>>>>>>> helpful in such cases. I spent some time learning existing tools and
>>>>>>>>>>>>> learning yet another one is a deterrent.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Gwen
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Tue, Feb 7, 2017 at 6:43 PM, Jorge Esteban Quilcate Otoya
>>>>>>>>>>>>> <qu...@gmail.com> wrote:
>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I would like to propose a KIP to Add a tool to Reset Consumer
>>>>>>>>>>>>>> Group Offsets.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-122%3A+Add+a+tool+to+Reset+Consumer+Group+Offsets
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Please, take a look at the proposal and share your feedback.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Jorge.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> --
>>>>>>>>>>>>> Gwen Shapira
>>>>>>>>>>>>> Product Manager | Confluent
>>>>>>>>>>>>> 650.450.2760 | @gwenshap
>>>>>>>>>>>>> Follow us: Twitter | blog
>>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Gwen Shapira
>>>>>>>>>>> Product Manager | Confluent
>>>>>>>>>>> 650.450.2760 | @gwenshap
>>>>>>>>>>> Follow us: Twitter | blog
>>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Gwen Shapira
>>>>>>>>> Product Manager | Confluent
>>>>>>>>> 650.450.2760 | @gwenshap
>>>>>>>>> Follow us: Twitter | blog
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>>
> 


Re: KIP-122: Add a tool to Reset Consumer Group Offsets

Posted by Jorge Esteban Quilcate Otoya <qu...@gmail.com>.
Hi All,

If there are no more concerns, I'd like to start the vote for this KIP.

Thanks!
Jorge.

On Thu, Feb 23, 2017 at 22:50, Jorge Esteban Quilcate Otoya (<
quilcate.jorge@gmail.com>) wrote:

> Oh ok :)
>
> So, we can keep `--topic t1:1,2,3`
>
> I think with this one we have most of the feedback applied. I will update
> the KIP with this change.
>
> On Thu, Feb 23, 2017 at 22:38, Matthias J. Sax (<ma...@confluent.io>)
> wrote:
>
> Sounds reasonable.
>
> If we have multiple --topic arguments, it also does not matter whether we
> use t1:1,2 or t2=1,2
>
> I just suggested '=' because I wanted to use ':' to chain multiple topics.
>
>
> -Matthias
>
> On 2/23/17 10:49 AM, Jorge Esteban Quilcate Otoya wrote:
> > Yeap, `--topic t1=1,2` LGTM
> >
> > I don't have an idea either about getting rid of the repeated --topic,
> > but --group is also repeated in the case of deletion, so it could be ok
> > to have repeated --topic arguments.
> >
> > On Thu, Feb 23, 2017 at 19:14, Matthias J. Sax (<matthias@confluent.io>)
> > wrote:
> >
> >> So you suggest to merge "scope options" --topics, --topic, and
> >> --partitions into a single option? Sounds good to me.
> >>
> >> I like the compact way to express it, ie, topicname:list-of-partitions
> >> with "all partitions" if no partitions are specified. It's quite
> >> intuitive to use.
> >>
> >> Just wondering if we could get rid of the repeated --topic option; it's
> >> somewhat verbose. Have no good idea though how to improve it.
> >>
> >> If you concatenate multiple topics, we need one more character that is
> >> not allowed in topic names to separate the topics:
> >>
> >>> invalidChars = {'/', '\\', ',', '\u0000', ':', '"', '\'', ';', '*',
> >>> '?', ' ', '\t', '\r', '\n', '='};
> >>
> >> maybe
> >>
> >> --topics t1=1,2,3:t2:t3=3
> >>
> >> use '=' to specify partitions (instead of ':' as you proposed) and ':'
> >> to separate topics? All other characters seem to be worse to use to me.
> >> But maybe you have a better idea.
> >>
> >>
> >>
> >> -Matthias
> >>
> >>
> >> On 2/23/17 3:15 AM, Jorge Esteban Quilcate Otoya wrote:
> >>> @Matthias about the point 9:
> >>>
> >>> What about keeping only the --topic option, and support this format:
> >>>
> >>> `--topic t1:0,1,2 --topic t2 --topic t3:2`
> >>>
> >>> In this case topics t1, t2, and t3 will be selected: topic t1 with
> >>> partitions 0, 1 and 2; topic t2 with all its partitions; and topic t3
> >>> with only partition 2.
> >>>
> >>> Jorge.
> >>>
> >>> On Tue, Feb 21, 2017 at 11:11, Jorge Esteban Quilcate Otoya (<
> >>> quilcate.jorge@gmail.com>) wrote:
> >>>
> >>>> Thanks for the feedback Matthias.
> >>>>
> >>>> * 1. You're right. I'll reorder the scenarios.
> >>>>
> >>>> * 2. Agree. I'll update the KIP.
> >>>>
> >>>> * 3. I like it, updating to `reset-offsets`
> >>>>
> >>>> * 4. Agree, removing the `reset-` part
> >>>>
> >>>> * 5. Yes, the 1.e option without --execute or --export will print out
> >>>> the current offset and the new offset, which will be the same. The
> >>>> use case of this option is mostly to combine it with --export and
> >>>> keep a current 'checkpoint' to reset to later. I will add to the KIP
> >>>> what the output should look like.
> >>>>
> >>>> * 6. Considering 4., I will update it to `--to-offset`
> >>>>
> >>>> * 7. I like the idea to unify these options (plus, minus).
> >>>> `shift-offsets-by` is a good option, but I would like some more
> >>>> feedback here about the name. I will update the KIP in the meantime.
> >>>>
> >>>> * 8. Yes, discussed in 9.
> >>>>
> >>>> * 9. Agree. I'd love some feedback here. `topic` is already used by
> >>>> `delete`, and we can add `--all-topics` to consider all
> >>>> topics/partitions assigned to a group. How could we define specific
> >>>> topics/partitions?
> >>>>
> >>>> * 10. Haven't thought about it, but it makes sense.
> >>>> <topic>,<partition>,<offset> would be enough.
> >>>>
> >>>> * 11. Agree. Solved with 10.
> >>>>
> >>>> Also, I have a couple of changes to mention:
> >>>>
> >>>> 1. I have added a reference to the branch where I'm working on this KIP.
> >>>>
> >>>> 2. About the period scenario `--to-period`. I will change it to
> >>>> `--to-duration` given that duration (
> >>>> https://docs.oracle.com/javase/8/docs/api/java/time/Duration.html)
> >>>> follows this format: 'PnDTnHnMnS' and does not consider daylight
> >>>> saving effects.
> >>>>
> >>>>
> >>>>
> >>>> On Tue, Feb 21, 2017 at 2:47, Matthias J. Sax (<matthias@confluent.io>)
> >>>> wrote:
> >>>>
> >>>> Hi,
> >>>>
> >>>> thanks for updating the KIP. Couple of follow up comments:
> >>>>
> >>>> * Nit: Why is "Reset to Earliest" and "Reset to Latest" a "reset by
> >>>> time" option -- IMHO it belongs to "reset by position"?
> >>>>
> >>>>
> >>>> * Nit: Description of "Reset to Earliest"
> >>>>
> >>>>> using Kafka Consumer's `auto.offset.reset` to `earliest`
> >>>>
> >>>> I think this is strictly speaking not correct (as auto.offset.reset is
> >>>> only triggered if no valid offset is found, but this tool explicitly
> >>>> modifies the committed offset), and should be phrased as
> >>>>
> >>>>> using Kafka Consumer's #seekToBeginning()
> >>>>
> >>>> -> similar issue for description of "Reset to Latest"
> >>>>
> >>>>
> >>>> * Main option: rename to --reset-offsets (plural instead of singular)
> >>>>
> >>>>
> >>>> * Scenario Options: I would remove "reset" from all options, because
> >>>> the main argument "--reset-offset" already says what to do:
> >>>>
> >>>>> bin/kafka-consumer-groups.sh --reset-offset --reset-to-datetime XXX
> >>>>
> >>>> better (IMHO):
> >>>>
> >>>>> bin/kafka-consumer-groups.sh --reset-offsets --to-datetime XXX
> >>>>
> >>>>
> >>>>
> >>>> * Option 1.e ("print and export current offset") is not intuitive to
> >>>> use IMHO. The main option is "--reset-offset" but nothing happens if no
> >>>> scenario is specified. It is also not specified what the output
> >>>> should look like.
> >>>>
> >>>> Furthermore, --describe should actually show the currently committed
> >>>> offset for a group. So it seems to be redundant to have the same option
> >>>> in --reset-offsets
> >>>>
> >>>>
> >>>> * Option 2.a: I would rename to "--reset-to-offset" (or considering
> >>>> the comment above to "--to-offset")
> >>>>
> >>>>
> >>>> * Option 2.b and 2.c: I would unify to "--shift-offsets-by" (or
> >>>> similar) and accept positive/negative values
> >>>>
> >>>>
> >>>> * About Scope "all": maybe it's better to have an option
> >>>> "--all-topics" (or similar). IMHO explicit arguments are preferable
> >>>> over implicit settings to guard against accidental misuse of the tool.
> >>>>
> >>>>
> >>>> * Scope: I also think that "--topic" (singular) and "--topics"
> >>>> (plural) are too similar and easy to use in a wrong way (ie, mix up) --
> >>>> maybe we can have two options that are easier to distinguish.
> >>>>
> >>>>
> >>>> * I still think that JSON is not the best format (it's too
> >>>> verbose/hard to write for humans from scratch). A simple CSV format
> >>>> with implicit schema (topic,partition,offset) would be sufficient.
> >>>>
> >>>>
> >>>> * Why does the JSON contain "group_id" field -- there is parameter
> >>>> "--group" to specify the group ID. Would one overwrite the other (what
> >>>> order) or would there be an error if "--group" is used in combination
> >>>> with "--reset-from-file"?
> >>>>
> >>>>
> >>>>
> >>>> -Matthias
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> On 2/17/17 6:43 AM, Jorge Esteban Quilcate Otoya wrote:
> >>>>> Hi,
> >>>>>
> >>>>> according to the feedback, I've updated the KIP:
> >>>>>
> >>>>> - We have added and ordered the scenarios, scopes and executions of
> >>>>> the Reset Offset tool.
> >>>>> - Consider it as an extension to the current `ConsumerGroupCommand`
> >>>>> tool
> >>>>> - Execution will be possible without generating JSON files.
> >>>>>
> >>>>>
> >>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-122%3A+Add+Reset+Consumer+Group+Offsets+tooling
> >>>>>
> >>>>> Looking forward to your feedback!
> >>>>>
> >>>>> Jorge.
> >>>>>
> >>>>> On Wed, Feb 8, 2017 at 23:23, Jorge Esteban Quilcate Otoya (<
> >>>>> quilcate.jorge@gmail.com>) wrote:
> >>>>>
> >>>>>> Great. I think I got the idea. What about these options:
> >>>>>>
> >>>>>> Scenarios:
> >>>>>>
> >>>>>> 1. Current status
> >>>>>>
> >>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1´
> >>>>>>
> >>>>>> 2. To Datetime
> >>>>>>
> >>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1
> >>>>>> --reset-to-datetime 2017-01-01T00:00:00.000´
> >>>>>>
> >>>>>> 3. To Period
> >>>>>>
> >>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1
> >>>>>> --reset-to-period P2D´
> >>>>>>
> >>>>>> 4. To Earliest
> >>>>>>
> >>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1
> >>>>>> --reset-to-earliest´
> >>>>>>
> >>>>>> 5. To Latest
> >>>>>>
> >>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1
> >>>>>> --reset-to-latest´
> >>>>>>
> >>>>>> 6. Minus 'n' offsets
> >>>>>>
> >>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-minus n´
> >>>>>>
> >>>>>> 7. Plus 'n' offsets
> >>>>>>
> >>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-plus n´
> >>>>>>
> >>>>>> 8. To specific offset
> >>>>>>
> >>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to x´
> >>>>>>
> >>>>>> Scopes:
> >>>>>>
> >>>>>> a. All topics used by Consumer Group
> >>>>>>
> >>>>>> Don't specify --topics
> >>>>>>
> >>>>>> b. Specific List of Topics
> >>>>>>
> >>>>>> Add list of values in --topics t1,t2,tn
> >>>>>>
> >>>>>> c. One Topic, all Partitions
> >>>>>>
> >>>>>> Add one topic and no partitions values: --topic t1
> >>>>>>
> >>>>>> d. One Topic, List of Partitions
> >>>>>>
> >>>>>> Add one topic and partitions values: --topic t1 --partitions 0,1,2
> >>>>>>
> >>>>>> About Reset Plan (JSON file):
> >>>>>>
> >>>>>> I think it is still valid to have the option to persist the reset
> >>>>>> configuration as a file, but I agree to give the option to run the
> >>>>>> tool without going down to the JSON file.
> >>>>>>
> >>>>>> Execution options:
> >>>>>>
> >>>>>> 1. Without execution argument (No args):
> >>>>>>
> >>>>>> Print out results (reset plan)
> >>>>>>
> >>>>>> 2. With --execute argument:
> >>>>>>
> >>>>>> Run reset process
> >>>>>>
> >>>>>> 3. With --output argument:
> >>>>>>
> >>>>>> Save result in a JSON format.
> >>>>>>
> >>>>>> 4. Only with --execute option and --reset-file (path to JSON)
> >>>>>>
> >>>>>> Reset based on file
> >>>>>>
> >>>>>> 5. Only with --verify option and --reset-file (path to JSON)
> >>>>>>
> >>>>>> Verify file values with current offsets
> >>>>>>
> >>>>>> I think we can remove --generate-and-execute because it is a bit
> >>>>>> clumsy.
> >>>>>>
> >>>>>> With these options we will be able to execute with manual JSON
> >>>>>> configuration.
> >>>>>>
> >>>>>>
> >>>>>> On Wed, Feb 8, 2017 at 22:43, Ben Stopford (<be...@confluent.io>)
> >>>>>> wrote:
> >>>>>>
> >>>>>> Yes - using a tool like this to skip a set of consumer groups over a
> >>>>>> corrupt/bad message is definitely appealing.
> >>>>>>
> >>>>>> B
> >>>>>>
> >>>>>> On Wed, Feb 8, 2017 at 9:37 PM Gwen Shapira <gw...@confluent.io>
> >>>>>> wrote:
> >>>>>>
> >>>>>>> I like the --reset-to-earliest and --reset-to-latest. In general,
> >>>>>>> since the JSON route is the most challenging for users, we want to
> >>>>>>> provide a lot of ways to do useful things without going there.
> >>>>>>>
> >>>>>>> Two things that can help:
> >>>>>>>
> >>>>>>> 1. A lot of times, users want to skip a few messages that cause
> >>>>>>> issues and continue. Maybe just specifying the topic, partition and
> >>>>>>> delta will be better than having to find the offset and write a JSON
> >>>>>>> and validate the JSON etc.
> >>>>>>>
> >>>>>>> 2. Thinking if there are other common use-cases that we can make
> >>>>>>> easy rather than just one generic but not very usable method.
> >>>>>>>
> >>>>>>> Gwen
> >>>>>>>
> >>>>>>> On Wed, Feb 8, 2017 at 3:25 AM, Jorge Esteban Quilcate Otoya
> >>>>>>> <qu...@gmail.com> wrote:
> >>>>>>>> Thanks for the feedback!
> >>>>>>>>
> >>>>>>>> @Onur, @Gwen:
> >>>>>>>>
> >>>>>>>> Agree. Actually, in the first draft I considered having it inside
> >>>>>>>> ´kafka-consumer-groups.sh´, but I decided to propose it as a
> >>>>>>>> standalone tool to describe it clearly and focus it on reset
> >>>>>>>> functionality.
> >>>>>>>>
> >>>>>>>> But now that you mention it, it does make sense to have it in
> >>>>>>>> ´kafka-consumer-groups.sh´. What would be a consistent way to
> >>>>>>>> introduce it?
> >>>>>>>>
> >>>>>>>> Maybe something like this:
> >>>>>>>>
> >>>>>>>> ´kafka-consumer-groups.sh --reset-offset --generate --group cg1
> >>>>>>>> --topics t1 --reset-from 2017-01-01T00:00:00.000 --output plan.json´
> >>>>>>>>
> >>>>>>>> ´kafka-consumer-groups.sh --reset-offset --verify --reset-json-file
> >>>>>>>> plan.json´
> >>>>>>>>
> >>>>>>>> ´kafka-consumer-groups.sh --reset-offset --execute --reset-json-file
> >>>>>>>> plan.json´
> >>>>>>>>
> >>>>>>>> ´kafka-consumer-groups.sh --reset-offset --generate-and-execute
> >>>>>>>> --group cg1 --topics t1 --reset-from 2017-01-01T00:00:00.000´
> >>>>>>>>
> >>>>>>>> @Gwen:
> >>>>>>>>
> >>>>>>>>> It looks exactly like the replica assignment tool
> >>>>>>>>
> >>>>>>>> It was influenced by it ;-) I use the generate-verify-execute
> >>>>>>>> process here to make sure the user will be aware of the result of
> >>>>>>>> this operation. At the beginning we considered only adding a couple
> >>>>>>>> of options to Consumer Group Command:
> >>>>>>>>
> >>>>>>>> --rewind-to-timestamp and --rewind-to-period
> >>>>>>>>
> >>>>>>>> @Onur:
> >>>>>>>>
> >>>>>>>>> You can actually get away with overriding while members of the
> >>>>>>>>> group are live with method 2 by using group information from
> >>>>>>>>> DescribeGroupsRequest.
> >>>>>>>>
> >>>>>>>> Does this mean that we need to have the Consumer Group stopped
> >>>>>>>> before executing, and start a new consumer internally to do this?
> >>>>>>>> Therefore, we won't be able to consider executing a reset while the
> >>>>>>>> Consumer Group is active? (Trying to relate it to @Dong's 5th
> >>>>>>>> question.)
> >>>>>>>>
> >>>>>>>> @Dong:
> >>>>>>>>
> >>>>>>>>> Should we allow user to use wildcard to reset offset of all
> >>>>>>>>> groups for a given topic as well?
> >>>>>>>>
> >>>>>>>> I haven't thought about this scenario. Could be interesting.
> >>>>>>>> Following the recommendation to add it into Consumer Group Command,
> >>>>>>>> in this case the Group argument will be optional if there is only one
> >>>>>>>> topic. I think for multiple topics it won't be that useful.
> >>>>>>>>
> >>>>>>>>> Should we allow user to specify timestamp per topic partition in
> >>>>>>>>> the json file as well?
> >>>>>>>>
> >>>>>>>> I don't think this would be a valid option for the tool itself, but
> >>>>>>>> if a Reset Plan is generated and the user wants to set the offset for
> >>>>>>>> a specific partition to another offset (possibly based on another
> >>>>>>>> timestamp) and execute it, that will be up to her/him.
> >>>>>>>>
> >>>>>>>>> Should the script take some credential file to make sure that
> >>>>>>>>> this operation is authenticated given the potential impact of this
> >>>>>>>>> operation?
> >>>>>>>>
> >>>>>>>> Haven't tried to secure brokers yet, but the tool should support
> >>>>>>>> authorization if it's enabled in the broker.
> >>>>>>>>
> >>>>>>>>> Should we provide constant to reset committed offset to
> >>>>>>>>> earliest/latest offset of a partition, e.g. -1 indicates earliest
> >>>>>>>>> offset and -2 indicates latest offset.
> >>>>>>>>
> >>>>>>>> I will go for something like ´--reset-to-earliest´ and
> >>>>>>>> ´--reset-to-latest´
> >>>>>>>>
> >>>>>>>>> Should we allow dynamic change of the committed offset when
> >>>>>>>>> consumers are running, such that consumer will seek to the newly
> >>>>>>>>> committed offset and start consuming from there?
> >>>>>>>>
> >>>>>>>> Not sure about this. I would recommend keeping it simple and asking
> >>>>>>>> the user to stop consumers first. But I would consider it if the
> >>>>>>>> trade-offs are clear.
> >>>>>>>>
> >>>>>>>> @Matthias
> >>>>>>>>
> >>>>>>>> Added :). And thanks a lot for your help to define this KIP!
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Wed, Feb 8, 2017 at 7:47, Gwen Shapira (<gwen@confluent.io>)
> >>>>>>>> wrote:
> >>>>>>>>
> >>>>>>>>> As long as the CLI is a bit consistent? Like, not just adding 3
> >>>>>>>>> arguments and a JSON parser to the existing tool, right?
> >>>>>>>>>
> >>>>>>>>> On Tue, Feb 7, 2017 at 10:29 PM, Onur Karaman
> >>>>>>>>> <on...@gmail.com> wrote:
> >>>>>>>>>> I think it makes sense to just add the feature to
> >>>>>>>>>> kafka-consumer-groups.sh
> >>>>>>>>>>
> >>>>>>>>>> On Tue, Feb 7, 2017 at 10:24 PM, Gwen Shapira <gwen@confluent.io>
> >>>>>>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Thanks for the KIP. I'm super happy about adding the capability.
> >>>>>>>>>>>
> >>>>>>>>>>> I hate the interface, though. It looks exactly like the replica
> >>>>>>>>>>> assignment tool. A tool everyone loves so much that there are
> >>>>>>>>>>> multiple projects, open and closed, that try to fix it.
> >>>>>>>>>>>
> >>>>>>>>>>> Can we swap it with something that looks a bit more like the
> >>>>>>>>>>> consumer group tool? Or the kafka streams reset tool? Consistency is
> >>>>>>>>>>> helpful in such cases. I spent some time learning existing tools and
> >>>>>>>>>>> learning yet another one is a deterrent.
> >>>>>>>>>>>
> >>>>>>>>>>> Gwen
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> On Tue, Feb 7, 2017 at 6:43 PM, Jorge Esteban Quilcate Otoya
> >>>>>>>>>>> <qu...@gmail.com> wrote:
> >>>>>>>>>>>> Hi all,
> >>>>>>>>>>>>
> >>>>>>>>>>>> I would like to propose a KIP to Add a tool to Reset Consumer
> >>>>>>>>>>>> Group Offsets.
> >>>>>>>>>>>>
> >>>>>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-122%3A+Add+a+tool+to+Reset+Consumer+Group+Offsets
> >>>>>>>>>>>>
> >>>>>>>>>>>> Please, take a look at the proposal and share your feedback.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Thanks,
> >>>>>>>>>>>> Jorge.
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> --
> >>>>>>>>>>> Gwen Shapira
> >>>>>>>>>>> Product Manager | Confluent
> >>>>>>>>>>> 650.450.2760 | @gwenshap
> >>>>>>>>>>> Follow us: Twitter | blog
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> --
> >>>>>>>>> Gwen Shapira
> >>>>>>>>> Product Manager | Confluent
> >>>>>>>>> 650.450.2760 | @gwenshap
> >>>>>>>>> Follow us: Twitter | blog
> >>>>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> --
> >>>>>>> Gwen Shapira
> >>>>>>> Product Manager | Confluent
> >>>>>>> 650.450.2760 | @gwenshap
> >>>>>>> Follow us: Twitter | blog
> >>>>>>>
> >>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>>
> >>>
> >>
> >>
> >
>
>

Re: KIP-122: Add a tool to Reset Consumer Group Offsets

Posted by Jorge Esteban Quilcate Otoya <qu...@gmail.com>.
Oh ok :)

So, we can keep `--topic t1:1,2,3`

I think with this one we have most of the feedback applied. I will update
the KIP with this change.
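
For illustration, here is a minimal sketch of how repeated `--topic` values
in the `t1:1,2,3` format could be parsed. It is only an assumption about the
semantics discussed in this thread (a bare topic name means "all
partitions"), not the actual ConsumerGroupCommand code, and
`parse_topic_args` is a hypothetical helper name.

```python
from typing import Dict, List, Optional


def parse_topic_args(values: List[str]) -> Dict[str, Optional[List[int]]]:
    """Parse repeated --topic values such as "t1:0,1,2", "t2", "t3:2".

    Returns a mapping of topic name to a list of partition numbers, or
    None when no partitions were given (read here as "all partitions").
    Hypothetical helper for illustration; not part of any Kafka tool.
    """
    scope: Dict[str, Optional[List[int]]] = {}
    for value in values:
        # ':' is not allowed in topic names, so the first ':' is the separator.
        topic, sep, partitions = value.partition(":")
        if not topic:
            raise ValueError(f"missing topic name in {value!r}")
        if not sep:
            scope[topic] = None  # bare topic: all partitions
            continue
        if not partitions:
            raise ValueError(f"missing partition list in {value!r}")
        # int() raises ValueError on non-numeric or empty entries.
        scope[topic] = [int(p) for p in partitions.split(",")]
    return scope
```

With `--topic t1:0,1,2 --topic t2 --topic t3:2` the sketch would select
partitions 0, 1 and 2 of t1, all partitions of t2, and only partition 2 of
t3.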

On Thu, Feb 23, 2017 at 22:38, Matthias J. Sax (<ma...@confluent.io>)
wrote:

> Sounds reasonable.
>
> If we have multiple --topic arguments, it also does not matter whether we
> use t1:1,2 or t2=1,2
>
> I just suggested '=' because I wanted to use ':' to chain multiple topics.
>
>
> -Matthias
>
> On 2/23/17 10:49 AM, Jorge Esteban Quilcate Otoya wrote:
> > Yeap, `--topic t1=1,2` LGTM
> >
> > I don't have an idea either about getting rid of the repeated --topic,
> > but --group is also repeated in the case of deletion, so it could be ok
> > to have repeated --topic arguments.
> >
> > On Thu, Feb 23, 2017 at 19:14, Matthias J. Sax (<matthias@confluent.io>)
> > wrote:
> >
> >> So you suggest to merge "scope options" --topics, --topic, and
> >> --partitions into a single option? Sounds good to me.
> >>
> >> I like the compact way to express it, ie, topicname:list-of-partitions
> >> with "all partitions" if no partitions are specified. It's quite
> >> intuitive to use.
> >>
> >> Just wondering if we could get rid of the repeated --topic option; it's
> >> somewhat verbose. Have no good idea though how to improve it.
> >>
> >> If you concatenate multiple topics, we need one more character that is
> >> not allowed in topic names to separate the topics:
> >>
> >>> invalidChars = {'/', '\\', ',', '\u0000', ':', '"', '\'', ';', '*',
> >>> '?', ' ', '\t', '\r', '\n', '='};
> >>
> >> maybe
> >>
> >> --topics t1=1,2,3:t2:t3=3
> >>
> >> use '=' to specify partitions (instead of ':' as you proposed) and ':'
> >> to separate topics? All other characters seem to be worse to use to me.
> >> But maybe you have a better idea.
> >>
> >>
> >>
> >> -Matthias
> >>
> >>
> >> On 2/23/17 3:15 AM, Jorge Esteban Quilcate Otoya wrote:
> >>> @Matthias about the point 9:
> >>>
> >>> What about keeping only the --topic option, and support this format:
> >>>
> >>> `--topic t1:0,1,2 --topic t2 --topic t3:2`
> >>>
> >>> In this case topics t1, t2, and t3 will be selected: topic t1 with
> >>> partitions 0, 1 and 2; topic t2 with all its partitions; and topic t3
> >>> with only partition 2.
> >>>
> >>> Jorge.
> >>>
> >>> On Tue, Feb 21, 2017 at 11:11, Jorge Esteban Quilcate Otoya (<
> >>> quilcate.jorge@gmail.com>) wrote:
> >>>
> >>>> Thanks for the feedback Matthias.
> >>>>
> >>>> * 1. You're right. I'll reorder the scenarios.
> >>>>
> >>>> * 2. Agree. I'll update the KIP.
> >>>>
> >>>> * 3. I like it, updating to `reset-offsets`
> >>>>
> >>>> * 4. Agree, removing the `reset-` part
> >>>>
> >>>> * 5. Yes, the 1.e option without --execute or --export will print out
> >>>> the current offset and the new offset, which will be the same. The
> >>>> use case of this option is mostly to combine it with --export and
> >>>> keep a current 'checkpoint' to reset to later. I will add to the KIP
> >>>> what the output should look like.
> >>>>
> >>>> * 6. Considering 4., I will update it to `--to-offset`
> >>>>
> >>>> * 7. I like the idea to unify these options (plus, minus).
> >>>> `shift-offsets-by` is a good option, but I would like some more
> >>>> feedback here about the name. I will update the KIP in the meantime.
> >>>>
> >>>> * 8. Yes, discussed in 9.
> >>>>
> >>>> * 9. Agree. I'd love some feedback here. `topic` is already used by
> >>>> `delete`, and we can add `--all-topics` to consider all
> >>>> topics/partitions assigned to a group. How could we define specific
> >>>> topics/partitions?
> >>>>
> >>>> * 10. Haven't thought about it, but it makes sense.
> >>>> <topic>,<partition>,<offset> would be enough.
> >>>>
> >>>> * 11. Agree. Solved with 10.
> >>>>
> >>>> Also, I have a couple of changes to mention:
> >>>>
> >>>> 1. I have added a reference to the branch where I'm working on this KIP.
> >>>>
> >>>> 2. About the period scenario `--to-period`. I will change it to
> >>>> `--to-duration` given that duration (
> >>>> https://docs.oracle.com/javase/8/docs/api/java/time/Duration.html)
> >>>> follows this format: 'PnDTnHnMnS' and does not consider daylight
> >>>> saving effects.
> >>>>
> >>>>
> >>>>
> >>>> On Tue, Feb 21, 2017 at 2:47, Matthias J. Sax (<matthias@confluent.io>)
> >>>> wrote:
> >>>>
> >>>> Hi,
> >>>>
> >>>> thanks for updating the KIP. Couple of follow up comments:
> >>>>
> >>>> * Nit: Why is "Reset to Earliest" and "Reset to Latest" a "reset by
> >>>> time" option -- IMHO it belongs to "reset by position"?
> >>>>
> >>>>
> >>>> * Nit: Description of "Reset to Earliest"
> >>>>
> >>>>> using Kafka Consumer's `auto.offset.reset` to `earliest`
> >>>>
> >>>> I think this is strictly speaking not correct (as auto.offset.reset is
> >>>> only triggered if no valid offset is found, but this tool explicitly
> >>>> modifies the committed offset), and should be phrased as
> >>>>
> >>>>> using Kafka Consumer's #seekToBeginning()
> >>>>
> >>>> -> similar issue for description of "Reset to Latest"
> >>>>
> >>>>
> >>>> * Main option: rename to --reset-offsets (plural instead of singular)
> >>>>
> >>>>
> >>>> * Scenario Options: I would remove "reset" from all options, because
> the
> >>>> main argument "--reset-offset" says already what to do:
> >>>>
> >>>>> bin/kafka-consumer-groups.sh --reset-offset --reset-to-datetime XXX
> >>>>
> >>>> better (IMHO):
> >>>>
> >>>>> bin/kafka-consumer-groups.sh --reset-offsets --to-datetime XXX
> >>>>
> >>>>
> >>>>
> >>>> * Option 1.e ("print and export current offset") is not intuitive to
> use
> >>>> IMHO. The main option is "--reset-offset" but nothing happens if no
> >>>> scenario is specified. It is also not specified, what the output
> should
> >>>> look like?
> >>>>
> >>>> Furthermore, --describe should actually show currently committed
> offset
> >>>> for a group. So it seems to be redundant to have the same option in
> >>>> --reset-offsets
> >>>>
> >>>>
> >>>> * Option 2.a: I would rename to "--reset-to-offset" (or considering
> the
> >>>> comment above to "--to-offset")
> >>>>
> >>>>
> >>>> * Option 2.b and 2.c: I would unify to "--shift-offsets-by" (or
> similar)
> >>>> and accept positive/negative values
> >>>>
> >>>>
> >>>> * About Scope "all": maybe it's better to have an option
> "--all-topics"
> >>>> (or similar). IMHO explicit arguments are preferable to implicit
> >>>> settings, to guard against accidental misuse of the tool.
> >>>>
> >>>>
> >>>> * Scope: I also think, that "--topic" (singular) and "--topics"
> (plural)
> >>>> are too similar and easy to use in a wrong way (ie, mix up) -- maybe
> we
> >>>> can have two options that are easier to distinguish.
> >>>>
> >>>>
> >>>> * I still think that JSON is not the best format (it's too
> verbose/hard
> >>>> to write for humans from scratch). A simple CSV format with implicit
> >>>> schema (topic,partition,offset) would be sufficient.
> >>>>
> >>>>
> >>>> * Why does the JSON contain "group_id" field -- there is parameter
> >>>> "--group" to specify the group ID. Would one overwrite the other (what
> >>>> order) or would there be an error if "--group" is used in combination
> >>>> with "--reset-from-file"?
> >>>>
> >>>>
> >>>>
> >>>> -Matthias
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> On 2/17/17 6:43 AM, Jorge Esteban Quilcate Otoya wrote:
> >>>>> Hi,
> >>>>>
> >>>>> according to the feedback, I've updated the KIP:
> >>>>>
> >>>>> - We have added and ordered the scenarios, scopes and executions of
> the
> >>>>> Reset Offset tool.
> >>>>> - Consider it as an extension to the current `ConsumerGroupCommand`
> >> tool
> >>>>> - Execution will be possible without generating JSON files.
> >>>>>
> >>>>>
> >>>>
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-122%3A+Add+Reset+Consumer+Group+Offsets+tooling
> >>>>>
> >>>>> Looking forward to your feedback!
> >>>>>
> >>>>> Jorge.
> >>>>>
> >>>>> On Wed, Feb 8, 2017 at 23:23, Jorge Esteban Quilcate Otoya (<
> >>>>> quilcate.jorge@gmail.com>) wrote:
> >>>>>
> >>>>>> Great. I think I got the idea. What about this options:
> >>>>>>
> >>>>>> Scenarios:
> >>>>>>
> >>>>>> 1. Current status
> >>>>>>
> >>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1´
> >>>>>>
> >>>>>> 2. To Datetime
> >>>>>>
> >>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1
> >> --reset-to-datetime
> >>>>>> 2017-01-01T00:00:00.000´
> >>>>>>
> >>>>>> 3. To Period
> >>>>>>
> >>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1
> --reset-to-period
> >>>> P2D´
> >>>>>>
> >>>>>> 4. To Earliest
> >>>>>>
> >>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1
> >>>> --reset-to-earliest´
> >>>>>>
> >>>>>> 5. To Latest
> >>>>>>
> >>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1
> >> --reset-to-latest´
> >>>>>>
> >>>>>> 6. Minus 'n' offsets
> >>>>>>
> >>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-minus
> n´
> >>>>>>
> >>>>>> 7. Plus 'n' offsets
> >>>>>>
> >>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-plus n´
> >>>>>>
> >>>>>> 8. To specific offset
> >>>>>>
> >>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to x´
> >>>>>>
> >>>>>> Scopes:
> >>>>>>
> >>>>>> a. All topics used by Consumer Group
> >>>>>>
> >>>>>> Don't specify --topics
> >>>>>>
> >>>>>> b. Specific List of Topics
> >>>>>>
> >>>>>> Add list of values in --topics t1,t2,tn
> >>>>>>
> >>>>>> c. One Topic, all Partitions
> >>>>>>
> >>>>>> Add one topic and no partitions values: --topic t1
> >>>>>>
> >>>>>> d. One Topic, List of Partitions
> >>>>>>
> >>>>>> Add one topic and partitions values: --topic t1 --partitions 0,1,2
> >>>>>>
> >>>>>> About Reset Plan (JSON file):
> >>>>>>
> >>>>>> I think is still valid to have the option to persist reset
> >> configuration
> >>>>>> as a file, but I agree to give the option to run the tool without
> >> going
> >>>>>> down to the JSON file.
> >>>>>>
> >>>>>> Execution options:
> >>>>>>
> >>>>>> 1. Without execution argument (No args):
> >>>>>>
> >>>>>> Print out results (reset plan)
> >>>>>>
> >>>>>> 2. With --execute argument:
> >>>>>>
> >>>>>> Run reset process
> >>>>>>
> >>>>>> 3. With --output argument:
> >>>>>>
> >>>>>> Save result in a JSON format.
> >>>>>>
> >>>>>> 4. Only with --execute option and --reset-file (path to JSON)
> >>>>>>
> >>>>>> Reset based on file
> >>>>>>
> >>>>>> 4. Only with --verify option and --reset-file (path to JSON)
> >>>>>>
> >>>>>> Verify file values with current offsets
> >>>>>>
> >>>>>> I think we can remove --generate-and-execute because is a bit
> clumsy.
> >>>>>>
> >>>>>> With this options we will be able to execute with manual JSON
> >>>>>> configuration.
> >>>>>>
> >>>>>>
> >>>>>> On Wed, Feb 8, 2017 at 22:43, Ben Stopford (<be...@confluent.io>)
> >>>>>> wrote:
> >>>>>>
> >>>>>> Yes - using a tool like this to skip a set of consumer groups over a
> >>>>>> corrupt/bad message is definitely appealing.
> >>>>>>
> >>>>>> B
> >>>>>>
> >>>>>> On Wed, Feb 8, 2017 at 9:37 PM Gwen Shapira <gw...@confluent.io>
> >> wrote:
> >>>>>>
> >>>>>>> I like the --reset-to-earliest and --reset-to-latest. In general,
> >>>>>>> since the JSON route is the most challenging for users, we want to
> >>>>>>> provide a lot of ways to do useful things without going there.
> >>>>>>>
> >>>>>>> Two things that can help:
> >>>>>>>
> >>>>>>> 1. A lot of times, users want to skip few messages that cause
> issues
> >>>>>>> and continue. maybe just specifying the topic, partition and delta
> >>>>>>> will be better than having to find the offset and write a JSON and
> >>>>>>> validate the JSON etc.
> >>>>>>>
> >>>>>>> 2. Thinking if there are other common use-cases that we can make
> easy
> >>>>>>> rather than just one generic but not very usable method.
> >>>>>>>
> >>>>>>> Gwen
> >>>>>>>
> >>>>>>> On Wed, Feb 8, 2017 at 3:25 AM, Jorge Esteban Quilcate Otoya
> >>>>>>> <qu...@gmail.com> wrote:
> >>>>>>>> Thanks for the feedback!
> >>>>>>>>
> >>>>>>>> @Onur, @Gwen:
> >>>>>>>>
> >>>>>>>> Agree. Actually at the first draft I considered to have it inside
> >>>>>>>> ´kafka-consumer-groups.sh´, but I decide to propose it as a
> >> standalone
> >>>>>>> tool
> >>>>>>>> to describe it clearly and focus it on reset functionality.
> >>>>>>>>
> >>>>>>>> But now that you mentioned, it does make sense to have it in
> >>>>>>>> ´kafka-consumer-groups.sh´. How would be a consistent way to
> >> introduce
> >>>>>>> it?
> >>>>>>>>
> >>>>>>>> Maybe something like this:
> >>>>>>>>
> >>>>>>>> ´kafka-consumer-groups.sh --reset-offset --generate --group cg1
> >>>>>> --topics
> >>>>>>> t1
> >>>>>>>> --reset-from 2017-01-01T00:00:00.000 --output plan.json´
> >>>>>>>>
> >>>>>>>> ´kafka-consumer-groups.sh --reset-offset --verify
> --reset-json-file
> >>>>>>>> plan.json´
> >>>>>>>>
> >>>>>>>> ´kafka-consumer-groups.sh --reset-offset --execute
> --reset-json-file
> >>>>>>>> plan.json´
> >>>>>>>>
> >>>>>>>> ´kafka-consumer-groups.sh --reset-offset --generate-and-execute
> >>>> --group
> >>>>>>> cg1
> >>>>>>>> --topics t1 --reset-from 2017-01-01T00:00:00.000´
> >>>>>>>>
> >>>>>>>> @Gwen:
> >>>>>>>>
> >>>>>>>>> It looks exactly like the replica assignment tool
> >>>>>>>>
> >>>>>>>> It was influenced by ;-) I use the generate-verify-execute process
> >>>> here
> >>>>>>> to
> >>>>>>>> make sure user will be aware of the result of this operation. At
> the
> >>>>>>>> beginning we considered only add a couple of options to Consumer
> >> Group
> >>>>>>>> Command:
> >>>>>>>>
> >>>>>>>> --rewind-to-timestamp and --rewind-to-period
> >>>>>>>>
> >>>>>>>> @Onur:
> >>>>>>>>
> >>>>>>>>> You can actually get away with overriding while members of the
> >> group
> >>>>>>> are live
> >>>>>>>> with method 2 by using group information from
> DescribeGroupsRequest.
> >>>>>>>>
> >>>>>>>> This means that we need to have Consumer Group stopped before
> >>>> executing
> >>>>>>> and
> >>>>>>>> start a new consumer internally to do this? Therefore, we won't be
> >>>> able
> >>>>>>> to
> >>>>>>>> consider executing reset when ConsumerGroup is active? (trying to
> >>>>>> relate
> >>>>>>> it
> >>>>>>>> with @Dong 5th question)
> >>>>>>>>
> >>>>>>>> @Dong:
> >>>>>>>>
> >>>>>>>>> Should we allow user to use wildcard to reset offset of all
> groups
> >>>>>> for a
> >>>>>>>> given topic as well?
> >>>>>>>>
> >>>>>>>> I haven't thought about this scenario. Could be interesting.
> >> Following
> >>>>>>> the
> >>>>>>>> recommendation to add it into Consumer Group Command, in this case
> >>>>>> Group
> >>>>>>>> argument will be optional if there are only 1 topic. I think for
> >>>>>> multiple
> >>>>>>>> topic won't be that useful.
> >>>>>>>>
> >>>>>>>>> Should we allow user to specify timestamp per topic partition in
> >> the
> >>>>>>> json
> >>>>>>>> file as well?
> >>>>>>>>
> >>>>>>>> Don't think this could be a valid from the tool, but if Reset Plan
> >> is
> >>>>>>>> generated, and user want to set the offset for a specific
> partition
> >> to
> >>>>>>>> other offset (eventually based on another timestamp), and execute
> >> it,
> >>>>>> it
> >>>>>>>> will be up to her/him.
> >>>>>>>>
> >>>>>>>>> Should the script take some credential file to make sure that
> this
> >>>>>>>> operation is authenticated given the potential impact of this
> >>>>>> operation?
> >>>>>>>>
> >>>>>>>> Haven't tried to secure brokers yet, but the tool should support
> >>>>>>>> authorization if it's enabled in the broker.
> >>>>>>>>
> >>>>>>>>> Should we provide constant to reset committed offset to
> >>>>>> earliest/latest
> >>>>>>>> offset of a partition, e.g. -1 indicates earliest offset and -2
> >>>>>> indicates
> >>>>>>>> latest offset.
> >>>>>>>>
> >>>>>>>> I will go for something like ´--reset-to-earliest´ and
> >>>>>>> ´--reset-to-latest´
> >>>>>>>>
> >>>>>>>>> Should we allow dynamic change of the comitted offset when
> consumer
> >>>>>> are
> >>>>>>>> running, such that consumer will seek to the newly committed
> offset
> >>>> and
> >>>>>>>> start consuming from there?
> >>>>>>>>
> >>>>>>>> Not sure about this. I will recommend to keep it simple and ask
> user
> >>>> to
> >>>>>>>> stop consumers first. But I would considered it if the trade-offs
> >> are
> >>>>>>>> clear.
> >>>>>>>>
> >>>>>>>> @Matthias
> >>>>>>>>
> >>>>>>>> Added :). And thanks a lot for your help to define this KIP!
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Wed, Feb 8, 2017 at 7:47, Gwen Shapira (<gwen@confluent.io>)
> >>>>>>>> wrote:
> >>>>>>>>
> >>>>>>>>> As long as the CLI is a bit consistent? Like, not just adding 3
> >>>>>>>>> arguments and a JSON parser to the existing tool, right?
> >>>>>>>>>
> >>>>>>>>> On Tue, Feb 7, 2017 at 10:29 PM, Onur Karaman
> >>>>>>>>> <on...@gmail.com> wrote:
> >>>>>>>>>> I think it makes sense to just add the feature to
> >>>>>>>>> kafka-consumer-groups.sh
> >>>>>>>>>>
> >>>>>>>>>> On Tue, Feb 7, 2017 at 10:24 PM, Gwen Shapira <
> gwen@confluent.io>
> >>>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Thanks for the KIP. I'm super happy about adding the
> capability.
> >>>>>>>>>>>
> >>>>>>>>>>> I hate the interface, though. It looks exactly like the replica
> >>>>>>>>>>> assignment tool. A tool everyone loves so much that there are
> >>>>>>> multiple
> >>>>>>>>>>> projects, open and closed, that try to fix it.
> >>>>>>>>>>>
> >>>>>>>>>>> Can we swap it with something that looks a bit more like the
> >>>>>> consumer
> >>>>>>>>>>> group tool? or the kafka streams reset tool? Consistency is
> >> helpful
> >>>>>>> in
> >>>>>>>>>>> such cases. I spent some time learning existing tools and
> >> learning
> >>>>>>> yet
> >>>>>>>>>>> another one is a deterrent.
> >>>>>>>>>>>
> >>>>>>>>>>> Gwen
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> On Tue, Feb 7, 2017 at 6:43 PM, Jorge Esteban Quilcate Otoya
> >>>>>>>>>>> <qu...@gmail.com> wrote:
> >>>>>>>>>>>> Hi all,
> >>>>>>>>>>>>
> >>>>>>>>>>>> I would like to propose a KIP to Add a tool to Reset Consumer
> >>>>>> Group
> >>>>>>>>>>> Offsets.
> >>>>>>>>>>>>
> >>>>>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> >>>>>>>>>>> 122%3A+Add+a+tool+to+Reset+Consumer+Group+Offsets
> >>>>>>>>>>>>
> >>>>>>>>>>>> Please, take a look at the proposal and share your feedback.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Thanks,
> >>>>>>>>>>>> Jorge.
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> --
> >>>>>>>>>>> Gwen Shapira
> >>>>>>>>>>> Product Manager | Confluent
> >>>>>>>>>>> 650.450.2760 | @gwenshap
> >>>>>>>>>>> Follow us: Twitter | blog
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> --
> >>>>>>>>> Gwen Shapira
> >>>>>>>>> Product Manager | Confluent
> >>>>>>>>> 650.450.2760 | @gwenshap
> >>>>>>>>> Follow us: Twitter | blog
> >>>>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> --
> >>>>>>> Gwen Shapira
> >>>>>>> Product Manager | Confluent
> >>>>>>> 650.450.2760 | @gwenshap
> >>>>>>> Follow us: Twitter | blog
> >>>>>>>
> >>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>>
> >>>
> >>
> >>
> >
>
>

Re: KIP-122: Add a tool to Reset Consumer Group Offsets

Posted by "Matthias J. Sax" <ma...@confluent.io>.
Sounds reasonable.

If we have multiple --topic arguments, it also does not matter whether we
use t1:1,2 or t1=1,2.

I only suggested '=' because I wanted to use ':' to chain multiple topics.


-Matthias
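
To illustrate the point about the separator: with one topic per --topic
argument, a parser can accept either ':' or '=' because neither character is
allowed in a topic name. This is only a rough sketch, not code from the KIP
branch; the function name `parse_topic_arg` is hypothetical.

```python
import re

# Assumption for illustration: one topic per value, optional partition list,
# separated by either ':' or '=' (both are invalid in topic names).
_TOPIC_ARG = re.compile(r"([^:=]+)(?:[:=](\d+(?:,\d+)*))?")


def parse_topic_arg(value):
    """Split one --topic value into (topic, partitions).

    partitions is None when no list is given, meaning "all partitions".
    """
    match = _TOPIC_ARG.fullmatch(value)
    if match is None:
        raise ValueError(f"invalid --topic value: {value!r}")
    topic, partitions = match.groups()
    if partitions is None:
        return topic, None
    return topic, [int(p) for p in partitions.split(",")]
```

Both `parse_topic_arg("t1:1,2")` and `parse_topic_arg("t1=1,2")` give the same
result, so the choice only matters once topics are chained in one argument.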

On 2/23/17 10:49 AM, Jorge Esteban Quilcate Otoya wrote:
> Yep, `--topic t1=1,2` LGTM.
> 
> I don't have an idea either for getting rid of the repeated --topic, but
> --group is also repeated in the case of deletion, so it could be OK to
> have repeated --topic arguments.
> 
> On Thu, Feb 23, 2017 at 19:14, Matthias J. Sax (<ma...@confluent.io>)
> wrote:
> 
>> So you suggest merging the "scope options" --topics, --topic, and
>> --partitions into a single option? Sounds good to me.
>>
>> I like the compact way to express it, i.e., topicname:list-of-partitions
>> with "all partitions" if no partitions are specified. It's quite
>> intuitive to use.
>>
>> Just wondering if we could get rid of the repeated --topic option; it's
>> somewhat verbose. I have no good idea how to improve it, though.
>>
>> If you concatenate multiple topics, we need one more character that is
>> not allowed in topic names to separate the topics:
>>
>>> invalidChars = {'/', '\\', ',', '\u0000', ':', '"', '\'', ';', '*',
>> '?', ' ', '\t', '\r', '\n', '='};
>>
>> maybe
>>
>> --topics t1=1,2,3:t2:t3=3
>>
>> use '=' to specify partitions (instead of ':' as you proposed) and ':'
>> to separate topics? All other characters seem to be worse to use to me.
>> But maybe you have a better idea.
>>
>>
>>
>> -Matthias
>>
>>
>> On 2/23/17 3:15 AM, Jorge Esteban Quilcate Otoya wrote:
>>> @Matthias about the point 9:
>>>
>>> What about keeping only the --topic option, and support this format:
>>>
>>> `--topic t1:0,1,2 --topic t2 --topic t3:2`
>>>
>>> In this case topics t1, t2, and t3 will be selected: topic t1 with
>>> partitions 0,1 and 2; topic t2 with all its partitions; and topic t3,
>> with
>>> only partition 2.
>>>
>>> Jorge.
>>>
>>> On Tue, Feb 21, 2017 at 11:11, Jorge Esteban Quilcate Otoya (<
>>> quilcate.jorge@gmail.com>) wrote:
>>>
>>>> Thanks for the feedback Matthias.
>>>>
>>>> * 1. You're right. I'll reorder the scenarios.
>>>>
>>>> * 2. Agree. I'll update the KIP.
>>>>
>>>> * 3. I like it, updating to `reset-offsets`
>>>>
>>>> * 4. Agree, removing the `reset-` part
>>>>
>>>> * 5. Yes, 1.e option without --execute or --export will print out
>> current
>>>> offset, and the new offset, that will be the same. The use-case of this
>>>> option is to use it in combination with --export mostly and have a
>> current
>>>> 'checkpoint' to reset later. I will add to the KIP what the output should
>>>> look like.
>>>>
>>>> * 6. Considering 4., I will update it to `--to-offset`
>>>>
>>>> * 7. I like the idea of unifying these options (plus, minus).
>>>> `shift-offsets-by` is a good option, but I would like some more feedback
>>>> here about the name. I will update the KIP in the meantime.
>>>>
>>>> * 8. Yes, discussed in 9.
>>>>
>>>> * 9. Agree. I'll love some feedback here. `topic` is already used by
>>>> `delete`, and we can add `--all-topics` to consider all
>> topics/partitions
>>>> assigned to a group. How could we define specific topics/partitions?
>>>>
>>>> * 10. I hadn't thought about it, but it makes sense.
>>>> <topic>,<partition>,<offset> would be enough.
>>>>
>>>> * 11. Agree. Solved with 10.
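
As a rough illustration of point 10, a reset plan in the implicit-schema CSV
form <topic>,<partition>,<offset> could be read and written like this. This is
a sketch under assumptions, not the tool's implementation: the function names
are hypothetical, and there is no header row and no group id in the file.

```python
import csv
import io


def read_reset_plan(text):
    """Parse 'topic,partition,offset' rows into {(topic, partition): offset}."""
    plan = {}
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # tolerate blank lines
        topic, partition, offset = row  # implicit schema, exactly three fields
        plan[(topic, int(partition))] = int(offset)
    return plan


def write_reset_plan(plan):
    """Serialize a plan back to CSV, sorted by topic and partition."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for (topic, partition), offset in sorted(plan.items()):
        writer.writerow([topic, partition, offset])
    return out.getvalue()
```

A file with the two lines `t1,0,42` and `t1,1,17` is short enough to write by
hand, which is the argument for CSV over JSON here.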
>>>>
>>>> Also, I have a couple of changes to mention:
>>>>
>>>> 1. I have added a reference to the branch where I'm working on this KIP.
>>>>
>>>> 2. About the period scenario `--to-period`. I will change it to
>>>> `--to-duration` given that duration (
>>>> https://docs.oracle.com/javase/8/docs/api/java/time/Duration.html)
>>>> follows this format: 'PnDTnHnMnS' and does not consider daylight saving
>>>> effects.
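
To make the duration point concrete: a value such as P2D is a fixed 48 hours,
independent of calendar or daylight saving rules. The sketch below parses a
simplified subset of the 'PnDTnHnMnS' format (whole, non-negative numbers only)
and assumes the reset target is "now minus the duration"; both the helper names
and that interpretation are assumptions for illustration.

```python
import re
from datetime import datetime, timedelta, timezone

# Simplified subset of 'PnDTnHnMnS': integer components, no sign, no fractions.
_DURATION = re.compile(
    r"P(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?)?"
)


def parse_duration(text):
    """Convert a 'PnDTnHnMnS' string into a fixed-length timedelta."""
    match = _DURATION.fullmatch(text)
    if match is None or not any(match.groupdict().values()):
        raise ValueError(f"invalid duration: {text!r}")
    parts = {k: int(v) for k, v in match.groupdict().items() if v is not None}
    return timedelta(**parts)


def reset_timestamp_ms(duration_text, now):
    """Epoch milliseconds of 'now minus duration' (assumed reset target)."""
    target = now - parse_duration(duration_text)
    return int(target.timestamp() * 1000)
```

The resulting timestamp would then be used to look up offsets per partition,
which is outside the scope of this sketch.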
>>>>
>>>>
>>>>
>>>> On Tue, Feb 21, 2017 at 2:47, Matthias J. Sax (<matthias@confluent.io>)
>>>> wrote:
>>>>
>>>> Hi,
>>>>
>>>> thanks for updating the KIP. Couple of follow up comments:
>>>>
>>>> * Nit: Why is "Reset to Earliest" and "Reset to Latest" a "reset by
>>>> time" option -- IMHO it belongs to "reset by position"?
>>>>
>>>>
>>>> * Nit: Description of "Reset to Earliest"
>>>>
>>>>> using Kafka Consumer's `auto.offset.reset` to `earliest`
>>>>
>>>> I think this is strictly speaking not correct (auto.offset.reset is only
>>>> triggered if no valid offset is found, but this tool explicitly modifies
>>>> the committed offset), and should be phrased as
>>>>
>>>>> using Kafka Consumer's #seekToBeginning()
>>>>
>>>> -> similar issue for description of "Reset to Latest"
>>>>
>>>>
>>>> * Main option: rename to --reset-offsets (plural instead of singular)
>>>>
>>>>
>>>> * Scenario Options: I would remove "reset" from all options, because the
>>>> main argument "--reset-offset" says already what to do:
>>>>
>>>>> bin/kafka-consumer-groups.sh --reset-offset --reset-to-datetime XXX
>>>>
>>>> better (IMHO):
>>>>
>>>>> bin/kafka-consumer-groups.sh --reset-offsets --to-datetime XXX
>>>>
>>>>
>>>>
>>>> * Option 1.e ("print and export current offset") is not intuitive to use
>>>> IMHO. The main option is "--reset-offset" but nothing happens if no
>>>> scenario is specified. It is also not specified, what the output should
>>>> look like?
>>>>
>>>> Furthermore, --describe should actually show currently committed offset
>>>> for a group. So it seems to be redundant to have the same option in
>>>> --reset-offsets
>>>>
>>>>
>>>> * Option 2.a: I would rename to "--reset-to-offset" (or considering the
>>>> comment above to "--to-offset")
>>>>
>>>>
>>>> * Option 2.b and 2.c: I would unify to "--shift-offsets-by" (or similar)
>>>> and accept positive/negative values
>>>>
>>>>
>>>> * About Scope "all": maybe it's better to have an option "--all-topics"
>>>> (or similar). IMHO explicit arguments are preferable to implicit
>>>> settings, to guard against accidental misuse of the tool.
>>>>
>>>>
>>>> * Scope: I also think, that "--topic" (singular) and "--topics" (plural)
>>>> are too similar and easy to use in a wrong way (ie, mix up) -- maybe we
>>>> can have two options that are easier to distinguish.
>>>>
>>>>
>>>> * I still think that JSON is not the best format (it's too verbose/hard
>>>> to write for humans from scratch). A simple CSV format with implicit
>>>> schema (topic,partition,offset) would be sufficient.
>>>>
>>>>
>>>> * Why does the JSON contain "group_id" field -- there is parameter
>>>> "--group" to specify the group ID. Would one overwrite the other (what
>>>> order) or would there be an error if "--group" is used in combination
>>>> with "--reset-from-file"?
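
On the suggestion above to unify 2.b and 2.c into a single "--shift-offsets-by"
that accepts positive and negative values, the arithmetic could look like the
sketch below. Clamping to the partition's earliest and latest offsets is an
assumption added here for illustration; the thread does not specify what
happens when the shifted offset falls out of range.

```python
def shift_offset(current, shift, earliest, latest):
    """Return the committed offset moved by `shift` (may be negative).

    Assumption: the result is clamped to [earliest, latest] so that the new
    offset always points inside the partition's available range.
    """
    return max(earliest, min(latest, current + shift))
```

For example, shifting offset 100 by -10 gives 90, while shifting offset 5 by
-10 in a partition whose earliest offset is 0 gives 0 under this assumption.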
>>>>
>>>>
>>>>
>>>> -Matthias
>>>>
>>>>
>>>>
>>>>
>>>> On 2/17/17 6:43 AM, Jorge Esteban Quilcate Otoya wrote:
>>>>> Hi,
>>>>>
>>>>> according to the feedback, I've updated the KIP:
>>>>>
>>>>> - We have added and ordered the scenarios, scopes and executions of the
>>>>> Reset Offset tool.
>>>>> - Consider it as an extension to the current `ConsumerGroupCommand`
>> tool
>>>>> - Execution will be possible without generating JSON files.
>>>>>
>>>>>
>>>>
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-122%3A+Add+Reset+Consumer+Group+Offsets+tooling
>>>>>
>>>>> Looking forward to your feedback!
>>>>>
>>>>> Jorge.
>>>>>
>>>>> On Wed, Feb 8, 2017 at 23:23, Jorge Esteban Quilcate Otoya (<
>>>>> quilcate.jorge@gmail.com>) wrote:
>>>>>
>>>>>> Great. I think I got the idea. What about these options:
>>>>>>
>>>>>> Scenarios:
>>>>>>
>>>>>> 1. Current status
>>>>>>
>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1´
>>>>>>
>>>>>> 2. To Datetime
>>>>>>
>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1
>> --reset-to-datetime
>>>>>> 2017-01-01T00:00:00.000´
>>>>>>
>>>>>> 3. To Period
>>>>>>
>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-period
>>>> P2D´
>>>>>>
>>>>>> 4. To Earliest
>>>>>>
>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1
>>>> --reset-to-earliest´
>>>>>>
>>>>>> 5. To Latest
>>>>>>
>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1
>> --reset-to-latest´
>>>>>>
>>>>>> 6. Minus 'n' offsets
>>>>>>
>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-minus n´
>>>>>>
>>>>>> 7. Plus 'n' offsets
>>>>>>
>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-plus n´
>>>>>>
>>>>>> 8. To specific offset
>>>>>>
>>>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to x´
>>>>>>
>>>>>> Scopes:
>>>>>>
>>>>>> a. All topics used by Consumer Group
>>>>>>
>>>>>> Don't specify --topics
>>>>>>
>>>>>> b. Specific List of Topics
>>>>>>
>>>>>> Add list of values in --topics t1,t2,tn
>>>>>>
>>>>>> c. One Topic, all Partitions
>>>>>>
>>>>>> Add one topic and no partitions values: --topic t1
>>>>>>
>>>>>> d. One Topic, List of Partitions
>>>>>>
>>>>>> Add one topic and partitions values: --topic t1 --partitions 0,1,2
>>>>>>
>>>>>> About Reset Plan (JSON file):
>>>>>>
>>>>>> I think is still valid to have the option to persist reset
>> configuration
>>>>>> as a file, but I agree to give the option to run the tool without
>> going
>>>>>> down to the JSON file.
>>>>>>
>>>>>> Execution options:
>>>>>>
>>>>>> 1. Without execution argument (No args):
>>>>>>
>>>>>> Print out results (reset plan)
>>>>>>
>>>>>> 2. With --execute argument:
>>>>>>
>>>>>> Run reset process
>>>>>>
>>>>>> 3. With --output argument:
>>>>>>
>>>>>> Save result in a JSON format.
>>>>>>
>>>>>> 4. Only with --execute option and --reset-file (path to JSON)
>>>>>>
>>>>>> Reset based on file
>>>>>>
>>>>>> 4. Only with --verify option and --reset-file (path to JSON)
>>>>>>
>>>>>> Verify file values with current offsets
>>>>>>
>>>>>> I think we can remove --generate-and-execute because is a bit clumsy.
>>>>>>
>>>>>> With this options we will be able to execute with manual JSON
>>>>>> configuration.
>>>>>>
>>>>>>
>>>>>> On Wed, Feb 8, 2017 at 22:43, Ben Stopford (<be...@confluent.io>)
>>>>>> wrote:
>>>>>>
>>>>>> Yes - using a tool like this to skip a set of consumer groups over a
>>>>>> corrupt/bad message is definitely appealing.
>>>>>>
>>>>>> B
>>>>>>
>>>>>> On Wed, Feb 8, 2017 at 9:37 PM Gwen Shapira <gw...@confluent.io>
>> wrote:
>>>>>>
>>>>>>> I like the --reset-to-earliest and --reset-to-latest. In general,
>>>>>>> since the JSON route is the most challenging for users, we want to
>>>>>>> provide a lot of ways to do useful things without going there.
>>>>>>>
>>>>>>> Two things that can help:
>>>>>>>
>>>>>>> 1. A lot of times, users want to skip few messages that cause issues
>>>>>>> and continue. maybe just specifying the topic, partition and delta
>>>>>>> will be better than having to find the offset and write a JSON and
>>>>>>> validate the JSON etc.
>>>>>>>
>>>>>>> 2. Thinking if there are other common use-cases that we can make easy
>>>>>>> rather than just one generic but not very usable method.
>>>>>>>
>>>>>>> Gwen
>>>>>>>
>>>>>>> On Wed, Feb 8, 2017 at 3:25 AM, Jorge Esteban Quilcate Otoya
>>>>>>> <qu...@gmail.com> wrote:
>>>>>>>> Thanks for the feedback!
>>>>>>>>
>>>>>>>> @Onur, @Gwen:
>>>>>>>>
>>>>>>>> Agree. Actually at the first draft I considered to have it inside
>>>>>>>> ´kafka-consumer-groups.sh´, but I decide to propose it as a
>> standalone
>>>>>>> tool
>>>>>>>> to describe it clearly and focus it on reset functionality.
>>>>>>>>
>>>>>>>> But now that you mentioned, it does make sense to have it in
>>>>>>>> ´kafka-consumer-groups.sh´. How would be a consistent way to
>> introduce
>>>>>>> it?
>>>>>>>>
>>>>>>>> Maybe something like this:
>>>>>>>>
>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --generate --group cg1
>>>>>> --topics
>>>>>>> t1
>>>>>>>> --reset-from 2017-01-01T00:00:00.000 --output plan.json´
>>>>>>>>
>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --verify --reset-json-file
>>>>>>>> plan.json´
>>>>>>>>
>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --execute --reset-json-file
>>>>>>>> plan.json´
>>>>>>>>
>>>>>>>> ´kafka-consumer-groups.sh --reset-offset --generate-and-execute
>>>> --group
>>>>>>> cg1
>>>>>>>> --topics t1 --reset-from 2017-01-01T00:00:00.000´
>>>>>>>>
>>>>>>>> @Gwen:
>>>>>>>>
>>>>>>>>> It looks exactly like the replica assignment tool
>>>>>>>>
>>>>>>>> It was influenced by ;-) I use the generate-verify-execute process
>>>> here
>>>>>>> to
>>>>>>>> make sure user will be aware of the result of this operation. At the
>>>>>>>> beginning we considered only add a couple of options to Consumer
>> Group
>>>>>>>> Command:
>>>>>>>>
>>>>>>>> --rewind-to-timestamp and --rewind-to-period
>>>>>>>>
>>>>>>>> @Onur:
>>>>>>>>
>>>>>>>>> You can actually get away with overriding while members of the
>> group
>>>>>>> are live
>>>>>>>> with method 2 by using group information from DescribeGroupsRequest.
>>>>>>>>
>>>>>>>> This means that we need to have Consumer Group stopped before
>>>> executing
>>>>>>> and
>>>>>>>> start a new consumer internally to do this? Therefore, we won't be
>>>> able
>>>>>>> to
>>>>>>>> consider executing reset when ConsumerGroup is active? (trying to
>>>>>> relate
>>>>>>> it
>>>>>>>> with @Dong 5th question)
>>>>>>>>
>>>>>>>> @Dong:
>>>>>>>>
>>>>>>>>> Should we allow user to use wildcard to reset offset of all groups
>>>>>> for a
>>>>>>>> given topic as well?
>>>>>>>>
>>>>>>>> I haven't thought about this scenario. Could be interesting.
>> Following
>>>>>>> the
>>>>>>>> recommendation to add it into Consumer Group Command, in this case
>>>>>> Group
>>>>>>>> argument will be optional if there are only 1 topic. I think for
>>>>>> multiple
>>>>>>>> topic won't be that useful.
>>>>>>>>
>>>>>>>>> Should we allow user to specify timestamp per topic partition in
>> the
>>>>>>> json
>>>>>>>> file as well?
>>>>>>>>
>>>>>>>> Don't think this could be a valid from the tool, but if Reset Plan
>> is
>>>>>>>> generated, and user want to set the offset for a specific partition
>> to
>>>>>>>> other offset (eventually based on another timestamp), and execute
>> it,
>>>>>> it
>>>>>>>> will be up to her/him.
>>>>>>>>
>>>>>>>>> Should the script take some credential file to make sure that this
>>>>>>>> operation is authenticated given the potential impact of this
>>>>>> operation?
>>>>>>>>
>>>>>>>> Haven't tried to secure brokers yet, but the tool should support
>>>>>>>> authorization if it's enabled in the broker.
>>>>>>>>
>>>>>>>>> Should we provide constant to reset committed offset to
>>>>>> earliest/latest
>>>>>>>> offset of a partition, e.g. -1 indicates earliest offset and -2
>>>>>> indicates
>>>>>>>> latest offset.
>>>>>>>>
>>>>>>>> I will go for something like ´--reset-to-earliest´ and
>>>>>>> ´--reset-to-latest´
>>>>>>>>
>>>>>>>>>> Should we allow dynamic change of the committed offset when consumer
>>>>>> are
>>>>>>>> running, such that consumer will seek to the newly committed offset
>>>> and
>>>>>>>> start consuming from there?
>>>>>>>>
>>>>>>>> Not sure about this. I will recommend to keep it simple and ask user
>>>> to
>>>>>>>> stop consumers first. But I would considered it if the trade-offs
>> are
>>>>>>>> clear.
>>>>>>>>
>>>>>>>> @Matthias
>>>>>>>>
>>>>>>>> Added :). And thanks a lot for your help to define this KIP!
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Feb 8, 2017 at 7:47, Gwen Shapira (<gw...@confluent.io>)
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> As long as the CLI is a bit consistent? Like, not just adding 3
>>>>>>>>> arguments and a JSON parser to the existing tool, right?
>>>>>>>>>
>>>>>>>>> On Tue, Feb 7, 2017 at 10:29 PM, Onur Karaman
>>>>>>>>> <on...@gmail.com> wrote:
>>>>>>>>>> I think it makes sense to just add the feature to
>>>>>>>>> kafka-consumer-groups.sh
>>>>>>>>>>
>>>>>>>>>> On Tue, Feb 7, 2017 at 10:24 PM, Gwen Shapira <gw...@confluent.io>
>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Thanks for the KIP. I'm super happy about adding the capability.
>>>>>>>>>>>
>>>>>>>>>>> I hate the interface, though. It looks exactly like the replica
>>>>>>>>>>> assignment tool. A tool everyone loves so much that there are
>>>>>>> multiple
>>>>>>>>>>> projects, open and closed, that try to fix it.
>>>>>>>>>>>
>>>>>>>>>>> Can we swap it with something that looks a bit more like the
>>>>>> consumer
>>>>>>>>>>> group tool? or the kafka streams reset tool? Consistency is
>> helpful
>>>>>>> in
>>>>>>>>>>> such cases. I spent some time learning existing tools and
>> learning
>>>>>>> yet
>>>>>>>>>>> another one is a deterrent.
>>>>>>>>>>>
>>>>>>>>>>> Gwen
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Feb 7, 2017 at 6:43 PM, Jorge Esteban Quilcate Otoya
>>>>>>>>>>> <qu...@gmail.com> wrote:
>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>
>>>>>>>>>>>> I would like to propose a KIP to Add a tool to Reset Consumer
>>>>>> Group
>>>>>>>>>>> Offsets.
>>>>>>>>>>>>
>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
>>>>>>>>>>> 122%3A+Add+a+tool+to+Reset+Consumer+Group+Offsets
>>>>>>>>>>>>
>>>>>>>>>>>> Please, take a look at the proposal and share your feedback.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Jorge.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Gwen Shapira
>>>>>>>>>>> Product Manager | Confluent
>>>>>>>>>>> 650.450.2760 <(650)%20450-2760> <(650)%20450-2760>
>> <(650)%20450-2760>
>>>> <(650)%20450-2760>
>>>>>> <(650)%20450-2760> | @gwenshap
>>>>>>>>>>> Follow us: Twitter | blog
>>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Gwen Shapira
>>>>>>>>> Product Manager | Confluent
>>>>>>>>> 650.450.2760 <(650)%20450-2760> <(650)%20450-2760>
>> <(650)%20450-2760>
>>>> <(650)%20450-2760>
>>>>>> <(650)%20450-2760> | @gwenshap
>>>>>>>>> Follow us: Twitter | blog
>>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Gwen Shapira
>>>>>>> Product Manager | Confluent
>>>>>>> 650.450.2760 <(650)%20450-2760> <(650)%20450-2760>
>> <(650)%20450-2760> <(650)%20450-2760>
>>>> | @gwenshap
>>>>>>> Follow us: Twitter | blog
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>>
> 


Re: KIP-122: Add a tool to Reset Consumer Group Offsets

Posted by Jorge Esteban Quilcate Otoya <qu...@gmail.com>.
Yep, `--topic t1=1,2` LGTM.

I don't have a good idea either for getting rid of the repeated --topic, but
--group is also repeated in the case of deletion, so it could be OK to have
repeated --topic arguments.
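
As a rough illustration of why repeated --topic arguments are workable, here
is a minimal sketch in Python. It is not the tool's implementation: the names
`parse_topic_arg`, `build_parser`, `parse_scope`, and the program name
`reset-offsets-sketch` are made up for this example. It collects repeated
`--topic` values in the `t1=1,2` form and treats a topic without a partition
list as "all partitions":

```python
import argparse


def parse_topic_arg(value):
    """Split 'topic=p1,p2' into (topic, [p1, p2]).

    A value without '=' selects all partitions, represented here as None.
    """
    topic, sep, parts = value.partition("=")
    if not topic:
        raise argparse.ArgumentTypeError(f"missing topic name in {value!r}")
    if not sep:
        return topic, None
    try:
        partitions = [int(p) for p in parts.split(",")]
    except ValueError:
        raise argparse.ArgumentTypeError(f"invalid partition list in {value!r}")
    return topic, partitions


def build_parser():
    # Hypothetical stand-in for the real command-line definition.
    parser = argparse.ArgumentParser(prog="reset-offsets-sketch")
    parser.add_argument("--group", required=True)
    # action="append" lets --topic be given any number of times.
    parser.add_argument("--topic", action="append", type=parse_topic_arg, default=[])
    return parser


def parse_scope(argv):
    """Return (group, {topic: partitions or None}) for the given arguments."""
    args = build_parser().parse_args(argv)
    return args.group, dict(args.topic)
```

For example, `parse_scope(["--group", "cg1", "--topic", "t1=1,2", "--topic", "t2"])`
selects partitions 1 and 2 of t1 and every partition of t2.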

On Thu, Feb 23, 2017 at 19:14, Matthias J. Sax (<ma...@confluent.io>)
wrote:

> So you suggest merging the "scope options" --topics, --topic, and
> --partitions into a single option? Sounds good to me.
>
> I like the compact way to express it, i.e., topicname:list-of-partitions,
> with "all partitions" if no partitions are specified. It's quite
> intuitive to use.
>
> Just wondering if we could get rid of the repeated --topic option; it's
> somewhat verbose. I have no good idea how to improve it, though.
>
> If you concatenate multiple topics, we need one more character that is
> not allowed in topic names to separate the topics:
>
> > invalidChars = {'/', '\\', ',', '\u0000', ':', '"', '\'', ';', '*',
> '?', ' ', '\t', '\r', '\n', '='};
>
> maybe
>
> --topics t1=1,2,3:t2:t3=3
>
> use '=' to specify partitions (instead of ':' as you proposed) and ':'
> to separate topics? All other characters seem to be worse to use to me.
> But maybe you have a better idea.
>
>
>
> -Matthias
>
>
> On 2/23/17 3:15 AM, Jorge Esteban Quilcate Otoya wrote:
> > @Matthias about the point 9:
> >
> > What about keeping only the --topic option, and support this format:
> >
> > `--topic t1:0,1,2 --topic t2 --topic t3:2`
> >
> > In this case topics t1, t2, and t3 will be selected: topic t1 with
> > partitions 0, 1 and 2; topic t2 with all its partitions; and topic t3
> > with only partition 2.
> >
> > Jorge.
> >
> > On Tue, Feb 21, 2017 at 11:11, Jorge Esteban Quilcate Otoya (<
> > quilcate.jorge@gmail.com>) wrote:
> >
> >> Thanks for the feedback Matthias.
> >>
> >> * 1. You're right. I'll reorder the scenarios.
> >>
> >> * 2. Agree. I'll update the KIP.
> >>
> >> * 3. I like it, updating to `reset-offsets`
> >>
> >> * 4. Agree, removing the `reset-` part
> >>
> >> * 5. Yes, option 1.e without --execute or --export will print out the
> >> current offset and the new offset, which will be the same. The use case
> >> for this option is mostly to combine it with --export and keep a current
> >> 'checkpoint' to reset to later. I will add to the KIP what the output
> >> should look like.
> >>
> >> * 6. Considering 4., I will update it to `--to-offset`
> >>
> >> * 7. I like the idea of unifying these options (plus, minus).
> >> `shift-offsets-by` is a good option, but I would like some more feedback
> >> here about the name. I will update the KIP in the meantime.
> >>
> >> * 8. Yes, discussed in 9.
> >>
> >> * 9. Agree. I'd love some feedback here. `topic` is already used by
> >> `delete`, and we can add `--all-topics` to consider all
> >> topics/partitions assigned to a group. How could we define specific
> >> topics/partitions?
> >>
> >> * 10. Haven't thought about it, but it makes sense.
> >> <topic>,<partition>,<offset> would be enough.
> >>
> >> * 11. Agree. Solved with 10.
> >>
> >> Also, I have a couple of changes to mention:
> >>
> >> 1. I have added a reference to the branch where I'm working on this KIP.
> >>
> >> 2. About the period scenario `--to-period`: I will change it to
> >> `--to-duration`, given that Duration (
> >> https://docs.oracle.com/javase/8/docs/api/java/time/Duration.html)
> >> follows the format 'PnDTnHnMnS' and does not consider daylight saving
> >> effects.
> >>
> >>
> >>
> >> On Tue, Feb 21, 2017 at 2:47, Matthias J. Sax (<matthias@confluent.io>)
> >> wrote:
> >>
> >> Hi,
> >>
> >> thanks for updating the KIP. Couple of follow up comments:
> >>
> >> * Nit: Why is "Reset to Earliest" and "Reset to Latest" a "reset by
> >> time" option -- IMHO it belongs to "reset by position"?
> >>
> >>
> >> * Nit: Description of "Reset to Earliest"
> >>
> >>> using Kafka Consumer's `auto.offset.reset` to `earliest`
> >>
> >> I think this is, strictly speaking, not correct (auto.offset.reset is only
> >> triggered if no valid offset is found, but this tool explicitly modifies
> >> the committed offset), and it should be phrased as
> >>
> >>> using Kafka Consumer's #seekToBeginning()
> >>
> >> -> similar issue for description of "Reset to Latest"
> >>
> >>
> >> * Main option: rename to --reset-offsets (plural instead of singular)
> >>
> >>
> >> * Scenario Options: I would remove "reset" from all options, because the
> >> main argument "--reset-offset" says already what to do:
> >>
> >>> bin/kafka-consumer-groups.sh --reset-offset --reset-to-datetime XXX
> >>
> >> better (IMHO):
> >>
> >>> bin/kafka-consumer-groups.sh --reset-offsets --to-datetime XXX
> >>
> >>
> >>
> >> * Option 1.e ("print and export current offset") is not intuitive to use
> >> IMHO. The main option is "--reset-offset" but nothing happens if no
> >> scenario is specified. It is also not specified, what the output should
> >> look like?
> >>
> >> Furthermore, --describe should actually show currently committed offset
> >> for a group. So it seems to be redundant to have the same option in
> >> --reset-offsets
> >>
> >>
> >> * Option 2.a: I would rename to "--reset-to-offset" (or considering the
> >> comment above to "--to-offset")
> >>
> >>
> >> * Option 2.b and 2.c: I would unify to "--shift-offsets-by" (or similar)
> >> and accept positive/negative values
> >>
> >>
> >> * About Scope "all": maybe it's better to have an option "--all-topics"
> >> (or similar). IMHO explicit arguments are preferable to implicit
> >> settings, to guard against accidental misuse of the tool.
> >>
> >>
> >> * Scope: I also think, that "--topic" (singular) and "--topics" (plural)
> >> are too similar and easy to use in a wrong way (ie, mix up) -- maybe we
> >> can have two options that are easier to distinguish.
> >>
> >>
> >> * I still think that JSON is not the best format (it's too verbose/hard
> >> to write for humans from scratch). A simple CSV format with implicit
> >> schema (topic,partition,offset) would be sufficient.
> >>
> >>
> >> * Why does the JSON contain "group_id" field -- there is parameter
> >> "--group" to specify the group ID. Would one overwrite the other (what
> >> order) or would there be an error if "--group" is used in combination
> >> with "--reset-from-file"?
> >>
> >>
> >>
> >> -Matthias
> >>
> >>
> >>
> >>
> >> On 2/17/17 6:43 AM, Jorge Esteban Quilcate Otoya wrote:
> >>> Hi,
> >>>
> >>> according to the feedback, I've updated the KIP:
> >>>
> >>> - We have added and ordered the scenarios, scopes and executions of the
> >>> Reset Offset tool.
> >>> - Consider it as an extension to the current `ConsumerGroupCommand`
> tool
> >>> - Execution will be possible without generating JSON files.
> >>>
> >>>
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-122%3A+Add+Reset+Consumer+Group+Offsets+tooling
> >>>
> >>> Looking forward to your feedback!
> >>>
> >>> Jorge.
> >>>
> >>> On Wed, Feb 8, 2017 at 23:23, Jorge Esteban Quilcate Otoya (<
> >>> quilcate.jorge@gmail.com>) wrote:
> >>>
> >>>> Great. I think I got the idea. What about this options:
> >>>>
> >>>> Scenarios:
> >>>>
> >>>> 1. Current status
> >>>>
> >>>> ´kafka-consumer-groups.sh --reset-offset --group cg1´
> >>>>
> >>>> 2. To Datetime
> >>>>
> >>>> ´kafka-consumer-groups.sh --reset-offset --group cg1
> --reset-to-datetime
> >>>> 2017-01-01T00:00:00.000´
> >>>>
> >>>> 3. To Period
> >>>>
> >>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-period
> >> P2D´
> >>>>
> >>>> 4. To Earliest
> >>>>
> >>>> ´kafka-consumer-groups.sh --reset-offset --group cg1
> >> --reset-to-earliest´
> >>>>
> >>>> 5. To Latest
> >>>>
> >>>> ´kafka-consumer-groups.sh --reset-offset --group cg1
> --reset-to-latest´
> >>>>
> >>>> 6. Minus 'n' offsets
> >>>>
> >>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-minus n´
> >>>>
> >>>> 7. Plus 'n' offsets
> >>>>
> >>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-plus n´
> >>>>
> >>>> 8. To specific offset
> >>>>
> >>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to x´
> >>>>
> >>>> Scopes:
> >>>>
> >>>> a. All topics used by Consumer Group
> >>>>
> >>>> Don't specify --topics
> >>>>
> >>>> b. Specific List of Topics
> >>>>
> >>>> Add list of values in --topics t1,t2,tn
> >>>>
> >>>> c. One Topic, all Partitions
> >>>>
> >>>> Add one topic and no partitions values: --topic t1
> >>>>
> >>>> d. One Topic, List of Partitions
> >>>>
> >>>> Add one topic and partitions values: --topic t1 --partitions 0,1,2
> >>>>
> >>>> About Reset Plan (JSON file):
> >>>>
> >>>> I think is still valid to have the option to persist reset
> configuration
> >>>> as a file, but I agree to give the option to run the tool without
> going
> >>>> down to the JSON file.
> >>>>
> >>>> Execution options:
> >>>>
> >>>> 1. Without execution argument (No args):
> >>>>
> >>>> Print out results (reset plan)
> >>>>
> >>>> 2. With --execute argument:
> >>>>
> >>>> Run reset process
> >>>>
> >>>> 3. With --output argument:
> >>>>
> >>>> Save result in a JSON format.
> >>>>
> >>>> 4. Only with --execute option and --reset-file (path to JSON)
> >>>>
> >>>> Reset based on file
> >>>>
> >>>> 5. Only with --verify option and --reset-file (path to JSON)
> >>>>
> >>>> Verify file values with current offsets
> >>>>
> >>>> I think we can remove --generate-and-execute because is a bit clumsy.
> >>>>
> >>>> With this options we will be able to execute with manual JSON
> >>>> configuration.
> >>>>
> >>>>
> >>>> On Wed, Feb 8, 2017 at 22:43, Ben Stopford (<be...@confluent.io>)
> >>>> wrote:
> >>>>
> >>>> Yes - using a tool like this to skip a set of consumer groups over a
> >>>> corrupt/bad message is definitely appealing.
> >>>>
> >>>> B
> >>>>
> >>>> On Wed, Feb 8, 2017 at 9:37 PM Gwen Shapira <gw...@confluent.io>
> wrote:
> >>>>
> >>>>> I like the --reset-to-earliest and --reset-to-latest. In general,
> >>>>> since the JSON route is the most challenging for users, we want to
> >>>>> provide a lot of ways to do useful things without going there.
> >>>>>
> >>>>> Two things that can help:
> >>>>>
> >>>>> 1. A lot of times, users want to skip few messages that cause issues
> >>>>> and continue. maybe just specifying the topic, partition and delta
> >>>>> will be better than having to find the offset and write a JSON and
> >>>>> validate the JSON etc.
> >>>>>
> >>>>> 2. Thinking if there are other common use-cases that we can make easy
> >>>>> rather than just one generic but not very usable method.
> >>>>>
> >>>>> Gwen
> >>>>>
> >>>>> On Wed, Feb 8, 2017 at 3:25 AM, Jorge Esteban Quilcate Otoya
> >>>>> <qu...@gmail.com> wrote:
> >>>>>> Thanks for the feedback!
> >>>>>>
> >>>>>> @Onur, @Gwen:
> >>>>>>
> >>>>>> Agree. Actually at the first draft I considered to have it inside
> >>>>>> ´kafka-consumer-groups.sh´, but I decide to propose it as a
> standalone
> >>>>> tool
> >>>>>> to describe it clearly and focus it on reset functionality.
> >>>>>>
> >>>>>> But now that you mentioned, it does make sense to have it in
> >>>>>> ´kafka-consumer-groups.sh´. How would be a consistent way to
> introduce
> >>>>> it?
> >>>>>>
> >>>>>> Maybe something like this:
> >>>>>>
> >>>>>> ´kafka-consumer-groups.sh --reset-offset --generate --group cg1
> >>>> --topics
> >>>>> t1
> >>>>>> --reset-from 2017-01-01T00:00:00.000 --output plan.json´
> >>>>>>
> >>>>>> ´kafka-consumer-groups.sh --reset-offset --verify --reset-json-file
> >>>>>> plan.json´
> >>>>>>
> >>>>>> ´kafka-consumer-groups.sh --reset-offset --execute --reset-json-file
> >>>>>> plan.json´
> >>>>>>
> >>>>>> ´kafka-consumer-groups.sh --reset-offset --generate-and-execute
> >> --group
> >>>>> cg1
> >>>>>> --topics t1 --reset-from 2017-01-01T00:00:00.000´
> >>>>>>
> >>>>>> @Gwen:
> >>>>>>
> >>>>>>> It looks exactly like the replica assignment tool
> >>>>>>
> >>>>>> It was influenced by ;-) I use the generate-verify-execute process
> >> here
> >>>>> to
> >>>>>> make sure user will be aware of the result of this operation. At the
> >>>>>> beginning we considered only add a couple of options to Consumer
> Group
> >>>>>> Command:
> >>>>>>
> >>>>>> --rewind-to-timestamp and --rewind-to-period
> >>>>>>
> >>>>>> @Onur:
> >>>>>>
> >>>>>>> You can actually get away with overriding while members of the
> group
> >>>>> are live
> >>>>>> with method 2 by using group information from DescribeGroupsRequest.
> >>>>>>
> >>>>>> This means that we need to have Consumer Group stopped before
> >> executing
> >>>>> and
> >>>>>> start a new consumer internally to do this? Therefore, we won't be
> >> able
> >>>>> to
> >>>>>> consider executing reset when ConsumerGroup is active? (trying to
> >>>> relate
> >>>>> it
> >>>>>> with @Dong 5th question)
> >>>>>>
> >>>>>> @Dong:
> >>>>>>
> >>>>>>> Should we allow user to use wildcard to reset offset of all groups
> >>>> for a
> >>>>>> given topic as well?
> >>>>>>
> >>>>>> I haven't thought about this scenario. Could be interesting.
> Following
> >>>>> the
> >>>>>> recommendation to add it into Consumer Group Command, in this case
> >>>> Group
> >>>>>> argument will be optional if there are only 1 topic. I think for
> >>>> multiple
> >>>>>> topic won't be that useful.
> >>>>>>
> >>>>>>> Should we allow user to specify timestamp per topic partition in
> the
> >>>>> json
> >>>>>> file as well?
> >>>>>>
> >>>>>> Don't think this could be a valid from the tool, but if Reset Plan
> is
> >>>>>> generated, and user want to set the offset for a specific partition
> to
> >>>>>> other offset (eventually based on another timestamp), and execute
> it,
> >>>> it
> >>>>>> will be up to her/him.
> >>>>>>
> >>>>>>> Should the script take some credential file to make sure that this
> >>>>>> operation is authenticated given the potential impact of this
> >>>> operation?
> >>>>>>
> >>>>>> Haven't tried to secure brokers yet, but the tool should support
> >>>>>> authorization if it's enabled in the broker.
> >>>>>>
> >>>>>>> Should we provide constant to reset committed offset to
> >>>> earliest/latest
> >>>>>> offset of a partition, e.g. -1 indicates earliest offset and -2
> >>>> indicates
> >>>>>> latest offset.
> >>>>>>
> >>>>>> I will go for something like ´--reset-to-earliest´ and
> >>>>> ´--reset-to-latest´
> >>>>>>
> >>>>>>> Should we allow dynamic change of the comitted offset when consumer
> >>>> are
> >>>>>> running, such that consumer will seek to the newly committed offset
> >> and
> >>>>>> start consuming from there?
> >>>>>>
> >>>>>> Not sure about this. I will recommend to keep it simple and ask user
> >> to
> >>>>>> stop consumers first. But I would considered it if the trade-offs
> are
> >>>>>> clear.
> >>>>>>
> >>>>>> @Matthias
> >>>>>>
> >>>>>> Added :). And thanks a lot for your help to define this KIP!
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On Wed, Feb 8, 2017 at 7:47, Gwen Shapira (<gw...@confluent.io>)
> >>>>>> wrote:
> >>>>>>
> >>>>>>> As long as the CLI is a bit consistent? Like, not just adding 3
> >>>>>>> arguments and a JSON parser to the existing tool, right?
> >>>>>>>
> >>>>>>> On Tue, Feb 7, 2017 at 10:29 PM, Onur Karaman
> >>>>>>> <on...@gmail.com> wrote:
> >>>>>>>> I think it makes sense to just add the feature to
> >>>>>>> kafka-consumer-groups.sh
> >>>>>>>>
> >>>>>>>> On Tue, Feb 7, 2017 at 10:24 PM, Gwen Shapira <gw...@confluent.io>
> >>>>> wrote:
> >>>>>>>>
> >>>>>>>>> Thanks for the KIP. I'm super happy about adding the capability.
> >>>>>>>>>
> >>>>>>>>> I hate the interface, though. It looks exactly like the replica
> >>>>>>>>> assignment tool. A tool everyone loves so much that there are
> >>>>> multiple
> >>>>>>>>> projects, open and closed, that try to fix it.
> >>>>>>>>>
> >>>>>>>>> Can we swap it with something that looks a bit more like the
> >>>> consumer
> >>>>>>>>> group tool? or the kafka streams reset tool? Consistency is
> helpful
> >>>>> in
> >>>>>>>>> such cases. I spent some time learning existing tools and
> learning
> >>>>> yet
> >>>>>>>>> another one is a deterrent.
> >>>>>>>>>
> >>>>>>>>> Gwen
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On Tue, Feb 7, 2017 at 6:43 PM, Jorge Esteban Quilcate Otoya
> >>>>>>>>> <qu...@gmail.com> wrote:
> >>>>>>>>>> Hi all,
> >>>>>>>>>>
> >>>>>>>>>> I would like to propose a KIP to Add a tool to Reset Consumer
> >>>> Group
> >>>>>>>>> Offsets.
> >>>>>>>>>>
> >>>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> >>>>>>>>> 122%3A+Add+a+tool+to+Reset+Consumer+Group+Offsets
> >>>>>>>>>>
> >>>>>>>>>> Please, take a look at the proposal and share your feedback.
> >>>>>>>>>>
> >>>>>>>>>> Thanks,
> >>>>>>>>>> Jorge.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> --
> >>>>>>>>> Gwen Shapira
> >>>>>>>>> Product Manager | Confluent
> >>>>>>>>> 650.450.2760 | @gwenshap
> >>>>>>>>> Follow us: Twitter | blog
> >>>>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> --
> >>>>>>> Gwen Shapira
> >>>>>>> Product Manager | Confluent
> >>>>>>> 650.450.2760 | @gwenshap
> >>>>>>> Follow us: Twitter | blog
> >>>>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>> --
> >>>>> Gwen Shapira
> >>>>> Product Manager | Confluent
> >>>>> 650.450.2760 | @gwenshap
> >>>>> Follow us: Twitter | blog
> >>>>>
> >>>>
> >>>>
> >>>
> >>
> >>
> >
>
>

Re: KIP-122: Add a tool to Reset Consumer Group Offsets

Posted by "Matthias J. Sax" <ma...@confluent.io>.
So you suggest merging the "scope options" --topics, --topic, and
--partitions into a single option? Sounds good to me.

I like the compact way to express it, i.e., topicname:list-of-partitions,
with "all partitions" if no partitions are specified. It's quite
intuitive to use.

Just wondering if we could get rid of the repeated --topic option; it's
somewhat verbose. I have no good idea how to improve it, though.

If you concatenate multiple topics, we need one more character that is
not allowed in topic names to separate the topics:

> invalidChars = {'/', '\\', ',', '\u0000', ':', '"', '\'', ';', '*',
'?', ' ', '\t', '\r', '\n', '='};

maybe

--topics t1=1,2,3:t2:t3=3

use '=' to specify partitions (instead of ':' as you proposed) and ':'
to separate topics? All other characters seem worse to me.
But maybe you have a better idea.
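
For illustration only, the compact form could be parsed along these lines.
This is a minimal Python sketch, not the tool's code: `parse_topics` and
`INVALID_TOPIC_CHARS` are hypothetical names, and the character set is simply
copied from the invalidChars list quoted above.

```python
# Characters listed in the quoted invalidChars set, i.e. characters that
# cannot occur in a topic name and are therefore safe to use as separators.
INVALID_TOPIC_CHARS = set('/\\,\x00:"\';*? \t\r\n=')


def parse_topics(spec):
    """Parse 't1=1,2,3:t2:t3=3' into a {topic: partitions} mapping.

    ':' separates topics and '=' introduces a comma-separated partition list.
    A topic without '=' means "all partitions", represented here as None.
    """
    scope = {}
    for entry in spec.split(":"):
        topic, sep, parts = entry.partition("=")
        if not topic or any(c in INVALID_TOPIC_CHARS for c in topic):
            raise ValueError(f"invalid topic name in {entry!r}")
        # int() raises ValueError for an empty or non-numeric partition.
        scope[topic] = [int(p) for p in parts.split(",")] if sep else None
    return scope
```

Because neither ':' nor '=' can appear in a topic name, splitting on them is
unambiguous; the remaining check only rejects names containing other
disallowed characters, such as spaces.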



-Matthias


On 2/23/17 3:15 AM, Jorge Esteban Quilcate Otoya wrote:
> @Matthias about the point 9:
> 
> What about keeping only the --topic option, and support this format:
> 
> `--topic t1:0,1,2 --topic t2 --topic t3:2`
> 
> In this case topics t1, t2, and t3 will be selected: topic t1 with
> partitions 0,1 and 2; topic t2 with all its partitions; and topic t3, with
> only partition 2.
> 
> Jorge.
> 
> On Tue, Feb 21, 2017 at 11:11, Jorge Esteban Quilcate Otoya (<
> quilcate.jorge@gmail.com>) wrote:
> 
>> Thanks for the feedback Matthias.
>>
>> * 1. You're right. I'll reorder the scenarios.
>>
>> * 2. Agree. I'll update the KIP.
>>
>> * 3. I like it, updating to `reset-offsets`
>>
>> * 4. Agree, removing the `reset-` part
>>
>> * 5. Yes, option 1.e without --execute or --export will print out the
>> current offset and the new offset, which will be the same. The use case
>> for this option is mostly to combine it with --export and keep a current
>> 'checkpoint' to reset to later. I will add to the KIP what the output
>> should look like.
>>
>> * 6. Considering 4., I will update it to `--to-offset`
>>
>> * 7. I like the idea of unifying these options (plus, minus).
>> `shift-offsets-by` is a good option, but I would like some more feedback
>> here about the name. I will update the KIP in the meantime.
>>
>> * 8. Yes, discussed in 9.
>>
>> * 9. Agree. I'd love some feedback here. `topic` is already used by
>> `delete`, and we can add `--all-topics` to consider all topics/partitions
>> assigned to a group. How could we define specific topics/partitions?
>>
>> * 10. Haven't thought about it, but it makes sense.
>> <topic>,<partition>,<offset> would be enough.
>>
>> * 11. Agree. Solved with 10.
>>
>> Also, I have a couple of changes to mention:
>>
>> 1. I have added a reference to the branch where I'm working on this KIP.
>>
>> 2. About the period scenario `--to-period`: I will change it to
>> `--to-duration`, given that Duration (
>> https://docs.oracle.com/javase/8/docs/api/java/time/Duration.html)
>> follows the format 'PnDTnHnMnS' and does not consider daylight saving
>> effects.
>>
>>
>>
>> On Tue, Feb 21, 2017 at 2:47, Matthias J. Sax (<ma...@confluent.io>)
>> wrote:
>>
>> Hi,
>>
>> thanks for updating the KIP. Couple of follow up comments:
>>
>> * Nit: Why is "Reset to Earliest" and "Reset to Latest" a "reset by
>> time" option -- IMHO it belongs to "reset by position"?
>>
>>
>> * Nit: Description of "Reset to Earliest"
>>
>>> using Kafka Consumer's `auto.offset.reset` to `earliest`
>>
>> I think this is, strictly speaking, not correct (auto.offset.reset is only
>> triggered if no valid offset is found, but this tool explicitly modifies
>> the committed offset), and it should be phrased as
>>
>>> using Kafka Consumer's #seekToBeginning()
>>
>> -> similar issue for description of "Reset to Latest"
>>
>>
>> * Main option: rename to --reset-offsets (plural instead of singular)
>>
>>
>> * Scenario Options: I would remove "reset" from all options, because the
>> main argument "--reset-offset" says already what to do:
>>
>>> bin/kafka-consumer-groups.sh --reset-offset --reset-to-datetime XXX
>>
>> better (IMHO):
>>
>>> bin/kafka-consumer-groups.sh --reset-offsets --to-datetime XXX
>>
>>
>>
>> * Option 1.e ("print and export current offset") is not intuitive to use
>> IMHO. The main option is "--reset-offset" but nothing happens if no
>> scenario is specified. It is also not specified, what the output should
>> look like?
>>
>> Furthermore, --describe should actually show currently committed offset
>> for a group. So it seems to be redundant to have the same option in
>> --reset-offsets
>>
>>
>> * Option 2.a: I would rename to "--reset-to-offset" (or considering the
>> comment above to "--to-offset")
>>
>>
>> * Option 2.b and 2.c: I would unify to "--shift-offsets-by" (or similar)
>> and accept positive/negative values
>>
>>
>> * About Scope "all": maybe it's better to have an option "--all-topics"
>> (or similar). IMHO explicit arguments are preferable to implicit
>> settings, to guard against accidental misuse of the tool.
>>
>>
>> * Scope: I also think, that "--topic" (singular) and "--topics" (plural)
>> are too similar and easy to use in a wrong way (ie, mix up) -- maybe we
>> can have two options that are easier to distinguish.
>>
>>
>> * I still think that JSON is not the best format (it's too verbose/hard
>> to write for humans from scratch). A simple CSV format with implicit
>> schema (topic,partition,offset) would be sufficient.
>>
>>
>> * Why does the JSON contain "group_id" field -- there is parameter
>> "--group" to specify the group ID. Would one overwrite the other (what
>> order) or would there be an error if "--group" is used in combination
>> with "--reset-from-file"?
>>
>>
>>
>> -Matthias
>>
>>
>>
>>
>> On 2/17/17 6:43 AM, Jorge Esteban Quilcate Otoya wrote:
>>> Hi,
>>>
>>> according to the feedback, I've updated the KIP:
>>>
>>> - We have added and ordered the scenarios, scopes and executions of the
>>> Reset Offset tool.
>>> - Consider it as an extension to the current `ConsumerGroupCommand` tool
>>> - Execution will be possible without generating JSON files.
>>>
>>>
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-122%3A+Add+Reset+Consumer+Group+Offsets+tooling
>>>
>>> Looking forward to your feedback!
>>>
>>> Jorge.
>>>
>>> On Wed, Feb 8, 2017 at 23:23, Jorge Esteban Quilcate Otoya (<
>>> quilcate.jorge@gmail.com>) wrote:
>>>
>>>> Great. I think I got the idea. What about this options:
>>>>
>>>> Scenarios:
>>>>
>>>> 1. Current status
>>>>
>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1´
>>>>
>>>> 2. To Datetime
>>>>
>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-datetime
>>>> 2017-01-01T00:00:00.000´
>>>>
>>>> 3. To Period
>>>>
>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-period
>> P2D´
>>>>
>>>> 4. To Earliest
>>>>
>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1
>> --reset-to-earliest´
>>>>
>>>> 5. To Latest
>>>>
>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-latest´
>>>>
>>>> 6. Minus 'n' offsets
>>>>
>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-minus n´
>>>>
>>>> 7. Plus 'n' offsets
>>>>
>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-plus n´
>>>>
>>>> 8. To specific offset
>>>>
>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to x´
>>>>
>>>> Scopes:
>>>>
>>>> a. All topics used by Consumer Group
>>>>
>>>> Don't specify --topics
>>>>
>>>> b. Specific List of Topics
>>>>
>>>> Add list of values in --topics t1,t2,tn
>>>>
>>>> c. One Topic, all Partitions
>>>>
>>>> Add one topic and no partitions values: --topic t1
>>>>
>>>> d. One Topic, List of Partitions
>>>>
>>>> Add one topic and partitions values: --topic t1 --partitions 0,1,2
>>>>
>>>> About Reset Plan (JSON file):
>>>>
>>>> I think is still valid to have the option to persist reset configuration
>>>> as a file, but I agree to give the option to run the tool without going
>>>> down to the JSON file.
>>>>
>>>> Execution options:
>>>>
>>>> 1. Without execution argument (No args):
>>>>
>>>> Print out results (reset plan)
>>>>
>>>> 2. With --execute argument:
>>>>
>>>> Run reset process
>>>>
>>>> 3. With --output argument:
>>>>
>>>> Save result in a JSON format.
>>>>
>>>> 4. Only with --execute option and --reset-file (path to JSON)
>>>>
>>>> Reset based on file
>>>>
>>>> 5. Only with --verify option and --reset-file (path to JSON)
>>>>
>>>> Verify file values with current offsets
>>>>
>>>> I think we can remove --generate-and-execute because is a bit clumsy.
>>>>
>>>> With this options we will be able to execute with manual JSON
>>>> configuration.
>>>>
>>>>
>>>> On Wed, Feb 8, 2017 at 22:43, Ben Stopford (<be...@confluent.io>)
>>>> wrote:
>>>>
>>>> Yes - using a tool like this to skip a set of consumer groups over a
>>>> corrupt/bad message is definitely appealing.
>>>>
>>>> B
>>>>
>>>> On Wed, Feb 8, 2017 at 9:37 PM Gwen Shapira <gw...@confluent.io> wrote:
>>>>
>>>>> I like the --reset-to-earliest and --reset-to-latest. In general,
>>>>> since the JSON route is the most challenging for users, we want to
>>>>> provide a lot of ways to do useful things without going there.
>>>>>
>>>>> Two things that can help:
>>>>>
>>>>> 1. A lot of times, users want to skip few messages that cause issues
>>>>> and continue. maybe just specifying the topic, partition and delta
>>>>> will be better than having to find the offset and write a JSON and
>>>>> validate the JSON etc.
>>>>>
>>>>> 2. Thinking if there are other common use-cases that we can make easy
>>>>> rather than just one generic but not very usable method.
>>>>>
>>>>> Gwen
>>>>>
>>>>> On Wed, Feb 8, 2017 at 3:25 AM, Jorge Esteban Quilcate Otoya
>>>>> <qu...@gmail.com> wrote:
>>>>>> Thanks for the feedback!
>>>>>>
>>>>>> @Onur, @Gwen:
>>>>>>
>>>>>> Agree. Actually at the first draft I considered to have it inside
>>>>>> ´kafka-consumer-groups.sh´, but I decide to propose it as a standalone
>>>>> tool
>>>>>> to describe it clearly and focus it on reset functionality.
>>>>>>
>>>>>> But now that you mentioned, it does make sense to have it in
>>>>>> ´kafka-consumer-groups.sh´. How would be a consistent way to introduce
>>>>> it?
>>>>>>
>>>>>> Maybe something like this:
>>>>>>
>>>>>> ´kafka-consumer-groups.sh --reset-offset --generate --group cg1
>>>> --topics
>>>>> t1
>>>>>> --reset-from 2017-01-01T00:00:00.000 --output plan.json´
>>>>>>
>>>>>> ´kafka-consumer-groups.sh --reset-offset --verify --reset-json-file
>>>>>> plan.json´
>>>>>>
>>>>>> ´kafka-consumer-groups.sh --reset-offset --execute --reset-json-file
>>>>>> plan.json´
>>>>>>
>>>>>> ´kafka-consumer-groups.sh --reset-offset --generate-and-execute
>> --group
>>>>> cg1
>>>>>> --topics t1 --reset-from 2017-01-01T00:00:00.000´
>>>>>>
>>>>>> @Gwen:
>>>>>>
>>>>>>> It looks exactly like the replica assignment tool
>>>>>>
>>>>>> It was influenced by ;-) I use the generate-verify-execute process
>> here
>>>>> to
>>>>>> make sure user will be aware of the result of this operation. At the
>>>>>> beginning we considered only add a couple of options to Consumer Group
>>>>>> Command:
>>>>>>
>>>>>> --rewind-to-timestamp and --rewind-to-period
>>>>>>
>>>>>> @Onur:
>>>>>>
>>>>>>> You can actually get away with overriding while members of the group
>>>>> are live
>>>>>> with method 2 by using group information from DescribeGroupsRequest.
>>>>>>
>>>>>> This means that we need to have Consumer Group stopped before
>> executing
>>>>> and
>>>>>> start a new consumer internally to do this? Therefore, we won't be
>> able
>>>>> to
>>>>>> consider executing reset when ConsumerGroup is active? (trying to
>>>> relate
>>>>> it
>>>>>> with @Dong 5th question)
>>>>>>
>>>>>> @Dong:
>>>>>>
>>>>>>> Should we allow user to use wildcard to reset offset of all groups
>>>> for a
>>>>>> given topic as well?
>>>>>>
>>>>>> I haven't thought about this scenario. Could be interesting. Following
>>>>> the
>>>>>> recommendation to add it into Consumer Group Command, in this case
>>>> Group
>>>>>> argument will be optional if there are only 1 topic. I think for
>>>> multiple
>>>>>> topic won't be that useful.
>>>>>>
>>>>>>> Should we allow user to specify timestamp per topic partition in the
>>>>> json
>>>>>> file as well?
>>>>>>
>>>>>> Don't think this could be a valid from the tool, but if Reset Plan is
>>>>>> generated, and user want to set the offset for a specific partition to
>>>>>> other offset (eventually based on another timestamp), and execute it,
>>>> it
>>>>>> will be up to her/him.
>>>>>>
>>>>>>> Should the script take some credential file to make sure that this
>>>>>> operation is authenticated given the potential impact of this
>>>> operation?
>>>>>>
>>>>>> Haven't tried to secure brokers yet, but the tool should support
>>>>>> authorization if it's enabled in the broker.
>>>>>>
>>>>>>> Should we provide constant to reset committed offset to
>>>> earliest/latest
>>>>>> offset of a partition, e.g. -1 indicates earliest offset and -2
>>>> indicates
>>>>>> latest offset.
>>>>>>
>>>>>> I will go for something like ´--reset-to-earliest´ and
>>>>> ´--reset-to-latest´
>>>>>>
>>>>>>> Should we allow dynamic change of the comitted offset when consumer
>>>> are
>>>>>> running, such that consumer will seek to the newly committed offset
>> and
>>>>>> start consuming from there?
>>>>>>
>>>>>> Not sure about this. I will recommend to keep it simple and ask user
>> to
>>>>>> stop consumers first. But I would considered it if the trade-offs are
>>>>>> clear.
>>>>>>
>>>>>> @Matthias
>>>>>>
>>>>>> Added :). And thanks a lot for your help to define this KIP!
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Wed, Feb 8, 2017 at 7:47, Gwen Shapira (<gw...@confluent.io>)
>>>>>> wrote:
>>>>>>
>>>>>>> As long as the CLI is a bit consistent? Like, not just adding 3
>>>>>>> arguments and a JSON parser to the existing tool, right?
>>>>>>>
>>>>>>> On Tue, Feb 7, 2017 at 10:29 PM, Onur Karaman
>>>>>>> <on...@gmail.com> wrote:
>>>>>>>> I think it makes sense to just add the feature to
>>>>>>> kafka-consumer-groups.sh
>>>>>>>>
>>>>>>>> On Tue, Feb 7, 2017 at 10:24 PM, Gwen Shapira <gw...@confluent.io>
>>>>> wrote:
>>>>>>>>
>>>>>>>>> Thanks for the KIP. I'm super happy about adding the capability.
>>>>>>>>>
>>>>>>>>> I hate the interface, though. It looks exactly like the replica
>>>>>>>>> assignment tool. A tool everyone loves so much that there are
>>>>> multiple
>>>>>>>>> projects, open and closed, that try to fix it.
>>>>>>>>>
>>>>>>>>> Can we swap it with something that looks a bit more like the
>>>> consumer
>>>>>>>>> group tool? or the kafka streams reset tool? Consistency is helpful
>>>>> in
>>>>>>>>> such cases. I spent some time learning existing tools and learning
>>>>> yet
>>>>>>>>> another one is a deterrent.
>>>>>>>>>
>>>>>>>>> Gwen
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Tue, Feb 7, 2017 at 6:43 PM, Jorge Esteban Quilcate Otoya
>>>>>>>>> <qu...@gmail.com> wrote:
>>>>>>>>>> Hi all,
>>>>>>>>>>
>>>>>>>>>> I would like to propose a KIP to Add a tool to Reset Consumer
>>>> Group
>>>>>>>>> Offsets.
>>>>>>>>>>
>>>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
>>>>>>>>> 122%3A+Add+a+tool+to+Reset+Consumer+Group+Offsets
>>>>>>>>>>
>>>>>>>>>> Please, take a look at the proposal and share your feedback.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Jorge.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Gwen Shapira
>>>>>>>>> Product Manager | Confluent
>>>>>>>>> 650.450.2760 | @gwenshap
>>>>>>>>> Follow us: Twitter | blog
>>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Gwen Shapira
>>>>>>> Product Manager | Confluent
>>>>>>> 650.450.2760 | @gwenshap
>>>>>>> Follow us: Twitter | blog
>>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Gwen Shapira
>>>>> Product Manager | Confluent
>>>>> 650.450.2760 | @gwenshap
>>>>> Follow us: Twitter | blog
>>>>>
>>>>
>>>>
>>>
>>
>>
> 


Re: KIP-122: Add a tool to Reset Consumer Group Offsets

Posted by Jorge Esteban Quilcate Otoya <qu...@gmail.com>.
@Matthias, about point 9:

What about keeping only the --topic option and supporting this format:

`--topic t1:0,1,2 --topic t2 --topic t3:2`

In this case, topics t1, t2, and t3 are selected: t1 with partitions 0, 1,
and 2; t2 with all of its partitions; and t3 with only partition 2.
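
To make the proposed format concrete, here is a minimal sketch of how repeated `--topic` values could be interpreted. It is only an illustration in Python, not the tool's implementation: `parse_topic_args` is a hypothetical helper, and using `None` to mean "all partitions" is an assumption of this sketch.

```python
def parse_topic_args(values):
    """Map each topic to a sorted list of partitions, or None for all partitions.

    `values` holds the raw strings passed to repeated --topic options,
    e.g. ["t1:0,1,2", "t2", "t3:2"]. (Hypothetical helper for illustration.)
    """
    selection = {}
    for value in values:
        # Split "topic:partitions" on the first colon only.
        topic, sep, parts = value.partition(":")
        if not topic:
            raise ValueError(f"missing topic name in {value!r}")
        if not sep:
            # No colon: the whole topic is selected (assumed meaning).
            selection[topic] = None
            continue
        # int() raises ValueError on empty or non-numeric partition ids.
        selection[topic] = sorted({int(p) for p in parts.split(",")})
    return selection


print(parse_topic_args(["t1:0,1,2", "t2", "t3:2"]))
# {'t1': [0, 1, 2], 't2': None, 't3': [2]}
```

Splitting on a colon is unambiguous here because Kafka topic names cannot contain that character.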

Jorge.

On Tue, Feb 21, 2017 at 11:11, Jorge Esteban Quilcate Otoya (<
quilcate.jorge@gmail.com>) wrote:

> Thanks for the feedback Matthias.
>
> * 1. You're right. I'll reorder the scenarios.
>
> * 2. Agree. I'll update the KIP.
>
> * 3. I like it, updating to `reset-offsets`
>
> * 4. Agree, removing the `reset-` part
>
> * 5. Yes, the 1.e option without --execute or --export will print the
> current offset and the new offset, which will be the same. The main use
> case for this option is to combine it with --export to capture a
> 'checkpoint' of the current offsets to reset to later. I will add to the
> KIP what the output should look like.
>
> * 6. Considering 4., I will update it to `--to-offset`
>
> * 7. I like the idea of unifying these options (plus, minus).
> `shift-offsets-by` is a good option, but I would like some more feedback
> here about the name. I will update the KIP in the meantime.
>
> * 8. Yes, discussed in 9.
>
> * 9. Agree. I'd love some feedback here. `topic` is already used by
> `delete`, and we can add `--all-topics` to cover all topics/partitions
> assigned to a group. How could we specify particular topics/partitions?
>
> * 10. I hadn't thought about it, but it makes sense.
> <topic>,<partition>,<offset> would be enough.
>
> * 11. Agree. Solved with 10.
>
> Also, I have a couple of changes to mention:
>
> 1. I have added a reference to the branch where I'm working on this KIP.
>
> 2. About the period scenario `--to-period`: I will change it to
> `--to-duration`, given that Duration (
> https://docs.oracle.com/javase/8/docs/api/java/time/Duration.html)
> follows the format 'PnDTnHnMnS' and does not take daylight saving
> effects into account.
>
>
>
> On Tue, Feb 21, 2017 at 2:47, Matthias J. Sax (<ma...@confluent.io>)
> wrote:
>
> Hi,
>
> thanks for updating the KIP. Couple of follow up comments:
>
> * Nit: Why is "Reset to Earliest" and "Reset to Latest" a "reset by
> time" option -- IMHO it belongs to "reset by position"?
>
>
> * Nit: Description of "Reset to Earliest"
>
> > using Kafka Consumer's `auto.offset.reset` to `earliest`
>
> I think this is strictly speaking not correct (as auto.offset.reset is
> only triggered if no valid offset is found, whereas this tool explicitly
> modifies the committed offset), and should be phrased as
>
> > using Kafka Consumer's #seekToBeginning()
>
> -> similar issue for description of "Reset to Latest"
>
>
> * Main option: rename to --reset-offsets (plural instead of singular)
>
>
> * Scenario Options: I would remove "reset" from all options, because the
> main argument "--reset-offset" says already what to do:
>
> > bin/kafka-consumer-groups.sh --reset-offset --reset-to-datetime XXX
>
> better (IMHO):
>
> > bin/kafka-consumer-groups.sh --reset-offsets --to-datetime XXX
>
>
>
> * Option 1.e ("print and export current offset") is not intuitive to use
> IMHO. The main option is "--reset-offset" but nothing happens if no
> scenario is specified. It is also not specified, what the output should
> look like?
>
> Furthermore, --describe should actually show currently committed offset
> for a group. So it seems to be redundant to have the same option in
> --reset-offsets
>
>
> * Option 2.a: I would rename to "--reset-to-offset" (or considering the
> comment above to "--to-offset")
>
>
> * Option 2.b and 2.c: I would unify to "--shift-offsets-by" (or similar)
> and accept positive/negative values
>
>
> * About Scope "all": maybe it's better to have an option "--all-topics"
> (or similar). IMHO explicit arguments are preferable over implicit
> setting to guard again accidental miss use of the tool.
>
>
> * Scope: I also think, that "--topic" (singular) and "--topics" (plural)
> are too similar and easy to use in a wrong way (ie, mix up) -- maybe we
> can have two options that are easier to distinguish.
>
>
> * I still think that JSON is not the best format (it's too verbose/hard
> to write for humans from scratch). A simple CSV format with implicit
> schema (topic,partition,offset) would be sufficient.
>
>
> * Why does the JSON contain "group_id" field -- there is parameter
> "--group" to specify the group ID. Would one overwrite the other (what
> order) or would there be an error if "--group" is used in combination
> with "--reset-from-file"?
>
>
>
> -Matthias
>
>
>
>
> On 2/17/17 6:43 AM, Jorge Esteban Quilcate Otoya wrote:
> > Hi,
> >
> > according to the feedback, I've updated the KIP:
> >
> > - We have added and ordered the scenarios, scopes and executions of the
> > Reset Offset tool.
> > - Consider it as an extension to the current `ConsumerGroupCommand` tool
> > - Execution will be possible without generating JSON files.
> >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-122%3A+Add+Reset+Consumer+Group+Offsets+tooling
> >
> > Looking forward to your feedback!
> >
> > Jorge.
> >
> > On Wed, Feb 8, 2017 at 23:23, Jorge Esteban Quilcate Otoya (<
> > quilcate.jorge@gmail.com>) wrote:
> >
> >> Great. I think I got the idea. What about this options:
> >>
> >> Scenarios:
> >>
> >> 1. Current status
> >>
> >> ´kafka-consumer-groups.sh --reset-offset --group cg1´
> >>
> >> 2. To Datetime
> >>
> >> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-datetime
> >> 2017-01-01T00:00:00.000´
> >>
> >> 3. To Period
> >>
> >> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-period
> P2D´
> >>
> >> 4. To Earliest
> >>
> >> ´kafka-consumer-groups.sh --reset-offset --group cg1
> --reset-to-earliest´
> >>
> >> 5. To Latest
> >>
> >> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-latest´
> >>
> >> 6. Minus 'n' offsets
> >>
> >> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-minus n´
> >>
> >> 7. Plus 'n' offsets
> >>
> >> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-plus n´
> >>
> >> 8. To specific offset
> >>
> >> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to x´
> >>
> >> Scopes:
> >>
> >> a. All topics used by Consumer Group
> >>
> >> Don't specify --topics
> >>
> >> b. Specific List of Topics
> >>
> >> Add list of values in --topics t1,t2,tn
> >>
> >> c. One Topic, all Partitions
> >>
> >> Add one topic and no partitions values: --topic t1
> >>
> >> d. One Topic, List of Partitions
> >>
> >> Add one topic and partitions values: --topic t1 --partitions 0,1,2
> >>
> >> About Reset Plan (JSON file):
> >>
> >> I think is still valid to have the option to persist reset configuration
> >> as a file, but I agree to give the option to run the tool without going
> >> down to the JSON file.
> >>
> >> Execution options:
> >>
> >> 1. Without execution argument (No args):
> >>
> >> Print out results (reset plan)
> >>
> >> 2. With --execute argument:
> >>
> >> Run reset process
> >>
> >> 3. With --output argument:
> >>
> >> Save result in a JSON format.
> >>
> >> 4. Only with --execute option and --reset-file (path to JSON)
> >>
> >> Reset based on file
> >>
> >> 4. Only with --verify option and --reset-file (path to JSON)
> >>
> >> Verify file values with current offsets
> >>
> >> I think we can remove --generate-and-execute because is a bit clumsy.
> >>
> >> With this options we will be able to execute with manual JSON
> >> configuration.
> >>
> >>
> >> On Wed, Feb 8, 2017 at 22:43, Ben Stopford (<be...@confluent.io>)
> >> wrote:
> >>
> >> Yes - using a tool like this to skip a set of consumer groups over a
> >> corrupt/bad message is definitely appealing.
> >>
> >> B
> >>
> >> On Wed, Feb 8, 2017 at 9:37 PM Gwen Shapira <gw...@confluent.io> wrote:
> >>
> >>> I like the --reset-to-earliest and --reset-to-latest. In general,
> >>> since the JSON route is the most challenging for users, we want to
> >>> provide a lot of ways to do useful things without going there.
> >>>
> >>> Two things that can help:
> >>>
> >>> 1. A lot of times, users want to skip few messages that cause issues
> >>> and continue. maybe just specifying the topic, partition and delta
> >>> will be better than having to find the offset and write a JSON and
> >>> validate the JSON etc.
> >>>
> >>> 2. Thinking if there are other common use-cases that we can make easy
> >>> rather than just one generic but not very usable method.
> >>>
> >>> Gwen
> >>>
> >>> On Wed, Feb 8, 2017 at 3:25 AM, Jorge Esteban Quilcate Otoya
> >>> <qu...@gmail.com> wrote:
> >>>> Thanks for the feedback!
> >>>>
> >>>> @Onur, @Gwen:
> >>>>
> >>>> Agree. Actually at the first draft I considered to have it inside
> >>>> ´kafka-consumer-groups.sh´, but I decide to propose it as a standalone
> >>> tool
> >>>> to describe it clearly and focus it on reset functionality.
> >>>>
> >>>> But now that you mentioned, it does make sense to have it in
> >>>> ´kafka-consumer-groups.sh´. How would be a consistent way to introduce
> >>> it?
> >>>>
> >>>> Maybe something like this:
> >>>>
> >>>> ´kafka-consumer-groups.sh --reset-offset --generate --group cg1
> >> --topics
> >>> t1
> >>>> --reset-from 2017-01-01T00:00:00.000 --output plan.json´
> >>>>
> >>>> ´kafka-consumer-groups.sh --reset-offset --verify --reset-json-file
> >>>> plan.json´
> >>>>
> >>>> ´kafka-consumer-groups.sh --reset-offset --execute --reset-json-file
> >>>> plan.json´
> >>>>
> >>>> ´kafka-consumer-groups.sh --reset-offset --generate-and-execute
> --group
> >>> cg1
> >>>> --topics t1 --reset-from 2017-01-01T00:00:00.000´
> >>>>
> >>>> @Gwen:
> >>>>
> >>>>> It looks exactly like the replica assignment tool
> >>>>
> >>>> It was influenced by ;-) I use the generate-verify-execute process
> here
> >>> to
> >>>> make sure user will be aware of the result of this operation. At the
> >>>> beginning we considered only add a couple of options to Consumer Group
> >>>> Command:
> >>>>
> >>>> --rewind-to-timestamp and --rewind-to-period
> >>>>
> >>>> @Onur:
> >>>>
> >>>>> You can actually get away with overriding while members of the group
> >>> are live
> >>>> with method 2 by using group information from DescribeGroupsRequest.
> >>>>
> >>>> This means that we need to have Consumer Group stopped before
> executing
> >>> and
> >>>> start a new consumer internally to do this? Therefore, we won't be
> able
> >>> to
> >>>> consider executing reset when ConsumerGroup is active? (trying to
> >> relate
> >>> it
> >>>> with @Dong 5th question)
> >>>>
> >>>> @Dong:
> >>>>
> >>>>> Should we allow user to use wildcard to reset offset of all groups
> >> for a
> >>>> given topic as well?
> >>>>
> >>>> I haven't thought about this scenario. Could be interesting. Following
> >>> the
> >>>> recommendation to add it into Consumer Group Command, in this case
> >> Group
> >>>> argument will be optional if there are only 1 topic. I think for
> >> multiple
> >>>> topic won't be that useful.
> >>>>
> >>>>> Should we allow user to specify timestamp per topic partition in the
> >>> json
> >>>> file as well?
> >>>>
> >>>> Don't think this could be a valid from the tool, but if Reset Plan is
> >>>> generated, and user want to set the offset for a specific partition to
> >>>> other offset (eventually based on another timestamp), and execute it,
> >> it
> >>>> will be up to her/him.
> >>>>
> >>>>> Should the script take some credential file to make sure that this
> >>>> operation is authenticated given the potential impact of this
> >> operation?
> >>>>
> >>>> Haven't tried to secure brokers yet, but the tool should support
> >>>> authorization if it's enabled in the broker.
> >>>>
> >>>>> Should we provide constant to reset committed offset to
> >> earliest/latest
> >>>> offset of a partition, e.g. -1 indicates earliest offset and -2
> >> indicates
> >>>> latest offset.
> >>>>
> >>>> I will go for something like ´--reset-to-earliest´ and
> >>> ´--reset-to-latest´
> >>>>
> >>>>> Should we allow dynamic change of the comitted offset when consumer
> >> are
> >>>> running, such that consumer will seek to the newly committed offset
> and
> >>>> start consuming from there?
> >>>>
> >>>> Not sure about this. I will recommend to keep it simple and ask user
> to
> >>>> stop consumers first. But I would considered it if the trade-offs are
> >>>> clear.
> >>>>
> >>>> @Matthias
> >>>>
> >>>> Added :). And thanks a lot for your help to define this KIP!
> >>>>
> >>>>
> >>>>
> >>>> On Wed, Feb 8, 2017 at 7:47, Gwen Shapira (<gw...@confluent.io>)
> >>>> wrote:
> >>>>
> >>>>> As long as the CLI is a bit consistent? Like, not just adding 3
> >>>>> arguments and a JSON parser to the existing tool, right?
> >>>>>
> >>>>> On Tue, Feb 7, 2017 at 10:29 PM, Onur Karaman
> >>>>> <on...@gmail.com> wrote:
> >>>>>> I think it makes sense to just add the feature to
> >>>>> kafka-consumer-groups.sh
> >>>>>>
> >>>>>> On Tue, Feb 7, 2017 at 10:24 PM, Gwen Shapira <gw...@confluent.io>
> >>> wrote:
> >>>>>>
> >>>>>>> Thanks for the KIP. I'm super happy about adding the capability.
> >>>>>>>
> >>>>>>> I hate the interface, though. It looks exactly like the replica
> >>>>>>> assignment tool. A tool everyone loves so much that there are
> >>> multiple
> >>>>>>> projects, open and closed, that try to fix it.
> >>>>>>>
> >>>>>>> Can we swap it with something that looks a bit more like the
> >> consumer
> >>>>>>> group tool? or the kafka streams reset tool? Consistency is helpful
> >>> in
> >>>>>>> such cases. I spent some time learning existing tools and learning
> >>> yet
> >>>>>>> another one is a deterrent.
> >>>>>>>
> >>>>>>> Gwen
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> On Tue, Feb 7, 2017 at 6:43 PM, Jorge Esteban Quilcate Otoya
> >>>>>>> <qu...@gmail.com> wrote:
> >>>>>>>> Hi all,
> >>>>>>>>
> >>>>>>>> I would like to propose a KIP to Add a tool to Reset Consumer
> >> Group
> >>>>>>> Offsets.
> >>>>>>>>
> >>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> >>>>>>> 122%3A+Add+a+tool+to+Reset+Consumer+Group+Offsets
> >>>>>>>>
> >>>>>>>> Please, take a look at the proposal and share your feedback.
> >>>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>> Jorge.
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> --
> >>>>>>> Gwen Shapira
> >>>>>>> Product Manager | Confluent
> >>>>>>> 650.450.2760 | @gwenshap
> >>>>>>> Follow us: Twitter | blog
> >>>>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>> --
> >>>>> Gwen Shapira
> >>>>> Product Manager | Confluent
> >>>>> 650.450.2760 | @gwenshap
> >>>>> Follow us: Twitter | blog
> >>>>>
> >>>
> >>>
> >>>
> >>> --
> >>> Gwen Shapira
> >>> Product Manager | Confluent
> >>> 650.450.2760 | @gwenshap
> >>> Follow us: Twitter | blog
> >>>
> >>
> >>
> >
>
>

Re: KIP-122: Add a tool to Reset Consumer Group Offsets

Posted by Jorge Esteban Quilcate Otoya <qu...@gmail.com>.
Thanks for the feedback Matthias.

* 1. You're right. I'll reorder the scenarios.

* 2. Agree. I'll update the KIP.

* 3. I like it, updating to `reset-offsets`

* 4. Agree, removing the `reset-` part

* 5. Yes, option 1.e without --execute or --export will print out the current
offset and the new offset, which will be the same. The main use case for this
option is to combine it with --export and keep a 'checkpoint' of the current
offsets to reset to later. I will add to the KIP what the output should
look like.

* 6. Considering 4., I will update it to `--to-offset`

* 7. I like the idea of unifying these options (plus, minus).
`shift-offsets-by` is a good option, but I would like some more feedback
here about the name. I will update the KIP in the meantime.

* 8. Yes, discussed in 9.

* 9. Agree. I'd love some feedback here. `topic` is already used by
`delete`, and we can add `--all-topics` to consider all topics/partitions
assigned to a group. How could we define specific topics/partitions?

* 10. Haven't thought about it, but it makes sense.
<topic>,<partition>,<offset> would be enough.

* 11. Agree. Solved with 10.
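
To make point 7 concrete, here is a minimal Python sketch of what a unified "shift by" computation could look like. It is only an illustration: the function name `shift_offsets` and the clamping to the earliest/latest offsets are my assumptions, not something the KIP specifies.

```python
from typing import Dict, Tuple

# (topic, partition) pair; a hypothetical alias used only in this sketch.
TopicPartition = Tuple[str, int]


def shift_offsets(
    current: Dict[TopicPartition, int],
    shift_by: int,
    earliest: Dict[TopicPartition, int],
    latest: Dict[TopicPartition, int],
) -> Dict[TopicPartition, int]:
    """Return current offsets moved by shift_by (positive or negative)."""
    shifted = {}
    for tp, offset in current.items():
        target = offset + shift_by
        # Assumption: keep the result inside the partition's valid range
        # instead of failing when the shift overshoots either end.
        shifted[tp] = max(earliest[tp], min(latest[tp], target))
    return shifted
```

A single signed value covers both the former "minus" and "plus" scenarios, which is the main argument for unifying them.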

Also, I have a couple of changes to mention:

1. I have added a reference to the branch where I'm working on this KIP.

2. About the period scenario `--to-period`: I will change it to
`--to-duration`, given that duration (
https://docs.oracle.com/javase/8/docs/api/java/time/Duration.html) follows
this format: 'PnDTnHnMnS' and does not consider daylight saving effects.

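
As a rough illustration of the 'PnDTnHnMnS' format mentioned in point 2, the Python sketch below parses a simplified subset of it and computes the timestamp that lies one duration before "now". The names `parse_duration` and `target_timestamp_ms` are hypothetical, the "now minus duration" reading is my assumption about how the option would behave, and the parser is deliberately narrower than `java.time.Duration.parse` (no signs, no lowercase).

```python
import re
from datetime import datetime, timedelta, timezone

# Simplified subset of the ISO-8601 duration form 'PnDTnHnMnS'.
_DURATION = re.compile(
    r"^P(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?"
    r"(?:(?P<seconds>\d+(?:\.\d+)?)S)?)?$"
)


def parse_duration(text: str) -> timedelta:
    """Parse e.g. 'P2D' or 'PT1H30M' into a fixed-length timedelta."""
    match = _DURATION.match(text)
    # Reject strings with no component at all, such as 'P' or 'PT'.
    if match is None or not any(match.groupdict().values()):
        raise ValueError(f"not a supported duration: {text!r}")
    parts = match.groupdict(default="0")
    # A day is always exactly 24 hours here, so daylight saving plays no role.
    return timedelta(
        days=int(parts["days"]),
        hours=int(parts["hours"]),
        minutes=int(parts["minutes"]),
        seconds=float(parts["seconds"]),
    )


def target_timestamp_ms(now: datetime, duration: str) -> int:
    """Epoch milliseconds of the instant one duration before now."""
    return int((now - parse_duration(duration)).timestamp() * 1000)
```

The resulting timestamp is what a reset by duration would then look up per partition to find the matching offset.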


On Tue, Feb 21, 2017 at 2:47, Matthias J. Sax (<ma...@confluent.io>)
wrote:

> Hi,
>
> thanks for updating the KIP. Couple of follow up comments:
>
> * Nit: Why is "Reset to Earliest" and "Reset to Latest" a "reset by
> time" option -- IMHO it belongs to "reset by position"?
>
>
> * Nit: Description of "Reset to Earliest"
>
> > using Kafka Consumer's `auto.offset.reset` to `earliest`
>
> I think this is strictly speaking not correct (as auto.offset.reset is only
> triggered if no valid offset is found, but this tool explicitly modifies
> the committed offset), and should be phrased as
>
> > using Kafka Consumer's #seekToBeginning()
>
> -> similar issue for description of "Reset to Latest"
>
>
> * Main option: rename to --reset-offsets (plural instead of singular)
>
>
> * Scenario Options: I would remove "reset" from all options, because the
> main argument "--reset-offset" says already what to do:
>
> > bin/kafka-consumer-groups.sh --reset-offset --reset-to-datetime XXX
>
> better (IMHO):
>
> > bin/kafka-consumer-groups.sh --reset-offsets --to-datetime XXX
>
>
>
> * Option 1.e ("print and export current offset") is not intuitive to use
> IMHO. The main option is "--reset-offset" but nothing happens if no
> scenario is specified. It is also not specified, what the output should
> look like?
>
> Furthermore, --describe should actually show currently committed offset
> for a group. So it seems to be redundant to have the same option in
> --reset-offsets
>
>
> * Option 2.a: I would rename to "--reset-to-offset" (or considering the
> comment above to "--to-offset")
>
>
> * Option 2.b and 2.c: I would unify to "--shift-offsets-by" (or similar)
> and accept positive/negative values
>
>
> * About Scope "all": maybe it's better to have an option "--all-topics"
> (or similar). IMHO explicit arguments are preferable over implicit
> settings to guard against accidental misuse of the tool.
>
>
> * Scope: I also think, that "--topic" (singular) and "--topics" (plural)
> are too similar and easy to use in a wrong way (ie, mix up) -- maybe we
> can have two options that are easier to distinguish.
>
>
> * I still think that JSON is not the best format (it's too verbose/hard
> to write for humans from scratch). A simple CSV format with implicit
> schema (topic,partition,offset) would be sufficient.
>
>
> * Why does the JSON contain "group_id" field -- there is parameter
> "--group" to specify the group ID. Would one overwrite the other (what
> order) or would there be an error if "--group" is used in combination
> with "--reset-from-file"?
>
>
>
> -Matthias
>
>
>
>
> On 2/17/17 6:43 AM, Jorge Esteban Quilcate Otoya wrote:
> > Hi,
> >
> > according to the feedback, I've updated the KIP:
> >
> > - We have added and ordered the scenarios, scopes and executions of the
> > Reset Offset tool.
> > - Consider it as an extension to the current `ConsumerGroupCommand` tool
> > - Execution will be possible without generating JSON files.
> >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-122%3A+Add+Reset+Consumer+Group+Offsets+tooling
> >
> > Looking forward to your feedback!
> >
> > Jorge.
> >
> > On Wed, Feb 8, 2017 at 23:23, Jorge Esteban Quilcate Otoya (<
> > quilcate.jorge@gmail.com>) wrote:
> >
> >> Great. I think I got the idea. What about this options:
> >>
> >> Scenarios:
> >>
> >> 1. Current status
> >>
> >> ´kafka-consumer-groups.sh --reset-offset --group cg1´
> >>
> >> 2. To Datetime
> >>
> >> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-datetime
> >> 2017-01-01T00:00:00.000´
> >>
> >> 3. To Period
> >>
> >> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-period
> P2D´
> >>
> >> 4. To Earliest
> >>
> >> ´kafka-consumer-groups.sh --reset-offset --group cg1
> --reset-to-earliest´
> >>
> >> 5. To Latest
> >>
> >> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-latest´
> >>
> >> 6. Minus 'n' offsets
> >>
> >> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-minus n´
> >>
> >> 7. Plus 'n' offsets
> >>
> >> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-plus n´
> >>
> >> 8. To specific offset
> >>
> >> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to x´
> >>
> >> Scopes:
> >>
> >> a. All topics used by Consumer Group
> >>
> >> Don't specify --topics
> >>
> >> b. Specific List of Topics
> >>
> >> Add list of values in --topics t1,t2,tn
> >>
> >> c. One Topic, all Partitions
> >>
> >> Add one topic and no partitions values: --topic t1
> >>
> >> d. One Topic, List of Partitions
> >>
> >> Add one topic and partitions values: --topic t1 --partitions 0,1,2
> >>
> >> About Reset Plan (JSON file):
> >>
> >> I think is still valid to have the option to persist reset configuration
> >> as a file, but I agree to give the option to run the tool without going
> >> down to the JSON file.
> >>
> >> Execution options:
> >>
> >> 1. Without execution argument (No args):
> >>
> >> Print out results (reset plan)
> >>
> >> 2. With --execute argument:
> >>
> >> Run reset process
> >>
> >> 3. With --output argument:
> >>
> >> Save result in a JSON format.
> >>
> >> 4. Only with --execute option and --reset-file (path to JSON)
> >>
> >> Reset based on file
> >>
> >> 5. Only with --verify option and --reset-file (path to JSON)
> >>
> >> Verify file values with current offsets
> >>
> >> I think we can remove --generate-and-execute because is a bit clumsy.
> >>
> >> With this options we will be able to execute with manual JSON
> >> configuration.
> >>
> >>
> >> On Wed, Feb 8, 2017 at 22:43, Ben Stopford (<be...@confluent.io>)
> >> wrote:
> >>
> >> Yes - using a tool like this to skip a set of consumer groups over a
> >> corrupt/bad message is definitely appealing.
> >>
> >> B
> >>
> >> On Wed, Feb 8, 2017 at 9:37 PM Gwen Shapira <gw...@confluent.io> wrote:
> >>
> >>> I like the --reset-to-earliest and --reset-to-latest. In general,
> >>> since the JSON route is the most challenging for users, we want to
> >>> provide a lot of ways to do useful things without going there.
> >>>
> >>> Two things that can help:
> >>>
> >>> 1. A lot of times, users want to skip few messages that cause issues
> >>> and continue. maybe just specifying the topic, partition and delta
> >>> will be better than having to find the offset and write a JSON and
> >>> validate the JSON etc.
> >>>
> >>> 2. Thinking if there are other common use-cases that we can make easy
> >>> rather than just one generic but not very usable method.
> >>>
> >>> Gwen
> >>>
> >>> On Wed, Feb 8, 2017 at 3:25 AM, Jorge Esteban Quilcate Otoya
> >>> <qu...@gmail.com> wrote:
> >>>> Thanks for the feedback!
> >>>>
> >>>> @Onur, @Gwen:
> >>>>
> >>>> Agree. Actually at the first draft I considered to have it inside
> >>>> ´kafka-consumer-groups.sh´, but I decide to propose it as a standalone
> >>> tool
> >>>> to describe it clearly and focus it on reset functionality.
> >>>>
> >>>> But now that you mentioned, it does make sense to have it in
> >>>> ´kafka-consumer-groups.sh´. How would be a consistent way to introduce
> >>> it?
> >>>>
> >>>> Maybe something like this:
> >>>>
> >>>> ´kafka-consumer-groups.sh --reset-offset --generate --group cg1
> >> --topics
> >>> t1
> >>>> --reset-from 2017-01-01T00:00:00.000 --output plan.json´
> >>>>
> >>>> ´kafka-consumer-groups.sh --reset-offset --verify --reset-json-file
> >>>> plan.json´
> >>>>
> >>>> ´kafka-consumer-groups.sh --reset-offset --execute --reset-json-file
> >>>> plan.json´
> >>>>
> >>>> ´kafka-consumer-groups.sh --reset-offset --generate-and-execute
> --group
> >>> cg1
> >>>> --topics t1 --reset-from 2017-01-01T00:00:00.000´
> >>>>
> >>>> @Gwen:
> >>>>
> >>>>> It looks exactly like the replica assignment tool
> >>>>
> >>>> It was influenced by ;-) I use the generate-verify-execute process
> here
> >>> to
> >>>> make sure user will be aware of the result of this operation. At the
> >>>> beginning we considered only add a couple of options to Consumer Group
> >>>> Command:
> >>>>
> >>>> --rewind-to-timestamp and --rewind-to-period
> >>>>
> >>>> @Onur:
> >>>>
> >>>>> You can actually get away with overriding while members of the group
> >>> are live
> >>>> with method 2 by using group information from DescribeGroupsRequest.
> >>>>
> >>>> This means that we need to have Consumer Group stopped before
> executing
> >>> and
> >>>> start a new consumer internally to do this? Therefore, we won't be
> able
> >>> to
> >>>> consider executing reset when ConsumerGroup is active? (trying to
> >> relate
> >>> it
> >>>> with @Dong 5th question)
> >>>>
> >>>> @Dong:
> >>>>
> >>>>> Should we allow user to use wildcard to reset offset of all groups
> >> for a
> >>>> given topic as well?
> >>>>
> >>>> I haven't thought about this scenario. Could be interesting. Following
> >>> the
> >>>> recommendation to add it into Consumer Group Command, in this case
> >> Group
> >>>> argument will be optional if there are only 1 topic. I think for
> >> multiple
> >>>> topic won't be that useful.
> >>>>
> >>>>> Should we allow user to specify timestamp per topic partition in the
> >>> json
> >>>> file as well?
> >>>>
> >>>> Don't think this could be a valid from the tool, but if Reset Plan is
> >>>> generated, and user want to set the offset for a specific partition to
> >>>> other offset (eventually based on another timestamp), and execute it,
> >> it
> >>>> will be up to her/him.
> >>>>
> >>>>> Should the script take some credential file to make sure that this
> >>>> operation is authenticated given the potential impact of this
> >> operation?
> >>>>
> >>>> Haven't tried to secure brokers yet, but the tool should support
> >>>> authorization if it's enabled in the broker.
> >>>>
> >>>>> Should we provide constant to reset committed offset to
> >> earliest/latest
> >>>> offset of a partition, e.g. -1 indicates earliest offset and -2
> >> indicates
> >>>> latest offset.
> >>>>
> >>>> I will go for something like ´--reset-to-earliest´ and
> >>> ´--reset-to-latest´
> >>>>
> >>>>> Should we allow dynamic change of the comitted offset when consumer
> >> are
> >>>> running, such that consumer will seek to the newly committed offset
> and
> >>>> start consuming from there?
> >>>>
> >>>> Not sure about this. I will recommend to keep it simple and ask user
> to
> >>>> stop consumers first. But I would considered it if the trade-offs are
> >>>> clear.
> >>>>
> >>>> @Matthias
> >>>>
> >>>> Added :). And thanks a lot for your help to define this KIP!
> >>>>
> >>>>
> >>>>
> >>>> On Wed, Feb 8, 2017 at 7:47, Gwen Shapira (<gw...@confluent.io>)
> >>>> wrote:
> >>>>
> >>>>> As long as the CLI is a bit consistent? Like, not just adding 3
> >>>>> arguments and a JSON parser to the existing tool, right?
> >>>>>
> >>>>> On Tue, Feb 7, 2017 at 10:29 PM, Onur Karaman
> >>>>> <on...@gmail.com> wrote:
> >>>>>> I think it makes sense to just add the feature to
> >>>>> kafka-consumer-groups.sh
> >>>>>>
> >>>>>> On Tue, Feb 7, 2017 at 10:24 PM, Gwen Shapira <gw...@confluent.io>
> >>> wrote:
> >>>>>>
> >>>>>>> Thanks for the KIP. I'm super happy about adding the capability.
> >>>>>>>
> >>>>>>> I hate the interface, though. It looks exactly like the replica
> >>>>>>> assignment tool. A tool everyone loves so much that there are
> >>> multiple
> >>>>>>> projects, open and closed, that try to fix it.
> >>>>>>>
> >>>>>>> Can we swap it with something that looks a bit more like the
> >> consumer
> >>>>>>> group tool? or the kafka streams reset tool? Consistency is helpful
> >>> in
> >>>>>>> such cases. I spent some time learning existing tools and learning
> >>> yet
> >>>>>>> another one is a deterrent.
> >>>>>>>
> >>>>>>> Gwen
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> On Tue, Feb 7, 2017 at 6:43 PM, Jorge Esteban Quilcate Otoya
> >>>>>>> <qu...@gmail.com> wrote:
> >>>>>>>> Hi all,
> >>>>>>>>
> >>>>>>>> I would like to propose a KIP to Add a tool to Reset Consumer
> >> Group
> >>>>>>> Offsets.
> >>>>>>>>
> >>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> >>>>>>> 122%3A+Add+a+tool+to+Reset+Consumer+Group+Offsets
> >>>>>>>>
> >>>>>>>> Please, take a look at the proposal and share your feedback.
> >>>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>> Jorge.
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> --
> >>>>>>> Gwen Shapira
> >>>>>>> Product Manager | Confluent
> >>>>>>> 650.450.2760 | @gwenshap
> >>>>>>> Follow us: Twitter | blog
> >>>>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>> --
> >>>>> Gwen Shapira
> >>>>> Product Manager | Confluent
> >>>>> 650.450.2760 | @gwenshap
> >>>>> Follow us: Twitter | blog
> >>>>>
> >>>
> >>>
> >>>
> >>> --
> >>> Gwen Shapira
> >>> Product Manager | Confluent
> >>> 650.450.2760 | @gwenshap
> >>> Follow us: Twitter | blog
> >>>
> >>
> >>
> >
>
>

Re: KIP-122: Add a tool to Reset Consumer Group Offsets

Posted by "Matthias J. Sax" <ma...@confluent.io>.
Hi,

thanks for updating the KIP. Couple of follow up comments:

* Nit: Why is "Reset to Earliest" and "Reset to Latest" a "reset by
time" option -- IMHO it belongs to "reset by position"?


* Nit: Description of "Reset to Earliest"

> using Kafka Consumer's `auto.offset.reset` to `earliest`

I think this is strictly speaking not correct (as auto.offset.reset is only
triggered if no valid offset is found, but this tool explicitly modifies
the committed offset), and should be phrased as

> using Kafka Consumer's #seekToBeginning()

-> similar issue for description of "Reset to Latest"


* Main option: rename to --reset-offsets (plural instead of singular)


* Scenario Options: I would remove "reset" from all options, because the
main argument "--reset-offset" already says what to do:

> bin/kafka-consumer-groups.sh --reset-offset --reset-to-datetime XXX

better (IMHO):

> bin/kafka-consumer-groups.sh --reset-offsets --to-datetime XXX



* Option 1.e ("print and export current offset") is not intuitive to use
IMHO. The main option is "--reset-offset" but nothing happens if no
scenario is specified. It is also not specified what the output should
look like.

Furthermore, --describe should actually show currently committed offset
for a group. So it seems to be redundant to have the same option in
--reset-offsets


* Option 2.a: I would rename to "--reset-to-offset" (or considering the
comment above to "--to-offset")


* Option 2.b and 2.c: I would unify to "--shift-offsets-by" (or similar)
and accept positive/negative values


* About Scope "all": maybe it's better to have an option "--all-topics"
(or similar). IMHO explicit arguments are preferable over implicit
settings to guard against accidental misuse of the tool.


* Scope: I also think, that "--topic" (singular) and "--topics" (plural)
are too similar and easy to use in a wrong way (ie, mix up) -- maybe we
can have two options that are easier to distinguish.


* I still think that JSON is not the best format (it's too verbose/hard
to write for humans from scratch). A simple CSV format with implicit
schema (topic,partition,offset) would be sufficient.


* Why does the JSON contain "group_id" field -- there is parameter
"--group" to specify the group ID. Would one overwrite the other (what
order) or would there be an error if "--group" is used in combination
with "--reset-from-file"?
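
To illustrate the CSV suggestion above, here is a minimal Python sketch of reading a reset plan with the implicit schema topic,partition,offset. The function name `parse_reset_plan` and the error handling are assumptions for illustration only; the group is expected to come from "--group", so the file carries no group field.

```python
import csv
import io
from typing import Dict, Tuple


def parse_reset_plan(text: str) -> Dict[Tuple[str, int], int]:
    """Parse 'topic,partition,offset' rows into {(topic, partition): offset}."""
    plan = {}
    for line_no, row in enumerate(csv.reader(io.StringIO(text)), start=1):
        if not row:
            continue  # tolerate blank lines in a hand-written file
        if len(row) != 3:
            raise ValueError(f"line {line_no}: expected topic,partition,offset")
        topic, partition, offset = (field.strip() for field in row)
        # int() raises ValueError for non-numeric partition or offset values.
        plan[(topic, int(partition))] = int(offset)
    return plan
```

For example, a file containing the two rows `t1,0,100` and `t1,1,250` would describe new offsets for two partitions of topic t1, which is much shorter to write by hand than the equivalent JSON.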



-Matthias




On 2/17/17 6:43 AM, Jorge Esteban Quilcate Otoya wrote:
> Hi,
> 
> according to the feedback, I've updated the KIP:
> 
> - We have added and ordered the scenarios, scopes and executions of the
> Reset Offset tool.
> - Consider it as an extension to the current `ConsumerGroupCommand` tool
> - Execution will be possible without generating JSON files.
> 
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-122%3A+Add+Reset+Consumer+Group+Offsets+tooling
> 
> Looking forward to your feedback!
> 
> Jorge.
> 
> On Wed, Feb 8, 2017 at 23:23, Jorge Esteban Quilcate Otoya (<
> quilcate.jorge@gmail.com>) wrote:
> 
>> Great. I think I got the idea. What about these options:
>>
>> Scenarios:
>>
>> 1. Current status
>>
>> ´kafka-consumer-groups.sh --reset-offset --group cg1´
>>
>> 2. To Datetime
>>
>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-datetime
>> 2017-01-01T00:00:00.000´
>>
>> 3. To Period
>>
>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-period P2D´
>>
>> 4. To Earliest
>>
>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-earliest´
>>
>> 5. To Latest
>>
>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-latest´
>>
>> 6. Minus 'n' offsets
>>
>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-minus n´
>>
>> 7. Plus 'n' offsets
>>
>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-plus n´
>>
>> 8. To specific offset
>>
>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to x´
>>
>> Scopes:
>>
>> a. All topics used by Consumer Group
>>
>> Don't specify --topics
>>
>> b. Specific List of Topics
>>
>> Add list of values in --topics t1,t2,tn
>>
>> c. One Topic, all Partitions
>>
>> Add one topic and no partitions values: --topic t1
>>
>> d. One Topic, List of Partitions
>>
>> Add one topic and partitions values: --topic t1 --partitions 0,1,2
>>
>> About Reset Plan (JSON file):
>>
>> I think is still valid to have the option to persist reset configuration
>> as a file, but I agree to give the option to run the tool without going
>> down to the JSON file.
>>
>> Execution options:
>>
>> 1. Without execution argument (No args):
>>
>> Print out results (reset plan)
>>
>> 2. With --execute argument:
>>
>> Run reset process
>>
>> 3. With --output argument:
>>
>> Save result in a JSON format.
>>
>> 4. Only with --execute option and --reset-file (path to JSON)
>>
>> Reset based on file
>>
>> 5. Only with --verify option and --reset-file (path to JSON)
>>
>> Verify file values with current offsets
>>
>> I think we can remove --generate-and-execute because is a bit clumsy.
>>
>> With this options we will be able to execute with manual JSON
>> configuration.
>>
>>
>> On Wed, Feb 8, 2017 at 22:43, Ben Stopford (<be...@confluent.io>)
>> wrote:
>>
>> Yes - using a tool like this to skip a set of consumer groups over a
>> corrupt/bad message is definitely appealing.
>>
>> B
>>
>> On Wed, Feb 8, 2017 at 9:37 PM Gwen Shapira <gw...@confluent.io> wrote:
>>
>>> I like the --reset-to-earliest and --reset-to-latest. In general,
>>> since the JSON route is the most challenging for users, we want to
>>> provide a lot of ways to do useful things without going there.
>>>
>>> Two things that can help:
>>>
>>> 1. A lot of times, users want to skip few messages that cause issues
>>> and continue. maybe just specifying the topic, partition and delta
>>> will be better than having to find the offset and write a JSON and
>>> validate the JSON etc.
>>>
>>> 2. Thinking if there are other common use-cases that we can make easy
>>> rather than just one generic but not very usable method.
>>>
>>> Gwen
>>>
>>> On Wed, Feb 8, 2017 at 3:25 AM, Jorge Esteban Quilcate Otoya
>>> <qu...@gmail.com> wrote:
>>>> Thanks for the feedback!
>>>>
>>>> @Onur, @Gwen:
>>>>
>>>> Agree. Actually at the first draft I considered to have it inside
>>>> ´kafka-consumer-groups.sh´, but I decide to propose it as a standalone
>>> tool
>>>> to describe it clearly and focus it on reset functionality.
>>>>
>>>> But now that you mentioned, it does make sense to have it in
>>>> ´kafka-consumer-groups.sh´. How would be a consistent way to introduce
>>> it?
>>>>
>>>> Maybe something like this:
>>>>
>>>> ´kafka-consumer-groups.sh --reset-offset --generate --group cg1
>> --topics
>>> t1
>>>> --reset-from 2017-01-01T00:00:00.000 --output plan.json´
>>>>
>>>> ´kafka-consumer-groups.sh --reset-offset --verify --reset-json-file
>>>> plan.json´
>>>>
>>>> ´kafka-consumer-groups.sh --reset-offset --execute --reset-json-file
>>>> plan.json´
>>>>
>>>> ´kafka-consumer-groups.sh --reset-offset --generate-and-execute --group
>>> cg1
>>>> --topics t1 --reset-from 2017-01-01T00:00:00.000´
>>>>
>>>> @Gwen:
>>>>
>>>>> It looks exactly like the replica assignment tool
>>>>
>>>> It was influenced by ;-) I use the generate-verify-execute process here
>>> to
>>>> make sure user will be aware of the result of this operation. At the
>>>> beginning we considered only add a couple of options to Consumer Group
>>>> Command:
>>>>
>>>> --rewind-to-timestamp and --rewind-to-period
>>>>
>>>> @Onur:
>>>>
>>>>> You can actually get away with overriding while members of the group
>>> are live
>>>> with method 2 by using group information from DescribeGroupsRequest.
>>>>
>>>> This means that we need to have Consumer Group stopped before executing
>>> and
>>>> start a new consumer internally to do this? Therefore, we won't be able
>>> to
>>>> consider executing reset when ConsumerGroup is active? (trying to
>> relate
>>> it
>>>> with @Dong 5th question)
>>>>
>>>> @Dong:
>>>>
>>>>> Should we allow user to use wildcard to reset offset of all groups
>> for a
>>>> given topic as well?
>>>>
>>>> I haven't thought about this scenario. Could be interesting. Following
>>> the
>>>> recommendation to add it into Consumer Group Command, in this case
>> Group
>>>> argument will be optional if there are only 1 topic. I think for
>> multiple
>>>> topic won't be that useful.
>>>>
>>>>> Should we allow user to specify timestamp per topic partition in the
>>> json
>>>> file as well?
>>>>
>>>> Don't think this could be a valid from the tool, but if Reset Plan is
>>>> generated, and user want to set the offset for a specific partition to
>>>> other offset (eventually based on another timestamp), and execute it,
>> it
>>>> will be up to her/him.
>>>>
>>>>> Should the script take some credential file to make sure that this
>>>> operation is authenticated given the potential impact of this
>> operation?
>>>>
>>>> Haven't tried to secure brokers yet, but the tool should support
>>>> authorization if it's enabled in the broker.
>>>>
>>>>> Should we provide constant to reset committed offset to
>> earliest/latest
>>>> offset of a partition, e.g. -1 indicates earliest offset and -2
>> indicates
>>>> latest offset.
>>>>
>>>> I will go for something like ´--reset-to-earliest´ and
>>> ´--reset-to-latest´
>>>>
>>>>> Should we allow dynamic change of the comitted offset when consumer
>> are
>>>> running, such that consumer will seek to the newly committed offset and
>>>> start consuming from there?
>>>>
>>>> Not sure about this. I will recommend to keep it simple and ask user to
>>>> stop consumers first. But I would considered it if the trade-offs are
>>>> clear.
>>>>
>>>> @Matthias
>>>>
>>>> Added :). And thanks a lot for your help to define this KIP!
>>>>
>>>>
>>>>
>>>> On Wed, Feb 8, 2017 at 7:47, Gwen Shapira (<gw...@confluent.io>)
>>>> wrote:
>>>>
>>>>> As long as the CLI is a bit consistent? Like, not just adding 3
>>>>> arguments and a JSON parser to the existing tool, right?
>>>>>
>>>>> On Tue, Feb 7, 2017 at 10:29 PM, Onur Karaman
>>>>> <on...@gmail.com> wrote:
>>>>>> I think it makes sense to just add the feature to
>>>>> kafka-consumer-groups.sh
>>>>>>
>>>>>> On Tue, Feb 7, 2017 at 10:24 PM, Gwen Shapira <gw...@confluent.io>
>>> wrote:
>>>>>>
>>>>>>> Thanks for the KIP. I'm super happy about adding the capability.
>>>>>>>
>>>>>>> I hate the interface, though. It looks exactly like the replica
>>>>>>> assignment tool. A tool everyone loves so much that there are
>>> multiple
>>>>>>> projects, open and closed, that try to fix it.
>>>>>>>
>>>>>>> Can we swap it with something that looks a bit more like the
>> consumer
>>>>>>> group tool? or the kafka streams reset tool? Consistency is helpful
>>> in
>>>>>>> such cases. I spent some time learning existing tools and learning
>>> yet
>>>>>>> another one is a deterrent.
>>>>>>>
>>>>>>> Gwen
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Feb 7, 2017 at 6:43 PM, Jorge Esteban Quilcate Otoya
>>>>>>> <qu...@gmail.com> wrote:
>>>>>>>> Hi all,
>>>>>>>>
>>>>>>>> I would like to propose a KIP to Add a tool to Reset Consumer
>> Group
>>>>>>> Offsets.
>>>>>>>>
>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
>>>>>>> 122%3A+Add+a+tool+to+Reset+Consumer+Group+Offsets
>>>>>>>>
>>>>>>>> Please, take a look at the proposal and share your feedback.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Jorge.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Gwen Shapira
>>>>>>> Product Manager | Confluent
>>>>>>> 650.450.2760 | @gwenshap
>>>>>>> Follow us: Twitter | blog
>>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Gwen Shapira
>>>>> Product Manager | Confluent
>>>>> 650.450.2760 | @gwenshap
>>>>> Follow us: Twitter | blog
>>>>>
>>>
>>>
>>>
>>> --
>>> Gwen Shapira
>>> Product Manager | Confluent
>>> 650.450.2760 | @gwenshap
>>> Follow us: Twitter | blog
>>>
>>
>>
> 


Re: KIP-122: Add a tool to Reset Consumer Group Offsets

Posted by "Matthias J. Sax" <ma...@confluent.io>.
Hi,

Thanks for updating the KIP. A couple of follow-up comments:

* Nit: Why are "Reset to Earliest" and "Reset to Latest" listed as "reset by
time" options -- IMHO they belong to "reset by position"?


* Nit: Description of "Reset to Earliest"

> using Kafka Consumer's `auto.offset.reset` to `earliest`

I think this is strictly speaking not correct (auto.offset.reset is only
triggered if no valid committed offset is found, whereas this tool explicitly
modifies the committed offset), and should be phrased as

> using Kafka Consumer's #seekToBeginning()

-> similar issue for description of "Reset to Latest"
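
To illustrate the distinction, here is a minimal sketch in plain Python. It
is a simplified model and not the consumer's actual code: the function names
and the offset-range check are assumptions made for illustration. The point is
that the reset policy is consulted only when no usable committed offset
exists, while the tool overwrites the committed offset regardless.

```python
from typing import Optional


def starting_offset(committed: Optional[int], policy: str,
                    earliest: int, latest: int) -> int:
    """Model where a consumer starts reading one partition.

    Hypothetical helper: the reset policy ("earliest"/"latest") is used
    only when there is no committed offset or it is out of range.
    """
    in_range = committed is not None and earliest <= committed <= latest
    if in_range:
        return committed  # a valid committed offset always wins
    if policy == "earliest":
        return earliest
    if policy == "latest":
        return latest
    raise ValueError(f"no valid offset and policy is {policy!r}")


def reset_to_earliest(earliest: int) -> int:
    """Model of the tool: the committed offset is replaced unconditionally."""
    return earliest
```

With a committed offset of 42, the policy "earliest" changes nothing in this
model, whereas an explicit reset moves the group to the beginning of the
partition.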


* Main option: rename to --reset-offsets (plural instead of singular)


* Scenario Options: I would remove "reset" from all options, because the
main argument "--reset-offset" already says what to do:

> bin/kafka-consumer-groups.sh --reset-offset --reset-to-datetime XXX

better (IMHO):

> bin/kafka-consumer-groups.sh --reset-offsets --to-datetime XXX



* Option 1.e ("print and export current offset") is not intuitive to use
IMHO. The main option is "--reset-offset" but nothing happens if no
scenario is specified. It is also not specified what the output should
look like.

Furthermore, --describe should actually show the currently committed offsets
for a group, so it seems redundant to have the same option in
--reset-offsets.


* Option 2.a: I would rename to "--reset-to-offset" (or considering the
comment above to "--to-offset")


* Option 2.b and 2.c: I would unify to "--shift-offsets-by" (or similar)
and accept positive/negative values
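
A minimal sketch of what a unified, signed shift could look like, in plain
Python. The function name is hypothetical, and clamping the result to the
partition's valid range is an assumption; the KIP discussion does not say how
out-of-range targets should be handled.

```python
def shift_offset(current: int, shift_by: int, earliest: int, latest: int) -> int:
    """Move an offset by a signed amount (negative rewinds, positive skips).

    Assumption: the target is clamped to [earliest, latest] instead of
    being rejected when it falls outside the partition's range.
    """
    target = current + shift_by
    return max(earliest, min(target, latest))
```

A single option with a signed value covers both the "minus n" and "plus n"
scenarios, e.g. shift_offset(50, -10, 0, 100) rewinds to 40.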


* About Scope "all": maybe it's better to have an option "--all-topics"
(or similar). IMHO explicit arguments are preferable over implicit
settings, to guard against accidental misuse of the tool.
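
As a rough sketch of the "explicit scope" idea, using Python's argparse
rather than the tool's real option parser: the flag names below are
hypothetical, and the only point is that omitting the scope becomes an error
instead of silently meaning "all topics".

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser where the scope must be stated explicitly."""
    parser = argparse.ArgumentParser(prog="reset-offsets-sketch")
    parser.add_argument("--group", required=True)
    # Exactly one scope option is required; no implicit "all topics".
    scope = parser.add_mutually_exclusive_group(required=True)
    scope.add_argument("--all-topics", action="store_true")
    scope.add_argument("--topic", action="append")  # may be repeated
    return parser
```

Calling build_parser().parse_args(["--group", "cg1"]) exits with a usage
error, while adding "--all-topics" or one or more "--topic t1" arguments is
accepted.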


* Scope: I also think that "--topic" (singular) and "--topics" (plural)
are too similar and easy to use in the wrong way (i.e., to mix up) -- maybe
we can have two options that are easier to distinguish.


* I still think that JSON is not the best format (it's too verbose/hard
to write for humans from scratch). A simple CSV format with implicit
schema (topic,partition,offset) would be sufficient.
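
For illustration, a minimal sketch of reading such a file with Python's
standard csv module. The function name and the returned structure are
assumptions; only the implicit (topic,partition,offset) column order comes
from the suggestion above.

```python
import csv
import io
from typing import Dict, Tuple


def parse_reset_plan(text: str) -> Dict[Tuple[str, int], int]:
    """Parse lines of 'topic,partition,offset' into a lookup table.

    Hypothetical helper: blank lines are skipped, and a row without
    exactly three fields raises ValueError.
    """
    plan: Dict[Tuple[str, int], int] = {}
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # csv.reader yields an empty list for blank lines
        topic, partition, offset = (field.strip() for field in row)
        plan[(topic, int(partition))] = int(offset)
    return plan
```

A three-line file such as "t1,0,42", "t1,1,17", "t2,0,0" is easy to write by
hand, which is the main argument for CSV over JSON here.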


* Why does the JSON contain a "group_id" field? There is already a
"--group" parameter to specify the group ID. Would one overwrite the other
(in what order), or would there be an error if "--group" is used in
combination with "--reset-from-file"?



-Matthias




On 2/17/17 6:43 AM, Jorge Esteban Quilcate Otoya wrote:
> Hi,
> 
> according to the feedback, I've updated the KIP:
> 
> - We have added and ordered the scenarios, scopes and executions of the
> Reset Offset tool.
> - Consider it as an extension to the current `ConsumerGroupCommand` tool
> - Execution will be possible without generating JSON files.
> 
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-122%3A+Add+Reset+Consumer+Group+Offsets+tooling
> 
> Looking forward to your feedback!
> 
> Jorge.
> 
> On Wed, Feb 8, 2017 at 23:23, Jorge Esteban Quilcate Otoya (<
> quilcate.jorge@gmail.com>) wrote:
> 
>> Great. I think I got the idea. What about this options:
>>
>> Scenarios:
>>
>> 1. Current status
>>
>> ´kafka-consumer-groups.sh --reset-offset --group cg1´
>>
>> 2. To Datetime
>>
>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-datetime
>> 2017-01-01T00:00:00.000´
>>
>> 3. To Period
>>
>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-period P2D´
>>
>> 4. To Earliest
>>
>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-earliest´
>>
>> 5. To Latest
>>
>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-latest´
>>
>> 6. Minus 'n' offsets
>>
>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-minus n´
>>
>> 7. Plus 'n' offsets
>>
>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-plus n´
>>
>> 8. To specific offset
>>
>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to x´
>>
>> Scopes:
>>
>> a. All topics used by Consumer Group
>>
>> Don't specify --topics
>>
>> b. Specific List of Topics
>>
>> Add list of values in --topics t1,t2,tn
>>
>> c. One Topic, all Partitions
>>
>> Add one topic and no partitions values: --topic t1
>>
>> d. One Topic, List of Partitions
>>
>> Add one topic and partitions values: --topic t1 --partitions 0,1,2
>>
>> About Reset Plan (JSON file):
>>
>> I think is still valid to have the option to persist reset configuration
>> as a file, but I agree to give the option to run the tool without going
>> down to the JSON file.
>>
>> Execution options:
>>
>> 1. Without execution argument (No args):
>>
>> Print out results (reset plan)
>>
>> 2. With --execute argument:
>>
>> Run reset process
>>
>> 3. With --output argument:
>>
>> Save result in a JSON format.
>>
>> 4. Only with --execute option and --reset-file (path to JSON)
>>
>> Reset based on file
>>
>> 5. Only with --verify option and --reset-file (path to JSON)
>>
>> Verify file values with current offsets
>>
>> I think we can remove --generate-and-execute because is a bit clumsy.
>>
>> With this options we will be able to execute with manual JSON
>> configuration.
>>
>>
>> On Wed, Feb 8, 2017 at 22:43, Ben Stopford (<be...@confluent.io>)
>> wrote:
>>
>> Yes - using a tool like this to skip a set of consumer groups over a
>> corrupt/bad message is definitely appealing.
>>
>> B
>>
>> On Wed, Feb 8, 2017 at 9:37 PM Gwen Shapira <gw...@confluent.io> wrote:
>>
>>> I like the --reset-to-earliest and --reset-to-latest. In general,
>>> since the JSON route is the most challenging for users, we want to
>>> provide a lot of ways to do useful things without going there.
>>>
>>> Two things that can help:
>>>
>>> 1. A lot of times, users want to skip few messages that cause issues
>>> and continue. maybe just specifying the topic, partition and delta
>>> will be better than having to find the offset and write a JSON and
>>> validate the JSON etc.
>>>
>>> 2. Thinking if there are other common use-cases that we can make easy
>>> rather than just one generic but not very usable method.
>>>
>>> Gwen
>>>
>>> On Wed, Feb 8, 2017 at 3:25 AM, Jorge Esteban Quilcate Otoya
>>> <qu...@gmail.com> wrote:
>>>> Thanks for the feedback!
>>>>
>>>> @Onur, @Gwen:
>>>>
>>>> Agree. Actually at the first draft I considered to have it inside
>>>> ´kafka-consumer-groups.sh´, but I decide to propose it as a standalone
>>> tool
>>>> to describe it clearly and focus it on reset functionality.
>>>>
>>>> But now that you mentioned, it does make sense to have it in
>>>> ´kafka-consumer-groups.sh´. How would be a consistent way to introduce
>>> it?
>>>>
>>>> Maybe something like this:
>>>>
>>>> ´kafka-consumer-groups.sh --reset-offset --generate --group cg1
>> --topics
>>> t1
>>>> --reset-from 2017-01-01T00:00:00.000 --output plan.json´
>>>>
>>>> ´kafka-consumer-groups.sh --reset-offset --verify --reset-json-file
>>>> plan.json´
>>>>
>>>> ´kafka-consumer-groups.sh --reset-offset --execute --reset-json-file
>>>> plan.json´
>>>>
>>>> ´kafka-consumer-groups.sh --reset-offset --generate-and-execute --group
>>> cg1
>>>> --topics t1 --reset-from 2017-01-01T00:00:00.000´
>>>>
>>>> @Gwen:
>>>>
>>>>> It looks exactly like the replica assignment tool
>>>>
>>>> It was influenced by ;-) I use the generate-verify-execute process here
>>> to
>>>> make sure user will be aware of the result of this operation. At the
>>>> beginning we considered only add a couple of options to Consumer Group
>>>> Command:
>>>>
>>>> --rewind-to-timestamp and --rewind-to-period
>>>>
>>>> @Onur:
>>>>
>>>>> You can actually get away with overriding while members of the group
>>> are live
>>>> with method 2 by using group information from DescribeGroupsRequest.
>>>>
>>>> This means that we need to have Consumer Group stopped before executing
>>> and
>>>> start a new consumer internally to do this? Therefore, we won't be able
>>> to
>>>> consider executing reset when ConsumerGroup is active? (trying to
>> relate
>>> it
>>>> with @Dong 5th question)
>>>>
>>>> @Dong:
>>>>
>>>>> Should we allow user to use wildcard to reset offset of all groups
>> for a
>>>> given topic as well?
>>>>
>>>> I haven't thought about this scenario. Could be interesting. Following
>>> the
>>>> recommendation to add it into Consumer Group Command, in this case
>> Group
>>>> argument will be optional if there are only 1 topic. I think for
>> multiple
>>>> topic won't be that useful.
>>>>
>>>>> Should we allow user to specify timestamp per topic partition in the
>>> json
>>>> file as well?
>>>>
>>>> Don't think this could be a valid from the tool, but if Reset Plan is
>>>> generated, and user want to set the offset for a specific partition to
>>>> other offset (eventually based on another timestamp), and execute it,
>> it
>>>> will be up to her/him.
>>>>
>>>>> Should the script take some credential file to make sure that this
>>>> operation is authenticated given the potential impact of this
>> operation?
>>>>
>>>> Haven't tried to secure brokers yet, but the tool should support
>>>> authorization if it's enabled in the broker.
>>>>
>>>>> Should we provide constant to reset committed offset to
>> earliest/latest
>>>> offset of a partition, e.g. -1 indicates earliest offset and -2
>> indicates
>>>> latest offset.
>>>>
>>>> I will go for something like ´--reset-to-earliest´ and
>>> ´--reset-to-latest´
>>>>
>>>>> Should we allow dynamic change of the comitted offset when consumer
>> are
>>>> running, such that consumer will seek to the newly committed offset and
>>>> start consuming from there?
>>>>
>>>> Not sure about this. I will recommend to keep it simple and ask user to
>>>> stop consumers first. But I would considered it if the trade-offs are
>>>> clear.
>>>>
>>>> @Matthias
>>>>
>>>> Added :). And thanks a lot for your help to define this KIP!
>>>>
>>>>
>>>>
>>>> On Wed, Feb 8, 2017 at 7:47, Gwen Shapira (<gw...@confluent.io>)
>>>> wrote:
>>>>
>>>>> As long as the CLI is a bit consistent? Like, not just adding 3
>>>>> arguments and a JSON parser to the existing tool, right?
>>>>>
>>>>> On Tue, Feb 7, 2017 at 10:29 PM, Onur Karaman
>>>>> <on...@gmail.com> wrote:
>>>>>> I think it makes sense to just add the feature to
>>>>> kafka-consumer-groups.sh
>>>>>>
>>>>>> On Tue, Feb 7, 2017 at 10:24 PM, Gwen Shapira <gw...@confluent.io>
>>> wrote:
>>>>>>
>>>>>>> Thanks for the KIP. I'm super happy about adding the capability.
>>>>>>>
>>>>>>> I hate the interface, though. It looks exactly like the replica
>>>>>>> assignment tool. A tool everyone loves so much that there are
>>> multiple
>>>>>>> projects, open and closed, that try to fix it.
>>>>>>>
>>>>>>> Can we swap it with something that looks a bit more like the
>> consumer
>>>>>>> group tool? or the kafka streams reset tool? Consistency is helpful
>>> in
>>>>>>> such cases. I spent some time learning existing tools and learning
>>> yet
>>>>>>> another one is a deterrent.
>>>>>>>
>>>>>>> Gwen
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Feb 7, 2017 at 6:43 PM, Jorge Esteban Quilcate Otoya
>>>>>>> <qu...@gmail.com> wrote:
>>>>>>>> Hi all,
>>>>>>>>
>>>>>>>> I would like to propose a KIP to Add a tool to Reset Consumer
>> Group
>>>>>>> Offsets.
>>>>>>>>
>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
>>>>>>> 122%3A+Add+a+tool+to+Reset+Consumer+Group+Offsets
>>>>>>>>
>>>>>>>> Please, take a look at the proposal and share your feedback.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Jorge.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Gwen Shapira
>>>>>>> Product Manager | Confluent
>>>>>>> 650.450.2760 | @gwenshap
>>>>>>> Follow us: Twitter | blog
>>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Gwen Shapira
>>>>> Product Manager | Confluent
>>>>> 650.450.2760 | @gwenshap
>>>>> Follow us: Twitter | blog
>>>>>
>>>
>>>
>>>
>>> --
>>> Gwen Shapira
>>> Product Manager | Confluent
>>> 650.450.2760 | @gwenshap
>>> Follow us: Twitter | blog
>>>
>>
>>
> 


Re: KIP-122: Add a tool to Reset Consumer Group Offsets

Posted by Jorge Esteban Quilcate Otoya <qu...@gmail.com>.
Hi,

According to the feedback, I've updated the KIP:

- We have added and ordered the scenarios, scopes, and execution options of
the Reset Offset tool.
- The tool is now treated as an extension of the current
`ConsumerGroupCommand` tool.
- Execution is possible without generating JSON files.

https://cwiki.apache.org/confluence/display/KAFKA/KIP-122%3A+Add+Reset+Consumer+Group+Offsets+tooling

Looking forward to your feedback!

Jorge.

On Wed, Feb 8, 2017 at 23:23, Jorge Esteban Quilcate Otoya (<
quilcate.jorge@gmail.com>) wrote:

> Great. I think I got the idea. What about this options:
>
> Scenarios:
>
> 1. Current status
>
> ´kafka-consumer-groups.sh --reset-offset --group cg1´
>
> 2. To Datetime
>
> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-datetime
> 2017-01-01T00:00:00.000´
>
> 3. To Period
>
> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-period P2D´
>
> 4. To Earliest
>
> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-earliest´
>
> 5. To Latest
>
> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-latest´
>
> 6. Minus 'n' offsets
>
> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-minus n´
>
> 7. Plus 'n' offsets
>
> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-plus n´
>
> 8. To specific offset
>
> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to x´
>
> Scopes:
>
> a. All topics used by Consumer Group
>
> Don't specify --topics
>
> b. Specific List of Topics
>
> Add list of values in --topics t1,t2,tn
>
> c. One Topic, all Partitions
>
> Add one topic and no partitions values: --topic t1
>
> d. One Topic, List of Partitions
>
> Add one topic and partitions values: --topic t1 --partitions 0,1,2
>
> About Reset Plan (JSON file):
>
> I think is still valid to have the option to persist reset configuration
> as a file, but I agree to give the option to run the tool without going
> down to the JSON file.
>
> Execution options:
>
> 1. Without execution argument (No args):
>
> Print out results (reset plan)
>
> 2. With --execute argument:
>
> Run reset process
>
> 3. With --output argument:
>
> Save result in a JSON format.
>
> 4. Only with --execute option and --reset-file (path to JSON)
>
> Reset based on file
>
> 5. Only with --verify option and --reset-file (path to JSON)
>
> Verify file values with current offsets
>
> I think we can remove --generate-and-execute because is a bit clumsy.
>
> With this options we will be able to execute with manual JSON
> configuration.
>
>
> On Wed, Feb 8, 2017 at 22:43, Ben Stopford (<be...@confluent.io>)
> wrote:
>
> Yes - using a tool like this to skip a set of consumer groups over a
> corrupt/bad message is definitely appealing.
>
> B
>
> On Wed, Feb 8, 2017 at 9:37 PM Gwen Shapira <gw...@confluent.io> wrote:
>
> > I like the --reset-to-earliest and --reset-to-latest. In general,
> > since the JSON route is the most challenging for users, we want to
> > provide a lot of ways to do useful things without going there.
> >
> > Two things that can help:
> >
> > 1. A lot of times, users want to skip few messages that cause issues
> > and continue. maybe just specifying the topic, partition and delta
> > will be better than having to find the offset and write a JSON and
> > validate the JSON etc.
> >
> > 2. Thinking if there are other common use-cases that we can make easy
> > rather than just one generic but not very usable method.
> >
> > Gwen
> >
> > On Wed, Feb 8, 2017 at 3:25 AM, Jorge Esteban Quilcate Otoya
> > <qu...@gmail.com> wrote:
> > > Thanks for the feedback!
> > >
> > > @Onur, @Gwen:
> > >
> > > Agree. Actually at the first draft I considered to have it inside
> > > ´kafka-consumer-groups.sh´, but I decide to propose it as a standalone
> > tool
> > > to describe it clearly and focus it on reset functionality.
> > >
> > > But now that you mentioned, it does make sense to have it in
> > > ´kafka-consumer-groups.sh´. How would be a consistent way to introduce
> > it?
> > >
> > > Maybe something like this:
> > >
> > > ´kafka-consumer-groups.sh --reset-offset --generate --group cg1
> --topics
> > t1
> > > --reset-from 2017-01-01T00:00:00.000 --output plan.json´
> > >
> > > ´kafka-consumer-groups.sh --reset-offset --verify --reset-json-file
> > > plan.json´
> > >
> > > ´kafka-consumer-groups.sh --reset-offset --execute --reset-json-file
> > > plan.json´
> > >
> > > ´kafka-consumer-groups.sh --reset-offset --generate-and-execute --group
> > cg1
> > > --topics t1 --reset-from 2017-01-01T00:00:00.000´
> > >
> > > @Gwen:
> > >
> > >> It looks exactly like the replica assignment tool
> > >
> > > It was influenced by ;-) I use the generate-verify-execute process here
> > to
> > > make sure user will be aware of the result of this operation. At the
> > > beginning we considered only add a couple of options to Consumer Group
> > > Command:
> > >
> > > --rewind-to-timestamp and --rewind-to-period
> > >
> > > @Onur:
> > >
> > >> You can actually get away with overriding while members of the group
> > are live
> > > with method 2 by using group information from DescribeGroupsRequest.
> > >
> > > This means that we need to have Consumer Group stopped before executing
> > and
> > > start a new consumer internally to do this? Therefore, we won't be able
> > to
> > > consider executing reset when ConsumerGroup is active? (trying to
> relate
> > it
> > > with @Dong 5th question)
> > >
> > > @Dong:
> > >
> > >> Should we allow user to use wildcard to reset offset of all groups
> for a
> > > given topic as well?
> > >
> > > I haven't thought about this scenario. Could be interesting. Following
> > the
> > > recommendation to add it into Consumer Group Command, in this case
> Group
> > > argument will be optional if there are only 1 topic. I think for
> multiple
> > > topic won't be that useful.
> > >
> > >> Should we allow user to specify timestamp per topic partition in the
> > json
> > > file as well?
> > >
> > > Don't think this could be a valid from the tool, but if Reset Plan is
> > > generated, and user want to set the offset for a specific partition to
> > > other offset (eventually based on another timestamp), and execute it,
> it
> > > will be up to her/him.
> > >
> > >> Should the script take some credential file to make sure that this
> > > operation is authenticated given the potential impact of this
> operation?
> > >
> > > Haven't tried to secure brokers yet, but the tool should support
> > > authorization if it's enabled in the broker.
> > >
> > >> Should we provide constant to reset committed offset to
> earliest/latest
> > > offset of a partition, e.g. -1 indicates earliest offset and -2
> indicates
> > > latest offset.
> > >
> > > I will go for something like ´--reset-to-earliest´ and
> > ´--reset-to-latest´
> > >
> > >> Should we allow dynamic change of the comitted offset when consumer
> are
> > > running, such that consumer will seek to the newly committed offset and
> > > start consuming from there?
> > >
> > > Not sure about this. I will recommend to keep it simple and ask user to
> > > stop consumers first. But I would considered it if the trade-offs are
> > > clear.
> > >
> > > @Matthias
> > >
> > > Added :). And thanks a lot for your help to define this KIP!
> > >
> > >
> > >
> > > On Wed, Feb 8, 2017 at 7:47, Gwen Shapira (<gw...@confluent.io>)
> > > wrote:
> > >
> > >> As long as the CLI is a bit consistent? Like, not just adding 3
> > >> arguments and a JSON parser to the existing tool, right?
> > >>
> > >> On Tue, Feb 7, 2017 at 10:29 PM, Onur Karaman
> > >> <on...@gmail.com> wrote:
> > >> > I think it makes sense to just add the feature to
> > >> kafka-consumer-groups.sh
> > >> >
> > >> > On Tue, Feb 7, 2017 at 10:24 PM, Gwen Shapira <gw...@confluent.io>
> > wrote:
> > >> >
> > >> >> Thanks for the KIP. I'm super happy about adding the capability.
> > >> >>
> > >> >> I hate the interface, though. It looks exactly like the replica
> > >> >> assignment tool. A tool everyone loves so much that there are
> > multiple
> > >> >> projects, open and closed, that try to fix it.
> > >> >>
> > >> >> Can we swap it with something that looks a bit more like the
> consumer
> > >> >> group tool? or the kafka streams reset tool? Consistency is helpful
> > in
> > >> >> such cases. I spent some time learning existing tools and learning
> > yet
> > >> >> another one is a deterrent.
> > >> >>
> > >> >> Gwen
> > >> >>
> > >> >>
> > >> >>
> > >> >> On Tue, Feb 7, 2017 at 6:43 PM, Jorge Esteban Quilcate Otoya
> > >> >> <qu...@gmail.com> wrote:
> > >> >> > Hi all,
> > >> >> >
> > >> >> > I would like to propose a KIP to Add a tool to Reset Consumer
> Group
> > >> >> Offsets.
> > >> >> >
> > >> >> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > >> >> 122%3A+Add+a+tool+to+Reset+Consumer+Group+Offsets
> > >> >> >
> > >> >> > Please, take a look at the proposal and share your feedback.
> > >> >> >
> > >> >> > Thanks,
> > >> >> > Jorge.
> > >> >>
> > >> >>
> > >> >>
> > >> >> --
> > >> >> Gwen Shapira
> > >> >> Product Manager | Confluent
> > >> >> 650.450.2760 | @gwenshap
> > >> >> Follow us: Twitter | blog
> > >> >>
> > >>
> > >>
> > >>
> > >> --
> > >> Gwen Shapira
> > >> Product Manager | Confluent
> > >> 650.450.2760 | @gwenshap
> > >> Follow us: Twitter | blog
> > >>
> >
> >
> >
> > --
> > Gwen Shapira
> > Product Manager | Confluent
> > 650.450.2760 | @gwenshap
> > Follow us: Twitter | blog
> >
>
>

Re: KIP-122: Add a tool to Reset Consumer Group Offsets

Posted by James Cheng <wu...@gmail.com>.
Yeah, that's a good point.

Some of the operations might make sense on multiple partitions at once. Moving to a timestamp might apply to all partitions, and so might moving backwards or forwards by N offsets.

However, moving to a specific offset ("set to offset 43") would most likely only make sense for one partition at a time. It might make sense to *require* topic and partition when moving to a specific offset.
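
A small sketch of that rule in plain Python. The scenario names and the
function are hypothetical and only illustrate the validation being suggested:
time-based and relative scenarios accept any scope, while an absolute offset
demands an explicit topic and partition.

```python
from typing import Optional, Sequence


def validate_scope(scenario: str, topic: Optional[str],
                   partitions: Optional[Sequence[int]]) -> None:
    """Reject an absolute-offset reset that lacks an explicit target.

    Hypothetical scenario names: "to-offset", "to-datetime", "shift-by".
    """
    if scenario == "to-offset" and (topic is None or not partitions):
        raise ValueError("to-offset requires an explicit topic and partition")
```

Under this rule, validate_scope("to-datetime", None, None) passes, while
validate_scope("to-offset", "t1", None) raises an error.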

-James

> On Feb 8, 2017, at 3:36 PM, Gwen Shapira <gw...@confluent.io> wrote:
> 
> Just to clarify, we'll need to allow specifying topic and partition. I
> don't think we want this on ALL partitions at once.
> 
> On Wed, Feb 8, 2017 at 3:35 PM, Gwen Shapira <gw...@confluent.io> wrote:
>> That's what I'd like to see. For example, suppose a Connect task fails
>> because it can't deserialize an event from a partition. Stop
>> connector, move offset forward, start connector. Boom!
>> 
>> 
>> On Wed, Feb 8, 2017 at 3:22 PM, Matthias J. Sax <ma...@confluent.io> wrote:
>>> I am not sure about --reset-plus and --reset-minus
>>> 
>>> Would this skip n messages forward/backward for each partition?
>>> 
>>> 
>>> -Matthias
>>> 
>>> On 2/8/17 2:23 PM, Jorge Esteban Quilcate Otoya wrote:
>>>> Great. I think I got the idea. What about this options:
>>>> 
>>>> Scenarios:
>>>> 
>>>> 1. Current status
>>>> 
>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1´
>>>> 
>>>> 2. To Datetime
>>>> 
>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-datetime
>>>> 2017-01-01T00:00:00.000´
>>>> 
>>>> 3. To Period
>>>> 
>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-period P2D´
>>>> 
>>>> 4. To Earliest
>>>> 
>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-earliest´
>>>> 
>>>> 5. To Latest
>>>> 
>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-latest´
>>>> 
>>>> 6. Minus 'n' offsets
>>>> 
>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-minus n´
>>>> 
>>>> 7. Plus 'n' offsets
>>>> 
>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-plus n´
>>>> 
>>>> 8. To specific offset
>>>> 
>>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to x´
>>>> 
>>>> Scopes:
>>>> 
>>>> a. All topics used by Consumer Group
>>>> 
>>>> Don't specify --topics
>>>> 
>>>> b. Specific List of Topics
>>>> 
>>>> Add list of values in --topics t1,t2,tn
>>>> 
>>>> c. One Topic, all Partitions
>>>> 
>>>> Add one topic and no partitions values: --topic t1
>>>> 
>>>> d. One Topic, List of Partitions
>>>> 
>>>> Add one topic and partitions values: --topic t1 --partitions 0,1,2
>>>> 
>>>> About Reset Plan (JSON file):
>>>> 
>>>> I think is still valid to have the option to persist reset configuration as
>>>> a file, but I agree to give the option to run the tool without going down
>>>> to the JSON file.
>>>> 
>>>> Execution options:
>>>> 
>>>> 1. Without execution argument (No args):
>>>> 
>>>> Print out results (reset plan)
>>>> 
>>>> 2. With --execute argument:
>>>> 
>>>> Run reset process
>>>> 
>>>> 3. With --output argument:
>>>> 
>>>> Save result in a JSON format.
>>>> 
>>>> 4. Only with --execute option and --reset-file (path to JSON)
>>>> 
>>>> Reset based on file
>>>> 
>>>> 5. Only with --verify option and --reset-file (path to JSON)
>>>> 
>>>> Verify file values with current offsets
>>>> 
>>>> I think we can remove --generate-and-execute because is a bit clumsy.
>>>> 
>>>> With this options we will be able to execute with manual JSON configuration.
>>>> 
>>>> 
>>>> On Wed, Feb 8, 2017 at 22:43, Ben Stopford (<be...@confluent.io>) wrote:
>>>> 
>>>>> Yes - using a tool like this to skip a set of consumer groups over a
>>>>> corrupt/bad message is definitely appealing.
>>>>> 
>>>>> B
>>>>> 
>>>>> On Wed, Feb 8, 2017 at 9:37 PM Gwen Shapira <gw...@confluent.io> wrote:
>>>>> 
>>>>>> I like the --reset-to-earliest and --reset-to-latest. In general,
>>>>>> since the JSON route is the most challenging for users, we want to
>>>>>> provide a lot of ways to do useful things without going there.
>>>>>> 
>>>>>> Two things that can help:
>>>>>> 
>>>>>> 1. A lot of times, users want to skip few messages that cause issues
>>>>>> and continue. maybe just specifying the topic, partition and delta
>>>>>> will be better than having to find the offset and write a JSON and
>>>>>> validate the JSON etc.
>>>>>> 
>>>>>> 2. Thinking if there are other common use-cases that we can make easy
>>>>>> rather than just one generic but not very usable method.
>>>>>> 
>>>>>> Gwen
>>>>>> 
>>>>>> On Wed, Feb 8, 2017 at 3:25 AM, Jorge Esteban Quilcate Otoya
>>>>>> <qu...@gmail.com> wrote:
>>>>>>> Thanks for the feedback!
>>>>>>> 
>>>>>>> @Onur, @Gwen:
>>>>>>> 
>>>>>>> Agree. Actually at the first draft I considered to have it inside
>>>>>>> ´kafka-consumer-groups.sh´, but I decide to propose it as a standalone
>>>>>>> tool to describe it clearly and focus it on reset functionality.
>>>>>>> 
>>>>>>> But now that you mentioned, it does make sense to have it in
>>>>>>> ´kafka-consumer-groups.sh´. How would be a consistent way to introduce
>>>>>>> it?
>>>>>>> 
>>>>>>> Maybe something like this:
>>>>>>> 
>>>>>>> ´kafka-consumer-groups.sh --reset-offset --generate --group cg1
>>>>>>> --topics t1 --reset-from 2017-01-01T00:00:00.000 --output plan.json´
>>>>>>> 
>>>>>>> ´kafka-consumer-groups.sh --reset-offset --verify --reset-json-file
>>>>>>> plan.json´
>>>>>>> 
>>>>>>> ´kafka-consumer-groups.sh --reset-offset --execute --reset-json-file
>>>>>>> plan.json´
>>>>>>> 
>>>>>>> ´kafka-consumer-groups.sh --reset-offset --generate-and-execute --group
>>>>>>> cg1 --topics t1 --reset-from 2017-01-01T00:00:00.000´
>>>>>>> 
>>>>>>> @Gwen:
>>>>>>> 
>>>>>>>> It looks exactly like the replica assignment tool
>>>>>>> 
>>>>>>> It was influenced by ;-) I use the generate-verify-execute process here
>>>>>>> to make sure user will be aware of the result of this operation. At the
>>>>>>> beginning we considered only add a couple of options to Consumer Group
>>>>>>> Command:
>>>>>>> 
>>>>>>> --rewind-to-timestamp and --rewind-to-period
>>>>>>> 
>>>>>>> @Onur:
>>>>>>> 
>>>>>>>> You can actually get away with overriding while members of the group
>>>>>>>> are live with method 2 by using group information from
>>>>>>>> DescribeGroupsRequest.
>>>>>>> 
>>>>>>> This means that we need to have Consumer Group stopped before executing
>>>>>>> and start a new consumer internally to do this? Therefore, we won't be
>>>>>>> able to consider executing reset when ConsumerGroup is active? (trying
>>>>>>> to relate it with @Dong 5th question)
>>>>>>> 
>>>>>>> @Dong:
>>>>>>> 
>>>>>>>> Should we allow user to use wildcard to reset offset of all groups
>>>>>>>> for a given topic as well?
>>>>>>> 
>>>>>>> I haven't thought about this scenario. Could be interesting. Following
>>>>>>> the recommendation to add it into Consumer Group Command, in this case
>>>>>>> Group argument will be optional if there are only 1 topic. I think for
>>>>>>> multiple topic won't be that useful.
>>>>>>> 
>>>>>>>> Should we allow user to specify timestamp per topic partition in the
>>>>>>>> json file as well?
>>>>>>> 
>>>>>>> Don't think this could be a valid from the tool, but if Reset Plan is
>>>>>>> generated, and user want to set the offset for a specific partition to
>>>>>>> other offset (eventually based on another timestamp), and execute it,
>>>>>>> it will be up to her/him.
>>>>>>> 
>>>>>>>> Should the script take some credential file to make sure that this
>>>>>>>> operation is authenticated given the potential impact of this
>>>>>>>> operation?
>>>>>>> 
>>>>>>> Haven't tried to secure brokers yet, but the tool should support
>>>>>>> authorization if it's enabled in the broker.
>>>>>>> 
>>>>>>>> Should we provide constant to reset committed offset to
>>>>>>>> earliest/latest offset of a partition, e.g. -1 indicates earliest
>>>>>>>> offset and -2 indicates latest offset.
>>>>>>> 
>>>>>>> I will go for something like ´--reset-to-earliest´ and
>>>>>>> ´--reset-to-latest´
>>>>>>> 
>>>>>>>> Should we allow dynamic change of the committed offset when consumer
>>>>>>>> are running, such that consumer will seek to the newly committed offset
>>>>>>>> and start consuming from there?
>>>>>>> 
>>>>>>> Not sure about this. I will recommend to keep it simple and ask user to
>>>>>>> stop consumers first. But I would considered it if the trade-offs are
>>>>>>> clear.
>>>>>>> 
>>>>>>> @Matthias
>>>>>>> 
>>>>>>> Added :). And thanks a lot for your help to define this KIP!
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> On Wed, Feb 8, 2017 at 7:47, Gwen Shapira (<gw...@confluent.io>) wrote:
>>>>>>> 
>>>>>>>> As long as the CLI is a bit consistent? Like, not just adding 3
>>>>>>>> arguments and a JSON parser to the existing tool, right?
>>>>>>>> 
>>>>>>>> On Tue, Feb 7, 2017 at 10:29 PM, Onur Karaman
>>>>>>>> <on...@gmail.com> wrote:
>>>>>>>>> I think it makes sense to just add the feature to
>>>>>>>>> kafka-consumer-groups.sh
>>>>>>>>> 
>>>>>>>>> On Tue, Feb 7, 2017 at 10:24 PM, Gwen Shapira <gw...@confluent.io>
>>>>>>>>> wrote:
>>>>>>>>> 
>>>>>>>>>> Thanks for the KIP. I'm super happy about adding the capability.
>>>>>>>>>> 
>>>>>>>>>> I hate the interface, though. It looks exactly like the replica
>>>>>>>>>> assignment tool. A tool everyone loves so much that there are
>>>>>>>>>> multiple projects, open and closed, that try to fix it.
>>>>>>>>>> 
>>>>>>>>>> Can we swap it with something that looks a bit more like the
>>>>>>>>>> consumer group tool? or the kafka streams reset tool? Consistency is
>>>>>>>>>> helpful in such cases. I spent some time learning existing tools and
>>>>>>>>>> learning yet another one is a deterrent.
>>>>>>>>>> 
>>>>>>>>>> Gwen
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> On Tue, Feb 7, 2017 at 6:43 PM, Jorge Esteban Quilcate Otoya
>>>>>>>>>> <qu...@gmail.com> wrote:
>>>>>>>>>>> Hi all,
>>>>>>>>>>> 
>>>>>>>>>>> I would like to propose a KIP to Add a tool to Reset Consumer Group
>>>>>>>>>>> Offsets.
>>>>>>>>>>>
>>>>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-122%3A+Add+a+tool+to+Reset+Consumer+Group+Offsets
>>>>>>>>>>> 
>>>>>>>>>>> Please, take a look at the proposal and share your feedback.
>>>>>>>>>>> 
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Jorge.
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> --
>>>>>>>>>> Gwen Shapira
>>>>>>>>>> Product Manager | Confluent
>>>>>>>>>> 650.450.2760 | @gwenshap
>>>>>>>>>> Follow us: Twitter | blog
>>>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> --
>>>>>>>> Gwen Shapira
>>>>>>>> Product Manager | Confluent
>>>>>>>> 650.450.2760 | @gwenshap
>>>>>>>> Follow us: Twitter | blog
>>>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> --
>>>>>> Gwen Shapira
>>>>>> Product Manager | Confluent
>>>>>> 650.450.2760 | @gwenshap
>>>>>> Follow us: Twitter | blog
>>>>>> 
>>>>> 
>>>> 
>>> 
>> 
>> 
>> 
>> --
>> Gwen Shapira
>> Product Manager | Confluent
>> 650.450.2760 | @gwenshap
>> Follow us: Twitter | blog
> 
> 
> 
> -- 
> Gwen Shapira
> Product Manager | Confluent
> 650.450.2760 | @gwenshap
> Follow us: Twitter | blog


Re: KIP-122: Add a tool to Reset Consumer Group Offsets

Posted by Gwen Shapira <gw...@confluent.io>.
Just to clarify, we'll need to allow specifying topic and partition. I
don't think we want this on ALL partitions at once.

On Wed, Feb 8, 2017 at 3:35 PM, Gwen Shapira <gw...@confluent.io> wrote:
> That's what I'd like to see. For example, suppose a Connect task fails
> because it can't deserialize an event from a partition. Stop
> connector, move offset forward, start connector. Boom!



-- 
Gwen Shapira
Product Manager | Confluent
650.450.2760 | @gwenshap
Follow us: Twitter | blog

Re: KIP-122: Add a tool to Reset Consumer Group Offsets

Posted by Gwen Shapira <gw...@confluent.io>.
That's what I'd like to see. For example, suppose a Connect task fails
because it can't deserialize an event from a partition. Stop
connector, move offset forward, start connector. Boom!


On Wed, Feb 8, 2017 at 3:22 PM, Matthias J. Sax <ma...@confluent.io> wrote:
> I am not sure about --reset-plus and --reset-minus
>
> Would this skip n messages forward/backward for each partitions?
>
>
> -Matthias



-- 
Gwen Shapira
Product Manager | Confluent
650.450.2760 | @gwenshap
Follow us: Twitter | blog

Re: KIP-122: Add a tool to Reset Consumer Group Offsets

Posted by "Matthias J. Sax" <ma...@confluent.io>.
I am not sure about --reset-plus and --reset-minus.

Would this skip n messages forward/backward for each partition?


-Matthias
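
One possible reading of the per-partition behaviour, sketched in Python under stated assumptions: the shift is applied independently to every partition in scope, and the result is clamped to that partition's earliest and latest offsets. The clamping rule and the `shift_offsets` helper are assumptions made for illustration; the thread does not specify them.

```python
from typing import Dict, Tuple

TopicPartition = Tuple[str, int]  # (topic, partition)


def shift_offsets(
    committed: Dict[TopicPartition, int],
    earliest: Dict[TopicPartition, int],
    latest: Dict[TopicPartition, int],
    delta: int,
) -> Dict[TopicPartition, int]:
    """Shift each partition's committed offset by delta (negative = rewind).

    Each result is clamped to [earliest, latest] for its own partition,
    so a large delta never points outside the log (assumed behaviour).
    """
    shifted = {}
    for tp, offset in committed.items():
        shifted[tp] = max(earliest[tp], min(latest[tp], offset + delta))
    return shifted
```

Under this reading, a delta of n moves every partition by up to n messages, so a group reading four partitions could skip up to 4n messages in total. That is one reason to restrict the scope to a single topic and partition when skipping a bad record.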



Re: KIP-122: Add a tool to Reset Consumer Group Offsets

Posted by Jorge Esteban Quilcate Otoya <qu...@gmail.com>.
Hi,

Based on the feedback, I've updated the KIP:

- We have added and ordered the scenarios, scopes and execution options of the
Reset Offset tool.
- It is now considered an extension to the current `ConsumerGroupCommand` tool.
- Execution will be possible without generating JSON files.

https://cwiki.apache.org/confluence/display/KAFKA/KIP-122%3A+Add+Reset+Consumer+Group+Offsets+tooling

Looking forward to your feedback!

Jorge.


Re: KIP-122: Add a tool to Reset Consumer Group Offsets

Posted by Jorge Esteban Quilcate Otoya <qu...@gmail.com>.
Great. I think I got the idea. What about these options:

Scenarios:

1. Current status

`kafka-consumer-groups.sh --reset-offset --group cg1`

2. To Datetime

`kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-datetime
2017-01-01T00:00:00.000`

3. To Period

`kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-period P2D`

4. To Earliest

`kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-earliest`

5. To Latest

`kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-latest`

6. Minus 'n' offsets

`kafka-consumer-groups.sh --reset-offset --group cg1 --reset-minus n`

7. Plus 'n' offsets

`kafka-consumer-groups.sh --reset-offset --group cg1 --reset-plus n`

8. To specific offset

`kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to x`
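
As a rough illustration of scenarios 2 and 3, the datetime and period arguments could both be turned into an epoch timestamp in milliseconds before looking up offsets. The Python sketch below is only an assumption of how that might work: the helper names are hypothetical, the datetime is treated as UTC, and the period parser handles only the day/hour/minute/second part of ISO-8601 durations.

```python
import re
from datetime import datetime, timedelta, timezone

# Subset of ISO-8601 durations: PnDTnHnMnS (no years, months or weeks).
_PERIOD = re.compile(r"^P(?:(\d+)D)?(?:T(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?)?$")


def parse_period(text):
    """Parse a duration such as 'P2D' or 'PT1H30M' into a timedelta."""
    match = _PERIOD.match(text)
    if not match or not any(match.groups()):
        raise ValueError("unsupported period: %r" % text)
    days, hours, minutes, seconds = (int(g or 0) for g in match.groups())
    return timedelta(days=days, hours=hours, minutes=minutes, seconds=seconds)


def datetime_to_millis(text):
    """Convert 'YYYY-MM-DDTHH:MM:SS.sss' (assumed to be UTC) to epoch millis."""
    parsed = datetime.strptime(text, "%Y-%m-%dT%H:%M:%S.%f")
    return int(parsed.replace(tzinfo=timezone.utc).timestamp() * 1000)


def period_to_millis(text, now):
    """Epoch millis of the instant `text` before `now` (a timezone-aware datetime)."""
    return int((now - parse_period(text)).timestamp() * 1000)


print(datetime_to_millis("2017-01-01T00:00:00.000"))
print(parse_period("P2D"))
```

Once both scenarios reduce to a timestamp, the remaining step (finding the offset for that timestamp in each partition) would be the same for both.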

Scopes:

a. All topics used by Consumer Group

Don't specify --topics

b. Specific List of Topics

Add list of values in --topics t1,t2,tn

c. One Topic, all Partitions

Add one topic and no partitions values: --topic t1

d. One Topic, List of Partitions

Add one topic and partitions values: --topic t1 --partitions 0,1,2
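
The four scopes can be read as successive filters over the partitions the group has committed offsets for. Here is a small Python sketch of that idea; the function name and the (topic, partition) map are hypothetical, and rejecting a partition list without exactly one topic is my assumption about how scope d would be validated.

```python
def select_partitions(committed, topics=None, partitions=None):
    """Pick the (topic, partition) entries a reset would apply to.

    committed:  map of (topic, partition) -> committed offset for the group
    topics:     None for scope a, otherwise a list of topic names (b, c, d)
    partitions: None for all partitions, otherwise a list of ids (scope d)
    """
    if partitions is not None and (topics is None or len(topics) != 1):
        raise ValueError("a partition list needs exactly one topic")
    selected = {}
    for (topic, partition), offset in committed.items():
        if topics is not None and topic not in topics:
            continue
        if partitions is not None and partition not in partitions:
            continue
        selected[(topic, partition)] = offset
    return selected


committed = {("t1", 0): 10, ("t1", 1): 20, ("t2", 0): 30}

print(select_partitions(committed))                    # a. all topics
print(select_partitions(committed, ["t1", "t2"]))      # b. list of topics
print(select_partitions(committed, ["t1"]))            # c. one topic
print(select_partitions(committed, ["t1"], [0]))       # d. one topic, partitions
```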

About Reset Plan (JSON file):

I think it is still valid to have the option to persist the reset configuration
as a file, but I agree we should offer a way to run the tool without going down
to the JSON file.

Execution options:

1. Without execution argument (No args):

Print out results (reset plan)

2. With --execute argument:

Run reset process

3. With --output argument:

Save result in a JSON format.

4. Only with --execute option and --reset-file (path to JSON)

Reset based on file

5. Only with --verify option and --reset-file (path to JSON)

Verify file values with current offsets
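
To show how these execution options might fit together, here is a minimal Python sketch. Everything in it is an assumption for illustration: the JSON layout of the reset plan, the function names, and the `commit` callback that stands in for whatever would actually write the offsets. "Verify" is read here as "report the partitions whose current offset differs from the plan".

```python
import json


def build_plan(group, targets):
    """Build a reset plan (hypothetical layout) from (topic, partition) -> offset."""
    return {
        "group": group,
        "offsets": [
            {"topic": topic, "partition": partition, "offset": offset}
            for (topic, partition), offset in sorted(targets.items())
        ],
    }


def run(group, targets, execute=False, output=None, commit=None):
    """Dry run by default; write the plan with `output`, apply it with `execute`."""
    plan = build_plan(group, targets)
    if output is not None:
        with open(output, "w") as handle:
            json.dump(plan, handle, indent=2)
    if execute:
        commit(group, targets)  # stand-in for the real offset commit
    else:
        print(json.dumps(plan, indent=2))
    return plan


def verify(plan, current):
    """Return the (topic, partition) pairs whose current offset differs from the plan."""
    mismatches = []
    for entry in plan["offsets"]:
        tp = (entry["topic"], entry["partition"])
        if current.get(tp) != entry["offset"]:
            mismatches.append(tp)
    return mismatches


# No execution argument: only print the plan.
run("cg1", {("t1", 0): 42, ("t1", 1): 7})
```

Keeping the dry run as the default means a mistyped argument only prints a plan and never moves offsets.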

I think we can remove --generate-and-execute because it is a bit clumsy.

With these options we will be able to execute with manual JSON configuration.


On Wed, Feb 8, 2017 at 22:43, Ben Stopford (<be...@confluent.io>) wrote:

> Yes - using a tool like this to skip a set of consumer groups over a
> corrupt/bad message is definitely appealing.
>
> B
>
> On Wed, Feb 8, 2017 at 9:37 PM Gwen Shapira <gw...@confluent.io> wrote:
>
> > I like the --reset-to-earliest and --reset-to-latest. In general,
> > since the JSON route is the most challenging for users, we want to
> > provide a lot of ways to do useful things without going there.
> >
> > Two things that can help:
> >
> > 1. A lot of times, users want to skip a few messages that cause issues
> > and continue. Maybe just specifying the topic, partition and delta
> > will be better than having to find the offset, write a JSON file,
> > validate the JSON, etc.
> >
> > 2. Thinking if there are other common use-cases that we can make easy
> > rather than just one generic but not very usable method.
> >
> > Gwen
> >
> > On Wed, Feb 8, 2017 at 3:25 AM, Jorge Esteban Quilcate Otoya
> > <qu...@gmail.com> wrote:
> > > Thanks for the feedback!
> > >
> > > @Onur, @Gwen:
> > >
> > > Agreed. Actually, in the first draft I considered having it inside
> > > `kafka-consumer-groups.sh`, but I decided to propose it as a standalone
> > > tool to describe it clearly and keep it focused on reset functionality.
> > >
> > > But now that you mention it, it does make sense to have it in
> > > `kafka-consumer-groups.sh`. What would be a consistent way to introduce
> > > it?
> > >
> > > Maybe something like this:
> > >
> > > `kafka-consumer-groups.sh --reset-offset --generate --group cg1 --topics t1
> > > --reset-from 2017-01-01T00:00:00.000 --output plan.json`
> > >
> > > `kafka-consumer-groups.sh --reset-offset --verify --reset-json-file
> > > plan.json`
> > >
> > > `kafka-consumer-groups.sh --reset-offset --execute --reset-json-file
> > > plan.json`
> > >
> > > `kafka-consumer-groups.sh --reset-offset --generate-and-execute --group cg1
> > > --topics t1 --reset-from 2017-01-01T00:00:00.000`
> > >
> > > @Gwen:
> > >
> > >> It looks exactly like the replica assignment tool
> > >
> > > It was influenced by it ;-) I used the generate-verify-execute process here
> > > to make sure the user is aware of the result of this operation. At the
> > > beginning we considered only adding a couple of options to the Consumer
> > > Group Command:
> > >
> > > --rewind-to-timestamp and --rewind-to-period
> > >
> > > @Onur:
> > >
> > >> You can actually get away with overriding while members of the group
> > >> are live with method 2 by using group information from
> > >> DescribeGroupsRequest.
> > >
> > > Does this mean that we need to have the Consumer Group stopped before
> > > executing, and start a new consumer internally to do this? Therefore, we
> > > won't be able to consider executing a reset when the Consumer Group is
> > > active? (Trying to relate it to @Dong's 5th question.)
> > >
> > > @Dong:
> > >
> > >> Should we allow user to use wildcard to reset offset of all groups for a
> > >> given topic as well?
> > >
> > > I haven't thought about this scenario. It could be interesting. Following
> > > the recommendation to add it to the Consumer Group Command, in this case
> > > the Group argument would be optional if there is only 1 topic. I think for
> > > multiple topics it won't be that useful.
> > >
> > >> Should we allow user to specify timestamp per topic partition in the
> > >> json file as well?
> > >
> > > I don't think this would be valid from the tool itself, but if a Reset Plan
> > > is generated and the user wants to set the offset for a specific partition
> > > to another offset (possibly based on another timestamp) and execute it,
> > > that will be up to her/him.
> > >
> > >> Should the script take some credential file to make sure that this
> > > operation is authenticated given the potential impact of this
> operation?
> > >
> > > Haven't tried to secure brokers yet, but the tool should support
> > > authorization if it's enabled in the broker.
> > >
> > >> Should we provide constant to reset committed offset to
> earliest/latest
> > > offset of a partition, e.g. -1 indicates earliest offset and -2
> indicates
> > > latest offset.
> > >
> > > I will go for something like ´--reset-to-earliest´ and
> > ´--reset-to-latest´
> > >
> > >> Should we allow dynamic change of the comitted offset when consumer
> are
> > > running, such that consumer will seek to the newly committed offset and
> > > start consuming from there?
> > >
> > > Not sure about this. I will recommend to keep it simple and ask user to
> > > stop consumers first. But I would considered it if the trade-offs are
> > > clear.
> > >
> > > @Matthias
> > >
> > > Added :). And thanks a lot for your help to define this KIP!
> > >
> > >
> > >
> > > > On Wed, Feb 8, 2017 at 7:47, Gwen Shapira (<gw...@confluent.io>)
> > > > wrote:
> > >
> > >> As long as the CLI is a bit consistent? Like, not just adding 3
> > >> arguments and a JSON parser to the existing tool, right?
> > >>
> > >> On Tue, Feb 7, 2017 at 10:29 PM, Onur Karaman
> > >> <on...@gmail.com> wrote:
> > >> > I think it makes sense to just add the feature to
> > >> kafka-consumer-groups.sh
> > >> >
> > >> > On Tue, Feb 7, 2017 at 10:24 PM, Gwen Shapira <gw...@confluent.io>
> > wrote:
> > >> >
> > >> >> Thanks for the KIP. I'm super happy about adding the capability.
> > >> >>
> > >> >> I hate the interface, though. It looks exactly like the replica
> > >> >> assignment tool. A tool everyone loves so much that there are
> > multiple
> > >> >> projects, open and closed, that try to fix it.
> > >> >>
> > >> >> Can we swap it with something that looks a bit more like the
> consumer
> > >> >> group tool? or the kafka streams reset tool? Consistency is helpful
> > in
> > >> >> such cases. I spent some time learning existing tools and learning
> > yet
> > >> >> another one is a deterrent.
> > >> >>
> > >> >> Gwen
> > >> >>
> > >> >>
> > >> >>
> > >> >> On Tue, Feb 7, 2017 at 6:43 PM, Jorge Esteban Quilcate Otoya
> > >> >> <qu...@gmail.com> wrote:
> > >> >> > Hi all,
> > >> >> >
> > >> >> > I would like to propose a KIP to Add a tool to Reset Consumer
> Group
> > >> >> Offsets.
> > >> >> >
> > >> >> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > >> >> 122%3A+Add+a+tool+to+Reset+Consumer+Group+Offsets
> > >> >> >
> > >> >> > Please, take a look at the proposal and share your feedback.
> > >> >> >
> > >> >> > Thanks,
> > >> >> > Jorge.
> > >> >>
> > >> >>
> > >> >>
> > >> >> --
> > >> >> Gwen Shapira
> > >> >> Product Manager | Confluent
> > >> >> 650.450.2760 <(650)%20450-2760> <(650)%20450-2760>
> <(650)%20450-2760> | @gwenshap
> > >> >> Follow us: Twitter | blog
> > >> >>
> > >>
> > >>
> > >>
> > >> --
> > >> Gwen Shapira
> > >> Product Manager | Confluent
> > >> 650.450.2760 <(650)%20450-2760> <(650)%20450-2760>
> <(650)%20450-2760> | @gwenshap
> > >> Follow us: Twitter | blog
> > >>
> >
> >
> >
> > --
> > Gwen Shapira
> > Product Manager | Confluent
> > 650.450.2760 | @gwenshap
> > Follow us: Twitter | blog
> >
>

Re: KIP-122: Add a tool to Reset Consumer Group Offsets

Posted by Ben Stopford <be...@confluent.io>.
Yes - using a tool like this to skip a set of consumer groups over a
corrupt/bad message is definitely appealing.

B

On Wed, Feb 8, 2017 at 9:37 PM Gwen Shapira <gw...@confluent.io> wrote:

> I like the --reset-to-earliest and --reset-to-latest. In general,
> since the JSON route is the most challenging for users, we want to
> provide a lot of ways to do useful things without going there.
>
> Two things that can help:
>
> 1. A lot of times, users want to skip few messages that cause issues
> and continue. maybe just specifying the topic, partition and delta
> will be better than having to find the offset and write a JSON and
> validate the JSON etc.
>
> 2. Thinking if there are other common use-cases that we can make easy
> rather than just one generic but not very usable method.
>
> Gwen
>
> On Wed, Feb 8, 2017 at 3:25 AM, Jorge Esteban Quilcate Otoya
> <qu...@gmail.com> wrote:
> > Thanks for the feedback!
> >
> > @Onur, @Gwen:
> >
> > Agree. Actually at the first draft I considered to have it inside
> > ´kafka-consumer-groups.sh´, but I decide to propose it as a standalone
> tool
> > to describe it clearly and focus it on reset functionality.
> >
> > But now that you mentioned, it does make sense to have it in
> > ´kafka-consumer-groups.sh´. How would be a consistent way to introduce
> it?
> >
> > Maybe something like this:
> >
> > ´kafka-consumer-groups.sh --reset-offset --generate --group cg1 --topics
> t1
> > --reset-from 2017-01-01T00:00:00.000 --output plan.json´
> >
> > ´kafka-consumer-groups.sh --reset-offset --verify --reset-json-file
> > plan.json´
> >
> > ´kafka-consumer-groups.sh --reset-offset --execute --reset-json-file
> > plan.json´
> >
> > ´kafka-consumer-groups.sh --reset-offset --generate-and-execute --group
> cg1
> > --topics t1 --reset-from 2017-01-01T00:00:00.000´
> >
> > @Gwen:
> >
> >> It looks exactly like the replica assignment tool
> >
> > It was influenced by ;-) I use the generate-verify-execute process here
> to
> > make sure user will be aware of the result of this operation. At the
> > beginning we considered only add a couple of options to Consumer Group
> > Command:
> >
> > --rewind-to-timestamp and --rewind-to-period
> >
> > @Onur:
> >
> >> You can actually get away with overriding while members of the group
> are live
> > with method 2 by using group information from DescribeGroupsRequest.
> >
> > This means that we need to have Consumer Group stopped before executing
> and
> > start a new consumer internally to do this? Therefore, we won't be able
> to
> > consider executing reset when ConsumerGroup is active? (trying to relate
> it
> > with @Dong 5th question)
> >
> > @Dong:
> >
> >> Should we allow user to use wildcard to reset offset of all groups for a
> > given topic as well?
> >
> > I haven't thought about this scenario. Could be interesting. Following
> the
> > recommendation to add it into Consumer Group Command, in this case Group
> > argument will be optional if there are only 1 topic. I think for multiple
> > topic won't be that useful.
> >
> >> Should we allow user to specify timestamp per topic partition in the
> json
> > file as well?
> >
> > Don't think this could be a valid from the tool, but if Reset Plan is
> > generated, and user want to set the offset for a specific partition to
> > other offset (eventually based on another timestamp), and execute it, it
> > will be up to her/him.
> >
> >> Should the script take some credential file to make sure that this
> > operation is authenticated given the potential impact of this operation?
> >
> > Haven't tried to secure brokers yet, but the tool should support
> > authorization if it's enabled in the broker.
> >
> >> Should we provide constant to reset committed offset to earliest/latest
> > offset of a partition, e.g. -1 indicates earliest offset and -2 indicates
> > latest offset.
> >
> > I will go for something like ´--reset-to-earliest´ and
> ´--reset-to-latest´
> >
> >> Should we allow dynamic change of the comitted offset when consumer are
> > running, such that consumer will seek to the newly committed offset and
> > start consuming from there?
> >
> > Not sure about this. I will recommend to keep it simple and ask user to
> > stop consumers first. But I would considered it if the trade-offs are
> > clear.
> >
> > @Matthias
> >
> > Added :). And thanks a lot for your help to define this KIP!
> >
> >
> >
> > On Wed, Feb 8, 2017 at 7:47, Gwen Shapira (<gw...@confluent.io>)
> > wrote:
> >
> >> As long as the CLI is a bit consistent? Like, not just adding 3
> >> arguments and a JSON parser to the existing tool, right?
> >>
> >> On Tue, Feb 7, 2017 at 10:29 PM, Onur Karaman
> >> <on...@gmail.com> wrote:
> >> > I think it makes sense to just add the feature to
> >> kafka-consumer-groups.sh
> >> >
> >> > On Tue, Feb 7, 2017 at 10:24 PM, Gwen Shapira <gw...@confluent.io>
> wrote:
> >> >
> >> >> Thanks for the KIP. I'm super happy about adding the capability.
> >> >>
> >> >> I hate the interface, though. It looks exactly like the replica
> >> >> assignment tool. A tool everyone loves so much that there are
> multiple
> >> >> projects, open and closed, that try to fix it.
> >> >>
> >> >> Can we swap it with something that looks a bit more like the consumer
> >> >> group tool? or the kafka streams reset tool? Consistency is helpful
> in
> >> >> such cases. I spent some time learning existing tools and learning
> yet
> >> >> another one is a deterrent.
> >> >>
> >> >> Gwen
> >> >>
> >> >>
> >> >>
> >> >> On Tue, Feb 7, 2017 at 6:43 PM, Jorge Esteban Quilcate Otoya
> >> >> <qu...@gmail.com> wrote:
> >> >> > Hi all,
> >> >> >
> >> >> > I would like to propose a KIP to Add a tool to Reset Consumer Group
> >> >> Offsets.
> >> >> >
> >> >> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> >> >> 122%3A+Add+a+tool+to+Reset+Consumer+Group+Offsets
> >> >> >
> >> >> > Please, take a look at the proposal and share your feedback.
> >> >> >
> >> >> > Thanks,
> >> >> > Jorge.
> >> >>
> >> >>
> >> >>
> >> >> --
> >> >> Gwen Shapira
> >> >> Product Manager | Confluent
> >> >> 650.450.2760 | @gwenshap
> >> >> Follow us: Twitter | blog
> >> >>
> >>
> >>
> >>
> >> --
> >> Gwen Shapira
> >> Product Manager | Confluent
> >> 650.450.2760 | @gwenshap
> >> Follow us: Twitter | blog
> >>
>
>
>
> --
> Gwen Shapira
> Product Manager | Confluent
> 650.450.2760 | @gwenshap
> Follow us: Twitter | blog
>

Re: KIP-122: Add a tool to Reset Consumer Group Offsets

Posted by Gwen Shapira <gw...@confluent.io>.
I like the --reset-to-earliest and --reset-to-latest. In general,
since the JSON route is the most challenging for users, we want to
provide a lot of ways to do useful things without going there.

Two things that can help:

1. A lot of times, users want to skip a few messages that cause issues
and continue. Maybe just specifying the topic, partition and delta
would be better than having to find the offset, write a JSON file and
validate it.

2. Think about whether there are other common use cases that we can make
easy, rather than offering just one generic but not very usable method.

Gwen
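
To make the "topic, partition and delta" idea in point 1 concrete, here is a
minimal sketch in Python. Nothing here comes from the KIP: the function name
shift_offset and the clamping to the partition's earliest/latest offsets are
assumptions made only for illustration.

```python
def shift_offset(current: int, delta: int, earliest: int, latest: int) -> int:
    """Move a committed offset by delta, clamped to the partition's valid range.

    Hypothetical helper: the clamping rule is an assumption, not KIP behaviour.
    """
    if earliest > latest:
        raise ValueError("earliest offset must not exceed latest offset")
    target = current + delta
    # Never point the group before the first or after the last known offset.
    return max(earliest, min(target, latest))


# Skip 3 problematic messages on a hypothetical partition t1-0.
print(shift_offset(current=120, delta=3, earliest=100, latest=500))
```

A negative delta would rewind instead of skipping, which would cover the
"reprocess the last N messages" case with the same argument.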

On Wed, Feb 8, 2017 at 3:25 AM, Jorge Esteban Quilcate Otoya
<qu...@gmail.com> wrote:
> Thanks for the feedback!
>
> @Onur, @Gwen:
>
> Agree. Actually at the first draft I considered to have it inside
> ´kafka-consumer-groups.sh´, but I decide to propose it as a standalone tool
> to describe it clearly and focus it on reset functionality.
>
> But now that you mentioned, it does make sense to have it in
> ´kafka-consumer-groups.sh´. How would be a consistent way to introduce it?
>
> Maybe something like this:
>
> ´kafka-consumer-groups.sh --reset-offset --generate --group cg1 --topics t1
> --reset-from 2017-01-01T00:00:00.000 --output plan.json´
>
> ´kafka-consumer-groups.sh --reset-offset --verify --reset-json-file
> plan.json´
>
> ´kafka-consumer-groups.sh --reset-offset --execute --reset-json-file
> plan.json´
>
> ´kafka-consumer-groups.sh --reset-offset --generate-and-execute --group cg1
> --topics t1 --reset-from 2017-01-01T00:00:00.000´
>
> @Gwen:
>
>> It looks exactly like the replica assignment tool
>
> It was influenced by ;-) I use the generate-verify-execute process here to
> make sure user will be aware of the result of this operation. At the
> beginning we considered only add a couple of options to Consumer Group
> Command:
>
> --rewind-to-timestamp and --rewind-to-period
>
> @Onur:
>
>> You can actually get away with overriding while members of the group are live
> with method 2 by using group information from DescribeGroupsRequest.
>
> This means that we need to have Consumer Group stopped before executing and
> start a new consumer internally to do this? Therefore, we won't be able to
> consider executing reset when ConsumerGroup is active? (trying to relate it
> with @Dong 5th question)
>
> @Dong:
>
>> Should we allow user to use wildcard to reset offset of all groups for a
> given topic as well?
>
> I haven't thought about this scenario. Could be interesting. Following the
> recommendation to add it into Consumer Group Command, in this case Group
> argument will be optional if there are only 1 topic. I think for multiple
> topic won't be that useful.
>
>> Should we allow user to specify timestamp per topic partition in the json
> file as well?
>
> Don't think this could be a valid from the tool, but if Reset Plan is
> generated, and user want to set the offset for a specific partition to
> other offset (eventually based on another timestamp), and execute it, it
> will be up to her/him.
>
>> Should the script take some credential file to make sure that this
> operation is authenticated given the potential impact of this operation?
>
> Haven't tried to secure brokers yet, but the tool should support
> authorization if it's enabled in the broker.
>
>> Should we provide constant to reset committed offset to earliest/latest
> offset of a partition, e.g. -1 indicates earliest offset and -2 indicates
> latest offset.
>
> I will go for something like ´--reset-to-earliest´ and ´--reset-to-latest´
>
>> Should we allow dynamic change of the comitted offset when consumer are
> running, such that consumer will seek to the newly committed offset and
> start consuming from there?
>
> Not sure about this. I will recommend to keep it simple and ask user to
> stop consumers first. But I would considered it if the trade-offs are
> clear.
>
> @Matthias
>
> Added :). And thanks a lot for your help to define this KIP!
>
>
>
> On Wed, Feb 8, 2017 at 7:47, Gwen Shapira (<gw...@confluent.io>)
> wrote:
>
>> As long as the CLI is a bit consistent? Like, not just adding 3
>> arguments and a JSON parser to the existing tool, right?
>>
>> On Tue, Feb 7, 2017 at 10:29 PM, Onur Karaman
>> <on...@gmail.com> wrote:
>> > I think it makes sense to just add the feature to
>> kafka-consumer-groups.sh
>> >
>> > On Tue, Feb 7, 2017 at 10:24 PM, Gwen Shapira <gw...@confluent.io> wrote:
>> >
>> >> Thanks for the KIP. I'm super happy about adding the capability.
>> >>
>> >> I hate the interface, though. It looks exactly like the replica
>> >> assignment tool. A tool everyone loves so much that there are multiple
>> >> projects, open and closed, that try to fix it.
>> >>
>> >> Can we swap it with something that looks a bit more like the consumer
>> >> group tool? or the kafka streams reset tool? Consistency is helpful in
>> >> such cases. I spent some time learning existing tools and learning yet
>> >> another one is a deterrent.
>> >>
>> >> Gwen
>> >>
>> >>
>> >>
>> >> On Tue, Feb 7, 2017 at 6:43 PM, Jorge Esteban Quilcate Otoya
>> >> <qu...@gmail.com> wrote:
>> >> > Hi all,
>> >> >
>> >> > I would like to propose a KIP to Add a tool to Reset Consumer Group
>> >> Offsets.
>> >> >
>> >> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
>> >> 122%3A+Add+a+tool+to+Reset+Consumer+Group+Offsets
>> >> >
>> >> > Please, take a look at the proposal and share your feedback.
>> >> >
>> >> > Thanks,
>> >> > Jorge.
>> >>
>> >>
>> >>
>> >> --
>> >> Gwen Shapira
>> >> Product Manager | Confluent
>> >> 650.450.2760 | @gwenshap
>> >> Follow us: Twitter | blog
>> >>
>>
>>
>>
>> --
>> Gwen Shapira
>> Product Manager | Confluent
>> 650.450.2760 | @gwenshap
>> Follow us: Twitter | blog
>>



-- 
Gwen Shapira
Product Manager | Confluent
650.450.2760 | @gwenshap
Follow us: Twitter | blog

Re: KIP-122: Add a tool to Reset Consumer Group Offsets

Posted by Jorge Esteban Quilcate Otoya <qu...@gmail.com>.
Thanks for the feedback!

@Onur, @Gwen:

Agreed. In the first draft I considered putting it inside
´kafka-consumer-groups.sh´, but I decided to propose it as a standalone tool
so I could describe it clearly and keep it focused on the reset functionality.

Now that you mention it, it does make sense to have it in
´kafka-consumer-groups.sh´. What would be a consistent way to introduce it?

Maybe something like this:

´kafka-consumer-groups.sh --reset-offset --generate --group cg1 --topics t1
--reset-from 2017-01-01T00:00:00.000 --output plan.json´

´kafka-consumer-groups.sh --reset-offset --verify --reset-json-file
plan.json´

´kafka-consumer-groups.sh --reset-offset --execute --reset-json-file
plan.json´

´kafka-consumer-groups.sh --reset-offset --generate-and-execute --group cg1
--topics t1 --reset-from 2017-01-01T00:00:00.000´
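
The ´--reset-from´ value in the commands above has to be turned into one
offset per partition. A rough Python sketch of that lookup follows, under
stated assumptions: the datetime is read as UTC, record timestamps within a
partition are non-decreasing, and the names parse_reset_from and
offset_for_time are hypothetical. The rule used (first offset whose timestamp
is at or after the target) is an assumption about how the tool might behave.

```python
from bisect import bisect_left
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def parse_reset_from(text):
    """Parse a value like 2017-01-01T00:00:00.000 as UTC epoch milliseconds."""
    dt = datetime.strptime(text, "%Y-%m-%dT%H:%M:%S.%f").replace(tzinfo=timezone.utc)
    # Integer division of timedeltas avoids floating point rounding.
    return (dt - EPOCH) // timedelta(milliseconds=1)


def offset_for_time(timestamps_ms, base_offset, target_ms):
    """Return the first offset whose timestamp is >= target_ms, or None.

    timestamps_ms[i] is the timestamp of the record at base_offset + i and is
    assumed to be sorted in non-decreasing order.
    """
    idx = bisect_left(timestamps_ms, target_ms)
    if idx == len(timestamps_ms):
        return None  # no record at or after the requested time
    return base_offset + idx


target = parse_reset_from("2017-01-01T00:00:00.000")
print(offset_for_time([target - 5, target, target + 7], base_offset=40, target_ms=target))
```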

@Gwen:

> It looks exactly like the replica assignment tool

It was influenced by it ;-) I use the generate-verify-execute process here to
make sure the user is aware of the result of this operation. At the
beginning we considered only adding a couple of options to the Consumer Group
Command:

--rewind-to-timestamp and --rewind-to-period
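
As a rough illustration of the generate-verify-execute flow, here is a Python
sketch that works on an in-memory store instead of a real cluster. The plan
layout (group plus a list of topic/partition/offset entries) is an assumption;
the thread does not define the format of plan.json, and all function names are
hypothetical.

```python
import json


def generate_plan(group, target_offsets):
    """Build a plan; target_offsets maps (topic, partition) -> new offset."""
    return {
        "group": group,
        "offsets": [
            {"topic": t, "partition": p, "offset": o}
            for (t, p), o in sorted(target_offsets.items())
        ],
    }


def verify_plan(plan, valid_ranges):
    """Return a list of problems; an empty list means the plan looks safe."""
    problems = []
    for entry in plan["offsets"]:
        key = (entry["topic"], entry["partition"])
        if key not in valid_ranges:
            problems.append(f"unknown partition {key[0]}-{key[1]}")
            continue
        earliest, latest = valid_ranges[key]
        if not earliest <= entry["offset"] <= latest:
            problems.append(f"offset {entry['offset']} out of range for {key[0]}-{key[1]}")
    return problems


def execute_plan(plan, committed):
    """Apply the plan to an in-memory store of committed offsets."""
    for entry in plan["offsets"]:
        committed[(plan["group"], entry["topic"], entry["partition"])] = entry["offset"]


plan = generate_plan("cg1", {("t1", 0): 42, ("t1", 1): 17})
plan_json = json.dumps(plan)  # roughly what a plan.json file might hold
ranges = {("t1", 0): (0, 100), ("t1", 1): (0, 100)}
committed = {}
if not verify_plan(json.loads(plan_json), ranges):
    execute_plan(json.loads(plan_json), committed)
print(committed)
```

Because the plan is plain JSON, a user could edit a single partition's offset
between the generate and execute steps, and the verify step would still catch
out-of-range values.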

@Onur:

> You can actually get away with overriding while members of the group are live
with method 2 by using group information from DescribeGroupsRequest.

Does this mean that we need to have the consumer group stopped before
executing, and start a new consumer internally to do this? If so, we won't be
able to consider executing a reset while the consumer group is active? (I'm
trying to relate it to @Dong's 5th question.)

@Dong:

> Should we allow user to use wildcard to reset offset of all groups for a
given topic as well?

I hadn't thought about this scenario. It could be interesting. Following the
recommendation to add it to the Consumer Group Command, the group argument
would be optional in this case if there is only one topic. I think it won't
be that useful for multiple topics.

> Should we allow user to specify timestamp per topic partition in the json
file as well?

I don't think this would be a valid option in the tool itself, but if a reset
plan is generated and the user wants to set the offset for a specific
partition to a different offset (possibly based on another timestamp) and
then execute it, that will be up to her/him.

> Should the script take some credential file to make sure that this
operation is authenticated given the potential impact of this operation?

Haven't tried to secure brokers yet, but the tool should support
authorization if it's enabled in the broker.

> Should we provide constant to reset committed offset to earliest/latest
offset of a partition, e.g. -1 indicates earliest offset and -2 indicates
latest offset.

I will go for something like ´--reset-to-earliest´ and ´--reset-to-latest´
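
A small Python sketch of what named options could resolve to, as opposed to
magic constants. The mode strings and the function name resolve_targets are
hypothetical, and the per-partition (earliest, latest) ranges are supplied by
hand here rather than fetched from a broker.

```python
def resolve_targets(mode, ranges):
    """Map each partition to its earliest or latest offset.

    ranges maps (topic, partition) -> (earliest, latest); mode is the
    hypothetical string "earliest" or "latest".
    """
    if mode not in ("earliest", "latest"):
        raise ValueError(f"unsupported reset mode: {mode}")
    index = 0 if mode == "earliest" else 1
    return {tp: bounds[index] for tp, bounds in ranges.items()}


ranges = {("t1", 0): (5, 90), ("t1", 1): (0, 30)}
print(resolve_targets("earliest", ranges))
print(resolve_targets("latest", ranges))
```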

> Should we allow dynamic change of the committed offset when consumers are
running, such that consumer will seek to the newly committed offset and
start consuming from there?

Not sure about this. I would recommend keeping it simple and asking the user
to stop consumers first. But I would consider it if the trade-offs are
clear.
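
The "stop consumers first" rule could be enforced with a simple guard before
any offsets are written. This is only a sketch in Python: the member list is
passed in directly, whereas a real tool would have to obtain it from the
group coordinator, and the name check_group_inactive is hypothetical.

```python
def check_group_inactive(group, active_members):
    """Raise if the group still has live members; otherwise return None."""
    if active_members:
        names = ", ".join(sorted(active_members))
        raise RuntimeError(
            f"group {group} has active members ({names}); "
            "stop them before resetting offsets"
        )


try:
    check_group_inactive("cg1", ["consumer-1"])
except RuntimeError as err:
    print(err)
```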

@Matthias

Added :). And thanks a lot for your help to define this KIP!



On Wed, Feb 8, 2017 at 7:47, Gwen Shapira (<gw...@confluent.io>)
wrote:

> As long as the CLI is a bit consistent? Like, not just adding 3
> arguments and a JSON parser to the existing tool, right?
>
> On Tue, Feb 7, 2017 at 10:29 PM, Onur Karaman
> <on...@gmail.com> wrote:
> > I think it makes sense to just add the feature to
> kafka-consumer-groups.sh
> >
> > On Tue, Feb 7, 2017 at 10:24 PM, Gwen Shapira <gw...@confluent.io> wrote:
> >
> >> Thanks for the KIP. I'm super happy about adding the capability.
> >>
> >> I hate the interface, though. It looks exactly like the replica
> >> assignment tool. A tool everyone loves so much that there are multiple
> >> projects, open and closed, that try to fix it.
> >>
> >> Can we swap it with something that looks a bit more like the consumer
> >> group tool? or the kafka streams reset tool? Consistency is helpful in
> >> such cases. I spent some time learning existing tools and learning yet
> >> another one is a deterrent.
> >>
> >> Gwen
> >>
> >>
> >>
> >> On Tue, Feb 7, 2017 at 6:43 PM, Jorge Esteban Quilcate Otoya
> >> <qu...@gmail.com> wrote:
> >> > Hi all,
> >> >
> >> > I would like to propose a KIP to Add a tool to Reset Consumer Group
> >> Offsets.
> >> >
> >> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> >> 122%3A+Add+a+tool+to+Reset+Consumer+Group+Offsets
> >> >
> >> > Please, take a look at the proposal and share your feedback.
> >> >
> >> > Thanks,
> >> > Jorge.
> >>
> >>
> >>
> >> --
> >> Gwen Shapira
> >> Product Manager | Confluent
> >> 650.450.2760 | @gwenshap
> >> Follow us: Twitter | blog
> >>
>
>
>
> --
> Gwen Shapira
> Product Manager | Confluent
> 650.450.2760 | @gwenshap
> Follow us: Twitter | blog
>

Re: KIP-122: Add a tool to Reset Consumer Group Offsets

Posted by Gwen Shapira <gw...@confluent.io>.
As long as the CLI is a bit consistent? Like, not just adding 3
arguments and a JSON parser to the existing tool, right?

On Tue, Feb 7, 2017 at 10:29 PM, Onur Karaman
<on...@gmail.com> wrote:
> I think it makes sense to just add the feature to kafka-consumer-groups.sh
>
> On Tue, Feb 7, 2017 at 10:24 PM, Gwen Shapira <gw...@confluent.io> wrote:
>
>> Thanks for the KIP. I'm super happy about adding the capability.
>>
>> I hate the interface, though. It looks exactly like the replica
>> assignment tool. A tool everyone loves so much that there are multiple
>> projects, open and closed, that try to fix it.
>>
>> Can we swap it with something that looks a bit more like the consumer
>> group tool? or the kafka streams reset tool? Consistency is helpful in
>> such cases. I spent some time learning existing tools and learning yet
>> another one is a deterrent.
>>
>> Gwen
>>
>>
>>
>> On Tue, Feb 7, 2017 at 6:43 PM, Jorge Esteban Quilcate Otoya
>> <qu...@gmail.com> wrote:
>> > Hi all,
>> >
>> > I would like to propose a KIP to Add a tool to Reset Consumer Group
>> Offsets.
>> >
>> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
>> 122%3A+Add+a+tool+to+Reset+Consumer+Group+Offsets
>> >
>> > Please, take a look at the proposal and share your feedback.
>> >
>> > Thanks,
>> > Jorge.
>>
>>
>>
>> --
>> Gwen Shapira
>> Product Manager | Confluent
>> 650.450.2760 | @gwenshap
>> Follow us: Twitter | blog
>>



-- 
Gwen Shapira
Product Manager | Confluent
650.450.2760 | @gwenshap
Follow us: Twitter | blog

Re: KIP-122: Add a tool to Reset Consumer Group Offsets

Posted by Onur Karaman <on...@gmail.com>.
I think it makes sense to just add the feature to kafka-consumer-groups.sh

On Tue, Feb 7, 2017 at 10:24 PM, Gwen Shapira <gw...@confluent.io> wrote:

> Thanks for the KIP. I'm super happy about adding the capability.
>
> I hate the interface, though. It looks exactly like the replica
> assignment tool. A tool everyone loves so much that there are multiple
> projects, open and closed, that try to fix it.
>
> Can we swap it with something that looks a bit more like the consumer
> group tool? or the kafka streams reset tool? Consistency is helpful in
> such cases. I spent some time learning existing tools and learning yet
> another one is a deterrent.
>
> Gwen
>
>
>
> On Tue, Feb 7, 2017 at 6:43 PM, Jorge Esteban Quilcate Otoya
> <qu...@gmail.com> wrote:
> > Hi all,
> >
> > I would like to propose a KIP to Add a tool to Reset Consumer Group
> Offsets.
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> 122%3A+Add+a+tool+to+Reset+Consumer+Group+Offsets
> >
> > Please, take a look at the proposal and share your feedback.
> >
> > Thanks,
> > Jorge.
>
>
>
> --
> Gwen Shapira
> Product Manager | Confluent
> 650.450.2760 | @gwenshap
> Follow us: Twitter | blog
>

Re: KIP-122: Add a tool to Reset Consumer Group Offsets

Posted by Gwen Shapira <gw...@confluent.io>.
Thanks for the KIP. I'm super happy about adding the capability.

I hate the interface, though. It looks exactly like the replica
assignment tool. A tool everyone loves so much that there are multiple
projects, open and closed, that try to fix it.

Can we swap it with something that looks a bit more like the consumer
group tool? or the kafka streams reset tool? Consistency is helpful in
such cases. I spent some time learning existing tools and learning yet
another one is a deterrent.

Gwen



On Tue, Feb 7, 2017 at 6:43 PM, Jorge Esteban Quilcate Otoya
<qu...@gmail.com> wrote:
> Hi all,
>
> I would like to propose a KIP to Add a tool to Reset Consumer Group Offsets.
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-122%3A+Add+a+tool+to+Reset+Consumer+Group+Offsets
>
> Please, take a look at the proposal and share your feedback.
>
> Thanks,
> Jorge.



-- 
Gwen Shapira
Product Manager | Confluent
650.450.2760 | @gwenshap
Follow us: Twitter | blog

Re: KIP-122: Add a tool to Reset Consumer Group Offsets

Posted by Jan Filipiak <Ja...@trivago.com>.
Hi,

Just my few thoughts:

Does it need to be JSON?
The old zkOffset tool had a nice format:
- very easy to manipulate on the CLI
- very powerful: it changes as many consumer groups/topics/partitions in one
go as you want

Maybe allow -1 and -2 to indicate earliest and latest reset, regardless
of what the group has as its auto mechanism.

I would definitely prefer a line-oriented format rather than JSON. I
ramped up my jq (https://stedolan.github.io/jq/) skills
so I can do some partition assignments, but it's no joy; grep and awk would be better.
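
As an illustration of the line-oriented idea, here is a minimal Python sketch of parsing a plan with one "group topic partition offset" entry per line. The format is hypothetical (it is not the old zkOffset tool's actual format, which is not reproduced in this thread), as are the function and type names.

```python
from typing import Dict, Tuple

PlanKey = Tuple[str, str, int]  # (group, topic, partition)


def parse_plan(text: str) -> Dict[PlanKey, int]:
    """Parse lines of 'group topic partition offset' (hypothetical format)."""
    plan: Dict[PlanKey, int] = {}
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # allow comments and blank lines
        if not line:
            continue
        fields = line.split()
        if len(fields) != 4:
            raise ValueError(
                f"line {lineno}: expected 4 fields, got {len(fields)}")
        group, topic, partition, offset = fields
        plan[(group, topic, int(partition))] = int(offset)
    return plan


sample = """\
# group topic partition offset
cg1 t1 0 120
cg1 t1 1 -2
cg2 t1 0 0
"""
print(parse_plan(sample))
```

A file in this shape can be filtered or rewritten with grep and awk, one entry per line, which is the property asked for above.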

Best Jan

On 08.02.2017 03:43, Jorge Esteban Quilcate Otoya wrote:
> Hi all,
>
> I would like to propose a KIP to Add a tool to Reset Consumer Group Offsets.
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-122%3A+Add+a+tool+to+Reset+Consumer+Group+Offsets
>
> Please, take a look at the proposal and share your feedback.
>
> Thanks,
> Jorge.
>


Re: KIP-122: Add a tool to Reset Consumer Group Offsets

Posted by "Matthias J. Sax" <ma...@confluent.io>.
Jorge,

can you please add your KIP to this table:

https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals#KafkaImprovementProposals-KIPsunderdiscussion


Thanks!


-Matthias

On 2/7/17 9:29 PM, Matthias J. Sax wrote:
> Jorge,
> 
> thanks for your KIP. I like it a lot and think it will be a nice addition!
> 
> 
> -Matthias
> 
> 
> On 2/7/17 7:04 PM, Dong Lin wrote:
>> Hey Jorge,
>>
>> Thanks for the KIP. I have some quick comments:
>>
>> - Should we allow user to use wildcard to reset offset of all groups for a
>> given topic as well?
>> - Should we allow user to specify timestamp per topic partition in the json
>> file as well?
>> - Should the script take some credential file to make sure that this
>> operation is authenticated given the potential impact of this operation?
>> - Should we provide constant to reset committed offset to earliest/latest
>> offset of a partition, e.g. -1 indicates earliest offset
>> and -2 indicates latest offset.
>> - Should we allow dynamic change of the committed offset when consumers are
>> running, such that consumer will seek to the newly committed offset and
>> start consuming from there?
>>
>> BTW, I guess more people just write their own program which starts a
>> consumer and commits offset instead of re-deploying application as
>> suggested in the motivation section. I agree that having a ready-to-use
>> script will make it easier.
>>
>> Thanks,
>> Dong
>>
>>
>> On Tue, Feb 7, 2017 at 6:43 PM, Jorge Esteban Quilcate Otoya <
>> quilcate.jorge@gmail.com> wrote:
>>
>>> Hi all,
>>>
>>> I would like to propose a KIP to Add a tool to Reset Consumer Group
>>> Offsets.
>>>
>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-122%
>>> 3A+Add+a+tool+to+Reset+Consumer+Group+Offsets
>>>
>>> Please, take a look at the proposal and share your feedback.
>>>
>>> Thanks,
>>> Jorge.
>>>
>>
> 


Re: KIP-122: Add a tool to Reset Consumer Group Offsets

Posted by Onur Karaman <on...@gmail.com>.
I've been meaning to suggest something very similar to this KIP.

Something lacking in the KIP is under what scenarios the offset reset tool
will run. Are all members of the group expected to be offline or can we
override offsets while members of the group are live? This matters when
factoring in the offset commit logic on the GroupCoordinator.

Currently the only way for an admin to successfully override offsets is one
of the following:
1. They send an OffsetCommitRequest with generationId -1 while the group
is in the Empty state.
2. They send an OffsetCommitRequest impersonating a member of the group
with an accurate generationId and memberId.

You can actually get away with overriding while members of the group are
live with method 2 by using group information from DescribeGroupsRequest.
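
The two methods above can be sketched as a simplified Python model of the acceptance rule. This is an illustration of the rule as described in this message, not the GroupCoordinator's actual code: the Group class, its fields, and the function name are hypothetical, and other group states (for example, a rebalance in progress) are ignored.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class Group:
    """Simplified stand-in for the coordinator's view of a consumer group."""
    state: str                       # e.g. "Empty" or "Stable"
    generation_id: int = 0
    member_ids: Set[str] = field(default_factory=set)


def commit_accepted(group: Group, generation_id: int, member_id: str) -> bool:
    """Return True if an offset commit would be accepted under this model."""
    if generation_id < 0:
        # Method 1: a commit without group membership is only accepted
        # when the group has no members.
        return group.state == "Empty"
    # Method 2: the commit must look like it comes from a current member
    # of the current generation.
    return (member_id in group.member_ids
            and generation_id == group.generation_id)


empty = Group(state="Empty")
live = Group(state="Stable", generation_id=5, member_ids={"consumer-1"})

print(commit_accepted(empty, -1, ""))           # method 1, group offline
print(commit_accepted(live, -1, ""))            # method 1, group live
print(commit_accepted(live, 5, "consumer-1"))   # method 2, impersonation
```

Under this model, method 2 is what makes overriding possible while members are live, since the generationId and memberId can be read from a DescribeGroupsRequest.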

On Tue, Feb 7, 2017 at 9:29 PM, Matthias J. Sax <ma...@confluent.io>
wrote:

> Jorge,
>
> thanks for your KIP. I like it a lot and think it will be a nice addition!
>
>
> -Matthias
>
>
> On 2/7/17 7:04 PM, Dong Lin wrote:
> > Hey Jorge,
> >
> > Thanks for the KIP. I have some quick comments:
> >
> > - Should we allow user to use wildcard to reset offset of all groups for
> a
> > given topic as well?
> > - Should we allow user to specify timestamp per topic partition in the
> json
> > file as well?
> > - Should the script take some credential file to make sure that this
> > operation is authenticated given the potential impact of this operation?
> > - Should we provide constant to reset committed offset to earliest/latest
> > offset of a partition, e.g. -1 indicates earliest offset
> > and -2 indicates latest offset.
> > - Should we allow dynamic change of the committed offset when consumers are
> > running, such that consumer will seek to the newly committed offset and
> > start consuming from there?
> >
> > BTW, I guess more people just write their own program which starts a
> > consumer and commits offset instead of re-deploying application as
> > suggested in the motivation section. I agree that having a ready-to-use
> > script will make it easier.
> >
> > Thanks,
> > Dong
> >
> >
> > On Tue, Feb 7, 2017 at 6:43 PM, Jorge Esteban Quilcate Otoya <
> > quilcate.jorge@gmail.com> wrote:
> >
> >> Hi all,
> >>
> >> I would like to propose a KIP to Add a tool to Reset Consumer Group
> >> Offsets.
> >>
> >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-122%
> >> 3A+Add+a+tool+to+Reset+Consumer+Group+Offsets
> >>
> >> Please, take a look at the proposal and share your feedback.
> >>
> >> Thanks,
> >> Jorge.
> >>
> >
>
>

Re: KIP-122: Add a tool to Reset Consumer Group Offsets

Posted by "Matthias J. Sax" <ma...@confluent.io>.
Jorge,

thanks for your KIP. I like it a lot and think it will be a nice addition!


-Matthias


On 2/7/17 7:04 PM, Dong Lin wrote:
> Hey Jorge,
> 
> Thanks for the KIP. I have some quick comments:
> 
> - Should we allow user to use wildcard to reset offset of all groups for a
> given topic as well?
> - Should we allow user to specify timestamp per topic partition in the json
> file as well?
> - Should the script take some credential file to make sure that this
> operation is authenticated given the potential impact of this operation?
> - Should we provide constant to reset committed offset to earliest/latest
> offset of a partition, e.g. -1 indicates earliest offset
> and -2 indicates latest offset.
> - Should we allow dynamic change of the committed offset when consumers are
> running, such that consumer will seek to the newly committed offset and
> start consuming from there?
> 
> BTW, I guess more people just write their own program which starts a
> consumer and commits offset instead of re-deploying application as
> suggested in the motivation section. I agree that having a ready-to-use
> script will make it easier.
> 
> Thanks,
> Dong
> 
> 
> On Tue, Feb 7, 2017 at 6:43 PM, Jorge Esteban Quilcate Otoya <
> quilcate.jorge@gmail.com> wrote:
> 
>> Hi all,
>>
>> I would like to propose a KIP to Add a tool to Reset Consumer Group
>> Offsets.
>>
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-122%
>> 3A+Add+a+tool+to+Reset+Consumer+Group+Offsets
>>
>> Please, take a look at the proposal and share your feedback.
>>
>> Thanks,
>> Jorge.
>>
> 


Re: KIP-122: Add a tool to Reset Consumer Group Offsets

Posted by Dong Lin <li...@gmail.com>.
Hey Jorge,

Thanks for the KIP. I have some quick comments:

- Should we allow users to use a wildcard to reset offsets of all groups for a
given topic as well?
- Should we allow users to specify a timestamp per topic partition in the JSON
file as well?
- Should the script take a credential file to make sure that this
operation is authenticated, given its potential impact?
- Should we provide constants to reset the committed offset to the
earliest/latest offset of a partition, e.g. -1 indicates the earliest offset
and -2 indicates the latest offset?
- Should we allow dynamic change of the committed offset when consumers are
running, such that consumers will seek to the newly committed offset and
start consuming from there?

BTW, I guess more people just write their own program that starts a
consumer and commits offsets, instead of re-deploying the application as
suggested in the motivation section. I agree that having a ready-to-use
script will make it easier.

Thanks,
Dong


On Tue, Feb 7, 2017 at 6:43 PM, Jorge Esteban Quilcate Otoya <
quilcate.jorge@gmail.com> wrote:

> Hi all,
>
> I would like to propose a KIP to Add a tool to Reset Consumer Group
> Offsets.
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-122%
> 3A+Add+a+tool+to+Reset+Consumer+Group+Offsets
>
> Please, take a look at the proposal and share your feedback.
>
> Thanks,
> Jorge.
>
