You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@kafka.apache.org by Tom Bentley <t....@gmail.com> on 2017/09/07 16:38:20 UTC

[DISCUSS] KIP-195: AdminClient.increasePartitions

As suggested by Ismael, I've factored the increasePartitionCounts() API out
of KIP-179 out into a separate KIP which hopefully can progress more
quickly.

https://cwiki.apache.org/confluence/display/KAFKA/KIP-195%3A+AdminClient.increasePartitions

If you've looked at KIP-179 in the last few days there's no much new to see
here, but if not you should find KIP-195 a lighter read.

Cheers,

Tom

Re: Re: [DISCUSS] KIP-195: AdminClient.increasePartitions

Posted by Kamal Chandraprakash <ka...@gmail.com>.
Selvamalar,

    It's a self service. Please send a mail to `
dev-unsubscribe@kafka.apache.org` and `users-unsubscribe@kafka.apache.org`
to unsubscribe from the `dev

On Fri, Sep 8, 2017 at 10:02 PM, Selvamalar Arunkumar <
Selvamalar.Arunkumar@walmart.com> wrote:

> Please remove my email id from the group.
>
> Regards,
> SelvaMalar
>
> Connect With http://www.walmart.com/
> https://twitter.com/Walmart
> 805 Moberly Lane
> Bentonville, AR 72716-0560
> Save Money. Live Better.
>
> -----Original Message-----
> From: Tom Bentley [mailto:t.j.bentley@gmail.com]
> Sent: Friday, September 08, 2017 11:28 AM
> To: dev@kafka.apache.org
> Subject: EXT: Re: [DISCUSS] KIP-195: AdminClient.increasePartitions
>
> I've gone with createPartitions() and NewPartitions.
>
> To answer my own question about the replication factor, the current code
> (specifically AdminUtils.addPartitions()) just uses the size of the replica
> list of the first partition as the replication factor for the whole topic.
> I don't think this is 100% correct, for the reasons I mentioned earlier,
> but I propose to do the same thing in order to apply the CreateTopicPolicy.
>
> I'm very aware that I need to start the vote today if this is to have a
> hope of getting into 1.0.0, so I'm going to call the vote now.
>
> On 8 September 2017 at 16:28, Tom Bentley <t....@gmail.com> wrote:
>
> > Putting aside the naming question for a moment, I have an implementation
> > question:
> >
> > I'm looking at applying the createTopicPolicy to the new partition count.
> > (As mentioned elsewhere, this is necessary to avoid creating a trivial
> > loophole in the policy where a user creates and then modifies a topic in
> a
> > way which would have violated the createTopicPolicy had the topic
> initially
> > been created with the new number of partitions.)
> >
> > To apply the policy I need the replication factor. But AFAICS the
> > configured replication factor isn't available anywhere on the broker. The
> > best I can do it look at size of the replicas list of a partition. But if
> > the partition is being reassigned this might not be the "real"
> replication
> > factor (it could be transiently higher). And yes, as far as I'm aware it
> is
> > possible to simultaneously increase the partition count and reassign the
> > existing partitions since they use different znodes.
> >
> > Also it seems that once the topic has been created it's possible to
> change
> > the replication factor of individual partitions by reassigning them.
> Which
> > is fine except it means there is no meaningful replication factor for the
> > *topic* after topic creation. If there's no meaningful replication
> factor I
> > can't apply the policy.
> >
> > Am I missing something?
> >
> > Thanks,
> >
> > Tom
> >
> > On 8 September 2017 at 13:50, Ted Yu <yu...@gmail.com> wrote:
> >
> >> I think increasePartitions is better name, implying there are already
> >> partitions.
> >>
> >> bq. use cases for non-consecutive partition ids.
> >>
> >> +1 on not supporting non-consecutive partition ids
> >>
> >> Cheers
> >>
> >> On Fri, Sep 8, 2017 at 5:34 AM, Ismael Juma <is...@juma.me.uk> wrote:
> >>
> >> > Hmm, maybe it should be createPartitions for symmetry with
> createTopics?
> >> >
> >> > Ismael
> >> >
> >> > On Fri, Sep 8, 2017 at 12:33 PM, Tom Bentley <t....@gmail.com>
> >> > wrote:
> >> >
> >> > > Hi Paolo,
> >> > >
> >> > > Thanks for commenting.
> >> > >
> >> > > The main reason I suggested `NumPartitionsIncrease` rather than just
> >> > > `NumPartitions` was in case we ever implement a
> >> decreaseNumPartitions()
> >> > > API. The semantics of the class are not appropriate for using with a
> >> > > decrease API, but calling it NumPartitions suggests that it was
> >> related.
> >> > >
> >> > > Radical thought: What about if the method was called addPartitions()
> >> and
> >> > > the class was called NewPartitions?
> >> > >
> >> > > Then if an API for decreasing were ever implemented it could be
> >> > > removePartitions() with a RemovedPartitions class if necessary.
> >> > >
> >> > > Cheers,
> >> > >
> >> > > Tom
> >> > >
> >> > > On 8 September 2017 at 12:13, Paolo Patierno <pp...@live.com>
> >> wrote:
> >> > >
> >> > > > My 2 cents about naming ...
> >> > > >
> >> > > >
> >> > > > PartitionCount or NumPartitions sound better to me providing an
> >> > > "absolute"
> >> > > > value (as today the kafka-topics script work) for an idempotent
> >> > operation
> >> > > > while the NumPartitionsIncrease name sounds to me like the
> >> "increment"
> >> > > > value.
> >> > > >
> >> > > >
> >> > > > Paolo Patierno
> >> > > > Senior Software Engineer (IoT) @ Red Hat
> >> > > > Microsoft MVP on Windows Embedded & IoT
> >> > > > Microsoft Azure Advisor
> >> > > >
> >> > > > Twitter : @ppatierno<http://twitter.com/ppatierno>
> >> > > > Linkedin : paolopatierno<http://it.linkedin.com/in/paolopatierno>
> >> > > > Blog : DevExperience<http://paolopatierno.wordpress.com/>
> >> > > >
> >> > > >
> >> > > > ________________________________
> >> > > > From: Tom Bentley <t....@gmail.com>
> >> > > > Sent: Friday, September 8, 2017 9:39 AM
> >> > > > To: dev@kafka.apache.org
> >> > > > Subject: Re: [DISCUSS] KIP-195: AdminClient.increasePartitions
> >> > > >
> >> > > > Hi Ismael,
> >> > > >
> >> > > > Thanks for the comments.
> >> > > >
> >> > > > My bad for not noticing the custom assignment requirement. The
> >> current
> >> > > > > proposal has the following example (I updated it a little so
> that
> >> 2
> >> > > > > partitions are added):
> >> > > > >
> >> > > > > increasePartitionCount(4, asList(asList(1, 2), asList(2, 3))
> >> > > > >
> >> > > > > Why not simply provide the assignment? For example, if you want
> to
> >> > add
> >> > > 2
> >> > > > > partitions, you'd simply do:
> >> > > > >
> >> > > > > increasePartitionCount(asList(asList(1, 2), asList(2, 3))
> >> > > > >
> >> > > > > Not sure why need the number.
> >> > > >
> >> > > >
> >> > > > kafka-topics.sh allows to increase the number of partitions
> without
> >> > > > supplying an assignment, so one reason is simply to be able to
> >> support
> >> > > > that.
> >> > > >
> >> > > > When you don't supply an assignment you're leaving it to the
> >> cluster to
> >> > > > decide for you. By requiring an assignment you're forcing the user
> >> to
> >> > > > decide. The user might not care much and thus make a worse choice
> >> than
> >> > if
> >> > > > you'd left it to the server.
> >> > > >
> >> > > >
> >> > > > > The other two questions:
> >> > > > >
> >> > > > > 2. Do we want to allow people to add non-consecutive partition
> >> ids?
> >> > > This
> >> > > > is
> >> > > > > possible to do with the current AdminUtils, but it's not exactly
> >> > > > supported
> >> > > > > (although it apparently works fine in the broker). Still, I
> >> wanted to
> >> > > > make
> >> > > > > sure we considered it.
> >> > > > >
> >> > > >
> >> > > > I admit I had assumed this wasn't possible. How does partitioning
> >> work
> >> > if
> >> > > > there are holes? You would need the list of partition ids in order
> >> to
> >> > > > produce a correct partition id.
> >> > > >
> >> > > > Suspending my scepticism for a moment, to support something like
> >> that
> >> > > we'd
> >> > > > have to change the List<List<Integer>> assignment to a
> Map<Integer,
> >> > > > List<Integer>>, so the request explicitly identified the new
> topics,
> >> > > rather
> >> > > > than it being implied. That would make it slightly less easy to
> >> form a
> >> > > > valid request for the normal case of consecutive partition ids:
> >> You'd
> >> > > have
> >> > > > to actually know the current number of partitions, which might
> >> > > necessitate
> >> > > > a describeTopics().
> >> > > >
> >> > > > It doesn't sound like there are any known use cases for
> >> non-consecutive
> >> > > > partition ids. It also sounds like whatever existing support there
> >> is
> >> > > might
> >> > > > be only lightly tested. It sounds like a source of gotchas and
> >> subtle
> >> > > bugs
> >> > > > to me (both in Kafka itself and for users). I have to wonder
> >> whether it
> >> > > > would be worth supporting this.
> >> > > >
> >> > > > If we decide not to support it, we should fix the rest of the
> >> > AdminClient
> >> > > > so it's not possible to create non-consecutive partition ids.
> >> > > >
> >> > > > WDYT?
> >> > > >
> >> > > >
> >> > > > 3. Do we want to provide the target partition count or the number
> we
> >> > want
> >> > > > > to increase it by? This is related to the first point as well.
> >> > Thinking
> >> > > > > about it, one benefit of specifying the target number is that
> the
> >> > > > operation
> >> > > > > is then idempotent. If we state the number we want to increase
> by,
> >> > it's
> >> > > > > easier to make a mistake and increase it twice under failure
> >> > scenarios.
> >> > > > Was
> >> > > > > that the motivation for specifying the target partition count?
> >> > > > >
> >> > > > >
> >> > > > Right, if you're just supplying an increment you can't be certain
> >> what
> >> > > > you're incrementing it to (which is what you actually care about).
> >> And
> >> > > > idempotency is so nice to have if something goes wrong.
> >> > > >
> >> > > > Using an increment would make the `NumPartitionIncrease` class a
> bit
> >> > more
> >> > > > easily understood, as then the outer list would have to be the
> same
> >> > size
> >> > > as
> >> > > > the increment. But for me idempotency is the more valuable
> feature.
> >> > > >
> >> > >
> >> >
> >>
> >
> >
>

RE: Re: [DISCUSS] KIP-195: AdminClient.increasePartitions

Posted by Selvamalar Arunkumar <Se...@walmart.com>.
Please remove my email id from the group.

Regards,
SelvaMalar

Connect With http://www.walmart.com/
https://twitter.com/Walmart
805 Moberly Lane 
Bentonville, AR 72716-0560 
Save Money. Live Better.

-----Original Message-----
From: Tom Bentley [mailto:t.j.bentley@gmail.com] 
Sent: Friday, September 08, 2017 11:28 AM
To: dev@kafka.apache.org
Subject: EXT: Re: [DISCUSS] KIP-195: AdminClient.increasePartitions

I've gone with createPartitions() and NewPartitions.

To answer my own question about the replication factor, the current code
(specifically AdminUtils.addPartitions()) just uses the size of the replica
list of the first partition as the replication factor for the whole topic.
I don't think this is 100% correct, for the reasons I mentioned earlier,
but I propose to do the same thing in order to apply the CreateTopicPolicy.

I'm very aware that I need to start the vote today if this is to have a
hope of getting into 1.0.0, so I'm going to call the vote now.

On 8 September 2017 at 16:28, Tom Bentley <t....@gmail.com> wrote:

> Putting aside the naming question for a moment, I have an implementation
> question:
>
> I'm looking at applying the createTopicPolicy to the new partition count.
> (As mentioned elsewhere, this is necessary to avoid creating a trivial
> loophole in the policy where a user creates and then modifies a topic in a
> way which would have violated the createTopicPolicy had the topic initially
> been created with the new number of partitions.)
>
> To apply the policy I need the replication factor. But AFAICS the
> configured replication factor isn't available anywhere on the broker. The
> best I can do it look at size of the replicas list of a partition. But if
> the partition is being reassigned this might not be the "real" replication
> factor (it could be transiently higher). And yes, as far as I'm aware it is
> possible to simultaneously increase the partition count and reassign the
> existing partitions since they use different znodes.
>
> Also it seems that once the topic has been created it's possible to change
> the replication factor of individual partitions by reassigning them. Which
> is fine except it means there is no meaningful replication factor for the
> *topic* after topic creation. If there's no meaningful replication factor I
> can't apply the policy.
>
> Am I missing something?
>
> Thanks,
>
> Tom
>
> On 8 September 2017 at 13:50, Ted Yu <yu...@gmail.com> wrote:
>
>> I think increasePartitions is better name, implying there are already
>> partitions.
>>
>> bq. use cases for non-consecutive partition ids.
>>
>> +1 on not supporting non-consecutive partition ids
>>
>> Cheers
>>
>> On Fri, Sep 8, 2017 at 5:34 AM, Ismael Juma <is...@juma.me.uk> wrote:
>>
>> > Hmm, maybe it should be createPartitions for symmetry with createTopics?
>> >
>> > Ismael
>> >
>> > On Fri, Sep 8, 2017 at 12:33 PM, Tom Bentley <t....@gmail.com>
>> > wrote:
>> >
>> > > Hi Paolo,
>> > >
>> > > Thanks for commenting.
>> > >
>> > > The main reason I suggested `NumPartitionsIncrease` rather than just
>> > > `NumPartitions` was in case we ever implement a
>> decreaseNumPartitions()
>> > > API. The semantics of the class are not appropriate for using with a
>> > > decrease API, but calling it NumPartitions suggests that it was
>> related.
>> > >
>> > > Radical thought: What about if the method was called addPartitions()
>> and
>> > > the class was called NewPartitions?
>> > >
>> > > Then if an API for decreasing were ever implemented it could be
>> > > removePartitions() with a RemovedPartitions class if necessary.
>> > >
>> > > Cheers,
>> > >
>> > > Tom
>> > >
>> > > On 8 September 2017 at 12:13, Paolo Patierno <pp...@live.com>
>> wrote:
>> > >
>> > > > My 2 cents about naming ...
>> > > >
>> > > >
>> > > > PartitionCount or NumPartitions sound better to me providing an
>> > > "absolute"
>> > > > value (as today the kafka-topics script work) for an idempotent
>> > operation
>> > > > while the NumPartitionsIncrease name sounds to me like the
>> "increment"
>> > > > value.
>> > > >
>> > > >
>> > > > Paolo Patierno
>> > > > Senior Software Engineer (IoT) @ Red Hat
>> > > > Microsoft MVP on Windows Embedded & IoT
>> > > > Microsoft Azure Advisor
>> > > >
>> > > > Twitter : @ppatierno<http://twitter.com/ppatierno>
>> > > > Linkedin : paolopatierno<http://it.linkedin.com/in/paolopatierno>
>> > > > Blog : DevExperience<http://paolopatierno.wordpress.com/>
>> > > >
>> > > >
>> > > > ________________________________
>> > > > From: Tom Bentley <t....@gmail.com>
>> > > > Sent: Friday, September 8, 2017 9:39 AM
>> > > > To: dev@kafka.apache.org
>> > > > Subject: Re: [DISCUSS] KIP-195: AdminClient.increasePartitions
>> > > >
>> > > > Hi Ismael,
>> > > >
>> > > > Thanks for the comments.
>> > > >
>> > > > My bad for not noticing the custom assignment requirement. The
>> current
>> > > > > proposal has the following example (I updated it a little so that
>> 2
>> > > > > partitions are added):
>> > > > >
>> > > > > increasePartitionCount(4, asList(asList(1, 2), asList(2, 3))
>> > > > >
>> > > > > Why not simply provide the assignment? For example, if you want to
>> > add
>> > > 2
>> > > > > partitions, you'd simply do:
>> > > > >
>> > > > > increasePartitionCount(asList(asList(1, 2), asList(2, 3))
>> > > > >
>> > > > > Not sure why need the number.
>> > > >
>> > > >
>> > > > kafka-topics.sh allows to increase the number of partitions without
>> > > > supplying an assignment, so one reason is simply to be able to
>> support
>> > > > that.
>> > > >
>> > > > When you don't supply an assignment you're leaving it to the
>> cluster to
>> > > > decide for you. By requiring an assignment you're forcing the user
>> to
>> > > > decide. The user might not care much and thus make a worse choice
>> than
>> > if
>> > > > you'd left it to the server.
>> > > >
>> > > >
>> > > > > The other two questions:
>> > > > >
>> > > > > 2. Do we want to allow people to add non-consecutive partition
>> ids?
>> > > This
>> > > > is
>> > > > > possible to do with the current AdminUtils, but it's not exactly
>> > > > supported
>> > > > > (although it apparently works fine in the broker). Still, I
>> wanted to
>> > > > make
>> > > > > sure we considered it.
>> > > > >
>> > > >
>> > > > I admit I had assumed this wasn't possible. How does partitioning
>> work
>> > if
>> > > > there are holes? You would need the list of partition ids in order
>> to
>> > > > produce a correct partition id.
>> > > >
>> > > > Suspending my scepticism for a moment, to support something like
>> that
>> > > we'd
>> > > > have to change the List<List<Integer>> assignment to a Map<Integer,
>> > > > List<Integer>>, so the request explicitly identified the new topics,
>> > > rather
>> > > > than it being implied. That would make it slightly less easy to
>> form a
>> > > > valid request for the normal case of consecutive partition ids:
>> You'd
>> > > have
>> > > > to actually know the current number of partitions, which might
>> > > necessitate
>> > > > a describeTopics().
>> > > >
>> > > > It doesn't sound like there are any known use cases for
>> non-consecutive
>> > > > partition ids. It also sounds like whatever existing support there
>> is
>> > > might
>> > > > be only lightly tested. It sounds like a source of gotchas and
>> subtle
>> > > bugs
>> > > > to me (both in Kafka itself and for users). I have to wonder
>> whether it
>> > > > would be worth supporting this.
>> > > >
>> > > > If we decide not to support it, we should fix the rest of the
>> > AdminClient
>> > > > so it's not possible to create non-consecutive partition ids.
>> > > >
>> > > > WDYT?
>> > > >
>> > > >
>> > > > 3. Do we want to provide the target partition count or the number we
>> > want
>> > > > > to increase it by? This is related to the first point as well.
>> > Thinking
>> > > > > about it, one benefit of specifying the target number is that the
>> > > > operation
>> > > > > is then idempotent. If we state the number we want to increase by,
>> > it's
>> > > > > easier to make a mistake and increase it twice under failure
>> > scenarios.
>> > > > Was
>> > > > > that the motivation for specifying the target partition count?
>> > > > >
>> > > > >
>> > > > Right, if you're just supplying an increment you can't be certain
>> what
>> > > > you're incrementing it to (which is what you actually care about).
>> And
>> > > > idempotency is so nice to have if something goes wrong.
>> > > >
>> > > > Using an increment would make the `NumPartitionIncrease` class a bit
>> > more
>> > > > easily understood, as then the outer list would have to be the same
>> > size
>> > > as
>> > > > the increment. But for me idempotency is the more valuable feature.
>> > > >
>> > >
>> >
>>
>
>

Re: [DISCUSS] KIP-195: AdminClient.increasePartitions

Posted by Tom Bentley <t....@gmail.com>.
Hi Ismael,

Thanks for your comments.

Regarding the loophole issue, keep in mind that the alter topics
> authorization would still be required, so I don't think it's an issue
>

It could be an issue for people trying to provide a Kafka-as-a-Service
offering, couldn't it? I mean if the providers are using the policy to
enforce a maximum number of partitions rule on their customers, but they've
also given those customers Alter(Topic) so they can change topics configs
then this API provides a loophole that wouldn't otherwise exist. Moreover
the motivation in KIP-170 suggests to me that this isn't a hypothetical
issue.

update the KIP so that the request must
> be sent to the Controller and we wait until the data is propagated into the
> Controller's metadata cache
>

About whether we send the request to the controller and when we consider
the request to have completed successfully... Currently the kafka-topics.sh
tool considers the change successful if the change was written to ZK. I'm
happy to change it, but:

a) I wanted to highlight that it would be a small change in semantics
b) OTOH is there any benefit to clients to change the success criterion to
when the metadata cache gets updated? If not then shouldn't we stick to the
existing criterion of a successful write to ZK?

Thanks,

Tom



On 12 September 2017 at 16:30, Ismael Juma <is...@juma.me.uk> wrote:

> Hi Tom,
>
> OK, I suggest not calling any policy then. We can do a separate KIP for
> overhauling topic policies so that they work with all operations for 1.1.0.
> Regarding the loophole issue, keep in mind that the alter topics
> authorization would still be required, so I don't think it's an issue.
> Users that really need a policy will have to wait until 1.1.0, but that's
> no worse than if KIP-195 doesn't make it into 1.0.0.
>
> If you remove the policy text and update the KIP so that the request must
> be sent to the Controller and we wait until the data is propagated into the
> Controller's metadata cache (similar to create topic), it's a +1 from me.
>
> Ismael
>
> On Tue, Sep 12, 2017 at 10:26 AM, Tom Bentley <t....@gmail.com>
> wrote:
>
> > 2. About using the create topics policy, I'm not sure. Aside from the
> > > naming issue, there's also the problem that the policy doesn't know if
> a
> > > creation or update is taking place. This matters because one may not
> want
> > > to allow the number of partitions to be changed after creation as it
> > > affects the semantics if keys are used. One option is to introduce a
> new
> > > interface that can be used by create, alter and delete with a new
> config.
> > > And deprecate CreateTopicPolicy. I doubt many are using it. What do you
> > > think?
> > >
> >
> > I included the part about the create topics policy because I felt it was
> > better, in the short term, to prevent the loophole than to just ignore
> it.
> > The create topic policy is obviously not a good fit for applying to topic
> > modifications, but I think designing a good policy interface that covered
> > creation, modification and deletion could be the subject of its own KIP.
> > Note that modification would include the APIs proposed in KIP-195 and
> > KIP-179. KIP-170 is already proposing to change the creation policy and
> add
> > a deletion policy, so shouldn't the changes necessary for KIP-195 be
> > considered as part of that KIP?
> >
> > I'm happy to propose something in KIP-195 if you really want, though it
> > would put in doubt whether it could be part of Kafka 1.0.0.
> >
>

Re: [DISCUSS] KIP-195: AdminClient.increasePartitions

Posted by Ismael Juma <is...@juma.me.uk>.
Hi Tom,

OK, I suggest not calling any policy then. We can do a separate KIP for
overhauling topic policies so that they work with all operations for 1.1.0.
Regarding the loophole issue, keep in mind that the alter topics
authorization would still be required, so I don't think it's an issue.
Users that really need a policy will have to wait until 1.1.0, but that's
no worse than if KIP-195 doesn't make it into 1.0.0.

If you remove the policy text and update the KIP so that the request must
be sent to the Controller and we wait until the data is propagated into the
Controller's metadata cache (similar to create topic), it's a +1 from me.

Ismael

On Tue, Sep 12, 2017 at 10:26 AM, Tom Bentley <t....@gmail.com> wrote:

> 2. About using the create topics policy, I'm not sure. Aside from the
> > naming issue, there's also the problem that the policy doesn't know if a
> > creation or update is taking place. This matters because one may not want
> > to allow the number of partitions to be changed after creation as it
> > affects the semantics if keys are used. One option is to introduce a new
> > interface that can be used by create, alter and delete with a new config.
> > And deprecate CreateTopicPolicy. I doubt many are using it. What do you
> > think?
> >
>
> I included the part about the create topics policy because I felt it was
> better, in the short term, to prevent the loophole than to just ignore it.
> The create topic policy is obviously not a good fit for applying to topic
> modifications, but I think designing a good policy interface that covered
> creation, modification and deletion could be the subject of its own KIP.
> Note that modification would include the APIs proposed in KIP-195 and
> KIP-179. KIP-170 is already proposing to change the creation policy and add
> a deletion policy, so shouldn't the changes necessary for KIP-195 be
> considered as part of that KIP?
>
> I'm happy to propose something in KIP-195 if you really want, though it
> would put in doubt whether it could be part of Kafka 1.0.0.
>

Re: [DISCUSS] KIP-195: AdminClient.increasePartitions

Posted by Tom Bentley <t....@gmail.com>.
2. About using the create topics policy, I'm not sure. Aside from the
> naming issue, there's also the problem that the policy doesn't know if a
> creation or update is taking place. This matters because one may not want
> to allow the number of partitions to be changed after creation as it
> affects the semantics if keys are used. One option is to introduce a new
> interface that can be used by create, alter and delete with a new config.
> And deprecate CreateTopicPolicy. I doubt many are using it. What do you
> think?
>

I included the part about the create topics policy because I felt it was
better, in the short term, to prevent the loophole than to just ignore it.
The create topic policy is obviously not a good fit for applying to topic
modifications, but I think designing a good policy interface that covered
creation, modification and deletion could be the subject of its own KIP.
Note that modification would include the APIs proposed in KIP-195 and
KIP-179. KIP-170 is already proposing to change the creation policy and add
a deletion policy, so shouldn't the changes necessary for KIP-195 be
considered as part of that KIP?

I'm happy to propose something in KIP-195 if you really want, though it
would put in doubt whether it could be part of Kafka 1.0.0.

Re: [DISCUSS] KIP-195: AdminClient.increasePartitions

Posted by Ismael Juma <is...@juma.me.uk>.
Hi Tom,

A couple of comments:

1. "This API is synchronous in the sense that the client can assume that
the partition count has been changed (or the request was rejected) once
they have obtained the result for the topic from the
CreatePartitionsResult." If you want to do this, I think you'd have to send
the request to the Controller (which seems reasonable). The controller
would then wait until the change has propagated to its metadata cache (like
it does for create topics) before returning.

2. About using the create topics policy, I'm not sure. Aside from the
naming issue, there's also the problem that the policy doesn't know if a
creation or update is taking place. This matters because one may not want
to allow the number of partitions to be changed after creation as it
affects the semantics if keys are used. One option is to introduce a new
interface that can be used by create, alter and delete with a new config.
And deprecate CreateTopicPolicy. I doubt many are using it. What do you
think?

Ismael

On Fri, Sep 8, 2017 at 5:34 PM, Ismael Juma <is...@juma.me.uk> wrote:

> Hi Tom,
>
> Maybe we should disallow the creation of partitions while a reassignment
> is in progress. As you pointed out, there are weird edge cases. We can
> relax the restriction later, if needed.
>
> Ismael
>
> On Fri, Sep 8, 2017 at 5:28 PM, Tom Bentley <t....@gmail.com> wrote:
>
>> I've gone with createPartitions() and NewPartitions.
>>
>> To answer my own question about the replication factor, the current code
>> (specifically AdminUtils.addPartitions()) just uses the size of the
>> replica
>> list of the first partition as the replication factor for the whole topic.
>> I don't think this is 100% correct, for the reasons I mentioned earlier,
>> but I propose to do the same thing in order to apply the
>> CreateTopicPolicy.
>>
>> I'm very aware that I need to start the vote today if this is to have a
>> hope of getting into 1.0.0, so I'm going to call the vote now.
>>
>> On 8 September 2017 at 16:28, Tom Bentley <t....@gmail.com> wrote:
>>
>> > Putting aside the naming question for a moment, I have an implementation
>> > question:
>> >
>> > I'm looking at applying the createTopicPolicy to the new partition
>> count.
>> > (As mentioned elsewhere, this is necessary to avoid creating a trivial
>> > loophole in the policy where a user creates and then modifies a topic
>> in a
>> > way which would have violated the createTopicPolicy had the topic
>> initially
>> > been created with the new number of partitions.)
>> >
>> > To apply the policy I need the replication factor. But AFAICS the
>> > configured replication factor isn't available anywhere on the broker.
>> The
>> > best I can do it look at size of the replicas list of a partition. But
>> if
>> > the partition is being reassigned this might not be the "real"
>> replication
>> > factor (it could be transiently higher). And yes, as far as I'm aware
>> it is
>> > possible to simultaneously increase the partition count and reassign the
>> > existing partitions since they use different znodes.
>> >
>> > Also it seems that once the topic has been created it's possible to
>> change
>> > the replication factor of individual partitions by reassigning them.
>> Which
>> > is fine except it means there is no meaningful replication factor for
>> the
>> > *topic* after topic creation. If there's no meaningful replication
>> factor I
>> > can't apply the policy.
>> >
>> > Am I missing something?
>> >
>> > Thanks,
>> >
>> > Tom
>> >
>> > On 8 September 2017 at 13:50, Ted Yu <yu...@gmail.com> wrote:
>> >
>> >> I think increasePartitions is better name, implying there are already
>> >> partitions.
>> >>
>> >> bq. use cases for non-consecutive partition ids.
>> >>
>> >> +1 on not supporting non-consecutive partition ids
>> >>
>> >> Cheers
>> >>
>> >> On Fri, Sep 8, 2017 at 5:34 AM, Ismael Juma <is...@juma.me.uk> wrote:
>> >>
>> >> > Hmm, maybe it should be createPartitions for symmetry with
>> createTopics?
>> >> >
>> >> > Ismael
>> >> >
>> >> > On Fri, Sep 8, 2017 at 12:33 PM, Tom Bentley <t....@gmail.com>
>> >> > wrote:
>> >> >
>> >> > > Hi Paolo,
>> >> > >
>> >> > > Thanks for commenting.
>> >> > >
>> >> > > The main reason I suggested `NumPartitionsIncrease` rather than
>> just
>> >> > > `NumPartitions` was in case we ever implement a
>> >> decreaseNumPartitions()
>> >> > > API. The semantics of the class are not appropriate for using with
>> a
>> >> > > decrease API, but calling it NumPartitions suggests that it was
>> >> related.
>> >> > >
>> >> > > Radical thought: What about if the method was called
>> addPartitions()
>> >> and
>> >> > > the class was called NewPartitions?
>> >> > >
>> >> > > Then if an API for decreasing were ever implemented it could be
>> >> > > removePartitions() with a RemovedPartitions class if necessary.
>> >> > >
>> >> > > Cheers,
>> >> > >
>> >> > > Tom
>> >> > >
>> >> > > On 8 September 2017 at 12:13, Paolo Patierno <pp...@live.com>
>> >> wrote:
>> >> > >
>> >> > > > My 2 cents about naming ...
>> >> > > >
>> >> > > >
>> >> > > > PartitionCount or NumPartitions sound better to me providing an
>> >> > > "absolute"
>> >> > > > value (as today the kafka-topics script work) for an idempotent
>> >> > operation
>> >> > > > while the NumPartitionsIncrease name sounds to me like the
>> >> "increment"
>> >> > > > value.
>> >> > > >
>> >> > > >
>> >> > > > Paolo Patierno
>> >> > > > Senior Software Engineer (IoT) @ Red Hat
>> >> > > > Microsoft MVP on Windows Embedded & IoT
>> >> > > > Microsoft Azure Advisor
>> >> > > >
>> >> > > > Twitter : @ppatierno<http://twitter.com/ppatierno>
>> >> > > > Linkedin : paolopatierno<http://it.linkedin.com/in/paolopatierno
>> >
>> >> > > > Blog : DevExperience<http://paolopatierno.wordpress.com/>
>> >> > > >
>> >> > > >
>> >> > > > ________________________________
>> >> > > > From: Tom Bentley <t....@gmail.com>
>> >> > > > Sent: Friday, September 8, 2017 9:39 AM
>> >> > > > To: dev@kafka.apache.org
>> >> > > > Subject: Re: [DISCUSS] KIP-195: AdminClient.increasePartitions
>> >> > > >
>> >> > > > Hi Ismael,
>> >> > > >
>> >> > > > Thanks for the comments.
>> >> > > >
>> >> > > > My bad for not noticing the custom assignment requirement. The
>> >> current
>> >> > > > > proposal has the following example (I updated it a little so
>> that
>> >> 2
>> >> > > > > partitions are added):
>> >> > > > >
>> >> > > > > increasePartitionCount(4, asList(asList(1, 2), asList(2, 3))
>> >> > > > >
>> >> > > > > Why not simply provide the assignment? For example, if you
>> want to
>> >> > add
>> >> > > 2
>> >> > > > > partitions, you'd simply do:
>> >> > > > >
>> >> > > > > increasePartitionCount(asList(asList(1, 2), asList(2, 3))
>> >> > > > >
>> >> > > > > Not sure why need the number.
>> >> > > >
>> >> > > >
>> >> > > > kafka-topics.sh allows to increase the number of partitions
>> without
>> >> > > > supplying an assignment, so one reason is simply to be able to
>> >> support
>> >> > > > that.
>> >> > > >
>> >> > > > When you don't supply an assignment you're leaving it to the
>> >> cluster to
>> >> > > > decide for you. By requiring an assignment you're forcing the
>> user
>> >> to
>> >> > > > decide. The user might not care much and thus make a worse choice
>> >> than
>> >> > if
>> >> > > > you'd left it to the server.
>> >> > > >
>> >> > > >
>> >> > > > > The other two questions:
>> >> > > > >
>> >> > > > > 2. Do we want to allow people to add non-consecutive partition
>> >> ids?
>> >> > > This
>> >> > > > is
>> >> > > > > possible to do with the current AdminUtils, but it's not
>> exactly
>> >> > > > supported
>> >> > > > > (although it apparently works fine in the broker). Still, I
>> >> wanted to
>> >> > > > make
>> >> > > > > sure we considered it.
>> >> > > > >
>> >> > > >
>> >> > > > I admit I had assumed this wasn't possible. How does partitioning
>> >> work
>> >> > if
>> >> > > > there are holes? You would need the list of partition ids in
>> order
>> >> to
>> >> > > > produce a correct partition id.
>> >> > > >
>> >> > > > Suspending my scepticism for a moment, to support something like
>> >> that
>> >> > > we'd
>> >> > > > have to change the List<List<Integer>> assignment to a
>> Map<Integer,
>> >> > > > List<Integer>>, so the request explicitly identified the new
>> topics,
>> >> > > rather
>> >> > > > than it being implied. That would make it slightly less easy to
>> >> form a
>> >> > > > valid request for the normal case of consecutive partition ids:
>> >> You'd
>> >> > > have
>> >> > > > to actually know the current number of partitions, which might
>> >> > > necessitate
>> >> > > > a describeTopics().
>> >> > > >
>> >> > > > It doesn't sound like there are any known use cases for
>> >> non-consecutive
>> >> > > > partition ids. It also sounds like whatever existing support
>> there
>> >> is
>> >> > > might
>> >> > > > be only lightly tested. It sounds like a source of gotchas and
>> >> subtle
>> >> > > bugs
>> >> > > > to me (both in Kafka itself and for users). I have to wonder
>> >> whether it
>> >> > > > would be worth supporting this.
>> >> > > >
>> >> > > > If we decide not to support it, we should fix the rest of the
>> >> > AdminClient
>> >> > > > so it's not possible to create non-consecutive partition ids.
>> >> > > >
>> >> > > > WDYT?
>> >> > > >
>> >> > > >
>> >> > > > 3. Do we want to provide the target partition count or the
>> number we
>> >> > want
>> >> > > > > to increase it by? This is related to the first point as well.
>> >> > Thinking
>> >> > > > > about it, one benefit of specifying the target number is that
>> the
>> >> > > > operation
>> >> > > > > is then idempotent. If we state the number we want to increase
>> by,
>> >> > it's
>> >> > > > > easier to make a mistake and increase it twice under failure
>> >> > scenarios.
>> >> > > > Was
>> >> > > > > that the motivation for specifying the target partition count?
>> >> > > > >
>> >> > > > >
>> >> > > > Right, if you're just supplying an increment you can't be certain
>> >> what
>> >> > > > you're incrementing it to (which is what you actually care
>> about).
>> >> And
>> >> > > > idempotency is so nice to have if something goes wrong.
>> >> > > >
>> >> > > > Using an increment would make the `NumPartitionIncrease` class a
>> bit
>> >> > more
>> >> > > > easily understood, as then the outer list would have to be the
>> same
>> >> > size
>> >> > > as
>> >> > > > the increment. But for me idempotency is the more valuable
>> feature.
>> >> > > >
>> >> > >
>> >> >
>> >>
>> >
>> >
>>
>
>

Re: [DISCUSS] KIP-195: AdminClient.increasePartitions

Posted by Ismael Juma <is...@juma.me.uk>.
Hi Tom,

Maybe we should disallow the creation of partitions while a reassignment is
in progress. As you pointed out, there are weird edge cases. We can relax
the restriction later, if needed.

Ismael

On Fri, Sep 8, 2017 at 5:28 PM, Tom Bentley <t....@gmail.com> wrote:

> I've gone with createPartitions() and NewPartitions.
>
> To answer my own question about the replication factor, the current code
> (specifically AdminUtils.addPartitions()) just uses the size of the replica
> list of the first partition as the replication factor for the whole topic.
> I don't think this is 100% correct, for the reasons I mentioned earlier,
> but I propose to do the same thing in order to apply the CreateTopicPolicy.
>
> I'm very aware that I need to start the vote today if this is to have a
> hope of getting into 1.0.0, so I'm going to call the vote now.
>
> On 8 September 2017 at 16:28, Tom Bentley <t....@gmail.com> wrote:
>
> > Putting aside the naming question for a moment, I have an implementation
> > question:
> >
> > I'm looking at applying the createTopicPolicy to the new partition count.
> > (As mentioned elsewhere, this is necessary to avoid creating a trivial
> > loophole in the policy where a user creates and then modifies a topic in
> a
> > way which would have violated the createTopicPolicy had the topic
> initially
> > been created with the new number of partitions.)
> >
> > To apply the policy I need the replication factor. But AFAICS the
> > configured replication factor isn't available anywhere on the broker. The
> > best I can do it look at size of the replicas list of a partition. But if
> > the partition is being reassigned this might not be the "real"
> replication
> > factor (it could be transiently higher). And yes, as far as I'm aware it
> is
> > possible to simultaneously increase the partition count and reassign the
> > existing partitions since they use different znodes.
> >
> > Also it seems that once the topic has been created it's possible to
> change
> > the replication factor of individual partitions by reassigning them.
> Which
> > is fine except it means there is no meaningful replication factor for the
> > *topic* after topic creation. If there's no meaningful replication
> factor I
> > can't apply the policy.
> >
> > Am I missing something?
> >
> > Thanks,
> >
> > Tom
> >
> > On 8 September 2017 at 13:50, Ted Yu <yu...@gmail.com> wrote:
> >
> >> I think increasePartitions is better name, implying there are already
> >> partitions.
> >>
> >> bq. use cases for non-consecutive partition ids.
> >>
> >> +1 on not supporting non-consecutive partition ids
> >>
> >> Cheers
> >>
> >> On Fri, Sep 8, 2017 at 5:34 AM, Ismael Juma <is...@juma.me.uk> wrote:
> >>
> >> > Hmm, maybe it should be createPartitions for symmetry with
> createTopics?
> >> >
> >> > Ismael
> >> >
> >> > On Fri, Sep 8, 2017 at 12:33 PM, Tom Bentley <t....@gmail.com>
> >> > wrote:
> >> >
> >> > > Hi Paolo,
> >> > >
> >> > > Thanks for commenting.
> >> > >
> >> > > The main reason I suggested `NumPartitionsIncrease` rather than just
> >> > > `NumPartitions` was in case we ever implement a
> >> decreaseNumPartitions()
> >> > > API. The semantics of the class are not appropriate for using with a
> >> > > decrease API, but calling it NumPartitions suggests that it was
> >> related.
> >> > >
> >> > > Radical thought: What about if the method was called addPartitions()
> >> and
> >> > > the class was called NewPartitions?
> >> > >
> >> > > Then if an API for decreasing were ever implemented it could be
> >> > > removePartitions() with a RemovedPartitions class if necessary.
> >> > >
> >> > > Cheers,
> >> > >
> >> > > Tom
> >> > >
> >> > > On 8 September 2017 at 12:13, Paolo Patierno <pp...@live.com>
> >> wrote:
> >> > >
> >> > > > My 2 cents about naming ...
> >> > > >
> >> > > >
> >> > > > PartitionCount or NumPartitions sound better to me providing an
> >> > > "absolute"
> >> > > > value (as today the kafka-topics script work) for an idempotent
> >> > operation
> >> > > > while the NumPartitionsIncrease name sounds to me like the
> >> "increment"
> >> > > > value.
> >> > > >
> >> > > >
> >> > > > Paolo Patierno
> >> > > > Senior Software Engineer (IoT) @ Red Hat
> >> > > > Microsoft MVP on Windows Embedded & IoT
> >> > > > Microsoft Azure Advisor
> >> > > >
> >> > > > Twitter : @ppatierno<http://twitter.com/ppatierno>
> >> > > > Linkedin : paolopatierno<http://it.linkedin.com/in/paolopatierno>
> >> > > > Blog : DevExperience<http://paolopatierno.wordpress.com/>
> >> > > >
> >> > > >
> >> > > > ________________________________
> >> > > > From: Tom Bentley <t....@gmail.com>
> >> > > > Sent: Friday, September 8, 2017 9:39 AM
> >> > > > To: dev@kafka.apache.org
> >> > > > Subject: Re: [DISCUSS] KIP-195: AdminClient.increasePartitions
> >> > > >
> >> > > > Hi Ismael,
> >> > > >
> >> > > > Thanks for the comments.
> >> > > >
> >> > > > My bad for not noticing the custom assignment requirement. The
> >> current
> >> > > > > proposal has the following example (I updated it a little so
> that
> >> 2
> >> > > > > partitions are added):
> >> > > > >
> >> > > > > increasePartitionCount(4, asList(asList(1, 2), asList(2, 3))
> >> > > > >
> >> > > > > Why not simply provide the assignment? For example, if you want
> to
> >> > add
> >> > > 2
> >> > > > > partitions, you'd simply do:
> >> > > > >
> >> > > > > increasePartitionCount(asList(asList(1, 2), asList(2, 3))
> >> > > > >
> >> > > > > Not sure why need the number.
> >> > > >
> >> > > >
> >> > > > kafka-topics.sh allows to increase the number of partitions
> without
> >> > > > supplying an assignment, so one reason is simply to be able to
> >> support
> >> > > > that.
> >> > > >
> >> > > > When you don't supply an assignment you're leaving it to the
> >> cluster to
> >> > > > decide for you. By requiring an assignment you're forcing the user
> >> to
> >> > > > decide. The user might not care much and thus make a worse choice
> >> than
> >> > if
> >> > > > you'd left it to the server.
> >> > > >
> >> > > >
> >> > > > > The other two questions:
> >> > > > >
> >> > > > > 2. Do we want to allow people to add non-consecutive partition
> >> ids?
> >> > > This
> >> > > > is
> >> > > > > possible to do with the current AdminUtils, but it's not exactly
> >> > > > supported
> >> > > > > (although it apparently works fine in the broker). Still, I
> >> wanted to
> >> > > > make
> >> > > > > sure we considered it.
> >> > > > >
> >> > > >
> >> > > > I admit I had assumed this wasn't possible. How does partitioning
> >> work
> >> > if
> >> > > > there are holes? You would need the list of partition ids in order
> >> to
> >> > > > produce a correct partition id.
> >> > > >
> >> > > > Suspending my scepticism for a moment, to support something like
> >> that
> >> > > we'd
> >> > > > have to change the List<List<Integer>> assignment to a
> Map<Integer,
> >> > > > List<Integer>>, so the request explicitly identified the new
> topics,
> >> > > rather
> >> > > > than it being implied. That would make it slightly less easy to
> >> form a
> >> > > > valid request for the normal case of consecutive partition ids:
> >> You'd
> >> > > have
> >> > > > to actually know the current number of partitions, which might
> >> > > necessitate
> >> > > > a describeTopics().
> >> > > >
> >> > > > It doesn't sound like there are any known use cases for
> >> non-consecutive
> >> > > > partition ids. It also sounds like whatever existing support there
> >> is
> >> > > might
> >> > > > be only lightly tested. It sounds like a source of gotchas and
> >> subtle
> >> > > bugs
> >> > > > to me (both in Kafka itself and for users). I have to wonder
> >> whether it
> >> > > > would be worth supporting this.
> >> > > >
> >> > > > If we decide not to support it, we should fix the rest of the
> >> > AdminClient
> >> > > > so it's not possible to create non-consecutive partition ids.
> >> > > >
> >> > > > WDYT?
> >> > > >
> >> > > >
> >> > > > 3. Do we want to provide the target partition count or the number
> we
> >> > want
> >> > > > > to increase it by? This is related to the first point as well.
> >> > Thinking
> >> > > > > about it, one benefit of specifying the target number is that
> the
> >> > > > operation
> >> > > > > is then idempotent. If we state the number we want to increase
> by,
> >> > it's
> >> > > > > easier to make a mistake and increase it twice under failure
> >> > scenarios.
> >> > > > Was
> >> > > > > that the motivation for specifying the target partition count?
> >> > > > >
> >> > > > >
> >> > > > Right, if you're just supplying an increment you can't be certain
> >> what
> >> > > > you're incrementing it to (which is what you actually care about).
> >> And
> >> > > > idempotency is so nice to have if something goes wrong.
> >> > > >
> >> > > > Using an increment would make the `NumPartitionIncrease` class a
> bit
> >> > more
> >> > > > easily understood, as then the outer list would have to be the
> same
> >> > size
> >> > > as
> >> > > > the increment. But for me idempotency is the more valuable
> feature.
> >> > > >
> >> > >
> >> >
> >>
> >
> >
>

Re: [DISCUSS] KIP-195: AdminClient.increasePartitions

Posted by Tom Bentley <t....@gmail.com>.
I've gone with createPartitions() and NewPartitions.

To answer my own question about the replication factor, the current code
(specifically AdminUtils.addPartitions()) just uses the size of the replica
list of the first partition as the replication factor for the whole topic.
I don't think this is 100% correct, for the reasons I mentioned earlier,
but I propose to do the same thing in order to apply the CreateTopicPolicy.

I'm very aware that I need to start the vote today if this is to have a
hope of getting into 1.0.0, so I'm going to call the vote now.

On 8 September 2017 at 16:28, Tom Bentley <t....@gmail.com> wrote:

> Putting aside the naming question for a moment, I have an implementation
> question:
>
> I'm looking at applying the createTopicPolicy to the new partition count.
> (As mentioned elsewhere, this is necessary to avoid creating a trivial
> loophole in the policy where a user creates and then modifies a topic in a
> way which would have violated the createTopicPolicy had the topic initially
> been created with the new number of partitions.)
>
> To apply the policy I need the replication factor. But AFAICS the
> configured replication factor isn't available anywhere on the broker. The
> best I can do it look at size of the replicas list of a partition. But if
> the partition is being reassigned this might not be the "real" replication
> factor (it could be transiently higher). And yes, as far as I'm aware it is
> possible to simultaneously increase the partition count and reassign the
> existing partitions since they use different znodes.
>
> Also it seems that once the topic has been created it's possible to change
> the replication factor of individual partitions by reassigning them. Which
> is fine except it means there is no meaningful replication factor for the
> *topic* after topic creation. If there's no meaningful replication factor I
> can't apply the policy.
>
> Am I missing something?
>
> Thanks,
>
> Tom
>
> On 8 September 2017 at 13:50, Ted Yu <yu...@gmail.com> wrote:
>
>> I think increasePartitions is better name, implying there are already
>> partitions.
>>
>> bq. use cases for non-consecutive partition ids.
>>
>> +1 on not supporting non-consecutive partition ids
>>
>> Cheers
>>
>> On Fri, Sep 8, 2017 at 5:34 AM, Ismael Juma <is...@juma.me.uk> wrote:
>>
>> > Hmm, maybe it should be createPartitions for symmetry with createTopics?
>> >
>> > Ismael
>> >
>> > On Fri, Sep 8, 2017 at 12:33 PM, Tom Bentley <t....@gmail.com>
>> > wrote:
>> >
>> > > Hi Paolo,
>> > >
>> > > Thanks for commenting.
>> > >
>> > > The main reason I suggested `NumPartitionsIncrease` rather than just
>> > > `NumPartitions` was in case we ever implement a
>> decreaseNumPartitions()
>> > > API. The semantics of the class are not appropriate for using with a
>> > > decrease API, but calling it NumPartitions suggests that it was
>> related.
>> > >
>> > > Radical thought: What about if the method was called addPartitions()
>> and
>> > > the class was called NewPartitions?
>> > >
>> > > Then if an API for decreasing were ever implemented it could be
>> > > removePartitions() with a RemovedPartitions class if necessary.
>> > >
>> > > Cheers,
>> > >
>> > > Tom
>> > >
>> > > On 8 September 2017 at 12:13, Paolo Patierno <pp...@live.com>
>> wrote:
>> > >
>> > > > My 2 cents about naming ...
>> > > >
>> > > >
>> > > > PartitionCount or NumPartitions sound better to me providing an
>> > > "absolute"
>> > > > value (as today the kafka-topics script work) for an idempotent
>> > operation
>> > > > while the NumPartitionsIncrease name sounds to me like the
>> "increment"
>> > > > value.
>> > > >
>> > > >
>> > > > Paolo Patierno
>> > > > Senior Software Engineer (IoT) @ Red Hat
>> > > > Microsoft MVP on Windows Embedded & IoT
>> > > > Microsoft Azure Advisor
>> > > >
>> > > > Twitter : @ppatierno<http://twitter.com/ppatierno>
>> > > > Linkedin : paolopatierno<http://it.linkedin.com/in/paolopatierno>
>> > > > Blog : DevExperience<http://paolopatierno.wordpress.com/>
>> > > >
>> > > >
>> > > > ________________________________
>> > > > From: Tom Bentley <t....@gmail.com>
>> > > > Sent: Friday, September 8, 2017 9:39 AM
>> > > > To: dev@kafka.apache.org
>> > > > Subject: Re: [DISCUSS] KIP-195: AdminClient.increasePartitions
>> > > >
>> > > > Hi Ismael,
>> > > >
>> > > > Thanks for the comments.
>> > > >
>> > > > My bad for not noticing the custom assignment requirement. The
>> current
>> > > > > proposal has the following example (I updated it a little so that
>> 2
>> > > > > partitions are added):
>> > > > >
>> > > > > increasePartitionCount(4, asList(asList(1, 2), asList(2, 3))
>> > > > >
>> > > > > Why not simply provide the assignment? For example, if you want to
>> > add
>> > > 2
>> > > > > partitions, you'd simply do:
>> > > > >
>> > > > > increasePartitionCount(asList(asList(1, 2), asList(2, 3))
>> > > > >
>> > > > > Not sure why need the number.
>> > > >
>> > > >
>> > > > kafka-topics.sh allows to increase the number of partitions without
>> > > > supplying an assignment, so one reason is simply to be able to
>> support
>> > > > that.
>> > > >
>> > > > When you don't supply an assignment you're leaving it to the
>> cluster to
>> > > > decide for you. By requiring an assignment you're forcing the user
>> to
>> > > > decide. The user might not care much and thus make a worse choice
>> than
>> > if
>> > > > you'd left it to the server.
>> > > >
>> > > >
>> > > > > The other two questions:
>> > > > >
>> > > > > 2. Do we want to allow people to add non-consecutive partition
>> ids?
>> > > This
>> > > > is
>> > > > > possible to do with the current AdminUtils, but it's not exactly
>> > > > supported
>> > > > > (although it apparently works fine in the broker). Still, I
>> wanted to
>> > > > make
>> > > > > sure we considered it.
>> > > > >
>> > > >
>> > > > I admit I had assumed this wasn't possible. How does partitioning
>> work
>> > if
>> > > > there are holes? You would need the list of partition ids in order
>> to
>> > > > produce a correct partition id.
>> > > >
>> > > > Suspending my scepticism for a moment, to support something like
>> that
>> > > we'd
>> > > > have to change the List<List<Integer>> assignment to a Map<Integer,
>> > > > List<Integer>>, so the request explicitly identified the new topics,
>> > > rather
>> > > > than it being implied. That would make it slightly less easy to
>> form a
>> > > > valid request for the normal case of consecutive partition ids:
>> You'd
>> > > have
>> > > > to actually know the current number of partitions, which might
>> > > necessitate
>> > > > a describeTopics().
>> > > >
>> > > > It doesn't sound like there are any known use cases for
>> non-consecutive
>> > > > partition ids. It also sounds like whatever existing support there
>> is
>> > > might
>> > > > be only lightly tested. It sounds like a source of gotchas and
>> subtle
>> > > bugs
>> > > > to me (both in Kafka itself and for users). I have to wonder
>> whether it
>> > > > would be worth supporting this.
>> > > >
>> > > > If we decide not to support it, we should fix the rest of the
>> > AdminClient
>> > > > so it's not possible to create non-consecutive partition ids.
>> > > >
>> > > > WDYT?
>> > > >
>> > > >
>> > > > 3. Do we want to provide the target partition count or the number we
>> > want
>> > > > > to increase it by? This is related to the first point as well.
>> > Thinking
>> > > > > about it, one benefit of specifying the target number is that the
>> > > > operation
>> > > > > is then idempotent. If we state the number we want to increase by,
>> > it's
>> > > > > easier to make a mistake and increase it twice under failure
>> > scenarios.
>> > > > Was
>> > > > > that the motivation for specifying the target partition count?
>> > > > >
>> > > > >
>> > > > Right, if you're just supplying an increment you can't be certain
>> what
>> > > > you're incrementing it to (which is what you actually care about).
>> And
>> > > > idempotency is so nice to have if something goes wrong.
>> > > >
>> > > > Using an increment would make the `NumPartitionIncrease` class a bit
>> > more
>> > > > easily understood, as then the outer list would have to be the same
>> > size
>> > > as
>> > > > the increment. But for me idempotency is the more valuable feature.
>> > > >
>> > >
>> >
>>
>
>

Re: [DISCUSS] KIP-195: AdminClient.increasePartitions

Posted by Tom Bentley <t....@gmail.com>.
Putting aside the naming question for a moment, I have an implementation
question:

I'm looking at applying the createTopicPolicy to the new partition count.
(As mentioned elsewhere, this is necessary to avoid creating a trivial
loophole in the policy where a user creates and then modifies a topic in a
way which would have violated the createTopicPolicy had the topic initially
been created with the new number of partitions.)

To apply the policy I need the replication factor. But AFAICS the
configured replication factor isn't available anywhere on the broker. The
best I can do it look at size of the replicas list of a partition. But if
the partition is being reassigned this might not be the "real" replication
factor (it could be transiently higher). And yes, as far as I'm aware it is
possible to simultaneously increase the partition count and reassign the
existing partitions since they use different znodes.

Also it seems that once the topic has been created it's possible to change
the replication factor of individual partitions by reassigning them. Which
is fine except it means there is no meaningful replication factor for the
*topic* after topic creation. If there's no meaningful replication factor I
can't apply the policy.

Am I missing something?

Thanks,

Tom

On 8 September 2017 at 13:50, Ted Yu <yu...@gmail.com> wrote:

> I think increasePartitions is better name, implying there are already
> partitions.
>
> bq. use cases for non-consecutive partition ids.
>
> +1 on not supporting non-consecutive partition ids
>
> Cheers
>
> On Fri, Sep 8, 2017 at 5:34 AM, Ismael Juma <is...@juma.me.uk> wrote:
>
> > Hmm, maybe it should be createPartitions for symmetry with createTopics?
> >
> > Ismael
> >
> > On Fri, Sep 8, 2017 at 12:33 PM, Tom Bentley <t....@gmail.com>
> > wrote:
> >
> > > Hi Paolo,
> > >
> > > Thanks for commenting.
> > >
> > > The main reason I suggested `NumPartitionsIncrease` rather than just
> > > `NumPartitions` was in case we ever implement a decreaseNumPartitions()
> > > API. The semantics of the class are not appropriate for using with a
> > > decrease API, but calling it NumPartitions suggests that it was
> related.
> > >
> > > Radical thought: What about if the method was called addPartitions()
> and
> > > the class was called NewPartitions?
> > >
> > > Then if an API for decreasing were ever implemented it could be
> > > removePartitions() with a RemovedPartitions class if necessary.
> > >
> > > Cheers,
> > >
> > > Tom
> > >
> > > On 8 September 2017 at 12:13, Paolo Patierno <pp...@live.com>
> wrote:
> > >
> > > > My 2 cents about naming ...
> > > >
> > > >
> > > > PartitionCount or NumPartitions sound better to me providing an
> > > "absolute"
> > > > value (as today the kafka-topics script work) for an idempotent
> > operation
> > > > while the NumPartitionsIncrease name sounds to me like the
> "increment"
> > > > value.
> > > >
> > > >
> > > > Paolo Patierno
> > > > Senior Software Engineer (IoT) @ Red Hat
> > > > Microsoft MVP on Windows Embedded & IoT
> > > > Microsoft Azure Advisor
> > > >
> > > > Twitter : @ppatierno<http://twitter.com/ppatierno>
> > > > Linkedin : paolopatierno<http://it.linkedin.com/in/paolopatierno>
> > > > Blog : DevExperience<http://paolopatierno.wordpress.com/>
> > > >
> > > >
> > > > ________________________________
> > > > From: Tom Bentley <t....@gmail.com>
> > > > Sent: Friday, September 8, 2017 9:39 AM
> > > > To: dev@kafka.apache.org
> > > > Subject: Re: [DISCUSS] KIP-195: AdminClient.increasePartitions
> > > >
> > > > Hi Ismael,
> > > >
> > > > Thanks for the comments.
> > > >
> > > > My bad for not noticing the custom assignment requirement. The
> current
> > > > > proposal has the following example (I updated it a little so that 2
> > > > > partitions are added):
> > > > >
> > > > > increasePartitionCount(4, asList(asList(1, 2), asList(2, 3))
> > > > >
> > > > > Why not simply provide the assignment? For example, if you want to
> > add
> > > 2
> > > > > partitions, you'd simply do:
> > > > >
> > > > > increasePartitionCount(asList(asList(1, 2), asList(2, 3))
> > > > >
> > > > > Not sure why need the number.
> > > >
> > > >
> > > > kafka-topics.sh allows to increase the number of partitions without
> > > > supplying an assignment, so one reason is simply to be able to
> support
> > > > that.
> > > >
> > > > When you don't supply an assignment you're leaving it to the cluster
> to
> > > > decide for you. By requiring an assignment you're forcing the user to
> > > > decide. The user might not care much and thus make a worse choice
> than
> > if
> > > > you'd left it to the server.
> > > >
> > > >
> > > > > The other two questions:
> > > > >
> > > > > 2. Do we want to allow people to add non-consecutive partition ids?
> > > This
> > > > is
> > > > > possible to do with the current AdminUtils, but it's not exactly
> > > > supported
> > > > > (although it apparently works fine in the broker). Still, I wanted
> to
> > > > make
> > > > > sure we considered it.
> > > > >
> > > >
> > > > I admit I had assumed this wasn't possible. How does partitioning
> work
> > if
> > > > there are holes? You would need the list of partition ids in order to
> > > > produce a correct partition id.
> > > >
> > > > Suspending my scepticism for a moment, to support something like that
> > > we'd
> > > > have to change the List<List<Integer>> assignment to a Map<Integer,
> > > > List<Integer>>, so the request explicitly identified the new topics,
> > > rather
> > > > than it being implied. That would make it slightly less easy to form
> a
> > > > valid request for the normal case of consecutive partition ids: You'd
> > > have
> > > > to actually know the current number of partitions, which might
> > > necessitate
> > > > a describeTopics().
> > > >
> > > > It doesn't sound like there are any known use cases for
> non-consecutive
> > > > partition ids. It also sounds like whatever existing support there is
> > > might
> > > > be only lightly tested. It sounds like a source of gotchas and subtle
> > > bugs
> > > > to me (both in Kafka itself and for users). I have to wonder whether
> it
> > > > would be worth supporting this.
> > > >
> > > > If we decide not to support it, we should fix the rest of the
> > AdminClient
> > > > so it's not possible to create non-consecutive partition ids.
> > > >
> > > > WDYT?
> > > >
> > > >
> > > > 3. Do we want to provide the target partition count or the number we
> > want
> > > > > to increase it by? This is related to the first point as well.
> > Thinking
> > > > > about it, one benefit of specifying the target number is that the
> > > > operation
> > > > > is then idempotent. If we state the number we want to increase by,
> > it's
> > > > > easier to make a mistake and increase it twice under failure
> > scenarios.
> > > > Was
> > > > > that the motivation for specifying the target partition count?
> > > > >
> > > > >
> > > > Right, if you're just supplying an increment you can't be certain
> what
> > > > you're incrementing it to (which is what you actually care about).
> And
> > > > idempotency is so nice to have if something goes wrong.
> > > >
> > > > Using an increment would make the `NumPartitionIncrease` class a bit
> > more
> > > > easily understood, as then the outer list would have to be the same
> > size
> > > as
> > > > the increment. But for me idempotency is the more valuable feature.
> > > >
> > >
> >
>

Re: [DISCUSS] KIP-195: AdminClient.increasePartitions

Posted by Ted Yu <yu...@gmail.com>.
I think increasePartitions is better name, implying there are already
partitions.

bq. use cases for non-consecutive partition ids.

+1 on not supporting non-consecutive partition ids

Cheers

On Fri, Sep 8, 2017 at 5:34 AM, Ismael Juma <is...@juma.me.uk> wrote:

> Hmm, maybe it should be createPartitions for symmetry with createTopics?
>
> Ismael
>
> On Fri, Sep 8, 2017 at 12:33 PM, Tom Bentley <t....@gmail.com>
> wrote:
>
> > Hi Paolo,
> >
> > Thanks for commenting.
> >
> > The main reason I suggested `NumPartitionsIncrease` rather than just
> > `NumPartitions` was in case we ever implement a decreaseNumPartitions()
> > API. The semantics of the class are not appropriate for using with a
> > decrease API, but calling it NumPartitions suggests that it was related.
> >
> > Radical thought: What about if the method was called addPartitions() and
> > the class was called NewPartitions?
> >
> > Then if an API for decreasing were ever implemented it could be
> > removePartitions() with a RemovedPartitions class if necessary.
> >
> > Cheers,
> >
> > Tom
> >
> > On 8 September 2017 at 12:13, Paolo Patierno <pp...@live.com> wrote:
> >
> > > My 2 cents about naming ...
> > >
> > >
> > > PartitionCount or NumPartitions sound better to me providing an
> > "absolute"
> > > value (as today the kafka-topics script work) for an idempotent
> operation
> > > while the NumPartitionsIncrease name sounds to me like the "increment"
> > > value.
> > >
> > >
> > > Paolo Patierno
> > > Senior Software Engineer (IoT) @ Red Hat
> > > Microsoft MVP on Windows Embedded & IoT
> > > Microsoft Azure Advisor
> > >
> > > Twitter : @ppatierno<http://twitter.com/ppatierno>
> > > Linkedin : paolopatierno<http://it.linkedin.com/in/paolopatierno>
> > > Blog : DevExperience<http://paolopatierno.wordpress.com/>
> > >
> > >
> > > ________________________________
> > > From: Tom Bentley <t....@gmail.com>
> > > Sent: Friday, September 8, 2017 9:39 AM
> > > To: dev@kafka.apache.org
> > > Subject: Re: [DISCUSS] KIP-195: AdminClient.increasePartitions
> > >
> > > Hi Ismael,
> > >
> > > Thanks for the comments.
> > >
> > > My bad for not noticing the custom assignment requirement. The current
> > > > proposal has the following example (I updated it a little so that 2
> > > > partitions are added):
> > > >
> > > > increasePartitionCount(4, asList(asList(1, 2), asList(2, 3))
> > > >
> > > > Why not simply provide the assignment? For example, if you want to
> add
> > 2
> > > > partitions, you'd simply do:
> > > >
> > > > increasePartitionCount(asList(asList(1, 2), asList(2, 3))
> > > >
> > > > Not sure why need the number.
> > >
> > >
> > > kafka-topics.sh allows to increase the number of partitions without
> > > supplying an assignment, so one reason is simply to be able to support
> > > that.
> > >
> > > When you don't supply an assignment you're leaving it to the cluster to
> > > decide for you. By requiring an assignment you're forcing the user to
> > > decide. The user might not care much and thus make a worse choice than
> if
> > > you'd left it to the server.
> > >
> > >
> > > > The other two questions:
> > > >
> > > > 2. Do we want to allow people to add non-consecutive partition ids?
> > This
> > > is
> > > > possible to do with the current AdminUtils, but it's not exactly
> > > supported
> > > > (although it apparently works fine in the broker). Still, I wanted to
> > > make
> > > > sure we considered it.
> > > >
> > >
> > > I admit I had assumed this wasn't possible. How does partitioning work
> if
> > > there are holes? You would need the list of partition ids in order to
> > > produce a correct partition id.
> > >
> > > Suspending my scepticism for a moment, to support something like that
> > we'd
> > > have to change the List<List<Integer>> assignment to a Map<Integer,
> > > List<Integer>>, so the request explicitly identified the new topics,
> > rather
> > > than it being implied. That would make it slightly less easy to form a
> > > valid request for the normal case of consecutive partition ids: You'd
> > have
> > > to actually know the current number of partitions, which might
> > necessitate
> > > a describeTopics().
> > >
> > > It doesn't sound like there are any known use cases for non-consecutive
> > > partition ids. It also sounds like whatever existing support there is
> > might
> > > be only lightly tested. It sounds like a source of gotchas and subtle
> > bugs
> > > to me (both in Kafka itself and for users). I have to wonder whether it
> > > would be worth supporting this.
> > >
> > > If we decide not to support it, we should fix the rest of the
> AdminClient
> > > so it's not possible to create non-consecutive partition ids.
> > >
> > > WDYT?
> > >
> > >
> > > 3. Do we want to provide the target partition count or the number we
> want
> > > > to increase it by? This is related to the first point as well.
> Thinking
> > > > about it, one benefit of specifying the target number is that the
> > > operation
> > > > is then idempotent. If we state the number we want to increase by,
> it's
> > > > easier to make a mistake and increase it twice under failure
> scenarios.
> > > Was
> > > > that the motivation for specifying the target partition count?
> > > >
> > > >
> > > Right, if you're just supplying an increment you can't be certain what
> > > you're incrementing it to (which is what you actually care about). And
> > > idempotency is so nice to have if something goes wrong.
> > >
> > > Using an increment would make the `NumPartitionIncrease` class a bit
> more
> > > easily understood, as then the outer list would have to be the same
> size
> > as
> > > the increment. But for me idempotency is the more valuable feature.
> > >
> >
>

Re: [DISCUSS] KIP-195: AdminClient.increasePartitions

Posted by Ismael Juma <is...@juma.me.uk>.
Hmm, maybe it should be createPartitions for symmetry with createTopics?

Ismael

On Fri, Sep 8, 2017 at 12:33 PM, Tom Bentley <t....@gmail.com> wrote:

> Hi Paolo,
>
> Thanks for commenting.
>
> The main reason I suggested `NumPartitionsIncrease` rather than just
> `NumPartitions` was in case we ever implement a decreaseNumPartitions()
> API. The semantics of the class are not appropriate for using with a
> decrease API, but calling it NumPartitions suggests that it was related.
>
> Radical thought: What about if the method was called addPartitions() and
> the class was called NewPartitions?
>
> Then if an API for decreasing were ever implemented it could be
> removePartitions() with a RemovedPartitions class if necessary.
>
> Cheers,
>
> Tom
>
> On 8 September 2017 at 12:13, Paolo Patierno <pp...@live.com> wrote:
>
> > My 2 cents about naming ...
> >
> >
> > PartitionCount or NumPartitions sound better to me providing an
> "absolute"
> > value (as today the kafka-topics script work) for an idempotent operation
> > while the NumPartitionsIncrease name sounds to me like the "increment"
> > value.
> >
> >
> > Paolo Patierno
> > Senior Software Engineer (IoT) @ Red Hat
> > Microsoft MVP on Windows Embedded & IoT
> > Microsoft Azure Advisor
> >
> > Twitter : @ppatierno<http://twitter.com/ppatierno>
> > Linkedin : paolopatierno<http://it.linkedin.com/in/paolopatierno>
> > Blog : DevExperience<http://paolopatierno.wordpress.com/>
> >
> >
> > ________________________________
> > From: Tom Bentley <t....@gmail.com>
> > Sent: Friday, September 8, 2017 9:39 AM
> > To: dev@kafka.apache.org
> > Subject: Re: [DISCUSS] KIP-195: AdminClient.increasePartitions
> >
> > Hi Ismael,
> >
> > Thanks for the comments.
> >
> > My bad for not noticing the custom assignment requirement. The current
> > > proposal has the following example (I updated it a little so that 2
> > > partitions are added):
> > >
> > > increasePartitionCount(4, asList(asList(1, 2), asList(2, 3))
> > >
> > > Why not simply provide the assignment? For example, if you want to add
> 2
> > > partitions, you'd simply do:
> > >
> > > increasePartitionCount(asList(asList(1, 2), asList(2, 3))
> > >
> > > Not sure why need the number.
> >
> >
> > kafka-topics.sh allows to increase the number of partitions without
> > supplying an assignment, so one reason is simply to be able to support
> > that.
> >
> > When you don't supply an assignment you're leaving it to the cluster to
> > decide for you. By requiring an assignment you're forcing the user to
> > decide. The user might not care much and thus make a worse choice than if
> > you'd left it to the server.
> >
> >
> > > The other two questions:
> > >
> > > 2. Do we want to allow people to add non-consecutive partition ids?
> This
> > is
> > > possible to do with the current AdminUtils, but it's not exactly
> > supported
> > > (although it apparently works fine in the broker). Still, I wanted to
> > make
> > > sure we considered it.
> > >
> >
> > I admit I had assumed this wasn't possible. How does partitioning work if
> > there are holes? You would need the list of partition ids in order to
> > produce a correct partition id.
> >
> > Suspending my scepticism for a moment, to support something like that
> we'd
> > have to change the List<List<Integer>> assignment to a Map<Integer,
> > List<Integer>>, so the request explicitly identified the new topics,
> rather
> > than it being implied. That would make it slightly less easy to form a
> > valid request for the normal case of consecutive partition ids: You'd
> have
> > to actually know the current number of partitions, which might
> necessitate
> > a describeTopics().
> >
> > It doesn't sound like there are any known use cases for non-consecutive
> > partition ids. It also sounds like whatever existing support there is
> might
> > be only lightly tested. It sounds like a source of gotchas and subtle
> bugs
> > to me (both in Kafka itself and for users). I have to wonder whether it
> > would be worth supporting this.
> >
> > If we decide not to support it, we should fix the rest of the AdminClient
> > so it's not possible to create non-consecutive partition ids.
> >
> > WDYT?
> >
> >
> > 3. Do we want to provide the target partition count or the number we want
> > > to increase it by? This is related to the first point as well. Thinking
> > > about it, one benefit of specifying the target number is that the
> > operation
> > > is then idempotent. If we state the number we want to increase by, it's
> > > easier to make a mistake and increase it twice under failure scenarios.
> > Was
> > > that the motivation for specifying the target partition count?
> > >
> > >
> > Right, if you're just supplying an increment you can't be certain what
> > you're incrementing it to (which is what you actually care about). And
> > idempotency is so nice to have if something goes wrong.
> >
> > Using an increment would make the `NumPartitionIncrease` class a bit more
> > easily understood, as then the outer list would have to be the same size
> as
> > the increment. But for me idempotency is the more valuable feature.
> >
>

Re: [DISCUSS] KIP-195: AdminClient.increasePartitions

Posted by Tom Bentley <t....@gmail.com>.
Hi Paolo,

Thanks for commenting.

The main reason I suggested `NumPartitionsIncrease` rather than just
`NumPartitions` was in case we ever implement a decreaseNumPartitions()
API. The semantics of the class are not appropriate for using with a
decrease API, but calling it NumPartitions suggests that it was related.

Radical thought: What about if the method was called addPartitions() and
the class was called NewPartitions?

Then if an API for decreasing were ever implemented it could be
removePartitions() with a RemovedPartitions class if necessary.

Cheers,

Tom

On 8 September 2017 at 12:13, Paolo Patierno <pp...@live.com> wrote:

> My 2 cents about naming ...
>
>
> PartitionCount or NumPartitions sound better to me providing an "absolute"
> value (as today the kafka-topics script work) for an idempotent operation
> while the NumPartitionsIncrease name sounds to me like the "increment"
> value.
>
>
> Paolo Patierno
> Senior Software Engineer (IoT) @ Red Hat
> Microsoft MVP on Windows Embedded & IoT
> Microsoft Azure Advisor
>
> Twitter : @ppatierno<http://twitter.com/ppatierno>
> Linkedin : paolopatierno<http://it.linkedin.com/in/paolopatierno>
> Blog : DevExperience<http://paolopatierno.wordpress.com/>
>
>
> ________________________________
> From: Tom Bentley <t....@gmail.com>
> Sent: Friday, September 8, 2017 9:39 AM
> To: dev@kafka.apache.org
> Subject: Re: [DISCUSS] KIP-195: AdminClient.increasePartitions
>
> Hi Ismael,
>
> Thanks for the comments.
>
> My bad for not noticing the custom assignment requirement. The current
> > proposal has the following example (I updated it a little so that 2
> > partitions are added):
> >
> > increasePartitionCount(4, asList(asList(1, 2), asList(2, 3))
> >
> > Why not simply provide the assignment? For example, if you want to add 2
> > partitions, you'd simply do:
> >
> > increasePartitionCount(asList(asList(1, 2), asList(2, 3))
> >
> > Not sure why need the number.
>
>
> kafka-topics.sh allows to increase the number of partitions without
> supplying an assignment, so one reason is simply to be able to support
> that.
>
> When you don't supply an assignment you're leaving it to the cluster to
> decide for you. By requiring an assignment you're forcing the user to
> decide. The user might not care much and thus make a worse choice than if
> you'd left it to the server.
>
>
> > The other two questions:
> >
> > 2. Do we want to allow people to add non-consecutive partition ids? This
> is
> > possible to do with the current AdminUtils, but it's not exactly
> supported
> > (although it apparently works fine in the broker). Still, I wanted to
> make
> > sure we considered it.
> >
>
> I admit I had assumed this wasn't possible. How does partitioning work if
> there are holes? You would need the list of partition ids in order to
> produce a correct partition id.
>
> Suspending my scepticism for a moment, to support something like that we'd
> have to change the List<List<Integer>> assignment to a Map<Integer,
> List<Integer>>, so the request explicitly identified the new topics, rather
> than it being implied. That would make it slightly less easy to form a
> valid request for the normal case of consecutive partition ids: You'd have
> to actually know the current number of partitions, which might necessitate
> a describeTopics().
>
> It doesn't sound like there are any known use cases for non-consecutive
> partition ids. It also sounds like whatever existing support there is might
> be only lightly tested. It sounds like a source of gotchas and subtle bugs
> to me (both in Kafka itself and for users). I have to wonder whether it
> would be worth supporting this.
>
> If we decide not to support it, we should fix the rest of the AdminClient
> so it's not possible to create non-consecutive partition ids.
>
> WDYT?
>
>
> 3. Do we want to provide the target partition count or the number we want
> > to increase it by? This is related to the first point as well. Thinking
> > about it, one benefit of specifying the target number is that the
> operation
> > is then idempotent. If we state the number we want to increase by, it's
> > easier to make a mistake and increase it twice under failure scenarios.
> Was
> > that the motivation for specifying the target partition count?
> >
> >
> Right, if you're just supplying an increment you can't be certain what
> you're incrementing it to (which is what you actually care about). And
> idempotency is so nice to have if something goes wrong.
>
> Using an increment would make the `NumPartitionIncrease` class a bit more
> easily understood, as then the outer list would have to be the same size as
> the increment. But for me idempotency is the more valuable feature.
>

Re: [DISCUSS] KIP-195: AdminClient.increasePartitions

Posted by Paolo Patierno <pp...@live.com>.
My 2 cents about naming ...


PartitionCount or NumPartitions sound better to me providing an "absolute" value (as today the kafka-topics script work) for an idempotent operation while the NumPartitionsIncrease name sounds to me like the "increment" value.


Paolo Patierno
Senior Software Engineer (IoT) @ Red Hat
Microsoft MVP on Windows Embedded & IoT
Microsoft Azure Advisor

Twitter : @ppatierno<http://twitter.com/ppatierno>
Linkedin : paolopatierno<http://it.linkedin.com/in/paolopatierno>
Blog : DevExperience<http://paolopatierno.wordpress.com/>


________________________________
From: Tom Bentley <t....@gmail.com>
Sent: Friday, September 8, 2017 9:39 AM
To: dev@kafka.apache.org
Subject: Re: [DISCUSS] KIP-195: AdminClient.increasePartitions

Hi Ismael,

Thanks for the comments.

My bad for not noticing the custom assignment requirement. The current
> proposal has the following example (I updated it a little so that 2
> partitions are added):
>
> increasePartitionCount(4, asList(asList(1, 2), asList(2, 3))
>
> Why not simply provide the assignment? For example, if you want to add 2
> partitions, you'd simply do:
>
> increasePartitionCount(asList(asList(1, 2), asList(2, 3))
>
> Not sure why need the number.


kafka-topics.sh allows to increase the number of partitions without
supplying an assignment, so one reason is simply to be able to support that.

When you don't supply an assignment you're leaving it to the cluster to
decide for you. By requiring an assignment you're forcing the user to
decide. The user might not care much and thus make a worse choice than if
you'd left it to the server.


> The other two questions:
>
> 2. Do we want to allow people to add non-consecutive partition ids? This is
> possible to do with the current AdminUtils, but it's not exactly supported
> (although it apparently works fine in the broker). Still, I wanted to make
> sure we considered it.
>

I admit I had assumed this wasn't possible. How does partitioning work if
there are holes? You would need the list of partition ids in order to
produce a correct partition id.

Suspending my scepticism for a moment, to support something like that we'd
have to change the List<List<Integer>> assignment to a Map<Integer,
List<Integer>>, so the request explicitly identified the new topics, rather
than it being implied. That would make it slightly less easy to form a
valid request for the normal case of consecutive partition ids: You'd have
to actually know the current number of partitions, which might necessitate
a describeTopics().

It doesn't sound like there are any known use cases for non-consecutive
partition ids. It also sounds like whatever existing support there is might
be only lightly tested. It sounds like a source of gotchas and subtle bugs
to me (both in Kafka itself and for users). I have to wonder whether it
would be worth supporting this.

If we decide not to support it, we should fix the rest of the AdminClient
so it's not possible to create non-consecutive partition ids.

WDYT?


3. Do we want to provide the target partition count or the number we want
> to increase it by? This is related to the first point as well. Thinking
> about it, one benefit of specifying the target number is that the operation
> is then idempotent. If we state the number we want to increase by, it's
> easier to make a mistake and increase it twice under failure scenarios. Was
> that the motivation for specifying the target partition count?
>
>
Right, if you're just supplying an increment you can't be certain what
you're incrementing it to (which is what you actually care about). And
idempotency is so nice to have if something goes wrong.

Using an increment would make the `NumPartitionIncrease` class a bit more
easily understood, as then the outer list would have to be the same size as
the increment. But for me idempotency is the more valuable feature.

Re: [DISCUSS] KIP-195: AdminClient.increasePartitions

Posted by Tom Bentley <t....@gmail.com>.
Hi Ismael,

Thanks for the comments.

My bad for not noticing the custom assignment requirement. The current
> proposal has the following example (I updated it a little so that 2
> partitions are added):
>
> increasePartitionCount(4, asList(asList(1, 2), asList(2, 3))
>
> Why not simply provide the assignment? For example, if you want to add 2
> partitions, you'd simply do:
>
> increasePartitionCount(asList(asList(1, 2), asList(2, 3))
>
> Not sure why need the number.


kafka-topics.sh allows to increase the number of partitions without
supplying an assignment, so one reason is simply to be able to support that.

When you don't supply an assignment you're leaving it to the cluster to
decide for you. By requiring an assignment you're forcing the user to
decide. The user might not care much and thus make a worse choice than if
you'd left it to the server.


> The other two questions:
>
> 2. Do we want to allow people to add non-consecutive partition ids? This is
> possible to do with the current AdminUtils, but it's not exactly supported
> (although it apparently works fine in the broker). Still, I wanted to make
> sure we considered it.
>

I admit I had assumed this wasn't possible. How does partitioning work if
there are holes? You would need the list of partition ids in order to
produce a correct partition id.

Suspending my scepticism for a moment, to support something like that we'd
have to change the List<List<Integer>> assignment to a Map<Integer,
List<Integer>>, so the request explicitly identified the new topics, rather
than it being implied. That would make it slightly less easy to form a
valid request for the normal case of consecutive partition ids: You'd have
to actually know the current number of partitions, which might necessitate
a describeTopics().

It doesn't sound like there are any known use cases for non-consecutive
partition ids. It also sounds like whatever existing support there is might
be only lightly tested. It sounds like a source of gotchas and subtle bugs
to me (both in Kafka itself and for users). I have to wonder whether it
would be worth supporting this.

If we decide not to support it, we should fix the rest of the AdminClient
so it's not possible to create non-consecutive partition ids.

WDYT?


3. Do we want to provide the target partition count or the number we want
> to increase it by? This is related to the first point as well. Thinking
> about it, one benefit of specifying the target number is that the operation
> is then idempotent. If we state the number we want to increase by, it's
> easier to make a mistake and increase it twice under failure scenarios. Was
> that the motivation for specifying the target partition count?
>
>
Right, if you're just supplying an increment you can't be certain what
you're incrementing it to (which is what you actually care about). And
idempotency is so nice to have if something goes wrong.

Using an increment would make the `NumPartitionIncrease` class a bit more
easily understood, as then the outer list would have to be the same size as
the increment. But for me idempotency is the more valuable feature.

Re: [DISCUSS] KIP-195: AdminClient.increasePartitions

Posted by Ismael Juma <is...@juma.me.uk>.
My bad for not noticing the custom assignment requirement. The current
proposal has the following example (I updated it a little so that 2
partitions are added):

increasePartitionCount(4, asList(asList(1, 2), asList(2, 3))

Why not simply provide the assignment? For example, if you want to add 2
partitions, you'd simply do:

increasePartitionCount(asList(asList(1, 2), asList(2, 3))

Not sure why need the number. The other two questions:

2. Do we want to allow people to add non-consecutive partition ids? This is
possible to do with the current AdminUtils, but it's not exactly supported
(although it apparently works fine in the broker). Still, I wanted to make
sure we considered it.

3. Do we want to provide the target partition count or the number we want
to increase it by? This is related to the first point as well. Thinking
about it, one benefit of specifying the target number is that the operation
is then idempotent. If we state the number we want to increase by, it's
easier to make a mistake and increase it twice under failure scenarios. Was
that the motivation for specifying the target partition count?

Ismael

On Fri, Sep 8, 2017 at 9:22 AM, Tom Bentley <t....@gmail.com> wrote:

> Hi Ted and Ismael,
>
> Thanks for the comments.
>
> Ted, I've fixed the KIP for your comments.
>
> Ismael, see responses inline:
>
> On 8 September 2017 at 02:00, Ismael Juma <is...@juma.me.uk> wrote:
>
> > Thanks Tom. Thanks for the KIP. A few comments:
> >
> > 1. Does the `PartitionCount` class still make sense given that the method
> > can only increase the number of partitions now?
> >
>
> Yes, it still makes sense because its optional to supply assignments of the
> new partitions to brokers. I'm not totally convinced it's the best name
> though.
>
>
> > 2,. In `NewTopic`, we have `numPartitions`. Should we keep using that
> > variant and call the method `increaseNumPartitions`?
> >
>
> I don't mind, renaming it. It would make the `PartitionCount` name even
> less convincing though, so perhaps the right name for that would be
> `NumPartitionIncrease`?
>
>
> > 3. Since this has been discussed at length as part of the reassign
> > partitions KIP, I suggest starting the vote tomorrow if there are no
> > objections from others.
> >
>
> I'll start the vote today unless anyone raises points that I can't address.
>
> Thanks,
>
> Tom
>
>
> >
> > Thanks,
> > Ismael
> >
> > On Thu, Sep 7, 2017 at 5:38 PM, Tom Bentley <t....@gmail.com>
> wrote:
> >
> > > As suggested by Ismael, I've factored the increasePartitionCounts() API
> > out
> > > of KIP-179 out into a separate KIP which hopefully can progress more
> > > quickly.
> > >
> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-195%
> > > 3A+AdminClient.increasePartitions
> > >
> > > If you've looked at KIP-179 in the last few days there's no much new to
> > see
> > > here, but if not you should find KIP-195 a lighter read.
> > >
> > > Cheers,
> > >
> > > Tom
> > >
> >
>

Re: [DISCUSS] KIP-195: AdminClient.increasePartitions

Posted by Tom Bentley <t....@gmail.com>.
Hi Ted and Ismael,

Thanks for the comments.

Ted, I've fixed the KIP for your comments.

Ismael, see responses inline:

On 8 September 2017 at 02:00, Ismael Juma <is...@juma.me.uk> wrote:

> Thanks Tom. Thanks for the KIP. A few comments:
>
> 1. Does the `PartitionCount` class still make sense given that the method
> can only increase the number of partitions now?
>

Yes, it still makes sense because its optional to supply assignments of the
new partitions to brokers. I'm not totally convinced it's the best name
though.


> 2,. In `NewTopic`, we have `numPartitions`. Should we keep using that
> variant and call the method `increaseNumPartitions`?
>

I don't mind, renaming it. It would make the `PartitionCount` name even
less convincing though, so perhaps the right name for that would be
`NumPartitionIncrease`?


> 3. Since this has been discussed at length as part of the reassign
> partitions KIP, I suggest starting the vote tomorrow if there are no
> objections from others.
>

I'll start the vote today unless anyone raises points that I can't address.

Thanks,

Tom


>
> Thanks,
> Ismael
>
> On Thu, Sep 7, 2017 at 5:38 PM, Tom Bentley <t....@gmail.com> wrote:
>
> > As suggested by Ismael, I've factored the increasePartitionCounts() API
> out
> > of KIP-179 out into a separate KIP which hopefully can progress more
> > quickly.
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-195%
> > 3A+AdminClient.increasePartitions
> >
> > If you've looked at KIP-179 in the last few days there's no much new to
> see
> > here, but if not you should find KIP-195 a lighter read.
> >
> > Cheers,
> >
> > Tom
> >
>

Re: [DISCUSS] KIP-195: AdminClient.increasePartitions

Posted by Ismael Juma <is...@juma.me.uk>.
Thanks Tom. Thanks for the KIP. A few comments:

1. Does the `PartitionCount` class still make sense given that the method
can only increase the number of partitions now?
2,. In `NewTopic`, we have `numPartitions`. Should we keep using that
variant and call the method `increaseNumPartitions`?
3. Since this has been discussed at length as part of the reassign
partitions KIP, I suggest starting the vote tomorrow if there are no
objections from others.

Thanks,
Ismael

On Thu, Sep 7, 2017 at 5:38 PM, Tom Bentley <t....@gmail.com> wrote:

> As suggested by Ismael, I've factored the increasePartitionCounts() API out
> of KIP-179 out into a separate KIP which hopefully can progress more
> quickly.
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-195%
> 3A+AdminClient.increasePartitions
>
> If you've looked at KIP-179 in the last few days there's no much new to see
> here, but if not you should find KIP-195 a lighter read.
>
> Cheers,
>
> Tom
>

Re: [DISCUSS] KIP-195: AdminClient.increasePartitions

Posted by Ted Yu <yu...@gmail.com>.
Tom:
Looks good overall.

bq. for the topic from the AlterPartitionCountsResult

Please align the name of Result with current proposal.

Please also fill in JIRA number when you have it.

On Thu, Sep 7, 2017 at 9:38 AM, Tom Bentley <t....@gmail.com> wrote:

> As suggested by Ismael, I've factored the increasePartitionCounts() API out
> of KIP-179 out into a separate KIP which hopefully can progress more
> quickly.
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-195%3A+AdminClient.
> increasePartitions
>
> If you've looked at KIP-179 in the last few days there's no much new to see
> here, but if not you should find KIP-195 a lighter read.
>
> Cheers,
>
> Tom
>