You are viewing a plain text version of this content. The canonical link for it is here.
Posted to users@kafka.apache.org by Navneeth Krishnan <re...@gmail.com> on 2019/07/30 03:58:13 UTC

Partition assignment in kafka streams

Hi,

I'm using the processor topology for my use case and I would like to get
the partitions assigned to a particular stream instance. I looked at the
addSouce function but I don't see a way to add a callback to get notified
when partition assignment or reassignment happens. Please advise.

Thank you

Re: Partition assignment in kafka streams

Posted by Guozhang Wang <wa...@gmail.com>.
Hi Navneeth,

In Streams you can only get your assignment at runtime via the API
mentioned above, but cannot get it before the streams app starts up (I
assume that is what you meant for "upfront"). So if you can 1) first start
the streams app, and then 2) read the assignment, and then start writing
data to each instance's local storage then it should work fine. If you
cannot do step 2) after step 1) currently there's no good way to achieve
your goal..


Guozhang


On Wed, Aug 7, 2019 at 10:40 PM Navneeth Krishnan <re...@gmail.com>
wrote:

> Hi All,
>
> Any suggestions?
>
> Thanks
>
>
> On Thu, Aug 1, 2019 at 8:58 PM Navneeth Krishnan <reachnavneeth2@gmail.com
> >
> wrote:
>
> > Hi Guozhang,
> >
> > Thanks for the clarification. What I want to achieve is use of localized
> > data. We have much larger state which has to be used at a per instance
> > context. So if I can get the assignment upfront I can basically write
> data
> > to partitions in such a way that all data goes to that specific node
> which
> > handles the logic.
> >
> > I wouldn't be able to achieve my use case with just one stream worker,
> I'm
> > trying to spawn up multiple worker and wire up the instance with some
> > static data which will be used in the per message business logic.
> >
> > Thanks
> >
> > On Thu, Aug 1, 2019 at 9:51 AM Guozhang Wang <wa...@gmail.com> wrote:
> >
> >> Hello Navneeth,
> >>
> >> I may be misunderstanding your intent from the previous emails here, so
> >> just a quick summary:
> >>
> >> 1) if you just want to "know" which partitions are assigned to which
> >> instance, this can be retrieved in multiple ways (e.g. the one mentioned
> >> by
> >> Matthias, and also one can get this info from JMX metrics which shows
> >> threads->tasks mapping).
> >>
> >> 2) if you want to "manipulate" the assignment so that a specific set of
> >> partitions to be assigned to a specific instance, today it is not doable
> >> directly as Streams library does not expose the task assignor
> customizable
> >> by users.
> >>
> >> Guozhang
> >>
> >> On Wed, Jul 31, 2019 at 4:48 PM Matthias J. Sax <ma...@confluent.io>
> >> wrote:
> >>
> >> > You cannot hook into partition assignment, and I am not sure what you
> >> > exactly want to do.
> >> >
> >> > You can get local assignment metadata via
> >> > `KafkaStreams#localThreadMetadata()` though.
> >> >
> >> > Hope this helps.
> >> >
> >> >
> >> > -Matthias
> >> >
> >> > On 7/29/19 11:29 PM, Navneeth Krishnan wrote:
> >> > > Hi All,
> >> > >
> >> > > The main reason for knowing the partitions is to have a localized
> >> routing
> >> > > based on partitions assigned to set a stream tasks. This would
> really
> >> > help
> >> > > in my use case.
> >> > >
> >> > > Thanks
> >> > >
> >> > > On Mon, Jul 29, 2019 at 8:58 PM Navneeth Krishnan <
> >> > reachnavneeth2@gmail.com>
> >> > > wrote:
> >> > >
> >> > >> Hi,
> >> > >>
> >> > >> I'm using the processor topology for my use case and I would like
> to
> >> get
> >> > >> the partitions assigned to a particular stream instance. I looked
> at
> >> the
> >> > >> addSouce function but I don't see a way to add a callback to get
> >> > notified
> >> > >> when partition assignment or reassignment happens. Please advise.
> >> > >>
> >> > >> Thank you
> >> > >>
> >> > >
> >> >
> >> >
> >>
> >> --
> >> -- Guozhang
> >>
> >
>


-- 
-- Guozhang

Re: Partition assignment in kafka streams

Posted by Navneeth Krishnan <re...@gmail.com>.
Hi All,

Any suggestions?

Thanks


On Thu, Aug 1, 2019 at 8:58 PM Navneeth Krishnan <re...@gmail.com>
wrote:

> Hi Guozhang,
>
> Thanks for the clarification. What I want to achieve is use of localized
> data. We have much larger state which has to be used at a per instance
> context. So if I can get the assignment upfront I can basically write data
> to partitions in such a way that all data goes to that specific node which
> handles the logic.
>
> I wouldn't be able to achieve my use case with just one stream worker, I'm
> trying to spawn up multiple worker and wire up the instance with some
> static data which will be used in the per message business logic.
>
> Thanks
>
> On Thu, Aug 1, 2019 at 9:51 AM Guozhang Wang <wa...@gmail.com> wrote:
>
>> Hello Navneeth,
>>
>> I may be misunderstanding your intent from the previous emails here, so
>> just a quick summary:
>>
>> 1) if you just want to "know" which partitions are assigned to which
>> instance, this can be retrieved in multiple ways (e.g. the one mentioned
>> by
>> Matthias, and also one can get this info from JMX metrics which shows
>> threads->tasks mapping).
>>
>> 2) if you want to "manipulate" the assignment so that a specific set of
>> partitions to be assigned to a specific instance, today it is not doable
>> directly as Streams library does not expose the task assignor customizable
>> by users.
>>
>> Guozhang
>>
>> On Wed, Jul 31, 2019 at 4:48 PM Matthias J. Sax <ma...@confluent.io>
>> wrote:
>>
>> > You cannot hook into partition assignment, and I am not sure what you
>> > exactly want to do.
>> >
>> > You can get local assignment metadata via
>> > `KafkaStreams#localThreadMetadata()` though.
>> >
>> > Hope this helps.
>> >
>> >
>> > -Matthias
>> >
>> > On 7/29/19 11:29 PM, Navneeth Krishnan wrote:
>> > > Hi All,
>> > >
>> > > The main reason for knowing the partitions is to have a localized
>> routing
>> > > based on partitions assigned to set a stream tasks. This would really
>> > help
>> > > in my use case.
>> > >
>> > > Thanks
>> > >
>> > > On Mon, Jul 29, 2019 at 8:58 PM Navneeth Krishnan <
>> > reachnavneeth2@gmail.com>
>> > > wrote:
>> > >
>> > >> Hi,
>> > >>
>> > >> I'm using the processor topology for my use case and I would like to
>> get
>> > >> the partitions assigned to a particular stream instance. I looked at
>> the
>> > >> addSouce function but I don't see a way to add a callback to get
>> > notified
>> > >> when partition assignment or reassignment happens. Please advise.
>> > >>
>> > >> Thank you
>> > >>
>> > >
>> >
>> >
>>
>> --
>> -- Guozhang
>>
>

Re: Partition assignment in kafka streams

Posted by Navneeth Krishnan <re...@gmail.com>.
Hi Guozhang,

Thanks for the clarification. What I want to achieve is use of localized
data. We have much larger state which has to be used at a per instance
context. So if I can get the assignment upfront I can basically write data
to partitions in such a way that all data goes to that specific node which
handles the logic.

I wouldn't be able to achieve my use case with just one stream worker, I'm
trying to spawn up multiple worker and wire up the instance with some
static data which will be used in the per message business logic.

Thanks

On Thu, Aug 1, 2019 at 9:51 AM Guozhang Wang <wa...@gmail.com> wrote:

> Hello Navneeth,
>
> I may be misunderstanding your intent from the previous emails here, so
> just a quick summary:
>
> 1) if you just want to "know" which partitions are assigned to which
> instance, this can be retrieved in multiple ways (e.g. the one mentioned by
> Matthias, and also one can get this info from JMX metrics which shows
> threads->tasks mapping).
>
> 2) if you want to "manipulate" the assignment so that a specific set of
> partitions to be assigned to a specific instance, today it is not doable
> directly as Streams library does not expose the task assignor customizable
> by users.
>
> Guozhang
>
> On Wed, Jul 31, 2019 at 4:48 PM Matthias J. Sax <ma...@confluent.io>
> wrote:
>
> > You cannot hook into partition assignment, and I am not sure what you
> > exactly want to do.
> >
> > You can get local assignment metadata via
> > `KafkaStreams#localThreadMetadata()` though.
> >
> > Hope this helps.
> >
> >
> > -Matthias
> >
> > On 7/29/19 11:29 PM, Navneeth Krishnan wrote:
> > > Hi All,
> > >
> > > The main reason for knowing the partitions is to have a localized
> routing
> > > based on partitions assigned to set a stream tasks. This would really
> > help
> > > in my use case.
> > >
> > > Thanks
> > >
> > > On Mon, Jul 29, 2019 at 8:58 PM Navneeth Krishnan <
> > reachnavneeth2@gmail.com>
> > > wrote:
> > >
> > >> Hi,
> > >>
> > >> I'm using the processor topology for my use case and I would like to
> get
> > >> the partitions assigned to a particular stream instance. I looked at
> the
> > >> addSouce function but I don't see a way to add a callback to get
> > notified
> > >> when partition assignment or reassignment happens. Please advise.
> > >>
> > >> Thank you
> > >>
> > >
> >
> >
>
> --
> -- Guozhang
>

Re: Partition assignment in kafka streams

Posted by Guozhang Wang <wa...@gmail.com>.
Hello Navneeth,

I may be misunderstanding your intent from the previous emails here, so
just a quick summary:

1) if you just want to "know" which partitions are assigned to which
instance, this can be retrieved in multiple ways (e.g. the one mentioned by
Matthias, and also one can get this info from JMX metrics which shows
threads->tasks mapping).

2) if you want to "manipulate" the assignment so that a specific set of
partitions to be assigned to a specific instance, today it is not doable
directly as Streams library does not expose the task assignor customizable
by users.

Guozhang

On Wed, Jul 31, 2019 at 4:48 PM Matthias J. Sax <ma...@confluent.io>
wrote:

> You cannot hook into partition assignment, and I am not sure what you
> exactly want to do.
>
> You can get local assignment metadata via
> `KafkaStreams#localThreadMetadata()` though.
>
> Hope this helps.
>
>
> -Matthias
>
> On 7/29/19 11:29 PM, Navneeth Krishnan wrote:
> > Hi All,
> >
> > The main reason for knowing the partitions is to have a localized routing
> > based on partitions assigned to set a stream tasks. This would really
> help
> > in my use case.
> >
> > Thanks
> >
> > On Mon, Jul 29, 2019 at 8:58 PM Navneeth Krishnan <
> reachnavneeth2@gmail.com>
> > wrote:
> >
> >> Hi,
> >>
> >> I'm using the processor topology for my use case and I would like to get
> >> the partitions assigned to a particular stream instance. I looked at the
> >> addSouce function but I don't see a way to add a callback to get
> notified
> >> when partition assignment or reassignment happens. Please advise.
> >>
> >> Thank you
> >>
> >
>
>

-- 
-- Guozhang

Re: Partition assignment in kafka streams

Posted by "Matthias J. Sax" <ma...@confluent.io>.
You cannot hook into partition assignment, and I am not sure what you
exactly want to do.

You can get local assignment metadata via
`KafkaStreams#localThreadMetadata()` though.

Hope this helps.


-Matthias

On 7/29/19 11:29 PM, Navneeth Krishnan wrote:
> Hi All,
> 
> The main reason for knowing the partitions is to have a localized routing
> based on partitions assigned to set a stream tasks. This would really help
> in my use case.
> 
> Thanks
> 
> On Mon, Jul 29, 2019 at 8:58 PM Navneeth Krishnan <re...@gmail.com>
> wrote:
> 
>> Hi,
>>
>> I'm using the processor topology for my use case and I would like to get
>> the partitions assigned to a particular stream instance. I looked at the
>> addSouce function but I don't see a way to add a callback to get notified
>> when partition assignment or reassignment happens. Please advise.
>>
>> Thank you
>>
> 


Re: Partition assignment in kafka streams

Posted by Navneeth Krishnan <re...@gmail.com>.
Hi All,

The main reason for knowing the partitions is to have a localized routing
based on partitions assigned to set a stream tasks. This would really help
in my use case.

Thanks

On Mon, Jul 29, 2019 at 8:58 PM Navneeth Krishnan <re...@gmail.com>
wrote:

> Hi,
>
> I'm using the processor topology for my use case and I would like to get
> the partitions assigned to a particular stream instance. I looked at the
> addSouce function but I don't see a way to add a callback to get notified
> when partition assignment or reassignment happens. Please advise.
>
> Thank you
>