Posted to dev@kafka.apache.org by Jeyhun Karimov <je...@gmail.com> on 2017/04/28 15:13:39 UTC
Splitting tasks in streams?
Hi community,
I have a question regarding the Streams library.
Currently, in Kafka Streams we run the whole topology in one instance, and
there can be several topologies or tasks on a single node. However, there
can be use-cases with very complex topologies that contain costly operators.
So, when we want to scale up, instead of copying the whole topology to run in
parallel, we may need to scale up only specific operators (or subgraphs in
tasks) of the topology, depending on a defined cost function.
So my question is: is this use-case compatible with the motivation behind
Kafka Streams, and would a contribution in this direction be appreciated by
the community?
Cheers,
Jeyhun
Re: Splitting tasks in streams?
Posted by "Matthias J. Sax" <ma...@confluent.io>.
I guess you can do this "manually" by splitting your code into multiple
applications so you can scale each part independently.
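A minimal sketch of that manual split (an illustration, not code from this thread): each part becomes its own Streams application with a distinct application.id, and the parts are linked through an intermediate topic. The application ids, topic name, broker address, and thread counts below are hypothetical; only the config keys application.id, bootstrap.servers, and num.stream.threads are real Kafka Streams settings.

```java
import java.util.Properties;

public class SplitAppsSketch {
    // Hypothetical topic that the cheap part writes to and the costly part reads from.
    static final String INTERMEDIATE_TOPIC = "parsed-events";

    // Builds the configuration for one independently deployed Streams application.
    static Properties configFor(String applicationId, int streamThreads) {
        Properties props = new Properties();
        // A distinct application.id gives each part its own consumer group and state.
        props.setProperty("application.id", applicationId);
        // Assumed broker address, for illustration only.
        props.setProperty("bootstrap.servers", "localhost:9092");
        // Per-instance thread count; each part can choose its own value.
        props.setProperty("num.stream.threads", Integer.toString(streamThreads));
        return props;
    }

    public static void main(String[] args) {
        // Cheap operators (e.g. parsing) run in one application...
        Properties cheapPart = configFor("parse-app", 1);
        // ...and the costly operators run in another that can be scaled separately.
        Properties costlyPart = configFor("enrich-app", 4);

        System.out.println(cheapPart.getProperty("application.id")
                + " writes to " + INTERMEDIATE_TOPIC);
        System.out.println(costlyPart.getProperty("application.id")
                + " reads from " + INTERMEDIATE_TOPIC + " with "
                + costlyPart.getProperty("num.stream.threads") + " threads per instance");
    }
}
```

Each application would be started as a separate process with its own Properties, so adding instances of the costly part leaves the cheap part untouched. The parallelism of each part is still bounded by the number of partitions of its input topic.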
I am not against an improvement in Streams itself, but I am not sure how
this could be done atm.
-Matthias
On 4/28/17 8:42 AM, Eno Thereska wrote:
> Hi Jeyhun,
>
> You make a good observation and I think a discussion/contribution around this would be very much appreciated by the community. Are you thinking of a KIP perhaps?
>
> Eno
Re: Splitting tasks in streams?
Posted by Jeyhun Karimov <je...@gmail.com>.
Hi Eno,
Thanks for the reply. For me it was important to know that this use-case
falls within the scope of Kafka Streams. I would put this in future
plans, as I don't think now is the appropriate time to introduce this feature
in the Streams library. For now, implementing query optimization (like [1]) on
a given topology and effective load balancing (like [2]) would be a good
start toward the goal described in my previous email.
[1] https://issues.apache.org/jira/browse/KAFKA-4601
[2] https://issues.apache.org/jira/browse/KAFKA-4969
Cheers,
Jeyhun
On Fri, Apr 28, 2017 at 5:42 PM Eno Thereska <en...@gmail.com> wrote:
> Hi Jeyhun,
>
> You make a good observation and I think a discussion/contribution around
> this would be very much appreciated by the community. Are you thinking of a
> KIP perhaps?
>
> Eno
Re: Splitting tasks in streams?
Posted by Eno Thereska <en...@gmail.com>.
Hi Jeyhun,
You make a good observation and I think a discussion/contribution around this would be very much appreciated by the community. Are you thinking of a KIP perhaps?
Eno