Posted to dev@kafka.apache.org by Jeyhun Karimov <je...@gmail.com> on 2017/04/28 15:13:39 UTC

Splitting tasks in streams?

Hi community,

I have a question regarding the Streams library.

Currently, in kafka-streams we run the whole topology in one instance, and
there can be several topologies or tasks on a single node. However, some
use cases involve very complex topologies with costly operators. When we
want to scale up such a topology, instead of copying the whole topology to
run in parallel, we may want to scale up only specific operators (or
subgraphs within tasks), chosen according to a defined cost function.
So my question is: is this use case compatible with the goals of
kafka-streams, and would the community appreciate a contribution in this
direction?


Cheers,
Jeyhun
-- 
-Cheers

Jeyhun

Re: Splitting tasks in streams?

Posted by "Matthias J. Sax" <ma...@confluent.io>.
I guess you can do this "manually" by splitting your code into multiple
applications so you can scale each part independently.
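
As a rough illustration of the manual split described above, here is a
minimal sketch that only builds the configuration for two separate Streams
applications. The application ids, the intermediate topic name, the broker
address, and the thread counts are all assumptions made up for this
example; only the config keys (application.id, bootstrap.servers,
num.stream.threads) are standard Kafka Streams settings.

```java
import java.util.Properties;

public class SplitTopologyConfigs {
    // Hypothetical topic that would carry records from the first application to the second.
    static final String INTERMEDIATE_TOPIC = "parsed-events";

    // Builds the settings for one of the split applications.
    // A distinct application.id gives each part its own consumer group,
    // so each part can be scaled on its own.
    static Properties configFor(String applicationId, int streamThreads) {
        Properties props = new Properties();
        props.put("application.id", applicationId);
        props.put("bootstrap.servers", "localhost:9092"); // assumed broker address
        props.put("num.stream.threads", Integer.toString(streamThreads));
        return props;
    }

    public static void main(String[] args) {
        // Assumed split: a cheap first stage and a costly second stage.
        Properties cheap = configFor("cheap-parsing-app", 1);
        Properties costly = configFor("costly-aggregation-app", 4);

        System.out.println(cheap.getProperty("application.id")
                + " -> " + INTERMEDIATE_TOPIC
                + " -> " + costly.getProperty("application.id"));
    }
}
```

In such a setup, each Properties object would be passed to its own
KafkaStreams instance together with a topology that covers only that part:
the first application writes to the intermediate topic and the second one
reads from it. The costly part can then be given more threads or more
instances than the cheap part, up to the number of partitions of its input
topic.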

I am not against an improvement in Streams itself, but I am not sure how
this could be done atm.


-Matthias

On 4/28/17 8:42 AM, Eno Thereska wrote:
> Hi Jeyhun,
> 
> You make a good observation and I think a discussion/contribution around this would be very much appreciated by the community. Are you thinking of a KIP perhaps?
> 
> Eno


Re: Splitting tasks in streams?

Posted by Jeyhun Karimov <je...@gmail.com>.
Hi Eno,

Thanks for the reply. For me it was important to know that this use case
falls within the scope of kafka-streams. I would put it in the future
plans, as I don't think now is the appropriate time to introduce this
feature in the Streams library. For now, implementing query optimization
(like [1]) on a given topology and effective load balancing (like [2])
would be a good start toward the goal described in my previous email.

[1] https://issues.apache.org/jira/browse/KAFKA-4601
[2] https://issues.apache.org/jira/browse/KAFKA-4969

Cheers,
Jeyhun


On Fri, Apr 28, 2017 at 5:42 PM Eno Thereska <en...@gmail.com> wrote:

> Hi Jeyhun,
>
> You make a good observation and I think a discussion/contribution around
> this would be very much appreciated by the community. Are you thinking of a
> KIP perhaps?
>
> Eno

Re: Splitting tasks in streams?

Posted by Eno Thereska <en...@gmail.com>.
Hi Jeyhun,

You make a good observation and I think a discussion/contribution around this would be very much appreciated by the community. Are you thinking of a KIP perhaps?

Eno
