Posted to users@kafka.apache.org by Valentin Forst <va...@aseno.de> on 2017/10/02 11:39:30 UTC

Using Kafka on DC/OS + Marathon

Hi there,

I work at a large company, and we are about to install Kafka on DC/OS (Mesos) and intend to use Marathon as the scheduler. Since I am new to DC/OS and Marathon, I was wondering whether this is a recommended way of running Kafka in a production environment.

My questions are:
- Kafka manages broker rebalancing (e.g. failover) using its own semantics. Can I trust Marathon to meet the requirements here?
- Since our container platform, DC/OS, is going to be used by other microservices, sooner or later this is going to cause performance issues. Would it be better to use a dedicated DC/OS instance for our Kafka cluster, or to run the Kafka cluster on its own?
- Is there anything else we should consider when running Kafka on DC/OS + Marathon?


Thanks in advance for your time.
Valentin


Re: Using Kafka on DC/OS + Marathon

Posted by Valentin Forst <va...@aseno.de>.
Hi David,

Thank you for your reply! Presumably I wasn’t clear in my previous post. Here is an example to illustrate what I'm trying to figure out:

Imagine we have a data flow propagating messages through a Kafka cluster that happens to consist of 3 brokers (3 partitions, 3 replicas). If one of those brokers goes down, Kafka does two things:
- Broker rebalancing
- Rebalancing the consumers within a group
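
As a rough illustration of the first point, here is a minimal sketch in Python. It is a toy model, not Kafka's controller logic: it assumes round-robin replica placement, assumes every replica is in sync, and simply promotes the first live replica to leader. The function names `assign_replicas` and `elect_leaders` are made up for this example.

```python
from typing import Dict, List, Optional, Set


def assign_replicas(brokers: List[int], partitions: int,
                    replication_factor: int) -> Dict[int, List[int]]:
    """Round-robin placement: partition p starts at broker index p % n."""
    n = len(brokers)
    return {
        p: [brokers[(p + i) % n] for i in range(replication_factor)]
        for p in range(partitions)
    }


def elect_leaders(assignment: Dict[int, List[int]],
                  live_brokers: Set[int]) -> Dict[int, Optional[int]]:
    """Toy election: the first live replica in the list becomes leader."""
    return {
        p: next((b for b in replicas if b in live_brokers), None)
        for p, replicas in assignment.items()
    }


# 3 brokers, 3 partitions, replication factor 3, as in the example above.
assignment = assign_replicas([0, 1, 2], partitions=3, replication_factor=3)
before = elect_leaders(assignment, {0, 1, 2})
after = elect_leaders(assignment, {0, 2})  # broker 1 has gone down
print(before)
print(after)
```

In this model, partition 1 moves its leader from broker 1 to broker 2 when broker 1 fails, while partitions 0 and 2 keep their leaders.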

Now, when Marathon restarts the failed broker, some messages could get duplicated or lost. That is exactly what I would like to avoid (it is a requirement). Does that make sense?

Does anyone have experience running Kafka on DC/OS + Marathon in a production environment with exactly-once semantics?
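
One common way to tolerate redelivery after a restart or rebalance is to make the consumer idempotent. The sketch below is a plain-Python simulation with no Kafka client involved. `DedupingConsumer` and the (partition, offset, value) record tuples are hypothetical, and it assumes that in a real system the processed results and the offset table would be stored together atomically. It addresses duplicates only, not lost messages, and it is not the transactional exactly-once support added in Kafka 0.11.

```python
from typing import Dict, List, Tuple

Record = Tuple[int, int, str]  # (partition, offset, value)


class DedupingConsumer:
    """Skips records whose offset was already processed for that partition."""

    def __init__(self) -> None:
        self.next_offset: Dict[int, int] = {}  # partition -> next expected offset
        self.processed: List[str] = []

    def handle(self, record: Record) -> bool:
        partition, offset, value = record
        if offset < self.next_offset.get(partition, 0):
            return False  # redelivered record: already processed, skip it
        self.processed.append(value)
        self.next_offset[partition] = offset + 1
        return True


consumer = DedupingConsumer()
first_delivery = [(0, 0, "a"), (0, 1, "b"), (0, 2, "c")]
# After a simulated rebalance the group resumes from an older committed
# offset, so offsets 1 and 2 arrive a second time.
redelivery = [(0, 1, "b"), (0, 2, "c"), (0, 3, "d")]
results = [consumer.handle(r) for r in first_delivery + redelivery]
print(consumer.processed)
```

Each value ends up in `consumer.processed` once, even though two records were delivered twice.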

Which option would you recommend?
1. Kafka on DC/OS + Marathon using Mesos private nodes (+ microservices on the public nodes)
2. Kafka on a separate DC/OS cluster, i.e. the microservices run on a different DC/OS cluster
3. A Kafka cluster on its own

Cheers,
Valentin


> On 02.10.2017 at 16:35, David Garcia <da...@spiceworks.com> wrote:
> 
> I’m not sure how your requirements of Kafka are related to your requirements for Marathon. Kafka is a streaming-log system and Marathon is a scheduler. Mesos, as your resource manager, simply “manages” resources. Are you asking about multitenancy? If so, I highly recommend that you separate your Kafka cluster (and ZooKeeper) from your other services. Kafka leverages the OS page cache to optimize read performance and it seems likely this would interfere with Mesos resource management policy.
> 
> -David 


Re: Using Kafka on DC/OS + Marathon

Posted by Sean Glover <se...@lightbend.com>.
No, I don't.  I help others that do :)

On Tue, Oct 3, 2017 at 1:12 PM, Valentin Forst <va...@aseno.de> wrote:

> Hi Sean,
>
> Thanks a lot for this info!
> Are you running DC/OS in prod?
>
> Regards
> Valentin


-- 
Senior Software Engineer, Lightbend, Inc.

<http://lightbend.com>

@seg1o <https://twitter.com/seg1o>

Re: Using Kafka on DC/OS + Marathon

Posted by Valentin Forst <va...@aseno.de>.
Hi Sean,

Thanks a lot for this info!
Are you running DC/OS in prod? 

Regards
Valentin

> On 03.10.2017 at 15:29, Sean Glover <se...@lightbend.com> wrote:
> 
> Hi Valentin,
> 
> Kafka is available on DC/OS in the Catalog (aka Universe) as part of the
> `kafka` package.  Mesosphere has put a lot of effort into making Kafka work
> on DC/OS.  Since Kafka requires persistent disk, brokers must stay put on
> their assigned Mesos agents after initial deployment.  Deployment and common
> ops tasks are supported with the help of the Kafka scheduler developed in
> the mesosphere/dcos-commons repo.  For example, configuration changes to
> brokers can be made through the DC/OS Kafka service (through the UI or the
> CLI) and deployed out to brokers as a rolling upgrade, where each broker's
> server config is updated and the server is cleanly bounced, one broker at a
> time.  The Kafka scheduler also supports other features such as upgrades
> for when Mesosphere releases a new scheduler update or when a new version
> of Kafka is available.  Common ops tasks like replacing a failed broker or
> adding more brokers are supported through the DC/OS CLI and Kafka scheduler
> configuration changes.  In short, most of the ops tasks are handled by the
> Kafka scheduler, but all other tasks are just Kafka as usual.
> 
> The biggest thing to watch out for is that running Kafka in DC/OS implies a
> shared mixed-use environment.  Other services could be running on the Mesos
> agents that brokers are installed on, which could lead to resource
> conflicts, etc.  By default DC/OS Kafka shares the ZooKeeper instances with
> Mesos and other services; you may want to consider a standalone cluster for
> Kafka.  All these concerns can be mitigated with configuration, but you'll
> need to get familiar with DC/OS and the Kafka scheduler before you run
> anything in prod.
> 
> Latest DC/OS Kafka release:
> https://docs.mesosphere.com/service-docs/kafka/2.0.1-0.11.0/
> 
> Regards,
> Sean


Re: Using Kafka on DC/OS + Marathon

Posted by Sean Glover <se...@lightbend.com>.
Hi Valentin,

Kafka is available on DC/OS in the Catalog (aka Universe) as part of the
`kafka` package.  Mesosphere has put a lot of effort into making Kafka work
on DC/OS.  Since Kafka requires persistent disk, brokers must stay put on
their assigned Mesos agents after initial deployment.  Deployment and common
ops tasks are supported with the help of the Kafka scheduler developed in
the mesosphere/dcos-commons repo.  For example, configuration changes to
brokers can be made through the DC/OS Kafka service (through the UI or the
CLI) and deployed out to brokers as a rolling upgrade, where each broker's
server config is updated and the server is cleanly bounced, one broker at a
time.  The Kafka scheduler also supports other features such as upgrades
for when Mesosphere releases a new scheduler update or when a new version
of Kafka is available.  Common ops tasks like replacing a failed broker or
adding more brokers are supported through the DC/OS CLI and Kafka scheduler
configuration changes.  In short, most of the ops tasks are handled by the
Kafka scheduler, but all other tasks are just Kafka as usual.
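
The rolling-upgrade behaviour described above can be sketched roughly as follows. This is a minimal Python simulation, not the dcos-commons scheduler: `rolling_restart`, `FakeCluster`, and the broker names are hypothetical, and a real rollout would wait between health checks rather than polling in a tight loop.

```python
from typing import Callable, Dict, Iterable, List


def rolling_restart(brokers: Iterable[str],
                    restart: Callable[[str], None],
                    is_healthy: Callable[[str], bool],
                    max_checks: int = 10) -> List[str]:
    """Restart brokers one at a time, waiting for each to become healthy."""
    done: List[str] = []
    for broker in brokers:
        restart(broker)
        for _ in range(max_checks):
            if is_healthy(broker):
                break
        else:
            # Never became healthy: stop so the other brokers stay untouched.
            raise RuntimeError(f"{broker} did not become healthy; stopping rollout")
        done.append(broker)
    return done


class FakeCluster:
    """In-memory stand-in for a cluster; tracks how many brokers are down."""

    def __init__(self, names: List[str]) -> None:
        self.checks_until_up: Dict[str, int] = {name: 0 for name in names}
        self.max_down_at_once = 0

    def restart(self, name: str) -> None:
        self.checks_until_up[name] = 2  # healthy on the second check
        down = sum(1 for v in self.checks_until_up.values() if v > 0)
        self.max_down_at_once = max(self.max_down_at_once, down)

    def is_healthy(self, name: str) -> bool:
        if self.checks_until_up[name] > 0:
            self.checks_until_up[name] -= 1
        return self.checks_until_up[name] == 0


names = ["kafka-0", "kafka-1", "kafka-2"]  # hypothetical broker names
cluster = FakeCluster(names)
order = rolling_restart(names, cluster.restart, cluster.is_healthy)
print(order, cluster.max_down_at_once)
```

The point of the sketch is the ordering: the next broker is only touched once the previous one reports healthy, so at most one broker is down at any time.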

The biggest thing to watch out for is that running Kafka in DC/OS implies a
shared mixed-use environment.  Other services could be running on the Mesos
agents that brokers are installed on, which could lead to resource
conflicts, etc.  By default DC/OS Kafka shares the ZooKeeper instances with
Mesos and other services; you may want to consider a standalone cluster for
Kafka.  All these concerns can be mitigated with configuration, but you'll
need to get familiar with DC/OS and the Kafka scheduler before you run
anything in prod.

Latest DC/OS Kafka release:
https://docs.mesosphere.com/service-docs/kafka/2.0.1-0.11.0/

Regards,
Sean

On Tue, Oct 3, 2017 at 5:20 AM, Valentin Forst <va...@aseno.de> wrote:

> Hi Avinash,
>
> Thanks for this hint.
>
> It would be great if someone could share their experience using this
> framework in a production environment.
>
> Thanks in advance
> Valentin



Re: Using Kafka on DC/OS + Marathon

Posted by Valentin Forst <va...@aseno.de>.
Hi Avinash,

Thanks for this hint. 

It would be great if someone could share their experience using this framework in a production environment.

Thanks in advance
Valentin

> On 02.10.2017 at 19:39, Avinash Shahdadpuri <av...@gmail.com> wrote:
> 
> There is a native Kafka framework that runs on top of DC/OS.
> 
> https://docs.mesosphere.com/service-docs/kafka/
> 
> This will most likely be a better way to run Kafka on DC/OS than
> running it as a Marathon framework.
> 
> 


Re: Using Kafka on DC/OS + Marathon

Posted by Avinash Shahdadpuri <av...@gmail.com>.
There is a native Kafka framework that runs on top of DC/OS.

https://docs.mesosphere.com/service-docs/kafka/

This will most likely be a better way to run Kafka on DC/OS than
running it as a Marathon framework.



On Mon, Oct 2, 2017 at 7:35 AM, David Garcia <da...@spiceworks.com> wrote:

> I’m not sure how your requirements of Kafka are related to your
> requirements for Marathon.  Kafka is a streaming-log system and Marathon is
> a scheduler.  Mesos, as your resource manager, simply “manages” resources.
> Are you asking about multitenancy?  If so, I highly recommend that you
> separate your Kafka cluster (and ZooKeeper) from your other services.
> Kafka leverages the OS page cache to optimize read performance and it seems
> likely this would interfere with Mesos resource management policy.
>
> -David

Re: Using Kafka on DC/OS + Marathon

Posted by David Garcia <da...@spiceworks.com>.
I’m not sure how your requirements of Kafka are related to your requirements for Marathon. Kafka is a streaming-log system and Marathon is a scheduler. Mesos, as your resource manager, simply “manages” resources. Are you asking about multitenancy? If so, I highly recommend that you separate your Kafka cluster (and ZooKeeper) from your other services. Kafka leverages the OS page cache to optimize read performance and it seems likely this would interfere with Mesos resource management policy.

-David 

On 10/2/17, 6:39 AM, "Valentin Forst" <va...@aseno.de> wrote:

    Hi there,
    
    I work at a large company, and we are about to install Kafka on DC/OS (Mesos) and intend to use Marathon as the scheduler. Since I am new to DC/OS and Marathon, I was wondering whether this is a recommended way of running Kafka in a production environment.
    
    My questions are:
    - Kafka manages broker rebalancing (e.g. failover) using its own semantics. Can I trust Marathon to meet the requirements here?
    - Since our container platform, DC/OS, is going to be used by other microservices, sooner or later this is going to cause performance issues. Would it be better to use a dedicated DC/OS instance for our Kafka cluster, or to run the Kafka cluster on its own?
    - Is there anything else we should consider when running Kafka on DC/OS + Marathon?
    
    
    Thanks in advance for your time.
    Valentin