Posted to users@kafka.apache.org by Ovidiu-Cristian MARCU <ov...@inria.fr> on 2017/07/20 17:24:02 UTC

Consumer throughput drop

Hi guys,

I’m relatively new to Kafka’s world. I have an issue I describe below, maybe you can help me understand this behaviour.

I’m running a benchmark using the following setup: one producer sends data to a topic and concurrently one consumer pulls and writes it to another topic.
Measuring the consumer throughput, I observe values around 500K records/s, but only until the system’s cache gets filled - from that moment the consumer throughput drops to ~200K records/s (2.5 times lower).
Looking at disk usage, I observe disk read I/O that corresponds to the moment the consumer throughput drops.
After some time, I kill the producer and immediately I observe the consumer throughput goes up to initial values ~ 500K records/s.
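As a quick arithmetic check of the figures above (an illustrative snippet, not part of the original benchmark):

```python
# Sanity-check the reported figures: a drop from ~500K records/s to
# ~200K records/s is a factor of 2.5, matching the report above.

def drop_factor(before_rps: float, after_rps: float) -> float:
    """How many times lower the throughput is after the drop."""
    return before_rps / after_rps

print(drop_factor(500_000, 200_000))  # prints 2.5
```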

What can I do to avoid this throughput drop?

Attached an image showing disk I/O and CPU usage. I have about 128GB RAM on that server which gets filled at time ~2300.

Thanks,
Ovidiu


Re: Consumer throughput drop

Posted by Ismael Juma <is...@juma.me.uk>.
Thanks for reporting the results. Maybe you could submit a PR that updates
the ops section?

https://github.com/apache/kafka/blob/trunk/docs/ops.html

Ismael

On Fri, Jul 21, 2017 at 2:49 PM, Ovidiu-Cristian MARCU <
ovidiu-cristian.marcu@inria.fr> wrote:

> After some tuning, I got better results. What I changed, as suggested:
>
> dirty_ratio = 10 (previously 20)
> dirty_background_ratio = 3 (previously 10)
>
> It results in disk read I/O that is almost completely zero (I have enough
> cache; the consumer is keeping up with the producer).
>
> - producer throughput remains constant at ~400K records/s;
> - consumer throughput (a Flink app) stays in the interval [300K/s, 500K/s]
> even when the cache is filled (there are some variations, but they are not
> influenced by the system’s cache);
>
> I don’t know whether Kafka’s documentation covers this already, but it
> could be added if you also reproduce my tests and consider it useful.
>
> Thanks,
> Ovidiu
>
> > On 21 Jul 2017, at 01:57, Apurva Mehta <ap...@confluent.io> wrote:
> >
> > Hi Ovidiu,
> >
> > The see-saw behavior is inevitable with Linux when you have concurrent
> reads and writes. However, tuning the following two settings may help
> achieve more stable performance (from Jay's link):
> >
> > dirty_ratio
> > Defines a percentage value. Writeout of dirty data begins (via pdflush)
> when dirty data comprises this percentage of total system memory. The
> default value is 20.
> > Red Hat recommends a slightly lower value of 15 for database workloads.
> >
> > dirty_background_ratio
> > Defines a percentage value. Writeout of dirty data begins in the
> background (via pdflush) when dirty data comprises this percentage of total
> memory. The default value is 10. For database workloads, Red Hat recommends
> a lower value of 3.
> >
> > Thanks,
> > Apurva
> >
> >
> > On Thu, Jul 20, 2017 at 12:25 PM, Ovidiu-Cristian MARCU <
> ovidiu-cristian.marcu@inria.fr <ma...@inria.fr>>
> wrote:
> > Yes, I’m using Debian Jessie 2.6 installed on this hardware [1].
> >
> > It is also my understanding that Kafka relies on the system’s cache (Linux
> in this case), which uses Clock-Pro as its page replacement policy and does
> complex things for general workloads. I will check the tuning parameters,
> but I was hoping for some advice on avoiding disk reads entirely,
> considering the system’s cache is used completely by Kafka and is huge
> (~128GB) - that is, tuning Clock-Pro to be smarter for streaming access
> patterns.
> >
> > Thanks,
> > Ovidiu
> >
> > [1] https://www.grid5000.fr/mediawiki/index.php/Rennes:Hardware#Dell_Poweredge_R630_.28paravance.29
> >
> > > On 20 Jul 2017, at 21:06, Jay Kreps <jay@confluent.io <mailto:
> jay@confluent.io>> wrote:
> > >
> > > I suspect this is on Linux right?
> > >
> > > The way Linux works is that it uses a percentage of memory to buffer new
> writes; at a certain point it decides it has too much buffered data and
> gives high priority to writing it out. The good news is that the writes are
> very linear, well laid out, and high-throughput. The problem is that it
> leads to a bit of see-saw behavior.
> > >
> > > Now, the drop in performance isn't wrong as such. When your disk is
> writing data out it is doing work, and read throughput will naturally be
> higher when you are only reading than when you are reading and writing
> simultaneously. So you can't expect the no-writing performance while you
> are also writing (unless you add I/O capacity).
> > >
> > > But still, these big see-saws in performance are not ideal. You'd
> rather have consistent performance all the time than have Linux bounce
> back and forth between writing nothing and frantically writing full bore.
> Fortunately, Linux provides a set of pagecache tuning parameters that let
> you control this a bit.
> > >
> > > I think these docs cover some of the parameters:
> > > https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Performance_Tuning_Guide/s-memory-tunables.html
> > >
> > > -Jay
> > >
> > > On Thu, Jul 20, 2017 at 10:24 AM, Ovidiu-Cristian MARCU <
> ovidiu-cristian.marcu@inria.fr <ma...@inria.fr>
> <mailto:ovidiu-cristian.marcu@inria.fr <mailto:ovidiu-cristian.marcu@
> inria.fr>>> wrote:
> > > Hi guys,
> > >
> > > I’m relatively new to Kafka’s world. I have an issue I describe below,
> maybe you can help me understand this behaviour.
> > >
> > > I’m running a benchmark using the following setup: one producer sends
> data to a topic and concurrently one consumer pulls and writes it to
> another topic.
> > > Measuring the consumer throughput, I observe values around 500K
> records/s, but only until the system’s cache gets filled - from that moment
> the consumer throughput drops to ~200K records/s (2.5 times lower).
> > > Looking at disk usage, I observe disk read I/O that corresponds to
> the moment the consumer throughput drops.
> > > After some time, I kill the producer and immediately I observe the
> consumer throughput goes up to initial values ~ 500K records/s.
> > >
> > > What can I do to avoid this throughput drop?
> > >
> > > Attached an image showing disk I/O and CPU usage. I have about 128GB
> RAM on that server which gets filled at time ~2300.
> > >
> > > Thanks,
> > > Ovidiu
> > >
> > > <consumer-throughput-drops.png>
> > >
> >
> >
>
>

Re: Consumer throughput drop

Posted by Ovidiu-Cristian MARCU <ov...@inria.fr>.
After some tuning, I got better results. What I changed, as suggested:

dirty_ratio = 10 (previously 20)
dirty_background_ratio = 3 (previously 10)
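On a typical Linux system these two settings can be changed at runtime with sysctl (illustrative commands, not from the original thread; they require root, and the /etc/sysctl.conf lines make the change survive a reboot):

```shell
# Show the current writeback thresholds
sysctl vm.dirty_ratio vm.dirty_background_ratio

# Apply the tuned values (takes effect immediately, lost on reboot)
sudo sysctl -w vm.dirty_ratio=10
sudo sysctl -w vm.dirty_background_ratio=3

# Persist the change across reboots
echo 'vm.dirty_ratio = 10' | sudo tee -a /etc/sysctl.conf
echo 'vm.dirty_background_ratio = 3' | sudo tee -a /etc/sysctl.conf
```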

It results in disk read I/O that is almost completely zero (I have enough cache; the consumer is keeping up with the producer).

- producer throughput remains constant at ~400K records/s;
- consumer throughput (a Flink app) stays in the interval [300K/s, 500K/s] even when the cache is filled (there are some variations, but they are not influenced by the system’s cache);

I don’t know whether Kafka’s documentation covers this already, but it could be added if you also reproduce my tests and consider it useful.

Thanks,
Ovidiu

> On 21 Jul 2017, at 01:57, Apurva Mehta <ap...@confluent.io> wrote:
> 
> Hi Ovidiu,
> 
> The see-saw behavior is inevitable with Linux when you have concurrent reads and writes. However, tuning the following two settings may help achieve more stable performance (from Jay's link):
> 
> dirty_ratio
> Defines a percentage value. Writeout of dirty data begins (via pdflush) when dirty data comprises this percentage of total system memory. The default value is 20.
> Red Hat recommends a slightly lower value of 15 for database workloads.
>  
> dirty_background_ratio
> Defines a percentage value. Writeout of dirty data begins in the background (via pdflush) when dirty data comprises this percentage of total memory. The default value is 10. For database workloads, Red Hat recommends a lower value of 3.
> 
> Thanks,
> Apurva 
> 
> 
> On Thu, Jul 20, 2017 at 12:25 PM, Ovidiu-Cristian MARCU <ovidiu-cristian.marcu@inria.fr <ma...@inria.fr>> wrote:
> Yes, I’m using Debian Jessie 2.6 installed on this hardware [1].
> 
> It is also my understanding that Kafka relies on the system’s cache (Linux in this case), which uses Clock-Pro as its page replacement policy and does complex things for general workloads. I will check the tuning parameters, but I was hoping for some advice on avoiding disk reads entirely, considering the system’s cache is used completely by Kafka and is huge (~128GB) - that is, tuning Clock-Pro to be smarter for streaming access patterns.
> 
> Thanks,
> Ovidiu
> 
> [1] https://www.grid5000.fr/mediawiki/index.php/Rennes:Hardware#Dell_Poweredge_R630_.28paravance.29
> 
> > On 20 Jul 2017, at 21:06, Jay Kreps <jay@confluent.io <ma...@confluent.io>> wrote:
> >
> > I suspect this is on Linux right?
> >
> > The way Linux works is that it uses a percentage of memory to buffer new writes; at a certain point it decides it has too much buffered data and gives high priority to writing it out. The good news is that the writes are very linear, well laid out, and high-throughput. The problem is that it leads to a bit of see-saw behavior.
> >
> > Now, the drop in performance isn't wrong as such. When your disk is writing data out it is doing work, and read throughput will naturally be higher when you are only reading than when you are reading and writing simultaneously. So you can't expect the no-writing performance while you are also writing (unless you add I/O capacity).
> >
> > But still, these big see-saws in performance are not ideal. You'd rather have consistent performance all the time than have Linux bounce back and forth between writing nothing and frantically writing full bore. Fortunately, Linux provides a set of pagecache tuning parameters that let you control this a bit.
> >
> > I think these docs cover some of the parameters:
> > https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Performance_Tuning_Guide/s-memory-tunables.html
> >
> > -Jay
> >
> > On Thu, Jul 20, 2017 at 10:24 AM, Ovidiu-Cristian MARCU <ovidiu-cristian.marcu@inria.fr <ma...@inria.fr> <mailto:ovidiu-cristian.marcu@inria.fr <ma...@inria.fr>>> wrote:
> > Hi guys,
> >
> > I’m relatively new to Kafka’s world. I have an issue I describe below, maybe you can help me understand this behaviour.
> >
> > I’m running a benchmark using the following setup: one producer sends data to a topic and concurrently one consumer pulls and writes it to another topic.
> > Measuring the consumer throughput, I observe values around 500K records/s, but only until the system’s cache gets filled - from that moment the consumer throughput drops to ~200K records/s (2.5 times lower).
> > Looking at disk usage, I observe disk read I/O that corresponds to the moment the consumer throughput drops.
> > After some time, I kill the producer and immediately I observe the consumer throughput goes up to initial values ~ 500K records/s.
> >
> > What can I do to avoid this throughput drop?
> >
> > Attached an image showing disk I/O and CPU usage. I have about 128GB RAM on that server which gets filled at time ~2300.
> >
> > Thanks,
> > Ovidiu
> >
> > <consumer-throughput-drops.png>
> >
> 
> 


Re: Consumer throughput drop

Posted by Apurva Mehta <ap...@confluent.io>.
Hi Ovidiu,

The see-saw behavior is inevitable with Linux when you have concurrent
reads and writes. However, tuning the following two settings may help
achieve more stable performance (from Jay's link):


> *dirty_ratio*
> Defines a percentage value. Writeout of dirty data begins (via *pdflush*)
> when dirty data comprises this percentage of total system memory. The
> default value is 20. Red Hat recommends a slightly lower value of 15 for
> database workloads.
>
> *dirty_background_ratio*
> Defines a percentage value. Writeout of dirty data begins in the background
> (via *pdflush*) when dirty data comprises this percentage of total memory.
> The default value is 10. For database workloads, Red Hat recommends a
> lower value of 3.

Thanks,
Apurva


On Thu, Jul 20, 2017 at 12:25 PM, Ovidiu-Cristian MARCU <
ovidiu-cristian.marcu@inria.fr> wrote:

> Yes, I’m using Debian Jessie 2.6 installed on this hardware [1].
>
> It is also my understanding that Kafka relies on the system’s cache (Linux
> in this case), which uses Clock-Pro as its page replacement policy and does
> complex things for general workloads. I will check the tuning parameters,
> but I was hoping for some advice on avoiding disk reads entirely,
> considering the system’s cache is used completely by Kafka and is huge
> (~128GB) - that is, tuning Clock-Pro to be smarter for streaming access
> patterns.
>
> Thanks,
> Ovidiu
>
> [1] https://www.grid5000.fr/mediawiki/index.php/Rennes:Hardware#Dell_Poweredge_R630_.28paravance.29
>
> > On 20 Jul 2017, at 21:06, Jay Kreps <ja...@confluent.io> wrote:
> >
> > I suspect this is on Linux right?
> >
> > The way Linux works is that it uses a percentage of memory to buffer new
> writes; at a certain point it decides it has too much buffered data and
> gives high priority to writing it out. The good news is that the writes are
> very linear, well laid out, and high-throughput. The problem is that it
> leads to a bit of see-saw behavior.
> >
> > Now, the drop in performance isn't wrong as such. When your disk is
> writing data out it is doing work, and read throughput will naturally be
> higher when you are only reading than when you are reading and writing
> simultaneously. So you can't expect the no-writing performance while you
> are also writing (unless you add I/O capacity).
> >
> > But still, these big see-saws in performance are not ideal. You'd rather
> have consistent performance all the time than have Linux bounce back and
> forth between writing nothing and frantically writing full bore.
> Fortunately, Linux provides a set of pagecache tuning parameters that let
> you control this a bit.
> >
> > I think these docs cover some of the parameters:
> > https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Performance_Tuning_Guide/s-memory-tunables.html
> >
> > -Jay
> >
> > On Thu, Jul 20, 2017 at 10:24 AM, Ovidiu-Cristian MARCU <
> ovidiu-cristian.marcu@inria.fr <ma...@inria.fr>>
> wrote:
> > Hi guys,
> >
> > I’m relatively new to Kafka’s world. I have an issue I describe below,
> maybe you can help me understand this behaviour.
> >
> > I’m running a benchmark using the following setup: one producer sends
> data to a topic and concurrently one consumer pulls and writes it to
> another topic.
> > Measuring the consumer throughput, I observe values around 500K
> records/s, but only until the system’s cache gets filled - from that
> moment the consumer throughput drops to ~200K records/s (2.5 times lower).
> > Looking at disk usage, I observe disk read I/O that corresponds to the
> moment the consumer throughput drops.
> > After some time, I kill the producer and immediately I observe the
> consumer throughput goes up to initial values ~ 500K records/s.
> >
> > What can I do to avoid this throughput drop?
> >
> > Attached an image showing disk I/O and CPU usage. I have about 128GB RAM
> on that server which gets filled at time ~2300.
> >
> > Thanks,
> > Ovidiu
> >
> > <consumer-throughput-drops.png>
> >
>
>

Re: Consumer throughput drop

Posted by Apurva Mehta <ap...@confluent.io>.
Hi Ovidu,

The see-saw behavior is inevitable with linux when you have concurrent
reads and writes. However, tuning the following two settings may help
achieve more stable performance (from Jay's link):


> *dirty_ratio*Defines a percentage value. Writeout of dirty data begins
> (via *pdflush*) when dirty data comprises this percentage of total system
> memory. The default value is 20.
> Red Hat recommends a slightly lower value of 15 for database workloads.
>


>
> *dirty_background_ratio*Defines a percentage value. Writeout of dirty
> data begins in the background (via *pdflush*) when dirty data comprises
> this percentage of total memory. The default value is 10. For database
> workloads, Red Hat recommends a lower value of 3.


Thanks,
Apurva


On Thu, Jul 20, 2017 at 12:25 PM, Ovidiu-Cristian MARCU <
ovidiu-cristian.marcu@inria.fr> wrote:

> Yes, I’m using Debian Jessie 2.6 installed on this hardware [1].
>
> It is also my understanding that Kafka is based on system’s cache (Linux
> in this case) which is based on Clock-Pro for page replacement policy,
> doing complex things for general workloads. I will check the tuning
> parameters, but I was hoping for some advices to avoid disk at all when
> reading, considering the system's cache is used completely by Kafka and is
> huge ~128GB - that is to tune Clock-Pro to be smarter when used for
> streaming access patterns.
>
> Thanks,
> Ovidiu
>
> [1] https://www.grid5000.fr/mediawiki/index.php/Rennes:
> Hardware#Dell_Poweredge_R630_.28paravance.29 <https://www.grid5000.fr/
> mediawiki/index.php/Rennes:Hardware#Dell_Poweredge_R630_.28paravance.29>
>
> > On 20 Jul 2017, at 21:06, Jay Kreps <ja...@confluent.io> wrote:
> >
> > I suspect this is on Linux right?
> >
> > The way Linux works is it uses a percent of memory to buffer new writes,
> at a certain point it thinks it has too much buffered data and it gives
> high priority to writing that out. The good news about this is that the
> writes are very linear, well layed out, and high-throughput. The problem is
> that it leads to a bit of see-saw behavior.
> >
> > Now obviously the drop in performance isn't wrong. When your disk is
> writing data out it is doing work and obviously the read throughput will be
> higher when you are just reading and not writing then when you are doing
> both reading and writing simultaneously. So obviously you can't get the
> no-writing performance when you are also writing (unless you add I/O
> capacity).
> >
> > But still these big see-saws in performance are not ideal. You'd rather
> have more constant performance all the time rather than have linux bounce
> back and forth from writing nothing and then frantically writing full bore.
> Fortunately linux provides a set of pagecache tuning parameters that let
> you control this a bit.
> >
> > I think these docs cover some of the parameters:
> > https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Performance_Tuning_Guide/s-memory-tunables.html
> >
> > -Jay
> >
> > On Thu, Jul 20, 2017 at 10:24 AM, Ovidiu-Cristian MARCU <
> > ovidiu-cristian.marcu@inria.fr> wrote:
> > Hi guys,
> >
> > I’m relatively new to Kafka’s world. I have an issue I describe below,
> maybe you can help me understand this behaviour.
> >
> > I’m running a benchmark using the following setup: one producer sends
> data to a topic and concurrently one consumer pulls and writes it to
> another topic.
> > Measuring the consumer throughput, I observe values around 500K
> records/s only until the system’s cache gets filled - from this moment the
> consumer throughput drops to ~200K (2.5 times lower).
> > Looking at disk usage, I observe disk read I/O which corresponds to the
> moment the consumer throughput drops.
> > After some time, I kill the producer and immediately I observe the
> consumer throughput goes up to initial values ~ 500K records/s.
> >
> > What can I do to avoid this throughput drop?
> >
> > Attached an image showing disk I/O and CPU usage. I have about 128GB RAM
> on that server which gets filled at time ~2300.
> >
> > Thanks,
> > Ovidiu
> >
> > <consumer-throughput-drops.png>
> >
>
>

Re: Consumer throughput drop

Posted by Ovidiu-Cristian MARCU <ov...@inria.fr>.
Yes, I’m using Debian Jessie 2.6 installed on this hardware [1].

It is also my understanding that Kafka relies on the system's cache (Linux in this case), which uses Clock-Pro for its page replacement policy and does complex things for general workloads. I will check the tuning parameters, but I was hoping for some advice on avoiding disk reads entirely, considering the system's cache is used completely by Kafka and is huge (~128GB) - that is, tuning Clock-Pro to be smarter when used for streaming access patterns.
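In case it helps anyone reproducing this, a quick way to watch the cache fill and writeback kick in while the benchmark runs is to poll the kernel's memory counters (just an observation sketch, nothing Kafka-specific):

```shell
# Sample the page-cache counters a few times while the benchmark runs.
# A growing "Dirty" value followed by sustained "Writeback" activity
# should line up with the moment the consumer throughput drops.
for i in 1 2 3; do
  grep -E '^(MemFree|Cached|Dirty|Writeback):' /proc/meminfo
  echo '---'
  sleep 1
done
```

In practice you would run this for the whole benchmark, e.g. with `watch -n1` around the grep.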

Thanks,
Ovidiu

[1] https://www.grid5000.fr/mediawiki/index.php/Rennes:Hardware#Dell_Poweredge_R630_.28paravance.29

> On 20 Jul 2017, at 21:06, Jay Kreps <ja...@confluent.io> wrote:
> 
> I suspect this is on Linux right?
> 
> The way Linux works is it uses a percent of memory to buffer new writes, at a certain point it thinks it has too much buffered data and it gives high priority to writing that out. The good news about this is that the writes are very linear, well laid out, and high-throughput. The problem is that it leads to a bit of see-saw behavior.
> 
> Now obviously the drop in performance isn't wrong. When your disk is writing data out it is doing work and obviously the read throughput will be higher when you are just reading and not writing than when you are doing both reading and writing simultaneously. So obviously you can't get the no-writing performance when you are also writing (unless you add I/O capacity).
> 
> But still these big see-saws in performance are not ideal. You'd rather have more constant performance all the time rather than have linux bounce back and forth from writing nothing and then frantically writing full bore. Fortunately linux provides a set of pagecache tuning parameters that let you control this a bit. 
> 
> I think these docs cover some of the parameters:
> https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Performance_Tuning_Guide/s-memory-tunables.html
> 
> -Jay
> 
> On Thu, Jul 20, 2017 at 10:24 AM, Ovidiu-Cristian MARCU <ovidiu-cristian.marcu@inria.fr> wrote:
> Hi guys,
> 
> I’m relatively new to Kafka’s world. I have an issue I describe below, maybe you can help me understand this behaviour.
> 
> I’m running a benchmark using the following setup: one producer sends data to a topic and concurrently one consumer pulls and writes it to another topic.
> Measuring the consumer throughput, I observe values around 500K records/s only until the system's cache gets filled - from this moment the consumer throughput drops to ~200K (2.5 times lower).
> Looking at disk usage, I observe disk read I/O which corresponds to the moment the consumer throughput drops.
> After some time, I kill the producer and immediately I observe the consumer throughput goes up to initial values ~ 500K records/s.
> 
> What can I do to avoid this throughput drop?
> 
> Attached an image showing disk I/O and CPU usage. I have about 128GB RAM on that server which gets filled at time ~2300.
> 
> Thanks,
> Ovidiu
> 
> <consumer-throughput-drops.png>
> 


Re: Consumer throughput drop

Posted by Jay Kreps <ja...@confluent.io>.
I suspect this is on Linux right?

The way Linux works is it uses a percent of memory to buffer new writes, at
a certain point it thinks it has too much buffered data and it gives high
priority to writing that out. The good news about this is that the writes
are very linear, well laid out, and high-throughput. The problem is that
it leads to a bit of see-saw behavior.

Now obviously the drop in performance isn't wrong. When your disk is
writing data out it is doing work and obviously the read throughput will be
higher when you are just reading and not writing than when you are doing
both reading and writing simultaneously. So obviously you can't get the
no-writing performance when you are also writing (unless you add I/O
capacity).

But still these big see-saws in performance are not ideal. You'd rather
have more constant performance all the time rather than have linux bounce
back and forth from writing nothing and then frantically writing full bore.
Fortunately linux provides a set of pagecache tuning parameters that let
you control this a bit.

I think these docs cover some of the parameters:
https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Performance_Tuning_Guide/s-memory-tunables.html
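For illustration, here is how to inspect the two main writeback knobs from that page and the kind of change they allow (the 3/10 values below are only an assumed starting point, not a recommendation; the right numbers depend on your workload):

```shell
# Show the current writeback thresholds (percent of memory that may be
# dirty before background / blocking writeback starts).
cat /proc/sys/vm/dirty_background_ratio /proc/sys/vm/dirty_ratio

# Example (needs root): start background writeback earlier and cap dirty
# data lower, so flushes are smaller and steadier instead of big bursts.
# These numbers are only an assumed starting point, not a recommendation:
#
#   sysctl -w vm.dirty_background_ratio=3
#   sysctl -w vm.dirty_ratio=10
#
# To persist across reboots, add the same keys to /etc/sysctl.conf.
```

Lower ratios trade large bursty flushes for more frequent, smaller ones, which smooths out the see-saw described above.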

-Jay

On Thu, Jul 20, 2017 at 10:24 AM, Ovidiu-Cristian MARCU <
ovidiu-cristian.marcu@inria.fr> wrote:

> Hi guys,
>
> I’m relatively new to Kafka’s world. I have an issue I describe below,
> maybe you can help me understand this behaviour.
>
> I’m running a benchmark using the following setup: one producer sends data
> to a topic and concurrently one consumer pulls and writes it to another
> topic.
> Measuring the consumer throughput, I observe values around 500K records/s
> only until the system’s cache gets filled - from this moment the consumer
> throughput drops to ~200K (2.5 times lower).
> Looking at disk usage, I observe disk read I/O which corresponds to the
> moment the consumer throughput drops.
> After some time, I kill the producer and immediately I observe the
> consumer throughput goes up to initial values ~ 500K records/s.
>
> What can I do to avoid this throughput drop?
>
> Attached an image showing disk I/O and CPU usage. I have about 128GB RAM
> on that server which gets filled at time ~2300.
>
> Thanks,
> Ovidiu
>
>
