Posted to users@kafka.apache.org by Pablo Barrera González <pa...@gmail.com> on 2013/01/21 23:04:57 UTC

Cross-site Kafka installation

Hello

In my enterprise we are deploying a cross-site installation of Kafka. One
of the Kafka clusters is located in the USA and one consumer is in Europe. Does
anybody have experience in such an environment? Any comments on the
configuration and best practices?

Thanks in advance

Pablo

Re: Cross-site Kafka installation

Posted by Pablo Barrera González <pa...@gmail.com>.
Hi Joel

Thanks for the hints. Apparently it was a configuration error at the
operating system level.

We are using Debian Linux. Kafka calls setsockopt with SO_SNDBUF to set the
send buffer size (socket.send.buffer). The operating system then sets the
real buffer size to min(socket.send.buffer, net.core.wmem_max). On our hosts
net.core.wmem_max was set to a really small value (131071), which was the
Debian default. However, if you leave the buffer size to the operating
system (that is, you don't call setsockopt for SO_SNDBUF), the operating
system manages the buffer size using the values in net.ipv4.tcp_wmem. The
defaults on our hosts were 4096, 16384 and 4194304 (minimum, initial and
maximum size in bytes), meaning that a buffer can grow automatically up to
4 MB (if there is enough memory available).
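
A minimal sketch of how to observe this clamping from user space, using Python's standard socket module rather than Kafka itself. The function name and the 1 MB request are illustrative assumptions. Note that on Linux getsockopt reports roughly twice the granted value, because the kernel reserves room for its own bookkeeping.

```python
import socket


def effective_send_buffer(requested_bytes):
    """Return (default, effective) SO_SNDBUF sizes for a fresh TCP socket.

    'default' is what the OS assigned before any setsockopt call;
    'effective' is what the OS reports after requesting 'requested_bytes'.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        default = sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
        # The request is only a hint: the kernel may cap it
        # (on Linux, at net.core.wmem_max).
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, requested_bytes)
        effective = sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
    return default, effective


if __name__ == "__main__":
    requested = 1024 * 1024  # assumed example value: ask for 1 MB
    default, effective = effective_send_buffer(requested)
    print("default:", default, "requested:", requested, "effective:", effective)
```

If the effective value is far below the requested one, the OS limit is probably the bottleneck and not the Kafka setting.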

Having such a small window was the main reason for the low performance.
With the new configuration, performance increased by an order of
magnitude.
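
To see why the window size matters so much on a long link: a single TCP connection cannot send more than one window of data per round trip. The sketch below works that out for the two buffer sizes mentioned above. The 150 ms round-trip time and the 100 Mbit/s link are assumptions for a USA to Europe path, not measured values from this thread.

```python
def max_throughput_bytes_per_s(window_bytes, rtt_s):
    """Upper bound on single-connection throughput: one window per round trip."""
    return window_bytes / rtt_s


def window_needed_bytes(bandwidth_bits_per_s, rtt_s):
    """Bandwidth-delay product: bytes that must be in flight to fill the link."""
    return bandwidth_bits_per_s / 8 * rtt_s


if __name__ == "__main__":
    rtt_s = 0.150  # assumed transatlantic round-trip time
    for label, window in (("wmem_max cap", 131071), ("tcp_wmem max", 4194304)):
        rate = max_throughput_bytes_per_s(window, rtt_s)
        print(f"{label}: {window} bytes -> at most {rate / 1e6:.2f} MB/s")
    # Assumed 100 Mbit/s link, to show the buffer needed to saturate it.
    print("needed for 100 Mbit/s:", int(window_needed_bytes(100e6, rtt_s)), "bytes")
```

Under these assumptions the 131071-byte cap limits a connection to under 1 MB/s, while a 4 MB window allows about 28 MB/s, which is in line with the order-of-magnitude improvement reported.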

The interesting thing is that, at least on Linux, if you don't call
setsockopt the default buffer works really well. I don't notice any
difference between calling setsockopt or not (after fixing the
configuration of the machine). So why call setsockopt at all?

Anyway, I see value in a configuration option that turns the call to
setsockopt on or off. I will send a patch.
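
As an illustration of the idea only (this is not the patch, and Kafka itself is not written in Python), such a switch could treat a missing or non-positive size as "leave the buffer to the OS". The function name and the sentinel convention are hypothetical.

```python
import socket


def apply_send_buffer(sock, size_bytes):
    """Set SO_SNDBUF only when a positive size is configured.

    Hypothetical convention: None or a value <= 0 means "do not call
    setsockopt", so the OS keeps managing the buffer itself (on Linux,
    autotuning within net.ipv4.tcp_wmem). Returns the size the OS reports.
    """
    if size_bytes is not None and size_bytes > 0:
        # An explicit size pins the buffer; on Linux this turns off
        # send-buffer autotuning for this socket.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, size_bytes)
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)


if __name__ == "__main__":
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        print("left to the OS:", apply_send_buffer(s, -1))
        print("explicit 256 KB request:", apply_send_buffer(s, 256 * 1024))
```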

Regards,

Pablo

PS: We are using 0.7.1 and Linux 2.6.32.




2013/1/22 Joel Koshy <jj...@gmail.com>
>
> We do mirroring across data-centers (but in the same continent). You should
> basically set a high fetch size and socket buffer size in such scenarios.
>
> In general, you should set a high value for the socket buffer size on the
> consumer configuration (socket.buffersize) and the source cluster's broker
> configuration (socket.send.buffer).
>
> Assuming you are using the high-level consumer, the fetch size (fetch.size)
> should be higher than the consumer's socket buffer size. Note that the
> socket buffer size configurations are a hint to the underlying platform's
> networking code. If you enable trace logging, you can check the actual
> receive buffer size and determine whether the setting in the OS networking
> layer also needs to be adjusted. Likewise, you will need to use higher
> connection/session timeouts for zookeeper and set your offset commit
> intervals to be fairly large.
>
> Thanks,
>
> Joel

Re: Cross-site Kafka installation

Posted by Joel Koshy <jj...@gmail.com>.
We do mirroring across data-centers (but in the same continent). You should
basically set a high fetch size and socket buffer size in such scenarios.

In general, you should set a high value for the socket buffer size on the
consumer configuration (socket.buffersize) and the source cluster's broker
configuration (socket.send.buffer).

Assuming you are using the high-level consumer, the fetch size (fetch.size)
should be higher than the consumer's socket buffer size. Note that the
socket buffer size configurations are a hint to the underlying platform's
networking code. If you enable trace logging, you can check the actual
receive buffer size and determine whether the setting in the OS networking
layer also needs to be adjusted. Likewise, you will need to use higher
connection/session timeouts for zookeeper and set your offset commit
intervals to be fairly large.
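
The relationships above can be written down as a small sanity check. This is only a sketch: the property names socket.buffersize, fetch.size and socket.send.buffer are the ones mentioned in this message, while the function name, the 1 MB threshold for "high" and the example values are assumptions to adapt to your own link.

```python
MB = 1024 * 1024


def check_cross_site_settings(consumer, broker, min_buffer=1 * MB):
    """Return a list of warnings for a cross-site consumer/broker config pair.

    'min_buffer' is an assumed threshold for what counts as a "high"
    socket buffer; it is not a Kafka default.
    """
    warnings = []
    if consumer["socket.buffersize"] < min_buffer:
        warnings.append("consumer socket.buffersize looks small for a WAN link")
    if broker["socket.send.buffer"] < min_buffer:
        warnings.append("broker socket.send.buffer looks small for a WAN link")
    # The fetch size should be higher than the consumer's socket buffer size.
    if consumer["fetch.size"] <= consumer["socket.buffersize"]:
        warnings.append("fetch.size should be higher than socket.buffersize")
    return warnings


if __name__ == "__main__":
    consumer = {"socket.buffersize": 2 * MB, "fetch.size": 4 * MB}  # example values
    broker = {"socket.send.buffer": 2 * MB}  # example value
    for line in check_cross_site_settings(consumer, broker) or ["no warnings"]:
        print(line)
```

Passing the check does not guarantee the OS grants these sizes, so the actual buffer size should still be verified as described above.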

Thanks,

Joel

