Posted to users@kafka.apache.org by D <su...@gmail.com> on 2015/11/27 10:54:48 UTC

Fair Usage of Kafka Queues

Hi,

We have an Elasticsearch-Logstash-Kibana deployment in which multiple 
logstash-forwarders (from multiple service logs) push logs to Logstash, 
which sends them to Kafka. A second Logstash then pulls those logs 
from Kafka and indexes them in an Elasticsearch cluster. Right now we are 
using a single Kafka queue, into which the logs from all the services 
go.

Can we configure Kafka in such a way that a single logstash-forwarder 
(of a single service) does not hog the entire set-up? We want to ensure 
that all services can use the set-up fairly. Basically, I want to 
ensure the Kafka queues are not always filled up by only one type of 
message.

Thanks,
Debraj

RE: Fair Usage of Kafka Queues

Posted by Todd Snyder <ts...@blackberry.com>.
Hi Debraj,

A couple things you could try.

Given your design

https://chart.googleapis.com/chart?chl=digraph+G+%7B%0D%0A+++rankdir%3DLR%3B%0D%0A+++service1LSFWD+-%3E+LS+-%3E+Kafka+-%3E+LSELK+-%3E+ES+-%3E+Kibana%0D%0A+++service2LSFWD+-%3E+LS%0D%0A+++service3LSFWD+-%3E+LS%0D%0A+++service4LSFWD+-%3E+LS%0D%0A%7D&cht=gv

1) Try splitting services apart. I do this by tagging each input, then using a condition in the output to send the data to a different Kafka topic.

# Tag each log source so the output stage can tell them apart.
input {
        file {
                path => "/var/log/hadoop/hdfs/hadoop-hdfs-namenode-*.log"
                tags => ["namenode-log"]
                …
        }
}

input {
        file {
                path => "/var/log/hadoop/hdfs/hadoop-hdfs-namenode-*.out"
                tags => ["namenode-out"]
                …
        }
}

# Route each tag to its own Kafka topic.
output {
        if "namenode-log" in [tags] {
                kafka {
                        topic_id => "hadoop-namenode-log"
                        …
                }
        }
}

output {
        if "namenode-out" in [tags] {
                kafka {
                        topic_id => "hadoop-namenode-out"
                        …
                }
        }
}

2) Try having more partitions in Kafka, so that LS fans out more.

3) Try adding more workers to the kafka output module (https://www.elastic.co/guide/en/logstash/current/plugins-outputs-kafka.html#plugins-outputs-kafka-workers).

4) Try having the service nodes write directly to Kafka, using full Logstash instead of Logstash forwarder. That removes a bottleneck.
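
If you go with option 1, the second Logstash (the one indexing into Elasticsearch) has to read each of the new topics as well. A minimal sketch of what that could look like, assuming the ZooKeeper-based kafka input plugin (options zk_connect and topic_id); plugin versions that connect to the brokers directly use bootstrap_servers and topics instead, so check the documentation for your version. The ZooKeeper address is a placeholder, not a value from your set-up:

```
input {
        # One kafka input per per-service topic from option 1.
        kafka {
                # Placeholder ZooKeeper address; replace with your own.
                zk_connect => "zookeeper.example.com:2181"
                topic_id => "hadoop-namenode-log"
        }
        kafka {
                # Placeholder ZooKeeper address; replace with your own.
                zk_connect => "zookeeper.example.com:2181"
                topic_id => "hadoop-namenode-out"
        }
}
```

Separate topics keep one service's backlog out of another service's queue, but this alone does not enforce a fair share: both inputs still feed the same Logstash pipeline.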



Happy to try and help, as this stuff is currently on my mind. Maybe some more details about which queues you’re seeing fill up?

Cheers,

Todd.


