Posted to users@kafka.apache.org by D <su...@gmail.com> on 2015/11/27 10:54:48 UTC
Fair Usage of Kafka Queues
Hi,
We have an ElasticSearch-Logstash-Kibana deployment in which multiple
logstash-forwarders (from multiple service logs) push logs to Logstash,
which sends them to Kafka. A second Logstash then pulls those logs from
Kafka and indexes them in an ElasticSearch cluster. Right now we use a
single Kafka queue that receives the logs from all the services.
Can we configure Kafka in such a way that a single logstash-forwarder
(of a single service) does not hog the entire set-up? We want to ensure
that all services can use the set-up fairly. So basically I want to
ensure the Kafka queues are not always filled up by only one type of
message.
Thanks,
Debraj
RE: Fair Usage of Kafka Queues
Posted by Todd Snyder <ts...@blackberry.com>.
Hi Debraj,
A couple things you could try.
Given your design
https://chart.googleapis.com/chart?chl=digraph+G+%7B%0D%0A+++rankdir%3DLR%3B%0D%0A+++service1LSFWD+-%3E+LS+-%3E+Kafka+-%3E+LSELK+-%3E+ES+-%3E+Kibana%0D%0A+++service2LSFWD+-%3E+LS%0D%0A+++service3LSFWD+-%3E+LS%0D%0A+++service4LSFWD+-%3E+LS%0D%0A%7D&cht=gv
1) Try splitting services apart. I do this by tagging each input, then using a conditional in the output to send the data to a different Kafka topic:
# Tag each source so the outputs can tell them apart
input {
  file {
    path => "/var/log/hadoop/hdfs/hadoop-hdfs-namenode-*.log"
    tags => ["namenode-log"]
    …
  }
}
input {
  file {
    path => "/var/log/hadoop/hdfs/hadoop-hdfs-namenode-*.out"
    tags => ["namenode-out"]
    …
  }
}
# Route each tag to its own Kafka topic
output {
  if "namenode-log" in [tags] {
    kafka {
      topic_id => "hadoop-namenode-log"
      …
    }
  }
}
output {
  if "namenode-out" in [tags] {
    kafka {
      topic_id => "hadoop-namenode-out"
      …
    }
  }
}
2) Try having more partitions in Kafka, so that LS fans out more.
3) Try adding more workers to the kafka output module (https://www.elastic.co/guide/en/logstash/current/plugins-outputs-kafka.html#plugins-outputs-kafka-workers).
4) Try having the service nodes write directly to Kafka using full Logstash instead of Logstash forwarder. That removes a bottleneck.
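If you split topics as in 1), the second Logstash (the one indexing into ElasticSearch) has to read each of those topics. A minimal sketch of that side, assuming the older ZooKeeper-based kafka input plugin: option names such as zk_connect and topic_id differ in newer plugin versions (which use bootstrap_servers and topics), and the addresses, group ids, and thread counts below are placeholders, not values from this thread.

```
# Indexer-side sketch; every value marked "assumed" is hypothetical
input {
  kafka {
    zk_connect => "localhost:2181"            # assumed ZooKeeper address
    topic_id => "hadoop-namenode-log"
    group_id => "logstash-namenode-log"       # assumed consumer group name
    consumer_threads => 1                     # assumed; tune per topic
    tags => ["namenode-log"]
  }
  kafka {
    zk_connect => "localhost:2181"            # assumed ZooKeeper address
    topic_id => "hadoop-namenode-out"
    group_id => "logstash-namenode-out"       # assumed consumer group name
    consumer_threads => 1                     # assumed; tune per topic
    tags => ["namenode-out"]
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]               # assumed; older plugin versions use "host"
  }
}
```

Giving each topic its own input and group_id makes the backlog visible per service. All inputs still share one pipeline and one ElasticSearch output, though, so this alone probably does not guarantee fairness.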
Happy to try and help, as this stuff is currently on my mind. Maybe some more details about which queues you’re seeing fill up?
Cheers,
Todd.