Posted to users@kafka.apache.org by Steve Miller <st...@idrathernotsay.com> on 2014/08/12 20:03:58 UTC

Using the kafka dissector in wireshark/tshark 1.12

   I'd seen references to there being a Kafka protocol dissector built into wireshark/tshark 1.12, but what I could find on that was a bit light on the specifics as to how to get it to do anything -- at least for someone (like me) who might use tcpdump a lot but who doesn't use tshark a lot.

   I got this working, so I figured I'd post a few pointers here on the off-chance that they save someone else a bit of time.

   Note that I'm using tshark, not wireshark; this might be easier and/or different in wireshark, but I don't feel like moving many gigabytes of data to a place where I can use wireshark. (-:

   If you're reading traffic live, you'll want to do something like this:

	tshark -V -i eth1 -o 'kafka.tcp.port:9092' -d tcp.port==9092,kafka -f 'dst port 9092' -Y (kafka options)

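   For example, if you want to see output only for ProduceRequests and ProduceResponses (request key 0), and only for the topic "mytopic", you can do the following.  (Note that a capture filter of 'dst port 9092' only picks up traffic headed to the broker; responses come back from port 9092, so use 'port 9092' if you want to see those too.)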

	tshark -V -i eth1 -o 'kafka.tcp.port:9092' -d tcp.port==9092,kafka -f 'dst port 9092' -Y 'kafka.topic_name==mytopic && kafka.request_key==0'
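
   If you end up typing variations of that filter a lot, a tiny shell helper can build it for you.  This is only a sketch: kafka_filter is a made-up name (not anything tshark ships), and it assumes the kafka.topic_name and kafka.request_key fields used above.  It quotes the topic name, which display filters accept for string comparisons.

```shell
# kafka_filter TOPIC [KEY]
# Hypothetical helper: print a tshark display filter matching one topic
# and one request key.  KEY defaults to 0, the Produce request key.
kafka_filter() {
    topic=$1
    key=${2:-0}
    printf 'kafka.topic_name=="%s" && kafka.request_key==%s\n' "$topic" "$key"
}

# Usage (mirrors the live-capture invocation above):
#   tshark -V -i eth1 -o 'kafka.tcp.port:9092' -d tcp.port==9092,kafka \
#       -f 'dst port 9092' -Y "$(kafka_filter mytopic 0)"
```

   Passing 1 as the second argument would select Fetch traffic instead, since that is the Fetch request key in the Kafka protocol.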

   You can get a complete list of Kafka-related fields by doing:

	tshark -G fields | grep -i kafka
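
   The grep above also matches the protocol line and anything that merely mentions Kafka.  If you want just the filterable field names and their types, something like the following might be handy.  It is a sketch that assumes the tab-separated layout the tshark man page describes for -G fields (column 1 is "F" for field rows, column 3 the field abbreviation, column 4 the type); kafka_fields is a made-up name.

```shell
# kafka_fields: read `tshark -G fields` output on stdin and print
# "abbreviation<TAB>type" for every field whose name starts with "kafka.".
# Assumes field rows begin with "F" and are tab-separated.
kafka_fields() {
    awk -F'\t' '$1 == "F" && $3 ~ /^kafka\./ { print $3 "\t" $4 }'
}

# Usage:
#   tshark -G fields | kafka_fields
```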

   There is a very significant downside to processing packets live: tshark uses dumpcap to capture the actual packets, and unless I'm missing some obscure tshark option (which is possible!) it won't toss old data.  So if you run this for a few hours, you'll end up with a ginormous file.

   By default (under Linux, at least) tshark is going to put that file in /tmp, so if your /tmp is small and/or a tmpfs that can make things a little exciting.  You can get around that by doing:

	(export TMPDIR=/big/damn/filesystem ; tshark bla bla bla)

which I figure given typical Kafka data volumes is probably pretty important to know, and which doesn't seem to be documented in the tshark man pages.  It is at least not all that hard to search for.
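
   Before pointing TMPDIR somewhere, it may be worth checking that the target actually has room.  Here is a rough sketch using POSIX df; free_kb and check_capture_dir are made-up helper names, and the 10 GB threshold in the usage comment is an arbitrary example, not a recommendation.

```shell
# free_kb DIR: print the free space, in kilobytes, on the filesystem
# that holds DIR (column 4 of POSIX-format df output).
free_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# check_capture_dir DIR MIN_KB: succeed only if DIR has at least
# MIN_KB kilobytes free.
check_capture_dir() {
    dir=$1
    min_kb=$2
    avail=$(free_kb "$dir")
    [ -n "$avail" ] && [ "$avail" -ge "$min_kb" ]
}

# Usage (10485760 KB is 10 GB):
#   check_capture_dir /big/damn/filesystem 10485760 &&
#       (export TMPDIR=/big/damn/filesystem ; tshark bla bla bla)
```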

   In theory, you can use the tshark "-b" option to specify a ring buffer of files, even for real-time processing, though:

	* adding -b anything (e.g., "-b files:1 -b filesize:1024") seems to want to force you to use -w (filename)

	* just adding -b and -w to the invocation above gets a warning about display filters not being supported when capturing and saving packets

	* changing -Y to -2 -R and/or adding -P doesn't seem to help

(though again someone with more tshark experience might know the magic combination of arguments to get this to do what it's told).
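
   If all you want is to keep disk usage bounded, the capturing tool can do the rotation itself.  tcpdump, for instance, has -C (maximum file size, in millions of bytes) and -W (number of files to cycle through), which together give you a rotating set of files.  The wrapper below is a sketch, not something I have run against production volumes: capture_ring and its defaults (100 MB files, 10 of them) are made up for illustration, and DRY_RUN=1 just prints the command rather than running it.

```shell
# capture_ring IFACE PORT FILE [SIZE_MB] [COUNT]
# Hypothetical wrapper: capture traffic on PORT into a rotating set of
# COUNT files of about SIZE_MB million bytes each (tcpdump appends a
# number to FILE).  With DRY_RUN=1 the command is printed, not executed.
capture_ring() {
    iface=$1
    port=$2
    file=$3
    size_mb=${4:-100}
    count=${5:-10}
    set -- tcpdump -n -s 0 -i "$iface" -C "$size_mb" -W "$count" -w "$file" "port $port"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage:
#   capture_ring eth1 9092 /var/tmp/kafka.tcpd
```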

   So instead, you can capture packets somewhere, e.g.:

	tcpdump -n -s 0 -w /var/tmp/kafka.tcpd -i eth1 'port 9092'

and then decode them later:

	tshark -V -r /var/tmp/kafka.tcpd -o 'kafka.tcp.port:9092' -d tcp.port==9092,kafka -R 'kafka.topic_name==mytopic && kafka.request_key==0' -2
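
   The -V output is pretty verbose, so for a quick summary you can have tshark print only selected fields (-T fields -e ...) and post-process them.  The sketch below counts how many matching packets mention each topic.  count_topics is a made-up name; the usage line assumes -Y works for your tshark when reading a file (you may need -R ... -2 instead, as above), and uses -E occurrence=f to keep only the first topic name per packet.

```shell
# count_topics: read one topic name per line on stdin and print
# "count<TAB>topic", busiest topic first.  Blank lines are ignored.
count_topics() {
    awk 'NF { n[$0]++ } END { for (t in n) print n[t] "\t" t }' | sort -rn
}

# Usage:
#   tshark -r /var/tmp/kafka.tcpd -o 'kafka.tcp.port:9092' -d tcp.port==9092,kafka \
#       -Y 'kafka.request_key==0' -T fields -E occurrence=f -e kafka.topic_name |
#       count_topics
```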

   Anyway, if you're seeing protocol-related weirdness, hopefully this will be at least of some help to you.

	-Steve
	(Yes, the email address is a joke.  Just not on you!  It does work.)

Re: Using the kafka dissector in wireshark/tshark 1.12

Posted by Neha Narkhede <ne...@gmail.com>.
Thanks for sharing this, Steve!

