Posted to users@kafka.apache.org by Jagadish Bihani <ja...@pubmatic.com> on 2014/06/07 09:17:12 UTC
About peculiar scenario in kafka camus consumer
Hi,
I have observed a peculiar scenario in our production environment: a
mapper task for one particular topic-partition combination always fails
with the exception 'Task attempt failed to report status for 600 seconds'.
When I dug deeper, I found it gets stuck in either the fetch() or
getNext() method of KafkaReader.
Things I have tried:
-------------------------
1. Checked the network and /etc/hosts entries; they are fine.
2. The machine hosting that partition also hosts other partitions, and
there is no problem reading those, so it is not a machine-specific or
network-specific issue.
3. Increased the timeout parameters and changed the buffering parameters.
4. The records are zlib-compressed. I tried the Kafka console consumer,
but could not verify with it because the data volume was too large.
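For reference, one way to sample just a few records from the suspect partition without pulling the full data set is the console consumer's --max-messages flag (the ZooKeeper address and topic name below are placeholders; this sketch assumes a 0.8-era console consumer):

```shell
# Sample 10 records from the suspect topic, then exit.
# Replace zk1:2181 and my_topic with your actual ZooKeeper quorum and topic.
bin/kafka-console-consumer.sh \
  --zookeeper zk1:2181 \
  --topic my_topic \
  --max-messages 10
```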
Here are relevant configs:
-----------------------------------
kafka.client.name=camus1
# Fetch Request Parameters
kafka.fetch.buffer.size=1048576
#kafka.fetch.request.correlationid=
kafka.fetch.request.max.wait=100000
#kafka.fetch.request.min.bytes=
socket.receive.buffer.bytes=1048576
fetch.message.max.bytes=10485760
# Connection parameters.
kafka.brokers=<list of ips>
kafka.timeout.value=30000
Re: About peculiar scenario in kafka camus consumer
Posted by Jun Rao <ju...@gmail.com>.
Are you saying that the consumer is stuck fetching data at the same
offset again and again without returning any messages? If so, what's the
max message size on the broker? You need to make sure that the consumer
fetch size is larger than the max message size.
Thanks,
Jun
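Jun's point can be illustrated with a toy simulation (this is not Camus or Kafka code, just a hypothetical sketch of the broker's behavior of returning only the complete messages that fit in the fetch size): when a single record is larger than the fetch size, every fetch at that offset returns zero messages, so the reader never advances and the mapper appears hung until the task timeout fires.

```python
def fetch(log, offset, fetch_size):
    """Toy fetch: return the complete messages starting at `offset` that
    fit within `fetch_size` bytes. Mirrors Kafka truncating the trailing
    partial message out of a fetch response."""
    taken, used = [], 0
    for msg in log[offset:]:
        if used + len(msg) > fetch_size:
            break
        taken.append(msg)
        used += len(msg)
    return taken

# A log with one oversized record in the middle.
log = [b"x" * 100, b"y" * 5000, b"z" * 100]

# Fetch size 1024: the first record is returned fine...
assert len(fetch(log, 0, 1024)) == 1
# ...but the 5000-byte record never fits, so the consumer is stuck at
# offset 1: every fetch returns zero messages.
assert fetch(log, 1, 1024) == []
# Fetch size >= max message size: progress resumes.
assert len(fetch(log, 1, 8192)) == 2
```

This is why the fix is on the consumer side: raise the fetch size above the largest message the broker will accept, rather than tuning timeouts.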