Posted to user@flink.apache.org by static-max <fl...@googlemail.com> on 2016/10/11 08:51:47 UTC

"Slow ReadProcessor" warnings when using BucketSink

Hi,

I have a low-throughput job (approx. 1,000 messages per minute) that
consumes from Kafka and writes directly to HDFS. After an hour or so, I get
the following warnings in the Task Manager log:

2016-10-10 01:59:44,635 WARN  org.apache.hadoop.hdfs.DFSClient
                 - Slow ReadProcessor read fields took 30001ms
(threshold=30000ms); ack: seqno: 66 reply: SUCCESS reply: SUCCESS reply:
SUCCESS downstreamAckTimeNanos: 1599276 flag: 0 flag: 0 flag: 0, targets:
[DatanodeInfoWithStorage[Node1, Node2, Node3]]
2016-10-10 02:04:44,635 WARN  org.apache.hadoop.hdfs.DFSClient
                 - Slow ReadProcessor read fields took 30002ms
(threshold=30000ms); ack: seqno: 13 reply: SUCCESS reply: SUCCESS reply:
SUCCESS downstreamAckTimeNanos: 2394027 flag: 0 flag: 0 flag: 0, targets:
[DatanodeInfoWithStorage[Node1, Node2, Node3]]
2016-10-10 02:05:14,635 WARN  org.apache.hadoop.hdfs.DFSClient
                 - Slow ReadProcessor read fields took 30001ms
(threshold=30000ms); ack: seqno: 17 reply: SUCCESS reply: SUCCESS reply:
SUCCESS downstreamAckTimeNanos: 2547467 flag: 0 flag: 0 flag: 0, targets:
[DatanodeInfoWithStorage[Node1, Node2, Node3]]

I have not found any errors or warnings at the datanodes or the namenode.
Every other application using HDFS performs fine. I have very little load,
and network latency is fine as well. I also checked GC and disk I/O.

The files written are very small (only a few MB), so writing the blocks
should be fast.

The threshold is exceeded by only 1 or 2 ms, which makes me wonder.

Does anyone have an idea where to look next or how to fix these warnings?
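Since the warnings only exceed the threshold by a millisecond or two, it can help to track the reported latencies over time rather than eyeballing individual log lines. A small sketch of how one might extract them from a TaskManager log (the sample lines are the ones quoted above; adjust the pattern if your log layout differs):

```python
import re

# Matches the latency and threshold reported in the DFSClient warning, e.g.
# "Slow ReadProcessor read fields took 30001ms (threshold=30000ms)".
PATTERN = re.compile(
    r"Slow ReadProcessor read fields took (\d+)ms \(threshold=(\d+)ms\)"
)

def slow_read_latencies(lines):
    """Extract (latency_ms, threshold_ms) pairs from TaskManager log lines."""
    pairs = []
    for line in lines:
        m = PATTERN.search(line)
        if m:
            pairs.append((int(m.group(1)), int(m.group(2))))
    return pairs

# Example with two of the warnings from this thread:
sample = [
    "2016-10-10 01:59:44,635 WARN  org.apache.hadoop.hdfs.DFSClient - Slow "
    "ReadProcessor read fields took 30001ms (threshold=30000ms); ack: seqno: 66",
    "2016-10-10 02:04:44,635 WARN  org.apache.hadoop.hdfs.DFSClient - Slow "
    "ReadProcessor read fields took 30002ms (threshold=30000ms); ack: seqno: 13",
]
for latency, threshold in slow_read_latencies(sample):
    print(f"exceeded threshold by {latency - threshold} ms")
```

If the excess stays at 1-2 ms, the pipeline acks are almost certainly just hovering at the threshold rather than genuinely stalling.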

Re: "Slow ReadProcessor" warnings when using BucketSink

Posted by Robert Metzger <rm...@apache.org>.
Hi Max,

Maybe you need to ask this question on the Hadoop user mailing list (or
your Hadoop vendor's support, if you are using a Hadoop distribution).
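For reference, the 30,000 ms figure in the warning is a client-side HDFS setting, not a Flink one. If the warnings turn out to be benign, the threshold can be raised in the client's hdfs-site.xml. This is a sketch; the property name below is the one I believe recent Hadoop releases use, so please verify it against your distribution's documentation before relying on it:

```xml
<!-- hdfs-site.xml (client side): raise the slow-I/O warning threshold
     from the default 30000 ms to 60000 ms. Verify the property name
     against your Hadoop version's hdfs-default.xml. -->
<property>
  <name>dfs.client.slow.io.warning.threshold.ms</name>
  <value>60000</value>
</property>
```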

On Tue, Oct 18, 2016 at 11:19 AM, static-max <fl...@googlemail.com>
wrote:


Re: "Slow ReadProcessor" warnings when using BucketSink

Posted by static-max <fl...@googlemail.com>.
Hi Robert,

thanks for your reply. I also didn't find anything helpful on Google.

I checked all GC times; they look OK. Here are the GC times for the Job
Manager (the job has been running fine for 5 days):

Collector     Count  Time
PS-MarkSweep  3      1s
PS-Scavenge   5814   2m 12s

I have no window or any computation, just reading from Kafka and directly
writing to HDFS.

I can also run a terasort or teragen in parallel without any problems.

Best,
Max
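A quick back-of-the-envelope check on the GC figures above supports ruling GC out: roughly 133 seconds of total collection time over five days of uptime is a vanishingly small fraction of wall-clock time.

```python
# Back-of-the-envelope GC overhead from the figures quoted above.
full_gc_seconds = 1              # PS-MarkSweep: 3 collections, 1 s total
young_gc_seconds = 2 * 60 + 12   # PS-Scavenge: 5814 collections, 2 m 12 s total
uptime_seconds = 5 * 24 * 3600   # job has been running for ~5 days

overhead = (full_gc_seconds + young_gc_seconds) / uptime_seconds
print(f"GC overhead: {overhead:.4%}")  # well under 0.1% of wall-clock time
```

At that overhead, GC pauses cannot plausibly account for 30-second ack delays in the HDFS write pipeline.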

2016-10-12 11:32 GMT+02:00 Robert Metzger <rm...@apache.org>:


Re: "Slow ReadProcessor" warnings when using BucketSink

Posted by Robert Metzger <rm...@apache.org>.
Hi,
I haven't seen this error before. Also, I didn't find anything helpful
searching for the error on Google.

Did you check the GC times for Flink as well? Is your Flink job doing any
heavy work (like maintaining large windows, or other operations involving a
lot of heap space)?

Regards,
Robert


On Tue, Oct 11, 2016 at 10:51 AM, static-max <fl...@googlemail.com>
wrote:
