Posted to user@flink.apache.org by nick toker <ni...@gmail.com> on 2020/06/16 07:44:21 UTC

Improved performance when using incremental checkpoints

Hello,

We are using RocksDB as the state backend.
At first we did not enable checkpointing.

We observed the following behaviour and are wondering why.

When using RocksDB *without* checkpointing, the performance was
extremely bad.
When we enabled checkpointing, the performance improved by a
*factor of 10*.

Could you please explain whether this behaviour is expected?
Could you please explain why enabling checkpointing significantly improves
the performance?

BR,
Nick

Re: Improved performance when using incremental checkpoints

Posted by Congxian Qiu <qc...@gmail.com>.

Hi Nick

The result is a bit weird. Did you compare the disk utilization/performance
before and after enabling checkpointing?

Best,
Congxian


Yun Tang <my...@live.com> wrote on Wed, Jun 17, 2020 at 8:56 PM:

> Hi Nick
>
> I think this thread uses the same program as the one discussed in the
> thread "MapState bad performance".
> Please provide a simple program that reproduces this so that we can
> help you more.
>
> Best
> Yun Tang
> ------------------------------
> *From:* Aljoscha Krettek <al...@apache.org>
> *Sent:* Tuesday, June 16, 2020 19:53
> *To:* user@flink.apache.org <us...@flink.apache.org>
> *Subject:* Re: Improved performance when using incremental checkpoints
>
> Hi,
>
> it might be that the operations that Flink performs on RocksDB during
> checkpointing "poke" RocksDB somehow and make it clean up its
> internal hierarchies of storage more. Other than that, I'm also a bit
> surprised by this.
>
> Maybe Yun Tang will come up with another idea.
>
> Best,
> Aljoscha
>
> On 16.06.20 12:42, nick toker wrote:
> > Hi,
> >
> > We used both Flink 1.9.1 and 1.10.1.
> > We used the default RocksDB configuration.
> > The streaming pipeline is very simple:
> >
> > 1. Kafka consumer
> > 2. Process function
> > 3. Kafka producer
> >
> > The code of the process function is listed below:
> >
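> > // Map state handle; assumed to be initialized in open() (not shown in this post).
> > private transient MapState<String, Object> testMapState;
> >
> > @Override
> > public void processElement(Map<String, Object> value, Context ctx,
> >         Collector<Map<String, Object>> out) throws Exception {
> >     // Only proceed if no state exists for the current key.
> >     if (testMapState.isEmpty()) {
> >         // Write every entry of the incoming record into the state.
> >         testMapState.putAll(value);
> >
> >         out.collect(value);
> >
> >         // Remove the entries again, so the state is empty for the next record.
> >         testMapState.clear();
> >     }
> > }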
> >
> > We used the same code with ValueState and observed the same results.
> >
> >
> > BR,
> >
> > Nick
> >
> >
> > On Tue, Jun 16, 2020 at 11:56, Yun Tang <myasuka@live.com> wrote:
> >
> >> Hi Nick
> >>
> >> It's really strange that performance could improve when checkpointing is
> >> enabled.
> >> In general, enabling checkpointing might bring a slight performance
> >> penalty to the whole job.
> >>
> >> Could you give more details, e.g. the Flink version, the RocksDB
> >> configuration, and simple code that reproduces this problem?
> >>
> >> Best
> >> Yun Tang
> >
>
>

Re: Improved performance when using incremental checkpoints

Posted by Yun Tang <my...@live.com>.

Hi Nick

I think this thread uses the same program as the one discussed in the thread "MapState bad performance".
Please provide a simple program that reproduces this so that we can help you more.

Best
Yun Tang

Re: Improved performance when using incremental checkpoints

Posted by Aljoscha Krettek <al...@apache.org>.

Hi,

it might be that the operations that Flink performs on RocksDB during
checkpointing "poke" RocksDB somehow and make it clean up its
internal hierarchies of storage more. Other than that, I'm also a bit
surprised by this.

Maybe Yun Tang will come up with another idea.

Best,
Aljoscha
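
The hypothesis above can be pictured with a toy model. The sketch below is
not RocksDB or Flink code: it assumes a store in which a delete only writes
a marker (a "tombstone"), an emptiness check has to step over those markers,
and some periodic clean-up (here a hypothetical flush() standing in for
whatever checkpointing might trigger) drops them. The class name, the key
scheme ("k" + i) and the record counts are illustrative assumptions. Under
these assumptions, the isEmpty/put/clear pattern from the process function
in this thread scans far fewer entries when the clean-up runs regularly.

```java
import java.util.TreeMap;

/** Toy model of a store with delete markers; NOT RocksDB's real implementation. */
final class TombstoneSketch {
    // Sentinel value that marks a deleted key (assumption of this toy model).
    private static final Object TOMBSTONE = new Object();

    private final TreeMap<String, Object> entries = new TreeMap<>();
    private long entriesScanned = 0;

    void put(String key, Object value) {
        entries.put(key, value);
    }

    // A delete does not remove the key; it overwrites the value with a marker.
    void delete(String key) {
        entries.put(key, TOMBSTONE);
    }

    // Emptiness check: walks the entries until it finds a live one,
    // counting every marker it has to step over.
    boolean isEmpty() {
        for (Object value : entries.values()) {
            entriesScanned++;
            if (value != TOMBSTONE) {
                return false;
            }
        }
        return true;
    }

    // Hypothetical clean-up step: physically drops all delete markers.
    void flush() {
        entries.values().removeIf(value -> value == TOMBSTONE);
    }

    // Replays the isEmpty / put / clear pattern for `records` records and
    // returns how many entries the emptiness checks scanned in total.
    // flushEvery <= 0 means the clean-up never runs.
    static long run(int records, int flushEvery) {
        TombstoneSketch store = new TombstoneSketch();
        for (int i = 0; i < records; i++) {
            if (store.isEmpty()) {
                String key = "k" + i; // assumption: every record writes a new key
                store.put(key, i);
                store.delete(key);
            }
            if (flushEvery > 0 && (i + 1) % flushEvery == 0) {
                store.flush();
            }
        }
        return store.entriesScanned;
    }

    public static void main(String[] args) {
        System.out.println("entries scanned, no clean-up:        " + run(2000, 0));
        System.out.println("entries scanned, clean-up every 100: " + run(2000, 100));
    }
}
```

Whether this is what actually happens in the reported job is not confirmed
anywhere in this thread; the model only shows why a periodic clean-up could
plausibly make such an access pattern cheaper.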


Re: Improved performance when using incremental checkpoints

Posted by nick toker <ni...@gmail.com>.

Hi,

We used both Flink 1.9.1 and 1.10.1.
We used the default RocksDB configuration.
The streaming pipeline is very simple:

1. Kafka consumer
2. Process function
3. Kafka producer

The code of the process function is listed below:

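// Map state handle; assumed to be initialized in open() (not shown in this post).
private transient MapState<String, Object> testMapState;

@Override
public void processElement(Map<String, Object> value, Context ctx,
        Collector<Map<String, Object>> out) throws Exception {
    // Only proceed if no state exists for the current key.
    if (testMapState.isEmpty()) {
        // Write every entry of the incoming record into the state.
        testMapState.putAll(value);

        out.collect(value);

        // Remove the entries again, so the state is empty for the next record.
        testMapState.clear();
    }
}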

We used the same code with ValueState and observed the same results.


BR,

Nick

Re: Improved performance when using incremental checkpoints

Posted by Yun Tang <my...@live.com>.

Hi Nick

It's really strange that performance could improve when checkpointing is enabled.
In general, enabling checkpointing might bring a slight performance penalty to the whole job.

Could you give more details, e.g. the Flink version, the RocksDB configuration, and simple code that reproduces this problem?

Best
Yun Tang