
Posted to user@flink.apache.org by Juha Mynttinen <ju...@king.com> on 2020/09/16 05:58:21 UTC

Disable WAL in RocksDB recovery

Hello there,

I'd like to reopen a previously discussed topic: disabling the WAL during RocksDB recovery.

The WAL is not needed during recovery: it is never read, so there is no need to write it.

AFAIK the last change that touched the WAL during recovery was an attempt to remove it, which was later reverted (https://issues.apache.org/jira/browse/FLINK-8922). If I interpret the comments in the ticket correctly, the outcome was that a) the WAL was kept during recovery, and b) it's still unknown why removing the WAL causes a segfault.

The ticket shows that writing the WAL carries a significant performance penalty, so getting rid of it would be a very nice performance improvement. I think it would be worth creating a new JIRA ticket, at least as a reminder that the WAL should be removed.

I'm planning to add an experimental flag that removes the WAL in the environment where I run Flink, and to try it out. If the flag is configurable, the WAL can always be re-enabled if removing it causes issues.
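
To make the idea concrete, here is a rough sketch of what such a flag could look like. It deliberately does not use RocksDB, so that it runs standalone: ToyStore, the option key and the restore() helper are all hypothetical names made up for illustration, not existing Flink or RocksDB code. In RocksJava itself, the per-write switch would be WriteOptions.setDisableWAL(true), passed along with the WriteBatch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RestoreWalFlagSketch {

    // Hypothetical option name, invented for this sketch; not an existing Flink option.
    public static final String DISABLE_WAL_KEY = "example.rocksdb.restore.disable-wal";

    /** Toy stand-in for a store with a write-ahead log; this is NOT the RocksDB API. */
    public static final class ToyStore {
        public final List<String> wal = new ArrayList<>();
        public final Map<String, String> data = new LinkedHashMap<>();

        /** Applies a batch of writes, logging them to the WAL first unless it is disabled. */
        public void writeBatch(Map<String, String> batch, boolean disableWal) {
            if (!disableWal) {
                for (Map.Entry<String, String> e : batch.entrySet()) {
                    wal.add(e.getKey() + "=" + e.getValue());
                }
            }
            data.putAll(batch);
        }
    }

    /** Reads the flag; the default (false) keeps the WAL, so behavior is unchanged unless opted in. */
    public static boolean disableWalOnRestore(Map<String, String> config) {
        return Boolean.parseBoolean(config.getOrDefault(DISABLE_WAL_KEY, "false"));
    }

    /** Restores a snapshot into a fresh store, honoring the flag. */
    public static ToyStore restore(Map<String, String> snapshot, Map<String, String> config) {
        ToyStore store = new ToyStore();
        store.writeBatch(snapshot, disableWalOnRestore(config));
        return store;
    }

    public static void main(String[] args) {
        Map<String, String> snapshot = Map.of("key-1", "value-1");
        ToyStore withWal = restore(snapshot, Map.of());
        ToyStore withoutWal = restore(snapshot, Map.of(DISABLE_WAL_KEY, "true"));
        System.out.println("WAL entries, default: " + withWal.wal.size());
        System.out.println("WAL entries, flag set: " + withoutWal.wal.size());
    }
}
```

The restored data is the same in both cases; only the WAL writes are skipped. Defaulting the flag to false means the current behavior is kept unless someone opts in.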

Thoughts?

Regards,
Juha

Re: Disable WAL in RocksDB recovery

Posted by Yu Li <ca...@gmail.com>.

Great, thanks for the follow up.

Best Regards,
Yu


On Mon, 21 Sep 2020 at 15:04, Juha Mynttinen <ju...@king.com>
wrote:

> Good,
>
> I opened this JIRA for the issue
> https://issues.apache.org/jira/browse/FLINK-19303. The discussion can be
> moved there.
>
> Regards,
> Juha

Re: Disable WAL in RocksDB recovery

Posted by Juha Mynttinen <ju...@king.com>.

Good,

I opened this JIRA for the issue https://issues.apache.org/jira/browse/FLINK-19303. The discussion can be moved there.

Regards,
Juha
________________________________
From: Yu Li <ca...@gmail.com>
Sent: Friday, September 18, 2020 3:58 PM
To: Juha Mynttinen <ju...@king.com>
Cc: user@flink.apache.org <us...@flink.apache.org>
Subject: Re: Disable WAL in RocksDB recovery

Thanks for bringing this up Juha, and good catch.

We actually disable the WAL for routine writes by default when using RocksDB and have never encountered segmentation fault issues. However, according to the history in FLINK-8922, a segmentation fault occurs during restore if the WAL is disabled, so I guess the root cause lies in RocksDB batch writes (org.rocksdb.WriteBatch). IMHO this is a RocksDB bug (it should work well when the WAL is disabled, whether under single or batch writes).

+1 for opening a new JIRA to figure out the root cause, fix it, and disable the WAL during restore by default (checking the fixes around WriteBatch in later RocksDB versions might help locate the issue more quickly), and thanks for volunteering to take on the effort. I will follow up and help review any findings or PR submissions.

Best Regards,
Yu


Re: Disable WAL in RocksDB recovery

Posted by Yu Li <ca...@gmail.com>.

Thanks for bringing this up Juha, and good catch.

We actually disable the WAL for routine writes by default when using
RocksDB and have never encountered segmentation fault issues. However,
according to the history in FLINK-8922, a segmentation fault occurs during
restore if the WAL is disabled, so I guess the root cause lies in RocksDB
batch writes (org.rocksdb.WriteBatch). IMHO this is a RocksDB bug (it
should work well when the WAL is disabled, whether under single or batch
writes).

+1 for opening a new JIRA to figure out the root cause, fix it, and disable
the WAL during restore by default (checking the fixes around WriteBatch in
later RocksDB versions might help locate the issue more quickly), and
thanks for volunteering to take on the effort. I will follow up and help
review any findings or PR submissions.

Best Regards,
Yu

