Posted to commits@cassandra.apache.org by "Wei Deng (Jira)" <ji...@apache.org> on 2020/08/13 16:02:00 UTC
[jira] [Created] (CASSANDRA-16047) Potential race condition in
creating hard link when incremental backup is turned on
Wei Deng created CASSANDRA-16047:
------------------------------------
Summary: Potential race condition in creating hard link when incremental backup is turned on
Key: CASSANDRA-16047
URL: https://issues.apache.org/jira/browse/CASSANDRA-16047
Project: Cassandra
Issue Type: Bug
Components: Local/SSTable
Reporter: Wei Deng
Attachments: incremental_backup_hardlink_exception.jpg, incremental_backup_hardlink_exception1.jpg
It seems that there is a race condition in creating the hard link when incremental backup is turned on.
The following screenshot was captured in a production cluster running Cassandra 3.0.15 after incremental backup was turned on. When this {{NoSuchFileException}} happens, the resulting {{FSWriteError}}, combined with the default disk failure policy, causes the JVM to shut down, so it's a pretty critical bug.
!incremental_backup_hardlink_exception.jpg!
Due to the risk of production database downtime (if a similar issue happens on multiple nodes within a short time frame), incremental backup had to be turned off for now, but this is not an ideal situation.
!incremental_backup_hardlink_exception1.jpg!
The deployment is in a public cloud environment with EBS-like disks that are backed by SSD with decent latency, throughput and IOPS, so it is hard to believe that the culprit is in the OS or IO layer. Based on the second screenshot above, the affected table is {{system.size_estimates}}, which has low flush traffic, so compaction of the source SSTable doesn't seem to be at play here.
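To make the suspected failure mode concrete, here is a minimal sketch of how a hard link call can fail with {{NoSuchFileException}} when the source file disappears between an existence check and the link call. This is not Cassandra's code and does not claim to show the actual cause of this issue; the class name, method name and file names are hypothetical, and the deletion is done inline to simulate what another thread might do.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;

public class HardLinkRaceSketch {

    /**
     * Returns true if creating the hard link fails with NoSuchFileException
     * even though the source file existed when it was checked.
     */
    static boolean linkFailsWhenSourceMissing() throws IOException {
        Path dir = Files.createTempDirectory("hardlink-sketch");
        Path source = dir.resolve("source-Data.db");   // hypothetical SSTable component
        Path link = dir.resolve("backup-Data.db");     // hypothetical backup hard link
        Files.write(source, new byte[] {1, 2, 3});

        // Step 1: the "backup" side observes that the source exists.
        boolean existedAtCheck = Files.exists(source);

        // Step 2: simulate a concurrent actor removing the source
        // before the link is created (assumption: done inline here).
        Files.delete(source);

        try {
            // Step 3: the link call now targets a file that is gone.
            Files.createLink(link, source);
            return false;
        } catch (NoSuchFileException e) {
            return existedAtCheck;
        } finally {
            // Clean up whatever is left of the temporary files.
            Files.deleteIfExists(link);
            Files.deleteIfExists(source);
            Files.deleteIfExists(dir);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(linkFailsWhenSourceMissing());
    }
}
```

The sketch only demonstrates the check-then-link window on a file system that supports hard links. Which component, if any, removes the source file in the reported cluster is exactly what remains to be determined.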
--
This message was sent by Atlassian Jira
(v8.3.4#803005)