Posted to issues@kudu.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2020/09/22 06:02:00 UTC
[jira] [Commented] (KUDU-2233) Check failure during compactions: pv_delete_redo != nullptr
[ https://issues.apache.org/jira/browse/KUDU-2233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17199833#comment-17199833 ]
ASF subversion and git services commented on KUDU-2233:
-------------------------------------------------------
Commit fcceb8b1a20afff30e15b6248a56ab3e06b61e79 in kudu's branch refs/heads/master from Andrew Wong
[ https://gitbox.apache.org/repos/asf?p=kudu.git;h=fcceb8b ]
KUDU-3191: fail replicas when KUDU-2233 is detected
Despite the longstanding fixes that stop bad KUDU-2233 compactions,
users still see the results of already corrupted data, particularly when
upgrading to newer versions that may compact more aggressively than
older versions.
Rather than crashing when hitting a KUDU-2233 failure, this patch
updates the behavior to fail the replica. Similar to disk failures or
CFile checksum corruption, this will trigger re-replication to happen,
and eviction will only happen if there is a healthy majority.
The hope is that fewer users will see this corruption cause problems, as
the corruption will henceforth not crash servers, and only tablets with
a majority corrupted will be unavailable.
Change-Id: I43570b961dfd5eb8518328121585255d32cf2ebb
Reviewed-on: http://gerrit.cloudera.org:8080/16471
Tested-by: Kudu Jenkins
Reviewed-by: Alexey Serbin <as...@cloudera.com>
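The behavior change described in the commit message above can be illustrated with a minimal sketch. This is not Kudu's actual code: `Replica`, `ReplicaState`, `CheckCompactionInput`, and `CanEvictFailedReplicas` are hypothetical names chosen for illustration, and the strict-majority rule is an assumption based on the commit message's "healthy majority" wording.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical replica states; these are not Kudu's real types.
enum class ReplicaState { kRunning, kFailed };

struct Replica {
  std::string id;
  ReplicaState state = ReplicaState::kRunning;
};

// Before the patch, a missing REDO delete hit a CHECK and aborted the
// whole server. In this sketch the same condition only marks the
// affected replica as failed and reports the problem to the caller.
// Returns true if the compaction input looks sane.
bool CheckCompactionInput(Replica* replica, const void* pv_delete_redo) {
  if (pv_delete_redo == nullptr) {
    replica->state = ReplicaState::kFailed;
    return false;
  }
  return true;
}

// Assumption: failed replicas are evicted (and re-replicated) only while
// a strict majority of the tablet's replicas is still healthy.
bool CanEvictFailedReplicas(const std::vector<Replica>& replicas) {
  std::size_t healthy = 0;
  for (const Replica& r : replicas) {
    if (r.state == ReplicaState::kRunning) ++healthy;
  }
  return healthy * 2 > replicas.size();
}
```

With three replicas, one failure still leaves a healthy majority, so the failed copy can be replaced. With two failures the check returns false, which corresponds to the case in the commit message where the tablet becomes unavailable rather than the server crashing.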
> Check failure during compactions: pv_delete_redo != nullptr
> -----------------------------------------------------------
>
> Key: KUDU-2233
> URL: https://issues.apache.org/jira/browse/KUDU-2233
> Project: Kudu
> Issue Type: Bug
> Components: tablet, tserver
> Affects Versions: 1.4.0
> Reporter: Andrew Wong
> Assignee: Andrew Wong
> Priority: Major
> Fix For: 1.7.0
>
>
> There have been a couple of reports, going back to at least 1.4, of a check failure during compactions. The log is pasted below:
> {noformat}
> F1201 14:55:37.052140 10508 compaction.cc:756] Check failed: pv_delete_redo != nullptr
> *** Check failure stack trace: ***
> Wrote minidump to /var/log/kudu/minidumps/kudu-tserver/215cde39-7795-0885-0b51038d-771d875e.dmp
> *** Aborted at 1512161737 (unix time) try "date -d @1512161737" if you are using GNU date ***
> PC: @ 0x3ec3632625 (unknown)
> *** SIGABRT (@0x3b98eec0000028e3) received by PID 10467 (TID 0x7f8b02c58700) from PID 10467; stack trace: ***
> @ 0x3ec3a0f7e0 (unknown)
> @ 0x3ec3632625 (unknown)
> @ 0x3ec3633e05 (unknown)
> @ 0x1b53f59 (unknown)
> @ 0x8b9f6d google::LogMessage::Fail()
> @ 0x8bbe2d google::LogMessage::SendToLog()
> @ 0x8b9aa9 google::LogMessage::Flush()
> @ 0x8bc8cf google::LogMessageFatal::~LogMessageFatal()
> @ 0x9db0fe kudu::tablet::FlushCompactionInput()
> @ 0x9a056a kudu::tablet::Tablet::DoMergeCompactionOrFlush()
> @ 0x9a372d kudu::tablet::Tablet::Compact()
> @ 0x9bd8d1 kudu::tablet::CompactRowSetsOp::Perform()
> @ 0x1b4145f kudu::MaintenanceManager::LaunchOp()
> @ 0x1b8da06 kudu::ThreadPool::DispatchThread()
> @ 0x1b888ea kudu::Thread::SuperviseThread()
> @ 0x3ec3a07aa1 (unknown)
> @ 0x3ec36e893d (unknown)
> @ 0x0 (unknown)
> {noformat}
--
This message was sent by Atlassian Jira
(v8.3.4#803005)