Posted to issues@kudu.apache.org by "Todd Lipcon (JIRA)" <ji...@apache.org> on 2016/03/04 04:51:40 UTC
[jira] [Resolved] (KUDU-969) Bootstrap may occasionally mis-identify previously flushed updates
[ https://issues.apache.org/jira/browse/KUDU-969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Todd Lipcon resolved KUDU-969.
------------------------------
Resolution: Fixed
Fix Version/s: 0.8.0
> Bootstrap may occasionally mis-identify previously flushed updates
> ------------------------------------------------------------------
>
> Key: KUDU-969
> URL: https://issues.apache.org/jira/browse/KUDU-969
> Project: Kudu
> Issue Type: Bug
> Components: tablet
> Affects Versions: 0.5.0, 0.6.0, 0.7.0
> Reporter: Todd Lipcon
> Assignee: Todd Lipcon
> Priority: Blocker
> Fix For: 0.8.0
>
>
> tablet_bootstrap has the following TODO:
> {code}
> if (!FindCopy(flushed_dms_by_drs_id_, target.rs_id(), &last_durable_dms_id)) {
>   // if we have no data about this RowSet, then it must have been flushed and
>   // then deleted.
>   // TODO: how do we avoid a race where we get an update on a rowset before
>   // it is persisted? add docs about the ordering of flush.
>   return true;
> }
> {code}
> alter_table-randomized-test, when looped in TSAN, seems to fail after around 30 iterations with a sequence like:
> - a compaction enters "duplicating" phase
> - an update arrives, which is duplicated into both the old and new rowset IDs
> -- the new rowset ID isn't part of the metadata yet
> - the process gets killed with kill -9 before we flush the metadata from the compaction
> It seems that we then mis-identify the update to the "new" store as already flushed, which can cause the bootstrap to fail (or maybe cause a missing update).
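The ambiguity described above can be sketched in a few lines. This is only an illustration, not the tablet_bootstrap code: the FlushedDmsByRowSet alias, WasUpdateAlreadyFlushed, DemoRace, and the dms_id comparison are all assumptions made for the example, with a plain std::map standing in for flushed_dms_by_drs_id_.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical stand-in for the metadata read at bootstrap:
// rowset ID -> ID of the last delta memstore (DMS) durably flushed for it.
using FlushedDmsByRowSet = std::map<int64_t, int64_t>;

// Same shape as the quoted check. A rowset missing from the metadata is
// assumed to have been flushed and then deleted, so the update is treated
// as already durable. The dms_id comparison is an assumption for this sketch.
bool WasUpdateAlreadyFlushed(const FlushedDmsByRowSet& flushed,
                             int64_t rs_id, int64_t dms_id) {
  auto it = flushed.find(rs_id);
  if (it == flushed.end()) {
    return true;  // ambiguous: deleted long ago, or never persisted?
  }
  return dms_id <= it->second;
}

// Replays the reported sequence: rowset 1 is the compaction input and is in
// the metadata; rowset 2 is the compaction output, whose metadata was never
// flushed before the crash.
void DemoRace() {
  FlushedDmsByRowSet flushed = {{1, 0}};

  // The copy of the update aimed at the old rowset is correctly replayed.
  assert(!WasUpdateAlreadyFlushed(flushed, /*rs_id=*/1, /*dms_id=*/1));

  // The copy aimed at the new rowset is skipped as "already flushed",
  // although nothing about rowset 2 ever reached disk.
  assert(WasUpdateAlreadyFlushed(flushed, /*rs_id=*/2, /*dms_id=*/0));
}
```

Calling DemoRace() passes both assertions, which is the problem: the lookup alone cannot distinguish a rowset that was flushed and deleted from one that was never persisted.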
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)