Posted to issues@kudu.apache.org by "Adar Dembo (Jira)" <ji...@apache.org> on 2019/11/18 09:46:00 UTC

[jira] [Commented] (KUDU-38) bootstrap should not replay logs that are known to be fully flushed

    [ https://issues.apache.org/jira/browse/KUDU-38?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16976392#comment-16976392 ] 

Adar Dembo commented on KUDU-38:
--------------------------------

[~tlipcon] I've been looking into this (with [~awong]'s help) and I have a few questions:
# Guaranteeing that every complete segment has a fully sync'ed index file makes for a nice invariant, but isn't it overkill for the task at hand? Couldn't we get away with sync'ing whichever index file contains the earliest anchored index at TabletMetadata flush time? I'm particularly concerned about the backwards compatibility implications: how do we establish this invariant after upgrading to a release including this fix? Or, how do we detect that it's not present in existing log index files?
# Alternatively, what about forgoing the log index file altogether? Rather than storing the earliest anchored index in the TabletMetadata, we could store the "physical index" (i.e. the {{LogIndexEntry}} corresponding to the anchor).
# Associating a TabletMetadata::Flush with the flushing store's log anchor (if there is one) is indeed tricky, given the lack of visibility into anchors at that layer of the code.
# There's one bit in the "trickiness" you outlined that's confusing me: you used the plural "stores" rather than just "store". Does this mean that if we've got e.g. two active DMS flushes and one active MRS flush, we need to exclude all three anchors in the first call to TabletMetadata::Flush, two anchors from the second call, and one anchor from the third?
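
To make the last question concrete, here is a minimal sketch of the interpretation it describes: at each TabletMetadata::Flush, the recorded index is the earliest anchor among stores that are not currently being flushed. Everything in it is hypothetical and only models the question ({{StoreId}}, {{LogIndex}}, {{kNoAnchor}} and {{EarliestAnchorExcluding}} are illustrative names, not Kudu's actual API), and whether this is the intended behavior is exactly what is being asked.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>
#include <string>

// Hypothetical model, not Kudu's real API: each in-memory store (MRS or DMS)
// holds one log anchor, i.e. the earliest log index it still needs replayed.
typedef std::string StoreId;
typedef int64_t LogIndex;

// Sentinel returned when no eligible store holds an anchor.
const LogIndex kNoAnchor = -1;

// Returns the earliest anchored index among stores that are NOT in the
// 'flushing' set, or kNoAnchor if every anchored store is being flushed.
LogIndex EarliestAnchorExcluding(const std::map<StoreId, LogIndex>& anchors,
                                 const std::set<StoreId>& flushing) {
  LogIndex earliest = kNoAnchor;
  for (const auto& entry : anchors) {
    assert(entry.second >= 0);  // indexes are assumed non-negative here
    if (flushing.count(entry.first) > 0) {
      continue;  // excluded: this store has a flush in flight
    }
    if (earliest == kNoAnchor || entry.second < earliest) {
      earliest = entry.second;
    }
  }
  return earliest;
}
```

Under this reading, with two DMS flushes and one MRS flush in flight, the first call would pass all three stores in {{flushing}}. Once a flush completes and its anchor is released, that store would be dropped from both containers, so the second call excludes two anchors and the third excludes one.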

> bootstrap should not replay logs that are known to be fully flushed
> -------------------------------------------------------------------
>
>                 Key: KUDU-38
>                 URL: https://issues.apache.org/jira/browse/KUDU-38
>             Project: Kudu
>          Issue Type: Sub-task
>          Components: tablet
>    Affects Versions: M3
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Major
>              Labels: data-scalability, startup-time
>
> Currently, the bootstrap process replays all of the log segments, including those that can be trivially determined to contain only durable edits. This makes startup unnecessarily slow.


