Posted to issues@kudu.apache.org by "Todd Lipcon (JIRA)" <ji...@apache.org> on 2016/08/30 05:06:20 UTC

[jira] [Created] (KUDU-1586) If a single op is larger than consensus_max_batch_size_bytes, consensus gets stuck

Todd Lipcon created KUDU-1586:
---------------------------------

             Summary: If a single op is larger than consensus_max_batch_size_bytes, consensus gets stuck
                 Key: KUDU-1586
                 URL: https://issues.apache.org/jira/browse/KUDU-1586
             Project: Kudu
          Issue Type: Bug
          Components: consensus
    Affects Versions: 0.10.0
            Reporter: Todd Lipcon
            Assignee: Todd Lipcon
            Priority: Blocker


During a cluster test, I noticed that a leader was spinning with log messages like:

I0829 14:17:31.870786 22184 log_cache.cc:307] T e7cacfdb22744496a6d5d66227a69823 P 5d15962d2f2445b1ba15b93ead4fb31b: Successfully read 1 ops from disk (866604..866604)
I0829 14:17:31.873234  6186 log_cache.cc:307] T e7cacfdb22744496a6d5d66227a69823 P 5d15962d2f2445b1ba15b93ead4fb31b: Successfully read 1 ops from disk (866604..866604)
I0829 14:17:31.875713 22184 log_cache.cc:307] T e7cacfdb22744496a6d5d66227a69823 P 5d15962d2f2445b1ba15b93ead4fb31b: Successfully read 1 ops from disk (866604..866604)
I0829 14:17:31.878078  6186 log_cache.cc:307] T e7cacfdb22744496a6d5d66227a69823 P 5d15962d2f2445b1ba15b93ead4fb31b: Successfully read 1 ops from disk (866604..866604)

After investigation, it appears that this op was larger than 1MB (the default consensus batch size, consensus_max_batch_size_bytes). As a result, the leader looped tightly, re-reading the same op from disk without making any progress.
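
To illustrate the suspected failure mode, here is a minimal sketch in Python. It is not Kudu's code: read_batch, replicate, and the always_include_first flag are hypothetical names, and the sketch assumes the batch builder stops before adding any op that would push the batch over the size limit. Under that assumption, an op that is itself larger than the limit yields an empty batch on every round, so the sender never advances. Always admitting the first op, even when it is oversized, is one possible way to guarantee progress.

```python
from typing import List, Optional, Sequence


def read_batch(op_sizes: Sequence[int], start: int, max_batch_bytes: int,
               always_include_first: bool) -> List[int]:
    """Return the indexes of the ops chosen for the next batch, beginning at `start`."""
    batch: List[int] = []
    total = 0
    for i in range(start, len(op_sizes)):
        over_budget = total + op_sizes[i] > max_batch_bytes
        # Stop when the next op does not fit, unless the batch is still empty
        # and we have chosen to always admit at least one op.
        if over_budget and (batch or not always_include_first):
            break
        batch.append(i)
        total += op_sizes[i]
    return batch


def replicate(op_sizes: Sequence[int], max_batch_bytes: int,
              always_include_first: bool, max_rounds: int = 100) -> Optional[int]:
    """Return the number of rounds needed to send every op, or None if stuck."""
    next_index = 0
    for rounds in range(1, max_rounds + 1):
        batch = read_batch(op_sizes, next_index, max_batch_bytes,
                           always_include_first)
        next_index += len(batch)  # an empty batch means no progress this round
        if next_index == len(op_sizes):
            return rounds
    return None


ONE_MB = 1024 * 1024
sizes = [100, 2 * ONE_MB, 100]  # the middle op alone exceeds the 1MB limit

stuck = replicate(sizes, ONE_MB, always_include_first=False)
fixed = replicate(sizes, ONE_MB, always_include_first=True)
```

With these inputs, the first call gives up after max_rounds because every batch starting at the oversized op is empty, while the second call sends the oversized op in a batch of its own and completes.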



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)