You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@cassandra.apache.org by Angelo Polo <la...@gmail.com> on 2019/12/19 12:01:32 UTC

CommitLogReaderTest

Hi C* Devs,

I've got two questions relating to the CommitLogReader.

1.
I maintain the Cassandra port for FreeBSD. For C* 3.11.5 (OpenJDK
1.8.0_222, FreeBSD 12.1-RELEASE-p1 GENERIC amd64), there are currently test
failures in org.apache.cassandra.db.commitlog.CommitLogReaderTest with zero
mutations read in each case. Only testReadCount passes and it looks like
that's because the number of generated samples in the test is smaller than
the others. Increase the sample size to around 300 and it will fail too;
similarly, the others pass with a smaller sample size.

The tests are failing because CommitLogDescriptor#readHeader returns null,
a result of the checksum check failing: the checksum read from the file
(CommitLogDescriptor:175) is zero. Similarly version, id, and
parameterLength are all zero and from what I can tell the commit log in
these instances is just all zeros.

Any idea why a "large" number of mutations could cause a problem with the
commit log? Has this been seen and fixed on any of the officially supported
platforms (Linux, Darwin)?

2.
Because of #1 above, I started playing around with CommitLogReaderTest on
Linux (OpenJDK 1.8.0_232, Ubuntu 18.04). Checked out the 3.11.5 release
commit and changed just samples and readCount in
CommitLogReaderTest#testReadFromMidpoint as follows:

    public void testReadFromMidpoint() throws Throwable
    {
        int samples = 100;  // formerly 1000
        int readCount = 60; // formerly 500

Shouldn't this fail uniformly every time with "Expected 60 seen mutations,
got: 50"? It produces somewhat random results:

[junit-timeout] Testcase:
testReadFromMidpoint(org.apache.cassandra.db.commitlog.CommitLogReaderTest):
 FAILED
[junit-timeout] Expected 60 seen mutations, got: 55 expected:<60> but
was:<55>
[junit-timeout] junit.framework.AssertionFailedError: Expected 60 seen
mutations, got: 55 expected:<60> but was:<55>
[junit-timeout]         at
org.apache.cassandra.db.commitlog.CommitLogReaderTest.testReadFromMidpoint(CommitLogReaderTest.java:111)

[junit-timeout] Testcase:
testReadFromMidpoint(org.apache.cassandra.db.commitlog.CommitLogReaderTest):
 FAILED
[junit-timeout] Expected 60 seen mutations, got: 57 expected:<60> but
was:<57>
[junit-timeout] junit.framework.AssertionFailedError: Expected 60 seen
mutations, got: 57 expected:<60> but was:<57>
[junit-timeout]         at
org.apache.cassandra.db.commitlog.CommitLogReaderTest.testReadFromMidpoint(CommitLogReaderTest.java:111)

Including, occasionally, the anticipated answer :)
[junit-timeout] Testcase:
testReadFromMidpoint(org.apache.cassandra.db.commitlog.CommitLogReaderTest):
 FAILED
[junit-timeout] Expected 60 seen mutations, got: 50 expected:<60> but
was:<50>
[junit-timeout] junit.framework.AssertionFailedError: Expected 60 seen
mutations, got: 50 expected:<60> but was:<50>
[junit-timeout]         at
org.apache.cassandra.db.commitlog.CommitLogReaderTest.testReadFromMidpoint(CommitLogReaderTest.java:111)

Shall I open a bug in JIRA for this or have I misunderstood how
testReadFromMidpoint works?

Thanks,
Angelo Polo

Re: CommitLogReaderTest

Posted by Joshua McKenzie <jm...@apache.org>.
Hey Angelo.

Here's my suspicion - only had a brief time to poke at this this morning
but it might help you bottom out on this. In short: I think the logic in
the test is flawed, and that the Assert.assertEquals check should instead
do a >= check against total mutations read from midpoint. The test iterates
through all files and plays mutations >= the midpoint in terms of file
offset, but what we often see testing the CommitLog is that other things
(system tables, etc) are being written to CommitLog's so we can't
deterministically say "check that X mutations are present, no more, no
less" because some runs will end up with extra things written in the CL,
some not, and it's just a timing concern.

Each of your cases, you see that you have >= expected actual readCount from
midpoint (50) mutations in your testHandler.seenMutationCount(), and I
suspect that's because mutations from disparate tables and keyspaces are
all interleaved in the CL segment so you can get more than you expected.

Hope that helps and isn't woefully incorrect. :)


On Thu, Dec 19, 2019 at 7:01 AM Angelo Polo <la...@gmail.com>
wrote:

> Hi C* Devs,
>
> I've got two questions relating to the CommitLogReader.
>
> 1.
> I maintain the Cassandra port for FreeBSD. For C* 3.11.5 (OpenJDK
> 1.8.0_222, FreeBSD 12.1-RELEASE-p1 GENERIC amd64), there are currently test
> failures in org.apache.cassandra.db.commitlog.CommitLogReaderTest with zero
> mutations read in each case. Only testReadCount passes and it looks like
> that's because the number of generated samples in the test is smaller than
> the others. Increase the sample size to around 300 and it will fail too;
> similarly, the others pass with a smaller sample size.
>
> The tests are failing because CommitLogDescriptor#readHeader returns null,
> a result of the checksum check failing: the checksum read from the file
> (CommitLogDescriptor:175) is zero. Similarly version, id, and
> parameterLength are all zero and from what I can tell the commit log in
> these instances is just all zeros.
>
> Any idea why a "large" number of mutations could cause a problem with the
> commit log? Has this been seen and fixed on any of the officially supported
> platforms (Linux, Darwin)?
>
> 2.
> Because of #1 above, I started playing around with CommitLogReaderTest on
> Linux (OpenJDK 1.8.0_232, Ubuntu 18.04). Checked out the 3.11.5 release
> commit and changed just samples and readCount in
> CommitLogReaderTest#testReadFromMidpoint as follows:
>
>     public void testReadFromMidpoint() throws Throwable
>     {
>         int samples = 100;  // formerly 1000
>         int readCount = 60; // formerly 500
>
> Shouldn't this fail uniformly every time with "Expected 60 seen mutations,
> got: 50"? It produces somewhat random results:
>
> [junit-timeout] Testcase:
>
> testReadFromMidpoint(org.apache.cassandra.db.commitlog.CommitLogReaderTest):
>  FAILED
> [junit-timeout] Expected 60 seen mutations, got: 55 expected:<60> but
> was:<55>
> [junit-timeout] junit.framework.AssertionFailedError: Expected 60 seen
> mutations, got: 55 expected:<60> but was:<55>
> [junit-timeout]         at
>
> org.apache.cassandra.db.commitlog.CommitLogReaderTest.testReadFromMidpoint(CommitLogReaderTest.java:111)
>
> [junit-timeout] Testcase:
>
> testReadFromMidpoint(org.apache.cassandra.db.commitlog.CommitLogReaderTest):
>  FAILED
> [junit-timeout] Expected 60 seen mutations, got: 57 expected:<60> but
> was:<57>
> [junit-timeout] junit.framework.AssertionFailedError: Expected 60 seen
> mutations, got: 57 expected:<60> but was:<57>
> [junit-timeout]         at
>
> org.apache.cassandra.db.commitlog.CommitLogReaderTest.testReadFromMidpoint(CommitLogReaderTest.java:111)
>
> Including, occasionally, the anticipated answer :)
> [junit-timeout] Testcase:
>
> testReadFromMidpoint(org.apache.cassandra.db.commitlog.CommitLogReaderTest):
>  FAILED
> [junit-timeout] Expected 60 seen mutations, got: 50 expected:<60> but
> was:<50>
> [junit-timeout] junit.framework.AssertionFailedError: Expected 60 seen
> mutations, got: 50 expected:<60> but was:<50>
> [junit-timeout]         at
>
> org.apache.cassandra.db.commitlog.CommitLogReaderTest.testReadFromMidpoint(CommitLogReaderTest.java:111)
>
> Shall I open a bug in JIRA for this or have I misunderstood how
> testReadFromMidpoint works?
>
> Thanks,
> Angelo Polo
>