Posted to dev@cassandra.apache.org by Ariel Weisberg <ar...@weisberg.ws> on 2017/05/10 16:45:21 UTC

Soliciting volunteers for flaky dtests on trunk

Hi all,

The unit tests are looking pretty reliable right now. There is a long
tail of infrequently failing tests, but it's not bad, and almost all
builds succeed in the current build environment. In CircleCI it seems
like unit tests might be a little less reliable, but still usable.

The dtests, on the other hand, aren't producing clean builds yet. There
is also a pretty diverse set of failing tests.

I did a bit of triaging of the flakey dtests. I started by cataloging
everything, but what I found is that the long tail of flakey dtests is
very long indeed, so I narrowed focus to just the most frequently failing
tests for now. See https://goo.gl/b96CdO

I created a spreadsheet with some of the failing tests. It has links to JIRA,
the last time each test was seen failing, and how many failures I found in
Apache Jenkins across the 3 dtest builds. There are a lot of failures
not listed. There would be 50+ entries if I cataloged each one.

There are two hard failing tests, but both are already moving along:

CASSANDRA-13229 (Ready to commit, assigned Alex Petrov, Paulo Motta
reviewing, last updated April 2017) dtest failure in
topology_test.TestTopology.size_estimates_multidc_test

CASSANDRA-13113 (Ready to commit, assigned Alex Petrov, Sam T reviewing,
last updated March 2017) test failure in
auth_test.TestAuth.system_auth_ks_is_alterable_test

I think the tests we should tackle first are on this sheet in priority
order: https://goo.gl/S3khv1

Suite: bootstrap_test
Test: TestBootstrap.simultaneous_bootstrap_test
JIRA: https://issues.apache.org/jira/browse/CASSANDRA-13506
Last failure: 5/5/2017
Counted failures: 45
Status: Open

Suite: repair_test
Test: incremental_repair_test.TestIncRepair.compaction_test
JIRA: https://issues.apache.org/jira/browse/CASSANDRA-13194
Last failure: 5/4/2017
Counted failures: 44
Status: Open

Suite: sstableutil_test
Test: SSTableUtilTest.compaction_test
JIRA: https://issues.apache.org/jira/browse/CASSANDRA-13182
Last failure: 5/4/2017
Counted failures: 35
Status: Open

Suite: paging_test
Test: TestPagingWithDeletions.test_ttl_deletions
JIRA: https://issues.apache.org/jira/browse/CASSANDRA-13507
Last failure: 4/25/2017
Counted failures: 31
Status: Open

Suite: repair_test
Test: incremental_repair_test.TestIncRepair.multiple_repair_test
JIRA: https://issues.apache.org/jira/browse/CASSANDRA-13515
Last failure: 5/4/2017
Counted failures: 18
Status: Open

Suite: cqlsh_tests
Test: cqlsh_copy_tests.CqlshCopyTest.test_bulk_round_trip_*
JIRA: https://issues.apache.org/jira/issues/?jql=project%20%3D%20CASSANDRA%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened%2C%20%22Patch%20Available%22%2C%20%22Ready%20to%20Commit%22%2C%20%22Awaiting%20Feedback%22)%20AND%20text%20~%20%22CqlshCopyTest%22
Last failure: 5/8/2017
Counted failures: 23

Suite: paxos_tests
Test: TestPaxos.contention_test_many_threads
JIRA: https://issues.apache.org/jira/browse/CASSANDRA-13517
Last failure: 5/8/2017
Counted failures: 15
Status: Open

Suite: repair_test
Test: TestRepair
JIRA: https://issues.apache.org/jira/issues/?jql=status%20%3D%20Open%20AND%20text%20~%20%22dtest%20failure%20repair_test%22
Last failure: 5/4/2017
Comment: No one test fails a lot but the number of failing tests is
substantial

Suite: cqlsh_tests
Test: cqlsh_tests.CqlshSmokeTest.[test_insert | test_truncate |
test_use_keyspace | test_create_keyspace]
JIRA: No JIRA yet
Last failure: 4/22/2017
Counted failures: 6

If you have spare cycles you can make a huge difference in test
stability by picking off one of these.

Regards,
Ariel


Re: Soliciting volunteers for flaky dtests on trunk

Posted by Lerh Chuan Low <le...@instaclustr.com>.
Hey Ariel,

It looks like you've closed the only JIRA I've found on CqlshSmokeTest
(https://issues.apache.org/jira/browse/CASSANDRA-13140) and, as you mentioned
in the ticket, it hasn't been failing recently in either CassCI or Apache
Jenkins. I think we're gold for that one.

Would anyone like a hand with anything?

Lerh

On 18 May 2017 at 03:36, Ariel Weisberg <ar...@weisberg.ws> wrote:

> Hi,
>
> Thank you Blake, Lerh Chuan Low, Jason, and Kurt, and anyone else who
> volunteered.
>
> I'm going to look at repair_test.TestRepair, which is not quite the same
> as repair_test.incremental_repair_test, which Blake is looking at.
>
> The one remaining somewhat high pole in the tent is
> cqlsh_tests.CqlshSmokeTest.
>
> Thanks,
> Ariel
>



Re: Soliciting volunteers for flaky dtests on trunk

Posted by Ariel Weisberg <ar...@weisberg.ws>.
Hi,

Thank you Blake, Lerh Chuan Low, Jason, and Kurt, and anyone else who
volunteered.

I'm going to look at repair_test.TestRepair, which is not quite the same
as repair_test.incremental_repair_test, which Blake is looking at.

The one remaining somewhat high pole in the tent is
cqlsh_tests.CqlshSmokeTest.

Thanks,
Ariel

On Thu, May 11, 2017, at 01:12 PM, Jason Brown wrote:
> I've taken
> CASSANDRA-13507
> CASSANDRA-13517
> 
> -Jason
> 
> 
> On Wed, May 10, 2017 at 9:45 PM, Lerh Chuan Low <le...@instaclustr.com>
> wrote:
> 
> > I'll try my hand on https://issues.apache.org/jira/browse/CASSANDRA-13182.
> >
> > On 11 May 2017 at 05:59, Blake Eggleston <be...@apple.com> wrote:
> >
> > > I've taken CASSANDRA-13194, CASSANDRA-13506, CASSANDRA-13515,
> > > and CASSANDRA-13372 to start
> > >
> > > On May 10, 2017 at 12:44:47 PM, Ariel Weisberg (ariel@weisberg.ws)
> > wrote:
> > >
> > > Hi,
> > >
> > > The dev list murdered my rich text formatted email. Here it is
> > > reformatted as plain text.
> > >
> > > The unit tests are looking pretty reliable right now. There is a long
> > > tail of infrequently failing tests but it's not bad and almost all
> > > builds succeed in the current build environment. In CircleCI it seems
> > > like unit tests might be a little less reliable, but still usable.
> > >
> > > The dtests, on the other hand, aren't producing clean builds yet. There
> > > is also a pretty diverse set of failing tests.
> > >
> > > I did a bit of triaging of the flakey dtests. I started by cataloging
> > > everything, but what I found is that the long tail of flakey dtests is
> > > very long indeed so I narrowed focus to just the top frequently failing
> > > tests for now. See https://goo.gl/b96CdO
> > >
> > > I created a spreadsheet with some of the failing tests. It has links to JIRA,
> > > the last time each test was seen failing, and how many failures I found in
> > > Apache Jenkins across the 3 dtest builds. There are a lot of failures
> > > not listed. There would be 50+ entries if I cataloged each one.
> > >
> > > There are two hard failing tests, but both are already moving along:
> > > CASSANDRA-13229 (Ready to commit, assigned Alex Petrov, Paulo Motta
> > > reviewing, last updated April 2017) dtest failure in
> > > topology_test.TestTopology.size_estimates_multidc_test
> > > CASSANDRA-13113 (Ready to commit, assigned Alex Petrov, Sam T Reviewing,
> > > last updated March 2017) test failure in
> > > auth_test.TestAuth.system_auth_ks_is_alterable_test
> > >
> > > I think the tests we should tackle first are on this sheet in priority
> > > order https://goo.gl/S3khv1
> > >
> > > Suite: bootstrap_test
> > > Test: TestBootstrap.simultaneous_bootstrap_test
> > > JIRA: https://issues.apache.org/jira/browse/CASSANDRA-13506
> > > Last failure: 5/5/2017
> > > Counted failures: 45
> > >
> > > Suite: repair_test
> > > Test: incremental_repair_test.TestIncRepair.compaction_test
> > > JIRA: https://issues.apache.org/jira/browse/CASSANDRA-13194
> > > Last failure: 5/4/2017
> > > Counted failures: 44
> > >
> > > Suite: sstableutil_test
> > > Test: SSTableUtilTest.compaction_test
> > > JIRA: https://issues.apache.org/jira/browse/CASSANDRA-13182
> > > Last failure: 5/4/2017
> > > Counted failures: 35
> > >
> > > Suite: paging_test
> > > Test: TestPagingWithDeletions.test_ttl_deletions
> > > JIRA: https://issues.apache.org/jira/browse/CASSANDRA-13507
> > > Last failure: 4/25/2017
> > > Counted failures: 31
> > >
> > > Suite: repair_test
> > > Test: incremental_repair_test.TestIncRepair.multiple_repair_test
> > > JIRA: https://issues.apache.org/jira/browse/CASSANDRA-13515
> > > Last failure: 5/4/2017
> > > Counted failures: 18
> > >
> > > Suite: cqlsh_tests
> > > Test: cqlsh_copy_tests.CqlshCopyTest.test_bulk_round_trip_*
> > > JIRA:
> > > https://issues.apache.org/jira/issues/?jql=project%20%
> > > 3D%20CASSANDRA%20AND%20status%20in%20(Open%2C%20%22In%
> > > 20Progress%22%2C%20Reopened%2C%20%22Patch%20Available%22%
> > > 2C%20%22Ready%20to%20Commit%22%2C%20%22Awaiting%
> > > 20Feedback%22)%20AND%20text%20~%20%22CqlshCopyTest%22
> > > Last failure: 5/8/2017
> > > Counted failures: 23
> > >
> > > Suite: paxos_tests
> > > Test: TestPaxos.contention_test_many_threads
> > > JIRA: https://issues.apache.org/jira/browse/CASSANDRA-13517
> > > Last failure: 5/8/2017
> > > Counted failures: 15
> > >
> > > Suite: repair_test
> > > Test: TestRepair
> > > JIRA:
> > > https://issues.apache.org/jira/issues/?jql=status%20%3D%
> > > 20Open%20AND%20text%20~%20%22dtest%20failure%20repair_test%22
> > > Last failure: 5/4/2017
> > > Comment: No one test fails a lot but the number of failing tests is
> > > substantial
> > >
> > > Suite: cqlsh_tests
> > > Test: cqlsh_tests.CqlshSmokeTest.[test_insert | test_truncate |
> > > test_use_keyspace | test_create_keyspace]
> > > JIRA: No JIRA yet
> > > Last failure: 4/22/2017
> > > Counted failures: 6
> > >
> > > If you have spare cycles you can make a huge difference in test
> > > stability by picking off one of these.
> > >
> > > Regards,
> > > Ariel

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@cassandra.apache.org
For additional commands, e-mail: dev-help@cassandra.apache.org


Re: Soliciting volunteers for flaky dtests on trunk

Posted by Jason Brown <ja...@gmail.com>.
I've taken
CASSANDRA-13507
CASSANDRA-13517

-Jason


On Wed, May 10, 2017 at 9:45 PM, Lerh Chuan Low <le...@instaclustr.com>
wrote:

> I'll try my hand on https://issues.apache.org/jira/browse/CASSANDRA-13182.
>
> On 11 May 2017 at 05:59, Blake Eggleston <be...@apple.com> wrote:
>
> > I've taken CASSANDRA-13194, CASSANDRA-13506, CASSANDRA-13515,
> > and CASSANDRA-13372 to start
> >
> > On May 10, 2017 at 12:44:47 PM, Ariel Weisberg (ariel@weisberg.ws)
> wrote:
> >
> > Hi,
> >
> > The dev list murdered my rich text formatted email. Here it is
> > reformatted as plain text.
> >
> > The unit tests are looking pretty reliable right now. There is a long
> > tail of infrequently failing tests but it's not bad and almost all
> > builds succeed in the current build environment. In CircleCI it seems
> > like unit tests might be a little less reliable, but still usable.
> >
> > The dtests on the other hand aren't producing clean builds yetl. There
> > is also a pretty diverse set of failing tests.
> >
> > I did a bit of triaging of the flakey dtests. I started by cataloging
> > everything, but what I found is that the long tail of flakey dtests is
> > very long indeed so I narrowed focus to just the top frequently failing
> > tests for now. See https://goo.gl/b96CdO
> >
> > I created spreadsheet with some of the failing tests. Links to JIRA,
> > last time the test was seen failing, and how many failures I found in
> > Apache Jenkins across the 3 dtest builds. There are a lot of failures
> > not listed. There would be 50+ entries if I cataloged each one.
> >
> > There are two hard failing tests, but both are already moving along:
> > CASSANDRA-13229 (Ready to commit, assigned Alex Petrov, Paulo Motta
> > reviewing, last updated April 2017) dtest failure in
> > topology_test.TestTopology.size_estimates_multidc_test
> > CASSANDRA-13113 (Ready to commit, assigned Alex Petrov, Sam T Reviewing,
> > last updated March 2017) test failure in
> > auth_test.TestAuth.system_auth_ks_is_alterable_test
> >
> > I think the tests we should tackle first are on this sheet in priority
> > order https://goo.gl/S3khv1
> >
> > Suite: bootstrap_test
> > Test: TestBootstrap.simultaneous_bootstrap_test
> > JIRA: https://issues.apache.org/jira/browse/CASSANDRA-13506
> > Last failure: 5/5/2017
> > Counted failures: 45
> >
> > Suite: repair_test
> > Test: incremental_repair_test.TestIncRepair.compaction_test
> > JIRA: https://issues.apache.org/jira/browse/CASSANDRA-13194
> > Last failure: 5/4/2017
> > Counted failures: 44
> >
> > Suite: sstableutil_test
> > Test: SSTableUtilTest.compaction_test
> > JIRA: https://issues.apache.org/jira/browse/CASSANDRA-13182
> > Last failure: 5/4/2017
> > Counted failures: 35
> >
> > Suite: paging_test
> > Test: TestPagingWithDeletions.test_ttl_deletions
> > JIRA: https://issues.apache.org/jira/browse/CASSANDRA-13507
> > Last failure: 4/25/2017
> > Counted failures: 31
> >
> > Suite: repair_test
> > Test: incremental_repair_test.TestIncRepair.multiple_repair_test
> > JIRA: https://issues.apache.org/jira/browse/CASSANDRA-13515
> > Last failed: 5/4/2017
> > Counted failures: 18
> >
> > Suite: cqlsh_tests
> > Test: cqlsh_copy_tests.CqlshCopyTest.test_bulk_round_trip_*
> > JIRA:
> > https://issues.apache.org/jira/issues/?jql=project%20%
> > 3D%20CASSANDRA%20AND%20status%20in%20(Open%2C%20%22In%
> > 20Progress%22%2C%20Reopened%2C%20%22Patch%20Available%22%
> > 2C%20%22Ready%20to%20Commit%22%2C%20%22Awaiting%
> > 20Feedback%22)%20AND%20text%20~%20%22CqlshCopyTest%22
> > Last failed: 5/8/2017
> > Counted failures: 23
> >
> > Suite: paxos_tests
> > Test: TestPaxos.contention_test_many_threads
> > JIRA: https://issues.apache.org/jira/browse/CASSANDRA-13517
> > Last failed: 5/8/2017
> > Counted failures: 15
> >
> > Suite: repair_test
> > Test: TestRepair
> > JIRA:
> > https://issues.apache.org/jira/issues/?jql=status%20%3D%
> > 20Open%20AND%20text%20~%20%22dtest%20failure%20repair_test%22
> > Last failure: 5/4/2017
> > Comment: No one test fails a lot but the number of failing tests is
> > substantial
> >
> > Suite: cqlsh_tests
> > Test: cqlsh_tests.CqlshSmokeTest.[test_insert | test_truncate |
> > test_use_keyspace | test_create_keyspace]
> > JIRA: No JIRA yet
> > Last failed: 4/22/2017
> > count: 6
> >
> > If you have spare cycles you can make a huge difference in test
> > stability by picking off one of these.
> >
> > Regards,
> > Ariel
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscribe@cassandra.apache.org
> > For additional commands, e-mail: dev-help@cassandra.apache.org
> >
> >
>
>

Re: Soliciting volunteers for flaky dtests on trunk

Posted by Lerh Chuan Low <le...@instaclustr.com>.
I'll try my hand on https://issues.apache.org/jira/browse/CASSANDRA-13182.

On 11 May 2017 at 05:59, Blake Eggleston <be...@apple.com> wrote:

> I've taken CASSANDRA-13194, CASSANDRA-13506, CASSANDRA-13515,
> and CASSANDRA-13372 to start
>


-- 


*Lerh Chuan Low*
*Software Engineer*0403953752


Re: Soliciting volunteers for flaky dtests on trunk

Posted by Blake Eggleston <be...@apple.com>.
I've taken CASSANDRA-13194, CASSANDRA-13506, CASSANDRA-13515, and CASSANDRA-13372 to start


Re: Soliciting volunteers for flaky dtests on trunk

Posted by Ariel Weisberg <ar...@weisberg.ws>.
Hi,

The dev list murdered my rich text formatted email. Here it is
reformatted as plain text.

The unit tests are looking pretty reliable right now. There is a long
tail of infrequently failing tests but it's not bad and almost all
builds succeed in the current build environment. In CircleCI it seems
like unit tests might be a little less reliable, but still usable.

The dtests on the other hand aren't producing clean builds yet. There
is also a pretty diverse set of failing tests.

I did a bit of triaging of the flaky dtests. I started by cataloging
everything, but what I found is that the long tail of flaky dtests is
very long indeed so I narrowed focus to just the top frequently failing
tests for now. See https://goo.gl/b96CdO

I created a spreadsheet with some of the failing tests. It has links to
JIRA, the last time each test was seen failing, and how many failures I
found in Apache Jenkins across the 3 dtest builds. There are a lot of
failures not listed. There would be 50+ entries if I cataloged each one.

There are two hard failing tests, but both are already moving along:

CASSANDRA-13229 (Ready to commit, assigned to Alex Petrov, Paulo Motta
reviewing, last updated April 2017): dtest failure in
topology_test.TestTopology.size_estimates_multidc_test

CASSANDRA-13113 (Ready to commit, assigned to Alex Petrov, Sam T
reviewing, last updated March 2017): test failure in
auth_test.TestAuth.system_auth_ks_is_alterable_test

I think the tests we should tackle first are on this sheet in priority
order https://goo.gl/S3khv1

Suite: bootstrap_test
Test: TestBootstrap.simultaneous_bootstrap_test
JIRA: https://issues.apache.org/jira/browse/CASSANDRA-13506
Last failure: 5/5/2017
Counted failures: 45

Suite: repair_test
Test: incremental_repair_test.TestIncRepair.compaction_test
JIRA: https://issues.apache.org/jira/browse/CASSANDRA-13194
Last failure: 5/4/2017
Counted failures: 44

Suite: sstableutil_test
Test: SSTableUtilTest.compaction_test
JIRA: https://issues.apache.org/jira/browse/CASSANDRA-13182
Last failure: 5/4/2017
Counted failures: 35

Suite: paging_test
Test: TestPagingWithDeletions.test_ttl_deletions
JIRA: https://issues.apache.org/jira/browse/CASSANDRA-13507
Last failure: 4/25/2017
Counted failures: 31

Suite: repair_test
Test: incremental_repair_test.TestIncRepair.multiple_repair_test
JIRA: https://issues.apache.org/jira/browse/CASSANDRA-13515
Last failure: 5/4/2017
Counted failures: 18

Suite: cqlsh_tests
Test: cqlsh_copy_tests.CqlshCopyTest.test_bulk_round_trip_*
JIRA:
https://issues.apache.org/jira/issues/?jql=project%20%3D%20CASSANDRA%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened%2C%20%22Patch%20Available%22%2C%20%22Ready%20to%20Commit%22%2C%20%22Awaiting%20Feedback%22)%20AND%20text%20~%20%22CqlshCopyTest%22
Last failure: 5/8/2017
Counted failures: 23

Suite: paxos_tests
Test: TestPaxos.contention_test_many_threads
JIRA: https://issues.apache.org/jira/browse/CASSANDRA-13517
Last failure: 5/8/2017
Counted failures: 15

Suite: repair_test
Test: TestRepair
JIRA:
https://issues.apache.org/jira/issues/?jql=status%20%3D%20Open%20AND%20text%20~%20%22dtest%20failure%20repair_test%22
Last failure: 5/4/2017
Comment: No one test fails a lot but the number of failing tests is
substantial

Suite: cqlsh_tests
Test: cqlsh_tests.CqlshSmokeTest.[test_insert | test_truncate |
test_use_keyspace | test_create_keyspace]
JIRA: No JIRA yet
Last failure: 4/22/2017
Counted failures: 6

If you have spare cycles you can make a huge difference in test
stability by picking off one of these.

Regards,
Ariel

