Posted to commits@cassandra.apache.org by "Jon Meredith (JIRA)" <ji...@apache.org> on 2019/08/04 22:42:00 UTC

[jira] [Commented] (CASSANDRA-15170) Reduce the time needed to release in-JVM dtest cluster resources after close

    [ https://issues.apache.org/jira/browse/CASSANDRA-15170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16899690#comment-16899690 ] 

Jon Meredith commented on CASSANDRA-15170:
------------------------------------------

I've updated the branches and this should be ready to review.  Once you're happy with it we can update the commit message and squash the fixup in.
I just didn't have the heart to redo all the merging up again.

2.2 | [Branch|https://github.com/jonmeredith/cassandra/commits/in-jvm-dtest-fixes-v4-2.2] | [CircleCI|https://circleci.com/gh/jonmeredith/cassandra/tree/in-jvm-dtest-fixes-v4-2%2E2]
3.0 | [Branch|https://github.com/jonmeredith/cassandra/commits/in-jvm-dtest-fixes-v4-3.0] | [CircleCI|https://circleci.com/gh/jonmeredith/cassandra/tree/in-jvm-dtest-fixes-v4-3%2E0]
3.11 | [Branch|https://github.com/jonmeredith/cassandra/commits/in-jvm-dtest-fixes-v4-3.11] | [CircleCI|https://circleci.com/gh/jonmeredith/cassandra/tree/in-jvm-dtest-fixes-v4-3%2E11]
trunk | [Branch|https://github.com/jonmeredith/cassandra/commits/in-jvm-dtest-fixes-v4-trunk] | [CircleCI|https://circleci.com/gh/jonmeredith/cassandra/tree/in-jvm-dtest-fixes-v4-trunk]

Unit tests and in-JVM dtests are passing on 2.2 through 3.11.  There's a failure on trunk for {{org.apache.cassandra.net.ConnectionTest.testCloseIfEndpointDown}}, which I suspect is due to the growth in the powerset of connection options and unrelated to the in-JVM dtest changes.

To document the discussion we had off-ticket:

{quote}
Making {{ResourceLeakTest.doTest}} to be configurable, could also later automatically loop through all in-jvm dtests and run them a dozen times or so to see if leaks are occurring. Perhaps on each loop, we could dump the threads, heap utilisation and files, and check they are not growing? That way the test can become one that actually fails if leaks are detected, and not produce heap dumps etc. unless it is so detected (and perhaps preferably only produce heap dumps if no thread leaks are detected)
{quote}

I agree, that would be nice.  I'd rather tackle that as a separate piece of work under a new ticket (it may make sense to do at the same time as CASSANDRA-15171).  It's painful trying to keep all the variations of this in sync at the moment.
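
As a rough illustration of the looped check proposed in the quote above, here is a minimal sketch that only covers the thread part (not heap utilisation or open files). The class {{ThreadLeakCheck}} and its method are hypothetical names and are not part of {{ResourceLeakTest}} or any of the branches; the {{body}} argument stands in for whatever creates and closes a dtest cluster.

```java
import java.util.HashSet;
import java.util.Set;

public class ThreadLeakCheck {
    /**
     * Runs body the given number of times and returns the threads that were
     * not alive before the first run but are still alive after the last one.
     * An empty result means no new threads survived the loop.
     */
    static Set<Thread> leakedThreads(Runnable body, int iterations) {
        // getAllStackTraces() returns a snapshot keyed by every live thread.
        Set<Thread> before = new HashSet<>(Thread.getAllStackTraces().keySet());
        for (int i = 0; i < iterations; i++) {
            body.run();
        }
        Set<Thread> after = new HashSet<>(Thread.getAllStackTraces().keySet());
        after.removeAll(before);
        return after;
    }
}
```

A real test would probably need to ignore JVM housekeeping threads that can start at any time, and allow a grace period for threads that are already on their way out, before failing on a non-empty result.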

{quote}
IsolatedExecutor not using NamedThreadFactory
{quote}

I added a comment to explain, but using NamedThreadFactory was obscuring some exceptions while debugging, as it sometimes called FastThreadLocal.removeAll() before it was initialized and crashed (although perhaps with the classloader unloading moved it would not be an issue now; I can't remember how to reproduce it).

{quote}
I'm anyway unclear why we are using `CompletableFuture` here, when we return a normal `Future`
{quote}

Good point, fixed up with your suggestion.


> Reduce the time needed to release in-JVM dtest cluster resources after close
> ----------------------------------------------------------------------------
>
>                 Key: CASSANDRA-15170
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15170
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Test/dtest
>            Reporter: Jon Meredith
>            Assignee: Jon Meredith
>            Priority: Normal
>
> There are a few issues that slow the in-JVM dtests from reclaiming metaspace once the cluster is closed.
> IsolatedExecutor issues the shutdown on a SingleExecutorThreadPool; sometimes this thread was still running 10s after the dtest cluster was closed.  Instead, switch to a ThreadPoolExecutor with a core pool size of 0 so that the thread executing the class loader close executes sooner.
> If an OutboundTcpConnection is waiting to connect() and the endpoint is not answering, it has to wait for a timeout before it exits. Instead it should check the isShutdown flag and terminate early if shutdown has been requested.
> In 3.0 and above, HintsCatalog.load uses java.nio.file.Files.list outside of a try-with-resources construct and leaks a file handle for the directory.  This doesn't matter for normal usage, but it leaks a file handle for each dtest Instance created.
> On trunk, Netty global event executor threads are still running and delay GC for the instance class loader.
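
Two of the points in the description above can be sketched with plain JDK calls. This is an illustration under stated assumptions, not the code from the branches: {{ReleaseSketch}}, {{zeroCoreExecutor}} and {{listNames}} are hypothetical names, and the keep-alive of 0 and the maximum pool size of 1 are choices made for the example rather than values taken from the ticket.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class ReleaseSketch {
    /**
     * Executor with a core pool size of 0: no worker is kept around while idle.
     * With a keep-alive of 0 (an assumption for this sketch) the single worker
     * exits as soon as the queue is empty, without waiting for shutdown().
     */
    static ThreadPoolExecutor zeroCoreExecutor() {
        return new ThreadPoolExecutor(0, 1, 0L, TimeUnit.SECONDS,
                                      new LinkedBlockingQueue<>());
    }

    /**
     * Lists the file names in dir. Files.list returns a Stream that holds an
     * open directory handle, so it is closed via try-with-resources.
     */
    static List<String> listNames(Path dir) throws IOException {
        try (Stream<Path> entries = Files.list(dir)) {
            return entries.map(p -> p.getFileName().toString())
                          .sorted()
                          .collect(Collectors.toList());
        }
    }
}
```

With this shape, a task submitted to the executor still runs on a freshly created worker, but that worker does not linger afterwards, and the directory handle is released when the try block exits, even if the stream pipeline throws.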


