Posted to issues@ignite.apache.org by "Roman Puchkovskiy (Jira)" <ji...@apache.org> on 2022/10/20 12:07:00 UTC

[jira] [Created] (IGNITE-17944) Logging storm on server node failure/disconnect during loading through a DataStreamers

Roman Puchkovskiy created IGNITE-17944:
------------------------------------------

             Summary: Logging storm on server node failure/disconnect during loading through a DataStreamers
                 Key: IGNITE-17944
                 URL: https://issues.apache.org/jira/browse/IGNITE-17944
             Project: Ignite
          Issue Type: Bug
            Reporter: Roman Puchkovskiy


This can be reproduced in RebalanceIteratorLargeEntriesOOMTest by changing #additionalRemoteJvmArgs() to return the following:

return Arrays.asList("-Xmx128m", "-Xms128m", "-XX:+HeapDumpOnOutOfMemoryError", "-XX:+CrashOnOutOfMemoryError");

On my machine, when the remote node crashes, the machine becomes barely responsive because all cores spike to 100% load.

A LOT of logging happens during that phase. Probably many DataStreamer Buffers get cancelled, so many futures get cancelled; cancellation is done by completing a future with an exception, and each such completion causes logging.
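
To illustrate the suspected mechanism, here is a minimal sketch that does not use Ignite at all. It assumes each pending future has a completion listener that logs on failure, and it uses the JDK's CompletableFuture as a stand-in for the DataStreamer's internal futures. The class name LoggingStormSketch and the methods logError and cancelAll are hypothetical, and the log call is replaced by a counter so that the sketch stays quiet.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CancellationException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;

public class LoggingStormSketch {
    /** Counts log calls instead of writing them (stand-in for a real logger). */
    static final AtomicInteger logCalls = new AtomicInteger();

    /** Hypothetical logging hook; a real logger would also format the stack trace. */
    static void logError(String msg, Throwable err) {
        logCalls.incrementAndGet();
    }

    /** Creates pending futures, fails them all, and returns the number of log calls. */
    static int cancelAll(int pendingFutures) {
        logCalls.set(0);

        List<CompletableFuture<Void>> futs = new ArrayList<>();

        for (int i = 0; i < pendingFutures; i++) {
            CompletableFuture<Void> fut = new CompletableFuture<>();

            // Assumption: every future has a listener that logs on failure.
            fut.whenComplete((res, err) -> {
                if (err != null)
                    logError("Failed to process batch", err);
            });

            futs.add(fut);
        }

        // "Cancellation" modelled as completing each future with an exception.
        for (CompletableFuture<Void> fut : futs)
            fut.completeExceptionally(new CancellationException("Node left the topology"));

        return logCalls.get();
    }

    public static void main(String[] args) {
        System.out.println("Log calls: " + cancelAll(10_000));
    }
}
```

Under these assumptions the number of log calls grows linearly with the number of pending futures, which would be consistent with the CPU load described above if each call also renders a stack trace.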

It seems that the same thing might happen in production if a node suddenly crashes in the middle of data loading by a client.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)