Posted to dev@geode.apache.org by Alberto Gomez <al...@est.tech> on 2021/10/26 07:58:57 UTC

Test failures on Windows with insufficient memory for the JRE while running distributed tests

Hi,

I am having issues with insufficient memory for the Java Runtime Environment when running some tests on the CI under Windows from the following PR:
https://github.com/apache/geode/pull/7006

The tests never fail under Linux.

This is the error I get for some VMs:

[vm4] # There is insufficient memory for the Java Runtime Environment to continue.
[vm4] # Native memory allocation (malloc) failed to allocate 32744 bytes for ChunkPool::allocate

I have reduced the amount of resources the tests originally used, but I still cannot get a clean execution.

I do not know whether it is a matter of changing the parameters for the Windows execution in ci/pipelines/shared/jinja.variables.yml or whether there is anything else to consider.

I would appreciate it if someone from the community could help me troubleshoot this issue.

Thanks in advance,

Alberto

Re: Test failures on Windows with insufficient memory for the JRE while running distributed tests

Posted by Dale Emery <de...@vmware.com>.

> *Do the Gfsh distributed tests on Windows leave behind more artifacts on
> the hard drive than other test targets?*

On Linux, the artifact file for a full distributed test run is ~750 MB.

On Windows, the artifact file for just the gfsh distributed tests is ~1 GB.

> *Are we running the Gfsh distributed tests in parallel (which might
> exacerbate hard drive swapping or memory consumption)?*

On Linux, the full distributed test suite executes as many as 24 test classes in parallel (each in its own test JVM).

On Windows, the gfsh distributed tests do not currently execute in parallel.

I don’t know the answers to the other questions.

Dale

Re: Test failures on Windows with insufficient memory for the JRE while running distributed tests

Posted by Alberto Gomez <al...@est.tech>.

Thanks, Kirk.

Could any expert on the OS images and pipeline jump in to answer Kirk's questions and help?

Thanks,

Alberto

Re: Test failures on Windows with insufficient memory for the JRE while running distributed tests

Posted by Kirk Lund <kl...@apache.org>.

PS: I should also mention that the *windows-gfsh-distributed* test target
is only run on Windows (never on Linux). It might be useful to try getting
windows-gfsh-distributed running on Linux to see if it hits the same issue
on that OS. This would also require some help from a pipeline expert.

Re: Test failures on Windows with insufficient memory for the JRE while running distributed tests

Posted by Kirk Lund <kl...@apache.org>.

Hi Alberto,

32 KB is a very small amount of memory, so I don't think it's related to
the Java heap. Based on what little I've read today, I think a failure in
ChunkPool::allocate is probably related to either *running out of swap
space or running out of address space in a 32-bit JVM*. Since the failures
are OS-specific, I would suspect the machine image we use for Windows to be
involved.
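
One way to check the swap-space hypothesis above would be to log the machine's physical memory and swap from inside a JVM on the Windows worker. The following is only a minimal sketch, not code from the Geode repository: the class name NativeMemoryReport is hypothetical, and it assumes a HotSpot/OpenJDK JVM where the platform bean can be cast to com.sun.management.OperatingSystemMXBean.

```java
import java.lang.management.ManagementFactory;

public class NativeMemoryReport {

    // Converts a byte count to whole mebibytes for readable log output.
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    // Builds a one-line summary of physical memory and swap as reported by the OS.
    // Assumption: the JVM exposes com.sun.management.OperatingSystemMXBean
    // (true for HotSpot/OpenJDK; other JVMs may throw ClassCastException here).
    static String describe() {
        com.sun.management.OperatingSystemMXBean os =
            (com.sun.management.OperatingSystemMXBean)
                ManagementFactory.getOperatingSystemMXBean();
        return "physical free/total MiB: "
            + toMiB(os.getFreePhysicalMemorySize()) + "/"
            + toMiB(os.getTotalPhysicalMemorySize())
            + ", swap free/total MiB: "
            + toMiB(os.getFreeSwapSpaceSize()) + "/"
            + toMiB(os.getTotalSwapSpaceSize());
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

If the free swap figure is close to zero shortly before a failure, that would point toward the swap-space explanation rather than the Java heap; on newer JDKs the physical-memory getters are deprecated but still present.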

I also notice that this ChunkPool::allocate failure is only occurring for
the Gfsh distributed tests, which is the only job run on Windows that uses
Gradle support for *JUnit Categories*. The Gradle target is distributedTest,
which we have configured with "*forkEvery 1*" so that every test class
launches in a new JVM. Gradle implements JUnit 4 Category filtering by
launching every test class to check the Categories and then either
executing the tests or terminating without running any, depending on the
Categories.

Some things I would check (or ask others about):

*Is the hard drive space much smaller than what's available to the JVM(s) on
Linux?*

*Do the Gfsh distributed tests on Windows leave behind more artifacts on
the hard drive than other test targets?*

*Is it possible that the tests are using a 32-bit JVM on Windows? Or maybe
the tests are spawning Gfsh process(es) using a 32-bit JVM instead of
64-bit?*

*Are we running the Gfsh distributed tests in parallel (which might
exacerbate hard drive swapping or memory consumption)?*
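
The 32-bit and disk-space questions above could be answered by printing a few facts from the JVM that actually runs the tests. This is a hedged sketch rather than anything from the Geode build: the class name JvmEnvironmentReport is hypothetical, and sun.arch.data.model is a HotSpot-specific system property that may be absent on other JVMs, in which case the sketch falls back to a rough guess from os.arch.

```java
import java.io.File;

public class JvmEnvironmentReport {

    // Returns 32 or 64 when the data-model property says so. Otherwise returns
    // 64 if the architecture name contains "64" (e.g. amd64, aarch64), or -1
    // when the bitness cannot be determined from these two values.
    static int bitness(String dataModel, String osArch) {
        if ("32".equals(dataModel) || "64".equals(dataModel)) {
            return Integer.parseInt(dataModel);
        }
        if (osArch != null && osArch.contains("64")) {
            return 64;
        }
        return -1;
    }

    public static void main(String[] args) {
        int bits = bitness(System.getProperty("sun.arch.data.model"),
                           System.getProperty("os.arch"));
        File workDir = new File(System.getProperty("user.dir"));
        long usableMiB = workDir.getUsableSpace() / (1024L * 1024L);

        System.out.println("JVM bitness: " + bits);
        System.out.println("java.home: " + System.getProperty("java.home"));
        System.out.println("usable disk MiB in " + workDir + ": " + usableMiB);
    }
}
```

Because Gfsh processes spawned by a test may use a different Java installation than the test JVM, the same check would need to run inside those processes as well to rule out a 32-bit JVM there.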

Unfortunately, I don't know what most of the options in jinja.variables.yml
are about. I think it would be best to get help from an expert in the OS
images and pipeline details.

Cheers,
Kirk
