Posted to issues@flink.apache.org by "Matthias Pohl (Jira)" <ji...@apache.org> on 2023/03/01 09:03:00 UTC

[jira] [Commented] (FLINK-31278) exit code 137 (i.e. OutOfMemoryError) in core module

    [ https://issues.apache.org/jira/browse/FLINK-31278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17694943#comment-17694943 ] 

Matthias Pohl commented on FLINK-31278:
---------------------------------------

No heap dump is available because the upload step failed. Based on the Maven output, I extracted the tests that were running when the error happened:
{code}
$ grep -e " Tests run: " -e "\[INFO\] Running" 20230301.3.txt | grep -o "org.apache.flink.[a-zA-Z\.]*" | sort | uniq -c | sort -n | head -5
      1 org.apache.flink.runtime.dispatcher.MemoryExecutionGraphInfoStoreTest
      1 org.apache.flink.runtime.io.disk.ChannelViewsTest
      1 org.apache.flink.runtime.io.disk.FileChannelManagerImplTest
      1 org.apache.flink.runtime.io.disk.iomanager.AsynchronousFileIOChannelTest
      2 org.apache.flink.api.common.accumulators.AverageAccumulatorTest
{code}
That said, these tests are not necessarily the cause.
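
The same filter can be made a bit more explicit. The following is only a sketch, assuming the usual Surefire log format in which every test class produces one "Running" line and, once it finishes, one "Tests run:" line. The function name unfinished_tests is hypothetical and not part of the Flink tooling. It prints only the classes that occur exactly once, i.e. those that were started but never reported a result:

```shell
# Hypothetical helper: list test classes that appear exactly once in the
# Maven output (a "Running" line without a matching "Tests run:" line).
# Assumes class names contain only letters and dots, like the original
# pattern; names with digits or '$' would be truncated by the match.
unfinished_tests() {
  grep -e " Tests run: " -e "\[INFO\] Running" "$1" \
    | grep -o "org\.apache\.flink\.[a-zA-Z.]*" \
    | sort \
    | uniq -c \
    | awk '$1 == 1 { print $2 }'
}
```

It would be called as unfinished_tests 20230301.3.txt. Compared to "sort -n | head -5", the awk filter does not cut the list off at five entries and does not include classes that completed normally.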

> exit code 137 (i.e. OutOfMemoryError) in core module
> ----------------------------------------------------
>
>                 Key: FLINK-31278
>                 URL: https://issues.apache.org/jira/browse/FLINK-31278
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Coordination
>    Affects Versions: 1.17.0
>            Reporter: Matthias Pohl
>            Priority: Blocker
>              Labels: test-stability
>
> The following build failed due to a 137 exit code indicating an OutOfMemoryError:
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=46643&view=logs&j=77a9d8e1-d610-59b3-fc2a-4766541e0e33&t=125e07e7-8de0-5c6c-a541-a567415af3ef&l=7847
> {code}
> [...]
> Mar 01 05:29:06 [INFO] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.65 s - in org.apache.flink.runtime.io.compression.BlockCompressionTest
> Mar 01 05:29:06 [INFO] Running org.apache.flink.runtime.dispatcher.DispatcherCachedOperationsHandlerTest
> Mar 01 05:29:07 [INFO] Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.142 s - in org.apache.flink.runtime.dispatcher.DispatcherCachedOperationsHandlerTest
> Mar 01 05:29:08 [INFO] Running org.apache.flink.runtime.dispatcher.MemoryExecutionGraphInfoStoreTest
> ##[error]Exit code 137 returned from process: file name '/usr/bin/docker', arguments 'exec -i -u 1001  -w /home/vsts_azpcontainer 5953b171e8ed4caba7af2b326533e249211ed4dcc48640edb3c1b0cbbcdf1a21 /__a/externals/node/bin/node /__w/_temp/containerHandlerInvoker.js'.
> Finishing: Test - core
> {code}
> This build ran on an Azure pipeline machine (Azure Pipelines 9) and therefore cannot be caused by FLINK-18356. That said, a concurrent build on agent "Azure Pipelines 21" also failed with exit code 137 about 10 minutes later (see [20230301.3|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=46643&view=logs&j=77a9d8e1-d610-59b3-fc2a-4766541e0e33&t=125e07e7-8de0-5c6c-a541-a567415af3ef&l=7847]).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)