Posted to issues@flink.apache.org by "Chesnay Schepler (JIRA)" <ji...@apache.org> on 2018/07/17 10:47:00 UTC

[jira] [Commented] (FLINK-8163) NonHAQueryableStateFsBackendITCase test getting stuck on Travis

    [ https://issues.apache.org/jira/browse/FLINK-8163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16546392#comment-16546392 ] 

Chesnay Schepler commented on FLINK-8163:
-----------------------------------------

This test has some funky retrying logic that ignores most exceptions:
{code:java}
CompletableFuture<S> expected = client.getKvState(jobId, queryName, key, keyTypeInfo, stateDescriptor);
expected.whenCompleteAsync((result, throwable) -> {
   if (throwable != null) {
      if (
            throwable.getCause() instanceof CancellationException ||
            throwable.getCause() instanceof AssertionError ||
            (failForUnknownKeyOrNamespace && throwable.getCause() instanceof UnknownKeyOrNamespaceException)
      ) {
         resultFuture.completeExceptionally(throwable.getCause());
      } else if (deadline.hasTimeLeft()) {
         getKvStateIgnoringCertainExceptions(
               deadline, resultFuture, client, jobId, queryName, key, keyTypeInfo,
               stateDescriptor, failForUnknownKeyOrNamespace, executor);
      }
   } else {
      resultFuture.complete(result);
   }
}, executor);{code}

When running the test locally in a loop, this exception was logged on the server:
{code}
org.apache.flink.shaded.netty4.io.netty.util.internal.OutOfDirectMemoryError: failed to allocate 16777216 byte(s) of direct memory (used: 2147483648, max: 2147483648)
{code}
The client ignores this error and retries the operation indefinitely, which causes the timeout. Incidentally, the same exception is printed on every subsequent attempt.
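
For illustration, here is a minimal sketch of a retry loop that avoids this failure mode. It is not the test's actual code and not necessarily the fix that should go in; {{RetrySketch}}, {{retryUntil}}, {{retryable}} and {{failed}} are hypothetical names, and a plain {{System.nanoTime()}} deadline stands in for the test's deadline object. The idea is to retry only exceptions that are explicitly marked as transient, to surface everything else immediately, and to fail the result future once the deadline has passed so that the caller never waits on a future that nobody completes.

```java
import java.io.IOException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Predicate;
import java.util.function.Supplier;

// Hypothetical sketch, not the Flink test code: retry a query until a deadline,
// but make sure the result future is completed on every code path.
final class RetrySketch {

    // CompletableFuture may wrap failures in CompletionException; look through it.
    static Throwable unwrap(Throwable t) {
        return (t instanceof CompletionException && t.getCause() != null) ? t.getCause() : t;
    }

    static <T> void retryUntil(
            long deadlineNanos,
            Supplier<CompletableFuture<T>> query,
            Predicate<Throwable> retryable,
            CompletableFuture<T> resultFuture) {
        // Runs the callback on the common pool, so retries do not grow the call stack.
        query.get().whenCompleteAsync((result, throwable) -> {
            if (throwable == null) {
                resultFuture.complete(result);
                return;
            }
            Throwable cause = unwrap(throwable);
            if (!retryable.test(cause)) {
                // Unknown or fatal failure: surface it instead of retrying.
                resultFuture.completeExceptionally(cause);
            } else if (System.nanoTime() - deadlineNanos < 0) {
                // Known transient failure and time left: try again.
                retryUntil(deadlineNanos, query, retryable, resultFuture);
            } else {
                // Deadline exhausted: fail explicitly so callers never wait forever.
                TimeoutException timeout = new TimeoutException("deadline exceeded while retrying");
                timeout.initCause(cause);
                resultFuture.completeExceptionally(timeout);
            }
        });
    }

    // Helper for the demo: a future that has already failed with the given error.
    static <T> CompletableFuture<T> failed(Throwable t) {
        CompletableFuture<T> future = new CompletableFuture<>();
        future.completeExceptionally(t);
        return future;
    }

    public static void main(String[] args) {
        AtomicInteger attempts = new AtomicInteger();
        CompletableFuture<String> result = new CompletableFuture<>();
        retryUntil(
                System.nanoTime() + 5_000_000_000L, // 5 second deadline
                () -> {
                    attempts.incrementAndGet();
                    // Stand-in for a server-side error such as the one logged above.
                    return RetrySketch.<String>failed(
                            new IllegalStateException("simulated fatal server error"));
                },
                t -> t instanceof IOException, // only I/O errors count as transient here
                result);
        try {
            result.join();
        } catch (CompletionException e) {
            System.out.println("attempts=" + attempts.get() + ", failure=" + e.getCause());
        }
    }
}
```

With an allow-list of retryable exceptions, an unexpected server-side error fails the query on the first attempt instead of being retried until the build times out. Which exceptions actually belong on that list for the queryable state client is a separate question that this sketch does not answer.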

> NonHAQueryableStateFsBackendITCase test getting stuck on Travis
> ---------------------------------------------------------------
>
>                 Key: FLINK-8163
>                 URL: https://issues.apache.org/jira/browse/FLINK-8163
>             Project: Flink
>          Issue Type: Bug
>          Components: Queryable State, Tests
>    Affects Versions: 1.5.0
>            Reporter: Till Rohrmann
>            Assignee: Chesnay Schepler
>            Priority: Critical
>              Labels: test-stability
>
> The {{NonHAQueryableStateFsBackendITCase}} test seems to get stuck on Travis, producing no output for 300s.
> https://travis-ci.org/tillrohrmann/flink/jobs/307988209


