Posted to issues@flink.apache.org by GitBox <gi...@apache.org> on 2018/11/08 14:20:23 UTC

[GitHub] zentol opened a new pull request #7062: [FLINK-10825][tests] Increase request-backoff for high-parallelism e2e test

URL: https://github.com/apache/flink/pull/7062
 
 
   ## What is the purpose of the change
   
   This PR stabilizes the high-parallelism iterations e2e test.
   
   When a task starts running, it requests data (partitions) from other tasks. If a request times out, it is retried with a backoff until the maximum backoff (`taskmanager.network.request-backoff.max`) is reached.
   Once the maximum is reached, a `PartitionNotFoundException` is thrown, which fails the job, as reported in the JIRA.
   
   If a job is not fully deployed within the time it takes a single task to reach the maximum backoff, this exception is quite likely to occur.
   
   This PR bumps the maximum backoff to 60 seconds, which should give the job more time to fully deploy.
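
To give a rough sense of how much extra deployment time a larger maximum buys, here is a minimal sketch of the retry schedule. It is not Flink's implementation. It assumes the backoff doubles on every retry and is capped at the maximum, and it assumes an initial backoff of 100 ms and a previous maximum of 10 seconds; none of these values are stated in this PR. `backoff_schedule` is a hypothetical helper written for illustration.

```python
def backoff_schedule(initial_ms, max_ms):
    """Return the successive waits in milliseconds.

    Assumption: the backoff doubles after each failed request and is
    capped at max_ms; the wait at max_ms is the last one before the
    request gives up.
    """
    waits = []
    current = initial_ms
    while current < max_ms:
        waits.append(current)
        current = min(current * 2, max_ms)
    waits.append(max_ms)
    return waits


# Assumed values: 100 ms initial backoff, 10 s previous maximum.
old = backoff_schedule(100, 10_000)
# 60 s is the new maximum named in this PR.
new = backoff_schedule(100, 60_000)

print(sum(old))  # 22700 ms of total waiting before giving up
print(sum(new))  # 162300 ms of total waiting before giving up
```

Under these assumptions the total time a task keeps retrying grows from roughly 23 seconds to roughly 162 seconds, which is the window in which the remaining tasks have to be deployed.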
   
