Posted to issues@flink.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2018/11/01 15:54:00 UTC

[jira] [Commented] (FLINK-10720) Add stress deployment end-to-end test

    [ https://issues.apache.org/jira/browse/FLINK-10720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16671791#comment-16671791 ] 

ASF GitHub Bot commented on FLINK-10720:
----------------------------------------

StefanRRichter opened a new pull request #6994: [FLINK-10720][tests] Add deployment end-to-end stress test with many …
URL: https://github.com/apache/flink/pull/6994
 
 
   …inflated task deployment descriptors
   
   ## What is the purpose of the change
   
   This PR provides a nightly end-to-end test that simulates a heavy deployment: many tasks with large task deployment descriptors. It is a stress test for the deployment process that creates, serializes, and sends the deployment descriptors.
   
   To create inflated deployment descriptors, the job uses many union operator states, each with many partitions. On deployment, they are replicated to all tasks. For example, take 100 union states, each with 50 partitions, and a parallelism of 256; the metadata for each partition is at least the long that represents the partition offset. The estimated deployment size in a recovery then amounts to more than 100 (states) x 50 (partitions) x 256 x 256 (parallelism squared) x 8 (size of long) ~ 2.6 x 10^9 bytes (about 2.4 GiB).
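
The estimate above can be reproduced with a few lines of Java. This is only a back-of-the-envelope sketch: the class and method names are hypothetical and not part of the PR, and it counts just the 8-byte offset per partition, so it is a lower bound on the real descriptor size.

```java
/** Hypothetical helper (not part of the PR) that reproduces the size estimate. */
final class DeploymentSizeEstimate {

    /**
     * Lower bound in bytes for the union-state metadata shipped in a deployment.
     * Each of the `parallelism` tasks receives the partition metadata written by
     * all `parallelism` subtasks, hence the parallelism-squared factor.
     */
    static long estimateBytes(long states, long partitionsPerState,
                              long parallelism, long bytesPerPartition) {
        return states * partitionsPerState * parallelism * parallelism * bytesPerPartition;
    }

    public static void main(String[] args) {
        // Numbers from the example: 100 states, 50 partitions, parallelism 256,
        // one long (8 bytes) of offset metadata per partition.
        long bytes = estimateBytes(100, 50, 256, Long.BYTES);
        double gib = bytes / (double) (1L << 30);
        System.out.println(bytes + " bytes, roughly " + gib + " GiB");
    }
}
```

With the numbers from the example this comes to 2,621,440,000 bytes. Because parallelism enters quadratically, doubling it quadruples the estimate.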

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


> Add stress deployment end-to-end test
> -------------------------------------
>
>                 Key: FLINK-10720
>                 URL: https://issues.apache.org/jira/browse/FLINK-10720
>             Project: Flink
>          Issue Type: Sub-task
>          Components: E2E Tests
>    Affects Versions: 1.7.0
>            Reporter: Till Rohrmann
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.7.0
>
>
> In order to test Flink's scalability, I suggest adding an end-to-end test that deploys a very demanding job. The job should have large {{TaskDeploymentDescriptors}} (e.g. a job using union state or having a high degree of parallelism). That way we can test that the serialization overhead of the TDDs does not affect the health of the cluster (e.g. heartbeats are not affected because the serialization does not happen in the main thread).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)