Posted to issues@flink.apache.org by GitBox <gi...@apache.org> on 2019/09/24 07:45:11 UTC

[GitHub] [flink] wsry opened a new pull request #9753: [FLINK-14180][runtime]Enable config of maximum capacity of FileArchivedExecutionGraphStore.

URL: https://github.com/apache/flink/pull/9753
 
 
   ## What is the purpose of the change
   The purpose of this PR is to make the maximum capacity of the FileArchivedExecutionGraphStore configurable. Currently, a Flink session cluster uses the FileArchivedExecutionGraphStore to keep finished jobs for historical requests, and the store purges archived ExecutionGraphs only after an expiration time. On a session cluster that runs many batch jobs, jobstore.expiration-time is hard to tune: if it is configured too short, the historical information may already be deleted by the time the user wants to check it; if it is configured too long, the web frontend may respond very slowly once the number of finished jobs grows large. With a configurable maximum capacity, the expiration time can be set to a relatively long value while the capacity is set to a value that keeps the web UI responsive and does not consume too much memory.
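   The intended interplay of the two options can be sketched in flink-conf.yaml as follows (jobstore.expiration-time is an existing Flink option; jobstore.max-capacity is the new option this PR proposes, and the values are illustrative):

   ```yaml
   # flink-conf.yaml (sketch)
   jobstore.expiration-time: 86400   # keep finished jobs for up to one day...
   jobstore.max-capacity: 500        # ...but never retain more than 500 of them
   ```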
   
   ## Brief change log
   
     - A new config option *jobstore.max-capacity* was added. The default value is *Integer.MAX_VALUE*, which means the capacity of the job store is not limited, keeping the default behavior unchanged.
     - The maximum capacity of the job detail cache is set through the *maximumSize* method of the Guava cache.
     - A test case was added to verify the change.
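   The eviction behavior that Guava's *maximumSize* provides can be illustrated with a stdlib-only sketch: a size-bounded, insertion-ordered map that drops the oldest entry once the capacity is exceeded. The class and its names are hypothetical illustrations, not Flink's actual implementation (which delegates to a Guava cache):

   ```java
   import java.util.LinkedHashMap;
   import java.util.Map;

   /**
    * Minimal sketch of a size-bounded store, analogous to the maximumSize bound
    * this PR sets on the job detail cache. Illustrative only, not Flink code.
    */
   public class BoundedJobStore<K, V> extends LinkedHashMap<K, V> {
       private final int maxCapacity;

       public BoundedJobStore(int maxCapacity) {
           // accessOrder = false keeps insertion order, so the oldest
           // archived entry is the first eviction candidate
           super(16, 0.75f, false);
           this.maxCapacity = maxCapacity;
       }

       @Override
       protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
           // Evict the oldest entry once the configured capacity is exceeded
           return size() > maxCapacity;
       }
   }
   ```

   With such a bound in place, adding more entries than the capacity silently purges the oldest ones, which matches the behavior described for the new config option.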
   
   
   ## Verifying this change
   
   This change added tests and can be verified as follows:
   
     - A unit test was added that puts more finished ExecutionGraphs into the FileArchivedExecutionGraphStore than the configured capacity and verifies that the size of the store never exceeds the configured maximum capacity, and that the oldest execution graphs are purged once the total number of added execution graphs exceeds it.
   
   ## Does this pull request potentially affect one of the following parts:
   
     - Dependencies (does it add or upgrade a dependency): (yes / **no**)
     - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (yes / **no**)
     - The serializers: (yes / **no** / don't know)
     - The runtime per-record code paths (performance sensitive): (yes / **no** / don't know)
     - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes / **no** / don't know)
     - The S3 file system connector: (yes / **no** / don't know)
   
   ## Documentation
   
     - Does this pull request introduce a new feature? (yes / **no**)
     - If yes, how is the feature documented? (not applicable / **docs** / JavaDocs / not documented)
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services