Posted to issues@yunikorn.apache.org by "Craig Condit (Jira)" <ji...@apache.org> on 2022/12/22 15:32:00 UTC

[jira] [Closed] (YUNIKORN-1483) Overhaul periodic state dump support

     [ https://issues.apache.org/jira/browse/YUNIKORN-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Craig Condit closed YUNIKORN-1483.
----------------------------------
      Assignee: Craig Condit
    Resolution: Won't Fix

Closed in favor of YUNIKORN-1500.

> Overhaul periodic state dump support
> ------------------------------------
>
>                 Key: YUNIKORN-1483
>                 URL: https://issues.apache.org/jira/browse/YUNIKORN-1483
>             Project: Apache YuniKorn
>          Issue Type: Improvement
>          Components: core - scheduler, release, shim - kubernetes
>            Reporter: Craig Condit
>            Assignee: Craig Condit
>            Priority: Major
>
> The current support for generating periodic state dumps implemented in YUNIKORN-940 has several warts:
>  # The configuration in YUNIKORN-949 is done via the core scheduler configuration, leading to a random option on partitions which doesn't belong there and has nothing to do with scheduling.
>  # Changing the frequency of the state dumps is done via the unsecured REST API. This is a potential denial-of-service vector.
>  # Configuration V2 is now complete, which standardizes on using a ConfigMap to configure all YuniKorn options that make sense to be reconfigured. However, allowing the location to be changed at runtime makes no sense in a containerized environment.
>  # Retrieving the state dumps requires mounting of external storage. This is necessarily a site-specific configuration and currently requires a custom Helm deployment.
>  # The state dumps, though JSON, are emitted as text files with JSON appended to them, making parsing difficult.
> To address these issues:
>  # Deprecate existing REST API configuration for frequency, and make it a no-op now for security reasons. We can remove it completely in 2.0.
>  # Deprecate the statedumpfilepath option on partitions. Ignore it for security reasons now (and warn if found), and remove completely in 2.0.
>  # Disable the feature by default. To enable it, we should require setting a specific environment variable:
>  ** YUNIKORN_STATE_DUMP_LOCATION=/path/to/dir : This would be required to enable the feature at all. Making it an env var makes sense as it is not an option that should be reconfigured (or even visible) in configuration.
>  # Via the ConfigMap, we should allow the feature to be enabled / disabled and its frequency set. These options would have no effect if YUNIKORN_STATE_DUMP_LOCATION is not defined:
>  ** periodicStateDump.enabled: "true" | "false" (default "false")
>  ** periodicStateDump.frequency: "15m" (default value, do not allow more frequently than 1m intervals)
>  ** periodicStateDump.count: 10 (default value)
>  # Create an empty directory /yunikorn-state in the Docker image to store state dumps.
>  # Add support to Helm for enabling state dump support as well as setting custom mount options (including quota). Enabling support should set the env var YUNIKORN_STATE_DUMP_LOCATION=/yunikorn-state and mount this directory via the options specified.
>  # Output a single JSON file per dump, named yunikorn-state-dump-YYYYMMDD-HHMM.json, and remove the oldest files until at most periodicStateDump.count remain.
>  
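
The proposed gating, frequency floor, and retention rules can be sketched roughly as follows. This is an illustration only, written in Python for brevity (YuniKorn itself is written in Go), and it is not the project's implementation. The function names are hypothetical, and two behaviors are assumptions the proposal does not specify: a frequency below 1m is clamped to 1m rather than rejected, and an unparseable frequency falls back to the 15m default.

```python
import re
from datetime import datetime, timedelta

# Defaults and limits taken from the proposal text.
MIN_FREQUENCY = timedelta(minutes=1)
DEFAULT_FREQUENCY = timedelta(minutes=15)
DEFAULT_COUNT = 10
_UNITS = {"s": "seconds", "m": "minutes", "h": "hours"}
_DUMP_PATTERN = re.compile(r"yunikorn-state-dump-\d{8}-\d{4}\.json")


def parse_frequency(text):
    """Parse a single-unit duration such as "15m".

    Assumption: values under one minute are clamped to one minute, and
    unparseable values fall back to the default.
    """
    match = re.fullmatch(r"(\d+)([smh])", text.strip())
    if match is None:
        return DEFAULT_FREQUENCY
    value = timedelta(**{_UNITS[match.group(2)]: int(match.group(1))})
    return max(value, MIN_FREQUENCY)


def resolve_settings(env, config):
    """Combine the environment variable with the ConfigMap options.

    The ConfigMap options have no effect unless
    YUNIKORN_STATE_DUMP_LOCATION is set.
    """
    location = env.get("YUNIKORN_STATE_DUMP_LOCATION", "")
    wanted = config.get("periodicStateDump.enabled", "false") == "true"
    return {
        "enabled": bool(location) and wanted,
        "location": location,
        "frequency": parse_frequency(
            config.get("periodicStateDump.frequency", "15m")),
        "count": int(config.get("periodicStateDump.count", DEFAULT_COUNT)),
    }


def dump_name(now):
    """Build the file name yunikorn-state-dump-YYYYMMDD-HHMM.json."""
    return now.strftime("yunikorn-state-dump-%Y%m%d-%H%M.json")


def files_to_remove(names, count):
    """Return the oldest dump files to delete so at most `count` remain.

    The fixed-width timestamp makes lexicographic order chronological.
    Files that do not match the dump pattern are never selected.
    """
    dumps = sorted(n for n in names if _DUMP_PATTERN.fullmatch(n))
    excess = len(dumps) - max(count, 0)
    return dumps[:excess] if excess > 0 else []
```

In a real process, `resolve_settings` would be called with `os.environ` and the ConfigMap data, and `files_to_remove` with the listing of the dump directory after each new dump is written.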



--
This message was sent by Atlassian Jira
(v8.20.10#820010)