Posted to yarn-issues@hadoop.apache.org by "Steve Loughran (JIRA)" <ji...@apache.org> on 2015/03/12 01:58:38 UTC
[jira] [Created] (YARN-3337) Provide YARN chaos monkey
Steve Loughran created YARN-3337:
------------------------------------
Summary: Provide YARN chaos monkey
Key: YARN-3337
URL: https://issues.apache.org/jira/browse/YARN-3337
Project: Hadoop YARN
Issue Type: New Feature
Components: test
Affects Versions: 2.7.0
Reporter: Steve Loughran
Testing failure resilience today requires either custom scripts or Chaos Monkey-like logic implemented in your own application (SLIDER-202).
The core activity is killing AMs and containers on a schedule and with a configured probability, which a CLI application or client library could handle.
# entry point to have a startup delay before acting
# frequency of chaos wakeup/polling
# probability of AM failure generation (0-100)
# probability of non-AM container kill
# future: other operations
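
As a rough illustration of how the options above might fit together, here is a minimal Python sketch of the wakeup loop. It is not part of the proposal and uses no YARN API: `ChaosConfig`, `should_fire`, `run_chaos`, and the `kill_am` / `kill_container` callbacks are all hypothetical names, and the callbacks stand in for whatever mechanism a real tool would use to kill an AM or a container.

```python
import random
import time
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class ChaosConfig:
    # Hypothetical field names mirroring the options listed in the issue.
    startup_delay_s: float        # delay before the first chaos action
    poll_interval_s: float        # time between chaos wakeups
    am_kill_percent: int          # chance (0-100) of an AM kill per wakeup
    container_kill_percent: int   # chance (0-100) of a non-AM container kill per wakeup


def should_fire(percent: int, rng: random.Random) -> bool:
    """Return True with the given probability, expressed as 0-100."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be in the range 0-100")
    # rng.random() is in [0, 1), so 0 never fires and 100 always fires.
    return rng.random() * 100 < percent


def run_chaos(
    cfg: ChaosConfig,
    kill_am: Callable[[], None],
    kill_container: Callable[[], None],
    wakeups: int,
    rng: Optional[random.Random] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Run a fixed number of chaos wakeups and return the actions taken."""
    rng = rng or random.Random()
    actions: List[str] = []
    sleep(cfg.startup_delay_s)  # startup delay before acting
    for i in range(wakeups):
        if i > 0:
            sleep(cfg.poll_interval_s)  # wait between wakeups
        if should_fire(cfg.am_kill_percent, rng):
            kill_am()  # placeholder for a real AM kill
            actions.append("am")
        if should_fire(cfg.container_kill_percent, rng):
            kill_container()  # placeholder for a real container kill
            actions.append("container")
    return actions
```

Passing `sleep` and `rng` as parameters is only a convenience for testing the loop without real delays or nondeterminism; a real implementation would presumably run until stopped rather than for a fixed number of wakeups.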
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)