Posted to dev@hbase.apache.org by "stack (JIRA)" <ji...@apache.org> on 2018/05/02 16:52:03 UTC

[jira] [Created] (HBASE-20519) [Chaos] Add more chaos options

stack created HBASE-20519:
-----------------------------

Summary: [Chaos] Add more chaos options
Key: HBASE-20519
URL: https://issues.apache.org/jira/browse/HBASE-20519
Project: HBase
Issue Type: Umbrella
Components: integration tests
Reporter: stack

Our Chaos menu is "drawing room polite" given the variety of failures available out in the wild world of deploys.

Other possible items to add (could do as subtasks of this umbrella) taken from a recent [interesting read on how TiDB does its chaos|https://thenewstack.io/chaos-tools-and-techniques-for-testing-the-tidb-distributed-newsql-database/]:

* Send SIGSTOP to hang or SIGCONT to resume the process.
* Use `renice` to adjust the process priority or use `setpriority` for the threads of the process.
* Max out the CPU.
* Use `iptables` or `tc` to drop, reject, or delay network packets.
* Use `tc` to reorder network packets and use a proxy to reorder the gRPC requests.
* Use `iperf` to take all network throughput.
* Use `libfuse` to mount a file system and do the I/O fault injection.
* Link `libfiu` to do the I/O fault injection.
* Use `rm -rf` to forcibly remove all data.
* Use `echo 0 > file` to damage a file.
* Copy a huge file to create the `NoSpace` problem.
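
As a rough illustration of two of the items above (hanging a process with SIGSTOP/SIGCONT and delaying packets with `tc`), here is a minimal Python sketch. It is not HBase's chaos monkey API; the function names (`pause_process`, `resume_process`, `hang_for`, `netem_delay_command`) are hypothetical, and the `tc` command is only built, not executed, since running it needs root and a real interface name.

```python
import os
import signal
import time
from typing import List


def pause_process(pid: int) -> None:
    """Suspend a process with SIGSTOP, which cannot be caught or ignored."""
    os.kill(pid, signal.SIGSTOP)


def resume_process(pid: int) -> None:
    """Let a stopped process continue with SIGCONT."""
    os.kill(pid, signal.SIGCONT)


def hang_for(pid: int, seconds: float) -> None:
    """Hang a process for a while, always resuming it afterwards."""
    pause_process(pid)
    try:
        time.sleep(seconds)
    finally:
        resume_process(pid)


def netem_delay_command(interface: str, delay_ms: int) -> List[str]:
    """Build (but do not run) a tc/netem command that delays outgoing packets.

    The interface name is an assumption supplied by the caller.
    """
    return ["tc", "qdisc", "add", "dev", interface,
            "root", "netem", "delay", f"{delay_ms}ms"]
```

A chaos action could call `hang_for(regionserver_pid, 30)` to simulate a long pause, where `regionserver_pid` is whatever PID the test harness has looked up. The `try`/`finally` is there so a failed test run does not leave the process stopped.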

The article includes other interesting possibilities: exploiting the kernel's fault injection mechanism or scripting systemtap to mess with nodes. It also describes how they automate their chaos-making.
