Posted to common-dev@hadoop.apache.org by "Konstantin Boudnik (JIRA)" <ji...@apache.org> on 2009/06/05 21:50:07 UTC

[jira] Issue Comment Edited: (HADOOP-5974) Add orthogonal fault injection mechanism/framework

    [ https://issues.apache.org/jira/browse/HADOOP-5974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12716737#action_12716737 ] 

Konstantin Boudnik edited comment on HADOOP-5974 at 6/5/09 12:49 PM:
---------------------------------------------------------------------

Here's an overall proposal for the framework layout:
- AspectJ 1.6 should be used as the base framework
- an additional set of classes needs to be developed to control and configure the injection of faults at runtime. In the first version of the framework, I'd recommend going with randomly injected faults (random in terms of when they happen, not where they are located in the application code)
- randomization level might be configured through system properties from the command line or set in a separate configuration file
- to completely turn off fault injection for a class, the probability level has to be set to 0% ('zero'); setting it to 100% will achieve the opposite effect
- build.xml has to be extended with a new target ('injectfaults') to weave the needed aspects in place after the normal compilation of Java classes is done; JUnit targets will have to be modified to pass the new probability configuration parameters into the spawned JVM
- aspects' source code will be placed under test/src/aop; the package structure will mimic the original one of Hadoop. Say, an aspect for FSDataset has to belong to org.apache.hadoop.hdfs.server.datanode
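
The probability check described in the list above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the framework's actual code: the class name FaultProbability and its methods are hypothetical, and the property names simply follow the pattern of the command-line examples in this comment (allFaultProbability as the default, <ClassName>FaultProbability as a per-class override). It also assumes the values reach the test JVM as system properties.

```java
import java.util.Random;

// Hypothetical helper: decides whether a fault should fire for a given class.
// Levels are percentages: 0 turns injection off, 100 makes it fire every time.
final class FaultProbability {
    // Assumed property names, modeled on the ant examples in this comment.
    static final String ALL_KEY = "allFaultProbability";
    static final String SUFFIX = "FaultProbability";

    private static final Random RANDOM = new Random();

    private FaultProbability() {}

    // Returns the configured level (0..100) for a class such as "BlockReceiver".
    // A per-class property overrides the global one; with nothing set, faults are off.
    static int levelFor(String className) {
        String raw = System.getProperty(className + SUFFIX);
        if (raw == null) {
            raw = System.getProperty(ALL_KEY, "0");
        }
        int level;
        try {
            level = Integer.parseInt(raw.trim());
        } catch (NumberFormatException e) {
            level = 0; // unparsable values are treated as "off"
        }
        return Math.max(0, Math.min(100, level));
    }

    // Randomizes only whether the fault happens, not where it is located:
    // the location is fixed by whichever aspect calls this method.
    static boolean shouldInject(String className) {
        return RANDOM.nextInt(100) < levelFor(className);
    }
}
```

An aspect woven into BlockReceiver would then call FaultProbability.shouldInject("BlockReceiver") at its pointcut and throw an exception only when the call returns true.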

Some examples of the new build/test execution interface:

To weave (build-in) aspects in place: 
- % ant injectfaults 
To execute HDFS tests (with everything turned off except BlockReceiver faults, which are set at the 10% level): 
- % ant run-test-hdfs -DallFaultProbability=0 -DBlockReceiverFaultProbability=10 



  
> Add orthogonal fault injection mechanism/framework
> --------------------------------------------------
>
>                 Key: HADOOP-5974
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5974
>             Project: Hadoop Core
>          Issue Type: Test
>          Components: test
>            Reporter: Konstantin Boudnik
>            Assignee: Konstantin Boudnik
>
> It'd be great to have a fault injection mechanism for Hadoop.
> Having such a solution in place will make it possible to increase test coverage of error handling and recovery mechanisms, reduce reproduction time, and increase the reproduction rate of problems.
> Ideally, the system has to be orthogonal to the current code and test base. E.g. faults have to be injected at build time and have to be configurable: all faults could be turned off, or only some of them could be allowed to happen. Also, fault injection has to be kept separate from the production build. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.