Posted to common-commits@hadoop.apache.org by Apache Wiki <wi...@apache.org> on 2013/09/16 11:16:54 UTC

[Hadoop Wiki] Update of "HowToDevelopUnitTests" by SteveLoughran

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "HowToDevelopUnitTests" page has been changed by SteveLoughran:
https://wiki.apache.org/hadoop/HowToDevelopUnitTests?action=diff&rev1=11&rev2=12

Comment:
explain the mini clusters

  
  This page contains Hadoop testing and test development guidelines.
  
+ == How Hadoop Unit Tests Work ==
+ 
+ Hadoop unit tests are all designed to run on a local machine rather than on a full-scale Hadoop cluster. The ongoing work on testing against full clusters is in Apache Bigtop.
+ 
+ The unit tests work by creating MiniDFS, MiniYARN and MiniMR clusters, as appropriate. Each of these runs the code of the specific service.
+ 
+ === MiniDFSCluster ===
+ 
+ {{{org.apache.hadoop.hdfs.MiniDFSCluster}}}
+ 
+ Emulates an HDFS cluster with the given number of (emulated) datanodes. After creating one via its builder API, you can build up the HDFS URI {{{"hdfs://localhost:" + miniDFSCluster.getNameNodePort()}}}, which can be used as the base URI for filesystem operations.
+ 
+ 
+ {{{#!java
+ // testName is assumed to be defined by the enclosing test
+ Configuration conf = new Configuration();
+ File baseDir = new File("./target/hdfs/" + testName).getAbsoluteFile();
+ // start from a clean base directory
+ FileUtil.fullyDelete(baseDir);
+ conf.set(MiniDFSCluster.HDFS_MINIDFS_BASEDIR, baseDir.getAbsolutePath());
+ MiniDFSCluster.Builder builder = new MiniDFSCluster.Builder(conf);
+ MiniDFSCluster hdfsCluster = builder.build();
+ String hdfsURI = "hdfs://localhost:" + hdfsCluster.getNameNodePort() + "/";
+ }}}
+ 
+ === MiniYARNCluster ===
+ 
+ {{{org.apache.hadoop.yarn.server.MiniYARNCluster}}}
+ 
+ Starts the YARN services in the JVM, with the given number of simulated Node Managers. You can then submit work to the ResourceManager. The AMs (and any containers in which they execute code) run in separate processes, as on a real YARN cluster. The key difference is that the classpath of the test JVM is passed down to the spawned processes (how? Which environment variable?) so that they pick up the same version of the Hadoop JARs.
+ 
+ {{{#!java
+ YarnConfiguration conf = new YarnConfiguration();
+ conf.setInt(YarnConfiguration.RM_SCHEDULER_MINIMUM_ALLOCATION_MB, 64);
+ conf.setClass(YarnConfiguration.RM_SCHEDULER,
+               FifoScheduler.class, ResourceScheduler.class);
+ // HoyaUtils comes from the Hoya project; it is not a Hadoop class
+ HoyaUtils.patchConfiguration(conf);
+ miniCluster = new MiniYARNCluster(name, noOfNodeManagers, numLocalDirs, numLogDirs);
+ miniCluster.init(conf);
+ miniCluster.start();
+ 
+ // once the cluster is created, you can get its configuration
+ // with the binding details to the cluster added from the minicluster
+ YarnConfiguration appConf = new YarnConfiguration(miniCluster.getConfig());
+ 
+ }}}
+ 
+ The results of a test run are saved to the filesystem, where they can be retrieved by hand.
+ 
+ {{{
+ cat target/TestKillAM/TestKillAM-logDir-nm-0_0/application_1378993847080_0001/container_1378993847080_0001_01_000001/out.txt
+ }}}
+ 
+  1. The output is not automatically merged into the JUnit results (if anyone can fix this, code would be welcome)
+  1. The output is formatted by whatever logging tools and configuration the AM and its containers use, such as the specific version of {{{Apache Log4J}}} and the {{{log4j.properties}}} file that are on the classpath.
+  1. The names of the base directory and logdir are determined by the name given to the test cluster; unique cluster names per test class are invaluable.
+  1. The more node managers you create, the more log directories you will have to look into. A single NM is easier to work with.
+  1. The application- and container- directory names vary on every run.
+  1. You can {{{tail -f}}} the {{{out.txt}}} and {{{err.txt}}} files while the tests are running.
+  1. {{{jps -v}}} will list the running applications; {{{kill}}} can then be used to kill the processes, and so test the application's resilience to failures.
+ 
+ It's a bit inelegant to work with, but functional. The ability to list and terminate the processes makes it possible to write failure simulation tests, which is important because production applications need to be designed to handle failures of child containers.
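
Because the application and container directory names vary on every run, a test can locate the output files by walking the log directory instead of hard-coding the path. The following is a minimal sketch using only the JDK; {{{ContainerLogFinder}}} and its {{{find}}} method are hypothetical names for illustration, not part of Hadoop.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical test helper; not part of Hadoop. */
final class ContainerLogFinder {

  private ContainerLogFinder() {
  }

  /**
   * Recursively collects every regular file named fileName
   * (for example "out.txt") under logRoot. The result is sorted
   * by path so that repeated calls return the same order.
   */
  static List<Path> find(Path logRoot, String fileName) throws IOException {
    // Files.walk holds open directory handles, so close the stream when done
    try (Stream<Path> paths = Files.walk(logRoot)) {
      return paths
          .filter(Files::isRegularFile)
          .filter(p -> p.getFileName().toString().equals(fileName))
          .sorted()
          .collect(Collectors.toList());
    }
  }
}
```

A test could call, say, {{{ContainerLogFinder.find(Paths.get("target/TestKillAM"), "out.txt")}}} after the application has finished and then read or assert on each file it returns.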
+ 
+ === MiniMRYarnCluster ===
+ 
+ {{{org.apache.hadoop.mapreduce.v2.MiniMRYarnCluster}}}
+ 
+ This adds an MR History Server to the MiniYARNCluster, and extends the cluster configuration to refer to it. MR applications can then easily talk to the RM to submit jobs, with the history being preserved.
+ 
+ === Using the Mini clusters in tests ===
+ 
+ The clusters take time to set up and tear down, so they should be created only once per test class, in a {{{@BeforeClass}}}-tagged static method. They should be stopped in an {{{@AfterClass}}} method: the HDFS cluster via {{{MiniDFSCluster.shutdown()}}}, and the YARN clusters via their {{{stop()}}} method.
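
When a test class starts more than one mini cluster, it helps if the teardown stops each of them even when one of the stop calls throws; otherwise a leftover cluster can hold ports and directories needed by later test classes. The sketch below shows one way to do that with only the JDK. {{{ClusterTeardown}}}, {{{StopAction}}} and {{{stopAll}}} are hypothetical names for illustration, not Hadoop or JUnit APIs.

```java
/** Hypothetical teardown helper; not part of Hadoop or JUnit. */
final class ClusterTeardown {

  /** A stop action that may throw, such as a cluster's shutdown or stop method. */
  @FunctionalInterface
  interface StopAction {
    void stop() throws Exception;
  }

  private ClusterTeardown() {
  }

  /**
   * Runs every stop action in order, even if an earlier one throws.
   * The first failure is rethrown once all actions have run, with any
   * later failures attached to it as suppressed exceptions.
   */
  static void stopAll(StopAction... actions) throws Exception {
    Exception first = null;
    for (StopAction action : actions) {
      try {
        action.stop();
      } catch (Exception e) {
        if (first == null) {
          first = e;
        } else {
          first.addSuppressed(e);
        }
      }
    }
    if (first != null) {
      throw first;
    }
  }
}
```

From an {{{@AfterClass}}} method, a call such as {{{ClusterTeardown.stopAll(hdfsCluster::shutdown, yarnCluster::stop)}}} would then attempt both shutdowns, assuming both clusters were successfully created in {{{@BeforeClass}}}.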
+ 
+ 
+ == Writing JUnit Tests ==
+ 
  === Cheat sheet of tests development for JUnit v4 ===
  
  Hadoop has been using JUnit 4 for a while now; however, many new tests are still being developed for JUnit v3. This is partly JUnit's fault: for a false sense of backward compatibility, all the v3 {{{junit.framework}}} classes are packaged along with the v4 classes in a single JAR, {{{junit-4.10.jar}}}. This is necessary to permit mixing of the old and new tests, and to allow the new v4 tests to run under the existing JUnit test runners in IDEs and build tools.
  
  Here's a short list of traps to be aware of, so that you do not develop yet another JUnit v3 test case:
  
-    * YES, new unit tests HAVE to be developed for JUnit v4. No patches which add v3 test case classes will be approved. 
+    * YES, new unit tests HAVE to be developed for JUnit v4. No patches which add v3 test case classes will be approved.
     * DO NOT use {{{junit.framework}}} imports
     * DO use only {{{org.junit}}} imports
     * DO NOT {{{extends TestCase}}} (now, you can create your own test class structures if needed!)
@@ -53, +126 @@

  
   1. Use the JUnit assertions, not the Java {{{assert}}} statement.
   1. In equality tests, place the expected value first
-  1. Give assertions meaningful error messages. 
+  1. Give assertions meaningful error messages.
  
  === Bad ===
  
@@ -61, +134 @@

  /** a test */
  @Test
  public void testBuildVersion() {
-   Namenode nn = getNameNode(); 
+   Namenode nn = getNameNode();
    assertNotNull(nn);
    NamespaceInfo info = nn.versionRequest() ;
    assertEquals(info.getBuildVersion(),"32");
@@ -78, +151 @@

   */
  @Test
  public void testBuildVersion() {
-   Namenode nn = getNameNode(); 
+   Namenode nn = getNameNode();
    assertNotNull("No namenode", nn);
    NamespaceInfo info = nn.versionRequest() ;
    assertEquals("Build version wrong", "32", info.getBuildVersion());
@@ -138, +211 @@

  
  == References ==
  
-  * [[http://code.google.com/p/t2framework/wiki/JUnitQuickTutorial|Quick tutorial]] on the JUnit website. 
+  * [[http://code.google.com/p/t2framework/wiki/JUnitQuickTutorial|Quick tutorial]] on the JUnit website.