Posted to common-commits@hadoop.apache.org by Apache Wiki <wi...@apache.org> on 2010/06/24 01:55:20 UTC

[Hadoop Wiki] Update of "Hive/TipsForAddingNewTests" by JohnSichi

The "Hive/TipsForAddingNewTests" page has been changed by JohnSichi.
http://wiki.apache.org/hadoop/Hive/TipsForAddingNewTests?action=diff&rev1=2&rev2=3

--------------------------------------------------

  #pragma section-numbers 2
  = Tips for Adding New Tests in Hive =
  
- Following are a few rules of thumb that should be followed when adding new test cases in Hive that require the introduction of new query file(s). Of course, these rules should not be applied if they invalidate the purpose of your test to begin with. These are genrally helpful in - keeping the test queries concise, minimizing the redundancies where possible, and ensuring that cascading failures due to a single test failure do not occur. 
+ Following are a few rules of thumb that should be followed when adding new test cases in Hive that require the introduction of new query file(s). Of course, these rules should not be applied if they invalidate the purpose of your test to begin with. These are generally helpful in keeping the test queries concise, minimizing the redundancies where possible, and ensuring that cascading failures due to a single test failure do not occur. 
  
   * Instead of creating your own data file for loading into a new table, use existing data from staged tables like {{{src}}}.
-  * If your test requires a {{{SELECT}}} query, limit it to a single {{{SELECT}}} statement per table as these are generally heavily exercised by a majority of tests.
+  * If your test requires a {{{SELECT}}} query, keep it as simple as possible, and minimize the number of queries to keep overall test time down; avoid repeating scenarios which are already covered by existing tests.
-  * If you must use a {{{SELECT}}} statement, make sure you use the {{{ORDER BY}}} clause to minimize the chances of spurious diffs due to output order differences leading to test failures.
+  * When you do need to use a {{{SELECT}}} statement, make sure you use the {{{ORDER BY}}} clause to minimize the chances of spurious diffs due to output order differences leading to test failures.
   * Limit your test to one table unless you require multiple tables specifically.
   * Start the query specification with an explicit {{{DROP TABLE}}} directive to make sure that any upstream test failures that could not clean up do not cause your test to fail.
   * End the query specification with an explicit {{{DROP TABLE}}} directive to drop any table(s) you created during the course of the test.
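  
  As a rough illustration, the tips above might combine into a query file like the following. This is only a sketch: the table name {{{tips_example}}} and the filter on {{{key}}} are hypothetical, and it assumes the staged {{{src}}} table exposes the usual {{{key}}} and {{{value}}} string columns.
  
  {{{
  -- Hypothetical table name; drop it first in case an earlier test left it behind.
  DROP TABLE tips_example;
  
  -- One table only, populated from the staged src table rather than a new data file.
  CREATE TABLE tips_example (key STRING, value STRING);
  
  INSERT OVERWRITE TABLE tips_example
  SELECT key, value FROM src WHERE key < 10;
  
  -- A single simple SELECT, with ORDER BY so the output order is deterministic.
  SELECT key, value FROM tips_example ORDER BY key, value;
  
  -- Clean up so later tests are not affected.
  DROP TABLE tips_example;
  }}}
  
  Ordering by both columns keeps the expected output stable even when several rows share the same {{{key}}}.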