Posted to common-commits@hadoop.apache.org by Apache Wiki <wi...@apache.org> on 2010/06/24 01:13:20 UTC

[Hadoop Wiki] Update of "Hive/TipsForAddingNewTests" by ArvindPrabhakar

The "Hive/TipsForAddingNewTests" page has been changed by ArvindPrabhakar.
http://wiki.apache.org/hadoop/Hive/TipsForAddingNewTests

--------------------------------------------------

New page:
#pragma section-numbers 2
= Tips for Adding New Tests in Hive =

The following rules of thumb apply when adding new test cases in Hive that require new query file(s). Of course, do not apply a rule if it defeats the purpose of your test. In general, they help keep test queries concise, minimize redundancy, and prevent a single test failure from cascading into others.

 * Instead of creating your own data file for loading into a new table, use existing data from staged tables like {{{src}}}.
 * If your test requires a {{{SELECT}}} query, limit it to a single {{{SELECT}}} statement per table; such queries are already heavily exercised by a majority of tests.
 * If you must use a {{{SELECT}}} statement, add an {{{ORDER BY}}} clause so that differences in output order do not cause spurious diffs and test failures.
 * Limit your test to one table, as running the same test across multiple tables may be redundant.
 * Start the query specification with an explicit {{{DROP TABLE}}} directive so that an upstream test that failed without cleaning up does not cause your test to fail.
 * End the query specification by explicitly dropping the table(s) you created during the course of the test.
 * Give your query file a descriptive name.
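
As a rough illustration of the tips above, a query file might look like the sketch below. The table name {{{tips_example}}} and the filter are hypothetical, and the sketch assumes the staged {{{src}}} table exposes {{{key}}} and {{{value}}} columns; adapt it to whatever your test actually needs to exercise.

{{{
-- Hypothetical query file; table name and filter are illustrative only.

-- Drop any leftover table from an upstream test that failed to clean up.
DROP TABLE tips_example;

CREATE TABLE tips_example (key STRING, value STRING);

-- Reuse staged data from src instead of loading a new data file.
INSERT OVERWRITE TABLE tips_example
SELECT key, value FROM src WHERE key < 10;

-- A single SELECT on the one table, ordered so the output is deterministic.
SELECT key, value FROM tips_example ORDER BY key, value;

-- Clean up so later tests start from a known state.
DROP TABLE tips_example;
}}}

Ordering by both columns keeps the expected output stable even when several rows share the same {{{key}}}.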