Posted to common-issues@hadoop.apache.org by "Stephen Watt (JIRA)" <ji...@apache.org> on 2009/11/02 19:54:59 UTC

[jira] Commented: (HADOOP-6332) Large-scale Automated Test Framework

    [ https://issues.apache.org/jira/browse/HADOOP-6332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12772600#action_12772600 ] 

Stephen Watt commented on HADOOP-6332:
--------------------------------------

I support this proposal. At IBM, we're active users of Hadoop; however, we run into situations where we need to test Hadoop on the alternative Java distributions required by non-standard architectures. For instance, we'd like to put Hadoop through its paces on AS/400, z/OS, or OS/390. To do that we have to use non-Sun Java distributions (such as IBM Java), as Sun does not provide a JVM for those architectures. This proposal would standardize and streamline how we do real-world testing on those platforms.

At present, I'm using the Terabyte Gen/Sort/Validate jobs because they generate their own data, which greatly simplifies the test scripts, and they are easy to scale up and down.
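For reference, the Gen/Sort/Validate workflow described above amounts to three job submissions. This is a hedged sketch: the examples jar name and the HDFS paths are assumptions and will vary by install, and the commands of course require a running cluster.

```shell
# Sketch of a TeraGen/TeraSort/TeraValidate run (jar name and paths assumed).
# teragen takes a row count; each row is 100 bytes, so 10M rows ~= 1 GB.
hadoop jar $HADOOP_HOME/hadoop-*-examples.jar teragen 10000000 /benchmarks/tera-in
hadoop jar $HADOOP_HOME/hadoop-*-examples.jar terasort /benchmarks/tera-in /benchmarks/tera-out
hadoop jar $HADOOP_HOME/hadoop-*-examples.jar teravalidate /benchmarks/tera-out /benchmarks/tera-report
```

Scaling the test up or down is then just a matter of changing the teragen row count.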

Lastly, from what I can gather, the framework should be able to incorporate existing cluster environments. Thus, if one is executing an M/R test, it would run over whatever distributed filesystem the cluster is using, be it HDFS, Kosmos, or S3. However, I only see an S3 sub-JIRA for this. Is the intent to support HDFS only?

> Large-scale Automated Test Framework
> ------------------------------------
>
>                 Key: HADOOP-6332
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6332
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: test
>            Reporter: Arun C Murthy
>            Assignee: Arun C Murthy
>             Fix For: 0.21.0
>
>
> Hadoop would benefit from having a large-scale, automated test framework. This jira is meant to be a master jira to track the relevant work.
> ----
> The proposal is a junit-based, large-scale test framework which would run against _real_ clusters.
> There are several pieces we need to achieve this goal:
> # A set of utilities we can use in junit-based tests to work with real, large-scale hadoop clusters, e.g. utilities to deploy clusters, start & stop them, bring down tasktrackers, datanodes, entire racks of both, etc.
> # Enhanced controllability and inspectability of the various components in the system, e.g. daemons such as the namenode and jobtracker should expose their data structures for query/manipulation. Tests would be much more relevant if we could, for example, query for specific states of the jobtracker, scheduler, etc. Clearly these apis should _not_ be part of production clusters; hence the proposal is to use aspectj to weave these new apis into debug deployments.
> ----
> Related note: we should break up our tests into at least 3 categories:
> # src/test/unit -> Real unit tests using mock objects (e.g. HDFS-669 & MAPREDUCE-1050).
> # src/test/integration -> Current junit tests with Mini* clusters etc.
> # src/test/system -> HADOOP-6332 and its children
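To make the quoted proposal concrete, the junit-utility idea might look like the sketch below. Everything here is hypothetical: the `ClusterControl` interface, the `InProcessCluster` fake, and all method names are invented for illustration and are not the framework's actual API. The fake lets the test pattern (kill a datanode, assert the cluster's view changed) run without a real cluster.

```java
import java.util.HashSet;
import java.util.Set;

public class ClusterControlSketch {

    /** Hypothetical control surface a junit test would use against a real cluster. */
    interface ClusterControl {
        void startDataNode(String host);
        void stopDataNode(String host);
        int liveDataNodes();
    }

    /** In-memory stand-in so the pattern can be shown without a deployed cluster. */
    static class InProcessCluster implements ClusterControl {
        private final Set<String> live = new HashSet<String>();
        public void startDataNode(String host) { live.add(host); }
        public void stopDataNode(String host)  { live.remove(host); }
        public int liveDataNodes()             { return live.size(); }
    }

    public static void main(String[] args) {
        ClusterControl cluster = new InProcessCluster();
        cluster.startDataNode("dn1");
        cluster.startDataNode("dn2");
        cluster.stopDataNode("dn1"); // simulate bringing down a datanode
        if (cluster.liveDataNodes() != 1) {
            throw new AssertionError("expected 1 live datanode");
        }
        System.out.println("live datanodes: " + cluster.liveDataNodes());
    }
}
```

In the real framework, an implementation of such an interface would drive actual daemons, and the aspectj-woven query apis mentioned above would back the assertions.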

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.