Posted to dev@bigtop.apache.org by "bc Wong (JIRA)" <ji...@apache.org> on 2012/07/10 08:20:36 UTC

[jira] [Commented] (BIGTOP-635) Implement a cluster-abstraction, discovery and manipulation framework for iTest

    [ https://issues.apache.org/jira/browse/BIGTOP-635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13410082#comment-13410082 ] 

bc Wong commented on BIGTOP-635:
--------------------------------

Thanks for the progress update, Sujay!

Have you thought about how to model the rest? It's a bit hard to comment on the current design without knowing your overall plan.
* A cluster has hosts. Perhaps the API should expose that?
** Hosts have properties, like rack assignment. You may want to consider exposing those as well.
* A cluster has services, like HDFS, MR, HBase, ZK, etc. How does the API let callers discover them?
* A service has daemons (NN, DN, etc.). Should each shim expose which daemons have been set up and where they're running?
* A service also has other properties and operations:
*# Configuration, like `fs.defaultFS'. Probably useful for tests to know, and to change.
*# Run state, like started/stopped.
*# Commands, like start/stop/restart.
* A daemon has its own:
*# Configuration, like `hadoop.security.authentication'. For example, tests would probably need to set this for any Kerberos testing.
*# Run state. Useful for testing failover.
*# Commands, like start/stop/restart, decommission.
* Currently, you're modelling a daemon instance as a (daemon_type, hostname) tuple. I'd promote it to an interface of its own, because daemons seem more complex than that.
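
To make the model above concrete, here is a minimal Java sketch of how the pieces could fit together. Every type and method name in it (Managed, Host, Daemon, Service, Cluster, SimpleHost, SimpleDaemon) is hypothetical and is not taken from the attached BigtopClusterManager code; the in-memory classes only flip a flag where a real implementation would talk to the cluster.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// All names below are hypothetical illustrations, not an existing Bigtop API.
enum RunState { STARTED, STOPPED }

// Shared by services and daemons: configuration, run state, and commands.
interface Managed {
    Map<String, String> getConfig();
    RunState getRunState();
    void start();
    void stop();
    default void restart() { stop(); start(); }
}

interface Host {
    String getHostname();
    String getRack();           // a host property, e.g. rack assignment
}

// A daemon as a first-class interface rather than a (type, hostname) tuple.
interface Daemon extends Managed {
    String getType();           // e.g. "NN", "DN"
    Host getHost();
}

interface Service extends Managed {
    String getName();           // e.g. "HDFS", "HBase"
    List<Daemon> getDaemons();
}

interface Cluster {
    List<Host> getHosts();
    List<Service> getServices();
}

// Trivial in-memory implementations, just enough to exercise the interfaces.
final class SimpleHost implements Host {
    private final String hostname;
    private final String rack;
    SimpleHost(String hostname, String rack) {
        this.hostname = hostname;
        this.rack = rack;
    }
    public String getHostname() { return hostname; }
    public String getRack() { return rack; }
}

final class SimpleDaemon implements Daemon {
    private final String type;
    private final Host host;
    private final Map<String, String> config = new HashMap<>();
    private RunState state = RunState.STOPPED;
    SimpleDaemon(String type, Host host) {
        this.type = type;
        this.host = host;
    }
    public String getType() { return type; }
    public Host getHost() { return host; }
    public Map<String, String> getConfig() { return config; }
    public RunState getRunState() { return state; }
    public void start() { state = RunState.STARTED; }   // a real shim would run a remote command
    public void stop() { state = RunState.STOPPED; }
}
```

Putting configuration, run state, and commands in one shared interface is only one option; it keeps the service-level and daemon-level views symmetric, at the cost of hiding daemon-only commands such as decommission, which would need a separate extension.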

It's useful for me to think in concrete terms. For example, to test things that break after you turn on HA (like HIVE-3056), you probably need the capability to:
# Make a configuration change, to turn on HA in the middle of the test.
# Trigger commands, in this case a restart. You already have that.
# Query the run state, to assert that other components are still running. Specific service-level tests are even better.
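
The three steps above could be sketched in Java roughly as follows. FakeService and HaScenario are hypothetical stand-ins, not part of any Bigtop or iTest API, and "example.ha.enabled" is a placeholder key: turning on HDFS HA for real involves several properties that are left out here.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a service handle; a real one would act on the cluster.
class FakeService {
    final String name;
    final Map<String, String> config = new HashMap<>();
    boolean running = true;
    int restarts = 0;

    FakeService(String name) { this.name = name; }

    void restart() { running = true; restarts++; }
    boolean isRunning() { return running; }
}

class HaScenario {
    // Placeholder property name (an assumption), not a real Hadoop setting.
    static final String HA_FLAG = "example.ha.enabled";

    static void enableHaMidTest(FakeService hdfs, List<FakeService> others) {
        // 1. Make a configuration change.
        hdfs.config.put(HA_FLAG, "true");
        // 2. Trigger a command, a restart in this case.
        hdfs.restart();
        // 3. Query the run state of the other components and fail if any is down.
        for (FakeService s : others) {
            if (!s.isRunning()) {
                throw new IllegalStateException(s.name + " is not running after HA was enabled");
            }
        }
    }
}
```

A test for something like HIVE-3056 would follow the run-state check with a service-level probe, such as running a query, since a daemon can report itself as started and still be unusable.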

I'm new to Bigtop. Let me know if that makes sense.
                
> Implement a cluster-abstraction, discovery and manipulation framework for iTest
> -------------------------------------------------------------------------------
>
>                 Key: BIGTOP-635
>                 URL: https://issues.apache.org/jira/browse/BIGTOP-635
>             Project: Bigtop
>          Issue Type: New Feature
>          Components: Tests
>    Affects Versions: 0.4.0
>            Reporter: Roman Shaposhnik
>            Assignee: Sujay Rau
>             Fix For: 0.5.0
>
>         Attachments: BigtopClusterManager.zip, ClusterManagerAPI.pdf
>
>
> We've come to a point where our tests need to have a uniform way of interfacing with the cluster under test. It is no longer ok to assume that the test can be executed on a particular node (and thus have access to services running on it). It is also less than ideal for tests to assume a particular type of interaction with the services since it tends to break in different deployment scenarios. 
> The framework to be put in place has to be capable of the following, regardless of where the test using it is executed:
>   # representing the abstract configuration of the cluster
>   # representing the abstract topology of the entire cluster (services running on a cluster, nodes hosting the daemons, racks, etc).
>   # giving tests an ability to query this topology
>   # giving tests an ability to affect the nodes in that topology in a particular way (refreshing configuration, restarting services, etc.)
> Of course, the ideal solution here would be to give Bigtop tests programmatic access to a Hadoop cluster management framework such as Cloudera's CM or Apache Ambari.
> As with any ideal solution, though, I don't think it is realistic. Hence we have to cook something up. At this point I'm really focused on getting the API right, and I'm totally fine with an implementation of that API being something as silly as a bunch of ssh-based scripts.
> This JIRA is primarily focused on coming up with such an API. Anybody who's willing to help is welcome to.
