Posted to dev@ambari.apache.org by "John Speidel (JIRA)" <ji...@apache.org> on 2014/10/21 18:48:34 UTC
[jira] [Commented] (AMBARI-6275) Add support for "add hosts" with Blueprints API

    [ https://issues.apache.org/jira/browse/AMBARI-6275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14178617#comment-14178617 ] 

John Speidel commented on AMBARI-6275:
--------------------------------------

For 2.0, here is what I consider to be the minimum requirements for blueprint add hosts:
- add 1-n hosts with a single api call
- api call is asynchronous and returns request information which can be used to track the request status
- provide api syntax that allows for mapping of components to new hosts (See description for more on this)
- new hosts can contain slave and/or client components for services already represented in the cluster (restrict adding of new master components)
- automatically update configurations as necessary
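
A minimal sketch of how a headless client might track such an asynchronous request. The status names, the `fetch_status` callable, and the polling parameters are assumptions for illustration only; nothing here is an agreed api.

```python
import time
from typing import Callable

# Assumed terminal status names; the real set depends on the server.
TERMINAL_STATES = {"COMPLETED", "FAILED", "ABORTED", "TIMEDOUT"}


def wait_for_request(
    fetch_status: Callable[[], str],
    poll_seconds: float = 5.0,
    max_polls: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the request reaches a terminal status and return that status.

    fetch_status is a hypothetical callable that would GET the request
    resource returned by the add-hosts call and extract its status string.
    """
    for _ in range(max_polls):
        status = fetch_status()
        if status in TERMINAL_STATES:
            return status
        sleep(poll_seconds)
    raise TimeoutError(f"request still running after {max_polls} polls")
```

Passing `sleep` as a parameter keeps the loop testable without real delays.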

Restrictions:
- can't add master components
- can't modify configuration for existing hostgroups 
- can't modify components on existing hosts
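
The restrictions above could be checked before any work is scheduled. The sketch below is hypothetical: the function name, the request shape (host name mapped to component names), and the hard-coded category table are illustrative assumptions. It covers only the master-component and existing-host rules, not the host group configuration rule.

```python
from typing import Dict, List, Set

# Illustrative component categories; a real implementation would read
# these from the stack definition rather than hard-code them.
COMPONENT_CATEGORY: Dict[str, str] = {
    "NAMENODE": "MASTER",
    "ZOOKEEPER_SERVER": "MASTER",
    "DATANODE": "SLAVE",
    "NODEMANAGER": "SLAVE",
    "HDFS_CLIENT": "CLIENT",
}


def validate_add_hosts(
    new_hosts: Dict[str, List[str]],
    existing_hosts: Set[str],
) -> List[str]:
    """Return human-readable violations; an empty list means no rule was broken."""
    errors: List[str] = []
    for host, components in new_hosts.items():
        if host in existing_hosts:
            errors.append(f"{host}: components on existing hosts cannot be modified")
        for component in components:
            category = COMPONENT_CATEGORY.get(component)
            if category is None:
                errors.append(f"{host}: unknown component {component}")
            elif category == "MASTER":
                errors.append(f"{host}: master component {component} cannot be added")
    return errors
```

Collecting every violation, rather than failing on the first, lets a script fix a request in one pass.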

Additional Concerns:
In some cases, adding hosts to a cluster may require additional context-specific operations.  Examples are rebalancing HDFS after adding datanodes, or updating configuration properties such as “hive.zookeeper.quorum” when a ZK server is added.  Restarting services/components may also be necessary in some cases.  These additional context-specific operations may need to be accounted for in the api.  Unlike the UI, blueprint scaling operations are likely to be headless (no administrator sitting in a chair adding the hosts), so the UI's approach of notifying the user to restart a service, etc. may not work for blueprints.  Adding multiple hosts in a single request will likely perform better when additional context-specific operations like HDFS rebalancing are needed, since these can be executed once after all of the hosts are added.  Determining how to deal with this issue will likely be the most difficult aspect of adding a scaling api.  The above restrictions are in place primarily to minimize these concerns.

Possible approaches (certainly not all inclusive) for dealing with additional context-specific operations during cluster scaling:
- Don’t account for this in the api.  This is obviously the easiest to implement but the least usable.  A user would somehow need to determine which additional operations are needed and then figure out how to execute them via the api after the scaling operation completes.  Because of the extreme difficulty a user would have in using the api successfully with this approach, I don’t consider this a viable option.

- Include the suggested/required operations in the response to the scaling operation.  This could include hrefs and descriptions for each action.  These actions could be placed in “suggested” and “required” categories to indicate necessity.  This is better than doing nothing, but it is still complicated: the script executing the scaling operation would need to process the response, contain the logic for deciding which operations to invoke, and then invoke them after the asynchronous scaling operation completes.

- Handle operations automatically.  After executing all of the add host operations for a scaling operation, we could also determine and execute any additional context-specific operations with no further user input.  This is very user friendly (assuming we make the correct decisions) but likely wouldn’t provide the necessary level of control.  Since some operations, such as HDFS rebalancing after adding datanodes, are not strictly required, executing them automatically would likely not be the best solution, as a user may want to control explicitly when the operation occurs.

- Likely the best solution would be somewhere between fully manual and autonomous execution of related tasks.  A user could specify operations to occur after adding the hosts in the scaling operation.  We could potentially allow operations scoped at different granularities.  For example, we might want to allow a user to specify that all required operations be executed.  Or at a more granular level, “all required service restarts”.  Or at an even more granular level, “HDFS service restart”.  To make this work, a user would need to know, before invoking the scaling api, which operations are relevant (suggested/required) for a scaling operation.  One solution for this discovery would be to let the user ask the api for the set of relevant operations without actually executing the scaling operation.  The api syntax for specifying the “commands” to execute would take a lot of thought to ensure that it is easy to use but also flexible and extensible.
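
The scoping idea in the last approach can be sketched as a filter over the set of relevant operations. Everything below is hypothetical: the `Operation` and `Scope` types, their field names, and the values "required"/"suggested" as strings are assumptions made for illustration, not a proposed api syntax.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Operation:
    action: str     # e.g. "RESTART" or "REBALANCE"
    service: str    # e.g. "HDFS"
    necessity: str  # "required" or "suggested"


@dataclass(frozen=True)
class Scope:
    """A filter; a field left as None matches any value."""
    necessity: Optional[str] = None
    action: Optional[str] = None
    service: Optional[str] = None

    def matches(self, op: Operation) -> bool:
        return all(
            wanted is None or wanted == actual
            for wanted, actual in (
                (self.necessity, op.necessity),
                (self.action, op.action),
                (self.service, op.service),
            )
        )


def select_operations(relevant: List[Operation], scopes: List[Scope]) -> List[Operation]:
    """Keep the relevant operations that match at least one requested scope."""
    return [op for op in relevant if any(s.matches(op) for s in scopes)]
```

With this shape, `Scope(necessity="required")` expresses "all required operations", `Scope(necessity="required", action="RESTART")` expresses "all required service restarts", and `Scope(action="RESTART", service="HDFS")` expresses "HDFS service restart". The `relevant` list would come from the discovery call that reports operations without executing the scaling operation.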


At this time, given the 2.0 timeline, I feel that the minimum requirements shouldn't account for any additional context-specific operations.  By restricting the addition of master components we largely eliminate the need to update configurations and restart services for all services in the current stacks.  For one of the primary use cases, adding DATANODE hosts, rebalancing of HDFS would have to be done in a separate api call after the request completes.  Providing a robust "add host" api without the above restrictions that properly handles/identifies all additional context-specific operations will be complicated and will likely be iterative, accomplished via many finer-grained steps.



> Add support for "add hosts" with Blueprints API
> -----------------------------------------------
>
>                 Key: AMBARI-6275
>                 URL: https://issues.apache.org/jira/browse/AMBARI-6275
>             Project: Ambari
>          Issue Type: Improvement
>          Components: ambari-server
>    Affects Versions: 1.7.0
>            Reporter: Yusaku Sako
>            Assignee: John Speidel
>             Fix For: 2.0.0
>
>
> Support for "adding hosts" based on *blueprint* style *host_group* via Ambari REST API. There are two scenarios to consider for this JIRA:
> 1) Add hosts based on an existing host in the cluster (and its *blueprint* style *host_group* component layout). This enables the user to add hosts with components similar to existing hosts in the cluster. For example: expand this cluster with these X hosts and make each of these hosts like Y host (components + configs) existing in the cluster.
> 2) Add hosts based on components + configs. This would be a verbose method that uses *blueprint* style *host_groups* and *configs* to allow you to add hosts to a cluster that do not necessarily have a component layout or config of a similar host existing in the cluster. For example: expand this cluster with these X hosts and make each of these hosts include Y components with Z configs. 
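
For illustration, the two scenarios in the quoted description could translate into request bodies along the following lines. This is only a sketch: the function names and the `hosts`, `like_host`, `host_group`, and `configurations` keys are hypothetical and do not reflect an agreed api syntax. Only the nested `components` and configuration shapes follow the existing blueprint style.

```python
import json
from typing import Dict, List


def like_existing_host(new_hosts: List[str], template_host: str) -> Dict:
    """Scenario 1: new hosts copy the layout of a host already in the cluster."""
    return {"hosts": list(new_hosts), "like_host": template_host}


def explicit_layout(
    new_hosts: List[str],
    components: List[str],
    configs: List[Dict[str, Dict[str, str]]],
) -> Dict:
    """Scenario 2: new hosts get an explicit component list and configs."""
    return {
        "hosts": list(new_hosts),
        "host_group": {"components": [{"name": c} for c in components]},
        "configurations": configs,
    }


if __name__ == "__main__":
    # Print both hypothetical request bodies as JSON.
    print(json.dumps(like_existing_host(["h5", "h6"], "h2"), indent=2))
    print(json.dumps(explicit_layout(["h7"], ["DATANODE"], []), indent=2))
```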



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)