Posted to issues@ambari.apache.org by "Robert Nettleton (JIRA)" <ji...@apache.org> on 2016/05/23 17:09:12 UTC

[jira] [Comment Edited] (AMBARI-15395) Enhance blueprint support for using references

    [ https://issues.apache.org/jira/browse/AMBARI-15395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15296656#comment-15296656 ] 

Robert Nettleton edited comment on AMBARI-15395 at 5/23/16 5:08 PM:
--------------------------------------------------------------------

I've reviewed the proposal document attached to this JIRA, and have included my review comments below.

Generally, I'm not sure that using references here is a good fit for Blueprints.  

1. The Reference format is used by the Ambari REST API, but doesn't necessarily make much sense for a Blueprints deployment.  The syntax and structure of a Blueprints deployment are such that much of the information (property name, configuration type, etc.) would be duplicated, which would likely confuse users.  I’m also concerned about the idea of configuration versions (mentioned in the reference format) being introduced into Blueprint documents.  This could introduce a variety of compatibility problems as users try to re-use Blueprints across versions of Ambari.  
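
To make the duplication concern concrete, here is a minimal Python sketch (not Ambari code). It assumes the ReferenceName:configType:configVersion:propertyName layout given in the issue description; the Reference type and both helper functions are hypothetical names used only for illustration.

```python
from typing import NamedTuple


class Reference(NamedTuple):
    """One parsed reference (hypothetical type, for illustration only)."""
    name: str
    config_type: str
    config_version: int
    property_name: str


def parse_reference(value):
    """Split 'ReferenceName:configType:configVersion:propertyName' into parts."""
    parts = value.split(":", 3)
    if len(parts) != 4:
        raise ValueError(f"not a reference: {value!r}")
    name, config_type, version, property_name = parts
    return Reference(name, config_type, int(version), property_name)


def duplicated_fields(config_type, property_name, value):
    """List the reference fields that only repeat where the property already lives."""
    ref = parse_reference(value)
    repeated = []
    if ref.config_type == config_type:
        repeated.append("config_type")
    if ref.property_name == property_name:
        repeated.append("property_name")
    return repeated


# In a Blueprint, the property already sits under its configuration type,
# so both fields of the reference restate information the document carries.
print(duplicated_fields(
    "hive-site",
    "hive.server2.keystore.password",
    "SECRET:hive-site:2:hive.server2.keystore.password",
))
```

The only field that is not redundant inside a Blueprint is the configuration version, which is the field that raises the compatibility concern.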

2. In the "Requirements" section, point #5 mentions having a separate JSON document to abstract out Path references.  I'm not sure I agree that this is necessary.  The stacks do define what appear to be valid defaults for these path properties.  In addition, the Blueprints deployer is always free to customize these properties, either in the Blueprint or Cluster Creation Template.  While I agree that we should do as much error-checking upfront as possible, this seems like an issue that will vary from user to user, and so I'm not sure we should add a new Blueprints feature to accommodate this use case.  I also think that adding a new JSON document needlessly complicates things, for very little benefit.  It's important to note that Blueprints is meant as a "power-user" feature, and so will never have as much error-checking/convenience as the UI.  

3. Regarding the potential application of this proposal towards Password fields:

   a. In general, I don't think this is necessary.  The Blueprint configuration processor already removes all passwords from an exported Blueprint.  At deployment time, the error-checking already exists to fail a deployment if a given password is missing.  The Blueprints configuration processor already uses the Ambari Stack definitions to determine the list of required passwords.  
   b. In addition, the “default password” feature already exists in order to allow a customer to override this validation, and simply use a given password for all unset passwords.  
   c. I’m concerned about the suggestion to use the “default password” configuration property in the Cluster Creation Template in the way described in this document.  The “default password” feature is not meant for production environments, and in general should only be used for developer-level clusters during the creation/debugging of Blueprint documents.  
   d. As mentioned above, the Reference structure duplicates much of the Blueprints structure already, and so would be confusing for existing Blueprint users.  
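
The existing behaviour described in 3a and 3b can be sketched roughly as follows. This is a Python illustration under stated assumptions, not the actual Blueprint configuration processor: resolve_passwords and its argument shapes are hypothetical, and the list of required passwords is assumed to have been derived from the stack definitions already.

```python
def resolve_passwords(required, supplied, default_password=None):
    """Return a password for every required property, or fail the deployment.

    required: iterable of (config_type, property_name) pairs (assumed shape).
    supplied: dict mapping those pairs to passwords set by the deployer.
    default_password: optional fallback for unset passwords (development use only).
    """
    resolved = {}
    missing = []
    for key in required:
        value = supplied.get(key)
        if value:
            resolved[key] = value
        elif default_password is not None:
            # The "default password" override: one value for all unset passwords.
            resolved[key] = default_password
        else:
            missing.append(key)
    if missing:
        names = ", ".join(f"{t}/{p}" for t, p in sorted(missing))
        raise ValueError(f"Missing required passwords: {names}")
    return resolved


required = [("hive-site", "hive.server2.keystore.password")]
print(resolve_passwords(required, {}, default_password="dev-only"))
```

Without a default password the same call raises an error naming the missing property, which is the upfront check that already exists at deployment time.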

4. Regarding the potential application of this proposal towards Hostname fields:

    a. We need to make a distinction between the types of hostname fields we’re talking about.  In general, any hostname information that points to services managed by Ambari can be handled by the Blueprints processor already.  The specific case where some support would be useful is for services not managed by Ambari.  An example would be a Hive Database that is running on a host that is not managed by Ambari.  Currently, this hostname property is excluded during Blueprint export: with no host group to reference, the processor cannot portably include the property value.  
    b. In the case I just described above, I think some support could be added, but I would recommend something much simpler than using references in this case.  
    c. I’d recommend modifying the BlueprintConfigurationProcessor to detect this case (service’s host not managed by Ambari) and to export the property anyway, with its value set to a simple token such as “EXTERNAL_HOSTNAME” or “UNMANAGED_HOST”.  This would allow the user of the exported Blueprint to detect when some additional configuration is required.  The BlueprintConfigurationProcessor could also be updated to error-check against this token, and report an error back to the caller when a Blueprint or Cluster Creation Template still includes it.  This type of approach would probably handle the External DB and Kerberos Host cases, and others as well.  
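
A rough Python sketch of the token approach in 4c follows. It is only an illustration of the idea, not a proposed patch to the BlueprintConfigurationProcessor: the function names and data shapes are hypothetical, and the host-group placeholder is written in the %HOSTGROUP::name% style as an assumption about how managed hosts are exported.

```python
UNMANAGED_HOST_TOKEN = "UNMANAGED_HOST"  # one of the token names suggested above


def export_host_property(hostname, host_to_group):
    """On export, keep the property but never leak a non-portable hostname."""
    group = host_to_group.get(hostname)
    if group is not None:
        # Managed host: refer to its host group instead of the hostname.
        return f"%HOSTGROUP::{group}%"
    # Host not managed by Ambari: mark the value so the user knows to fill it in.
    return UNMANAGED_HOST_TOKEN


def find_unresolved_hosts(configurations):
    """Return sorted (config_type, property_name) pairs still holding the token."""
    return sorted(
        (config_type, name)
        for config_type, props in configurations.items()
        for name, value in props.items()
        if UNMANAGED_HOST_TOKEN in str(value)
    )


def validate_before_deploy(configurations):
    """Fail early if a Blueprint or Cluster Creation Template still has the token."""
    unresolved = find_unresolved_hosts(configurations)
    if unresolved:
        names = ", ".join(f"{t}/{p}" for t, p in unresolved)
        raise ValueError(f"Replace {UNMANAGED_HOST_TOKEN} before deploying: {names}")
```

Because the check is a plain string match on a single token, the same code path would cover an external database host, a KDC host, or any other unmanaged endpoint without introducing a new reference syntax.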

5. Regarding the potential application of this proposal towards path fields:

  I’m not sure I agree that this would be something that should be added to Blueprints.  The Ambari Stack definitions already provide valid defaults for most use cases for the HDFS paths mentioned in this document.  If these settings are not acceptable to a subset of users, the Blueprint feature provides the facility to override these defaults in the Blueprint or Cluster Creation Template. 
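
The override mechanism mentioned here amounts to layering configuration sources. The Python sketch below shows the idea only; effective_config is a hypothetical helper, the property values are made up, and the precedence order (stack defaults, then Blueprint, then Cluster Creation Template) is my assumption about how the layers combine.

```python
def effective_config(stack_defaults, blueprint=None, cluster_template=None):
    """Merge configuration layers; later layers override earlier ones."""
    merged = {ctype: dict(props) for ctype, props in stack_defaults.items()}
    for layer in (blueprint or {}, cluster_template or {}):
        for config_type, props in layer.items():
            merged.setdefault(config_type, {}).update(props)
    return merged


# Illustrative values only: a stack default path overridden by the deployer.
stack = {"hdfs-site": {"dfs.namenode.name.dir": "/hadoop/hdfs/namenode"}}
template = {"hdfs-site": {"dfs.namenode.name.dir": "/data/1/namenode"}}
print(effective_config(stack, cluster_template=template))
```

A user who is happy with the stack default supplies nothing, and a user who is not overrides the single property, so no separate path document is needed.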

Thanks. 

CC [~mahadev], [~antndk], [~arborkar], [~sumitmohanty], [~stoader]



> Enhance blueprint support for using references
> ----------------------------------------------
>
>                 Key: AMBARI-15395
>                 URL: https://issues.apache.org/jira/browse/AMBARI-15395
>             Project: Ambari
>          Issue Type: Story
>          Components: ambari-server
>    Affects Versions: 2.4.0
>            Reporter: Shantanu Mundkur
>            Assignee: Amruta Borkar
>         Attachments: Blueprints_enhancement-AMBARI-15395-v3.docx, Blueprints_enhancement-AMBARI-15395-v3.pdf
>
>
> An exported blueprint should provide ready portability, i.e. an exported blueprint should be usable without changes to deploy another cluster. Some elements that are masked or omitted use tokens or placeholders. These have been called references in previous Jiras. A reference follows some convention that indicates that it is a reference by using a keyword and a pattern, e.g.
> ReferenceName:configType:configVersion:propertyName
> References would be good indicators of properties that a user could choose to customize before deploying the cluster. They could also indicate the need for a "global" default for that property in the cluster template. Examples:
>     - Passwords
>     - Hostnames 
>     - External databases
> Currently Ambari has a concept of SECRET references. E.g.
>         SECRET:hive-site:2:hive.server2.keystore.password
> These are used for masking the password when a blueprint is exported, and the reference itself is not exported. It would be useful to have a reference exported as long as it is processed appropriately during deployment. 
> Similar to the secret reference, for hostnames one could have,
>         HOST:kerberos-env:-1:kdc_host
> and so forth.
> For any reference, the cluster template would contain a corresponding property whose value is substituted for the reference during deployment if the registered blueprint has such references. Currently such behavior is used if a property of type password is not specified (default_password). Such references could be used to tag properties, flagging them as ones that users must customize or include in the cluster template. They can also serve as a way to annotate/comment parts of the blueprint JSON.


