Posted to dev@atlas.apache.org by "Nigel Jones (JIRA)" <ji...@apache.org> on 2017/07/19 15:11:00 UTC

[jira] [Commented] (ATLAS-1955) Validation for Attributes

    [ https://issues.apache.org/jira/browse/ATLAS-1955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16093242#comment-16093242 ] 

Nigel Jones commented on ATLAS-1955:
------------------------------------

I believe here you are modeling the fact that an email has to follow a certain format, so that metadata should be captured in Atlas.
However, the actual validation of instances of this data, i.e. a customer record being stored in a DB, would typically happen outside Atlas. In addition, the validation may differ, as it would be specific to the data processing system being used: an ETL engine, HBase, a filesystem, different languages.
So I think the model is more similar to that of policies and rules, and how Ranger works.
In Atlas we have a business-centric definition of a policy, but the actual implementation sits at the enforcement point (in this case a Ranger rule).
I'm interested in adding the capability to capture metadata from Ranger, so that we can tie the rule implementation back to the policy to aid compliance checks and reporting, and also allow Ranger to query Atlas for policies when a security admin is creating a rule.
So I wonder if the same pattern applies here with validation?
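
To make the pattern concrete, here is a rough sketch in Python of what "definition in the catalog, implementation at the enforcement point" could look like for validation. Every name in it (ValidationDefinition, Implementation, missing_implementations, the system names and artifact strings) is made up for illustration; none of it is an existing Atlas or Ranger API.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Implementation:
    """Where a rule is actually enforced (hypothetical record)."""
    system: str    # e.g. "etl", "hbase", "filesystem"
    artifact: str  # pointer to the system-specific rule, job or script


@dataclass
class ValidationDefinition:
    """Business-level definition that would live in the metadata catalog."""
    name: str
    description: str
    implementations: List[Implementation] = field(default_factory=list)


def missing_implementations(
    definitions: List[ValidationDefinition],
    required_systems: List[str],
) -> Dict[str, List[str]]:
    """Compliance-style report: for each definition, list the required
    systems that have no implementation linked back to it."""
    report: Dict[str, List[str]] = {}
    for definition in definitions:
        covered = {impl.system for impl in definition.implementations}
        missing = [s for s in required_systems if s not in covered]
        if missing:
            report[definition.name] = missing
    return report


email_format = ValidationDefinition(
    name="Email format",
    description="An email address has to follow a certain format",
    implementations=[
        Implementation(system="etl", artifact="jobs/customer_load#email_check"),
        Implementation(system="hbase", artifact="coprocessor:EmailCheck"),
    ],
)

report = missing_implementations([email_format], ["etl", "hbase", "filesystem"])
```

In this sketch the catalog never runs the validation itself. It only records which systems claim to implement the definition, which is enough to answer reporting questions such as "where is this rule not yet enforced?".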

> Validation for Attributes
> -------------------------
>
>                 Key: ATLAS-1955
>                 URL: https://issues.apache.org/jira/browse/ATLAS-1955
>             Project: Atlas
>          Issue Type: New Feature
>          Components:  atlas-core
>    Affects Versions: 0.9-incubating
>            Reporter: Israel Varea
>             Fix For: 0.9-incubating
>
>
> It would be very nice if the Atlas model could contain a way to represent attribute validation. 
> A simple example: we would like to model a Person with attributes Name, Email and Country. We would then like to specify that Email has to match a specific regular expression, so it would be nice if we could set Email -> hasValidation -> EmailRegex, with EmailRegex having:
> Name: Email Regular Expression
> Expression: /[0-9a-z]+@[0-9a-z]+\.[0-9a-z]+/
> For more complex types of validation, e.g. checking card number validity, some external validator function/service could be added:
> Name: Credit Card Number Validator
> Validator: org.apache.atlas.validators.creditcard or https://host:port/creditCardValidator
> For validations from a reference table, for example a country name, it could be:
> Name: Country Name Ref Validator
> Reference Column: <country_name_column>
> where <country_name_column> would be an instance of type Hive_Column or HBase_Column.
> Since this is a kind of standardization, it could be placed in [Area 5|https://cwiki.apache.org/confluence/display/ATLAS/Area+5+-+Standards].
> A similar approach is followed in the [Kylo|https://github.com/Teradata/kylo/tree/master/integrations/spark/spark-validate-cleanse] software.
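
For illustration only, the three kinds of validation described in the issue could be sketched in Python roughly as follows. This is not Atlas code: the class names, the CardNumber attribute, the luhn_check function and the hard-coded country set (standing in for the values of a Hive or HBase reference column) are all assumptions. The email expression is the one from the issue, with the delimiters stripped and the dot escaped so that it matches a literal ".".

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet


@dataclass(frozen=True)
class RegexValidation:
    name: str
    expression: str

    def is_valid(self, value: str) -> bool:
        # fullmatch requires the whole value to match the expression
        return re.fullmatch(self.expression, value) is not None


@dataclass(frozen=True)
class FunctionValidation:
    name: str
    validator: Callable[[str], bool]  # stands in for an external function/service

    def is_valid(self, value: str) -> bool:
        return self.validator(value)


@dataclass(frozen=True)
class ReferenceValidation:
    name: str
    reference_values: FrozenSet[str]  # stands in for a reference column

    def is_valid(self, value: str) -> bool:
        return value in self.reference_values


def luhn_check(number: str) -> bool:
    """Luhn checksum, used here as a stand-in card number validator."""
    digits = [int(c) for c in number if c.isdigit()]
    if not digits:
        return False
    total = 0
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:  # double every second digit from the right
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


# attribute -> hasValidation -> validation
VALIDATIONS: Dict[str, object] = {
    "Email": RegexValidation(
        "Email Regular Expression", r"[0-9a-z]+@[0-9a-z]+\.[0-9a-z]+"
    ),
    "CardNumber": FunctionValidation("Credit Card Number Validator", luhn_check),
    "Country": ReferenceValidation(
        "Country Name Ref Validator", frozenset({"Spain", "France", "Germany"})
    ),
}


def validate(record: Dict[str, str]) -> Dict[str, bool]:
    """Check each attribute that has a validation; missing values fail."""
    return {
        attribute: validation.is_valid(record.get(attribute, ""))
        for attribute, validation in VALIDATIONS.items()
    }


result = validate({
    "Name": "jane",
    "Email": "jane@example.org",
    "CardNumber": "4111111111111111",
    "Country": "Spain",
})
```

Name has no validation attached, so it does not appear in the result. Attributes without a value are reported as invalid rather than skipped.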



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)