Posted to dev@atlas.apache.org by Stephanie Hazlewood <st...@ca.ibm.com> on 2017/01/26 20:43:19 UTC

DISCUSS: prescriptive types/services

I've been following the work and discussions that have been happening on 
the V2 API and have been wondering what the community thinks about the 
idea of also having some prescriptive, consumer-friendly services 
(prescriptive, yet extensible) and models around how to store metadata in 
the Atlas repository for particular types. These APIs would not replace, 
but supplement the more generic V2 API.

One example of a metadata 'domain' that might commonly be stored is the 
metadata describing a data resource and how to connect to it.  I 
think we can come up with a prescriptive model of how to store information 
about how to connect to common resources (e.g., relational dbs, files, 
etc.) so that users who want to store this kind of information in Atlas 
don't have to reinvent the wheel.  I could also envision this facilitating 
cases where there are multiple Atlas repository deployments - having a 
similar model and services for accessing information of a similar type 
would be helpful.

There are a number of other domains that could also be considered (e.g., 
discovery metadata such as profile data, classifications, quality etc.) - 
just wanted to open the discussion and get your thoughts on this.  Has 
there been any discussion/thought on this topic to date? 

Best regards,
Stephanie


Re: DISCUSS: prescriptive types/services

Posted by David Radley <da...@uk.ibm.com>.
Hi Stephanie,
Some thoughts on this:

1) It seems that the way we currently connect to resources in Atlas is 
through the bridges; would this approach be sufficient - or do you think 
we need more for connectors? One way to enhance the bridges would be to 
have a top-level connector type and then have Hive etc. inherit from this 
top type. In setting up a hierarchy like this you could use the type 
system to group like with like, for example having an RDB class that has 
subclasses for DB2, Oracle and others. Would this meet at least part of 
the use cases you envisage? 

2) At the moment we have typeDefs. I wonder if the typeDefs could be held 
in a system model entity. In this way you could name models and give them 
a description, a version and other attributes. 
3) I suggest that you prototype the domains you need using the existing 
type system. You may get some clarity on: 
        - How the existing type system supports what you need and what 
enhancements might be needed. 
        - Whether the information you want to store in a "domain" is all 
storable in the type system, or whether you envisage the need for 
additional custom code in Atlas. 
        - Bidirectional associations, which the sort of models you 
envisage will need to support. From what I have seen, an application needs 
to manage the pointers from a source to the target and the reverse itself. 
I think we need Atlas to maintain the integrity of bidirectional 
associations (ATLAS-1459 documents this).
4) I like the idea of having some pre-canned content in the domains in 
Atlas, so companies can standardize on common ways of connecting to 
resources and setting up classification content that is widely useful and 
the like. 
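
To make points 1 and 3 a little more concrete, here is a minimal Python 
sketch. It is not Atlas code and does not use the Atlas API: the 
TypeRegistry and AssociationStore classes, their methods and the entity 
names are all hypothetical, chosen only to illustrate a connector type 
hierarchy and a store that keeps both ends of an association in step.

```python
class TypeRegistry:
    """Hypothetical, simplified registry of type names and their supertypes."""

    def __init__(self):
        self._supers = {}  # type name -> tuple of direct supertype names

    def register(self, name, *super_types):
        # Refuse to register a type whose supertypes are not yet known.
        for parent in super_types:
            if parent not in self._supers:
                raise KeyError(f"unknown supertype: {parent}")
        self._supers[name] = tuple(super_types)

    def is_a(self, name, ancestor):
        # A type is its own ancestor; otherwise walk up the supertypes.
        if name == ancestor:
            return True
        return any(self.is_a(parent, ancestor) for parent in self._supers[name])

    def subtypes_of(self, ancestor):
        # Group "like with like": every registered type below the ancestor.
        return sorted(
            name for name in self._supers
            if name != ancestor and self.is_a(name, ancestor)
        )


class AssociationStore:
    """Hypothetical store that maintains both directions of an association."""

    def __init__(self):
        self._forward = {}  # source -> set of targets
        self._reverse = {}  # target -> set of sources

    def link(self, source, target):
        # One call updates both ends, so callers cannot leave them inconsistent.
        self._forward.setdefault(source, set()).add(target)
        self._reverse.setdefault(target, set()).add(source)

    def unlink(self, source, target):
        self._forward.get(source, set()).discard(target)
        self._reverse.get(target, set()).discard(source)

    def targets_of(self, source):
        return set(self._forward.get(source, ()))

    def sources_of(self, target):
        return set(self._reverse.get(target, ()))


# Example hierarchy from point 1: a top-level connector type with an RDB
# class that has DB2 and Oracle subclasses, and Hive directly under the top.
registry = TypeRegistry()
registry.register("Connector")
registry.register("RDB", "Connector")
registry.register("DB2", "RDB")
registry.register("Oracle", "RDB")
registry.register("Hive", "Connector")
print(registry.subtypes_of("RDB"))

# Example association from point 3: linking once makes it navigable both ways.
store = AssociationStore()
store.link("sales_table", "db2_connection")
print(store.targets_of("sales_table"), store.sources_of("db2_connection"))
```

The point of the second class is that the application calls link() once 
and the store, rather than the caller, is responsible for the reverse 
pointer; that is the behaviour I would like Atlas itself to provide.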

        all the best, David. 



