Posted to dev@atlas.apache.org by David Radley <da...@uk.ibm.com> on 2016/12/12 17:57:25 UTC

Improvement suggestion: change terms to be implemented as entities

Hi,
I have raised Atlas Jiras ATLAS-1254 and ATLAS-1245. I would like your
feedback on changing the implementation of business/glossary terms to be
entities, rather than trait types and trait instances. This would mean:

1) A Term would have a guid, for ATLAS-1245.
2) TermResourceDefinition could be changed to add relationship
projections, to support ATLAS-1254. I suggest we have "has a", homonyms
and antonyms as the relationships.
        - has-a relationships would allow us to associate a Hive table
with one term and its columns with other, column-related terms. We could
then work with the business glossary terms, and the glossary would be
aware of the conceptual has-a relationship rather than needing to
interrogate the asset. Of course, glossary terms could be associated
using has-a relationships without being mapped to entities.
        - homonyms and antonyms are commonly used with business glossaries.

3) We would no longer create a new trait type, which cannot be deleted,
for every term. Instead we would have one system type for terms, which
all term entities would be associated with.
4) We would need to ensure we could still support available_as_tag for
terms; this means we expose the term by name as a tag.
5) I suggest we tolerate GETs on the term using the guid in the URI as
well as the fully qualified name. Creation of new terms should create
hrefs with the guid.
6) Term to term relationships would be simple in the code as we would use 
an entity to entity relationship. 
7) I notice that the Atlas technical user guide (page 60) talks of the
traits and tags terminology as being interchangeable. In the code (apart
from in the supplied trait types), it seems that traits are only used to
implement terms, I guess because terms are often known by their name. Tags
are somewhat different, as they are used to interact with Ranger for
tag-based policies.
8) The Atlas technical user guide talks of two ways of categorizing
entities: the business taxonomy and tags / traits. This change would be in
line with that separation.
9) Having a guid for terms would allow us to rename the term without 
changing its identifier. I assume we should allow multiple terms of the 
same name in different taxonomies.
10) I think the reasons that terms were implemented as trait instances are
that traits are identified by name, so do not need guids, and that if a
trait was an entity, a user could define a relationship to a term entity,
which would be confusing. My suggestion is that if the user chooses to
create a type with a relationship to a term, then we reject the creation
of the type. At the moment they presumably could create a relationship to
a taxonomy, which we should also reject.
11) As part of these changes, I suggest that entity responses also contain
a terms field, so that it is more obvious to a REST client which terms are
associated with an entity.
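
Points 1, 5 and 9 can be illustrated with a minimal in-memory sketch. This is not Atlas code: the Term and Glossary classes, and the "taxonomy.name" form of the fully qualified name, are assumptions made purely for illustration.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Term:
    """Hypothetical term entity; not an actual Atlas class."""
    taxonomy: str
    name: str
    # Point 1: every term gets its own guid.
    guid: str = field(default_factory=lambda: str(uuid.uuid4()))

    @property
    def qualified_name(self):
        # Assumed format: "<taxonomy>.<name>".
        return f"{self.taxonomy}.{self.name}"


class Glossary:
    """In-memory stand-in for a term store, keyed by guid."""

    def __init__(self):
        self._by_guid = {}

    def create(self, taxonomy, name):
        term = Term(taxonomy, name)
        self._by_guid[term.guid] = term
        return term

    def get(self, key):
        # Point 5: accept either the guid or the fully qualified name.
        if key in self._by_guid:
            return self._by_guid[key]
        for term in self._by_guid.values():
            if term.qualified_name == key:
                return term
        raise KeyError(key)

    def rename(self, guid, new_name):
        # Point 9: the guid is unchanged by a rename.
        self._by_guid[guid].name = new_name
```

Because the guid is the identifier, two taxonomies can each hold a term called "Account", and renaming one of them leaves any reference held by guid intact.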

Please let me know if I have missed/misunderstood/misrepresented anything. 
I appreciate your feedback, as I hope to address these Jiras soon.

Many thanks, David.
Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number 
741598. 
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU

Re: Improvement suggestion: change terms to be implemented as entities

Posted by David Radley <da...@uk.ibm.com>.
Hi Hemanth,
On point 10 - I suggest that all assets would have a predefined, optional
"system" relationship to terms, with multiple cardinality (terms will be
only one type in this proposal, so this could be a standard entity to
entity relationship). Does this work for you?
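
A rough sketch of how the point 10 suggestion might look, assuming a hypothetical type registry: every asset carries a built-in, optional, multi-valued relationship to terms, while user-defined types that declare their own relationship to the term type are rejected. The names TERM_TYPE, TypeRegistry and Asset are invented here and do not come from the Atlas code base.

```python
TERM_TYPE = "Term"  # hypothetical name for the single system term type


class TypeRegistry:
    """Hypothetical registry of user-defined types."""

    def __init__(self):
        self.types = {}

    def register(self, type_name, relationships):
        # relationships maps an attribute name to its target type name.
        if TERM_TYPE in relationships.values():
            raise ValueError(
                f"{type_name}: user-defined relationships to {TERM_TYPE} are rejected"
            )
        self.types[type_name] = dict(relationships)


class Asset:
    """Hypothetical asset with the predefined 'system' relationship to terms."""

    def __init__(self, type_name, name):
        self.type_name = type_name
        self.name = name
        self.terms = []  # optional: zero or more term guids

    def assign_term(self, term_guid):
        if term_guid not in self.terms:
            self.terms.append(term_guid)
```

With this shape, no per-type relationship definition is needed before a term can be attached to an asset, which is the property that traits offered.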

On the tag / trait front, I suggest we move to clarify the language in
the docs and APIs around these entities so there is no ambiguity.

Many thanks, David.



From:   Hemanth Yamijala <hy...@hortonworks.com>
To:     "dev@atlas.incubator.apache.org" <de...@atlas.incubator.apache.org>
Date:   13/12/2016 02:21
Subject:        Re: Improvement suggestion: change terms to be implemented 
as entities



David,

I hope folks who are more plugged into Atlas on a day-to-day basis will
provide relevant feedback. I have very few comments below.

Regarding point 10: AFAIK, the most significant constraint of implementing 
terms as entities was that entity to entity relationships needed to be 
predefined, while tags / traits could be associated to any entity without 
this prior definition.

Regarding point 7: Tags and traits are indeed interchangeable. In the 
Atlas UI specifically, we always refer to trait types as tags (which is 
confusing IMO, but well, that's where we are)

Thanks
hemanth

Re: Improvement suggestion: change terms to be implemented as entities

Posted by David Radley <da...@uk.ibm.com>.
Hi Madhan,
Thanks for joining the discussion. 

As you are probably aware, ATLAS-1186 is currently in review and introduces
a parent-child containment concept; this Jira is a pragmatic, minimal
change to get the functionality out. Longer term, I think we need support
for optional multiple composition relationships in types. This would allow
us to model parent-child relationships in the type system. I have raised
this in Jira ATLAS-1344. Once this is in place, the glossary objects
could use that implementation.

For ATLAS-1186 we manage the graph edges separately and use projections,
similar to taxonomies. If you look in ATLAS-1186, you will see a series of
follow-on Jiras listed in this area, all looking to make the Atlas
glossary more mature.

My thinking on term to term relationships is in ATLAS-1254, where we are
looking to add:
1) has a
2) antonym
3) homonym
4) replaces / replaced by, so the glossary can contain replacement
knowledge explicitly rather than in rename history.
5) relationship to 0 or more other glossary categories
6) a verb relationship, for example to be able to hold the information
behind a statement like 'The customer C owns account A.'
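
The first four relationship kinds in the list above could be modelled as typed edges between term guids. The sketch below is only an illustration under stated assumptions: the Rel enum and TermRelationships class are hypothetical, treating antonym and homonym as symmetric is an assumption, and items 5 and 6 are left out.

```python
from enum import Enum


class Rel(Enum):
    HAS_A = "has a"
    ANTONYM = "antonym"
    HOMONYM = "homonym"
    REPLACES = "replaces"


# Assumption: these two kinds read the same in both directions.
SYMMETRIC = {Rel.ANTONYM, Rel.HOMONYM}


class TermRelationships:
    """Hypothetical store of term to term edges."""

    def __init__(self):
        self._edges = set()  # (source guid, Rel, target guid)

    def add(self, source, rel, target):
        self._edges.add((source, rel, target))
        if rel in SYMMETRIC:
            self._edges.add((target, rel, source))

    def related(self, source, rel):
        return sorted(t for s, r, t in self._edges if s == source and r == rel)

    def replaced_by(self, target):
        # Inverse view of REPLACES: which terms replace the given one.
        return sorted(s for s, r, t in self._edges if r == Rel.REPLACES and t == target)
```

Holding "replaces" as an explicit edge is what lets the glossary answer "what replaced this term?" without consulting rename history.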

We could implement all of this in the glossary category style in
ATLAS-1186. Or we could enhance the type system as per ATLAS-1344 to add
support for optional multiple composite relationships between entities.

            many thanks, David. 





Re: Improvement suggestion: change terms to be implemented as entities

Posted by Madhan Neethiraj <ma...@apache.org>.
David,

Thanks for starting this thread with a lot of good details. I agree on modelling terms as entities instead of traits. This approach provides many flexibilities – like rename, move, delete.

In addition to support for has-a relationship between entities, I think we might need support for “parent” or “container” relationship – to capture the hierarchical organization of the terms in taxonomies.

Please give me a couple of days to go through all the details and respond.

Thanks,
Madhan




Re: Improvement suggestion: change terms to be implemented as entities

Posted by David Radley <da...@uk.ibm.com>.
Hi Shwetha,
Thank you, I had not considered the search side of things. I suggest that
for a minimum viable change for the first check-in, we do not change this
behavior; this may mean we need to special-case the code to ensure that
isa still performs as it did.

I do think that search needs to deal with taxonomies, glossary categories
and terms as first-level objects. I have raised Jira ATLAS-1372 to track
this requirement; I think we should do this after ATLAS-1186 has gone into
the code so it can consider glossary categories.

In line with what I proposed in ATLAS-1186 and ATLAS-1327, I think we
should also change terms to have a name (which is a simple name) and a
fully qualified name (which we derive at runtime from the term inheritance
hierarchy); we can do this once there is a unique id, rather than the
name, that identifies the term. I think this term name change would be
best done in a follow-on Jira (I have raised ATLAS-1373), as this would
change the term API and may require an API version change. I think we
would need the term name change to occur before we could think about
allowing terms to change taxonomies.
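
The split between a stored simple name and a derived fully qualified name can be sketched as follows. The Node class and the dot separator are assumptions for illustration only, not the Atlas implementation.

```python
class Node:
    """Hypothetical taxonomy or term node; only the simple name is stored."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent

    @property
    def qualified_name(self):
        # Derived at read time by walking up the hierarchy to the root.
        parts = []
        node = self
        while node is not None:
            parts.append(node.name)
            node = node.parent
        return ".".join(reversed(parts))
```

Since nothing but the simple name and the parent link is stored, renaming a node or moving it under a different parent changes the qualified name of everything beneath it without rewriting any stored identifiers.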

   all the best,  David. 
 







Re: Improvement suggestion: change terms to be implemented as entities

Posted by Shwetha Shivalingamurthy <ss...@hortonworks.com>.
Modeling terms as traits also enabled search to work out of the box. For
example, a query such as "search for assets with a term" maps to 'Asset isa
<term>' (though this worked only for leaf terms).

Modeling terms as entities will simplify some functionality, such as
term renames and moving a term from one hierarchy to another. Are you
planning to expose a different way of searching, or to use the existing
search, as in 'Asset where terms = <term>'?
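
To make the two query shapes above concrete, here is a minimal in-memory
sketch. It is not Atlas code and does not use the Atlas search API: the
Asset class, the function names, and the sample term names and guids are all
hypothetical, chosen only to contrast a trait-style "isa" lookup with an
attribute-style "where terms =" lookup.

```python
from dataclasses import dataclass, field


@dataclass
class Asset:
    """Hypothetical asset that carries both representations side by side."""
    name: str
    # Trait model: each term is a trait type attached to the asset by name.
    traits: set = field(default_factory=set)
    # Entity model: the asset references term entities by guid.
    terms: set = field(default_factory=set)


def search_isa(assets, term_name):
    """Trait-style lookup, analogous to 'Asset isa <term>'."""
    return [a.name for a in assets if term_name in a.traits]


def search_where_terms(assets, term_guid):
    """Attribute-style lookup, analogous to 'Asset where terms = <term>'."""
    return [a.name for a in assets if term_guid in a.terms]


# Sample data (hypothetical names and guids).
assets = [
    Asset("sales_table", traits={"Finance.Revenue"}, terms={"guid-1"}),
    Asset("hr_table", traits={"HR.Salary"}, terms={"guid-2"}),
]

by_trait = search_isa(assets, "Finance.Revenue")
by_term = search_where_terms(assets, "guid-2")
```

In the trait form the term's name is the lookup key, so a rename changes what
must be searched for. In the attribute form the key is the guid, so a rename
would leave existing references intact.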

Regards,
Shwetha






On 13/12/16, 7:50 AM, "Hemanth Yamijala" <hy...@hortonworks.com> wrote:

>David,
>
>I hope folks who are more plugged into Atlas on a day-to-day basis will
>provide relevant feedback. I have a very few comments below.
>
>Regarding point 10: AFAIK, the most significant constraint of
>implementing terms as entities was that entity to entity relationships
>needed to be predefined, while tags / traits could be associated to any
>entity without this prior definition.
>
>Regarding point 7: Tags and traits are indeed interchangeable. In the
>Atlas UI specifically, we always refer to trait types as tags (which is
>confusing IMO, but well, that's where we are)
>
>Thanks
>hemanth


Re: Improvement suggestion: change terms to be implemented as entities

Posted by Hemanth Yamijala <hy...@hortonworks.com>.
David,

I hope folks who are more plugged into Atlas on a day-to-day basis will provide relevant feedback. I have a very few comments below.

Regarding point 10: AFAIK, the most significant constraint of implementing terms as entities was that entity to entity relationships needed to be predefined, while tags / traits could be associated to any entity without this prior definition.
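
The constraint described in the paragraph above can be illustrated with a toy
type system. This is only a sketch under stated assumptions, not how Atlas
implements its type system: TypeRegistry, Entity, and the type and attribute
names are hypothetical. It shows a trait being attached to an entity whose
type never mentioned it, while an entity-to-entity link is rejected unless
the entity type declared that relationship in advance.

```python
class TypeRegistry:
    """Hypothetical registry: entity types declare relationships up front."""

    def __init__(self):
        self.entity_types = {}   # type name -> set of declared relationship names
        self.trait_types = set()

    def define_entity_type(self, name, relationships=()):
        self.entity_types[name] = set(relationships)

    def define_trait_type(self, name):
        self.trait_types.add(name)


class Entity:
    def __init__(self, registry, type_name):
        if type_name not in registry.entity_types:
            raise ValueError(f"unknown entity type: {type_name}")
        self.registry = registry
        self.type_name = type_name
        self.links = {}
        self.traits = set()

    def link(self, attribute, target):
        # Entity-to-entity links must match a relationship declared on the type.
        declared = self.registry.entity_types[self.type_name]
        if attribute not in declared:
            raise ValueError(f"{self.type_name} has no relationship '{attribute}'")
        self.links.setdefault(attribute, []).append(target)

    def add_trait(self, trait_name):
        # Traits only need to exist as a trait type; the entity type is not consulted.
        if trait_name not in self.registry.trait_types:
            raise ValueError(f"unknown trait type: {trait_name}")
        self.traits.add(trait_name)


registry = TypeRegistry()
registry.define_entity_type("hive_table")   # declares no relationships
registry.define_entity_type("term")
registry.define_trait_type("Revenue")

table = Entity(registry, "hive_table")
term = Entity(registry, "term")

table.add_trait("Revenue")                  # works without any prior definition

try:
    table.link("terms", term)               # rejected: 'terms' was never declared
    link_error = None
except ValueError as exc:
    link_error = str(exc)
```

Under this model, moving terms to entities would mean either declaring a
term relationship on every entity type or providing a generic relationship
that all types share.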

Regarding point 7: Tags and traits are indeed interchangeable. In the Atlas UI specifically, we always refer to trait types as tags (which is confusing IMO, but well, that's where we are)

Thanks
hemanth