Posted to dev@atlas.apache.org by Sarath Subramanian <sa...@apache.org> on 2017/06/28 01:36:25 UTC

[DISCUSS] Restrict AtlasStruct and AtlasClassification attributes to primitive and enum types

Hello all,



As part of the new relationship design (ATLAS-1690,
https://issues.apache.org/jira/browse/ATLAS-1690), we are planning to
restrict AtlasStruct and AtlasClassification types to have attributes of
primitive and enum types only. Attributes that refer to entity/entities
will no longer be supported for AtlasStruct and AtlasClassification types.
Please note that none of the existing out-of-the-box types are impacted by
this change.
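
As an illustration of the restriction described above, here is a minimal
sketch of the kind of check it implies. This is not Atlas code: the function
name, the list of primitive type names, and the example attribute and type
names are all assumptions made for the example. Whether arrays and maps of
primitives would also be accepted is not settled in this message, so the
sketch rejects anything that is not a plain primitive or enum type name.

```python
# Hypothetical sketch, not Atlas code: type and function names are assumptions.
PRIMITIVE_TYPES = {
    "boolean", "byte", "short", "int", "long", "float", "double",
    "biginteger", "bigdecimal", "string", "date",
}


def find_disallowed_attributes(attribute_types, enum_types):
    """Return the names of attributes whose type is neither primitive nor enum.

    attribute_types maps attribute name -> declared type name.
    enum_types is the set of enum type names known to the type system.
    """
    allowed = PRIMITIVE_TYPES | set(enum_types)
    return sorted(
        name for name, type_name in attribute_types.items()
        if type_name not in allowed
    )


# A classification with an enum, a primitive, and an entity-reference attribute.
certification = {
    "level": "CertificationLevel",   # enum: allowed
    "issuedOn": "date",              # primitive: allowed
    "license": "LicenseType",        # entity reference: would be rejected
}
print(find_disallowed_attributes(certification, {"CertificationLevel"}))
```

Run as written, this prints ['license'], the one attribute that would have
to be remodelled as a relationship between entities.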



If you have any concerns with this change please let us know.





Thanks,

Sarath Subramanian

Re: [DISCUSS] Restrict AtlasStruct and AtlasClassification attributes to primitive and enum types

Posted by Mandy Chessell <ma...@uk.ibm.com>.
Hello Madhan,
This makes sense.  The areas that are impacted by this change are the 
governance and collaboration models, which are a new space for Atlas, so 
rework is cheap.

I am happy to go ahead and replace each of the richer classifications and 
structs in the common model with an entity/relationship combo.  This means 
concepts such as user tags and certifications will be represented by 
separate entities. 

If the relationship is defined as a composition, I am assuming that 
(soft-)deleting the main entity will also delete the minor entity.  For 
example, if a DataSet entity that is related to a user tag of "useful" 
(also an entity) is deleted, then the tag entity would also be deleted if 
the relationship between them is a composition.  As such, we would not get 
zombie tag entities as a result of modelling them as entities 
rather than classifications.
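
The behaviour assumed above can be sketched as follows. This is a
hypothetical model for discussion, not the Atlas delete handler: the Entity
class, the composed_children field, and the GUIDs are all invented for the
example.

```python
# Hypothetical sketch, not Atlas code: the data model and names are assumptions.
from dataclasses import dataclass, field


@dataclass
class Entity:
    guid: str
    type_name: str
    deleted: bool = False
    # GUIDs of entities this entity owns through a composition relationship.
    composed_children: list = field(default_factory=list)


def soft_delete(guid, entities):
    """Mark an entity deleted and cascade to everything it owns by composition."""
    pending = [guid]
    while pending:
        entity = entities[pending.pop()]
        if entity.deleted:
            continue  # already handled; also guards against cycles
        entity.deleted = True
        pending.extend(entity.composed_children)


entities = {
    "ds-1": Entity("ds-1", "DataSet", composed_children=["tag-1"]),
    "tag-1": Entity("tag-1", "UserTag"),
    "ds-2": Entity("ds-2", "DataSet"),
}
soft_delete("ds-1", entities)
print([g for g, e in entities.items() if e.deleted])
```

This prints ['ds-1', 'tag-1']: the tag entity owned by the deleted DataSet
is marked deleted with it, and the unrelated DataSet is untouched.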

This is a less explicit model, but if it keeps the processing simple in the 
structs, classifications and relationships as you describe, then it is 
worth it.  It also creates a clear definition of when to use a 
classification or an entity/relationship combo.  This is currently a 
little bit subtle as the classifications become rich :) We would also not 
need cardinality on the classifications, which is less work.

All the best
Mandy
___________________________________________
Mandy Chessell CBE FREng CEng FBCS
IBM Distinguished Engineer

Master Inventor
Member of the IBM Academy of Technology
Visiting Professor, Department of Computer Science, University of 
Sheffield

Email: mandy_chessell@uk.ibm.com
LinkedIn: http://www.linkedin.com/pub/mandy-chessell/22/897/a49

Assistant: Janet Brooks - jsbrooks12@uk.ibm.com



From:   Madhan Neethiraj <ma...@apache.org>
To:     "dev@atlas.apache.org" <de...@atlas.apache.org>
Cc:     "dev@atlas.incubator.apache.org" <de...@atlas.incubator.apache.org>, 
"user@atlas.incubator.apache.org" <us...@atlas.incubator.apache.org>
Date:   28/06/2017 10:09
Subject:        Re: [DISCUSS] Restrict AtlasStruct and AtlasClassification 
attributes to primitive and enum types
Sent by:        Madhan Neethiraj <mn...@hortonworks.com>



Mandy,

The relationship introduced in ATLAS-1690 requires an entity type to be 
specified for each end, effectively injecting an attribute into the entity 
types specified. This doesn’t allow struct or classification types at 
relationship ends. In addition, allowing structs to hold entity references 
adds complexity in dealing with edges, especially considering cases like 
array/map of structs. None of the existing models use object-reference 
attributes in structs, and we haven’t come across such use by anyone so 
far. Hence this discussion thread, to find any existing use of structs to 
hold object references. It will help us understand concrete use cases 
better and perhaps offer an alternate approach to model the same.

Thanks,
Madhan








Re: [DISCUSS] Restrict AtlasStruct and AtlasClassification attributes to primitive and enum types

Posted by Madhan Neethiraj <ma...@apache.org>.
Mandy,

The relationship introduced in ATLAS-1690 requires an entity type to be specified for each end, effectively injecting an attribute into the entity types specified. This doesn’t allow struct or classification types at relationship ends. In addition, allowing structs to hold entity references adds complexity in dealing with edges, especially considering cases like array/map of structs. None of the existing models use object-reference attributes in structs, and we haven’t come across such use by anyone so far. Hence this discussion thread, to find any existing use of structs to hold object references. It will help us understand concrete use cases better and perhaps offer an alternate approach to model the same.
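
To illustrate the constraint on relationship ends described above, here is a minimal sketch. It is not the ATLAS-1690 implementation: the function name, the category labels, and the example type names are assumptions made for the example.

```python
# Hypothetical sketch, not Atlas code: names and categories are assumptions.
TYPE_CATEGORIES = {
    "DataSet": "ENTITY",
    "License": "ENTITY",
    "TermsAndConditions": "CLASSIFICATION",
    "Address": "STRUCT",
}


def invalid_relationship_ends(end1_type, end2_type, categories=TYPE_CATEGORIES):
    """Return the end types that are not entity types (empty list if valid)."""
    return [
        type_name for type_name in (end1_type, end2_type)
        if categories.get(type_name) != "ENTITY"
    ]


print(invalid_relationship_ends("DataSet", "License"))
print(invalid_relationship_ends("TermsAndConditions", "License"))
```

The first call prints [] because both ends are entity types. The second prints ['TermsAndConditions'], since a classification cannot sit at a relationship end under this rule.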

Thanks,
Madhan


On 6/28/17, 12:17 AM, "Mandy Chessell" <ma...@uk.ibm.com> wrote:

    Hello Sarath,
    These restrictions will have an impact on the open metadata and governance 
    type model - and I expect other models that people have built.  The impact 
    of this change is that many of the structs and classifications from the 
    open metadata and governance type model will need to be changed to an 
    entity plus relationship combo, creating overhead and complexity in the 
    model and reducing clarity.  Perhaps it would help to explain why these 
    restrictions are needed?
    
    Here are some examples
    
    The restriction that we can only have one instance of a type of 
    classification associated with an entity is creating a problem for 
    modelling social tagging and assignment of certifications to assets.  As 
    such we need to have an array of structs inside these classifications. 
    Each cell in the array represents a user's tag or a certification that the 
    asset has received.  In the longer term, a better solution to this use 
    case is of course to allow cardinality on the classification.
    
    Structs and classifications often need to include a relationship.  For 
    example, if we use a classification to represent the terms and conditions 
    that apply to an asset - then ideally we would want a relationship to the 
    contract or license type.
    
    I am assuming that maps and arrays are considered primitive, since that 
    will impact the Hive model.
    
    All the best
    Mandy
    ___________________________________________
    Mandy Chessell CBE FREng CEng FBCS
    IBM Distinguished Engineer
    
    Master Inventor
    Member of the IBM Academy of Technology
    Visiting Professor, Department of Computer Science, University of 
    Sheffield
    
    Email: mandy_chessell@uk.ibm.com
    LinkedIn: http://www.linkedin.com/pub/mandy-chessell/22/897/a49
    
    Assistant: Janet Brooks - jsbrooks12@uk.ibm.com
    
    
    
    From:   Sarath Subramanian <sa...@apache.org>
    To:     dev@atlas.incubator.apache.org, user@atlas.incubator.apache.org
    Date:   28/06/2017 02:43
    Subject:        [DISCUSS] Restrict AtlasStruct and AtlasClassification 
    attributes to primitive and enum types
    
    
    
    Hello all,
    
    
    
    As part of the new relationship design (*ATLAS-1690
    <https://issues.apache.org/jira/browse/ATLAS-1690>)*, we are planning to
    restrict AtlasStruct and AtlasClassification types to have attributes of
    primitive and enum types only. Attributes that refer to entity/entities
    will no longer be supported for AtlasStruct and AtlasClassification types.
    Please note that none of the existing out of the box types are impacted by
    this change.
    
    
    
    If you have any concerns with this change please let us know.
    
    
    
    
    
    Thanks,
    
    Sarath Subramanian
    
    
    



Re: [DISCUSS] Restrict AtlasStruct and AtlasClassification attributes to primitive and enum types

Posted by Mandy Chessell <ma...@uk.ibm.com>.
Hello Sarath,
These restrictions will have an impact on the open metadata and governance 
type model - and I expect other models that people have built.  The impact 
of this change is that many of the structs and classifications from the 
open metadata and governance type model will need to be changed to an 
entity plus relationship combo, creating overhead and complexity in the 
model and reducing clarity.  Perhaps it would help to explain why these 
restrictions are needed?

Here are some examples

The restriction that we can only have one instance of a type of 
classification associated with an entity is creating a problem for 
modelling social tagging and assignment of certifications to assets.  As 
such we need to have an array of structs inside these classifications. 
Each cell in the array represents a user's tag or a certification that the 
asset has received.  In the longer term, a better solution to this use 
case is of course to allow cardinality on the classification.

Structs and classifications often need to include a relationship.  For 
example, if we use a classification to represent the terms and conditions 
that apply to an asset - then ideally we would want a relationship to the 
contract or license type.

I am assuming that maps and arrays are considered primitive, since that 
will impact the Hive model.

All the best
Mandy
___________________________________________
Mandy Chessell CBE FREng CEng FBCS
IBM Distinguished Engineer

Master Inventor
Member of the IBM Academy of Technology
Visiting Professor, Department of Computer Science, University of 
Sheffield

Email: mandy_chessell@uk.ibm.com
LinkedIn: http://www.linkedin.com/pub/mandy-chessell/22/897/a49

Assistant: Janet Brooks - jsbrooks12@uk.ibm.com



From:   Sarath Subramanian <sa...@apache.org>
To:     dev@atlas.incubator.apache.org, user@atlas.incubator.apache.org
Date:   28/06/2017 02:43
Subject:        [DISCUSS] Restrict AtlasStruct and AtlasClassification 
attributes to primitive and enum types



Hello all,



As part of the new relationship design (*ATLAS-1690
<https://issues.apache.org/jira/browse/ATLAS-1690>)*, we are planning to
restrict AtlasStruct and AtlasClassification types to have attributes of
primitive and enum types only. Attributes that refer to entity/entities
will no longer be supported for AtlasStruct and AtlasClassification types.
Please note that none of the existing out of the box types are impacted by
this change.



If you have any concerns with this change please let us know.





Thanks,

Sarath Subramanian



Re: [DISCUSS] Restrict AtlasStruct and AtlasClassification attributes to primitive and enum types

Posted by David Radley <da...@uk.ibm.com>.
Hi,
Some thoughts:

On relationships having relationships 
I assume we do not want this. 

On structs having relationships 
I was wondering whether we should implement structs as properties in the 
containing entity, rather than as a separate vertex in the graph. Supporting 
relationships from structs then becomes an extension to the existing 
relationship instance logic. I think this could simplify things, although 
it would break existing DSL queries and applications.  If we were to do 
this, we would need to enhance endDefs to be able to have a struct type or 
an entity type (do we support struct type to struct type 
relationshipDefs?). If we had a set of structures in an entity and the 
structures had relationships, I am not sure how we would identify which 
structure we are referring to. I assume structure instances would need to 
have guids, in which case aren't structures very close to composed 
entities? I wonder what we would lose if we were to use structs only when 
they have local content, and model the other cases as entities connected 
by composition?

On classifications having relationships
We need to be able to have more than one classification on an entity for 
some classification types, to meet the use case Mandy talks of. 
Classification instances do not have guids, so if we have an attribute 
with a set of them, we have the same issue as with relationships. 
I wonder whether we could have a standard structure of guid and type name 
(both primitives). Could we then allow classifications to have structures, 
but not allow structures to have relationships? We would need to model 
structure properties in the same vertex as the owning entity.

In the above, we are assuming that maps and arrays are considered 
primitives, presumably along with SETs and LISTs and nested combinations 
of all of these.


In summary I think the following could be flexible enough to deal with the 
use cases we need:
- change struct instances so they are modelled as properties in the owning 
entity instance vertex.
- do not allow sets of structs - only lists or arrays or maps so we can 
identify them 
- do not allow structs or classifications to have relationships 
- allow classification to have structs. Introduce a struct with a type and 
guid (reuse AtlasObjectId?). 
- allow multiple classifications to be applied to an entity based on a new 
classificationDef property.
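
The first two points in the list above could look something like the
following sketch. It is only an illustration of the idea, not a proposed
implementation: the function name, the "name[index].field" property naming
scheme, and the example values are assumptions.

```python
# Hypothetical sketch, not Atlas code: the property naming scheme is an assumption.
def flatten_struct_list(attribute_name, structs):
    """Flatten a list of struct values into vertex properties.

    Each struct is addressed by its list index, which is why the proposal
    allows lists, arrays, or maps of structs but not unordered sets.
    """
    properties = {}
    for index, struct in enumerate(structs):
        for field_name, value in struct.items():
            properties[f"{attribute_name}[{index}].{field_name}"] = value
    return properties


# The owning entity vertex carries the struct fields as its own properties.
vertex = {"guid": "ds-1", "typeName": "DataSet"}
vertex.update(flatten_struct_list("addresses", [
    {"city": "Winchester", "country": "UK"},
    {"city": "Santa Clara", "country": "US"},
]))
print(vertex["addresses[1].city"])
```

This prints Santa Clara. No separate vertex or edge is created for either
struct; each one is identified only by its position in the list.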


Any thoughts?
All the best, David.

 

 



From:   Mandy Chessell <ma...@uk.ibm.com>
To:     dev@atlas.apache.org
Cc:     dev@atlas.incubator.apache.org, user@atlas.incubator.apache.org
Date:   28/06/2017 08:18
Subject:        Re: [DISCUSS] Restrict AtlasStruct and AtlasClassification 
attributes to primitive and enum types



Hello Sarath,
These restrictions will have an impact on the open metadata and governance 
type model - and I expect other models that people have built.  The impact 
of this change is that many of the structs and classifications from the 
open metadata and governance type model will need to be changed to an 
entity plus relationship combo, creating overhead and complexity in the 
model and reducing clarity.  Perhaps it would help to explain why these 
restrictions are needed?

Here are some examples

The restriction that we can only have one instance of a type of 
classification associated with an entity is creating a problem for 
modelling social tagging and assignment of certifications to assets.  As 
such we need to have an array of structs inside these classifications. 
Each cell in the array represents a user's tag or a certification that the 
asset has received.  In the longer term, a better solution to this use 
case is of course to allow cardinality on the classification.

Structs and classifications often need to include a relationship.  For 
example, if we use a classification to represent the terms and conditions 
that apply to an asset - then ideally we would want a relationship to the 
contract or license type.

I am assuming that maps and arrays are considered primitive, since that 
will impact the Hive model.

All the best
Mandy
___________________________________________
Mandy Chessell CBE FREng CEng FBCS
IBM Distinguished Engineer

Master Inventor
Member of the IBM Academy of Technology
Visiting Professor, Department of Computer Science, University of 
Sheffield

Email: mandy_chessell@uk.ibm.com
LinkedIn: http://www.linkedin.com/pub/mandy-chessell/22/897/a49

Assistant: Janet Brooks - jsbrooks12@uk.ibm.com



From:   Sarath Subramanian <sa...@apache.org>
To:     dev@atlas.incubator.apache.org, user@atlas.incubator.apache.org
Date:   28/06/2017 02:43
Subject:        [DISCUSS] Restrict AtlasStruct and AtlasClassification 
attributes to primitive and enum types



Hello all,



As part of the new relationship design (*ATLAS-1690
<https://issues.apache.org/jira/browse/ATLAS-1690>)*, we are planning to
restrict AtlasStruct and AtlasClassification types to have attributes of
primitive and enum types only. Attributes that refer to entity/entities
will no longer be supported for AtlasStruct and AtlasClassification types.
Please note that none of the existing out of the box types are impacted by
this change.



If you have any concerns with this change please let us know.





Thanks,

Sarath Subramanian




