Posted to dev@atlas.apache.org by Sarath Subramanian <sa...@apache.org> on 2017/06/28 01:36:25 UTC
[DISCUSS] Restrict AtlasStruct and AtlasClassification attributes to
primitive and enum types
Hello all,
As part of the new relationship design (ATLAS-1690
<https://issues.apache.org/jira/browse/ATLAS-1690>), we are planning to
restrict AtlasStruct and AtlasClassification types to have attributes of
primitive and enum types only. Attributes that refer to entity/entities
will no longer be supported for AtlasStruct and AtlasClassification types.
Please note that none of the existing out-of-the-box types are impacted by
this change.
If you have any concerns with this change, please let us know.
Thanks,
Sarath Subramanian
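To make the proposed rule concrete, here is a minimal sketch of the check it implies. The function name, the dict-based attribute layout, and the set of primitive type names are assumptions for illustration only; this is not the Atlas type registry API. Arrays and maps are deliberately left out, since the thread has not settled how they are treated.

```python
# Illustrative set of primitive type names (an assumption, not Atlas's registry).
PRIMITIVE_TYPES = {
    "boolean", "byte", "short", "int", "long",
    "float", "double", "string", "date",
}


def disallowed_attributes(attributes, enum_types):
    """Return names of attributes whose type is neither primitive nor a known enum.

    attributes: dict of attribute name -> type name (hypothetical layout).
    enum_types: iterable of enum type names that have been defined.
    """
    allowed = PRIMITIVE_TYPES | set(enum_types)
    return [name for name, type_name in attributes.items()
            if type_name not in allowed]


# A classification with one enum attribute and one entity reference.
certification = {"level": "CertificationLevel", "issuedBy": "Organization"}
print(disallowed_attributes(certification, enum_types={"CertificationLevel"}))
# prints ['issuedBy']
```

Under this rule, "issuedBy" would have to move out of the classification and become a relationship between entities.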
Re: [DISCUSS] Restrict AtlasStruct and AtlasClassification attributes to
primitive and enum types
Posted by Mandy Chessell <ma...@uk.ibm.com>.
Hello Madhan,
This makes sense. The areas that are impacted by this change are the
governance and collaboration models, which are a new space for Atlas, so
rework is cheap.
I am happy to go ahead and replace each of the richer classifications and
structs in the common model with an entity/relationship combo. This means
concepts such as user tags and certifications will be represented by
separate entities.
If the relationship is defined as a composition, I am assuming that
(soft-)deleting the main entity will also delete the minor entity. For
example, if a DataSet entity that is related to a user tag of "useful"
(also an entity) is deleted, then the tag entity would also be deleted if
the relationship between them is a composition. As such, we would not get
zombie tag entities as a result of modelling them as entities
rather than classifications.
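The cascade assumed above can be sketched as follows. This only illustrates the assumption; it is not how Atlas implements deletion. The Entity class and its "composed" and "associated" fields are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    name: str
    deleted: bool = False
    # Entities owned through a composition relationship (hypothetical field).
    composed: list = field(default_factory=list)
    # Entities linked through a non-owning relationship (hypothetical field).
    associated: list = field(default_factory=list)


def soft_delete(entity):
    """Mark an entity as deleted and cascade only through composition links."""
    if entity.deleted:
        return
    entity.deleted = True
    for child in entity.composed:
        soft_delete(child)


tag = Entity("useful")
owner = Entity("alice")
data_set = Entity("sales_data", composed=[tag], associated=[owner])
soft_delete(data_set)
print(tag.deleted, owner.deleted)  # prints: True False
```

The tag goes away with its DataSet, so no zombie tag is left behind, while the merely associated entity survives.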
This is a less explicit model, but if it keeps the processing simple in the
structs, classifications and relationships as you describe, then it is
worth it. It also creates a clear definition of when to use a
classification or an entity/relationship combo. This is currently a
little bit subtle as the classifications become rich :) We would also not
need cardinality on the classifications, which is less work.
All the best
Mandy
___________________________________________
Mandy Chessell CBE FREng CEng FBCS
IBM Distinguished Engineer
Master Inventor
Member of the IBM Academy of Technology
Visiting Professor, Department of Computer Science, University of
Sheffield
Email: mandy_chessell@uk.ibm.com
LinkedIn: http://www.linkedin.com/pub/mandy-chessell/22/897/a49
Assistant: Janet Brooks - jsbrooks12@uk.ibm.com
Re: [DISCUSS] Restrict AtlasStruct and AtlasClassification attributes
to primitive and enum types
Posted by Madhan Neethiraj <ma...@apache.org>.
Mandy,
The relationship introduced in ATLAS-1690 requires an entity type to be specified for each end, effectively injecting an attribute into each of the entity types specified. This doesn’t allow struct or classification types at relationship ends. In addition, allowing structs to hold entity references adds complexity in dealing with edges, especially considering cases like array/map of structs. None of the existing models use object-reference attributes in structs, and we haven’t come across such use by anyone so far; hence this discuss thread, to find existing uses of structs holding object references. It will help us understand concrete use cases better and perhaps offer an alternate approach to model the same.
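A rough sketch of the constraint described above: each relationship end names an entity type plus the attribute that is injected into it. The function, the category strings, and the dict layout are assumptions made for illustration, not the ATLAS-1690 implementation.

```python
def define_relationship(type_categories, entity_attributes, name, end1, end2):
    """Register a relationship; each end is a (type_name, attribute_name) pair.

    type_categories: dict of type name -> category string (hypothetical).
    entity_attributes: dict of entity type -> {attribute name: target type}.
    """
    # Both ends must be entity types; structs and classifications are rejected.
    for type_name, _ in (end1, end2):
        if type_categories.get(type_name) != "ENTITY":
            raise ValueError(
                f"{name}: end type {type_name!r} must be an entity type")
    # Each end injects an attribute into its own type, pointing at the other end.
    for (type_name, attr_name), (other_type, _) in ((end1, end2), (end2, end1)):
        entity_attributes.setdefault(type_name, {})[attr_name] = other_type
    return entity_attributes


categories = {"DataSet": "ENTITY", "UserTag": "ENTITY",
              "Certification": "CLASSIFICATION"}
attrs = define_relationship(categories, {}, "DataSetTags",
                            ("DataSet", "tags"), ("UserTag", "dataSet"))
print(attrs)
# prints {'DataSet': {'tags': 'UserTag'}, 'UserTag': {'dataSet': 'DataSet'}}
```

Passing ("Certification", "asset") as an end raises ValueError, which is the restriction in miniature.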
Thanks,
Madhan
Re: [DISCUSS] Restrict AtlasStruct and AtlasClassification attributes to
primitive and enum types
Posted by Mandy Chessell <ma...@uk.ibm.com>.
Hello Sarath,
These restrictions will have an impact on the open metadata and governance
type model - and I expect other models that people have built. The impact
of this change is that many of the structs and classifications from the
open metadata and governance type model will need to be changed to an
entity plus relationship combo, creating overhead and complexity in the
model and reducing clarity. Perhaps it would help to explain why these
restrictions are needed?
Here are some examples:
- The restriction that we can only have one instance of a type of
classification associated with an entity is creating a problem for
modelling social tagging and assignment of certifications to assets. As
such, we need to have an array of structs inside these classifications.
Each cell in the array represents a user's tag or a certification that the
asset has received. In the longer term, a better solution to this use
case is of course to allow cardinality on the classification.
- Structs and classifications often need to include a relationship. For
example, if we use a classification to represent the terms and conditions
that apply to an asset, then ideally we would want a relationship to the
contract or license type.
I am assuming that maps and arrays are considered primitive, since that will
impact the Hive model.
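The first example can be illustrated with a small sketch of the one-instance-per-type restriction and the array-of-structs workaround. The class, the "UserTags" type, and the attribute names are hypothetical and are not taken from the open metadata model.

```python
class ClassifiedEntity:
    """Entity holding at most one classification instance per type."""

    def __init__(self, name):
        self.name = name
        self.classifications = {}  # classification type name -> attributes

    def classify(self, type_name, attributes):
        if type_name in self.classifications:
            raise ValueError(
                f"{self.name} already has a {type_name} classification")
        self.classifications[type_name] = attributes


asset = ClassifiedEntity("sales_data")
# Workaround: a single classification whose attribute is an array of structs,
# one cell per user's tag.
asset.classify("UserTags", {"tags": [
    {"user": "alice", "tag": "useful"},
    {"user": "bob", "tag": "stale"},
]})
print(len(asset.classifications["UserTags"]["tags"]))  # prints 2
```

A second call to classify with "UserTags" fails, which is why each user's tag has to live inside the array rather than in its own classification instance.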
All the best
Mandy
Re: [DISCUSS] Restrict AtlasStruct and AtlasClassification attributes to
primitive and enum types
Posted by David Radley <da...@uk.ibm.com>.
Hi,
Some thoughts:
On relationships having relationships
I assume we do not want this.
On structs having relationships
I was wondering whether we should implement structs as properties in the
containing entity, rather than as a separate vertex in the graph. Supporting
relationships from structs then becomes an extension to the existing
relationship instance logic. I think this could simplify
things, although it would break existing DSL queries and applications. If we
were to do this, we would need to enhance endDefs to be able to have a
struct type or an entity type (do we support struct type to struct type
relationshipDefs?). If we had a set of structures in an entity and the
structures had relationships, I am not sure how we would identify which
structure we are referring to. I assume structure instances would need to
have guids, in which case aren't structures very close to composed
entities? I wonder what we would lose if we were to model structs as
structs only when they have local content, and in the other cases model
them as entities connected by composition?
On classifications having relationships
We need to be able to have more than one classification on an entity for
some classification types, to meet the use case Mandy talks of.
Classification instances do not have guids, so if we have an attribute
with a set of them, we have the same issue as with relationships.
I wonder, if we were to have a standard structure of guid and type name
(both primitives), could we then allow classifications to have structures,
but structures not to have relationships? We would need to model structure
properties in the same vertex as its owning entity.
In the above, we are assuming that maps and arrays are considered primitives,
presumably along with SETs and LISTs and nested combinations of all of these.
In summary I think the following could be flexible enough to deal with the
use cases we need:
- change struct instances so they are modelled as properties in the owning
entity instance vertex.
- do not allow sets of structs - only lists or arrays or maps so we can
identify them
- do not allow structs or classifications to have relationships
- allow classification to have structs. Introduce a struct with a type and
guid (reuse AtlasObjectId?).
- allow multiple classifications to be applied to an entity based on a new
classificationDef property.
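As a rough illustration of the first two points in this list, struct values could be flattened into properties of the owning entity's vertex, with list position identifying each struct. The dotted and indexed key scheme below is an assumption made for this sketch, not an existing Atlas storage format.

```python
def flatten_struct_attributes(entity_attributes):
    """Flatten structs (dicts) and lists of structs into vertex property keys."""
    vertex_properties = {}

    def flatten(prefix, value):
        if isinstance(value, dict):
            # A struct: store each of its attributes under a dotted key.
            for key, nested in value.items():
                flatten(f"{prefix}.{key}", nested)
        elif isinstance(value, list) and any(
                isinstance(item, dict) for item in value):
            # A list of structs: the position identifies each struct.
            for index, item in enumerate(value):
                flatten(f"{prefix}[{index}]", item)
        else:
            # Primitives (and lists of primitives) are stored as-is.
            vertex_properties[prefix] = value

    for name, value in entity_attributes.items():
        flatten(name, value)
    return vertex_properties


print(flatten_struct_attributes({
    "name": "sales_data",
    "retention": {"period": 90, "unit": "days"},
    "reviews": [{"stars": 4}, {"stars": 5}],
}))
# prints {'name': 'sales_data', 'retention.period': 90,
#         'retention.unit': 'days', 'reviews[0].stars': 4, 'reviews[1].stars': 5}
```

A set of structs has no stable position or key, which is why the list above rules sets out.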
Any thoughts?
all the best, David.
Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number
741598.
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU