Posted to dev@atlas.apache.org by David Radley <da...@uk.ibm.com> on 2017/01/18 12:41:58 UTC

Loops in V2 API.

Hi Madhan,
On further reflection, a more generic mechanism to get the name might be 
to enhance the constraint by adding a refLabel to the params, which could 
specify the label field as being "name", like this:
 
"constraintDefs": [
    {
        "type": "mappedFromRef",
        "params": {
            "refAttribute": "TestType2",
            "refLabel": "name"
        }
    }
]
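
A minimal Java sketch of how such a refLabel param might be consumed when rendering a reference. The names RefLabelResolver and labelFor are hypothetical and are not part of the Atlas API; only the "refLabel" param name comes from the suggestion above, and the attribute maps are an assumption about how entity values would be held.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Atlas: reads the proposed "refLabel" param
// from a constraint's params and uses it to pick a human-readable label.
final class RefLabelResolver {
    private RefLabelResolver() { }

    // constraintParams: the "params" map of a constraint definition.
    // referencedAttributes: attribute values of the referenced entity.
    // Returns null when no refLabel is configured or the referenced entity
    // has no value for the named attribute.
    static String labelFor(Map<String, String> constraintParams,
                           Map<String, Object> referencedAttributes) {
        String labelAttribute = constraintParams.get("refLabel");
        if (labelAttribute == null) {
            return null;
        }
        Object value = referencedAttributes.get(labelAttribute);
        return value == null ? null : value.toString();
    }
}

final class RefLabelDemo {
    public static void main(String[] args) {
        Map<String, String> params = new HashMap<>();
        params.put("refAttribute", "TestType2");
        params.put("refLabel", "name");

        Map<String, Object> child = new HashMap<>();
        child.put("name", "child1");

        System.out.println(RefLabelResolver.labelFor(params, child)); // prints child1
    }
}
```

Returning null rather than throwing keeps the label optional, so a caller can still fall back to the type and guid.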

What do you think?

Thanks, David.

 
----- Forwarded by David Radley/UK/IBM on 18/01/2017 12:36 -----

From:   David Radley/UK/IBM
To:     Madhan Neethiraj <ma...@apache.org>
Cc:     <de...@atlas.incubator.apache.org>
Date:   18/01/2017 10:22
Subject:        Fw: [jira] David Radley mentioned you (JIRA)


Hi Madhan,
I like the idea of using AtlasObjectId. One concern: if the serialized form 
carries only a single hashcode of it, then we cannot easily see that a 
reference points to a particular object's guid. Maybe the Java API 
reference should serialise to a nested structure containing the type and 
guid. For readability and debugging, I suggest we also add name to 
AtlasObjectId (this approach proved very useful in ATLAS-1186). In this 
way a reference contains the type and name (understandable by a human 
reader, but not unique) and the guid (not understandable by a human 
reader, but useful for uniquely identifying objects and constructing 
unambiguous object references). 

So the children and parent references in JSON would look something like:

"children": [
    {
        "type": "TestType2",
        "name": "child1",
        "guid": "1234-5678-90123"
    },
    {
        "type": "TestType2",
        "name": "child2",
        "guid": "1234-5678-90124"
    }
],
"parent": {
    "type": "TestType1",
    "name": "parent1",
    "guid": "1234-5678-90123"
}


We would need something equivalent in toString(). 
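
As a rough illustration of the toString() equivalent, here is a Java sketch of a reference that carries type, name and guid. NamedObjectRef is a hypothetical class, not the actual AtlasObjectId; treating the name as informational only (excluded from equals and hashCode) is an assumption that follows from the name not being unique.

```java
import java.util.Objects;

// Hypothetical reference class for illustration; the real AtlasObjectId may differ.
final class NamedObjectRef {
    private final String typeName;
    private final String name; // readable by a human, but not unique
    private final String guid; // unique identifier

    NamedObjectRef(String typeName, String name, String guid) {
        this.typeName = Objects.requireNonNull(typeName);
        this.name = name;
        this.guid = Objects.requireNonNull(guid);
    }

    // Identity is based on type and guid only; the name is informational.
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof NamedObjectRef)) return false;
        NamedObjectRef other = (NamedObjectRef) o;
        return typeName.equals(other.typeName) && guid.equals(other.guid);
    }

    @Override
    public int hashCode() {
        return Objects.hash(typeName, guid);
    }

    // Mirrors the nested JSON form of a reference (no escaping; sketch only).
    @Override
    public String toString() {
        return "{\"type\": \"" + typeName + "\", \"name\": \"" + name
                + "\", \"guid\": \"" + guid + "\"}";
    }
}

final class NamedObjectRefDemo {
    public static void main(String[] args) {
        NamedObjectRef child = new NamedObjectRef("TestType2", "child1", "1234-5678-90123");
        System.out.println(child);
    }
}
```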


What do you think?

   Thanks, David. 
----- Forwarded by David Radley/UK/IBM on 18/01/2017 09:45 -----

From:   Madhan Neethiraj <ma...@apache.org>
To:     David Radley/UK/IBM@IBMGB
Date:   18/01/2017 01:47
Subject:        Re: [jira] David Radley mentioned you (JIRA)
Sent by:        Madhan Neethiraj <mn...@hortonworks.com>



David,

Thanks for the type-def JSONs and the steps to reproduce the issue. The 
implementation should be updated so that it does not get into such infinite 
loops. I guess one approach would be to treat references to other entities 
as an AtlasObjectId, even when the reference points to a full entity. For 
example:
- toString() should print the AtlasObjectId equivalent of the referenced 
object, i.e. { "typeName": "TestType1", "guid": "1234-5678-90123" }
- hashCode() should use (new AtlasObjectId("TestType1", 
"1234-5678-90123")).hashCode(), instead of calling child.hashCode() or 
parent.hashCode()
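
The two points above can be sketched in Java roughly as follows. SketchObjectId and SketchEntity are hypothetical stand-ins written for this illustration, not the Atlas classes; the only idea taken from the message is that a referenced entity is reduced to its type name and guid before it is hashed or printed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Hypothetical stand-in for AtlasObjectId: a type name plus a guid.
final class SketchObjectId {
    final String typeName;
    final String guid;

    SketchObjectId(String typeName, String guid) {
        this.typeName = typeName;
        this.guid = guid;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof SketchObjectId)) return false;
        SketchObjectId other = (SketchObjectId) o;
        return Objects.equals(typeName, other.typeName) && Objects.equals(guid, other.guid);
    }

    @Override
    public int hashCode() {
        return Objects.hash(typeName, guid);
    }

    @Override
    public String toString() {
        return "{ \"typeName\": \"" + typeName + "\", \"guid\": \"" + guid + "\" }";
    }
}

// Hypothetical entity whose parent/children references can form a cycle.
final class SketchEntity {
    final SketchObjectId id;
    SketchEntity parent;
    final List<SketchEntity> children = new ArrayList<>();

    SketchEntity(String typeName, String guid) {
        this.id = new SketchObjectId(typeName, guid);
    }

    // Reduce a referenced entity to its id so that we never recurse into it.
    private static SketchObjectId idOf(SketchEntity e) {
        return e == null ? null : e.id;
    }

    private List<SketchObjectId> childIds() {
        List<SketchObjectId> ids = new ArrayList<>();
        for (SketchEntity child : children) {
            ids.add(idOf(child));
        }
        return ids;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof SketchEntity)) return false;
        SketchEntity other = (SketchEntity) o;
        return id.equals(other.id)
                && Objects.equals(idOf(parent), idOf(other.parent))
                && childIds().equals(other.childIds());
    }

    @Override
    public int hashCode() {
        // Uses ids only; never calls parent.hashCode() or child.hashCode().
        return Objects.hash(id, idOf(parent), childIds());
    }

    @Override
    public String toString() {
        // Prints ids only; never calls parent.toString() or child.toString().
        return "SketchEntity{id=" + id + ", parent=" + idOf(parent)
                + ", children=" + childIds() + "}";
    }
}

final class CycleDemo {
    public static void main(String[] args) {
        SketchEntity a = new SketchEntity("TestType1", "1234-5678-90123");
        SketchEntity b = new SketchEntity("TestType2", "1234-5678-90124");
        a.children.add(b); // a references b as a child
        b.parent = a;      // b references a as its parent: a cycle
        System.out.println(a);            // terminates despite the cycle
        System.out.println(b.hashCode()); // likewise
    }
}
```

Because each method touches only the ids of neighbouring entities, the work done is bounded by the number of direct references, however the graph is connected.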

What do you think?

Thanks,
Madhan




On 1/17/17, 7:11 AM, "David Radley (JIRA)" <ji...@apache.org> wrote:

 
         [ 
https://issues.apache.org/jira/browse/ATLAS-1458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel 
]
 
    David Radley mentioned you on ATLAS-1458
    --------------------------------
 
    [~madhan.neethiraj]
    As requested on the dev list, here are json files defining the types 
that I used to recreate the loop.
 
    The types are created OK. To get the loop using 01-test.json, I do the 
following:
    create an entity of TestType1 EntityA
    create an entity of TestType2 EntityB
    I update EntityA to have EntityB as a child.
    I update EntityB to have EntityA as a parent. It loops during this 
update.
 
    I get the same loop for 02-test.json and 03-test.json. 
 
    >                 Key: ATLAS-1458
 
    >         View Online: 
https://issues.apache.org/jira/browse/ATLAS-1458
    >         Add Comment: 
https://issues.apache.org/jira/browse/ATLAS-1458#add-comment
 
 
 




Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number 
741598. 
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU




Re: Loops in V2 API.

Posted by Mandy Chessell <ma...@uk.ibm.com>.
Dear All,
When I was talking to David last week about this, we could not think of an 
example where the relationship between metadata entities was one way.  The 
reason is that metadata relationships are connecting different 
perspectives.  Value comes from understanding how each perspective 
connects to the rest of the world.  So for example, if we take the idea of 
a business term entity linked to a data field entity contained in a data 
set entity then the data set owner wants to understand the meaning of the 
data field (navigation from data field to business term) and the subject 
area owner wants to know where data of a particular meaning is stored 
(navigation from business term to data field).

So perhaps the default in the type language - and through to the Atlas 
implementation - should be that all relationships between entities are 
two-way, with 3 labels:
- The relationship name
- The names of the relationship as viewed from each end of the relationship

We see all three labels in UML.  UML relationships are also two-way by 
default and you add constraints to make them one-way.

This would suggest that the type language should define relationships, as 
well as entities, as top-level objects.  The relationship 
would have a name and declare which entities it connects.  The entities 
would reference the relationship and provide their local name for the 
relationship.

This way, the type language would allow for an extension where 
relationships have properties.  This capability is supported natively in 
the graph and would enable richer information gathering on the 
relationships between entities (one of the key values of an integrated 
metadata repository).
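
To make the shape of such a definition concrete, here is a minimal Java sketch of a relationship as a top-level type with one name, a label for each end, and optional properties. All class, field and example names (RelationshipDef, RelationshipEndDef, SemanticAssignment, meaning, assignedFields, confidence) are hypothetical and chosen only for illustration; this is not the Atlas type system.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model: one end of a relationship.
final class RelationshipEndDef {
    final String entityType; // entity type at this end
    final String label;      // name of the relationship as seen from this end

    RelationshipEndDef(String entityType, String label) {
        this.entityType = entityType;
        this.label = label;
    }
}

// Hypothetical model: a relationship defined as a top-level object.
final class RelationshipDef {
    final String name;
    final RelationshipEndDef end1;
    final RelationshipEndDef end2;
    // Attribute name -> attribute type name; lets the relationship carry properties.
    final Map<String, String> attributeDefs = new LinkedHashMap<>();

    RelationshipDef(String name, RelationshipEndDef end1, RelationshipEndDef end2) {
        this.name = name;
        this.end1 = end1;
        this.end2 = end2;
    }

    // Label to use when navigating away from the given entity type, or null
    // when that type is not an end of this relationship. A relationship whose
    // two ends have the same type would need a different lookup key.
    String labelFrom(String entityType) {
        if (end1.entityType.equals(entityType)) return end1.label;
        if (end2.entityType.equals(entityType)) return end2.label;
        return null;
    }
}

final class RelationshipDemo {
    public static void main(String[] args) {
        RelationshipDef def = new RelationshipDef(
                "SemanticAssignment",
                new RelationshipEndDef("DataField", "meaning"),
                new RelationshipEndDef("BusinessTerm", "assignedFields"));
        def.attributeDefs.put("confidence", "int");

        System.out.println(def.labelFrom("DataField"));    // prints meaning
        System.out.println(def.labelFrom("BusinessTerm")); // prints assignedFields
    }
}
```

Both navigation directions come from the same definition, which is what makes the relationship two-way by default.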

All the best
Mandy
___________________________________________
Mandy Chessell CBE FREng CEng FBCS
IBM Distinguished Engineer
IBM Analytics Group CTO Office

Master Inventor
Member of the IBM Academy of Technology
Visiting Professor, Department of Computer Science, University of 
Sheffield

Email: mandy_chessell@uk.ibm.com
LinkedIn: http://www.linkedin.com/pub/mandy-chessell/22/897/a49

Assistant: Janet Brooks - jsbrooks12@uk.ibm.com



From:   Madhan Neethiraj <ma...@apache.org>
To:     "dev@atlas.incubator.apache.org" <de...@atlas.incubator.apache.org>
Date:   24/01/2017 06:57
Subject:        Re: Loops in V2 API.
Sent by:        Madhan Neethiraj <mn...@hortonworks.com>



David,

The idea of ‘refLabel’ sounds good; it will be helpful to include such an 
attribute value in addition to typeName and guid. I think this idea would 
be useful for types in general, not just in constraints. How about the 
ability to designate one of the entity attributes as ‘displayText’? This 
attribute value could be used in many places, such as search results. What 
do you think?

Thanks,
Madhan

 







Re: Loops in V2 API.

Posted by Madhan Neethiraj <ma...@apache.org>.
David,

The idea of ‘refLabel’ sounds good; it will be helpful to include such an attribute value in addition to typeName and guid. I think this idea would be useful for types in general, not just in constraints. How about the ability to designate one of the entity attributes as ‘displayText’? This attribute value could be used in many places, such as search results. What do you think?
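
A minimal Java sketch of what a type-level ‘displayText’ designation could look like. DisplayableTypeDef and its members are hypothetical names, not part of Atlas, and the fallback to the guid is an assumption made for this illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical type definition that designates one attribute as display text.
final class DisplayableTypeDef {
    final String typeName;
    final String displayTextAttribute; // may be null if none is designated

    DisplayableTypeDef(String typeName, String displayTextAttribute) {
        this.typeName = typeName;
        this.displayTextAttribute = displayTextAttribute;
    }

    // Text to show for an entity of this type, e.g. in search results.
    // Falls back to the guid when no display attribute is designated or set.
    String displayText(String guid, Map<String, Object> attributes) {
        if (displayTextAttribute != null) {
            Object value = attributes.get(displayTextAttribute);
            if (value != null) {
                return value.toString();
            }
        }
        return guid;
    }
}

final class DisplayTextDemo {
    public static void main(String[] args) {
        DisplayableTypeDef type = new DisplayableTypeDef("TestType2", "name");
        Map<String, Object> attributes = new HashMap<>();
        attributes.put("name", "child1");

        System.out.println(type.displayText("1234-5678-90123", attributes)); // prints child1
    }
}
```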

Thanks,
Madhan

 
