Posted to dev@atlas.apache.org by Sarath Subramanian <sa...@apache.org> on 2019/12/17 08:29:58 UTC

Review Request 71919: ATLAS-3563: Improve tag propagation performance using in-memory traversal

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71919/
-----------------------------------------------------------

Review request for atlas, Ashutosh Mestry, Aadarsh Jajodia, keval bhatt, Sridhar K, Le Ma, Mandar Ambawane, mayank jain, Nixon Rodrigues, Sameer Shaikh, and Sarath Subramanian.


Bugs: ATLAS-3563
    https://issues.apache.org/jira/browse/ATLAS-3563


Repository: atlas


Description
-------

Tag propagation uses a Gremlin query to find the entities to which a tag has to be propagated.

The Gremlin query does not scale well for entities with large lineage (many levels deep). In-memory traversal seems to improve performance significantly, since it avoids the overhead of Gremlin script engine initialization and query execution.

Tag propagation time improved from 3004 ms to 180 ms.
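
For illustration only, a minimal sketch of what an in-memory traversal of this kind could look like. It is not the code in this patch: the adjacency map, the class name InMemoryPropagationSketch and the method findImpactedEntities are hypothetical stand-ins for the graph vertices and edges, and edge filtering (propagation direction, blocked classifications) is left out.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the lineage graph:
// entity id -> ids of entities reachable over one propagating edge.
class InMemoryPropagationSketch {

    // Breadth-first walk from the tagged entity; returns every entity reached.
    static Set<String> findImpactedEntities(Map<String, List<String>> edges, String start) {
        Set<String>   visited = new LinkedHashSet<>();
        Deque<String> queue   = new ArrayDeque<>();

        visited.add(start);
        queue.add(start);

        while (!queue.isEmpty()) {
            String current = queue.poll();

            for (String next : edges.getOrDefault(current, Collections.<String>emptyList())) {
                // Set.add() returns false for already-visited ids, so cycles terminate
                if (visited.add(next)) {
                    queue.add(next);
                }
            }
        }

        visited.remove(start); // the tagged entity itself is not a propagation target

        return visited;
    }

    public static void main(String[] args) {
        Map<String, List<String>> edges = new HashMap<>();

        edges.put("table_a", Arrays.asList("process_1"));
        edges.put("process_1", Arrays.asList("table_b", "table_c"));
        edges.put("table_c", Arrays.asList("process_1")); // cycle back into the lineage

        System.out.println(findImpactedEntities(edges, "table_a"));
    }
}
```

The walk touches each vertex and edge once and needs no script engine, which is the kind of overhead the description refers to.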


Diffs
-----

  graphdb/api/src/main/java/org/apache/atlas/repository/graphdb/AtlasVertex.java 6de4dcf10 
  graphdb/janus/src/main/java/org/apache/atlas/repository/graphdb/janus/AtlasJanusVertex.java 71b285731 
  intg/src/main/java/org/apache/atlas/type/AtlasEntityType.java 928ac0d8b 
  repository/src/main/java/org/apache/atlas/repository/graph/GraphHelper.java 1e7acf1e7 
  repository/src/main/java/org/apache/atlas/repository/store/graph/v1/DeleteHandlerV1.java c9ed79750 
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasRelationshipStoreV2.java 1c8b057ba 
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java a415d3084 
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphRetriever.java 8a24fa127 
  repository/src/main/java/org/apache/atlas/util/AtlasGremlin3QueryProvider.java 20c570f7f 
  repository/src/main/java/org/apache/atlas/util/AtlasGremlinQueryProvider.java d201db338 


Diff: https://reviews.apache.org/r/71919/diff/1/


Testing
-------

Manually validated tag propagation works.

* Add classification
* Block propagation
* Change Propagation direction
* Remove Classification


Thanks,

Sarath Subramanian


Re: Review Request 71919: ATLAS-3563: Improve tag propagation performance using in-memory traversal

Posted by Madhan Neethiraj <ma...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71919/#review219043
-----------------------------------------------------------




graphdb/janus/src/main/java/org/apache/atlas/repository/graphdb/janus/AtlasJanusVertex.java
Lines 76 (patched)
<https://reviews.apache.org/r/71919/#comment307080>

    Given that the underlying vertex classes expect a string array, consider using "String[]" as the type of parameter "edgeLabels", instead of "Collection<String>".
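
A rough sketch of the difference this suggestion points at. The names below (EdgeLabelParamSketch, underlyingEdges, getEdgesFromCollection, getEdgesFromArray) are hypothetical and only stand in for a vertex API that accepts labels as a string array; they are not the methods in the patch.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

class EdgeLabelParamSketch {

    // Hypothetical stand-in for an underlying vertex API that takes a string array.
    static List<String> underlyingEdges(String... edgeLabels) {
        return Arrays.asList(edgeLabels);
    }

    // Collection-typed parameter: each call converts the collection to an array first.
    static List<String> getEdgesFromCollection(Collection<String> edgeLabels) {
        return underlyingEdges(edgeLabels.toArray(new String[0]));
    }

    // Array-typed parameter: the argument is handed through unchanged.
    static List<String> getEdgesFromArray(String[] edgeLabels) {
        return underlyingEdges(edgeLabels);
    }
}
```

With the array-typed parameter, a caller that invokes the method repeatedly during a traversal can build the label array once and reuse it.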



intg/src/main/java/org/apache/atlas/type/AtlasEntityType.java
Lines 284 (patched)
<https://reviews.apache.org/r/71919/#comment307075>

    LOG.info ==> LOG.debug



intg/src/main/java/org/apache/atlas/type/AtlasEntityType.java
Lines 407 (patched)
<https://reviews.apache.org/r/71919/#comment307076>

    ";;" => ";"



repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphRetriever.java
Lines 412 (patched)
<https://reviews.apache.org/r/71919/#comment307077>

    impactedEntityVertices => propagatedEntities
      // entity vertices to which the classification is currently propagated
    
    impactedEntityVerticesWithRestrictions => impactedEntities
      // entity vertices to which the classifications must be propagated



repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphRetriever.java
Lines 418 (patched)
<https://reviews.apache.org/r/71919/#comment307078>

    - is 'ret' in #416 the list of propagations to be added?
    - is 'ret' in #418 the list of propagations to be removed?
    
    Consider adding a comment for this method. Looking at the caller of this method in AtlasRelationshipStoreV2.handleBlockedClassifications(), the list returned from this method seems to be used to both remove and add propagations. Please review and refactor/rename as necessary.
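
To make the two possible meanings of 'ret' concrete, here is a small sketch that assumes the two sets named in the earlier comment (propagatedEntities and impactedEntities) are available. It is not the method under review; PropagationDiffSketch, toAdd and toRemove are hypothetical names, and entity vertices are reduced to plain ids.

```java
import java.util.LinkedHashSet;
import java.util.Set;

class PropagationDiffSketch {

    // Entities that must carry the classification but do not have it yet.
    static Set<String> toAdd(Set<String> propagatedEntities, Set<String> impactedEntities) {
        Set<String> ret = new LinkedHashSet<>(impactedEntities);

        ret.removeAll(propagatedEntities);

        return ret;
    }

    // Entities that carry the classification today but must no longer have it.
    static Set<String> toRemove(Set<String> propagatedEntities, Set<String> impactedEntities) {
        Set<String> ret = new LinkedHashSet<>(propagatedEntities);

        ret.removeAll(impactedEntities);

        return ret;
    }
}
```

Returning the two results separately, or naming the single result after the one case it covers, would let the caller see which list drives removal and which drives addition.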



repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphRetriever.java
Lines 466 (patched)
<https://reviews.apache.org/r/71919/#comment307081>

    classificationIdToExclude => classificationId
      in #466 and #474



repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphRetriever.java
Lines 517 (patched)
<https://reviews.apache.org/r/71919/#comment307079>

    getAdjacentVertex() => getOtherVertex() // to be in line with JanusGraphEdge.otherVertex()
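
For readers unfamiliar with the "other vertex" idea, a self-contained sketch follows. The Edge class and the String vertex ids are hypothetical simplifications, not the Atlas or JanusGraph types.

```java
class OtherVertexSketch {

    // Hypothetical, simplified edge: just the ids of its two end vertices.
    static final class Edge {
        final String outVertex;
        final String inVertex;

        Edge(String outVertex, String inVertex) {
            this.outVertex = outVertex;
            this.inVertex  = inVertex;
        }
    }

    // Returns the end of the edge that is not the given vertex.
    static String getOtherVertex(Edge edge, String vertex) {
        if (edge.outVertex.equals(vertex)) {
            return edge.inVertex;
        }

        if (edge.inVertex.equals(vertex)) {
            return edge.outVertex;
        }

        throw new IllegalArgumentException(vertex + " is not an end of this edge");
    }
}
```

A traversal that follows edges in both directions can call this helper without checking which end it arrived from.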


- Madhan Neethiraj


On Dec. 17, 2019, 8:29 a.m., Sarath Subramanian wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71919/
> -----------------------------------------------------------
> 
> (Updated Dec. 17, 2019, 8:29 a.m.)
> 
> 
> Review request for atlas, Ashutosh Mestry, Aadarsh Jajodia, keval bhatt, Sridhar K, Le Ma, Mandar Ambawane, mayank jain, Nixon Rodrigues, Sameer Shaikh, and Sarath Subramanian.
> 
> 
> Bugs: ATLAS-3563
>     https://issues.apache.org/jira/browse/ATLAS-3563
> 
> 
> Repository: atlas
> 
> 
> Description
> -------
> 
> Tag propagation uses a Gremlin query to find the entities to which a tag has to be propagated.
> 
> The Gremlin query does not scale well for entities with large lineage (many levels deep). In-memory traversal seems to improve performance significantly, since it avoids the overhead of Gremlin script engine initialization and query execution.
> 
> Tag propagation time improved from 3004 ms to 180 ms.
> 
> 
> Diffs
> -----
> 
>   graphdb/api/src/main/java/org/apache/atlas/repository/graphdb/AtlasVertex.java 6de4dcf10 
>   graphdb/janus/src/main/java/org/apache/atlas/repository/graphdb/janus/AtlasJanusVertex.java 71b285731 
>   intg/src/main/java/org/apache/atlas/type/AtlasEntityType.java 928ac0d8b 
>   repository/src/main/java/org/apache/atlas/repository/graph/GraphHelper.java 1e7acf1e7 
>   repository/src/main/java/org/apache/atlas/repository/store/graph/v1/DeleteHandlerV1.java c9ed79750 
>   repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasRelationshipStoreV2.java 1c8b057ba 
>   repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java a415d3084 
>   repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphRetriever.java 8a24fa127 
>   repository/src/main/java/org/apache/atlas/util/AtlasGremlin3QueryProvider.java 20c570f7f 
>   repository/src/main/java/org/apache/atlas/util/AtlasGremlinQueryProvider.java d201db338 
> 
> 
> Diff: https://reviews.apache.org/r/71919/diff/2/
> 
> 
> Testing
> -------
> 
> Manually validated tag propagation works.
> 
> * Add classification
> * Block propagation
> * Change Propagation direction
> * Remove Classification
> 
> 
> Thanks,
> 
> Sarath Subramanian
> 
>


Re: Review Request 71919: ATLAS-3563: Improve tag propagation performance using in-memory traversal

Posted by Madhan Neethiraj <ma...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71919/#review219049
-----------------------------------------------------------


Ship It!

- Madhan Neethiraj


On Dec. 18, 2019, 1:31 a.m., Sarath Subramanian wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71919/
> -----------------------------------------------------------
> 
> (Updated Dec. 18, 2019, 1:31 a.m.)
> 
> 
> Review request for atlas, Ashutosh Mestry, Aadarsh Jajodia, keval bhatt, Sridhar K, Le Ma, Mandar Ambawane, mayank jain, Nixon Rodrigues, Sameer Shaikh, and Sarath Subramanian.
> 
> 
> Bugs: ATLAS-3563
>     https://issues.apache.org/jira/browse/ATLAS-3563
> 
> 
> Repository: atlas
> 
> 
> Description
> -------
> 
> Tag propagation uses a Gremlin query to find the entities to which a tag has to be propagated.
> 
> The Gremlin query does not scale well for entities with large lineage (many levels deep). In-memory traversal seems to improve performance significantly, since it avoids the overhead of Gremlin script engine initialization and query execution.
> 
> Tag propagation time improved from 3004 ms to 180 ms.
> 
> 
> Diffs
> -----
> 
>   graphdb/api/src/main/java/org/apache/atlas/repository/graphdb/AtlasVertex.java 6de4dcf10 
>   graphdb/janus/src/main/java/org/apache/atlas/repository/graphdb/janus/AtlasJanusVertex.java 71b285731 
>   intg/src/main/java/org/apache/atlas/AtlasErrorCode.java 7a2aae2e9 
>   intg/src/main/java/org/apache/atlas/type/AtlasEntityType.java 928ac0d8b 
>   repository/src/main/java/org/apache/atlas/repository/graph/GraphHelper.java 1e7acf1e7 
>   repository/src/main/java/org/apache/atlas/repository/store/graph/v1/DeleteHandlerV1.java c9ed79750 
>   repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasRelationshipStoreV2.java 1c8b057ba 
>   repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java a415d3084 
>   repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphRetriever.java 8a24fa127 
>   repository/src/main/java/org/apache/atlas/util/AtlasGremlin3QueryProvider.java 20c570f7f 
>   repository/src/main/java/org/apache/atlas/util/AtlasGremlinQueryProvider.java d201db338 
>   repository/src/test/java/org/apache/atlas/repository/tagpropagation/ClassificationPropagationTest.java 6f9c05e7a 
> 
> 
> Diff: https://reviews.apache.org/r/71919/diff/4/
> 
> 
> Testing
> -------
> 
> Manually validated tag propagation works.
> 
> * Add classification
> * Block propagation
> * Change Propagation direction
> * Remove Classification
> 
> 
> Thanks,
> 
> Sarath Subramanian
> 
>


Re: Review Request 71919: ATLAS-3563: Improve tag propagation performance using in-memory traversal

Posted by Sarath Subramanian <sa...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71919/
-----------------------------------------------------------

(Updated Dec. 17, 2019, 5:31 p.m.)


Review request for atlas, Ashutosh Mestry, Aadarsh Jajodia, keval bhatt, Sridhar K, Le Ma, Mandar Ambawane, mayank jain, Nixon Rodrigues, Sameer Shaikh, and Sarath Subramanian.


Bugs: ATLAS-3563
    https://issues.apache.org/jira/browse/ATLAS-3563


Repository: atlas


Description
-------

Tag propagation uses a Gremlin query to find the entities to which a tag has to be propagated.

The Gremlin query does not scale well for entities with large lineage (many levels deep). In-memory traversal seems to improve performance significantly, since it avoids the overhead of Gremlin script engine initialization and query execution.

Tag propagation time improved from 3004 ms to 180 ms.


Diffs (updated)
-----

  graphdb/api/src/main/java/org/apache/atlas/repository/graphdb/AtlasVertex.java 6de4dcf10 
  graphdb/janus/src/main/java/org/apache/atlas/repository/graphdb/janus/AtlasJanusVertex.java 71b285731 
  intg/src/main/java/org/apache/atlas/AtlasErrorCode.java 7a2aae2e9 
  intg/src/main/java/org/apache/atlas/type/AtlasEntityType.java 928ac0d8b 
  repository/src/main/java/org/apache/atlas/repository/graph/GraphHelper.java 1e7acf1e7 
  repository/src/main/java/org/apache/atlas/repository/store/graph/v1/DeleteHandlerV1.java c9ed79750 
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasRelationshipStoreV2.java 1c8b057ba 
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java a415d3084 
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphRetriever.java 8a24fa127 
  repository/src/main/java/org/apache/atlas/util/AtlasGremlin3QueryProvider.java 20c570f7f 
  repository/src/main/java/org/apache/atlas/util/AtlasGremlinQueryProvider.java d201db338 
  repository/src/test/java/org/apache/atlas/repository/tagpropagation/ClassificationPropagationTest.java 6f9c05e7a 


Diff: https://reviews.apache.org/r/71919/diff/4/

Changes: https://reviews.apache.org/r/71919/diff/3-4/


Testing
-------

Manually validated tag propagation works.

* Add classification
* Block propagation
* Change Propagation direction
* Remove Classification


Thanks,

Sarath Subramanian


Re: Review Request 71919: ATLAS-3563: Improve tag propagation performance using in-memory traversal

Posted by Sarath Subramanian <sa...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71919/
-----------------------------------------------------------

(Updated Dec. 17, 2019, 10:05 a.m.)


Review request for atlas, Ashutosh Mestry, Aadarsh Jajodia, keval bhatt, Sridhar K, Le Ma, Mandar Ambawane, mayank jain, Nixon Rodrigues, Sameer Shaikh, and Sarath Subramanian.


Bugs: ATLAS-3563
    https://issues.apache.org/jira/browse/ATLAS-3563


Repository: atlas


Description
-------

Tag propagation uses a Gremlin query to find the entities to which a tag has to be propagated.

The Gremlin query does not scale well for entities with large lineage (many levels deep). In-memory traversal seems to improve performance significantly, since it avoids the overhead of Gremlin script engine initialization and query execution.

Tag propagation time improved from 3004 ms to 180 ms.


Diffs (updated)
-----

  graphdb/api/src/main/java/org/apache/atlas/repository/graphdb/AtlasVertex.java 6de4dcf10 
  graphdb/janus/src/main/java/org/apache/atlas/repository/graphdb/janus/AtlasJanusVertex.java 71b285731 
  intg/src/main/java/org/apache/atlas/type/AtlasEntityType.java 928ac0d8b 
  repository/src/main/java/org/apache/atlas/repository/graph/GraphHelper.java 1e7acf1e7 
  repository/src/main/java/org/apache/atlas/repository/store/graph/v1/DeleteHandlerV1.java c9ed79750 
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasRelationshipStoreV2.java 1c8b057ba 
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java a415d3084 
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphRetriever.java 8a24fa127 
  repository/src/main/java/org/apache/atlas/util/AtlasGremlin3QueryProvider.java 20c570f7f 
  repository/src/main/java/org/apache/atlas/util/AtlasGremlinQueryProvider.java d201db338 


Diff: https://reviews.apache.org/r/71919/diff/3/

Changes: https://reviews.apache.org/r/71919/diff/2-3/


Testing
-------

Manually validated tag propagation works.

* Add classification
* Block propagation
* Change Propagation direction
* Remove Classification


Thanks,

Sarath Subramanian