You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@maven.apache.org by "Hudson (Jira)" <ji...@apache.org> on 2019/10/24 19:33:00 UTC

[jira] [Commented] (MRESOLVER-93) PathRecordingDependencyVisitor to handle 3 cycles

    [ https://issues.apache.org/jira/browse/MRESOLVER-93?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16959170#comment-16959170 ] 

Hudson commented on MRESOLVER-93:
---------------------------------

Build succeeded in Jenkins: Maven TLP » maven-resolver » master #36

See https://builds.apache.org/job/maven-box/job/maven-resolver/job/master/36/

> PathRecordingDependencyVisitor to handle 3 cycles
> -------------------------------------------------
>
>                 Key: MRESOLVER-93
>                 URL: https://issues.apache.org/jira/browse/MRESOLVER-93
>             Project: Maven Resolver
>          Issue Type: Improvement
>          Components: resolver
>    Affects Versions: 1.3.3, 1.4.0, 1.4.1
>            Reporter: Tomo Suzuki
>            Assignee: Tibor Digana
>            Priority: Major
>             Fix For: 1.4.2
>
>         Attachments: IMG_0234.jpg, IMG_0235.jpg, IMG_0236.jpg, IMG_0237.jpg, IMG_0238.jpg, IMG_0240.jpg, IMG_0241.jpg, IMG_0242.jpg, IMG_0243.jpg, IMG_0244.jpg, IMG_0245.jpg, IMG_0255.jpg, IMG_0256.jpg, IMG_0257.jpg, IMG_0258.jpg, IMG_0259.jpg, IMG_0260.jpg, IMG_0261.jpg, IMG_0262.jpg, IMG_0263.jpg, IMG_0264.jpg, IMG_0265.jpg, IMG_0266.jpg
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> PathRecordingDependencyVisitor cannot handle dependency graphs that have 3 or more cycles such as below:
>   
> {code:java}
> gid:a:1 (1)
> +- gid:b:0
> |  \- ^1
> +- gid:b:1
> |  \- ^1
> \- gid:b:2
>    \- ^1
> {code}
> It fails with StackOverflowError or OutOfMemoryError. [Test case|https://github.com/suztomo/maven-resolver/commit/31b24dfe240997861e27661a7540546fbe6e0dab].
>  
> h1. Solutions
> I came up with three solutions. I pick solution #1 for simplicity. 
> h2. 1. Use "parents" to check the cycle, rather than visited set
> This is the simplest. Checking array element member is usually discouraged especially for large data set. The implementation should confirm the overhead of this solution.
> h2. 2. Use AbstractMapBag/Multiset for visited set
> Creating a new class that extends AbstractMapBag and leverages IdentityHashMap. Although this solution would be theoretically more efficient than solution #1, I felt it's overkill to create a class just for this solution.
> {code:java}
> AbstractMapBag(new IdentityHashMap<DependencyNode, AbstractMapBag.MutableInteger>()){code}
>  
> [https://commons.apache.org/proper/commons-collections/apidocs/org/apache/commons/collections4/bag/AbstractMapBag.html]
>  
> IdentityHashMap<DependencyNode, Integer>() would work as a multiset.
> h2. 3. Call visitLeave only when visitEnter is true
> The cause of this bug is [DefaultDependencyNode|https://github.com/apache/maven-resolver/blob/47edcfe69c4e52ced4cb93d65b7348b5645cdd68/maven-resolver-api/src/main/java/org/eclipse/aether/graph/DefaultDependencyNode.java#L354] calling visitLeave regardless of visitEnter result.
> I'm not sure how many other visitors rely on visitLeave being called regardless of visitEnter result.
> h1. Illustration on why existing algorithm does not catch cycle 
> The following illustration is the node traversal for the test case above by current algorithm. This illustration tracks the dependency node graph and the "visited" set maintained by the visitor.
>  * visited set. An internal data structure in PathRecordingDependencyVisitor to avoid cycle ([link|https://github.com/apache/maven-resolver/blob/0c2373f6c66f20953b1a7e443ea1de8672d1b072/maven-resolver-util/src/main/java/org/eclipse/aether/util/graph/visitor/PathRecordingDependencyVisitor.java#L45]).
>  * visitEnter(node): PathRecordingDependencyVisitor's function ([link|https://github.com/apache/maven-resolver/blob/0c2373f6c66f20953b1a7e443ea1de8672d1b072/maven-resolver-util/src/main/java/org/eclipse/aether/util/graph/visitor/PathRecordingDependencyVisitor.java#L100]). When returning true, the node's children is traversed by the algorithm. This function adds the node to visited set.
>  * visitLeave(node): PathRecordingDependencyVisitor's function ([link|https://github.com/apache/maven-resolver/blob/0c2373f6c66f20953b1a7e443ea1de8672d1b072/maven-resolver-util/src/main/java/org/eclipse/aether/util/graph/visitor/PathRecordingDependencyVisitor.java#L129]). This function removes the node from visited set.
>   
> The initial state starts with node "a" and visited set \{a}.
> !IMG_0234.jpg|width=334,height=252!
> First child of a is b0. Because visited does not contain, visitEnter(b0) returns true, meaning that the algorithm traverses this b0's children next. B0 is added to visited.
> !IMG_0235.jpg|width=359,height=191!
> B0's children is "a". Because visited set contains "a", visitEnter(a) returns false. This means that the algorithm does not traverse this "a"'s children. A is added to visited set (already it has).
>   !IMG_0236.jpg|width=438,height=197!
> Now not traversing this "a"'s children, the algorithm calls visitLeave(a). This removes "a" from visited set.
> !IMG_0237.jpg|width=434,height=165!
> B0's children are all traversed. the algorithm calls visitLeave(b0). This removes "b0" from visited set.
> !IMG_0238.jpg|width=459,height=197!
> Now visited set is empty.
> Next child of the root "a" is b1. B1 is not in visited set, thus visitEnter(b1) returns true. This means the algorithm traverses the children of this b1.
> !IMG_0240.jpg|width=445,height=270!
> B1's only child is a. "a" is not in visited set. visitEnter(a) returns true. This means to traverse "a"'s children.
> !IMG_0241.jpg|width=418,height=262!
> A's first children is b0. b0 is not in visited set. visitEnter(b0) returns true, meaning to traverse children of this b0.
> !IMG_0242.jpg|width=422,height=208!  
>  (img 0242)
> The only child of b0 is "a". Visited set contains "a", and thus not traversing its children.
> !IMG_0243.jpg|width=491,height=191!
> visitLeave(a) removes "a" from visited set.
> !IMG_0244.jpg|width=481,height=189!
> b0's children is all traversed. VisitLeave(b0) removes b0 from visited set.
> !IMG_0245.jpg|width=498,height=182!
> Next child of this "a" is b1. B1 is in visited set, and thus visitEnter(b1) returns false. This node's children is not to be traversed.
> !IMG_0255.jpg|width=545,height=245!
> (img 0255)
> visitLeave(b1) removes b1 from visited set. Now visited is emtpy.
> !IMG_0256.jpg|width=528,height=294!
> The last child of "a" is b2. VisitEnter(b2) returns true. It's children is to be traversed. B2 is in visited set.
> !IMG_0257.jpg|width=502,height=309!
>  B2's only child is "a". "a" is not in visited set, thus visitEnter(a) returns true. The algorithm traverses this "a"'s children.
> !IMG_0258.jpg|width=485,height=299!
> (img 0258)
>  
> (...omit...)
>  
> IMG_0266 shows the step where I decided to give up. The algorithm does not seem to stop. Indeed the test shows that. The path from the root to the furthest a includes 5 "a" nodes. I concluded the visited set is not working as expected to avoid cycle.
> !IMG_0266.jpg|width=656,height=252!
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)