Posted to issues@maven.apache.org by "Xiong Luyao (Jira)" <ji...@apache.org> on 2022/07/09 02:28:00 UTC

[jira] [Commented] (MNG-7509) Huge memory cost when parent pom widely used in a big project for dependencyManagement

    [ https://issues.apache.org/jira/browse/MNG-7509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17564491#comment-17564491 ] 

Xiong Luyao commented on MNG-7509:
----------------------------------

The reason is that when Maven resolves the dependency relationships, it creates, for each library, new instances for the dependencies declared in dependencyManagement. Dependencies managed in a shared parent POM are therefore created many times even though they are exactly the same. In the worst case, if a project brings in 3000 libraries and each library uses the same parent POM, which manages the versions of 3000 libraries, 9 million (3000 x 3000) Dependency instances are created. In fact, only 3000 Dependency instances are necessary in this case.
 
The way to resolve this is to add a cache map for the dependencies. If an equal Dependency instance is already in the cache map, that instance is taken from the map rather than creating a new one. Fortunately, Dependency (including the DefaultArtifact inside it) is designed to be immutable: changing a value somewhere does not modify the existing instance but returns a completely new one. So we need not worry that a change made in one place will affect other references to the same Dependency.
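
To illustrate the idea, here is a minimal, self-contained sketch of such a cache map. It is not the actual Maven or Resolver code: the Dep class is a hypothetical, simplified stand-in for org.eclipse.aether.graph.Dependency, and DependencyPool and intern() are names made up for this example. It only assumes that the cached type is immutable and implements equals()/hashCode() by value.

```java
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;

public class DependencyPool {

    // Hypothetical immutable stand-in for org.eclipse.aether.graph.Dependency.
    public static final class Dep {
        private final String groupId;
        private final String artifactId;
        private final String version;

        public Dep(String groupId, String artifactId, String version) {
            this.groupId = groupId;
            this.artifactId = artifactId;
            this.version = version;
        }

        public String getVersion() {
            return version;
        }

        // "Changing" a value returns a new object; the original is untouched.
        public Dep withVersion(String newVersion) {
            return new Dep(groupId, artifactId, newVersion);
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Dep)) return false;
            Dep other = (Dep) o;
            return groupId.equals(other.groupId)
                    && artifactId.equals(other.artifactId)
                    && version.equals(other.version);
        }

        @Override
        public int hashCode() {
            return Objects.hash(groupId, artifactId, version);
        }
    }

    // The cache map: each distinct value maps to its single shared instance.
    private final Map<Dep, Dep> pool = new ConcurrentHashMap<>();

    // Returns the shared instance equal to the candidate, registering the
    // candidate as the shared instance if no equal one is cached yet.
    public Dep intern(Dep candidate) {
        Dep existing = pool.putIfAbsent(candidate, candidate);
        return existing != null ? existing : candidate;
    }

    public int size() {
        return pool.size();
    }

    public static void main(String[] args) {
        DependencyPool pool = new DependencyPool();
        int libraries = 100;   // scaled down from the 3000 in the report
        int managed = 100;     // managed entries in the shared parent POM
        for (int lib = 0; lib < libraries; lib++) {
            for (int m = 0; m < managed; m++) {
                // Each library re-reads the same managed entries.
                pool.intern(new Dep("com.example", "lib-" + m, "1.0"));
            }
        }
        System.out.println("read " + (libraries * managed)
                + " entries, retained " + pool.size() + " instances");
    }
}
```

In this sketch the temporary candidate objects still get allocated, but they become garbage immediately; only one instance per distinct dependency stays reachable. Because the cached objects are immutable, handing the same instance to many models is safe.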

> Huge memory cost when parent pom widely used in a big project for dependencyManagement
> --------------------------------------------------------------------------------------
>
>                 Key: MNG-7509
>                 URL: https://issues.apache.org/jira/browse/MNG-7509
>             Project: Maven
>          Issue Type: Improvement
>          Components: Performance
>            Reporter: Xiong Luyao
>            Priority: Major
>         Attachments: image-2022-07-09-09-37-53-823.png, image-2022-07-09-09-38-26-354.png, image-2022-07-09-10-27-12-668.png
>
>
> When Maven resolves the dependency relationships, it creates many dependency/artifact objects, even when their content is exactly the same and they merely belong to different models. This costs a huge amount of memory if each dependency of the project has a parent POM with a big dependencyManagement section.
>  
> Here is a real case. As shown in the picture below, there are over 3000 business domain libraries, and each library may rely on several other libraries among them. The libraries that do not rely on any other library are built first, then the libraries that rely only on already-built libraries, and so on until all the libraries have been built. We use a parent POM to maintain the versions of these 3000+ libraries and make each library use it, so that all the libraries are built from the same base.
> But building the libraries in this way costs a huge amount of memory. And even after the libraries have been released, building a project that contains many of them also costs a huge amount of memory.
>  
> !image-2022-07-09-09-37-53-823.png|width=493,height=226!
>  
> Here is a thread dump taken when building a real project with the above libraries. The top 5 objects are all related to org.eclipse.aether.graph.Dependency.
> !image-2022-07-09-09-38-26-354.png|width=510,height=199!
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)