Posted to issues@maven.apache.org by "Christoph Läubrich (Jira)" <ji...@apache.org> on 2022/11/06 06:28:00 UTC

[jira] [Created] (MNG-7592) String deduplication in model building

Christoph Läubrich created MNG-7592:
---------------------------------------

             Summary: String deduplication in model building
                 Key: MNG-7592
                 URL: https://issues.apache.org/jira/browse/MNG-7592
             Project: Maven
          Issue Type: Improvement
            Reporter: Christoph Läubrich


I am currently investigating how to reduce memory consumption in m2eclipse (the Maven IDE extension) and noticed that one problem is that the Maven model does not seem to deduplicate strings. For large projects (I used Apache Camel as an example) there are a lot of duplicate strings hanging around, e.g. I see 12,000 instances of "org.apache.maven.plugins" and around 10,000 of "org.apache.camel" (please note that probably not all of them are related to Maven!).

If I look at the graph of incoming references, I see for example that these strings are referenced from the groupId of Model/Artifact.

I know that string deduplication in general is hard and even controversial, but maybe one could consider it at least for the "hotspots", e.g. groupId, artifactId and version, or even the management keys. These seem like good candidates because they are used all over the place.
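
To illustrate the kind of thing I have in mind, here is a minimal sketch of a string pool that could be applied to such fields. The class name StringPool and its dedupe method are hypothetical and not part of any existing Maven API. Where the model builder would actually call it is left open.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical helper for illustration only; not an existing Maven class.
final class StringPool {
    // Maps each distinct string value to the one canonical instance kept for it.
    private final ConcurrentMap<String, String> pool = new ConcurrentHashMap<>();

    // Returns the canonical instance equal to the given value (null stays null).
    String dedupe(String value) {
        if (value == null) {
            return null;
        }
        String existing = pool.putIfAbsent(value, value);
        return existing != null ? existing : value;
    }

    // Number of distinct strings currently held by the pool.
    int size() {
        return pool.size();
    }

    public static void main(String[] args) {
        StringPool pool = new StringPool();
        // Two equal but distinct String objects, as a parser would produce them.
        String a = new String("org.apache.maven.plugins");
        String b = new String("org.apache.maven.plugins");
        System.out.println(a == b);                           // false
        System.out.println(pool.dedupe(a) == pool.dedupe(b)); // true
    }
}
```

Note that a pool like this holds strong references to every string it has seen, so its lifetime would need to be bounded (for example to one model-building session). String.intern() would be an alternative that uses the JVM's own string table instead.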



--
This message was sent by Atlassian Jira
(v8.20.10#820010)