Posted to issues@maven.apache.org by "mbien (via GitHub)" <gi...@apache.org> on 2023/05/24 16:08:01 UTC

[GitHub] [maven-indexer] mbien opened a new pull request, #317: Avoid using String#split in hot code if possible

mbien opened a new pull request, #317:
URL: https://github.com/apache/maven-indexer/pull/317

    - async-profiler showed two optimization opportunities in the index creation code
    - `IndexDataReader` can avoid some work by moving an if-check up, so the `endsWith` test runs before any splitting
    - `split` can be replaced by `indexOf` + `substring` pairs
    - this reduces MT extraction time from ~328s to ~303s on my machine (i6700k)
   
   ![indexer-split](https://github.com/apache/maven-indexer/assets/114367/5ffde316-e4f2-4fb7-8e73-b146697bf946)
   ![indexer-no-split](https://github.com/apache/maven-indexer/assets/114367/2c17f53f-0d17-4265-af00-f00e641762dc)
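
As a rough illustration of the `indexOf` + `substring` idea from the list above (not code from this PR), the sketch below cuts a string into three parts at its first two separators without building a regex split. The class name `SplitFree`, the method `splitThree` and the sample value are hypothetical and exist only for this example.

```java
public class SplitFree {
    /**
     * Splits s into exactly three parts at the first two separators.
     * Hypothetical helper for illustration; not part of maven-indexer.
     */
    static String[] splitThree(String s, char sep) {
        int first = s.indexOf(sep);
        // Only look for the second separator if the first one exists.
        int second = first < 0 ? -1 : s.indexOf(sep, first + 1);
        if (second < 0) {
            throw new IllegalArgumentException("expected at least two separators: " + s);
        }
        return new String[] {
            s.substring(0, first),          // before the first separator
            s.substring(first + 1, second), // between the two separators
            s.substring(second + 1)         // everything after the second separator
        };
    }

    public static void main(String[] args) {
        String[] parts = splitThree("org.example|demo|1.0", '|');
        System.out.println(String.join(",", parts)); // prints org.example,demo,1.0
    }
}
```

Unlike a full split, the last part keeps any further separators it contains, so this shape only fits inputs with a known, fixed number of fields.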
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@maven.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [maven-indexer] mbien commented on a diff in pull request #317: Avoid using String#split in hot code if possible

Posted by "mbien (via GitHub)" <gi...@apache.org>.
mbien commented on code in PR #317:
URL: https://github.com/apache/maven-indexer/pull/317#discussion_r1204462214


##########
indexer-core/src/main/java/org/apache/maven/index/updater/IndexDataReader.java:
##########
@@ -351,12 +351,15 @@ public Document readDocument() throws IOException {
         final Field uinfoField = (Field) doc.getField(ArtifactInfo.UINFO);
         final String info = doc.get(ArtifactInfo.INFO);
         if (uinfoField != null && info != null && !info.isEmpty()) {
-            final String[] splitInfo = ArtifactInfo.FS_PATTERN.split(info);
-            if (splitInfo.length > 6) {
-                final String extension = splitInfo[6];
-                final String uinfoString = uinfoField.stringValue();
-                if (uinfoString.endsWith(ArtifactInfo.FS + ArtifactInfo.NA)) {
-                    uinfoField.setStringValue(uinfoString + ArtifactInfo.FS + ArtifactInfo.nvl(extension));
+            String uinfoString = uinfoField.stringValue();
+            if (uinfoString.endsWith(ArtifactInfo.FS + ArtifactInfo.NA)) {
+                int elem = 0;
+                for (int i = -1; (i = info.indexOf(ArtifactInfo.FS, i + 1)) != -1; ) {
+                    if (++elem == 6) { // extension is field 6
+                        String extension = info.substring(i + 1);
+                        uinfoField.setStringValue(uinfoString + ArtifactInfo.FS + ArtifactInfo.nvl(extension));
+                        break;
+                    }

Review Comment:
   The if-check was moved ahead of the split so that the splitting work is skipped when the check fails.
   
   I am sorry that the for-loop looks so ugly, but the while-loop version didn't look any better. It essentially advances the search position from separator to separator until it reaches field 6, then extracts the substring.
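
For readers who want to try the loop in isolation, here is a minimal standalone sketch of the same separator walk, compared against a regex split. The constants `FS` and `FS_PATTERN` are local stand-ins for the `ArtifactInfo` fields; their value `"|"`, the class name `FieldWalk`, the helper `tailAfterSeparator` and the sample `info` string are assumptions made for this example.

```java
import java.util.regex.Pattern;

public class FieldWalk {
    // Stand-ins for ArtifactInfo.FS / ArtifactInfo.FS_PATTERN; "|" is an assumption.
    static final String FS = "|";
    static final Pattern FS_PATTERN = Pattern.compile(Pattern.quote(FS));

    /**
     * Returns everything after the n-th separator in info,
     * or null if info contains fewer than n separators.
     */
    static String tailAfterSeparator(String info, int n) {
        int seen = 0;
        // Each iteration jumps to the next separator, starting the search
        // one character past the previous hit.
        for (int i = -1; (i = info.indexOf(FS, i + 1)) != -1; ) {
            if (++seen == n) {
                return info.substring(i + 1);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Hypothetical seven-field value; real INFO contents may differ.
        String info = "jar|1684944481000|1024|0|0|0|jar";
        String viaSplit = FS_PATTERN.split(info)[6];
        String viaWalk = tailAfterSeparator(info, 6);
        System.out.println(viaSplit + " " + viaWalk); // prints jar jar
    }
}
```

The two approaches agree here because field 6 is the last field. If the value contained more separators after the sixth, the walk would return the whole remainder, whereas the split would return only the seventh element.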





[GitHub] [maven-indexer] cstamas merged pull request #317: [MINDEXER-190] Avoid using String#split in hot code if possible

Posted by "cstamas (via GitHub)" <gi...@apache.org>.
cstamas merged PR #317:
URL: https://github.com/apache/maven-indexer/pull/317

