You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2021/10/21 20:27:22 UTC

[GitHub] [lucene] dweiss commented on a change in pull request #404: LUCENE-10196: Improve IntroSorter with 3-ways partitioning.

dweiss commented on a change in pull request #404:
URL: https://github.com/apache/lucene/pull/404#discussion_r734017645



##########
File path: lucene/core/src/java/org/apache/lucene/util/IntroSorter.java
##########
@@ -35,51 +38,102 @@ public final void sort(int from, int to) {
     quicksort(from, to, 2 * MathUtil.log(to - from, 2));
   }
 
+  /**
+   * Sorts between from (inclusive) and to (exclusive) with intro sort.

Review comment:
       "with intro sort"? Is this accurate here?

##########
File path: lucene/core/src/java/org/apache/lucene/util/Sorter.java
##########
@@ -216,6 +219,25 @@ void binarySort(int from, int to, int i) {
     }
   }
 
+  /**
+   * Sorts between from (inclusive) and to (exclusive) with insertion sort. Runs in {@code O(n^2)}.
+   * It is typically used by more sophisticated implementations as a fall-back when the numbers of

Review comment:
       when the number (not numbers)

##########
File path: lucene/core/src/test/org/apache/lucene/util/SorterBenchmark.java
##########
@@ -0,0 +1,129 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.util;
+
+import java.util.Random;
+import org.apache.lucene.util.BaseSortTestCase.Entry;
+import org.apache.lucene.util.BaseSortTestCase.Strategy;
+
+/**
+ * Benchmark for {@link Sorter} implementations.
+ *
+ * <p>Run the static {@link #main(String[])} method to start the benchmark.
+ */
+public class SorterBenchmark {
+
+  private static final int ARRAY_LENGTH = 20000;
+  private static final int RUNS = 10;
+  private static final int LOOPS = 100;
+
+  private enum SorterFactory {
+    INTRO_SORTER(
+        "IntroSorter",
+        (arr, s) -> {
+          return new ArrayIntroSorter<>(arr, Entry::compareTo);
+        }),
+    TIM_SORTER(
+        "TimSorter",
+        (arr, s) -> {
+          return new ArrayTimSorter<>(arr, Entry::compareTo, arr.length / 64);
+        }),
+    MERGE_SORTER(
+        "MergeSorter",
+        (arr, s) -> {
+          return new ArrayInPlaceMergeSorter<>(arr, Entry::compareTo);
+        }),
+    ;
+    final String name;
+    final Builder builder;
+
+    SorterFactory(String name, Builder builder) {
+      this.name = name;
+      this.builder = builder;
+    }
+
+    interface Builder {
+      Sorter build(Entry[] arr, Strategy strategy);
+    }
+  }
+
+  public static void main(String[] args) throws Exception {

Review comment:
       You could convert it to a test and give the test an assumption on some property (or just an Ignore). Then you'd have a seed-reproducible-benchmark. :)
   
   This stuff fits JMH nicely but I understand why you didn't want to roll out the big guns here.

##########
File path: lucene/core/src/test/org/apache/lucene/util/SorterBenchmark.java
##########
@@ -0,0 +1,129 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.util;
+
+import java.util.Random;
+import org.apache.lucene.util.BaseSortTestCase.Entry;
+import org.apache.lucene.util.BaseSortTestCase.Strategy;
+
+/**
+ * Benchmark for {@link Sorter} implementations.
+ *
+ * <p>Run the static {@link #main(String[])} method to start the benchmark.
+ */
+public class SorterBenchmark {
+
+  private static final int ARRAY_LENGTH = 20000;
+  private static final int RUNS = 10;
+  private static final int LOOPS = 100;
+
+  private enum SorterFactory {
+    INTRO_SORTER(
+        "IntroSorter",
+        (arr, s) -> {
+          return new ArrayIntroSorter<>(arr, Entry::compareTo);
+        }),
+    TIM_SORTER(
+        "TimSorter",
+        (arr, s) -> {
+          return new ArrayTimSorter<>(arr, Entry::compareTo, arr.length / 64);
+        }),
+    MERGE_SORTER(
+        "MergeSorter",
+        (arr, s) -> {
+          return new ArrayInPlaceMergeSorter<>(arr, Entry::compareTo);
+        }),
+    ;
+    final String name;
+    final Builder builder;
+
+    SorterFactory(String name, Builder builder) {
+      this.name = name;
+      this.builder = builder;
+    }
+
+    interface Builder {
+      Sorter build(Entry[] arr, Strategy strategy);
+    }
+  }
+
+  public static void main(String[] args) throws Exception {
+    assert false : "Disable assertions to run the benchmark";
+    Random random = new Random(System.currentTimeMillis());
+    long seed = random.nextLong();
+
+    System.out.println("WARMUP");
+    benchmarkSorters(Strategy.RANDOM, random, seed);
+    System.out.println();
+
+    for (Strategy strategy : Strategy.values()) {
+      System.out.println(strategy);
+      benchmarkSorters(strategy, random, seed);
+    }
+  }
+
+  private static void benchmarkSorters(Strategy strategy, Random random, long seed) {
+    for (SorterFactory sorterFactory : SorterFactory.values()) {
+      System.out.print("  " + padRight(sorterFactory.name, 13) + "...");
+      random.setSeed(seed);
+      benchmarkSorter(strategy, sorterFactory, random);
+      System.out.println();
+    }
+  }
+
+  private static void benchmarkSorter(
+      Strategy strategy, SorterFactory sorterFactory, Random random) {
+    for (int i = 0; i < RUNS; i++) {
+      Entry[] original = createArray(strategy, random);
+      Entry[] clone = original.clone();
+      Sorter sorter = sorterFactory.builder.build(clone, strategy);
+      long startTimeNs = System.nanoTime();
+      for (int j = 0; j < LOOPS; j++) {
+        System.arraycopy(original, 0, clone, 0, original.length);
+        sorter.sort(0, clone.length);
+      }
+      long timeMs = (System.nanoTime() - startTimeNs) / 1000000;
+      System.out.print(" " + padLeft(timeMs, 4));

Review comment:
       String.printf("%5f\n", timeMs), String.printf("%-5f\n", timeMs)?
   

##########
File path: lucene/core/src/test/org/apache/lucene/util/BaseSortTestCase.java
##########
@@ -68,70 +69,77 @@ public void test(Entry[] arr) {
   enum Strategy {
     RANDOM {
       @Override
-      public void set(Entry[] arr, int i) {
-        arr[i] = new Entry(random().nextInt(), i);
+      public void set(Entry[] arr, int i, Random random) {

Review comment:
       Thanks for using an explicit random here. random() in loops is costly.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org