You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2021/01/02 14:42:45 UTC
[GitHub] [lucene-solr] uschindler opened a new pull request #2176: Initial rewrite of MMapDirectory for JDK-16 preview (incubating) PANAMA APIs
uschindler opened a new pull request #2176:
URL: https://github.com/apache/lucene-solr/pull/2176
This is just a draft PR for a first insight on memory maaping improvements in JDK 16+.
Some background information: Starting with JDK-14, there is a new incubating module "jdk.incubator.foreign" that has a new, not yet stable API for accessing off-heap memory (and later it will also support calling functions using classical MethodHandles that are located in libraries like .so or .dll files). This incubator module has several versions:
- first version: https://openjdk.java.net/jeps/370 (slow, very buggy and thread confinement, so making it unuseable with Lucene)
- second version: https://openjdk.java.net/jeps/383 (still thread confinement, but now allows transfer of "ownership" to other threads; this is still impossible to use with Lucene.
- third version in JDK 16: https://openjdk.java.net/jeps/393 (this version has included "Support for shared segments"). This now allows us to safely use the same external mmaped memory from different threads and also unmap it!
This module more or less overcomes several problems:
- ByteBuffer API is limited to 32bit (in fact MMapDirectory has to chunk in 1 GiB portions)
- There is no official way to unmap ByteBuffers as the file is not used. There is a way to use `sun.misc.Unsafe` and forcefully unmap segments, but any IndexInput accessing the file from another thread will crush the JVM with SIGSEGV or SIGBUS. We learned to live with that, but that's the main issue.
@uschindler had many discussions with the team at OpenJDK and finally with the third incubator, we have an API that works with Lucene. It was very fruitful discussions (thanks to @mcimadamore !)
With the third incubator we are now finally able to do some tests (especially performance). As this is an incubating module, this PR first changes a bit the build system:
- disable `-Werror` for `:lucene:core`
- add the incubating module to compiler of `:lucene:core` and enable it for all test builds. This is important, as you have to mass `--add-modules jdk.incubator.foreign` also at runtime!
The code basically just modifies `MMapDirectory` to use LONG instead of INT for the chunk size parameter. In addition it adds `MemorySegmentIndexInput` that is a copy of our `ByteBufferIndexInput` (still there, but unused), but using MemorySegment instead of ByteBuffer behind the scenes. It works in exactly the same way, just the try/catch blocks for supporting EOFException or moving to another segment were rewritten.
The openInput code uses `MemorySegment.mapFile()` to get a memory mapping. This method is unfortunately a bit buggy in JDK-16-ea-b30, so I added some workarounds. See JDK issues: https://bugs.openjdk.java.net/browse/JDK-8259027, https://bugs.openjdk.java.net/browse/JDK-8259028, https://bugs.openjdk.java.net/browse/JDK-8259032, https://bugs.openjdk.java.net/browse/JDK-8259034
It passes all tests and it looks like you can use it to read indexes. The default chunk size is now 16 GiB (but you can raise or lower it as you like; tests are doing this). Of course you can set it to Long.MAX_VALUE, in that case every index file is always mapped to one big memory mapping. My testing with Windows 10 have shown, that this is *not a good idea!!!*. Huge mappings fragment address space over time and as we can only use like 43 or 46 bits (depending on OS), the fragmentation will at some point kill you. So 16 GiB looks like a good compromise: Most files will be smaller than 6 GiB anyways (unless you optimize your index to one huge segment). So for most Lucene installations, the number of segments will equal the number of open files, so Elasticsearch huge user consumers will be very happy. The sysctl max_map_count may not need to be touched anymore.
In addition, this implements `readLELongs` in a better way than @jpountz did (no caching or arbitrary objects). Nevertheless, as the new MemorySegment API relies on final, unmodifiable classes and coping memory from a MemorySegment to a on-heap Java array, it requires us to wrap all those arrays using a MemorySegment each time (e.g. in `readBytes()` or `readLELongs`), there may be some overhead du to short living object allocations (those are NOT reuseable!!!). _In short: In future we should throw away on coping/loading our stuff to heap and maybe throw away IndexInput completely and base our code fully on random access. The new foreign-vector APIs will in future also be written with MemorySegment in its focus. So you can allocate a vector view on a MemorySegment and let the vectorizer fully work outside java heap inside our mmapped files! :-)_
It would be good if you could checkout this branch and try it in production.
But be aware:
- You need JDK 11 to run Gradle (set `JAVA_HOME` to it)
- You need JDK 16-ea-b30 (set `RUNTIME_JAVA_HOME` to it)
- The lucene-core.jar will be JDK16 class files and requires JDK-16 to execute.
- Also you need to add `--add-modules jdk.incubator.foreign` to the command line of your Java program/Solr server/Elasticsearch server
It would be good to get some benchmarks, especially by @rmuir or @mikemccand. Take your time and enjoy the complexity.
My plan is the following:
- report any bugs or slowness, especially with Hotspot optimizaions. The last time I talked to Maurizio, he taked about Hotspot not being able to fully optimize for-loops with long instead of int, so it may take some time until the full performance is there.
- wait until the final version of project PANAMA-foreign goes into Java's Core Library (no module needed anymore)
- add a MR-JAR for lucene-core.jar and compile the MemorySegmentIndexInput and maybe some helper classes with JDK 17/18/19 (hopefully?).
In addition there are some comments in the code tlaking about safety (e.g., we need `IOUtils.close()` taking `AutoCloseable` instead of just `Closeable`, so we can also enfoce that all memory segments are closed after usage. In addition, by default all VarHandles are aligned. By default it refuses to read a LONG from an address which is not a multiple of 8. I had to disable this feature, as all our index files are heavily unaliged. We should in meantime not only convert our files to little endian, but also make all non-compressed types (like `long[]` arrays or non-encoded integers be aligned to the correct boundaries in files). The most horrible thing I have seen is that our CFS file format starts the "inner" files totally unaligned. We should fix the CFSWriter to start new files always at multiples of 8 bytes. I will open an issue about this.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org
[GitHub] [lucene-solr] uschindler commented on pull request #2176: Initial rewrite of MMapDirectory for JDK-16 preview (incubating) PANAMA APIs
Posted by GitBox <gi...@apache.org>.
uschindler commented on pull request #2176:
URL: https://github.com/apache/lucene-solr/pull/2176#issuecomment-753673096
Hi read the instructions above, at "be aware". Gradle ist Not compatible to JDK 16 at all. So Gradle needs to run with JDK 11. You can pass the JDK 16 directory using an environment variable or through sysprop. Gradlew shows all options when you run it without any options and read instructions about alternative jvms.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org
[GitHub] [lucene-solr] uschindler commented on a change in pull request #2176: Initial rewrite of MMapDirectory for JDK-16 preview (incubating) PANAMA APIs
Posted by GitBox <gi...@apache.org>.
uschindler commented on a change in pull request #2176:
URL: https://github.com/apache/lucene-solr/pull/2176#discussion_r551191931
##########
File path: gradle/defaults-java.gradle
##########
@@ -60,3 +60,14 @@ allprojects {
}
}
}
+
+configure(project(":lucene:core")) {
Review comment:
We can't release that anyways. Earliest ist jdk 17, if not 18. The API is not stable at all.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org
[GitHub] [lucene-solr] dweiss commented on a change in pull request #2176: Initial rewrite of MMapDirectory for JDK-16 preview (incubating) PANAMA APIs
Posted by GitBox <gi...@apache.org>.
dweiss commented on a change in pull request #2176:
URL: https://github.com/apache/lucene-solr/pull/2176#discussion_r551179881
##########
File path: gradle/defaults-java.gradle
##########
@@ -60,3 +60,14 @@ allprojects {
}
}
}
+
+configure(project(":lucene:core")) {
Review comment:
So if we want this in before JDK16 becomes the minimum platform we'd need to use a multi-release jar again, right?
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org
[GitHub] [lucene-solr] uschindler commented on pull request #2176: Initial rewrite of MMapDirectory for JDK-16 preview (incubating) Panama APIs (>= JDK-16-ea-b32)
Posted by GitBox <gi...@apache.org>.
uschindler commented on pull request #2176:
URL: https://github.com/apache/lucene-solr/pull/2176#issuecomment-855874294
I will close this and would like to move discussion over to https://github.com/apache/lucene/pull/173
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org
[GitHub] [lucene-solr] uschindler commented on pull request #2176: Initial rewrite of MMapDirectory for JDK-16 preview (incubating) PANAMA APIs
Posted by GitBox <gi...@apache.org>.
uschindler commented on pull request #2176:
URL: https://github.com/apache/lucene-solr/pull/2176#issuecomment-753691565
Hi @msokolov: I removed some incorrect assert, which may make test fail on linux/mac.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org
[GitHub] [lucene-solr] uschindler edited a comment on pull request #2176: Initial rewrite of MMapDirectory for JDK-16 preview (incubating) PANAMA APIs
Posted by GitBox <gi...@apache.org>.
uschindler edited a comment on pull request #2176:
URL: https://github.com/apache/lucene-solr/pull/2176#issuecomment-754304561
This would also explain slowdowns on *short* posting-list iterations, as readLELongs() with smaller number of longs is not working well.
My plan would be to do an if statement at beginning of readBytes() or readLELongs() like:
```java
if (length < SOME_LIMIT) {
for (int i = offset; i< offset + length; i++) {
bytes[i] = readByte();
}
} else {
... current code with targetSlice ...
}
```
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org
[GitHub] [lucene-solr] uschindler commented on pull request #2176: Initial rewrite of MMapDirectory for JDK-16 preview (incubating) PANAMA APIs
Posted by GitBox <gi...@apache.org>.
uschindler commented on pull request #2176:
URL: https://github.com/apache/lucene-solr/pull/2176#issuecomment-753547873
I fixed the rmeinaing TODOs regarding a safe close of *all* segments, when exceptions on `map()` occur. When closing the master IndexInput, we also make sure to unmap all segments, although exceptions might occur (e.g. on concurrent access, `close()` may fail with IllegalStateException). Those exceptions are bubbled up.
As MemorySegment does not implement `Closeable` but the more generic `AutoCloseable`, I used `IOUtils.applyToAll()` with `MemorySegment::close` as method reference to the close method (heavy functional interface adaption, ey?)
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org
[GitHub] [lucene-solr] uschindler commented on a change in pull request #2176: Initial rewrite of MMapDirectory for JDK-16 preview (incubating) PANAMA APIs
Posted by GitBox <gi...@apache.org>.
uschindler commented on a change in pull request #2176:
URL: https://github.com/apache/lucene-solr/pull/2176#discussion_r551061081
##########
File path: lucene/core/src/java/org/apache/lucene/store/MemorySegmentIndexInput.java
##########
@@ -0,0 +1,525 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.store;
+
+import java.io.EOFException;
+import java.io.IOException;
+import java.lang.invoke.VarHandle;
+import java.nio.ByteOrder;
+import java.util.Arrays;
+import java.util.Objects;
+
+import org.apache.lucene.util.IOUtils;
+
+import jdk.incubator.foreign.MemoryHandles;
+import jdk.incubator.foreign.MemorySegment;
+
+/**
+ * Base IndexInput implementation that uses an array of MemorySegments to represent a file.
+ *
+ * <p>For efficiency, this class requires that the segment size are a power-of-two
+ * (<code>chunkSizePower</code>).
+ */
+public abstract class MemorySegmentIndexInput extends IndexInput implements RandomAccessInput {
+ // We pass 1L as alignment, because currently Lucene file formats are heavy unaligned: :(
+ static final VarHandle VH_getByte = MemoryHandles.varHandle(byte.class, 1L, ByteOrder.BIG_ENDIAN).withInvokeExactBehavior();
+ static final VarHandle VH_getShort = MemoryHandles.varHandle(short.class, 1L, ByteOrder.BIG_ENDIAN).withInvokeExactBehavior();
+ static final VarHandle VH_getInt = MemoryHandles.varHandle(int.class, 1L, ByteOrder.BIG_ENDIAN).withInvokeExactBehavior();
+ static final VarHandle VH_getLong = MemoryHandles.varHandle(long.class, 1L, ByteOrder.BIG_ENDIAN).withInvokeExactBehavior();
+
+ final boolean isClone;
+ final long length;
+ final long chunkSizeMask;
+ final int chunkSizePower;
+ final MemorySegment[] segments;
+
+ int curSegmentIndex = -1;
+ MemorySegment curSegment; // redundant for speed: segments[curSegmentIndex], also marker if closed!
+ long curPosition; // relative to curSegment, not globally
+
+ public static MemorySegmentIndexInput newInstance(
+ String resourceDescription,
+ MemorySegment[] segments,
+ long length,
+ int chunkSizePower) {
+ if (segments.length == 1) {
+ return new SingleSegmentImpl(resourceDescription, segments[0], length, chunkSizePower, false);
+ } else {
+ return new MultiSegmentImpl(resourceDescription, segments, 0, length, chunkSizePower, false);
+ }
+ }
+
+ private MemorySegmentIndexInput(
+ String resourceDescription,
+ MemorySegment[] segments,
+ long length,
+ int chunkSizePower,
+ boolean isClone) {
+ super(resourceDescription);
+ this.segments = segments;
+ this.length = length;
+ this.chunkSizePower = chunkSizePower;
+ this.chunkSizeMask = (1L << chunkSizePower) - 1L;
+ this.isClone = isClone;
+ this.curSegment = segments[0];
+ }
+
+ void ensureOpen() {
+ if (curSegment == null) {
+ throw alreadyClosed();
+ }
+ }
+
+ RuntimeException wrapAlreadyClosedException(RuntimeException e) {
+ if (e instanceof NullPointerException) {
+ return alreadyClosed();
+ }
+ // TODO: maybe open a JDK issue to have a separate, more
+ // meaningful exception for unmapped segments:
+ if (e.getMessage() != null && e.getMessage().contains("closed")) {
+ return alreadyClosed();
+ }
+ return e;
+ }
+
+ // return value only for compiler!
Review comment:
no, we need some return value so we can call it and return a value. But as it is throwing an exception anyways, the return value does not matter.
These tricks are often used for exception handlers to delegate "exception handling" to a method where you know it will never return and just bubble up the exception:
- if the exception is known before (by type), you use `throw callSomeMethod(param)`. The pro thing: it's clean as you delegate the instantiation of the exception to the factory, but throw it on your own. The problem here is: only works with a known exception type - and a single one, because the method needs it as return type. A generic `Exception` is a desaster, as it requires to declare that on throws!
- an alternative is (and that's used here): put the throw clause also into the method, so the only sense of our method is throwing some exception, it never returns cleanly. Backside: the method needs a return type, so you can do: `return callSomeMethod()` from the caller. Of course you can add a separate return statement after the method call, but that looks more bullshitty.
Here we use the second approch. The most common return type to all callers of this method is `byte`. Compiler is happy, voila!
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org
[GitHub] [lucene-solr] uschindler commented on pull request #2176: Initial rewrite of MMapDirectory for JDK-16 preview (incubating) PANAMA APIs
Posted by GitBox <gi...@apache.org>.
uschindler commented on pull request #2176:
URL: https://github.com/apache/lucene-solr/pull/2176#issuecomment-753512086
FYI, after I was able to run tests in a more convenient way, I figured out that some tests fail from time to time because of using `HandleTrackingFS`. I disabled all custom filesystems in the PR, but it looks like some tests use `HandleTrackingFS`.
I forgot to mention this in my description above: MMapDirectory no longer works with custom java.nio.filesystem.FileSystem, as it tries to cast the FileChannel internally (see JDK issue: https://bugs.openjdk.java.net/browse/JDK-8259028). For now we have to disable all custom file systems in our test suite until this bug is fixed.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org
[GitHub] [lucene-solr] msokolov edited a comment on pull request #2176: Initial rewrite of MMapDirectory for JDK-16 preview (incubating) PANAMA APIs
Posted by GitBox <gi...@apache.org>.
msokolov edited a comment on pull request #2176:
URL: https://github.com/apache/lucene-solr/pull/2176#issuecomment-754153605
I ran some vector search performance tests even though this isn't probably the most representative test, but since I have been running these for other reasons lately and I'm all geared up for it, I thought I would fire them up with these changes. I see a pronounced slowdown there, about 40% increase in latency for these KNN searches with this patch (I ran both conditions w/JDK16). Most of the workload from these searches are random-access reads of bytes from the index, conversion to float[], and computing dot products.
I did recently discover that we were (inadvertently) reading and writing the `float[]` vectors that these searches are based on as big-endian, so perhaps if (when) we switch that (I have a pending patch), it would work better in conjunction with this??, although currently on master, at the `IndexInput` level it is just reading bytes on the heap and only afterwards converting those to `float[]`, so it doesn't seem likely there would be any interaction there; I just mention it since it's top of mind, and since I plan to add `DataInput.readFloats()`, which would be needed here too.
Then I also ran luceneutil tests and see interesting mixed results:
```
TaskQPS baseline StdDevQPS candidate StdDev Pct diff p-value
BrowseMonthTaxoFacets 2.15 (7.7%) 1.19 (2.1%) -44.5% ( -50% - -37%) 0.000
BrowseDayOfYearTaxoFacets 2.03 (8.1%) 1.16 (2.4%) -43.0% ( -49% - -35%) 0.000
BrowseDateTaxoFacets 2.03 (8.2%) 1.16 (2.3%) -42.9% ( -49% - -35%) 0.000
Respell 52.33 (2.4%) 46.89 (1.7%) -10.4% ( -14% - -6%) 0.000
BrowseMonthSSDVFacets 12.00 (13.2%) 10.81 (6.9%) -10.0% ( -26% - 11%) 0.003
PKLookup 132.89 (2.3%) 120.09 (1.3%) -9.6% ( -13% - -6%) 0.000
AndHighHigh 24.98 (3.0%) 22.61 (2.1%) -9.5% ( -14% - -4%) 0.000
MedSloppyPhrase 15.66 (3.9%) 14.37 (2.0%) -8.2% ( -13% - -2%) 0.000
LowSpanNear 96.34 (2.3%) 88.72 (1.6%) -7.9% ( -11% - -4%) 0.000
HighSloppyPhrase 10.57 (4.6%) 9.75 (2.7%) -7.7% ( -14% - 0%) 0.000
Prefix3 37.23 (7.2%) 34.53 (6.1%) -7.2% ( -19% - 6%) 0.001
MedSpanNear 15.63 (1.5%) 14.52 (2.0%) -7.1% ( -10% - -3%) 0.000
LowSloppyPhrase 15.11 (4.4%) 14.06 (2.5%) -6.9% ( -13% - 0%) 0.000
AndHighMed 72.87 (3.4%) 67.88 (3.0%) -6.8% ( -12% - 0%) 0.000
AndHighLow 366.41 (3.2%) 342.99 (3.6%) -6.4% ( -12% - 0%) 0.000
OrHighMed 69.00 (2.6%) 65.09 (2.7%) -5.7% ( -10% - 0%) 0.000
LowTerm 930.90 (5.7%) 878.74 (4.3%) -5.6% ( -14% - 4%) 0.000
Fuzzy2 46.43 (7.2%) 43.94 (5.4%) -5.4% ( -16% - 7%) 0.008
Fuzzy1 64.31 (6.9%) 60.98 (6.5%) -5.2% ( -17% - 8%) 0.014
HighSpanNear 19.07 (1.7%) 18.12 (2.6%) -5.0% ( -9% - 0%) 0.000
MedPhrase 21.45 (2.4%) 20.53 (1.9%) -4.3% ( -8% - 0%) 0.000
HighTermTitleBDVSort 73.32 (16.1%) 70.33 (17.0%) -4.1% ( -32% - 34%) 0.437
OrHighHigh 16.82 (1.7%) 16.19 (1.9%) -3.7% ( -7% - 0%) 0.000
BrowseDayOfYearSSDVFacets 9.56 (8.7%) 9.24 (6.2%) -3.4% ( -16% - 12%) 0.159
HighIntervalsOrdered 9.02 (0.6%) 8.78 (0.7%) -2.7% ( -4% - -1%) 0.000
OrNotHighMed 324.93 (2.7%) 316.20 (3.6%) -2.7% ( -8% - 3%) 0.007
Wildcard 129.01 (2.5%) 126.91 (2.6%) -1.6% ( -6% - 3%) 0.046
HighTermDayOfYearSort 113.01 (11.9%) 111.68 (10.6%) -1.2% ( -21% - 24%) 0.741
OrNotHighLow 539.19 (2.8%) 534.10 (2.8%) -0.9% ( -6% - 4%) 0.284
OrHighLow 314.01 (5.4%) 311.23 (5.9%) -0.9% ( -11% - 11%) 0.620
HighTermMonthSort 96.00 (9.6%) 95.48 (10.1%) -0.5% ( -18% - 21%) 0.863
TermDTSort 100.42 (10.9%) 100.52 (13.4%) 0.1% ( -21% - 27%) 0.980
LowPhrase 259.52 (1.9%) 262.37 (2.2%) 1.1% ( -2% - 5%) 0.091
HighTerm 872.57 (3.9%) 892.07 (3.4%) 2.2% ( -4% - 9%) 0.052
MedTerm 1003.36 (4.8%) 1029.34 (4.8%) 2.6% ( -6% - 12%) 0.086
HighPhrase 220.30 (1.9%) 228.63 (3.3%) 3.8% ( -1% - 9%) 0.000
OrHighNotHigh 372.34 (3.4%) 391.32 (4.2%) 5.1% ( -2% - 13%) 0.000
OrNotHighHigh 389.61 (2.7%) 415.68 (5.5%) 6.7% ( -1% - 15%) 0.000
OrHighNotMed 409.35 (1.9%) 445.76 (6.9%) 8.9% ( 0% - 18%) 0.000
OrHighNotLow 513.64 (4.2%) 561.96 (7.9%) 9.4% ( -2% - 22%) 0.000
IntNRQ 203.68 (2.3%) 224.37 (4.7%) 10.2% ( 3% - 17%) 0.000
```
note on the setup: I had to tinker with luceneutil a bit to get it to use JDK16 runtime and add the foreign memory access module. gradle seemed to work OK given the environment variable setup discussed above. I did check the command line to make sure it was in fact using JDK16, and I saw it fail once (before I added the module import command line argument), so I'm sure it is.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org
[GitHub] [lucene-solr] uschindler commented on pull request #2176: Initial rewrite of MMapDirectory for JDK-16 preview (incubating) PANAMA APIs
Posted by GitBox <gi...@apache.org>.
uschindler commented on pull request #2176:
URL: https://github.com/apache/lucene-solr/pull/2176#issuecomment-753503210
@dweiss I need your help here. There is one thing that drives me crazy: with the changes (actually adding `--add-modules", "jdk.incubator.foreign` to the test command line, JUnit 4 suddely fails for many "non-tests" with exceptions like this.
This seems to have to do with some automatism by Gradle or by JUnit once it detects the "Java module system", and I have no idea how to turn it off. What I am completely wondering about: why does it load those classes at all? I was trying to find the good old "Ant fileset" that has this `**/*Test.class,**/Test*.class` pattern, but this is no longer there with gradle. How to readd it, if JUnit/Gradle seems to think that it needs some (broken) module system API.
Unless this is fixed, I can't run all tests easily. I only ran them per module and ignored the tons of stack traces. PLEASE HELP!
```
org.apache.lucene.backward_codecs.lucene60.Lucene60PointsWriter > initializationError FAILED
java.lang.IllegalArgumentException: Test class can only have one constructor
at org.junit.runners.model.TestClass.<init>(TestClass.java:48)
at org.junit.runners.JUnit4.<init>(JUnit4.java:23)
at org.junit.internal.builders.JUnit4Builder.runnerForClass(JUnit4Builder.java:10)
at org.junit.runners.model.RunnerBuilder.safeRunnerForClass(RunnerBuilder.java:70)
at org.junit.internal.builders.AllDefaultPossibilitiesBuilder.runnerForClass(AllDefaultPossibilitiesBuilder.java:37)
at org.junit.runners.model.RunnerBuilder.safeRunnerForClass(RunnerBuilder.java:70)
at org.junit.internal.requests.ClassRequest.createRunner(ClassRequest.java:28)
at org.junit.internal.requests.MemoizingRequest.getRunner(MemoizingRequest.java:19)
at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:78)
at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:58)
at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:38)
at org.gradle.api.internal.tasks.testing.junit.AbstractJUnitTestClassProcessor.processTestClass(AbstractJUnitTestClassProcessor.java:62)
at org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:78)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:567)
at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:36)
at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
at org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:33)
at org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:94)
at jdk.proxy1/jdk.proxy1.$Proxy2.processTestClass(Unknown Source)
at org.gradle.api.internal.tasks.testing.worker.TestWorker.processTestClass(TestWorker.java:119)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:78)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:567)
at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:36)
at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
at org.gradle.internal.remote.internal.hub.MessageHubBackedObjectConnection$DispatchWrapper.dispatch(MessageHubBackedObjectConnection.java:182)
at org.gradle.internal.remote.internal.hub.MessageHubBackedObjectConnection$DispatchWrapper.dispatch(MessageHubBackedObjectConnection.java:164)
at org.gradle.internal.remote.internal.hub.MessageHub$Handler.run(MessageHub.java:414)
at org.gradle.internal.concurrent.ExecutorPolicy$CatchAndRecordFailures.onExecute(ExecutorPolicy.java:64)
at org.gradle.internal.concurrent.ManagedExecutorImpl$1.run(ManagedExecutorImpl.java:48)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630)
at org.gradle.internal.concurrent.ThreadFactoryImpl$ManagedThreadRunnable.run(ThreadFactoryImpl.java:56)
at java.base/java.lang.Thread.run(Thread.java:831)
```
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org
[GitHub] [lucene-solr] dweiss commented on a change in pull request #2176: Initial rewrite of MMapDirectory for JDK-16 preview (incubating) PANAMA APIs
Posted by GitBox <gi...@apache.org>.
dweiss commented on a change in pull request #2176:
URL: https://github.com/apache/lucene-solr/pull/2176#discussion_r551193215
##########
File path: gradle/defaults-java.gradle
##########
@@ -60,3 +60,14 @@ allprojects {
}
}
}
+
+configure(project(":lucene:core")) {
Review comment:
Ok, just wondering!
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org
[GitHub] [lucene-solr] msokolov commented on a change in pull request #2176: Initial rewrite of MMapDirectory for JDK-16 preview (incubating) PANAMA APIs
Posted by GitBox <gi...@apache.org>.
msokolov commented on a change in pull request #2176:
URL: https://github.com/apache/lucene-solr/pull/2176#discussion_r551104528
##########
File path: lucene/core/src/java/org/apache/lucene/store/MemorySegmentIndexInput.java
##########
@@ -0,0 +1,525 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.store;
+
+import java.io.EOFException;
+import java.io.IOException;
+import java.lang.invoke.VarHandle;
+import java.nio.ByteOrder;
+import java.util.Arrays;
+import java.util.Objects;
+
+import org.apache.lucene.util.IOUtils;
+
+import jdk.incubator.foreign.MemoryHandles;
+import jdk.incubator.foreign.MemorySegment;
+
+/**
+ * Base IndexInput implementation that uses an array of MemorySegments to represent a file.
+ *
+ * <p>For efficiency, this class requires that the segment size are a power-of-two
+ * (<code>chunkSizePower</code>).
+ */
+public abstract class MemorySegmentIndexInput extends IndexInput implements RandomAccessInput {
+ // We pass 1L as alignment, because currently Lucene file formats are heavy unaligned: :(
+ static final VarHandle VH_getByte = MemoryHandles.varHandle(byte.class, 1L, ByteOrder.BIG_ENDIAN).withInvokeExactBehavior();
+ static final VarHandle VH_getShort = MemoryHandles.varHandle(short.class, 1L, ByteOrder.BIG_ENDIAN).withInvokeExactBehavior();
+ static final VarHandle VH_getInt = MemoryHandles.varHandle(int.class, 1L, ByteOrder.BIG_ENDIAN).withInvokeExactBehavior();
+ static final VarHandle VH_getLong = MemoryHandles.varHandle(long.class, 1L, ByteOrder.BIG_ENDIAN).withInvokeExactBehavior();
+
+ final boolean isClone;
+ final long length;
+ final long chunkSizeMask;
+ final int chunkSizePower;
+ final MemorySegment[] segments;
+
+ int curSegmentIndex = -1;
+ MemorySegment curSegment; // redundant for speed: segments[curSegmentIndex], also marker if closed!
+ long curPosition; // relative to curSegment, not globally
+
+ public static MemorySegmentIndexInput newInstance(
+ String resourceDescription,
+ MemorySegment[] segments,
+ long length,
+ int chunkSizePower) {
+ if (segments.length == 1) {
+ return new SingleSegmentImpl(resourceDescription, segments[0], length, chunkSizePower, false);
+ } else {
+ return new MultiSegmentImpl(resourceDescription, segments, 0, length, chunkSizePower, false);
+ }
+ }
+
+ private MemorySegmentIndexInput(
+ String resourceDescription,
+ MemorySegment[] segments,
+ long length,
+ int chunkSizePower,
+ boolean isClone) {
+ super(resourceDescription);
+ this.segments = segments;
+ this.length = length;
+ this.chunkSizePower = chunkSizePower;
+ this.chunkSizeMask = (1L << chunkSizePower) - 1L;
+ this.isClone = isClone;
+ this.curSegment = segments[0];
+ }
+
+ void ensureOpen() {
+ if (curSegment == null) {
+ throw alreadyClosed();
+ }
+ }
+
+ RuntimeException wrapAlreadyClosedException(RuntimeException e) {
+ if (e instanceof NullPointerException) {
+ return alreadyClosed();
+ }
+ // TODO: maybe open a JDK issue to have a separate, more
+ // meaningful exception for unmapped segments:
+ if (e.getMessage() != null && e.getMessage().contains("closed")) {
+ return alreadyClosed();
+ }
+ return e;
+ }
+
+ // return value only for compiler!
Review comment:
I see, it saves the (bullshitty) pretense of returning some dummy value when that case will never actually occur - neat
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org
[GitHub] [lucene-solr] uschindler edited a comment on pull request #2176: Initial rewrite of MMapDirectory for JDK-16 preview (incubating) PANAMA APIs
Posted by GitBox <gi...@apache.org>.
uschindler edited a comment on pull request #2176:
URL: https://github.com/apache/lucene-solr/pull/2176#issuecomment-754299385
Hi @msokolov,
thanks for the speed test - I wanted to wait for the specialists to do it. As said before, we should report slowness (or improvements) to the Hotpsot/OpenJDK people. I was expecting a result like this due to the early stage of the development and misisng Hotspot optimizations. Just to clarify: In your code, you are reading the floats with a series of readInt() from the indexinput, converting them to float?
If that's the case, I would say: this is expected! From what I figured out on the mailing lists: loops using `long` as an index vs. `int` as an index are not supported well by Hotspot yet. The position() argument of ByteBuffers is a 32bits `int`and whenever you read from memory it uses this int (`index += 4`). Those loops are easy optimized. With `MemorySegment`s, the pointer is a 64bit `long` named `curPosition` in our `MemorySegmentIndexInput`. Hotspot has no idea how to optimize that loop of readInt() statements! I am not sure if this is fully true at this stage but that's my first guess, I will discuss this with Maurizio.
Interestingly, I would have expected that `readBytes()` and `readLELongs()` are slower due to the overheads by wrapping all those `slice()` calls and `MemorySegment.ofArray()`all producing short living object instances! But I seem to be proved wrong :-)
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org
[GitHub] [lucene-solr] uschindler commented on pull request #2176: Initial rewrite of MMapDirectory for JDK-16 preview (incubating) PANAMA APIs
Posted by GitBox <gi...@apache.org>.
uschindler commented on pull request #2176:
URL: https://github.com/apache/lucene-solr/pull/2176#issuecomment-753492065
> Very exciting! Thank you for leading the way here, Uwe.
>
>> The most horrible thing I have seen is that our CFS file format starts the "inner" files totally unaligned. We should fix the CFSWriter to start new files always at multiples of 8 bytes. I will open an issue about this.
>
> I ran into this just yesterday as I was playing around with aligning vectors in their index files to see if any perf bump could be gained (the header seems to be 50 bytes usually, so some padding is needed). And then when I asserted alignment, realized that CFS was messing with it.
In fact for CFS file we don't even need to change the file format / version number. We just ave to make sure that CFS writer starts new inner files aligned to 8 bytes. That should be easy to implement.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org
[GitHub] [lucene-solr] uschindler commented on pull request #2176: Initial rewrite of MMapDirectory for JDK-16 preview (incubating) PANAMA APIs
Posted by GitBox <gi...@apache.org>.
uschindler commented on pull request #2176:
URL: https://github.com/apache/lucene-solr/pull/2176#issuecomment-755778832
Hi @msokolov: I added the readLEFloats() from #2175 here.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org
[GitHub] [lucene-solr] msokolov commented on pull request #2176: Initial rewrite of MMapDirectory for JDK-16 preview (incubating) PANAMA APIs
Posted by GitBox <gi...@apache.org>.
msokolov commented on pull request #2176:
URL: https://github.com/apache/lucene-solr/pull/2176#issuecomment-753672740
Ah, OK once I set `RUNTIME_JAVA_HOME` I was able to compile OK
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org
[GitHub] [lucene-solr] uschindler commented on pull request #2176: Initial rewrite of MMapDirectory for JDK-16 preview (incubating) PANAMA APIs
Posted by GitBox <gi...@apache.org>.
uschindler commented on pull request #2176:
URL: https://github.com/apache/lucene-solr/pull/2176#issuecomment-754302399
> No, the floats are read using readBytes() into a ByteBuffer and then converted using a derived FloatBuffer. So I think your intuition about readBytes() is correct?
Huh? Really. I thought you are using the code in readLELongs default implementation on IndexInput. So you also copy stuff 2 times! That's even worse, it would have been better to the ByteBuffer directly from IndexInput: `curSegment.slice(offset, length).asByteBuffer()` would return a view on the bytes. Or alternatively do it like with readLELongs().
But nevertheless, for readBytes the wrapping is extensive and may be a slowdown. Can you try this again using a loop of `Float.intBitsToFloat(input.readInt())`? If this is faster I would say that my first expectation is correct: If the number of integers or single bytes to read is small (maybe <50 or <100) then it might really be better to just have a loop calling readByte() or readLong() instead of the array views.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org
[GitHub] [lucene-solr] uschindler commented on pull request #2176: Initial rewrite of MMapDirectory for JDK-16 preview (incubating) PANAMA APIs
Posted by GitBox <gi...@apache.org>.
uschindler commented on pull request #2176:
URL: https://github.com/apache/lucene-solr/pull/2176#issuecomment-761142012
I removed the hacks for the bugs in JDK. The minimum requirement to test this draft implementation is *JDK-16-ea-b32*.
Thanks @msokolov for the performance tests. I was hoping that it gets a bit better, but we may need to figure out where the speed differences are coming from. I don't think results will change in the new preview build, I just removed the hacks.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org
[GitHub] [lucene-solr] uschindler commented on pull request #2176: Initial rewrite of MMapDirectory for JDK-16 preview (incubating) PANAMA APIs
Posted by GitBox <gi...@apache.org>.
uschindler commented on pull request #2176:
URL: https://github.com/apache/lucene-solr/pull/2176#issuecomment-753511728
@dweiss I found a workaround by adding the include/excludes as we use in Ant build. Why were those removed? Should I maybe reopen a new issue.
I have the feeling thois comes from the fact that class files compiled with JDK16 and using the new class file format can't be "analyzed" by gradle, so it assumes "no idea, let's run it because its a class file". The automatisms don't seem to work then. With the include/exclude patterns everything went back to normal.
Should we also add those lines to `defaults-tests.gradle`?:
```
include '**/*Test.class', '**/Test*.class'
exclude '**/*$*'
```
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org
[GitHub] [lucene-solr] uschindler commented on a change in pull request #2176: Initial rewrite of MMapDirectory for JDK-16 preview (incubating) PANAMA APIs
Posted by GitBox <gi...@apache.org>.
uschindler commented on a change in pull request #2176:
URL: https://github.com/apache/lucene-solr/pull/2176#discussion_r551061719
##########
File path: lucene/core/src/java/org/apache/lucene/store/MemorySegmentIndexInput.java
##########
@@ -0,0 +1,525 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.store;
+
+import java.io.EOFException;
+import java.io.IOException;
+import java.lang.invoke.VarHandle;
+import java.nio.ByteOrder;
+import java.util.Arrays;
+import java.util.Objects;
+
+import org.apache.lucene.util.IOUtils;
+
+import jdk.incubator.foreign.MemoryHandles;
+import jdk.incubator.foreign.MemorySegment;
+
+/**
+ * Base IndexInput implementation that uses an array of MemorySegments to represent a file.
+ *
+ * <p>For efficiency, this class requires that the segment size are a power-of-two
+ * (<code>chunkSizePower</code>).
+ */
+public abstract class MemorySegmentIndexInput extends IndexInput implements RandomAccessInput {
+ // We pass 1L as alignment, because currently Lucene file formats are heavy unaligned: :(
+ static final VarHandle VH_getByte = MemoryHandles.varHandle(byte.class, 1L, ByteOrder.BIG_ENDIAN).withInvokeExactBehavior();
+ static final VarHandle VH_getShort = MemoryHandles.varHandle(short.class, 1L, ByteOrder.BIG_ENDIAN).withInvokeExactBehavior();
+ static final VarHandle VH_getInt = MemoryHandles.varHandle(int.class, 1L, ByteOrder.BIG_ENDIAN).withInvokeExactBehavior();
+ static final VarHandle VH_getLong = MemoryHandles.varHandle(long.class, 1L, ByteOrder.BIG_ENDIAN).withInvokeExactBehavior();
+
+ final boolean isClone;
+ final long length;
+ final long chunkSizeMask;
+ final int chunkSizePower;
+ final MemorySegment[] segments;
+
+ int curSegmentIndex = -1;
+ MemorySegment curSegment; // redundant for speed: segments[curSegmentIndex], also marker if closed!
+ long curPosition; // relative to curSegment, not globally
+
+ public static MemorySegmentIndexInput newInstance(
+ String resourceDescription,
+ MemorySegment[] segments,
+ long length,
+ int chunkSizePower) {
+ if (segments.length == 1) {
+ return new SingleSegmentImpl(resourceDescription, segments[0], length, chunkSizePower, false);
+ } else {
+ return new MultiSegmentImpl(resourceDescription, segments, 0, length, chunkSizePower, false);
+ }
+ }
+
+ private MemorySegmentIndexInput(
+ String resourceDescription,
+ MemorySegment[] segments,
+ long length,
+ int chunkSizePower,
+ boolean isClone) {
+ super(resourceDescription);
+ this.segments = segments;
+ this.length = length;
+ this.chunkSizePower = chunkSizePower;
+ this.chunkSizeMask = (1L << chunkSizePower) - 1L;
+ this.isClone = isClone;
+ this.curSegment = segments[0];
+ }
+
+ void ensureOpen() {
+ if (curSegment == null) {
+ throw alreadyClosed();
+ }
+ }
+
+ RuntimeException wrapAlreadyClosedException(RuntimeException e) {
+ if (e instanceof NullPointerException) {
+ return alreadyClosed();
+ }
+ // TODO: maybe open a JDK issue to have a separate, more
+ // meaningful exception for unmapped segments:
+ if (e.getMessage() != null && e.getMessage().contains("closed")) {
+ return alreadyClosed();
+ }
+ return e;
+ }
+
+ // return value only for compiler!
Review comment:
See here: https://github.com/apache/lucene-solr/pull/2176/files#diff-1c8534e50633cd8648f2cb3c886a269a6ebf8a57d817885bd962225d83f4e93fR450
`byte` is fine for a method that can return `long`, `int`, `short`, `byte`,.... or `void`.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org
[GitHub] [lucene-solr] uschindler commented on a change in pull request #2176: Initial rewrite of MMapDirectory for JDK-16 preview (incubating) PANAMA APIs
Posted by GitBox <gi...@apache.org>.
uschindler commented on a change in pull request #2176:
URL: https://github.com/apache/lucene-solr/pull/2176#discussion_r551194238
##########
File path: lucene/core/src/java/org/apache/lucene/store/MemorySegmentIndexInput.java
##########
@@ -0,0 +1,525 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.store;
+
+import java.io.EOFException;
+import java.io.IOException;
+import java.lang.invoke.VarHandle;
+import java.nio.ByteOrder;
+import java.util.Arrays;
+import java.util.Objects;
+
+import org.apache.lucene.util.IOUtils;
+
+import jdk.incubator.foreign.MemoryHandles;
+import jdk.incubator.foreign.MemorySegment;
+
+/**
+ * Base IndexInput implementation that uses an array of MemorySegments to represent a file.
+ *
+ * <p>For efficiency, this class requires that the segment size are a power-of-two
+ * (<code>chunkSizePower</code>).
+ */
+public abstract class MemorySegmentIndexInput extends IndexInput implements RandomAccessInput {
+ // We pass 1L as alignment, because currently Lucene file formats are heavy unaligned: :(
+ static final VarHandle VH_getByte = MemoryHandles.varHandle(byte.class, 1L, ByteOrder.BIG_ENDIAN).withInvokeExactBehavior();
+ static final VarHandle VH_getShort = MemoryHandles.varHandle(short.class, 1L, ByteOrder.BIG_ENDIAN).withInvokeExactBehavior();
+ static final VarHandle VH_getInt = MemoryHandles.varHandle(int.class, 1L, ByteOrder.BIG_ENDIAN).withInvokeExactBehavior();
+ static final VarHandle VH_getLong = MemoryHandles.varHandle(long.class, 1L, ByteOrder.BIG_ENDIAN).withInvokeExactBehavior();
+
+ final boolean isClone;
+ final long length;
+ final long chunkSizeMask;
+ final int chunkSizePower;
+ final MemorySegment[] segments;
+
+ int curSegmentIndex = -1;
+ MemorySegment curSegment; // redundant for speed: segments[curSegmentIndex], also marker if closed!
+ long curPosition; // relative to curSegment, not globally
+
+ public static MemorySegmentIndexInput newInstance(
+ String resourceDescription,
+ MemorySegment[] segments,
+ long length,
+ int chunkSizePower) {
+ if (segments.length == 1) {
+ return new SingleSegmentImpl(resourceDescription, segments[0], length, chunkSizePower, false);
+ } else {
+ return new MultiSegmentImpl(resourceDescription, segments, 0, length, chunkSizePower, false);
+ }
+ }
+
+ private MemorySegmentIndexInput(
+ String resourceDescription,
+ MemorySegment[] segments,
+ long length,
+ int chunkSizePower,
+ boolean isClone) {
+ super(resourceDescription);
+ this.segments = segments;
+ this.length = length;
+ this.chunkSizePower = chunkSizePower;
+ this.chunkSizeMask = (1L << chunkSizePower) - 1L;
+ this.isClone = isClone;
+ this.curSegment = segments[0];
+ }
+
+ void ensureOpen() {
+ if (curSegment == null) {
+ throw alreadyClosed();
+ }
+ }
+
+ RuntimeException wrapAlreadyClosedException(RuntimeException e) {
+ if (e instanceof NullPointerException) {
+ return alreadyClosed();
+ }
+ // TODO: maybe open a JDK issue to have a separate, more
+ // meaningful exception for unmapped segments:
+ if (e.getMessage() != null && e.getMessage().contains("closed")) {
+ return alreadyClosed();
+ }
+ return e;
+ }
+
+ // return value only for compiler!
+ byte handlePositionalIOOBE(String action, long pos) throws IOException {
Review comment:
The return also has same effect. Compiler understands that code is unreachable afterwards.
Please read my explanation as reply to Sokolov's question.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org
[GitHub] [lucene-solr] msokolov commented on pull request #2176: Initial rewrite of MMapDirectory for JDK-16 preview (incubating) PANAMA APIs
Posted by GitBox <gi...@apache.org>.
msokolov commented on pull request #2176:
URL: https://github.com/apache/lucene-solr/pull/2176#issuecomment-753491649
Very exciting! Thank you for leading the way here, Uwe.
> The most horrible thing I have seen is that our CFS file format starts the "inner" files totally unaligned. We should fix the CFSWriter to start new files always at multiples of 8 bytes. I will open an issue about this.
I ran into this just yesterday as I was playing around with aligning vectors in their index files to see if any perf bump could be gained (the header seems to be 50 bytes usually, so some padding is needed). And then when I asserted alignment, realized that CFS was messing with it.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org
[GitHub] [lucene-solr] uschindler commented on a change in pull request #2176: Initial rewrite of MMapDirectory for JDK-16 preview (incubating) PANAMA APIs
Posted by GitBox <gi...@apache.org>.
uschindler commented on a change in pull request #2176:
URL: https://github.com/apache/lucene-solr/pull/2176#discussion_r551193422
##########
File path: lucene/core/src/java/org/apache/lucene/store/MemorySegmentIndexInput.java
##########
@@ -0,0 +1,525 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.store;
+
+import java.io.EOFException;
+import java.io.IOException;
+import java.lang.invoke.VarHandle;
+import java.nio.ByteOrder;
+import java.util.Arrays;
+import java.util.Objects;
+
+import org.apache.lucene.util.IOUtils;
+
+import jdk.incubator.foreign.MemoryHandles;
+import jdk.incubator.foreign.MemorySegment;
+
+/**
+ * Base IndexInput implementation that uses an array of MemorySegments to represent a file.
+ *
+ * <p>For efficiency, this class requires that the segment size are a power-of-two
+ * (<code>chunkSizePower</code>).
+ */
+public abstract class MemorySegmentIndexInput extends IndexInput implements RandomAccessInput {
+ // We pass 1L as alignment, because currently Lucene file formats are heavy unaligned: :(
+ static final VarHandle VH_getByte = MemoryHandles.varHandle(byte.class, 1L, ByteOrder.BIG_ENDIAN).withInvokeExactBehavior();
+ static final VarHandle VH_getShort = MemoryHandles.varHandle(short.class, 1L, ByteOrder.BIG_ENDIAN).withInvokeExactBehavior();
+ static final VarHandle VH_getInt = MemoryHandles.varHandle(int.class, 1L, ByteOrder.BIG_ENDIAN).withInvokeExactBehavior();
+ static final VarHandle VH_getLong = MemoryHandles.varHandle(long.class, 1L, ByteOrder.BIG_ENDIAN).withInvokeExactBehavior();
+
+ final boolean isClone;
+ final long length;
+ final long chunkSizeMask;
+ final int chunkSizePower;
+ final MemorySegment[] segments;
+
+ int curSegmentIndex = -1;
+ MemorySegment curSegment; // redundant for speed: segments[curSegmentIndex], also marker if closed!
+ long curPosition; // relative to curSegment, not globally
+
+ public static MemorySegmentIndexInput newInstance(
+ String resourceDescription,
+ MemorySegment[] segments,
+ long length,
+ int chunkSizePower) {
+ if (segments.length == 1) {
+ return new SingleSegmentImpl(resourceDescription, segments[0], length, chunkSizePower, false);
+ } else {
+ return new MultiSegmentImpl(resourceDescription, segments, 0, length, chunkSizePower, false);
+ }
+ }
+
+ private MemorySegmentIndexInput(
+ String resourceDescription,
+ MemorySegment[] segments,
+ long length,
+ int chunkSizePower,
+ boolean isClone) {
+ super(resourceDescription);
+ this.segments = segments;
+ this.length = length;
+ this.chunkSizePower = chunkSizePower;
+ this.chunkSizeMask = (1L << chunkSizePower) - 1L;
+ this.isClone = isClone;
+ this.curSegment = segments[0];
+ }
+
+ void ensureOpen() {
+ if (curSegment == null) {
+ throw alreadyClosed();
+ }
+ }
+
+ RuntimeException wrapAlreadyClosedException(RuntimeException e) {
+ if (e instanceof NullPointerException) {
+ return alreadyClosed();
+ }
+ // TODO: maybe open a JDK issue to have a separate, more
+ // meaningful exception for unmapped segments:
+ if (e.getMessage() != null && e.getMessage().contains("closed")) {
+ return alreadyClosed();
+ }
+ return e;
+ }
+
+ // return value only for compiler!
+ byte handlePositionalIOOBE(String action, long pos) throws IOException {
Review comment:
Won't work here (the other code paths do this).
Here the problem is that it throws IOException or RuntimeRxception (IllegalArgumentEx extends RuntimeEx, EOFException extends checked IOException). There's no common Supertype. This is why option (2) was chosen here.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org
[GitHub] [lucene-solr] dweiss commented on pull request #2176: Initial rewrite of MMapDirectory for JDK-16 preview (incubating) PANAMA APIs
Posted by GitBox <gi...@apache.org>.
dweiss commented on pull request #2176:
URL: https://github.com/apache/lucene-solr/pull/2176#issuecomment-753673327
See this, Mike:
https://github.com/apache/lucene-solr/blob/master/help/jvms.txt
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org
[GitHub] [lucene-solr] msokolov commented on pull request #2176: Initial rewrite of MMapDirectory for JDK-16 preview (incubating) PANAMA APIs
Posted by GitBox <gi...@apache.org>.
msokolov commented on pull request #2176:
URL: https://github.com/apache/lucene-solr/pull/2176#issuecomment-754153605
I ran some vector search performance tests even though this isn't probably the most representative test, but since I have been running these for other reasons lately and I'm all geared up for it, I thought I would fire them up with these changes. I see a pronounced slowdown there, about 40% increase in latency for these KNN searches with this patch (I ran both conditions w/JDK16). Most of the workload from these searches are random-access reads of bytes from the index, conversion to float[], and computing dot products.
I did recently discover that we were (inadvertently) reading and writing the `float[]` vectors that these searches are based on as big-endian, so perhaps if (when) we switch that (I have a pending patch), it would work better in conjunction with this??, although currently on master, at the `IndexInput` level it is just reading bytes on the heap and only afterwards converting those to `float[]`, so it doesn't seem likely there would be any interaction there; I just mention it since it's top of mind, and since I plan to add `DataInput.readFloats()`, which would be needed here too.
Then I also ran luceneutil tests and see interesting mixed results:
```
TaskQPS baseline StdDevQPS candidate StdDev Pct diff p-value
BrowseMonthTaxoFacets 2.15 (7.7%) 1.19 (2.1%) -44.5% ( -50% - -37%) 0.000
BrowseDayOfYearTaxoFacets 2.03 (8.1%) 1.16 (2.4%) -43.0% ( -49% - -35%) 0.000
BrowseDateTaxoFacets 2.03 (8.2%) 1.16 (2.3%) -42.9% ( -49% - -35%) 0.000
Respell 52.33 (2.4%) 46.89 (1.7%) -10.4% ( -14% - -6%) 0.000
BrowseMonthSSDVFacets 12.00 (13.2%) 10.81 (6.9%) -10.0% ( -26% - 11%) 0.003
PKLookup 132.89 (2.3%) 120.09 (1.3%) -9.6% ( -13% - -6%) 0.000
AndHighHigh 24.98 (3.0%) 22.61 (2.1%) -9.5% ( -14% - -4%) 0.000
MedSloppyPhrase 15.66 (3.9%) 14.37 (2.0%) -8.2% ( -13% - -2%) 0.000
LowSpanNear 96.34 (2.3%) 88.72 (1.6%) -7.9% ( -11% - -4%) 0.000
HighSloppyPhrase 10.57 (4.6%) 9.75 (2.7%) -7.7% ( -14% - 0%) 0.000
Prefix3 37.23 (7.2%) 34.53 (6.1%) -7.2% ( -19% - 6%) 0.001
MedSpanNear 15.63 (1.5%) 14.52 (2.0%) -7.1% ( -10% - -3%) 0.000
LowSloppyPhrase 15.11 (4.4%) 14.06 (2.5%) -6.9% ( -13% - 0%) 0.000
AndHighMed 72.87 (3.4%) 67.88 (3.0%) -6.8% ( -12% - 0%) 0.000
AndHighLow 366.41 (3.2%) 342.99 (3.6%) -6.4% ( -12% - 0%) 0.000
OrHighMed 69.00 (2.6%) 65.09 (2.7%) -5.7% ( -10% - 0%) 0.000
LowTerm 930.90 (5.7%) 878.74 (4.3%) -5.6% ( -14% - 4%) 0.000
Fuzzy2 46.43 (7.2%) 43.94 (5.4%) -5.4% ( -16% - 7%) 0.008
Fuzzy1 64.31 (6.9%) 60.98 (6.5%) -5.2% ( -17% - 8%) 0.014
HighSpanNear 19.07 (1.7%) 18.12 (2.6%) -5.0% ( -9% - 0%) 0.000
MedPhrase 21.45 (2.4%) 20.53 (1.9%) -4.3% ( -8% - 0%) 0.000
HighTermTitleBDVSort 73.32 (16.1%) 70.33 (17.0%) -4.1% ( -32% - 34%) 0.437
OrHighHigh 16.82 (1.7%) 16.19 (1.9%) -3.7% ( -7% - 0%) 0.000
BrowseDayOfYearSSDVFacets 9.56 (8.7%) 9.24 (6.2%) -3.4% ( -16% - 12%) 0.159
HighIntervalsOrdered 9.02 (0.6%) 8.78 (0.7%) -2.7% ( -4% - -1%) 0.000
OrNotHighMed 324.93 (2.7%) 316.20 (3.6%) -2.7% ( -8% - 3%) 0.007
Wildcard 129.01 (2.5%) 126.91 (2.6%) -1.6% ( -6% - 3%) 0.046
HighTermDayOfYearSort 113.01 (11.9%) 111.68 (10.6%) -1.2% ( -21% - 24%) 0.741
OrNotHighLow 539.19 (2.8%) 534.10 (2.8%) -0.9% ( -6% - 4%) 0.284
OrHighLow 314.01 (5.4%) 311.23 (5.9%) -0.9% ( -11% - 11%) 0.620
HighTermMonthSort 96.00 (9.6%) 95.48 (10.1%) -0.5% ( -18% - 21%) 0.863
TermDTSort 100.42 (10.9%) 100.52 (13.4%) 0.1% ( -21% - 27%) 0.980
LowPhrase 259.52 (1.9%) 262.37 (2.2%) 1.1% ( -2% - 5%) 0.091
HighTerm 872.57 (3.9%) 892.07 (3.4%) 2.2% ( -4% - 9%) 0.052
MedTerm 1003.36 (4.8%) 1029.34 (4.8%) 2.6% ( -6% - 12%) 0.086
HighPhrase 220.30 (1.9%) 228.63 (3.3%) 3.8% ( -1% - 9%) 0.000
OrHighNotHigh 372.34 (3.4%) 391.32 (4.2%) 5.1% ( -2% - 13%) 0.000
OrNotHighHigh 389.61 (2.7%) 415.68 (5.5%) 6.7% ( -1% - 15%) 0.000
OrHighNotMed 409.35 (1.9%) 445.76 (6.9%) 8.9% ( 0% - 18%) 0.000
OrHighNotLow 513.64 (4.2%) 561.96 (7.9%) 9.4% ( -2% - 22%) 0.000
IntNRQ 203.68 (2.3%) 224.37 (4.7%) 10.2% ( 3% - 17%) 0.000
```
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org
[GitHub] [lucene-solr] uschindler commented on pull request #2176: Initial rewrite of MMapDirectory for JDK-16 preview (incubating) PANAMA APIs
Posted by GitBox <gi...@apache.org>.
uschindler commented on pull request #2176:
URL: https://github.com/apache/lucene-solr/pull/2176#issuecomment-754304561
This would also explain slowdowns on *short* posting-list iterations, as readLELongs() with smaller number of longs is not working well.
My plan would be to do an if statement at beginning of readBytes() or readLELongs() like:
```
if (length < SOME_LIMIT) {
for (int i = offset; i< offset + length; i++) {
bytes[i] = readByte();
}
} else {
... current code...
}
```
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org
[GitHub] [lucene-solr] uschindler commented on a change in pull request #2176: Initial rewrite of MMapDirectory for JDK-16 preview (incubating) PANAMA APIs
Posted by GitBox <gi...@apache.org>.
uschindler commented on a change in pull request #2176:
URL: https://github.com/apache/lucene-solr/pull/2176#discussion_r551195154
##########
File path: gradle/defaults-java.gradle
##########
@@ -60,3 +60,14 @@ allprojects {
}
}
}
+
+configure(project(":lucene:core")) {
Review comment:
The main reason for this PR is testing and feedback to JDK team.
Jenkins tests it and we can feedback. 4 issues already reported.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org
[GitHub] [lucene-solr] uschindler edited a comment on pull request #2176: Initial rewrite of MMapDirectory for JDK-16 preview (incubating) PANAMA APIs
Posted by GitBox <gi...@apache.org>.
uschindler edited a comment on pull request #2176:
URL: https://github.com/apache/lucene-solr/pull/2176#issuecomment-753503210
@dweiss I need your help here. There is one thing that drives me crazy: with the changes (actually adding `--add-modules jdk.incubator.foreign` to the test command line, JUnit 4 suddely fails for many "non-tests" with exceptions like this.
This seems to have to do with some automatism by Gradle or by JUnit once it detects the "Java module system", and I have no idea how to turn it off. What I am completely wondering about: why does it load those classes at all? I was trying to find the good old "Ant fileset" that has this `**/*Test.class,**/Test*.class` pattern, but this is no longer there with gradle. How to readd it, if JUnit/Gradle seems to think that it needs some (broken) module system API.
Unless this is fixed, I can't run all tests easily. I only ran them per module and ignored the tons of stack traces. PLEASE HELP!
```
org.apache.lucene.backward_codecs.lucene60.Lucene60PointsWriter > initializationError FAILED
java.lang.IllegalArgumentException: Test class can only have one constructor
at org.junit.runners.model.TestClass.<init>(TestClass.java:48)
at org.junit.runners.JUnit4.<init>(JUnit4.java:23)
at org.junit.internal.builders.JUnit4Builder.runnerForClass(JUnit4Builder.java:10)
at org.junit.runners.model.RunnerBuilder.safeRunnerForClass(RunnerBuilder.java:70)
at org.junit.internal.builders.AllDefaultPossibilitiesBuilder.runnerForClass(AllDefaultPossibilitiesBuilder.java:37)
at org.junit.runners.model.RunnerBuilder.safeRunnerForClass(RunnerBuilder.java:70)
at org.junit.internal.requests.ClassRequest.createRunner(ClassRequest.java:28)
at org.junit.internal.requests.MemoizingRequest.getRunner(MemoizingRequest.java:19)
at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:78)
at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:58)
at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:38)
at org.gradle.api.internal.tasks.testing.junit.AbstractJUnitTestClassProcessor.processTestClass(AbstractJUnitTestClassProcessor.java:62)
at org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:78)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:567)
at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:36)
at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
at org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:33)
at org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:94)
at jdk.proxy1/jdk.proxy1.$Proxy2.processTestClass(Unknown Source)
at org.gradle.api.internal.tasks.testing.worker.TestWorker.processTestClass(TestWorker.java:119)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:78)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:567)
at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:36)
at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
at org.gradle.internal.remote.internal.hub.MessageHubBackedObjectConnection$DispatchWrapper.dispatch(MessageHubBackedObjectConnection.java:182)
at org.gradle.internal.remote.internal.hub.MessageHubBackedObjectConnection$DispatchWrapper.dispatch(MessageHubBackedObjectConnection.java:164)
at org.gradle.internal.remote.internal.hub.MessageHub$Handler.run(MessageHub.java:414)
at org.gradle.internal.concurrent.ExecutorPolicy$CatchAndRecordFailures.onExecute(ExecutorPolicy.java:64)
at org.gradle.internal.concurrent.ManagedExecutorImpl$1.run(ManagedExecutorImpl.java:48)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630)
at org.gradle.internal.concurrent.ThreadFactoryImpl$ManagedThreadRunnable.run(ThreadFactoryImpl.java:56)
at java.base/java.lang.Thread.run(Thread.java:831)
```
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org
[GitHub] [lucene-solr] uschindler commented on a change in pull request #2176: Initial rewrite of MMapDirectory for JDK-16 preview (incubating) PANAMA APIs
Posted by GitBox <gi...@apache.org>.
uschindler commented on a change in pull request #2176:
URL: https://github.com/apache/lucene-solr/pull/2176#discussion_r551197902
##########
File path: lucene/core/src/java/org/apache/lucene/store/MemorySegmentIndexInput.java
##########
@@ -0,0 +1,525 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.store;
+
+import java.io.EOFException;
+import java.io.IOException;
+import java.lang.invoke.VarHandle;
+import java.nio.ByteOrder;
+import java.util.Arrays;
+import java.util.Objects;
+
+import org.apache.lucene.util.IOUtils;
+
+import jdk.incubator.foreign.MemoryHandles;
+import jdk.incubator.foreign.MemorySegment;
+
+/**
+ * Base IndexInput implementation that uses an array of MemorySegments to represent a file.
+ *
+ * <p>For efficiency, this class requires that the segment size are a power-of-two
+ * (<code>chunkSizePower</code>).
+ */
+public abstract class MemorySegmentIndexInput extends IndexInput implements RandomAccessInput {
+ // We pass 1L as alignment, because currently Lucene file formats are heavy unaligned: :(
+ static final VarHandle VH_getByte = MemoryHandles.varHandle(byte.class, 1L, ByteOrder.BIG_ENDIAN).withInvokeExactBehavior();
+ static final VarHandle VH_getShort = MemoryHandles.varHandle(short.class, 1L, ByteOrder.BIG_ENDIAN).withInvokeExactBehavior();
+ static final VarHandle VH_getInt = MemoryHandles.varHandle(int.class, 1L, ByteOrder.BIG_ENDIAN).withInvokeExactBehavior();
+ static final VarHandle VH_getLong = MemoryHandles.varHandle(long.class, 1L, ByteOrder.BIG_ENDIAN).withInvokeExactBehavior();
+
+ final boolean isClone;
+ final long length;
+ final long chunkSizeMask;
+ final int chunkSizePower;
+ final MemorySegment[] segments;
+
+ int curSegmentIndex = -1;
+ MemorySegment curSegment; // redundant for speed: segments[curSegmentIndex], also marker if closed!
+ long curPosition; // relative to curSegment, not globally
+
+ public static MemorySegmentIndexInput newInstance(
+ String resourceDescription,
+ MemorySegment[] segments,
+ long length,
+ int chunkSizePower) {
+ if (segments.length == 1) {
+ return new SingleSegmentImpl(resourceDescription, segments[0], length, chunkSizePower, false);
+ } else {
+ return new MultiSegmentImpl(resourceDescription, segments, 0, length, chunkSizePower, false);
+ }
+ }
+
+ private MemorySegmentIndexInput(
+ String resourceDescription,
+ MemorySegment[] segments,
+ long length,
+ int chunkSizePower,
+ boolean isClone) {
+ super(resourceDescription);
+ this.segments = segments;
+ this.length = length;
+ this.chunkSizePower = chunkSizePower;
+ this.chunkSizeMask = (1L << chunkSizePower) - 1L;
+ this.isClone = isClone;
+ this.curSegment = segments[0];
+ }
+
+ void ensureOpen() {
+ if (curSegment == null) {
+ throw alreadyClosed();
+ }
+ }
+
+ RuntimeException wrapAlreadyClosedException(RuntimeException e) {
+ if (e instanceof NullPointerException) {
+ return alreadyClosed();
+ }
+ // TODO: maybe open a JDK issue to have a separate, more
+ // meaningful exception for unmapped segments:
+ if (e.getMessage() != null && e.getMessage().contains("closed")) {
+ return alreadyClosed();
+ }
+ return e;
+ }
+
+ // return value only for compiler!
+ byte handlePositionalIOOBE(String action, long pos) throws IOException {
Review comment:
Ah now I understand... You declare it to return RuntimeException, but never return it. Instead throw exception directly, return type is just a fake.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org
[GitHub] [lucene-solr] msokolov commented on pull request #2176: Initial rewrite of MMapDirectory for JDK-16 preview (incubating) PANAMA APIs
Posted by GitBox <gi...@apache.org>.
msokolov commented on pull request #2176:
URL: https://github.com/apache/lucene-solr/pull/2176#issuecomment-754627642
Uwe, I didn't think that IndexInput would expose its internal ByteBuffers easily? But you are right about the double-copying - that is why I opened https://issues.apache.org/jira/browse/LUCENE-9652. BTW I did test that change with and without this one and saw a similar slowdown there. Possibly a loop with a single float at a time would be better? It's highly counterintutitive to me, but when I get a moment, I will try it.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org
[GitHub] [lucene-solr] uschindler edited a comment on pull request #2176: Initial rewrite of MMapDirectory for JDK-16 preview (incubating) PANAMA APIs
Posted by GitBox <gi...@apache.org>.
uschindler edited a comment on pull request #2176:
URL: https://github.com/apache/lucene-solr/pull/2176#issuecomment-753492065
> Very exciting! Thank you for leading the way here, Uwe.
>
>> The most horrible thing I have seen is that our CFS file format starts the "inner" files totally unaligned. We should fix the CFSWriter to start new files always at multiples of 8 bytes. I will open an issue about this.
>
> I ran into this just yesterday as I was playing around with aligning vectors in their index files to see if any perf bump could be gained (the header seems to be 50 bytes usually, so some padding is needed). And then when I asserted alignment, realized that CFS was messing with it.
In fact for CFS file we don't even need to change the file format / version number. We just have to make sure that CFS writer starts new inner files aligned to 8 bytes. That should be easy to implement.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org
[GitHub] [lucene-solr] uschindler commented on pull request #2176: Initial rewrite of MMapDirectory for JDK-16 preview (incubating) PANAMA APIs
Posted by GitBox <gi...@apache.org>.
uschindler commented on pull request #2176:
URL: https://github.com/apache/lucene-solr/pull/2176#issuecomment-753650342
After some cleanup, I also added a workaround for https://bugs.openjdk.java.net/browse/JDK-8259028: In MMapDirectory we try to unwrap the path using reflection. This is for sure a hackydihickhack workaround, but works until this is fixed -- and now all tests pass for me! :-)
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org
[GitHub] [lucene-solr] uschindler commented on a change in pull request #2176: Initial rewrite of MMapDirectory for JDK-16 preview (incubating) PANAMA APIs
Posted by GitBox <gi...@apache.org>.
uschindler commented on a change in pull request #2176:
URL: https://github.com/apache/lucene-solr/pull/2176#discussion_r551061081
##########
File path: lucene/core/src/java/org/apache/lucene/store/MemorySegmentIndexInput.java
##########
@@ -0,0 +1,525 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.store;
+
+import java.io.EOFException;
+import java.io.IOException;
+import java.lang.invoke.VarHandle;
+import java.nio.ByteOrder;
+import java.util.Arrays;
+import java.util.Objects;
+
+import org.apache.lucene.util.IOUtils;
+
+import jdk.incubator.foreign.MemoryHandles;
+import jdk.incubator.foreign.MemorySegment;
+
+/**
+ * Base IndexInput implementation that uses an array of MemorySegments to represent a file.
+ *
+ * <p>For efficiency, this class requires that the segment size are a power-of-two
+ * (<code>chunkSizePower</code>).
+ */
+public abstract class MemorySegmentIndexInput extends IndexInput implements RandomAccessInput {
+ // We pass 1L as alignment, because currently Lucene file formats are heavy unaligned: :(
+ static final VarHandle VH_getByte = MemoryHandles.varHandle(byte.class, 1L, ByteOrder.BIG_ENDIAN).withInvokeExactBehavior();
+ static final VarHandle VH_getShort = MemoryHandles.varHandle(short.class, 1L, ByteOrder.BIG_ENDIAN).withInvokeExactBehavior();
+ static final VarHandle VH_getInt = MemoryHandles.varHandle(int.class, 1L, ByteOrder.BIG_ENDIAN).withInvokeExactBehavior();
+ static final VarHandle VH_getLong = MemoryHandles.varHandle(long.class, 1L, ByteOrder.BIG_ENDIAN).withInvokeExactBehavior();
+
+ final boolean isClone;
+ final long length;
+ final long chunkSizeMask;
+ final int chunkSizePower;
+ final MemorySegment[] segments;
+
+ int curSegmentIndex = -1;
+ MemorySegment curSegment; // redundant for speed: segments[curSegmentIndex], also marker if closed!
+ long curPosition; // relative to curSegment, not globally
+
+ public static MemorySegmentIndexInput newInstance(
+ String resourceDescription,
+ MemorySegment[] segments,
+ long length,
+ int chunkSizePower) {
+ if (segments.length == 1) {
+ return new SingleSegmentImpl(resourceDescription, segments[0], length, chunkSizePower, false);
+ } else {
+ return new MultiSegmentImpl(resourceDescription, segments, 0, length, chunkSizePower, false);
+ }
+ }
+
+ private MemorySegmentIndexInput(
+ String resourceDescription,
+ MemorySegment[] segments,
+ long length,
+ int chunkSizePower,
+ boolean isClone) {
+ super(resourceDescription);
+ this.segments = segments;
+ this.length = length;
+ this.chunkSizePower = chunkSizePower;
+ this.chunkSizeMask = (1L << chunkSizePower) - 1L;
+ this.isClone = isClone;
+ this.curSegment = segments[0];
+ }
+
+ void ensureOpen() {
+ if (curSegment == null) {
+ throw alreadyClosed();
+ }
+ }
+
+ RuntimeException wrapAlreadyClosedException(RuntimeException e) {
+ if (e instanceof NullPointerException) {
+ return alreadyClosed();
+ }
+ // TODO: maybe open a JDK issue to have a separate, more
+ // meaningful exception for unmapped segments:
+ if (e.getMessage() != null && e.getMessage().contains("closed")) {
+ return alreadyClosed();
+ }
+ return e;
+ }
+
+ // return value only for compiler!
Review comment:
no, we need some return value so we can call it and return a value. But as it is throwing an exception anyways, the return value does not matter.
These tricks are often used for exception handlers to delegate "exception handling" to a method where you know it will never return and just bubble up the exception:
- if the exception is known before (by type), you use `throw callSomeMethod(param)`. The pro thing: it's clean as you delegate the instantiation of the exception to the factory, but throw it on your own. The problem here is: only works with a known exception type - and a single one, because the method needs it as return type. A generic `Exception` is a desaster, as it requires to declare that on throws!
- an alternative is (and that's used here): put the throw clause also into the method, so the only sense of our method is throwing some exception, it never returns cleanly. Backside: the method needs a return type, so you can do: `return callSomeMethod()` from the caller (to make the compiler understand that we exit the method here - and always). Of course you can add a separate return statement after the method call, but that looks more bullshitty.
Here we use the second approch. The most common return type to all callers of this method is `byte`. Compiler is happy, voila!
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org
[GitHub] [lucene-solr] uschindler edited a comment on pull request #2176: Initial rewrite of MMapDirectory for JDK-16 preview (incubating) PANAMA APIs
Posted by GitBox <gi...@apache.org>.
uschindler edited a comment on pull request #2176:
URL: https://github.com/apache/lucene-solr/pull/2176#issuecomment-753673096
Hi read the instructions above, at "be aware". Gradle is not compatible to JDK 16 at all. So Gradle needs to run with JDK 11. You can pass the JDK 16 directory using an environment variable or through sysprop. Gradlew shows all options when you run it without any options and read instructions about alternative jvms.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org
[GitHub] [lucene-solr] msokolov commented on pull request #2176: Initial rewrite of MMapDirectory for JDK-16 preview (incubating) PANAMA APIs
Posted by GitBox <gi...@apache.org>.
msokolov commented on pull request #2176:
URL: https://github.com/apache/lucene-solr/pull/2176#issuecomment-753672225
Hi Uwe - just trying to get this patch working here; when I try to compile (`gradlew lucene:core:jar`) w/JDK 11 I get this: `Could not target platform: 'Java SE 16' using tool chain: 'JDK 11 (11)'`. So I tried w/JDK16 (hmm might not be the right version?) and get `java.lang.IllegalAccessError: class org.gradle.internal.compiler.java.ClassNameCollector (in unnamed module @0x772989a6) cannot access class com.sun.tools.javac.code.Symbol$TypeSymbol (in module jdk.compiler) because module jdk.compiler does not export com.sun.tools.javac.code to unnamed module @0x772989a6`. Yeah I have 16-ea+30-2130 - i think that was the version you specified?
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org
[GitHub] [lucene-solr] msokolov commented on a change in pull request #2176: Initial rewrite of MMapDirectory for JDK-16 preview (incubating) PANAMA APIs
Posted by GitBox <gi...@apache.org>.
msokolov commented on a change in pull request #2176:
URL: https://github.com/apache/lucene-solr/pull/2176#discussion_r551059214
##########
File path: lucene/core/src/java/org/apache/lucene/store/MemorySegmentIndexInput.java
##########
@@ -0,0 +1,525 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.store;
+
+import java.io.EOFException;
+import java.io.IOException;
+import java.lang.invoke.VarHandle;
+import java.nio.ByteOrder;
+import java.util.Arrays;
+import java.util.Objects;
+
+import org.apache.lucene.util.IOUtils;
+
+import jdk.incubator.foreign.MemoryHandles;
+import jdk.incubator.foreign.MemorySegment;
+
+/**
+ * Base IndexInput implementation that uses an array of MemorySegments to represent a file.
+ *
+ * <p>For efficiency, this class requires that the segment size are a power-of-two
+ * (<code>chunkSizePower</code>).
+ */
+public abstract class MemorySegmentIndexInput extends IndexInput implements RandomAccessInput {
+ // We pass 1L as alignment, because currently Lucene file formats are heavy unaligned: :(
+ static final VarHandle VH_getByte = MemoryHandles.varHandle(byte.class, 1L, ByteOrder.BIG_ENDIAN).withInvokeExactBehavior();
+ static final VarHandle VH_getShort = MemoryHandles.varHandle(short.class, 1L, ByteOrder.BIG_ENDIAN).withInvokeExactBehavior();
+ static final VarHandle VH_getInt = MemoryHandles.varHandle(int.class, 1L, ByteOrder.BIG_ENDIAN).withInvokeExactBehavior();
+ static final VarHandle VH_getLong = MemoryHandles.varHandle(long.class, 1L, ByteOrder.BIG_ENDIAN).withInvokeExactBehavior();
+
+ final boolean isClone;
+ final long length;
+ final long chunkSizeMask;
+ final int chunkSizePower;
+ final MemorySegment[] segments;
+
+ int curSegmentIndex = -1;
+ MemorySegment curSegment; // redundant for speed: segments[curSegmentIndex], also marker if closed!
+ long curPosition; // relative to curSegment, not globally
+
+ public static MemorySegmentIndexInput newInstance(
+ String resourceDescription,
+ MemorySegment[] segments,
+ long length,
+ int chunkSizePower) {
+ if (segments.length == 1) {
+ return new SingleSegmentImpl(resourceDescription, segments[0], length, chunkSizePower, false);
+ } else {
+ return new MultiSegmentImpl(resourceDescription, segments, 0, length, chunkSizePower, false);
+ }
+ }
+
+ private MemorySegmentIndexInput(
+ String resourceDescription,
+ MemorySegment[] segments,
+ long length,
+ int chunkSizePower,
+ boolean isClone) {
+ super(resourceDescription);
+ this.segments = segments;
+ this.length = length;
+ this.chunkSizePower = chunkSizePower;
+ this.chunkSizeMask = (1L << chunkSizePower) - 1L;
+ this.isClone = isClone;
+ this.curSegment = segments[0];
+ }
+
+ void ensureOpen() {
+ if (curSegment == null) {
+ throw alreadyClosed();
+ }
+ }
+
+ RuntimeException wrapAlreadyClosedException(RuntimeException e) {
+ if (e instanceof NullPointerException) {
+ return alreadyClosed();
+ }
+ // TODO: maybe open a JDK issue to have a separate, more
+ // meaningful exception for unmapped segments:
+ if (e.getMessage() != null && e.getMessage().contains("closed")) {
+ return alreadyClosed();
+ }
+ return e;
+ }
+
+ // return value only for compiler!
Review comment:
Is that a JDK16 thing? It thinks this is a no-op?
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org
[GitHub] [lucene-solr] uschindler edited a comment on pull request #2176: Initial rewrite of MMapDirectory for JDK-16 preview (incubating) PANAMA APIs
Posted by GitBox <gi...@apache.org>.
uschindler edited a comment on pull request #2176:
URL: https://github.com/apache/lucene-solr/pull/2176#issuecomment-753512086
FYI, after I was able to run tests in a more convenient way, I figured out that some tests fail from time to time because of using `HandleTrackingFS`. I disabled all custom filesystems in the PR, but it looks like some tests use `HandleTrackingFS`.
I forgot to mention this in my description above: MMapDirectory no longer works with custom `java.nio.filesystem.FileSystem`, as it tries to cast the `FileChannel` returned by our custom filesystem to the JDK-internal class (see JDK issue: https://bugs.openjdk.java.net/browse/JDK-8259028). For now we have to disable all custom file systems in our test suite until this bug is fixed.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org
[GitHub] [lucene-solr] dweiss commented on a change in pull request #2176: Initial rewrite of MMapDirectory for JDK-16 preview (incubating) PANAMA APIs
Posted by GitBox <gi...@apache.org>.
dweiss commented on a change in pull request #2176:
URL: https://github.com/apache/lucene-solr/pull/2176#discussion_r551198238
##########
File path: lucene/core/src/java/org/apache/lucene/store/MemorySegmentIndexInput.java
##########
@@ -0,0 +1,525 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.store;
+
+import java.io.EOFException;
+import java.io.IOException;
+import java.lang.invoke.VarHandle;
+import java.nio.ByteOrder;
+import java.util.Arrays;
+import java.util.Objects;
+
+import org.apache.lucene.util.IOUtils;
+
+import jdk.incubator.foreign.MemoryHandles;
+import jdk.incubator.foreign.MemorySegment;
+
+/**
+ * Base IndexInput implementation that uses an array of MemorySegments to represent a file.
+ *
+ * <p>For efficiency, this class requires that the segment size are a power-of-two
+ * (<code>chunkSizePower</code>).
+ */
+public abstract class MemorySegmentIndexInput extends IndexInput implements RandomAccessInput {
+ // We pass 1L as alignment, because currently Lucene file formats are heavy unaligned: :(
+ static final VarHandle VH_getByte = MemoryHandles.varHandle(byte.class, 1L, ByteOrder.BIG_ENDIAN).withInvokeExactBehavior();
+ static final VarHandle VH_getShort = MemoryHandles.varHandle(short.class, 1L, ByteOrder.BIG_ENDIAN).withInvokeExactBehavior();
+ static final VarHandle VH_getInt = MemoryHandles.varHandle(int.class, 1L, ByteOrder.BIG_ENDIAN).withInvokeExactBehavior();
+ static final VarHandle VH_getLong = MemoryHandles.varHandle(long.class, 1L, ByteOrder.BIG_ENDIAN).withInvokeExactBehavior();
+
+ final boolean isClone;
+ final long length;
+ final long chunkSizeMask;
+ final int chunkSizePower;
+ final MemorySegment[] segments;
+
+ int curSegmentIndex = -1;
+ MemorySegment curSegment; // redundant for speed: segments[curSegmentIndex], also marker if closed!
+ long curPosition; // relative to curSegment, not globally
+
+ public static MemorySegmentIndexInput newInstance(
+ String resourceDescription,
+ MemorySegment[] segments,
+ long length,
+ int chunkSizePower) {
+ if (segments.length == 1) {
+ return new SingleSegmentImpl(resourceDescription, segments[0], length, chunkSizePower, false);
+ } else {
+ return new MultiSegmentImpl(resourceDescription, segments, 0, length, chunkSizePower, false);
+ }
+ }
+
+ private MemorySegmentIndexInput(
+ String resourceDescription,
+ MemorySegment[] segments,
+ long length,
+ int chunkSizePower,
+ boolean isClone) {
+ super(resourceDescription);
+ this.segments = segments;
+ this.length = length;
+ this.chunkSizePower = chunkSizePower;
+ this.chunkSizeMask = (1L << chunkSizePower) - 1L;
+ this.isClone = isClone;
+ this.curSegment = segments[0];
+ }
+
+ void ensureOpen() {
+ if (curSegment == null) {
+ throw alreadyClosed();
+ }
+ }
+
+ RuntimeException wrapAlreadyClosedException(RuntimeException e) {
+ if (e instanceof NullPointerException) {
+ return alreadyClosed();
+ }
+ // TODO: maybe open a JDK issue to have a separate, more
+ // meaningful exception for unmapped segments:
+ if (e.getMessage() != null && e.getMessage().contains("closed")) {
+ return alreadyClosed();
+ }
+ return e;
+ }
+
+ // return value only for compiler!
+ byte handlePositionalIOOBE(String action, long pos) throws IOException {
Review comment:
Precisely.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org
[GitHub] [lucene-solr] dweiss commented on a change in pull request #2176: Initial rewrite of MMapDirectory for JDK-16 preview (incubating) PANAMA APIs
Posted by GitBox <gi...@apache.org>.
dweiss commented on a change in pull request #2176:
URL: https://github.com/apache/lucene-solr/pull/2176#discussion_r551196614
##########
File path: lucene/core/src/java/org/apache/lucene/store/MemorySegmentIndexInput.java
##########
@@ -0,0 +1,525 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.store;
+
+import java.io.EOFException;
+import java.io.IOException;
+import java.lang.invoke.VarHandle;
+import java.nio.ByteOrder;
+import java.util.Arrays;
+import java.util.Objects;
+
+import org.apache.lucene.util.IOUtils;
+
+import jdk.incubator.foreign.MemoryHandles;
+import jdk.incubator.foreign.MemorySegment;
+
+/**
+ * Base IndexInput implementation that uses an array of MemorySegments to represent a file.
+ *
+ * <p>For efficiency, this class requires that the segment size are a power-of-two
+ * (<code>chunkSizePower</code>).
+ */
+public abstract class MemorySegmentIndexInput extends IndexInput implements RandomAccessInput {
+ // We pass 1L as alignment, because currently Lucene file formats are heavy unaligned: :(
+ static final VarHandle VH_getByte = MemoryHandles.varHandle(byte.class, 1L, ByteOrder.BIG_ENDIAN).withInvokeExactBehavior();
+ static final VarHandle VH_getShort = MemoryHandles.varHandle(short.class, 1L, ByteOrder.BIG_ENDIAN).withInvokeExactBehavior();
+ static final VarHandle VH_getInt = MemoryHandles.varHandle(int.class, 1L, ByteOrder.BIG_ENDIAN).withInvokeExactBehavior();
+ static final VarHandle VH_getLong = MemoryHandles.varHandle(long.class, 1L, ByteOrder.BIG_ENDIAN).withInvokeExactBehavior();
+
+ final boolean isClone;
+ final long length;
+ final long chunkSizeMask;
+ final int chunkSizePower;
+ final MemorySegment[] segments;
+
+ int curSegmentIndex = -1;
+ MemorySegment curSegment; // redundant for speed: segments[curSegmentIndex], also marker if closed!
+ long curPosition; // relative to curSegment, not globally
+
+ public static MemorySegmentIndexInput newInstance(
+ String resourceDescription,
+ MemorySegment[] segments,
+ long length,
+ int chunkSizePower) {
+ if (segments.length == 1) {
+ return new SingleSegmentImpl(resourceDescription, segments[0], length, chunkSizePower, false);
+ } else {
+ return new MultiSegmentImpl(resourceDescription, segments, 0, length, chunkSizePower, false);
+ }
+ }
+
+ private MemorySegmentIndexInput(
+ String resourceDescription,
+ MemorySegment[] segments,
+ long length,
+ int chunkSizePower,
+ boolean isClone) {
+ super(resourceDescription);
+ this.segments = segments;
+ this.length = length;
+ this.chunkSizePower = chunkSizePower;
+ this.chunkSizeMask = (1L << chunkSizePower) - 1L;
+ this.isClone = isClone;
+ this.curSegment = segments[0];
+ }
+
+ void ensureOpen() {
+ if (curSegment == null) {
+ throw alreadyClosed();
+ }
+ }
+
+ RuntimeException wrapAlreadyClosedException(RuntimeException e) {
+ if (e instanceof NullPointerException) {
+ return alreadyClosed();
+ }
+ // TODO: maybe open a JDK issue to have a separate, more
+ // meaningful exception for unmapped segments:
+ if (e.getMessage() != null && e.getMessage().contains("closed")) {
+ return alreadyClosed();
+ }
+ return e;
+ }
+
+ // return value only for compiler!
+ byte handlePositionalIOOBE(String action, long pos) throws IOException {
Review comment:
I have. But I think the throws conveys it better :); the difference exists at call sites like this one:
```
} catch (IndexOutOfBoundsException e) {
handlePositionalIOOBE("seek", pos);
}
```
The compiler would allow you to add code after the catch (or after handlePositionalOOBE). Whereas this would have the desired effect (also showing up-front that the method never returns normally):
```
} catch (IndexOutOfBoundsException e) {
throw handlePositionalIOOBE("seek", pos);
}
```
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org
[GitHub] [lucene-solr] uschindler commented on pull request #2176: Initial rewrite of MMapDirectory for JDK-16 preview (incubating) PANAMA APIs
Posted by GitBox <gi...@apache.org>.
uschindler commented on pull request #2176:
URL: https://github.com/apache/lucene-solr/pull/2176#issuecomment-755789343
FYI, I added some interface in oal.util.Unwrapable that can be implemented by our mocking layers in test-framework. The new interface allows us to unwrap the Path without knowing about test-framework internals. IMHO, we should use this also for other mocking layers, so we can easy unwrap them with some consistent API.
The remaining 2 issues in MappedByteBuffer.mapFile() were merged today and as soon as those are part of another preview build of JDK 16, we can remove the mapFileBugfix() method.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org
[GitHub] [lucene-solr] uschindler commented on a change in pull request #2176: Initial rewrite of MMapDirectory for JDK-16 preview (incubating) PANAMA APIs
Posted by GitBox <gi...@apache.org>.
uschindler commented on a change in pull request #2176:
URL: https://github.com/apache/lucene-solr/pull/2176#discussion_r551208675
##########
File path: gradle/defaults-java.gradle
##########
@@ -60,3 +60,14 @@ allprojects {
}
}
}
+
+configure(project(":lucene:core")) {
Review comment:
To come back to the MR-JAR: Yes we may need it, but it can be solved without, too.
The IndexInputs are completely separate (so they can life side-by side). The changes to MMapDircetory are mainly a change from int to long for the chunk size (this is why I also not yet removed the methods to query unmapping status, thy would always return true for JDK>xxxx.
When we would have this productive, MMapDirectory would just return one of the IndexInputs that fits your JDK. We may only need to move the code that maps a file to ByteBuffer or MemorySegment to a separate class or maybe the ByteBuffer-/MemorySegmentIndexInput class. Then we have 2 classes staying next to each other, one for JDK 11 and one for JDK 16/17/18+, MMapDirectory just chooses the correct one. Then we won't need a MR JAR, only some tricks to compile a few classes with later `--release`.
With compilation we can use `--release` and of course a newer Gradle version. Like Elasticsearch, people who want to compile Lucene/Solr would need JDK 17, but the code is working for JDK 11+. Elasticsearch does exactly the same: It is tested and compiled for Java 8, but to build it you need JDK 15.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org
[GitHub] [lucene-solr] msokolov commented on pull request #2176: Initial rewrite of MMapDirectory for JDK-16 preview (incubating) PANAMA APIs
Posted by GitBox <gi...@apache.org>.
msokolov commented on pull request #2176:
URL: https://github.com/apache/lucene-solr/pull/2176#issuecomment-754300727
> Just to clarify: In your code, you are reading the floats with a series of readInt() from the indexinput, converting them to float?
No, the floats are read using `readBytes()` into a `ByteBuffer` and then converted using a derived `FloatBuffer`. So I think your intuition about `readBytes()` is correct?
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org
[GitHub] [lucene-solr] dweiss commented on a change in pull request #2176: Initial rewrite of MMapDirectory for JDK-16 preview (incubating) PANAMA APIs
Posted by GitBox <gi...@apache.org>.
dweiss commented on a change in pull request #2176:
URL: https://github.com/apache/lucene-solr/pull/2176#discussion_r551196614
##########
File path: lucene/core/src/java/org/apache/lucene/store/MemorySegmentIndexInput.java
##########
@@ -0,0 +1,525 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.store;
+
+import java.io.EOFException;
+import java.io.IOException;
+import java.lang.invoke.VarHandle;
+import java.nio.ByteOrder;
+import java.util.Arrays;
+import java.util.Objects;
+
+import org.apache.lucene.util.IOUtils;
+
+import jdk.incubator.foreign.MemoryHandles;
+import jdk.incubator.foreign.MemorySegment;
+
+/**
+ * Base IndexInput implementation that uses an array of MemorySegments to represent a file.
+ *
+ * <p>For efficiency, this class requires that the segment size are a power-of-two
+ * (<code>chunkSizePower</code>).
+ */
+public abstract class MemorySegmentIndexInput extends IndexInput implements RandomAccessInput {
+ // We pass 1L as alignment, because currently Lucene file formats are heavy unaligned: :(
+ static final VarHandle VH_getByte = MemoryHandles.varHandle(byte.class, 1L, ByteOrder.BIG_ENDIAN).withInvokeExactBehavior();
+ static final VarHandle VH_getShort = MemoryHandles.varHandle(short.class, 1L, ByteOrder.BIG_ENDIAN).withInvokeExactBehavior();
+ static final VarHandle VH_getInt = MemoryHandles.varHandle(int.class, 1L, ByteOrder.BIG_ENDIAN).withInvokeExactBehavior();
+ static final VarHandle VH_getLong = MemoryHandles.varHandle(long.class, 1L, ByteOrder.BIG_ENDIAN).withInvokeExactBehavior();
+
+ final boolean isClone;
+ final long length;
+ final long chunkSizeMask;
+ final int chunkSizePower;
+ final MemorySegment[] segments;
+
+ int curSegmentIndex = -1;
+ MemorySegment curSegment; // redundant for speed: segments[curSegmentIndex], also marker if closed!
+ long curPosition; // relative to curSegment, not globally
+
+ public static MemorySegmentIndexInput newInstance(
+ String resourceDescription,
+ MemorySegment[] segments,
+ long length,
+ int chunkSizePower) {
+ if (segments.length == 1) {
+ return new SingleSegmentImpl(resourceDescription, segments[0], length, chunkSizePower, false);
+ } else {
+ return new MultiSegmentImpl(resourceDescription, segments, 0, length, chunkSizePower, false);
+ }
+ }
+
+ private MemorySegmentIndexInput(
+ String resourceDescription,
+ MemorySegment[] segments,
+ long length,
+ int chunkSizePower,
+ boolean isClone) {
+ super(resourceDescription);
+ this.segments = segments;
+ this.length = length;
+ this.chunkSizePower = chunkSizePower;
+ this.chunkSizeMask = (1L << chunkSizePower) - 1L;
+ this.isClone = isClone;
+ this.curSegment = segments[0];
+ }
+
+ void ensureOpen() {
+ if (curSegment == null) {
+ throw alreadyClosed();
+ }
+ }
+
+ RuntimeException wrapAlreadyClosedException(RuntimeException e) {
+ if (e instanceof NullPointerException) {
+ return alreadyClosed();
+ }
+ // TODO: maybe open a JDK issue to have a separate, more
+ // meaningful exception for unmapped segments:
+ if (e.getMessage() != null && e.getMessage().contains("closed")) {
+ return alreadyClosed();
+ }
+ return e;
+ }
+
+ // return value only for compiler!
+ byte handlePositionalIOOBE(String action, long pos) throws IOException {
Review comment:
I have. But I think the throws conveys it better :); the difference exists at call sites like this one:
{code}
} catch (IndexOutOfBoundsException e) {
handlePositionalIOOBE("seek", pos);
}
{code}
The compiler would allow you to add code after the catch (or after handlePositionalOOBE). Whereas this would have the desired effect (also showing up-front that the method never returns normally):
{code}
} catch (IndexOutOfBoundsException e) {
throw handlePositionalIOOBE("seek", pos);
}
{code}
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org
[GitHub] [lucene-solr] uschindler commented on a change in pull request #2176: Initial rewrite of MMapDirectory for JDK-16 preview (incubating) PANAMA APIs
Posted by GitBox <gi...@apache.org>.
uschindler commented on a change in pull request #2176:
URL: https://github.com/apache/lucene-solr/pull/2176#discussion_r551061081
##########
File path: lucene/core/src/java/org/apache/lucene/store/MemorySegmentIndexInput.java
##########
@@ -0,0 +1,525 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.store;
+
+import java.io.EOFException;
+import java.io.IOException;
+import java.lang.invoke.VarHandle;
+import java.nio.ByteOrder;
+import java.util.Arrays;
+import java.util.Objects;
+
+import org.apache.lucene.util.IOUtils;
+
+import jdk.incubator.foreign.MemoryHandles;
+import jdk.incubator.foreign.MemorySegment;
+
+/**
+ * Base IndexInput implementation that uses an array of MemorySegments to represent a file.
+ *
+ * <p>For efficiency, this class requires that the segment size are a power-of-two
+ * (<code>chunkSizePower</code>).
+ */
+public abstract class MemorySegmentIndexInput extends IndexInput implements RandomAccessInput {
+ // We pass 1L as alignment, because currently Lucene file formats are heavy unaligned: :(
+ static final VarHandle VH_getByte = MemoryHandles.varHandle(byte.class, 1L, ByteOrder.BIG_ENDIAN).withInvokeExactBehavior();
+ static final VarHandle VH_getShort = MemoryHandles.varHandle(short.class, 1L, ByteOrder.BIG_ENDIAN).withInvokeExactBehavior();
+ static final VarHandle VH_getInt = MemoryHandles.varHandle(int.class, 1L, ByteOrder.BIG_ENDIAN).withInvokeExactBehavior();
+ static final VarHandle VH_getLong = MemoryHandles.varHandle(long.class, 1L, ByteOrder.BIG_ENDIAN).withInvokeExactBehavior();
+
+ final boolean isClone;
+ final long length;
+ final long chunkSizeMask;
+ final int chunkSizePower;
+ final MemorySegment[] segments;
+
+ int curSegmentIndex = -1;
+ MemorySegment curSegment; // redundant for speed: segments[curSegmentIndex], also marker if closed!
+ long curPosition; // relative to curSegment, not globally
+
+ public static MemorySegmentIndexInput newInstance(
+ String resourceDescription,
+ MemorySegment[] segments,
+ long length,
+ int chunkSizePower) {
+ if (segments.length == 1) {
+ return new SingleSegmentImpl(resourceDescription, segments[0], length, chunkSizePower, false);
+ } else {
+ return new MultiSegmentImpl(resourceDescription, segments, 0, length, chunkSizePower, false);
+ }
+ }
+
+ private MemorySegmentIndexInput(
+ String resourceDescription,
+ MemorySegment[] segments,
+ long length,
+ int chunkSizePower,
+ boolean isClone) {
+ super(resourceDescription);
+ this.segments = segments;
+ this.length = length;
+ this.chunkSizePower = chunkSizePower;
+ this.chunkSizeMask = (1L << chunkSizePower) - 1L;
+ this.isClone = isClone;
+ this.curSegment = segments[0];
+ }
+
+ void ensureOpen() {
+ if (curSegment == null) {
+ throw alreadyClosed();
+ }
+ }
+
+ RuntimeException wrapAlreadyClosedException(RuntimeException e) {
+ if (e instanceof NullPointerException) {
+ return alreadyClosed();
+ }
+ // TODO: maybe open a JDK issue to have a separate, more
+ // meaningful exception for unmapped segments:
+ if (e.getMessage() != null && e.getMessage().contains("closed")) {
+ return alreadyClosed();
+ }
+ return e;
+ }
+
+ // return value only for compiler!
Review comment:
no, we need some return value so we can call it and return a value. But as it is throwing an exception anyways, the return value does not matter.
These tricks are often used for exception handlers to delegate "exception handling" to a method where you know it will never return and just bubble up the exception:
- if the exception is known before (by type), you use `throw callSomeMethod(param)`. The pro thing: it's clean as you delegate the instantiation of the exception to the factory, but throw it on your own. The problem here is: only works with a known exception type - and a single one, because the method needs it as return type. A generic `Exception` is a desaster, as it reuires to declare that on throws!
- an alternative is (and that's used here): put the throw clause also into the method, so the only sense of our method is throwing some exception, it never returns cleanly. Backside: the method needs a return type, so you can do: `return callSomeMethod()` from the caller. Of course you can add a separate return statement after the method call, but that looks more bullshitty.
Here we use the second approch. The most common return type to all callers of this method is `byte`. Compiler is happy, voila!
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org
[GitHub] [lucene-solr] uschindler edited a comment on pull request #2176: Initial rewrite of MMapDirectory for JDK-16 preview (incubating) PANAMA APIs
Posted by GitBox <gi...@apache.org>.
uschindler edited a comment on pull request #2176:
URL: https://github.com/apache/lucene-solr/pull/2176#issuecomment-754299385
Hi @msokolov,
thanks for the speed test - I wanted to wait for the specialists to do it. As said before, we should report slowness (or improvements) to the Hotpsot/OpenJDK people. I was expecting a result like this due to the early stage of the development and misisng Hotspot optimizations. Just to clarify: In your code, you are reading the floats with a series of readInt() from the indexinput, converting them to float?
If that's the case, I would say: this is expected! From what I figured out on the mailing lists: loops using `long` as an index vs. `int` as an index are not supported well by Hotspot yet. The position() argument of good old ByteBuffers is a 32bits `int` and whenever you read from memory it uses this int (`index += 4`). Those loops are easy optimized. With `MemorySegment`s, the pointer is a 64bit `long` named `curPosition` in our `MemorySegmentIndexInput`. Hotspot has no idea how to optimize that loop of readInt() statements! I am not sure if this is fully true at this stage but that's my first guess, I will discuss this with Maurizio.
Interestingly, I would have expected that `readBytes()` and `readLELongs()` are slower due to the overheads by wrapping all those `slice()` calls and `MemorySegment.ofArray()`all producing short living object instances! But I seem to be proved wrong :-)
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org
[GitHub] [lucene-solr] uschindler commented on a change in pull request #2176: Initial rewrite of MMapDirectory for JDK-16 preview (incubating) PANAMA APIs
Posted by GitBox <gi...@apache.org>.
uschindler commented on a change in pull request #2176:
URL: https://github.com/apache/lucene-solr/pull/2176#discussion_r551208675
##########
File path: gradle/defaults-java.gradle
##########
@@ -60,3 +60,14 @@ allprojects {
}
}
}
+
+configure(project(":lucene:core")) {
Review comment:
To come back to the MR-JAR: Yes we may need it, but it can be solved without, too.
The IndexInputs are completely separate (so they can life side-by side). The changes to MMapDircetory are mainly a change from int to long for the chunk size (this is why I also not yet removed the methods to query unmapping status, thy would always return true for JDK>xxxx.
When we would have this productive, MMapDirectory would just return one of the IndexInputs that fits your JDK. We may only need to move the code that maps a file to ByteBuffer or MemorySegment to a separate class or maybe the IndexInput class. Then we have 2 classes staying next to each other, one for JDK 11 and one for JDK 16/17/18+, MMapDircetory just chooses the correct one. Then we won't need a MR JAR.
With compilation we can use `--release` and of course a newer Gradle version. Like Elasticsearch, people who want to compile Lucene/Solr would need JDK 17, but the code is working for JDK 11+. Elasticsearch does exactly the same: It is tested and compiled for Java 8, but to build it you need JDK 15.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org
[GitHub] [lucene-solr] msokolov commented on pull request #2176: Initial rewrite of MMapDirectory for JDK-16 preview (incubating) PANAMA APIs
Posted by GitBox <gi...@apache.org>.
msokolov commented on pull request #2176:
URL: https://github.com/apache/lucene-solr/pull/2176#issuecomment-754299548
I ran benchmarks one more time, including vector tasks:
```
TaskQPS baseline StdDevQPS candidate StdDev Pct diff p-value
BrowseMonthTaxoFacets 2.17 (6.3%) 1.17 (1.6%) -46.1% ( -50% - -40%) 0.000
BrowseDateTaxoFacets 2.06 (6.5%) 1.14 (1.7%) -44.7% ( -49% - -39%) 0.000
BrowseDayOfYearTaxoFacets 2.06 (6.5%) 1.14 (1.7%) -44.7% ( -49% - -38%) 0.000
Fuzzy2 50.14 (10.0%) 42.75 (9.0%) -14.7% ( -30% - 4%) 0.000
OrNotHighLow 471.70 (4.3%) 404.83 (3.7%) -14.2% ( -21% - -6%) 0.000
Respell 48.12 (1.8%) 41.51 (1.2%) -13.7% ( -16% - -11%) 0.000
BrowseMonthSSDVFacets 12.43 (7.9%) 10.88 (6.6%) -12.4% ( -24% - 2%) 0.000
PKLookup 134.94 (1.0%) 118.56 (1.1%) -12.1% ( -14% - -10%) 0.000
LowTerm 972.51 (3.9%) 870.89 (2.9%) -10.4% ( -16% - -3%) 0.000
AndHighLow 1256.92 (4.3%) 1126.63 (4.2%) -10.4% ( -18% - -1%) 0.000
HighTerm 938.27 (4.5%) 843.52 (3.7%) -10.1% ( -17% - -1%) 0.000
LowTermVector 821.50 (2.4%) 739.43 (1.4%) -10.0% ( -13% - -6%) 0.000
Wildcard 94.86 (2.1%) 85.51 (1.9%) -9.9% ( -13% - -6%) 0.000
OrHighMed 98.33 (3.0%) 88.70 (3.0%) -9.8% ( -15% - -3%) 0.000
AndHighLowVector 859.32 (2.2%) 775.74 (1.9%) -9.7% ( -13% - -5%) 0.000
AndHighMedVector 723.85 (4.0%) 654.44 (3.3%) -9.6% ( -16% - -2%) 0.000
AndHighHighVector 825.28 (1.7%) 746.88 (1.5%) -9.5% ( -12% - -6%) 0.000
MedTerm 1133.18 (4.2%) 1025.84 (2.6%) -9.5% ( -15% - -2%) 0.000
AndHighMed 163.82 (4.0%) 148.32 (3.3%) -9.5% ( -16% - -2%) 0.000
OrNotHighHigh 432.50 (4.6%) 391.92 (3.9%) -9.4% ( -17% - 0%) 0.000
MedTermVector 761.28 (3.9%) 690.36 (3.4%) -9.3% ( -15% - -2%) 0.000
MedSloppyPhrase 44.77 (2.3%) 40.61 (1.6%) -9.3% ( -12% - -5%) 0.000
OrHighNotHigh 457.14 (3.9%) 414.97 (5.6%) -9.2% ( -17% - 0%) 0.000
HighTermVector 814.15 (5.0%) 741.66 (5.0%) -8.9% ( -18% - 1%) 0.000
AndHighHigh 34.20 (3.6%) 31.39 (3.5%) -8.2% ( -14% - -1%) 0.000
OrHighLow 318.11 (7.2%) 292.68 (4.9%) -8.0% ( -18% - 4%) 0.000
OrNotHighMed 489.33 (4.3%) 451.73 (3.3%) -7.7% ( -14% - 0%) 0.000
MedPhrase 112.51 (2.1%) 103.95 (2.3%) -7.6% ( -11% - -3%) 0.000
HighSpanNear 9.46 (3.1%) 8.76 (2.1%) -7.4% ( -12% - -2%) 0.000
LowSloppyPhrase 98.59 (2.4%) 91.33 (1.3%) -7.4% ( -10% - -3%) 0.000
MedSpanNear 98.82 (2.6%) 91.64 (2.0%) -7.3% ( -11% - -2%) 0.000
Fuzzy1 52.63 (9.9%) 49.02 (5.8%) -6.9% ( -20% - 9%) 0.007
LowSpanNear 93.95 (2.1%) 87.52 (1.6%) -6.8% ( -10% - -3%) 0.000
HighPhrase 256.37 (3.7%) 239.54 (2.3%) -6.6% ( -12% - 0%) 0.000
Prefix3 227.65 (6.2%) 213.30 (5.9%) -6.3% ( -17% - 6%) 0.001
HighSloppyPhrase 9.96 (3.1%) 9.35 (2.2%) -6.1% ( -11% - 0%) 0.000
OrHighHigh 21.71 (2.6%) 20.54 (1.8%) -5.4% ( -9% - -1%) 0.000
OrHighNotMed 471.56 (6.3%) 446.92 (4.6%) -5.2% ( -15% - 5%) 0.003
OrHighNotLow 555.62 (4.9%) 526.59 (3.5%) -5.2% ( -13% - 3%) 0.000
HighTermDayOfYearSort 33.02 (9.4%) 31.30 (8.0%) -5.2% ( -20% - 13%) 0.058
LowPhrase 310.16 (4.3%) 295.71 (2.5%) -4.7% ( -11% - 2%) 0.000
HighIntervalsOrdered 12.32 (1.4%) 11.77 (1.3%) -4.5% ( -7% - -1%) 0.000
HighTermMonthSort 121.01 (14.0%) 116.57 (11.1%) -3.7% ( -25% - 24%) 0.358
BrowseDayOfYearSSDVFacets 9.72 (6.0%) 9.39 (1.7%) -3.3% ( -10% - 4%) 0.017
TermDTSort 96.25 (8.7%) 94.99 (8.3%) -1.3% ( -16% - 17%) 0.626
HighTermTitleBDVSort 102.37 (16.1%) 101.51 (16.7%) -0.8% ( -28% - 38%) 0.871
IntNRQ 203.00 (0.8%) 202.97 (0.7%) -0.0% ( -1% - 1%) 0.952
```
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org
[GitHub] [lucene-solr] uschindler commented on pull request #2176: Initial rewrite of MMapDirectory for JDK-16 preview (incubating) PANAMA APIs
Posted by GitBox <gi...@apache.org>.
uschindler commented on pull request #2176:
URL: https://github.com/apache/lucene-solr/pull/2176#issuecomment-755818100
Hi @msokolov: If you have so time, you may check performance again. I rewrote the getBytes() and getLEXxxx() methods a bit.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org
[GitHub] [lucene-solr] dweiss commented on a change in pull request #2176: Initial rewrite of MMapDirectory for JDK-16 preview (incubating) PANAMA APIs
Posted by GitBox <gi...@apache.org>.
dweiss commented on a change in pull request #2176:
URL: https://github.com/apache/lucene-solr/pull/2176#discussion_r551179303
##########
File path: lucene/core/src/java/org/apache/lucene/store/MemorySegmentIndexInput.java
##########
@@ -0,0 +1,525 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.store;
+
+import java.io.EOFException;
+import java.io.IOException;
+import java.lang.invoke.VarHandle;
+import java.nio.ByteOrder;
+import java.util.Arrays;
+import java.util.Objects;
+
+import org.apache.lucene.util.IOUtils;
+
+import jdk.incubator.foreign.MemoryHandles;
+import jdk.incubator.foreign.MemorySegment;
+
+/**
+ * Base IndexInput implementation that uses an array of MemorySegments to represent a file.
+ *
+ * <p>For efficiency, this class requires that the segment size are a power-of-two
+ * (<code>chunkSizePower</code>).
+ */
+public abstract class MemorySegmentIndexInput extends IndexInput implements RandomAccessInput {
+ // We pass 1L as alignment, because currently Lucene file formats are heavy unaligned: :(
+ static final VarHandle VH_getByte = MemoryHandles.varHandle(byte.class, 1L, ByteOrder.BIG_ENDIAN).withInvokeExactBehavior();
+ static final VarHandle VH_getShort = MemoryHandles.varHandle(short.class, 1L, ByteOrder.BIG_ENDIAN).withInvokeExactBehavior();
+ static final VarHandle VH_getInt = MemoryHandles.varHandle(int.class, 1L, ByteOrder.BIG_ENDIAN).withInvokeExactBehavior();
+ static final VarHandle VH_getLong = MemoryHandles.varHandle(long.class, 1L, ByteOrder.BIG_ENDIAN).withInvokeExactBehavior();
+
+ final boolean isClone;
+ final long length;
+ final long chunkSizeMask;
+ final int chunkSizePower;
+ final MemorySegment[] segments;
+
+ int curSegmentIndex = -1;
+ MemorySegment curSegment; // redundant for speed: segments[curSegmentIndex], also marker if closed!
+ long curPosition; // relative to curSegment, not globally
+
+ public static MemorySegmentIndexInput newInstance(
+ String resourceDescription,
+ MemorySegment[] segments,
+ long length,
+ int chunkSizePower) {
+ if (segments.length == 1) {
+ return new SingleSegmentImpl(resourceDescription, segments[0], length, chunkSizePower, false);
+ } else {
+ return new MultiSegmentImpl(resourceDescription, segments, 0, length, chunkSizePower, false);
+ }
+ }
+
+ private MemorySegmentIndexInput(
+ String resourceDescription,
+ MemorySegment[] segments,
+ long length,
+ int chunkSizePower,
+ boolean isClone) {
+ super(resourceDescription);
+ this.segments = segments;
+ this.length = length;
+ this.chunkSizePower = chunkSizePower;
+ this.chunkSizeMask = (1L << chunkSizePower) - 1L;
+ this.isClone = isClone;
+ this.curSegment = segments[0];
+ }
+
+ void ensureOpen() {
+ if (curSegment == null) {
+ throw alreadyClosed();
+ }
+ }
+
+ RuntimeException wrapAlreadyClosedException(RuntimeException e) {
+ if (e instanceof NullPointerException) {
+ return alreadyClosed();
+ }
+ // TODO: maybe open a JDK issue to have a separate, more
+ // meaningful exception for unmapped segments:
+ if (e.getMessage() != null && e.getMessage().contains("closed")) {
+ return alreadyClosed();
+ }
+ return e;
+ }
+
+ // return value only for compiler!
+ byte handlePositionalIOOBE(String action, long pos) throws IOException {
Review comment:
The trick with byte return could be solved in a different way as well - by declaring a (dummy) return type of RuntimeException and replacing {{return handlePositionalIOOBE()}} with {{throw handlePositionalIOOBE(...)}} everywhere else. This has an additional benefit that code paths at invocation points see the code block as throwing an exception (so anything that follows is unreachable).
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org
[GitHub] [lucene-solr] uschindler commented on pull request #2176: Initial rewrite of MMapDirectory for JDK-16 preview (incubating) PANAMA APIs
Posted by GitBox <gi...@apache.org>.
uschindler commented on pull request #2176:
URL: https://github.com/apache/lucene-solr/pull/2176#issuecomment-753693042
I added test runs on Jenkins: https://jenkins.thetaphi.de/job/Lucene-Solr-jdk16panama-Linux/ and https://jenkins.thetaphi.de/job/Lucene-Solr-jdk16panama-Windows/
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org
[GitHub] [lucene-solr] uschindler edited a comment on pull request #2176: Initial rewrite of MMapDirectory for JDK-16 preview (incubating) PANAMA APIs
Posted by GitBox <gi...@apache.org>.
uschindler edited a comment on pull request #2176:
URL: https://github.com/apache/lucene-solr/pull/2176#issuecomment-753511728
@dweiss I found a workaround by adding the include/excludes as we use in Ant build. Why were those removed? Should I maybe reopen a new issue to add them back in master?
I have the feeling thois comes from the fact that class files compiled with JDK16 and using the new class file format can't be "analyzed" by gradle, so it assumes "no idea, let's run it because its a class file". The automatisms don't seem to work then. With the include/exclude patterns everything went back to normal.
Should we also add those lines to `defaults-tests.gradle` in master branch? I don't trust those autodetection, especially it may load read class files that it does not need to look at all?:
```
include '**/*Test.class', '**/Test*.class'
exclude '**/*$*'
```
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org
[GitHub] [lucene-solr] jpountz commented on pull request #2176: Initial rewrite of MMapDirectory for JDK-16 preview (incubating) PANAMA APIs
Posted by GitBox <gi...@apache.org>.
jpountz commented on pull request #2176:
URL: https://github.com/apache/lucene-solr/pull/2176#issuecomment-753490290
> In future we should throw away on coping/loading our stuff to heap and maybe throw away IndexInput completely and base our code fully on random access. The new foreign-vector APIs will in future also be written with MemorySegment in its focus. So you can allocate a vector view on a MemorySegment and let the vectorizer fully work outside java heap inside our mmapped files! :-)
+1 Things work like they do today because auto-vectorization only works on on-heap arrays, and this readLELongs method was the fastest way to copy data from a directory to a long[]. @msokolov and I were just discussing a few days ago how this vector API might help drop the copy entirely in the future, which would be great.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org
[GitHub] [lucene-solr] uschindler closed pull request #2176: Initial rewrite of MMapDirectory for JDK-16 preview (incubating) Panama APIs (>= JDK-16-ea-b32)
Posted by GitBox <gi...@apache.org>.
uschindler closed pull request #2176:
URL: https://github.com/apache/lucene-solr/pull/2176
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org
[GitHub] [lucene-solr] uschindler commented on pull request #2176: Initial rewrite of MMapDirectory for JDK-16 preview (incubating) PANAMA APIs
Posted by GitBox <gi...@apache.org>.
uschindler commented on pull request #2176:
URL: https://github.com/apache/lucene-solr/pull/2176#issuecomment-754299385
Hi @msokolov,
thanks for the speed test - I wanted to wait for the specialists to do it. As said before, we should report slowness (or improvements) to the Hotpsot/OpenJDK people. I was expecting a result like this due to the early stage of the development and misisng Hotspot optimizations. Just to clarify: In your code, you are reading the floats with a series of readInt() from the indexinput, converting them to float?
If that's the case, I would say: this is expected! From what I figured out on the mailing lists: loops using `long` as an index vs. ìnt` as an index are not supported well by Hotspot yet. The position() argument of ByteBuffers is a 32bits `int`and whenever you read from memory it uses this int (`index += 4`). Those loops are easy optimized. With `MemorySegment`s, the pointer is a 64bit `long` named `curPosition` in our `MemorySegmentIndexInput`. Hotspot has no idea how to optimize that loop of readInt() statements! I am not sure if this is fully true at this stage but that's my first guess, I will discuss this with Maurizio.
Interestingly, I would have expected that `readBytes()` and `readLELongs()` are slower due to the overheads by wrapping all those `slice()` calls and `MemorySegment.ofArray()`all producing short living object instances! But I seem to be proved wrong :-)
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org
[GitHub] [lucene-solr] uschindler commented on a change in pull request #2176: Initial rewrite of MMapDirectory for JDK-16 preview (incubating) PANAMA APIs
Posted by GitBox <gi...@apache.org>.
uschindler commented on a change in pull request #2176:
URL: https://github.com/apache/lucene-solr/pull/2176#discussion_r552031972
##########
File path: lucene/core/src/java/org/apache/lucene/store/MemorySegmentIndexInput.java
##########
@@ -0,0 +1,524 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.store;
+
+import java.io.EOFException;
+import java.io.IOException;
+import java.lang.invoke.VarHandle;
+import java.nio.ByteOrder;
+import java.util.Arrays;
+import java.util.Objects;
+
+import org.apache.lucene.util.IOUtils;
+
+import jdk.incubator.foreign.MemoryHandles;
+import jdk.incubator.foreign.MemorySegment;
+
+/**
+ * Base IndexInput implementation that uses an array of MemorySegments to represent a file.
+ *
+ * <p>For efficiency, this class requires that the segment size are a power-of-two
+ * (<code>chunkSizePower</code>).
+ */
+public abstract class MemorySegmentIndexInput extends IndexInput implements RandomAccessInput {
+ // We pass 1L as alignment, because currently Lucene file formats are heavy unaligned: :(
+ static final VarHandle VH_getByte = MemoryHandles.varHandle(byte.class, 1L, ByteOrder.BIG_ENDIAN).withInvokeExactBehavior();
+ static final VarHandle VH_getShort = MemoryHandles.varHandle(short.class, 1L, ByteOrder.BIG_ENDIAN).withInvokeExactBehavior();
Review comment:
Thanks for review! In fact that's wanted, because Lucene Apis read any type always from a file offset. The alignment of 1L is wanted because currently we have ints/longs/floats anywhere in files (see PR description).
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org
[GitHub] [lucene-solr] mcimadamore commented on a change in pull request #2176: Initial rewrite of MMapDirectory for JDK-16 preview (incubating) PANAMA APIs
Posted by GitBox <gi...@apache.org>.
mcimadamore commented on a change in pull request #2176:
URL: https://github.com/apache/lucene-solr/pull/2176#discussion_r552026775
##########
File path: lucene/core/src/java/org/apache/lucene/store/MemorySegmentIndexInput.java
##########
@@ -0,0 +1,524 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.store;
+
+import java.io.EOFException;
+import java.io.IOException;
+import java.lang.invoke.VarHandle;
+import java.nio.ByteOrder;
+import java.util.Arrays;
+import java.util.Objects;
+
+import org.apache.lucene.util.IOUtils;
+
+import jdk.incubator.foreign.MemoryHandles;
+import jdk.incubator.foreign.MemorySegment;
+
+/**
+ * Base IndexInput implementation that uses an array of MemorySegments to represent a file.
+ *
+ * <p>For efficiency, this class requires that the segment size are a power-of-two
+ * (<code>chunkSizePower</code>).
+ */
+public abstract class MemorySegmentIndexInput extends IndexInput implements RandomAccessInput {
+ // We pass 1L as alignment, because currently Lucene file formats are heavy unaligned: :(
+ static final VarHandle VH_getByte = MemoryHandles.varHandle(byte.class, 1L, ByteOrder.BIG_ENDIAN).withInvokeExactBehavior();
+ static final VarHandle VH_getShort = MemoryHandles.varHandle(short.class, 1L, ByteOrder.BIG_ENDIAN).withInvokeExactBehavior();
Review comment:
Just to be sure here - the offset accepted by this var handle is a _byte_ offset not a short offset. E.g. if you want to read consecutive short elements you need to do:
```
VH_getShort(segment, 0);
VH_getShort(segment, 2);
VH_getShort(segment, 4);
```
Alternatively, if you need array-like indexing, but don't want to mess with layouts, you can use the pre-baked statics inside the MemoryAccess class (e.g. `getShortAtIndex` vs. `getShortAtOffset` - the former take a logical short index, the latter takes a raw byte offset).
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org
[GitHub] [lucene-solr] uschindler edited a comment on pull request #2176: Initial rewrite of MMapDirectory for JDK-16 preview (incubating) PANAMA APIs
Posted by GitBox <gi...@apache.org>.
uschindler edited a comment on pull request #2176:
URL: https://github.com/apache/lucene-solr/pull/2176#issuecomment-753511728
@dweiss I found a workaround by adding the include/excludes as we use in Ant build. Why were those removed? Should I maybe reopen a new issue.
I have the feeling thois comes from the fact that class files compiled with JDK16 and using the new class file format can't be "analyzed" by gradle, so it assumes "no idea, let's run it because its a class file". The automatisms don't seem to work then. With the include/exclude patterns everything went back to normal.
Should we also add those lines to `defaults-tests.gradle` in master branch? I don't trust those autodetection, especially it may load read class files that it does not need to look at all?:
```
include '**/*Test.class', '**/Test*.class'
exclude '**/*$*'
```
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org
[GitHub] [lucene-solr] msokolov commented on pull request #2176: Initial rewrite of MMapDirectory for JDK-16 preview (incubating) PANAMA APIs
Posted by GitBox <gi...@apache.org>.
msokolov commented on pull request #2176:
URL: https://github.com/apache/lucene-solr/pull/2176#issuecomment-758175143
@uschindler I pulled latest from this branch (ba61072221905c202450034c0c90409f7073ebb3) and re-ran (comparing to updated master (6711eb7571727552aad3ace53c52c9a8fe07dc40) and see similar results:
```
TaskQPS baseline StdDevQPS candidate StdDev Pct diff p-value
BrowseMonthTaxoFacets 2.15 (7.0%) 1.20 (5.6%) -44.2% ( -53% - -33%) 0.000
BrowseDayOfYearTaxoFacets 2.04 (7.1%) 1.16 (5.8%) -42.9% ( -52% - -32%) 0.000
BrowseDateTaxoFacets 2.04 (7.1%) 1.17 (5.8%) -42.8% ( -51% - -32%) 0.000
LowTerm 1099.09 (4.8%) 936.16 (3.2%) -14.8% ( -21% - -7%) 0.000
OrNotHighLow 523.44 (7.0%) 450.50 (3.8%) -13.9% ( -23% - -3%) 0.000
Respell 42.23 (2.2%) 36.71 (2.5%) -13.1% ( -17% - -8%) 0.000
AndHighMedVector 893.77 (2.9%) 786.53 (3.0%) -12.0% ( -17% - -6%) 0.000
AndHighLow 662.65 (2.9%) 584.68 (2.0%) -11.8% ( -16% - -7%) 0.000
PKLookup 136.02 (1.5%) 120.03 (1.5%) -11.8% ( -14% - -8%) 0.000
Fuzzy1 53.41 (6.4%) 47.18 (5.4%) -11.7% ( -22% - 0%) 0.000
MedTermVector 1067.80 (3.2%) 946.10 (2.3%) -11.4% ( -16% - -6%) 0.000
AndHighLowVector 941.02 (3.8%) 834.30 (3.1%) -11.3% ( -17% - -4%) 0.000
AndHighHighVector 908.27 (2.9%) 808.05 (2.1%) -11.0% ( -15% - -6%) 0.000
LowTermVector 1010.16 (2.2%) 899.95 (1.9%) -10.9% ( -14% - -6%) 0.000
MedTerm 1167.78 (3.9%) 1043.79 (3.1%) -10.6% ( -17% - -3%) 0.000
BrowseMonthSSDVFacets 11.99 (13.4%) 10.73 (7.4%) -10.5% ( -27% - 11%) 0.002
AndHighMed 155.89 (3.5%) 139.90 (2.8%) -10.3% ( -16% - -4%) 0.000
OrNotHighMed 464.05 (4.1%) 416.91 (4.1%) -10.2% ( -17% - -2%) 0.000
HighTermVector 1222.14 (3.5%) 1105.22 (2.2%) -9.6% ( -14% - -4%) 0.000
Wildcard 91.66 (2.2%) 83.55 (2.4%) -8.9% ( -13% - -4%) 0.000
LowSloppyPhrase 46.92 (2.1%) 42.77 (1.5%) -8.8% ( -12% - -5%) 0.000
LowSpanNear 87.56 (2.7%) 80.02 (2.0%) -8.6% ( -13% - -3%) 0.000
OrHighNotMed 470.32 (5.4%) 429.87 (3.3%) -8.6% ( -16% - 0%) 0.000
HighTerm 1269.54 (7.1%) 1163.40 (5.8%) -8.4% ( -19% - 4%) 0.000
OrHighNotHigh 430.03 (4.1%) 395.13 (4.3%) -8.1% ( -15% - 0%) 0.000
OrHighNotLow 585.18 (5.0%) 537.96 (5.5%) -8.1% ( -17% - 2%) 0.000
AndHighHigh 37.46 (3.2%) 34.45 (2.6%) -8.0% ( -13% - -2%) 0.000
OrHighLow 576.99 (7.4%) 533.08 (6.4%) -7.6% ( -19% - 6%) 0.001
HighPhrase 261.45 (2.1%) 241.99 (1.8%) -7.4% ( -11% - -3%) 0.000
LowPhrase 375.57 (3.3%) 347.95 (3.7%) -7.4% ( -13% - 0%) 0.000
HighSpanNear 20.85 (2.8%) 19.36 (2.0%) -7.2% ( -11% - -2%) 0.000
MedPhrase 22.28 (1.8%) 20.77 (1.2%) -6.8% ( -9% - -3%) 0.000
MedSloppyPhrase 15.96 (3.0%) 14.92 (2.5%) -6.5% ( -11% - 0%) 0.000
Prefix3 204.87 (2.0%) 193.03 (2.2%) -5.8% ( -9% - -1%) 0.000
OrNotHighHigh 496.09 (6.6%) 467.47 (4.1%) -5.8% ( -15% - 5%) 0.001
HighSloppyPhrase 22.22 (4.0%) 21.02 (3.0%) -5.4% ( -11% - 1%) 0.000
HighTermMonthSort 132.66 (11.4%) 126.18 (13.2%) -4.9% ( -26% - 22%) 0.211
TermDTSort 68.55 (13.3%) 65.22 (15.1%) -4.9% ( -29% - 27%) 0.279
OrHighMed 78.75 (3.8%) 74.95 (2.6%) -4.8% ( -10% - 1%) 0.000
MedSpanNear 29.19 (2.3%) 27.83 (1.9%) -4.7% ( -8% - 0%) 0.000
HighTermDayOfYearSort 105.52 (12.0%) 101.55 (10.1%) -3.8% ( -23% - 20%) 0.284
IntNRQ 76.00 (12.3%) 73.29 (11.9%) -3.6% ( -24% - 23%) 0.351
HighIntervalsOrdered 21.82 (1.5%) 21.05 (1.5%) -3.5% ( -6% - 0%) 0.000
HighTermTitleBDVSort 114.39 (16.4%) 110.77 (14.9%) -3.2% ( -29% - 33%) 0.522
OrHighHigh 16.17 (3.5%) 15.68 (2.8%) -3.1% ( -9% - 3%) 0.002
BrowseDayOfYearSSDVFacets 9.57 (9.0%) 9.29 (5.2%) -2.9% ( -15% - 12%) 0.221
Fuzzy2 30.01 (8.5%) 30.26 (10.2%) 0.8% ( -16% - 21%) 0.781
```
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org
[GitHub] [lucene-solr] uschindler edited a comment on pull request #2176: Initial rewrite of MMapDirectory for JDK-16 preview (incubating) PANAMA APIs
Posted by GitBox <gi...@apache.org>.
uschindler edited a comment on pull request #2176:
URL: https://github.com/apache/lucene-solr/pull/2176#issuecomment-753503210
@dweiss I need your help here. There is one thing that drives me crazy: with the changes (actually adding `--add-modules jdk.incubator.foreign` to the test command line), JUnit 4 suddely fails for many "non-tests" with exceptions like this, although those should never have been executed. It's mostly some test helper classes (like the writer for Lucene60Points).
This seems to have to do with some automatism by Gradle or by JUnit once it detects the "Java module system", and I have no idea how to turn it off! What I am completely wondering about: why does it load those classes at all? I was trying to find the good old "Ant fileset" that has this `**/*Test.class,**/Test*.class` pattern, but this is no longer there with gradle. How to re-add it, if JUnit/Gradle seems to think that it needs some (broken) module system API.
Unless this is fixed, I can't run all tests easily. I only ran them per module and ignored the tons of stack traces. PLEASE HELP!
```
org.apache.lucene.backward_codecs.lucene60.Lucene60PointsWriter > initializationError FAILED
java.lang.IllegalArgumentException: Test class can only have one constructor
at org.junit.runners.model.TestClass.<init>(TestClass.java:48)
at org.junit.runners.JUnit4.<init>(JUnit4.java:23)
at org.junit.internal.builders.JUnit4Builder.runnerForClass(JUnit4Builder.java:10)
at org.junit.runners.model.RunnerBuilder.safeRunnerForClass(RunnerBuilder.java:70)
at org.junit.internal.builders.AllDefaultPossibilitiesBuilder.runnerForClass(AllDefaultPossibilitiesBuilder.java:37)
at org.junit.runners.model.RunnerBuilder.safeRunnerForClass(RunnerBuilder.java:70)
at org.junit.internal.requests.ClassRequest.createRunner(ClassRequest.java:28)
at org.junit.internal.requests.MemoizingRequest.getRunner(MemoizingRequest.java:19)
at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:78)
at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:58)
at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:38)
at org.gradle.api.internal.tasks.testing.junit.AbstractJUnitTestClassProcessor.processTestClass(AbstractJUnitTestClassProcessor.java:62)
at org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:78)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:567)
at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:36)
at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
at org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:33)
at org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:94)
at jdk.proxy1/jdk.proxy1.$Proxy2.processTestClass(Unknown Source)
at org.gradle.api.internal.tasks.testing.worker.TestWorker.processTestClass(TestWorker.java:119)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:78)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:567)
at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:36)
at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
at org.gradle.internal.remote.internal.hub.MessageHubBackedObjectConnection$DispatchWrapper.dispatch(MessageHubBackedObjectConnection.java:182)
at org.gradle.internal.remote.internal.hub.MessageHubBackedObjectConnection$DispatchWrapper.dispatch(MessageHubBackedObjectConnection.java:164)
at org.gradle.internal.remote.internal.hub.MessageHub$Handler.run(MessageHub.java:414)
at org.gradle.internal.concurrent.ExecutorPolicy$CatchAndRecordFailures.onExecute(ExecutorPolicy.java:64)
at org.gradle.internal.concurrent.ManagedExecutorImpl$1.run(ManagedExecutorImpl.java:48)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630)
at org.gradle.internal.concurrent.ThreadFactoryImpl$ManagedThreadRunnable.run(ThreadFactoryImpl.java:56)
at java.base/java.lang.Thread.run(Thread.java:831)
```
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org
[GitHub] [lucene-solr] uschindler commented on a change in pull request #2176: Initial rewrite of MMapDirectory for JDK-16 preview (incubating) PANAMA APIs
Posted by GitBox <gi...@apache.org>.
uschindler commented on a change in pull request #2176:
URL: https://github.com/apache/lucene-solr/pull/2176#discussion_r551202560
##########
File path: lucene/core/src/java/org/apache/lucene/store/MemorySegmentIndexInput.java
##########
@@ -0,0 +1,525 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.store;
+
+import java.io.EOFException;
+import java.io.IOException;
+import java.lang.invoke.VarHandle;
+import java.nio.ByteOrder;
+import java.util.Arrays;
+import java.util.Objects;
+
+import org.apache.lucene.util.IOUtils;
+
+import jdk.incubator.foreign.MemoryHandles;
+import jdk.incubator.foreign.MemorySegment;
+
+/**
+ * Base IndexInput implementation that uses an array of MemorySegments to represent a file.
+ *
+ * <p>For efficiency, this class requires that the segment size are a power-of-two
+ * (<code>chunkSizePower</code>).
+ */
+public abstract class MemorySegmentIndexInput extends IndexInput implements RandomAccessInput {
+ // We pass 1L as alignment, because currently Lucene file formats are heavy unaligned: :(
+ static final VarHandle VH_getByte = MemoryHandles.varHandle(byte.class, 1L, ByteOrder.BIG_ENDIAN).withInvokeExactBehavior();
+ static final VarHandle VH_getShort = MemoryHandles.varHandle(short.class, 1L, ByteOrder.BIG_ENDIAN).withInvokeExactBehavior();
+ static final VarHandle VH_getInt = MemoryHandles.varHandle(int.class, 1L, ByteOrder.BIG_ENDIAN).withInvokeExactBehavior();
+ static final VarHandle VH_getLong = MemoryHandles.varHandle(long.class, 1L, ByteOrder.BIG_ENDIAN).withInvokeExactBehavior();
+
+ final boolean isClone;
+ final long length;
+ final long chunkSizeMask;
+ final int chunkSizePower;
+ final MemorySegment[] segments;
+
+ int curSegmentIndex = -1;
+ MemorySegment curSegment; // redundant for speed: segments[curSegmentIndex], also marker if closed!
+ long curPosition; // relative to curSegment, not globally
+
+ public static MemorySegmentIndexInput newInstance(
+ String resourceDescription,
+ MemorySegment[] segments,
+ long length,
+ int chunkSizePower) {
+ if (segments.length == 1) {
+ return new SingleSegmentImpl(resourceDescription, segments[0], length, chunkSizePower, false);
+ } else {
+ return new MultiSegmentImpl(resourceDescription, segments, 0, length, chunkSizePower, false);
+ }
+ }
+
+ private MemorySegmentIndexInput(
+ String resourceDescription,
+ MemorySegment[] segments,
+ long length,
+ int chunkSizePower,
+ boolean isClone) {
+ super(resourceDescription);
+ this.segments = segments;
+ this.length = length;
+ this.chunkSizePower = chunkSizePower;
+ this.chunkSizeMask = (1L << chunkSizePower) - 1L;
+ this.isClone = isClone;
+ this.curSegment = segments[0];
+ }
+
+ void ensureOpen() {
+ if (curSegment == null) {
+ throw alreadyClosed();
+ }
+ }
+
+ RuntimeException wrapAlreadyClosedException(RuntimeException e) {
+ if (e instanceof NullPointerException) {
+ return alreadyClosed();
+ }
+ // TODO: maybe open a JDK issue to have a separate, more
+ // meaningful exception for unmapped segments:
+ if (e.getMessage() != null && e.getMessage().contains("closed")) {
+ return alreadyClosed();
+ }
+ return e;
+ }
+
+ // return value only for compiler!
+ byte handlePositionalIOOBE(String action, long pos) throws IOException {
Review comment:
Committed and pushed this change.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org