Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2021/04/08 09:49:18 UTC

[GitHub] [lucene] iverase opened a new pull request #72: LUCENE-9907: Remove packedInts#getReaderNoHeader dependency on TermsVectorFieldsFormat

iverase opened a new pull request #72:
URL: https://github.com/apache/lucene/pull/72


   Replaces the usages of PackedInts#getReaderNoHeader with DirectReader#getInstance.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] iverase merged pull request #72: LUCENE-9907: Remove packedInts#getReaderNoHeader dependency on TermsVectorFieldsFormat

Posted by GitBox <gi...@apache.org>.
iverase merged pull request #72:
URL: https://github.com/apache/lucene/pull/72


   




[GitHub] [lucene] jpountz commented on a change in pull request #72: LUCENE-9907: Remove packedInts#getReaderNoHeader dependency on TermsVectorFieldsFormat

Posted by GitBox <gi...@apache.org>.
jpountz commented on a change in pull request #72:
URL: https://github.com/apache/lucene/pull/72#discussion_r614052826



##########
File path: lucene/core/src/java/org/apache/lucene/codecs/lucene90/compressing/Lucene90CompressingTermVectorsReader.java
##########
@@ -295,6 +300,38 @@ public TermVectorsReader clone() {
     return new Lucene90CompressingTermVectorsReader(this);
   }
 
+  private static RandomAccessInput slice(IndexInput in) throws IOException {
+    final int length = in.readVInt();
+    final byte[] bytes = new byte[length];
+    in.readBytes(bytes, 0, length);
+    final ByteArrayDataInput input = new ByteArrayDataInput(bytes);

Review comment:
       It would probably be easier to wrap the bytes in a ByteBuffer than in a ByteArrayDataInput, since ByteBuffer already supports random access to bytes, shorts, ints and longs.
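
A minimal sketch of that suggestion, using only the JDK. The class name `RandomAccessBytes` and the little-endian byte order are assumptions made for illustration; they are not taken from the PR. The absolute getters of `ByteBuffer` read at a given offset without moving the buffer's position, which is the random-access behaviour the comment refers to.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical helper: random access over a byte[] via a ByteBuffer wrapper. */
final class RandomAccessBytes {
  private final ByteBuffer buffer;

  RandomAccessBytes(byte[] bytes) {
    // wrap() does not copy the array; the byte order is an assumption for this sketch.
    this.buffer = ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN);
  }

  // Absolute getters: each call reads at 'pos' and leaves the buffer position untouched.
  byte readByte(int pos) {
    return buffer.get(pos);
  }

  short readShort(int pos) {
    return buffer.getShort(pos);
  }

  int readInt(int pos) {
    return buffer.getInt(pos);
  }

  long readLong(int pos) {
    return buffer.getLong(pos);
  }
}
```

Because every read takes an explicit offset, the same instance can serve lookups in any order without seek bookkeeping.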

##########
File path: lucene/core/src/java/org/apache/lucene/codecs/lucene90/compressing/Lucene90CompressingTermVectorsWriter.java
##########
@@ -223,6 +224,7 @@ void addPosition(int position, int startOffset, int length, int payloadLength) {
   private final ByteBuffersDataOutput payloadBytes; // buffered term payloads
   private final BlockPackedWriter writer;
   private final int maxDocsPerChunk; // hard limit on number of docs per chunk
+  private final ByteBuffersDataOutput scratchBuffer = ByteBuffersDataOutput.newResettableInstance();

Review comment:
       Can you add it to `ramBytesUsed()`?






[GitHub] [lucene] jpountz commented on a change in pull request #72: LUCENE-9907: Remove packedInts#getReaderNoHeader dependency on TermsVectorFieldsFormat

Posted by GitBox <gi...@apache.org>.
jpountz commented on a change in pull request #72:
URL: https://github.com/apache/lucene/pull/72#discussion_r614092381



##########
File path: lucene/core/src/java/org/apache/lucene/codecs/lucene90/compressing/Lucene90CompressingTermVectorsReader.java
##########
@@ -295,6 +300,38 @@ public TermVectorsReader clone() {
     return new Lucene90CompressingTermVectorsReader(this);
   }
 
+  private static RandomAccessInput slice(IndexInput in) throws IOException {
+    final int length = in.readVInt();
+    final byte[] bytes = new byte[length];
+    in.readBytes(bytes, 0, length);
+    final ByteArrayDataInput input = new ByteArrayDataInput(bytes);

Review comment:
       Even better!






[GitHub] [lucene] iverase commented on a change in pull request #72: LUCENE-9907: Remove packedInts#getReaderNoHeader dependency on TermsVectorFieldsFormat

Posted by GitBox <gi...@apache.org>.
iverase commented on a change in pull request #72:
URL: https://github.com/apache/lucene/pull/72#discussion_r614079868



##########
File path: lucene/core/src/java/org/apache/lucene/codecs/lucene90/compressing/Lucene90CompressingTermVectorsReader.java
##########
@@ -295,6 +300,38 @@ public TermVectorsReader clone() {
     return new Lucene90CompressingTermVectorsReader(this);
   }
 
+  private static RandomAccessInput slice(IndexInput in) throws IOException {
+    final int length = in.readVInt();
+    final byte[] bytes = new byte[length];
+    in.readBytes(bytes, 0, length);
+    final ByteArrayDataInput input = new ByteArrayDataInput(bytes);

Review comment:
       Let's wrap it and use a ByteBuffersDataInput instead.






[GitHub] [lucene] iverase commented on pull request #72: LUCENE-9907: Remove packedInts#getReaderNoHeader dependency on TermsVectorFieldsFormat

Posted by GitBox <gi...@apache.org>.
iverase commented on pull request #72:
URL: https://github.com/apache/lucene/pull/72#issuecomment-820374588


   I have added the length needed to store the int array, so the reader can retrieve it before reading the array.
   
   On the read side, I found that wrapping the IndexInput is tricky, because the IndexInput's current position might change at any point. I took a different approach: I read the data into a byte[], which is equivalent to what we were doing before by reading it into a long[].
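
The approach described above can be sketched with plain JDK classes. This is an illustration under stated assumptions, not the code from the PR: the class `LengthPrefixedBlock` is hypothetical, a `ByteBuffer` stands in for the shared stream, and a fixed 4-byte int is used for the length where the diff quoted in this thread uses `readVInt()`.

```java
import java.nio.ByteBuffer;

/** Hypothetical sketch: length-prefixed block copied into a private byte[]. */
final class LengthPrefixedBlock {

  /** Writes the block length first, so a reader knows how many bytes to buffer. */
  static void write(ByteBuffer out, byte[] block) {
    out.putInt(block.length); // assumption: fixed-width length instead of a vInt
    out.put(block);
  }

  /**
   * Reads the length, then copies exactly that many bytes into a new array.
   * The shared stream advances past the block once; later random access
   * happens on the returned copy and never touches the stream position.
   */
  static byte[] read(ByteBuffer in) {
    int length = in.getInt();
    byte[] block = new byte[length];
    in.get(block);
    return block;
  }
}
```

Copying the block trades some memory for isolation: whatever else moves the shared stream afterwards cannot invalidate offsets into the copy.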




[GitHub] [lucene] iverase commented on a change in pull request #72: LUCENE-9907: Remove packedInts#getReaderNoHeader dependency on TermsVectorFieldsFormat

Posted by GitBox <gi...@apache.org>.
iverase commented on a change in pull request #72:
URL: https://github.com/apache/lucene/pull/72#discussion_r614076033



##########
File path: lucene/core/src/java/org/apache/lucene/codecs/lucene90/compressing/Lucene90CompressingTermVectorsWriter.java
##########
@@ -223,6 +224,7 @@ void addPosition(int position, int startOffset, int length, int payloadLength) {
   private final ByteBuffersDataOutput payloadBytes; // buffered term payloads
   private final BlockPackedWriter writer;
   private final int maxDocsPerChunk; // hard limit on number of docs per chunk
+  private final ByteBuffersDataOutput scratchBuffer = ByteBuffersDataOutput.newResettableInstance();

Review comment:
       Done.



