Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2021/02/01 21:08:30 UTC

[GitHub] [lucene-solr] dweiss commented on a change in pull request #2258: LUCENE-9686: Fix read past EOF handling in DirectIODirectory

dweiss commented on a change in pull request #2258:
URL: https://github.com/apache/lucene-solr/pull/2258#discussion_r568136570



##########
File path: lucene/misc/src/java/org/apache/lucene/misc/store/DirectIODirectory.java
##########
@@ -381,17 +377,18 @@ public long length() {
     @Override
     public byte readByte() throws IOException {
       if (!buffer.hasRemaining()) {
-        refill();
+        refill(1);
       }
+
       return buffer.get();
     }
 
-    private void refill() throws IOException {
+    private void refill(int byteToRead) throws IOException {

Review comment:
       Should it be plural (bytesToRead)?

##########
File path: lucene/misc/src/java/org/apache/lucene/misc/store/DirectIODirectory.java
##########
@@ -381,17 +377,18 @@ public long length() {
     @Override
     public byte readByte() throws IOException {
       if (!buffer.hasRemaining()) {
-        refill();
+        refill(1);
       }
+
       return buffer.get();
     }
 
-    private void refill() throws IOException {
+    private void refill(int byteToRead) throws IOException {
       filePos += buffer.capacity();
 
       // BaseDirectoryTestCase#testSeekPastEOF test for consecutive read past EOF,
       // hence throwing EOFException early to maintain buffer state (position in particular)
-      if (filePos > channel.size()) {
+      if (filePos > channel.size() || (channel.size() - filePos < byteToRead)) {

Review comment:
       I wonder if we should move the channel's position to point just past the last byte and then throw EOFException, so that we not only signal an EOF but also leave the channel positioned at the end. The scenario I have in mind: somebody tries to read a bulk of bytes and hits an EOF, but then a single-byte read() succeeds. That would be awkward, wouldn't it?
   
   A refill should try to read as many bytes as it can (min(channel.size() - filePos, bytesToRead)), then fail if bytesToRead is still > 0 and the channel is at EOF. Or is my thinking flawed somewhere?
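
   To make the suggestion concrete, here is a minimal standalone sketch of that "consume what is available, then fail" behaviour. It is not the DirectIODirectory code: the class PartialRefillSketch, its readBytes/position/demo methods, and the unaligned positional reads are all assumptions made for illustration, and the real input would still have to respect direct I/O block alignment.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical illustration only; names and structure are not from the PR. */
class PartialRefillSketch {
  private final FileChannel channel;
  private long filePos; // absolute offset of the next unread byte

  PartialRefillSketch(FileChannel channel) {
    this.channel = channel;
  }

  long position() {
    return filePos;
  }

  /** Reads bytesToRead bytes into dst, or consumes what is left and throws EOFException. */
  void readBytes(byte[] dst, int offset, int bytesToRead) throws IOException {
    long available = Math.max(0L, channel.size() - filePos);
    int toRead = (int) Math.min(available, (long) bytesToRead);
    ByteBuffer buf = ByteBuffer.wrap(dst, offset, toRead);
    while (buf.hasRemaining()) {
      int n = channel.read(buf, filePos); // positional read, channel state untouched
      if (n < 0) {
        break; // file shrank underneath us
      }
      filePos += n;
    }
    int read = buf.position() - offset;
    if (read < bytesToRead) {
      // filePos now sits after the last byte consumed, so any later read fails as well
      throw new EOFException("read past EOF: wanted " + bytesToRead + ", got " + read);
    }
  }

  /** Bulk read past EOF on a 3-byte file, followed by a single-byte read. */
  static String demo() {
    try {
      Path tmp = Files.createTempFile("refill-sketch", ".bin");
      try {
        Files.write(tmp, new byte[] {1, 2, 3});
        try (FileChannel ch = FileChannel.open(tmp, StandardOpenOption.READ)) {
          PartialRefillSketch in = new PartialRefillSketch(ch);
          in.readBytes(new byte[2], 0, 2); // succeeds, one byte left
          String bulk = attempt(in, 4);    // only one byte available
          String single = attempt(in, 1);  // must not succeed after the failed bulk read
          return "bulk=" + bulk + " single=" + single + " pos=" + in.position();
        }
      } finally {
        Files.deleteIfExists(tmp);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  private static String attempt(PartialRefillSketch in, int n) throws IOException {
    try {
      in.readBytes(new byte[n], 0, n);
      return "ok";
    } catch (EOFException e) {
      return "EOF";
    }
  }

  public static void main(String[] args) {
    System.out.println(demo()); // prints "bulk=EOF single=EOF pos=3"
  }
}
```

   Because the failed bulk read still advances the position to the end of the file, the follow-up single-byte read also hits EOF, which avoids the awkward case described above. Whether advancing the position on failure is compatible with what BaseDirectoryTestCase#testSeekPastEOF expects would need to be checked against the test itself.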



