You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/01/09 15:18:02 UTC

[GitHub] [arrow] stczwd opened a new pull request #9147: ARROW-11177: [Java] ArrowMessage failed to parse compressed grpc stream

stczwd opened a new pull request #9147:
URL: https://github.com/apache/arrow/pull/9147


   InvalidProtocolBufferException will be thrown in ArrowMessage.frame if we use gzip compress in Grpc.
   
   The reason is, stream.available is still a 1 after we read compressed data. It need another read to change InflaterInputStream.reachEOF to be true, and make stream.available to be 0.
   Thus we need check stream.available agian after read first byte from stream.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] lidavidm commented on pull request #9147: ARROW-11177: [Java] ArrowMessage failed to parse compressed grpc stream

Posted by GitBox <gi...@apache.org>.
lidavidm commented on pull request #9147:
URL: https://github.com/apache/arrow/pull/9147#issuecomment-801014998


   I still believe we should not be polling available().


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] emkornfield commented on a change in pull request #9147: ARROW-11177: [Java] ArrowMessage failed to parse compressed grpc stream

Posted by GitBox <gi...@apache.org>.
emkornfield commented on a change in pull request #9147:
URL: https://github.com/apache/arrow/pull/9147#discussion_r596528147



##########
File path: java/flight/flight-core/src/main/java/org/apache/arrow/flight/ArrowMessage.java
##########
@@ -332,6 +333,23 @@ private static ArrowMessage frame(BufferAllocator allocator, final InputStream s
 
   }
 
+  /**
+   * Get first byte with EOF check, it is especially needed when using grpc compression.
+   * InflaterInputStream need another read to change reachEOF after all bytes has been read.
+   *
+   * @param is InputStream
+   * @return -1 if stream is not available, otherwise it will return the actual value.
+   * @throws IOException Read first byte failed.
+   */
+  private static int readRawVarint32WithEOFCheck(InputStream is) throws IOException {
+    int firstByte = is.read();

Review comment:
       I agree with @lidavidm another scenario where zero is perfectly allowable is when no data is buffered but EOF hasn't been reached. @stczwd do you think you will be able to revise the PR to fix the underlying issue?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] stczwd commented on a change in pull request #9147: ARROW-11177: [Java] ArrowMessage failed to parse compressed grpc stream

Posted by GitBox <gi...@apache.org>.
stczwd commented on a change in pull request #9147:
URL: https://github.com/apache/arrow/pull/9147#discussion_r557832603



##########
File path: java/flight/flight-core/src/main/java/org/apache/arrow/flight/ArrowMessage.java
##########
@@ -259,7 +259,8 @@ private static ArrowMessage frame(BufferAllocator allocator, final InputStream s
       ArrowBuf body = null;
       ArrowBuf appMetadata = null;
       while (stream.available() > 0) {
-        int tag = readRawVarint32(stream);
+        int tag = readRawVarint32WithEOFCheck(stream);
+
         switch (tag) {

Review comment:
       Em,  it is similar to current code. Besides, current we can return unified exception information for abnormal EOF.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] lidavidm commented on a change in pull request #9147: ARROW-11177: [Java] ArrowMessage failed to parse compressed grpc stream

Posted by GitBox <gi...@apache.org>.
lidavidm commented on a change in pull request #9147:
URL: https://github.com/apache/arrow/pull/9147#discussion_r560200455



##########
File path: java/flight/flight-core/src/main/java/org/apache/arrow/flight/ArrowMessage.java
##########
@@ -259,7 +259,8 @@ private static ArrowMessage frame(BufferAllocator allocator, final InputStream s
       ArrowBuf body = null;
       ArrowBuf appMetadata = null;
       while (stream.available() > 0) {
-        int tag = readRawVarint32(stream);
+        int tag = readRawVarint32WithEOFCheck(stream);
+
         switch (tag) {

Review comment:
       The current code is wrong, though. (It would be entirely valid for available() == 0 even when the stream is not at EOF. The JavaDocs state that EOF implies available() == 0, but not the other way around.) And the PR raises different exceptions for EOF depending on the case.
   
   Instead of adding flags to massage the current behavior in different cases, let's write code that adheres to the contract of InputStream.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] emkornfield commented on pull request #9147: ARROW-11177: [Java] ArrowMessage failed to parse compressed grpc stream

Posted by GitBox <gi...@apache.org>.
emkornfield commented on pull request #9147:
URL: https://github.com/apache/arrow/pull/9147#issuecomment-800866250


   @lidavidm @stczwd what is the status of this PR?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] stczwd commented on a change in pull request #9147: ARROW-11177: [Java] ArrowMessage failed to parse compressed grpc stream

Posted by GitBox <gi...@apache.org>.
stczwd commented on a change in pull request #9147:
URL: https://github.com/apache/arrow/pull/9147#discussion_r556236616



##########
File path: java/flight/flight-core/src/main/java/org/apache/arrow/flight/ArrowMessage.java
##########
@@ -332,6 +333,23 @@ private static ArrowMessage frame(BufferAllocator allocator, final InputStream s
 
   }
 
+  /**
+   * Get first byte with EOF check, it is especially needed when using grpc compression.
+   * InflaterInputStream need another read to change reachEOF after all bytes has been read.
+   *
+   * @param is InputStream
+   * @return -1 if stream is not available, otherwise it will return the actual value.
+   * @throws IOException Read first byte failed.
+   */
+  private static int readRawVarint32WithEOFCheck(InputStream is) throws IOException {
+    int firstByte = is.read();

Review comment:
       The description of `java.io.InputStream#available()` indicates it will return {@code 0} when it reaches the end of the input stream.
   ```
   /**
    * Returns an estimate of the number of bytes that can be read (or
    * skipped over) from this input stream without blocking by the next
    * invocation of a method for this input stream. The next invocation
    * might be the same thread or another thread.  A single read or skip of this
    * many bytes will not block, but may read or skip fewer bytes.
    *
    * <p> Note that while some implementations of {@code InputStream} will return
    * the total number of bytes in the stream, many will not.  It is
    * never correct to use the return value of this method to allocate
    * a buffer intended to hold all data in this stream.
    *
    * <p> A subclass' implementation of this method may choose to throw an
    * {@link IOException} if this input stream has been closed by
    * invoking the {@link #close()} method.
    *
    * <p> The {@code available} method for class {@code InputStream} always
    * returns {@code 0}.
    *
    * <p> This method should be overridden by subclasses.
    *
    * @return     an estimate of the number of bytes that can be read (or skipped
    *             over) from this input stream without blocking or {@code 0} when
    *             it reaches the end of the input stream.
    * @exception  IOException if an I/O error occurs.
    */
   public int available() throws IOException {
       return 0;
   }
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] lidavidm commented on a change in pull request #9147: ARROW-11177: [Java] ArrowMessage failed to parse compressed grpc stream

Posted by GitBox <gi...@apache.org>.
lidavidm commented on a change in pull request #9147:
URL: https://github.com/apache/arrow/pull/9147#discussion_r555078736



##########
File path: java/flight/flight-core/src/main/java/org/apache/arrow/flight/ArrowMessage.java
##########
@@ -332,6 +333,23 @@ private static ArrowMessage frame(BufferAllocator allocator, final InputStream s
 
   }
 
+  /**
+   * Get first byte with EOF check, it is especially needed when using grpc compression.
+   * InflaterInputStream need another read to change reachEOF after all bytes has been read.
+   *
+   * @param is InputStream
+   * @return -1 if stream is not available, otherwise it will return the actual value.
+   * @throws IOException Read first byte failed.
+   */
+  private static int readRawVarint32WithEOFCheck(InputStream is) throws IOException {
+    int firstByte = is.read();

Review comment:
       Why can't we check `firstByte < 0` for EOF?

##########
File path: java/flight/flight-core/src/main/java/org/apache/arrow/flight/ArrowMessage.java
##########
@@ -259,7 +259,8 @@ private static ArrowMessage frame(BufferAllocator allocator, final InputStream s
       ArrowBuf body = null;
       ArrowBuf appMetadata = null;
       while (stream.available() > 0) {
-        int tag = readRawVarint32(stream);
+        int tag = readRawVarint32WithEOFCheck(stream);
+
         switch (tag) {

Review comment:
       Can we explicitly handle -1 (EOF) here?

##########
File path: java/flight/flight-core/src/main/java/org/apache/arrow/flight/ArrowMessage.java
##########
@@ -332,6 +333,23 @@ private static ArrowMessage frame(BufferAllocator allocator, final InputStream s
 
   }
 
+  /**
+   * Get first byte with EOF check, it is especially needed when using grpc compression.
+   * InflaterInputStream need another read to change reachEOF after all bytes has been read.
+   *
+   * @param is InputStream
+   * @return -1 if stream is not available, otherwise it will return the actual value.
+   * @throws IOException Read first byte failed.
+   */
+  private static int readRawVarint32WithEOFCheck(InputStream is) throws IOException {
+    int firstByte = is.read();
+    if (is.available() <= 0) {
+      return -1;
+    } else {
+      return CodedInputStream.readRawVarint32(firstByte, is);
+    }
+  }
+
   private static int readRawVarint32(InputStream is) throws IOException {
     int firstByte = is.read();

Review comment:
       It seems here we should be checking `firstByte < -1` since `readRawVarint32` expects the caller to do EOF-checking - so maybe we don't need a new method, and we should fix this for all callers of this method?

##########
File path: java/flight/flight-core/src/main/java/org/apache/arrow/flight/ArrowMessage.java
##########
@@ -332,6 +333,23 @@ private static ArrowMessage frame(BufferAllocator allocator, final InputStream s
 
   }
 
+  /**
+   * Get first byte with EOF check, it is especially needed when using grpc compression.
+   * InflaterInputStream need another read to change reachEOF after all bytes has been read.
+   *
+   * @param is InputStream
+   * @return -1 if stream is not available, otherwise it will return the actual value.
+   * @throws IOException Read first byte failed.
+   */

Review comment:
       ```suggestion
     /**
      * Read a varint32 from the stream, checking for EOF.
      *
      * <p>When using gRPC compression, EOF may not be reported by the InflaterInputStream until another read has
      * been performed. This method checks {@link InputStream#available()} after reading the first byte in order
      * to handle this case.
      *
      * @param is InputStream
      * @return -1 if EOF reached, else the varint32 value.
      * @throws IOException if an error occurred while reading the stream.
      */
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] stczwd commented on pull request #9147: ARROW-11177: [Java] ArrowMessage failed to parse compressed grpc stream

Posted by GitBox <gi...@apache.org>.
stczwd commented on pull request #9147:
URL: https://github.com/apache/arrow/pull/9147#issuecomment-849241823


   @emkornfield @lidavidm any more comments?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] github-actions[bot] commented on pull request #9147: ARROW-11177: [Java] ArrowMessage failed to parse compressed grpc stream

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #9147:
URL: https://github.com/apache/arrow/pull/9147#issuecomment-757305199


   https://issues.apache.org/jira/browse/ARROW-11177


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] lidavidm commented on a change in pull request #9147: ARROW-11177: [Java] ArrowMessage failed to parse compressed grpc stream

Posted by GitBox <gi...@apache.org>.
lidavidm commented on a change in pull request #9147:
URL: https://github.com/apache/arrow/pull/9147#discussion_r556542740



##########
File path: java/flight/flight-core/src/main/java/org/apache/arrow/flight/ArrowMessage.java
##########
@@ -259,7 +259,8 @@ private static ArrowMessage frame(BufferAllocator allocator, final InputStream s
       ArrowBuf body = null;
       ArrowBuf appMetadata = null;
       while (stream.available() > 0) {
-        int tag = readRawVarint32(stream);
+        int tag = readRawVarint32WithEOFCheck(stream);
+
         switch (tag) {

Review comment:
       Ah, actually, now that I look at this, ArrowMessage#frame overall doesn't seem right...it shouldn't be checking stream.available() at all, instead it should be looping and reading the first byte, and handing it to readRawVarint32(int, InputStream) if it's >= 0.
   
   I think that would explain the actual issues you are seeing - we should not rely on available() at all and instead use read(). So it should look like
   
   ```java
   while (true) {
       int firstByte = stream.read();
       if (firstByte < 0) {
           // EOF
           break;    
       }
       int tag = readRawVarint32(firstByte, stream);
       // ...
   }
   ```

##########
File path: java/flight/flight-core/src/main/java/org/apache/arrow/flight/ArrowMessage.java
##########
@@ -259,7 +259,8 @@ private static ArrowMessage frame(BufferAllocator allocator, final InputStream s
       ArrowBuf body = null;
       ArrowBuf appMetadata = null;
       while (stream.available() > 0) {
-        int tag = readRawVarint32(stream);
+        int tag = readRawVarint32WithEOFCheck(stream);
+
         switch (tag) {

Review comment:
       As-is, -1 will get silently ignored and that happens to be correct here, fortunately. But we shouldn't be relying on available() at all.

##########
File path: java/flight/flight-core/src/main/java/org/apache/arrow/flight/ArrowMessage.java
##########
@@ -332,6 +333,23 @@ private static ArrowMessage frame(BufferAllocator allocator, final InputStream s
 
   }
 
+  /**
+   * Get first byte with EOF check, it is especially needed when using grpc compression.
+   * InflaterInputStream need another read to change reachEOF after all bytes has been read.
+   *
+   * @param is InputStream
+   * @return -1 if stream is not available, otherwise it will return the actual value.
+   * @throws IOException Read first byte failed.
+   */
+  private static int readRawVarint32WithEOFCheck(InputStream is) throws IOException {
+    int firstByte = is.read();

Review comment:
       Sorry, my read of this is that EOF implies available == 0, but not the other way around. Otherwise, since the base InputStream always returns 0 for available, that would imply it's always at EOF, which is rather useless, right? And the method definition is about the number of bytes that can be read _without blocking_, not the number of bytes that can be read overall.
   
   If empirically, that doesn't match the behavior of the InflaterInputStream, that's fine - but let's make it clear in that case so that it doesn't confuse people without that context.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] stczwd commented on a change in pull request #9147: ARROW-11177: [Java] ArrowMessage failed to parse compressed grpc stream

Posted by GitBox <gi...@apache.org>.
stczwd commented on a change in pull request #9147:
URL: https://github.com/apache/arrow/pull/9147#discussion_r555572500



##########
File path: java/flight/flight-core/src/main/java/org/apache/arrow/flight/ArrowMessage.java
##########
@@ -332,6 +333,23 @@ private static ArrowMessage frame(BufferAllocator allocator, final InputStream s
 
   }
 
+  /**
+   * Get first byte with EOF check, it is especially needed when using grpc compression.
+   * InflaterInputStream need another read to change reachEOF after all bytes has been read.
+   *
+   * @param is InputStream
+   * @return -1 if stream is not available, otherwise it will return the actual value.
+   * @throws IOException Read first byte failed.
+   */

Review comment:
       Great, thanks.

##########
File path: java/flight/flight-core/src/main/java/org/apache/arrow/flight/ArrowMessage.java
##########
@@ -332,6 +333,23 @@ private static ArrowMessage frame(BufferAllocator allocator, final InputStream s
 
   }
 
+  /**
+   * Get first byte with EOF check, it is especially needed when using grpc compression.
+   * InflaterInputStream need another read to change reachEOF after all bytes has been read.
+   *
+   * @param is InputStream
+   * @return -1 if stream is not available, otherwise it will return the actual value.
+   * @throws IOException Read first byte failed.
+   */
+  private static int readRawVarint32WithEOFCheck(InputStream is) throws IOException {
+    int firstByte = is.read();

Review comment:
       what if `firstByte < 0` but `is.available >0`?

##########
File path: java/flight/flight-core/src/main/java/org/apache/arrow/flight/ArrowMessage.java
##########
@@ -259,7 +259,8 @@ private static ArrowMessage frame(BufferAllocator allocator, final InputStream s
       ArrowBuf body = null;
       ArrowBuf appMetadata = null;
       while (stream.available() > 0) {
-        int tag = readRawVarint32(stream);
+        int tag = readRawVarint32WithEOFCheck(stream);
+
         switch (tag) {

Review comment:
       readRawVarint32 will throw InvalidProtocolBufferException before the tag return.

##########
File path: java/flight/flight-core/src/main/java/org/apache/arrow/flight/ArrowMessage.java
##########
@@ -332,6 +333,23 @@ private static ArrowMessage frame(BufferAllocator allocator, final InputStream s
 
   }
 
+  /**
+   * Get first byte with EOF check, it is especially needed when using grpc compression.
+   * InflaterInputStream need another read to change reachEOF after all bytes has been read.
+   *
+   * @param is InputStream
+   * @return -1 if stream is not available, otherwise it will return the actual value.
+   * @throws IOException Read first byte failed.
+   */
+  private static int readRawVarint32WithEOFCheck(InputStream is) throws IOException {
+    int firstByte = is.read();
+    if (is.available() <= 0) {
+      return -1;
+    } else {
+      return CodedInputStream.readRawVarint32(firstByte, is);
+    }
+  }
+
   private static int readRawVarint32(InputStream is) throws IOException {
     int firstByte = is.read();

Review comment:
       Hm, I have thought about this. This function is also called for getting size, we should keep checking of `firstByte < -1`




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] stczwd commented on a change in pull request #9147: ARROW-11177: [Java] ArrowMessage failed to parse compressed grpc stream

Posted by GitBox <gi...@apache.org>.
stczwd commented on a change in pull request #9147:
URL: https://github.com/apache/arrow/pull/9147#discussion_r556234741



##########
File path: java/flight/flight-core/src/main/java/org/apache/arrow/flight/ArrowMessage.java
##########
@@ -332,6 +333,23 @@ private static ArrowMessage frame(BufferAllocator allocator, final InputStream s
 
   }
 
+  /**
+   * Get first byte with EOF check, it is especially needed when using grpc compression.
+   * InflaterInputStream need another read to change reachEOF after all bytes has been read.
+   *
+   * @param is InputStream
+   * @return -1 if stream is not available, otherwise it will return the actual value.
+   * @throws IOException Read first byte failed.
+   */
+  private static int readRawVarint32WithEOFCheck(InputStream is) throws IOException {
+    int firstByte = is.read();
+    if (is.available() <= 0) {
+      return -1;
+    } else {
+      return CodedInputStream.readRawVarint32(firstByte, is);
+    }
+  }
+
   private static int readRawVarint32(InputStream is) throws IOException {
     int firstByte = is.read();

Review comment:
       `readRawVarint32` is called to get tag and byte size, exception is still needed on the byte size reading.
   If the tag readed is -1, it means there's never any more data to read.
   But if the size readed is -1, it means this is bad data and exception is needed.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] github-actions[bot] commented on pull request #9147: ARROW-11177: [Java] ArrowMessage failed to parse compressed grpc stream

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #9147:
URL: https://github.com/apache/arrow/pull/9147#issuecomment-757305199


   https://issues.apache.org/jira/browse/ARROW-11177


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] emkornfield commented on pull request #9147: ARROW-11177: [Java] ArrowMessage failed to parse compressed grpc stream

Posted by GitBox <gi...@apache.org>.
emkornfield commented on pull request #9147:
URL: https://github.com/apache/arrow/pull/9147#issuecomment-838718260


   @stczwd did you want to address feedback here?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] lidavidm commented on a change in pull request #9147: ARROW-11177: [Java] ArrowMessage failed to parse compressed grpc stream

Posted by GitBox <gi...@apache.org>.
lidavidm commented on a change in pull request #9147:
URL: https://github.com/apache/arrow/pull/9147#discussion_r555809311



##########
File path: java/flight/flight-core/src/main/java/org/apache/arrow/flight/ArrowMessage.java
##########
@@ -259,7 +259,8 @@ private static ArrowMessage frame(BufferAllocator allocator, final InputStream s
       ArrowBuf body = null;
       ArrowBuf appMetadata = null;
       while (stream.available() > 0) {
-        int tag = readRawVarint32(stream);
+        int tag = readRawVarint32WithEOFCheck(stream);
+
         switch (tag) {

Review comment:
       That's not true, though, since readRawVarint32WithEOFCheck has a branch that returns -1.

##########
File path: java/flight/flight-core/src/main/java/org/apache/arrow/flight/ArrowMessage.java
##########
@@ -332,6 +333,23 @@ private static ArrowMessage frame(BufferAllocator allocator, final InputStream s
 
   }
 
+  /**
+   * Get first byte with EOF check, it is especially needed when using grpc compression.
+   * InflaterInputStream need another read to change reachEOF after all bytes has been read.
+   *
+   * @param is InputStream
+   * @return -1 if stream is not available, otherwise it will return the actual value.
+   * @throws IOException Read first byte failed.
+   */
+  private static int readRawVarint32WithEOFCheck(InputStream is) throws IOException {
+    int firstByte = is.read();
+    if (is.available() <= 0) {
+      return -1;
+    } else {
+      return CodedInputStream.readRawVarint32(firstByte, is);
+    }
+  }
+
   private static int readRawVarint32(InputStream is) throws IOException {
     int firstByte = is.read();

Review comment:
       I'm not sure what you mean - if is.read() returns -1, the stream is at EOF, so there's never any more data to read.
   
   I also mean, we should consolidate readRawVarint32 and readRawVarint32WithEOFCheck since there's never any reason to proceed if we're at EOF, and furthermore, CodedInputStream#readRawVarint32(int, InputStream) itself doesn't do an EOF check (it assumes the caller has).

##########
File path: java/flight/flight-core/src/main/java/org/apache/arrow/flight/ArrowMessage.java
##########
@@ -332,6 +333,23 @@ private static ArrowMessage frame(BufferAllocator allocator, final InputStream s
 
   }
 
+  /**
+   * Get first byte with EOF check, it is especially needed when using grpc compression.
+   * InflaterInputStream need another read to change reachEOF after all bytes has been read.
+   *
+   * @param is InputStream
+   * @return -1 if stream is not available, otherwise it will return the actual value.
+   * @throws IOException Read first byte failed.
+   */
+  private static int readRawVarint32WithEOFCheck(InputStream is) throws IOException {
+    int firstByte = is.read();

Review comment:
       That would be an incorrect InputStream implementation, since InputStream.read() == -1 is defined to be EOF - there should be no data left.

##########
File path: java/flight/flight-core/src/main/java/org/apache/arrow/flight/ArrowMessage.java
##########
@@ -332,6 +333,23 @@ private static ArrowMessage frame(BufferAllocator allocator, final InputStream s
 
   }
 
+  /**
+   * Get first byte with EOF check, it is especially needed when using grpc compression.
+   * InflaterInputStream need another read to change reachEOF after all bytes has been read.
+   *
+   * @param is InputStream
+   * @return -1 if stream is not available, otherwise it will return the actual value.
+   * @throws IOException Read first byte failed.
+   */
+  private static int readRawVarint32WithEOFCheck(InputStream is) throws IOException {
+    int firstByte = is.read();

Review comment:
       Furthermore, available() is defined to be an estimate; available() == 0 doesn't mean EOF.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] stczwd commented on pull request #9147: ARROW-11177: [Java] ArrowMessage failed to parse compressed grpc stream

Posted by GitBox <gi...@apache.org>.
stczwd commented on pull request #9147:
URL: https://github.com/apache/arrow/pull/9147#issuecomment-849241240


   sure. I have changed the code with checking -1.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org