Posted to commits@doris.apache.org by GitBox <gi...@apache.org> on 2021/07/27 14:29:22 UTC

[GitHub] [incubator-doris] huozhanfeng commented on a change in pull request #6308: [Bug] Fix bug that broker can't load OSS/S3A files

huozhanfeng commented on a change in pull request #6308:
URL: https://github.com/apache/incubator-doris/pull/6308#discussion_r677506934



##########
File path: fs_brokers/apache_hdfs_broker/src/main/java/org/apache/doris/broker/hdfs/FileSystemManager.java
##########
@@ -561,22 +561,25 @@ public ByteBuffer pread(TBrokerFD fd, long offset, long length) {
                             currentStreamOffset, offset);
                 }
             }
-            ByteBuffer buf;
+            // Avoid using the ByteBuffer based read for Hadoop because some FSDataInputStream
+            // implementations are not ByteBufferReadable,
+            // See https://issues.apache.org/jira/browse/HADOOP-14603
+            byte[] buf;
             if (length > readBufferSize) {
-                buf = ByteBuffer.allocate(readBufferSize);
+                buf = new byte[readBufferSize];

Review comment:
       Ehh... I don't think `ByteBuffer` can solve this problem. It only depends on the size of the buffer we initialize and whether the read can fill it. In that respect, `ByteBuffer` should behave the same as a `byte array`.
   
   I tested it with both `ByteBuffer` and `byte array`, and the behavior is the same when `readBufferSize` is larger than 128 KB: neither of them reads more than 128 KB of data. Here is the debug code and part of the log.
   <pre>
   logger.debug("read buffer from input stream, request.length " + length + ", readBufferSize:" + readBufferSize +", buffer size:" + buf.length + ", read length:" + readLength);
                   
   2021-07-27 09:57:04  [ pool-2-thread-4:31261 ] - [ INFO ]  read buffer from input stream, request.length 131072, readBufferSize:1048576, buffer size:131072, read length:131072
   2021-07-27 09:57:04  [ pool-2-thread-4:31268 ] - [ INFO ]  read buffer from input stream, request.length 131072, readBufferSize:1048576, buffer size:131072, read length:131072
   2021-07-27 09:57:04  [ pool-2-thread-4:31273 ] - [ INFO ]  read buffer from input stream, request.length 17612, readBufferSize:1048576, buffer size:17612, read length:17612
   2021-07-27 09:57:04  [ pool-2-thread-4:31275 ] - [ INFO ]  read buffer from input stream, request.length 186, readBufferSize:1048576, buffer size:186, read length:186
   2021-07-27 09:57:04  [ pool-2-thread-4:31277 ] - [ INFO ]  read buffer from input stream, request.length 680, readBufferSize:1048576, buffer size:680, read length:680
   </pre>
   
   I guess the root cause is that `TBrokerPReadRequest.length` in the RPC request is never more than 128 KB, which is controlled by the client. I don't have a BE dev environment right now, so maybe you can help take a look 😁
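
To illustrate the point, here is a minimal, self-contained sketch (not the broker code). It assumes the read size is capped at `min(length, readBufferSize)`, as in the diff above, and uses a plain `ByteArrayInputStream` as a stand-in for the Hadoop input stream. The class name `PreadSizeSketch`, the constant `READ_BUFFER_SIZE`, and both helper methods are hypothetical names introduced only for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;

public class PreadSizeSketch {
    // Assumption: stands in for the broker's readBufferSize (1 MB, as in the log above).
    static final int READ_BUFFER_SIZE = 1048576;

    // Reads up to min(length, READ_BUFFER_SIZE) bytes into a plain byte array.
    static int readWithByteArray(InputStream in, long length) throws IOException {
        byte[] buf = new byte[(int) Math.min(length, READ_BUFFER_SIZE)];
        int total = 0;
        while (total < buf.length) {
            int n = in.read(buf, total, buf.length - total);
            if (n < 0) {
                break; // end of stream
            }
            total += n;
        }
        return total;
    }

    // Same read, but the destination is a heap ByteBuffer of the same capacity.
    static int readWithByteBuffer(InputStream in, long length) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate((int) Math.min(length, READ_BUFFER_SIZE));
        while (buf.hasRemaining()) {
            int n = in.read(buf.array(), buf.position(), buf.remaining());
            if (n < 0) {
                break; // end of stream
            }
            buf.position(buf.position() + n);
        }
        return buf.position();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[4 * 1048576]; // 4 MB of dummy data
        long requested = 131072;             // 128 KB, like request.length in the log

        System.out.println(readWithByteArray(new ByteArrayInputStream(data), requested));
        System.out.println(readWithByteBuffer(new ByteArrayInputStream(data), requested));
    }
}
```

In this sketch both variants return the same byte count for the same `length`, so the amount read per call is bounded by what the caller asks for, not by the buffer type.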




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


