Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2022/06/30 21:16:31 UTC

[GitHub] [pinot] snleee opened a new issue, #9003: Segment download on the server side can be incomplete

snleee opened a new issue, #9003:
URL: https://github.com/apache/pinot/issues/9003

   ```
   // First attempt via segment upload (FAILED)
   2022/06/30 01:17:37.045 INFO [BaseTableDataManager] Download segment: xxx of table: sales_seat_metrics_additive_OFFLINE as crc changes from: 822932084 to: 976745101
   2022/06/30 01:17:37.082 INFO [BaseTableDataManager] Downloaded tarred segment: xxx for table: yyy from: https://... file length: 21332
   
   // Segment load failed due to EOF Exception.
   2022/06/30 15:01:47.857 ERROR [yyy-SegmentRefreshMessageHandler]  onError: INTERNAL,
   ERROR
   java.io.EOFException: null
           at org.apache.commons.compress.compressors.gzip.GzipCompressorInputStream.read(GzipCompressorInputStream.java:306) ~[commons-compress-1.21.jar:1.21]
           at org.apache.commons.compress.archivers.tar.TarArchiveInputStream.read(TarArchiveInputStream.java:738) ~[commons-compress-1.21.jar:1.21]
           at java.io.InputStream.read(InputStream.java:205) ~[?:?]
           at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1309) ~[commons-io-2.11.0.jar:2.11.0]
           at org.apache.commons.io.IOUtils.copy(IOUtils.java:978) ~[commons-io-2.11.0.jar:2.11.0]
           at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1282) ~[commons-io-2.11.0.jar:2.11.0]
           at org.apache.commons.io.IOUtils.copy(IOUtils.java:953) ~[commons-io-2.11.0.jar:2.11.0]
           at org.apache.pinot.common.utils.TarGzCompressionUtils.untar(TarGzCompressionUtils.java:167) ~[pinot-common-0.11.0-dev-575.jar:0.11.0-dev-575-ae9a1dc26eaff3be719f9804e23a5
   
   // Second attempt via segment refresh (SUCCEEDED)
   2022/06/30 15:13:51.499 INFO [BaseTableDataManager] Download segment: xxx of table: sales_seat_metrics_additive_OFFLINE as crc changes from: 822932084 to: 976745101
   2022/06/30 15:13:51.515 INFO [BaseTableDataManager] Downloaded tarred segment: xxx for table: yyy from: https://... file length: 21440
   ```
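   In both attempts the CRC changed from `822932084` to `976745101`, but the downloaded file lengths differ: `21332` in the first attempt and `21440` in the second. The shorter file from the first attempt suggests that download was incomplete, which is consistent with the `EOFException` thrown while untarring it.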
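
   One possible way to catch such a file right after download, rather than at load time, is to drain the gzip stream once before untarring. The sketch below is only an illustration: it uses the JDK's `java.util.zip.GZIPInputStream` instead of the commons-compress classes in the stack trace, and `GzipIntegrityCheck`, `isCompleteGzip`, and `gzip` are hypothetical names, not Pinot APIs.

   ```java
   import java.io.ByteArrayInputStream;
   import java.io.ByteArrayOutputStream;
   import java.io.IOException;
   import java.io.InputStream;
   import java.io.UncheckedIOException;
   import java.util.Arrays;
   import java.util.zip.GZIPInputStream;
   import java.util.zip.GZIPOutputStream;

   public class GzipIntegrityCheck {
     /** Returns true only if the whole gzip stream, trailer included, can be read. */
     static boolean isCompleteGzip(InputStream in) {
       byte[] buffer = new byte[8192];
       try (GZIPInputStream gzip = new GZIPInputStream(in)) {
         while (gzip.read(buffer) != -1) {
           // Drain the stream; the decompressed bytes are discarded.
         }
         return true;
       } catch (IOException e) {
         // A truncated or corrupted stream surfaces as EOFException or ZipException.
         return false;
       }
     }

     /** Helper for the demo only: gzips a byte array in memory. */
     static byte[] gzip(byte[] data) {
       ByteArrayOutputStream bytes = new ByteArrayOutputStream();
       try (GZIPOutputStream out = new GZIPOutputStream(bytes)) {
         out.write(data);
       } catch (IOException e) {
         throw new UncheckedIOException(e);
       }
       return bytes.toByteArray();
     }

     public static void main(String[] args) {
       byte[] complete = gzip(new byte[50_000]);
       // Simulate an incomplete download by cutting the file in half.
       byte[] truncated = Arrays.copyOf(complete, complete.length / 2);
       System.out.println("complete:  " + isCompleteGzip(new ByteArrayInputStream(complete)));
       System.out.println("truncated: " + isCompleteGzip(new ByteArrayInputStream(truncated)));
     }
   }
   ```

   A check like this costs one extra pass over the file, so it may only be worth running when no expected length or checksum of the tarred file is available.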
   
   
   1. Investigate why a segment download can end up with missing or wrong bytes.
   2. Add retries to the HTTP segment fetcher.
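
   As a rough sketch of the second item, a retry loop could re-run the download whenever it throws or returns an unexpected number of bytes. Everything here is an assumption made for illustration: `RetryingFetch`, `Download`, and `fetchWithRetry` are hypothetical names rather than Pinot classes, and the expected length is assumed to be known up front (for example from a `Content-Length` header).

   ```java
   import java.io.IOException;
   import java.io.UncheckedIOException;

   public class RetryingFetch {
     /** Hypothetical callback: performs one download and returns the bytes written. */
     interface Download {
       long fetch() throws IOException;
     }

     /** Retries with exponential backoff until the length matches or attempts run out. */
     static long fetchWithRetry(Download download, long expectedLength, int maxAttempts,
         long initialBackoffMs) {
       if (maxAttempts < 1) {
         throw new IllegalArgumentException("maxAttempts must be at least 1");
       }
       IOException lastFailure = null;
       long backoffMs = initialBackoffMs;
       for (int attempt = 1; attempt <= maxAttempts; attempt++) {
         try {
           long actualLength = download.fetch();
           if (actualLength == expectedLength) {
             return actualLength;
           }
           lastFailure = new IOException("Attempt " + attempt + ": expected "
               + expectedLength + " bytes but got " + actualLength);
         } catch (IOException e) {
           lastFailure = e;
         }
         if (attempt < maxAttempts) {
           try {
             Thread.sleep(backoffMs);
           } catch (InterruptedException e) {
             Thread.currentThread().interrupt();
             throw new IllegalStateException("Interrupted while waiting to retry", e);
           }
           backoffMs *= 2;
         }
       }
       throw new UncheckedIOException(
           "Download failed after " + maxAttempts + " attempts", lastFailure);
     }

     public static void main(String[] args) {
       // Simulated fetcher using the lengths from the log: short first, complete second.
       int[] calls = {0};
       long length = fetchWithRetry(() -> ++calls[0] == 1 ? 21332L : 21440L, 21440L, 3, 100L);
       System.out.println("Downloaded " + length + " bytes after " + calls[0] + " attempts");
     }
   }
   ```

   Comparing lengths only catches truncation. Detecting wrong bytes of the same length would need a checksum of the tarred file, which this sketch does not cover.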


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org
