Posted to commits@hudi.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2021/01/16 02:27:00 UTC

[jira] [Updated] (HUDI-1532) Super slow magic sequence search within the log files on GCS

     [ https://issues.apache.org/jira/browse/HUDI-1532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated HUDI-1532:
---------------------------------
    Labels: pull-request-available  (was: )

> Super slow magic sequence search within the log files on GCS 
> -------------------------------------------------------------
>
>                 Key: HUDI-1532
>                 URL: https://issues.apache.org/jira/browse/HUDI-1532
>             Project: Apache Hudi
>          Issue Type: Bug
>          Components: DeltaStreamer
>            Reporter: Volodymyr Burenin
>            Priority: Major
>              Labels: pull-request-available
>
> HoodieDeltaStreamer freezes for a very long time (days) when scanning through a log file looking for the magic sequence. The Java stack trace points to this location:
> ```
> "Executor task launch worker for task 183" #233 daemon prio=5 os_prio=0 tid=0x00005629fb650000 nid=0xff runnable [0x00007f378a433000]
>    java.lang.Thread.State: RUNNABLE
> 	at java.net.SocketInputStream.socketRead0(Native Method)
> 	at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
> 	at java.net.SocketInputStream.read(SocketInputStream.java:171)
> 	at java.net.SocketInputStream.read(SocketInputStream.java:141)
> 	at sun.security.ssl.InputRecord.readFully(InputRecord.java:465)
> 	at sun.security.ssl.InputRecord.read(InputRecord.java:503)
> 	at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:975)
> 	- locked <0x00000007ba998608> (a java.lang.Object)
> 	at sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:933)
> 	at sun.security.ssl.AppInputStream.read(AppInputStream.java:105)
> 	- locked <0x00000007ba9995c0> (a sun.security.ssl.AppInputStream)
> 	at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
> 	at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
> 	at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
> 	- locked <0x0000000779e6f360> (a java.io.BufferedInputStream)
> 	at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:735)
> 	at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:678)
> 	at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1587)
> 	- locked <0x0000000779e69ba0> (a sun.net.www.protocol.https.DelegateHttpsURLConnection)
> 	at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1492)
> 	- locked <0x0000000779e69ba0> (a sun.net.www.protocol.https.DelegateHttpsURLConnection)
> 	at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:480)
> 	at sun.net.www.protocol.https.HttpsURLConnectionImpl.getResponseCode(HttpsURLConnectionImpl.java:347)
> 	at com.google.cloud.hadoop.repackaged.gcs.com.google.api.client.http.javanet.NetHttpResponse.<init>(NetHttpResponse.java:36)
> 	at com.google.cloud.hadoop.repackaged.gcs.com.google.api.client.http.javanet.NetHttpRequest.execute(NetHttpRequest.java:144)
> 	at com.google.cloud.hadoop.repackaged.gcs.com.google.api.client.http.javanet.NetHttpRequest.execute(NetHttpRequest.java:79)
> 	at com.google.cloud.hadoop.repackaged.gcs.com.google.api.client.http.HttpRequest.execute(HttpRequest.java:995)
> 	at com.google.cloud.hadoop.repackaged.gcs.com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:549)
> 	at com.google.cloud.hadoop.repackaged.gcs.com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:482)
> 	at com.google.cloud.hadoop.repackaged.gcs.com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeMedia(AbstractGoogleClientRequest.java:510)
> 	at com.google.cloud.hadoop.repackaged.gcs.com.google.api.services.storage.Storage$Objects$Get.executeMedia(Storage.java:6981)
> 	at com.google.cloud.hadoop.repackaged.gcs.com.google.cloud.hadoop.gcsio.GoogleCloudStorageReadChannel.openStream(GoogleCloudStorageReadChannel.java:967)
> 	at com.google.cloud.hadoop.repackaged.gcs.com.google.cloud.hadoop.gcsio.GoogleCloudStorageReadChannel.openContentChannel(GoogleCloudStorageReadChannel.java:772)
> 	at com.google.cloud.hadoop.repackaged.gcs.com.google.cloud.hadoop.gcsio.GoogleCloudStorageReadChannel.performLazySeek(GoogleCloudStorageReadChannel.java:763)
> 	at com.google.cloud.hadoop.repackaged.gcs.com.google.cloud.hadoop.gcsio.GoogleCloudStorageReadChannel.read(GoogleCloudStorageReadChannel.java:365)
> 	at com.google.cloud.hadoop.fs.gcs.GoogleHadoopFSInputStream.read(GoogleHadoopFSInputStream.java:131)
> 	- locked <0x0000000616319fb8> (a com.google.cloud.hadoop.fs.gcs.GoogleHadoopFSInputStream)
> 	at java.io.DataInputStream.read(DataInputStream.java:149)
> 	at java.io.DataInputStream.readFully(DataInputStream.java:195)
> 	at org.apache.hudi.common.table.log.HoodieLogFileReader.hasNextMagic(HoodieLogFileReader.java:339)
> 	at org.apache.hudi.common.table.log.HoodieLogFileReader.scanForNextAvailableBlockOffset(HoodieLogFileReader.java:280)
> 	at org.apache.hudi.common.table.log.HoodieLogFileReader.createCorruptBlock(HoodieLogFileReader.java:221)
> 	at org.apache.hudi.common.table.log.HoodieLogFileReader.readBlock(HoodieLogFileReader.java:147)
> 	at org.apache.hudi.common.table.log.HoodieLogFileReader.next(HoodieLogFileReader.java:347)
> ```
> After deeper investigation I discovered that this happens due to unbuffered access to GCS, combined with an inefficient search algorithm: at each candidate offset it issues a request to GCS to read the next 6 bytes, then advances the offset by 1 and tries again. With 50-60 ms of latency per request and 5-6 MB log files, the scan can take days.
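
The byte-at-a-time search pattern described in the report can be sketched as follows. This is a simplified illustration, not Hudi's actual code: the magic value, class name, and method names are all assumptions made for the example.

```java
import java.util.Arrays;

public class MagicScanSketch {

    // Hypothetical 6-byte magic sequence (the real value lives in Hudi's log format).
    static final byte[] MAGIC = {'#', 'H', 'U', 'D', 'I', '#'};

    // Naive scan: at every offset, compare the next 6 bytes against MAGIC and
    // advance by one byte on a mismatch. On an unbuffered GCS stream, each such
    // 6-byte read can turn into a separate ranged HTTPS request.
    static long scanForMagic(byte[] data, int start) {
        for (int pos = start; pos + MAGIC.length <= data.length; pos++) {
            byte[] window = Arrays.copyOfRange(data, pos, pos + MAGIC.length);
            if (Arrays.equals(window, MAGIC)) {
                return pos;
            }
        }
        return -1;
    }

    // Rough cost of the naive scan against a remote store:
    // one round trip per offset tried.
    static double estimatedHours(long fileSizeBytes, double latencyMs) {
        return fileSizeBytes * latencyMs / 1000.0 / 3600.0;
    }

    public static void main(String[] args) {
        byte[] file = new byte[64];
        System.arraycopy(MAGIC, 0, file, 42, MAGIC.length);
        System.out.println(scanForMagic(file, 0)); // prints 42

        // ~5 MB at ~55 ms per read: on the order of days, matching the report.
        System.out.printf("%.0f hours%n", estimatedHours(5L * 1024 * 1024, 55));
    }
}
```

Wrapping the remote stream in a buffered reader, or fetching the file in large chunks and scanning in memory, would amortize the per-request latency over many offsets instead of paying it on every byte.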



--
This message was sent by Atlassian Jira
(v8.3.4#803005)