Posted to commits@hudi.apache.org by "Alexander Filipchik (Jira)" <ji...@apache.org> on 2020/04/11 21:35:00 UTC

[jira] [Created] (HUDI-784) CorruptedLogFileException sometimes happens on GCS

Alexander Filipchik created HUDI-784:
----------------------------------------

             Summary: CorruptedLogFileException sometimes happens on GCS
                 Key: HUDI-784
                 URL: https://issues.apache.org/jira/browse/HUDI-784
             Project: Apache Hudi (incubating)
          Issue Type: Bug
            Reporter: Alexander Filipchik


768726 [Executor task launch worker-2] ERROR org.apache.hudi.common.table.log.AbstractHoodieLogRecordScanner  - Got exception when reading log file
org.apache.hudi.exception.CorruptedLogFileException: HoodieLogFile{pathStr='
[gs://.log.|gs://1_20200219014757.log.2]
', fileLen=0}could not be read. Did not find the magic bytes at the start of the block
	at org.apache.hudi.common.table.log.HoodieLogFileReader.readMagic(HoodieLogFileReader.java:313)
	at org.apache.hudi.common.table.log.HoodieLogFileReader.hasNext(HoodieLogFileReader.java:295)
	at org.apache.hudi.common.table.log.HoodieLogFormatReader.hasNext(HoodieLogFormatReader.java:103)
 

I did extensive debugging and it is still unclear why this happens. It might be an issue with the GCS libraries themselves. The fix that works for me:

 

In HoodieLogFileReader, I made
{code:java}
// private final byte[] magicBuffer = new byte[6];
{code}
non-static. I'm not sure why it is static in the first place, since a buffer shared by all reader instances invites a race.
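
To illustrate the hazard, here is a minimal sketch (not Hudi code) of two readers that share one static scratch buffer, next to two readers that each own their buffer. The class and method names are hypothetical, and the "#HUDI#" magic value is an assumption used only for illustration. The interleaving that concurrent threads could produce is simulated deterministically by splitting the read into a fill step and a check step.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class MagicBufferDemo {
    // Assumed 6-byte magic marker, for illustration only.
    static final byte[] MAGIC = "#HUDI#".getBytes(StandardCharsets.US_ASCII);

    // Hazardous layout: one scratch buffer shared by every instance.
    static class SharedBufferReader {
        private static final byte[] BUF = new byte[6];
        private final byte[] source;
        SharedBufferReader(byte[] source) { this.source = source; }
        void fill() { System.arraycopy(source, 0, BUF, 0, BUF.length); }
        boolean matches() { return Arrays.equals(BUF, MAGIC); }
    }

    // Proposed layout: each instance owns its scratch buffer.
    static class OwnBufferReader {
        private final byte[] buf = new byte[6];
        private final byte[] source;
        OwnBufferReader(byte[] source) { this.source = source; }
        void fill() { System.arraycopy(source, 0, buf, 0, buf.length); }
        boolean matches() { return Arrays.equals(buf, MAGIC); }
    }

    static final byte[] OTHER = "abcdef".getBytes(StandardCharsets.US_ASCII);

    // Reader A fills, reader B fills, then A checks: B has overwritten A's bytes.
    static boolean sharedBufferSeesMagic() {
        SharedBufferReader a = new SharedBufferReader(MAGIC);
        SharedBufferReader b = new SharedBufferReader(OTHER);
        a.fill();
        b.fill();
        return a.matches();
    }

    // Same interleaving, but A's bytes survive because the buffers are separate.
    static boolean ownBufferSeesMagic() {
        OwnBufferReader a = new OwnBufferReader(MAGIC);
        OwnBufferReader b = new OwnBufferReader(OTHER);
        a.fill();
        b.fill();
        return a.matches();
    }

    public static void main(String[] args) {
        System.out.println("shared buffer sees magic: " + sharedBufferSeesMagic());
        System.out.println("own buffer sees magic:    " + ownBufferSeesMagic());
    }
}
```

With the shared buffer, reader A fails the magic check even though its own input starts with the magic bytes. Whether this interleaving is what actually occurs on GCS is not established here.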

Also in HoodieLogFileReader, I added
{code:java}
// fsDataInputStream.seek(0);
{code}
right after stream creation in the constructor.
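
The defensive pattern behind the seek can be sketched as follows. This is not Hudi or Hadoop code: a java.nio.ByteBuffer stands in for the seekable input stream, the class and method names are hypothetical, and the "#HUDI#" magic value is an assumption. The sketch only shows that a magic check depends on the starting position, and that resetting the position first removes that dependency. It does not explain why a freshly opened stream on GCS would be anywhere other than offset 0.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class SeekBeforeMagicDemo {
    // Assumed 6-byte magic marker, for illustration only.
    static final byte[] MAGIC = "#HUDI#".getBytes(StandardCharsets.US_ASCII);

    // Reads MAGIC.length bytes from the current position and compares them.
    static boolean readMagic(ByteBuffer in) {
        if (in.remaining() < MAGIC.length) {
            return false;
        }
        byte[] buf = new byte[MAGIC.length];
        in.get(buf);
        return Arrays.equals(buf, MAGIC);
    }

    // Hypothetical "open" that may hand back a stream at an arbitrary position.
    static ByteBuffer openAt(int position) {
        byte[] content = "#HUDI#payload-bytes".getBytes(StandardCharsets.US_ASCII);
        ByteBuffer in = ByteBuffer.wrap(content);
        in.position(position);
        return in;
    }

    // Trusts whatever position the stream was opened at.
    static boolean checkWithoutSeek(int startPosition) {
        return readMagic(openAt(startPosition));
    }

    // Resets to offset 0 first (stand-in for fsDataInputStream.seek(0)).
    static boolean checkWithSeek(int startPosition) {
        ByteBuffer in = openAt(startPosition);
        in.position(0);
        return readMagic(in);
    }

    public static void main(String[] args) {
        System.out.println("no seek, start=0: " + checkWithoutSeek(0));
        System.out.println("no seek, start=3: " + checkWithoutSeek(3));
        System.out.println("seek(0), start=3: " + checkWithSeek(3));
    }
}
```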

--
This message was sent by Atlassian Jira
(v8.3.4#803005)