Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2022/09/05 15:31:21 UTC

[GitHub] [hudi] loukey-lj opened a new pull request, #6602: [HUDI-4780] hoodie.logfile.max.size It does not take effect, causing the log file to be too large

loukey-lj opened a new pull request, #6602:
URL: https://github.com/apache/hudi/pull/6602

   ### Change Logs
   
   _Describe context and summary for this change. Highlight if any code was copied._
   
   ### Impact
   
   _Describe any public API or user-facing feature change or any performance impact._
   
   **Risk level: none | low | medium | high**
   
   _Choose one. If medium or high, explain what verification was done to mitigate the risks._
   
   ### Contributor's checklist
   
   - [ ] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [ ] Change Logs and Impact were stated clearly
   - [ ] Adequate tests were added if applicable
   - [ ] CI passed
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] yihua commented on pull request #6602: [HUDI-4780] hoodie.logfile.max.size It does not take effect, causing the log file to be too large

Posted by GitBox <gi...@apache.org>.
yihua commented on PR #6602:
URL: https://github.com/apache/hudi/pull/6602#issuecomment-1246019433

   > fix looks ok to me. Were you able to test out the patch? also, is it possible to write a test to validate the fix?
   
   +1.  @loukey-lj could you write a test for the change?




[GitHub] [hudi] loukey-lj commented on a diff in pull request #6602: [HUDI-4780] hoodie.logfile.max.size It does not take effect, causing the log file to be too large

Posted by GitBox <gi...@apache.org>.
loukey-lj commented on code in PR #6602:
URL: https://github.com/apache/hudi/pull/6602#discussion_r963168938


##########
hudi-common/src/main/java/org/apache/hudi/common/table/log/HoodieLogFormatWriter.java:
##########
@@ -270,7 +270,7 @@ public long getCurrentSize() throws IOException {
     if (output == null) {
       return 0;
     }
-    return output.getPos();
+    return output.getPos() + logFile.getFileSize();

Review Comment:
   @yihua When appending data to an existing log file, the position of the stream returned by `org.apache.hudi.common.table.log.HoodieLogFormatWriter#getOutputStream` always starts at 0. After a flush, `org.apache.hudi.common.table.log.HoodieLogFormatWriter#getCurrentSize` therefore returns only the size of the appended data, not the total size of the file. You can debug this with the following code.
   
   import org.apache.flink.api.java.tuple.Tuple3;
   import org.apache.flink.streaming.api.datastream.DataStreamSource;
   import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
   import org.apache.flink.streaming.api.functions.source.SourceFunction;
   import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;

   // Wrapper class added so the snippet compiles; the class name is arbitrary.
   public class LogFileSizeRepro {
     public static void main(String[] args) {
       StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
       env.enableCheckpointing(1000 * 20); // checkpoint every 20 seconds
       env.setParallelism(1);

       StreamTableEnvironment tableEnvironment = StreamTableEnvironment.create(env);
       // Emit one record every 10 seconds, always with the same key "1".
       final DataStreamSource<Tuple3<String, Long, Long>> tuple3DataStreamSource =
           env.addSource(new SourceFunction<Tuple3<String, Long, Long>>() {
             @Override
             public void run(SourceContext<Tuple3<String, Long, Long>> ctx) throws Exception {
               while (!Thread.interrupted()) {
                 ctx.collect(new Tuple3<>("1", System.currentTimeMillis(), System.currentTimeMillis()));
                 Thread.sleep(1000 * 10);
               }
             }

             @Override
             public void cancel() {
             }
           });

       tableEnvironment.createTemporaryView("s", tuple3DataStreamSource);

       // MERGE_ON_READ table, so every checkpoint appends to the log file of the same file group.
       tableEnvironment.executeSql(
           "create table if not exists h(\n"
               + "  `id` string PRIMARY KEY NOT ENFORCED,\n"
               + "  `ts` bigint,\n"
               + "  `time` bigint\n"
               + ") with (\n"
               + "  'connector' = 'hudi',\n"
               + "  'write.bucket_assign.tasks' = '1',\n"
               + "  'hoodie.datasource.write.keygenerator.class' = 'org.apache.hudi.keygen.SimpleAvroKeyGenerator',\n"
               + "  'table.type' = 'MERGE_ON_READ',\n"
               + "  'hive_sync.enable' = 'false',\n"
               + "  'write.tasks' = '1',\n"
               + "  'path' = 'hdfs://xx',\n"
               + "  'hoodie.cleaner.commits.retained' = '1'\n"
               + ")");

       tableEnvironment.executeSql("insert into h select * from s");
     }
   }
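
The symptom described in this comment can also be seen in isolation, without Flink or HDFS. The following is a minimal sketch under stated assumptions, not Hudi or Hadoop code: `ZeroBasedCountingStream` and `AppendSizeDemo` are hypothetical names, and a local temporary file opened in append mode stands in for an existing log file. The wrapper counts bytes from 0, so after appending it reports only the appended bytes.

```java
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical stand-in for a position-tracking wrapper; not a Hudi or Hadoop class.
class ZeroBasedCountingStream extends FilterOutputStream {
  private long pos = 0; // always starts at 0, even if the target file already holds data

  ZeroBasedCountingStream(OutputStream out) {
    super(out);
  }

  @Override
  public void write(int b) throws IOException {
    out.write(b);
    pos++;
  }

  @Override
  public void write(byte[] b, int off, int len) throws IOException {
    out.write(b, off, len);
    pos += len;
  }

  long getPos() {
    return pos;
  }
}

class AppendSizeDemo {
  // Returns {size reported by the wrapper, actual file size} after appending to an existing file.
  static long[] reportedVsActual() {
    try {
      Path file = Files.createTempFile("log-demo", ".log");
      try {
        Files.write(file, "0123456789".getBytes(StandardCharsets.US_ASCII)); // 10 existing bytes
        long reported;
        try (ZeroBasedCountingStream s = new ZeroBasedCountingStream(
            Files.newOutputStream(file, StandardOpenOption.APPEND))) {
          s.write("abcde".getBytes(StandardCharsets.US_ASCII)); // append 5 more bytes
          s.flush();
          reported = s.getPos();
        }
        return new long[] {reported, Files.size(file)};
      } finally {
        Files.deleteIfExists(file);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    long[] r = reportedVsActual();
    System.out.println("reported=" + r[0] + ", actual=" + r[1]); // prints "reported=5, actual=15"
  }
}
```

A size check that relies on the reported value sees 5 bytes here, while the file on disk is 15 bytes long.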
   
   
   
   
   





[GitHub] [hudi] hudi-bot commented on pull request #6602: [HUDI-4780] hoodie.logfile.max.size It does not take effect, causing the log file to be too large

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6602:
URL: https://github.com/apache/hudi/pull/6602#issuecomment-1237285375

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "8dad0d3818280e2b7d80e536ebb65abde9a581c7",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11168",
       "triggerID" : "8dad0d3818280e2b7d80e536ebb65abde9a581c7",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 8dad0d3818280e2b7d80e536ebb65abde9a581c7 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11168) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] guanziyue commented on pull request #6602: [HUDI-4780] hoodie.logfile.max.size It does not take effect, causing the log file to be too large

Posted by GitBox <gi...@apache.org>.
guanziyue commented on PR #6602:
URL: https://github.com/apache/hudi/pull/6602#issuecomment-1242914624

   > > > SizeAwareFSDataOutputStream
   > > 
   > > 
   > > Thanks for your clarification. I agree with you that the correct way is to change the result returned by SizeAwareFSDataOutputStream.
   > 
   > @guanziyue SizeAwareFSDataOutputStream updated.
   
   LGTM, but you may want to request a review from a committer or PMC member. I'm interested in this PR because I previously changed some of the stream code in the log writer.




[GitHub] [hudi] hudi-bot commented on pull request #6602: [HUDI-4780] hoodie.logfile.max.size It does not take effect, causing the log file to be too large

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6602:
URL: https://github.com/apache/hudi/pull/6602#issuecomment-1242161828

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "8dad0d3818280e2b7d80e536ebb65abde9a581c7",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11168",
       "triggerID" : "8dad0d3818280e2b7d80e536ebb65abde9a581c7",
       "triggerType" : "PUSH"
     }, {
       "hash" : "cc8badbbe36b955d4099f59a4bbde9fa369baf4f",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11271",
       "triggerID" : "cc8badbbe36b955d4099f59a4bbde9fa369baf4f",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 8dad0d3818280e2b7d80e536ebb65abde9a581c7 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11168) 
   * cc8badbbe36b955d4099f59a4bbde9fa369baf4f Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11271) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #6602: [HUDI-4780] hoodie.logfile.max.size It does not take effect, causing the log file to be too large

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6602:
URL: https://github.com/apache/hudi/pull/6602#issuecomment-1242155015

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "8dad0d3818280e2b7d80e536ebb65abde9a581c7",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11168",
       "triggerID" : "8dad0d3818280e2b7d80e536ebb65abde9a581c7",
       "triggerType" : "PUSH"
     }, {
       "hash" : "cc8badbbe36b955d4099f59a4bbde9fa369baf4f",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "cc8badbbe36b955d4099f59a4bbde9fa369baf4f",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 8dad0d3818280e2b7d80e536ebb65abde9a581c7 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11168) 
   * cc8badbbe36b955d4099f59a4bbde9fa369baf4f UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] loukey-lj commented on pull request #6602: [HUDI-4780] hoodie.logfile.max.size It does not take effect, causing the log file to be too large

Posted by GitBox <gi...@apache.org>.
loukey-lj commented on PR #6602:
URL: https://github.com/apache/hudi/pull/6602#issuecomment-1242882301

   > > SizeAwareFSDataOutputStream
   > 
   > Thanks for your clarification. I agree with you that the correct way is to change the result returned by SizeAwareFSDataOutputStream.
   
   @guanziyue SizeAwareFSDataOutputStream updated.




[GitHub] [hudi] yihua commented on a diff in pull request #6602: [HUDI-4780] hoodie.logfile.max.size It does not take effect, causing the log file to be too large

Posted by GitBox <gi...@apache.org>.
yihua commented on code in PR #6602:
URL: https://github.com/apache/hudi/pull/6602#discussion_r970137636


##########
hudi-common/src/main/java/org/apache/hudi/common/fs/SizeAwareFSDataOutputStream.java:
##########
@@ -44,7 +44,7 @@ public class SizeAwareFSDataOutputStream extends FSDataOutputStream {
 
   public SizeAwareFSDataOutputStream(Path path, FSDataOutputStream out, ConsistencyGuard consistencyGuard,
                                      Runnable closeCallback) throws IOException {
-    super(out, null);
+    super(out, null, out.getPos());

Review Comment:
   After digging into the code, the start position passed in here is only used by `FSDataOutputStream.PositionCache` to track the current position.  So this is okay.
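
To illustrate why seeding the start position is safe, here is a minimal sketch that models a position-tracking wrapper with plain `java.io` streams. `SeededPositionStream` and `StartPositionDemo` are hypothetical names, and the model assumes, as concluded in this comment, that the start position feeds only the position counter and never moves the point at which bytes are written.

```java
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical model of a wrapper whose position counter can be seeded; not Hadoop's class.
class SeededPositionStream extends FilterOutputStream {
  private long pos;

  SeededPositionStream(OutputStream out, long startPosition) {
    super(out);
    this.pos = startPosition; // bookkeeping only; the underlying stream decides where bytes land
  }

  @Override
  public void write(int b) throws IOException {
    out.write(b);
    pos++;
  }

  @Override
  public void write(byte[] b, int off, int len) throws IOException {
    out.write(b, off, len);
    pos += len;
  }

  long getPos() {
    return pos;
  }
}

class StartPositionDemo {
  // Appends `more` to a sink that already holds `existing`, wrapping it with the given
  // start position. Returns the sink content, then '@', then the reported position.
  static String appendWithStart(String existing, String more, long startPosition) {
    try {
      ByteArrayOutputStream sink = new ByteArrayOutputStream();
      sink.write(existing.getBytes(StandardCharsets.US_ASCII));
      SeededPositionStream s = new SeededPositionStream(sink, startPosition);
      s.write(more.getBytes(StandardCharsets.US_ASCII));
      s.flush();
      return new String(sink.toByteArray(), StandardCharsets.US_ASCII) + "@" + s.getPos();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(appendWithStart("abc", "de", 0)); // prints "abcde@2"
    System.out.println(appendWithStart("abc", "de", 3)); // prints "abcde@5"
  }
}
```

In both runs the existing bytes stay intact and the new bytes are appended after them. Only the reported position differs, and only the seeded variant matches the total size.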





[GitHub] [hudi] hudi-bot commented on pull request #6602: [HUDI-4780] hoodie.logfile.max.size It does not take effect, causing the log file to be too large

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6602:
URL: https://github.com/apache/hudi/pull/6602#issuecomment-1237502904

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "8dad0d3818280e2b7d80e536ebb65abde9a581c7",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11168",
       "triggerID" : "8dad0d3818280e2b7d80e536ebb65abde9a581c7",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 8dad0d3818280e2b7d80e536ebb65abde9a581c7 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11168) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #6602: [HUDI-4780] hoodie.logfile.max.size It does not take effect, causing the log file to be too large

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6602:
URL: https://github.com/apache/hudi/pull/6602#issuecomment-1237279522

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "8dad0d3818280e2b7d80e536ebb65abde9a581c7",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "8dad0d3818280e2b7d80e536ebb65abde9a581c7",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 8dad0d3818280e2b7d80e536ebb65abde9a581c7 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] loukey-lj commented on pull request #6602: [HUDI-4780] hoodie.logfile.max.size It does not take effect, causing the log file to be too large

Posted by GitBox <gi...@apache.org>.
loukey-lj commented on PR #6602:
URL: https://github.com/apache/hudi/pull/6602#issuecomment-1240324396

   > Hi loukey-lj, could you share which kind of filesystem you use: HDFS, S3, or another type? The problem you mentioned should be covered by the unit test TestHoodieLogFormat#testRollover. I just tried it on the master code and went over the HDFS implementation code; it behaves correctly.
   
   @guanziyue Thanks for the review. The OutputStream used by the test class differs from the one used when the actual task runs: testRollover uses HdfsDataOutputStream, while the task uses SizeAwareFSDataOutputStream, whose startPosition is always 0.
   Maybe the right way to fix this issue is to change the constructor of SizeAwareFSDataOutputStream from `super(out, null)` to `super(out, null, out.getPos())`, but it is used in so many places that I can't assess the impact of this change.
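
The effect of a zero start position on size-based rollover can be sketched with a small simulation. This is a simplified model under stated assumptions, not Hudi's actual rollover logic: `RolloverDemo` and its parameters are hypothetical, the writer is assumed to reopen the current file for each append session, and it is assumed to roll over to a new file once the size it believes it has reached exceeds the threshold.

```java
import java.util.ArrayList;
import java.util.List;

class RolloverDemo {
  // Runs `sessions` append sessions of `bytesPerSession` bytes each and returns the final
  // size of every file produced. When `seedWithExistingSize` is false, the position seen by
  // the writer restarts at 0 in every session, mimicking a stream whose start position is 0.
  static List<Long> simulate(int sessions, long bytesPerSession, long threshold,
                             boolean seedWithExistingSize) {
    List<Long> fileSizes = new ArrayList<>();
    long actualSize = 0; // real size of the file currently being appended to
    for (int i = 0; i < sessions; i++) {
      long pos = seedWithExistingSize ? actualSize : 0; // position when the file is reopened
      pos += bytesPerSession;
      actualSize += bytesPerSession;
      if (pos > threshold) { // size check based on what the writer believes
        fileSizes.add(actualSize);
        actualSize = 0; // roll over to a fresh file
      }
    }
    if (actualSize > 0) {
      fileSizes.add(actualSize); // last file, still open
    }
    return fileSizes;
  }

  public static void main(String[] args) {
    System.out.println(simulate(10, 40, 100, false)); // prints "[400]"
    System.out.println(simulate(10, 40, 100, true));  // prints "[120, 120, 120, 40]"
  }
}
```

With the position restarting at 0, no single session ever crosses the threshold, so one file keeps growing. With the position seeded from the existing size, files are cut shortly after the threshold is exceeded.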
    
   




[GitHub] [hudi] hudi-bot commented on pull request #6602: [HUDI-4780] hoodie.logfile.max.size It does not take effect, causing the log file to be too large

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6602:
URL: https://github.com/apache/hudi/pull/6602#issuecomment-1242316395

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "8dad0d3818280e2b7d80e536ebb65abde9a581c7",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11168",
       "triggerID" : "8dad0d3818280e2b7d80e536ebb65abde9a581c7",
       "triggerType" : "PUSH"
     }, {
       "hash" : "cc8badbbe36b955d4099f59a4bbde9fa369baf4f",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11271",
       "triggerID" : "cc8badbbe36b955d4099f59a4bbde9fa369baf4f",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * cc8badbbe36b955d4099f59a4bbde9fa369baf4f Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11271) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] danny0405 commented on pull request #6602: [HUDI-4780] hoodie.logfile.max.size It does not take effect, causing the log file to be too large

Posted by GitBox <gi...@apache.org>.
danny0405 commented on PR #6602:
URL: https://github.com/apache/hudi/pull/6602#issuecomment-1247722560

   @yihua Can you help add a test case here? I want to merge this first so that it makes it into 0.12.1.




[GitHub] [hudi] danny0405 merged pull request #6602: [HUDI-4780] hoodie.logfile.max.size It does not take effect, causing the log file to be too large

Posted by GitBox <gi...@apache.org>.
danny0405 merged PR #6602:
URL: https://github.com/apache/hudi/pull/6602




[GitHub] [hudi] yihua commented on a diff in pull request #6602: [HUDI-4780] hoodie.logfile.max.size It does not take effect, causing the log file to be too large

Posted by GitBox <gi...@apache.org>.
yihua commented on code in PR #6602:
URL: https://github.com/apache/hudi/pull/6602#discussion_r963142141


##########
hudi-common/src/main/java/org/apache/hudi/common/table/log/HoodieLogFormatWriter.java:
##########
@@ -270,7 +270,7 @@ public long getCurrentSize() throws IOException {
     if (output == null) {
       return 0;
     }
-    return output.getPos();
+    return output.getPos() + logFile.getFileSize();

Review Comment:
   @loukey-lj Could you explain how this affects the size calculation?  Shouldn't `output.getPos()` already return the size written so far?





[GitHub] [hudi] yihua commented on a diff in pull request #6602: [HUDI-4780] hoodie.logfile.max.size It does not take effect, causing the log file to be too large

Posted by GitBox <gi...@apache.org>.
yihua commented on code in PR #6602:
URL: https://github.com/apache/hudi/pull/6602#discussion_r970132343


##########
hudi-common/src/main/java/org/apache/hudi/common/fs/SizeAwareFSDataOutputStream.java:
##########
@@ -44,7 +44,7 @@ public class SizeAwareFSDataOutputStream extends FSDataOutputStream {
 
   public SizeAwareFSDataOutputStream(Path path, FSDataOutputStream out, ConsistencyGuard consistencyGuard,
                                      Runnable closeCallback) throws IOException {
-    super(out, null);
+    super(out, null, out.getPos());

Review Comment:
   Even for correctness, we should pass the right start position (`out.getPos()`).  Without this, does it overwrite the existing log blocks from the position of 0 when trying to append new log blocks on HDFS?





[GitHub] [hudi] yihua commented on a diff in pull request #6602: [HUDI-4780] hoodie.logfile.max.size It does not take effect, causing the log file to be too large

Posted by GitBox <gi...@apache.org>.
yihua commented on code in PR #6602:
URL: https://github.com/apache/hudi/pull/6602#discussion_r970124521


##########
hudi-common/src/main/java/org/apache/hudi/common/table/log/HoodieLogFormatWriter.java:
##########
@@ -270,7 +270,7 @@ public long getCurrentSize() throws IOException {
     if (output == null) {
       return 0;
     }
-    return output.getPos();
+    return output.getPos() + logFile.getFileSize();

Review Comment:
   @loukey-lj Got it.  This is an HDFS-specific problem.





[GitHub] [hudi] guanziyue commented on pull request #6602: [HUDI-4780] hoodie.logfile.max.size It does not take effect, causing the log file to be too large

Posted by GitBox <gi...@apache.org>.
guanziyue commented on PR #6602:
URL: https://github.com/apache/hudi/pull/6602#issuecomment-1239101110

   Hi loukey-lj, could you share which kind of filesystem you use: HDFS, S3, or another type? The problem you mentioned should be covered by the unit test TestHoodieLogFormat#testRollover. I just tried it on the master code and went over the HDFS implementation code; it behaves correctly.




[GitHub] [hudi] guanziyue commented on pull request #6602: [HUDI-4780] hoodie.logfile.max.size It does not take effect, causing the log file to be too large

Posted by GitBox <gi...@apache.org>.
guanziyue commented on PR #6602:
URL: https://github.com/apache/hudi/pull/6602#issuecomment-1240533451

   > SizeAwareFSDataOutputStream
   
   Thanks for your clarification. I agree with you that the correct way is to change the result returned by SizeAwareFSDataOutputStream.

