Posted to commits@hudi.apache.org by "michael1991 (via GitHub)" <gi...@apache.org> on 2023/03/06 09:00:19 UTC

[GitHub] [hudi] michael1991 opened a new issue, #8100: [SUPPORT] Unstable Execution Time and Many RequestHandler WARN Logs

michael1991 opened a new issue, #8100:
URL: https://github.com/apache/hudi/issues/8100

   **Describe the problem you faced**
   
   We have an hourly custom Spark job that incrementally appends to a MOR table with async cleaning and compaction. We are facing two problems:
   1. Job execution time is unstable (9min ~ 34min).
   2. Many RequestHandler WARN logs appear like below:
   ```
   23/03/06 08:29:27 WARN RequestHandler: Bad request response due to client view behind server view. Last known instant from client was 20230306081811007 but server has the following timeline [[20230306053715487__deltacommit__COMPLETED], [20230306053909170__deltacommit__COMPLETED], [20230306054016158__deltacommit__COMPLETED], [20230306054048527__commit__COMPLETED], [20230306054633503__deltacommit__COMPLETED], [20230306054829613__deltacommit__COMPLETED], [20230306055410708__deltacommit__COMPLETED], [20230306055412958__clean__COMPLETED], [20230306055651288__deltacommit__COMPLETED], [20230306060351975__deltacommit__COMPLETED], [20230306060355539__clean__COMPLETED], [20230306061211058__deltacommit__COMPLETED], [20230306061726347__commit__COMPLETED], [20230306062444656__deltacommit__COMPLETED], [20230306062859872__deltacommit__COMPLETED], [20230306063610912__deltacommit__COMPLETED], [20230306063613508__clean__COMPLETED], [20230306065356198__deltacommit__COMPLETED], [20230306070057884__deltacommit__COMPLETED], [20230306070100336__clean__COMPLETED], [20230306072323088__deltacommit__COMPLETED], [20230306072824814__commit__COMPLETED], [20230306072853022__clean__COMPLETED], [20230306073557240__deltacommit__COMPLETED], [20230306073841564__deltacommit__COMPLETED], [20230306074501873__deltacommit__COMPLETED], [20230306074504461__clean__COMPLETED], [20230306075516906__deltacommit__COMPLETED], [20230306080130924__deltacommit__COMPLETED], [20230306080133247__clean__COMPLETED], [20230306081331748__deltacommit__COMPLETED], [20230306081811007__commit__COMPLETED], [20230306081838580__clean__COMPLETED], [20230306082752807__clean__COMPLETED]]
   ```
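To see which instants separate the two views, a WARN line like the one above can be parsed mechanically. The following is a minimal sketch, not part of Hudi: the class name `TimelineWarnParser` and its method are hypothetical, and the patterns assume the exact message layout shown in this report (`Last known instant from client was <ts>` followed by `[<ts>__<action>__<STATE>]` entries).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for reading the WARN line; not a Hudi class.
final class TimelineWarnParser {
    // Assumed layout of the client part of the message, taken from the log above.
    private static final Pattern CLIENT =
        Pattern.compile("Last known instant from client was (\\d+)");
    // Assumed layout of one server timeline entry: [<ts>__<action>__<STATE>].
    private static final Pattern INSTANT =
        Pattern.compile("\\[(\\d+)__([a-z]+)__([A-Z]+)\\]");

    /** Returns "<ts> <action>" for every server instant later than the client's last known instant. */
    static List<String> instantsAheadOfClient(String warnLine) {
        Matcher client = CLIENT.matcher(warnLine);
        if (!client.find()) {
            throw new IllegalArgumentException("no client instant found in line");
        }
        String clientTs = client.group(1);
        List<String> ahead = new ArrayList<>();
        Matcher m = INSTANT.matcher(warnLine);
        while (m.find()) {
            String ts = m.group(1);
            // Assumption: timestamps are digit strings of equal width, so
            // lexicographic order equals chronological order.
            if (ts.length() == clientTs.length() && ts.compareTo(clientTs) > 0) {
                ahead.add(ts + " " + m.group(2));
            }
        }
        return ahead;
    }
}
```

Applied to the line above, the only instants later than the client's `20230306081811007` are the two `clean` instants `20230306081838580` and `20230306082752807`.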
   
   **Expected behavior**
   
   With the same Spark cluster scale, job execution time should correlate with data size.
   
   **Environment Description**
   
   * Hudi version : 0.12.0
   
   * Spark version : 3.3.0
   
   * Hive version : not used
   
   * Hadoop version : 3.3.3
   
   * Storage (HDFS/S3/GCS..) : GCS
   
   * Running on Docker? (yes/no) : no
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] michael1991 commented on issue #8100: [SUPPORT] Unstable Execution Time and Many RequestHandler WARN Logs

Posted by "michael1991 (via GitHub)" <gi...@apache.org>.
michael1991 commented on issue #8100:
URL: https://github.com/apache/hudi/issues/8100#issuecomment-1459702310

   > Weird, are all the error messages empty strings here?
   
   So weird, I couldn't find any error message here.




[GitHub] [hudi] danny0405 commented on issue #8100: [SUPPORT] Unstable Execution Time and Many RequestHandler WARN Logs

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on issue #8100:
URL: https://github.com/apache/hudi/issues/8100#issuecomment-1457638079

   1. We have separate utilities for compaction using either Spark or Flink
   2. The WARNING log is expected to have an error message at the end, isn't there one?




Re: [I] [SUPPORT] Unstable Execution Time and Many RequestHandler WARN Logs [hudi]

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on issue #8100:
URL: https://github.com/apache/hudi/issues/8100#issuecomment-1759271106

   Got it, you can have a try.




Re: [I] [SUPPORT] Unstable Execution Time and Many RequestHandler WARN Logs [hudi]

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on issue #8100:
URL: https://github.com/apache/hudi/issues/8100#issuecomment-1759099553

   You can always refresh the fs view but the refreshing itself is costly.




[GitHub] [hudi] michael1991 commented on issue #8100: [SUPPORT] Unstable Execution Time and Many RequestHandler WARN Logs

Posted by "michael1991 (via GitHub)" <gi...@apache.org>.
michael1991 commented on issue #8100:
URL: https://github.com/apache/hudi/issues/8100#issuecomment-1457648920

   > 1. We have separate utilities for compaction using either Spark or Flink
   
   I will check it later, thanks in advance!
   
   > 2. The WARNING log is expected to have an error message at the end, isn't there one?
   
   Not once.
   




[GitHub] [hudi] michael1991 commented on issue #8100: [SUPPORT] Unstable Execution Time and Many RequestHandler WARN Logs

Posted by "michael1991 (via GitHub)" <gi...@apache.org>.
michael1991 commented on issue #8100:
URL: https://github.com/apache/hudi/issues/8100#issuecomment-1459632502

   Additional observation: the second problem only occurs during the async cleaning / compaction step; I'm not sure which specific step.




[GitHub] [hudi] michael1991 commented on issue #8100: [SUPPORT] Unstable Execution Time and Many RequestHandler WARN Logs

Posted by "michael1991 (via GitHub)" <gi...@apache.org>.
michael1991 commented on issue #8100:
URL: https://github.com/apache/hudi/issues/8100#issuecomment-1457608016

   Thanks @danny0405 for the response!
   
   The first point seems to be due to async inline compaction: if I disable inline compaction, job execution time becomes uniform. Maybe I should try a separate compaction job. Could we use a custom Spark job to achieve that? I saw DeltaStreamer, Spark Streaming, Flink, and the Hudi CLI / utilities in the docs, but didn't see an example of a separate Spark job.
   
   About the second question, I only see a lot of warning messages; no error messages appear in the logs.
   Please let me know if I can provide more details.




Re: [I] [SUPPORT] Unstable Execution Time and Many RequestHandler WARN Logs [hudi]

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on issue #8100:
URL: https://github.com/apache/hudi/issues/8100#issuecomment-1758753847

   Have you tried the latest release? The fs view should perform better.




[GitHub] [hudi] nsivabalan commented on issue #8100: [SUPPORT] Unstable Execution Time and Many RequestHandler WARN Logs

Posted by "nsivabalan (via GitHub)" <gi...@apache.org>.
nsivabalan commented on issue #8100:
URL: https://github.com/apache/hudi/issues/8100#issuecomment-1465084370

   Yes. If you enable inline compaction, compaction might kick in once every N delta commits. So all other writes will see lower write latency, while the Nth delta commit will have higher latency since compaction also happens.
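The cadence described above can be illustrated with a toy model. This is only a sketch under stated assumptions: the class `InlineCompactionCadence` is hypothetical and not part of Hudi, compaction is assumed to trigger on exactly every n-th delta commit, and the per-write and per-compaction durations are invented inputs. In Hudi itself the threshold is, as far as I know, controlled by `hoodie.compact.inline.max.delta.commits`.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of "compaction every N delta commits"; hypothetical, not a Hudi class.
final class InlineCompactionCadence {
    /** For delta commits 1..commits, reports whether inline compaction also runs (every n-th commit). */
    static List<Boolean> compactionOnCommit(int commits, int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        List<Boolean> result = new ArrayList<>();
        for (int i = 1; i <= commits; i++) {
            result.add(i % n == 0);
        }
        return result;
    }

    /** Each write costs writeMinutes; a write that also compacts costs compactMinutes more. */
    static List<Integer> latencies(int commits, int n, int writeMinutes, int compactMinutes) {
        List<Integer> out = new ArrayList<>();
        for (boolean compacts : compactionOnCommit(commits, n)) {
            out.add(writeMinutes + (compacts ? compactMinutes : 0));
        }
        return out;
    }
}
```

For example, `latencies(6, 3, 9, 25)` yields `[9, 9, 34, 9, 9, 34]`. The 9 and 34 minute values mirror the range reported in the issue, but the split into a 9-minute write and a 25-minute compaction is an assumption made for illustration.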
   




[GitHub] [hudi] michael1991 commented on issue #8100: [SUPPORT] Unstable Execution Time and Many RequestHandler WARN Logs

Posted by "michael1991 (via GitHub)" <gi...@apache.org>.
michael1991 commented on issue #8100:
URL: https://github.com/apache/hudi/issues/8100#issuecomment-1465435154

   > Yes. If you enable inline compaction, compaction might kick in once every N delta commits. So all other writes will see lower write latency, while the Nth delta commit will have higher latency since compaction also happens.
   
   Gotcha! Thanks @nsivabalan 




[GitHub] [hudi] danny0405 commented on issue #8100: [SUPPORT] Unstable Execution Time and Many RequestHandler WARN Logs

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on issue #8100:
URL: https://github.com/apache/hudi/issues/8100#issuecomment-1457535893

   There should be an error message printed after this phrase:
   
   ```java
   LOG.warn("Bad request response due to client view behind server view. " + re.getMessage());
   ```
   
   Could you also show us the error messages?




[GitHub] [hudi] michael1991 closed issue #8100: [SUPPORT] Unstable Execution Time and Many RequestHandler WARN Logs

Posted by "michael1991 (via GitHub)" <gi...@apache.org>.
michael1991 closed issue #8100: [SUPPORT] Unstable Execution Time and Many RequestHandler WARN Logs
URL: https://github.com/apache/hudi/issues/8100




Re: [I] [SUPPORT] Unstable Execution Time and Many RequestHandler WARN Logs [hudi]

Posted by "lovemylover042 (via GitHub)" <gi...@apache.org>.
lovemylover042 commented on issue #8100:
URL: https://github.com/apache/hudi/issues/8100#issuecomment-1759119020

   > You can always refresh the fs view but the refreshing itself is costly.
   
   But it is better than having every task initialize a local file system view, which would be very slow.




Re: [I] [SUPPORT] Unstable Execution Time and Many RequestHandler WARN Logs [hudi]

Posted by "lovemylover042 (via GitHub)" <gi...@apache.org>.
lovemylover042 commented on issue #8100:
URL: https://github.com/apache/hudi/issues/8100#issuecomment-1757266624

   @danny0405 I found that delta commits became very slow because the client falls back to the secondary file system view when it gets a bad response from the remote timeline server. I think the bad response was caused by compaction running at the same time, which left the timeline server behind the client. Can I force a sync of the local view if the timeline server is behind the client?
   -----------------------------------------------------------------
   org.apache.hudi.timeline.service.RequestHandler line 501:
   // TODO: set refreshCheck to true once the timeline server has been behind several times or for some seconds
   if (refreshCheck) {
     long beginFinalCheck = System.currentTimeMillis();
     if (isLocalViewBehind(context)) {
       String errMsg =
           "Last known instant from client was "
               + context.queryParam(RemoteHoodieTableFileSystemView.LAST_INSTANT_TS,
                   HoodieTimeline.INVALID_INSTANT_TS)
               + " but server has the following timeline "
               + viewManager.getFileSystemView(context.queryParam(RemoteHoodieTableFileSystemView.BASEPATH_PARAM))
                   .getTimeline().getInstants().collect(Collectors.toList());
       throw new BadRequestResponse(errMsg);
     }
     long endFinalCheck = System.currentTimeMillis();
     finalCheckTimeTaken = endFinalCheck - beginFinalCheck;
   }
   -----------------------------------------------------------------
   Environment Description: 
   Hudi version : 0.10.1
   Spark version : 3.0.1
   Hadoop version : 3.1.1
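The TODO in the snippet above (force a refresh only after the server has been behind several times or for some seconds) could be expressed as a small standalone policy. This is a sketch of the idea only, not Hudi code: `BehindRefreshPolicy`, its thresholds, and its method are hypothetical names, and nothing here is wired into `RequestHandler`. Since refreshing is itself described as costly elsewhere in this thread, the policy deliberately refuses to refresh on the first stale check.

```java
// Hypothetical policy object; not part of Hudi's timeline service.
final class BehindRefreshPolicy {
    private final int maxConsecutiveBehind; // refresh after this many stale checks in a row
    private final long maxBehindMillis;     // or after being stale for this long
    private int consecutiveBehind = 0;
    private long firstBehindAtMillis = -1L;

    BehindRefreshPolicy(int maxConsecutiveBehind, long maxBehindMillis) {
        if (maxConsecutiveBehind <= 0 || maxBehindMillis < 0) {
            throw new IllegalArgumentException("thresholds must be positive");
        }
        this.maxConsecutiveBehind = maxConsecutiveBehind;
        this.maxBehindMillis = maxBehindMillis;
    }

    /** Records one staleness check and returns true when a forced refresh looks warranted. */
    boolean shouldForceRefresh(boolean serverBehind, long nowMillis) {
        if (!serverBehind) {
            reset(); // an up-to-date view ends the current stale streak
            return false;
        }
        if (consecutiveBehind == 0) {
            firstBehindAtMillis = nowMillis; // start of a new stale streak
        }
        consecutiveBehind++;
        boolean tooMany = consecutiveBehind >= maxConsecutiveBehind;
        boolean tooLong = nowMillis - firstBehindAtMillis >= maxBehindMillis;
        if (tooMany || tooLong) {
            reset(); // the caller is expected to refresh now, so start counting afresh
            return true;
        }
        return false;
    }

    private void reset() {
        consecutiveBehind = 0;
        firstBehindAtMillis = -1L;
    }
}
```

A caller would pass the result of its own staleness check (for instance, whatever `isLocalViewBehind` reports) together with the current time, and refresh the view only when the method returns true.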
   
   




Re: [I] [SUPPORT] Unstable Execution Time and Many RequestHandler WARN Logs [hudi]

Posted by "lovemylover042 (via GitHub)" <gi...@apache.org>.
lovemylover042 commented on issue #8100:
URL: https://github.com/apache/hudi/issues/8100#issuecomment-1759086656

   > Have you tried the latest release? The fs view should perform better.
   
   Sorry, but I only recently upgraded from 0.8.0 to 0.10.1. I will consider upgrading to 0.14 in the future.




[GitHub] [hudi] michael1991 commented on issue #8100: [SUPPORT] Unstable Execution Time and Many RequestHandler WARN Logs

Posted by "michael1991 (via GitHub)" <gi...@apache.org>.
michael1991 commented on issue #8100:
URL: https://github.com/apache/hudi/issues/8100#issuecomment-1459660995

   > We need the error stack trace. When async table services are enabled, the MDT is guarded by an explicit lock, which affects the access efficiency of the MDT, and the MDT is used as the backend of the driver fs view.
   
   Gotcha, let me try to collect more logs to see if any error stack trace could be found. Will let you know later.




[GitHub] [hudi] danny0405 commented on issue #8100: [SUPPORT] Unstable Execution Time and Many RequestHandler WARN Logs

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on issue #8100:
URL: https://github.com/apache/hudi/issues/8100#issuecomment-1459658082

   We need the error stack trace. When async table services are enabled, the MDT is guarded by an explicit lock, which affects the access efficiency of the MDT, and the MDT is used as the backend of the driver fs view.




[GitHub] [hudi] michael1991 commented on issue #8100: [SUPPORT] Unstable Execution Time and Many RequestHandler WARN Logs

Posted by "michael1991 (via GitHub)" <gi...@apache.org>.
michael1991 commented on issue #8100:
URL: https://github.com/apache/hudi/issues/8100#issuecomment-1459680882

   @danny0405 Unfortunately, even with the Spark job log level set to DEBUG, I couldn't find any error message in the log file, only lots of RequestHandler WARN logs.




[GitHub] [hudi] danny0405 commented on issue #8100: [SUPPORT] Unstable Execution Time and Many RequestHandler WARN Logs

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on issue #8100:
URL: https://github.com/apache/hudi/issues/8100#issuecomment-1459696472

   Weird, are all the error messages empty strings here?

