Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2023/01/03 12:38:01 UTC

[GitHub] [hudi] BalaMahesh opened a new issue, #7595: [SUPPORT] Hudi Clean and Delta commits taking ~50 mins to finish frequently

BalaMahesh opened a new issue, #7595:
URL: https://github.com/apache/hudi/issues/7595

   **Describe the problem you faced**
   
   We run a Hudi table with the metadata table enabled, written by DeltaStreamer with the async clean and async compaction services. Delta commit and clean operations frequently take ~50 minutes.
   
   **To Reproduce**
   
   Steps to reproduce the behavior:
   
   1. Run Hudi 0.12.1 with the metadata table enabled.
   2. Enable the async compaction and cleaner services.
   3. Use the configuration below.
       hoodie.cleaner.policy=KEEP_LATEST_COMMITS
       hoodie.clean.automatic=true
       hoodie.clean.async=true
       hoodie.cleaner.commits.retained=5
       hoodie.keep.min.commits=10
       #compaction config
       hoodie.datasource.compaction.async.enable=true
       hoodie.parquet.small.file.limit=1048576
       hoodie.compaction.target.io=50
       hoodie.metadata.metrics.enable=true
   
       hoodie.metadata.index.bloom.filter.enable=false
       hoodie.metadata.index.column.stats.enable=false 
       hoodie.write.concurrency.mode=optimistic_concurrency_control
       hoodie.cleaner.policy.failed.writes=LAZY
       hoodie.write.lock.provider=org.apache.hudi.client.transaction.lock.InProcessLockProvider
       hoodie.write.lock.wait_time_ms=300000
   
   **Expected behavior**
   
   Delta commit and clean actions should not take this long.
   
   **Environment Description**
   
   * Hudi version : 0.12.1
   
   * Spark version : 3.2.1
   
   * Hive version : 2.3.5
   
   * Hadoop version : 2.7.7
   
   * Storage (HDFS/S3/GCS..) : GCS
   
   * Running on Docker? (yes/no) : yes
   
   
   **Additional context**
   
   DeltaStreamer runs in continuous mode.
   
   Screenshots of the timeline of operations:
   
   <img width="1125" alt="Screenshot 2023-01-03 at 11 50 00 AM" src="https://user-images.githubusercontent.com/25053668/210357883-9b72667c-1afe-4d0f-ab77-14c9a8ee0e32.png">
   
   
   <img width="1352" alt="Screenshot 2023-01-03 at 11 50 57 AM" src="https://user-images.githubusercontent.com/25053668/210358097-9806715e-0e5f-44cf-9976-f478841a1433.png">
   
   Below is the only error I see in the logs.
   
   **Stacktrace**
   
   ```
   RequestHandler: Bad request response due to client view behind server view. Last known instant from client was 20230103113021745 but server has the following timeline [[20221128033016359__rollback__COMPLETED], [20221128042615784__rollback__COMPLETED], [20221128052249948__rollback__COMPLETED], [20221128100542977__rollback__COMPLETED], [20221128114411534__rollback__COMPLETED], [20221128121237952__rollback__COMPLETED], [20221128121547373__rollback__COMPLETED], [20221128124007294__rollback__COMPLETED], [20221128130510784__rollback__COMPLETED], [20221128150135765__rollback__COMPLETED], [20221202082857955__rollback__COMPLETED], [20221202083358380__rollback__COMPLETED], [20221205180609234__rollback__COMPLETED], [20221213024840399__rollback__COMPLETED], [20221215121336002__rollback__COMPLETED], [20230103075416732__clean__COMPLETED], [20230103080003681__clean__COMPLETED], [20230103080537813__clean__COMPLETED], [20230103081110194__clean__COMPLETED], [20230103081642791__clean__COMPLETED], [20230103082158513__clean__COMPLETED], [20230103082749103__clean__COMPLETED], [20230103083327661__clean__COMPLETED], [20230103083915577__clean__COMPLETED], [20230103084450294__clean__COMPLETED], [20230103085022170__clean__COMPLETED], [20230103085539296__deltacommit__COMPLETED], [20230103085550414__clean__COMPLETED], [20230103090129353__deltacommit__COMPLETED], [20230103090140117__clean__COMPLETED], [20230103090705599__deltacommit__COMPLETED], [20230103090716308__clean__COMPLETED], [20230103091245975__deltacommit__COMPLETED], [20230103091256846__clean__COMPLETED], [20230103091825253__deltacommit__COMPLETED], [20230103091836101__clean__COMPLETED], [20230103092403683__deltacommit__COMPLETED], [20230103092414824__clean__COMPLETED], [20230103092828723__commit__COMPLETED], [20230103092851264__clean__COMPLETED], [20230103092923310__deltacommit__COMPLETED], [20230103093158260__clean__COMPLETED], [20230103102048896__deltacommit__COMPLETED], [20230103102100480__clean__COMPLETED], [20230103102637434__deltacommit__COMPLETED], [20230103102648856__clean__COMPLETED], [20230103103218354__deltacommit__COMPLETED], [20230103103229738__clean__COMPLETED], [20230103103812033__deltacommit__COMPLETED], [20230103103823381__clean__COMPLETED], [20230103104351306__deltacommit__COMPLETED], [20230103104402684__clean__COMPLETED], [20230103104950491__deltacommit__COMPLETED], [20230103105002062__clean__COMPLETED], [20230103105541444__deltacommit__COMPLETED], [20230103105552964__clean__COMPLETED], [20230103110154035__deltacommit__COMPLETED], [20230103110205541__clean__COMPLETED], [20230103110749857__deltacommit__COMPLETED], [20230103110801657__clean__COMPLETED], [20230103111344582__deltacommit__COMPLETED], [20230103111356019__clean__COMPLETED], [20230103111912226__deltacommit__COMPLETED], [20230103111923380__clean__COMPLETED], [20230103112519397__deltacommit__COMPLETED], [20230103112531041__clean__COMPLETED], [20230103113021745__commit__COMPLETED], [20230103113045783__clean__COMPLETED]]
   ```
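
The timeline in the error message can also be used to locate the slow window without screenshots. The following is a minimal Python sketch, not part of the original report: it assumes instant times follow the 17-digit `yyyyMMddHHmmssSSS` layout seen above, and the names `parse_instant` and `find_gaps` are hypothetical helpers. Instant times mark when an action was scheduled rather than when it finished, so a large gap is only a hint that the preceding work ran long.

```python
import re
from datetime import datetime, timedelta

# Matches timeline entries such as [20230103093158260__clean__COMPLETED].
INSTANT_RE = re.compile(r"\[(\d{17})__([a-z]+)__([A-Z]+)\]")


def parse_instant(ts: str) -> datetime:
    """Parse a 17-digit instant time (assumed layout: yyyyMMddHHmmssSSS)."""
    base = datetime.strptime(ts[:14], "%Y%m%d%H%M%S")
    return base + timedelta(milliseconds=int(ts[14:]))


def find_gaps(timeline: str, threshold: timedelta):
    """Return (previous, current, gap) for consecutive instants further apart than threshold."""
    instants = sorted(
        (parse_instant(ts), action) for ts, action, _state in INSTANT_RE.findall(timeline)
    )
    return [
        (prev, cur, cur[0] - prev[0])
        for prev, cur in zip(instants, instants[1:])
        if cur[0] - prev[0] > threshold
    ]


# A short excerpt copied from the timeline in the error message.
sample = (
    "[20230103092923310__deltacommit__COMPLETED], "
    "[20230103093158260__clean__COMPLETED], "
    "[20230103102048896__deltacommit__COMPLETED], "
    "[20230103102100480__clean__COMPLETED]"
)

gaps = find_gaps(sample, threshold=timedelta(minutes=30))
for (prev_time, prev_action), (cur_time, cur_action), gap in gaps:
    print(f"{prev_action} at {prev_time} -> {cur_action} at {cur_time}: {gap}")
```

On this excerpt the only gap above 30 minutes lies between the clean at 09:31:58 and the next delta commit at 10:20:48, which is consistent with the ~50 minute delays described in the report.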
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] michael1991 commented on issue #7595: [SUPPORT] Hudi Clean and Delta commits taking ~50 mins to finish frequently

Posted by "michael1991 (via GitHub)" <gi...@apache.org>.
michael1991 commented on issue #7595:
URL: https://github.com/apache/hudi/issues/7595#issuecomment-1455644287

   > We were running with
   > 
   > hoodie.metadata.index.bloom.filter.enable=false hoodie.metadata.index.column.stats.enable=false
   > 
   > I hope that answers your question. We set these to false because I ran into issue #7657 when they were set to true.
   > 
   > I have changed the index type to simple and then restarted the application.
   > 
   > Index lookup duration has come down and is uniform now. <img alt="Screenshot 2023-01-13 at 10 29 51 AM" width="671" src="https://user-images.githubusercontent.com/25053668/212241200-1284af49-d728-432f-a1ba-a72e1ed50dbe.png">
   > 
   > Delta commit durations are uniform too, except for the issue I mentioned in #7364. After the restart, the delta commit gets stuck and then later progresses.
   > 
   > <img alt="Screenshot 2023-01-13 at 10 30 00 AM" width="672" src="https://user-images.githubusercontent.com/25053668/212241373-1f67ca15-7e5d-4bdc-b9a1-ced25a94ba68.png">
   
   @BalaMahesh Hello, we hit the same issue. I have two questions and hope you get a chance to respond.
   - How can we find the "Index lookup duration" charts?
   - SIMPLE is the default index type on the Spark engine, and our custom Spark job already uses it, so the simple index type does not seem to be the key fix. Do you have any further updates?
   
   Thanks in advance!




[GitHub] [hudi] danny0405 commented on issue #7595: [SUPPORT] Hudi Clean and Delta commits taking ~50 mins to finish frequently

Posted by GitBox <gi...@apache.org>.
danny0405 commented on issue #7595:
URL: https://github.com/apache/hudi/issues/7595#issuecomment-1379972942

   @BalaMahesh Did you try with the bloom filter index metadata disabled?




[GitHub] [hudi] BalaMahesh commented on issue #7595: [SUPPORT] Hudi Clean and Delta commits taking ~50 mins to finish frequently

Posted by GitBox <gi...@apache.org>.
BalaMahesh commented on issue #7595:
URL: https://github.com/apache/hudi/issues/7595#issuecomment-1378454205

   @yihua - did you get a chance to check this?




[GitHub] [hudi] BalaMahesh commented on issue #7595: [SUPPORT] Hudi Clean and Delta commits taking ~50 mins to finish frequently

Posted by "BalaMahesh (via GitHub)" <gi...@apache.org>.
BalaMahesh commented on issue #7595:
URL: https://github.com/apache/hudi/issues/7595#issuecomment-1464816071

   > @BalaMahesh Hello, we hit the same issue. I have two questions and hope you get a chance to respond.
   > 
   > * How can we find the "Index lookup duration" charts?
   > * SIMPLE is the default index type on the Spark engine, and our custom Spark job already uses it, so the simple index type does not seem to be the key fix. Do you have any further updates?
   > 
   > Thanks in advance!
   
   1. You have to push metrics to the Prometheus Pushgateway or another monitoring solution to plot these charts.
   2. You can inspect the Spark stages and jobs in the Spark web UI to identify which task is taking long.
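
As a rough illustration of the first point, the Python sketch below collects the Hudi metrics options for the Prometheus Pushgateway reporter. The host, port, job name and reporting period are placeholder assumptions, not values from this thread, and `to_properties` is a hypothetical helper that renders the options as `key=value` lines for a DeltaStreamer properties file.

```python
# Hudi metrics options for the Prometheus Pushgateway reporter.
# Host, port, job name and period are placeholders (assumptions), not values from this thread.
metrics_options = {
    "hoodie.metrics.on": "true",
    "hoodie.metrics.reporter.type": "PROMETHEUS_PUSHGATEWAY",
    "hoodie.metrics.pushgateway.host": "pushgateway.example.internal",
    "hoodie.metrics.pushgateway.port": "9091",
    "hoodie.metrics.pushgateway.job.name": "hudi_deltastreamer",
    "hoodie.metrics.pushgateway.report.period.seconds": "30",
}


def to_properties(options: dict) -> str:
    """Render options as sorted key=value lines, one per line."""
    return "\n".join(f"{key}={value}" for key, value in sorted(options.items()))


print(to_properties(metrics_options))
```

In a custom Spark job, the same dictionary could instead be passed to the DataFrame writer with `.options(**metrics_options)`. Which chart names appear (for example an index lookup duration) depends on the dashboard built on top of the pushed metrics.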




[GitHub] [hudi] danny0405 commented on issue #7595: [SUPPORT] Hudi Clean and Delta commits taking ~50 mins to finish frequently

Posted by GitBox <gi...@apache.org>.
danny0405 commented on issue #7595:
URL: https://github.com/apache/hudi/issues/7595#issuecomment-1381553999

   I guess we are running into a performance issue when using the BloomFilter index for a MOR table with the metadata table disabled. Thanks for the feedback; let me record this issue for this on-call rotation and discuss it in the meeting.




[GitHub] [hudi] BalaMahesh commented on issue #7595: [SUPPORT] Hudi Clean and Delta commits taking ~50 mins to finish frequently

Posted by GitBox <gi...@apache.org>.
BalaMahesh commented on issue #7595:
URL: https://github.com/apache/hudi/issues/7595#issuecomment-1370487901

   This is a non-partitioned table with the minimum file size set to 1 MB, and ~150 Parquet files have been created. Below are screenshots from the Spark web UI.
   <img width="1440" alt="Screenshot 2023-01-04 at 9 53 03 AM" src="https://user-images.githubusercontent.com/25053668/210485678-01569009-8b3c-4b62-8d04-7d5149cbdb7b.png">
   <img width="1440" alt="Screenshot 2023-01-04 at 9 53 24 AM" src="https://user-images.githubusercontent.com/25053668/210485684-b5314c87-a794-4242-889f-3a7a6623e395.png">
   <img width="1435" alt="Screenshot 2023-01-04 at 9 53 50 AM" src="https://user-images.githubusercontent.com/25053668/210485691-d3112ff4-739e-429a-bca1-3abd44394ba6.png">
   <img width="1375" alt="Screenshot 2023-01-04 at 10 04 45 AM" src="https://user-images.githubusercontent.com/25053668/210485699-26aeede3-24fa-46ec-9e4f-b07d87b842c9.png">
   <img width="1440" alt="Screenshot 2023-01-04 at 10 02 19 AM" src="https://user-images.githubusercontent.com/25053668/210485707-bdd31536-0399-4a89-98b8-2816ed1fdff3.png">
   
   I am attaching the logs between 23/01/04 03:48:51 and 23/01/04 04:41:25, the window in which a delta commit took unusually long.
    
   [hudi_logs.txt](https://github.com/apache/hudi/files/10341543/hudi_logs.txt)
   




[GitHub] [hudi] yihua commented on issue #7595: [SUPPORT] Hudi Clean and Delta commits taking ~50 mins to finish frequently

Posted by GitBox <gi...@apache.org>.
yihua commented on issue #7595:
URL: https://github.com/apache/hudi/issues/7595#issuecomment-1370256718

   Hi @BalaMahesh, thanks for raising the issue. To better triage this, could you provide more details about the Hudi table: whether it is partitioned and, if so, how many partitions it has, plus the Spark driver logs and screenshots of the stages that take a long time to finish?




[GitHub] [hudi] michael1991 commented on issue #7595: [SUPPORT] Hudi Clean and Delta commits taking ~50 mins to finish frequently

Posted by "michael1991 (via GitHub)" <gi...@apache.org>.
michael1991 commented on issue #7595:
URL: https://github.com/apache/hudi/issues/7595#issuecomment-1465916747

   > 1. You have to push metrics to either Prometheus push gateway or any other monitoring solution for plotting these charts.
   > 2. You can see the spark stages and jobs in the spark web ui to identify which task is taking long.
   
   Got it! Thanks a lot !!!




[GitHub] [hudi] BalaMahesh commented on issue #7595: [SUPPORT] Hudi Clean and Delta commits taking ~50 mins to finish frequently

Posted by GitBox <gi...@apache.org>.
BalaMahesh commented on issue #7595:
URL: https://github.com/apache/hudi/issues/7595#issuecomment-1381317993

   I have changed the index type to simple and then restarted the application.
   
   Index lookup duration has come down and is uniform now.
   <img width="671" alt="Screenshot 2023-01-13 at 10 29 51 AM" src="https://user-images.githubusercontent.com/25053668/212241200-1284af49-d728-432f-a1ba-a72e1ed50dbe.png">
   
   Delta commit durations are uniform too, except for the issue I mentioned in https://github.com/apache/hudi/issues/7364. After the restart, the delta commit gets stuck and then later progresses.
     
   <img width="672" alt="Screenshot 2023-01-13 at 10 30 00 AM" src="https://user-images.githubusercontent.com/25053668/212241373-1f67ca15-7e5d-4bdc-b9a1-ced25a94ba68.png">
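
For readers who want to try the same change, the index type is controlled by `hoodie.index.type`. The Python sketch below is illustrative only: `with_index_type` is a hypothetical helper, the base options repeat the two settings quoted in this thread, and the set of known types is a subset taken from the Hudi configuration reference rather than from this issue.

```python
# A subset of the values accepted by hoodie.index.type (assumption: taken from the Hudi config docs).
KNOWN_INDEX_TYPES = {"BLOOM", "GLOBAL_BLOOM", "SIMPLE", "GLOBAL_SIMPLE", "BUCKET", "HBASE", "INMEMORY"}

# The two metadata index settings quoted in this thread.
base_options = {
    "hoodie.metadata.index.bloom.filter.enable": "false",
    "hoodie.metadata.index.column.stats.enable": "false",
}


def with_index_type(options: dict, index_type: str) -> dict:
    """Return a copy of options with hoodie.index.type set, rejecting unknown values."""
    normalized = index_type.strip().upper()
    if normalized not in KNOWN_INDEX_TYPES:
        raise ValueError(f"unexpected index type: {index_type!r}")
    merged = dict(options)  # copy so the caller's options stay untouched
    merged["hoodie.index.type"] = normalized
    return merged


simple_index_options = with_index_type(base_options, "simple")
print(simple_index_options)
```

The equivalent line in a DeltaStreamer properties file would be `hoodie.index.type=SIMPLE`. Whether this helps depends on the workload, as the follow-up comments in this thread suggest.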
   




[GitHub] [hudi] BalaMahesh commented on issue #7595: [SUPPORT] Hudi Clean and Delta commits taking ~50 mins to finish frequently

Posted by GitBox <gi...@apache.org>.
BalaMahesh commented on issue #7595:
URL: https://github.com/apache/hudi/issues/7595#issuecomment-1385272668

   > I guess we are running into a performance issue when using the BloomFilter index for a MOR table with the metadata table disabled. Thanks for the feedback; let me record this issue for this on-call rotation and discuss it in the meeting.
   
   The metadata table is enabled for this MoR job, but the following are set to false:
   
   hoodie.metadata.index.bloom.filter.enable=false
   hoodie.metadata.index.column.stats.enable=false




[GitHub] [hudi] michael1991 commented on issue #7595: [SUPPORT] Hudi Clean and Delta commits taking ~50 mins to finish frequently

Posted by "michael1991 (via GitHub)" <gi...@apache.org>.
michael1991 commented on issue #7595:
URL: https://github.com/apache/hudi/issues/7595#issuecomment-1452883556

   We face the same issue here, but we use Spark to append to MOR tables with async cleaning.
   A lot of warning messages show up in the log file. Any other ideas?

