Posted to commits@hudi.apache.org by "DavidZ1 (via GitHub)" <gi...@apache.org> on 2023/03/22 08:17:40 UTC

[GitHub] [hudi] DavidZ1 opened a new issue, #8267: [SUPPORT] Why some delta commit logs files are not converted to parquet ?

DavidZ1 opened a new issue, #8267:
URL: https://github.com/apache/hudi/issues/8267

   **Describe the problem you faced**
   
   We have a Flink job that consumes Kafka messages and writes them into a Hudi table. It is a MOR table that uses the bucket index, and the write operation is upsert. The table has two levels of partitions, day and hour.
   
   After the Flink job had been running for a while, we found that the log files in each hour partition were not being converted to parquet files. We also checked the compaction request file and found that it did not contain all of the log files. How can we solve this? We would also like to know how to tell whether the log files of a given partition have been compacted.
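
One way to approach the last question is to compare the file names inside a partition. The following is a minimal sketch, not a Hudi API. It assumes the usual Hudi naming conventions, which are not stated in this thread: base files named `<fileId>_<writeToken>_<instantTime>.parquet` and log files named `.<fileId>_<baseInstantTime>.log.<version>_<writeToken>`. It also assumes that a log file counts as compacted once a base file with a newer instant exists for the same fileId. The function name `uncompacted_logs` is hypothetical.

```python
import re
from collections import defaultdict

# Assumed naming conventions (see the paragraph above):
#   base file: <fileId>_<writeToken>_<instantTime>.parquet
#   log file : .<fileId>_<baseInstantTime>.log.<version>_<writeToken>
BASE_RE = re.compile(r"^(?P<file_id>[^_]+)_(?P<token>[^_]+)_(?P<instant>\d+)\.parquet$")
LOG_RE = re.compile(r"^\.(?P<file_id>[^_]+)_(?P<instant>\d+)\.log\.(?P<version>\d+)(?:_(?P<token>.+))?$")


def uncompacted_logs(file_names):
    """Return {file_id: [log file names]} for logs with no newer base file."""
    latest_base = {}           # file_id -> newest base-file instant seen
    logs = defaultdict(list)   # file_id -> [(base_instant, file_name)]
    for name in file_names:
        m = BASE_RE.match(name)
        if m:
            fid, instant = m["file_id"], m["instant"]
            if instant > latest_base.get(fid, ""):
                latest_base[fid] = instant
            continue
        m = LOG_RE.match(name)
        if m:
            logs[m["file_id"]].append((m["instant"], name))

    pending = {}
    for fid, entries in logs.items():
        newest = latest_base.get(fid, "")
        # Instants are fixed-width timestamps, so comparing them as strings
        # orders them by time. A log whose base instant is not older than the
        # newest base file has not been folded into a parquet file yet.
        stale = sorted(name for instant, name in entries if instant >= newest)
        if stale:
            pending[fid] = stale
    return pending
```

Feeding it the output of a directory listing for one hour partition gives the file groups that still carry uncompacted logs. A file group that appears with no base file at all has never been compacted.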
   
   **Environment Description**
   
   * Hudi version : 0.13.0
   
   * Spark version : 3.2.1
   
   * Hive version : 3.2.1
   
   * Hadoop version : 3.2.1
   
   * Storage (HDFS/S3/GCS..) : COSN
   
   * Running on Docker? (yes/no) : yes
   
   
   **Additional context**
   
   1.Hudi config
   
   ```properties
   checkpoint.interval=300
   checkpoint.timeout=900
   compaction.max_memory=1024
   payload.class.name=org.apache.hudi.common.model.OverwriteNonDefaultsWithLatestAvroPayload
   compaction.delta_commits=5
   compaction.trigger.strategy=num_or_time
   compaction.delta_seconds=3600
   clean.policy=KEEP_LATEST_COMMITS
   clean.retain_commits=1
   hoodie.bucket.index.num.buckets=50
   archive.max_commits=50
   archive.min_commits=40
   compaction.async.enabled=true
   write.operation=upsert
   table.type=MERGE_ON_READ
   index.type=BUCKET
   checkpoint.incremental.enable=true
   ``` 
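
To make the effect of `compaction.trigger.strategy=num_or_time` concrete, here is a small sketch of the scheduling rule as it is commonly described: a compaction plan is scheduled when either the delta-commit threshold or the time threshold is reached. It is an illustration, not Hudi's implementation. The function names are hypothetical, the thresholds are assumed to be inclusive, `checkpoint.interval=300` is assumed to be in seconds, and each checkpoint is assumed to produce one delta commit.

```python
def should_schedule_compaction(delta_commits_since_last, seconds_since_last,
                               max_delta_commits=5, max_delta_seconds=3600):
    """num_or_time: schedule a plan when EITHER threshold is reached."""
    return (delta_commits_since_last >= max_delta_commits
            or seconds_since_last >= max_delta_seconds)


def seconds_until_first_plan(checkpoint_interval_s=300,
                             max_delta_commits=5, max_delta_seconds=3600):
    """Rough time to the first plan if every checkpoint yields one delta commit."""
    by_count = checkpoint_interval_s * max_delta_commits
    return min(by_count, max_delta_seconds)
```

Under these assumptions the count threshold is hit first: 5 checkpoints of 300 seconds each is 1500 seconds, or about 25 minutes between compaction plans. If one compaction takes longer than that to execute, pending plans will accumulate.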
   
   
   2.hoodie.properties
   
   ```properties
   
   hoodie.table.precombine.field=acquire_timestamp
   hoodie.datasource.write.drop.partition.columns=false
   hoodie.table.partition.fields=pt,ht
   hoodie.table.type=MERGE_ON_READ
   hoodie.archivelog.folder=archived
   hoodie.table.cdc.enabled=false
   hoodie.compaction.payload.class=org.apache.hudi.common.model.OverwriteNonDefaultsWithLatestAvroPayload
   hoodie.table.version=5
   hoodie.timeline.layout.version=1
   hoodie.table.recordkey.fields=vin,acquire_timestamp
   hoodie.datasource.write.partitionpath.urlencode=false
   hoodie.table.name=ods_icv_can_hudi_temp
   hoodie.table.keygenerator.class=org.apache.hudi.keygen.ComplexAvroKeyGenerator
   hoodie.compaction.record.merger.strategy=eeb8d96f-b1e4-49fd-bbf8-28ac514178e5
   hoodie.datasource.write.hive_style_partitioning=true
   ``` 
   
   3.DAG
   ![2012fe42112700a0bae99e5b95054eb](https://user-images.githubusercontent.com/30795397/226840153-71dc771b-7322-4605-8f9f-1006c3259205.png)
   
   4.Data file
   
   ![9a882f5a6ea5f1ec3c156e297ea0636](https://user-images.githubusercontent.com/30795397/226839970-694da791-a9aa-4589-838a-561d2de2eaee.png)
   
   The parquet file for fileId 00000023-96fb-4ca0-b0ae-0547b5898b3b is 40 MB, but its avro log files total more than 1500 MB, so some avro logs have not been compacted to parquet.
   
   ![2434969ad9f5389e2f5b051fac3ccd7](https://user-images.githubusercontent.com/30795397/226840070-bc4be834-8c5e-4bcd-90f7-df8fa7f38938.png)
   
   We found that the compaction request file under the hoodie directory does not contain all of the avro log files.
   
   **Stacktrace**
   
   1. Clean file exception 
   
   ```
   2023-03-22 14:32:37.627 [pool-18-thread-1] WARN  org.apache.hudi.table.action.clean.CleanActionExecutor [] - Failed to perform previous clean operation, instant: [==>20230322143231759__clean__REQUESTED]
   java.lang.NullPointerException: Expected a non-null value. Got null
   	at org.apache.hudi.common.util.Option.<init>(Option.java:65) ~[blob_p-7584645ba23f46692000bbfac6ef844cbd0e30ce-451b376bd445dd495f01c72e3dff67e5:?]
   	at org.apache.hudi.common.util.Option.of(Option.java:76) ~[blob_p-7584645ba23f46692000bbfac6ef844cbd0e30ce-451b376bd445dd495f01c72e3dff67e5:?]
   	at org.apache.hudi.table.action.clean.CleanActionExecutor.runClean(CleanActionExecutor.java:230) ~[blob_p-7584645ba23f46692000bbfac6ef844cbd0e30ce-451b376bd445dd495f01c72e3dff67e5:?]
   	at org.apache.hudi.table.action.clean.CleanActionExecutor.runPendingClean(CleanActionExecutor.java:187) ~[blob_p-7584645ba23f46692000bbfac6ef844cbd0e30ce-451b376bd445dd495f01c72e3dff67e5:?]
   	at org.apache.hudi.table.action.clean.CleanActionExecutor.lambda$execute$8(CleanActionExecutor.java:256) ~[blob_p-7584645ba23f46692000bbfac6ef844cbd0e30ce-451b376bd445dd495f01c72e3dff67e5:?]
   	at java.util.ArrayList.forEach(ArrayList.java:1259) ~[?:1.8.0_332]
   	at org.apache.hudi.table.action.clean.CleanActionExecutor.execute(CleanActionExecutor.java:250) ~[blob_p-7584645ba23f46692000bbfac6ef844cbd0e30ce-451b376bd445dd495f01c72e3dff67e5:?]
   	at org.apache.hudi.table.HoodieFlinkCopyOnWriteTable.clean(HoodieFlinkCopyOnWriteTable.java:322) ~[blob_p-7584645ba23f46692000bbfac6ef844cbd0e30ce-451b376bd445dd495f01c72e3dff67e5:?]
   	at org.apache.hudi.client.BaseHoodieTableServiceClient.clean(BaseHoodieTableServiceClient.java:554) ~[blob_p-7584645ba23f46692000bbfac6ef844cbd0e30ce-451b376bd445dd495f01c72e3dff67e5:?]
   	at org.apache.hudi.client.BaseHoodieWriteClient.clean(BaseHoodieWriteClient.java:758) ~[blob_p-7584645ba23f46692000bbfac6ef844cbd0e30ce-451b376bd445dd495f01c72e3dff67e5:?]
   	at org.apache.hudi.client.BaseHoodieWriteClient.clean(BaseHoodieWriteClient.java:730) ~[blob_p-7584645ba23f46692000bbfac6ef844cbd0e30ce-451b376bd445dd495f01c72e3dff67e5:?]
   	at org.apache.hudi.async.AsyncCleanerService.lambda$startService$0(AsyncCleanerService.java:55) ~[blob_p-7584645ba23f46692000bbfac6ef844cbd0e30ce-451b376bd445dd495f01c72e3dff67e5:?]
   	at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604) [?:1.8.0_332]
   	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_332]
   	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_332]
   	at java.lang.Thread.run(Thread.java:750) [?:1.8.0_332]
   ```
   
   
   2. When we use a MOR table with the insert operation, there are WARN logs related to compaction, but a MOR table with upsert does not produce them.
   The following warning is logged:
   
   `2023-03-21 14:32:36.033 [JettyServerThreadPool-334] WARN  org.apache.hudi.timeline.service.RequestHandler [] - Bad request response due to client view behind server view. Last known instant from client was 20230321142047972 but server has the following timeline [[20230321125010487__deltacommit__COMPLETED], [20230321125525961__deltacommit__COMPLETED], [20230321130051999__deltacommit__COMPLETED], [20230321130617771__deltacommit__COMPLETED], [20230321131133084__deltacommit__COMPLETED], [20230321131650502__deltacommit__COMPLETED], [==>20230321132210140__compaction__INFLIGHT], [20230321132212886__deltacommit__COMPLETED], [==>20230321132729719__compaction__INFLIGHT], [20230321132731672__deltacommit__COMPLETED], [==>20230321133253906__compaction__INFLIGHT], [20230321133256109__deltacommit__COMPLETED], [==>20230321133820416__compaction__INFLIGHT], [20230321133822486__deltacommit__COMPLETED], [==>20230321134348164__compaction__INFLIGHT], [20230321134350553__deltacommit__COMPLETED], [20230321134912462__deltacommit__COMPLETED], [20230321135434761__deltacommit__COMPLETED], [20230321140440297__rollback__COMPLETED], [20230321140440947__rollback__COMPLETED], [20230321140443670__deltacommit__COMPLETED], [20230321140445923__rollback__COMPLETED], [20230321140450567__rollback__COMPLETED], [20230321140454064__rollback__COMPLETED], [20230321140456989__rollback__COMPLETED], [==>20230321140910025__compaction__REQUESTED], [20230321140913981__deltacommit__COMPLETED], [==>20230321141505445__compaction__REQUESTED], [20230321141508195__deltacommit__COMPLETED], [20230321142047972__deltacommit__COMPLETED], [20230321142644822__deltacommit__COMPLETED]]`
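
The timeline printed in this warning can be scanned mechanically for compactions that have not completed. The sketch below parses entries of the form `[<instant>__<action>__<STATE>]`, with an optional `==>` prefix, exactly as they appear in the log line above. The function names are hypothetical and the pattern is inferred from this one log line, so it may not cover every action or state Hudi can print.

```python
import re

# Matches e.g. [20230321125010487__deltacommit__COMPLETED]
# and          [==>20230321132210140__compaction__INFLIGHT]
ENTRY_RE = re.compile(r"\[(?:==>)?(\d+)__([a-z]+)__([A-Z]+)\]")


def parse_timeline(text):
    """Return (instant, action, state) tuples in the order they appear."""
    return ENTRY_RE.findall(text)


def pending_compactions(entries):
    """Instants of compactions whose state is anything other than COMPLETED."""
    return [instant for instant, action, state in entries
            if action == "compaction" and state != "COMPLETED"]
```

Applied to the full line above, `pending_compactions(parse_timeline(line))` lists the instants marked `INFLIGHT` or `REQUESTED`, which are the ones to watch when compaction falls behind.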
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] DavidZ1 commented on issue #8267: [SUPPORT] Why some delta commit logs files are not converted to parquet ?

Posted by "DavidZ1 (via GitHub)" <gi...@apache.org>.
DavidZ1 commented on issue #8267:
URL: https://github.com/apache/hudi/issues/8267#issuecomment-1484881742

   Thanks for the reply.
   
   > * the `--service` param has no value, it is a non-valued param
   
   We tried it and it works without problems.


[GitHub] [hudi] danny0405 commented on issue #8267: [SUPPORT] Why some delta commit logs files are not converted to parquet ?

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on issue #8267:
URL: https://github.com/apache/hudi/issues/8267#issuecomment-1482211054

   Yeah, the compaction lags a little. We can give the compaction more resources or use the offline compaction job.


[GitHub] [hudi] DavidZ1 commented on issue #8267: [SUPPORT] Why some delta commit logs files are not converted to parquet ?

Posted by "DavidZ1 (via GitHub)" <gi...@apache.org>.
DavidZ1 commented on issue #8267:
URL: https://github.com/apache/hudi/issues/8267#issuecomment-1482523743

   Using the `Hudi CLI`, we found that the compaction of some instants had completed. We have a few questions:
   
   1. Why is the `Data File Path` null in the Hudi CLI output?
   
   ![1679650806693](https://user-images.githubusercontent.com/30795397/227484026-005e5a52-3e12-41d6-9fc3-18949793c9ac.png)
   
   ![1679651061728](https://user-images.githubusercontent.com/30795397/227484590-fb028ec5-eee5-4194-b292-de60d90fdf19.png)
   
   2. Why is the size of the `xxx.compaction.inflight` meta file zero?
   ![1679651190521](https://user-images.githubusercontent.com/30795397/227485559-1fca9d52-4104-45ea-b984-6d78d05497be.png)
   
   
   
   


[GitHub] [hudi] danny0405 commented on issue #8267: [SUPPORT] Why some delta commit logs files are not converted to parquet ?

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on issue #8267:
URL: https://github.com/apache/hudi/issues/8267#issuecomment-1487874245

   Had the partition been compacted already?


[GitHub] [hudi] danny0405 commented on issue #8267: [SUPPORT] Why some delta commit logs files are not converted to parquet ?

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on issue #8267:
URL: https://github.com/apache/hudi/issues/8267#issuecomment-1489947903

   For fileId 00000000-0b69-4b13-a1b2-677b800e0729, which has no compaction record, what is the status of the file group now? Are there any parquet or log files there?


[GitHub] [hudi] danny0405 commented on issue #8267: [SUPPORT] Why some delta commit logs files are not converted to parquet ?

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on issue #8267:
URL: https://github.com/apache/hudi/issues/8267#issuecomment-1639187100

   @victorxiang30 Recently we found a bug in incremental cleaning with compaction: https://github.com/apache/hudi/pull/9038. In rare cases, when the partition path changes, log files may never be cleaned.


[GitHub] [hudi] DavidZ1 commented on issue #8267: [SUPPORT] Why some delta commit logs files are not converted to parquet ?

Posted by "DavidZ1 (via GitHub)" <gi...@apache.org>.
DavidZ1 commented on issue #8267:
URL: https://github.com/apache/hudi/issues/8267#issuecomment-1482471170

   When I used the `--service` parameter, I got a parsing error:
   
   ![1679647857839](https://user-images.githubusercontent.com/30795397/227473299-f4a9b91f-c5b9-4404-90b0-ce433e7135ad.png)
   
   We found that after setting the `JCommander arity`, it runs.


[GitHub] [hudi] DavidZ1 commented on issue #8267: [SUPPORT] Why some delta commit logs files are not converted to parquet ?

Posted by "DavidZ1 (via GitHub)" <gi...@apache.org>.
DavidZ1 commented on issue #8267:
URL: https://github.com/apache/hudi/issues/8267#issuecomment-1493216537

   Yes, I understand. We were testing the effect of different bucket numbers.


[GitHub] [hudi] victorxiang30 commented on issue #8267: [SUPPORT] Why some delta commit logs files are not converted to parquet ?

Posted by "victorxiang30 (via GitHub)" <gi...@apache.org>.
victorxiang30 commented on issue #8267:
URL: https://github.com/apache/hudi/issues/8267#issuecomment-1637431169

   Hi Danny, my case is a bit different. The avro logs of that particular record group have been compacted, and I can see the corresponding parquet files. My cleaning job has been triggered daily for a few days now, but the avro files created a few days ago have still not been deleted.


[GitHub] [hudi] DavidZ1 commented on issue #8267: [SUPPORT] Why some delta commit logs files are not converted to parquet ?

Posted by "DavidZ1 (via GitHub)" <gi...@apache.org>.
DavidZ1 commented on issue #8267:
URL: https://github.com/apache/hudi/issues/8267#issuecomment-1491339024

   No, we have stopped the job. Log file cleanup is enabled by default, as the DAG diagram shows.
   
   
   


[GitHub] [hudi] DavidZ1 commented on issue #8267: [SUPPORT] Why some delta commit logs files are not converted to parquet ?

Posted by "DavidZ1 (via GitHub)" <gi...@apache.org>.
DavidZ1 commented on issue #8267:
URL: https://github.com/apache/hudi/issues/8267#issuecomment-1482134241

   Thank you for your answer.
   
   I used the `HUDI CLI` tool to view the compaction status of the table and found that many compactions are `inflight`. Does this mean that the asynchronous compaction in the Flink job is too slow and needs more resources, or does it need an offline compaction job?
   
   
   ![f69ab074618168c8e40026c67a7a728](https://user-images.githubusercontent.com/30795397/227402647-ffa831c3-176b-4d3b-98c3-e08559aa5621.png)
   
   ![df6b4140d34c0fd4ae763196aac7f4a](https://user-images.githubusercontent.com/30795397/227402674-0e0bd4fa-670e-4a13-bece-63a0125627ad.png)
   
   
   
   


[GitHub] [hudi] DavidZ1 commented on issue #8267: [SUPPORT] Why some delta commit logs files are not converted to parquet ?

Posted by "DavidZ1 (via GitHub)" <gi...@apache.org>.
DavidZ1 commented on issue #8267:
URL: https://github.com/apache/hudi/issues/8267#issuecomment-1490228324

   We have currently stopped the offline compaction job. We use the bucket index, and the bucket number is set to 100. Under this file group there are currently 13 parquet files, and the log files are still there; they have not been cleaned up.
   
   ![513108a072bac3e368d7791deff778e](https://user-images.githubusercontent.com/30795397/228837762-b4f34036-6b42-4854-952a-9bce101bbe23.png)
   
   ![e30ee1a4cc28e919ae2f12b4d96a002](https://user-images.githubusercontent.com/30795397/228837778-57158599-7fce-4526-aad0-532dc574ed98.png)
   
   


[GitHub] [hudi] DavidZ1 commented on issue #8267: [SUPPORT] Why some delta commit logs files are not converted to parquet ?

Posted by "DavidZ1 (via GitHub)" <gi...@apache.org>.
DavidZ1 commented on issue #8267:
URL: https://github.com/apache/hudi/issues/8267#issuecomment-1493220098

   We use the offline method for compaction. Compaction starts normally, but after running for a while we found that it lags far behind: the instant time being compacted is still from yesterday.
   
   We looked at the `FlinkCompactionConfig` parameters and so far have not found any that can be tuned.
   
   ![1680408973530](https://user-images.githubusercontent.com/30795397/229331242-30ee4d8a-2be9-46ab-baa7-36126cb33c7d.png)
   
   ![1680409026244](https://user-images.githubusercontent.com/30795397/229331245-8fb078e5-d425-4073-a7cd-35cbbb56c023.png)
   
   
   


[GitHub] [hudi] danny0405 commented on issue #8267: [SUPPORT] Why some delta commit logs files are not converted to parquet ?

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on issue #8267:
URL: https://github.com/apache/hudi/issues/8267#issuecomment-1491241122

   Thanks. Is the job still running? Have you enabled incremental cleaning?


[GitHub] [hudi] danny0405 commented on issue #8267: [SUPPORT] Why some delta commit logs files are not converted to parquet ?

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on issue #8267:
URL: https://github.com/apache/hudi/issues/8267#issuecomment-1480611571

   You need to monitor the timeline to see whether any compaction has been in a pending state for a long time. If so, you may need to roll it back first, either by restarting the job or manually through the HUDI CLI.
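
A rough way to decide that a compaction has been pending "for a long time" is to derive its age from the instant time itself. The sketch below assumes that instants are 17-digit timestamps in the form `yyyyMMddHHmmssSSS`, as they appear in this thread, and that they are in the same time zone as the clock they are compared against. The function names and the age threshold are hypothetical choices for illustration.

```python
from datetime import datetime, timedelta


def instant_to_datetime(instant):
    """Parse a 17-digit instant (yyyyMMddHHmmssSSS) into a naive datetime."""
    base = datetime.strptime(instant[:14], "%Y%m%d%H%M%S")
    return base + timedelta(milliseconds=int(instant[14:17]))


def stale_pending(pending_instants, now, max_age):
    """Return the pending instants that are older than max_age at time `now`."""
    return [i for i in pending_instants
            if now - instant_to_datetime(i) > max_age]
```

For example, with `now` set to the time of a check and `max_age=timedelta(minutes=30)`, the returned instants are candidates for closer inspection before deciding on a rollback.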


[GitHub] [hudi] DavidZ1 commented on issue #8267: [SUPPORT] Why some delta commit logs files are not converted to parquet ?

Posted by "DavidZ1 (via GitHub)" <gi...@apache.org>.
DavidZ1 commented on issue #8267:
URL: https://github.com/apache/hudi/issues/8267#issuecomment-1486562349

   At present, our data is written in real time and compaction runs offline in a separate Flink job, but we found that some log files are still not compacted.
   
   The HUDI CLI shows that the compactions complete normally, with nothing abnormal.
   
   1.DAG
   ![1679997323126](https://user-images.githubusercontent.com/30795397/228200049-4a1e934b-5b33-43cb-bf22-d34cd75a314f.png)
   
   2.no parquet file
   ![1679997363383](https://user-images.githubusercontent.com/30795397/228200213-99b4ab44-0249-40ce-b1fc-9979dbfabdec.png)
   
   no parquet files for 13 hour partitions
   ![1679997491851](https://user-images.githubusercontent.com/30795397/228200789-5c5b8a19-f373-4f9a-9439-590fe6f08969.png)
   
   
   3.Compactions status
   ![1679997449250](https://user-images.githubusercontent.com/30795397/228200591-ec0986ac-a02c-4e82-8ccf-62bf6fd4d846.png)
   
   
   


[GitHub] [hudi] danny0405 commented on issue #8267: [SUPPORT] Why some delta commit logs files are not converted to parquet ?

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on issue #8267:
URL: https://github.com/apache/hudi/issues/8267#issuecomment-1492818398

   The bucket number cannot be changed once it is set, but the parallelism can be tweaked.
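
The reason a fixed bucket count matters can be seen from how a hash-based bucket index assigns records: the bucket is the key hash modulo the number of buckets, so a different bucket count sends the same key to a different file group. The sketch below is a simplified illustration only. `stable_hash` uses CRC32 as a stand-in and is not the hash function Hudi uses, and the zero-padded fileId prefix is an assumption based on the fileIds shown in this thread (for example `00000023-...`).

```python
import zlib


def stable_hash(record_key):
    """Deterministic stand-in hash; NOT the hash function Hudi uses."""
    return zlib.crc32(record_key.encode("utf-8"))


def bucket_id(key_hash, num_buckets):
    """Bucket assignment for a hash-based bucket index."""
    return key_hash % num_buckets


def file_id_prefix(bucket):
    """Assumed convention: the bucket number is the zero-padded fileId prefix."""
    return f"{bucket:08d}"
```

A key whose hash is 75 lands in bucket 25 with 50 buckets but in bucket 75 with 100 buckets, so records already written under one setting would no longer be found in the file group that the other setting points to.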


[GitHub] [hudi] danny0405 commented on issue #8267: [SUPPORT] Why some delta commit logs files are not converted to parquet ?

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on issue #8267:
URL: https://github.com/apache/hudi/issues/8267#issuecomment-1483713136

   1. the `--service` param has no value, it is a non-valued param
   2. the path is null: I guess the path means base files to compact, in your use case, there are no parquets but all logs to compact
   3. the `.inflight` file is just a marker file to indicate that the transaction of current instant is on-going.


[GitHub] [hudi] DavidZ1 commented on issue #8267: [SUPPORT] Why some delta commit logs files are not converted to parquet ?

Posted by "DavidZ1 (via GitHub)" <gi...@apache.org>.
DavidZ1 commented on issue #8267:
URL: https://github.com/apache/hudi/issues/8267#issuecomment-1487993254

   Yes, we checked the compaction archive file and found that the compaction for the corresponding commit has completed, but there is actually no fileId parquet file.
   
   
   ![556c395225d6cb7ec60bfc97c4b32fe](https://user-images.githubusercontent.com/30795397/228440036-33be49b9-8581-42e9-ac9b-99adfb8a9541.png)
   
   ![289f141a5bfa99362f941111c6d7194](https://user-images.githubusercontent.com/30795397/228440081-baeef86b-97b1-4370-90be-3a050d842b33.png)
   
   
   ![1680069696475](https://user-images.githubusercontent.com/30795397/228440518-059af26a-d3a8-4a3c-ba4f-e453c3d152a5.png)
   


[GitHub] [hudi] danny0405 commented on issue #8267: [SUPPORT] Why some delta commit logs files are not converted to parquet ?

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on issue #8267:
URL: https://github.com/apache/hudi/issues/8267#issuecomment-1488033858

   > there is actually no fileId parquet file
   
   Confused by your words, can you re-organize it a little?


[GitHub] [hudi] DavidZ1 commented on issue #8267: [SUPPORT] Why some delta commit logs files are not converted to parquet ?

Posted by "DavidZ1 (via GitHub)" <gi...@apache.org>.
DavidZ1 commented on issue #8267:
URL: https://github.com/apache/hudi/issues/8267#issuecomment-1488144130

   > > there is actually no fileId parquet file
   > 
   > Confused by your words, can you re-organize it a little?
   
   Sorry. We found that instant `20230328130031810` has been compacted, but there is no compaction record for fileId `00000000-0b69-4b13-a1b2-677b800e0729`.

