Posted to commits@hudi.apache.org by vi...@apache.org on 2022/01/11 00:07:43 UTC
[hudi] branch asf-site updated: [MINOR] Fix performance table in marker blog (#4547)
This is an automated email from the ASF dual-hosted git repository.
vinoth pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/hudi.git
The following commit(s) were added to refs/heads/asf-site by this push:
new 95cdb25 [MINOR] Fix performance table in marker blog (#4547)
95cdb25 is described below
commit 95cdb25e23067440938de4772d382802dee0edcd
Author: Y Ethan Guo <et...@gmail.com>
AuthorDate: Mon Jan 10 16:05:33 2022 -0800
[MINOR] Fix performance table in marker blog (#4547)
---
website/blog/2021-08-18-improving-marker-mechanism.md | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/website/blog/2021-08-18-improving-marker-mechanism.md b/website/blog/2021-08-18-improving-marker-mechanism.md
index 840deb5..d5fad9f 100644
--- a/website/blog/2021-08-18-improving-marker-mechanism.md
+++ b/website/blog/2021-08-18-improving-marker-mechanism.md
@@ -61,11 +61,11 @@ We evaluate the write performance over both direct and timeline-server-based mar
As shown below, direct marker mechanism works really well, when a part of the table is written, e.g., 1K out of 165K data files. However, the time of direct marker operations is non-trivial when we need to write significant number of data files. Compared to the direct marker mechanism, the timeline-server-based marker mechanism generates much fewer files storing markers because of the batch processing, leading to much less time on marker-related I/O operations, thus achieving 31% lower [...]
-| Marker Type | Input data size | Num data files written | Files created for markers | Marker deletion time | Bulk Insert Time (including marker deletion) |
-| ----------- | --------- | :---------: | :---------: | :---------: | :---------: |
-| Direct | 600MB | 1k | 1k | 5.4secs | - |
-| Direct | 100GB | 165k | 165k | 15min | 55min |
-| Timeline-server-based | 100GB | 165k | 20 | ~3s | 38min |
+| Marker Type | Total Files | Num data files written | Files created for markers | Marker deletion time | Bulk Insert Time (including marker deletion) |
+| ----------- |-----------| :---------: | :---------: | :---------: | :---------: |
+| Direct | 165k | 1k | 1k | 5.4secs | - |
+| Direct | 165k | 165k | 165k | 15min | 55min |
+| Timeline-server-based | 165k | 165k | 20 | ~3s | 38min |
## Conclusion