Posted to commits@hudi.apache.org by vi...@apache.org on 2022/01/11 00:07:43 UTC

[hudi] branch asf-site updated: [MINOR] Fix performance table in marker blog (#4547)

This is an automated email from the ASF dual-hosted git repository.

vinoth pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/hudi.git


The following commit(s) were added to refs/heads/asf-site by this push:
     new 95cdb25  [MINOR] Fix performance table in marker blog (#4547)
95cdb25 is described below

commit 95cdb25e23067440938de4772d382802dee0edcd
Author: Y Ethan Guo <et...@gmail.com>
AuthorDate: Mon Jan 10 16:05:33 2022 -0800

    [MINOR] Fix performance table in marker blog (#4547)
---
 website/blog/2021-08-18-improving-marker-mechanism.md | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/website/blog/2021-08-18-improving-marker-mechanism.md b/website/blog/2021-08-18-improving-marker-mechanism.md
index 840deb5..d5fad9f 100644
--- a/website/blog/2021-08-18-improving-marker-mechanism.md
+++ b/website/blog/2021-08-18-improving-marker-mechanism.md
@@ -61,11 +61,11 @@ We evaluate the write performance over both direct and timeline-server-based mar
 
 As shown below, direct marker mechanism works really well, when a part of the table is written, e.g., 1K out of 165K data files.  However, the time of direct marker operations is non-trivial when we need to write significant number of data files. Compared to the direct marker mechanism, the timeline-server-based marker mechanism generates much fewer files storing markers because of the batch processing, leading to much less time on marker-related I/O operations, thus achieving 31% lower  [...]
 
-| Marker Type |   Input data size   |  Num data files written | Files created for markers | Marker deletion time | Bulk Insert Time (including marker deletion) |
-| ----------- | --------- | :---------: | :---------: | :---------: | :---------: | 
-| Direct | 600MB | 1k | 1k | 5.4secs | - |
-| Direct | 100GB | 165k | 165k | 15min | 55min |
-| Timeline-server-based | 100GB | 165k | 20 | ~3s | 38min |
+| Marker Type | Total Files |  Num data files written | Files created for markers | Marker deletion time | Bulk Insert Time (including marker deletion) |
+| ----------- |-----------| :---------: | :---------: | :---------: | :---------: | 
+| Direct | 165k | 1k | 1k | 5.4secs | - |
+| Direct | 165k | 165k | 165k | 15min | 55min |
+| Timeline-server-based | 165k | 165k | 20 | ~3s | 38min |
 
 ## Conclusion