You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2022/02/23 09:15:29 UTC

[GitHub] [hudi] prashantwason commented on a change in pull request #4640: [HUDI-3225] [RFC-45] for async metadata indexing

prashantwason commented on a change in pull request #4640:
URL: https://github.com/apache/hudi/pull/4640#discussion_r812687856



##########
File path: rfc/rfc-45/rfc-45.md
##########
@@ -0,0 +1,229 @@
+<!--
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+-->
+
+# RFC-45: Asynchronous Metadata Indexing
+
+## Proposers
+
+- @codope
+- @manojpec
+
+## Approvers
+
+- @nsivabalan
+- @vinothchandar
+
+## Status
+
+JIRA: [HUDI-2488](https://issues.apache.org/jira/browse/HUDI-2488)
+
+## Abstract
+
+Metadata indexing (aka metadata bootstrapping) is the process of creation of one
+or more metadata-based indexes, e.g. data partitions to files index, that is
+stored in Hudi metadata table. Currently, the metadata table (referred as MDT
+hereafter) supports single partition which is created synchronously with the
+corresponding data table, i.e. commits are first applied to metadata table
+followed by data table. Our goal for MDT is to support multiple partitions to
+boost the performance of existing index and records lookup. However, the
+synchronous manner of metadata indexing is not very scalable as we add more
+partitions to the MDT because the regular writers (writing to the data table)
+have to wait until the MDT commit completes. In this RFC, we propose a design to
+support asynchronous metadata indexing.
+
+## Background
+
+We can read more about the MDT design
+in [RFC-15](https://cwiki.apache.org/confluence/display/HUDI/RFC+-+15%3A+HUDI+File+Listing+Improvements)
+. Here is a quick summary of the current state (Hudi v0.10.1). MDT is an
+internal Merge-on-Read (MOR) table that has a single partition called `files`
+which stores the data partitions to files index that is used in file listing.
+MDT is co-located with the data table (inside `.hoodie/metadata` directory under
+the basepath). In order to handle multi-writer scenario, users configure lock
+provider and only one writer can access MDT in read-write mode. Hence, any write
+to MDT is guarded by the data table lock. This ensures only one write is
+committed to MDT at any point in time and thus guarantees serializability.
+However, locking overhead adversely affects the write throughput and will reach
+its scalability limits as we add more partitions to the MDT.
+
+## Goals
+
+- Support indexing one or more partitions in MDT while regular writers and table
+  services (such as cleaning or compaction) are in progress.
+- Locking to be as lightweight as possible.
+- Keep required config changes to a minimum to simplify deployment / upgrade in
+  production.
+- Do not require specific ordering of how writers and table service pipelines
+  need to be upgraded / restarted.
+- If an external long-running process is being used to initialize the index, the
+  process should be made idempotent so it can handle errors from previous runs.
+- To re-initialize the index, make it as simple as running the external
+  initialization process again without having to change configs.
+
+## Implementation
+
+### A new Hudi action: INDEX
+
+We introduce a new action `index` which will denote the index building process,
+the mechanics of which is as follows:
+
+1. From an external process, users can issue a CREATE INDEX or similar statement

Review comment:
       +1




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org