Posted to commits@hudi.apache.org by "Prashant Wason (Jira)" <ji...@apache.org> on 2023/04/18 06:02:00 UTC
[jira] [Created] (HUDI-6092) Reuse schema objects while reading large number of log blocks
Prashant Wason created HUDI-6092:
------------------------------------
Summary: Reuse schema objects while reading large number of log blocks
Key: HUDI-6092
URL: https://issues.apache.org/jira/browse/HUDI-6092
Project: Apache Hudi
Issue Type: Improvement
Reporter: Prashant Wason
Assignee: Prashant Wason
Some log files may contain a large number of log blocks. When such a log file is read, the schema string is read from each log block's header and parsed into a Schema object. The schema string will most probably be the same across blocks, so parsing it again and again adds parsing overhead as well as memory overhead.
An optimization is to cache the parsed schema objects.
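
A minimal sketch of what such a cache could look like, assuming the schema string itself is used as the cache key. The class name ParsedSchemaCache and its methods are hypothetical and are not taken from the Hudi code base. The parser is passed in as a function so that the example stays self-contained; in Hudi it would presumably wrap Avro's schema parser.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Function;

// Hypothetical helper: caches parsed schema objects keyed by their schema string.
// S stands in for the parsed schema type (assumed to be an Avro Schema in Hudi).
final class ParsedSchemaCache<S> {
    private final Map<String, S> cache = new ConcurrentHashMap<>();
    private final Function<String, S> parser;
    private final AtomicLong parseCount = new AtomicLong();

    // The parser must return a non-null object for every schema string.
    ParsedSchemaCache(Function<String, S> parser) {
        this.parser = parser;
    }

    // Returns the cached object for this schema string, parsing it only on first use.
    S get(String schemaString) {
        return cache.computeIfAbsent(schemaString, s -> {
            parseCount.incrementAndGet();
            return parser.apply(s);
        });
    }

    // Number of times the underlying parser was actually invoked.
    long parseCount() {
        return parseCount.get();
    }
}
```

With a cache like this, reading N log blocks that carry the same schema string would invoke the parser once instead of N times, and all blocks would share a single parsed object. The sketch is unbounded; whether the cache needs a size limit or eviction depends on how many distinct schemas a reader is expected to see.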