Posted to commits@hudi.apache.org by "sivabalan narayanan (Jira)" <ji...@apache.org> on 2022/09/04 22:00:00 UTC

[jira] [Updated] (HUDI-3892) Add HoodieReadClient with java

     [ https://issues.apache.org/jira/browse/HUDI-3892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

sivabalan narayanan updated HUDI-3892:
--------------------------------------
    Sprint: 2022/09/19  (was: 2022/09/05)

> Add HoodieReadClient with java
> ------------------------------
>
>                 Key: HUDI-3892
>                 URL: https://issues.apache.org/jira/browse/HUDI-3892
>             Project: Apache Hudi
>          Issue Type: Task
>          Components: reader-core
>            Reporter: sivabalan narayanan
>            Priority: Critical
>             Fix For: 0.12.1
>
>
> We might need a Hoodie read client in Java, similar to the one we have for Spark.
> [Apache Pulsar|https://github.com/apache/pulsar] is integrating with Hudi, using Hudi as tiered storage to offload cold topic data. When a consumer fetches cold data from a topic, the Pulsar broker determines whether the target data is stored in Pulsar. If the target data is stored in tiered storage (Hudi), the Pulsar broker fetches it from Hudi through the Java API, packages it into the Pulsar format, and dispatches it to the consumer.
> However, we found that the current Hudi implementation doesn't support reading Hudi table records through the Java API, so we cannot read the target data from Hudi into the Pulsar broker, which blocks the Pulsar and Hudi integration.
> h3. What we need
>  # We need Hudi to support reading records through the Java API.
>  # We need Hudi to support reading records in the order they were written, or ordered by specific fields.
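
The two requirements above could be expressed as a small reader contract. The following is only an illustrative sketch: OrderedReadSketch, RecordReader, InMemoryReader and both method names are hypothetical and are not part of Hudi's API, and an in-memory list stands in for a real table so the example can run on its own.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these names exist in Hudi.
final class OrderedReadSketch {

    /** Hypothetical reader contract covering the two requirements. */
    interface RecordReader {
        /** Returns records in the order the writer appended them. */
        List<Map<String, Object>> readInWriteOrder();

        /** Returns records sorted by the given field, whose values must be mutually Comparable. */
        List<Map<String, Object>> readOrderedBy(String field);
    }

    /** In-memory stand-in for a table, used only to make the sketch runnable. */
    static final class InMemoryReader implements RecordReader {
        private final List<Map<String, Object>> records;

        InMemoryReader(List<Map<String, Object>> recordsInWriteOrder) {
            // Defensive copy so later changes to the caller's list do not affect the reader.
            this.records = new ArrayList<>(recordsInWriteOrder);
        }

        @Override
        public List<Map<String, Object>> readInWriteOrder() {
            return new ArrayList<>(records);
        }

        @Override
        @SuppressWarnings("unchecked")
        public List<Map<String, Object>> readOrderedBy(String field) {
            List<Map<String, Object>> sorted = new ArrayList<>(records);
            // List.sort is stable, so records with equal keys keep their write order.
            sorted.sort((a, b) -> ((Comparable<Object>) a.get(field)).compareTo(b.get(field)));
            return sorted;
        }
    }
}
```

A caller such as a broker would build a reader once and choose one of the two read methods per request. Using a stable sort means that ordering by a field never reorders records that share the same value, so write order remains the tie-breaker.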



--
This message was sent by Atlassian Jira
(v8.20.10#820010)