You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hudi.apache.org by Vinoth Chandar <vi...@apache.org> on 2021/10/06 13:50:30 UTC

Re: [DISCUSS] Refactor hudi-client module for better support of multiple engines

Hi Gary,

We can pass the constructed timeline and filesystem view into the IOHandle.
I think it makes sense for how Flink does things.

Thanks
Vinoth

On Fri, Sep 24, 2021 at 2:04 AM Gary Li <ga...@apache.org> wrote:

> Hi Vinoth,
>
> Currently, each executor of Flink has a timeline server I believe. Do you
> think we can avoid passing the timeline and filesystem view into the
> IOHandle? I mean one IOHandle is handling the IO of one filegroup, and it
> doesn't need to know the timeline and filesystem view of the table, if we
> can define what the IOHandle is supposed to do during the initialization.
> Based on the log I see, constructing the filesystem view of a partition
> with 500 filegroups is taking 200ms. If the AppendHandle is only flushing a
> few records to disk, the actual flush could be faster than filesystem view
> construction.
>
> On Fri, Sep 24, 2021 at 12:03 PM Vinoth Chandar <vi...@apache.org> wrote:
>
> > Thanks for the explanation. I get the streaming aspect better now. Esp in
> > Flink land. Timeline server and remote file system view are what the
> > defaults are. Assuming its a RPC call that takes 10-100 ms to the
> timeline
> > server, not sure how much room there is for optimization for loading of
> the
> > file system view itself. For timeline, it can be redundant (again in
> spark
> > model, it gets passed from driver to executor). Wondering if we can serve
> > timeline also view the server (duh).
> >
> > In flink, currently we run a timeline server per executor? Wondering if
> > that helps or hurts. In spark we run one in the driver alone.
> >
> > If we want to pass in a constructed table filesystem view and timeline
> into
> > IOHandle we can. I am fine with it, but trying to understand what exactly
> > we are solving by that.
> >
> >
> > On Thu, Sep 23, 2021 at 7:05 PM Gary Li <ga...@apache.org> wrote:
> >
> > > Hi Vinoth,
> > >
> > > IMO the IOHandle should be as lightweight as possible, especially when
> we
> > > want to do streaming and near-real-time update(possibly real-time in
> the
> > > future?). Constructing the timeline and filesystem view inside the
> handle
> > > is time-consuming. In some cases, some handles only write a few records
> > in
> > > each commit, when we try to commit very aggressively. The timeline
> server
> > > and remote filesystem view are helpful, but I feel like there is still
> > some
> > > room for improvement.
> > >
> > > Best,
> > > Gary
> > >
> > > On Fri, Sep 24, 2021 at 3:04 AM Vinoth Chandar <vi...@apache.org>
> > wrote:
> > >
> > > > Hi Gary,
> > > >
> > > > So in effect you want to pull all the timeline filtering out of the
> > > handles
> > > > and pass a plan i.e what file slice to work on - to the handle?
> > > > That does sound cleaner. but we need to introduce this additional
> > layer.
> > > > The timeline and filesystem view do live within the table, I believe
> > > today.
> > > >
> > > > Thanks
> > > > Vinoth
> > > >
> > > > On Wed, Sep 22, 2021 at 6:35 PM Gary Li <ga...@apache.org> wrote:
> > > >
> > > > > Hi Vinoth,
> > > > >
> > > > > Thanks for your response. For HoodieIOHandle, IMO we could define
> the
> > > > scope
> > > > > of the Handle during the initialization, so we don't need to care
> > about
> > > > the
> > > > > timeline and table view when actually writing the data. Is that
> > > > possible? A
> > > > > HoodieTable could have many Handles writing data at the same time
> and
> > > it
> > > > > will look cleaner if we can keep the timeline and file system view
> > > inside
> > > > > the table itself.
> > > > >
> > > > > Best,
> > > > > Gary
> > > > >
> > > > > On Sat, Sep 18, 2021 at 12:06 AM Vinoth Chandar <vinoth@apache.org
> >
> > > > wrote:
> > > > >
> > > > > > Hi Gary,
> > > > > >
> > > > > > Thanks for the detailed response. Let me add my take on it.
> > > > > >
> > > > > > >>HoodieFlinkMergeOnReadTable.upsert(List<HoodieRecord>) to use
> the
> > > > > > AppendHandle.write(HoodieRecord) directly,
> > > > > >
> > > > > > I have the same issue on JavaClient, for the Kafka Connect
> > > > > implementation.
> > > > > > I have an idea of how we can implement this. Will raise a PR and
> > get
> > > > your
> > > > > > thoughts.
> > > > > > We can then see if this can be leveraged across Flink and Java
> > > clients.
> > > > > >
> > > > > > On the IOHandle not having the Table inside, I think the File
> > > > > > reader/writer  abstraction exists already and having the Table in
> > the
> > > > io
> > > > > > layers helps us perform I/O
> > > > > > while maintaining consistency with the timeline.
> > > > > >
> > > > > > +1 on the next two points.
> > > > > >
> > > > > > I think these layers have well defined roles, and probably why we
> > are
> > > > > able
> > > > > > to get this far :) . May be we need to pull I/O up into
> > hudi-common ?
> > > > > >
> > > > > > For this project, we can trim the scope to code reuse and moving
> > all
> > > > the
> > > > > > different engine specific implementations up into
> > hudi-client-common.
> > > > > >
> > > > > > What do you think?
> > > > > >
> > > > > > Thanks
> > > > > > Vinoth
> > > > > >
> > > > > >
> > > > > > On Thu, Sep 16, 2021 at 6:55 AM Gary Li <ga...@apache.org>
> wrote:
> > > > > >
> > > > > > > Huge +1. Recently I am working on making the Flink writer in a
> > > > > streaming
> > > > > > > fashion and found the List<HoodieRecord> interface is limiting
> > the
> > > > > > > streaming power of Flink. By switching from
> > > > > > > HoodieFlinkMergeOnReadTable.upsert(List<HoodieRecord>) to use
> the
> > > > > > > AppendHandle.write(HoodieRecord) directly, the throughput was
> > > almost
> > > > > > > doubled and the checkpoint time of the writer was reduced from
> > > > minutes
> > > > > to
> > > > > > > seconds. But I found it really difficult to fit this change
> into
> > > the
> > > > > > > current client interface.
> > > > > > >
> > > > > > > My 2 cents:
> > > > > > >
> > > > > > >    - The HoodieIOHandle should only handle the IO, and not
> having
> > > > > > >    HoodieTable inside.
> > > > > > >    - We need a more streaming-friendly Handle. For Flink, we
> can
> > > > > > definitely
> > > > > > >    change all the batch mode List<HoodieRecord> to processing
> > > > > > HoodieRecord
> > > > > > > one
> > > > > > >    by one, just like the AppendHandle.write(HoodieRecord) and
> > > > > > >    AppendHandle.close(). This will spread the computing cost
> and
> > > > > > >    flattening the curve.
> > > > > > >    - We can use the Handle to precisely control the JVM to
> avoid
> > > OOM
> > > > > and
> > > > > > >    optimize the memory footprint. Then we don't need to
> implement
> > > > > another
> > > > > > >    memory control mechanism in the compute engine itself.
> > > > > > >    - HoodieClient, HoodieTable, HoodieIOHandle, HoodieTimeline,
> > > > > > >    HoodieFileSystemView e.t.c should have a well-defined role
> and
> > > > > > > well-defined
> > > > > > >    layer. We should know when to use what, it should be used by
> > the
> > > > > > driver
> > > > > > > in
> > > > > > >    a single thread or used by the worker in a distributed way.
> > > > > > >
> > > > > > > This is a big project and could benefit Hudi in long term.
> Happy
> > to
> > > > > > discuss
> > > > > > > more in the design doc or PRs.
> > > > > > >
> > > > > > > Best,
> > > > > > > Gary
> > > > > > >
> > > > > > > On Thu, Sep 16, 2021 at 3:21 AM Raymond Xu <
> > > > > xu.shiyan.raymond@gmail.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > +1 that's a great improvement.
> > > > > > > >
> > > > > > > > On Wed, Sep 15, 2021 at 10:40 AM Sivabalan <
> n.siva.b@gmail.com
> > >
> > > > > wrote:
> > > > > > > >
> > > > > > > > > ++1. definitely help's Hudi scale and makes it more
> > > maintainable.
> > > > > > > Thanks
> > > > > > > > > for driving this effort. Mostly devs show interest in major
> > > > > features
> > > > > > > and
> > > > > > > > > don't like to spend time in such foundational work. But as
> > the
> > > > > > project
> > > > > > > > > scales, these foundational work will have a higher returns
> in
> > > the
> > > > > > long
> > > > > > > > run.
> > > > > > > > >
> > > > > > > > > On Wed, Sep 15, 2021 at 8:29 AM Vinoth Chandar <
> > > > vinoth@apache.org>
> > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Another +1 ,  HoodieData abstraction will go a long way
> in
> > > > > reducing
> > > > > > > > LoC.
> > > > > > > > > >
> > > > > > > > > > Happy to work with you to see this through! I really
> > > encourage
> > > > > top
> > > > > > > > > > contributors to the Flink and Java clients as well,
> > > > > > > > > > actively review all PRs, given there are subtle
> differences
> > > > > > > everywhere.
> > > > > > > > > >
> > > > > > > > > > This will help us smoothly provide all the core features
> > > across
> > > > > > > > engines.
> > > > > > > > > > Also help us easily write a DataSet/Row based
> > > > > > > > > > client for Spark as well.
> > > > > > > > > >
> > > > > > > > > > Onwards and upwards
> > > > > > > > > > Vinoth
> > > > > > > > > >
> > > > > > > > > > On Wed, Sep 15, 2021 at 4:57 AM vino yang <
> > > > yanghua1127@gmail.com
> > > > > >
> > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi Ethan,
> > > > > > > > > > >
> > > > > > > > > > > Big +1 for the proposal.
> > > > > > > > > > >
> > > > > > > > > > > Actually, we have discussed this topic before.[1]
> > > > > > > > > > >
> > > > > > > > > > > Will review your refactor PR later.
> > > > > > > > > > >
> > > > > > > > > > > Best,
> > > > > > > > > > > Vino
> > > > > > > > > > >
> > > > > > > > > > > [1]:
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://lists.apache.org/thread.html/r71d96d285c735b1611920fb3e7224c9ce6fd53d09bf0e8f144f4fcbd%40%3Cdev.hudi.apache.org%3E
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Y Ethan Guo <et...@gmail.com> 于2021年9月15日周三
> > > > 下午3:34写道:
> > > > > > > > > > >
> > > > > > > > > > > > Hi all,
> > > > > > > > > > > >
> > > > > > > > > > > > hudi-client module has core Hudi abstractions and
> > client
> > > > > logic
> > > > > > > for
> > > > > > > > > > > > different engines like Spark, Flink, and Java.  While
> > > > > previous
> > > > > > > > effort
> > > > > > > > > > > > (HUDI-538 [1]) has decoupled the integration with
> > Spark,
> > > > > there
> > > > > > is
> > > > > > > > > quite
> > > > > > > > > > > > some code duplication across different engines for
> > almost
> > > > the
> > > > > > > same
> > > > > > > > > > logic
> > > > > > > > > > > > due to the current interface design.  Some part also
> > has
> > > > > > > divergence
> > > > > > > > > > among
> > > > > > > > > > > > engines, making debugging and support difficult.
> > > > > > > > > > > >
> > > > > > > > > > > > I propose to further refactor the hudi-client module
> > with
> > > > the
> > > > > > > goal
> > > > > > > > of
> > > > > > > > > > > > improving the code reuse across multiple engines and
> > > > reducing
> > > > > > the
> > > > > > > > > > > > divergence of the logic among them, so that the core
> > Hudi
> > > > > > action
> > > > > > > > > > > execution
> > > > > > > > > > > > logic should be shared across engines, except for
> > engine
> > > > > > specific
> > > > > > > > > > > > transformations.  Such a pattern also allows easy
> > support
> > > > of
> > > > > > core
> > > > > > > > > Hudi
> > > > > > > > > > > > functionality for all engines in the future.
> > > Specifically,
> > > > > > > > > > > >
> > > > > > > > > > > > (1) Abstracts the transformation boilerplates inside
> > the
> > > > > > > > > > > > HoodieEngineContext and implements the
> engine-specific
> > > data
> > > > > > > > > > > transformation
> > > > > > > > > > > > logic in the subclasses.  Type cast can be done
> inside
> > > the
> > > > > > engine
> > > > > > > > > > > context.
> > > > > > > > > > > > (2) Creates new HoodieData abstraction for passing
> > input
> > > > and
> > > > > > > output
> > > > > > > > > > along
> > > > > > > > > > > > the flow of execution, and uses it in different Hudi
> > > > > > > abstractions,
> > > > > > > > > > e.g.,
> > > > > > > > > > > > HoodieTable, HoodieIOHandle, BaseActionExecutor,
> > instead
> > > of
> > > > > > > > enforcing
> > > > > > > > > > > type
> > > > > > > > > > > > parameters encountering RDD<HoodieRecord> and
> > > > > > List<HoodieRecord>
> > > > > > > > > which
> > > > > > > > > > > are
> > > > > > > > > > > > one source of duplication.
> > > > > > > > > > > > (3) Extracts common execution logic to
> > hudi-client-common
> > > > > > module
> > > > > > > > from
> > > > > > > > > > > > multiple engines.
> > > > > > > > > > > >
> > > > > > > > > > > > As a first step and exploration for item (1) and (3)
> > > above,
> > > > > > I've
> > > > > > > > > tried
> > > > > > > > > > to
> > > > > > > > > > > > refactor the rollback actions and the PR is here
> > > > > > [HUDI-2433][2].
> > > > > > > > In
> > > > > > > > > > this
> > > > > > > > > > > > PR, I completely remove all engine-specific rollback
> > > > packages
> > > > > > and
> > > > > > > > > only
> > > > > > > > > > > keep
> > > > > > > > > > > > one rollback package in hudi-client-common, adding
> ~350
> > > LoC
> > > > > > while
> > > > > > > > > > > deleting
> > > > > > > > > > > > 1.3K LoC.  My next step is to refactor the commit
> > action
> > > > > which
> > > > > > > > > > > encompasses
> > > > > > > > > > > > item (2) above.
> > > > > > > > > > > >
> > > > > > > > > > > > What do you folks think and any other suggestions?
> > > > > > > > > > > >
> > > > > > > > > > > > [1] [HUDI-538] [UMBRELLA] Restructuring hudi client
> > > module
> > > > > for
> > > > > > > > multi
> > > > > > > > > > > engine
> > > > > > > > > > > > support
> > > > > > > > > > > > https://issues.apache.org/jira/browse/HUDI-538
> > > > > > > > > > > > [2] PR: [HUDI-2433] Refactor rollback actions in
> > > > hudi-client
> > > > > > > module
> > > > > > > > > > > > https://github.com/apache/hudi/pull/3664/files
> > > > > > > > > > > >
> > > > > > > > > > > > Best,
> > > > > > > > > > > > - Ethan
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > Regards,
> > > > > > > > > -Sivabalan
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>