Posted to dev@hbase.apache.org by Wellington Chevreuil <we...@gmail.com> on 2021/12/07 15:31:22 UTC

[DISCUSS] Merge HBASE-26067 branch into master, and backport it to base 2 branches

Hello everyone,

We have been making progress on the alternative way of tracking store files
originally proposed by Duo in HBASE-26067.

To briefly summarize it for those not following it, this feature introduces
an abstraction layer to track store files still used/needed by store
engines, allowing for plugging different approaches of identifying store
files required by the given store. The design doc describing it in more
detail is available here:
<https://docs.google.com/document/d/16Nr1Fn3VaXuz1g1FTiME-bnGR3qVK5B-raXshOkDLcY/edit#heading=h.calrs3kn4d8s>

Our main goal with this feature is to avoid the need for temp files
and renames when creating new hfiles (whenever flushing, compacting,
splitting/merging or snapshotting). This is made possible by the pluggable
tracker implementation labeled "FILE". The current behavior using temp dirs
and renames would still be the default approach (labeled "DEFAULT").

This "renameless" approach is appealing for deployments using the Amazon S3
object store file system, where the lack of atomic rename operations
imposes the need for an additional layer of locking (HBOSS), which,
combined with the s3a rename operation, can add performance overhead.
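
As a rough illustration of the difference between the two approaches
described above, here is a toy Python model. It is only a sketch: the class
names, the ".tmp" directory and the ".filelist" manifest are hypothetical
stand-ins chosen for this example, not HBase's actual classes or on-disk
format. The "DEFAULT"-style tracker publishes a file by renaming it out of
a temp directory and treats whatever is in the store directory as valid;
the "FILE"-style tracker writes the file in place and treats it as valid
only once it is recorded in a manifest, so no rename is needed.

```python
import json
import os
import tempfile


class DefaultTracker:
    """Toy "DEFAULT" model: write to a temp dir, then rename into place."""

    def __init__(self, store_dir):
        self.store_dir = store_dir
        self.tmp_dir = os.path.join(store_dir, ".tmp")  # hypothetical layout
        os.makedirs(self.tmp_dir, exist_ok=True)

    def commit(self, name, data):
        tmp_path = os.path.join(self.tmp_dir, name)
        with open(tmp_path, "wb") as f:
            f.write(data)
        # The rename is what publishes the file, so it has to be atomic.
        os.rename(tmp_path, os.path.join(self.store_dir, name))

    def load(self):
        # Every regular file found in the store directory counts as valid.
        return sorted(
            n for n in os.listdir(self.store_dir)
            if os.path.isfile(os.path.join(self.store_dir, n))
        )


class FileTracker:
    """Toy "FILE" model: write in place, record valid files in a manifest."""

    MANIFEST = ".filelist"  # hypothetical name for this sketch

    def __init__(self, store_dir):
        self.store_dir = store_dir
        os.makedirs(store_dir, exist_ok=True)

    def _manifest_path(self):
        return os.path.join(self.store_dir, self.MANIFEST)

    def write(self, name, data):
        # Written straight to its final location; not yet visible to readers.
        with open(os.path.join(self.store_dir, name), "wb") as f:
            f.write(data)

    def commit(self, name):
        # Publishing is a manifest update rather than a rename.
        tracked = set(self.load())
        tracked.add(name)
        with open(self._manifest_path(), "w") as f:
            json.dump(sorted(tracked), f)

    def load(self):
        # Only files listed in the manifest count as valid.
        try:
            with open(self._manifest_path()) as f:
                return json.load(f)
        except FileNotFoundError:
            return []


def demo():
    with tempfile.TemporaryDirectory() as root:
        default = DefaultTracker(os.path.join(root, "default"))
        default.commit("hfile-1", b"cells")

        tracked = FileTracker(os.path.join(root, "file"))
        tracked.write("hfile-1", b"cells")
        before = tracked.load()               # written but not yet committed
        tracked.commit("hfile-1")
        tracked.write("hfile-2", b"partial")  # e.g. a failed flush, never committed
        after = tracked.load()
        return default.load(), before, after


if __name__ == "__main__":
    print(demo())
```

In this toy model a half-written or abandoned file is simply never listed,
which is the property that lets the rename step go away. How the tracker
itself is updated safely on an object store is not modeled here.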

Some test runs on my employer's infrastructure have shown promising results.
A pure-insertion YCSB run has shown a ~6% performance gain on client
writes. Cloning a snapshot of a table with hundreds of regions completes in
half the time. There are also improvements in compaction, split and merge
times.

Talking with Duo Zhang and Josh Elser in the HBASE-26067 Jira, we feel
optimistic that the current implementation is in a good state to get merged
into the master branch, but it would be nice to hear other opinions about it
before we actually commit it. Looking forward to hearing any
thoughts/concerns you might have.

Kind regards,
Wellington.

Re: [DISCUSS] Merge HBASE-26067 branch into master, and backport it to base 2 branches

Posted by "张铎(Duo Zhang)" <pa...@gmail.com>.
I suggest we start a formal vote thread after we finish all the work :)

And #3861 is not a blocker; I think we still have some concerns about how to
get the metrics collected on the region server side over to the master side.
We could do it after merging back the feature branch.

Thanks.

On Wed, Dec 15, 2021 at 06:15, Josh Elser <el...@apache.org> wrote:

> Thanks for your input, Andrew and Nick!
>
> Big thank you to Duo for your hands-on-keyboard commitment as well for
> this whole feature.
>
> I am also happy to target 2.x (and not 2.5.x) for the backport.
>
> In the interest of getting rid of this feature branch (and the
> inevitable rebase pains the longer it runs parallel to master), I'd like
> to move ahead with a concrete plan to merge.
>
> 1. Given there was no objection, do folks feel the need for a VOTE? Even
> if one person would like a VOTE, I'm happy to start that. Please just
> say so.
>
> 2. We have three outstanding PRs for the sake of SFT which are all (IMO)
> very close to merging (#3851, #3861, and #3942). I think 3851 and 3942
> are easy to include and just need one more review cycle. If we feel like
> we are still far away on 3861, I think we set that aside and revisit it
> after the feature merge is done.
>
> If there are any other concerns, please shout!
>
> - Josh

Re: [DISCUSS] Merge HBASE-26067 branch into master, and backport it to base 2 branches

Posted by Josh Elser <el...@apache.org>.
Thanks for your input, Andrew and Nick!

A big thank you to Duo as well for your hands-on-keyboard commitment to
this whole feature.

I am also happy to target 2.x (and not 2.5.x) for the backport.

In the interest of getting rid of this feature branch (and the 
inevitable rebase pains the longer it runs parallel to master), I'd like 
to move ahead with a concrete plan to merge.

1. Given there was no objection, do folks feel the need for a VOTE? Even 
if one person would like a VOTE, I'm happy to start that. Please just 
say so.

2. We have three outstanding PRs for the sake of SFT which are all (IMO) 
very close to merging (#3851, #3861, and #3942). I think 3851 and 3942 
are easy to include and just need one more review cycle. If we feel like 
we are still far away on 3861, I think we set that aside and revisit it 
after the feature merge is done.

If there are any other concerns, please shout!

- Josh

On 12/8/21 9:07 PM, Andrew Purtell wrote:
> +1 for merging to branch-2 (2.6)

Re: [DISCUSS] Merge HBASE-26067 branch into master, and backport it to base 2 branches

Posted by Andrew Purtell <an...@gmail.com>.
+1 for merging to branch-2 (2.6)

> On Dec 8, 2021, at 6:04 PM, 张铎 <pa...@gmail.com> wrote:
> 
> I think here we just want this to be backported to 2.x, not 2.5.x.
> 
> So thanks Andrew for the quick action.
> 
> +1 on merging HBASE-26067 to master and backporting to branch-2(2.6.0).
> 
> Thanks.

Re: [DISCUSS] Merge HBASE-26067 branch into master, and backport it to base 2 branches

Posted by "张铎(Duo Zhang)" <pa...@gmail.com>.
I think here we just want this to be backported to 2.x, not 2.5.x.

So thanks Andrew for the quick action.

+1 on merging HBASE-26067 to master and backporting to branch-2 (2.6.0).

Thanks.

On Thu, Dec 9, 2021 at 08:45, Andrew Purtell <ap...@apache.org> wrote:

> I concur with Nick, but let me help here by branching 2.5 today. It was
> always going to be somewhat arbitrary a point.

Re: [DISCUSS] Merge HBASE-26067 branch into master, and backport it to base 2 branches

Posted by Andrew Purtell <ap...@apache.org>.
I concur with Nick, but let me help here by branching 2.5 today. It was
always going to be a somewhat arbitrary point.

On Wed, Dec 8, 2021 at 3:09 PM Nick Dimiduk <nd...@apache.org> wrote:

> Based solely on the comments made to this thread, I would recommend against
> a merge to branch-2, given that we are very close to 2.5. The points about
> existing gaps seem like things we're not ready to publish in the impending
> minor release. Once we have a branch-2.5, this particular concern of mine
> will be alleviated.
>
> Thanks,
> Nick

-- 
Best regards,
Andrew

Words like orphans lost among the crosstalk, meaning torn from truth's
decrepit hands
   - A23, Crosstalk

Re: [DISCUSS] Merge HBASE-26067 branch into master, and backport it to base 2 branches

Posted by Nick Dimiduk <nd...@apache.org>.
Based solely on the comments made to this thread, I would recommend against
a merge to branch-2, given that we are very close to 2.5. The points about
existing gaps seem like things we're not ready to publish in the impending
minor release. Once we have a branch-2.5, this particular concern of mine
will be alleviated.

Thanks,
Nick

On Wed, Dec 8, 2021 at 1:37 PM Josh Elser <el...@apache.org> wrote:

> I was going to wait for some other folks to chime in, but I guess I can
> be the next one :)
>
> Duo, Wellington, and Szabolcs have been doing some excellent work on the
> storefile tracking (SFT) to a degree that I never expected to see. I
> remember some of the original "Filesystem re-do" issues on Jira. The
> idea was exceptional, but the result seemed unreachable.
>
> These devs, building on the success of what Zach/Stephen first talked
> about in HBASE-24749, came up with what I think is an excellent step
> forward. I've yet to break it via my own testing, but do acknowledge
> that there's always more work to be done.
>
> I think this is at a reasonable place to merge this back into the
> "mainline" branches from the feature branch (HBASE-26067). I believe
> this is ready because:
>
> 1. The feature is completely opt-in (HBase works the same way by default)
> 2. There is an API to migrate tables into the new SFT implementation
> 3. There is also an API to migrate tables back to the default implementation
>
> Some gaps still exist around bulk loading, documentation, snapshots, and
> recovery tooling, but these are being worked on. In the context of S3,
> this makes a significantly more compelling offering of HBase by removing
> the complexity of HBOSS. For HBase in all installations, I think SFT
> makes for a significantly more "deterministic" way of managing
> regions/files.
>
> +1 from me to merge HBASE-26067 into master and branch-2
>
> - Josh
>

Re: [DISCUSS] Merge HBASE-26067 branch into master, and backport it to base 2 branches

Posted by Josh Elser <el...@apache.org>.
I was going to wait for some other folks to chime in, but I guess I can 
be the next one :)

Duo, Wellington, and Szabolcs have been doing some excellent work on the 
storefile tracking (SFT) to a degree that I never expected to see. I 
remember some of the original "Filesystem re-do" issues on Jira. The 
idea was exceptional, but the result seemed unreachable.

These devs, building on the success of what Zach/Stephen first talked 
about in HBASE-24749, came up with what I think is an excellent step 
forward. I've yet to break it via my own testing, but do acknowledge 
that there's always more work to be done.

I think this is at a reasonable place to merge this back into the 
"mainline" branches from the feature branch (HBASE-26067). I believe 
this is ready because:

1. The feature is completely opt-in (HBase works the same way by default)
2. There is an API to migrate tables into the new SFT implementation
3. There is also an API to migrate tables back to the default implementation

Some gaps still exist around bulk loading, documentation, snapshots, and 
recovery tooling, but these are being worked on. In the context of S3, 
this makes a significantly more compelling offering of HBase by removing 
the complexity of HBOSS. For HBase in all installations, I think SFT 
makes for a significantly more "deterministic" way of managing
regions/files.

+1 from me to merge HBASE-26067 into master and branch-2

- Josh

On 12/7/21 10:31 AM, Wellington Chevreuil wrote:
> Hello everyone,
> 
> We have been making progress on the alternative way of tracking store files
> originally proposed by Duo in HBASE-26067.
> 
> To briefly summarize it for those not following it, this feature introduces
> an abstraction layer to track store files still used/needed by store
> engines, allowing for plugging different approaches of identifying store
> files required by the given store. The design doc describing it in more
> detail is available here
> <https://docs.google.com/document/d/16Nr1Fn3VaXuz1g1FTiME-bnGR3qVK5B-raXshOkDLcY/edit#heading=h.calrs3kn4d8s>
> .
> 
> Our main goal within this feature is to avoid the need for using temp files
> and renames when creating new hfiles (whenever flushing, compacting,
> splitting/merging or snapshotting). This is made possible by the pluggable
> tracker implementation labeled "FILE". The current behavior using temp dirs
> and renames would still be the default approach (labeled "DEFAULT").
> 
> This "renameless" approach is appealing for deployments using Amazon S3
> Object store file system, where the lack of atomic rename operations
> imposed the necessity of an additional layer of locking (HBOSS), which
> combined with the s3a rename operation can have a performance overhead.
> 
> Some test runs on my employer infrastructure have shown promising results.
> A pure insertion ycsb run has shown ~6% performance gain on the client
> writes. Snapshot clone of hundreds of regions table completes in half of
> the time. There are also improvements in compaction, splits and merges
> times.
> 
> Talking with Duo Zhang and Josh Elser in the HBASE-26067 jira, we feel
> optimistic that the current implementation is in a good state to get merged
> into master branch, but it would be nice to hear other opinions about it,
> before we effectively commit it. Looking forward to hearing some
> thoughts/concerns you might have.
> 
> Kind regards,
> Wellington.
>