Posted to dev@hbase.apache.org by Szabolcs Bukros <sz...@cloudera.com.INVALID> on 2022/02/14 16:31:29 UTC

[DISCUSS] operator tools, HBase 3 and StoreFileTracking

Hi Folks!

While working on adding tools to handle potential FileBased
StoreFileTracker issues to HBCK2 (HBASE-26624
<https://issues.apache.org/jira/browse/HBASE-26624>) I ran into multiple
problems I'm unsure how to solve.

First of all the tools would rely on files not yet available in any of the
released hbase artifacts. I tried to solve this without changing the hbase
dependency version to keep HBCK2 as hbase version independent as possible,
but none of the solutions I have found looked acceptable:
 - Pushing the logic to the hbase side (as far as I can tell) is not
feasible because it has to be able to repair meta which is easier when
hbase is down and the tool should be able to run without a working hbase.
 - The files tracking the store content are serialized proto objects so
while replicating those files in the operator tools is possible, it would
not be pretty.

Bumping operator tools to use hbase 2.6.0-SNAPSHOT (branch-2 has the SFT
changes) would mean that now we need that or a newer version to build the
project and a version check to avoid runtime problems with the new tools,
but otherwise this looks rather painless and backwards compatible. I know
operator tools tries to avoid having an hbase-specific release, but having
2.6 as the minimum version to build against might be acceptable.
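
As an illustration of the version check mentioned above, here is a minimal
sketch in Java. It assumes the cluster version is available as a plain string
and that the new tools need 2.6.0 or newer; the class name MinVersionCheck and
its methods are hypothetical and are not part of HBase or
hbase-operator-tools. Treating "2.6.0-SNAPSHOT" as equal to 2.6.0 is a
simplification made for this sketch.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not an HBase or operator-tools class. */
final class MinVersionCheck {
  // Matches the leading "major.minor.patch"; any suffix such as "-SNAPSHOT"
  // or "-alpha-3" is ignored (an assumption made for this sketch).
  private static final Pattern VERSION = Pattern.compile("^(\\d+)\\.(\\d+)\\.(\\d+)");

  private MinVersionCheck() {}

  /** Parses "major.minor.patch[-suffix]" into three numeric components. */
  static int[] parse(String version) {
    Matcher m = VERSION.matcher(version);
    if (!m.find()) {
      throw new IllegalArgumentException("Unrecognized version: " + version);
    }
    return new int[] {
      Integer.parseInt(m.group(1)),
      Integer.parseInt(m.group(2)),
      Integer.parseInt(m.group(3))
    };
  }

  /** Returns true when 'actual' is numerically equal to or newer than 'minimum'. */
  static boolean isAtLeast(String actual, String minimum) {
    int[] a = parse(actual);
    int[] b = parse(minimum);
    for (int i = 0; i < 3; i++) {
      if (a[i] != b[i]) {
        return a[i] > b[i];
      }
    }
    return true;
  }

  public static void main(String[] args) {
    // The version would normally come from the cluster; here it is an argument.
    String clusterVersion = args.length > 0 ? args[0] : "2.4.9";
    if (isAtLeast(clusterVersion, "2.6.0")) {
      System.out.println("SFT tools enabled for " + clusterVersion);
    } else {
      System.out.println("SFT tools disabled: need 2.6.0 or newer, found " + clusterVersion);
    }
  }
}
```

A gate like this would let a single operator-tools build refuse to run the
new SFT commands against older clusters instead of failing at runtime.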

While looking into this I also checked what needs to be done to make
operator tools work with hbase 3.0.0-alpha-3-SNAPSHOT. Most of the changes
are backwards compatible, but not all of them, and the ones that aren't would
make a big chunk of Fsck unusable with older hbases. For me that looks
acceptable since this is a major version change, but it would mean I
cannot rely on a potential HBCK3 to fix SFT issues; I would also need a
solution for HBCK2.

I tried to look for plans/direction regarding the new 1.3 operator tools
but could not find any.

Do you think it would be possible to bump the hbase version it uses to
2.6.0-SNAPSHOT?
Do you think it would make sense to start working on an hbase3 compatible
branch or is it too early?

NOTE:
I'm aware hbase has not published SNAPSHOT builds for years, but I do not
know how the internal build system works or whether these artifacts would be
available for internal builds. I also do not know whether they could be made
available if necessary.

Re: [DISCUSS] operator tools, HBase 3 and StoreFileTracking

Posted by Andrew Purtell <ap...@apache.org>.
I opened HBASE-26826 to consider the backport of SFT into branch-2.5.

On Wed, Mar 9, 2022 at 2:47 PM Wellington Chevreuil <
wellington.chevreuil@gmail.com> wrote:

> I would prefer option 2 as suggested by Andrew: SFT backported to 2.5 as
> experimental. Regarding the original "thin client/server master do all the
> fix" approach for hbck2 mentioned by Duo, that has already been relaxed to
> include tools containing fix logic for breakage scenarios where the
> master did not yet have a solution implemented
> (see RegionInfoMismatchTool, FsRegionsMetaRecoverer, and
> MissingTableDescriptorGenerator, for example). Under the strict approach,
> clusters facing such issues would require a whole hbase upgrade to a
> version including the fix logic, which is not feasible for production
> deployments.
>


-- 
Best regards,
Andrew

Unrest, ignorance distilled, nihilistic imbeciles -
    It's what we’ve earned
Welcome, apocalypse, what’s taken you so long?
Bring us the fitting end that we’ve been counting on
   - A23, Welcome, Apocalypse

Re: [DISCUSS] operator tools, HBase 3 and StoreFileTracking

Posted by Wellington Chevreuil <we...@gmail.com>.
I would prefer option 2 as suggested by Andrew: SFT backported to 2.5 as
experimental. Regarding the original "thin client/server master do all the
fix" approach for hbck2 mentioned by Duo, that has already been relaxed to
include tools containing fix logic for breakage scenarios where the
master did not yet have a solution implemented
(see RegionInfoMismatchTool, FsRegionsMetaRecoverer, and
MissingTableDescriptorGenerator, for example). Under the strict approach,
clusters facing such issues would require a whole hbase upgrade to a
version including the fix logic, which is not feasible for production
deployments.

Re: [DISCUSS] operator tools, HBase 3 and StoreFileTracking

Posted by "张铎(Duo Zhang)" <pa...@gmail.com>.
I was just describing our old design choice; it was not made by me...

In fact, I agree that if we want to operate on an active cluster,
we had better go through a procedure.
But as for having a separate HBCK2 repo outside the main code repo, well, I
do not see big advantages and it will introduce more problems.

And I also agree that for some types of operations we do not need to have
an active cluster or master. But we decided to do so in the past and
introduced a maintenance mode, which cost me a lot of time when I wanted to
move balancer code to a sub module and decouple HMaster and HRegionServer.
And still, the decision was not made by me...

So in general, I agree with most of your points. What I want to say is that
we decided to go one way in the past. If we want to break with it, we need to
review the old decision to see whether it is safe to do so; otherwise we may
fall into another hole right after we jump out of the current one...

As for how to implement HBCK2, I made a mistake in suggesting that HBCK2
depend on a SNAPSHOT HBase. Technically there is no problem, but when we want
to make a release, this is not allowed by the ASF release rules...

Thanks.

On Wed, Mar 2, 2022 at 05:49, Josh Elser <el...@apache.org> wrote:

> I tend to lean towards what Andrew is saying here, but I will also admit
> that this is in part from not having had a good user experience with
> bringing up an HMaster in maintenance mode to do surgical stuff (feels
> like two steps instead of just one).
>
> Naively, rebuilding the SFT meta files from the filesystem doesn't
> require the HMaster to be up because there isn't any other "state" to
> consider (which was a big reason behind pushing the work that hbck2 was
> doing into the active master to avoid split-brain).
>
> Is doing logic in HBCK2 that doesn't talk to the HMaster a -1 from you,
> Duo? Similarly, is a utility in hbase-operator-tools (not a part of the
> hbck2 wrapper command) also a -1?
>
> Either is feasible, but I do think trying to build this SFT
> rebuilding/recovery into a maintenance-mode HMaster will be more work.

Re: [DISCUSS] operator tools, HBase 3 and StoreFileTracking

Posted by Josh Elser <el...@apache.org>.
I tend to lean towards what Andrew is saying here, but I will also admit
that this is in part from not having had a good user experience with
bringing up an HMaster in maintenance mode to do surgical stuff (feels
like two steps instead of just one).

Naively, rebuilding the SFT meta files from the filesystem doesn't 
require the HMaster to be up because there isn't any other "state" to 
consider (which was a big reason behind pushing the work that hbck2 was 
doing into the active master to avoid split-brain).
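
To make the "rebuild from the filesystem" idea concrete, here is a minimal
sketch in Java. It is an illustration under stated assumptions, not the
HBCK2 or HBase implementation: it uses java.nio on a local directory rather
than the Hadoop FileSystem API, the class StoreDirScan and its methods are
hypothetical, and the rule "skip entries whose names start with '.' or '_'"
is an assumed stand-in for HBase's real store file validity checks.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper for illustration; not an HBase or operator-tools class. */
final class StoreDirScan {
  private StoreDirScan() {}

  /**
   * Lists the regular files in a store directory that could be recorded in a
   * rebuilt tracker file. No running master is consulted; the directory
   * listing is the only input.
   */
  static List<String> listCandidateStoreFiles(Path storeDir) {
    List<String> names = new ArrayList<>();
    try (DirectoryStream<Path> entries = Files.newDirectoryStream(storeDir)) {
      for (Path entry : entries) {
        String name = entry.getFileName().toString();
        // Assumed filter: hidden or temporary entries are not store files.
        boolean hidden = name.startsWith(".") || name.startsWith("_");
        if (!hidden && Files.isRegularFile(entry)) {
          names.add(name);
        }
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    Collections.sort(names); // deterministic order for the rebuilt list
    return names;
  }

  /** Creates a throwaway directory with two data files and two ignored entries. */
  static Path createDemoStoreDir() {
    try {
      Path dir = Files.createTempDirectory("demo-store");
      Files.createFile(dir.resolve("hfile-b"));
      Files.createFile(dir.resolve("hfile-a"));
      Files.createFile(dir.resolve(".hidden-file"));
      Files.createDirectory(dir.resolve("_tmp"));
      return dir;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    // Prints [hfile-a, hfile-b]
    System.out.println(listCandidateStoreFiles(createDemoStoreDir()));
  }
}
```

A real tool would still have to serialize the resulting list in the tracker's
proto format, which is the part that depends on the HBase artifacts discussed
in this thread.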

Is doing logic in HBCK2 that doesn't talk to the HMaster a -1 from you, 
Duo? Similarly, is a utility in hbase-operator-tools (not a part of the 
hbck2 wrapper command) also a -1?

Either is feasible, but I do think trying to build this SFT
rebuilding/recovery into a maintenance-mode HMaster will be more work.

On 2/21/22 12:27 PM, Andrew Purtell wrote:
> There are some recovery cases where the cluster cannot be expected to be up
> and running. What happens if we have no tooling for those? The user has a
> dead cluster. So I don't think a requirement that the cluster always be up
> and running is sufficient. For this type of recovery operator-tools must
> be able to parse and write on-disk formats. On the other hand, hopefully the
> cases for which that is not true are rare. In HBase 1, we had
> OfflineMetaRebuild. For my operations occasionally it has been necessary, in
> test environments especially where users are not always clueful, and it has
> shortened incident time from many hours to less than one hour. The
> alternative would have been a rebuild from scratch with total data loss,
> which is a totally unsatisfying user experience.

Re: [DISCUSS] operator tools, HBase 3 and StoreFileTracking

Posted by Andrew Purtell <ap...@apache.org>.
There are some recovery cases where the cluster cannot be expected to be up
and running. What happens if we have no tooling for those? The user has a
dead cluster. So I don't think a requirement that the cluster always be up
and running is sufficient. For this type of recovery operator-tools must
be able to parse and write on-disk formats. On the other hand, hopefully the
cases for which that is not true are rare. In HBase 1, we had
OfflineMetaRebuild. For my operations occasionally it has been necessary, in
test environments especially where users are not always clueful, and it has
shortened incident time from many hours to less than one hour. The
alternative would have been a rebuild from scratch with total data loss,
which is a totally unsatisfying user experience.


On Sun, Feb 20, 2022 at 4:29 AM 张铎(Duo Zhang) <pa...@gmail.com> wrote:

> Sorry a bit late...
>
> IIRC, the design of HBCK2 is that most of the actual fix logic should be
> done inside hbase (usually as a procedure), and the hbase-operator-tools is
> just a facade for calling these methods. It will query the cluster to find
> out which features are supported. So in general, the design here is to
> always have the cluster up when fixing. We have a maintenance mode where we
> will just bring up HMaster and make meta table online, without loading any
> other regions.
>
> So I prefer we just use snapshot dependencies of hbase in HBCK2. It is not
> a big deal for end users, because if we have not made the release yet, the
> new fixing options can never actually be used against a production cluster.
>
> Anyway, this means we need to publish nightly builds then.
>
> Thanks.
>
> On Fri, Feb 18, 2022 at 06:40, Peter Somogyi <ps...@apache.org> wrote:
>
> > Makes sense. Thanks Andrew for clarifying!
> >
> > On Thu, Feb 17, 2022, 21:28 Andrew Purtell <ap...@apache.org> wrote:
> >
> > > On Thu, Feb 17, 2022 at 12:19 PM Peter Somogyi <ps...@apache.org> wrote:
> > >
> > > > I like the idea of including the store file tracking in 2.5.0 to unblock
> > > > the HBCK development efforts.
> > > >
> > > > Unfortunately, I was not following its development that much. Can it cause
> > > > any issues if 2.5.0 has the feature but later an incompatible change is
> > > > needed for SFT? Can it be marked as a beta feature where we are free to
> > > > modify interfaces?
> > > >
> > >
> > > Yes, this is what I meant when I suggested we could mark it as
> > > 'experimental'. We have done this in the past. The word 'experimental' is
> > > prominently included adjacent to any discussion of the feature in
> > > documentation and release notes. When we feel for sure it is stable that
> > > word is removed. We can do something different this time of course but that
> > > has been our past practice when introducing new functionality into
> > > releasing code lines. And I presume we would use the Evolving interface
> > > annotation everywhere.
> > >
> > > Peter
> > > >
> > > > On Tue, Feb 15, 2022 at 11:07 PM Andrew Purtell <andrew.purtell@gmail.com> wrote:
> > > >
> > > > > Another option which I do not see mentioned yet is to extract the relevant
> > > > > common proto and source files from the ‘hbase’ repository into a new
> > > > > repository (‘hbase-storage’?), from which we would release artifacts to be
> > > > > consumed by both hbase and hbase-operator-tools. This maintains D.R.Y.
> > > > > through refactoring although it may down the road cause some complexity in
> > > > > coordinating evolution among the three (if not more) repositories and
> > > > > releases produced from them. This is like Josh’s Option 1 but without
> > > > > duplication.
> > > > >
> > > > > Regarding the option 2 issue… If it would help we can drop SFT into
> > > > > branch-2.5 along with the log4j2 changes and release 2.5.0 afterward. We
> > > > > are taking the opportunity of this minor increment to accelerate log4j1
> > > > > retirement, which is why it’s still waiting (but not for long). We can use
> > > > > the same opportunity to release SFT even if we designate it as an
> > > > > experimental feature if that would simplify some other logistics. For what
> > > > > it’s worth.
> > > > >
> > > > > > On Feb 15, 2022, at 7:44 AM, Josh Elser <el...@apache.org> wrote:
> > > > > >
> > > > > > I was talking with Szabolcs prior to him sending this one, and it's a
> > > > > > tricky issue for sure.
> > > > > >
> > > > > > To date, we've solved any HBase API issues by copying code into HBCK2
> > > > > > e.g. HBCKMetaTableAccessor which copies parts of MetaTableAccessor, or we
> > > > > > push the logic down server-side to the HBase Master and invoke it over the
> > > > > > Hbck RPC interface.
> > > > > >
> > > > > > I definitely want to avoid HBase version specific builds of the
> > > > > > operator-tools, so that is not an option in my mind for 2.x. The
> > > > > > discussions we had (that I remember) around HBCK2 were limited in scope to
> > > > > > HBase 2.x.
> > > > > >
> > > > > > Option 1: we copy the necessary proto files from HBase into the
> > > > > > operator-tools and try to remember that, if we make any change to the
> > > > > > serialization of the storefile list files, we have to copy that change to
> > > > > > HBCK2. Brittle on the surface but effective.
> > > > > >
> > > > > > Option 2: We bump HBCK2 to hbase-2.6.0-SNAPSHOT. Problematic until we
> > > > > > make an HBase 2.6.0[-alpha] release. We should already have wire compat
> > > > > > between all of HBase 2.x which makes that a non-issue.
> > > > > >
> > > > > > Option 3: We create an HBCK3 targeted for HBase 3.x. I'm not convinced
> > > > > > we need to do that (hbck for hbase 3.x would be just like hbck for hbase
> > > > > > 2.x). This would also not solve the problem for the SFT feature in hbase
> > > > > > 2.6.
> > > > > >
> > > > > > I think option 3 is a no-go. I am leaning towards option 1 at this
> > > > > > point. Hopefully my thought process is helpful for others to weigh in.
> > > > > >
> > > > > >
> > > > > >> On 2/14/22 11:31 AM, Szabolcs Bukros wrote:
> > > > > >> Hi Folks!
> > > > > >> While working on adding tools to handle potential FileBased
> > > > > >> StoreFileTracker issues to HBCK2 (HBASE-26624
> > > > > >> <https://issues.apache.org/jira/browse/HBASE-26624>) I ran into
> > > > > multiple
> > > > > >> problems I'm unsure how to solve.
> > > > > >> First of all the tools would rely on files not yet available in
> > any
> > > of
> > > > > the
> > > > > >> released hbase artifacts. I tried to solve this without changing
> > the
> > > > > hbase
> > > > > >> dependency version to keep HBCK2 as hbase version independent as
> > > > > possible,
> > > > > >> but none of the solutions I have found looked acceptable:
> > > > > >>  - Pushing the logic to the hbase side (as far as I can tell) is
> > not
> > > > > >> feasible because it has to be able to repair meta which is
> easier
> > > when
> > > > > >> hbase is down and the tool should be able to run without a
> working
> > > > > hbase.
> > > > > >>  - The files tracking the store content are serialized proto
> > objects
> > > > so
> > > > > >> while replicating those files in the operator tools is possible,
> > it
> > > > > would
> > > > > >> not be pretty.
> > > > > >> Bumping operator tools to use hbase 2.6.0-SNAPSHOT (branch-2 has
> > the
> > > > SFT
> > > > > >> changes) would mean that now we need that or a newer version to
> > > build
> > > > > the
> > > > > >> project and a version check to avoid runtime problems with the
> new
> > > > > tools,
> > > > > >> but otherwise this looks rather painless and backwards
> > compatible. I
> > > > > know
> > > > > >> operator tools tries to avoid having a hbase-specific release,
> but
> > > > > having
> > > > > >> 2.6 as a min version to build against might be acceptable.
> > > > > >> While looking into this I also checked what needs to be done to
> > make
> > > > > >> operator tools work with hbase 3.0.0-alpha-3-SNAPSHOT. Most of
> the
> > > > > changes
> > > > > >> are backwards compatible but not all of them and the ones that
> > > aren't
> > > > > would
> > > > > >> make a big chunk of Fsck unusable with older hbases. For me that
> > > looks
> > > > > >> acceptable since this is a major version change, but that would
> > > mean I
> > > > > can
> > > > > >> not rely on a potential HBCK3 to fix SFT issues, I would also
> > need a
> > > > > >> solution for HBCK2.
> > > > > >> I tried to look for plans/direction regarding the new 1.3
> operator
> > > > tools
> > > > > >> but could not find any.
> > > > > >> Do you think it would be possible to bump the hbase version it
> > uses
> > > to
> > > > > >> 2.6.0-SNAPSHOT?
> > > > > >> Do you think it would make sense to start working on a hbase3
> > > > compatible
> > > > > >> branch or is it too early?
> > > > > >> NOTE:
> > > > > >> I'm aware hbase does not publish SNAPSHOT builds for years, but
> I
> > do
> > > > not
> > > > > >> know how the internal build system works and if these artifacts
> > > would
> > > > be
> > > > > >> available for internal builds or not. I also do not know if
> > > necessary
> > > > > could
> > > > > >> they be made available.
> > > > >
> > > >
> > >
> > >
> > > --
> > > Best regards,
> > > Andrew
> > >
> > > Unrest, ignorance distilled, nihilistic imbeciles -
> > >     It's what we’ve earned
> > > Welcome, apocalypse, what’s taken you so long?
> > > Bring us the fitting end that we’ve been counting on
> > >    - A23, Welcome, Apocalypse
> > >
> >
>


-- 
Best regards,
Andrew

Unrest, ignorance distilled, nihilistic imbeciles -
    It's what we’ve earned
Welcome, apocalypse, what’s taken you so long?
Bring us the fitting end that we’ve been counting on
   - A23, Welcome, Apocalypse

Re: [DISCUSS] operator tools, HBase 3 and StoreFileTracking

Posted by "张铎(Duo Zhang)" <pa...@gmail.com>.
Sorry, a bit late...

IIRC, the design of HBCK2 is that most of the actual fix logic should be
done inside hbase (usually as a procedure), and hbase-operator-tools is
just a facade for calling these methods. It queries the cluster to find
out which features are supported. So in general, the design here is to
always have the cluster up when fixing. We have a maintenance mode where we
just bring up the HMaster and bring the meta table online, without loading
any other regions.
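
To make the "query the cluster to find out which features are supported" step concrete, here is a minimal sketch of a version gate. Everything in it is hypothetical: the FeatureGate class, the MIN_SFT_VERSION constant and the 2.6.0 threshold are assumptions for illustration, not the actual HBCK2 code. It only shows comparing a reported version string against a minimum before a fix is offered.

```java
/** Hypothetical helper: decides whether a fix may be offered for a given server version. */
final class FeatureGate {
  // Assumed minimum version for the SFT tooling; the thread has not settled on one.
  static final String MIN_SFT_VERSION = "2.6.0";

  /** Parses "2.6.0-SNAPSHOT" into {2, 6, 0}; any qualifier after the first '-' is ignored. */
  static int[] parse(String version) {
    String core = version.split("-", 2)[0];
    String[] parts = core.split("\\.");
    int[] out = new int[3];
    for (int i = 0; i < 3 && i < parts.length; i++) {
      out[i] = Integer.parseInt(parts[i]);
    }
    return out;
  }

  /** Returns true when actual >= minimum, comparing major, minor, patch in that order. */
  static boolean isAtLeast(String actual, String minimum) {
    int[] a = parse(actual);
    int[] m = parse(minimum);
    for (int i = 0; i < 3; i++) {
      if (a[i] != m[i]) {
        return a[i] > m[i];
      }
    }
    return true;
  }

  public static void main(String[] args) {
    // In a real tool the version string would come from the running cluster.
    for (String v : new String[] {"2.4.9", "2.6.0-SNAPSHOT", "3.0.0-alpha-3-SNAPSHOT"}) {
      System.out.println(v + " -> SFT fixes offered: " + isAtLeast(v, MIN_SFT_VERSION));
    }
  }
}
```

Note that a gate like this treats 2.6.0-SNAPSHOT the same as 2.6.0, which may or may not be what we want for unreleased builds.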

So I prefer we just use snapshot dependencies of hbase in HBCK2. It is not
a big deal for end users: until we have made the release, the new fixing
options cannot actually be used against a production cluster anyway.

This does mean we need to publish nightly builds, though.

Thanks.

Peter Somogyi <ps...@apache.org> wrote on Fri, Feb 18, 2022 at 06:40:

> Makes sense. Thanks Andrew for clarifying!

Re: [DISCUSS] operator tools, HBase 3 and StoreFileTracking

Posted by Peter Somogyi <ps...@apache.org>.
Makes sense. Thanks Andrew for clarifying!

On Thu, Feb 17, 2022, 21:28 Andrew Purtell <ap...@apache.org> wrote:

> On Thu, Feb 17, 2022 at 12:19 PM Peter Somogyi <ps...@apache.org>
> wrote:
>
> > I like the idea of including the store file tracking in 2.5.0 to unblock
> > the HBCK development efforts.
> >
> > Unfortunately, I was not following its development that much. Can it cause
> > any issues if 2.5.0 has the feature but later an incompatible change is
> > needed for SFT? Can it be marked as a beta feature where we are free to
> > modify interfaces?
> >
>
> Yes, this is what I meant when I suggested we could mark it as
> 'experimental'. We have done this in the past. The word 'experimental' is
> prominently included adjacent to any discussion of the feature in
> documentation and release notes. When we feel for sure it is stable that
> word is removed. We can do something different this time of course but that
> has been our past practice when introducing new functionality into
> releasing code lines. And I presume we would use the Evolving interface
> annotation everywhere.

Re: [DISCUSS] operator tools, HBase 3 and StoreFileTracking

Posted by Andrew Purtell <ap...@apache.org>.
On Thu, Feb 17, 2022 at 12:19 PM Peter Somogyi <ps...@apache.org> wrote:

> I like the idea of including the store file tracking in 2.5.0 to unblock
> the HBCK development efforts.
>
> Unfortunately, I was not following its development that much. Can it cause
> any issues if 2.5.0 has the feature but later an incompatible change is
> needed for SFT? Can it be marked as a beta feature where we are free to
> modify interfaces?
>

Yes, this is what I meant when I suggested we could mark it as
'experimental'. We have done this in the past. The word 'experimental' is
prominently included adjacent to any discussion of the feature in
documentation and release notes. Once we are confident it is stable, that
word is removed. We can do something different this time of course, but that
has been our past practice when introducing new functionality into
releasing code lines. And I presume we would use the Evolving interface
annotation everywhere.
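
As a rough illustration of what marking something Evolving looks like in code, here is a self-contained sketch. It deliberately uses a local stand-in annotation rather than the real InterfaceStability.Evolving, and the ExperimentalTracker interface and StabilityCheck class are made-up names, so treat it as an assumption-laden example and not the actual SFT code.

```java
import java.lang.annotation.Documented;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.List;

/** Local stand-in for a stability annotation; signals the API may still change. */
@Documented
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface Evolving {}

/** Hypothetical experimental interface, marked so callers know it is not yet stable. */
@Evolving
interface ExperimentalTracker {
  List<String> load();
}

/** Hypothetical check a build or test could run to list unstable types. */
final class StabilityCheck {
  static boolean isEvolving(Class<?> type) {
    return type.isAnnotationPresent(Evolving.class);
  }

  public static void main(String[] args) {
    System.out.println("ExperimentalTracker evolving: " + isEvolving(ExperimentalTracker.class));
    System.out.println("String evolving: " + isEvolving(String.class));
  }
}
```

The annotation itself changes no behavior; it only documents that the interface may change in incompatible ways while the feature is experimental.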



-- 
Best regards,
Andrew


Re: [DISCUSS] operator tools, HBase 3 and StoreFileTracking

Posted by Peter Somogyi <ps...@apache.org>.
I like the idea of including the store file tracking in 2.5.0 to unblock
the HBCK development efforts.

Unfortunately, I was not following its development that much. Can it cause
any issues if 2.5.0 has the feature but later an incompatible change is
needed for SFT? Can it be marked as a beta feature where we are free to
modify interfaces?

Peter

On Tue, Feb 15, 2022 at 11:07 PM Andrew Purtell <an...@gmail.com>
wrote:

> Another option which I do not see mentioned yet is to extract the relevant
> common proto and source files from the ‘hbase’ repository into a new
> repository (‘hbase-storage’?), from which we would release artifacts to be
> consumed by both hbase and hbase-operator-tools. This maintains D.R.Y.
> through refactoring although it may down the road cause some complexity in
> coordinating evolution among the three (if not more) repositories and
> releases produced from them. This is like Josh’s Option 1 but without
> duplication.
>
> Regarding the option 2 issue… If it would help we can drop SFT into
> branch-2.5 along with the log4j2 changes and release 2.5.0 afterward. We
> are taking the opportunity of this minor increment to accelerate log4j1
> retirement, which is why it’s still waiting (but not for long). We can use
> the same opportunity to release SFT even if we designate it as an
> experimental feature if that would simplify some other logistics. For what
> it’s worth.

Re: [DISCUSS] operator tools, HBase 3 and StoreFileTracking

Posted by Andrew Purtell <an...@gmail.com>.
Another option which I do not see mentioned yet is to extract the relevant common proto and source files from the ‘hbase’ repository into a new repository (‘hbase-storage’?), from which we would release artifacts to be consumed by both hbase and hbase-operator-tools. This maintains D.R.Y. through refactoring although it may down the road cause some complexity in coordinating evolution among the three (if not more) repositories and releases produced from them. This is like Josh’s Option 1 but without duplication. 

Regarding the option 2 issue… If it would help we can drop SFT into branch-2.5 along with the log4j2 changes and release 2.5.0 afterward. We are taking the opportunity of this minor increment to accelerate log4j1 retirement, which is why it’s still waiting (but not for long). We can use the same opportunity to release SFT even if we designate it as an experimental feature if that would simplify some other logistics. For what it’s worth. 

> On Feb 15, 2022, at 7:44 AM, Josh Elser <el...@apache.org> wrote:
> 
> I was talking with Szabolcs prior to him sending this one, and it's a tricky issue for sure.
> 
> To date, we've solved any HBase API issues either by copying code into HBCK2 (e.g. HBCKMetaTableAccessor, which copies parts of MetaTableAccessor) or by pushing the logic down server-side to the HBase Master and invoking it over the Hbck RPC interface.
> 
> I definitely want to avoid HBase-version-specific builds of the operator-tools, so that is not an option in my mind for 2.x. The discussions we had (that I remember) around HBCK2 were limited in scope to HBase 2.x.
> 
> Option 1: we copy the necessary proto files from HBase into the operator-tools and try to remember that, if we make any change to the serialization of the storefile list files, we have to copy that change to HBCK2. Brittle on the surface but effective.
> 
> Option 2: We bump HBCK2 to hbase-2.6.0-SNAPSHOT. Problematic until we make an HBase 2.6.0[-alpha] release. We should already have wire compat between all of HBase 2.x which makes that a non-issue.
> 
> Option 3: We create an HBCK3 targeted for HBase 3.x. I'm not convinced we need to do that (hbck for hbase 3.x would be just like hbck for hbase 2.x). This would also not solve the problem for the SFT feature in hbase 2.6.
> 
> I think option 3 is a no-go. I am leaning towards option 1 at this point. Hopefully my thought process is helpful for others to weigh in.
> 
> 

Re: [DISCUSS] operator tools, HBase 3 and StoreFileTracking

Posted by Josh Elser <el...@apache.org>.
I was talking with Szabolcs prior to him sending this one, and it's a 
tricky issue for sure.

To date, we've solved any HBase API issues either by copying code into 
HBCK2 (e.g. HBCKMetaTableAccessor, which copies parts of 
MetaTableAccessor) or by pushing the logic down server-side to the HBase 
Master and invoking it over the Hbck RPC interface.

I definitely want to avoid HBase-version-specific builds of the 
operator-tools, so that is not an option in my mind for 2.x. The 
discussions we had (that I remember) around HBCK2 were limited in scope 
to HBase 2.x.

Option 1: we copy the necessary proto files from HBase into the 
operator-tools and try to remember that, if we make any change to the 
serialization of the storefile list files, we have to copy that change 
to HBCK2. Brittle on the surface but effective.

Option 2: We bump HBCK2 to hbase-2.6.0-SNAPSHOT. Problematic until we 
make an HBase 2.6.0[-alpha] release. We should already have wire compat 
between all of HBase 2.x which makes that a non-issue.

Option 3: We create an HBCK3 targeted for HBase 3.x. I'm not convinced 
we need to do that (hbck for hbase 3.x would be just like hbck for hbase 
2.x). This would also not solve the problem for the SFT feature in hbase 
2.6.

I think option 3 is a no-go. I am leaning towards option 1 at this 
point. Hopefully my thought process is helpful for others to weigh in.
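
For Option 2, the original mail notes that a version check would be needed to avoid runtime problems with the new tools on older clusters. A minimal sketch of such a guard is below. The class name, the method names and the "2.6.0" minimum are hypothetical and only for illustration (the minimum would change if SFT ships in an earlier release), and treating a -SNAPSHOT or -alpha build as equal to its release version is an assumption. How the tool obtains the cluster version string is left out.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MinVersionCheck {
    // Assumed minimum version carrying the SFT changes; hypothetical value.
    static final String MIN_SFT_VERSION = "2.6.0";

    // Captures major.minor[.patch]; any suffix such as "-SNAPSHOT" is ignored.
    private static final Pattern VERSION =
        Pattern.compile("^(\\d+)\\.(\\d+)(?:\\.(\\d+))?.*$");

    // Parses a version string into {major, minor, patch}; missing patch becomes 0.
    static int[] parse(String version) {
        Matcher m = VERSION.matcher(version);
        if (!m.matches()) {
            throw new IllegalArgumentException("Unrecognized version: " + version);
        }
        int patch = m.group(3) == null ? 0 : Integer.parseInt(m.group(3));
        return new int[] {
            Integer.parseInt(m.group(1)), Integer.parseInt(m.group(2)), patch
        };
    }

    // Returns true when the given version is at or above the assumed minimum.
    static boolean supportsSftTools(String clusterVersion) {
        int[] have = parse(clusterVersion);
        int[] need = parse(MIN_SFT_VERSION);
        for (int i = 0; i < 3; i++) {
            if (have[i] != need[i]) {
                return have[i] > need[i];
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String[] samples = {"2.4.9", "2.6.0-SNAPSHOT", "3.0.0-alpha-3-SNAPSHOT"};
        for (String v : samples) {
            String state = supportsSftTools(v) ? "enabled" : "disabled";
            System.out.println(v + " -> SFT tools " + state);
        }
    }
}
```

With a guard like this, the SFT-specific commands could refuse to run against a cluster that predates the feature while the rest of HBCK2 keeps working unchanged.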

