Posted to issues@hbase.apache.org by "Bryan Beaudreault (Jira)" <ji...@apache.org> on 2022/04/12 18:04:00 UTC
[jira] [Updated] (HBASE-26522) Improve documentation of hbase 1.x to 2.x potential incompatibilities

     [ https://issues.apache.org/jira/browse/HBASE-26522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bryan Beaudreault updated HBASE-26522:
--------------------------------------
    Description: 
We're working on a major upgrade of almost 900 tables across 100 production clusters (and corresponding QA environment clusters). We've upgraded about 25% of our QA environment and run into a series of incompatibilities along the way. Most of them have been easy to get around, but I wanted to create this Jira to collect them so that we can make an update to the docs for future upgraders.

My plan is to periodically edit this description to add to the list. If anyone else has anything to contribute, feel free to edit as well or add a comment. 

Incompatibilities to document:
 - HBASE-15676 changed the serialized byte string used for the fuzzy mask. FuzzyRowFilters created by older clients will not match any rows in an hbase2 cluster. This was fixed in HBASE-26537 but should be documented in our upgrade guide.
 - CDH5 catches the exception from a bad HTableDescriptor.getDurability call and returns USE_DEFAULT. In hbase2, if someone creates a table with a bad durability (e.g. DEFAULT instead of USE_DEFAULT), it results in a failure which causes the CreateTableProcedure to retry infinitely with no backoff. This rapid retry caused a lot of pain on the cluster that encountered it: the datanodes could not keep up with the millions of calls to create and delete .regioninfo files.
 - This isn't quite an incompatibility, but HBASE-19389 introduced a concurrency mitigation which may have surprising results for anyone coming from older versions. The defaults are pretty conservative: when writing more than 100 columns, no more than 10 concurrent writes or 20 pending writes are allowed at once.
 - Increments sent from branch-1 clients may get erroneously stored with a timestamp of 0 on hbase2+ clusters: HBASE-26713
 - CheckAndMutate with a "null" compare value used to ignore CompareOp. This was fixed in HBASE-26742, so checkAndMutate effects may change between versions.
 - The client will not know how to handle dangling rep_barrier rows in meta: HBASE-26797
 - The default split policy in hbase2 is SteppingSplitPolicy. This is overall a good policy which is more likely to split small tables to ensure they are spread across more servers. If you upgrade, you may notice your tables suddenly getting split more than you're used to. This may be an issue if you use a row key prefix, because hbase isn't aware of your prefix and may pick split points that separate rows sharing a prefix. You can get around this by defining a RegionSplitRestriction. See HBASE-25766.
 - Regression in meta requests may impact replication on clusters with many regions. Fixed in 2.4.10+, per HBASE-26590
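
To make the row key prefix item above more concrete, here is a minimal sketch of the idea behind a key-prefix split restriction: the candidate split point is truncated to the prefix, so every row sharing that prefix ends up on the same side of the split. This is an illustrative model only, not HBase code. The class name KeyPrefixSplitSketch, the method restrictToPrefix and the sample row key are all made up for the example, and the real behavior and table properties should be taken from HBASE-25766.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/**
 * Illustrative model of a key-prefix split restriction.
 * Not HBase code: the class and method names are hypothetical.
 */
public class KeyPrefixSplitSketch {

    /**
     * Truncates a candidate split point to its first prefixLength bytes.
     * Keys shorter than prefixLength are returned unchanged (as a copy).
     */
    public static byte[] restrictToPrefix(byte[] candidateSplitPoint, int prefixLength) {
        if (prefixLength <= 0) {
            throw new IllegalArgumentException("prefixLength must be positive");
        }
        int length = Math.min(prefixLength, candidateSplitPoint.length);
        return Arrays.copyOf(candidateSplitPoint, length);
    }

    public static void main(String[] args) {
        // Hypothetical row key: a 6-byte prefix followed by an event suffix.
        byte[] candidate = "user42|event0007".getBytes(StandardCharsets.UTF_8);

        // Without a restriction the split could land in the middle of user42's rows.
        // Truncating to the prefix moves the split to the start of that prefix group.
        byte[] restricted = restrictToPrefix(candidate, 6);

        System.out.println(new String(restricted, StandardCharsets.UTF_8)); // prints "user42"
    }
}
```

Because every row starting with "user42" sorts at or after the truncated split point, all of those rows land in the same daughter region.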


> Improve documentation of hbase 1.x to 2.x potential incompatibilities
> ---------------------------------------------------------------------
>
>                 Key: HBASE-26522
>                 URL: https://issues.apache.org/jira/browse/HBASE-26522
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Bryan Beaudreault
>            Assignee: Bryan Beaudreault
>            Priority: Minor


