Posted to commits@kudu.apache.org by ab...@apache.org on 2020/09/04 13:25:20 UTC

[kudu] 01/03: Add release notes for 1.13.0

This is an automated email from the ASF dual-hosted git repository.

abukor pushed a commit to branch branch-1.13.x
in repository https://gitbox.apache.org/repos/asf/kudu.git

commit 931a4a5f053528a985483677e84fb37d54deeeb4
Author: Attila Bukor <ab...@apache.org>
AuthorDate: Thu Sep 3 14:15:10 2020 +0200

    Add release notes for 1.13.0
    
    Change-Id: Iac2a0ae740c3dabc5e7d8b7ef53312924c6e3532
    Reviewed-on: http://gerrit.cloudera.org:8080/16410
    Tested-by: Kudu Jenkins
    Reviewed-by: Greg Solovyev <gs...@cloudera.com>
    Reviewed-by: Alexey Serbin <as...@cloudera.com>
    Reviewed-by: Grant Henke <gr...@apache.org>
---
 docs/release_notes.adoc | 131 ++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 126 insertions(+), 5 deletions(-)

diff --git a/docs/release_notes.adoc b/docs/release_notes.adoc
index f23e070..044d389 100644
--- a/docs/release_notes.adoc
+++ b/docs/release_notes.adoc
@@ -31,27 +31,134 @@
 [[rn_1.13.0_upgrade_notes]]
 == Upgrade Notes
 
-
-[[rn_1.13.0_obsoletions]]
-== Obsoletions
-
+* The Sentry integration has been removed and the Ranger integration should now
+  be used in its place for fine-grained authorization.
 
 [[rn_1.13.0_deprecations]]
 == Deprecations
 
-Support for Python 2.x and Python 3.4 and earlier is deprecated and may be removed in the next minor release.
+* Support for Python 2.x and Python 3.4 and earlier is deprecated and may be
+  removed in the next minor release.
+* The `kudu-mapreduce` integration has been deprecated and may be removed in the
+  next minor release. Similar functionality and capabilities now exist via the
+  Apache Spark, Apache Hive, Apache Impala, and Apache NiFi integrations.
 
 [[rn_1.13.0_new_features]]
 == New features
 
+* Added table ownership support. All newly created tables are automatically
+  owned by the user creating them. It is also possible to change the owner by
+  altering the table. You can also assign privileges to table owners via Apache
+  Ranger (see link:https://issues.apache.org/jira/browse/KUDU-3090[KUDU-3090]).
+* An experimental feature is added to Kudu that allows it to automatically
+  rebalance tablet replicas among tablet servers. The background task can be
+  enabled by setting the `--auto_rebalancing_enabled` flag on the Kudu masters.
+  Before starting auto-rebalancing on an existing cluster, the CLI rebalancer
+  tool should be run first (see
+  link:https://issues.apache.org/jira/browse/KUDU-2780[KUDU-2780]).
+* Bloom filter column predicate pushdown has been added to allow optimized
+  execution of filters which match on a set of column values with a
+  false-positive rate. Impala queries can already use Bloom filter
+  predicates, yielding performance improvements of 19% to 30% in TPC-H
+  benchmarks and around 41% for distributed joins across large tables.
+  Support for Spark is not yet available (see
+  link:https://issues.apache.org/jira/browse/KUDU-2483[KUDU-2483]).
+* AArch64-based (ARM) architectures are now supported, including published
+  Docker images.
+* The Java client now supports the columnar row format returned from the server
+  transparently. Using this format can reduce the server CPU and size of the
+  request over the network for scans. The columnar format can be enabled via the
+  setRowDataFormat() method on the KuduScanner.
+* An experimental feature that can be enabled by setting the
+  `--enable_workload_score_for_perf_improvement_ops` flag prioritizes flushing
+  and compacting hot tablets.
 
 [[rn_1.13.0_improvements]]
 == Optimizations and improvements
 
+* Hive metastore synchronization now supports Hive 3 and later.
+* The Spark KuduContext accumulator metrics now track operation counts per table
+  instead of cumulatively for all tables.
+* The `kudu local_replica delete` CLI tool now accepts multiple tablet
+  identifiers. Along with the newly added `--ignore_nonexistent` flag, this
+  helps with scripting scenarios when removing multiple tablet replicas from a
+  particular Tablet Server.
+* The web UIs of both Masters and Tablet Servers now display the name of a
+  service thread pool group on the `/threadz` page.
+* Introduced `queue_overflow_rejections_` metrics for both Masters and Tablet
+  Servers, which count the RPC requests of a particular type dropped due to
+  RPC service queue overflow.
+* Introduced a CoDel-like queue control mechanism for the apply queue. This
+  helps to avoid accumulating too many write requests and timing them out in
+  case of seek-bound workloads (e.g., uniform random inserts). The newly
+  introduced queue control mechanism is disabled by default. To enable it, set
+  the Tablet Server’s `--tablet_apply_pool_overload_threshold_ms` flag to an
+  appropriate value, e.g. 250 (see
+  link:https://issues.apache.org/jira/browse/KUDU-1587[KUDU-1587]).
+* Operation accumulators in Spark KuduContext are now tracked on a per-table
+  basis.
+* The Java client’s error collector can now be resized (see
+  link:https://issues.apache.org/jira/browse/KUDU-1422[KUDU-1422]).
+* Calls to the Kudu master server are now drastically reduced when using scan
+  tokens. Previously, deserializing a scan token would result in a
+  GetTableSchema request and potentially a GetTableLocations request. Now the
+  table schema and location information are serialized into the scan token
+  itself, avoiding the need for any requests to the master when processing it.
+* The default size of Master’s RPC queue is now 100 (it was 50 in earlier
+  releases). This is to optimize for use cases where a Kudu cluster has many
+  clients working concurrently.
+* Masters now have an option to cache table location responses. This is
+  targeted for Kudu clusters which have many clients working concurrently. By
+  default, the caching of table location responses is disabled. To enable table
+  location caching, set the proper capacity of the table location cache using
+  Master’s `--table_locations_cache_capacity_mb` flag (setting to 0 disables the
+  caching). An improvement of up to 17% in the GetTableLocations request rate
+  was observed with caching enabled.
+* Removed lock contention on Raft consensus lock in Tablet Servers while
+  processing a write request. This helps to avoid RPC queue overflows when
+  handling concurrent write requests to the same tablet from multiple clients
+  (see link:https://issues.apache.org/jira/browse/KUDU-2727[KUDU-2727]).
+* Master’s performance for handling concurrent GetTableSchema requests has been
+  improved. End-to-end tests indicated up to 15% improvement in sustained
+  request rate for high concurrency scenarios.
+* Kudu servers now use protobuf Arena objects to perform all RPC
+  request/response-related memory allocations. This gives a boost for overall
+  RPC performance, and with further optimization the resulting request rate
+  increased significantly for certain methods. For example, the request rate
+  increased by up to 25% for Master’s GetTabletLocations() RPC in highly
+  concurrent scenarios (see
+  link:https://issues.apache.org/jira/browse/KUDU-636[KUDU-636]).
+* Tablet Servers now use protobuf Arena for allocating Raft-related runtime
+  structures. This results in a substantial reduction in CPU cycles used and
+  increases write throughput (see
+  link:https://issues.apache.org/jira/browse/KUDU-636[KUDU-636]).
+* Tablet Servers now use protobuf Arena for allocating EncodedKeys to reduce
+  allocator contention and improve memory locality (see
+  link:https://issues.apache.org/jira/browse/KUDU-636[KUDU-636]).
+* Bloom filter predicate evaluation for scans can be computationally expensive.
+  A heuristic has been added that checks the rejection rate of the supplied
+  Bloom filter predicate and automatically disables the predicate when the
+  rate falls below a threshold. This helped reduce the regression observed
+  with Bloom filter predicates in TPC-H benchmark query #9 (see
+  link:https://issues.apache.org/jira/browse/KUDU-3140[KUDU-3140]).
+* Improved scan performance of dictionary and plain-encoded string columns by
+  avoiding copying them (see
+  link:https://issues.apache.org/jira/browse/KUDU-2844[KUDU-2844]).
+* Improved maintenance manager's heuristics to prioritize larger memstores
+  (see link:https://issues.apache.org/jira/browse/KUDU-3180[KUDU-3180]).
+* Spark client's KuduReadOptions now supports setting a snapshot timestamp for
+  repeatable reads with READ_AT_SNAPSHOT consistency mode (see
+  link:https://issues.apache.org/jira/browse/KUDU-3177[KUDU-3177]).
 
 [[rn_1.13.0_fixed_issues]]
 == Fixed Issues
 
+* Kudu scans now honor location assignments when multiple tablet servers are
+  co-located with the client.
+* Fixed a bug that caused IllegalArgumentException to be thrown when trying to
+  create a predicate for a DATE column in the Kudu Java client (see
+  link:https://issues.apache.org/jira/browse/KUDU-3152[KUDU-3152]).
+* Fixed a potential race when multiple RPCs work on the same scanner object.
 
 [[rn_1.13.0_wire_compatibility]]
 == Wire Protocol compatibility
@@ -104,6 +211,20 @@ documentation.
 [[rn_1.13.0_contributors]]
 == Contributors
 
+Kudu 1.13.0 includes contributions from 22 people, including 9 first-time
+contributors:
+
+* Jim Apple
+* Kevin J McCarthy
+* Li Zhiming
+* Mahesh Reddy
+* Romain Rigaux
+* RuiChen
+* Shuping Zhou
+* ningw
+* wenjie
+
+
 [[resources_and_next_steps]]
 == Resources