Posted to commits@kudu.apache.org by to...@apache.org on 2017/01/12 20:39:25 UTC

[1/2] kudu git commit: Restructure release notes in preparation for 1.2 release

Repository: kudu
Updated Branches:
  refs/heads/branch-1.2.x 58aa4ea08 -> a23e9d645


Restructure release notes in preparation for 1.2 release

* Moved the 1.1 release notes to the Prior Release Notes page
* On the Prior Release Notes page, removed the list of known
  limitations, upgrade instructions, compatibility notes, etc. for each
  of the past releases. Those things aren't very useful in the case of
  the prior releases, and it would generally be better for people to
  refer to the documentation corresponding to that particular release if
  they are interested in those details.
* Moved the current version Known Issues and Limitations documentation
  to a new separate docs page.

Change-Id: Ia6684706ec9c0b774ec11805cab1d4a3f02412f0
Reviewed-on: http://gerrit.cloudera.org:8080/5602
Tested-by: Kudu Jenkins
Reviewed-by: Jean-Daniel Cryans <jd...@apache.org>
(cherry picked from commit 3d634777e5750faba2a5d91c6967f7fecc5ff151)
Reviewed-on: http://gerrit.cloudera.org:8080/5697
Reviewed-by: Todd Lipcon <to...@apache.org>
Tested-by: Todd Lipcon <to...@apache.org>


Project: http://git-wip-us.apache.org/repos/asf/kudu/repo
Commit: http://git-wip-us.apache.org/repos/asf/kudu/commit/e89cac83
Tree: http://git-wip-us.apache.org/repos/asf/kudu/tree/e89cac83
Diff: http://git-wip-us.apache.org/repos/asf/kudu/diff/e89cac83

Branch: refs/heads/branch-1.2.x
Commit: e89cac8309c942c8b36f662d6cc374f48409105d
Parents: 58aa4ea
Author: Todd Lipcon <to...@apache.org>
Authored: Wed Jan 4 15:55:16 2017 -0800
Committer: Todd Lipcon <to...@apache.org>
Committed: Thu Jan 12 20:38:34 2017 +0000

----------------------------------------------------------------------
 docs/known_issues.adoc                          |  95 +++++
 docs/prior_release_notes.adoc                   | 356 +++++++------------
 docs/release_notes.adoc                         | 193 +---------
 docs/support/jekyll-templates/document.html.erb |   1 +
 4 files changed, 235 insertions(+), 410 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/kudu/blob/e89cac83/docs/known_issues.adoc
----------------------------------------------------------------------
diff --git a/docs/known_issues.adoc b/docs/known_issues.adoc
new file mode 100644
index 0000000..edb8afb
--- /dev/null
+++ b/docs/known_issues.adoc
@@ -0,0 +1,95 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+[[known_issues_and_limitations]]
+= Known Issues and Limitations
+
+:author: Kudu Team
+:imagesdir: ./images
+:icons: font
+:toc: left
+:toclevels: 3
+:doctype: book
+:backend: html5
+:sectlinks:
+:experimental:
+
+== Schema and Usage Limitations
+* Kudu is primarily designed for analytic use cases. You are likely to encounter issues if
+  a single row contains multiple kilobytes of data.
+
+* The columns which make up the primary key must be listed first in the schema.
+
+* Key columns cannot be altered. You must drop and recreate a table to change its keys.
+
+* Key columns must not be null.
+
+* Columns with `DOUBLE`, `FLOAT`, or `BOOL` types are not allowed as part of a
+  primary key definition.
+
+* Type and nullability of existing columns cannot be changed by altering the table.
+
+* A table’s primary key cannot be changed.
+
+* Dropping a column does not immediately reclaim space. Compaction must run first.
+There is no way to run compaction manually, but dropping the table will reclaim the
+space immediately.
+
+== Partitioning Limitations
+* Tables must be manually pre-split into tablets using simple or compound primary
+  keys. Automatic splitting is not yet possible. Range partitions may be added
+  or dropped after a table has been created. See
+  link:schema_design.html[Schema Design] for more information.
+
+* Data in existing tables cannot currently be automatically repartitioned. As a workaround,
+  create a new table with the new partitioning and insert the contents of the old
+  table.
+
+== Replication and Backup Limitations
+* Kudu does not currently include any built-in features for backup and restore.
+  Users are encouraged to use tools such as Spark or Impala to export or import
+  tables as necessary.
+
+== Impala Limitations
+
+* Updates, inserts, and deletes via Impala are non-transactional. If a query
+  fails part of the way through, its partial effects will not be rolled back.
+
+* No timestamp and decimal type support.
+
+* The maximum parallelism of a single query is limited to the number of tablets
+  in a table. For good analytic performance, aim for 10 or more tablets per host
+  or use large tables.
+
+== Security Limitations
+
+* Authentication and authorization features are not implemented.
+* Data encryption is not built in. Kudu has been reported to run correctly
+  on systems using local block device encryption (e.g. `dmcrypt`).
+
+== Other Known Issues
+
+The following are known bugs and issues with the current release of Kudu. They will
+be addressed in later releases. Note that this list is not exhaustive, and is meant
+to communicate only the most important known issues.
+
+* If the Kudu master is configured with the `-log_force_fsync_all` option, tablet servers
+  and clients will experience frequent timeouts, and the cluster may become unusable.
+
+* If a tablet server has a very large number of tablets, it may take several minutes
+  to start up. It is recommended to limit the number of tablets per server to 100 or fewer.
+  Consider this limitation when pre-splitting your tables. If you notice slow start-up times,
+  you can monitor the number of tablets per server in the web UI.
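
The tablet-count guidance in known_issues.adoc above (100 or fewer tablets per
server, to be considered when pre-splitting) can be checked with simple
arithmetic before a table is created. The following is only a rough sketch:
the function names are hypothetical and not part of any Kudu client, and it
assumes one hash level combined with range partitioning, replicas spread
evenly across tablet servers, and that the limit counts hosted replicas.

```python
import math

# Recommended ceiling taken from the known-issues text above.
RECOMMENDED_MAX_TABLETS_PER_SERVER = 100


def replicas_per_server(hash_buckets, range_partitions,
                        replication_factor, num_tablet_servers):
    """Estimate how many tablet replicas each server hosts for one table.

    Assumes tablets = hash_buckets * range_partitions and an even spread
    of replicas across servers (both are simplifying assumptions).
    """
    tablets = hash_buckets * range_partitions
    replicas = tablets * replication_factor
    return math.ceil(replicas / num_tablet_servers)


def within_recommendation(hash_buckets, range_partitions,
                          replication_factor, num_tablet_servers):
    """True if the estimate stays at or below the recommended ceiling."""
    estimate = replicas_per_server(hash_buckets, range_partitions,
                                   replication_factor, num_tablet_servers)
    return estimate <= RECOMMENDED_MAX_TABLETS_PER_SERVER


if __name__ == "__main__":
    # Hypothetical cluster: 5 tablet servers, replication factor 3.
    print(replicas_per_server(16, 4, 3, 5))
    print(within_recommendation(32, 10, 3, 5))
```

The estimate covers a single table; on a real cluster the per-server total is
the sum over all tables, which the web UI mentioned above reports directly.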

http://git-wip-us.apache.org/repos/asf/kudu/blob/e89cac83/docs/prior_release_notes.adoc
----------------------------------------------------------------------
diff --git a/docs/prior_release_notes.adoc b/docs/prior_release_notes.adoc
index 228592a..45b0b5e 100644
--- a/docs/prior_release_notes.adoc
+++ b/docs/prior_release_notes.adoc
@@ -15,7 +15,7 @@
 // specific language governing permissions and limitations
 // under the License.
 
-[[release_notes]]
+[[prior_release_notes]]
 = Apache Kudu Prior Version Release Notes
 
 :author: Kudu Team
@@ -28,6 +28,128 @@
 :sectlinks:
 :experimental:
 
+This section reproduces the release notes for new features and incompatible
+changes in prior releases of Apache Kudu.
+
+
+NOTE: The lists of known issues and limitations for prior releases are not
+reproduced on this page. Please consult the
+link:http://kudu.apache.org/releases/[documentation of the appropriate release]
+for a list of known issues and limitations.
+
+[[rn_1.1.0]]
+== Release notes specific to 1.1.0
+
+[[rn_1.1.0_new_features]]
+== New features
+
+* The Python client has been brought up to feature parity with the Java and {cpp} clients
+  and as such the package version will be brought to 1.1 with this release (from 0.3). A
+  list of the highlights can be found below.
+    ** Improved Partial Row semantics
+    ** Range partition support
+    ** Scan Token API
+    ** Enhanced predicate support
+    ** Support for all Kudu data types (including a mapping of Python's `datetime.datetime` to
+    `UNIXTIME_MICROS`)
+    ** Alter table support
+    ** Enabled Read at Snapshot for Scanners
+    ** Enabled Scanner Replica Selection
+    ** A few bug fixes for Python 3 in addition to various other improvements.
+
+* IN LIST predicate pushdown support was added to allow optimized execution of filters which
+  match on a set of column values. Support for Spark, Map Reduce and Impala queries utilizing
+  IN LIST pushdown is not yet complete.
+
+* The Java client now features client-side request tracing in order to help troubleshoot timeouts.
+  Error messages are now augmented with traces that show which servers were contacted before the
+  timeout occurred instead of just the last error. The traces also contain RPCs that were
+  required to fulfill the client's request, such as contacting the master to discover a tablet's
+  location. Note that the traces are not available for successful requests and are not
+  programmatically queryable.
+
+== Optimizations and improvements
+
+* Kudu now publishes JAR files for Spark 2.0 compiled with Scala 2.11 along with the
+  existing Spark 1.6 JAR compiled with Scala 2.10.
+
+* The Java client now allows configuring scanners to read from the closest replica instead of
+  the known leader replica. The default remains the latter. Use the relevant `ReplicaSelection`
+  enum with the scanner's builder to change this behavior.
+
+* Tablet servers use a new policy for retaining write-ahead log (WAL) segments.
+  Previously, servers used the 'log_min_segments_to_retain' flag to prioritize
+  any flushes which were retaining log segments past the configured value (default 2).
+  This policy caused servers to flush in-memory data more frequently than necessary,
+  limiting write performance.
++
+The new policy introduces a new flag 'log_target_replay_size_mb' which
+  determines the threshold at which write-ahead log retention will prioritize flushes.
+  The new flag is considered experimental and users should not need to modify
+  its value.
++
+The improved policy has been seen to improve write performance in some use cases
+  by a factor of 2x relative to the old policy.
+
+* Kudu's implementation of the Raft consensus algorithm has been improved to include
+  a "pre-election" phase. This can improve the stability of tablet leader election
+  in high-load scenarios, especially if each server hosts a high number of tablets.
+
+* Tablet server start-up time has been substantially improved in the case that
+  the server contains a high number of tombstoned tablet replicas.
+
+=== Command line tools
+
+* The tool `kudu tablet leader_step_down` has been added to manually force a leader to step down.
+* The tool `kudu remote_replica copy` has been added to manually copy a replica from
+  one running tablet server to another.
+* The tool `kudu local_replica delete` has been added to delete a replica of a tablet.
+* The `kudu test loadgen` tool has been added to replace the obsoleted
+  `insert-generated-rows` standalone binary. The new tool is enriched with
+  additional functionality and can be used to run load generation tests against
+  a Kudu cluster.
+
+== Wire protocol compatibility
+
+Kudu 1.1.0 is wire-compatible with previous versions of Kudu:
+
+* Kudu 1.1 clients may connect to servers running Kudu 1.0. If the client uses the new
+  'IN LIST' predicate type, an error will be returned.
+* Kudu 1.0 clients may connect to servers running Kudu 1.1 without limitations.
+* Rolling upgrade between Kudu 1.0 and Kudu 1.1 servers is believed to be possible
+  though has not been sufficiently tested. Users are encouraged to shut down all nodes
+  in the cluster, upgrade the software, and then restart the daemons on the new version.
+
+[[rn_1.1.0_incompatible_changes]]
+== Incompatible changes in Kudu 1.1.0
+
+=== Client APIs ({cpp}/Java/Python)
+
+* The {cpp} client no longer requires the
+  link:https://gcc.gnu.org/onlinedocs/libstdc++/manual/using_dual_abi.html[old gcc5 ABI].
+  Which ABI is actually used depends on the compiler configuration. Some new distros
+  (e.g. Ubuntu 16.04) will use the new ABI. Your application must use the same ABI as is
+  used by the client library; an easy way to guarantee this is to use the same compiler
+  to build both.
+
+* The {cpp} client's `KuduSession::CountBufferedOperations()` method is
+  deprecated. Its behavior is inconsistent unless the session runs in the
+  `MANUAL_FLUSH` mode. Instead, to get the number of buffered operations, count
+  invocations of the `KuduSession::Apply()` method since the last
+  `KuduSession::Flush()` call or, if using asynchronous flushing, since the last
+  invocation of the callback passed into `KuduSession::FlushAsync()`.
+
+* The Java client's `OperationResponse.getWriteTimestamp` method was renamed to `getWriteTimestampRaw`
+  to emphasize that it doesn't return milliseconds, unlike what its Javadoc indicated. The renamed
+  method was also hidden from the public APIs and should not be used.
+
+* The Java client's sync API (`KuduClient`, `KuduSession`, `KuduScanner`) used to throw either
+  a `NonRecoverableException` or a `TimeoutException` for a timeout, and now it's only possible for the
+  client to throw the former.
+
+* The Java client's handling of errors in `KuduSession` was modified so that subclasses of
+  `KuduException` are converted into RowErrors instead of being thrown.
+
 [[rn_1.0.1]]
 == Release notes specific to 1.0.1
 
@@ -397,38 +519,6 @@ This feature may be configured using the `fs_data_dirs_reserved_bytes` and
   replication on a previously-unreplicated table. This change is internal and
   should not be visible to users.
 
-[[rn_0.10.0_upgrade]]
-=== Upgrading from 0.9.x to 0.10.0
-
-Before upgrading, see <<rn_0.10.0_incompatible_changes>> and
-<<rn_0.10.0_downgrade>>.
-
-To upgrade from Kudu 0.9.x to Kudu 0.10.0, perform the following high-level
-steps, which are detailed in the installation guide under
-link:installation.html#upgrade_procedure[Upgrade Procedure]:
-
-. Shut down all Kudu services.
-. Install the new Kudu packages or parcels, or install Kudu 0.10.0 from source.
-. Restart all Kudu services.
-
-WARNING: Rolling upgrades are not supported when upgrading from Kudu 0.9.x to
-0.10.0 and they are known to cause errors in this release. If you run into a
-problem after an accidental rolling upgrade, shut down all services and then
-restart all services and the system should come up properly.
-
-NOTE: For the duration of the Kudu Beta, upgrade instructions are generally
-only given for going from the previous latest version to the newly released
-version.
-
-[[rn_0.10.0_downgrade]]
-=== Downgrading from 0.10.0 to 0.9.x
-
-After upgrading to Kudu 0.10.0, it is possible to downgrade to 0.9.x with the
-following exceptions:
-
-. Tables created in 0.10.0 will not be accessible after a downgrade to 0.9.x
-. A multi-master setup formatted in 0.10.0 may not be downgraded to 0.9.x
-
 [[rn_0.9.1]]
 == Release notes specific to 0.9.1
 
@@ -440,18 +530,6 @@ See also +++<a href="https://issues.apache.org/jira/issues/?jql=project%20%3D%20
 for Kudu 0.9.1</a>+++ and +++<a href="https://github.com/apache/kudu/compare/0.9.0...0.9.1">Git
 changes between 0.9.0 and 0.9.1</a>+++.
 
-[[rn_0.9.1_upgrade]]
-=== Upgrading from 0.9.0 to 0.9.1
-
-Before upgrading to Kudu 0.9.1 from Kudu 0.8.0, please read the <<rn_0.9.0>>.
-
-Upgrading from 0.8.0 or 0.9.0 to 0.9.1 is supported. To upgrade from Kudu 0.8.0
-or Kudu 0.9.0 to Kudu 0.9.1, use the procedure documented in <<rn_0.9.0_upgrade>>.
-
-NOTE: For the duration of the Kudu Beta, upgrade instructions are generally
-only given for going from the previous latest version to the newly released
-version.
-
 [[rn_0.9.1_fixed_issues]]
 === Fixed Issues
 
@@ -559,30 +637,6 @@ All Kudu clients have longer default timeout values, as listed below.
   link:http://getkudu.io/2016/04/26/ycsb.html[Experiments using YCSB] indicate that these
   values will provide better throughput for write-heavy applications on typical server hardware.
 
-[[rn_0.9.0_upgrade]]
-=== Upgrading from 0.8.0 to 0.9.x
-
-Before upgrading, see <<rn_0.9.0_incompatible_changes>> and
-<<rn_0.9.0_client_compatibility>>. To upgrade from Kudu 0.8.0 to 0.9.0, perform
-the following high-level steps, which are detailed in the installation guide
-under link:installation.html#upgrade_procedure[Upgrade Procedure]:
-
-. Shut down all Kudu services.
-. Install the new Kudu packages or parcels, or install Kudu 0.9.1 from source.
-. Restart all Kudu services.
-
-It is technically possible to upgrade Kudu using rolling restarts, but it has not
-been tested and is not recommended.
-
-NOTE: For the duration of the Kudu Beta, upgrade instructions are only given for going
-from the previous latest version to the newest.
-
-[[rn_0.9.0_client_compatibility]]
-=== Client compatibility
-
-Masters and tablet servers should be upgraded before clients are upgraded. For specific
-information about client compatibility, see the <<rn_0.9.0_incompatible_changes>> section.
-
 
 [[rn_0.8.0]]
 == Release notes specific to 0.8.0
@@ -779,15 +833,6 @@ previous link and link:http://developerblog.redhat.com/2015/02/05/gcc5-and-the-c
 
 - The Python client is no longer considered experimental.
 
-=== Limitations
-
-See also <<beta_limitations>>. Where applicable, this list adds to or overrides that
-list.
-
-==== Operating System Limitations
-* Kudu 0.7 is known to work on RHEL 7 or 6.4 or newer, CentOS 7 or 6.4 or newer, Ubuntu
-Trusty, and SLES 12. Other operating systems may work but have not been tested.
-
 
 [[rn_0.6.0]]
 == Release notes specific to 0.6.0
@@ -806,160 +851,9 @@ consistent with the C++ client.
 - OSX is now supported for single-host development. Please consult its specific installation
 instructions in link:installation.html#osx_from_source[OS X].
 
-=== Limitations
-
-See also <<beta_limitations>>. Where applicable, this list adds to or overrides that
-list.
-
-==== Operating System Limitations
-* Kudu 0.6 is known to work on RHEL 6.4 or newer, CentOS 6.4 or newer, and Ubuntu
-Trusty. Other operating systems may work but have not been tested.
-
-==== API Limitations
-* The Python client is still considered experimental.
-
 
 [[rn_0.5.0]]
 == Release Notes Specific to 0.5.0
 
-=== Limitations
-
-See also <<beta_limitations>>. Where applicable, this list adds to or overrides that
-list.
-
-==== Operating System Limitations
-* Kudu 0.5 is known to work on RHEL 7 or 6.4 or newer, CentOS 7 or 6.4 or newer, Ubuntu
-Trusty, and SLES 12. Other operating systems may work but have not been tested.
-
-==== API Limitations
-* The Python client is considered experimental.
-
-== About the Kudu Public Beta
-
-Releases of Apache Kudu prior to 1.0 are considered beta. Do not run beta releases on production clusters.
-During the public beta period, Kudu will be supported via a
-link:https://issues.cloudera.org/projects/KUDU[public JIRA] and a public
-link:http://mail-archives.apache.org/mod_mbox/kudu-user/[mailing list], which will be
-monitored by the Kudu development team and community members. Commercial support
-is not available at this time.
-
-* You can submit any issues or feedback related to your Kudu experience via either
-the JIRA system or the mailing list. The Kudu development team and community members
-will respond and assist as quickly as possible.
-* The Kudu team will work with early adopters to fix bugs and release new binary drops
-when fixes or features are ready. However, we cannot commit to issue resolution or
-bug fix delivery times during the public beta period, and it is possible that some
-fixes or enhancements will not be selected for a release.
-* We can't guarantee time frames or contents for future beta code drops. However,
-they will be announced to the user group when they occur.
-* No guarantees are made regarding upgrades from this release to follow-on releases.
-While multiple drops of beta code are planned, we can't guarantee their schedules
-or contents.
-
-
-[[beta_limitations]]
-=== Limitations of the Kudu Public Beta
-
-Items in this list may be amended or superseded by limitations listed in the release
-notes for specific Kudu releases above.
-
-
-==== Schema Limitations
-* Kudu is primarily designed for analytic use cases and, in the beta release,
-you are likely to encounter issues if a single row contains multiple kilobytes of data.
-* The columns which make up the primary key must be listed first in the schema.
-* Key columns cannot be altered. You must drop and recreate a table to change its keys.
-* Key columns must not be null.
-* Columns with `DOUBLE`, `FLOAT`, or `BOOL` types are not allowed as part of a
-primary key definition.
-* Type and nullability of existing columns cannot be changed by altering the table.
-* A table’s primary key cannot be changed.
-* Dropping a column does not immediately reclaim space. Compaction must run first.
-There is no way to run compaction manually, but dropping the table will reclaim the
-space immediately.
-
-==== Ingest Limitations
-* Ingest via Sqoop or Flume is not supported in the public beta. The recommended
-approach for bulk ingest is to use Impala’s `CREATE TABLE AS SELECT` functionality
-or use the Kudu Java or C++ API.
-* Tables must be manually pre-split into tablets using simple or compound primary
-keys. Automatic splitting is not yet possible. See
-link:schema_design.html[Schema Design].
-* Tablets cannot currently be merged. Instead, create a new table with the contents
-of the old tables to be merged.
-
-==== Replication and Backup Limitations
-* Replication and failover of Kudu masters is considered experimental. It is
-recommended to run a single master and periodically perform a manual backup of
-its data directories.
-
-==== Impala Limitations
-* To use Kudu with Impala, you must install a special release of Impala called
-Impala_Kudu. Obtaining and installing a compatible Impala release is detailed in Kudu's
-link:kudu_impala_integration.html[Impala Integration] documentation.
-* To use Impala_Kudu alongside an existing Impala instance, you must install using parcels.
-* Updates, inserts, and deletes via Impala are non-transactional. If a query
-fails part of the way through, its partial effects will not be rolled back.
-* All queries will be distributed across all Impala hosts which host a replica
-of the target table(s), even if a predicate on a primary key could correctly
-restrict the query to a single tablet. This limits the maximum concurrency of
-short queries made via Impala.
-* No timestamp and decimal type support.
-* The maximum parallelism of a single query is limited to the number of tablets
-in a table. For good analytic performance, aim for 10 or more tablets per host
-or use large tables.
-* Impala is only able to push down predicates involving `=`, `<=`, `>=`,
-or `BETWEEN` comparisons between any column and a literal value, and `<` and `>`
-for integer columns only. For example, for a table with an integer key `ts`, and
-a string key `name`, the predicate `WHERE ts >= 12345` will convert into an
-efficient range scan, whereas `where name > 'lipcon'` will currently fetch all
-data from the table and evaluate the predicate within Impala.
-
-==== Security Limitations
-
-* Authentication and authorization are not included in the public beta.
-* Data encryption is not included in the public beta.
-
-==== Client and API Limitations
-
-* Potentially-incompatible C++, Java and Python API changes may be required during the
-public beta.
-* `ALTER TABLE` is not yet fully supported via the client APIs. More `ALTER TABLE`
-operations will become available in future betas.
-
-==== Application Integration Limitations
-
-* The Spark DataFrame implementation is not yet complete.
-
-==== Other Known Issues
-
-The following are known bugs and issues with the current release of Kudu. They will
-be addressed in later beta releases.
-
-* If the Kudu master is configured with the `-log_force_fsync_all` option, tablet servers
-and clients will experience frequent timeouts, and the cluster may become unusable.
-
-* If a tablet server has a very large number of tablets, it may take several minutes
-to start up. It is recommended to limit the number of tablets per server to 100 or fewer.
-Consider this limitation when pre-splitting your tables. If you notice slow start-up times,
-you can monitor the number of tablets per server in the web UI.
-
-== Resources
-
-- link:http://getkudu.io[Kudu Website]
-- link:http://github.com/apache/kudu[Kudu GitHub Repository]
-- link:index.html[Kudu Documentation]
-
-== Installation Options
-* A Quickstart VM is provided to get you up and running quickly.
-* You can install Kudu using provided deb/yum packages.
-* You can install Kudu, in clusters managed by Cloudera Manager, using parcels or deb/yum packages.
-* You can build Kudu from source.
-
-For full installation details, see link:installation.html[Kudu Installation].
-
-== Next Steps
-- link:quickstart.html[Kudu Quickstart]
-- link:installation.html[Installing Kudu]
-- link:configuration.html[Configuring Kudu]
-
+Kudu 0.5.0 was the first public release. As such, no improvements or changes were
+noted in its release notes.
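
The 1.1.0 notes moved into prior_release_notes.adoc above mention a mapping of
Python's `datetime.datetime` to `UNIXTIME_MICROS`. As an illustration of what
such a mapping involves, here is a minimal standalone sketch. It is not the
Kudu Python client's code: the helper names are hypothetical, it takes
`UNIXTIME_MICROS` to mean microseconds since the Unix epoch in UTC, and it
assumes naive datetimes are already in UTC.

```python
from datetime import datetime, timedelta, timezone

# Reference point for the conversion: the Unix epoch in UTC.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_unixtime_micros(dt):
    """Convert a datetime to integer microseconds since the Unix epoch."""
    if dt.tzinfo is None:
        # Assumption for this sketch: naive values are treated as UTC.
        dt = dt.replace(tzinfo=timezone.utc)
    # Integer division of timedeltas avoids floating-point rounding.
    return (dt - EPOCH) // timedelta(microseconds=1)


def from_unixtime_micros(micros):
    """Convert integer microseconds since the epoch to an aware datetime."""
    return EPOCH + timedelta(microseconds=micros)


if __name__ == "__main__":
    stamp = datetime(2017, 1, 12, 20, 39, 25, tzinfo=timezone.utc)
    micros = to_unixtime_micros(stamp)
    print(micros, from_unixtime_micros(micros) == stamp)
```

Working in integer microseconds keeps the round trip exact, which a
float-based `timestamp()` conversion does not guarantee.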

http://git-wip-us.apache.org/repos/asf/kudu/blob/e89cac83/docs/release_notes.adoc
----------------------------------------------------------------------
diff --git a/docs/release_notes.adoc b/docs/release_notes.adoc
index a9aa587..7ef408f 100644
--- a/docs/release_notes.adoc
+++ b/docs/release_notes.adoc
@@ -16,7 +16,7 @@
 // under the License.
 
 [[release_notes]]
-= Apache Kudu 1.1 Release Notes
+= Apache Kudu 1.2.0 Release Notes
 
 :author: Kudu Team
 :imagesdir: ./images
@@ -28,207 +28,42 @@
 :sectlinks:
 :experimental:
 
-[[rn_1.1.0]]
+[[rn_1.2.0]]
 
-[[rn_1.1.0_new_features]]
+[[rn_1.2.0_new_features]]
 == New features
 
-* The Python client has been brought up to feature parity with the Java and {cpp} clients
-  and as such the package version will be brought to 1.1 with this release (from 0.3). A
-  list of the highlights can be found below.
-    ** Improved Partial Row semantics
-    ** Range partition support
-    ** Scan Token API
-    ** Enhanced predicate support
-    ** Support for all Kudu data types (including a mapping of Python's `datetime.datetime` to
-    `UNIXTIME_MICROS`)
-    ** Alter table support
-    ** Enabled Read at Snapshot for Scanners
-    ** Enabled Scanner Replica Selection
-    ** A few bug fixes for Python 3 in addition to various other improvements.
-
-* IN LIST predicate pushdown support was added to allow optimized execution of filters which
-  match on a set of column values. Support for Spark, Map Reduce and Impala queries utilizing
-  IN LIST pushdown is not yet complete.
-
-* The Java client now features client-side request tracing in order to help troubleshoot timeouts.
-  Error messages are now augmented with traces that show which servers were contacted before the
-  timeout occured instead of just the last error. The traces also contain RPCs that were
-  required to fulfill the client's request, such as contacting the master to discover a tablet's
-  location. Note that the traces are not available for successful requests and are not
-  programatically queryable.
 
 == Optimizations and improvements
 
-* Kudu now publishes JAR files for Spark 2.0 compiled with Scala 2.11 along with the
-  existing Spark 1.6 JAR compiled with Scala 2.10.
-
-* The Java client now allows configuring scanners to read from the closest replica instead of
-  the known leader replica. The default remains the latter. Use the relevant `ReplicaSelection`
-  enum with the scanner's builder to change this behavior.
-
-* Tablet servers use a new policy for retaining write-ahead log (WAL) segments.
-  Previously, servers used the 'log_min_segments_to_retain' flag to prioritize
-  any flushes which were retaining log segments past the configured value (default 2).
-  This policy caused servers to flush in-memory data more frequently than necessary,
-  limiting write performance.
-+
-The new policy introduces a new flag 'log_target_replay_size_mb' which
-  determines the threshold at which write-ahead log retention will prioritize flushes.
-  The new flag is considered experimental and users should not need to modify
-  its value.
-+
-The improved policy has been seen to improve write performance in some use cases
-  by a factor of 2x relative to the old policy.
-
-* Kudu's implementation of the Raft consensus algorithm has been improved to include
-  a "pre-election" phase. This can improve the stability of tablet leader election
-  in high-load scenarios, especially if each server hosts a high number of tablets.
-
-* Tablet server start-up time has been substantially improved in the case that
-  the server contains a high number of tombstoned tablet replicas.
 
 === Command line tools
 
-* The tool `kudu tablet leader_step_down` has been added to manually force a leader to step down.
-* The tool `kudu remote_replica copy` has been added to manually copy a replica from
-  one running tablet server to another.
-* The tool `kudu local_replica delete` has been added to delete a replica of a tablet.
-* The `kudu test loadgen` tool has been added to replace the obsoleted
-  `insert-generated-rows` standalone binary. The new tool is enriched with
-  additional functionality and can be used to run load generation tests against
-  a Kudu cluster.
 
 == Wire protocol compatibility
 
-Kudu 1.1.0 is wire-compatible with previous versions of Kudu:
+Kudu 1.2.0 is wire-compatible with previous versions of Kudu:
 
-* Kudu 1.1 clients may connect to servers running Kudu 1.0. If the client uses the new
-  'IN LIST' predicate type, an error will be returned.
-* Kudu 1.0 clients may connect to servers running Kudu 1.1 without limitations.
-* Rolling upgrade between Kudu 1.0 and Kudu 1.1 servers is believed to be possible
+* Kudu 1.2 clients may connect to servers running Kudu 1.0. If the client uses features
+  that are not available on the target server, an error will be returned.
+* Kudu 1.0 clients may connect to servers running Kudu 1.2 without limitations.
+* Rolling upgrade between Kudu 1.1 and Kudu 1.2 servers is believed to be possible
   though has not been sufficiently tested. Users are encouraged to shut down all nodes
   in the cluster, upgrade the software, and then restart the daemons on the new version.
 
-[[rn_1.1.0_incompatible_changes]]
-== Incompatible changes in Kudu 1.1.0
+[[rn_1.2.0_incompatible_changes]]
+== Incompatible changes in Kudu 1.2.0
 
 === Client APIs ({cpp}/Java/Python)
 
-* The {cpp} client no longer requires the
-  link:https://gcc.gnu.org/onlinedocs/libstdc++/manual/using_dual_abi.html[old gcc5 ABI].
-  Which ABI is actually used depends on the compiler configuration. Some new distros
-  (e.g. Ubuntu 16.04) will use the new ABI. Your application must use the same ABI as is
-  used by the client library; an easy way to guarantee this is to use the same compiler
-  to build both.
+[[rn_1.2.0_known_issues]]
 
-* The {cpp} client's `KuduSession::CountBufferedOperations()` method is
-  deprecated. Its behavior is inconsistent unless the session runs in the
-  `MANUAL_FLUSH` mode. Instead, to get number of buffered operations, count
-  invocations of the `KuduSession::Apply()` method since last
-  `KuduSession::Flush()` call or, if using asynchronous flushing, since last
-  invocation of the callback passed into `KuduSession::FlushAsync()`.
-
-* The Java client's `OperationResponse.getWriteTimestamp` method was renamed to `getWriteTimestampRaw`
-  to emphasize that it doesn't return milliseconds, unlike what its Javadoc indicated. The renamed
-  method was also hidden from the public APIs and should not be used.
-
-* The Java client's sync API (`KuduClient`, `KuduSession`, `KuduScanner`) used to throw either
-  a `NonRecoverableException` or a `TimeoutException` for a timeout, and now it's only possible for the
-  client to throw the former.
-
-* The Java client's handling of errors in `KuduSession` was modified so that subclasses of
-  `KuduException` are converted into RowErrors instead of being thrown.
-
-[[known_issues_and_limitations]]
 == Known Issues and Limitations
 
-=== Schema and Usage Limitations
-* Kudu is primarily designed for analytic use cases. You are likely to encounter issues if
-  a single row contains multiple kilobytes of data.
-
-* The columns which make up the primary key must be listed first in the schema.
-
-* Key columns cannot be altered. You must drop and recreate a table to change its keys.
-
-* Key columns must not be null.
-
-* Columns with `DOUBLE`, `FLOAT`, or `BOOL` types are not allowed as part of a
-  primary key definition.
-
-* Type and nullability of existing columns cannot be changed by altering the table.
-
-* A table’s primary key cannot be changed.
-
-* Dropping a column does not immediately reclaim space. Compaction must run first.
-There is no way to run compaction manually, but dropping the table will reclaim the
-space immediately.
-
-=== Partitioning Limitations
-* Tables must be manually pre-split into tablets using simple or compound primary
-  keys. Automatic splitting is not yet possible. Range partitions may be added
-  or dropped after a table has been created. See
-  link:schema_design.html[Schema Design] for more information.
-
-* Data in existing tables cannot currently be automatically repartitioned. As a workaround,
-  create a new table with the new partitioning and insert the contents of the old
-  table.
-
-=== Replication and Backup Limitations
-* Kudu does not currently include any built-in features for backup and restore.
-  Users are encouraged to use tools such as Spark or Impala to export or import
-  tables as necessary.
-
-=== Impala Limitations
-
-* To use Kudu with Impala, you must install a special release of Impala called
-  Impala_Kudu. Obtaining and installing a compatible Impala release is detailed in Kudu's
-  link:kudu_impala_integration.html[Impala Integration] documentation.
-
-* To use Impala_Kudu alongside an existing Impala instance, you must install using parcels.
-
-* Updates, inserts, and deletes via Impala are non-transactional. If a query
-  fails part of the way through, its partial effects will not be rolled back.
-
-* No timestamp and decimal type support.
-
-* The maximum parallelism of a single query is limited to the number of tablets
-  in a table. For good analytic performance, aim for 10 or more tablets per host
-  or use large tables.
-
-=== Security Limitations
-
-* Authentication and authorization features are not implemented.
-* Data encryption is not built in. Kudu has been reported to run correctly
-  on systems using local block device encryption (e.g. `dmcrypt`).
-
-=== Client and API Limitations
-
-* `ALTER TABLE` is not yet fully supported via the client APIs. More `ALTER TABLE`
-  operations will become available in future releases.
-
-=== Other Known Issues
-
-The following are known bugs and issues with the current release of Kudu. They will
-be addressed in later releases. Note that this list is not exhaustive, and is meant
-to communicate only the most important known issues.
-
-* If the Kudu master is configured with the `-log_force_fsync_all` option, tablet servers
-  and clients will experience frequent timeouts, and the cluster may become unusable.
-
-* If a tablet server has a very large number of tablets, it may take several minutes
-  to start up. It is recommended to limit the number of tablets per server to 100 or fewer.
-  Consider this limitation when pre-splitting your tables. If you notice slow start-up times,
-  you can monitor the number of tablets per server in the web UI.
+Please refer to the link:known_issues.html[Known Issues and Limitations] section of the
+documentation.
 
-* Due to a known bug in Linux kernels prior to 3.8, running Kudu on `ext4` mount points
-  may cause a subsequent `fsck` to fail with errors such as `Logical start <N> does
-  not match logical start <M> at next level`. These errors are repairable using `fsck -y`,
-  but may impact server restart time.
-+
-This affects RHEL/CentOS 6.8 and below. A fix is planned for RHEL/CentOS 6.9.
-  RHEL 7.0 and higher are not affected. Ubuntu 14.04 and later are not affected.
-  SLES 12 and later are not affected.
+[[resources_and_next_steps]]
 
 == Resources
 

http://git-wip-us.apache.org/repos/asf/kudu/blob/e89cac83/docs/support/jekyll-templates/document.html.erb
----------------------------------------------------------------------
diff --git a/docs/support/jekyll-templates/document.html.erb b/docs/support/jekyll-templates/document.html.erb
index b8047ac..d7cca20 100644
--- a/docs/support/jekyll-templates/document.html.erb
+++ b/docs/support/jekyll-templates/document.html.erb
@@ -98,6 +98,7 @@ end %>
         :contributing, "Contributing to Kudu",
         :style_guide, "Kudu Documentation Style Guide",
         :configuration_reference, "Kudu Configuration Reference",
+        :known_issues, "Known Issues and Limitations",
         :export_control, "Export Control Notice"
       ]
       toplevels.each_slice(2) do |page, pagename|

[2/2] kudu git commit: Initial draft of release notes and doc updates for 1.2

Posted by to...@apache.org.

Initial draft of release notes and doc updates for 1.2

Change-Id: I08326171dd2bf6097a7594b95adca946bb5922eb
Reviewed-on: http://gerrit.cloudera.org:8080/5604
Tested-by: Kudu Jenkins
Reviewed-by: Jean-Daniel Cryans <jd...@apache.org>
(cherry picked from commit ccb34a7eaed7d9a01e8a3908ad9a089e4101eaac)
Reviewed-on: http://gerrit.cloudera.org:8080/5698
Reviewed-by: Todd Lipcon <to...@apache.org>
Tested-by: Todd Lipcon <to...@apache.org>


Project: http://git-wip-us.apache.org/repos/asf/kudu/repo
Commit: http://git-wip-us.apache.org/repos/asf/kudu/commit/a23e9d64
Tree: http://git-wip-us.apache.org/repos/asf/kudu/tree/a23e9d64
Diff: http://git-wip-us.apache.org/repos/asf/kudu/diff/a23e9d64

Branch: refs/heads/branch-1.2.x
Commit: a23e9d6458fa1384c978cfad4db2eea5a90f0c40
Parents: e89cac8
Author: Todd Lipcon <to...@apache.org>
Authored: Wed Jan 4 17:24:32 2017 -0800
Committer: Todd Lipcon <to...@apache.org>
Committed: Thu Jan 12 20:38:58 2017 +0000

----------------------------------------------------------------------
 docs/known_issues.adoc  |  14 ++--
 docs/release_notes.adoc | 182 ++++++++++++++++++++++++++++++++++++++++++-
 docs/schema_design.adoc |   8 +-
 3 files changed, 191 insertions(+), 13 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/kudu/blob/a23e9d64/docs/known_issues.adoc
----------------------------------------------------------------------
diff --git a/docs/known_issues.adoc b/docs/known_issues.adoc
index edb8afb..abb9929 100644
--- a/docs/known_issues.adoc
+++ b/docs/known_issues.adoc
@@ -33,17 +33,21 @@
 
 * The columns which make up the primary key must be listed first in the schema.
 
-* Key columns cannot be altered. You must drop and recreate a table to change its keys.
+* Columns that are part of the primary key cannot be renamed.
+  The primary key may not be changed after the table is created.
+  You must drop and recreate a table to select a new primary key
+  or rename key columns.
 
-* Key columns must not be null.
+* The primary key of a row may not be modified using the `UPDATE` functionality.
+  To modify a row's primary key, the row must be deleted and re-inserted with
+  the modified key. Such a modification is non-atomic.
 
 * Columns with `DOUBLE`, `FLOAT`, or `BOOL` types are not allowed as part of a
-  primary key definition.
+  primary key definition. Additionally, all columns that are part of a primary
+  key definition must be `NOT NULL`.
 
 * Type and nullability of existing columns cannot be changed by altering the table.
 
-* A table’s primary key cannot be changed.
-
 * Dropping a column does not immediately reclaim space. Compaction must run first.
 There is no way to run compaction manually, but dropping the table will reclaim the
 space immediately.
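
The delete-and-re-insert workaround described in the known_issues.adoc change above can be sketched roughly as follows. This is a minimal illustration, not code from the commit: `rekey_row` is a hypothetical helper, and the `new_insert`, `new_delete`, `apply`, and `flush` calls assume the session-style write interface of the Kudu Python client. The new row is written before the old one is removed, so a failed insert leaves the original row in place.

```python
def rekey_row(table, session, old_key, new_row):
    """Give a row a new primary key by re-inserting it and deleting the old row.

    `table` and `session` are assumed to behave like the Kudu Python client's
    table and session objects (new_insert/new_delete, apply/flush).
    `old_key` maps key column names to the current key values; `new_row`
    is the full row carrying the new key.
    """
    # Write the row under its new key first. If this flush raises,
    # the original row has not been touched.
    session.apply(table.new_insert(new_row))
    session.flush()

    # Only then remove the row stored under the old key.
    session.apply(table.new_delete(old_key))
    session.flush()
```

Because the two writes are flushed separately, the change is not atomic: with this ordering another client may briefly see both rows. Deleting first would instead open a window in which neither row exists.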

http://git-wip-us.apache.org/repos/asf/kudu/blob/a23e9d64/docs/release_notes.adoc
----------------------------------------------------------------------
diff --git a/docs/release_notes.adoc b/docs/release_notes.adoc
index 7ef408f..fbac805 100644
--- a/docs/release_notes.adoc
+++ b/docs/release_notes.adoc
@@ -33,14 +33,148 @@
 [[rn_1.2.0_new_features]]
 == New features
 
+* Kudu clients and servers now redact user data such as cell values
+  from log messages, Java exception messages, and `Status` strings.
+  User metadata such as table names, column names, and partition
+  bounds are not redacted.
++
+Redaction is enabled by default, but may be disabled by setting the new
+`log_redact_user_data` flag to `false`.
+// TODO(danburkert): this flag is marked experimental, should we not doc it?
+
+* Kudu's ability to provide consistency guarantees has been substantially
+improved:
+
+** Replicas now correctly track their "safe timestamp". This timestamp
+   is the maximum timestamp at which reads are guaranteed to be
+   repeatable.
+
+** A scan created using the `READ_AT_SNAPSHOT` mode will now
+   either wait for the requested snapshot to be "safe" at the replica
+   being scanned, or be re-routed to a replica where the requested
+   snapshot is "safe". This ensures that all such scans are repeatable.
+
+** Kudu Tablet Servers now properly retain historical data when a row
+   with a given primary key is inserted and deleted, followed by the
+   insertion of a new row with the same key. Previous versions of Kudu
+   would not retain history in such situations. This allows the server
+   to return correct results for snapshot scans with a timestamp in the
+   past, even in the presence of such "reinsertion" scenarios.
+
+** The Kudu clients now automatically retain the timestamp of their latest
+   successful read or write operation. Scans using the `READ_AT_SNAPSHOT` mode
+   without a client-provided timestamp automatically assign a timestamp
+   higher than the timestamp of their most recent write. Writes also propagate
+   the timestamp, ensuring that sequences of operations with causal dependencies
+   between them are assigned increasing timestamps. Together, these changes
+   allow clients to achieve read-your-writes consistency, and also ensure
+   that snapshot scans performed by other clients return causally-consistent
+   results.
+
+* Kudu servers now automatically limit the number of log files.
+  The number of log files retained can be configured using the
+  `max_log_files` flag. By default, 10 log files will be retained
+  at each severity level.
+// TODO(danburkert): this new flag is marked experimental, should we make it
+// stable or evolving? Or should we not document that it's configurable?
 
 == Optimizations and improvements
 
+* The logging in the Java and {cpp} clients has been substantially quieted.
+  Clients no longer log messages in normal operation unless there
+  is some kind of error.
+
+* The {cpp} client now includes a `KuduSession::SetErrorBufferSpace`
+  API which can limit the amount of memory used to buffer
+  errors from asynchronous operations.
+
+* The Java client now fetches tablet locations from the Kudu Master
+  in batches of 1000, increased from batches of 10 in prior versions.
+  This can substantially improve the performance of Spark and Impala
+  queries running against Kudu tables with large numbers of tablets.
+
+* Table metadata lock contention in the Kudu Master was substantially
+  reduced. This improves the performance of tablet location lookups on
+  large clusters with a high degree of concurrency.
+
+* Lock contention in the Kudu Tablet Server during high-concurrency
+  write workloads was also reduced. This can reduce CPU consumption and
+  improve performance when a large number of concurrent clients are writing
+  to a smaller number of servers.
+
+* Lock contention when writing log messages has been substantially reduced.
+  This source of contention could cause high tail latencies on requests,
+  and when under high load could contribute to cluster instability
+  such as election storms and request timeouts.
+
+* The `BITSHUFFLE` column encoding has been optimized to use the `AVX2`
+  instruction set present on processors including Intel(R) Haswell
+  and later. Scans on `BITSHUFFLE`-encoded columns are now up to 30% faster.
+
+* The `kudu` tool now accepts hyphens as an alternative to underscores
+  when specifying actions. For example, `kudu local-replica copy-from-remote`
+  may be used as an alternative to `kudu local_replica copy_from_remote`.
+
+[[rn_1.2.0_fixed_issues]]
+== Fixed Issues
+
+* link:https://issues.apache.org/jira/browse/KUDU-1508[KUDU-1508]
+  Fixed a long-standing issue in which running Kudu on `ext4` file systems
+  could cause file system corruption.
+
+* link:https://issues.apache.org/jira/browse/KUDU-1399[KUDU-1399]
+  Implemented an LRU cache for open files, which prevents running out of
+  file descriptors on long-lived Kudu clusters. By default, Kudu will
+  limit its file descriptor usage to half of its configured `ulimit`.
+
+* link:http://gerrit.cloudera.org:8080/5192[Gerrit #5192]
+  Fixed an issue which caused data corruption and crashes in the case that
+  a table had a non-composite (single-column) primary key, and that column
+  was specified to use `DICT_ENCODING` or `BITSHUFFLE` encodings. If a
+  table with an affected schema was written in previous versions of Kudu,
+  the corruption will not be automatically repaired; users are encouraged
+  to re-insert such tables after upgrading to Kudu 1.2 or later.
 
-=== Command line tools
+* link:http://gerrit.cloudera.org:8080/5541[Gerrit #5541]
+  Fixed a bug in the Spark `KuduRDD` implementation which could cause
+  rows in the result set to be silently skipped in some cases.
 
+* link:https://issues.apache.org/jira/browse/KUDU-1551[KUDU-1551]
+  Fixed an issue in which the tablet server would crash on restart in the
+  case that it had previously crashed during the process of allocating
+  a new WAL segment.
 
-== Wire protocol compatibility
+* link:https://issues.apache.org/jira/browse/KUDU-1764[KUDU-1764]
+  Fixed an issue where Kudu servers would leak approximately 16-32MB of disk
+  space for every 10GB of data written to disk. After upgrading to Kudu
+  1.2 or later, any disk space leaked in previous versions will be
+  automatically recovered on startup.
+
+* link:https://issues.apache.org/jira/browse/KUDU-1750[KUDU-1750]
+  Fixed an issue where the API to drop a range partition would drop any
+  partition with a matching lower _or_ upper bound, rather than any partition
+  with matching lower _and_ upper bound.
+
+* link:https://issues.apache.org/jira/browse/KUDU-1766[KUDU-1766]
+  Fixed an issue in the Java client where equality predicates which compared
+  an integer column to its maximum possible value (e.g. `Integer.MAX_VALUE`)
+  would return incorrect results.
+
+* link:https://issues.apache.org/jira/browse/KUDU-1780[KUDU-1780]
+  Fixed the `kudu-client` Java artifact to properly shade classes in the
+  `com.google.thirdparty` namespace. The lack of proper shading in prior
+  releases could cause conflicts with certain versions of Google Guava.
+
+* link:http://gerrit.cloudera.org:8080/5327[Gerrit #5327]
+  Fixed shading issues in the `kudu-flume-sink` Java artifact. The sink
+  now expects that Hadoop dependencies are provided by Flume, and properly
+  shades the Kudu client's dependencies.
+
+* Fixed a few issues that occurred when using the Python client library from Python 3.
+
+
+[[rn_1.2.0_wire_compatibility]]
+== Wire Protocol Compatibility
 
 Kudu 1.2.0 is wire-compatible with previous versions of Kudu:
 
@@ -52,9 +186,49 @@ Kudu 1.2.0 is wire-compatible with previous versions of Kudu:
   in the cluster, upgrade the software, and then restart the daemons on the new version.
 
 [[rn_1.2.0_incompatible_changes]]
-== Incompatible changes in Kudu 1.2.0
+== Incompatible Changes in Kudu 1.2.0
+
+* The replication factor of tables is now limited to a maximum of 7. In addition,
+  tables can no longer be created with an even replication factor.
+
+* The `GROUP_VARINT` encoding is now deprecated. Kudu servers have never supported
+  this encoding, and now the client-side constant has been deprecated to match the
+  server's capabilities.
+
+=== New Restrictions on Data, Schemas, and Identifiers
+
+Kudu 1.2.0 introduces several new restrictions on schemas, cell size, and identifiers:
+
+Number of Columns:: By default, Kudu will not permit the creation of tables with
+more than 300 columns. We recommend schema designs that use fewer columns for best
+performance.
+
+Size of Cells:: No individual cell may be larger than 64KB. The cells making up a
+composite key are limited to a total of 16KB after the internal composite-key encoding
+done by Kudu. Inserting rows not conforming to these limitations will result in errors
+being returned to the client.
+
+Valid Identifiers:: Identifiers such as column and table names are now restricted to
+be valid UTF-8 strings. Additionally, a maximum length of 256 characters is enforced.
+
+[[rn_1.2.0_client_compatibility]]
+=== Client Library Compatibility
+
+* The Kudu 1.2 Java client is API- and ABI-compatible with Kudu 1.1. Applications
+  written against Kudu 1.1 will compile and run against the Kudu 1.2 client and
+  vice-versa.
+
+* The Kudu 1.2 {cpp} client is API- and ABI-forward-compatible with Kudu 1.1.
+  Applications written and compiled against the Kudu 1.1 client will run without
+  modification against the Kudu 1.2 client. Applications written and compiled
+  against the Kudu 1.2 client will run without modification against the Kudu 1.1
+  client unless they use one of the following new APIs:
+** `kudu::DisableSaslInitialization()`
+** `KuduSession::SetErrorBufferSpace(...)`
 
-=== Client APIs ({cpp}/Java/Python)
+* The Kudu 1.2 Python client is API-compatible with Kudu 1.1. Applications
+  written against Kudu 1.1 will continue to run against the Kudu 1.2 client
+  and vice-versa.
 
 [[rn_1.2.0_known_issues]]
 
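
The restrictions introduced in the release_notes.adoc change above (replication factor, column count, cell size, identifiers) can be checked on the application side before a table is created or a row is written. The sketch below is illustrative only and is not part of any Kudu client: every function and constant name is hypothetical, "64KB" is read as 64 * 1024 bytes, and identifier length is counted in UTF-8 encoded bytes, which is the stricter reading of the 256 limit. The 16KB composite-key limit is not covered, because it applies after Kudu's internal key encoding.

```python
MAX_COLUMNS = 300                # default column limit stated in the notes
MAX_CELL_BYTES = 64 * 1024       # "64KB", read as 64 * 1024 bytes (assumption)
MAX_REPLICATION_FACTOR = 7
MAX_IDENTIFIER_LENGTH = 256      # counted in encoded bytes here (assumption)


def check_identifier(name):
    """Return a reason string if `name` breaks the identifier rules, else None."""
    try:
        raw = name.encode("utf-8") if isinstance(name, str) else bytes(name)
        raw.decode("utf-8")
    except UnicodeError:
        return "not valid UTF-8"
    if len(raw) > MAX_IDENTIFIER_LENGTH:
        return "longer than %d bytes" % MAX_IDENTIFIER_LENGTH
    return None


def check_table_plan(table_name, column_names, replication_factor):
    """Collect every restriction that a planned table would violate."""
    problems = []
    for name in [table_name] + list(column_names):
        reason = check_identifier(name)
        if reason:
            problems.append("identifier %r is %s" % (name, reason))
    if len(column_names) > MAX_COLUMNS:
        problems.append("more than %d columns" % MAX_COLUMNS)
    if replication_factor > MAX_REPLICATION_FACTOR:
        problems.append("replication factor above %d" % MAX_REPLICATION_FACTOR)
    if replication_factor % 2 == 0:
        problems.append("replication factor is even")
    return problems


def oversized_cells(row):
    """Return the columns in `row` whose string or binary value exceeds the cell limit."""
    too_big = []
    for column, value in row.items():
        if isinstance(value, str):
            size = len(value.encode("utf-8"))
        elif isinstance(value, (bytes, bytearray)):
            size = len(value)
        else:
            continue  # fixed-width values are far below the limit
        if size > MAX_CELL_BYTES:
            too_big.append(column)
    return too_big
```

For example, `check_table_plan("metrics", ["host", "ts", "value"], 4)` reports only the even replication factor. The server remains the authority on these limits; a check like this only surfaces problems earlier.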

http://git-wip-us.apache.org/repos/asf/kudu/blob/a23e9d64/docs/schema_design.adoc
----------------------------------------------------------------------
diff --git a/docs/schema_design.adoc b/docs/schema_design.adoc
index 7c991f3..0c737f7 100644
--- a/docs/schema_design.adoc
+++ b/docs/schema_design.adoc
@@ -442,10 +442,7 @@ support renaming primary key columns.
 [[known-limitations]]
 == Known Limitations
 
-Kudu currently has some known limitations that may factor into schema design. When
-designing your schema, consider these limitations together, not in isolation. If you
-test these limitations and your findings are different from these, please share your
-test cases and results.
+Kudu currently has some known limitations that may factor into schema design.
 
 Number of Columns:: By default, Kudu will not permit the creation of tables with
 more than 300 columns. We recommend schema designs that use fewer columns for best
@@ -459,6 +456,9 @@ being returned to the client.
 Size of Rows:: Although individual cells may be up to 64KB, and Kudu supports up to
 300 columns, it is recommended that no single row be larger than a few hundred KB.
 
+Valid Identifiers:: Identifiers such as table and column names must be valid UTF-8
+sequences and no longer than 256 bytes.
+
 Immutable Primary Keys:: Kudu does not allow you to update the primary key
 columns of a row.