Posted to commits@arrow.apache.org by we...@apache.org on 2019/07/08 17:15:49 UTC

[arrow] branch master updated: ARROW-5826: [Website] Blog post for 0.14.0 release announcement

This is an automated email from the ASF dual-hosted git repository.

wesm pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/arrow.git


The following commit(s) were added to refs/heads/master by this push:
     new 07e1042  ARROW-5826: [Website] Blog post for 0.14.0 release announcement
07e1042 is described below

commit 07e104260b3366cc97900abd6aa7999f75969791
Author: Sutou Kouhei <ko...@clear-code.com>
AuthorDate: Mon Jul 8 12:15:41 2019 -0500

    ARROW-5826: [Website] Blog post for 0.14.0 release announcement
    
    I plan to publish by noon US/Eastern time on Monday July 8 -- please post commits with any edits or push directly to my branch and I will include the changes in the post.
    
    Author: Sutou Kouhei <ko...@clear-code.com>
    Author: Wes McKinney <we...@apache.org>
    Author: Wes McKinney <we...@users.noreply.github.com>
    
    Closes #4819 from wesm/ARROW-5826 and squashes the following commits:
    
    079c88ccc <Wes McKinney> Add discussion links
    f4f208a42 <Wes McKinney> Update site/_posts/2019-07-08-0.14.0-release.md
    4dc02a214 <Sutou Kouhei> Add Packaging section
    c5d9cdcb0 <Sutou Kouhei> Add Sparse representation and compression
    e47d69a94 <Sutou Kouhei> Fix a typo
    205f222f6 <Sutou Kouhei> Fix markup
    1544da180 <Wes McKinney> Attribute community
    f9653fa17 <Wes McKinney> Finish tweaking formatting, links
    6b82e825f <Wes McKinney> Start markdownifying 0.14.0 blog post
---
 site/README.md                           |   7 +
 site/_data/contributors.yml              |   5 +-
 site/_posts/2019-07-08-0.14.0-release.md | 300 +++++++++++++++++++++++++++++++
 site/_release/0.14.0.md                  |  40 ++---
 4 files changed, 331 insertions(+), 21 deletions(-)

diff --git a/site/README.md b/site/README.md
index 73fd185..33758e9 100644
--- a/site/README.md
+++ b/site/README.md
@@ -36,6 +36,13 @@ gem install jekyll bundler
 bundle install
 ```
 
+On some platforms, the Ruby `nokogiri` library may fail to build. In
+such cases, the following configuration option may help:
+
+```
+bundle config build.nokogiri --use-system-libraries
+```
+
 If you are planning to publish the website, you must clone the arrow-site git
 repository. Run this command from the `site` directory so that `asf-site` is a
 subdirectory of `site`.
diff --git a/site/_data/contributors.yml b/site/_data/contributors.yml
index 95b18b2..185a565 100644
--- a/site/_data/contributors.yml
+++ b/site/_data/contributors.yml
@@ -16,10 +16,13 @@
 # Database of contributors to Apache Arrow (WIP)
 # Blogs and other pages use this data
 #
+- name: Apache Arrow Community
+  githubId: apache
+  homepage: https://arrow.apache.org
 - name: Wes McKinney
   apacheId: wesm
   githubId: wesm
-  homepage: http://wesmckinney.com
+  homepage: https://wesmckinney.com
   role: PMC
 - name: Uwe Korn
   apacheId: uwe
diff --git a/site/_posts/2019-07-08-0.14.0-release.md b/site/_posts/2019-07-08-0.14.0-release.md
new file mode 100644
index 0000000..1c1f1c8
--- /dev/null
+++ b/site/_posts/2019-07-08-0.14.0-release.md
@@ -0,0 +1,300 @@
+---
+layout: post
+title: "Apache Arrow 0.14.0 Release"
+date: "2019-07-02 00:00:00 -0600"
+author: apache
+categories: [release]
+---
+<!--
+{% comment %}
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements.  See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to you under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License.  You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+{% endcomment %}
+-->
+
+The Apache Arrow team is pleased to announce the 0.14.0 release. This
+covers 3 months of development work and includes [**602 resolved
+issues**][1] from [**75 distinct contributors**][2].  See the Install
+Page to learn how to get the libraries for your platform. The
+[complete changelog][3] is also available.
+
+This post gives some brief highlights from the project since the
+0.13.0 release from April.
+
+## New committers
+
+Since the 0.13.0 release, the following have been added:
+
+* [Neville Dipale][5] was added as a committer
+* [François Saint-Jacques][6] was added as a committer
+* [Praveen Kumar][7] was added as a committer
+
+Thank you for all your contributions!
+
+## Upcoming 1.0.0 Format Stability Release
+
+We are planning for our next major release to move from 0.14.0 to
+1.0.0. The major version number will indicate stability of the Arrow
+columnar format and binary protocol. While the format has already been
+stable since December 2017, we believe it is a good idea to make this
+stability official and to indicate that it is safe to persist
+serialized Arrow data in applications. This means that applications
+will be able to safely upgrade to new Arrow versions without having to
+worry about backwards incompatibilities. We will write in a future
+blog post about the stability guarantees we intend to provide to help
+application developers plan accordingly.
+
+## Packaging
+
+We added support for the following platforms:
+
+* Debian GNU/Linux buster
+* Ubuntu 19.04
+
+We dropped support for Ubuntu 14.04.
+
+## Development Infrastructure and Tooling
+
+As the project has grown larger and more diverse, we are increasingly
+outgrowing what we can test in public continuous integration services
+like Travis CI and Appveyor. In addition, we share these resources
+with the entire Apache Software Foundation, and given the high volume
+of pull requests into Apache Arrow, maintainers are frequently waiting
+many hours for the green light to merge patches.
+
+The complexity of our testing is driven by the number of different
+components and programming languages as well as increasingly long
+compilation and test execution times as individual libraries grow
+larger. The 50 minute time limit of public CI services is simply too
+limited to comprehensively test the project. Additionally, the CI host
+machines are constrained in their features and memory limits,
+preventing us from testing features that are only relevant on large
+amounts of data (10GB or more) or functionality that requires a
+CUDA-enabled GPU.
+
+Organizations that contribute to Apache Arrow are working on physical
+build infrastructure and tools to improve build times and build
+scalability. One such new tool is `ursabot`, a GitHub-enabled bot
+that can be used to trigger builds either on physical build machines
+or in the cloud. It can also be used to trigger benchmark timing
+comparisons. If you are contributing to the project, you may see
+Ursabot being employed to trigger tests in pull requests.
+
+To assist with migrating away from Travis CI, we are also working
+to make as many of our builds reproducible with Docker and not reliant
+on Travis CI-specific configuration details. This will also help
+contributors reproduce build failures locally without having to wait
+for Travis CI.
+
+## Columnar Format Notes
+
+* User-defined "extension" types have been formalized in the Arrow
+  format, enabling library users to embed custom data types in the
+  Arrow columnar format. Initial support is available in C++, Java,
+  and Python.
+* A new Duration logical type was added to represent absolute lengths
+  of time.
+
+## Arrow Flight notes
+
+Flight now supports many of the features of a complete RPC
+framework.
+
+* Authentication APIs are now supported across all languages (ARROW-5137)
+* Encrypted communication using OpenSSL is supported (ARROW-5643,
+  ARROW-5529)
+* Clients can specify timeouts on remote calls (ARROW-5136)
+* On the protocol level, endpoints are now identified with URIs, to
+  support an open-ended number of potential transports (including TLS
+  and Unix sockets, and perhaps even non-gRPC-based transports in the
+  future) (ARROW-4651)
+* Application-defined metadata can be sent alongside data (ARROW-4626,
+  ARROW-4627).
+
+Windows is now a supported platform for Flight in C++ and Python
+(ARROW-3294), and Python wheels are shipped for all platforms
+(ARROW-3150, ARROW-5656). C++, Python, and Java have been brought to
+parity, now that actions can return streaming results in Java
+(ARROW-5254).
+
+## C++ notes
+
+This release includes 188 resolved issues related to the C++
+implementation, so we summarize only some of the work here.
+
+### General platform improvements
+
+* A FileSystem abstraction (ARROW-767) has been added, which paves the
+  way for a future Arrow Datasets library that will allow access to
+  sharded data on arbitrary storage systems, including remote or cloud
+  storage. A first draft of the Datasets API was committed in
+  ARROW-5512. Right now, this comes with no implementation, but we
+  expect to slowly build it up in the coming weeks or months. Early
+  feedback is welcome on this API.
+* The dictionary API has been reworked in ARROW-3144. The dictionary
+  values used to be tied to the DictionaryType instance, which proved
+  too inflexible. Since dictionary-encoding is more often an
+  optimization than a semantic property of the data, we decided to
+  move the dictionary values to the ArrayData structure, making it
+  natural for dictionary-encoded arrays to share the same DataType
+  instance, regardless of the encoding details.
+* The FixedSizeList and Map types have been implemented, including in
+  integration tests. The Map type is akin to a List of Struct(key,
+  value) entries, but makes it explicit that the underlying data has
+  key-value mapping semantics. Also, map entries are always non-null.
+* A `Result<T>` class has been introduced in ARROW-4800. The aim is to
+  allow a function to return an error as well as its logical result
+  without resorting to pointer-out arguments.
+* The Parquet C++ library has been refactored to use common Arrow IO
+  classes for improved C++ platform interoperability.
+
+### Line-delimited JSON reader
+
+A multithreaded line-delimited JSON reader (powered internally by
+RapidJSON) is now available for use (also in Python and R via
+bindings). This will likely be expanded to support more kinds of JSON
+storage in the future.
+
+### New computational kernels
+
+A number of new computational kernels have been developed:
+
+* Compare filter for logical comparisons yielding boolean arrays
+* Filter kernel for selecting elements of an input array according to
+  a boolean selection array.
+* Take kernel, which selects elements by integer index, has been
+  expanded to support nested types
+
+## C# Notes
+
+The native C# implementation has continued to mature since 0.13. This
+release includes a number of performance, memory use, and usability
+improvements.
+
+## Go notes
+
+Go's support for the Arrow columnar format continues to expand. Go now
+supports reading and writing the Arrow columnar binary protocol, and
+it has also been **added to the cross language integration
+tests**. There are now four languages (C++, Go, Java, and JavaScript)
+included in our integration tests to verify cross-language
+interoperability.
+
+## Java notes
+
+* Support for referencing arbitrary memory using `ArrowBuf` has been
+  implemented, paving the way for memory map support in Java
+* A number of performance improvements around vector value access were
+  added (see ARROW-5264, ARROW-5290).
+* The Map type has been implemented in Java and integration tested
+  with C++
+* Several microbenchmarks have been added and improved, including a
+  significant speed-up of zeroing out buffers.
+* A new algorithms package has been started to contain reference
+  implementations of common algorithms.  The initial contribution is
+  for Array/Vector sorting.
+
+## JavaScript Notes
+
+A new incremental [array builder API][4] is available.
+
+## MATLAB Notes
+
+Version 0.14.0 features improved Feather file support in the MEX bindings.
+
+## Python notes
+
+* We fixed a problem that caused the Python wheels to be much larger
+  in 0.13.0 than they were in 0.12.0. Since the introduction of LLVM
+  into our build toolchain, however, the wheels will still be
+  significantly bigger. We are interested in approaches to
+  enable pyarrow to be installed in pieces with pip or conda rather
+  than monolithically.
+* It is now possible to define ExtensionTypes with a Python
+  implementation (ARROW-840). Those ExtensionTypes can survive a
+  roundtrip through C++ and serialization.
+* The Flight improvements highlighted above (see C++ notes) are all
+  available from Python. Furthermore, Flight is now bundled in our
+  binary wheels and conda packages for Linux, Windows and macOS
+  (ARROW-3150, ARROW-5656).
+* We will build "manylinux2010" binary wheels for Linux systems, in
+  addition to "manylinux1" wheels (ARROW-2461). Manylinux2010 is a
+  newer standard for more recent systems, with less limiting toolchain
+  constraints. Installing manylinux2010 wheels requires an up-to-date
+  version of pip.
+* Various bug fixes for CSV reading in Python and C++ including the
+  ability to parse Decimal(x, y) columns.
+
+### Parquet improvements
+
+* Column statistics for logical types like unicode strings, unsigned
+  integers, and timestamps are cast to compatible Python types (see
+  ARROW-4139)
+* It's now possible to configure "data page" sizes when writing a file
+  from Python
+
+## Ruby and C GLib notes
+
+The GLib and Ruby bindings have been tracking features in the C++
+project. This release includes bindings for Gandiva, JSON reader, and
+other C++ features.
+
+## Rust notes
+
+There is ongoing work in Rust happening on Parquet file support,
+computational kernels, and the DataFusion query engine. See the full
+changelog for details.
+
+## R notes
+
+We have been working on build and packaging for R so that community
+members can hopefully release the project to CRAN in the near
+future. Feature development for R has continued to follow the upstream
+C++ project.
+
+## Community Discussions Ongoing
+
+A number of discussions are ongoing on the dev@arrow.apache.org
+developer mailing list. We look forward to hearing from the
+community there:
+
+* [Timing and scope of 1.0.0 release][15]
+* [Solutions to increase continuous integration capacity][13]
+* [A proposal for versioning and forward/backward compatibility
+  guarantees for the 1.0.0 release][8] was shared, but not much
+  discussion has occurred yet.
+* [Addressing possible unaligned access and undefined behavior concerns][9]
+  in the Arrow binary protocol
+* [Supporting smaller than 128-bit encoding of fixed width decimals][10]
+* [Forking the Avro C++ implementation][11] so as to adapt it to Arrow's
+  needs
+* [Sparse representation and compression in Arrow][12]
+* [Flight extensions: middleware API and generalized Put operations][14]
+
+[1]: https://issues.apache.org/jira/issues/?jql=project%20%3D%20ARROW%20AND%20status%20%3D%20Resolved%20AND%20fixVersion%20%3D%200.14.0
+[2]: https://arrow.apache.org/release/0.14.0.html#contributors
+[3]: https://arrow.apache.org/release/0.14.0.html
+[4]: https://github.com/apache/arrow/tree/master/js/src/builder
+[5]: https://github.com/nevi-me
+[6]: https://github.com/fsaintjacques
+[7]: https://github.com/praveenbingo
+[8]: https://lists.apache.org/thread.html/5715a4d402c835d22d929a8069c5c0cf232077a660ee98639d544af8@%3Cdev.arrow.apache.org%3E
+[9]: https://lists.apache.org/thread.html/8440be572c49b7b2ffb76b63e6d935ada9efd9c1c2021369b6d27786@%3Cdev.arrow.apache.org%3E
+[10]: https://lists.apache.org/thread.html/31b00086c2991104bd71fb1a2173f32b4a2f569d8e7b5b41e836f3a3@%3Cdev.arrow.apache.org%3E
+[11]: https://lists.apache.org/thread.html/97d78112ab583eecb155a7d78342c1063df65d64ec3ccfa0b18737c3@%3Cdev.arrow.apache.org%3E
+[12]: https://lists.apache.org/thread.html/a99124e57c14c3c9ef9d98f3c80cfe1dd25496bf3ff7046778add937@%3Cdev.arrow.apache.org%3E
+[13]: https://lists.apache.org/thread.html/96b2e22606e8a7b0ad7dc4aae16f232724d1059b34636676ed971d40@%3Cdev.arrow.apache.org%3E
+[14]: https://lists.apache.org/thread.html/82a7c026ad18dbe9fdbcffa3560979aff6fd86dd56a49f40d9cfb46e@%3Cdev.arrow.apache.org%3E
+[15]: https://lists.apache.org/thread.html/44a7a3d256ab5dbd62da6fe45b56951b435697426bf4adedb6520907@%3Cdev.arrow.apache.org%3E
\ No newline at end of file
diff --git a/site/_release/0.14.0.md b/site/_release/0.14.0.md
index 8bf84c8..ed191d9 100644
--- a/site/_release/0.14.0.md
+++ b/site/_release/0.14.0.md
@@ -192,13 +192,13 @@ $ git shortlog -csn apache-arrow-0.13.0..apache-arrow-0.14.0
 * [ARROW-3166](https://issues.apache.org/jira/browse/ARROW-3166) - [C++] Consolidate IO interfaces used in arrow/io and parquet-cpp
 * [ARROW-3191](https://issues.apache.org/jira/browse/ARROW-3191) - [Java] Add support for ArrowBuf to point to arbitrary memory.
 * [ARROW-3200](https://issues.apache.org/jira/browse/ARROW-3200) - [C++] Add support for reading Flight streams with dictionaries
-* [ARROW-3290](https://issues.apache.org/jira/browse/ARROW-3290) - [C++] Toolchain support for secure gRPC 
+* [ARROW-3290](https://issues.apache.org/jira/browse/ARROW-3290) - [C++] Toolchain support for secure gRPC
 * [ARROW-3294](https://issues.apache.org/jira/browse/ARROW-3294) - [C++] Test Flight RPC on Windows / Appveyor
 * [ARROW-3314](https://issues.apache.org/jira/browse/ARROW-3314) - [R] Set -rpath using pkg-config when building
 * [ARROW-3419](https://issues.apache.org/jira/browse/ARROW-3419) - [C++] Run include-what-you-use checks as nightly build
 * [ARROW-3459](https://issues.apache.org/jira/browse/ARROW-3459) - [C++][Gandiva] Add support for variable length output vectors
 * [ARROW-3475](https://issues.apache.org/jira/browse/ARROW-3475) - [C++] Int64Builder.Finish(NumericArray<Int64Type>)
-* [ARROW-3572](https://issues.apache.org/jira/browse/ARROW-3572) - [Packaging] Correctly handle ssh origin urls for crossbow 
+* [ARROW-3572](https://issues.apache.org/jira/browse/ARROW-3572) - [Packaging] Correctly handle ssh origin urls for crossbow
 * [ARROW-3671](https://issues.apache.org/jira/browse/ARROW-3671) - [Go] implement Interval array
 * [ARROW-3676](https://issues.apache.org/jira/browse/ARROW-3676) - [Go] implement Decimal128 array
 * [ARROW-3679](https://issues.apache.org/jira/browse/ARROW-3679) - [Go] implement IPC protocol
@@ -213,7 +213,7 @@ $ git shortlog -csn apache-arrow-0.13.0..apache-arrow-0.14.0
 * [ARROW-3791](https://issues.apache.org/jira/browse/ARROW-3791) - [C++] Add type inference for boolean values in CSV files
 * [ARROW-3794](https://issues.apache.org/jira/browse/ARROW-3794) - [R] Consider mapping INT8 to integer() not raw()
 * [ARROW-3804](https://issues.apache.org/jira/browse/ARROW-3804) - [R] Consider lowering required R runtime
-* [ARROW-3810](https://issues.apache.org/jira/browse/ARROW-3810) - [R] type= argument for Array and ChunkedArray 
+* [ARROW-3810](https://issues.apache.org/jira/browse/ARROW-3810) - [R] type= argument for Array and ChunkedArray
 * [ARROW-3811](https://issues.apache.org/jira/browse/ARROW-3811) - [R] struct arrays inference
 * [ARROW-3814](https://issues.apache.org/jira/browse/ARROW-3814) - [R] RecordBatch$from\_arrays()
 * [ARROW-3815](https://issues.apache.org/jira/browse/ARROW-3815) - [R] refine record batch factory
@@ -226,7 +226,7 @@ $ git shortlog -csn apache-arrow-0.13.0..apache-arrow-0.14.0
 * [ARROW-4047](https://issues.apache.org/jira/browse/ARROW-4047) - [Python] Document use of int96 timestamps and options in Parquet docs
 * [ARROW-4086](https://issues.apache.org/jira/browse/ARROW-4086) - [Java] Add apis to debug alloc failures
 * [ARROW-4121](https://issues.apache.org/jira/browse/ARROW-4121) - [C++] Refactor memory allocation from InvertKernel
-* [ARROW-4159](https://issues.apache.org/jira/browse/ARROW-4159) - [C++] Check for -Wdocumentation issues 
+* [ARROW-4159](https://issues.apache.org/jira/browse/ARROW-4159) - [C++] Check for -Wdocumentation issues
 * [ARROW-4194](https://issues.apache.org/jira/browse/ARROW-4194) - [Format] Metadata.rst does not specify timezone for Timestamp type
 * [ARROW-4302](https://issues.apache.org/jira/browse/ARROW-4302) - [C++] Add OpenSSL to C++ build toolchain
 * [ARROW-4337](https://issues.apache.org/jira/browse/ARROW-4337) - [C#] Array / RecordBatch Builder Fluent API
@@ -246,7 +246,7 @@ $ git shortlog -csn apache-arrow-0.13.0..apache-arrow-0.14.0
 * [ARROW-4627](https://issues.apache.org/jira/browse/ARROW-4627) - [Flight] Add application metadata field to DoPut
 * [ARROW-4701](https://issues.apache.org/jira/browse/ARROW-4701) - [C++] Add JSON chunker benchmarks
 * [ARROW-4702](https://issues.apache.org/jira/browse/ARROW-4702) - [C++] Upgrade dependency versions
-* [ARROW-4708](https://issues.apache.org/jira/browse/ARROW-4708) - [C++] Add multithreaded JSON reader 
+* [ARROW-4708](https://issues.apache.org/jira/browse/ARROW-4708) - [C++] Add multithreaded JSON reader
 * [ARROW-4714](https://issues.apache.org/jira/browse/ARROW-4714) - [C++][Java] Providing JNI interface to Read ORC file via Arrow C++
 * [ARROW-4717](https://issues.apache.org/jira/browse/ARROW-4717) - [C#] Consider exposing ValueTask instead of Task
 * [ARROW-4719](https://issues.apache.org/jira/browse/ARROW-4719) - [C#] Implement ChunkedArray, Column and Table in C#
@@ -262,7 +262,7 @@ $ git shortlog -csn apache-arrow-0.13.0..apache-arrow-0.14.0
 * [ARROW-4904](https://issues.apache.org/jira/browse/ARROW-4904) - [C++] Move implementations in arrow/ipc/test-common.h into libarrow\_testing
 * [ARROW-4911](https://issues.apache.org/jira/browse/ARROW-4911) - [R] Support for building package for Windows
 * [ARROW-4912](https://issues.apache.org/jira/browse/ARROW-4912) - [C++, Python] Allow specifying column names to CSV reader
-* [ARROW-4913](https://issues.apache.org/jira/browse/ARROW-4913) - [Java][Memory] Limit number of ledgers and arrowbufs 
+* [ARROW-4913](https://issues.apache.org/jira/browse/ARROW-4913) - [Java][Memory] Limit number of ledgers and arrowbufs
 * [ARROW-4945](https://issues.apache.org/jira/browse/ARROW-4945) - [Flight] Enable Flight integration tests in Travis
 * [ARROW-4956](https://issues.apache.org/jira/browse/ARROW-4956) - [C#] Allow ArrowBuffers to wrap external Memory in C#
 * [ARROW-4959](https://issues.apache.org/jira/browse/ARROW-4959) - [Gandiva][Crossbow] Builds broken
@@ -274,7 +274,7 @@ $ git shortlog -csn apache-arrow-0.13.0..apache-arrow-0.14.0
 * [ARROW-4990](https://issues.apache.org/jira/browse/ARROW-4990) - [C++] Kernel to compare array with array
 * [ARROW-4993](https://issues.apache.org/jira/browse/ARROW-4993) - [C++] Display summary at the end of CMake configuration
 * [ARROW-5000](https://issues.apache.org/jira/browse/ARROW-5000) - [Python] Fix deprecation warning from setup.py
-* [ARROW-5007](https://issues.apache.org/jira/browse/ARROW-5007) - [C++] Move DCHECK out of sse-utils 
+* [ARROW-5007](https://issues.apache.org/jira/browse/ARROW-5007) - [C++] Move DCHECK out of sse-utils
 * [ARROW-5020](https://issues.apache.org/jira/browse/ARROW-5020) - [C++][Gandiva] Split Gandiva-related conda packages for builds into separate .yml conda env file
 * [ARROW-5027](https://issues.apache.org/jira/browse/ARROW-5027) - [Python] Add JSON Reader
 * [ARROW-5038](https://issues.apache.org/jira/browse/ARROW-5038) - [Rust] [DataFusion] Implement AVG aggregate function
@@ -282,7 +282,7 @@ $ git shortlog -csn apache-arrow-0.13.0..apache-arrow-0.14.0
 * [ARROW-5045](https://issues.apache.org/jira/browse/ARROW-5045) - [Rust] Code coverage silently failing in CI
 * [ARROW-5053](https://issues.apache.org/jira/browse/ARROW-5053) - [Rust] [DataFusion] Use env var for location of arrow test data
 * [ARROW-5054](https://issues.apache.org/jira/browse/ARROW-5054) - [C++][Release] Test Flight in verify-release-candidate.sh
-* [ARROW-5056](https://issues.apache.org/jira/browse/ARROW-5056) - [Packaging] Adjust conda recipes to use ORC conda-forge package on unix systems 
+* [ARROW-5056](https://issues.apache.org/jira/browse/ARROW-5056) - [Packaging] Adjust conda recipes to use ORC conda-forge package on unix systems
 * [ARROW-5061](https://issues.apache.org/jira/browse/ARROW-5061) - [Release] Improve 03-binary performance
 * [ARROW-5062](https://issues.apache.org/jira/browse/ARROW-5062) - [Java] Shade Java Guava dependency for Flight
 * [ARROW-5063](https://issues.apache.org/jira/browse/ARROW-5063) - [Java] FlightClient should not create a child allocator
@@ -337,7 +337,7 @@ $ git shortlog -csn apache-arrow-0.13.0..apache-arrow-0.14.0
 * [ARROW-5203](https://issues.apache.org/jira/browse/ARROW-5203) - [GLib] Add support for Compare filter
 * [ARROW-5204](https://issues.apache.org/jira/browse/ARROW-5204) - [C++] Improve BufferBuilder performance
 * [ARROW-5212](https://issues.apache.org/jira/browse/ARROW-5212) - [Go] Array BinaryBuilder in Go library has no access to resize the values buffer
-* [ARROW-5218](https://issues.apache.org/jira/browse/ARROW-5218) - [C++] Improve build when third-party library locations are specified 
+* [ARROW-5218](https://issues.apache.org/jira/browse/ARROW-5218) - [C++] Improve build when third-party library locations are specified
 * [ARROW-5219](https://issues.apache.org/jira/browse/ARROW-5219) - [C++] Build protobuf\_ep in parallel when using Ninja
 * [ARROW-5222](https://issues.apache.org/jira/browse/ARROW-5222) - [Python] Issues with installing pyarrow for development on MacOS
 * [ARROW-5225](https://issues.apache.org/jira/browse/ARROW-5225) - [Java] Improve performance of BaseValueVector#getValidityBufferSizeFromCount
@@ -362,7 +362,7 @@ $ git shortlog -csn apache-arrow-0.13.0..apache-arrow-0.14.0
 * [ARROW-5288](https://issues.apache.org/jira/browse/ARROW-5288) - [Documentation] Enrich the contribution guidelines
 * [ARROW-5289](https://issues.apache.org/jira/browse/ARROW-5289) - [C++] Move arrow/util/concatenate.h to arrow/array/
 * [ARROW-5290](https://issues.apache.org/jira/browse/ARROW-5290) - [Java] Provide a flag to enable/disable null-checking in vectors' get methods
-* [ARROW-5291](https://issues.apache.org/jira/browse/ARROW-5291) - [Python] Add wrapper for "take" kernel on Array 
+* [ARROW-5291](https://issues.apache.org/jira/browse/ARROW-5291) - [Python] Add wrapper for "take" kernel on Array
 * [ARROW-5298](https://issues.apache.org/jira/browse/ARROW-5298) - [Rust] Add debug implementation for Buffer
 * [ARROW-5299](https://issues.apache.org/jira/browse/ARROW-5299) - [C++] ListArray comparison is incorrect
 * [ARROW-5309](https://issues.apache.org/jira/browse/ARROW-5309) - [Python] Add clarifications to Python "append" methods that return new objects
@@ -441,7 +441,7 @@ $ git shortlog -csn apache-arrow-0.13.0..apache-arrow-0.14.0
 * [ARROW-5488](https://issues.apache.org/jira/browse/ARROW-5488) - [R] Workaround when C++ lib not available
 * [ARROW-5490](https://issues.apache.org/jira/browse/ARROW-5490) - [C++] Remove ARROW\_BOOST\_HEADER\_ONLY
 * [ARROW-5491](https://issues.apache.org/jira/browse/ARROW-5491) - [C++] Remove unecessary semicolons following MACRO definitions
-* [ARROW-5492](https://issues.apache.org/jira/browse/ARROW-5492) - [R] Add "col\_select" argument to read\_\* functions to read subset of columns 
+* [ARROW-5492](https://issues.apache.org/jira/browse/ARROW-5492) - [R] Add "col\_select" argument to read\_\* functions to read subset of columns
 * [ARROW-5495](https://issues.apache.org/jira/browse/ARROW-5495) - [C++] Use HTTPS consistently for downloading dependencies
 * [ARROW-5496](https://issues.apache.org/jira/browse/ARROW-5496) - [R][CI] Fix relative paths in R codecov.io reporting
 * [ARROW-5498](https://issues.apache.org/jira/browse/ARROW-5498) - [C++] Build failure with Flatbuffers 1.11.0 and MinGW
@@ -453,7 +453,7 @@ $ git shortlog -csn apache-arrow-0.13.0..apache-arrow-0.14.0
 * [ARROW-5512](https://issues.apache.org/jira/browse/ARROW-5512) - [C++] Draft initial public APIs for Datasets project
 * [ARROW-5513](https://issues.apache.org/jira/browse/ARROW-5513) - [Java] Refactor method name for getstartOffset to use camel case
 * [ARROW-5516](https://issues.apache.org/jira/browse/ARROW-5516) - [Python] Development page for pyarrow has a missing dependency in using pip
-* [ARROW-5518](https://issues.apache.org/jira/browse/ARROW-5518) - [Java] Set VectorSchemaRoot rowCount to 0 on allocateNew and clear 
+* [ARROW-5518](https://issues.apache.org/jira/browse/ARROW-5518) - [Java] Set VectorSchemaRoot rowCount to 0 on allocateNew and clear
 * [ARROW-5524](https://issues.apache.org/jira/browse/ARROW-5524) - [C++] Turn off PARQUET\_BUILD\_ENCRYPTION in CMake if OpenSSL not found
 * [ARROW-5526](https://issues.apache.org/jira/browse/ARROW-5526) - [Developer] Add more prominent notice to GitHub issue template to direct bug reports to JIRA
 * [ARROW-5529](https://issues.apache.org/jira/browse/ARROW-5529) - [Flight] Allow serving with multiple TLS certificates
@@ -532,7 +532,7 @@ $ git shortlog -csn apache-arrow-0.13.0..apache-arrow-0.14.0
 * [ARROW-5718](https://issues.apache.org/jira/browse/ARROW-5718) - [R] auto splice data frames in record\_batch() and table()
 * [ARROW-5721](https://issues.apache.org/jira/browse/ARROW-5721) - [Rust] Move array related code into a separate module
 * [ARROW-5724](https://issues.apache.org/jira/browse/ARROW-5724) - [R] [CI] AppVeyor build should use ccache
-* [ARROW-5725](https://issues.apache.org/jira/browse/ARROW-5725) - [Crossbow] Port conda recipes to azure pipelines 
+* [ARROW-5725](https://issues.apache.org/jira/browse/ARROW-5725) - [Crossbow] Port conda recipes to azure pipelines
 * [ARROW-5726](https://issues.apache.org/jira/browse/ARROW-5726) - [Java] Implement a common interface for int vectors
 * [ARROW-5727](https://issues.apache.org/jira/browse/ARROW-5727) - [Python] [CI] Install pytest-faulthandler before running tests
 * [ARROW-5748](https://issues.apache.org/jira/browse/ARROW-5748) - [Packaging][deb] Add support for Debian GNU/Linux buster
@@ -570,7 +570,7 @@ $ git shortlog -csn apache-arrow-0.13.0..apache-arrow-0.14.0
 * [ARROW-2461](https://issues.apache.org/jira/browse/ARROW-2461) - [Python] Build wheels for manylinux2010 tag
 * [ARROW-3344](https://issues.apache.org/jira/browse/ARROW-3344) - [Python] test\_plasma.py fails (in test\_plasma\_list)
 * [ARROW-3399](https://issues.apache.org/jira/browse/ARROW-3399) - [Python] Cannot serialize numpy matrix object
-* [ARROW-3650](https://issues.apache.org/jira/browse/ARROW-3650) - [Python] Mixed column indexes are read back as strings 
+* [ARROW-3650](https://issues.apache.org/jira/browse/ARROW-3650) - [Python] Mixed column indexes are read back as strings
 * [ARROW-3762](https://issues.apache.org/jira/browse/ARROW-3762) - [C++] Parquet arrow::Table reads error when overflowing capacity of BinaryArray
 * [ARROW-4021](https://issues.apache.org/jira/browse/ARROW-4021) - [Ruby] Error building red-arrow on msys2
 * [ARROW-4076](https://issues.apache.org/jira/browse/ARROW-4076) - [Python] schema validation and filters
@@ -592,7 +592,7 @@ $ git shortlog -csn apache-arrow-0.13.0..apache-arrow-0.14.0
 * [ARROW-4885](https://issues.apache.org/jira/browse/ARROW-4885) - [Python] read\_csv() can't handle decimal128 columns
 * [ARROW-4886](https://issues.apache.org/jira/browse/ARROW-4886) - [Rust] Inconsistent behaviour with casting sliced primitive array to list array
 * [ARROW-4923](https://issues.apache.org/jira/browse/ARROW-4923) - Expose setters for Decimal vector that take long and double inputs
-* [ARROW-4934](https://issues.apache.org/jira/browse/ARROW-4934) - [Python] Address deprecation notice that will be a bug in Python 3.8 
+* [ARROW-4934](https://issues.apache.org/jira/browse/ARROW-4934) - [Python] Address deprecation notice that will be a bug in Python 3.8
 * [ARROW-5019](https://issues.apache.org/jira/browse/ARROW-5019) - [C#] ArrowStreamWriter doesn't work on a non-seekable stream
 * [ARROW-5049](https://issues.apache.org/jira/browse/ARROW-5049) - [Python] org/apache/hadoop/fs/FileSystem class not found when pyarrow FileSystem used in spark
 * [ARROW-5051](https://issues.apache.org/jira/browse/ARROW-5051) - [GLib][Gandiva] Test failure in release verification script
@@ -614,7 +614,7 @@ $ git shortlog -csn apache-arrow-0.13.0..apache-arrow-0.14.0
 * [ARROW-5142](https://issues.apache.org/jira/browse/ARROW-5142) - [CI] Fix conda calls in AppVeyor scripts
 * [ARROW-5144](https://issues.apache.org/jira/browse/ARROW-5144) - [Python] ParquetDataset and ParquetPiece not serializable
 * [ARROW-5146](https://issues.apache.org/jira/browse/ARROW-5146) - [Dev] Merge script imposes directory name
-* [ARROW-5147](https://issues.apache.org/jira/browse/ARROW-5147) - [C++] get an error in building: Could NOT find DoubleConversion 
+* [ARROW-5147](https://issues.apache.org/jira/browse/ARROW-5147) - [C++] get an error in building: Could NOT find DoubleConversion
 * [ARROW-5148](https://issues.apache.org/jira/browse/ARROW-5148) - [CI] [C++] LLVM-related compile errors
 * [ARROW-5149](https://issues.apache.org/jira/browse/ARROW-5149) - [Packaging][Wheel] Pin LLVM to version 7 in windows builds
 * [ARROW-5152](https://issues.apache.org/jira/browse/ARROW-5152) - [Python] CMake warnings when building
@@ -622,7 +622,7 @@ $ git shortlog -csn apache-arrow-0.13.0..apache-arrow-0.14.0
 * [ARROW-5160](https://issues.apache.org/jira/browse/ARROW-5160) - [C++] ABORT\_NOT\_OK evalutes expression twice
 * [ARROW-5166](https://issues.apache.org/jira/browse/ARROW-5166) - [Python][Parquet] Statistics for uint64 columns may overflow
 * [ARROW-5167](https://issues.apache.org/jira/browse/ARROW-5167) - [C++] Upgrade string-view-light to latest
-* [ARROW-5169](https://issues.apache.org/jira/browse/ARROW-5169) - [Python] non-nullable fields are converted to nullable in {{Table.from\_pandas}}
+* [ARROW-5169](https://issues.apache.org/jira/browse/ARROW-5169) - [Python] non-nullable fields are converted to nullable in Table.from\_pandas
 * [ARROW-5173](https://issues.apache.org/jira/browse/ARROW-5173) - [Go] handle multiple concatenated streams back-to-back
 * [ARROW-5174](https://issues.apache.org/jira/browse/ARROW-5174) - [Go] implement Stringer for DataTypes
 * [ARROW-5177](https://issues.apache.org/jira/browse/ARROW-5177) - [Python] ParquetReader.read\_column() doesn't check bounds
@@ -655,13 +655,13 @@ $ git shortlog -csn apache-arrow-0.13.0..apache-arrow-0.14.0
 * [ARROW-5301](https://issues.apache.org/jira/browse/ARROW-5301) - [Python] parquet documentation outdated on nthreads argument
 * [ARROW-5306](https://issues.apache.org/jira/browse/ARROW-5306) - [CI] [GLib] Disable GTK-Doc
 * [ARROW-5308](https://issues.apache.org/jira/browse/ARROW-5308) - [Go] remove deprecated Feather format
-* [ARROW-5314](https://issues.apache.org/jira/browse/ARROW-5314) - [Go] Incorrect Printing for String Arrays with Offsets 
+* [ARROW-5314](https://issues.apache.org/jira/browse/ARROW-5314) - [Go] Incorrect Printing for String Arrays with Offsets
 * [ARROW-5325](https://issues.apache.org/jira/browse/ARROW-5325) - [Archery][Benchmark] Output properly formatted jsonlines from benchmark diff cli command
 * [ARROW-5330](https://issues.apache.org/jira/browse/ARROW-5330) - [Python] [CI] Run Python Flight tests on Travis-CI
 * [ARROW-5332](https://issues.apache.org/jira/browse/ARROW-5332) - [R] R package fails to build/install: error in dyn.load()
 * [ARROW-5348](https://issues.apache.org/jira/browse/ARROW-5348) - [CI] [Java] Gandiva checkstyle failure
 * [ARROW-5360](https://issues.apache.org/jira/browse/ARROW-5360) - [Rust] Builds are broken by rustyline on nightly 2019-05-16+
-* [ARROW-5362](https://issues.apache.org/jira/browse/ARROW-5362) - [C++] Compression round trip test can cause some sanitizers to to fail 
+* [ARROW-5362](https://issues.apache.org/jira/browse/ARROW-5362) - [C++] Compression round trip test can cause some sanitizers to to fail
 * [ARROW-5371](https://issues.apache.org/jira/browse/ARROW-5371) - [Release] Add tests for dev/release/00-prepare.sh
 * [ARROW-5373](https://issues.apache.org/jira/browse/ARROW-5373) - [Java] Add missing details for Gandiva Java Build
 * [ARROW-5376](https://issues.apache.org/jira/browse/ARROW-5376) - [C++] Compile failure on gcc 5.4.0
@@ -669,7 +669,7 @@ $ git shortlog -csn apache-arrow-0.13.0..apache-arrow-0.14.0
 * [ARROW-5387](https://issues.apache.org/jira/browse/ARROW-5387) - [Go] properly handle sub-slice of List
 * [ARROW-5388](https://issues.apache.org/jira/browse/ARROW-5388) - [Go] use arrow.TypeEqual in array.NewChunked
 * [ARROW-5390](https://issues.apache.org/jira/browse/ARROW-5390) - [CI] Job time limit exceeded on Travis
-* [ARROW-5397](https://issues.apache.org/jira/browse/ARROW-5397) - Test Flight TLS support 
+* [ARROW-5397](https://issues.apache.org/jira/browse/ARROW-5397) - Test Flight TLS support
 * [ARROW-5398](https://issues.apache.org/jira/browse/ARROW-5398) - [Python] Flight tests broken by URI changes
 * [ARROW-5403](https://issues.apache.org/jira/browse/ARROW-5403) - [C++] Test failures not propagated in Windows shared builds
 * [ARROW-5411](https://issues.apache.org/jira/browse/ARROW-5411) - [C++][Python] Build error building on Mac OS Mojave