Posted to commits@arrow.apache.org by GitBox <gi...@apache.org> on 2021/02/01 21:28:44 UTC

[GitHub] [arrow-site] bkietz commented on a change in pull request #92: 3.0 release blog post

bkietz commented on a change in pull request #92:
URL: https://github.com/apache/arrow-site/pull/92#discussion_r568151444



##########
File path: _posts/2021-01-25-3.0.0-release.md
##########
@@ -0,0 +1,231 @@
+---
+layout: post
+title: "Apache Arrow 3.0.0 Release"
+date: "2021-01-25 00:00:00 -0600"
+author: pmc
+categories: [release]
+---
+<!--
+{% comment %}
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements.  See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to you under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License.  You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+{% endcomment %}
+-->
+
+The Apache Arrow team is pleased to announce the 3.0.0 release. This covers
+over 3 months of development work and includes [**666 resolved issues**][1]
+from [**106 distinct contributors**][2]. See the Install Page to learn how to
+get the libraries for your platform.
+
+The release notes below are not exhaustive and only expose selected highlights
+of the release. Many other bugfixes and improvements have been made: we refer
+you to the [complete changelog][3].
+
+
+## Columnar Format Notes
+
+The Decimal256 data type, which was already supported by the Arrow columnar
+format specification, is now implemented in C++ and Java (ARROW-9747).
+
+## Arrow Flight RPC notes
+
+Authentication in C++/Java/Python has been overhauled, allowing more flexible authentication methods and use of standard headers.
+Support for cookies has also been added.
+The C++/Java implementations are now more permissive when parsing messages in order to interoperate better with other Flight implementations.
+
+A basic Flight implementation for C#/.NET has been added.
+See the [implementation status matrix](https://arrow.apache.org/docs/status.html#flight-rpc) for details.
+
+## C++ notes
+
+The default memory pool can now be changed at runtime using the environment
+variable `ARROW_DEFAULT_MEMORY_POOL` (ARROW-11009).  The environment variable
+is inspected at process startup.  This is useful when trying to diagnose memory
+consumption issues with Arrow.
+
+STL-like iterators are now provided over concrete arrays. Those are useful for
+non-performance critical tasks, for example testing (ARROW-10776).
+
+It is now possible to concatenate dictionary arrays with unequal dictionaries.
+The dictionaries are unified when concatenating, for supported data types
+(ARROW-5336).
+
+Threads in a thread pool are now spawned lazily as needed for enqueued
+tasks, up to the configured capacity. They used to be spawned upfront on
+creation of the thread pool (ARROW-10038).
+
+### Compute layer
+
+Comprehensive documentation for compute functions is now available:
+https://arrow.apache.org/docs/cpp/compute.html
+
+Compute functions for string processing have been added for:
+* splitting on whitespace (ASCII and Unicode flavours) and splitting on a
+  pattern (ARROW-9991);
+* trimming characters (ARROW-9128).
+
+Behavior of the `index_in` and `is_in` compute functions with nulls has been
+changed for consistency (ARROW-10663).
+
+Multiple-column sort kernels are now available for tables and record batches
+(ARROW-8199, ARROW-10796, ARROW-10790).
+
+Performance of table filtering has been vastly improved (ARROW-10569).
+
+Scalar arguments are now accepted for more compute functions.
+
+Compute functions `quantile` (ARROW-10831) and `is_nan` (ARROW-11043) have been
+added for numeric data.
+
+Aggregation functions `any` (ARROW-1846) and `all` (ARROW-10301) have been
+added for boolean data.
+

Review comment:
       ```suggestion
   ### Dataset
   
   The `Expression` hierarchy has been simplified to a wrapper around literals, field references,
   or calls to named functions. This enables usage of any compute function while filtering
   with no boilerplate.
   
   Parquet statistics are lazily parsed in `ParquetDatasetFactory` and
   `ParquetFileFragment` for shorter construction time.
   
   ```
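
For readers unfamiliar with that shape, here is a toy model of an expression that is either a literal, a field reference, or a call to a named function. It is a minimal sketch in plain Python, not Arrow's implementation or API: the class names, the `FUNCTIONS` registry, and the row-by-row `evaluate` helper are all hypothetical (Arrow evaluates over columnar data), and only the function names `greater` and `utf8_upper` are borrowed from Arrow's compute function list.

```python
import operator
from dataclasses import dataclass
from typing import Any, Callable, Dict, Tuple, Union

# Hypothetical stand-in for a registry of named compute functions.
FUNCTIONS: Dict[str, Callable[..., Any]] = {
    "greater": operator.gt,
    "utf8_upper": str.upper,
}


@dataclass(frozen=True)
class Literal:
    value: Any


@dataclass(frozen=True)
class FieldRef:
    name: str


@dataclass(frozen=True)
class Call:
    function: str                         # looked up by name in FUNCTIONS
    arguments: Tuple["Expression", ...]   # nested expressions


Expression = Union[Literal, FieldRef, Call]


def evaluate(expr: Expression, row: Dict[str, Any]) -> Any:
    """Evaluate an expression against one row (a dict of field -> value)."""
    if isinstance(expr, Literal):
        return expr.value
    if isinstance(expr, FieldRef):
        return row[expr.name]
    if isinstance(expr, Call):
        args = [evaluate(arg, row) for arg in expr.arguments]
        return FUNCTIONS[expr.function](*args)
    raise TypeError(f"not an expression: {expr!r}")


rows = [{"x": 1, "s": "a"}, {"x": 5, "s": "b"}]

# Filter predicate: x > 3, written as a call to a named function.
predicate = Call("greater", (FieldRef("x"), Literal(3)))
filtered = [row for row in rows if evaluate(predicate, row)]
```

Because a call only names its function, registering another entry in `FUNCTIONS` makes it usable in a filter without adding a new expression class, which is the "no boilerplate" point in the suggestion.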