Posted to commits@arrow.apache.org by "raulcd (via GitHub)" <gi...@apache.org> on 2023/05/03 13:57:57 UTC

[GitHub] [arrow-site] raulcd commented on a diff in pull request #346: [Website] Version 12.0.0 blog post

raulcd commented on code in PR #346:
URL: https://github.com/apache/arrow-site/pull/346#discussion_r1183727075


##########
_posts/2023-05-02-12.0.0-release.md:
##########
@@ -0,0 +1,265 @@
+---
+layout: post
+title: "Apache Arrow 12.0.0 Release"
+date: "2023-05-02 00:00:00"
+author: pmc
+categories: [release]
+---
+<!--
+{% comment %}
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements.  See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to you under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License.  You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+{% endcomment %}
+-->
+
+
+The Apache Arrow team is pleased to announce the 12.0.0 release. This covers
+over 3 months of development work and includes [**476 resolved issues**][1]
+with [**531 commits from 97 distinct contributors**][2].
+See the [Install Page](https://arrow.apache.org/install/)
+to learn how to get the libraries for your platform.
+
+The release notes below are not exhaustive and only expose selected highlights
+of the release. Many other bugfixes and improvements have been made: we refer
+you to the [complete changelog][3].
+
+## Community
+
+Since the 11.0.0 release, Wang Mingming, Mustafa Akur and Ruihang Xia
+have been invited to be committers.
+Will Jones has joined the Project Management Committee (PMC).
+
+Thanks for your contributions and participation in the project!
+
+## Columnar Format Notes
+
+A first "canonical" extension type has been formalized: `arrow.fixed_shape_tensor` to
+represent an Array where each slot contains a tensor, with all tensors having the same
+dimension and shape. This is based on a Fixed-Size List layout as storage array
+(https://arrow.apache.org/docs/dev/format/CanonicalExtensions.html#fixed-shape-tensor-extension).
+
+## Arrow Flight RPC notes
+
+The JDBC driver for Arrow Flight SQL has had some bugfixes, and has been refactored into a core library (which is not distributed as an uberjar with shaded names) and a driver (which is distributed as an uberjar).
+
+The Java server builder API now offers easier access to the underlying gRPC builder.
+
+Go now implements the Flight SQL extensions for Substrait and transaction support.
+
+## Plasma notes
+
+Plasma has been deprecated since 10.0.0 and is removed in this
+release. [GH-33243][GH-33243]
+
+## C++ notes
+
+* Run-End Encoded Arrays have been implemented and are accessible ([GH-32104](https://github.com/apache/arrow/issues/32104))
+* The FixedShapeTensor Logical value type has been implemented using ExtensionType ([GH-15483](https://github.com/apache/arrow/issues/15483), [GH-34796](https://github.com/apache/arrow/issues/34796))
+
+### Compute
+
+* New kernel to convert timestamp with timezone to wall time ([GH-33143](https://github.com/apache/arrow/issues/33143))
+* Cast kernels are now built into libarrow by default ([GH-34388](https://github.com/apache/arrow/issues/34388))
+
+### Acero
+
+* Acero has been moved out of libarrow into its own shared library, allowing for smaller builds of the core libarrow ([GH-15280](https://github.com/apache/arrow/issues/15280))
+* Exec nodes now can have a concept of "ordering" and will reject non-sensible plans ([GH-34136](https://github.com/apache/arrow/issues/34136))
+* New exec nodes: "pivot_longer" ([GH-34266](https://github.com/apache/arrow/issues/34266)), "order_by" ([GH-34248](https://github.com/apache/arrow/issues/34248)) and "fetch" ([GH-34059](https://github.com/apache/arrow/issues/34059))
+* Reorder output fields of "group_by" node so that keys/segment keys come before aggregates ([GH-33616](https://github.com/apache/arrow/issues/33616))
+
+### Parquet
+
+* Added support for DeltaLengthByteArray encoding to the Parquet writer ([GH-33024](https://github.com/apache/arrow/issues/33024))
+* NaNs are correctly handled now for Parquet predicate push-downs ([GH-18481](https://github.com/apache/arrow/issues/18481))
+* Added support for reading Parquet page indexes ([GH-33596](https://github.com/apache/arrow/issues/33596)) and writing page indexes ([GH-34053](https://github.com/apache/arrow/issues/34053))
+* Parquet writer can write columns in parallel now ([GH-33655](https://github.com/apache/arrow/issues/33655))
+* Fixed incorrect number of rows in Parquet V2 page headers ([GH-34086](https://github.com/apache/arrow/issues/34086))
+* Fixed incorrect Parquet page null_count when stats are disabled ([GH-34326](https://github.com/apache/arrow/issues/34326))
+* Added support for reading BloomFilters to the Parquet Reader ([GH-34665](https://github.com/apache/arrow/issues/34665))
+* Parquet File-writer can now add additional key-value metadata after it has been opened ([GH-34888](https://github.com/apache/arrow/issues/34888))
+
+### ORC
+
+* Added support for the union type in ORC writer ([GH-34262](https://github.com/apache/arrow/issues/34262))
+* Fixed ORC CHAR type mapping with Arrow ([GH-34823](https://github.com/apache/arrow/issues/34823))
+* Fixed timestamp type mapping between ORC and Arrow ([GH-34590](https://github.com/apache/arrow/issues/34590))
+
+### Datasets
+
+* Added support for reading JSON datasets ([GH-33209](https://github.com/apache/arrow/issues/33209))
+* Dataset writer now supports specifying a function callback to construct the file name in addition to the existing file name template ([GH-34565](https://github.com/apache/arrow/issues/34565))
+
+### Filesystems
+
+* GcsFileSystem::OpenInputFile avoids unnecessary downloads ([GH-34051](https://github.com/apache/arrow/issues/34051))
+
+### Other changes
+
+* Convenience Append(std::optional<T>...) methods have been added to array builders ([GH-14863](https://github.com/apache/arrow/issues/14863))
+* A deprecated OpenTelemetry header was removed from the Flight library ([GH-34417](https://github.com/apache/arrow/issues/34417))
+* Fixed crash in "take" kernels on ExtensionArrays with an underlying dictionary type ([GH-34619](https://github.com/apache/arrow/issues/34619))
+* Fixed bug where the C-Data bridge did not preserve nullability of map values on import ([GH-34983](https://github.com/apache/arrow/issues/34983))
+* Added support for EqualOptions to RecordBatch::Equals ([GH-34968](https://github.com/apache/arrow/issues/34968))
+* zstd dependency upgraded to v1.5.5 ([GH-34899](https://github.com/apache/arrow/issues/34899))
+* Improved handling of "logical" nulls such as with union and RunEndEncoded arrays ([GH-34361](https://github.com/apache/arrow/issues/34361))
+* Fixed incorrect handling of uncompressed body buffers in IPC reader, added IpcWriteOptions::min_space_savings for optional compression optimizations ([GH-15102](https://github.com/apache/arrow/issues/15102))
+
+## C# notes

Review Comment:
   ```suggestion
   ## C# notes
   
   * Added support for Float 16 [GH-25163](https://github.com/apache/arrow/issues/25163)
   * Added decompression support for Record Batches [GH-32240](https://github.com/apache/arrow/issues/32240) and added new Apache.Arrow.Compression package.
   * Define C Data Interface for schemas and types [GH-34737](https://github.com/apache/arrow/issues/34737)
   ```
   @westonpace @wjones127 @eerhardt 
   I tried to add some of the relevant notes I found. Let me know if those are ok with you.
   This is the link with the C# issues closed on the 12.0.0 release


