Posted to commits@arrow.apache.org by al...@apache.org on 2021/07/29 11:51:32 UTC

[arrow-site] branch master updated: Rust 5.0.0 release blog (#128)

This is an automated email from the ASF dual-hosted git repository.

alamb pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/arrow-site.git


The following commit(s) were added to refs/heads/master by this push:
     new a5a81c3  Rust 5.0.0 release blog (#128)
a5a81c3 is described below

commit a5a81c3ca7c531b6392d8e70fb539a32cdd859ed
Author: Andrew Lamb <an...@nerdnetworks.org>
AuthorDate: Thu Jul 29 07:51:24 2021 -0400

    Rust 5.0.0 release blog (#128)
    
    * Rust 5.0.0 release blog
    
    * Apply suggestions from code review
    
    Co-authored-by: Andy Grove <an...@gmail.com>
    Co-authored-by: David Roher <dr...@users.noreply.github.com>
    
    * Reword arrow features / releases
    
    * Add note about FFI/C data interface integration
    
    * Update _posts/2021-07-20-5.0.0-rs-release.md
    
    Co-authored-by: Neal Richardson <ne...@gmail.com>
    
    * Update dates, add link to 5.0.0 release announcement, qualify MIRI progress, and note MapArray support coming soon
    
    Co-authored-by: Andy Grove <an...@gmail.com>
    Co-authored-by: David Roher <dr...@users.noreply.github.com>
    Co-authored-by: Neal Richardson <ne...@gmail.com>
---
 _posts/2021-07-20-5.0.0-rs-release.md | 115 ++++++++++++++++++++++++++++++++++
 1 file changed, 115 insertions(+)

diff --git a/_posts/2021-07-20-5.0.0-rs-release.md b/_posts/2021-07-20-5.0.0-rs-release.md
new file mode 100644
index 0000000..16fe42b
--- /dev/null
+++ b/_posts/2021-07-20-5.0.0-rs-release.md
@@ -0,0 +1,115 @@
+---
+layout: post
+title: Apache Arrow Rust 5.0.0 Release
+date: "2021-07-29 00:00:00"
+author: pmc
+categories: [release]
+---
+<!--
+{% comment %}
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements.  See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to you under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License.  You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+{% endcomment %}
+-->
+
+We recently released version 5.0.0 of the Rust implementation of [Apache Arrow](https://arrow.apache.org/), which coincides with the [Arrow 5.0.0 release](https://arrow.apache.org/release/5.0.0.html). This post highlights some of the improvements in the Rust implementation. The full changelog can be found [here](https://github.com/apache/arrow-rs/blob/5.0.0/CHANGELOG.md).
+
+<!--
+(arrow_dev) alamb@MacBook-Pro:~/Software/arrow-rs$ git log --pretty=oneline 4.0.0..5.0.0 | wc -l
+     161
+(arrow_dev) alamb@MacBook-Pro:~/Software/arrow-rs$ git shortlog -sn 4.0.0..5.0.0 | wc -l
+      35 // but Jorge is double counted
+-->
+
+The Rust Arrow implementation would not be possible without the wonderful work and support of our community, and the 5.0.0 release is no exception. It includes 161 commits from 34 individual contributors, many of them making their first contribution. Thank you all very much.
+
+# Arrow
+Feature-wise, this release adds:
+1. A [new kernel](https://github.com/apache/arrow-rs/pull/424) which lexicographically partitions points.
+2. [Expanded support](https://github.com/apache/arrow-rs/pull/439) for the FFI/[C data interface](https://arrow.apache.org/docs/format/CDataInterface.html), easing integration with the broader Arrow ecosystem.
+3. Usability enhancements for creating and manipulating `RecordBatch`es.
+4. [Improved usability](https://github.com/apache/arrow-rs/pull/377) for Arrow Flight's API.
+5. A slimmer dependency stack when default features are disabled.
+
+We continue to leverage the Rust ecosystem to deliver reliable and performant code. We made significant progress towards running the Rust test suite under the [MIRI checker](https://github.com/rust-lang/miri) (a sort of valgrind for Rust) for memory access violations, and we expect it to be fully enabled in CI for the 5.1.0 release.
+
+Of course, this release also contains bug fixes, performance improvements, and improved documentation examples. For the full list of changes, please consult the [changelog](https://github.com/apache/arrow-rs/blob/5.0.0/CHANGELOG.md).
+
+# Parquet
+The `parquet-derive` crate now automatically derives the required parquet schema, and the `parquet` crate received several bug fixes and enhancements.
+
+# More Frequent Releases
+Arrow releases major versions every three months. The Rust implementation has been experimenting with releasing minor version updates to speed the flow of new features and fixes. By implementing a new development process, as described in [A New Development Workflow for Arrow's Rust Implementation](https://arrow.apache.org/blog/2021/05/04/rust-dev-workflow/), we have successfully shipped four minor releases on the 4.x.x line, one every other week, without any reports of breakage.
+
+You can always find the latest releases on crates.io: [`arrow`](https://crates.io/crates/arrow), [`parquet`](https://crates.io/crates/parquet), [`arrow-flight`](https://crates.io/crates/arrow-flight), and [`parquet-derive`](https://crates.io/crates/parquet-derive).
+
+# DataFusion & Ballista
+[DataFusion](https://docs.rs/datafusion/4.0.0/datafusion/) is an in-memory query engine with DataFrame and SQL APIs, built on top of Arrow. Ballista is a distributed compute platform. These projects are now in their [own repository](https://github.com/apache/arrow-datafusion), and are no longer released in lock-step with Arrow. Expect further news in this area soon.
+
+# Roadmap for 6.0.0 and Beyond
+Here are some of the initiatives that contributors are currently working on for future releases:
+
+* Improved performance of compute kernels
+* Date/time/timestamp/interval compute kernels
+* MapArray support
+* Preparations for removing the use of `unsafe` to make arrow faster and more secure -- see the [mailing list](https://lists.apache.org/thread.html/recac1f6dc982bab2923f8fb6992e2d4c927f46daff5f03ed6c4de19c%40%3Cdev.arrow.apache.org%3E) discussion for more details.
+
+# Contributors to 5.0.0
+Again, thank you to all the contributors for this release. Here is the raw git listing:
+<!--
+(arrow_dev) alamb@MacBook-Pro:~/Software/arrow-rs$ git shortlog -sn 4.0.0..5.0.0
+.. list below ..
+
+Note I combined two distinct names for Jorge
+-->
+```
+    28  Jorge Leitao
+    27  Andrew Lamb
+    15  Jiayu Liu
+    12  Ritchie Vink
+    10  Wakahisa
+     8  Raphael Taylor-Davies
+     6  Daniël Heres
+     5  Andy Grove
+     5  Navin
+     5  Jörn Horstmann
+     4  Ádám Lippai
+     4  Dominik Moritz
+     4  Marco Neumann
+     3  Roee Shlomo
+     3  Michael Edwards
+     2  Steven
+     2  Krisztián Szűcs
+     2  Gary Pennington
+     1  Ben Chambers
+     1  Max Meldrum
+     1  Edd Robinson
+     1  Gang Liao
+     1  Chojan Shang
+     1  Boaz
+     1  Wes McKinney
+     1  Yordan Pavlov
+     1  baishen
+     1  hulunbier
+     1  kazuhiko kikuchi
+     1  Dmitry Patsura
+     1  Kornelijus Survila
+     1  Laurent Mazare
+     1  Manish Gill
+     1  Marc van Heerden
+```
+
+# How to Get Involved
+If you are interested in contributing to the Rust implementation of Apache Arrow, we would love to have you! You can help by trying out Arrow on your own data and projects and filing bug reports, or by contributing documentation, tests, or code. A list of open issues suitable for beginners is [here](https://github.com/apache/arrow-rs/labels/good%20first%20issue) and the full list of issues is [here](https://github.com/apache/arrow-rs/issues).