Posted to commits@arrow.apache.org by GitBox <gi...@apache.org> on 2022/10/29 16:46:05 UTC

[GitHub] [arrow-site] andygrove commented on a diff in pull request #257: [WEBSITE] Ballista 0.9.0 Blog Post

andygrove commented on code in PR #257:
URL: https://github.com/apache/arrow-site/pull/257#discussion_r1008728648


##########
_posts/2022-10-28-ballista-0.9.0.md:
##########
@@ -0,0 +1,130 @@
+---
+layout: post
+title: Apache Arrow Ballista 0.9.0 Release
+date: "2022-10-28 00:00:00"
+author: pmc
+categories: [release]
+---
+
+<!--
+{% comment %}
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements.  See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to you under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License.  You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+{% endcomment %}
+-->
+
+# Introduction
+
+[Ballista][ballista] is an Arrow-native distributed SQL query engine implemented in Rust.
+
+Ballista 0.9.0 is now available and is the most significant release since the project was [donated][donated] to Apache
+Arrow in 2021.
+
+This release represents 4 weeks of work, with 66 commits from 14 contributors:
+
+```
+    22  Andy Grove
+    12  yahoNanJing
+     6  Daniël Heres
+     4  Brent Gardner
+     4  dependabot[bot]
+     4  r.4ntix
+     3  Stefan Stanciulescu
+     3  mingmwang
+     2  Ken Suenobu
+     2  Yang Jiang
+     1  Metehan Yıldırım
+     1  Trent Feda
+     1  askoa
+     1  yangzhong
+```
+
+## Release Highlights
+
+The release notes below are not exhaustive and cover only selected highlights of the release. Many other bug fixes
+and improvements have been made; see the [complete changelog][changelog] for details.
+
+### Support for Cloud Object Stores and Distributed File Systems
+
+This is the first release of Ballista to have documented support for querying data from distributed file systems and
+object stores. Currently, S3 and HDFS are supported. Support for Google Cloud Storage and Azure Blob Storage is planned
+for the next release.
+
+### Flight SQL & JDBC support
+
+The Ballista scheduler now implements the [Flight SQL protocol][flight-sql], enabling any compliant Flight SQL client
+to connect to and run queries against a Ballista cluster.
+
+The Apache Arrow Flight SQL JDBC driver can be used to connect Business Intelligence tools to a Ballista cluster.
+
+### Python Bindings
+
+It is now possible to connect to a Ballista cluster from Python and execute queries using both the DataFrame and SQL
+interfaces.
+
+### Scheduler Web User Interface and REST API
+
+The scheduler now has a web user interface for monitoring queries. It is also possible to view graphical query plans
+that show how the query was executed, along with metrics.
+
+<img src="{{ site.baseurl }}/img/2022-10-28-ballista-web-ui.png" width="800"/>
+
+The REST API that powers the user interface can also be accessed directly.
+
+### Simplified Kubernetes Deployment
+
+Ballista now provides a [Helm chart][helm-chart] for simplified Kubernetes deployment.
+
+### Performance and Scalability
+
+We continue to use benchmarks derived from TPC-H to compare performance with other query engines and identify
+query optimizations that should be implemented. With the 0.9.0 release, some queries are now more than 10x
+faster than Apache Spark at the scale factor we currently test with.
+
+<img src="{{ site.baseurl }}/img/2022-10-28-ballista-perf.png" />
+
+More optimizations are planned for the 0.10.0 release. See the [tracking issue][optimizations] for details.

Review Comment:
   I plan to remove this section, since I have no confidence that all of the queries are working correctly.


