Posted to commits@arrow.apache.org by GitBox <gi...@apache.org> on 2020/06/24 20:46:03 UTC

[GitHub] [arrow-site] wesm commented on a change in pull request #63: Revamp website for 1.0 release

wesm commented on a change in pull request #63:
URL: https://github.com/apache/arrow-site/pull/63#discussion_r445134010



##########
File path: _includes/header.html
##########
@@ -50,22 +33,44 @@
           </a>
           <div class="dropdown-menu" aria-labelledby="navbarDropdownDocumentation">
             <a class="dropdown-item" href="{{ site.baseurl }}/docs">Project Docs</a>
-            <a class="dropdown-item" href="{{ site.baseurl }}/docs/python">Python</a>
+            <a class="dropdown-item" href="{{ site.baseurl }}/docs/format/Columnar.html">Specification</a>
+            <hr/>
+            <a class="dropdown-item" href="{{ site.baseurl }}/docs/c_glib">C GLib</a>
             <a class="dropdown-item" href="{{ site.baseurl }}/docs/cpp">C++</a>
+            <a class="dropdown-item" href="https://github.com/apache/arrow/blob/master/csharp/README.md">C#</a>
+            <a class="dropdown-item" href="https://godoc.org/github.com/apache/arrow/go/arrow">Go</a>
             <a class="dropdown-item" href="{{ site.baseurl }}/docs/java">Java</a>
-            <a class="dropdown-item" href="{{ site.baseurl }}/docs/c_glib">C GLib</a>
             <a class="dropdown-item" href="{{ site.baseurl }}/docs/js">JavaScript</a>
+            <a class="dropdown-item" href="https://github.com/apache/arrow/blob/master/matlab/README.md">MATLAB</a>
+            <a class="dropdown-item" href="{{ site.baseurl }}/docs/python">Python</a>
             <a class="dropdown-item" href="{{ site.baseurl }}/docs/r">R</a>
+            <a class="dropdown-item" href="https://github.com/apache/arrow/blob/master/ruby/README.md">Ruby</a>
+            <a class="dropdown-item" href="https://docs.rs/crate/arrow/">Rust</a>
+          </div>
+        </li>
+        <li class="nav-item dropdown">
+          <a class="nav-link dropdown-toggle" href="#"
+             id="navbarDropdownCommunity" role="button" data-toggle="dropdown"
+             aria-haspopup="true" aria-expanded="false">
+             Community
+          </a>
+          <div class="dropdown-menu" aria-labelledby="navbarDropdownCommunity">
+            <a class="dropdown-item" href="{{ site.baseurl }}/community/">Mailing Lists</a>

Review comment:
       "Communications"?

##########
File path: _includes/header.html
##########
@@ -50,22 +33,44 @@
           </a>
           <div class="dropdown-menu" aria-labelledby="navbarDropdownDocumentation">
             <a class="dropdown-item" href="{{ site.baseurl }}/docs">Project Docs</a>
-            <a class="dropdown-item" href="{{ site.baseurl }}/docs/python">Python</a>
+            <a class="dropdown-item" href="{{ site.baseurl }}/docs/format/Columnar.html">Specification</a>

Review comment:
       "Columnar Format"?

##########
File path: community.md
##########
@@ -0,0 +1,73 @@
+---
+layout: default
+title: Apache Arrow Community
+description: Links and resources for participating in Apache Arrow
+---
+<!--
+{% comment %}
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements.  See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to you under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License.  You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+{% endcomment %}
+-->
+
+# Apache Arrow Community
+
+We welcome participation from everyone and encourage you to join us, ask questions, and get involved.
+
+All participation in the Apache Arrow project is governed by the Apache Software Foundation's [code of conduct](https://www.apache.org/foundation/policies/conduct.html).
+
+## Questions?
+
+### Mailing lists
+
+These arrow.apache.org mailing lists are for project discussion:
+
+<ul>
+  <li> <code>user@</code> is for questions on using Apache Arrow libraries {% include mailing_list_links.html list="user" %} </li>
+  <li> <code>dev@</code> is for discussions about contributing to the project development {% include mailing_list_links.html list="dev" %} </li>
+</ul>
+
+When emailing one of the lists, you may want to prefix the subject line with one or more tags, like `[C++] why did this segfault?`, `[Python] trouble with wheels`, etc., so that the appropriate people in the community notice the message.
+
+You may also wish to subscript to these lists, which capture some activity streams:

Review comment:
       subscribe

##########
File path: faq.md
##########
@@ -24,32 +24,155 @@ limitations under the License.
 
 # Frequently Asked Questions
 
+## General
+
+### What *is* Arrow?
+
+Arrow is an open standard for how to represent columnar data in memory, along
+with libraries in many languages that implement that standard.  The Arrow format
+allows different programs and runtimes, perhaps written in different languages,
+to share data efficiently using a set of rich data types (included nested
+and user-defined data types).  The Arrow libraries make it easy to write such
+programs, by sparing the programmer from implementing low-level details of the
+Arrow format.
+
+Arrow additionally defines a streaming format and a file format for
+inter-process communication (IPC), based on the in-memory format.  It also
+defines a generic client-server RPC mechanism (Arrow Flight), based on the
+IPC format, and implemented on top of the gRPC framework.  <!-- TODO links -->
+
+### Why create a new standard?
+
+<!-- Fill this in -->
+
+## Project status
+
 ### How stable is the Arrow format? Is it safe to use in my application?
+<!-- Revise this -->
+
+The Arrow *in-memory format* is considered stable, and we intend to make only
+backwards-compatible changes, such as additional data types.  It is used by
+many applications already, and you can trust that compatibility will not be
+broken.

Review comment:
       Let's link to the versioning backward/forward compatibility guarantees in the docs

##########
File path: faq.md
##########
@@ -24,32 +24,155 @@ limitations under the License.
 
 # Frequently Asked Questions
 
+## General
+
+### What *is* Arrow?
+
+Arrow is an open standard for how to represent columnar data in memory, along
+with libraries in many languages that implement that standard.  The Arrow format
+allows different programs and runtimes, perhaps written in different languages,
+to share data efficiently using a set of rich data types (included nested
+and user-defined data types).  The Arrow libraries make it easy to write such
+programs, by sparing the programmer from implementing low-level details of the
+Arrow format.

Review comment:
       Here's a reframing -- I have been encouraging us to move away from creating a false equivalence between "Apache Arrow The Project" and the "Arrow Columnar Format". So anyplace where someone might say "Arrow _is_ the columnar format" we should correct them to say that "Arrow _contains_ a columnar format". Please edit / wordsmith as desired
   
  Apache Arrow is a software development platform for building high performance applications that process and transport large data sets. It is designed to both improve the performance of analytical algorithms and the efficiency of moving data from one system (or programming language) to another.
   
  A critical component of Apache Arrow is its **in-memory columnar format**, a standardized language-agnostic data structure specification for representing structured, table-like datasets in memory. This data format has a rich data type system (including nested and user-defined data types) designed to support the needs of analytic database systems, data frame libraries, and more. The project contains many implementations of the Arrow columnar format along with utilities for reading and writing it to many common storage formats.
   
   We do not anticipate that many third-party projects will choose to implement the Arrow columnar format themselves, instead choosing to depend on one of the official libraries. For projects that want to implement a small subset of the format, we have created some tools (like a C data interface) to assist with interoperability with the official Arrow libraries.
   
   The Arrow libraries contain many software components that assist with systems problems related to getting data in and out of remote storage systems and moving Arrow-formatted data over network interfaces. Some of these components can be used even in scenarios where the columnar format is not used at all. 
   
   Lastly, alongside software that helps with data access and IO-related issues, there are libraries of algorithms for performing analytical operations or queries against Arrow datasets.

##########
File path: faq.md
##########
@@ -24,32 +24,155 @@ limitations under the License.
 
 # Frequently Asked Questions
 
+## General
+
+### What *is* Arrow?
+
+Arrow is an open standard for how to represent columnar data in memory, along
+with libraries in many languages that implement that standard.  The Arrow format
+allows different programs and runtimes, perhaps written in different languages,
+to share data efficiently using a set of rich data types (included nested
+and user-defined data types).  The Arrow libraries make it easy to write such
+programs, by sparing the programmer from implementing low-level details of the
+Arrow format.
+
+Arrow additionally defines a streaming format and a file format for
+inter-process communication (IPC), based on the in-memory format.  It also
+defines a generic client-server RPC mechanism (Arrow Flight), based on the
+IPC format, and implemented on top of the gRPC framework.  <!-- TODO links -->
+
+### Why create a new standard?
+
+<!-- Fill this in -->
+
+## Project status
+
 ### How stable is the Arrow format? Is it safe to use in my application?
+<!-- Revise this -->
+
+The Arrow *in-memory format* is considered stable, and we intend to make only
+backwards-compatible changes, such as additional data types.  It is used by
+many applications already, and you can trust that compatibility will not be
+broken.
+
+The Arrow *file format* (based on the Arrow IPC mechanism) is not recommended
+for long-term disk persistence of data; that said, it is perfectly acceptable
+to write Arrow memory to disk for purposes of memory mapping and caching.

Review comment:
       I think this para can be removed as of 1.0.0

##########
File path: _layouts/home.html
##########
@@ -0,0 +1,21 @@
+{% include top.html %}
+
+<body class="wrap">
+  <header>
+    {% include header.html %}
+  </header>
+  <div class="big-arrow-bg">
+    <div class="container p-lg-4 centered">
+      <img src="{{ site.baseurl }}/img/arrow-inverse.png" style="max-width: 80%;"/>

Review comment:
       Smaller also better imho

##########
File path: faq.md
##########
@@ -24,32 +24,155 @@ limitations under the License.
 
 # Frequently Asked Questions
 
+## General
+
+### What *is* Arrow?
+
+Arrow is an open standard for how to represent columnar data in memory, along
+with libraries in many languages that implement that standard.  The Arrow format
+allows different programs and runtimes, perhaps written in different languages,
+to share data efficiently using a set of rich data types (included nested
+and user-defined data types).  The Arrow libraries make it easy to write such
+programs, by sparing the programmer from implementing low-level details of the
+Arrow format.
+
+Arrow additionally defines a streaming format and a file format for
+inter-process communication (IPC), based on the in-memory format.  It also
+defines a generic client-server RPC mechanism (Arrow Flight), based on the
+IPC format, and implemented on top of the gRPC framework.  <!-- TODO links -->
+
+### Why create a new standard?
+
+<!-- Fill this in -->
+
+## Project status
+
 ### How stable is the Arrow format? Is it safe to use in my application?
+<!-- Revise this -->
+
+The Arrow *in-memory format* is considered stable, and we intend to make only
+backwards-compatible changes, such as additional data types.  It is used by
+many applications already, and you can trust that compatibility will not be
+broken.
+
+The Arrow *file format* (based on the Arrow IPC mechanism) is not recommended
+for long-term disk persistence of data; that said, it is perfectly acceptable
+to write Arrow memory to disk for purposes of memory mapping and caching.
+
+We encourage people to start building Arrow-based in-memory computing
+applications now, and choose a suitable file format for disk storage
+if necessary. The Arrow libraries include adapters for several file formats,
+including Parquet, ORC, CSV, and JSON.

Review comment:
       I don't think we need to hedge regarding people storing Arrow data on disk starting with 1.0.0. We should state explicitly here, however, that we don't intend for Arrow to be a replacement for Parquet (an exceedingly common question) and, where relevant, that the columnar format makes trade-offs to support the performance requirements of in-memory analytics over purely file-storage considerations. Parquet is not a "runtime in-memory format"; file formats almost always have to be deserialized into some in-memory data structure for processing, and we intend for Arrow to be that in-memory data structure.

##########
File path: faq.md
##########
@@ -24,32 +24,155 @@ limitations under the License.
 
 # Frequently Asked Questions
 
+## General
+
+### What *is* Arrow?
+
+Arrow is an open standard for how to represent columnar data in memory, along
+with libraries in many languages that implement that standard.  The Arrow format
+allows different programs and runtimes, perhaps written in different languages,
+to share data efficiently using a set of rich data types (included nested
+and user-defined data types).  The Arrow libraries make it easy to write such
+programs, by sparing the programmer from implementing low-level details of the
+Arrow format.
+
+Arrow additionally defines a streaming format and a file format for
+inter-process communication (IPC), based on the in-memory format.  It also
+defines a generic client-server RPC mechanism (Arrow Flight), based on the
+IPC format, and implemented on top of the gRPC framework.  <!-- TODO links -->
+
+### Why create a new standard?
+
+<!-- Fill this in -->
+
+## Project status
+
 ### How stable is the Arrow format? Is it safe to use in my application?
+<!-- Revise this -->
+
+The Arrow *in-memory format* is considered stable, and we intend to make only
+backwards-compatible changes, such as additional data types.  It is used by
+many applications already, and you can trust that compatibility will not be
+broken.
+
+The Arrow *file format* (based on the Arrow IPC mechanism) is not recommended
+for long-term disk persistence of data; that said, it is perfectly acceptable
+to write Arrow memory to disk for purposes of memory mapping and caching.
+
+We encourage people to start building Arrow-based in-memory computing
+applications now, and choose a suitable file format for disk storage
+if necessary. The Arrow libraries include adapters for several file formats,
+including Parquet, ORC, CSV, and JSON.
+
+### How stable are the Arrow libraries?
+
+We refer you to the [implementation matrix](https://github.com/apache/arrow/blob/master/docs/source/status.rst).
+
+## Getting started
+
+### Where can I get Arrow libraries?
+
+Arrow libraries for many languages are available through the usual package
+managers. See the [install]({{ site.baseurl }}/install/) page for specifics.
 
-The Arrow in-memory format is considered stable, and we intend to make only backwards-compatible changes, such as additional data types. We do not yet recommend the Arrow file format for long-term disk persistence of data; that said, it is perfectly acceptable to write Arrow memory to disk for purposes of memory mapping and caching.
+## Getting involved
 
-We encourage people to start building Arrow-based in-memory computing applications now, and choose a suitable file format for disk storage if necessary. The Arrow libraries include adapters for several file formats, including Parquet, ORC, CSV, and JSON.
+### I have some questions. How can I get help?
+
+The [Arrow mailing lists]({{ site.baseurl }}/community/) are the best place
+to ask questions. Don't be shy--we're here to help.
+
+### I tried to use Arrow and it didn't work. Can you fix it?
+
+Hopefully! Please make a detailed bug report--that's a valuable contribution
+to the project itself.
+See the [contribution guidelines]({{ site.baseurl }}/docs/developers/contributing.html)
+for how to make a report.
+
+### Arrow looks great and I'd totally use it if it only did X. When will it be done?
+
+We use [JIRA](https://issues.apache.org/jira/browse/ARROW) for our issue tracker.
+Search for an issue that matches your need. If you find one, feel free to
+comment on it and describe your use case--that will help whoever picks up
+the task. If you don't find one, make it.
+
+Ultimately, Arrow is software written by and for the community. If you don't
+see someone else in the community working on your issue, the best way to get
+it done is to pitch in yourself. We're more than willing to help you contribute
+successfully to the project.
+
+### How can I report a security vulnerability?
+
+Please send an email to [private@arrow.apache.org](mailto:private@arrow.apache.org).
+See the [security]({{ site.baseurl }}/security/) page for more.
+
+## Relation to other projects
 
 ### What is the difference between Apache Arrow and Apache Parquet?
+<!-- Revise this -->
+
+Parquet is a storage format designed for maximum space efficiency, using
+advanced compression and encoding techniques.  It is ideal when wanting to
+minimize disk usage while storing gigabytes of data, or perhaps more.
+This efficiency comes at the cost of relatively expensive reading into memory,
+as Parquet data cannot be directly operated on but must be decoded in
+large chunks.
+
+Conversely, Arrow is an in-memory format meant for direct and efficient use
+for computational purposes.  Arrow data is not compressed (or only lightly so,
+when using dictionary encoding) but laid out in natural format for the CPU,
+so that data can be accessed at arbitrary places at full speed.
+
+Therefore, Arrow and Parquet are not competitors: they complement each other
+and are commonly used together in applications.  Storing your data on disk
+using Parquet, and reading it into memory in the Arrow format, will allow
+you to make the most of your computing hardware.
 
-In short, Parquet files are designed for disk storage, while Arrow is designed for in-memory use, but you can put it on disk and then memory-map later. Arrow and Parquet are intended to be compatible with each other and used together in applications.
+### What about "Arrow files" then?
 
-Parquet is a columnar file format for data serialization. Reading a Parquet file requires decompressing and decoding its contents into some kind of in-memory data structure. It is designed to be space/IO-efficient at the expensive CPU utilization for decoding. It does not provide any data structures for in-memory computing. Parquet is a streaming format which must be decoded from start-to-end; while some "index page" facilities have been added to the storage format recently, random access operations are generally costly.
+Apache Arrow defines an inter-process communication (IPC) mechanism to
+transfer a collection of Arrow columnar arrays (called a "record batch").
+It can be used synchronously between processes using the Arrow "stream format",
+or asynchronously by first persisting data on storage using the Arrow "file format".
 
-Arrow on the other hand is first and foremost a library providing columnar data structures for *in-memory computing*. When you read a Parquet file, you can decompress and decode the data *into* Arrow columnar data structures so that you can then perform analytics in-memory on the decoded data. The Arrow columnar format has some nice properties: random access is O(1) and each value cell is next to the previous and following one in memory, so it's efficient to iterate over.
+The Arrow IPC mechanism is based on the Arrow in-memory format, such that
+there is no translation necessary between the on-disk representation and
+the in-memory representation.  Therefore, performing analytics on an Arrow
+IPC file can use memory-mapping and pay effectively zero cost.
 
-What about "Arrow files" then? Apache Arrow defines a binary "serialization" protocol for arranging a collection of Arrow columnar arrays (called a "record batch") that can be used for messaging and interprocess communication. You can put the protocol anywhere, including on disk, which can later be memory-mapped or read into memory and sent elsewhere.
+Some things to keep in mind when comparing the Arrow IPC file format and the
+Parquet format:
 
-This Arrow protocol is designed so that you can "map" a blob of Arrow data without doing any deserialization, so performing analytics on Arrow protocol data on disk can use memory-mapping and pay effectively zero cost. The protocol is used for many other things as well, such as streaming data between Spark SQL and Python for running pandas functions against chunks of Spark SQL data (these are called "pandas udfs").
+* Parquet is safe for long-term storage and archival purposes, meaning if
+  you write a file today, you can expect that any system that says they can
+  "read Parquet" will be able to read the file in 5 years or 10 years.
+  We are not yet making this assertion about long-term stability of the Arrow
+  format.
 
-In some applications, Parquet and Arrow can be used interchangeably for on-disk data serialization. Some things to keep in mind:
+* Reading Parquet files generally requires expensive decoding, while reading

Review comment:
       "expensive" is in the eye of the beholder. How about "requires efficient, but relatively complex decoding"

##########
File path: faq.md
##########
@@ -24,32 +24,155 @@ limitations under the License.
 
 # Frequently Asked Questions
 
+## General
+
+### What *is* Arrow?
+
+Arrow is an open standard for how to represent columnar data in memory, along
+with libraries in many languages that implement that standard.  The Arrow format
+allows different programs and runtimes, perhaps written in different languages,
+to share data efficiently using a set of rich data types (included nested
+and user-defined data types).  The Arrow libraries make it easy to write such
+programs, by sparing the programmer from implementing low-level details of the
+Arrow format.
+
+Arrow additionally defines a streaming format and a file format for
+inter-process communication (IPC), based on the in-memory format.  It also
+defines a generic client-server RPC mechanism (Arrow Flight), based on the
+IPC format, and implemented on top of the gRPC framework.  <!-- TODO links -->
+
+### Why create a new standard?
+
+<!-- Fill this in -->
+
+## Project status
+
 ### How stable is the Arrow format? Is it safe to use in my application?
+<!-- Revise this -->
+
+The Arrow *in-memory format* is considered stable, and we intend to make only
+backwards-compatible changes, such as additional data types.  It is used by
+many applications already, and you can trust that compatibility will not be
+broken.
+
+The Arrow *file format* (based on the Arrow IPC mechanism) is not recommended
+for long-term disk persistence of data; that said, it is perfectly acceptable
+to write Arrow memory to disk for purposes of memory mapping and caching.
+
+We encourage people to start building Arrow-based in-memory computing
+applications now, and choose a suitable file format for disk storage
+if necessary. The Arrow libraries include adapters for several file formats,
+including Parquet, ORC, CSV, and JSON.
+
+### How stable are the Arrow libraries?
+
+We refer you to the [implementation matrix](https://github.com/apache/arrow/blob/master/docs/source/status.rst).
+
+## Getting started
+
+### Where can I get Arrow libraries?
+
+Arrow libraries for many languages are available through the usual package
+managers. See the [install]({{ site.baseurl }}/install/) page for specifics.
 
-The Arrow in-memory format is considered stable, and we intend to make only backwards-compatible changes, such as additional data types. We do not yet recommend the Arrow file format for long-term disk persistence of data; that said, it is perfectly acceptable to write Arrow memory to disk for purposes of memory mapping and caching.
+## Getting involved
 
-We encourage people to start building Arrow-based in-memory computing applications now, and choose a suitable file format for disk storage if necessary. The Arrow libraries include adapters for several file formats, including Parquet, ORC, CSV, and JSON.
+### I have some questions. How can I get help?
+
+The [Arrow mailing lists]({{ site.baseurl }}/community/) are the best place
+to ask questions. Don't be shy--we're here to help.
+
+### I tried to use Arrow and it didn't work. Can you fix it?
+
+Hopefully! Please make a detailed bug report--that's a valuable contribution
+to the project itself.
+See the [contribution guidelines]({{ site.baseurl }}/docs/developers/contributing.html)
+for how to make a report.
+
+### Arrow looks great and I'd totally use it if it only did X. When will it be done?
+
+We use [JIRA](https://issues.apache.org/jira/browse/ARROW) for our issue tracker.
+Search for an issue that matches your need. If you find one, feel free to
+comment on it and describe your use case--that will help whoever picks up
+the task. If you don't find one, make it.
+
+Ultimately, Arrow is software written by and for the community. If you don't
+see someone else in the community working on your issue, the best way to get
+it done is to pitch in yourself. We're more than willing to help you contribute
+successfully to the project.
+
+### How can I report a security vulnerability?
+
+Please send an email to [private@arrow.apache.org](mailto:private@arrow.apache.org).
+See the [security]({{ site.baseurl }}/security/) page for more.
+
+## Relation to other projects
 
 ### What is the difference between Apache Arrow and Apache Parquet?
+<!-- Revise this -->
+
+Parquet is a storage format designed for maximum space efficiency, using
+advanced compression and encoding techniques.  It is ideal when wanting to
+minimize disk usage while storing gigabytes of data, or perhaps more.
+This efficiency comes at the cost of relatively expensive reading into memory,
+as Parquet data cannot be directly operated on but must be decoded in
+large chunks.
+
+Conversely, Arrow is an in-memory format meant for direct and efficient use
+for computational purposes.  Arrow data is not compressed (or only lightly so,
+when using dictionary encoding) but laid out in natural format for the CPU,
+so that data can be accessed at arbitrary places at full speed.
+
+Therefore, Arrow and Parquet are not competitors: they complement each other
+and are commonly used together in applications.  Storing your data on disk
+using Parquet, and reading it into memory in the Arrow format, will allow
+you to make the most of your computing hardware.
 
-In short, Parquet files are designed for disk storage, while Arrow is designed for in-memory use, but you can put it on disk and then memory-map later. Arrow and Parquet are intended to be compatible with each other and used together in applications.
+### What about "Arrow files" then?
 
-Parquet is a columnar file format for data serialization. Reading a Parquet file requires decompressing and decoding its contents into some kind of in-memory data structure. It is designed to be space/IO-efficient at the expensive CPU utilization for decoding. It does not provide any data structures for in-memory computing. Parquet is a streaming format which must be decoded from start-to-end; while some "index page" facilities have been added to the storage format recently, random access operations are generally costly.
+Apache Arrow defines an inter-process communication (IPC) mechanism to
+transfer a collection of Arrow columnar arrays (called a "record batch").
+It can be used synchronously between processes using the Arrow "stream format",
+or asynchronously by first persisting data on storage using the Arrow "file format".
 
-Arrow on the other hand is first and foremost a library providing columnar data structures for *in-memory computing*. When you read a Parquet file, you can decompress and decode the data *into* Arrow columnar data structures so that you can then perform analytics in-memory on the decoded data. The Arrow columnar format has some nice properties: random access is O(1) and each value cell is next to the previous and following one in memory, so it's efficient to iterate over.
+The Arrow IPC mechanism is based on the Arrow in-memory format, such that
+there is no translation necessary between the on-disk representation and
+the in-memory representation.  Therefore, performing analytics on an Arrow
+IPC file can use memory-mapping and pay effectively zero cost.
 
-What about "Arrow files" then? Apache Arrow defines a binary "serialization" protocol for arranging a collection of Arrow columnar arrays (called a "record batch") that can be used for messaging and interprocess communication. You can put the protocol anywhere, including on disk, which can later be memory-mapped or read into memory and sent elsewhere.
+Some things to keep in mind when comparing the Arrow IPC file format and the
+Parquet format:
 
-This Arrow protocol is designed so that you can "map" a blob of Arrow data without doing any deserialization, so performing analytics on Arrow protocol data on disk can use memory-mapping and pay effectively zero cost. The protocol is used for many other things as well, such as streaming data between Spark SQL and Python for running pandas functions against chunks of Spark SQL data (these are called "pandas udfs").
+* Parquet is safe for long-term storage and archival purposes, meaning if
+  you write a file today, you can expect that any system that says they can
+  "read Parquet" will be able to read the file in 5 years or 10 years.
+  We are not yet making this assertion about long-term stability of the Arrow
+  format.

Review comment:
       "We are not yet making this assertion about long-term stability of the Arrow format."
   
   --> "While the Arrow on-disk format is stable and will be readable by future versions of the libraries, it is not intended for long-term archival storage."

##########
File path: faq.md
##########
@@ -24,32 +24,155 @@ limitations under the License.
 
 # Frequently Asked Questions
 
+## General
+
+### What *is* Arrow?
+
+Arrow is an open standard for how to represent columnar data in memory, along
+with libraries in many languages that implement that standard.  The Arrow format
+allows different programs and runtimes, perhaps written in different languages,
+to share data efficiently using a set of rich data types (included nested
+and user-defined data types).  The Arrow libraries make it easy to write such
+programs, by sparing the programmer from implementing low-level details of the
+Arrow format.
+
+Arrow additionally defines a streaming format and a file format for
+inter-process communication (IPC), based on the in-memory format.  It also
+defines a generic client-server RPC mechanism (Arrow Flight), based on the
+IPC format, and implemented on top of the gRPC framework.  <!-- TODO links -->
+
+### Why create a new standard?
+
+<!-- Fill this in -->
+
+## Project status
+
 ### How stable is the Arrow format? Is it safe to use in my application?
+<!-- Revise this -->
+
+The Arrow *in-memory format* is considered stable, and we intend to make only
+backwards-compatible changes, such as additional data types.  It is used by
+many applications already, and you can trust that compatibility will not be
+broken.
+
+The Arrow *file format* (based on the Arrow IPC mechanism) is not recommended
+for long-term disk persistence of data; that said, it is perfectly acceptable
+to write Arrow memory to disk for purposes of memory mapping and caching.
+
+We encourage people to start building Arrow-based in-memory computing
+applications now, and choose a suitable file format for disk storage
+if necessary. The Arrow libraries include adapters for several file formats,
+including Parquet, ORC, CSV, and JSON.
+
+### How stable are the Arrow libraries?
+
+We refer you to the [implementation matrix](https://github.com/apache/arrow/blob/master/docs/source/status.rst).
+
+## Getting started
+
+### Where can I get Arrow libraries?
+
+Arrow libraries for many languages are available through the usual package
+managers. See the [install]({{ site.baseurl }}/install/) page for specifics.
 
-The Arrow in-memory format is considered stable, and we intend to make only backwards-compatible changes, such as additional data types. We do not yet recommend the Arrow file format for long-term disk persistence of data; that said, it is perfectly acceptable to write Arrow memory to disk for purposes of memory mapping and caching.
+## Getting involved
 
-We encourage people to start building Arrow-based in-memory computing applications now, and choose a suitable file format for disk storage if necessary. The Arrow libraries include adapters for several file formats, including Parquet, ORC, CSV, and JSON.
+### I have some questions. How can I get help?
+
+The [Arrow mailing lists]({{ site.baseurl }}/community/) are the best place
+to ask questions. Don't be shy--we're here to help.
+
+### I tried to use Arrow and it didn't work. Can you fix it?
+
+Hopefully! Please make a detailed bug report--that's a valuable contribution
+to the project itself.
+See the [contribution guidelines]({{ site.baseurl }}/docs/developers/contributing.html)
+for how to make a report.
+
+### Arrow looks great and I'd totally use it if it only did X. When will it be done?
+
+We use [JIRA](https://issues.apache.org/jira/browse/ARROW) for our issue tracker.
+Search for an issue that matches your need. If you find one, feel free to
+comment on it and describe your use case--that will help whoever picks up
+the task. If you don't find one, make it.
+
+Ultimately, Arrow is software written by and for the community. If you don't
+see someone else in the community working on your issue, the best way to get
+it done is to pitch in yourself. We're more than willing to help you contribute
+successfully to the project.
+
+### How can I report a security vulnerability?
+
+Please send an email to [private@arrow.apache.org](mailto:private@arrow.apache.org).
+See the [security]({{ site.baseurl }}/security/) page for more.
+
+## Relation to other projects
 
 ### What is the difference between Apache Arrow and Apache Parquet?
+<!-- Revise this -->
+
+Parquet is a storage format designed for maximum space efficiency, using
+advanced compression and encoding techniques.  It is ideal when you want to
+minimize disk usage while storing gigabytes of data, or perhaps more.
+This efficiency comes at the cost of relatively expensive reading into memory,
+as Parquet data cannot be directly operated on but must be decoded in
+large chunks.
+
+Conversely, Arrow is an in-memory format meant for direct and efficient use
+for computational purposes.  Arrow data is not compressed (or only lightly so,
+when using dictionary encoding) but laid out in a natural format for the CPU,
+so that data can be accessed at arbitrary places at full speed.
+
+Therefore, Arrow and Parquet are not competitors: they complement each other
+and are commonly used together in applications.  Storing your data on disk
+using Parquet, and reading it into memory in the Arrow format, will allow
+you to make the most of your computing hardware.
 
-In short, Parquet files are designed for disk storage, while Arrow is designed for in-memory use, but you can put it on disk and then memory-map later. Arrow and Parquet are intended to be compatible with each other and used together in applications.
+### What about "Arrow files" then?
 
-Parquet is a columnar file format for data serialization. Reading a Parquet file requires decompressing and decoding its contents into some kind of in-memory data structure. It is designed to be space/IO-efficient at the expensive CPU utilization for decoding. It does not provide any data structures for in-memory computing. Parquet is a streaming format which must be decoded from start-to-end; while some "index page" facilities have been added to the storage format recently, random access operations are generally costly.
+Apache Arrow defines an inter-process communication (IPC) mechanism to
+transfer a collection of Arrow columnar arrays (called a "record batch").
+It can be used synchronously between processes using the Arrow "stream format",
+or asynchronously by first persisting data on storage using the Arrow "file format".
 
-Arrow on the other hand is first and foremost a library providing columnar data structures for *in-memory computing*. When you read a Parquet file, you can decompress and decode the data *into* Arrow columnar data structures so that you can then perform analytics in-memory on the decoded data. The Arrow columnar format has some nice properties: random access is O(1) and each value cell is next to the previous and following one in memory, so it's efficient to iterate over.
+The Arrow IPC mechanism is based on the Arrow in-memory format, such that
+there is no translation necessary between the on-disk representation and
+the in-memory representation.  Therefore, performing analytics on an Arrow
+IPC file can use memory-mapping and pay effectively zero cost.
 
-What about "Arrow files" then? Apache Arrow defines a binary "serialization" protocol for arranging a collection of Arrow columnar arrays (called a "record batch") that can be used for messaging and interprocess communication. You can put the protocol anywhere, including on disk, which can later be memory-mapped or read into memory and sent elsewhere.
+Some things to keep in mind when comparing the Arrow IPC file format and the
+Parquet format:
 
-This Arrow protocol is designed so that you can "map" a blob of Arrow data without doing any deserialization, so performing analytics on Arrow protocol data on disk can use memory-mapping and pay effectively zero cost. The protocol is used for many other things as well, such as streaming data between Spark SQL and Python for running pandas functions against chunks of Spark SQL data (these are called "pandas udfs").
+* Parquet is safe for long-term storage and archival purposes, meaning if
+  you write a file today, you can expect that any system that says they can
+  "read Parquet" will be able to read the file in 5 years or 10 years.
+  We are not yet making this assertion about long-term stability of the Arrow
+  format.
 
-In some applications, Parquet and Arrow can be used interchangeably for on-disk data serialization. Some things to keep in mind:
+* Reading Parquet files generally requires expensive decoding, while reading
+  Arrow IPC files is just a matter of transferring raw bytes from the storage
+  hardware.

Review comment:
       Instead of "just a matter of transferring raw bytes from the storage hardware." how about the more precise statement "reading Arrow IPC files does not involve any decoding because the on-disk representation is the same as the in-memory representation."

##########
File path: faq.md
##########
@@ -24,32 +24,155 @@ limitations under the License.
 
 # Frequently Asked Questions
 
+## General
+
+### What *is* Arrow?
+
+Arrow is an open standard for how to represent columnar data in memory, along
+with libraries in many languages that implement that standard.  The Arrow format
+allows different programs and runtimes, perhaps written in different languages,
+to share data efficiently using a set of rich data types (including nested
+and user-defined data types).  The Arrow libraries make it easy to write such
+programs, by sparing the programmer from implementing low-level details of the
+Arrow format.
+
+Arrow additionally defines a streaming format and a file format for
+inter-process communication (IPC), based on the in-memory format.  It also
+defines a generic client-server RPC mechanism (Arrow Flight), based on the
+IPC format, and implemented on top of the gRPC framework.  <!-- TODO links -->

Review comment:
       perhaps merge this with some of the thoughts above

##########
File path: faq.md
##########
@@ -24,32 +24,155 @@ limitations under the License.
 
 
-* Parquet is intended for "archival" purposes, meaning if you write a file today, we expect that any system that says they can "read Parquet" will be able to read the file in 5 years or 7 years. We are not yet making this assertion about long-term stability of the Arrow format.
-* Parquet is generally a lot more expensive to read because it must be decoded into some other data structure. Arrow protocol data can simply be memory-mapped.
-* Parquet files are often much smaller than Arrow-protocol-on-disk because of the data encoding schemes that Parquet uses. If your disk storage or network is slow, Parquet may be a better choice.
+* Parquet files are often much smaller than Arrow IPC files because of the
+  elaborate encoding schemes that Parquet uses. If your disk storage or network
+  is slow, Parquet may be a better choice even for short-term storage or caching.
+
+### What about the "Feather" file format?
+
+The Feather v1 format started as a separate specification, but the Feather v2
+format is just another, easier-to-remember name for the Arrow IPC file format.
 
 ### How does Arrow relate to Flatbuffers?
 
-Flatbuffers is a domain-agnostic low-level building block for binary data formats. It cannot be used directly for data analysis tasks without a lot of manual scaffolding. Arrow is a data layer aimed directly at the needs of data analysis, providing elaborate data types (including extensible logical types), built-in support for "null" values (a.k.a "N/A"), and an expanding toolbox of I/O and computing facilities.
+Flatbuffers is a low-level building block for binary data serialization.
+It is not adapted to the representation of large, structured, homogeneous
+data, and does not sit at the right abstraction layer for data analysis tasks.
+
+Arrow is a data layer aimed directly at the needs of data analysis, providing
+elaborate data types (including extensible logical types), built-in support

Review comment:
       Use a more neutral word than "elaborate". How about, "providing a comprehensive collection of data types required for analytics" or something similar

##########
File path: index.html
##########
@@ -1,72 +1,62 @@
 ---
-layout: default
+layout: home
 ---
-<div class="jumbotron">
-    <h1>Apache Arrow</h1>
-    <p class="lead">A cross-language development platform for in-memory data</p>
-    <p>
-        <a class="btn btn-lg btn-success" style="white-space: normal;" href="mailto:dev-subscribe@arrow.apache.org" role="button">Join Mailing List</a>
-        <a class="btn btn-lg btn-primary" style="white-space: normal;" href="{{ site.baseurl }}/install/" role="button">Install ({{site.data.versions['current'].number}} Release - {{site.data.versions['current'].date}})</a>
-    </p>
-</div>
-<h5>
-  Interested in contributing?
-  <small class="text-muted">Join the <a href="http://mail-archives.apache.org/mod_mbox/arrow-dev/"><strong>mailing list</strong></a> or check out the <a href="https://cwiki.apache.org/confluence/display/ARROW"><strong>developer wiki</strong></a>.</small>
-</h5>
-<h5>
-  <a href="{{ site.baseurl }}/blog/"><strong>See Latest News</strong></a>
-</h5>
-<p>
-  {{ site.description }}
-</p>
-<hr />
+<h1>What is Arrow?</h1>
 <div class="row">
   <div class="col-lg-4">
-      <h2 class="mt-3">Fast</h2>
-      <p>Apache Arrow&#8482; enables execution engines to take advantage of the latest SIMD (Single instruction, multiple data) operations included in modern processors, for native vectorized optimization of analytical data processing. Columnar layout is optimized for data locality for better performance on modern hardware like CPUs and GPUs.</p>
-      <p>The Arrow memory format supports <strong>zero-copy reads</strong> for lightning-fast data access without serialization overhead.</p>
+      <h2 class="mt-3">Format</h2>
+      <p>Apache Arrow defines a language-independent columnar memory format for flat and hierarchical data, organized for efficient analytic operations on modern hardware like CPUs and GPUs. The Arrow memory format also supports <strong>zero-copy reads</strong> for lightning-fast data access without serialization overhead.</p>
+      <p><a href="{{ site.baseurl }}/overview/">Learn more</a> about the design or
+        <a href="{{ site.baseurl }}/docs/format/Columnar.html">read the specification</a>.</p>
   </div>
   <div class="col-lg-4">
-      <h2 class="mt-3">Flexible</h2>
-      <p>Arrow acts as a new high-performance interface between various systems. It is also focused on supporting a wide variety of industry-standard programming languages. C, C++, C#, Go, Java, JavaScript, MATLAB, Python, R, Ruby, and Rust implementations are in progress and more languages are welcome.
+      <h2 class="mt-3">Libraries</h2>
+      <p>The Arrow project includes libraries that implement the memory specification in many languages. They enable you to use the Arrow format as an efficient means of sharing data across languages and processes. Libraries are available for <a href="{{ site.baseurl }}/docs/c_glib/">C</a>, <a href="{{ site.baseurl }}/docs/cpp/">C++</a>, <a href="https://github.com/apache/arrow/blob/master/csharp/README.md">C#</a>, <a href="https://godoc.org/github.com/apache/arrow/go/arrow">Go</a>, <a href="{{ site.baseurl }}/docs/java/">Java</a>, <a href="{{ site.baseurl }}/docs/js/">JavaScript</a>, <a href="https://github.com/apache/arrow/blob/master/matlab/README.md">MATLAB</a>, <a href="{{ site.baseurl }}/docs/python/">Python</a>, <a href="{{ site.baseurl }}/docs/r/">R</a>, <a href="https://github.com/apache/arrow/blob/master/ruby/README.md">Ruby</a>, and <a href="https://docs.rs/crate/arrow/">Rust</a>.

Review comment:
       Arrow's libraries provide building blocks for creating high-performance analytics applications. The libraries implement the Arrow columnar format and address a wide spectrum of problems related to data access, in-memory data management, and analytical query processing.

##########
File path: index.html
##########
@@ -1,72 +1,62 @@
   <div class="col-lg-4">
-      <h2 class="mt-3">Flexible</h2>
-      <p>Arrow acts as a new high-performance interface between various systems. It is also focused on supporting a wide variety of industry-standard programming languages. C, C++, C#, Go, Java, JavaScript, MATLAB, Python, R, Ruby, and Rust implementations are in progress and more languages are welcome.
+      <h2 class="mt-3">Libraries</h2>
+      <p>The Arrow project includes libraries that implement the memory specification in many languages. They enable you to use the Arrow format as an efficient means of sharing data across languages and processes. Libraries are available for <a href="{{ site.baseurl }}/docs/c_glib/">C</a>, <a href="{{ site.baseurl }}/docs/cpp/">C++</a>, <a href="https://github.com/apache/arrow/blob/master/csharp/README.md">C#</a>, <a href="https://godoc.org/github.com/apache/arrow/go/arrow">Go</a>, <a href="{{ site.baseurl }}/docs/java/">Java</a>, <a href="{{ site.baseurl }}/docs/js/">JavaScript</a>, <a href="https://github.com/apache/arrow/blob/master/matlab/README.md">MATLAB</a>, <a href="{{ site.baseurl }}/docs/python/">Python</a>, <a href="{{ site.baseurl }}/docs/r/">R</a>, <a href="https://github.com/apache/arrow/blob/master/ruby/README.md">Ruby</a>, and <a href="https://docs.rs/crate/arrow/">Rust</a>.
       </p>
+      See <a href="{{ site.baseurl }}/install/">how to install</a> and <a href="{{ site.baseurl }}/getting_started/">get started</a>.
   </div>
   <div class="col-lg-4">
-      <h2 class="mt-3">Standard</h2>
-      <p>Apache Arrow is backed by key developers of 13 major open source projects, including Calcite, Cassandra, Drill, Hadoop, HBase, Ibis, Impala, Kudu, Pandas, Parquet, Phoenix, Spark, and Storm making it the de-facto standard for columnar in-memory analytics.</p>
-      <p>Learn more about projects that are <a href="{{ site.baseurl }}/powered_by/">Powered By Apache Arrow</a></p>
+      <h2 class="mt-3">Applications</h2>
+      <p>Arrow libraries provide a foundation for developers to build fast analytics applications. <a href="{{ site.baseurl }}/powered_by/">Many popular projects</a> use Arrow to ship columnar data efficiently or as the basis for analytic engines.
+      <p>The libraries also include built-in features for working with data directly, including Parquet file reading and querying large datasets. See more Arrow <a href="{{ site.baseurl }}/use_cases/">use cases</a>.</p>
   </div>
 </div>
-<hr />
+
+<h1>Why Arrow?</h1>

Review comment:
       "Why use the Arrow Columnar Format?" 

##########
File path: index.html
##########
@@ -1,72 +1,62 @@
   <div class="col-lg-4">
-      <h2 class="mt-3">Standard</h2>
-      <p>Apache Arrow is backed by key developers of 13 major open source projects, including Calcite, Cassandra, Drill, Hadoop, HBase, Ibis, Impala, Kudu, Pandas, Parquet, Phoenix, Spark, and Storm making it the de-facto standard for columnar in-memory analytics.</p>
-      <p>Learn more about projects that are <a href="{{ site.baseurl }}/powered_by/">Powered By Apache Arrow</a></p>
+      <h2 class="mt-3">Applications</h2>

Review comment:
       Ecosystem?

##########
File path: faq.md
##########
@@ -24,32 +24,155 @@ limitations under the License.
 
 # Frequently Asked Questions
 
+## General
+
+### What *is* Arrow?
+
+Arrow is an open standard for how to represent columnar data in memory, along
+with libraries in many languages that implement that standard.  The Arrow format
+allows different programs and runtimes, perhaps written in different languages,
+to share data efficiently using a set of rich data types (including nested
+and user-defined data types).  The Arrow libraries make it easy to write such
+programs, by sparing the programmer from implementing low-level details of the
+Arrow format.
+
+Arrow additionally defines a streaming format and a file format for
+inter-process communication (IPC), based on the in-memory format.  It also
+defines a generic client-server RPC mechanism (Arrow Flight), based on the
+IPC format, and implemented on top of the gRPC framework.  <!-- TODO links -->
+
+### Why create a new standard?
+
+<!-- Fill this in -->
+
+## Project status
+
 ### How stable is the Arrow format? Is it safe to use in my application?
+<!-- Revise this -->
+
+The Arrow *in-memory format* is considered stable, and we intend to make only
+backwards-compatible changes, such as additional data types.  It is used by
+many applications already, and you can trust that compatibility will not be
+broken.
+
+The Arrow *file format* (based on the Arrow IPC mechanism) is not recommended
+for long-term disk persistence of data; that said, it is perfectly acceptable
+to write Arrow memory to disk for purposes of memory mapping and caching.
+
+We encourage people to start building Arrow-based in-memory computing
+applications now, and choose a suitable file format for disk storage
+if necessary. The Arrow libraries include adapters for several file formats,
+including Parquet, ORC, CSV, and JSON.
+
+### How stable are the Arrow libraries?
+
+We refer you to the [implementation matrix](https://github.com/apache/arrow/blob/master/docs/source/status.rst).
+
+## Getting started
+
+### Where can I get Arrow libraries?
+
+Arrow libraries for many languages are available through the usual package
+managers. See the [install]({{ site.baseurl }}/install/) page for specifics.
 
-The Arrow in-memory format is considered stable, and we intend to make only backwards-compatible changes, such as additional data types. We do not yet recommend the Arrow file format for long-term disk persistence of data; that said, it is perfectly acceptable to write Arrow memory to disk for purposes of memory mapping and caching.
+## Getting involved
 
-We encourage people to start building Arrow-based in-memory computing applications now, and choose a suitable file format for disk storage if necessary. The Arrow libraries include adapters for several file formats, including Parquet, ORC, CSV, and JSON.
+### I have some questions. How can I get help?
+
+The [Arrow mailing lists]({{ site.baseurl }}/community/) are the best place
+to ask questions. Don't be shy--we're here to help.
+
+### I tried to use Arrow and it didn't work. Can you fix it?
+
+Hopefully! Please make a detailed bug report--that's a valuable contribution
+to the project itself.
+See the [contribution guidelines]({{ site.baseurl }}/docs/developers/contributing.html)
+for how to make a report.
+
+### Arrow looks great and I'd totally use it if it only did X. When will it be done?
+
+We use [JIRA](https://issues.apache.org/jira/browse/ARROW) for our issue tracker.
+Search for an issue that matches your need. If you find one, feel free to
+comment on it and describe your use case--that will help whoever picks up
+the task. If you don't find one, make it.
+
+Ultimately, Arrow is software written by and for the community. If you don't
+see someone else in the community working on your issue, the best way to get
+it done is to pitch in yourself. We're more than willing to help you contribute
+successfully to the project.
+
+### How can I report a security vulnerability?
+
+Please send an email to [private@arrow.apache.org](mailto:private@arrow.apache.org).
+See the [security]({{ site.baseurl }}/security/) page for more.
+
+## Relation to other projects
 
 ### What is the difference between Apache Arrow and Apache Parquet?
+<!-- Revise this -->
+
+Parquet is a storage format designed for maximum space efficiency, using
+advanced compression and encoding techniques.  It is ideal when you want to
+minimize disk usage while storing gigabytes of data, or perhaps more.
+This efficiency comes at the cost of relatively expensive reading into memory,
+as Parquet data cannot be directly operated on but must be decoded in
+large chunks.
+
+Conversely, Arrow is an in-memory format meant for direct and efficient use
+for computational purposes.  Arrow data is not compressed (or only lightly so,
+when using dictionary encoding) but laid out in a natural format for the CPU,
+so that data can be accessed at arbitrary places at full speed.
+
+Therefore, Arrow and Parquet are not competitors: they complement each other
+and are commonly used together in applications.  Storing your data on disk
+using Parquet, and reading it into memory in the Arrow format, will allow
+you to make the most of your computing hardware.
 
-In short, Parquet files are designed for disk storage, while Arrow is designed for in-memory use, but you can put it on disk and then memory-map later. Arrow and Parquet are intended to be compatible with each other and used together in applications.
+### What about "Arrow files" then?
 
-Parquet is a columnar file format for data serialization. Reading a Parquet file requires decompressing and decoding its contents into some kind of in-memory data structure. It is designed to be space/IO-efficient at the expensive CPU utilization for decoding. It does not provide any data structures for in-memory computing. Parquet is a streaming format which must be decoded from start-to-end; while some "index page" facilities have been added to the storage format recently, random access operations are generally costly.
+Apache Arrow defines an inter-process communication (IPC) mechanism to
+transfer a collection of Arrow columnar arrays (called a "record batch").
+It can be used synchronously between processes using the Arrow "stream format",
+or asynchronously by first persisting data on storage using the Arrow "file format".
 
-Arrow on the other hand is first and foremost a library providing columnar data structures for *in-memory computing*. When you read a Parquet file, you can decompress and decode the data *into* Arrow columnar data structures so that you can then perform analytics in-memory on the decoded data. The Arrow columnar format has some nice properties: random access is O(1) and each value cell is next to the previous and following one in memory, so it's efficient to iterate over.
+The Arrow IPC mechanism is based on the Arrow in-memory format, such that
+there is no translation necessary between the on-disk representation and
+the in-memory representation.  Therefore, performing analytics on an Arrow
+IPC file can use memory-mapping and pay effectively zero cost.

Review comment:
       +1
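To make the zero-copy point in the hunk above concrete, here is a toy illustration using only the Python standard library: write a flat buffer of int64 values to disk, memory-map it, and reinterpret the bytes in place with no decoding step. This sketches only the general mechanism, not the actual Arrow IPC format (real Arrow files also carry Flatbuffers-encoded schemas, record batch metadata, and validity bitmaps):

```python
import array
import mmap
import os
import tempfile

# Build a flat buffer of int64 values -- a stand-in for one Arrow column.
values = array.array("q", range(10))

with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(values.tobytes())
    path = f.name

# Memory-map the file and view the raw bytes as int64s in place.
# No parsing or decoding happens: the OS pages data in on demand,
# which is the essence of zero-copy access to a columnar file.
with open(path, "rb") as f:
    mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    view = memoryview(mm).cast("q")
    third = view[3]        # O(1) random access into the "column"
    total = sum(view)      # sequential scan straight off the mapping
    view.release()
    mm.close()

os.unlink(path)
```

Reading a Parquet file, by contrast, would require decompressing and decoding every page before any such access is possible.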

##########
File path: faq.md
##########
@@ -24,32 +24,155 @@ limitations under the License.
 
 # Frequently Asked Questions
 
+## General
+
+### What *is* Arrow?
+
+Arrow is an open standard for how to represent columnar data in memory, along
+with libraries in many languages that implement that standard.  The Arrow format
+allows different programs and runtimes, perhaps written in different languages,
+to share data efficiently using a set of rich data types (including nested
+and user-defined data types).  The Arrow libraries make it easy to write such
+programs, by sparing the programmer from implementing low-level details of the
+Arrow format.
+
+Arrow additionally defines a streaming format and a file format for
+inter-process communication (IPC), based on the in-memory format.  It also
+defines a generic client-server RPC mechanism (Arrow Flight), based on the
+IPC format, and implemented on top of the gRPC framework.  <!-- TODO links -->
+
+### Why create a new standard?
+
+<!-- Fill this in -->

Review comment:
      Traditionally, data processing engine developers have created custom data structures to represent datasets in memory while they are being processed. Given the "custom" nature of these data structures, they must also develop serialization interfaces to convert between these data structures and different file formats, network wire protocols, database clients, and other data transport interfaces. The net result is an incredible amount of waste, both in developer time and in CPU cycles spent serializing data from one format to another.
   
   Therefore, the rationale for Arrow's in-memory columnar data format is to provide an out-of-the-box solution to several interrelated problems:
   
    * A general-purpose tabular data representation that is highly efficient to process on modern hardware while also being suitable for a wide spectrum of use cases. We believe that fewer and fewer systems will create their own data structures, and that most will simply use Arrow.
   * Supports both random access and streaming / scan-based workloads.
   * A standardized memory format facilitates reuse of libraries of algorithms. When custom in-memory data formats are used, common algorithms must often be rewritten to target those custom data formats.
    * Systems that use or support Arrow can transfer data between one another at little-to-no cost. This results in a radical reduction in serialization overhead in analytical workloads, which can often represent 80-90% of computing costs. 
   * The language-agnostic design of the Arrow format enables systems written in different programming languages (even running on the JVM) to communicate datasets without serialization overhead. For example, a Java application can call a C or C++ algorithm on data that originated in the JVM.  
   
   ... probably some other stuff can be added here
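The serialization-overhead point above can be sketched with a toy standard-library example contrasting a copy-based hand-off (every byte serialized, transmitted, and decoded) with a buffer-based hand-off (the consumer wraps the producer's memory directly). This is illustrative only, not Arrow's implementation:

```python
import array
import pickle

# A "columnar" batch in miniature: one flat buffer per column.
col = array.array("d", [float(i) for i in range(1000)])

# Copy-based hand-off: serialize, transmit, deserialize. Every byte is
# touched and allocated twice before the consumer can compute anything.
blob = pickle.dumps(col)
decoded = pickle.loads(blob)

# Buffer-based hand-off: the consumer wraps the producer's memory
# directly; no bytes are copied or decoded.
shared = memoryview(col)
```

With a standardized memory format like Arrow's, the second style of hand-off works across process and language boundaries, not just within one Python process.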

##########
File path: faq.md
##########
@@ -24,32 +24,155 @@ limitations under the License.
 
 # Frequently Asked Questions
 
+## General
+
+### What *is* Arrow?

Review comment:
       "Apache Arrow"

##########
File path: faq.md
##########
@@ -24,32 +24,155 @@ limitations under the License.
 
 # Frequently Asked Questions
 
+## General
+
+### What *is* Arrow?
+
+Arrow is an open standard for how to represent columnar data in memory, along
+with libraries in many languages that implement that standard.  The Arrow format
+allows different programs and runtimes, perhaps written in different languages,
+to share data efficiently using a set of rich data types (including nested
+and user-defined data types).  The Arrow libraries make it easy to write such
+programs, by sparing the programmer from implementing low-level details of the
+Arrow format.
+
+Arrow additionally defines a streaming format and a file format for
+inter-process communication (IPC), based on the in-memory format.  It also
+defines a generic client-server RPC mechanism (Arrow Flight), based on the
+IPC format, and implemented on top of the gRPC framework.  <!-- TODO links -->
+
+### Why create a new standard?
+
+<!-- Fill this in -->
+
+## Project status
+
 ### How stable is the Arrow format? Is it safe to use in my application?
+<!-- Revise this -->
+
+The Arrow *in-memory format* is considered stable, and we intend to make only
+backwards-compatible changes, such as additional data types.  It is used by
+many applications already, and you can trust that compatibility will not be
+broken.
+
+The Arrow *file format* (based on the Arrow IPC mechanism) is not recommended
+for long-term disk persistence of data; that said, it is perfectly acceptable
+to write Arrow memory to disk for purposes of memory mapping and caching.
+
+We encourage people to start building Arrow-based in-memory computing
+applications now, and choose a suitable file format for disk storage
+if necessary. The Arrow libraries include adapters for several file formats,
+including Parquet, ORC, CSV, and JSON.
+
+### How stable are the Arrow libraries?
+
+We refer you to the [implementation matrix](https://github.com/apache/arrow/blob/master/docs/source/status.rst).
+
+## Getting started
+
+### Where can I get Arrow libraries?
+
+Arrow libraries for many languages are available through the usual package
+managers. See the [install]({{ site.baseurl }}/install/) page for specifics.
 
-The Arrow in-memory format is considered stable, and we intend to make only backwards-compatible changes, such as additional data types. We do not yet recommend the Arrow file format for long-term disk persistence of data; that said, it is perfectly acceptable to write Arrow memory to disk for purposes of memory mapping and caching.
+## Getting involved
 
-We encourage people to start building Arrow-based in-memory computing applications now, and choose a suitable file format for disk storage if necessary. The Arrow libraries include adapters for several file formats, including Parquet, ORC, CSV, and JSON.
+### I have some questions. How can I get help?
+
+The [Arrow mailing lists]({{ site.baseurl }}/community/) are the best place
+to ask questions. Don't be shy--we're here to help.
+
+### I tried to use Arrow and it didn't work. Can you fix it?
+
+Hopefully! Please make a detailed bug report--that's a valuable contribution
+to the project itself.
+See the [contribution guidelines]({{ site.baseurl }}/docs/developers/contributing.html)
+for how to make a report.
+
+### Arrow looks great and I'd totally use it if it only did X. When will it be done?
+
+We use [JIRA](https://issues.apache.org/jira/browse/ARROW) for our issue tracker.
+Search for an issue that matches your need. If you find one, feel free to
+comment on it and describe your use case--that will help whoever picks up
+the task. If you don't find one, make it.
+
+Ultimately, Arrow is software written by and for the community. If you don't
+see someone else in the community working on your issue, the best way to get
+it done is to pitch in yourself. We're more than willing to help you contribute
+successfully to the project.
+
+### How can I report a security vulnerability?
+
+Please send an email to [private@arrow.apache.org](mailto:private@arrow.apache.org).
+See the [security]({{ site.baseurl }}/security/) page for more.
+
+## Relation to other projects
 
 ### What is the difference between Apache Arrow and Apache Parquet?
+<!-- Revise this -->
+
+Parquet is a storage format designed for maximum space efficiency, using
+advanced compression and encoding techniques.  It is ideal when you want to
+minimize disk usage while storing gigabytes of data, or perhaps more.
+This efficiency comes at the cost of relatively expensive reading into memory,
+as Parquet data cannot be directly operated on but must be decoded in
+large chunks.
+
+Conversely, Arrow is an in-memory format meant for direct and efficient use
+for computational purposes.  Arrow data is not compressed (or only lightly so,
+when using dictionary encoding) but laid out in a natural format for the CPU,
+so that data can be accessed at arbitrary places at full speed.
+
+Therefore, Arrow and Parquet are not competitors: they complement each other
+and are commonly used together in applications.  Storing your data on disk
+using Parquet, and reading it into memory in the Arrow format, will allow
+you to make the most of your computing hardware.
 
-In short, Parquet files are designed for disk storage, while Arrow is designed for in-memory use, but you can put it on disk and then memory-map later. Arrow and Parquet are intended to be compatible with each other and used together in applications.
+### What about "Arrow files" then?
 
-Parquet is a columnar file format for data serialization. Reading a Parquet file requires decompressing and decoding its contents into some kind of in-memory data structure. It is designed to be space/IO-efficient at the expensive CPU utilization for decoding. It does not provide any data structures for in-memory computing. Parquet is a streaming format which must be decoded from start-to-end; while some "index page" facilities have been added to the storage format recently, random access operations are generally costly.
+Apache Arrow defines an inter-process communication (IPC) mechanism to
+transfer a collection of Arrow columnar arrays (called a "record batch").
+It can be used synchronously between processes using the Arrow "stream format",
+or asynchronously by first persisting data on storage using the Arrow "file format".
 
-Arrow on the other hand is first and foremost a library providing columnar data structures for *in-memory computing*. When you read a Parquet file, you can decompress and decode the data *into* Arrow columnar data structures so that you can then perform analytics in-memory on the decoded data. The Arrow columnar format has some nice properties: random access is O(1) and each value cell is next to the previous and following one in memory, so it's efficient to iterate over.
+The Arrow IPC mechanism is based on the Arrow in-memory format, such that
+there is no translation necessary between the on-disk representation and
+the in-memory representation.  Therefore, performing analytics on an Arrow
+IPC file can use memory-mapping and pay effectively zero cost.
 
-What about "Arrow files" then? Apache Arrow defines a binary "serialization" protocol for arranging a collection of Arrow columnar arrays (called a "record batch") that can be used for messaging and interprocess communication. You can put the protocol anywhere, including on disk, which can later be memory-mapped or read into memory and sent elsewhere.
+Some things to keep in mind when comparing the Arrow IPC file format and the
+Parquet format:
 
-This Arrow protocol is designed so that you can "map" a blob of Arrow data without doing any deserialization, so performing analytics on Arrow protocol data on disk can use memory-mapping and pay effectively zero cost. The protocol is used for many other things as well, such as streaming data between Spark SQL and Python for running pandas functions against chunks of Spark SQL data (these are called "pandas udfs").
+* Parquet is safe for long-term storage and archival purposes, meaning if
+  you write a file today, you can expect that any system that says it can
+  "read Parquet" will be able to read the file in 5 or 10 years.
+  We are not yet making this assertion about long-term stability of the Arrow
+  format.
 
-In some applications, Parquet and Arrow can be used interchangeably for on-disk data serialization. Some things to keep in mind:
+* Reading Parquet files generally requires expensive decoding, while reading
+  Arrow IPC files is just a matter of transferring raw bytes from the storage
+  hardware.
 
-* Parquet is intended for "archival" purposes, meaning if you write a file today, we expect that any system that says they can "read Parquet" will be able to read the file in 5 years or 7 years. We are not yet making this assertion about long-term stability of the Arrow format.
-* Parquet is generally a lot more expensive to read because it must be decoded into some other data structure. Arrow protocol data can simply be memory-mapped.
-* Parquet files are often much smaller than Arrow-protocol-on-disk because of the data encoding schemes that Parquet uses. If your disk storage or network is slow, Parquet may be a better choice.
+* Parquet files are often much smaller than Arrow IPC files because of the
+  elaborate encoding schemes that Parquet uses. If your disk storage or network
+  is slow, Parquet may be a better choice even for short-term storage or caching.
+
+### What about the "Feather" file format?
+
+The Feather v1 format started as a separate specification, but the Feather v2
+format is just another, easier-to-remember name for the Arrow IPC file format.
 
 ### How does Arrow relate to Flatbuffers?
 
-Flatbuffers is a domain-agnostic low-level building block for binary data formats. It cannot be used directly for data analysis tasks without a lot of manual scaffolding. Arrow is a data layer aimed directly at the needs of data analysis, providing elaborate data types (including extensible logical types), built-in support for "null" values (a.k.a "N/A"), and an expanding toolbox of I/O and computing facilities.
+Flatbuffers is a low-level building block for binary data serialization.
+It is not adapted to the representation of large, structured, homogeneous
+data, and does not sit at the right abstraction layer for data analysis tasks.
+
+Arrow is a data layer aimed directly at the needs of data analysis, providing
+elaborate data types (including extensible logical types), built-in support
+for "null" values (representing missing data), and an expanding toolbox of I/O
+and computing facilities.
 
-The Arrow file format does use Flatbuffers under the hood to facilitate low-level metadata serialization. However, Arrow data has much richer semantics than Flatbuffers data.
+The Arrow file format does use Flatbuffers under the hood to facilitate low-level
+metadata serialization, but the Arrow data format uses its own representation

Review comment:
       maybe "to serialize schemas and other metadata needed to implement the Arrow binary IPC protocol"
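As a side note on the null support mentioned in this hunk: the Arrow columnar format represents nulls with a validity bitmap, one bit per slot in least-significant-bit order, where 1 means the slot holds a valid value. A small standard-library sketch of decoding such a bitmap:

```python
def is_valid(bitmap: bytes, i: int) -> bool:
    """Check slot i of an Arrow-style validity bitmap (LSB bit numbering)."""
    return (bitmap[i // 8] >> (i % 8)) & 1 == 1

# A 5-slot column where slots 1 and 3 are null: bits 0, 2, 4 set -> 0b10101.
bitmap = bytes([0b10101])
validity = [is_valid(bitmap, i) for i in range(5)]
```

Flatbuffers, sitting at a lower abstraction layer, leaves this kind of missing-data semantics entirely to the application.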

##########
File path: getting_started.md
##########
@@ -0,0 +1,74 @@
+---
+layout: default
+title: Getting started
+description: Links to user guides to help you start using Arrow
+---
+<!--
+{% comment %}
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements.  See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to you under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License.  You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+{% endcomment %}
+-->
+
+# Getting started
+
+This page collects resources and guides for using Arrow in all of the project's languages.
+For reference on official release packages, see the
+[install page]({{ site.baseurl }}/install/).
+
+## C
+
+Glib

Review comment:
       TODO

##########
File path: index.html
##########
@@ -1,72 +1,62 @@
 ---
-layout: default
+layout: home
 ---
-<div class="jumbotron">
-    <h1>Apache Arrow</h1>
-    <p class="lead">A cross-language development platform for in-memory data</p>
-    <p>
-        <a class="btn btn-lg btn-success" style="white-space: normal;" href="mailto:dev-subscribe@arrow.apache.org" role="button">Join Mailing List</a>
-        <a class="btn btn-lg btn-primary" style="white-space: normal;" href="{{ site.baseurl }}/install/" role="button">Install ({{site.data.versions['current'].number}} Release - {{site.data.versions['current'].date}})</a>
-    </p>
-</div>
-<h5>
-  Interested in contributing?
-  <small class="text-muted">Join the <a href="http://mail-archives.apache.org/mod_mbox/arrow-dev/"><strong>mailing list</strong></a> or check out the <a href="https://cwiki.apache.org/confluence/display/ARROW"><strong>developer wiki</strong></a>.</small>
-</h5>
-<h5>
-  <a href="{{ site.baseurl }}/blog/"><strong>See Latest News</strong></a>
-</h5>
-<p>
-  {{ site.description }}
-</p>
-<hr />
+<h1>What is Arrow?</h1>
 <div class="row">
   <div class="col-lg-4">
-      <h2 class="mt-3">Fast</h2>
-      <p>Apache Arrow&#8482; enables execution engines to take advantage of the latest SIMD (Single instruction, multiple data) operations included in modern processors, for native vectorized optimization of analytical data processing. Columnar layout is optimized for data locality for better performance on modern hardware like CPUs and GPUs.</p>
-      <p>The Arrow memory format supports <strong>zero-copy reads</strong> for lightning-fast data access without serialization overhead.</p>
+      <h2 class="mt-3">Format</h2>
+      <p>Apache Arrow defines a language-independent columnar memory format for flat and hierarchical data, organized for efficient analytic operations on modern hardware like CPUs and GPUs. The Arrow memory format also supports <strong>zero-copy reads</strong> for lightning-fast data access without serialization overhead.</p>
+      <p><a href="{{ site.baseurl }}/overview/">Learn more</a> about the design or
+        <a href="{{ site.baseurl }}/docs/format/Columnar.html">read the specification</a>.</p>
   </div>
   <div class="col-lg-4">
-      <h2 class="mt-3">Flexible</h2>
-      <p>Arrow acts as a new high-performance interface between various systems. It is also focused on supporting a wide variety of industry-standard programming languages. C, C++, C#, Go, Java, JavaScript, MATLAB, Python, R, Ruby, and Rust implementations are in progress and more languages are welcome.
+      <h2 class="mt-3">Libraries</h2>
+      <p>The Arrow project includes libraries that implement the memory specification in many languages. They enable you to use the Arrow format as an efficient means of sharing data across languages and processes. Libraries are available for <a href="{{ site.baseurl }}/docs/c_glib/">C</a>, <a href="{{ site.baseurl }}/docs/cpp/">C++</a>, <a href="https://github.com/apache/arrow/blob/master/csharp/README.md">C#</a>, <a href="https://godoc.org/github.com/apache/arrow/go/arrow">Go</a>, <a href="{{ site.baseurl }}/docs/java/">Java</a>, <a href="{{ site.baseurl }}/docs/js/">JavaScript</a>, <a href="https://github.com/apache/arrow/blob/master/matlab/README.md">MATLAB</a>, <a href="{{ site.baseurl }}/docs/python/">Python</a>, <a href="{{ site.baseurl }}/docs/r/">R</a>, <a href="https://github.com/apache/arrow/blob/master/ruby/README.md">Ruby</a>, and <a href="https://docs.rs/crate/arrow/">Rust</a>.
       </p>
+      See <a href="{{ site.baseurl }}/install/">how to install</a> and <a href="{{ site.baseurl }}/getting_started/">get started</a>.
   </div>
   <div class="col-lg-4">
-      <h2 class="mt-3">Standard</h2>
-      <p>Apache Arrow is backed by key developers of 13 major open source projects, including Calcite, Cassandra, Drill, Hadoop, HBase, Ibis, Impala, Kudu, Pandas, Parquet, Phoenix, Spark, and Storm making it the de-facto standard for columnar in-memory analytics.</p>
-      <p>Learn more about projects that are <a href="{{ site.baseurl }}/powered_by/">Powered By Apache Arrow</a></p>
+      <h2 class="mt-3">Applications</h2>
+      <p>Arrow libraries provide a foundation for developers to build fast analytics applications. <a href="{{ site.baseurl }}/powered_by/">Many popular projects</a> use Arrow to ship columnar data efficiently or as the basis for analytic engines.</p>
+      <p>The libraries also include built-in features for working with data directly, including Parquet file reading and querying large datasets. See more Arrow <a href="{{ site.baseurl }}/use_cases/">use cases</a>.</p>

Review comment:
       I would say to condense the 2nd and 3rd points here and change this 3rd one to be about the ecosystem/community

##########
File path: faq.md
##########
@@ -24,32 +24,155 @@ limitations under the License.
 
 # Frequently Asked Questions
 
+## General
+
+### What *is* Arrow?
+
+Arrow is an open standard for how to represent columnar data in memory, along
+with libraries in many languages that implement that standard.  The Arrow format
+allows different programs and runtimes, perhaps written in different languages,
+to share data efficiently using a set of rich data types (including nested
+and user-defined data types).  The Arrow libraries make it easy to write such
+programs, by sparing the programmer from implementing low-level details of the
+Arrow format.
+
+Arrow additionally defines a streaming format and a file format for
+inter-process communication (IPC), based on the in-memory format.  It also
+defines a generic client-server RPC mechanism (Arrow Flight), based on the
+IPC format, and implemented on top of the gRPC framework.  <!-- TODO links -->
+
+### Why create a new standard?
+
+<!-- Fill this in -->
+
+## Project status
+
 ### How stable is the Arrow format? Is it safe to use in my application?
+<!-- Revise this -->
+
+The Arrow *in-memory format* is considered stable, and we intend to make only
+backwards-compatible changes, such as additional data types.  It is used by
+many applications already, and you can trust that compatibility will not be
+broken.
+
+The Arrow *file format* (based on the Arrow IPC mechanism) is not recommended
+for long-term disk persistence of data; that said, it is perfectly acceptable
+to write Arrow memory to disk for purposes of memory mapping and caching.
+
+We encourage people to start building Arrow-based in-memory computing
+applications now, and choose a suitable file format for disk storage
+if necessary. The Arrow libraries include adapters for several file formats,
+including Parquet, ORC, CSV, and JSON.
+
+### How stable are the Arrow libraries?
+
+We refer you to the [implementation matrix](https://github.com/apache/arrow/blob/master/docs/source/status.rst).
+
+## Getting started
+
+### Where can I get Arrow libraries?
+
+Arrow libraries for many languages are available through the usual package
+managers. See the [install]({{ site.baseurl }}/install/) page for specifics.
 
-The Arrow in-memory format is considered stable, and we intend to make only backwards-compatible changes, such as additional data types. We do not yet recommend the Arrow file format for long-term disk persistence of data; that said, it is perfectly acceptable to write Arrow memory to disk for purposes of memory mapping and caching.
+## Getting involved
 
-We encourage people to start building Arrow-based in-memory computing applications now, and choose a suitable file format for disk storage if necessary. The Arrow libraries include adapters for several file formats, including Parquet, ORC, CSV, and JSON.
+### I have some questions. How can I get help?
+
+The [Arrow mailing lists]({{ site.baseurl }}/community/) are the best place
+to ask questions. Don't be shy--we're here to help.
+
+### I tried to use Arrow and it didn't work. Can you fix it?
+
+Hopefully! Please make a detailed bug report--that's a valuable contribution
+to the project itself.
+See the [contribution guidelines]({{ site.baseurl }}/docs/developers/contributing.html)
+for how to make a report.
+
+### Arrow looks great and I'd totally use it if it only did X. When will it be done?
+
+We use [JIRA](https://issues.apache.org/jira/browse/ARROW) for our issue tracker.
+Search for an issue that matches your need. If you find one, feel free to
+comment on it and describe your use case--that will help whoever picks up
+the task. If you don't find one, make it.
+
+Ultimately, Arrow is software written by and for the community. If you don't
+see someone else in the community working on your issue, the best way to get
+it done is to pitch in yourself. We're more than willing to help you contribute
+successfully to the project.
+
+### How can I report a security vulnerability?
+
+Please send an email to [private@arrow.apache.org](mailto:private@arrow.apache.org).
+See the [security]({{ site.baseurl }}/security/) page for more.
+
+## Relation to other projects
 
 ### What is the difference between Apache Arrow and Apache Parquet?
+<!-- Revise this -->
+
+Parquet is a storage format designed for maximum space efficiency, using
+advanced compression and encoding techniques.  It is ideal when you want to
+minimize disk usage while storing gigabytes of data, or more.
+This efficiency comes at the cost of relatively expensive reads into memory,
+as Parquet data cannot be operated on directly but must be decoded in
+large chunks.
+
+Conversely, Arrow is an in-memory format meant for direct and efficient use
+for computational purposes.  Arrow data is not compressed (or only lightly so,
+when using dictionary encoding) but laid out in a natural format for the CPU,
+so that data can be accessed at arbitrary locations at full speed.
+
+Therefore, Arrow and Parquet are not competitors: they complement each other
+and are commonly used together in applications.  Storing your data on disk
+using Parquet, and reading it into memory in the Arrow format, will allow
+you to make the most of your computing hardware.
 
-In short, Parquet files are designed for disk storage, while Arrow is designed for in-memory use, but you can put it on disk and then memory-map later. Arrow and Parquet are intended to be compatible with each other and used together in applications.
+### What about "Arrow files" then?
 
-Parquet is a columnar file format for data serialization. Reading a Parquet file requires decompressing and decoding its contents into some kind of in-memory data structure. It is designed to be space/IO-efficient at the expensive CPU utilization for decoding. It does not provide any data structures for in-memory computing. Parquet is a streaming format which must be decoded from start-to-end; while some "index page" facilities have been added to the storage format recently, random access operations are generally costly.
+Apache Arrow defines an inter-process communication (IPC) mechanism to
+transfer a collection of Arrow columnar arrays (called a "record batch").
+It can be used synchronously between processes using the Arrow "stream format",
+or asynchronously by first persisting data on storage using the Arrow "file format".
 
-Arrow on the other hand is first and foremost a library providing columnar data structures for *in-memory computing*. When you read a Parquet file, you can decompress and decode the data *into* Arrow columnar data structures so that you can then perform analytics in-memory on the decoded data. The Arrow columnar format has some nice properties: random access is O(1) and each value cell is next to the previous and following one in memory, so it's efficient to iterate over.
+The Arrow IPC mechanism is based on the Arrow in-memory format, such that
+there is no translation necessary between the on-disk representation and
+the in-memory representation.  Therefore, performing analytics on an Arrow
+IPC file can use memory-mapping and pay effectively zero cost.
 
-What about "Arrow files" then? Apache Arrow defines a binary "serialization" protocol for arranging a collection of Arrow columnar arrays (called a "record batch") that can be used for messaging and interprocess communication. You can put the protocol anywhere, including on disk, which can later be memory-mapped or read into memory and sent elsewhere.
+Some things to keep in mind when comparing the Arrow IPC file format and the
+Parquet format:
 
-This Arrow protocol is designed so that you can "map" a blob of Arrow data without doing any deserialization, so performing analytics on Arrow protocol data on disk can use memory-mapping and pay effectively zero cost. The protocol is used for many other things as well, such as streaming data between Spark SQL and Python for running pandas functions against chunks of Spark SQL data (these are called "pandas udfs").
+* Parquet is safe for long-term storage and archival purposes, meaning if
+  you write a file today, you can expect that any system that says it can
+  "read Parquet" will be able to read the file in 5 or 10 years.
+  We are not yet making this assertion about long-term stability of the Arrow
+  format.
 
-In some applications, Parquet and Arrow can be used interchangeably for on-disk data serialization. Some things to keep in mind:
+* Reading Parquet files generally requires expensive decoding, while reading
+  Arrow IPC files is just a matter of transferring raw bytes from the storage
+  hardware.
 
-* Parquet is intended for "archival" purposes, meaning if you write a file today, we expect that any system that says they can "read Parquet" will be able to read the file in 5 years or 7 years. We are not yet making this assertion about long-term stability of the Arrow format.
-* Parquet is generally a lot more expensive to read because it must be decoded into some other data structure. Arrow protocol data can simply be memory-mapped.
-* Parquet files are often much smaller than Arrow-protocol-on-disk because of the data encoding schemes that Parquet uses. If your disk storage or network is slow, Parquet may be a better choice.
+* Parquet files are often much smaller than Arrow IPC files because of the
+  elaborate encoding schemes that Parquet uses. If your disk storage or network
+  is slow, Parquet may be a better choice even for short-term storage or caching.
+
+### What about the "Feather" file format?
+
+The Feather v1 format started as a separate specification, but the Feather v2
+format is just another, easier-to-remember name for the Arrow IPC file format.

Review comment:
       "started as a separate specification" -> "was a simplified custom container for writing a subset of the Arrow format to disk prior to the development of the Arrow IPC file format. "Feather version 2" is now exactly the Arrow IPC file format and we have retained the "Feather" name and APIs for backwards compatibility."

##########
File path: faq.md
##########
@@ -24,32 +24,155 @@ limitations under the License.
 
 # Frequently Asked Questions
 
+## General
+
+### What *is* Arrow?
+
+Arrow is an open standard for how to represent columnar data in memory, along
+with libraries in many languages that implement that standard.  The Arrow format
+allows different programs and runtimes, perhaps written in different languages,
+to share data efficiently using a set of rich data types (including nested
+and user-defined data types).  The Arrow libraries make it easy to write such
+programs, by sparing the programmer from implementing low-level details of the
+Arrow format.
+
+Arrow additionally defines a streaming format and a file format for
+inter-process communication (IPC), based on the in-memory format.  It also
+defines a generic client-server RPC mechanism (Arrow Flight), based on the
+IPC format, and implemented on top of the gRPC framework.  <!-- TODO links -->
+
+### Why create a new standard?
+
+<!-- Fill this in -->
+
+## Project status
+
 ### How stable is the Arrow format? Is it safe to use in my application?
+<!-- Revise this -->
+
+The Arrow *in-memory format* is considered stable, and we intend to make only
+backwards-compatible changes, such as additional data types.  It is used by
+many applications already, and you can trust that compatibility will not be
+broken.
+
+The Arrow *file format* (based on the Arrow IPC mechanism) is not recommended
+for long-term disk persistence of data; that said, it is perfectly acceptable
+to write Arrow memory to disk for purposes of memory mapping and caching.
+
+We encourage people to start building Arrow-based in-memory computing
+applications now, and choose a suitable file format for disk storage
+if necessary. The Arrow libraries include adapters for several file formats,
+including Parquet, ORC, CSV, and JSON.
+
+### How stable are the Arrow libraries?
+
+We refer you to the [implementation matrix](https://github.com/apache/arrow/blob/master/docs/source/status.rst).
+
+## Getting started
+
+### Where can I get Arrow libraries?
+
+Arrow libraries for many languages are available through the usual package
+managers. See the [install]({{ site.baseurl }}/install/) page for specifics.
 
-The Arrow in-memory format is considered stable, and we intend to make only backwards-compatible changes, such as additional data types. We do not yet recommend the Arrow file format for long-term disk persistence of data; that said, it is perfectly acceptable to write Arrow memory to disk for purposes of memory mapping and caching.
+## Getting involved
 
-We encourage people to start building Arrow-based in-memory computing applications now, and choose a suitable file format for disk storage if necessary. The Arrow libraries include adapters for several file formats, including Parquet, ORC, CSV, and JSON.
+### I have some questions. How can I get help?
+
+The [Arrow mailing lists]({{ site.baseurl }}/community/) are the best place
+to ask questions. Don't be shy--we're here to help.
+
+### I tried to use Arrow and it didn't work. Can you fix it?
+
+Hopefully! Please make a detailed bug report--that's a valuable contribution
+to the project itself.
+See the [contribution guidelines]({{ site.baseurl }}/docs/developers/contributing.html)
+for how to make a report.
+
+### Arrow looks great and I'd totally use it if it only did X. When will it be done?
+
+We use [JIRA](https://issues.apache.org/jira/browse/ARROW) for our issue tracker.
+Search for an issue that matches your need. If you find one, feel free to
+comment on it and describe your use case--that will help whoever picks up
+the task. If you don't find one, make it.
+
+Ultimately, Arrow is software written by and for the community. If you don't
+see someone else in the community working on your issue, the best way to get
+it done is to pitch in yourself. We're more than willing to help you contribute
+successfully to the project.
+
+### How can I report a security vulnerability?
+
+Please send an email to [private@arrow.apache.org](mailto:private@arrow.apache.org).
+See the [security]({{ site.baseurl }}/security/) page for more.
+
+## Relation to other projects
 
 ### What is the difference between Apache Arrow and Apache Parquet?
+<!-- Revise this -->
+
+Parquet is a storage format designed for maximum space efficiency, using
+advanced compression and encoding techniques.  It is ideal when you want to
+minimize disk usage while storing gigabytes of data, or more.
+This efficiency comes at the cost of relatively expensive reads into memory,
+as Parquet data cannot be operated on directly but must be decoded in
+large chunks.
+
+Conversely, Arrow is an in-memory format meant for direct and efficient use
+for computational purposes.  Arrow data is not compressed (or only lightly so,
+when using dictionary encoding) but laid out in a natural format for the CPU,
+so that data can be accessed at arbitrary locations at full speed.
+
+Therefore, Arrow and Parquet are not competitors: they complement each other
+and are commonly used together in applications.  Storing your data on disk
+using Parquet, and reading it into memory in the Arrow format, will allow
+you to make the most of your computing hardware.
 
-In short, Parquet files are designed for disk storage, while Arrow is designed for in-memory use, but you can put it on disk and then memory-map later. Arrow and Parquet are intended to be compatible with each other and used together in applications.
+### What about "Arrow files" then?
 
-Parquet is a columnar file format for data serialization. Reading a Parquet file requires decompressing and decoding its contents into some kind of in-memory data structure. It is designed to be space/IO-efficient at the expensive CPU utilization for decoding. It does not provide any data structures for in-memory computing. Parquet is a streaming format which must be decoded from start-to-end; while some "index page" facilities have been added to the storage format recently, random access operations are generally costly.
+Apache Arrow defines an inter-process communication (IPC) mechanism to
+transfer a collection of Arrow columnar arrays (called a "record batch").
+It can be used synchronously between processes using the Arrow "stream format",
+or asynchronously by first persisting data on storage using the Arrow "file format".
 
-Arrow on the other hand is first and foremost a library providing columnar data structures for *in-memory computing*. When you read a Parquet file, you can decompress and decode the data *into* Arrow columnar data structures so that you can then perform analytics in-memory on the decoded data. The Arrow columnar format has some nice properties: random access is O(1) and each value cell is next to the previous and following one in memory, so it's efficient to iterate over.
+The Arrow IPC mechanism is based on the Arrow in-memory format, such that
+there is no translation necessary between the on-disk representation and
+the in-memory representation.  Therefore, performing analytics on an Arrow
+IPC file can use memory-mapping and pay effectively zero cost.
 
-What about "Arrow files" then? Apache Arrow defines a binary "serialization" protocol for arranging a collection of Arrow columnar arrays (called a "record batch") that can be used for messaging and interprocess communication. You can put the protocol anywhere, including on disk, which can later be memory-mapped or read into memory and sent elsewhere.
+Some things to keep in mind when comparing the Arrow IPC file format and the
+Parquet format:
 
-This Arrow protocol is designed so that you can "map" a blob of Arrow data without doing any deserialization, so performing analytics on Arrow protocol data on disk can use memory-mapping and pay effectively zero cost. The protocol is used for many other things as well, such as streaming data between Spark SQL and Python for running pandas functions against chunks of Spark SQL data (these are called "pandas udfs").
+* Parquet is safe for long-term storage and archival purposes, meaning if
+  you write a file today, you can expect that any system that says it can
+  "read Parquet" will be able to read the file in 5 or 10 years.
+  We are not yet making this assertion about long-term stability of the Arrow
+  format.
 
-In some applications, Parquet and Arrow can be used interchangeably for on-disk data serialization. Some things to keep in mind:
+* Reading Parquet files generally requires expensive decoding, while reading
+  Arrow IPC files is just a matter of transferring raw bytes from the storage
+  hardware.
 
-* Parquet is intended for "archival" purposes, meaning if you write a file today, we expect that any system that says they can "read Parquet" will be able to read the file in 5 years or 7 years. We are not yet making this assertion about long-term stability of the Arrow format.
-* Parquet is generally a lot more expensive to read because it must be decoded into some other data structure. Arrow protocol data can simply be memory-mapped.
-* Parquet files are often much smaller than Arrow-protocol-on-disk because of the data encoding schemes that Parquet uses. If your disk storage or network is slow, Parquet may be a better choice.
+* Parquet files are often much smaller than Arrow IPC files because of the
+  elaborate encoding schemes that Parquet uses. If your disk storage or network

Review comment:
       "elaborate" seems a bit emotionally charged to me, let's use something more neutral and precise
   
   "elaborate encoding schemes" -> "columnar data compression strategies"

##########
File path: faq.md
##########
@@ -24,32 +24,155 @@ limitations under the License.
 
 # Frequently Asked Questions
 
+## General
+
+### What *is* Arrow?
+
+Arrow is an open standard for how to represent columnar data in memory, along
+with libraries in many languages that implement that standard.  The Arrow format
+allows different programs and runtimes, perhaps written in different languages,
+to share data efficiently using a set of rich data types (including nested
+and user-defined data types).  The Arrow libraries make it easy to write such
+programs, by sparing the programmer from implementing low-level details of the
+Arrow format.
+
+Arrow additionally defines a streaming format and a file format for
+inter-process communication (IPC), based on the in-memory format.  It also
+defines a generic client-server RPC mechanism (Arrow Flight), based on the
+IPC format, and implemented on top of the gRPC framework.  <!-- TODO links -->
+
+### Why create a new standard?
+
+<!-- Fill this in -->
+
+## Project status
+
 ### How stable is the Arrow format? Is it safe to use in my application?
+<!-- Revise this -->
+
+The Arrow *in-memory format* is considered stable, and we intend to make only

Review comment:
       Maybe "columnar format and protocol"

##########
File path: faq.md
##########
@@ -24,32 +24,155 @@ limitations under the License.
 
 # Frequently Asked Questions
 
+## General
+
+### What *is* Arrow?
+
+Arrow is an open standard for how to represent columnar data in memory, along
+with libraries in many languages that implement that standard.  The Arrow format
+allows different programs and runtimes, perhaps written in different languages,
+to share data efficiently using a set of rich data types (including nested
+and user-defined data types).  The Arrow libraries make it easy to write such
+programs, by sparing the programmer from implementing low-level details of the
+Arrow format.
+
+Arrow additionally defines a streaming format and a file format for
+inter-process communication (IPC), based on the in-memory format.  It also
+defines a generic client-server RPC mechanism (Arrow Flight), based on the
+IPC format, and implemented on top of the gRPC framework.  <!-- TODO links -->
+
+### Why create a new standard?

Review comment:
       "Why define a standard for columnar in-memory?"
   
   There can't be a new standard if there isn't an old one. There never was




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org