Posted to commits@arrow.apache.org by GitBox <gi...@apache.org> on 2020/04/02 18:28:02 UTC

[GitHub] [arrow-site] bkietz commented on a change in pull request #50: ARROW-8023: [Website] Add blog post about the C data interface

URL: https://github.com/apache/arrow-site/pull/50#discussion_r402507444
 
 

 ##########
 File path: _posts/2020-04-xx-introducing-arrow-c-data-interface.md
 ##########
 @@ -0,0 +1,90 @@
+---
+layout: post
+title: "Introducing the Apache Arrow C data interface"
+description: "This post introduces the Arrow C data interface, a simple C-based
+interoperability standard to simplify interactions between independent users
+and implementors of the Arrow in-memory format."
+date: "2020-04-02 00:00:00 +0100"
+author: apitrou
+categories: [application]
+---
+<!--
+{% comment %}
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements.  See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to you under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License.  You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+{% endcomment %}
+-->
+
+Apache Arrow is a cross-language, platform-independent in-memory format
+allowing zero-copy data sharing and transfer between heterogeneous runtimes
+and applications.
+
+The easiest way to benefit from the Arrow format has always been to depend
+on one of the concrete implementations developed by the Apache Arrow community.
+There are 11 such implementations, each for a different language and ecosystem
+(Java, C++, Python, Rust, R, JavaScript, C#, Go...).
+
+However, packaging and ABI issues in C++ can deter projects from depending
+on the Arrow C++ library.  We have therefore worked on an alternative, which
+involves exchanging data at the C level by conforming to a simple data
+definition.  The C ABI is a platform-wide standard that practically never
+changes, because it ensures portability of libraries and executable
+binaries.
+
+The best way to learn about the C Data Interface is to read the
+[spec](https://arrow.apache.org/docs/format/CDataInterface.html).
+However, we will quickly go over its strong points.
+
+## Two simple struct definitions
+
+To interact with the C Data Interface from your C or C++ code, the only
+thing you have to include is two struct type declarations
+(and a couple of ``#define``s for constant values).  Those declarations
+only depend on standard C types.  You can simply paste them into a header
+file.
+
+## Zero-copy data sharing
+
+The C Data Interface passes Arrow buffers through memory pointers.  So,
+by construction, it allows you to share or pass data from one runtime to
 
 Review comment:
   ```suggestion
   by construction, it allows you to share data from one runtime to
   ```
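
For context on the zero-copy claim in the quoted section, here is a minimal C sketch of passing a buffer through memory pointers. The `ArrowArray` layout follows the published C Data Interface spec; the helper names (`export_int32_array`, `sum_int32_array`, `release_int32_array`) are hypothetical and not part of the post or of any Arrow library. The sketch also omits the companion `ArrowSchema` struct and simply assumes both sides already agree that the data is a non-nullable int32 array.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Struct layout as given in the Arrow C Data Interface spec. */
struct ArrowArray {
  int64_t length;
  int64_t null_count;
  int64_t offset;
  int64_t n_buffers;
  int64_t n_children;
  const void** buffers;
  struct ArrowArray** children;
  struct ArrowArray* dictionary;
  void (*release)(struct ArrowArray*);
  void* private_data;
};

/* Hypothetical release callback: frees only the pointer table that the
   producer allocated, then marks the struct as released. */
static void release_int32_array(struct ArrowArray* array) {
  free((void*)array->buffers);
  array->release = NULL;
}

/* Hypothetical producer: exports `values` without copying them.
   Returns 0 on success, -1 if the pointer table cannot be allocated. */
static int export_int32_array(const int32_t* values, int64_t length,
                              struct ArrowArray* out) {
  const void** buffers = malloc(2 * sizeof(*buffers));
  if (buffers == NULL) return -1;
  buffers[0] = NULL;   /* validity bitmap omitted because null_count is 0 */
  buffers[1] = values; /* data buffer: the caller's memory, not a copy */
  out->length = length;
  out->null_count = 0;
  out->offset = 0;
  out->n_buffers = 2;
  out->n_children = 0;
  out->buffers = buffers;
  out->children = NULL;
  out->dictionary = NULL;
  out->release = release_int32_array;
  out->private_data = NULL;
  return 0;
}

/* Hypothetical consumer: reads the data in place through the pointer. */
static int64_t sum_int32_array(const struct ArrowArray* array) {
  assert(array->release != NULL); /* a released array must not be read */
  const int32_t* data = (const int32_t*)array->buffers[1];
  int64_t sum = 0;
  for (int64_t i = 0; i < array->length; ++i) {
    sum += data[array->offset + i];
  }
  return sum;
}
```

A consumer that is done with the array calls `array.release(&array)`. In this sketch the values themselves stay owned by the original caller, so only the small pointer table is freed.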
