Posted to commits@arrow.apache.org by th...@apache.org on 2021/12/08 12:57:23 UTC

[arrow] branch master updated: ARROW-14753: [Doc] Steps in making your first PR - building C++

This is an automated email from the ASF dual-hosted git repository.

thisisnic pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/arrow.git


The following commit(s) were added to refs/heads/master by this push:
     new 065b1fc  ARROW-14753: [Doc] Steps in making your first PR - building C++
065b1fc is described below

commit 065b1fcba985b067e960947e79ae62c8925dc42a
Author: Alenka Frim <fr...@gmail.com>
AuthorDate: Wed Dec 8 12:55:33 2021 +0000

    ARROW-14753: [Doc] Steps in making your first PR - building C++
    
    Add Building C++ and PyArrow sections of the New Contributor's guide.
    
    Closes #11820 from AlenkaF/ARROW-14753
    
    Lead-authored-by: Alenka Frim <fr...@gmail.com>
    Co-authored-by: Alenka Frim <Al...@users.noreply.github.com>
    Signed-off-by: Nic Crane <th...@gmail.com>
---
 docs/source/developers/cpp/building.rst            |   4 +
 .../developers/guide/architectural_overview.rst    |   2 +-
 docs/source/developers/guide/index.rst             |   2 +-
 .../developers/guide/step_by_step/building.rst     | 128 ++++++++++++++++++---
 docs/source/developers/python.rst                  |   4 +
 5 files changed, 125 insertions(+), 15 deletions(-)

diff --git a/docs/source/developers/cpp/building.rst b/docs/source/developers/cpp/building.rst
index f205972..a3d9ccc 100644
--- a/docs/source/developers/cpp/building.rst
+++ b/docs/source/developers/cpp/building.rst
@@ -131,6 +131,8 @@ repository and navigated to the ``cpp`` subdirectory:
    $ git clone https://github.com/apache/arrow.git
    $ cd arrow/cpp
 
+.. _cmake_presets:
+
 CMake presets
 -------------
 
@@ -293,6 +295,8 @@ option can make full builds significantly faster, but it also increases the
 memory requirements.  Consider turning it on (using ``-DCMAKE_UNITY_BUILD=ON``)
 if memory consumption is not an issue.
 
+.. _cpp_build_optional_components:
+
 Optional Components
 ~~~~~~~~~~~~~~~~~~~
 
diff --git a/docs/source/developers/guide/architectural_overview.rst b/docs/source/developers/guide/architectural_overview.rst
index cd85ca0..394e884 100644
--- a/docs/source/developers/guide/architectural_overview.rst
+++ b/docs/source/developers/guide/architectural_overview.rst
@@ -37,4 +37,4 @@ For an Architectural Overview of Arrow's libraries please
 refer to:
 
 - PyArrow Architectural Overview
-- R-Arrow Architectural Overview
\ No newline at end of file
+- R package Architectural Overview
\ No newline at end of file
diff --git a/docs/source/developers/guide/index.rst b/docs/source/developers/guide/index.rst
index bf5e822..324971c 100644
--- a/docs/source/developers/guide/index.rst
+++ b/docs/source/developers/guide/index.rst
@@ -81,7 +81,7 @@ of adding a basic feature.
    appropriate :ref:`communication` channel.
 
    See a short description about the building process of 
-   :ref:`PyArrow or R-Arrow<build-arrow>` or go straight to detailed
+   :ref:`PyArrow or the R package<build-arrow>` or go straight to detailed
    instructions on how to build one of Arrow libraries in the
    `documentation <https://arrow.apache.org/docs/>`_ .
  
diff --git a/docs/source/developers/guide/step_by_step/building.rst b/docs/source/developers/guide/step_by_step/building.rst
index 9a5fcff..645c5d8 100644
--- a/docs/source/developers/guide/step_by_step/building.rst
+++ b/docs/source/developers/guide/step_by_step/building.rst
@@ -18,9 +18,9 @@
 
 .. SCOPE OF THIS SECTION
 .. The aim of this section is to provide extra description to
-.. the process of building Arow library. It could include:
+.. the process of building Arrow library. It could include:
 .. what does building mean, what is CMake, what are flags and why
-.. do we use them, is building Arrow supposed to be straigtforward?
+.. do we use them, is building Arrow supposed to be straightforward?
 .. etc.
 
 .. Be sure not to duplicate with existing documentation!
@@ -31,21 +31,123 @@
 
 .. _build-arrow:
 
-*********************************
-Building Arrow's libraries 🏋🏿‍♀️
-*********************************
+**********************************
+Building the Arrow libraries 🏋🏿‍♀️
+**********************************
 
+The Arrow project contains a number of libraries that enable
+work in many languages. Most libraries (C++, C#, Go, Java,
+JavaScript, Julia, and Rust) already contain distinct implementations
+of Arrow. 
 
+This is different for C (GLib), MATLAB, Python, R, and Ruby as they
+are built on top of the C++ library. In this section of the guide
+we offer a friendly introduction to the build,
+covering some of these libraries as well as how they work with
+the C++ library.
 
-Building C++
-============
+If you decide to contribute to Arrow you might need to compile the
+C++ source code. This is done using a tool called CMake, which you
+may or may not have experience with. If not, this section of the
+guide will help you better understand CMake and the process
+of building Arrow's C++ code.
 
-.. _build-pyarrow:
+This content is intended to help explain the concepts related to 
+and tools required for building Arrow's C++ library from source.
+If you are looking for the specific required steps, or already feel comfortable 
+with compiling Arrow's C++ library, then feel free to proceed
+to the :ref:`C++ <building-arrow-cpp>`, :ref:`PyArrow <build_pyarrow>` or
+`R package build section <https://arrow.apache.org/docs/r/articles/developing.html>`_.
 
-Building PyArrow
-================
+Building Arrow C++
+==================
 
-.. _build-rarrow:
+Why build Arrow C++ from source?
+--------------------------------
+
+For Arrow implementations which are built on top of the C++ implementation
+(e.g. Python and R), wrappers and interfaces have been written to the
+underlying C++ functions. If you want to work on PyArrow or the R package,
+you may need to edit the source code of the C++ library too.
+
+Detailed instructions on building the C++ library from source can
+be found :ref:`here <building-arrow-cpp>`.
+
+About CMake
+-----------
+
+CMake is a cross-platform build system generator and it defers
+to another program such as ``make`` or ``ninja`` for the actual build.
+If you run into errors during the build, first read the error
+message carefully and check the building documentation for advice on
+similar errors. Changing the CMake flags used to compile Arrow can
+also help.
+
+CMake presets
+^^^^^^^^^^^^^
+
+You could also try building with CMake presets, which are a collection of
+build and test recipes for Arrow's CMake. They are very useful
+starting points.
+
+More detailed information about CMake presets can be found in
+the :ref:`cmake_presets` section.
+
+Optional flags and environment variables
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+CMake flags are used to include additional components
+and to handle third-party dependencies.
+The build for the C++ library can be minimal, with no flags set, or can
+be extended by adding optional components from the
+:ref:`list <cpp_build_optional_components>`.
+
+.. seealso::
+	Full list of optional flags: :ref:`cpp_build_optional_components`
+
+R and Python have specific lists of flags in their respective builds
+that need to be included. You can find the links at the end
+of this section.
+
+In general, on the Python side the options are set with CMake flags and
+paths with environment variables. In R, environment variables are used
+for everything connected to the build, including setting CMake flags.
+
+Building from source vs. using binaries
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+Using binaries is a fast and simple way of working with the latest release
+of Arrow. However, if you use them you will be unable to
+make changes to the Arrow C++ library.
+
+**Note:** every language has its own way of dealing with binaries.
+For more information, navigate to the section for the language you are
+interested in.
+
+.. tabs::
+
+   .. tab:: Building PyArrow
+
+      After building the Arrow C++ library, you also need to build PyArrow
+      on top of it. The reason is the same: so that you can edit the code
+      and run tests on the edited code you have locally.
+
+      **Why do we have to do builds separately?**
+
+      As mentioned at the beginning of this page, the Python part of the Arrow
+      project is built on top of the C++ library. In order to make changes in
+      the Python part of Arrow as well as the C++ part of Arrow, you need to
+      build them separately.
+
+      We hope this introduction was enough to help you start with the building
+      process.
+
+      .. seealso::
+         Follow the instructions to build PyArrow together with the C++ library
+         on your platform:
+
+         - :ref:`build_pyarrow`
+
+         - :ref:`build_pyarrow_win`
+
+   .. tab:: Building the R package
 
-Building R-Arrow
-================
diff --git a/docs/source/developers/python.rst b/docs/source/developers/python.rst
index 0f7abaf..71cbbeb 100644
--- a/docs/source/developers/python.rst
+++ b/docs/source/developers/python.rst
@@ -106,6 +106,8 @@ Benchmarking
 
 For running the benchmarks, see :ref:`python-benchmarks`.
 
+.. _build_pyarrow:
+
 Building on Linux and MacOS
 =============================
 
@@ -421,6 +423,8 @@ debugging a C++ unittest, for example:
    Make breakpoint pending on future shared library load? (y or [n]) y
    Breakpoint 1 (src/arrow/python/arrow_to_pandas.cc:1874) pending.
 
+.. _build_pyarrow_win:
+
 Building on Windows
 ===================