Posted to commits@arrow.apache.org by we...@apache.org on 2017/07/01 22:14:39 UTC
arrow git commit: ARROW-960: Add section on how to develop with pip
Repository: arrow
Updated Branches:
refs/heads/master c294ec3db -> 96e7e9979
ARROW-960: Add section on how to develop with pip
Closes #788
Change-Id: Ia904d5e065c15ba83cf39f57ef97a4f6710aa60f
Project: http://git-wip-us.apache.org/repos/asf/arrow/repo
Commit: http://git-wip-us.apache.org/repos/asf/arrow/commit/96e7e997
Tree: http://git-wip-us.apache.org/repos/asf/arrow/tree/96e7e997
Diff: http://git-wip-us.apache.org/repos/asf/arrow/diff/96e7e997
Branch: refs/heads/master
Commit: 96e7e9979bd522b2231cc33c4196c2418a24e0fc
Parents: c294ec3
Author: Uwe L. Korn <uw...@xhochy.com>
Authored: Tue Jun 27 16:41:06 2017 +0200
Committer: Wes McKinney <we...@twosigma.com>
Committed: Sat Jul 1 18:14:25 2017 -0400
----------------------------------------------------------------------
python/doc/source/development.rst | 93 ++++++++++++++++++++++++++--------
1 file changed, 71 insertions(+), 22 deletions(-)
----------------------------------------------------------------------
http://git-wip-us.apache.org/repos/asf/arrow/blob/96e7e997/python/doc/source/development.rst
----------------------------------------------------------------------
diff --git a/python/doc/source/development.rst b/python/doc/source/development.rst
index 2063ba8..8a70180 100644
--- a/python/doc/source/development.rst
+++ b/python/doc/source/development.rst
@@ -22,14 +22,11 @@
Development
***********
-Developing with conda
-=====================
-
-Linux and macOS
----------------
+Developing on Linux and macOS
+=============================
System Requirements
-~~~~~~~~~~~~~~~~~~~
+-------------------
On macOS, any modern Xcode (6.4 or higher; the current version is 8.3.1) is
sufficient.
@@ -55,20 +52,9 @@ Finally, set gcc 4.9 as the active compiler using:
export CXX=g++-4.9
Environment Setup and Build
-~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-First, let's create a conda environment with all the C++ build and Python
-dependencies from conda-forge:
-
-.. code-block:: shell
-
- conda create -y -q -n pyarrow-dev \
- python=3.6 numpy six setuptools cython pandas pytest \
- cmake flatbuffers rapidjson boost-cpp thrift-cpp snappy zlib \
- brotli jemalloc -c conda-forge
- source activate pyarrow-dev
+---------------------------
-Now, let's clone the Arrow and Parquet git repositories:
+First, let's clone the Arrow and Parquet git repositories:
.. code-block:: shell
@@ -87,6 +73,21 @@ You should now see
drwxrwxr-x 12 wesm wesm 4096 Apr 15 19:19 arrow/
drwxrwxr-x 12 wesm wesm 4096 Apr 15 19:19 parquet-cpp/
+Using Conda
+~~~~~~~~~~~
+
+Let's create a conda environment with all the C++ build and Python dependencies
+from conda-forge:
+
+.. code-block:: shell
+
+ conda create -y -q -n pyarrow-dev \
+ python=3.6 numpy six setuptools cython pandas pytest \
+ cmake flatbuffers rapidjson boost-cpp thrift-cpp snappy zlib \
+ brotli jemalloc -c conda-forge
+ source activate pyarrow-dev
+
+
We need to set some environment variables to let Arrow's build system know
about our build toolchain:
@@ -99,6 +100,55 @@ about our build toolchain:
export ARROW_HOME=$CONDA_PREFIX
export PARQUET_HOME=$CONDA_PREFIX
+Using pip
+~~~~~~~~~
+
+On macOS, install all dependencies through Homebrew that are required for
+building Arrow C++:
+
+.. code-block:: shell
+
+ brew install ccache jemalloc boost thrift
+
+On Debian/Ubuntu, you need the following minimal set of dependencies. All other
+dependencies will be automatically built by Arrow's third-party toolchain.
+
+.. code-block:: shell
+
+ $ sudo apt-get install libjemalloc-dev libboost-dev \
+ libboost-filesystem-dev \
+ libboost-system-dev
+
+Now, let's create a Python virtualenv with all Python dependencies in the same
+folder as the repositories and a target installation folder:
+
+.. code-block:: shell
+
+ virtualenv pyarrow
+ source ./pyarrow/bin/activate
+ pip install six numpy pandas cython pytest
+
+ # This is the folder where we will install Arrow and Parquet to during
+ # development
+ mkdir dist
+
+If your cmake version is too old on Linux, you could get a newer one via ``pip
+install cmake``.
+
+We need to set some environment variables to let Arrow's build system know
+about our build toolchain:
+
+.. code-block:: shell
+
+ export ARROW_BUILD_TYPE=release
+
+ export ARROW_HOME=$(pwd)/dist
+ export PARQUET_HOME=$(pwd)/dist
+ export LD_LIBRARY_PATH=$(pwd)/dist/lib:$LD_LIBRARY_PATH
+
+Build and test
+--------------
+
Now build and install the Arrow C++ libraries:
.. code-block:: shell
@@ -127,7 +177,6 @@ toolchain:
-DCMAKE_INSTALL_PREFIX=$PARQUET_HOME \
-DPARQUET_BUILD_BENCHMARKS=off \
-DPARQUET_BUILD_EXECUTABLES=off \
- -DPARQUET_ZLIB_VENDORED=off \
-DPARQUET_BUILD_TESTS=off \
..
@@ -179,8 +228,8 @@ You can build a wheel by running:
Again, if you did not build parquet-cpp, you should omit ``--with-parquet``.
-Windows
-=======
+Developing on Windows
+=====================
First, we bootstrap a conda environment similar to the `C++ build instructions
<https://github.com/apache/arrow/blob/master/cpp/apidoc/Windows.md>`_. This