Posted to commits@arrow.apache.org by we...@apache.org on 2017/07/01 22:14:39 UTC

arrow git commit: ARROW-960: Add section on how to develop with pip

Repository: arrow
Updated Branches:
  refs/heads/master c294ec3db -> 96e7e9979


ARROW-960: Add section on how to develop with pip

Closes #788

Change-Id: Ia904d5e065c15ba83cf39f57ef97a4f6710aa60f


Project: http://git-wip-us.apache.org/repos/asf/arrow/repo
Commit: http://git-wip-us.apache.org/repos/asf/arrow/commit/96e7e997
Tree: http://git-wip-us.apache.org/repos/asf/arrow/tree/96e7e997
Diff: http://git-wip-us.apache.org/repos/asf/arrow/diff/96e7e997

Branch: refs/heads/master
Commit: 96e7e9979bd522b2231cc33c4196c2418a24e0fc
Parents: c294ec3
Author: Uwe L. Korn <uw...@xhochy.com>
Authored: Tue Jun 27 16:41:06 2017 +0200
Committer: Wes McKinney <we...@twosigma.com>
Committed: Sat Jul 1 18:14:25 2017 -0400

----------------------------------------------------------------------
 python/doc/source/development.rst | 93 ++++++++++++++++++++++++++--------
 1 file changed, 71 insertions(+), 22 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/arrow/blob/96e7e997/python/doc/source/development.rst
----------------------------------------------------------------------
diff --git a/python/doc/source/development.rst b/python/doc/source/development.rst
index 2063ba8..8a70180 100644
--- a/python/doc/source/development.rst
+++ b/python/doc/source/development.rst
@@ -22,14 +22,11 @@
 Development
 ***********
 
-Developing with conda
-=====================
-
-Linux and macOS
----------------
+Developing on Linux and macOS
+=============================
 
 System Requirements
-~~~~~~~~~~~~~~~~~~~
+-------------------
 
 On macOS, any modern XCode (6.4 or higher; the current version is 8.3.1) is
 sufficient.
@@ -55,20 +52,9 @@ Finally, set gcc 4.9 as the active compiler using:
    export CXX=g++-4.9
 
 Environment Setup and Build
-~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-First, let's create a conda environment with all the C++ build and Python
-dependencies from conda-forge:
-
-.. code-block:: shell
-
-   conda create -y -q -n pyarrow-dev \
-         python=3.6 numpy six setuptools cython pandas pytest \
-         cmake flatbuffers rapidjson boost-cpp thrift-cpp snappy zlib \
-         brotli jemalloc -c conda-forge
-   source activate pyarrow-dev
+---------------------------
 
-Now, let's clone the Arrow and Parquet git repositories:
+First, let's clone the Arrow and Parquet git repositories:
 
 .. code-block:: shell
 
@@ -87,6 +73,21 @@ You should now see
    drwxrwxr-x 12 wesm wesm 4096 Apr 15 19:19 arrow/
    drwxrwxr-x 12 wesm wesm 4096 Apr 15 19:19 parquet-cpp/
 
+Using Conda
+~~~~~~~~~~~
+
+Let's create a conda environment with all the C++ build and Python dependencies
+from conda-forge:
+
+.. code-block:: shell
+
+   conda create -y -q -n pyarrow-dev \
+         python=3.6 numpy six setuptools cython pandas pytest \
+         cmake flatbuffers rapidjson boost-cpp thrift-cpp snappy zlib \
+         brotli jemalloc -c conda-forge
+   source activate pyarrow-dev
+
+
 We need to set some environment variables to let Arrow's build system know
 about our build toolchain:
 
@@ -99,6 +100,55 @@ about our build toolchain:
    export ARROW_HOME=$CONDA_PREFIX
    export PARQUET_HOME=$CONDA_PREFIX
 
+Using pip
+~~~~~~~~~
+
+On macOS, use Homebrew to install the dependencies required for building
+Arrow C++:
+
+.. code-block:: shell
+
+   brew install ccache jemalloc boost thrift
+
+On Debian/Ubuntu, you need the following minimal set of dependencies. All other
+dependencies will be automatically built by Arrow's third-party toolchain.
+
+.. code-block:: shell
+
+   sudo apt-get install libjemalloc-dev libboost-dev \
+                        libboost-filesystem-dev \
+                        libboost-system-dev
+
+Now, let's create a Python virtualenv with all Python dependencies in the same
+folder as the repositories and a target installation folder:
+
+.. code-block:: shell
+
+   virtualenv pyarrow
+   source ./pyarrow/bin/activate
+   pip install six numpy pandas cython pytest
+
+   # This is the folder where we will install Arrow and Parquet to during
+   # development
+   mkdir dist
+
+If your cmake version is too old on Linux, you could get a newer one via ``pip
+install cmake``.
+
+We need to set some environment variables to let Arrow's build system know
+about our build toolchain:
+
+.. code-block:: shell
+
+   export ARROW_BUILD_TYPE=release
+
+   export ARROW_HOME=$(pwd)/dist
+   export PARQUET_HOME=$(pwd)/dist
+   export LD_LIBRARY_PATH=$(pwd)/dist/lib:$LD_LIBRARY_PATH
+
+Build and test
+--------------
+
 Now build and install the Arrow C++ libraries:
 
 .. code-block:: shell
@@ -127,7 +177,6 @@ toolchain:
          -DCMAKE_INSTALL_PREFIX=$PARQUET_HOME \
          -DPARQUET_BUILD_BENCHMARKS=off \
          -DPARQUET_BUILD_EXECUTABLES=off \
-         -DPARQUET_ZLIB_VENDORED=off \
          -DPARQUET_BUILD_TESTS=off \
          ..
 
@@ -179,8 +228,8 @@ You can build a wheel by running:
 
 Again, if you did not build parquet-cpp, you should omit ``--with-parquet``.
 
-Windows
-=======
+Developing on Windows
+=====================
 
 First, we bootstrap a conda environment similar to the `C++ build instructions
 <https://github.com/apache/arrow/blob/master/cpp/apidoc/Windows.md>`_. This