Posted to commits@arrow.apache.org by uw...@apache.org on 2018/05/17 13:50:49 UTC

[arrow] branch master updated: ARROW-2486: [C++/Python] Provide a Docker image that contains all dependencies for development

This is an automated email from the ASF dual-hosted git repository.

uwe pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/arrow.git


The following commit(s) were added to refs/heads/master by this push:
     new 941a1b7  ARROW-2486: [C++/Python] Provide a Docker image that contains all dependencies for development
941a1b7 is described below

commit 941a1b762c6d1ad1d86f67b5352875a00cf707d5
Author: Aneesh Karve <an...@gmail.com>
AuthorDate: Thu May 17 15:50:17 2018 +0200

    ARROW-2486: [C++/Python] Provide a Docker image that contains all dependencies for development
    
    Open items
    - [x] Why is `py.test pyarrow` failing on plasma deps when script follows [docs](https://arrow.apache.org/docs/python/development.html#developing-on-linux-and-macos)?
    - [x] Should `/script/*.sh` use the same code as developer docs to avoid denormalization?
    - [x] Move docker image to Apache registry?
    - [x] Multiple container strategy possible, but overly complex. Requires exposing volume on one container as a mount point for a second container. Only speeds up user's first build.
    - [x] Are gcc/g++ 4.8 the ideal versions?
    - [x] Unit tests needed?
    - [x] Update README per resolution of above
    
    Author: Aneesh Karve <an...@gmail.com>
    
    Closes #2016 from akarve/master and squashes the following commits:
    
    5aec17a8 <Aneesh Karve> final PR feedback; README indentation
---
 dev/container/Dockerfile              | 49 ++++++++++++++++++++++
 dev/container/README.md               | 76 +++++++++++++++++++++++++++++++++++
 dev/container/script/arrow-build.sh   | 33 +++++++++++++++
 dev/container/script/env.sh           | 28 +++++++++++++
 dev/container/script/parquet-build.sh | 32 +++++++++++++++
 dev/container/script/pyarrow-build.sh | 22 ++++++++++
 6 files changed, 240 insertions(+)

diff --git a/dev/container/Dockerfile b/dev/container/Dockerfile
new file mode 100644
index 0000000..de0729b
--- /dev/null
+++ b/dev/container/Dockerfile
@@ -0,0 +1,49 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+FROM ubuntu:18.04
+
+RUN apt-get update && \
+	apt-get install -y \
+		gcc-8 \
+		g++-8 \
+		vim \
+		git \
+		wget \
+		make \
+		ninja-build
+
+ENV CC=gcc-8
+ENV CXX=g++-8
+
+# Miniconda - Python 3.6, 64-bit, x86, latest
+RUN wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh -O mconda.sh && \
+	/bin/bash mconda.sh -b -p miniconda && \
+	rm mconda.sh
+
+ENV PATH="/miniconda/bin:$PATH"
+
+# create conda env with deps
+RUN conda create -y -q -n pyarrow-dev \
+	python=3.6 numpy six setuptools cython pandas pytest \
+	cmake flatbuffers rapidjson boost-cpp thrift-cpp snappy zlib \
+  	gflags brotli jemalloc lz4-c zstd -c conda-forge \
+	&& conda clean --all
+
+ADD script ./script
+RUN chmod u=rwx ./script/*.sh
+
diff --git a/dev/container/README.md b/dev/container/README.md
new file mode 100644
index 0000000..d8636e9
--- /dev/null
+++ b/dev/container/README.md
@@ -0,0 +1,76 @@
+<!---
+  Licensed to the Apache Software Foundation (ASF) under one
+  or more contributor license agreements.  See the NOTICE file
+  distributed with this work for additional information
+  regarding copyright ownership.  The ASF licenses this file
+  to you under the Apache License, Version 2.0 (the
+  "License"); you may not use this file except in compliance
+  with the License.  You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing,
+  software distributed under the License is distributed on an
+  "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+  KIND, either express or implied.  See the License for the
+  specific language governing permissions and limitations
+  under the License.
+-->
+
+# Apache Arrow development container
+* Includes all dependencies for Arrow development
+* Builds are incremental and mirrored to the local file system
+* Resolves [ARROW-2486](https://issues.apache.org/jira/browse/ARROW-2486)
+
+## Get started
+
+### [Install Docker](https://docs.docker.com/install/)
+
+### Acquire image
+
+```
+$ docker pull quiltdata/arrow
+```
+
+### Populate host directory
+Keep git repos and subsequent build products in a persistent local
+directory, `io`, which the container mounts as `/io`.
+
+```
+$ mkdir -p io/arrow
+$ git clone https://github.com/apache/arrow.git io/arrow
+$ mkdir -p io/parquet-cpp
+$ git clone https://github.com/apache/parquet-cpp.git io/parquet-cpp
+```
+Alternatively, if you wish to use existing git repos, you can nest them
+under the local `io` directory.
+
+### Run container, mount `/io` as volume
+
+```
+$ docker run \
+	--shm-size=2g \
+	-v /LOCAL/PATH/TO/io:/io \
+	-it quiltdata/arrow
+```
+
+### Use container
+Run scripts to build executables.
+
+See also [Arrow dev docs](https://arrow.apache.org/docs/python/development.html).
+
+```
+$ source script/env.sh
+$ script/arrow-build.sh
+$ script/parquet-build.sh
+$ script/pyarrow-build.sh
+# run tests
+$ cd /io/arrow/python
+$ py.test pyarrow
+```
+
+## Build container
+
+```
+$ docker build -t USERNAME/arrow .
+```
diff --git a/dev/container/script/arrow-build.sh b/dev/container/script/arrow-build.sh
new file mode 100644
index 0000000..ab448b8
--- /dev/null
+++ b/dev/container/script/arrow-build.sh
@@ -0,0 +1,33 @@
+#!/bin/bash
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+# See also https://arrow.apache.org/docs/python/development.html
+mkdir -p /io/arrow/cpp/build
+pushd /io/arrow/cpp/build
+cmake -DCMAKE_BUILD_TYPE=$ARROW_BUILD_TYPE \
+      -DCMAKE_INSTALL_PREFIX=$ARROW_HOME \
+      -DARROW_PYTHON=on \
+      -DARROW_PLASMA=on \
+      -DARROW_BUILD_TESTS=OFF \
+      -DCMAKE_CXX_FLAGS=$CXXFLAGS \
+      -GNinja \
+      ..
+ninja
+ninja install
+popd
+
diff --git a/dev/container/script/env.sh b/dev/container/script/env.sh
new file mode 100644
index 0000000..cb24424
--- /dev/null
+++ b/dev/container/script/env.sh
@@ -0,0 +1,28 @@
+#!/bin/bash
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+# See also https://arrow.apache.org/docs/python/development.html#build-and-test
+source activate pyarrow-dev
+export ARROW_BUILD_TYPE=release
+export ARROW_BUILD_TOOLCHAIN=$CONDA_PREFIX
+export PARQUET_BUILD_TOOLCHAIN=$CONDA_PREFIX
+export ARROW_HOME=$CONDA_PREFIX
+export PARQUET_HOME=$CONDA_PREFIX
+# For newer GCC per https://arrow.apache.org/docs/python/development.html#known-issues
+export CXXFLAGS="-D_GLIBCXX_USE_CXX11_ABI=0"
+export PYARROW_CXXFLAGS=$CXXFLAGS
diff --git a/dev/container/script/parquet-build.sh b/dev/container/script/parquet-build.sh
new file mode 100644
index 0000000..f06aa07
--- /dev/null
+++ b/dev/container/script/parquet-build.sh
@@ -0,0 +1,32 @@
+#!/bin/bash
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+# See also https://arrow.apache.org/docs/python/development.html#build-and-test
+mkdir -p /io/parquet-cpp/build
+pushd /io/parquet-cpp/build
+cmake -DCMAKE_BUILD_TYPE=$ARROW_BUILD_TYPE \
+      -DCMAKE_INSTALL_PREFIX=$PARQUET_HOME \
+      -DPARQUET_BUILD_BENCHMARKS=off \
+      -DPARQUET_BUILD_EXECUTABLES=off \
+      -DPARQUET_BUILD_TESTS=off \
+      -DCMAKE_CXX_FLAGS=$CXXFLAGS \
+      -GNinja \
+      ..
+ninja
+ninja install
+popd
diff --git a/dev/container/script/pyarrow-build.sh b/dev/container/script/pyarrow-build.sh
new file mode 100644
index 0000000..927ff79
--- /dev/null
+++ b/dev/container/script/pyarrow-build.sh
@@ -0,0 +1,22 @@
+#!/bin/bash
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+# See also https://arrow.apache.org/docs/python/development.html#build-and-test
+cd /io/arrow/python
+python setup.py build_ext --build-type=$ARROW_BUILD_TYPE \
+  --with-parquet --with-plasma --inplace

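Usage sketch (not part of the commit): the build scripts in this diff read `ARROW_BUILD_TYPE`, `ARROW_HOME`, `PARQUET_HOME` and `CXXFLAGS`, which are only set once `script/env.sh` has been sourced, and the README runs the three scripts in a fixed order. A small wrapper along the following lines could check for the exported variables and stop at the first failing step. The function names `require_env` and `run_steps` are hypothetical and do not exist in the repository.

```shell
#!/bin/sh
# Hypothetical helpers, not part of this commit.

# Succeeds only if every named variable is exported and non-empty.
# script/env.sh exports all the variables the build scripts read.
# Note: plain sh has no portable `local`, so `missing` and `name` are global.
require_env() {
    missing=0
    for name in "$@"; do
        if [ -z "$(printenv "$name")" ]; then
            echo "missing environment variable: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Runs each command in order and stops at the first one that fails.
run_steps() {
    for step in "$@"; do
        echo "==> $step"
        if ! "$step"; then
            echo "step failed: $step" >&2
            return 1
        fi
    done
}
```

Inside the container, after `source script/env.sh`, this might be called as `require_env ARROW_BUILD_TYPE ARROW_HOME PARQUET_HOME CXXFLAGS && run_steps script/arrow-build.sh script/parquet-build.sh script/pyarrow-build.sh`. Each script runs as a child process, so the `cd` in `pyarrow-build.sh` does not change the caller's working directory.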