Posted to commits@arrow.apache.org by we...@apache.org on 2018/09/30 17:49:39 UTC

[arrow] branch master updated: ARROW-3180: [C++] Add docker-compose setup to simulate Travis CI run locally

This is an automated email from the ASF dual-hosted git repository.

wesm pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/arrow.git


The following commit(s) were added to refs/heads/master by this push:
     new 1649864  ARROW-3180: [C++] Add docker-compose setup to simulate Travis CI run locally
1649864 is described below

commit 16498643eb52c414970a6b9b49fe7396728cb306
Author: Krisztián Szűcs <sz...@gmail.com>
AuthorDate: Sun Sep 30 13:49:30 2018 -0400

    ARROW-3180: [C++] Add docker-compose setup to simulate Travis CI run locally
    
    Also resolves:
    
    > ARROW-3078: [Python] Docker integration tests should not contaminate the local Python development environment
    
    Working:
    - [x] rust
    - [x] go
    - [x] c_glib
    - [x] ruby
    - [x] java
    - [x] cpp
    - [x] python
    - [x] hdfs integration
    
    The rest:
    - js (gulp error...)
    - matlab
    - R
    - dask integration
    - spark integration
    - gen_apidocs
    - hiveserver2
    - iwyu
    
    Perhaps resolve them in follow-up PRs.
    
    Author: Krisztián Szűcs <sz...@gmail.com>
    Author: Wes McKinney <we...@apache.org>
    
    Closes #2572 from kszucs/ARROW-3180 and squashes the following commits:
    
    41cd81082 <Krisztián Szűcs> shm_size
    c3b73e7c8 <Krisztián Szűcs> set shm_size
    83929c98c <Krisztián Szűcs>  remove js lock files from dockerignore
    52c0d8917 <Wes McKinney> Add glog to C++ env
    5074cf577 <Krisztián Szűcs>  add integration data to java image
    8ea75a67d <Krisztián Szűcs>  java docker image
    0d11fc67e <Krisztián Szűcs> organize js dockerfile
    450311638 <Krisztián Szűcs> improve layer caching
    a5385ad23 <Krisztián Szűcs> update dockerignore to reduce the build context's size
    e28791942 <Krisztián Szűcs> meson install c_glib; run ruby tests
    210d4eba1 <Krisztián Szűcs> c_glib dockerfile
    09a3e6f49 <Krisztián Szűcs> js dockerfile
    9e51cb90c <Krisztián Szűcs> mixed whitespaces in python Dockerfile
    c8d3f2651 <Krisztián Szűcs> mixed whitespaces in cpp Dockerfile
    99ce42e44 <Krisztián Szűcs> mixed whitespaces in hdfs Dockerfile
    f505105b7 <Krisztián Szűcs> missing license headers
    26fe1f61a <Krisztián Szűcs> license headers
    1737ef4a5 <Krisztián Szűcs> comment fixes
    2900810b5 <Krisztián Szűcs> docker-compose setup; go, rust, cpp, python, hdfs-integration
---
 .dockerignore                                      |  85 +++++++++++++++
 c_glib/Dockerfile                                  |  72 ++++++++++++
 c_glib/README.md                                   |   2 +-
 ci/conda_env_cpp.yml                               |  33 ++++++
 ci/conda_env_python.yml                            |  26 +++++
 ci/docker_build_c_glib.sh                          |  37 +++++++
 ci/docker_build_cpp.sh                             |  49 +++++++++
 ci/docker_build_python.sh                          |  38 +++++++
 ci/docker_install_conda.sh                         |  29 +++++
 cpp/CMakeLists.txt                                 |   6 -
 cpp/Dockerfile                                     |  53 +++++++++
 dev/docker-compose.yml                             |  32 ------
 dev/hdfs_integration/Dockerfile                    |  78 -------------
 dev/hdfs_integration/hdfs_integration.sh           | 115 --------------------
 docker-compose.yml                                 | 121 +++++++++++++++++++++
 go/Dockerfile                                      |  25 +++++
 integration/hdfs/Dockerfile                        |  74 +++++++++++++
 .../hdfs/libhdfs3.xml                              |   0
 integration/hdfs/runtest.sh                        |  30 +++++
 java/Dockerfile                                    |  30 +++++
 js/Dockerfile                                      |  32 ++++++
 python/Dockerfile                                  |  62 +++++++++++
 python/pyarrow/tests/test_parquet.py               |   5 +-
 rust/Dockerfile                                    |  24 ++++
 24 files changed, 824 insertions(+), 234 deletions(-)
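
A minimal sketch of how the services defined in the new top-level docker-compose.yml might be built and run locally. The `run_service` helper and the `DRY_RUN` variable are hypothetical names introduced for illustration and are not part of this commit. With `DRY_RUN=1` (the default assumed here) the commands are only printed, so the snippet runs without Docker installed. The service names and the `PYTHON_VERSION` variable are taken from docker-compose.yml in this commit.

```shell
#!/usr/bin/env bash
set -e

# Hypothetical helper (not part of the commit): build a compose service,
# then run its default command. With DRY_RUN=1 the commands are only printed.
DRY_RUN="${DRY_RUN:-1}"

run_service() {
  local service="$1"
  local cmd
  for cmd in "docker-compose build ${service}" "docker-compose run --rm ${service}"; do
    if [ "${DRY_RUN}" = "1" ]; then
      echo "${cmd}"
    else
      # Intentionally unquoted so the string splits into command and arguments
      ${cmd}
    fi
  done
}

# Service names come from docker-compose.yml in this commit.
run_service cpp

# docker-compose.yml reads PYTHON_VERSION and falls back to 3.6 when unset.
export PYTHON_VERSION=3.6
run_service python
```

Setting `DRY_RUN=0` would execute the printed commands instead, which assumes docker-compose is installed and the script is started from the repository root, where docker-compose.yml lives.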

diff --git a/.dockerignore b/.dockerignore
new file mode 100644
index 0000000..61f42f0
--- /dev/null
+++ b/.dockerignore
@@ -0,0 +1,85 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+.git
+
+# IDE
+.vscode
+
+# c_glib
+c_glib/build
+c_glib/autom4te.cache
+c_glib/m4
+c_glib/*/*.o
+c_glib/*/*.lo
+c_glib/*/*.la
+c_glib/*/.deps
+c_glib/*/.libs
+
+# cpp
+cpp/.idea
+cpp/build
+cpp/*-build
+cpp/Testing
+
+# python
+python/build
+python/dist
+python/*.egg-info
+python/*.egg
+python/*.pyc
+python/doc/_build
+__pycache__/
+*/__pycache__/
+*/*/__pycache__/
+*/*/*/__pycache__/
+*.py[cod]
+*/*.py[cod]
+*/*/*.py[cod]
+*/*/*/*.py[cod]
+*.so
+*/*.so
+*/*/*.so
+*/*/*/*.so
+*.dylib
+*/*.dylib
+*/*/*.dylib
+*/*/*/*.dylib
+
+# JS
+js/.npm
+js/node_modules
+js/jspm_packages
+
+js/logs
+js/*.log
+js/.esm-cache
+js/npm-debug.log*
+js/yarn-debug.log*
+js/yarn-error.log*
+
+js/.grunt
+js/bower_components
+js/.lock-wscript
+js/build/Release
+js/dist
+js/targets
+js/test/data
+js/test/__snapshots__
+
+# Rust
+rust/target
diff --git a/c_glib/Dockerfile b/c_glib/Dockerfile
new file mode 100644
index 0000000..76fc8cf
--- /dev/null
+++ b/c_glib/Dockerfile
@@ -0,0 +1,72 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+FROM ubuntu:18.04
+
+ENV DEBIAN_FRONTEND=noninteractive
+RUN apt-get -q update && \
+    apt-get -q install --no-install-recommends -y \
+        gcc \
+        g++ \
+        git \
+        wget \
+        tzdata \
+        ruby-dev \
+        pkg-config \
+        ninja-build \
+        autoconf-archive \
+        gtk-doc-tools \
+        libgirepository1.0-dev
+
+ENV CC=gcc \
+    CXX=g++ \
+    PATH=/opt/conda/bin:$PATH \
+    CONDA_PREFIX=/opt/conda
+
+# install dependencies
+ADD ci/docker_install_conda.sh \
+    ci/conda_env_cpp.yml \
+    /arrow/ci/
+ADD c_glib/Gemfile /arrow/c_glib/
+RUN arrow/ci/docker_install_conda.sh && \
+    conda install -c conda-forge \
+        --file arrow/ci/conda_env_cpp.yml \
+        meson=0.47.1 && \
+    conda clean --all && \
+    gem install bundler && \
+    bundle install --gemfile arrow/c_glib/Gemfile
+
+# build cpp
+ENV ARROW_BUILD_TESTS=OFF \
+    ARROW_BUILD_UTILITIES=OFF \
+    ARROW_INSTALL_NAME_RPATH=OFF
+ADD ci/docker_build_cpp.sh /arrow/ci/
+ADD cpp /arrow/cpp
+ADD format /arrow/format
+ADD java/pom.xml /arrow/java/pom.xml
+RUN arrow/ci/docker_build_cpp.sh
+
+# build c_glib
+ENV LD_LIBRARY_PATH="${CONDA_PREFIX}/lib" \
+    PKG_CONFIG_PATH="${CONDA_PREFIX}/lib/pkgconfig" \
+    GI_TYPELIB_PATH="${CONDA_PREFIX}/lib/girepository-1.0"
+ADD ci/docker_build_c_glib.sh /arrow/ci/
+ADD c_glib /arrow/c_glib
+RUN arrow/ci/docker_build_c_glib.sh
+
+WORKDIR arrow/c_glib
+CMD test/run-test.rb
diff --git a/c_glib/README.md b/c_glib/README.md
index c4f8045..498f94f 100644
--- a/c_glib/README.md
+++ b/c_glib/README.md
@@ -198,7 +198,7 @@ based bindings. Here are languages that support GObject Introspection:
   * Ruby: [red-arrow gem](https://rubygems.org/gems/red-arrow) should be used.
     * Examples: https://github.com/red-data-tools/red-arrow/tree/master/example
 
-  * Python: [PyGObject](https://wiki.gnome.org/Projects/PyGObject) should be used. (Note that you should use PyArrow than Arrow GLib.)
+  * Python: [PyGObject](https://wiki.gnome.org/Projects/PyGObject) should be used. (Note that you should prefer PyArrow over Arrow GLib.)
 
   * Lua: [LGI](https://github.com/pavouk/lgi) should be used.
     * Examples: `example/lua/` directory.
diff --git a/ci/conda_env_cpp.yml b/ci/conda_env_cpp.yml
new file mode 100644
index 0000000..5aa3c32
--- /dev/null
+++ b/ci/conda_env_cpp.yml
@@ -0,0 +1,33 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+boost-cpp
+brotli
+cmake
+flatbuffers
+gflags
+glog
+gtest
+jemalloc
+libprotobuf
+lz4-c
+python
+rapidjson
+snappy
+thrift-cpp
+zlib
+zstd
diff --git a/ci/conda_env_python.yml b/ci/conda_env_python.yml
new file mode 100644
index 0000000..43022ae
--- /dev/null
+++ b/ci/conda_env_python.yml
@@ -0,0 +1,26 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+cython
+ipython
+nomkl
+numpy
+pandas
+pytest
+python
+setuptools
+setuptools_scm
diff --git a/ci/docker_build_c_glib.sh b/ci/docker_build_c_glib.sh
new file mode 100755
index 0000000..7390e79
--- /dev/null
+++ b/ci/docker_build_c_glib.sh
@@ -0,0 +1,37 @@
+#!/usr/bin/env bash
+
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+set -e
+
+export ARROW_C_GLIB_HOME=$CONDA_PREFIX
+
+export CFLAGS="-DARROW_NO_DEPRECATED_API"
+export CXXFLAGS="-DARROW_NO_DEPRECATED_API -D_GLIBCXX_USE_CXX11_ABI=0"
+
+pushd arrow/c_glib
+  mkdir build
+
+  # Build with Meson
+  meson build --prefix=$ARROW_C_GLIB_HOME --libdir=lib
+
+  pushd build
+    ninja
+    ninja install
+  popd
+popd
diff --git a/ci/docker_build_cpp.sh b/ci/docker_build_cpp.sh
new file mode 100755
index 0000000..16a09f3
--- /dev/null
+++ b/ci/docker_build_cpp.sh
@@ -0,0 +1,49 @@
+#!/usr/bin/env bash
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+set -e
+
+# Arrow specific environment variables
+export ARROW_BUILD_TOOLCHAIN=$CONDA_PREFIX
+export ARROW_HOME=$CONDA_PREFIX
+export PARQUET_HOME=$CONDA_PREFIX
+
+# https://arrow.apache.org/docs/python/development.html#known-issues
+export CXXFLAGS="-D_GLIBCXX_USE_CXX11_ABI=0"
+
+mkdir -p arrow/cpp/build
+pushd arrow/cpp/build
+
+cmake -GNinja \
+      -DCMAKE_BUILD_TYPE=${ARROW_BUILD_TYPE:-debug} \
+      -DCMAKE_INSTALL_PREFIX=$ARROW_HOME \
+      -DARROW_ORC=ON \
+      -DARROW_PLASMA=ON \
+      -DARROW_PARQUET=ON \
+      -DARROW_HDFS=${ARROW_HDFS:-OFF} \
+      -DARROW_PYTHON=${ARROW_PYTHON:-OFF} \
+      -DARROW_BUILD_TESTS=${ARROW_BUILD_TESTS:-OFF} \
+      -DARROW_BUILD_UTILITIES=${ARROW_BUILD_UTILITIES:-ON} \
+      -DARROW_INSTALL_NAME_RPATH=${ARROW_INSTALL_NAME_RPATH:-ON} \
+      -DARROW_EXTRA_ERROR_CONTEXT=ON \
+      -DCMAKE_CXX_FLAGS=$CXXFLAGS \
+      ..
+ninja
+ninja install
+
+popd
diff --git a/ci/docker_build_python.sh b/ci/docker_build_python.sh
new file mode 100755
index 0000000..f145694
--- /dev/null
+++ b/ci/docker_build_python.sh
@@ -0,0 +1,38 @@
+#!/usr/bin/env bash
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+set -e
+
+export ARROW_BUILD_TOOLCHAIN=$CONDA_PREFIX
+export ARROW_HOME=$CONDA_PREFIX
+
+# For newer GCC per https://arrow.apache.org/docs/python/development.html#known-issues
+export CXXFLAGS="-D_GLIBCXX_USE_CXX11_ABI=0"
+export PYARROW_CXXFLAGS=$CXXFLAGS
+export PYARROW_CMAKE_GENERATOR=Ninja
+
+# Build pyarrow
+pushd arrow/python
+
+python setup.py build_ext \
+    --build-type=${ARROW_BUILD_TYPE:-debug} \
+    --with-parquet \
+    --with-plasma \
+    --inplace
+
+popd
diff --git a/ci/docker_install_conda.sh b/ci/docker_install_conda.sh
new file mode 100755
index 0000000..81d5256
--- /dev/null
+++ b/ci/docker_install_conda.sh
@@ -0,0 +1,29 @@
+#!/usr/bin/env bash
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# Exit on any error
+set -e
+
+wget --quiet https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh -O miniconda.sh
+bash miniconda.sh -b -q -p ${CONDA_PREFIX:=/opt/conda}
+rm miniconda.sh
+
+ln -s ${CONDA_PREFIX}/etc/profile.d/conda.sh /etc/profile.d/conda.sh
+echo ". ${CONDA_PREFIX}/etc/profile.d/conda.sh" >> ~/.bashrc
+echo "conda activate base" >> ~/.bashrc
diff --git a/cpp/CMakeLists.txt b/cpp/CMakeLists.txt
index 1d74604..ad231fa 100644
--- a/cpp/CMakeLists.txt
+++ b/cpp/CMakeLists.txt
@@ -343,12 +343,6 @@ if (NOT ARROW_FUZZING)
   set(NO_FUZZING 1)
 endif()
 
-if(ARROW_HDFS)
-  set(ARROW_BOOST_HEADER_ONLY 0)
-else()
-  set(ARROW_BOOST_HEADER_ONLY 1)
-endif()
-
 if (ARROW_TENSORFLOW)
   # TensorFlow uses the old GLIBCXX ABI, so we have to use it too
   set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -D_GLIBCXX_USE_CXX11_ABI=0")
diff --git a/cpp/Dockerfile b/cpp/Dockerfile
new file mode 100644
index 0000000..de5b40e
--- /dev/null
+++ b/cpp/Dockerfile
@@ -0,0 +1,53 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+FROM ubuntu:18.04
+
+RUN apt-get update && \
+    apt-get install -y \
+        gcc \
+        g++ \
+        git \
+        wget \
+        pkg-config \
+        ninja-build
+
+ENV CC=gcc \
+    CXX=g++ \
+    PATH=/opt/conda/bin:$PATH \
+    CONDA_PREFIX=/opt/conda
+
+ADD ci/docker_install_conda.sh \
+    ci/conda_env_cpp.yml \
+    /arrow/ci/
+RUN arrow/ci/docker_install_conda.sh && \
+    conda install -c conda-forge \
+        --file arrow/ci/conda_env_cpp.yml && \
+    conda clean --all
+
+# build cpp with tests
+ENV ARROW_BUILD_TESTS=ON
+ADD ci/docker_build_cpp.sh /arrow/ci/
+ADD cpp /arrow/cpp
+ADD format /arrow/format
+ADD java/pom.xml /arrow/java/pom.xml
+RUN arrow/ci/docker_build_cpp.sh
+
+# execute the tests
+WORKDIR arrow/cpp/build
+ENV PARQUET_TEST_DATA=/arrow/cpp/submodules/parquet-testing/data
+CMD ninja test
diff --git a/dev/docker-compose.yml b/dev/docker-compose.yml
index c832cc3..5e5fdc2 100644
--- a/dev/docker-compose.yml
+++ b/dev/docker-compose.yml
@@ -17,44 +17,12 @@
 version: '3'
 services:
 
-  hdfs-namenode:
-    image: gelog/hadoop
-    shm_size: 2G
-    ports:
-      - "9000:9000"
-      - "50070:50070"
-    command: hdfs namenode
-    hostname: hdfs-namenode
-
-  hdfs-datanode:
-    image: gelog/hadoop
-    command: hdfs datanode
-    ports:
-      # The host port is randomly assigned by Docker, to allow scaling
-      # to multiple DataNodes on the same host
-      - "50075"
-    links:
-      - hdfs-namenode:hdfs-namenode
-
   impala:
     image: cpcloud86/impala:java8-1
     ports:
       - "21050"
     hostname: impala
 
-  hdfs_integration:
-    links:
-      - hdfs-namenode:hdfs-namenode
-      - hdfs-datanode:hdfs-datanode
-    environment:
-      - ARROW_HDFS_TEST_HOST=hdfs-namenode
-      - ARROW_HDFS_TEST_PORT=9000
-      - ARROW_HDFS_TEST_USER=root
-    build:
-      context: hdfs_integration
-    volumes:
-     - ../..:/apache-arrow
-
   hiveserver2:
     links:
       - impala
diff --git a/dev/hdfs_integration/Dockerfile b/dev/hdfs_integration/Dockerfile
deleted file mode 100644
index 56b8cb1..0000000
--- a/dev/hdfs_integration/Dockerfile
+++ /dev/null
@@ -1,78 +0,0 @@
-#
-# Licensed to the Apache Software Foundation (ASF) under one or more
-# contributor license agreements.  See the NOTICE file distributed with
-# this work for additional information regarding copyright ownership.
-# The ASF licenses this file to You under the Apache License, Version 2.0
-# (the "License"); you may not use this file except in compliance with
-# the License.  You may obtain a copy of the License at
-#
-#    http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-#
-
-FROM gelog/hadoop
-
-ENV CC=gcc \
-    CXX=g++
-
-RUN apt-get update -y \
- && apt-get install -y \
-	  gcc \
-	  g++ \
-	  git \
-	  wget \
-	  pkg-config \
-	  ninja-build
-
-RUN wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh -O conda.sh \
- && /bin/bash conda.sh -b -p /opt/conda \
- && rm conda.sh
-
-ENV PATH="/opt/conda/bin:${PATH}"
-
-RUN conda create -y -q -c conda-forge -n pyarrow-dev \
-      python=3.6 \
-      ipython \
-      matplotlib \
-      nomkl \
-      numpy \
-      six \
-      setuptools \
-      cython \
-      pandas \
-      pytest \
-      cmake \
-      flatbuffers \
-      rapidjson \
-      boost-cpp \
-      thrift-cpp \
-      snappy \
-      zlib \
-      gflags \
-      brotli \
-      jemalloc \
-      lz4-c \
-      zstd \
-      setuptools \
-      setuptools_scm \
- && conda clean --all
-
-# installing in the previous step boost=1.60 and boost-cpp=1.67 gets installed,
-# cmake finds 1.60 and parquet fails to compile
-# installing it in a separate step, boost=1.60 and boost-cpp=1.64 gets
-# installed, cmake finds 1.64
-# libhdfs3 needs to be pinned,see ARROW-1465 and ARROW-1445
-RUN conda install -y -q -n pyarrow-dev -c conda-forge \
-      hdfs3 \
-      libhdfs3=2.2.31 \
- && conda clean --all
-
-ADD . /apache-arrow
-WORKDIR /apache-arrow
-
-CMD arrow/dev/hdfs_integration/hdfs_integration.sh
diff --git a/dev/hdfs_integration/hdfs_integration.sh b/dev/hdfs_integration/hdfs_integration.sh
deleted file mode 100755
index b48d434..0000000
--- a/dev/hdfs_integration/hdfs_integration.sh
+++ /dev/null
@@ -1,115 +0,0 @@
-#!/usr/bin/env bash
-#
-# Licensed to the Apache Software Foundation (ASF) under one or more
-# contributor license agreements.  See the NOTICE file distributed with
-# this work for additional information regarding copyright ownership.
-# The ASF licenses this file to You under the Apache License, Version 2.0
-# (the "License"); you may not use this file except in compliance with
-# the License.  You may obtain a copy of the License at
-#
-#    http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-#
-
-# Exit on any error
-set -e
-
-# cwd is mounted from host machine to
-# and contains both arrow and parquet-cpp
-
-# Activate conda environment
-conda activate pyarrow-dev
-
-# Arrow build variables
-export ARROW_BUILD_TYPE=debug
-export ARROW_BUILD_TOOLCHAIN=$CONDA_PREFIX
-export PARQUET_BUILD_TOOLCHAIN=$CONDA_PREFIX
-export ARROW_HOME=$CONDA_PREFIX
-export PARQUET_HOME=$CONDA_PREFIX
-
-# Hadoop variables
-export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$HADOOP_HOME/lib/native/
-export CLASSPATH=`$HADOOP_HOME/bin/hadoop classpath --glob`
-
-# For newer GCC per https://arrow.apache.org/docs/python/development.html#known-issues
-export CXXFLAGS="-D_GLIBCXX_USE_CXX11_ABI=0"
-export PYARROW_CXXFLAGS=$CXXFLAGS
-export PYARROW_CMAKE_GENERATOR=Ninja
-
-_PWD=`pwd`
-ARROW_CPP_BUILD_DIR=$_PWD/arrow/cpp/hdfs-integration-build
-PARQUET_CPP_BUILD_DIR=$_PWD/parquet-cpp/hdfs-integration-build
-
-# Run tests
-export LIBHDFS3_CONF=$_PWD/arrow/dev/hdfs_integration/libhdfs3-client-config.xml
-
-function cleanup {
-    rm -rf $ARROW_CPP_BUILD_DIR
-    rm -rf $PARQUET_CPP_BUILD_DIR
-    pushd $_PWD/arrow/python
-    git clean -fdx .
-    popd
-}
-
-trap cleanup EXIT
-
-# Install arrow-cpp
-mkdir -p $ARROW_CPP_BUILD_DIR
-pushd $ARROW_CPP_BUILD_DIR
-
-cmake -GNinja \
-      -DCMAKE_BUILD_TYPE=$ARROW_BUILD_TYPE \
-      -DCMAKE_INSTALL_PREFIX=$ARROW_HOME \
-      -DARROW_PYTHON=ON \
-      -DARROW_PLASMA=ON \
-      -DARROW_HDFS=ON \
-      -DARROW_BUILD_TESTS=ON \
-      -DCMAKE_CXX_FLAGS=$CXXFLAGS \
-      ..
-ninja
-ninja install
-
-# Run C++ unit tests
-debug/io-hdfs-test
-
-popd
-
-# Install parquet-cpp
-mkdir -p $PARQUET_CPP_BUILD_DIR
-pushd $PARQUET_CPP_BUILD_DIR
-
-cmake -GNinja \
-      -DCMAKE_BUILD_TYPE=$ARROW_BUILD_TYPE \
-      -DCMAKE_INSTALL_PREFIX=$PARQUET_HOME \
-      -DPARQUET_BUILD_BENCHMARKS=OFF \
-      -DPARQUET_BUILD_EXECUTABLES=OFF \
-      -DPARQUET_BUILD_TESTS=OFF \
-      -DCMAKE_CXX_FLAGS=$CXXFLAGS \
-      ..
-ninja
-ninja install
-
-popd
-
-# Install pyarrow
-pushd arrow/python
-
-# Clear the build directory so we are guaranteed a fresh set of extensions
-rm -rf build/
-
-python setup.py build_ext \
-    --build-type=$ARROW_BUILD_TYPE \
-    --with-parquet \
-    --with-plasma \
-    --inplace
-
-# Python
-python -m pytest -vv -r sxX -s pyarrow \
-       --only-parquet --only-hdfs
-
-popd
diff --git a/docker-compose.yml b/docker-compose.yml
new file mode 100644
index 0000000..a96d8ba
--- /dev/null
+++ b/docker-compose.yml
@@ -0,0 +1,121 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+version: '3.5'
+services:
+
+  # Layer caching for go, rust and js can be improved further by adding the
+  # dependency manifests (Cargo.toml etc.) explicitly in an early layer, which
+  # prevents reinstalling the dependencies on each source modification
+
+  ######################### Language Containers ###############################
+
+  c_glib:
+    image: arrow:c_glib
+    build:
+      context: .
+      dockerfile: c_glib/Dockerfile
+
+  cpp:
+    image: arrow:cpp
+    shm_size: 2G
+    build:
+      context: .
+      dockerfile: cpp/Dockerfile
+
+  go:
+    image: arrow:go
+    build:
+      context: .
+      dockerfile: go/Dockerfile
+
+  java:
+    image: arrow:java
+    build:
+      context: .
+      dockerfile: java/Dockerfile
+
+  js:
+    image: arrow:js
+    build:
+      context: .
+      dockerfile: js/Dockerfile
+
+  python:
+    image: arrow:python-${PYTHON_VERSION:-3.6}
+    shm_size: 2G
+    build:
+      context: .
+      dockerfile: python/Dockerfile
+      args:
+        PYTHON_VERSION: ${PYTHON_VERSION:-3.6}
+
+  #TODO(kszucs): R
+
+  rust:
+    image: arrow:rust
+    build:
+      context: .
+      dockerfile: rust/Dockerfile
+
+  ######################### Integration Tests #################################
+
+  # impala:
+  #   image: cpcloud86/impala:java8-1
+  #   ports:
+  #     - "21050"
+  #   hostname: impala
+
+  hdfs-namenode:
+    image: gelog/hadoop
+    shm_size: 2G
+    ports:
+      - "9000:9000"
+      - "50070:50070"
+    command: hdfs namenode
+    hostname: hdfs-namenode
+
+  hdfs-datanode:
+    image: gelog/hadoop
+    command: hdfs datanode
+    ports:
+      # The host port is randomly assigned by Docker, to allow scaling
+      # to multiple DataNodes on the same host
+      - "50075"
+    links:
+      - hdfs-namenode:hdfs-namenode
+
+  hdfs-integration:
+    links:
+      - hdfs-namenode:hdfs-namenode
+      - hdfs-datanode:hdfs-datanode
+    environment:
+      - ARROW_HDFS_TEST_HOST=hdfs-namenode
+      - ARROW_HDFS_TEST_PORT=9000
+      - ARROW_HDFS_TEST_USER=root
+    build:
+      context: .
+      dockerfile: integration/hdfs/Dockerfile
+
+  # TODO(kszucs): dask-integration
+  # TODO(kszucs): hive-integration
+  # TODO(kszucs): spark-integration
+
+  ######################### Documentation #####################################
+
+  # TODO(kszucs): site
+  # TODO(kszucs): apidoc
diff --git a/go/Dockerfile b/go/Dockerfile
new file mode 100644
index 0000000..860f7d6
--- /dev/null
+++ b/go/Dockerfile
@@ -0,0 +1,25 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+FROM golang
+
+ADD go /arrow/go
+WORKDIR /arrow/go/arrow
+
+RUN go get -d -t -v ./... && \
+    go install -v ./...
+CMD go test
diff --git a/integration/hdfs/Dockerfile b/integration/hdfs/Dockerfile
new file mode 100644
index 0000000..87d4e31
--- /dev/null
+++ b/integration/hdfs/Dockerfile
@@ -0,0 +1,74 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+FROM gelog/hadoop
+
+RUN apt-get update && \
+    apt-get install -y \
+        gcc \
+        g++ \
+        git \
+        wget \
+        pkg-config \
+        ninja-build
+
+ENV CC=gcc \
+    CXX=g++ \
+    PATH=/opt/conda/bin:$PATH \
+    CONDA_PREFIX=/opt/conda
+
+# install dependencies
+ARG PYTHON_VERSION=3.6
+ADD ci/docker_install_conda.sh \
+    ci/conda_env_cpp.yml \
+    ci/conda_env_python.yml \
+    /arrow/ci/
+RUN arrow/ci/docker_install_conda.sh && \
+    conda install -c conda-forge \
+        --file arrow/ci/conda_env_cpp.yml \
+        --file arrow/ci/conda_env_python.yml \
+        python=$PYTHON_VERSION && \
+    conda clean --all
+
+# Installing these packages in the previous step pulls in boost=1.60 and
+# boost-cpp=1.67; cmake then finds 1.60 and parquet fails to compile.
+# Installing them in a separate step pulls in boost=1.60 and boost-cpp=1.64,
+# and cmake finds 1.64.
+# libhdfs3 needs to be pinned, see ARROW-1465 and ARROW-1445
+RUN conda install -y -c conda-forge hdfs3 libhdfs3=2.2.31 && \
+    conda clean --all
+
+# build cpp with tests
+ENV ARROW_HDFS=ON \
+    ARROW_PYTHON=ON \
+    ARROW_BUILD_TESTS=ON \
+    LD_LIBRARY_PATH="${LD_LIBRARY_PATH}:${HADOOP_HOME}/lib/native"
+ADD ci/docker_build_cpp.sh /arrow/ci/
+ADD cpp /arrow/cpp
+ADD format /arrow/format
+ADD java/pom.xml /arrow/java/pom.xml
+RUN arrow/ci/docker_build_cpp.sh
+
+# build python
+ADD ci/docker_build_python.sh /arrow/ci/
+ADD python /arrow/python
+RUN arrow/ci/docker_build_python.sh
+
+# execute integration tests
+ENV LIBHDFS3_CONF=/arrow/integration/hdfs/libhdfs3.xml
+ADD integration /arrow/integration
+CMD arrow/integration/hdfs/runtest.sh
diff --git a/dev/hdfs_integration/libhdfs3-client-config.xml b/integration/hdfs/libhdfs3.xml
similarity index 100%
rename from dev/hdfs_integration/libhdfs3-client-config.xml
rename to integration/hdfs/libhdfs3.xml
diff --git a/integration/hdfs/runtest.sh b/integration/hdfs/runtest.sh
new file mode 100755
index 0000000..12fd85f
--- /dev/null
+++ b/integration/hdfs/runtest.sh
@@ -0,0 +1,30 @@
+#!/usr/bin/env bash
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+set -e
+
+export CLASSPATH=`$HADOOP_HOME/bin/hadoop classpath --glob`
+
+pushd arrow/cpp/build
+  debug/io-hdfs-test
+popd
+
+pushd arrow/python
+  python -m pytest -vv -r sxX -s --only-parquet --only-hdfs pyarrow
+popd
diff --git a/java/Dockerfile b/java/Dockerfile
new file mode 100644
index 0000000..96c4d4c
--- /dev/null
+++ b/java/Dockerfile
@@ -0,0 +1,30 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+FROM maven
+
+ADD header /arrow/
+ADD format /arrow/format
+ADD integration/data /arrow/integration/data
+ADD java /arrow/java
+WORKDIR /arrow/java
+
+# build
+RUN mvn -DskipTests=true -Dcheckstyle.skip=true -B install
+
+# test
+CMD mvn test
diff --git a/js/Dockerfile b/js/Dockerfile
new file mode 100644
index 0000000..6f44917
--- /dev/null
+++ b/js/Dockerfile
@@ -0,0 +1,32 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+FROM node
+
+# install dependencies
+ADD js/.npmrc js/package.json /arrow/js/
+WORKDIR /arrow/js
+RUN npm install -g npm@latest && \
+    npm install
+
+# build
+ADD LICENSE.txt /arrow/
+ADD NOTICE.txt /arrow/
+ADD js /arrow/js
+RUN npm run lint && npm run build
+
+CMD npm run test
diff --git a/python/Dockerfile b/python/Dockerfile
new file mode 100644
index 0000000..f143cca
--- /dev/null
+++ b/python/Dockerfile
@@ -0,0 +1,62 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+FROM ubuntu:18.04
+
+RUN apt-get update && \
+    apt-get install -y \
+        gcc \
+        g++ \
+        git \
+        wget \
+        pkg-config \
+        ninja-build
+
+ENV CC=gcc \
+    CXX=g++ \
+    PATH=/opt/conda/bin:$PATH \
+    CONDA_PREFIX=/opt/conda
+
+# install dependencies
+ARG PYTHON_VERSION=3.6
+ADD ci/docker_install_conda.sh \
+    ci/conda_env_cpp.yml \
+    ci/conda_env_python.yml \
+    /arrow/ci/
+RUN arrow/ci/docker_install_conda.sh && \
+    conda install -c conda-forge \
+        --file arrow/ci/conda_env_cpp.yml \
+        --file arrow/ci/conda_env_python.yml \
+        python=$PYTHON_VERSION && \
+    conda clean --all
+
+# build cpp without tests
+ENV ARROW_PYTHON=ON \
+    ARROW_BUILD_TESTS=OFF
+ADD ci/docker_build_cpp.sh /arrow/ci/
+ADD cpp /arrow/cpp
+ADD format /arrow/format
+ADD java/pom.xml /arrow/java/pom.xml
+RUN arrow/ci/docker_build_cpp.sh
+
+# build python
+ADD ci/docker_build_python.sh /arrow/ci/
+ADD python /arrow/python
+RUN arrow/ci/docker_build_python.sh
+
+WORKDIR arrow/python
+CMD pytest -v pyarrow
diff --git a/python/pyarrow/tests/test_parquet.py b/python/pyarrow/tests/test_parquet.py
index e7970bb..4e50e64 100644
--- a/python/pyarrow/tests/test_parquet.py
+++ b/python/pyarrow/tests/test_parquet.py
@@ -1442,14 +1442,15 @@ def _test_read_common_metadata_files(fs, base_path):
         'values': np.random.randn(N)
     }, columns=['index', 'values'])
 
-    data_path = base_path / 'data.parquet'
+    base_path = str(base_path)
+    data_path = os.path.join(base_path, 'data.parquet')
 
     table = pa.Table.from_pandas(df)
 
     with fs.open(data_path, 'wb') as f:
         _write_table(table, f)
 
-    metadata_path = base_path / '_common_metadata'
+    metadata_path = os.path.join(base_path, '_common_metadata')
     with fs.open(metadata_path, 'wb') as f:
         pq.write_metadata(table.schema, f)
 
diff --git a/rust/Dockerfile b/rust/Dockerfile
new file mode 100644
index 0000000..6ad57c3
--- /dev/null
+++ b/rust/Dockerfile
@@ -0,0 +1,24 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+FROM rust
+
+ADD rust /arrow/rust
+WORKDIR /arrow/rust
+
+RUN cargo build
+CMD cargo test