Posted to commits@arrow.apache.org by we...@apache.org on 2018/08/06 19:40:15 UTC

[arrow] branch master updated (1e2a069 -> 551e9ce)

This is an automated email from the ASF dual-hosted git repository.

wesm pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/arrow.git.


    omit 1e2a069  ARROW-2813: [CI] [Followup] Disable gcov output in Travis-CI logs
    omit 4d682a6  ARROW-2988: Improve Windows release verification script to be more automated
    omit 0b654ce  ARROW-2061: [C++] Make tests a bit faster with Valgrind
    omit d9d1f6b  ARROW-2815: [CI] Skip Java tests and style checks on C++ job [skip appveyor]
    omit 992b27f  ARROW-2982: Ensure release verification script works with wget < 1.16, build ORC in C++ libraries
    omit ad7bbbd  ARROW-2951: [CI] Don't skip AppVeyor build on format-only changes
    omit da7a48e  ARROW-2990: [GLib] Support building with rpath-ed Arrow C++ on macOS
    omit ae95780  ARROW-2985: [Ruby] Add support for verifying RC
    omit f8ba33d  ARROW-2869: [Python] Add documentation for Array.to_numpy
    omit 8cbaf44  ARROW-2977: [Packaging] Release verification script should check rust too
    omit 889e1e6  ARROW-2978: [Rust] Change argument to rust fmt to fix build
    omit 9101292  ARROW-2480: [C++] Enable casting the value of a decimal to int32_t or int64_t
    omit 41bb85b  ARROW-2962: [Packaging] Bintray descriptor files are no longer needed
    omit 7a6144e  ARROW-2666: [Python] Add __array__ method to Array, ChunkedArray, Column
    omit 5b45c66  ARROW-2813: [CI] Mute uninformative lcov warnings
     add 446dd45  [Release] Update CHANGELOG.md for 0.10.0
     add d38bc66  [Release] Update .deb/.rpm changelogs for 0.10.0
     add 07f142d  [maven-release-plugin] prepare release apache-arrow-0.10.0
     new 0f5fb20  ARROW-2813: [CI] Mute uninformative lcov warnings
     new ef933a6  ARROW-2666: [Python] Add __array__ method to Array, ChunkedArray, Column
     new 0c29673  ARROW-2962: [Packaging] Bintray descriptor files are no longer needed
     new 495bf36  ARROW-2480: [C++] Enable casting the value of a decimal to int32_t or int64_t
     new 1b2a42e  ARROW-2978: [Rust] Change argument to rust fmt to fix build
     new 7c953a0  ARROW-2977: [Packaging] Release verification script should check rust too
     new de50744  ARROW-2869: [Python] Add documentation for Array.to_numpy
     new 072fa77  ARROW-2985: [Ruby] Add support for verifying RC
     new 00aed05  ARROW-2990: [GLib] Support building with rpath-ed Arrow C++ on macOS
     new 91eab98  ARROW-2951: [CI] Don't skip AppVeyor build on format-only changes
     new ea9157a  ARROW-2982: Ensure release verification script works with wget < 1.16, build ORC in C++ libraries
     new e10f2b3  ARROW-2815: [CI] Skip Java tests and style checks on C++ job [skip appveyor]
     new d3c9c1d  ARROW-2061: [C++] Make tests a bit faster with Valgrind
     new 71145cd  ARROW-2988: Improve Windows release verification script to be more automated
     new 551e9ce  ARROW-2813: [CI] [Followup] Disable gcov output in Travis-CI logs

This update added new revisions after undoing existing revisions.
That is to say, some revisions that were in the old version of the
branch are not in the new version.  This situation occurs
when a user --force pushes a change and generates a repository
containing something like this:

 * -- * -- B -- O -- O -- O   (1e2a069)
            \
             N -- N -- N   refs/heads/master (551e9ce)

You should already have received notification emails for all of the O
revisions, and so the following emails describe only the N revisions
from the common base, B.

Any revisions marked "omit" are not gone; other references still
refer to them.  Any revisions marked "discard" are gone forever.

The 15 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.
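The omit/discard distinction above comes down to reachability in the commit
graph: a commit is only "gone forever" once no reference can reach it. A toy
sketch of that idea (commit names and the extra `refs/old` reference are
illustrative, not taken from this repository):

```python
# Toy model: commits form a DAG, refs point at tips, and a commit survives
# as long as some ref (or the reflog) can still reach it.

def reachable(parents, tips):
    """Return the set of commits reachable from the given ref tips."""
    seen, stack = set(), list(tips)
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents.get(c, []))
    return seen

# B is the common base; O1-O3 were force-pushed away, N1-N3 replaced them.
parents = {
    "O1": ["B"], "O2": ["O1"], "O3": ["O2"],
    "N1": ["B"], "N2": ["N1"], "N3": ["N2"],
}
refs = {"refs/heads/master": "N3", "refs/old": "O3"}

# While another ref still points at O3, the O commits are "omit": not gone.
print("O3" in reachable(parents, refs.values()))   # True

# Drop that ref and the O commits become unreachable ("discard").
del refs["refs/old"]
print("O3" in reachable(parents, refs.values()))   # False
```

In real git the same check is done over all refs plus reflog entries, and
unreachable objects are only deleted later by `git gc`.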


Summary of changes:
 CHANGELOG.md                                       | 463 +++++++++++++++++++++
 .../linux-packages/debian.ubuntu-trusty/changelog  |   6 +
 dev/tasks/linux-packages/debian/changelog          |   6 +
 dev/tasks/linux-packages/yum/arrow.spec.in         |   3 +
 java/adapter/jdbc/pom.xml                          |   2 +-
 java/format/pom.xml                                |   2 +-
 java/memory/pom.xml                                |   2 +-
 java/plasma/pom.xml                                |   2 +-
 java/pom.xml                                       |   4 +-
 java/tools/pom.xml                                 |   2 +-
 java/vector/pom.xml                                |   2 +-
 11 files changed, 486 insertions(+), 8 deletions(-)


[arrow] 06/15: ARROW-2977: [Packaging] Release verification script should check rust too

Posted by we...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

wesm pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/arrow.git

commit 7c953a01e84e14f23bbfd3f5bc649afc28c4b649
Author: Krisztián Szűcs <sz...@gmail.com>
AuthorDate: Sun Aug 5 16:06:04 2018 -0400

    ARROW-2977: [Packaging] Release verification script should check rust too
    
    I've found a couple of issues with the verification scripts:
    1. The standalone js verification script seems obsolete
    2. The windows script only checks arrow-cpp, parquet-cpp and pyarrow
    3. The windows script doesn't create conda env
    
    For the next release it'd be nice to have consistent scripts on each platform (c_glib requires additional configuration on OSX).
    
    Author: Krisztián Szűcs <sz...@gmail.com>
    
    Closes #2369 from kszucs/ARROW-2977 and squashes the following commits:
    
    5e323c9b <Krisztián Szűcs> remove comments
    a59ca508 <Krisztián Szűcs> setup rustup and test rust library
---
 dev/release/verify-release-candidate.sh | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/dev/release/verify-release-candidate.sh b/dev/release/verify-release-candidate.sh
index 74ec61c..eedec46 100755
--- a/dev/release/verify-release-candidate.sh
+++ b/dev/release/verify-release-candidate.sh
@@ -225,6 +225,29 @@ test_js() {
   popd
 }
 
+test_rust() {
+  # install rust toolchain in a similar fashion to test-miniconda
+  export RUSTUP_HOME=`pwd`/test-rustup
+  export CARGO_HOME=`pwd`/test-rustup
+
+  curl https://sh.rustup.rs -sSf | sh -s -- -y
+  source $RUSTUP_HOME/env
+
+  # build and test rust
+  pushd rust
+
+  # raises on any formatting errors (disabled, because RC1 has a couple)
+  # rustup component add rustfmt-preview
+  # cargo fmt --all -- --check
+  # raises on any warnings
+  cargo rustc -- -D warnings
+
+  cargo build
+  cargo test
+
+  popd
+}
+
 # Build and test Java (Requires newer Maven -- I used 3.3.9)
 
 test_package_java() {
@@ -286,6 +309,7 @@ test_integration
 test_glib
 install_parquet_cpp
 test_python
+test_rust
 
 echo 'Release candidate looks good!'
 exit 0


[arrow] 02/15: ARROW-2666: [Python] Add __array__ method to Array, ChunkedArray, Column

Posted by we...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

wesm pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/arrow.git

commit ef933a642ecbd00591735acb353db4ea9f74060c
Author: Pedro M. Duarte <pm...@gmail.com>
AuthorDate: Sat Aug 4 16:00:40 2018 -0400

    ARROW-2666: [Python] Add __array__ method to Array, ChunkedArray, Column
    
    Implement `__array__` method on `pyarrow.Array`, `pyarrow.ChunkedArray` and `pyarrow.Column` so that the `to_pandas()` method is used when calling `numpy.asarray` on an instance of these classes.
    
    Currently `numpy.asarray` falls back to using the iterator interface so we get numpy object arrays of the underlying pyarrow scalar value type.
    
    Author: Pedro M. Duarte <pm...@gmail.com>
    
    Closes #2365 from PedroMDuarte/asarray and squashes the following commits:
    
    71f9e291 <Pedro M. Duarte> Improve inline comment
    6eac2685 <Pedro M. Duarte> Add __array__ method to Array, ChunkedArray, Column
---
 python/pyarrow/array.pxi           |  5 ++++
 python/pyarrow/table.pxi           | 17 +++++++++---
 python/pyarrow/tests/test_array.py | 29 ++++++++++++++++++++
 python/pyarrow/tests/test_table.py | 56 +++++++++++++++++++++++++++++++++++---
 4 files changed, 99 insertions(+), 8 deletions(-)

diff --git a/python/pyarrow/array.pxi b/python/pyarrow/array.pxi
index d59bb05..513fa86 100644
--- a/python/pyarrow/array.pxi
+++ b/python/pyarrow/array.pxi
@@ -652,6 +652,11 @@ cdef class Array:
                                               self, &out))
         return wrap_array_output(out)
 
+    def __array__(self, dtype=None):
+        if dtype is None:
+            return self.to_pandas()
+        return self.to_pandas().astype(dtype)
+
     def to_numpy(self):
         """
         EXPERIMENTAL: Construct a NumPy view of this array. Only supports
diff --git a/python/pyarrow/table.pxi b/python/pyarrow/table.pxi
index 9a8a875..e056843 100644
--- a/python/pyarrow/table.pxi
+++ b/python/pyarrow/table.pxi
@@ -147,11 +147,12 @@ cdef class ChunkedArray:
                   c_bool zero_copy_only=False,
                   c_bool integer_object_nulls=False):
         """
-        Convert the arrow::Column to a pandas.Series
+        Convert the arrow::ChunkedArray to an array object suitable for use
+        in pandas
 
-        Returns
-        -------
-        pandas.Series
+        See also
+        --------
+        Column.to_pandas
         """
         cdef:
             PyObject* out
@@ -171,6 +172,11 @@ cdef class ChunkedArray:
 
         return wrap_array_output(out)
 
+    def __array__(self, dtype=None):
+        if dtype is None:
+            return self.to_pandas()
+        return self.to_pandas().astype(dtype)
+
     def dictionary_encode(self):
         """
         Compute dictionary-encoded representation of array
@@ -517,6 +523,9 @@ cdef class Column:
 
         return result
 
+    def __array__(self, dtype=None):
+        return self.data.__array__(dtype=dtype)
+
     def equals(self, Column other):
         """
         Check if contents of two columns are equal
diff --git a/python/pyarrow/tests/test_array.py b/python/pyarrow/tests/test_array.py
index af2708f..425fe09 100644
--- a/python/pyarrow/tests/test_array.py
+++ b/python/pyarrow/tests/test_array.py
@@ -156,6 +156,35 @@ def test_to_pandas_zero_copy():
         np_arr.sum()
 
 
+def test_asarray():
+    arr = pa.array(range(4))
+
+    # The iterator interface gives back an array of Int64Value's
+    np_arr = np.asarray([_ for _ in arr])
+    assert np_arr.tolist() == [0, 1, 2, 3]
+    assert np_arr.dtype == np.dtype('O')
+    assert type(np_arr[0]) == pa.lib.Int64Value
+
+    # Calling with the arrow array gives back an array with 'int64' dtype
+    np_arr = np.asarray(arr)
+    assert np_arr.tolist() == [0, 1, 2, 3]
+    assert np_arr.dtype == np.dtype('int64')
+
+    # An optional type can be specified when calling np.asarray
+    np_arr = np.asarray(arr, dtype='str')
+    assert np_arr.tolist() == ['0', '1', '2', '3']
+
+    # If PyArrow array has null values, numpy type will be changed as needed
+    # to support nulls.
+    arr = pa.array([0, 1, 2, None])
+    assert arr.type == pa.int64()
+    np_arr = np.asarray(arr)
+    elements = np_arr.tolist()
+    assert elements[:3] == [0., 1., 2.]
+    assert np.isnan(elements[3])
+    assert np_arr.dtype == np.dtype('float64')
+
+
 def test_array_getitem():
     arr = pa.array(range(10, 15))
     lst = arr.to_pylist()
diff --git a/python/pyarrow/tests/test_table.py b/python/pyarrow/tests/test_table.py
index 69086e0..cc672fc 100644
--- a/python/pyarrow/tests/test_table.py
+++ b/python/pyarrow/tests/test_table.py
@@ -160,6 +160,48 @@ def test_chunked_array_pickle(data, typ):
     assert result.equals(array)
 
 
+def test_chunked_array_to_pandas():
+    data = [
+        pa.array([-10, -5, 0, 5, 10])
+    ]
+    table = pa.Table.from_arrays(data, names=['a'])
+    chunked_arr = table.column(0).data
+    assert isinstance(chunked_arr, pa.ChunkedArray)
+    array = chunked_arr.to_pandas()
+    assert array.shape == (5,)
+    assert array[0] == -10
+
+
+def test_chunked_array_asarray():
+    data = [
+        pa.array([0]),
+        pa.array([1, 2, 3])
+    ]
+    chunked_arr = pa.chunked_array(data)
+
+    np_arr = np.asarray(chunked_arr)
+    assert np_arr.tolist() == [0, 1, 2, 3]
+    assert np_arr.dtype == np.dtype('int64')
+
+    # An optional type can be specified when calling np.asarray
+    np_arr = np.asarray(chunked_arr, dtype='str')
+    assert np_arr.tolist() == ['0', '1', '2', '3']
+
+    # Types are modified when there are nulls
+    data = [
+        pa.array([1, None]),
+        pa.array([1, 2, 3])
+    ]
+    chunked_arr = pa.chunked_array(data)
+
+    np_arr = np.asarray(chunked_arr)
+    elements = np_arr.tolist()
+    assert elements[0] == 1.
+    assert np.isnan(elements[1])
+    assert elements[2:] == [1., 2., 3.]
+    assert np_arr.dtype == np.dtype('float64')
+
+
 def test_column_basics():
     data = [
         pa.array([-10, -5, 0, 5, 10])
@@ -219,14 +261,20 @@ def test_column_to_pandas():
     assert series.iloc[0] == -10
 
 
-def test_chunked_array_to_pandas():
+def test_column_asarray():
     data = [
         pa.array([-10, -5, 0, 5, 10])
     ]
     table = pa.Table.from_arrays(data, names=['a'])
-    array = table.column(0).data.to_pandas()
-    assert array.shape == (5,)
-    assert array[0] == -10
+    column = table.column(0)
+
+    np_arr = np.asarray(column)
+    assert np_arr.tolist() == [-10, -5, 0, 5, 10]
+    assert np_arr.dtype == np.dtype('int64')
+
+    # An optional type can be specified when calling np.asarray
+    np_arr = np.asarray(column, dtype='str')
+    assert np_arr.tolist() == ['-10', '-5', '0', '5', '10']
 
 
 def test_column_flatten():
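The `__array__` hook added by this commit can be illustrated with a minimal
stand-in class. `ToyArrowArray` and its `to_pandas_like` helper are
hypothetical simplifications for illustration only, not pyarrow's actual
implementation (which delegates to `to_pandas()`):

```python
import numpy as np

class ToyArrowArray:
    """Minimal stand-in for pyarrow.Array (illustrative only)."""

    def __init__(self, values):
        self._values = list(values)

    def to_pandas_like(self):
        # Stands in for Array.to_pandas(), which yields a NumPy-backed result.
        return np.array(self._values)

    def __array__(self, dtype=None, copy=None):
        # Same pattern as the patch: convert first, then cast only if a
        # dtype was requested. (copy is accepted for NumPy 2 compatibility.)
        if dtype is None:
            return self.to_pandas_like()
        return self.to_pandas_like().astype(dtype)

arr = ToyArrowArray([0, 1, 2, 3])
np_arr = np.asarray(arr)                # dispatches to __array__
str_arr = np.asarray(arr, dtype='str')  # dtype is forwarded to __array__
print(np_arr.tolist(), str_arr.tolist())
```

Without `__array__`, `np.asarray` would fall back to the iterator interface
and produce an object-dtype array of scalar wrappers, which is exactly what
the tests above guard against.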


[arrow] 14/15: ARROW-2988: Improve Windows release verification script to be more automated

Posted by we...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

wesm pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/arrow.git

commit 71145cdbdc0c2d717ca3a6a4f8189c6cbcad38e5
Author: Wes McKinney <we...@apache.org>
AuthorDate: Mon Aug 6 14:42:29 2018 -0400

    ARROW-2988: Improve Windows release verification script to be more automated
    
    * Downloads tarball from SVN dist
    * Creates ephemeral conda environment automatically
    
    I am adding instructions to https://cwiki.apache.org/confluence/display/ARROW to help others verify releases on Windows.
    
    Author: Wes McKinney <we...@apache.org>
    
    Closes #2373 from wesm/ARROW-2988 and squashes the following commits:
    
    1a52a48c <Wes McKinney> Revamp Windows release verification script
---
 dev/release/verify-release-candidate.bat | 73 ++++++++++++++++++--------------
 1 file changed, 42 insertions(+), 31 deletions(-)

diff --git a/dev/release/verify-release-candidate.bat b/dev/release/verify-release-candidate.bat
index bc05b23..86abbc6 100644
--- a/dev/release/verify-release-candidate.bat
+++ b/dev/release/verify-release-candidate.bat
@@ -15,24 +15,8 @@
 @rem specific language governing permissions and limitations
 @rem under the License.
 
-@rem To use this script, first create the following conda environment. Change
-@rem the Python version if so desired. You can also omit one or more of the
-@rem libray build rem dependencies if you want to build them from source as well
-@rem
-
-@rem set PYTHON=3.6
-@rem conda create -n arrow-verify-release -f -q -y python=%PYTHON%
-@rem conda install -y ^
-@rem       six pytest setuptools numpy pandas cython ^
-@rem       thrift-cpp flatbuffers rapidjson ^
-@rem       cmake ^
-@rem       git ^
-@rem       boost-cpp ^
-@rem       snappy zlib brotli gflags lz4-c zstd -c conda-forge || exit /B
-
-@rem Then run from the directory containing the RC tarball
-@rem
-@rem verify-release-candidate.bat apache-arrow-%VERSION%
+@rem To run the script:
+@rem verify-release-candidate.bat VERSION RC_NUM
 
 @echo on
 
@@ -40,17 +24,40 @@ if not exist "C:\tmp\" mkdir C:\tmp
 if exist "C:\tmp\arrow-verify-release" rd C:\tmp\arrow-verify-release /s /q
 if not exist "C:\tmp\arrow-verify-release" mkdir C:\tmp\arrow-verify-release
 
-tar xvf %1.tar.gz -C "C:/tmp/"
+set _VERIFICATION_DIR=C:\tmp\arrow-verify-release
+set _VERIFICATION_DIR_UNIX=C:/tmp/arrow-verify-release
+set _VERIFICATION_CONDA_ENV=%_VERIFICATION_DIR%\conda-env
+set _DIST_URL=https://dist.apache.org/repos/dist/dev/arrow
+set _TARBALL=apache-arrow-%1.tar.gz
+set ARROW_SOURCE=%_VERIFICATION_DIR%\apache-arrow-%1
+set INSTALL_DIR=%_VERIFICATION_DIR%\install
+
+@rem Requires GNU Wget for Windows
+wget -O %_TARBALL% %_DIST_URL%/apache-arrow-%1-rc%2/%_TARBALL%
+
+tar xvf %_TARBALL% -C %_VERIFICATION_DIR_UNIX%
+
+set PYTHON=3.6
+
+@rem Using call with conda.bat seems necessary to avoid terminating the batch
+@rem script execution
+call conda create -p %_VERIFICATION_CONDA_ENV% -f -q -y python=%PYTHON% || exit /B
+
+call activate %_VERIFICATION_CONDA_ENV%
+
+call conda install -y ^
+      six pytest setuptools numpy pandas cython ^
+      thrift-cpp flatbuffers rapidjson ^
+      cmake ^
+      git ^
+      boost-cpp ^
+      snappy zlib brotli gflags lz4-c zstd -c conda-forge
 
 set GENERATOR=Visual Studio 14 2015 Win64
 set CONFIGURATION=release
-set ARROW_SOURCE=C:\tmp\%1
-set INSTALL_DIR=C:\tmp\%1\install
 
 pushd %ARROW_SOURCE%
 
-call activate arrow-verify-release
-
 set ARROW_BUILD_TOOLCHAIN=%CONDA_PREFIX%\Library
 set PARQUET_BUILD_TOOLCHAIN=%CONDA_PREFIX%\Library
 
@@ -59,14 +66,17 @@ set PARQUET_HOME=%INSTALL_DIR%
 set PATH=%INSTALL_DIR%\bin;%PATH%
 
 @rem Build and test Arrow C++ libraries
-mkdir cpp\build
-pushd cpp\build
+mkdir %ARROW_SOURCE%\cpp\build
+pushd %ARROW_SOURCE%\cpp\build
+
+@rem This is the path for Visual Studio Community 2017
+call "C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\Common7\Tools\VsDevCmd.bat" -arch=amd64
 
 cmake -G "%GENERATOR%" ^
       -DCMAKE_INSTALL_PREFIX=%ARROW_HOME% ^
       -DARROW_BOOST_USE_SHARED=OFF ^
       -DCMAKE_BUILD_TYPE=%CONFIGURATION% ^
-      -DARROW_CXXFLAGS="/WX /MP" ^
+      -DARROW_CXXFLAGS="/MP" ^
       -DARROW_PYTHON=ON ^
       ..  || exit /B
 cmake --build . --target INSTALL --config %CONFIGURATION%  || exit /B
@@ -79,13 +89,13 @@ popd
 
 @rem Build parquet-cpp
 git clone https://github.com/apache/parquet-cpp.git || exit /B
-mkdir parquet-cpp\build
-pushd parquet-cpp\build
+mkdir %ARROW_SOURCE%\parquet-cpp\build
+pushd %ARROW_SOURCE%\parquet-cpp\build
 
 cmake -G "%GENERATOR%" ^
      -DCMAKE_INSTALL_PREFIX=%PARQUET_HOME% ^
      -DCMAKE_BUILD_TYPE=%CONFIGURATION% ^
-     -DPARQUET_BOOST_USE_SHARED=OFF ^
+     -DPARQUET_BOOST_USE_SHARED=OFF ^
      -DPARQUET_BUILD_TESTS=off .. || exit /B
 cmake --build . --target INSTALL --config %CONFIGURATION% || exit /B
 popd
@@ -93,10 +103,11 @@ popd
 @rem Build and import pyarrow
 @rem parquet-cpp has some additional runtime dependencies that we need to figure out
 @rem see PARQUET-1018
-pushd python
+pushd %ARROW_SOURCE%\python
 
-set PYARROW_CXXFLAGS=/WX
 python setup.py build_ext --inplace --with-parquet --bundle-arrow-cpp bdist_wheel  || exit /B
 py.test pyarrow -v -s --parquet || exit /B
 
 popd
+
+call deactivate


[arrow] 01/15: ARROW-2813: [CI] Mute uninformative lcov warnings

Posted by we...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

wesm pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/arrow.git

commit 0f5fb20ca896b5b3aacfe7c67f8df0385acea6d6
Author: Antoine Pitrou <an...@python.org>
AuthorDate: Fri Aug 3 22:23:10 2018 -0400

    ARROW-2813: [CI] Mute uninformative lcov warnings
    
    Author: Antoine Pitrou <an...@python.org>
    
    Closes #2367 from pitrou/ARROW-2813-mute-lcov-output and squashes the following commits:
    
    19a4f661 <Antoine Pitrou> ARROW-2813:  Mute uninformative lcov warnings
---
 ci/travis_script_cpp.sh    | 3 ++-
 ci/travis_script_python.sh | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/ci/travis_script_cpp.sh b/ci/travis_script_cpp.sh
index eedca98..3a6b2f7 100755
--- a/ci/travis_script_cpp.sh
+++ b/ci/travis_script_cpp.sh
@@ -30,6 +30,7 @@ popd
 # Capture C++ coverage info (we wipe the build dir in travis_script_python.sh)
 if [ "$ARROW_TRAVIS_COVERAGE" == "1" ]; then
     pushd $TRAVIS_BUILD_DIR
-    lcov --quiet --directory . --capture --no-external --output-file $ARROW_CPP_COVERAGE_FILE
+    lcov --quiet --directory . --capture --no-external --output-file $ARROW_CPP_COVERAGE_FILE \
+        2>&1 | grep -v "WARNING: no data found for /usr/include"
     popd
 fi
diff --git a/ci/travis_script_python.sh b/ci/travis_script_python.sh
index 0743f86..53dd36c 100755
--- a/ci/travis_script_python.sh
+++ b/ci/travis_script_python.sh
@@ -155,7 +155,8 @@ if [ "$ARROW_TRAVIS_COVERAGE" == "1" ]; then
     coverage xml -i -o $TRAVIS_BUILD_DIR/coverage.xml
     # Capture C++ coverage info and combine with previous coverage file
     pushd $TRAVIS_BUILD_DIR
-    lcov --quiet --directory . --capture --no-external --output-file coverage-python-tests.info
+    lcov --quiet --directory . --capture --no-external --output-file coverage-python-tests.info \
+        2>&1 | grep -v "WARNING: no data found for /usr/include"
     lcov --add-tracefile coverage-python-tests.info \
         --add-tracefile $ARROW_CPP_COVERAGE_FILE \
         --output-file $ARROW_CPP_COVERAGE_FILE


[arrow] 15/15: ARROW-2813: [CI] [Followup] Disable gcov output in Travis-CI logs

Posted by we...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

wesm pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/arrow.git

commit 551e9cec0f04c91963411c735f744346b1772ae1
Author: Antoine Pitrou <an...@python.org>
AuthorDate: Mon Aug 6 15:38:38 2018 -0400

    ARROW-2813: [CI] [Followup] Disable gcov output in Travis-CI logs
    
    We don't actually need codecov's gcov discovery, since we gather coverage ourselves using `lcov` in the CI scripts. This suppresses hundreds of lines of logs in Travis-CI's output.
    
    Author: Antoine Pitrou <an...@python.org>
    
    Closes #2379 from pitrou/codecov-disable-gcov-discovery and squashes the following commits:
    
    cc06becb <Antoine Pitrou>  Disable gcov output in Travis-CI logs
---
 ci/travis_upload_cpp_coverage.sh | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/ci/travis_upload_cpp_coverage.sh b/ci/travis_upload_cpp_coverage.sh
index 2b11c5e..38ea5d3 100755
--- a/ci/travis_upload_cpp_coverage.sh
+++ b/ci/travis_upload_cpp_coverage.sh
@@ -25,7 +25,7 @@ pushd $TRAVIS_BUILD_DIR
 
 # Display C++ coverage summary
 lcov --list $ARROW_CPP_COVERAGE_FILE
-# Upload report to CodeCov
-bash <(curl -s https://codecov.io/bash) || echo "Codecov did not collect coverage reports"
+# Upload report to CodeCov, disabling gcov discovery to save time and avoid warnings
+bash <(curl -s https://codecov.io/bash) -X gcov || echo "Codecov did not collect coverage reports"
 
 popd


[arrow] 12/15: ARROW-2815: [CI] Skip Java tests and style checks on C++ job [skip appveyor]

Posted by we...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

wesm pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/arrow.git

commit e10f2b3c15c426c879924529ec944222b9e576f5
Author: Antoine Pitrou <an...@python.org>
AuthorDate: Mon Aug 6 19:10:08 2018 +0200

    ARROW-2815: [CI] Skip Java tests and style checks on C++ job [skip appveyor]
    
    This omits all warning and debug logs from previous Maven output.
    
    Author: Antoine Pitrou <an...@python.org>
    
    Closes #2378 from pitrou/ARROW-2815-strip-java-logging and squashes the following commits:
    
    603db64 <Antoine Pitrou> ARROW-2815:  Skip Java tests and style checks on C++ job
---
 .travis.yml              | 4 ++++
 ci/travis_script_java.sh | 7 ++++++-
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/.travis.yml b/.travis.yml
index f14b86f..a1f5699 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -23,6 +23,9 @@ services:
 
 cache:
   ccache: true
+  directories:
+    - $HOME/.m2  # Maven
+
 
 before_install:
   # Common pre-install steps for all builds
@@ -57,6 +60,7 @@ matrix:
     - ARROW_TRAVIS_PYTHON_DOCS=1
     - ARROW_BUILD_WARNING_LEVEL=CHECKIN
     - ARROW_TRAVIS_PYTHON_JVM=1
+    - ARROW_TRAVIS_JAVA_BUILD_ONLY=1
     - CC="clang-6.0"
     - CXX="clang++-6.0"
     before_script:
diff --git a/ci/travis_script_java.sh b/ci/travis_script_java.sh
index a8ad94c..9553dd5 100755
--- a/ci/travis_script_java.sh
+++ b/ci/travis_script_java.sh
@@ -24,6 +24,11 @@ JAVA_DIR=${TRAVIS_BUILD_DIR}/java
 pushd $JAVA_DIR
 
 export MAVEN_OPTS="$MAVEN_OPTS -Dorg.slf4j.simpleLogger.defaultLogLevel=warn"
-mvn -B install
+if [ $ARROW_TRAVIS_JAVA_BUILD_ONLY == "1" ]; then
+    # Save time and make build less verbose by skipping tests and style checks
+    mvn -DskipTests=true -Dcheckstyle.skip=true -B install
+else
+    mvn -B install
+fi
 
 popd


[arrow] 07/15: ARROW-2869: [Python] Add documentation for Array.to_numpy

Posted by we...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

wesm pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/arrow.git

commit de50744e207bd98ab8d775b5fca42d9a29a0dd1f
Author: Antoine Pitrou <an...@python.org>
AuthorDate: Sun Aug 5 16:09:58 2018 -0400

    ARROW-2869: [Python] Add documentation for Array.to_numpy
    
    Author: Antoine Pitrou <an...@python.org>
    
    Closes #2351 from pitrou/ARROW-2869-document-numpy and squashes the following commits:
    
    2792dc84 <Antoine Pitrou> Fix renamed reference
    8cb89989 <Antoine Pitrou> Revert "Capitalize Pandas"
    34d8c36e <Antoine Pitrou> Capitalize Pandas
    395231e0 <Antoine Pitrou> Address review comments
    347ca4e7 <Antoine Pitrou> ARROW-2869:  Add documentation for Array.to_numpy
---
 python/doc/Makefile             |  2 +-
 python/doc/source/api.rst       |  4 +--
 python/doc/source/data.rst      |  4 +--
 python/doc/source/extending.rst |  2 +-
 python/doc/source/index.rst     |  5 +--
 python/doc/source/numpy.rst     | 75 +++++++++++++++++++++++++++++++++++++++++
 python/doc/source/pandas.rst    | 16 ++++++---
 python/doc/source/plasma.rst    |  2 +-
 python/pyarrow/array.pxi        | 17 ++++++----
 9 files changed, 106 insertions(+), 21 deletions(-)

diff --git a/python/doc/Makefile b/python/doc/Makefile
index eacb124..5798f27 100644
--- a/python/doc/Makefile
+++ b/python/doc/Makefile
@@ -20,7 +20,7 @@
 #
 
 # You can set these variables from the command line.
-SPHINXOPTS    = -j4
+SPHINXOPTS    = -j8 -W
 SPHINXBUILD   = sphinx-build
 PAPER         =
 BUILDDIR      = _build
diff --git a/python/doc/source/api.rst b/python/doc/source/api.rst
index cb99933..23eae92 100644
--- a/python/doc/source/api.rst
+++ b/python/doc/source/api.rst
@@ -139,7 +139,7 @@ Scalar Value Types
 
 .. _api.array:
 
-.. currentmodule:: pyarrow.lib
+.. currentmodule:: pyarrow
 
 Array Types
 -----------
@@ -299,7 +299,7 @@ Memory Pools
 
 .. _api.type_classes:
 
-.. currentmodule:: pyarrow.lib
+.. currentmodule:: pyarrow
 
 Type Classes
 ------------
diff --git a/python/doc/source/data.rst b/python/doc/source/data.rst
index 3f4169c..f54cba1 100644
--- a/python/doc/source/data.rst
+++ b/python/doc/source/data.rst
@@ -401,8 +401,8 @@ for one or more arrays of the same type.
    c.data.num_chunks
    c.data.chunk(0)
 
-As you'll see in the :ref:`pandas section <pandas>`, we can convert these
-objects to contiguous NumPy arrays for use in pandas:
+As you'll see in the :ref:`pandas section <pandas_interop>`, we can convert
+these objects to contiguous NumPy arrays for use in pandas:
 
 .. ipython:: python
 
diff --git a/python/doc/source/extending.rst b/python/doc/source/extending.rst
index a471fb3..e3d8707 100644
--- a/python/doc/source/extending.rst
+++ b/python/doc/source/extending.rst
@@ -15,7 +15,7 @@
 .. specific language governing permissions and limitations
 .. under the License.
 
-.. currentmodule:: pyarrow.lib
+.. currentmodule:: pyarrow
 .. _extending:
 
 Using pyarrow from C++ and Cython Code
diff --git a/python/doc/source/index.rst b/python/doc/source/index.rst
index c35f20b..8af795d 100644
--- a/python/doc/source/index.rst
+++ b/python/doc/source/index.rst
@@ -15,8 +15,8 @@
 .. specific language governing permissions and limitations
 .. under the License.
 
-Apache Arrow (Python)
-=====================
+Python bindings for Apache Arrow
+================================
 
 Apache Arrow is a cross-language development platform for in-memory data. It
 specifies a standardized language-independent columnar memory format for flat
@@ -45,6 +45,7 @@ structures.
    ipc
    filesystems
    plasma
+   numpy
    pandas
    parquet
    extending
diff --git a/python/doc/source/numpy.rst b/python/doc/source/numpy.rst
new file mode 100644
index 0000000..303e182
--- /dev/null
+++ b/python/doc/source/numpy.rst
@@ -0,0 +1,75 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+.. or more contributor license agreements.  See the NOTICE file
+.. distributed with this work for additional information
+.. regarding copyright ownership.  The ASF licenses this file
+.. to you under the Apache License, Version 2.0 (the
+.. "License"); you may not use this file except in compliance
+.. with the License.  You may obtain a copy of the License at
+
+..   http://www.apache.org/licenses/LICENSE-2.0
+
+.. Unless required by applicable law or agreed to in writing,
+.. software distributed under the License is distributed on an
+.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+.. KIND, either express or implied.  See the License for the
+.. specific language governing permissions and limitations
+.. under the License.
+
+.. _numpy_interop:
+
+Using PyArrow with NumPy
+========================
+
+PyArrow allows converting back and forth from
+`NumPy <https://www.numpy.org/>`_ arrays to Arrow :ref:`Arrays <data.array>`.
+
+NumPy to Arrow
+--------------
+
+To convert a NumPy array to Arrow, one can simply call the :func:`pyarrow.array`
+factory function.
+
+.. code-block:: pycon
+
+   >>> import numpy as np
+   >>> import pyarrow as pa
+   >>> data = np.arange(10, dtype='int16')
+   >>> arr = pa.array(data)
+   >>> arr
+   <pyarrow.lib.Int16Array object at 0x7fb1d1e6ae58>
+   [
+     0,
+     1,
+     2,
+     3,
+     4,
+     5,
+     6,
+     7,
+     8,
+     9
+   ]
+
+Converting from NumPy supports a wide range of input dtypes, including
+structured dtypes or strings.
+
+Arrow to NumPy
+--------------
+
+In the reverse direction, it is possible to produce a view of an Arrow Array
+for use with NumPy using the :meth:`~pyarrow.Array.to_numpy` method.
+This is limited to primitive types for which NumPy has the same physical
+representation as Arrow, and assuming the Arrow data has no nulls.
+
+.. code-block:: pycon
+
+   >>> import numpy as np
+   >>> import pyarrow as pa
+   >>> arr = pa.array([4, 5, 6], type=pa.int32())
+   >>> view = arr.to_numpy()
+   >>> view
+   array([4, 5, 6], dtype=int32)
+
+For more complex data types, you have to use the :meth:`~pyarrow.Array.to_pandas`
+method (which will construct a NumPy array with Pandas semantics for, e.g.,
+representation of null values).
diff --git a/python/doc/source/pandas.rst b/python/doc/source/pandas.rst
index 7699b13..be11b5b 100644
--- a/python/doc/source/pandas.rst
+++ b/python/doc/source/pandas.rst
@@ -15,24 +15,30 @@
 .. specific language governing permissions and limitations
 .. under the License.
 
-.. _pandas:
+.. _pandas_interop:
 
 Using PyArrow with pandas
 =========================
 
-To interface with pandas, PyArrow provides various conversion routines to
-consume pandas structures and convert back to them.
+To interface with `pandas <https://pandas.pydata.org/>`_, PyArrow provides
+various conversion routines to consume pandas structures and convert back
+to them.
+
+.. note::
+   While pandas uses NumPy as a backend, it has enough peculiarities
+   (such as a different type system and support for null values) that this
+   is a separate topic from :ref:`numpy_interop`.
 
 DataFrames
 ----------
 
-The equivalent to a pandas DataFrame in Arrow is a :class:`pyarrow.table.Table`.
+The equivalent to a pandas DataFrame in Arrow is a :ref:`Table <data.table>`.
 Both consist of a set of named columns of equal length. While pandas only
 supports flat columns, the Table also provides nested columns, so it can
 represent data that a DataFrame cannot; hence a full conversion is not always possible.
 
 Conversion from a Table to a DataFrame is done by calling
-:meth:`pyarrow.table.Table.to_pandas`. The inverse is then achieved by using
+:meth:`pyarrow.Table.to_pandas`. The inverse is then achieved by using
 :meth:`pyarrow.Table.from_pandas`.
 
 .. code-block:: python
diff --git a/python/doc/source/plasma.rst b/python/doc/source/plasma.rst
index b64b4c2..6adc470 100644
--- a/python/doc/source/plasma.rst
+++ b/python/doc/source/plasma.rst
@@ -291,7 +291,7 @@ process of storing an object in the Plasma store, however one cannot directly
 write the ``DataFrame`` to Plasma with Pandas alone. Plasma also needs to know
 the size of the ``DataFrame`` to allocate a buffer for.
 
-See :ref:`pandas` for more information on using Arrow with Pandas.
+See :ref:`pandas_interop` for more information on using Arrow with pandas.
 
 You can create the pyarrow equivalent of a pandas ``DataFrame`` by using
 ``pyarrow.RecordBatch.from_pandas`` to convert it to a ``RecordBatch``.
diff --git a/python/pyarrow/array.pxi b/python/pyarrow/array.pxi
index 513fa86..5906965 100644
--- a/python/pyarrow/array.pxi
+++ b/python/pyarrow/array.pxi
@@ -620,7 +620,7 @@ cdef class Array:
                   c_bool zero_copy_only=False,
                   c_bool integer_object_nulls=False):
         """
-        Convert to an array object suitable for use in pandas
+        Convert to a NumPy array object suitable for use in pandas.
 
         Parameters
         ----------
@@ -659,14 +659,13 @@ cdef class Array:
 
     def to_numpy(self):
         """
-        EXPERIMENTAL: Construct a NumPy view of this array. Only supports
-        primitive arrays with the same memory layout as NumPy (i.e. integers,
-        floating point) without any nulls.
+        Experimental: return a NumPy view of this array. Only primitive
+        arrays with the same memory layout as NumPy (i.e. integers,
+        floating point), without any nulls, are supported.
 
         Returns
         -------
-        arr : numpy.ndarray
-
+        array : numpy.ndarray
         """
         if self.null_count:
             raise NotImplementedError('NumPy array view is only supported '
@@ -681,7 +680,11 @@ cdef class Array:
 
     def to_pylist(self):
         """
-        Convert to an list of native Python objects.
+        Convert to a list of native Python objects.
+
+        Returns
+        -------
+        lst : list
         """
         return [x.as_py() for x in self]
 


[arrow] 11/15: ARROW-2982: Ensure release verification script works with wget < 1.16, build ORC in C++ libraries

Posted by we...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

wesm pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/arrow.git

commit ea9157a6c6fb6da3516f1a53b80e3436a82cc2c1
Author: Wes McKinney <we...@apache.org>
AuthorDate: Mon Aug 6 08:18:59 2018 -0400

    ARROW-2982: Ensure release verification script works with wget < 1.16, build ORC in C++ libraries
    
    I also wrote a guide to setting up Ubuntu Linux (14.04 and higher) to be able to run from a cold start. I may have missed some stuff from a brand new install; others can keep updating. Eventually we should Dockerize for Ubuntu 14.04, 16.04, and 18.04: https://cwiki.apache.org/confluence/display/ARROW/How+to+Verify+Release+Candidates
    
    Author: Wes McKinney <we...@apache.org>
    
    Closes #2372 from wesm/ARROW-2982 and squashes the following commits:
    
    0ffdd15b <Wes McKinney> Ensure script works with older wget
---
 dev/release/verify-release-candidate.sh | 17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/dev/release/verify-release-candidate.sh b/dev/release/verify-release-candidate.sh
index 05b0a43..220a79b 100755
--- a/dev/release/verify-release-candidate.sh
+++ b/dev/release/verify-release-candidate.sh
@@ -76,13 +76,17 @@ fetch_archive() {
 }
 
 verify_binary_artifacts() {
+  # --show-progress not supported on wget < 1.16
+  wget --help | grep -q '\--show-progress' && \
+      _WGET_PROGRESS_OPT="-q --show-progress" || _WGET_PROGRESS_OPT=""
+
   # download the binaries folder for the current RC
   rcname=apache-arrow-${VERSION}-rc${RC_NUMBER}
   wget -P "$rcname" \
     --quiet \
     --no-host-directories \
     --cut-dirs=5 \
-    --show-progress \
+    $_WGET_PROGRESS_OPT \
     --no-parent \
     --reject 'index.html*' \
     --recursive "$ARROW_DIST_URL/$rcname/binaries/"
@@ -144,8 +148,9 @@ test_and_install_cpp() {
 
   cmake -DCMAKE_INSTALL_PREFIX=$ARROW_HOME \
         -DCMAKE_INSTALL_LIBDIR=$ARROW_HOME/lib \
-        -DARROW_PLASMA=on \
-        -DARROW_PYTHON=on \
+        -DARROW_PLASMA=ON \
+        -DARROW_ORC=ON \
+        -DARROW_PYTHON=ON \
         -DARROW_BOOST_USE_SHARED=on \
         -DCMAKE_BUILD_TYPE=release \
         -DARROW_BUILD_BENCHMARKS=on \
@@ -323,12 +328,12 @@ cd ${DIST_NAME}
 test_package_java
 setup_miniconda
 test_and_install_cpp
-test_js
-test_integration
-test_glib
 install_parquet_cpp
 test_python
+test_glib
 test_ruby
+test_js
+test_integration
 test_rust
 
 echo 'Release candidate looks good!'


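The wget change above is an instance of a common pattern: probe a tool's help output for a flag before passing it. A rough Python rendition of the same check (the helper name is our own, not part of the release script):

```python
import subprocess

def supports_flag(cmd, flag):
    """Return True if `cmd --help` mentions `flag`, mirroring the
    `wget --help | grep -q -- --show-progress` probe in the script."""
    try:
        result = subprocess.run([cmd, "--help"], capture_output=True,
                                text=True, timeout=10)
    except OSError:
        return False  # command not installed
    return flag in result.stdout + result.stderr

# Only pass --show-progress when the local wget understands it
progress_opt = (["-q", "--show-progress"]
                if supports_flag("wget", "--show-progress") else [])
```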
[arrow] 09/15: ARROW-2990: [GLib] Support building with rpath-ed Arrow C++ on macOS

Posted by we...@apache.org.

wesm pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/arrow.git

commit 00aed053fd77e5c5e17d83e36d85a82a1b738fa0
Author: Kouhei Sutou <ko...@clear-code.com>
AuthorDate: Mon Aug 6 08:16:37 2018 -0400

    ARROW-2990: [GLib] Support building with rpath-ed Arrow C++ on macOS
    
    Author: Kouhei Sutou <ko...@clear-code.com>
    
    Closes #2374 from kou/glib-macos and squashes the following commits:
    
    c8b5c453 <Kouhei Sutou>  Support building with rpath-ed Arrow C++ on macOS
---
 c_glib/arrow-glib/Makefile.am     | 19 ++++++++++---------
 c_glib/arrow-gpu-glib/Makefile.am | 28 +++++++++++++++++-----------
 c_glib/configure.ac               |  2 ++
 3 files changed, 29 insertions(+), 20 deletions(-)

diff --git a/c_glib/arrow-glib/Makefile.am b/c_glib/arrow-glib/Makefile.am
index 0eef0d4..e557964 100644
--- a/c_glib/arrow-glib/Makefile.am
+++ b/c_glib/arrow-glib/Makefile.am
@@ -242,14 +242,6 @@ if HAVE_INTROSPECTION
 INTROSPECTION_GIRS =
 INTROSPECTION_SCANNER_ARGS =
 INTROSPECTION_SCANNER_ENV =
-if USE_ARROW_BUILD_DIR
-INTROSPECTION_SCANNER_ENV +=			\
-	LD_LIBRARY_PATH=$(ARROW_LIB_DIR):$${LD_LIBRARY_PATH}
-endif
-if OS_MACOS
-INTROSPECTION_SCANNER_ENV +=			\
-	ARCHFLAGS=
-endif
 INTROSPECTION_COMPILER_ARGS =
 
 Arrow-1.0.gir: libarrow-glib.la
@@ -261,12 +253,21 @@ Arrow_1_0_gir_INCLUDES =			\
 	Gio-2.0
 Arrow_1_0_gir_CFLAGS =				\
 	$(AM_CPPFLAGS)
-Arrow_1_0_gir_LIBS = libarrow-glib.la
+Arrow_1_0_gir_LIBS =
 Arrow_1_0_gir_FILES = $(libarrow_glib_la_sources)
 Arrow_1_0_gir_SCANNERFLAGS =			\
+	--library-path=$(ARROW_LIB_DIR)		\
 	--warn-all				\
 	--identifier-prefix=GArrow		\
 	--symbol-prefix=garrow
+if OS_MACOS
+Arrow_1_0_gir_LIBS += arrow-glib
+Arrow_1_0_gir_SCANNERFLAGS +=			\
+	--no-libtool				\
+	--library-path=$(abs_builddir)/.libs
+else
+Arrow_1_0_gir_LIBS += libarrow-glib.la
+endif
 INTROSPECTION_GIRS += Arrow-1.0.gir
 
 girdir = $(datadir)/gir-1.0
diff --git a/c_glib/arrow-gpu-glib/Makefile.am b/c_glib/arrow-gpu-glib/Makefile.am
index 1e1c02a..2ed9665 100644
--- a/c_glib/arrow-gpu-glib/Makefile.am
+++ b/c_glib/arrow-gpu-glib/Makefile.am
@@ -78,10 +78,6 @@ else
 INTROSPECTION_SCANNER_ENV +=			\
 	PKG_CONFIG_PATH=${abs_builddir}/../arrow-glib:$${PKG_CONFIG_PATH}
 endif
-if OS_MACOS
-INTROSPECTION_SCANNER_ENV +=			\
-	ARCHFLAGS=
-endif
 INTROSPECTION_COMPILER_ARGS =			\
 	--includedir=$(abs_builddir)/../arrow-glib
 
@@ -95,20 +91,30 @@ ArrowGPU_1_0_gir_INCLUDES =			\
 ArrowGPU_1_0_gir_CFLAGS =			\
 	$(AM_CPPFLAGS)
 ArrowGPU_1_0_gir_LDFLAGS =
-if USE_ARROW_BUILD_DIR
-ArrowGPU_1_0_gir_LDFLAGS +=			\
-	-L$(ARROW_LIB_DIR)
-endif
-ArrowGPU_1_0_gir_LIBS =					\
-	$(abs_builddir)/../arrow-glib/libarrow-glib.la	\
-	libarrow-gpu-glib.la
+ArrowGPU_1_0_gir_LIBS =
 ArrowGPU_1_0_gir_FILES =			\
 	$(libarrow_gpu_glib_la_sources)
 ArrowGPU_1_0_gir_SCANNERFLAGS =					\
+	--library-path=$(ARROW_LIB_DIR)				\
 	--warn-all						\
 	--add-include-path=$(abs_builddir)/../arrow-glib	\
 	--identifier-prefix=GArrowGPU				\
 	--symbol-prefix=garrow_gpu
+if OS_MACOS
+ArrowGPU_1_0_gir_LIBS +=			\
+	 arrow-glib				\
+	 arrow-gpu-glib
+ArrowGPU_1_0_gir_SCANNERFLAGS +=				\
+	--no-libtool						\
+	--library-path=$(abs_builddir)/../arrow-glib/.libs	\
+	--library-path=$(abs_builddir)/.libs
+else
+ArrowGPU_1_0_gir_LIBS +=				\
+	$(abs_builddir)/../arrow-glib/libarrow-glib.la	\
+	libarrow-gpu-glib.la
+endif
+
+					\
 INTROSPECTION_GIRS += ArrowGPU-1.0.gir
 
 girdir = $(datadir)/gir-1.0
diff --git a/c_glib/configure.ac b/c_glib/configure.ac
index 6692927..6368170 100644
--- a/c_glib/configure.ac
+++ b/c_glib/configure.ac
@@ -115,6 +115,8 @@ if test "x$GARROW_ARROW_CPP_BUILD_DIR" = "x"; then
   USE_ARROW_BUILD_DIR=no
 
   PKG_CHECK_MODULES([ARROW], [arrow arrow-compute])
+  _PKG_CONFIG(ARROW_LIB_DIR, [variable=libdir], [arrow])
+  ARROW_LIB_DIR="$pkg_cv_ARROW_LIB_DIR"
   PKG_CHECK_MODULES([ARROW_ORC],
                     [arrow-orc],
                     [HAVE_ARROW_ORC=yes],


[arrow] 08/15: ARROW-2985: [Ruby] Add support for verifying RC

Posted by we...@apache.org.

wesm pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/arrow.git

commit 072fa775d8fd54d4f3b6aa185fb85f91b79a1876
Author: Kouhei Sutou <ko...@clear-code.com>
AuthorDate: Mon Aug 6 08:15:34 2018 -0400

    ARROW-2985: [Ruby] Add support for verifying RC
    
    Author: Kouhei Sutou <ko...@clear-code.com>
    
    Closes #2376 from kou/verify-ruby and squashes the following commits:
    
    f3e1fb7a <Kouhei Sutou>  Add support for verifying RC
---
 dev/release/verify-release-candidate.sh | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/dev/release/verify-release-candidate.sh b/dev/release/verify-release-candidate.sh
index eedec46..05b0a43 100755
--- a/dev/release/verify-release-candidate.sh
+++ b/dev/release/verify-release-candidate.sh
@@ -225,6 +225,25 @@ test_js() {
   popd
 }
 
+test_ruby() {
+  export GI_TYPELIB_PATH=$ARROW_HOME/lib/girepository-1.0:$GI_TYPELIB_PATH
+
+  pushd ruby
+
+  pushd red-arrow
+  bundle install --path vendor/bundle
+  bundle exec ruby test/run-test.rb
+  popd
+
+  # TODO: Arrow GPU related tests
+  # pushd red-arrow-gpu
+  # bundle install --path vendor/bundle
+  # bundle exec ruby test/run-test.rb
+  # popd
+
+  popd
+}
+
 test_rust() {
   # install rust toolchain in a similar fashion like test-miniconda
   export RUSTUP_HOME=`pwd`/test-rustup
@@ -309,6 +328,7 @@ test_integration
 test_glib
 install_parquet_cpp
 test_python
+test_ruby
 test_rust
 
 echo 'Release candidate looks good!'


[arrow] 03/15: ARROW-2962: [Packaging] Bintray descriptor files are no longer needed

Posted by we...@apache.org.

wesm pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/arrow.git

commit 0c29673824bc388f51266174ca83a457f8820f79
Author: Krisztián Szűcs <sz...@gmail.com>
AuthorDate: Sat Aug 4 18:07:55 2018 -0400

    ARROW-2962: [Packaging] Bintray descriptor files are no longer needed
    
    Wait for [build-302](https://github.com/kszucs/crossbow/branches/all?utf8=%E2%9C%93&query=build-302) to pass
    
    Author: Krisztián Szűcs <sz...@gmail.com>
    
    Closes #2357 from kszucs/ARROW-2962 and squashes the following commits:
    
    7445b8cb <Krisztián Szűcs> remove bintray descriptors
    8baabdac <Krisztián Szűcs> don't update descriptor in rake task
---
 dev/tasks/linux-packages/apt/descriptor.json | 45 ----------------------------
 dev/tasks/linux-packages/package-task.rb     | 10 -------
 dev/tasks/linux-packages/yum/descriptor.json | 22 --------------
 3 files changed, 77 deletions(-)

diff --git a/dev/tasks/linux-packages/apt/descriptor.json b/dev/tasks/linux-packages/apt/descriptor.json
deleted file mode 100644
index d45ed85..0000000
--- a/dev/tasks/linux-packages/apt/descriptor.json
+++ /dev/null
@@ -1,45 +0,0 @@
-{
-    "package": {
-        "name": "APT",
-        "repo": "apache-arrow-apt",
-        "subject": "kou",
-        "licenses": ["Apache-2.0"],
-        "vcs_url": "htttps://github.com/apache/arrow.git"
-    },
-    "version": {
-        "name": "dev"
-    },
-    "files": [
-        {
-            "includePattern": "dev/tasks/linux-packages/apt/repositories/([^/]+)/pool/stretch/main/a/apache-arrow/([^/]+\\.deb)\\z",
-            "uploadPattern": "pool/stretch/main/$2",
-            "matrixParams": {
-                "deb_distribution": "stretch",
-                "deb_component": "main",
-                "deb_architecture": "amd64",
-                "override": 1
-            }
-        },
-        {
-            "includePattern": "dev/tasks/linux-packages/apt/repositories/([^/]+)/pool/trusty/universe/a/apache-arrow/([^/]+\\.deb)\\z",
-            "uploadPattern": "pool/trusty/universe/$2",
-            "matrixParams": {
-                "deb_distribution": "trusty",
-                "deb_component": "universe",
-                "deb_architecture": "amd64",
-                "override": 1
-            }
-        },
-        {
-            "includePattern": "dev/tasks/linux-packages/apt/repositories/([^/]+)/pool/xenial/universe/a/apache-arrow/([^/]+\\.deb)\\z",
-            "uploadPattern": "pool/xenial/universe/$2",
-            "matrixParams": {
-                "deb_distribution": "xenial",
-                "deb_component": "universe",
-                "deb_architecture": "amd64",
-                "override": 1
-            }
-        }
-    ],
-    "publish": true
-}
diff --git a/dev/tasks/linux-packages/package-task.rb b/dev/tasks/linux-packages/package-task.rb
index 29468a1..b8f25ae 100644
--- a/dev/tasks/linux-packages/package-task.rb
+++ b/dev/tasks/linux-packages/package-task.rb
@@ -266,7 +266,6 @@ VERSION=#{@deb_upstream_version}
       task :update do
         update_debian_changelog
         update_spec
-        update_descriptor
       end
     end
   end
@@ -325,13 +324,4 @@ VERSION=#{@deb_upstream_version}
     end
   end
 
-  def update_descriptor
-    Dir.glob("**/descriptor.json") do |descriptor_json|
-      update_content(descriptor_json) do |content|
-        content = content.sub(/"name": "\d+\.\d+\.\d+.*?"/) do
-          "\"name\": \"#{@version}\""
-        end
-      end
-    end
-  end
 end
diff --git a/dev/tasks/linux-packages/yum/descriptor.json b/dev/tasks/linux-packages/yum/descriptor.json
deleted file mode 100644
index b025b17..0000000
--- a/dev/tasks/linux-packages/yum/descriptor.json
+++ /dev/null
@@ -1,22 +0,0 @@
-{
-    "package": {
-        "name": "Yum",
-        "repo": "apache-arrow-yum",
-        "subject": "kou",
-        "licenses": ["Apache-2.0"],
-        "vcs_url": "htttps://github.com/apache/arrow.git"
-    },
-    "version": {
-        "name": "dev"
-    },
-    "files": [
-        {
-            "includePattern": "cpp-linux/yum/repositories/(centos)/([^/]+)/([^/]+)/[^/]+/([^/]+\\.rpm)",
-            "uploadPattern": "$1/$2/$3/$4",
-            "matrixParams": {
-                "override": 1
-            }
-        }
-    ],
-    "publish": true
-}


[arrow] 05/15: ARROW-2978: [Rust] Change argument to rust fmt to fix build

Posted by we...@apache.org.

wesm pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/arrow.git

commit 1b2a42e563c6b6f5e8e72144303e0dfcb168300f
Author: Andy Grove <an...@gmail.com>
AuthorDate: Sun Aug 5 16:01:31 2018 -0400

    ARROW-2978: [Rust] Change argument to rust fmt to fix build
    
    Work in progress... trying to fix the build.
    
    Author: Andy Grove <an...@gmail.com>
    
    Closes #2371 from andygrove/fix_rust_ci_failure and squashes the following commits:
    
    94c12773 <Andy Grove> Update code formatting to keep latest version of rust fmt happy
    1b3e72d9 <Andy Grove> Change argument to rust fmt to fix build
---
 ci/travis_script_rust.sh |  2 +-
 rust/src/array.rs        |  7 ++-----
 rust/src/buffer.rs       | 15 ++++++++++-----
 rust/src/datatypes.rs    | 13 ++++++++-----
 4 files changed, 21 insertions(+), 16 deletions(-)

diff --git a/ci/travis_script_rust.sh b/ci/travis_script_rust.sh
index ff12483..f85820f 100755
--- a/ci/travis_script_rust.sh
+++ b/ci/travis_script_rust.sh
@@ -25,7 +25,7 @@ pushd $RUST_DIR
 
 # raises on any formatting errors
 rustup component add rustfmt-preview
-cargo fmt --all -- --write-mode=diff
+cargo fmt --all -- --check
 # raises on any warnings
 cargo rustc -- -D warnings
 
diff --git a/rust/src/array.rs b/rust/src/array.rs
index e418518..1c4322c 100644
--- a/rust/src/array.rs
+++ b/rust/src/array.rs
@@ -19,9 +19,9 @@
 use std::any::Any;
 use std::convert::From;
 use std::ops::Add;
-use std::sync::Arc;
 use std::str;
 use std::string::String;
+use std::sync::Arc;
 
 use super::bitmap::Bitmap;
 use super::buffer::*;
@@ -453,12 +453,9 @@ mod tests {
     fn test_access_array_concurrently() {
         let a = PrimitiveArray::from(Buffer::from(vec![5, 6, 7, 8, 9]));
 
-        let ret = thread::spawn(move || {
-            a.iter().collect::<Vec<i32>>()
-        }).join();
+        let ret = thread::spawn(move || a.iter().collect::<Vec<i32>>()).join();
 
         assert!(ret.is_ok());
         assert_eq!(vec![5, 6, 7, 8, 9], ret.ok().unwrap());
     }
 }
-
diff --git a/rust/src/buffer.rs b/rust/src/buffer.rs
index 0fdc2c5..bdc3601 100644
--- a/rust/src/buffer.rs
+++ b/rust/src/buffer.rs
@@ -190,7 +190,8 @@ mod tests {
     fn test_buffer_eq() {
         let a = Buffer::from(vec![1, 2, 3, 4, 5]);
         let b = Buffer::from(vec![5, 4, 3, 2, 1]);
-        let c = a.iter()
+        let c = a
+            .iter()
             .zip(b.iter())
             .map(|(a, b)| a == b)
             .collect::<Vec<bool>>();
@@ -201,7 +202,8 @@ mod tests {
     fn test_buffer_lt() {
         let a = Buffer::from(vec![1, 2, 3, 4, 5]);
         let b = Buffer::from(vec![5, 4, 3, 2, 1]);
-        let c = a.iter()
+        let c = a
+            .iter()
             .zip(b.iter())
             .map(|(a, b)| a < b)
             .collect::<Vec<bool>>();
@@ -212,7 +214,8 @@ mod tests {
     fn test_buffer_gt() {
         let a = Buffer::from(vec![1, 2, 3, 4, 5]);
         let b = Buffer::from(vec![5, 4, 3, 2, 1]);
-        let c = a.iter()
+        let c = a
+            .iter()
             .zip(b.iter())
             .map(|(a, b)| a > b)
             .collect::<Vec<bool>>();
@@ -223,7 +226,8 @@ mod tests {
     fn test_buffer_add() {
         let a = Buffer::from(vec![1, 2, 3, 4, 5]);
         let b = Buffer::from(vec![5, 4, 3, 2, 1]);
-        let c = a.iter()
+        let c = a
+            .iter()
             .zip(b.iter())
             .map(|(a, b)| a + b)
             .collect::<Vec<i32>>();
@@ -234,7 +238,8 @@ mod tests {
     fn test_buffer_multiply() {
         let a = Buffer::from(vec![1, 2, 3, 4, 5]);
         let b = Buffer::from(vec![5, 4, 3, 2, 1]);
-        let c = a.iter()
+        let c = a
+            .iter()
             .zip(b.iter())
             .map(|(a, b)| a * b)
             .collect::<Vec<i32>>();
diff --git a/rust/src/datatypes.rs b/rust/src/datatypes.rs
index d4849da..2adec0b 100644
--- a/rust/src/datatypes.rs
+++ b/rust/src/datatypes.rs
@@ -278,11 +278,14 @@ impl Schema {
 
 impl fmt::Display for Schema {
     fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
-        f.write_str(&self.columns
-            .iter()
-            .map(|c| c.to_string())
-            .collect::<Vec<String>>()
-            .join(", "))
+        f.write_str(
+            &self
+                .columns
+                .iter()
+                .map(|c| c.to_string())
+                .collect::<Vec<String>>()
+                .join(", "),
+        )
     }
 }
 


[arrow] 04/15: ARROW-2480: [C++] Enable casting the value of a decimal to int32_t or int64_t

Posted by we...@apache.org.

wesm pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/arrow.git

commit 495bf36bedc8614dd49e309760b66f912987c800
Author: Antoine Pitrou <an...@python.org>
AuthorDate: Sat Aug 4 18:37:08 2018 -0400

    ARROW-2480: [C++] Enable casting the value of a decimal to int32_t or int64_t
    
    Author: Antoine Pitrou <an...@python.org>
    Author: Phillip Cloud <cp...@gmail.com>
    
    Closes #1917 from cpcloud/ARROW-2480 and squashes the following commits:
    
    456624e4 <Antoine Pitrou> Try to fix other compile error
    d9c2955a <Antoine Pitrou> Try to fix gcc 4.8 failure
    609efaec <Phillip Cloud> ARROW-2480:  Enable casting the value of a decimal to int32_t or int64_t
---
 cpp/src/arrow/util/decimal-test.cc | 26 ++++++++++++++++++++++++++
 cpp/src/arrow/util/decimal.h       | 18 ++++++++++++++++++
 2 files changed, 44 insertions(+)

diff --git a/cpp/src/arrow/util/decimal-test.cc b/cpp/src/arrow/util/decimal-test.cc
index 0877617..61884a1 100644
--- a/cpp/src/arrow/util/decimal-test.cc
+++ b/cpp/src/arrow/util/decimal-test.cc
@@ -436,4 +436,30 @@ TEST(Decimal128Test, TestFromBigEndianBadLength) {
   ASSERT_RAISES(Invalid, Decimal128::FromBigEndian(0, 17, &out));
 }
 
+TEST(Decimal128Test, TestToInteger) {
+  Decimal128 value1("1234");
+  int32_t out1;
+
+  Decimal128 value2("-1234");
+  int64_t out2;
+
+  ASSERT_OK(value1.ToInteger(&out1));
+  ASSERT_EQ(1234, out1);
+
+  ASSERT_OK(value1.ToInteger(&out2));
+  ASSERT_EQ(1234, out2);
+
+  ASSERT_OK(value2.ToInteger(&out1));
+  ASSERT_EQ(-1234, out1);
+
+  ASSERT_OK(value2.ToInteger(&out2));
+  ASSERT_EQ(-1234, out2);
+
+  Decimal128 invalid_int32(static_cast<int64_t>(std::pow(2, 31)));
+  ASSERT_RAISES(Invalid, invalid_int32.ToInteger(&out1));
+
+  Decimal128 invalid_int64("12345678912345678901");
+  ASSERT_RAISES(Invalid, invalid_int64.ToInteger(&out2));
+}
+
 }  // namespace arrow
diff --git a/cpp/src/arrow/util/decimal.h b/cpp/src/arrow/util/decimal.h
index b3180cb..7280362 100644
--- a/cpp/src/arrow/util/decimal.h
+++ b/cpp/src/arrow/util/decimal.h
@@ -20,11 +20,14 @@
 
 #include <array>
 #include <cstdint>
+#include <limits>
+#include <sstream>
 #include <string>
 #include <type_traits>
 
 #include "arrow/status.h"
 #include "arrow/util/macros.h"
+#include "arrow/util/type_traits.h"
 #include "arrow/util/visibility.h"
 
 namespace arrow {
@@ -134,6 +137,21 @@ class ARROW_EXPORT Decimal128 {
   /// \brief Convert Decimal128 from one scale to another
   Status Rescale(int32_t original_scale, int32_t new_scale, Decimal128* out) const;
 
+  /// \brief Convert to a signed integer
+  template <typename T, typename = EnableIfIsOneOf<T, int32_t, int64_t>>
+  Status ToInteger(T* out) const {
+    constexpr auto min_value = std::numeric_limits<T>::min();
+    constexpr auto max_value = std::numeric_limits<T>::max();
+    const auto& self = *this;
+    if (self < min_value || self > max_value) {
+      std::stringstream buf;
+      buf << "Invalid cast from Decimal128 to " << sizeof(T) << " byte integer";
+      return Status::Invalid(buf.str());
+    }
+    *out = static_cast<T>(low_bits_);
+    return Status::OK();
+  }
+
  private:
   int64_t high_bits_;
   uint64_t low_bits_;

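The `ToInteger` template above follows the standard safe-narrowing recipe: check the value against the target type's limits before casting. The same logic, sketched in Python with names of our own invention (Arrow's real implementation is the C++ template in the diff):

```python
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1
INT64_MIN, INT64_MAX = -2**63, 2**63 - 1

def to_integer(value, lo, hi):
    """Cast `value` to an integer bounded by [lo, hi], raising on
    overflow instead of silently truncating."""
    if value < lo or value > hi:
        raise ValueError("Invalid cast: %d is outside [%d, %d]"
                         % (value, lo, hi))
    return int(value)

# mirrors the TestToInteger cases in the diff
assert to_integer(1234, INT32_MIN, INT32_MAX) == 1234
assert to_integer(-1234, INT64_MIN, INT64_MAX) == -1234
```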

[arrow] 10/15: ARROW-2951: [CI] Don't skip AppVeyor build on format-only changes

Posted by we...@apache.org.

wesm pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/arrow.git

commit 91eab98976124b27cae457c3852915d053ad6178
Author: Antoine Pitrou <an...@python.org>
AuthorDate: Mon Aug 6 08:17:08 2018 -0400

    ARROW-2951: [CI] Don't skip AppVeyor build on format-only changes
    
    Author: Antoine Pitrou <an...@python.org>
    
    Closes #2375 from pitrou/ARROW-2951-appveyor-builds-format and squashes the following commits:
    
    8d813774 <Antoine Pitrou> ARROW-2951:  Don't skip AppVeyor build on format-only changes
---
 appveyor.yml | 1 +
 1 file changed, 1 insertion(+)

diff --git a/appveyor.yml b/appveyor.yml
index d62baf7..0e37033 100644
--- a/appveyor.yml
+++ b/appveyor.yml
@@ -24,6 +24,7 @@ only_commits:
     - appveyor.yml
     - ci/
     - cpp/
+    - format/
     - python/
     - rust/
 


[arrow] 13/15: ARROW-2061: [C++] Make tests a bit faster with Valgrind

Posted by we...@apache.org.

wesm pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/arrow.git

commit d3c9c1df257c991e04fdd3c10d328ec857d68f96
Author: Antoine Pitrou <an...@python.org>
AuthorDate: Mon Aug 6 14:41:50 2018 -0400

    ARROW-2061: [C++] Make tests a bit faster with Valgrind
    
    Saves around 80 seconds on Travis-CI.
    
    Author: Antoine Pitrou <an...@python.org>
    
    Closes #2377 from pitrou/ARROW-2061-valgrind-test-speed and squashes the following commits:
    
    43a1e0e1 <Antoine Pitrou> ARROW-2061:  Make tests a bit faster with Valgrind
---
 cpp/src/arrow/array-test.cc              |  5 +++-
 cpp/src/arrow/compute/compute-test.cc    | 51 +++++++++++++++++++-------------
 cpp/src/arrow/io/io-memory-test.cc       | 11 +++++--
 cpp/src/arrow/ipc/ipc-read-write-test.cc |  3 ++
 4 files changed, 47 insertions(+), 23 deletions(-)

diff --git a/cpp/src/arrow/array-test.cc b/cpp/src/arrow/array-test.cc
index b7bad67..8b78762 100644
--- a/cpp/src/arrow/array-test.cc
+++ b/cpp/src/arrow/array-test.cc
@@ -247,10 +247,13 @@ TEST_F(TestArray, TestIsNullIsValidNoNulls) {
 TEST_F(TestArray, BuildLargeInMemoryArray) {
 #ifdef NDEBUG
   const int64_t length = static_cast<int64_t>(std::numeric_limits<int32_t>::max()) + 1;
-#else
+#elif !defined(ARROW_VALGRIND)
   // use a smaller size since the insert function isn't optimized properly on debug and
   // the test takes a long time to complete
   const int64_t length = 2 << 24;
+#else
+  // use an even smaller size with valgrind
+  const int64_t length = 2 << 20;
 #endif
 
   BooleanBuilder builder;
diff --git a/cpp/src/arrow/compute/compute-test.cc b/cpp/src/arrow/compute/compute-test.cc
index 6a92844..ba5c935 100644
--- a/cpp/src/arrow/compute/compute-test.cc
+++ b/cpp/src/arrow/compute/compute-test.cc
@@ -1034,24 +1034,29 @@ TEST_F(TestHashKernel, DictEncodeBinary) {
 }
 
 TEST_F(TestHashKernel, BinaryResizeTable) {
-  const int64_t kTotalValues = 10000;
-  const int64_t kRepeats = 10;
+  const int32_t kTotalValues = 10000;
+#if !defined(ARROW_VALGRIND)
+  const int32_t kRepeats = 10;
+#else
+  // Mitigate Valgrind's slowness
+  const int32_t kRepeats = 3;
+#endif
 
   vector<std::string> values;
   vector<std::string> uniques;
   vector<int32_t> indices;
-  for (int64_t i = 0; i < kTotalValues * kRepeats; i++) {
-    int64_t index = i % kTotalValues;
-    std::stringstream ss;
-    ss << "test" << index;
-    std::string val = ss.str();
+  char buf[20] = "test";
 
-    values.push_back(val);
+  for (int32_t i = 0; i < kTotalValues * kRepeats; i++) {
+    int32_t index = i % kTotalValues;
+
+    ASSERT_GE(snprintf(buf + 4, sizeof(buf) - 4, "%d", index), 0);
+    values.emplace_back(buf);
 
     if (i < kTotalValues) {
-      uniques.push_back(val);
+      uniques.push_back(values.back());
     }
-    indices.push_back(static_cast<int32_t>(i % kTotalValues));
+    indices.push_back(index);
   }
 
   CheckUnique<BinaryType, std::string>(&this->ctx_, binary(), values, {}, uniques, {});
@@ -1076,24 +1081,30 @@ TEST_F(TestHashKernel, DictEncodeFixedSizeBinary) {
 }
 
 TEST_F(TestHashKernel, FixedSizeBinaryResizeTable) {
-  const int64_t kTotalValues = 10000;
-  const int64_t kRepeats = 10;
+  const int32_t kTotalValues = 10000;
+#if !defined(ARROW_VALGRIND)
+  const int32_t kRepeats = 10;
+#else
+  // Mitigate Valgrind's slowness
+  const int32_t kRepeats = 3;
+#endif
 
   vector<std::string> values;
   vector<std::string> uniques;
   vector<int32_t> indices;
-  for (int64_t i = 0; i < kTotalValues * kRepeats; i++) {
-    int64_t index = i % kTotalValues;
-    std::stringstream ss;
-    ss << "test" << static_cast<char>(index / 128) << static_cast<char>(index % 128);
-    std::string val = ss.str();
+  char buf[7] = "test..";
 
-    values.push_back(val);
+  for (int32_t i = 0; i < kTotalValues * kRepeats; i++) {
+    int32_t index = i % kTotalValues;
+
+    buf[4] = static_cast<char>(index / 128);
+    buf[5] = static_cast<char>(index % 128);
+    values.emplace_back(buf, 6);
 
     if (i < kTotalValues) {
-      uniques.push_back(val);
+      uniques.push_back(values.back());
     }
-    indices.push_back(static_cast<int32_t>(i % kTotalValues));
+    indices.push_back(index);
   }
 
   auto type = fixed_size_binary(6);
diff --git a/cpp/src/arrow/io/io-memory-test.cc b/cpp/src/arrow/io/io-memory-test.cc
index d80aaec..62305a6 100644
--- a/cpp/src/arrow/io/io-memory-test.cc
+++ b/cpp/src/arrow/io/io-memory-test.cc
@@ -131,9 +131,16 @@ TEST(TestBufferReader, RetainParentReference) {
 }
 
 TEST(TestMemcopy, ParallelMemcopy) {
+#if defined(ARROW_VALGRIND)
+  // Compensate for Valgrind's slowness
+  constexpr int64_t THRESHOLD = 32 * 1024;
+#else
+  constexpr int64_t THRESHOLD = 1024 * 1024;
+#endif
+
   for (int i = 0; i < 5; ++i) {
     // randomize size so the memcopy alignment is tested
-    int64_t total_size = 3 * 1024 * 1024 + std::rand() % 100;
+    int64_t total_size = 3 * THRESHOLD + std::rand() % 100;
 
     std::shared_ptr<Buffer> buffer1, buffer2;
 
@@ -144,7 +151,7 @@ TEST(TestMemcopy, ParallelMemcopy) {
 
     io::FixedSizeBufferWriter writer(buffer1);
     writer.set_memcopy_threads(4);
-    writer.set_memcopy_threshold(1024 * 1024);
+    writer.set_memcopy_threshold(THRESHOLD);
     ASSERT_OK(writer.Write(buffer2->data(), buffer2->size()));
 
     ASSERT_EQ(0, memcmp(buffer1->data(), buffer2->data(), buffer1->size()));
diff --git a/cpp/src/arrow/ipc/ipc-read-write-test.cc b/cpp/src/arrow/ipc/ipc-read-write-test.cc
index baf067e..f6e49ea 100644
--- a/cpp/src/arrow/ipc/ipc-read-write-test.cc
+++ b/cpp/src/arrow/ipc/ipc-read-write-test.cc
@@ -498,8 +498,11 @@ TEST_F(RecursionLimits, StressLimit) {
   CheckDepth(100, &it_works);
   ASSERT_TRUE(it_works);
 
+// Mitigate Valgrind's slowness
+#if !defined(ARROW_VALGRIND)
   CheckDepth(500, &it_works);
   ASSERT_TRUE(it_works);
+#endif
 }
 #endif  // !defined(_WIN32) || defined(NDEBUG)
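The commit above scales workloads down at compile time via `ARROW_VALGRIND` guards; the same idea expressed at runtime (our own sketch, using an environment variable we made up for illustration) looks like:

```python
import os

# Hypothetical flag: set SLOW_RUNTIME=1 when running under Valgrind
# or another slow, instrumented environment
SLOW_RUNTIME = os.environ.get("SLOW_RUNTIME") == "1"

# Mirror of the C++ pattern: much smaller workloads when instrumented
ARRAY_LENGTH = (2 << 20) if SLOW_RUNTIME else (2 << 24)
HASH_REPEATS = 3 if SLOW_RUNTIME else 10
```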