Posted to dev@arrow.apache.org by Andrew Palumbo <ap...@outlook.com> on 2019/01/09 00:48:19 UTC

Missing hypothesis module in PyArrow

Hello,
I'm just building arrow from source from a fresh checkout; commit:

326015cfc66e1f657cdd6811620137e9e277b43d

Everything seems to build against Python 2.7:


$ python setup.py build_ext --build-type=$ARROW_BUILD_TYPE --with-parquet --with-plasma --inplace

{...}
Bundling includes: release/include
release/gandiva.so
Cython module gandiva failure permitted
('Moving generated C++ source', 'lib.cpp', 'to build path', '/home/apalumbo/repos/arrow/python/pyarrow/lib.cpp')
('Moving built C-extension', 'release/lib.so', 'to build path', '/home/apalumbo/repos/arrow/python/pyarrow/lib.so')
('Moving generated C++ source', '_csv.cpp', 'to build path', '/home/apalumbo/repos/arrow/python/pyarrow/_csv.cpp')
('Moving built C-extension', 'release/_csv.so', 'to build path', '/home/apalumbo/repos/arrow/python/pyarrow/_csv.so')
release/_cuda.so
Cython module _cuda failure permitted
('Moving generated C++ source', '_parquet.cpp', 'to build path', '/home/apalumbo/repos/arrow/python/pyarrow/_parquet.cpp')
('Moving built C-extension', 'release/_parquet.so', 'to build path', '/home/apalumbo/repos/arrow/python/pyarrow/_parquet.so')
release/_orc.so
Cython module _orc failure permitted
('Moving generated C++ source', '_plasma.cpp', 'to build path', '/home/apalumbo/repos/arrow/python/pyarrow/_plasma.cpp')
('Moving built C-extension', 'release/_plasma.so', 'to build path', '/home/apalumbo/repos/arrow/python/pyarrow/_plasma.so')
{...}

Running the tests, though, I get:

$ py.test pyarrow

ImportError while loading conftest '/home/apalumbo/repos/arrow/python/pyarrow/tests/conftest.py'.
../../pyarrow/lib/python2.7/site-packages/six.py:709: in exec_
    exec("""exec _code_ in _globs_, _locs_""")
pyarrow/tests/conftest.py:20: in <module>
    import hypothesis as h
E   ImportError: No module named hypothesis



After a pip install of `hypothesis` in my venv (Python 2.7), I am able to run the tests.

Several fail right off the bat; it seems like many of the errors are Pandas-related (see the bottom for a stack trace):



Switching to a virtualenv running Python 3.5, the build fails:


$ make -j4

{...}
make[2]: *** [src/arrow/python/CMakeFiles/arrow_python_objlib.dir/benchmark.cc.o] Error 1
CMakeFiles/Makefile2:1862: recipe for target 'src/arrow/python/CMakeFiles/arrow_python_objlib.dir/all' failed
make[1]: *** [src/arrow/python/CMakeFiles/arrow_python_objlib.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....
-- glog_ep install command succeeded.  See also /home/apalumbo/repos/arrow/cpp/build/glog_ep-prefix/src/glog_ep-stamp/glog_ep-install-*.log
[ 40%] Building CXX object src/plasma/CMakeFiles/plasma_objlib.dir/common.cc.o
[ 40%] Completed 'glog_ep'
[ 40%] Built target glog_ep
[ 41%] Building CXX object src/plasma/CMakeFiles/plasma_objlib.dir/eviction_policy.cc.o
[ 41%] Building CXX object src/plasma/CMakeFiles/plasma_objlib.dir/events.cc.o
[ 42%] Building CXX object src/plasma/CMakeFiles/plasma_objlib.dir/fling.cc.o
[ 42%] Building CXX object src/plasma/CMakeFiles/plasma_objlib.dir/io.cc.o
[ 43%] Building CXX object src/plasma/CMakeFiles/plasma_objlib.dir/malloc.cc.o
[ 43%] Building CXX object src/plasma/CMakeFiles/plasma_objlib.dir/plasma.cc.o
[ 44%] Building CXX object src/plasma/CMakeFiles/plasma_objlib.dir/protocol.cc.o
[ 44%] Building C object src/plasma/CMakeFiles/plasma_objlib.dir/thirdparty/ae/ae.c.o
[ 44%] Built target plasma_objlib
-- jemalloc_ep build command succeeded.  See also /home/apalumbo/repos/arrow/cpp/build/jemalloc_ep-prefix/src/jemalloc_ep-stamp/jemalloc_ep-build-*.log
[ 45%] Performing install step for 'jemalloc_ep'
-- jemalloc_ep install command succeeded.  See also /home/apalumbo/repos/arrow/cpp/build/jemalloc_ep-prefix/src/jemalloc_ep-stamp/jemalloc_ep-install-*.log
[ 45%] Completed 'jemalloc_ep'
[ 45%] Built target jemalloc_ep
Makefile:138: recipe for target 'all' failed
make: *** [all] Error 2




Any thoughts?  I'm building with the instructions from https://arrow.apache.org/docs/python/development.html#development

Thanks in advance,

Andy

Partial stack trace (Python 2.7):

$ py.test pyarrow

{...}


[5000 rows x 1 columns]
schema = None, preserve_index = False, nthreads = 16, columns = None, safe = True

    def dataframe_to_arrays(df, schema, preserve_index, nthreads=1, columns=None,
                            safe=True):
        names, column_names, index_columns, index_column_names, \
            columns_to_convert, convert_types = _get_columns_to_convert(
                df, schema, preserve_index, columns
            )

        # NOTE(wesm): If nthreads=None, then we use a heuristic to decide whether
        # using a thread pool is worth it. Currently the heuristic is whether the
        # nrows > 100 * ncols.
        if nthreads is None:
            nrows, ncols = len(df), len(df.columns)
            if nrows > ncols * 100:
                nthreads = pa.cpu_count()
            else:
                nthreads = 1

        def convert_column(col, ty):
            try:
                return pa.array(col, type=ty, from_pandas=True, safe=safe)
            except (pa.ArrowInvalid,
                    pa.ArrowNotImplementedError,
                    pa.ArrowTypeError) as e:
                e.args += ("Conversion failed for column {0!s} with type {1!s}"
                           .format(col.name, col.dtype),)
                raise e

        if nthreads == 1:
            arrays = [convert_column(c, t)
                      for c, t in zip(columns_to_convert,
                                      convert_types)]
        else:
>           from concurrent import futures
E           ImportError: No module named concurrent

pyarrow/pandas_compat.py:430: ImportError
___________________________________________________ test_compress_decompress ___________________________________________________

    def test_compress_decompress():
        INPUT_SIZE = 10000
        test_data = (np.random.randint(0, 255, size=INPUT_SIZE)
                     .astype(np.uint8)
                     .tostring())
        test_buf = pa.py_buffer(test_data)

        codecs = ['lz4', 'snappy', 'gzip', 'zstd', 'brotli']
        for codec in codecs:
>           compressed_buf = pa.compress(test_buf, codec=codec)

pyarrow/tests/test_io.py:508:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
pyarrow/io.pxi:1340: in pyarrow.lib.compress
    check_status(CCodec.Create(c_codec, &compressor))
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

>   raise ArrowNotImplementedError(message)
E   ArrowNotImplementedError: ZSTD codec support not built

pyarrow/error.pxi:89: ArrowNotImplementedError
_______________________________________________ test_compressed_roundtrip[zstd] ________________________________________________

compression = 'zstd'

    @pytest.mark.parametrize("compression",
                             ["bz2", "brotli", "gzip", "lz4", "zstd"])
    def test_compressed_roundtrip(compression):
        data = b"some test data\n" * 10 + b"eof\n"
        raw = pa.BufferOutputStream()
        try:
>           with pa.CompressedOutputStream(raw, compression) as compressed:

pyarrow/tests/test_io.py:1045:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
pyarrow/io.pxi:1149: in pyarrow.lib.CompressedOutputStream.__init__
    self._init(stream, compression_type)
pyarrow/io.pxi:1162: in pyarrow.lib.CompressedOutputStream._init
    _make_compressed_output_stream(stream.get_output_stream(),
pyarrow/io.pxi:1087: in pyarrow.lib._make_compressed_output_stream
    check_status(CCodec.Create(compression_type, &codec))
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

>   raise ArrowNotImplementedError(message)
E   ArrowNotImplementedError: ZSTD codec support not built

pyarrow/error.pxi:89: ArrowNotImplementedError
__________________________________________ test_pandas_serialize_round_trip_nthreads ___________________________________________

    def test_pandas_serialize_round_trip_nthreads():
        index = pd.Index([1, 2, 3], name='my_index')
        columns = ['foo', 'bar']
        df = pd.DataFrame(
            {'foo': [1.5, 1.6, 1.7], 'bar': list('abc')},
            index=index, columns=columns
        )
>       _check_serialize_pandas_round_trip(df, use_threads=True)

pyarrow/tests/test_ipc.py:536:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
pyarrow/tests/test_ipc.py:514: in _check_serialize_pandas_round_trip
    buf = pa.serialize_pandas(df, nthreads=2 if use_threads else 1)
pyarrow/ipc.py:163: in serialize_pandas
    preserve_index=preserve_index)
pyarrow/table.pxi:864: in pyarrow.lib.RecordBatch.from_pandas
    names, arrays, metadata = pdcompat.dataframe_to_arrays(
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

df =           foo bar
my_index
1         1.5   a
2         1.6   b
3         1.7   c, schema = None
preserve_index = True, nthreads = 2, columns = None, safe = True

    def dataframe_to_arrays(df, schema, preserve_index, nthreads=1, columns=None,
                            safe=True):
        names, column_names, index_columns, index_column_names, \
            columns_to_convert, convert_types = _get_columns_to_convert(
                df, schema, preserve_index, columns
            )

        # NOTE(wesm): If nthreads=None, then we use a heuristic to decide whether
        # using a thread pool is worth it. Currently the heuristic is whether the
        # nrows > 100 * ncols.
        if nthreads is None:
            nrows, ncols = len(df), len(df.columns)
            if nrows > ncols * 100:
                nthreads = pa.cpu_count()
            else:
                nthreads = 1

        def convert_column(col, ty):
            try:
                return pa.array(col, type=ty, from_pandas=True, safe=safe)
            except (pa.ArrowInvalid,
                    pa.ArrowNotImplementedError,
                    pa.ArrowTypeError) as e:
                e.args += ("Conversion failed for column {0!s} with type {1!s}"
                           .format(col.name, col.dtype),)
                raise e

        if nthreads == 1:
            arrays = [convert_column(c, t)
                      for c, t in zip(columns_to_convert,
                                      convert_types)]
        else:
>           from concurrent import futures
E           ImportError: No module named concurrent

pyarrow/pandas_compat.py:430: ImportError
======================================================= warnings summary =======================================================
pyarrow/tests/test_convert_pandas.py::TestConvertMetadata::test_empty_list_metadata
  /home/apalumbo/repos/pyarrow/lib/python2.7/site-packages/pandas/core/dtypes/missing.py:431: DeprecationWarning: The truth value of an empty array is ambiguous. Returning False, but in future this will result in an error. Use `array.size > 0` to check that an array is not empty.
    if left_value != right_value:
  /home/apalumbo/repos/pyarrow/lib/python2.7/site-packages/pandas/core/dtypes/missing.py:431: DeprecationWarning: The truth value of an empty array is ambiguous. Returning False, but in future this will result in an error. Use `array.size > 0` to check that an array is not empty.
    if left_value != right_value:
  /home/apalumbo/repos/pyarrow/lib/python2.7/site-packages/pandas/core/dtypes/missing.py:431: DeprecationWarning: The truth value of an empty array is ambiguous. Returning False, but in future this will result in an error. Use `array.size > 0` to check that an array is not empty.
    if left_value != right_value:

pyarrow/tests/test_convert_pandas.py::TestListTypes::test_column_of_lists_first_empty
  /home/apalumbo/repos/pyarrow/lib/python2.7/site-packages/pandas/core/dtypes/missing.py:431: DeprecationWarning: The truth value of an empty array is ambiguous. Returning False, but in future this will result in an error. Use `array.size > 0` to check that an array is not empty.
    if left_value != right_value:

pyarrow/tests/test_convert_pandas.py::TestListTypes::test_empty_list_roundtrip
  /home/apalumbo/repos/pyarrow/lib/python2.7/site-packages/pandas/core/dtypes/missing.py:431: DeprecationWarning: The truth value of an empty array is ambiguous. Returning False, but in future this will result in an error. Use `array.size > 0` to check that an array is not empty.
    if left_value != right_value:
  /home/apalumbo/repos/pyarrow/lib/python2.7/site-packages/pandas/core/dtypes/missing.py:431: DeprecationWarning: The truth value of an empty array is ambiguous. Returning False, but in future this will result in an error. Use `array.size > 0` to check that an array is not empty.
    if left_value != right_value:
  /home/apalumbo/repos/pyarrow/lib/python2.7/site-packages/pandas/core/dtypes/missing.py:431: DeprecationWarning: The truth value of an empty array is ambiguous. Returning False, but in future this will result in an error. Use `array.size > 0` to check that an array is not empty.
    if left_value != right_value:

-- Docs: https://docs.pytest.org/en/latest/warnings.html
========================== 45 failed, 997 passed, 194 skipped, 3 xfailed, 7 warnings in 33.14 seconds ==========================
(pyarrow)




Re: Missing hypothesis module in PyArrow

Posted by Wes McKinney <we...@gmail.com>.
hi Andrew,

On Python 2.7 you need to run both:

pip install -r requirements.txt
pip install -r requirements-test.txt
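
The `concurrent` ImportError in your partial stack trace is the same missing-requirements issue: `concurrent.futures` is standard library only on Python 3.2+, and on Python 2.7 it comes from the `futures` backport package, which requirements.txt should pull in. A minimal sketch of the dependency (assuming a bare 2.7 venv):

    try:
        from concurrent import futures  # standard library on Python 3.2+
    except ImportError:
        # On Python 2.7 this needs the "futures" backport from PyPI:
        #   pip install futures
        raise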

It looks like your CMake version is old, so ZSTD was disabled. zstd
cannot be built automatically from source with CMake versions older
than 3.7.
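
If you want to confirm which codecs your libarrow was actually built with before re-running the suite, here is a quick sketch; it only uses the pa.compress entry point and pa.ArrowNotImplementedError that the failing tests already exercise, and it works on both Python 2.7 and 3.x:

    import pyarrow as pa

    buf = pa.py_buffer(b"codec check")
    for codec in ['lz4', 'snappy', 'gzip', 'zstd', 'brotli']:
        try:
            pa.compress(buf, codec=codec)  # raises if support was not compiled in
            print('%s: built' % codec)
        except pa.ArrowNotImplementedError:
            print('%s: not built' % codec)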

You will have a better time if you use conda to manage your build toolchain; see

https://github.com/apache/arrow/blob/master/docs/source/python/development.rst

- Wes
