Posted to user@arrow.apache.org by Anna Waldron <an...@elementaryrobotics.com> on 2020/01/24 01:38:33 UTC
Pyarrow build/install from source in ubuntu not working
Hi,
I am trying to build and install pyarrow from source in an ubuntu 18.04
docker image and getting the following error when attempting to import the
module:
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/local/lib/python3.6/dist-packages/pyarrow-0.14.0-py3.6-linux-x86_64.egg/pyarrow/__init__.py", line 49, in <module>
    from pyarrow.lib import cpu_count, set_cpu_count
ImportError: libarrow.so.14: cannot open shared object file: No such file or directory
Here is the Dockerfile I am using:
FROM ubuntu:18.04
RUN apt-get update
RUN apt-get install -y git
RUN mkdir /arrow
RUN git clone https://github.com/apache/arrow.git /arrow
WORKDIR /arrow/arrow
RUN git checkout apache-arrow-0.14.0
WORKDIR /
COPY install_arrow.sh /install_arrow.sh
RUN bash install_arrow.sh
RUN python3 -c 'import pyarrow'
and the install_arrow.sh script copied into the image:
export ARROW_BUILD_TYPE=release
export ARROW_HOME=/usr/local \
       PARQUET_HOME=/usr/local
export PYTHON_EXECUTABLE=/usr/bin/python3

# install requirements
export DEBIAN_FRONTEND="noninteractive"
apt-get update
apt-get install -y --no-install-recommends apt-utils
apt-get install -y git python3-minimal python3-pip autoconf libtool
apt-get install -y cmake \
    python3-dev \
    libjemalloc-dev libboost-dev \
    build-essential \
    libboost-filesystem-dev \
    libboost-regex-dev \
    libboost-system-dev \
    flex \
    bison
pip3 install --no-cache-dir six pytest numpy cython

mkdir -p /arrow/cpp/build \
  && cd /arrow/cpp/build \
  && cmake -DCMAKE_BUILD_TYPE=$ARROW_BUILD_TYPE \
      -DOPENSSL_ROOT_DIR=/usr/local/ssl \
      -DCMAKE_INSTALL_LIBDIR=lib \
      -DCMAKE_INSTALL_PREFIX=$ARROW_HOME \
      -DARROW_PARQUET=ON \
      -DARROW_PYTHON=ON \
      -DARROW_PLASMA=ON \
      -DARROW_BUILD_TESTS=OFF \
      -DPYTHON_EXECUTABLE=$PYTHON_EXECUTABLE \
      .. \
  && make -j$(nproc) \
  && make install \
  && cd /arrow/python \
  && python3 setup.py build_ext --build-type=$ARROW_BUILD_TYPE --with-parquet \
  && python3 setup.py install

LD_LIBRARY_PATH=/usr/local/lib
I'm using Docker 19.03.5 on Ubuntu 18.04.3 LTS to build the image.
Thanks in advance for any help.
Anna
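A quick way to see which shared library the dynamic loader cannot resolve is to run `ldd` on the compiled extension module and look for "not found" entries. The sketch below is only an illustration: `missing_libs` is a made-up helper name, and the sample text is hand-written in the shape of `ldd` output rather than captured from the image in this thread.

```shell
#!/bin/sh
# Hypothetical helper: read `ldd` output on stdin and print the names of
# libraries the loader could not resolve.
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Hand-written sample in the shape of ldd output (not real output from this thread).
sample=$(printf '\tlibarrow.so.14 => not found\n\tlibc.so.6 => /lib/x86_64-linux-gnu/libc.so.6\n')

result=$(printf '%s\n' "$sample" | missing_libs)
echo "$result"
```

In practice the input would come from running `ldd` on the compiled `pyarrow.lib` module inside the package directory shown in the traceback, piped into `missing_libs`.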
Re: Pyarrow build/install from source in ubuntu not working
Posted by Sutou Kouhei <ko...@clear-code.com>.
Hi,
Changing
  RUN python3 -c 'import pyarrow'
to
  RUN LD_LIBRARY_PATH=/usr/local/lib python3 -c 'import pyarrow'
works in my environment.
Another solution: add
  ENV LD_LIBRARY_PATH=/usr/local/lib
before
  RUN python3 -c 'import pyarrow'
in the Dockerfile:
  ...
  RUN bash install_arrow.sh
  ENV LD_LIBRARY_PATH=/usr/local/lib
  RUN python3 -c 'import pyarrow'
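A likely reason both changes help: the `LD_LIBRARY_PATH=/usr/local/lib` line at the end of install_arrow.sh only sets the variable inside the shell that runs that script, and each Dockerfile RUN step starts a new shell, so the value is gone by the time `python3 -c 'import pyarrow'` runs. Here is a minimal sketch of the same effect outside Docker, using separate `sh -c` calls as stand-ins for RUN steps. The variable name `DEMO_LIB_PATH` is made up for illustration.

```shell
#!/bin/sh
# Start from a clean slate so the demo does not depend on the caller's environment.
unset DEMO_LIB_PATH

# Stand-in for "RUN bash install_arrow.sh": the assignment lives only in this child shell.
step_one=$(sh -c 'DEMO_LIB_PATH=/usr/local/lib; echo "$DEMO_LIB_PATH"')

# Stand-in for the next RUN step: a fresh shell that never saw the assignment.
step_two=$(sh -c 'echo "${DEMO_LIB_PATH:-<not set>}"')

# Stand-in for the "VAR=value command" form (or ENV): the value is placed in the
# environment of the command that needs it.
step_three=$(DEMO_LIB_PATH=/usr/local/lib sh -c 'echo "$DEMO_LIB_PATH"')

echo "inside the script: $step_one"
echo "next step:         $step_two"
echo "with env prefix:   $step_three"
```

The second value comes back empty because nothing carried the assignment across the process boundary, which matches the import failing only in the later RUN step.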
Thanks,
--
kou
Re: Pyarrow build/install from source in ubuntu not working
Posted by Anna Waldron <an...@elementaryrobotics.com>.
Thanks Matt and Kou, I was able to resolve using your examples.
Anna
On Fri, Jan 24, 2020 at 5:13 AM Calder, Matthew <mc...@xbktrading.com>
wrote:
--
Anna Waldron
Software Engineer
Elementary Robotics
anna@elementaryrobotics.com
RE: Pyarrow build/install from source in ubuntu not working
Posted by "Calder, Matthew" <mc...@xbktrading.com>.
Anna,
Not sure it will help, but below is the install_arrow.sh script I am using to build arrow + pyarrow in our containers, which are also based on Ubuntu 18.04.
Matt
#!/bin/bash
# Taken from: https://arrow.apache.org/docs/developers/python.html#python-development
# minor edits

mkdir /repos
cd /repos

git clone https://github.com/apache/arrow.git
cd arrow

apt-get install -y libjemalloc-dev libboost-dev \
    libboost-filesystem-dev \
    libboost-system-dev \
    libboost-regex-dev \
    python3-dev \
    autoconf \
    flex \
    bison

pip3 install six numpy pandas cython pytest hypothesis

mkdir dist

export ARROW_HOME=/usr/local
export LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH

mkdir /repos/arrow/cpp/build
cd /repos/arrow/cpp/build

# Make "python" point at python3 (-f: do not complain if the link is absent)
rm -f /usr/bin/python
ln -s /usr/bin/python3 /usr/bin/python

cmake -DCMAKE_INSTALL_PREFIX=$ARROW_HOME \
    -DCMAKE_INSTALL_LIBDIR=lib \
    -DARROW_FLIGHT=ON \
    -DARROW_GANDIVA=OFF \
    -DARROW_ORC=ON \
    -DARROW_PARQUET=ON \
    -DARROW_PYTHON=ON \
    -DARROW_PLASMA=ON \
    -DARROW_BUILD_TESTS=ON \
    -DPYTHON_DEFAULT_EXECUTABLE=$(which python3) \
    -DPYTHON_INCLUDE_PATH=/usr/include/python3.6m \
    -DPYTHON_LIBRARY=/usr/lib/x86_64-linux-gnu/libpython3.6m.so \
    -DPYTHON_INCLUDE_DIR=/usr/include/python3.6m \
    ..
make -j4
make install  # Installs under $ARROW_HOME (/usr/local), per CMAKE_INSTALL_PREFIX above

cd /repos/arrow/python
export PYARROW_WITH_FLIGHT=1
export PYARROW_WITH_GANDIVA=0
export PYARROW_WITH_ORC=1
export PYARROW_WITH_PARQUET=1
python setup.py build_ext
python setup.py install
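
One detail worth noting about `export LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH`: when the variable starts out empty, the result ends with a colon, and the dynamic loader treats an empty entry as the current directory. Below is a small sketch of a prepend that avoids the stray colon, assuming a POSIX shell. `prepend_dir` is a made-up helper name, not part of Arrow or the script above.

```shell
#!/bin/sh
# Hypothetical helper: print "$1" prepended to the colon-separated list "$2",
# adding the separator only when the existing list is non-empty.
prepend_dir() {
    printf '%s\n' "$1${2:+:$2}"
}

# Empty starting value: no trailing colon is produced.
empty_case=$(prepend_dir /usr/local/lib "")

# Non-empty starting value: the new directory goes first.
filled_case=$(prepend_dir /usr/local/lib /opt/lib)

echo "$empty_case"
echo "$filled_case"

# Applied to the real variable:
LD_LIBRARY_PATH=$(prepend_dir /usr/local/lib "${LD_LIBRARY_PATH:-}")
export LD_LIBRARY_PATH
```

The `${2:+:$2}` expansion is standard parameter substitution: it expands to `:$2` only when `$2` is set and non-empty.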