Posted to user@arrow.apache.org by Micah Kornfield <em...@gmail.com> on 2021/11/06 00:33:05 UTC

Re: Help building pyarrow 4.0.1

Hi Mathieu,
I don't have much experience here, but I think there were a few JIRA work
items that had to be done to get Arrow compiling on an M1. You might try
searching JIRA to see if these provide any clues.

-Micah

On Fri, Oct 15, 2021 at 10:29 PM Mathieu Leduc-Hamel <
mathieu.leduc-hamel@metrio.net> wrote:

> Hi, I'm working on Beam, which currently does not support the latest
> release of Arrow (6.x), and I'm trying to build the required packages on
> Apple M1.
>
> Currently I'm building the Python package `pyarrow` like this:
> ```
> python setup.py build_ext --build-type=release --bundle-arrow-cpp \
>   --bundle-arrow-cpp-headers --bundle-cython-cpp --cython-cplus \
>   --bundle-boost --with-static-boost --extra-cmake-args=boost-python3 \
>   --boost-namespace=boost-python3 bdist_wheel
> ```
>
> I get a package, but when I install it and try a real use case that simply
> imports pyarrow, I get the following error:
>
> ```
>     import pyarrow.lib as _lib
> E   ImportError: dlopen(/Users/mlhamel/src/github/metrio/jupyter/metrics-sdk/.venv/lib/python3.8/site-packages/pyarrow/lib.cpython-38-darwin.so, 2): Symbol not found: __Py_FatalErrorFunc
> E     Referenced from: /Users/mlhamel/src/github/metrio/jupyter/metrics-sdk/.venv/lib/python3.8/site-packages/pyarrow/libarrow_python.400.dylib
> E     Expected in: flat namespace
> E    in /Users/mlhamel/src/github/metrio/jupyter/metrics-sdk/.venv/lib/python3.8/site-packages/pyarrow/libarrow_python.400.dylib
> ```
>
> I tried statically embedding both Parquet and Boost.Python, but it's still
> failing.
>
> Any idea where I can explore?
>
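
The missing symbol itself gives a hint worth checking. To my knowledge, `_Py_FatalErrorFunc` (dyld prints it with one extra leading underscore) was added in CPython 3.9, where `Py_FatalError` became a macro that calls it, while the virtualenv in the traceback above runs Python 3.8. One possible explanation, not confirmed anywhere in this thread, is that `libarrow_python` was compiled against the headers of a newer Python than the one importing it. Below is a minimal sketch for checking whether the running interpreter exports a given C symbol; the helper name `interpreter_exports` is hypothetical.

```python
import ctypes
import sys


def interpreter_exports(symbol: str) -> bool:
    """Return True if the running CPython exports the C symbol `symbol`.

    ctypes.pythonapi resolves names lazily through the dynamic loader,
    so a missing symbol surfaces as AttributeError.
    """
    try:
        getattr(ctypes.pythonapi, symbol)
    except AttributeError:
        return False
    return True


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    # dyld shows C symbols with one extra leading underscore, so the
    # `__Py_FatalErrorFunc` in the traceback is `_Py_FatalErrorFunc` here.
    print("_Py_FatalErrorFunc exported:",
          interpreter_exports("_Py_FatalErrorFunc"))
```

If this reports that the symbol is not exported by the interpreter inside the virtualenv, rebuilding the wheel with that same interpreter would be the first thing to try.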

Re: Help building pyarrow 4.0.1

Posted by Mathieu Leduc-Hamel <ma...@metrio.net>.
It works well with 5.x.

At first I really needed to build version 4.0.1 because it was a hard
requirement of Apache Beam 2.33, but the latest release, Beam 2.34, now
supports version 5.0, so I no longer need to build pyarrow myself for
M1.

Thanks everyone!

On Fri, Nov 12, 2021 at 12:05 PM Aldrin <ak...@ucsc.edu> wrote:

> Hi Mathieu,
>
> I don't know if it's helpful, but I figured I could share some things that
> I have done/seen for building on an M1. Uwe Korn wrote a blog post about
> building arrow on M1 early this year [1]. I didn't look at it too closely,
> because I had a separate library blocking my build on my M1.
>
> I haven't built pyarrow from source (it looked complicated to me), so I've
> just been using pip to recompile the pyarrow binaries (and you can still
> specify the pyarrow version to install) [2]. That being said, I have no
> issues installing pyarrow 6.0.0, but when I try to install pyarrow 4.0.1 this
> way, I find that it has a dependency on numpy 1.19.4, which isn't supported
> on M1 (via pip) [3]. I'm curious if you'd have more luck with pyarrow
> 5.0.0, which installs fine the normal way on M1.
>
> Sorry if this isn't too helpful. I'll be trying to get my builds working
> on M1 in a couple weeks, so if you still have this as an issue then perhaps
> I can check in if I figure out anything more.
>
> Good luck!
>
> -- references --
> [1]: https://uwekorn.com/2021/01/11/apache-arrow-on-the-apple-m1.html
> [2]:
> https://gist.github.com/drin/5dbda4aa546c3bf4a0058cd1402d5b4d#file-install-pyarrow-bash
> [3]:
> https://github.com/scipy/oldest-supported-numpy/blob/d26b44463b1be0fdb9c929a2d9781293fabffeda/setup.cfg#L38
>
> Aldrin Montana
> Computer Science PhD Student
> UC Santa Cruz

Re: Help building pyarrow 4.0.1

Posted by Aldrin <ak...@ucsc.edu>.
Hi Mathieu,

I don't know if it's helpful, but I figured I could share some things that
I have done/seen for building on an M1. Uwe Korn wrote a blog post about
building arrow on M1 early this year [1]. I didn't look at it too closely,
because I had a separate library blocking my build on my M1.

I haven't built pyarrow from source (it looked complicated to me), so I've
just been using pip to recompile the pyarrow binaries (and you can still
specify the pyarrow version to install) [2]. That being said, I have no
issues installing pyarrow 6.0.0, but when I try to install pyarrow 4.0.1 this
way, I find that it has a dependency on numpy 1.19.4, which isn't supported
on M1 (via pip) [3]. I'm curious if you'd have more luck with pyarrow
5.0.0, which installs fine the normal way on M1.

Sorry if this isn't too helpful. I'll be trying to get my builds working on
M1 in a couple weeks, so if you still have this as an issue then perhaps I
can check in if I figure out anything more.

Good luck!

-- references --
[1]: https://uwekorn.com/2021/01/11/apache-arrow-on-the-apple-m1.html
[2]:
https://gist.github.com/drin/5dbda4aa546c3bf4a0058cd1402d5b4d#file-install-pyarrow-bash
[3]:
https://github.com/scipy/oldest-supported-numpy/blob/d26b44463b1be0fdb9c929a2d9781293fabffeda/setup.cfg#L38

Aldrin Montana
Computer Science PhD Student
UC Santa Cruz
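
The pip-based approach described above could look roughly like the following sketch. It relies only on pip's documented `--no-binary` option to force a source build of a pinned version. The helper names `pip_source_install_cmd` and `install` are hypothetical, and the linked gist [2] may well do this differently.

```python
import subprocess
import sys
from typing import List


def pip_source_install_cmd(package: str, version: str) -> List[str]:
    """Build a pip command that compiles `package` from its source distribution."""
    return [
        sys.executable, "-m", "pip", "install",
        "--no-binary", package,    # refuse prebuilt wheels for this package
        f"{package}=={version}",   # pin the exact release
    ]


def install(package: str, version: str, dry_run: bool = True) -> List[str]:
    """Print the command and run it only when dry_run is False."""
    cmd = pip_source_install_cmd(package, version)
    print(" ".join(cmd))
    if not dry_run:
        # Needs a working C++ toolchain; may take a long time.
        subprocess.run(cmd, check=True)
    return cmd


if __name__ == "__main__":
    install("pyarrow", "5.0.0")  # dry run: prints the command only
```

Using `sys.executable -m pip` keeps the build tied to the interpreter that will later import the package.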


On Thu, Nov 11, 2021 at 9:30 PM Alenka Frim <al...@voltrondata.com> wrote:

> Hi Mathieu,
>
> The error in your case is new to me but still similar to what I was
> getting.
> I tried
> - downgrading Python from 3.10 to 3.9 (building the latest Arrow release),
> - updating Xcode and the Command Line Tools,
> - adding -DARROW_INSTALL_NAME_RPATH=OFF to cmake.
> There is a ticket for cmake in Jira:
>
> https://issues.apache.org/jira/browse/ARROW-14570
> <https://github.com/apache/arrow/pull/11677>
>
> I am not sure it’s connected but you can try.
>
> Alenka

Re: Help building pyarrow 4.0.1

Posted by Alenka Frim <al...@voltrondata.com>.
Hi Mathieu,

The error in your case is new to me but still similar to what I was getting.
I tried
- downgrading Python from 3.10 to 3.9 (building the latest Arrow release),
- updating Xcode and the Command Line Tools,
- adding -DARROW_INSTALL_NAME_RPATH=OFF to cmake.
There is a ticket for cmake in Jira:

https://issues.apache.org/jira/browse/ARROW-14570 <https://github.com/apache/arrow/pull/11677>

I am not sure it’s connected but you can try.

Alenka
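
As a rough illustration of the third item above, the flag would go on the CMake configure line for Arrow C++. In this sketch the source and build directory names, the `-DARROW_PYTHON=ON` option, and the helper name `arrow_cmake_configure_cmd` are assumptions for illustration; only `-DARROW_INSTALL_NAME_RPATH=OFF` comes from this message.

```python
from typing import List


def arrow_cmake_configure_cmd(source_dir: str, build_dir: str) -> List[str]:
    """Assemble a CMake configure command for Arrow C++ (illustrative only)."""
    return [
        "cmake",
        "-S", source_dir,                  # assumed location of arrow/cpp
        "-B", build_dir,                   # assumed out-of-tree build directory
        "-DARROW_PYTHON=ON",               # assumption: build libarrow_python
        "-DARROW_INSTALL_NAME_RPATH=OFF",  # the flag mentioned in this message
    ]


if __name__ == "__main__":
    print(" ".join(arrow_cmake_configure_cmd("arrow/cpp", "arrow/cpp/build")))
```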

> On 11 Nov 2021, at 16:05, Mathieu Leduc-Hamel <ma...@metrio.net> wrote:
> 
> Thanks Micah,
> 
> I was browsing JIRA the other day but didn't find anything related to M1 yet. I'm going to continue searching, but if you can point me to something you found, that would be appreciated.
> 
> Thanks again
> 
> - Mathieu


Re: Help building pyarrow 4.0.1

Posted by Mathieu Leduc-Hamel <ma...@metrio.net>.
Thanks Micah,

I was browsing JIRA the other day but didn't find anything related to M1
yet. I'm going to continue searching, but if you can point me to something
you found, that would be appreciated.

Thanks again

- Mathieu

On Fri, Nov 5, 2021 at 8:33 PM Micah Kornfield <em...@gmail.com>
wrote:

> Hi Mathieu,
> I don't have much experience here, but I think there were a few JIRA work
> items that had to be done to get Arrow compiling on an M1. You might try
> searching JIRA to see if these provide any clues.
>
> -Micah