Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/08/13 03:47:22 UTC

[GitHub] [arrow] nesyamun opened a new issue #10929: Help with running pyrrow on arm64 in Docker

nesyamun opened a new issue #10929:
URL: https://github.com/apache/arrow/issues/10929


   I have an issue running pyarrow on arm64 in Docker, and hope to get some advice.
    
   I get the following error: 
   ```
   <jemalloc>: Unsupported system page size
   ```
     
   The Docker image I’m building is Debian and based on `python:3.8.9-slim-buster`. It's used for Apache Airflow.
   
   I’ve tried a nightly pyarrow wheel but it doesn't fix the issue. Installing Arrow and pyarrow from source **does** fix the issue.
   
   Any advice for fixing this issue **without** installing Arrow and pyarrow from source?
   
   I contacted the mailing list on the 27th of July for this. Thanks!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] kszucs commented on issue #10929: Help with running pyrrow on arm64 in Docker

Posted by GitBox <gi...@apache.org>.
kszucs commented on issue #10929:
URL: https://github.com/apache/arrow/issues/10929#issuecomment-898329696


   We build the arm64 wheels on graviton2 travis instances [in docker](https://github.com/apache/arrow/blob/master/dev/tasks/python-wheels/travis.linux.arm64.yml#L59), so we do not cross-compile. @xhochy do you have a reference to the issue?
   
   Sadly I don't have an arm64 machine at hand, so I'm unable to reproduce the issue. 
   A wheel can be produced using the following command:
   
   ```console
   pip install -e arrow/dev/archery[docker]
   ARCH=arm64v8 PYTHON=3.8 archery docker run python-wheel-manylinux-2014
   # wheel is going to be available under arrow/python/repaired_wheels/
   ```





[GitHub] [arrow] xhochy commented on issue #10929: Help with running pyrrow on arm64 in Docker

Posted by GitBox <gi...@apache.org>.
xhochy commented on issue #10929:
URL: https://github.com/apache/arrow/issues/10929#issuecomment-898336108


   Yes, I'm doing that with `--with-lg-page=14` in the OSX case.





[GitHub] [arrow] nesyamun commented on issue #10929: Help with running pyrrow on arm64 in Docker

Posted by GitBox <gi...@apache.org>.
nesyamun commented on issue #10929:
URL: https://github.com/apache/arrow/issues/10929#issuecomment-899213718


   Thanks everyone for your help!
   
   > Can you share your Dockerfile so I can try and reproduce?
   
   Unfortunately I can't as it's used at work.
   
   > If you don't want to install Arrow from source you can try using the environment variable `ARROW_DEFAULT_MEMORY_POOL` to switch to a different allocator:
   
   This doesn't seem to help either. Here's a log snippet where I've echoed and grepped the container's environment to check that it's set properly:
   ```
   [2021-08-16T03:16:17Z] ARROW_DEFAULT_MEMORY_POOL=system
   SNIP
   [2021-08-16T03:16:18Z] <jemalloc>: Unsupported system page size
   ```
   
   Now I'm wondering if something other than Arrow is using jemalloc.
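
One rough way to check for another jemalloc in the process is to list the shared objects mapped into it. The sketch below is only an illustration: the helper name `libraries_matching` is hypothetical, it relies on the Linux `/proc/self/maps` format, and it cannot see a jemalloc that has been linked statically into another library, so an empty result does not rule jemalloc out. Import the suspect packages first so that their libraries are loaded.

```python
import os


def libraries_matching(maps_text, needle="jemalloc"):
    """Return the distinct mapped file paths whose file name contains `needle`."""
    found = []
    for line in maps_text.splitlines():
        # Fields: address perms offset dev inode [path]; anonymous maps have no path.
        fields = line.split(None, 5)
        if len(fields) == 6:
            path = fields[5].strip()
            if needle in os.path.basename(path) and path not in found:
                found.append(path)
    return found


if __name__ == "__main__":
    maps_path = "/proc/self/maps"  # Linux only
    if os.path.exists(maps_path):
        with open(maps_path) as fh:
            matches = libraries_matching(fh.read())
        print(matches or "no shared object with 'jemalloc' in its name is mapped")
```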
   
   > Did you make sure to subscribe to the mailing list before you sent your message? The mailing list will ignore emails from unsubscribed users. If you're pretty confident you subscribed then email me at weston dot pace at gmail dot com (remove spaces and replace `dot` with `.` and `at` with `@`) so I can get your email.
   
   Yes, I overlooked subscribing to the mailing list. I have now. Thanks!
   
   > Yes, I'm doing that with --with-lg-page=14 in the OSX case.
   
   I'll try building a wheel and/or installing jemalloc from source. Thanks!
   
   





[GitHub] [arrow] xhochy commented on issue #10929: Help with running pyrrow on arm64 in Docker

Posted by GitBox <gi...@apache.org>.
xhochy commented on issue #10929:
URL: https://github.com/apache/arrow/issues/10929#issuecomment-898313641


   This error is known to me and can occur if you cross-compile `pyarrow`. @kszucs do we do that for ARM wheels? If so, we need to specify the page size in the jemalloc configure explicitly.





[GitHub] [arrow] nesyamun edited a comment on issue #10929: Help with running pyrrow on arm64 in Docker

Posted by GitBox <gi...@apache.org>.
nesyamun edited a comment on issue #10929:
URL: https://github.com/apache/arrow/issues/10929#issuecomment-901572751


   @kszucs Thanks! I've tested this and it works: https://github.com/apache/arrow/pull/10940#issuecomment-900317827
   
   Closing as https://github.com/apache/arrow/pull/10940 is merged.
   
   Thanks again.





[GitHub] [arrow] kszucs commented on issue #10929: Help with running pyrrow on arm64 in Docker

Posted by GitBox <gi...@apache.org>.
kszucs commented on issue #10929:
URL: https://github.com/apache/arrow/issues/10929#issuecomment-899679332


   @nesyamun could you please try out [pyarrow-6.0.0.dev45-cp38-cp38-manylinux2014_aarch64.whl](https://github.com/ursacomputing/crossbow/releases/download/actions-777-travis-wheel-manylinux2014-cp38-arm64/pyarrow-6.0.0.dev45-cp38-cp38-manylinux2014_aarch64.whl)?





[GitHub] [arrow] nesyamun commented on issue #10929: Help with running pyrrow on arm64 in Docker

Posted by GitBox <gi...@apache.org>.
nesyamun commented on issue #10929:
URL: https://github.com/apache/arrow/issues/10929#issuecomment-899950190


   Thanks @kszucs, unfortunately it doesn't work. I've checked the page size of the system I'm running pyarrow on - it's 64 KiB.
   
   Would it be possible to get a wheel with jemalloc configured for this page size, or would that be something I would need to maintain myself? Building from your fork with `ARROW_JEMALLOC_LG_PAGE=16` works. If it is merged, would it be possible for a wheel to be published configured as such?
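
For reference, the value `16` follows from the page size: the setting is the base-2 logarithm of the page size in bytes, so 64 KiB = 65536 = 2^16 (and `--with-lg-page=14` corresponds to 16 KiB pages). A minimal sketch for checking this on a given machine is below. The helper name `lg_page` is hypothetical, and the statement that jemalloc must be built for a page size at least as large as the system's is my understanding of the error, not something confirmed in this thread.

```python
import os


def lg_page(page_size_bytes):
    """Return log2 of a page size, e.g. 65536 -> 16; reject non-powers of two."""
    if page_size_bytes <= 0 or page_size_bytes & (page_size_bytes - 1):
        raise ValueError("page size must be a positive power of two")
    return page_size_bytes.bit_length() - 1


if __name__ == "__main__":
    # SC_PAGE_SIZE is the page size of the running kernel, in bytes.
    size = os.sysconf("SC_PAGE_SIZE")
    print(f"system page size: {size} bytes, lg page: {lg_page(size)}")
```

Running this inside the container shows the value that the jemalloc build has to accommodate.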
   





[GitHub] [arrow] westonpace commented on issue #10929: Help with running pyrrow on arm64 in Docker

Posted by GitBox <gi...@apache.org>.
westonpace commented on issue #10929:
URL: https://github.com/apache/arrow/issues/10929#issuecomment-898208010


   If you don't want to install Arrow from source you can try using the environment variable `ARROW_DEFAULT_MEMORY_POOL` to switch to a different allocator:
   
   ```
   (arrow-release-5) pace@pace-desktop:~$ ARROW_DEFAULT_MEMORY_POOL=jemalloc python -c "import pyarrow; print(pyarrow.default_memory_pool().backend_name)"
   jemalloc
   (arrow-release-5) pace@pace-desktop:~$ ARROW_DEFAULT_MEMORY_POOL=system python -c "import pyarrow; print(pyarrow.default_memory_pool().backend_name)"
   system
   (arrow-release-5) pace@pace-desktop:~$ ARROW_DEFAULT_MEMORY_POOL=mimalloc python -c "import pyarrow; print(pyarrow.default_memory_pool().backend_name)"
   mimalloc
   ```
   
   I'm not sure off the top of my head what options are available in the pyarrow wheel.
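
One way to find out which allocators a given pyarrow build includes is to ask for each pool and see which calls succeed. This is a sketch, not a confirmed recipe: `probe_backends` is a hypothetical helper, and it assumes the installed pyarrow exposes `system_memory_pool`, `jemalloc_memory_pool` and `mimalloc_memory_pool` and raises `NotImplementedError` for a backend that was not compiled in. It also may not avoid the page-size error if jemalloc initializes when the library is loaded.

```python
def probe_backends(factories):
    """Map each backend name to True if its factory returns a pool."""
    available = {}
    for name, factory in factories.items():
        try:
            factory()
            available[name] = True
        except NotImplementedError:
            # Assumed to be raised when the backend is not part of this build.
            available[name] = False
    return available


if __name__ == "__main__":
    try:
        import pyarrow as pa
    except ImportError:
        print("pyarrow is not installed")
    else:
        print(probe_backends({
            "system": pa.system_memory_pool,
            "jemalloc": pa.jemalloc_memory_pool,
            "mimalloc": pa.mimalloc_memory_pool,
        }))
```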
   
   Also, some research suggests that there might be certain settings that can be set to get jemalloc working.  Can you share your Dockerfile so I can try and reproduce?
   
   > I contacted the mailing list on the 27th of July for this. Thanks!
   
   Yikes, unfortunately, I don't see any message on that date on either [user@](https://lists.apache.org/list.html?user@arrow.apache.org) or [dev@](https://lists.apache.org/list.html?dev@arrow.apache.org)
   
   Did you make sure to subscribe to the mailing list before you sent your message?  The mailing list will ignore emails from unsubscribed users.  If you're pretty confident you subscribed then email me at weston dot pace at gmail dot com (remove spaces and replace `dot` with `.` and `at` with `@`) so I can get your email.





[GitHub] [arrow] nesyamun closed issue #10929: Help with running pyrrow on arm64 in Docker

Posted by GitBox <gi...@apache.org>.
nesyamun closed issue #10929:
URL: https://github.com/apache/arrow/issues/10929


   





[GitHub] [arrow] nesyamun commented on issue #10929: Help with running pyrrow on arm64 in Docker

Posted by GitBox <gi...@apache.org>.
nesyamun commented on issue #10929:
URL: https://github.com/apache/arrow/issues/10929#issuecomment-901572751


   @kszucs Thanks! I've tested this and it works: https://github.com/apache/arrow/pull/10940#issuecomment-900317827





[GitHub] [arrow] kszucs commented on issue #10929: Help with running pyrrow on arm64 in Docker

Posted by GitBox <gi...@apache.org>.
kszucs commented on issue #10929:
URL: https://github.com/apache/arrow/issues/10929#issuecomment-898334022


   > I would expect the error to also happen when run using Qemu. It could also be that the kernel of the graviton instances has a larger than normal page size and thus only breaks on smaller machines.
   
   Perhaps we can overcome that issue using https://github.com/jemalloc/jemalloc/issues/467#issuecomment-294383178 ?





[GitHub] [arrow] xhochy commented on issue #10929: Help with running pyrrow on arm64 in Docker

Posted by GitBox <gi...@apache.org>.
xhochy commented on issue #10929:
URL: https://github.com/apache/arrow/issues/10929#issuecomment-898331375


   > We build the arm64 wheels on graviton2 travis instances [in docker](https://github.com/apache/arrow/blob/master/dev/tasks/python-wheels/travis.linux.arm64.yml#L59), so we do not cross-compile. @xhochy do you have a reference to the issue?
   > 
   
   I came across this while building the conda packages for osx-arm64. There is no real documentation except what is in the recipe.
   
   > Sadly I don't have an arm64 machine at hand, so I'm unable to reproduce the issue.
   > A wheel can be produced using the following command:
   >
   > ```console
   > pip install -e arrow/dev/archery[docker]
   > ARCH=arm64v8 PYTHON=3.8 archery docker run python-wheel-manylinux-2014
   > # wheel is going to be available under arrow/python/repaired_wheels/
   > ```
   
   I would expect the error to also happen when run using Qemu. It could also be that the kernel of the graviton instances has a larger than normal page size and thus only breaks on smaller machines.
   

