Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2020/11/04 09:33:37 UTC

[GitHub] [arrow] malthe opened a new pull request #8588: ARROW-4637: [Python] Must lock on import, to avoid deadlock due to circular imports

malthe opened a new pull request #8588:
URL: https://github.com/apache/arrow/pull/8588


   This fixes a deadlock that could occur on import when multiple threads use the library concurrently, caused by the lazy Pandas importing mechanism.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] wesm commented on pull request #8588: ARROW-4637: [Python] Must lock on import, to avoid deadlock due to circular imports

Posted by GitBox <gi...@apache.org>.
wesm commented on pull request #8588:
URL: https://github.com/apache/arrow/pull/8588#issuecomment-722058751


   Do you know why the deadlock occurs? It's not especially intuitive to me





[GitHub] [arrow] kszucs commented on pull request #8588: ARROW-4637: [Python] Must lock on import, to avoid deadlock due to circular imports

Posted by GitBox <gi...@apache.org>.
kszucs commented on pull request #8588:
URL: https://github.com/apache/arrow/pull/8588#issuecomment-723917858


   Can we test this?





[GitHub] [arrow] pitrou commented on pull request #8588: ARROW-4637: [Python] Must lock on import, to avoid deadlock due to circular imports

Posted by GitBox <gi...@apache.org>.
pitrou commented on pull request #8588:
URL: https://github.com/apache/arrow/pull/8588#issuecomment-724014295


   @malthe Could you please open a separate JIRA for this?





[GitHub] [arrow] pitrou closed pull request #8588: ARROW-4637: [Python] Must lock on import, to avoid deadlock due to circular imports

Posted by GitBox <gi...@apache.org>.
pitrou closed pull request #8588:
URL: https://github.com/apache/arrow/pull/8588


   





[GitHub] [arrow] pitrou commented on pull request #8588: ARROW-4637: [Python] Must lock on import, to avoid deadlock due to circular imports

Posted by GitBox <gi...@apache.org>.
pitrou commented on pull request #8588:
URL: https://github.com/apache/arrow/pull/8588#issuecomment-724037836


   Opened ARROW-10519.





[GitHub] [arrow] pitrou commented on pull request #8588: ARROW-4637: [Python] Must lock on import, to avoid deadlock due to circular imports

Posted by GitBox <gi...@apache.org>.
pitrou commented on pull request #8588:
URL: https://github.com/apache/arrow/pull/8588#issuecomment-724044398


   Closed in favour of #8615.





[GitHub] [arrow] pitrou commented on pull request #8588: ARROW-4637: [Python] Must lock on import, to avoid deadlock due to circular imports

Posted by GitBox <gi...@apache.org>.
pitrou commented on pull request #8588:
URL: https://github.com/apache/arrow/pull/8588#issuecomment-724031162


   I've got a reproducer. The problem is actually more complex. We've got a deadlock between two threads:
   
   Thread A:
   * takes the GIL
   * calls `arrow::py::internal::InitPandasStaticData`
     * which calls `std::call_once`, which acquires the lock inside the `std::once_flag`
       * which imports the `pandas` module
         * which releases the GIL before reading the code object from disk
         * and then tries to re-acquire the GIL
   
   Thread B:
   * takes the GIL
   * calls `arrow::py::internal::InitPandasStaticData`
     * which calls `std::call_once`, which tries to acquire the lock inside the `std::once_flag`
   
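   So there is a lock-ordering deadlock between the Python GIL and the `std::once_flag` lock: thread A holds the `std::once_flag` lock and waits for the GIL, while thread B (which took the GIL when thread A released it) holds the GIL and waits for the `std::once_flag` lock.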
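
The sequence above can be modelled in pure Python as a rough sketch. This is not the Arrow code: the GIL and the `std::once_flag` lock are both stood in for by ordinary `threading.Lock` objects, and the function name `simulate_lock_order_deadlock` and the `timeout` parameter are hypothetical. Timed acquires are used so that the sketch reports the stuck state instead of hanging.

```python
import threading


def simulate_lock_order_deadlock(timeout=0.2):
    """Model the two-thread scenario with plain locks and timed acquires."""
    gil = threading.Lock()    # assumption: the GIL modelled as a simple mutex
    once = threading.Lock()   # assumption: the lock inside std::once_flag
    a_released_gil = threading.Event()
    b_holds_gil = threading.Event()
    both_done = threading.Barrier(2)
    outcome = {}

    def thread_a():
        gil.acquire()                        # takes the GIL
        once.acquire()                       # enters the call_once body
        gil.release()                        # import releases the GIL for disk I/O
        a_released_gil.set()
        b_holds_gil.wait()                   # thread B grabs the GIL meanwhile
        got = gil.acquire(timeout=timeout)   # tries to re-acquire the GIL
        outcome["a_got_gil"] = got
        both_done.wait()                     # release nothing until both attempts ended
        if got:
            gil.release()
        once.release()

    def thread_b():
        a_released_gil.wait()
        gil.acquire()                        # takes the GIL
        b_holds_gil.set()
        got = once.acquire(timeout=timeout)  # tries to enter call_once
        outcome["b_got_once"] = got
        both_done.wait()
        if got:
            once.release()
        gil.release()

    threads = [threading.Thread(target=thread_a),
               threading.Thread(target=thread_b)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return outcome


if __name__ == "__main__":
    print(simulate_lock_order_deadlock())
```

In this model neither timed acquire succeeds, because each thread holds the lock the other one is waiting for. With blocking acquires, as in the real scenario, both threads would wait forever.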





[GitHub] [arrow] malthe commented on pull request #8588: ARROW-4637: [Python] Must lock on import, to avoid deadlock due to circular imports

Posted by GitBox <gi...@apache.org>.
malthe commented on pull request #8588:
URL: https://github.com/apache/arrow/pull/8588#issuecomment-722174670


   @wesm due to circular imports. Normally, this is not a problem because at least they're resolved in a predictable way, but it can cause deadlocks when multiple threads are trying to do it concurrently.





[GitHub] [arrow] wesm commented on pull request #8588: ARROW-4637: [Python] Must lock on import, to avoid deadlock due to circular imports

Posted by GitBox <gi...@apache.org>.
wesm commented on pull request #8588:
URL: https://github.com/apache/arrow/pull/8588#issuecomment-722500005


   OK, I updated the PR description so we have a better explanation of the problem in the changelog





[GitHub] [arrow] malthe edited a comment on pull request #8588: ARROW-4637: [Python] Must lock on import, to avoid deadlock due to circular imports

Posted by GitBox <gi...@apache.org>.
malthe edited a comment on pull request #8588:
URL: https://github.com/apache/arrow/pull/8588#issuecomment-722174670


   @wesm due to circular imports – in Pandas. Normally, this is not a problem because at least they're resolved in a predictable way, but it can cause deadlocks when multiple threads are trying to do it concurrently.





[GitHub] [arrow] github-actions[bot] commented on pull request #8588: ARROW-4637: [Python] Must lock on import, to avoid deadlock due to circular imports

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #8588:
URL: https://github.com/apache/arrow/pull/8588#issuecomment-721627353


   https://issues.apache.org/jira/browse/ARROW-4637

