Posted to issues@arrow.apache.org by "Uwe L. Korn (JIRA)" <ji...@apache.org> on 2019/02/15 10:44:00 UTC

[jira] [Created] (ARROW-4582) [C++/Python] Memory corruption on Pandas->Arrow conversion

Uwe L. Korn created ARROW-4582:
----------------------------------

             Summary: [C++/Python] Memory corruption on Pandas->Arrow conversion
                 Key: ARROW-4582
                 URL: https://issues.apache.org/jira/browse/ARROW-4582
             Project: Apache Arrow
          Issue Type: Bug
          Components: C++, Python
    Affects Versions: 0.12.0, 0.11.1, 0.11.0
            Reporter: Uwe L. Korn
            Assignee: Uwe L. Korn
             Fix For: 0.13.0


When converting DataFrames with numerical columns to Arrow tables, we saw random segfaults in core Python code. They only occurred in environments with a high level of parallelisation or slow code execution (e.g. AddressSanitizer builds).

The cause of these segfaults was that we incremented the reference count of the underlying NumPy buffer without holding the GIL. Reference count updates in CPython are not atomic, so an update made without the GIL can race with other threads and corrupt the count.
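
The pattern of taking the GIL before touching a reference count can be sketched as follows. This is only an illustration, not the code from the actual fix: it uses Python's ctypes as a stand-in for the C++ conversion code, and the helper names incref_with_gil and decref_with_gil are hypothetical. In C++ the same shape would be a PyGILState_Ensure() / PyGILState_Release() pair around Py_INCREF.

```python
import ctypes
import sys

# ctypes.pythonapi exposes the CPython C API of the running interpreter.
api = ctypes.pythonapi

api.PyGILState_Ensure.restype = ctypes.c_int
api.PyGILState_Ensure.argtypes = []
api.PyGILState_Release.restype = None
api.PyGILState_Release.argtypes = [ctypes.c_int]
api.Py_IncRef.restype = None
api.Py_IncRef.argtypes = [ctypes.py_object]
api.Py_DecRef.restype = None
api.Py_DecRef.argtypes = [ctypes.py_object]


def incref_with_gil(obj):
    """Hypothetical helper: increment obj's refcount only while holding the GIL."""
    state = api.PyGILState_Ensure()      # acquire the GIL (safe if already held)
    try:
        api.Py_IncRef(obj)               # refcount change happens under the GIL
    finally:
        api.PyGILState_Release(state)    # restore the previous GIL state


def decref_with_gil(obj):
    """Hypothetical helper: matching decrement, also performed under the GIL."""
    state = api.PyGILState_Ensure()
    try:
        api.Py_DecRef(obj)
    finally:
        api.PyGILState_Release(state)


if __name__ == "__main__":
    buf = bytearray(16)                  # stand-in for a NumPy buffer owner
    before = sys.getrefcount(buf)
    incref_with_gil(buf)
    print(sys.getrefcount(buf) - before)
    decref_with_gil(buf)                 # release the extra reference again
```

When called from Python the GIL is already held, so the Ensure/Release pair is redundant here; it matters in native code that may run on a thread which has released the GIL or never acquired it.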



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)