Posted to jira@arrow.apache.org by "Joris Van den Bossche (Jira)" <ji...@apache.org> on 2020/06/15 14:56:00 UTC
[jira] [Commented] (ARROW-9117) [Python] Is there Pandas circular
dependency problem?
[ https://issues.apache.org/jira/browse/ARROW-9117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17135937#comment-17135937 ]
Joris Van den Bossche commented on ARROW-9117:
----------------------------------------------
[~Seungmin] was this solved for you?
> [Python] Is there Pandas circular dependency problem?
> -----------------------------------------------------
>
> Key: ARROW-9117
> URL: https://issues.apache.org/jira/browse/ARROW-9117
> Project: Apache Arrow
> Issue Type: Bug
> Components: Python
> Affects Versions: 0.17.1
> Reporter: SEUNGMIN HEO
> Priority: Major
>
> I'm using PyArrow to generate a Parquet dataset.
> Whenever I test my code, I encounter the same error:
> cannot import name 'BlockManager'.
> As far as I know, this kind of error often occurs when there is a circular dependency.
> This is my sample code to reproduce it:
>
>
> {code:python}
> import pyarrow
> from pyarrow import parquet
>
> # col1, file_col2, len_rows, csv_table, generate_schema, s3_path, table_name
> # and s3_fs are assumed to be defined earlier.
> field_col1 = pyarrow.field('col1', type=pyarrow.int64(), nullable=True, metadata=None)
> field_col2 = pyarrow.field('col2', type=pyarrow.date32(), nullable=True, metadata=None)
> col1_arr = pyarrow.array([col1] * len_rows, pyarrow.int64())
> col2_arr = pyarrow.array([file_col2] * len_rows, pyarrow.date32())
> # Prepend the two partition columns, then cast to the target schema.
> csv_table = csv_table.add_column(0, field_col2, col2_arr)
> csv_table = csv_table.add_column(0, field_col1, col1_arr)
> csv_table = csv_table.cast(generate_schema(csv_table))
> parquet.write_to_dataset(csv_table,
>                          f"{s3_path}/{table_name}",
>                          partition_cols=['col1', 'col2'],
>                          partition_filename_cb=lambda partition_cols: partition_cols[1].strftime("%Y-%m-%d"),
>                          filesystem=s3_fs,
>                          compression='snappy')
> {code}
> And this is the error message:
>
> {code:java}
> Traceback (most recent call last):
> File "pyarrow/pandas-shim.pxi", line 107, in pyarrow.lib._PandasAPIShim._check_import
> File "pyarrow/pandas-shim.pxi", line 44, in pyarrow.lib._PandasAPIShim._import_pandas
> File "/Users/aa/Desktop/Python/youtube-api-caller/venv/lib/python3.7/site-packages/pandas/__init__.py", line 42, in <module>
> from pandas.core.api import *
> File "/Users/aa/Desktop/Python/youtube-api-caller/venv/lib/python3.7/site-packages/pandas/core/api.py", line 26, in <module>
> from pandas.core.groupby import Grouper
> File "/Users/aa/Desktop/Python/youtube-api-caller/venv/lib/python3.7/site-packages/pandas/core/groupby/__init__.py", line 1, in <module>
> from pandas.core.groupby.groupby import GroupBy # noqa: F401
> File "/Users/aa/Desktop/Python/youtube-api-caller/venv/lib/python3.7/site-packages/pandas/core/groupby/groupby.py", line 37, in <module>
> from pandas.core.frame import DataFrame
> File "/Users/aa/Desktop/Python/youtube-api-caller/venv/lib/python3.7/site-packages/pandas/core/frame.py", line 87, in <module>
> from pandas.core.generic import NDFrame, _shared_docs
> File "/Users/aa/Desktop/Python/youtube-api-caller/venv/lib/python3.7/site-packages/pandas/core/generic.py", line 46, in <module>
> from pandas.core.internals import BlockManager
> File "<frozen importlib._bootstrap>", line 980, in _find_and_load
> File "<frozen importlib._bootstrap>", line 149, in __enter__
> File "<frozen importlib._bootstrap>", line 94, in acquire
> _frozen_importlib._DeadlockError: deadlock detected by _ModuleLock('pandas.core.internals') at 4776509328
> cannot import name 'BlockManager' from 'pandas.core.internals'
>
> {code}
>
>
> How can I solve this error?
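
One way to narrow this down, offered only as a sketch: the traceback shows pyarrow importing pandas lazily (through `_PandasAPIShim._import_pandas`), so it may help to check whether pandas imports cleanly on its own in the main thread before any pyarrow call. If it does, the failure may depend on when and where the lazy import happens (for example, from several threads at once) rather than on a true circular dependency. The helper name `check_import` is hypothetical and not part of pyarrow or pandas.

```python
import importlib


def check_import(module_name):
    """Try to import module_name in the current thread.

    Returns (ok, detail): detail is the module's __version__ on success
    (or "unknown version" if it has none), or the error text on failure.
    """
    try:
        module = importlib.import_module(module_name)
    except Exception as exc:  # ImportError, or the _DeadlockError quoted above
        return False, f"{type(exc).__name__}: {exc}"
    return True, getattr(module, "__version__", "unknown version")


if __name__ == "__main__":
    # Import pandas first and in the main thread, then pyarrow.
    for name in ("pandas", "pyarrow"):
        ok, detail = check_import(name)
        print(f"{name}: {'ok' if ok else 'FAILED'} ({detail})")
```

Running this inside the same virtualenv as the failing script also records the exact pandas and pyarrow versions, which is useful to attach to the issue.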