Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/01/26 08:22:34 UTC
[GitHub] [arrow] EnvironmentalEngineer opened a new issue #9326: ERROR: file not found: pyarrow and to_parquet() not working
EnvironmentalEngineer opened a new issue #9326:
URL: https://github.com/apache/arrow/issues/9326
Hi,
I am using pyarrow 1.0.0.
Writing a DataFrame to Parquet fails for integer and float columns but works for object columns. When I run the following Python script:
import pandas as pd
sample = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})
sample.to_parquet('tmp.parquet')
I get the following error:
ArrowTypeError: ('Did not pass numpy.dtype object', 'Conversion failed for column a with type int64')
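Since the error mentions numpy.dtype, one quick check is to list the installed versions of the libraries involved side by side. The sketch below is only a diagnostic aid: the helper name installed_versions is hypothetical, it relies on importlib.metadata (standard library from Python 3.8 onward, so newer than the Python 3.7.3 in this report), and whether the pairing of pyarrow 1.0.0 with the pre-release numpy 1.20.0rc2 is the actual cause is an assumption that this thread does not confirm.

```python
from importlib import metadata  # standard library in Python 3.8+


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in this environment.
            versions[name] = None
    return versions


# Report the three libraries that take part in DataFrame.to_parquet().
for name, version in installed_versions(["pandas", "numpy", "pyarrow"]).items():
    print(f"{name}: {version or 'not installed'}")
```

This reports the same information as the pd.show_versions() output further down, limited to the packages relevant to the failing call.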
When I run pytest pyarrow, I get a "file not found" error:
============================= test session starts ==============================
platform linux -- Python 3.7.3, pytest-4.3.1, py-1.8.0, pluggy-0.9.0
rootdir: /home/ubuntu, inifile:
plugins: remotedata-0.3.1, openfiles-0.3.2, doctestplus-0.3.0, arraydiff-0.3
collecting ...
========================= no tests ran in 0.00 seconds =========================
ERROR: file not found: pyarrow
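A note on the message above: by default pytest treats a positional argument as a file or directory path relative to the current directory, so "file not found: pyarrow" most likely means there is no ./pyarrow path under /home/ubuntu, not that the package is missing. The --pyargs option makes pytest resolve the argument as an importable package instead. A minimal sketch that builds such a command follows; the helper name pytest_command is hypothetical, and it assumes pytest is installed for the same interpreter.

```python
import sys


def pytest_command(package):
    """Build a pytest invocation that resolves `package` as an importable module."""
    # --pyargs tells pytest to look the argument up as a Python package
    # rather than as a filesystem path.
    return [sys.executable, "-m", "pytest", "--pyargs", package]


# Print the command so it can be copied into a shell.
print(" ".join(pytest_command("pyarrow")))
```

Running the printed command collects the tests from the installed pyarrow package, wherever it lives in site-packages.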
The package is installed: when I run pip3.7 install --no-cache pyarrow, I get:
Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com
Requirement already satisfied: pyarrow in ./anaconda3/lib/python3.7/site-packages (1.0.0)
Requirement already satisfied: numpy>=1.14 in ./anaconda3/lib/python3.7/site-packages (from pyarrow) (1.20.0rc2)
Here is the output of pd.show_versions():
INSTALLED VERSIONS
------------------
commit : 9d598a5e1eee26df95b3910e3f2934890d062caa
python : 3.7.3.final.0
python-bits : 64
OS : Linux
OS-release : 5.4.0-1035-aws
Version : #37~18.04.1-Ubuntu SMP Wed Jan 6 22:31:04 UTC 2021
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : C.UTF-8
LOCALE : en_US.UTF-8
pandas : 1.2.1
numpy : 1.20.0rc2
pytz : 2018.9
dateutil : 2.8.0
pip : 21.0
setuptools : 52.0.0
Cython : 0.29.6
pytest : 4.3.1
hypothesis : None
sphinx : 1.8.5
blosc : None
feather : None
xlsxwriter : 1.1.5
lxml.etree : 4.3.2
html5lib : 1.0.1
pymysql : None
psycopg2 : None
jinja2 : 2.10
IPython : 7.4.0
pandas_datareader: None
bs4 : 4.7.1
bottleneck : 1.2.1
fsspec : 0.8.3
fastparquet : None
gcsfs : None
matplotlib : 3.0.3
numexpr : 2.6.9
odfpy : None
openpyxl : 2.6.1
pandas_gbq : None
pyarrow : 1.0.0
pyxlsb : None
s3fs : None
scipy : 1.4.1
sqlalchemy : 1.3.1
tables : 3.5.1
tabulate : None
xarray : None
xlrd : 1.2.0
xlwt : 1.3.0
numba : 0.43.1
Could anyone please help me with this? Thanks!
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] jorisvandenbossche commented on issue #9326: ERROR: file not found: pyarrow and to_parquet() not working
Posted by GitBox <gi...@apache.org>.
jorisvandenbossche commented on issue #9326:
URL: https://github.com/apache/arrow/issues/9326#issuecomment-768915707
Already answered at https://github.com/pandas-dev/pandas/issues/39411
[GitHub] [arrow] jorisvandenbossche closed issue #9326: ERROR: file not found: pyarrow and to_parquet() not working
Posted by GitBox <gi...@apache.org>.
jorisvandenbossche closed issue #9326:
URL: https://github.com/apache/arrow/issues/9326