Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/01/26 08:22:34 UTC

[GitHub] [arrow] EnvironmentalEngineer opened a new issue #9326: ERROR: file not found: pyarrow and to_parquet() not working

EnvironmentalEngineer opened a new issue #9326:
URL: https://github.com/apache/arrow/issues/9326


   Hi,
   
   I am using pyarrow 1.0.0.
   
   Writing to Parquet fails for integer and float columns but works for object columns. When I run the following Python script:
   import pandas as pd
   sample = pd.DataFrame({'a':[1, 2], 'b': [3, 4]})
   sample.to_parquet('tmp.parquet')
   
   I get:
   ArrowTypeError: ('Did not pass numpy.dtype object', 'Conversion failed for column a with type int64')
   
   When I run pytest pyarrow, I get a file-not-found error:
   ============================= test session starts ==============================
   platform linux -- Python 3.7.3, pytest-4.3.1, py-1.8.0, pluggy-0.9.0
   rootdir: /home/ubuntu, inifile:
   plugins: remotedata-0.3.1, openfiles-0.3.2, doctestplus-0.3.0, arraydiff-0.3
   collecting ... 
   ========================= no tests ran in 0.00 seconds =========================
   ERROR: file not found: pyarrow
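
A note on the pytest error above: pytest treats a bare positional argument as a file or directory path, so "pytest pyarrow" run from /home/ubuntu looks for ./pyarrow and reports it missing even when the package is importable. pytest's --pyargs option asks it to interpret arguments as importable package names instead. The sketch below checks whether, and from where, the running interpreter can import a package. The helper name package_location is hypothetical, and whether this particular pyarrow install ships its test modules is an assumption.

```python
import importlib.util


def package_location(name):
    """Return the file a top-level package would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None  # not importable from this interpreter
    return spec.origin


if __name__ == "__main__":
    location = package_location("pyarrow")
    if location is None:
        print("pyarrow is not importable from this interpreter")
    else:
        print("pyarrow would be imported from:", location)
        # To run the installed package's tests by import name rather than
        # by path, the usual invocation is:  pytest --pyargs pyarrow
```

If the printed location is not under the same site-packages directory that pip reports, the script and pip are probably using different Python environments.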
   
   The package is installed, because when I run pip3.7 install --no-cache pyarrow, I get:
   Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com
   Requirement already satisfied: pyarrow in ./anaconda3/lib/python3.7/site-packages (1.0.0)
   Requirement already satisfied: numpy>=1.14 in ./anaconda3/lib/python3.7/site-packages (from pyarrow) (1.20.0rc2)
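
The pip output above shows pyarrow 1.0.0 installed next to a numpy release candidate (1.20.0rc2). One plausible cause of the ArrowTypeError, not confirmed anywhere in this thread, is a binary mismatch between that pyarrow build and the newer numpy. A minimal sketch for recording the versions that the running interpreter actually sees follows. The helper name installed_versions is hypothetical, and on Python 3.7 it assumes the importlib_metadata backport is available.

```python
try:
    from importlib import metadata  # standard library on Python 3.8+
except ImportError:
    import importlib_metadata as metadata  # backport, assumed installed on 3.7


def installed_versions(names):
    """Map each distribution name to its version string, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Print the three packages involved in DataFrame.to_parquet().
    for name, version in installed_versions(["numpy", "pandas", "pyarrow"]).items():
        print(f"{name:10s} {version}")
```

Running this with the same interpreter that raises the error, rather than relying on pip alone, gives a version list that can be pasted into a bug report.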
   
   Here is the output of pd.show_versions():
   
   INSTALLED VERSIONS
   ------------------
   commit           : 9d598a5e1eee26df95b3910e3f2934890d062caa
   python           : 3.7.3.final.0
   python-bits      : 64
   OS               : Linux
   OS-release       : 5.4.0-1035-aws
   Version          : #37~18.04.1-Ubuntu SMP Wed Jan 6 22:31:04 UTC 2021
   machine          : x86_64
   processor        : x86_64
   byteorder        : little
   LC_ALL           : None
   LANG             : C.UTF-8
   LOCALE           : en_US.UTF-8
   
   pandas           : 1.2.1
   numpy            : 1.20.0rc2
   pytz             : 2018.9
   dateutil         : 2.8.0
   pip              : 21.0
   setuptools       : 52.0.0
   Cython           : 0.29.6
   pytest           : 4.3.1
   hypothesis       : None
   sphinx           : 1.8.5
   blosc            : None
   feather          : None
   xlsxwriter       : 1.1.5
   lxml.etree       : 4.3.2
   html5lib         : 1.0.1
   pymysql          : None
   psycopg2         : None
   jinja2           : 2.10
   IPython          : 7.4.0
   pandas_datareader: None
   bs4              : 4.7.1
   bottleneck       : 1.2.1
   fsspec           : 0.8.3
   fastparquet      : None
   gcsfs            : None
   matplotlib       : 3.0.3
   numexpr          : 2.6.9
   odfpy            : None
   openpyxl         : 2.6.1
   pandas_gbq       : None
   pyarrow          : 1.0.0
   pyxlsb           : None
   s3fs             : None
   scipy            : 1.4.1
   sqlalchemy       : 1.3.1
   tables           : 3.5.1
   tabulate         : None
   xarray           : None
   xlrd             : 1.2.0
   xlwt             : 1.3.0
   numba            : 0.43.1
   
   Could anyone please help me with this? Thanks!


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jorisvandenbossche commented on issue #9326: ERROR: file not found: pyarrow and to_parquet() not working

Posted by GitBox <gi...@apache.org>.
jorisvandenbossche commented on issue #9326:
URL: https://github.com/apache/arrow/issues/9326#issuecomment-768915707


   Already answered at https://github.com/pandas-dev/pandas/issues/39411





[GitHub] [arrow] jorisvandenbossche closed issue #9326: ERROR: file not found: pyarrow and to_parquet() not working

Posted by GitBox <gi...@apache.org>.
jorisvandenbossche closed issue #9326:
URL: https://github.com/apache/arrow/issues/9326


   

