Posted to issues@arrow.apache.org by "Wes McKinney (JIRA)" <ji...@apache.org> on 2017/05/06 14:40:04 UTC

[jira] [Commented] (ARROW-955) ImportError: No module named _config

    [ https://issues.apache.org/jira/browse/ARROW-955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15999449#comment-15999449 ] 

Wes McKinney commented on ARROW-955:
------------------------------------

Hi [~derringdo], it looks like you may be importing pyarrow from the source folder. Without more complete output from your console session, it's hard for me to tell what else might be wrong with your environment.
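
One way to check whether the source folder is shadowing the installed package is sketched below. This is only an illustration: the helper names find_package_dirs and has_compiled_module are hypothetical, and it assumes that _config ships as a compiled extension (a .so or .pyd file) in a built install but is absent from an unbuilt source checkout. It uses only the standard library and should run on both Python 2.7 and Python 3.

```python
import os
import sys


def find_package_dirs(name, search_path=None):
    """Return every directory on the search path holding package `name`, in import order."""
    if search_path is None:
        search_path = sys.path
    hits = []
    for entry in search_path:
        # An empty entry on sys.path means the current working directory.
        base = entry or os.getcwd()
        candidate = os.path.join(base, name)
        if os.path.isfile(os.path.join(candidate, "__init__.py")):
            hits.append(os.path.abspath(candidate))
    return hits


def has_compiled_module(package_dir, module_name):
    """Return True if package_dir contains a compiled extension named module_name."""
    for filename in os.listdir(package_dir):
        stem = filename.split(".")[0]
        # Assumption: compiled extensions end in .so (Linux/macOS) or .pyd (Windows).
        if stem == module_name and filename.endswith((".so", ".pyd")):
            return True
    return False


if __name__ == "__main__":
    # The first directory listed is the one that `import pyarrow` would pick up.
    for package_dir in find_package_dirs("pyarrow"):
        found = has_compiled_module(package_dir, "_config")
        print("{0} (compiled _config present: {1})".format(package_dir, found))
```

If the first directory printed is the source checkout and it reports no compiled _config, running Python from a different working directory is a quick way to test whether the shadowing is the cause.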

I strongly recommend that you install these binary libraries from conda-forge, which have just been updated for the 0.3.0 release:

{code}
conda install pyarrow -c conda-forge
{code}

There is also a source build guide that uses conda packages: https://github.com/apache/arrow/blob/master/python/doc/source/development.rst. I hope that someone will contribute detailed development instructions for other platforms and package managers.

> ImportError: No module named _config
> ------------------------------------
>
>                 Key: ARROW-955
>                 URL: https://issues.apache.org/jira/browse/ARROW-955
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>         Environment: Ubuntu - 3.19.0-80-generic #88~14.04.1-Ubuntu
> Python 2.7.6
>            Reporter: Devang Shah
>            Priority: Blocker
>
> I built pyarrow, arrow, and parquet-cpp from source - so that I could use the new read_row_group() interface and in general, have access to the latest versions. I ran into many issues during the build but was ultimately successful (notes below). However, I am not able to import pyarrow.parquet due to the following issue:
> >>> import pyarrow.parquet
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "pyarrow/__init__.py", line 28, in <module>
>     import pyarrow._config
> ImportError: No module named _config
> This is similar to an issue reported in github/conda-forge/pyarrow-feedstock, where I also posted this, but I think this forum is more direct and appropriate, so I am re-posting here.
> I used the instructions at https://arrow.apache.org/docs/python/install.html to build arrow/cpp, parquet-cpp, and then pyarrow, with the following deviations (I view them as possible bugs in the instructions):
> arrow/cpp build:
> export ARROW_HOME=$HOME/local
> I had to specify -DARROW_PYTHON=on and -DPARQUET_ARROW=ON to the cmake command (besides the -DCMAKE_INSTALL_PREFIX=$ARROW_HOME)
> parquet-cpp build:
> export ARROW_HOME=$HOME/local
> cmake -DARROW_HOME=$HOME/local -DPARQUET_ARROW_LINKAGE=static -DPARQUET_ARROW=ON .
> make
> sudo make install ----> this installs parquet libs in the std systems location (/usr/local/lib) so that the pyarrow build (see below) can find the parquet libs
> pyarrow build:
> export ARROW_HOME=$HOME/local (not a deviation; just repeating here)
> export LD_LIBRARY_PATH=$HOME/local/lib:$HOME/parquet4/parquet-cpp/build/latest
> sudo python setup.py build_ext --with-parquet --with-jemalloc --build-type=release install
> sudo python setup.py install
> (sudo is needed to install in /usr/local/lib/python2.7/dist-packages )
> These are the steps and modifications to the instructions needed for me to build the pyarrow.parquet package. However, when I now try to import the package I get the error specified above.
> Maybe I did something wrong in my steps which I kind of put together by searching for these issues...but really can't tell what. It took me almost a whole day to get to the point where I can build pyarrow and parquet, and now I can't use what I built.
> Any comments, help appreciated! Thanks in advance.


