Posted to issues@arrow.apache.org by "Tanja Miličić (JIRA)" <ji...@apache.org> on 2017/07/29 06:45:03 UTC

[jira] [Updated] (ARROW-1293) Module initialization error when using pyarrow with AWS Lambda

     [ https://issues.apache.org/jira/browse/ARROW-1293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tanja Miličić updated ARROW-1293:
---------------------------------
    Description: 
When using pyarrow in an AWS Lambda function like this:

{code:python}
import pyarrow as pa
import pyarrow.parquet as pq
import pandas as pd

df = pd.DataFrame([data])  # data is a dictionary
table = pa.Table.from_pandas(df)
pq.write_table(table, 'tmp/test.parquet', compression='snappy')
table = pq.read_table('tmp/test.parquet')
table.to_pandas()
print(table)
{code}

the following module initialization error occurs:

{code:python}
module initialization error: [Errno 2] No such file or directory: '/var/task/__pycache__/_cffi__x762f05ffx6bf5342b.c'
{code}
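
The path in the message suggests that something tries to create a C source file under /var/task/__pycache__ while a module is being imported. That reading is an assumption and is not confirmed in this report. A minimal diagnostic sketch, using only the standard library, that checks whether a file can be created in a given directory; the helper name can_create_file is hypothetical:

```python
import os
import tempfile


def can_create_file(directory):
    """Return True if a new file can be created inside `directory`."""
    try:
        # mkstemp raises OSError if the directory is missing or not writable.
        fd, path = tempfile.mkstemp(dir=directory)
    except OSError:
        return False
    os.close(fd)
    os.remove(path)
    return True


if __name__ == "__main__":
    # Directories taken from the error message above, plus /tmp for comparison.
    for directory in ("/var/task/__pycache__", "/tmp"):
        print(directory, can_create_file(directory))
```

Running this inside the Lambda function before the failing import may help show whether the directory named in the error is missing or read-only.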

The deployment package was prepared by running the following commands:

{code}
virtualenv nameofenv
source nameofenv/bin/activate
pip install pyarrow
sudo apt-get install libsnappy-dev
pip install python-snappy
pip install pandas
{code}

Files from the site-packages directory are then zipped together with the Lambda function.
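
The zipping step can be sketched in Python with the standard zipfile module. This is only an illustration of the step described above, not the exact procedure used for this report; the function name build_package and all paths in the usage note are hypothetical.

```python
import os
import zipfile


def build_package(site_packages, handler_file, output_zip):
    """Zip the contents of `site_packages` and `handler_file` at the archive root."""
    with zipfile.ZipFile(output_zip, "w", zipfile.ZIP_DEFLATED) as zf:
        for root, _dirs, files in os.walk(site_packages):
            for name in files:
                full_path = os.path.join(root, name)
                # Store paths relative to site-packages so that packages
                # end up at the top level of the archive.
                zf.write(full_path, os.path.relpath(full_path, site_packages))
        # The handler module sits next to the packages at the archive root.
        zf.write(handler_file, os.path.basename(handler_file))
    return output_zip
```

A call might look like build_package("nameofenv/lib/pythonX.Y/site-packages", "lambda_function.py", "package.zip"), where the site-packages path depends on the Python version in the virtualenv and the handler file name is an assumption.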

  was:
When using pyarrow in an AWS Lambda function like this:

{code:python}
import pyarrow as pa
import pyarrow.parquet as pq
import pandas as pd

df = pd.DataFrame([data]) #data is dictionary
table = pa.Table.from_pandas(df)
pq.write_table(table, 'tmp/test.parquet', compression='snappy')
table = pq.read_table('tmp/test.parquet')
table.to_pandas()
print(table)
{code}

the following module initialization error occurs:

{code:python}
module initialization error: [Errno 2] No such file or directory: '/var/task/__pycache__/_cffi__x762f05ffx6bf5342b.c'
{code}

The deployment package was prepared by running the following commands:

{code:sh}
virtualenv nameofenv
source nameofenv/bin/activate
pip install pyarrow
sudo apt-get install libsnappy-dev
pip install python-snappy
pip install pandas
{code}

Files from the site-packages directory are then zipped together with the Lambda function.


> Module initialization error when using pyarrow with AWS Lambda
> --------------------------------------------------------------
>
>                 Key: ARROW-1293
>                 URL: https://issues.apache.org/jira/browse/ARROW-1293
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>    Affects Versions: 0.5.0
>         Environment: AWS Lambda
>            Reporter: Tanja Miličić



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)