Posted to jira@arrow.apache.org by "Todd Farmer (Jira)" <ji...@apache.org> on 2022/07/12 14:05:02 UTC

[jira] [Assigned] (ARROW-15642) [Python] [JavaScript] Arrow IPC file output by apache-arrow tableToIPC method cannot be read by pyarrow

     [ https://issues.apache.org/jira/browse/ARROW-15642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Farmer reassigned ARROW-15642:
-----------------------------------

    Assignee:     (was: Weston Pace)

This issue was last updated over 90 days ago, which may be an indication it is no longer being actively worked. To better reflect the current state, the issue is being unassigned. Please feel free to re-take assignment of the issue if it is being actively worked, or if you plan to start that work soon.

> [Python] [JavaScript] Arrow IPC file output by apache-arrow tableToIPC method cannot be read by pyarrow
> -------------------------------------------------------------------------------------------------------
>
>                 Key: ARROW-15642
>                 URL: https://issues.apache.org/jira/browse/ARROW-15642
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: JavaScript, Python
>    Affects Versions: 7.0.0
>            Reporter: Dan Coates
>            Priority: Major
>
> IPC files created by the Node library `apache-arrow` do not appear to be readable by pyarrow. A reproduction of the issue is available here: [https://github.com/dancoates/pyarrow-jsarrow-test]
>  
> Writing the Arrow file from JavaScript:
> {code:javascript}
> import {tableToIPC, tableFromArrays} from 'apache-arrow';
> import fs from 'fs';
> const LENGTH = 2000;
> const rainAmounts = Float32Array.from(
>     { length: LENGTH },
>     () => Number((Math.random() * 20).toFixed(1)));
> const rainDates = Array.from(
>     { length: LENGTH },
>     (_, i) => new Date(Date.now() - 1000 * 60 * 60 * 24 * i));
> const rainfall = tableFromArrays({
>     precipitation: rainAmounts,
>     date: rainDates
> });
> const outputTable = tableToIPC(rainfall);
> fs.writeFileSync('jsarrow.arrow', outputTable);
> {code}
>  
> Reading it in Python:
> {code:python}
> import pyarrow as pa
> with open('jsarrow.arrow', 'rb') as f:
>     with pa.ipc.open_file(f) as reader:
>         df = reader.read_pandas()
>         print(df.head())
>  {code}
>  
> This produces the error:
> {code}
> pyarrow.lib.ArrowInvalid: Not an Arrow file
> {code}
>  
>  
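
One thing that may be worth checking when `pa.ipc.open_file` raises "Not an Arrow file" is whether the bytes on disk are in the Arrow IPC streaming format rather than the IPC file format. The sketch below is not part of the original report. It relies only on the Arrow columnar format specification, under which an IPC file starts and ends with the magic bytes `ARROW1`, while a stream written by recent versions starts with the continuation marker `0xFFFFFFFF`. The helper name `sniff_ipc_format` is hypothetical, and the check is a heuristic, not a validator.

```python
ARROW_MAGIC = b"ARROW1"             # magic at the start and end of an IPC *file*
CONTINUATION = b"\xff\xff\xff\xff"  # marker that prefixes each message in an IPC *stream*


def sniff_ipc_format(data: bytes) -> str:
    """Guess whether `data` looks like an Arrow IPC file, an IPC stream, or neither.

    Heuristic only: it inspects the leading and trailing bytes and does not
    parse the schema or footer. Streams written before the continuation
    marker was introduced will be reported as "unknown".
    """
    # A file carries the magic at both ends, so it needs room for two copies.
    if (
        len(data) >= 2 * len(ARROW_MAGIC)
        and data.startswith(ARROW_MAGIC)
        and data.endswith(ARROW_MAGIC)
    ):
        return "file"
    if data.startswith(CONTINUATION):
        return "stream"
    return "unknown"
```

To try it on the file from the report, read `jsarrow.arrow` in binary mode and pass the bytes to `sniff_ipc_format`. If the result is "stream", reading with `pa.ipc.open_stream` instead of `pa.ipc.open_file` may succeed. On the JavaScript side, it is an assumption here, not confirmed in this thread, that `tableToIPC` accepts a second argument such as `'file'` to select the file format.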



--
This message was sent by Atlassian Jira
(v8.20.10#820010)