You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@arrow.apache.org by "Trey Hakanson (JIRA)" <ji...@apache.org> on 2019/06/07 22:43:00 UTC

[jira] [Updated] (ARROW-5532) Field Metadata Not Read

     [ https://issues.apache.org/jira/browse/ARROW-5532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Trey Hakanson updated ARROW-5532:
---------------------------------
    Description: 
Field metadata is not read when using @apache-arrow/ts@0.13.0`. Example below also uses `pyarrow==0.13.0`

Steps to reproduce:

Adding metadata:

{code:title=toarrow.py|borderStyle=solid}
import pyarrow as pa
import pandas as pd

source = "sample.csv"
output = "sample.arrow"
df = pd.read_csv(source)
table = pa.Table.from_pandas(df)
schema = pa.schema([
 column.field.add_metadata(\{"foo": "bar"}))
 for column
 in table.columns
])
writer = pa.RecordBatchFileWriter(output, schema)
writer.write(table)
writer.close()
{code}

Reading field metadata using `pyarrow`:

{code:title=readarrow.py|borderStyle=solid}
source = "sample.arrow"
field = "foo"
reader = pa.RecordBatchFileReader(source)
reader.schema.field_by_name(field).metadata # Correctly shows `\{"foo": "bar"}`
{code}

Reading field metadata using `@apache-arrow/ts`:

{code:title=toarrow.ts|borderStyle=solid}
import \{ Table, Field, Type } from "@apache-arrow/ts";

const url = "https://example.com/sample.arrow";
const buf = await fetch(url).then(res => res.arrayBuffer());
const table = Table.from([new Uint8Array(buf)]);
for (let field of table.schema.fields) {
 field.metadata; // Incorrectly shows an empty map
}
{code}

  was:
Field metadata is not read when using `@apache-arrow/ts@0.13.0`. Example below also uses `pyarrow==0.13.0`

Steps to reproduce:

Adding metadata:

```py
import pyarrow as pa
import pandas as pd

source = "sample.csv"
output = "sample.arrow"
df = pd.read_csv(source)
table = pa.Table.from_pandas(df)
schema = pa.schema([
 column.field.add_metadata(\{"foo": "bar"}))
 for column
 in table.columns
])
writer = pa.RecordBatchFileWriter(output, schema)
writer.write(table)
writer.close()
```

Reading field metadata using `pyarrow`:

```py
source = "sample.arrow"
field = "foo"
reader = pa.RecordBatchFileReader(source)
reader.schema.field_by_name(field).metadata # Correctly shows `\{"foo": "bar"}`
```

Reading field metadata using `@apache-arrow/ts`:

```ts
import \{ Table, Field, Type } from "@apache-arrow/ts";

const url = "https://example.com/sample.arrow";
const buf = await fetch(url).then(res => res.arrayBuffer());
const table = Table.from([new Uint8Array(buf)]);
for (let field of table.schema.fields) {
 field.metadata; // Incorrectly shows an empty map
}
```


> Field Metadata Not Read
> -----------------------
>
>                 Key: ARROW-5532
>                 URL: https://issues.apache.org/jira/browse/ARROW-5532
>             Project: Apache Arrow
>          Issue Type: Bug
>    Affects Versions: 0.13.0
>         Environment: Mac OSX 10.14, Chrome 74
>            Reporter: Trey Hakanson
>            Priority: Major
>
> Field metadata is not read when using @apache-arrow/ts@0.13.0`. Example below also uses `pyarrow==0.13.0`
> Steps to reproduce:
> Adding metadata:
> {code:title=toarrow.py|borderStyle=solid}
> import pyarrow as pa
> import pandas as pd
> source = "sample.csv"
> output = "sample.arrow"
> df = pd.read_csv(source)
> table = pa.Table.from_pandas(df)
> schema = pa.schema([
>  column.field.add_metadata(\{"foo": "bar"}))
>  for column
>  in table.columns
> ])
> writer = pa.RecordBatchFileWriter(output, schema)
> writer.write(table)
> writer.close()
> {code}
> Reading field metadata using `pyarrow`:
> {code:title=readarrow.py|borderStyle=solid}
> source = "sample.arrow"
> field = "foo"
> reader = pa.RecordBatchFileReader(source)
> reader.schema.field_by_name(field).metadata # Correctly shows `\{"foo": "bar"}`
> {code}
> Reading field metadata using `@apache-arrow/ts`:
> {code:title=toarrow.ts|borderStyle=solid}
> import \{ Table, Field, Type } from "@apache-arrow/ts";
> const url = "https://example.com/sample.arrow";
> const buf = await fetch(url).then(res => res.arrayBuffer());
> const table = Table.from([new Uint8Array(buf)]);
> for (let field of table.schema.fields) {
>  field.metadata; // Incorrectly shows an empty map
> }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)