Posted to dev@arrow.apache.org by "René Rex (Jira)" <ji...@apache.org> on 2020/05/05 09:04:00 UTC

[jira] [Created] (ARROW-8703) [R][Parquet] table$schema$metadata is a string

René Rex created ARROW-8703:
-------------------------------

             Summary: [R][Parquet] table$schema$metadata is a string
                 Key: ARROW-8703
                 URL: https://issues.apache.org/jira/browse/ARROW-8703
             Project: Apache Arrow
          Issue Type: Bug
          Components: R
    Affects Versions: 0.17.0
            Reporter: René Rex


I am trying to export numeric data plus some metadata from Python to a Parquet file and read it in R. However, the metadata is a dict in Python but a string in R. I would have expected a list (roughly the R equivalent of a Python dict). Am I missing something? Here is the code to demonstrate the issue:

{{import sys}}
{{import numpy as np}}
{{import pyarrow as pa}}
{{import pyarrow.parquet as pq}}
{{print(sys.version)}}
{{print(pa.__version__)}}
{{x = np.random.randint(0, 10, (10, 3))}}
{{arrays = [pa.array(x[:, i]) for i in range(x.shape[1])]}}
{{table = pa.Table.from_arrays(arrays=arrays, names=['A', 'B', 'C'],}}
{{                             metadata={'foo': '42'})}}
{{pq.write_table(table, 'array.parquet', compression='snappy')}}
{{table = pq.read_table('array.parquet')}}
{{metadata = table.schema.metadata}}
{{print(metadata)}}
{{print(type(metadata))}}

 

And in R:

 

{{library(arrow)}}
{{print(R.version)}}
{{print(packageVersion("arrow"))}}
{{table <- read_parquet("array.parquet", as_data_frame = FALSE)}}
{{metadata <- table$schema$metadata}}
{{print(metadata)}}
{{print(is(metadata))}}
{{print(metadata["foo"])}}

 

Output Python:

{{3.6.8 (default, Aug 7 2019, 17:28:10) }}
{{[GCC 4.8.5 20150623 (Red Hat 4.8.5-39)]}}
{{0.13.0}}
{{OrderedDict([(b'foo', b'42')])}}
{{<class 'collections.OrderedDict'>}}
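
For comparison with the R side, note that the Python metadata keys and values are bytes, not str. The following is only a minimal sketch of turning such a mapping into plain strings, assuming the metadata is UTF-8 encoded; the helper name decode_metadata is hypothetical and not part of pyarrow, and the sample mapping simply mirrors the output printed above.

```python
from collections import OrderedDict


def decode_metadata(metadata, encoding="utf-8"):
    """Return a plain dict with bytes keys/values decoded to str.

    Hypothetical helper (not a pyarrow API). Assumes the metadata
    was written as UTF-8 text; non-bytes entries are kept as-is.
    """
    if metadata is None:
        return {}
    return {
        (key.decode(encoding) if isinstance(key, bytes) else key):
        (value.decode(encoding) if isinstance(value, bytes) else value)
        for key, value in metadata.items()
    }


# Same shape as the schema metadata printed by the Python script above.
raw = OrderedDict([(b'foo', b'42')])
decoded = decode_metadata(raw)
print(decoded)
print(decoded['foo'])
```

With a mapping like this, a lookup by the string key 'foo' works on the Python side, which is the behaviour I expected from a list in R.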

 

Output R:

{{[1] ‘0.17.0’}}
{{[1] "\n-- metadata --\nfoo: 42"}}
{{[1] "character" "vector" "data.frameRowLabels"}}
{{[4] "SuperClassMethod" }}
{{[1] NA}}

 


