Posted to jira@arrow.apache.org by "Joris Van den Bossche (Jira)" <ji...@apache.org> on 2021/05/03 11:57:00 UTC
[jira] [Commented] (ARROW-12588) Expose JSON schema inference to Python API
[ https://issues.apache.org/jira/browse/ARROW-12588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17338332#comment-17338332 ]
Joris Van den Bossche commented on ARROW-12588:
-----------------------------------------------
Can you give a concrete example?
Some level of schema inference also happens in the general {{pa.array()}} constructor. For example, passing a list of dicts works in simple cases:
{code}
In [1]: import pyarrow as pa
In [2]: arr = pa.array([{'a': 1, 'b': 2}, {'a': 3, 'b': 4}])
In [3]: arr.type
Out[3]: StructType(struct<a: int64, b: int64>)
In [4]: arr
Out[4]:
<pyarrow.lib.StructArray object at 0x7f160695d4c0>
-- is_valid: all not null
-- child 0 type: int64
  [
    1,
    3
  ]
-- child 1 type: int64
  [
    2,
    4
  ]
{code}
> Expose JSON schema inference to Python API
> ------------------------------------------
>
> Key: ARROW-12588
> URL: https://issues.apache.org/jira/browse/ARROW-12588
> Project: Apache Arrow
> Issue Type: Improvement
> Reporter: Piotr Żelasko
> Priority: Minor
>
> When using `pyarrow.json.read_json()`, the schema is automatically inferred. It would be useful to also infer the schema from JSON data that is already loaded in memory (e.g. a list of dicts in Python).