Posted to jira@arrow.apache.org by "Krisztian Szucs (Jira)" <ji...@apache.org> on 2020/09/15 22:04:00 UTC

[jira] [Comment Edited] (ARROW-9997) [Python] StructScalar.as_py() fails if the type has duplicate field names

    [ https://issues.apache.org/jira/browse/ARROW-9997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17196573#comment-17196573 ] 

Krisztian Szucs edited comment on ARROW-9997 at 9/15/20, 10:03 PM:
-------------------------------------------------------------------

My issue is that StructScalar is an *arrow object* which implements a Python mapping interface. Once we have duplicate keys the object stops working: we cannot do anything with it, since all operations will raise a KeyError (not just the call to {{.as_py()}}).
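
To illustrate the point above without relying on pyarrow internals, here is a minimal pure-Python sketch. {{ToyStruct}} is a hypothetical stand-in, not the actual StructScalar implementation; it only assumes that a name lookup must resolve to exactly one field and raises a KeyError otherwise.

```python
from collections.abc import Mapping


class ToyStruct(Mapping):
    """Hypothetical stand-in for a struct scalar (not pyarrow code)."""

    def __init__(self, names, values):
        self._names = list(names)
        self._values = list(values)

    def __getitem__(self, name):
        matches = [i for i, n in enumerate(self._names) if n == name]
        if len(matches) != 1:
            # Missing or ambiguous field name: there is no single value to return.
            raise KeyError(name)
        return self._values[matches[0]]

    def __iter__(self):
        return iter(self._names)

    def __len__(self):
        return len(self._names)


ok = ToyStruct(["a", "b"], [1, 2])
dup = ToyStruct(["a", "a"], [1, 2])

print(dict(ok))  # unique names behave like a normal mapping

try:
    dict(dup)  # iterates the keys, then looks each one up by name
except KeyError as exc:
    print("KeyError:", exc)
```

With duplicate names, every mixin that the Mapping ABC derives from {{__getitem__}} ({{items()}}, {{values()}}, {{get()}}, equality) hits the same ambiguity, which is why the object becomes unusable as a mapping in this sketch.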

My other problem is that the struct array/scalar is the only type where we fail to roundtrip between Arrow and Python (at least according to a hypothesis test):
{code:python}
pa.array(arr.to_pylist(), type=arr.type)
pa.scalar(scalar.as_py(), type=scalar.type)
{code}
If we want convenient pythonic access to StructScalar, I'd rather add a best-effort {{.as_dict()}} method.
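
A rough sketch of why the roundtrip cannot work through a plain dict, and of what "best effort" could mean. The helper {{as_dict_best_effort}} is hypothetical, and the "last duplicate wins" rule is only an assumption chosen for illustration; the comment above does not specify how duplicates should be resolved.

```python
def as_dict_best_effort(names, values):
    """Hypothetical helper: collapse fields into a dict.

    Assumption for this sketch: when a field name repeats,
    the later value overwrites the earlier one.
    """
    return dict(zip(names, values))


names = ["a", "a", "b"]
values = [1, 2, 3]

d = as_dict_best_effort(names, values)

# Rebuilding the field values from the dict by name cannot recover
# the first "a", so the roundtrip is lossy for duplicate names.
rebuilt = [d[name] for name in names]
round_trips = rebuilt == values

print(d)
print(round_trips)
```

A conversion like this is convenient for inspection, but because it drops information it is better kept as a separate, explicitly lossy method than as the behaviour of {{.as_py()}}.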



> [Python] StructScalar.as_py() fails if the type has duplicate field names
> -------------------------------------------------------------------------
>
>                 Key: ARROW-9997
>                 URL: https://issues.apache.org/jira/browse/ARROW-9997
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>            Reporter: Krisztian Szucs
>            Assignee: Krisztian Szucs
>            Priority: Major
>             Fix For: 2.0.0
>
>
> {{StructScalar}} currently extends an abstract Mapping interface. Since the type allows duplicate field names we cannot provide that API.


