Posted to dev@arrow.apache.org by Antoine Pitrou <an...@python.org> on 2020/04/16 13:36:03 UTC

[Format] Ambiguity with extension and dictionary

Hello,

Let's say an IPC Schema message contains the following Field (in pseudo-JSON
representation):

    {
      "name" : "...",
      "nullable" : true,
      "type" : Utf8,
      "dictionary": {
        "id": 0,
        "indexType": Int32,
        "isOrdered": true
      },
      "children" : [],
      "metadata" : [
         {"key": "ARROW:extension:name", "value": "MyExtType"},
         {"key": "ARROW:extension:metadata", "value": "..."}
      ]
    }

Which of the following two logical types does it represent?

- MyExtType<storage = int32-dictionary<string>>
- int32-dictionary<MyExtType<storage = string>>

Regards

Antoine.

Re: [Format] Ambiguity with extension and dictionary

Posted by Wes McKinney <we...@gmail.com>.
On Thu, Apr 16, 2020 at 8:52 AM Antoine Pitrou <an...@python.org> wrote:
>
>
> Hello,
>
> Let's say an IPC Schema message contains the following Field (in pseudo-JSON
> representation):
>
>     {
>       "name" : "...",
>       "nullable" : true,
>       "type" : Utf8,
>       "dictionary": {
>         "id": 0,
>         "indexType": Int32,
>         "isOrdered": true
>       },
>       "children" : [],
>       "metadata" : [
>          {"key": "ARROW:extension:name", "value": "MyExtType"},
>          {"key": "ARROW:extension:metadata", "value": "..."}
>       ]
>     }
>
> Which of the following two logical types does it represent?
>
> - MyExtType<storage = int32-dictionary<string>>

This one.

> - int32-dictionary<MyExtType<storage = string>>

I do not believe this is representable in the IPC metadata protocol as
it currently stands. So in C++ it would not be possible to roundtrip
dictionary<ext_type>; it would have to be written as
ext_type<dictionary> instead.
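
To make the reading above concrete, here is a minimal sketch in plain Python. It is not Arrow's actual implementation: the helper name resolve_logical_type, the TYPE_NAMES table, and the dict layout are assumptions made for illustration, mirroring the pseudo-JSON Field from the question. The dictionary encoding is applied to the storage type first, and the extension type named in the field metadata then wraps the result.

```python
# Hypothetical sketch: NOT the Arrow library API. It only models the
# interpretation described above, using plain dicts and strings.

# Assumed mapping from the pseudo-JSON type names to display names.
TYPE_NAMES = {"Utf8": "string", "Int32": "int32"}


def resolve_logical_type(field):
    """Return a string describing the logical type of a pseudo-JSON Field."""
    # Start from the declared value type of the field.
    storage = TYPE_NAMES.get(field["type"], field["type"])

    # If the field is dictionary-encoded, the dictionary wraps the value type.
    dictionary = field.get("dictionary")
    if dictionary is not None:
        index = TYPE_NAMES.get(dictionary["indexType"], dictionary["indexType"])
        storage = f"{index}-dictionary<{storage}>"

    # The extension type, if any, is the outermost layer: its storage is
    # whatever the rest of the field describes (dictionary included).
    metadata = {kv["key"]: kv["value"] for kv in field.get("metadata", [])}
    ext_name = metadata.get("ARROW:extension:name")
    if ext_name is not None:
        return f"{ext_name}<storage = {storage}>"
    return storage


field = {
    "name": "...",
    "nullable": True,
    "type": "Utf8",
    "dictionary": {"id": 0, "indexType": "Int32", "isOrdered": True},
    "children": [],
    "metadata": [
        {"key": "ARROW:extension:name", "value": "MyExtType"},
        {"key": "ARROW:extension:metadata", "value": "..."},
    ],
}

print(resolve_logical_type(field))
```

Under this reading, the field resolves to MyExtType<storage = int32-dictionary<string>>. Because the extension metadata is attached to the field as a whole, this model has no place to record an extension type that applies only to the dictionary values.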

> Regards
>
> Antoine.