Posted to jira@arrow.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2022/09/13 08:12:00 UTC

[jira] [Updated] (ARROW-15545) [C++] Cast dictionary of extension type to extension type

     [ https://issues.apache.org/jira/browse/ARROW-15545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated ARROW-15545:
-----------------------------------
    Labels: good-first-issue kernel pull-request-available  (was: good-first-issue kernel)

> [C++] Cast dictionary of extension type to extension type
> ---------------------------------------------------------
>
>                 Key: ARROW-15545
>                 URL: https://issues.apache.org/jira/browse/ARROW-15545
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++
>            Reporter: Joris Van den Bossche
>            Assignee: Miles Granger
>            Priority: Major
>              Labels: good-first-issue, kernel, pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> We support casting a DictionaryArray to its dictionary values' type. For example:
> {code}
> >>> arr = pa.array([1, 2, 1]).dictionary_encode()
> >>> arr
> <pyarrow.lib.DictionaryArray object at 0x7f0c1aca46d0>
> -- dictionary:
>   [
>     1,
>     2
>   ]
> -- indices:
>   [
>     0,
>     1,
>     0
>   ]
> >>> arr.type
> DictionaryType(dictionary<values=int64, indices=int32, ordered=0>)
> >>> arr.cast(arr.type.value_type)
> <pyarrow.lib.Int64Array object at 0x7f0c19891dc0>
> [
>   1,
>   2,
>   1
> ]
> {code}
> However, if the type of the dictionary values is an ExtensionType, this cast is not supported:
> {code}
> >>> from pyarrow.tests.test_extension_type import UuidType
> >>> storage = pa.array([b"0123456789abcdef"], type=pa.binary(16))
> >>> arr = pa.ExtensionArray.from_storage(UuidType(), storage)
> >>> arr
> <pyarrow.lib.ExtensionArray object at 0x7f0c1875bc40>
> [
>   30313233343536373839616263646566
> ]
> >>> dict_arr = pa.DictionaryArray.from_arrays(pa.array([0, 0], pa.int32()), arr)
> >>> dict_arr.type
> DictionaryType(dictionary<values=extension<arrow.py_extension_type<UuidType>>, indices=int32, ordered=0>)
> >>> dict_arr.cast(UuidType())
> ...
> ArrowNotImplementedError: Unsupported cast from dictionary<values=extension<arrow.py_extension_type<UuidType>>, indices=int32, ordered=0> to extension<arrow.py_extension_type<UuidType>> (no available cast function for target type)
> ../src/arrow/compute/cast.cc:119  GetCastFunctionInternal(cast_options->to_type, args[0].type().get())
> {code}


