Posted to user@arrow.apache.org by Michael <mi...@gmail.com> on 2022/07/08 20:23:31 UTC
ExtensionArray Examples
I'm trying to create some ExtensionArrays in pandas and pyarrow, but I'm
having trouble figuring out the relationships between them.
I've taken a look at what the pandas developers have been working on for the
next release of pandas
<https://github.com/pandas-dev/pandas/tree/main/pandas/core/arrays/arrow>,
and while some of it is helpful, it's focused on providing arrow-backed
arrays for native pandas types. I'd like to do something similar, but for
scalar classes that are not part of pandas.
I think I need to create 4 different classes and some of the relevant
methods:
- pandas ExtensionArray subclass
  - __arrow_array__
- pandas ExtensionDtype subclass
- pyarrow ExtensionArray subclass
- pyarrow ExtensionType subclass
  - __arrow_ext_serialize__
  - __arrow_ext_deserialize__
  - __arrow_ext_class__
  - to_pandas_dtype
Is anybody aware of some good concrete examples of how to organize these
classes?
Thanks!
Best,
Michael
Re: ExtensionArray Examples
Posted by Michael <mi...@gmail.com>.
Ah thanks! It looks like the upcoming ExtensionScalar hooks
<https://github.com/apache/arrow/pull/13454> are exactly what I was looking
for. Very exciting!
Michael
On Fri, Jul 8, 2022 at 5:11 PM Rok Mihevc <ro...@gmail.com> wrote:
> Hey Michael,
>
>
> https://github.com/apache/arrow/blob/master/python/pyarrow/tests/test_extension_type.py
> might have the material you need.
>
> Rok
Re: ExtensionArray Examples
Posted by Rok Mihevc <ro...@gmail.com>.
Hey Michael,
https://github.com/apache/arrow/blob/master/python/pyarrow/tests/test_extension_type.py
might have the material you need.
Rok