Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/12/21 18:56:20 UTC
[GitHub] [arrow] jmdeschenes commented on pull request #10565: ARROW-638: [C++] Complex Number Support via ExtensionTypes
jmdeschenes commented on pull request #10565:
URL: https://github.com/apache/arrow/pull/10565#issuecomment-999017847
Hello,
There is an issue with the approach.
In array.pxi:
```cython
cdef class ExtensionArray(Array):
    """
    Concrete class for Arrow extension arrays.
    """

    @property
    def storage(self):
        cdef:
            CExtensionArray* ext_array = <CExtensionArray*>(self.ap)

        return pyarrow_wrap_array(ext_array.storage())

    #
    ## LINES SKIPPED
    #

    def to_numpy(self, **kwargs):
        """
        Convert extension array to a numpy ndarray.

        See Also
        --------
        Array.to_numpy
        """
        return self.storage.to_numpy(**kwargs)
```
In table.pxi:
```cython
def to_numpy(self):
    """
    Return a NumPy copy of this array (experimental).

    Returns
    -------
    array : numpy.ndarray
    """
    cdef:
        PyObject* out
        PandasOptions c_options
        object values

    if self.type.id == _Type_EXTENSION:
        storage_array = chunked_array(
            [chunk.storage for chunk in self.iterchunks()],
            type=self.type.storage_type
```
Both of these strip the extension type before the data reaches the C++ code. As a result, the C++ code never knows that it is dealing with an extension type.
If this behaviour is kept, fixed_size_list would need to convert into a proper 2D numpy array. That could have several benefits, and it could initially be supported only for primitive types.
@jorisvandenbossche Do you think that is something that could be acceptable?
Otherwise, letting the C++ code handle the extension type could be another option.