Posted to issues@arrow.apache.org by "Tim Swast (JIRA)" <ji...@apache.org> on 2019/06/07 18:56:00 UTC

[jira] [Commented] (ARROW-2607) [Java/Python] Support VarCharVector / StringArray in pyarrow.Array.from_jvm

    [ https://issues.apache.org/jira/browse/ARROW-2607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16858909#comment-16858909 ] 

Tim Swast commented on ARROW-2607:
----------------------------------

I'm very interested in this issue, as it would be extremely useful for parsing files into Arrow tables from numba. I would expect to be able to do the following:
{code:python}
my_string_array = pyarrow.Array.from_buffers(
  pyarrow.string(),
  row_count,
  [
    pyarrow.py_buffer(my_string_nullmask),
    pyarrow.py_buffer(my_string_offsets),
    pyarrow.py_buffer(my_string_bytes),
  ],
)
{code}
But I get:
{quote}File "pyarrow/array.pxi", line 578, in pyarrow.lib.Array.from_buffers
NotImplementedError: from_buffers is only supported for primitive arrays yet.
{quote}
I suppose if I wanted to contribute this fix, I should start looking at pyarrow/array.pxi first?
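
For reference, the three buffers passed to from_buffers in the snippet above correspond to the Arrow variable-width string layout: a validity bitmap (one bit per row, least-significant bit first, 1 meaning non-null), an offsets buffer of row_count + 1 little-endian int32 values, and the concatenated UTF-8 bytes. The following is a minimal sketch of how such buffers could be built using only the Python standard library. The helper name build_string_buffers is hypothetical and not part of pyarrow, and the sketch does not pad or align the buffers the way the Arrow format recommends.

```python
import struct
from typing import List, Optional, Tuple


def build_string_buffers(
    values: List[Optional[str]],
) -> Tuple[bytes, bytes, bytes]:
    """Hypothetical helper: return (validity_bitmap, offsets, data).

    Assumes the Arrow string layout: LSB-first validity bits,
    int32 little-endian offsets, and concatenated UTF-8 data.
    """
    n = len(values)
    # One validity bit per row, packed least-significant bit first.
    bitmap = bytearray((n + 7) // 8)
    # offsets[i] is where row i starts; offsets[n] is the total data length.
    offsets = [0]
    data = bytearray()
    for i, value in enumerate(values):
        if value is not None:
            bitmap[i // 8] |= 1 << (i % 8)  # mark row i as non-null
            data += value.encode("utf-8")
        # A null row contributes no bytes, so its offset repeats.
        offsets.append(len(data))
    offsets_bytes = struct.pack("<%di" % (n + 1), *offsets)
    return bytes(bitmap), offsets_bytes, bytes(data)


# Names mirror the variables used in the snippet above.
my_string_nullmask, my_string_offsets, my_string_bytes = build_string_buffers(
    ["foo", None, "quux"]
)
row_count = 3
```

In a pyarrow release whose from_buffers accepts pyarrow.string() (which was not the case at the time of this comment), these three byte strings could presumably be wrapped with pyarrow.py_buffer and passed exactly as in the call above.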

> [Java/Python] Support VarCharVector / StringArray in pyarrow.Array.from_jvm
> ---------------------------------------------------------------------------
>
>                 Key: ARROW-2607
>                 URL: https://issues.apache.org/jira/browse/ARROW-2607
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: Java, Python
>            Reporter: Uwe L. Korn
>            Priority: Major
>
> Follow-up after https://issues.apache.org/jira/browse/ARROW-2249: Currently only primitive arrays are supported in {{pyarrow.Array.from_jvm}} as it uses {{pyarrow.Array.from_buffers}} underneath. We should extend one of the two functions to be able to deal with string arrays. There is a currently failing unit test {{test_jvm_string_array}} in {{pyarrow/tests/test_jvm.py}} to verify the implementation.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)