Posted to jira@arrow.apache.org by "Will Jones (Jira)" <ji...@apache.org> on 2021/09/09 02:46:00 UTC

[jira] [Commented] (ARROW-13939) how to do resampling of arrow table using cython

    [ https://issues.apache.org/jira/browse/ARROW-13939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17412298#comment-17412298 ] 

Will Jones commented on ARROW-13939:
------------------------------------

Hi Krishna,

Could you provide an example of what you mean by "resampling"? Are you trying to create a new table that contains a subset of the rows of the existing table? Or do you mean time series resampling, as in pandas' DataFrame.resample?

> Is there a way I can create an empty table with the same schema and keep appending to it? Or should I use vectors/lists and then pass them to create a table?

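Arrow tables aren't meant to be appended to row by row: their columns are immutable arrays, so every append would mean building a new table. Instead, accumulate the values for each column, build the arrays once, and then create a table out of them.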


> how to do resampling of arrow table using cython
> ------------------------------------------------
>
>                 Key: ARROW-13939
>                 URL: https://issues.apache.org/jira/browse/ARROW-13939
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: C++, Python
>            Reporter: krishna deepak
>            Priority: Minor
>
> Could someone please point me to resources on how to write resampling code in Cython for an Arrow table?
>  1. Will iterating over the whole table be slow in Cython?
>  2. Which is the best structure to append new elements to? Is there a way I can create an empty table with the same schema and keep appending to it? Or should I use vectors/lists and then pass them to create a table?
> Performance is very important for me. Any help is highly appreciated.


