Posted to jira@arrow.apache.org by "Francois Saint-Jacques (Jira)" <ji...@apache.org> on 2020/06/15 18:40:00 UTC

[jira] [Updated] (ARROW-7798) [R] Refactor R <-> Array conversion

     [ https://issues.apache.org/jira/browse/ARROW-7798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Francois Saint-Jacques updated ARROW-7798:
------------------------------------------
    Description: 
There's a bit of technical debt accumulated in array_to_vector and vector_to_array:

* Conversion *and* casting are mixed; ideally we'd move casting out of there (at the cost of more memory copies). The rationale is that the conversion logic will differ from the CastKernels, e.g. in when to raise errors, or in benefiting from complex conversions such as timezones. The current implementation is fast, i.e. it fuses the conversion and casting in a single loop, at the cost of code clarity and divergence.
* There should be two paths: zero-copy and non-zero-copy. The non-zero-copy path should use the newly introduced VectorToArrayConverter, which will work with complex nested types.
* In array_to_vector, the Converter should work primarily with Array and not ArrayVector.
* vector_to_array should not use builders: sizes are known, and the null bitmap should be constructed separately. There's probably a chance that we can re-use R's memory with zero-copy for the raw data.
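
To illustrate the last point, here is a minimal sketch in Python (not the actual R binding code) of building the null bitmap in a separate pass while leaving the raw data where it is. It assumes an integer vector in which missing values use R's NA_integer_ sentinel (INT_MIN), and it assumes the Arrow validity bitmap layout (one bit per value, least-significant bit first, 1 = valid). The names build_validity_bitmap and vector_to_array_sketch are hypothetical, and array("i") merely stands in for R-owned memory.

```python
from array import array

# R encodes a missing integer as the sentinel INT_MIN (NA_integer_).
R_NA_INTEGER = -2**31


def build_validity_bitmap(values):
    """Return (bitmap, null_count) for a sequence of R-style integers.

    Assumed layout: one bit per value, least-significant bit first
    within each byte, 1 = valid, 0 = null.
    """
    n = len(values)
    # The length is known up front, so the bitmap is allocated once
    # instead of being grown incrementally by a builder.
    bitmap = bytearray((n + 7) // 8)
    null_count = 0
    for i, v in enumerate(values):
        if v == R_NA_INTEGER:
            null_count += 1
        else:
            bitmap[i // 8] |= 1 << (i % 8)
    return bytes(bitmap), null_count


def vector_to_array_sketch(r_vector):
    """Hypothetical stand-in for the zero-copy path of vector_to_array.

    The data buffer is a view over the caller's memory (no copy);
    only the validity bitmap is newly allocated.
    """
    data = memoryview(r_vector)
    bitmap, null_count = build_validity_bitmap(r_vector)
    return data, bitmap, null_count


r_vec = array("i", [1, R_NA_INTEGER, 3])
data, bitmap, null_count = vector_to_array_sketch(r_vec)
```

In this sketch the slots holding the sentinel stay in the data buffer untouched; they are simply masked out by the bitmap, which is what would make re-using the original memory possible.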

  was:
There's a bit of technical debt accumulated in array_to_vector and vector_to_array:

* Conversion *and* casting are mixed; ideally we'd move casting out of there (at the cost of more memory copies). The rationale is that the conversion logic will differ from the CastKernels, e.g. in when to raise errors, or in benefiting from complex conversions such as timezones. The current implementation is fast, i.e. it fuses the conversion and casting in a single loop, at the cost of code clarity and divergence.
* There should be two paths: zero-copy and non-zero-copy. The non-zero-copy path should use the newly introduced VectorToArrayConverter, which will work with complex nested types.
* In array_to_vector, the Converter should work primarily with Array and not ArrayVector.


> [R] Refactor R <-> Array conversion
> -----------------------------------
>
>                 Key: ARROW-7798
>                 URL: https://issues.apache.org/jira/browse/ARROW-7798
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: R
>            Reporter: Francois Saint-Jacques
>            Priority: Major
>
> There's a bit of technical debt accumulated in array_to_vector and vector_to_array:
> * Conversion *and* casting are mixed; ideally we'd move casting out of there (at the cost of more memory copies). The rationale is that the conversion logic will differ from the CastKernels, e.g. in when to raise errors, or in benefiting from complex conversions such as timezones. The current implementation is fast, i.e. it fuses the conversion and casting in a single loop, at the cost of code clarity and divergence.
> * There should be two paths: zero-copy and non-zero-copy. The non-zero-copy path should use the newly introduced VectorToArrayConverter, which will work with complex nested types.
> * In array_to_vector, the Converter should work primarily with Array and not ArrayVector.
> * vector_to_array should not use builders: sizes are known, and the null bitmap should be constructed separately. There's probably a chance that we can re-use R's memory with zero-copy for the raw data.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)