Posted to jira@arrow.apache.org by "Ben Kietzman (Jira)" <ji...@apache.org> on 2021/06/04 01:19:00 UTC

[jira] [Commented] (ARROW-12789) [C++] Support for scalar value recycling in RecordBatch/Table creation

    [ https://issues.apache.org/jira/browse/ARROW-12789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17356998#comment-17356998 ] 

Ben Kietzman commented on ARROW-12789:
--------------------------------------

In C++ terms, each column of a RecordBatch is strictly an Array, never a Scalar. I'm guessing you'd like to do something similar to:

{code}
tibble(a = 1:10, s = 0)
{code}

That seems useful, but it doesn't make sense to add in C++. I'd recommend adding a clause to {{RecordBatch$create}} in the R bindings which detects Scalars and invokes {{MakeArrayFromScalar}} to coerce each one to an Array.
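
As a rough illustration of that suggestion, here is a minimal sketch of the "detect scalars and coerce them to arrays" step. It uses only the C++ standard library, not Arrow itself: {{Array}}, {{Column}}, {{ToArray}} and {{CoerceColumns}} are hypothetical names invented for this example, and {{ToArray}} merely stands in for the role {{MakeArrayFromScalar}} would play in the binding.

```cpp
#include <cstddef>
#include <variant>
#include <vector>

// Hypothetical stand-ins for this sketch (not Arrow types):
// a "scalar" is a single double, an "array" is a vector of doubles.
using Array = std::vector<double>;
using Column = std::variant<double, Array>;

// Expand a scalar into `length` copies of its value; pass arrays through
// unchanged. This mirrors what MakeArrayFromScalar would do in the binding.
Array ToArray(const Column& column, std::size_t length) {
  if (const double* scalar = std::get_if<double>(&column)) {
    return Array(length, *scalar);
  }
  return std::get<Array>(column);
}

// Coerce every column so that the batch only ever sees arrays.
// Array inputs are not validated here; only scalars are expanded.
std::vector<Array> CoerceColumns(const std::vector<Column>& columns,
                                 std::size_t length) {
  std::vector<Array> out;
  out.reserve(columns.size());
  for (const Column& column : columns) {
    out.push_back(ToArray(column, length));
  }
  return out;
}
```

With this shape, the equivalent of {{tibble(a = 1:10, s = 0)}} would pass one array column and one scalar column, and the scalar would be expanded before the batch is built.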

> [C++] Support for scalar value recycling in RecordBatch/Table creation
> ----------------------------------------------------------------------
>
>                 Key: ARROW-12789
>                 URL: https://issues.apache.org/jira/browse/ARROW-12789
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++
>            Reporter: Nic Crane
>            Priority: Major
>
> Could we have the ability to recycle scalar values during table creation?  It would work as follows:
> Upon creation of a new Table/RecordBatch, the length of each column is checked.  If:
>  * number of columns is > 1 and
>  * any columns have length 1 and
>  * not all columns have length 1
> then the value in each length-1 column should be repeated to make it as long as the other columns.
> This should only occur if every column has length either 1 or N (where N is some value greater than 1). If any column's length is something other than 1 or N, we should still get an error, as we do now.
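
The length rule in the description above can be sketched as a small check over the column lengths. This is only an illustration under stated assumptions: {{RecycledLength}} is a hypothetical helper, not an Arrow function, and its handling of zero-length columns and of an empty column list is a choice made for this sketch, since the description only covers N greater than 1.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical helper: given the length of each column, return the common
// length N that length-1 columns should be recycled to, or std::nullopt if
// some length is neither 1 nor N (the case that should remain an error).
// If every column has length 1 (or there are no columns), this returns 1,
// meaning nothing needs to be recycled.
std::optional<std::size_t> RecycledLength(
    const std::vector<std::size_t>& lengths) {
  std::size_t target = 1;
  for (std::size_t length : lengths) {
    if (length == 1) {
      continue;  // candidate for recycling, imposes no constraint
    }
    if (target == 1) {
      target = length;  // the first length other than 1 fixes N
    } else if (length != target) {
      return std::nullopt;  // two different lengths other than 1
    }
  }
  return target;
}
```

A caller would first compute the target length, raise the existing error if the result is empty, and otherwise repeat the value of each length-1 column up to that length.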



--
This message was sent by Atlassian Jira
(v8.3.4#803005)