Posted to jira@arrow.apache.org by "Ben Kietzman (Jira)" <ji...@apache.org> on 2021/06/04 01:19:00 UTC
[jira] [Commented] (ARROW-12789) [C++] Support for scalar value recycling in RecordBatch/Table creation
[ https://issues.apache.org/jira/browse/ARROW-12789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17356998#comment-17356998 ]
Ben Kietzman commented on ARROW-12789:
--------------------------------------
In C++ terms, a RecordBatch's column is strictly an array and not a scalar. I'm guessing you'd like to do something similar to
{code}
tibble(a = 1:10, s = 0)
{code}
That seems useful, but it's not something that makes sense to add in C++. I'd recommend adding a clause to {{RecordBatch$create}} that detects Scalars and invokes {{MakeArrayFromScalar}} to coerce each one to an Array.
> [C++] Support for scalar value recycling in RecordBatch/Table creation
> ----------------------------------------------------------------------
>
> Key: ARROW-12789
> URL: https://issues.apache.org/jira/browse/ARROW-12789
> Project: Apache Arrow
> Issue Type: Improvement
> Components: C++
> Reporter: Nic Crane
> Priority: Major
>
> Could we have the ability to recycle scalar values during table creation? It would work as follows:
> Upon creation of a new Table/RecordBatch, the length of each column is checked. If:
> * number of columns is > 1 and
> * any columns have length 1 and
> * not all columns have length 1
> then, the value in the length 1 column(s) should be repeated to make it as long as the other columns.
> This should only occur if every column has length either 1 or N (where N is some value greater than 1). If any column length is a value other than 1 or N, we should still get an error as we do now.
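
The length check described in the issue could be sketched roughly as follows. This is only an illustration using the C++ standard library, not Arrow code: {{RecycledLength}} and {{Recycle}} are hypothetical names, columns are modelled as plain {{std::vector<int>}}, and the handling of zero-length columns is an assumption because the issue does not cover it.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical helper (not part of Arrow): given the length of every column,
// return the common length after recycling, or std::nullopt when the lengths
// are incompatible and creation should fail as it does today.
std::optional<std::size_t> RecycledLength(const std::vector<std::size_t>& lengths) {
  if (lengths.empty()) return std::size_t{0};
  std::size_t target = 1;
  bool has_length_one = false;
  for (std::size_t len : lengths) {
    if (len == 1) {
      has_length_one = true;
    } else if (target == 1) {
      target = len;         // first length other than 1 becomes N
    } else if (len != target) {
      return std::nullopt;  // two different lengths other than 1
    }
  }
  // Assumption: the issue only defines N > 1, so mixing empty columns with
  // length-1 columns is rejected here rather than recycled to zero rows.
  if (target == 0 && has_length_one) return std::nullopt;
  return target;
}

// Hypothetical helper: repeat the single value of a length-1 column until it
// has `target` rows; columns of any other length are returned unchanged.
std::vector<int> Recycle(const std::vector<int>& column, std::size_t target) {
  if (column.size() == 1) return std::vector<int>(target, column[0]);
  return column;
}
```

With column lengths 10 and 1, {{RecycledLength}} yields 10 and the length-1 column is repeated ten times. With lengths 10, 1 and 5 it yields no value, which corresponds to the existing error.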