Posted to issues@arrow.apache.org by "Neal Richardson (Jira)" <ji...@apache.org> on 2022/10/10 13:48:00 UTC
[jira] [Created] (ARROW-17974) [C++] random function can't actually be used
Neal Richardson created ARROW-17974:
---------------------------------------
Summary: [C++] random function can't actually be used
Key: ARROW-17974
URL: https://issues.apache.org/jira/browse/ARROW-17974
Project: Apache Arrow
Issue Type: Bug
Components: C++
Reporter: Neal Richardson
random() is currently implemented as a nullary function. It doesn't let you specify the number of values you want to generate because it's designed to generate however many the given ExecBatch has. The only option RandomOptions takes seems to be an optional seed value. Unfortunately, the result is that the function is not usable, AFAICT.
Calling the compute function directly, you get 0 values (all examples from R):
{code}
library(arrow)
call_function("random")
# Array
# <double>
# []
{code}
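To make the empty result concrete, here is a minimal C++ sketch of the behavior described above. It is a toy model, not Arrow's implementation: ToyExecBatch, ToyRandomOptions, ToyRandom and ToyCallDirect are hypothetical names, and the assumption that an argument-free call ends up with a zero-length batch is inferred from the output shown, not taken from the Arrow source.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Toy stand-ins, NOT Arrow's real classes: only the fields needed here.
struct ToyExecBatch {
  std::vector<std::vector<double>> columns;  // input arguments (none for a nullary call)
  std::int64_t length = 0;                   // number of rows in the batch
};

struct ToyRandomOptions {
  bool has_seed = false;   // models the "optional seed" described in the issue
  std::uint64_t seed = 0;  // only read when has_seed is true
};

// Nullary "random": the output size comes solely from batch.length,
// so the caller has no way to request a specific number of values.
std::vector<double> ToyRandom(const ToyExecBatch& batch,
                              const ToyRandomOptions& options) {
  std::mt19937_64 gen(options.has_seed ? options.seed
                                       : std::random_device{}());
  std::uniform_real_distribution<double> dist(0.0, 1.0);
  std::vector<double> out;
  out.reserve(static_cast<std::size_t>(batch.length));
  for (std::int64_t i = 0; i < batch.length; ++i) {
    out.push_back(dist(gen));
  }
  return out;
}

// Assumption: a direct call with no arguments is modeled as a batch with
// no columns and length 0, which is why nothing is generated.
std::vector<double> ToyCallDirect(const ToyRandomOptions& options = {}) {
  ToyExecBatch empty;  // no columns, length 0
  return ToyRandom(empty, options);
}
```

In this model ToyCallDirect() always returns an empty vector, while ToyRandom only produces values when something else supplies a batch with a nonzero length. A length parameter in the options would be one way to decouple the two, though that is only a suggestion and not something the issue proposes.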
Calling it from within an ExecPlan, it errors because it is not a proper scalar function, despite what the filenames say (scalar_random.cc, etc.):
{code}
library(arrow)
library(dplyr)
mtcars %>%
  arrow_table() %>%
  mutate(x = arrow_random()) %>%
  collect()
# Error in `collect()`:
# ! Invalid: ExecuteScalarExpression cannot Execute non-scalar expression Array[double]
{code}