Posted to jira@arrow.apache.org by "Nic Crane (Jira)" <ji...@apache.org> on 2021/09/02 16:40:00 UTC

[jira] [Updated] (ARROW-13866) [R] Implement Options for all compute kernels available via list_compute_functions

     [ https://issues.apache.org/jira/browse/ARROW-13866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nic Crane updated ARROW-13866:
------------------------------
    Description: 
I'm writing a section in the cookbook about calling kernels which don't have R bindings, using {{utf8_ltrim}} as an example, since it appears in the output of {{list_compute_functions()}}.  To work out how to call it, I searched the C++ code and found that it requires a {{TrimOptions}} options class, which has a single parameter, {{characters}}.

I tried calling {{call_function("utf8_ltrim", Array$create(c("abc", "abacus", "abracadabra")), options = list(characters = "ab"))}} to see what would happen, which resulted in:
{{Error: Invalid: Attempted to initialize KernelState from null FunctionOptions}}

This is because TrimOptions isn't implemented in arrow/r/src/compute.cpp.  We should go through all the compute functions listed via {{list_compute_functions()}} and ensure all of them have options implemented.
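
To make the failure mode concrete, here is a toy model in plain Python (not the actual R or C++ code). It assumes the binding layer picks an options class by function name and passes nothing to the kernel when no class is registered. {{OPTIONS_BUILDERS}}, {{make_options}} and this {{call_function}} are hypothetical stand-ins for illustration only, and the trimming is approximated with {{str.lstrip}}.

```python
from dataclasses import dataclass


@dataclass
class TrimOptions:
    # Mirrors the single parameter described in the issue.
    characters: str


# Hypothetical registry: function name -> options class it expects.
# It starts empty, standing in for a binding layer with no TrimOptions support.
OPTIONS_BUILDERS = {}


def make_options(func_name, options):
    """Build an options object from a plain dict, or None if nothing is registered."""
    builder = OPTIONS_BUILDERS.get(func_name)
    return builder(**options) if builder else None


def utf8_ltrim(values, options):
    # The kernel cannot run without its options object.
    if options is None:
        raise ValueError("Attempted to initialize KernelState from null FunctionOptions")
    # Strip any leading characters that belong to the given set.
    return [v.lstrip(options.characters) for v in values]


KERNELS = {"utf8_ltrim": utf8_ltrim}


def call_function(func_name, values, options):
    return KERNELS[func_name](values, make_options(func_name, options))


words = ["abc", "abacus", "abracadabra"]

# Before: no builder is registered, so the user's options are silently dropped.
before = None
try:
    call_function("utf8_ltrim", words, {"characters": "ab"})
except ValueError as err:
    before = str(err)

# After: registering the options class lets the same call succeed.
OPTIONS_BUILDERS["utf8_ltrim"] = TrimOptions
after = call_function("utf8_ltrim", words, {"characters": "ab"})
print(before)
print(after)
```

In this model the fix is one registry entry per options class, which is roughly the shape of the work proposed here: one entry for each function that takes options.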

The functions to check are listed below (I will create subtasks shortly). Only those that take options need work; some do not:

* abs
* abs_checked
* acos
* acos_checked
* add
* all
* and
* and_not
* and_not_kleene
* any
* array_filter
* array_take
* ascii_capitalize
* ascii_center
* ascii_is_alnum
* ascii_is_alpha
* ascii_is_decimal
* ascii_is_lower
* ascii_is_printable
* ascii_is_space
* ascii_is_title
* ascii_is_upper
* ascii_lower
* ascii_lpad
* ascii_ltrim
* ascii_ltrim_whitespace
* ascii_reverse
* ascii_rtrim
* ascii_rtrim_whitespace
* ascii_split_whitespace
* ascii_swapcase
* ascii_trim
* ascii_trim_whitespace
* ascii_upper
* asin
* asin_checked
* atan
* atan2
* binary_join
* binary_length
* binary_replace_slice
* bit_wise_and
* bit_wise_not
* bit_wise_or
* bit_wise_xor
* case_when
* ceil
* choose
* coalesce
* cos
* cos_checked
* count_substring
* count_substring_regex
* day
* day_of_year
* divide
* drop_null
* ends_with
* extract_regex
* find_substring
* find_substring_regex
* floor
* hash_any
* hash_count
* hash_count_distinct
* hash_distinct
* hash_mean
* hash_min_max
* hash_product
* hash_sum
* hash_tdigest
* hash_variance
* hour
* if_else
* index
* index_in
* index_in_meta_binary
* is_finite
* is_inf
* iso_calendar
* iso_week
* iso_year
* list_flatten
* list_parent_indices
* list_value_length
* ln
* ln_checked
* log10
* log10_checked
* log1p
* log1p_checked
* log2
* log2_checked
* logb
* logb_checked
* match_substring
* match_substring_regex
* max_element_wise
* mean
* microsecond
* millisecond
* min_max
* minute
* mode
* month
* multiply
* nanosecond
* negate
* negate_checked
* or
* partition_nth_indices
* power
* product
* quarter
* replace_substring_regex
* replace_with_mask
* second
* shift_left
* shift_left_checked
* shift_right
* shift_right_checked
* sign
* sin
* sin_checked
* split_pattern_regex
* starts_with
* stddev
* strftime
* string_is_ascii
* subsecond
* subtract
* sum
* tan
* tan_checked
* tdigest
* trunc
* unique
* utf8_capitalize
* utf8_center
* utf8_is_alnum
* utf8_is_alpha
* utf8_is_decimal
* utf8_is_digit
* utf8_is_lower
* utf8_is_numeric
* utf8_is_printable
* utf8_is_space
* utf8_is_title
* utf8_is_upper
* utf8_lpad
* utf8_ltrim
* utf8_ltrim_whitespace
* utf8_replace_slice
* utf8_reverse
* utf8_rpad
* utf8_rtrim
* utf8_rtrim_whitespace
* utf8_swapcase
* utf8_trim
* utf8_trim_whitespace
* value_counts
* variance
* xor
* year

  was:
I'm writing a section in the cookbook about calling kernels which don't have R bindings, using {{utf8_ltrim}} as an example, since it appears in the output of {{list_compute_functions()}}.  To work out how to call it, I searched the C++ code and found that it requires a {{TrimOptions}} options class, which has a single parameter, {{characters}}.

I tried calling {{call_function("utf8_ltrim", Array$create(c("abc", "abacus", "abracadabra")), options = list(characters = "ab"))}} to see what would happen, which resulted in:
{{Error: Invalid: Attempted to initialize KernelState from null FunctionOptions}}

This is because TrimOptions isn't implemented in arrow/r/src/compute.cpp.  We should go through all the compute functions listed via {{list_compute_functions()}} and ensure all of them have options implemented.


> [R] Implement Options for all compute kernels available via list_compute_functions
> ----------------------------------------------------------------------------------
>
>                 Key: ARROW-13866
>                 URL: https://issues.apache.org/jira/browse/ARROW-13866
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: R
>            Reporter: Nic Crane
>            Assignee: Nic Crane
>            Priority: Major



--
This message was sent by Atlassian Jira
(v8.3.4#803005)