Posted to issues@arrow.apache.org by "Dewey Dunnington (Jira)" <ji...@apache.org> on 2022/05/31 16:30:00 UTC
[jira] [Created] (ARROW-16695) [R][C++] Extension types are not supported in joins
Dewey Dunnington created ARROW-16695:
----------------------------------------
Summary: [R][C++] Extension types are not supported in joins
Key: ARROW-16695
URL: https://issues.apache.org/jira/browse/ARROW-16695
Project: Apache Arrow
Issue Type: Improvement
Components: C++, R
Reporter: Dewey Dunnington
It looks like extension types are not supported in joins, even when the underlying storage type is supported. In the example below, the join fails as soon as a non-key column has an extension type. Reported by [~jonkeane] while making a demo for Arrow + query engine + geoarrow (an R package), which uses extension types liberally:
{code:R}
library(arrow, warn.conflicts = FALSE)
library(dplyr, warn.conflicts = FALSE)
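# Plain record batch with no extension types
rb_non_ext <- record_batch(
  a = 1:5,
  b = letters[1:5]
)
# Same join key `b`, plus a plain binary column `c` (the storage type)
rb_ext_storage <- record_batch(
  b = letters[1:5],
  c = Array$create(list(as.raw(1:5)), type = binary())
)
# Same data, but `c` is wrapped in a vctrs extension type
rb_ext <- record_batch(
  b = letters[1:5],
  c = vctrs_extension_array(rb_ext_storage$c$as_vector())
)
# Joining against the plain storage column works
rb_non_ext %>%
  left_join(rb_ext_storage) %>%
  collect()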
#> # A tibble: 5 × 3
#>       a b                      c
#>   <int> <chr>         <arrw_bnr>
#> 1     1 a     01, 02, 03, 04, 05
#> 2     2 b     01, 02, 03, 04, 05
#> 3     3 c     01, 02, 03, 04, 05
#> 4     4 d     01, 02, 03, 04, 05
#> 5     5 e     01, 02, 03, 04, 05
# Joining against the extension-typed column fails
rb_non_ext %>%
  left_join(rb_ext) %>%
  collect()
#> Error in `collect()`:
#> ! Invalid: Data type <arrow_binary[0]> is not supported in join non-key field
#> /Users/deweydunnington/Desktop/rscratch/arrow/cpp/src/arrow/compute/exec/hash_join_node.cc:121 ValidateSchemas(join_type, left_schema, left_keys, left_output, right_schema, right_keys, right_output, left_field_name_suffix, right_field_name_suffix)
#> /Users/deweydunnington/Desktop/rscratch/arrow/cpp/src/arrow/compute/exec/hash_join_node.cc:499 schema_mgr->Init( join_options.join_type, left_schema, join_options.left_keys, join_options.left_output, right_schema, join_options.right_keys, join_options.right_output, join_options.filter, join_options.output_suffix_for_left, join_options.output_suffix_for_right)
{code}