Posted to jira@arrow.apache.org by "Ian Cook (Jira)" <ji...@apache.org> on 2021/03/25 16:39:00 UTC
[jira] [Updated] (ARROW-12087) [C++] Fix sort_indices, array_sort_indices timestamp support discrepancy
[ https://issues.apache.org/jira/browse/ARROW-12087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ian Cook updated ARROW-12087:
-----------------------------
Description:
{{sort_indices}} supports sorting by timestamp arrays, but {{array_sort_indices}} does not. The following R code demonstrates the discrepancy (it depends on ARROW-11703 to run):
{code:java}
# build a one-column record batch holding two UTC timestamps
tbl <- tibble::tibble(
  dttm = lubridate::ymd_hms(c("2021-01-01 00:00:00", "1900-01-01 00:00:00")),
)
rb <- arrow::record_batch(tbl)
# this fails:
arrow:::call_function(
  "array_sort_indices",
  rb$dttm,
  options = list(order = FALSE)
)
## Error: NotImplemented: Function array_sort_indices has no kernel matching input types (array[timestamp[us, tz=UTC]])
# this fails because it internally calls array_sort_indices
arrow:::call_function(
  "sort_indices",
  rb,
  options = list(names = "dttm", orders = 0L)
)
## Error: NotImplemented: Function array_sort_indices has no kernel matching input types (array[timestamp[us, tz=UTC]])
# this succeeds
arrow:::call_function(
  "sort_indices",
  rb,
  options = list(names = c("dttm", "dttm"), orders = 0L)
)
## Array
## <uint64>
## [
##   1,
##   0
## ]{code}
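As a sanity check on the expected result, the indices shown above ({{1, 0}}) can be reproduced without Arrow. The following is only a minimal Python sketch using the standard library, assuming {{orders = 0L}} denotes ascending order; the helper name {{expected_sort_indices}} is hypothetical and is not part of the Arrow API.

```python
from datetime import datetime, timezone


def expected_sort_indices(values, descending=False):
    """Return the zero-based indices that would sort `values`.

    Hypothetical reference helper (not an Arrow function): it mirrors the
    idea of sort_indices by sorting positions rather than the values.
    """
    return sorted(range(len(values)), key=values.__getitem__, reverse=descending)


# The same two UTC timestamps used in the R example.
timestamps = [
    datetime(2021, 1, 1, 0, 0, 0, tzinfo=timezone.utc),
    datetime(1900, 1, 1, 0, 0, 0, tzinfo=timezone.utc),
]

# Ascending order puts the 1900 value (position 1) first.
print(expected_sort_indices(timestamps))  # [1, 0]
```

If {{array_sort_indices}} gained a timestamp kernel, it would presumably be expected to return the same indices for this input.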
was:
{{sort_indices}} supports sorting by timestamp arrays, but {{array_sort_indices}} does not. Here's some example R code to demonstrate this (but this example code depends on ARROW-11703 to run):
{code:java}
tbl <- tibble::tibble(
dttm = lubridate::ymd_hms(c("2021-01-01 00:00:00", "1900-01-01 00:00:00")),
)
rb <- arrow::record_batch(tbl)
# this fails:
arrow:::call_function(
"array_sort_indices",
rb$dttm,
options = list(order = F)
)
## Error: NotImplemented: Function array_sort_indices has no kernel matching input types (array[timestamp[us, tz=UTC]])
# this fails
arrow:::call_function(
"sort_indices",
rb,
options = list(names = "dttm", orders = 0L)
)
## Error: NotImplemented: Function array_sort_indices has no kernel matching input types (array[timestamp[us, tz=UTC]])
# this succeeds
arrow:::call_function(
"sort_indices",
rb,
options = list(names = c("dttm", "dttm"), orders = 0L)
)
## Array
## <uint64>
## [
## 1,
## 0
## ]{code}
> [C++] Fix sort_indices, array_sort_indices timestamp support discrepancy
> ------------------------------------------------------------------------
>
> Key: ARROW-12087
> URL: https://issues.apache.org/jira/browse/ARROW-12087
> Project: Apache Arrow
> Issue Type: Bug
> Components: C++
> Reporter: Ian Cook
> Priority: Major