Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/06/20 22:09:38 UTC
[GitHub] [arrow] dragosmg commented on pull request #13196: ARROW-16407: [R] Extend `parse_date_time` to cover hour, dates, and minutes components
dragosmg commented on PR #13196:
URL: https://github.com/apache/arrow/pull/13196#issuecomment-1160827641
Results of benchmarking `parse_date_time()` implemented with combined formats (with and without separators together) versus separate formats (either with or without separators):
```r
library(dplyr)
library(lubridate)
library(ggplot2)
library(hrbrthemes)
# load the in-development package from the working directory
devtools::load_all()
# 2 million strings, alternating between the form without and with separators
test_df <- tibble::tibble(
  a = rep(c("20220614", "2022-06-14"), 1e6)
)

results <- bench::mark(
  separate = test_df %>%
    arrow_table() %>%
    mutate(b = parse_date_time(a, orders = "ymd")) %>%
    collect(),
  combined = test_df %>%
    arrow_table() %>%
    mutate(b = parse_date_time_combined(a, orders = "ymd")) %>%
    collect(),
  min_iterations = 20
)
results
#> # A tibble: 2 × 13
#>   expression      min   median `itr/sec` mem_alloc `gc/sec` n_itr  n_gc total_time result   memory     time       gc
#>   <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl> <int> <dbl>   <bch:tm> <list>   <list>     <list>     <list>
#> 1 separate      5.93s    5.94s    0.168     15.8MB   0.0720    14     6      1.39m <tibble> <Rprofmem> <bench_tm> <tibble>
#> 2 combined     12.22s   12.25s    0.0815    16.2MB   0.0439    13     7      2.66m <tibble> <Rprofmem> <bench_tm> <tibble>
ggplot2::autoplot(results) +
  theme_ipsum_rc(grid = "XxY") +
  labs(
    title = "Comparison of format parsing",
    # paste0() keeps the two subtitle lines free of a stray blank line and indentation
    subtitle = paste0(
      "separate = formats with or without separator are tried separately\n",
      "combined = formats are combined in a single vector and all are passed to `coalesce()`"
    )
  )
```
![image](https://user-images.githubusercontent.com/13176361/174673234-99592af2-43ed-4646-8890-2c794adf70f2.png)
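
For readers unfamiliar with the "combined" idea, here is a minimal base R sketch of coalescing over a vector of formats. It is only an illustration and not the arrow binding from this PR: `coalesce_dates()` and `parse_with_formats()` are hypothetical helper names, and `as.Date()` stands in for the actual parsing kernel.

```r
# Illustrative sketch only: base R stand-in, not the arrow implementation.
# `coalesce_dates()` and `parse_with_formats()` are hypothetical helpers.

# Element-wise: keep the first non-NA value across a list of Date vectors.
coalesce_dates <- function(candidates) {
  Reduce(function(acc, nxt) {
    missing <- is.na(acc)
    acc[missing] <- nxt[missing]
    acc
  }, candidates)
}

# Parse `x` once per format, then coalesce the attempts in order.
parse_with_formats <- function(x, formats) {
  attempts <- lapply(formats, function(fmt) as.Date(x, format = fmt))
  coalesce_dates(attempts)
}

x <- c("20220614", "2022-06-14")

# Both variants of the "ymd" order (with and without separators) in one vector.
combined <- parse_with_formats(x, c("%Y-%m-%d", "%Y%m%d"))
print(combined)
```

In this sketch each extra format adds one more full parse pass over the input.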