Posted to dev@arrow.apache.org by "Stephanie Hazlitt (Jira)" <ji...@apache.org> on 2020/02/04 22:02:00 UTC

[jira] [Created] (ARROW-7772) Unable to dplyr::filter on date32 object when using open_dataset()

Stephanie Hazlitt created ARROW-7772:
----------------------------------------

             Summary: Unable to dplyr::filter on date32 object when using open_dataset()
                 Key: ARROW-7772
                 URL: https://issues.apache.org/jira/browse/ARROW-7772
             Project: Apache Arrow
          Issue Type: Bug
          Components: R
    Affects Versions: 0.15.1
         Environment: version  R version 3.6.2 (2019-12-12)
 os       macOS Mojave 10.14.6        
 system   x86_64, darwin15.6.0        
            Reporter: Stephanie Hazlitt
             Fix For: 0.16.0


I am trying to filter on a date column using `open_dataset()` and `dplyr::filter()`:

library(arrow)
library(dplyr)

tmp <- tempfile()
dir.create(tmp)
df <- data.frame(date = Sys.Date())
write_parquet(df, file.path(tmp, "file.parquet"))

ds <- open_dataset(tmp)


ds %>%
 filter(date > as.Date("2020-02-02")) %>%
 collect()

 

This code crashes R with this error message:
{quote}/private/var/folders/nz/vv4_9tw56nv9k3tkvyszvwg80000gn/T/hbtmp/apache-arrow-20200203-29929-1uoyri7/cpp/src/arrow/result.cc:28: ValueOrDie called on an error: NotImplemented: casting scalars of type date64[ms] to type date32[day]
0 arrow.so 0x0000000104461f1d _ZN5arrow4util7CerrLogD2Ev + 209
1 arrow.so 0x0000000104461e3e _ZN5arrow4util7CerrLogD0Ev + 14
2 arrow.so 0x0000000104461de6 _ZN5arrow4util8ArrowLogD1Ev + 34
3 arrow.so 0x000000010436c57f _ZN5arrow8internal14DieWithMessageERKNSt3__112basic_stringIcNS1_11char_traitsIcEENS1_9allocatorIcEEEE + 63
4 arrow.so 0x000000010436d384 _ZNR5arrow6ResultINSt3__110shared_ptrINS_6ScalarEEEE10ValueOrDieEv + 192
5 arrow.so 0x000000010426ee1a _ZN5arrow7dataset23InsertImplicitCastsImpl4CastENSt3__110shared_ptrINS_8DataTypeEEEPNS3_INS0_10ExpressionEEE + 186
6 arrow.so 0x000000010426e033 _ZN5arrow7dataset23InsertImplicitCastsImplclERKNS0_20ComparisonExpressionE + 757
7 arrow.so 0x0000000104269de7 _ZN5arrow7dataset15VisitExpressionINS0_23InsertImplicitCastsImplEEEDTclfp0_fp_EERKNS0_10ExpressionEOT_ + 317
8 arrow.so 0x0000000104269c76 _ZN5arrow7dataset19InsertImplicitCastsERKNS0_10ExpressionERKNS_6SchemaE + 36
9 arrow.so 0x0000000104134e04 _Z32dataset___ScannerBuilder__FilterRKNSt3__110shared_ptrIN5arrow7dataset14ScannerBuilderEEERKNS0_INS2_10ExpressionEEE + 52
10 arrow.so 0x00000001040f35b7 _arrow_dataset___ScannerBuilder__Filter + 135
11 libR.dylib 0x00000001001f375b R_doDotCall + 955
12 libR.dylib 0x000000010023d46a bcEval + 99306
13 libR.dylib 0x000000010022494d Rf_eval + 445
14 libR.dylib 0x0000000100243209 R_execClosure + 2153
15 libR.dylib 0x000000010024210a Rf_applyClosure + 346
16 libR.dylib 0x0000000100224e7d Rf_eval + 1773
17 libR.dylib 0x0000000100245880 do_begin + 432
18 libR.dylib 0x0000000100224b40 Rf_eval + 944
19 libR.dylib 0x0000000100243209 R_execClosure + 2153
20 libR.dylib 0x000000010024210a Rf_applyClosure + 346
21 libR.dylib 0x000000010022bd71 bcEval + 27889
22 libR.dylib 0x000000010022494d Rf_eval + 445
23 libR.dylib 0x0000000100243209 R_execClosure + 2153
24 libR.dylib 0x000000010024210a Rf_applyClosure + 346
25 libR.dylib 0x0000000100287675 dispatchMethod + 757
26 libR.dylib 0x0000000100287332 Rf_usemethod + 738
27 libR.dylib 0x0000000100287926 do_usemethod + 646
28 libR.dylib 0x000000010022c369 bcEval + 29417
29 libR.dylib 0x000000010022494d Rf_eval + 445
30 libR.dylib 0x0000000100243209 R_execClosure + 2153
31 libR.dylib 0x000000010024210a Rf_applyClosure + 346
32 libR.dylib 0x0000000100224e7d Rf_eval + 1773
33 libR.dylib 0x0000000100243209 R_execClosure + 2153
34 libR.dylib 0x000000010024210a Rf_applyClosure + 346
35 libR.dylib 0x000000010022bd71 bcEval + 27889
36 libR.dylib 0x000000010022494d Rf_eval + 445
37 libR.dylib 0x00000001002418c3 forcePromise + 179
38 libR.dylib 0x0000000100224c30 Rf_eval + 1184
39 libR.dylib 0x0000000100247241 do_withVisible + 49
40 libR.dylib 0x0000000100286603 do_internal + 339
41 libR.dylib 0x000000010022c369 bcEval + 29417
42 libR.dylib 0x000000010022494d Rf_eval + 445
43 libR.dylib 0x0000000100243209 R_execClosure + 2153
44 libR.dylib 0x000000010024210a Rf_applyClosure + 346
45 libR.dylib 0x000000010022bd71 bcEval + 27889
46 libR.dylib 0x000000010022494d Rf_eval + 445
47 libR.dylib 0x0000000100243209 R_execClosure + 2153
48 libR.dylib 0x000000010024210a Rf_applyClosure + 346
49 libR.dylib 0x0000000100224e7d Rf_eval + 1773
50 libR.dylib 0x0000000100243209 R_execClosure + 2153
51 libR.dylib 0x000000010024210a Rf_applyClosure + 346
52 libR.dylib 0x0000000100224e7d Rf_eval + 1773
53 libR.dylib 0x0000000100246bc6 do_eval + 646
54 libR.dylib 0x000000010022c186 bcEval + 28934
55 libR.dylib 0x000000010022494d Rf_eval + 445
56 libR.dylib 0x0000000100243209 R_execClosure + 2153
57 libR.dylib 0x000000010024210a Rf_applyClosure + 346
58 libR.dylib 0x000000010022bd71 bcEval + 27889
59 libR.dylib 0x000000010022494d Rf_eval + 445
60 libR.dylib 0x00000001002418c3 forcePromise + 179
61 libR.dylib 0x0000000100224c30 Rf_eval + 1184
62 libR.dylib 0x0000000100247241 do_withVisible + 49
63 libR.dylib 0x0000000100286603 do_internal + 339
64 libR.dylib 0x000000010022c369 bcEval + 29417
65 libR.dylib 0x000000010022494d Rf_eval + 445
66 libR.dylib 0x0000000100243209 R_execClosure + 2153
67 libR.dylib 0x000000010024210a Rf_applyClosure + 346
68 libR.dylib 0x000000010022bd71 bcEval + 27889
69 libR.dylib 0x000000010022494d Rf_eval + 445
70 libR.dylib 0x0000000100243209 R_execClosure + 2153
71 libR.dylib 0x000000010024210a Rf_applyClosure + 346
72 libR.dylib 0x0000000100224e7d Rf_eval + 1773
73 libR.dylib 0x000000010027506a Rf_ReplIteration + 794
74 libR.dylib 0x000000010027658f run_Rmainloop + 207
75 R 0x000000010016cf5b main + 27
76 libdyld.dylib 0x00007fff7de493d5 start + 1
Abort trap: 6
{quote}
 

Thanks to Neal Richardson for help with the above reprex and for putting words to the issue. Neal says:
{quote}"the 3 bugs are: (1) that should not crash; (2) R should translate Date in the filter expression to date32, just as it does for vectors; (3) the C++ layer should probably support that date64 to date32 cast (up for debate and technically not a bug, just not yet implemented I guess)".
{quote}
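
For context on item (3) in the quote: the two types named in the error message differ only in their unit. date32[day] counts whole days since 1970-01-01, while date64[ms] counts milliseconds since the same epoch. The following is a minimal base R sketch of that arithmetic, not the Arrow implementation; the names ms_per_day, as_date64_ms and as_date32_days are made up for illustration, and no arrow functions are called.

```r
# Illustration only: base R arithmetic, not Arrow's cast kernel.
# All helper names below are hypothetical.
ms_per_day <- 86400000  # 24 * 60 * 60 * 1000

# Date -> milliseconds since 1970-01-01 (the unit date64[ms] uses)
as_date64_ms <- function(d) as.numeric(unclass(d)) * ms_per_day

# milliseconds -> whole days since 1970-01-01 (the unit date32[day] uses)
as_date32_days <- function(ms) as.integer(ms %/% ms_per_day)

d <- as.Date("2020-02-02")
ms <- as_date64_ms(d)
days <- as_date32_days(ms)

# The round trip is lossless for a value that falls on a day boundary.
stopifnot(as.Date(days, origin = "1970-01-01") == d)
```

Because an R Date carries no time of day, the millisecond value is always a whole multiple of ms_per_day, so integer division back to days loses nothing in this case.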
 

session_info():
{quote}─ Session info ────────────────────────────────────────────────────────────────────────────
 setting  value
 version  R version 3.6.2 (2019-12-12)
 os       macOS Mojave 10.14.6
 system   x86_64, darwin15.6.0
 ui       RStudio
 language (EN)
 collate  en_CA.UTF-8
 ctype    en_CA.UTF-8
 tz       America/Vancouver
 date     2020-02-04

─ Packages ────────────────────────────────────────────────────────────────────────────────
 package     * version         date       lib source
 arrow       * 0.15.1.20200203 2020-02-03 [1] local
 assertthat    0.2.1           2019-03-21 [1] CRAN (R 3.6.0)
 backports     1.1.5           2019-10-02 [1] CRAN (R 3.6.0)
 bit           1.1-15.1        2020-01-14 [1] CRAN (R 3.6.0)
 bit64         0.9-7           2017-05-08 [1] CRAN (R 3.6.0)
 bortles       0.0.0.9000      2019-11-12 [1] Github (adam-gruer/bortles@068a1b1)
 callr         3.4.1           2020-01-24 [1] CRAN (R 3.6.0)
 cli           2.0.1           2020-01-08 [1] CRAN (R 3.6.0)
 clisymbols    1.2.0           2017-05-21 [1] CRAN (R 3.6.0)
 crayon        1.3.4           2017-09-16 [1] CRAN (R 3.6.0)
 desc          1.2.0           2018-05-01 [1] CRAN (R 3.6.0)
 devtools    * 2.2.1           2019-09-24 [1] CRAN (R 3.6.1)
 digest        0.6.23          2019-11-23 [1] CRAN (R 3.6.0)
 dplyr       * 0.8.4           2020-01-31 [1] CRAN (R 3.6.0)
 ellipsis      0.3.0           2019-09-20 [1] CRAN (R 3.6.0)
 evaluate      0.14            2019-05-28 [1] CRAN (R 3.6.0)
 fansi         0.4.1           2020-01-08 [1] CRAN (R 3.6.0)
 fs            1.3.1           2019-05-06 [1] CRAN (R 3.6.0)
 glue          1.3.1           2019-03-12 [1] CRAN (R 3.6.0)
 here        * 0.1             2017-05-28 [1] CRAN (R 3.5.0)
 htmltools     0.4.0           2019-10-04 [1] CRAN (R 3.6.0)
 knitr         1.27            2020-01-16 [1] CRAN (R 3.6.2)
 magrittr      1.5             2014-11-22 [1] CRAN (R 3.6.0)
 memoise       1.1.0           2017-04-21 [1] CRAN (R 3.6.0)
 packrat       0.5.0           2018-11-14 [1] CRAN (R 3.6.0)
 pillar        1.4.3           2019-12-20 [1] CRAN (R 3.6.0)
 pkgbuild      1.0.6           2019-10-09 [1] CRAN (R 3.6.0)
 pkgconfig     2.0.3           2019-09-22 [1] CRAN (R 3.6.0)
 pkgload       1.0.2           2018-10-29 [1] CRAN (R 3.6.0)
 praise        1.0.0           2015-08-11 [1] CRAN (R 3.5.0)
 prettyunits   1.1.1           2020-01-24 [1] CRAN (R 3.6.0)
 processx      3.4.1           2019-07-18 [1] CRAN (R 3.6.0)
 prompt        1.0.0           2020-01-13 [1] Github (gaborcsardi/prompt@b332c42)
 ps            1.3.0           2018-12-21 [1] CRAN (R 3.6.0)
 purrr         0.3.3           2019-10-18 [1] CRAN (R 3.6.0)
 R6            2.4.1           2019-11-12 [1] CRAN (R 3.6.0)
 Rcpp          1.0.3           2019-11-08 [1] CRAN (R 3.6.0)
 remotes       2.1.0           2019-06-24 [1] CRAN (R 3.6.0)
 reprex        0.3.0           2019-05-16 [1] CRAN (R 3.6.0)
 rlang         0.4.4           2020-01-28 [1] CRAN (R 3.6.0)
 rmarkdown     2.1             2020-01-20 [1] CRAN (R 3.6.0)
 rprojroot     1.3-2           2018-01-03 [1] CRAN (R 3.6.0)
 rstudioapi    0.10            2019-03-19 [1] CRAN (R 3.6.0)
 sessioninfo   1.1.1           2018-11-05 [1] CRAN (R 3.6.0)
 testthat    * 2.3.1           2019-12-01 [1] CRAN (R 3.6.0)
 tibble        2.1.3           2019-06-06 [1] CRAN (R 3.6.0)
 tictoc      * 1.0             2014-06-17 [1] CRAN (R 3.6.0)
 tidyselect    1.0.0           2020-01-27 [1] CRAN (R 3.6.0)
 usethis     * 1.5.1           2019-07-04 [1] CRAN (R 3.6.0)
 vctrs         0.2.2           2020-01-24 [1] CRAN (R 3.6.0)
 whisker       0.4             2019-08-28 [1] CRAN (R 3.6.0)
 withr         2.1.2           2018-03-15 [1] CRAN (R 3.6.0)
 xfun          0.12            2020-01-13 [1] CRAN (R 3.6.0)
{quote}


