Posted to issues@arrow.apache.org by "Neal Richardson (Jira)" <ji...@apache.org> on 2020/05/28 21:45:00 UTC
[jira] [Updated] (ARROW-8977) [R] Table$create crashes with some
dictionary index types
[ https://issues.apache.org/jira/browse/ARROW-8977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Neal Richardson updated ARROW-8977:
-----------------------------------
Summary: [R] Table$create crashes with some dictionary index types (was: R Table$create crashes with some dictionary index types)
> [R] Table$create crashes with some dictionary index types
> ---------------------------------------------------------
>
> Key: ARROW-8977
> URL: https://issues.apache.org/jira/browse/ARROW-8977
> Project: Apache Arrow
> Issue Type: Bug
> Components: R
> Affects Versions: 0.17.1
> Environment: Using the latest nightly build of arrow and R 4.0.0 on OS X.
> R sessionInfo:
> R version 4.0.0 (2020-04-24)
> Platform: x86_64-apple-darwin17.0 (64-bit)
> Running under: macOS High Sierra 10.13.6
> Matrix products: default
> BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
> LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
> locale:
> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
> other attached packages:
> [1] arrow_0.17.1.20200527
> loaded via a namespace (and not attached):
> [1] tidyselect_1.1.0 bit_1.1-15.2 compiler_4.0.0 magrittr_1.5 assertthat_0.2.1 R6_2.4.1
> [7] tools_4.0.0 glue_1.4.1 Rcpp_1.0.4.6 bit64_0.9-7 vctrs_0.3.0 knitr_1.28
> [13] xfun_0.14 rlang_0.4.6 purrr_0.3.4
> Reporter: Ben Schmidt
> Priority: Minor
>
> On the latest nightly build of the arrow R package, calling Table$create with a custom schema can crash R entirely (fatal error/bomb in RStudio) when the schema specifies a dictionary index_type other than the one Table$create would otherwise use for the column (int8 in the example below).
> Example:
> {code:r}
> library(arrow)
> native = data.frame(a = c(1, 2, 3), b = as.factor(c("a", "b", "c")))
> # Works. 'a' is <float>, dictionary is string - int8
> Table$create(native)
> # Works, although 'a' is cast to uint32.
> s1 = schema(a = uint32(), b = dictionary(value_type = arrow::string(), index_type = arrow::int8()))
> Table$create(native, schema = s1)
> # Crashes R on my system because index_type is int16(), not int8()
> s2 = schema(a = uint32(), b = dictionary(value_type = arrow::string(), index_type = arrow::int16()))
> Table$create(native, schema = s2)
> {code}
>
> On restart, the following log appears in my RStudio session:
>
> {noformat}
> /private/var/folders/84/dvp0h0kn22qcx_0z_hn_b36w0000gn/T/hbtmp/apache-arrow-20200528-16757-1uok2ln/cpp/src/arrow/array.cc:1194: Check failed: (indices->type_id()) == (dict.index_type()->id())
> 0 arrow.so 0x000000010dd2b3cd _ZN5arrow4util7CerrLogD2Ev + 209
> 1 arrow.so 0x000000010dd2b2ee _ZN5arrow4util7CerrLogD0Ev + 14
> 2 arrow.so 0x000000010dd2b296 _ZN5arrow4util8ArrowLogD1Ev + 34
> 3 arrow.so 0x000000010daf2429 _ZN5arrow15DictionaryArray10FromArraysERKNSt3__110shared_ptrINS_8DataTypeEEERKNS2_INS_5ArrayEEESA_ + 619
> 4 arrow.so 0x000000010d8f1eff _ZN5arrow1r19MakeFactorArrayImplINS_8Int8TypeEEENSt3__110shared_ptrINS_5ArrayEEEN4Rcpp6VectorILi13ENS7_16NoProtectStorageEEERKNS4_INS_8DataTypeEEE + 1743
> 5 arrow.so 0x000000010d8f1684 _ZN5arrow1r15MakeFactorArrayEN4Rcpp6VectorILi13ENS1_16NoProtectStorageEEERKNSt3__110shared_ptrINS_8DataTypeEEE + 260
> 6 arrow.so 0x000000010d8f40f4 _ZN5arrow1r18Array__from_vectorEP7SEXPRECRKNSt3__110shared_ptrINS_8DataTypeEEEb + 292
> 7 arrow.so 0x000000010d98e418 _ZZ16Table__from_dotsP7SEXPRECS0_ENK3$_1clEiS0_ + 440
> 8 arrow.so 0x000000010d98d15b _Z16Table__from_dotsP7SEXPRECS0_ + 1611
> 9 arrow.so 0x000000010d95af3b _arrow_Table__from_dots + 91
> 10 libR.dylib 0x0000000105466b5d R_doDotCall + 1437
> 11 libR.dylib 0x00000001054b2a7a bcEval + 105226
> 12 libR.dylib 0x0000000105498831 Rf_eval + 385
> 13 libR.dylib 0x00000001054b8cf1 R_execClosure + 2193
> 14 libR.dylib 0x00000001054b7ac9 Rf_applyClosure + 473
> 15 libR.dylib 0x000000010549f9a8 bcEval + 27192
> 16 libR.dylib 0x0000000105498831 Rf_eval + 385
> 17 libR.dylib 0x00000001054b71cc forcePromise + 172
> 18 libR.dylib 0x00000001054c2b4a getvar + 778
> 19 libR.dylib 0x000000010549cb8c bcEval + 15388
> 20 libR.dylib 0x0000000105498831 Rf_eval + 385
> 21 libR.dylib 0x00000001054b71cc forcePromise + 172
> 22 libR.dylib 0x00000001054c2b4a getvar + 778
> 23 libR.dylib 0x000000010549cb8c bcEval + 15388
> 24 libR.dylib 0x0000000105498831 Rf_eval + 385
> 25 libR.dylib 0x00000001054b8cf1 R_execClosure + 2193
> 26 libR.dylib 0x00000001054b7ac9 Rf_applyClosure + 473
> 27 libR.dylib 0x000000010549f9a8 bcEval + 27192
> 28 libR.dylib 0x0000000105498831 Rf_eval + 385
> 29 libR.dylib 0x00000001054b8cf1 R_execClosure + 2193
> 30 libR.dylib 0x00000001054b7ac9 Rf_applyClosure + 473
> 31 libR.dylib 0x000000010549f9a8 bcEval + 27192
> 32 libR.dylib 0x0000000105498831 Rf_eval + 385
> 33 libR.dylib 0x00000001054b8cf1 R_execClosure + 2193
> 34 libR.dylib 0x00000001054b7ac9 Rf_applyClosure + 473
> 35 libR.dylib 0x0000000105498d06 Rf_eval + 1622
> 36 libR.dylib 0x00000001054edcda Rf_ReplIteration + 810
> 37 libR.dylib 0x00000001054ef1ff run_Rmainloop + 207
> 38 rsession 0x0000000104bef4d0 _ZN7rstudio1r7session12runEmbeddedRERKNS_4core8FilePathES5_bb7SA_TYPERKNS1_9CallbacksEPNS1_17InternalCallbacksE + 416
>
> {noformat}
>