Posted to dev@arrow.apache.org by "Ben Schmidt (Jira)" <ji...@apache.org> on 2020/05/28 21:31:00 UTC

[jira] [Created] (ARROW-8977) Table$create crashes with some dictionary

Ben Schmidt created ARROW-8977:
----------------------------------

             Summary: Table$create crashes with some dictionary
                 Key: ARROW-8977
                 URL: https://issues.apache.org/jira/browse/ARROW-8977
             Project: Apache Arrow
          Issue Type: Bug
          Components: R
    Affects Versions: 0.17.1
         Environment: Using the latest nightly build of arrow and R 4.0.0 on OS X. 

R sessionInfo: 

R version 4.0.0 (2020-04-24)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS High Sierra 10.13.6

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] arrow_0.17.1.20200527

loaded via a namespace (and not attached):
 [1] tidyselect_1.1.0 bit_1.1-15.2     compiler_4.0.0   magrittr_1.5     assertthat_0.2.1 R6_2.4.1        
 [7] tools_4.0.0      glue_1.4.1       Rcpp_1.0.4.6     bit64_0.9-7      vctrs_0.3.0      knitr_1.28      
[13] xfun_0.14        rlang_0.4.6      purrr_0.3.4     
            Reporter: Ben Schmidt


On the latest nightly build of the R package, calling Table$create with a custom schema can crash R entirely (fatal error/bomb in RStudio) when the schema gives the dictionary column an index_type different from the one arrow uses when converting the factor.

 Example:
{code:r}
library(arrow)

native = data.frame(a = c(1, 2, 3), b = as.factor(c("a", "b", "c")))

# Works. 'a' is a double; 'b' is a dictionary with string values and int8 indices.
Table$create(native)

# Works, although 'a' is cast to uint32.
s1 = schema(a = uint32(), b = dictionary(value_type = arrow::string(), index_type = arrow::int8()))
Table$create(native, schema = s1)

# Crashes R on my system because index_type is int16(), not int8()
s2 = schema(a = uint32(), b = dictionary(value_type = arrow::string(), index_type = arrow::int16()))
Table$create(native, schema = s2)
{code}
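
Until this is fixed, a guard along the following lines might turn the crash into an ordinary R error. This is only a sketch: check_factor_index_types is a hypothetical helper, not part of arrow, and it assumes that factor columns are always converted with int8 indices, which is what the first call above shows but may not hold for factors with many levels.

{code:r}
library(arrow)

# Hypothetical helper (not part of arrow): raise an R error when the schema
# requests a dictionary index_type other than the one assumed for factors.
# Assumption: factors are converted with int8 indices, as observed above.
check_factor_index_types <- function(df, sch, expected = int8()) {
  for (name in names(df)) {
    # Only factor columns that the schema actually mentions are checked.
    if (!is.factor(df[[name]]) || !(name %in% sch$names)) next
    field_type <- sch$GetFieldByName(name)$type
    if (inherits(field_type, "DictionaryType") &&
        !field_type$index_type$Equals(expected)) {
      stop("column '", name, "': schema requests index_type ",
           field_type$index_type$ToString(), ", expected ",
           expected$ToString(), call. = FALSE)
    }
  }
  invisible(TRUE)
}
{code}

Calling check_factor_index_types(native, s2) before Table$create(native, schema = s2) should then stop with an error message instead of reaching the failing check in array.cc.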
 

After restarting, the following log appears in my RStudio session:

 
{noformat}
/private/var/folders/84/dvp0h0kn22qcx_0z_hn_b36w0000gn/T/hbtmp/apache-arrow-20200528-16757-1uok2ln/cpp/src/arrow/array.cc:1194: Check failed: (indices->type_id()) == (dict.index_type()->id())
0 arrow.so 0x000000010dd2b3cd _ZN5arrow4util7CerrLogD2Ev + 209
1 arrow.so 0x000000010dd2b2ee _ZN5arrow4util7CerrLogD0Ev + 14
2 arrow.so 0x000000010dd2b296 _ZN5arrow4util8ArrowLogD1Ev + 34
3 arrow.so 0x000000010daf2429 _ZN5arrow15DictionaryArray10FromArraysERKNSt3__110shared_ptrINS_8DataTypeEEERKNS2_INS_5ArrayEEESA_ + 619
4 arrow.so 0x000000010d8f1eff _ZN5arrow1r19MakeFactorArrayImplINS_8Int8TypeEEENSt3__110shared_ptrINS_5ArrayEEEN4Rcpp6VectorILi13ENS7_16NoProtectStorageEEERKNS4_INS_8DataTypeEEE + 1743
5 arrow.so 0x000000010d8f1684 _ZN5arrow1r15MakeFactorArrayEN4Rcpp6VectorILi13ENS1_16NoProtectStorageEEERKNSt3__110shared_ptrINS_8DataTypeEEE + 260
6 arrow.so 0x000000010d8f40f4 _ZN5arrow1r18Array__from_vectorEP7SEXPRECRKNSt3__110shared_ptrINS_8DataTypeEEEb + 292
7 arrow.so 0x000000010d98e418 _ZZ16Table__from_dotsP7SEXPRECS0_ENK3$_1clEiS0_ + 440
8 arrow.so 0x000000010d98d15b _Z16Table__from_dotsP7SEXPRECS0_ + 1611
9 arrow.so 0x000000010d95af3b _arrow_Table__from_dots + 91
10 libR.dylib 0x0000000105466b5d R_doDotCall + 1437
11 libR.dylib 0x00000001054b2a7a bcEval + 105226
12 libR.dylib 0x0000000105498831 Rf_eval + 385
13 libR.dylib 0x00000001054b8cf1 R_execClosure + 2193
14 libR.dylib 0x00000001054b7ac9 Rf_applyClosure + 473
15 libR.dylib 0x000000010549f9a8 bcEval + 27192
16 libR.dylib 0x0000000105498831 Rf_eval + 385
17 libR.dylib 0x00000001054b71cc forcePromise + 172
18 libR.dylib 0x00000001054c2b4a getvar + 778
19 libR.dylib 0x000000010549cb8c bcEval + 15388
20 libR.dylib 0x0000000105498831 Rf_eval + 385
21 libR.dylib 0x00000001054b71cc forcePromise + 172
22 libR.dylib 0x00000001054c2b4a getvar + 778
23 libR.dylib 0x000000010549cb8c bcEval + 15388
24 libR.dylib 0x0000000105498831 Rf_eval + 385
25 libR.dylib 0x00000001054b8cf1 R_execClosure + 2193
26 libR.dylib 0x00000001054b7ac9 Rf_applyClosure + 473
27 libR.dylib 0x000000010549f9a8 bcEval + 27192
28 libR.dylib 0x0000000105498831 Rf_eval + 385
29 libR.dylib 0x00000001054b8cf1 R_execClosure + 2193
30 libR.dylib 0x00000001054b7ac9 Rf_applyClosure + 473
31 libR.dylib 0x000000010549f9a8 bcEval + 27192
32 libR.dylib 0x0000000105498831 Rf_eval + 385
33 libR.dylib 0x00000001054b8cf1 R_execClosure + 2193
34 libR.dylib 0x00000001054b7ac9 Rf_applyClosure + 473
35 libR.dylib 0x0000000105498d06 Rf_eval + 1622
36 libR.dylib 0x00000001054edcda Rf_ReplIteration + 810
37 libR.dylib 0x00000001054ef1ff run_Rmainloop + 207
38 rsession 0x0000000104bef4d0 _ZN7rstudio1r7session12runEmbeddedRERKNS_4core8FilePathES5_bb7SA_TYPERKNS1_9CallbacksEPNS1_17InternalCallbacksE + 416
 
{noformat}
 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)