Posted to jira@arrow.apache.org by "Neal Richardson (Jira)" <ji...@apache.org> on 2020/12/21 20:44:01 UTC

[jira] [Resolved] (ARROW-10642) [R] Can't get Table from RecordBatchReader with 0 batches

     [ https://issues.apache.org/jira/browse/ARROW-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Neal Richardson resolved ARROW-10642.
-------------------------------------
    Resolution: Fixed

Issue resolved by pull request 8956
[https://github.com/apache/arrow/pull/8956]

> [R] Can't get Table from RecordBatchReader with 0 batches
> ---------------------------------------------------------
>
>                 Key: ARROW-10642
>                 URL: https://issues.apache.org/jira/browse/ARROW-10642
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: R
>    Affects Versions: 2.0.0
>         Environment: > sessionInfo()
> R version 4.0.3 (2020-10-10)
> Platform: x86_64-pc-linux-gnu (64-bit)
> Running under: Debian GNU/Linux 10 (buster)
> Matrix products: default
> BLAS:   /usr/lib/x86_64-linux-gnu/openblas/libblas.so.3
> LAPACK: /usr/lib/x86_64-linux-gnu/libopenblasp-r0.3.5.so
> locale:
>  [1] LC_CTYPE=C.UTF-8       LC_NUMERIC=C           LC_TIME=C.UTF-8       
>  [4] LC_COLLATE=C.UTF-8     LC_MONETARY=C.UTF-8    LC_MESSAGES=C.UTF-8   
>  [7] LC_PAPER=C.UTF-8       LC_NAME=C              LC_ADDRESS=C          
> [10] LC_TELEPHONE=C         LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C   
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base     
> other attached packages:
> [1] bigrquery_1.3.2        bigrquerystorage_0.1.0
> loaded via a namespace (and not attached):
>  [1] Rcpp_1.0.5           cellranger_1.1.0     pillar_1.4.6        
>  [4] compiler_4.0.3       dbplyr_2.0.0         tools_4.0.3         
>  [7] odbc_1.3.0           getPass_0.2-2        digest_0.6.27       
> [10] bit_4.0.4            gargle_0.5.0         jsonlite_1.7.1      
> [13] memoise_1.1.0        lifecycle_0.2.0      tibble_3.0.4        
> [16] pkgconfig_2.0.3      rlang_0.4.8          extraw_1.8.25       
> [19] DBI_1.1.0            rstudioapi_0.13      curl_4.3            
> [22] xml2_1.3.2           dplyr_1.0.2          httr_1.4.2          
> [25] askpass_1.1          fs_1.5.0             generics_0.1.0      
> [28] vctrs_0.3.5          hms_0.5.3            bit64_4.0.5         
> [31] tidyselect_1.1.0     glue_1.4.2           data.table_1.13.2   
> [34] R6_2.5.0             readxl_1.3.1         connect.cap_0.3.19  
> [37] purrr_0.3.4          blob_1.2.1           magrittr_2.0.1      
> [40] ellipsis_0.3.1       assertthat_0.2.1     keyring_1.1.0       
> [43] arrow_2.0.0.20201117 openssl_1.4.3        crayon_1.3.4 
>            Reporter: Bruno Tremblay
>            Priority: Minor
>              Labels: pull-request-available
>             Fix For: 3.0.0
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> The objective is to build a 0-row data.frame from the field definitions in an Arrow schema.
>
> {code:java}
> # IPC stream containing only a schema
> stream<-as.raw(c(255,255,255,255,16,1,0,0,16,0,0,0,0,0,10,0,12,0,6,0,5,0,8,0,10,0,0,0,0,1,3,0,12,0,0,0,8,0,8,0,0,0,4,0,8,0,0,0,4,0,0,0,4,0,0,0,160,0,0,0,92,0,0,0,48,0,0,0,4,0,0,0,128,255,255,255,0,0,1,5,20,0,0,0,12,0,0,0,4,0,0,0,0,0,0,0,176,255,255,255,7,0,0,0,82,69,80,79,78,83,69,0,168,255,255,255,0,0,1,5,20,0,0,0,12,0,0,0,4,0,0,0,0,0,0,0,216,255,255,255,6,0,0,0,68,69,84,65,73,76,0,0,208,255,255,255,0,0,1,5,24,0,0,0,16,0,0,0,4,0,0,0,0,0,0,0,4,0,4,0,4,0,0,0,8,0,0,0,68,65,84,65,84,89,80,69,0,0,0,0,16,0,20,0,8,0,6,0,7,0,12,0,0,0,16,0,16,0,0,0,0,0,1,7,36,0,0,0,20,0,0,0,4,0,0,0,0,0,0,0,8,0,12,0,4,0,8,0,8,0,0,0,38,0,0,0,9,0,0,0,8,0,0,0,77,65,67,84,65,95,73,68,0,0,0,0,0,0,0,0))
> readr <- RecordBatchStreamReader$create(stream)
> readr$read_table()
> # Error in Table__from_RecordBatchStreamReader(self) : 
> # Invalid: Must pass at least one record batch or an explicit Schema
> # Now trying to be too clever
> tb <- Table$create(data.frame(), schema = readr$schema)
> dtf <- as.data.frame(tb)
> # This will crash your R session
> {code}
>
> Tested on the nightly build, with the same behavior. This sits on the border between a bug and a feature request, but for arrow to be a drop-in replacement for some DBI methods, it needs to be able to build a 0-row data.frame with the correct class for each column.
>  
> Thank you and have a nice day.
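
For illustration only, the goal described above (a 0-row data.frame whose columns already have the right classes) can be sketched in base R without going through a Table. This is a minimal sketch under stated assumptions: the helper empty_df_from_fields, the empty_prototypes lookup, the type names in it, and the field names in the usage line are all hypothetical. It is not the conversion logic of the arrow package and not the change made in pull request 8956.

```r
# Hypothetical lookup from type names to zero-length R vectors.
# The names and the chosen R classes are assumptions for illustration,
# not the mapping the arrow package applies.
empty_prototypes <- list(
  int32     = integer(0),
  double    = double(0),
  string    = character(0),
  bool      = logical(0),
  timestamp = as.POSIXct(numeric(0), origin = "1970-01-01", tz = "UTC"),
  date32    = as.Date(numeric(0), origin = "1970-01-01")
)

# Build a 0-row data.frame from a named character vector
# (field name -> type name).
empty_df_from_fields <- function(fields) {
  unknown <- setdiff(unname(fields), names(empty_prototypes))
  if (length(unknown) > 0) {
    stop("No prototype for type(s): ", paste(unknown, collapse = ", "))
  }
  # lapply() keeps the field names, so they become the column names.
  cols <- lapply(fields, function(type_name) empty_prototypes[[type_name]])
  as.data.frame(cols, stringsAsFactors = FALSE)
}

# Usage with made-up field names.
df <- empty_df_from_fields(c(id = "int32", label = "string", created = "timestamp"))
str(df)
```

Starting from zero-length typed vectors means each column's class is fixed before any rows exist, which is the property a DBI-style caller relies on when a query returns no rows.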



--
This message was sent by Atlassian Jira
(v8.3.4#803005)