Posted to jira@arrow.apache.org by "Bruno Tremblay (Jira)" <ji...@apache.org> on 2020/12/01 18:33:00 UTC

[jira] [Updated] (ARROW-10773) [R] parallel as.data.frame.Table hangs indefinitely on Windows

     [ https://issues.apache.org/jira/browse/ARROW-10773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bruno Tremblay updated ARROW-10773:
-----------------------------------
    Description: 
Occurs on Windows only.

Tested on 2 machines, both with mingw builds of R.

Reprex
{code:java}
install.packages("arrow", repos = "https://arrow-r-nightly.s3.amazonaws.com")
remotes::install_github("meztez/bigrquery", ref = "bigquerystorage", INSTALL_opts = "--no-multiarch")
library(bigrquery)

Sys.info()
sessionInfo()

Sys.setenv("BIGQUERY_TEST_PROJECT"="{project}")
con <- bigrquery::dbConnect(
  bigrquery::bigquery(),
  project = "bigquery-public-data",
  dataset = "usa_names",
  billing = bigrquery:::bq_test_project())

# Does not hang
options(arrow.use_threads = FALSE)
dt <- DBI::dbReadTable(con, "bigquery-public-data.usa_names.usa_1910_current", bqs = TRUE)

# Hangs
options(arrow.use_threads = TRUE)
dt <- DBI::dbReadTable(con, "bigquery-public-data.usa_names.usa_1910_current", bqs = TRUE){code}
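
Since the reprex above toggles the global arrow.use_threads option, a scoped wrapper may be a convenient way to apply the non-hanging setting to a single call without leaving the option changed for the rest of the session. This is only a minimal sketch of a workaround, not a fix for the underlying hang. with_arrow_threads is a hypothetical helper name and is not part of arrow or bigrquery; it uses base R only.

```r
# Hypothetical helper (not part of arrow or bigrquery): evaluate `expr` with
# the arrow.use_threads option set to `use_threads`, then restore the
# previous value on exit, even if `expr` raises an error.
with_arrow_threads <- function(use_threads, expr) {
  # options() returns the previous value(s), so they can be restored later.
  old <- options(arrow.use_threads = use_threads)
  on.exit(options(old), add = TRUE)
  # `expr` is a lazily evaluated promise; forcing it here means it runs
  # only after the option has been set.
  force(expr)
}

# Usage, assuming `con` was created as in the reprex:
# dt <- with_arrow_threads(
#   FALSE,
#   DBI::dbReadTable(con, "bigquery-public-data.usa_names.usa_1910_current", bqs = TRUE)
# )
```

Restoring through on.exit keeps the previous setting intact for any other code that relies on threaded conversion.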
 

Session details

 
{code:java}
> Sys.info()
       sysname        release        version       nodename        machine          login           user effective_user 
     "Windows"       "10 x64"  "build 19042"   "C000055787"       "x86-64"     "gen01914"     "gen01914"     "gen01914" 
> sessionInfo()
R version 4.0.3 (2020-10-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19042)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                           LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] bigrquery_1.3.2.9001

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.5           rstudioapi_0.13      magrittr_1.5         tidyselect_1.1.0     bit_4.0.4            R6_2.5.0            
 [7] rlang_0.4.8          dplyr_1.0.2          httr_1.4.2           tools_4.0.3          arrow_2.0.0.20201130 DBI_1.1.0           
[13] dbplyr_2.0.0         ellipsis_0.3.1       remotes_2.2.0        bit64_4.0.5          assertthat_0.2.1     gargle_0.5.0        
[19] tibble_3.0.4         lifecycle_0.2.0      crayon_1.3.4         purrr_0.3.4          fs_1.5.0             vctrs_0.3.4         
[25] glue_1.4.2           compiler_4.0.3       pillar_1.4.6         generics_0.1.0       jsonlite_1.7.1       pkgconfig_2.0.3  
{code}
 

```

  was:
On Windows only

Tested on 2 machines, mingw. 

Reprex
{code:java}
remotes::install_github("meztez/bigrquerystorage")
library(bigrquerystorage)
## Auth is done automagically using Application Default Credentials.
## Use the following command once to set it up :
## gcloud auth application-default login --billing-project={project}
options(bigquerystorage.project = "{project}")
arrow_table <- bqs_table_download(
 x = "bigquery-public-data:usa_names.usa_1910_current"
)
# On Windows Only
# Does not hang
options(arrow.use_threads = FALSE)
df <- as.data.frame(arrow_table)
# Hangs
options(arrow.use_threads = TRUE)
df <- as.data.frame(arrow_table){code}
 

Session details

 
{code:java}
> Sys.info()
 sysname release version nodename 
 "Windows" "10 x64" "build 19041" "DESKTOP-G89AJQO" 
 machine login user effective_user 
 "x86-64" "tremb" "tremb" "tremb" 
 > sessionInfo()
 R version 4.0.2 (2020-06-22)
 Platform: x86_64-w64-mingw32/x64 (64-bit)
 Running under: Windows 10 x64 (build 19041)
Matrix products: default
locale:
 [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 
 [3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C 
 [5] LC_TIME=English_United States.1252 
 system code page: 65001
attached base packages:
 [1] stats graphics grDevices utils datasets methods base
other attached packages:
 [1] bigrquerystorage_0.2.0
loaded via a namespace (and not attached):
 [1] Rcpp_1.0.5 rstudioapi_0.11 magrittr_2.0.1 tidyselect_1.1.0 
 [5] bit_4.0.4 R6_2.5.0 rlang_0.4.9 httr_1.4.2 
 [9] tools_4.0.2 xfun_0.15 arrow_2.0.0 tinytex_0.24 
 [13] DBI_1.1.0 ellipsis_0.3.1 remotes_2.2.0.9000 bit64_4.0.5 
 [17] assertthat_0.2.1 tibble_3.0.4 lifecycle_0.2.0 crayon_1.3.4 
 [21] purrr_0.3.4 vctrs_0.3.5 glue_1.4.1 bigrquery_1.3.2 
 [25] compiler_4.0.2 pillar_1.4.7 jsonlite_1.7.1 pkgconfig_2.0.3
{code}
 

```


> [R] parallel as.data.frame.Table hangs indefinitely on Windows
> --------------------------------------------------------------
>
>                 Key: ARROW-10773
>                 URL: https://issues.apache.org/jira/browse/ARROW-10773
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: C++, R
>    Affects Versions: 2.0.0
>            Reporter: Bruno Tremblay
>            Priority: Minor


