Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/05/09 18:07:20 UTC

[GitHub] [arrow] wjones127 commented on pull request #13090: ARROW-15622: [R] Implement union_all and union for arrow_dplyr_query

wjones127 commented on PR #13090:
URL: https://github.com/apache/arrow/pull/13090#issuecomment-1121415279

   Some example usage of `union_all()` and `union()` on Arrow tables:
   
   ``` r
   library(arrow)
   #> 
   #> Attaching package: 'arrow'
   #> The following object is masked from 'package:utils':
   #> 
   #>     timestamp
   library(dplyr)
   #> 
   #> Attaching package: 'dplyr'
   #> The following objects are masked from 'package:stats':
   #> 
   #>     filter, lag
   #> The following objects are masked from 'package:base':
   #> 
   #>     intersect, setdiff, setequal, union
   
   tab1 <- arrow_table(x = 1:3)
   tab2 <- arrow_table(x = 2:4, y = c("a", "b", "c"))
   
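   # union_all() keeps duplicate rows; without arrange() the row order
   # is not guaranteed (here tab2's rows happen to come first)
   tab1 |>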
       mutate(y = "a") |>
       union_all(tab2) |>
       collect()
   #> # A tibble: 6 × 2
   #>       x y    
   #>   <int> <chr>
   #> 1     2 a    
   #> 2     3 b    
   #> 3     4 c    
   #> 4     1 a    
   #> 5     2 a    
   #> 6     3 a
   
   # Same query, sorted so the output is deterministic
   tab1 |>
       mutate(y = "a") |>
       union_all(tab2) |>
       arrange(x, y) |>
       collect()
   #> # A tibble: 6 × 2
   #>       x y    
   #>   <int> <chr>
   #> 1     1 a    
   #> 2     2 a    
   #> 3     2 a    
   #> 4     3 a    
   #> 5     3 b    
   #> 6     4 c
   
   # union() also drops duplicate rows, so (2, "a") appears only once
   tab1 |>
       mutate(y = "a") |>
       dplyr::union(tab2) |>
       arrange(x, y) |>
       collect()
   #> # A tibble: 5 × 2
   #>       x y    
   #>   <int> <chr>
   #> 1     1 a    
   #> 2     2 a    
   #> 3     3 a    
   #> 4     3 b    
   #> 5     4 c
   ```
   
   <sup>Created on 2022-05-09 by the [reprex package](https://reprex.tidyverse.org) (v2.0.1)</sup>
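   
   For comparison, here is a minimal sketch of the same semantics using plain dplyr on in-memory tibbles. It is not part of this PR and was not run through reprex; `df1` and `df2` are illustrative names that mirror `tab1 |> mutate(y = "a")` and `tab2` above. It only shows that `union()` gives the rows of `union_all()` with duplicates removed.
   
   ``` r
   library(dplyr)
   
   # Illustrative data mirroring the Arrow tables in the example above
   df1 <- tibble(x = 1:3, y = "a")
   df2 <- tibble(x = 2:4, y = c("a", "b", "c"))
   
   # union_all() keeps every row from both inputs, duplicates included
   all_rows <- union_all(df1, df2)
   
   # union() removes duplicate rows; only (2, "a") occurs in both inputs
   unique_rows <- union(df1, df2)
   
   nrow(all_rows)
   #> [1] 6
   nrow(unique_rows)
   #> [1] 5
   
   # Deduplicating the union_all() result leaves the same number of rows
   nrow(distinct(all_rows))
   #> [1] 5
   ```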

