Posted to jira@arrow.apache.org by "Jonathan Keane (Jira)" <ji...@apache.org> on 2021/09/27 19:42:00 UTC

[jira] [Comment Edited] (ARROW-13767) [R] Add Arrow methods slice(), slice_head(), slice_tail()

    [ https://issues.apache.org/jira/browse/ARROW-13767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17417826#comment-17417826 ] 

Jonathan Keane edited comment on ARROW-13767 at 9/27/21, 7:41 PM:
------------------------------------------------------------------

The [dplyr docs say these functions "do not work with relational databases"|https://dplyr.tidyverse.org/reference/slice.html#details].

Though, IME, many real-world uses of {{slice()}}, {{slice_head()}}, {{slice_tail()}} happen after an arrange:

{code:r}
library(dplyr)

mtcars %>% 
  group_by(cyl) %>% 
  arrange(mpg) %>% 
  slice_head(n = 2)
{code}

Pipelines like this are better written using {{slice_min()}}, though the recommended idiom has evolved over time ({{top_n()}} and the like). Even so, I've seen code like the above (with {{arrange()}} + {{slice_head()}}) a lot.
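
For comparison, here is a rough sketch of the {{slice_min()}} form next to the {{arrange()}} + {{slice_head()}} form. It assumes dplyr >= 1.0.0 (where {{slice_min()}} is available); the names {{by_arrange}} and {{by_slice_min}} are only illustrative, and {{with_ties = FALSE}} is my assumption for matching {{slice_head(n = 2)}} exactly, since the default keeps ties and can return more than {{n}} rows per group:

{code:r}
library(dplyr)

# arrange() + slice_head(): sort by mpg, then keep the first 2 rows per group
by_arrange <- mtcars %>%
  group_by(cyl) %>%
  arrange(mpg) %>%
  slice_head(n = 2)

# slice_min(): keep the 2 rows with the lowest mpg per group, no sort step
# with_ties = FALSE caps the result at exactly n rows per group
by_slice_min <- mtcars %>%
  group_by(cyl) %>%
  slice_min(mpg, n = 2, with_ties = FALSE)

# Both approaches should select the same mpg values within each cyl group
all.equal(by_arrange$mpg, by_slice_min$mpg)
{code}

Which rows survive a tie at the cutoff may differ between the two forms, so this is a comparison of the selected values rather than a guarantee of identical rows.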



> [R] Add Arrow methods slice(), slice_head(), slice_tail()
> ---------------------------------------------------------
>
>                 Key: ARROW-13767
>                 URL: https://issues.apache.org/jira/browse/ARROW-13767
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: R
>            Reporter: Ian Cook
>            Priority: Major
>              Labels: query-engine
>             Fix For: 7.0.0
>
>
> Implement [{{slice()}}, {{slice_head()}}, and {{slice_tail()}}|https://dplyr.tidyverse.org/reference/slice.html] methods for {{ArrowTabular}}, {{Dataset}}, and {{arrow_dplyr_query}} objects. I believe this should be relatively straightforward, using {{Take()}} to return only the specified rows. We already have a {{head()}} method which I believe we can reuse for {{slice_head()}}.


