Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/10/17 19:19:36 UTC

[GitHub] [arrow] paleolimbot commented on a diff in pull request #14337: ARROW-17954: [R] Update news for 10.0

paleolimbot commented on code in PR #14337:
URL: https://github.com/apache/arrow/pull/14337#discussion_r997421476


##########
r/NEWS.md:
##########
@@ -19,6 +19,61 @@
 
 # arrow 9.0.0.9000
 
+## Arrow dplyr queries
+
+Several new functions can be used in queries: 
+
+* `dplyr::across()` can be used to apply the same computation across multiple 
+  columns;
+* `add_filename()` can be used to get the filename a row came from (only 
+  available when querying `?Dataset`);

Review Comment:
   ```suggestion
     available when querying a `Dataset`);
   ```
   
   (I see question marks in a few other places, so maybe that's intentional. I think pkgdown takes care of the linking for you, but maybe that's only for functions?)



##########
r/NEWS.md:
##########
@@ -19,6 +19,61 @@
 
 # arrow 9.0.0.9000
 
+## Arrow dplyr queries
+
+Several new functions can be used in queries: 
+
+* `dplyr::across()` can be used to apply the same computation across multiple 
+  columns;
+* `add_filename()` can be used to get the filename a row came from (only 
+  available when querying `?Dataset`);
+* Five functions in the `slice_*` family: `dplyr::slice_min()`, 
+  `dplyr::slice_max()`, `dplyr::slice_head()`, `dplyr::slice_tail()`, and
+  `dplyr::slice_sample()`.
+
+A full list of functions available in queries is available at `?acero`.
+
+A few new features and bugfixes were implemented for joins.
+Extension arrays are now supported in joins, allowing, for example, joining 
+datasets that contain [geoarrow](https://paleolimbot.github.io/geoarrow/) data.
+The `keep` argument is now supported, allowing separate columns for the left
+and right hand side join keys in join output. Full joins now coalesce the 
+join keys (when `keep = FALSE`), avoiding the issue where the join keys would 
+be all `NA` for rows in the right hand side without any matches on the left.
+
+A few breaking changes: Calling `dplyr::pull()` will return a `?ChunkedArray` 
+instead of an R vector. Calling `dplyr::compute()` on a query that is grouped 
+returns a `?Table`, instead of an query object.
+
+Finally, long-running queries can now be cancelled and will abort their 
+computation immediately.
+
+## Arrays and tables
+
+`as_arrow_array()` can now take `blob::blob` and `?vctrs::list_of`, which
+convert to binary and list arrays, respectively. Also fixed issue where 

Review Comment:
   ```suggestion
   convert to binary and list arrays, respectively. Also fixed an issue where 
   ```



##########
r/NEWS.md:
##########
@@ -19,6 +19,61 @@
 
 # arrow 9.0.0.9000
 
+## Arrow dplyr queries
+
+Several new functions can be used in queries: 
+
+* `dplyr::across()` can be used to apply the same computation across multiple 
+  columns;
+* `add_filename()` can be used to get the filename a row came from (only 
+  available when querying `?Dataset`);
+* Five functions in the `slice_*` family: `dplyr::slice_min()`, 

Review Comment:
   ```suggestion
   * Added five functions in the `slice_*` family: `dplyr::slice_min()`, 
   ```


