Posted to jira@arrow.apache.org by "Nicola Crane (Jira)" <ji...@apache.org> on 2022/09/26 07:16:00 UTC

[jira] [Closed] (ARROW-17802) [R] Merging multi file datasets on particular columns that are present in all the datasets.

     [ https://issues.apache.org/jira/browse/ARROW-17802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nicola Crane closed ARROW-17802.
--------------------------------
    Resolution: Not A Bug

> [R] Merging multi file datasets on particular columns that are present in all the datasets.
> -------------------------------------------------------------------------------------------
>
>                 Key: ARROW-17802
>                 URL: https://issues.apache.org/jira/browse/ARROW-17802
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: R
>            Reporter: N Gautam Animesh
>            Priority: Major
>
> While working with multi-file datasets, I ran into a problem: I wanted to merge specific columns from all the datasets and work on them, but I was not able to do so.
> Is there any workaround for merging multi-file datasets on specific columns?
> Please look into it and let me know if anything is available for this.
> {code:java}
> library(arrow)
> library(dplyr)
> system.time({
>   ds <- open_dataset('C:/Test/Files/test', format = "arrow")  # lazy: no data is read yet
>   # Select only the specified column(s) before collect(); col_a and col_b are placeholder names
>   df <- ds %>% select(col_a, col_b) %>% collect()
>   #write_dataset(df, 'C:/Test/Files/test', format = "arrow")
> })
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)