Posted to jira@arrow.apache.org by "Hirokazu SUZUKI (Jira)" <ji...@apache.org> on 2022/10/20 02:27:00 UTC

[jira] [Updated] (ARROW-18091) [Ruby] Arrow::Table#join returns duplicated key columns

     [ https://issues.apache.org/jira/browse/ARROW-18091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hirokazu SUZUKI updated ARROW-18091:
------------------------------------
    Summary: [Ruby] Arrow::Table#join returns duplicated key columns  (was: [Ruby] Arrow::Table#join returns separated columns by key)

> [Ruby] Arrow::Table#join returns duplicated key columns
> -------------------------------------------------------
>
>                 Key: ARROW-18091
>                 URL: https://issues.apache.org/jira/browse/ARROW-18091
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Ruby
>            Reporter: Hirokazu SUZUKI
>            Priority: Major
>
> `Arrow::Table#join` returns the key column from both tables, so the result contains duplicated key columns. Duplicate column names are allowed in Arrow, but a single key column is preferable.
> In addition, with `type: :full_outer`, the values of the two key columns should be merged into one column.
> table1
> => 
> #<Arrow::Table:0x7f9706109380 ptr=0x55a91a4cac10>
>         KEY     X         
> 0       A       1         
> 1       B       2         
> 2       C       3
> table2
> => 
> #<Arrow::Table:0x7f970415d2c0 ptr=0x55a91a348ce0>
>         KEY     X
> 0       A       4
> 1       B       5
> 2       D       6
>  
> The `:KEY` column from the right table should be omitted:
> table1.join(table2, :KEY)
> => 
> #<Arrow::Table:0x7f96fd152548 ptr=0x55a91af21110>                   
>         KEY     X       KEY     X                                   
> 0       A       1       A       4                                   
> 1       B       2       B       5
>  
> The two `:KEY` columns should be merged into one:
> table1.join(table2, :KEY, type: :full_outer)
> => 
> #<Arrow::Table:0x7f96fd0e1550 ptr=0x55a91a1a6410>                   
>         KEY          X  KEY          X                              
> 0       A            1  A            4                              
> 1       B            2  B            5                              
> 2       C            3  (null)  (null)                              
> 3       (null)  (null)  D            6
>  
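
The behavior requested above can be sketched in plain Ruby, without Red Arrow. This is only an illustration of the expected semantics, not the library's implementation: `join_with_merged_key` is a hypothetical helper, tables are modeled as arrays of row hashes, keys are assumed to be unique within each table, and the `_left`/`_right` suffixes are an assumption needed because a Ruby hash cannot hold two `:X` entries.

```ruby
# Hypothetical helper, not part of Red Arrow. It joins two "tables"
# (arrays of row hashes) on `key` and emits a single key column.
# Assumes non-empty inputs and unique key values within each table.
def join_with_merged_key(left, right, key, type: :inner)
  left_by_key = left.to_h { |row| [row[key], row] }
  right_by_key = right.to_h { |row| [row[key], row] }

  keys = case type
         when :inner then left_by_key.keys & right_by_key.keys
         when :full_outer then left_by_key.keys | right_by_key.keys
         else raise ArgumentError, "unsupported join type: #{type.inspect}"
         end

  # Non-key columns get a suffix so that same-named columns do not clash.
  left_columns = left.first.keys - [key]
  right_columns = right.first.keys - [key]

  keys.map do |k|
    row = { key => k } # one merged key value per output row
    left_columns.each { |c| row[:"#{c}_left"] = left_by_key.dig(k, c) }
    right_columns.each { |c| row[:"#{c}_right"] = right_by_key.dig(k, c) }
    row
  end
end

table1 = [{ KEY: "A", X: 1 }, { KEY: "B", X: 2 }, { KEY: "C", X: 3 }]
table2 = [{ KEY: "A", X: 4 }, { KEY: "B", X: 5 }, { KEY: "D", X: 6 }]

inner = join_with_merged_key(table1, table2, :KEY)
full_outer = join_with_merged_key(table1, table2, :KEY, type: :full_outer)
```

With these inputs, `inner` has one `:KEY` entry per row for `A` and `B`, and `full_outer` has four rows whose `:KEY` values are `A`, `B`, `C`, and `D`, with `nil` standing in for the missing side.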



--
This message was sent by Atlassian Jira
(v8.20.10#820010)