Posted to issues@spark.apache.org by "Narine Kokhlikyan (JIRA)" <ji...@apache.org> on 2015/12/06 03:05:10 UTC

[jira] [Comment Edited] (SPARK-11250) Generate different alias for columns with same name during join

    [ https://issues.apache.org/jira/browse/SPARK-11250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15043647#comment-15043647 ] 

Narine Kokhlikyan edited comment on SPARK-11250 at 12/6/15 2:04 AM:
--------------------------------------------------------------------

Hi there,

I've created a pull request for the join on the Scala side.
It applies when column names that are not part of the join condition appear in both DataFrames.
e.g.

Employee
-------------
empid
name

Company
----------
cid
empid
name


and we call join with

employee.join(company, "empid", "inner")

this will generate a resulting DataFrame with columns:

empid, cid, name_x, name_y
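
To make the naming rule concrete, here is a minimal sketch in plain Python that works on lists of column names only. It is not the code from the pull request and it does not call Spark. The helper name `joined_column_names`, its `suffixes` parameter, and the output ordering (join keys first, then columns unique to one side, then the suffixed duplicates) are assumptions inferred from the column list above.

```python
def joined_column_names(left, right, on, suffixes=("_x", "_y")):
    """Return the column names of a join of `left` and `right` on `on`.

    Hypothetical helper for illustration only; not part of the Spark API.
    """
    keys = list(on)

    # Columns that are not part of the join condition, per side.
    left_rest = [c for c in left if c not in keys]
    right_rest = [c for c in right if c not in keys]

    # Non-join columns whose names occur on both sides.
    clashes = [c for c in left_rest if c in right_rest]

    # Columns that occur on only one side keep their original names.
    unique = [c for c in left_rest + right_rest if c not in clashes]

    # Each clashing column yields one suffixed name per side
    # (assumed ordering: left copy first, then right copy).
    renamed = []
    for c in clashes:
        renamed.append(c + suffixes[0])
        renamed.append(c + suffixes[1])

    return keys + unique + renamed


employee = ["empid", "name"]
company = ["cid", "empid", "name"]
print(joined_column_names(employee, company, ["empid"]))
```

With the Employee and Company columns above, this yields the same four names as listed. Columns that do not clash are left unchanged, so joins without duplicate names would be unaffected under this rule.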

What do you think, [~davies] [~shivaram] [~sunrui]? I can change the other joins too if we agree on the logic.

Thanks,
Narine


> Generate different alias for columns with same name during join
> ---------------------------------------------------------------
>
>                 Key: SPARK-11250
>                 URL: https://issues.apache.org/jira/browse/SPARK-11250
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>            Reporter: Davies Liu
>            Assignee: Apache Spark
>
> It's confusing to see columns with the same name after joining, and they are hard to access. We could generate different aliases for them in the joined DataFrame.
> see https://github.com/apache/spark/pull/9012/files#r42696855 as example


