Posted to jira@arrow.apache.org by "Andy Grove (Jira)" <ji...@apache.org> on 2020/11/28 16:28:00 UTC

[jira] [Commented] (ARROW-10732) [Rust] [DataFusion] Add SQL support for table/relation aliases and compound identifiers

    [ https://issues.apache.org/jira/browse/ARROW-10732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17240001#comment-17240001 ] 

Andy Grove commented on ARROW-10732:
------------------------------------

[~alamb] [~jorgecarleitao] [~nevime] To implement this I need to make a design decision that impacts the core Arrow crate and potentially IPC, so I thought it would be good to discuss it here first. I'm not sure whether it warrants its own Google design doc, but I am happy to create one if you think that would be helpful.

The issue is that when translating a SQL AST into a query plan, we need to be able to reference columns using compound keys such as "customer.id". When we support structs in SQL, we will also need this to represent projections into structs, e.g. "customer.address.street". I can easily update Expr::Column to support compound names, or even add a new Expr::CompoundColumnReference, but the problem is that we represent the schema of each LogicalPlan using the Arrow Schema and Field structs, and Field does not currently support compound names:
{code:java}
pub struct Field {
    name: String,
    data_type: DataType,
    nullable: bool,
    dict_id: i64,
    dict_is_ordered: bool,
} {code}
I can work (hack) around this by storing the fully qualified name in the name string, in the form "table.column", and adding logic to the SQL planner to look up a field either by its fully qualified name or by its simple name (while also checking that the simple name is not an ambiguous reference). The other option would be to change Field to support compound names and/or add metadata that we can use.
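
To make the workaround concrete, here is a rough sketch of the lookup logic. It is only an illustration: the Field below is a simplified stand-in for the Arrow struct (name only), and resolve_column and LookupError are hypothetical names, not existing Arrow or DataFusion APIs. It also assumes that the last dot-separated segment of a stored name is the column name, which would need rethinking once struct projections such as "customer.address.street" are supported.

```rust
/// Simplified stand-in for the Arrow `Field` (assumption: only the name is
/// relevant to the lookup, and it holds the fully qualified "table.column").
#[derive(Debug, Clone, PartialEq)]
pub struct Field {
    pub name: String,
}

/// Hypothetical error type for failed column resolution.
#[derive(Debug, PartialEq)]
pub enum LookupError {
    NotFound(String),
    Ambiguous(String),
}

/// Hypothetical helper: resolve a column reference to a field index.
///
/// A reference that matches a stored name exactly wins. Otherwise, an
/// unqualified reference is compared against the last segment of each
/// stored name and must match exactly one field.
pub fn resolve_column(fields: &[Field], reference: &str) -> Result<usize, LookupError> {
    // 1. Exact match on the fully qualified name.
    if let Some(index) = fields.iter().position(|f| f.name == reference) {
        return Ok(index);
    }

    // 2. A qualified reference that did not match exactly cannot be resolved.
    if reference.contains('.') {
        return Err(LookupError::NotFound(reference.to_string()));
    }

    // 3. Unqualified reference: collect fields whose simple name matches.
    let mut matches = fields
        .iter()
        .enumerate()
        .filter(|(_, f)| f.name.rsplit('.').next() == Some(reference))
        .map(|(index, _)| index);

    match (matches.next(), matches.next()) {
        (Some(index), None) => Ok(index),
        (Some(_), Some(_)) => Err(LookupError::Ambiguous(reference.to_string())),
        (None, _) => Err(LookupError::NotFound(reference.to_string())),
    }
}
```

With fields named "t1.id", "t1.name", "t2.id" and "t2.name", a reference to "t1.name" resolves directly, while a bare "name" is rejected as ambiguous because it matches a field from each table.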

What do you think?

> [Rust] [DataFusion] Add SQL support for table/relation aliases and compound identifiers
> ---------------------------------------------------------------------------------------
>
>                 Key: ARROW-10732
>                 URL: https://issues.apache.org/jira/browse/ARROW-10732
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: Rust - DataFusion
>            Reporter: Andy Grove
>            Assignee: Andy Grove
>            Priority: Major
>
> We need to support referencing columns in queries using table name and/or alias prefixes so that we can support use cases such as joins between tables that have duplicate column names.
> For example:
> {code:java}
> SELECT t1.id, t1.name, t2.name FROM t1 JOIN t2 ON t1.id = t2.id {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)