Posted to issues@tajo.apache.org by "Hyunsik Choi (JIRA)" <ji...@apache.org> on 2014/06/25 01:10:25 UTC

[jira] [Updated] (TAJO-889) Separate a data flow into a logical flow and physical flows

     [ https://issues.apache.org/jira/browse/TAJO-889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyunsik Choi updated TAJO-889:
------------------------------

    Attachment: physical.png
                logical.png

> Separate a data flow into a logical flow and physical flows
> -----------------------------------------------------------
>
>                 Key: TAJO-889
>                 URL: https://issues.apache.org/jira/browse/TAJO-889
>             Project: Tajo
>          Issue Type: Sub-task
>            Reporter: Hyunsik Choi
>         Attachments: logical.png, physical.png
>
>
> Currently, DataChannel represents a data flow between execution blocks (query stages). In the current DAG framework, a data flow can only be a physical data flow. This model should be improved so that users can handle complex data flows more easily.
> For example, consider the following query:
> {code}
> select * from A join (select * from B union select * from C) D;
> {code}
> The above query produces the data flows shown in the attached figure (physical.png).
> The main problem is that each ScanNode can have only one data source, but in the above query one ScanNode has to cover two data sources, B and C. Currently, we work around this with a hack that replaces B and C with a fake data source id. It works, but it results in messy code.
> A potential solution is to separate the current data flow model into a logical data flow and physical data flows. For example, in the attached figure (logical.png), the dotted line represents one logical data flow.
> If a user could use a logical data flow where needed instead of handling physical data flows directly, distributed plan generation would become much easier. It would also simplify the global plan code.
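
The proposed separation might look roughly like the sketch below. It is only an illustration under stated assumptions: PhysicalFlow, LogicalFlow, and the block ids eb_B, eb_C, and eb_join are hypothetical names invented here, not classes or identifiers from the Tajo code base. The idea is that a ScanNode would refer to a single logical flow (D), which in turn groups the physical flows coming from B and C, so no fake data source id is required.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch; none of these classes exist in Tajo under these names.
class LogicalFlowSketch {
    // Builds the logical flow "D" for the union subquery in the example query.
    static LogicalFlow buildUnionFlow() {
        LogicalFlow d = new LogicalFlow("D");
        // Assumed ids for the execution blocks that scan B and C and the join block.
        d.addPhysicalFlow(new PhysicalFlow("eb_B", "eb_join"));
        d.addPhysicalFlow(new PhysicalFlow("eb_C", "eb_join"));
        return d;
    }

    public static void main(String[] args) {
        LogicalFlow d = buildUnionFlow();
        // The scan side sees one named input backed by two physical flows.
        System.out.println(d.getName() + " <- " + d.sourceBlockIds());
        // prints: D <- [eb_B, eb_C]
    }
}

// One physical edge between two execution blocks (plays the role DataChannel has today).
final class PhysicalFlow {
    private final String srcBlockId;
    private final String dstBlockId;

    PhysicalFlow(String srcBlockId, String dstBlockId) {
        this.srcBlockId = srcBlockId;
        this.dstBlockId = dstBlockId;
    }

    String getSrcBlockId() { return srcBlockId; }
    String getDstBlockId() { return dstBlockId; }
}

// A logical flow: the single data source a ScanNode would reference.
final class LogicalFlow {
    private final String name;
    private final List<PhysicalFlow> physicalFlows = new ArrayList<>();

    LogicalFlow(String name) { this.name = name; }

    String getName() { return name; }

    void addPhysicalFlow(PhysicalFlow flow) { physicalFlows.add(flow); }

    // Read-only view of the physical flows grouped under this logical flow.
    List<PhysicalFlow> getPhysicalFlows() {
        return Collections.unmodifiableList(physicalFlows);
    }

    // Source execution block ids, in insertion order.
    List<String> sourceBlockIds() {
        List<String> ids = new ArrayList<>();
        for (PhysicalFlow flow : physicalFlows) {
            ids.add(flow.getSrcBlockId());
        }
        return ids;
    }
}
```

With a structure like this, the planner could resolve the logical flow into its physical flows only at the point where it actually schedules data transfer between execution blocks.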



--
This message was sent by Atlassian JIRA
(v6.2#6252)