Posted to issues@tajo.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2015/09/12 11:47:45 UTC

[jira] [Commented] (TAJO-1832) Well support for self-describing data formats

    [ https://issues.apache.org/jira/browse/TAJO-1832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14741997#comment-14741997 ] 

ASF GitHub Bot commented on TAJO-1832:
--------------------------------------

GitHub user jihoonson opened a pull request:

    https://github.com/apache/tajo/pull/756

    TAJO-1832: Well support for self-describing data formats

    This is still ongoing work, but it is mostly finished.
    I think it will be done once I improve the unit tests and clean up some code.
    Before finishing it, I'd like to share my approach to this issue.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jihoonson/tajo-2 TAJO-1832

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/tajo/pull/756.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #756
    
----
commit 2445c7d6c4d5b6cead7db09936d08945da043acf
Author: Jihoon Son <ji...@apache.org>
Date:   2015-09-10T07:54:07Z

    TAJO-1833

commit 9b978900eade3d6d71f9a47c405b0eab4cdb275a
Author: Jihoon Son <ji...@apache.org>
Date:   2015-09-10T08:30:19Z

    Change to abstract class

commit 11c089f0a0c71f1cb60ab2dbb71e7769d42ff6bc
Author: Jihoon Son <ji...@apache.org>
Date:   2015-09-10T09:26:44Z

    TAJO-1832

commit 67497269c4e248dd4430bdbf745c91e6078412dd
Author: Jihoon Son <ji...@apache.org>
Date:   2015-09-10T12:58:51Z

    TAJO-1832

commit 01b4d92df72171437cf8cf5ee96394112d88169d
Author: Jihoon Son <ji...@apache.org>
Date:   2015-09-10T14:36:29Z

    Merge branch 'master' of https://git-wip-us.apache.org/repos/asf/tajo into TAJO-1832

commit d0be6b31cd8c3747c0b186ea35d69c6dbebe0c34
Author: Jihoon Son <ji...@apache.org>
Date:   2015-09-11T17:30:38Z

    TAJO-1832

commit 9b58e686b207f47c645be5f519671511af84dd24
Author: Jihoon Son <ji...@apache.org>
Date:   2015-09-12T06:13:12Z

    fix groupby

commit af8a8e4abcce248359dd95e2e06ee1b926becd92
Author: Jihoon Son <ji...@apache.org>
Date:   2015-09-12T06:50:10Z

    add license

commit 471a70a5e628023509c5eb867660714b5e61a388
Author: Jihoon Son <ji...@apache.org>
Date:   2015-09-12T09:37:59Z

    Finished, but need to improve test

commit 4f5ac455d21c7a9bfc13280a60cc3354010c9145
Author: Jihoon Son <ji...@apache.org>
Date:   2015-09-12T09:43:32Z

    Merge branch 'master' of https://git-wip-us.apache.org/repos/asf/tajo into TAJO-1832

----


> Well support for self-describing data formats
> ---------------------------------------------
>
>                 Key: TAJO-1832
>                 URL: https://issues.apache.org/jira/browse/TAJO-1832
>             Project: Tajo
>          Issue Type: New Feature
>          Components: Planner/Optimizer
>            Reporter: Jihoon Son
>            Assignee: Jihoon Son
>
> *Problem*
> Tajo already supports self-describing data formats such as JSON, Parquet, and ORC. Although these formats can provide schema information by themselves, the current implementation still requires users to define a schema before querying them. To remove this inconvenience, we have to improve the query planner so that it handles self-describing data formats well.
> *Solution*
> First, we need to allow the schema definition to be omitted from the create table statement. When a query is submitted against a self-describing table, any columns that don't exist in that table will be filled with nulls.
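
The null-filling behavior described above can be illustrated with a small sketch. This is not Tajo code: it is a hypothetical Python illustration, assuming line-delimited JSON input, and the names project, lines, and rows are made up for this example. It only shows the idea that a column requested by a query but absent from a record comes back as a null.

```python
import json


def project(records, columns):
    """Yield one row per JSON record, in the order of the requested columns.

    A requested column that a record does not contain is returned as None,
    mirroring the "fill missing columns with nulls" idea from the issue.
    (Hypothetical helper, not part of Tajo.)
    """
    for line in records:
        row = json.loads(line)
        # dict.get returns None when the key is absent.
        yield tuple(row.get(col) for col in columns)


# Two self-describing records; the second has no "name" field,
# and neither has a "score" field.
lines = ['{"id": 1, "name": "a"}', '{"id": 2}']
rows = list(project(lines, ["id", "name", "score"]))
```

Here the query-side column list drives the output shape, so no table schema has to be declared up front; how Tajo's planner actually resolves such columns is defined by the pull request itself, not by this sketch.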



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)