Posted to issues@tajo.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2014/11/30 15:15:13 UTC

[jira] [Commented] (TAJO-1211) Staging directory for CTAS and INSERT should be in the output dir.

    [ https://issues.apache.org/jira/browse/TAJO-1211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14229109#comment-14229109 ] 

ASF GitHub Bot commented on TAJO-1211:
--------------------------------------

GitHub user hyunsik opened a pull request:

    https://github.com/apache/tajo/pull/274

    TAJO-1211: Staging directory for CTAS and INSERT should be in the output...

    ... dir.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/hyunsik/tajo TAJO-1211

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/tajo/pull/274.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #274
    
----
commit 779be0cc948d79863f1b812c1f3c47f9363bd2e5
Author: Hyunsik Choi <hy...@apache.org>
Date:   2014-11-30T14:14:03Z

    TAJO-1211: Staging directory for CTAS and INSERT should be in the output dir.

----


> Staging directory for CTAS and INSERT should be in the output dir.
> ------------------------------------------------------------------
>
>                 Key: TAJO-1211
>                 URL: https://issues.apache.org/jira/browse/TAJO-1211
>             Project: Tajo
>          Issue Type: Bug
>          Components: query master
>            Reporter: Hyunsik Choi
>            Assignee: Hyunsik Choi
>             Fix For: 0.9.1
>
>
> *Background*
> The staging directory temporarily holds the final output data. The data are moved to the final output dir only if the query finishes successfully. It is important to keep the output directory consistent even if the query fails.
> *Problem*
> Currently, the staging directory is located under /tmp/tajo-$\{user.name\}/ on the HDFS that $\{tajo.root\} uses. The final output directory and the staging directory can therefore be on different file systems. In that case, the move incurs unnecessary copy overhead. In addition, on S3, such a move operation may be more problematic.
> *Solution*
> CTAS and INSERT (OVERWRITE) INTO should place the staging dir in a hidden subdirectory of the final output dir, so that the final move stays within one file system. For example, if the output dir is {{/table1}}, the corresponding staging dir should be {{/table1/.staging}}.
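
The proposed layout can be illustrated with a minimal sketch. This is not the code from the pull request: it uses the local file system through java.nio.file rather than Hadoop's FileSystem API, and the class name StagingSketch, its methods, and the flat (no nested directories) staging layout are all assumptions made for illustration.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; names are illustrative and not part of Tajo's API.
public class StagingSketch {
    // Hidden subdirectory name taken from the example in the issue text.
    static final String STAGING_NAME = ".staging";

    // Staging dir is a child of the output dir: /table1 -> /table1/.staging
    static Path stagingDirFor(Path outputDir) {
        return outputDir.resolve(STAGING_NAME);
    }

    // Lists the direct children of the staging dir (assumed to be plain files).
    static List<Path> stagedFiles(Path staging) throws IOException {
        List<Path> files = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(staging)) {
            for (Path p : stream) {
                files.add(p);
            }
        }
        return files;
    }

    // Query succeeded: move staged files up into the output dir, then
    // remove the empty staging dir. Both dirs share a parent, so the
    // move never crosses a file system boundary.
    static void commit(Path outputDir) throws IOException {
        Path staging = stagingDirFor(outputDir);
        for (Path file : stagedFiles(staging)) {
            Files.move(file, outputDir.resolve(file.getFileName()),
                    StandardCopyOption.REPLACE_EXISTING);
        }
        Files.delete(staging);
    }

    // Query failed: discard staged data and leave the output dir untouched.
    static void abort(Path outputDir) throws IOException {
        Path staging = stagingDirFor(outputDir);
        if (!Files.exists(staging)) {
            return;
        }
        for (Path file : stagedFiles(staging)) {
            Files.delete(file);
        }
        Files.delete(staging);
    }
}
```

A query would write its results under stagingDirFor(outputDir) and call commit on success or abort on failure. The sketch does not remove pre-existing table files, which an INSERT OVERWRITE would also have to handle.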



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)