Posted to issues@tajo.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2015/08/10 11:38:46 UTC

[jira] [Commented] (TAJO-1340) Change the default output file format.

    [ https://issues.apache.org/jira/browse/TAJO-1340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14679865#comment-14679865 ] 

ASF GitHub Bot commented on TAJO-1340:
--------------------------------------

Github user jihoonson commented on a diff in the pull request:

    https://github.com/apache/tajo/pull/671#discussion_r36613979
  
    --- Diff: tajo-core/src/main/java/org/apache/tajo/engine/planner/global/GlobalPlanner.java ---
    @@ -991,7 +992,7 @@ public LogicalNode visitProjection(GlobalPlanContext context, LogicalPlan plan,
             for (DataChannel dataChannel : masterPlan.getIncomingChannels(execBlock.getId())) {
               // This data channel will be stored in staging directory, but RawFile, default file type, does not support
               // distributed file system. It needs to change the file format for distributed file system.
    -          dataChannel.setStoreType("TEXT");
    +          dataChannel.setStoreType(BuiltinStorages.DRAW);
    --- End diff --
    
    How about adding a final variable for the default store type and using it in the class?
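
    One way the suggestion could look is sketched below. This is only an illustration, not the code from the pull request: DEFAULT_STORE_TYPE, Channel, and applyDefaultStoreType are hypothetical names, Channel is a minimal stand-in for DataChannel, and the string value behind DRAW is assumed.

```java
import java.util.ArrayList;
import java.util.List;

public class DefaultStoreTypeSketch {
    // Hypothetical stand-in for BuiltinStorages.DRAW; the actual string value is an assumption.
    static final String DRAW = "DRAW";

    // Single final variable naming the default store type, so every use in the
    // class refers to it instead of repeating a literal.
    private static final String DEFAULT_STORE_TYPE = DRAW;

    // Minimal stand-in for DataChannel; only the store type is modelled here.
    static class Channel {
        private String storeType = "RAW";

        void setStoreType(String storeType) {
            this.storeType = storeType;
        }

        String getStoreType() {
            return storeType;
        }
    }

    // Applies the default store type to every incoming channel.
    static void applyDefaultStoreType(List<Channel> channels) {
        for (Channel channel : channels) {
            channel.setStoreType(DEFAULT_STORE_TYPE);
        }
    }

    public static void main(String[] args) {
        List<Channel> channels = new ArrayList<>();
        channels.add(new Channel());
        channels.add(new Channel());
        applyDefaultStoreType(channels);
        for (Channel channel : channels) {
            System.out.println(channel.getStoreType());
        }
    }
}
```

    With a constant like this, a later change of the default format would touch one line rather than every setStoreType call.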


> Change the default output file format.
> --------------------------------------
>
>                 Key: TAJO-1340
>                 URL: https://issues.apache.org/jira/browse/TAJO-1340
>             Project: Tajo
>          Issue Type: Improvement
>            Reporter: Hyunsik Choi
>            Assignee: Jinho Kim
>             Fix For: 0.11.0
>
>
> Currently, the default output file format is CSV. By its nature, CSV has three main problems:
>  * Its line or field delimiter can collide with characters that appear in the result data.
>  * A plain text file is likely to be larger than other file formats.
>  * Its read and write performance is slow.
> We need to change the default output file format to another format. We also need to investigate which file format is best for this purpose.
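
The first problem in the list above can be reproduced with a few lines of Java. This is a minimal sketch that assumes a naive writer and reader with no quoting or escaping; it is not Tajo's actual text serializer, and the class and method names are hypothetical.

```java
public class CsvDelimiterCollision {
    // Writes the fields as one comma-delimited line with no quoting or escaping,
    // then reads the line back and returns how many fields the reader sees.
    static int roundTripFieldCount(String[] fields) {
        String line = String.join(",", fields);
        // Limit -1 keeps trailing empty fields so the count is not understated.
        return line.split(",", -1).length;
    }

    public static void main(String[] args) {
        // The first field itself contains the field delimiter.
        String[] row = {"Seoul, Korea", "42"};
        int readBack = roundTripFieldCount(row);
        System.out.println("wrote " + row.length + " fields, read back " + readBack);
    }
}
```

A row written with two fields is read back as three, because the comma inside the first value is indistinguishable from a field delimiter.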



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)