Posted to dev@tajo.apache.org by "Hyunsik Choi (JIRA)" <ji...@apache.org> on 2013/10/02 09:07:38 UTC

[jira] [Assigned] (TAJO-9) Change the default intermediate data file format for hash repartitioning

     [ https://issues.apache.org/jira/browse/TAJO-9?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyunsik Choi reassigned TAJO-9:
-------------------------------

    Assignee: Hyunsik Choi

> Change the default intermediate data file format for hash repartitioning
> ------------------------------------------------------------------------
>
>                 Key: TAJO-9
>                 URL: https://issues.apache.org/jira/browse/TAJO-9
>             Project: Tajo
>          Issue Type: Improvement
>          Components: repartitioning
>            Reporter: Hyunsik Choi
>            Assignee: Hyunsik Choi
>
> For easy debugging, hash repartitioning has used CSV as the default intermediate data format. However, CSV incurs parsing overhead and can produce relatively large intermediate data that must be transferred over the network. We need to change the default to RawFile or another efficient file format.
> Digging into the PartitionedStoredExec class is a good starting point for this issue.



--
This message was sent by Atlassian JIRA
(v6.1#6144)