Posted to issues@tajo.apache.org by "Hyoungjun Kim (JIRA)" <ji...@apache.org> on 2014/08/27 11:26:57 UTC
[jira] [Resolved] (TAJO-992) Reduce number of hash shuffle output file.
[ https://issues.apache.org/jira/browse/TAJO-992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hyoungjun Kim resolved TAJO-992.
--------------------------------
Resolution: Fixed
Fix Version/s: 0.9.0
Committed.
> Reduce number of hash shuffle output file.
> ------------------------------------------
>
> Key: TAJO-992
> URL: https://issues.apache.org/jira/browse/TAJO-992
> Project: Tajo
> Issue Type: Sub-task
> Components: data shuffle
> Reporter: Hyoungjun Kim
> Assignee: Hyoungjun Kim
> Fix For: 0.9.0
>
>
> Currently, Tajo creates too many intermediate files when hash shuffle is used. An execution block (SubQuery) on a TajoWorker creates intermediate files according to the following rule:
> # intermediate files in a worker = # tasks / # workers * # partitions
> This may cause a 'too many open files' error and makes it difficult to scale out. To solve this problem, we should reduce the number of hash shuffle output files.
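
To make the rule in the description concrete, here is a minimal Python sketch that simply evaluates it. The function name and the sample figures (1,000 tasks, 10 workers, 32 partitions) are hypothetical and chosen only for illustration; they do not come from the issue or from Tajo's code.

```python
def intermediate_files_per_worker(num_tasks: int, num_workers: int, num_partitions: int) -> float:
    """Evaluate the rule from the issue description:
    # intermediate files in a worker = # tasks / # workers * # partitions
    """
    if num_workers <= 0:
        raise ValueError("num_workers must be positive")
    return num_tasks / num_workers * num_partitions


# Hypothetical cluster: 1,000 tasks spread over 10 workers, 32 shuffle partitions.
files = intermediate_files_per_worker(num_tasks=1000, num_workers=10, num_partitions=32)
print(files)  # 3200.0
```

Under these assumed numbers each worker would produce 3,200 intermediate files for a single execution block, and the count grows linearly with both the task count and the partition count, which is the scaling problem the description points at.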
--
This message was sent by Atlassian JIRA
(v6.2#6252)