Posted to issues@tajo.apache.org by "Henry Saputra (JIRA)" <ji...@apache.org> on 2015/07/16 20:10:04 UTC

[jira] [Commented] (TAJO-5) Cache mechanism to keep instances of opened BSTIndexs in PullServerService

    [ https://issues.apache.org/jira/browse/TAJO-5?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14630100#comment-14630100 ] 

Henry Saputra commented on TAJO-5:
----------------------------------

I sincerely apologize, but unfortunately I won't be able to work on this in the near future. Please reassign it so that someone else can pick it up.

> Cache mechanism to keep instances of opened BSTIndexs in PullServerService
> --------------------------------------------------------------------------
>
>                 Key: TAJO-5
>                 URL: https://issues.apache.org/jira/browse/TAJO-5
>             Project: Tajo
>          Issue Type: Improvement
>          Components: Data Shuffle
>            Reporter: Hyunsik Choi
>              Labels: newbie
>
> PullServerAuxService is an auxiliary service of YARN that repartitions intermediate data. It is similar to the ShuffleHandler of MRv2. PullServerAuxService supports hash repartitioning as well as range repartitioning, and it serves data through a Netty-based HTTP server.
> To retrieve range-partitioned data, PullServerAuxService uses a binary search tree index (BSTIndex.java). It opens the BSTIndex again for every request for range-partitioned data, which may add overhead to each request. See messageReceived in PullServer and getFileChunks in PullServerAuxService.
> If PullServerAuxService used a cache that keeps instances of opened BSTIndex and data files, it could get rid of this overhead.
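
For whoever picks this up, the cache described above could look roughly like the sketch below. It is only an illustration under stated assumptions, not Tajo code: OpenReaderCache and its Opener interface are hypothetical names, the cache key is assumed to be the index path, and the reader type is assumed to be Closeable. It keeps at most a fixed number of opened readers, reuses them on repeated requests, and closes the least recently used one when it is full. It does not handle a reader being evicted while another request is still reading from it; a real implementation would likely need reference counting or similar.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Tajo: a small LRU cache of opened readers.
final class OpenReaderCache<R extends Closeable> {

  /** Opens a reader for a path; stands in for whatever opens a BSTIndex today. */
  interface Opener<R> {
    R open(String path) throws IOException;
  }

  private final int capacity;
  private final Opener<R> opener;
  // accessOrder=true makes iteration go from least to most recently used.
  private final LinkedHashMap<String, R> readers =
      new LinkedHashMap<>(16, 0.75f, true);

  OpenReaderCache(int capacity, Opener<R> opener) {
    if (capacity < 1) {
      throw new IllegalArgumentException("capacity must be >= 1");
    }
    this.capacity = capacity;
    this.opener = opener;
  }

  /** Returns the cached reader for the path, opening it only on a miss. */
  synchronized R get(String path) throws IOException {
    R reader = readers.get(path); // a hit also marks the entry as recently used
    if (reader == null) {
      reader = opener.open(path);
      readers.put(path, reader);
      evictIfNeeded();
    }
    return reader;
  }

  synchronized int size() {
    return readers.size();
  }

  /** Closes and drops least recently used readers beyond the capacity. */
  private void evictIfNeeded() throws IOException {
    Iterator<Map.Entry<String, R>> it = readers.entrySet().iterator();
    while (readers.size() > capacity && it.hasNext()) {
      R eldest = it.next().getValue();
      it.remove();
      eldest.close();
    }
  }

  /** Closes every cached reader, e.g. when the service stops. */
  synchronized void closeAll() throws IOException {
    for (R reader : readers.values()) {
      reader.close();
    }
    readers.clear();
  }
}
```

With this shape, the request handler would call get(path) instead of opening the index itself, and the service would call closeAll() on shutdown. The capacity bounds the number of open file handles, which is the main cost of keeping indexes open.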



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)