Posted to dev@tajo.apache.org by "Jihoon Son (JIRA)" <ji...@apache.org> on 2013/09/11 13:20:54 UTC

[jira] [Commented] (TAJO-178) Implements StorageManager for Vectorized Engine

    [ https://issues.apache.org/jira/browse/TAJO-178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13764214#comment-13764214 ] 

Jihoon Son commented on TAJO-178:
---------------------------------

This is a great idea,
but I wonder what the relationship is between the new storage manager and the vectorized engine.
It doesn't appear to involve any columnar operations.
                
> Implements StorageManager for Vectorized Engine
> -----------------------------------------------
>
>                 Key: TAJO-178
>                 URL: https://issues.apache.org/jira/browse/TAJO-178
>             Project: Tajo
>          Issue Type: Improvement
>          Components: storage
>    Affects Versions: 0.2-incubating
>            Reporter: hyoungjunkim
>         Attachments: tajo_storage_manager.png
>
>
> The current StorageManager does not provide a scan scheduling function, so all scan operations run concurrently. This causes random disk access and poor disk read performance.
> The proposed StorageManager is based on double buffering. Each disk has a scheduler to schedule by order of scanned adjust. Each Scanner has an InputStream and a Tuple pool. The next() operation of ScanNode blocks until the Tuple pool is filled. The Scanner assigned by the scheduler reads data (x MB), fills the Tuple pool, and notifies the next() operation. After scanning, the Scanner re-enters the DiskScanQueue.
> In this way, a Scanner can pass a column vector to the Vectorized Query Engine.
> See the attached file.
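
For discussion, the scheduling loop described above could be sketched roughly as follows. This is only a minimal illustration, not the proposed implementation: the class names DiskScanSketch, Scanner, and DiskScanQueue, the fill()/submit()/shutdown() methods, and the use of an int array in place of an InputStream are all assumptions made for the example. The scheduling order is plain FIFO, and the tuple pool is unbounded, so the double-buffering limit on memory is not modeled here.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class DiskScanSketch {

    /** Hypothetical scanner: an int array stands in for the InputStream. */
    static class Scanner {
        private final int[] source;
        private final int batchSize;   // stands in for the "x MB" read per turn
        private int offset = 0;
        private final Queue<Integer> tuplePool = new ArrayDeque<>();

        Scanner(int[] source, int batchSize) {
            this.source = source;
            this.batchSize = batchSize;
        }

        /** Called only by the disk scheduler: read one batch, then wake next(). */
        synchronized boolean fill() {
            int end = Math.min(offset + batchSize, source.length);
            while (offset < end) {
                tuplePool.add(source[offset++]);
            }
            notifyAll();
            return offset < source.length;   // true if more data remains
        }

        /** Blocks until the pool holds a tuple; returns null at end of input. */
        synchronized Integer next() throws InterruptedException {
            while (tuplePool.isEmpty() && offset < source.length) {
                wait();
            }
            return tuplePool.poll();
        }
    }

    /** One queue and one scheduler thread per disk, so reads never overlap. */
    static class DiskScanQueue {
        private final BlockingQueue<Scanner> queue = new LinkedBlockingQueue<>();
        private final Thread scheduler = new Thread(this::run, "disk-scheduler");

        void start() {
            scheduler.setDaemon(true);
            scheduler.start();
        }

        void submit(Scanner s) {
            queue.add(s);
        }

        void shutdown() {
            scheduler.interrupt();
        }

        private void run() {
            try {
                while (true) {
                    Scanner s = queue.take();   // FIFO order in this sketch
                    if (s.fill()) {
                        queue.add(s);           // re-enter the queue after scanning
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();   // exit on shutdown
            }
        }
    }

    /** Two scanners share one disk; the caller drains them through next(). */
    static List<List<Integer>> runDemo() throws InterruptedException {
        DiskScanQueue disk = new DiskScanQueue();
        disk.start();
        Scanner a = new Scanner(new int[] {1, 2, 3, 4, 5}, 2);
        Scanner b = new Scanner(new int[] {10, 20, 30}, 2);
        disk.submit(a);
        disk.submit(b);

        List<List<Integer>> out = new ArrayList<>();
        for (Scanner s : Arrays.asList(a, b)) {
            List<Integer> tuples = new ArrayList<>();
            for (Integer t = s.next(); t != null; t = s.next()) {
                tuples.add(t);
            }
            out.add(tuples);
        }
        disk.shutdown();
        return out;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runDemo());   // prints [[1, 2, 3, 4, 5], [10, 20, 30]]
    }
}
```

Because only the scheduler thread calls fill(), reads against the same disk are serialized even when several ScanNodes call next() at once. A real implementation would presumably also cap the pool size so that a fast disk cannot run far ahead of a slow consumer.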
