Posted to issues-all@impala.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2021/06/04 13:19:00 UTC

[jira] [Commented] (IMPALA-7556) Clean up ScanRange

    [ https://issues.apache.org/jira/browse/IMPALA-7556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17357352#comment-17357352 ] 

ASF subversion and git services commented on IMPALA-7556:
---------------------------------------------------------

Commit 700246fad9d207786a87aca6e833d49560596da6 in impala's branch refs/heads/master from Amogh Margoor
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=700246f ]

IMPALA-7556: Decouple BufferManagement from the ScanRange and IoMgr

Currently, buffer management is tightly coupled with ScanRange.
Every ScanRange maintains a list of unused buffers and a list of
ready buffers. Unused buffers are empty buffers available for reading
scan data into, and ready buffers are buffers holding data that has
already been read. To manage these buffers, ScanRange defines
functions such as AddUnusedBuffer, GetUsedBuffer and
EnqueueReadyBuffer, along with functions to allocate and clean up
buffers. This patch introduces ScanBufferManager, which is
responsible for managing these buffers on behalf of ScanRange.
ScanBufferManager's logic is still coupled with ScanRange, but
refactoring it into a separate class is a good first step.
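
The split between unused and ready buffers described above can be
sketched as follows. This is a simplified illustration only, not the
code from the patch: SketchBuffer, SketchBufferManager,
TakeUnusedBuffer, TakeReadyBuffer and SketchRoundTrip are hypothetical
names, and only AddUnusedBuffer and EnqueueReadyBuffer are borrowed
from the function names mentioned in the commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical stand-in for an I/O buffer; not Impala's real buffer type.
struct SketchBuffer {
  explicit SketchBuffer(std::size_t capacity) : data(capacity), len(0) {}
  std::vector<std::uint8_t> data;  // backing storage
  std::size_t len;                 // number of valid bytes after a read
};

// Illustrative manager that owns both queues on behalf of a scan range.
class SketchBufferManager {
 public:
  // Hand an empty buffer to the manager so a reader can fill it later.
  void AddUnusedBuffer(std::unique_ptr<SketchBuffer> buf) {
    std::lock_guard<std::mutex> l(lock_);
    buf->len = 0;
    unused_.push_back(std::move(buf));
  }

  // Take an empty buffer to read into; returns nullptr if none is queued.
  std::unique_ptr<SketchBuffer> TakeUnusedBuffer() {
    std::lock_guard<std::mutex> l(lock_);
    return PopFront(&unused_);
  }

  // Queue a buffer whose data has already been read.
  void EnqueueReadyBuffer(std::unique_ptr<SketchBuffer> buf) {
    std::lock_guard<std::mutex> l(lock_);
    ready_.push_back(std::move(buf));
  }

  // Take the oldest filled buffer; returns nullptr if none is queued.
  std::unique_ptr<SketchBuffer> TakeReadyBuffer() {
    std::lock_guard<std::mutex> l(lock_);
    return PopFront(&ready_);
  }

  std::size_t num_unused() const {
    std::lock_guard<std::mutex> l(lock_);
    return unused_.size();
  }

  std::size_t num_ready() const {
    std::lock_guard<std::mutex> l(lock_);
    return ready_.size();
  }

 private:
  using Queue = std::deque<std::unique_ptr<SketchBuffer>>;

  // Caller must hold lock_.
  static std::unique_ptr<SketchBuffer> PopFront(Queue* q) {
    if (q->empty()) return nullptr;
    std::unique_ptr<SketchBuffer> buf = std::move(q->front());
    q->pop_front();
    return buf;
  }

  mutable std::mutex lock_;
  Queue unused_;  // empty buffers waiting to be filled
  Queue ready_;   // filled buffers waiting to be consumed
};

// Moves one buffer through the unused -> ready cycle and returns the
// number of valid bytes seen by the consumer.
std::size_t SketchRoundTrip() {
  SketchBufferManager mgr;
  mgr.AddUnusedBuffer(std::make_unique<SketchBuffer>(8));
  std::unique_ptr<SketchBuffer> buf = mgr.TakeUnusedBuffer();
  buf->data[0] = 42;  // pretend a disk read filled one byte
  buf->len = 1;
  mgr.EnqueueReadyBuffer(std::move(buf));
  return mgr.TakeReadyBuffer()->len;
}
```

Keeping both queues behind one small class means the scan range only
hands buffers in and out, which is the kind of separation the patch
aims for.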

Testing:
 1. Ran the existing EE, backend, JDBC and cluster tests.
 2. Ran the above tests with a TSAN build using the exhaustive strategy.

Change-Id: Ibd74691b50b46114f95a8641034c05d07ddeec97
Reviewed-on: http://gerrit.cloudera.org:8080/17413
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>


> Clean up ScanRange
> ------------------
>
>                 Key: IMPALA-7556
>                 URL: https://issues.apache.org/jira/browse/IMPALA-7556
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Backend
>            Reporter: Zoltán Borók-Nagy
>            Assignee: Amogh Margoor
>            Priority: Major
>              Labels: ramp-up
>
> For IMPALA-7543 I want to add some additional functionality to scan ranges.
> However, the code of the ScanRange class is already quite messy. It handles different types of files, does some buffer management, and updates all kinds of counters.
> So, instead of complicating the code further, let's refactor the ScanRange class a bit.
>  * Do the file operations in separate classes
>  ** A new abstract class could be introduced to provide an API for file operations, e.g. Open(), ReadFromPos() and Close()
>  *** Keep in mind that the interface must be a good fit for IMPALA-7543, i.e. we need positional reads from files
>  ** Operations for local files and HDFS files could be implemented in child classes
>  * Buffer management
>  ** A new BufferStore class could be created
>  ** This new class would be responsible for managing the unused buffers
>  *** If possible, it would also handle the client and cached buffers
>  * Counters and metrics would be updated by the corresponding new classes
>  ** E.g. ImpaladMetrics::IO_MGR_NUM_OPEN_FILES would be updated by the file handling classes
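
The file-operation split proposed in the issue could look roughly like
the sketch below. It is an assumption-laden illustration, not Impala
code: SketchFileReader, SketchLocalFileReader and g_num_open_files are
hypothetical names, the method signatures are guesses based only on the
Open(), ReadFromPos() and Close() names in the issue, and the plain
atomic counter merely stands in for a metric such as
ImpaladMetrics::IO_MGR_NUM_OPEN_FILES. An HDFS child class would
implement the same interface and is omitted here.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>
#include <utility>

// Hypothetical counter standing in for an "open files" metric.
static std::atomic<int> g_num_open_files{0};

// Abstract API for file operations, as suggested in the issue.
class SketchFileReader {
 public:
  virtual ~SketchFileReader() = default;
  // Returns true if the file is open after the call.
  virtual bool Open() = 0;
  // Positional read: reads up to `len` bytes starting at `offset`.
  // Returns the number of bytes read, or -1 on error.
  virtual std::int64_t ReadFromPos(std::int64_t offset, char* out,
                                   std::int64_t len) = 0;
  virtual void Close() = 0;
};

// Child class for local files, built on C stdio for portability.
class SketchLocalFileReader : public SketchFileReader {
 public:
  explicit SketchLocalFileReader(std::string path) : path_(std::move(path)) {}
  ~SketchLocalFileReader() override { Close(); }

  bool Open() override {
    if (file_ != nullptr) return true;  // already open
    file_ = std::fopen(path_.c_str(), "rb");
    if (file_ == nullptr) return false;
    ++g_num_open_files;  // the file class updates the counter itself
    return true;
  }

  std::int64_t ReadFromPos(std::int64_t offset, char* out,
                           std::int64_t len) override {
    if (file_ == nullptr || offset < 0 || len < 0) return -1;
    if (std::fseek(file_, static_cast<long>(offset), SEEK_SET) != 0) return -1;
    std::size_t n = std::fread(out, 1, static_cast<std::size_t>(len), file_);
    return static_cast<std::int64_t>(n);
  }

  void Close() override {
    if (file_ == nullptr) return;  // closing twice is a no-op
    std::fclose(file_);
    file_ = nullptr;
    --g_num_open_files;
  }

 private:
  std::string path_;
  std::FILE* file_ = nullptr;
};
```

With this shape, a scan range would hold a pointer to the abstract
reader and issue positional reads without knowing which kind of file is
behind it, and the counter updates stay inside the file handling class.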



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-all-unsubscribe@impala.apache.org
For additional commands, e-mail: issues-all-help@impala.apache.org