Posted to issues@impala.apache.org by "Tim Armstrong (JIRA)" <ji...@apache.org> on 2017/05/11 19:59:04 UTC

[jira] [Resolved] (IMPALA-5169) Parallelise read I/O of BufferPool::Pin()

     [ https://issues.apache.org/jira/browse/IMPALA-5169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Armstrong resolved IMPALA-5169.
-----------------------------------
       Resolution: Fixed
    Fix Version/s: Impala 2.9.0



IMPALA-5169: Add support for async pins in buffer pool

Makes Pin() do async reads behind the scenes, instead of
blocking until the read completes. Blocking instead happens
when the client tries to access the buffer via
PageHandle::GetBuffer() or ExtractBuffer().

This is implemented with a new sub-state of "pinned"
where the page has a buffer and consumes reservation
but the buffer does not contain valid data.
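
For illustration, a minimal client-side sketch of the new
behaviour. The call names mirror those above (Pin(),
PageHandle::GetBuffer()), but the exact signatures, the
ProcessData() consumer and the surrounding loops are
illustrative assumptions, not the actual Impala code:

    // Sketch only: signatures and helper names are assumed.
    // Pin() now assigns reservation and a buffer, starts the read
    // and returns without waiting for the data to arrive.
    for (BufferPool::PageHandle* page : pages_to_pin) {
      RETURN_IF_ERROR(buffer_pool->Pin(client, page));  // async read starts
    }
    // Reads for the pages above can run in parallel across disks.
    // Blocking is deferred to the first access of each page's data:
    for (BufferPool::PageHandle* page : pages_to_pin) {
      const BufferPool::BufferHandle* buffer;
      RETURN_IF_ERROR(page->GetBuffer(&buffer));  // waits for the read
      ProcessData(buffer->data(), buffer->len());  // hypothetical consumer
    }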

Motivation:
This unlocks various opportunities to overlap read I/Os
with other work:
* Reads to different disks can execute in parallel
* I/O and computation can be overlapped.

This initially benefits BufferedTupleStream::PinStream(),
where many pages are pinned at once. With this change the
reads run asynchronously, which can lead to large speedups
when reading back spilled data. E.g., if the pages for a hash
join's partition are spread across 10 disks, we could get 10x
the read throughput and also overlap the I/O with building the
hash table.

In future we can use this to do read-ahead over unpinned
BufferedTupleStreams or for unpinned Runs in Sorter, but
this requires changes to the client code to Pin() pages
in advance.

Testing:
* BufferedTupleStreamV2 already exercises this.
* Various BufferPool tests already exercise this.
* Added a basic test to cover edge cases made possible by the
  new state transitions.
* Extended the randomised test to cover this.

Change-Id: Ibdf074c1ac4405d6f08d623ba438a85f7d39fd79
Reviewed-on: http://gerrit.cloudera.org:8080/6612
Reviewed-by: Tim Armstrong <ta...@cloudera.com>
Tested-by: Impala Public Jenkins

> Parallelise read I/O of BufferPool::Pin()
> -----------------------------------------
>
>                 Key: IMPALA-5169
>                 URL: https://issues.apache.org/jira/browse/IMPALA-5169
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Backend
>    Affects Versions: Impala 2.9.0
>            Reporter: Tim Armstrong
>            Assignee: Tim Armstrong
>             Fix For: Impala 2.9.0
>
>
> Currently read I/O in BufferPool is synchronous. In some cases this can lead to poor resource utilisation and I/O throughput, because:
> * We don't dispatch parallel reads to multiple scratch disks or high-throughput SSDs.
> * Issuing reads of contiguous scratch ranges at the same time improves the odds that the second read can be served without a disk seek or by the disk's internal cache.
> There are a couple of alternatives:
> * Expose a batched Pin() interface that can pin multiple buffers at the same time.
> * Expose an asynchronous Pin() interface that can start the read and allow the client to wait for it.
> The first alternative is probably simplest.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)