Posted to dev@bookkeeper.apache.org by Zhanpeng Wu <wu...@gmail.com> on 2022/03/17 02:00:22 UTC

[DISCUSS] BP-49: Support reading ahead in async mode

Master issue: https://github.com/apache/bookkeeper/issues/3085

----

### Motivation

#### Current Design of Read-ahead

Under the current design, every `read-entry` request that misses the
read-cache and has to be served from main storage also triggers a
read-ahead operation through the method
`org.apache.bookkeeper.bookie.storage.ldb.SingleDirectoryDbLedgerStorage.fillReadAheadCache`.
This method reads several entries following the current position and loads
them into the read-cache. The number of entries read ahead is controlled by
`dbStorage_readAheadCacheBatchSize`.

In this mode, because read-ahead is synchronous, a read-cache miss makes
the request wait for the read-ahead as well: the elapsed time of reading an
entry is the time to read that entry plus the time to read the several
entries after it.
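
To make the cost concrete, here is a rough back-of-the-envelope model of a cache miss in both modes. It is only a sketch: the class name `ReadLatencyModel`, the per-entry cost, and the batch size are illustrative assumptions, not measurements, and it assumes every entry costs the same to read (real sequential read-ahead is usually cheaper per entry).

```java
// Illustrative latency model only; the class and the numbers are hypothetical.
final class ReadLatencyModel {

    /** Cache miss with synchronous read-ahead: the caller waits for the entry plus the whole batch. */
    static long syncMissLatencyMicros(long perEntryReadMicros, int readAheadBatchSize) {
        return perEntryReadMicros * (1L + readAheadBatchSize);
    }

    /** Cache miss with asynchronous read-ahead: the caller waits only for the requested entry. */
    static long asyncMissLatencyMicros(long perEntryReadMicros) {
        return perEntryReadMicros;
    }

    public static void main(String[] args) {
        long perEntryMicros = 50; // assumed cost of one read from main storage
        int batchSize = 100;      // assumed value of dbStorage_readAheadCacheBatchSize
        System.out.println("sync miss:  " + syncMissLatencyMicros(perEntryMicros, batchSize) + " us");
        System.out.println("async miss: " + asyncMissLatencyMicros(perEntryMicros) + " us");
    }
}
```

Under these assumptions the synchronous miss grows linearly with the batch size, while the asynchronous miss does not depend on it.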

Synchronous read-ahead is a simple and effective solution in scenarios
where read latency is less of a concern. However, we found that when a
cluster serves a large number of catch-up reads and p99 latency matters,
synchronous read-ahead can introduce frequent latency spikes. Therefore, we
decided to introduce an asynchronous read-ahead mode to reduce the latency
of catch-up reads.

#### Proposed Approach

Instead of modifying the original synchronous read-ahead logic, we
introduce an independent asynchronous read-ahead module named
`ReadAheadManager`. Users can select the read-ahead mode through
configuration parameters. The async read-ahead module exposes an
entry-reading interface to the upper-layer logic.
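
The general idea can be sketched as follows. This is not the proposed `ReadAheadManager` implementation; the class `AsyncReadAheadSketch`, its `readEntry` method, and the in-memory stand-in for main storage are all hypothetical names chosen for illustration. On a miss, the caller reads only the requested entry and hands the following entries to a background thread.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.LongFunction;

// Hypothetical sketch of asynchronous read-ahead; not BookKeeper code.
final class AsyncReadAheadSketch implements AutoCloseable {
    private final LongFunction<String> mainStorage; // stand-in for a slow read from main storage
    private final int batchSize;                    // how many entries to read ahead per miss
    // Cached or in-flight entries. A real cache would also need eviction; omitted here.
    private final Map<Long, CompletableFuture<String>> readCache = new ConcurrentHashMap<>();
    private final ExecutorService readAheadPool = Executors.newSingleThreadExecutor();

    AsyncReadAheadSketch(LongFunction<String> mainStorage, int batchSize) {
        this.mainStorage = mainStorage;
        this.batchSize = batchSize;
    }

    /** Entry-reading interface offered to the upper layer. */
    String readEntry(long entryId) {
        CompletableFuture<String> cached = readCache.get(entryId);
        if (cached != null) {
            // Cache hit, or a read-ahead for this entry is in flight: wait only for that entry.
            return cached.join();
        }
        // Cache miss: read just the requested entry on the caller's thread...
        String entry = mainStorage.apply(entryId);
        // ...and schedule the following entries in the background without waiting for them.
        scheduleReadAhead(entryId + 1);
        return entry;
    }

    private void scheduleReadAhead(long firstEntryId) {
        for (long id = firstEntryId; id < firstEntryId + batchSize; id++) {
            final long entryId = id;
            readCache.computeIfAbsent(entryId,
                    key -> CompletableFuture.supplyAsync(() -> mainStorage.apply(entryId), readAheadPool));
        }
    }

    @Override
    public void close() {
        readAheadPool.shutdown();
    }

    public static void main(String[] args) {
        try (AsyncReadAheadSketch reader = new AsyncReadAheadSketch(id -> "entry-" + id, 3)) {
            System.out.println(reader.readEntry(0)); // miss: read directly, read-ahead scheduled
            System.out.println(reader.readEntry(1)); // served by the background read-ahead
        }
    }
}
```

In this sketch a hit never triggers further read-ahead and the cache never evicts; a real module would need to handle both, along with ledger boundaries and error handling.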

#### Evaluation Results

I have also uploaded monitoring screenshots of entry reads to the master
issue. They show the optimization effect of enabling asynchronous
read-ahead.

Re: [DISCUSS] BP-49: Support reading ahead in async mode

Posted by Zhanpeng Wu <wu...@gmail.com>.
Hi community, any suggestions on this proposal?
