Posted to notifications@asterixdb.apache.org by "Chen Luo (Jira)" <ji...@apache.org> on 2019/10/31 17:11:00 UTC
[jira] [Created] (ASTERIXDB-2666) One-Phase Log Replay Approach
Chen Luo created ASTERIXDB-2666:
-----------------------------------
Summary: One-Phase Log Replay Approach
Key: ASTERIXDB-2666
URL: https://issues.apache.org/jira/browse/ASTERIXDB-2666
Project: Apache AsterixDB
Issue Type: Wish
Components: STO - Storage, TX - Transactions
Reporter: Chen Luo
AsterixDB currently uses a classical two-phase log replay approach during recovery: it first identifies committed writes and then applies those committed writes to the LSM-trees. This is a standard approach for general-purpose transaction processing systems, but for AsterixDB we can design something better.
AsterixDB uses a record-level transaction model where each write is committed as soon as possible by "entity commit". To exploit this property, we can design a one-phase log replay approach as follows:
* Start from the log head based on the low watermark LSN
* Whenever we see an update log record, store that log record in memory (for each job)
* Whenever we see an entity commit or abort record, redo the corresponding update log record immediately and remove it from memory
The key property here is that the window between an update log record and its commit log record is very short, because we commit on a frame basis. Thus, this will speed up the recovery process by using only one log read pass and by avoiding storing all entity commits in memory. During the recovery process we only need a small amount of memory, determined by the window between updates and commits.
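The steps above can be sketched roughly as follows. This is only an illustrative Python model, not AsterixDB code: the LogRecord type, its fields, the record kind names, and the redo callback are all hypothetical. It also assumes that an abort record applies to a whole job and simply drops that job's buffered updates without redoing them, which is one possible reading of the third step. The function returns the peak number of buffered update records, to show that memory use follows the update-to-commit window rather than the log size.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Optional


@dataclass(frozen=True)
class LogRecord:
    """Hypothetical, simplified log record (not the AsterixDB format)."""
    lsn: int
    kind: str                    # "UPDATE", "ENTITY_COMMIT", or "ABORT"
    job_id: int
    key: str                     # entity (primary key) identifier
    value: Optional[str] = None  # payload, only used by UPDATE records


def one_phase_replay(
    log: Iterable[LogRecord],
    low_watermark_lsn: int,
    redo: Callable[[LogRecord], None],
) -> int:
    """Replay the log in a single pass; return the peak number of buffered updates."""
    # job_id -> key -> update records still waiting for an entity commit
    pending: Dict[int, Dict[str, List[LogRecord]]] = {}
    buffered = 0
    peak = 0

    for rec in log:
        if rec.lsn < low_watermark_lsn:
            continue  # start from the low watermark LSN

        if rec.kind == "UPDATE":
            # Buffer the update (per job) until its outcome is known.
            pending.setdefault(rec.job_id, {}).setdefault(rec.key, []).append(rec)
            buffered += 1
            peak = max(peak, buffered)
        elif rec.kind == "ENTITY_COMMIT":
            # Redo the matching updates immediately and release them.
            for update in pending.get(rec.job_id, {}).pop(rec.key, []):
                redo(update)
                buffered -= 1
        elif rec.kind == "ABORT":
            # Assumption: an abort discards all buffered updates of the job.
            dropped = pending.pop(rec.job_id, {})
            buffered -= sum(len(updates) for updates in dropped.values())

    # Updates still buffered here never committed and are not redone.
    return peak


if __name__ == "__main__":
    store: Dict[str, Optional[str]] = {}
    sample_log = [
        LogRecord(2, "UPDATE", 1, "a", "v1"),
        LogRecord(3, "ENTITY_COMMIT", 1, "a"),
    ]
    peak = one_phase_replay(sample_log, 2, lambda r: store.__setitem__(r.key, r.value))
    print(store, peak)
```

In this model the pending map only ever holds updates whose entity commit has not yet been read, so with frame-level commits it should stay small regardless of how long the log is.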
--
This message was sent by Atlassian Jira
(v8.3.4#803005)