Posted to reviews@helix.apache.org by GitBox <gi...@apache.org> on 2021/04/02 17:47:16 UTC

[GitHub] [helix] kaisun2000 opened a new issue #1694: Helix Selective Updates -- Phantom Read

kaisun2000 opened a new issue #1694:
URL: https://github.com/apache/helix/issues/1694


   Helix **Selective Updates** is a feature, enabled by default, that speeds up data provider updates from Zookeeper. By design this feature is incorrect: it can incur a "phantom read" (a form of dirty read). This is a serious issue. When Helix is under stress against Zookeeper, the data it reads can be wrong, which triggers various data races in the controller pipeline.
   
   ### Selective Update sees missed updates (Phantom read)
   First, here is how selective update works in the ideal case.
   
   1/ Upon a change notification from Zookeeper for a directory, say '/Cluster_TestTaskRebalancerStopResume/Config/Resource/', selective update first lists all the workflow/job znodes under the directory.
   
   2/ Selective update then identifies the listed znodes that already exist in its cache. Say '/Cluster_TestTaskRebalancerStopResume/Config/Resource/stopAndDeleteQueue' exists.
   
   3/ For each existing znode, selective update reads the znode's Stat metadata and compares it with the cached Stat. If the two Stats differ, i.e. the znode content changed, selective update issues a read to fetch the latest content.
   
   The core idea is that selective update only issues a read() when the data has changed.
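
   The three steps above can be sketched as follows. This is a minimal, self-contained model, not Helix code: `ZkStore` and its method names are hypothetical stand-ins for the real Zookeeper client, and a plain version number stands in for the full Stat structure.

```python
# Minimal sketch of the selective-update algorithm (hypothetical names;
# an in-memory dict stands in for Zookeeper, an int for the Stat).

class ZkStore:
    """In-memory stand-in for Zookeeper: path -> (content, version)."""
    def __init__(self):
        self.nodes = {}

    def list_children(self, parent):
        # step 1: list all znodes under the directory
        prefix = parent.rstrip("/") + "/"
        return [p for p in self.nodes if p.startswith(prefix)]

    def read_stat(self, path):
        # step 3a: fetch only the Stat (here: just a version number)
        return self.nodes[path][1]

    def read_data(self, path):
        return self.nodes[path][0]


def selective_update(zk, parent, cache):
    """cache: path -> (content, version). Returns the refreshed cache."""
    fresh = {}
    for path in zk.list_children(parent):      # step 1: list
        if path in cache:                      # step 2: existing znode
            stat = zk.read_stat(path)          # step 3: compare Stats
            if stat == cache[path][1]:
                fresh[path] = cache[path]      # unchanged: reuse cached copy
                continue
        # changed or brand new: issue the actual read
        fresh[path] = (zk.read_data(path), zk.read_stat(path))
    return fresh
```

   Note that the read() is skipped exactly when the Stat matches the cache; the race described below arises because the listing (step 1) and the Stat reads (step 3) are not atomic with respect to concurrent writes.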
   
   Note that Zookeeper provides sequential consistency, which means all updates are applied in a total order. Helix generally relies on this fact: reads of the data of interest must not miss updates. (Put differently, whenever Helix sees some snapshot of the data, it sees all the data as of some point in the total order.) Helix should never see a later update without all the preceding updates.
   
   ### How it can go wrong: the race condition illustrated
   Let us use the task-related issue https://github.com/apache/helix/issues/1394 as an example. Note that the pattern is general; the same race applies in the Helix controller and router too.
   
   Here is what the stopAndDeleteQueue() test does:
   
   1/ Creates the queue. ==> /Cluster_TestTaskRebalancerStopResume/Config/Resource/stopAndDeleteQueue is created as the workflow config.
   
   2/ Creates the master job. ==> /Cluster_TestTaskRebalancerStopResume/Config/Resource/stopAndDeleteQueue/stopAndDeleteQueue_masterJob is created.
   
   3/ Updates the queue. ==> the workflow config /Cluster_TestTaskRebalancerStopResume/Config/Resource/stopAndDeleteQueue is updated with the master job in the DAG.
   
   4/ Creates the slave job. ==> /Cluster_TestTaskRebalancerStopResume/Config/Resource/stopAndDeleteQueue/stopAndDeleteQueue_slaveJob is created.
   
   5/ Updates the queue. ==> the workflow config /Cluster_TestTaskRebalancerStopResume/Config/Resource/stopAndDeleteQueue is updated with the slave job in the DAG.
   
   
   
   Now, let us interleave the selective update sequence with these writes; specifically, consider the following order.
   
   1/ Creates the job queue. => the stopAndDeleteQueue workflow config is created.
   
   2/ Selective update steps 1, 2, 3 => the selective update cache holds the stopAndDeleteQueue workflow config with an empty DAG.
   
   3/ Creates the master job, updates the queue.
   
   4/ Selective update steps 1, 2 => selective update finds the set of existing znodes is {stopAndDeleteQueue} and the set of new znodes is {master job}.
   
   5/ Creates the slave job, updates the queue.
   
   6/ Selective update step 3 => selective update reads the Stat of stopAndDeleteQueue and finds that it changed, so it re-reads the stopAndDeleteQueue workflow config and gets the newest content, whose DAG references both jobs. But it reads only the master job config, because the slave job was never listed in step 4.
   
   
   
   So in the end, selective update sees:
   
   the stopAndDeleteQueue config with a DAG that includes both the master job and the slave job;
   the master job config, but not the slave job config.
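   
   The interleaving above can be replayed against a toy in-memory store to make the inconsistent end state concrete. Everything here (`store`, `write`, the shortened paths) is illustrative, not a Helix or Zookeeper API; an integer version stands in for the Stat.

```python
# Replaying the six-step interleaving against an in-memory store
# (hypothetical model: path -> (content, version)).

store = {}
Q = "/Config/Resource/stopAndDeleteQueue"

def write(path, content):
    """Create or update a znode, bumping its version (stand-in for a Stat)."""
    _, ver = store.get(path, (None, 0))
    store[path] = (content, ver + 1)

# 1/ create the queue: workflow config with an empty DAG
write(Q, {"dag": []})

# 2/ selective update steps 1-3: cache holds the queue with an empty DAG
cache = dict(store)

# 3/ create the master job, update the queue's DAG
write(Q + "/masterJob", {"cmd": "master"})
write(Q, {"dag": ["masterJob"]})

# 4/ selective update steps 1-2: list children, split into existing vs. new
listed = list(store)                          # sees Q and masterJob only
existing = [p for p in listed if p in cache]  # [Q]
new = [p for p in listed if p not in cache]   # [masterJob]

# 5/ create the slave job, update the queue's DAG
write(Q + "/slaveJob", {"cmd": "slave"})
write(Q, {"dag": ["masterJob", "slaveJob"]})

# 6/ selective update step 3: Q's Stat changed, so re-read Q; also read the
#    "new" znodes found in step 4. slaveJob was never listed, so it is missed.
for p in existing:
    if store[p][1] != cache[p][1]:
        cache[p] = store[p]
for p in new:
    cache[p] = store[p]

# End state: the cached DAG names both jobs, but only masterJob's config exists.
assert cache[Q][0]["dag"] == ["masterJob", "slaveJob"]
assert Q + "/masterJob" in cache
assert Q + "/slaveJob" not in cache   # the phantom read
```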
   
   ### Summary of all the failure scenarios 
   Base cases
   case 1/ In this case, the Stat read fetches the newly added workflow config (whose DAG contains job 1), but the file listing happened before job 1 was added, so it does not contain job 1. The end result is reading the new wfg (with job 1 in its DAG) while missing the job 1 config. This is the "phantom read".
   
   Selective update:
   
   ------------------------ s1(list files) ------------------------------------ s2 ( read stat ) --------------------------------
   
   Adding config:
   
   --------------------------------------add job 1 ---------- add wfg -----------------------------
   
   
   
   case 2/ In this case, the Stat read fetches the old workflow config (not containing job 1), and the file listing does not contain job 1 either. The end result is reading the old wfg and missing the job 1 config. This is consistent.
   
   Selective update:
   
   ------------------------s1(list files)------------------------------s2 (read stat) ----------------------
   
   adding config:
   
   --------------------------------------add job 1 -------------------------------------add wfg--------------
   
   
   
   case 3/ In this case, the file listing sees job 1, and the Stat read sees the new wfg as well. The end result is reading the new job config and the new wfg config. This is consistent.
   
   Selective update:
   
   -----------------------s1(list files) --------------------------------s2(read stat) ------------------------
   
   adding config
   
   ---------add job 1-----------------------------add wfg ----------------------------------------------
   
   
   
   case 4/ The end result is reading the new job config but the old wfg (not containing job 1 in its DAG). This is acceptable in the case of a task workflow, but in other cases it may not be.
   
   Selective update:
   
   -----------------------s1(list files)-------------------------------s2(read stat)-------------------------
   
   adding config
   
   -----------add job1 --------------------------------------------------------------add wfg-------------
   
   
   
   In summary, write selective update's two operations as "{" and "}", and the job/workflow update's two operations as "(" and ")". The four interleavings are:
   
   { ( ) } : phantom read; the new wfg config (with the job in its DAG) is read, but the new job config is missing.
   { ( } ) : selective update reads the old data; the new job config and the new wfg are both ignored. Consistent.
   ( { ) } : reads the new job config together with the new workflow config. Consistent.
   ( { } ) : reads the new job config with the old workflow config. Consistent or not depends on application logic; in general it is not consistent.
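   
   The four interleavings above can be checked mechanically. The sketch below is a hypothetical model, not Helix code: it tracks only whether the listing ("{") saw job 1 and whether the Stat read ("}") saw the new wfg, then classifies the outcome.

```python
# Classify each interleaving of selective update ("{" = list, "}" = read
# stat + data) and the writer ("(" = add job 1, ")" = add/update wfg).

def outcome(order):
    listed_job = wfg_is_new = False   # what selective update observed
    job_added = wfg_updated = False   # what the writer has done so far
    for op in order:
        if op == "(":
            job_added = True
        elif op == ")":
            wfg_updated = True
        elif op == "{":               # s1: listing sees job 1 only if it exists
            listed_job = job_added
        elif op == "}":               # s2: stat/data read sees the latest wfg
            wfg_is_new = wfg_updated
    if wfg_is_new and not listed_job:
        return "phantom read"         # new wfg references a job we never read
    if wfg_is_new and listed_job:
        return "consistent (new)"
    if not wfg_is_new and not listed_job:
        return "consistent (old)"
    return "job without wfg"          # new job config, old wfg

assert outcome("{()}") == "phantom read"
assert outcome("{(})") == "consistent (old)"
assert outcome("({)}") == "consistent (new)"
assert outcome("({})") == "job without wfg"
```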
   
   

