Posted to hdfs-dev@hadoop.apache.org by "Bharat Viswanadham (Jira)" <ji...@apache.org> on 2019/11/14 00:22:00 UTC

[jira] [Created] (HDDS-2477) TableCache cleanup issue for OM non-HA

Bharat Viswanadham created HDDS-2477:
----------------------------------------

             Summary: TableCache cleanup issue for OM non-HA
                 Key: HDDS-2477
                 URL: https://issues.apache.org/jira/browse/HDDS-2477
             Project: Hadoop Distributed Data Store
          Issue Type: Bug
            Reporter: Bharat Viswanadham
            Assignee: Bharat Viswanadham


In the OM non-HA case, the ratisTransactionLogIndex is generated by OmProtocolServersideTranslatorPB.java, and validateAndUpdateCache is called from multiple handler threads. Consider a case where a thread holding index 10 has added its entry to the doubleBuffer, while the threads holding indexes 0-9 have not yet added theirs. The doubleBuffer flush thread flushes and calls cleanup, which removes all cache entries with an epoch less than 10. It should not clean up entries that may be put into the cache later and that are still in the process of being flushed to the DB. This causes inconsistency for a few OM requests.

 

 

Example:

4 threads committing 4 parts.

1st thread - part 1 - ratis index - 3

2nd thread - part 2 - ratis index - 2

3rd thread - part 3 - ratis index - 1

 

The first thread gets the lock and puts OmMultipartInfo (with part 1) into the doubleBuffer and the cache. Cleanup is then called to remove all cache entries with an epoch less than 3. In the meantime, the 2nd and 3rd threads put parts 2 and 3 into OmMultipartInfo, in both the cache and the doubleBuffer. But the cleanup triggered for the first thread's entry might remove those entries, because it is called with index 3.

 

Now the 4th part upload arrives. When it commits the multipart upload and reads the multipart info, it gets only part 1 in OmMultipartInfo, because the cache entry has been cleaned up and the OmMultipartInfo with parts 1, 2, and 3 is still in the process of being committed to the DB. So after the 4th part upload completes, the DB and the cache hold only parts 1 and 4. The information for parts 2 and 3 is lost.
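
The sequence above can be replayed deterministically with a toy model. This is only a sketch under stated assumptions, not the actual Ozone code: the class EpochCleanupRace, its helper methods, the key "upload-1", and the rule that cleanup evicts every cache entry whose epoch is at or below the given index are all hypothetical stand-ins for TableCache and the doubleBuffer.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical toy model; none of these names come from the Ozone code base.
class EpochCleanupRace {
    /** Cached part list tagged with the epoch (transaction index) that wrote it. */
    static final class Entry {
        final TreeSet<Integer> parts;
        final long epoch;
        Entry(TreeSet<Integer> parts, long epoch) {
            this.parts = parts;
            this.epoch = epoch;
        }
    }

    /** Reads the part list from cache (falling back to DB), adds a part, caches it.
     *  Returns the snapshot that would be queued in the double buffer. */
    static TreeSet<Integer> commitPart(Map<String, Entry> cache,
                                       Map<String, TreeSet<Integer>> db,
                                       String key, int part, long epoch) {
        TreeSet<Integer> parts = new TreeSet<>();
        Entry cached = cache.get(key);
        if (cached != null) {
            parts.addAll(cached.parts);
        } else if (db.containsKey(key)) {
            parts.addAll(db.get(key));
        }
        parts.add(part);
        cache.put(key, new Entry(parts, epoch));
        return new TreeSet<>(parts);
    }

    /** Assumed cleanup rule: evict every entry whose epoch is at or below the given one. */
    static void cleanup(Map<String, Entry> cache, long epoch) {
        cache.values().removeIf(e -> e.epoch <= epoch);
    }

    static TreeSet<Integer> simulate() {
        Map<String, Entry> cache = new HashMap<>();
        Map<String, TreeSet<Integer>> db = new HashMap<>();
        String key = "upload-1";

        // Thread 1 (index 3) commits part 1; its snapshot is queued for flushing.
        TreeSet<Integer> firstFlush = commitPart(cache, db, key, 1, 3);
        // Threads 2 and 3 (indexes 2 and 1) commit parts 2 and 3; not flushed yet.
        commitPart(cache, db, key, 2, 2);
        commitPart(cache, db, key, 3, 1);

        // The flush thread writes only the first snapshot, then cleans up with index 3.
        db.put(key, firstFlush);
        cleanup(cache, 3);

        // Part 4 arrives: cache miss, so it reads the stale part list from the DB.
        commitPart(cache, db, key, 4, 4);
        return cache.get(key).parts;
    }

    public static void main(String[] args) {
        System.out.println(simulate()); // prints [1, 4]
    }
}
```

In this model the entry written with index 1 (holding parts 1, 2, 3) is evicted by the cleanup for index 3 before it reaches the DB, which is why the 4th commit sees only part 1.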

 

So for the non-HA case, cleanup will be called with the list of epochs that need to be cleaned up.
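
A minimal sketch of that idea, again as a toy model rather than the actual patch: the class EpochListCleanup, the cleanup(cache, flushedEpochs) signature, and the key "upload-1" are assumptions made for illustration. Cleanup evicts only entries whose epoch appears in the list of flushed epochs, so entries written with a lower index that have not yet been flushed stay in the cache.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical toy model; none of these names come from the Ozone code base.
class EpochListCleanup {
    /** Cached part list tagged with the epoch (transaction index) that wrote it. */
    static final class Entry {
        final TreeSet<Integer> parts;
        final long epoch;
        Entry(TreeSet<Integer> parts, long epoch) {
            this.parts = parts;
            this.epoch = epoch;
        }
    }

    /** Reads the part list from cache (falling back to DB), adds a part, caches it.
     *  Returns the snapshot that would be queued in the double buffer. */
    static TreeSet<Integer> commitPart(Map<String, Entry> cache,
                                       Map<String, TreeSet<Integer>> db,
                                       String key, int part, long epoch) {
        TreeSet<Integer> parts = new TreeSet<>();
        Entry cached = cache.get(key);
        if (cached != null) {
            parts.addAll(cached.parts);
        } else if (db.containsKey(key)) {
            parts.addAll(db.get(key));
        }
        parts.add(part);
        cache.put(key, new Entry(parts, epoch));
        return new TreeSet<>(parts);
    }

    /** Evict only entries whose epoch is in the list of flushed epochs. */
    static void cleanup(Map<String, Entry> cache, List<Long> flushedEpochs) {
        cache.values().removeIf(e -> flushedEpochs.contains(e.epoch));
    }

    static TreeSet<Integer> simulate() {
        Map<String, Entry> cache = new HashMap<>();
        Map<String, TreeSet<Integer>> db = new HashMap<>();
        String key = "upload-1";

        // Same interleaving as in the example: indexes 3, 2, 1 for parts 1, 2, 3.
        TreeSet<Integer> firstFlush = commitPart(cache, db, key, 1, 3);
        commitPart(cache, db, key, 2, 2);
        commitPart(cache, db, key, 3, 1);

        // Only epoch 3 was flushed, so only epoch 3 is passed to cleanup.
        db.put(key, firstFlush);
        cleanup(cache, Arrays.asList(3L));

        // Part 4 arrives and still finds parts 1, 2, 3 in the cache.
        commitPart(cache, db, key, 4, 4);
        return cache.get(key).parts;
    }

    public static void main(String[] args) {
        System.out.println(simulate()); // prints [1, 2, 3, 4]
    }
}
```

With the same interleaving as in the example, the entry written with index 1 survives the cleanup for index 3, and the 4th commit builds on the complete part list.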


