Posted to issues@ozone.apache.org by "runzhiwang (Jira)" <ji...@apache.org> on 2020/03/26 02:56:00 UTC

[jira] [Commented] (HDDS-3240) Improve write efficiency by creating container in parallel.

    [ https://issues.apache.org/jira/browse/HDDS-3240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17067325#comment-17067325 ] 

runzhiwang commented on HDDS-3240:
----------------------------------

I'm working on it.

> Improve write efficiency by creating container in parallel.
> -----------------------------------------------------------
>
>                 Key: HDDS-3240
>                 URL: https://issues.apache.org/jira/browse/HDDS-3240
>             Project: Hadoop Distributed Data Store
>          Issue Type: Improvement
>            Reporter: runzhiwang
>            Assignee: runzhiwang
>            Priority: Major
>         Attachments: screenshot-1.png
>
>
> Currently, a follower cannot create a container until the leader has finished creating it. However, the follower and leader could create the container in parallel rather than sequentially.
> 1. From the code, the [future thread|https://github.com/apache/hadoop-ozone/blob/master/hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/transport/server/ratis/ContainerStateMachine.java#L672] that runs `getCachedStateMachineData` in `readStateMachineData` and the [future thread|https://github.com/apache/hadoop-ozone/blob/master/hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/transport/server/ratis/ContainerStateMachine.java#L459] that runs `createContainer` in `writeStateMachineData` are the same [thread|https://github.com/apache/hadoop-ozone/blob/master/hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/transport/server/ratis/ContainerStateMachine.java#L505]. Because `writeStateMachineData` is called before `readStateMachineData`, the leader must wait for `createContainer` to finish before it can run `getCachedStateMachineData` and append logs to the follower. The leader and follower are therefore not independent in `createContainer`: the follower must wait for the leader to finish `createContainer`.
> 2. From the Jaeger UI, you can also see that the follower currently creates the container only after the leader has finished creating it.
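
The serialization described in point 1 can be reproduced with a small model. This is only a sketch under stated assumptions: `SharedExecutorDemo` and `runOnSharedExecutor` are hypothetical names, a single-threaded executor stands in for the shared thread, and a sleep stands in for container creation. It is not the Ozone code.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical stand-in for the current behavior; not the Ozone code.
final class SharedExecutorDemo {

    // Returns the order in which the two simulated steps finish.
    static List<String> runOnSharedExecutor(long createMillis) {
        List<String> events = new CopyOnWriteArrayList<>();
        // Assumption: both futures end up on the same single thread.
        ExecutorService shared = Executors.newSingleThreadExecutor();
        try {
            // writeStateMachineData is called first and queues createContainer.
            CompletableFuture<Void> create = CompletableFuture.runAsync(() -> {
                sleep(createMillis); // simulated slow container creation
                events.add("createContainer");
            }, shared);
            // readStateMachineData is called next; its lookup queues behind the create.
            CompletableFuture<Void> read = CompletableFuture.runAsync(
                () -> events.add("getCachedStateMachineData"), shared);
            CompletableFuture.allOf(create, read).join();
        } finally {
            shared.shutdown();
        }
        return events;
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println(runOnSharedExecutor(200));
    }
}
```

In this model the lookup always finishes after the simulated `createContainer`, however cheap the lookup is, because both tasks are queued on one thread.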
> How to improve it:
> I think this ordering can be improved by using different threads for `getCachedStateMachineData` and `createContainer`, while having [data = readStateMachineData(requestProto, term, logIndex)|https://github.com/apache/hadoop-ozone/blob/master/hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/transport/server/ratis/ContainerStateMachine.java#L619] use the same thread as `createContainer`. If [stateMachineDataCache.get(logIndex)|https://github.com/apache/hadoop-ozone/blob/master/hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/transport/server/ratis/ContainerStateMachine.java#L617] does not return null, the leader can get the stateMachineData from the cache without waiting for `createContainer` to finish, so the leader and follower can be independent. But if it returns null, the leader must finish `createContainer` and then append logs to the follower. So I think only [data = readStateMachineData(requestProto, term, logIndex)|https://github.com/apache/hadoop-ozone/blob/master/hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/transport/server/ratis/ContainerStateMachine.java#L619] should use the same thread as `createContainer`, rather than the whole [getCachedStateMachineData|https://github.com/apache/hadoop-ozone/blob/master/hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/transport/server/ratis/ContainerStateMachine.java#L614].
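
A minimal sketch of the proposed split follows, under stated assumptions. The method names mirror the issue text, but `ParallelCreateSketch`, `containerStore`, `createExecutor`, `lookupExecutor`, and the `keepInCache` flag are hypothetical, and the signatures are simplified; this is not the `ContainerStateMachine` API. The cache lookup runs on its own executor, and only the cache-miss read is queued on the executor that runs `createContainer`.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical model of the proposal; names mirror the issue text, not the Ozone API.
final class ParallelCreateSketch {
    private final Map<Long, String> stateMachineDataCache = new ConcurrentHashMap<>();
    // Simulated on-disk data, only readable once createContainer has run.
    private final Map<Long, String> containerStore = new ConcurrentHashMap<>();
    private final ExecutorService createExecutor = Executors.newSingleThreadExecutor();
    private final ExecutorService lookupExecutor = Executors.newSingleThreadExecutor();

    // Queues the (possibly slow) createContainer work on createExecutor.
    // keepInCache = false simulates an entry missing from the cache.
    CompletableFuture<Void> writeStateMachineData(
            long logIndex, String data, boolean keepInCache, Runnable createContainer) {
        if (keepInCache) {
            stateMachineDataCache.put(logIndex, data);
        }
        return CompletableFuture.runAsync(() -> {
            createContainer.run();
            containerStore.put(logIndex, data);
        }, createExecutor);
    }

    // The lookup runs on lookupExecutor, so a cache hit never waits for createContainer.
    // Only a miss is queued on createExecutor, behind the pending createContainer.
    CompletableFuture<String> getCachedStateMachineData(long logIndex) {
        return CompletableFuture
            .supplyAsync(() -> stateMachineDataCache.get(logIndex), lookupExecutor)
            .thenCompose(cached -> {
                if (cached != null) {
                    return CompletableFuture.completedFuture(cached);
                }
                return CompletableFuture.supplyAsync(
                    () -> readStateMachineData(logIndex), createExecutor);
            });
    }

    private String readStateMachineData(long logIndex) {
        return containerStore.get(logIndex);
    }

    void shutdown() {
        createExecutor.shutdown();
        lookupExecutor.shutdown();
    }

    public static void main(String[] args) {
        ParallelCreateSketch sm = new ParallelCreateSketch();
        Runnable slowCreate = () -> {
            try {
                Thread.sleep(200); // simulated slow container creation
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        // Cache hit: the data is returned while createContainer may still be running.
        CompletableFuture<Void> create = sm.writeStateMachineData(1L, "chunk-1", true, slowCreate);
        String hit = sm.getCachedStateMachineData(1L).join();
        System.out.println(hit + " read; createContainer done: " + create.isDone());
        // Cache miss: the read is ordered after createContainer on the same thread.
        sm.writeStateMachineData(2L, "chunk-2", false, slowCreate);
        System.out.println(sm.getCachedStateMachineData(2L).join() + " read after createContainer");
        sm.shutdown();
    }
}
```

The single-threaded `createExecutor` is what gives the ordering guarantee on a miss in this sketch: tasks run in submission order, so the read cannot start before the earlier `createContainer` task has finished.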



