Posted to hdfs-dev@hadoop.apache.org by "Anu Engineer (JIRA)" <ji...@apache.org> on 2016/11/01 18:02:58 UTC
[jira] [Resolved] (HDFS-10904) Need a new Result state for
DiskBalancerWorkStatus to indicate the final Plan step errors and stuck
rebalancing
[ https://issues.apache.org/jira/browse/HDFS-10904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Anu Engineer resolved HDFS-10904.
---------------------------------
Resolution: Not A Problem
> Need a new Result state for DiskBalancerWorkStatus to indicate the final Plan step errors and stuck rebalancing
> ---------------------------------------------------------------------------------------------------------------
>
> Key: HDFS-10904
> URL: https://issues.apache.org/jira/browse/HDFS-10904
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: balancer & mover
> Affects Versions: 3.0.0-alpha2
> Reporter: Manoj Govindassamy
> Assignee: Manoj Govindassamy
> Fix For: 2.9.0
>
>
> * A DiskBalancer {{NodePlan}} might include a single {{MoveStep}} or a list of MoveSteps to perform the requested disk balancing operation.
> * {{DiskBalancerWorkStatus}} tracks the current disk balancing operation status for the {{Plan}} just submitted.
> * {{DiskBalancerWorkStatus#Result}} has the following states, and the state transitions of {{currentResult}} do not seem to be driven entirely by the disk balancing operation. In particular, the transition to DONE happens only on QueryResult, which can be improved. {code}
> /** Various result values. **/
> public enum Result {
>   NO_PLAN(0),
>   PLAN_UNDER_PROGRESS(1),
>   PLAN_DONE(2),
>   PLAN_CANCELLED(3);
>
> DiskBalancer
>   cancelPlan(String)
>     this.currentResult = Result.PLAN_CANCELLED;
>   DiskBalancer(String, Configuration, BlockMover)
>     this.currentResult = Result.NO_PLAN;
>   queryWorkStatus()
>     this.currentResult = Result.PLAN_DONE;
>   shutdown()
>     this.currentResult = Result.NO_PLAN;
>     this.currentResult = Result.PLAN_CANCELLED;
>   submitPlan(String, long, String, String, boolean)
>     this.currentResult = Result.PLAN_UNDER_PROGRESS;
> {code}
> * More importantly, when the final {{MoveStep}} of the {{NodePlan}} fails, the currentResult state is stuck in {{PLAN_UNDER_PROGRESS}} forever. A user querying the status will assume the operation is in progress when in reality it is not making any progress. The user can also run the {{Query}} command with the _verbose_ option, which displays more details about the operation, including any errors encountered.
> ** Query Output: {code}
> Plan File: <_file_path_>
> Plan ID: <_plan_hash_>
> Result: PLAN_UNDER_PROGRESS
> {code}
> ** Verbose Query Output: {code}
> "sourcePath" : "/data/disk2/hdfs/dn",
> "destPath" : "/data/disk3/hdfs/dn",
> "workItem" :
> .. .. ..
> "errorCount" : 0,
> "errMsg" : null,
> .. ..
> "maxDiskErrors" : 5,
> .. .. ..
> {code}
> ** But the user has to decipher these details to work out that the disk balancing operation is stuck, because the top-level Result still says {{PLAN_UNDER_PROGRESS}}. So we want the DiskBalancer to differentiate between an in-progress operation and an operation that is stuck or has ended in error.
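
For illustration only, the behaviour requested in the description above could be sketched roughly as follows. This is not the Hadoop implementation, and the issue was resolved as Not A Problem. The class name PlanStatusSketch, the extra PLAN_FAILED state, and the executePlan method are assumptions made up for this sketch. Only NO_PLAN, PLAN_UNDER_PROGRESS, PLAN_DONE, PLAN_CANCELLED, currentResult and queryWorkStatus come from the quoted description. The idea shown is that the code executing the plan sets the terminal state, so a query reads the state without changing it.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BooleanSupplier;

// Hypothetical sketch; not the real DiskBalancer class.
class PlanStatusSketch {

    /** Result values quoted in the issue, plus one assumed state. */
    enum Result {
        NO_PLAN(0),
        PLAN_UNDER_PROGRESS(1),
        PLAN_DONE(2),
        PLAN_CANCELLED(3),
        PLAN_FAILED(4); // assumption: not part of the enum quoted in the issue

        private final int code;

        Result(int code) {
            this.code = code;
        }

        int getCode() {
            return code;
        }
    }

    private Result currentResult = Result.NO_PLAN;

    /**
     * Runs the steps in order. Each step returns true on success.
     * The executor itself records the terminal state, so the state
     * never stays at PLAN_UNDER_PROGRESS after the last step ends.
     */
    synchronized void executePlan(List<BooleanSupplier> steps) {
        currentResult = Result.PLAN_UNDER_PROGRESS;
        for (BooleanSupplier step : steps) {
            if (!step.getAsBoolean()) {
                currentResult = Result.PLAN_FAILED;
                return;
            }
        }
        currentResult = Result.PLAN_DONE;
    }

    /** Read-only query: reports the state without modifying it. */
    synchronized Result queryWorkStatus() {
        return currentResult;
    }

    public static void main(String[] args) {
        List<BooleanSupplier> allSucceed = Arrays.asList(() -> true, () -> true);
        List<BooleanSupplier> lastFails = Arrays.asList(() -> true, () -> false);

        PlanStatusSketch first = new PlanStatusSketch();
        first.executePlan(allSucceed);
        System.out.println(first.queryWorkStatus()); // prints PLAN_DONE

        PlanStatusSketch second = new PlanStatusSketch();
        second.executePlan(lastFails);
        System.out.println(second.queryWorkStatus()); // prints PLAN_FAILED
    }
}
```

In this sketch a failure of the final step leaves a distinct terminal state, which is the distinction the reporter asks for between an in-progress plan and one that has stopped with errors.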
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)