Posted to hdfs-dev@hadoop.apache.org by Rakesh Radhakrishnan <ra...@gmail.com> on 2016/06/09 09:05:39 UTC

Re: HDFS Erasure Coding block placement policy related reconstruction work not scheduled appropriately

Thanks Rui for reporting this.

With the "RS-DEFAULT-6-3-64k" EC policy, an EC file will have 6 data blocks
and 3 parity blocks. As you described, the cluster initially has 5 racks, so
the first 5 data blocks will use those racks. When rack-6 is added, a
reconstruction task will be scheduled to place the 6th data block on rack-6.
Presently, for an EC file,
"BlockManager#isPlacementPolicySatisfied()" uses the "#numDataUnits" count
to verify that the block's placement meets the requirements of the placement
policy, i.e. that replicas are placed on no fewer than minRacks racks in the
system. That's why adding rack-7 or more racks does not trigger further
reconstruction tasks: the parity block count is not considered, so the
parity blocks are never spread to the new racks. I couldn't see any specific
reason not to distribute the parity blocks across more racks. IMHO, it would
be good to distribute all 9 blocks to 9 different racks; you could file a
jira to discuss this and reach an agreement.
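To make the mismatch concrete, here is a minimal, hypothetical sketch (the class and method names are invented for illustration; this is not the actual BlockManager code) of the difference between bounding the required rack count by numDataUnits alone versus by all 9 block units:

```java
// Hypothetical sketch of the placement check described in this thread.
// Not the real HDFS implementation; names are invented for illustration.
public class EcPlacementSketch {
    // RS-DEFAULT-6-3-64k: 6 data units, 3 parity units.
    static final int NUM_DATA_UNITS = 6;
    static final int NUM_PARITY_UNITS = 3;

    // Behaviour as described: only numDataUnits bounds the required rack
    // count, so parity blocks never force reconstruction onto new racks.
    static boolean satisfiedWithDataUnitsOnly(int clusterRacks, int racksUsed) {
        int required = Math.min(clusterRacks, NUM_DATA_UNITS);
        return racksUsed >= required;
    }

    // Suggested behaviour: count all 9 block units when computing the target,
    // so added racks keep triggering reconstruction up to 9 racks.
    static boolean satisfiedWithAllUnits(int clusterRacks, int racksUsed) {
        int required = Math.min(clusterRacks, NUM_DATA_UNITS + NUM_PARITY_UNITS);
        return racksUsed >= required;
    }

    public static void main(String[] args) {
        // 7-rack cluster, EC file currently spread over 6 racks:
        System.out.println(satisfiedWithDataUnitsOnly(7, 6)); // true  -> no reconstruction scheduled
        System.out.println(satisfiedWithAllUnits(7, 6));      // false -> reconstruct onto the 7th rack
    }
}
```

With 7 racks and blocks on only 6 of them, the first check reports the placement as satisfied (6 >= min(7, 6)), which matches the observed behaviour that rack-7 triggers no work; the second check would keep scheduling reconstruction until 9 racks are used.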

Thanks,
Rakesh.

On Thu, Jun 9, 2016 at 11:16 AM, Rui Gao <rg...@yahoo-corp.jp> wrote:

> Hi all,
>
> We found out that under the RS-DEFAULT-6-3-64k EC policy, if an EC file was
> written to 5 racks, reconstruction work would be scheduled when
> the 6th rack was added, while adding the 7th or more racks would not
> trigger reconstruction work. Based on
> "BlockPlacementPolicyRackFaultTolerant.java",
> an EC file should be scheduled to distribute across 9 racks if possible.
>
> May I file a JIRA to address this issue?
>
> Looking forward to your opinions.
> Thank you
>
> Gao Rui
>
>
>

Re: HDFS Erasure Coding block placement policy related reconstruction work not scheduled appropriately

Posted by Rui Gao <rg...@yahoo-corp.jp>.
Thanks Rakesh for your comments. 

I’ve filed a jira for discussion: https://issues.apache.org/jira/browse/HDFS-10530
Looking forward to discussing this and reaching an agreement.

Thanks,
Rui




---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-dev-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-help@hadoop.apache.org