Posted to hdfs-dev@hadoop.apache.org by Brahma Reddy Battula <br...@huawei.com> on 2016/04/13 08:34:33 UTC

FW: [HDFS-9038] Non-Dfs used Calculation

Gentle Reminder!!


--Brahma Reddy Battula

From: Brahma Reddy Battula
Sent: 28 March 2016 12:26
To: hdfs-dev@hadoop.apache.org
Cc: 'aagarwal@hortonworks.com'; 'cnauroth@hortonworks.com'; 'vinayakumarb@apache.org'
Subject: [HDFS-9038] Non-Dfs used Calculation

Hi All,

Chris Nauroth, Arpit, Vinay, and I have been discussing this calculation.

There is a disagreement on the definition of non-DFS used space, because of which the issue is not making progress.
Essentially, it's a question of whether this metric means "Raw Non-DFS Used" or "Unplanned Non-DFS Used".


Here is Arpit's summary of the conversation.

The pre-HDFS-5215 calculation had two bugs.

 1. It incorrectly subtracted reserved space from the non-DFS used (net negative). Chris suggests this is not really an issue, as non-DFS used should be shown as zero unless it exceeds the DFS reserved value.

 2. It used File#getUsableSpace to calculate the volume free space instead of File#getFreeSpace (net positive).

The net effect was that non-DFS used was displayed as zero unless the actual non-DFS used exceeded (DFS reserved - system reserved).

HDFS-5215 fixed the first issue. The value that is now erroneously counted towards non-DFS used is in fact the 5% reserved by the file system.

Testing showed that "Ext derivatives hold back 5% free space while XFS does not."


Proposed calculation to report the exact Non-DFS Usage:

  non-DFS used = getCapacity() + reserved - getDfsUsed() - totalFreeSpace
               = usage.getCapacity() - reserved + reserved - getDfsUsed() - totalFreeSpace
               = usage.getCapacity() - getDfsUsed() - totalFreeSpace
               = File#getTotalSpace - getDfsUsed() - File#getFreeSpace
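
The last line of the derivation can be sketched in plain Java. This is only an illustration under assumptions, not the HDFS implementation: the class RawNonDfsUsed, its helper names, and the sample numbers are hypothetical, and DFS used is passed in as a parameter. The only real JDK calls are java.io.File#getTotalSpace and File#getFreeSpace.

```java
import java.io.File;

final class RawNonDfsUsed {
  /** Raw non-DFS used = total volume size - DFS used - free space (hypothetical helper). */
  static long rawNonDfsUsed(long totalSpace, long dfsUsed, long freeSpace) {
    return totalSpace - dfsUsed - freeSpace;
  }

  /** Space the file system holds back from unprivileged users (hypothetical helper). */
  static long fileSystemReserved(long freeSpace, long usableSpace) {
    return freeSpace - usableSpace;
  }

  public static void main(String[] args) {
    // Illustrative numbers in GB: 1000 total, 300 DFS used, 650 free, 600 usable.
    System.out.println(rawNonDfsUsed(1000L, 300L, 650L));   // 50: files that are not HDFS blocks
    System.out.println(fileSystemReserved(650L, 600L));     // 50: held back by the file system
    // Plugging in usable space instead of free space also counts the held-back space.
    System.out.println(rawNonDfsUsed(1000L, 300L, 600L));   // 100

    // The same formula on a real directory. DFS used is assumed to be 0 here
    // because this sketch has no DataNode to ask.
    File dir = new File(args.length > 0 ? args[0] : ".");
    System.out.println(rawNonDfsUsed(dir.getTotalSpace(), 0L, dir.getFreeSpace()));
  }
}
```

With these sample numbers, using usable space in place of free space doubles the reported value, which matches the observation that the file-system reserve ends up counted as non-DFS used.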

Chris Nauroth thinks we should subtract "dfs.datanode.du.reserved" from non-DFS used, because that makes it possible to monitor for unexpected non-zero non-DFS usage and react.

Akira also gave a "+0" on the above calculation.

We would like to get your input so that the issue can make progress.

Please let me know your thoughts on this issue.

Thanks
--Brahma Reddy Battula


Re: [HDFS-9038] Non-Dfs used Calculation

Posted by Francesco Giuliani <33...@gmail.com>.

----- Original message -----
From: "Brahma Reddy Battula" <br...@huawei.com>
Sent: 20/04/2016 12:58
To: "hdfs-dev@hadoop.apache.org" <hd...@hadoop.apache.org>; "Tsz Wo Sze" <sz...@yahoo.com>
Cc: "cnauroth@hortonworks.com" <cn...@hortonworks.com>; "vinayakumarb@apache.org" <vi...@apache.org>; "aagarwal@hortonworks.com" <aa...@hortonworks.com>
Subject: RE: [HDFS-9038] Non-Dfs used Calculation

>>> It is incorrect to subtract reserved from usage.getAvailable() above, since the reserved space, which is the space reserved for non-HDFS use, may already be occupied by some non-HDFS files and is not necessarily empty.
You are right. It's incorrect to subtract all of the reserved space. But when the actual non-DFS usage is less than reserved, we may need to deduct that usage from reserved and subtract only the remainder. If the non-DFS usage is more than reserved, nothing needs to be subtracted.

Maybe the confusion started with this description of 'dfs.datanode.du.reserved' in HDFS-5215:
"Reserved space in bytes per volume. Always leave this much space free for non dfs use."
Reserved space is for non-DFS files. HDFS should not use reserved space to store DFS files.

But if the reserved space is already used by non-DFS files, then HDFS need not account for it again in getAvailable().

Assuming the non-DFS figure shown in metrics is the unplanned non-DFS usage, i.e. extra usage beyond reserved, I hope the changes below are fine. If so, I will update the patch in HDFS-9038 accordingly.

---------
diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsVolumeImpl.java b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsVolumeImpl.java
index 0d060f9..451b258 100644
--- a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsVolumeImpl.java
+++ b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsVolumeImpl.java
@@ -383,7 +383,7 @@ public void setCapacityForTesting(long capacity) {
   @Override
   public long getAvailable() throws IOException {
     long remaining = getCapacity() - getDfsUsed() - reservedForReplicas.get();
-    long available = usage.getAvailable() - reserved
+    long available = usage.getAvailable() - getRemainingReserved()
         - reservedForReplicas.get();
     if (remaining > available) {
       remaining = available;
@@ -391,6 +391,31 @@ public long getAvailable() throws IOException {
     return (remaining > 0) ? remaining : 0;
   }

+  private long getActualNonDfsUsed() throws IOException {
+    return usage.getUsed() - getDfsUsed();
+  }
+
+  private long getRemainingReserved() throws IOException {
+    long actualNonDfsUsed = getActualNonDfsUsed();
+    if (actualNonDfsUsed < reserved) {
+      return reserved - actualNonDfsUsed;
+    }
+    return 0L;
+  }
+
+  /**
+   * Unplanned Non-DFS usage, i.e. Extra usage beyond reserved.
+   * @return
+   * @throws IOException
+   */
+  public long getNonDfsUsed() throws IOException {
+    long actualNonDfsUsed = getActualNonDfsUsed();
+    if (actualNonDfsUsed < reserved) {
+      return 0L;
+    }
+    return actualNonDfsUsed - reserved;
+  }
+
   @VisibleForTesting
   public long getReservedForReplicas() {
     return reservedForReplicas.get();
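
To make the effect of the helpers in the diff concrete, the following standalone sketch reimplements the three computations on plain numbers. It is an illustration under assumptions, not the FsVolumeImpl code: the class ReservedSpaceModel and its constructor parameters are hypothetical stand-ins for usage.getUsed(), getDfsUsed(), and reserved.

```java
final class ReservedSpaceModel {
  private final long used;      // stand-in for usage.getUsed(): all bytes used on the volume
  private final long dfsUsed;   // stand-in for getDfsUsed(): bytes used by HDFS
  private final long reserved;  // stand-in for the dfs.datanode.du.reserved value

  ReservedSpaceModel(long used, long dfsUsed, long reserved) {
    this.used = used;
    this.dfsUsed = dfsUsed;
    this.reserved = reserved;
  }

  /** Everything on the volume that is not HDFS data. */
  long actualNonDfsUsed() {
    return used - dfsUsed;
  }

  /** Part of the reserve that non-DFS files have not occupied yet, never negative. */
  long remainingReserved() {
    return Math.max(reserved - actualNonDfsUsed(), 0L);
  }

  /** Unplanned non-DFS usage: only the part that exceeds the reserve. */
  long nonDfsUsed() {
    return Math.max(actualNonDfsUsed() - reserved, 0L);
  }

  public static void main(String[] args) {
    // Non-DFS usage (40) stays within the reserve (100).
    ReservedSpaceModel within = new ReservedSpaceModel(340L, 300L, 100L);
    System.out.println(within.remainingReserved() + " " + within.nonDfsUsed()); // 60 0

    // Non-DFS usage (150) exceeds the reserve (100).
    ReservedSpaceModel beyond = new ReservedSpaceModel(450L, 300L, 100L);
    System.out.println(beyond.remainingReserved() + " " + beyond.nonDfsUsed()); // 0 50
  }
}
```

The Math.max calls are equivalent to the if/else branches in the diff, including the boundary case where the non-DFS usage equals the reserve and both values are zero.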


--Brahma Reddy Battula

-----Original Message-----
From: Ravi Prakash [mailto:ravihadoop@gmail.com] 
Sent: 16 April 2016 09:40
To: hdfs-dev; Tsz Wo Sze
Subject: Re: [HDFS-9038] Non-Dfs used Calculation

I meant HDFS-9530 instead of HDFS-9038. We are seeing an issue where none of the datanodes are available for writes, and my suspicion is that incorrect calculation of all these numbers is causing it.

On Fri, Apr 15, 2016 at 6:38 PM, Ravi Prakash <ra...@gmail.com> wrote:

> Hi Nicholas!
>
> Could you please point out exactly where you are seeing this
> {{available = usage.getAvailable() - reserved}} calculation? I'm sorry,
> I'm a bit confused because there are several places you could be
> talking about (in the patch / in the unpatched NN code / in the unpatched DN code).
>
> It seems to me the non-DFS used is only ever used to display a number
> on a UI, so I would prefer to resolve this sooner so that we can nail
> down more important issues, e.g. HDFS-9038.
>
> Thanks
> Ravi
>
> On Wed, Apr 13, 2016 at 12:06 AM, Tsz Wo Sze 
> <sz...@yahoo.com.invalid>
> wrote:
>
>> available = usage.getAvailable() - reserved
>>
>> It is incorrect to subtract reserved from usage.getAvailable() above,
>> since the reserved space, which is the space reserved for non-HDFS
>> use, may already be occupied by some non-HDFS files and is not necessarily empty.
>> In the pre-HDFS-5215 calculation, the non-DFS used is like "unplanned
>> non-DFS used", while the "planned non-DFS used" is the reserved space.
>> Tsz-Wo
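
The objection above can be illustrated with numbers. The sketch below is hypothetical: the class and method names are invented for illustration, and it leaves out the reservedForReplicas term and the comparison against remaining capacity that the real getAvailable() also applies. It compares subtracting the full reserve with subtracting only the part of the reserve that non-DFS files have not yet occupied.

```java
final class AvailableSpaceComparison {
  /** Formula under discussion: always subtract the full reserve from the free space. */
  static long subtractingFullReserve(long fsAvailable, long reserved) {
    return Math.max(fsAvailable - reserved, 0L);
  }

  /** Alternative: subtract only the part of the reserve not yet occupied by non-DFS files. */
  static long subtractingUnusedReserve(long fsAvailable, long reserved, long nonDfsUsed) {
    long unusedReserve = Math.max(reserved - nonDfsUsed, 0L);
    return Math.max(fsAvailable - unusedReserve, 0L);
  }

  public static void main(String[] args) {
    // Illustrative volume of 1000 units: 300 DFS used, 80 non-DFS used, reserve of 100.
    long fsAvailable = 1000L - 300L - 80L;  // 620 units actually free on disk

    // The full reserve is subtracted although 80 units of it are already occupied.
    System.out.println(subtractingFullReserve(fsAvailable, 100L));        // 520

    // Only the 20 unoccupied units of the reserve are held back.
    System.out.println(subtractingUnusedReserve(fsAvailable, 100L, 80L)); // 600
  }
}
```

In this example the first formula hides 80 units that HDFS could safely use, because the space already taken by non-DFS files is effectively counted twice.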

RE: [HDFS-9038] Non-Dfs used Calculation

Posted by Brahma Reddy Battula <br...@huawei.com>.

RE: [HDFS-9038] Non-Dfs used Calculation

Posted by Brahma Reddy Battula <br...@huawei.com>.
Reminder mail...


--Brahma Reddy Battula


RE: [HDFS-9038] Non-Dfs used Calculation

Posted by Brahma Reddy Battula <br...@huawei.com>.
Gentle Reminder!!!!


--Brahma Reddy Battula


Re: [HDFS-9038] Non-Dfs used Calculation

Posted by Ravi Prakash <ra...@gmail.com>.
I meant HDFS-9530 instead of HDFS-9038. We are seeing an issue where none
of the datanodes are available for writes, and my suspicion is that
incorrect calculation of all these numbers is causing it.

On Fri, Apr 15, 2016 at 6:38 PM, Ravi Prakash <ra...@gmail.com> wrote:

> Hi Nicholas!
>
> Could you please point out exactly where you are seeing this
> {{available = usage.getAvailable() - reserved}} calculation? I'm sorry, I'm
> a bit confused because there are several places you could be talking about
> (in the patch / in the unpatched NN code / in the unpatched DN code).
>
> It seems to me the non-DFS used is only ever used to display a number on a
> UI, so I would prefer to resolve this sooner so that we can nail down more
> important issues, e.g. HDFS-9038.
>
> Thanks
> Ravi
>
> On Wed, Apr 13, 2016 at 12:06 AM, Tsz Wo Sze <sz...@yahoo.com.invalid>
> wrote:
>
>> available = usage.getAvailable() - reserved
>>
>> It is incorrect to subtract reserved from usage.getAvailable() above, since
>> the reserved space, which is the space reserved for non-HDFS use, may
>> already be occupied by some non-HDFS files and is not necessarily empty space.
>> In the pre-HDFS-5215 calculation, non-DFS used is like "unplanned non-DFS
>> used", while the "planned non-DFS used" is the reserved space.
>> Tsz-Wo

Re: [HDFS-9038] Non-Dfs used Calculation

Posted by Ravi Prakash <ra...@gmail.com>.
Hi Nicholas!

Could you please point out exactly where you are seeing this
{{available = usage.getAvailable() - reserved}} calculation? I'm sorry, I'm
a bit confused because there are several places you could be talking about
(in the patch / in the unpatched NN code / in the unpatched DN code).

It seems to me that non-DFS used is only ever used to display a number on a
UI, so I would prefer to resolve this sooner so that we can nail down more
important issues, e.g. HDFS-9038.

Thanks
Ravi

On Wed, Apr 13, 2016 at 12:06 AM, Tsz Wo Sze <sz...@yahoo.com.invalid>
wrote:

> available = usage.getAvailable() - reserved
>
> It is incorrect to subtract reserved from usage.getAvailable() above, since
> the reserved space, which is the space reserved for non-HDFS use, may
> already be occupied by some non-HDFS files and is not necessarily empty space.
> In the pre-HDFS-5215 calculation, non-DFS used is like "unplanned non-DFS
> used", while the "planned non-DFS used" is the reserved space.
> Tsz-Wo

Re: [HDFS-9038] Non-Dfs used Calculation

Posted by Tsz Wo Sze <sz...@yahoo.com.INVALID>.
available = usage.getAvailable() - reserved

It is incorrect to subtract reserved from usage.getAvailable() above, since the reserved space, which is the space reserved for non-HDFS use, may already be occupied by some non-HDFS files and is not necessarily empty space.
In the pre-HDFS-5215 calculation, non-DFS used is like "unplanned non-DFS used", while the "planned non-DFS used" is the reserved space.
Tsz-Wo

 

    On Wednesday, April 13, 2016 2:35 PM, Brahma Reddy Battula <br...@huawei.com> wrote:
 
 

 Gentle Reminder!!


--Brahma Reddy Battula

From: Brahma Reddy Battula
Sent: 28 March 2016 12:26
To: hdfs-dev@hadoop.apache.org
Cc: 'aagarwal@hortonworks.com'; 'cnauroth@hortonworks.com'; 'vinayakumarb@apache.org'
Subject: [HDFS-9038] Non-Dfs used Calculation

Hi All,

Chris Nauroth, Arpit, Vinay and I have been discussing this calculation.

There is a disagreement on the definition of non-DFS used space, because of which the issue is not making progress.
Essentially, it's a question of whether this metric means "Raw Non-DFS Used" or "Unplanned Non-DFS Used".


Here is the summary of the conversation, by Arpit.

The pre HDFS-5215 calculation had two bugs.

 1. It incorrectly subtracted reserved space from the non-DFS used. (net negative). Chris suggests this is not really an issue as non-DFS used should be shown as zero unless it exceeds the DFS reserved value.

  2. It used File#getUsableSpace to calculate the volume free space instead of File#getFreeSpace. (net positive)

The net effect was that non-DFS used was displayed as zero unless the actual non-DFS used exceeded DFS reserved - system reserved.

HDFS-5215 fixed the first issue and the value that is now erroneously counted towards non-DFS used is in fact the system reserved 5%.

From testing, it was found that "Ext derivatives hold back 5% free space while XFS does not."
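
The gap between the two java.io.File calls named above can be observed directly. The following is only a sketch: the class and method names are hypothetical and are not taken from the HDFS code base, and whether any difference shows up depends on the file system and how it is configured.

```java
import java.io.File;

// Hypothetical helper, not HDFS code: shows the space a file system holds back.
final class HeldBackSpaceSketch {

  // Bytes that are unallocated on the volume but not usable by this JVM.
  // Clamping at zero is an assumption made for this sketch.
  static long heldBack(long freeSpace, long usableSpace) {
    return Math.max(freeSpace - usableSpace, 0L);
  }

  public static void main(String[] args) {
    File dir = new File(args.length > 0 ? args[0] : ".");
    long free = dir.getFreeSpace();     // all unallocated bytes on the partition
    long usable = dir.getUsableSpace(); // bytes this JVM is allowed to use
    System.out.println("held back by file system: " + heldBack(free, usable));
  }
}
```

On a volume where the file system reserves nothing, the printed value would be expected to be zero.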


Proposed calculation to report the exact Non-DFS Usage:

  non-DFS used = getCapacity() + reserved - getDfsUsed() - totalFreeSpace
              = usage.getCapacity() - reserved + reserved - getDfsUsed() - totalFreeSpace
              = usage.getCapacity() - getDfsUsed() - totalFreeSpace
              = File#getTotalSpace - getDfsUsed() - File#getFreeSpace
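
As a rough illustration, the last line of the derivation could be written as below. This is a sketch under assumptions, not the patch itself: the class and method names are hypothetical, dfsUsed has to be supplied by the caller, and the clamp at zero is an addition for the sketch rather than part of the proposal.

```java
import java.io.File;

// Hypothetical sketch of the proposed "raw" non-DFS used calculation.
final class RawNonDfsUsedSketch {

  // File#getTotalSpace - getDfsUsed() - File#getFreeSpace, clamped at zero.
  static long rawNonDfsUsed(long totalSpace, long dfsUsed, long freeSpace) {
    return Math.max(totalSpace - dfsUsed - freeSpace, 0L);
  }

  // Reads the two volume figures from java.io.File; dfsUsed is passed in
  // because only the DataNode's own accounting knows that number.
  static long rawNonDfsUsed(File volumeRoot, long dfsUsed) {
    return rawNonDfsUsed(volumeRoot.getTotalSpace(), dfsUsed, volumeRoot.getFreeSpace());
  }

  public static void main(String[] args) {
    File dir = new File(args.length > 0 ? args[0] : ".");
    // dfsUsed is assumed to be 0 here purely for demonstration.
    System.out.println("raw non-DFS used: " + rawNonDfsUsed(dir, 0L));
  }
}
```

Note that "reserved" does not appear anywhere in this form, which is exactly why it reports raw rather than unplanned usage.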

Chris Nauroth thinks we should subtract "dfs.datanode.du.reserved" from non-DFS used, because that makes it possible to monitor for unexpected non-zero non-DFS usage and react.
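
For comparison, the "unplanned" reading could be sketched as follows, assuming it means that the metric stays at zero until raw non-DFS usage exceeds the reserved value. The names are hypothetical, and duReserved stands for the configured dfs.datanode.du.reserved value in bytes.

```java
// Hypothetical sketch of the "unplanned non-DFS used" interpretation.
final class UnplannedNonDfsUsedSketch {

  // Zero while raw usage fits inside the reserved space, the excess otherwise.
  static long unplannedNonDfsUsed(long rawNonDfsUsed, long duReserved) {
    return Math.max(rawNonDfsUsed - duReserved, 0L);
  }

  public static void main(String[] args) {
    long duReserved = 100L; // illustrative value, not a recommended setting
    System.out.println(unplannedNonDfsUsed(80L, duReserved));  // within the reserve
    System.out.println(unplannedNonDfsUsed(120L, duReserved)); // 20 bytes over the reserve
  }
}
```

With this definition any non-zero value signals usage beyond what was planned for, which is the monitoring property described above.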

Akira has also given a "+0" on the proposed calculation.

We would like your input so that the issue can make progress.

Please let me know your thoughts on this issue.

Thanks
--Brahma Reddy Battula