Posted to common-issues@hadoop.apache.org by "Raju Bairishetti (JIRA)" <ji...@apache.org> on 2014/10/16 07:23:35 UTC

[jira] [Updated] (HADOOP-11205) ThrottledInputStream should return the actual bandwidth (read rate)

     [ https://issues.apache.org/jira/browse/HADOOP-11205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Raju Bairishetti updated HADOOP-11205:
--------------------------------------
    Description: 
Currently, ThrottledInputStream does not report the actual read rate. As a result, the stream spends most of its time in an idle state.

Behavior: Before reading a chunk of bytes (or a single byte) from the buffer, it first checks whether the current bandwidth (number of bytes per second) exceeds maxBandwidth. If the read rate exceeds the max bandwidth, it sleeps for 50 ms and resumes reading after the sleep period.
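
The check-then-sleep behavior described above can be sketched roughly as follows. This is a simplified, hypothetical stand-in rather than the actual Hadoop class: the name ThrottleSketch, the field names, and the getTotalSleepMs() accessor are assumptions made for illustration. Its getBytesPerSec() uses the same whole-second arithmetic as the method quoted in this description.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical sketch of the throttling behavior; not the Hadoop implementation. */
final class ThrottleSketch extends InputStream {
  private static final long SLEEP_DURATION_MS = 50; // sleep interval from the description

  private final InputStream in;
  private final long maxBytesPerSec;
  private final long startTime = System.currentTimeMillis();
  private long bytesRead = 0;
  private long totalSleepMs = 0;

  ThrottleSketch(InputStream in, long maxBytesPerSec) {
    this.in = in;
    this.maxBytesPerSec = maxBytesPerSec;
  }

  @Override
  public int read() throws IOException {
    throttle(); // the rate is checked before every read
    int b = in.read();
    if (b != -1) {
      bytesRead++;
    }
    return b;
  }

  /** Same whole-second arithmetic as the method quoted in the description. */
  long getBytesPerSec() {
    long elapsed = (System.currentTimeMillis() - startTime) / 1000;
    return elapsed == 0 ? bytesRead : bytesRead / elapsed;
  }

  long getTotalSleepMs() {
    return totalSleepMs;
  }

  /** Sleeps once for 50 ms if the reported rate exceeds the limit, then resumes. */
  private void throttle() throws IOException {
    if (getBytesPerSec() > maxBytesPerSec) {
      try {
        Thread.sleep(SLEEP_DURATION_MS);
        totalSleepMs += SLEEP_DURATION_MS;
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IOException("Interrupted while throttling", e);
      }
    }
  }

  public static void main(String[] args) throws IOException {
    // Read 10 bytes with a limit of 5 bytes/sec and report how long the stream slept.
    ThrottleSketch s = new ThrottleSketch(new ByteArrayInputStream(new byte[10]), 5);
    for (int i = 0; i < 10; i++) {
      s.read();
    }
    System.out.println("total sleep (ms): " + s.getTotalSleepMs());
  }
}
```

Because the limit is checked before each read, every read that follows a reported rate above the limit costs one 50 ms sleep, so an overestimated rate translates directly into idle time.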

Ex: Assume maxBandwidth = 1 MB/s and the actual read rate is also 1 MB/s (i.e. reading 1M bytes per second).

In this case, reading 1.5 MB in 1.5 sec does not exceed the max bandwidth, yet the stream still goes to sleep because it computes the read rate as 1.5 MB/s instead of 1 MB/s. The elapsed time of 1500 ms is truncated to 1 sec by the integer division 1500/1000, so the rate is computed as 1.5/1 rather than 1.5/1.5.

Example:
It does not go to sleep during the first second, because the number of bytes read in that elapsed time is less than maxBandwidth.
When it reads 1M + 1 byte/chunk, it checks the read rate against maxBandwidth.
When it reads 1M + 2 byte/chunk, it sleeps for 50 ms because the read rate is > 1 MB/s.
When it reads 1M + 3 byte/chunk, it again sleeps for 50 ms because the read rate is > 1 MB/s.
...
Even after reading 1.5 MB in 1.5 sec it still goes to sleep, because it computes the read rate as 1.5 MB/s (1.5/1, with 1500 ms truncated to 1 sec) instead of 1 MB/s (1.5/1.5).

Cons: it alternates between reading for about 1 sec and sleeping for almost 1 sec.

The getBytesPerSec() method does not return the actual bandwidth.
Current code: {code}
public long getBytesPerSec() {
    long elapsed = (System.currentTimeMillis() - startTime) / 1000;
    if (elapsed == 0) {
      return bytesRead;
    } else {
      return bytesRead / elapsed;
    }
  }
{code}
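
To make the truncation concrete, here is a small sketch that applies the same arithmetic as the method above to a reader running exactly at a 1,000,000 bytes/sec limit. The class name CurrentRateDemo and the choice to pass the elapsed milliseconds in as a parameter (instead of reading the clock) are assumptions made so the calculation can be checked in isolation.

```java
/** Hypothetical demo of the whole-second truncation; not part of Hadoop. */
final class CurrentRateDemo {

  /** Same arithmetic as the quoted getBytesPerSec(), with elapsed time passed in. */
  static long currentBytesPerSec(long bytesRead, long elapsedMillis) {
    long elapsed = elapsedMillis / 1000; // integer division drops the fractional second
    if (elapsed == 0) {
      return bytesRead;
    }
    return bytesRead / elapsed;
  }

  public static void main(String[] args) {
    long maxBytesPerSec = 1_000_000L;
    long[] elapsedMillis = {500, 1000, 1500, 1999, 2000};
    for (long ms : elapsedMillis) {
      // A reader that is exactly at the limit has read max * (ms / 1000) bytes so far.
      long bytesRead = maxBytesPerSec * ms / 1000;
      long reported = currentBytesPerSec(bytesRead, ms);
      System.out.printf("elapsed=%dms bytesRead=%d reported=%d throttled=%b%n",
          ms, bytesRead, reported, reported > maxBytesPerSec);
    }
  }
}
```

At 1500 ms and 1999 ms the reported rate is above the limit even though the reader never exceeded it, and only at 2000 ms does the value drop back to 1,000,000. That matches the alternating read/sleep pattern described above.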
We should fix the getBytesPerSec() method.


  was:
Currently, ThrottledInputStream does not report the actual read rate. As a result, the stream spends most of its time in an idle state.

Behavior: Before reading a chunk of bytes (or a single byte) from the buffer, it first checks whether the current bandwidth (number of bytes per second) exceeds maxBandwidth. If the read rate exceeds the max bandwidth, it sleeps for 50 ms and resumes reading after the sleep period.

Ex: Assume maxBandwidth = 1 MB/s and the actual read rate is also 1 MB/s (i.e. reading 1M bytes per second).

In this case, reading 1.5 MB in 1.5 sec does not exceed the max bandwidth, yet the stream still goes to sleep because it computes the read rate as 1.5 MB/s instead of 1 MB/s. The elapsed time of 1500 ms is truncated to 1 sec by the integer division 1500/1000, so the rate is computed as 1.5/1 rather than 1.5/1.5.

Example:
It does not go to sleep during the first second, because the number of bytes read in that elapsed time is less than maxBandwidth.
When it reads 1M + 1 byte/chunk, it checks the read rate against maxBandwidth.
When it reads 1M + 2 byte/chunk, it sleeps for 50 ms because the read rate is > 1 MB/s.
When it reads 1M + 3 byte/chunk, it again sleeps for 50 ms because the read rate is > 1 MB/s.
...
Even after reading 1.5 MB in 1.5 sec it still goes to sleep, because it computes the read rate as 1.5 MB/s (1.5/1, with 1500 ms truncated to 1 sec) instead of 1 MB/s (1.5/1.5).

Cons: it alternates between reading for about 1 sec and sleeping for almost 1 sec.

The getBytesPerSec() method does not return the actual bandwidth.
Current code: {code}
public long getBytesPerSec() {
    long elapsed = (System.currentTimeMillis() - startTime) / 1000;
    if (elapsed == 0) {
      return bytesRead;
    } else {
      return bytesRead / elapsed;
    }
  }
{code}
We should fix the getBytesPerSec() method:
{code}
public long getBytesPerSec() {
    long elapsedTimeInMilliSecs = System.currentTimeMillis() - startTime;
    if (elapsedTimeInMilliSecs <= MILLISECONDS_IN_SEC) {
      return bytesRead;
    } else {
      return (bytesRead * MILLISECONDS_IN_SEC)/ elapsedTimeInMilliSecs;
    }
  }
{code}
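
As a quick check of the proposed formula, the sketch below evaluates it for a reader running exactly at 1,000,000 bytes/sec. The class name ProposedRateDemo, the value 1000 for MILLISECONDS_IN_SEC, and passing the elapsed milliseconds in as a parameter are assumptions for illustration; the arithmetic itself follows the proposed method above.

```java
/** Hypothetical demo of the proposed millisecond-based rate; not part of Hadoop. */
final class ProposedRateDemo {
  // Assumed value; the constant is not defined in the snippet above.
  private static final long MILLISECONDS_IN_SEC = 1000L;

  /** Same arithmetic as the proposed getBytesPerSec(), with elapsed time passed in. */
  static long proposedBytesPerSec(long bytesRead, long elapsedTimeInMilliSecs) {
    if (elapsedTimeInMilliSecs <= MILLISECONDS_IN_SEC) {
      return bytesRead; // within the first second, report the raw byte count
    }
    // Scale to bytes per second without truncating the elapsed time to whole seconds.
    return (bytesRead * MILLISECONDS_IN_SEC) / elapsedTimeInMilliSecs;
  }

  public static void main(String[] args) {
    long maxBytesPerSec = 1_000_000L;
    long[] elapsedMillis = {500, 1000, 1500, 1999, 2000};
    for (long ms : elapsedMillis) {
      long bytesRead = maxBytesPerSec * ms / 1000; // reader exactly at the limit
      long reported = proposedBytesPerSec(bytesRead, ms);
      System.out.printf("elapsed=%dms bytesRead=%d reported=%d throttled=%b%n",
          ms, bytesRead, reported, reported > maxBytesPerSec);
    }
  }
}
```

With this arithmetic, 1,500,000 bytes in 1500 ms is reported as 1,000,000 bytes/sec, so a reader at the limit is not throttled. One caveat worth noting: bytesRead * MILLISECONDS_IN_SEC is computed in a long, so it could overflow for extremely large byte counts.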



> ThrottledInputStream should return the actual bandwidth (read rate)
> -------------------------------------------------------------------
>
>                 Key: HADOOP-11205
>                 URL: https://issues.apache.org/jira/browse/HADOOP-11205
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: tools/distcp
>            Reporter: Raju Bairishetti
>            Assignee: Raju Bairishetti



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)