Posted to issues@hbase.apache.org by "Todd Lipcon (JIRA)" <ji...@apache.org> on 2011/08/17 19:20:27 UTC

[jira] [Created] (HBASE-4215) RS requestsPerSecond counter seems to be off

RS requestsPerSecond counter seems to be off
--------------------------------------------

                 Key: HBASE-4215
                 URL: https://issues.apache.org/jira/browse/HBASE-4215
             Project: HBase
          Issue Type: Bug
          Components: metrics
    Affects Versions: 0.92.0
            Reporter: Todd Lipcon
            Priority: Critical
             Fix For: 0.92.0


In testing trunk, I had YCSB reporting some 40,000 requests/second, but the summary info on the master webpage was consistently indicating somewhere around 3x that. I'm guessing that we may have a bug where we forgot to divide by time.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HBASE-4215) RS requestsPerSecond counter seems to be off

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-4215:
-------------------------

      Resolution: Fixed
    Hadoop Flags: [Reviewed]
          Status: Resolved  (was: Patch Available)

Thank you for the patch Subramanian.  Nice work.  Thanks too for the trail through your thinking with color.  Took a while for me to snake through it but it for sure helped me when reviewing your patch.

Committed to TRUNK.

> RS requestsPerSecond counter seems to be off
> --------------------------------------------
>
>                 Key: HBASE-4215
>                 URL: https://issues.apache.org/jira/browse/HBASE-4215
>             Project: HBase
>          Issue Type: Bug
>          Components: metrics
>    Affects Versions: 0.92.0
>            Reporter: Todd Lipcon
>            Assignee: subramanian raghunathan
>            Priority: Critical
>             Fix For: 0.92.0
>
>         Attachments: HBASE-4215_trunk.patch
>
>
> In testing trunk, I had YCSB reporting some 40,000 requests/second, but the summary info on the master webpage was consistently indicating somewhere around 3x that. I'm guessing that we may have a bug where we forgot to divide by time.


        

[jira] [Commented] (HBASE-4215) RS requestsPerSecond counter seems to be off

Posted by "subramanian raghunathan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13086939#comment-13086939 ] 

subramanian raghunathan commented on HBASE-4215:
------------------------------------------------

As part of the fix for HBASE-3807, we renamed the request attribute to requestsPerSecond in both the RegionServer (RegionServerMetrics) and the Master (HServerLoad).

RegionServerMetrics calculates the value from MetricsRate.

The following code does the calculation:
{code}
    long now = System.currentTimeMillis();
    long diff = (now-ts)/1000;
    if (diff == 0) diff = 1; // sigh this is crap.
    this.prevRate = (float)value / diff;
{code}    
    {color:red}this.prevRate = (float)value / diff;{color}
prevRate is what is finally displayed as "requestsPerSecond", as per the change in HBASE-3807.
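
A minimal sketch of the same kind of rate calculation, for illustration only. The names RateSketch and ratePerSecond are hypothetical and are not part of HBase. The sketch divides by the elapsed time in fractional seconds rather than truncating to whole seconds, which is one possible way to avoid the diff == 0 special case.

```java
// Hypothetical helper, not HBase code: computes a per-second rate from a raw count.
final class RateSketch {
    /**
     * @param count  number of events observed since the last sample
     * @param lastMs timestamp of the last sample, in milliseconds
     * @param nowMs  current timestamp, in milliseconds
     * @return events per second over the window
     */
    static float ratePerSecond(long count, long lastMs, long nowMs) {
        // Clamp to 1 ms so that a same-millisecond sample cannot divide by zero.
        long elapsedMs = Math.max(1L, nowMs - lastMs);
        return count * 1000f / elapsedMs;
    }

    public static void main(String[] args) {
        // 120,000 requests over a 3-second window works out to 40,000 requests/second.
        System.out.println(ratePerSecond(120_000L, 0L, 3_000L));
    }
}
```

With the whole-second truncation in the quoted code, a 2,500 ms window is treated as 2 seconds, so 5 requests would be reported as 2.5 per second; the sketch above reports 2.0.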

But in the master, the same value is taken from HServerLoad.

HRegionServer.buildServerLoad():
{code}
new HServerLoad(requestCount.get(),
      (int)(memory.getUsed() / 1024 / 1024),
      (int) (memory.getMax() / 1024 / 1024), regionLoads)
{code}

The request counter is defined in HRegionServer:
{code}      
  // Request counter.
  // Do we need this?  Can't we just sum region counters?  St.Ack 20110412
  private AtomicInteger requestCount = new AtomicInteger();
{code}  
The value is obtained from this request counter, which is incremented in every API of HRegionServer.

{color:red}This is not calculated per second; it represents the raw request count.{color}

Yet on the master page we still claim {color:green}"Load is requests per second and count of regions loaded."{color}
This is what prompted me to change the convention from requests to requestsPerSecond.

{color:green}Ideally, the fix should calculate requestsPerSecond at the region server,
initialize HServerLoad with that value, and display the same value in the master.{color}
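
The idea above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual patch: the class RequestRateReporter and its methods are hypothetical names, and the sketch assumes the region server samples its counter once per report to the master.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch, not HBase code: the region server turns its raw request
// counter into a rate before reporting, so the master only ever sees a rate.
final class RequestRateReporter {
    private final AtomicLong requestCount = new AtomicLong();
    private long lastCount = 0L;
    private long lastReportMs;

    RequestRateReporter(long startMs) {
        this.lastReportMs = startMs;
    }

    /** Called once per handled request. */
    void increment() {
        requestCount.incrementAndGet();
    }

    /** Returns requests per second since the previous call; meant to be called once per report. */
    synchronized float sampleRequestsPerSecond(long nowMs) {
        long current = requestCount.get();
        long delta = current - lastCount;
        // Clamp to 1 ms so that two samples in the same millisecond cannot divide by zero.
        long elapsedMs = Math.max(1L, nowMs - lastReportMs);
        lastCount = current;
        lastReportMs = nowMs;
        return delta * 1000f / elapsedMs;
    }
}
```

The sampled value would then be what is passed into the load object sent to the master, in place of the raw counter.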



Region Servers
 Address Start Code Load 
linux-kxjl:60030 1313659887824linux-kxjl,60020,1313659887824 requestsPerSecond=0, numberOfOnlineRegions=2, usedHeapMB=26, maxHeapMB=995 
Total:  servers: 1   requests=0, regions=2 

Load is requests per second and count of regions loaded

Also, it would be better to change the aggregation details to the new convention as well:
{color:red} requests=0, regions=2{color}
to 
{color:green} requestsPerSecond=0, numberOfOnlineRegions=2{color}    

If this looks fine, I can provide a patch for it.

> RS requestsPerSecond counter seems to be off
> --------------------------------------------
>
>                 Key: HBASE-4215
>                 URL: https://issues.apache.org/jira/browse/HBASE-4215
>             Project: HBase
>          Issue Type: Bug
>          Components: metrics
>    Affects Versions: 0.92.0
>            Reporter: Todd Lipcon
>            Assignee: ramkrishna.s.vasudevan
>            Priority: Critical
>             Fix For: 0.92.0
>
>
> In testing trunk, I had YCSB reporting some 40,000 requests/second, but the summary info on the master webpage was consistently indicating somewhere around 3x that. I'm guessing that we may have a bug where we forgot to divide by time.


        

[jira] [Commented] (HBASE-4215) RS requestsPerSecond counter seems to be off

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13092042#comment-13092042 ] 

Hudson commented on HBASE-4215:
-------------------------------

Integrated in HBase-TRUNK #2147 (See [https://builds.apache.org/job/HBase-TRUNK/2147/])
    HBASE-4215 RS requestsPerSecond counter seems to be off
HBASE-4215 RS requestsPerSecond counter seems to be off

stack : 
Files : 
* /hbase/trunk/CHANGES.txt

stack : 
Files : 
* /hbase/trunk/CHANGES.txt
* /hbase/trunk/src/main/jamon/org/apache/hbase/tmpl/master/MasterStatusTmpl.jamon
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/HServerLoad.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/metrics/RegionServerMetrics.java



        

[jira] [Updated] (HBASE-4215) RS requestsPerSecond counter seems to be off

Posted by "subramanian raghunathan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

subramanian raghunathan updated HBASE-4215:
-------------------------------------------

    Attachment: HBASE-4215_trunk.patch


        

[jira] [Assigned] (HBASE-4215) RS requestsPerSecond counter seems to be off

Posted by "ramkrishna.s.vasudevan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ramkrishna.s.vasudevan reassigned HBASE-4215:
---------------------------------------------

    Assignee: subramanian raghunathan  (was: ramkrishna.s.vasudevan)

> RS requestsPerSecond counter seems to be off
> --------------------------------------------
>
>                 Key: HBASE-4215
>                 URL: https://issues.apache.org/jira/browse/HBASE-4215
>             Project: HBase
>          Issue Type: Bug
>          Components: metrics
>    Affects Versions: 0.92.0
>            Reporter: Todd Lipcon
>            Assignee: subramanian raghunathan
>            Priority: Critical
>             Fix For: 0.92.0
>
>
> In testing trunk, I had YCSB reporting some 40,000 requests/second, but the summary info on the master webpage was consistently indicating somewhere around 3x that. I'm guessing that we may have a bug where we forgot to divide by time.


        

[jira] [Commented] (HBASE-4215) RS requestsPerSecond counter seems to be off

Posted by "subramanian raghunathan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13088577#comment-13088577 ] 

subramanian raghunathan commented on HBASE-4215:
------------------------------------------------

1) Modify the requestsPerSecond value to an integer rather than a float, since the value is used in several VOs, such as:
   1) MasterMetrics metrics
   2) AClusterStatus
   3) StorageClusterStatusModel
   4) AServerLoad etc.

Also, to maintain consistency, both RegionServerMetrics and HServerLoad would use integer values for requestsPerSecond.

Advantages: Consistency throughout the system. Changes are minimal.
Disadvantages: Precision loss; also, an integer can hold a smaller range of values than a float.

2) HServerLoad.getNumberOfRequests() is used in multiple places.

{color:green}StorageClusterStatusModel - expects requests per second
ClusterStatus - expects requests per second
MasterStatusTmplImpl - the UI part also expects requests per second{color}

{color:red}AvroUtil - Not sure {color} 

{color:red}
HMaster 
public void regionServerReport(final byte [] sn, final HServerLoad hsl)
  throws IOException {
    this.serverManager.regionServerReport(new ServerName(sn), hsl);
    if (hsl != null && this.metrics != null) {
      // Up our metrics.
      this.metrics.incrementRequests(hsl.getTotalNumberOfRequests());
    }
  }
The metric expects the total number of requests across all the region servers, and it computes the per-second value of that aggregate.{color}

So the suggested change is to keep the existing attribute as requestsPerSecond and to introduce an additional attribute, totalNumberOfRequests, that reports the total number of requests made to the RS. The new attribute will be used by the master metrics.
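
A rough sketch of what a load object with both attributes could look like. ServerLoadSketch is a hypothetical stand-in and not the real HServerLoad; only the getter names getNumberOfRequests() and getTotalNumberOfRequests() are taken from the snippets above, and the integer types follow the proposal in point 1.

```java
import java.util.List;

// Hypothetical value object, not the real HServerLoad: carries both the rate
// shown in the UI and the raw total consumed by the master metrics.
final class ServerLoadSketch {
    private final int requestsPerSecond;
    private final int totalNumberOfRequests;

    ServerLoadSketch(int requestsPerSecond, int totalNumberOfRequests) {
        this.requestsPerSecond = requestsPerSecond;
        this.totalNumberOfRequests = totalNumberOfRequests;
    }

    /** Rate for callers that expect requests per second (UI, cluster status). */
    int getNumberOfRequests() {
        return requestsPerSecond;
    }

    /** Raw count for callers that compute their own rate (master metrics). */
    int getTotalNumberOfRequests() {
        return totalNumberOfRequests;
    }

    /** Sums the raw totals reported by all region servers. */
    static long sumTotals(List<ServerLoadSketch> loads) {
        long sum = 0L;
        for (ServerLoadSketch load : loads) {
            sum += load.getTotalNumberOfRequests();
        }
        return sum;
    }

    /** Sums the per-server rates, e.g. for a cluster-wide "Total" row. */
    static int sumRates(List<ServerLoadSketch> loads) {
        int sum = 0;
        for (ServerLoadSketch load : loads) {
            sum += load.getNumberOfRequests();
        }
        return sum;
    }
}
```

Keeping the two values separate means the master metrics can keep deriving their own per-second figure from the totals, while the UI reads the rate directly.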

Will upload the patch ASAP 


        

[jira] [Commented] (HBASE-4215) RS requestsPerSecond counter seems to be off

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13087313#comment-13087313 ] 

stack commented on HBASE-4215:
------------------------------

+1 on changing HSL to do float.


        

[jira] [Commented] (HBASE-4215) RS requestsPerSecond counter seems to be off

Posted by "subramanian raghunathan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13087014#comment-13087014 ] 

subramanian raghunathan commented on HBASE-4215:
------------------------------------------------

Further to this understanding, I was planning to unify the way "requestsPerSecond" is derived, so that HRegionServer uses the RegionServerMetrics API (i.e. RegionServerMetrics.getRequests()) to initialise HServerLoad. One disparity I found is that metricsState uses a float value whereas HServerLoad uses an integer value to represent the same thing. It would be better to unify the data type across them so that the master and region server UIs are in sync.

Please suggest which of the two should be used for unification. I personally feel float is better, but it seems to require more changes.

Based on that, I will provide the patch tomorrow.


        

[jira] [Commented] (HBASE-4215) RS requestsPerSecond counter seems to be off

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13087314#comment-13087314 ] 

stack commented on HBASE-4215:
------------------------------

There should be no migration issue since these are transient objects.


        

[jira] [Assigned] (HBASE-4215) RS requestsPerSecond counter seems to be off

Posted by "ramkrishna.s.vasudevan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ramkrishna.s.vasudevan reassigned HBASE-4215:
---------------------------------------------

    Assignee: ramkrishna.s.vasudevan


        

[jira] [Updated] (HBASE-4215) RS requestsPerSecond counter seems to be off

Posted by "subramanian raghunathan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

subramanian raghunathan updated HBASE-4215:
-------------------------------------------

    Status: Patch Available  (was: Open)
