Posted to user@hbase.apache.org by Himanish Kushary <hi...@gmail.com> on 2011/05/16 19:33:29 UTC

Performance degrades on moving from desktop to blade environment

Hi,

We are in the process of moving a small HBase/Hadoop cluster from our
development to production environment. Our development environment was a few
Intel desktops (8-core CPUs/8 GB RAM/7200 rpm disks) running CentOS, while
the production environment is AMD-based blades (24-core CPUs/32 GB
RAM/15000 rpm disks), also running CentOS.

Strangely, HBase performance seems to degrade after moving things to the
production environment (supposedly the one with more horsepower). We are using
the latest default installation of the Cloudera distribution of Hadoop and
HBase. No changes to memory or other parameters were made in either environment.

Any idea what could cause this? Could the AMD architecture be the cause?
Pointers to things to look for to improve performance in the production
cluster would be really appreciated.

Note: We ran "count" from the hbase shell on a huge table and found the desktops
to be performing much better. We are currently comparing Map-Reduce jobs as well.

---------------------------
Thanks & Regards
Himanish

Re: Performance degrades on moving from desktop to blade environment

Posted by Stack <st...@duboce.net>.
On Mon, May 16, 2011 at 10:33 AM, Himanish Kushary <hi...@gmail.com> wrote:
> Hi,
>
> We are in the process of moving a small HBase/Hadoop cluster from our
> development to production environment. Our development environment was a few
> Intel desktops (8-core CPUs/8 GB RAM/7200 rpm disks) running CentOS, while
> the production environment is AMD-based blades (24-core CPUs/32 GB
> RAM/15000 rpm disks), also running CentOS.
>

Nice.


> Strangely, HBase performance seems to degrade after moving things to the
> production environment (supposedly the one with more horsepower). We are using
> the latest default installation of the Cloudera distribution of Hadoop and
> HBase. No changes to memory or other parameters were made in either environment.
>
> Any idea what could cause this? Could the AMD architecture be the
> cause? Pointers to things to look for to improve performance in the
> production cluster would be really appreciated.
>
> Note: We ran "count" from the hbase shell on a huge table and found the desktops
> to be performing much better. We are currently comparing Map-Reduce jobs as well.
>

Recheck the required configurations --
http://hbase.apache.org/book/notsoquick.html#requirements -- and I'm
sure you've seen the perf section:
http://hbase.apache.org/book/performance.html

Otherwise, can you check the systems?  Perhaps there is a badly
configured network driver or disk controller on the new hardware?  Do
some basic sanity checks that the blades are working as advertised.

Good luck,
St.Ack
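
Concrete starting points for those sanity checks might look like the
following (a sketch, assuming Linux/CentOS; the NIC name eth0 and the 256 MB
test size are assumptions to adapt):

```shell
#!/bin/sh
# Rough head-to-head checks: run the same script on a desktop and a blade
# and compare the numbers.

# Sequential disk write throughput (grow count toward 1GB for a real test).
dd if=/dev/zero of=/tmp/ddtest bs=1M count=256 conv=fdatasync 2>&1 | tail -n 1
rm -f /tmp/ddtest

# Negotiated NIC speed/duplex -- a blade stuck at 100Mb/half is a classic culprit.
if command -v ethtool >/dev/null 2>&1; then
  ethtool eth0 2>/dev/null | grep -E 'Speed|Duplex' || true
fi

# Effective CPU clocks -- frequency scaling can leave cores below rated speed.
grep 'cpu MHz' /proc/cpuinfo | sort -u || true
```

Running it on both machines and diffing the output is usually enough to spot
a misbehaving disk, NIC, or clock.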

Re: Performance degrades on moving from desktop to blade environment

Posted by Jack Levin <ma...@gmail.com>.
We had issues moving onto a 32-core AMD box as well. The problem was that
the datanode would get slow after about 12 hours. What you need to do is
check the fsreadlatency_ave_time graph: if it looks spiky, you have an IO
problem. Next, get a graph of "Runnable Threads" -- it should be flat; if
it is spiking, you may have IO/memory contention. Also run RAM tests head
to head; using sysbench, we found our 32-core box's ECC RAM would
consistently underperform by a large margin.

-Jack
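
For reference, a head-to-head sysbench memory run of the kind described
above could look like this (a sketch; 2011-era sysbench 0.4.x option syntax,
and the CentOS package name is an assumption):

```shell
#!/bin/sh
# RAM bandwidth head to head: run the same command on the desktop and the
# blade and compare the "transferred (.../sec)" lines.
if command -v sysbench >/dev/null 2>&1; then
  result=$(sysbench --test=memory --memory-block-size=1M \
                    --memory-total-size=4G run 2>&1 || echo "sysbench run failed")
  echo "$result" | grep -E 'transferred|total time' || echo "$result"
else
  result="sysbench not installed (on CentOS: yum install sysbench)"
  echo "$result"
fi
```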

On Mon, May 16, 2011 at 10:33 AM, Himanish Kushary <hi...@gmail.com> wrote:
> Hi,
>
> We are in the process of moving a small HBase/Hadoop cluster from our
> development to production environment. Our development environment was a few
> Intel desktops (8-core CPUs/8 GB RAM/7200 rpm disks) running CentOS, while
> the production environment is AMD-based blades (24-core CPUs/32 GB
> RAM/15000 rpm disks), also running CentOS.
>
> Strangely, HBase performance seems to degrade after moving things to the
> production environment (supposedly the one with more horsepower). We are using
> the latest default installation of the Cloudera distribution of Hadoop and
> HBase. No changes to memory or other parameters were made in either environment.
>
> Any idea what could cause this? Could the AMD architecture be the
> cause? Pointers to things to look for to improve performance in the
> production cluster would be really appreciated.
>
> Note: We ran "count" from the hbase shell on a huge table and found the desktops
> to be performing much better. We are currently comparing Map-Reduce jobs as well.
>
> ---------------------------
> Thanks & Regards
> Himanish
>

Re: Performance degrades on moving from desktop to blade environment

Posted by tsuna <ts...@gmail.com>.
On Thu, May 19, 2011 at 11:50 AM, Jack Levin <ma...@gmail.com> wrote:
> Himanish, it's hard to say without trend graphs. Set up Ganglia and get
> fsreadlatency as well as thread-count graphs to see what the issue
> might be.

You might want to set up OpenTSDB instead of Ganglia; it gives more
fine-grained detail as well as more system metrics out of the box.
</shamelessplug>

-- 
Benoit "tsuna" Sigoure
Software Engineer @ www.StumbleUpon.com

Re: Performance degrades on moving from desktop to blade environment

Posted by Jack Levin <ma...@gmail.com>.
Himanish, it's hard to say without trend graphs. Set up Ganglia and get
fsreadlatency as well as thread-count graphs to see what the issue
might be.

-Jack

On Thu, May 19, 2011 at 11:46 AM, Himanish Kushary <hi...@gmail.com> wrote:
> Hi,
>
> Could anybody suggest what the issue may be? I ran YCSB on both the
> development and production servers.
>
> Loading data performs better on the production cluster, but the 50%
> read/50% write "workloada" performs better on development. The average
> read latency shoots up to 30-40 ms on production; on development it is
> between 10-20 ms. This was while running 10 threads targeting 1000 tps
> with this command - [*java -cp build/ycsb.jar:db/hbase/conf:db/hbase/lib/*
> com.yahoo.ycsb.Client -t -db com.yahoo.ycsb.db.HBaseClient -P
> workloads/workloada -p columnfamily=data -p operationcount=1000000 -s
> -threads 10 -target 1000*]
>
> The clusters perform similarly under YCSB when the target tps and
> operationcount are lowered to 500 and 100000 respectively.
>
> We ran our Map-Reduce jobs on the two clusters (assuming we would not reach
> 1000 tps or that operation count from the map-reduce), but strangely
> the development cluster again performed better.
>
> Any suggestions would be really helpful.
>
> Thanks
> Himanish
>
>
>
> On Mon, May 16, 2011 at 4:43 PM, Himanish Kushary <hi...@gmail.com>wrote:
>
>> *PRODUCTION SERVER CPU INFO*
>> processor : 0
>> vendor_id : AuthenticAMD
>> cpu family : 16
>> model : 9
>> model name : AMD Opteron(tm) Processor 6174
>> stepping : 1
>> cpu MHz : 2200.022
>> cache size : 512 KB
>> physical id : 1
>> siblings : 12
>> core id : 0
>> cpu cores : 12
>> apicid : 16
>> fpu : yes
>> fpu_exception : yes
>> cpuid level : 5
>> wp : yes
>> flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat
>> pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp
>> lm 3dnowext 3dnow constant_tsc nonstop_tsc pni cx16 popcnt lahf_lm
>> cmp_legacy svm extapic cr8_legacy altmovcr8 abm sse4a misalignsse
>> 3dnowprefetch osvw
>> bogomips : 4400.03
>> TLB size : 1024 4K pages
>> clflush size : 64
>> cache_alignment : 64
>> address sizes : 48 bits physical, 48 bits virtual
>> power management: ts ttp tm stc 100mhzsteps hwpstate [8]
>>
>>
>> *DEVELOPMENT SERVER CPU INFO*
>>
>> processor : 0
>> vendor_id : GenuineIntel
>> cpu family : 6
>> model : 30
>> model name : Intel(R) Core(TM) i7 CPU       Q 740  @ 1.73GHz
>> stepping : 5
>> cpu MHz : 933.000
>> cache size : 6144 KB
>> physical id : 0
>> siblings : 8
>> core id : 0
>> cpu cores : 4
>> apicid : 0
>> initial apicid : 0
>> fpu : yes
>> fpu_exception : yes
>> cpuid level : 11
>> wp : yes
>> flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat
>> pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm
>> constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc aperfmperf
>> pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm sse4_1 sse4_2
>> popcnt lahf_lm ida tpr_shadow vnmi flexpriority ept vpid
>> bogomips : 3457.61
>> clflush size : 64
>> cache_alignment : 64
>> address sizes : 36 bits physical, 48 bits virtual
>> power management:
>>
>>
>>
>> On Mon, May 16, 2011 at 4:26 PM, Jack Levin <ma...@gmail.com> wrote:
>>
>>> What is the clock rate of your CPUs (desktop vs blade)?
>>>
>>> -Jack
>>>
>>> On Mon, May 16, 2011 at 1:24 PM, Himanish Kushary <hi...@gmail.com>
>>> wrote:
>>> > Yes, it is only the HW that was changed. All the configurations are
>>> > kept at default from the Cloudera installer.
>>> >
>>> > The regionserver logs seem OK.
>>> >
>>> > On Mon, May 16, 2011 at 3:20 PM, Jean-Daniel Cryans <
>>> jdcryans@apache.org>wrote:
>>> >
>>> >> Ok I see... so the only thing that changed is the HW right? No
>>> >> upgrades to a new version? Also could it be possible that you changed
>>> >> some configs (or missed them)? BTW counting has a parameter for
>>> >> scanner caching, like you would write: count "myTable", CACHE = 1000
>>> >>
>>> >> and it should stream through your data.
>>> >>
>>> >> Anything weird in the region server logs?
>>> >>
>>> >> J-D
>>> >>
>>> >> On Mon, May 16, 2011 at 12:13 PM, Himanish Kushary <himanish@gmail.com
>>> >
>>> >> wrote:
>>> >> > Thanks for the reply. We ran the TestDFSIO benchmark on both the
>>> >> > development and production servers and found production to be better.
>>> >> > The statistics are shown below.
>>> >> >
>>> >> > But once we bring HBase into the picture, things get reversed :-(
>>> >> >
>>> >> > The count operation, map-reduces, etc. perform worse on the
>>> >> > production box. We are using pseudo-distributed mode on both the
>>> >> > development and production servers, for both Hadoop and HBase.
>>> >> >
>>> >> > *DEVELOPMENT SERVER*
>>> >> >
>>> >> > 11/05/15 21:26:26 INFO fs.TestDFSIO: ----- TestDFSIO ----- : write
>>> >> > 11/05/15 21:26:26 INFO fs.TestDFSIO:            Date & time: Sun May
>>> 15
>>> >> > 21:26:26 EDT 2011
>>> >> > 11/05/15 21:26:26 INFO fs.TestDFSIO:        Number of files: 10
>>> >> > 11/05/15 21:26:26 INFO fs.TestDFSIO: Total MBytes processed: 10000
>>> >> > 11/05/15 21:26:26 INFO fs.TestDFSIO:      Throughput mb/sec:
>>> >> > 58.09495038691237
>>> >> > 11/05/15 21:26:26 INFO fs.TestDFSIO: Average IO rate mb/sec:
>>> >> > 59.699485778808594
>>> >> > 11/05/15 21:26:26 INFO fs.TestDFSIO:  IO rate std deviation:
>>> >> > 10.54547265175703
>>> >> > 11/05/15 21:26:26 INFO fs.TestDFSIO:     Test exec time sec: 163.354
>>> >> > 11/05/15 21:26:26 INFO fs.TestDFSIO:
>>> >> >
>>> >> > 11/05/15 21:28:44 INFO fs.TestDFSIO: ----- TestDFSIO ----- : read
>>> >> > 11/05/15 21:28:44 INFO fs.TestDFSIO:            Date & time: Sun May
>>> 15
>>> >> > 21:28:44 EDT 2011
>>> >> > 11/05/15 21:28:44 INFO fs.TestDFSIO:        Number of files: 10
>>> >> > 11/05/15 21:28:44 INFO fs.TestDFSIO: Total MBytes processed: 10000
>>> >> > 11/05/15 21:28:44 INFO fs.TestDFSIO:      Throughput mb/sec:
>>> >> > 682.4075337791729
>>> >> > 11/05/15 21:28:44 INFO fs.TestDFSIO: Average IO rate mb/sec:
>>> >> > 755.5845947265625
>>> >> > 11/05/15 21:28:44 INFO fs.TestDFSIO:  IO rate std deviation:
>>> >> > 229.60029445080488
>>> >> > 11/05/15 21:28:44 INFO fs.TestDFSIO:     Test exec time sec: 63.896
>>> >> > 11/05/15 21:28:44 INFO fs.TestDFSIO:
>>> >> >
>>> >> >
>>> >> >
>>> >> >
>>> >> > *PRODUCTION SERVER*
>>> >> >
>>> >> > 5/16 01:00:43 INFO fs.TestDFSIO: ----- TestDFSIO ----- : *WRITE
>>> >> PERFORMANCE*
>>> >> >
>>> >> > 11/05/16 01:00:43 INFO fs.TestDFSIO: Date & time: Mon May 16 01:00:43
>>> >> > GMT+00:00 2011
>>> >> >
>>> >> > 11/05/16 01:00:43 INFO fs.TestDFSIO: Number of files: 10
>>> >> >
>>> >> > 11/05/16 01:00:43 INFO fs.TestDFSIO: Total MBytes processed: 10000
>>> >> >
>>> >> > 11/05/16 01:00:43 INFO fs.TestDFSIO: Throughput mb/sec:
>>> 69.25447557048375
>>> >> >
>>> >> > 11/05/16 01:00:43 INFO fs.TestDFSIO: Average IO rate mb/sec:
>>> >> > 70.06581115722656
>>> >> >
>>> >> > 11/05/16 01:00:43 INFO fs.TestDFSIO: IO rate std deviation:
>>> >> > 7.243961483443693
>>> >> >
>>> >> > 11/05/16 01:00:43 INFO fs.TestDFSIO: Test exec time sec: 126.896
>>> >> >
>>> >> >
>>> >> > 5/16 01:25:01 INFO fs.TestDFSIO: ----- TestDFSIO ----- : *READ
>>> >> PERFORMANCE*
>>> >> >
>>> >> > 11/05/16 01:25:01 INFO fs.TestDFSIO: Date & time: Mon May 16 01:25:01
>>> >> > GMT+00:00 2011
>>> >> >
>>> >> > 11/05/16 01:25:01 INFO fs.TestDFSIO: Number of files: 10
>>> >> >
>>> >> > 11/05/16 01:25:01 INFO fs.TestDFSIO: Total MBytes processed: 10000
>>> >> >
>>> >> > 11/05/16 01:25:01 INFO fs.TestDFSIO: Throughput mb/sec:
>>> 1487.20999405116
>>> >> >
>>> >> > 11/05/16 01:25:01 INFO fs.TestDFSIO: Average IO rate mb/sec:
>>> >> > 1525.230712890625
>>> >> >
>>> >> > 11/05/16 01:25:01 INFO fs.TestDFSIO: IO rate std deviation:
>>> >> > 239.54492784268226
>>> >> >
>>> >>
>>> >
>>> >
>>> >
>>> > --
>>> > Thanks & Regards
>>> > Himanish
>>> >
>>>
>>
>>
>>
>> --
>> Thanks & Regards
>> Himanish
>>
>
>
>
> --
> Thanks & Regards
> Himanish
>

RE: Performance degrades on moving from desktop to blade environment

Posted by "Rottinghuis, Joep" <jr...@ebay.com>.
Hi Himanish,

This is a phenomenon I've seen before, though not in the context of HBase.
We had web-service calls with sub-second response times on desktops. When we moved to a blade environment we saw spikes of up to 20 seconds.
In that case it turned out that we had verbose logging turned on. The dev desktop had reasonable disk performance, but the disks in the blade servers had terrible write performance. The ops guys mentioned that those disks were meant for the OS, which was expected to do few writes.

This could be completely unrelated, but it is what we saw.
You may want to compare IOPS in both environments.

Also, are the memory settings the same on both? (A larger heap in prod could result in more GC overhead.)

Cheers,

Joep
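
A minimal way to compare IOPS on the two environments while a workload runs,
plus a note on surfacing GC overhead (iostat comes from the sysstat package;
HBASE_OPTS in conf/hbase-env.sh is the usual place for JVM flags, and the
log path shown is an assumption):

```shell
#!/bin/sh
# Sample extended I/O stats twice, one second apart; compare tps (IOPS),
# await (ms per request) and %util between the desktop and the blade.
if command -v iostat >/dev/null 2>&1; then
  io=$(iostat -x 1 2)
else
  io="iostat not installed (on CentOS: yum install sysstat)"
fi
echo "$io"

# To check whether the larger prod heap just trades throughput for GC pauses,
# standard HotSpot GC logging can be enabled in conf/hbase-env.sh:
#   export HBASE_OPTS="$HBASE_OPTS -verbose:gc -XX:+PrintGCDetails \
#     -XX:+PrintGCTimeStamps -Xloggc:/var/log/hbase/gc.log"
```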
________________________________________
From: Himanish Kushary [himanish@gmail.com]
Sent: Thursday, May 19, 2011 11:46 AM
To: user@hbase.apache.org
Subject: Re: Performance degrades on moving from desktop to blade environment

Hi,

Could anybody suggest what the issue may be? I ran YCSB on both the
development and production servers.

Loading data performs better on the production cluster, but the 50%
read/50% write "workloada" performs better on development. The average
read latency shoots up to 30-40 ms on production; on development it is
between 10-20 ms. This was while running 10 threads targeting 1000 tps
with this command - [*java -cp build/ycsb.jar:db/hbase/conf:db/hbase/lib/*
com.yahoo.ycsb.Client -t -db com.yahoo.ycsb.db.HBaseClient -P
workloads/workloada -p columnfamily=data -p operationcount=1000000 -s
-threads 10 -target 1000*]

The clusters perform similarly under YCSB when the target tps and
operationcount are lowered to 500 and 100000 respectively.

We ran our Map-Reduce jobs on the two clusters (assuming we would not reach
1000 tps or that operation count from the map-reduce), but strangely
the development cluster again performed better.

Any suggestions would be really helpful.

Thanks
Himanish






--
Thanks & Regards
Himanish

Re: Performance degrades on moving from desktop to blade environment

Posted by Himanish Kushary <hi...@gmail.com>.
Hi,

Could anybody suggest what the issue may be? I ran YCSB on both the
development and production servers.

Loading data performs better on the production cluster, but the 50%
read/50% write "workloada" performs better on development. The average
read latency shoots up to 30-40 ms on production; on development it is
between 10-20 ms. This was while running 10 threads targeting 1000 tps
with this command - [*java -cp build/ycsb.jar:db/hbase/conf:db/hbase/lib/*
com.yahoo.ycsb.Client -t -db com.yahoo.ycsb.db.HBaseClient -P
workloads/workloada -p columnfamily=data -p operationcount=1000000 -s
-threads 10 -target 1000*]

The clusters perform similarly under YCSB when the target tps and
operationcount are lowered to 500 and 100000 respectively.

We ran our Map-Reduce jobs on the two clusters (assuming we would not reach
1000 tps or that operation count from the map-reduce), but strangely
the development cluster again performed better.

Any suggestions would be really helpful.

Thanks
Himanish






-- 
Thanks & Regards
Himanish

Re: Performance degrades on moving from desktop to blade environment

Posted by Himanish Kushary <hi...@gmail.com>.
*PRODUCTION SERVER CPU INFO*
processor : 0
vendor_id : AuthenticAMD
cpu family : 16
model : 9
model name : AMD Opteron(tm) Processor 6174
stepping : 1
cpu MHz : 2200.022
cache size : 512 KB
physical id : 1
siblings : 12
core id : 0
cpu cores : 12
apicid : 16
fpu : yes
fpu_exception : yes
cpuid level : 5
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat
pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp
lm 3dnowext 3dnow constant_tsc nonstop_tsc pni cx16 popcnt lahf_lm
cmp_legacy svm extapic cr8_legacy altmovcr8 abm sse4a misalignsse
3dnowprefetch osvw
bogomips : 4400.03
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 48 bits physical, 48 bits virtual
power management: ts ttp tm stc 100mhzsteps hwpstate [8]


*DEVELOPMENT SERVER CPU INFO*

processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 30
model name : Intel(R) Core(TM) i7 CPU       Q 740  @ 1.73GHz
stepping : 5
cpu MHz : 933.000
cache size : 6144 KB
physical id : 0
siblings : 8
core id : 0
cpu cores : 4
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 11
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat
pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm
constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc aperfmperf
pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm sse4_1 sse4_2
popcnt lahf_lm ida tpr_shadow vnmi flexpriority ept vpid
bogomips : 3457.61
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:
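
One thing the /proc/cpuinfo dumps above already hint at: the dev i7 is a
1.73 GHz part reporting 933 MHz, so CPU frequency scaling is active. It may
be worth checking which cpufreq governor each machine runs under load (a
sketch; the sysfs paths are the standard Linux cpufreq interface, which may
be absent on some kernels):

```shell
#!/bin/sh
# Report the active cpufreq governor per core and the clocks actually seen.
found=no
for g in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do
  if [ -r "$g" ]; then
    echo "$g: $(cat "$g")"
    found=yes
  fi
done
if [ "$found" = no ]; then
  echo "no cpufreq sysfs entries found (no frequency scaling in use?)"
fi

mhz=$(grep 'cpu MHz' /proc/cpuinfo | sort | uniq -c)
[ -n "$mhz" ] || mhz="no 'cpu MHz' lines in /proc/cpuinfo"
echo "$mhz"

# To pin full clock while benchmarking (as root):
#   echo performance > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
```

An "ondemand" or "powersave" governor on the blades would make latency-sensitive
HBase reads look worse than the raw hardware suggests.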



On Mon, May 16, 2011 at 4:26 PM, Jack Levin <ma...@gmail.com> wrote:

> What is the clock rate of your CPUs (desktop vs blade)?
>
> -Jack
>
> On Mon, May 16, 2011 at 1:24 PM, Himanish Kushary <hi...@gmail.com>
> wrote:
> > Yes, it is only the HW that was changed . All the configurations are kept
> at
> > default from the cloudera installer.
> >
> > The regionserver logs semms ok.
> >
> > On Mon, May 16, 2011 at 3:20 PM, Jean-Daniel Cryans <jdcryans@apache.org
> >wrote:
> >
> >> Ok I see... so the only thing that changed is the HW right? No
> >> upgrades to a new version? Also could it be possible that you changed
> >> some configs (or missed them)? BTW counting has a parameter for
> >> scanner caching, like you would write: count 'myTable', CACHE => 1000
> >>
> >> and it should stream through your data.
> >>
> >> Anything weird in the region server logs?
> >>
> >> J-D
> >>
> >> On Mon, May 16, 2011 at 12:13 PM, Himanish Kushary <hi...@gmail.com>
> >> wrote:
> >> > [TestDFSIO results snipped]
> >>
> >
> >
> >
> > --
> > Thanks & Regards
> > Himanish
> >
>



-- 
Thanks & Regards
Himanish

Re: Performance degrades on moving from desktop to blade environment

Posted by Jack Levin <ma...@gmail.com>.
What is the clock rate of your CPUs (desktop vs blade)?

-Jack

On Mon, May 16, 2011 at 1:24 PM, Himanish Kushary <hi...@gmail.com> wrote:
> Yes, it is only the HW that was changed. All the configurations are kept at
> the defaults from the Cloudera installer.
>
> The regionserver logs seem OK.
>
> On Mon, May 16, 2011 at 3:20 PM, Jean-Daniel Cryans <jd...@apache.org>wrote:
>
>> Ok I see... so the only thing that changed is the HW right? No
>> upgrades to a new version? Also could it be possible that you changed
>> some configs (or missed them)? BTW counting has a parameter for
>> scanner caching, like you would write: count 'myTable', CACHE => 1000
>>
>> and it should stream through your data.
>>
>> Anything weird in the region server logs?
>>
>> J-D
>>
>> On Mon, May 16, 2011 at 12:13 PM, Himanish Kushary <hi...@gmail.com>
>> wrote:
>> > [TestDFSIO results snipped]
>>
>
>
>
> --
> Thanks & Regards
> Himanish
>

Re: Performance degrades on moving from desktop to blade environment

Posted by Himanish Kushary <hi...@gmail.com>.
Yes, it is only the HW that was changed. All the configurations are kept at
the defaults from the Cloudera installer.

The regionserver logs seem OK.

On Mon, May 16, 2011 at 3:20 PM, Jean-Daniel Cryans <jd...@apache.org>wrote:

> Ok I see... so the only thing that changed is the HW right? No
> upgrades to a new version? Also could it be possible that you changed
> some configs (or missed them)? BTW counting has a parameter for
> scanner caching, like you would write: count 'myTable', CACHE => 1000
>
> and it should stream through your data.
>
> Anything weird in the region server logs?
>
> J-D
>
> On Mon, May 16, 2011 at 12:13 PM, Himanish Kushary <hi...@gmail.com>
> wrote:
> > [TestDFSIO results snipped]
>



-- 
Thanks & Regards
Himanish

Re: Performance degrades on moving from desktop to blade environment

Posted by Jean-Daniel Cryans <jd...@apache.org>.
Ok I see... so the only thing that changed is the HW right? No
upgrades to a new version? Also could it be possible that you changed
some configs (or missed them)? BTW counting has a parameter for
scanner caching, like you would write: count 'myTable', CACHE => 1000

and it should stream through your data.
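As a non-interactive sketch of the same thing ('myTable' and the cache size are placeholders):

```shell
# Build the HBase shell statement; uncomment the pipe to actually run it.
# With the default CACHE of 1, every row of the count is a separate RPC.
cmd="count 'myTable', CACHE => 1000"
echo "$cmd"            # → count 'myTable', CACHE => 1000
# echo "$cmd" | hbase shell
```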

Anything weird in the region server logs?

J-D

On Mon, May 16, 2011 at 12:13 PM, Himanish Kushary <hi...@gmail.com> wrote:
> [TestDFSIO results snipped]
>

Re: Performance degrades on moving from desktop to blade environment

Posted by Himanish Kushary <hi...@gmail.com>.
Thanks for the reply. We ran the TestDFSIO benchmark on both the development
and production environments and found production to be faster. The statistics
are shown below.

But once we bring HBase into the picture, things get reversed :-(

The count operation, map-reduces, etc. perform worse on the production box. We
are using pseudo-distributed mode on both the development and production
servers, for both Hadoop and HBase.

*DEVELOPMENT SERVER*

11/05/15 21:26:26 INFO fs.TestDFSIO: ----- TestDFSIO ----- : write
11/05/15 21:26:26 INFO fs.TestDFSIO:            Date & time: Sun May 15
21:26:26 EDT 2011
11/05/15 21:26:26 INFO fs.TestDFSIO:        Number of files: 10
11/05/15 21:26:26 INFO fs.TestDFSIO: Total MBytes processed: 10000
11/05/15 21:26:26 INFO fs.TestDFSIO:      Throughput mb/sec:
58.09495038691237
11/05/15 21:26:26 INFO fs.TestDFSIO: Average IO rate mb/sec:
59.699485778808594
11/05/15 21:26:26 INFO fs.TestDFSIO:  IO rate std deviation:
10.54547265175703
11/05/15 21:26:26 INFO fs.TestDFSIO:     Test exec time sec: 163.354
11/05/15 21:26:26 INFO fs.TestDFSIO:

11/05/15 21:28:44 INFO fs.TestDFSIO: ----- TestDFSIO ----- : read
11/05/15 21:28:44 INFO fs.TestDFSIO:            Date & time: Sun May 15
21:28:44 EDT 2011
11/05/15 21:28:44 INFO fs.TestDFSIO:        Number of files: 10
11/05/15 21:28:44 INFO fs.TestDFSIO: Total MBytes processed: 10000
11/05/15 21:28:44 INFO fs.TestDFSIO:      Throughput mb/sec:
682.4075337791729
11/05/15 21:28:44 INFO fs.TestDFSIO: Average IO rate mb/sec:
755.5845947265625
11/05/15 21:28:44 INFO fs.TestDFSIO:  IO rate std deviation:
229.60029445080488
11/05/15 21:28:44 INFO fs.TestDFSIO:     Test exec time sec: 63.896
11/05/15 21:28:44 INFO fs.TestDFSIO:




*PRODUCTION SERVER*

11/05/16 01:00:43 INFO fs.TestDFSIO: ----- TestDFSIO ----- : *WRITE PERFORMANCE*

11/05/16 01:00:43 INFO fs.TestDFSIO: Date & time: Mon May 16 01:00:43
GMT+00:00 2011

11/05/16 01:00:43 INFO fs.TestDFSIO: Number of files: 10

11/05/16 01:00:43 INFO fs.TestDFSIO: Total MBytes processed: 10000

11/05/16 01:00:43 INFO fs.TestDFSIO: Throughput mb/sec: 69.25447557048375

11/05/16 01:00:43 INFO fs.TestDFSIO: Average IO rate mb/sec:
70.06581115722656

11/05/16 01:00:43 INFO fs.TestDFSIO: IO rate std deviation:
7.243961483443693

11/05/16 01:00:43 INFO fs.TestDFSIO: Test exec time sec: 126.896


11/05/16 01:25:01 INFO fs.TestDFSIO: ----- TestDFSIO ----- : *READ PERFORMANCE*

11/05/16 01:25:01 INFO fs.TestDFSIO: Date & time: Mon May 16 01:25:01
GMT+00:00 2011

11/05/16 01:25:01 INFO fs.TestDFSIO: Number of files: 10

11/05/16 01:25:01 INFO fs.TestDFSIO: Total MBytes processed: 10000

11/05/16 01:25:01 INFO fs.TestDFSIO: Throughput mb/sec: 1487.20999405116

11/05/16 01:25:01 INFO fs.TestDFSIO: Average IO rate mb/sec:
1525.230712890625

11/05/16 01:25:01 INFO fs.TestDFSIO: IO rate std deviation:
239.54492784268226

11/05/16 01:25:01 INFO fs.TestDFSIO: Test exec time sec: 51.117
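For a side-by-side view of the two runs, the throughput lines can be pulled out of saved logs with awk. The file names dev.log and prod.log are assumptions, and the logs are assumed unwrapped (the mail client folded some of the lines above onto two lines):

```shell
# Print "<file> <value>" for every TestDFSIO throughput line. Splitting on
# ": *" also splits the timestamp, so the value is always the last field.
awk -F': *' '/Throughput mb\/sec/ {printf "%s %s\n", FILENAME, $NF}' dev.log prod.log
```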




On Mon, May 16, 2011 at 2:23 PM, Jean-Daniel Cryans <jd...@apache.org>wrote:

> You are giving us the mile-high overview of the problem; pointing to a
> specific culprit could be very time-consuming. Instead, can you run
> some system tests and make sure things work the way they should? Are
> the disks strangely slow? Any switches acting up?
>
> Regarding your CPUs, counting is mostly IO bound so I don't see how
> that would change anything (which is why I ask about disks and
> network).
>
> J-D
>
> On Mon, May 16, 2011 at 10:33 AM, Himanish Kushary <hi...@gmail.com>
> wrote:
> > [original question snipped]
>



-- 
Thanks & Regards
Himanish

Re: Performance degrades on moving from desktop to blade environment

Posted by Jean-Daniel Cryans <jd...@apache.org>.
You are giving us the mile-high overview of the problem; pointing to a
specific culprit could be very time-consuming. Instead, can you run
some system tests and make sure things work the way they should? Are
the disks strangely slow? Any switches acting up?
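As a rough sketch of such a check (size and path are arbitrary), raw sequential disk speed can be eyeballed with dd on each box:

```shell
# Write 256 MB with fdatasync at the end so the page cache doesn't flatter the
# number, then read it back; dd prints the MB/s rate on stderr.
dd if=/dev/zero of=/tmp/ddtest bs=1M count=256 conv=fdatasync 2>&1 | tail -n1
dd if=/tmp/ddtest of=/dev/null bs=1M 2>&1 | tail -n1
rm -f /tmp/ddtest
```

If the 15k rpm blade disks come out slower here, the problem is below HBase entirely.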

Regarding your CPUs, counting is mostly IO bound so I don't see how
that would change anything (which is why I ask about disks and
network).

J-D

On Mon, May 16, 2011 at 10:33 AM, Himanish Kushary <hi...@gmail.com> wrote:
> Hi,
>
> We are in the process of moving a small Hbase/Hadoop cluster from our
> development to production environment.Our development environment were few
> intel desktops (8 cores CPU/8 Gigs RAM/7200 rpm disks) running centOS while
> the production environment are blades with (24 cores AMD CPU/32 gigs
> RAM/15000 rpm disks) AMD architecture running centOS.
>
> Strangely the hbase performance seems to degrade after moving stuffs to the
> production enviroment (suppoesed to have more horse power).We are using the
> latest and default installation for cloudera version of hadoop and hbase.No
> changes to memory or other parameter were done on both the environment.
>
> Any idea what could cause this.Could the AMD architecture be the
> cause.Pointers to things to look for to improve performance in the
> production cluster would be really appreciated.
>
> Note: We ran "count" from hbase shell on a huge table and found the desktops
> to be performing much better. We are in the process of comparing Map-Reduces
> presently.
>
> ---------------------------
> Thanks & Regards
> Himanish
>