Posted to dev@hbase.apache.org by Zhoushuaifeng <zh...@huawei.com> on 2011/08/01 03:37:53 UTC

Is there any influence to the read performance if setting the TTL of a table?

I ran some tests and found that scan performance got worse after I set a TTL on a table. Is there an explanation for this? Or is the TTL not the cause, and something else is behind the slowdown?

Zhou Shuaifeng(Frank)



RE: Is there any influence to the read performance if setting the TTL of a table?

Posted by Zhoushuaifeng <zh...@huawei.com>.
Thanks J-D,
I agree with you. I reviewed the put code path and also found that the TTL is not used there. I will run more tests.

Zhou Shuaifeng(Frank)

-----Original Message-----
From: jdcryans@gmail.com [mailto:jdcryans@gmail.com] On Behalf Of Jean-Daniel Cryans
Sent: Thursday, August 04, 2011 5:55 AM
To: dev@hbase.apache.org
Subject: Re: Is there any influence to the read performance if setting the TTL of a table?

You went from 120k rows per second to 5k when writing? One thing with
TTLs is that they aren't used on the write path, so it seems to me
that there's something else weird going on on your cluster.

You might want to do more tests, and probably want to set the TTL
smaller so that you can test more often.

There are many factors that can affect a benchmark. For example, did
you start the TTL test on a clean cluster, or did you reuse the setup
left by the previous one? Did you also test 2 days later without a TTL?

And after all that, you might want to reduce the scope of your test to
pin down exactly what is slow. It may be your code, machines, network,
or the HBase code.

Good luck!

J-D


Re: Is there any influence to the read performance if setting the TTL of a table?

Posted by Jean-Daniel Cryans <jd...@apache.org>.
You went from 120k rows per second to 5k when writing? One thing with
TTLs is that they aren't used on the write path, so it seems to me
that there's something else weird going on on your cluster.

You might want to do more tests, and probably want to set the TTL
smaller so that you can test more often.

There are many factors that can affect a benchmark. For example, did
you start the TTL test on a clean cluster, or did you reuse the setup
left by the previous one? Did you also test 2 days later without a TTL?

And after all that, you might want to reduce the scope of your test to
pin down exactly what is slow. It may be your code, machines, network,
or the HBase code.

Good luck!

J-D
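
One way to narrow the scope, sketched here as a plain test matrix: vary one factor at a time (TTL on or off, clean or reused cluster, read-only or mixed workload) and run every combination. The factor names and the 600-second TTL are assumptions made for illustration; they are not HBase settings or values from this thread.

```python
from itertools import product

# Hypothetical factor levels for the benchmark; adjust to the real setup.
TTL_SECONDS = [None, 600]            # None = no TTL; 600 s is an assumed "small" TTL
CLUSTER_STATE = ["clean", "reused"]  # fresh cluster vs. setup left by an earlier run
WORKLOAD = ["read-only", "read+write"]

def build_run_matrix():
    """Return every combination of the factors as a list of dicts."""
    return [
        {"ttl_seconds": ttl, "cluster": state, "workload": load}
        for ttl, state, load in product(TTL_SECONDS, CLUSTER_STATE, WORKLOAD)
    ]

runs = build_run_matrix()
for run in runs:
    print(run)
```

Comparing two runs that differ in exactly one factor shows whether that factor, rather than the TTL, accounts for the slowdown.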

On Mon, Aug 1, 2011 at 6:50 PM, Zhoushuaifeng <zh...@huawei.com> wrote:
> Thanks J-D, here is my test case:
> I ran a mixed read+write performance test. Without a TTL, the average read latency is about 200 ms; with a TTL set, it rises to about 1 s.
> The read throughput is fixed at 500 scans/s, and the write throughput is not limited. Each scan reads 1 to 1000 rows (the records are sequential).
> The write throughput also changed: it averaged 120,000 puts/s without a TTL and dropped to 5,000 puts/s with one. The read throughput decreased slightly after the TTL was set.
> I set ttl = 172800, which is 2 days. I ran the test on the second day, so the data was still within the TTL.
>
> Then I ran a read-only performance test, and the latency was about 200 ms whether or not a TTL was set.
> I can't explain this strange phenomenon.
>
> Zhou Shuaifeng(Frank)

RE: Is there any influence to the read performance if setting the TTL of a table?

Posted by Zhoushuaifeng <zh...@huawei.com>.
Thanks J-D, here is my test case:
I ran a mixed read+write performance test. Without a TTL, the average read latency is about 200 ms; with a TTL set, it rises to about 1 s.
The read throughput is fixed at 500 scans/s, and the write throughput is not limited. Each scan reads 1 to 1000 rows (the records are sequential).
The write throughput also changed: it averaged 120,000 puts/s without a TTL and dropped to 5,000 puts/s with one. The read throughput decreased slightly after the TTL was set.
I set ttl = 172800, which is 2 days. I ran the test on the second day, so the data was still within the TTL.

Then I ran a read-only performance test, and the latency was about 200 ms whether or not a TTL was set.
I can't explain this strange phenomenon.

Zhou Shuaifeng(Frank)
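
The figures in this message can be checked with a few lines of arithmetic. This is a minimal sketch: the TTL, throughput, and latency values are the ones quoted above, while the one-day data age is an assumption standing in for "tested on the second day".

```python
from datetime import timedelta

TTL_SECONDS = 172_800  # TTL value quoted in the message

# Convert the TTL to days to confirm it is exactly 2 days.
ttl_days = timedelta(seconds=TTL_SECONDS).days
print(ttl_days)  # 2

def within_ttl(age_seconds, ttl_seconds=TTL_SECONDS):
    """True while the data is still inside the TTL window."""
    return age_seconds <= ttl_seconds

# Assumption: the oldest data was about one day old when the test ran.
assumed_age = int(timedelta(days=1).total_seconds())
print(within_ttl(assumed_age))  # True

# Size of the reported changes.
write_slowdown = 120_000 / 5_000   # puts/s before vs. after setting the TTL
latency_increase = 1_000 / 200     # ms after vs. before setting the TTL
print(write_slowdown, latency_increase)  # 24.0 5.0
```

Under that assumption nothing had expired yet, so the 24x drop in write throughput cannot come from expired data being filtered out.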

-----Original Message-----
From: jdcryans@gmail.com [mailto:jdcryans@gmail.com] On Behalf Of Jean-Daniel Cryans
Sent: Tuesday, August 02, 2011 4:45 AM
To: dev@hbase.apache.org
Subject: Re: Is there any influence to the read performance if setting the TTL of a table?

How much worse?

When you set a TTL, does any data actually age out of the TTL
range? I'm asking because "corpses" of expired rows can slow you down
until the next major compaction. It should not be noticeable unless you
churn through thousands of versions a day within a single region.
Basically, we need to know more about your test.

J-D


Re: Is there any influence to the read performance if setting the TTL of a table?

Posted by Jean-Daniel Cryans <jd...@apache.org>.
How much worse?

When you set a TTL, does any data actually age out of the TTL
range? I'm asking because "corpses" of expired rows can slow you down
until the next major compaction. It should not be noticeable unless you
churn through thousands of versions a day within a single region.
Basically, we need to know more about your test.

J-D
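
The "corpses" effect can be illustrated with a toy model. This is a sketch under stated assumptions, not HBase's implementation: the `Cell` class, the list-based store, and the expiry comparison are all invented for illustration. Expired cells stay in the store and must be skipped by every scan until a compaction rewrites the store without them.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Cell:
    row: str
    timestamp: int  # write time in seconds
    value: str

def is_expired(cell, now, ttl_seconds):
    """A cell counts as expired once its age exceeds the TTL (assumed rule)."""
    return now - cell.timestamp > ttl_seconds

def scan(cells, now, ttl_seconds):
    """Return the live cells and how many stored cells the scan examined."""
    live = [c for c in cells if not is_expired(c, now, ttl_seconds)]
    return live, len(cells)

def major_compact(cells, now, ttl_seconds):
    """Rewrite the store, physically dropping expired cells."""
    return [c for c in cells if not is_expired(c, now, ttl_seconds)]

ttl = 60
now = 1000
store = [Cell(f"row{i}", ts, "v") for i, ts in enumerate([100, 200, 950, 990])]

live, examined = scan(store, now, ttl)
print(len(live), examined)  # 2 4: two expired cells are still examined

store = major_compact(store, now, ttl)
live_after, examined_after = scan(store, now, ttl)
print(len(live_after), examined_after)  # 2 2: the corpses are gone
```

The scan returns the same two live cells both times; only the amount of stored data it has to walk through changes.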

On Sun, Jul 31, 2011 at 6:37 PM, Zhoushuaifeng <zh...@huawei.com> wrote:
> I ran some tests and found that scan performance got worse after I set a TTL on a table. Is there an explanation for this? Or is the TTL not the cause, and something else is behind the slowdown?
>
> Zhou Shuaifeng(Frank)
>
>
>