Posted to user@kudu.apache.org by 李津 <yu...@icloud.com> on 2017/08/31 00:53:55 UTC

Question about per server data upper limit.

Why does each tserver have an upper limit of 4 TB, and does that limit include memrowset data? We have not tested beyond 4 TB either. What will happen if we reach the upper limit?

Re: Question about per server data upper limit.

Posted by Todd Lipcon <to...@cloudera.com>.
Thanks Li Jin for reporting back your experiences!

Kudu 1.5 also has more improvements for data density, so if you want to try
testing the Kudu 1.5.0 RC3 release candidate in your environment, that
would be great.

-Todd

On Mon, Sep 4, 2017 at 7:41 PM, Li Jin <yu...@gmail.com> wrote:

> Thanks for the reply. So, in other words, there is no hard limit on a
> tserver's data; more data just increases startup time and requires more
> resources, such as tablets, tserver threads, and file descriptors.
> Perhaps the real upper limit is set not by the tserver itself but by
> something else.
> By the way, our test now has more than 6 TB of data per tserver, and it is
> working well so far. QPS is more than 500,000. Kudu 1.4.0 includes many
> optimizations, including a recommended upper limit of 8 TB of data per
> tserver. We will continue testing until the data reaches 8 TB or more.
> It's interesting.
> Our machine configuration and Kudu version:
> 32 CPUs (Intel(R) Xeon(R) CPU E5-2682 v4 @ 2.50GHz), 128 GB memory,
> 6 x 16 TB HDDs for data and 3 TB for the WAL. Kudu 1.4.0, 5 masters + 5
> tservers.
> If anything more of interest happens, I will reply here.
> Thanks again.
>



-- 
Todd Lipcon
Software Engineer, Cloudera

Re: Question about per server data upper limit.

Posted by Li Jin <yu...@gmail.com>.
Thanks for the reply. So, in other words, there is no hard limit on a
tserver's data; more data just increases startup time and requires more
resources, such as tablets, tserver threads, and file descriptors.
Perhaps the real upper limit is set not by the tserver itself but by
something else.
By the way, our test now has more than 6 TB of data per tserver, and it is
working well so far. QPS is more than 500,000. Kudu 1.4.0 includes many
optimizations, including a recommended upper limit of 8 TB of data per
tserver. We will continue testing until the data reaches 8 TB or more.
It's interesting.
Our machine configuration and Kudu version:
32 CPUs (Intel(R) Xeon(R) CPU E5-2682 v4 @ 2.50GHz), 128 GB memory,
6 x 16 TB HDDs for data and 3 TB for the WAL. Kudu 1.4.0, 5 masters + 5
tservers.
If anything more of interest happens, I will reply here.
Thanks again.

Re: Question about per server data upper limit.

Posted by Adar Lieber-Dembo <ad...@cloudera.com>.
The upper limit of 4 TB is for data on-disk (post-encoding,
post-compression, and post-replication); it does not include in-memory
data from memrowsets or deltamemstores.
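
As a rough illustration of what "post-encoding, post-compression, and
post-replication" means for sizing, the sketch below estimates on-disk data
per tserver from a logical data size. The helper name, the compression ratio,
and the example numbers are assumptions made up for illustration, not figures
from Kudu, and the estimate assumes replicas are spread evenly across
tservers.

```python
def on_disk_per_tserver_tb(logical_tb, compression_ratio,
                           replication_factor, num_tservers):
    """Estimate on-disk TB per tserver (hypothetical helper, not a Kudu API).

    logical_tb:         data size before encoding and compression
    compression_ratio:  logical size / encoded+compressed size
                        (assumed; depends heavily on the workload)
    replication_factor: number of replicas of each tablet
    num_tservers:       number of tablet servers sharing the data evenly
    """
    if compression_ratio <= 0 or num_tservers <= 0:
        raise ValueError("compression_ratio and num_tservers must be positive")
    # Encoding/compression shrinks the data; replication multiplies it.
    total_on_disk_tb = logical_tb / compression_ratio * replication_factor
    return total_on_disk_tb / num_tservers


# Assumed example: 20 TB logical, 3x compression, 3 replicas, 5 tservers.
estimate = on_disk_per_tserver_tb(20, 3.0, 3, 5)
print(f"{estimate:.1f} TB on disk per tserver")
```

With these assumed inputs the estimate lands right around the 4 TB figure,
which is the number to compare against the limit; in-memory data is not part
of the calculation.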

The value of the limit is based on the kinds of workloads tested by
the Kudu development community. As a group we feel comfortable
supporting users up to 4 TB because we've run such workloads
ourselves. Beyond 4 TB, however, we're not exactly sure what becomes
slow, what breaks, etc.

Speaking from experience, as the amount of on-disk data grows,
tservers will take longer to start up. You might become vulnerable to
KUDU-2050; we're not sure. In order to reach that amount of data
you'll probably also raise the number of tablets hosted by the
tserver. This can increase the tserver's thread count, file descriptor
count, and may cause slowdowns in other areas.
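
One way to keep an eye on the thread and file descriptor counts mentioned
above is to read them from /proc. This is a minimal sketch that assumes a
Linux host; the helper name is hypothetical, and the script inspects its own
process only as a placeholder for the PID of the tserver process.

```python
import os


def count_threads_and_fds(pid):
    """Return (thread_count, open_fd_count) for a process, using Linux /proc.

    Hypothetical helper, not part of Kudu. Reading another user's process
    may require elevated privileges.
    """
    threads = len(os.listdir(f"/proc/{pid}/task"))  # one entry per thread
    fds = len(os.listdir(f"/proc/{pid}/fd"))        # one entry per open fd
    return threads, fds


# Placeholder: inspect this script's own process. Substitute the tserver PID.
pid = os.getpid()
if os.path.isdir(f"/proc/{pid}/task"):
    threads, fds = count_threads_and_fds(pid)
    print(f"pid {pid}: {threads} threads, {fds} open file descriptors")
else:
    print("/proc is not available on this system")
```

Sampling these counts as the number of tablets grows gives an early signal
of when limits such as the open-file ulimit might need raising.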

In short, nothing will "happen" the moment you cross 4 TB; it's just
that you'll be entering relatively uncharted waters and might
encounter unusual or unexpected behavior. If that doesn't deter you,
by all means give it a shot (and report back with your findings)!

On Wed, Aug 30, 2017 at 5:53 PM, 李津 <yu...@icloud.com> wrote:
> Why does each tserver have an upper limit of 4 TB, and does that limit include memrowset data? We have not tested beyond 4 TB either. What will happen if we reach the upper limit?