
Posted to user@kudu.apache.org by 张晓宁 <zh...@jd.com> on 2018/03/15 10:12:54 UTC

A few questions for using Kudu

I have a few questions about using Kudu:

1. As more data is inserted into Kudu, performance decreases. After about 30 minutes of continuous insertion, TPS drops by 20%, and after 1 hour it drops by 40%. Is this a known issue?

2. If I set the number of replicas to 1, will I have 2 copies of the data in total (1 master copy + 1 replica copy)?

3. I want to install Kudu 1.6, but our machines cannot connect to the public internet. Will the Kudu team build RPM packages for version 1.6?

Re: Re: A few questions for using Kudu

Posted by 张晓宁 <zh...@jd.com>.

Some more questions about using Kudu:

1. We upgraded Kudu from 1.4 to 1.6 and re-tested two setups, both with 3 replicas: 1 master with 5 tservers (3 time-range partitions + 5 hash partitions) and 1 master with 9 tservers (3 time-range partitions + 9 hash partitions). The results were almost the same as with 1.4. TPS was around 100W (about 1 million; W = 万 = 10,000) for the 5-tserver cluster and about 115W for the 9-tserver cluster, an increase of only 15%. So the issue still seems to exist in 1.6. What cluster sizes have you used in your Kudu benchmark testing? Have you seen this issue before? It could be a severe problem if performance does not increase as more hosts are added to the cluster.

2. We tested 3 replicas and compared the result with the original setting of 1 replica. TPS with 3 replicas is only about half of that with 1 replica. Is this expected?

3. The original issue remains: TPS decreased by 15% after about 30 minutes of continuous insertion. Have you seen this before? What could the possible causes be? If performance keeps decreasing, it would be a big issue for long-running workloads.

4. I tested shutting down a tserver; after some time, the data on it was re-replicated to other live tservers. When I restarted the tserver, the data on it was not removed and the cluster did not rebalance the data. So what we can do is remove the replicas on the restarted tserver and then manually move tablets from other tservers to it. Is this the expected procedure? Will Kudu add automatic data rebalancing in a future release?

From: Todd Lipcon [mailto:todd@cloudera.com]
Sent: 2018/03/20 10:50
To: user@kudu.apache.org
Subject: Re: Re: A few questions for using Kudu

On Thu, Mar 15, 2018 at 8:32 PM, 张晓宁 <zh...@jd.com> wrote:
Thank you, Dan! My follow-up comments are marked with "XiaoNing:".

From: Dan Burkert [mailto:danburkert@apache.org]
Sent: 2018/03/16 1:06
To: user@kudu.apache.org
Subject: Re: A few questions for using Kudu

Hi, answers inline:
On Thu, Mar 15, 2018 at 3:12 AM, 张晓宁 <zh...@jd.com> wrote:
I have a few questions about using Kudu:

1. As more data is inserted into Kudu, performance decreases. After about 30 minutes of continuous insertion, TPS drops by 20%, and after 1 hour it drops by 40%. Is this a known issue?
This is expected if you are inserting data in random order. If you try another benchmark where you insert data in primary key sorted order, you'll see that the performance will be much higher, and more consistent. If you have a heavy insert workload, this kind of optimization is critical. The table's partitioning and primary key can often be designed to make this happen naturally, but it's a dataset dependent thing, so without more specifics about your data it's difficult to give more precise advice.
XiaoNing: Our table has two levels of partitioning. The first level is a range partition by date (on the timestamp column), with one partition per day; the second level is a hash partition on 2 columns (key + host). These 3 columns (timestamp, key, host) form the primary key of the table. By "insert data in primary key sorted order", do you mean we need to sort the data on the 3 primary-key columns before insertion?

If timestamp is the first column then it should probably be somewhat naturally-sorted by the primary key, right? It doesn't need to be perfectly sorted, but if the inserts are in roughly PK order, we will avoid unnecessary compaction.

2. If I set the number of replicas to 1, will I have 2 copies of the data in total (1 master copy + 1 replica copy)?
That's incorrect. The master node does not hold any table data. If you set the number of replicas to be 1, you will lose data if you lose the tablet server which holds the replica. We always recommend production workloads set number of replicas to 3 in order to have fault tolerance.
XiaoNing: So if we want fault tolerance, we should set the number of replicas to at least 3, right?

That's right.

-Todd
--
Todd Lipcon
Software Engineer, Cloudera

Re: Re: A few questions for using Kudu

Posted by Todd Lipcon <to...@cloudera.com>.

On Thu, Mar 15, 2018 at 8:32 PM, 张晓宁 <zh...@jd.com> wrote:

> Thank you, Dan! My follow-up comments are marked with "XiaoNing:".
>
>
>
> *From:* Dan Burkert [mailto:danburkert@apache.org]
> *Sent:* 2018/03/16 1:06
> *To:* user@kudu.apache.org
> *Subject:* Re: A few questions for using Kudu
>
>
>
> Hi, answers inline:
>
> On Thu, Mar 15, 2018 at 3:12 AM, 张晓宁 <zh...@jd.com> wrote:
>
> I have a few questions about using Kudu:
>
> 1. As more data is inserted into Kudu, performance decreases. After about
> 30 minutes of continuous insertion, TPS drops by 20%, and after 1 hour it
> drops by 40%. Is this a known issue?
>
> This is expected if you are inserting data in random order.  If you try
> another benchmark where you insert data in primary key sorted order, you'll
> see that the performance will be much higher, and more consistent.  If you
> have a heavy insert workload, this kind of optimization is critical.  The
> table's partitioning and primary key can often be designed to make this
> happen naturally, but it's a dataset dependent thing, so without more
> specifics about your data it's difficult to give more precise advice.
>
> XiaoNing: Our table has two levels of partitioning. The first level is a
> range partition by date (on the timestamp column), with one partition per
> day; the second level is a hash partition on 2 columns (key + host). These
> 3 columns (timestamp, key, host) form the primary key of the table. By
> "insert data in primary key sorted order", do you mean we need to sort the
> data on the 3 primary-key columns before insertion?
>

If timestamp is the first column then it should probably be somewhat
naturally-sorted by the primary key, right? It doesn't need to be perfectly
sorted, but if the inserts are in roughly PK order, we will avoid
unnecessary compaction.
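
To make the point about natural ordering concrete, here is a rough Python sketch (not Kudu code) that routes a time-ordered stream of rows to tablets the way the table described above is partitioned: one range partition per day, then a hash bucket on (key, host). The `tablet_for` helper, the sample data, and the use of `zlib.crc32` are assumptions for illustration only; Kudu's real hash function is different. The check at the end shows that rows arriving in time order stay in primary key order within every tablet when timestamp is the first key column.

```python
import zlib
from collections import defaultdict
from datetime import datetime, timezone

HASH_BUCKETS = 5  # the 5-tserver test in this thread used 5 hash partitions


def tablet_for(ts: int, key: str, host: str):
    """Return a (day, bucket) pair standing in for a tablet (hypothetical helper)."""
    # Range partition: one partition per UTC day of the timestamp column.
    day = datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m-%d")
    # Hash partition on (key, host). crc32 is only a stand-in for Kudu's hash.
    bucket = zlib.crc32(f"{key}|{host}".encode("utf-8")) % HASH_BUCKETS
    return day, bucket


# Made-up stream: one row every 10 minutes starting 2018-03-15 00:00 UTC,
# already in arrival (time) order, covering three days.
stream = [
    (1521072000 + i * 600, f"metric-{i % 4}", f"host-{i % 3}")
    for i in range(300)
]

per_tablet = defaultdict(list)
for ts, key, host in stream:
    per_tablet[tablet_for(ts, key, host)].append((ts, key, host))

# Each tablet sees its rows in (timestamp, key, host) order without any sorting.
all_in_pk_order = all(rows == sorted(rows) for rows in per_tablet.values())
print(len(per_tablet), "tablets touched; in PK order:", all_in_pk_order)
```

If the first key column were something random instead of the timestamp, the same check would fail for most tablets, which is the random-order insert case discussed earlier in the thread.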


> 2. If I set the number of replicas to 1, will I have 2 copies of the data
> in total (1 master copy + 1 replica copy)?
>
> That's incorrect.  The master node does not hold any table data.  If you
> set the number of replicas to be 1, you will lose data if you lose the
> tablet server which holds the replica.  We always recommend production
> workloads set number of replicas to 3 in order to have fault tolerance.
>
> XiaoNing: So if we want fault tolerance, we should set the number of
> replicas to at least 3, right?
>

That's right.

-Todd
--
Todd Lipcon
Software Engineer, Cloudera

Re: A few questions for using Kudu

Posted by 张晓宁 <zh...@jd.com>.

Thank you, Dan! My follow-up comments are marked with "XiaoNing:".

From: Dan Burkert [mailto:danburkert@apache.org]
Sent: 2018/03/16 1:06
To: user@kudu.apache.org
Subject: Re: A few questions for using Kudu

Hi, answers inline:
On Thu, Mar 15, 2018 at 3:12 AM, 张晓宁 <zh...@jd.com> wrote:
I have a few questions about using Kudu:

3. I want to install Kudu 1.6, but our machines cannot connect to the public internet. Will the Kudu team build RPM packages for version 1.6?

The Apache Kudu project does not provide binary artifacts for releases; however, vendors can and do. For instance, you can find Cloudera's RPMs corresponding to Kudu 1.6 here: <https://archive.cloudera.com/cdh5/redhat/7/x86_64/cdh/5.14/RPMS/x86_64/>.
XiaoNing: Got it, thanks.
- Dan

Re: A few questions for using Kudu

Posted by Dan Burkert <da...@apache.org>.

Hi, answers inline:

On Thu, Mar 15, 2018 at 3:12 AM, 张晓宁 <zh...@jd.com> wrote:

> I have a few questions about using Kudu:
>
> 1. As more data is inserted into Kudu, performance decreases. After about
> 30 minutes of continuous insertion, TPS drops by 20%, and after 1 hour it
> drops by 40%. Is this a known issue?
>
This is expected if you are inserting data in random order.  If you try
another benchmark where you insert data in primary key sorted order, you'll
see that the performance will be much higher, and more consistent.  If you
have a heavy insert workload, this kind of optimization is critical.  The
table's partitioning and primary key can often be designed to make this
happen naturally, but it's a dataset dependent thing, so without more
specifics about your data it's difficult to give more precise advice.
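
As a minimal sketch of what "insert in primary key sorted order" can look like on the client side, the Python below sorts each batch by the primary key columns before writing it. It assumes the (timestamp, key, host) key order that comes up later in this thread. `Row`, its `value` column, and the `write_row` callback are hypothetical stand-ins; in a real loader `write_row` would wrap whatever insert call your Kudu client provides.

```python
from typing import Callable, Iterable, List, NamedTuple


class Row(NamedTuple):
    # Field order mirrors the assumed primary key order.
    timestamp: int  # e.g. epoch seconds
    key: str
    host: str
    value: float    # hypothetical non-key column


def insert_sorted(rows: Iterable[Row], write_row: Callable[[Row], None]) -> List[Row]:
    """Sort one batch by primary key, then hand each row to the writer."""
    ordered = sorted(rows, key=lambda r: (r.timestamp, r.key, r.host))
    for row in ordered:
        write_row(row)  # stand-in for the real client insert call
    return ordered


# Made-up batch that arrives out of order.
batch = [
    Row(1521108780, "cpu", "host-b", 0.7),
    Row(1521108774, "mem", "host-a", 0.4),
    Row(1521108774, "cpu", "host-a", 0.9),
]

written: List[Row] = []
insert_sorted(batch, written.append)
for row in written:
    print(row.timestamp, row.key, row.host)
```

Sorting per batch only gives roughly sorted input overall, not a perfectly ordered stream across batches.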


> 2. If I set the number of replicas to 1, will I have 2 copies of the data
> in total (1 master copy + 1 replica copy)?
>
That's incorrect.  The master node does not hold any table data.  If you
set the number of replicas to be 1, you will lose data if you lose the
tablet server which holds the replica.  We always recommend production
workloads set number of replicas to 3 in order to have fault tolerance.
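
The arithmetic behind that recommendation can be sketched in a few lines of Python. Kudu replicates each tablet using Raft consensus, where a write needs acknowledgement from a majority of replicas; the function names below are just illustrative, not part of any Kudu API.

```python
def majority(replicas: int) -> int:
    """Smallest number of replicas that forms a majority."""
    return replicas // 2 + 1


def tolerated_failures(replicas: int) -> int:
    """How many replicas can be lost while a majority is still available."""
    return replicas - majority(replicas)


for n in (1, 3, 5):
    print(f"replicas={n} majority={majority(n)} tolerated_failures={tolerated_failures(n)}")
```

With 1 replica, no failure can be tolerated; with 3 replicas, one tablet server can be lost; with 5, two can be lost.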


> 3. I want to install Kudu 1.6, but our machines cannot connect to the
> public internet. Will the Kudu team build RPM packages for version 1.6?
>

The Apache Kudu project does not provide binary artifacts for releases;
however, vendors can and do.  For instance, you can find Cloudera's RPMs
corresponding to Kudu 1.6 here:
<https://archive.cloudera.com/cdh5/redhat/7/x86_64/cdh/5.14/RPMS/x86_64/>.

- Dan