Posted to user@cassandra.apache.org by ya...@condorgreen.com on 2019/08/20 10:31:57 UTC

Cassandra-stress testing

Hi Everyone,

Has anyone here used cassandra-stress before? I want to test whether it's possible to load 600 million records per hour into Cassandra, or
find a better way to optimize Cassandra for this case.

Any help will be highly appreciated.

Re: Cassandra-stress testing

Posted by Ben Slater <be...@instaclustr.com>.
Whether 600m rows per hour is good or bad depends on the hardware you are
using (do you have 1 node or 1000? 2 cores each or 16?) and the data you
are writing (is it 10 bytes per row or 100kb?).
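
To put numbers on the question above, here is a minimal sketch that converts an hourly row target into per-second and per-node figures. The function name, the node counts and the replication factor of 3 are assumptions chosen for illustration; none of them come from this thread. 600 million rows per hour works out to roughly 166,667 rows per second for the cluster as a whole.

```python
def required_rates(rows_per_hour, nodes, row_bytes, replication_factor=3):
    """Translate an hourly row target into per-second and per-node figures.

    replication_factor=3 is an assumed value, not something stated in the thread.
    """
    rows_per_sec = rows_per_hour / 3600
    # Every client write is stored on `replication_factor` replicas,
    # so the cluster performs that many replica writes per row.
    replica_writes_per_sec = rows_per_sec * replication_factor
    return {
        "rows_per_sec": rows_per_sec,
        "replica_writes_per_node_per_sec": replica_writes_per_sec / nodes,
        "client_mb_per_sec": rows_per_sec * row_bytes / 1e6,
    }


# Hypothetical scenarios spanning the extremes mentioned above.
for nodes, row_bytes in [(1, 10), (1000, 10), (1, 100_000), (1000, 100_000)]:
    rates = required_rates(600_000_000, nodes, row_bytes)
    print(nodes, "nodes,", row_bytes, "bytes/row:", rates)
```

The same hourly target ranges from under 2 MB/s of client data at 10 bytes per row to well over 10 GB/s at 100 kB per row, which is why the hardware and row size matter so much.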

In general, I think you will need to supply a lot more context about your
use case and setup to get any useful response from the community.

Cheers
Ben

---


Ben Slater, Chief Product Officer

RE: Cassandra-stress testing

Posted by ya...@condorgreen.com.

Thanks for the feedback.

To elaborate, I am currently writing 600m rows per hour and need to understand whether this is about on target, or whether there are better ways to write or to structure the keyspaces and tables.

I can use the cassandra-stress tool to get potential maximum throughput stats, or use the schema provided (keyspace/table definitions for a stress test).
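
As a rough illustration of what such a run could look like, the sketch below builds one cassandra-stress `write` invocation per client thread count. The helper name, host address, operation count and consistency level are all hypothetical values picked for the example. The `write` command uses the tool's built-in schema; testing your own keyspace and table definitions would need the `user` mode with a profile file, which is not shown here.

```python
def stress_commands(nodes, thread_counts, ops=1_000_000, consistency="LOCAL_QUORUM"):
    """Build one cassandra-stress write invocation per client thread count.

    All argument values are placeholders; adjust them to your own cluster.
    """
    return [
        ["cassandra-stress", "write", f"n={ops}", f"cl={consistency}",
         "-rate", f"threads={threads}", "-node", ",".join(nodes)]
        for threads in thread_counts
    ]


# Print the commands rather than running them, so they can be reviewed first.
for cmd in stress_commands(["10.0.0.1"], [16, 32, 64, 128]):
    print(" ".join(cmd))
```

Running the commands with increasing thread counts and recording throughput, latency and errors for each gives the raw numbers needed to judge where a node stops keeping up.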


Re: Cassandra-stress testing

Posted by Ben Slater <be...@instaclustr.com>.
If you’re after a benchmark that someone else has already run to help
estimate sizing, we pretty regularly publish benchmarking results for
various cloud provider instances.

For example, see:
https://www.instaclustr.com/announcing-instaclustr-support-for-aws-i3en-instances/
and https://www.instaclustr.com/certified-apache-cassandra/

Cheers
Ben

Re: Cassandra-stress testing

Posted by Marc Selwan <ma...@datastax.com>.
Cassandra, being a scale-out database, can load any arbitrary number of
records per hour.

The best way to do this is, for your given data model, to find your maximum
throughput on a single node by scaling up the number of clients until you
start seeing errors (or hit your latency SLA), then pull back by 15-20%.
From there, it's a matter of linearly scaling clients and nodes until you
hit your desired throughput.
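
A minimal sketch of that sizing procedure, assuming you have already collected results from a ramped single-node test. The measurements, the 50 ms latency SLA and the function names are hypothetical; only the 15-20% pull-back and the linear scaling assumption come from the description above.

```python
import math


def sustainable_rate(measurements, latency_sla_ms, headroom=0.20):
    """Pick the best error-free run within the SLA, then pull back by `headroom`.

    measurements: list of (clients, ops_per_sec, p99_ms, errors) tuples,
    one per ramp step of a single-node test.
    """
    healthy = [ops for _, ops, p99_ms, errors in measurements
               if errors == 0 and p99_ms <= latency_sla_ms]
    if not healthy:
        raise ValueError("no run met the SLA without errors")
    return max(healthy) * (1 - headroom)


def nodes_needed(target_ops_per_sec, per_node_ops_per_sec):
    """Assume throughput scales linearly with node count and round up."""
    return math.ceil(target_ops_per_sec / per_node_ops_per_sec)


# Hypothetical ramp results: (clients, ops/sec, p99 latency in ms, errors).
runs = [(16, 20_000, 8, 0), (32, 38_000, 12, 0),
        (64, 52_000, 25, 0), (128, 55_000, 90, 3)]

per_node = sustainable_rate(runs, latency_sla_ms=50)
target = 600_000_000 / 3600  # 600m rows per hour, expressed per second
print("per-node rate:", per_node, "nodes needed:", nodes_needed(target, per_node))
```

The linear-scaling step is an approximation, so it is worth re-running the test on the sized cluster to confirm the estimate before relying on it.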

I recommend taking a look at TLP-Stress as it's a bit easier to use and
understand:  https://thelastpickle.com/blog/2018/10/31/tlp-stress-intro.html


Best.
Marc Selwan | DataStax | PM, Server Team | (925) 413-7079 |
Twitter <https://twitter.com/MarcSelwan>

Re: Cassandra-stress testing

Posted by Surbhi Gupta <su...@gmail.com>.
Have you tried YCSB (Yahoo! Cloud Serving Benchmark)?
It is a tool from Yahoo for stress testing NoSQL databases.
