Posted to user@spark.apache.org by Mich Talebzadeh <mi...@gmail.com> on 2016/11/15 17:09:55 UTC

Running stress tests on spark cluster to avoid wild-goose chase later

Hi,

This is rather a broad question.

We would like to run a set of stress tests against our Spark clusters to
ensure that the build performs as expected before deploying the cluster.

The reasoning behind this is that users reported the same ML jobs taking
different times on two identical clusters: one cluster was performing much
worse than the other under the same workload.

This was eventually traced to a wrong BIOS setting at the hardware level and
had nothing to do with Spark itself.

So rather than spending a good while on a wild-goose chase later, we would
like to take a Spark application through some test cycles first.

We have some ideas but would appreciate any other feedback.

The current version is CHDS 5.2.

Thanks

Dr Mich Talebzadeh



LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw



http://talebzadehmich.wordpress.com


*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.

Re: Running stress tests on spark cluster to avoid wild-goose chase later

Posted by Dave Jaffe <dj...@vmware.com>.
Mich-

spark-perf from Databricks (https://github.com/databricks/spark-perf) is a good stress test, covering a wide range of Spark functionality but especially ML. I’ve tested it with Spark 1.6.0 on CDH 5.7. It may need some work for Spark 2.0.
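
Whichever benchmark suite is used, the check described in the original question (same workload, two supposedly identical clusters, different run times) can be automated with a small comparison harness. The following is only a minimal sketch in plain Python, not part of spark-perf: the names time_workload, compare_clusters and cpu_bound_workload, and the 15% tolerance, are hypothetical and chosen for illustration. In practice the stand-in workload would be replaced by a call that submits the same Spark job to each cluster.

```python
import statistics
import time


def time_workload(workload, runs=5):
    """Run `workload` several times; return wall-clock durations in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        workload()
        durations.append(time.perf_counter() - start)
    return durations


def compare_clusters(baseline, candidate, tolerance=0.15):
    """Compare median run times of two clusters.

    `baseline` and `candidate` are lists of durations for the same workload.
    The candidate is flagged when its median is slower than the baseline
    median by more than `tolerance` (a fraction; 0.15 means 15%, an
    arbitrary value assumed here).
    """
    base_median = statistics.median(baseline)
    cand_median = statistics.median(candidate)
    slowdown = (cand_median - base_median) / base_median
    return {
        "baseline_median": base_median,
        "candidate_median": cand_median,
        "slowdown": slowdown,
        "ok": slowdown <= tolerance,
    }


def cpu_bound_workload():
    # Stand-in for a real job (assumption): a small CPU-bound loop.
    return sum(i * i for i in range(200_000))


if __name__ == "__main__":
    # Here both samples come from the local machine; in a real check each
    # list would hold timings collected from a different cluster.
    samples_a = time_workload(cpu_bound_workload, runs=3)
    samples_b = time_workload(cpu_bound_workload, runs=3)
    print(compare_clusters(samples_a, samples_b))
```

Using the median of several runs rather than a single timing reduces the effect of one-off noise, so a persistent gap such as the one caused by the BIOS setting is more likely to stand out before users report it.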

Dave Jaffe

BigData Performance
VMware
djaffe@vmware.com
