Posted to dev@spark.apache.org by Anis Nasir <aa...@gmail.com> on 2017/02/14 13:57:13 UTC

Fwd: Handling Skewness and Heterogeneity

Dear all,

Could you please comment on the use case described below?

Thank you in advance.

Regards,
Anis


---------- Forwarded message ---------
From: Anis Nasir <aa...@gmail.com>
Date: Tue, 14 Feb 2017 at 17:01
Subject: Handling Skewness and Heterogeneity
To: <us...@spark.apache.org>


Dear All,

I have a few use cases for Spark Streaming where the Spark cluster consists
of heterogeneous machines.

Additionally, skew is present in both the input distribution (e.g., each
tuple is drawn from a Zipf distribution) and the service time (e.g., the
service time required for each tuple comes from a Zipf distribution).

I want to know how Spark will handle such use cases.

Any help will be highly appreciated!


Regards,
Anis