Posted to user@spark.apache.org by Soumya Simanta <so...@gmail.com> on 2014/12/06 04:31:36 UTC

Trying to understand a basic difference between these two configurations

I'm trying to understand the conceptual difference between these two
configurations in terms of performance (using a Spark standalone cluster):

Case 1:

1 Node
60 cores
240G of memory
50G of data on local file system

Case 2:

6 Nodes
10 cores per node
40G of memory per node
50G of data on HDFS
nodes are connected using a 10G network

I just wanted to validate my understanding.

1. Reads in case 1 will be slower than in case 2 because, in case 2, all
6 nodes can read the data in parallel from HDFS. However, if I change
the file system to HDFS in case 1, my read speed will be conceptually
the same as in case 2. Correct?

2. Once the data is loaded, case 1 will execute operations faster because
there is no network overhead and all shuffle operations are local.

3. Obviously, case 1 is bad from a fault tolerance point of view because we
have a single point of failure.
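
To make the comparison concrete, the kind of job I have in mind is a
simple scan plus a shuffle, along these lines (the input path and the
aggregation key are just placeholders):

import org.apache.spark.SparkContext._
import org.apache.spark.{SparkConf, SparkContext}

object CompareConfigs {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("compare-configs"))
    // Case 1 would read from the local file system (file:///data/input-50g),
    // case 2 would read the same 50G of data from HDFS (hdfs:///data/input-50g).
    val lines = sc.textFile(args(0))
    // A shuffle-heavy aggregation, to exercise the network (or lack of it).
    val counts = lines.map(line => (line.take(8), 1L)).reduceByKey(_ + _)
    println(counts.count())
    sc.stop()
  }
}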

Thanks
-Soumya

Re: Trying to understand a basic difference between these two configurations

Posted by Soumya Simanta <so...@gmail.com>.
TD,

Thanks. This helps a lot.

So in that case, if I increase the number of workers in case 2 to 8 or
12, the aggregate read bandwidth is going to increase, at least
conceptually?
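
Rough sketch of what I mean, with a made-up aggregate HDFS read
bandwidth of 100 Gbps (the numbers are assumptions, not measurements):

// Effective read throughput is capped by whichever is smaller: the
// workers' combined NIC bandwidth or the HDFS cluster's aggregate
// read bandwidth.
def effectiveGbps(workers: Int, linkGbps: Double, hdfsAggregateGbps: Double): Double =
  math.min(workers * linkGbps, hdfsAggregateGbps)

Seq(6, 8, 12, 16).foreach { n =>
  println(s"$n workers -> ${effectiveGbps(n, 10, 100)} Gbps effective")
}
// 6 -> 60.0, 8 -> 80.0, 12 -> 100.0, 16 -> 100.0 (capped by HDFS here)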

 -Soumya

On Fri, Dec 5, 2014 at 10:51 PM, Tathagata Das <ta...@gmail.com>
wrote:

> That depends! See inline.  I am assuming that when you said replacing
> local disk with HDFS in case 1, you are connected to a separate HDFS
> cluster (as in case 2) with a single 10G link. Also assuming that all
> nodes (1 in case 1, and 6 in case 2) are the worker nodes, and the
> Spark application driver is running somewhere else.
>
> On Fri, Dec 5, 2014 at 7:31 PM, Soumya Simanta <so...@gmail.com>
> wrote:
> > I'm trying to understand the conceptual difference between these two
> > configurations in terms of performance (using a Spark standalone cluster):
> >
> > Case 1:
> >
> > 1 Node
> > 60 cores
> > 240G of memory
> > 50G of data on local file system
> >
> > Case 2:
> >
> > 6 Nodes
> > 10 cores per node
> > 40G of memory per node
> > 50G of data on HDFS
> > nodes are connected using a 10G network
> >
> > I just wanted to validate my understanding.
> >
> > 1. Reads in case 1 will be slower than in case 2 because, in case 2,
> > all 6 nodes can read the data in parallel from HDFS. However, if I
> > change the file system to HDFS in case 1, my read speed will be
> > conceptually the same as in case 2. Correct?
> >
>
> If the file system were HDFS in case 1, it still may not be conceptually
> the same as case 2. It depends on what the bottleneck of the system is.
> In case 2, the total network b/w with which you can read from the HDFS
> cluster is 6 x 10G, but in case 1 it will still be 10G. So if the HDFS
> cluster has very high aggregate read b/w and the network is the
> bottleneck, then case 1 will have lower throughput than case 2. And vice
> versa: if the HDFS read b/w is the bottleneck, then conceptually case 1
> will be the same as case 2.
>
> And there are other issues like memory read b/w as well. Say the HDFS
> read b/w is the bottleneck, so case 1 seems to be the same as case 2,
> with an aggregate read b/w of, say, 50 Gbps (less than the 6 x 10 Gbps
> network b/w). Can the single node of case 1 do memory transfers (reading
> from the NIC into memory for processing) at 50 Gbps? That would depend
> on memory speed, the number of memory channels, etc.
>
> So all of these need to be considered. And that's how hardware needs to
> be designed, so that all these parameters are balanced  :)
>
>
> > 2. Once the data is loaded, case 1 will execute operations faster because
> > there is no network overhead and all shuffle operations are local.
> >
>
> Assuming case 1 = case 2, yes, potentially. If you are using local disk
> for shuffle files, some file systems are not happy with 60 threads
> trying to read and write from the disk. That can give rise to weird
> behavior. Ignoring that, conceptually, yes.
>
>
> > 3. Obviously, case 1 is bad from a fault tolerance point of view
> > because we have a single point of failure.
> >
>
> Yeah. If that single node dies, there are no more resources left to
> continue the processing.
>
> > Thanks
> > -Soumya

Re: Trying to understand a basic difference between these two configurations

Posted by Tathagata Das <ta...@gmail.com>.
That depends! See inline.  I am assuming that when you said replacing
local disk with HDFS in case 1, you are connected to a separate HDFS
cluster (as in case 2) with a single 10G link. Also assuming that all
nodes (1 in case 1, and 6 in case 2) are the worker nodes, and the
Spark application driver is running somewhere else.

On Fri, Dec 5, 2014 at 7:31 PM, Soumya Simanta <so...@gmail.com> wrote:
> I'm trying to understand the conceptual difference between these two
> configurations in terms of performance (using a Spark standalone cluster):
>
> Case 1:
>
> 1 Node
> 60 cores
> 240G of memory
> 50G of data on local file system
>
> Case 2:
>
> 6 Nodes
> 10 cores per node
> 40G of memory per node
> 50G of data on HDFS
> nodes are connected using a 10G network
>
> I just wanted to validate my understanding.
>
> 1. Reads in case 1 will be slower than in case 2 because, in case 2, all
> 6 nodes can read the data in parallel from HDFS. However, if I change
> the file system to HDFS in case 1, my read speed will be conceptually
> the same as in case 2. Correct?
>

If the file system were HDFS in case 1, it still may not be conceptually
the same as case 2. It depends on what the bottleneck of the system is.
In case 2, the total network b/w with which you can read from the HDFS
cluster is 6 x 10G, but in case 1 it will still be 10G. So if the HDFS
cluster has very high aggregate read b/w and the network is the
bottleneck, then case 1 will have lower throughput than case 2. And vice
versa: if the HDFS read b/w is the bottleneck, then conceptually case 1
will be the same as case 2.
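
For example, with hypothetical numbers: if the HDFS cluster can serve
100 Gbps in aggregate, the network is the bottleneck, so case 1 is
capped at 10 Gbps by its single link (reading 50 GB, i.e. 400 Gbit,
takes roughly 40 seconds at best) while case 2 can use up to 60 Gbps
(roughly 7 seconds). If instead the HDFS cluster can only serve, say,
8 Gbps, both cases are limited to that same 8 Gbps (roughly 50 seconds)
and look alike.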

And there are other issues like memory read b/w as well. Say the HDFS
read b/w is the bottleneck, so case 1 seems to be the same as case 2,
with an aggregate read b/w of, say, 50 Gbps (less than the 6 x 10 Gbps
network b/w). Can the single node of case 1 do memory transfers (reading
from the NIC into memory for processing) at 50 Gbps? That would depend
on memory speed, the number of memory channels, etc.
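
(As a rough conversion, 50 Gbps is about 6.25 GB/s, and that bandwidth
has to be shared with the processing itself, so the memory subsystem
matters even when the network and HDFS look fast enough.)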

So all of these need to be considered. And that's how hardware needs to
be designed, so that all these parameters are balanced  :)


> 2. Once the data is loaded, case 1 will execute operations faster because
> there is no network overhead and all shuffle operations are local.
>

Assuming case 1 = case 2, yes, potentially. If you are using local disk
for shuffle files, some file systems are not happy with 60 threads
trying to read and write from the disk. That can give rise to weird
behavior. Ignoring that, conceptually, yes.
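
One way to mitigate that on a single fat node is to spread the shuffle
files over several physical disks via spark.local.dir. A sketch (the
directory paths are made up, and note that in standalone mode a
SPARK_LOCAL_DIRS setting on the worker takes precedence over this):

import org.apache.spark.{SparkConf, SparkContext}

// Point shuffle/spill files at directories on separate physical disks,
// so 60 concurrent tasks are not all hammering one spindle.
val conf = new SparkConf()
  .setAppName("single-node-shuffle")
  .set("spark.local.dir", "/disk1/spark-local,/disk2/spark-local,/disk3/spark-local")
val sc = new SparkContext(conf)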


> 3. Obviously, case 1 is bad from a fault tolerance point of view because we
> have a single point of failure.
>

Yeah. If that single node dies, there are no more resources left to
continue the processing.

> Thanks
> -Soumya
