Posted to user@giraph.apache.org by Alexandros Daglis <al...@epfl.ch> on 2012/11/27 18:40:29 UTC

What a "worker" really is and other interesting runtime information

Hello everybody,

I went through most of the documentation I could find for Giraph and also
most of the messages in this email list, but still I have not figured out
precisely what a "worker" really is. I would really appreciate it if you
could help me understand how the framework works.

At first I thought that a worker has a one-to-one correspondence to a map
task. Apparently this is not exactly the case, since I have noticed that if
I ask for x workers, the job finishes after having used x+1 map tasks. What
is this extra task for?

I have been trying out the example SSSP application on a single node with
12 cores. With an input graph of ~400 MB and 1 worker, around 10 GB of
memory is used during execution. What intrigues me is that if I use 2
workers for the same input (and without limiting memory per map task),
double the memory is used. Furthermore, there is no improvement in
performance; if anything, I notice a slowdown. Are these observations normal?

Might it be the case that 1 and 2 workers are very few and I should go to
the 30-100 range that is the proposed number of mappers for a conventional
MapReduce job?

Finally, one last observation. Even though I use only 1 worker, I see that
there are significant periods during execution where up to 90% of the 12
cores' computing power is consumed, that is, almost 10 cores are used in
parallel. Does each worker spawn multiple threads and dynamically balance
the load to utilize the available hardware?

Thanks a lot in advance!

Best,
Alexandros

RE: What a "worker" really is and other interesting runtime information

Posted by "Magyar, Bence (US SSA)" <be...@baesystems.com>.
Hi Alexandros,

I increased my number of workers to 30, but my job just hangs at 3%:

./giraph -Dgiraph.useSuperstepCounters=false -DSimpleShortestPathsVertex.sourceId=100 ../target/giraph.jar org.apache.giraph.examples.SimpleShortestPathsVertex -if org.apache.giraph.io.JsonLongDoubleFloatDoubleVertexInputFormat -ip /user/hduser/insight -of org.apache.giraph.io.JsonLongDoubleFloatDoubleVertexOutputFormat -op /user/hduser/insight-out-30 -w 30


No lib directory, assuming dev environment
No HADOOP_CONF_DIR set, using /opt/hadoop-1.0.3/conf
12/11/29 14:31:14 INFO mapred.JobClient: Running job: job_201211282132_0002
12/11/29 14:31:15 INFO mapred.JobClient:  map 0% reduce 0%
12/11/29 14:31:34 INFO mapred.JobClient:  map 3% reduce 0%
12/11/29 14:36:39 INFO mapred.JobClient: Job complete: job_201211282132_0002
12/11/29 14:36:39 INFO mapred.JobClient: Counters: 5
12/11/29 14:36:39 INFO mapred.JobClient:   Job Counters
12/11/29 14:36:39 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=2451427
12/11/29 14:36:39 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
12/11/29 14:36:39 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
12/11/29 14:36:39 INFO mapred.JobClient:     Launched map tasks=8
12/11/29 14:36:39 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=3725

Looking at the logs on my master machine:

/opt/hadoop-1.0.3/logs/userlogs/job_201211282132_0002/attempt_201211282132_0002_m_000000_0 $

I see:

2012-11-29 14:32:00,732 INFO org.apache.giraph.graph.BspServiceMaster: checkWorkers: Only found 7 responses of 30 needed to start superstep -1.  Sleeping for 30000 msecs and used 0 of 10 attempts.
2012-11-29 14:32:30,766 INFO org.apache.giraph.graph.BspServiceMaster: checkWorkers: Only found 7 responses of 30 needed to start superstep -1.  Sleeping for 30000 msecs and used 1 of 10 attempts.
2012-11-29 14:33:00,807 INFO org.apache.giraph.graph.BspServiceMaster: checkWorkers: Only found 7 responses of 30 needed to start superstep -1.  Sleeping for 30000 msecs and used 2 of 10 attempts.
....
This repeats until: Sleeping for 30000 msecs and used 8 of 10 attempts.
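
One more data point: the job counters show only 8 launched map tasks, which presumably corresponds to the master task plus the 7 workers that responded. If I understand Giraph correctly, all workers must run as simultaneous map tasks, so 30 workers may simply exceed my cluster's map slot capacity. On Hadoop 1.x the per-node cap is mapred.tasktracker.map.tasks.maximum in mapred-site.xml; a sketch, where the value of 24 is only my guess for a 24-core node, not a recommendation:

<!-- mapred-site.xml on each TaskTracker node (Hadoop 1.x) -->
<!-- caps the number of map tasks (and hence Giraph workers) the node can run at once -->
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>24</value>
</property>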

Can anyone please provide guidance on how to choose the number of workers, given the general characteristics/topology of a cluster? (Mine are described in my other message in this thread.)

-Bence


Re: What a "worker" really is and other interesting runtime information

Posted by Alexandros Daglis <al...@epfl.ch>.
Hello Bence,

So, you have 96 cores at your disposal. My guess is that 3 workers are not
enough to use all of them: you should either try a lot more, or multithread
them as Avery said (for instance, 4 workers with 24 threads each; see the
sketch below). However, as I already reported, I tried this myself and did
not notice any improvement in performance. Scaling with the number of worker
threads should be fundamental for a BSP framework, so this is really weird;
I must be doing something wrong.
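
Concretely, reusing your command line, the 4-worker / 24-thread run would look something like this. Treat it as a sketch: I have not run it on your cluster, the output path is just a placeholder, and I also set hash.userPartitionCount to 96 so that all 4 x 24 compute threads have partitions to work on, following Avery's earlier tip:

./giraph -Dgiraph.useSuperstepCounters=false -DSimpleShortestPathsVertex.sourceId=100 -Dgiraph.numComputeThreads=24 -Dhash.userPartitionCount=96 ../target/giraph.jar org.apache.giraph.examples.SimpleShortestPathsVertex -if org.apache.giraph.io.JsonLongDoubleFloatDoubleVertexInputFormat -ip /user/hduser/in -of org.apache.giraph.io.JsonLongDoubleFloatDoubleVertexOutputFormat -op /user/hduser/out-w4 -w 4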

Could you please try increasing your workers and tell me if you also don't
notice any improvement in performance? Furthermore, have you tried running
with both 1 and 3 workers? Did you see any difference there?

I really want to sort this scalability issue out...

On a final note, please don't use the word "node" to describe a lot of
different things; it got me quite confused :-) Your cluster's *nodes* have
*cores*, and your input is a graph with 140k *vertices*.

Cheers,
Alexandros


RE: What a "worker" really is and other interesting runtime information

Posted by "Magyar, Bence (US SSA)" <be...@baesystems.com>.
Folks,

I have some of the same questions as Alexandros below.  What exactly is "a worker"?  I am not sure I understood Avery's answer below.  I have a 4-node cluster.  Each node has 24 cores.  My first node is functioning (in MapReduce parlance) as both a "job tracker" and a "task tracker".  So I have 4 compute nodes.  (I have verified that the master/slave config is correct.)  I am launching the Giraph SimpleShortestPathsVertex example on an input graph with approximately 140,000 nodes / 410,000 edges, and the computation takes approx. 6 minutes.  Although I don't know what a "good" number is, 6 minutes seems rather slow given all the compute horsepower I have at my disposal.  When I monitor "top" on my machines while the computation is running, my cores are ~80-90% idle.

I am launching my job with the following parameters:

./giraph -Dgiraph.useSuperstepCounters=false -DSimpleShortestPathsVertex.sourceId=100 ../target/giraph.jar org.apache.giraph.examples.SimpleShortestPathsVertex -if org.apache.giraph.io.JsonLongDoubleFloatDoubleVertexInputFormat -ip /user/hduser/in -of org.apache.giraph.io.JsonLongDoubleFloatDoubleVertexOutputFormat -op /user/hduser/out -w 3

Note that I have set my number of workers to 3 (-w 3).  Should this be some other value?  Does anyone have any simple configuration suggestions that will help me tune Giraph to my problem?

Thanks!

Bence


Re: What a "worker" really is and other interesting runtime information

Posted by Alexandros Daglis <al...@epfl.ch>.
Ok, so I added the partitions flag, going with

 hadoop jar target/giraph-0.1-jar-with-dependencies.jar org.apache.giraph.examples.SimpleShortestPathsVertex -Dgiraph.SplitMasterWorker=false -Dgiraph.numComputeThreads=12 -Dhash.userPartitionCount=12 input output 12 1

but I still got no overall speedup at all (compared to using 1 thread), and
only 1 out of 12 cores is utilized most of the time. Isn't Giraph supposed
to exploit parallelism to get some speedup? Any other suggestions?

Thanks,
Alexandros


Re: What a "worker" really is and other interesting runtime information

Posted by Avery Ching <ac...@apache.org>.
Oh, forgot one thing.  You need to set the number of partitions to use,
since each thread works on a single partition at a time.

Try -Dhash.userPartitionCount=<number of threads>
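
For the single-machine, 12-thread SSSP run discussed in this thread, the full command would look something like this (a sketch; "input", "output", and the trailing positional arguments are placeholders copied from the earlier commands):

hadoop jar target/giraph-0.1-jar-with-dependencies.jar org.apache.giraph.examples.SimpleShortestPathsVertex -Dgiraph.SplitMasterWorker=false -Dgiraph.numComputeThreads=12 -Dhash.userPartitionCount=12 input output 12 1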


Re: What a "worker" really is and other interesting runtime information

Posted by Alexandros Daglis <al...@epfl.ch>.
Dear Avery,

I followed your advice, but the application seems to be totally
thread-count-insensitive: I observe literally zero performance scaling as I
increase the thread count. Maybe you can point out if I am doing something
wrong.

- Using only 4 cores on a single node at the moment
- Input graph: 14 million vertices, file size is 470 MB
- Running SSSP as follows: hadoop jar target/giraph-0.1-jar-with-dependencies.jar org.apache.giraph.examples.SimpleShortestPathsVertex -Dgiraph.SplitMasterWorker=false -Dgiraph.numComputeThreads=X input output 12 1
where X=1,2,3,12,30
- I notice total insensitivity to the number of threads I specify.
Aggregate core utilization is always approximately the same (usually around
25-30%, i.e., only one of the cores running) and overall execution time is
always the same (~8 mins)

Why is Giraph's performance not scaling? Is the input size / number of
workers inappropriate? It's not an I/O issue either: even during periods of
really low core utilization, the time is spent idle, not waiting on I/O.

Cheers,
Alexandros




Re: What a "worker" really is and other interesting runtime information

Posted by Alexandros Daglis <al...@epfl.ch>.
Thank you Avery, that helped a lot!

Regards,
Alexandros


Re: What a "worker" really is and other interesting runtime information

Posted by Avery Ching <ac...@apache.org>.
Hi Alexandros,

The extra task is for the master process (a coordination task). In your 
case, since you are using a single machine, you can use a single task.

-Dgiraph.SplitMasterWorker=false

and you can try multithreading instead of multiple workers.

-Dgiraph.numComputeThreads=12

The reason CPU usage increases is the Netty threads that handle network
requests.  By using multithreading instead, you should bypass this.

Avery
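
Putting the two options together, a single-machine run of the SSSP example would look something like this (a sketch; the jar name, "input", "output", and the trailing positional arguments follow the other commands in this thread, so adjust them to your setup):

hadoop jar target/giraph-0.1-jar-with-dependencies.jar org.apache.giraph.examples.SimpleShortestPathsVertex -Dgiraph.SplitMasterWorker=false -Dgiraph.numComputeThreads=12 input output 12 1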
