Posted to user@spark.apache.org by Nick Pentreath <ni...@gmail.com> on 2014/06/25 11:48:17 UTC

Re: Is there anyone who can explain why the function of ALS.train give different shuffle results when execute the same transformation flatMap

How many users and items do you have?

Each iteration will first iterate through users and then items, so each
iteration of ALS actually ends up having 2 flatMap operations. I'd assume
that you have many more users than items (or vice versa), which is why one
of the operations generates more data.
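The asymmetry can be illustrated with a simplified model of a blocked ALS iteration. This is only a sketch, not the MLlib code: it assumes users and items are hash-partitioned into blocks, and that a factor vector is sent once to every block on the other side that needs it. The function name `count_factor_messages` and the toy data are made up for illustration.

```python
from collections import defaultdict


def count_factor_messages(ratings, num_blocks):
    """Count factor vectors sent in each half-step of one ALS iteration.

    ratings: iterable of (user_id, item_id) pairs.
    Returns (item_factors_sent, user_factors_sent).
    """
    # User blocks that need each item's factor (blocks holding its raters).
    user_blocks_per_item = defaultdict(set)
    # Item blocks that need each user's factor (blocks holding rated items).
    item_blocks_per_user = defaultdict(set)
    for user, item in ratings:
        user_blocks_per_item[item].add(user % num_blocks)
        item_blocks_per_user[user].add(item % num_blocks)

    # Half-step 1 (update users): item factors travel to user blocks.
    item_factors_sent = sum(len(b) for b in user_blocks_per_item.values())
    # Half-step 2 (update items): user factors travel to item blocks.
    user_factors_sent = sum(len(b) for b in item_blocks_per_user.values())
    return item_factors_sent, user_factors_sent


# Toy data (hypothetical): 1000 users, 10 items, each user rates 3 items.
ratings = [(u, (u + k) % 10) for u in range(1000) for k in range(3)]
items_sent, users_sent = count_factor_messages(ratings, num_blocks=4)
print("item factors sent:", items_sent)
print("user factors sent:", users_sent)
```

With far more users than items, the half-step that ships user factors moves many more vectors than the one that ships item factors, so the two flatMap stages in one iteration would be expected to show very different shuffle sizes.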


On Wed, Jun 25, 2014 at 11:39 AM, Lizhengbing (bing, BIPA) <
zhengbing.li@huawei.com> wrote:

>
>
> Sometimes the shuffle write of flatMap is 14.8G and sometimes it is 647.9M.
>
> Why does this happen?
>
> The training data is about 1.5G, and the number of features is 200.
>
>
>
> Stage Id | Description                             | Submitted           | Duration | Tasks: Succeeded/Total | Shuffle Read | Shuffle Write
> ---------+-----------------------------------------+---------------------+----------+------------------------+--------------+--------------
> 114      | flatMap at ALS.scala:434                | 2014/06/25 17:13:39 | 6.3 min  | 48/48                  | 611.7 MB     | 14.8 GB
> 115      | groupByKey at ALS.scala:442             | 2014/06/25 17:13:34 | 4 s      | 48/48                  | 337.5 MB     | 1275.9 MB
> 116      | flatMap at ALS.scala:434                | 2014/06/25 17:09:02 | 4.5 min  | 48/48                  | 12.2 GB      | 674.9 MB
> 117      | groupByKey at ALS.scala:442             | 2014/06/25 17:07:05 | 2.0 min  | 48/48                  | 7.4 GB       | 25.5 GB
> 118      | flatMap at ALS.scala:434                | 2014/06/25 17:00:41 | 6.4 min  | 48/48                  | 664.2 MB     | 14.8 GB
> 119      | groupByKey at ALS.scala:442             | 2014/06/25 17:00:30 | 10 s     | 48/48                  | 337.4 MB     | 1275.9 MB
> 120      | flatMap at ALS.scala:434                | 2014/06/25 16:55:19 | 5.2 min  | 48/48                  | 12.2 GB      | 674.9 MB
> 121      | groupByKey at ALS.scala:442             | 2014/06/25 16:54:02 | 1.3 min  | 48/48                  | 7.4 GB       | 25.5 GB
> 122      | flatMap at ALS.scala:434                | 2014/06/25 16:53:52 | 9 s      | 48/48                  | 14.8 GB [1]
> 123      | mapPartitionsWithIndex at ALS.scala:200 | 2014/06/25 16:53:40 | 12 s     | 48/48                  | 399.5 MB     | 737.4 MB
> 6        | map at ALS.scala:183                    | 2014/06/25 16:53:01 | 39 s     | 20/20                  | 799.4 MB [1]
> 3        | map at ALS.scala:186                    | 2014/06/25 16:53:01 | 39 s     | 20/20                  | 652.2 MB [1]
>
> [1] Only one shuffle value is listed for this stage; the message does not show whether it is Shuffle Read or Shuffle Write.
>
> Stage links in the original: <http://10.71.123.101:4040/stages/stage?id=123> (stage 123), <http://10.71.123.101:4040/stages/stage?id=6> (stage 6), <http://10.71.123.101:4040/stages/stage?id=3> (stage 3)
>
>
>

Re: Is there anyone who can explain why the function of ALS.train give different shuffle results when execute the same transformation flatMap

Posted by "Lizhengbing (bing, BIPA)" <zh...@huawei.com>.
Thanks, Nick.
I understand the reason now.
Number of users: 480189
Number of items: 17770

It seems that this algorithm needs more memory than I expected.
If each cell in the matrix occupies 8 bytes, the user matrix takes 480189 * 200 * 8 = 768,302,400 bytes in total, which is less than 1G.
But to get this result, it needs to shuffle 14.8G of data in each iteration.
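A rough back-of-the-envelope check of these numbers, as a sketch only. It assumes 8 bytes per cell as stated above, and it assumes 48 blocks, a guess taken from the 48/48 task counts in the stage table. It ignores serialization overhead and compression, so it gives bounds rather than a prediction of the shuffle size.

```python
RANK = 200            # "feature number" from the original question
BYTES_PER_VALUE = 8   # one double per matrix cell, as assumed above
NUM_USERS = 480189
NUM_ITEMS = 17770
NUM_BLOCKS = 48       # assumption: inferred from the 48/48 task counts


def factor_bytes(rows, rank=RANK, width=BYTES_PER_VALUE):
    """Raw size in bytes of a dense rows x rank factor matrix."""
    return rows * rank * width


user_bytes = factor_bytes(NUM_USERS)
item_bytes = factor_bytes(NUM_ITEMS)

# Upper bound: every factor vector is needed by every block on the other
# side, so each vector is copied NUM_BLOCKS times into the shuffle.
max_user_shuffle = user_bytes * NUM_BLOCKS
max_item_shuffle = item_bytes * NUM_BLOCKS

print(f"user factors: {user_bytes:,} bytes, at most {max_user_shuffle:,} shuffled")
print(f"item factors: {item_bytes:,} bytes, at most {max_item_shuffle:,} shuffled")
```

Under these assumptions the user factors alone could account for anywhere between about 0.77 GB and about 36.9 GB of shuffle data per half-step, depending on how many blocks need each vector. The observed 14.8G falls inside that range, which is consistent with each user factor being copied to several blocks rather than shuffled once.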
