Posted to dev@spark.apache.org by Guillaume Pitel <gu...@exensa.com> on 2014/10/06 10:27:06 UTC

TorrentBroadcast slow performance

Hi,

I've had no answer to this on user@spark.apache.org, so I'm posting it to dev
before filing a JIRA (in case the problem or its solution has already been
identified).

We've had some performance issues since switching to 1.1.0, and we finally found
the cause: TorrentBroadcast seems to be very slow in our setup (and it became
the default in 1.1.0).

The logs for a 4 MB variable with TorrentBroadcast (15 s):

14/10/01 15:47:13 INFO storage.MemoryStore: Block broadcast_84_piece1 stored as bytes in memory (estimated size 171.6 KB, free 7.2 GB)
14/10/01 15:47:13 INFO storage.BlockManagerMaster: Updated info of block broadcast_84_piece1
14/10/01 15:47:23 INFO storage.MemoryStore: ensureFreeSpace(4194304) called with curMem=1401611984, maxMem=9168696115
14/10/01 15:47:23 INFO storage.MemoryStore: Block broadcast_84_piece0 stored as bytes in memory (estimated size 4.0 MB, free 7.2 GB)
14/10/01 15:47:23 INFO storage.BlockManagerMaster: Updated info of block broadcast_84_piece0
14/10/01 15:47:23 INFO broadcast.TorrentBroadcast: Reading broadcast variable 84 took 15.202260006 s
14/10/01 15:47:23 INFO storage.MemoryStore: ensureFreeSpace(4371392) called with curMem=1405806288, maxMem=9168696115
14/10/01 15:47:23 INFO storage.MemoryStore: Block broadcast_84 stored as values in memory (estimated size 4.2 MB, free 7.2 GB)

(Notice the 10 s lag between "Updated info of block broadcast_84_piece1" at
15:47:13 and the next MemoryStore entry at 15:47:23.)
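The pause can also be located mechanically from the timestamps. The following is a minimal Python sketch, not part of Spark: the helper names `parse_timestamp` and `largest_gap` are made up for illustration, and it assumes every entry starts with a `yy/MM/dd HH:mm:ss` timestamp, as in the excerpt above.

```python
from datetime import datetime

# The TorrentBroadcast log entries from this message, one entry per string.
LOG_LINES = [
    "14/10/01 15:47:13 INFO storage.MemoryStore: Block broadcast_84_piece1 stored as bytes in memory (estimated size 171.6 KB, free 7.2 GB)",
    "14/10/01 15:47:13 INFO storage.BlockManagerMaster: Updated info of block broadcast_84_piece1",
    "14/10/01 15:47:23 INFO storage.MemoryStore: ensureFreeSpace(4194304) called with curMem=1401611984, maxMem=9168696115",
    "14/10/01 15:47:23 INFO storage.MemoryStore: Block broadcast_84_piece0 stored as bytes in memory (estimated size 4.0 MB, free 7.2 GB)",
    "14/10/01 15:47:23 INFO storage.BlockManagerMaster: Updated info of block broadcast_84_piece0",
    "14/10/01 15:47:23 INFO broadcast.TorrentBroadcast: Reading broadcast variable 84 took 15.202260006 s",
    "14/10/01 15:47:23 INFO storage.MemoryStore: ensureFreeSpace(4371392) called with curMem=1405806288, maxMem=9168696115",
    "14/10/01 15:47:23 INFO storage.MemoryStore: Block broadcast_84 stored as values in memory (estimated size 4.2 MB, free 7.2 GB)",
]


def parse_timestamp(line):
    """Parse the leading 'yy/MM/dd HH:mm:ss' timestamp (17 characters)."""
    return datetime.strptime(line[:17], "%y/%m/%d %H:%M:%S")


def largest_gap(lines):
    """Return (gap_seconds, line_before, line_after) for the longest pause."""
    stamped = [(parse_timestamp(line), line) for line in lines]
    best = (0.0, None, None)
    for (t0, before), (t1, after) in zip(stamped, stamped[1:]):
        gap = (t1 - t0).total_seconds()
        if gap > best[0]:
            best = (gap, before, after)
    return best


if __name__ == "__main__":
    gap, before, after = largest_gap(LOG_LINES)
    print(f"{gap:.0f} s pause")
    print("  after :", before)
    print("  before:", after)
```

Since the log only has one-second resolution, the result is accurate to about a second at best.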

And with HttpBroadcast (0.3 s):

14/10/01 16:05:58 INFO broadcast.HttpBroadcast: Started reading broadcast variable 147
14/10/01 16:05:58 INFO storage.MemoryStore: ensureFreeSpace(4369376) called with curMem=1373493232, maxMem=9168696115
14/10/01 16:05:58 INFO storage.MemoryStore: Block broadcast_147 stored as values in memory (estimated size 4.2 MB, free 7.3 GB)
14/10/01 16:05:58 INFO broadcast.HttpBroadcast: Reading broadcast variable 147 took 0.320907112 s
14/10/01 16:05:58 INFO storage.BlockManager: Found block broadcast_147 locally

Since Torrent is supposed to perform much better than Http, we suspect a
configuration error on our side, but we are unable to pin it down. Does anyone
have an idea what might be causing this?

For now we're sticking with the HttpBroadcast workaround.

Guillaume
-- 
eXenSa

Guillaume PITEL, Président
+33(0)626 222 431

eXenSa S.A.S. <http://www.exensa.com/>
41, rue Périer - 92120 Montrouge - FRANCE
Tel +33(0)184 163 677 / Fax +33(0)972 283 705


Re: TorrentBroadcast slow performance

Posted by Matei Zaharia <ma...@gmail.com>.
Oops, I forgot to add: for 2, maybe we can add a flag to use DISK_ONLY for TorrentBroadcast, or use it automatically when the broadcasts are bigger than some size.

Matei
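
The suggestion above could look roughly like the following sketch. It is only an illustration in Python, not Spark code: the function name and the `disk_only_threshold` parameter are hypothetical, and only the storage level names MEMORY_AND_DISK and DISK_ONLY come from this thread.

```python
from typing import Optional


def choose_broadcast_storage_level(
    size_bytes: int, disk_only_threshold: Optional[int] = None
) -> str:
    """Pick a storage level name for a broadcast of the given serialized size.

    Hypothetical policy: keep MEMORY_AND_DISK unless a threshold is
    configured and the broadcast is larger than that threshold.
    """
    if disk_only_threshold is not None and size_bytes > disk_only_threshold:
        return "DISK_ONLY"
    return "MEMORY_AND_DISK"


if __name__ == "__main__":
    threshold = 512 * 1024 * 1024  # illustrative value, not a Spark default
    print(choose_broadcast_storage_level(4 * 1024 * 1024, threshold))
    print(choose_broadcast_storage_level(1_800_000_000, threshold))
```

With no threshold configured, the sketch always returns MEMORY_AND_DISK, so existing behaviour would be unchanged unless the flag is set.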

On Oct 9, 2014, at 3:04 PM, Matei Zaharia <ma...@gmail.com> wrote:

> Thanks for the feedback. For 1, there is an open patch: https://github.com/apache/spark/pull/2659. For 2, broadcast blocks actually use MEMORY_AND_DISK storage, so they will spill to disk if you have low memory, but they're faster to access otherwise.
> 
> Matei


Re: TorrentBroadcast slow performance

Posted by Matei Zaharia <ma...@gmail.com>.
Thanks for the feedback. For 1, there is an open patch: https://github.com/apache/spark/pull/2659. For 2, broadcast blocks actually use MEMORY_AND_DISK storage, so they will spill to disk if you have low memory, but they're faster to access otherwise.

Matei

On Oct 9, 2014, at 12:11 PM, Guillaume Pitel <gu...@exensa.com> wrote:

> Hi,
> 
> Thanks to your answer, we've found the problem. It was on reverse IP resolution on the drivers we used (wrong configuration of the local bind9). Apparently, not being able to reverse-resolve the IP address of the nodes was the culprit of the 10s delay.
> 
> We've hit two other secondary problems with TorrentBroadcast though, in case you're interested  :
> 
> 1 - Broadcasting a variable of about 2GB (1.8GB exactly) triggers a "java.lang.OutOfMemoryError: Requested array size exceeds VM limit", which is not the case with HttpBroadcast (I guess HttpBroadcast splits the serialized variable in small chunks)
> 2 - Memory use of Torrent seems to be higher than Http (i.e. switching from Http to Torrent triggers several OOM).
> 
> Additionally, a question : while HttpBroadcast stores the broadcast pieces on disk (in spark.local.dir/spark-... ), TorrentBroadcast seems not to use disk backend storage. Does it mean that HttpBroadcast can handle bigger broadcast out of memory ? If so, it's too bad that this design choice wasn't used for Torrent.
> 
> That being said, hats off to the people in charge of the broadcast unloading wrt the lineage, this stuff works great !
> 
> Guillaume


Re: TorrentBroadcast slow performance

Posted by Guillaume Pitel <gu...@exensa.com>.
Hi,

Thanks to your answer, we've found the problem: reverse IP resolution on
the drivers we used (the local bind9 was misconfigured). Apparently, the
10 s delay was caused by the nodes' IP addresses not being
reverse-resolvable.
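
A quick way to check for this kind of problem is to time a reverse lookup for each node address from the driver machine. The following is a minimal sketch under stated assumptions: `time_reverse_lookup` is a made-up helper, the addresses in the usage part are placeholders, and it uses the system resolver through Python's `socket.gethostbyaddr`, which may not behave exactly like the JVM's resolver.

```python
import socket
import time


def time_reverse_lookup(ip, resolver=socket.gethostbyaddr):
    """Return (hostname or None, elapsed seconds) for a reverse lookup of ip.

    `resolver` is injectable so the function can be exercised without DNS.
    """
    start = time.monotonic()
    try:
        hostname = resolver(ip)[0]
    except OSError:
        # socket.herror and socket.gaierror are OSError subclasses: the
        # address has no reverse record, or the resolver itself failed.
        hostname = None
    return hostname, time.monotonic() - start


if __name__ == "__main__":
    # Placeholder addresses: replace with the real driver and worker IPs.
    for ip in ["127.0.0.1", "10.0.0.11"]:
        name, elapsed = time_reverse_lookup(ip)
        print(f"{ip}: {name or 'no reverse record'} ({elapsed:.2f} s)")
```

A lookup that fails or takes several seconds on any node would be consistent with the delay described here.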

We've hit two other, secondary problems with TorrentBroadcast though, in
case you're interested:

1 - Broadcasting a variable of about 2 GB (1.8 GB to be exact) triggers a
"java.lang.OutOfMemoryError: Requested array size exceeds VM limit",
which does not happen with HttpBroadcast (I guess HttpBroadcast splits
the serialized variable into small chunks).
2 - Memory use with Torrent seems to be higher than with Http (i.e.
switching from Http to Torrent triggers several OOM errors).
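
To illustrate what splitting a serialized variable into chunks means, here is a minimal Python sketch. It is not Spark's implementation: the function names are made up, and the 4 MiB piece size is only inferred from the 4194304-byte piece in the TorrentBroadcast log earlier in this thread.

```python
PIECE_SIZE = 4 * 1024 * 1024  # 4194304 bytes, the piece size seen in the log


def split_into_pieces(payload: bytes, piece_size: int = PIECE_SIZE):
    """Split a serialized payload into consecutive pieces of at most piece_size bytes."""
    view = memoryview(payload)  # slicing a memoryview avoids intermediate copies
    return [bytes(view[i:i + piece_size]) for i in range(0, len(payload), piece_size)]


def piece_count(total_bytes: int, piece_size: int = PIECE_SIZE) -> int:
    """Number of pieces needed for total_bytes (ceiling division)."""
    return -(-total_bytes // piece_size)


if __name__ == "__main__":
    print(split_into_pieces(b"abcdefghij", piece_size=4))
    print(piece_count(1_800_000_000))
```

With pieces of this size no single buffer has to hold the whole variable, provided the serializer writes pieces as it goes instead of building one large array first.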

Additionally, a question: while HttpBroadcast stores the broadcast
pieces on disk (in spark.local.dir/spark-... ), TorrentBroadcast does not
seem to use disk-backed storage. Does that mean HttpBroadcast can handle
broadcasts that are too big to fit in memory? If so, it's too bad that
this design choice wasn't used for Torrent.

That being said, hats off to the people in charge of unloading
broadcasts with respect to the lineage; this stuff works great!

Guillaume


> Maybe there is a firewall issue that makes it slow for your nodes to connect through the IP addresses they're configured with. I see there's this 10 second pause between "Updated info of block broadcast_84_piece1" and "ensureFreeSpace(4194304) called" (where it actually receives the block). HTTP broadcast used only HTTP fetches from the executors to the driver, but TorrentBroadcast has connections between the executors themselves and between executors and the driver over a different port. Where are you running your driver app and nodes?
>
> Matei


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
For additional commands, e-mail: dev-help@spark.apache.org


Re: TorrentBroadcast slow performance

Posted by Matei Zaharia <ma...@gmail.com>.
Maybe there is a firewall issue that makes it slow for your nodes to connect through the IP addresses they're configured with. I see there's this 10 second pause between "Updated info of block broadcast_84_piece1" and "ensureFreeSpace(4194304) called" (where it actually receives the block). HTTP broadcast used only HTTP fetches from the executors to the driver, but TorrentBroadcast has connections between the executors themselves and between executors and the driver over a different port. Where are you running your driver app and nodes?

Matei
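
Connection delays of the kind suspected here can be measured directly from each node. The following is a minimal Python sketch, not a Spark tool: `time_connect` is a made-up helper, and the host name and port in the usage part are placeholders, since the thread does not say which ports are involved.

```python
import socket
import time


def time_connect(host, port, timeout=15.0):
    """Try a TCP connection and return (succeeded, elapsed seconds)."""
    start = time.monotonic()
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            succeeded = True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        succeeded = False
    return succeeded, time.monotonic() - start


if __name__ == "__main__":
    # Placeholder peer: replace with an executor or driver host and port.
    ok, elapsed = time_connect("worker-1.example", 7077, timeout=15.0)
    print(f"connected={ok} after {elapsed:.2f} s")
```

Running this between every pair of nodes, and between each node and the driver, shows whether a firewall or name resolution adds seconds to each new connection.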

On Oct 7, 2014, at 11:42 AM, Davies Liu <da...@databricks.com> wrote:

> Could you create a JIRA for it? maybe it's a regression after
> https://issues.apache.org/jira/browse/SPARK-3119.
> 
> We will appreciate that if you could tell how to reproduce it.


Re: TorrentBroadcast slow performance

Posted by Davies Liu <da...@databricks.com>.
Could you create a JIRA for it? Maybe it's a regression after
https://issues.apache.org/jira/browse/SPARK-3119.

We would appreciate it if you could tell us how to reproduce it.
