Posted to user@cassandra.apache.org by Cassa L <lc...@gmail.com> on 2016/06/13 21:56:32 UTC

Spark Memory Error - Not enough space to cache broadcast

Hi,

I'm using Spark version 1.5.1. I am reading data from Kafka into Spark
and writing it into Cassandra after processing it. The Spark job starts
fine and runs well for some time until I start getting the errors
below. Once these errors appear, the job starts to lag behind, and I
see scheduling and processing delays in the streaming UI.

Worker memory is 6 GB and executor memory is 5 GB. I also tried to
tweak the memoryFraction parameters; nothing works.


16/06/13 21:26:02 INFO MemoryStore: ensureFreeSpace(4044) called with
curMem=565394, maxMem=2778495713
16/06/13 21:26:02 INFO MemoryStore: Block broadcast_69652_piece0
stored as bytes in memory (estimated size 3.9 KB, free 2.6 GB)
16/06/13 21:26:02 INFO TorrentBroadcast: Reading broadcast variable
69652 took 2 ms
16/06/13 21:26:02 WARN MemoryStore: Failed to reserve initial memory
threshold of 1024.0 KB for computing block broadcast_69652 in memory.
16/06/13 21:26:02 WARN MemoryStore: Not enough space to cache
broadcast_69652 in memory! (computed 496.0 B so far)
16/06/13 21:26:02 INFO MemoryStore: Memory use = 556.1 KB (blocks) +
2.6 GB (scratch space shared across 0 tasks(s)) = 2.6 GB. Storage
limit = 2.6 GB.
16/06/13 21:26:02 WARN MemoryStore: Persisting block broadcast_69652
to disk instead.
16/06/13 21:26:02 INFO BlockManager: Found block rdd_100761_1 locally
16/06/13 21:26:02 INFO Executor: Finished task 0.0 in stage 71577.0
(TID 452316). 2043 bytes result sent to driver


Thanks,

L

Re: Spark Memory Error - Not enough space to cache broadcast

Posted by Cassa L <lc...@gmail.com>.
Hi,

>
> What do you see under Executors and Details for Stage (for the
> affected stages)? Anything weird memory-related?
>
Under the Executors tab, the logs show these warnings:

16/06/16 20:45:40 INFO TorrentBroadcast: Reading broadcast variable
422145 took 1 ms
16/06/16 20:45:40 WARN MemoryStore: Failed to reserve initial memory
threshold of 1024.0 KB for computing block broadcast_422145 in memory.
16/06/16 20:45:40 WARN MemoryStore: Not enough space to cache
broadcast_422145 in memory! (computed 496.0 B so far)
16/06/16 20:45:40 INFO MemoryStore: Memory use = 147.9 KB (blocks) +
2.2 GB (scratch space shared across 0 tasks(s)) = 2.2 GB. Storage
limit = 2.2 GB.
16/06/16 20:45:40 WARN MemoryStore: Persisting block broadcast_422145
to disk instead.
16/06/16 20:45:40 INFO MapOutputTrackerWorker: Don't have map outputs
for shuffle 70278, fetching them

16/06/16 20:45:40 INFO MapOutputTrackerWorker: Doing the fetch; tracker
endpoint = AkkaRpcEndpointRef(Actor[akka.tcp://
sparkDriver@17.40.240.71:46187/user/MapOutputTracker#-1794035569])

I don't see any memory-related errors on the 'Stages' tab.

>
> How does your "I am reading data from Kafka into Spark and writing it
> into Cassandra after processing it." pipeline look like?
>
This part has no issues. Reading from Kafka is always up to date; there
are no offset lags. Writing to Cassandra is also fine, taking less than
1 ms to write data.


> Pozdrawiam,
> Jacek Laskowski
> ----
> https://medium.com/@jaceklaskowski/
> Mastering Apache Spark http://bit.ly/mastering-apache-spark
> Follow me at https://twitter.com/jaceklaskowski
>
>
> On Mon, Jun 13, 2016 at 11:56 PM, Cassa L <lc...@gmail.com> wrote:
> > Hi,
> >
> > I'm using spark 1.5.1 version. I am reading data from Kafka into Spark
> and
> > writing it into Cassandra after processing it. Spark job starts fine and
> > runs all good for some time until I start getting below errors. Once
> these
> > errors come, job start to lag behind and I see that job has scheduling
> and
> > processing delays in streaming  UI.
> >
> > Worker memory is 6GB, executor-memory is 5GB, I also tried to tweak
> > memoryFraction parameters. Nothing works.
> >
> >
> > 16/06/13 21:26:02 INFO MemoryStore: ensureFreeSpace(4044) called with
> > curMem=565394, maxMem=2778495713
> > 16/06/13 21:26:02 INFO MemoryStore: Block broadcast_69652_piece0 stored
> as
> > bytes in memory (estimated size 3.9 KB, free 2.6 GB)
> > 16/06/13 21:26:02 INFO TorrentBroadcast: Reading broadcast variable 69652
> > took 2 ms
> > 16/06/13 21:26:02 WARN MemoryStore: Failed to reserve initial memory
> > threshold of 1024.0 KB for computing block broadcast_69652 in memory.
> > 16/06/13 21:26:02 WARN MemoryStore: Not enough space to cache
> > broadcast_69652 in memory! (computed 496.0 B so far)
> > 16/06/13 21:26:02 INFO MemoryStore: Memory use = 556.1 KB (blocks) + 2.6
> GB
> > (scratch space shared across 0 tasks(s)) = 2.6 GB. Storage limit = 2.6
> GB.
> > 16/06/13 21:26:02 WARN MemoryStore: Persisting block broadcast_69652 to
> disk
> > instead.
> > 16/06/13 21:26:02 INFO BlockManager: Found block rdd_100761_1 locally
> > 16/06/13 21:26:02 INFO Executor: Finished task 0.0 in stage 71577.0 (TID
> > 452316). 2043 bytes result sent to driver
> >
> >
> > Thanks,
> >
> > L
>

Re: Spark Memory Error - Not enough space to cache broadcast

Posted by Deepak Goel <de...@gmail.com>.
It seems like the executor memory is not enough for your job, so it is
writing objects to disk.
On Jun 17, 2016 2:25 AM, "Cassa L" <lc...@gmail.com> wrote:

>
>
> On Thu, Jun 16, 2016 at 5:27 AM, Deepak Goel <de...@gmail.com> wrote:
>
>> What is your hardware configuration like which you are running Spark on?
>>
>> It  is 24core, 128GB RAM
>
>> Hey
>>
>> Namaskara~Nalama~Guten Tag~Bonjour
>>
>>
>>    --
>> Keigu
>>
>> Deepak
>> 73500 12833
>> www.simtree.net, deepak@simtree.net
>> deicool@gmail.com
>>
>> LinkedIn: www.linkedin.com/in/deicool
>> Skype: thumsupdeicool
>> Google talk: deicool
>> Blog: http://loveandfearless.wordpress.com
>> Facebook: http://www.facebook.com/deicool
>>
>> "Contribute to the world, environment and more :
>> http://www.gridrepublic.org
>> "
>>
>> On Thu, Jun 16, 2016 at 5:33 PM, Jacek Laskowski <ja...@japila.pl> wrote:
>>
>>> Hi,
>>>
>>> What do you see under Executors and Details for Stage (for the
>>> affected stages)? Anything weird memory-related?
>>>
>>> How does your "I am reading data from Kafka into Spark and writing it
>>> into Cassandra after processing it." pipeline look like?
>>>
>>> Pozdrawiam,
>>> Jacek Laskowski
>>> ----
>>> https://medium.com/@jaceklaskowski/
>>> Mastering Apache Spark http://bit.ly/mastering-apache-spark
>>> Follow me at https://twitter.com/jaceklaskowski
>>>
>>>
>>> On Mon, Jun 13, 2016 at 11:56 PM, Cassa L <lc...@gmail.com> wrote:
>>> > Hi,
>>> >
>>> > I'm using spark 1.5.1 version. I am reading data from Kafka into Spark
>>> and
>>> > writing it into Cassandra after processing it. Spark job starts fine
>>> and
>>> > runs all good for some time until I start getting below errors. Once
>>> these
>>> > errors come, job start to lag behind and I see that job has scheduling
>>> and
>>> > processing delays in streaming  UI.
>>> >
>>> > Worker memory is 6GB, executor-memory is 5GB, I also tried to tweak
>>> > memoryFraction parameters. Nothing works.
>>> >
>>> >
>>> > 16/06/13 21:26:02 INFO MemoryStore: ensureFreeSpace(4044) called with
>>> > curMem=565394, maxMem=2778495713
>>> > 16/06/13 21:26:02 INFO MemoryStore: Block broadcast_69652_piece0
>>> stored as
>>> > bytes in memory (estimated size 3.9 KB, free 2.6 GB)
>>> > 16/06/13 21:26:02 INFO TorrentBroadcast: Reading broadcast variable
>>> 69652
>>> > took 2 ms
>>> > 16/06/13 21:26:02 WARN MemoryStore: Failed to reserve initial memory
>>> > threshold of 1024.0 KB for computing block broadcast_69652 in memory.
>>> > 16/06/13 21:26:02 WARN MemoryStore: Not enough space to cache
>>> > broadcast_69652 in memory! (computed 496.0 B so far)
>>> > 16/06/13 21:26:02 INFO MemoryStore: Memory use = 556.1 KB (blocks) +
>>> 2.6 GB
>>> > (scratch space shared across 0 tasks(s)) = 2.6 GB. Storage limit = 2.6
>>> GB.
>>> > 16/06/13 21:26:02 WARN MemoryStore: Persisting block broadcast_69652
>>> to disk
>>> > instead.
>>> > 16/06/13 21:26:02 INFO BlockManager: Found block rdd_100761_1 locally
>>> > 16/06/13 21:26:02 INFO Executor: Finished task 0.0 in stage 71577.0
>>> (TID
>>> > 452316). 2043 bytes result sent to driver
>>> >
>>> >
>>> > Thanks,
>>> >
>>> > L
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
>>> For additional commands, e-mail: user-help@spark.apache.org
>>>
>>>
>>
>

Re: Spark Memory Error - Not enough space to cache broadcast

Posted by Cassa L <lc...@gmail.com>.
On Thu, Jun 16, 2016 at 5:27 AM, Deepak Goel <de...@gmail.com> wrote:

> What is your hardware configuration like which you are running Spark on?
>
It is a 24-core machine with 128 GB RAM.

> Hey
>
> Namaskara~Nalama~Guten Tag~Bonjour
>
>
>    --
> Keigu
>
> Deepak
> 73500 12833
> www.simtree.net, deepak@simtree.net
> deicool@gmail.com
>
> LinkedIn: www.linkedin.com/in/deicool
> Skype: thumsupdeicool
> Google talk: deicool
> Blog: http://loveandfearless.wordpress.com
> Facebook: http://www.facebook.com/deicool
>
> "Contribute to the world, environment and more :
> http://www.gridrepublic.org
> "
>
> On Thu, Jun 16, 2016 at 5:33 PM, Jacek Laskowski <ja...@japila.pl> wrote:
>
>> Hi,
>>
>> What do you see under Executors and Details for Stage (for the
>> affected stages)? Anything weird memory-related?
>>
>> How does your "I am reading data from Kafka into Spark and writing it
>> into Cassandra after processing it." pipeline look like?
>>
>> Pozdrawiam,
>> Jacek Laskowski
>> ----
>> https://medium.com/@jaceklaskowski/
>> Mastering Apache Spark http://bit.ly/mastering-apache-spark
>> Follow me at https://twitter.com/jaceklaskowski
>>
>>
>> On Mon, Jun 13, 2016 at 11:56 PM, Cassa L <lc...@gmail.com> wrote:
>> > Hi,
>> >
>> > I'm using spark 1.5.1 version. I am reading data from Kafka into Spark
>> and
>> > writing it into Cassandra after processing it. Spark job starts fine and
>> > runs all good for some time until I start getting below errors. Once
>> these
>> > errors come, job start to lag behind and I see that job has scheduling
>> and
>> > processing delays in streaming  UI.
>> >
>> > Worker memory is 6GB, executor-memory is 5GB, I also tried to tweak
>> > memoryFraction parameters. Nothing works.
>> >
>> >
>> > 16/06/13 21:26:02 INFO MemoryStore: ensureFreeSpace(4044) called with
>> > curMem=565394, maxMem=2778495713
>> > 16/06/13 21:26:02 INFO MemoryStore: Block broadcast_69652_piece0 stored
>> as
>> > bytes in memory (estimated size 3.9 KB, free 2.6 GB)
>> > 16/06/13 21:26:02 INFO TorrentBroadcast: Reading broadcast variable
>> 69652
>> > took 2 ms
>> > 16/06/13 21:26:02 WARN MemoryStore: Failed to reserve initial memory
>> > threshold of 1024.0 KB for computing block broadcast_69652 in memory.
>> > 16/06/13 21:26:02 WARN MemoryStore: Not enough space to cache
>> > broadcast_69652 in memory! (computed 496.0 B so far)
>> > 16/06/13 21:26:02 INFO MemoryStore: Memory use = 556.1 KB (blocks) +
>> 2.6 GB
>> > (scratch space shared across 0 tasks(s)) = 2.6 GB. Storage limit = 2.6
>> GB.
>> > 16/06/13 21:26:02 WARN MemoryStore: Persisting block broadcast_69652 to
>> disk
>> > instead.
>> > 16/06/13 21:26:02 INFO BlockManager: Found block rdd_100761_1 locally
>> > 16/06/13 21:26:02 INFO Executor: Finished task 0.0 in stage 71577.0 (TID
>> > 452316). 2043 bytes result sent to driver
>> >
>> >
>> > Thanks,
>> >
>> > L
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
>> For additional commands, e-mail: user-help@spark.apache.org
>>
>>
>

Re: Spark Memory Error - Not enough space to cache broadcast

Posted by Deepak Goel <de...@gmail.com>.
What is your hardware configuration like which you are running Spark on?

Hey

Namaskara~Nalama~Guten Tag~Bonjour


   --
Keigu

Deepak
73500 12833
www.simtree.net, deepak@simtree.net
deicool@gmail.com

LinkedIn: www.linkedin.com/in/deicool
Skype: thumsupdeicool
Google talk: deicool
Blog: http://loveandfearless.wordpress.com
Facebook: http://www.facebook.com/deicool

"Contribute to the world, environment and more : http://www.gridrepublic.org
"

On Thu, Jun 16, 2016 at 5:33 PM, Jacek Laskowski <ja...@japila.pl> wrote:

> Hi,
>
> What do you see under Executors and Details for Stage (for the
> affected stages)? Anything weird memory-related?
>
> How does your "I am reading data from Kafka into Spark and writing it
> into Cassandra after processing it." pipeline look like?
>
> Pozdrawiam,
> Jacek Laskowski
> ----
> https://medium.com/@jaceklaskowski/
> Mastering Apache Spark http://bit.ly/mastering-apache-spark
> Follow me at https://twitter.com/jaceklaskowski
>
>
> On Mon, Jun 13, 2016 at 11:56 PM, Cassa L <lc...@gmail.com> wrote:
> > Hi,
> >
> > I'm using spark 1.5.1 version. I am reading data from Kafka into Spark
> and
> > writing it into Cassandra after processing it. Spark job starts fine and
> > runs all good for some time until I start getting below errors. Once
> these
> > errors come, job start to lag behind and I see that job has scheduling
> and
> > processing delays in streaming  UI.
> >
> > Worker memory is 6GB, executor-memory is 5GB, I also tried to tweak
> > memoryFraction parameters. Nothing works.
> >
> >
> > 16/06/13 21:26:02 INFO MemoryStore: ensureFreeSpace(4044) called with
> > curMem=565394, maxMem=2778495713
> > 16/06/13 21:26:02 INFO MemoryStore: Block broadcast_69652_piece0 stored
> as
> > bytes in memory (estimated size 3.9 KB, free 2.6 GB)
> > 16/06/13 21:26:02 INFO TorrentBroadcast: Reading broadcast variable 69652
> > took 2 ms
> > 16/06/13 21:26:02 WARN MemoryStore: Failed to reserve initial memory
> > threshold of 1024.0 KB for computing block broadcast_69652 in memory.
> > 16/06/13 21:26:02 WARN MemoryStore: Not enough space to cache
> > broadcast_69652 in memory! (computed 496.0 B so far)
> > 16/06/13 21:26:02 INFO MemoryStore: Memory use = 556.1 KB (blocks) + 2.6
> GB
> > (scratch space shared across 0 tasks(s)) = 2.6 GB. Storage limit = 2.6
> GB.
> > 16/06/13 21:26:02 WARN MemoryStore: Persisting block broadcast_69652 to
> disk
> > instead.
> > 16/06/13 21:26:02 INFO BlockManager: Found block rdd_100761_1 locally
> > 16/06/13 21:26:02 INFO Executor: Finished task 0.0 in stage 71577.0 (TID
> > 452316). 2043 bytes result sent to driver
> >
> >
> > Thanks,
> >
> > L
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
> For additional commands, e-mail: user-help@spark.apache.org
>
>

Re: Spark Memory Error - Not enough space to cache broadcast

Posted by Jacek Laskowski <ja...@japila.pl>.
Hi,

What do you see under Executors and Details for Stage (for the
affected stages)? Anything weird memory-related?

How does your "I am reading data from Kafka into Spark and writing it
into Cassandra after processing it." pipeline look like?

Pozdrawiam,
Jacek Laskowski
----
https://medium.com/@jaceklaskowski/
Mastering Apache Spark http://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski


On Mon, Jun 13, 2016 at 11:56 PM, Cassa L <lc...@gmail.com> wrote:
> Hi,
>
> I'm using spark 1.5.1 version. I am reading data from Kafka into Spark and
> writing it into Cassandra after processing it. Spark job starts fine and
> runs all good for some time until I start getting below errors. Once these
> errors come, job start to lag behind and I see that job has scheduling and
> processing delays in streaming  UI.
>
> Worker memory is 6GB, executor-memory is 5GB, I also tried to tweak
> memoryFraction parameters. Nothing works.
>
>
> 16/06/13 21:26:02 INFO MemoryStore: ensureFreeSpace(4044) called with
> curMem=565394, maxMem=2778495713
> 16/06/13 21:26:02 INFO MemoryStore: Block broadcast_69652_piece0 stored as
> bytes in memory (estimated size 3.9 KB, free 2.6 GB)
> 16/06/13 21:26:02 INFO TorrentBroadcast: Reading broadcast variable 69652
> took 2 ms
> 16/06/13 21:26:02 WARN MemoryStore: Failed to reserve initial memory
> threshold of 1024.0 KB for computing block broadcast_69652 in memory.
> 16/06/13 21:26:02 WARN MemoryStore: Not enough space to cache
> broadcast_69652 in memory! (computed 496.0 B so far)
> 16/06/13 21:26:02 INFO MemoryStore: Memory use = 556.1 KB (blocks) + 2.6 GB
> (scratch space shared across 0 tasks(s)) = 2.6 GB. Storage limit = 2.6 GB.
> 16/06/13 21:26:02 WARN MemoryStore: Persisting block broadcast_69652 to disk
> instead.
> 16/06/13 21:26:02 INFO BlockManager: Found block rdd_100761_1 locally
> 16/06/13 21:26:02 INFO Executor: Finished task 0.0 in stage 71577.0 (TID
> 452316). 2043 bytes result sent to driver
>
>
> Thanks,
>
> L

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org


Re: Spark Memory Error - Not enough space to cache broadcast

Posted by Cassa L <lc...@gmail.com>.
Hi,
Upgrading Spark is not an option right now. I did set --driver-memory 4G;
I still run into this issue after an hour of data load.

LCassa


On Tue, Jun 14, 2016 at 3:57 PM, Gaurav Bhatnagar <ga...@gmail.com>
wrote:

> try setting the option --driver-memory 4G
>
> On Tue, Jun 14, 2016 at 3:52 PM, Ben Slater <be...@instaclustr.com>
> wrote:
>
>> A high level shot in the dark but in our testing we found Spark 1.6 a lot
>> more reliable in low memory situations (presumably due to
>> https://issues.apache.org/jira/browse/SPARK-10000). If it’s an option,
>> probably worth a try.
>>
>> Cheers
>> Ben
>>
>> On Wed, 15 Jun 2016 at 08:48 Cassa L <lc...@gmail.com> wrote:
>>
>>> Hi,
>>> I would appreciate any clue on this. It has become a bottleneck for our
>>> spark job.
>>>
>>> On Mon, Jun 13, 2016 at 2:56 PM, Cassa L <lc...@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> I'm using spark 1.5.1 version. I am reading data from Kafka into Spark and writing it into Cassandra after processing it. Spark job starts fine and runs all good for some time until I start getting below errors. Once these errors come, job start to lag behind and I see that job has scheduling and processing delays in streaming  UI.
>>>>
>>>> Worker memory is 6GB, executor-memory is 5GB, I also tried to tweak memoryFraction parameters. Nothing works.
>>>>
>>>>
>>>> 16/06/13 21:26:02 INFO MemoryStore: ensureFreeSpace(4044) called with curMem=565394, maxMem=2778495713
>>>> 16/06/13 21:26:02 INFO MemoryStore: Block broadcast_69652_piece0 stored as bytes in memory (estimated size 3.9 KB, free 2.6 GB)
>>>> 16/06/13 21:26:02 INFO TorrentBroadcast: Reading broadcast variable 69652 took 2 ms
>>>> 16/06/13 21:26:02 WARN MemoryStore: Failed to reserve initial memory threshold of 1024.0 KB for computing block broadcast_69652 in memory.
>>>> 16/06/13 21:26:02 WARN MemoryStore: Not enough space to cache broadcast_69652 in memory! (computed 496.0 B so far)
>>>> 16/06/13 21:26:02 INFO MemoryStore: Memory use = 556.1 KB (blocks) + 2.6 GB (scratch space shared across 0 tasks(s)) = 2.6 GB. Storage limit = 2.6 GB.
>>>> 16/06/13 21:26:02 WARN MemoryStore: Persisting block broadcast_69652 to disk instead.
>>>> 16/06/13 21:26:02 INFO BlockManager: Found block rdd_100761_1 locally
>>>> 16/06/13 21:26:02 INFO Executor: Finished task 0.0 in stage 71577.0 (TID 452316). 2043 bytes result sent to driver
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> L
>>>>
>>>>
>>> --
>> ————————
>> Ben Slater
>> Chief Product Officer
>> Instaclustr: Cassandra + Spark - Managed | Consulting | Support
>> +61 437 929 798
>>
>
>
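[Editorial note on Ben's SPARK-10000 pointer quoted above: Spark 1.6 replaced the fixed storage/shuffle fractions with a unified memory manager in which storage and execution borrow from a shared pool. A rough sizing sketch, assuming the documented 1.6 defaults (300 MB reserved system memory, spark.memory.fraction = 0.75); this is an illustration, not pyspark API.]

```python
# Sketch of Spark 1.6's unified memory pool size (assumed 1.6 defaults).
RESERVED_MB = 300          # reserved system memory in Spark 1.6
MEMORY_FRACTION = 0.75     # spark.memory.fraction default in 1.6

def unified_memory_mb(executor_heap_mb):
    # Storage and execution share this single pool; either side can borrow
    # unused memory from the other, which is why 1.6 tends to behave better
    # under memory pressure than the fixed fractions in 1.5.
    return (executor_heap_mb - RESERVED_MB) * MEMORY_FRACTION

print(unified_memory_mb(5 * 1024))  # 5 GB executor heap -> 3615.0 MB
```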

Re: Spark Memory Error - Not enough space to cache broadcast

Posted by Takeshi Yamamuro <li...@gmail.com>.
Hi,

Have you checked the storage memory statistics?

// maropu

On Thu, Jun 16, 2016 at 1:37 PM, Cassa L <lc...@gmail.com> wrote:

> Hi,
>  I did set  --driver-memory 4G. I still run into this issue after 1 hour
> of data load.
>
> I also tried version 1.6 in test environment. I hit this issue much faster
> than in 1.5.1 setup.
> LCassa
>
> On Tue, Jun 14, 2016 at 3:57 PM, Gaurav Bhatnagar <ga...@gmail.com>
> wrote:
>
>> try setting the option --driver-memory 4G
>>
>> On Tue, Jun 14, 2016 at 3:52 PM, Ben Slater <be...@instaclustr.com>
>> wrote:
>>
>>> A high level shot in the dark but in our testing we found Spark 1.6 a
>>> lot more reliable in low memory situations (presumably due to
>>> https://issues.apache.org/jira/browse/SPARK-10000). If it’s an option,
>>> probably worth a try.
>>>
>>> Cheers
>>> Ben
>>>
>>> On Wed, 15 Jun 2016 at 08:48 Cassa L <lc...@gmail.com> wrote:
>>>
>>>> Hi,
>>>> I would appreciate any clue on this. It has become a bottleneck for our
>>>> spark job.
>>>>
>>>> On Mon, Jun 13, 2016 at 2:56 PM, Cassa L <lc...@gmail.com> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I'm using spark 1.5.1 version. I am reading data from Kafka into Spark and writing it into Cassandra after processing it. Spark job starts fine and runs all good for some time until I start getting below errors. Once these errors come, job start to lag behind and I see that job has scheduling and processing delays in streaming  UI.
>>>>>
>>>>> Worker memory is 6GB, executor-memory is 5GB, I also tried to tweak memoryFraction parameters. Nothing works.
>>>>>
>>>>>
>>>>> 16/06/13 21:26:02 INFO MemoryStore: ensureFreeSpace(4044) called with curMem=565394, maxMem=2778495713
>>>>> 16/06/13 21:26:02 INFO MemoryStore: Block broadcast_69652_piece0 stored as bytes in memory (estimated size 3.9 KB, free 2.6 GB)
>>>>> 16/06/13 21:26:02 INFO TorrentBroadcast: Reading broadcast variable 69652 took 2 ms
>>>>> 16/06/13 21:26:02 WARN MemoryStore: Failed to reserve initial memory threshold of 1024.0 KB for computing block broadcast_69652 in memory.
>>>>> 16/06/13 21:26:02 WARN MemoryStore: Not enough space to cache broadcast_69652 in memory! (computed 496.0 B so far)
>>>>> 16/06/13 21:26:02 INFO MemoryStore: Memory use = 556.1 KB (blocks) + 2.6 GB (scratch space shared across 0 tasks(s)) = 2.6 GB. Storage limit = 2.6 GB.
>>>>> 16/06/13 21:26:02 WARN MemoryStore: Persisting block broadcast_69652 to disk instead.
>>>>> 16/06/13 21:26:02 INFO BlockManager: Found block rdd_100761_1 locally
>>>>> 16/06/13 21:26:02 INFO Executor: Finished task 0.0 in stage 71577.0 (TID 452316). 2043 bytes result sent to driver
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> L
>>>>>
>>>>>
>>>> --
>>> ————————
>>> Ben Slater
>>> Chief Product Officer
>>> Instaclustr: Cassandra + Spark - Managed | Consulting | Support
>>> +61 437 929 798
>>>
>>
>>
>


-- 
---
Takeshi Yamamuro

Re: Spark Memory Error - Not enough space to cache broadcast

Posted by Dennis Lovely <dl...@aegisco.com>.
I believe you want to set memoryFraction higher, not lower. These two
older threads seem to describe issues similar to the ones you are
experiencing:

https://mail-archives.apache.org/mod_mbox/spark-user/201503.mbox/%3CCAHUQ+_ZqaWFs_MJ=+V49bD2paKvjLErPKMEW5duLO1jAo4=d1A@mail.gmail.com%3E
https://www.mail-archive.com/user@spark.apache.org/msg44793.html

More info on tuning shuffle behavior:
https://spark.apache.org/docs/1.5.1/configuration.html#shuffle-behavior
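[Editorial note: a quick sanity check on the fraction values discussed in this thread, using the same legacy-model arithmetic as Spark 1.5 (safetyFraction 0.9 assumed). With the 5 GB executor heap from the original post, a memoryFraction of 0.6 implies roughly the 2.6 GB storage limit in the first log, and 0.5 implies roughly the 2.2 GB limit in the later log.]

```python
# Sketch: storage limits implied by different spark.storage.memoryFraction
# values in Spark 1.5's legacy model, for a 5 GB executor heap.
SAFETY_FRACTION = 0.9  # spark.storage.safetyFraction (assumed default)

def storage_limit_gb(executor_heap_gb, memory_fraction):
    return executor_heap_gb * memory_fraction * SAFETY_FRACTION

for fraction in (0.5, 0.6, 0.7):
    print(fraction, round(storage_limit_gb(5, fraction), 2))
# 0.5 -> 2.25 (the later log reports "Storage limit = 2.2 GB")
# 0.6 -> 2.7  (the first log reports "Storage limit = 2.6 GB")
```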

On Thu, Jun 16, 2016 at 1:57 PM, Cassa L <lc...@gmail.com> wrote:

> Hi Dennis,
>
> On Wed, Jun 15, 2016 at 11:39 PM, Dennis Lovely <dl...@aegisco.com> wrote:
>
>> You could try tuning spark.shuffle.memoryFraction and
>> spark.storage.memoryFraction (both of which have been deprecated in 1.6),
>> but ultimately you need to find out where you are bottlenecked and address
>> that as adjusting memoryFraction will only be a stopgap.  both shuffle and
>> storage memoryFractions default to 0.6
>>
>> I have set above parameters to 0.5. Does it need to increased?
>
> Thanks.
>
>> On Wed, Jun 15, 2016 at 9:37 PM, Cassa L <lc...@gmail.com> wrote:
>>
>>> Hi,
>>>  I did set  --driver-memory 4G. I still run into this issue after 1
>>> hour of data load.
>>>
>>> I also tried version 1.6 in test environment. I hit this issue much
>>> faster than in 1.5.1 setup.
>>> LCassa
>>>
>>> On Tue, Jun 14, 2016 at 3:57 PM, Gaurav Bhatnagar <ga...@gmail.com>
>>> wrote:
>>>
>>>> try setting the option --driver-memory 4G
>>>>
>>>> On Tue, Jun 14, 2016 at 3:52 PM, Ben Slater <ben.slater@instaclustr.com
>>>> > wrote:
>>>>
>>>>> A high level shot in the dark but in our testing we found Spark 1.6 a
>>>>> lot more reliable in low memory situations (presumably due to
>>>>> https://issues.apache.org/jira/browse/SPARK-10000). If it’s an
>>>>> option, probably worth a try.
>>>>>
>>>>> Cheers
>>>>> Ben
>>>>>
>>>>> On Wed, 15 Jun 2016 at 08:48 Cassa L <lc...@gmail.com> wrote:
>>>>>
>>>>>> Hi,
>>>>>> I would appreciate any clue on this. It has become a bottleneck for
>>>>>> our spark job.
>>>>>>
>>>>>> On Mon, Jun 13, 2016 at 2:56 PM, Cassa L <lc...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I'm using spark 1.5.1 version. I am reading data from Kafka into Spark and writing it into Cassandra after processing it. Spark job starts fine and runs all good for some time until I start getting below errors. Once these errors come, job start to lag behind and I see that job has scheduling and processing delays in streaming  UI.
>>>>>>>
>>>>>>> Worker memory is 6GB, executor-memory is 5GB, I also tried to tweak memoryFraction parameters. Nothing works.
>>>>>>>
>>>>>>>
>>>>>>> 16/06/13 21:26:02 INFO MemoryStore: ensureFreeSpace(4044) called with curMem=565394, maxMem=2778495713
>>>>>>> 16/06/13 21:26:02 INFO MemoryStore: Block broadcast_69652_piece0 stored as bytes in memory (estimated size 3.9 KB, free 2.6 GB)
>>>>>>> 16/06/13 21:26:02 INFO TorrentBroadcast: Reading broadcast variable 69652 took 2 ms
>>>>>>> 16/06/13 21:26:02 WARN MemoryStore: Failed to reserve initial memory threshold of 1024.0 KB for computing block broadcast_69652 in memory.
>>>>>>> 16/06/13 21:26:02 WARN MemoryStore: Not enough space to cache broadcast_69652 in memory! (computed 496.0 B so far)
>>>>>>> 16/06/13 21:26:02 INFO MemoryStore: Memory use = 556.1 KB (blocks) + 2.6 GB (scratch space shared across 0 tasks(s)) = 2.6 GB. Storage limit = 2.6 GB.
>>>>>>> 16/06/13 21:26:02 WARN MemoryStore: Persisting block broadcast_69652 to disk instead.
>>>>>>> 16/06/13 21:26:02 INFO BlockManager: Found block rdd_100761_1 locally
>>>>>>> 16/06/13 21:26:02 INFO Executor: Finished task 0.0 in stage 71577.0 (TID 452316). 2043 bytes result sent to driver
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> L
>>>>>>>
>>>>>>>
>>>>>> --
>>>>> ————————
>>>>> Ben Slater
>>>>> Chief Product Officer
>>>>> Instaclustr: Cassandra + Spark - Managed | Consulting | Support
>>>>> +61 437 929 798
>>>>>
>>>>
>>>>
>>>
>>
>

Re: Spark Memory Error - Not enough space to cache broadcast

Posted by Cassa L <lc...@gmail.com>.
Hi Dennis,

On Wed, Jun 15, 2016 at 11:39 PM, Dennis Lovely <dl...@aegisco.com> wrote:

> You could try tuning spark.shuffle.memoryFraction and
> spark.storage.memoryFraction (both of which are deprecated as of 1.6),
> but ultimately you need to find out where you are bottlenecked and address
> that, as adjusting the memoryFractions is only a stopgap. spark.shuffle.memoryFraction
> defaults to 0.2 and spark.storage.memoryFraction to 0.6.
>
> I have set the above parameters to 0.5. Do they need to be increased?

Thanks.

> On Wed, Jun 15, 2016 at 9:37 PM, Cassa L <lc...@gmail.com> wrote:
>
>> Hi,
>>  I did set  --driver-memory 4G. I still run into this issue after 1 hour
>> of data load.
>>
>> I also tried version 1.6 in test environment. I hit this issue much
>> faster than in 1.5.1 setup.
>> LCassa
>>
>> On Tue, Jun 14, 2016 at 3:57 PM, Gaurav Bhatnagar <ga...@gmail.com>
>> wrote:
>>
>>> try setting the option --driver-memory 4G
>>>
>>> On Tue, Jun 14, 2016 at 3:52 PM, Ben Slater <be...@instaclustr.com>
>>> wrote:
>>>
>>>> A high level shot in the dark but in our testing we found Spark 1.6 a
>>>> lot more reliable in low memory situations (presumably due to
>>>> https://issues.apache.org/jira/browse/SPARK-10000). If it’s an option,
>>>> probably worth a try.
>>>>
>>>> Cheers
>>>> Ben
>>>>
>>>> On Wed, 15 Jun 2016 at 08:48 Cassa L <lc...@gmail.com> wrote:
>>>>
>>>>> Hi,
>>>>> I would appreciate any clue on this. It has become a bottleneck for
>>>>> our spark job.
>>>>>
>>>>> On Mon, Jun 13, 2016 at 2:56 PM, Cassa L <lc...@gmail.com> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I'm using spark 1.5.1 version. I am reading data from Kafka into Spark and writing it into Cassandra after processing it. Spark job starts fine and runs all good for some time until I start getting below errors. Once these errors come, job start to lag behind and I see that job has scheduling and processing delays in streaming  UI.
>>>>>>
>>>>>> Worker memory is 6GB, executor-memory is 5GB, I also tried to tweak memoryFraction parameters. Nothing works.
>>>>>>
>>>>>>
>>>>>> 16/06/13 21:26:02 INFO MemoryStore: ensureFreeSpace(4044) called with curMem=565394, maxMem=2778495713
>>>>>> 16/06/13 21:26:02 INFO MemoryStore: Block broadcast_69652_piece0 stored as bytes in memory (estimated size 3.9 KB, free 2.6 GB)
>>>>>> 16/06/13 21:26:02 INFO TorrentBroadcast: Reading broadcast variable 69652 took 2 ms
>>>>>> 16/06/13 21:26:02 WARN MemoryStore: Failed to reserve initial memory threshold of 1024.0 KB for computing block broadcast_69652 in memory.
>>>>>> 16/06/13 21:26:02 WARN MemoryStore: Not enough space to cache broadcast_69652 in memory! (computed 496.0 B so far)
>>>>>> 16/06/13 21:26:02 INFO MemoryStore: Memory use = 556.1 KB (blocks) + 2.6 GB (scratch space shared across 0 tasks(s)) = 2.6 GB. Storage limit = 2.6 GB.
>>>>>> 16/06/13 21:26:02 WARN MemoryStore: Persisting block broadcast_69652 to disk instead.
>>>>>> 16/06/13 21:26:02 INFO BlockManager: Found block rdd_100761_1 locally
>>>>>> 16/06/13 21:26:02 INFO Executor: Finished task 0.0 in stage 71577.0 (TID 452316). 2043 bytes result sent to driver
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> L
>>>>>>
>>>>>>
>>>>> --
>>>> ————————
>>>> Ben Slater
>>>> Chief Product Officer
>>>> Instaclustr: Cassandra + Spark - Managed | Consulting | Support
>>>> +61 437 929 798
>>>>
>>>
>>>
>>
>

Re: Spark Memory Error - Not enough space to cache broadcast

Posted by Dennis Lovely <dl...@aegisco.com>.
You could try tuning spark.shuffle.memoryFraction and
spark.storage.memoryFraction (both of which are deprecated as of 1.6),
but ultimately you need to find out where you are bottlenecked and address
that, as adjusting the memoryFractions is only a stopgap. spark.shuffle.memoryFraction
defaults to 0.2 and spark.storage.memoryFraction to 0.6.
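If you do experiment with these settings, they can be passed straight to
spark-submit. A sketch of such a submit line follows; the fraction values are
illustrative only (not recommendations), and the master URL, class name, and
jar name are placeholders:

```shell
# Illustrative only: fraction values are not recommendations, and the
# master URL, class, and jar are placeholders for your own job.
spark-submit \
  --master spark://master:7077 \
  --driver-memory 4G \
  --executor-memory 5G \
  --conf spark.storage.memoryFraction=0.5 \
  --conf spark.shuffle.memoryFraction=0.3 \
  --class com.example.StreamingJob \
  streaming-job.jar
```

Note that the two fractions (plus their safety margins) compete for the same
heap, so raising one effectively starves the other.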

On Wed, Jun 15, 2016 at 9:37 PM, Cassa L <lc...@gmail.com> wrote:

> Hi,
>  I did set  --driver-memory 4G. I still run into this issue after 1 hour
> of data load.
>
> I also tried version 1.6 in test environment. I hit this issue much faster
> than in 1.5.1 setup.
> LCassa
>
> On Tue, Jun 14, 2016 at 3:57 PM, Gaurav Bhatnagar <ga...@gmail.com>
> wrote:
>
>> try setting the option --driver-memory 4G
>>
>> On Tue, Jun 14, 2016 at 3:52 PM, Ben Slater <be...@instaclustr.com>
>> wrote:
>>
>>> A high level shot in the dark but in our testing we found Spark 1.6 a
>>> lot more reliable in low memory situations (presumably due to
>>> https://issues.apache.org/jira/browse/SPARK-10000). If it’s an option,
>>> probably worth a try.
>>>
>>> Cheers
>>> Ben
>>>
>>> On Wed, 15 Jun 2016 at 08:48 Cassa L <lc...@gmail.com> wrote:
>>>
>>>> Hi,
>>>> I would appreciate any clue on this. It has become a bottleneck for our
>>>> spark job.
>>>>
>>>> On Mon, Jun 13, 2016 at 2:56 PM, Cassa L <lc...@gmail.com> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I'm using spark 1.5.1 version. I am reading data from Kafka into Spark and writing it into Cassandra after processing it. Spark job starts fine and runs all good for some time until I start getting below errors. Once these errors come, job start to lag behind and I see that job has scheduling and processing delays in streaming  UI.
>>>>>
>>>>> Worker memory is 6GB, executor-memory is 5GB, I also tried to tweak memoryFraction parameters. Nothing works.
>>>>>
>>>>>
>>>>> 16/06/13 21:26:02 INFO MemoryStore: ensureFreeSpace(4044) called with curMem=565394, maxMem=2778495713
>>>>> 16/06/13 21:26:02 INFO MemoryStore: Block broadcast_69652_piece0 stored as bytes in memory (estimated size 3.9 KB, free 2.6 GB)
>>>>> 16/06/13 21:26:02 INFO TorrentBroadcast: Reading broadcast variable 69652 took 2 ms
>>>>> 16/06/13 21:26:02 WARN MemoryStore: Failed to reserve initial memory threshold of 1024.0 KB for computing block broadcast_69652 in memory.
>>>>> 16/06/13 21:26:02 WARN MemoryStore: Not enough space to cache broadcast_69652 in memory! (computed 496.0 B so far)
>>>>> 16/06/13 21:26:02 INFO MemoryStore: Memory use = 556.1 KB (blocks) + 2.6 GB (scratch space shared across 0 tasks(s)) = 2.6 GB. Storage limit = 2.6 GB.
>>>>> 16/06/13 21:26:02 WARN MemoryStore: Persisting block broadcast_69652 to disk instead.
>>>>> 16/06/13 21:26:02 INFO BlockManager: Found block rdd_100761_1 locally
>>>>> 16/06/13 21:26:02 INFO Executor: Finished task 0.0 in stage 71577.0 (TID 452316). 2043 bytes result sent to driver
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> L
>>>>>
>>>>>
>>>> --
>>> ————————
>>> Ben Slater
>>> Chief Product Officer
>>> Instaclustr: Cassandra + Spark - Managed | Consulting | Support
>>> +61 437 929 798
>>>
>>
>>
>

Re: Spark Memory Error - Not enough space to cache broadcast

Posted by Cassa L <lc...@gmail.com>.
Hi,
 I did set --driver-memory 4G, but I still run into this issue after 1 hour of
data load.

I also tried version 1.6 in test environment. I hit this issue much faster
than in 1.5.1 setup.
LCassa

On Tue, Jun 14, 2016 at 3:57 PM, Gaurav Bhatnagar <ga...@gmail.com>
wrote:

> try setting the option --driver-memory 4G
>
> On Tue, Jun 14, 2016 at 3:52 PM, Ben Slater <be...@instaclustr.com>
> wrote:
>
>> A high level shot in the dark but in our testing we found Spark 1.6 a lot
>> more reliable in low memory situations (presumably due to
>> https://issues.apache.org/jira/browse/SPARK-10000). If it’s an option,
>> probably worth a try.
>>
>> Cheers
>> Ben
>>
>> On Wed, 15 Jun 2016 at 08:48 Cassa L <lc...@gmail.com> wrote:
>>
>>> Hi,
>>> I would appreciate any clue on this. It has become a bottleneck for our
>>> spark job.
>>>
>>> On Mon, Jun 13, 2016 at 2:56 PM, Cassa L <lc...@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> I'm using spark 1.5.1 version. I am reading data from Kafka into Spark and writing it into Cassandra after processing it. Spark job starts fine and runs all good for some time until I start getting below errors. Once these errors come, job start to lag behind and I see that job has scheduling and processing delays in streaming  UI.
>>>>
>>>> Worker memory is 6GB, executor-memory is 5GB, I also tried to tweak memoryFraction parameters. Nothing works.
>>>>
>>>>
>>>> 16/06/13 21:26:02 INFO MemoryStore: ensureFreeSpace(4044) called with curMem=565394, maxMem=2778495713
>>>> 16/06/13 21:26:02 INFO MemoryStore: Block broadcast_69652_piece0 stored as bytes in memory (estimated size 3.9 KB, free 2.6 GB)
>>>> 16/06/13 21:26:02 INFO TorrentBroadcast: Reading broadcast variable 69652 took 2 ms
>>>> 16/06/13 21:26:02 WARN MemoryStore: Failed to reserve initial memory threshold of 1024.0 KB for computing block broadcast_69652 in memory.
>>>> 16/06/13 21:26:02 WARN MemoryStore: Not enough space to cache broadcast_69652 in memory! (computed 496.0 B so far)
>>>> 16/06/13 21:26:02 INFO MemoryStore: Memory use = 556.1 KB (blocks) + 2.6 GB (scratch space shared across 0 tasks(s)) = 2.6 GB. Storage limit = 2.6 GB.
>>>> 16/06/13 21:26:02 WARN MemoryStore: Persisting block broadcast_69652 to disk instead.
>>>> 16/06/13 21:26:02 INFO BlockManager: Found block rdd_100761_1 locally
>>>> 16/06/13 21:26:02 INFO Executor: Finished task 0.0 in stage 71577.0 (TID 452316). 2043 bytes result sent to driver
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> L
>>>>
>>>>
>>> --
>> ————————
>> Ben Slater
>> Chief Product Officer
>> Instaclustr: Cassandra + Spark - Managed | Consulting | Support
>> +61 437 929 798
>>
>
>

Re: Spark Memory Error - Not enough space to cache broadcast

Posted by Gaurav Bhatnagar <ga...@gmail.com>.
Try setting the option --driver-memory 4G.

On Tue, Jun 14, 2016 at 3:52 PM, Ben Slater <be...@instaclustr.com>
wrote:

> A high level shot in the dark but in our testing we found Spark 1.6 a lot
> more reliable in low memory situations (presumably due to
> https://issues.apache.org/jira/browse/SPARK-10000). If it’s an option,
> probably worth a try.
>
> Cheers
> Ben
>
> On Wed, 15 Jun 2016 at 08:48 Cassa L <lc...@gmail.com> wrote:
>
>> Hi,
>> I would appreciate any clue on this. It has become a bottleneck for our
>> spark job.
>>
>> On Mon, Jun 13, 2016 at 2:56 PM, Cassa L <lc...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> I'm using spark 1.5.1 version. I am reading data from Kafka into Spark and writing it into Cassandra after processing it. Spark job starts fine and runs all good for some time until I start getting below errors. Once these errors come, job start to lag behind and I see that job has scheduling and processing delays in streaming  UI.
>>>
>>> Worker memory is 6GB, executor-memory is 5GB, I also tried to tweak memoryFraction parameters. Nothing works.
>>>
>>>
>>> 16/06/13 21:26:02 INFO MemoryStore: ensureFreeSpace(4044) called with curMem=565394, maxMem=2778495713
>>> 16/06/13 21:26:02 INFO MemoryStore: Block broadcast_69652_piece0 stored as bytes in memory (estimated size 3.9 KB, free 2.6 GB)
>>> 16/06/13 21:26:02 INFO TorrentBroadcast: Reading broadcast variable 69652 took 2 ms
>>> 16/06/13 21:26:02 WARN MemoryStore: Failed to reserve initial memory threshold of 1024.0 KB for computing block broadcast_69652 in memory.
>>> 16/06/13 21:26:02 WARN MemoryStore: Not enough space to cache broadcast_69652 in memory! (computed 496.0 B so far)
>>> 16/06/13 21:26:02 INFO MemoryStore: Memory use = 556.1 KB (blocks) + 2.6 GB (scratch space shared across 0 tasks(s)) = 2.6 GB. Storage limit = 2.6 GB.
>>> 16/06/13 21:26:02 WARN MemoryStore: Persisting block broadcast_69652 to disk instead.
>>> 16/06/13 21:26:02 INFO BlockManager: Found block rdd_100761_1 locally
>>> 16/06/13 21:26:02 INFO Executor: Finished task 0.0 in stage 71577.0 (TID 452316). 2043 bytes result sent to driver
>>>
>>>
>>> Thanks,
>>>
>>> L
>>>
>>>
>> --
> ————————
> Ben Slater
> Chief Product Officer
> Instaclustr: Cassandra + Spark - Managed | Consulting | Support
> +61 437 929 798
>

Re: Spark Memory Error - Not enough space to cache broadcast

Posted by Ben Slater <be...@instaclustr.com>.
A high-level shot in the dark, but in our testing we found Spark 1.6 a lot
more reliable in low-memory situations (presumably due to
https://issues.apache.org/jira/browse/SPARK-10000). If it’s an option, it is
probably worth a try.

Cheers
Ben
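Roughly, SPARK-10000 replaced the fixed storage/shuffle split with a single
unified region that storage and execution borrow from as needed. A
back-of-the-envelope comparison of the two budgeting models, assuming the
documented defaults (storage 0.6 with 0.9 safety and shuffle 0.2 with 0.8
safety in 1.5; spark.memory.fraction=0.75 with a 300 MB reserve in 1.6) and a
5 GiB heap with JVM overhead ignored:

```python
heap = 5 * 2**30  # 5 GiB executor heap (JVM overhead ignored for simplicity)

# Spark <= 1.5: fixed regions, each scaled by its safety fraction.
legacy_storage = heap * 0.6 * 0.9   # spark.storage.memoryFraction * safetyFraction
legacy_shuffle = heap * 0.2 * 0.8   # spark.shuffle.memoryFraction * safetyFraction

# Spark 1.6 (SPARK-10000): one unified region shared by storage and execution.
unified = (heap - 300 * 2**20) * 0.75  # (heap - 300 MB reserved) * spark.memory.fraction

print(f"legacy storage : {legacy_storage / 2**30:.2f} GiB")
print(f"legacy shuffle : {legacy_shuffle / 2**30:.2f} GiB")
print(f"unified region : {unified / 2**30:.2f} GiB")
```

Under these assumptions the unified region is both larger in total and able to
shift between caching and shuffle on demand, which would explain the better
behaviour under memory pressure.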

On Wed, 15 Jun 2016 at 08:48 Cassa L <lc...@gmail.com> wrote:

> Hi,
> I would appreciate any clue on this. It has become a bottleneck for our
> spark job.
>
> On Mon, Jun 13, 2016 at 2:56 PM, Cassa L <lc...@gmail.com> wrote:
>
>> Hi,
>>
>> I'm using spark 1.5.1 version. I am reading data from Kafka into Spark and writing it into Cassandra after processing it. Spark job starts fine and runs all good for some time until I start getting below errors. Once these errors come, job start to lag behind and I see that job has scheduling and processing delays in streaming  UI.
>>
>> Worker memory is 6GB, executor-memory is 5GB, I also tried to tweak memoryFraction parameters. Nothing works.
>>
>>
>> 16/06/13 21:26:02 INFO MemoryStore: ensureFreeSpace(4044) called with curMem=565394, maxMem=2778495713
>> 16/06/13 21:26:02 INFO MemoryStore: Block broadcast_69652_piece0 stored as bytes in memory (estimated size 3.9 KB, free 2.6 GB)
>> 16/06/13 21:26:02 INFO TorrentBroadcast: Reading broadcast variable 69652 took 2 ms
>> 16/06/13 21:26:02 WARN MemoryStore: Failed to reserve initial memory threshold of 1024.0 KB for computing block broadcast_69652 in memory.
>> 16/06/13 21:26:02 WARN MemoryStore: Not enough space to cache broadcast_69652 in memory! (computed 496.0 B so far)
>> 16/06/13 21:26:02 INFO MemoryStore: Memory use = 556.1 KB (blocks) + 2.6 GB (scratch space shared across 0 tasks(s)) = 2.6 GB. Storage limit = 2.6 GB.
>> 16/06/13 21:26:02 WARN MemoryStore: Persisting block broadcast_69652 to disk instead.
>> 16/06/13 21:26:02 INFO BlockManager: Found block rdd_100761_1 locally
>> 16/06/13 21:26:02 INFO Executor: Finished task 0.0 in stage 71577.0 (TID 452316). 2043 bytes result sent to driver
>>
>>
>> Thanks,
>>
>> L
>>
>>
> --
————————
Ben Slater
Chief Product Officer
Instaclustr: Cassandra + Spark - Managed | Consulting | Support
+61 437 929 798

Re: Spark Memory Error - Not enough space to cache broadcast

Posted by Cassa L <lc...@gmail.com>.
Hi,
I would appreciate any clue on this. It has become a bottleneck for our
Spark job.

On Mon, Jun 13, 2016 at 2:56 PM, Cassa L <lc...@gmail.com> wrote:

> Hi,
>
> I'm using spark 1.5.1 version. I am reading data from Kafka into Spark and writing it into Cassandra after processing it. Spark job starts fine and runs all good for some time until I start getting below errors. Once these errors come, job start to lag behind and I see that job has scheduling and processing delays in streaming  UI.
>
> Worker memory is 6GB, executor-memory is 5GB, I also tried to tweak memoryFraction parameters. Nothing works.
>
>
> 16/06/13 21:26:02 INFO MemoryStore: ensureFreeSpace(4044) called with curMem=565394, maxMem=2778495713
> 16/06/13 21:26:02 INFO MemoryStore: Block broadcast_69652_piece0 stored as bytes in memory (estimated size 3.9 KB, free 2.6 GB)
> 16/06/13 21:26:02 INFO TorrentBroadcast: Reading broadcast variable 69652 took 2 ms
> 16/06/13 21:26:02 WARN MemoryStore: Failed to reserve initial memory threshold of 1024.0 KB for computing block broadcast_69652 in memory.
> 16/06/13 21:26:02 WARN MemoryStore: Not enough space to cache broadcast_69652 in memory! (computed 496.0 B so far)
> 16/06/13 21:26:02 INFO MemoryStore: Memory use = 556.1 KB (blocks) + 2.6 GB (scratch space shared across 0 tasks(s)) = 2.6 GB. Storage limit = 2.6 GB.
> 16/06/13 21:26:02 WARN MemoryStore: Persisting block broadcast_69652 to disk instead.
> 16/06/13 21:26:02 INFO BlockManager: Found block rdd_100761_1 locally
> 16/06/13 21:26:02 INFO Executor: Finished task 0.0 in stage 71577.0 (TID 452316). 2043 bytes result sent to driver
>
>
> Thanks,
>
> L
>
>

Re: Spark Memory Error - Not enough space to cache broadcast

Posted by Jacek Laskowski <ja...@japila.pl>.
Hi,

What do you see under Executors and Details for Stage (for the
affected stages)? Anything weird memory-related?

What does your "I am reading data from Kafka into Spark and writing it
into Cassandra after processing it" pipeline look like?

Pozdrawiam,
Jacek Laskowski
----
https://medium.com/@jaceklaskowski/
Mastering Apache Spark http://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski


On Mon, Jun 13, 2016 at 11:56 PM, Cassa L <lc...@gmail.com> wrote:
> Hi,
>
> I'm using spark 1.5.1 version. I am reading data from Kafka into Spark and
> writing it into Cassandra after processing it. Spark job starts fine and
> runs all good for some time until I start getting below errors. Once these
> errors come, job start to lag behind and I see that job has scheduling and
> processing delays in streaming  UI.
>
> Worker memory is 6GB, executor-memory is 5GB, I also tried to tweak
> memoryFraction parameters. Nothing works.
>
>
> 16/06/13 21:26:02 INFO MemoryStore: ensureFreeSpace(4044) called with
> curMem=565394, maxMem=2778495713
> 16/06/13 21:26:02 INFO MemoryStore: Block broadcast_69652_piece0 stored as
> bytes in memory (estimated size 3.9 KB, free 2.6 GB)
> 16/06/13 21:26:02 INFO TorrentBroadcast: Reading broadcast variable 69652
> took 2 ms
> 16/06/13 21:26:02 WARN MemoryStore: Failed to reserve initial memory
> threshold of 1024.0 KB for computing block broadcast_69652 in memory.
> 16/06/13 21:26:02 WARN MemoryStore: Not enough space to cache
> broadcast_69652 in memory! (computed 496.0 B so far)
> 16/06/13 21:26:02 INFO MemoryStore: Memory use = 556.1 KB (blocks) + 2.6 GB
> (scratch space shared across 0 tasks(s)) = 2.6 GB. Storage limit = 2.6 GB.
> 16/06/13 21:26:02 WARN MemoryStore: Persisting block broadcast_69652 to disk
> instead.
> 16/06/13 21:26:02 INFO BlockManager: Found block rdd_100761_1 locally
> 16/06/13 21:26:02 INFO Executor: Finished task 0.0 in stage 71577.0 (TID
> 452316). 2043 bytes result sent to driver
>
>
> Thanks,
>
> L
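The 2.6 GB storage limit in these logs is consistent with the legacy
(pre-1.6) memory model, where the storage budget is the JVM-reported max heap
times spark.storage.memoryFraction times spark.storage.safetyFraction. A
quick sanity check, assuming the 1.5.x defaults of 0.6 and 0.9:

```python
# Legacy (pre-1.6) storage budget:
#   Runtime.maxMemory * spark.storage.memoryFraction * spark.storage.safetyFraction
STORAGE_FRACTION = 0.6  # spark.storage.memoryFraction default in 1.5.x
SAFETY_FRACTION = 0.9   # spark.storage.safetyFraction default in 1.5.x

max_mem = 2778495713    # maxMem reported by MemoryStore in the log, in bytes

print(f"storage limit : {max_mem / 2**30:.2f} GiB")  # the '2.6 GB' in the log
implied_heap = max_mem / (STORAGE_FRACTION * SAFETY_FRACTION)
print(f"implied heap  : {implied_heap / 2**30:.2f} GiB")  # ~4.8 GiB, plausible for a 5 GB executor
```

The implied heap is a little under 5 GiB, which matches what the JVM would
report for a 5 GB executor, so the numbers suggest the fractions in effect
were the defaults rather than the 0.5 the poster mentions setting.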