Posted to user@spark.apache.org by Gurvinder Singh <gu...@uninett.no> on 2014/09/16 08:17:10 UTC
Re: spark and mesos issue

It might not be related only to the memory issue. The memory issue is
also there, as you mentioned; I have seen that one too. The fine-grained
mode issue is mainly Spark deciding that it got two different block
manager registrations for the same ID, whereas if I search for that ID
in the Mesos slave logs, it exists on only one slave, not on several.
This might be due to the size of the ID, as Spark prints the error as

14/09/16 08:04:29 ERROR BlockManagerMasterActor: Got two different
block manager registrations on 20140822-112818-711206558-5050-25951-0

whereas in the Mesos slave logs I see

I0915 20:55:18.293903 31434 containerizer.cpp:392] Starting container
'3aab2237-d32f-470d-a206-7bada454ad3f' for executor
'20140822-112818-711206558-5050-25951-0' of framework
'20140822-112818-711206558-5050-25951-0053'

I0915 20:53:28.039218 31437 containerizer.cpp:392] Starting container
'fe4b344f-16c9-484a-9c2f-92bd92b43f6d' for executor
'20140822-112818-711206558-5050-25951-0' of framework
'20140822-112818-711206558-5050-25951-0050'


As you can see, the last 3 digits of the ID are missing in Spark,
whereas they differ in the Mesos slave logs.
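
To check this kind of thing across a larger slave log, a small Python
sketch like the one below groups the "Starting container" lines by
executor ID and lists the framework IDs seen for each one. The regex and
the helper name frameworks_by_executor are made up for illustration and
are not part of Spark or Mesos; it also assumes each log entry has been
joined back onto a single line.

```python
import re
from collections import defaultdict

# The two Mesos slave log entries quoted above, each joined onto one line.
MESOS_LOG = [
    "I0915 20:55:18.293903 31434 containerizer.cpp:392] Starting container "
    "'3aab2237-d32f-470d-a206-7bada454ad3f' for executor "
    "'20140822-112818-711206558-5050-25951-0' of framework "
    "'20140822-112818-711206558-5050-25951-0053'",
    "I0915 20:53:28.039218 31437 containerizer.cpp:392] Starting container "
    "'fe4b344f-16c9-484a-9c2f-92bd92b43f6d' for executor "
    "'20140822-112818-711206558-5050-25951-0' of framework "
    "'20140822-112818-711206558-5050-25951-0050'",
]

# Assumed line shape, taken only from the two entries above.
PATTERN = re.compile(
    r"Starting container '(?P<container>[^']+)' for executor "
    r"'(?P<executor>[^']+)' of framework '(?P<framework>[^']+)'"
)


def frameworks_by_executor(lines):
    """Map each executor ID to the set of framework IDs that started it."""
    result = defaultdict(set)
    for line in lines:
        match = PATTERN.search(line)
        if match:
            result[match.group("executor")].add(match.group("framework"))
    return dict(result)


grouped = frameworks_by_executor(MESOS_LOG)
for executor, frameworks in grouped.items():
    # More than one framework per executor ID means the ID alone is not unique.
    print(executor, sorted(frameworks))
```

For the two entries above, both containers end up under the same
executor ID but under two different framework IDs.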

- Gurvinder
On 09/15/2014 11:13 PM, Brenden Matthews wrote:
> I started hitting a similar problem, and it seems to be related to 
> memory overhead and tasks getting OOM killed.  I filed a ticket
> here:
> 
> https://issues.apache.org/jira/browse/SPARK-3535
> 
> On Wed, Jul 16, 2014 at 5:27 AM, Ray Rodriguez
> <rayrod2030@gmail.com> wrote:
> 
> I'll set some time aside today to gather and post some logs and 
> details about this issue from our end.
> 
> 
> On Wed, Jul 16, 2014 at 2:05 AM, Vinod Kone <vinodkone@gmail.com>
> wrote:
> 
> 
> 
> 
> On Tue, Jul 15, 2014 at 11:02 PM, Vinod Kone <vinod@twitter.com>
> wrote:
> 
> 
> On Fri, Jul 4, 2014 at 2:05 AM, Gurvinder Singh
> <gurvinder.singh@uninett.no> wrote:
> 
> ERROR storage.BlockManagerMasterActor: Got two different block
> manager registrations on 201407031041-1227224054-5050-24004-0
> 
> Googling it suggests that Mesos is starting slaves at the same
> time and giving them the same ID. So maybe a bug in Mesos?
> 
> 
> Has this issue been resolved? We need more information to triage
> this. Maybe some logs that show the lifecycle of the duplicate
> instances?
> 
> 
> @vinodkone
> 
> 
> 
> 

