Posted to user@spark.apache.org by "Marinov, Slavi (London)" <Sl...@man.com> on 2018/03/05 20:23:04 UTC

Dynamic Resource Allocation - session stuck

Hello,



I am experimenting with Dynamic Resource Allocation (DRA), initially just trying to get a feel for its functionality and limitations and to get the basics working. Spark 2.2.0 is running on Mesos (which in turn uses ZooKeeper).



I am running this very simple snippet:


https://gist.github.com/anonymous/62076a40304d614b9549262b9d626a5b

The first time I run it in the session, it runs fine.



I then wait for the executors to be killed (60 seconds per the default settings), and run this again in the same session:


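# sc is the SparkContext of the running session.
# One partition (and therefore one task) per element.
x = range(1500)
result = sc.parallelize(x, numSlices=len(x)) \
           .map(lambda v: v ** 2) \
           .collect()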
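
For reference, the 60-second wait matches the default value of spark.dynamicAllocation.executorIdleTimeout. Below is a minimal sketch of the settings usually involved in a DRA setup. The values and the helper name to_submit_args are assumptions for illustration only, not the actual configuration of this cluster.

```python
# Illustrative only: settings commonly involved in dynamic allocation.
# The values below are assumptions, not the configuration used in this post.
dra_conf = {
    "spark.dynamicAllocation.enabled": "true",
    # DRA in Spark 2.x relies on the external shuffle service.
    "spark.shuffle.service.enabled": "true",
    # Idle executors are released after this long (Spark's default is 60s).
    "spark.dynamicAllocation.executorIdleTimeout": "60s",
}


def to_submit_args(conf):
    """Hypothetical helper: render a dict as spark-submit --conf arguments."""
    args = []
    for key, value in sorted(conf.items()):
        args.extend(["--conf", "{}={}".format(key, value)])
    return args


print(" ".join(to_submit_args(dra_conf)))
```

The same key/value pairs could equally be set on a SparkConf before the context is created; the command-line form is shown only because it needs no running cluster.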



I then see: https://gist.github.com/anonymous/d5ee64f939de3f2168d7c1f48112b218


And then it's stuck.

If I look at the Mesos UI, resources are on offer - but there aren't 1500 cores on offer (that said, there weren't 1500 the first time around either - yet it ran successfully).
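
As a side note on the core count: numSlices=1500 produces 1500 tasks, and these do not need 1500 cores at once, since the scheduler runs them in waves over whatever task slots are available. Here is a rough sketch of that arithmetic, assuming one core per task (Spark's default). The slot count is a made-up figure, not a measurement from this cluster.

```python
import math


def waves_needed(num_tasks, task_slots):
    """Number of scheduling waves needed to run num_tasks on task_slots slots."""
    if task_slots <= 0:
        # With no slots at all, nothing can be scheduled.
        raise ValueError("no task slots available")
    return int(math.ceil(num_tasks / float(task_slots)))


# Hypothetical figure: 90 slots, e.g. 45 executors with an assumed 2 cores each.
print(waves_needed(1500, 90))
```

So a shortfall of cores on its own would be expected to slow the job down rather than stall it, which is consistent with the first run completing.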

If I look at the Spark Driver UI, I see 45 active executors. Interestingly, these were there even after the DRA sweep - after I'd seen "INFO MesosCoarseGrainedSchedulerBackend: Capping the total amount of executors to 0" in the log file. So it doesn't look like DRA actually released the executors - but that is perhaps another issue.

This is in a test environment - nothing else particularly fancy is going on. No other drivers are running.

Any help would be much appreciated!

Slavi
