Posted to issues@spark.apache.org by "Doug Rohrer (JIRA)" <ji...@apache.org> on 2018/10/12 15:16:00 UTC

[jira] [Commented] (SPARK-24225) Support closing AutoClosable objects in MemoryStore so Broadcast Variables can be released properly

    [ https://issues.apache.org/jira/browse/SPARK-24225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16648009#comment-16648009 ] 

Doug Rohrer commented on SPARK-24225:
-------------------------------------

Sorry this has dropped off my radar for so long - work + life took me away from it for a while. Having looked at the PR review comments, and with a better understanding of Broadcast Variable behavior (and some of the changes that took place in the 2.X series), it seems that simply trying to close Broadcast Variables won't work as intended. However, I believe the underlying concept (driver-scoped shared variables, where the variable lives until the job is done or the driver removes it) is still worth pursuing. It would allow scoping shared resources such as DB connection pools, which may need to change per phase of a job or be disposed of early in a process; static variables are not useful for that.

Given that, I'd like to propose we add a new concept, similar to Broadcast Variables, called, perhaps, Scoped Variables. The intent would be for these to be scoped by the driver, to be relatively small from a memory-consumption perspective (unlike broadcast variables, which can be much larger), and to be held in memory until explicitly removed by the driver. Most of the infrastructure work for broadcast variables supports this use case, but we'd need either a "non-purgeable" type in the MemoryStore or some other store specific to these new scoped variables, in order to prevent them from being evicted the way cached items are.
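
To make the "non-purgeable" idea a bit more concrete, here is a rough sketch in plain Java. It is only an illustration of the lifecycle I have in mind, not a proposal for the actual implementation: ScopedStore and all of its methods are made-up names, nothing here exists in Spark, and eviction is reduced to a single evictAll() call standing in for memory pressure.

```java
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: ScopedStore and its methods are illustrative names, not Spark APIs.
final class ScopedStore {
    // Ordinary cached blocks: may be dropped at any time under memory pressure.
    private final Map<String, Object> evictable = new ConcurrentHashMap<>();
    // Scoped variables: pinned until explicitly removed.
    private final Map<String, Object> pinned = new ConcurrentHashMap<>();

    void cache(String id, Object value) {
        evictable.put(id, value);
    }

    void putScoped(String id, Object value) {
        if (pinned.putIfAbsent(id, value) != null) {
            throw new IllegalStateException("scoped variable already exists: " + id);
        }
    }

    Object getScoped(String id) {
        Object value = pinned.get(id);
        if (value == null) {
            throw new NoSuchElementException("no scoped variable: " + id);
        }
        return value;
    }

    /** Stand-in for eviction: drops every cached block, never touches pinned entries. */
    int evictAll() {
        int dropped = evictable.size();
        evictable.clear();
        return dropped;
    }

    /** Explicit removal; returns the value (or null) so the caller can dispose of it. */
    Object removeScoped(String id) {
        return pinned.remove(id);
    }

    boolean isCached(String id) {
        return evictable.containsKey(id);
    }

    boolean hasScoped(String id) {
        return pinned.containsKey(id);
    }
}
```

The point of keeping two separate maps is that the eviction path never has to inspect scoped entries at all; whether that is better than a flag on entries in the existing MemoryStore is exactly the open question.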

 

Thoughts on this? I'll start working on updating the PR to support something like this sometime today, but it might still take a while to get something workable put together, so I'd appreciate any feedback when someone has the time.

> Support closing AutoClosable objects in MemoryStore so Broadcast Variables can be released properly
> ---------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-24225
>                 URL: https://issues.apache.org/jira/browse/SPARK-24225
>             Project: Spark
>          Issue Type: New Feature
>          Components: Block Manager
>    Affects Versions: 1.6.3, 2.2.0, 2.2.1, 2.3.0
>            Reporter: Doug Rohrer
>            Assignee: Doug Rohrer
>            Priority: Major
>
> When using Broadcast Variables, it would be beneficial if classes implementing AutoCloseable were closed when released. This would allow broadcast variables to be used, for example, as shared resource pools across multiple tasks within an executor without the use of static variables.
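
As an illustration of the behavior described above, the following is a minimal sketch in plain Java of a store that closes AutoCloseable values when they are released. ClosingStore and FakePool are hypothetical names used only for this example; this is not Spark's MemoryStore, and wrapping a failed close() in an IllegalStateException is an assumption about error handling, not something the issue specifies.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: ClosingStore is an illustrative name, not Spark's MemoryStore.
final class ClosingStore {
    private final Map<String, Object> entries = new HashMap<>();

    synchronized void put(String id, Object value) {
        if (entries.putIfAbsent(id, value) != null) {
            throw new IllegalStateException("entry already exists: " + id);
        }
    }

    /**
     * Removes the entry and, if it implements AutoCloseable, closes it.
     * Returns true if an entry was removed.
     */
    synchronized boolean release(String id) {
        Object value = entries.remove(id);
        if (value == null) {
            return false;
        }
        if (value instanceof AutoCloseable) {
            try {
                ((AutoCloseable) value).close();
            } catch (Exception e) {
                // Assumption: surface close failures to the caller unchecked.
                throw new IllegalStateException("failed to close " + id, e);
            }
        }
        return true;
    }
}

// Example resource: a stand-in for something like a connection pool.
final class FakePool implements AutoCloseable {
    private boolean closed = false;

    @Override
    public void close() {
        closed = true;
    }

    boolean isClosed() {
        return closed;
    }
}
```

Values that do not implement AutoCloseable are simply dropped, so existing entries would behave as before.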



