Posted to dev@spark.apache.org by Justin Uang <ju...@gmail.com> on 2015/10/30 17:13:44 UTC

Off-heap storage and dynamic allocation

Hey guys,

According to the docs for 1.5.1, when an executor is removed under dynamic
allocation, its cached data is gone. If I use off-heap storage like
Tachyon, conceptually this issue goes away, but is the cached data still
available in practice? That would be great, because then we would be able
to set spark.dynamicAllocation.cachedExecutorIdleTimeout to quite a small
value.

==================
In addition to writing shuffle files, executors also cache data either on
disk or in memory. When an executor is removed, however, all cached data
will no longer be accessible. There is currently not yet a solution for
this in Spark 1.2. In future releases, the cached data may be preserved
through an off-heap storage similar in spirit to how shuffle files are
preserved through the external shuffle service.
==================
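
For reference, a minimal sketch (in Scala) of the configuration being discussed.
The property names are the standard dynamic-allocation settings; the app name and
the 60s timeout are only illustrative values, not anything prescribed in this thread:

    import org.apache.spark.{SparkConf, SparkContext}

    // Minimal dynamic-allocation setup; the external shuffle service is required
    // so that shuffle files outlive removed executors.
    val conf = new SparkConf()
      .setAppName("cached-executor-idle-timeout-sketch")  // illustrative name
      .set("spark.dynamicAllocation.enabled", "true")
      .set("spark.shuffle.service.enabled", "true")
      // Release executors that only hold cached data after a short idle period
      // (the default is effectively infinite); "60s" is purely illustrative.
      .set("spark.dynamicAllocation.cachedExecutorIdleTimeout", "60s")
    val sc = new SparkContext(conf)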

Re: Off-heap storage and dynamic allocation

Posted by Justin Uang <ju...@gmail.com>.
Cool, thanks for the dev insight into what parts of the codebase are
worthwhile, and which are not =)

Re: Off-heap storage and dynamic allocation

Posted by Reynold Xin <rx...@databricks.com>.
It is quite a bit of work. Again, I think going through the file system API
is the better approach in the long run. In fact, I don't even think the
current offheap API makes much sense, and we should consider just removing
it to simplify things.
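
A rough sketch of what going through the file system API could look like, assuming
a reachable Tachyon deployment; the tachyon:// master address, the app name, and
the toy dataset are placeholders:

    import org.apache.spark.{SparkConf, SparkContext}

    // Write the data to Tachyon as an ordinary Hadoop-compatible file system
    // instead of caching it with StorageLevel.OFF_HEAP. The files then survive
    // executor (and even driver) loss and can be shared across applications.
    // The tachyon:// master address below is a placeholder.
    val sc = new SparkContext(new SparkConf().setAppName("tachyon-as-filesystem"))
    val rdd = sc.parallelize(1 to 1000).map(i => s"record-$i")  // toy dataset

    val path = "tachyon://tachyon-master:19998/shared/my-dataset"
    rdd.saveAsObjectFile(path)

    // Later (possibly from a different SparkContext), read it back:
    val restored = sc.objectFile[String](path)
    restored.count()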

Re: Off-heap storage and dynamic allocation

Posted by Justin Uang <ju...@gmail.com>.
Alright, we'll just stick with normal caching then.

Just for future reference, how much work would it be to get it to retain
the partitions in Tachyon? This is especially helpful in a multitenant
situation, where many users each have their own persistent Spark contexts,
but where the notebooks can be idle for long periods of time while holding
onto cached RDDs.
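
For concreteness, a minimal sketch of the "normal caching" referred to above,
assuming an existing SparkContext named sc; the input path is a placeholder:

    import org.apache.spark.storage.StorageLevel

    // Ordinary persist(): cached partitions live on the executors themselves,
    // which is exactly what spark.dynamicAllocation.cachedExecutorIdleTimeout
    // protects from removal under dynamic allocation.
    // Assumes an existing SparkContext `sc`; the HDFS path is a placeholder.
    val rdd = sc.textFile("hdfs:///data/events")
    val cached = rdd.persist(StorageLevel.MEMORY_AND_DISK)
    cached.count()  // materialize the cache; lost partitions are recomputed from lineage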

Re: Off-heap storage and dynamic allocation

Posted by Reynold Xin <rx...@databricks.com>.
It is lost, unfortunately (although it can be recomputed automatically).


Re: Off-heap storage and dynamic allocation

Posted by Justin Uang <ju...@gmail.com>.
Thanks for your response. I was worried about #3, vs. being able to use the
objects directly. #2 seems to be the dealbreaker for my use case, right?
Even if I am using Tachyon for caching, if an executor is lost, then that
partition is lost for the purposes of Spark?

Re: Off-heap storage and dynamic allocation

Posted by Reynold Xin <rx...@databricks.com>.
I don't think there is any special handling w.r.t. Tachyon vs. in-heap
caching. As a matter of fact, I think the current offheap caching
implementation is pretty bad, because:

1. There is no namespace sharing in offheap mode
2. Similar to 1, you cannot recover the offheap memory once the Spark driver
or an executor crashes
3. It requires expensive serialization to go offheap

It would've been simpler to just treat Tachyon as a normal file system, and
use it that way to at least address 1 and 2, and also substantially
simplify the internals.
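
For reference, a minimal sketch of the offheap code path being criticized here,
assuming a Spark 1.5-era setup where Tachyon is configured as the external block
store and where sc and rdd already exist:

    import org.apache.spark.storage.StorageLevel

    // Assumes Tachyon is configured as the external block store (Spark 1.5 era)
    // and that `rdd` already exists. OFF_HEAP stores serialized copies of the
    // partitions in Tachyon (the serialization cost is point 3 above); the blocks
    // are tracked per SparkContext, so they are neither shared across applications
    // (point 1) nor recoverable after a driver or executor crash (point 2).
    val offHeap = rdd.persist(StorageLevel.OFF_HEAP)
    offHeap.count()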




Re: Off-heap storage and dynamic allocation

Posted by Justin Uang <ju...@gmail.com>.
Yup, but I'm wondering what happens when an executor does get removed while
we're using Tachyon. Will the cached data still be available, since we're
using off-heap storage, so the data isn't stored in the executor?

Re: Off-heap storage and dynamic allocation

Posted by Ryan Williams <ry...@gmail.com>.
fwiw, I think that having cached RDD partitions prevents executors from
being removed under dynamic allocation by default; see SPARK-8958
<https://issues.apache.org/jira/browse/SPARK-8958>. The
"spark.dynamicAllocation.cachedExecutorIdleTimeout" config
<http://spark.apache.org/docs/latest/configuration.html#dynamic-allocation>
controls this.
