Posted to user@spark.apache.org by Matus Faro <ma...@kik.com> on 2015/03/03 16:23:29 UTC

Re: On app upgrade, restore sliding window data.

Thank you, Arush. I've implemented support for supplying initial data to a
windowed operation and opened a pull request here:
https://github.com/apache/spark/pull/4875
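
To illustrate the idea of pre-populating a window, here is a minimal sketch in
plain Python. It is a toy model only: it does not use the Spark API and is not
the code from the pull request. The class name SlidingWindow and its
parameters (window_batches, initial_batches) are hypothetical names chosen for
this example. Saved batches are placed into the window's slots at startup, so
each new batch pushes out the oldest restored one and the window never holds
more than its normal amount of data.

```python
from collections import deque


class SlidingWindow:
    """Toy sliding window that keeps the most recent `window_batches` batches.

    Hypothetical illustration only; not a Spark Streaming class.
    """

    def __init__(self, window_batches, initial_batches=None):
        # A bounded deque drops the oldest batch automatically when full.
        self.batches = deque(maxlen=window_batches)
        # Pre-populate with previously saved batches, oldest first.
        for batch in (initial_batches or []):
            self.batches.append(list(batch))

    def add_batch(self, batch):
        # Appending to a full window evicts the oldest (possibly restored) batch.
        self.batches.append(list(batch))

    def contents(self):
        # Flatten all batches currently in the window.
        return [item for batch in self.batches for item in batch]


# Restore three saved batches after an "upgrade", then receive a new batch.
saved = [[1], [2], [3]]
window = SlidingWindow(window_batches=3, initial_batches=saved)
window.add_batch([4])
# The oldest restored batch has expired; the window size never exceeded 3.
assert window.contents() == [2, 3, 4]
```

Because the restored batches occupy regular window slots, they expire one at a
time as new batches arrive, which avoids the temporary doubling described in
the original message below.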


On Tue, Feb 24, 2015 at 4:49 AM, Arush Kharbanda <arush@sigmoidanalytics.com> wrote:

> I think this could be of some help to you.
>
> https://issues.apache.org/jira/browse/SPARK-3660
>
>
>
> On Tue, Feb 24, 2015 at 2:18 AM, Matus Faro <ma...@kik.com> wrote:
>
>> Hi,
>>
>> Our application is being designed to operate at all times on a large
>> sliding window (a day or more) of data. The operations performed on the
>> window will change fairly frequently, and I need a way to save and
>> restore the sliding window after an app upgrade without having to wait
>> the full duration of the window for it to "warm up". Unfortunately,
>> checkpointing will not work across an app upgrade.
>>
>> I can potentially dump the window to an outside storage periodically
>> or on app shutdown, but I don't have an ideal way of restoring it.
>>
>> I thought about two non-ideal solutions:
>> 1. Load the previous data all at once into the sliding window on app
>> startup. The problem is that I will have double the data in the
>> sliding window until the initial batch of data goes out of scope.
>> 2. Broadcast the previous state of the window separately from the
>> window. Perform the operations on both sets of data until the previous
>> state goes out of scope. The problem is that the data will not fit
>> into memory.
>>
>> Solutions that would solve my problem:
>> 1. The ability to pre-populate the sliding window.
>> 2. Control over batch slicing. It would be nice for a Receiver to
>> dictate the current batch timestamp in order to slow down or
>> fast-forward time.
>>
>> Any feedback would be greatly appreciated!
>>
>> Thank you,
>> Matus
>>
>>
>
>
> --
>
> *Arush Kharbanda* || Technical Teamlead
>
> arush@sigmoidanalytics.com || www.sigmoidanalytics.com
>