Posted to dev@aurora.apache.org by David McLaughlin <dm...@apache.org> on 2017/03/30 17:16:58 UTC

Future of storage in Aurora

Hi all,

I'd like to start a discussion around storage in Aurora.

I think one of the biggest mistakes we made in migrating our storage to H2
was deleting the memory stores as we moved. We made a pretty big bet that
we could eventually make H2/relational databases work. I don't think that
bet has paid off, and I think we need to revisit the direction we're taking.

My belief is that the current H2/MyBatis approach is untenable for large
production clusters, at least without changing our current single-master
architecture. At Twitter we are already having to fight to keep GC
manageable even without DbTaskStore enabled, so I don't see a path forward
where we could eventually enable that. So far experiments with H2 off-heap
storage have provided marginal (if any) gains.

Would anyone object to restoring the in-memory stores and creating new
implementations for the missing ones (UpdateStore)? I'd even go further and
propose that we consider in-memory H2 and MyBatis a failed experiment and
we drop that storage layer completely.

Cheers,
David

Re: Future of storage in Aurora

Posted by Bill Farner <wf...@apache.org>.
Good questions!


> What does this mean in terms of the original goals behind the storage system
> refactor?


My current effort is targeting goals (b), (c), and (d) in the above list.

> Are we confident that Jordan's work for hot-followers will alleviate the
> problems w/ long failovers?


My plans will not rely on Jordan's work.  I do, however, hope that it will
enable straightforward support for warm standby and very fast failover.

> I'd also like to know what our plans are for storage in the future


The plan is nascent and still undergoing prototyping, but I intend to
implement log-structured storage on top of a key-value abstraction.  This
would eliminate the need for snapshots.  The first implementation will be
backed by ZooKeeper.  I'll send out a doc once I have confidence in the
approach.
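
To make the idea concrete, here is a minimal sketch of an append-only log
layered on a key-value interface. Everything in it is an assumption made for
illustration: KeyValueStore, InMemoryKeyValueStore, KeyValueLog and the key
layout are hypothetical names, not part of Aurora or of the design doc
mentioned above. A ZooKeeper-backed implementation would replace the in-memory
map, and the sketch does not attempt to show how snapshots are avoided.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical key-value abstraction; a real backend would implement this.
interface KeyValueStore {
  void put(String key, String value);

  String get(String key);
}

// In-memory stand-in used only to keep the sketch self-contained.
final class InMemoryKeyValueStore implements KeyValueStore {
  private final Map<String, String> data = new TreeMap<>();

  public void put(String key, String value) {
    data.put(key, value);
  }

  public String get(String key) {
    return data.get(key);
  }
}

// Append-only log: each entry lives under a key derived from its position.
final class KeyValueLog {
  private static final String NEXT_POSITION_KEY = "log/next";
  private final KeyValueStore store;

  KeyValueLog(KeyValueStore store) {
    this.store = store;
  }

  // Writes the entry at the next free position and advances the counter.
  long append(String entry) {
    long position = nextPosition();
    store.put(entryKey(position), entry);
    store.put(NEXT_POSITION_KEY, Long.toString(position + 1));
    return position;
  }

  // Reads every entry back in append order, e.g. to rebuild in-memory state.
  List<String> replay() {
    List<String> entries = new ArrayList<>();
    long end = nextPosition();
    for (long i = 0; i < end; i++) {
      entries.add(store.get(entryKey(i)));
    }
    return entries;
  }

  private long nextPosition() {
    String raw = store.get(NEXT_POSITION_KEY);
    return raw == null ? 0L : Long.parseLong(raw);
  }

  private static String entryKey(long position) {
    return "log/entry/" + position;
  }
}
```

The sketch is single-writer and not atomic across the two puts in append();
a real implementation would need to address both.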

> Also, what does this mean for stores that have never existed as non-H2 (i.e.
> the job update store).


They will need to be reimplemented with map-based stores.  I'm not fazed
by this part of the effort, and should have a JobUpdateStore implementation
ready over the next few days.
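
As a rough illustration of what "map-based" could mean here, the following is
a minimal sketch of a store that keeps objects in a plain Java map keyed by
ID. JobUpdate and MapJobUpdateStore are hypothetical stand-ins invented for
this example; they are not Aurora's thrift structs or its actual
JobUpdateStore interface.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a thrift-generated job update struct.
final class JobUpdate {
  final String updateId;
  final String jobKey;
  final String status;

  JobUpdate(String updateId, String jobKey, String status) {
    this.updateId = updateId;
    this.jobKey = jobKey;
    this.status = status;
  }
}

// Hypothetical map-backed store: no SQL, no joins, just lookups by ID.
final class MapJobUpdateStore {
  private final Map<String, JobUpdate> updates = new ConcurrentHashMap<>();

  // Inserts the update, replacing any previous value with the same ID.
  void save(JobUpdate update) {
    updates.put(update.updateId, update);
  }

  // Returns the stored object directly; nothing needs to be re-hydrated.
  Optional<JobUpdate> fetch(String updateId) {
    return Optional.ofNullable(updates.get(updateId));
  }

  // Returns true if an update with this ID existed and was removed.
  boolean remove(String updateId) {
    return updates.remove(updateId) != null;
  }

  int size() {
    return updates.size();
  }
}
```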


> Will converting it have an impact on, e.g., storage write-lock contention?


For JobUpdateStore, I expect scheduler performance to increase
significantly.  This store has been a performance quagmire in high-scale
clusters.

Looking around at the current state of write locking, we're still at the
whim of a global write lock in LogStorage, so we at least should not
regress!
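
For readers unfamiliar with the pattern, here is a generic sketch of a storage
facade in which every mutation takes one shared write lock, regardless of
which store it touches. It is not LogStorage's actual code; the class and
method names are hypothetical. Under this assumption, a faster individual
store shortens the time the lock is held but writers are still serialized.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical facade: two logical stores guarded by a single global lock.
final class GloballyLockedStorage {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private final Map<String, String> tasks = new HashMap<>();
  private final Map<String, String> jobUpdates = new HashMap<>();

  // Writes to either store contend for the same write lock.
  void saveTask(String taskId, String state) {
    lock.writeLock().lock();
    try {
      tasks.put(taskId, state);
    } finally {
      lock.writeLock().unlock();
    }
  }

  void saveJobUpdate(String updateId, String status) {
    lock.writeLock().lock();
    try {
      jobUpdates.put(updateId, status);
    } finally {
      lock.writeLock().unlock();
    }
  }

  // Reads share the read lock and only block while a write is in progress.
  String fetchTask(String taskId) {
    lock.readLock().lock();
    try {
      return tasks.get(taskId);
    } finally {
      lock.readLock().unlock();
    }
  }

  String fetchJobUpdate(String updateId) {
    lock.readLock().lock();
    try {
      return jobUpdates.get(updateId);
    } finally {
      lock.readLock().unlock();
    }
  }
}
```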


On Tue, Oct 3, 2017 at 7:45 AM, Joshua Cohen <jc...@apache.org> wrote:

> What does this mean in terms of the original goals behind the storage
> system refactor? Are we confident that Jordan's work for hot-followers will
> alleviate the problems w/ long failovers? I'm definitely in favor of
> killing the H2 code if its goals can never be realized and it's just a
> maintenance burden, but I'd also like to know what our plans are for
> storage in the future.
>
> Also, what does this mean for stores that have never existed as non-H2
> (i.e. the job update store). Will converting it have an impact on, e.g.,
> storage write-lock contention?
>
> On Sun, Oct 1, 2017 at 5:59 PM, Bill Farner <wf...@apache.org> wrote:
>
> > I would like to revive this discussion in light of some work I have been
> > doing around the storage system.  The fruits of the DB storage system will
> > require a lot of additional effort to reach the beneficial outcomes I laid
> > out above, and I agree that we should cut our losses.
> >
> > I plan to post patches soon that introduce non-H2 in-memory store
> > implementations.  *If anyone disagrees with removing the H2 implementations
> > as well, please chime in here.*
> >
> > Disclaimer - I may propose an alternative for the persistent storage in the
> > near future.
> >
> > On Mon, Apr 3, 2017 at 9:40 AM, Stephan Erb <se...@apache.org> wrote:
> >
> > > H2 could give us fine granular data access. However, most of our code
> > > performs massive joins to reconstruct fully hydrated thrift objects.
> > > Most of the time we are then only interested in very few properties of
> > > those thrift structs. This applies to internal usage, but also how we
> > > use the API.
> > >
> > > I therefore believe we have to improve and refine our domain model in
> > > order to significantly improve the storage situation.
> > >
> > > I really liked Maxim's proposal from last year, and I think it is worth
> > > reconsidering:
> > > https://docs.google.com/document/d/1myYX3yuofGr8JIzud98xXd5mqgpZ8q_RqKBpSff4-WE/edit
> > >
> > > Best regards,
> > > Stephan
> > >
> > > On Thu, 2017-03-30 at 15:53 -0700, David McLaughlin wrote:
> > > > So it sounds like before we make any decisions around removing the work
> > > > done in H2 so far, we should figure out what is remaining to move to
> > > > external storage (or if it's even still a goal).
> > > >
> > > > I may still play around with reviving the in-memory stores, but will
> > > > separate that work from any goal to remove the H2 layer. Since it's
> > > > motivated by performance, I'd verify there is a benefit before submitting
> > > > any review.
> > > >
> > > > Thanks all for the feedback.
> > > >
> > > >
> > > > On Thu, Mar 30, 2017 at 12:08 PM, Bill Farner <wfarnerapache@gmail.com>
> > > > wrote:
> > > >
> > > > > Adding some background - there were several motivators to using SQL
> > > > > that come to mind:
> > > > > a) well-understood transaction isolation guarantees leading to a
> > > > >    simpler programming model w.r.t. concurrency
> > > > > b) ability to offload storage to a separate system (e.g. Postgres)
> > > > >    and scale it separately
> > > > > c) relief of computational burden of performing snapshots and
> > > > >    backups due to (b)
> > > > > d) simpler code and operations model due to (b)
> > > > > e) schema backwards compatibility guarantees due to
> > > > >    persistence-friendly migration-scripts
> > > > > f) straightforward normalization to facilitate sharing of
> > > > >    otherwise-redundant state (i.e. TaskConfig)
> > > > >
> > > > > The storage overhaul comes with a huge caveat requiring the approach
> > > > > to scheduling rounds to change. I concur that the current model is
> > > > > hostile to offloaded storage, as ~all state must be read every
> > > > > scheduling round. If that cannot be worked around with lazy state or
> > > > > best-effort concurrency (i.e. in-memory caching), the approach is
> > > > > indeed flawed.
> > > > >
> > > > > On Mar 30, 2017, 10:29 AM -0700, Joshua Cohen <jc...@apache.org>,
> > > > > wrote:
> > > > > > My understanding of the H2-backed stores is that at least part of the
> > > > > > original rationale behind them was that they were meant to be an
> > > > > > interim point on the way to external SQL-backed stores which should
> > > > > > theoretically provide significant benefits w.r.t. GC (obviously
> > > > > > unproven, especially at scale).
> > > > > >
> > > > > > I don't disagree that the H2 stores themselves are problematic (to say
> > > > > > the least); do we have evidence that returning to memory based stores
> > > > > > will be an improvement on that?
> > > > > >
> > > > > >
> > > > > > On Thu, Mar 30, 2017 at 12:16 PM, David McLaughlin <
> > > > >
> > > > > dmclaughlin@apache.org
> > > > > > wrote:
> > > > > >
> > > > > > > Hi all,
> > > > > > >
> > > > > > > I'd like to start a discussion around storage in Aurora.
> > > > > > >
> > > > > > > I think one of the biggest mistakes we made in migrating our
> > > > > > > storage
> > > > >
> > > > > to H2
> > > > > > > was deleting the memory stores as we moved. We made a pretty
> > > > > > > big bet
> > > > >
> > > > > that
> > > > > > > we could eventually make H2/relational databases work. I don't
> > > > > > > think
> > > > >
> > > > > that
> > > > > > > bet has paid off and that we need to revisit the direction
> > > > > > > we're
> > > > >
> > > > > taking.
> > > > > > >
> > > > > > > My belief is that the current H2/MyBatis approach is untenable
> > > > > > > for
> > > > >
> > > > > large
> > > > > > > production clusters, at least without changing our current
> > > > >
> > > > > single-master
> > > > > > > architecture. At Twitter we are already having to fight to keep
> > > > > > > GC
> > > > > > > manageable even without DbTaskStore enabled, so I don't see a
> > > > > > > path
> > > > >
> > > > > forward
> > > > > > > where we could eventually enable that. So far experiments with
> > > > > > > H2
> > > > >
> > > > > off-heap
> > > > > > > storage have provided marginal (if any) gains.
> > > > > > >
> > > > > > > Would anyone object to restoring the in-memory stores and
> > > > > > > creating new
> > > > > > > implementations for the missing ones (UpdateStore)? I'd even go
> > > > >
> > > > > further and
> > > > > > > propose that we consider in-memory H2 and MyBatis a failed
> > > > > > > experiment
> > > > >
> > > > > and
> > > > > > > we drop that storage layer completely.
> > > > > > >
> > > > > > > Cheers,
> > > > > > > David
> > > > > > >
> > >
> >
>

Re: Future of storage in Aurora

Posted by Joshua Cohen <jc...@apache.org>.
What does this mean in terms of the original goals behind the storage
system refactor? Are we confident that Jordan's work for hot-followers will
alleviate the problems w/ long failovers? I'm definitely in favor of
killing the H2 code if its goals can never be realized and it's just a
maintenance burden, but I'd also like to know what our plans are for
storage in the future.

Also, what does this mean for stores that have never existed as non-H2
(i.e. the job update store). Will converting it have an impact on, e.g.,
storage write-lock contention?


Re: Future of storage in Aurora

Posted by Bill Farner <wf...@apache.org>.
That’s right, nothing fancy.

On Oct 2, 2017, 4:24 AM -0700, Erb, Stephan <St...@blue-yonder.com>, wrote:
> What do you have in mind for the in-memory replacement? Revert back to the usage of thrift objects within plain Java containers like we do for the task store?

Re: Future of storage in Aurora

Posted by "Erb, Stephan" <St...@blue-yonder.com>.
What do you have in mind for the in-memory replacement? Revert back to the usage of thrift objects within plain Java containers like we do for the task store?



Re: Future of storage in Aurora

Posted by Bill Farner <wf...@apache.org>.
I would like to revive this discussion in light of some work I have been
doing around the storage system.  The fruits of the DB storage system will
require a lot of additional effort to reach the beneficial outcomes I laid
out above, and I agree that we should cut our losses.

I plan to post patches soon that introduce non-H2 in-memory store
implementations.  *If anyone disagrees with removing the H2 implementations
as well, please chime in here.*

Disclaimer - I may propose an alternative for the persistent storage in the
near future.


Re: Future of storage in Aurora

Posted by Stephan Erb <se...@apache.org>.
H2 could give us fine-grained data access. However, most of our code
performs massive joins to reconstruct fully hydrated thrift objects.
Most of the time we are then only interested in very few properties of
those thrift structs. This applies to internal usage, but also how we
use the API.

I therefore believe we have to improve and refine our domain model in
order to significantly improve the storage situation.
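
To illustrate the hydration cost described above, here is a minimal sketch
in Python. It is not Aurora's storage code: the standard-library sqlite3
module stands in for H2, and the schema, the table names, and the functions
fetch_hydrated and fetch_statuses are all hypothetical. It contrasts
rebuilding full task objects through joins with reading only the single
property a caller needs.

```python
import sqlite3

# Hypothetical, heavily simplified schema; sqlite3 stands in for H2 here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE task_configs (id INTEGER PRIMARY KEY, job TEXT, cpus REAL, ram_mb INTEGER);
    CREATE TABLE tasks (id TEXT PRIMARY KEY, status TEXT, config_id INTEGER);
    CREATE TABLE task_events (task_id TEXT, ts INTEGER, status TEXT);
""")
conn.execute("INSERT INTO task_configs VALUES (1, 'web', 2.0, 1024)")
conn.executemany(
    "INSERT INTO tasks VALUES (?, ?, ?)",
    [("t1", "RUNNING", 1), ("t2", "PENDING", 1)],
)
conn.executemany(
    "INSERT INTO task_events VALUES (?, ?, ?)",
    [("t1", 1, "PENDING"), ("t1", 2, "RUNNING"), ("t2", 1, "PENDING")],
)


def fetch_hydrated(conn):
    """Rebuild complete task records by joining every related table."""
    rows = conn.execute("""
        SELECT t.id, t.status, c.job, c.cpus, c.ram_mb, e.ts, e.status
        FROM tasks t
        JOIN task_configs c ON c.id = t.config_id
        JOIN task_events e ON e.task_id = t.id
        ORDER BY t.id, e.ts
    """)
    tasks = {}
    for task_id, status, job, cpus, ram_mb, ts, event_status in rows:
        task = tasks.setdefault(task_id, {
            "id": task_id,
            "status": status,
            "config": {"job": job, "cpus": cpus, "ram_mb": ram_mb},
            "events": [],
        })
        task["events"].append((ts, event_status))
    return list(tasks.values())


def fetch_statuses(conn):
    """Read only the one property the caller needs; no joins, no hydration."""
    rows = conn.execute("SELECT id, status FROM tasks ORDER BY id")
    return dict(rows.fetchall())


# Both paths answer "what is each task's status?", but the first one
# materializes configs and event histories that are then thrown away.
hydrated = {task["id"]: task["status"] for task in fetch_hydrated(conn)}
projected = fetch_statuses(conn)
```

Under these assumptions both calls return the same status mapping; the
difference is how many rows and objects are built along the way, which is
the kind of overhead a narrower domain model could avoid.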

I really liked Maxim's proposal from last year, and I think it is worth
reconsidering:
https://docs.google.com/document/d/1myYX3yuofGr8JIzud98xXd5mqgpZ8q_RqKBpSff4-WE/edit

Best regards,
Stephan

On Thu, 2017-03-30 at 15:53 -0700, David McLaughlin wrote:
> So it sounds like before we make any decisions around removing the work
> done in H2 so far, we should figure out what is remaining to move to
> external storage (or if it's even still a goal).
>
> I may still play around with reviving the in-memory stores, but will
> separate that work from any goal to remove the H2 layer. Since it's
> motivated by performance, I'd verify there is a benefit before submitting
> any review.
>
> Thanks all for the feedback.
>
>
> On Thu, Mar 30, 2017 at 12:08 PM, Bill Farner <wfarnerapache@gmail.com>
> wrote:
>
> > Adding some background - there were several motivators to using SQL that
> > come to mind:
> > a) well-understood transaction isolation guarantees leading to a simpler
> > programming model w.r.t. concurrency
> > b) ability to offload storage to a separate system (e.g. Postgres) and
> > scale it separately
> > c) relief of computational burden of performing snapshots and backups due
> > to (b)
> > d) simpler code and operations model due to (b)
> > e) schema backwards compatibility guarantees due to persistence-friendly
> > migration-scripts
> > f) straightforward normalization to facilitate sharing of
> > otherwise-redundant state (I.e. TaskConfig)
> >
> > The storage overhaul comes with a huge caveat requiring the approach to
> > scheduling rounds to change. I concur that the current model is hostile to
> > offloaded storage, as ~all state must be read every scheduling round. If
> > that cannot be worked around with lazy state or best-effort concurrency
> > (I.e. in-memory caching), the approach is indeed flawed.
> >
> > On Mar 30, 2017, 10:29 AM -0700, Joshua Cohen <jc...@apache.org>, wrote:
> > > My understanding of the H2-backed stores is that at least part of the
> > > original rationale behind them was that they were meant to be an interim
> > > point on the way to external SQL-backed stores which should theoretically
> > > provide significant benefits w.r.t. to GC (obviously unproven, especially
> > > at scale).
> > >
> > > I don't disagree that the H2 stores themselves are problematic (to say the
> > > least); do we have evidence that returning to memory based stores will be
> > > an improvement on that?
> > >
> > > On Thu, Mar 30, 2017 at 12:16 PM, David McLaughlin <dmclaughlin@apache.org>
> > > wrote:
> > >
> > > > Hi all,
> > > >
> > > > I'd like to start a discussion around storage in Aurora.
> > > >
> > > > I think one of the biggest mistakes we made in migrating our storage to H2
> > > > was deleting the memory stores as we moved. We made a pretty big bet that
> > > > we could eventually make H2/relational databases work. I don't think that
> > > > bet has paid off and that we need to revisit the direction we're taking.
> > > >
> > > > My belief is that the current H2/MyBatis approach is untenable for large
> > > > production clusters, at least without changing our current single-master
> > > > architecture. At Twitter we are already having to fight to keep GC
> > > > manageable even without DbTaskStore enabled, so I don't see a path forward
> > > > where we could eventually enable that. So far experiments with H2 off-heap
> > > > storage have provided marginal (if any) gains.
> > > >
> > > > Would anyone object to restoring the in-memory stores and creating new
> > > > implementations for the missing ones (UpdateStore)? I'd even go further and
> > > > propose that we consider in-memory H2 and MyBatis a failed experiment and
> > > > we drop that storage layer completely.
> > > >
> > > > Cheers,
> > > > David
> > > > 

Re: Future of storage in Aurora

Posted by David McLaughlin <dm...@apache.org>.
So it sounds like before we make any decisions around removing the work
done in H2 so far, we should figure out what is remaining to move to
external storage (or if it's even still a goal).

I may still play around with reviving the in-memory stores, but will
separate that work from any goal to remove the H2 layer. Since it's
motivated by performance, I'd verify there is a benefit before submitting
any review.

Thanks all for the feedback.


On Thu, Mar 30, 2017 at 12:08 PM, Bill Farner <wf...@gmail.com>
wrote:

> Adding some background - there were several motivators to using SQL that
> come to mind:
> a) well-understood transaction isolation guarantees leading to a simpler
> programming model w.r.t. concurrency
> b) ability to offload storage to a separate system (e.g. Postgres) and
> scale it separately
> c) relief of computational burden of performing snapshots and backups due
> to (b)
> d) simpler code and operations model due to (b)
> e) schema backwards compatibility guarantees due to persistence-friendly
> migration-scripts
> f) straightforward normalization to facilitate sharing of
> otherwise-redundant state (I.e. TaskConfig)
>
> The storage overhaul comes with a huge caveat requiring the approach to
> scheduling rounds to change. I concur that the current model is hostile to
> offloaded storage, as ~all state must be read every scheduling round. If
> that cannot be worked around with lazy state or best-effort concurrency
> (I.e. in-memory caching), the approach is indeed flawed.
>
> On Mar 30, 2017, 10:29 AM -0700, Joshua Cohen <jc...@apache.org>, wrote:
> > My understanding of the H2-backed stores is that at least part of the
> > original rationale behind them was that they were meant to be an interim
> > point on the way to external SQL-backed stores which should theoretically
> > provide significant benefits w.r.t. to GC (obviously unproven, especially
> > at scale).
> >
> > I don't disagree that the H2 stores themselves are problematic (to say the
> > least); do we have evidence that returning to memory based stores will be
> > an improvement on that?
> >
> > On Thu, Mar 30, 2017 at 12:16 PM, David McLaughlin <dmclaughlin@apache.org>
> > wrote:
> >
> > > Hi all,
> > >
> > > I'd like to start a discussion around storage in Aurora.
> > >
> > > I think one of the biggest mistakes we made in migrating our storage to H2
> > > was deleting the memory stores as we moved. We made a pretty big bet that
> > > we could eventually make H2/relational databases work. I don't think that
> > > bet has paid off and that we need to revisit the direction we're taking.
> > >
> > > My belief is that the current H2/MyBatis approach is untenable for large
> > > production clusters, at least without changing our current single-master
> > > architecture. At Twitter we are already having to fight to keep GC
> > > manageable even without DbTaskStore enabled, so I don't see a path forward
> > > where we could eventually enable that. So far experiments with H2 off-heap
> > > storage have provided marginal (if any) gains.
> > >
> > > Would anyone object to restoring the in-memory stores and creating new
> > > implementations for the missing ones (UpdateStore)? I'd even go further and
> > > propose that we consider in-memory H2 and MyBatis a failed experiment and
> > > we drop that storage layer completely.
> > >
> > > Cheers,
> > > David
> > >
>

Re: Future of storage in Aurora

Posted by Bill Farner <wf...@gmail.com>.
Adding some background - there were several motivators to using SQL that come to mind:
a) well-understood transaction isolation guarantees leading to a simpler programming model w.r.t. concurrency
b) ability to offload storage to a separate system (e.g. Postgres) and scale it separately
c) relief of computational burden of performing snapshots and backups due to (b)
d) simpler code and operations model due to (b)
e) schema backwards compatibility guarantees due to persistence-friendly migration scripts
f) straightforward normalization to facilitate sharing of otherwise-redundant state (i.e. TaskConfig)
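
As a rough illustration of point (f), the sketch below shows what sharing
otherwise-redundant state can look like when done in memory rather than
through a normalized schema. The TaskConfig fields and the ConfigPool class
are hypothetical and only loosely modeled on the idea discussed here; they
are not Aurora's actual types.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskConfig:
    """Hypothetical, simplified stand-in for a task configuration."""
    job: str
    cpus: float
    ram_mb: int


class ConfigPool:
    """Keeps one canonical instance per distinct configuration."""

    def __init__(self):
        self._pool = {}

    def intern(self, config):
        # Return the instance already stored for an equal config, or
        # store and return this one if it is the first of its kind.
        return self._pool.setdefault(config, config)

    def __len__(self):
        return len(self._pool)


pool = ConfigPool()
tasks = [
    {"id": f"web-{i}", "config": pool.intern(TaskConfig("web", 2.0, 1024))}
    for i in range(1000)
]
```

With these assumptions, a thousand tasks of the same job reference a single
shared configuration object, which is the in-memory analogue of a foreign
key into a normalized config table.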

The storage overhaul comes with a huge caveat requiring the approach to scheduling rounds to change. I concur that the current model is hostile to offloaded storage, as ~all state must be read every scheduling round. If that cannot be worked around with lazy state or best-effort concurrency (i.e. in-memory caching), the approach is indeed flawed.
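
A minimal sketch of the in-memory caching workaround mentioned above,
assuming a single writer (which matches the single-master architecture
discussed in this thread). ExternalStore and CachingStore are hypothetical
names, and the read counter exists only to make the cost visible; none of
this is Aurora code.

```python
class ExternalStore:
    """Stand-in for offloaded storage; counts reads to expose their cost."""

    def __init__(self, data):
        self._data = dict(data)
        self.reads = 0

    def get(self, key):
        self.reads += 1
        return self._data[key]

    def put(self, key, value):
        self._data[key] = value


class CachingStore:
    """Read-through, write-through cache in front of the external store."""

    def __init__(self, backend):
        self._backend = backend
        self._cache = {}

    def get(self, key):
        # Only the first read of a key reaches the backend.
        if key not in self._cache:
            self._cache[key] = self._backend.get(key)
        return self._cache[key]

    def put(self, key, value):
        # Write to the backend first, then refresh the cached copy.
        self._backend.put(key, value)
        self._cache[key] = value


backend = ExternalStore({"task-1": "PENDING", "task-2": "RUNNING"})
store = CachingStore(backend)

# Three simulated scheduling rounds, each reading all task state.
for _ in range(3):
    snapshot = {key: store.get(key) for key in ("task-1", "task-2")}

store.put("task-1", "ASSIGNED")
```

In this sketch the three rounds cost two backend reads in total instead of
six. The cache stays correct only while every write goes through the same
process; with multiple writers it would need invalidation, which is where
the "best-effort" qualifier comes in.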

On Mar 30, 2017, 10:29 AM -0700, Joshua Cohen <jc...@apache.org>, wrote:
> My understanding of the H2-backed stores is that at least part of the
> original rationale behind them was that they were meant to be an interim
> point on the way to external SQL-backed stores which should theoretically
> provide significant benefits w.r.t. to GC (obviously unproven, especially
> at scale).
>
> I don't disagree that the H2 stores themselves are problematic (to say the
> least); do we have evidence that returning to memory based stores will be
> an improvement on that?
>
> On Thu, Mar 30, 2017 at 12:16 PM, David McLaughlin <dmclaughlin@apache.org
> wrote:
>
> > Hi all,
> >
> > I'd like to start a discussion around storage in Aurora.
> >
> > I think one of the biggest mistakes we made in migrating our storage to H2
> > was deleting the memory stores as we moved. We made a pretty big bet that
> > we could eventually make H2/relational databases work. I don't think that
> > bet has paid off and that we need to revisit the direction we're taking.
> >
> > My belief is that the current H2/MyBatis approach is untenable for large
> > production clusters, at least without changing our current single-master
> > architecture. At Twitter we are already having to fight to keep GC
> > manageable even without DbTaskStore enabled, so I don't see a path forward
> > where we could eventually enable that. So far experiments with H2 off-heap
> > storage have provided marginal (if any) gains.
> >
> > Would anyone object to restoring the in-memory stores and creating new
> > implementations for the missing ones (UpdateStore)? I'd even go further and
> > propose that we consider in-memory H2 and MyBatis a failed experiment and
> > we drop that storage layer completely.
> >
> > Cheers,
> > David
> >

Re: Future of storage in Aurora

Posted by Joshua Cohen <jc...@apache.org>.
My understanding of the H2-backed stores is that at least part of the
original rationale behind them was that they were meant to be an interim
point on the way to external SQL-backed stores which should theoretically
provide significant benefits w.r.t. GC (obviously unproven, especially
at scale).

I don't disagree that the H2 stores themselves are problematic (to say the
least); do we have evidence that returning to memory-based stores will be
an improvement on that?

On Thu, Mar 30, 2017 at 12:16 PM, David McLaughlin <dm...@apache.org>
wrote:

> Hi all,
>
> I'd like to start a discussion around storage in Aurora.
>
> I think one of the biggest mistakes we made in migrating our storage to H2
> was deleting the memory stores as we moved. We made a pretty big bet that
> we could eventually make H2/relational databases work. I don't think that
> bet has paid off and that we need to revisit the direction we're taking.
>
> My belief is that the current H2/MyBatis approach is untenable for large
> production clusters, at least without changing our current single-master
> architecture. At Twitter we are already having to fight to keep GC
> manageable even without DbTaskStore enabled, so I don't see a path forward
> where we could eventually enable that. So far experiments with H2 off-heap
> storage have provided marginal (if any) gains.
>
> Would anyone object to restoring the in-memory stores and creating new
> implementations for the missing ones (UpdateStore)? I'd even go further and
> propose that we consider in-memory H2 and MyBatis a failed experiment and
> we drop that storage layer completely.
>
> Cheers,
> David
>

Re: Future of storage in Aurora

Posted by Zameer Manji <zm...@apache.org>.
I don't object to changes to storage so long as we have a migration plan
and a design doc. I'm also not opposed to radical revisits of storage,
including overhauling what we store and where we store it. For example,
instead of storing our `TaskConfig` objects, could we store Mesos `TaskInfo`
objects? Could we store data outside of the scheduler, such as in
Cassandra? Should we have a high-level 'Job' store to make querying for
job-level data easier?
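
One way to picture the job-level store idea is a secondary index kept
alongside the task records, so that job queries do not scan every task.
The sketch below is purely illustrative: the TaskStore class, its methods,
and the record layout are hypothetical and do not reflect Aurora's storage
interfaces.

```python
from collections import defaultdict


class TaskStore:
    """Task records plus a per-job index maintained on every write."""

    def __init__(self):
        self._tasks = {}
        self._by_job = defaultdict(set)

    def save(self, task_id, job, status):
        # Drop the task from its previous job's index entry, if any.
        old = self._tasks.get(task_id)
        if old is not None:
            self._by_job[old["job"]].discard(task_id)
        self._tasks[task_id] = {"job": job, "status": status}
        self._by_job[job].add(task_id)

    def job_summary(self, job):
        """Count tasks per status for one job, touching only that job's tasks."""
        counts = defaultdict(int)
        for task_id in self._by_job.get(job, ()):
            counts[self._tasks[task_id]["status"]] += 1
        return dict(counts)


store = TaskStore()
store.save("web-0", "web", "RUNNING")
store.save("web-1", "web", "PENDING")
store.save("db-0", "db", "RUNNING")
store.save("web-1", "web", "RUNNING")  # status update for an existing task
```

The trade-off in a design like this is extra bookkeeping on every write in
exchange for job-level reads that are proportional to the size of the job
rather than the size of the cluster.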

On Thu, Mar 30, 2017 at 10:16 AM, David McLaughlin <dm...@apache.org>
wrote:

> Hi all,
>
> I'd like to start a discussion around storage in Aurora.
>
> I think one of the biggest mistakes we made in migrating our storage to H2
> was deleting the memory stores as we moved. We made a pretty big bet that
> we could eventually make H2/relational databases work. I don't think that
> bet has paid off and that we need to revisit the direction we're taking.
>
> My belief is that the current H2/MyBatis approach is untenable for large
> production clusters, at least without changing our current single-master
> architecture. At Twitter we are already having to fight to keep GC
> manageable even without DbTaskStore enabled, so I don't see a path forward
> where we could eventually enable that. So far experiments with H2 off-heap
> storage have provided marginal (if any) gains.
>
> Would anyone object to restoring the in-memory stores and creating new
> implementations for the missing ones (UpdateStore)? I'd even go further and
> propose that we consider in-memory H2 and MyBatis a failed experiment and
> we drop that storage layer completely.
>
> Cheers,
> David
>
> --
> Zameer Manji
>