Posted to oak-dev@jackrabbit.apache.org by Matt Ryan <os...@mvryan.org> on 2017/07/26 00:20:05 UTC

Composite blob store and Overlay blob store proposals

Hi oak-dev,

I’ve written up some proposals on the wiki for blob stores that can
reference multiple blob storage locations.

Both act as a single logical blob store to Oak and can be treated as a
single blob store.  Both have at least two “delegate” blob stores managed
by the primary blob store.

There are two concepts.  One I’m currently calling the Overlay blob store
(we haven’t voted on this name yet).  In this case, delegates are
configured with a preferred order of lookup.  When a read is issued, the
overlay blob store will attempt to satisfy the read by going through the
delegates in order until one can satisfy the read.  [0]
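
As a rough illustration of the overlay read path (all names below are
invented for this email, not the proposed API), the resolution is just a
priority-ordered fallthrough:

    import java.io.InputStream;
    import java.util.List;

    interface BlobStoreDelegate {
        // Returns a stream for the blob, or null if this delegate
        // doesn't have it.
        InputStream readBlob(String blobId);
    }

    class OverlayReadResolver {
        private final List<BlobStoreDelegate> delegatesInPriorityOrder;

        OverlayReadResolver(List<BlobStoreDelegate> delegates) {
            this.delegatesInPriorityOrder = delegates;
        }

        InputStream read(String blobId) {
            // Try each delegate in the configured order; first hit wins.
            for (BlobStoreDelegate delegate : delegatesInPriorityOrder) {
                InputStream stream = delegate.readBlob(blobId);
                if (stream != null) {
                    return stream;
                }
            }
            return null; // not found in any delegate
        }
    }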

The second concept is the Composite blob store that was previously
discussed on-list.  In this case, delegates are configured with rules
specifying which blobs belong in which delegate, with exactly one delegate
being specified as the default.  There is only ever exactly one correct
location for a blob in a composite blob store.  When a read is issued, the
composite blob store will evaluate the rules to determine which delegate
should be able to satisfy the request, and then read from that delegate
only, or fail if it is not found in the delegate.  [1]
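
Again only as a rough illustration (reusing the hypothetical
BlobStoreDelegate interface from the sketch above), the composite read
path evaluates the rules first and then reads from exactly one delegate:

    import java.io.InputStream;
    import java.util.LinkedHashMap;
    import java.util.Map;
    import java.util.function.Predicate;

    // Placeholder for whatever per-blob information the rules match on.
    class BlobMetadata { }

    class CompositeReadResolver {
        private final Map<Predicate<BlobMetadata>, BlobStoreDelegate> rules =
                new LinkedHashMap<>();
        private final BlobStoreDelegate defaultDelegate;

        CompositeReadResolver(BlobStoreDelegate defaultDelegate) {
            this.defaultDelegate = defaultDelegate;
        }

        void addRule(Predicate<BlobMetadata> rule, BlobStoreDelegate delegate) {
            rules.put(rule, delegate);
        }

        InputStream read(String blobId, BlobMetadata metadata) {
            BlobStoreDelegate target = defaultDelegate;
            for (Map.Entry<Predicate<BlobMetadata>, BlobStoreDelegate> e : rules.entrySet()) {
                if (e.getKey().test(metadata)) {
                    target = e.getValue();
                    break;
                }
            }
            // Read from the single matching delegate only; a miss here is
            // a miss for the whole composite store.
            return target.readBlob(blobId);
        }
    }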

As I thought about all the use cases, these two concepts kind of stood out
in contrast to each other so I thought I would propose that we formalize
the two as separate but similar concepts.

I would appreciate feedback and discussion on how we can make these useful
for future Oak versions.  Thanks!


-MR


[0] - https://wiki.apache.org/jackrabbit/Overlay%20Blob%20Store
[1] - https://wiki.apache.org/jackrabbit/Composite%20Blob%20Store

Re: Composite blob store and Overlay blob store proposals

Posted by Matt Ryan <os...@mvryan.org>.
Hi Amit,

> Storage class should be included if it can be, so long as it serves a
> purpose. I’m not sure I’m seeing the purpose yet.
>
>
Why I added "storage class" as separate from "priority" was to highlight
some DataStore(s) would need special handling for reads/writes for e.g. a
read from Glacier may not be immediately available, S3 IA advertises
availability as 99.9% and not 99.99% as is the case for S3 (1 in 1000 read
request can fail vs 1 in 10000). We would probably need some mechanism to
handle/signal the callers by having callbacks/exceptions/logging etc.
I am not saying this particular thing is what should be implemented right
away though and we cannot cover every nuanced interpretation of
categorizing the DataStore(s).


Thanks for the clarification - I think I understand the goal now, and I
agree with you, at least generally.  As I’ve thought about how I might
implement an algorithm to look through a number of delegates, it is
definitely the case that I would want to know the storage class so I can
prioritize each delegate appropriately compared to the others.

I’ll update the wiki.


-MR

Re: Composite blob store and Overlay blob store proposals

Posted by Amit Jain <am...@ieee.org>.
Hi Matt,

> Let’s consider AWS as an example.  Using this nomenclature, you’d probably
> say Glacier is “slow” and S3 is “fast”.  S3 IA is probably not any slower
> than S3, but you probably wouldn’t want to label it the same.  Yet it is
> certainly not “slow” if Glacier is also “slow”.  Likewise, if you also had a
> FileDataStore configured, it seems like S3 wouldn’t be “fast” anymore
> compared to FDS.
>
> I then thought of using terms like “hot” (S3), “cool” (S3 IA), and “cold”
> (Glacier), but I’m not sure what that tells me that I couldn’t know simply
> via priority.  And that doesn’t address the problem that FDS would also be
> “hot” but probably hotter than S3.  There could be a number of nuances in
> between as this grows.
>
> Storage class should be included if it can be, so long as it serves a
> purpose.  I’m not sure I’m seeing the purpose yet.
>
>
Why I added "storage class" as separate from "priority" was to highlight
some DataStore(s) would need special handling for reads/writes for e.g. a
read from Glacier may not be immediately available, S3 IA advertises
availability as 99.9% and not 99.99% as is the case for S3 (1 in 1000 read
request can fail vs 1 in 10000). We would probably need some mechanism to
handle/signal the callers by having callbacks/exceptions/logging etc.
I am not saying this particular thing is what should be implemented right
away though and we cannot cover every nuanced interpretation of
categorizing the DataStore(s).

So, we would want to design the API to be generic and extensible/pluggable,
but prioritize what can be implemented for the 1.8 release.
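
Very roughly, something like the following (all names are invented here,
not a proposal for the actual interfaces), matching the read/write,
storage class and priority characteristics from my earlier mail:

    import java.io.InputStream;
    import java.util.concurrent.CompletableFuture;

    // Per-delegate characteristics.
    enum StorageClass { FAST, SLOW }           // e.g. S3/FDS vs. Glacier
    enum AccessMode { READ_ONLY, READ_WRITE }

    class DelegateCharacteristics {
        final AccessMode accessMode;
        final StorageClass storageClass;
        final int priority;                    // lower value = tried earlier

        DelegateCharacteristics(AccessMode accessMode,
                                StorageClass storageClass,
                                int priority) {
            this.accessMode = accessMode;
            this.storageClass = storageClass;
            this.priority = priority;
        }
    }

    // One possible way a SLOW delegate could signal callers instead of
    // blocking: a future that completes immediately for FAST delegates but
    // only after e.g. a Glacier restore for SLOW ones, and that can fail.
    interface DelayedReadAware {
        CompletableFuture<InputStream> readWhenAvailable(String blobId);
    }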

Thanks
Amit

Re: Composite blob store and Overlay blob store proposals

Posted by Matt Ryan <os...@mvryan.org>.
Hi Amit,

On July 31, 2017 at 4:49:37 AM, Amit Jain (amitj@ieee.org) wrote:

With this in mind, can't we just conceptually have a Composite DataStore
(not drilling down to the interface/class hierarchy and API yet) which can
then support the following:
* A user-provided "type" of blob to influence the logical/physical
DataStore the blob/file is written to (needs some sort of configuration
with pre-defined types mapped to the datastores).
* Optionally, defining characteristics of the DataStore(s):
** read/write
** storage class - slow, fast
** priority


Can you give me a bit more detail about what you have in mind WRT “storage
class”?  What would be important to accomplish here that isn’t accomplished
by “priority”?

I’m not opposed to this idea, but it seems like it would be difficult to
make it really useful.  What I mean is, “slow” and “fast” are of course
relative, and any terms I can think of to use in their place have the same
limitation.

Let’s consider AWS as an example.  Using this nomenclature, you’d probably
say Glacier is “slow” and S3 is “fast”.  S3 IA is probably not any slower
than S3, but you probably wouldn’t want to label it the same.  Yet it is
certainly not “slow” if Glacier is also “slow”.  Likewise, if you also had a
FileDataStore configured, it seems like S3 wouldn’t be “fast” anymore
compared to FDS.

I then thought of using terms like “hot” (S3), “cool” (S3 IA), and “cold”
(Glacier), but I’m not sure what that tells me that I couldn’t know simply
via priority.  And that doesn’t address the problem that FDS would also be
“hot” but probably hotter than S3.  There could be a number of nuances in
between as this grows.

Storage class should be included if it can be, so long as it serves a
purpose.  I’m not sure I’m seeing the purpose yet.

-MR

Re: Composite blob store and Overlay blob store proposals

Posted by Matt Ryan <os...@mvryan.org>.
Hi all,

Thanks for the feedback, please keep it coming.  After considering what’s
been said I agree that merging the two concepts to one is less confusing
and a better approach.  Essentially the idea is to use the overlay concept
for request resolution (which eventually checks every delegate, thus
addressing the configuration change problem) but still allowing, although
not requiring, “storage hints”, i.e. rules for what may be stored in a
delegate.
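
To make the merged behavior concrete, here is a rough sketch (illustrative
names only, not what is on the wiki): storage hints pick a preferred
delegate, but reads still fall through every delegate, so a configuration
change never makes a blob unreadable.

    import java.io.InputStream;
    import java.util.ArrayList;
    import java.util.List;

    interface BlobStoreDelegate {
        InputStream readBlob(String blobId);   // null if not present
    }

    interface StorageHints {
        // Optional rules; may return null when no rule matches.
        BlobStoreDelegate preferredDelegateFor(String blobId);
    }

    class MergedCompositeBlobStore {
        private final List<BlobStoreDelegate> delegates;  // priority order
        private final StorageHints hints;                 // may be null

        MergedCompositeBlobStore(List<BlobStoreDelegate> delegates,
                                 StorageHints hints) {
            this.delegates = delegates;
            this.hints = hints;
        }

        InputStream read(String blobId) {
            BlobStoreDelegate preferred =
                    hints == null ? null : hints.preferredDelegateFor(blobId);
            List<BlobStoreDelegate> order = new ArrayList<>();
            if (preferred != null) {
                order.add(preferred);          // expected location first
            }
            for (BlobStoreDelegate d : delegates) {
                if (d != preferred) {
                    order.add(d);              // then everything else
                }
            }
            for (BlobStoreDelegate d : order) {
                InputStream in = d.readBlob(blobId);
                if (in != null) {
                    return in;                 // possibly found in a "wrong" delegate
                }
            }
            return null;
        }
    }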

I’ve updated the wiki page [0] accordingly.


[0] - https://wiki.apache.org/jackrabbit/Composite%20Blob%20Store


-MR


On July 31, 2017 at 4:49:37 AM, Amit Jain (amitj@ieee.org) wrote:

Hi,

I read the proposals and I have a comment wrt the Overlay/Composite
DataStores at a high level.

IIUC, these two are similar except that the Overlay could have duplicated
binaries, right? After reading the details around the two, it seems to me
the Overlay is sort of a superset of the two, and whether the composed
DataStore has duplicated binaries can be an implementation detail, e.g. a
pluggable/configurable fallback mechanism in case there are no
rules/properties defined when storing the blobs in the DataStore
(rules/properties which can influence the decision of where to redirect
the blobs).

With this in mind, can't we just conceptually have a Composite DataStore
(not drilling down to the interface/class hierarchy and API yet) which can
then support the following:
* A user-provided "type" of blob to influence the logical/physical
DataStore the blob/file is written to (needs some sort of configuration
with pre-defined types mapped to the datastores).
* Optionally, defining characteristics of the DataStore(s):
** read/write
** storage class - slow, fast
** priority

In fact, another way to look at them might just be as explicit vs. implicit
decision-making for the storage of blobs. In that case there will be
occasions where certain blobs are explicitly written to the most preferred
storage class configured (Lucene blobs, for example). Also, I think a lot
of administrative things would be the same for both, e.g. moving/copying,
garbage collection, etc.

Also, wrt the usage of the Overlay for UC9, this is still possible if we
map the cache directories to be on NFS. But do you have any tests which
show that this would be a preferred option, or what the impact on
performance would be? We didn't give it much thought, but it looked like
this may degrade performance since writing to NFS would be slower; in fact
we have a CachingDataStore option implemented for FileDataStore configured
on NFS to improve performance.

Thanks
Amit

On Wed, Jul 26, 2017 at 5:50 AM, Matt Ryan <os...@mvryan.org> wrote:

> Hi oak-dev,
>
> I’ve written up some proposals on the wiki for blob stores that can
> reference multiple blob storage locations.
>
> Both act as a single logical blob store to Oak and can be treated as a
> single blob store. Both have at least two “delegate” blob stores managed
> by the primary blob store.
>
> There are two concepts. One I’m currently calling the Overlay blob store
> (we haven’t voted on this name yet). In this case, delegates are
> configured with a preferred order of lookup. When a read is issued, the
> overlay blob store will attempt to satisfy the read by going through the
> delegates in order until one can satisfy the read. [0]
>
> The second concept is the Composite blob store that was previously
> discussed on-list. In this case, delegates are configured with rules
> specifying which blobs belong in which delegate, with exactly one delegate
> being specified as the default. There is only ever exactly one correct
> location for a blob in a composite blob store. When a read is issued, the
> composite blob store will evaluate the rules to determine which delegate
> should be able to satisfy the request, and then read from that delegate
> only, or fail if it is not found in the delegate. [1]
>
> As I thought about all the use cases, these two concepts kind of stood out
> in contrast to each other so I thought I would propose that we formalize
> the two as separate but similar concepts.
>
> I would appreciate feedback and discussion on how we can make these useful
> for future Oak versions. Thanks!
>
>
> -MR
>
>
> [0] - https://wiki.apache.org/jackrabbit/Overlay%20Blob%20Store
> [1] - https://wiki.apache.org/jackrabbit/Composite%20Blob%20Store
>

Re: Composite blob store and Overlay blob store proposals

Posted by Amit Jain <am...@ieee.org>.
Hi,

I read the proposals and I have a comment wrt the Overlay/Composite
DataStores at a high level.

IIUC, these two are similar except that the Overlay could have duplicated
binaries, right? After reading the details around the two, it seems to me
the Overlay is sort of a superset of the two, and whether the composed
DataStore has duplicated binaries can be an implementation detail, e.g. a
pluggable/configurable fallback mechanism in case there are no
rules/properties defined when storing the blobs in the DataStore
(rules/properties which can influence the decision of where to redirect
the blobs).

With this in mind, can't we just conceptually have a Composite DataStore
(not drilling down to the interface/class hierarchy and API yet) which can
then support the following:
* A user-provided "type" of blob to influence the logical/physical
DataStore the blob/file is written to (needs some sort of configuration
with pre-defined types mapped to the datastores).
* Optionally, defining characteristics of the DataStore(s):
** read/write
** storage class - slow, fast
** priority

In fact, another way to look at them might just be as explicit vs. implicit
decision-making for the storage of blobs. In that case there will be
occasions where certain blobs are explicitly written to the most preferred
storage class configured (Lucene blobs, for example). Also, I think a lot
of administrative things would be the same for both, e.g. moving/copying,
garbage collection, etc.

Also, wrt the usage of the Overlay for UC9, this is still possible if we
map the cache directories to be on NFS. But do you have any tests which
show that this would be a preferred option, or what the impact on
performance would be? We didn't give it much thought, but it looked like
this may degrade performance since writing to NFS would be slower; in fact
we have a CachingDataStore option implemented for FileDataStore configured
on NFS to improve performance.

Thanks
Amit

On Wed, Jul 26, 2017 at 5:50 AM, Matt Ryan <os...@mvryan.org> wrote:

> Hi oak-dev,
>
> I’ve written up some proposals on the wiki for blob stores that can
> reference multiple blob storage locations.
>
> Both act as a single logical blob store to Oak and can be treated as a
> single blob store.  Both have at least two “delegate” blob stores managed
> by the primary blob store.
>
> There are two concepts.  One I’m currently calling the Overlay blob store
> (we haven’t voted on this name yet).  In this case, delegates are
> configured with a preferred order of lookup.  When a read is issued, the
> overlay blob store will attempt to satisfy the read by going through the
> delegates in order until one can satisfy the read.  [0]
>
> The second concept is the Composite blob store that was previously
> discussed on-list.  In this case, delegates are configured with rules
> specifying which blobs belong in which delegate, with exactly one delegate
> being specified as the default.  There is only ever exactly one correct
> location for a blob in a composite blob store.  When a read is issued, the
> composite blob store will evaluate the rules to determine which delegate
> should be able to satisfy the request, and then read from that delegate
> only, or fail if it is not found in the delegate.  [1]
>
> As I thought about all the use cases, these two concepts kind of stood out
> in contrast to each other so I thought I would propose that we formalize
> the two as separate but similar concepts.
>
> I would appreciate feedback and discussion on how we can make these useful
> for future Oak versions.  Thanks!
>
>
> -MR
>
>
> [0] - https://wiki.apache.org/jackrabbit/Overlay%20Blob%20Store
> [1] - https://wiki.apache.org/jackrabbit/Composite%20Blob%20Store
>

Re: Composite blob store and Overlay blob store proposals

Posted by Arek Kita <ki...@gmail.com>.
2017-07-26 17:38 GMT+02:00 Matt Ryan <os...@mvryan.org>:
> Hi Arek,
>
>
>> Regarding CompositeBlobStore -- what if the customer changes the storage
>> rules in the meantime (this also refers to the Curation section)? This
>> will result in the new layout for writes, but binaries won't be read
>> correctly, am I right?
>> I guess this could be resolved by nesting a CompositeBlobStore with the
>> old rules and a CompositeBlobStore with the new rules in an
>> OverlayBlobStore, but I see that "curation" is for now deferred to
>> another topic. Still, I think this is quite important to think about a
>> little upfront, at least for this blob store type.
>
>
> Agreed.  I’m just not sure how to go about it yet, open to ideas here. :)
>
> Perhaps a better approach would be to have only one type.  Use the overlay
> approach in the Composite blob store, meaning reads will be tried against
> all delegates, but if rules are supplied we use those for priority before
> checking any other stores.  In that case, when a configuration change
> happens the blob could still be found, but it would be suboptimal because
> it isn’t in the expected location.  The blob store could at that point
> start an async background job to move it to the correct location if needed.
>
> WDYT?
>

+1

Not focusing yet on the solution or the implementation under the hood but
on the problem statement, I think both are very similar to each other.
Composite means, for me, more than one. The current meaning of Composite
is a RoutingDataStore (sorry for the name), whilst the OverlayDataStore
is an OrderedDataStore to me.

Both represent the typical composite pattern [2], where the composite
object implements the same interface as the leaf, plus something that
allows composing leaves (not digging into whether this will be an ordered
list or a map with a DS per criterion, etc.).
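
In code, the shape I have in mind is roughly the following (illustrative
names only, not the Oak API):

    import java.io.InputStream;
    import java.util.List;

    interface SimpleBlobStore {
        InputStream read(String blobId);       // null if not present
    }

    // The composite implements the same interface as a leaf store and
    // simply holds other stores plus a policy for picking among them
    // (here: first match in order).
    class CompositeSimpleBlobStore implements SimpleBlobStore {
        private final List<SimpleBlobStore> children;

        CompositeSimpleBlobStore(List<SimpleBlobStore> children) {
            this.children = children;
        }

        @Override
        public InputStream read(String blobId) {
            for (SimpleBlobStore child : children) {
                InputStream in = child.read(blobId);
                if (in != null) {
                    return in;
                }
            }
            return null;
        }
    }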

To me it looks like the solution (implementation) will be very similar
for both the RoutingDataStore and the OrderedDataStore (sorry for the
names again). In fact, in each case we are routing reads and writes, and
for both we need a "migration" process to migrate binaries (whether
online or offline), just as we have an async indexing process.

Integrating them makes sense from the caller's perspective as it will
just be simpler -- one composite data store which is still a DataStore.
The two differ only in the rules defined at configuration time, so I
think this leaves us open to adding more rules in the future, etc.,
without creating another DataStore type and then migrating between them,
supplying configurations, and so on. I'm thinking here even of Oak users
asking: how do I decide which one to use? Do I need one at all? With a
single composite you don't need to decide; you just specify the rules you
need and that's all. It looks like a slightly simpler solution.

What is more important, it seems that one helps the other to reach a
desired state (obviously with a temporary lookup overhead, but without
downtime).
I imagine that changing the rules, no matter whether you do it
dynamically or while restarting the repository, and whether the migration
to the new desired state is done automatically or in offline mode via an
additional tool, requires accepting a temporary overhead, as for a large
binary repository you won't be able to reach the new state
immediately... unless you already have the layout of Data Stores framed,
which to me looks like an edge case.

Maybe in fact only routing rules (and thus migration rules) under the
hood should be pluggable.

I'm in favour of a solution where a configuration change allows migrating
automatically to a new desired layout of binaries, no matter whether they
are divided into fast|slow|slower, important|archival|duplicated, etc.,
and where, in front of the CompositeDataStore, the whole system keeps
working (for some binaries maybe a bit slower where there is a MISS).

I guess for the current CompositeDataStore there could be a fallback
strategy like strict=true|false, controlling whether only the defined
rules should be taken into account for the edge cases. But then what do
we do when the rules aren't changed but properties in the JCR repo are
changed outside our control (moving the binary synchronously with the
repo write)? I prefer, again, async migration under the hood, as after
the write you might have an immediate read, and with the fallback the
binary will be retrieved correctly anyway, no matter whether it has been
copied yet or not.

Please correct me if I went much further than I should :)

Thanks,
Arek


[2] https://en.wikipedia.org/wiki/Composite_pattern

Re: Composite blob store and Overlay blob store proposals

Posted by Matt Ryan <os...@mvryan.org>.
Hi Arek,


Regarding CompositeBlobStore -- what if the customer changes the storage
rules in the meantime (this also refers to the Curation section)? This
will result in the new layout for writes, but binaries won't be read
correctly, am I right?
I guess this could be resolved by nesting a CompositeBlobStore with the
old rules and a CompositeBlobStore with the new rules in an
OverlayBlobStore, but I see that "curation" is for now deferred to another
topic. Still, I think this is quite important to think about a little
upfront, at least for this blob store type.


Agreed.  I’m just not sure how to go about it yet, open to ideas here. :)

Perhaps a better approach would be to have only one type.  Use the overlay
approach in the Composite blob store, meaning reads will be tried against
all delegates, but if rules are supplied we use those for priority before
checking any other stores.  In that case, when a configuration change
happens the blob could still be found, but it would be suboptimal because
it isn’t in the expected location.  The blob store could at that point
start an async background job to move it to the correct location if needed.
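
Very roughly, the background move could look something like this
(hypothetical sketch only; none of these types exist in Oak today):

    import java.io.InputStream;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    interface BlobStoreDelegate {
        InputStream readBlob(String blobId) throws Exception;
        void writeBlob(String blobId, InputStream data) throws Exception;
        void deleteBlob(String blobId) throws Exception;
    }

    class BlobRelocator {
        private final ExecutorService executor =
                Executors.newSingleThreadExecutor();

        void relocateAsync(String blobId,
                           BlobStoreDelegate from,
                           BlobStoreDelegate to) {
            executor.submit(() -> {
                try (InputStream in = from.readBlob(blobId)) {
                    if (in != null) {
                        to.writeBlob(blobId, in);   // copy to the correct delegate
                        from.deleteBlob(blobId);    // then drop the stale copy
                    }
                } catch (Exception e) {
                    // Log and retry later; relocation is best-effort and
                    // must not affect the read that triggered it.
                }
            });
        }
    }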

WDYT?

-MR

Fwd: Composite blob store and Overlay blob store proposals

Posted by Arek Kita <ki...@gmail.com>.
Hi,

2017-07-26 17:33 GMT+02:00 Matt Ryan <os...@mvryan.org>:
>
>> If you could configure the priority for reads in the following order:
>> {FBS, S3DS}, but for writes in the inverse order {S3DS, FBS}, then this
>> would almost satisfy any DataStore migration scenario. The exception is
>> the need to transfer the content already stored in the initial blob
>> store, but that can be done in the background or by any other mechanism
>> (an offline copy process).
>
> I think that could be satisfied by reconfiguring the FBS as a read-only
> store; from that point forward reads could still be satisfied from the FBS
> but writes would all end up going to S3DS instead.  Once all the blobs were
> transferred in the background, the FBS could be shut down.  Do you think
> that would work?

Right, I forgot about read-only blob stores. Indeed, writes should be
performed without any overhead, as the check is constant and global, so
no additional lookup is needed, exactly as with priorities.

Thanks for the suggestion!
Arek

Re: Composite blob store and Overlay blob store proposals

Posted by Matt Ryan <os...@mvryan.org>.
Hi Arek,



In the wiki there is, IMHO, no precise statement about writes:

> The overlay blob store fulfills write requests by attempting to write to
each delegate in priority order. Once a write is successfully satisfied by
a delegate, the result of the delegate write is returned as the result of
the overlay blob store write and no subsequent writes are attempted for
that request.

Does this mean that only one write to one delegate blob store must
succeed (in order of priority) before returning a result for the primary
blob store (the Overlay in this case)?

Yes, that is the intent.  The purpose is to make it easy to understand - it
always writes to the first delegate that will allow the write.
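
As a rough illustration of that intent (invented names, not the wiki’s
API):

    import java.io.InputStream;
    import java.util.List;

    interface WritableDelegate {
        boolean acceptsWrites();                // e.g. false if read-only
        String writeBlob(InputStream data);     // returns the blob id
    }

    class OverlayWriter {
        private final List<WritableDelegate> delegatesInPriorityOrder;

        OverlayWriter(List<WritableDelegate> delegates) {
            this.delegatesInPriorityOrder = delegates;
        }

        String write(InputStream data) {
            for (WritableDelegate delegate : delegatesInPriorityOrder) {
                if (delegate.acceptsWrites()) {
                    // The first delegate that accepts the write wins; its
                    // result is the result of the overlay write.
                    return delegate.writeBlob(data);
                }
            }
            throw new IllegalStateException("No delegate accepted the write");
        }
    }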



Imagine the following migration case:

FileBlobStore ---> S3DataStore


If you could configure the priority for reads in the following order:
{FBS, S3DS}, but for writes in the inverse order {S3DS, FBS}, then this
would almost satisfy any DataStore migration scenario. The exception is
the need to transfer the content already stored in the initial blob
store, but that can be done in the background or by any other mechanism
(an offline copy process).

I think that could be satisfied by reconfiguring the FBS as a read-only
store; from that point forward reads could still be satisfied from the FBS
but writes would all end up going to S3DS instead.  Once all the blobs were
transferred in the background, the FBS could be shut down.  Do you think
that would work?
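
Purely as an illustration of that wiring (none of these types are real
Oak classes or configuration):

    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.List;

    class MigrationWiringExample {
        static class Delegate {
            final String name;
            final boolean readOnly;
            Delegate(String name, boolean readOnly) {
                this.name = name;
                this.readOnly = readOnly;
            }
            public String toString() { return name; }
        }

        public static void main(String[] args) {
            Delegate fds = new Delegate("FileDataStore", true);  // keeps serving reads
            Delegate s3 = new Delegate("S3DataStore", false);    // all new writes land here

            List<Delegate> readOrder = Arrays.asList(fds, s3);   // reads: FDS first, then S3
            List<Delegate> writable = new ArrayList<>();
            for (Delegate d : readOrder) {
                if (!d.readOnly) {
                    writable.add(d);                             // writes: only S3 qualifies
                }
            }
            System.out.println("Reads try: " + readOrder);
            System.out.println("Writes go to: " + writable);
            // Once a background copy has drained the FileDataStore, it can
            // be removed from the configuration entirely.
        }
    }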


-MR

Re: Composite blob store and Overlay blob store proposals

Posted by Arek Kita <ki...@gmail.com>.
2017-07-26 11:02 GMT+02:00 Arek Kita <ki...@gmail.com>:
> S3DataStore ---> AmazonDataStore

oops, should be: S3DataStore ---> AzureDataStore

Re: Composite blob store and Overlay blob store proposals

Posted by Arek Kita <ki...@gmail.com>.
Hi Matt,

In the wiki there is, IMHO, no precise statement about writes:

> The overlay blob store fulfills write requests by attempting to write to each delegate in priority order. Once a write is successfully satisfied by a delegate, the result of the delegate write is returned as the result of the overlay blob store write and no subsequent writes are attempted for that request.

Does this mean that only one write to one delegate blob store must
succeed (in order of priority) before returning a result for the primary
blob store (the Overlay in this case)?

Besides this, I imagine such an Overlay blob store might be helpful during
zero-downtime migrations. Imagine the following migration case:

FileBlobStore ---> S3DataStore

or

S3DataStore ---> AmazonDataStore


If you could configure the priority for reads in the following order:
{FBS, S3DS}, but for writes in the inverse order {S3DS, FBS}, then this
would almost satisfy any DataStore migration scenario. The exception is
the need to transfer the content already stored in the initial blob
store, but that can be done in the background or by any other mechanism
(an offline copy process).


Regarding CompositeBlobStore -- what if the customer changes the storage
rules in the meantime (this also refers to the Curation section)? This
will result in the new layout for writes, but binaries won't be read
correctly, am I right?
I guess this could be resolved by nesting a CompositeBlobStore with the
old rules and a CompositeBlobStore with the new rules in an
OverlayBlobStore, but I see that "curation" is for now deferred to another
topic. Still, I think this is quite important to think about a little
upfront, at least for this blob store type.

Thanks,
Arek


2017-07-26 2:20 GMT+02:00 Matt Ryan <os...@mvryan.org>:
> There are two concepts.  One I’m currently calling the Overlay blob store
> (we haven’t voted on this name yet).  In this case, delegates are
> configured with a preferred order of lookup.  When a read is issued, the
> overlay blob store will attempt to satisfy the read by going through the
> delegates in order until one can satisfy the read.  [0]