Posted to oak-dev@jackrabbit.apache.org by Amit Jain <am...@ieee.org> on 2017/08/01 04:09:09 UTC

Re: Composite blob store and Overlay blob store proposals

Hi Matt,

> Let’s consider AWS for an example.  Using this nomenclature, you’d probably
> say Glacier is “slow” and S3 is “fast”.  S3 IA is probably not any slower
> than S3, but you probably wouldn’t want to label it the same.  But it is
> certainly not “slow” if Glacier is also slow.  Likewise, if you also had a
> FileDataStore configured, it seems like S3 wouldn’t be “fast” anymore
> compared to FDS.
>
> I then thought of using terms like “hot” (S3), “cool” (S3 IA), and “cold”
> (Glacier), but I’m not sure what that tells me that I couldn’t know simply
> via priority.  And that doesn’t address the problem that FDS would also be
> “hot” but probably hotter than S3.  There could be a number of nuances in
> between as this grows.
>
> Storage class should be included if it can be, so long as it serves a
> purpose.  I’m not sure I’m seeing the purpose yet.
>
>
I added "storage class" separately from "priority" to highlight that some
DataStores need special handling for reads and writes. For example, a read
from Glacier may not be immediately available, and S3 IA advertises
availability of 99.9% rather than the 99.99% of S3 (1 in 1,000 read requests
can fail vs. 1 in 10,000). We would probably need some mechanism to signal
such conditions to callers, e.g. callbacks, exceptions, or logging.
I am not saying this particular thing should be implemented right away,
though, and we cannot cover every nuanced way of categorizing DataStores.

So, we would want to design the API to be generic and extensible/pluggable,
but prioritize what can be implemented for the 1.8 release.
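
A minimal sketch of how a storage class could sit next to priority and signal
callers when a read cannot be served right away. Every name here
(StorageClass, DelegateInfo, BlobNotImmediatelyAvailableException) is
hypothetical and not part of the existing Oak API; the read method returns a
placeholder string instead of real blob content.

```java
/** Hypothetical storage classes; the names are illustrative only. */
enum StorageClass {
    STANDARD(true),     // e.g. S3 or a FileDataStore
    INFREQUENT(true),   // e.g. S3 IA: readable right away, lower advertised availability
    ARCHIVE(false);     // e.g. Glacier: a read may not be immediately available

    private final boolean immediatelyReadable;

    StorageClass(boolean immediatelyReadable) {
        this.immediatelyReadable = immediatelyReadable;
    }

    boolean isImmediatelyReadable() {
        return immediatelyReadable;
    }
}

/** Hypothetical signal to callers that a blob exists but cannot be read yet. */
class BlobNotImmediatelyAvailableException extends Exception {
    BlobNotImmediatelyAvailableException(String blobId) {
        super("Blob " + blobId + " is not immediately available");
    }
}

/** Hypothetical delegate descriptor carrying both priority and storage class. */
class DelegateInfo {
    final String name;
    final int priority;
    final StorageClass storageClass;

    DelegateInfo(String name, int priority, StorageClass storageClass) {
        this.name = name;
        this.priority = priority;
        this.storageClass = storageClass;
    }

    /** Returns a placeholder payload, or throws if the storage class cannot serve reads directly. */
    String read(String blobId) throws BlobNotImmediatelyAvailableException {
        if (!storageClass.isImmediatelyReadable()) {
            throw new BlobNotImmediatelyAvailableException(blobId);
        }
        return name + ":" + blobId;
    }
}
```

An exception is only one of the options mentioned above; a callback or a log
entry could be hooked in at the same point.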

Thanks
Amit

Re: Composite blob store and Overlay blob store proposals

Posted by Matt Ryan <os...@mvryan.org>.

Hi Amit,

>> Storage class should be included if it can be, so long as it serves a
>> purpose. I’m not sure I’m seeing the purpose yet.
>
> I added "storage class" separately from "priority" to highlight that some
> DataStores need special handling for reads and writes. For example, a read
> from Glacier may not be immediately available, and S3 IA advertises
> availability of 99.9% rather than the 99.99% of S3 (1 in 1,000 read requests
> can fail vs. 1 in 10,000). We would probably need some mechanism to signal
> such conditions to callers, e.g. callbacks, exceptions, or logging.
> I am not saying this particular thing should be implemented right away,
> though, and we cannot cover every nuanced way of categorizing DataStores.


Thanks for the clarification - I think I understand the goal now, and I
agree with you at least generally.  As I’ve thought about how I might
implement an algorithm to look through a number of delegates, it is
definitely the case that I would want to know the storage class so that I
can prioritize it appropriately compared to other delegates.
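
One possible way to order delegates for reads is sketched below, assuming
that storage class is compared first and priority only breaks ties within a
class. The types Tier, Delegate and DelegateOrdering are hypothetical, and
this ordering rule is an assumption for illustration, not something settled
in this thread.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical tiers, declared from most to least readily readable. */
enum Tier { HOT, COOL, COLD }

/** Hypothetical delegate descriptor. */
class Delegate {
    final String name;
    final int priority;   // lower value means it is tried earlier
    final Tier tier;

    Delegate(String name, int priority, Tier tier) {
        this.name = name;
        this.priority = priority;
        this.tier = tier;
    }
}

class DelegateOrdering {
    /** Returns a new list sorted by tier (enum declaration order), then by priority. */
    static List<Delegate> readOrder(List<Delegate> delegates) {
        List<Delegate> ordered = new ArrayList<>(delegates);
        ordered.sort(Comparator.comparing((Delegate d) -> d.tier)
                .thenComparingInt(d -> d.priority));
        return ordered;
    }
}
```

With this rule, a Glacier delegate is always consulted after S3 and S3 IA
delegates regardless of its configured priority, while an FDS and an S3
delegate in the same tier are ordered by priority alone.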

I’ll update the wiki.


-MR