Posted to oak-dev@jackrabbit.apache.org by Chetan Mehrotra <ch...@gmail.com> on 2013/10/30 07:50:57 UTC

Strategies around storing blobs in Mongo

Hi,

Currently we are storing blobs by breaking them into small chunks and
then storing those chunks in MongoDB as part of the blobs collection.
This approach would cause issues, as Mongo maintains a global exclusive
write lock at the database level [1]. So even writing multiple small
chunks of, say, 2 MB each would lead to write lock contention.
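
For illustration, here is a minimal sketch of the chunked write pattern
with the MongoDB Java driver (the collection layout and field names are
hypothetical, not Oak's actual schema):

    import com.mongodb.BasicDBObject;
    import com.mongodb.DBCollection;
    import com.mongodb.MongoClient;

    import java.io.InputStream;

    public class ChunkedBlobWriter {

        private static final int CHUNK_SIZE = 2 * 1024 * 1024; // 2 MB, as discussed

        public static void store(MongoClient mongo, String blobId, InputStream in)
                throws Exception {
            DBCollection blobs = mongo.getDB("oak").getCollection("blobs");
            byte[] buffer = new byte[CHUNK_SIZE];
            int len;
            int seq = 0;
            while ((len = in.read(buffer)) > 0) {
                byte[] data = new byte[len];
                System.arraycopy(buffer, 0, data, 0, len);
                // every chunk insert acquires the per-database write lock
                blobs.insert(new BasicDBObject("_id", blobId + "#" + seq++)
                        .append("data", data));
            }
        }
    }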

Mongo also provides GridFS [2]. However, it uses a strategy similar to
the one we currently use, and the support is built into the driver; to
the server the chunks are just ordinary collection entries.
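
For reference, a rough sketch of what using GridFS through the Java
driver looks like (the database and bucket names are hypothetical):

    import com.mongodb.DB;
    import com.mongodb.MongoClient;
    import com.mongodb.gridfs.GridFS;
    import com.mongodb.gridfs.GridFSDBFile;
    import com.mongodb.gridfs.GridFSInputFile;

    import java.io.FileInputStream;
    import java.io.FileOutputStream;

    public class GridFSExample {
        public static void main(String[] args) throws Exception {
            MongoClient mongo = new MongoClient("localhost", 27017);
            DB db = mongo.getDB("oak");
            // backed by the collections blobs.files and blobs.chunks
            GridFS gridFs = new GridFS(db, "blobs");

            // the driver itself splits the stream into chunk documents
            GridFSInputFile file =
                    gridFs.createFile(new FileInputStream("asset.bin"), "asset.bin");
            file.save();

            GridFSDBFile stored = gridFs.findOne("asset.bin");
            stored.writeTo(new FileOutputStream("asset-copy.bin"));
            mongo.close();
        }
    }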

So, to minimize write lock contention for use cases where big assets
are stored in Oak, we can opt for the following strategies:

1. Store the blobs collection in a different database. As Mongo write
locks [1] are taken at the database level, storing the blobs in a
different database would allow the read/write of node data (the
majority use case) to continue. (A sketch follows this list.)

2. For a more asset/binary heavy use case, use a separate database
server itself to serve the binaries.

3. Bring back the JR2 DataStore implementation and save only the
metadata related to binaries in Mongo. We already have an S3 based
implementation there, and it would continue to work with Oak as well.
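
For option 1, a minimal sketch of pointing the blobs collection at a
second database on the same server (the database names are made up):

    import com.mongodb.DBCollection;
    import com.mongodb.MongoClient;

    public class SplitDatabases {
        public static void main(String[] args) throws Exception {
            MongoClient mongo = new MongoClient("localhost", 27017);
            // node documents and blob chunks live in separate databases, so a
            // blob write locks only "oak-blobs" while node traffic on "oak"
            // can proceed
            DBCollection nodes = mongo.getDB("oak").getCollection("nodes");
            DBCollection blobs = mongo.getDB("oak-blobs").getCollection("blobs");
            System.out.println(nodes.getFullName() + " / " + blobs.getFullName());
            mongo.close();
        }
    }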

Chetan Mehrotra
[1] http://docs.mongodb.org/manual/faq/concurrency/#how-granular-are-locks-in-mongodb
[2] http://docs.mongodb.org/manual/core/gridfs/

Re: Strategies around storing blobs in Mongo

Posted by Amit Jain <am...@ieee.org>.
>> So even adding a 2 MB chunk on a sharded system over a remote
>> connection would block reads for that complete duration. So at a
>> minimum we should avoid that.

I guess if there are read replicas in the shard's replica set, then it
will mitigate the effect to some extent.
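
As a sketch, a client could steer blob reads to secondaries with the
Java driver like this (whether Oak can tolerate secondary reads depends
on its consistency requirements, so treat this as an assumption):

    import com.mongodb.DBCollection;
    import com.mongodb.MongoClient;
    import com.mongodb.ReadPreference;

    public class SecondaryReads {
        public static void main(String[] args) throws Exception {
            MongoClient mongo = new MongoClient("localhost", 27017);
            DBCollection blobs = mongo.getDB("oak").getCollection("blobs");
            // blob chunks are immutable once written, so reading them from a
            // secondary avoids the primary that is holding the write lock
            blobs.setReadPreference(ReadPreference.secondaryPreferred());
        }
    }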



On Wed, Oct 30, 2013 at 3:04 PM, Chetan Mehrotra
<ch...@gmail.com> wrote:

> > sounds reasonable. what is the impact of such a design when it comes
> > to map-reduce features? I was thinking that we could use it e.g. for
> > garbage collection, but I don't know if this is still an option when data
> > is spread across multiple databases.
>
> Will investigate that aspect further.
>
> > connecting to a second server would add quite some complexity to
> Yup. The option was just provided for completeness' sake. Something
> like this would probably never be required.
>
> > that was one of my initial thoughts as well, but I was wondering what
> > the impact of such a deployment is on data store garbage collection.
>
> Probably we can create a shadow node for each binary in the blobs
> collection and keep the binary content within the DataStore itself.
> Garbage collection would then be performed on the shadow nodes, and
> the logic would use those results to perform the actual deletions.
>
>
> Chetan Mehrotra
>
>
> On Wed, Oct 30, 2013 at 1:13 PM, Marcel Reutegger <mr...@adobe.com>
> wrote:
> > Hi,
> >
> >> Currently we are storing blobs by breaking them into small chunks and
> >> then storing those chunks in MongoDB as part of the blobs collection.
> >> This approach would cause issues, as Mongo maintains a global exclusive
> >> write lock at the database level [1]. So even writing multiple small
> >> chunks of, say, 2 MB each would lead to write lock contention.
> >
> > so far we observed high lock contention primarily when there are a lot of
> > updates. inserts were not that big of a problem, because you can batch
> > them. it would probably be good to have a test to see how big the
> > impact is when blobs come into play.
> >
> >> Mongo also provides GridFS [2]. However, it uses a strategy similar to
> >> the one we currently use, and the support is built into the driver; to
> >> the server the chunks are just ordinary collection entries.
> >>
> >> So, to minimize write lock contention for use cases where big assets
> >> are stored in Oak, we can opt for the following strategies:
> >>
> >> 1. Store the blobs collection in a different database. As Mongo write
> >> locks [1] are taken at the database level, storing the blobs in a
> >> different database would allow the read/write of node data (the
> >> majority use case) to continue.
> >
> > sounds reasonable. what is the impact of such a design when it comes
> > to map-reduce features? I was thinking that we could use it e.g. for
> > garbage collection, but I don't know if this is still an option when data
> > is spread across multiple databases.
> >
> >> 2. For a more asset/binary heavy use case, use a separate database
> >> server itself to serve the binaries.
> >
> > connecting to a second server would add quite some complexity to
> > the system. wouldn't it be easier to just leverage standard mongodb
> > sharding to distribute the load?
> >
> >> 3. Bring back the JR2 DataStore implementation and save only the
> >> metadata related to binaries in Mongo. We already have an S3 based
> >> implementation there, and it would continue to work with Oak as well.
> >
> > that was one of my initial thoughts as well, but I was wondering what
> > the impact of such a deployment is on data store garbage collection.
> >
> > regards
> >  marcel
> >
> >> Chetan Mehrotra
> >> [1] http://docs.mongodb.org/manual/faq/concurrency/#how-granular-are-locks-in-mongodb
> >> [2] http://docs.mongodb.org/manual/core/gridfs/
>

Re: Strategies around storing blobs in Mongo

Posted by Chetan Mehrotra <ch...@gmail.com>.
> sounds reasonable. what is the impact of such a design when it comes
> to map-reduce features? I was thinking that we could use it e.g. for
> garbage collection, but I don't know if this is still an option when data
> is spread across multiple databases.

Will investigate that aspect further.

> connecting to a second server would add quite some complexity to
Yup. The option was just provided for completeness' sake. Something
like this would probably never be required.

> that was one of my initial thoughts as well, but I was wondering what
> the impact of such a deployment is on data store garbage collection.

Probably we can create a shadow node for each binary in the blobs
collection and keep the binary content within the DataStore itself.
Garbage collection would then be performed on the shadow nodes, and
the logic would use those results to perform the actual deletions.
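
A sketch of what such a shadow document might look like (the field
names are hypothetical; the binary itself would stay in the DataStore,
e.g. on the filesystem or in S3):

    import com.mongodb.BasicDBObject;
    import com.mongodb.DBCollection;
    import com.mongodb.MongoClient;

    public class ShadowNodeExample {
        public static void main(String[] args) throws Exception {
            MongoClient mongo = new MongoClient("localhost", 27017);
            DBCollection blobs = mongo.getDB("oak").getCollection("blobs");

            // metadata-only "shadow" entry; the content lives in the DataStore
            blobs.insert(new BasicDBObject("_id", "sha1-0a1b2c3d")
                    .append("length", 52428800L)
                    .append("lastAccessed", System.currentTimeMillis()));

            // a GC pass would iterate these shadow entries, check whether the
            // binary is still referenced, and delete the DataStore record for
            // the unreferenced ones
            mongo.close();
        }
    }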


Chetan Mehrotra


On Wed, Oct 30, 2013 at 1:13 PM, Marcel Reutegger <mr...@adobe.com> wrote:
> Hi,
>
>> Currently we are storing blobs by breaking them into small chunks and
>> then storing those chunks in MongoDB as part of the blobs collection.
>> This approach would cause issues, as Mongo maintains a global exclusive
>> write lock at the database level [1]. So even writing multiple small
>> chunks of, say, 2 MB each would lead to write lock contention.
>
> so far we observed high lock contention primarily when there are a lot of
> updates. inserts were not that big of a problem, because you can batch
> them. it would probably be good to have a test to see how big the
> impact is when blobs come into play.
>
>> Mongo also provides GridFS [2]. However, it uses a strategy similar to
>> the one we currently use, and the support is built into the driver; to
>> the server the chunks are just ordinary collection entries.
>>
>> So, to minimize write lock contention for use cases where big assets
>> are stored in Oak, we can opt for the following strategies:
>>
>> 1. Store the blobs collection in a different database. As Mongo write
>> locks [1] are taken at the database level, storing the blobs in a
>> different database would allow the read/write of node data (the
>> majority use case) to continue.
>
> sounds reasonable. what is the impact of such a design when it comes
> to map-reduce features? I was thinking that we could use it e.g. for
> garbage collection, but I don't know if this is still an option when data
> is spread across multiple databases.
>
>> 2. For a more asset/binary heavy use case, use a separate database
>> server itself to serve the binaries.
>
> connecting to a second server would add quite some complexity to
> the system. wouldn't it be easier to just leverage standard mongodb
> sharding to distribute the load?
>
>> 3. Bring back the JR2 DataStore implementation and save only the
>> metadata related to binaries in Mongo. We already have an S3 based
>> implementation there, and it would continue to work with Oak as well.
>
> that was one of my initial thoughts as well, but I was wondering what
> the impact of such a deployment is on data store garbage collection.
>
> regards
>  marcel
>
>> Chetan Mehrotra
>> [1] http://docs.mongodb.org/manual/faq/concurrency/#how-granular-are-locks-in-mongodb
>> [2] http://docs.mongodb.org/manual/core/gridfs/

RE: Strategies around storing blobs in Mongo

Posted by Marcel Reutegger <mr...@adobe.com>.
Hi,

> Currently we are storing blobs by breaking them into small chunks and
> then storing those chunks in MongoDB as part of the blobs collection.
> This approach would cause issues, as Mongo maintains a global exclusive
> write lock at the database level [1]. So even writing multiple small
> chunks of, say, 2 MB each would lead to write lock contention.

so far we observed high lock contention primarily when there are a lot of
updates. inserts were not that big of a problem, because you can batch
them. it would probably be good to have a test to see how big the
impact is when blobs come into play.
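
For reference, batching with the Java driver sends a whole set of
documents in a single insert call (a sketch; the names and the batch
size are arbitrary):

    import com.mongodb.BasicDBObject;
    import com.mongodb.DBCollection;
    import com.mongodb.DBObject;
    import com.mongodb.MongoClient;

    import java.util.ArrayList;
    import java.util.List;

    public class BatchedInserts {
        public static void main(String[] args) throws Exception {
            MongoClient mongo = new MongoClient("localhost", 27017);
            DBCollection nodes = mongo.getDB("oak").getCollection("nodes");

            List<DBObject> batch = new ArrayList<DBObject>();
            for (int i = 0; i < 1000; i++) {
                batch.add(new BasicDBObject("_id", "node-" + i));
            }
            // one round trip for the whole batch instead of 1000 single inserts
            nodes.insert(batch);
            mongo.close();
        }
    }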

> Mongo also provides GridFS [2]. However, it uses a strategy similar to
> the one we currently use, and the support is built into the driver; to
> the server the chunks are just ordinary collection entries.
> 
> So, to minimize write lock contention for use cases where big assets
> are stored in Oak, we can opt for the following strategies:
> 
> 1. Store the blobs collection in a different database. As Mongo write
> locks [1] are taken at the database level, storing the blobs in a
> different database would allow the read/write of node data (the
> majority use case) to continue.

sounds reasonable. what is the impact of such a design when it comes
to map-reduce features? I was thinking that we could use it e.g. for
garbage collection, but I don't know if this is still an option when data
is spread across multiple databases.

> 2. For a more asset/binary heavy use case, use a separate database
> server itself to serve the binaries.

connecting to a second server would add quite some complexity to
the system. wouldn't it be easier to just leverage standard mongodb
sharding to distribute the load?
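
For illustration, sharding the blobs collection could be set up with the
admin commands below (a sketch run against a mongos router; the hashed
shard key is an assumption, not a recommendation, and needs MongoDB 2.4+):

    import com.mongodb.BasicDBObject;
    import com.mongodb.DB;
    import com.mongodb.MongoClient;

    public class ShardBlobs {
        public static void main(String[] args) throws Exception {
            // connect to a mongos router, not to a shard directly
            MongoClient mongo = new MongoClient("mongos-host", 27017);
            DB admin = mongo.getDB("admin");

            admin.command(new BasicDBObject("enableSharding", "oak"));
            // a hashed _id spreads the chunk documents evenly across shards
            admin.command(new BasicDBObject("shardCollection", "oak.blobs")
                    .append("key", new BasicDBObject("_id", "hashed")));
            mongo.close();
        }
    }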

> 3. Bring back the JR2 DataStore implementation and save only the
> metadata related to binaries in Mongo. We already have an S3 based
> implementation there, and it would continue to work with Oak as well.

that was one of my initial thoughts as well, but I was wondering what
the impact of such a deployment is on data store garbage collection.

regards
 marcel

> Chetan Mehrotra
> [1] http://docs.mongodb.org/manual/faq/concurrency/#how-granular-are-locks-in-mongodb
> [2] http://docs.mongodb.org/manual/core/gridfs/

Re: Strategies around storing blobs in Mongo

Posted by Chetan Mehrotra <ch...@gmail.com>.
To close this thread:

On Wed, Oct 30, 2013 at 7:52 PM, Jukka Zitting <ju...@gmail.com> wrote:
> So AFAICT the worry about a write blocking all concurrent reads is
> unfounded unless it shows up in a benchmark.

I tried to measure the effect of such a scenario in OAK-1153 [1], and
from the results obtained there does not appear to be much difference
whether the collections are kept in the same database or in different
ones. So for now this is nothing to worry about.

On a side note, the number of reads and writes performed drops
considerably when accessing a remote Mongo server. See the results at
[1] for more details.

Chetan Mehrotra
[1] https://issues.apache.org/jira/browse/OAK-1153

Re: Strategies around storing blobs in Mongo

Posted by Jukka Zitting <ju...@gmail.com>.
Hi,

On Wed, Oct 30, 2013 at 2:50 AM, Chetan Mehrotra
<ch...@gmail.com> wrote:
> Currently we are storing blobs by breaking them into small chunks and
> then storing those chunks in MongoDB as part of the blobs collection.
> This approach would cause issues, as Mongo maintains a global exclusive
> write lock at the database level [1]. So even writing multiple small
> chunks of, say, 2 MB each would lead to write lock contention.

Note that the underlying disk in any case forces the serialization of
all writes on a single shard, so I wouldn't be too worried about this
as MongoDB can still allow concurrent read access to cached content
(see http://docs.mongodb.org/manual/faq/concurrency/#does-a-read-or-write-operation-ever-yield-the-lock).
So AFAICT the worry about a write blocking all concurrent reads is
unfounded unless it shows up in a benchmark.

BR,

Jukka Zitting

Re: Strategies around storing blobs in Mongo

Posted by Michael Marth <mm...@adobe.com>.
Hi Chetan,

> 
> 3. Bring back the JR2 DataStore implementation and save only the
> metadata related to binaries in Mongo. We already have an S3 based
> implementation there, and it would continue to work with Oak as well.
> 

I think we will need the data store impl for Oak in any case (regardless of the outcome of this discussion) in order to enable the migration of large repos from JR2 where the data store cannot be moved. That would include the filesystem based DS and the S3 DS.

When you write

> Mongo also provides GridFS [2]. However, it uses a strategy similar to
> the one we currently use, and the support is built into the driver; to
> the server the chunks are just ordinary collection entries.

do you imply that you consider a GridFS-backed DS implementation not doable or not ideal? I am referring to the "However" :)

Michael

Re: Strategies around storing blobs in Mongo

Posted by Chetan Mehrotra <ch...@gmail.com>.
>  Open questions are, what is the write throughput for one
> shard, does the write lock also block reads (I guess not), does the write

As Ian mentioned above, write locks block all reads. So even adding a 2
MB chunk on a sharded system over a remote connection would block reads
for that complete duration. So at a minimum we should avoid that.

Chetan Mehrotra


On Wed, Oct 30, 2013 at 2:40 PM, Ian Boston <ie...@tfd.co.uk> wrote:
> On 30 October 2013 07:55, Thomas Mueller <mu...@adobe.com> wrote:
>> Hi,
>>
>>> as Mongo maintains a global exclusive write lock at the database level
>>
>> I think this is not necessarily a huge problem. As far as I understand, it
>> limits write concurrency within one shard only, so it does not block
>> scalability. Open questions are, what is the write throughput for one
>> shard, does the write lock also block reads (I guess not), does the write
>> lock cause high latency for other writes because binaries are big.
>>
>
>
> This information would be extremely useful for all those looking to
> Oak to address use cases where the repository access is between 20 and
> 60% write.
>
> To answer one of your questions: according to [1], write locks do block
> reads within the scope of the lock.
>
> Other information from [1]:
> Write locks are exclusive and global.
> Write locks block read locks from being established.
> (And obviously read locks block write locks from being established.)
> Read locks are concurrent and shared.
> Pre 2.2, a write lock was scoped to the mongod process.
> Post 2.2, a write lock is scoped to the database within the mongod process.
> All locks are scoped to a shard.
> IIUC, the lock behaviour is identical to that in JR2 except for the scope.
>
> Ian
>
> 1 http://docs.mongodb.org/manual/faq/concurrency/#how-granular-are-locks-in-mongodb
>
>
>
>>
>> I think it would make sense to have a simple benchmark (concurrent writing
>> / reading of binaries), so that we can test which strategy is best, and
>> possibly play around with different strategies (split binaries into
>> smaller / larger chunks, use different write concerns, use more
>> shards,...).
>>
>> Regards,
>> Thomas
>>
>>
>>
>> On 10/30/13 7:50 AM, "Chetan Mehrotra" <ch...@gmail.com> wrote:
>>
>>>Hi,
>>>
>>>Currently we are storing blobs by breaking them into small chunks and
>>>then storing those chunks in MongoDB as part of the blobs collection.
>>>This approach would cause issues, as Mongo maintains a global exclusive
>>>write lock at the database level [1]. So even writing multiple small
>>>chunks of, say, 2 MB each would lead to write lock contention.
>>>
>>>Mongo also provides GridFS [2]. However, it uses a strategy similar to
>>>the one we currently use, and the support is built into the driver; to
>>>the server the chunks are just ordinary collection entries.
>>>
>>>So, to minimize write lock contention for use cases where big assets
>>>are stored in Oak, we can opt for the following strategies:
>>>
>>>1. Store the blobs collection in a different database. As Mongo write
>>>locks [1] are taken at the database level, storing the blobs in a
>>>different database would allow the read/write of node data (the
>>>majority use case) to continue.
>>>
>>>2. For a more asset/binary heavy use case, use a separate database
>>>server itself to serve the binaries.
>>>
>>>3. Bring back the JR2 DataStore implementation and save only the
>>>metadata related to binaries in Mongo. We already have an S3 based
>>>implementation there, and it would continue to work with Oak as well.
>>>
>>>Chetan Mehrotra
>>>[1] http://docs.mongodb.org/manual/faq/concurrency/#how-granular-are-locks-in-mongodb
>>>[2] http://docs.mongodb.org/manual/core/gridfs/
>>

Re: Strategies around storing blobs in Mongo

Posted by Ian Boston <ie...@tfd.co.uk>.
On 30 October 2013 07:55, Thomas Mueller <mu...@adobe.com> wrote:
> Hi,
>
>> as Mongo maintains a global exclusive write lock at the database level
>
> I think this is not necessarily a huge problem. As far as I understand, it
> limits write concurrency within one shard only, so it does not block
> scalability. Open questions are, what is the write throughput for one
> shard, does the write lock also block reads (I guess not), does the write
> lock cause high latency for other writes because binaries are big.
>


This information would be extremely useful for all those looking to
Oak to address use cases where the repository access is between 20 and
60% write.

To answer one of your questions: according to [1], write locks do block
reads within the scope of the lock.

Other information from [1]:
Write locks are exclusive and global.
Write locks block read locks from being established.
(And obviously read locks block write locks from being established.)
Read locks are concurrent and shared.
Pre 2.2, a write lock was scoped to the mongod process.
Post 2.2, a write lock is scoped to the database within the mongod process.
All locks are scoped to a shard.
IIUC, the lock behaviour is identical to that in JR2 except for the scope.
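
If useful, the lock activity can be observed directly; a sketch using
the serverStatus command (its "locks" section is reported per database
on 2.2+):

    import com.mongodb.BasicDBObject;
    import com.mongodb.CommandResult;
    import com.mongodb.MongoClient;

    public class LockStats {
        public static void main(String[] args) throws Exception {
            MongoClient mongo = new MongoClient("localhost", 27017);
            // serverStatus reports time spent holding and waiting for the
            // read/write locks, broken down per database since MongoDB 2.2
            CommandResult status = mongo.getDB("admin")
                    .command(new BasicDBObject("serverStatus", 1));
            System.out.println(status.get("locks"));
            mongo.close();
        }
    }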

Ian

1 http://docs.mongodb.org/manual/faq/concurrency/#how-granular-are-locks-in-mongodb



>
> I think it would make sense to have a simple benchmark (concurrent writing
> / reading of binaries), so that we can test which strategy is best, and
> possibly play around with different strategies (split binaries into
> smaller / larger chunks, use different write concerns, use more
> shards,...).
>
> Regards,
> Thomas
>
>
>
> On 10/30/13 7:50 AM, "Chetan Mehrotra" <ch...@gmail.com> wrote:
>
>>Hi,
>>
>>Currently we are storing blobs by breaking them into small chunks and
>>then storing those chunks in MongoDB as part of the blobs collection.
>>This approach would cause issues, as Mongo maintains a global exclusive
>>write lock at the database level [1]. So even writing multiple small
>>chunks of, say, 2 MB each would lead to write lock contention.
>>
>>Mongo also provides GridFS [2]. However, it uses a strategy similar to
>>the one we currently use, and the support is built into the driver; to
>>the server the chunks are just ordinary collection entries.
>>
>>So, to minimize write lock contention for use cases where big assets
>>are stored in Oak, we can opt for the following strategies:
>>
>>1. Store the blobs collection in a different database. As Mongo write
>>locks [1] are taken at the database level, storing the blobs in a
>>different database would allow the read/write of node data (the
>>majority use case) to continue.
>>
>>2. For a more asset/binary heavy use case, use a separate database
>>server itself to serve the binaries.
>>
>>3. Bring back the JR2 DataStore implementation and save only the
>>metadata related to binaries in Mongo. We already have an S3 based
>>implementation there, and it would continue to work with Oak as well.
>>
>>Chetan Mehrotra
>>[1] http://docs.mongodb.org/manual/faq/concurrency/#how-granular-are-locks-in-mongodb
>>[2] http://docs.mongodb.org/manual/core/gridfs/
>

Re: Strategies around storing blobs in Mongo

Posted by Thomas Mueller <mu...@adobe.com>.
Hi,

> as Mongo maintains a global exclusive write lock at the database level

I think this is not necessarily a huge problem. As far as I understand, it
limits write concurrency within one shard only, so it does not block
scalability. Open questions are, what is the write throughput for one
shard, does the write lock also block reads (I guess not), does the write
lock cause high latency for other writes because binaries are big.


I think it would make sense to have a simple benchmark (concurrent writing
/ reading of binaries), so that we can test which strategy is best, and
possibly play around with different strategies (split binaries into
smaller / larger chunks, use different write concerns, use more
shards,...).
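
A skeleton of such a benchmark could look like the sketch below (sizes,
thread counts, and names are placeholders, not a proposal for the
actual test):

    import com.mongodb.BasicDBObject;
    import com.mongodb.DBCollection;
    import com.mongodb.MongoClient;

    import java.util.Random;
    import java.util.concurrent.atomic.AtomicLong;

    public class BlobBenchmark {
        public static void main(String[] args) throws Exception {
            final MongoClient mongo = new MongoClient("localhost", 27017);
            final DBCollection blobs = mongo.getDB("oak").getCollection("blobs");
            final byte[] chunk = new byte[2 * 1024 * 1024]; // one 2 MB chunk
            new Random().nextBytes(chunk);
            final AtomicLong writes = new AtomicLong();
            final AtomicLong reads = new AtomicLong();

            // one writer inserting chunks while one reader fetches them back
            Thread writer = new Thread(new Runnable() {
                public void run() {
                    while (!Thread.currentThread().isInterrupted()) {
                        blobs.insert(new BasicDBObject(
                                "_id", "b" + writes.incrementAndGet())
                                .append("data", chunk));
                    }
                }
            });
            Thread reader = new Thread(new Runnable() {
                public void run() {
                    while (!Thread.currentThread().isInterrupted()) {
                        blobs.findOne(new BasicDBObject("_id", "b1"));
                        reads.incrementAndGet();
                    }
                }
            });
            writer.start();
            reader.start();
            Thread.sleep(60 * 1000); // measure for one minute
            writer.interrupt();
            reader.interrupt();
            System.out.println("writes=" + writes.get() + " reads=" + reads.get());
            mongo.close();
        }
    }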

Regards,
Thomas



On 10/30/13 7:50 AM, "Chetan Mehrotra" <ch...@gmail.com> wrote:

>Hi,
>
>Currently we are storing blobs by breaking them into small chunks and
>then storing those chunks in MongoDB as part of the blobs collection.
>This approach would cause issues, as Mongo maintains a global exclusive
>write lock at the database level [1]. So even writing multiple small
>chunks of, say, 2 MB each would lead to write lock contention.
>
>Mongo also provides GridFS [2]. However, it uses a strategy similar to
>the one we currently use, and the support is built into the driver; to
>the server the chunks are just ordinary collection entries.
>
>So, to minimize write lock contention for use cases where big assets
>are stored in Oak, we can opt for the following strategies:
>
>1. Store the blobs collection in a different database. As Mongo write
>locks [1] are taken at the database level, storing the blobs in a
>different database would allow the read/write of node data (the
>majority use case) to continue.
>
>2. For a more asset/binary heavy use case, use a separate database
>server itself to serve the binaries.
>
>3. Bring back the JR2 DataStore implementation and save only the
>metadata related to binaries in Mongo. We already have an S3 based
>implementation there, and it would continue to work with Oak as well.
>
>Chetan Mehrotra
>[1] http://docs.mongodb.org/manual/faq/concurrency/#how-granular-are-locks-in-mongodb
>[2] http://docs.mongodb.org/manual/core/gridfs/