Posted to dev@subversion.apache.org by Branko Cibej <br...@xbc.nu> on 2009/10/05 10:57:55 UTC

Any FSFS rep-sharing experts out there?

If so, please help me figure out how to figure out issue #3506.

-- Brane

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2403518

Re: Any FSFS rep-sharing experts out there?

Posted by Branko Cibej <br...@xbc.nu>.

Mark Phippard wrote:
> On Mon, Oct 5, 2009 at 9:18 AM, Bert Huijben <be...@qqmail.nl> wrote:
>
>   
>>> On Mon, Oct 5, 2009 at 6:57 AM, Branko Cibej <br...@xbc.nu> wrote:
>>>       
>>>> If so, please help me figure out how to figure out issue #3506.
>>>>         
>>> From my reading, SQLite locks the entire database file when it is
>>> writing.  So if the large commit is writing a lot of data to the file
>>> it could be that the lock window is high.  I imagine it is also
>>> possible for the size of the database in the ASF repository to make it
>>> take even longer?
>>>       
>> Normally SQLite doesn't block the entire database, but just the tables it will be writing to when committing the current transaction.
>>
>> The documentation also says something about: If the changes are too large to fit in the
>> memory cache, the lock is promoted to a full exclusive lock to allow spilling
>> intermediate results to the database file.
>>
>> This last thing might be the case here....?
>>     
>
> Some perhaps old stuff I read said the entire database is locked
> during writing.  In the case of rep-cache, isn't there only a single
> table anyway?  I have never looked but cannot imagine there are too
> many tables. So a table lock would essentially lock everything
> regardless.
>   

There's a single database with a single table, yes. According to Paul's
report, this only happens for large commits -- so it would appear that
we should either try to write the rep-sharing info in smaller chunks, or
-- more likely, given that it's probably due to an underlying buffer
flush in SQLite -- trap the SQLITE_BUSY errors (possibly in fs_open?).

Though as Bert says, that last could lead to strange delays in opening
FS connections.

-- Brane

P.S.: Or use BDB instead of SQLite ...

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2403661

Re: Any FSFS rep-sharing experts out there?

Posted by Mark Phippard <ma...@gmail.com>.

On Mon, Oct 5, 2009 at 9:18 AM, Bert Huijben <be...@qqmail.nl> wrote:

>> On Mon, Oct 5, 2009 at 6:57 AM, Branko Cibej <br...@xbc.nu> wrote:
>> > If so, please help me figure out how to figure out issue #3506.
>>
>> From my reading, SQLite locks the entire database file when it is
>> writing.  So if the large commit is writing a lot of data to the file
>> it could be that the lock window is high.  I imagine it is also
>> possible for the size of the database in the ASF repository to make it
>> take even longer?
>
> Normally SQLite doesn't block the entire database, but just the tables it will be writing to when committing the current transaction.
>
> The documentation also says something about: If the changes are too large to fit in the
> memory cache, the lock is promoted to a full exclusive lock to allow spilling
> intermediate results to the database file.
>
> This last thing might be the case here....?

Some perhaps old stuff I read said the entire database is locked
during writing.  In the case of rep-cache, isn't there only a single
table anyway?  I have never looked but cannot imagine there are too
many tables. So a table lock would essentially lock everything
regardless.

>> Are we using these features of SQLite?
>
> I don't think we install a busy handler, and I don't know if we want to enable this
> globally; probably not.
>
> In the WC-Layer I would prefer an immediate negative answer over a delay..

I was mainly thinking of the repos layer.  We should not be
introducing problems like this into the repository.  I do not even
think this feature should have been added to Subversion if it can
introduce problems like this.

-- 
Thanks

Mark Phippard
http://markphip.blogspot.com/

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2403597

RE: Any FSFS rep-sharing experts out there?

Posted by Bert Huijben <rh...@sharpsvn.net>.

> -----Original Message-----
> From: Mark Phippard [mailto:markphip@gmail.com]
> Sent: Monday, 5 October 2009 15:02
> To: Branko Cibej
> Cc: dev@subversion.tigris.org
> Subject: Re: Any FSFS rep-sharing experts out there?
> 
> On Mon, Oct 5, 2009 at 6:57 AM, Branko Cibej <br...@xbc.nu> wrote:
> > If so, please help me figure out how to figure out issue #3506.
> 
> From my reading, SQLite locks the entire database file when it is
> writing.  So if the large commit is writing a lot of data to the file
> it could be that the lock window is high.  I imagine it is also
> possible for the size of the database in the ASF repository to make it
> take even longer?

Normally SQLite doesn't block the entire database, but just the tables it will be writing to when committing the current transaction. 

The documentation also says something about: If the changes are too large to fit in the memory cache, the lock is promoted to a full exclusive lock to allow spilling intermediate results to the database file.

This last thing might be the case here....?
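
For what it's worth, the locking granularity is easy to check locally. The sketch below uses Python's standard sqlite3 module as a stand-in for the C API; the table names a and b and the function name are made up for the illustration. With two separate connections in the default rollback-journal mode, an open write transaction on one table makes a write to a different table in the same file fail with "database is locked", which suggests the write lock covers the whole database file rather than individual tables.

```python
import os
import sqlite3
import tempfile


def write_to_other_table_is_blocked():
    """Return True if a writer on table a blocks a writer on table b."""
    path = os.path.join(tempfile.mkdtemp(), "lock-demo.db")
    # isolation_level=None: transactions are controlled explicitly.
    # timeout=0: report a busy database at once instead of waiting.
    first = sqlite3.connect(path, timeout=0, isolation_level=None)
    second = sqlite3.connect(path, timeout=0, isolation_level=None)
    first.execute("CREATE TABLE a (x INTEGER)")
    first.execute("CREATE TABLE b (x INTEGER)")

    first.execute("BEGIN IMMEDIATE")            # take the write lock
    first.execute("INSERT INTO a VALUES (1)")
    try:
        second.execute("INSERT INTO b VALUES (2)")  # a different table
        blocked = False
    except sqlite3.OperationalError as err:
        blocked = "locked" in str(err)
    first.execute("COMMIT")

    first.close()
    second.close()
    return blocked
```

Since rep-cache.db has only one table anyway, the practical consequence for this issue is the same either way: one writer at a time.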

> Are we using these features of SQLite?

I don't think we install a busy handler, and I don't know if we want to enable this globally; probably not. 

In the WC layer I would prefer an immediate negative answer over a delay.

Retrying is in most cases not a valid answer to this. Just look at the WIN32 retry loop, which is (or was) sometimes used on multiple levels, and which could make a directory delete take half a minute when there was a locked file where the code assumed a directory.

But maybe we should enable this for this specific repository use of SQLite.

	Bert

> 
> http://www.sqlite.org/c3ref/busy_handler.html
> 
> It seems like our code needs to wait for the lock to clear, at least
> for a little bit of time.  Of course we also need to do as much as we
> can to have the lock window be as small as possible.
> 
> It seems like this will be a potential WC-NG issue as well if you have
> multiple processes accessing the same WC.
> 
> --
> Thanks
> 
> Mark Phippard
> http://markphip.blogspot.com/
> 
> ------------------------------------------------------
> http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2403570

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2403592

Re: Any FSFS rep-sharing experts out there?

Posted by Mark Phippard <ma...@gmail.com>.

On Mon, Oct 5, 2009 at 6:57 AM, Branko Cibej <br...@xbc.nu> wrote:
> If so, please help me figure out how to figure out issue #3506.

Re: Any FSFS rep-sharing experts out there?

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.

Daniel Shahaf wrote on Fri, 9 Oct 2009 at 13:00 +0200:
> Daniel Shahaf wrote on Fri, 9 Oct 2009 at 00:22 +0200:
> > David Glasser wrote on Thu, 8 Oct 2009 at 13:47 -0700:
> > > So yeah, your patch looks good but I'd go two steps farther and run
> > > the set_rep calls outside of the write lock and not in a transaction.
> > 
> > Per your other email, I'll move it outside the write lock.
> 
> I will do that in a followup commit.

r39897.

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2405531

Re: Any FSFS rep-sharing experts out there?

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.

Daniel Shahaf wrote on Fri, 9 Oct 2009 at 00:22 +0200:
> David Glasser wrote on Thu, 8 Oct 2009 at 13:47 -0700:
> > On Thu, Oct 8, 2009 at 11:07 AM, Daniel Shahaf <d....@daniel.shahaf.name> wrote:
> > > I took a stab, see attached.

Committed in r39892.  Further testing/reviews are still welcome, thanks.

> > So yeah, your patch looks good but I'd go two steps farther and run
> > the set_rep calls outside of the write lock and not in a transaction.
> > 
> 
> Per your other email, I'll move it outside the write lock.
> 

I will do that in a followup commit.

> As suggested upthread, we may still want to make the code ignore BUSY
> errors when opening the DB (either for read or for write)

*After* the open, we do wait 10 seconds when sqlite signals BUSY:

  /* Retry until timeout when database is busy. */
  SQLITE_ERR_MSG(sqlite3_busy_timeout(*db3, BUSY_TIMEOUT),
                 sqlite3_errmsg(*db3));
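
The effect of a busy timeout can be seen in a small experiment. This is only a sketch: it uses Python's standard sqlite3 module, whose connect(timeout=...) argument sets SQLite's busy timeout as far as I know, and the table layout, function name and timings are invented for the demo (they are not Subversion's BUSY_TIMEOUT). One thread holds the write lock for a moment; a second connection with a busy timeout waits for the lock instead of failing.

```python
import os
import sqlite3
import tempfile
import threading
import time


def writer_waits_for_lock(hold_seconds=0.3, busy_timeout=5.0):
    """Return (seconds_waited, row_count) for a writer that hits a lock."""
    path = os.path.join(tempfile.mkdtemp(), "busy-demo.db")
    setup = sqlite3.connect(path)
    setup.execute("CREATE TABLE rep_cache (hash TEXT PRIMARY KEY)")
    setup.commit()
    setup.close()

    locked = threading.Event()

    def hold_lock():
        # Each thread uses its own connection.
        conn = sqlite3.connect(path, isolation_level=None)
        conn.execute("BEGIN IMMEDIATE")      # take the write lock
        conn.execute("INSERT INTO rep_cache VALUES ('aaaa')")
        locked.set()
        time.sleep(hold_seconds)             # simulate a long commit
        conn.execute("COMMIT")
        conn.close()

    holder = threading.Thread(target=hold_lock)
    holder.start()
    locked.wait()

    # With timeout=0 this INSERT would raise "database is locked".
    waiter = sqlite3.connect(path, timeout=busy_timeout, isolation_level=None)
    start = time.monotonic()
    waiter.execute("INSERT INTO rep_cache VALUES ('bbbb')")
    waited = time.monotonic() - start
    holder.join()

    rows = waiter.execute("SELECT COUNT(*) FROM rep_cache").fetchone()[0]
    waiter.close()
    return waited, rows
```

Note that the timeout only applies to an already open handle, which is why a BUSY error during the open itself is a separate question.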

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2405484

Re: Any FSFS rep-sharing experts out there?

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.

David Glasser wrote on Thu, 8 Oct 2009 at 13:47 -0700:
> On Thu, Oct 8, 2009 at 11:07 AM, Daniel Shahaf <d....@daniel.shahaf.name> wrote:
> > C. Michael Pilato wrote on Thu, 8 Oct 2009 at 12:51 -0400:
> >> Branko Cibej wrote:
> >> > The rep-cache database gets opened deep within svn_fs_fs__open and
> >> > svn_fs_fs__create. We don't really have a way to distinguish between
> >> > open-for-read and open-for-write in svn_fs_open. I can't form an opinion
> >> > right now on whether that's a serious omission or not, but in any case
> >> > adding an open-mode would be a huge conceptional change, of the svn-2.0
> >> > kind, IMHO.
> >>
> >> I can't see why we'd need to add different access modes.  Why not simply
> >> make the code avoid opening the cache database until it is needed?
> >
> > I took a stab, see attached.
> 
> Took a look at this and it generally looks good.  I was concerned at
> first that there could be a downside to a failure to update the DB
> after a successful commit_body, but that was back from when the table
> had a "ref count" for each rep; the current code has a different more
> arbitrary uniquifier, so this should work.
> 

Cool.

> This raises a question, though: do we even need to do
> write_reps_to_cache inside the FSFS write lock?  I actually don't
> think so. [...]

Yes, write_reps_to_cache() only takes care of updating rep-cache.db, so
there is no real "need" for it to be inside the write lock.

(I'm not familiar with sqlite, but I'm assuming it will not have a problem
handling the resulting additional concurrent accesses to the DB.)

> I think that the DB has to obey this invariant:
> 
>   "If at any point, it is possible to read
> HASH=>(rev,offset,size,expanded_size) from the DB, then there must
> exist a rep for HASH at (rev,offset) with the given size and expanded
> size for the indefinite future."
> 
> But it doesn't actually need to obey anything stronger than that: it's
> not necessary that that entry remain in the database, for example.
> 

Agreed.  (It also assumes that the HASH values of different reps don't
collide.)

> So yeah, your patch looks good but I'd go two steps farther and run
> the set_rep calls outside of the write lock and not in a transaction.
> 

Per your other email, I'll move it outside the write lock.

As suggested upthread, we may still want to make the code ignore BUSY
errors when opening the DB (either for read or for write) --- this will
only be significant, however, in an environment where commits touch many
files per second (since the patch makes the DB only be opened by FS
writers).

> --dave
> 
> 

Thanks for the review!

Daniel

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2405353

Re: Any FSFS rep-sharing experts out there?

Posted by David Glasser <gl...@davidglasser.net>.

On Thu, Oct 8, 2009 at 2:48 PM, Bert Huijben <be...@qqmail.nl> wrote:
> On Thu, Oct 8, 2009 at 10:47 PM, David Glasser <gl...@davidglasser.net> wrote:
>
>> (This is making the assumption that a series of N non-transaction
>> SQLite INSERT statements (which is to say, a series of N implicit
>> transactions) is as efficient as one transaction with N INSERTs.
>> Maybe it isn't; I'm not sure.  Getting it out of the FSFS write lock
>> would still be good, though.)
>
> This is answered in http://www.sqlite.org/faq.html#q19. It isn't as fast:
>
> --------------
> (19) INSERT is really slow - I can only do few dozen INSERTs per second
>
>   Actually, SQLite will easily do 50,000 or more INSERT statements
> per second on an average desktop computer. But it will only do a few
> dozen transactions per second. Transaction speed is limited by the
> rotational speed of your disk drive. A transaction normally requires
> two complete rotations of the disk platter, which on a 7200RPM disk
> drive limits you to about 60 transactions per second.
>
>   Transaction speed is limited by disk drive speed because (by
> default) SQLite actually waits until the data really is safely stored
> on the disk surface before the transaction is complete. That way, if
> you suddenly lose power or if your OS crashes, your data is still
> safe. For details, read about atomic commit in SQLite..
>
>   By default, each INSERT statement is its own transaction. But if
> you surround multiple INSERT statements with BEGIN...COMMIT then all
> the inserts are grouped into a single transaction. The time needed to
> commit the transaction is amortized over all the enclosed insert
> statements and so the time per insert statement is greatly reduced.
>
>   Another option is to run PRAGMA synchronous=OFF. This command will
> cause SQLite to not wait on data to reach the disk surface, which will
> make write operations appear to be much faster. But if you lose power
> in the middle of a transaction, your database file might go corrupt.
>
> ----
> And it is certainly at its right place in the FAQ.. Referenced it
> three times in the last few days ;)

Thanks Bert.  Then take away my recommendation to de-sqlitetxnify the
rep cache updating.  But it's still worth considering moving it
outside of the FSFS write lock.

The tradeoffs, as far as I see it, are:

  - In Daniel's current patch, the sqlite DB is only accessed when the
FSFS write lock is held, which means that in theory there should never
be *any* concurrent access to it, so we never have to worry about
SQLITE_BUSY, overlapping txns, etc.  (Though at this point sqlite is kinda
overkill :) )  On the other hand, each commit holds the write lock
longer, so if a repo is getting lots and lots of commits, they will
queue up behind one another.

 - If the rep cache is updated after the lock is dropped, then it's
possible to get contention on the DB, but multiple commits around the
same time can do more work in parallel.
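
The two options can be modelled in a few lines. This is a toy sketch in Python with invented names (fs_write_lock, commit, cache_inside_lock); a threading.Lock stands in for the FSFS write lock and a dict stands in for rep-cache.db. The only difference between the two modes is whether the cache update runs while the lock is still held.

```python
import threading

fs_write_lock = threading.Lock()   # stand-in for the FSFS write lock


def commit(revisions, rep_cache, reps, cache_inside_lock):
    """Append a revision; return (new_rev, lock_held_during_cache_update)."""

    def update_cache(new_rev):
        for digest in reps:
            # Keep the first entry for a hash; later duplicates are ignored.
            rep_cache.setdefault(digest, new_rev)
        return fs_write_lock.locked()

    with fs_write_lock:
        revisions.append(dict(reps))          # the actual FS commit
        new_rev = len(revisions) - 1
        held = update_cache(new_rev) if cache_inside_lock else None
    if not cache_inside_lock:
        # Other committers may already hold the lock again at this point,
        # so the real DB would have to cope with concurrent writers here.
        held = update_cache(new_rev)
    return new_rev, held
```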

I think it's worth trying...

-- 
glasser@davidglasser.net | langtonlabs.org | flickr.com/photos/glasser/

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2405335

Re: Any FSFS rep-sharing experts out there?

Posted by Bert Huijben <rh...@sharpsvn.net>.

On Thu, Oct 8, 2009 at 10:47 PM, David Glasser <gl...@davidglasser.net> wrote:

> (This is making the assumption that a series of N non-transaction
> SQLite INSERT statements (which is to say, a series of N implicit
> transactions) is as efficient as one transaction with N INSERTs.
> Maybe it isn't; I'm not sure.  Getting it out of the FSFS write lock
> would still be good, though.)

This is answered in http://www.sqlite.org/faq.html#q19. It isn't as fast:

--------------
(19) INSERT is really slow - I can only do few dozen INSERTs per second

   Actually, SQLite will easily do 50,000 or more INSERT statements
per second on an average desktop computer. But it will only do a few
dozen transactions per second. Transaction speed is limited by the
rotational speed of your disk drive. A transaction normally requires
two complete rotations of the disk platter, which on a 7200RPM disk
drive limits you to about 60 transactions per second.

   Transaction speed is limited by disk drive speed because (by
default) SQLite actually waits until the data really is safely stored
on the disk surface before the transaction is complete. That way, if
you suddenly lose power or if your OS crashes, your data is still
safe. For details, read about atomic commit in SQLite.

   By default, each INSERT statement is its own transaction. But if
you surround multiple INSERT statements with BEGIN...COMMIT then all
the inserts are grouped into a single transaction. The time needed to
commit the transaction is amortized over all the enclosed insert
statements and so the time per insert statement is greatly reduced.

   Another option is to run PRAGMA synchronous=OFF. This command will
cause SQLite to not wait on data to reach the disk surface, which will
make write operations appear to be much faster. But if you lose power
in the middle of a transaction, your database file might go corrupt.

----
And it is certainly at its right place in the FAQ.. Referenced it
three times in the last few days ;)

  Bert
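
The difference the FAQ describes can be measured with a short script. This is a sketch using Python's standard sqlite3 module rather than the C API; the function name and the table layout are invented for the test. It inserts the same rows either as n implicit transactions or inside a single BEGIN...COMMIT and returns the elapsed time. The actual numbers depend heavily on the disk and filesystem, so none are claimed here.

```python
import os
import sqlite3
import tempfile
import time


def time_inserts(n, one_transaction):
    """Insert n rows and return (elapsed_seconds, row_count)."""
    path = os.path.join(tempfile.mkdtemp(), "insert-demo.db")
    # isolation_level=None: autocommit, so each INSERT is its own
    # transaction unless we issue BEGIN ourselves.
    conn = sqlite3.connect(path, isolation_level=None)
    conn.execute(
        "CREATE TABLE rep_cache (hash TEXT PRIMARY KEY, revision INTEGER)")
    start = time.monotonic()
    if one_transaction:
        conn.execute("BEGIN")
    for i in range(n):
        conn.execute("INSERT INTO rep_cache VALUES (?, ?)",
                     ("hash-%d" % i, i))
    if one_transaction:
        conn.execute("COMMIT")
    elapsed = time.monotonic() - start
    rows = conn.execute("SELECT COUNT(*) FROM rep_cache").fetchone()[0]
    conn.close()
    return elapsed, rows
```

Calling time_inserts(200, False) and time_inserts(200, True) on the same machine gives a rough idea of how much the per-transaction sync costs there.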

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2405325

Re: Any FSFS rep-sharing experts out there?

Posted by David Glasser <gl...@davidglasser.net>.

On Thu, Oct 8, 2009 at 11:07 AM, Daniel Shahaf <d....@daniel.shahaf.name> wrote:
> C. Michael Pilato wrote on Thu, 8 Oct 2009 at 12:51 -0400:
>> Branko Cibej wrote:
>> > The rep-cache database gets opened deep within svn_fs_fs__open and
>> > svn_fs_fs__create. We don't really have a way to distinguish between
>> > open-for-read and open-for-write in svn_fs_open. I can't form an opinion
>> > right now on whether that's a serious omission or not, but in any case
>> > adding an open-mode would be a huge conceptional change, of the svn-2.0
>> > kind, IMHO.
>>
>> I can't see why we'd need to add different access modes.  Why not simply
>> make the code avoid opening the cache database until it is needed?
>
> I took a stab, see attached.
>
> The ideas were:
>
> * centralize reading the config (avoid code duplication)
> * open the DB as late as possible (but don't bother closing it once it's opened)
> * write to the DB only after finishing the FS commit (thus enabling commit_body()
>  to be run outside the sqlite txn)
>
> It passes tests (C, basic, and commit) (but I'm positive I could have bugs
> that the tests wouldn't catch).

Took a look at this and it generally looks good.  I was concerned at
first that there could be a downside to a failure to update the DB
after a successful commit_body, but that was back from when the table
had a "ref count" for each rep; the current code has a different more
arbitrary uniquifier, so this should work.

This raises a question, though: do we even need to do
write_reps_to_cache inside the FSFS write lock?  I actually don't
think so.  Note that (as of r33408, back on the original feature
branch), Hyrum decided not to care if multiple transactions around the
same time tried to call set_rep_reference with the same checksum.
(Nobody passes TRUE for reject_dup.)  If we take the SQLite call out
of the write lock, the only real downside is that revisions committed
very soon after this one can't share reps with it (but if we're a
server with such frequent revisions, we'd probably prefer the extra
concurrency anyway).

In fact, why do you even need to run write_reps_to_cache in a
transaction at all?  I think that the DB has to obey this invariant:

  "If at any point, it is possible to read
HASH=>(rev,offset,size,expanded_size) from the DB, then there must
exist a rep for HASH at (rev,offset) with the given size and expanded
size for the indefinite future."

But it doesn't actually need to obey anything stronger than that: it's
not necessary that that entry remain in the database, for example.
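
To make the invariant concrete, here is a toy model in Python. All names are invented, expanded_size is left out for brevity, and real FSFS reps live in revision files rather than in a dict of bytearrays. An entry is only ever added after the rep it points to has been written, so a lost or skipped cache write costs some sharing but never leaves a dangling entry.

```python
import hashlib


def store_rep(content, rev_files, cache, rev, cache_writable=True):
    """Store content in revision rev and return (rev, offset, size)."""
    digest = hashlib.sha1(content).hexdigest()
    hit = cache.get(digest)
    if hit is not None:
        return hit                            # share the existing rep
    data = rev_files.setdefault(rev, bytearray())
    location = (rev, len(data), len(content))
    data.extend(content)                      # the rep is written first ...
    if cache_writable:
        cache.setdefault(digest, location)    # ... and only then advertised
    return location


def read_rep(rev_files, location):
    """Read back the bytes a cache entry or return value points at."""
    rev, offset, size = location
    return bytes(rev_files[rev][offset:offset + size])
```

With cache_writable=False the same content simply gets stored again by a later revision, which is the "minor" cost discussed upthread.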

So yeah, your patch looks good but I'd go two steps farther and run
the set_rep calls outside of the write lock and not in a transaction.

(This is making the assumption that a series of N non-transaction
SQLite INSERT statements (which is to say, a series of N implicit
transactions) is as efficient as one transaction with N INSERTs.
Maybe it isn't; I'm not sure.  Getting it out of the FSFS write lock
would still be good, though.)

--dave

-- 
glasser@davidglasser.net | langtonlabs.org | flickr.com/photos/glasser/

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2405309

Re: Any FSFS rep-sharing experts out there?

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.

Branko Čibej wrote on Thu, 8 Oct 2009 at 20:43 +0200:
> C. Michael Pilato wrote:
> > Daniel Shahaf wrote:
> >> C. Michael Pilato wrote on Thu, 8 Oct 2009 at 12:51 -0400:
> >>> I can't see why we'd need to add different access modes.  Why not simply
> >>> make the code avoid opening the cache database until it is needed?
> >>>       
> >> I took a stab, see attached.
> >>
> >
> > . o O ( I love open source software. )
> >   
> 
> :( You've just taken all the fun out of it. Here I was hoping to have a
> happy hacking week-end discovering the surface of FSFS, and now all I
> have left is trying to verify that your change has a positive effect on
> really large repos.
> 

We're in the same boat, then.  Nonetheless, sorry for spoiling your weekend.

Daniel

> .oO(I wish the following wouldn't fail...
>         $ set -e
>         $ [ $REALJOB = $SVN ])
> 
> -- Brane
>

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2405282

Re: Any FSFS rep-sharing experts out there?

Posted by Branko Cibej <br...@xbc.nu>.

C. Michael Pilato wrote:
> Daniel Shahaf wrote:
>   
>> C. Michael Pilato wrote on Thu, 8 Oct 2009 at 12:51 -0400:
>>     
>>> Branko Cibej wrote:
>>>       
>>>> The rep-cache database gets opened deep within svn_fs_fs__open and
>>>> svn_fs_fs__create. We don't really have a way to distinguish between
>>>> open-for-read and open-for-write in svn_fs_open. I can't form an opinion
>>>> right now on whether that's a serious omission or not, but in any case
>>>> adding an open-mode would be a huge conceptional change, of the svn-2.0
>>>> kind, IMHO.
>>>>         
>>> I can't see why we'd need to add different access modes.  Why not simply
>>> make the code avoid opening the cache database until it is needed?
>>>       
>> I took a stab, see attached.
>>
>> The ideas were:
>>
>> * centralize reading the config (avoid code duplication)
>> * open the DB as late as possible (but don't bother closing it once it's opened)
>> * write to the DB only after finishing the FS commit (thus enabling commit_body() 
>>   to be run outside the sqlite txn)
>>
>> It passes tests (C, basic, and commit) (but I'm positive I could have bugs 
>> that the tests wouldn't catch).
>>     
>
> . o O ( I love open source software. )
>   

:( You've just taken all the fun out of it. Here I was hoping to have a
happy hacking week-end discovering the surface of FSFS, and now all I
have left is trying to verify that your change has a positive effect on
really large repos.

.oO(I wish the following wouldn't fail...
        $ set -e
        $ [ $REALJOB = $SVN ])

-- Brane

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2405273

Re: Any FSFS rep-sharing experts out there?

Posted by "C. Michael Pilato" <cm...@collab.net>.

Daniel Shahaf wrote:
> C. Michael Pilato wrote on Thu, 8 Oct 2009 at 12:51 -0400:
>> Branko Cibej wrote:
>>> The rep-cache database gets opened deep within svn_fs_fs__open and
>>> svn_fs_fs__create. We don't really have a way to distinguish between
>>> open-for-read and open-for-write in svn_fs_open. I can't form an opinion
>>> right now on whether that's a serious omission or not, but in any case
>>> adding an open-mode would be a huge conceptional change, of the svn-2.0
>>> kind, IMHO.
>> I can't see why we'd need to add different access modes.  Why not simply
>> make the code avoid opening the cache database until it is needed?
> 
> I took a stab, see attached.
> 
> The ideas were:
> 
> * centralize reading the config (avoid code duplication)
> * open the DB as late as possible (but don't bother closing it once it's opened)
> * write to the DB only after finishing the FS commit (thus enabling commit_body() 
>   to be run outside the sqlite txn)
> 
> It passes tests (C, basic, and commit) (but I'm positive I could have bugs 
> that the tests wouldn't catch).

. o O ( I love open source software. )

-- 
C. Michael Pilato <cm...@collab.net>
CollabNet   <>   www.collab.net   <>   Distributed Development On Demand

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2405259

Re: Any FSFS rep-sharing experts out there?

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.

C. Michael Pilato wrote on Thu, 8 Oct 2009 at 12:51 -0400:
> Branko Cibej wrote:
> > The rep-cache database gets opened deep within svn_fs_fs__open and
> > svn_fs_fs__create. We don't really have a way to distinguish between
> > open-for-read and open-for-write in svn_fs_open. I can't form an opinion
> > right now on whether that's a serious omission or not, but in any case
> > adding an open-mode would be a huge conceptional change, of the svn-2.0
> > kind, IMHO.
> 
> I can't see why we'd need to add different access modes.  Why not simply
> make the code avoid opening the cache database until it is needed?

I took a stab, see attached.

The ideas were:

* centralize reading the config (avoid code duplication)
* open the DB as late as possible (but don't bother closing it once it's opened)
* write to the DB only after finishing the FS commit (thus enabling commit_body() 
  to be run outside the sqlite txn)

It passes tests (C, basic, and commit) (but I'm positive I could have bugs 
that the tests wouldn't catch).
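
For readers without the attachment, the second and third ideas can be sketched roughly as follows. This is not the patch: it is a minimal Python model with invented names (DemoFS, rep_cache_opened), using the standard sqlite3 module in place of Subversion's C code. Reads never touch the cache database; it is opened on the first commit and then kept open, and it is written only after the revision itself has been stored.

```python
import sqlite3


class DemoFS:
    """Toy filesystem that opens its rep cache only when committing."""

    def __init__(self, rep_cache_path):
        self._rep_cache_path = rep_cache_path
        self._rep_cache = None        # not opened yet
        self.revisions = []           # list of {hash: content} dicts

    @property
    def rep_cache_opened(self):
        return self._rep_cache is not None

    def _open_rep_cache(self):
        # Open as late as possible; keep the handle once it is open.
        if self._rep_cache is None:
            self._rep_cache = sqlite3.connect(self._rep_cache_path)
            self._rep_cache.execute(
                "CREATE TABLE IF NOT EXISTS rep_cache "
                "(hash TEXT PRIMARY KEY, revision INTEGER)")
        return self._rep_cache

    def read_revision(self, rev):
        # Readers (checkout, log, ...) never need the cache database.
        return self.revisions[rev]

    def commit(self, reps):
        # 1. Finish the FS commit first.
        self.revisions.append(dict(reps))
        new_rev = len(self.revisions) - 1
        # 2. Only then record the new reps, in one SQLite transaction.
        db = self._open_rep_cache()
        with db:
            db.executemany(
                "INSERT OR IGNORE INTO rep_cache VALUES (?, ?)",
                [(digest, new_rev) for digest in reps])
        return new_rev
```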

Daniel

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2405246

Re: Any FSFS rep-sharing experts out there?

Posted by "C. Michael Pilato" <cm...@collab.net>.

Branko Cibej wrote:
> The rep-cache database gets opened deep within svn_fs_fs__open and
> svn_fs_fs__create. We don't really have a way to distinguish between
> open-for-read and open-for-write in svn_fs_open. I can't form an opinion
> right now on whether that's a serious omission or not, but in any case
> adding an open-mode would be a huge conceptional change, of the svn-2.0
> kind, IMHO.

I can't see why we'd need to add different access modes.  Why not simply
make the code avoid opening the cache database until it is needed?

-- 
C. Michael Pilato <cm...@collab.net>
CollabNet   <>   www.collab.net   <>   Distributed Development On Demand

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2405173

Re: Any FSFS rep-sharing experts out there?

Posted by Branko Cibej <br...@xbc.nu>.

Mark Phippard wrote:
> On Thu, Oct 8, 2009 at 12:40 PM, Branko Čibej <br...@xbc.nu> wrote:
>
>   
>>> 1) Why do we need to open this database when reading the repository?
>>> Such as for a checkout?  My understanding is that the only time the
>>> cache is needed is when we are writing to the database.  We want to
>>> see if there is an existing representation for the same hash.  So why
>>> should we care if this database is locked when someone is doing a
>>> checkout?
>>>
>>>       
>> The rep-cache database gets opened deep within svn_fs_fs__open and
>> svn_fs_fs__create. We don't really have a way to distinguish between
>> open-for-read and open-for-write in svn_fs_open. I can't form an opinion
>> right now on whether that's a serious omission or not, but in any case
>> adding an open-mode would be a huge conceptional change, of the svn-2.0
>> kind, IMHO.
>>     
>
> I am not a C programmer, so I am speaking from a position of ignorance
> here, but I do not see why we cannot just open the database at the
> same point where we actually need to use it.  It seems like poor
> design to be opening the SQLite database every time the FS is
> accessed, when it is only needed when we are committing something.
> Especially thinking of people that have their repository on NFS or a
> NetApp or something.

I'm sure we can. And it has nothing to do with choice of programming
languages. :)

You're right, though; doing this would most likely solve 99% of the
blocking problems seen on svn.apache.org. Even two simultaneous writers
aren't likely to interfere with each other too much.

I'll see about confirming this hypothesis.

-- Brane

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2405168

Re: Any FSFS rep-sharing experts out there?

Posted by Mark Phippard <ma...@gmail.com>.

On Thu, Oct 8, 2009 at 12:40 PM, Branko Čibej <br...@xbc.nu> wrote:

>> 1) Why do we need to open this database when reading the repository?
>> Such as for a checkout?  My understanding is that the only time the
>> cache is needed is when we are writing to the database.  We want to
>> see if there is an existing representation for the same hash.  So why
>> should we care if this database is locked when someone is doing a
>> checkout?
>>
>
> The rep-cache database gets opened deep within svn_fs_fs__open and
> svn_fs_fs__create. We don't really have a way to distinguish between
> open-for-read and open-for-write in svn_fs_open. I can't form an opinion
> right now on whether that's a serious omission or not, but in any case
> adding an open-mode would be a huge conceptional change, of the svn-2.0
> kind, IMHO.

I am not a C programmer, so I am speaking from a position of ignorance
here, but I do not see why we cannot just open the database at the
same point where we actually need to use it.  It seems like poor
design to be opening the SQLite database every time the FS is
accessed, when it is only needed when we are committing something.
Especially thinking of people that have their repository on NFS or a
NetApp or something.

-- 
Thanks

Mark Phippard
http://markphip.blogspot.com/

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2405166

Re: Any FSFS rep-sharing experts out there?

Posted by Branko Cibej <br...@xbc.nu>.

Mark Phippard wrote:
> On Mon, Oct 5, 2009 at 12:52 PM, Branko Cibej <br...@xbc.nu> wrote:
>
>   
>> But anyway the question is irrelevant. If we manage to lock up the
>> server for tens of seconds because of a slightly larger-than-usual
>> commit, we need to fix it. This is pretty much on my plate right now,
>> but I'll ask around for help on understanding FSFS details.
>>     
>
> Have you made any progress on this?  It would seem worthwhile to get
> this fix into 1.6.6 if possible.
>   

Nope, I got sidetracked by $REAL_JOB. And I doubt it will be a trivial
change that can be coded, tested, reviewed and generally banged on in 6
days in time for 1.6.6.

> I do not want to sidetrack you, but I did have a couple questions (for anyone).
>
> 1) Why do we need to open this database when reading the repository?
> Such as for a checkout?  My understanding is that the only time the
> cache is needed is when we are writing to the database.  We want to
> see if there is an existing representation for the same hash.  So why
> should we care if this database is locked when someone is doing a
> checkout?
>   

The rep-cache database gets opened deep within svn_fs_fs__open and
svn_fs_fs__create. We don't really have a way to distinguish between
open-for-read and open-for-write in svn_fs_open. I can't form an opinion
right now on whether that's a serious omission or not, but in any case
adding an open-mode would be a huge conceptual change, of the svn-2.0
kind, IMHO.

> 2) I assume we are holding the database open while the transaction is
> being committed for atomicity, but do we really need it here?
> Couldn't we just write the new rows to the table after the commit
> succeeds?  If the rows are not written it just means that a future new
> rep with the same hash would not share it.  In the grand scheme of
> things, that seems minor compared to the current behavior.
>   

I'm looking at this alternative, yes. The other one is to do the inserts
in smaller transactions. I don't /think/ that merely opening the
database for read/write access should lock it up; according to all the
docs I've seen, SQLITE_BUSY is a consequence of the whole-table lock
that SQLite acquires for writes.

> If we solved #1, it would seem like #2 would not be an issue.
>   

Or if we had a real(er) database, such as BDB ... :D

-- Brane
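
As a rough illustration of the "smaller transactions" alternative, the sketch below inserts cache rows in several short transactions instead of one long one, so the write lock is released between batches. It is Python rather than Subversion's C; the function name `insert_in_batches`, the batch size and the `hash` column are assumptions made for the example, and whether this shortens the lock window enough for FSFS is not established in this thread.

```python
import sqlite3


def insert_in_batches(conn, rows, batch_size=100):
    """Insert rows in several short transactions; return the number of commits."""
    commits = 0
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        # The connection context manager commits on success and rolls back
        # on error, so the write lock is held for one batch at a time.
        with conn:
            conn.executemany(
                "INSERT OR IGNORE INTO rep_cache (hash, size, expanded_size) "
                "VALUES (?, ?, ?)",
                batch,
            )
        commits += 1
    return commits


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE rep_cache ("
    "hash TEXT PRIMARY KEY, size INTEGER, expanded_size INTEGER)"
)
# Made-up rows, written after the (imaginary) FS commit has succeeded.
rows = [("h%04d" % i, 2352, 16043) for i in range(250)]
commits = insert_in_batches(conn, rows)
total = conn.execute("SELECT COUNT(*) FROM rep_cache").fetchone()[0]
```

If a batch fails, the only consequence in this model is that some rows are missing from the cache, which matches the point made above: a later representation with the same hash would simply not be shared.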

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2405161

Re: Any FSFS rep-sharing experts out there?

Posted by Mark Phippard <ma...@gmail.com>.

On Mon, Oct 5, 2009 at 12:52 PM, Branko Cibej <br...@xbc.nu> wrote:

> But anyway the question is irrelevant. If we manage to lock up the
> server for tens of seconds because of a slightly larger-than-usual
> commit, we need to fix it. This is pretty much on my plate right now,
> but I'll ask around for help on understanding FSFS details.

Have you made any progress on this?  It would seem worthwhile to get
this fix into 1.6.6 if possible.

I do not want to sidetrack you, but I did have a couple questions (for anyone).

1) Why do we need to open this database when reading the repository?
Such as for a checkout?  My understanding is that the only time the
cache is needed is when we are writing to the database.  We want to
see if there is an existing representation for the same hash.  So why
should we care if this database is locked when someone is doing a
checkout?

2) I assume we are holding the database open while the transaction is
being committed for atomicity, but do we really need it here?
Couldn't we just write the new rows to the table after the commit
succeeds?  If the rows are not written it just means that a future new
rep with the same hash would not share it.  In the grand scheme of
things, that seems minor compared to the current behavior.

If we solved #1, it would seem like #2 would not be an issue.

-- 
Thanks

Mark Phippard
http://markphip.blogspot.com/

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2404941

Re: Any FSFS rep-sharing experts out there?

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.

Paul Querna wrote on Thu, 8 Oct 2009 at 00:48 -0700:
> On Wed, Oct 7, 2009 at 2:28 PM, David Glasser <gl...@davidglasser.net> wrote:
> > On Tue, Oct 6, 2009 at 7:10 PM, Paul Querna <ch...@force-elite.com> wrote:
> >> With help from Branko last night from IRC, pulled out the following stats
> >> from the ASF repository:
> >> 15,612,528 representations total [1]
> >> 4,254,361 unique representations in the sqlitedb [2]
> >> (3.7x ratio)
> >
> > I'm not sure how useful that number is.  Is everything in the repo in
> > the db, or only reps created since rep-sharing was enabled?
> 
> Everything in the repo.  We did a full dump and reload for svn 1.6,
> and enabled rep-sharing before starting the load. (filtered out some
> paths at the same time, wasn't a pointless exercise)
> 
> >  The more
> > relevant number is "what is the sum of all the reference count
> > numbers, compared to the 4.2 million number".
> 
> tell me what to run to get you the interesting statistics, and I'm
> happy to do that :)
> 

Since you enabled rep-sharing prior to starting the load, then (IIUC) the 
"sum of all the reference counts" should be the same 15M number as above. 

(If you had rep-sharing disabled during portions of history, then the 
number of reps in those portions should be subtracted from the 15M.)

To see how much disk space is saved, I suppose you'll have to dump|load 
with enable-rep-sharing=false set --- I don't know of an easier way 
(not without having the reference counts).

Daniel

> > But more importantly, because the *only* advantage of rep-sharing is
> > that it potentially reduces disk use (there is absolutely no potential
> > time savings (unless you are very hopeful about disk cache) and there
> > is increased locking), the only relevant stats IMHO are "how much disk
> > space does the repo take up, compared to how much it would take up
> > without rep sharing... and how does that size delta affect the needs
> > of the ASF (cost of disks, backup speed, etc)".
> 
> We saw a pretty massive speedup upgrading 1.5 -> 1.6.  I do attribute
> that somewhat to less disk thrashing, but it's hard to compare that to
> pre-rep-sharing, since we did lots of things around that time to get
> speedups every way we could.  Reducing repo size though is a big deal,
> our repo is easily 80gb++, cutting that by more than 20% is huge.
> 
> -Paul
>

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2404840

Re: Any FSFS rep-sharing experts out there?

Posted by Paul Querna <ch...@force-elite.com>.

On Wed, Oct 7, 2009 at 2:28 PM, David Glasser <gl...@davidglasser.net> wrote:
> On Tue, Oct 6, 2009 at 7:10 PM, Paul Querna <ch...@force-elite.com> wrote:
>> On Mon, Oct 5, 2009 at 4:31 PM, David Glasser <gl...@davidglasser.net> wrote:
>>> On Mon, Oct 5, 2009 at 9:52 AM, Branko Čibej <br...@xbc.nu> wrote:
>>>> Daniel Shahaf wrote:
>>>>> Branko Cibej wrote on Mon, 5 Oct 2009 at 18:08 +0200:
>>>>> IIUC, the size of the DB is proportional to the number of (unique)
>>>>> representations.  This doesn't tell anything about the amount of space
>>>>> saved (by reusing representations).
>>>>>
>>>>
>>>> Oh, yes, you're right. Silly me.
>>>>
>>>> But anyway the question is irrelevant. If we manage to lock up the
>>>> server for tens of seconds because of a slightly larger-than-usual
>>>> commit, we need to fix it. This is pretty much on my plate right now,
>>>> but I'll ask around for help on understanding FSFS details.
>>>
>>> The relevance of the question is that if you're not actually getting a
>>> benefit from rep caching (a feature whose cost/benefit ratios I
>>> personally felt were not strong enough to warrant it being turned on
>>> by default), you could just avoid all the contention by not using it.
>>
>> With help from Branko last night from IRC, pulled out the following stats
>> from the ASF repository:
>> 15,612,528 representations total [1]
>> 4,254,361 unique representations in the sqlitedb [2]
>> (3.7x ratio)
>
> I'm not sure how useful that number is.  Is everything in the repo in
> the db, or only reps created since rep-sharing was enabled?

Everything in the repo.  We did a full dump and reload for svn 1.6,
and enabled rep-sharing before starting the load. (filtered out some
paths at the same time, wasn't a pointless exercise)

>  The more
> relevant number is "what is the sum of all the reference count
> numbers, compared to the 4.2 million number".

tell me what to run to get you the interesting statistics, and I'm
happy to do that :)

> But more importantly, because the *only* advantage of rep-sharing is
> that it potentially reduces disk use (there is absolutely no potential
> time savings (unless you are very hopeful about disk cache) and there
> is increased locking), the only relevant stats IMHO are "how much disk
> space does the repo take up, compared to how much it would take up
> without rep sharing... and how does that size delta affect the needs
> of the ASF (cost of disks, backup speed, etc)".

We saw a pretty massive speedup upgrading 1.5 -> 1.6.  I do attribute
that somewhat to less disk thrashing, but it's hard to compare that to
pre-rep-sharing, since we did lots of things around that time to get
speedups every way we could.  Reducing repo size though is a big deal,
our repo is easily 80gb++, cutting that by more than 20% is huge.

-Paul

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2404815

Re: Any FSFS rep-sharing experts out there?

Posted by David Glasser <gl...@davidglasser.net>.

On Tue, Oct 6, 2009 at 7:10 PM, Paul Querna <ch...@force-elite.com> wrote:
> On Mon, Oct 5, 2009 at 4:31 PM, David Glasser <gl...@davidglasser.net> wrote:
>> On Mon, Oct 5, 2009 at 9:52 AM, Branko Čibej <br...@xbc.nu> wrote:
>>> Daniel Shahaf wrote:
>>>> Branko Cibej wrote on Mon, 5 Oct 2009 at 18:08 +0200:
>>>> IIUC, the size of the DB is proportional to the number of (unique)
>>>> representations.  This doesn't tell anything about the amount of space
>>>> saved (by reusing representations).
>>>>
>>>
>>> Oh, yes, you're right. Silly me.
>>>
>>> But anyway the question is irrelevant. If we manage to lock up the
>>> server for tens of seconds because of a slightly larger-than-usual
>>> commit, we need to fix it. This is pretty much on my plate right now,
>>> but I'll ask around for help on understanding FSFS details.
>>
>> The relevance of the question is that if you're not actually getting a
>> benefit from rep caching (a feature whose cost/benefit ratios I
>> personally felt were not strong enough to warrant it being turned on
>> by default), you could just avoid all the contention by not using it.
>
> With help from Branko last night from IRC, pulled out the following stats
> from the ASF repository:
> 15,612,528 representations total [1]
> 4,254,361 unique representations in the sqlitedb [2]
> (3.7x ratio)

I'm not sure how useful that number is.  Is everything in the repo in
the db, or only reps created since rep-sharing was enabled?  The more
relevant number is "what is the sum of all the reference count
numbers, compared to the 4.2 million number".

But more importantly, because the *only* advantage of rep-sharing is
that it potentially reduces disk use (there is absolutely no potential
time savings (unless you are very hopeful about disk cache) and there
is increased locking), the only relevant stats IMHO are "how much disk
space does the repo take up, compared to how much it would take up
without rep sharing... and how does that size delta affect the needs
of the ASF (cost of disks, backup speed, etc)".

--dave

> other misc stats:
> 2352 average size of a compressed rep [3]
> 16043 average size of expanded rep [4]
>
> [1] grep -a -r '^text:' $repos/db/revs | wc -l
> [2] select count(*) from rep_cache;
> [3] select AVG(size)  from rep_cache;
> [4] select AVG(expanded_size)  from rep_cache;
>



-- 
glasser@davidglasser.net | langtonlabs.org | flickr.com/photos/glasser/

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2404694

Re: Any FSFS rep-sharing experts out there?

Posted by Paul Querna <ch...@force-elite.com>.

On Mon, Oct 5, 2009 at 4:31 PM, David Glasser <gl...@davidglasser.net> wrote:
> On Mon, Oct 5, 2009 at 9:52 AM, Branko Čibej <br...@xbc.nu> wrote:
>> Daniel Shahaf wrote:
>>> Branko Cibej wrote on Mon, 5 Oct 2009 at 18:08 +0200:
>>> IIUC, the size of the DB is proportional to the number of (unique)
>>> representations.  This doesn't tell anything about the amount of space
>>> saved (by reusing representations).
>>>
>>
>> Oh, yes, you're right. Silly me.
>>
>> But anyway the question is irrelevant. If we manage to lock up the
>> server for tens of seconds because of a slightly larger-than-usual
>> commit, we need to fix it. This is pretty much on my plate right now,
>> but I'll ask around for help on understanding FSFS details.
>
> The relevance of the question is that if you're not actually getting a
> benefit from rep caching (a feature whose cost/benefit ratios I
> personally felt were not strong enough to warrant it being turned on
> by default), you could just avoid all the contention by not using it.

With help from Branko last night from IRC, pulled out the following stats
from the ASF repository:
15,612,528 representations total [1]
4,254,361 unique representations in the sqlitedb [2]
(3.7x ratio)

other misc stats:
2352 average size of a compressed rep [3]
16043 average size of expanded rep [4]

[1] grep -a -r '^text:' $repos/db/revs | wc -l
[2] select count(*) from rep_cache;
[3] select AVG(size)  from rep_cache;
[4] select AVG(expanded_size)  from rep_cache;
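
The ratio quoted above can be checked with a few lines of arithmetic. The sketch below uses only the two counts from this message; the variable names are made up for the example. It measures how many representation references point at an already-stored representation, not how much disk space is saved.

```python
# Counts reported for the ASF repository in this message.
total_reps = 15_612_528   # [1] 'text:' lines found under db/revs
unique_reps = 4_254_361   # [2] rows in the rep_cache table

# Average number of references per unique representation.
ratio = total_reps / unique_reps

# Fraction of references that reuse an existing representation.
shared_fraction = 1 - unique_reps / total_reps

print(f"references per unique rep: {ratio:.2f}")
print(f"fraction of references that are shared: {shared_fraction:.1%}")
```

Turning this into a disk-space figure would additionally require the size of each shared representation, which these two counts do not provide.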

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2404354

Re: Any FSFS rep-sharing experts out there?

Posted by David Glasser <gl...@davidglasser.net>.

On Mon, Oct 5, 2009 at 9:52 AM, Branko Čibej <br...@xbc.nu> wrote:
> Daniel Shahaf wrote:
>> Branko Cibej wrote on Mon, 5 Oct 2009 at 18:08 +0200:
>>
>>> David Glasser wrote:
>>>
>>>> Is the ASF getting a measurable benefit from using rep caching?
>>>>
>>>>
>>> They have a half-gig rep-cache.db, so I expect yes.
>>>
>>
>> IIUC, the size of the DB is proportional to the number of (unique)
>> representations.  This doesn't tell anything about the amount of space
>> saved (by reusing representations).
>>
>
> Oh, yes, you're right. Silly me.
>
> But anyway the question is irrelevant. If we manage to lock up the
> server for tens of seconds because of a slightly larger-than-usual
> commit, we need to fix it. This is pretty much on my plate right now,
> but I'll ask around for help on understanding FSFS details.

The relevance of the question is that if you're not actually getting a
benefit from rep caching (a feature whose cost/benefit ratios I
personally felt were not strong enough to warrant it being turned on
by default), you could just avoid all the contention by not using it.

--dave


-- 
glasser@davidglasser.net | langtonlabs.org | flickr.com/photos/glasser/

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2403913

Re: Any FSFS rep-sharing experts out there?

Posted by Branko Cibej <br...@xbc.nu>.

Daniel Shahaf wrote:
> Branko Cibej wrote on Mon, 5 Oct 2009 at 18:08 +0200:
>   
>> David Glasser wrote:
>>     
>>> Is the ASF getting a measurable benefit from using rep caching?
>>>   
>>>       
>> They have a half-gig rep-cache.db, so I expect yes.
>>     
>
> IIUC, the size of the DB is proportional to the number of (unique) 
> representations.  This doesn't tell anything about the amount of space 
> saved (by reusing representations).
>   

Oh, yes, you're right. Silly me.

But anyway the question is irrelevant. If we manage to lock up the
server for tens of seconds because of a slightly larger-than-usual
commit, we need to fix it. This is pretty much on my plate right now,
but I'll ask around for help on understanding FSFS details.

-- Brane

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2403808

Re: Any FSFS rep-sharing experts out there?

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.

Branko Cibej wrote on Mon, 5 Oct 2009 at 18:08 +0200:
> David Glasser wrote:
> > Is the ASF getting a measurable benefit from using rep caching?
> >   
> 
> They have a half-gig rep-cache.db, so I expect yes.

IIUC, the size of the DB is proportional to the number of (unique) 
representations.  This doesn't tell anything about the amount of space 
saved (by reusing representations).

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2403805

Re: Any FSFS rep-sharing experts out there?

Posted by Branko Cibej <br...@xbc.nu>.

David Glasser wrote:
> Is the ASF getting a measurable benefit from using rep caching?
>   

They have a half-gig rep-cache.db, so I expect yes.

> --dave
>
> On Mon, Oct 5, 2009 at 3:57 AM, Branko Cibej <br...@xbc.nu> wrote:
>   
>> If so, please help me figure out how to figure out issue #3506.
>>
>> -- Brane
>>
>> ------------------------------------------------------
>> http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2403518
>>
>>     
>
>
>
>

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2403782

Re: Any FSFS rep-sharing experts out there?

Posted by David Glasser <gl...@davidglasser.net>.

Is the ASF getting a measurable benefit from using rep caching?

--dave

On Mon, Oct 5, 2009 at 3:57 AM, Branko Cibej <br...@xbc.nu> wrote:
> If so, please help me figure out how to figure out issue #3506.
>
> -- Brane
>
> ------------------------------------------------------
> http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2403518
>



-- 
glasser@davidglasser.net | langtonlabs.org | flickr.com/photos/glasser/

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2403780