Posted to dev@subversion.apache.org by Julian Foad <ju...@gmail.com> on 2015/05/26 23:15:21 UTC

Queries about rep cache: get_shared_rep()

Index: subversion/libsvn_fs_fs/transaction.c
===================================================================
--- subversion/libsvn_fs_fs/transaction.c    (revision 1681856)
+++ subversion/libsvn_fs_fs/transaction.c    (working copy)
@@ -2243,12 +2243,14 @@ (representation_t **old_re
       const char *file_name
         = path_txn_sha1(fs, &rep->txn_id, rep->sha1_digest, scratch_pool);

       /* in our txn, is there a rep file named with the wanted SHA1?
          If so, read it and use that rep.
        */
+      /* ### Would it be faster (and so better) to just try reading it,
+             and handle ENOENT, instead of first checking for presence? */
       SVN_ERR(svn_io_check_path(file_name, &kind, scratch_pool));
       if (kind == svn_node_file)
         {
           svn_stringbuf_t *rep_string;
           SVN_ERR(svn_stringbuf_from_file2(&rep_string, file_name,
                                            scratch_pool));
@@ -2262,14 +2264,20 @@ get_shared_rep(representation_t **old_re

   /* A simple guard against general rep-cache induced corruption. */
   if ((*old_rep)->expanded_size != rep->expanded_size)
     {
       /* Make the problem show up in the server log.

-         Because not sharing reps is always a save option,
+         Because not sharing reps is always a safe option,
          terminating the request would be inappropriate.
+
+         ### [JAF] Why should we assume the cache is corrupt rather than the
+                   new rep is corrupt? Is this really the logic we want?
+
+                   Should we do something more forceful -- flag the cache as
+                   unusable, perhaps -- rather than just logging a warning?
        */
       svn_checksum_t checksum;
       checksum.digest = rep->sha1_digest;
       checksum.kind = svn_checksum_sha1;

       err = svn_error_createf(SVN_ERR_FS_CORRUPT, NULL,


- Julian

Re: Populating the rep-cache

Posted by Julian Foad <ju...@gmail.com>.
Philip Martin wrote:
> I've been thinking of writing some code to populate the rep-cache from
> existing revisions.  This code would parse the revision, a bit like
> verify, identify checksums in that revision and add any that are found
> to the rep-cache.  This would be time consuming if run on the whole
> repository but would run perfectly well in a separate process while the
> repository remains live.  It could also be run over a revision range
> rather than just the whole repository, and running on a single revision
> such as HEAD would be fast.

+1.

>> I believe the code will be relatively straightforward; if anything, it is
> the API that is more of a problem.
>
>  - We could add a public svn_fs_rep_cache().  This is backend specific
>    but there is precedent: we have svn_fs_berkeley_logfiles() and
>    svn_fs_pack().
>
>  - We could add a more general svn_fs_optimize().  This would do backend
>    specific optimizations that may change in future versions.  Perhaps
>    passing backend-specific flags?
>
>  - We could add the behaviour to svn_fs_recover() by revving the function
>    with a revision range.  This would "recover" the rep-cache after the
>    existing recovery.  At present recover is fast so to preserve that
>    the compatibility function would pass a revision range that is just
>    HEAD.
>
>  - We could avoid a public API and call some FSFS function from svnfsfs.
>
> I'll probably go with the last option initially.  Any comments?

I think the interface to this should be explicit, not hidden in a
generic 'optimize' or 'recover' function. The last option sounds good
as a starting point.

Other than that, I have no opinions on the API yet, nor on the
specific range of functionality that it should offer (examples:
revision ranges, validating existing entries, clearing part or all of
the cache).

> I should note that WANdisco has an interest in this code being
> developed.

I suppose many companies and power users have to deal with issues
where this would be useful.

It might also be useful to consider whether and how Subversion could
tell us whether the rep cache is up to date -- I haven't thought about
this, but as an initial idea tracking the last revision number N where
all revs [0 .. N] are known to be cached would be a possible starting
point for such a feature.

- Julian

Re: Populating the rep-cache

Posted by Johan Corveleyn <jc...@gmail.com>.
On Thu, May 28, 2015 at 6:00 PM, Stefan Fuhrmann
<st...@wandisco.com> wrote:
> On Wed, May 27, 2015 at 8:14 PM, Philip Martin <ph...@wandisco.com>
> wrote:
>>
>> Julian Foad <ju...@gmail.com> writes:
>>
>> > Stefan Fuhrmann wrote:
>> >> * clear the rep-cache.db
>> >
>> > Clearing the cache and continuing operation may make subsequent
>> > commits much larger than they should be, and there is no easy way to
>> > undo that if it happens.
>>
>> I've been thinking of writing some code to populate the rep-cache from
>> existing revisions.  This code would parse the revision, a bit like
>> verify, identify checksums in that revision and add any that are found
>> to the rep-cache.  This would be time consuming if run on the whole
>> repository but would run perfectly well in a separate process while the
>> repository remains live.  It could also be run over a revision range
>> rather than just the whole repository, and running on a single revision
>> such as HEAD would be fast.
>
>
> Makes sense.
>
>>
>> I believe the code will be relatively straightforward; if anything, it is
>> the API that is more of a problem.
>>
>>  - We could add a public svn_fs_rep_cache().  This is backend specific
>>    but there is precedent: we have svn_fs_berkeley_logfiles() and
>>    svn_fs_pack().
>>
>>  - We could add a more general svn_fs_optimize().  This would do backend
>>    specific optimizations that may change in future versions.  Perhaps
>>    passing backend-specific flags?
>
>
> I think svn_fs_optimize(bool online) would make sense
> in the longer term.
>
> In the "offline" case, it could do anything from removing
> duplicate reps as we build the cache to sharding repos
> or repacking shards. Not that I would want to implement
> any of that soon.

I was wondering about that too. I think repopulating the rep-cache
(without the need to take the repos offline) is very interesting, but
I immediately think: functionality to repopulate the rep-cache *and*
(optionally) rewrite rev files to let them use rep sharing (i.e.
effectively deduplicating the repository) ... that would be even
better.

But big +1 on the initial idea already for offering the ability to
rebuild a broken rep-cache (without having to dump/load).

-- 
Johan

Re: Populating the rep-cache

Posted by Stefan Fuhrmann <st...@wandisco.com>.
On Wed, May 27, 2015 at 8:14 PM, Philip Martin <ph...@wandisco.com>
wrote:

> Julian Foad <ju...@gmail.com> writes:
>
> > Stefan Fuhrmann wrote:
> >> * clear the rep-cache.db
> >
> > Clearing the cache and continuing operation may make subsequent
> > commits much larger than they should be, and there is no easy way to
> > undo that if it happens.
>
> I've been thinking of writing some code to populate the rep-cache from
> existing revisions.  This code would parse the revision, a bit like
> verify, identify checksums in that revision and add any that are found
> to the rep-cache.  This would be time consuming if run on the whole
> repository but would run perfectly well in a separate process while the
> repository remains live.  It could also be run over a revision range
> rather than just the whole repository, and running on a single revision
> such as HEAD would be fast.
>

Makes sense.


> I believe the code will be relatively straightforward; if anything, it is
> the API that is more of a problem.
>
>  - We could add a public svn_fs_rep_cache().  This is backend specific
>    but there is precedent: we have svn_fs_berkeley_logfiles() and
>    svn_fs_pack().
>
>  - We could add a more general svn_fs_optimize().  This would do backend
>    specific optimizations that may change in future versions.  Perhaps
>    passing backend-specific flags?
>

I think svn_fs_optimize(bool online) would make sense
in the longer term.

In the "offline" case, it could do anything from removing
duplicate reps as we build the cache to sharding repos
or repacking shards. Not that I would want to implement
any of that soon.

OTOH, a new FS API only makes sense if we can control
it nicely and generically from svnadmin or its ilk. It seems
to me that a generic "make stuff better" optimize run has
its merits (e.g. after an svnadmin upgrade) but most people
probably want to tune only specific aspects. That's because
they are likely to have large repos that they can't take
offline for long.


>  - We could add the behaviour to svn_fs_recover() by revving the function
>    with a revision range.  This would "recover" the rep-cache after the
>    existing recovery.  At present recover is fast so to preserve that
>    the compatibility function would pass a revision range that is just
>    HEAD.
>

There is nothing inherently wrong or broken with having an
incomplete rep cache. So, making this part of the recovery
procedure feels wrong.

>  - We could avoid a public API and call some FSFS function from svnfsfs.
>

That is probably the best place even longer-term.

-- Stefan^2.

Populating the rep-cache

Posted by Philip Martin <ph...@wandisco.com>.
Julian Foad <ju...@gmail.com> writes:

> Stefan Fuhrmann wrote:
>> * clear the rep-cache.db
>
> Clearing the cache and continuing operation may make subsequent
> commits much larger than they should be, and there is no easy way to
> undo that if it happens.

I've been thinking of writing some code to populate the rep-cache from
existing revisions.  This code would parse the revision, a bit like
verify, identify checksums in that revision and add any that are found
to the rep-cache.  This would be time consuming if run on the whole
repository but would run perfectly well in a separate process while the
repository remains live.  It could also be run over a revision range
rather than just the whole repository, and running on a single revision
such as HEAD would be fast.

I believe the code will be relatively straightforward; if anything, it is
the API that is more of a problem.

 - We could add a public svn_fs_rep_cache().  This is backend specific
   but there is precedent: we have svn_fs_berkeley_logfiles() and
   svn_fs_pack().

 - We could add a more general svn_fs_optimize().  This would do backend
   specific optimizations that may change in future versions.  Perhaps
   passing backend-specific flags?

 - We could add the behaviour to svn_fs_recover() by revving the function
   with a revision range.  This would "recover" the rep-cache after the
   existing recovery.  At present recover is fast so to preserve that
   the compatibility function would pass a revision range that is just
   HEAD.

 - We could avoid a public API and call some FSFS function from svnfsfs.

I'll probably go with the last option initially.  Any comments?

I should note that WANdisco has an interest in this code being
developed.

-- 
Philip Martin | Subversion Committer
WANdisco // *Non-Stop Data*

Re: Queries about rep cache: get_shared_rep()

Posted by Stefan Fuhrmann <st...@wandisco.com>.
On Wed, May 27, 2015 at 6:35 PM, Julian Foad <ju...@gmail.com> wrote:
> Stefan Fuhrmann wrote:
>> Alright. I gave it a bit more thought now.
>>
>> Whenever we encounter this mismatch, something pretty
>> bad likely happened to the repo - such as a failed restore
>> attempt. In turn, we can expect those situations to be
>> very rare - which means we can afford some disruption
>> for the user.
>>
>> I suggest that we do 3 things:
>>
>> * log the warning - for future reference, for being picked
>>   up by monitoring tools etc.
>
> We already do that.

Oh, absolutely. I just didn't mention it.

>> * clear the rep-cache.db
>
> Clearing the cache and continuing operation may make subsequent
> commits much larger than they should be, and there is no easy way to
> undo that if it happens.

Rep-sharing typically reduces the repo size by 25% (e.g. Apache)
to 60% (wordpress, inexperienced users using plain ADD for tags).

Assuming that most rep-sharing is relatively local, i.e. over the
span of a "few" revisions, e.g. due to catch-up merges between
branches, most of the inefficiency will only be temporary.
In short: no major impact.

> Attempting to clear the rep cache might itself fail in some way,
> depending on what kind of corruption has happened to it. It would also
> destroy the evidence of what went wrong.

That is a good point. Two good points, actually.

>> * fail the current commit
>>
>> That way, we can be quite sure that only valid data gets
>> committed.
>
> Failing the current commit will ensure that no potentially bad (but
> undiagnosed) response from the rep cache has already been used in an
> earlier part of the transaction. I suppose that's what you're thinking
> of. That makes sense to me.

Yes, that, and the rep cache also being used to validate the
incoming data - even if it is very unlikely that we mess up the
server-side SHA1 calculation of the fulltext stream.

>> Alternatively, we could block any commit
>> (inventing some new repo state) until the admin resolves
>> the situation manually. Not sure which one I would prefer.
>
> I suggest this is the best option, unless we specifically design, and
> the administrator specifically chooses, an option to have higher
> availability at the expense of disk space, fault diagnosis, and so on.

We could add a "continue-upon-failure" option to the
[rep-cache] section in fsfs.conf. Default would be "false".
If set to true, commits would not be held off by rep-cache
failures but the rep-cache would be disabled. If set to
false, the repo goes into an r/o state.
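Sketched as an fsfs.conf fragment, purely for illustration -- the section
placement and option name are hypothetical, just as floated above; no such
knob exists in released Subversion:

```ini
[rep-cache]
### Hypothetical option from this thread, not an existing fsfs.conf setting.
### false (default): a rep-cache consistency failure puts the repository
###                  into a read-only state until an admin intervenes.
### true:            commits continue, but the rep-cache is disabled.
continue-upon-failure = false
```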

>> On top of that, we should handle the other rep-cache.db
>> consistency checks (e.g. head vs. rev of latest entry)
>> the same way.
>
> That makes sense.
>
> I suggest all of this should be treated as a possible future
> enhancement, not anything urgent.

I agree. In particular because it will require a format bump
for putting the "r/o" or "corruption" indicator somewhere.

-- Stefan^2.

Re: Queries about rep cache: get_shared_rep()

Posted by Julian Foad <ju...@gmail.com>.
Stefan Fuhrmann wrote:
> Alright. I gave it a bit more thought now.
>
> Whenever we encounter this mismatch, something pretty
> bad likely happened to the repo - such as a failed restore
> attempt. In turn, we can expect those situations to be
> very rare - which means we can afford some disruption
> for the user.
>
> I suggest that we do 3 things:
>
> * log the warning - for future reference, for being picked
>   up by monitoring tools etc.

We already do that.

> * clear the rep-cache.db

Clearing the cache and continuing operation may make subsequent
commits much larger than they should be, and there is no easy way to
undo that if it happens.

Attempting to clear the rep cache might itself fail in some way,
depending on what kind of corruption has happened to it. It would also
destroy the evidence of what went wrong.

> * fail the current commit
>
> That way, we can be quite sure that only valid data gets
> committed.

Failing the current commit will ensure that no potentially bad (but
undiagnosed) response from the rep cache has already been used in an
earlier part of the transaction. I suppose that's what you're thinking
of. That makes sense to me.

> Alternatively, we could block any commit
> (inventing some new repo state) until the admin resolves
> the situation manually. Not sure which one I would prefer.

I suggest this is the best option, unless we specifically design, and
the administrator specifically chooses, an option to have higher
availability at the expense of disk space, fault diagnosis, and so on.

> On top of that, we should handle the other rep-cache.db
> consistency checks (e.g. head vs. rev of latest entry)
> the same way.

That makes sense.

I suggest all of this should be treated as a possible future
enhancement, not anything urgent.

- Julian

Re: Queries about rep cache: get_shared_rep()

Posted by Stefan Fuhrmann <st...@wandisco.com>.
On Wed, May 27, 2015 at 1:11 PM, Julian Foad <ju...@gmail.com> wrote:

> Stefan Fuhrmann wrote:
> > Julian Foad wrote:
> >> @@ -2262,14 +2264,20 @@ get_shared_rep(representation_t **old_re
> >>
> >>    /* A simple guard against general rep-cache induced corruption. */
> >>    if ((*old_rep)->expanded_size != rep->expanded_size)
> >>      {
> [...]
> >> +         ### [JAF] Why should we assume the cache is corrupt rather than the
> >> +                   new rep is corrupt? Is this really the logic we want?
> >
> > Hm. We don't say "the cache is corrupt" but log a warning with
> > a "FS corruption" error code and tell the user that there is an
> > inconsistency / mismatch.
>
> I meant that we proceed with storing the new data into the repository
> (ignoring the cache), which assumes the new data is correct and the
> cache is wrong.
>

> >> +                   Should we do something more forceful -- flag the cache as
> >> +                   unusable, perhaps -- rather than just logging a warning?
> >
> > I don't have a particularly strong opinion on that one. I guess
> > it is very unlikely that the mismatch has "legitimately" been
> > caused by e.g. a SHA1 collision. Maybe, we should reset /
> > clear the rep-cache.db at that point and say so in the warning.
>
> I don't know what would be best. It's probably "good enough" based on
> our empirical experience. But maybe we should think in terms of what
> guarantees we should expect from the cache. It doesn't have to return
> a result, but if it does then the result *must* be correct. If one
> result is demonstrably wrong (as in this code block) then maybe we
> should work on the assumption that the whole cache is bad, rather than
> run the risk that some other entries are wrong but happen to have a
> matching expanded-size.
>

Alright. I gave it a bit more thought now.

Whenever we encounter this mismatch, something pretty
bad likely happened to the repo - such as a failed restore
attempt. In turn, we can expect those situations to be
very rare - which means we can afford some disruption
for the user.

I suggest that we do 3 things:

* log the warning - for future reference, for being picked
  up by monitoring tools etc.
* clear the rep-cache.db
* fail the current commit

That way, we can be quite sure that only valid data gets
committed. Alternatively, we could block any commit
(inventing some new repo state) until the admin resolves
the situation manually. Not sure which one I would prefer.

On top of that, we should handle the other rep-cache.db
consistency checks (e.g. head vs. rev of latest entry)
the same way.

-- Stefan^2.

Re: Queries about rep cache: get_shared_rep()

Posted by Julian Foad <ju...@gmail.com>.
Stefan Fuhrmann wrote:
> Julian Foad wrote:
>> +      /* ### Would it be faster (and so better) to just try reading it,
>> +             and handle ENOENT, instead of first checking for presence? */
>>        SVN_ERR(svn_io_check_path(file_name, &kind, scratch_pool));
>>        if (kind == svn_node_file)
>>          {
>>            svn_stringbuf_t *rep_string;
>>            SVN_ERR(svn_stringbuf_from_file2(&rep_string, file_name,
>>                                             scratch_pool));
>
> This checks for duplicate representations within the same txn.
> It mainly finds them in atypical or infrequent scenarios [...]
>
> I personally prefer checking for "preconditions" over just trying
> and then dissecting various error cases. Apart from that style
> issue, frequently constructing error codes can be expensive
> (when messages are localized).

Ack -- this way is reasonable in a case like this where it's usually
expected to fail. (It does have a disadvantage, which is that the
(rare) failure mode where the file becomes unreadable between the
check and the open will never get test coverage.)

>> @@ -2262,14 +2264,20 @@ get_shared_rep(representation_t **old_re
>>
>>    /* A simple guard against general rep-cache induced corruption. */
>>    if ((*old_rep)->expanded_size != rep->expanded_size)
>>      {
[...]
>> +         ### [JAF] Why should we assume the cache is corrupt rather than the
>> +                   new rep is corrupt? Is this really the logic we want?
>
> Hm. We don't say "the cache is corrupt" but log a warning with
> a "FS corruption" error code and tell the user that there is an
> inconsistency / mismatch.

I meant that we proceed with storing the new data into the repository
(ignoring the cache), which assumes the new data is correct and the
cache is wrong.

> It is simply our experience so far that it is more likely that
> the rep-cache.db contents is at fault rather than the rep
> we just calculated the sha1 for.

Acknowledged.

>> +                   Should we do something more forceful -- flag the cache as
>> +                   unusable, perhaps -- rather than just logging a warning?
>
> I don't have a particularly strong opinion on that one. I guess
> it is very unlikely that the mismatch has "legitimately" been
> caused by e.g. a SHA1 collision. Maybe, we should reset /
> clear the rep-cache.db at that point and say so in the warning.

I don't know what would be best. It's probably "good enough" based on
our empirical experience. But maybe we should think in terms of what
guarantees we should expect from the cache. It doesn't have to return
a result, but if it does then the result *must* be correct. If one
result is demonstrably wrong (as in this code block) then maybe we
should work on the assumption that the whole cache is bad, rather than
run the risk that some other entries are wrong but happen to have a
matching expanded-size.

- Julian

Re: Queries about rep cache: get_shared_rep()

Posted by Stefan Fuhrmann <st...@wandisco.com>.
On Tue, May 26, 2015 at 11:15 PM, Julian Foad <ju...@gmail.com> wrote:

> Index: subversion/libsvn_fs_fs/transaction.c
> ===================================================================
> --- subversion/libsvn_fs_fs/transaction.c    (revision 1681856)
> +++ subversion/libsvn_fs_fs/transaction.c    (working copy)
> @@ -2243,12 +2243,14 @@ (representation_t **old_re
>        const char *file_name
>          = path_txn_sha1(fs, &rep->txn_id, rep->sha1_digest, scratch_pool);
>
>        /* in our txn, is there a rep file named with the wanted SHA1?
>           If so, read it and use that rep.
>         */
> +      /* ### Would it be faster (and so better) to just try reading it,
> +             and handle ENOENT, instead of first checking for presence? */
>        SVN_ERR(svn_io_check_path(file_name, &kind, scratch_pool));
>        if (kind == svn_node_file)
>          {
>            svn_stringbuf_t *rep_string;
>            SVN_ERR(svn_stringbuf_from_file2(&rep_string, file_name,
>                                             scratch_pool));
>

This checks for duplicate representations within the same txn.
It mainly finds them in atypical or infrequent scenarios like:

* setting the same property list on multiple nodes
* committing the same contents to multiple branches in the
  same txn
* loading a non-incremental dump file

I personally prefer checking for "preconditions" over just trying
and then dissecting various error cases. Apart from that style
issue, frequently constructing error codes can be expensive
(when messages are localized).

> @@ -2262,14 +2264,20 @@ get_shared_rep(representation_t **old_re
>
>    /* A simple guard against general rep-cache induced corruption. */
>    if ((*old_rep)->expanded_size != rep->expanded_size)
>      {
>        /* Make the problem show up in the server log.
>
> -         Because not sharing reps is always a save option,
> +         Because not sharing reps is always a safe option,
>           terminating the request would be inappropriate.
>

Oops. Fixed in r1681949.

> +
> +         ### [JAF] Why should we assume the cache is corrupt rather than the
> +                   new rep is corrupt? Is this really the logic we want?
>

Hm. We don't say "the cache is corrupt" but log a warning with
a "FS corruption" error code and tell the user that there is an
inconsistency / mismatch.

It is simply our experience so far that it is more likely that
the rep-cache.db contents is at fault rather than the rep
we just calculated the sha1 for.


> +
> +                   Should we do something more forceful -- flag the cache as
> +                   unusable, perhaps -- rather than just logging a warning?
>         */
>

I don't have a particularly strong opinion on that one. I guess
it is very unlikely that the mismatch has "legitimately" been
caused by e.g. a SHA1 collision. Maybe, we should reset /
clear the rep-cache.db at that point and say so in the warning.

-- Stefan^2.

Re: Queries about rep cache: get_shared_rep()

Posted by Julian Foad <ju...@gmail.com>.
Hi Bert.

Bert Huijben wrote:
> On Windows opening the file is sensitive to outside interactions and may
> trigger a retry loop. [...] A filestat is +- a constant time operation that
> doesn't have these problems.

OK...

> So this really depends on how common all cases are. If not existing or
> possibly locked is common, then statting first could certainly be useful...
>
> But if it is +- an error if the file doesn't exist and the file must be
> opened in almost every case then the open and handle errors keeps the code
> clean.

In this case we're looking to see if the rep is a duplicate of one
that has already been inserted in the current transaction. In the
majority of cases in typical work flows, the requested file will not
exist.

Therefore I think it is safe to say that the current code (statting
first) will be reasonably efficient overall.

However, I don't think that it would necessarily be inefficient to
just try opening the file. It's important to distinguish *requesting*
to open a file at a path where a file may or may not be present and
accessible, from actually opening the file. At what point do these
"outside interactions" typically occur? I spent a few minutes Googling
("file system filter driver" and "IRP_MJ_CREATE" are relevant terms)
but couldn't find an answer. I imagine if the file does not exist then
there would *not* be significant delays and retries due to waiting for
a virus scanner (etc.) if we simply tried to open it.

In conclusion, it seems to me now that the code is fine as it is, and
the question of whether just trying to open a non-existent file would
be generally efficient or inefficient is of only academic interest at
this point.

- Julian


> From: Julian Foad
> Index: subversion/libsvn_fs_fs/transaction.c
> ===================================================================
> --- subversion/libsvn_fs_fs/transaction.c    (revision 1681856)
> +++ subversion/libsvn_fs_fs/transaction.c    (working copy)
> @@ -2243,12 +2243,14 @@ (representation_t **old_re
>        const char *file_name
>          = path_txn_sha1(fs, &rep->txn_id, rep->sha1_digest, scratch_pool);
>
>        /* in our txn, is there a rep file named with the wanted SHA1?
>           If so, read it and use that rep.
>         */
> +      /* ### Would it be faster (and so better) to just try reading it,
> +             and handle ENOENT, instead of first checking for presence? */
>        SVN_ERR(svn_io_check_path(file_name, &kind, scratch_pool));
>        if (kind == svn_node_file)
>          {
>            svn_stringbuf_t *rep_string;
>            SVN_ERR(svn_stringbuf_from_file2(&rep_string, file_name,
>                                             scratch_pool));
> @@ -2262,14 +2264,20 @@ get_shared_rep(representation_t **old_re
>
>    /* A simple guard against general rep-cache induced corruption. */
>    if ((*old_rep)->expanded_size != rep->expanded_size)
>      {
>        /* Make the problem show up in the server log.
>
> -         Because not sharing reps is always a save option,
> +         Because not sharing reps is always a safe option,
>           terminating the request would be inappropriate.
> +
> +         ### [JAF] Why should we assume the cache is corrupt rather than the
> +                   new rep is corrupt? Is this really the logic we want?
> +
> +                   Should we do something more forceful -- flag the cache as
> +                   unusable, perhaps -- rather than just logging a warning?
>         */

RE: Queries about rep cache: get_shared_rep()

Posted by Bert Huijben <be...@qqmail.nl>.
On Windows opening the file is sensitive to outside interactions and may trigger a retry loop. E.g. a virus scanner that scans every file before opening by hooking the OS. A filestat is +- a constant-time operation that doesn't have these problems.

So this really depends on how common all cases are. If not existing or possibly locked is common, then statting first could certainly be useful... But if it is +- an error if the file doesn't exist and the file must be opened in almost every case then the open and handle errors keeps the code clean.

Not sure what case applies here, but without looking at the code I would guess that in > 99.9% of the cases we can assume the cache is correct... And all else is an exception that might have a slight performance hit.

Bert

-----Original Message-----
From: "Julian Foad" <ju...@gmail.com>
Sent: 26-5-2015 23:16
To: "dev" <de...@subversion.apache.org>
Subject: Queries about rep cache: get_shared_rep()

Index: subversion/libsvn_fs_fs/transaction.c
===================================================================
--- subversion/libsvn_fs_fs/transaction.c    (revision 1681856)
+++ subversion/libsvn_fs_fs/transaction.c    (working copy)
@@ -2243,12 +2243,14 @@ (representation_t **old_re
       const char *file_name
         = path_txn_sha1(fs, &rep->txn_id, rep->sha1_digest, scratch_pool);

       /* in our txn, is there a rep file named with the wanted SHA1?
          If so, read it and use that rep.
        */
+      /* ### Would it be faster (and so better) to just try reading it,
+             and handle ENOENT, instead of first checking for presence? */
       SVN_ERR(svn_io_check_path(file_name, &kind, scratch_pool));
       if (kind == svn_node_file)
         {
           svn_stringbuf_t *rep_string;
           SVN_ERR(svn_stringbuf_from_file2(&rep_string, file_name,
                                            scratch_pool));
@@ -2262,14 +2264,20 @@ get_shared_rep(representation_t **old_re

   /* A simple guard against general rep-cache induced corruption. */
   if ((*old_rep)->expanded_size != rep->expanded_size)
     {
       /* Make the problem show up in the server log.

-         Because not sharing reps is always a save option,
+         Because not sharing reps is always a safe option,
          terminating the request would be inappropriate.
+
+         ### [JAF] Why should we assume the cache is corrupt rather than the
+                   new rep is corrupt? Is this really the logic we want?
+
+                   Should we do something more forceful -- flag the cache as
+                   unusable, perhaps -- rather than just logging a warning?
        */
       svn_checksum_t checksum;
       checksum.digest = rep->sha1_digest;
       checksum.kind = svn_checksum_sha1;

       err = svn_error_createf(SVN_ERR_FS_CORRUPT, NULL,


- Julian