You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@subversion.apache.org by Evgeny Kotkov <ev...@visualsvn.com> on 2015/05/21 17:23:22 UTC

Possible incompatibility of svn_repos_verify_fs2() in 1.9.0-rc1

Subversion 1.9.0-rc1 introduced a new svnadmin verify --keep-going mode [1].
In order to achieve this, we added a svn_repos_verify_fs3() API function and
deprecated its predecessor, svn_repos_verify_fs2(), that now calls the newer
function with keep_going argument set to FALSE.

However, svn_repos_verify_fs2() behaves differently in 1.9.0-rc1 and 1.8.13
when it comes to error reporting.  As an example, the following code ...

  SVN_ERR(svn_repos_verify_fs2(repos, 1, 5, NULL, NULL, NULL, NULL, pool));

...would return two different errors depending on the binaries being used,
assuming that one of the revisions in the [r1:r5] range is corrupted:

 (With 1.8.13)  E160004: Checksum mismatch in item at offset 0 of
    length 59 bytes in file path/asf/db/revs/0/2

 (With 1.9.0-rc1)  E165011: Repository 'path/asf' failed to verify

Please note that the second error is generic, and that the actual information
about the error is lost.  Existing API users of svn_repos_verify_fs2() are
going to lose an important bit of information about the root cause of the
corruption unless they update the code.  Furthermore, even if an API caller
subscribes to notifications with a notify_func / notify_baton pair, it would
still be necessary to update the code to handle svn_repos_notify_failure that
did not exist before 1.9.

I did not find any discussion on the matter or the corresponding entry in
/notes/api-errata [2] that would describe this behavior change, so I am
inclined to think that this is accidental and, probably, undesirable.

There is an option of restoring the 1.8 behavior when svn_repos_verify_fs3()
is called with keep_going set to FALSE.  We could immediately yield the error
in this case, and only use the notifications / generic error when keep_going
is set to TRUE.  Doing so would change the output of svnadmin verify without
--keep-going, because now it wouldn't include the generic error message in
the end.  I attached a proof-of-concept patch, that doesn't change the test
expectations and the documentation, i.e., it only includes the core change,
mostly as an illustration to these points.

Thoughts?

[1] https://svn.apache.org/r1492651
[2] https://svn.apache.org/repos/asf/subversion/trunk/notes/api-errata/1.9


Regards,
Evgeny Kotkov

Re: Possible incompatibility of svn_repos_verify_fs2() in 1.9.0-rc1

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Evgeny Kotkov wrote on Thu, May 21, 2015 at 18:23:22 +0300:
> @@ -2464,24 +2452,26 @@ svn_repos_verify_fs3(svn_repos_t *repos,
>                                    cancel_func, cancel_baton,
>                                    iterpool);
>  
> +        if (err && err->apr_err == SVN_ERR_CANCELLED)
> +        ...
> +        else if (err && !keep_going)
> +        ...
> +        else if (err && keep_going)
> +        ...
> +        else if (!err && notify_func)

The "!err &&" part of last condition is tautologic.  I'd remove it,
leaving the condition as just "[else] if (notify_func)".

Re: Possible incompatibility of svn_repos_verify_fs2() in 1.9.0-rc1

Posted by Stefan Fuhrmann <st...@wandisco.com>.
On Wed, Jun 3, 2015 at 2:33 PM, Branko Čibej <br...@wandisco.com> wrote:

> On 21.05.2015 17:23, Evgeny Kotkov wrote:
> > Subversion 1.9.0-rc1 introduced a new svnadmin verify --keep-going mode
> [1].
> > In order to achieve this, we added a svn_repos_verify_fs3() API function
> and
> > deprecated its predecessor, svn_repos_verify_fs2(), that now calls the
> newer
> > function with keep_going argument set to FALSE.
> >
> > However, svn_repos_verify_fs2() behaves differently in 1.9.0-rc1 and
> 1.8.13
> > when it comes to error reporting.  As an example, the following code ...
> >
> >   SVN_ERR(svn_repos_verify_fs2(repos, 1, 5, NULL, NULL, NULL, NULL,
> pool));
> >
> > ...would return two different errors depending on the binaries being
> used,
> > assuming that one of the revisions in the [r1:r5] range is corrupted:
> >
> >  (With 1.8.13)  E160004: Checksum mismatch in item at offset 0 of
> >     length 59 bytes in file path/asf/db/revs/0/2
> >
> >  (With 1.9.0-rc1)  E165011: Repository 'path/asf' failed to verify
> >
> > Please note that the second error is generic, and that the actual
> information
> > about the error is lost.  Existing API users of svn_repos_verify_fs2()
> are
> > going to lose an important bit of information about the root cause of the
> > corruption unless they update the code.  Furthermore, even if an API
> caller
> > subscribes to notifications with a notify_func / notify_baton pair, it
> would
> > still be necessary to update the code to handle svn_repos_notify_failure
> that
> > did not exist before 1.9.
> >
> > I did not find any discussion on the matter or the corresponding entry in
> > /notes/api-errata [2] that would describe this behavior change, so I am
> > inclined to think that this is accidental and, probably, undesirable.
> >
> > There is an option of restoring the 1.8 behavior when
> svn_repos_verify_fs3()
> > is called with keep_going set to FALSE.  We could immediately yield the
> error
> > in this case, and only use the notifications / generic error when
> keep_going
> > is set to TRUE.  Doing so would change the output of svnadmin verify
> without
> > --keep-going, because now it wouldn't include the generic error message
> in
> > the end.  I attached a proof-of-concept patch, that doesn't change the
> test
> > expectations and the documentation, i.e., it only includes the core
> change,
> > mostly as an illustration to these points.
> >
> > Thoughts?
> >
> > [1] https://svn.apache.org/r1492651
> > [2]
> https://svn.apache.org/repos/asf/subversion/trunk/notes/api-errata/1.9
>
> I completed your patch and committed the fix in r1683311. Please review!
>

Hi Evgeny & Brane,

r1683311 looks good and I'll approve it for 1.9.x in a minute.

The reason why the FS backend returned an REPOS error
is its suggestive name. After all, the repos is corrupt ...
But that temptation has been fixed as well now.

On a more meta note concerning repos verification, we aren't
too thorough in what we test and errors that are reported often
don't give the admin a clue on how they might fix things. For
the first part, it would be nice to be able to test that a repo is
writeable. The second part probably needs planning / a concept
of which kind of problems we test for explicitly, e.g. missing
or superfluous files. Our "lets try to read all data" stage would
then be only the last step in the process.

Nothing urgent but maybe something to talk about in Berlin.

-- Stefan^2.

Re: Possible incompatibility of svn_repos_verify_fs2() in 1.9.0-rc1

Posted by Evgeny Kotkov <ev...@visualsvn.com>.
Branko Čibej <br...@wandisco.com> writes:

>> Fine, but I suspect that this fix probably isn't going to address the very
>> last problem in the --keep-going code.  It would be a pity to delay
>> Subversion 1.9.0 because of that.
>
> Let's not make the perfect the enemy of the good enough; we'd never have
> released 1.0 otherwise. :)

Well, I was mostly thinking about possible regressions caused by the addition
of this feature — such as the svn_repos_verify_fs2() API incompatibility that
we were discussing originally.


Regards,
Evgeny Kotkov

Re: Possible incompatibility of svn_repos_verify_fs2() in 1.9.0-rc1

Posted by Evgeny Kotkov <ev...@visualsvn.com>.
Branko Čibej <br...@wandisco.com> writes:

>> With that in mind, I am -0 on the corresponding backport proposal.
>
> I removed the backport proposal until we get this sorted out.

Ack.

>> I sketched the other possible option with redesigning svn_repos_verify_fs3()
>> API to only report errors via the notify_func(), please see the attached
>> patch.  Although I won't insist on going this way, I like it better, as it
>> allows us to get rid of things like b->silent_errors, b->silent_running,
>> juggling with standard streams and API that can yield the same error in
>> two different ways.
>>
>> What do you think?
>
> IIRC Ivan already proposed something along these lines?

Right, that's the second known solution for this problem.  I believe it was
originally proposed in http://svn.haxx.se/dev/archive-2015-06/0018.shtml, and
I implemented it in the aforementioned patch so that we could make a choice
between two implementations instead of comparing an existing implementation
with a concept.

> My main objection to this approach is that it breaks all the API
> patterns we've ever had: it creates a function that does not return an
> error even though it clearly fails, and relies on some notification
> callback to report actual errors. This is, IMNSHO, fundamentally broken.
> We've relied throughout our code on the fact that it's safe to wrap any
> function that returns an svn_error_t* with SVN_ERR() and not have to
> worry about missing errors. Your proposal throws this pattern out the
> window; if we adopt it, we'll have to start checking API docs to see if
> the svn_error_t* actually guarantees that the caller gets an error when
> something fails. That's really, really painful.

I don't think that this approach actually breaks the SVN_ERR() pattern, as
the operation can still fail itself, say, with SVN_ERR_REPOS_BAD_ARGS or
SVN_ERR_CANCELLED.  The difference is in that we start treating errors that
happen, e.g., during revision replay, as the *result* of the operation.  In
this design, a successfully completed svn_repos_verify_fs3() call means that
nothing prevented us from executing a set of checks for the revisions in the
[start_rev, end_rev] range and that we shared our findings.  A failed verify
call, on the contrary, means that something prevented us from executing this
set of checks and that the reason for that is stated in returned svn_error_t.

The bright side of this approach, as I see it, is that don't put the burden
of guessing the right error to deal with on the caller (like we did with the
b->silent_errors field), as the it now always goes through the notify_func().
Boolean arguments of svn_repos_verify_fs3() such as keep_going, metadata_only
and check_normalization are a hint on where to stop or what to check, i.e.,
they affect the set of performed checks, but don't alter the way of delivering
the result.  Furthermore, this approach allows us not to invent an error like
"Verification of 10 out of 200 revisions failed ..." in the API itself — just
because we *need* to return something.  We could do that in our UI code
(svnadmin.c), but it's no longer mandatory.

Indeed, there is a drawback of requiring a compatibility wrapper that does
more that a single function call with appropriate arguments, but I see it as
a necessary price for a non-contradicting API.

Does it make sense?

> My minor objection is that the compatibility wrapper suddenly becomes
> comples, whereas the usual pattern is just wrapping the new function
> with a slightly different set of parameters. This makes maintaining the
> compatibility wrappers just that bit more complex. However, I could live
> with that if my former objection was resolved.

Ack.


Regards,
Evgeny Kotkov

Re: Possible incompatibility of svn_repos_verify_fs2() in 1.9.0-rc1

Posted by Evgeny Kotkov <ev...@visualsvn.com>.
Evgeny Kotkov <ev...@visualsvn.com> writes:

> See the attached patch.  Please note that the patch is not a continuation of
> what I posted in my last e-mail (verify-fixup-squashed-v1.patch.txt), but
> from my point of view is better and should address the raised concerns, such
> as having a non-trivial compatibility wrapper.

I should also mention that the patch doesn't include the required changes for
bindings (such as JavaHL).  I'll do that separately, if we agree on the core
change.


Regards,
Evgeny Kotkov

Re: Possible incompatibility of svn_repos_verify_fs2() in 1.9.0-rc1

Posted by Branko Čibej <br...@wandisco.com>.
On 29.06.2015 23:33, Evgeny Kotkov wrote:
> Evgeny Kotkov <ev...@visualsvn.com> writes:
>
>> Do you plan on voting for the corresponding group?  If yes, could you please
>> extend the nomination with this JavaHL change?  I am then going to look into
>> it more carefully and will update my vote accordingly.
> On the other hand, we're just one vote short with the core change, and it's
> a blocker for 1.9.0.  I think that we should first finish backporting it, and
> then nominate / backport the binding update, as it only requires two votes
> in /STATUS.

Approved; and the JavaHL backport proposal is in STATUS now, too.

-- Brane


Re: Possible incompatibility of svn_repos_verify_fs2() in 1.9.0-rc1

Posted by Evgeny Kotkov <ev...@visualsvn.com>.
Evgeny Kotkov <ev...@visualsvn.com> writes:

> Do you plan on voting for the corresponding group?  If yes, could you please
> extend the nomination with this JavaHL change?  I am then going to look into
> it more carefully and will update my vote accordingly.

On the other hand, we're just one vote short with the core change, and it's
a blocker for 1.9.0.  I think that we should first finish backporting it, and
then nominate / backport the binding update, as it only requires two votes
in /STATUS.


Regards,
Evgeny Kotkov

Re: Possible incompatibility of svn_repos_verify_fs2() in 1.9.0-rc1

Posted by Evgeny Kotkov <ev...@visualsvn.com>.
Branko Čibej <br...@wandisco.com> writes:

>> And I have the JavaHL bits done but now I have to leave in 10 minutes,
>> which isn't enough time to review and write the log message. Will commit
>> this evening.
>
> r1688273; I propose we add it to the existing backport proposal.

>From a quick glance, r1688273 looks good to me — apart from a couple of
typos in the documentation:

+/**
+ * Create a new object and store the Java object.
+ * @param notify    global reference to the Java object
+ */
+ReposVerifyCallback::ReposVerifyCallback(jobject verify_cb)

(...)

+  /**
+   * Implementation of the svn_repos_verify_callback_t API.
+   *
+   * @param baton notification instance is passed using this parameter
+   * @param notify all the information about the event
+   * @param pool An APR pool from which to allocate memory.
+   */
+  static svn_error_t * callback(void *baton,
+                                svn_revnum_t revision,
+                                svn_error_t *verify_err,
+                                apr_pool_t *scratch_pool);

(...)

+     * This callback method is invoked every time {@link ISVNRepos#verify}
+     * encounters and error.

Do you plan on voting for the corresponding group?  If yes, could you please
extend the nomination with this JavaHL change?  I am then going to look into
it more carefully and will update my vote accordingly.


Regards,
Evgeny Kotkov

Re: Possible incompatibility of svn_repos_verify_fs2() in 1.9.0-rc1

Posted by Branko Čibej <br...@wandisco.com>.
On 29.06.2015 12:20, Branko Čibej wrote:
> On 28.06.2015 22:15, Evgeny Kotkov wrote:
>> Branko Čibej <br...@wandisco.com> writes:
>>
>>>> It would be a nice thing to have, but I am thinking that we should first
>>>> nominate this change for a backport to 1.9.x, as we are dealing with an
>>>> API change.
>>>>
>>>> Can we do that without having a complete JavaHL wrapper for this?
>>> Of course. We've never treated incomplete bindings as showstoppers. I'm half
>>> done with the JavaHL change, too; will probably finish tonight.
>> Thanks for confirming my thoughts.  I nominated the whole group of changes
>> (r1684940, r1685034, r1687769, r1687776) for a backport to /branches/1.9.x.
> And I have the JavaHL bits done but now I have to leave in 10 minutes,
> which isn't enough time to review and write the log message. Will commit
> this evening.

r1688273; I propose we add it to the existing backport proposal.

-- Brane


Re: Possible incompatibility of svn_repos_verify_fs2() in 1.9.0-rc1

Posted by Branko Čibej <br...@wandisco.com>.
On 28.06.2015 22:15, Evgeny Kotkov wrote:
> Branko Čibej <br...@wandisco.com> writes:
>
>>> It would be a nice thing to have, but I am thinking that we should first
>>> nominate this change for a backport to 1.9.x, as we are dealing with an
>>> API change.
>>>
>>> Can we do that without having a complete JavaHL wrapper for this?
>> Of course. We've never treated incomplete bindings as showstoppers. I'm half
>> done with the JavaHL change, too; will probably finish tonight.
> Thanks for confirming my thoughts.  I nominated the whole group of changes
> (r1684940, r1685034, r1687769, r1687776) for a backport to /branches/1.9.x.

And I have the JavaHL bits done but now I have to leave in 10 minutes,
which isn't enough time to review and write the log message. Will commit
this evening.

-- Brane


Re: Possible incompatibility of svn_repos_verify_fs2() in 1.9.0-rc1

Posted by Evgeny Kotkov <ev...@visualsvn.com>.
Branko Čibej <br...@wandisco.com> writes:

>> It would be a nice thing to have, but I am thinking that we should first
>> nominate this change for a backport to 1.9.x, as we are dealing with an
>> API change.
>>
>> Can we do that without having a complete JavaHL wrapper for this?
>
> Of course. We've never treated incomplete bindings as showstoppers. I'm half
> done with the JavaHL change, too; will probably finish tonight.

Thanks for confirming my thoughts.  I nominated the whole group of changes
(r1684940, r1685034, r1687769, r1687776) for a backport to /branches/1.9.x.


Regards,
Evgeny Kotkov

Re: Possible incompatibility of svn_repos_verify_fs2() in 1.9.0-rc1

Posted by Branko Čibej <br...@wandisco.com>.
On 26 Jun 2015 5:25 pm, "Evgeny Kotkov" <ev...@visualsvn.com> wrote:
>
> Branko Čibej <br...@wandisco.com> writes:
>
> > I like this a lot. Much better than my hacks upon hacks. :)
> >
> > Please go ahead and commit this. As you said, JavaHL would break after
> > this change. If you like, I can fix it afterwards, but I'm on a rather
> > tight schedule tomorrow and on the week-end so I very likely couldn't do
> > that before Monday.
>
> Thanks!  I committed the patch in r1687769 and r1687776 [1,2] (second
patch
> improves the documentation).
>
> As for the JavaHL bindings, r1687769 contains a minimal necessary change
for
> them, but doesn't yet allow passing a custom verify callback when calling
> SVNRepos.verify().  It would be a nice thing to have, but I am thinking
that
> we should first nominate this change for a backport to 1.9.x, as we are
> dealing with an API change.
>
> Can we do that without having a complete JavaHL wrapper for this?

Of course. We've never treated incomplete bindings as showstoppers. I'm
half done with the JavaHL change, too; will probably finish tonight.

>
> [1] https://svn.apache.org/r1687769
> [1] https://svn.apache.org/r1687776
>
>
> Regards,
> Evgeny Kotkov

Re: Possible incompatibility of svn_repos_verify_fs2() in 1.9.0-rc1

Posted by Evgeny Kotkov <ev...@visualsvn.com>.
Branko Čibej <br...@wandisco.com> writes:

> I like this a lot. Much better than my hacks upon hacks. :)
>
> Please go ahead and commit this. As you said, JavaHL would break after
> this change. If you like, I can fix it afterwards, but I'm on a rather
> tight schedule tomorrow and on the week-end so I very likely couldn't do
> that before Monday.

Thanks!  I committed the patch in r1687769 and r1687776 [1,2] (second patch
improves the documentation).

As for the JavaHL bindings, r1687769 contains a minimal necessary change for
them, but doesn't yet allow passing a custom verify callback when calling
SVNRepos.verify().  It would be a nice thing to have, but I am thinking that
we should first nominate this change for a backport to 1.9.x, as we are
dealing with an API change.

Can we do that without having a complete JavaHL wrapper for this?

[1] https://svn.apache.org/r1687769
[1] https://svn.apache.org/r1687776


Regards,
Evgeny Kotkov

Re: Possible incompatibility of svn_repos_verify_fs2() in 1.9.0-rc1

Posted by Branko Čibej <br...@wandisco.com>.
On 25.06.2015 18:11, Evgeny Kotkov wrote:
> Branko Čibej <br...@wandisco.com> writes:
>
>> Go ahead; I won't be able to for the next few days. Please remember to
>> remove the new error code, too, if we won't be using it any more.
> With a bit of trial and error, I was able to come up with a complete and
> tested solution that I find suitable in terms of the API design and the
> calling site (svnadmin.c).  The ideas are partly borrowed from the already
> mentioned svn_fs_lock_many() and svn_fs_lock_callback_t implementations.
>
> See the attached patch.  Please note that the patch is not a continuation of
> what I posted in my last e-mail (verify-fixup-squashed-v1.patch.txt), but from
> my point of view is better and should address the raised concerns, such as
> having a non-trivial compatibility wrapper.
>
> Here is the log message:
> [[[
> Reimplement svn_repos_verify_fs3() to support an arbitrary callback that
> receives the information about an encountered problem and lets the caller
> decide on what happens next.  This supersedes the keep_going argument for
> this API.  A callback is optional; the behavior of this API if the callback
> is not provided is equivalent to how svn_repos_verify_fs2() deals with
> encountered errors.  This allows seamless migration to the new API, if the
> callback is not necessary.  The idea is partly taken from how our existing
> svn_fs_lock_many() API works with a svn_fs_lock_callback_t and passes error
> information to the caller.
>
> Immediately use the new API to provide an alternative solution for the
> encountered problem with 'svnadmin verify --keep-going -q' (see r1684940)
> being useless in terms that it was only giving an indication of whether a
> particular repository passes the verification or not, without providing a
> root cause (details) of what's wrong.
>
> Discussion can be found in http://svn.haxx.se/dev/archive-2015-05/0141.shtml
> (Subject: "Possible incompatibility of svn_repos_verify_fs2() in 1.9.0-rc1")
>
> * subversion/include/svn_error_codes.h
>   (SVN_ERR_REPOS_VERIFY_FAILED): Remove this error code, as we no longer
>    need to send a specific error from within svn_repos_verify_fs3().
>   (SVN_ERR_CL_REPOS_VERIFY_FAILED): New.
>
> * subversion/include/svn_repos.h
>   (svn_repos_notify_action_t): Remove svn_repos_notify_failure.
>   (svn_repos_notify_t): Remove 'err' field, as it is no longer needed.
>   (svn_repos_verify_callback_t): New optional callback type to be used with
>    svn_repos_verify_fs3().
>   (svn_repos_verify_fs3): Drop 'keep_going' argument in favor of accepting a
>    svn_repos_verify_callback_t.  Update the docstring accordingly.
>   (svn_repos_verify_fs3): Update the docstring for this deprecated function.

Typo; you mean 'svn_repos_verify_fs2' here.

> * subversion/libsvn_repos/deprecated.c
>   (svn_repos_verify_fs2): Update the call to svn_repos_verify_fs3() in this
>    compatibility wrapper.  Don't pass the verify callback.
>
> * subversion/libsvn_repos/dump.c
>   (notify_verification_error): Remove; this function is no longer required.
>   (report_error): New helper function.
>   (svn_repos_verify_fs3): In case we've got a svn_repos_verify_callback_t,
>    call it upon receiving an FS-specific structure failure or a revision
>    verification failure.  Delegate this action to the new report_error()
>    helper function.  Doing so makes the caller responsible for what's going
>    to happen with the error.  The caller can choose to store the error,
>    ignore it or use it in any other necessary way.  If a callback returns an
>    error, stop the verification process and immediately return that error.
>    If no callback is provided, mimic the behavior of svn_repos_verify_fs2()
>    and return the first encountered error.  Drop the logic related to error
>    formatting, as we no longer need it at this layer.  We are going to make
>    a simpler replacement for it is the UI code (svnadmin.c), where it is
>    supposed to live.
>
> * subversion/svnadmin/svnadmin.c
>   (struct repos_verify_callback_baton): New.  Contains the fields that are
>    required to track the --keep-going errors taken from ...
>   (struct repos_notify_handler_baton): ...this baton.  After the previous
>    step, this baton only contains the 'feedback_stream' field, so inline it
>    into every calling site.
>   (repos_notify_handler): Baton is now simply an svn_stream_t.  Remove the
>    boolean-based filtering logic from this handler and drop the handling of
>    svn_repos_notify_failure.  The latter is moved, with a bit of tweaking,
>    into ...
>   (repos_verify_callback): ...this new function, that implements a callback
>    for svn_repos_verify_fs3().  Depending on whether we are in --keep-going
>    mode or not, either dump the failure details to stderr and track them to
>    produce a summary, or immediately return it through the callback, thus
>    ending the verification process.  Remember all errors in the --keep-going
>    mode, not only those that are associated with a particular revision.
>    Prior to handling the error itself, tell that we failed to verify the
>    revision or metadata by writing corresponding messages to stderr.
>   (subcommand_dump, subcommand_load, subcommand_recover, subcommand_upgrade,
>    subcommand_hotcopy, subcommand_pack): Inline repos_notify_handler_baton
>    here, as it now contains a single svn_stream_t field.
>   (subcommand_verify): Inline repos_notify_handler_baton here, as it now
>    contains a single svn_stream_t field.  Avoid manipulations with boolean
>    fields like b->silent_errors and b->silent_running, because we no longer
>    need them, and the fields themselves are gone.  Create a feedback stream
>    only in non-quiet mode, as we do in other subcommand implementations.
>    Create a baton for repos_verify_callback() and adjust the calling site of
>    svn_repos_verify_fs3(), that now needs a callback.  Adjust --keep-going
>    summary printing to the new approach with the verification callback.
>    Finally, provide a simple error if we encountered at least one failure
>    in the --keep-going mode.
>
> * subversion/tests/cmdline/svnadmin_tests.py
>   (verify_keep_going, verify_keep_going_quiet, verify_invalid_path_changes):
>    Adjust the expectations, because now errors go straight to stderr in both
>    --keep-going and ordinary modes.  Where possible, make the expectations a
>    bit stricter by extending the lines that we check with RegexListOutput().
>
> * subversion/tests/libsvn_fs_fs/fs-fs-private-test.c
>   (load_index, load_index_keep_going): Squash two tests into one; basically,
>    undo the corresponding hunk from r1683311.  As we no longer have separate
>    keep_going mode in svn_repos_verify_fs3(), and the caller decides if the
>    verification continues or not, we don't have to check two different
>    scenarios.
>   (test_funcs): Track the test changes.
>
> * subversion/tests/libsvn_fs_fs/fs-fs-fuzzy-test.c
>   (fuzzing_1_byte_1_rev): Adjust the call to svn_repos_verify_fs3().
> ]]]
>
> What do you think?

I like this a lot. Much better than my hacks upon hacks. :)

Please go ahead and commit this. As you said, JavaHL would break after
this change. If you like, I can fix it afterwards, but I'm on a rather
tight schedule tomorrow and on the week-end so I very likely couldn't do
that before Monday.

-- Brane

Re: Possible incompatibility of svn_repos_verify_fs2() in 1.9.0-rc1

Posted by Evgeny Kotkov <ev...@visualsvn.com>.
Branko Čibej <br...@wandisco.com> writes:

> Go ahead; I won't be able to for the next few days. Please remember to
> remove the new error code, too, if we won't be using it any more.

With a bit of trial and error, I was able to come up with a complete and
tested solution that I find suitable in terms of the API design and the
calling site (svnadmin.c).  The ideas are partly borrowed from the already
mentioned svn_fs_lock_many() and svn_fs_lock_callback_t implementations.

See the attached patch.  Please note that the patch is not a continuation of
what I posted in my last e-mail (verify-fixup-squashed-v1.patch.txt), but from
my point of view is better and should address the raised concerns, such as
having a non-trivial compatibility wrapper.

Here is the log message:
[[[
Reimplement svn_repos_verify_fs3() to support an arbitrary callback that
receives the information about an encountered problem and lets the caller
decide on what happens next.  This supersedes the keep_going argument for
this API.  A callback is optional; the behavior of this API if the callback
is not provided is equivalent to how svn_repos_verify_fs2() deals with
encountered errors.  This allows seamless migration to the new API, if the
callback is not necessary.  The idea is partly taken from how our existing
svn_fs_lock_many() API works with a svn_fs_lock_callback_t and passes error
information to the caller.

Immediately use the new API to provide an alternative solution for the
encountered problem with 'svnadmin verify --keep-going -q' (see r1684940)
being useless in terms that it was only giving an indication of whether a
particular repository passes the verification or not, without providing a
root cause (details) of what's wrong.

Discussion can be found in http://svn.haxx.se/dev/archive-2015-05/0141.shtml
(Subject: "Possible incompatibility of svn_repos_verify_fs2() in 1.9.0-rc1")

* subversion/include/svn_error_codes.h
  (SVN_ERR_REPOS_VERIFY_FAILED): Remove this error code, as we no longer
   need to send a specific error from within svn_repos_verify_fs3().
  (SVN_ERR_CL_REPOS_VERIFY_FAILED): New.

* subversion/include/svn_repos.h
  (svn_repos_notify_action_t): Remove svn_repos_notify_failure.
  (svn_repos_notify_t): Remove 'err' field, as it is no longer needed.
  (svn_repos_verify_callback_t): New optional callback type to be used with
   svn_repos_verify_fs3().
  (svn_repos_verify_fs3): Drop 'keep_going' argument in favor of accepting a
   svn_repos_verify_callback_t.  Update the docstring accordingly.
  (svn_repos_verify_fs3): Update the docstring for this deprecated function.

* subversion/libsvn_repos/deprecated.c
  (svn_repos_verify_fs2): Update the call to svn_repos_verify_fs3() in this
   compatibility wrapper.  Don't pass the verify callback.

* subversion/libsvn_repos/dump.c
  (notify_verification_error): Remove; this function is no longer required.
  (report_error): New helper function.
  (svn_repos_verify_fs3): In case we've got a svn_repos_verify_callback_t,
   call it upon receiving an FS-specific structure failure or a revision
   verification failure.  Delegate this action to the new report_error()
   helper function.  Doing so makes the caller responsible for what's going
   to happen with the error.  The caller can choose to store the error,
   ignore it or use it in any other necessary way.  If a callback returns an
   error, stop the verification process and immediately return that error.
   If no callback is provided, mimic the behavior of svn_repos_verify_fs2()
   and return the first encountered error.  Drop the logic related to error
   formatting, as we no longer need it at this layer.  We are going to make
   a simpler replacement for it is the UI code (svnadmin.c), where it is
   supposed to live.

* subversion/svnadmin/svnadmin.c
  (struct repos_verify_callback_baton): New.  Contains the fields that are
   required to track the --keep-going errors taken from ...
  (struct repos_notify_handler_baton): ...this baton.  After the previous
   step, this baton only contains the 'feedback_stream' field, so inline it
   into every calling site.
  (repos_notify_handler): Baton is now simply an svn_stream_t.  Remove the
   boolean-based filtering logic from this handler and drop the handling of
   svn_repos_notify_failure.  The latter is moved, with a bit of tweaking,
   into ...
  (repos_verify_callback): ...this new function, that implements a callback
   for svn_repos_verify_fs3().  Depending on whether we are in --keep-going
   mode or not, either dump the failure details to stderr and track them to
   produce a summary, or immediately return it through the callback, thus
   ending the verification process.  Remember all errors in the --keep-going
   mode, not only those that are associated with a particular revision.
   Prior to handling the error itself, tell that we failed to verify the
   revision or metadata by writing corresponding messages to stderr.
  (subcommand_dump, subcommand_load, subcommand_recover, subcommand_upgrade,
   subcommand_hotcopy, subcommand_pack): Inline repos_notify_handler_baton
   here, as it now contains a single svn_stream_t field.
  (subcommand_verify): Inline repos_notify_handler_baton here, as it now
   contains a single svn_stream_t field.  Avoid manipulations with boolean
   fields like b->silent_errors and b->silent_running, because we no longer
   need them, and the fields themselves are gone.  Create a feedback stream
   only in non-quiet mode, as we do in other subcommand implementations.
   Create a baton for repos_verify_callback() and adjust the calling site of
   svn_repos_verify_fs3(), that now needs a callback.  Adjust --keep-going
   summary printing to the new approach with the verification callback.
   Finally, provide a simple error if we encountered at least one failure
   in the --keep-going mode.

* subversion/tests/cmdline/svnadmin_tests.py
  (verify_keep_going, verify_keep_going_quiet, verify_invalid_path_changes):
   Adjust the expectations, because now errors go straight to stderr in both
   --keep-going and ordinary modes.  Where possible, make the expectations a
   bit stricter by extending the lines that we check with RegexListOutput().

* subversion/tests/libsvn_fs_fs/fs-fs-private-test.c
  (load_index, load_index_keep_going): Squash two tests into one; basically,
   undo the corresponding hunk from r1683311.  As we no longer have separate
   keep_going mode in svn_repos_verify_fs3(), and the caller decides if the
   verification continues or not, we don't have to check two different
   scenarios.
  (test_funcs): Track the test changes.

* subversion/tests/libsvn_fs_fs/fs-fs-fuzzy-test.c
  (fuzzing_1_byte_1_rev): Adjust the call to svn_repos_verify_fs3().
]]]

What do you think?


Regards,
Evgeny Kotkov

Re: Possible incompatibility of svn_repos_verify_fs2() in 1.9.0-rc1

Posted by Branko Čibej <br...@wandisco.com>.
On 19.06.2015 16:41, Evgeny Kotkov wrote:
> Philip Martin <ph...@wandisco.com> writes:
>
>> Branko Čibej <br...@wandisco.com> writes:
>>
>>> My main objection to this approach is that it breaks all the API
>>> patterns we've ever had: it creates a function that does not return an
>>> error even though it clearly fails, and relies on some notification
>>> callback to report actual errors.
>> We have been reporting errors via callbacks since 1.2, look at
>> svn_ra_lock().  We have done it again in 1.9, look at
>> svn_fs_lock_many().
>>
>> Also, is the function clearly failing when it reports a verification
>> error?  If it is correctly diagnosing that the repository is corrupt
>> then the function is in some sense succeeding.  I suppose we could
>> change  svn_repos_notify_t.err to some other type, but svn_error_t is a
>> convenient way to package the information.
> +1.
>
> Evgeny Kotkov <ev...@visualsvn.com> writes:
>
>> I sketched the other possible option with redesigning svn_repos_verify_fs3()
>> API to only report errors via the notify_func(), please see the attached
>> patch.  Although I won't insist on going this way, I like it better, as it
>> allows us to get rid of things like b->silent_errors, b->silent_running,
>> juggling with standard streams and API that can yield the same error in
>> two different ways.
> Branko, what do you think about doing it this way?  Should I work this proof-
> of-concept patch up to a committable state?

Go ahead; I won't be able to for the next few days. Please remember to
remove the new error code, too, if we won't be using it any more.

-- Brane


Re: Possible incompatibility of svn_repos_verify_fs2() in 1.9.0-rc1

Posted by Evgeny Kotkov <ev...@visualsvn.com>.
Philip Martin <ph...@wandisco.com> writes:

> Branko Čibej <br...@wandisco.com> writes:
>
>> My main objection to this approach is that it breaks all the API
>> patterns we've ever had: it creates a function that does not return an
>> error even though it clearly fails, and relies on some notification
>> callback to report actual errors.
>
> We have been reporting errors via callbacks since 1.2, look at
> svn_ra_lock().  We have done it again in 1.9, look at
> svn_fs_lock_many().
>
> Also, is the function clearly failing when it reports a verification
> error?  If it is correctly diagnosing that the repository is corrupt
> then the function is in some sense succeeding.  I suppose we could
> change  svn_repos_notify_t.err to some other type, but svn_error_t is a
> convenient way to package the information.

+1.

Evgeny Kotkov <ev...@visualsvn.com> writes:

> I sketched the other possible option with redesigning svn_repos_verify_fs3()
> API to only report errors via the notify_func(), please see the attached
> patch.  Although I won't insist on going this way, I like it better, as it
> allows us to get rid of things like b->silent_errors, b->silent_running,
> juggling with standard streams and API that can yield the same error in
> two different ways.

Branko, what do you think about doing it this way?  Should I work this proof-
of-concept patch up to a committable state?


Regards,
Evgeny Kotkov

Re: Possible incompatibility of svn_repos_verify_fs2() in 1.9.0-rc1

Posted by Philip Martin <ph...@wandisco.com>.
Branko Čibej <br...@wandisco.com> writes:

> My main objection to this approach is that it breaks all the API
> patterns we've ever had: it creates a function that does not return an
> error even though it clearly fails, and relies on some notification
> callback to report actual errors.

We have been reporting errors via callbacks since 1.2, look at
svn_ra_lock().  We have done it again in 1.9, look at
svn_fs_lock_many().

Also, is the function clearly failing when it reports a verification
error?  If it is correctly diagnosing that the repository is corrupt
then the function is in some sense succeeding.  I suppose we could
change  svn_repos_notify_t.err to some other type, but svn_error_t is a
convenient way to package the information.

-- 
Philip Martin | Subversion Committer
WANdisco // *Non-Stop Data*

Re: Possible incompatibility of svn_repos_verify_fs2() in 1.9.0-rc1

Posted by Branko Čibej <br...@wandisco.com>.
On 15.06.2015 20:54, Evgeny Kotkov wrote:
> Branko Čibej <br...@wandisco.com> writes:
>
>> Evgeny, please take a look at r1684940.
>>
>> I don't really like the fact that with this change and 'svnadmin
>> --keep-going --quiet', the text "Error verifying revision N" gets printed
>> to stderr; but I couldn't think of a better way to join the error to the
>> revision it was emitted for. I'd love to find a better solution.
> After giving this change a look, I cannot get rid of the feeling that what we
> are doing here is just adding different workarounds for the API issues, e.g.,
> booleans like b->silent_errors or b->silent_running.  I should also say that
> I find the b->feedback_stream manipulations questionable, because now, for
> instance, we write warnings to either stdout or stderr depending on the --quiet
> argument.  I always thought that --quiet should only suppress a part of the
> output, but shouldn't really retarget it to a different standard stream.
>
> With that in mind, I am -0 on the corresponding backport proposal.

I removed the backport proposal until we get this sorted out.


> I sketched the other possible option with redesigning svn_repos_verify_fs3()
> API to only report errors via the notify_func(), please see the attached patch.
> Although I won't insist on going this way, I like it better, as it allows us
> to get rid of things like b->silent_errors, b->silent_running, juggling with
> standard streams and API that can yield the same error in two different ways.
>
> What do you think?

IIRC Ivan already proposed something along these lines?

My main objection to this approach is that it breaks all the API
patterns we've ever had: it creates a function that does not return an
error even though it clearly fails, and relies on some notification
callback to report actual errors. This is, IMNSHO, fundamentally broken.
We've relied throughout our code on the fact that it's safe to wrap any
function that returns an svn_error_t* with SVN_ERR() and not have to
worry about missing errors. Your proposal throws this pattern out the
window; if we adopt it, we'll have to start checking API docs to see if
the svn_error_t* actually guarantees that the caller gets an error when
something fails. That's really, really painful.

My minor objection is that the compatibility wrapper suddenly becomes
comples, whereas the usual pattern is just wrapping the new function
with a slightly different set of parameters. This makes maintaining the
compatibility wrappers just that bit more complex. However, I could live
with that if my former objection was resolved.

-- Brane

Re: Possible incompatibility of svn_repos_verify_fs2() in 1.9.0-rc1

Posted by Branko Čibej <br...@wandisco.com>.
On 15.06.2015 20:54, Evgeny Kotkov wrote:
> Branko Čibej <br...@wandisco.com> writes:
>
>> Evgeny, please take a look at r1684940.
>>
>> I don't really like the fact that with this change and 'svnadmin
>> --keep-going --quiet', the text "Error verifying revision N" gets printed
>> to stderr; but I couldn't think of a better way to join the error to the
>> revision it was emitted for. I'd love to find a better solution.
> After giving this change a look, I cannot get rid of the feeling that what we
> are doing here is just adding different workarounds for the API issues, e.g.,
> booleans like b->silent_errors or b->silent_running.  I should also say that
> I find the b->feedback_stream manipulations questionable, because now, for
> instance, we write warnings to either stdout or stderr depending on the --quiet
> argument.


Yes, that's questionable: However, I'll point out that /all/ the
svnadmin commands have the same problem: all notifications go to stdout,
even warnings and errors. IMO, we should change that and really always
send warnings and errors to stderr, which means changing the notify
callback in svnadmin.

The scope of that is somewhat bigger than this discussion of 'svnadmin
verify', but we may as well look at fixing all of svnadmin's error
reporting in 1.9 as long as we're looking at it; it wouldn't really be
that big a change overall.

-- Brane

Re: Possible incompatibility of svn_repos_verify_fs2() in 1.9.0-rc1

Posted by Evgeny Kotkov <ev...@visualsvn.com>.
Branko Čibej <br...@wandisco.com> writes:

> Evgeny, please take a look at r1684940.
>
> I don't really like the fact that with this change and 'svnadmin
> --keep-going --quiet', the text "Error verifying revision N" gets printed
> to stderr; but I couldn't think of a better way to join the error to the
> revision it was emitted for. I'd love to find a better solution.

After giving this change a look, I cannot get rid of the feeling that what we
are doing here is just adding different workarounds for the API issues, e.g.,
booleans like b->silent_errors or b->silent_running.  I should also say that
I find the b->feedback_stream manipulations questionable, because now, for
instance, we write warnings to either stdout or stderr depending on the --quiet
argument.  I always thought that --quiet should only suppress a part of the
output, but shouldn't really retarget it to a different standard stream.

With that in mind, I am -0 on the corresponding backport proposal.

I sketched the other possible option with redesigning svn_repos_verify_fs3()
API to only report errors via the notify_func(), please see the attached patch.
Although I won't insist on going this way, I like it better, as it allows us
to get rid of things like b->silent_errors, b->silent_running, juggling with
standard streams and API that can yield the same error in two different ways.

What do you think?


Regards,
Evgeny Kotkov

Re: Possible incompatibility of svn_repos_verify_fs2() in 1.9.0-rc1

Posted by Branko Čibej <br...@wandisco.com>.
On 11.06.2015 17:49, Branko Čibej wrote:
> On 11.06.2015 17:34, Evgeny Kotkov wrote:
>> Branko Čibej <br...@wandisco.com> writes:
>>
>>> If '--quiet' means 'suppress progress', then the result is exactly as
>>> expected. We could change 'svnadmin verify' to behave differently, e.g.,
>>> only display error notifications in --quiet mode; I agree that would be
>>> better than the current behaviour, but again, this has nothing to do
>>> with the API itself.
>> I don't think that the current behavior — entirely losing error information —
>> is expected.  For example, here is what "svnadmin help verify" says about -q:
>>
>>     -q [--quiet]   : no progress (only errors to stderr)
> See my patch; a bit outdated (I've improved it a bit since then), but it
> fixes this issue.

Evgeny, please take a look at r1684940.

I don't really like the fact that with this change and 'svnadmin
--keep-going --quiet', the text "Error verifying revision N" gets
printed to stderr; but I couldn't think of a better way to join the
error to the revision it was emitted for. I'd love to find a better
solution.

-- Brane

Re: Possible incompatibility of svn_repos_verify_fs2() in 1.9.0-rc1

Posted by Branko Čibej <br...@wandisco.com>.
On 11.06.2015 17:34, Evgeny Kotkov wrote:
> Branko Čibej <br...@wandisco.com> writes:
>
>> If '--quiet' means 'suppress progress', then the result is exactly as
>> expected. We could change 'svnadmin verify' to behave differently, e.g.,
>> only display error notifications in --quiet mode; I agree that would be
>> better than the current behaviour, but again, this has nothing to do
>> with the API itself.
> I don't think that the current behavior — entirely losing error information —
> is expected.  For example, here is what "svnadmin help verify" says about -q:
>
>     -q [--quiet]   : no progress (only errors to stderr)

See my patch; a bit outdated (I've improved it a bit since then), but it
fixes this issue.

>
>> I think we're at the point where we're trying to polish the UI. It is
>> rather late in the day for that. The minor tweak required in svnadmin to
>> emit errors in --quiet mode can be pushed into RC3, but I'd pretty much
>> stop there; we can always make minor UI changes later on. I wouldn't
>> change svn_repos_verify_fs3 any more; it's consistent and does what it
>> advertises.
> Fine, but I suspect that this fix probably isn't going to address the very last
> problem in the --keep-going code.  It would be a pity to delay Subversion
> 1.9.0 because of that.

Let's not make the perfect the enemy of the good enough; we'd never have
released 1.0 otherwise. :)

-- Brane

Re: Possible incompatibility of svn_repos_verify_fs2() in 1.9.0-rc1

Posted by Evgeny Kotkov <ev...@visualsvn.com>.
Branko Čibej <br...@wandisco.com> writes:

> If '--quiet' means 'suppress progress', then the result is exactly as
> expected. We could change 'svnadmin verify' to behave differently, e.g.,
> only display error notifications in --quiet mode; I agree that would be
> better than the current behaviour, but again, this has nothing to do
> with the API itself.

I don't think that the current behavior — entirely losing error information —
is expected.  For example, here is what "svnadmin help verify" says about -q:

    -q [--quiet]   : no progress (only errors to stderr)

> I think we're at the point where we're trying to polish the UI. It is
> rather late in the day for that. The minor tweak required in svnadmin to
> emit errors in --quiet mode can be pushed into RC3, but I'd pretty much
> stop there; we can always make minor UI changes later on. I wouldn't
> change svn_repos_verify_fs3 any more; it's consistent and does what it
> advertises.

Fine, but I suspect that this fix probably isn't going to address the very last
problem in the --keep-going code.  It would be a pity to delay Subversion
1.9.0 because of that.


Regards,
Evgeny Kotkov

Re: Possible incompatibility of svn_repos_verify_fs2() in 1.9.0-rc1

Posted by Branko Čibej <br...@wandisco.com>.
On 11.06.2015 15:29, Branko Čibej wrote:
> I think we're at the point where we're trying to polish the UI. It is
> rather late in the day for that. The minor tweak required in svnadmin to
> emit errors in --quiet mode can be pushed into RC3

Perhaps something along the lines of the patch below. Note that this
patch makes subcommand_verify behave differently wrt --quiet than all
the other commends; on the other hand, this is the only command that
accepts a --keep-going flag, so I guess that's fine.

-- Brane

Index: subversion/svnadmin/svnadmin.c
===================================================================
--- subversion/svnadmin/svnadmin.c	(revision 1684887)
+++ subversion/svnadmin/svnadmin.c	(working copy)
@@ -868,6 +868,9 @@ struct repos_notify_handler_baton {
   /* Stream to write progress and other non-error output to. */
   svn_stream_t *feedback_stream;
 
+  /* Suppress notifications that are neither errors nor warnings. */
+  svn_boolean_t silent_running;
+
   /* Whether errors contained in notifications should be printed along
      with the notification. If FALSE, any errors will only be
      summarized. */
@@ -891,6 +894,14 @@ repos_notify_handler(void *baton,
   struct repos_notify_handler_baton *b = baton;
   svn_stream_t *feedback_stream = b->feedback_stream;
 
+  /* Don't print anything if the feedback stream isn't provided.
+     Only print errors and warnings in silent mode. */
+  if (!feedback_stream
+      || (b->silent_running
+          && notify->action != svn_repos_notify_warning
+          && notify->action != svn_repos_notify_failure))
+    return;
+
   switch (notify->action)
   {
     case svn_repos_notify_warning:
@@ -1838,9 +1849,11 @@ subcommand_verify(apr_getopt_t *os, void *baton, a
       upper = lower;
     }
 
-  if (! opt_state->quiet)
-    notify_baton.feedback_stream = recode_stream_create(stdout, pool);
+  notify_baton.feedback_stream = recode_stream_create(stdout, pool);
 
+  if (opt_state->quiet)
+    notify_baton.silent_running = TRUE;
+
   if (opt_state->keep_going)
     notify_baton.error_summary =
       apr_array_make(pool, 0, sizeof(struct verification_error *));
@@ -1853,10 +1866,8 @@ subcommand_verify(apr_getopt_t *os, void *baton, a
                                     opt_state->keep_going,
                                     opt_state->check_normalization,
                                     opt_state->metadata_only,
-                                    !opt_state->quiet
-                                    ? repos_notify_handler : NULL,
-                                    &notify_baton, check_cancel,
-                                    NULL, pool);
+                                    repos_notify_handler, &notify_baton,
+                                    check_cancel, NULL, pool);
 
   /* Show the --keep-going error summary. */
   if (opt_state->keep_going && notify_baton.error_summary->nelts > 0)


Re: Possible incompatibility of svn_repos_verify_fs2() in 1.9.0-rc1

Posted by Branko Čibej <br...@wandisco.com>.
On 11.06.2015 14:33, Evgeny Kotkov wrote:
> Branko Čibej <br...@wandisco.com> writes:
>
>>> Should be fixed in r1684325.
>> Meh ... siliness. It's actually r1684344. The correct fix is in
>> svnadmin; always sending notifications is correct from API users'
>> perspective, too.
> Thanks for looking into this.
>
> Frankly, I cannot say that adding a flag like b->silent_errors looks exactly
> right to me.  The calling site is now bound to an assumption that the error
> notification is the *same* as the error returned by svn_repos_verify_fs3() —
> otherwise we would be discarding a potentially useful part of the error info.

I'm looking at this from a different perspective; the API should behave
consistently regardless of the --keep-going flag:

  * if a notification handler is provided, it will be called for each
    error encountered during verification;
  * if an error occurs during verification, the API will return an error.

How the notifications are presented to the user is an implementation
detail of 'svnadmin verify' and has nothing to do with the API.

> Maybe this indicates that we could do a better job in designing the API?
> I think that if even we do this sort of silencing in svnadmin.c, potential
> users of svn_repos_verify_fs3() would have to introduce a similar logic and
> this could be confusing for them.

I think it's far more useful that the API is consistent WRT
notifications. How the API user handles notifications, or if she even
provides a notification handler, is up to her.

> Furthermore, I can't get rid of the feeling that --keep-going implementation
> has certain flaws in terms of both the API and the UI design.  Another example
> of what's currently wrong would be the behavior of svnadmin verify --keep-going
> --quiet, because right now the output is useless for the end user.  Here is a
> quick sample:
>
>   svnadmin verify --keep-going -q asf-part
>
>   svnadmin: E165011: Verification of 14 out of 108221 revisions failed on
>   repository 'C:\asf-part'
>
> We don't output errors with --quiet, and I think that this isn't something an
> administrator wants to see after executing verify for a couple of hours, e.g.,
> for a huge repository.  She would have to re-run the same command to get a
> hint of exactly what is wrong, even if the intention was just to suppress the
> progress.

If '--quiet' means 'suppress progress', then the result is exactly as
expected. We could change 'svnadmin verify' to behave differently, e.g.,
only display error notifications in --quiet mode; I agree that would be
better than the current behaviour, but again, this has nothing to do
with the API itself.

>   Worth mentioning, the test suite didn't catch this problem or the
> double error reporting bug, and I treat this is an indication that the tests
> probably require more work.

In this particular case it's a feature of our Python tests that the
regex expected results match at least one output line. We could invent a
new expected-result class, but I really don't think it's worth the
effort for catching such a minor issue.

> I didn't yet look into fixing this, but I am thinking that we're now trying to
> close the holes found in the --keep-going implementation through the STATUS
> file — and it doesn't sound quite right to me.  Maybe the API was overly bent
> for a particular usage scenario or maybe we just need a fresh look on this, but
> without fixing it we could be shipping a half-baked feature and would have to
> maintain the corresponding API.  Fixing this, on the other hand, could mean
> blocking the 1.9.0 release, and I don't like this either.

I think we're at the point where we're trying to polish the UI. It is
rather late in the day for that. The minor tweak required in svnadmin to
emit errors in --quiet mode can be pushed into RC3, but I'd pretty much
stop there; we can always make minor UI changes later on. I wouldn't
change svn_repos_verify_fs3 any more; it's consistent and does what it
advertises.

> A possible (although quite radical) option would be cutting the --keep-going
> feature from 1.9.x and redesigning it in 1.10.  I believe that there are some
> possibilities to explore — i.e., maybe, having a separate callback for error
> reporting or introducing svn_repos_notify_func2_t that would be yielding errors
> or would provide the necessary flexibility in another way.

I think the API consumer already has all the flexibility she needs.
Separate callbacks or new notification handler types add complexity, but
not flexibility, IMO.

As a matter of fact, I'd love to actually hear from GUI client
implementers instead of second-guessing their needs.

-- Brane

Re: Possible incompatibility of svn_repos_verify_fs2() in 1.9.0-rc1

Posted by Evgeny Kotkov <ev...@visualsvn.com>.
Branko Čibej <br...@wandisco.com> writes:

>> Should be fixed in r1684325.
>
> Meh ... siliness. It's actually r1684344. The correct fix is in
> svnadmin; always sending notifications is correct from API users'
> perspective, too.

Thanks for looking into this.

Frankly, I cannot say that adding a flag like b->silent_errors looks exactly
right to me.  The calling site is now bound to an assumption that the error
notification is the *same* as the error returned by svn_repos_verify_fs3() —
otherwise we would be discarding a potentially useful part of the error info.
Maybe this indicates that we could do a better job in designing the API?
I think that if even we do this sort of silencing in svnadmin.c, potential
users of svn_repos_verify_fs3() would have to introduce a similar logic and
this could be confusing for them.

Furthermore, I can't get rid of the feeling that --keep-going implementation
has certain flaws in terms of both the API and the UI design.  Another example
of what's currently wrong would be the behavior of svnadmin verify --keep-going
--quiet, because right now the output is useless for the end user.  Here is a
quick sample:

  svnadmin verify --keep-going -q asf-part

  svnadmin: E165011: Verification of 14 out of 108221 revisions failed on
  repository 'C:\asf-part'

We don't output errors with --quiet, and I think that this isn't something an
administrator wants to see after executing verify for a couple of hours, e.g.,
for a huge repository.  She would have to re-run the same command to get a
hint of exactly what is wrong, even if the intention was just to suppress the
progress.  Worth mentioning, the test suite didn't catch this problem or the
double error reporting bug, and I treat this is an indication that the tests
probably require more work.

I didn't yet look into fixing this, but I am thinking that we're now trying to
close the holes found in the --keep-going implementation through the STATUS
file — and it doesn't sound quite right to me.  Maybe the API was overly bent
for a particular usage scenario or maybe we just need a fresh look on this, but
without fixing it we could be shipping a half-baked feature and would have to
maintain the corresponding API.  Fixing this, on the other hand, could mean
blocking the 1.9.0 release, and I don't like this either.

A possible (although quite radical) option would be cutting the --keep-going
feature from 1.9.x and redesigning it in 1.10.  I believe that there are some
possibilities to explore — i.e., maybe, having a separate callback for error
reporting or introducing svn_repos_notify_func2_t that would be yielding errors
or would provide the necessary flexibility in another way.


Regards,
Evgeny Kotkov

Re: Possible incompatibility of svn_repos_verify_fs2() in 1.9.0-rc1

Posted by Branko Čibej <br...@wandisco.com>.
On 09.06.2015 08:47, Branko Čibej wrote:
> On 08.06.2015 20:22, Branko Čibej wrote:
>> On 08.06.2015 19:06, Evgeny Kotkov wrote:
>>> Branko Čibej <br...@wandisco.com> writes:
>>>
>>>> I completed your patch and committed the fix in r1683311. Please review!
>>>>
>>>> -- Brane
>>> Sorry, I was on vacation last week and couldn't look at this fix earlier.
>>>
>>> I reviewed the committed patch and tested how Subversion 1.9.x behaves with it,
>>> as the change itself got merged into branches/1.9.x in r1683658.  From what I
>>> witness, 'svnadmin verify' now erroneously reports the same error twice for a
>>> corrupted repository:
>>>
>>>   * Verifying repository metadata ...
>>>   * Verified revision 0.
>>>   * Verified revision 1.
>>>   * Error verifying revision 2.
>>>   svnadmin: E160062: Malformed node revision ID string
>>>   svnadmin: E160062: Malformed node revision ID string
>>>
>>> Relevant hunks of r1683311 are located in subversion/libsvn_repos/dump.c:2433
>>> and :2466, where we send the error notification and immediately return the same
>>> error to the caller:
>>> [[[
>>>     ...
>>>     else if (err)
>>>       {
>>>         notify_verification_error(rev, err, notify_func, notify_baton,
>>>                                   iterpool);
>>>
>>>         if (!keep_going)
>>>           {
>>>             /* Return the error, the caller doesn't want us to continue. */
>>>             return svn_error_trace(err);
>>>           }
>>>     ...
>>> ]]]
>>>
>>> The caller (subversion/svnadmin/svnadmin.c:1908) writes both of these errors
>>> to stderr, and that results in the erroneous output.
>> Good catch. I suppose that means we shouldn't send the notification when
>> !keep_going, right?
> Should be fixed in r1684325.

Meh ... siliness. It's actually r1684344. The correct fix is in
svnadmin; always sending notifications is correct from API users'
perspective, too.


-- Brane


Re: Possible incompatibility of svn_repos_verify_fs2() in 1.9.0-rc1

Posted by Branko Čibej <br...@wandisco.com>.
On 08.06.2015 20:22, Branko Čibej wrote:
> On 08.06.2015 19:06, Evgeny Kotkov wrote:
>> Branko Čibej <br...@wandisco.com> writes:
>>
>>> I completed your patch and committed the fix in r1683311. Please review!
>>>
>>> -- Brane
>> Sorry, I was on vacation last week and couldn't look at this fix earlier.
>>
>> I reviewed the committed patch and tested how Subversion 1.9.x behaves with it,
>> as the change itself got merged into branches/1.9.x in r1683658.  From what I
>> witness, 'svnadmin verify' now erroneously reports the same error twice for a
>> corrupted repository:
>>
>>   * Verifying repository metadata ...
>>   * Verified revision 0.
>>   * Verified revision 1.
>>   * Error verifying revision 2.
>>   svnadmin: E160062: Malformed node revision ID string
>>   svnadmin: E160062: Malformed node revision ID string
>>
>> Relevant hunks of r1683311 are located in subversion/libsvn_repos/dump.c:2433
>> and :2466, where we send the error notification and immediately return the same
>> error to the caller:
>> [[[
>>     ...
>>     else if (err)
>>       {
>>         notify_verification_error(rev, err, notify_func, notify_baton,
>>                                   iterpool);
>>
>>         if (!keep_going)
>>           {
>>             /* Return the error, the caller doesn't want us to continue. */
>>             return svn_error_trace(err);
>>           }
>>     ...
>> ]]]
>>
>> The caller (subversion/svnadmin/svnadmin.c:1908) writes both of these errors
>> to stderr, and that results in the erroneous output.
> Good catch. I suppose that means we shouldn't send the notification when
> !keep_going, right?

Should be fixed in r1684325.

-- Brane

Re: Possible incompatibility of svn_repos_verify_fs2() in 1.9.0-rc1

Posted by Branko Čibej <br...@wandisco.com>.
On 08.06.2015 19:06, Evgeny Kotkov wrote:
> Branko Čibej <br...@wandisco.com> writes:
>
>> I completed your patch and committed the fix in r1683311. Please review!
>>
>> -- Brane
> Sorry, I was on vacation last week and couldn't look at this fix earlier.
>
> I reviewed the committed patch and tested how Subversion 1.9.x behaves with it,
> as the change itself got merged into branches/1.9.x in r1683658.  From what I
> witness, 'svnadmin verify' now erroneously reports the same error twice for a
> corrupted repository:
>
>   * Verifying repository metadata ...
>   * Verified revision 0.
>   * Verified revision 1.
>   * Error verifying revision 2.
>   svnadmin: E160062: Malformed node revision ID string
>   svnadmin: E160062: Malformed node revision ID string
>
> Relevant hunks of r1683311 are located in subversion/libsvn_repos/dump.c:2433
> and :2466, where we send the error notification and immediately return the same
> error to the caller:
> [[[
>     ...
>     else if (err)
>       {
>         notify_verification_error(rev, err, notify_func, notify_baton,
>                                   iterpool);
>
>         if (!keep_going)
>           {
>             /* Return the error, the caller doesn't want us to continue. */
>             return svn_error_trace(err);
>           }
>     ...
> ]]]
>
> The caller (subversion/svnadmin/svnadmin.c:1908) writes both of these errors
> to stderr, and that results in the erroneous output.

Good catch. I suppose that means we shouldn't send the notification when
!keep_going, right?

-- Brane

Re: Possible incompatibility of svn_repos_verify_fs2() in 1.9.0-rc1

Posted by Evgeny Kotkov <ev...@visualsvn.com>.
Branko Čibej <br...@wandisco.com> writes:

>
> I completed your patch and committed the fix in r1683311. Please review!
>
> -- Brane

Sorry, I was on vacation last week and couldn't look at this fix earlier.

I reviewed the committed patch and tested how Subversion 1.9.x behaves with it,
as the change itself got merged into branches/1.9.x in r1683658.  From what I
witness, 'svnadmin verify' now erroneously reports the same error twice for a
corrupted repository:

  * Verifying repository metadata ...
  * Verified revision 0.
  * Verified revision 1.
  * Error verifying revision 2.
  svnadmin: E160062: Malformed node revision ID string
  svnadmin: E160062: Malformed node revision ID string

Relevant hunks of r1683311 are located in subversion/libsvn_repos/dump.c:2433
and :2466, where we send the error notification and immediately return the same
error to the caller:
[[[
    ...
    else if (err)
      {
        notify_verification_error(rev, err, notify_func, notify_baton,
                                  iterpool);

        if (!keep_going)
          {
            /* Return the error, the caller doesn't want us to continue. */
            return svn_error_trace(err);
          }
    ...
]]]

The caller (subversion/svnadmin/svnadmin.c:1908) writes both of these errors
to stderr, and that results in the erroneous output.


Regards,
Evgeny Kotkov

Re: Possible incompatibility of svn_repos_verify_fs2() in 1.9.0-rc1

Posted by Branko Čibej <br...@wandisco.com>.
On 03.06.2015 19:50, Ivan Zhakov wrote:
> On 3 June 2015 at 20:44, Branko Čibej <br...@wandisco.com> wrote:
>> On 03.06.2015 19:38, Ivan Zhakov wrote:
>>> On 3 June 2015 at 20:29, Branko Čibej <br...@wandisco.com> wrote:
>>>
>>>> An API user who wants an early exit can always trigger the cancel_func
>>>> in her notification handler and get SVN_ERR_CANCELLED in response.
>>>>
>>> The problem that it's could be hard to distinguish summary errors from
>>> repository corruption errors itself for API user.
>> Why? The summary error code (SVN_ERR_REPOS_VERIFY_FAILED) is used in
>> only at the end of a run with keep_going=TRUE iff the FS backend
>> returned an error. It cannot be returned from the FS backend validation
>> functions, so it will never appear in a notification and will never be
>> returned when keep_going=FALSE. In fact, that's the main reason I
>> decided to keep a separate, new error code for this case.
>>
> Is it documented that only SVN_ERR_REPOS_VERIFY_FAILED used for
> summary errors? Do we promise to always return this error code?

If you think we need additional documentation, feel free to add it.

-- Brane


Re: Possible incompatibility of svn_repos_verify_fs2() in 1.9.0-rc1

Posted by Ivan Zhakov <iv...@visualsvn.com>.
On 3 June 2015 at 20:44, Branko Čibej <br...@wandisco.com> wrote:
> On 03.06.2015 19:38, Ivan Zhakov wrote:
>> On 3 June 2015 at 20:29, Branko Čibej <br...@wandisco.com> wrote:
>>
>>> An API user who wants an early exit can always trigger the cancel_func
>>> in her notification handler and get SVN_ERR_CANCELLED in response.
>>>
>> The problem that it's could be hard to distinguish summary errors from
>> repository corruption errors itself for API user.
>
> Why? The summary error code (SVN_ERR_REPOS_VERIFY_FAILED) is used in
> only at the end of a run with keep_going=TRUE iff the FS backend
> returned an error. It cannot be returned from the FS backend validation
> functions, so it will never appear in a notification and will never be
> returned when keep_going=FALSE. In fact, that's the main reason I
> decided to keep a separate, new error code for this case.
>
Is it documented that only SVN_ERR_REPOS_VERIFY_FAILED used for
summary errors? Do we promise to always return this error code?

Anyway I didn't say that it's impossible to distinguish them, but
could be complicated and may be we should make it API user
responsibility to construct summary error if needed.

-- 
Ivan Zhakov

Re: Possible incompatibility of svn_repos_verify_fs2() in 1.9.0-rc1

Posted by Branko Čibej <br...@wandisco.com>.
On 03.06.2015 19:38, Ivan Zhakov wrote:
> On 3 June 2015 at 20:29, Branko Čibej <br...@wandisco.com> wrote:
>
>> An API user who wants an early exit can always trigger the cancel_func
>> in her notification handler and get SVN_ERR_CANCELLED in response.
>>
> The problem that it's could be hard to distinguish summary errors from
> repository corruption errors itself for API user.

Why? The summary error code (SVN_ERR_REPOS_VERIFY_FAILED) is used in
only at the end of a run with keep_going=TRUE iff the FS backend
returned an error. It cannot be returned from the FS backend validation
functions, so it will never appear in a notification and will never be
returned when keep_going=FALSE. In fact, that's the main reason I
decided to keep a separate, new error code for this case.

-- Brane


Re: Possible incompatibility of svn_repos_verify_fs2() in 1.9.0-rc1

Posted by Ivan Zhakov <iv...@visualsvn.com>.
On 3 June 2015 at 20:29, Branko Čibej <br...@wandisco.com> wrote:
> On 03.06.2015 17:10, Ivan Zhakov wrote:
>> On 3 June 2015 at 17:19, Branko Čibej <br...@wandisco.com> wrote:
>>> On 03.06.2015 15:31, Ivan Zhakov wrote:
>
>>>>  I'm also not sure that we have to return error if we already reported it via notify_func in
>>>> svn_repos_verify_fs3().
>>>
>>> Out notification mechanism cannot replace API consistency. When an API call
>>> fails, it must return an svn_error_t; surely you're not proposing that we
>>> make an exception for svn_repos_verify_fs3?
>>>
>> This is depends whether check of corrupted repository is error or not.
>> I mean that semantic could be: "please give me all/first corruptions
>> in this repository via notify_func".
>
> Hm, well, that would be a different API than what svn_repos_verify_fs3
> documents.
>
Of course docstring should be updated in this case. My point was that
we could define svn_repos_verify_fs3() API to report repository
corruptions only via notify_func because these errors are special.

> An API user who wants an early exit can always trigger the cancel_func
> in her notification handler and get SVN_ERR_CANCELLED in response.
>
The problem that it's could be hard to distinguish summary errors from
repository corruption errors itself for API user.

-- 
Ivan Zhakov

Re: Possible incompatibility of svn_repos_verify_fs2() in 1.9.0-rc1

Posted by Branko Čibej <br...@wandisco.com>.
On 03.06.2015 17:10, Ivan Zhakov wrote:
> On 3 June 2015 at 17:19, Branko Čibej <br...@wandisco.com> wrote:
>> On 03.06.2015 15:31, Ivan Zhakov wrote:

>>>  I'm also not sure that we have to return error if we already reported it via notify_func in
>>> svn_repos_verify_fs3().
>>
>> Out notification mechanism cannot replace API consistency. When an API call
>> fails, it must return an svn_error_t; surely you're not proposing that we
>> make an exception for svn_repos_verify_fs3?
>>
> This is depends whether check of corrupted repository is error or not.
> I mean that semantic could be: "please give me all/first corruptions
> in this repository via notify_func".

Hm, well, that would be a different API than what svn_repos_verify_fs3
documents.

An API user who wants an early exit can always trigger the cancel_func
in her notification handler and get SVN_ERR_CANCELLED in response.

-- Brane

Re: Possible incompatibility of svn_repos_verify_fs2() in 1.9.0-rc1

Posted by Ivan Zhakov <iv...@visualsvn.com>.
On 3 June 2015 at 17:19, Branko Čibej <br...@wandisco.com> wrote:
> On 03.06.2015 15:31, Ivan Zhakov wrote:
>
> On 3 June 2015 at 15:33, Branko Čibej <br...@wandisco.com> wrote:
>
> On 21.05.2015 17:23, Evgeny Kotkov wrote:
>
> Subversion 1.9.0-rc1 introduced a new svnadmin verify --keep-going mode [1].
> In order to achieve this, we added a svn_repos_verify_fs3() API function and
> deprecated its predecessor, svn_repos_verify_fs2(), that now calls the newer
> function with keep_going argument set to FALSE.
>
> However, svn_repos_verify_fs2() behaves differently in 1.9.0-rc1 and 1.8.13
> when it comes to error reporting.  As an example, the following code ...
>
>   SVN_ERR(svn_repos_verify_fs2(repos, 1, 5, NULL, NULL, NULL, NULL, pool));
>
> ...would return two different errors depending on the binaries being used,
> assuming that one of the revisions in the [r1:r5] range is corrupted:
>
>  (With 1.8.13)  E160004: Checksum mismatch in item at offset 0 of
>     length 59 bytes in file path/asf/db/revs/0/2
>
>  (With 1.9.0-rc1)  E165011: Repository 'path/asf' failed to verify
>
> Please note that the second error is generic, and that the actual
> information
> about the error is lost.  Existing API users of svn_repos_verify_fs2() are
> going to lose an important bit of information about the root cause of the
> corruption unless they update the code.  Furthermore, even if an API caller
> subscribes to notifications with a notify_func / notify_baton pair, it would
> still be necessary to update the code to handle svn_repos_notify_failure
> that
> did not exist before 1.9.
>
> I did not find any discussion on the matter or the corresponding entry in
> /notes/api-errata [2] that would describe this behavior change, so I am
> inclined to think that this is accidental and, probably, undesirable.
>
> There is an option of restoring the 1.8 behavior when svn_repos_verify_fs3()
> is called with keep_going set to FALSE.  We could immediately yield the
> error
> in this case, and only use the notifications / generic error when keep_going
> is set to TRUE.  Doing so would change the output of svnadmin verify without
> --keep-going, because now it wouldn't include the generic error message in
> the end.  I attached a proof-of-concept patch, that doesn't change the test
> expectations and the documentation, i.e., it only includes the core change,
> mostly as an illustration to these points.
>
> Thoughts?
>
> [1] https://svn.apache.org/r1492651
> [2] https://svn.apache.org/repos/asf/subversion/trunk/notes/api-errata/1.9
>
> I completed your patch and committed the fix in r1683311. Please review!
>
> [returning discussion back to this thread]
>
> On 3 June 2015 at 16:26, Branko Čibej <br...@wandisco.com> wrote:
>
> On 03.06.2015 14:57, Ivan Zhakov wrote:
>
> On 3 June 2015 at 15:35, Branko Čibej <br...@wandisco.com> wrote:
>
> On 03.06.2015 14:24, Ivan Zhakov wrote:
>
> On 3 June 2015 at 14:37, Branko Čibej <br...@wandisco.com> wrote:
>
> On 02.06.2015 20:05, Branko Čibej wrote:
>
> On 02.06.2015 12:45, Daniel Shahaf wrote:
>
> Ben Reser wrote on Sun, May 31, 2015 at 14:28:39 -0700:
>
> The 1.9.0-rc2 release artifacts are now available for testing/signing.
> Please get the tarballs from
>   https://dist.apache.org/repos/dist/dev/subversion
> and add your signatures there.
>
> Thanks!
>
> Note that Evgeny reported a regression in svn_repos_verify_fs2() in
> <http://svn.haxx.se/dev/archive-2015-05/0141.shtml>.  No objections to
> moving forward with rc2, but as that issue is a regression, we'll need
> an rc3 that fixes it.
>
> Yes ... and the patch has been reviewed but not committed. I believe it
> only needs a couple tweaks (fixing an "if" condition and removing the
> unused error code).
>
> I have a more complete fix based on Evgeny's patch, running tests now.
> It turns out that we still need a new error code for the summary
> results, but with a different meaning and therefore different name.
> Renaming it had a positive side effect as it turned out that we were
> emitting the SVN_ERR_REPOS_CORRUPTED error from FSFS and FSX instead of
> the correct SVN_ERR_FS_CORRUPT.
>
> Another option would be to require notify_func for
> svn_repos_verify_fs3() and always report errors through notify_func.
> This would make error reporting consistent whether keep_going TRUE or
> FALSE. For svn_repos_verify_fs2() we could create compat notify_func
> handler that converts notifications to errors.
>
> See r1683311; I believe error reporting is consistent now.
> Adding complex logic to the deprecated function isn't such a good idea, IMO.
>
> Maybe, but it's better than adding complex logic to current
> (not-deprecated) function. IMO it's not svn_repos_verify_fs3()
> responsibility to generate verification summary -- it should be done
> at the UI level. I.e. in svnadmin or other application using
> svn_repos_verify_fs3() API.
>
> But that would imply either returning a bunch of extra info from
> svn_repos_verify_fs3, or counting notifications in the callers (which
> means making the callers depend on implementation details of the API).
>
> I don't see problem provide wrapped notify func in
> svn_repos_verify_fs2() in call to svn_repos_verify_fs3(). Also we
> don't need to count notifications since svn_repos_verify_fs2() doesn't
> support keep_going flag.
>
> Note that the summary will only be generated if an error occurred during
> verification in keep-going mode, when we have to return some kind of
> error anyway; we may as well put as much info as we have available into
> the error message.
>
> That's what I call inconsistent API: the returned errors are different
> whether keep_going TRUE or FALSE.
>
>
> Of course the errors are different, the mode of operation is different, too.
> Keep-going mode drops underlying errors on purpose (after notifying); it's
> been this way from the beginning, it makes perfect sense and I didn't change
> that. What Evgeny and I fixed was the bug that the keep_going=FALSE mode
> also dropped the underlying errors, which is both wrong and inconsistent
> with 1.8.x. (And that keep_going=TRUE mode ignored cancellations.)
>

>>  I'm also not sure that we have to return error if we already reported it via notify_func in
>> svn_repos_verify_fs3().
>
>
> Out notification mechanism cannot replace API consistency. When an API call
> fails, it must return an svn_error_t; surely you're not proposing that we
> make an exception for svn_repos_verify_fs3?
>
This is depends whether check of corrupted repository is error or not.
I mean that semantic could be: "please give me all/first corruptions
in this repository via notify_func".

But I'm not insisting on this approach.

-- 
Ivan Zhakov
CTO | VisualSVN | http://www.visualsvn.com

Re: Possible incompatibility of svn_repos_verify_fs2() in 1.9.0-rc1

Posted by Branko Čibej <br...@wandisco.com>.
On 03.06.2015 15:31, Ivan Zhakov wrote:
> On 3 June 2015 at 15:33, Branko Čibej <br...@wandisco.com> wrote:
>> On 21.05.2015 17:23, Evgeny Kotkov wrote:
>>> Subversion 1.9.0-rc1 introduced a new svnadmin verify --keep-going mode [1].
>>> In order to achieve this, we added a svn_repos_verify_fs3() API function and
>>> deprecated its predecessor, svn_repos_verify_fs2(), that now calls the newer
>>> function with keep_going argument set to FALSE.
>>>
>>> However, svn_repos_verify_fs2() behaves differently in 1.9.0-rc1 and 1.8.13
>>> when it comes to error reporting.  As an example, the following code ...
>>>
>>>   SVN_ERR(svn_repos_verify_fs2(repos, 1, 5, NULL, NULL, NULL, NULL, pool));
>>>
>>> ...would return two different errors depending on the binaries being used,
>>> assuming that one of the revisions in the [r1:r5] range is corrupted:
>>>
>>>  (With 1.8.13)  E160004: Checksum mismatch in item at offset 0 of
>>>     length 59 bytes in file path/asf/db/revs/0/2
>>>
>>>  (With 1.9.0-rc1)  E165011: Repository 'path/asf' failed to verify
>>>
>>> Please note that the second error is generic, and that the actual information
>>> about the error is lost.  Existing API users of svn_repos_verify_fs2() are
>>> going to lose an important bit of information about the root cause of the
>>> corruption unless they update the code.  Furthermore, even if an API caller
>>> subscribes to notifications with a notify_func / notify_baton pair, it would
>>> still be necessary to update the code to handle svn_repos_notify_failure that
>>> did not exist before 1.9.
>>>
>>> I did not find any discussion on the matter or the corresponding entry in
>>> /notes/api-errata [2] that would describe this behavior change, so I am
>>> inclined to think that this is accidental and, probably, undesirable.
>>>
>>> There is an option of restoring the 1.8 behavior when svn_repos_verify_fs3()
>>> is called with keep_going set to FALSE.  We could immediately yield the error
>>> in this case, and only use the notifications / generic error when keep_going
>>> is set to TRUE.  Doing so would change the output of svnadmin verify without
>>> --keep-going, because now it wouldn't include the generic error message in
>>> the end.  I attached a proof-of-concept patch, that doesn't change the test
>>> expectations and the documentation, i.e., it only includes the core change,
>>> mostly as an illustration to these points.
>>>
>>> Thoughts?
>>>
>>> [1] https://svn.apache.org/r1492651
>>> [2] https://svn.apache.org/repos/asf/subversion/trunk/notes/api-errata/1.9
>> I completed your patch and committed the fix in r1683311. Please review!
>>
> [returning discussion back to this thread]
>
> On 3 June 2015 at 16:26, Branko Čibej <br...@wandisco.com> wrote:
>> On 03.06.2015 14:57, Ivan Zhakov wrote:
>>> On 3 June 2015 at 15:35, Branko Čibej <br...@wandisco.com> wrote:
>>>> On 03.06.2015 14:24, Ivan Zhakov wrote:
>>>>> On 3 June 2015 at 14:37, Branko Čibej <br...@wandisco.com> wrote:
>>>>>> On 02.06.2015 20:05, Branko Čibej wrote:
>>>>>>> On 02.06.2015 12:45, Daniel Shahaf wrote:
>>>>>>>> Ben Reser wrote on Sun, May 31, 2015 at 14:28:39 -0700:
>>>>>>>>> The 1.9.0-rc2 release artifacts are now available for testing/signing.
>>>>>>>>> Please get the tarballs from
>>>>>>>>>   https://dist.apache.org/repos/dist/dev/subversion
>>>>>>>>> and add your signatures there.
>>>>>>>>>
>>>>>>>>> Thanks!
>>>>>>>> Note that Evgeny reported a regression in svn_repos_verify_fs2() in
>>>>>>>> <http://svn.haxx.se/dev/archive-2015-05/0141.shtml>.  No objections to
>>>>>>>> moving forward with rc2, but as that issue is a regression, we'll need
>>>>>>>> an rc3 that fixes it.
>>>>>>> Yes ... and the patch has been reviewed but not committed. I believe it
>>>>>>> only needs a couple tweaks (fixing an "if" condition and removing the
>>>>>>> unused error code).
>>>>>> I have a more complete fix based on Evgeny's patch, running tests now.
>>>>>> It turns out that we still need a new error code for the summary
>>>>>> results, but with a different meaning and therefore different name.
>>>>>> Renaming it had a positive side effect as it turned out that we were
>>>>>> emitting the SVN_ERR_REPOS_CORRUPTED error from FSFS and FSX instead of
>>>>>> the correct SVN_ERR_FS_CORRUPT.
>>>>>>
>>>>> Another option would be to require notify_func for
>>>>> svn_repos_verify_fs3() and always report errors through notify_func.
>>>>> This would make error reporting consistent whether keep_going TRUE or
>>>>> FALSE. For svn_repos_verify_fs2() we could create compat notify_func
>>>>> handler that converts notifications to errors.
>>>> See r1683311; I believe error reporting is consistent now.
>>>> Adding complex logic to the deprecated function isn't such a good idea, IMO.
>>> Maybe, but it's better than adding complex logic to current
>>> (not-deprecated) function. IMO it's not svn_repos_verify_fs3()
>>> responsibility to generate verification summary -- it should be done
>>> at the UI level. I.e. in svnadmin or other application using
>>> svn_repos_verify_fs3() API.
>> But that would imply either returning a bunch of extra info from
>> svn_repos_verify_fs3, or counting notifications in the callers (which
>> means making the callers depend on implementation details of the API).
>>
> I don't see problem provide wrapped notify func in
> svn_repos_verify_fs2() in call to svn_repos_verify_fs3(). Also we
> don't need to count notifications since svn_repos_verify_fs2() doesn't
> support keep_going flag.
>
>> Note that the summary will only be generated if an error occurred during
>> verification in keep-going mode, when we have to return some kind of
>> error anyway; we may as well put as much info as we have available into
>> the error message.
> That's what I call inconsistent API: the returned errors are different
> whether keep_going TRUE or FALSE.

Of course the errors are different, the mode of operation is different,
too. Keep-going mode drops underlying errors on purpose (after
notifying); it's been this way from the beginning, it makes perfect
sense and I didn't change that. What Evgeny and I fixed was the bug that
the keep_going=FALSE mode also dropped the underlying errors, which is
both wrong and inconsistent with 1.8.x. (And that keep_going=TRUE mode
ignored cancellations.)

>  I'm also not sure that we have to
> return error if we already reported it via notify_func in
> svn_repos_verify_fs3().

Out notification mechanism cannot replace API consistency. When an API
call fails, it must return an svn_error_t; surely you're not proposing
that we make an exception for svn_repos_verify_fs3?

Since we do need an error, we have four choices:

  * Combine all the verification errors into one huge stack; this has
    two problems: one is the obvious unbounded memory usage, the other
    is that such a stack is both unmanageable and duplicates the
    notifications.
  * Return the last or first encountered error; whilst this would work,
    it's not very informative and possibly confusing because it singles
    out one problem out of possibly hundreds.
  * Return a generic "something went wrong" error.
  * Return a generic error but provide a summary of the problems in the
    error message.

IMO, only the last two options make any kind of sense; and of these, the
latter is better because it gives the user some extra info essentially
for free.

We could, as you propose, add the summarizing into 'svnadmin verify';
but that would create more code churn for what is essentially a bug fix.

-- Brane


Re: Possible incompatibility of svn_repos_verify_fs2() in 1.9.0-rc1

Posted by Ivan Zhakov <iv...@visualsvn.com>.
On 3 June 2015 at 15:33, Branko Čibej <br...@wandisco.com> wrote:
> On 21.05.2015 17:23, Evgeny Kotkov wrote:
>> Subversion 1.9.0-rc1 introduced a new svnadmin verify --keep-going mode [1].
>> In order to achieve this, we added a svn_repos_verify_fs3() API function and
>> deprecated its predecessor, svn_repos_verify_fs2(), that now calls the newer
>> function with keep_going argument set to FALSE.
>>
>> However, svn_repos_verify_fs2() behaves differently in 1.9.0-rc1 and 1.8.13
>> when it comes to error reporting.  As an example, the following code ...
>>
>>   SVN_ERR(svn_repos_verify_fs2(repos, 1, 5, NULL, NULL, NULL, NULL, pool));
>>
>> ...would return two different errors depending on the binaries being used,
>> assuming that one of the revisions in the [r1:r5] range is corrupted:
>>
>>  (With 1.8.13)  E160004: Checksum mismatch in item at offset 0 of
>>     length 59 bytes in file path/asf/db/revs/0/2
>>
>>  (With 1.9.0-rc1)  E165011: Repository 'path/asf' failed to verify
>>
>> Please note that the second error is generic, and that the actual information
>> about the error is lost.  Existing API users of svn_repos_verify_fs2() are
>> going to lose an important bit of information about the root cause of the
>> corruption unless they update the code.  Furthermore, even if an API caller
>> subscribes to notifications with a notify_func / notify_baton pair, it would
>> still be necessary to update the code to handle svn_repos_notify_failure that
>> did not exist before 1.9.
>>
>> I did not find any discussion on the matter or the corresponding entry in
>> /notes/api-errata [2] that would describe this behavior change, so I am
>> inclined to think that this is accidental and, probably, undesirable.
>>
>> There is an option of restoring the 1.8 behavior when svn_repos_verify_fs3()
>> is called with keep_going set to FALSE.  We could immediately yield the error
>> in this case, and only use the notifications / generic error when keep_going
>> is set to TRUE.  Doing so would change the output of svnadmin verify without
>> --keep-going, because now it wouldn't include the generic error message in
>> the end.  I attached a proof-of-concept patch, that doesn't change the test
>> expectations and the documentation, i.e., it only includes the core change,
>> mostly as an illustration to these points.
>>
>> Thoughts?
>>
>> [1] https://svn.apache.org/r1492651
>> [2] https://svn.apache.org/repos/asf/subversion/trunk/notes/api-errata/1.9
>
> I completed your patch and committed the fix in r1683311. Please review!
>

[returning discussion back to this thread]

On 3 June 2015 at 16:26, Branko Čibej <br...@wandisco.com> wrote:
> On 03.06.2015 14:57, Ivan Zhakov wrote:
>> On 3 June 2015 at 15:35, Branko Čibej <br...@wandisco.com> wrote:
>>> On 03.06.2015 14:24, Ivan Zhakov wrote:
>>>> On 3 June 2015 at 14:37, Branko Čibej <br...@wandisco.com> wrote:
>>>>> On 02.06.2015 20:05, Branko Čibej wrote:
>>>>>> On 02.06.2015 12:45, Daniel Shahaf wrote:
>>>>>>> Ben Reser wrote on Sun, May 31, 2015 at 14:28:39 -0700:
>>>>>>>> The 1.9.0-rc2 release artifacts are now available for testing/signing.
>>>>>>>> Please get the tarballs from
>>>>>>>>   https://dist.apache.org/repos/dist/dev/subversion
>>>>>>>> and add your signatures there.
>>>>>>>>
>>>>>>>> Thanks!
>>>>>>> Note that Evgeny reported a regression in svn_repos_verify_fs2() in
>>>>>>> <http://svn.haxx.se/dev/archive-2015-05/0141.shtml>.  No objections to
>>>>>>> moving forward with rc2, but as that issue is a regression, we'll need
>>>>>>> an rc3 that fixes it.
>>>>>> Yes ... and the patch has been reviewed but not committed. I believe it
>>>>>> only needs a couple tweaks (fixing an "if" condition and removing the
>>>>>> unused error code).
>>>>> I have a more complete fix based on Evgeny's patch, running tests now.
>>>>> It turns out that we still need a new error code for the summary
>>>>> results, but with a different meaning and therefore different name.
>>>>> Renaming it had a positive side effect as it turned out that we were
>>>>> emitting the SVN_ERR_REPOS_CORRUPTED error from FSFS and FSX instead of
>>>>> the correct SVN_ERR_FS_CORRUPT.
>>>>>
>>>> Another option would be to require notify_func for
>>>> svn_repos_verify_fs3() and always report errors through notify_func.
>>>> This would make error reporting consistent whether keep_going TRUE or
>>>> FALSE. For svn_repos_verify_fs2() we could create compat notify_func
>>>> handler that converts notifications to errors.
>>> See r1683311; I believe error reporting is consistent now.
>>> Adding complex logic to the deprecated function isn't such a good idea, IMO.
>> Maybe, but it's better than adding complex logic to current
>> (not-deprecated) function. IMO it's not svn_repos_verify_fs3()
>> responsibility to generate verification summary -- it should be done
>> at the UI level. I.e. in svnadmin or other application using
>> svn_repos_verify_fs3() API.
>
> But that would imply either returning a bunch of extra info from
> svn_repos_verify_fs3, or counting notifications in the callers (which
> means making the callers depend on implementation details of the API).
>
I don't see problem provide wrapped notify func in
svn_repos_verify_fs2() in call to svn_repos_verify_fs3(). Also we
don't need to count notifications since svn_repos_verify_fs2() doesn't
support keep_going flag.

> Note that the summary will only be generated if an error occurred during
> verification in keep-going mode, when we have to return some kind of
> error anyway; we may as well put as much info as we have available into
> the error message.
That's what I call inconsistent API: the returned errors are different
whether keep_going TRUE or FALSE. I'm also not sure that we have to
return error if we already reported it via notify_func in
svn_repos_verify_fs3().

-- 
Ivan Zhakov
CTO | VisualSVN | http://www.visualsvn.com

Re: Possible incompatibility of svn_repos_verify_fs2() in 1.9.0-rc1

Posted by Branko Čibej <br...@wandisco.com>.
On 21.05.2015 17:23, Evgeny Kotkov wrote:
> Subversion 1.9.0-rc1 introduced a new svnadmin verify --keep-going mode [1].
> In order to achieve this, we added a svn_repos_verify_fs3() API function and
> deprecated its predecessor, svn_repos_verify_fs2(), that now calls the newer
> function with keep_going argument set to FALSE.
>
> However, svn_repos_verify_fs2() behaves differently in 1.9.0-rc1 and 1.8.13
> when it comes to error reporting.  As an example, the following code ...
>
>   SVN_ERR(svn_repos_verify_fs2(repos, 1, 5, NULL, NULL, NULL, NULL, pool));
>
> ...would return two different errors depending on the binaries being used,
> assuming that one of the revisions in the [r1:r5] range is corrupted:
>
>  (With 1.8.13)  E160004: Checksum mismatch in item at offset 0 of
>     length 59 bytes in file path/asf/db/revs/0/2
>
>  (With 1.9.0-rc1)  E165011: Repository 'path/asf' failed to verify
>
> Please note that the second error is generic, and that the actual information
> about the error is lost.  Existing API users of svn_repos_verify_fs2() are
> going to lose an important bit of information about the root cause of the
> corruption unless they update the code.  Furthermore, even if an API caller
> subscribes to notifications with a notify_func / notify_baton pair, it would
> still be necessary to update the code to handle svn_repos_notify_failure that
> did not exist before 1.9.
>
> I did not find any discussion on the matter or the corresponding entry in
> /notes/api-errata [2] that would describe this behavior change, so I am
> inclined to think that this is accidental and, probably, undesirable.
>
> There is an option of restoring the 1.8 behavior when svn_repos_verify_fs3()
> is called with keep_going set to FALSE.  We could immediately yield the error
> in this case, and only use the notifications / generic error when keep_going
> is set to TRUE.  Doing so would change the output of svnadmin verify without
> --keep-going, because now it wouldn't include the generic error message in
> the end.  I attached a proof-of-concept patch, that doesn't change the test
> expectations and the documentation, i.e., it only includes the core change,
> mostly as an illustration to these points.
>
> Thoughts?
>
> [1] https://svn.apache.org/r1492651
> [2] https://svn.apache.org/repos/asf/subversion/trunk/notes/api-errata/1.9

I completed your patch and committed the fix in r1683311. Please review!

-- Brane

Re: Possible incompatibility of svn_repos_verify_fs2() in 1.9.0-rc1

Posted by Branko Čibej <br...@wandisco.com>.
On 23.05.2015 13:07, Daniel Shahaf wrote:
> Evgeny Kotkov wrote on Thu, May 21, 2015 at 18:23:22 +0300:
>>   SVN_ERR(svn_repos_verify_fs2(repos, 1, 5, NULL, NULL, NULL, NULL, pool));
>>
>> ...would return two different errors depending on the binaries being used,
>> assuming that one of the revisions in the [r1:r5] range is corrupted:
>>
>>  (With 1.8.13)  E160004: Checksum mismatch in item at offset 0 of
>>     length 59 bytes in file path/asf/db/revs/0/2
>>
>>  (With 1.9.0-rc1)  E165011: Repository 'path/asf' failed to verify
> ...
>> Thoughts?
> I think the incumbent code doesn't use SVN_ERR_REPOS_CORRUPT (E165011)
> correctly.
>
> The keep-going implementation simply uses SVN_ERR_REPOS_CORRUPT to wrap
> whatever error the FS layer reported.  That's wrong, for two reasons:
>
> - The SVN_ERR_REPOS_CORRUPT code should be used to indicate corruption
>   of repos-layer on-disk data (such as $REPOS_DIR/db not existing or
>   $REPOS_DIR/format being a directory).  The code does not look for such
>   conditions.  In fact, since the repos layer got around to calling into
>   the FS layer, the repos layer itself is almost certainly integral.
>
> - The repos layer has no business second-guessing the FS layer's choice
>   of error code.  For example, a permission error on a rev file could
>   happen and does not indicate corruption, or a user might have pressed
>   ^C during svn_fs_verify()'s execution.  The incumbent code assumes
>   that any non-SVN_ERR_CANCELLED code indicates a corruption; that
>   assumption is wrong.
>
> So, the keep-going code should stop using SVN_ERR_REPOS_CORRUPT to
> indiscriminately wrap svn_fs_verify()'s error code.  That should make
> the E160004 (SVN_ERR_FS_CORRUPT) error visible, addressing Evgeny's
> issue.  (I am almost sure that's exactly what Evgeny's patch does.)
>
> Furthermore, I think the "if (found_corruption)" block at the end of
> svn_repos_verify_fs3() should also avoid using SVN_ERR_REPOS_CORRUPT,
> since that would be the wrong code to use in some circumstances (for
> example, if all revision files had permission errors).  The code should
> either say what it _does_ know for a fact — "N out of M revisions failed
> to verify" — or start inspecting the FS errors to determine whether they
> indicate corruption or transient errors (which isn't always possible,
> since the repos layer does not know the context the error was raised in).
>
> In fact, since the aforementioned two uses of SVN_ERR_REPOS_CORRUPT are
> the only two uses of that code, I conclude that I am of the opinion that
> that error code should be deleted entirely from the 1.9 codebase.  We
> can always revive it in 1.10 if needed.

I completely agree with your analysis and proposal.

-- Brane


Re: Possible incompatibility of svn_repos_verify_fs2() in 1.9.0-rc1

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Evgeny Kotkov wrote on Thu, May 21, 2015 at 18:23:22 +0300:
>   SVN_ERR(svn_repos_verify_fs2(repos, 1, 5, NULL, NULL, NULL, NULL, pool));
> 
> ...would return two different errors depending on the binaries being used,
> assuming that one of the revisions in the [r1:r5] range is corrupted:
> 
>  (With 1.8.13)  E160004: Checksum mismatch in item at offset 0 of
>     length 59 bytes in file path/asf/db/revs/0/2
> 
>  (With 1.9.0-rc1)  E165011: Repository 'path/asf' failed to verify
...
> Thoughts?

I think the incumbent code doesn't use SVN_ERR_REPOS_CORRUPT (E165011)
correctly.

The keep-going implementation simply uses SVN_ERR_REPOS_CORRUPT to wrap
whatever error the FS layer reported.  That's wrong, for two reasons:

- The SVN_ERR_REPOS_CORRUPT code should be used to indicate corruption
  of repos-layer on-disk data (such as $REPOS_DIR/db not existing or
  $REPOS_DIR/format being a directory).  The code does not look for such
  conditions.  In fact, since the repos layer got around to calling into
  the FS layer, the repos layer itself is almost certainly integral.

- The repos layer has no business second-guessing the FS layer's choice
  of error code.  For example, a permission error on a rev file could
  happen and does not indicate corruption, or a user might have pressed
  ^C during svn_fs_verify()'s execution.  The incumbent code assumes
  that any non-SVN_ERR_CANCELLED code indicates a corruption; that
  assumption is wrong.

So, the keep-going code should stop using SVN_ERR_REPOS_CORRUPT to
indiscriminately wrap svn_fs_verify()'s error code.  That should make
the E160004 (SVN_ERR_FS_CORRUPT) error visible, addressing Evgeny's
issue.  (I am almost sure that's exactly what Evgeny's patch does.)

Furthermore, I think the "if (found_corruption)" block at the end of
svn_repos_verify_fs3() should also avoid using SVN_ERR_REPOS_CORRUPT,
since that would be the wrong code to use in some circumstances (for
example, if all revision files had permission errors).  The code should
either say what it _does_ know for a fact — "N out of M revisions failed
to verify" — or start inspecting the FS errors to determine whether they
indicate corruption or transient errors (which isn't always possible,
since the repos layer does not know the context the error was raised in).

In fact, since the aforementioned two uses of SVN_ERR_REPOS_CORRUPT are
the only two uses of that code, I conclude that I am of the opinion that
that error code should be deleted entirely from the 1.9 codebase.  We
can always revive it in 1.10 if needed.

Cheers,

Daniel