You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@subversion.apache.org by Erik Huelsmann <eh...@gmail.com> on 2007/09/09 07:55:19 UTC

RFC: svn_iter_* iteration drivers for common iterations

While working in libsvn_wc I found I'm missing a construct which will
give me reliable (in terms of pool creation/destruction, but also the
actual iteration code) way to code iterations over arbitrary sets of
arbitrary items, apr_array_header_t's and apr_hash_t's, for example.

I'd really like to add these iteration drivers, for one because they
add consistency, but also because they are a great way to make the
extremely long functions in libsvn_wc shorter - it's still not lisp:
no way of inlining code to be iterated over. The thing with making the
functions shorter is that - to me - it increases readability:
currently all code is inlined in *very* long functions. Many of these
functions (svn_client_commit4, for example) consist of 4 or 5 actions,
each of which is implemented as a loop over a number of statement.

Which actions svn_client_commit4 is actually implementing is hard to
see, because only 1 of the actions fits in the visible part of my
emacs buffer (and note I already use a 60 line long buffer!). This
means I have to constantly search back and forward to find out what
the function is doing. Having these iterators would require each of
these actions to be in a separate function, leaving only just a few
lines of code per action in svn_client_commit4: the outline of what
it's trying to achieve.

So, what am I proposing we add? 2 functions, 2 new types and a macro,
to start with:

types (the iterator callbacks)


functions:


macro:
#define svn_iter_break return svn_error_create(SVN_ERR_CANCELLED, NULL, NULL)


How does it work?
The iterator callbacks are called for each item in the set. (If the
set is ordered, we could consider adding an optional 'from_end'
boolean, to make it explicitly iterate forward or backward). If the
iteration is to be cancelled, the iterator callback returns
SVN_ERR_CANCELLED.

Returning an error (including SVN_ERR_CANCELLED) from the iterator
callback acts like the C 'break;' statement:it stops the iteration.

The drivers have a *completed boolean return value which is set to
TRUE if the iteration was completed successfully, or FALSE otherwise.

The iterator driver return value is the one returned by the iterator
callback, except for SVN_ERR_CANCELLED in which case the driver
returns SVN_NO_ERROR.


Below you can find the implementation for iterating an apr_hash_t and
a snippet from libsvn_wc/log.c:do_log_committed() in the original and
the with-iterator situations. [Note how using the iterator version
fixes a pool useage bug: the original fails to create an iterpool.]

I'd like to hear your comments! BTW: Having these drivers doesn't mean
we need to start rewriting all loops at once.

bye,


Erik.


[[[Hash iterator definitions]]]

typedef svn_error_t *(*svn_iter_apr_hash_cb_t)(void *baton,
                                               const void *key,
                                               apr_ssize_t klen,
                                               void *val, apr_pool_t *pool);

svn_error_t *
svn_iter_apr_hash(svn_boolean_t *completed,
                  apr_hash_t *hash,
                  svn_iter_apr_hash_cb_t func,
                  void *baton,
                  apr_pool_t *pool)
{
  svn_error_t *err = SVN_NO_ERROR;
  apr_pool_t *iterpool = svn_pool_create(pool);
  apr_hash_index_t *hi;

  for (hi = apr_hash_first(pool, hash);
       ! err && hi; hi = apr_hash_next(hi))
    {
      const void *key;
      void *val;
      apr_ssize_t len;

      svn_pool_clear(iterpool);

      apr_hash_this(hi, &key, &len, &val);
      err = (*func)(baton, key, len, val, iterpool);
    }

  /* Clear the pool early: if we have an error,
     callers may clear the error, leaving any subpools created in
     iterpool in memory if we don't */
  svn_pool_destroy(iterpool);

  if (completed)
    *completed = ! err;

  if (err && err->apr_err != SVN_ERR_CANCELLED)
    return err;

  svn_error_clear(err);
  return SVN_NO_ERROR;
}


[[[Original snippet]]]
  if ((entry->schedule == svn_wc_schedule_replace) && is_this_dir)
    {
      apr_hash_index_t *hi;

      /* Loop over all children entries, look for items scheduled for
         deletion. */
      SVN_ERR(svn_wc_entries_read(&entries, loggy->adm_access, TRUE, pool));
      for (hi = apr_hash_first(pool, entries); hi; hi = apr_hash_next(hi))
        {
          const void *key;
          void *val;
          const svn_wc_entry_t *cur_entry;
          svn_wc_adm_access_t *entry_access;

          /* Get the next entry */
          apr_hash_this(hi, &key, NULL, &val);
          cur_entry = (svn_wc_entry_t *) val;

          /* Skip each entry that isn't scheduled for deletion. */
          if (cur_entry->schedule != svn_wc_schedule_delete)
            continue;

          /* Determine what arguments to hand to our removal function,
             and let BASE_NAME double as an "ok" flag to run that function. */
          base_name = NULL;
          if (cur_entry->kind == svn_node_file)
            {
              pdir = svn_wc_adm_access_path(loggy->adm_access);
              base_name = apr_pstrdup(pool, key);
              entry_access = loggy->adm_access;
            }
          else if (cur_entry->kind == svn_node_dir)
            {
              pdir = svn_path_join(svn_wc_adm_access_path(loggy->adm_access),
                                   key, pool);
              base_name = SVN_WC_ENTRY_THIS_DIR;
              SVN_ERR(svn_wc_adm_retrieve(&entry_access, loggy->adm_access,
                                          pdir, pool));
            }

          /* ### We pass NULL, NULL for cancel_func and cancel_baton below.
             ### If they were available, it would be nice to use them. */
          if (base_name)
            SVN_ERR(svn_wc_remove_from_revision_control
                    (entry_access, base_name, FALSE, FALSE,
                     NULL, NULL, pool));
        }
    }



[[[with-iterator snippet]]]

  if ((entry->schedule == svn_wc_schedule_replace) && is_this_dir)
    {
      /* Loop over all children entries, look for items scheduled for
         deletion. */
      SVN_ERR(svn_wc_entries_read(&entries, loggy->adm_access, TRUE, pool));
      SVN_ERR(svn_iter_apr_hash(NULL, entries,
                                remove_deleted_entry, loggy, pool));
    }


Where remove_deleted_entry() is defined as:

static svn_error_t *
remove_deleted_entry(void *baton, const void *key,
                               apr_ssize_t klen, void *val, apr_pool_t *pool)
{
  struct log_runner *loggy = baton;
  const char *base_name;
  const char *pdir;
  const svn_wc_entry_t *cur_entry = val;
  svn_wc_adm_access_t *entry_access;

  /* Skip each entry that isn't scheduled for deletion. */
  if (cur_entry->schedule != svn_wc_schedule_delete)
    return SVN_NO_ERROR;

  /* Determine what arguments to hand to our removal function,
     and let BASE_NAME double as an "ok" flag to run that function. */
  base_name = NULL;
  if (cur_entry->kind == svn_node_file)
    {
      pdir = svn_wc_adm_access_path(loggy->adm_access);
      base_name = apr_pstrdup(pool, key);
      entry_access = loggy->adm_access;
    }
  else if (cur_entry->kind == svn_node_dir)
    {
      pdir = svn_path_join(svn_wc_adm_access_path(loggy->adm_access),
                           key, pool);
      base_name = SVN_WC_ENTRY_THIS_DIR;
      SVN_ERR(svn_wc_adm_retrieve(&entry_access, loggy->adm_access,
                                  pdir, pool));
    }

  /* ### We pass NULL, NULL for cancel_func and cancel_baton below.
     ### If they were available, it would be nice to use them. */
  if (base_name)
    SVN_ERR(svn_wc_remove_from_revision_control
            (entry_access, base_name, FALSE, FALSE,
             NULL, NULL, pool));

  return SVN_NO_ERROR;
}

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: RFC: svn_iter_* iteration drivers for common iterations

Posted by Erik Huelsmann <eh...@gmail.com>.
On 9/10/07, Erik Huelsmann <eh...@gmail.com> wrote:
> On 9/10/07, Karl Fogel <kf...@red-bean.com> wrote:
> > "Erik Huelsmann" <eh...@gmail.com> writes:
> > > macro:
> > > #define svn_iter_break return svn_error_create(SVN_ERR_CANCELLED, NULL, NULL)
> > >
> > >
> > > How does it work?
> > > The iterator callbacks are called for each item in the set. (If the
> > > set is ordered, we could consider adding an optional 'from_end'
> > > boolean, to make it explicitly iterate forward or backward). If the
> > > iteration is to be cancelled, the iterator callback returns
> > > SVN_ERR_CANCELLED.
> > >
> > > Returning an error (including SVN_ERR_CANCELLED) from the iterator
> > > callback acts like the C 'break;' statement:it stops the iteration.
> > >
> > > The drivers have a *completed boolean return value which is set to
> > > TRUE if the iteration was completed successfully, or FALSE otherwise.
> > >
> > > The iterator driver return value is the one returned by the iterator
> > > callback, except for SVN_ERR_CANCELLED in which case the driver
> > > returns SVN_NO_ERROR.
> >
> > +1
> >
> > Maybe don't overload SVN_ERR_CANCELLED for this, but instead use a
> > dedicated, self-documenting error code -- SVN_ERR_ITER_BREAK or
> > something.
>
> Good point: It might be returned while (accidentally) being in a loop.
>
> > I'm a little worried about using an error object as the "stop" marker
> > at all.  When we have nested iterations, the inner iteration will end
> > up generating lots of error objects.  True, they will be cleared, but
> > there's some overhead to just allocating something in the global pool.
>
> > The only alternative I can think of would be that all iter '*_cb_t'
> > functions would have a boolean '&stop' parameter, which they set to
> > true when they want to break out of the iteration.
>
> Well, to avoid tons of parameters, I decided to use an error return.
> Also, current
>
>   if (expression)
>     break;
>
> constructs can be easily rewritten to
>
>   if (expression)
>     svn_iter_break;
>
> where svn_iter_break is a macro expanding to a return statement
> returning the required error.  It would be a lot harder to implement
> the same thing when using a parameter because it can have different
> names in different functions.
>
> > (I don't have hard numbers to back up this worry, so perhaps I'm just
> > being a curmudgeon.)
>
> Well, it's definitely not the common case for most loops, but I can't
> be absolutely sure it's not ever. My feeling is that it's not though.

Thinking about this some more, I came to the conclusion this error can
be allocated in a subpool of the iterpool: it's only a communication
means between caller and callee and doesn't have to outlive the
callers' pool's lifetime.

bye,

Erik.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: RFC: svn_iter_* iteration drivers for common iterations

Posted by Erik Huelsmann <eh...@gmail.com>.
On 9/10/07, Karl Fogel <kf...@red-bean.com> wrote:
> "Erik Huelsmann" <eh...@gmail.com> writes:
> > macro:
> > #define svn_iter_break return svn_error_create(SVN_ERR_CANCELLED, NULL, NULL)
> >
> >
> > How does it work?
> > The iterator callbacks are called for each item in the set. (If the
> > set is ordered, we could consider adding an optional 'from_end'
> > boolean, to make it explicitly iterate forward or backward). If the
> > iteration is to be cancelled, the iterator callback returns
> > SVN_ERR_CANCELLED.
> >
> > Returning an error (including SVN_ERR_CANCELLED) from the iterator
> > callback acts like the C 'break;' statement:it stops the iteration.
> >
> > The drivers have a *completed boolean return value which is set to
> > TRUE if the iteration was completed successfully, or FALSE otherwise.
> >
> > The iterator driver return value is the one returned by the iterator
> > callback, except for SVN_ERR_CANCELLED in which case the driver
> > returns SVN_NO_ERROR.
>
> +1
>
> Maybe don't overload SVN_ERR_CANCELLED for this, but instead use a
> dedicated, self-documenting error code -- SVN_ERR_ITER_BREAK or
> something.

Good point: It might be returned while (accidentally) being in a loop.

> I'm a little worried about using an error object as the "stop" marker
> at all.  When we have nested iterations, the inner iteration will end
> up generating lots of error objects.  True, they will be cleared, but
> there's some overhead to just allocating something in the global pool.

> The only alternative I can think of would be that all iter '*_cb_t'
> functions would have a boolean '&stop' parameter, which they set to
> true when they want to break out of the iteration.

Well, to avoid tons of parameters, I decided to use an error return.
Also, current

  if (expression)
    break;

constructs can be easily rewritten to

  if (expression)
    svn_iter_break;

where svn_iter_break is a macro expanding to a return statement
returning the required error.  It would be a lot harder to implement
the same thing when using a parameter because it can have different
names in different functions.

> (I don't have hard numbers to back up this worry, so perhaps I'm just
> being a curmudgeon.)

Well, it's definitely not the common case for most loops, but I can't
be absolutely sure it's not ever. My feeling is that it's not though.

> I found your rewritten code more readable, btw.

Good. I'll continue along these lines then.

Thanks for the review.

bye,

Erik.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: RFC: svn_iter_* iteration drivers for common iterations

Posted by Karl Fogel <kf...@red-bean.com>.
"Erik Huelsmann" <eh...@gmail.com> writes:
> macro:
> #define svn_iter_break return svn_error_create(SVN_ERR_CANCELLED, NULL, NULL)
>
>
> How does it work?
> The iterator callbacks are called for each item in the set. (If the
> set is ordered, we could consider adding an optional 'from_end'
> boolean, to make it explicitly iterate forward or backward). If the
> iteration is to be cancelled, the iterator callback returns
> SVN_ERR_CANCELLED.
>
> Returning an error (including SVN_ERR_CANCELLED) from the iterator
> callback acts like the C 'break;' statement:it stops the iteration.
>
> The drivers have a *completed boolean return value which is set to
> TRUE if the iteration was completed successfully, or FALSE otherwise.
>
> The iterator driver return value is the one returned by the iterator
> callback, except for SVN_ERR_CANCELLED in which case the driver
> returns SVN_NO_ERROR.

+1

Maybe don't overload SVN_ERR_CANCELLED for this, but instead use a
dedicated, self-documenting error code -- SVN_ERR_ITER_BREAK or
something.

I'm a little worried about using an error object as the "stop" marker
at all.  When we have nested iterations, the inner iteration will end
up generating lots of error objects.  True, they will be cleared, but
there's some overhead to just allocating something in the global pool.

The only alternative I can think of would be that all iter '*_cb_t'
functions would have a boolean '&stop' parameter, which they set to
true when they want to break out of the iteration.

(I don't have hard numbers to back up this worry, so perhaps I'm just
being a curmudgeon.)

I found your rewritten code more readable, btw.

-Karl

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org