You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@subversion.apache.org by Ivan Zhakov <iv...@visualsvn.com> on 2015/05/27 11:28:27 UTC

Double cache lookup in svn_fs_fs__rep_contents_dir_entry()

It seems directory cache checked twice in function
svn_fs_fs__rep_contents_dir_entry:
[[[
svn_error_t *
svn_fs_fs__rep_contents_dir_entry(svn_fs_dirent_t **dirent,
                                  svn_fs_t *fs,
                                  node_revision_t *noderev,
                                  const char *name,
                                  apr_pool_t *result_pool,
                                  apr_pool_t *scratch_pool)
{
  svn_boolean_t found = FALSE;

  /* find the cache we may use */
  pair_cache_key_t pair_key = { 0 };
  const void *key;
  svn_cache__t *cache = locate_dir_cache(fs, &key, &pair_key, noderev,
                                         scratch_pool);
  if (cache)
    {
[...]
      SVN_ERR(svn_cache__get_partial((void **)dirent,
                                     &found,
                                     cache,
                                     key,
                                     svn_fs_fs__extract_dir_entry,
                                     &baton,
                                     result_pool));
    }

  /* fetch data from disk if we did not find it in the cache */
  if (! found)
    {
[...]

      /* read the dir from the file system. It will probably be put it
         into the cache for faster lookup in future calls. */
      SVN_ERR(svn_fs_fs__rep_contents_dir(&entries, fs, noderev,
                                          scratch_pool, scratch_pool));

[...]
    }

  return SVN_NO_ERROR;
}
]]]

And svn_fs_fs__rep_contents_dir() functions checks the dir cache again.

Is my analysis correct or I missed something important?

-- 
Ivan Zhakov

Re: Double cache lookup in svn_fs_fs__rep_contents_dir_entry()

Posted by Ivan Zhakov <iv...@visualsvn.com>.
On 27 May 2015 at 13:23, Stefan Fuhrmann <st...@wandisco.com> wrote:
> On Wed, May 27, 2015 at 12:11 PM, Ivan Zhakov <iv...@visualsvn.com> wrote:
>>
>> On 27 May 2015 at 12:49, Stefan Fuhrmann <st...@wandisco.com>
>> wrote:
>> > On Wed, May 27, 2015 at 11:28 AM, Ivan Zhakov <iv...@visualsvn.com>
>> > wrote:
>> >>
>> >> It seems directory cache checked twice in function
>> >> svn_fs_fs__rep_contents_dir_entry:
>> >> [[[
>> >> svn_error_t *
>> >> svn_fs_fs__rep_contents_dir_entry(svn_fs_dirent_t **dirent,
>> >>                                   svn_fs_t *fs,
>> >>                                   node_revision_t *noderev,
>> >>                                   const char *name,
>> >>                                   apr_pool_t *result_pool,
>> >>                                   apr_pool_t *scratch_pool)
>> >> {
>> >>   svn_boolean_t found = FALSE;
>> >>
>> >>   /* find the cache we may use */
>> >>   pair_cache_key_t pair_key = { 0 };
>> >>   const void *key;
>> >>   svn_cache__t *cache = locate_dir_cache(fs, &key, &pair_key, noderev,
>> >>                                          scratch_pool);
>> >>   if (cache)
>> >>     {
>> >> [...]
>> >>       SVN_ERR(svn_cache__get_partial((void **)dirent,
>> >>                                      &found,
>> >>                                      cache,
>> >>                                      key,
>> >>                                      svn_fs_fs__extract_dir_entry,
>> >>                                      &baton,
>> >>                                      result_pool));
>> >>     }
>> >>
>> >>   /* fetch data from disk if we did not find it in the cache */
>> >>   if (! found)
>> >>     {
>> >> [...]
>> >>
>> >>       /* read the dir from the file system. It will probably be put it
>> >>          into the cache for faster lookup in future calls. */
>> >>       SVN_ERR(svn_fs_fs__rep_contents_dir(&entries, fs, noderev,
>> >>                                           scratch_pool, scratch_pool));
>> >>
>> >> [...]
>> >>     }
>> >>
>> >>   return SVN_NO_ERROR;
>> >> }
>> >> ]]]
>> >>
>> >> And svn_fs_fs__rep_contents_dir() functions checks the dir cache again.
>> >>
>> >> Is my analysis correct or I missed something important?
>> >
>> >
>> > Your analysis is correct and the code is slightly less efficient
>> > that it could be. Feel free to add e.g. a "bypass_cache_lookup"
>> > flag to the svn_fs_fs__rep_contents_dir() signature.
>> >
>> Thanks for confirming the issue. I'll fix it then.
>
>
> Thanks.
>
Fixed in 1681974.

-- 
Ivan Zhakov

Re: Double cache lookup in svn_fs_fs__rep_contents_dir_entry()

Posted by Stefan Fuhrmann <st...@wandisco.com>.
On Wed, May 27, 2015 at 12:11 PM, Ivan Zhakov <iv...@visualsvn.com> wrote:

> On 27 May 2015 at 12:49, Stefan Fuhrmann <st...@wandisco.com>
> wrote:
> > On Wed, May 27, 2015 at 11:28 AM, Ivan Zhakov <iv...@visualsvn.com>
> wrote:
> >>
> >> It seems directory cache checked twice in function
> >> svn_fs_fs__rep_contents_dir_entry:
> >> [[[
> >> svn_error_t *
> >> svn_fs_fs__rep_contents_dir_entry(svn_fs_dirent_t **dirent,
> >>                                   svn_fs_t *fs,
> >>                                   node_revision_t *noderev,
> >>                                   const char *name,
> >>                                   apr_pool_t *result_pool,
> >>                                   apr_pool_t *scratch_pool)
> >> {
> >>   svn_boolean_t found = FALSE;
> >>
> >>   /* find the cache we may use */
> >>   pair_cache_key_t pair_key = { 0 };
> >>   const void *key;
> >>   svn_cache__t *cache = locate_dir_cache(fs, &key, &pair_key, noderev,
> >>                                          scratch_pool);
> >>   if (cache)
> >>     {
> >> [...]
> >>       SVN_ERR(svn_cache__get_partial((void **)dirent,
> >>                                      &found,
> >>                                      cache,
> >>                                      key,
> >>                                      svn_fs_fs__extract_dir_entry,
> >>                                      &baton,
> >>                                      result_pool));
> >>     }
> >>
> >>   /* fetch data from disk if we did not find it in the cache */
> >>   if (! found)
> >>     {
> >> [...]
> >>
> >>       /* read the dir from the file system. It will probably be put it
> >>          into the cache for faster lookup in future calls. */
> >>       SVN_ERR(svn_fs_fs__rep_contents_dir(&entries, fs, noderev,
> >>                                           scratch_pool, scratch_pool));
> >>
> >> [...]
> >>     }
> >>
> >>   return SVN_NO_ERROR;
> >> }
> >> ]]]
> >>
> >> And svn_fs_fs__rep_contents_dir() functions checks the dir cache again.
> >>
> >> Is my analysis correct or I missed something important?
> >
> >
> > Your analysis is correct and the code is slightly less efficient
> > that it could be. Feel free to add e.g. a "bypass_cache_lookup"
> > flag to the svn_fs_fs__rep_contents_dir() signature.
> >
> Thanks for confirming the issue. I'll fix it then.
>

Thanks.


> > However, the actual gains from this should be minimal because
> > the failed lookup is easily dwarfed by the directory parsing time.
> > Do you have a specific workload where the double lookup becomes
> > more noticeable?
> >
> No, I don't have any specific workloads. I've noticed it just by
> reading code around.
>

Ack.

-- Stefan^2.

Re: Double cache lookup in svn_fs_fs__rep_contents_dir_entry()

Posted by Ivan Zhakov <iv...@visualsvn.com>.
On 27 May 2015 at 12:49, Stefan Fuhrmann <st...@wandisco.com> wrote:
> On Wed, May 27, 2015 at 11:28 AM, Ivan Zhakov <iv...@visualsvn.com> wrote:
>>
>> It seems directory cache checked twice in function
>> svn_fs_fs__rep_contents_dir_entry:
>> [[[
>> svn_error_t *
>> svn_fs_fs__rep_contents_dir_entry(svn_fs_dirent_t **dirent,
>>                                   svn_fs_t *fs,
>>                                   node_revision_t *noderev,
>>                                   const char *name,
>>                                   apr_pool_t *result_pool,
>>                                   apr_pool_t *scratch_pool)
>> {
>>   svn_boolean_t found = FALSE;
>>
>>   /* find the cache we may use */
>>   pair_cache_key_t pair_key = { 0 };
>>   const void *key;
>>   svn_cache__t *cache = locate_dir_cache(fs, &key, &pair_key, noderev,
>>                                          scratch_pool);
>>   if (cache)
>>     {
>> [...]
>>       SVN_ERR(svn_cache__get_partial((void **)dirent,
>>                                      &found,
>>                                      cache,
>>                                      key,
>>                                      svn_fs_fs__extract_dir_entry,
>>                                      &baton,
>>                                      result_pool));
>>     }
>>
>>   /* fetch data from disk if we did not find it in the cache */
>>   if (! found)
>>     {
>> [...]
>>
>>       /* read the dir from the file system. It will probably be put it
>>          into the cache for faster lookup in future calls. */
>>       SVN_ERR(svn_fs_fs__rep_contents_dir(&entries, fs, noderev,
>>                                           scratch_pool, scratch_pool));
>>
>> [...]
>>     }
>>
>>   return SVN_NO_ERROR;
>> }
>> ]]]
>>
>> And svn_fs_fs__rep_contents_dir() functions checks the dir cache again.
>>
>> Is my analysis correct or I missed something important?
>
>
> Your analysis is correct and the code is slightly less efficient
> that it could be. Feel free to add e.g. a "bypass_cache_lookup"
> flag to the svn_fs_fs__rep_contents_dir() signature.
>
Thanks for confirming the issue. I'll fix it then.

> However, the actual gains from this should be minimal because
> the failed lookup is easily dwarfed by the directory parsing time.
> Do you have a specific workload where the double lookup becomes
> more noticeable?
>
No, I don't have any specific workloads. I've noticed it just by
reading code around.

-- 
Ivan Zhakov

Re: Double cache lookup in svn_fs_fs__rep_contents_dir_entry()

Posted by Stefan Fuhrmann <st...@wandisco.com>.
On Wed, May 27, 2015 at 11:28 AM, Ivan Zhakov <iv...@visualsvn.com> wrote:

> It seems directory cache checked twice in function
> svn_fs_fs__rep_contents_dir_entry:
> [[[
> svn_error_t *
> svn_fs_fs__rep_contents_dir_entry(svn_fs_dirent_t **dirent,
>                                   svn_fs_t *fs,
>                                   node_revision_t *noderev,
>                                   const char *name,
>                                   apr_pool_t *result_pool,
>                                   apr_pool_t *scratch_pool)
> {
>   svn_boolean_t found = FALSE;
>
>   /* find the cache we may use */
>   pair_cache_key_t pair_key = { 0 };
>   const void *key;
>   svn_cache__t *cache = locate_dir_cache(fs, &key, &pair_key, noderev,
>                                          scratch_pool);
>   if (cache)
>     {
> [...]
>       SVN_ERR(svn_cache__get_partial((void **)dirent,
>                                      &found,
>                                      cache,
>                                      key,
>                                      svn_fs_fs__extract_dir_entry,
>                                      &baton,
>                                      result_pool));
>     }
>
>   /* fetch data from disk if we did not find it in the cache */
>   if (! found)
>     {
> [...]
>
>       /* read the dir from the file system. It will probably be put it
>          into the cache for faster lookup in future calls. */
>       SVN_ERR(svn_fs_fs__rep_contents_dir(&entries, fs, noderev,
>                                           scratch_pool, scratch_pool));
>
> [...]
>     }
>
>   return SVN_NO_ERROR;
> }
> ]]]
>
> And svn_fs_fs__rep_contents_dir() functions checks the dir cache again.
>
> Is my analysis correct or I missed something important?
>

Your analysis is correct and the code is slightly less efficient
that it could be. Feel free to add e.g. a "bypass_cache_lookup"
flag to the svn_fs_fs__rep_contents_dir() signature.

However, the actual gains from this should be minimal because
the failed lookup is easily dwarfed by the directory parsing time.
Do you have a specific workload where the double lookup becomes
more noticeable?

-- Stefan^2.