You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@subversion.apache.org by Kamesh Jayachandran <ka...@collab.net> on 2006/10/06 10:49:05 UTC

[PATCH][merge-tracking]sqlite mergeinfo indexing for bdb repositories

Hi All,
Find the attached patch and log.

With regards
Kamesh Jayachandran

Re: [PATCH][merge-tracking]sqlite mergeinfo indexing for bdb repositories

Posted by Daniel Rall <dl...@collab.net>.
On Fri, 13 Oct 2006, Kamesh Jayachandran wrote:
...
> >On Thu, 12 Oct 2006, Kamesh Jayachandran wrote:
...
> >>Let me leverage on it with the suggested change to fs_skel.
> >
> >Sounds good, Kamesh.  I'll commit the patch after you've made those
> >changes.
>
> Attaching the patch/log having the 'svn:txn-mergeinfo' as an atomic blob.
> 
> Focus on subversion/libsvn_fs_base/revs-txn.c alone as that alone 
> changed majorly in this new patch.

Nice work, Kamesh.  I committed the patch in r22048, after making some
tweaks to your last serialization/deserialization changes.

Re: [PATCH][merge-tracking]sqlite mergeinfo indexing for bdb repositories

Posted by Kamesh Jayachandran <ka...@collab.net>.
Daniel Rall wrote:
> On Thu, 12 Oct 2006, Kamesh Jayachandran wrote:
>
>   
>> Patch looks fine to me.
>> Let me leverage on it with the suggested change to fs_skel.
>>     
>
> Sounds good, Kamesh.  I'll commit the patch after you've made those
> changes.
>
> - Dan
>   
Attaching the patch/log having the 'svn:txn-mergeinfo' as an atomic blob.

Focus on subversion/libsvn_fs_base/revs-txn.c alone as that alone 
changed majorly in this new patch.

With regards
Kamesh Jayachandran

Re: [PATCH][merge-tracking]sqlite mergeinfo indexing for bdb repositories

Posted by Daniel Rall <dl...@collab.net>.
On Thu, 12 Oct 2006, Kamesh Jayachandran wrote:

> Patch looks fine to me.
> Let me leverage on it with the suggested change to fs_skel.

Sounds good, Kamesh.  I'll commit the patch after you've made those
changes.

- Dan

Re: [PATCH][merge-tracking]sqlite mergeinfo indexing for bdb repositories

Posted by Kamesh Jayachandran <ka...@collab.net>.
Patch looks fine to me.
Let me leverage on it with the suggested change to fs_skel.

Thanks.
With regards
Kamesh Jayachandran

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: [PATCH][merge-tracking]sqlite mergeinfo indexing for bdb repositories

Posted by Daniel Rall <dl...@collab.net>.
On Wed, 11 Oct 2006, Daniel Rall wrote:
...
> I chatted with Mike Pilato about this a bit.  If we're initially using
> sqlite for the merge info index, we should not be attempting to
> store/retrieve merge info properties as structured data!  They should
> be stored simply as atoms.
> 
> If we wanted to jump directly to storing merge info in BerkeleyDB,
> we'd need to replace all usage of the sqlite-based svn_fs_merge_info.h
> APIs in the BDB backend.  While we continue prototyping an end-to-end
> Merge Tracking solution, I would prefer to avoid this redundancy for
> the sake of simplicity.
...
> I've tweaked the patch accordingly, and updated the doc string for
> svn_fs_base__dag_commit_txn() (which was not updated in the previous
> patches).

I'm attaching the latest version of the patch, without removal of the
fs_skels.c changes.  Removing the fs_skels.c change requires
additional modifications to revs-txns.c.

> I'm also seeing 'stat_tests.py 20' fail over ra_local under BDB on the
> merge-tracking branch.  I have not confirmed whether that failure is
> related to these changes.  (I seem to recall you mentioning this in
> one of our previous conversations, but don't recall exactly what'cha
> said.)

I've confirmed that 'stat_tests.py --fs-type=bdb 20' fails both with
and without this patch.

Re: [PATCH][merge-tracking]sqlite mergeinfo indexing for bdb repositories

Posted by Daniel Rall <dl...@collab.net>.
On Thu, 12 Oct 2006, Kamesh Jayachandran wrote:

> Daniel Rall wrote:
> >On Wed, 11 Oct 2006, Kamesh Jayachandran wrote:
> >  
> >>Daniel Rall wrote:
...
> >>>Is it 1) appropriate and 2) necessary to push knowledge of skels
> >>>corresponding to specific properties (e.g. SVN_FS_PROP_TXN_MERGEINFO)
> >>>down into fs_skels.c (rather than knowledge which ends at the concept
> >>>of "proplists", as on trunk)?  To me, that file seems like a more
> >>>generic utility which shouldn't contain such specific knowledge.
> >>>      
> >>I think somehow we can't avoid that. Someway or the other parsing 
> >>code(which should sit inside the util/fs_skel.c) should know about 
> >>'SVN_FS_PROP_TXN_MERGEINFO'. Or we should treat the value of 
> >>'svn:txn-mergeinfo' as 'atom' and parse it outside?
> >>    
> >
> >I chatted with Mike Pilato about this a bit.  If we're initially using
> >sqlite for the merge info index, we should not be attempting to
> >store/retrieve merge info properties as structured data!  They should
> >be stored simply as atoms.
> >
> >If we wanted to jump directly to storing merge info in BerkeleyDB,
> >we'd need to replace all usage of the sqlite-based svn_fs_merge_info.h
> >APIs in the BDB backend.  While we continue prototyping an end-to-end
> >Merge Tracking solution, I would prefer to avoid this redundancy for
> >the sake of simplicity.
>
> Fine, I can implement 'svn:txn-mergeinfo' as like every other property 
> and put a temporary parsing function to understand this 'atom' and give 
> it as 'apr_hash_t'.

The apr_hash_t serialization you used looks great.

> >Along those lines, I question the safety of changing the signature of
> >svn_fs_base__dag_commit_txn() to remove its pool argument.  This
> >change makes the assumption that anything allocated from TRAIL->pool
> >has the same lifetime as TRAIL, which may not always be correct.
>
> Just out of curiosity would like to know when trail->pool would have a 
> short life time than trail itself so that come call back routines can 
> cause memory corruption? Is ok till we have BDB backed mergeinfo?

trail->pool shouldn't typically be destroyed before we're done using
trail.

> >While it is clearly okay for the current caller, txn_body_commit()
> >(which already passes in trail->pool), I hesitate to make this type of
> >interface change for a "protected" API (it could later cause bugs).
> >If the function was file-static (completely private), I would be in
> >favor of this.
> >
> >Also, the interface into svn_fs_base__dag_*() seems to consistently
> >use txn_id, rather than txn.  You seem to have made a change to txn to
> >accomodate svn_fs_merge_info__update_index().  Is this interface
> >discrepancy really worth it?
>
> Can I get svn_fs_txn_t* from txn_id?. I tried but got SIGABORT as we had 
> already opened a trail.

Okay.  I didn't explore why, but left it as you changed it (for the
time being).

Re: [PATCH][merge-tracking]sqlite mergeinfo indexing for bdb repositories

Posted by Kamesh Jayachandran <ka...@collab.net>.
Daniel Rall wrote:
> On Wed, 11 Oct 2006, Kamesh Jayachandran wrote:
>
>   
>> Daniel Rall wrote:
>>     
>>> Thanks for the patch, Kamesh.  I've made some changes to it and the
>>> change log message (attached), but I have some additional questions:
>>>
>>> Is it 1) appropriate and 2) necessary to push knowledge of skels
>>> corresponding to specific properties (e.g. SVN_FS_PROP_TXN_MERGEINFO)
>>> down into fs_skels.c (rather than knowledge which ends at the concept
>>> of "proplists", as on trunk)?  To me, that file seems like a more
>>> generic utility which shouldn't contain such specific knowledge.
>>>       
>> I think somehow we can't avoid that. Someway or the other parsing 
>> code(which should sit inside the util/fs_skel.c) should know about 
>> 'SVN_FS_PROP_TXN_MERGEINFO'. Or we should treat the value of 
>> 'svn:txn-mergeinfo' as 'atom' and parse it outside?
>>     
>
> I chatted with Mike Pilato about this a bit.  If we're initially using
> sqlite for the merge info index, we should not be attempting to
> store/retrieve merge info properties as structured data!  They should
> be stored simply as atoms.
>
>   

> If we wanted to jump directly to storing merge info in BerkeleyDB,
> we'd need to replace all usage of the sqlite-based svn_fs_merge_info.h
> APIs in the BDB backend.  While we continue prototyping an end-to-end
> Merge Tracking solution, I would prefer to avoid this redundancy for
> the sake of simplicity.
>   
Fine, I can implement 'svn:txn-mergeinfo' as like every other property 
and put a temporary parsing function to understand this 'atom' and give 
it as 'apr_hash_t'.

>
> ...
>   
>> I will go through you patch and get back.
>>     
>
> The remaining tweaks were mostly stylistic.
>
> Along those lines, I question the safety of changing the signature of
> svn_fs_base__dag_commit_txn() to remove its pool argument.  This
> change makes the assumption that anything allocated from TRAIL->pool
> has the same lifetime as TRAIL, which may not always be correct.
>   
Just out of curiosity would like to know when trail->pool would have a 
short life time than trail itself so that come call back routines can 
cause memory corruption? Is ok till we have BDB backed mergeinfo?
> While it is clearly okay for the current caller, txn_body_commit()
> (which already passes in trail->pool), I hesitate to make this type of
> interface change for a "protected" API (it could later cause bugs).
> If the function was file-static (completely private), I would be in
> favor of this.
>
> Also, the interface into svn_fs_base__dag_*() seems to consistently
> use txn_id, rather than txn.  You seem to have made a change to txn to
> accomodate svn_fs_merge_info__update_index().  Is this interface
> discrepancy really worth it?
>   
Can I get svn_fs_txn_t* from txn_id?. I tried but got SIGABORT as we had 
already opened a trail.

> I've tweaked the patch accordingly, and updated the doc string for
> svn_fs_base__dag_commit_txn() (which was not updated in the previous
> patches).
>
>
> I'm also seeing 'stat_tests.py 20' fail over ra_local under BDB on the
> merge-tracking branch.  I have not confirmed whether that failure is
> related to these changes.  (I seem to recall you mentioning this in
> one of our previous conversations, but don't recall exactly what'cha
> said.)
>   
It fails for BDB alone in trunk too. (Cause seems to be 'svn status -u' 
testcase expects update list in certain order which bdb does not.)

With regards
Kamesh Jayachandran

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: [PATCH][merge-tracking]sqlite mergeinfo indexing for bdb repositories

Posted by "C. Michael Pilato" <cm...@collab.net>.
Daniel Rall wrote:
> On Wed, 11 Oct 2006, Kamesh Jayachandran wrote:
> 
>> Daniel Rall wrote:
>>> Thanks for the patch, Kamesh.  I've made some changes to it and the
>>> change log message (attached), but I have some additional questions:
>>>
>>> Is it 1) appropriate and 2) necessary to push knowledge of skels
>>> corresponding to specific properties (e.g. SVN_FS_PROP_TXN_MERGEINFO)
>>> down into fs_skels.c (rather than knowledge which ends at the concept
>>> of "proplists", as on trunk)?  To me, that file seems like a more
>>> generic utility which shouldn't contain such specific knowledge.
>> I think somehow we can't avoid that. Someway or the other parsing 
>> code(which should sit inside the util/fs_skel.c) should know about 
>> 'SVN_FS_PROP_TXN_MERGEINFO'. Or we should treat the value of 
>> 'svn:txn-mergeinfo' as 'atom' and parse it outside?
> 
> I chatted with Mike Pilato about this a bit.  If we're initially using
> sqlite for the merge info index, we should not be attempting to
> store/retrieve merge info properties as structured data!  They should
> be stored simply as atoms.
> 
> If we wanted to jump directly to storing merge info in BerkeleyDB,
> we'd need to replace all usage of the sqlite-based svn_fs_merge_info.h
> APIs in the BDB backend.  While we continue prototyping an end-to-end
> Merge Tracking solution, I would prefer to avoid this redundancy for
> the sake of simplicity.

Yeah, I think the key consideration here is that many times, the code
that parses property lists doesn't care about the individual property
values or their interpretation as structured information.
svn_fs_txn_proplist() doesn't care that there is such a thing as
merge-tracking information.  By breaking down that one property into
sub-skels, you add computational overhead that is not generally needed.
 Keep the merge-tracking property as a single blob of data to appease
the majority of the consumers of that information; parse it into a
structure for particular interfaces that require it (but at a layer
above the skels stuff).

-- 
C. Michael Pilato <cm...@collab.net>
CollabNet   <>   www.collab.net   <>   Distributed Development On Demand


Re: [PATCH][merge-tracking]sqlite mergeinfo indexing for bdb repositories

Posted by Daniel Rall <dl...@collab.net>.
On Wed, 11 Oct 2006, Kamesh Jayachandran wrote:

> Daniel Rall wrote:
> >Thanks for the patch, Kamesh.  I've made some changes to it and the
> >change log message (attached), but I have some additional questions:
> >
> >Is it 1) appropriate and 2) necessary to push knowledge of skels
> >corresponding to specific properties (e.g. SVN_FS_PROP_TXN_MERGEINFO)
> >down into fs_skels.c (rather than knowledge which ends at the concept
> >of "proplists", as on trunk)?  To me, that file seems like a more
> >generic utility which shouldn't contain such specific knowledge.
>
> I think somehow we can't avoid that. Someway or the other parsing 
> code(which should sit inside the util/fs_skel.c) should know about 
> 'SVN_FS_PROP_TXN_MERGEINFO'. Or we should treat the value of 
> 'svn:txn-mergeinfo' as 'atom' and parse it outside?

I chatted with Mike Pilato about this a bit.  If we're initially using
sqlite for the merge info index, we should not be attempting to
store/retrieve merge info properties as structured data!  They should
be stored simply as atoms.

If we wanted to jump directly to storing merge info in BerkeleyDB,
we'd need to replace all usage of the sqlite-based svn_fs_merge_info.h
APIs in the BDB backend.  While we continue prototyping an end-to-end
Merge Tracking solution, I would prefer to avoid this redundancy for
the sake of simplicity.


...
> I will go through you patch and get back.

The remaining tweaks were mostly stylistic.

Along those lines, I question the safety of changing the signature of
svn_fs_base__dag_commit_txn() to remove its pool argument.  This
change makes the assumption that anything allocated from TRAIL->pool
has the same lifetime as TRAIL, which may not always be correct.
While it is clearly okay for the current caller, txn_body_commit()
(which already passes in trail->pool), I hesitate to make this type of
interface change for a "protected" API (it could later cause bugs).
If the function was file-static (completely private), I would be in
favor of this.

Also, the interface into svn_fs_base__dag_*() seems to consistently
use txn_id, rather than txn.  You seem to have made a change to txn to
accomodate svn_fs_merge_info__update_index().  Is this interface
discrepancy really worth it?

I've tweaked the patch accordingly, and updated the doc string for
svn_fs_base__dag_commit_txn() (which was not updated in the previous
patches).


I'm also seeing 'stat_tests.py 20' fail over ra_local under BDB on the
merge-tracking branch.  I have not confirmed whether that failure is
related to these changes.  (I seem to recall you mentioning this in
one of our previous conversations, but don't recall exactly what'cha
said.)

Re: [PATCH][merge-tracking]sqlite mergeinfo indexing for bdb repositories

Posted by Kamesh Jayachandran <ka...@collab.net>.
Daniel Rall wrote:
> Thanks for the patch, Kamesh.  I've made some changes to it and the
> change log message (attached), but I have some additional questions:
>
> Is it 1) appropriate and 2) necessary to push knowledge of skels
> corresponding to specific properties (e.g. SVN_FS_PROP_TXN_MERGEINFO)
> down into fs_skels.c (rather than knowledge which ends at the concept
> of "proplists", as on trunk)?  To me, that file seems like a more
> generic utility which shouldn't contain such specific knowledge.
>   
I think somehow we can't avoid that. Someway or the other parsing 
code(which should sit inside the util/fs_skel.c) should know about 
'SVN_FS_PROP_TXN_MERGEINFO'. Or we should treat the value of 
'svn:txn-mergeinfo' as 'atom' and parse it outside?
> In fs_skels.c:is_valid_proplist_skel(), the property name and length
> used to check whether we have a transaction merge info property are
> being set *after* being used, to values which always lag one iteration
> behind.  Is this correct?
>   
Yes.
I will go through you patch and get back.

With regards
Kamesh Jayachandran

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: [PATCH][merge-tracking]sqlite mergeinfo indexing for bdb repositories

Posted by Daniel Rall <dl...@collab.net>.
Thanks for the patch, Kamesh.  I've made some changes to it and the
change log message (attached), but I have some additional questions:

Is it 1) appropriate and 2) necessary to push knowledge of skels
corresponding to specific properties (e.g. SVN_FS_PROP_TXN_MERGEINFO)
down into fs_skels.c (rather than knowledge which ends at the concept
of "proplists", as on trunk)?  To me, that file seems like a more
generic utility which shouldn't contain such specific knowledge.

In fs_skels.c:is_valid_proplist_skel(), the property name and length
used to check whether we have a transaction merge info property are
being set *after* being used, to values which always lag one iteration
behind.  Is this correct?