Posted to dev@subversion.apache.org by Kamesh Jayachandran <ka...@collab.net> on 2008/01/25 13:47:25 UTC

Proposal to implement get_commit_and_merge_ranges from FS

Hi All,

Problem: Implement the get_commit_and_merge_ranges report from mergeinfo
in the FS.

Interface:
svn_fs_fs_get_commit_and_merge_ranges
(apr_array_header_t **merge_ranges_list,
 apr_array_header_t **commit_rangelist,
 svn_fs_root_t *root,
 const char *merge_target,
 const char *merge_source,
 svn_revnum_t min_commit_rev,
 svn_revnum_t max_commit_rev,
 svn_mergeinfo_inheritance_t inherit,
 apr_pool_t *pool)

Algorithm:
1) Record svn:mergeinfo_added_on_targets in this commit as a transaction
  property. The format of this property's value is a serialized hash of
  {/target1 -> '/src1: rR,rX-Y\n
                /src2: rS, rA-B\n',
   /target2 -> '/src1: rR,rX-Y\n
                /src2: rS, rA-B\n',
  }

2) Run something like 'svn_repos_history2' between revisions min_commit_rev
  and max_commit_rev to get the commit revs where merge_source has been
  merged to merge_target, along with their merge ranges.
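
A minimal sketch of the two steps above, in Python rather than C for
brevity. It assumes the per-commit records are already available as an
in-memory dict (revision -> {target -> value string}); that dict, the
helper names, and the inclusive treatment of min_commit_rev and
max_commit_rev are assumptions made for illustration, not part of the
proposal. The 'inherit' argument is ignored here.

```python
import re


def parse_rangelist(text):
    """Parse 'r5, r7-9' into [(5, 5), (7, 9)]; the 'r' prefix is optional."""
    ranges = []
    for item in text.split(","):
        item = item.strip()
        if not item:
            continue
        match = re.fullmatch(r"r?(\d+)(?:-r?(\d+))?", item)
        if match is None:
            raise ValueError("malformed revision range: %r" % item)
        start = int(match.group(1))
        end = int(match.group(2)) if match.group(2) else start
        if end < start:
            raise ValueError("reversed revision range: %r" % item)
        ranges.append((start, end))
    return ranges


def parse_mergeinfo_added(value):
    """Parse '/src1: r5,r7-9\\n/src2: r3\\n' into {source: rangelist}."""
    result = {}
    for line in value.splitlines():
        line = line.strip()
        if not line:
            continue
        source, sep, rangelist = line.rpartition(":")
        if not sep or not source.strip():
            raise ValueError("malformed mergeinfo line: %r" % line)
        result[source.strip()] = parse_rangelist(rangelist)
    return result


def get_commit_and_merge_ranges(added_by_rev, merge_target, merge_source,
                                min_commit_rev, max_commit_rev):
    """Return (merge_ranges_list, commit_revs) for one target/source pair.

    added_by_rev is a hypothetical stand-in for the stored per-commit data.
    """
    merge_ranges_list = []
    commit_revs = []
    for rev in range(min_commit_rev, max_commit_rev + 1):
        per_target = added_by_rev.get(rev)
        if not per_target or merge_target not in per_target:
            continue
        added = parse_mergeinfo_added(per_target[merge_target])
        if merge_source in added:
            commit_revs.append(rev)
            merge_ranges_list.append(added[merge_source])
    return merge_ranges_list, commit_revs
```

The parser raises ValueError on malformed input instead of guessing,
which is one way to keep a hand-edited value from breaking the caller.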


Possible problems:
Storing it as a revprop makes this data mutable, which may be problematic
if someone maliciously writes data that causes the hash parser to break.


Possible solutions:
We can maintain this per-commit data in a separate place, other than where
the revprops sit, and provide mechanisms to retrieve it from that location.


Want to know what others think about this.


With regards
Kamesh Jayachandran

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Proposal to implement get_commit_and_merge_ranges from FS

Posted by David Glasser <gl...@davidglasser.net>.
On Jan 27, 2008 11:46 AM, Mark Phippard <ma...@gmail.com> wrote:
>
> On Jan 27, 2008 1:36 PM, David Glasser <gl...@davidglasser.net> wrote:
> > On Jan 25, 2008 5:47 AM, Kamesh Jayachandran <ka...@collab.net> wrote:
> > > Hi All,
> > >
> > > Problem: Implement get_commit_and_merge_ranges report from a mergeinfo
> > > in FS.
> > >
> > > Interface:
> > > svn_fs_fs_get_commit_and_merge_ranges
> > > (apr_array_header_t **merge_ranges_list,
> > >  apr_array_header_t **commit_rangelist,
> > >  svn_fs_root_t *root,
> > >  const char *merge_target,
> > >  const char *merge_source,
> > >  svn_revnum_t min_commit_rev,
> > >  svn_revnum_t max_commit_rev,
> > >  svn_mergeinfo_inheritance_t inherit,
> > >  apr_pool_t *pool)
> > >
> > > Algorithm:
> > > 1)Record svn:mergeinfo_added_on_targets in this commit as a transaction
> > >   property. Format of this property's value is serialized hash of
> > >   {/target1 -> '/src1: rR,rX-Y\n
> > >                 /src2: rS, rA-B\n',
> > >    /target2 -> '/src1: rR,rX-Y\n
> > >                 /src2: rS, rA-B\n',
> > >   }
> > >
> > > 2)Run something like 'svn_repos_history2' between revisions min_commit_rev
> > >   and max_commit_rev to get commits revs where merge_source has been merged
> > >   to merge_target along with their merge ranges.
> > >
> > >
> > > Possible problems:
> > > Storing it as revprop makes this data to be mutable, which may be
> > > problematic
> > > if someone maliciously write some data that causes hash parser to break.
> > >
> > >
> > > Possible solutions:
> > > We can maintain this per commit data in a separate place other than the one
> > > where revprops sit. Provide mechanisms to retrieve from this location.
> > >
> > >
> > > Want to know what others think about this.
> >
> > I guess what concerns me about this is that it's repository-global, so
> > unlike most svn operations, in a repository like the ASF (say) the
> > queries for one project are going to have to look through all the data
> > in the rev range for all the other projects.
>
> My understanding was that the existing merge algorithm determines what
> revisions need to be merged.  Once the revisions have been determined,
> it then needs to check if any of those revisions are "reflective".
> That does not sound like a repository wide scan to me.

No, the existing algorithm might say something like "r1000-4000 need
to be merged".  Maybe only five of these revisions actually change the
project/branch in question, but the way the diff algorithm works, this
doesn't matter, since it isn't iterating over the revisions.  Kamesh's
design for his new API would then make us iterate over all these
"mergeinfo" files/revprops for all the revs from 1000-4000.
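
To make the cost concrete, here is a small sketch in Python. The record
store is modeled as a plain dict of revision -> set of targets touched;
that model and the function name are assumptions for illustration only.
It counts how many per-revision records a range scan looks at versus how
many actually concern the branch being merged.

```python
def scan_cost(targets_by_rev, merge_target, min_rev, max_rev):
    """Return (records_examined, records_relevant) for a linear range scan."""
    examined = 0
    relevant = 0
    for rev in range(min_rev, max_rev + 1):
        examined += 1  # one lookup per revision, whatever project it touched
        if merge_target in targets_by_rev.get(rev, ()):
            relevant += 1
    return examined, relevant
```

With five relevant commits in r1000-4000, such a scan still performs one
lookup for each of the 3001 revisions in the range.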

--dave

-- 
David Glasser | glasser@davidglasser.net | http://www.davidglasser.net/


Re: Proposal to implement get_commit_and_merge_ranges from FS

Posted by Mark Phippard <ma...@gmail.com>.
On Jan 27, 2008 1:36 PM, David Glasser <gl...@davidglasser.net> wrote:
> On Jan 25, 2008 5:47 AM, Kamesh Jayachandran <ka...@collab.net> wrote:
> > Hi All,
> >
> > Problem: Implement get_commit_and_merge_ranges report from a mergeinfo
> > in FS.
> >
> > Interface:
> > svn_fs_fs_get_commit_and_merge_ranges
> > (apr_array_header_t **merge_ranges_list,
> >  apr_array_header_t **commit_rangelist,
> >  svn_fs_root_t *root,
> >  const char *merge_target,
> >  const char *merge_source,
> >  svn_revnum_t min_commit_rev,
> >  svn_revnum_t max_commit_rev,
> >  svn_mergeinfo_inheritance_t inherit,
> >  apr_pool_t *pool)
> >
> > Algorithm:
> > 1)Record svn:mergeinfo_added_on_targets in this commit as a transaction
> >   property. Format of this property's value is serialized hash of
> >   {/target1 -> '/src1: rR,rX-Y\n
> >                 /src2: rS, rA-B\n',
> >    /target2 -> '/src1: rR,rX-Y\n
> >                 /src2: rS, rA-B\n',
> >   }
> >
> > 2)Run something like 'svn_repos_history2' between revisions min_commit_rev
> >   and max_commit_rev to get commits revs where merge_source has been merged
> >   to merge_target along with their merge ranges.
> >
> >
> > Possible problems:
> > Storing it as revprop makes this data to be mutable, which may be
> > problematic
> > if someone maliciously write some data that causes hash parser to break.
> >
> >
> > Possible solutions:
> > We can maintain this per commit data in a separate place other than the one
> > where revprops sit. Provide mechanisms to retrieve from this location.
> >
> >
> > Want to know what others think about this.
>
> I guess what concerns me about this is that it's repository-global, so
> unlike most svn operations, in a repository like the ASF (say) the
> queries for one project are going to have to look through all the data
> in the rev range for all the other projects.

My understanding was that the existing merge algorithm determines what
revisions need to be merged.  Once the revisions have been determined,
it then needs to check if any of those revisions are "reflective".
That does not sound like a repository wide scan to me.

You have a valid point/question in your other mail though: what are
the use cases this would solve that the reintegrate option would
not?

-- 
Thanks

Mark Phippard
http://markphip.blogspot.com/


Re: Proposal to implement get_commit_and_merge_ranges from FS

Posted by David Glasser <gl...@davidglasser.net>.
On Jan 25, 2008 5:47 AM, Kamesh Jayachandran <ka...@collab.net> wrote:
> Hi All,
>
> Problem: Implement get_commit_and_merge_ranges report from a mergeinfo
> in FS.
>
> Interface:
> svn_fs_fs_get_commit_and_merge_ranges
> (apr_array_header_t **merge_ranges_list,
>  apr_array_header_t **commit_rangelist,
>  svn_fs_root_t *root,
>  const char *merge_target,
>  const char *merge_source,
>  svn_revnum_t min_commit_rev,
>  svn_revnum_t max_commit_rev,
>  svn_mergeinfo_inheritance_t inherit,
>  apr_pool_t *pool)
>
> Algorithm:
> 1)Record svn:mergeinfo_added_on_targets in this commit as a transaction
>   property. Format of this property's value is serialized hash of
>   {/target1 -> '/src1: rR,rX-Y\n
>                 /src2: rS, rA-B\n',
>    /target2 -> '/src1: rR,rX-Y\n
>                 /src2: rS, rA-B\n',
>   }
>
> 2)Run something like 'svn_repos_history2' between revisions min_commit_rev
>   and max_commit_rev to get commits revs where merge_source has been merged
>   to merge_target along with their merge ranges.
>
>
> Possible problems:
> Storing it as revprop makes this data to be mutable, which may be
> problematic
> if someone maliciously write some data that causes hash parser to break.
>
>
> Possible solutions:
> We can maintain this per commit data in a separate place other than the one
> where revprops sit. Provide mechanisms to retrieve from this location.
>
>
> Want to know what others think about this.

I guess what concerns me about this is that it's repository-global, so
unlike most svn operations, in a repository like the ASF (say) the
queries for one project are going to have to look through all the data
in the rev range for all the other projects.

--dave

-- 
David Glasser | glasser@davidglasser.net | http://www.davidglasser.net/


Re: Proposal to implement get_commit_and_merge_ranges from FS

Posted by Mark Phippard <ma...@gmail.com>.
On Jan 26, 2008 10:08 AM, Kamesh Jayachandran <ka...@collab.net> wrote:
>
> > If we changed our mergeinfo handling to not merge changes to the
> > mergeinfo property and instead just record direct mergeinfo, then we
> > could get the information you need by doing a property diff for that
> > revision.  Right?  Code we already have somewhere.
> >
>
> No impact. What I try to implement is equivalent of 'mergeinfo_changed'
> table in issue-2897 branch.
>
> This information is retrievable by doing a prop diff with earlier revs.
> But would be very complicated and performance intensive.
>
>
> > What impact would it have on our code if we did not carry around the
> > indirect mergeinfo?  Do we currently rely on this information as a
> > form of cache ... or rather to save us from crawling the fs?
> >
> >
>
> I don't rely on direct/indirect mergeinfo, I just bother about what is
> new merge in this commit.

Yes, but the only reason that is hard today is that the change in the
mergeinfo when you do a commit contains the information on what was
merged, but typically also includes changes to the indirect
mergeinfo.  If it did not contain the latter, then a simple diff of
the property would tell you what was merged.

Of course elision would probably make that not entirely true.
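
A minimal sketch of that property diff, in Python, assuming the old and
new mergeinfo for a path have already been parsed into dicts of
source -> list of inclusive (start, end) ranges. The function names are
hypothetical, and the sketch deliberately ignores inherited mergeinfo and
elision, which is exactly where the real answer could differ.

```python
def expand(ranges):
    """Turn [(1, 3), (7, 7)] into the set {1, 2, 3, 7}."""
    revs = set()
    for start, end in ranges:
        revs.update(range(start, end + 1))
    return revs


def compress(revs):
    """Turn a set of revisions back into sorted inclusive ranges."""
    ranges = []
    for rev in sorted(revs):
        if ranges and rev == ranges[-1][1] + 1:
            ranges[-1] = (ranges[-1][0], rev)  # extend the current run
        else:
            ranges.append((rev, rev))          # start a new run
    return ranges


def mergeinfo_added(old, new):
    """Return the ranges present in `new` but not in `old`, per source."""
    added = {}
    for source, new_ranges in new.items():
        extra = expand(new_ranges) - expand(old.get(source, []))
        if extra:
            added[source] = compress(extra)
    return added
```

Expanding ranges into sets is simple but memory-hungry for large ranges;
an interval-based subtraction would be the obvious refinement.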

-- 
Thanks

Mark Phippard
http://markphip.blogspot.com/


Re: Proposal to implement get_commit_and_merge_ranges from FS

Posted by Kamesh Jayachandran <ka...@collab.net>.
> If we changed our mergeinfo handling to not merge changes to the
> mergeinfo property and instead just record direct mergeinfo, then we
> could get the information you need by doing a property diff for that
> revision.  Right?  Code we already have somewhere.
>   

No impact. What I am trying to implement is the equivalent of the
'mergeinfo_changed' table in the issue-2897 branch.

This information is retrievable by doing a prop diff with earlier revs,
but that would be very complicated and performance intensive.


> What impact would it have on our code if we did not carry around the
> indirect mergeinfo?  Do we currently rely on this information as a
> form of cache ... or rather to save us from crawling the fs?
>
>   

I don't rely on direct/indirect mergeinfo; I only care about what is
newly merged in this commit.

With regards
Kamesh Jayachandran






Re: Proposal to implement get_commit_and_merge_ranges from FS

Posted by Mark Phippard <ma...@gmail.com>.
On Jan 26, 2008 9:52 AM, Kamesh Jayachandran <ka...@collab.net> wrote:
> > So, just to clarify, what you're looking for is mapping of revision
> > numbers (across a range) to mergeinfo changes made in that revision?
>
> Yes.
>
> > Why only mergeinfo additions -- aren't the subtractions interesting, too?
> >
>
> I am not interested in substractions as we don't support repeat 'merge
> reversal'.

If we changed our mergeinfo handling to not merge changes to the
mergeinfo property and instead just record direct mergeinfo, then we
could get the information you need by doing a property diff for that
revision.  Right?  Code we already have somewhere.

What impact would it have on our code if we did not carry around the
indirect mergeinfo?  Do we currently rely on this information as a
form of cache ... or rather to save us from crawling the fs?

-- 
Thanks

Mark Phippard
http://markphip.blogspot.com/


Re: Proposal to implement get_commit_and_merge_ranges from FS

Posted by Kamesh Jayachandran <ka...@collab.net>.
C. Michael Pilato wrote:
> Kamesh Jayachandran wrote:
>> Hi All,
>>
>> Problem: Implement get_commit_and_merge_ranges report from a 
>> mergeinfo in FS.
>>
>> Interface:
>> svn_fs_fs_get_commit_and_merge_ranges
>> (apr_array_header_t **merge_ranges_list,
>> apr_array_header_t **commit_rangelist,
>> svn_fs_root_t *root,
>> const char *merge_target,
>> const char *merge_source,
>> svn_revnum_t min_commit_rev,
>> svn_revnum_t max_commit_rev,
>> svn_mergeinfo_inheritance_t inherit,
>> apr_pool_t *pool)
>>
>> Algorithm:
>> 1)Record svn:mergeinfo_added_on_targets in this commit as a transaction
>>  property. Format of this property's value is serialized hash of
>>  {/target1 -> '/src1: rR,rX-Y\n
>>                /src2: rS, rA-B\n',
>>   /target2 -> '/src1: rR,rX-Y\n
>>                /src2: rS, rA-B\n',
>>  }
>
> Oh, you mean exactly like what I *just removed from the BDB code* ?

Somewhat close. It was storing the *full mergeinfo*; now I need only
'mergeinfo-added'.

>
>> 2)Run something like 'svn_repos_history2' between revisions 
>> min_commit_rev
>>  and max_commit_rev to get commits revs where merge_source has been 
>> merged
>>  to merge_target along with their merge ranges.
>>
>>
>> Possible problems:
>> Storing it as revprop makes this data to be mutable, which may be 
>> problematic
>> if someone maliciously write some data that causes hash parser to break.
>
> Yeah, that's no good.
>
> So, just to clarify, what you're looking for is mapping of revision 
> numbers (across a range) to mergeinfo changes made in that revision?  

Yes.

> Why only mergeinfo additions -- aren't the subtractions interesting, too?
>

I am not interested in subtractions as we don't support repeat 'merge
reversal'.

With regards
Kamesh Jayachandran


Re: Proposal to implement get_commit_and_merge_ranges from FS

Posted by "C. Michael Pilato" <cm...@collab.net>.
Kamesh Jayachandran wrote:
> Hi All,
> 
> Problem: Implement get_commit_and_merge_ranges report from a mergeinfo 
> in FS.
> 
> Interface:
> svn_fs_fs_get_commit_and_merge_ranges
> (apr_array_header_t **merge_ranges_list,
> apr_array_header_t **commit_rangelist,
> svn_fs_root_t *root,
> const char *merge_target,
> const char *merge_source,
> svn_revnum_t min_commit_rev,
> svn_revnum_t max_commit_rev,
> svn_mergeinfo_inheritance_t inherit,
> apr_pool_t *pool)
> 
> Algorithm:
> 1)Record svn:mergeinfo_added_on_targets in this commit as a transaction
>  property. Format of this property's value is serialized hash of
>  {/target1 -> '/src1: rR,rX-Y\n
>                /src2: rS, rA-B\n',
>   /target2 -> '/src1: rR,rX-Y\n
>                /src2: rS, rA-B\n',
>  }

Oh, you mean exactly like what I *just removed from the BDB code* ?

> 2)Run something like 'svn_repos_history2' between revisions min_commit_rev
>  and max_commit_rev to get commits revs where merge_source has been merged
>  to merge_target along with their merge ranges.
> 
> 
> Possible problems:
> Storing it as revprop makes this data to be mutable, which may be 
> problematic
> if someone maliciously write some data that causes hash parser to break.

Yeah, that's no good.

So, just to clarify, what you're looking for is mapping of revision numbers 
(across a range) to mergeinfo changes made in that revision?  Why only 
mergeinfo additions -- aren't the subtractions interesting, too?

-- 
C. Michael Pilato <cm...@collab.net>
CollabNet   <>   www.collab.net   <>   Distributed Development On Demand


RE: Proposal to implement get_commit_and_merge_ranges from FS

Posted by Kamesh Jayachandran <ka...@collab.net>.
>> Possible solutions:
>> We can maintain this per commit data in a separate place other than the one
>> where revprops sit. Provide mechanisms to retrieve from this location.

>I do not see anything wrong with this as a solution.  I assume in BDB
>we would just have a new table for this info (maybe you already do).
>In fsfs we could add a new folder to the repository db named
>"mergeinfo".  Whenever a commit is done that alters mergeinfo, you
>could create a file with the same name as the revision that stores
>this information.  If the commit does not alter mergeinfo, do not
>create a file.  You could even follow the same sharding scheme as the
>revs folder, just for consistency.

>I do not see any problem with this approach.

Thanks, that is what I am working on now.

With regards
Kamesh Jayachandran


Re: Proposal to implement get_commit_and_merge_ranges from FS

Posted by Mark Phippard <ma...@gmail.com>.
On Jan 25, 2008 8:47 AM, Kamesh Jayachandran <ka...@collab.net> wrote:

> Problem: Implement get_commit_and_merge_ranges report from a mergeinfo
> in FS.
>
> Interface:
> svn_fs_fs_get_commit_and_merge_ranges
> (apr_array_header_t **merge_ranges_list,
>  apr_array_header_t **commit_rangelist,
>  svn_fs_root_t *root,
>  const char *merge_target,
>  const char *merge_source,
>  svn_revnum_t min_commit_rev,
>  svn_revnum_t max_commit_rev,
>  svn_mergeinfo_inheritance_t inherit,
>  apr_pool_t *pool)
>
> Algorithm:
> 1)Record svn:mergeinfo_added_on_targets in this commit as a transaction
>   property. Format of this property's value is serialized hash of
>   {/target1 -> '/src1: rR,rX-Y\n
>                 /src2: rS, rA-B\n',
>    /target2 -> '/src1: rR,rX-Y\n
>                 /src2: rS, rA-B\n',
>   }
>
> 2)Run something like 'svn_repos_history2' between revisions min_commit_rev
>   and max_commit_rev to get commits revs where merge_source has been merged
>   to merge_target along with their merge ranges.
>
>
> Possible problems:
> Storing it as revprop makes this data to be mutable, which may be
> problematic
> if someone maliciously write some data that causes hash parser to break.
>
>
> Possible solutions:
> We can maintain this per commit data in a separate place other than the one
> where revprops sit. Provide mechanisms to retrieve from this location.

I do not see anything wrong with this as a solution.  I assume in BDB
we would just have a new table for this info (maybe you already do).
In fsfs we could add a new folder to the repository db named
"mergeinfo".  Whenever a commit is done that alters mergeinfo, you
could create a file with the same name as the revision that stores
this information.  If the commit does not alter mergeinfo, do not
create a file.  You could even follow the same sharding scheme as the
revs folder, just for consistency.

I do not see any problem with this approach.
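
A rough sketch of that fsfs layout, in Python. The "mergeinfo" directory
name comes from the suggestion above; the shard size of 1000, the
one-file-per-revision text format, and the function names are all
assumptions for illustration, not existing Subversion behavior.

```python
import os


def record_path(db_dir, rev, shard_size=1000):
    """Path of the per-revision record, e.g. <db>/mergeinfo/1/1234."""
    if rev < 0 or shard_size <= 0:
        raise ValueError("rev must be >= 0 and shard_size must be > 0")
    return os.path.join(db_dir, "mergeinfo", str(rev // shard_size), str(rev))


def write_record(db_dir, rev, value, shard_size=1000):
    """Store the record; write nothing if the commit altered no mergeinfo."""
    if not value:
        return None
    path = record_path(db_dir, rev, shard_size)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(value)
    return path


def read_record(db_dir, rev, shard_size=1000):
    """Return the stored record, or None if the revision has no file."""
    try:
        with open(record_path(db_dir, rev, shard_size), encoding="utf-8") as handle:
            return handle.read()
    except FileNotFoundError:
        return None
```

A missing file then simply means "no mergeinfo change in this revision",
so a reader never needs a separate index to tell the two cases apart.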

-- 
Thanks

Mark Phippard
http://markphip.blogspot.com/


Re: Proposal to implement get_commit_and_merge_ranges from FS

Posted by David Glasser <gl...@davidglasser.net>.
So this is to implement 2897.

Is it possible to show a simple test case that is handled very well by
the 2897 branch (before the anti-sqlite changes broke your API) and
which is completely unacceptable without 2897 (handled poorly by
reintegrate, etc)?  I'm still leaning strongly towards thinking that
2897 is way too much complexity for not much benefit.  Again, we never
promised that svn 1.5 would contain massively improved merge
*algorithms*; we just promised that it would do a better job of merge
*tracking*: eliminating the need to remember revision numbers etc.

--dave

On Jan 25, 2008 5:47 AM, Kamesh Jayachandran <ka...@collab.net> wrote:
> Hi All,
>
> Problem: Implement get_commit_and_merge_ranges report from a mergeinfo
> in FS.
>
> Interface:
> svn_fs_fs_get_commit_and_merge_ranges
> (apr_array_header_t **merge_ranges_list,
>  apr_array_header_t **commit_rangelist,
>  svn_fs_root_t *root,
>  const char *merge_target,
>  const char *merge_source,
>  svn_revnum_t min_commit_rev,
>  svn_revnum_t max_commit_rev,
>  svn_mergeinfo_inheritance_t inherit,
>  apr_pool_t *pool)
>
> Algorithm:
> 1)Record svn:mergeinfo_added_on_targets in this commit as a transaction
>   property. Format of this property's value is serialized hash of
>   {/target1 -> '/src1: rR,rX-Y\n
>                 /src2: rS, rA-B\n',
>    /target2 -> '/src1: rR,rX-Y\n
>                 /src2: rS, rA-B\n',
>   }
>
> 2)Run something like 'svn_repos_history2' between revisions min_commit_rev
>   and max_commit_rev to get commits revs where merge_source has been merged
>   to merge_target along with their merge ranges.
>
>
> Possible problems:
> Storing it as revprop makes this data to be mutable, which may be
> problematic
> if someone maliciously write some data that causes hash parser to break.
>
>
> Possible solutions:
> We can maintain this per commit data in a separate place other than the one
> where revprops sit. Provide mechanisms to retrieve from this location.
>
>
> Want to know what others think about this.
>
>
> With regards
> Kamesh Jayachandran
>

-- 
David Glasser | glasser@davidglasser.net | http://www.davidglasser.net/
