You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@subversion.apache.org by Prabhu Gnana Sundar <pr...@collab.net> on 2011/08/18 17:02:28 UTC

[PATCH] python script for issue #396, fixing the bogus mergeinfo

Hi all,

With reference to my earlier discussions in 
http://svn.haxx.se/dev/archive-2011-07/0432.shtml
I am attaching a python script which would find the bogus mergeinfo and 
fix it.

This script would eventually look for the location segments of the 
source paths against the respective revision
ranges present in the mergeinfo. With respect to the output of the 
location segments it would create a new mergeinfo
and store it in the ".newmergeinfo" file in the path from where the 
python script was run. The hash of this file would
be in the ".hashfile" in the same path.

By default, the script would not fix the new changes in the working 
copy. Fixing the new mergeinfo to the working copy
can be achieved by passing the "--fix" option to the script.

Also, this script can also authenticate against the self-signed ssl servers.

I tested this script on our  asf subversion repo found a few bogus 
mergeinfo such as in the "tree-conflicts" branch:

/subversion/branches/tree-conflicts with 872524-873154,868290-872329

The above should be correct mergeinfo since it was not at all present in 
the revision range: r872330-872524

Also, I ran this script against the 
https://ctf.open.collab.net/svn/repos/svnedge/trunk/console path and 
found a bogus mergeinfo. (it has a self-signed cert)

Please share your thoughts...



Thanks and regards
Prabhu

Re: [PATCH] python script for issue #3961, fixing the bogus mergeinfo

Posted by Prabhu Gnana Sundar <pr...@collab.net>.
Updated/fixed the script to not prompt the certificate even if the 
certificate is accepted permenantly, similar to the 
get-location-segments.py script.

Attaching the updated script.


Thanks and regards
Prabhu

On Friday 19 August 2011 11:45 AM, Prabhu Gnana Sundar wrote:
> Correcting the issue number as #3961...
>
>
> On Thursday 18 August 2011 08:32 PM, Prabhu Gnana Sundar wrote:
>> Hi all,
>>
>> With reference to my earlier discussions in 
>> http://svn.haxx.se/dev/archive-2011-07/0432.shtml
>> I am attaching a python script which would find the bogus mergeinfo 
>> and fix it.
>>
>> This script would eventually look for the location segments of the 
>> source paths against the respective revision
>> ranges present in the mergeinfo. With respect to the output of the 
>> location segments it would create a new mergeinfo
>> and store it in the ".newmergeinfo" file in the path from where the 
>> python script was run. The hash of this file would
>> be in the ".hashfile" in the same path.
>>
>> By default, the script would not fix the new changes in the working 
>> copy. Fixing the new mergeinfo to the working copy
>> can be achieved by passing the "--fix" option to the script.
>>
>> Also, this script can also authenticate against the self-signed ssl 
>> servers.
>>
>> I tested this script on our  asf subversion repo found a few bogus 
>> mergeinfo such as in the "tree-conflicts" branch:
>>
>> /subversion/branches/tree-conflicts with 872524-873154,868290-872329
>>
>> The above should be correct mergeinfo since it was not at all present 
>> in the revision range: r872330-872524
>>
>> Also, I ran this script against the 
>> https://ctf.open.collab.net/svn/repos/svnedge/trunk/console path and 
>> found a bogus mergeinfo. (it has a self-signed cert)
>>
>> Please share your thoughts...
>>
>>
>>
>> Thanks and regards
>> Prabhu
>


Re: [PATCH] python script for issue #3961, fixing the bogus mergeinfo

Posted by Prabhu Gnana Sundar <pr...@collab.net>.
Attaching an updated tweaked script...


Regards
Prabhu

On Friday 19 August 2011 11:45 AM, Prabhu Gnana Sundar wrote:
> Correcting the issue number as #3961...
>
>
> On Thursday 18 August 2011 08:32 PM, Prabhu Gnana Sundar wrote:
>> Hi all,
>>
>> With reference to my earlier discussions in 
>> http://svn.haxx.se/dev/archive-2011-07/0432.shtml
>> I am attaching a python script which would find the bogus mergeinfo 
>> and fix it.
>>
>> This script would eventually look for the location segments of the 
>> source paths against the respective revision
>> ranges present in the mergeinfo. With respect to the output of the 
>> location segments it would create a new mergeinfo
>> and store it in the ".newmergeinfo" file in the path from where the 
>> python script was run. The hash of this file would
>> be in the ".hashfile" in the same path.
>>
>> By default, the script would not fix the new changes in the working 
>> copy. Fixing the new mergeinfo to the working copy
>> can be achieved by passing the "--fix" option to the script.
>>
>> Also, this script can also authenticate against the self-signed ssl 
>> servers.
>>
>> I tested this script on our  asf subversion repo found a few bogus 
>> mergeinfo such as in the "tree-conflicts" branch:
>>
>> /subversion/branches/tree-conflicts with 872524-873154,868290-872329
>>
>> The above should be correct mergeinfo since it was not at all present 
>> in the revision range: r872330-872524
>>
>> Also, I ran this script against the 
>> https://ctf.open.collab.net/svn/repos/svnedge/trunk/console path and 
>> found a bogus mergeinfo. (it has a self-signed cert)
>>
>> Please share your thoughts...
>>
>>
>>
>> Thanks and regards
>> Prabhu
>


RE: [PATCH] python script for issue #3961, fixing the bogus mergeinfo

Posted by Kamesh Jayachandran <ka...@collab.net>.
Thanks Prabhu.

I fixed your bogus mergeinfo summary output which was always giving the full mergeinfo.

I committed this script in r1178435.

With regards
Kamesh Jayachandran




-----Original Message-----
From: Prabhu Gnana Sundar [mailto:prabhugs@collab.net]
Sent: Mon 9/19/2011 10:28 PM
To: dev@subversion.apache.org
Subject: Re: [PATCH] python script for issue #3961, fixing the bogus mergeinfo
 

Attaching a cleaner patch... Removed a couple of commented lines...


Regards
Prabhu



On Monday 19 September 2011 10:24 PM, Prabhu Gnana Sundar wrote:
> Thanks Kamesh for your valuable review...
>
>
> On Monday 12 September 2011 06:25 PM, Kamesh Jayachandran wrote:
>>
>> I like this patch. We need it.
>>
> Thank you...
>>
>> I found this bogus mergeinfo's were causing 404 file not found,
>> which were sometimes causing the merge to fail, making it run longer 
>> than necessary.
>>
>> Few comments.
>>
>>
>> 1.
>> >    opts, args = my_getopt(sys.argv[1:], "h?fRc", ["help", "fix", 
>> "recursive", "commit"])
>>
>> If you do not want to support 'commit' you can remove it as well.
>>
> Yeah... I removed it now..
>>
>>
>> 2. I ran it against a working copy of 
>> https://ctf.open.collab.net/svn/repos/svnedge/trunk/console
>> and got the following output
>>
>> [kamesh@kamesh console]$ python  /tmp/mergeinfo-sanitizer.py .
>> The mergeinfo has to be sanitized/reverse-merged for the following as:
>> /branches/CTF_MODE/SvnEdge: 515-766,
>>
>>
>> [kamesh@kamesh console]$ python  /tmp/mergeinfo-sanitizer.py --fix .
>> [kamesh@kamesh console]$ svn di
>> Property changes on: .
>> ___________________________________________________________________
>> Modified: svn:mergeinfo
>>    Reverse-merged /branches/CTF_MODE/SvnEdge:r512-515
>>
>> Based on my analysis on the above repo, I could see 
>> /branches/CTF_MODE/SvnEdge:r512-515 is bogus
>> and hence need to be removed. Anywway your script does it correctly 
>> with --fix.
>>
>> I would suggest changing the output of default invocation(i.e no --fix)
>>
> Changed the output as suggested...
>
>>
>> <snip>
>> Bogus mergeinfo summary:
>> Target 1: .
>> /branches/CTF_MODE/SvnEdge: 512-515,
>> /merge/source2: X-Y,
>> ..........
>>
>> Target 2: subtree_path
>> /merge/source3: A-B
>> ....
>>
>> Target 3: sub_sub_tree
>> /merge/source4: C-D
>> ......
>>
>> Run with --fix to fix this in your working copy and commit yourself.
>>
>>
>> 3. Your script seemed to sanitize to the depth of 'immediates' by 
>> default. What is the rational behind that?
>>
> No rational behind it, I was using it as default for my own 
> development purpose.
> Now fixed it.
>
>
>> 4. I ran your script against 
>> https://svn.apache.org/repos/asf/subversion/trunk
>>
>> And got the following output
>> <snip>
>>  The path 
>> /subversion/branches/diff-optimizations/subversion/libsvn_subr/adler32.c 
>> is not found
>>  The path 
>> /subversion/branches/diff-optimizations/subversion/include/private/svn_adler32.h 
>> is not found
>>
>> The mergeinfo has to be sanitized/reverse-merged for the following as:
>> /subversion/trunk/subversion/libsvn_subr/svn_temp_serializer.c:
>> /subversion/branches/tree-conflicts/subversion/include/svn_string.h: 
>> 872524-873154,868290-872329,
>> /subversion/trunk/subversion/include/private/svn_temp_serializer.h:
>> /subversion/branches/diff-optimizations/subversion/libsvn_subr/adler32.c:
>> /subversion/branches/tree-conflicts/subversion/libsvn_diff/diff.h: 
>> 872524-873154,868290-872329,
>> /subversion/trunk/subversion/libsvn_fs_fs/temp_serializer.h:
>> /subversion/branches/svn-patch-improvements: 918520-934609,
>> /subversion/trunk/subversion/libsvn_fs_fs/temp_serializer.c:
>> /subversion/branches/tree-conflicts/subversion/libsvn_subr/hash.c: 
>> 872524-873154,868290-872329,
>> /subversion/branches/diff-optimizations/subversion/include/svn_string.h: 
>> 1031270-1037352,
>> /subversion/branches/diff-optimizations-bytes/subversion/include/private/svn_adler32.h: 
>> 1054361-1067789,
>> /subversion/branches/svn-patch-improvements/subversion/include/svn_string.h: 
>> 918520-934609,
>> /subversion/branches/diff-optimizations/subversion/libsvn_subr/hash.c: 1031270-1037352,
>> /subversion/branches/tree-conflicts: 872524-873154,868290-872329,
>> /subversion/branches/diff-optimizations/subversion/include/private/svn_adler32.h:
>> /subversion/branches/diff-optimizations-bytes/subversion/libsvn_subr/adler32.c: 
>> 1054361-1067789,
>> /subversion/branches/svn-patch-improvements/subversion/libsvn_diff/diff.h: 
>> 918520-934609,
>> /subversion/branches/diff-optimizations: 1031270-1037352,
>> /subversion/branches/svn-patch-improvements/subversion/libsvn_subr/hash.c: 
>> 918520-934609,
>> /subversion/branches/integrate-cache-item-serialization/subversion/libsvn_fs_fs/temp_serializer.h: 
>> 1068738-1068739,
>> /subversion/branches/integrate-cache-item-serialization/subversion/libsvn_subr/svn_temp_serializer.c: 
>> 1068723-1068739,
>> /subversion/branches/integrate-cache-item-serialization/subversion/libsvn_fs_fs/temp_serializer.c: 
>> 1068738-1068739,
>> /subversion/trunk/subversion/libsvn_diff/diff.h: 1037352-1054248,
>> /subversion/trunk/subversion/libsvn_subr/adler32.c: 1054276-1054360,
>> /subversion/trunk/subversion/include/private/svn_adler32.h: 
>> 1054276-1054360,
>> /subversion/branches/integrate-cache-item-serialization/subversion/include/private/svn_temp_serializer.h: 
>> 1068723-1068739,
>> /subversion/trunk/subversion/libsvn_diff/util.c: 1037352-1054251,
>> </snip>
>>
>> path not found error need to be handled. error should be on STDERR.
>>
> now it is being handled and sent to the stderr...
>
>> 5. In main() you can call check_local_modifications before 
>> get_original_mergeinfo().
>>
> That sounds really good though I don't see 'propget' as a costly 
> operation.
>
>> 6. In get_original_mergeinfo()
>>
>> > if depth == 3:
>> Use the named constant than the arbitrary value.
>>
>> >  if depth == 3:
>> >    for entry in mergeinfo_catalog:
>> >      pathwise_mergeinfo = pathwise_mergeinfo + 
>> mergeinfo_catalog[entry] + "\n"
>>
>>
>> I think you do something wrong with the above concatenation of 
>> mergeinfos of different targets.
>>
>> >  else:
>> >    pathwise_mergeinfo = mergeinfo_catalog[wcpath]
>>
>> >  return core.svn_mergeinfo_parse(pathwise_mergeinfo, temp_pool)
>>
> Thanks for pointing this Kamesh... Fixed this in the attached patch...
>>
>>
>> 7. In get_new_location_segments():
>>
>> >      for revision_range in parsed_original_mergeinfo[path]:
>> >        try:
>> >          ra.get_location_segments(ra_session, "", revision_range.end,
>> >                                   revision_range.end, 
>> revision_range.start + 1, receiver)
>> >        except svn.core.SubversionException:
>> >          try:
>> >            ra.get_location_segments(ra_session, "", 
>> core.SVN_INVALID_REVNUM,
>> >                                     revision_range.end, 
>> revision_range.start + 1, receiver)
>>
>> Not sure why you run location segments against HEAD upon exception.
>>
> I was just thinking that the source path could even be present outside 
> the revision-ranges.
> But looks like I am doing something weird here... So fixed it by 
> removing the part that runs
> the location segments against the HEAD...
>>
>>
>>
>> >          except svn.core.SubversionException:
>> >            print " The path %s is not found " % path
>>
>> 8. Function parse_args is not used.
>>
> Removed it. This was mistakenly left out...
>>
>>
>> 9. Function receiver can have more meaningful name.
>>
> Changed the name ...
>>
>>
>>
>> 10. You need to organize your functions in such a way that it appears 
>> in some logical order.
>> For example I started my review from 'main()' which is in the middle 
>> of the file and I move above and below that function to review further.
>>
> Organized :)
>
>
>
> Attached the recent patch with this mail... Please share your thoughts...
>
>
> Thanks and regards
> Prabhu



Re: [PATCH] python script for issue #3961, fixing the bogus mergeinfo

Posted by Prabhu Gnana Sundar <pr...@collab.net>.
Attaching a cleaner patch... Removed a couple of commented lines...


Regards
Prabhu



On Monday 19 September 2011 10:24 PM, Prabhu Gnana Sundar wrote:
> Thanks Kamesh for your valuable review...
>
>
> On Monday 12 September 2011 06:25 PM, Kamesh Jayachandran wrote:
>>
>> I like this patch. We need it.
>>
> Thank you...
>>
>> I found this bogus mergeinfo's were causing 404 file not found,
>> which were sometimes causing the merge to fail, making it run longer 
>> than necessary.
>>
>> Few comments.
>>
>>
>> 1.
>> >    opts, args = my_getopt(sys.argv[1:], "h?fRc", ["help", "fix", 
>> "recursive", "commit"])
>>
>> If you do not want to support 'commit' you can remove it as well.
>>
> Yeah... I removed it now..
>>
>>
>> 2. I ran it against a working copy of 
>> https://ctf.open.collab.net/svn/repos/svnedge/trunk/console
>> and got the following output
>>
>> [kamesh@kamesh console]$ python  /tmp/mergeinfo-sanitizer.py .
>> The mergeinfo has to be sanitized/reverse-merged for the following as:
>> /branches/CTF_MODE/SvnEdge: 515-766,
>>
>>
>> [kamesh@kamesh console]$ python  /tmp/mergeinfo-sanitizer.py --fix .
>> [kamesh@kamesh console]$ svn di
>> Property changes on: .
>> ___________________________________________________________________
>> Modified: svn:mergeinfo
>>    Reverse-merged /branches/CTF_MODE/SvnEdge:r512-515
>>
>> Based on my analysis on the above repo, I could see 
>> /branches/CTF_MODE/SvnEdge:r512-515 is bogus
>> and hence need to be removed. Anywway your script does it correctly 
>> with --fix.
>>
>> I would suggest changing the output of default invocation(i.e no --fix)
>>
> Changed the output as suggested...
>
>>
>> <snip>
>> Bogus mergeinfo summary:
>> Target 1: .
>> /branches/CTF_MODE/SvnEdge: 512-515,
>> /merge/source2: X-Y,
>> ..........
>>
>> Target 2: subtree_path
>> /merge/source3: A-B
>> ....
>>
>> Target 3: sub_sub_tree
>> /merge/source4: C-D
>> ......
>>
>> Run with --fix to fix this in your working copy and commit yourself.
>>
>>
>> 3. Your script seemed to sanitize to the depth of 'immediates' by 
>> default. What is the rational behind that?
>>
> No rational behind it, I was using it as default for my own 
> development purpose.
> Now fixed it.
>
>
>> 4. I ran your script against 
>> https://svn.apache.org/repos/asf/subversion/trunk
>>
>> And got the following output
>> <snip>
>>  The path 
>> /subversion/branches/diff-optimizations/subversion/libsvn_subr/adler32.c 
>> is not found
>>  The path 
>> /subversion/branches/diff-optimizations/subversion/include/private/svn_adler32.h 
>> is not found
>>
>> The mergeinfo has to be sanitized/reverse-merged for the following as:
>> /subversion/trunk/subversion/libsvn_subr/svn_temp_serializer.c:
>> /subversion/branches/tree-conflicts/subversion/include/svn_string.h: 
>> 872524-873154,868290-872329,
>> /subversion/trunk/subversion/include/private/svn_temp_serializer.h:
>> /subversion/branches/diff-optimizations/subversion/libsvn_subr/adler32.c:
>> /subversion/branches/tree-conflicts/subversion/libsvn_diff/diff.h: 
>> 872524-873154,868290-872329,
>> /subversion/trunk/subversion/libsvn_fs_fs/temp_serializer.h:
>> /subversion/branches/svn-patch-improvements: 918520-934609,
>> /subversion/trunk/subversion/libsvn_fs_fs/temp_serializer.c:
>> /subversion/branches/tree-conflicts/subversion/libsvn_subr/hash.c: 
>> 872524-873154,868290-872329,
>> /subversion/branches/diff-optimizations/subversion/include/svn_string.h: 
>> 1031270-1037352,
>> /subversion/branches/diff-optimizations-bytes/subversion/include/private/svn_adler32.h: 
>> 1054361-1067789,
>> /subversion/branches/svn-patch-improvements/subversion/include/svn_string.h: 
>> 918520-934609,
>> /subversion/branches/diff-optimizations/subversion/libsvn_subr/hash.c: 1031270-1037352,
>> /subversion/branches/tree-conflicts: 872524-873154,868290-872329,
>> /subversion/branches/diff-optimizations/subversion/include/private/svn_adler32.h:
>> /subversion/branches/diff-optimizations-bytes/subversion/libsvn_subr/adler32.c: 
>> 1054361-1067789,
>> /subversion/branches/svn-patch-improvements/subversion/libsvn_diff/diff.h: 
>> 918520-934609,
>> /subversion/branches/diff-optimizations: 1031270-1037352,
>> /subversion/branches/svn-patch-improvements/subversion/libsvn_subr/hash.c: 
>> 918520-934609,
>> /subversion/branches/integrate-cache-item-serialization/subversion/libsvn_fs_fs/temp_serializer.h: 
>> 1068738-1068739,
>> /subversion/branches/integrate-cache-item-serialization/subversion/libsvn_subr/svn_temp_serializer.c: 
>> 1068723-1068739,
>> /subversion/branches/integrate-cache-item-serialization/subversion/libsvn_fs_fs/temp_serializer.c: 
>> 1068738-1068739,
>> /subversion/trunk/subversion/libsvn_diff/diff.h: 1037352-1054248,
>> /subversion/trunk/subversion/libsvn_subr/adler32.c: 1054276-1054360,
>> /subversion/trunk/subversion/include/private/svn_adler32.h: 
>> 1054276-1054360,
>> /subversion/branches/integrate-cache-item-serialization/subversion/include/private/svn_temp_serializer.h: 
>> 1068723-1068739,
>> /subversion/trunk/subversion/libsvn_diff/util.c: 1037352-1054251,
>> </snip>
>>
>> path not found error need to be handled. error should be on STDERR.
>>
> now it is being handled and sent to the stderr...
>
>> 5. In main() you can call check_local_modifications before 
>> get_original_mergeinfo().
>>
> That sounds really good though I don't see 'propget' as a costly 
> operation.
>
>> 6. In get_original_mergeinfo()
>>
>> > if depth == 3:
>> Use the named constant than the arbitrary value.
>>
>> >  if depth == 3:
>> >    for entry in mergeinfo_catalog:
>> >      pathwise_mergeinfo = pathwise_mergeinfo + 
>> mergeinfo_catalog[entry] + "\n"
>>
>>
>> I think you do something wrong with the above concatenation of 
>> mergeinfos of different targets.
>>
>> >  else:
>> >    pathwise_mergeinfo = mergeinfo_catalog[wcpath]
>>
>> >  return core.svn_mergeinfo_parse(pathwise_mergeinfo, temp_pool)
>>
> Thanks for pointing this Kamesh... Fixed this in the attached patch...
>>
>>
>> 7. In get_new_location_segments():
>>
>> >      for revision_range in parsed_original_mergeinfo[path]:
>> >        try:
>> >          ra.get_location_segments(ra_session, "", revision_range.end,
>> >                                   revision_range.end, 
>> revision_range.start + 1, receiver)
>> >        except svn.core.SubversionException:
>> >          try:
>> >            ra.get_location_segments(ra_session, "", 
>> core.SVN_INVALID_REVNUM,
>> >                                     revision_range.end, 
>> revision_range.start + 1, receiver)
>>
>> Not sure why you run location segments against HEAD upon exception.
>>
> I was just thinking that the source path could even be present outside 
> the revision-ranges.
> But looks like I am doing something weird here... So fixed it by 
> removing the part that runs
> the location segments against the HEAD...
>>
>>
>>
>> >          except svn.core.SubversionException:
>> >            print " The path %s is not found " % path
>>
>> 8. Function parse_args is not used.
>>
> Removed it. This was mistakenly left out...
>>
>>
>> 9. Function receiver can have more meaningful name.
>>
> Changed the name ...
>>
>>
>>
>> 10. You need to organize your functions in such a way that it appears 
>> in some logical order.
>> For example I started my review from 'main()' which is in the middle 
>> of the file and I move above and below that function to review further.
>>
> Organized :)
>
>
>
> Attached the recent patch with this mail... Please share your thoughts...
>
>
> Thanks and regards
> Prabhu


Re: [PATCH] python script for issue #3961, fixing the bogus mergeinfo

Posted by Prabhu Gnana Sundar <pr...@collab.net>.
Thanks Kamesh for your valuable review...


On Monday 12 September 2011 06:25 PM, Kamesh Jayachandran wrote:
>
> I like this patch. We need it.
>
Thank you...
>
> I found this bogus mergeinfo's were causing 404 file not found,
> which were sometimes causing the merge to fail, making it run longer 
> than necessary.
>
> Few comments.
>
>
> 1.
> >    opts, args = my_getopt(sys.argv[1:], "h?fRc", ["help", "fix", 
> "recursive", "commit"])
>
> If you do not want to support 'commit' you can remove it as well.
>
Yeah... I removed it now..
>
>
> 2. I ran it against a working copy of 
> https://ctf.open.collab.net/svn/repos/svnedge/trunk/console
> and got the following output
>
> [kamesh@kamesh console]$ python  /tmp/mergeinfo-sanitizer.py .
> The mergeinfo has to be sanitized/reverse-merged for the following as:
> /branches/CTF_MODE/SvnEdge: 515-766,
>
>
> [kamesh@kamesh console]$ python  /tmp/mergeinfo-sanitizer.py --fix .
> [kamesh@kamesh console]$ svn di
> Property changes on: .
> ___________________________________________________________________
> Modified: svn:mergeinfo
>    Reverse-merged /branches/CTF_MODE/SvnEdge:r512-515
>
> Based on my analysis on the above repo, I could see 
> /branches/CTF_MODE/SvnEdge:r512-515 is bogus
> and hence need to be removed. Anywway your script does it correctly 
> with --fix.
>
> I would suggest changing the output of default invocation(i.e no --fix)
>
Changed the output as suggested...

>
> <snip>
> Bogus mergeinfo summary:
> Target 1: .
> /branches/CTF_MODE/SvnEdge: 512-515,
> /merge/source2: X-Y,
> ..........
>
> Target 2: subtree_path
> /merge/source3: A-B
> ....
>
> Target 3: sub_sub_tree
> /merge/source4: C-D
> ......
>
> Run with --fix to fix this in your working copy and commit yourself.
>
>
> 3. Your script seemed to sanitize to the depth of 'immediates' by 
> default. What is the rational behind that?
>
No rational behind it, I was using it as default for my own development 
purpose.
Now fixed it.


> 4. I ran your script against 
> https://svn.apache.org/repos/asf/subversion/trunk
>
> And got the following output
> <snip>
>  The path 
> /subversion/branches/diff-optimizations/subversion/libsvn_subr/adler32.c 
> is not found
>  The path 
> /subversion/branches/diff-optimizations/subversion/include/private/svn_adler32.h 
> is not found
>
> The mergeinfo has to be sanitized/reverse-merged for the following as:
> /subversion/trunk/subversion/libsvn_subr/svn_temp_serializer.c:
> /subversion/branches/tree-conflicts/subversion/include/svn_string.h: 
> 872524-873154,868290-872329,
> /subversion/trunk/subversion/include/private/svn_temp_serializer.h:
> /subversion/branches/diff-optimizations/subversion/libsvn_subr/adler32.c:
> /subversion/branches/tree-conflicts/subversion/libsvn_diff/diff.h: 
> 872524-873154,868290-872329,
> /subversion/trunk/subversion/libsvn_fs_fs/temp_serializer.h:
> /subversion/branches/svn-patch-improvements: 918520-934609,
> /subversion/trunk/subversion/libsvn_fs_fs/temp_serializer.c:
> /subversion/branches/tree-conflicts/subversion/libsvn_subr/hash.c: 
> 872524-873154,868290-872329,
> /subversion/branches/diff-optimizations/subversion/include/svn_string.h: 
> 1031270-1037352,
> /subversion/branches/diff-optimizations-bytes/subversion/include/private/svn_adler32.h: 
> 1054361-1067789,
> /subversion/branches/svn-patch-improvements/subversion/include/svn_string.h: 
> 918520-934609,
> /subversion/branches/diff-optimizations/subversion/libsvn_subr/hash.c: 
> 1031270-1037352,
> /subversion/branches/tree-conflicts: 872524-873154,868290-872329,
> /subversion/branches/diff-optimizations/subversion/include/private/svn_adler32.h:
> /subversion/branches/diff-optimizations-bytes/subversion/libsvn_subr/adler32.c: 
> 1054361-1067789,
> /subversion/branches/svn-patch-improvements/subversion/libsvn_diff/diff.h: 
> 918520-934609,
> /subversion/branches/diff-optimizations: 1031270-1037352,
> /subversion/branches/svn-patch-improvements/subversion/libsvn_subr/hash.c: 
> 918520-934609,
> /subversion/branches/integrate-cache-item-serialization/subversion/libsvn_fs_fs/temp_serializer.h: 
> 1068738-1068739,
> /subversion/branches/integrate-cache-item-serialization/subversion/libsvn_subr/svn_temp_serializer.c: 
> 1068723-1068739,
> /subversion/branches/integrate-cache-item-serialization/subversion/libsvn_fs_fs/temp_serializer.c: 
> 1068738-1068739,
> /subversion/trunk/subversion/libsvn_diff/diff.h: 1037352-1054248,
> /subversion/trunk/subversion/libsvn_subr/adler32.c: 1054276-1054360,
> /subversion/trunk/subversion/include/private/svn_adler32.h: 
> 1054276-1054360,
> /subversion/branches/integrate-cache-item-serialization/subversion/include/private/svn_temp_serializer.h: 
> 1068723-1068739,
> /subversion/trunk/subversion/libsvn_diff/util.c: 1037352-1054251,
> </snip>
>
> path not found error need to be handled. error should be on STDERR.
>
now it is being handled and sent to the stderr...

> 5. In main() you can call check_local_modifications before 
> get_original_mergeinfo().
>
That sounds really good though I don't see 'propget' as a costly operation.

> 6. In get_original_mergeinfo()
>
> > if depth == 3:
> Use the named constant than the arbitrary value.
>
> >  if depth == 3:
> >    for entry in mergeinfo_catalog:
> >      pathwise_mergeinfo = pathwise_mergeinfo + 
> mergeinfo_catalog[entry] + "\n"
>
>
> I think you do something wrong with the above concatenation of 
> mergeinfos of different targets.
>
> >  else:
> >    pathwise_mergeinfo = mergeinfo_catalog[wcpath]
>
> >  return core.svn_mergeinfo_parse(pathwise_mergeinfo, temp_pool)
>
Thanks for pointing this Kamesh... Fixed this in the attached patch...
>
>
> 7. In get_new_location_segments():
>
> >      for revision_range in parsed_original_mergeinfo[path]:
> >        try:
> >          ra.get_location_segments(ra_session, "", revision_range.end,
> >                                   revision_range.end, 
> revision_range.start + 1, receiver)
> >        except svn.core.SubversionException:
> >          try:
> >            ra.get_location_segments(ra_session, "", 
> core.SVN_INVALID_REVNUM,
> >                                     revision_range.end, 
> revision_range.start + 1, receiver)
>
> Not sure why you run location segments against HEAD upon exception.
>
I was just thinking that the source path could even be present outside 
the revision-ranges.
But looks like I am doing something weird here... So fixed it by 
removing the part that runs
the location segments against the HEAD...
>
>
>
> >          except svn.core.SubversionException:
> >            print " The path %s is not found " % path
>
> 8. Function parse_args is not used.
>
Removed it. This was mistakenly left out...
>
>
> 9. Function receiver can have more meaningful name.
>
Changed the name ...
>
>
>
> 10. You need to organize your functions in such a way that it appears 
> in some logical order.
> For example I started my review from 'main()' which is in the middle 
> of the file and I move above and below that function to review further.
>
Organized :)



Attached the recent patch with this mail... Please share your thoughts...


Thanks and regards
Prabhu

RE: [PATCH] python script for issue #3961, fixing the bogus mergeinfo

Posted by Kamesh Jayachandran <ka...@collab.net>.
I like this patch. We need it.
I found this bogus mergeinfo's were causing 404 file not found,
which were sometimes causing the merge to fail, making it run longer than necessary.

Few comments.


1.
>    opts, args = my_getopt(sys.argv[1:], "h?fRc", ["help", "fix", "recursive", "commit"])

If you do not want to support 'commit' you can remove it as well.

2. I ran it against a working copy of https://ctf.open.collab.net/svn/repos/svnedge/trunk/console
and got the following output

[kamesh@kamesh console]$ python  /tmp/mergeinfo-sanitizer.py .
The mergeinfo has to be sanitized/reverse-merged for the following as:
/branches/CTF_MODE/SvnEdge: 515-766,


[kamesh@kamesh console]$ python  /tmp/mergeinfo-sanitizer.py --fix .
[kamesh@kamesh console]$ svn di
Property changes on: .
___________________________________________________________________
Modified: svn:mergeinfo
   Reverse-merged /branches/CTF_MODE/SvnEdge:r512-515

Based on my analysis on the above repo, I could see /branches/CTF_MODE/SvnEdge:r512-515 is bogus
and hence need to be removed. Anywway your script does it correctly with --fix.

I would suggest changing the output of default invocation(i.e no --fix)


<snip>
Bogus mergeinfo summary:
Target 1: .
/branches/CTF_MODE/SvnEdge: 512-515,
/merge/source2: X-Y,
..........

Target 2: subtree_path
/merge/source3: A-B
....

Target 3: sub_sub_tree
/merge/source4: C-D
......

Run with --fix to fix this in your working copy and commit yourself.


3. Your script seemed to sanitize to the depth of 'immediates' by default. What is the rational behind that?

4. I ran your script against https://svn.apache.org/repos/asf/subversion/trunk

And got the following output
<snip>
 The path /subversion/branches/diff-optimizations/subversion/libsvn_subr/adler32.c is not found
 The path /subversion/branches/diff-optimizations/subversion/include/private/svn_adler32.h is not found

The mergeinfo has to be sanitized/reverse-merged for the following as:
/subversion/trunk/subversion/libsvn_subr/svn_temp_serializer.c:
/subversion/branches/tree-conflicts/subversion/include/svn_string.h: 872524-873154,868290-872329,
/subversion/trunk/subversion/include/private/svn_temp_serializer.h:
/subversion/branches/diff-optimizations/subversion/libsvn_subr/adler32.c:
/subversion/branches/tree-conflicts/subversion/libsvn_diff/diff.h: 872524-873154,868290-872329,
/subversion/trunk/subversion/libsvn_fs_fs/temp_serializer.h:
/subversion/branches/svn-patch-improvements: 918520-934609,
/subversion/trunk/subversion/libsvn_fs_fs/temp_serializer.c:
/subversion/branches/tree-conflicts/subversion/libsvn_subr/hash.c: 872524-873154,868290-872329,
/subversion/branches/diff-optimizations/subversion/include/svn_string.h: 1031270-1037352,
/subversion/branches/diff-optimizations-bytes/subversion/include/private/svn_adler32.h: 1054361-1067789,
/subversion/branches/svn-patch-improvements/subversion/include/svn_string.h: 918520-934609,
/subversion/branches/diff-optimizations/subversion/libsvn_subr/hash.c: 1031270-1037352,
/subversion/branches/tree-conflicts: 872524-873154,868290-872329,
/subversion/branches/diff-optimizations/subversion/include/private/svn_adler32.h:
/subversion/branches/diff-optimizations-bytes/subversion/libsvn_subr/adler32.c: 1054361-1067789,
/subversion/branches/svn-patch-improvements/subversion/libsvn_diff/diff.h: 918520-934609,
/subversion/branches/diff-optimizations: 1031270-1037352,
/subversion/branches/svn-patch-improvements/subversion/libsvn_subr/hash.c: 918520-934609,
/subversion/branches/integrate-cache-item-serialization/subversion/libsvn_fs_fs/temp_serializer.h: 1068738-1068739,
/subversion/branches/integrate-cache-item-serialization/subversion/libsvn_subr/svn_temp_serializer.c: 1068723-1068739,
/subversion/branches/integrate-cache-item-serialization/subversion/libsvn_fs_fs/temp_serializer.c: 1068738-1068739,
/subversion/trunk/subversion/libsvn_diff/diff.h: 1037352-1054248,
/subversion/trunk/subversion/libsvn_subr/adler32.c: 1054276-1054360,
/subversion/trunk/subversion/include/private/svn_adler32.h: 1054276-1054360,
/subversion/branches/integrate-cache-item-serialization/subversion/include/private/svn_temp_serializer.h: 1068723-1068739,
/subversion/trunk/subversion/libsvn_diff/util.c: 1037352-1054251,
</snip>

path not found error need to be handled. error should be on STDERR.
5. In main() you can call check_local_modifications before get_original_mergeinfo().

6. In get_original_mergeinfo()

> if depth == 3:
Use the named constant than the arbitrary value.

>  if depth == 3:
>    for entry in mergeinfo_catalog:
>      pathwise_mergeinfo = pathwise_mergeinfo + mergeinfo_catalog[entry] + "\n"


I think you do something wrong with the above concatenation of mergeinfos of different targets.

>  else:
>    pathwise_mergeinfo = mergeinfo_catalog[wcpath]

>  return core.svn_mergeinfo_parse(pathwise_mergeinfo, temp_pool)


7. In get_new_location_segments():

>      for revision_range in parsed_original_mergeinfo[path]:
>        try:
>          ra.get_location_segments(ra_session, "", revision_range.end,
>                                   revision_range.end, revision_range.start + 1, receiver)
>        except svn.core.SubversionException:
>          try:
>            ra.get_location_segments(ra_session, "", core.SVN_INVALID_REVNUM,
>                                     revision_range.end, revision_range.start + 1, receiver)

Not sure why you run location segments against HEAD upon exception.


>          except svn.core.SubversionException:
>            print " The path %s is not found " % path

8. Function parse_args is not used.


9. Function receiver can have more meaningful name.


10. You need to organize your functions in such a way that it appears in some logical order.
For example I started my review from 'main()' which is in the middle of the file and I move above and below that function to review further.


With regards
Kamesh Jayachandran


-----Original Message-----
From: Prabhu Gnana Sundar [mailto:prabhugs@collab.net]
Sent: Mon 9/12/2011 2:25 PM
To: dev@subversion.apache.org
Subject: Re: [PATCH] python script for issue #3961, fixing the bogus mergeinfo
 
On Friday 19 August 2011 11:45 AM, Prabhu Gnana Sundar wrote:
> Correcting the issue number as #3961...
>

Minor fix in one of the commented descriptions... Attaching the updated 
script.
Please share your thoughts.





Thanks and regards
Prabhu


Re: [PATCH] python script for issue #3961, fixing the bogus mergeinfo

Posted by Prabhu Gnana Sundar <pr...@collab.net>.
On Friday 19 August 2011 11:45 AM, Prabhu Gnana Sundar wrote:
> Correcting the issue number as #3961...
>

Minor fix in one of the commented descriptions... Attaching the updated 
script.
Please share your thoughts.





Thanks and regards
Prabhu

Re: [PATCH] python script for issue #3961, fixing the bogus mergeinfo

Posted by Prabhu Gnana Sundar <pr...@collab.net>.
Correcting the issue number as #3961...


On Thursday 18 August 2011 08:32 PM, Prabhu Gnana Sundar wrote:
> Hi all,
>
> With reference to my earlier discussions in 
> http://svn.haxx.se/dev/archive-2011-07/0432.shtml
> I am attaching a python script which would find the bogus mergeinfo 
> and fix it.
>
> This script would eventually look for the location segments of the 
> source paths against the respective revision
> ranges present in the mergeinfo. With respect to the output of the 
> location segments it would create a new mergeinfo
> and store it in the ".newmergeinfo" file in the path from where the 
> python script was run. The hash of this file would
> be in the ".hashfile" in the same path.
>
> By default, the script would not fix the new changes in the working 
> copy. Fixing the new mergeinfo to the working copy
> can be achieved by passing the "--fix" option to the script.
>
> Also, this script can also authenticate against the self-signed ssl 
> servers.
>
> I tested this script on our  asf subversion repo found a few bogus 
> mergeinfo such as in the "tree-conflicts" branch:
>
> /subversion/branches/tree-conflicts with 872524-873154,868290-872329
>
> The above should be correct mergeinfo since it was not at all present 
> in the revision range: r872330-872524
>
> Also, I ran this script against the 
> https://ctf.open.collab.net/svn/repos/svnedge/trunk/console path and 
> found a bogus mergeinfo. (it has a self-signed cert)
>
> Please share your thoughts...
>
>
>
> Thanks and regards
> Prabhu