You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@subversion.apache.org by Alan Wood <Al...@clear.net.nz> on 2009/03/12 09:40:59 UTC

Re: [PATCH] Re: bug when diffing

Hi all,
 I would just like to nudge this patch along. Could someone please take a look at it.
The patch does apply to trunk,

 Its been about 6 weeks hanging around, and I know 1.6 is more fun and we have had
a patch manager change too.

 I was thinking that it would be nice to be able to see a 'pending patches' list
somewhere. I'm sure Gavin has something like this but could it be made more
accessible to all?

Cheers
Alan Wood

On 25 Jan 2009 at 16:43, Alan Wood wrote:

> Hi,
>  I have continued on from this and think I have found the cause.
>  The 128k size if the chunk_size for the file diff routines.
>
>  At the boundary of a chunk in the file libsvn_diff/diff_file.c
>  the routine svn_diff__normalize_buffer() is called for
>  the part of the line that is in the first chunk and then again for
>  the part of the line that is in the new chunk.
>  If the start of the new chunk has characters that are skipped, in this
>  case the \n of the \r\n pair, then the file_token->norm_offset is incorrectly
>  moved forward.
>
>  A possible log entry:
> [[[
>  Fix issue with incorrect diff on 128k chunk boundaries
>  * subversion/libsvn_diff/diff_file.c
>     (datasource_get_next_token) don't skip chars at front of buffer
>       if buffer is not the start of a line
> ]]]
>
> Sorry patch is against 1.5.x, I haven't got sqlite yet.
>
> Alan Wood
>
> On 24 Jan 2009 at 19:53, Stefan Küng wrote:
>
> > Hi,
> >
> > Just spent the last four hours trying to find the bug in the lib_diff
> > but failed miserably. So I'm reporting the issue here hoping someone
> > else is more familiar with that part of the code.
> >
> > How to reproduce the problem:
> >
> > * get the attached zip file
> > * extract file1.txt and file2.txt
> >   (the files are identical but the first line)
> >
> > $ svnadmin create repo
> > $ svn co file://repo/ wc
> > $ cd wc
> > $ mv file1.txt wc/file1.txt
> > $ svn add file1.txt
> > $ svn ci -m ""
> > $ mv -f file2.txt file1.txt
> > $ svn diff file1.txt
> > .. one line different shown
> >
> > But:
> > $ svn diff file1.txt -x --ignore-eol-style
> >
> > shows a line diff on every 128kByte mark in the file!
> >
> > I first thought the shown diffs were random, but they're in fact
> > *exactly* at 128k boundaries.
> >
> > This happens with a trunk build from yesterday as well as with an 1.5.4
> > client.
> >
> >
> > Stefan
> >
> > --
> >        ___
> >   oo  // \\      "De Chelonian Mobile"
> >  (_,\/ \_/ \     TortoiseSVN
> >    \ \_/_\_/>    The coolest Interface to (Sub)Version Control
> >    /_/   \_\     http://tortoisesvn.net
> >
> > ------------------------------------------------------
> > http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=1047363
>
> ------------------------------------------------------
> http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=1048312

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=1312002

Re: [PATCH] Re: bug when diffing

Posted by Greg Stein <gs...@gmail.com>.
If you'd be more comfortable with a spreadsheet, then I'd say "go
ahead", and that you should use Google Spreadsheets with "public view"
turned on so that everybody can see it.

The issue tracker was used simply because we did not have a good way
to track the patches. The tracker does have the "attachment" feature,
which is handy (so you don't have to dig around the mailing list to
find the patch). I don't recall if Google Spreadsheets have
attachments. It probably does.

The tracker does allow for others to make edits, while you could share
the spreadsheet with other editors.

Personally, I believe it is your choice, and what makes things easier
for you to track these things. You *do* need to involve the other devs
and make it possible for them to know about outstanding patches. The
public spreadsheet works, along with (say) a weekly mailing of the
outstanding patches to the list. That said, I can defiitely see the
argument the other way: use the tracker for cooperative
editing/comments/etc.

Cheers,
-g

On Thu, Mar 12, 2009 at 15:01, Gavin 'Beau' Baumanis
<ga...@thespidernet.com> wrote:
> Hi everyone,
>
> When I originally volunteered, one of my first ideas was to a create a
> public facing web page of some sort,
> Perhaps just scheduling the HTMLing and publication of a spreadsheet, that I
> would keep up to date with details along the lines of;
>
> Initial Post Date
> Latest Post Date
> Issue #,
> Voted,
> Merged,
> Backported etc...
>
> I thought I would use it for my own personal requirements in keeping in
> touch with patches (as per my role) and if anyone else found it useful to
> look at  - well that's fine too.
> There's no "real" effort required in keeping it up to date and might well
> just stop a patch or two from falling into oblivion.
>
> I haven't actually done this though, because it seemed to be replicating the
> data housed in the issue tracker which Greg has pointed out..
>
> So I have just been keeping an eye on things through constant review of the
> @dev mailing list and using a few mail sorting rules.
> While I certainly pay "special" attention to any mail that has the [PATCH
> prefix - I review all @dev posts, just in case.
>
> I'm still happy to bring about the spreadsheet if people truly see it as
> being appropriate, otherwise I'll just keep on using the issue tracker as
> has been the "custom".
>
> Beau.
>
> On 12/03/2009, at 11:19 PM, Greg Stein wrote:
>
>> He may not have noticed because [PATCH] was not in the subject line
>> (we request that specifically for this reason)
>>
>> And unless things have changed since I was here last, the patches
>> should be hanging out in our issue tracker.
>>
>> Cheers,
>> -g
>>
>> On Thu, Mar 12, 2009 at 12:18, Stefan Sperling <st...@elego.de> wrote:
>>>
>>> On Thu, Mar 12, 2009 at 10:40:59PM +1300, Alan Wood wrote:
>>>>
>>>>  I was thinking that it would be nice to be able to see a 'pending
>>>>  patches' list somewhere. I'm sure Gavin has something like this but
>>>>  could it be made more accessible to all?
>>>
>>> That would certainly be nice!
>>>
>>> Stefan
>>>
>>>
>>
>> ------------------------------------------------------
>>
>> http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=1312600
>
>

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=1313193


Re: [PATCH] Re: bug when diffing

Posted by Gavin Baumanis <ga...@thespidernet.com>.
Hi everyone,

When I originally volunteered, one of my first ideas was to a create a  
public facing web page of some sort,
Perhaps just scheduling the HTMLing and publication of a spreadsheet,  
that I would keep up to date with details along the lines of;

Initial Post Date
Latest Post Date
Issue #,
Voted,
Merged,
Backported etc...

I thought I would use it for my own personal requirements in keeping  
in touch with patches (as per my role) and if anyone else found it  
useful to look at  - well that's fine too.
There's no "real" effort required in keeping it up to date and might  
well just stop a patch or two from falling into oblivion.

I haven't actually done this though, because it seemed to be  
replicating the data housed in the issue tracker which Greg has  
pointed out..

So I have just been keeping an eye on things through constant review  
of the @dev mailing list and using a few mail sorting rules.
While I certainly pay "special" attention to any mail that has the  
[PATCH prefix - I review all @dev posts, just in case.

I'm still happy to bring about the spreadsheet if people truly see it  
as being appropriate, otherwise I'll just keep on using the issue  
tracker as has been the "custom".

Beau.

On 12/03/2009, at 11:19 PM, Greg Stein wrote:

> He may not have noticed because [PATCH] was not in the subject line
> (we request that specifically for this reason)
>
> And unless things have changed since I was here last, the patches
> should be hanging out in our issue tracker.
>
> Cheers,
> -g
>
> On Thu, Mar 12, 2009 at 12:18, Stefan Sperling <st...@elego.de> wrote:
>> On Thu, Mar 12, 2009 at 10:40:59PM +1300, Alan Wood wrote:
>>>  I was thinking that it would be nice to be able to see a 'pending
>>>  patches' list somewhere. I'm sure Gavin has something like this but
>>>  could it be made more accessible to all?
>>
>> That would certainly be nice!
>>
>> Stefan
>>
>>
>
> ------------------------------------------------------
> http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=1312600

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=1313002

Re: [PATCH] Re: bug when diffing

Posted by Greg Stein <gs...@gmail.com>.
He may not have noticed because [PATCH] was not in the subject line
(we request that specifically for this reason)

And unless things have changed since I was here last, the patches
should be hanging out in our issue tracker.

Cheers,
-g

On Thu, Mar 12, 2009 at 12:18, Stefan Sperling <st...@elego.de> wrote:
> On Thu, Mar 12, 2009 at 10:40:59PM +1300, Alan Wood wrote:
>>  I was thinking that it would be nice to be able to see a 'pending
>>  patches' list somewhere. I'm sure Gavin has something like this but
>>  could it be made more accessible to all?
>
> That would certainly be nice!
>
> Stefan
>
>

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=1312600


Re: [PATCH] Re: bug when diffing

Posted by Stefan Sperling <st...@elego.de>.
On Thu, Mar 12, 2009 at 10:40:59PM +1300, Alan Wood wrote:
>  I was thinking that it would be nice to be able to see a 'pending
>  patches' list somewhere. I'm sure Gavin has something like this but
>  could it be made more accessible to all?

That would certainly be nice!

Stefan

Re: [PATCH] Re: bug when diffing

Posted by Gavin Baumanis <ga...@thespidernet.com>.
Thanks Greg,

Appreciate the help.

Gavin.


On 22/04/2009, at 9:48 AM, Greg Stein wrote:

> Yes, this patch has been applied. Backported to 1.6.1, too.
>
> On Wed, Apr 22, 2009 at 00:33, Gavin Baumanis  
> <ga...@thespidernet.com> wrote:
>> Hi Alan,
>>
>> I just realised that I hijacked your initial thread when I first took
>> over the role of the PM.
>>
>> Can I please bother you to let me know if your patch has been
>> committed or not?
>> If not - then this will obviously serve as an (overdue) ping for the
>> list.
>>
>> Gavin.
>>
>>
>> On 12/03/2009, at 8:40 PM, Alan Wood wrote:
>>
>>> Hi all,
>>> I would just like to nudge this patch along. Could someone please
>>> take a look at it.
>>> The patch does apply to trunk,
>>>
>>> Its been about 6 weeks hanging around, and I know 1.6 is more fun
>>> and we have had
>>> a patch manager change too.
>>>
>>> I was thinking that it would be nice to be able to see a 'pending
>>> patches' list
>>> somewhere. I'm sure Gavin has something like this but could it be
>>> made more
>>> accessible to all?
>>>
>>> Cheers
>>> Alan Wood
>>>
>>> On 25 Jan 2009 at 16:43, Alan Wood wrote:
>>>
>>>> Hi,
>>>> I have continued on from this and think I have found the cause.
>>>> The 128k size if the chunk_size for the file diff routines.
>>>>
>>>> At the boundary of a chunk in the file libsvn_diff/diff_file.c
>>>> the routine svn_diff__normalize_buffer() is called for
>>>> the part of the line that is in the first chunk and then again for
>>>> the part of the line that is in the new chunk.
>>>> If the start of the new chunk has characters that are skipped, in
>>>> this
>>>> case the \n of the \r\n pair, then the file_token->norm_offset is
>>>> incorrectly
>>>> moved forward.
>>>>
>>>> A possible log entry:
>>>> [[[
>>>> Fix issue with incorrect diff on 128k chunk boundaries
>>>> * subversion/libsvn_diff/diff_file.c
>>>>    (datasource_get_next_token) don't skip chars at front of buffer
>>>>      if buffer is not the start of a line
>>>> ]]]
>>>>
>>>> Sorry patch is against 1.5.x, I haven't got sqlite yet.
>>>>
>>>> Alan Wood
>>>>
>>>> On 24 Jan 2009 at 19:53, Stefan Küng wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> Just spent the last four hours trying to find the bug in the
>>>>> lib_diff
>>>>> but failed miserably. So I'm reporting the issue here hoping  
>>>>> someone
>>>>> else is more familiar with that part of the code.
>>>>>
>>>>> How to reproduce the problem:
>>>>>
>>>>> * get the attached zip file
>>>>> * extract file1.txt and file2.txt
>>>>>  (the files are identical but the first line)
>>>>>
>>>>> $ svnadmin create repo
>>>>> $ svn co file://repo/ wc
>>>>> $ cd wc
>>>>> $ mv file1.txt wc/file1.txt
>>>>> $ svn add file1.txt
>>>>> $ svn ci -m ""
>>>>> $ mv -f file2.txt file1.txt
>>>>> $ svn diff file1.txt
>>>>> .. one line different shown
>>>>>
>>>>> But:
>>>>> $ svn diff file1.txt -x --ignore-eol-style
>>>>>
>>>>> shows a line diff on every 128kByte mark in the file!
>>>>>
>>>>> I first thought the shown diffs were random, but they're in fact
>>>>> *exactly* at 128k boundaries.
>>>>>
>>>>> This happens with a trunk build from yesterday as well as with an
>>>>> 1.5.4
>>>>> client.
>>>>>
>>>>>
>>>>> Stefan
>>>>>
>>>>> --
>>>>>       ___
>>>>>  oo  // \\      "De Chelonian Mobile"
>>>>> (_,\/ \_/ \     TortoiseSVN
>>>>>   \ \_/_\_/>    The coolest Interface to (Sub)Version Control
>>>>>   /_/   \_\     http://tortoisesvn.net
>>>>>
>>>>> ------------------------------------------------------
>>>>> http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=1047363
>>>>
>>>> ------------------------------------------------------
>>>> http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=1048312
>>>
>>>
>>> <diff_file.patch>
>>
>> ------------------------------------------------------
>> http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=1849708
>>
>
> ------------------------------------------------------
> http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=1850768

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=1850894


Re: [PATCH] Re: bug when diffing

Posted by Greg Stein <gs...@gmail.com>.
Yes, this patch has been applied. Backported to 1.6.1, too.

On Wed, Apr 22, 2009 at 00:33, Gavin Baumanis <ga...@thespidernet.com> wrote:
> Hi Alan,
>
> I just realised that I hijacked your initial thread when I first took
> over the role of the PM.
>
> Can I please bother you to let me know if your patch has been
> committed or not?
> If not - then this will obviously serve as an (overdue) ping for the
> list.
>
> Gavin.
>
>
> On 12/03/2009, at 8:40 PM, Alan Wood wrote:
>
>> Hi all,
>> I would just like to nudge this patch along. Could someone please
>> take a look at it.
>> The patch does apply to trunk,
>>
>> Its been about 6 weeks hanging around, and I know 1.6 is more fun
>> and we have had
>> a patch manager change too.
>>
>> I was thinking that it would be nice to be able to see a 'pending
>> patches' list
>> somewhere. I'm sure Gavin has something like this but could it be
>> made more
>> accessible to all?
>>
>> Cheers
>> Alan Wood
>>
>> On 25 Jan 2009 at 16:43, Alan Wood wrote:
>>
>>> Hi,
>>> I have continued on from this and think I have found the cause.
>>> The 128k size if the chunk_size for the file diff routines.
>>>
>>> At the boundary of a chunk in the file libsvn_diff/diff_file.c
>>> the routine svn_diff__normalize_buffer() is called for
>>> the part of the line that is in the first chunk and then again for
>>> the part of the line that is in the new chunk.
>>> If the start of the new chunk has characters that are skipped, in
>>> this
>>> case the \n of the \r\n pair, then the file_token->norm_offset is
>>> incorrectly
>>> moved forward.
>>>
>>> A possible log entry:
>>> [[[
>>> Fix issue with incorrect diff on 128k chunk boundaries
>>> * subversion/libsvn_diff/diff_file.c
>>>    (datasource_get_next_token) don't skip chars at front of buffer
>>>      if buffer is not the start of a line
>>> ]]]
>>>
>>> Sorry patch is against 1.5.x, I haven't got sqlite yet.
>>>
>>> Alan Wood
>>>
>>> On 24 Jan 2009 at 19:53, Stefan Küng wrote:
>>>
>>>> Hi,
>>>>
>>>> Just spent the last four hours trying to find the bug in the
>>>> lib_diff
>>>> but failed miserably. So I'm reporting the issue here hoping someone
>>>> else is more familiar with that part of the code.
>>>>
>>>> How to reproduce the problem:
>>>>
>>>> * get the attached zip file
>>>> * extract file1.txt and file2.txt
>>>>  (the files are identical but the first line)
>>>>
>>>> $ svnadmin create repo
>>>> $ svn co file://repo/ wc
>>>> $ cd wc
>>>> $ mv file1.txt wc/file1.txt
>>>> $ svn add file1.txt
>>>> $ svn ci -m ""
>>>> $ mv -f file2.txt file1.txt
>>>> $ svn diff file1.txt
>>>> .. one line different shown
>>>>
>>>> But:
>>>> $ svn diff file1.txt -x --ignore-eol-style
>>>>
>>>> shows a line diff on every 128kByte mark in the file!
>>>>
>>>> I first thought the shown diffs were random, but they're in fact
>>>> *exactly* at 128k boundaries.
>>>>
>>>> This happens with a trunk build from yesterday as well as with an
>>>> 1.5.4
>>>> client.
>>>>
>>>>
>>>> Stefan
>>>>
>>>> --
>>>>       ___
>>>>  oo  // \\      "De Chelonian Mobile"
>>>> (_,\/ \_/ \     TortoiseSVN
>>>>   \ \_/_\_/>    The coolest Interface to (Sub)Version Control
>>>>   /_/   \_\     http://tortoisesvn.net
>>>>
>>>> ------------------------------------------------------
>>>> http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=1047363
>>>
>>> ------------------------------------------------------
>>> http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=1048312
>>
>>
>> <diff_file.patch>
>
> ------------------------------------------------------
> http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=1849708
>

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=1850768


Re: [PATCH] Re: bug when diffing

Posted by Gavin Baumanis <ga...@thespidernet.com>.
Hi Alan,

I just realised that I hijacked your initial thread when I first took  
over the role of the PM.

Can I please bother you to let me know if your patch has been  
committed or not?
If not - then this will obviously serve as an (overdue) ping for the  
list.

Gavin.


On 12/03/2009, at 8:40 PM, Alan Wood wrote:

> Hi all,
> I would just like to nudge this patch along. Could someone please  
> take a look at it.
> The patch does apply to trunk,
>
> Its been about 6 weeks hanging around, and I know 1.6 is more fun  
> and we have had
> a patch manager change too.
>
> I was thinking that it would be nice to be able to see a 'pending  
> patches' list
> somewhere. I'm sure Gavin has something like this but could it be  
> made more
> accessible to all?
>
> Cheers
> Alan Wood
>
> On 25 Jan 2009 at 16:43, Alan Wood wrote:
>
>> Hi,
>> I have continued on from this and think I have found the cause.
>> The 128k size if the chunk_size for the file diff routines.
>>
>> At the boundary of a chunk in the file libsvn_diff/diff_file.c
>> the routine svn_diff__normalize_buffer() is called for
>> the part of the line that is in the first chunk and then again for
>> the part of the line that is in the new chunk.
>> If the start of the new chunk has characters that are skipped, in  
>> this
>> case the \n of the \r\n pair, then the file_token->norm_offset is  
>> incorrectly
>> moved forward.
>>
>> A possible log entry:
>> [[[
>> Fix issue with incorrect diff on 128k chunk boundaries
>> * subversion/libsvn_diff/diff_file.c
>>    (datasource_get_next_token) don't skip chars at front of buffer
>>      if buffer is not the start of a line
>> ]]]
>>
>> Sorry patch is against 1.5.x, I haven't got sqlite yet.
>>
>> Alan Wood
>>
>> On 24 Jan 2009 at 19:53, Stefan Küng wrote:
>>
>>> Hi,
>>>
>>> Just spent the last four hours trying to find the bug in the  
>>> lib_diff
>>> but failed miserably. So I'm reporting the issue here hoping someone
>>> else is more familiar with that part of the code.
>>>
>>> How to reproduce the problem:
>>>
>>> * get the attached zip file
>>> * extract file1.txt and file2.txt
>>>  (the files are identical but the first line)
>>>
>>> $ svnadmin create repo
>>> $ svn co file://repo/ wc
>>> $ cd wc
>>> $ mv file1.txt wc/file1.txt
>>> $ svn add file1.txt
>>> $ svn ci -m ""
>>> $ mv -f file2.txt file1.txt
>>> $ svn diff file1.txt
>>> .. one line different shown
>>>
>>> But:
>>> $ svn diff file1.txt -x --ignore-eol-style
>>>
>>> shows a line diff on every 128kByte mark in the file!
>>>
>>> I first thought the shown diffs were random, but they're in fact
>>> *exactly* at 128k boundaries.
>>>
>>> This happens with a trunk build from yesterday as well as with an  
>>> 1.5.4
>>> client.
>>>
>>>
>>> Stefan
>>>
>>> --
>>>       ___
>>>  oo  // \\      "De Chelonian Mobile"
>>> (_,\/ \_/ \     TortoiseSVN
>>>   \ \_/_\_/>    The coolest Interface to (Sub)Version Control
>>>   /_/   \_\     http://tortoisesvn.net
>>>
>>> ------------------------------------------------------
>>> http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=1047363
>>
>> ------------------------------------------------------
>> http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=1048312
>
>
> <diff_file.patch>

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=1849708