Created on 2006-02-15.18:36:56 by vadim, last changed 2008-09-05.09:13:17 by rawagajah.
| File name |
Uploaded |
Type |
Edit |
Remove |
|
broken.diff
|
vadim,
2006-02-15.18:36:53
|
text/x-patch |
|
|
| msg6952 (view) |
Author: rawagajah |
Date: 2008-09-05.09:13:17 |
|
(I meant "hg diff -b 1" at the end there, of course, which gives the same poor diff)
|
| msg6931 (view) |
Author: rawagajah |
Date: 2008-09-03.17:33:26 |
|
Here's another demonstration of a poor diff caused by this bug, which standalone
"diff -ub" handles much better, giving a diff which is much easier for the user
to understand...
$ echo -e "X\n A\n B\n D\n E" > 1
$ echo -e "X\n A\n B\n C\n D\n E" > 2
$ diff -ub 1 2
--- 1 2008-09-03 18:24:35.000000000 +0100
+++ 2 2008-09-03 18:24:37.000000000 +0100
@@ -1,5 +1,6 @@
X
A
B
+ C
D
E
$ hg init
$ hg add 1 2
$ hg commit -m 'Check in'
$ mv 2 1
$ hg diff 1
diff -r 34bb3cd02d65 1
--- a/1 Wed Sep 03 18:24:47 2008 +0100
+++ b/1 Wed Sep 03 18:25:01 2008 +0100
@@ -1,5 +1,6 @@
X
- A
- B
- D
- E
+ A
+ B
+ C
+ D
+ E
|
| msg6774 (view) |
Author: haps |
Date: 2008-08-14.15:49:02 |
|
Well, you're absolutely correct on that one. I was positive that diff -w
ignored newline differences but some simple tests show me that I was way off on
that one. Maybe I was thinking of -wB, but even that will say that 'abc\ndef'
differs from 'ab\ncd\nef'. So please ignore my thought that -bB should act like -w.
|
| msg6773 (view) |
Author: djc |
Date: 2008-08-14.15:04:43 |
|
There are of course also the -b and -B options for newline stuff.
|
| msg6772 (view) |
Author: ThomasAH |
Date: 2008-08-14.15:02:43 |
|
Shouldn't the behaviour match that of the "normal" diff command?
diff (GNU diffutils) 2.8.1 considers with -w:
"foo bar\n" and "foo bar\n" to be equal, but "foo bar\n" and "foo\nbar\n" not,
so XML's idea of whitespace doesn't help here.
"foo bar\n" and "foo\rbar\r" to be equal, so \r seems to be viewed as whitespace.
|
| msg6771 (view) |
Author: haps |
Date: 2008-08-14.14:38:15 |
|
I agree, \n is sort of limiting to the non-windows world (who uses that anyway?:).
So, what about \s? I think though the most important thing is that it behaves
similarly to other diffs. At the moment it does not. Most definately, if -w is
changed to ignore \s+ then -b should be suitably updated (and if \r is included
in -w then -B should also be updated).
My key point is that -bB should be identical to -w. I hope it's clear and is
everyone in agreement with this? Can we move forward to fixing this issue?
|
| msg6769 (view) |
Author: tlynn |
Date: 2008-08-14.13:33:17 |
|
The definition of whitespace should include at least \r (' \t\r\n' is XML's idea
of whitespace). Alternatively you could use the '\s' regexp ('\t\n\v\f\r ',
usually the same as string.whitespace) or whatever the locale says. But at least \r.
|
| msg6649 (view) |
Author: haps |
Date: 2008-07-31.15:44:00 |
|
Hi;
I think that diff -w should be the same as diff -bB. When I use diff -bB it
acts like diff -w (no exhaustive test cases). The problem with the diff code
now I believe is that this old version
mercurial/mdiff.py, beginning at line 52 (version 0.9.5), old:
def wsclean(opts, text):
if opts.ignorews:
text = re.sub('[ \t]+', '', text)
elif opts.ignorewsamount:
text = re.sub('[ \t]+', ' ', text)
text = re.sub('[ \t]+\n', '\n', text)
if opts.ignoreblanklines:
text = re.sub('\n+', '', text)
return text
could be changed to something like:
def wsclean(opts, text):
if opts.ignorews:
text = re.sub('[ \t\n]+', '', text)
else:
if opts.ignorewsamount:
text = re.sub('[ \t]+', ' ', text)
text = re.sub('[ \t]+\n', '\n', text)
if opts.ignoreblanklines:
text = re.sub('\n+', '', text)
return text
|
| msg6051 (view) |
Author: jglick |
Date: 2008-05-20.18:32:37 |
|
A simple test case:
$ hg init
$ (echo 'doStuff'; echo 'doMoreStuff') > f
$ hg ci -A -m 1
adding f
$ (echo 'block {'; echo ' doStuff'; echo ' doMoreStuff'; echo '} end') > f
$ hg ci -m 2
$ hg cat -r 0 f > f_0
$ diff -uw f_0 f
--- f_0 2008-05-20 14:29:16.000000000 -0400
+++ f 2008-05-20 14:28:16.000000000 -0400
@@ -1,2 +1,4 @@
+block {
doStuff
doMoreStuff
+} end
$ hg --config diff.ignorews=1 di -r0:1
diff --git a/f b/f
--- a/f
+++ b/f
@@ -1,2 +1,4 @@
-doStuff
-doMoreStuff
+block {
+ doStuff
+ doMoreStuff
+} end
$
|
| msg4588 (view) |
Author: djc |
Date: 2007-12-13.12:48:26 |
|
I'll look into fixing this soon. Transcript from chat with pmezard:
13:35 < pmezard> djc: i think the lines should be normalized before sending
them to bdiff.blocks()
13:37 < djc> pmezard: indeed
13:37 < pmezard> "if wsclean(opts, "".join(old)) == wsclean(opts,
"".join(new)):" looks a bit simplistic
13:39 < pmezard> i don't know how expensive it would be to duplicate both line
lines. probably not too expensive, especially if done only for
ws cases.
13:39 < djc> pmezard: my concern was with dedenting a block of code
13:39 < djc> the diff becomes a bit unreadable if you do that
13:39 < djc> especially if the block is large
13:39 < djc> does this approach fix that problem?
13:40 < pmezard> only dedenting or dedenting + changes
13:41 < djc> well, I'd like to get a diff that ignores all the dedenting and
shows the other changes
13:41 < djc> and I figure diff -b should do that
13:41 < djc> but it doesn't
13:41 < pmezard> yes i think this case is currently ignore by the line i quote
above
13:41 < pmezard> it only checks for pure whitespaces changes in non-matching
blocks
13:42 < djc> right
13:42 < pmezard> problem is that any other changes make the whole block to be
emitted
13:43 < djc> exactly
13:43 < pmezard> so, you can create two copies of l1/l2 with every lines
normalized by wsclean, pass that to bdiff.blocks, then match
again with the original input
|
| msg4528 (view) |
Author: ThomasAH |
Date: 2007-12-08.09:58:05 |
|
Not fixed yet (tested with 1db1c7c1eb5e)
|
| msg4486 (view) |
Author: mpm |
Date: 2007-12-07.21:48:18 |
|
Did we fix this one?
|
| msg530 (view) |
Author: ThomasAH |
Date: 2006-02-21.13:49:07 |
|
changed title to match problem.
|
| msg485 (view) |
Author: mason |
Date: 2006-02-15.19:39:33 |
|
On Wednesday 15 February 2006 13:36, Vadim Gelfer wrote:
> New submission from Vadim Gelfer <vadim.gelfer@gmail.com>:
>
> i am trying to fix issue 126. when my fix does not work, i think it is my
> fault, but diff code is generating wrong output instead. i have attached
> broken diff. look for hunk at end. line with "-I" is only different with
> white space. but shows up in diff output as changed, even with -w option to
> diff.
It came as part of a larger hunk. The current ws handling only ignores hunks
where nothing but ws have changed.
-chris
|
| msg479 (view) |
Author: vadim |
Date: 2006-02-15.18:36:53 |
|
i am trying to fix issue 126. when my fix does not work, i think it is my fault,
but diff code is generating wrong output instead. i have attached broken diff.
look for hunk at end. line with "-I" is only different with white space. but
shows up in diff output as changed, even with -w option to diff.
|
|
| Date |
User |
Action |
Args |
| 2008-09-05 09:13:17 | rawagajah | set | nosy:
mpm, ThomasAH, mason, tksoh, jglick, djc, abuehl, tlynn, haps, rawagajah messages:
+ msg6952 |
| 2008-09-03 17:33:27 | rawagajah | set | nosy:
+ rawagajah messages:
+ msg6931 |
| 2008-08-14 15:49:03 | haps | set | nosy:
mpm, ThomasAH, mason, tksoh, jglick, djc, abuehl, tlynn, haps messages:
+ msg6774 |
| 2008-08-14 15:04:43 | djc | set | nosy:
mpm, ThomasAH, mason, tksoh, jglick, djc, abuehl, tlynn, haps messages:
+ msg6773 |
| 2008-08-14 15:02:44 | ThomasAH | set | nosy:
mpm, ThomasAH, mason, tksoh, jglick, djc, abuehl, tlynn, haps messages:
+ msg6772 |
| 2008-08-14 14:38:22 | haps | set | nosy:
mpm, ThomasAH, mason, tksoh, jglick, djc, abuehl, tlynn, haps messages:
+ msg6771 |
| 2008-08-14 13:33:20 | tlynn | set | nosy:
+ tlynn messages:
+ msg6769 |
| 2008-07-31 15:44:01 | haps | set | nosy:
+ haps messages:
+ msg6649 |
| 2008-06-06 05:12:14 | tksoh | set | nosy:
+ tksoh |
| 2008-06-05 08:26:39 | abuehl | set | nosy:
+ abuehl |
| 2008-05-20 18:32:37 | jglick | set | nosy:
+ jglick messages:
+ msg6051 |
| 2007-12-13 12:48:31 | djc | set | nosy:
+ djc messages:
+ msg4588 |
| 2007-12-08 09:58:05 | ThomasAH | set | nosy:
mpm, ThomasAH, mason messages:
+ msg4528 |
| 2007-12-07 21:48:18 | mpm | set | nosy:
+ mpm, - vadim messages:
+ msg4486 assignedto: mason -> |
| 2006-07-09 08:19:02 | tonfa | set | topic:
+ diff nosy:
vadim, ThomasAH, mason |
| 2006-02-21 13:51:40 | ThomasAH | link | issue126 superseder |
| 2006-02-21 13:51:25 | ThomasAH | set | nosy:
vadim, ThomasAH, mason title: diff does not ignore white space in hunks containing other changes -> hg diff -w does not ignore white space in hunks containing other changes |
| 2006-02-21 13:49:09 | ThomasAH | set | assignedto: mason title: diff does not ignore white space -> diff does not ignore white space in hunks containing other changes messages:
+ msg530 nosy:
+ mason, ThomasAH |
| 2006-02-15 19:39:39 | mason | set | status: unread -> chatting messages:
+ msg485 |
| 2006-02-15 18:36:56 | vadim | create | |
|