Description
Preface
I'm not sure whether this is a bug or whether I am misusing or misconfiguring the tool. If it is the latter, please treat this as a question; I would appreciate any information you can provide.
Thanks!
Describe the bug
When checking lines for equality, whitespace is correctly ignored. When generating the inline differences, however, some whitespace is still compared while other whitespace is dropped from the result.
To Reproduce
Steps to reproduce the behavior:
DiffRowGenerator generator = DiffRowGenerator.create()
.showInlineDiffs(true)
.inlineDiffByWord(true)
.ignoreWhiteSpaces(true)
.oldTag(f -> "~") //introduce markdown style for strikethrough
.newTag(f -> "**") //introduce markdown style for bold
.build();
//compute the differences for two test texts.
List<DiffRow> rows1 = generator.generateDiffRows(
Arrays.asList("This\nis\na\ntest."),
Arrays.asList("This is a test"));
or a more basic example using tabs instead of newlines...
//compute the differences for two test texts.
List<DiffRow> rows2 = generator.generateDiffRows(
Arrays.asList("This\tis\ta\ttest."),
Arrays.asList("This is a test"));
or an even more basic example that just changes the number of spaces...
//compute the differences for two test texts.
List<DiffRow> rows3 = generator.generateDiffRows(
Arrays.asList("This is a test."),
Arrays.asList("This is a test"));
Actual Result
- rows1:
(period is correctly identified as an "old" tag while newlines are gone)
`Thisisatest~.~`
(spaces are considered to be "new" tags when we are asking for them to be ignored)
`This** **is** **a** **test`
- rows2:
(period is correctly identified as an "old" tag while tabs are also treated as "old" tags)
`This~ ~is~ ~a~ ~test~.~`
(spaces are considered to be "new" tags when we are asking for them to be ignored)
`This** **is** **a** **test`
- rows3:
(period is correctly identified as an "old" tag while the spaces are also treated as "old" tags)
`This~ ~is~ ~a~ ~test~.~`
(spaces are considered to be "new" tags when we are asking for them to be ignored)
`This** **is** **a** **test`
Expected Behavior
- rows1:
This\nis\na\ntest~.~
This is a test
- rows2:
This\tis\ta\ttest~.~
This is a test
- rows3:
This is a test~.~
This is a test
Notes
If the period is kept at the end of the second string in any of the above examples, this does not happen and the entire line is identified as matching, as expected.
Suggested Fix
- Pass `DiffRowGenerator.equalizer` down to `DiffUtils.diff` when `DiffRowGenerator.generateInlineDiffs` is called.
- Use an internal identifier for merging/splitting instead of something that could be contained within the comparison.
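To illustrate the first suggestion, here is a minimal, self-contained sketch of what a whitespace-insensitive equalizer could look like when applied at the token level. It does not use the library and is not its implementation: the class `WhitespaceEqualizerSketch`, the helpers `normalize`, `tokenize` and `sameTokens`, and the tokenizing regex are all hypothetical names and choices made for this example. The library's own word splitting may differ (for instance, it may split punctuation into separate tokens).

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BiPredicate;

// Hypothetical sketch, not library code: shows how an equalizer that ignores
// whitespace would treat the inline tokens from the examples above.
class WhitespaceEqualizerSketch {

    // Trim the token and collapse every whitespace run to a single space.
    // A token made only of whitespace (tab, newline, spaces) becomes "".
    static String normalize(String token) {
        return token.trim().replaceAll("\\s+", " ");
    }

    // Assumed equalizer: two tokens match if their normalized forms are equal.
    static final BiPredicate<String, String> IGNORE_WHITESPACE =
            (a, b) -> normalize(a).equals(normalize(b));

    // Assumed tokenizer: split at every boundary between whitespace and
    // non-whitespace, keeping the whitespace runs as their own tokens.
    static List<String> tokenize(String text) {
        return Arrays.asList(text.split("(?<=\\s)(?=\\S)|(?<=\\S)(?=\\s)"));
    }

    // Pairwise comparison of two token lists under the given equalizer.
    static boolean sameTokens(List<String> a, List<String> b,
                              BiPredicate<String, String> equalizer) {
        if (a.size() != b.size()) {
            return false;
        }
        for (int i = 0; i < a.size(); i++) {
            if (!equalizer.test(a.get(i), b.get(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> tabs = tokenize("This\tis\ta\ttest");
        List<String> spaces = tokenize("This is a test");
        BiPredicate<String, String> strict = String::equals;

        // With the whitespace-insensitive equalizer, tabs and spaces match.
        System.out.println(sameTokens(tabs, spaces, IGNORE_WHITESPACE)); // true
        // With plain equality, every tab token differs from its space token.
        System.out.println(sameTokens(tabs, spaces, strict));            // false
    }
}
```

If the inline diff compared tokens with a predicate like this instead of plain equality, only the trailing period in the examples above would be reported as a change.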
System
- Java version: 8 (1.8.0_151)
- Diff Utils Version: 4.5