Diffs recognizing less similarity since 4.10 #129
Comments
There was a change to decompress computed deltas whose source and target sizes differ. That was needed to correct some issues with multi-line diffs. But I will look into it. Maybe we could make this decompression skippable.
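To illustrate what decompressing a delta with different source and target sizes could look like, here is a minimal, self-contained sketch. It assumes decompression means splitting a CHANGE delta into an equal-sized CHANGE plus an INSERT or DELETE for the surplus lines. The `Delta` and `Kind` types and the `split` method are hypothetical stand-ins for illustration, not the library's classes or its actual implementation.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch; these types are simplified stand-ins, not the library's API. */
final class DeltaSplitSketch {

    enum Kind { CHANGE, INSERT, DELETE }

    /** Replaces the `source` lines at `sourcePos` with the `target` lines at `targetPos`. */
    static final class Delta {
        final Kind kind;
        final int sourcePos;
        final List<String> source;
        final int targetPos;
        final List<String> target;

        Delta(Kind kind, int sourcePos, List<String> source, int targetPos, List<String> target) {
            this.kind = kind;
            this.sourcePos = sourcePos;
            this.source = source;
            this.targetPos = targetPos;
            this.target = target;
        }
    }

    /**
     * Splits a CHANGE delta whose sides differ in size into an equal-sized CHANGE
     * followed by an INSERT (target is longer) or a DELETE (source is longer).
     * Any other delta is returned unchanged.
     */
    static List<Delta> split(Delta d) {
        int sourceSize = d.source.size();
        int targetSize = d.target.size();
        if (d.kind != Kind.CHANGE || sourceSize == targetSize) {
            return Collections.singletonList(d);
        }
        int common = Math.min(sourceSize, targetSize);
        List<Delta> parts = new ArrayList<>();
        if (common > 0) {
            // Lines that can be paired one-to-one stay a CHANGE.
            parts.add(new Delta(Kind.CHANGE,
                    d.sourcePos, d.source.subList(0, common),
                    d.targetPos, d.target.subList(0, common)));
        }
        if (sourceSize < targetSize) {
            // Surplus target lines become a separate INSERT.
            parts.add(new Delta(Kind.INSERT,
                    d.sourcePos + common, Collections.<String>emptyList(),
                    d.targetPos + common, d.target.subList(common, targetSize)));
        } else {
            // Surplus source lines become a separate DELETE.
            parts.add(new Delta(Kind.DELETE,
                    d.sourcePos + common, d.source.subList(common, sourceSize),
                    d.targetPos + common, Collections.<String>emptyList()));
        }
        return parts;
    }
}
```

Under this assumption, a change of 2 source lines into 3 target lines comes back from `split` as a 2-line CHANGE followed by a 1-line INSERT, so one delta becomes two.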
Could you check it? I added the possibility to skip the delta decompression. With it switched off, you get your old results.
This does indeed appear to address my issue. Thank you very much for the quick turnaround (quicker than I've even checked back). I worry a little about yet-another-builder-option, but I defer to your understanding of the code. Do you have some idea when the next release will be? In my production environment, due to security and network restrictions, I can only download published Maven artifacts. If you are curious, here is sample data for a test case that is closer to my real-world scenario: Thank you again!
Sorry about the late answer: What about
Since version 4.10 of the library, diffs are recognizing less similarity between texts.
In the attached program, under version 4.9, the library correctly recognizes that there is only a 5-character difference between the texts (3 letters + 2 whitespace characters). Under version 4.10, the library reports that a large block of identical text has been deleted and then added.
TestCase.txt
4.9 output:
4.10 output:
Admittedly, there are aspects of the 4.10 output that are improved over the 4.9 output. For example, in the 4.9 output, line 6 is marked as a CHANGE even though it has no oldLine text and no changes in the newLine text, which can be confusing. However, the sacrifice in accuracy in 4.10 is far less desirable. In 4.9, the line 6 difference is indeed a change, but it's almost like a new tag is needed to indicate a group (?) change to make it clear that the change is a continuation of the line 4 difference.