switch to diff format from REST API #26

2bndy5 · 2022-11-26T05:41:29Z

resolves #10 (again)
partially resolves #4 (in a compromising way)

Doc updates

cli_arg_parser object moved to its own module.
slightly changed how the CLI options are documented (using std:option built into Sphinx).
added docs for the new git.py module
tweaked some existing docstrings as some info was missing from recent updates

API changes

All diff parsing happens in get_list_of_changed_files() which uses functions from the new git.py module. This means that filter_out_non_src_files() doesn't need to check for the status or patch fields that were used in the JSON format of REST API calls.
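
To illustrate the kind of diff parsing described above, here is a minimal sketch. It is not the code in git.py: the name `parse_diff` and the returned mapping are hypothetical, and it only assumes standard unified diff text with `diff --git` file headers and `@@` hunk headers.

```python
import re
from typing import Dict, List

# Matches the header that starts each file's section in `git diff` output.
FILE_HEADER = re.compile(r"^diff --git a/(.+?) b/(.+)$")
# Matches a hunk header such as "@@ -10,2 +12,3 @@" and captures the new start line.
HUNK_HEADER = re.compile(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@")


def parse_diff(diff_text: str) -> Dict[str, List[int]]:
    """Map each changed file to the line numbers that were added in it."""
    changed: Dict[str, List[int]] = {}
    current_file = None
    new_line_no = 0
    # strip any leading newlines before walking the diff line by line
    for line in diff_text.lstrip("\n").splitlines():
        header = FILE_HEADER.search(line)
        if header:
            current_file = header.group(2)
            changed[current_file] = []
            new_line_no = 0
            continue
        hunk = HUNK_HEADER.search(line)
        if hunk:
            new_line_no = int(hunk.group(1))
            continue
        if current_file is None or new_line_no == 0:
            continue  # prelude or per-file metadata ("index", "---", "+++")
        if line.startswith("+"):
            changed[current_file].append(new_line_no)
            new_line_no += 1
        elif line.startswith(("-", "\\")):
            continue  # removed lines and "\ No newline" notes are not in the new file
        else:
            new_line_no += 1  # unchanged context line
    return changed
```

With a mapping like this, a later filtering step only needs file names and line numbers, which is why the JSON status and patch fields are no longer required.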

If executed from a local developer's environment, get_list_of_changed_files() will use the output provided by git status -v (with the summary info prelude stripped away). This should allow using options like lines-changed-only and files-changed-only without needing a CI environment.
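
As a rough illustration of stripping the summary prelude, the sketch below keeps everything from the first `diff --git` header onward. The function name `strip_status_prelude` is hypothetical, and the sketch assumes the verbose output places its diff after the summary text.

```python
def strip_status_prelude(status_output: str) -> str:
    """Return only the diff portion of `git status -v` output."""
    marker = "diff --git "
    if status_output.startswith(marker):
        return status_output  # no prelude to remove
    index = status_output.find("\n" + marker)
    if index == -1:
        return ""  # no diff section at all, so nothing to parse
    return status_output[index + 1 :]
```

The result can then be passed to the same parser that handles diffs obtained in CI, so local runs and CI runs share one code path.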

Unit test updates

It was getting pretty messy to test multiple commits from different repos that were defined in a hacky event_files.json file. We now have separate test resources to individually test multiple repos for multiple commits.

Furthermore, all tested repos' files are downloaded to the temporary test folder that is managed by pytest (as opposed to gitignoring the downloaded files in our tests/capture_tools_output folder). To retain test performance, I employed a caching tactic to avoid re-downloading the same files for similar tests.
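
The caching tactic could look roughly like the sketch below. The helper `get_cached_file` and its `download` callback are hypothetical names used for illustration; in a pytest suite, `cache_dir` and `test_dir` would typically come from temporary-directory fixtures.

```python
import shutil
from pathlib import Path
from typing import Callable


def get_cached_file(
    name: str,
    cache_dir: Path,
    test_dir: Path,
    download: Callable[[str], str],
) -> Path:
    """Copy `name` into `test_dir`, downloading it only on a cache miss."""
    cached = cache_dir / name
    if not cached.exists():
        cached.parent.mkdir(parents=True, exist_ok=True)
        cached.write_text(download(name), encoding="utf-8")
    target = test_dir / name
    # retain the repo's original structure inside the test folder
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy(cached, target)
    return target
```

Each test still gets its own copy of the file, so tests stay isolated while the download happens once.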

use git diff for anything that isn't a PR.

explicitly use commit hash in git show
also be sure to strip leading '\n' from diff output (during parsing)
switch from re.findall() to re.search()
manually set fetch depth to 3
set depth to 3 via workflow

on local dev env, use git status --verbose

- move cli_arg_parser object into separate module
- docs can now be built from src instead of installing the cpp-linter pkg

Since we're moving to using the diff format of REST API calls, these changes update unit tests that previously used the JSON format of REST API calls. Updated unit tests take longer to perform since we're now properly testing specific commits for multiple repos individually.

Note: the status field is no longer needed as the diff format provides more detail than the JSON format.

Cache the sample repo's downloaded files and copy to relevant test folder instead of re-downloading the same files for each test.

codecov-commenter · 2022-11-26T05:43:21Z

Codecov Report

Merging #26 (d2ebbf6) into main (53e4755) will increase coverage by 1.55%.
The diff coverage is 99.13%.

@@            Coverage Diff             @@
##             main      #26      +/-   ##
==========================================
+ Coverage   81.97%   83.52%   +1.55%     
==========================================
  Files           6        8       +2     
  Lines         649      692      +43     
==========================================
+ Hits          532      578      +46     
+ Misses        117      114       -3

Impacted Files	Coverage Δ
cpp_linter/thread_comments.py	`69.44% <ø> (ø)`
cpp_linter/__init__.py	`92.30% <80.00%> (+0.24%)`	⬆️
cpp_linter/cli.py	`100.00% <100.00%> (ø)`
cpp_linter/git.py	`100.00% <100.00%> (ø)`
cpp_linter/run.py	`76.94% <100.00%> (-2.21%)`	⬇️

- improve test coverage for get_list_of_changed_files() & git.py
- remove unused function in docs/conf.py
- restore test coverage of thread_comments.py
- reduce nested if condition despite pylint warning

shenxianpeng · 2022-11-29T10:51:38Z

is this PR ready to go? (just to make sure you're not waiting for me). looks like a big change.

2bndy5 · 2022-11-29T10:57:42Z

I was giving you time to review because it is a kinda big change, but it is well covered in the unit tests. Also, the news coming out of China sounds pretty serious, so take your time and try to stay safe.

shenxianpeng · 2022-11-29T11:25:59Z

Yes, it's a big change. I'll take time to study it, but I'm busy with work and my child. I hope it won't keep you waiting too long.
Haha, you know a lot. The government seems to be torn between opening up like your country and continuing its defensive measures, but fortunately our company lets us work from home, so I should be OK. Thank you for caring. You stay safe too.

cpp_linter/__init__.py

cpp_linter/run.py

cpp_linter/cli.py

cpp_linter/git.py

shenxianpeng · 2022-11-30T14:33:40Z

======================= 67 passed, 14 xfailed in 33.43s ========================
see: https://github.com/cpp-linter/cpp-linter/actions/runs/3553264301/jobs/5968687430
pytest reports 14 xfailed tests. Is that the expected result?

2bndy5 · 2022-11-30T16:17:08Z

"xfail" is short for "expected failure". So, yes "xfail" results are good.

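For readers unfamiliar with the marker, here is a small sketch of how an expected failure is declared in pytest. The helper `parse_version` and both tests are hypothetical and are not taken from this repo's test suite.

```python
import pytest


def parse_version(text: str) -> tuple:
    """Hypothetical helper that does not handle a leading 'v' yet."""
    return tuple(int(part) for part in text.split("."))


@pytest.mark.xfail(reason="leading 'v' is not supported yet", raises=ValueError)
def test_parse_version_with_prefix():
    # int("v1") raises ValueError, so pytest reports this test as "xfailed"
    assert parse_version("v1.2.3") == (1, 2, 3)


def test_parse_version_plain():
    # this test is expected to pass and is reported as "passed"
    assert parse_version("1.2.3") == (1, 2, 3)
```

An xfailed test does not fail the run, which is why a summary such as "67 passed, 14 xfailed" still counts as a successful run.
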
also update workflow matrix to include clang-tools v14 & v15

shenxianpeng

Sorry for keeping you waiting so long. I think we can go ahead and do the final test with the action repo.

A tentative suggestion from me: maybe we can refactor run.py for easy reading in the future.

Personally I prefer main.py instead of run.py; it's easy to see that it is the entry point of cpp_linter.
Reducing the number of functions in run.py by moving them into different .py files may make it easier to understand.

Btw, there is 1 Code Smell reported with sonar, it seems to complain the code may not be easy to understand.

2bndy5 · 2022-12-06T12:04:17Z

refactor run.py for easy reading in the future.

Yeah, I think some of the functions that use the REST API could go into its own rest.py module as well. I have no problem renaming run.py as main.py. Remember back when it was just a shell script named runchecks.sh?

reported with sonar, it seems to complain the code may not be easy to understand.

It complains about a nested inline condition in cli.py. I chose to ignore it since it is used in a lambda expression.
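
To show the general shape of such a complaint, here is a hypothetical example. It is not the code from cli.py; the names and values are invented for illustration.

```python
# A nested conditional expression inside a lambda: compact, but hard to scan.
to_verbosity = lambda x: 10 if x == "debug" else (20 if x == "info" else 30)


def to_verbosity_named(value: str) -> int:
    """Same mapping as above, written as a lookup with a default."""
    levels = {"debug": 10, "info": 20}
    return levels.get(value, 30)
```

Both callables return the same result; the named version is one way to satisfy a readability check without changing behavior.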

2bndy5 · 2022-12-06T12:17:45Z

0 Code Smells

Fixed it.

shenxianpeng · 2022-12-06T12:53:54Z

Yeah, I think some of the functions that use the REST API could go into its own rest.py module as well. I have no problem renaming run.py as main.py.

I like the idea of rest.py; maybe it can be refactored in a future PR.

Remember back when it was just a shell script named runchecks.sh?

Yes, you have a good memory!

2bndy5 · 2022-12-06T22:29:05Z

cpp_linter/run.py

+            raw_url = f"https://github.com/{GITHUB_REPOSITORY}/raw/{GITHUB_SHA}/"
+            raw_url += urllib.parse.quote(file["filename"], safe="")
+            logger.info("Downloading file from url: %s", raw_url)
+            Globals.response_buffer = requests.get(raw_url)
            # retain the repo's original structure
            Path.mkdir(file_name.parent, parents=True, exist_ok=True)
            file_name.write_text(Globals.response_buffer.text, encoding="utf-8")


This may not work on a submodule's srcs.

This may not be fork friendly in case someone runs the CI on a renamed fork.

f"https://github.com/{GITHUB_REPOSITORY}/raw/{GITHUB_SHA}/"

GITHUB_REPOSITORY will always be set to the name of the repo running the CI.

GITHUB_SHA will always be set to the specific commit of the repo running the CI.

By default, we ignore submodules. But if the CI didn't checkout submodules and a submodule was explicitly not ignored, this code will get hit. When we get an incorrect GET response (because the raw_url was wrong), then there may be undefined behavior:

file_name.write_text(Globals.response_buffer.text, encoding="utf-8")

On second thought, I don't think the diff will include changes to submodules' srcs. So, this should never be a problem.

And I think the fork-friendly problem might resolve itself because the specific commit should be inherited with the git history of the forked repo.
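
The concern about writing an incorrect GET response to disk can be sketched as follows. The URL assembly mirrors the quoted snippet, while the function names and the status-code guard are assumptions for illustration, not what this PR implements.

```python
import urllib.parse
from pathlib import Path


def build_raw_url(repository: str, sha: str, filename: str) -> str:
    """Assemble the raw-file URL the same way the quoted snippet does."""
    base = f"https://github.com/{repository}/raw/{sha}/"
    return base + urllib.parse.quote(filename, safe="")


def save_response(status_code: int, text: str, destination: Path) -> bool:
    """Write the downloaded text only when the request succeeded."""
    if status_code != 200:
        return False  # e.g. 404 for a wrong URL: avoid saving an error page
    # retain the repo's original structure
    destination.parent.mkdir(parents=True, exist_ok=True)
    destination.write_text(text, encoding="utf-8")
    return True
```

With a guard like this, a wrong raw_url results in a skipped file rather than a source file filled with an error page.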

tests/capture_tools_output/test_tools_output.py

2bndy5 · 2022-12-10T09:04:29Z

This CI hit the REST API rate limit for unauthenticated requests. https://github.com/cpp-linter/cpp-linter/actions/runs/3663491831/jobs/6193307913#step:8:251

This is because the test_get_changed_files() test actually requests data from the REST API.

shenxianpeng · 2022-12-10T09:23:31Z

All tests appear to have passed, it should be good to merge.

Unit test
The action repo test: "Test: Use git directly" (cpp-linter-action#125)
The user repo test: https://github.com/shenxianpeng/test-libvips/actions/runs/3663555213

shenxianpeng · 2022-12-10T09:27:33Z

This is because the test_get_changed_files() test actually requests data from the REST API.

So will the next run fail entirely because of this? Btw, another 1 Code Smell was reported :)

2bndy5 · 2022-12-10T09:43:19Z

yeah, I can refactor that logic. I pushed rather hastily...

2bndy5 · 2022-12-10T09:45:58Z

yeah I think another CI run on this branch will fail. I need to monkeypatch the test.

2bndy5 · 2022-12-11T11:07:30Z

Ok. I fixed the code smell issue by factoring out the logic to a private function. I also adjusted some unit tests that were directly using the REST API, so we shouldn't get those CI failures when the API rate limit is hit.
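
One way to keep a test away from the REST API is to replace the network call with a canned response. The sketch below uses the standard library's unittest.mock rather than pytest's monkeypatch fixture, and `get_changed_filenames` is a hypothetical helper, not a function from this repo.

```python
import json
import urllib.request
from unittest import mock


def get_changed_filenames(api_url: str) -> list:
    """Hypothetical helper that asks a REST endpoint for changed files."""
    with urllib.request.urlopen(api_url) as response:
        payload = json.loads(response.read().decode("utf-8"))
    return [entry["filename"] for entry in payload]


def test_get_changed_filenames_offline():
    """Replace urlopen so the test never touches the network."""
    fake_body = json.dumps([{"filename": "src/demo.cpp"}]).encode("utf-8")
    fake_response = mock.MagicMock()
    # urlopen() is used as a context manager, so configure __enter__
    fake_response.__enter__.return_value.read.return_value = fake_body
    with mock.patch("urllib.request.urlopen", return_value=fake_response):
        result = get_changed_filenames("https://api.example.invalid/files")
    assert result == ["src/demo.cpp"]
```

Because no request leaves the machine, a test written this way cannot hit the rate limit for unauthenticated requests.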

sonarqubecloud · 2022-12-11T11:07:49Z

Kudos, SonarCloud Quality Gate passed!

0 Bugs
0 Vulnerabilities
0 Security Hotspots
0 Code Smells

No Coverage information
0.0% Duplication

shenxianpeng · 2022-12-11T13:07:19Z

👍 I also reran the tests and they all passed. All looks good.

2bndy5 added 9 commits November 23, 2022 07:51

[no ci] get & parse diff for a commit

3d82aea

better patch parsing w/ regex

e62356a

use git diff for anything that isn't a PR.

try using different git cmds

1b8972a

explicitly use commit hash in git show
also be sure to strip leading '\n' from diff output (during parsing)
switch from re.findall() to re.search()
manually set fetch depth to 3
set depth to 3 via workflow

use diff format from REST API

b2d91ea

on local dev env, use git status --verbose

adjust some unit tests

6717c57

doc updates

0531927

- move cli_arg_parser object into separate module
- docs can now be built from src instead of installing the cpp-linter pkg

assemble raw_url for downloading missing files

8d8319b

improve unit test performance

a0c38ec

Cache the sample repo's downloaded files and copy to relevant test folder instead of re-downloading the same files for each test.

various improvements from CI review

5fc1743

- improve test coverage for get_list_of_changed_files() & git.py
- remove unused function in docs/conf.py
- restore test coverage of thread_comments.py
- reduce nested if condition despite pylint warning

2bndy5 force-pushed the use-git-directly branch from 02129e3 to 5fc1743 on November 26, 2022 07:38

another round of doc tweaks

5d6159d