Skip to content

switch to diff format from REST API #26

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 18 commits into from
Dec 11, 2022
Merged

switch to diff format from REST API #26

merged 18 commits into from
Dec 11, 2022

Conversation

2bndy5
Copy link
Contributor

@2bndy5 2bndy5 commented Nov 26, 2022

resolves #10 (again)
partially resolves #4 (in a compromising way)

Doc updates

  • cli_arg_parser object moved to its own module.
  • slightly changed how the CLI options are documented (using std:option built into Sphinx).
  • added docs for the new git.py module
  • tweaked some existing docstrings as some info was missing from recent updates

API changes

All diff parsing happens in get_list_of_changed_files() which uses functions from the new git.py module. This means that filter_out_non_src_files() doesn't need to check for the status or patch fields that were used in the JSON format of REST API calls.

If executed from a local developer's environment, get_list_of_changed_files() will use the output provided by git status -v (with the summary info prelude stripped away). This should allow using options like lines-changed-only and files-changed-only without needing a CI environment.

unit test updates

It was getting pretty messy to test multiple commits from different repos that were defined in a hacky event_files.json file. We now have separate test resources to individually test multiple repos for multiple commits.

Furthermore, all tested repos' files are downloaded to the temporary test folder that is managed by pytest (as opposed to gitignoring the downloaded files in our tests/capture_tools_output folder). To retain test performance, I employed a caching tactic to avoid re-downloading the same files for similar tests.

use git diff for anything that isn't a PR.
explicitly use commit hash in git show

also be sure to strip leading '\n' from diff output (during parsing)

switch from re.findall() to re.search()

manually set fetch depth to 3

set depth to 3 via workflow
on local dev env, use git status --verbose
- move cli_arg_parser object into separate module
- docs can now be built from src instead of installing the cpp-linter pkg
Since we're moving to using the diff format of REST API calls, these changes update unit tests that previously used the JSON format of REST API calls.

Updated unit tests take longer to perform since we're now properly testing specific commits for multiple repos individually.

note: the status field is no longer needed as the diff format provides more detail than the JSON format.
Cache the sample repo's downloaded files and copy to relavant test folder instead of re-downloading the same files for each test.
@codecov-commenter
Copy link

codecov-commenter commented Nov 26, 2022

Codecov Report

Merging #26 (d2ebbf6) into main (53e4755) will increase coverage by 1.55%.
The diff coverage is 99.13%.

@@            Coverage Diff             @@
##             main      #26      +/-   ##
==========================================
+ Coverage   81.97%   83.52%   +1.55%     
==========================================
  Files           6        8       +2     
  Lines         649      692      +43     
==========================================
+ Hits          532      578      +46     
+ Misses        117      114       -3     
Impacted Files Coverage Δ
cpp_linter/thread_comments.py 69.44% <ø> (ø)
cpp_linter/__init__.py 92.30% <80.00%> (+0.24%) ⬆️
cpp_linter/cli.py 100.00% <100.00%> (ø)
cpp_linter/git.py 100.00% <100.00%> (ø)
cpp_linter/run.py 76.94% <100.00%> (-2.21%) ⬇️

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

- improve test coverage for get_list_of_changed_files() & git.py
- remove unused function in docs/conf.py
- restore test coverage of thread_comments.py
- reduce nested if condition despite pylint warning
@shenxianpeng
Copy link
Contributor

is this PR ready to go? (just to make sure you're not waiting for me). looks like a big change.

@2bndy5
Copy link
Contributor Author

2bndy5 commented Nov 29, 2022

I was giving you time to review because it is a kinda big change, but it is well covered in the unit tests. Also, the news coming out of China sounds pretty serious, so take your time and try to stay safe.

@shenxianpeng
Copy link
Contributor

Yes, it's a big change, I'll take time to study, and I'm busy with work and my child. Hope it won't let you wait too long.
Haha, you know a lot.. the government seems to be torn between being open like your country and continuing defense, but fortunately, our company can work from home, so it should be OK for me, thank you for caring. you too stay safe.

@shenxianpeng
Copy link
Contributor

======================= 67 passed, 14 xfailed in 33.43s ========================
see: https://github.com/cpp-linter/cpp-linter/actions/runs/3553264301/jobs/5968687430
There are 14 failed in pytest, is that the expected result?

@2bndy5
Copy link
Contributor Author

2bndy5 commented Nov 30, 2022

"xfail" is short for "expected failure". So, yes "xfail" results are good.

also update workflow matrix to include clang-tools v14 & v15
Copy link
Contributor

@shenxianpeng shenxianpeng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for staying with me for so long. I think we can go ahead and do the final test with the action repo.

An immature comment from me maybe we can refactor run.py for easy reading in the future.

  • personally I prefer main.py instead of run.py, it‘s easy to know that's the start of cpp_linter
  • reduce the number of functions in run.py and put them in the different py files may be easier to understand

Btw, there is Code Smell A 1 Code Smell reported with sonar, it seems to complain the code may not easy to understand.

@2bndy5
Copy link
Contributor Author

2bndy5 commented Dec 6, 2022

refactor run.py for easy reading in the future.

Yeah, I think some of the functions that use the REST API could go into its own rest.py module as well. I have no problem renaming run.py as main.py. Remember back when it was just a shell script named runchecks.sh?

reported with sonar, it seems to complain the code may not easy to understand.

It complains about a nest inline condition in cli.py. I chose to ignore it since it is used in a lamdba statement.

@2bndy5
Copy link
Contributor Author

2bndy5 commented Dec 6, 2022

Code Smell A 0 Code Smells

Fixed it.

@shenxianpeng
Copy link
Contributor

shenxianpeng commented Dec 6, 2022

Yeah, I think some of the functions that use the REST API could go into its own rest.py module as well. I have no problem renaming run.py as main.py.

I like the idea of rest.py, maybe it can be refactored in the next(future) PR.

Remember back when it was just a shell script named runchecks.sh?

Yes, you have a good memory!

Comment on lines +209 to 215
raw_url = f"https://github.com/{GITHUB_REPOSITORY}/raw/{GITHUB_SHA}/"
raw_url += urllib.parse.quote(file["filename"], safe="")
logger.info("Downloading file from url: %s", raw_url)
Globals.response_buffer = requests.get(raw_url)
# retain the repo's original structure
Path.mkdir(file_name.parent, parents=True, exist_ok=True)
file_name.write_text(Globals.response_buffer.text, encoding="utf-8")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may not work on a submodules' srcs.

This may not be fork friendly in case someone runs the CI on a renamed fork.

f"https://github.com/{GITHUB_REPOSITORY}/raw/{GITHUB_SHA}/"
  • GITHUB_REPOSITORY will always be set to the name of the repo running the CI.
  • GITHUB_SHA will always be set to the commit specific of the repo running the CI.

By default, we ignore submodules. But if the CI didn't checkout submodules and a submodule was explicitly not ignored, this code will get hit. When we get an incorrect GET response (because the raw_url was wrong), then there may be undefined behavior:

file_name.write_text(Globals.response_buffer.text, encoding="utf-8")

Copy link
Contributor Author

@2bndy5 2bndy5 Dec 7, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On second thought, I don't think the diff will include changes to submodules' srcs. So, this should never be a problem.

And I think the fork-friendly problem might resolve itself because the specific commit should be inherited with the git history of the forked repo.

@2bndy5
Copy link
Contributor Author

2bndy5 commented Dec 10, 2022

This CI hit the REST API rate limit for unauthenticated requests. https://github.com/cpp-linter/cpp-linter/actions/runs/3663491831/jobs/6193307913#step:8:251

This is because the test_get_changed_files() test actually requests data from the REST API.

@shenxianpeng
Copy link
Contributor

All tests appear to have passed, it should be good to merge.

@shenxianpeng
Copy link
Contributor

This is because the test_get_changed_files() test actually requests data from the REST API.

So will the next run fail all because of this? Btw, another 1 Code Smell was reported :)

@2bndy5
Copy link
Contributor Author

2bndy5 commented Dec 10, 2022

yeah, I can refactor that logic. I pushed rather hastily...

@2bndy5
Copy link
Contributor Author

2bndy5 commented Dec 10, 2022

yeah I think another CI run on this branch will fail. I need to monkeypatch the test.

@2bndy5 2bndy5 force-pushed the use-git-directly branch 2 times, most recently from 5490ae6 to 8630564 Compare December 11, 2022 10:45
@2bndy5
Copy link
Contributor Author

2bndy5 commented Dec 11, 2022

Ok. I fixed the code smell issue by factoring out the logic to a private function. I also adjusted some unit tests that were directly using the REST API, so we shouldn't get those CI failures when the API rate limit is hit.

@sonarqubecloud
Copy link

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 0 Code Smells

No Coverage information No Coverage information
0.0% 0.0% Duplication

@shenxianpeng
Copy link
Contributor

👍 I also rerun the tests, they all passed. all looks good.

@shenxianpeng shenxianpeng added bug Something isn't working enhancement New feature or request labels Dec 11, 2022
@2bndy5 2bndy5 merged commit b37e680 into main Dec 11, 2022
@2bndy5 2bndy5 deleted the use-git-directly branch December 11, 2022 19:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Line filtering error on large patches switching to git directly for getting an event's diff
3 participants