Skip to content

fix: optionally keep user-provided base URL for pagination #2149

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Aug 3, 2022

Conversation

iomarmochtar
Copy link
Contributor

Background

To enhance HTTP security our Gitlab is fronted by IAP for example https://gitlab.tools.domain.io but for internal communication from runner to Gitlab API we create another endpoint that protected by firewall rules so then it only can be accessed from certain IP only, for example https://runner.gitlab.tools.domain.io/ it actually using the reverse proxy that in the upstream destination will set Host header to the original one (gitlab.tools.domain.io).

But when we run a python script inside Gitlab pipeline to the endpoint (runner.gitlab.tools.domain.io) and pass argument all=True or iterator=True for loop all the data to all page will returning the base url as gitlab.tools.domain.io whereas it will causing the error due the request will be blocked by IAP.

Expected

The base url will be persist same as the first time set in the next url so the request will be expected goes as the first page when using pagination iterrator (all=True).

Fix

This MR will make sure the base in next url is the same. It's been tested in my case and working as expected.

@iomarmochtar iomarmochtar changed the title Keep base in next url same as what set fix: keep the base in next url same as what set in the first time Jul 17, 2022
@nejch nejch self-assigned this Jul 17, 2022
@codecov-commenter
Copy link

codecov-commenter commented Jul 17, 2022

Codecov Report

Merging #2149 (c0f35d4) into main (98bdb98) will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##             main    #2149   +/-   ##
=======================================
  Coverage   95.46%   95.46%           
=======================================
  Files          81       81           
  Lines        5353     5362    +9     
=======================================
+ Hits         5110     5119    +9     
  Misses        243      243           
Flag Coverage Δ
api_func_v4 81.38% <33.33%> (-0.07%) ⬇️
cli_func_v4 83.02% <22.22%> (+<0.01%) ⬆️
unit 87.29% <100.00%> (+0.02%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
gitlab/client.py 98.74% <100.00%> (+0.02%) ⬆️

@JohnVillalovos
Copy link
Member

I wonder if this would be better to be a config option? As probably the most likely case when this occurs is that something is wrong with their config.

@iomarmochtar
Copy link
Contributor Author

I wonder if this would be better to be a config option? As probably the most likely case when this occurs is that something is wrong with their config.

do you mean by set the config as parameter in class Gitlab, eg: keep_base_url=True, once they have the same case then just set it as True ?

@nejch
Copy link
Member

nejch commented Jul 18, 2022

Thanks for the work here @iomarmochtar! Have a look at #1978 as well, which covers mostly the same topic.

I agree with John we should at least not completely blindly follow either the server or the client-provided URL when there is a mismatch, hence my proposal in the issue above: I think we should issue a warning instead, unless explicitly configured to follow the base URL, and only then reconstruct the URL. Could we add that here? 🙇 Thanks again!


For context: I actually would consider this approach a misconfiguration of the external_url, at least until GitLab supports multiple urls. E.g. the way your admins have configured it does not really scale - if your devs use other tech you'll have to implement this in the node gitbeaker library, in the terraform GitLab provider, and other libraries/apps, since most of these follow the link headers.

Instead I think they should use clone_url in the runner config (which was designed for this purpose), and if needed have additional proxying done in front of specific runner endpoints. Because this also affects other aspects including many other types of links that GitLab returns in its API responses.

@nejch
Copy link
Member

nejch commented Jul 29, 2022

@iomarmochtar would you still like to work on this as an optional parameter as discussed above?

@iomarmochtar
Copy link
Contributor Author

thanks you for remind me @nejch ,almost forgot to continue it due office task. let me try in this weekend.

@iomarmochtar iomarmochtar force-pushed the persistence-next-url branch from c815b94 to c0f35d4 Compare July 30, 2022 01:41
@iomarmochtar
Copy link
Contributor Author

push as requested, please review it @nejch @JohnVillalovos

Copy link
Member

@nejch nejch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks again @iomarmochtar. I have a few small comments

@iomarmochtar iomarmochtar force-pushed the persistence-next-url branch from c0f35d4 to 9c02633 Compare August 2, 2022 16:35
@nejch
Copy link
Member

nejch commented Aug 3, 2022

Thanks @iomarmochtar. I have some ideas to reuse this elsewhere that I think are outside the scope of this PR, so I'll merge this as is, and I will do a little refactor after.

@nejch nejch changed the title fix: keep the base in next url same as what set in the first time fix: optionally keep user-provided base URL for pagination Aug 3, 2022
@nejch nejch merged commit e2ea8b8 into python-gitlab:main Aug 3, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants