Avoid copying aggregated admin/edit/view roles during bootstrap #63761

Merged

liggitt
Member

@liggitt liggitt commented May 13, 2018

Fixes #63760

At apiserver startup, prior to reconciling cluster roles, the following roles (if they exist) are copied:

  • admin -> system:aggregate-to-admin
  • edit -> system:aggregate-to-edit
  • view -> system:aggregate-to-view

This copy was added in 1.9 as part of role aggregation. It ensures that custom permissions added to the admin/edit/view roles are preserved before those roles are made aggregated, since the permissions of an aggregated role are controller-managed.

When starting multiple members of a new HA cluster simultaneously, the following race can occur:

  • t=0, server 1,2,3 start up
  • t=1, server 1 finds no admin/edit/view roles exist, begins role reconciliation and creates the aggregated admin role
  • t=2, server 2 finds and copies the admin role created by server 1 to system:aggregate-to-admin

If this race occurs, system:aggregate-to-admin becomes an aggregated role, and its permissions are subject to being overwritten by the aggregating controller. To prevent this, the permission-preserving copy should only copy roles that are not yet aggregated.
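The guard described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `ClusterRole`, `AggregationRule`, and `copyForAggregation` are simplified, hypothetical stand-ins for the real RBAC types and bootstrap logic, and the choice to leave an existing destination role untouched is an assumption made for the example.

```go
package main

import "fmt"

// AggregationRule and ClusterRole are simplified, hypothetical stand-ins
// for the RBAC API types; only the fields needed here are modeled.
type AggregationRule struct {
	Selectors []string
}

type ClusterRole struct {
	Name            string
	Rules           []string
	AggregationRule *AggregationRule
}

// copyForAggregation copies store[from] to store[to] and reports whether a
// copy was made. It skips the copy when the source is missing, when the
// destination already exists (an assumption of this sketch), or when the
// source is already aggregated, which is the guard described in the text.
func copyForAggregation(store map[string]*ClusterRole, from, to string) bool {
	src, ok := store[from]
	if !ok {
		return false
	}
	if _, exists := store[to]; exists {
		return false
	}
	if src.AggregationRule != nil {
		// Another server already created the aggregated role; copying it
		// would make the destination aggregated as well.
		return false
	}
	store[to] = &ClusterRole{Name: to, Rules: append([]string(nil), src.Rules...)}
	return true
}

func main() {
	// Upgrade case: a non-aggregated admin role with custom permissions.
	upgraded := map[string]*ClusterRole{
		"admin": &ClusterRole{Name: "admin", Rules: []string{"get pods"}},
	}
	fmt.Println(copyForAggregation(upgraded, "admin", "system:aggregate-to-admin"))

	// Race case: server 1 has already created the aggregated admin role.
	raced := map[string]*ClusterRole{
		"admin": &ClusterRole{
			Name:            "admin",
			AggregationRule: &AggregationRule{Selectors: []string{"aggregate-to-admin"}},
		},
	}
	fmt.Println(copyForAggregation(raced, "admin", "system:aggregate-to-admin"))
}
```

In the race case the copy is skipped, so system:aggregate-to-admin is left for normal role reconciliation to create.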

To correct this in clusters that have already encountered it, role reconciliation should remove aggregation from a role that is not expected to be aggregated at all.
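A rough sketch of that corrective step, under the same caveats: the types and `reconcileAggregation` below are hypothetical simplifications, not the reconciler changed by this PR, and the sketch covers only the aggregation rule, not rule-by-rule permission reconciliation.

```go
package main

import "fmt"

// Simplified, hypothetical stand-ins for the RBAC API types.
type AggregationRule struct {
	Selectors []string
}

type ClusterRole struct {
	Name            string
	Rules           []string
	AggregationRule *AggregationRule
}

// reconcileAggregation returns the role that should be stored and whether an
// update is needed. If the expected role is not aggregated but the existing
// one is, the aggregation rule is dropped so that the aggregating controller
// stops managing the role's permissions.
func reconcileAggregation(existing, expected ClusterRole) (ClusterRole, bool) {
	if expected.AggregationRule == nil && existing.AggregationRule != nil {
		existing.AggregationRule = nil // existing is a copy; the caller's value is unchanged
		return existing, true
	}
	return existing, false
}

func main() {
	// A cluster that hit the race: the helper role was copied as aggregated.
	existing := ClusterRole{
		Name:            "system:aggregate-to-admin",
		AggregationRule: &AggregationRule{Selectors: []string{"aggregate-to-admin"}},
	}
	expected := ClusterRole{Name: "system:aggregate-to-admin"}

	fixed, changed := reconcileAggregation(existing, expected)
	fmt.Println(changed, fixed.AggregationRule == nil)
}
```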

Release note: Corrects a race condition in bootstrapping aggregated cluster roles in new HA clusters.

@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels May 13, 2018
@k8s-ci-robot k8s-ci-robot requested review from deads2k and enj May 13, 2018 19:33
@k8s-ci-robot k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels May 13, 2018
@liggitt liggitt added kind/bug Categorizes issue or PR as related to a bug. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. cherrypick-candidate sig/auth Categorizes an issue or PR as relevant to SIG Auth. labels May 13, 2018
@liggitt liggitt added this to the v1.10 milestone May 13, 2018
@deads2k
Contributor

deads2k commented May 14, 2018

This pull reminds me: what did we ever do about irreconcilable differences on rolebindings?

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 14, 2018
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: deads2k, liggitt


@k8s-github-robot

[MILESTONENOTIFIER] Milestone Pull Request: Up-to-date for process

@deads2k @liggitt

Pull Request Labels
  • sig/auth: Pull Request will be escalated to these SIGs if needed.
  • priority/important-soon: Escalate to the pull request owners and SIG owner; move out of milestone after several unsuccessful escalation attempts.
  • kind/bug: Fixes a bug discovered during the current release.

@liggitt
Member Author

liggitt commented May 14, 2018

What did we ever do about irreconcilable differences on rolebindings?

We've always deleted/recreated if needed:

// Reset the binding completely if the roleRef is different
if expected.GetRoleRef() != existing.GetRoleRef() {
	result.RoleBinding = expected
	result.Operation = ReconcileRecreate
	return result, nil
}

case ReconcileRecreate:
	// Try deleting
	err := o.Client.Delete(existingBinding.GetNamespace(), existingBinding.GetName(), existingBinding.GetUID())
	switch {
	case err == nil, errors.IsNotFound(err):
		// object no longer exists, as desired
	case errors.IsConflict(err):
		// delete failed because our UID precondition conflicted
		// this could mean another object exists with a different UID, re-run
		return o.run(attempts + 1)
	default:
		// return other errors
		return nil, err
	}
	// continue to create
	fallthrough
case ReconcileCreate:
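
The delete-then-recreate flow in the two excerpts above can be modeled end to end with a small in-memory sketch. Everything here is hypothetical and simplified for illustration: `Binding`, `fakeClient`, `errConflict`, `reconcile`, and the `maxAttempts` limit are not the real reconciler or client API, and a missing object on delete is treated as success, mirroring the IsNotFound case.

```go
package main

import (
	"errors"
	"fmt"
)

// Binding is a hypothetical, flattened stand-in for a role binding.
type Binding struct {
	Name    string
	UID     string
	RoleRef string
}

var errConflict = errors.New("uid precondition failed")

// fakeClient is a hypothetical in-memory store used only for this sketch.
type fakeClient struct {
	bindings map[string]Binding
	nextUID  int
}

// Delete removes the binding only if its UID matches the precondition.
// A missing binding counts as success, since the object is already gone.
func (c *fakeClient) Delete(name, uid string) error {
	b, ok := c.bindings[name]
	if !ok {
		return nil
	}
	if b.UID != uid {
		return errConflict
	}
	delete(c.bindings, name)
	return nil
}

// Create stores the binding under a freshly assigned UID.
func (c *fakeClient) Create(b Binding) {
	c.nextUID++
	b.UID = fmt.Sprintf("uid-%d", c.nextUID)
	c.bindings[b.Name] = b
}

const maxAttempts = 3 // assumed retry limit for the sketch

// reconcile returns the operation performed: "create", "recreate", or "none".
func reconcile(c *fakeClient, expected Binding, attempt int) (string, error) {
	if attempt > maxAttempts {
		return "", errors.New("exceeded max attempts")
	}
	existing, ok := c.bindings[expected.Name]
	switch {
	case !ok:
		c.Create(expected)
		return "create", nil
	case existing.RoleRef != expected.RoleRef:
		// The roleRef differs, so the binding is deleted and recreated.
		err := c.Delete(existing.Name, existing.UID)
		if errors.Is(err, errConflict) {
			// Another object with a different UID may exist; re-run.
			return reconcile(c, expected, attempt+1)
		}
		if err != nil {
			return "", err
		}
		c.Create(expected)
		return "recreate", nil
	default:
		return "none", nil
	}
}

func main() {
	c := &fakeClient{bindings: map[string]Binding{}}
	c.Create(Binding{Name: "team-binding", RoleRef: "edit"})

	op, err := reconcile(c, Binding{Name: "team-binding", RoleRef: "admin"}, 1)
	fmt.Println(op, err)
	op, err = reconcile(c, Binding{Name: "team-binding", RoleRef: "admin"}, 1)
	fmt.Println(op, err)
}
```

The UID precondition is what makes the delete safe under concurrency: a stale UID produces a conflict and a re-run instead of removing an object that another writer has since replaced.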

@k8s-github-robot

/test all [submit-queue is verifying that this PR is safe to merge]

@k8s-github-robot

Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here.

@k8s-github-robot k8s-github-robot merged commit d5a930b into kubernetes:master May 14, 2018
@liggitt liggitt deleted the aggregated-bootstrap-race branch May 14, 2018 23:49
k8s-github-robot pushed a commit that referenced this pull request May 15, 2018
…1-upstream-release-1.10

Automatic merge from submit-queue.

Automated cherry pick of #63761: Avoid copying aggregated admin/edit/view roles during

Cherry pick of #63761 on release-1.10.

#63761: Avoid copying aggregated admin/edit/view roles during
k8s-github-robot pushed a commit that referenced this pull request May 15, 2018
…1-upstream-release-1.9

Automatic merge from submit-queue.

Automated cherry pick of #63761: Avoid copying aggregated admin/edit/view roles during

Cherry pick of #63761 on release-1.9.

#63761: Avoid copying aggregated admin/edit/view roles during
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/auth Categorizes an issue or PR as relevant to SIG Auth. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.