-
Notifications
You must be signed in to change notification settings - Fork 41.1k
authentication: when cert verification fails store details in audit log #130540
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
authentication: when cert verification fails store details in audit log #130540
Conversation
Hi @vrutkovs. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
/remove-sig api-machinery |
/label api-review And @vrutkovs mind adding an integration test? Thx! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First pass
@@ -184,6 +194,14 @@ func (a *Authenticator) AuthenticateRequest(req *http.Request) (*authenticator.R | |||
clientCertificateExpirationHistogram.WithContext(req.Context()).Observe(remaining.Seconds()) | |||
chains, err := req.TLS.PeerCertificates[0].Verify(optsCopy) | |||
if err != nil { | |||
ctx := req.Context() | |||
audit.AddAuditAnnotation(ctx, | |||
"authentication.k8s.io/certificate-error", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the part that needs API review.
@@ -184,6 +194,14 @@ func (a *Authenticator) AuthenticateRequest(req *http.Request) (*authenticator.R | |||
clientCertificateExpirationHistogram.WithContext(req.Context()).Observe(remaining.Seconds()) | |||
chains, err := req.TLS.PeerCertificates[0].Verify(optsCopy) | |||
if err != nil { | |||
ctx := req.Context() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think you need this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure what you mean - we need this context so that AddAuditAnnotation
would store annotations in it
@@ -184,6 +194,14 @@ func (a *Authenticator) AuthenticateRequest(req *http.Request) (*authenticator.R | |||
clientCertificateExpirationHistogram.WithContext(req.Context()).Observe(remaining.Seconds()) | |||
chains, err := req.TLS.PeerCertificates[0].Verify(optsCopy) | |||
if err != nil { | |||
ctx := req.Context() | |||
audit.AddAuditAnnotation(ctx, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need an integration test to show that all the wiring actually works.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if we need rate limiting on this to prevent audit log spam if a broken client is hot looping requests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're not issuing a new audit log event, this merely updates the existing one
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know that, you are still increasing the size of audit events.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, we can truncate CN and Issuer CN in GetHumanCertDetail
to avoid audit log growing too large, WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure why are we concerned about the size of audit log event - users can enable audit-log-truncate-enabled
and tweak it with audit-log-truncate-max-event-size
setting.
if a client makes a request which contains an unacceptable client-cert AND a valid bearer token, will this check the cert, fail, audit-log the cert details, then validate the token, authenticate the request, and result in an audit event with authentication.k8s.io/certificate-error annotations that actually succeeded?
Yes, iiuc unionAuthRequestHandler
will pass it to the next handler. I think its desirable as it the only way I can think of to notify admins that certificate has expired / no longer valid
I don't have a firm handle on the exposure implications of logging issuer / subject for failed requests
Note that "logging" is a good word here - we're not storing certificate details in kube-apiserver logs, we're keeping them alongside other important private info - source IPs, query args for requests, sometimes even entire request / responses etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure why are we concerned about the size of audit log event - users can enable audit-log-truncate-enabled and tweak it with audit-log-truncate-max-event-size setting.
The annotations field in the audit event says values should be short, and we pay serialization cost for excessively large values even if we then truncate them. If this PR is intended to assist admins in finding invalid certificates, let's just pick a reasonable bound to prevent abuse and truncate the issuer and subject CN.
Yes, iiuc unionAuthRequestHandler will pass it to the next handler. I think its desirable as it the only way I can think of to notify admins that certificate has expired / no longer valid
If the request is actually failing as a result, including the issuer/subject CN of a presented client cert is probably ok? If the request actually succeeded, it's pretty weird to do. I'd still like feedback from the folks mentioned in https://github.com/kubernetes/kubernetes/pull/130540/files#r2058912959 on how concerning it is to log the issuer / subject of every unknown cert we encounter
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My calculations for maximum length of this string: max length(common name) + len("usages, serving") + len(" groups[]") + max length(organization) + len(" validServingFor=[]") + 3 * max length(domain defined attribute value length for IP) + 3 * max length(domain defined attribute value length for DNS names) + len(" issuer=") + len(" ( to (now=))") + 3*timestamp
based on RFC3280 assuming that the certificate has 3 IPs and 3 DNS names. 64 + 16 + 10 + 64 + 20 + 384 + 384 + 9 + 15 + 90
makes 1056
, so it should be safe to truncate the annotation to 1024 (we may cut part of current time, which isn't crucial since audit event will have the timestamp anyway). WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we probably don't need anything but the issuer and subject common names for identifying the certificate, the SANs / groups / timestamps seem less relevant for figuring out what client has the failing cert
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah yes, I started with that in 7d29cf0 and then Mo recommended reusing GetHumanCertDetail
. I'll update it to use custom function and limit it to 512 to fit serial number and two common names
@@ -110,6 +111,15 @@ func certificateIdentifier(c *x509.Certificate) string { | |||
) | |||
} | |||
|
|||
func certificateErrorIdentifier(c *x509.Certificate) string { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
GetHumanCertDetail
might be better, though it could make the audit log large?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps GetPrivateCertDetail
to underline that it contains sensitive data?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
GetHumanCertDetail
is an existing function in k8s.io/apiserver/pkg/server/dynamiccertificates
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I see, got it. Using it now - although I had to update it to truncate current time to seconds to pass unit tests (not before and not after are already being displayed like that now)
7d29cf0
to
b09fa5a
Compare
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: vrutkovs The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
b09fa5a
to
a50d1d6
Compare
/remove-sig api-machinery |
a50d1d6
to
9b3ed9d
Compare
In some cases finding the invalid cert by its serial number is tedious. However we can't expose more PII in kube-apiserver logs. This commit will add an audit log annotations with additional details: common name, issuer common name and serial number
9b3ed9d
to
b4713f3
Compare
/triage accepted |
The Kubernetes project currently lacks enough contributors to adequately respond to all PRs. This bot triages PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale |
What type of PR is this?
/kind bug
What this PR does / why we need it:
In some cases finding the invalid cert by its serial number is tedious. However we can't expose more PII in kube-apiserver logs.
This commit will add an audit log annotations with additional details: common name, issuer common name and serial number
Alternative solution to #130137
Which issue(s) this PR fixes:
Fixes #129600
Special notes for your reviewer:
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: