
Conversation

tmshort
Contributor

@tmshort tmshort commented Aug 12, 2025

This does not remove the kustomize config, but instead puts a helm chart into the repo that should give very close (but not identical) results.

  • Adds a new chart: helm/olmv1/
    • standard
    • experimental
    • openshift
    • cert-manager
    • e2e
    • tilt
  • Adds "values" files in helm/
  • Adds helm executable to .bingo/
  • Updates documents in docs/drafts/
  • Updates tests in tests/
  • Updates make manifests to use the helm chart
    • Updates the checked-in manifests
    • Uses a tool like dyff to properly diff the manifests (see the sketch below)
  • Pulls RBAC and webhook config out of the Go code
    • controller-tools is no longer used to generate RBAC/webhooks
    • These resources are not part of the helm chart
    • The CRDs are still generated via kubebuilder
  • Add Prometheus Helm chart
    • This is just the configuration needed by OLMv1; it is separate because Prometheus is installed via a downloaded kustomize configuration after OLMv1 is installed.
  • Organizes chart files into subdirectories.
  • Adds Makefile variables to assist in manipulating manifest generation.

Significant changes to the resulting manifests are listed in the RFC.
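
As a rough sketch of the new render-and-compare flow (the paths, output names, and flags here are illustrative assumptions, not taken verbatim from the Makefile):

```bash
# Render the standard configuration with the bingo-managed helm binary.
./bin/helm template olmv1 helm/olmv1 > /tmp/standard-helm.yaml

# Compare against the checked-in manifest; dyff is YAML-aware, so key
# reordering does not show up as noise the way plain `diff` would.
dyff between manifests/standard.yaml /tmp/standard-helm.yaml
```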

Assisted-by: Gemini (research)
Assisted-by: Claude (analysis)

Reviewer Checklist

  • API Go Documentation
  • Tests: Unit Tests (and E2E Tests, if appropriate)
  • Comprehensive Commit Messages
  • Links to related GitHub Issue(s)

@tmshort tmshort requested a review from a team as a code owner August 12, 2025 20:52
@openshift-ci openshift-ci bot requested review from bentito and thetechnick August 12, 2025 20:52

netlify bot commented Aug 12, 2025

Deploy Preview for olmv1 ready!

🔨 Latest commit: e0605c2
🔍 Latest deploy log: https://app.netlify.com/projects/olmv1/deploys/68adf9620324670008af449a
😎 Deploy Preview: https://deploy-preview-2145--olmv1.netlify.app


codecov bot commented Aug 12, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 72.64%. Comparing base (dff07d5) to head (e0605c2).
⚠️ Report is 5 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2145      +/-   ##
==========================================
- Coverage   72.79%   72.64%   -0.16%     
==========================================
  Files          79       79              
  Lines        7340     7347       +7     
==========================================
- Hits         5343     5337       -6     
- Misses       1651     1662      +11     
- Partials      346      348       +2     
| Flag | Coverage Δ |
|---|---|
| e2e | 44.23% <ø> (-0.20%) ⬇️ |
| experimental-e2e | 54.80% <ø> (-0.17%) ⬇️ |
| unit | 58.28% <ø> (+0.03%) ⬆️ |

Flags with carried forward coverage won't be shown.

☔ View full report in Codecov by Sentry.

@@ -24,7 +24,7 @@ To enable the Helm Chart support feature gate, you need to patch the `operator-c
2. **Wait for the controller manager pods to be ready:**

```diff
- $ kubectl -n olmv1-system wait --for condition=ready pods -l control-plane=operator-controller-controller-manager
+ $ kubectl -n olmv1-system wait --for condition=ready pods -l app.kubernetes.io/name=operator-controller
```
Contributor

@camilamacedo86 camilamacedo86 Aug 13, 2025

I think you added it in the RFC, but could it not be a breaking change?
Should we keep both?

Contributor Author

I don't think it's a breaking change, it's only in our draft documentation.

Contributor

I mean, people could theoretically have scripts, etc., that filter things with control-plane=operator-controller-controller-manager

Contributor Author

@tmshort tmshort Aug 13, 2025

theoretically... but I think our user base is very small and OCP focused. Also, I don't think we consider our YAML a published API.
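
For anyone who does have such scripts, the change amounts to a one-line label swap (a hypothetical before/after; the new value follows the standard app.kubernetes.io labeling convention the chart uses):

```bash
# Old selector, as produced by the kustomize config:
kubectl -n olmv1-system get pods -l control-plane=operator-controller-controller-manager

# New selector, as produced by the helm chart:
kubectl -n olmv1-system get pods -l app.kubernetes.io/name=operator-controller
```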

@tmshort tmshort force-pushed the helm-config branch 4 times, most recently from 7beca6e to 70474d0 on August 19, 2025 19:12

TMPDIR="$(mktemp -d)"
trap 'echo "Cleaning up $TMPDIR"; rm -rf "$TMPDIR"' EXIT
#trap 'echo "Cleaning up $TMPDIR"; rm -rf "$TMPDIR"' EXIT
Contributor

@camilamacedo86 camilamacedo86 Aug 19, 2025

Should it really be commented out?

# incremented each time you make changes to the application. Versions are not expected to
# follow Semantic Versioning. They should reflect the version the application is using.
# It is recommended to use it with quotes.
appVersion: "1.3.0"
Contributor

That is interesting; how do we update this value when we do a new release?
We might want a GitHub action that opens a PR so we do not forget.

Contributor Author

I'm not sure how we update this, or if we even care, because it's the commit we'd want the 1.x tag to be on.

Contributor Author

Since it's optional, I removed it.

olm.operatorframework.io/feature-set: experimental
name: operator-controller-controller-manager
namespace: olmv1-system
olm.operatorframework.io/feature-set: experimental-e2e
Contributor

Seems that we are indeed fixing it 👍

app.kubernetes.io/name: e2e
app.kubernetes.io/part-of: olm
name: e2e-registries-conf
namespace: olmv1-system
Contributor

Cool, it is showing up only in the e2e ones 👍

@tmshort tmshort force-pushed the helm-config branch 2 times, most recently from 6a6f688 to 7f7fab8 on August 25, 2025 18:00
This does not remove the kustomize config, but instead puts a helm chart
into the repo that should give very close (but not identical) results.

* Adds a new chart: helm/olmv1/
  - standard
  - experimental
  - openshift
  - cert-manager
  - e2e
  - tilt
* Adds "values" files in helm/
* Adds helm executable to .bingo/
* Updates documents in docs/drafts/
* Updates tests in tests/
* Updates `make manifests` to use the helm chart
  - Updates the checked-in manifests
  - Uses a tool like `dyff` to properly diff the manifests
* Pulls RBAC and webhook config out of the Go code
  - controller-tools is no longer used to generate RBAC/webhooks
  - These resources are not part of the helm chart
  - The CRDs are still generated via kubebuilder

Significant changes to the resulting manifests are listed in the RFC.

Signed-off-by: Todd Short <tshort@redhat.com>
Assisted-by: Gemini (research)
Assisted-by: Claude (analysis)
This is currently separate due to the ordering of application.
If we change the order, this could be included in the main Helm Chart.

Signed-off-by: Todd Short <tshort@redhat.com>
Assisted-by: Gemini (research)
Assisted-by: Claude (analysis)
@tmshort
Contributor Author

tmshort commented Aug 25, 2025

From Claude Code:

Helm Charts Analysis

This repository contains 3 Helm charts for the OLMv1 (Operator Lifecycle Manager v1) project:

  1. olmv1 Chart (helm/olmv1/)
  • Purpose: Main chart for deploying OLMv1 components
  • Components:
    • Operator Controller (manages ClusterExtensions)
    • Catalogd (manages ClusterCatalogs)
    • Optional cert-manager integration
    • OpenShift-specific configurations
  • Key Features:
    • Modular design - components can be enabled/disabled independently
    • Support for both standard and experimental feature sets
    • Comprehensive RBAC configuration
    • Network policies for security isolation
    • Multi-architecture support (amd64, arm64, ppc64le, s390x)
  2. prometheus Chart (helm/prometheus/)
  • Purpose: Monitoring stack for OLMv1
  • Components: Prometheus instance with ServiceMonitors
  • Monitors: Catalogd, operator-controller, and kubelet metrics
  3. Configuration Files
  • cert-manager.yaml, e2e.yaml, experimental.yaml, tilt.yaml - Environment-specific configurations

Security Best Practices Found:

✅ Strong Security Posture:

  • runAsNonRoot: true in pod security contexts
  • readOnlyRootFilesystem: true for containers
  • allowPrivilegeEscalation: false
  • All capabilities dropped (drop: [ALL])
  • seccomp profile set to RuntimeDefault
  • Principle of least privilege in RBAC roles
  • Network policies for traffic isolation
  • Node affinity constraints for control plane scheduling

✅ Operational Best Practices:

  • Proper resource naming and labeling
  • Graceful termination handling
  • Health probes configuration
  • Leader election for high availability
  • Comprehensive monitoring integration

The charts follow Kubernetes and Helm best practices with a strong focus on security and operational excellence.

@tmshort
Contributor Author

tmshort commented Aug 25, 2025

Recommend the use of https://github.com/homeport/dyff to evaluate the manifest YAML differences.
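
A minimal invocation for reviewers (the file paths are illustrative):

```bash
# Install dyff, then structurally compare the manifests on main against this branch.
go install github.com/homeport/dyff/cmd/dyff@latest
dyff between <(git show main:manifests/standard.yaml) manifests/standard.yaml
```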

@tmshort tmshort requested a review from camilamacedo86 August 26, 2025 13:31
algorithm: ECDSA
rotationPolicy: Always
size: 256
secretName: operator-controller-cert
Contributor

@tmshort we changed the name; it was olmv1-cert before.
But I think it is fine. Downstream, would we need to do something on upgrade to delete the old secret?

Contributor Author

This is upstream-only (i.e., cert-manager). Downstream, we use the service-ca.

@@ -1,20 +1,122 @@
---
Contributor

@camilamacedo86 camilamacedo86 Aug 26, 2025

| Resource Type | File: standard.yaml | File: standard-helm.yaml | Difference | Review & Considerations |
|---|---|---|---|---|
| Namespace | Simple namespace manifest. | Namespace manifest is templated with additional labels and security annotations. | Templated with additional metadata. | ✅ The additions show fine. |
| ServiceAccount | Only one ServiceAccount has labels. | Both ServiceAccounts have app.kubernetes.io/name and app.kubernetes.io/part-of labels. | More consistent and descriptive labeling. | |
| Service Selector | Uses control-plane: <name> for the pod selector. | Uses app.kubernetes.io/name: <name> for the pod selector. | Uses a different labeling convention. | ✅ I am a little concerned because technically it is a breaking change, but we discussed it in #2145 (comment) and I agree. |
| Deployment Selector | Uses control-plane: <name> in the selector. | Uses app.kubernetes.io/name: <name> in the selector. | Uses a different labeling convention. | ✅ Same as #2145 (comment). |
| Deployment Args | Uses the $(POD_NAMESPACE) variable for the external-address argument. | Uses a hardcoded catalogd-service.olmv1-system.svc for the external-address argument. | Hardcoded value instead of a variable. | ✅ That is fine; we get it from the values: --external-address=catalogd-service.{{ .Values.namespaces.olmv1.name }}.svc |
| Volumes | Uses secret name olmv1-cert for operator controller certificates. | Uses secret name operator-controller-cert for operator controller certificates. | Different naming convention for secrets. | ✅ It seems fine; raised here: https://github.com/operator-framework/operator-controller/pull/2145/files#r2301592489 |

@tmshort look, the delta all shows fine. 👍

@@ -11,10 +11,6 @@ import (
ocv1 "github.com/operator-framework/operator-controller/api/v1"
)

// +kubebuilder:webhook:admissionReviewVersions={v1},failurePolicy=Fail,groups=olm.operatorframework.io,mutating=true,name=inject-metadata-name.olm.operatorframework.io,path=/mutate-olm-operatorframework-io-v1-clustercatalog,resources=clustercatalogs,verbs=create;update,versions=v1,sideEffects=None,timeoutSeconds=10

// +kubebuilder:rbac:groups=olm.operatorframework.io,resources=clustercatalogs,verbs=get;list;watch;patch;update
Contributor

The reason is described in the RFC.
It would be amazing if we could keep it, but the trade-off seems to make sense 👍

@@ -88,7 +88,7 @@ var prometheuSpec = allowedPolicyDefinition{
// Ref: https://docs.google.com/document/d/1bHEEWzA65u-kjJFQRUY1iBuMIIM1HbPy4MeDLX4NI3o/edit?usp=sharing
var allowedNetworkPolicies = map[string]allowedPolicyDefinition{
"catalogd-controller-manager": {
selector: metav1.LabelSelector{MatchLabels: map[string]string{"control-plane": "catalogd-controller-manager"}},
selector: metav1.LabelSelector{MatchLabels: map[string]string{"app.kubernetes.io/name": "catalogd"}},
Contributor

All labels that we have now are:

app.kubernetes.io/name: olmv1
app.kubernetes.io/part-of: olm
pod-security.kubernetes.io/enforce: restricted
pod-security.kubernetes.io/enforce-version: latest
pod-security.kubernetes.io/audit: restricted
pod-security.kubernetes.io/audit-version: latest
pod-security.kubernetes.io/warn: restricted
pod-security.kubernetes.io/warn-version: latest

# This is the chart version. This version number should be incremented each time you make changes
# to the chart and its templates, including the app version.
# Versions are expected to follow Semantic Versioning (https://semver.org/)
version: 0.1.0
Contributor

I think we should match the version with the release version and, for now, have a GitHub action that automatically opens a PR to update it when we do a release. In the future, we will need to bump it before release. Do we not plan to distribute the chart instead of the shell script?

Contributor Author

This is the chart version, which is supposed to be updated with every chart update; it is not necessarily supposed to match the version of the product. Then again, I've also seen cases where no one updates this...

But we are going to continue with a single YAML+shell script installation via goreleaser, and not distribute the chart, yet.

Contributor

@camilamacedo86 camilamacedo86 left a comment

Review – Aug 26

I reviewed the diff and everything looks fine overall. 🥇 The only changes that caught my attention were:

See: #2145 (comment)

IMO, the only things missing are:

  1. The Chart version should match the release version (we need to discuss how to keep this updated).
  2. Add a GitHub Action to run helm lint so that we can validate the Helm chart.
  3. Optionally: Could we also test out helm install in a GitHub Action for now, just to ensure the chart is deployable? This would be a good stopgap until we start distributing the solution via Helm and use it to run our e2e tests.

@tmshort
Contributor Author

tmshort commented Aug 26, 2025

Review – Aug 26

I reviewed the diff and everything looks fine overall. 🥇 The only changes that caught my attention were:

I commented on both of these. We moved to a "standard k8s" selector, but we can always put back the original if there are complaints.

See: #2145 (comment)

IMO, the only things missing are:

  1. The Chart version should match the release version (we need to discuss how to keep this updated).

The chart version is the version of the chart itself, and is supposed to be incremented whenever the chart is modified. It's not necessarily supposed to match the software release. The field that corresponds to the release (appVersion) is optional and has been removed, so we don't need to keep it in sync.

# This is the chart version. This version number should be incremented each time you make changes
# to the chart and its templates, including the app version.
# Versions are expected to follow Semantic Versioning (https://semver.org/)
version: 0.1.0

# This is the version number of the application being deployed. This version number should be
# incremented each time you make changes to the application. Versions are not expected to
# follow Semantic Versioning. They should reflect the version the application is using.
# It is recommended to use it with quotes.
appVersion: "1.0.0"

We don't actually use the field when rendering.

  2. Add a GitHub Action to run helm lint so that we can validate the Helm chart.

Not sure we need to add a new CI job; we can do it as part of make manifests (which already renders some other options, i.e. tilt and openshift), which is part of the verify CI.
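
If it were folded into the existing flow, the addition could be as small as one extra invocation alongside the render steps (binary path assumed to be the bingo-managed helm):

```bash
# Lint the chart with the same helm binary used for rendering.
./bin/helm lint helm/olmv1
```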

  3. Optionally: Could we also test out helm install in a GitHub Action for now, just to ensure the chart is deployable? This would be a good stopgap until we start distributing the solution via Helm and use it to run our e2e tests.

This is a bit out-of-scope now. We are only planning to template/render it. We do use the rendered manifests for e2e testing.
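
For reference, rendering the e2e variant might look like this (the values-file name comes from this PR; the output path is an assumption):

```bash
# Render the chart with the e2e values overlay; the rendered output feeds the e2e suite.
./bin/helm template olmv1 helm/olmv1 --values helm/e2e.yaml > manifests/e2e.yaml
```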

Signed-off-by: Todd Short <tshort@redhat.com>
Contributor

@camilamacedo86 camilamacedo86 left a comment

Great job 🚀

LGTM

I’ll hold for now, but feel free to unhold whenever you think it’s appropriate.
I believe it’s better to wait until the cut branch is created before merging this one,
so we avoid impacting any changes that should go into version 4.20.

/hold

(until we see that it is fine to get merged)

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 26, 2025
@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Aug 26, 2025

openshift-ci bot commented Aug 26, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: camilamacedo86
Once this PR has been reviewed and has the lgtm label, please assign tmshort for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment
