validate fail for dupe skips+replaces channel entries #1750

Open
wants to merge 1 commit into base: master

Conversation

grokspawn
Contributor

Description of the change:
opm validate now fails when an entry in a channel has a skips edge that matches that entry's replaces edge.

Motivation for the change:
Due to OLMv0 graph mechanics, a skips edge causes OLMv0 to ignore the skipped bundle version when considering upgrades (v0 discards the graph contribution of skipped bundle versions).
Since the purpose of a replaces edge is to enable upgrade mobility across the graph, allowing the replaced bundle version to be ignored (because of the skips entry) is an error and can result in stranding.

For example, take input olm.channel:

{
    "schema": "olm.channel",
    "name": "stable-v1",
    "package": "test-operator",
    "entries": [
        {
            "name": "test-operator-v1.0.0"
        },
        {
            "name": "test-operator-v1.1.0"
        },
        {
            "name": "test-operator-v1.1.2"
        },
        {
            "name": "test-operator-v1.1.4",
            "replaces": "test-operator-v1.0.0",
            "skips": [
                "test-operator-v1.0.0",
                "test-operator-v1.1.0",
                "test-operator-v1.1.2"
            ]
        },
        {
            "name": "test-operator-v1.2.0"
        },
        {
            "name": "test-operator-v1.2.1",
            "replaces": "test-operator-v1.1.4",
            "skips": [
                "test-operator-v1.1.4",
                "test-operator-v1.2.0"
            ]
        },
        {
            "name": "test-operator-v1.3.0",
            "replaces": "test-operator-v1.2.1",
            "skips": [
                "test-operator-v1.2.1"
            ]
        },
        {
            "name": "test-operator-v1.4.0",
            "replaces": "test-operator-v1.3.0",
            "skips": [
                "test-operator-v1.3.0"
            ]
        }
    ]
}

Using a new version of opm that can optionally display OLMv0 graph semantics, you can see that the edges with duplicate replaces/skips are ignored in the graph (skipped objects are outlined in red, and ignored edges are drawn as red dashed arrows).
[Image: mermaid-diagram-2025-08-19-102936]

Reviewer Checklist

  • Implementation matches the proposed design, or proposal is updated to match implementation
  • Sufficient unit test coverage
  • Sufficient end-to-end test coverage
  • Docs updated or added to /docs
  • Commit messages sensible and descriptive

Signed-off-by: grokspawn <jordan@nimblewidget.com>

codecov bot commented Aug 19, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
⚠️ Please upload report for BASE (master@b5c503a). Learn more about missing BASE report.
⚠️ Report is 15 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff            @@
##             master    #1750   +/-   ##
=========================================
  Coverage          ?   55.17%           
=========================================
  Files             ?      136           
  Lines             ?    15923           
  Branches          ?        0           
=========================================
  Hits              ?     8785           
  Misses            ?     5985           
  Partials          ?     1153           

@grokspawn
Contributor Author

/approve

Contributor

openshift-ci bot commented Aug 20, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: grokspawn

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 20, 2025
		if slices.Contains(entry.Skips, entry.Replaces) {
			return nil, fmt.Errorf("invalid package %q, channel %q: entry %q has identical replaces and skips: %q", c.Package, c.Name, entry.Name, entry.Replaces)
		}
	}
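
For readers without the surrounding file open, the check in the snippet above can be tried in isolation. This is a minimal sketch: `Channel`, `ChannelEntry`, and `checkReplacesNotSkipped` are hypothetical stand-ins that keep only the fields the snippet reads, and the guard for an empty `Replaces` is an added assumption that is not part of the reviewed lines.

```go
package main

import (
	"fmt"
	"slices"
)

// ChannelEntry and Channel are hypothetical, trimmed-down stand-ins for the
// types the reviewed snippet operates on; only the fields it reads are kept.
type ChannelEntry struct {
	Name     string
	Replaces string
	Skips    []string
}

type Channel struct {
	Package string
	Name    string
	Entries []ChannelEntry
}

// checkReplacesNotSkipped returns an error for the first entry whose skips
// list also contains its replaces target.
func checkReplacesNotSkipped(c Channel) error {
	for _, entry := range c.Entries {
		// Assumption added for this sketch: with no replaces edge there is
		// nothing to compare, so the entry is accepted.
		if entry.Replaces == "" {
			continue
		}
		if slices.Contains(entry.Skips, entry.Replaces) {
			return fmt.Errorf("invalid package %q, channel %q: entry %q has identical replaces and skips: %q",
				c.Package, c.Name, entry.Name, entry.Replaces)
		}
	}
	return nil
}

func main() {
	// A two-entry excerpt of the channel from the PR description.
	ch := Channel{
		Package: "test-operator",
		Name:    "stable-v1",
		Entries: []ChannelEntry{
			{Name: "test-operator-v1.0.0"},
			{
				Name:     "test-operator-v1.1.4",
				Replaces: "test-operator-v1.0.0",
				Skips:    []string{"test-operator-v1.0.0"},
			},
		},
	}
	if err := checkReplacesNotSkipped(ch); err != nil {
		fmt.Println(err)
	}
}
```

Returning on the first offending entry mirrors the snippet; collecting every offender instead would be an equally reasonable choice for a report-style tool.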
Contributor

@camilamacedo86 camilamacedo86 Aug 20, 2025

👍 Makes sense to me. My only concern is:
Did we check how many existing cases fail in this scenario?
We might need to create a script to validate this. What do we do if we have FBC catalogs with this problem?

But maybe that needs to be looked at outside of this PR.

/lgtm

Contributor Author

We do see one instance in the operatorhubio catalog:

./operatorhubio/latest
FATA[0002] invalid package "grafana-operator", channel "v5": entry "grafana-operator.v5.10.0" has identical replaces and skips: "grafana-operator.v5.9.2"

Let's
/hold
this until we can talk to some impacted folks and determine whether this is a big enough problem that we have to solve it now.
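
The kind of catalog survey discussed in this thread could be scripted without opm. The following is only a sketch under stated assumptions: it reads `.json` files only (YAML catalogs are not handled), treats each file as a stream of concatenated JSON objects, and uses a hypothetical `blob` type and `findOverlaps` helper that are not taken from the opm code base.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"io/fs"
	"os"
	"path/filepath"
	"slices"
)

// blob holds only the fields this sketch reads from a catalog object.
type blob struct {
	Schema  string `json:"schema"`
	Name    string `json:"name"`
	Package string `json:"package"`
	Entries []struct {
		Name     string   `json:"name"`
		Replaces string   `json:"replaces"`
		Skips    []string `json:"skips"`
	} `json:"entries"`
}

// findOverlaps decodes a stream of concatenated JSON objects and reports
// every olm.channel entry whose skips list contains its replaces target.
func findOverlaps(data []byte) ([]string, error) {
	var found []string
	dec := json.NewDecoder(bytes.NewReader(data))
	for {
		var b blob
		err := dec.Decode(&b)
		if errors.Is(err, io.EOF) {
			return found, nil
		}
		if err != nil {
			return found, err
		}
		if b.Schema != "olm.channel" {
			continue
		}
		for _, e := range b.Entries {
			if e.Replaces != "" && slices.Contains(e.Skips, e.Replaces) {
				found = append(found, fmt.Sprintf("package %q, channel %q, entry %q: %q",
					b.Package, b.Name, e.Name, e.Replaces))
			}
		}
	}
}

func main() {
	// The catalog root is taken from the first argument, defaulting to ".".
	root := "."
	if len(os.Args) > 1 {
		root = os.Args[1]
	}
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if d.IsDir() || filepath.Ext(path) != ".json" {
			return nil
		}
		data, err := os.ReadFile(path)
		if err != nil {
			return err
		}
		hits, err := findOverlaps(data)
		if err != nil {
			// Keep walking; one undecodable file should not end the survey.
			fmt.Fprintf(os.Stderr, "%s: decode error: %v\n", path, err)
		}
		for _, h := range hits {
			fmt.Printf("%s: %s\n", path, h)
		}
		return nil
	})
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

Unlike the validation in this PR, the sketch reports every offending entry rather than stopping at the first, which suits a one-off impact survey.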

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 20, 2025
@joelanford
Member

This validation check seems to be very narrowly tailored to "can't both skip and replace the same thing in one entry", which is good!

However, I think it very slightly misses the point and the broader problem.

  1. It is actually okay to both skip and replace a bundle that is already a leaf node in the graph.
  2. When a node is skip-ed and causes other entries to no longer have a path to the channel head, that is the real problem that we need to check for.
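
The broader check described in the second point might look roughly like the sketch below. It rests on an assumed, simplified model rather than on OLMv0's actual resolver: an entry offers an upgrade edge from each bundle it replaces or skips to itself, every edge leading to a skipped bundle is discarded, and a channel head is any entry that nothing replaces or skips. `Entry` and `stranded` are hypothetical names.

```go
package main

import "fmt"

// Entry is a hypothetical, minimal stand-in for an olm.channel entry.
type Entry struct {
	Name     string
	Replaces string
	Skips    []string
}

// stranded returns, in input order, the entries that cannot reach a channel
// head under the simplified model described in the lead-in.
func stranded(entries []Entry) []string {
	skipped := map[string]bool{}
	replacedOrSkipped := map[string]bool{}
	byName := map[string]Entry{}
	for _, e := range entries {
		byName[e.Name] = e
		if e.Replaces != "" {
			replacedOrSkipped[e.Replaces] = true
		}
		for _, s := range e.Skips {
			skipped[s] = true
			replacedOrSkipped[s] = true
		}
	}

	// Walk backwards from every head along the surviving upgrade edges.
	reached := map[string]bool{}
	var queue []string
	for _, e := range entries {
		if !replacedOrSkipped[e.Name] {
			reached[e.Name] = true
			queue = append(queue, e.Name)
		}
	}
	for len(queue) > 0 {
		name := queue[0]
		queue = queue[1:]
		if skipped[name] {
			// Edges into a skipped bundle are discarded in this model, so
			// nothing upstream can be reached through it.
			continue
		}
		e := byName[name]
		sources := append([]string{}, e.Skips...)
		if e.Replaces != "" {
			sources = append(sources, e.Replaces)
		}
		for _, src := range sources {
			if !reached[src] {
				reached[src] = true
				queue = append(queue, src)
			}
		}
	}

	var out []string
	for _, e := range entries {
		if !reached[e.Name] {
			out = append(out, e.Name)
		}
	}
	return out
}

func main() {
	// Leaf case from point 1: "a" has no incoming edges, so skipping and
	// replacing it strands nothing.
	leaf := []Entry{
		{Name: "a"},
		{Name: "b", Replaces: "a", Skips: []string{"a"}},
	}
	fmt.Println(len(stranded(leaf)))

	// Case from point 2: "b" is skipped by "c", so the edge a -> b is
	// discarded and "a" has no path to the head "c".
	chain := []Entry{
		{Name: "a"},
		{Name: "b", Replaces: "a"},
		{Name: "c", Replaces: "b", Skips: []string{"b"}},
	}
	fmt.Println(stranded(chain))
}
```

Under this model the first case reports no stranded entries while the second reports `a`, which is the distinction the two points draw; whether the model matches OLMv0 closely enough for validation would need to be confirmed against the real resolver.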

@grokspawn
Contributor Author

> This validation check seems to be very narrowly tailored to "can't both skip and replace the same thing in one entry", which is good!
>
> However, I think it very slightly misses the point and the broader problem.
>
> 1. It is actually okay to both `skip` and `replace` a bundle that is already a leaf node in the graph.

This is totally fine in any OLMv1 context, but I'd argue that because it comes with migration side effects for OLMv0, it's never OK. In general, we should not have these kinds of surprises, and I think it's reasonable to enforce the most restrictive case here (because it's easier to become more permissive later than more restrictive).

> 2. When a node is `skip`-ed and causes other entries to no longer have a path to the channel head, _that_ is the real problem that we need to check for.

That's a specific flavor of this more general issue. But I'd argue that it is also resolved by preventing the more general issue.

Labels
approved: Indicates a PR has been approved by an approver from all required OWNERS files.
do-not-merge/hold: Indicates that a PR should not merge because someone has issued a /hold command.
3 participants