Failure to determine correct level bump with custom parser in 9.17.0 #1162

pwrmiller · 2025-01-28T18:53:42Z

Bug Report

Description

Different / breaking behavior between 9.16.1 and 9.17.0 in algorithm.py Step 5. Parse commits to determine the bump level that should be applied.

Expected behavior

I would have expected the same behavior between 9.16.1 and 9.17.0 as minor version changes should not exhibit breaking behavior. This seems to be a regression.

Actual behavior

Specifically, in 9.16.1, parsed_levels will contain the results of the parsed levels in the commits_since_last_release, but 9.17.0 will not. It is not entirely clear to me where this is failing due to the complexity of the mapped/filtered set.

Environment

Operating System (w/ version): MacOS
Python version: 3.12
Pip version: 25.0 (using Poetry)
Semantic-release version: 9.17.0 has the regression
Build tool (w/ version): poetry 1.8.3

poetry show python-semantic-release

 name         : python-semantic-release                           
 version      : 9.16.1                                            
 description  : Automatic Semantic Versioning for Python projects 

dependencies
 - click >=8.0,<9.0
 - click-option-group >=0.5,<1.0
 - dotty-dict >=1.3,<2.0
 - gitpython >=3.0,<4.0
 - importlib-resources >=6.0,<7.0
 - jinja2 >=3.1,<4.0
 - pydantic >=2.0,<3.0
 - python-gitlab >=4.0,<5.0
 - requests >=2.25,<3.0
 - rich >=13.0,<14.0
 - shellingham >=1.5,<2.0
 - tomlkit >=0.11,<1.0

git log --oneline --decorate --graph --all -n 50

"""
*   5947a27 (origin/main, origin/HEAD) Merged PR 34224: some description
|\
| * 484adac (HEAD -> fix/950777-something) fix(XYZ-950777): some description
|/
*   6a320ac (main) Merged PR 34214: feat(XYZ-950230): some description
|\
| * fdc2c4b (feat/950230-ruff-version-for-check) feat(XYZ-950230): some description
|/
* b3d8aab (tag: v3.17.0) chore(release): release v3.17.0 [skip ci]
*   3ba9e71 Merged PR 33609: feat(XYZ-938710): some description
|\
| * eed8b07 feat(XYZ-938710): some description
|/
* 1886245 (tag: v3.16.1, fix/947167-something) chore(release): release v3.16.1 [skip ci]
*   a4ca462 Merged PR 33438: fix(XYZ-945996): some description
|\
| * d3096b5 fix(XYZ-945996): some description
| * 89e7a32 fix(XYZ-945996): some description
| * 241cee5 fix(XYZ-945996): some description
|/
*   eb53d60 Merged PR 32668: docs(XYZ-936992): some description
|\
| * ed1823c docs(XYZ-936992): some description
|/
* 1f4aeac (tag: v3.16.0) chore(release): release v3.16.0 [skip ci]
* e944328 Merged PR 32255: feat(XYZ-877699): some description
* b8f74f1 (tag: v3.15.0) chore(release): release v3.15.0 [skip ci]
*   b16df6e Merged PR 32425: some description
|\
| * dcbcb90 feat(XYZ-929489): some description
| | * 22ecbd7 (origin/feat/877699-something) feat(XYZ-877699): some description
| | * 19a171e feat(XYZ-877699): some description
| | * cc1a1ed feat(XYZ-877699): some description
| | * 2d2b874 feat(XYZ-877699): some description
| | * b734132 feat(XYZ-877699): some description
| | * fbcbab3 feat(XYZ-877699): some description
| | * 6c1b9f1 feat(XYZ-877699): some description
| |/
|/|
* | 44e166e (tag: v3.14.1) chore(release): release v3.14.1 [skip ci]
* |   d3d9c78 Merged PR 32406: fix(XYZ-929227): some description
|\ \
| * | fb90a03 fix(XYZ-929227): some description
|/ /
* | 60427a2 (tag: v3.14.0) chore(release): release v3.14.0 [skip ci]
* |   c9e0ac6 Merged PR 32349: feat: some description
|\ \
| |/
|/|
| * 921073b feat: update something
|/
* eab864b (tag: v3.13.1) chore(release): release v3.13.1 [skip ci]
*   4ab8ff1 Merged PR 32022: increasing minimum aws-cdk-lib version
|\
| * 4552a43 fix(933941): some description
* | 5fc2b5c (tag: v3.13.0) chore(release): release v3.13.0 [skip ci]
* |   651895b Merged PR 31947: feat(XYZ-933508): some description
|\ \
| * | 6a67df4 feat(XYZ-933508): some description
|/ /
| | * 028a9dd (origin/feat/933508-merge-coverage) Initial commit
| | * 99a7bcb feat(XYZ-933508): some description
| |/
|/|
| | * 46f6fb3 (origin/feat/877699-something, feat/877699-something) feat(XYZ-877699): some description
| |/
|/|
* | ad8093f (tag: v3.12.0) chore(release): release v3.12.0 [skip ci]
* |   8dd80e3 Merged PR 31794: feat(XYZ-932586): some description
|\ \
| * | 2ad1331 (feature/932586-something) feat(XYZ-932586): some description
* | | e456008 (tag: v3.11.0) chore(release): release v3.11.0 [skip ci]
* | |   f04be2d Merged PR 31765: XYZ-930364: some description
|\ \ \
| * | | 17a541e (origin/feature/XYZ-930364-something) feat(XYZ-930364): some description
|/ / /
| | | *   f103abc (refs/stash) WIP on feature/XYZ-930364-something: 2bc9a7e some description
| | | |\
| | | | * a4d443c index on feature/XYZ-930364-something: 2bc9a7e some description
| | | |/
| | | * 2bc9a7e (feature/XYZ-930364-something) some description

Configuration

Semantic Release Configuration

I use a custom commit parser to support working with Azure Devops.

import re
from typing import Any, Optional

import git
import toml
from semantic_release import (
    CommitParser,
    LevelBump,
    ParsedCommit,
    ParseError,
    ParseResult,
    ParserOptions,
)


class MyCommitParserOptions(ParserOptions):
    def __init__(self, **_: Any):
        # Load whole configuration
        super().__init__(**_)
        pyproject = toml.load("pyproject.toml")

        # Extract commit parsers options
        commit_parser_options = pyproject["tool"]["semantic_release"][
            "commit_parser_options"
        ]

        # Load options as instance variables
        self.allowed_tags = commit_parser_options["allowed_tags"]
        self.minor_tags = commit_parser_options["minor_tags"]
        self.patch_tags = commit_parser_options["patch_tags"]
        self.default_bump_level = LevelBump(commit_parser_options["default_bump_level"])


class MyCommitParser(CommitParser[ParseResult, MyCommitParserOptions]):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.options = ADOCommitParserOptions()

    def parse(self, commit: git.objects.commit.Commit) -> Optional[ParseResult]:
        sentences = commit.message.strip().split("\n\n")
        unique_sentences = list(dict.fromkeys(sentences))
        message = "\n\n".join(unique_sentences)
        allowed_tags_regex_part = "|".join(map(re.escape, self.options.allowed_tags))
        match = re.match(
            rf"^.*({allowed_tags_regex_part})(?:\(([^)]+)\))?: (.+)(?:\n|$)([\s\S]*)",
            message,
        )

        if match:
            type_ = match.group(1)
            scope = match.group(2) or ""
            description = match.group(3)
            rest_of_message = match.group(4)

            descriptions = [description]
            breaking_descriptions = []

            for line in rest_of_message.split("\n"):
                line = line.strip()
                if line.startswith("BREAKING CHANGE: "):
                    breaking_descriptions.append(line.replace("BREAKING CHANGE: ", ""))
                elif line != "":
                    descriptions.append(line)

            bump = self.options.default_bump_level
            if breaking_descriptions:
                bump = LevelBump.MAJOR
            elif type_ in self.options.minor_tags:
                bump = LevelBump.MINOR
            elif type_ in self.options.patch_tags:
                bump = LevelBump.PATCH

            return ParsedCommit(
                bump=bump,
                type=type_,
                scope=scope,
                descriptions=descriptions,
                breaking_descriptions=breaking_descriptions,
                commit=commit,
            )

        else:
            return ParseError(commit=commit, error="Could not parse commit message")

[tool.semantic_release]
assets = []
build_command_env = []
commit_message = "chore(release): release v{version} [skip ci]"
commit_parser = "commit_parser:MyCommitParser"
logging_use_named_masks = false
major_on_zero = true
allow_zero_version = true
no_git_verify = false
tag_format = "v{version}"
version_toml = [
    "pyproject.toml:tool.poetry.version",
]
version_variables = [
    "calabrio_cdk/_version.py:__version__"
]

[tool.semantic_release.branches.main]
match = "(main|master)"
prerelease_token = "rc"
prerelease = false

[tool.semantic_release.changelog]
template_dir = "templates"
changelog_file = "CHANGELOG.md"
exclude_commit_patterns = [
  '''chore(?:\([^)]*?\))?: .+''',
  '''ci(?:\([^)]*?\))?: .+''',
  '''refactor(?:\([^)]*?\))?: .+''',
  '''style(?:\([^)]*?\))?: .+''',
  '''test(?:\([^)]*?\))?: .+''',
  '''build\((?!deps\): .+)''',
  '''Merged? .*''',
]

[tool.semantic_release.changelog.environment]
block_start_string = "{%"
block_end_string = "%}"
variable_start_string = "{{"
variable_end_string = "}}"
comment_start_string = "{#"
comment_end_string = "#}"
trim_blocks = false
lstrip_blocks = false
newline_sequence = "\n"
keep_trailing_newline = false
extensions = []
autoescape = true

[tool.semantic_release.commit_author]
env = "GIT_COMMIT_AUTHOR"
default = "semantic-release <semantic-release>"

[tool.semantic_release.commit_parser_options]
allowed_tags = ["build", "chore", "ci", "docs", "feat", "fix", "perf", "style", "refactor", "test"]
minor_tags = ["feat"]
patch_tags = ["fix", "perf"]
default_bump_level = 0

[tool.semantic_release.remote]
name = "origin"
type = "github"
domain = "dev.azure.com"
api_domain = "dev.azure.com"
ignore_token_for_push = true
insecure = false
url = { env = "AZURE_REPOSITORY_URL" }

[tool.semantic_release.remote.token]
env = "AZURE_TOKEN"

[tool.semantic_release.publish]
dist_glob_patterns = ["dist/*"]
upload_to_vcs_release = false

Build System Configuration

GitHub Actions Job Definition

Execution Log

poetry run semantic-release -vv version --no-vcs-release

9.17.1

           DEBUG    [cmd.execute] Popen(['git', 'rev-list', 'v3.17.0', '--'], cwd=/Users/me/project, stdin=None, shell=False, universal_newlines=False)                                            cmd.py:1253
           INFO     [algorithm._traverse_graph_for_commits] Found 3 commits since the last release!                                                                                                                algorithm.py:112
           DEBUG    [algorithm.next_version] parsed the following distinct levels from the commits since the last release: set()

9.16.1

           DEBUG    [cmd.execute] Popen(['git', 'rev-list', 'v3.17.0', '--'], cwd=/Users/me/project, stdin=None, shell=False, universal_newlines=False)                                            cmd.py:1253
           DEBUG    [algorithm.dfs] queuing parent commit 6a320ac                                                                                                                                                   algorithm.py:92
           DEBUG    [algorithm.dfs] queuing parent commit b3d8aab                                                                                                                                                   algorithm.py:92
           DEBUG    [algorithm.dfs] queuing parent commit fdc2c4b                                                                                                                                                   algorithm.py:92
           DEBUG    [algorithm.dfs] queuing parent commit b3d8aab                                                                                                                                                   algorithm.py:92
           INFO     [algorithm._traverse_graph_for_commits] Found 3 commits since the last release!                                                                                                                algorithm.py:112
           DEBUG    [algorithm.next_version] parsed the following distinct levels from the commits since the last release: {<LevelBump.PATCH: 2>, <LevelBump.MINOR: 3>}

Additional context

Debugging suggests that perhaps the result of the filter on isinstance for ParsedCommit might be failing, since at this point the parsed_result is already a LevelBump. Though this is not entirely clear due to the complexity of step 5.

The commit parser returns the correct ParsedCommit(s) in both versions of PSR, however, the filtered list in parsed_levels differs between versions.

The text was updated successfully, but these errors were encountered:

codejedi365 · 2025-01-28T19:15:54Z

@pwrmiller, thanks for pointing me to a specific place in the code that you think there is a change, however, this is not enough information for me to understand what is happening and how to replicate.

Please fill in your configuration in the boxes above and provide your commit messages in a git tree so that I can replicate the problem because all of our extensive test cases did not catch/replicate your environment.

Obviously, it was not intended to be a breaking change otherwise I would of bumped it otherwise but people make mistakes.

pwrmiller · 2025-01-28T19:50:36Z

@codejedi365 I've provided additional context on my configuration. My apologies for not providing sufficient context for your review.

codejedi365 · 2025-01-29T03:41:22Z

@pwrmiller, I did find a bug in the code. Basically I was attempting to maintain the legacy return value of the parse() method which is a single ParseResult value--now with the feature to parse squash commits, this has changed to also allow a list[ParseResult] objects to be returned. To prevent a breaking change and maintain compatibility, I attempted to detect and cast all return values to a list with isinstance(result, Iterable). This ends up not working correctly because at the heart of it a ParseCommit extends the NamedTuple class which is iterable itself (which I overlooked as it's my first real time using NamedTuples). The custom parser with the single return value was the different scenario that was not covered in testing with the last release.

I've came up with an initial fix to apply more stricter type checking from return values of parsers, followed with a better type cast.

codejedi365 · 2025-01-29T04:26:38Z

Separately, I wanted to offer a few suggestions to your custom parser implementation.

I don't know if you happened to see my note on issue Commit parsers to handle merge commit prefixes #540 related to Asure DevOps, but it is my general recommendation to automatically ignore merge commits as they generally create duplicative information in changelogs. It looks like one of the features is that this parser removes the merge commit prefix but you have some other stuff happening. I recommend re-evaluating because the downside with a custom parser is you will miss out on a lot of the new updates and implementations I have been releasing over time.
Here is a few changes that I would make to your implementation. First off, PSR will automatically load your configuration from pyproject.toml and pass it into your options object (if it is defined in the parser class, see below) so there is no need to load that part yourself. I don't see any unique options defined so I just had it extend AngularParserOptions as that will inherit all of your expected values. If you want to hard code them instead then I would just copy the fields to define the dataclass rather than using the init function. The base CommitParser class will automatically load the options so all you need on the parser init is to call the super class.

+ from semantic_release.commit_parser.angular import AngularParserOptions
+ from semantic_release.helpers import force_str

+ class MyCommitParserOptions(AngularParserOptions):
- class MyCommitParserOptions(ParserOptions):
-      def __init__(self, **_: Any):
-          # Load whole configuration
-          super().__init__(**_)
-          pyproject = toml.load("pyproject.toml")
-
-          # Extract commit parsers options
-          commit_parser_options = pyproject["tool"]["semantic_release"][
-              "commit_parser_options"
-          ]
-
-          # Load options as instance variables
-          self.allowed_tags = commit_parser_options["allowed_tags"]
-          self.minor_tags = commit_parser_options["minor_tags"]
-          self.patch_tags = commit_parser_options["patch_tags"]
-          self.default_bump_level = LevelBump(commit_parser_options["default_bump_level"])


  class MyCommitParser(CommitParser[ParseResult, MyCommitParserOptions]):
+
+    parser_options = ADOCommitParserOptions   # removed in v10
+
-     def __init__(self, **kwargs):
-         super().__init__(**kwargs)
-         self.options = MyCommitParserOptions()
+     def __init__(self, options: MyCommitParserOptions | None = None) -> None:
+         super().__init__(options)
+
+     @staticmethod
+     def get_default_options() -> MyCommitParserOptions:
+         return MyCommitParserOptions()

      def parse(self, commit: git.objects.commit.Commit) -> Optional[ParseResult]:
-         sentences = commit.message.strip().split("\n\n")
+         sentences = force_str(commit.message).strip().split("\n\n")
          unique_sentences = list(dict.fromkeys(sentences))
          message = "\n\n".join(unique_sentences)
          allowed_tags_regex_part = "|".join(map(re.escape, self.options.allowed_tags))
          match = re.match(
              rf"^.*({allowed_tags_regex_part})(?:\(([^)]+)\))?: (.+)(?:\n|$)([\s\S]*)",
              message,
          )

          if match:
              type_ = match.group(1)
              scope = match.group(2) or ""
              description = match.group(3)
              rest_of_message = match.group(4)

              descriptions = [description]
              breaking_descriptions = []

              for line in rest_of_message.split("\n"):
                  line = line.strip()
                  if line.startswith("BREAKING CHANGE: "):
                      breaking_descriptions.append(line.replace("BREAKING CHANGE: ", ""))
                  elif line != "":
                      descriptions.append(line)

              bump = self.options.default_bump_level
              if breaking_descriptions:
                  bump = LevelBump.MAJOR
              elif type_ in self.options.minor_tags:
                  bump = LevelBump.MINOR
              elif type_ in self.options.patch_tags:
                  bump = LevelBump.PATCH

              return ParsedCommit(
                  bump=bump,
                  type=type_,
                  scope=scope,
                  descriptions=descriptions,
                  breaking_descriptions=breaking_descriptions,
                  commit=commit,
              )

          else:
              return ParseError(commit=commit, error="Could not parse commit message")

pwrmiller · 2025-01-29T14:11:36Z

@codejedi365 thank you so much for the detailed review and recommendations on my commit parser. I'll take a look on my end into avoiding the custom CommitParser as recommended. We were doing this to filter out certain commits that weren't matching any pattern, if I recall, to avoid version bumps that didn't make sense for a project that moved towards semantic versioning later in its life.

I'll be revisiting that re.match to see if it's doing anything special related to our earlier commits, but I don't see anything obvious other than perhaps dealing with formatting for multi-line commit messages (those with a more verbose description) since we wanted to see those in our CHANGELOG.md for completeness (these are auto-generated by your library after a certain release number, then pushed up to a static site with mkdocs).

codejedi365 · 2025-01-30T01:22:47Z

We were doing this to filter out certain commits that weren't matching any pattern, if I recall, to avoid version bumps that didn't make sense for a project that moved towards semantic versioning later in its life.

I would check out exclude_commit_patterns as well when it comes to changelog content if you use the init mode for changelog generation, but you likely will not need to consider older messages if you migrate to the update mode.

I'll be revisiting that re.match to see if it's doing anything special related to our earlier commits, but I don't see anything obvious other than perhaps dealing with formatting for multi-line commit messages (those with a more verbose description) since we wanted to see those in our CHANGELOG.md for completeness (these are auto-generated by your library after a certain release number, then pushed up to a static site with mkdocs).

If it is a more verbose description, then you may want to look at issue #1132 which discusses how to customize the changelog output. We now support parsing squash commits if that is what you were describing as a more verbose description. The default changelog currently prints out the entire commit body but in v10 it will only display the subject/summary line.

resolves the error where the casting to a list was incorrectly implemented causing single results back from custom parsers (legacy return value) to be mis-interpreted which meant no parse level would be determined. Resolves: python-semantic-release#1162

codejedi365 · 2025-02-06T04:52:10Z

🎉 This issue has been resolved in Version 9.18.0 🎉

You can find more information about this release on the GitHub Releases page.

pwrmiller added bug Something isn't working properly triage waiting for initial maintainer review labels Jan 28, 2025

codejedi365 added awaiting-reply Waiting for response and removed triage waiting for initial maintainer review labels Jan 28, 2025

codejedi365 changed the title ~~Regression between 9.17.0 and 9.16.1 in commit parsing to determine level bump (Step 5. in algorithm.py)~~ Failure to determine correct level bump with custom parser in 9.17.0 Jan 29, 2025

codejedi365 closed this as completed in cf340c5 Jan 31, 2025

codejedi365 mentioned this issue Jan 31, 2025

fix: refactor parsing compatibility function to support older custom parsers #1165

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Failure to determine correct level bump with custom parser in 9.17.0 #1162

Failure to determine correct level bump with custom parser in 9.17.0 #1162

pwrmiller commented Jan 28, 2025 •

edited

Loading

codejedi365 commented Jan 28, 2025

pwrmiller commented Jan 28, 2025

codejedi365 commented Jan 29, 2025

codejedi365 commented Jan 29, 2025 •

edited

Loading

pwrmiller commented Jan 29, 2025

codejedi365 commented Jan 30, 2025 •

edited

Loading

codejedi365 commented Feb 6, 2025

Failure to determine correct level bump with custom parser in 9.17.0 #1162

Failure to determine correct level bump with custom parser in 9.17.0 #1162

Comments

pwrmiller commented Jan 28, 2025 • edited Loading

Bug Report

Description

Expected behavior

Actual behavior

Environment

Configuration

Execution Log

Additional context

codejedi365 commented Jan 28, 2025

pwrmiller commented Jan 28, 2025

codejedi365 commented Jan 29, 2025

codejedi365 commented Jan 29, 2025 • edited Loading

pwrmiller commented Jan 29, 2025

codejedi365 commented Jan 30, 2025 • edited Loading

codejedi365 commented Feb 6, 2025

🎉 This issue has been resolved in Version 9.18.0 🎉

pwrmiller commented Jan 28, 2025 •

edited

Loading

codejedi365 commented Jan 29, 2025 •

edited

Loading

codejedi365 commented Jan 30, 2025 •

edited

Loading