Skip to content

(Nearly) complete dot_parser coverage #482

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 13 commits into from
May 18, 2025
Merged

Conversation

ferdnyc
Copy link
Member

@ferdnyc ferdnyc commented May 16, 2025

When looking into #479, I had the realization that if we had complete coverage in the parser code, we'd have caught that the parser was never applying the 'strict' keyword to graphs. That line had no test coverage (even though it should've), and even if we'd attempted to directly add coverage for that line, we'd never have succeeded in reaching it. Which would've been a red flag that there was something wrong with the code.

So, this PR updates the remainder of the parser code to not only make a lot of the parsing cleaner (thanks to results names and a new helper function), but eliminate tons of dead code that was never being called (and was often completely unreachable).

Direct testing for the __repr__s on P_AttrList and DefaultStatement is also added, to ensure coverage of that code.

The outcome is coverage for dot_parser.py jumping up to 93%, and except for the 5 lines of exception handling in parse_dot_data(), all of the uncovered lines are in push_top_graph_stmt (so they should be handled by #480). Once both of these PRs are merged we should have total coverage of all the parsing code, meaning we can be fairly certain that all of its features are operating properly.

Along the way I also made a few other adjustments:

  • Renamed the attr_stmt parser element to default_stmt like it always should've been.
  • Eliminated port parsing entirely. See the "improve node parsing" for a full explanation of that one
  • Got rid of all the misguided defaults handling in add_elements, again see the "Streamline add_elements" commit message for details

ferdnyc added 11 commits May 15, 2025 18:09
A helper function for `ParserElement`s that can accept `attr_list`
as part of their grammar. Assigning a results_name to the `attr_list`
component and passing it to this function will return either a
combined dict of all the attributes in all parsed attr_lists, or
an empty dict if none were found. This can then be included in
`pydot.core` class constructor calls via `**attrs`.
Rename `attr_stmt` to `default_stmt`, apply results names, and
rewrite `push_default_stmt` to make use of `expand_attr_lists()`
Again, add results names to grammar and use `expand_attr_lists()`
Slightly more complex than the previous parsing improvements, due
to the complexity present in `push_edge_stmt()`. It's now far _less_
complex, and drops a bunch of ugly hackery. Still, the meat of the
change is the same: add results names, use `expand_attr_lists()`.
This one's kind of a big deal. Yes, add results names and make use of
`apply_attr_lists()`. But beyond that, all parsing of `port` values
is eliminated — the `node_id` is now defined as a `DelimitedList`,
consisting of between 1 and 3 `ID` strings separated by colons. The
`DelimitedList` parser element will combine them into a single
string. We were always passing the node name strings in to the
`pydot.core.Node` constructor as a single string, anyway -- the
logic to parse out port values lives in _that_ code. So there's
no reason to duplicate it in the parser, other than to validate
that the string parsed has a _plausible_ format.

As a result, `do_node_ports()` is dropped as it's no longer ever
called.
The function contained a ton of dead code, being far too complex
for its only REAL purpose: To ensure that subgraphs on complex
edges have the correct parent graph hierarchy.

(The obvious question here would be: "Wait, aren't those subgraphs
stored as `FrozenDict`s? How can they be "updated"? Well, it turns
out each `FrozenDict` representing a subgraph has a `parent_graph`
key in its dictionary that points to the _actual_ `Subgraph` object
it was created from (since the `Subgraph` had itself as its own
parent, when it was first initialized and then frozen).

Those `Subgraph` objects don't get garbage collected, because
they're still referenced by the `FrozenDict` they were created
from. That's probably a bad thing, and maybe we should focus on
_deleting_ them instead of reparenting them, but for now this
is the way the code's been doing it.
Rewrite this crucial function to remove a _ton_ of dead code,
including all of the defaults-handling that never actually did
anything (and _shouldn't_ do anything, as it was based on a
misguided premise: That defaults should be copied directly into
the attributes of any individual graph elements they apply to.
We don't want that!)

As a result, also drop `add_defaults()` which is no longer used.
Copy link

github-actions bot commented May 16, 2025

Coverage report

Click to see where and how coverage changed

FileStatementsMissingCoverageCoverage
(new stmts)
Lines missing
  src/pydot
  core.py
  dot_parser.py
  test
  test_parser.py
Project Total  

This report was generated by python-coverage-comment-action

@ferdnyc
Copy link
Member Author

ferdnyc commented May 16, 2025

Oh, yes, see also the "Clean up update_parent_graph_hierarchy" commit message for some details on why we may want to look deeper into that function and the reasons it exists at all.

Copy link

This pull request has conflicts, please resolve those so that the changes can be evaluated.

Copy link

All conflicts have been resolved, thanks!

@lkk7
Copy link
Member

lkk7 commented May 18, 2025

Ruff will always catch by surprise.

@lkk7 lkk7 merged commit 8f5e9bd into pydot:main May 18, 2025
20 checks passed
ferdnyc added a commit that referenced this pull request May 24, 2025
@ferdnyc ferdnyc deleted the more-coverage branch May 24, 2025 23:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants