(Nearly) complete dot_parser coverage #482

ferdnyc · 2025-05-16T14:02:18Z

When looking into #479, I had the realization that if we had complete coverage in the parser code, we'd have caught that the parser was never applying the 'strict' keyword to graphs. That line had no test coverage (even though it should've), and even if we'd attempted to directly add coverage for that line, we'd never have succeeded in reaching it. Which would've been a red flag that there was something wrong with the code.

So, this PR updates the remainder of the parser code to not only make a lot of the parsing cleaner (thanks to results names and a new helper function), but eliminate tons of dead code that was never being called (and was often completely unreachable).

Direct testing for the __repr__s on P_AttrList and DefaultStatement is also added, to ensure coverage of that code.

The outcome is coverage for dot_parser.py jumping up to 93%, and except for the 5 lines of exception handling in parse_dot_data(), all of the uncovered lines are in push_top_graph_stmt (so they should be handled by #480). Once both of these PRs are merged we should have total coverage of all the parsing code, meaning we can be fairly certain that all of its features are operating properly.

Along the way I also made a few other adjustments:

Renamed the attr_stmt parser element to default_stmt like it always should've been.
Eliminated port parsing entirely. See the "improve node parsing" for a full explanation of that one
Got rid of all the misguided defaults handling in add_elements, again see the "Streamline add_elements" commit message for details

A helper function for `ParserElement`s that can accept `attr_list` as part of their grammar. Assigning a results_name to the `attr_list` component and passing it to this function will return either a combined dict of all the attributes in all parsed attr_lists, or an empty dict if none were found. This can then be included in `pydot.core` class constructor calls via `**attrs`.

Rename `attr_stmt` to `default_stmt`, apply results names, and rewrite `push_default_stmt` to make use of `expand_attr_lists()`

Again, add results names to grammar and use `expand_attr_lists()`

Slightly more complex than the previous parsing improvements, due to the complexity present in `push_edge_stmt()`. It's now far _less_ complex, and drops a bunch of ugly hackery. Still, the meat of the change is the same: add results names, use `expand_attr_lists()`.

This one's kind of a big deal. Yes, add results names and make use of `apply_attr_lists()`. But beyond that, all parsing of `port` values is eliminated — the `node_id` is now defined as a `DelimitedList`, consisting of between 1 and 3 `ID` strings separated by colons. The `DelimitedList` parser element will combine them into a single string. We were always passing the node name strings in to the `pydot.core.Node` constructor as a single string, anyway -- the logic to parse out port values lives in _that_ code. So there's no reason to duplicate it in the parser, other than to validate that the string parsed has a _plausible_ format. As a result, `do_node_ports()` is dropped as it's no longer ever called.

The function contained a ton of dead code, being far too complex for its only REAL purpose: To ensure that subgraphs on complex edges have the correct parent graph hierarchy. (The obvious question here would be: "Wait, aren't those subgraphs stored as `FrozenDict`s? How can they be "updated"? Well, it turns out each `FrozenDict` representing a subgraph has a `parent_graph` key in its dictionary that points to the _actual_ `Subgraph` object it was created from (since the `Subgraph` had itself as its own parent, when it was first initialized and then frozen). Those `Subgraph` objects don't get garbage collected, because they're still referenced by the `FrozenDict` they were created from. That's probably a bad thing, and maybe we should focus on _deleting_ them instead of reparenting them, but for now this is the way the code's been doing it.

Rewrite this crucial function to remove a _ton_ of dead code, including all of the defaults-handling that never actually did anything (and _shouldn't_ do anything, as it was based on a misguided premise: That defaults should be copied directly into the attributes of any individual graph elements they apply to. We don't want that!) As a result, also drop `add_defaults()` which is no longer used.

github-actions · 2025-05-16T14:07:09Z

Coverage report

Click to see where and how coverage changed

File	Statements	Missing	Coverage	Coverage (new stmts)	Lines missing
src/pydot
core.py
dot_parser.py
test
test_parser.py
Project Total

_{This report was generated by python-coverage-comment-action}

ferdnyc · 2025-05-16T14:12:07Z

Oh, yes, see also the "Clean up update_parent_graph_hierarchy" commit message for some details on why we may want to look deeper into that function and the reasons it exists at all.

github-actions · 2025-05-17T21:42:07Z

This pull request has conflicts, please resolve those so that the changes can be evaluated.

github-actions · 2025-05-18T05:18:38Z

All conflicts have been resolved, thanks!

lkk7 · 2025-05-18T12:32:59Z

Ruff will always catch by surprise.

This reverts commit 8f5e9bd.

ferdnyc added 11 commits May 15, 2025 18:09

parser: remove unused get_port()

5f58152

Test repr for P_AttrList and DefaultStatement

2eb8ed4

Parser: Improve defaults node parsing

3626ca7

Rename `attr_stmt` to `default_stmt`, apply results names, and rewrite `push_default_stmt` to make use of `expand_attr_lists()`

Parser: Improve subgraph parsing

49c7718

Again, add results names to grammar and use `expand_attr_lists()`

Parser: Drop unused AttributeDict import

e915f7a

Parser: Type adjustment

3dcbd2b

github-actions bot added the conflicts label May 17, 2025

Merge branch 'main' into more-coverage

dee6add

github-actions bot removed the conflicts label May 18, 2025

Ruff fixes

bb143d6

lkk7 merged commit 8f5e9bd into pydot:main May 18, 2025
20 checks passed

ferdnyc added a commit that referenced this pull request May 24, 2025

Revert "(Nearly) complete dot_parser coverage (#482)"

84f85ea

This reverts commit 8f5e9bd.

ferdnyc deleted the more-coverage branch May 24, 2025 23:14

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

(Nearly) complete dot_parser coverage #482

(Nearly) complete dot_parser coverage #482

Uh oh!

ferdnyc commented May 16, 2025 •

edited

Loading

Uh oh!

github-actions bot commented May 16, 2025 •

edited

Loading

Uh oh!

ferdnyc commented May 16, 2025

Uh oh!

github-actions bot commented May 17, 2025

Uh oh!

github-actions bot commented May 18, 2025

Uh oh!

lkk7 commented May 18, 2025

Uh oh!

Uh oh!

Uh oh!

(Nearly) complete dot_parser coverage #482

(Nearly) complete dot_parser coverage #482

Uh oh!

Conversation

ferdnyc commented May 16, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions bot commented May 16, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Coverage report

Uh oh!

ferdnyc commented May 16, 2025

Uh oh!

github-actions bot commented May 17, 2025

Uh oh!

github-actions bot commented May 18, 2025

Uh oh!

lkk7 commented May 18, 2025

Uh oh!

Uh oh!

Uh oh!

ferdnyc commented May 16, 2025 •

edited

Loading

github-actions bot commented May 16, 2025 •

edited

Loading