-
Notifications
You must be signed in to change notification settings - Fork 165
(Nearly) complete dot_parser coverage #482
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
A helper function for `ParserElement`s that can accept `attr_list` as part of their grammar. Assigning a results_name to the `attr_list` component and passing it to this function will return either a combined dict of all the attributes in all parsed attr_lists, or an empty dict if none were found. This can then be included in `pydot.core` class constructor calls via `**attrs`.
Rename `attr_stmt` to `default_stmt`, apply results names, and rewrite `push_default_stmt` to make use of `expand_attr_lists()`
Again, add results names to grammar and use `expand_attr_lists()`
Slightly more complex than the previous parsing improvements, due to the complexity present in `push_edge_stmt()`. It's now far _less_ complex, and drops a bunch of ugly hackery. Still, the meat of the change is the same: add results names, use `expand_attr_lists()`.
This one's kind of a big deal. Yes, add results names and make use of `apply_attr_lists()`. But beyond that, all parsing of `port` values is eliminated — the `node_id` is now defined as a `DelimitedList`, consisting of between 1 and 3 `ID` strings separated by colons. The `DelimitedList` parser element will combine them into a single string. We were always passing the node name strings in to the `pydot.core.Node` constructor as a single string, anyway -- the logic to parse out port values lives in _that_ code. So there's no reason to duplicate it in the parser, other than to validate that the string parsed has a _plausible_ format. As a result, `do_node_ports()` is dropped as it's no longer ever called.
The function contained a ton of dead code, being far too complex for its only REAL purpose: To ensure that subgraphs on complex edges have the correct parent graph hierarchy. (The obvious question here would be: "Wait, aren't those subgraphs stored as `FrozenDict`s? How can they be "updated"? Well, it turns out each `FrozenDict` representing a subgraph has a `parent_graph` key in its dictionary that points to the _actual_ `Subgraph` object it was created from (since the `Subgraph` had itself as its own parent, when it was first initialized and then frozen). Those `Subgraph` objects don't get garbage collected, because they're still referenced by the `FrozenDict` they were created from. That's probably a bad thing, and maybe we should focus on _deleting_ them instead of reparenting them, but for now this is the way the code's been doing it.
Rewrite this crucial function to remove a _ton_ of dead code, including all of the defaults-handling that never actually did anything (and _shouldn't_ do anything, as it was based on a misguided premise: That defaults should be copied directly into the attributes of any individual graph elements they apply to. We don't want that!) As a result, also drop `add_defaults()` which is no longer used.
Coverage reportClick to see where and how coverage changed
This report was generated by python-coverage-comment-action |
Oh, yes, see also the "Clean up update_parent_graph_hierarchy" commit message for some details on why we may want to look deeper into that function and the reasons it exists at all. |
This pull request has conflicts, please resolve those so that the changes can be evaluated. |
All conflicts have been resolved, thanks! |
Ruff will always catch by surprise. |
When looking into #479, I had the realization that if we had complete coverage in the parser code, we'd have caught that the parser was never applying the 'strict' keyword to graphs. That line had no test coverage (even though it should've), and even if we'd attempted to directly add coverage for that line, we'd never have succeeded in reaching it. Which would've been a red flag that there was something wrong with the code.
So, this PR updates the remainder of the parser code to not only make a lot of the parsing cleaner (thanks to results names and a new helper function), but eliminate tons of dead code that was never being called (and was often completely unreachable).
Direct testing for the
__repr__
s onP_AttrList
andDefaultStatement
is also added, to ensure coverage of that code.The outcome is coverage for
dot_parser.py
jumping up to 93%, and except for the 5 lines of exception handling inparse_dot_data()
, all of the uncovered lines are inpush_top_graph_stmt
(so they should be handled by #480). Once both of these PRs are merged we should have total coverage of all the parsing code, meaning we can be fairly certain that all of its features are operating properly.Along the way I also made a few other adjustments:
attr_stmt
parser element todefault_stmt
like it always should've been.port
parsing entirely. See the "improve node parsing" for a full explanation of that oneadd_elements
, again see the "Streamline add_elements" commit message for details