
Ambiguous symbol table names #121914


Open
serhiy-storchaka opened this issue Jul 17, 2024 · 4 comments
Labels
stdlib Python modules in the Lib dir type-feature A feature request or enhancement

Comments

@serhiy-storchaka
Member

serhiy-storchaka commented Jul 17, 2024

Feature or enhancement

Every symbol table has a type and a name. The name of a symbol table that corresponds to a class or a function is the same as the name of that class or function. But there are special symbol tables of type function for lambdas and generator expressions (there were also symbol tables for comprehensions, but they are no longer used). For lambdas, the name "lambda" does not conflict with other function names, because it is a reserved word. But for generator expressions, the name "genexpr" can conflict with a local function named "genexpr" (see #119698). It is possible to distinguish the symbol table corresponding to a generator expression by looking at its list of parameters, but this is not very convenient.
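To make the ambiguity concrete, here is a minimal sketch using the `symtable` module. The source snippet and the helper name `child_tables` are made up for illustration. It lists the nested tables of one function and shows that only the parameter list (the implicit `.0` argument) reliably separates the generator expression from a local function called `genexpr`. The reported name of the generator expression table may differ between Python versions, so the sketch does not rely on it.

```python
import symtable

# Hypothetical example source: a local function that happens to be called
# "genexpr", defined next to a real generator expression.
SOURCE = (
    "def outer(items):\n"
    "    def genexpr(x):\n"
    "        return x\n"
    "    return (genexpr(i) for i in items)\n"
)


def child_tables(source):
    """Return (name, type, parameters) for each table nested in the first function."""
    top = symtable.symtable(source, "<example>", "exec")
    outer = top.get_children()[0]
    return [
        # str() keeps the type comparable whether get_type() returns a plain
        # string or a string-valued enum member.
        (child.get_name(), str(child.get_type()), child.get_parameters())
        for child in outer.get_children()
    ]


rows = child_tables(SOURCE)
for row in rows:
    print(row)
```

Both rows have type `function`; the first has the parameter `x`, the second only the implicit `.0`.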

I propose to make the difference clearer:

  • Either use separate types for lambda and generator expression symbol tables.

  • Or use names which cannot be confused with any function name:

    • a reserved word, like lambda. But I do not know a good variant for generator expressions.
    • hyphenated word, like gen-expr.
    • angle brackets, like <genexpr> and <lambda>.
    • other special characters, e.g. .genexpr.

    Names of symbol tables of other types like top and __annotations__ can also be changed for uniformity and to avoid possible future conflicts.
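A quick way to check which of the candidate spellings above could ever collide with a user-defined function is to test whether each one is a valid, non-keyword identifier. This is only an illustrative sketch; the helper name `can_name_function` is made up for this example.

```python
import keyword

# Candidate table names taken from the list above, plus the current "genexpr".
candidates = ["lambda", "genexpr", "gen-expr", "<genexpr>", "<lambda>", ".genexpr"]


def can_name_function(name):
    """True if `name` could appear in `def name(): ...`."""
    return name.isidentifier() and not keyword.iskeyword(name)


usable = {name: can_name_function(name) for name in candidates}
for name, ok in usable.items():
    print(f"{name!r}: {'can collide' if ok else 'safe'}")
```

Of these, only the plain `genexpr` is usable as a function name, which is exactly the conflict described in the issue.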

The original issue #119698 was solved in another way, so there is no urgency for such a change. This is just a wild idea.


@JelleZijlstra
Member

I think a leading dot is a good approach to get a name that doesn't conflict with user-defined names: it always works and it's already what we do in a few other places in the compiler pipeline (e.g., PEP 695 annotation scopes can have a compiler-created local variable called .type_params).
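Another long-standing example of a compiler-created dotted name is the implicit `.0` argument of a generator expression, which can be seen on its code object. The sketch below only illustrates the general convention; it does not show the `.type_params` variable mentioned above.

```python
# The outermost iterable of a generator expression is passed in as a hidden
# argument whose name starts with a dot, so user code can never spell it.
gen = (x for x in range(3))

hidden = [name for name in gen.gi_code.co_varnames if name.startswith(".")]
print(hidden)
print([name.isidentifier() for name in hidden])
```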

@picnixz
Member

picnixz commented Jul 17, 2024

was solved in another way

Yes and I dislike that way although I couldn't come up with something else... It's a bit too fragile IMO =/

I'm actually happy with both proposals:

  • different type: you can have checks of the form .type == TYPE_GENERATOR_EXPRESSION without exposing any name. The issue is that you introduce more types but I'm not sure whether something else needs to be changed in symtable.c.
  • special names: I like the dotted names and since they are already part of PEP 695, maybe it's fine to extend their usage. Having a reserved word is kind of introducing a new keyword and I'm pretty sure that you don't want genexpr to be a hard keyword (or any other kind of name, like maybe __Py_genexpr but I don't think we gain anything here...). For brackets, I think it's already the case for the repr() but not in the symbol table itself.

serhiy-storchaka added a commit to serhiy-storchaka/cpython that referenced this issue Jun 9, 2025
… genexpr

Change the names of the symbol tables for lambda expressions and generator
expressions to "<lambda>" and "<genexpr>" respectively to avoid conflicts
with user-defined names.
@picnixz picnixz added the stdlib Python modules in the Lib dir label Jun 9, 2025
@serhiy-storchaka
Member Author

New types for tables would require changing the compiler. Every place where we check for FunctionBlock would need an additional check for the new types. This can also affect third-party code. Special names would have less impact.

<genexpr> and <lambda> have the following advantages over .genexpr and .lambda:

  • They match the name of the corresponding code object.
  • Since the static string objects <genexpr> and <lambda> are already defined, this would allow reducing the number of static string objects.
  • We can reuse <listcomp>, <setcomp> and <dictcomp> for temporary symbol tables at zero cost.
  • If we ever display fully qualified names for symbol tables, Class.meth.<genexpr> will look less ambiguous than Class.meth..genexpr.
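The first and last points can be checked against the names that code objects already carry at runtime. The class `Example` below is made up for illustration; the sketch only inspects existing code object names and qualified names, not symbol tables.

```python
class Example:
    def meth(self):
        # Return one lambda and one generator expression created in a method.
        return (lambda: None), (i for i in ())


fn, gen = Example().meth()

# Names of the code objects behind the lambda and the generator expression.
names = (fn.__code__.co_name, gen.gi_code.co_name)
# Qualified names as Python already reports them for runtime objects.
qualnames = (fn.__qualname__, gen.__qualname__)

print(names)
print(qualnames)
```

Runtime qualified names include a `<locals>` component, so they are not identical to the `Class.meth.<genexpr>` form shown above, but the bracketed last component is the same.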

@serhiy-storchaka
Member Author

Since there may be several symbol tables with the same name, the name cannot be used to identify the symbol table. It is only used as a hint.
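As a small illustration of why the name is only a hint, the sketch below builds the symbol tables for a made-up function containing two lambdas. Both nested tables report the same name, so telling them apart requires something else, such as `get_id()` or `get_lineno()`. The exact name string may vary between Python versions, so it is not hard-coded here.

```python
import symtable

# Hypothetical source with two lambdas inside the same function.
SOURCE = (
    "def f():\n"
    "    a = lambda: 1\n"
    "    b = lambda: 2\n"
)

top = symtable.symtable(SOURCE, "<example>", "exec")
f_table = top.get_children()[0]
lambdas = f_table.get_children()

table_names = [t.get_name() for t in lambdas]
table_ids = [t.get_id() for t in lambdas]
table_lines = [t.get_lineno() for t in lambdas]

print(table_names)   # the same name twice
print(table_lines)   # line numbers tell the two tables apart
```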

I would also change the name of the top symbol table to <module>, but it is already documented as "top".


3 participants