-
-
Notifications
You must be signed in to change notification settings - Fork 31.8k
GH-98831: Implement super-instruction generation #99084
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@CAM-Gerlach Can you double check the markup in the NEWS entry? What's the recommended way to refer to a markdown file in a different repo owned by a different org? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had a few small suggestions to clarify the news entry and move the link to be inline where the DSL introduced, instead of a bare URL at the end of the body, but nothing too major.
Maybe out of scope for this particular PR, but while it may not affect too many people's work directly, this #98831 overall seems a significant enough change to deserve a What's New entry in the (e.g. in the Build Section, or as appropriate), which this NEWS entry seems like a great basis for.
Misc/NEWS.d/next/Build/2022-10-28-22-24-26.gh-issue-98831.IXRCRX.rst
Outdated
Show resolved
Hide resolved
Misc/NEWS.d/next/Build/2022-10-28-22-24-26.gh-issue-98831.IXRCRX.rst
Outdated
Show resolved
Hide resolved
Misc/NEWS.d/next/Build/2022-10-28-22-24-26.gh-issue-98831.IXRCRX.rst
Outdated
Show resolved
Hide resolved
Misc/NEWS.d/next/Build/2022-10-28-22-24-26.gh-issue-98831.IXRCRX.rst
Outdated
Show resolved
Hide resolved
Misc/NEWS.d/next/Build/2022-10-28-22-24-26.gh-issue-98831.IXRCRX.rst
Outdated
Show resolved
Hide resolved
Right now, same or different repo/org, there isn't (AFAIK) currently an easy way to do this with a nice role as opposed to just a plain reST link. However, following discussion and some experimentation in the devguide repo and Discord, I'm working on a little Sphinx extension with a base "SmarkLink" role that, with just a couple lines of customization can be used to link files on the CPython repo, other repos/orgs or even other websites (like PEPs, RFCs, etc, as a smarter and more powerful version of the built-in PEP/RFC roles). If we implement that, you'll be able to do e.g. Here's an earlier prototype. |
Co-authored-by: C.A.M. Gerlach <CAM.Gerlach@Gerlach.CAM>
Thanks for your suggestions! All applied. I'm looking forward to using your "SnarkLink" extension next time. :-) |
We need the code generator to have an understanding of stack effects before breaking instructions into their parts. TARGET(STORE_FAST__LOAD_FAST) {
{
PyObject *value = GETLOCAL(oparg);
assert(value != NULL);
Py_INCREF(value);
PUSH(value);
}
NEXTOPARG();
next_instr++;
{
PyObject *value = POP();
SETLOCAL(oparg, value);
}
DISPATCH();
} This is clearly inefficient. TARGET(STORE_FAST__LOAD_FAST) {
PyObject *_stack_tmp1;
{
PyObject *value = GETLOCAL(oparg);
assert(value != NULL);
Py_INCREF(value);
_stack_tmp1 = value;
}
NEXTOPARG();
next_instr++;
{
PyObject *value = _stack_tmp1
SETLOCAL(oparg, value);
}
DISPATCH();
} In order to do this, we first need to have stack effects that the tooling understands. This probably doesn't matter for superinstructions, but does matter if we want to split specialized instructions into guard and action parts. |
I think the next enhancements to generated code should be generating the code for dispatch related macros.
Implementing stack effects can be done before or after the above. Our goal is to expose "micro ops" for optimization:
And be able to generate code for |
@markshannon, I asked you one question you didn't answer yet: Does it matter that the placement of the
I imagine this is important for opcodes that can be combined into others. It doesn't matter for top-level instructions, right? Doing this everywhere seems complicated, since there are quite a few that have conditional
That was going to be my next project, indeed.
What purpose does this serve? The generator can recognize Anyway, my own planned sequence of projects is roughly:
After that we should decide what's next, e.g. combining uops into instrs, cache effects, array stack effects, conditional stack effects, register opcodes, who knows. |
Only if it impairs performance. I don't think it should. |
We need to implement stack effects before superinstructions. Without the stack effects the code generator cannot efficiently move values from one op to the next. |
Sure, it is just about of consistency. It is easier to see what's going on if |
Replacing all the |
Okay, but the current crop of 5 superinstructions don't need this. There's not a single case where the second half does something with the result of the first case (no LOAD_FAST__STORE_FAST). |
@markshannon Please see "PLEASE NOTE" below.
This is pretty straightforward: the expansion of
generates code like this:
The only tricky thing is that the code block for
LOAD_CONST
contains aPREDICTED()
macro call, so we strip ththoseat out. (Eventually we'll auto-generate thePREDICTED()
macros so we won't need to do this any more, but super-instruction generation is a simpler project than auto-PREDICTED insertion, so I figured I'd tackle it first.)The new super-instructions are all placed at the end (the parser doesn't keep track of how instruction definitions are interspersed), so the diff for generated_cases.c.h looks perhaps a bit more imposing than it is.
PLEASE NOTE: The code for some of the generated super-instructions is different than the hand-written super-instructions because somehow the hand-written versions moved the
NEXTOPARG(); next_instr++;
pair up a few lines. I don't know why the originals were written that way -- @markshannon do you remember? Do you think it matters? (Another difference is that the parts each define their own variables, whereas the hand-written versions reusedPyObject *value
, but I assume the optimizer will handle that.)