Add file describing how to add or modify specialized families of instructions. #26954

Merged
merged 3 commits into from
Jul 1, 2021

markshannon
Member

@markshannon markshannon commented Jun 29, 2021

We've had a few contributors wanting to implement specialized instructions.
To help out those contributors, I thought it would be useful to add a "how to" describing the process.

Member

@Fidget-Spinner Fidget-Spinner left a comment

LGTM. I noticed that some of the lower-level implementation details aren't described (such as the code and data layout, which uses positive and negative indexes into a contiguous array). But those will probably change quite a bit over time, so I think it's fine to omit them.

Because it is impossible to measure the performance of an instruction without also
measuring unrelated factors, the assessment of the quality of a specialization will require some judgement.

As a general rule, specialized instructions should be much faster than the base instruction.
Member

Just a question: how much faster exactly? Or is this evaluated case-by-case using our own judgement?

Member Author

It depends, so yes, use your judgement.

If a specialization covers a lot of cases and reduces the number of misses by a lot, then a x2 speedup would be fine. For a more narrow specialization, something more like x4 would be appropriate.

I'm talking about the time spent in the instruction only, so for call instructions don't count the time spent in the callee or the overhead of making a C call.

Member

@gvanrossum gvanrossum left a comment

Can you fold lines longer than 80 characters?

Adding more code examples would make the doc more accessible. At the very least indicate where to find the examples you do mention in the source code.

@pablogsal
Member

I would advise copying or moving this guide to https://github.com/python/devguide, as it is going to be much more visible there than a readme in the Python directory.

@gvanrossum
Member

> I would advise copying or moving this guide to https://github.com/python/devguide, as it is going to be much more visible there than a readme in the Python directory.

Not clear that we need to give it much visibility though. There are probably only 20-30 opcodes that are candidates for specialization and only very few people (including everyone reading this PR :-) will need it. There are other readmes sprinkled throughout the code for similarly obscure things.

Maybe there should be a mention of this readme in ceval.c or specialize.c though?

@pablogsal
Member

pablogsal commented Jun 29, 2021

> Maybe there should be a mention of this readme in ceval.c or specialize.c though?

Right, that is probably a much better approach than the devguide 👍

at regular intervals, attempts to specialize itself. If not specializing, it executes
the non-adaptive instruction.
* It has at least one specialized form of the instruction that is tailored for a particular value or set of values at runtime.
* All members of the family have access to same number of cache entries.
Member

Suggested change
* All members of the family have access to same number of cache entries.
* All members of the family have access to the same number of cache entries.
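
The family structure described in the quoted lines can be pictured with a toy model. The sketch below is an illustration only, not CPython's implementation: the class `AddSite`, the constant `SPECIALIZE_INTERVAL`, and the choice to fall back to the adaptive form on the first guard failure are all assumptions made for this example.

```python
import operator

# Hypothetical retry interval; the real value is an implementation detail.
SPECIALIZE_INTERVAL = 8


class AddSite:
    """Toy model of one instruction site (hypothetical, not CPython's layout)."""

    def __init__(self):
        self.form = "adaptive"  # which member of the family is installed here
        self.countdown = 0      # executions left before the next attempt
        self.misses = 0         # guard failures seen by the specialized form

    def execute(self, a, b):
        if self.form == "int":
            # Specialized form: a cheap guard, then the fast path.
            if type(a) is int and type(b) is int:
                return int.__add__(a, b)
            # Guard failed: reinstall the adaptive form (a simplification).
            self.misses += 1
            self.form = "adaptive"
            self.countdown = SPECIALIZE_INTERVAL
            return operator.add(a, b)

        # Adaptive form: attempt to specialize only when the countdown expires.
        if self.countdown == 0:
            if type(a) is int and type(b) is int:
                self.form = "int"
            else:
                self.countdown = SPECIALIZE_INTERVAL
        else:
            self.countdown -= 1
        # When not specializing, behave like the non-adaptive instruction.
        return operator.add(a, b)
```

Every form here reads the same per-site state (`countdown`, `misses`), which loosely mirrors the requirement that all members of a family share the same number of cache entries.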


### Gathering data

Before choosing how to specialize an instruction, it is important to gather some data. What are the pattern of usage of the base instruction?
Member

Suggested change
Before choosing how to specialize an instruction, it is important to gather some data. What are the pattern of usage of the base instruction?
Before choosing how to specialize an instruction, it is important to gather some data. What are the patterns of usage of the base instruction?
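
One way to picture "patterns of usage" is a tally of the operand types an instruction sees. The sketch below is a hypothetical illustration, not the tooling used in CPython: the function `usage_patterns` and the hand-written `sample` list are assumptions, and real data would come from instrumenting the interpreter.

```python
from collections import Counter


def usage_patterns(observations):
    """Return (type-pair, share) tuples, most frequent first.

    `observations` is an iterable of (left, right) operand pairs recorded
    for one base instruction (a hypothetical input format).
    """
    counts = Counter(
        (type(a).__name__, type(b).__name__) for a, b in observations
    )
    total = sum(counts.values())
    return [(pattern, n / total) for pattern, n in counts.most_common()]


# Hand-written sample standing in for recorded operands.
sample = [(1, 2), (3, 4), (1.5, 2), ("a", "b"), (5, 6)]
for pattern, share in usage_patterns(sample):
    print(pattern, f"{share:.0%}")
```

A pattern that dominates the tally is a natural candidate for a specialized form; rare patterns are usually better left to the base instruction.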

Member

@gvanrossum gvanrossum left a comment

Let's land this!

@markshannon markshannon reopened this Jun 30, 2021
@markshannon markshannon merged commit 3623aaa into python:main Jul 1, 2021
@markshannon markshannon deleted the add-specialization-guide branch January 6, 2022 15:27
7 participants