
Implement autoload device extension mechanism (from RFC #122468) #127228

Closed

Conversation

bkowalskiINTEL

These changes implement the RFC from issue #122468. Early testing has been done against the habana_frameworks extension, with the entry point exported according to the RFC discussion.
Things to do:

  • Documentation
  • Unit tests (we are open to suggestions)
  • More in-depth testing (in progress)

Looking forward to your review.
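For context, this is roughly how an out-of-tree extension would export such an entry point. The package and function names below are illustrative stand-ins, not the actual habana_frameworks ones:

# setup.py of a hypothetical out-of-tree backend package
from setuptools import setup

setup(
    name="my_torch_backend",  # illustrative name
    entry_points={
        # the group scanned by the autoload mechanism in this PR
        "torch.backends": [
            "my_torch_backend = my_torch_backend:init_device_backend",
        ],
    },
)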


pytorch-bot bot commented May 27, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/127228

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 2f938df with merge base 1110edb:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.


linux-foundation-easycla bot commented May 27, 2024

CLA: Missing ID, Not Signed

bkowalskiINTEL marked this pull request as draft on May 27, 2024 at 14:39.
@@ -2065,6 +2065,28 @@ def _constrain_as_size(symbol, min: Optional[builtins.int] = None, max: Optional
"""
torch.sym_constrain_range_for_size(symbol, min=min, max=max)

def import_device_backends():

Please move this change to the end of the file.

    for backend in entry_points(group='torch.backends'):
        try:
            backend_hook = backend.load()
            if not hasattr(backend_hook, 'init_custom_backend'):

Please change the name of the function to init_device_backend for consistency.

Collaborator

Why don't we use the official load() function of EntryPoint to get the init function to call?

Sure, we can simplify this to initialize at load.
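One illustrative reading of this exchange, as a sketch (not the final PR code); whether the loaded callable needs to be invoked at all is discussed further down in this thread:

from importlib.metadata import entry_points

for backend in entry_points(group='torch.backends'):
    init_fn = backend.load()  # load() resolves the entry-point target, i.e. the init function
    init_fn()                 # initialize the backend right after loading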


def is_device_backend_autoload_enabled() -> bool:
    var = os.getenv("TORCH_DISABLE_DEVICE_BACKEND_AUTOLOAD")
    return not var in (1, 'True', 'true', 'Yes', 'yes')
Collaborator

nit: suggest doing var.upper() to simplify the code.

Author

Since os.getenv() returns None when the variable isn't set, this is how I would write it:

def is_device_backend_autoload_enabled() -> bool:
    var = os.getenv("TORCH_DISABLE_DEVICE_BACKEND_AUTOLOAD")
    return not str(var).upper() in ('1', 'TRUE', 'YES')

What do you think?

Collaborator

def is_device_backend_autoload_enabled() -> bool:
    var = os.getenv("TORCH_DISABLE_DEVICE_BACKEND_AUTOLOAD")
    return var is None or not str(var).upper() in ('1', 'TRUE', 'YES')
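For illustration, here is how the suggested check behaves for a few values of the environment variable (a standalone sketch based on the snippet above, not the PR code itself):

import os

def is_device_backend_autoload_enabled() -> bool:
    var = os.getenv("TORCH_DISABLE_DEVICE_BACKEND_AUTOLOAD")
    return var is None or str(var).upper() not in ('1', 'TRUE', 'YES')

os.environ.pop("TORCH_DISABLE_DEVICE_BACKEND_AUTOLOAD", None)
print(is_device_backend_autoload_enabled())  # True: variable unset, autoload stays enabled

os.environ["TORCH_DISABLE_DEVICE_BACKEND_AUTOLOAD"] = "1"
print(is_device_backend_autoload_enabled())  # False: autoload disabled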

Comment on lines +2074 to +2082
    for backend in entry_points(group='torch.backends'):
        try:
            backend_hook = backend.load()
            if not hasattr(backend_hook, 'init_custom_backend'):
                print(f"No explicit backend init function for \'{backend_hook.__name__}\' has been found, custom backend can't be loaded.")
                continue
            backend_hook.init_custom_backend()
        except IndexError:
            pass
Contributor

shink commented May 28, 2024

I don't think we need to call the init function, only load it.
Since the init function is implemented in your extension, the extension will be imported automatically once backend.load() is called, and calling it directly might be risky.

Here is a simple example; any suggestions are welcome.

In the torch_npu implementation:

# setup.py
setup(
    entry_points={
        'torch.backends': [
            'torch_npu = torch_npu:_init_device_backend',
        ],
    }
)

# torch_npu/__init__.py
def _init_device_backend():
    pass

In pytorch:

# torch/__init__.py
for plugin in discovered_plugins:
    try:
        # just loads the plugin without calling
        plugin.load()
    except Exception:
        # keep quiet
        pass
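
In the sketch above, discovered_plugins is left undefined; presumably it would come from the same entry-point discovery this PR already uses, along these lines (illustrative only):

from importlib.metadata import entry_points

# hypothetical completion of the sketch: discover every backend registered
# under the 'torch.backends' entry-point group
discovered_plugins = entry_points(group='torch.backends')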

@jczaja

jczaja commented May 29, 2024

@bsochack, @jgong5, @shink Hi, we accidentally deleted the branch used for this PR, so the PR was invalidated. We have recreated it here: #127386. It contains fixes for all of the comments in this review. Please continue reviewing this functionality at the new PR: #127386. We apologize for the inconvenience.
