Skip to content

Add decompositions for median and nonmedian #134881

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 16 commits into
base: gh/isuruf/81/base
Choose a base branch
from

Conversation

isuruf
Copy link
Collaborator

@isuruf isuruf commented Aug 30, 2024

isuruf added 2 commits August 30, 2024 20:09
[ghstack-poisoned]
[ghstack-poisoned]
Copy link

pytorch-bot bot commented Aug 30, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/134881

Note: Links to docs will display an error until the docs builds have been completed.

❌ 2 New Failures, 1 Unrelated Failure

As of commit bac6299 with merge base 83875cd (image):

NEW FAILURES - The following jobs have failed:

UNSTABLE - The following job is marked as unstable, possibly due to flakiness on trunk:

  • pull / linux-jammy-py3_9-clang9-xla / test (xla, 1, 1, lf.linux.12xlarge, unstable) (gh) (#158876)
    /var/lib/jenkins/workspace/xla/torch_xla/csrc/runtime/BUILD:476:14: Compiling torch_xla/csrc/runtime/xla_util_test.cpp failed: (Exit 1): gcc failed: error executing CppCompile command (from target //torch_xla/csrc/runtime:xla_util_test) /usr/bin/gcc -U_FORTIFY_SOURCE -fstack-protector -Wall -Wunused-but-set-parameter -Wno-free-nonheap-object -fno-omit-frame-pointer -g0 -O2 '-D_FORTIFY_SOURCE=1' -DNDEBUG -ffunction-sections ... (remaining 229 arguments skipped)

This comment was automatically generated by Dr. CI and updates every 15 minutes.

isuruf added a commit that referenced this pull request Aug 30, 2024
ghstack-source-id: 4b1cb71
Pull Request resolved: #134881
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
isuruf added a commit that referenced this pull request Sep 3, 2024
ghstack-source-id: 12b12d2
Pull Request resolved: #134881
Copy link
Collaborator

@Chillee Chillee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you run some benchmarks comparing to eager? Also, do we want to change whether we do the fallback depending on the device?

@isuruf
Copy link
Collaborator Author

isuruf commented Sep 4, 2024

Benchmark results are not good compared to eager.

[------------------------------------ max_unpool -----------------------------------]
                                                              |  Decomposed  |  Eager
12 threads: -------------------------------------------------------------------------
      {'dim': 0, 'keepdim': False, 'shape': (3000, 3000)}     |     1098     |   1048
      {'dim': 0, 'keepdim': True, 'shape': (3000, 3000)}      |     1070     |   1049
      {'dim': 1, 'keepdim': False, 'shape': (3000, 3000)}     |      689     |    538
      {'dim': 1, 'keepdim': True, 'shape': (3000, 3000)}      |      662     |    542
      {'shape': (3000, 3000)}                                 |     1425     |   1407
      {'dim': 0, 'keepdim': False, 'shape': (200, 200, 200)}  |     1818     |    846
      {'dim': 0, 'keepdim': True, 'shape': (200, 200, 200)}   |     1790     |    846
      {'dim': 1, 'keepdim': False, 'shape': (200, 200, 200)}  |     1768     |    740
      {'dim': 1, 'keepdim': True, 'shape': (200, 200, 200)}   |     1742     |    739
      {'dim': 2, 'keepdim': False, 'shape': (200, 200, 200)}  |      676     |    679
      {'dim': 2, 'keepdim': True, 'shape': (200, 200, 200)}   |      650     |    677
      {'shape': (200, 200, 200)}                              |     1274     |   1263

@Chillee
Copy link
Collaborator

Chillee commented Sep 4, 2024

@isuruf I took a look into the perf, and some observations:

  1. it seems the main difference in perf is that for the case where dim is passed in, eager actually has atopk kernel it dispatches to, which is the main reason why eager is faster than decomposed for those cases.
  2. There are actually quite a bit of shapes where the decomposition is faster than aten. The issue is that your benchmarking is all with multiples of 100, and not multiples of 128 :P But this seems to mainly come down to whether aten.topk or aten.sort is faster.

For example

torch.Size([24000, 2048])
aten:  3.3983023166656494
compile:  2.7202703952789307

@isuruf
Copy link
Collaborator Author

isuruf commented Sep 5, 2024

Thanks. I'll look at getting a topk kernel in.

isuruf added 2 commits October 4, 2024 18:16
[ghstack-poisoned]
[ghstack-poisoned]
isuruf added a commit that referenced this pull request Oct 4, 2024
ghstack-source-id: 7be93be
Pull Request resolved: #134881
Copy link
Contributor

github-actions bot commented Dec 3, 2024

Looks like this PR hasn't been updated in a while so we're going to go ahead and mark this as Stale.
Feel free to remove the Stale label if you feel this was a mistake.
If you are unable to remove the Stale label please contact a maintainer in order to do so.
If you want the bot to never mark this PR stale again, add the no-stale label.
Stale pull requests will automatically be closed after 30 days of inactivity.

@github-actions github-actions bot added the Stale label Dec 3, 2024
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
isuruf added a commit that referenced this pull request Jan 2, 2025
ghstack-source-id: f101aae
Pull Request resolved: #134881
[ghstack-poisoned]
@pytorch-bot pytorch-bot bot added the oncall: distributed Add this issue/PR to distributed oncall triage queue label Jan 3, 2025
isuruf added a commit that referenced this pull request Jan 3, 2025
ghstack-source-id: 584ed5d
Pull Request resolved: #134881
@isuruf isuruf requested a review from Chillee January 16, 2025 09:25
@isuruf
Copy link
Collaborator Author

isuruf commented Jan 23, 2025

I have removed the decompositions for median.dim and only kept it for median.default for now which uses aten.sort and therefore we don't have to get a topk kernel in. I also separated out the OpInfo for median.dim and median.default to two variants to make tests work.

@github-actions github-actions bot closed this Feb 22, 2025
@github-actions github-actions bot deleted the gh/isuruf/81/head branch March 25, 2025 02:11
@isuruf isuruf restored the gh/isuruf/81/head branch August 7, 2025 19:01
@isuruf isuruf removed the Stale label Aug 7, 2025
@isuruf isuruf reopened this Aug 7, 2025
[ghstack-poisoned]
isuruf added a commit that referenced this pull request Aug 7, 2025
ghstack-source-id: 8d0ff9c
Pull Request resolved: #134881
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants