[benchmark] Add HF LLM benchmarks #156967

angelayi · 2025-06-26T17:02:47Z

Results are in https://docs.google.com/spreadsheets/d/1xXOPg9JjEmPx0zc5QBNdyXQq8-K2_r4ybHaiS-q7pZ0/edit?gid=88695043#gid=88695043

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @chenyang78 @kadeng @chauhang @amjames

pytorch-bot · 2025-06-26T17:02:52Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/156967

📄 Preview Python docs built from this PR
📄 Preview C++ docs built from this PR
❓ Need help or want to give feedback on the CI? Visit the bot commands wiki or our office hours

Note: Links to docs will display an error until the docs builds have been completed.

❌ 2 New Failures, 11 Pending

As of commit 441527d with merge base 80cca83:

NEW FAILURES - The following jobs have failed:

inductor / cuda12.8-py3.10-gcc9-sm86 / test (inductor_huggingface, 1, 1, linux.g5.4xlarge.nvidia.gpu) (gh)
ValueError: `rope_scaling` must be a dictionary with with two fields, `type` and `factor`, got {'factor': 32.0, 'high_freq_factor': 4.0, 'low_freq_factor': 1.0, 'original_max_position_embeddings': 8192, 'rope_type': 'llama3'}
inductor / linux-jammy-cpu-py3.9-gcc11-inductor / test (dynamic_cpu_inductor_huggingface, 1, 1, linux.8xlarge.amx) (gh)
ValueError: `rope_scaling` must be a dictionary with with two fields, `type` and `factor`, got {'factor': 32.0, 'high_freq_factor': 4.0, 'low_freq_factor': 1.0, 'original_max_position_embeddings': 8192, 'rope_type': 'llama3'}

This comment was automatically generated by Dr. CI and updates every 15 minutes.

benchmarks/dynamo/huggingface_llm_models.py

BoyuanFeng · 2025-07-15T16:47:51Z

Thanks for adding more models! A few minor comments. Also, please fix the CI.

torch/_dynamo/utils.py

benchmarks/dynamo/huggingface_llm.py

benchmarks/dynamo/huggingface_llm_models.py

benchmarks/dynamo/huggingface_llm.yaml

benchmarks/dynamo/huggingface_llm.py

BoyuanFeng · 2025-08-11T05:14:23Z

Curious: will we add these models to the existing "huggingface" column or to a new column called "huggingface_llm"? Having two columns that start with "huggingface" might be a bit confusing.

angelayi · 2025-08-11T15:43:56Z

@BoyuanFeng yes! I have updated the PR to merge everything into the huggingface column.

angelayi · 2025-08-11T15:45:51Z

benchmarks/dynamo/common.py

-        elif args.export_nativert:
-            frozen_model_iter_fn = export_nativert(model, example_inputs)
+        use_generate_mode = kwargs.get("use_generate_mode", False)
+        if use_generate_mode:


I added this use_generate_mode flag so that we apply torch.compile/export only to model.forward, rather than to model.generate.
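To illustrate the idea behind the flag, here is a minimal sketch that runs without PyTorch installed. It is not the code in benchmarks/dynamo/common.py: `fake_compile`, `ToyModel`, and `prepare` are hypothetical names, and `fake_compile` is only a stand-in for torch.compile/export that records calls. The point it shows is that in generate mode only `forward` gets wrapped, while `generate` stays a plain Python decoding loop that dispatches to the wrapped `forward`.

```python
# Toy illustration: optimize only `forward`, keep `generate` as an eager loop.
# `fake_compile` is a hypothetical stand-in for torch.compile/export so the
# sketch runs without PyTorch; it just records each call to the wrapped fn.

def fake_compile(fn, calls):
    """Wrap fn and log its name on every call, mimicking a compiled callable."""
    def wrapped(*args, **kwargs):
        calls.append(fn.__name__)
        return fn(*args, **kwargs)
    return wrapped


class ToyModel:
    def forward(self, tokens):
        # Pretend the "next token" is the last token plus one.
        return tokens[-1] + 1

    def generate(self, tokens, max_new_tokens):
        # Eager decoding loop; each step dispatches to self.forward.
        out = list(tokens)
        for _ in range(max_new_tokens):
            out.append(self.forward(out))
        return out


def prepare(model, use_generate_mode, calls):
    """Return the callable that a benchmark harness would time."""
    if use_generate_mode:
        # Wrap only forward; generate itself is left untouched.
        model.forward = fake_compile(model.forward, calls)
        return model.generate
    # Without generate mode, wrap forward and benchmark it directly.
    return fake_compile(model.forward, calls)


calls = []
model = ToyModel()
run = prepare(model, use_generate_mode=True, calls=calls)
result = run([1, 2], max_new_tokens=3)
```

Here `result` holds the prompt plus three new tokens, and `calls` contains one `"forward"` entry per decoding step, so the wrapped function is exercised once per generated token while the outer loop is never wrapped.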

angelayi requested review from zou3519 and anijain2305 June 26, 2025 17:02

pytorch-bot bot added the ciflow/inductor and module: dynamo labels Jun 26, 2025

angelayi requested a review from BoyuanFeng June 26, 2025 21:36

angelayi added the topic: not user facing topic category label Jun 26, 2025

anijain2305 reviewed Jul 14, 2025

benchmarks/dynamo/huggingface_llm_models.py

BoyuanFeng reviewed Jul 15, 2025

angelayi force-pushed the angelayi/benchmark2 branch 4 times, most recently from 0749c30 to bbf4a09 Compare August 11, 2025 15:37

angelayi marked this pull request as ready for review August 11, 2025 15:43

angelayi commented Aug 11, 2025

angelayi force-pushed the angelayi/benchmark2 branch from bbf4a09 to f48faf2 Compare August 11, 2025 16:06

[benchmark] Add HF LLM benchmarks

441527d

angelayi force-pushed the angelayi/benchmark2 branch from f48faf2 to 441527d Compare August 12, 2025 04:45
