2-HC2024.nvidia.MarkRen.Intro.v04

Introduction to AI for Chip Design
Haoxing (Mark) Ren, Director of Design Automation Research, NVIDIA

08/25/2024
AI for Design Performance and Productivity
• Analysis: faster, predictive, cross-stage
• Optimization: faster, more scalable, better results
• Assistance: know-how, coding, task automation
AI for Chip Design Research @ NVIDIA
We build AI to build chips for AI!
[Figure: timeline of research projects, 2019-2024, grouped by category; application area in parentheses]
• Design Assistance: VerilogCoder (RTL), ChipNeMo (Engineering), ClusteringAgent (Cell), FVAgent (FV), VerilogEval (RTL), RTLFixer (RTL), FVEval (FV), OPCAgent (Lithography)
• Design Optimization (Gen AI): VAESA (Arch), TransSizer (PD), BufFormer (PD), Clustering (Cell), TAG (Analog), Dream-GAN (PD), ILILT (Lithography)
• Design Optimization (RL, BO): NVCell-RL (Cell), PrefixRL (Synthesis), Graph Cluster (PD), AutoCRAFT RL (Analog), FIST (PD), ParaSize (Analog), AutoDMP (PD)
• Design Analysis: ParaGraph (Parasitics), MAVIREC (IR Drop), DOINN (Lithography), PRIMAL (Power), GRANNITE (Switching Activity), HPGCN (Testability), PowerNet (IR Drop)
AI Techniques
• Analysis
  • Classical ML
  • Deep learning
• Optimization
  • Bayesian optimization
  • Reinforcement learning
  • Generative AI
• Assistance
  • LLM
AI Techniques: Classical ML
• Examples: linear regression, support vector machine, neural network, decision tree
• Suitable for small structured data
AI Techniques: Deep Learning
• CNN: suitable for physical design data
• GNN: suitable for circuit netlist data
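To make the GNN idea concrete, here is a minimal sketch of one round of mean-aggregation message passing over a small netlist-like graph. The graph, the feature values, and the function name `message_pass` are illustrative assumptions; a real GNN would also apply learned weights and a nonlinearity after aggregation.

```python
def message_pass(adjacency, features):
    """One round of mean aggregation: each node averages its own
    feature vector with those of its neighbors (no learned weights)."""
    updated = {}
    for node, neighbors in adjacency.items():
        group = [features[node]] + [features[n] for n in neighbors]
        dim = len(features[node])
        updated[node] = [sum(vec[d] for vec in group) / len(group)
                         for d in range(dim)]
    return updated


# Hypothetical 4-node graph, undirected, stored as adjacency lists.
adjacency = {1: [2, 3], 2: [1], 3: [1, 4], 4: [3]}
# Hypothetical per-node features, e.g. [fan-out, normalized drive strength].
features = {1: [2.0, 1.0], 2: [1.0, 0.0], 3: [2.0, 0.5], 4: [1.0, 0.0]}

hidden = message_pass(adjacency, features)
```

Stacking several such rounds lets information from nodes that are multiple hops apart reach each other, which is why this family of models fits netlist data.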
Faster Analysis – IR Drop Estimation
IR drop estimation is important for physical design, but it takes hours
Use AI to predict IR drop from cell-level features
[Figure: power maps and coefficient ($\beta$) maps combine into a time-decomposed IR drop map]
Cell-level features: $P_i, P_s, P_l, P_r, P_{tot}, R$
$IR_{inst} = \beta_1 P_i + \beta_2 P_s + \beta_3 P_l + \beta_4 P_r + \beta_5 P_{tot} + \beta_6 R$
94% accuracy in 3 seconds vs. 3 hours in commercial tools
V.A. Chhabria et al, MAVIREC: ML-Aided Vectored IR-Drop Estimation and Classification
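The per-instance formula above is a weighted sum of six cell-level features. The sketch below evaluates it for one cell. All numbers are made up for illustration, and the function name `predict_ir_drop` is hypothetical; on the slide the coefficients come from predicted coefficient maps rather than fixed constants.

```python
FEATURES = ("P_i", "P_s", "P_l", "P_r", "P_tot", "R")


def predict_ir_drop(cell, betas):
    """Weighted sum of the six cell-level features for one instance."""
    return sum(betas[name] * cell[name] for name in FEATURES)


# Hypothetical coefficients (stand-ins for values read from the beta maps).
betas = {"P_i": 0.5, "P_s": 0.25, "P_l": 0.125, "P_r": 0.25, "P_tot": 0.5, "R": 2.0}
# Hypothetical feature values for a single cell instance.
cell = {"P_i": 2.0, "P_s": 4.0, "P_l": 0.0, "P_r": 4.0, "P_tot": 6.0, "R": 0.5}

ir = predict_ir_drop(cell, betas)  # 1.0 + 1.0 + 0.0 + 1.0 + 3.0 + 1.0 = 7.0
```

Because the final step is only a dot product per instance, the runtime is dominated by producing the coefficients, not by evaluating the formula.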
Cross-Stage Analysis – Parasitics Prediction
Layout parasitics impact schematic design
Use AI to predict layout parasitics from schematic
Convert schematic to graph and learn with GNN
[Figure: circuit schematics to heterogeneous graph conversion]
[Figure: predicted capacitance (F) vs. ground truth; MAE = 0.852 fF, MAPE = 15%]
Simulation error reduced to <10%
H. Ren et al, ParaGraph: Layout Parasitics and Device Parameter Prediction using Graph Neural Networks
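As a rough illustration of the schematic-to-graph step, the sketch below turns a tiny netlist into a heterogeneous graph with device nodes, net nodes, and edges typed by the device terminal. The input format, the node and edge type names, and `netlist_to_graph` are assumptions made for this example, not the representation used in the cited paper.

```python
from collections import defaultdict


def netlist_to_graph(devices):
    """Build a heterogeneous graph: device nodes and net nodes, with
    edges grouped by the device terminal that touches the net."""
    nodes = {}                 # node name -> node type
    edges = defaultdict(list)  # edge type -> list of (device, net)
    for name, (dev_type, terminals) in devices.items():
        nodes[name] = dev_type
        for terminal, net in terminals.items():
            nodes.setdefault(net, "net")
            edges[terminal].append((name, net))
    return nodes, dict(edges)


# Hypothetical CMOS inverter: one PMOS and one NMOS sharing "in" and "out".
devices = {
    "MP": ("pmos", {"gate": "in", "drain": "out", "source": "vdd"}),
    "MN": ("nmos", {"gate": "in", "drain": "out", "source": "vss"}),
}

nodes, edges = netlist_to_graph(devices)
```

A GNN trained on such a graph could then attach a predicted capacitance to each net node.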
AI Techniques: Bayesian Optimization
[Figure: optimization loop. Training: fit a model x → f(x) to the data points [x, f(x)], 1..n. Inference: use the model to pick new data x_{n+1} that maximizes f(x). Then compute f(x_{n+1}) and add it to the data points.]
Build a probabilistic model of the objective function from the sampled data and use it to select the most promising point to sample next
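The loop described above can be sketched in a few dozen lines. This is a toy under stated assumptions: a 1-D search space, a Gaussian process with an RBF kernel as the probabilistic model, an upper-confidence-bound rule to pick the next sample, and a made-up `objective` standing in for an expensive tool run. None of these choices is claimed to match any tool cited in this deck.

```python
import math


def rbf(a, b, length=0.3):
    return math.exp(-((a - b) ** 2) / (2 * length ** 2))


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def posterior(xs, ys, x_new, noise=1e-6):
    """Gaussian-process posterior mean and standard deviation at x_new."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    k = [rbf(a, x_new) for a in xs]
    mean = sum(ki * wi for ki, wi in zip(k, solve(K, ys)))
    var = rbf(x_new, x_new) - sum(ki * vi for ki, vi in zip(k, solve(K, k)))
    return mean, math.sqrt(max(var, 0.0))


def ucb(x, xs, ys, kappa=2.0):
    """Upper confidence bound: favors high predicted value or high uncertainty."""
    mean, std = posterior(xs, ys, x)
    return mean + kappa * std


def objective(x):
    # Hypothetical stand-in for an expensive evaluation; best value at x = 0.7.
    return -(x - 0.7) ** 2


candidates = [i / 50 for i in range(51)]
xs = [0.0, 1.0]                       # initial data points
ys = [objective(x) for x in xs]
for _ in range(8):
    x_next = max((c for c in candidates if c not in xs),
                 key=lambda c: ucb(c, xs, ys))
    xs.append(x_next)                 # sample the most promising point
    ys.append(objective(x_next))      # compute f(x_{n+1})

best_x = xs[ys.index(max(ys))]
```

Each iteration refits the model on all samples so far, so the expensive objective is only evaluated at points the model considers promising or uncertain.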
Parameter Optimization – Macro Placement
Macro placement quality is very important for physical design
Placement parameters have a huge impact on macro placement
Multi-objective Bayesian optimization: wirelength, congestion, density
Find better macro placement with open-source GPU accelerated placement tools
[Figure: macro placement results, baseline vs. best AutoDMP]
A. Agnesina et al, AutoDMP: Automated DREAMPlace-based Macro Placement
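With several objectives such as wirelength, congestion, and density, there is usually no single best parameter setting but a set of non-dominated trade-offs. The sketch below filters a handful of hypothetical results down to that Pareto front, assuming all three objectives are minimized. The run names and numbers are invented for illustration.

```python
def dominates(a, b):
    """True if a is no worse than b on every objective and strictly
    better on at least one (all objectives are minimized)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_front(results):
    """Keep only the runs that no other run dominates."""
    return {name: obj for name, obj in results.items()
            if not any(dominates(other, obj) for other in results.values())}


# Hypothetical (wirelength, congestion, density) per parameter setting.
results = {
    "run_a": (1.00, 0.80, 0.70),
    "run_b": (0.90, 0.85, 0.70),
    "run_c": (1.10, 0.90, 0.75),
    "run_d": (0.95, 0.70, 0.65),
}

front = pareto_front(results)
```

Here `run_d` beats `run_a` on all three objectives and `run_a` beats `run_c`, while `run_b` and `run_d` each win on a different objective, so both remain as candidate placements.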

AI Techniques: Reinforcement Learning
[Figure: RL agent interacting with an environment, illustrated with a chess game]
• State S_t: current board
• Action A_t: a move, e.g. Bf4d5
• Reward R_t: win/lose/piece
Mapping to design optimization: Objective → Reward, Variables → Action
Fix Design Rule Check (DRC)
Too many DRC rules to consider for cell layout
RL agent learns to fix DRC automatically
• State: current layout images
• Action: adding additional M0 grid to reduce DRCs
• Reward: DRC reduction
[Figure: local patterns over four steps. Step 0: DRC = 6; Step 1: DRC = 6, reward = 0; Step 2: DRC = 3, reward = 3; Step 3: DRC = 0, reward = 3]
Used in NVCell layout generator in production
H. Ren et al, NVCell: Standard Cell Layout in Advanced Technology Nodes with Reinforcement Learning
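The reward in this example is simply the reduction in DRC count from one step to the next. A minimal sketch using the step counts from the slide follows; the function name `drc_rewards` is hypothetical.

```python
def drc_rewards(drc_counts):
    """Reward at each step = DRC count before the action minus after it."""
    return [prev - cur for prev, cur in zip(drc_counts, drc_counts[1:])]


drc_counts = [6, 6, 3, 0]           # Step 0 to Step 3, as on the slide
rewards = drc_rewards(drc_counts)   # [0, 3, 3]
```

The rewards sum to the total number of violations removed, so maximizing cumulative reward is the same as driving the DRC count toward zero.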
Design Better Datapath
Datapath synthesis is important for GPUs
Optimize prefix adder structure with RL
[Figure: agent and environment loop. The agent's action moves the prefix graph from state S0 to S1, the environment synthesizes the circuits, and the reward is ∆(area, delay)]
• Deep Q-learning
• Circuit synthesis in the loop
• Action space: add/delete prefix graph nodes
PrefixRL achieves better results than well-known adder architectures
Roy et al, PrefixRL: Optimization of parallel prefix circuits using deep reinforcement learning
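One way to read the ∆(area, delay) reward together with Q-learning is sketched below: a weighted improvement in area and delay between two synthesized circuits, plugged into the standard one-step Q-learning target. The weighting scheme, the numbers, and the function names are assumptions for illustration and are not taken from the cited paper. In deep Q-learning, a neural network would supply the next-state Q-values.

```python
def reward(prev, nxt, w_area=0.5):
    """Weighted improvement in (area, delay); positive means the new
    circuit is better. Values are assumed to be normalized already."""
    d_area = prev["area"] - nxt["area"]
    d_delay = prev["delay"] - nxt["delay"]
    return w_area * d_area + (1.0 - w_area) * d_delay


def q_target(r, next_q_values, gamma=0.75):
    """One-step Q-learning target: r + gamma * max over next actions."""
    return r + gamma * max(next_q_values)


# Hypothetical synthesis results before and after one graph edit.
s0 = {"area": 100.0, "delay": 8.0}
s1 = {"area": 104.0, "delay": 6.0}

r = reward(s0, s1)                       # 0.5 * (-4.0) + 0.5 * 2.0 = -1.0
target = q_target(r, [0.0, 2.0, 1.0])    # -1.0 + 0.75 * 2.0 = 0.5
```

Changing `w_area` shifts the agent's preference between smaller and faster circuits.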
AI Techniques: Generative AI
Generate optimal design points
• Transformer: [Figure: encoder-decoder with a condition input, illustrated by translating "欢迎来Hot Chips" to "Welcome to Hot Chips"]
• VAE: representation learning for optimization. [Figure: input X → encoder → latent space, z ~ N(0,1) → decoder → output X']
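For the VAE, two standard ingredients tie the latent space to z ~ N(0,1): sampling through the reparameterization z = mu + sigma * eps, and a KL term that pulls the encoder's distribution toward the standard normal. The sketch below shows both in plain Python. The encoder outputs are made-up numbers and the function names are hypothetical.

```python
import math
import random


def sample_latent(mu, log_var, rng):
    """Reparameterization: z = mu + sigma * eps, with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """Closed-form KL(N(mu, sigma^2) || N(0, 1)), summed over dimensions."""
    return 0.5 * sum(m * m + math.exp(lv) - 1.0 - lv
                     for m, lv in zip(mu, log_var))


rng = random.Random(0)
mu, log_var = [0.5, -1.0], [0.0, 0.0]    # hypothetical encoder outputs
z = sample_latent(mu, log_var, rng)
kl = kl_to_standard_normal(mu, log_var)  # 0.5 * (0.25 + 1.0) = 0.625
```

The KL term is zero only when the encoder outputs mu = 0 and sigma = 1, which is what keeps the latent space compact enough to search.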
Generate Optimal Gate Size
Timing/power optimization such as gate sizing affects scalability of PD tools
Model a path of gates as a sequence, generate optimized gate sizes using Transformer
[Figure: a path of five gates from a sequential Q pin or primary input to a sequential D pin or primary output. S0..S4: gate sizes; T0..T4: gate features]
• Encoder input: T0, T1, T2, T3, T4
• Condition: power/delay tradeoff
• Decoder output: $\hat{S}_0, \hat{S}_1, \hat{S}_2, \hat{S}_3, \hat{S}_4$
100X – 1000X speedup compared to traditional optimization with similar PPA
S. Nath et al, TransSizer: A Novel Transformer-Based Fast Gate Sizer

Optimization – Accelerator Design
Irregular landscape of neural network accelerator design space
Optimize on the latent space (reduced dim, smooth) learned using VAE
6.8X sample efficiency and 5% better performance
Q. Huang et al, Learning A Continuous and Reconstructible Latent Space for Hardware Accelerator Design
AI Techniques: LLM
• LLM is good at: open question answering, closed question answering, coding, extraction, rewriting, classification, summarization, reasoning, …
• LLM is a generalist
• Leverage pre-trained models
Make LLM Learn to Do Chip Design
• In-Context Learning: Retrieval Augmented Generation (RAG)
• Parameter Training: M. Liu et al, ChipNeMo
• Agent: C.-T. Ho et al, VerilogCoder
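Retrieval augmented generation can be sketched as two steps: pick the stored passages most similar to the question, then place them in the prompt as context. The example below uses bag-of-words cosine similarity as a simple stand-in for a learned embedding model, and it stops at building the prompt rather than calling any LLM. The documents, the prompt layout, and all function names are illustrative assumptions.

```python
import math
from collections import Counter


def bag_of_words(text):
    return Counter(text.lower().split())


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(question, documents, k=1):
    """Return the k documents most similar to the question."""
    q = bag_of_words(question)
    ranked = sorted(documents, key=lambda d: cosine(q, bag_of_words(d)),
                    reverse=True)
    return ranked[:k]


def build_prompt(question, documents):
    """Put the retrieved passages in front of the question as context."""
    context = "\n".join(retrieve(question, documents))
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


# Hypothetical internal documentation snippets.
documents = [
    "The clock tree is built after placement and before routing.",
    "Static timing analysis reports setup and hold slack per path.",
    "DRC checks that the layout obeys the manufacturing rules.",
]

prompt = build_prompt("What does static timing analysis report?", documents)
```

Because the knowledge lives in the retrieved context rather than in the model weights, the document store can be updated without retraining.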
LLM for Chip Design – Cambrian Explosion
Papers in ISLAD 2024 (LLM-Aided Design): islad.org

Design Assistance – LLM
• Know-how
• Coding
• Task automation: analysis, optimization, debug
Coding Assistance – EDA Script Generation
Generate scripts for specific tasks (VLSI)
M. Liu et al, ChipNeMo: Domain-Adapted LLMs for Chip Design

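Know-how Assistance – Engineering Chat Bot
Answer questions about designs, infrastructures, tools, flows, HW domains, etc.
M. Liu et al, ChipNeMo: Domain-Adapted LLMs for Chip Design

Analysis Assistance – Bug Report Analysis
Summarize bug reports, predict task assignment
M. Liu et al, ChipNeMo: Domain-Adapted LLMs for Chip Design
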
Closing Thoughts
• BO and RL continue to drive better PPA for chip design.
• Generative AI trained on optimized data can speed up traditional optimizations by orders of magnitude.
• LLMs and agents can significantly improve chip design productivity by providing design assistance as chatbots and copilots and by automating more manual design tasks.
• Reliable and efficient inference infrastructure is important.
• Call for action: need more datasets and benchmarks: VerilogEval, FVEval, LLM4HWDesign, …
