2-HC2024.nvidia.MarkRen.Intro.v04


Introduction to AI for Chip Design

Haoxing (Mark) Ren, Director of Design Automation Research, NVIDIA


08/25/2024
AI for Design Performance and Productivity

Analysis: faster, predictive, cross-stage
Optimization: faster, more scalable, better results
Assistance: know-how, coding, task automation
AI for Chip Design Research @ NVIDIA
We build AI to build chips for AI!

Design Assistance: VerilogCoder (RTL), ChipNeMo (Engineering), ClusteringAgent (Cell), FVAgent (FV), VerilogEval (RTL), RTLFixer (RTL), FVEval (FV), OPCAgent (Lithography)
Design Optimization (Gen AI): VAESA (Arch), TransSizer (PD), BufFormer (PD), Clustering (Cell), TAG (Analog), Dream-GAN (PD), ILILT (Lithography)
Design Optimization (RL, BO): NVCell-RL (Cell), PrefixRL (Synthesis), Graph Cluster (PD), AutoCRAFT RL (Analog), FIST (PD), ParaSize (Analog), AutoDMP (PD)
Design Analysis: ParaGraph (Parasitics), MAVIREC (IR Drop), DOINN (Lithography), PRIMAL (Power), GRANNITE (Switching Activity), HPGCN (Testability), PowerNet (IR Drop)

Timeline: 2019 to 2024
AI Techniques
• Analysis
  • Classical ML
  • Deep learning
• Optimization
  • Bayesian optimization
  • Reinforcement learning
• Optimization
  • Generative AI
• Assistance
  • LLM
AI Techniques – Classical ML
Examples: linear regression, support vector machine, decision tree, neural network
Suitable for small structured data
AI Techniques – Deep Learning
CNN: suitable for physical design data
GNN: suitable for circuit netlist data
Faster Analysis – IR Drop Estimation
IR drop estimation is important for physical design, but it takes hours
Use AI to predict IR drop from cell-level features

[Figure: power maps, coefficient (β) maps, and the time-decomposed IR drop map]

Cell-level features: $P_i, P_s, P_l, P_r, P_{tot}, R$

$IR_{inst} = \beta_1 P_i + \beta_2 P_s + \beta_3 P_l + \beta_4 P_r + \beta_5 P_{tot} + \beta_6 R$
94% accuracy in 3 seconds vs. 3 hours in commercial tools

V.A. Chhabria et al., MAVIREC: ML-Aided Vectored IR-Drop Estimation and Classification
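The per-instance formula on this slide is a weighted sum of six cell-level features. The sketch below evaluates it for a single cell. The feature values and β coefficients are made-up numbers for illustration only; in MAVIREC the coefficients come from the model's predicted β maps, which this sketch does not reproduce.

```python
# Order of the cell-level features used in the formula.
FEATURES = ("P_i", "P_s", "P_l", "P_r", "P_tot", "R")

def ir_drop(beta, feats):
    """IR_inst = beta_1*P_i + beta_2*P_s + ... + beta_6*R for one instance."""
    return sum(b * feats[name] for b, name in zip(beta, FEATURES))

# Hypothetical feature values for one cell instance (illustrative only).
cell = {"P_i": 0.20, "P_s": 0.10, "P_l": 0.05,
        "P_r": 0.30, "P_tot": 0.65, "R": 1.50}

# Hypothetical coefficients beta_1..beta_6 at this cell's location.
beta = (0.10, 0.20, 0.40, 0.05, 0.02, 0.01)

ir_inst = ir_drop(beta, cell)
print(f"estimated IR drop: {ir_inst:.3f}")
```

Because the estimate is a dot product per cell, evaluating it over a whole design is cheap once the coefficients are known, which fits the seconds-versus-hours comparison on the slide.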
Cross-Stage Analysis – Parasitics Prediction
Impact of layout parasitics on schematic design
Use AI to predict layout parasitics from schematic
Convert schematic to graph and learn with GNN
[Figure: circuit schematics to heterogeneous graph conversion]
[Figure: capacitance prediction (F) vs. ground truth; MAE = 0.852 fF, MAPE = 15%]
Simulation error reduced to <10%

H. Ren et al., ParaGraph: Layout Parasitics and Device Parameter Prediction using Graph Neural Networks
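As a rough illustration of the "convert schematic to graph" step, the sketch below turns a toy netlist into a heterogeneous graph in which devices and nets are both nodes and edges are typed by device type and terminal. The netlist format, the edge-typing scheme, and the `net_degree` feature are assumptions made for this example, not ParaGraph's actual encoding.

```python
from collections import defaultdict

# Hypothetical toy netlist: device name -> (device type, {terminal: net}).
netlist = {
    "M1": ("nmos", {"d": "out", "g": "in", "s": "gnd"}),
    "M2": ("pmos", {"d": "out", "g": "in", "s": "vdd"}),
    "C1": ("cap", {"p": "out", "n": "gnd"}),
}

def to_hetero_graph(netlist):
    """Return node types and typed edges for a device/net graph."""
    node_type = {}
    edges = defaultdict(list)  # (device_type, terminal) -> [(device, net)]
    for dev, (dev_type, pins) in netlist.items():
        node_type[dev] = dev_type
        for terminal, net in pins.items():
            node_type.setdefault(net, "net")
            edges[(dev_type, terminal)].append((dev, net))
    return node_type, dict(edges)

def net_degree(edges):
    """Count device terminals attached to each net (a simple node feature)."""
    degree = defaultdict(int)
    for pairs in edges.values():
        for _, net in pairs:
            degree[net] += 1
    return dict(degree)

node_type, edges = to_hetero_graph(netlist)
fanout = net_degree(edges)
print(fanout)
```

A GNN would then pass messages along these typed edges to predict a per-net quantity such as capacitance; that learning step is outside the scope of this sketch.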
AI Techniques – Bayesian Optimization
Build a probability model of the objective function from the data and use it to select the most promising point to sample next
[Figure: optimization loop. Training: fit a model x → f(x) to the data points [x, f(x)], 1..n. Inference: maximize f(x) under the model to pick new data x_{n+1}, then compute f(x_{n+1}).]
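The loop described on this slide (fit a probabilistic model to the sampled points, pick the most promising next point, evaluate it, repeat) can be sketched in a few lines. This is a minimal illustration under stated assumptions: a one-dimensional toy `objective`, a zero-mean Gaussian process with an RBF kernel as the probability model, and an upper-confidence-bound rule for choosing the next sample. None of these choices is taken from the talk.

```python
import math

def rbf(a, b, length=0.2):
    """Squared-exponential kernel between two scalars."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def gp_posterior(xs, ys, x_new, noise=1e-6):
    """Posterior mean and standard deviation of a zero-mean GP at x_new."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    k_star = [rbf(a, x_new) for a in xs]
    alpha = solve(K, ys)
    v = solve(K, k_star)
    mean = sum(k * a for k, a in zip(k_star, alpha))
    var = max(1.0 - sum(k * w for k, w in zip(k_star, v)), 0.0)
    return mean, math.sqrt(var)

def objective(x):
    """Toy stand-in for an expensive evaluation, e.g. one tool run."""
    return -(x - 0.7) ** 2

def bayes_opt(n_iter=10, kappa=2.0):
    xs = [0.0, 0.5, 1.0]                    # initial data points
    ys = [objective(x) for x in xs]
    grid = [i / 100 for i in range(101)]    # candidate points
    for _ in range(n_iter):
        def ucb(x):
            mean, std = gp_posterior(xs, ys, x)
            return mean + kappa * std       # favor high mean or high uncertainty
        x_next = max((x for x in grid if x not in xs), key=ucb)
        xs.append(x_next)                   # new data x_{n+1}
        ys.append(objective(x_next))        # compute f(x_{n+1})
    best = max(range(len(xs)), key=lambda i: ys[i])
    return xs[best], ys[best]

best_x, best_y = bayes_opt()
print(best_x, best_y)
```

The `kappa` parameter trades exploration against exploitation: larger values push sampling toward regions where the model is uncertain, smaller values toward regions where the predicted value is already high.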
Parameter Optimization – Macro Placement
Macro placement quality is very important for physical design
Placement parameters have a huge impact on macro placement
Multi-objective Bayesian optimization: wirelength, congestion, density
Find better macro placement with open-source, GPU-accelerated placement tools

[Figure: macro placement comparison: baseline vs. best AutoDMP]

A. Agnesina et al., AutoDMP: Automated DREAMPlace-based Macro Placement
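With several objectives there is usually no single best parameter setting, only a set of trade-offs. The sketch below filters candidate results down to the non-dominated (Pareto-optimal) ones. The numbers are hypothetical, all three objectives are assumed to be minimized, and this shows only the selection step, not AutoDMP's optimizer.

```python
def dominates(a, b):
    """True if a is no worse than b on every objective and better on at least one.
    All objectives are minimized."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def pareto_front(points):
    """Keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical (wirelength, congestion, density) results, normalized,
# for four placement-parameter settings.
results = [
    (1.00, 0.80, 0.70),
    (0.90, 0.85, 0.70),
    (1.10, 0.90, 0.75),   # worse than the first setting on all three
    (0.95, 0.70, 0.72),
]

front = pareto_front(results)
print(front)
```

Here the third setting is dropped because the first one is better on every objective; the remaining three are genuine trade-offs that a designer would choose among.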


AI Techniques – Reinforcement Learning
[Figure: loop between the RL agent and the environment, illustrated with a board game]
State S_t: current board
Action A_t: a move, e.g. Bf4d5
Reward R_t: win/lose/piece
Objective → Reward
Variables → Action
Fix Design Rule Check (DRC)
Too many DRC rules to consider for cell layout
RL agent learns to fix DRC automatically

State: current layout images
Action: adding additional M0 grid to reduce DRCs
Reward: DRC reduction
Step 0: DRC = 6
Step 1: DRC = 6, reward = 0
Step 2: DRC = 3, reward = 3
Step 3: DRC = 0, reward = 3
[Figure: layout images at each step; panel label "Local Patterns"]
Used in NVCell layout generator in production

H. Ren et al., NVCell: Standard Cell Layout in Advanced Technology Nodes with Reinforcement Learning
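The reward numbers in the step-by-step example follow directly from the DRC counts: each reward is the drop in DRC count from the previous step. A small sketch of that arithmetic, with a helper name (`drc_rewards`) chosen for this example:

```python
def drc_rewards(drc_counts):
    """Reward at each step = reduction in DRC count from the previous step."""
    return [prev - cur for prev, cur in zip(drc_counts, drc_counts[1:])]

# DRC counts at steps 0..3 from the example on this slide.
counts = [6, 6, 3, 0]

rewards = drc_rewards(counts)       # one reward per step after step 0
episode_return = sum(rewards)       # equals counts[0] - counts[-1]
print(rewards, episode_return)
```

Since the rewards telescope, the total return of an episode equals the number of DRCs removed overall, so maximizing return is the same as ending with as few DRCs as possible.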
Design Better Datapath
Datapath synthesis is important for GPUs
Optimize prefix adder structure with RL
[Figure: RL loop. The agent sends an action to the environment; the environment returns the state (S0, S1, ...: synthesized circuits) and the reward ∆(area, delay).]
Deep Q-learning with circuit synthesis in the loop
Action space: add/delete prefix graph nodes
PrefixRL achieves better results than well-known adder architectures

R. Roy et al., PrefixRL: Optimization of parallel prefix circuits using deep reinforcement learning
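To make the agent/environment loop concrete, here is a toy Q-learning sketch in which the state is the number of extra prefix-graph nodes, the actions add or delete one node, and the reward is the resulting cost reduction. The `COST` table is a made-up stand-in for synthesized area/delay, the Q function is a plain table rather than a deep network, and every state-action pair is swept in turn instead of being explored by an agent, to keep the example deterministic. It illustrates the Q-learning update only, not PrefixRL itself.

```python
# Hypothetical cost per state (stand-in for area/delay after synthesis);
# state n means "n extra prefix-graph nodes".
COST = [10.0, 8.0, 5.0, 6.0, 9.0, 12.0]
ACTIONS = (+1, -1)            # add or delete one node
GAMMA, ALPHA = 0.9, 0.5       # discount factor and learning rate

def step(state, action):
    """Apply an action; the reward is the cost reduction it achieves."""
    nxt = min(max(state + action, 0), len(COST) - 1)
    return nxt, COST[state] - COST[nxt]

def train(sweeps=200):
    """Tabular Q-learning updates, sweeping all state-action pairs."""
    q = {(s, a): 0.0 for s in range(len(COST)) for a in ACTIONS}
    for _ in range(sweeps):
        for s in range(len(COST)):
            for a in ACTIONS:
                nxt, reward = step(s, a)
                target = reward + GAMMA * max(q[(nxt, b)] for b in ACTIONS)
                q[(s, a)] += ALPHA * (target - q[(s, a)])
    return q

def greedy_rollout(q, state, n_steps):
    """Follow the highest-valued action from a start state."""
    path = [state]
    for _ in range(n_steps):
        action = max(ACTIONS, key=lambda a: q[(state, a)])
        state, _ = step(state, action)
        path.append(state)
    return path

q = train()
path = greedy_rollout(q, state=0, n_steps=2)
print(path)
```

In this toy setting the greedy policy walks from the empty structure toward the lowest-cost state. In a real flow each `step` would involve a synthesis run, which is why the slide stresses synthesis in the loop.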
AI Techniques – Generative AI
Generate optimal design points
Transformer: Encoder → condition → Decoder, e.g. translating "欢迎来Hot Chips" into "Welcome to Hot Chips"
VAE: input X → Encoder → latent space, z ~ N(0, 1) → Decoder → output X'
Representation learning for optimization
Generate Optimal Gate Size
Timing/power optimization such as gate sizing affects scalability of PD tools
Model a path of gates as a sequence, generate optimized gate sizes using Transformer

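[Figure: a path of gates from a sequential Q pin or primary input to a sequential D pin or primary output. T0..T4 are gate features (T); S0..S4 are gate sizes (S).]
[Figure: power/delay tradeoff]
$T_0, T_1, T_2, T_3, T_4$ → Encoder → condition → Decoder → $\hat{S}_0, \hat{S}_1, \hat{S}_2, \hat{S}_3, \hat{S}_4$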

100X – 1000X speedup compared to traditional optimization with similar PPA

S. Nath et al., TransSizer: A Novel Transformer-Based Fast Gate Sizer


Optimization – Accelerator Design
Irregular landscape of neural network accelerator design space
Optimize in the latent space (reduced dimension, smooth) learned with a VAE
6.8X sample efficiency and 5% better performance

Q. Huang et al., Learning A Continuous and Reconstructible Latent Space for Hardware Accelerator Design
AI Techniques – LLM
LLM is good at: open question answering, closed question answering, coding, extraction, rewriting, classification, summarization, reasoning, …
LLM is a generalist
Leverage pre-trained models
Make LLM Learn to Do Chip Design
In-context learning: Retrieval Augmented Generation (RAG)
Parameter training: M. Liu et al., ChipNeMo
Agent: C.-T. Ho et al., VerilogCoder
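Of the three approaches on this slide, retrieval augmented generation is the simplest to sketch: find the stored text most relevant to a question and place it in the prompt as context. The example below is a minimal illustration with assumptions throughout: the `DOCS` snippets are invented, the "embedding" is a bag-of-words count rather than a learned model, and the resulting prompt would still have to be sent to an LLM, which is not shown.

```python
import math
from collections import Counter

# Hypothetical snippets standing in for an internal knowledge base.
DOCS = [
    "Run the timing report script after placement to check setup slack.",
    "The regression flow launches simulation jobs on the compute farm.",
    "Use the lint tool to check Verilog coding style before check-in.",
]

def embed(text):
    """Bag-of-words vector; a real system would use a learned embedding model."""
    return Counter(text.lower().replace(".", "").replace("?", "").split())

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def retrieve(question, docs, k=1):
    """Return the k documents most similar to the question."""
    q = embed(question)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def build_prompt(question, docs):
    """Put the retrieved context in front of the question."""
    context = "\n".join(retrieve(question, docs))
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"

prompt = build_prompt("How do I check Verilog coding style?", DOCS)
print(prompt)
```

The model's weights are untouched in this approach; domain knowledge enters only through the retrieved context, which is what distinguishes it from the parameter-training column on the slide.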
LLM for Chip Design – Cambrian Explosion

Papers in ISLAD 2024 (LLM-Aided Design): islad.org


Design Assistance – LLM

Know-how
Coding
Task automation: analysis, optimization, debug
Coding Assistance – EDA Script Generation

Generate scripts for specific tasks (VLSI)

M. Liu et al., ChipNeMo: Domain-Adapted LLMs for Chip Design


Know-how Assistance – Engineering Chat Bot

Answer questions about designs, infrastructures, tools, flows, HW domains, etc.

M. Liu et al., ChipNeMo: Domain-Adapted LLMs for Chip Design


Analysis Assistance – Bug Report Analysis
Summarize bug reports and predict task assignment

M. Liu et al., ChipNeMo: Domain-Adapted LLMs for Chip Design


Closing Thoughts

• BO and RL continue to drive better PPA for chip design
• Generative AI trained on optimized data can speed up traditional optimizations by orders of magnitude
• LLMs and agents can significantly improve chip design productivity by providing design assistance as chatbots and copilots and by automating more manual design tasks
• Reliable and efficient inference infrastructure is important
• Call for action: we need more datasets and benchmarks, such as VerilogEval, FVEval, LLM4HWDesign, …
