AMD Gem5 APU Simulator Micro 2015 Final PDF
THE AMD gem5 APU SIMULATOR: MODELING HETEROGENEOUS SYSTEMS IN gem5
AMD RESEARCH
DECEMBER 6, 2015
Scope
Emphasis on the GPU side of the simulator
APU (CPU+GPU) model, not discrete GPU
Acknowledgement
AMD Research's gem5 Team
2 | THE AMD gem5 APU SIMULATOR | DECEMBER 6, 2015 | MICRO 2015 TUTORIAL
Modeling
an APU
systems
ACKNOWLEDGEMENTS
MANY CONTRIBUTORS OVER THE PAST 5+ YEARS
Alex Dutu
David Roberts
Kunal Korgaonkar
Nagesh Lakshminarayana
Ali Jafri
Derek Hower
Lisa Hsu
Nilay Vaish
Arka Basu
Dmitri Yudanov
Manish Arora
Onur Kayiran
Ayse Yilmazer
Marc Orr
Si Li
Binh Pham
Gagan Sachdev
Mario Mendez-Lojo
Sooraj Puthoor
Blake Hechtman
Jason Power
Mark Leather
Steve Reinhardt
Brad Beckmann
Joel Hestness
Mark Wilkening
Tanmay Gangwani
Brandon Potter
Jieming Yin
Martin Brown
Tim Rogers
Can Hankendi
John Alsop
Matt Poremba
Tony Gutierrez
James Wang
Joe Gross
Mike Chu
Tushar Krishna
David Hashe
Yatin Manerkar
Yasuko Eckert
QUICK SURVEY
How many of you are:
Graduate students?
Faculty members?
Working for government research labs?
Working for industry?
OUTLINE
Topic | Presenter | Time
Background | Brad | 8:45-9:05
 | Tony | 9:05-9:30
 | Tony | 9:30-10:00
Break | | 10:00-10:30
 | Brad | 10:30-11:00
Demo | Tony | 11:00-11:20
Comparisons/Limitations/Future Work | Brad | 11:20-11:45
Questions | Both | 11:45-12:00
BACKGROUND
Terminology and system overview
HSA Features
Coherent shared virtual memory
HSAIL: HSA Intermediate Language
GPU TERMINOLOGY
[Figure: GPU with four GPU cores (CUs), each with an L1D cache (TCP), sharing a GPU I-Cache (SQC) and an L2 cache (TCC).]
AMD terminology
CU: Compute Unit (SM in Nvidia terminology), TCP: Texture Cache per Pipe,
TCC: Texture Cache per Channel, SQC: Sequencer Cache
[Figure: APU block diagram. GPU: four GPU cores, each with an L1D cache, and an L2. CPU: CPU0 and CPU1 with a CPU I-Cache, L1D caches, and an L2. Both sides connect to the directory and the memory controller.]
[Figure: CPUs 1 to N with coherent system memory, connected over PCIe to a GPU with CUs 1 to M and its own GPU memory.]
Separate memory
Separate address space
No pointer-based data structures
PCIe: high latency, low bandwidth
Need lots of compute on GPU to amortize copy overhead
Very limited GPU memory capacity
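The point about amortizing copy overhead can be made concrete with a back-of-the-envelope model. The sketch below is illustrative only: the function names and every number in it are hypothetical, not measurements of any real system. It assumes that an offloaded run costs one copy in and one copy out (each a fixed link latency plus bytes divided by bandwidth) on top of the GPU compute time.

```python
def transfer_time_s(num_bytes, bandwidth_bytes_per_s, latency_s):
    """Time for one copy over the link: fixed latency plus bytes / bandwidth."""
    return latency_s + num_bytes / bandwidth_bytes_per_s


def offload_pays_off(cpu_time_s, gpu_time_s, bytes_in, bytes_out,
                     bandwidth_bytes_per_s, latency_s):
    """True when copy-in + GPU compute + copy-out is faster than the CPU run."""
    copy_s = (transfer_time_s(bytes_in, bandwidth_bytes_per_s, latency_s)
              + transfer_time_s(bytes_out, bandwidth_bytes_per_s, latency_s))
    return copy_s + gpu_time_s < cpu_time_s


# Hypothetical link: 8 GB/s, 10 us latency; 64 MiB copied each way.
LINK_BW = 8e9
LINK_LAT = 10e-6
SIZE = 64 * 1024 * 1024

# Short kernel: the GPU is 10x faster, but the copies dominate.
small = offload_pays_off(0.010, 0.001, SIZE, SIZE, LINK_BW, LINK_LAT)

# Long kernel: same data, enough compute to amortize the copies.
large = offload_pays_off(1.0, 0.1, SIZE, SIZE, LINK_BW, LINK_LAT)

print(small, large)
```

With these made-up numbers the two copies take about 17 ms in total, so only the longer kernel benefits from offloading.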
No explicit copying
Data move on demand
Pointer-based data structures shared across CPU & GPU
Pageable virtual addresses
No GPU capacity constraints
[Figure: HSA software stack]
Applications: OpenCL App, Java App, C++ App, Python App
Language runtimes and compilers: OpenCL Runtime, Java JVM (Sumatra), Continuum's Numba, Compiler
Compiler toolchain: CLANG/LLVM/HSAIL for C++, OpenMP, OpenACC, Python, OpenCL, etc.
HSA Helper Libraries
HSA Core Runtime: create queues, allocate memory, device discovery
HSA Runtime: signals, platform atomics
HSA Finalizer
Kernel Fusion Driver (KFD)
Specifications: HSA Platform System Architecture Specification, HSA Programmer's Reference Manual, HSA System Runtime Specification
Open source: http://hsafoundation.com and http://github.com/HSAFoundation
HSAIL
Legend
Included in first release
Work-in-progress / may be released
Longer term work
GPU Architecture and HSA Model
HSA term | CUDA term
NDRange | Grid
Workgroup | Thread block
Wavefront | Warp
Work-item | Thread
An NDRange runs on the GPU; a workgroup runs on one GPU core
NDRange: N-Dimensional (N = 1, 2, or 3) index space
Partitioned into workgroups, wavefronts, and work-items
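To illustrate how an NDRange is partitioned, here is a minimal sketch that counts the workgroups, wavefronts, and work-items for a given launch. The function name `partition_ndrange` and the example sizes are hypothetical; the wavefront size of 64 is an assumption that matches GCN-class hardware and is exposed as a parameter.

```python
import math

WAVEFRONT_SIZE = 64  # assumption: work-items per wavefront on GCN-class GPUs


def partition_ndrange(grid_size, workgroup_size, wavefront_size=WAVEFRONT_SIZE):
    """Count workgroups, wavefronts, and work-items in an NDRange.

    grid_size and workgroup_size are tuples with 1 to 3 entries, both
    measured in work-items per dimension.
    """
    if not 1 <= len(grid_size) <= 3 or len(grid_size) != len(workgroup_size):
        raise ValueError("expected matching 1-, 2-, or 3-dimensional sizes")

    # A partial workgroup at the edge of the grid still needs its own slot.
    workgroups_per_dim = tuple(
        math.ceil(g / w) for g, w in zip(grid_size, workgroup_size))
    num_workgroups = math.prod(workgroups_per_dim)

    items_per_workgroup = math.prod(workgroup_size)
    wavefronts_per_workgroup = math.ceil(items_per_workgroup / wavefront_size)

    return {
        "work_items": math.prod(grid_size),
        "workgroups": num_workgroups,
        "wavefronts_per_workgroup": wavefronts_per_workgroup,
        "wavefronts": num_workgroups * wavefronts_per_workgroup,
    }


# Hypothetical launch: a 512 x 512 grid split into 16 x 16 workgroups.
layout = partition_ndrange((512, 512), (16, 16))
print(layout)
```

In CUDA terms, the same launch would be described as a grid of 1024 thread blocks of 256 threads each.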
OVERVIEW OF gem5
Open-source modular platform for system architecture research
Integration of M5 (Univ. of Michigan) and GEMS (Univ. of Wisconsin)
Actively used in academia and industry
See http://www.gem5.org
[Figure: compilation flow for benchmarks]
Host code: GCC -> x86 host binary
CL kernels: CL compiler -> LLVM linker -> optimizer -> low-level compiler -> BRIG (HSAIL binary)
HSAIL builtins
Emulated OpenCL, OS, and driver ops.
APU extensions
EMULATED CL RUNTIME
Our implementation of OpenCL 2.0 runtime API
Simplifies OpenCL runtime for use with simulator
No OS kernel driver in SE mode, all driver calls captured by emulated driver
open()
Standard Unix system call for opening a device
Returns file descriptor for open device
ioctl()
(I/O control) standard Unix system call for sending commands to a device
Sends device-specific request codes, which are provided by the driver
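The open()/ioctl() pattern can be sketched as a small object that hands out a file descriptor and then answers device-specific request codes. This is a toy model only, not the simulator's driver: the class name, the device path, and both request codes are hypothetical.

```python
class EmulatedGpuDriver:
    """Toy stand-in for a device driver that lives inside a simulator."""

    # Hypothetical request codes; a real driver defines its own.
    IOCTL_GET_NUM_CUS = 0x1001
    IOCTL_GET_WAVEFRONT_SIZE = 0x1002

    def __init__(self, device_path, num_cus, wavefront_size):
        self.device_path = device_path
        self._answers = {
            self.IOCTL_GET_NUM_CUS: num_cus,
            self.IOCTL_GET_WAVEFRONT_SIZE: wavefront_size,
        }
        self._next_fd = 3  # 0-2 are conventionally stdin, stdout, stderr
        self._open_fds = set()

    def open(self, path):
        """Return a new file descriptor for the emulated device."""
        if path != self.device_path:
            raise FileNotFoundError(path)
        fd = self._next_fd
        self._next_fd += 1
        self._open_fds.add(fd)
        return fd

    def ioctl(self, fd, request):
        """Answer a device-specific request code for an open descriptor."""
        if fd not in self._open_fds:
            raise OSError("bad file descriptor: %d" % fd)
        if request not in self._answers:
            raise ValueError("unknown request code: %#x" % request)
        return self._answers[request]


# Hypothetical device with 4 compute units and 64-wide wavefronts.
driver = EmulatedGpuDriver("/dev/hypothetical-gpu", num_cus=4, wavefront_size=64)
fd = driver.open("/dev/hypothetical-gpu")
num_cus = driver.ioctl(fd, EmulatedGpuDriver.IOCTL_GET_NUM_CUS)
print(fd, num_cus)
```

Because the calls are captured by the emulated driver, the runtime can keep using the familiar Unix interface even though no OS kernel driver is present.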
[Figure: simulator overview. The x86 host binary runs on the x86 CPU models; the HSAIL BRIG loader feeds the CUs; emulated OpenCL, OS, and driver ops. sit between them; Ruby models the GPU cache hierarchy.]
Details
GCN GPU model
Flexible memory system
CPU-GPU communication via coherent shared virtual memory
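[Figure: kernel launch flow]
1. Host: clEnqueueNDRangeKernel() reaches the emulated driver via open()/ioctl()
2. BRIG loader and decoder: HSA kernel instructions are pre-decoded, then stored in a kernel object
3. Host: memcpy(HsaQueueEntry, KernelObject); the HsaQueueEntry is at Addr A in simulated memory
4. Host: *Doorbell = 0; // MEM[Addr B] = 0; the doorbell is at Addr B
5. Simulated memory: write(Addr B) reaches the gem5 dispatcher
6. Dispatcher: launches the kernel on the CUs (Shader)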
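The doorbell mechanism can be sketched as a memory model that notifies a dispatcher whenever a watched address is written. This is a simplified illustration, not the simulator's code: the class names, the `watch` method, and the concrete addresses standing in for Addr A and Addr B are all hypothetical.

```python
QUEUE_ENTRY_ADDR = 0xA000  # stands in for "Addr A" (HsaQueueEntry)
DOORBELL_ADDR = 0xB000     # stands in for "Addr B" (doorbell)


class Dispatcher:
    """Records every kernel object it is asked to launch."""

    def __init__(self):
        self.launched = []

    def launch(self, kernel_object):
        self.launched.append(kernel_object)


class SimulatedMemory:
    """Word-addressed store that can call back on writes to chosen addresses."""

    def __init__(self):
        self._data = {}
        self._watchers = {}

    def watch(self, addr, callback):
        self._watchers[addr] = callback

    def read(self, addr):
        return self._data.get(addr)

    def write(self, addr, value):
        self._data[addr] = value
        callback = self._watchers.get(addr)
        if callback is not None:
            callback(value)


def enqueue_ndrange_kernel(mem, kernel_object):
    """Host side of the launch: fill the queue entry, then ring the doorbell."""
    mem.write(QUEUE_ENTRY_ADDR, kernel_object)  # memcpy(HsaQueueEntry, KernelObject)
    mem.write(DOORBELL_ADDR, 0)                 # *Doorbell = 0


mem = SimulatedMemory()
dispatcher = Dispatcher()
# The doorbell write is what triggers the dispatcher to read the queue entry.
mem.watch(DOORBELL_ADDR, lambda _value: dispatcher.launch(mem.read(QUEUE_ENTRY_ADDR)))

enqueue_ndrange_kernel(mem, {"name": "vector_add", "grid": (1024,)})
print(dispatcher.launched)
```

The ordering matters: the queue entry is written before the doorbell, so the dispatcher always finds a complete kernel object when it is woken.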
GPU Dispatcher
[Figure: the dispatcher keeps an NDRangeMap from ID to NDRange, together with curTask. The workgroups of an NDRange, wg(0, 0, 0), wg(1, 0, 0), wg(0, 1, 0), and wg(1, 1, 0), each made up of work-items, are handed to the CUs.]
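The workgroup numbering in the figure can be reproduced with a short sketch that enumerates workgroup IDs with the x dimension varying fastest and then hands them to CUs. The round-robin assignment is an assumption made for illustration; a real dispatcher would also check whether a CU has the resources to accept a workgroup. Both function names are hypothetical.

```python
import itertools
import math


def workgroup_ids(grid_size, workgroup_size):
    """Yield (x, y, z) workgroup IDs of a 3-D NDRange, x varying fastest."""
    counts = [math.ceil(g / w) for g, w in zip(grid_size, workgroup_size)]
    for z, y, x in itertools.product(
            range(counts[2]), range(counts[1]), range(counts[0])):
        yield (x, y, z)


def dispatch_round_robin(wg_ids, num_cus):
    """Assign workgroups to CUs in turn (an illustrative policy only)."""
    assignment = {cu: [] for cu in range(num_cus)}
    for index, wg in enumerate(wg_ids):
        assignment[index % num_cus].append(wg)
    return assignment


# Hypothetical NDRange: 32 x 32 x 1 work-items in 16 x 16 x 1 workgroups.
ids = list(workgroup_ids((32, 32, 1), (16, 16, 1)))
assignment = dispatch_round_robin(ids, num_cus=2)
print(ids)
print(assignment)
```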
fetch_unit.hh/cc
fetch_stage.hh/cc
wavefront.hh/cc
brig_object.hh/cc
lds_state.hh/cc
vector_register_file.hh/cc
hsail_code.hh/cc
local_memory_pipeline.hh/cc
vector_register_state.hh/cc
arch/hsail/decoder.hh
global_memory_pipeline.hh/cc
arch/hsail/decoder.cc (auto-generated)
gpu_static_inst.hh/cc
compute_unit.hh/cc
shader.hh/cc
GPU core in the APU simulator modeled after Graphics Core Next (GCN) Architecture
More details available here: GCN Architecture Whitepaper www.amd.com/Documents/GCN_Architecture_whitepaper.pdf
[Figure: the APU simulator is split into GPU core modules (the GPU cores) and Ruby modules (the L1D caches and the L2).]
Shader (shader.[cc|hh]): Object containing all GPU cores along with other misc. components
[Figure: inside a GPU core. Instruction fetch; wavefront (WF) contexts 0-9, 10-19, 20-29, and 30-39; instruction decode; SIMD 0 to SIMD 3, each with vector registers and a vector ALU; L1D caches and L2 below.]
Shared resources
Fetch and decode
TCP (L1D)
Local data share (LDS)
Register file
Simple register allocation model available
Different register organizations and access arbitration policies possible using its API
[Figure labels: Functional, Timing; Less detailed, More detailed]
Pipeline stages
Fetch -> Scoreboard -> Schedule -> Execute -> Memory pipeline
[Figure labels: Ready WFs, Executing WFs]
Execute-in-execute philosophy
WF Contexts (wavefront.[hh|cc])
[Figure: wavefront contexts for one SIMD (Wavefront 0 through Wavefront 9), each with its own PC and instruction buffer (IB). A control-flow example shows an if/else over basic blocks (BB1) with a reconvergence point.]
Issuing restrictions:
[Figure: compute unit with per-wavefront PC and IB (Wavefront 0 through Wavefront 9), SIMD phases for SIMD 0 to SIMD 3, and the vector registers (vGPRs).]
VECTOR ALUs
16-lane vector pipeline per SIMD
Each lane has a set of functional units
One work-item per lane
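A consequence of the 16-lane pipeline is that one vector instruction takes several passes to cover a whole wavefront. The sketch below works this out, assuming a 64-work-item wavefront (as on GCN-class hardware) and that consecutive groups of work-items occupy the lanes on consecutive cycles; the function names are hypothetical and the simulator's actual timing is configurable.

```python
import math


def issue_cycles(wavefront_size, simd_lanes):
    """Cycles needed to push every work-item of a wavefront through the lanes."""
    return math.ceil(wavefront_size / simd_lanes)


def lane_schedule(wavefront_size, simd_lanes):
    """For each cycle, list the work-item IDs occupying lanes 0..simd_lanes-1."""
    return [
        list(range(start, min(start + simd_lanes, wavefront_size)))
        for start in range(0, wavefront_size, simd_lanes)
    ]


cycles = issue_cycles(wavefront_size=64, simd_lanes=16)
schedule = lane_schedule(wavefront_size=64, simd_lanes=16)
print(cycles)
print(schedule[0])
print(schedule[-1])
```

Under these assumptions a 64-wide wavefront occupies a SIMD for four cycles per vector instruction, with work-items 0-15 in the first cycle and 48-63 in the last.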
Create packet -> Global/LDS operation -> Write back
New machine ISAs can use this capability to support their own memory instructions
Individual stages contribute to the memory instruction timing
Additionally, memory-end timing is handled by Ruby and memory technology parameters
global_memory_pipeline.[hh|cc] and local_memory_pipeline.[hh|cc]
[Figure: TCP (L1D) access path with labels Coalesce, Tag, Data, Decompression, Read data, and Write data.]
GPU core model components: GPUStaticInst, GPUDynInst, GPUExecContext, HSACode
ISA-specific instruction classes and methods: HSAIL Static Inst, HSAIL Code, HSAIL Decoder, Operands, ISA State
GPUStaticInst: static instruction objects with no dynamic information
WF-related interfaces
GPUExecContext (gpu_exec_context.[hh|cc]): defines the API for accessing ISA state
PSEUDO-INSTRUCTION
Magic instructions for GPU kernels: researcher-defined functionality
Examples include
HSAIL instructions not exposed in high-level languages (e.g., cross-lane instructions)
Print statements and panic instruction within the GPU kernel
GDB break points
Source files
[gem5] src/gpu/arch/hsail/insts/decl.hh
[gem5] src/gpu/arch/hsail/insts/pseudo_inst.cc
RUBY BACKGROUND
Flexible Memory System
Rich configuration
Simulate combination of caches, coherence, interconnect, etc
Rapid prototyping
Domain-Specific Language (SLICC) for coherence protocols
Modular components
Detailed statistics
Latency distributions for requests
Generated state transitions, network utilization, etc.
SYNCHRONIZATION BACKGROUND
Traditional synchronization
Kernel Begin: All stores from CPU and prior kernel completions are visible.
Kernel End: All stores from a kernel are visible to CPU and future kernels.
Barrier: All members of a workgroup are at the same PC and all prior stores in program order will be visible.
Request coalescing
Hierarchical network topology configuration
[Figure: GPU_RfO protocol configuration. CPU side: CPU0 and CPU1 with CPU I-Cache, L1D caches, and L2.]
GPU L1I (SQC): GPU_RfO-SQC.sm
GPU L1D (TCP): GPU_RfO-TCP.sm
GPU L2 (TCC): GPU_RfO-TCC.sm
GPU L2 directory: GPU_RfO-TCCdir.sm
Directory and memory controller: MOESI_AMD_Base-dir.sm
[Figure: GPU_VIPER protocol configuration with a stateless directory. CPU side: CPU0 and CPU1 with L1I, L1D caches, and L2.]
GPU L1I (SQC): GPU_VIPER-SQC.sm
GPU L1D (TCP): GPU_VIPER-TCP.sm
GPU L2 (TCC): GPU_VIPER-TCC.sm
Stateless directory and memory controller: MOESI_AMD_Base-dir.sm
[Figure: GPU_VIPER protocol configuration with a probe filter. CPU side: CPU0 and CPU1 with L1I, L1D caches, and L2.]
GPU L1I (SQC): GPU_VIPER-SQC.sm
GPU L1D (TCP): GPU_VIPER-TCP.sm
GPU L2 (TCC): GPU_VIPER-TCC.sm
Probe filter and memory controller: MOESI_AMD_Base-probeFilter.sm
[Figure: GPU_VIPER region-based protocol configuration. CPU side: CPU0 and CPU1 with L1I, L1D caches, and L2.]
GPU L1I (SQC): GPU_VIPER-SQC.sm
GPU L1D (TCP): GPU_VIPER-TCP.sm
GPU L2 (TCC): GPU_VIPER-Region-TCC.sm
RegionBuffer (one per L2): MOESI_AMD_Base-RegionBuffer.sm
RegionDir: MOESI_AMD_Base-RegionDir.sm
Directory and memory controller: MOESI_AMD_Base-Region-dir.sm
[Figure: request path from a GPU core to its L1D]
GPU core: 64 M5 ports, one pkt per work-item request (byte address)
Ruby port (GPU Coalescer): packets are processed and buffered using higher-priority events at the beginning of the cycle
L1D: 1 mandatory queue, one RubyRequest per cache block (block-aligned address)
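The coalescing step, from per-work-item byte addresses to one request per cache block, can be sketched as follows. This is an illustration of the idea rather than the simulator's coalescer: the function name is hypothetical and the 64-byte block size is an assumption.

```python
from collections import OrderedDict


def coalesce(byte_addresses, block_size=64):
    """Group per-work-item byte addresses by cache block.

    Returns an ordered mapping from block-aligned address to the list of
    lanes (work-item indices) whose request falls in that block.
    """
    if block_size <= 0 or block_size & (block_size - 1):
        raise ValueError("block_size must be a power of two")
    requests = OrderedDict()
    for lane, addr in enumerate(byte_addresses):
        block_addr = addr & ~(block_size - 1)  # clear the offset bits
        requests.setdefault(block_addr, []).append(lane)
    return requests


# 64 work-items reading consecutive 4-byte words: 256 bytes, so 4 blocks.
unit_stride = coalesce([0x1000 + 4 * lane for lane in range(64)])

# 64 work-items each touching a different 64-byte block: no coalescing.
block_stride = coalesce([0x1000 + 64 * lane for lane in range(64)])

print([hex(a) for a in unit_stride], len(block_stride))
```

The access pattern therefore determines how many requests reach the cache: 4 for the unit-stride case versus 64 for the block-stride case in this example.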
[Figure: hierarchical topology. GPU cluster: GPU cores (src/gpu-compute) with L1D caches, GPU I-Cache, and L2. CPU cluster: CPU0 and CPU1 (src/cpu) with L1D caches, CPU I-Cache, and L2. Main cluster: directory, memory controller, and memory. Labels: L1, L2, NOC, Router.]
Multi2Sim
Supports multiple ISAs including AMD Southern Islands Machine ISA
Limited instruction support
No transient states in coherence protocol
This is very different from the gem5 NoMALI emulated GPU device
LIMITATIONS
No IOMMU
Primitive TLB model
Not full-system
The driver does not support different HSA memory segments
There is no support for flat addressing in the emulated cl-runtime
OBVIOUS IMPROVEMENTS
Detailed performance correlation
Validation of coherence protocols / memory models
4 new coherence protocols
[Figure: code organization. Directory labels: gem5, src, configs, gpu-compute, mem/protocol, mem/ruby, cl-runtime.]
SUMMARY
Covered a very high-level overview of:
Introduction to the gem5 APU simulator
Mapping between APU system and gem5 APU simulator
Topics discussed
HSA and GPU Background
Compilation and Simulation Flow
GPU Core modules
GPU memory system models in Ruby
Comparisons/Limitations/Improvements
Code organization
The information presented in this document is for informational purposes only and may contain technical inaccuracies, omissions and typographical errors.
The information contained herein is subject to change and may be rendered inaccurate for many reasons, including but not limited to product and roadmap
changes, component and motherboard version changes, new model and/or product releases, product differences between differing manufacturers, software
changes, BIOS flashes, firmware upgrades, or the like. AMD assumes no obligation to update or otherwise correct or revise this information. However, AMD
reserves the right to revise this information and to make changes from time to time to the content hereof without obligation of AMD to notify any person of
such revisions or changes.
AMD MAKES NO REPRESENTATIONS OR WARRANTIES WITH RESPECT TO THE CONTENTS HEREOF AND ASSUMES NO RESPONSIBILITY FOR ANY INACCURACIES,
ERRORS OR OMISSIONS THAT MAY APPEAR IN THIS INFORMATION.
AMD SPECIFICALLY DISCLAIMS ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR ANY PARTICULAR PURPOSE. IN NO EVENT WILL AMD BE
LIABLE TO ANY PERSON FOR ANY DIRECT, INDIRECT, SPECIAL OR OTHER CONSEQUENTIAL DAMAGES ARISING FROM THE USE OF ANY INFORMATION
CONTAINED HEREIN, EVEN IF AMD IS EXPRESSLY ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
ATTRIBUTION
© 2015 Advanced Micro Devices, Inc. All rights reserved. AMD, the AMD Arrow logo and combinations thereof are trademarks of Advanced Micro Devices,
Inc. in the United States and/or other jurisdictions. SPEC is a registered trademark of the Standard Performance Evaluation Corporation (SPEC). OpenCL is a
trademark of Apple Inc. used by permission by Khronos. Other names are for informational purposes only and may be trademarks of their respective owners.