AMD Athlon: Processor


AMD Athlon™ Processor

Technical Brief

Publication # 22054 Rev: D


Issue Date: December 1999
© 1999 Advanced Micro Devices, Inc. All rights reserved.

The contents of this document are provided in connection with Advanced
Micro Devices, Inc. (“AMD”) products. AMD makes no representations or
warranties with respect to the accuracy or completeness of the contents of
this publication and reserves the right to make changes to specifications and
product descriptions at any time without notice. No license, whether express,
implied, arising by estoppel or otherwise, to any intellectual property rights
is granted by this publication. Except as set forth in AMD’s Standard Terms
and Conditions of Sale, AMD assumes no liability whatsoever, and disclaims
any express or implied warranty, relating to its products including, but not
limited to, the implied warranty of merchantability, fitness for a particular
purpose, or infringement of any intellectual property right.

AMD’s products are not designed, intended, authorized or warranted for use
as components in systems intended for surgical implant into the body, or in
other applications intended to support or sustain life, or in any other
application in which the failure of AMD’s product could create a situation where
personal injury, death, or severe property or environmental damage may occur.
AMD reserves the right to discontinue or make changes to its products at any
time without notice.

Trademarks
AMD, the AMD logo, AMD Athlon, and combinations thereof, and 3DNow! are trademarks of Advanced Micro
Devices, Inc.

MMX is a trademark of Intel Corporation.

Digital and Alpha are trademarks of Digital Equipment Corporation.

Other product names used in this publication are for identification purposes only and may be trademarks of
their respective companies.
22054D/0—December 1999 AMD Athlon™ Processor Technical Brief

Revision History
Date            Rev   Description
August 1999     C     Initial public release.
December 1999   D     Added information about AMD's new 0.18-micron process
                      technology to “Process Technology” on page 7.


AMD Athlon™ Processor

Technical Brief

Introduction

The AMD Athlon™ processor powers the next generation in
computing platforms, delivering the ultimate performance for
cutting-edge applications and an unprecedented computing
experience.

The AMD Athlon™ processor is the first member of a new
family of seventh-generation AMD processors designed to meet
the computation-intensive requirements of cutting-edge
software applications running on high-performance desktop
systems, workstations, and servers. This technical brief
describes the features of the AMD Athlon processor’s
microarchitecture.

The AMD Athlon processor’s microarchitecture is designed to
support the growing processor and system bandwidth
requirements of emerging software, graphics, I/O, and memory
technologies. The AMD Athlon processor's high-speed
execution core includes multiple x86 instruction decoders, a
dual-ported 128-Kbyte split level-one (L1) cache, three
independent integer pipelines, three address calculation
pipelines, and the x86 industry's first superscalar, fully
pipelined, out-of-order, three-way floating-point engine. The
floating-point engine is capable of delivering 2.4 gigaflops
(Gflops) of single-precision and more than 1 Gflop of
double-precision floating-point results at 600 MHz for superior
performance on numerically complex applications.

The AMD Athlon processor’s microarchitecture includes:


■ The industry's first nine-issue, superpipelined, superscalar
x86 processor microarchitecture designed for high clock
frequencies
• Multiple x86 instruction decoders
• 72-entry instruction control unit
• Advanced dynamic branch prediction
• Three out-of-order, superscalar, fully pipelined
floating-point execution units, which execute all x87
(floating-point), MMX™ and 3DNow!™ instructions
• Three out-of-order, superscalar, pipelined integer units
• Three out-of-order, superscalar, pipelined address
calculation units
■ Enhanced 3DNow! technology with new instructions to
enable improved integer math calculations for speech or
video encoding and improved data movement for internet
plug-ins and other streaming applications
■ High-performance cache architecture featuring an
integrated 128-Kbyte L1 cache and a programmable,
high-speed backside L2 cache interface
■ 200-MHz AMD Athlon system bus (scalable beyond 400
MHz) enabling leading-edge system bandwidth for data
movement-intensive applications


AMD Athlon™ Processor Microarchitecture

The AMD Athlon processor is based on a seventh-generation
x86 microarchitecture that features a superpipelined,
nine-issue superscalar microarchitecture optimized for high
clock frequency. The AMD Athlon has a large dual-ported
128-Kbyte split-L1 cache (64-Kbyte instruction cache +
64-Kbyte data cache), a two-way, 2048-entry branch prediction
table, multiple parallel x86 instruction decoders, and multiple
integer and floating-point schedulers for independent
superscalar, out-of-order, speculative execution of instructions.
These elements are packed into an aggressive processing
pipeline that includes 10-stage integer and 15-stage
floating-point pipelines, which are illustrated in Figure 1.

[Block diagram. Blocks shown: 2-way, 64-Kbyte instruction cache with
24-entry L1 TLB/256-entry L2 TLB; predecode cache; branch prediction
table; fetch/decode control; 3-way x86 instruction decoders; instruction
control unit (72-entry); integer scheduler (18-entry); FPU stack
map/rename; FPU scheduler (36-entry); FPU register file (88-entry);
IEU0, AGU0, IEU1, AGU1, IEU2, AGU2; FADD, FMUL, and FSTORE (with MMX™
and 3DNow!™ labels); load/store queue unit; 2-way, 64-Kbyte data cache
with 32-entry L1 TLB/256-entry L2 TLB; bus interface unit; L2 cache
controller; system interface; L2 SRAMs.]

Figure 1. AMD Athlon™ Processor Block Diagram


Multiple Decoders
The AMD Athlon processor includes three full x86 instruction
decoders. These decoders translate x86 instructions into
fixed-length MacroOPs for higher instruction throughput and
increased processing power. Instead of executing x86
instructions, which have lengths of 1 to 15 bytes, the
AMD Athlon processor executes the fixed-length MacroOPs,
while maintaining the instruction coding efficiencies found in
x86 programs.

Instruction Control Unit


Once MacroOPs are decoded, up to three MacroOPs per cycle
are dispatched to the instruction control unit (ICU). The ICU is
a 72-entry MacroOP reorder buffer (ROB) that manages the
execution and retirement of all MacroOPs, performs register
renaming for operands, and controls any exception conditions
and instruction retirement operations. The ICU dispatches the
MacroOPs to the AMD Athlon processor’s multiple execution
unit schedulers.
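
The dispatch and retirement behavior described above can be sketched as a toy reorder buffer. This is a minimal illustration, not AMD's implementation: the 72-entry capacity and the three-MacroOP dispatch limit come from this brief, while the retirement width, the class name ReorderBuffer, and its method names are assumptions made for the example.

```python
from collections import deque

ROB_ENTRIES = 72      # ICU capacity stated in the brief
DISPATCH_WIDTH = 3    # up to three MacroOPs dispatched per cycle (from the brief)
RETIRE_WIDTH = 3      # assumption: the brief does not give a retirement width


class ReorderBuffer:
    """Toy in-order-dispatch, in-order-retire buffer (hypothetical model)."""

    def __init__(self, entries=ROB_ENTRIES):
        self.entries = entries
        self.queue = deque()   # oldest MacroOP sits at the left
        self.done = set()      # ops that have finished executing

    def dispatch(self, ops):
        """Accept up to DISPATCH_WIDTH ops while space remains; return the count."""
        accepted = 0
        for op in ops[:DISPATCH_WIDTH]:
            if len(self.queue) >= self.entries:
                break
            self.queue.append(op)
            accepted += 1
        return accepted

    def complete(self, op):
        """Mark an op as executed; completion may happen in any order."""
        self.done.add(op)

    def retire(self):
        """Retire finished ops from the head only, preserving program order."""
        retired = []
        while (self.queue and len(retired) < RETIRE_WIDTH
               and self.queue[0] in self.done):
            op = self.queue.popleft()
            self.done.discard(op)
            retired.append(op)
        return retired


rob = ReorderBuffer()
rob.dispatch(["op0", "op1", "op2"])
rob.complete("op2")      # the youngest op finishes first
first = rob.retire()     # nothing retires because op0 is still pending
rob.complete("op0")
rob.complete("op1")
second = rob.retire()    # all three retire in program order
```

The point of the sketch is that execution may complete out of order while retirement stays in program order, which is what lets a reorder buffer handle exceptions precisely.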

Execution Pipelines
The AMD Athlon processor contains an 18-entry
integer/address generation MacroOP scheduler and a 36-entry
floating-point unit (FPU)/multimedia scheduler. These
schedulers issue MacroOPs to the nine independent execution
pipelines — three for integer calculations, three for address
calculations, and three for execution of MMX, 3DNow!, and x87
floating-point instructions.
The AMD Athlon processor offers the most powerful,
architecturally advanced floating-point engine ever delivered
in an x86 microprocessor. The AMD Athlon processor's
three-issue, superscalar floating-point capability is based on
three pipelined, out-of-order floating-point execution units,
each with a one-cycle throughput. These three execution units
(FMUL, FADD, and FSTORE) execute all x87 (floating-point)
instructions, MMX instructions, and enhanced 3DNow!
instructions. Using a data format and single-instruction
multiple-data (SIMD) operations based on the MMX instruction
model, the AMD Athlon processor can deliver as many as four
32-bit, single-precision floating-point results per clock cycle,
resulting in a peak performance of 2.4 Gflops at 600 MHz.
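
The peak figure quoted above follows from simple arithmetic, shown here as a short worked example. The clock rate and the four results per clock come from this brief. The split into two pipelines with two packed values each is an assumption about how the four results arise, and is not stated in the text.

```python
CLOCK_HZ = 600e6          # 600 MHz, from the brief
RESULTS_PER_CLOCK = 4     # four single-precision results per clock, from the brief

# Assumption (not stated in the brief): the four results come from two
# pipelines that each produce a packed pair of 32-bit values per cycle.
ASSUMED_PIPES = 2
ASSUMED_LANES_PER_PIPE = 2

peak_flops = CLOCK_HZ * RESULTS_PER_CLOCK
peak_gflops = peak_flops / 1e9
print(f"{peak_gflops:.1f} Gflops")
```

This is a peak rate. Sustained throughput depends on instruction mix and memory behavior, which the calculation does not model.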

Branch Prediction
The AMD Athlon processor offers sophisticated dynamic
branch prediction logic to minimize or eliminate the delays due
to the branch instructions (jumps, calls, returns) common in x86
software. The processor includes the following:
■ Branch prediction table
■ Branch target address table
■ Return address stack

The AMD Athlon processor implements a two-way, 2048-entry
branch prediction table. The branch prediction table stores
prediction information that is used for predicting the direction
of conditional branches. The branch target address table stores
target addresses of conditional and unconditional branches.
The return address stack optimizes CALL/RET instruction pairs
by storing the return address of each CALL within a nested
series of subroutines and supplying a return address as the
predicted target address of the corresponding RET instruction.
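
The return address stack behavior described above can be sketched as follows. This is a generic illustration of the technique, not the processor's actual logic: the brief does not give the stack depth or the overflow policy, so the depth of 12 and the drop-oldest rule are assumptions, as are the class and method names.

```python
class ReturnAddressStack:
    """Toy return address stack; the depth is a hypothetical parameter."""

    def __init__(self, depth=12):
        self.depth = depth
        self.entries = []

    def on_call(self, return_address):
        """Push the address of the instruction that follows a CALL."""
        if len(self.entries) == self.depth:
            self.entries.pop(0)   # assumption: discard the oldest entry when full
        self.entries.append(return_address)

    def predict_ret(self):
        """Pop the predicted target for a RET, or None if the stack is empty."""
        return self.entries.pop() if self.entries else None


ras = ReturnAddressStack()
ras.on_call(0x1005)          # outer CALL
ras.on_call(0x2010)          # nested CALL
inner = ras.predict_ret()    # predicted target for the nested RET
outer = ras.predict_ret()    # predicted target for the outer RET
empty = ras.predict_ret()    # no prediction available
```

Because CALL and RET instructions nest, a last-in, first-out structure predicts each RET target from the most recent unmatched CALL.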

Enhanced 3DNow!™ Technology


The AMD Athlon processor includes enhanced 3DNow!
technology designed to take 3D multimedia performance to new
heights. The enhanced 3DNow! technology implemented in the
AMD Athlon includes AMD’s original twenty-one 3DNow!
instructions (the industry’s first x86 instruction set to use
superscalar SIMD floating-point techniques to accelerate 3D
performance), plus twenty-four new instructions, which
perform the following functions:
■ Twelve instructions that improve multimedia-enhanced
integer math calculations used in such applications as
speech recognition and video processing
■ Seven instructions that accelerate data movement for more
detailed graphics and functionality for internet browser
plug-ins and other streaming applications, enabling a richer
internet experience
■ Five digital signal processing (DSP) instructions that
enhance the performance of communications applications,
including soft modems, soft ADSL, MP3, and Dolby Digital
surround sound processing


In enhancing 3DNow! technology, AMD kept the instruction set
design simple, yet powerful. AMD’s plan in designing the new
3DNow! instructions was to provide powerful SIMD
performance while enabling ease of implementation for
software developers. The relatively few instructions of
enhanced 3DNow! technology allow developers to adopt this
technology and optimize their applications quickly.

Cache Architecture
The AMD Athlon processor’s high-performance cache
architecture includes an integrated, 64-bit, dual-ported
128-Kbyte split-L1 cache with separate snoop port, multi-level
translation lookaside buffers (TLBs), a scalable L2 cache
controller with a 72-bit (64-bit data + 8-bit ECC) interface to as
much as 8 Mbytes of industry-standard SDR or DDR SRAMs, and
an integrated tag for the most cost-effective 512-Kbyte L2
configurations.

The AMD Athlon processor’s integrated L1 cache comprises
two separate 64-Kbyte, two-way set-associative data and
instruction caches. The data cache has eight banks to support
concurrent access by two 64-bit loads or stores. The instruction
cache contains predecode data to assist multiple,
high-performance instruction decoders. The robust bi-level TLB
structure minimizes code and data delays when accessing
physical memory.
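
As a rough illustration of the L1 organization described above, the sketch below splits an address into tag, set index, and line offset for a 64-Kbyte, two-way set-associative cache, and models the eight data-cache banks. The cache size, associativity, and bank count come from this brief. The 64-byte line size, the 8-byte bank width, and the simple rule that two accesses can proceed together only if they fall in different banks are assumptions made for the example.

```python
CACHE_BYTES = 64 * 1024   # from the brief
WAYS = 2                  # from the brief
BANKS = 8                 # from the brief (data cache)
LINE_BYTES = 64           # assumption: the brief does not state the line size
BANK_WIDTH_BYTES = 8      # assumption: each bank serves one 64-bit access

NUM_SETS = CACHE_BYTES // (WAYS * LINE_BYTES)


def split_address(addr):
    """Return (tag, set_index, offset) under the assumed line size."""
    offset = addr % LINE_BYTES
    set_index = (addr // LINE_BYTES) % NUM_SETS
    tag = addr // (LINE_BYTES * NUM_SETS)
    return tag, set_index, offset


def bank_of(addr):
    """Return the bank an access falls in under the assumed interleaving."""
    return (addr // BANK_WIDTH_BYTES) % BANKS


def can_pair(addr_a, addr_b):
    """Simplified rule: two accesses pair only if they use different banks."""
    return bank_of(addr_a) != bank_of(addr_b)


tag, set_index, offset = split_address(0x12345)
```

Under these assumptions the cache has 512 sets, and two 64-bit accesses to adjacent 8-byte words land in different banks, so they could be serviced in the same cycle.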

The AMD Athlon processor’s L2 cache controller operates at a
programmable frequency for compatibility with a variety of
industry-standard SRAMs including DDR. The integrated L2
cache tag provides a full tag for a 512-Kbyte L2 cache or a
partial tag for larger L2 caches.

System Bus Interface


The 200-MHz AMD Athlon system bus interface, the fastest
bus implementation for x86 platforms, leverages the
high-performance Digital™ Alpha™ EV6 system interface
technology to significantly boost system performance and
provide ample headroom for today's and tomorrow's
applications. The AMD Athlon system bus provides advanced
features, such as source synchronous clocking for high-speed
200-MHz-to-400-MHz operation, point-to-point topology for
peak data bandwidth independent of the number of processors,
packet-based transfers for improved transaction pipelining,
large 64-byte burst data transfers, 8-bit ECC protection of data
and instructions, low-voltage signaling for high-performance,
low-cost motherboard implementations, and the ability to
address more than eight terabytes of physical memory.
The 200-MHz system bus implemented in the AMD Athlon
processor is capable of delivering a peak data transfer rate of
1.6 Gbytes per second — twice that of previous processor
generations. With its source synchronous clocking design, the
AMD Athlon processor's system bus is scalable to operate
beyond 400 MHz.
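
The bandwidth figures above can be checked with a short calculation. The 200-MHz rate, the 1.6-Gbyte-per-second peak, and the 64-byte burst size come from this brief. The bytes moved per transfer are inferred from those two figures rather than stated, and the 400-MHz projection assumes the same data width, which the brief does not confirm.

```python
PEAK_BYTES_PER_S = 1.6e9     # from the brief
TRANSFER_RATE_HZ = 200e6     # from the brief
BURST_BYTES = 64             # from the brief

# Inferred, not stated: bytes moved per transfer at the quoted peak rate.
bytes_per_transfer = PEAK_BYTES_PER_S / TRANSFER_RATE_HZ

# Transfers needed to move one 64-byte burst at that width.
transfers_per_burst = BURST_BYTES / bytes_per_transfer


def peak_bandwidth(rate_hz, width_bytes=bytes_per_transfer):
    """Peak bytes per second, assuming one transfer of width_bytes per cycle."""
    return rate_hz * width_bytes


projected_400 = peak_bandwidth(400e6)   # assumes an unchanged data width
```

Under these assumptions each transfer carries 8 bytes, a 64-byte burst takes eight transfers, and doubling the rate to 400 MHz would double the peak to 3.2 Gbytes per second.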

Process Technology
The AMD Athlon processor is manufactured on AMD's six-layer
metal, 0.25-micron process technology and AMD's new
0.18-micron process technology. In 0.25-micron technology, the
approximately 22-million-transistor AMD Athlon processor has
a die size of 184 mm². In 0.18-micron technology, the
AMD Athlon processor has a die size of 102 mm². The
AMD Athlon processor is included in a cost-effective,
industry-standard module form factor — Slot A, which is
mechanically compatible with the existing Slot 1 infrastructure,
and therefore, leverages commonly available chassis, power
supply, and thermal solutions.
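
For context on the two die sizes quoted above, the following worked example compares the reported area reduction with an idealized linear shrink. The die areas and feature sizes come from this brief. The ideal-scaling model, in which area scales with the square of the feature size, is a simplifying assumption and not a claim about AMD's process.

```python
AREA_025_MM2 = 184.0   # from the brief, 0.25-micron process
AREA_018_MM2 = 102.0   # from the brief, 0.18-micron process

# Ratio of the reported die areas.
reported_ratio = AREA_018_MM2 / AREA_025_MM2

# Assumption: an ideal shrink scales area by the square of the feature size.
ideal_ratio = (0.18 / 0.25) ** 2
ideal_area_mm2 = AREA_025_MM2 * ideal_ratio

print(f"reported ratio: {reported_ratio:.3f}")
print(f"ideal ratio:    {ideal_ratio:.3f}")
print(f"ideal area:     {ideal_area_mm2:.1f} mm^2")
```

The reported ratio (about 0.55) is close to the idealized figure (about 0.52), which is consistent with a die that shrinks largely, though not perfectly, with the process geometry.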

Summary
The AMD Athlon processor's seventh-generation
microarchitecture and high-bandwidth system bus enable it to
attain performance levels never before achieved by an x86
processor. The AMD Athlon significantly outperforms
previous-generation x86 processors and delivers the highest
integer, floating-point, and 3D multimedia performance
available for x86 platforms, as measured by industry-standard
benchmarks.
The AMD Athlon provides industry-leading processing power
for cutting-edge software applications, including digital
content creation, digital photo editing, digital video, image
compression, video encoding for streaming over the internet,
soft DVD, commercial 3D modeling, workstation-class
computer-aided design (CAD), commercial desktop publishing,
and speech recognition.