Itanium Hardware Developer's Manual
August 2001
Figures
1-1 Intel® Itanium™ Processor 4 MB Cartridge Block Diagram...............................1-1
1-2 Intel® Itanium™ Processor 2 MB Cartridge Block Diagram...............................1-2
2-1 Two Examples Illustrating Supported Parallelism ..............................................2-2
2-2 Itanium™ Processor Core Pipeline ....................................................................2-3
2-3 Itanium™ Processor Block Diagram ..................................................................2-4
2-4 FMAC Units Deliver 8 Flops/Clock.....................................................................2-6
2-5 Itanium™ Processor Memory Subsystem..........................................................2-8
2-6 IA-32 Compatibility Microarchitecture ..............................................................2-10
3-1 Common Clock Latched Protocol.......................................................................3-2
3-2 Source Synchronous Latched Protocol..............................................................3-3
5-1 BR[3:0]# Physical Interconnection with Four Symmetric Agents .......................5-5
6-1 Test Access Port Block Diagram........................................................................6-2
6-2 TAP Controller State Diagram............................................................................6-3
6-3 Intel® Itanium™ Processor 2MB Cartridge Scan Chain Order ..........................6-7
6-4 Intel® Itanium™ Processor 4MB Cartridge Scan Chain Order ..........................6-7
7-1 Front View of the Mechanical Volume Occupied by ITP Hardware ...................7-2
7-2 Side View of the Mechanical Volume Occupied by ITP Hardware.....................7-2
7-3 Debug Port Connector Pinout Bottom View .......................................................7-3
7-4 Front View of LAI Adapter Keepout Volume ......................................................7-5
7-5 Side View of LAI Adapter Keepout Volume........................................................7-5
7-6 Bottom View of LAI Adapter Keepout Volume ...................................................7-6
(Figures 1-1 and 1-2 diagrams: the 4 MB cartridge with four CSRAMs and the 2 MB cartridge with two CSRAMs, each connected to the processor's address, command, and response paths on the system bus.)
The System Abstraction Layer (SAL) consists of the platform-dependent firmware. SAL is the
Basic Input/Output System (BIOS) required to boot the operating system (OS). The Intel®
Itanium™ Architecture Software Developer's Manual, Vol. 2: System Architecture describes the
firmware interfaces in detail.
1.5 Terminology
In this document, a ‘#’ symbol after a signal name refers to an active low signal. This means that a
signal is in the active state (based on the name of the signal) when driven to a low level. For
example, when RESET# is low, a processor reset has been requested. When NMI is high, a non-
maskable interrupt has occurred. In the case of lines where the name does not imply an active state
but describes part of a binary sequence (such as address or data), the ‘#’ symbol implies that the
signal is inverted. For example, D[3:0] = ‘HLHL’ refers to a hex ‘A’, and D[3:0]# = ‘LHLH’ also
refers to a hex ‘A’ (H = high logic level, L = low logic level).
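The level-to-value rule above can be sketched in a few lines (a hypothetical helper, not part of any Intel tool):

```python
# Sketch of the '#' naming convention: for an inverted bus such as D[3:0]#,
# a Low voltage level carries a logical 1. Names here are illustrative.

def decode(levels: str, inverted: bool) -> int:
    """Decode a string of voltage levels (MSB first, 'H' or 'L') to an integer."""
    value = 0
    for level in levels:
        bit = 1 if level == "H" else 0
        if inverted:            # '#' signals: Low voltage means logical 1
            bit ^= 1
        value = (value << 1) | bit
    return value

# Both encodings below represent hex 'A', as in the text.
assert decode("HLHL", inverted=False) == 0xA   # D[3:0]  = 'HLHL'
assert decode("LHLH", inverted=True) == 0xA    # D[3:0]# = 'LHLH'
```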
The term ‘system bus’ refers to the interface between the processor, system core logic and other
bus agents. The system bus is a multiprocessing interface to processors, memory and I/O. The L3
cache does NOT connect to the system bus, and is not accessible by other agents on the system bus.
Cache coherency is maintained with other agents on the system bus through the MESI cache
protocol as supported by the HIT# and HITM# bus signals.
The term "Intel Itanium processor" refers to the cartridge package which interfaces to a host system
board through a PAC418 connector. Intel Itanium processors include a processor core, an L3 cache,
and various system management features. The Intel Itanium processor includes a thermal plate for
a cooling solution attachment.
1.6 References
The reader of this manual should also be familiar with material and concepts presented in the
following documents and tools:
• Intel® Itanium™ Architecture Software Developer’s Manual, Volumes 1-4 (Document Numbers:
245317, 245318, 245319, 245320)
• Intel® Itanium™ Processor at 800 MHz and 733 MHz Datasheet (Document Number:
249634)
• Itanium™ Processor Family Error Handling Guide (Document Number: 249278)
• Itanium™ Processor Microarchitecture Reference (Document Number: 245473)
• IEEE Standard Test Access Port and Boundary-Scan Architecture
• PAC418 VLIF Socket and Cartridge Ejector Design Specification
• PAC418 Cartridge/Power Pod Retention Mechanism and Triple Beam Design Guide
• Itanium™ Processor Heatsink Guidelines
Note: Contact your Intel representative for the latest revision of the documents without document
numbers.
Version Number  Description                                                       Date
001             Initial release of the Intel® Itanium™ Processor Hardware         May 2001
                Developer’s Manual.
002             Updated Section 1: Introduction.                                  August 2001
2.1 Overview
The Intel Itanium processor is the first implementation of the Itanium Instruction Set Architecture
(ISA). The processor employs EPIC (Explicitly Parallel Instruction Computing) design concepts
for a tighter coupling between hardware and software. In this design style, the interface between
hardware and software is designed to enable the software to exploit all available compile-time
information, and efficiently deliver this information to the hardware. It addresses several
fundamental performance bottlenecks in modern computers, such as memory latency, memory
address disambiguation, and control flow dependencies. The EPIC constructs provide powerful
architectural semantics, and enable the software to make global optimizations across a large
scheduling scope, thereby exposing available Instruction Level Parallelism (ILP) to the hardware.
The hardware takes advantage of this enhanced ILP, and provides abundant execution resources.
Additionally, it focuses on dynamic run-time optimizations to enable the compiled code schedule
to flow through at high throughput. This strategy increases the synergy between hardware and
software, and leads to higher overall performance.
The Itanium processor provides a 6-wide and 10-stage deep pipeline, running at 733 and 800 MHz.
This provides a combination of both abundant resources to exploit ILP as well as high frequency
for minimizing the latency of each instruction. The resources consist of 4 integer units, 4
multimedia units, 2 load/store units, 3 branch units, 2 extended-precision floating point units, and 2
additional single-precision floating point units. The hardware employs dynamic prefetch, branch
prediction, a register scoreboard, and non-blocking caches to optimize for compile-time
non-determinism. Three levels of on-package cache minimize overall memory latency. This includes a
4 MB L3 cache, accessed at core speed, providing over 12 GB/sec of data bandwidth. The system
bus is designed for glueless MP support for up to 4-processor systems, and can be used as an
effective building block for very large systems. The advanced FPU delivers over 3 GFLOPS of
numerics capability (6 GFLOPS for single-precision). The balanced core and memory subsystem
provide high performance for a wide range of applications ranging from commercial workloads to
high performance technical computing.
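As a back-of-envelope check of the GFLOPS figures above (assuming two extended-precision FMACs, each retiring one fused multiply-add, i.e. 2 flops, per clock, with the SIMD path doubling single-precision throughput):

```python
# Peak numerics arithmetic implied by the text; unit counts from this section.
clock_hz = 800e6          # 800 MHz part
fmacs = 2                 # extended-precision FMAC units
flops_per_fmac = 2        # one fused multiply-add = 2 flops

dp_gflops = clock_hz * fmacs * flops_per_fmac / 1e9
sp_gflops = dp_gflops * 2  # SIMD single-precision path doubles throughput

print(dp_gflops, sp_gflops)  # 3.2 6.4 -> "over 3 GFLOPS" / "6 GFLOPS" SP
```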
Figure 2-1 illustrates two examples demonstrating the level of parallel operation supported for
various workloads. For enterprise and commercial codes, the MII/MBB template combination in a
bundle pair provides 6 instructions or 8 parallel ops per clock (2 load/store, 2 general-purpose ALU
ops, 2 post-increment ALU ops, and 2 branch instructions). Alternatively, an MIB/MIB pair allows
the same mix of operations, but with 1 branch hint and 1 branch op, instead of 2 branch ops. For
scientific code, the use of the MFI template in each bundle enables 12 parallel Ops per clock
(loading 4 double-precision operands to the registers, executing 4 double-precision flops, 2 integer
ALU ops and 2 post-increment ALU ops). For digital content creation codes that use single
precision floating point, the SIMD features in the machine effectively provide the capability to
perform up to 20 parallel ops per clock (loading 8 single precision operands, executing 8 single
precision FLOPs, 2 integer ALUs, and 2 post-incrementing ALU operations).
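The per-clock op counts quoted above can be tallied directly:

```python
# Tally of the parallel-op counts for an MFI/MFI bundle pair, as itemized above.
scientific = {
    "DP operands loaded via 2 ldf-pair": 4,
    "DP flops": 4,
    "integer ALU ops": 2,
    "post-increment ALU ops": 2,
}
digital_content = {
    "SP operands loaded": 8,
    "SP flops": 8,
    "integer ALU ops": 2,
    "post-increment ALU ops": 2,
}
assert sum(scientific.values()) == 12       # scientific computing
assert sum(digital_content.values()) == 20  # digital content creation
```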
Figure 2-1. Two Examples Illustrating Supported Parallelism
(Diagram: an MFI | MFI bundle pair, 6 instructions providing 12 parallel ops/clock for scientific computing or 20 parallel ops/clock for digital content creation, via 2 ldf-pair loads of 4 DP (8 SP) operands, 4 DP (8 SP) FLOPs, 2 ALU ops, and 2 post-increment ALU ops; and an MII | MBB bundle pair, 6 instructions providing 8 parallel ops/clock for enterprise and internet applications.)
Figure 2-2 shows the ten core pipeline stages: IPG, FET, ROT, EXP, REN, WLD, REG, EXE, DET, WRB.
(Figure 2-3 diagram: the L1 instruction cache and ITLB, with branch prediction, fetch/pre-fetch engine, IA-32 decode and control, feed an 8-bundle decoupling buffer and nine issue ports (B, B, B, M, M, I, I, F, F); 128 integer and 128 FP registers with scoreboard, predicate, NaT, and exception tracking; branch units, integer and MM units, a dual-port L1 data cache with DTLB and ALAT, and floating-point and SIMD FMAC units; backed by the L3 cache and the bus controller.)
The 16KB, 4-way set-associative instruction cache is fully pipelined, and can deliver 32B of code
(two instruction bundles or 6 instructions) every clock. It is supported by a single-cycle 64-entry
Instruction TLB that is fully-associative and backed up by an on-chip hardware page walker. The
fetched code is fed into a decoupling buffer that can hold 8 bundles of code. During instruction
issue, instructions read from the decoupling buffers are sent to the instruction issue and rename
logic based on the availability of execution resources.
Resteer-1: Special single-cycle branch predictor (uses compiler programmed Target Address
Registers).
The decoupling buffer feeds the dispersal in a bundle granular fashion (up to 2 bundles or 6
instructions per cycle), with a fresh bundle being presented each time one is consumed. Dispersal
from the two bundles is instruction granular: the processor disperses as many instructions as can be
issued (up to 6), in left-to-right order. The dispersal algorithm is fast and simple, with instructions
being dispersed to the first available issue port, subject to two constraints: detection of instruction
independence and detection of resource oversubscription.
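The dispersal rule can be sketched as follows; the port names and per-type counts here are simplified assumptions for illustration, not the processor's actual port map:

```python
# Illustrative sketch of instruction-granular, left-to-right dispersal.
PORTS = {"M": 2, "I": 2, "F": 2, "B": 3}   # assumed issue ports per type

def disperse(instructions):
    """Disperse up to 6 instructions in order to the first available port.
    Dispersal stops at the first instruction that oversubscribes a resource."""
    free = dict(PORTS)
    issued = []
    for inst_type in instructions[:6]:
        if free.get(inst_type, 0) == 0:    # resource oversubscription: stop
            break
        free[inst_type] -= 1
        issued.append(inst_type)
    return issued

# An MII/MBB bundle pair disperses fully...
assert disperse(list("MIIMBB")) == list("MIIMBB")
# ...but a third M op in the group stalls dispersal at that instruction.
assert disperse(list("MMMIII")) == ["M", "M"]
```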
2.3 Execution
The Itanium processor has the following execution units: four integer, four multimedia, two
extended-precision floating-point and two additional single-precision floating-point, three branch,
and two load/store units. The processor implements 128 integer and 128 floating point registers,
and eight branch registers.
The integer engines support all non-packed integer arithmetic and logical operations. The
multimedia engines can treat 64-bit data as either 2 x 32-bit, 4 x 16-bit, or 8 x 8-bit packed data
types. Four integer or multimedia operations can be executed each cycle. The floating-point
engines support simultaneous multiply-add to provide performance for scientific computation.
The FPU has a 128-entry FP register file with eight read and four write ports supporting full
bandwidth operation. See Figure 2-4. Every cycle, the eight read ports can feed two extended-
precision FMACs (each with three operands) as well as two floating-point stores to memory. The
four write ports can accommodate two extended-precision results from the two MAC units and the
results from two load instructions each clock. To increase the effective write bandwidth into the
FPU from memory, the floating-point registers are divided into odd and even banks. This enables
the two physical ports dedicated to load returns to be used to write four values per clock to the
register file (two to each bank), using two ldf-pair instructions. The earliest cache level to feed the
FPU is the unified L2 cache. The latency of loads from this cache to the FPU is nine clock cycles.
For data beyond the L2 cache, the bandwidth to the L3 cache is two double-precision operations
per clock (one 64-byte line every four clock cycles).
(Figure 2-4 diagram: the 128-entry FP register file is split into even and odd banks; 6 x 82-bit read operands and 2 stores/clock feed the FMACs, with 2 x 82-bit load returns (2 x ldf-pair) writing back; the L2 cache supplies 4 double-precision ops/clock and the L3 cache 2 double-precision ops/clock.)
On a call return, the process is reversed to restore access to the state prior to the call. In cases
where the RSE has saved some of the callee’s registers, the processor stalls on return until the RSE
can restore the appropriate number of the callee’s registers. The Itanium processor implements the
forced lazy mode of the RSE.
2.4 Control
The control group of the Itanium processor is made up of the exception handler and the pipeline
control. The exception handler implements exception prioritizing. The pipeline control uses a
scoreboard to detect register source dependencies, and provides special support for control and data
speculation as well as predication. For control speculation, the hardware manages the creation and
propagation of deferred exception tokens (called NaT). For data speculation, the processor provides
an Advanced Load Address Table, or ALAT (see Section 2.5.5). Predication support includes the
management of 64 1-bit predicate registers as well as the effects of predicates on instruction
execution.
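A minimal sketch of NaT propagation and predication semantics, using illustrative Python stand-ins for registers and instructions:

```python
# Registers are modeled as (value, nat) pairs; names here are illustrative.

def spec_load(address, memory):
    """Control-speculative load: a faulting access defers the exception by
    returning a NaT token instead of raising immediately."""
    if address not in memory:
        return (0, True)               # deferred exception token (NaT)
    return (memory[address], False)

def add(a, b):
    """Any consumer of a NaT source propagates the NaT to its result."""
    return (a[0] + b[0], a[1] or b[1])

memory = {0x100: 7}
r1 = spec_load(0x100, memory)          # succeeds
r2 = spec_load(0x200, memory)          # would fault: result carries NaT
r3 = add(r1, r2)                       # NaT propagates to the result
assert r3 == (7, True)                 # a chk on r3 would branch to recovery

# Predication: a false predicate squashes the instruction's effect.
p6 = False
r4 = (5, False)
if p6:                                 # models "(p6) add r4 = r4, r1"
    r4 = add(r4, r1)
assert r4 == (5, False)                # r4 unchanged
```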
The L2 cache contains instructions and data accessed at the full clock speed of the processor. The
L2 cache can handle two requests per clock via banking if there are no conflict conditions. This
cache is unified, allowing it to service both instruction and data side requests from the L1 caches.
When a request to the L2 cache causes a miss, the request is automatically forwarded to the L3
cache.
The backside bus logic accesses the L3 cache through a 128-bit backside bus operating at the full
clock speed of the processor. L3 cache misses are automatically forwarded to main system memory
through the Itanium processor system bus.
(Figure 2-5 diagram: the L1I and L1D caches are backed by the unified L2 cache; bus logic connects the 64-bit frontside bus, and a 128-bit backside bus connects the L2 to the L3 cache.)
On the Itanium processor, a separate two-entry, 64-byte buffer (WCB) is used for WC accesses
exclusively. The processor will evict (flush) each buffer if the buffer is full or if specific ordering
constraints are met.
On the Itanium processor, a transaction is considered visible when it hits the L1D (if the instruction
is serviceable by L1D), the L2, or the L3, or when it has reached the visibility point on the system
bus.
Figure 3-1 illustrates the latched bus protocol as it appears on the bus. In subsequent descriptions,
the protocol is described as “B# is asserted in the clock after A# is observed asserted,” or “B# is
asserted two clocks after A# is asserted.” Note that A# is asserted in T1, but not observed asserted
until T2. A# has one full clock to propagate (indicated by the straight line with arrows) before it is
observed asserted. The receiving agent uses T2 to determine its response and asserts B# in T3. That
is, the receiving agent has one full clock cycle from the time it observes A# asserted (at the rising
edge of T2) to the time it computes its response (indicated by the curved line with the single arrow)
and drives this response at the rising edge of T3 on B#. Similarly, an agent observes A# asserted at
the rising edge of T2, and uses the full T2 clock to compute its response (indicated by the
lowermost curved arrow during T2). This response would be driven at the rising edge of T3 (not
shown in Figure 3-1) on {internal} signals. Although B# is driven at the rising edge of T3, it has
the full clock T3 to propagate. B# is observed asserted in T4.
(Figure 3-1 waveform, clocks T1-T5 on CLK/BCLKP/BCLKN: A# is asserted in T1 and latched in T2; the internal state changes in T2; B# is asserted in T3 and latched in T4.)
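The one-clock-to-propagate rule can be sketched as a simple event timeline (clock numbering follows the description above):

```python
# Common clock latched protocol: a signal driven in clock T is observed
# (latched) in T+1; the observer then uses one full clock to compute its
# response before driving it.

events = {}

def drive(signal, t):
    events.setdefault(t, []).append(f"assert {signal}")
    events.setdefault(t + 1, []).append(f"observe {signal}")  # one clock to propagate

drive("A#", 1)   # A# asserted in T1, observed in T2
drive("B#", 3)   # receiver computes during T2, drives B# in T3

assert events[2] == ["observe A#"]   # A# observed one clock after assertion
assert events[4] == ["observe B#"]   # B# asserted two clocks after A#
```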
Signals that are driven in the same clock by multiple system bus agents exhibit a “wired-OR glitch”
on the electrical low to electrical high transition. To account for this situation, these signal state
transitions are specified to have two clocks of settling time when deasserted before they can be
safely observed, as shown with B#. The bus signals that must meet this criterion are: BINIT#,
HIT#, HITM#, BNR#, TND#, and BERR#.
The source synchronous latched protocol operates the data bus at twice the “frequency” of the
common clock. Two source synchronous data transfers are driven onto the bus in the time it would
normally take to drive one common clock data transfer. At both the beginning and 50% points of
the bus clock period, drivers send new data. At both the 25% point and the 75% point of the bus
clock period, drivers send centered differential strobes. The receiver captures the data with the
strobes deterministically.
The driver pre-drives STBP# before driving data. It sends a rising and falling edge on STBP# and
STBN#, centered with data. The driver deasserts all strobes after the last data is sent. The receiver
captures valid data with the difference of both strobe signals, asynchronous to the common clock.
A signal synchronous to the common clock (DRDY#) indicates to the receiver that valid data has
been sent.
(Figure 3-2 waveform, clocks T1-T4: with DRDY# asserted, the driver sends D1-D4 on the data bus at the 2x rate with centered STBP#/STBN# strobes; the data and strobes arrive at the receiver a flight time later, where the receiver captures D1 from the strobes while the driver is already driving D2.)
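For a rough sense of what the 2x rate buys, the peak data bandwidth can be computed; the 133 MHz bus clock here is an assumption (this section does not state the frequency), while the 64-bit path follows from D[63:0]#:

```python
# Peak data bandwidth of the source synchronous latched protocol.
bus_clock_hz = 133.33e6    # assumed bus clock; not stated in this section
transfers_per_clock = 2    # data driven at 0% and 50% of the bus clock period
bytes_per_transfer = 8     # 64-bit data path, D[63:0]#

bandwidth = bus_clock_hz * transfers_per_clock * bytes_per_transfer
print(f"{bandwidth / 1e9:.2f} GB/s")   # prints 2.13 GB/s
```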
The BCLKP (Positive Phase Bus Clock) input signal is the positive phase of the system bus clock
differential pair. It is also referred to as CLK in some of the waveforms in this overview. It specifies
the bus frequency and clock period and is used in the signaling scheme. Each processor derives its
internal clock from CLK by multiplying the bus frequency by a multiplier determined at
configuration. See Chapter 5, “Configuration and Initialization” for further details.
The BCLKN (Negative Phase Bus Clock) input signal is the negative phase of the system bus clock
differential pair.
The RESET# input signal resets all system bus agents to known states and invalidates their internal
caches. Modified or dirty cache lines are NOT written back. After RESET# is deasserted, each
processor begins execution at the power-on reset vector.
The PWRGOOD (Power Good) input signal must be deasserted during power-up and be asserted
after RESET# is first asserted by the system.
BR[3:0]# are the physical pins of the processor. All processors assert only BR0#. BREQ[3:0]#
refers to the system bus arbitration signals among four processors. BR0# of each of the four
processors is connected to a unique BREQ[3:0]# signal.
Up to five agents can simultaneously arbitrate for the request bus, one to four symmetric agents (on
BREQ[3:0]#) and one priority agent (on BPRI#). Processors arbitrate as symmetric agents, while
the priority agent normally arbitrates on behalf of the I/O agents and memory agents. Owning the
request bus is a necessary pre-condition for initiating a transaction.
The symmetric agents arbitrate for the bus based on a round-robin rotating priority scheme. The
arbitration is fair and symmetric. A symmetric agent requests the bus by asserting its BREQn#
signal. Based on the values sampled on BREQ[3:0]#, and the last symmetric bus owner, all agents
simultaneously determine the next symmetric bus owner.
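The rotating-priority computation every agent performs can be sketched as follows (a simplified model; it ignores the priority agent and other protocol details):

```python
# Symmetric round-robin arbitration: after the last owner, the next owner is
# the first requesting agent in cyclic order. Every agent computes this
# identically from the sampled BREQ[3:0]# values and the last owner.

def next_owner(breq, last_owner):
    """breq: set of agent IDs (0-3) sampled asserted on BREQ[3:0]#."""
    for offset in range(1, 5):         # last owner itself has lowest priority
        agent = (last_owner + offset) % 4
        if agent in breq:
            return agent
    return None                        # no agent is requesting the bus

assert next_owner({0, 2}, last_owner=1) == 2   # 2 is next after 1 in rotation
assert next_owner({0, 2}, last_owner=2) == 0   # wraps around past agent 3
assert next_owner(set(), last_owner=3) is None
```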
The priority agent asks for the bus by asserting BPRI#. The assertion of BPRI# temporarily
overrides, but does not otherwise alter the symmetric arbitration scheme. When BPRI# is sampled
asserted, no symmetric agent issues another unlocked transaction until BPRI# is sampled
deasserted. The priority agent is always the next bus owner.
BNR# can be asserted by any bus agent to block further transactions from being issued to the
request bus. It is typically asserted when system resources, such as address or data buffers, are
about to become temporarily busy or filled and cannot accommodate another transaction. After bus
initialization, BNR# can be asserted to delay the first transaction until all bus agents are initialized.
The assertion of the LOCK# signal indicates that the symmetric agent is executing an atomic
sequence of transactions that must not be interrupted. A locked sequence cannot be interrupted by
another transaction regardless of the assertion of BREQ[3:0]# or BPRI#. LOCK# can be used to
implement memory-based semaphores. LOCK# is asserted from the start of the first transaction
through the end of the last transaction. When locking is disabled, the LOCK# signal will never be
asserted.
The assertion of ADS# defines the beginning of the transaction. The REQ[4:0]#, A[43:3]#,
AP[1:0]#, and RP# are valid in the clock that ADS# is asserted.
In the clock that ADS# is asserted, the A[43:3]# signals provide an active-low address as part of the
request. The low three bits of address are mapped into byte enable signals for 0 to 8 byte transfers.
AP1# covers the address signals A[43:24]#. AP0# covers the address signals A[23:3]#. A parity
signal on the system bus is correct if there are an even number of electrically low signals in the set
consisting of the covered signals plus the parity signal. Parity is computed using voltage levels,
regardless of whether the covered signals are active high or active low.
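The even-low-count parity rule can be sketched as follows (hypothetical helpers, operating on voltage levels rather than logic values, as the text specifies):

```python
# Bus parity rule: a parity signal is "correct" when the count of electrically
# low signals among the covered signals plus the parity signal itself is even.

def parity_ok(covered_levels: str, parity_level: str) -> bool:
    lows = (covered_levels + parity_level).count("L")
    return lows % 2 == 0

def drive_parity(covered_levels: str) -> str:
    """Drive the parity signal so the total low count comes out even."""
    return "L" if covered_levels.count("L") % 2 == 1 else "H"

levels = "HLLH"                    # e.g. four covered address signals
p = drive_parity(levels)           # two lows already, so drive 'H'
assert p == "H"
assert parity_ok(levels, p)
assert not parity_ok("HLLL", "H")  # odd number of lows: parity error
```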
The Request Parity (RP#) signal covers the request pins REQ[4:0]# and the address strobe, ADS#.
The TND# signal may be asserted by a bus agent to delay completion of a Purge Global Translation
Cache (PTCG) instruction, even after the PTCG transaction completes on the system bus. Software
will guarantee that only one PTCG instruction is being executed in the system.
The HIT# and HITM# signals are used to indicate that the line is valid or invalid in the snooping
agent, whether the line is in the modified (dirty) state in the caching agent, or whether the
transaction needs to be extended. The HIT# and HITM# signals are used to maintain cache
coherency at the system level.
If the memory agent observes HITM# active, it relinquishes responsibility for the data return and
becomes a target for the implicit cache line writeback. The memory agent must merge the cache
line being written back with any write data and update memory. The memory agent must also
provide the implicit writeback response for the transaction.
If HIT# and HITM# are sampled asserted together, it means that a caching agent is not ready to
indicate snoop status, and it needs to extend the transaction.
The DEFER# signal is deasserted to indicate that the transaction can be guaranteed in-order
completion. An agent asserting DEFER# ensures proper removal of the transaction from the In-order Queue
by generating the appropriate response.
The assertion of the GSEQ# signal allows the requesting agent to issue the next sequential
uncached write even though the transaction is not yet visible. By asserting the GSEQ# signal, the
platform also guarantees not to retry the transaction, and accepts responsibility for ensuring the
sequentiality of the transaction with respect to other uncached writes from the same agent.
Requests initiated in the Request Phase enter the In-order Queue, which is maintained by every
agent. The responding agent is responsible for completing the transaction at the top of the In-order
Queue. The responding agent is the agent addressed by the transaction.
For write transactions, TRDY# is asserted by the responding agent to indicate that it is ready to
accept write or writeback data. For write transactions with an implicit writeback, TRDY# is
asserted twice, first for the write data transfer and then for the implicit writeback data transfer.
The RSP# signal provides parity for RS[2:0]#. A parity signal on the system bus is correct if there
is an even number of low signals in the set consisting of the covered signals plus the parity signal.
Parity is computed using voltage levels, regardless of whether the covered signals are active high
or active low.
DRDY# indicates that valid data is on the bus and must be latched. The data bus owner asserts
DRDY# for each clock in which valid data is to be transferred. DRDY# can be deasserted to insert
wait states in the Data Phase.
DBSY# holds the data bus before the first DRDY# and between DRDY# assertions for a multiple
clock data transfer. DBSY# need not be asserted for single clock data transfers.
SBSY# holds the strobe bus before the first DRDY# and between DRDY# assertions for a multiple
clock data transfer. SBSY# must be asserted for all data transfers at the 2x transfer rate.
The D[63:0]# signals provide a 64-bit data path between agents. For partial transfers, the byte
enable signals BE[7:0]# determine which bytes of the data bus will contain valid data.
The DEP[7:0]# signals provide optional ECC (error correcting code) for D[63:0]#. DEP[7:0]#
provides valid ECC for the entire data bus on each clock, regardless of which bytes are enabled.
STBP[3:0]# and STBN[3:0]# (and DRDY#) are used to transfer data at the 2x transfer rate with the
source synchronous latched protocol. The agent driving the data transfer drives the strobes with the
corresponding data and ECC signals. The agent receiving the data transfer uses the strobes to
capture valid data. Each strobe is associated with 16 data bits and 2 ECC signals as shown in
Table 3-7.
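Table 3-7 is not reproduced in this section; the mapping below, in which each strobe pair covers a contiguous 16-bit slice of D[63:0]# and two DEP bits, is an illustrative assumption consistent with the counts stated above:

```python
# Hypothetical strobe-to-data association: 4 strobe pairs, each covering
# 16 data bits and 2 ECC bits. Table 3-7 is the authoritative mapping.

def strobe_coverage(strobe: int):
    assert 0 <= strobe <= 3
    data_bits = list(range(16 * strobe, 16 * strobe + 16))  # D[16n+15:16n]#
    ecc_bits = list(range(2 * strobe, 2 * strobe + 2))      # DEP[2n+1:2n]#
    return data_bits, ecc_bits

data, ecc = strobe_coverage(0)
assert data[0] == 0 and data[-1] == 15 and ecc == [0, 1]
# All four strobes together cover the full 64-bit bus and all 8 ECC bits.
assert sum(len(strobe_coverage(n)[0]) for n in range(4)) == 64
```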
IDS#: ID Strobe
ID[7:0]#: Transaction ID
IDS# is asserted to begin the deferred response. ID[7:0]# returns the ID of the deferred transaction
that was sent on DID[7:0]#. Please refer to Appendix A, “Signals Reference” for further details.
BINIT# is used to signal any bus condition that prevents reliable future operation of the bus.
BINIT# assertion can be enabled or disabled as part of the power-on configuration register (see
Chapter 5, “Configuration and Initialization”). If BINIT# assertion is disabled, BINIT# is never
asserted and the error recovery action is taken only by the processor detecting the error.
BINIT# sampling can be enabled or disabled at power-on reset. If BINIT# sampling is disabled,
BINIT# is ignored and no action is taken by the processor even if BINIT# is sampled asserted. If
BINIT# sampling is enabled and BINIT# is sampled asserted, all processor bus state machines are
reset. All agents reset their rotating ID for bus arbitration, and internal state information is lost.
Cache contents are not affected. BINIT# sampling and assertion must be enabled for proper
processor error recovery.
BERR# is used to signal any error condition caused by a bus transaction that will not impact the
reliable operation of the bus protocol (for example, memory data error or non-modified snoop
error). A bus error that causes the assertion of BERR# can be detected by the processor or by
another bus agent. BERR# assertion can be enabled or disabled at power-on reset. If BERR#
assertion is disabled, BERR# is never asserted. If BERR# assertion is enabled, the processor
supports two modes of operation, configurable at power-on (refer to Sections 5.2.6 and 5.2.7 for
further details). If BERR# sampling is disabled, BERR# assertion is ignored and no action is taken
by the processor. If BERR# sampling is enabled, and BERR# is sampled asserted, the processor
core is signaled with the machine check exception.
A machine check exception is taken for each BERR# assertion, configurable at power-on.
THERMTRIP# is the Thermal Trip signal. The Itanium processor protects itself from catastrophic
overheating by using an internal thermal sensor. This sensor is set well above the normal operating
temperature to ensure that there are no false trips. Data will be lost if the processor goes into
thermal trip. This is signaled to the system by the assertion of the THERMTRIP# pin. Once
asserted, the signal remains asserted until RESET# is asserted by the platform. There is no
hysteresis built into the thermal sensor itself; once the die temperature has dropped below the trip
level, a RESET# pulse will reset the processor. If the temperature has not dropped below the trip
level, the processor will continue to assert THERMTRIP# and remain stopped.
A thermal alert open-drain signal, indicated to the system by the THRMALERT# pin, brings the
ALERT# interrupt output from the thermal sensor located on the Itanium processor to the platform.
The signal is asserted when the measured temperature from the processor thermal diode equals or
exceeds the temperature threshold data programmed in the high-temp or low-temp registers on the
sensor. This signal can be used by the platform to implement thermal regulation features such as
generating an external interrupt to tell the operating system that the processor core is heating up.
INIT# triggers an unmaskable interrupt to the processor. Semantics required for platform
compatibility are supplied in the PAL firmware interrupt service routine. INIT# is usually used to
break into Hanging or Idle Processor states.
INIT# has another meaning during reset configuration. If INIT# is sampled asserted on the asserted
to deasserted transition of RESET#, then the processor executes its Built-In Self Test (BIST).
If TRISTATE# is sampled asserted on the asserted to deasserted transition of RESET#, then the
processor tristates all of its outputs. This function is used during board level testing.
PMI# is the platform management interrupt pin. It triggers the highest priority interrupt to the
processor. PMI# is usually used by the system to trigger system events that will be handled by
platform specific firmware.
LINT[1:0] are programmable local interrupt pins defined by the interrupt interface. These pins are
disabled after RESET#. LINT0 is typically software configured as INT, an 8259-compatible
maskable interrupt request signal. LINT1 is typically software configured as NMI, a non-maskable
interrupt.
LINT[1:0] are also used, along with the A20M# and IGNNE# signals, to determine the multiplier
for the internal clock frequency as described in Chapter 5, “Configuration and Initialization”.
The Itanium processor asserts FERR# when it detects an unmasked floating-point error. FERR# is
included for compatibility and is never asserted in the Itanium-based system environment.
When IGNNE# is asserted, the processor ignores numeric errors and continues to execute
noncontrol floating-point instructions in the presence of unmasked floating-point exceptions.
IGNNE# is included for compatibility and is never asserted in the Itanium-based system
environment.
If A20M# is asserted, the processor is interrupted. Semantics required for platform compatibility
are supplied in the PAL firmware interrupt service routine. A20M# is included for compatibility
and is never asserted in the Itanium-based system environment.
A20M# and IGNNE# are also used, along with the LINT[1:0] signals, to determine the multiplier
for the internal clock frequency as described in Chapter 5, “Configuration and Initialization”.
CPUPRES# can be used to detect the presence of an Itanium processor in a socket. A ground
(GND) level indicates that the part is installed while an open indicates no part is installed.
DRATE# configures the system bus data transfer rate. If asserted, the system bus operates at a 1x
data transfer rate. If deasserted, the system bus operates at a 2x data transfer rate. DRATE# must be
valid at the asserted to deasserted transition of RESET# and must not change its value while
RESET# is deasserted.
BPM[5:0]# are the Breakpoint and Performance Monitor signals. These signals can be configured
as outputs from the processor that indicate the status of breakpoints and programmable counters for
monitoring processor events. These signals can be configured as inputs to break program
execution.
TCK is used to clock activity on the five-signal Test Access Port (TAP). TDI is used to transfer
serial test data into the processor. TDO is used to transfer serial test data out of the processor. TMS
is used to control the sequence of TAP controller state changes. TRST# is used to asynchronously
initialize the TAP controller.
Three control signal groups are explicitly protected by individual parity bits RP#, RSP#, and
IP[1:0]#. Errors on most remaining bus signals can be detected indirectly due to a well-defined bus
protocol specification that enables detection of protocol violation errors. Errors on a few bus
signals cannot be detected.
An agent is not required to enable all data integrity features since each feature is individually
enabled through the power-on configuration register. See Chapter 5, “Configuration and
Initialization”.
Protection Signal   Protected Signals
RP#                 ADS#, REQ[4:0]#
AP0#                A[23:3]#
AP1#                A[43:24]#
RSP#                RS[2:0]#
IP0#                IDS#, IDa[7:0]#
IP1#                IDS#, IDb[7:2,0]#
DEP[7:0]#           D[63:0]#
• Address/Request Bus Signals: A parity error detected on AP[1:0]# or RP# is reported based
on the option defined by the power-on configuration.
• Address/Request parity disabled: The agent detecting the parity error ignores it and
continues normal operation. This option is normally used in power-on system initialization and
system diagnostics.
• Response Signals: A parity error detected on RSP# should be reported by the agent detecting
the error as a protocol error.
• Deferred Signals: A parity error detected on IP[1:0]# should be reported by the agent
detecting the error as a protocol error.
• Data Transfer Signals: The Itanium processor system bus can be configured with either no
data bus error checking or ECC. If ECC is selected, single-bit errors can be corrected and
double-bit errors and poisoned data can be detected. Corrected single-bit ECC errors are
continuable errors. Double-bit errors and poisoned data may be recoverable or non-
recoverable. The errors on read data being returned are treated by the requestor as local errors.
The errors on write or writeback data are treated by the target as recoverable errors.
On observing a hard-fail response, the initiator may treat it as a local or a global machine check.
A system may contain single or multiple Itanium processors with one to four processors on a single
system bus. All processors are connected to one system bus unless the description specifically
states otherwise.
The Itanium processor can also be configured with software configuration options. These options
can be changed by writing to a power-on configuration register that all bus agents must implement.
These options should be changed only after taking into account synchronization between multiple
Itanium processor system bus agents.
BREQ[3:0]# bus signals are connected to the four symmetric agents in a rotating manner as shown
in Table 5-2 and in Figure 5-1. Every symmetric agent has one I/O pin (BR0#) and three input only
pins (BR1#, BR2#, and BR3#).
[Figure 5-1: BR[3:0]# physical interconnection; each symmetric agent's BR0# I/O pin and BR[3:1]# input pins connect in rotating order to the BREQ[3:0]# bus signals driven between the agents and the system interface logic, with BPRI# from the priority agent and BREQ0# driven during reset.]
Table 5-4. Itanium™ Processor System Bus to Core Frequency Multiplier Configuration

Ratio of Bus Frequency to Core Frequency    LINT[1]    LINT[0]    IGNNE#    A20M#
Table 5-5 shows the architectural state initialized by the processor hardware and PAL firmware at
reset. All other architectural states are undefined at hardware reset. Refer to the Intel® Itanium™
Architecture Software Developer’s Manual for a detailed description of the registers.
Table 5-6 shows the processor state modified by INIT. Refer to the Intel® Itanium™ Architecture
Software Developer’s Manual for a detailed description of the registers.
Register (Field)                               Value                                  Description
Instruction Pointer (IP)                       Refer to the Intel® Itanium™           PALE_INIT entry point for the
                                               Architecture Software Developer's      Itanium™ processor.
                                               Manual for details.
Interruption Instruction Bundle Pointer (IIP)  Original value of IP                   Value of IP at the time of INIT.
Interruption Processor Status Register (IPSR)  Original value of PSR                  Value of PSR at the time of INIT.
Interruption Function State (IFS)              v=0                                    Invalidate IFS.
A simplified block diagram of the TAP is shown in Figure 6-1. The TAP logic consists of a finite
state machine controller, a serially-accessible instruction register, instruction decode logic and data
registers. The set of data registers includes those described in the 1149.1 standard (the bypass
register, device ID register, BIST result register, and boundary scan register).
For specific boundary scan chain information, refer to the Itanium processor BSDL file
available at developer.intel.com.
6.1 Interface
The TAP logic is accessed serially through five dedicated pins on the processor package:
• TCK: The TAP clock signal
• TMS: “Test mode select,” which controls the TAP finite state machine
• TDI: “Test data input,” which inputs test instructions and data serially
• TRST#:“Test reset,” for TAP logic reset
• TDO: “Test data output,” through which test output is read serially
TMS, TDI and TDO operate synchronously with TCK (which is independent of any other
processor clock). TRST# is an asynchronous input signal.
1. ANSI/IEEE Std. 1149.1-1990 (including IEEE Std. 1149.1a-1993), “IEEE Standard Test Access Port and Boundary Scan Architecture,” IEEE
Press, Piscataway NJ, 1993.
[Figure 6-1: TAP block diagram; TDI and TDO connect through the instruction register and the data registers (device identification, bypass, and RUNBIST registers), with control signals supplied by the TAP controller.]
[Figure 6-2: TAP controller state diagram; the value of TMS sampled on each rising edge of TCK selects the transitions among Test-Logic-Reset, Run-Test/Idle, and the DR-scan and IR-scan state columns (Select, Capture, Shift, Exit1, Pause, Exit2, Update).]
The following is a brief description of each of the states of the TAP controller state machine. Refer
to the IEEE 1149.1 standard for detailed descriptions of the states and their operation.
• Test-Logic-Reset: In this state, the test logic is disabled so that the processor operates
normally. In this state, the instruction in the Instruction Register is forced to IDCODE.
Regardless of the original state of the TAP Finite State Machine (TAPFSM), it always enters
Test-Logic-Reset when the TMS input is held asserted for at least five clocks. The controller
also enters this state immediately when the TRST# pin is asserted, and automatically upon
power-up. The TAPFSM cannot leave this state as long as the TRST# pin is held asserted.
• Run-Test/Idle: A controller state between scan operations. Once entered, the controller
remains in this state as long as TMS is held low. In this state, activity in selected test logic
occurs only in the presence of certain instructions. For instructions that do not cause functions
to execute in this state, all test data registers selected by the current instruction retain their
previous state.
• Select-IR-Scan: This is a temporary controller state in which all test data registers selected by
the current instruction retain their previous state.
• Capture-IR: In this state, the shift register contained in the Instruction Register loads a fixed
value (of which the two least significant bits are “01”) on the rising edge of TCK. The parallel,
latched output of the Instruction Register (current instruction) does not change in this state.
• Shift-IR: The shift register contained in the Instruction Register is connected between TDI
and TDO and is shifted one stage toward its serial output on each rising edge of TCK. The
output arrives at TDO on the falling edge of TCK. The current instruction does not change in
this state.
• Exit1-IR: This is a temporary state and the current instruction does not change in this state.
• Pause-IR: Allows shifting of the Instruction Register to be temporarily halted. The current
instruction does not change in this state.
• Exit2-IR: This is a temporary state and the current instruction does not change in this state.
• Update-IR: The instruction which has been shifted into the Instruction Register is latched into
the parallel output of the Instruction Register on the falling edge of TCK. Once the new
instruction has been latched, it remains the current instruction until the next Update-IR (or
until the TAPFSM is reset).
• Select-DR-Scan: This is a temporary controller state and all test data registers selected by the
current instruction retain their previous values.
• Capture-DR: In this state, data may be parallel-loaded into test data registers selected by the
current instruction on the rising edge of TCK. If a test data register selected by the current
instruction does not have a parallel input, or if capturing is not required for the selected test,
then the register retains its previous state.
• Shift-DR: The data register connected between TDI and TDO as a result of selection by the
current instruction is shifted one stage toward its serial output on each rising edge of TCK. The
output arrives at TDO on the falling edge of TCK. If the data register has a latched parallel
output then the latch value does not change while new data is being shifted in.
• Exit1-DR: This is a temporary state and all data registers selected by the current instruction
retain their previous values.
• Pause-DR: Allows shifting of the selected data register to be temporarily halted without
stopping TCK. All registers selected by the current instruction retain their previous values.
• Exit2-DR: This is a temporary state and all registers selected by the current instruction retain
their previous values.
• Update-DR: Some test data registers may be provided with latched parallel outputs to prevent
changes in the parallel output while data is being shifted in the associated shift register path in
response to certain instructions. Data is latched into the parallel output of these registers from
the shift-register path on the falling edge of TCK.
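The IR-scan and DR-scan transitions above can be summarized as a next-state function keyed on the value of TMS at each rising edge of TCK. The sketch below is a summary model using the IEEE 1149.1 state names from the text; it is an illustration, not Intel's implementation.

```c
/* Summary model of the IEEE 1149.1 TAP controller next-state function;
 * state names follow the descriptions in the text. */
typedef enum {
    TEST_LOGIC_RESET, RUN_TEST_IDLE,
    SELECT_DR_SCAN, CAPTURE_DR, SHIFT_DR, EXIT1_DR,
    PAUSE_DR, EXIT2_DR, UPDATE_DR,
    SELECT_IR_SCAN, CAPTURE_IR, SHIFT_IR, EXIT1_IR,
    PAUSE_IR, EXIT2_IR, UPDATE_IR
} tap_state;

static tap_state tap_next(tap_state s, int tms)
{
    switch (s) {
    case TEST_LOGIC_RESET: return tms ? TEST_LOGIC_RESET : RUN_TEST_IDLE;
    case RUN_TEST_IDLE:    return tms ? SELECT_DR_SCAN   : RUN_TEST_IDLE;
    case SELECT_DR_SCAN:   return tms ? SELECT_IR_SCAN   : CAPTURE_DR;
    case CAPTURE_DR:       return tms ? EXIT1_DR         : SHIFT_DR;
    case SHIFT_DR:         return tms ? EXIT1_DR         : SHIFT_DR;
    case EXIT1_DR:         return tms ? UPDATE_DR        : PAUSE_DR;
    case PAUSE_DR:         return tms ? EXIT2_DR         : PAUSE_DR;
    case EXIT2_DR:         return tms ? UPDATE_DR        : SHIFT_DR;
    case UPDATE_DR:        return tms ? SELECT_DR_SCAN   : RUN_TEST_IDLE;
    case SELECT_IR_SCAN:   return tms ? TEST_LOGIC_RESET : CAPTURE_IR;
    case CAPTURE_IR:       return tms ? EXIT1_IR         : SHIFT_IR;
    case SHIFT_IR:         return tms ? EXIT1_IR         : SHIFT_IR;
    case EXIT1_IR:         return tms ? UPDATE_IR        : PAUSE_IR;
    case PAUSE_IR:         return tms ? EXIT2_IR         : PAUSE_IR;
    case EXIT2_IR:         return tms ? UPDATE_IR        : SHIFT_IR;
    case UPDATE_IR:        return tms ? SELECT_DR_SCAN   : RUN_TEST_IDLE;
    }
    return TEST_LOGIC_RESET;  /* defensive default */
}
```

The model exhibits the property stated earlier: holding TMS asserted for five clocks reaches Test-Logic-Reset from any state.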
Bypass Register
The bypass register is a one-bit shift register that provides the minimal path length between TDI
and TDO. The bypass register is selected when no test operation is being performed by a
component on the board. The bypass register loads a logic zero at the start of a scan cycle.
RUNBIST Register
The runbist register is a one-bit register that is loaded with logic zero when BIST is successfully
completed. The register reports the result of the Itanium processor BIST.
Instruction Register
The instruction register contains a four-bit command field to indicate one of the following
instructions: BYPASS, EXTEST, SAMPLE/PRELOAD, IDCODE, RUNBIST, HIGHZ, and
CLAMP. The most significant bit of the Instruction register is connected to TDI and the least
significant bit is connected to TDO.
• CLAMP: This instruction selects the bypass register while the output buffers drive the data
contained in the boundary scan chain. This instruction protects the receivers from the values in
the boundary scan chain while data is being shifted out.
Figure 6-3 and Figure 6-4 illustrate the order of the scan chain of the Itanium processor cartridge of
varying L3 cache sizes.
Figure 6-3. Intel® Itanium™ Processor 2MB Cartridge Scan Chain Order
[Figure: scan chain order through the two CSRAMs and the processor core to TDO.]
Figure 6-4. Intel® Itanium™ Processor 4MB Cartridge Scan Chain Order
[Figure: scan chain order through the four CSRAMs and the processor core.]
The Itanium processor also supports a Logic Analyzer Interface (LAI) module to connect a logic
analyzer to signals on the board. Third party logic analyzer vendors offer a variety of products with
bus monitoring capability.
This chapter describes the ITP and the LAI, as well as a number of technical issues that must be
taken into account when including these tools in a debug strategy. Please note that simulation of
your design is required to ensure that signal integrity issues are avoided.
The debug port, which is connected to the system bus, is a combination of the system, TAP, and
execution signals. There are certain electrical, functional, and mechanical constraints on the debug
port which must be followed. The electrical constraint requires the debug port to operate at the
speed of the Itanium processor system bus and use the TAP signals at high speed. The functional
constraint requires the debug port to use the TAP system via a handshake and multiplexing scheme.
The mechanical constraint requires the ITP associated hardware to fit within a specified volume
(see Section 7.1.2).
Figure 7-1. Front View of the Mechanical Volume Occupied by ITP Hardware
Figure 7-2. Side View of the Mechanical Volume Occupied by ITP Hardware
To remove any possible confusion over the connectivity of the ITP debug port, the connector pinout is
shown below in Figure 7-3 along with a connectivity table, Table 7-1. The recommended DP
connector is manufactured by Berg* under part number 61641-303, a 25-pin through-hole mount
header (pin 26 removed for keying). A surface mount version of this connector is also available
from Berg under part number 61698-302TR. The through-hole mount version is recommended for
durability reasons. Full specifications of the connectors, including PCB layout guidelines, are
available from Berg Electronics.
Figure 7-3. Debug Port Connector Pinout Bottom View

Signal      Pin | Pin  Signal
            25  | 26   (removed for keying)
BPM5DR#     23  | 24   TDO
BCLKN       21  | 22   PWR
BCLKP       19  | 20
FBO         17  | 18   FBI
RESET#      15  | 16   TCK
BPM[5]#     13  | 14   TRST#
BPM[4]#     11  | 12   TMS
BPM[3]#      9  | 10   TDI
BPM[2]#      7  |  8
BPM[1]#      5  |  6   DBR#
BPM[0]#      3  |  4   DBA#
GND          1  |  2   GND
BCLK(P,N) Differential bus clock driven by the main system bus clock driver. Signals require
termination.
BPM[5:0]# Used by DP to force break at reset and to monitor events. Connected to all the Itanium™
processors and the chipset. Signals require termination.
BPM5DR# Signals assertion of pin BPM5# and is connected to that signal for on-board isolation.
DBA# Driven from the DP to indicate active TAP access of system or active event monitoring
while running; if not used can be left as a no connect.
DBR# This signals the target to initiate a reset. It is an open drain and should be pulled up on the
target. Connect to the system reset (not the same as RESET#).
FBI If using an additional clock buffer to drive TCK, use FBI as the clock source; otherwise it is
a no connect.
FBO This pin should be connected to the TCK signal of the Itanium Processor. Depending on
which topology is used, the specific connectivity will differ.
GND Ground.
PWR Connect to a 1.5 kΩ pull-up to VTT.
RESET# Input from target system to DP port indicating the system is completely inactive. This can
be connected to the main reset line shared between the processors and chipset.
TCK Connect to all devices in the scan chain and the debug port connector, unless using a TCK
clock buffer. If using a TCK clock buffer, a termination resistor is not required and the pin
should be left as a no connect.
TDI TAP data in. Connected to the TDI of the first device in scan chain.
TDO TAP data out. Connected to TDO of the last device in scan chain.
TMS TAP state management signal. Connected to all devices in the scan chain and the debug port
connector. May be individually buffered from the debug port connector.
TRST# TAP reset signal. Connected to all devices in scan chain and debug port connector. May
be individually buffered from debug port connector.
In the past, standard practice for cost reduction of system platforms was to first remove the debug
header to improve board throughput, and then eventually to remove the signal traces entirely to save
board space. This was an acceptable solution previously because an interposer-style debug connector
could be used on the processor to gain access to the TAP chain. It is no longer acceptable because,
in a multiprocessor system using the Itanium processor with its tight timing margins, there is no
capability to use an interposer debug connector. This necessitates, at an absolute minimum, leaving
the hardware (without the header) for the debug port on the system board. Without this, the ability
to debug a multiprocessor system through a TAP-compliant interface is very limited.
The Itanium processor system bus can be monitored with logic analyzer equipment. Due to the
complexity of Itanium multiprocessor systems, the LAI is critical in providing the ability to probe
and capture system bus signals using a logic analyzer for use in system debug and validation.
Therefore, the guidelines for the LAI keepout volume must be strictly adhered to in order for a
logic analyzer to interface with the target system.
An LAI probe adapter is installed between the socket and the Itanium processor cartridge. The LAI
adapter pins plug into the socket, while the Itanium processor cartridge pins plug into a socket on
the LAI adapter. Cabling that is part of the LAI probe adapter egresses the system to allow an
electrical connection between the Itanium processor and a logic analyzer or debug tool. The
maximum volume occupied by the LAI adapter, known as the keep-out volume, as well as the
cable egress restrictions, are illustrated in Figure 7-4, Figure 7-5, and Figure 7-6. Please contact the
logic analyzer vendor to get the actual keep-out volume for their respective LAI implementation.
Figure 7-4. Front View of LAI Adapter Keepout Volume
In addition to the system bus logic analyzer connection, consideration should be given to the other
buses in the system. For initial debug boards, logic analyzer connections should be provided for
monitoring the critical buses, including the LPC and PCI buses. If a PCI slot is present, logic
analyzer vendors provide a plug-in card for easy connectivity. Contact the logic analyzer vendor
for connector recommendations and part numbers.
This appendix provides an alphabetical listing of all Itanium processor system bus signals. The
tables at the end of this appendix summarize the signals by direction: output, input, and I/O.
For a complete pinout listing including processor specific pins, please refer to the Intel® Itanium™
Processor at 800 MHz and 733 MHz Datasheet.
On the active-to-inactive transition of RESET#, the processors sample the A[43:3]# pins to
determine their power-on configuration.
All observing bus agents that support the 4 GByte (32-bit) address space must respond to the
transaction only when ASZ[1:0]# equals 00. All observing bus agents that support the 64 GByte
(36-bit) address space must respond to the transaction when ASZ[1:0]# equals 00B or 01B. All
observing bus agents that support the 16 TByte (44-bit) address space must respond to the
transaction when ASZ[1:0]# equals 00B, 01B, or 10B.
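The ASZ[1:0]# observation rule above can be expressed as a small predicate. The sketch below is illustrative only; the function name and the address-width parameter are ours, not part of the bus specification.

```c
/* Sketch of the ASZ[1:0]# response rule described above.
 * supported_bits: the agent's supported physical address space
 * (32 = 4 GByte, 36 = 64 GByte, 44 = 16 TByte).
 * asz: the ASZ[1:0]# encoding of the observed transaction (0, 1, or 2).
 * Returns nonzero if the observing agent must respond. */
static int agent_responds(int supported_bits, unsigned asz)
{
    switch (supported_bits) {
    case 32: return asz == 0;   /* responds only to ASZ[1:0]# = 00B        */
    case 36: return asz <= 1;   /* responds to 00B and 01B                 */
    case 44: return asz <= 2;   /* responds to 00B, 01B, and 10B           */
    default: return 0;          /* unknown width: treat as non-responding  */
    }
}
```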
000 UC Uncacheable
100 WC Write Coalescing
101 WT Write-Through
110 WP Write-Protect
111 WB Writeback
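A decoder for the attribute encodings listed above might look like the following sketch. The function name is hypothetical, and encodings not shown in the table are treated as reserved here.

```c
/* Hypothetical decoder for the 3-bit memory attribute encoding
 * tabulated above. Encodings absent from the table return "reserved". */
static const char *attr_name(unsigned attr)
{
    switch (attr & 0x7) {
    case 0x0: return "UC";  /* 000: Uncacheable      */
    case 0x4: return "WC";  /* 100: Write Coalescing */
    case 0x5: return "WT";  /* 101: Write-Through    */
    case 0x6: return "WP";  /* 110: Write-Protect    */
    case 0x7: return "WB";  /* 111: Writeback        */
    default:  return "reserved";
    }
}
```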
BCLKP and BCLKN indirectly determine the internal clock frequency of the Itanium processor.
Each Itanium processor derives its internal clock by multiplying the BCLKP and BCLKN
frequency by a ratio that is defined and allowed by the power-on configuration.
For Special transactions ((REQa[4:0]# = 01000B) and (REQb[1:0]# = 01B)), the BE[7:0]# signals
carry special cycle encodings as defined in Table A-3. All other encodings are reserved.
For Deferred Reply transactions, BE[7:0]# signals are reserved. The Defer Phase transfer length is
always the same length as that specified in the Request Phase. A Bus Invalidate Line (BIL)
transaction is the only exception to this rule.
On the Itanium processor, a BIL transaction may return one cache line (64 Bytes); however, the
length of the data returned on a BIL transaction may change for future Itanium processor family
members.
If BINIT# observation is enabled during power-on configuration, and BINIT# is sampled asserted,
all bus state machines are reset. All agents reset their rotating IDs for bus arbitration to the same
state as that after reset, and internal count information is lost. The L2 and L3 caches are not
affected.
If BINIT# observation is disabled during power-on configuration, BINIT# is ignored by all bus
agents with the exception of the central agent. The central agent must handle the error in a manner
that is appropriate to the system architecture.
Since multiple agents might need to request a bus stall at the same time, BNR# is a wire-OR signal.
In order to avoid wire-OR glitches associated with simultaneous edge transitions driven by
multiple drivers, BNR# is asserted and sampled on specific clock edges.
Table A-4. BR0# (I/O), BR1#, BR2#, BR3# Signals Rotating Interconnect
Bus Signal Agent 0 Pins Agent 1 Pins Agent 2 Pins Agent 3 Pins
During power-up configuration, the central agent must assert the BR0# bus signal. All symmetric
agents sample their BR[3:0]# pins on asserted-to-deasserted transition of RESET#. The pin on
which the agent samples an asserted level determines its agent ID. All agents then configure their
pins to match the appropriate bus signal protocol as shown in Table A-5.
Pin Sampled Asserted   Agent ID
BR0#                   0
BR3#                   1
BR2#                   2
BR1#                   3
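The pin-to-agent-ID mapping above can be captured in a small lookup. This sketch assumes pins are simply numbered 0 through 3 for BR0# through BR3#; the function name is ours.

```c
/* Sketch of the agent-ID determination described above: at the
 * asserted-to-deasserted transition of RESET#, each symmetric agent
 * notes which of its BR[3:0]# pins sampled an asserted level and maps
 * that pin to its agent ID. asserted_pin: 0..3 for BR0#..BR3#. */
static int agent_id_from_br(int asserted_pin)
{
    /* Indexed by pin number: BR0# -> 0, BR1# -> 3, BR2# -> 2, BR3# -> 1 */
    static const int id_for_pin[4] = { 0, 3, 2, 1 };
    return id_for_pin[asserted_pin & 3];
}
```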
The symmetric agents support distributed arbitration based on a round-robin mechanism. The
rotating ID is an internal state used by all symmetric agents to track the agent with the lowest
priority at the next arbitration event. At power-on, the rotating ID is initialized to three, allowing
agent 0 to be the highest priority symmetric agent. After a new arbitration event, the rotating ID of
all symmetric agents is updated to the agent ID of the symmetric owner. This update gives the new
symmetric owner lowest priority in the next arbitration event.
A new arbitration event occurs either when a symmetric agent asserts its BREQn# on an Idle bus
(all BREQ[3:0]# previously deasserted), or the current symmetric owner deasserts BREQn# to
release the bus ownership to a new bus owner n. On a new arbitration event, all symmetric agents
simultaneously determine the new symmetric owner using BREQ[3:0]# and the rotating ID. The
symmetric owner can park on the bus (hold the bus) provided that no other symmetric agent is
requesting its use. The symmetric owner parks by keeping its BREQn# signal asserted. On
sampling BREQn# asserted by another symmetric agent, the symmetric owner deasserts BREQn#
as soon as possible to release the bus. A symmetric owner stops issuing new requests that are not
part of an existing locked operation on observing BPRI# asserted.
A symmetric agent can deassert BREQn# before it becomes a symmetric owner. A symmetric
agent can reassert BREQn# after keeping it deasserted for one clock.
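The arbitration-event rule above amounts to a priority scan that starts just after the current rotating ID. The following is a simplified software model of that selection, not the bus logic itself; the function name and array representation are ours.

```c
/* Simplified model of the symmetric round-robin selection described
 * above: starting from the agent after the rotating ID (which holds
 * lowest priority), the first agent with BREQn# asserted becomes the
 * new symmetric owner. breq[n] is 1 if agent n's BREQn# is asserted.
 * Returns the new owner's agent ID, or -1 if no agent is requesting. */
static int next_owner(const int breq[4], int rotating_id)
{
    for (int i = 1; i <= 4; i++) {
        int agent = (rotating_id + i) % 4;
        if (breq[agent])
            return agent;
    }
    return -1;
}
```

With the power-on rotating ID of three, agent 0 wins the first arbitration event whenever it is requesting, matching the text above.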
If BE7# is asserted, D[63:56]# transfers the most significant byte. The data driver asserts DRDY#
to indicate a valid data transfer.
The ECC error correcting code can detect and correct single-bit errors and detect double-bit or
nibble errors. Chapter 4, “Data Integrity”, provides more information about ECC.
DID[7:0]# is also transferred on Aa[23:16]# during the first clock of the Request Phase for
Deferred Reply transactions.
The deferred identifier defines the token supplied by the requesting agent. DID[7:4]# carry the
agent identifiers of the requesting agents (always valid) and DID[3:0]# carry a transaction
identifier associated with the request (valid only with DEN# asserted). This configuration limits the
bus specification to 16 bus masters with each one of the bus masters capable of making up to
sixteen requests. Table A-6 shows the DID encodings.
DID[7]# indicates the agent type. Symmetric agents use 0. Priority agents use 1. DID[5:4]#
indicates the agent ID. Symmetric agents use their arbitration ID. DID[6]# is reserved. DID[3:0]#
indicates the transaction ID for an agent. The transaction ID must be unique for all transactions
issued by an agent which have not reported their snoop results.
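The DID[7:0]# field layout described above can be illustrated with a small extraction helper. This is a sketch; the struct and function names are ours, and `did` simply holds the DID[7:0]# bits as an ordinary 8-bit value.

```c
#include <stdint.h>

/* Field extraction for the deferred identifier layout described above. */
struct did_fields {
    unsigned agent_type;  /* DID[7]#: 0 = symmetric agent, 1 = priority agent */
    unsigned agent_id;    /* DID[5:4]#: arbitration ID (DID[6]# is reserved)  */
    unsigned txn_id;      /* DID[3:0]#: valid only with DEN# asserted         */
};

static struct did_fields did_decode(uint8_t did)
{
    struct did_fields f;
    f.agent_type = (did >> 7) & 0x1;
    f.agent_id   = (did >> 4) & 0x3;
    f.txn_id     =  did       & 0xF;
    return f;
}
```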
The Deferred Reply agent transmits the DID[7:0]# (Ab[23:16]#) signals received during the
original transaction on the Aa[23:16]# signals during the Deferred Reply transaction. This process
enables the original requesting agent to make an identifier match and wake up the original request
that is awaiting completion.
IDa[7:0]# returns the ID of the deferred transaction which was sent on Ab[23:16]# (DID[7:0]#).
During active RESET#, each processor begins sampling the A20M#, IGNNE#, and LINT[1:0]
values to determine the ratio of core-clock frequency to bus-clock frequency. On the active-to-
inactive transition of RESET#, each processor latches these signals and freezes the frequency ratio
internally. System logic must then release these signals for normal operation.
The LINT0 pin must be software configured to be used either as the INT signal or another local
interrupt.
The LEN[1:0]# signals indicate the data transfer length requested by the requesting agent, as
shown in Table A-9. The LEN[1:0]#, HITM#, and RS[2:0]# signals together define the length of the
actual data transfer.
During active RESET#, each processor begins sampling the A20M#, IGNNE#, and LINT[1:0]
values to determine the ratio of core-clock frequency to bus-clock frequency. On the active-to-
inactive transition of RESET#, each processor latches these signals and freezes the frequency ratio
internally. System logic must then release these signals for normal operation.
For requests which would lock the system bus, the Itanium processor may optionally lock the
system bus, fault, or complete the request non-atomically.
When the priority agent asserts BPRI# to arbitrate for bus ownership, it waits until it observes
LOCK# deasserted. This enables the symmetric agents to retain bus ownership throughout the bus
locked operation and guarantee the atomicity of the locked transaction.
All receiving agents observe the REQ[4:0]# signals to determine the transaction type and
participate in the transaction as necessary, as shown in Table A-10.
Deferred Reply                     0 0 0 0 0  x x x x x
Reserved (Ignore)                  0 0 0 0 1  x x x x x
Interrupt Acknowledge              0 1 0 0 0  DSZ[1:0]# 0 0 0
Special Transactions               0 1 0 0 0  DSZ[1:0]# 0 0 1
Reserved (Central Agent Response)  0 1 0 0 0  DSZ[1:0]# 0 1 x
Reserved (Central Agent Response)  0 1 0 0 1  DSZ[1:0]# 0 x x
Interrupt                          0 1 0 0 1  DSZ[1:0]# 1 0 0
Purge TC                           0 1 0 0 1  DSZ[1:0]# 1 0 1
Reserved (Central Agent Response)  0 1 0 0 1  DSZ[1:0]# 1 1 x
I/O Read                           1 0 0 0 0  DSZ[1:0]# x x x
I/O Write                          1 0 0 0 1  DSZ[1:0]# x x x
Reserved (Ignore)                  1 1 0 0 x  DSZ[1:0]# x x x
A number of bus signals are sampled at the asserted-to-deasserted transition of RESET# for the
power-on configuration.
Unless its outputs are tristated during power-on configuration, the processor, after the
asserted-to-deasserted transition of RESET#, optionally executes its Built-In Self-Test (BIST) and
begins program execution at the reset vector.
A correct parity signal is high if an even number of covered signals are low and low if an odd
number of covered signals are low. This definition allows parity to be high when all covered
signals are high.
A correct parity signal is high if an even number of covered signals are low and low if an odd
number of covered signals are low. During the Idle state of RS[2:0]# (RS[2:0]# = 000), RSP# is also
high since it is not driven by any agent, guaranteeing correct parity.
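The parity rule quoted above (an even number of electrically low covered signals yields a high parity signal) can be illustrated as follows. Representing signal levels as a byte array is purely an assumption for illustration.

```c
#include <stdint.h>

/* Illustrative sketch of the bus parity rule: the parity signal is
 * electrically high when an even number of covered signals are
 * electrically low. levels[i] is 1 for a high level, 0 for a low
 * level; n is the number of covered signals.
 * Returns 1 (high) or 0 (low). */
static int parity_level(const uint8_t *levels, int n)
{
    int lows = 0;
    for (int i = 0; i < n; i++)
        if (levels[i] == 0)
            lows++;
    return (lows % 2 == 0) ? 1 : 0;
}
```

Note that with all covered signals high (zero signals low, an even count), the parity signal is high, matching the observation in the text.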
TRST# .......................................................................... A-14
Recoverable Error .......................................................... 4-1
Register Stack Engine (RSE) ....................................... 2-7
REQ[4:0]# ............................................................ 3-5, A-11
Request Parity (RP#) signal ......................................... 3-5 W
Request Signals ............................................................. 3-5 WSNP# ......................................................................... A-14