Slides: Intel NetBurst Architecture
Processor architectures:
Industrial Software Development - 1 -
Intel NetBurst Technology (1)
• Superscalar issue
-> to enable parallelism
Intel NetBurst Technology (2)
This means:
Design Goal:
Intel NetBurst Technology (3)
Rather than holding undecoded instruction bytes, as a conventional level-1
instruction cache would, the trace cache holds instructions after they have been
decoded into µops.
One important reason for this is that the decoding stage was a bottleneck on
earlier processors. An x86 instruction can be anywhere from 1 to 15 bytes long,
and determining its length is quite complicated. Because the length of the first
instruction must be known before the start of the second can be located, it is
difficult to determine instruction lengths in parallel.
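The serial dependency described above can be illustrated with a minimal sketch. It uses a toy encoding, not real x86: the first byte of each instruction is assumed to hold the instruction's total length. The function name `instruction_starts` and the sample byte stream are hypothetical and serve only as illustration.

```python
def instruction_starts(code: bytes) -> list[int]:
    """Return the offset of every instruction in a toy variable-length stream.

    Toy encoding (an assumption, not real x86): the first byte of each
    instruction holds the total instruction length in bytes (1..15).
    """
    starts = []
    offset = 0
    while offset < len(code):
        length = code[offset]
        if not 1 <= length <= 15:
            raise ValueError(f"invalid length {length} at offset {offset}")
        starts.append(offset)
        # The start of the next instruction is only known once the
        # length of the current one has been decoded.
        offset += length
    return starts


stream = bytes([1, 3, 0xAA, 0xBB, 2, 0xCC, 1])
print(instruction_starts(stream))  # [0, 1, 4, 6]
```

Each loop iteration depends on the result of the previous one, which is why the boundaries cannot simply be found for all instructions at once. Storing already decoded µops avoids repeating this work for code that executes again.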
Pipeline
The front end supplies instructions in program order to the out-of-order core. It
fetches and decodes instructions. The decoded instructions are translated into
µops.
The front end’s primary job is to feed a continuous stream of µops to the
execution core in original program order.
The out-of-order core aggressively reorders µops so that µops whose inputs are
ready (and have execution resources available) can execute as soon as possible. The
core can issue multiple µops per cycle.
The retirement section ensures that the results of execution are processed
according to original program order and that the proper architectural states are
updated.
Figure 2-5 shows the major functional blocks of the Intel NetBurst
microarchitecture pipeline. The following subsections provide an overview of
each.
Intel NetBurst Technology (4)
The front end is designed to address two problems that are sources of delay:
Intel NetBurst Technology (5)
Out-of-order Core
The core’s ability to execute instructions out of order is a key factor in enabling
parallelism. This feature enables the processor to reorder instructions so that if one
µop is delayed while waiting for data or a contended resource, other µops that
appear later in the program order may proceed. This implies that when one portion
of the pipeline experiences a delay, the delay may be covered by other operations
executing in parallel or by the execution of µops queued up in a buffer.
The core is designed to facilitate parallel execution. It can dispatch up to six µops
per cycle through the issue ports (Figure 2-6).
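A minimal sketch of the reordering idea follows. It is a simplified scheduling model, not the actual NetBurst scheduler: it ignores issue ports and execution units and only assumes that a µop may issue once its inputs are ready, with at most `width` µops issued per cycle. The names `Uop` and `schedule` and the latencies are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Uop:
    name: str
    deps: tuple = ()   # names of earlier µops whose results this one needs
    latency: int = 1   # cycles from issue until the result is available


def schedule(program: list, width: int = 6) -> dict:
    """Return the issue cycle of each µop under a simple out-of-order policy."""
    seen = set()
    for uop in program:
        # Dependencies must refer to earlier µops; this also guarantees
        # that the loop below terminates.
        if any(dep not in seen for dep in uop.deps):
            raise ValueError(f"{uop.name} depends on an unknown or later µop")
        seen.add(uop.name)

    ready_at = {}            # µop name -> cycle in which its result is available
    issued = {}              # µop name -> cycle in which it was issued
    waiting = list(program)  # kept in program order
    cycle = 0
    while waiting:
        slots = width
        still_waiting = []
        for uop in waiting:
            inputs_ready = all(
                dep in ready_at and ready_at[dep] <= cycle for dep in uop.deps
            )
            if slots > 0 and inputs_ready:
                issued[uop.name] = cycle
                ready_at[uop.name] = cycle + uop.latency
                slots -= 1
            else:
                still_waiting.append(uop)  # stalled; younger µops may pass it
        waiting = still_waiting
        cycle += 1
    return issued


program = [
    Uop("load", latency=4),     # slow, e.g. waiting for data
    Uop("add", deps=("load",)),  # stalls until the load result is ready
    Uop("mul"),                  # independent, later in program order
    Uop("sub", deps=("mul",)),
]
print(schedule(program))
```

In this example `mul` and `sub` issue in cycles 0 and 1, while `add`, which comes earlier in program order, waits until cycle 4 for the result of `load`. The delay of one µop is covered by the execution of later, independent µops.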
Intel NetBurst Technology (6)
Retirement
The retirement section receives the results of the executed µops from the execution
core and processes the results so that the architectural state is updated according
to the original program order. For semantically correct execution, the results of
Intel 64 and IA-32 instructions must be committed in original program order
before they are retired. Exceptions may be raised as instructions are retired. For this
reason, exceptions cannot occur speculatively.
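In-order retirement can be sketched with a simple queue that plays the role of a reorder buffer. This is an illustrative model under stated assumptions, not the NetBurst implementation: µops are identified by name, `completion_order` is the hypothetical order in which the execution core finishes them, and `faulting` marks µops that would raise an exception.

```python
from collections import deque


def retire_in_order(program_order, completion_order, faulting=frozenset()):
    """Return (retired µops in order, faulting µop or None)."""
    rob = deque(program_order)  # oldest µop at the left
    completed = set()
    retired = []
    for uop in completion_order:
        completed.add(uop)      # results may arrive out of order
        # Only the oldest µop may retire, and only once it has completed.
        while rob and rob[0] in completed:
            head = rob.popleft()
            if head in faulting:
                # The exception is raised at retirement; younger µops
                # that already completed are never committed.
                return retired, head
            retired.append(head)
    return retired, None


order = ["a", "b", "c", "d"]
print(retire_in_order(order, ["c", "a", "d", "b"]))
# (['a', 'b', 'c', 'd'], None)
print(retire_in_order(order, ["c", "a", "d", "b"], faulting={"c"}))
# (['a', 'b'], 'c')
```

In the second call `c` finishes first, but its exception becomes visible only after `a` and `b` have retired, and the result of `d` is discarded. This is the sense in which exceptions do not occur speculatively.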
The retirement section also keeps track of branches and sends updated branch
target information to the branch target buffer (BTB). This updates branch history.
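The update of branch history at retirement can be sketched as follows. The slides do not describe the internal organization of the BTB, so the 2-bit saturating counter used here is an assumption chosen as a common textbook scheme, and the class and method names are hypothetical.

```python
class BranchTargetBuffer:
    """Toy BTB: branch address -> (target, 2-bit saturating counter)."""

    def __init__(self):
        self.entries = {}

    def predict(self, branch_addr):
        """Return the predicted target, or None if predicted not taken."""
        entry = self.entries.get(branch_addr)
        if entry is None:
            return None
        target, counter = entry
        return target if counter >= 2 else None

    def update_on_retire(self, branch_addr, taken, target):
        """Record the actual outcome of a retired branch."""
        # Unknown branches start in the weakly-not-taken state (assumption).
        old_target, counter = self.entries.get(branch_addr, (target, 1))
        if taken:
            counter = min(counter + 1, 3)
            old_target = target  # remember the most recent taken target
        else:
            counter = max(counter - 1, 0)
        self.entries[branch_addr] = (old_target, counter)


btb = BranchTargetBuffer()
btb.update_on_retire(0x40, taken=True, target=0x80)
print(btb.predict(0x40) == 0x80)  # True
```

Updating only at retirement means the history reflects branches that actually executed on the correct path, not branches that were fetched speculatively and later discarded.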