Existential Questions On The CPU



WHAT DOES THE CPU ACTUALLY DO?


The CPU handles basic instructions and hands the more complicated tasks to other, specialized chips so
that each does what it does best. The CPU is a core component of what makes a computer a
computer, but it isn’t the computer itself; it’s just the brains of the operation. CPUs are built by
placing billions of microscopic transistors onto a single computer chip. Those transistors allow it to
make the calculations it needs to run programs that are stored in your system’s memory. At its core,
a CPU takes instructions from a program or application and performs a calculation. This process can
be broken down into three key stages: fetch, decode, and execute. The CPU fetches the instruction
from the system’s RAM, decodes what the instruction actually is, and then the relevant parts of the
CPU execute it. Array processors or vector processors have multiple processors that
operate in parallel, with no unit considered central. There also exists the concept of virtual CPUs,
which are an abstraction of dynamically aggregated computational resources.

WHAT IS THAT CLOCK AND CORE THING?


Originally, processors had a single processing core. Today’s processors are made up of
multiple cores, which allow them to execute multiple instructions at once. They’re effectively several
CPUs on a single chip. Some processors also employ a technology called multi-threading, which
makes each physical core appear as several virtual (logical) cores. These aren’t as powerful as
physical cores, but they can help improve a CPU’s performance. Clock speed is another number that’s
thrown around a lot with CPUs. That’s the “gigahertz” (GHz) figure that you’ll see quoted on CPU
product listings. It denotes how many clock cycles the CPU completes per second, which only
indirectly indicates how many instructions it can handle per second, so it is not the whole picture
when it comes to performance. Clock speed is mostly useful when comparing CPUs from the same product
family or generation.

WHAT DOES A CPU CONSIST OF?


The two typical components of a CPU include the following:
• The arithmetic logic unit (ALU), which performs arithmetic and logical operations.
• The control unit (CU), which extracts instructions from memory and decodes
and executes them, calling on the ALU when necessary.
WHAT IS THE OPERATIONS THING?
The fundamental operation of most CPUs, regardless of the physical form they take, is to execute a
sequence of stored instructions that is called a program. The instructions to be executed are kept in
some kind of computer memory. Nearly all CPUs follow the fetch, decode and execute steps in their
operation, which are collectively known as the instruction cycle.

After the execution of an instruction, the entire process repeats, with the next instruction cycle
normally fetching the next-in-sequence instruction because of the incremented value in the program
counter. If a jump instruction was executed, the program counter will be modified to contain the
address of the instruction that was jumped to and program execution continues normally. In more
complex CPUs, multiple instructions can be fetched, decoded and executed simultaneously. This
section describes what is generally referred to as the "classic RISC pipeline", which is quite common
among the simple CPUs used in many electronic devices (often called microcontrollers). It largely
ignores the important role of CPU cache, and therefore the access stage of the pipeline.

Some instructions manipulate the program counter rather than producing result data directly; such
instructions are generally called "jumps" and facilitate program behavior like loops, conditional
program execution (through the use of a conditional jump), and existence of functions. In some
processors, some other instructions change the state of bits in a "flags" register. These flags can be
used to influence how a program behaves, since they often indicate the outcome of various
operations. For example, in such processors a "compare" instruction evaluates two values and sets
or clears bits in the flags register to indicate which one is greater or whether they are equal; one of
these flags could then be used by a later jump instruction to determine program flow.
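The instruction cycle, the program counter, and the compare-then-jump pattern described above can be sketched with a toy interpreter. This is only an illustration: the instruction names (LOAD, ADD, CMP, JNE, HALT), the four registers and the single "zero" flag are assumptions made for the example, not the instruction set of any real processor.

```python
def run(program, max_steps=1000):
    """Run a toy program and return the registers once HALT is executed."""
    regs = {f"r{i}": 0 for i in range(4)}   # hypothetical register file
    flags = {"zero": False}                 # set by CMP, read by JNE
    pc = 0                                  # program counter
    for _ in range(max_steps):
        instr = program[pc]                 # fetch
        op, *args = instr                   # decode
        pc += 1                             # default: next-in-sequence instruction
        if op == "LOAD":                    # execute
            regs[args[0]] = args[1]
        elif op == "ADD":
            regs[args[0]] += regs[args[1]]
        elif op == "CMP":
            flags["zero"] = regs[args[0]] == regs[args[1]]
        elif op == "JNE":                   # conditional jump: overwrite the PC
            if not flags["zero"]:
                pc = args[0]
        elif op == "HALT":
            return regs
        else:
            raise ValueError(f"unknown instruction {op!r}")
    raise RuntimeError("step limit reached")


# Sum the numbers 1..5 with a loop built from CMP and a conditional jump.
PROGRAM = [
    ("LOAD", "r0", 0),    # 0: sum = 0
    ("LOAD", "r1", 0),    # 1: i = 0
    ("LOAD", "r2", 1),    # 2: constant 1
    ("LOAD", "r3", 5),    # 3: limit
    ("ADD", "r1", "r2"),  # 4: i += 1
    ("ADD", "r0", "r1"),  # 5: sum += i
    ("CMP", "r1", "r3"),  # 6: set the zero flag if i == limit
    ("JNE", 4),           # 7: loop while i != limit
    ("HALT",),            # 8
]

result = run(PROGRAM)
print(result["r0"])  # 15
```

The loop exists only because JNE writes a new value into the program counter; every other instruction leaves the incremented value in place.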
HOW CAN I IMPROVE THE CPU PERFORMANCE?
• eliminate the factors that hinder the CPU – example - use of cache memory
• simple structures – no longer possible on today's processors
• increase clock frequency – limited by technological issues
• parallel execution of instructions
Techniques:
• Pipeline structure
• Multiply the execution units
• Branch prediction
• Speculative execution
• Predication
• Out-of-order execution
• Register renaming
• Hyperthreading
• RISC architecture

WHAT IS THE PIPELINE THING?


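A pipe is a message queue. A message can be anything. A filter is a process, thread, or other
component that perpetually reads messages from an input pipe, one at a time, processes each
message, then writes the result to an output pipe. Thus, it is possible to form pipelines of filters
connected by pipes. A CPU pipeline applies the same idea to instructions: each stage (for example
fetch, decode, execute) works on one instruction per clock cycle and passes it on to the next stage,
so several instructions are in progress at the same time, each in a different stage.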

WHAT IS LATENCY?
The number of clock cycles necessary for executing one instruction – given by the number of stages.

WHAT IS THROUGHPUT?
The number of instructions finished per clock cycle – in theory equal to 1, in practice smaller
(because of dependencies).
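The relation between latency and throughput can be checked with a little arithmetic. The sketch below assumes an idealized pipeline in which every stage takes exactly one clock cycle and a new instruction enters on every cycle; the function names and the `stall_cycles` parameter are hypothetical and serve only the illustration.

```python
def pipeline_cycles(n_instructions, n_stages, stall_cycles=0):
    """Total cycles: the first instruction needs n_stages cycles (the latency),
    each following one finishes one cycle later, plus any stall cycles."""
    if n_instructions == 0:
        return 0
    return n_stages + (n_instructions - 1) + stall_cycles


def throughput(n_instructions, n_stages, stall_cycles=0):
    """Average number of instructions finished per clock cycle."""
    return n_instructions / pipeline_cycles(n_instructions, n_stages, stall_cycles)


print(pipeline_cycles(1, 5))       # 5: latency of a single instruction
print(pipeline_cycles(100, 5))     # 104
print(throughput(100, 5))          # about 0.96, approaching 1 for long runs
print(throughput(100, 5, 20))      # about 0.81 once stalls are added
```

Under these assumptions the throughput tends towards 1 as the instruction count grows, and every stall cycle pushes it further below 1.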

WHAT ARE DEPENDENCIES?


A dependency is a situation in which an instruction in the pipeline cannot simply advance to its
next stage, because it needs something that another instruction in the pipeline is still using or
has not produced yet. Dependencies are what keeps the throughput of a pipeline below 1 in practice.

WHICH ARE THOSE DEPENDENCIES?


• structural dependencies - instructions being in different stages need the same component
• data dependencies - one instruction computes a result, another one uses it
• control dependencies - caused by updating the value of the PC: usually the current value is
increased by the size of the code of the previous instruction, but jump instructions load a new value
HOW DO I ELIMINATE THESE DEPENDENCIES?
Solutions: stall & forwarding.
Stall - occurs when an instruction needs a result which has not been computed yet. The instruction
moves on when the result it needs becomes available. However, it is not really a solution: it does not
truly eliminate the dependency and only guarantees the correct execution of the instructions. If an
instruction stalls, the next ones will also stall.
Forwarding - the result is sent directly from the stage that computed it to the stage that needs it,
without waiting for it to be written to its destination first.

WHAT IS THE MULTIPLICATION OF THE EXECUTION UNITS?


Basic idea - multiple ALUs, and therefore several pipelines inside the same processor – usually 2.

HOW DOES BRANCH PREDICTION WORK?


Basic idea - to "foresee" whether a jump will be taken or not – do not wait for the jump instruction
to end. Accurate prediction - no stalls in the pipeline. Wrong prediction - execute instructions that
should not have been executed, so their effects must be "erased". An instruction that should not
have been executed only produces effects when the result is written to its destination. Instruction
results - internally memorized by the processor until one can see whether the prediction was
accurate or not. Prediction schemes can be static (always make the same decision) or dynamic
(adaptive).

WHAT IS BTB?
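Branch Target Buffer. Each entry holds a tag (identifying the jump instruction) + a prediction, kept
as a 2-bit counter. Jump condition is true - the counter increases. Jump condition is false - the
counter decreases. Value range - between 0 (00) and 3 (11); the counter does not go past either end.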
Implementation: state coding:
– strong hit - 11
– weak hit – 10
– weak miss - 01
– strong miss - 00
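The four states above behave like a 2-bit saturating counter, which can be sketched as follows. The class and function names are hypothetical, and the sketch assumes that the two "hit" states (10, 11) predict that the jump is taken while the two "miss" states (00, 01) predict that it is not. It models a single counter only, not the tag lookup of a full BTB.

```python
class TwoBitPredictor:
    """2-bit saturating counter: 00 strong miss, 01 weak miss, 10 weak hit, 11 strong hit."""

    def __init__(self, state=0b10):
        self.state = state

    def predict_taken(self):
        # Assumption: states 10 and 11 predict "jump taken".
        return self.state >= 0b10

    def update(self, taken):
        # Jump condition true: count up; false: count down; stay within 0..3.
        if taken:
            self.state = min(self.state + 1, 0b11)
        else:
            self.state = max(self.state - 1, 0b00)


def count_mispredictions(outcomes, predictor):
    """Feed a sequence of real jump outcomes and count the wrong predictions."""
    misses = 0
    for taken in outcomes:
        if predictor.predict_taken() != taken:
            misses += 1
        predictor.update(taken)
    return misses


# A loop-closing jump: taken 9 times, then not taken when the loop exits.
loop_branch = [True] * 9 + [False]
predictor = TwoBitPredictor()
first_pass = count_mispredictions(loop_branch, predictor)
second_pass = count_mispredictions(loop_branch, predictor)
print(first_pass, second_pass)  # 1 1
```

The point of having two bits is visible in the second pass: the single miss at the loop exit only weakens the prediction (11 to 10), so the predictor still guesses "taken" when the loop is entered again.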

WHAT IS SPECULATIVE EXECUTION?


Speculative execution is an optimization technique where a computer system performs some
task that may not be needed. Work is done before it is known whether it is actually needed, so as
to prevent a delay that would have to be incurred by doing the work after it is known that it is
needed. If it turns out the work was not needed after all, most changes made by the work are
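reverted and the results are ignored. When both paths of a branch are executed this way (eager
execution), the correct path is always among them, so the success rate is 100%. However, this
requires a lot of registers to keep the temporary results.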

WHAT IS PREDICATION?
Predication is an architectural feature that provides an alternative to conditional transfer
of control, implemented by machine instructions such as conditional branch, conditional call,
conditional return, and branch tables. Predication works by executing instructions from both
paths of the branch and only permitting those instructions from the taken path to modify
architectural state. The instructions from the taken path are permitted to modify architectural
state because they have been associated (predicated) with a predicate, a Boolean value used by
the instruction to control whether the instruction is allowed to modify the architectural state or not.
It is similar to speculative execution. In short, an instruction produces effects if and only if the
associated predicate is true.

WHAT IS OUT-OF-ORDER EXECUTION?


Out-of-order execution is a paradigm used in most high-performance central processing units to
make use of instruction cycles that would otherwise be wasted. In this paradigm, a processor
executes instructions in an order governed by the availability of input data and execution
units, rather than by their original order in a program. In doing so, the processor can avoid being
idle while waiting for the preceding instruction to complete and can, in the meantime, process the
next instructions that are able to run immediately and independently. Purpose - eliminating some
pipeline stalls. Possible when there are no dependencies between the instructions.

HOW DOES REGISTER RENAMING WORK?


Register renaming is a technique that eliminates the false data dependencies arising from the
reuse of architectural registers by successive instructions that do not have any real data
dependencies between them. The elimination of these false data dependencies reveals
more instruction-level parallelism in an instruction stream, which can be exploited by various and
complementary techniques such as superscalar and out-of-order execution for
better performance.

WHAT FALSE DEPENDENCIES CAN OCCUR SO I HAVE TO RENAME A REGISTER?


• RAW (read after write) – the first instruction modifies the resource, the second one reads it – a
true dependency, CANNOT BE ELIMINATED
• WAR (write after read) / anti-dependencies – the opposite case: the first instruction reads the
resource, the second one modifies it
• WAW (write after write) / output dependencies – both instructions modify the resource
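The three cases can be told apart by comparing the destination and source registers of two instructions. The sketch below is a simplified model: `Instr`, `dependencies` and `rename_dest` are hypothetical names, each instruction has exactly one destination register, and "p7" stands in for a free physical register picked by the renaming logic.

```python
from typing import NamedTuple


class Instr(NamedTuple):
    dest: str      # register written by the instruction
    srcs: tuple    # registers read by the instruction


def dependencies(first, second):
    """Dependencies of `second` on `first`, where `first` comes earlier in the program."""
    found = []
    if first.dest in second.srcs:
        found.append("RAW")   # second reads what first writes
    if second.dest in first.srcs:
        found.append("WAR")   # second overwrites what first still has to read
    if first.dest == second.dest:
        found.append("WAW")   # both write the same register
    return found


def rename_dest(instr, new_register):
    """Register renaming: give the result a fresh register."""
    return instr._replace(dest=new_register)


first = Instr(dest="r1", srcs=("r2", "r3"))    # r1 = r2 + r3
second = Instr(dest="r2", srcs=("r1", "r4"))   # r2 = r1 + r4

print(dependencies(first, second))                       # ['RAW', 'WAR']
print(dependencies(first, rename_dest(second, "p7")))    # ['RAW']
```

Renaming the destination removes the WAR case (and would remove a WAW case in the same way), while the RAW dependency stays because the second instruction really needs the value computed by the first.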

HOW DOES HYPERTHREADING WORK?


For each processor core that is physically present, the operating system addresses two virtual
(logical) cores and shares the workload between them when possible. The main function of
hyper-threading is to increase the number of independent instructions in the pipeline; it takes
advantage of superscalar architecture, in which multiple instructions operate on separate data in
parallel. With HTT, one physical core appears as two processors to the operating system,
allowing concurrent scheduling of two processes per core. In addition, two or more processes
can use the same resources: if resources for one process are not available, then another
process can continue if its resources are available.

WHAT ARE RISC & CISC?


CISC (Complex Instruction Set Computer)
• large number of instructions
• very complex instructions - long execution times
• small number of registers - frequent memory accesses
• 20% of the instructions are executed 80% of the time
RISC (Reduced Instruction Set Computer)
• simpler instruction set
– fewer (relatively) instructions
– instructions are simpler (elementary)
• large number of registers (tens)
• fewer memory addressing modes

HOW DO PARALLEL ARCHITECTURES WORK?


Parallel computing is a type of computation in which many calculations or the execution
of processes are carried out simultaneously. Large problems can often be divided into smaller
ones, which can then be solved at the same time. There are several different forms of parallel
computing: bit-level, instruction-level, data, and task parallelism.
HOW TO REACH PARALLELISM?
• pipeline structures – sequential/parallel
• multiprocessor systems – basic units - the processors
• distributed systems – basic units - the computers

WHAT ARE THE MULTIPROCESSOR SYSTEMS?


Multiprocessing is the use of two or more CPUs within a single computer system. The term also
refers to the ability of a system to support more than one processor or the ability to allocate tasks
between them. Types of systems:
• symmetric shared memory systems
• distributed shared memory systems
• message-passing systems

HOW DO PROCESSORS COMMUNICATE?


Bus & interconnection networks.

WHAT DOES THE BUS DO?


A bus is a communication system that transfers data between components inside a computer, or
between computers. This expression covers all related hardware components (wire, optical fiber,
etc.) and software, including communication protocols.

WHAT IS MEMORY COHERENCE?


Memory coherence is an issue that affects the design of computer systems in which two or
more processors or cores share a common area of memory. All processors must use the last
value written into a shared variable. Goal: every shared variable has the same value in all
caches + in the main memory.

WHAT ABOUT THE CACHE THEN?


When a system writes data to cache, it must at some point write that data to the backing store as
well. The timing of this write is controlled by what is known as the write policy. There are two
basic writing approaches:

• Write-through: write is done synchronously both to the cache and to the backing store.
• Write-back (also called write-behind): initially, writing is done only to the cache. The write to
the backing store is postponed until the modified content is about to be replaced by another
cache block.

On a write miss (the location being written is not in the cache), there are two write-miss policies:

• Write allocate: the missed location is first loaded into the cache, then written there.
• No-write allocate: the data is written directly to the backing store, without being loaded into
the cache.

Both write-through and write-back policies can use either of these write-miss policies, but usually
they are paired in this way:

• A write-back cache uses write allocate, hoping for subsequent writes (or even reads) to
the same location, which is now cached.
• A write-through cache uses no-write allocate. Here, subsequent writes have no
advantage, since they still need to be written directly to the backing store.
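The difference between the two pairings shows up when counting writes that reach the backing store. The following is a minimal sketch, not a model of real cache hardware: `TinyCache` is a hypothetical class, it caches whole addresses rather than blocks, holds two lines by default, and evicts the line that was inserted first.

```python
class TinyCache:
    """Toy cache: write-back + write allocate, or write-through + no-write allocate."""

    def __init__(self, write_back, capacity=2):
        self.write_back = write_back
        self.capacity = capacity
        self.lines = {}            # cached address -> value (insertion-ordered)
        self.dirty = set()         # cached addresses not yet written back
        self.backing = {}          # the backing store
        self.backing_writes = 0    # how many writes reached the backing store

    def _store(self, addr, value):
        self.backing[addr] = value
        self.backing_writes += 1

    def _evict_if_full(self):
        if len(self.lines) >= self.capacity:
            victim = next(iter(self.lines))       # oldest inserted line
            value = self.lines.pop(victim)
            if victim in self.dirty:              # postponed write happens now
                self.dirty.discard(victim)
                self._store(victim, value)

    def write(self, addr, value):
        if self.write_back:
            if addr not in self.lines:
                self._evict_if_full()             # write allocate on a miss
            self.lines[addr] = value
            self.dirty.add(addr)
        else:
            if addr in self.lines:
                self.lines[addr] = value          # keep a cached copy in sync
            self._store(addr, value)              # always written through


wb = TinyCache(write_back=True)
wt = TinyCache(write_back=False)
for value in range(10):
    wb.write(0x10, value)
    wt.write(0x10, value)
print(wb.backing_writes, wt.backing_writes)  # 0 10

wb.write(0x20, 1)
wb.write(0x30, 2)            # cache is full: 0x10 is evicted and written back
print(wb.backing_writes, wb.backing[0x10])   # 1 9
```

Ten writes to the same location cost ten backing-store writes with write-through, but only one (at eviction time) with write-back, which is the hope expressed by pairing write-back with write allocate.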

HOW IS THE CACHE UPDATE DONE?


• each cache announces the changes it makes
• the other caches react
• only write operations matter
WHAT ARE WRITE INVALIDATION AND WRITE UPDATE?
Invalidation:
• a processor changes the value of a location
• the change is made in its own cache – all other caches are notified
• every other cache
– has no copy of that location - no action
– has a copy of that location - invalidates its corresponding line
– the correct value will be requested when needed
Update:
• a processor changes the value of a location
• the change is made in its own cache – all other caches are notified – the new value is
broadcast
• every other cache
– has no copy of that location - no action
– has a copy of that location - gets the new value
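Both protocols can be sketched side by side. In this simplified model each cache is a plain dictionary from location to value, and `coherent_write` is a hypothetical helper that plays the role of the notification sent to the other caches; main memory and timing are left out.

```python
def coherent_write(caches, writer, addr, value, policy):
    """Write `value` at `addr` in cache number `writer` and notify the other caches."""
    if policy not in ("invalidate", "update"):
        raise ValueError(f"unknown policy {policy!r}")
    caches[writer][addr] = value            # the change is made in its own cache
    for index, cache in enumerate(caches):
        if index == writer or addr not in cache:
            continue                        # no copy of that location: no action
        if policy == "invalidate":
            del cache[addr]                 # the value will be requested when needed
        else:
            cache[addr] = value             # the broadcast value is taken over


caches = [{0x40: 1}, {0x40: 1}, {}]
coherent_write(caches, 0, 0x40, 2, "invalidate")
print(caches)   # [{64: 2}, {}, {}]

caches = [{0x40: 1}, {0x40: 1}, {}]
coherent_write(caches, 0, 0x40, 2, "update")
print(caches)   # [{64: 2}, {64: 2}, {}]
```

In both cases the third cache, which never held the location, is left untouched; the two policies differ only in what a cache holding a copy does with it.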

HOW DO PERIPHERAL DEVICES WORK?


A peripheral device provides a certain kind of communication – between the processor and the
"outside world". To manage communication with the processor, it includes an I/O controller.

WHAT IS A TRI-STATE CIRCUIT?


A tri-state circuit is a circuit whose output has 3 possible states:
– 0
– 1
– high impedance (High-Z)
• the first two states correspond to the usual logic values
• the third state means decoupling from the bus – as if the output of the circuit were not
connected to the bus

WHAT IS AN OPEN-COLLECTOR CIRCUIT?


Here it is possible to connect several outputs together. Result value - Boolean AND between the
outputs that are connected.

WHAT IS AN INTERRUPT?
An interrupt is a signal to the processor emitted by hardware or software indicating an event that
needs immediate attention. An interrupt alerts the processor to a high-priority condition requiring
the interruption of the current code the processor is executing. The processor responds by
suspending its current activities, saving its state, and executing a function called an interrupt
handler (or an interrupt service routine, ISR) to deal with the event. This interruption is
temporary, and, after the interrupt handler finishes, the processor resumes normal activities.
Interrupts can be hardware interrupts, software interrupts, or traps. Hardware interrupts can be
either maskable (they depend on the IF) or non-maskable.

WHAT IS THE INTERRUPT FLAG (IF)?


The Interrupt flag (IF) is a system flag bit in the x86 architecture's FLAGS register, which
determines whether or not the central processing unit (CPU) will handle maskable
hardware interrupts.
WHAT IS THE INTERRUPT CONTROLLER?
• specialized circuit
• collects interrupt requests from the peripherals
• sends them to the processor
• arbitrates the conflicts (more requests coming simultaneously)
– each peripheral is assigned a certain priority

WHAT IS AN OPERATING SYSTEM (OS) AND HOW DOES IT WORK?


An operating system (OS) is system software that manages computer
hardware and software resources and provides common services for computer programs. It
needs hardware support to carry out its duties – the most important: the interrupt system. Main
components: kernel & drivers.

WHAT IS THE KERNEL?


It is "the brain" of the operating system. It is also mostly independent from the hardware structure
it works with. It manages the computing resources - hardware and software.

HOW DO PROGRAMS ACTUALLY RUN?


• operating system - kernel mode – can perform any operation
• applications - user mode – cannot perform certain actions – must ask the kernel to perform those
actions for them

HOW TO SWITCH BETWEEN THE TWO MODES?


Through the interrupt system.
• user to kernel – a software interrupt is called or an exception (error) occurs
• kernel to user – return from the interrupt handling routine
Therefore, no application code can run in kernel mode.

WHAT ARE SYSTEM CALLS?


They are requests made by applications to the kernel, for actions that applications may
not execute themselves. These actions can only be performed in the processor's kernel mode. System
calls can be made through software interrupts.

WHAT ARE BUFFERS?


A buffer is a region of a physical memory storage used to temporarily store data while it is being
moved from one place to another.

WHAT ARE DRIVERS?


Drivers are program modules which manage communication to peripheral devices. Drivers are not
part of the kernel, but are controlled by the kernel.

WHAT IS PROCESS MANAGEMENT?


Process management is an integral part of any modern-day operating system (OS). The OS
must allocate resources to processes, enable processes to share and exchange information,
protect the resources of each process from other processes and enable synchronization among
processes. To meet these requirements, the OS must maintain a data structure for each process,
which describes the state and resource ownership of that process, and which enables the OS to
exert control over each process.
One can launch several programs at the same time (multitasking), but the parallelism is not real
unless the system has several processors. Otherwise, it is only concurrency. A program may be split
into several sequences of instructions – processes. The operating system works with processes – not
with programs. When a new process is created, proper memory space is allocated to it.

WHAT ARE THE STATES OF A PROCESS?


• running – its instructions are executed by the processor
• ready – waits to be executed by the processor
• waiting – waits for the completion of a system call – currently, it does not compete for being
assigned to the processor

WHAT IS A THREAD?
A thread of execution is the smallest sequence of programmed instructions that can be managed
independently by a scheduler, which is typically a part of the operating system. The
implementation of threads and processes differs between operating systems, but in most cases a
thread is a component of a process. Multiple threads can exist within one process,
executing concurrently and sharing resources such as memory, while different processes do not
share these resources. In particular, the threads of a process share its executable code and the
values of its dynamically allocated variables and non-thread-local global variables at any given
time.

HOW DOES MEMORY MANAGEMENT WORK?


Functions
• allocates memory zones to the applications
• prevents interference between applications
• detects and stops illegal accesses

WHAT IS THE PROBLEM WITH THIS MEMORY MANAGEMENT?


• multiple applications running -> separate memory zones
• each application -> certain memory zones; which are these zones?
– depend on memory configuration at that moment
– cannot be known at compiling time

WHAT IS THE SOLUTION TO THIS PROBLEM?


• two kinds of addresses
– virtual - which the application believes to access
– physical - really accessed by the processor
• the correspondence between virtual and physical addresses - managed by the operating system

WHAT IS THE RELATION BETWEEN THE PHYSICAL AND VIRTUAL ADDRESSES?


Managing virtual and physical addresses:
• 2 different methods – segmentation & pagination
• can also be used together
• dedicated component of the processor - MMU (Memory Management Unit)
WHAT IS MEMORY SEGMENTATION?
• segment - contiguous memory zone
• contains information of the same kind
• the address of a location - made of 2 parts
– the start address of the segment
– the offset within the segment
Therefore, upon different executions of the program, a segment starts at different addresses.

WHAT IS A DESCRIPTOR?
Segment descriptor - data structure for managing a segment. Access to a segment is based on the
index in the descriptor table (selector).

HOW DOES VIRTUAL ADDRESS WORK?


The virtual address has 2 components – the index in the descriptor table + the offset within the
segment.

HOW DOES PHYSICAL ADDRESS WORK?


physical address = start address of the segment + offset

SO HOW IS MEMORY ACCESS DONE?


1. the program issues a virtual address
2. the segment descriptor is identified
3. the access rights are checked – insufficient rights -> a trap is generated
4. the offset is checked – if the offset exceeds the segment size -> a trap is generated
5. if an error has occurred in the previous steps – the service routine of the trap ends the
program
6. if no error has occurred – compute the physical address (segment start address + offset) –
access the location at the computed address
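The steps above can be sketched as a small translation function. The descriptor layout is an assumption made for the example: `SegmentDescriptor` holds only a start address, a size and a single "writable" bit standing in for the access rights, and `Trap` is a hypothetical exception that represents the trap raised by the processor.

```python
from dataclasses import dataclass


class Trap(Exception):
    """Stands in for the trap generated on an illegal access."""


@dataclass
class SegmentDescriptor:
    base: int        # start address of the segment
    size: int        # segment size in bytes
    writable: bool   # simplified access rights


def translate_segmented(table, selector, offset, write=False):
    """Turn (selector, offset) into a physical address, or raise Trap."""
    if not 0 <= selector < len(table):
        raise Trap("no such segment descriptor")
    descriptor = table[selector]                    # step 2: identify the descriptor
    if write and not descriptor.writable:           # step 3: check access rights
        raise Trap("insufficient access rights")
    if offset >= descriptor.size:                   # step 4: check the offset
        raise Trap("offset exceeds the segment size")
    return descriptor.base + offset                 # step 6: start address + offset


TABLE = [
    SegmentDescriptor(base=0x1000, size=0x200, writable=False),  # e.g. code
    SegmentDescriptor(base=0x4000, size=0x100, writable=True),   # e.g. data
]

print(hex(translate_segmented(TABLE, 1, 0x20, write=True)))  # 0x4020
print(hex(translate_segmented(TABLE, 0, 0x1FF)))             # 0x11ff
```

A write to segment 0, or any offset of 0x200 or more within it, raises `Trap`; in a real system the trap's service routine would then end the program, as step 5 says.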

HOW IS A SEGMENT PLACED INTO THE MEMORY?


There are several algorithms:
• First Fit - first free, large enough zone that is found
• Best Fit - smallest free zone that is large enough
• Worst Fit - largest free zone (if it is large enough)
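The three algorithms differ only in which of the large-enough free zones they pick. A minimal sketch, assuming the free zones are kept as a list of (start address, length) pairs; the function name and the strategy strings are hypothetical.

```python
def choose_zone(free_zones, size, strategy):
    """Return the index of the free zone chosen for a segment of `size`, or None."""
    if strategy not in ("first", "best", "worst"):
        raise ValueError(f"unknown strategy {strategy!r}")
    # Keep only the zones that are large enough, in their original order.
    candidates = [(index, length)
                  for index, (_, length) in enumerate(free_zones)
                  if length >= size]
    if not candidates:
        return None
    if strategy == "first":
        return candidates[0][0]                             # first one found
    if strategy == "best":
        return min(candidates, key=lambda c: c[1])[0]       # smallest that fits
    return max(candidates, key=lambda c: c[1])[0]           # largest zone


FREE_ZONES = [(0, 100), (200, 300), (800, 120), (1000, 500)]

print(choose_zone(FREE_ZONES, 110, "first"))   # 1  (zone of length 300)
print(choose_zone(FREE_ZONES, 110, "best"))    # 2  (zone of length 120)
print(choose_zone(FREE_ZONES, 110, "worst"))   # 3  (zone of length 500)
print(choose_zone(FREE_ZONES, 600, "first"))   # None: no zone is large enough
```

For the same request the three strategies pick three different zones here: Best Fit leaves the smallest leftover (10), Worst Fit the largest (390).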

WHAT IS MEMORY FRAGMENTATION?


Memory fragmentation occurs when there are many free zones, too small to be used. This situation
arises after a large number of segment allocations and releases.

HOW TO SOLVE MEMORY FRAGMENTATION?


Memory defragmentation. Move the segments such that there remain no free zones between them.
A single free zone, of maximum size, is created. This is performed by a specialized program, part of
the operating system.

WHAT IS MEMORY PAGINATION?


Paging is a memory management scheme by which a computer stores and retrieves data
from secondary storage for use in main memory. In this scheme, the operating system retrieves
data from secondary storage in same-size blocks called pages. Paging is an important part
of virtual memory implementations in modern operating systems, using secondary storage to let
programs exceed the size of available physical memory.

HOW DOES MEMORY PAGINATION WORK?


• virtual address space - split into pages – zones of fixed size
• physical address space - split into page frames – same size as pages
• size - usually 4 KB

WHAT IS A PAGE TABLE?


The page table stores the correspondence between pages and page frames and is managed by the
operating system. Upon different runs of the program, pages are placed into different frames.

HOW TO FIND THE PHYSICAL ADDRESS BASED ON THE TABLE?


Example 1:
Pages:        0   1   2   8   9  11  14  15
Page frames:  5   7   4   3   9   2  14  21
• page size: 1000
• virtual address: 8039
– page: [8039/1000]=8 -> page frame 3
– offset: 8039%1000=39 (within the page)
• physical address: 3·1000+39=3039
Example 2:
• page size: 1000
• virtual address: 5276
– page: [5276/1000]=5
– not present in the page table
– error -> a trap is generated
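The two examples can be reproduced with a few lines of code. The page table and the page size of 1000 are taken from the examples above; `PageFault` and `translate_paged` are hypothetical names, and the exception merely stands in for the trap.

```python
class PageFault(Exception):
    """Stands in for the trap generated when a page is not in the page table."""


# Page -> page frame, as in Example 1.
PAGE_TABLE = {0: 5, 1: 7, 2: 4, 8: 3, 9: 9, 11: 2, 14: 14, 15: 21}


def translate_paged(virtual_address, page_table=PAGE_TABLE, page_size=1000):
    """Split the address into page and offset, then swap the page for its frame."""
    page, offset = divmod(virtual_address, page_size)
    if page not in page_table:
        raise PageFault(f"page {page} is not present in the page table")
    return page_table[page] * page_size + offset


print(translate_paged(8039))   # 3039

try:
    translate_paged(5276)
except PageFault as fault:
    print("trap:", fault)      # trap: page 5 is not present in the page table
```

The example uses a decimal page size to keep the arithmetic readable; with a real page size such as 4 KB the division and remainder reduce to splitting the address bits.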

WHAT IS VIRTUAL MEMORY?


Virtual memory (also virtual storage) is a memory management technique that provides an
"idealized abstraction of the storage resources that are actually available on a given
machine" which "creates the illusion to users of a very large (main) memory."

WHAT IS THE MMU (MEMORY MANAGEMENT UNIT)?


A memory management unit (MMU), sometimes called paged memory management
unit (PMMU), is a computer hardware unit having all memory references passed through itself,
primarily performing the translation of virtual memory addresses to physical addresses. It is
usually implemented as part of the central processing unit (CPU), but it also can be in the form of
a separate integrated circuit.

WHAT DOES CREATING A PROGRAM ACTUALLY CONSIST OF?


• compiling – translate the commands written in a source language into processor instructions
• linking – handles aspects regarding program's memory management

WHAT IS AN OBJECT FILE?


An object file is a file containing object code, meaning relocatable format machine code that is
usually not directly executable. There are various formats for object files, and the same object
code can be packaged in different object files. An object file may also work like a shared library.
WHAT ARE THE COMPONENTS OF AN OBJECT FILE?
1. header
– identification information
– information about the other parts of the file
2. entry point table
– contains the names of the symbols (variables and functions) in the current module that may be
used in other modules
3. external reference table
– contains the names of the symbols defined in other modules, but used in the current module
4. the code itself
– resulting from compilation
– the only part that will be found in the executable file
5. relocation dictionary
– contains information for locating the code instructions that require modifying the addresses
they work with
– variants:
• bitmap
• linked list

WHAT IS A LINKER?
A linker or link editor is a computer utility program that takes one or more object files generated
by a compiler and combines them into a single executable file, library file, or another 'object' file.

HOW DOES A LINKER WORK?


1. builds a table with all object modules and their sizes
2. based on this table, assigns start addresses to object modules – start address of a module = the
sum of the sizes of previous modules
3. determines the instructions that access the memory and adds a relocation constant to each
address – relocation constant = start address of the module it belongs to
4. determines the instructions that call functions/access data from other modules and inserts the
appropriate addresses
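The four steps can be sketched on a heavily simplified module format. Everything about that format is an assumption made for the example: each module is a dictionary with a name, a size, a list of module-relative addresses used by its instructions (`local_addrs`), the symbols it defines (`defs`) and the symbols it needs from other modules (`refs`). Real object file formats are far richer.

```python
def link(modules):
    """Lay the modules out back to back and compute their final addresses."""
    # Steps 1-2: start address of a module = sum of the sizes of the previous modules.
    start = {}
    next_free = 0
    for module in modules:
        start[module["name"]] = next_free
        next_free += module["size"]

    # Entry points of all modules, turned into absolute addresses.
    symbols = {}
    for module in modules:
        base = start[module["name"]]
        for symbol, offset in module["defs"].items():
            symbols[symbol] = base + offset

    image = {}
    for module in modules:
        base = start[module["name"]]          # the relocation constant
        image[module["name"]] = {
            "start": base,
            # Step 3: add the relocation constant to each local address.
            "relocated": [addr + base for addr in module["local_addrs"]],
            # Step 4: insert the addresses of symbols defined in other modules.
            "externals": {symbol: symbols[symbol] for symbol in module["refs"]},
        }
    return image


MODULES = [
    {"name": "main", "size": 400, "local_addrs": [16, 120],
     "defs": {"main": 0}, "refs": ["sqrt"]},
    {"name": "math", "size": 250, "local_addrs": [8],
     "defs": {"sqrt": 40}, "refs": []},
    {"name": "io", "size": 100, "local_addrs": [],
     "defs": {"print": 12}, "refs": ["main"]},
]

image = link(MODULES)
print(image["math"])   # {'start': 400, 'relocated': [408], 'externals': {}}
print(image["main"]["externals"])   # {'sqrt': 440}
```

A reference to a symbol that no module defines raises `KeyError` in this sketch, which roughly corresponds to an "unresolved symbol" error reported by a real linker.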
