Computer Science Crash Course - Session 1


Computer Architecture - What is a computer and why do you need to know?



Self-introduction

● Tran Minh Hieu - Senior Software Engineer at Acronis Singapore


● K62 of Hanoi University of Science and Technology
● Working as a Backend Engineer, but also doing Frontend work in my free time
Table of Contents

● Session 1: Computer Architecture - What is a computer and why do you need to know?
● Session 2: Algorithms and Data structures review - Part 1
● Session 3: Algorithms and Data structures review - Part 2
● Session 4: Operating systems
● Session 5: Linux
● Session 6: Low Level Programming - C/C++
● Session 7: High Level Programming - Comparison between JavaScript, Python, Go
● Session 8: Networking
● Session 9: Database - Part 1
● Session 10: Database - Part 2
● Session 11: Security - Basic concepts for System Design Interviews
● Session 12: Discussion - What we think about trends in Software Engineering
What is a computer?

● von Neumann architecture: A computer consists of:
○ Central Processing Unit (CPU), containing logic units and registers
○ Memory Unit, storing instructions and data
○ Input and output devices
■ Peripherals
■ External storage devices
Different types of computers
Game Boy

● Nintendo's famous handheld game console


● CPU: 8-bit Sharp SM83, 4.194304 MHz
○ Hybrid between Intel 8080 and Zilog Z80
○ 512 instructions - 2 × 2^8 (a 256-entry base opcode table plus a 256-entry prefixed table)
● RAM: 65536 bytes - 16-bit address space
○ Shared RAM with PPU (Picture Processing Unit), audio, input
and other peripherals
○ Extensible via cartridge
● Also a chance for Hieu to show off his emulator
Logic units

● Take in operands and execute:
○ Arithmetic operations on integers
○ Floating-point calculations on real numbers
○ Boolean/bitwise operations
○ Memory unit/peripheral operations
● Executing 1 instruction takes at least 1 cycle - fetch, decode, execute
○ Modern CPUs can execute billions of instructions per second
● Constructed from the 7 types of basic logic gates - AND, OR, NOT, NAND, NOR, XOR, XNOR
ADD logic unit from logic gates
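To give a flavor of how an ADD unit can be built from basic gates, here is a minimal sketch in Python: gates as boolean functions, combined into half adders, full adders, and finally a ripple-carry 8-bit adder. This is an illustration of the general technique, not the Game Boy's actual circuitry.

```python
# Basic gates as boolean functions on single bits (0 or 1).
def AND(a, b): return a & b
def OR(a, b):  return a | b
def XOR(a, b): return a ^ b

# Half adder: adds two 1-bit values, producing a sum bit and a carry bit.
def half_adder(a, b):
    return XOR(a, b), AND(a, b)

# Full adder: adds two bits plus an incoming carry, built from two half adders.
def full_adder(a, b, carry_in):
    s1, c1 = half_adder(a, b)
    s2, c2 = half_adder(s1, carry_in)
    return s2, OR(c1, c2)

# Ripple-carry adder: chain 8 full adders to add two 8-bit numbers.
def add8(x, y):
    result, carry = 0, 0
    for i in range(8):
        bit, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= bit << i
    return result, carry  # final carry = overflow past 8 bits
```

For example, `add8(200, 100)` returns `(44, 1)`: 300 does not fit in 8 bits, so the result wraps to 44 with the carry flag set, exactly the behavior an 8-bit CPU like the SM83 exhibits.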
Registers

● Small, quickly accessible memory locations built into the CPU
● May have specific hardware functions:
○ Flags
○ Stack pointer
○ Program counter
● May be read-only or write-only
● Example: Game Boy CPU registers
Instruction set

● A collection of all operations that a CPU can execute


● Different instruction sets for different CPU architectures
○ Example: x86, amd64, ARM64, RISC-V, etc.
● Different instructions may take different numbers of CPU cycles → Different speed
○ However, as the base unit of logic execution, they are the fastest possible logic you can execute in a computer
● Example: Game Boy CPU instruction set
○ 0x06 0xAB - LD B, d8 (0xAB), load the parameter value 0xAB to register B
○ 0xC3 0xAB 0xCD - JP d16 (0xCDAB), jump to execute code from the position 0xCDAB in RAM
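The two examples above can be turned into a toy fetch-decode-execute loop. The sketch below handles just these two opcodes; a real emulator would cover all 512 (and the `step` function here is a hypothetical helper, not actual Game Boy firmware):

```python
# Toy interpreter for two Game Boy opcodes: LD B, d8 (0x06) and JP d16 (0xC3).
def step(memory, pc, registers):
    """Fetch one instruction at pc, execute it, and return the new pc."""
    opcode = memory[pc]                      # fetch
    if opcode == 0x06:                       # decode: LD B, d8
        registers["B"] = memory[pc + 1]      # execute: load immediate byte into B
        return pc + 2                        # opcode + 1 parameter byte consumed
    elif opcode == 0xC3:                     # decode: JP d16
        lo, hi = memory[pc + 1], memory[pc + 2]
        return (hi << 8) | lo                # execute: 0xAB 0xCD -> jump to 0xCDAB
    raise NotImplementedError(f"opcode {opcode:#04x}")

# 64 KiB of memory, matching the 16-bit address space.
memory = bytearray(0x10000)
memory[0:5] = bytes([0x06, 0xAB, 0xC3, 0xAB, 0xCD])

registers = {}
pc = step(memory, 0, registers)   # LD B, 0xAB  -> pc = 2, B = 0xAB
pc = step(memory, pc, registers)  # JP 0xCDAB   -> pc = 0xCDAB
```

Note that for `JP`, the low byte of the target address comes first in memory; that ordering is the topic of the next slide.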
Memory Unit - RAM

● Stores data needed for the execution of programs
○ Instructions
○ Input parameters
● Allows read and/or write operations on memory addresses
● Example: Game Boy memory unit
Endianness
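Endianness is the byte order used to store multi-byte values in memory. The Game Boy's CPU (like x86) is little-endian: the least significant byte is stored first, which is why the bytes 0xAB 0xCD encode the address 0xCDAB. A quick sketch using Python's `struct` module:

```python
import struct

value = 0xCDAB

# Little-endian ("<H", 16-bit unsigned): least significant byte first.
little = struct.pack("<H", value)   # bytes 0xAB, 0xCD
# Big-endian (">H"): most significant byte first.
big = struct.pack(">H", value)      # bytes 0xCD, 0xAB

# Decoding bytes with the wrong endianness yields a different number:
swapped = struct.unpack("<H", big)[0]   # 0xABCD, not 0xCDAB
```

Mixing up endianness is a classic source of bugs when exchanging binary data between machines; network protocols conventionally use big-endian ("network byte order") for exactly this reason.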
Memory Hierarchy

● Not all data storage is created equal


○ Some are optimized for speed
○ Some are optimized for storage
● Moving components physically
closer (or combining them) can
improve performance!
○ Example: System-on-a-chip (SOC)
CPU cache

● Small but fast memory space located on the CPU to reduce read/write time from RAM
○ Faster than RAM, slower than registers
● 1 CPU may have multiple caches, each divided into lines
○ When a cache miss happens, the CPU loads the requested memory address, along with adjacent addresses in
the same line, into a cache line
→ Accessing adjacent memory addresses is beneficial for performance - Spatial Locality!
External storage - Mechanical vs Flash
Let's see the Game Boy in action!
Stack versus Heap
● Allocation - Stack: memory is allocated in one contiguous block; Heap: memory is allocated anywhere in free memory
● Allocation/read/write speed - Stack: fast (simple procedure, cache-friendly); Heap: slower (complex management, not cache-friendly)
● Access - Stack: only the thread which owns the stack; Heap: all threads can access the heap
● Size - Stack: smaller (stack overflow is an issue); Heap: larger
● Structure - Stack: linear; Heap: can support complex data structures (linked lists, resizable arrays, graphs, etc.)
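The size asymmetry is easy to observe even from Python, which manages both regions for you: the call stack is guarded by a small recursion limit (CPython-specific behavior), while the heap comfortably holds millions of objects.

```python
import sys

def depth(n=0):
    """Recurse until the interpreter's stack guard stops us."""
    try:
        return depth(n + 1)
    except RecursionError:
        return n

# The call stack tops out after roughly sys.getrecursionlimit() frames
# (about 1000 by default in CPython)...
stack_depth = depth()

# ...while the heap happily holds a million objects at once.
heap_list = list(range(1_000_000))
```

In lower-level languages the same asymmetry appears as an actual stack overflow crash: default thread stacks are typically a few megabytes, while the heap is bounded only by available memory.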
How did computers improve over time?

● Faster clock speeds
○ Moore's law - the number of transistors in an integrated circuit (IC) doubles
about every two years - no longer holds
● Multi-core processing (of course)
○ More CPU, more better? Not really
■ Race conditions
■ Synchronization cost
How did computers improve over time?

● Instruction pipelining
○ Instruction life cycle
■ Fetching the instruction from memory
■ Decoding the instruction
■ Executing
■ Memory accessing
■ Writing back
○ The same CPU core can execute different stages of different
instructions at the same time
→ Instruction-level parallelism!
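The benefit of pipelining is easy to quantify. With the 5 stages above and no pipelining, n instructions take 5n cycles; with an ideal pipeline, a new instruction enters every cycle, so n instructions finish in n + 4 cycles. A back-of-the-envelope sketch (assuming no stalls or hazards, which real pipelines must handle):

```python
STAGES = 5  # fetch, decode, execute, memory access, write back

def cycles_sequential(n):
    # Each instruction runs all 5 stages to completion before the next starts.
    return n * STAGES

def cycles_pipelined(n):
    # The first instruction takes 5 cycles to fill the pipeline; after that,
    # one instruction completes every cycle.
    return n + (STAGES - 1)

# For 1000 instructions: 5000 cycles sequentially vs 1004 pipelined,
# approaching a 5x speedup as n grows.
speedup = cycles_sequential(1000) / cycles_pipelined(1000)
```

In practice branch mispredictions and data dependencies cause pipeline stalls, so real speedups fall short of this ideal - which is exactly what motivates speculative execution on the next slide.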
How did computers improve over time?

● Speculative execution
○ Instructions are loaded and executed before the task is needed
○ Reduce the overhead at branch conditions
○ If the task is indeed not needed, changes are reverted/not committed
● Can be an attack vector for cybercriminals!
○ Side-channel attack - faster execution of instructions in the CPU's cache can be measured and indicate
specific conditions of data in memory → in-memory secret can be discovered!
○ Example: Spectre and Meltdown
How did computers improve over time?

● Improvement in RAM
○ Higher speed
○ Higher capacity
○ Multi-channel RAM
New computational units

● GPU - Graphic Processing Unit, specialized in graphical operations


● TPU - Tensor Processing Unit, specialized in tensor computational operations
● DPU - Data Processing Unit, specialized in data processing operations
● High-level ideas:
○ CPU is oriented towards high overall performance in generic tasks
○ Complicated operations can be done with multiple simple CPU instructions - more cycles
○ Relatively low level of parallelization
○ Specialized computational units can have:
■ Instructions for high-speed executions of specialized tasks
■ Higher parallelization with lower individual performance
● Nvidia GeForce RTX 4090: 16,384 cores, base clock 2235 MHz
