Microcontrollers: Lecture Notes No. 9


MICROCONTROLLERS

LECTURE NOTES no. 9

Lecturer PhD. bioeng. Monica-Claudia Dobrea


Memory System

 The following memory components are commonly available in most microcontrollers:

 RAM:
 Volatile memory (if the microcontroller loses power, the contents of RAM are lost).
 It can be written to and read from during program execution.
 It is typically used during system development to store a program. Once development is complete, the finished program is stored in nonvolatile memory such as Flash EEPROM.
 During program execution, RAM is used to store global variables, to support dynamic memory allocation of variables, and to provide a location for the stack.

 EEPROM:
 Used to permanently store and recall variables during program execution.
 Useful for storing data that must be retained during a power failure but might need to be changed periodically.
Examples: applications that store system parameters, electronic lock combinations, automatic garage door unlock sequences, etc.

 Flash EEPROM:
 Used to store programs.
 It can be erased and programmed as a whole.
 Some microcontroller systems provide a large complement of both RAM and Flash EEPROM. A system program can therefore be developed in RAM and then transferred to Flash EEPROM when complete.
Memory System

 Usually, microcontrollers use the Harvard memory architecture, with separate program and data memories.
 Sometimes the program memory (code) is divided into two read-only memory (ROM) types:
 a flash memory module, where the executable program is stored, and
 another module – for example an EEPROM – used to store tables, or variables that are rarely changed and must be kept in non-volatile memory.
 The organization of, and access to, the internal address spaces is of two types, depending on the type of microcontroller:
a. different memory addressing spaces, or
b. a common addressing space, regardless of the types of memory used.

A sample Memory Map. The memory map shows which memory addresses are currently in use and the type of memory present.
(a) Different addressing locations depending on the type of memory

 With these microcontrollers, each type of memory is accessed separately, usually through different addressing modes (e.g., direct
access, indirect access via an index register to the EEPROM memory etc.).
 Figure below shows an example of this organizing and addressing approach, considering that there are 3 types of internal memory:
 32KB Flash memory, 16KB EEPROM memory, 16KB of data memory (SRAM).

Note that each address space starts at address 0, which is why each type of memory is accessed with its own addressing mode.

To see how the location addresses translate to hex, recall that the encoding is binary to hexadecimal (four binary digits per hex digit):
(b) Common addressing space, regardless of the type of memory

 With these microcontrollers, the entire address space is common, with each of the memory types having specific access addresses.
 Figure below presents an example of this mode of organization and addressing (here, the same types and sizes of memory used previously are considered).
 For these microcontrollers, most of the time, space is also reserved for memory that could be connected externally.

 Flash, 32 K (2^15) locations:  Binary 0000 0000 0000 0000 – Hex $0000
                                to     0111 1111 1111 1111 –     $7FFF
 EEPROM, 16 K (2^14) locations: Binary 1000 0000 0000 0000 – Hex $8000
                                to     1011 1111 1111 1111 –     $BFFF
 SRAM, 16 K (2^14) locations:   Binary 1100 0000 0000 0000 – Hex $C000
                                to     1111 1111 1111 1111 –     $FFFF
Memory Space

 To keep track:
 of the memory locations in use, and
 of the type of memory present within the system,
a visual tool called a memory map is employed.

 The memory map provides:
 the size in bytes of each memory component, and
 its start and stop address within the memory system.
Note that there are portions of the memory map not in
use. These open spaces are provided for system expansion.

Address Spaces

Also, the address map within the memory address space on Intel systems is split into two separate sub-ranges:
 a range of addresses that, when decoded, accesses the DRAM (physical memory), and
 a range of addresses that are decoded to select the I/O devices – MMIO (Memory-Mapped I/O).

 DRAM and MMIO space together represent the Memory Space, which is, in fact, the primary address space.
 Besides the memory space, a microcontroller can also have an I/O Address Space.

System Memory Map on Intel systems


Address Spaces

 Memory Space (the primary address space)
 covers the DRAM and most I/O devices (MMIO);
 occupies the entire physical address space of the processor;
 access to memory space is achieved by memory read/write instructions such as MOV.

 I/O Address Space
 is far smaller (only 64 KB) and can only be accessed via IN/OUT instructions;
 as accesses to I/O devices through I/O space are relatively time-consuming, most systems avoid the I/O space except for supporting legacy features.

System Memory Map on Intel systems. Mapping Address Spaces.


Address Spaces

For example, the Intel® Quark SoC X1000 supports four different address spaces:
 Physical Address Space (Memory Space)
 I/O Space
 PCI Configuration Space
 Message Bus Space.

In this example, the CPU core can only directly access:
 memory space, through memory reads and writes;
 I/O space, through the IN and OUT I/O port instructions.
PCI configuration space is indirectly accessed through I/O or memory space; the Message Bus space is indirectly accessed through PCI configuration space.



Let’s remember!

 The main memory is connected to the CPU via a bus – in fact, two or three buses:
 the data bus;
 the address bus; and, also,
 a control bus, which carries timing pulses and the level indicating writing or reading.

 
[Figure: Processor and memory connected by a k-bit address bus (via the MAR) and an n-bit data bus (via the MBR), plus control lines (R/W̄); up to 2^k addressable locations, word length = n bits.]
Let’s remember!

 The address bus does not have to be the same width as the data bus.

 The number of conductors in the address bus (the width of the address bus) sets the
upper limit of memory locations that may be linearly addressed by the
microcontroller (the number of uniquely addressable memory locations).

 
Let’s remember!

 Having n address lines means that there are 2^n addresses, or locations, in the address space.

For example, a microcontroller equipped:
 with a 16-bit address bus is capable of addressing 65536 (64 K) separate locations. The first address in this memory space is (0000)₁₆, while the last address in this space is (FFFF)₁₆.
 with a 24-bit address bus and a 16-bit-wide data bus (the contents width is 16 bits) provides an address space from 0x0 to 2^24 − 1, or 0xFFFFFF in hex. In the next figure, for example, the contents of address 4 are 0x01FF. The largest (unsigned) integer that can be held is 2^16 − 1, or 0xFFFF.
Let’s remember!

 A convenient method of figuring out 2^n is to remember that 2^10 = 1024, so:
 n = 10 lines address 1 K locations,
 n = 20 lines address 1 M locations, and
 n = 30 lines address 1 G locations.
Of course, microcontrollers tend to have a smaller amount of memory, because they are not designed to multitask (i.e., run multiple programs); 256 K locations was the largest figure spotted (in 2010).
Memory Organization

 An address bus with n lines can address 2^n words of memory.
 So, a system with an address bus 24 bits wide (lines A0 ÷ A23) can address 16 Mwords, or 32 MBytes, of memory.
 Standard memories are 1 byte wide, so our 16-bit data bus would require 2 chips in parallel.
 Note the use of the Chip Select (CS) lines!
 All the address lines go to each chip, whereas the data lines are selective.
 Using the CS lines, one can select both bytes, the upper byte only, the lower byte only, or neither byte.

Memory chip organization.


Memory Organization
 Up to this point we have assumed that the address is given in words. But it is common to give the address in bytes (e.g., the MC68000 model):
 the upper byte has an even address,
 the lower byte an odd address.
From this new perspective, the 24 address lines will only address 16 MBytes of memory.

 Moreover, usually the memory is only addressable on even byte boundaries.
What this means is that the only valid addresses are even (e.g., LDA 0 is legal, LDA 5 is not). A direct consequence is that only words are addressable. Because of this restriction, the least significant address line is redundant – A0 would always be equal to 0! So, one could use address lines A1 ÷ A23 only, to address 8 Mwords.

Our model (upper) and the 68000 scheme (lower).
Building a memory from standard chips

For example, let’s build:
 a 2 MWord (where Word = 2 Bytes) memory, from
 1/4 MB chips (256 KB = 2^18 bytes).

Solution:
We’ll have to arrange them:
 “8 high” – to cover the address space, and
 “2 wide” to cover the data space.

Memory chip organization.


MEMORY SYSTEM – Cache memory

 Also called CPU memory.

 A cache is:
 a smaller memory,
 a faster memory,
 closer to a processor core,
which acts as a high-speed buffer between the CPU and main memory.

 It temporarily stores very active data and actions during processing:
 instructions currently being executed, or which may be executed within a short period of time, and/or
 data that the CPU may frequently require for manipulation.

 It is:
 typically integrated directly with the CPU chip, or
 placed on a separate chip that has a separate bus interconnect with the CPU.

The term "cache memory", "memory cache", or shortly "cache" without any specification usually refers to a "hidden memory that stores a subset of main memory content" – specifically, the "instructions" of a program and the related "data" that must be processed.
MEMORY SYSTEM – Cache memory

When the processor needs to read from or write to a location in main memory:
 It first checks whether a copy of that data is in the L1 cache. If so (a "cache hit"), the processor immediately reads from or writes to the cache, which is much faster than reading from or writing to main memory.
 If it does not find it (a "cache miss"), the access continues to main memory; it simply takes longer.

[Diagram: the memory address from the processor is compared with all stored addresses simultaneously; if the address is found, the cache location is accessed; if the address is not found in the cache, main memory is accessed.]
MEMORY SYSTEM – Cache memory

Functional principles of the cache memory

Cache memory operation is based on two major "principles of locality":
- Temporal locality
- Spatial locality

Temporal locality
 Data that have been used recently have a high likelihood of being used again.
A cache stores only a subset of main memory (MM) data – the most recently used (MRU). Data read from MM are temporarily stored in the cache; if the processor requires the same data again, they are supplied by the cache. The cache is effective because short instruction loops and routines are a common program structure, and generally several operations are performed on the same data values and variables.

Spatial locality
 If a datum is referenced, it is very likely that nearby data will be accessed soon.
Instructions and data are transferred from MM to the cache in fixed-size blocks (cache blocks), known as cache lines. Cache line size is in the range of 4 to 512 bytes, so that more than one processing datum (4/8 bytes) is stored in each cache entry. After a first MM access, all the data of a cache line are available in the cache.
Most programs are highly sequential: the next instruction usually comes from the next memory location. Data are usually structured, and data in these structures are normally stored in contiguous memory locations (strings, arrays, etc.).
Large line sizes increase spatial locality, but also increase the number of invalidated data in case of line replacement (see Replacement policy).
(Note – for brevity, the term "data" will often be used instead of "cache line" or "cache block".)
MEMORY SYSTEM – Cache memory

Write Policy
 When the processor writes data to the cache, it must at some point write that data to the backing store as well.
 The timing of this write is controlled by what is known as the write policy. There are two basic writing approaches:

 Write-through – the write is done synchronously, both to the cache and to the backing store.
 Write-back (also called write-behind) – initially, writing is done only to the cache; the write to the backing store is postponed until the cache blocks containing the data are about to be modified/replaced by new content. This method was used in the Intel processor class starting with the 80486 processor.

 Due to the mechanisms that govern these methods:
 the write-back method often results in better performance (a higher speed) than the write-through method;
 this superior performance is due to minimizing the number of writes to the system's RAM.
MEMORY SYSTEM – Cache memory

The processor’s cache memory may have one of the following two organizations:
 unified cache memory – the same memory for both data and code, and
 split cache for data and instructions – this cache organization is called the modified Harvard architecture and is used by Intel's processors (IA-32 and IA-64), among others. These caches are shortly called the "I-Cache" (Instruction cache) and the "D-Cache" (Data cache).

a) Example of Unified Cache Memory, and b) A simplified block diagram of a CPU with split L1 cache
 
MEMORY SYSTEM – Cache memory

Hierarchy of several cache levels

The data cache is usually organized as a hierarchy of several cache levels:
• Primary cache (also called L1):
- is located as close as possible to the execution unit, being integrated into the CPU;
- is an SRAM (static RAM, with very low access times); it is much faster than the external memory;
- the support circuits of the cache memory copy the necessary data and program from RAM and try to "guess" what will be needed in the immediate future, in order to bring this information into the corresponding memories (L1D and L1P). Range: 8 KB – 64 KB (POWER8).
• Secondary cache (called L2 Cache or Level 2 Cache):
- is also an SRAM memory; it can be external or internal to the processor;
- L2 is a buffer between the processor and external RAM;
- it is a faster memory than external RAM, but slower than the L1 cache – compared with L1, latency times are no longer very close to zero; range: 64 KB – 8 MB (16 MB on the IBM RS64-IV);
- used to increase the global size and for data coherency.
• Level 3 (L3) cache:
- is typically specialized memory that works to improve the global size and the performance of L1 and L2; range: 4 MB – 128 MB;
- it can be significantly slower than L1 or L2, but is usually double the speed of RAM;
- in the case of multicore processors, each core may have its own dedicated L1 and L2 cache, but share a common L3 cache.
MEMORY SYSTEM – Cache memory

Hierarchy of several cache levels (continued)

• Level 4 (L4) cache:
- remote cache;
- range: larger than L3 (512 MB or more).
MEMORY SYSTEM – Cache memory (hierarchy)

Below the caches sit the main memory and, further down, secondary memory.

Memory Bandwidth and Latency:
 L1 = 16 GB/s & 1.2 ns,
 L2 = 15.5 GB/s & 3 ns,
 L3 = 15 GB/s & 6.5 ns,
 Main memory = 10 GB/s & 24 ns.
Memory-mapped IO vs. Port-mapped IO

 Microprocessors normally use two methods to connect external devices:
 memory-mapped I/O (MMIO), or
 port-mapped I/O (PMIO, or isolated I/O).
 Memory mapped I/O (MMIO) is mapped into the same address space as program memory
and/or user memory, and is accessed in the same way (MOV).
 Port mapped I/O uses a separate, dedicated address space and is accessed via a dedicated set of
microprocessor instructions (on the Quark X1000 CPU, this is the IN and OUT instructions
which can read and write a single byte to an I/O address).
 The difference between the two schemes occurs within the microprocessor. Intel has, for the
most part, used the port mapped scheme for their microprocessors and Motorola has used the
memory mapped scheme.
Memory-mapped IO vs. Port-mapped IO

 As 16-bit processors have become obsolete and been replaced by 32-bit and 64-bit processors in general use, reserving ranges of memory address space for I/O is less of a problem: the memory address space of the processor is usually much larger than the space required for all memory and I/O devices in a system.
 Therefore, it has become more frequently practical to take advantage of the benefits of
memory-mapped I/O.
 However, even with address space being no longer a major concern, neither I/O mapping
method is universally superior to the other, and there will be cases where using port-
mapped I/O is still preferable.
Memory-mapped IO

 I/O devices are mapped into the system memory map along with
RAM and ROM. To access a hardware device, simply read or write
to those “special” addresses using the normal memory access
instructions.

 All I/O devices monitor independently the CPU address bus and
respond to any access of device-assigned address space, connecting
the data bus to a desirable device's hardware register.

 The advantage of this method is that every instruction which can access memory can be used to manipulate an I/O device.

 The disadvantage of this method is that the entire address bus must be fully decoded for every device. For example, a machine with a 32-bit address bus would require logic gates to resolve the state of all 32 address lines to properly decode the specific address of any device. This increases the cost of adding hardware to the machine.

Memory Mapped I/O (simplified diagram)
Memory-mapped IO

 Full I/O decoding involves checking every single line (i.e., all bits) of the address bus (and, possibly, the I/O R/W signal) to determine whether a device is selected or not. With full I/O decoding, each hardware register is mapped to a unique I/O port address.

Port-mapped IO (PMIO or Isolated IO)

 I/O devices are mapped into a separate address space.
 This is usually accomplished by having a different set of signal lines and an extra "I/O Request" pin on the CPU's physical interface to indicate a memory access versus a port access (i.e., whether the CPU is trying to access memory or an I/O device).
 The address lines are usually shared between the two address spaces, but fewer of them are used for accessing ports. An example of this is the standard PC, which uses 16 bits of port address space but 32 bits of memory address space.

Port Mapped I/O (simplified diagram)


Port-mapped IO (PMIO or Isolated IO)

 The advantage of this system: less logic is needed to decode a discrete address, and therefore it costs less to add hardware devices to a machine.
 On the older PC-compatible machines, only 10 bits of address space were decoded for I/O ports, and so there were only 1024 unique port locations.
 Modern PCs decode all 16 address lines.


Port-mapped IO (PMIO or Isolated IO)

 To read from or write to a hardware device, special port I/O instructions are used.

 From a software perspective, this is a slight disadvantage, because more instructions are required to accomplish the same task.

For instance, to test one bit on a memory-mapped port there is a single instruction to test a bit in memory, but for ports we must first read the data into a register and then test the bit.


Comparison: Memory-Mapped IO vs Port-Mapped IO

Memory-mapped IO:
- Same address bus used to address memory and I/O devices.
- Access to the I/O devices using regular instructions.
- Most widely used I/O method.

Port-mapped IO:
- Different address spaces for memory and I/O devices.
- Uses a special class of CPU instructions to access I/O devices (x86 Intel microprocessors – the IN and OUT instructions).
