BCSE305L - Embedded Systems Systems: Module - 3 Architecture of Special Purpose Computing System
BCSE305L - Embedded Systems Systems: Module - 3 Architecture of Special Purpose Computing System
BCSE305L - Embedded Systems Systems: Module - 3 Architecture of Special Purpose Computing System
MODULE - 3
ARCHITECTURE OF SPECIAL PURPOSE
COMPUTING SYSTEM
Page 2
Page 3
MODULE - 3 Page 4
MICROCONTROLLER ARCHITECTURE
CLASSIFICATION OF MICROOCNTROLLERS
MODULE - 3 Page 5
MICROCONTROLLER ARCHITECTURE
CLASSIFICATION OF MICROOCNTROLLERS
q MEMORY ARCHITECTURE:
MODULE - 3 Page 6
MICROCONTROLLER ARCHITECTURE
HARVARD VS VON-NEUMANN
MICROCONTROLLER
MICROCONTROLLER
PHERIPHERALS
PHERIPHERALS
Input Output
Input Output CPU
Devices CPU Devices
Devices Devices
MEMORY
(Program + Data) Program Data
Memory Memory
MODULE - 3 Page 7
MICROCONTROLLER ARCHITECTURE
CLASSIFICATION OF MICROOCNTROLLERS
q INSTRUCTION SET:
− CISC: Complex Instruction Set Computer
− Allow single (complex) instructions to perform numerous low-
level (simple) operations like a load from memory, arithmetic
operation, store into memory with multiple clock cycle
− MUL A, B : get the value of A and B from registers, compute
multiplication by repeated addition and store results back to
registers
Ex: Motorola 68000 (68K), 8051
MODULE - 3 Page 8
MICROCONTROLLER ARCHITECTURE
CLASSIFICATION OF MICROOCNTROLLERS
q INSTRUCTION SET:
− RISC: Reduced Instruction set Computer
− reduce the instruction execution complexity by having
several simple instructions which achieve low-level operation
within a single clock cycle
− LDR for loading, ADD with loop count for multiplication then
STR for storing operations
Ex: AVR, PIC, ARM
MODULE - 3 Page 9
MICROCONTROLLER ARCHITECTURE
CLASSIFICATION OF MICROOCNTROLLERS
q FAMILIES/ARCHITECTURE:
− MCS51: A microcontroller (MCU) series developed by Intel and optimized for
embedded control applications
− PIC: A family of microcontrollers made by Microchip Technology and used
in a wide variety of embedded systems
− AVR: A family of microcontrollers developed by Atmel (acquired by
Microchip in 2016) and majorly used by hobbyist and educational
embedded applications
− ARM: By combining the ARM microprocessor with RAM, ROM and other
peripherals in one single chip, ARM microcontrollers are made (LPC 11U24)
− OTHERS: MIPS, PowerPC, RISC-V, Motorola etc.,
MODULE - 3 Page 10
MICROCONTROLLER ARCHITECTURE
MCS51- Features (8051)
Ø Instruction set Architecture : CISC
Ø Data bus size : 8-bit
Ø Address bus size : 16-bit
Ø Program memory (ROM) : 4 KB
Ø Data memory (RAM) : 128 bytes
Ø General purpose registers : 32 (each 8-bit in size)
Ø Program counter : 16-bit (PC)
Ø Data pointer : 16-bit (DPTR)
Ø Input/output ports : 4 (each 8-bit in size)
Ø Timers : 2 (T0 & T1)
Ø Interrupts : 5 (3 internal + 2 external)
Ø Serial port : 1 (8-bit & 9-bit mode)
MODULE - 3 12
8051- Architecture
MODULE - 3 13
MICROCONTROLLER ARCHITECTURE
PIC – Features (16F877A)
Ø Instruction set Architecture : RISC
Ø Data bus size : 8-bit
Ø Address bus size : 14-bit
Ø Program memory (ROM) : 8 KB
Ø Data memory (RAM) : 368 bytes
Ø Data memory (EEPROM) : 258 bytes
Ø Input/output ports : 5 (A, B, C, D, E)
Ø Timers : 3 (Timer0, 1 & 2)
Ø Interrupts : 15
Ø ADC modules (10-bit) : 8 channels
Ø PWM modules :2
Ø Analog comparators :2
Ø Serial communication : MSSP, USART
Ø Parallel communication : PSP
MODULE - 3 CSE3006 – EMBEDDED SYSTEM DESIGN 14
PIC – Architecture (16F877A)
MODULE - 3 19
20
EMBEDDED MEMORY
MEMEORY ORGANISATION
MODULE - 3 37
EMBEDDED MEMORY
MEMEORY ORGANISATION
q System Space – Exception Vectors
MODULE - 3 23
EMBEDDED MEMORY
MEMORY CLASSIFICATION
SRAM
RAM
DRAM
MASKED ROM
EPROM
EEPROM
HYBRID FLASH
NVRAM
MODULE - 3 25
EMBEDDED MEMORY
CHARACTERISTICS OF VARIOUS EMBEDDED MEMORY DEVICES
Type Volatile? Writeable? Erase Size Max Erase Cycles Cost Speed
SRAM Yes Yes Byte Unlimited Expensive Fast
MODULE - 3 34
Handheld devices
37
Selecting The Right Processor
q µP are offered in 4 to 64-bit size with distinct features like cost, speed, no.
of CPU core, address & data line are used in simple toys to network router
q DSP are majorly used for high computation intensive applications such as
image processing, communication devices, voice to text converter etc.,
MODULE - 3 CSE3006 – EMBEDDED SYSTEM DESIGN 38
PROCESSOR SELECTION CRITERIA
q Sequence of analysis to be made selecting an appropriate processor for
embedded system applications as follows,
1. Application requirement analysis: understand the purpose of
application and arrive specific requirement
2. Processor Architecture analysis: MCS51, ARM, PIC, PowerPC, MIPS etc.,
3. Peripheral set analysis: Includes on-chip (RAM, ROM ,I/O Ports, ADC)
and specialized processing units (FPU, MMU, DMA)
q POS Hand Held spot Billing Machine (SBM) is a GPRS Enabled machine.
The device is compact and lightweight(less than 500grms).
q The machine is equipped with technology which serves as a Hand Held
computer.
q The device is WEB/USB enabled which helps the operator to get instant
bill remotely and update transactions back to the server.
Block Diagram
Specification
Processor : ARM 32-bit Cortex-M3 CPU(120 Mhz max)with Adaptive real-time accelerator (ART Accelerator),
MPU,150DMIPS/1.25DMIPS/MHz(Dhrystone 2.1)capable of In-System Programming.
On The Fly Programming(Can be done in the field using laptop)for Upgrading the units through
Serial/USB Ports.
Choice of Program Memory Capacity of 512K/1024K bytes.
Memory : 8 MB of Non-volatile data memory is provided with the units data retention is minimum 10 years
with out any power being applied. Expandable to 16 MB
Optional - It also has micro SD card slat for large memory backup.
Real Time Clock : In-built Real time clock with battery backup.
Key-Board : A 35 key multi function key board is provided on the front panel.
LC Display : 132*64 pixels graphical LCD display with back light provided for user interaction. It has various
graphical icons for indication of battery status, signal status etc.
MODULE - 3 42
PROCESSOR SELECTION CRITERIA
Case study-1: Handheld Device
Specification
Printer : High Speed 24 Column Impact Printer With 2.7 Lines/Sec, is Provided as
Standard fitment.
Paper Roll : 57.5mm+/-0.5mm,60mm dia Paper Roll Fitment.
Communication : a) RS232C Ports-1 Nos(Option of second port)with flexible baud rates.
b) RS232C Ports-1 Nos(Option of second port)with flexible baud rates.
c) Optional IR or IRDA or Both IR or IRDA Port for Communication with electricity meters.
d) Built-in GSM/GPRS modem optional.
e) Optional GPS.
Smart Card Interface : Optional Contact & Contact less(MIFARE)Smart Card Interface.
Batteries : The units are powered by 7.4v 2600 mAH LI-ION Rechargeable Battery pack.
Battery Charger : External Universal voltage AC/DC 9.7V-2Amp Adaptor provided.
Mechanical Dimension : Width-102.00,Length-276.00,Height-82.00.
Weight : Weight-655grams~(without paper roll).
MODULE - 3 42
PROCESSOR SELECTION CRITERIA
Case study-1: Handheld Device
Processor
MODULE - 3 42
PROCESSOR SELECTION CRITERIA
Case study-1: Handheld Device
Processor
MODULE - 3 42
PROCESSOR SELECTION CRITERIA
Case study-1: Handheld Device
Processor
MODULE - 3 42
PROCESSOR SELECTION CRITERIA
Case study-1: Handheld Device
Nested Vector Interrupt
MODULE - 3 42
PROCESSOR SELECTION CRITERIA
Case study-1: Handheld Device
MODULE - 3 42
PROCESSOR SELECTION CRITERIA
Case study-1: Handheld Device
q It is built for life out in the field, integrates Corning Gorilla glass for display,
able to handle drops, bumps, spills, dust, vibration.
q Android 10.0 OS
q 8GB ROM + 1 GB RAM(16GB+2GB Optional)
q 4'' high resolution(480*800) IPS TFT display
q 8MP AF Camera with LED flash
q 2G,3G,4G ,WIFI,Bluetooth connection
q IP 65 Sealing
q 2D Zebra SE4710 barcode scanner
MODULE - 3 42
PROCESSOR SELECTION CRITERIA
Case study-1: Handheld Device
Specification
MODULE - 3 42
PROCESSOR SELECTION CRITERIA
Case study-1: Handheld Device
Processor
q The Cortex-A53 processor has one to four cores, each with an L1 memory
system and a single shared L2 cache. It can be combined with other
Cortex-A CPUs in a big. LITTLE configuration.
MODULE - 3 Page 66
PROCESSOR SELECTION CRITERIA
Case study-1: Handheld Device
Processor
MODULE - 3 Page 67
PROCESSOR SELECTION CRITERIA
Case study-1: Handheld Device
Processor
MODULE - 3 Page 68
PROCESSOR SELECTION CRITERIA
Case study-2: Digital Camera
q DSP - is the brain of the camera and is responsible for performing all the
computations needed to process and compress the image.
MODULE - 3 Page 69
PROCESSOR SELECTION CRITERIA
Case study-2: Digital Camera
MODULE - 3 Page 70
PROCESSOR SELECTION CRITERIA
Case study-2: Digital Camera
MODULE - 3 Page 71
PROCESSOR SELECTION CRITERIA
Case study-2: Digital Camera
q On most DSCs, the user has the ability to view the image to be captured
on the LCD display.
q The compressed images are stored in Flash memory for later use.
q Most DSC systems also provide an NTSC/PAL video signal to view the
captured images (also the preview images) on a TV monitor.
q The current DSCs also provide ways to connect to the external PC or
printer through an RS-232 or a USB port.
MODULE - 3 Page 72
PROCESSOR SELECTION CRITERIA
Case study-2: Digital Camera
Image Pipeline
MODULE - 3 Page 73
PROCESSOR SELECTION CRITERIA
Case study-2: Digital Camera
Image Pipeline
q The Colour Filtered Array (CFA) data needs to undergo significant amount
of image processing before the image can be finally presented in a
usable format for compression.
q All these processing stages are collectively called the “image pipeline.”
q Most of these tasks are multiply-accumulate (MAC) intensive operations.
q The TMS320C54x DSP is well suited to perform these tasks efficiently and
generate a high-quality image that is close to the image quality offered
by traditional film from the raw CCD data.
MODULE - 3 Page 74
PROCESSOR SELECTION CRITERIA
Case study-2: Digital Camera
MODULE - 3 Page 75
PROCESSOR SELECTION CRITERIA
Case study-2: Digital Camera
MODULE - 3 Page 76
PROCESSOR SELECTION CRITERIA
Case study-2: Digital Camera
MODULE - 3 Page 77
PROCESSOR SELECTION CRITERIA
Case study-2: Digital Camera
MODULE - 3 Page 78
PROCESSOR SELECTION CRITERIA
Case study-2: Digital Camera
MODULE - 3 Page 79
PROCESSOR SELECTION CRITERIA
Case study-2: Digital Camera
MODULE - 3 Page 80
PROCESSOR SELECTION CRITERIA
Case study-3: ATM
MODULE - 3 Page 81
PROCESSOR SELECTION CRITERIA
Case study-3: ATM
MODULE - 3 Page 82
PROCESSOR SELECTION CRITERIA
Case study-3: ATM
ATM – Harware
q CPU (to control the user interface and transaction devices)
q Magnetic and/or Chip card reader (to identify the customer)
q PIN Pad (similar in layout to a Touch tone or Calculator keypad), often
manufactured as part of a secure enclosure.
q Secure cryptoprocessor, generally within a secure enclosure.
q Display (used by the customer for performing the transaction)
q Function key buttons (usually close to the display) or a Touchscreen (used
to select the various aspects of the transaction)
q Record Printer (to provide the customer with a record of their transaction)
q Vault (to store the parts of the machinery requiring restricted access)
q Housing (for aesthetics and to attach signage to)
MODULE - 3 Page 83
PROCESSOR SELECTION CRITERIA
Case study-3: ATM
ATM – Vaults
MODULE - 3 Page 84
PROCESSOR SELECTION CRITERIA
Case study-3: ATM
ATM – Networking
q The internet service provider (ISP) also plays an important role in the ATMs.
q This provides communication between ATM and host processors.
q When the transaction is made, the details are input by the cardholder.
q This information is passed on to the host processor by the ATM.
q The host processor checks these details with an authorized bank.
q If the details are matched, the host processor sends the approval code to the machine
so that the cash can be transferred.)
MODULE - 3 Page 85
PROCESSOR SELECTION CRITERIA
Special Purpose Processor
MODULE - 3 Page 86
Page 87
Page 88
Page 89
Page 90
Page 91
Page 92
Page 93
Page 94
Page 95
Page 96
Page 97
Page 98
Page 99
Page 100
Page 101
Page 102
Page 103
Page 104
Page 105
Page 106
Page 107
Page 108
GOALS OF DATA COMPRESSION IN EMBEDDED SYSTEM
Page 109
BLOCK DIAGRAM FOR COMPRESSOR
1..m packed
1..n input output
symbols symbols
input data compressor output
Page 110
HUFFMAN CODING
Page 111
HUFFMAN EXAMPLE
character P
a .45
b .24
c .11
d .08
e .07
f .05
Page 112
EXAMPLE HUFFMAN CODE
Page 113
REQUIREMENTS
Page 114
BUILDING A SPECIFICATION
Page 115
DATA-COMPRESSOR CLASS
data-compressor
buffer: data-buffer
table: symbol-table
current-bit: integer
encode(): boolean,
data-buffer
flush()
new-symbol-table()
Page 116
DATA-COMPRESSOR BEHAVIORS
Page 117
AUXILIARY CLASSES
data-buffer symbol-table
databuf[databuflen] : symbols[nsymbols] :
character data-buffer
len : integer len : integer
Page 118
AUXILIARY CLASS ROLES
Page 119
CLASS RELATIONSHIPS
data-compressor
1
1
1 1
data-buffer symbol-table
Page 120
ENCODE BEHAVIOR
F
add to buffer return false
Page 121
INSERT BEHAVIOR
pack into
T this buffer
input
symbol update
fills buffer?
length
F
pack bottom bits
into this buffer,
top bits into
overflow buffer
PROGRAM DESIGN
symbol table
compare
MODULE 4
PROGRAMMING TOOLS
Weightage
BCSE305L Page 2
BUILDING PROCESS FOR EMBEDDED SYSTEMS
BCSE305L Page 3
BUILDING PROCESS FOR EMBEDDED SYSTEMS
BCSE305L Page 4
BUILDING PROCESS FOR EMBEDDED SYSTEMS
q The only thing that has really changed is that you need to have
an understanding of the target hardware platform. Furthermore,
each target hardware platform is unique.
q When build tools run on the same system as the program they
produce, they can make a lot of assumptions about the system.
q Instead, the user must provide some knowledge of the system to the
tools by giving them more explicit instructions.
BCSE305L Page 6
BUILDING PROCESS FOR EMBEDDED SYSTEMS
BCSE305L Page 7
BUILDING PROCESS FOR EMBEDDED SYSTEMS
q Software Tools
1. Software Development Kit (SDK)
2. Source-code Engineering Software
3. RTOS
4. Integrated Development Environment (IDE)
5. Emulator
6. Editor
7. Interpreter
8. Compiler
9. Assembler
10. Cross Assembler
11. Locator
12. Testing and debugging tools
BCSE305L Page 8
PROCESS FOR DEVELOPING EMBEDDED SOFTWARE
q To develop software for a General Purpose Computer
Ø Create source file
Ø Type in C code
Ø Build: compile and link
Ø Execute: load and run
Each of the source files must be compiled or assembled into an object file
All of the object files that result from the first step must be linked together
to produce a single object file, called the re-locatable program.
BCSE305L Page 11
PREPROCESSING
q Most object files begin with a header that describes the sections
that follow.
q These are the symbols that refer to variables and functions defined
in other source files.
BCSE305L Page 18
LINKER
q All of the object files resulting from the compilation in step one must
be combined.
q The job of the linker is to combine these object files and, in the
process, to resolve all of the unresolved symbols
q By merging the text, data, and bss sections of the input object files,
the linker creates a new object file that contains all of the code
and data from the input object files.
BCSE305L Page 19
LINKER
q And all of the initialized and uninitialized variables will reside in the
new data and bss sections, respectively.
q After merging all of the code and data sections and resolving all of
the symbol references, the linker produces an object file that is a
special “relocatable” copy of the program.
BCSE305L Page 23
LOCATOR - EXAMPLE
q The first executable instruction is ENTRY command, which appears
on the first line and the entry point is the function main.
q This script informs the GNU linker’s built-in locator about the
memory on the target board, which contains 64 MB of RAM and 16
MB of flash ROM.
q The linker script file instructs the GNU linker to locate the data, bss,
and text sections in RAM starting at address 0x00400000.
q The start and stop addresses for this operation can be established
symbolically by referring to the addresses as _DataStart and
_DataEnd.
BCSE305L Page 25
BUILD PROCEDURE FOR THE ARCOM VIPER-LITE DEVELOPMENT BOARD
COMPILE
q The first step in the build process is to compile these two files.
BCSE305L Page 27
BUILD PROCEDURE FOR THE ARCOM VIPER-LITE DEVELOPMENT BOARD
COMPILE
q The basic structure for the gcc compiler command is:
BCSE305L Page 28
BUILD PROCEDURE FOR THE ARCOM VIPER-LITE DEVELOPMENT BOARD
COMPILE
q We broke up the compilation step into two separate commands,
but you can compile the two files with one command.
q To use a single command, just put both of the source files after the
options. If you wanted different options for one of the source files,
you would need to compile it separately as just shown.
q So if all goes well, there will now be two additional files—led.o and
blink.o—in the working directory.
BCSE305L Page 29
BUILD PROCEDURE FOR THE ARCOM VIPER-LITE DEVELOPMENT BOARD
q For the third step, locating, there is a linker script file named
viperlite.ld that we input to ld in order to establish the location of
each section in the Arcom board’s memory.
BCSE305L Page 30
BUILD PROCEDURE FOR THE ARCOM VIPER-LITE DEVELOPMENT BOARD
q If startup code were included, you would want that object file to be
located at the proper address.
q The linker script file can be used to specify where you want the
startup routine (and other code) to reside in memory.
q Furthermore, you can also use the linker script file to specify exact
addresses for code or data, should you find it necessary to do so.
BUILD PROCEDURE FOR THE ARCOM VIPER-LITE DEVELOPMENT BOARD
q The .map file gives a complete listing of all code and data
addresses for the final software image.
q The most straightforward pieces of information in the map file are the
actual memory regions, with location, size and access rights granted to
those regions
LINKER MAP FILES
q Linker script and memory map section contains a breakdown of the
memory contribution of each and every file that was linked into the final
image.
LOADING ON THE TARGET
q This topic explains how the hex code is loaded on the target board
which is referred as downloading
Ø Step 4: The Device programmer then transfers the binary image bit by
bit to the chip.
MODULE - 5 CSE3006 – EMBEDDED SYSTEM DESIGN
LOADING ON THE TARGET
1. Using a Device Programmer:
uP/uC/Memory Chip
Host Computer
Device Programmer
LOADING ON THE TARGET
2. Using In System Programmer(ISP):
Ø Certain Target embedded platforms uses a piece of hardware called
ISP that have a hardware interface to both the computer as well the
target board’s chip where the code is to be downloaded.
Modelling
embedded
Unified
systems Data
model Flow
language Graph
50
Modelling Programs
Representation of a program
q Can be represented by
q Data Flow graph (DFG)
q Control Data Flow Graph (CDFG)
q Decision nodes
while (a < b)
{
a = proc1(a,b);
b= proc2(a,b);
}
if (cond1)
basic_block_1( );
else
basic_block_2();
basic_block_3( );
switch (test1) {
case c1: basic_block_4( ); break;
case c2: basic_block_5( ); break;
case c3: basic_block_6( ): break;
}
COMPONENTS OF PETRINET
PROPERTIES
q Sequential Execution: Transition t2 can
take place only after t1. Here a
constraint is imposed “t2 after t1”
PROPERTIES
qScenario 1: Normal
§ Enters all 4 digits and
press OK.
qScenario 2: Exceptional
§ Enters only 3 digits
and press OK.
q The machine dispenses two kinds of snack bars 20c and 15c.
Constraint: 10c and 5c coins can only be used
q Scenario 1:
§ Deposit four 5c, take 20c snack bar.
q Scenario 2:
§ Deposit 10 + 5c, take 15c snack bar.
q Scenario 3:
§ Deposit 5 + 10 + 5c, take 20c snack bar.
INTRODUCTION
INTRODUCTION
q UML is used in many applications by many people like
business users, common people to make the system simple,
clear and understandable
System
Actors
Use Cases
Relationships
q Extend: extending use case will work exactly like the base use
case
Verify
Primary actor – Initiates the process
<<Include>> Password
Log in
Secondary actor – Responds
<<Extend>>
Error
Check Use case – Action
Balance
Customer
Association
Transfer
fund
Bank Exclude
Make
Include
Payment
Generalization
New Old
customer customer
CLASS DIAGRAM
q Class diagram- use to document software architecture
CLASS DIAGRAM
q Class diagram - Static diagram, Structural diagram
CLASS DIAGRAM
q The purpose of the class diagram can be summarized as −
Ø Conceptual modelling
Class
Relationships
Attributes
Methods
Anil Aggregation
Sunil
Classroom
Multiplicity
N
0…*
0…1
1…*
m…n
1…* 1
Tables Projector
Composition
SEQUENCE DIAGRAM
q Sequence Diagrams are suitable for following scenarios:
q Usage scenario – to determine how the system could be
used
q Method logic – to explore logic of any function, procedure
or complex process
q Service logic – ideal way to explain high level methods
used by clients
Alternate frame
Alt
If it is valid
Else
Yes/No
Yes/No
Payment
Receipt
STATE DIAGRAM
STATE DIAGRAM
q Uses of state chart diagram:
Ø We use it to depict event driven objects
Ø We use it for showing use cases in business context
Ø Shows the overall behavior of state machine
Ø We use it to model the dynamic behavior of the system .
q Steps to draw a state diagram:
Ø Identify the initial state and the final states.
Ø Identify the possible states in which the object can exist
Ø Label the events which trigger these transitions.
MODULE - 5 CSE3006 – EMBEDDED SYSTEM DESIGN
UNIFIED Modelling LANGUAGE (UML)
STATE DIAGRAM SYMBOL NOTATIONS
Final state
Self Transition
Send Special or
Idle order Normal
request order v
v
Initial Abnormal
state exit state
Order
Final confirmation
state v
Dispatch
order v
Statechart
Statechart for for
"User" “Elevator
Behavior Control
System"
ACTIVITY DIAGRAM
q An activity diagram - behavioral diagram
ACTIVITY DIAGRAM
ACTIVITY DIAGRAM
q Steps to draw a activity diagram:
Ø Identify the initial state and the final states.
Ø Identify the intermediate activities needed to reach the final
state from he initial state.
Ø Identify the conditions or constraints which cause the
system to change control flow.
Ø Draw the diagram with appropriate notations.
MODULE - 5 CSE3006 – EMBEDDED SYSTEM DESIGN
UNIFIED Modelling LANGUAGE (UML)
ACTIVITY DIAGRAM SYMBOL NOTATIONS
Initial node Final node Object node Decision node Merge node
Fork node
Join node
[Yes]
Confirm order
[No]
Termination
Dispatch order
if (a || b) { /* T1 */ a b c path
if ( c ) /* T2 */ 0 0 0 T1=F, T3=F: no assignments
x = r*s+t; /* A1 */
0 0 1 T1=F, T3=T: A4
else y=r+s; /* A2 */
z = r+s+u; /* A3 */ 0 1 0 T1=T, T2=F: A2, A3
N
i=N
for (i=0, f=0; i<N; i++)
Y
f = f + c[i] * x[i];
f = f + c[i] * x[i]
i=i+1
PATHS IN A LOOP WITH CONDITIONS
i=0;
executed 1 time
f=0;
• Trace-driven:
• Instrument the program.
• Save information about the path.
• Requires modifying the program.
• Trace files are large.
• Widely used for cache analysis.
PHYSICAL MEASUREMENT
• In-circuit emulator allows tracing.
• Affects execution timing.
• Logic analyzer can measure behavior at pins.
• Address bus can be analyzed to look for events.
• Code can be modified to make events visible.
• Particularly important for real-world input streams.
CPU SIMULATION
N
i<N*M
i<X
Y
for (i=0; i<N*M; i++)
z[i] = a[i] + b[i]; z[i] = a[i] + b[i];
i = i+1;
INDUCTION VARIABLE ELIMINATION
a[0,0]
1024
1024 4099
b[0,0] ...
4099
• Array elements conflict because they are in the same line, even if
not mapped to same location.
• Solutions:
• move one array;
• pad array to change alignment.
ARRAY PADDING
before after
LOOP TILING EXAMPLE
• Breaks one loop into a nest of loops.
• Changes order of accesses within array.
• Changes cache behavior.
for (j=0; j < N; j += TILE) {
for (i=0; i<N; i++) {
for (i=0; i<N; i++) {
for (j=0; j<N; j++) {
for (jj=0; j<TILE; jj++) {
z[i][j] = x[i] * y[j];
z[i][j + jj] = x[i] * y[j + jj];
}
}
}
}
}
x[0] x[] x[0] x[1] x[]
y[N-
y[0] y[1] y[] y[0] y[1] y[]
1]
PERFORMANCE OPTIMIZATION HINTS
164
I/O Interfacing Techniques
MODULE 5
BCSE305L Page 2
3
MODULE-6 3
NEED FOR RTOS
MODULE-6 4
NEED FOR RTOS
MODULE-6 5
NEED FOR RTOS
MODULE-6 6
WHAT IS RTOS?
Ø “Real time in operating systems is the ability of the OS to provide a required
level of service in a bounded response time.” - POSIX Standard 1003.1
Ø “A real-time operating system (RTOS) is an operating system (OS) intended to
serve real-time application requests.” - Wikipedia
Ø “RTOS: Any OS where interrupts are guaranteed to be handled within a
certain specified time, there by making it suitable for control hardware in
embedded system & time critical applications.” - Dictionary.com
Ø A real-time operating system (RTOS) is an operating system that guarantees
a certain capability within a specified time constraint to serve real time
embedded applications.
MODULE-6 7
RTOS - FEATURES
Ø Multitasking and Pre-emptibility: An RTOS must be multi-tasked and pre-
emptible to support multiple tasks in real-time applications.
Ø Task Priority: In RTOS, pre-emption capability is achieved by assigning
individual task with the appropriate priority level.
Ø Inter Task Communication: For multiple tasks to communicate in a timely
manner and to ensure data integrity among each other, reliable and sufficient
inter-task communication and synchronization mechanisms are required.
Ø Priority Inheritance: To allow applications with stringent priority requirements
to be implemented, RTOS must have a sufficient number of priority levels
when using priority scheduling.
MODULE-6 8
RTOS - FEATURES
Ø Control of Memory Management: To ensure predictable response to an
interrupt, an RTOS should provide way for task to lock its code and data into
real memory.
Consistency Reliability
Scalability Predictability
Performance
MODULE-6 10
RTOS - TYPES
Ø HARD RTOS strictly adhere to the deadline associated with the tasks and the
degree of tolerance for missed deadlines is negligible.
Ø Examples:
Ø Missile Navigation Systems
Ø Vehicle Air Bags Control
Ø Nuclear Power Plant Control
MODULE-6 11
RTOS - TYPES
Ø FIRM RTOS tolerates a low occurrence of missing a deadline.
Ø Examples:
Ø Robot in car assembly section
Ø Food processing control system
Ø Weather monitoring system
MODULE-6 12
RTOS - TYPES
Ø SOFT RTOS allows for frequently missed deadlines, and as long as tasks are
timely executed their results continue to have value.
Ø Even the soft real time systems cannot miss the deadline for every task or
process according to the priority it should meet the deadline. (Best effort
system)
Ø Examples:
Ø Multimedia transmission & reception
Ø Digital cameras & mobile phones
Ø Computer games
MODULE-6 13
RTOS - APPLICATIONS
MODULE-6 14
RTOS - APPLICATIONS
Mars Curiosity Rover, Uses VxWorks (For Split-second decision taking) and µc/os-II (For Sample Analysis)
MODULE-6 15
RTOS - APPLICATIONS
F-35 Fighter aircraft uses a Integrity DO-178B, a POSIX based RTOS developed by Green Hills Software
MODULE-6 16
RTOS - APPLICATIONS
International medical technology group Elekta is basing its new generations of equipment on the LynxOS-SE (RTOS)
MODULE-6 17
RTOS - APPLICATIONS
ThreadX RTOS is used by HP (All Printers) & Honeywell (Advanced security systems) in consumer electronics devices
MODULE-6 18
RTOS IN EMBEDDED MARKET
There are more than 100 RTOS currently available in embedded market
MODULE-6 19
MOST COMMONLY USED RTOS
MODULE-6 20
RTOS IN IOT ERA
Ø The Internet of Things (IoT) is accelerating the use of third-party software
stacks, especially the TCP/IP and USB stacks and these third-party stacks and
tools are often compatible with various RTOS solutions.
Ø Moreover, popular RTOS solutions like Azure RTOS, Amazon FreeRTOS, and
Segger’s embOS are being qualified for MCUs from large suppliers.
Ø This out-of-box integration is crucial in managing software complexity and
lowering the barriers to entry for smaller embedded design outfits.
Ø However, the use of RTOS also increases the overall system complexity and
may introduce new types of bugs. So, developers must have adequate
knowledge of how to implement RTOS-based programs effectively.
MODULE-6 21
22
MODULE-6 22
RTOS – ARCHITECTURE
Applications
RTOS
BSP
Hardware
MODULE-6 23
RTOS – KERNEL
Ø Its services include managing memory and devices and also to provide an
interface for software applications to use the resources.
MODULE-6 24
Ø Monolithic kernel RTOS - KERNEL
Ø Entire OS (device drivers, file system, and IPC) is working in kernel space.
Ø Each component of Monolithic kernel could communicate directly with any other
component, and had unrestricted system access.
Ø Micro kernel
Ø Includes very small number of services within the kernel in an attempt to keep it small
and scalable.
Ø The services typically include low-level memory management, inter-process
communication (IPC), basic process synchronization to enable processes to cooperate.
Ø Hybrid kernel
Ø It has a structure similar to microkernels, but implemented as a monolithic kernel.
Ø Hybrid kernels have the ability to pick and choose what they want to run in user mode
and what they want to run in kernel mode.
MODULE-6 25
EOS - KERNEL
Monolithic Kernel
Microlithic Kernel
Hybrid Kernel
MODULE-6 26
MONOLITHIC VS MICRO VS HYBRID KERNEL
BASIS FOR
MONOLITHIC KERNEL MICROKERNEL HYBRID KERNEL
COMPARISON
Basic In monolithic kernel, both user In microkernel user services and Hybrid kernel is a kernel
services and kernel services are kernel, services are kept in separate combining aspects from both
kept in the same address space. address space. Monolithic and Microkernel
designs
Size larger than microkernel Microkernel are smaller in size. smaller than monolithic kernel
Execution Fast execution. Slow execution. Moderate speed
Extendible The monolithic kernel is hard to The microkernel is easily extendible. Dynamically loadable module
extend.
Security If a service crashes, the whole If a service crashes, it does effect on It depends on service called
system crashes in monolithic working of microkernel.
kernel.
Code To write a monolithic kernel, more To write a microkernel, less code is Moderate code size
code is required. required.
Example Linux, BSDs, Windows (95,98,Me), QNX, Symbian, L4Linux, Singularity, Windows NT and above, Mac OS
Solaris, OS-9, AIX, HP-UX, DOS. K42, Integrity, PikeOS,. X, BeOS, ReactOS, Syllable.
MODULE-6 27
KERNEL
Ø An RTOS generally avoids implementing the kernel as a monolithic program.
Ø All other basic services can be made part of user space and can be run in the
form of servers.
QNX Microkernel
Microkernel Model
MODULE-6 29
BSP
Ø In embedded systems, a board support package (BSP) is the layer of
software containing hardware-specific drivers and other routines
Ø BSPs are typically customizable, allowing the user to specify which drivers
and routines should be included in the build based on their selection of
hardware and software options.
MODULE-6 30
BSP
Ø For instance, a particular single-board computer might be paired with any of
several graphics cards; in that case the BSP might include a driver for each.
Ø While building the BSP image the user would specify which graphics driver to
include based in his choice of hardware.
Ø BSP is supposed to perform the following operations
Ø Initialize the processor
Ø Initialize the bus
Ø Initialize the interrupt controller
Ø Initialize the clock
Ø Initialize the RAM settings
Ø Load and run boot loader from flash
MODULE-6 31
32
MODULE-6 32
RTOS PERFORMANCE METRICS
Ø There are three areas of interest when looking at the performance and usage
characteristics of an RTOS: (1) Memory (2) Latency (3) Kernel services
Ø Memory:
Ø How much ROM and RAM does the kernel need and how is this affected by options
and configuration?
Ø Factors that affect the memory footprint includes static or dynamic configuration of
Kernel, number of instruction related to processor, global variable declaration etc.,
Ø With the help of optimization setting in embedded compilers code size can be
reduced, but that will most likely affect performance.
Ø Most RTOS kernels are scalable, but some RTOSes, scalability only applies to the
kernel. For others, scalability is extended to the rest of the middleware.
MODULE-6 33
RTOS PERFORMANCE METRICS
Ø Latency:
Ø Interrupt latency: It is the sum of the hardware dependent time, which depends on the
interrupt controller as well as the type of the interrupt, and the OS induced overhead.
Ø Ideally, it should include the best and worst case scenarios.
Ø To measure a time interval, like interrupt latency, with any accuracy, requires a suitable
instrument and the best tool to use is an oscilloscope.
Ø Scheduling latency: Being real time, the efficiency at which threads or tasks are
scheduled is of some importance and the scheduler is at the core of an RTOS.
Ø It is hard to get a clear picture of performance, as there is a wide variation in the
techniques employed to make measurements and in the interpretation of the results.
Ø There are really two separate measurements to consider: (1) The context switch time
(2) The time overhead that the RTOS introduces when scheduling a task
MODULE-6 34
RTOS PERFORMANCE METRICS
Ø Performance of kernel services:
Ø An RTOS is likely to have many API (application program interface) calls, probably
numbering into the hundreds.
Ø To assess timing, it is not useful to try to analyse the timing of every single call. It makes
more sense to focus on the frequently used services.
Ø For most RTOSes, there are four key categories of service call:
Ø Threading services
Ø Synchronization services
Ø Inter-process communication services
Ø Memory services
MODULE-6 35
RTOS CONSIDERATIONS
Ø What are the real-time capabilities? Is it soft or hard real-time interrupt handling
Ø What is the target processor, and does it support the RTOS you have in mind?
Ø Does your processor development board have a BSP with your RTOS?
MODULE-6 36
RTOS CONSIDERATIONS
Ø Which development environments and debugging tools work with the RTOS?
Ø How flexible is the choice of scheduling algorithms in the RTOS? E.g., FIFO, round Robin,
rate-monotonic, sporadic, etc.
MODULE-6 37
38
MODULE-6 38
REAL-TIME TASK
Ø Real time tasks normally recur a large number of times at different instants of
time depending on event occurrence time.
Ø Ex.: A temperature sensing task in chemical plant might recur indefinitely with
a certain period because the temperature is sampled periodically, whereas a
task handling a device interrupt might recur at random instants.
Ø A real-time task is defined to be one in which there is a hard deadline to
meet; failure to meet the deadline can lead to disaster, Ex: task include
process control such as a nuclear reactor, automatic flight control, etc.
Ø Based on the way real-time tasks recur over a period of time, it is possible to
classify them into (1) Periodic tasks (2) Sporadic tasks (3) Aperiodic tasks
MODULE-6 39
REAL-TIME TASK - PERIODIC
Ø A periodic task is one that repeats after a certain fixed time interval also
referred as clock-driven tasks.
Ø A vast majority of the tasks present in a typical real-time system are periodic,
for example, monitoring certain conditions, polling information from sensors at
regular intervals to carry out certain action at regular intervals.
Ø Periodic task Ti can be represented using four tuples: (Øi, pi, ei, di)
Where,
– Øi - phase of the task
– pi - Period of task
– ei - Worst case execution time of the task
– di - Relative deadline of the task
MODULE-6 40
REAL-TIME TASK - SPORADIC
Ø A sporadic task is one that recurs at random instants.
Ø E.g. In a robot a task that gets generated to handle an obstacle that appears
suddenly is a sporadic task. In a factory , the task that handles fire condition
is a sporadic task.
Ø Sporadic task Ti can be represented using three tuples: (ei, gi, di)
Where, ei - Worst case execution time of the task
gi - the minimum separation between two consecutive instances of the task.
di - Relative deadline of the task
Ø Two or more instances of an aperiodic task might occur at the same time instant
because minimum separation gi, between two consecutive instances can be 0.
Ø Aperiodic task can recur in quick succession, therefore it is very difficult to meet
the deadlines of all instances of an aperiodic task.
Ø Soft real time systems can tolerate a few deadline misses. so, aperiodic tasks
are generally soft real time tasks.
Ø E.g. Logging task in a distributed systems. The logging task can be started by
different tasks running on different nodes.
MODULE-6 42
REAL-TIME TASK - ASSUMPTIONS
MODULE-6 44
SCHEDULING CRITERIA
1. CPU utilization: CPU should be working most of the time (Ideally 100% all the time)
2. Throughput: total number of processes completed per unit time(10 tasks/second)
3. Turnaround time (TAT): amount of time taken to execute a particular process, TATi=CTi–ATi
(Where CTi → Completion Time, ATi → Arrival Time)
4. Waiting time(WT): time periods spent waiting in the ready queue by a process to acquire
get control on the CPU, WTi = TATi – BTi (Where BTi → CPU burst time)
5. Load average: average number of processes residing in the ready queue waiting for their
turn to get into the CPU
6. Response time: Amount of time it takes from when a request was submitted until the first
response is produced
SCHEDULING OBJECTIVE: Max →CPU utilization, Throughput
Min → Turnaround time, Waiting time, Load average, Response time
MODULE-6 45
TASK MODEL
Ø A task = (C, P)
Ø C: worst case execution time/computing time (C<=P!)
Ø P: period (D=P)
Ø C/P is CPU utilization of a task
U>1 (overload): some task will fail to meet its deadline
Ø A task set: (Ci,Pi) U<=1 : it will depend on the scheduling algorithms
U=1 : CPU is kept busy, all deadlines will be met
Ø All tasks are independent
Ø The periods of tasks start at 0 simultaneously
Ø U=Σ(Ci/Pi) is CPU utilization of a task set
Ø CPU utilization is a measure on how busy the processor could be during the
shortest repeating cycle: P1*P2*...*Pn
MODULE-6 46
SCHEDULABILITY TEST
Ø Schedulability test determine whether a given task set is feasible to schedule?
Ø Necessary test:
Ø If test is passed, tasks may be schedulable but not necessarily
Ø If test is not passed, tasks are definitely not schedulable
Ø Sufficient test:
Ø If test is passed, then task are definitely schedulable
Ø If test is not passed, tasks may be schedulable but not necessarily
MODULE-6 49
Ø Scheduling refers to the way processes are assigned to run on the available
CPUs, since there are typically many more processes running on the system.
Ø By switching the CPU among processes, the operating system can make the
embedded system is more productive.
Ø The need for a scheduling algorithm arises from requirement for most modern
systems to perform multitasking.
MODULE-6 50
REAL-TIME SCHEDULING ALGORITHM
Ø The purpose of a real-time scheduling algorithm is to ensure that critical
timing constraints, such as deadlines and response time, are met.
Ø Real-time systems use preemptive multitasking with priorities are assigned to
tasks, and the RTOS always executes the task with highest priority.
Ø Most algorithms are classified as
Ø Fixed-priority: Algorithm assigns priorities at design time, and those priorities
remain constant for the lifetime of the task. Ex.: Rate-Monotonic Scheduling (RMS)
Ø Dynamic-priority: Assigns priorities dynamically at runtime, based on execution
parameters of tasks, such as upcoming deadlines. Ex.: Earliest Deadline First (EDF)
Ø Mixed-priority: This algorithm has both static and dynamic components.
MODULE-6 51
TYPES OF SCHEDULING ALGORITHM
Ø Pre-emptive Algorithm:
Ø Shortest remaining Time First(SRTF)
Ø Round Robin(RR)
Ø Priority(Pre-emptive)
Ø Rate Monotonic Scheduling (RMS)
Ø Earliest Deadline First (EDF)
MODULE-6 52
RATE MONOTONIC SCHEDULING(RMS)
Ø Concept: Task with the shortest period executes with the highest priority.
Ø Working :
Ø Rate-monotonic is a fixed priority based scheduling.
Ø The scheduling scheme is pre-emptive; it ensures that a task is pre-empted
if another task with a shorter period is expected to run.
Ø Used in embedded systems where the nature of the scheduling is
deterministic.
Ø Consider three tasks with a period Task-1(10ms), Task-2(15ms), Task-3
(20ms), then as per RMS priority of the tasks can be assigned as:
priority (task1) > priority (task2) > priority (task3)
MODULE-6 53
RATE MONOTONIC SCHEDULING(RMS) – EXAMPLE-1
Ø Consider set of task running on a Automotive control system as follows
Ø Speed measurement Task (T1): C=20ms, P=100ms, D=100ms
Ø ABS control Task (T2): C=40ms, P=150ms, D=150ms
Ø Fuel injection Task (T3): C=100ms, P=350ms, D=350ms
Ø Other software with soft deadlines e.g. audio, air condition etc.,
C - computation time
Necessary Test Sufficient Test
$ P - Period of task $
𝐶! 𝐶! #
𝑈=# ≤1 D - Relative deadline 𝑈=# ≤ 𝑁(2$ − 1)
𝑃! 𝑃𝑇!
!"# !"#
N- number of task (3 tasks)
U=(20/100)+(40/150)+(100/350) B(3)=0.779
= 0.2 + 0.2666 + 0.2857 = 0.7517 ≤ 1 U=0.7517 ≤ B(3) ≤ 1
Necessary Test is passed hence given task set must be tested under sufficient test Since given task set passes Sufficient test we may
to conclude the Schedulability. But, if necessary test is failed we may conclude conclude it is definitely schedulable under RMS
given task set is definitely not schedulable by any scheduling algorithm.
MODULE-6 54
RATE MONOTONIC SCHEDULING(RMS) – EXAMPLE-1
This is only for the first periods. But this is enough to conclude that the task set is schedulable
MODULE-6
RATE MONOTONIC SCHEDULING(RMS) – EXAMPLE-2
Ø Consider set of four task (Ci, Pi) running on embedded system as follows,
T1= (1, 3) , T2= (1, 5) , T3= (1, 6) , T4= (2, 20)
Necessary Test Sufficient Test
$ $
𝐶! 𝐶! #
𝑈=# ≤1 𝑈=# ≤ 𝑁(2$ − 1)
𝑃! 𝑃𝑇!
!"# !"#
U=(1/3)+(1/5)+(1/6)+(2/10)=0.899 ≤ 1 B(4)=0.756
Necessary Test is passed hence given task set must be tested under sufficient test The given task set fails in the Sufficient test due to
to conclude the Schedulability. But, if necessary test is failed we may conclude U=0.899 > B(4) we can’t conclude precisely whether
given task set is definitely not schedulable by any scheduling algorithm. given task set is schedulable or not under RMS
MODULE-6 56
RATE MONOTONIC SCHEDULING(RMS) – EXAMPLE-2
MODULE-6
RATE MONOTONIC SCHEDULING(RMS) – EXAMPLE-3
Ø Consider set of three task (Ci, Pi) running on embedded system as follows,
T1= (10, 30) , T2= (15, 40) , T3= (5, 50)
Necessary Test Sufficient Test
$ $
𝐶! 𝐶! #
𝑈=# ≤1 𝑈=# ≤ 𝑁(2$ − 1)
𝑃! 𝑃𝑇!
!"# !"#
U=(10/30)+(15/40)+(5/50)=0.808 ≤ 1 B(3)=0.779
Necessary Test is passed hence given task set must be tested under sufficient test The given task set fails in the Sufficient test due to
to conclude the Schedulability. But, if necessary test is failed we may conclude U=0.808 > B(3) we can’t conclude precisely whether
given task set is definitely not schedulable by any scheduling algorithm. given task set is schedulable or not under RMS
MODULE-6 58
RATE MONOTONIC SCHEDULING(RMS) – EXAMPLE-3
MODULE-6
RATE MONOTONIC SCHEDULING(RMS) - PROs & CONs
Ø PROs:
Ø Simple to understand
Ø Easy to implement (static/fixed priority assignment)
Ø Stable: though some of the lower priority tasks fail to meet deadlines,
others may meet deadlines
Ø CONs:
Ø Lower CPU utilization
Ø Requires D=T
Ø Only deal with independent tasks
Ø Non-precise Schedulability analysis
MODULE-6 60
EARLIEST DEADLINE FIRST (EDF)
Ø Concept: Task closest to the end of its period assigned the highest priority
Ø Working :
Ø Earliest deadline first is a dynamic priority based scheduling.
Ø The scheduling scheme is pre-emptive; it ensures that a task is pre-empted
if another task having a nearest deadline is expected to run.
Ø EDF is optimal scheduling i.e it can schedule the task set if any other
scheduling algorithm can schedule
Ø If two task have the same deadlines, need to chose one if the two at
random but in most of the case it proceed with the same task which is
currently executing on the CPU to avoid context switching overhead.
MODULE-6 61
EARLIEST DEADLINE FIRST (EDF) – EXAMPLE-1
Ø Consider set of task running on a weather monitoring station as follows
Ø Temperature measurement Task (T1): C=1ms, P=4ms
Ø Humidity measurement Task (T2): C=2ms, P=5ms
Ø Co2 measurement Task (T3): C=2ms, P=7ms
Necessary Test
$
𝐶!
𝑈=# ≤1
𝑃!
!"#
U=(1/4)+(2/5)+(2/7)
= 0.25 + 0.4+ 0.2857
= 0.935 ≤ 1
Necessary Test is passed hence given task set must be schedulable by EDF
MODULE-6 62
EARLIEST DEADLINE FIRST (EDF) – EXAMPLE-1
T1
T2
T3
MODULE-6
EARLIEST DEADLINE FIRST (EDF) – EXAMPLE-1
T1
T2
T3
Although given task set failed to schedule under RMS, it can scheduled by EDF
MODULE-6
EARLIEST DEADLINE FIRST (EDF) – EXAMPLE-2
Ø Consider set of four task (Ci, Pi) running on embedded system as follows,
T1= (1, 3) , T2= (1, 5) , T3= (1, 6) , T4= (2, 20)
Necessary Test
$
𝐶!
𝑈=# ≤1
𝑃!
!"#
U=(1/3)+(1/5)+(1/6)+(2/10)=0.899 ≤ 1
Necessary Test is passed hence given task set must be schedulable by EDF
MODULE-6 65
EARLIEST DEADLINE FIRST (EDF) – EXAMPLE-2
T1
T2
T3
MODULE-6
EARLIEST DEADLINE FIRST (EDF) – EXAMPLE-3
Ø Consider set of three task (Ci, Pi) running on embedded system as follows,
T1= (5, 20) , T2= (10, 40) , T3= (40, 80)
Necessary Test
$
𝐶!
𝑈=# ≤1
𝑃!
!"#
U= (5/20)+(10/40)+(40/80)
= 0.25 + 0.25 + 0.5 = 1 ≤ 1
If cumulative CPU utilization is 1 then the CPU will be utilized 100% without idle condition
MODULE-6 67
EARLIEST DEADLINE FIRST (EDF) – EXAMPLE-3
T1
T2
T3
At time instance 80ms all 3 tasks are having same deadline. When multiple tasks having deadline at same
instance of a time then one of the task will be chosen randomly and get executed.
MODULE-6
EARLIEST DEADLINE FIRST (EDF) - PROs & CONs
Ø PROs:
Ø It works for all types of tasks: periodic or non periodic
Ø Simple Schedulability test
Ø Best CPU utilization
Ø Optimal scheduling algorithm
Ø CONs:
Ø Difficult to implement due to the dynamic priority-assignment.
Ø Any task could get the highest priority even if the task is not important
Ø Non stable because if any task fails to meet its deadline, the system is
not predictable
MODULE-6 69