Embedded Systems Firmware Demystified
Ed Sutter
Designations used by companies to distinguish their products are often claimed as trademarks.
In all instances where CMP Books is aware of a trademark claim, the product name appears
in initial capital letters, in all capital letters, or in accordance with the vendor’s
capitalization preference. Readers should contact the appropriate companies for more
complete information on trademarks and trademark registrations. All trademarks and
registered trademarks in this book are the property of their respective holders.
Copyright © 2002 by Lucent Technologies, except where noted otherwise. Published by CMP
Books, CMP Media LLC. All rights reserved. Printed in the United States of America. No part
of this publication may be reproduced or distributed in any form or by any means, or stored
in a database or retrieval system, without the prior written permission of the publisher.
The programs in this book are presented for instructional value. The programs have been
carefully tested, but are not guaranteed for any particular purpose. The publisher does
not offer any warranties and does not guarantee the accuracy, adequacy, or completeness
of any information herein and is not responsible for any errors or omissions. The publisher
assumes no liability for damages resulting from the use of the information in this book
or for any infringement of the intellectual property rights of third parties that would
result from the use of this information.
ISBN: 1-57820-099-7
Table of Contents
Preface
  Who Is the Reader?
  Conventions
  Source Code
  Acknowledgments
Chapter 1 A Hard Start
  System Requirements
  Central Processing Unit
  System Memory
  CPU Supervision
  Serial Port Drivers
  Ethernet Interface
  Flash Device Options
  The CPU/Memory Interface
  Summary
Chapter 2 Getting Started
  How Is It Done on a PC?
  Building Libraries
  Up Front
  Run Time
Chapter 3 Introducing MicroMonitor
  ... Platform
Chapter 4 Assembly Required
  Just After Reset
  I/O Initialization
  Establish Exception Handlers
  Summary
Chapter 5 ...
  CLI Features
  Command-Line Redirection
  Password Protection
  Summary
Chapter 6 ... Memory
  The Interface Functions
  Summary
Chapter 7 A Flash File System
  ... Platform
  High-Level Details
  Defragmentation
  TFS Implementation
  To Load or Not to Load
  File Decompression
  Execute In Place
  Summary
Chapter 8 Executing Scripts
  The Script Runner
  Conditional Branching
  A Few Examples
  Summary
Chapter 9 Network Connectivity
  Ethernet
  ARP
  IP
  ICMP
  UDP and TCP
  DHCP/BOOTP
  Applied to Embedded Systems
  Summary
Chapter 10 File/Data Transfer
  Xmodem
  TFTP
  Field Upgrade Capability
  Summary
Chapter 11 ...
  Different Memory Map
  Less Intense Startup
  Establishing an Application Stack
  Application-Originated Drivers
  Summary
Chapter 12 ...
  Breakpoints
  Adding Symbolic Capabilities
  Displaying Memory
  Stack Trace
  System Profiling
  Summary
Chapter 13 Porting MicroMonitor to the ColdFire™ MCF5272
  The Makefile
  Summary
Conclusion
...
  Summary
...
  The Scheduler
  Reentrancy
  Good Concurrency vs. Bad Concurrency
  Summary
Index
Preface
An embedded system is just a computer buried inside some other product. Surprisingly,
you can know a great deal about programming and computing and still get
lost in the arcane world of embedded systems. In the world of embedded systems
programming, countless details — both hardware- and software-related — make
the development process seem like a path that few have traveled and even fewer
have survived. How do software, hardware, and firmware differ? How in the world
does a 100,000-line program end up inside a device smaller than my fingernail?
What is flash memory and why do I need a cache? What is the difference between
a task and a process? Do I need to worry about reentrancy? As we progress through
Embedded Systems Firmware Demystified, you will come to see that these questions
are not as complex as they first appear.
Embedded systems programming spans a wide range of activities from building
programmable logic at the most concrete end to writing a UNIX™ process at the
most abstract end. Bracketed by these poles, the industry has exploded in the
last 20 years. In the late seventies, assemblers were considered luxuries. A
typical embedded system used less than 64Kb of system memory (bits, not bytes!).
There was no hardware hand-off to the firmware developer. The same person who
drew the schematics and soldered the prototype also wrote the firmware that
pulled the whole thing together. By the time Intel introduced the 8085 chip,
it was clear that those pesky microprocessors were here to stay. In the eighties,
the Motorola versus Intel CPU wars started, and C became a popular choice for
the elite few who dared to mess with a high-level language and an EPROM
programmer. Today, microprocessors are everywhere and range from the 4- and
8-bit workhorses still dominating the industry to 1GHz 64-bit processors that
almost need a freezer (microprocessor-controlled, no doubt) to keep them cool.
Over the years, the complexity of these systems has snowballed. The industry
has transitioned from programming a DEC PDP machine with binary codes on the front
panel to applying object-oriented design to a microcontroller in a toaster. If you
let the trend get to you, the changes can seem frazzling. There are microprocessors
and microcontrollers; RAM, DRAM, and SDRAM; pipelining and superscalar; EPROM and
flash memory; RISC and CISC; and RAS and CAS and cache — and that’s just the
beginning.
Now, everything from toothbrushes (no kidding) to fighter jets is likely to have
a version controlled by a microprocessor of some kind. With this trend come tools
and technologies. One can easily be overwhelmed by the available choices in hardware
(integrated circuits that the firmware must make work) and software (tools that are
used to build the firmware application).
The goal of this book is to prepare you for a real embedded systems project by
walking you through an entire embedded systems design. Not coincidentally, the
project source code included is a piece of firmware — an embedded boot platform
— that can simplify all your future projects. I assume a small hardware design
with CPU, memory, and some peripherals. I present a basic schematic and walk you
through the method in which instructions are fetched from memory. I discuss devices,
as well as concepts. I examine flash memory versus EPROM, SRAM versus DRAM,
microcontroller versus microprocessor, and data bus versus address bus. I also
explain how you convert your C and assembly language source code to a binary image
that ends up in the memory device from which the CPU boots (the boot flash memory).
Several chapters cover the basic steps of starting up an embedded system and
getting a working application (including the basic boot in assembler), exception
handling, flash memory drivers, a flash file system, and serial and Ethernet
connections. The result is an understanding of how an embedded systems project gets
started and how to build a platform on which an embedded systems application can
reside.
Sound exciting? Great! Sound scary? It’s not! The intent of this book is not to
discuss the latest superscalar architecture or the antenna effects of copper routes
on a printed circuit board, nor does it present a high-level abstract design process.
(Advanced architectures and transmission-line effects are certainly important, but
they are not the topic of this book.) This book is for those who want to get their
hands dirty without being overwhelmed by industry jargon or design-specific technical
details. By the end of this book, you will know how to read a basic schematic,
know what goes into the boot flash device, and understand the major components of
a complete embedded systems development platform.
Who Is the Reader?
At the minimum, I assume the reader has some experience with C and basic assembly
language concepts. I do not assume any electronics or hardware background. Thus,
readers with a wide range of programming backgrounds will find this book useful.
Computer science or electrical engineering students without a significant
background in firmware development, but at least an interest, can obviously
benefit from this book.
Low-level firmware developers will find the working example helpful, as it
includes documentation and code explanations for an extensible firmware
development platform. I explain the details associated with booting new
hardware and the way in which the CPU interacts with peripherals. Topics range
from the lowest-level boot to a Trivial File Transfer Protocol (TFTP)
implementation over Ethernet. You can port the code in this book to your own
target system or integrate snippets of this code with your existing firmware
platform.
Hardware developers will find the completed platform useful for helping analyze
and debug hardware beyond the CPU complex. For those inquisitive enough to step
away from hardware to learn more about the firmware process, this book provides
a way to get started without getting too far from the hardware. (Hardware
designers regularly make the transition to programming in the firmware/software
world.)
Project leaders will also find this book useful, as the firmware package presented
is a mature platform. The platform is applicable to a wide range of real-time
operating systems (RTOS) and target architectures, and it is extremely easy
to port to new systems. The fact that the platform is fairly target- and
RTOS-independent makes the transition to a different target or RTOS much less
painful.
Conventions
Throughout the book, I use different typefaces to differentiate between a) the
book’s regular text and b) special software tools, code, and so forth. General
text is in roman font. Terms used for the first time are in italic. A monospaced
font indicates code, software tools, hexadecimal numbers, filenames, data types,
variables, and other identifiers, as well as other special items.
Source Code
This book covers a lot of source code. When the text describes various portions
of the code in detail, snippets or entire functions are included. All code in
this text has been transferred from the original working C source code and is
included on the CD. Some implementation details that were not applicable to
the discussion were removed from the text listings to make the presentation
clearer. I have made every effort to maintain accuracy between the original
code and the code presented in the text; however, the code on the CD is the
complete working version. If you have questions, please refer to the code on
the CD.
Acknowledgments
I want to thank my good friends and managers at Lucent — Roger Levy and Paul
Wilford — for their support and encouragement. Also, much thanks to my good
friends Patricia Dunkin and Agesino Primatic, Jr. for reviewing the text and
providing a lot of good comments and suggestions.
My appreciation goes to the team at CMP Books. Thanks to Joe Casad, Catherine
Janzen, and Robert Ward for editorial and technical contributions (special
thanks to Robert for dealing with my many questions and paranoias throughout
this whole process). Thanks to Michelle O’Neal, Justin Fulmer, James Hoyt, and
Madeleine Reardon Dimond for typesetting, illustrating, and indexing the book.
Finally, I want to thank my wife, Lucy, for her endless encouragement through
this project. I also want to thank my son, Tommy, for helping me keep my
priorities in line. Most of all, I want to thank my Lord and Savior, Jesus Christ,
for the sacrifice that He made to provide me with eternal salvation. It just
doesn’t get any better than that!
Chapter 1
A Hard Start
Although the primary focus of this book is firmware development, a good firmware
developer must have a reasonable understanding of the hardware on which the
firmware resides. You could compare a firmware programmer to someone who changes
oil at a gas station: if all that person knows is how to perform an oil change,
he won’t have the skills to notice warning signs that a fully trained mechanic
would pick up right away. A narrowly trained technician may change the oil
competently but not notice other problems like a leaky head gasket. Sooner or
later something else will go wrong, remain unnoticed, and cause additional
trouble due to the technician’s limited knowledge.
This chapter explains what a firmware engineer needs to know about the hardware
in a typical system. The goal is not to turn you into a hardware designer
but simply to explain enough so that you can do more than “change the oil.”
To achieve this goal, this chapter discusses some of the common CPU support
peripherals and examines how the hardware processor does its job. I will start
by focusing on the type of system that this book addresses. You will learn
about some of the most common features of today’s microprocessor-based systems,
some of which reside within the CPU silicon itself and others of which are usually
external to the CPU. I will also discuss the sequence of steps that the CPU
takes to retrieve an instruction and the advantages and snags that are
introduced by the use of cache.
At the end of this chapter (although you will still not be a qualified hardware
designer) you will hopefully have a better understanding of the hardware
platform onto which you will be installing your firmware.
System Requirements
The hardware required for an embedded system varies greatly depending on its
intended use. Some tiny systems have barely 1K of data space and 16K of instruction
space, while high-performance systems might run a 1GHz 64-bit processing engine
with 32MB CompactFlash and 128MB DRAM. In this book, I focus mostly on systems
with capabilities that fall somewhere in the middle. The firmware I’ll discuss
does require some memory space (ranging from 32K to 256K of flash memory and
8K to 128K of RAM) depending on what capabilities are built in, so it is not
appropriate for small microcontroller (8051, 68HC05/11/12, etc.) projects. On
the other hand, this code doesn’t need (or want!) a large machine to work
successfully.
Figure 1.1 represents a complete system. Using this system model, you
can program your own devices, talk to the PC, and even serve HTML pages to a
network browser. If you need to skimp, you can eliminate the reset/watchdog
and battery-backed-RAM/time-of-day (BBRAM/TOD) clock controller and still have
the capabilities I listed! I include these capabilities in our model because
they are extremely useful. They’re also quickly becoming standard equipment
on most microprocessor designs.
These are the components most common in embedded systems. All embedded systems use some
kind of nonvolatile storage (flash memory, EPROM, ROM) and some form of RAM. Most have some
channel they can use to communicate with a development host (a serial port, Ethernet port,
or JTAG port).
Keep in mind that these definitions are not the rule in the industry. You
should also note that there is some overlap between the different terms. Both
microcontrollers and desktop microprocessors are used in embedded systems. In
fact, back in the 1980s, microcontrollers were even used in some of the first
desktop machines.
Microcontrollers, as defined above, dominate the embedded systems market by
several orders of magnitude. However, silicon (both processor and memory) prices
are dropping, and the larger 16- and 32-bit processors are gaining ground in embedded
applications. This book generally focuses on systems based on embedded
microprocessors.
Embedded microprocessors come in a variety of shapes and sizes. They are even
becoming available as logic cores designed for integration into very large programmable
devices called field programmable gate arrays (FPGAs). Companies that were
producing programmable logic at one time are now producing programmable logic with
a built-in processor.
Following is a brief description of some of the typical peripherals that are
included with today’s embedded microprocessors. I discuss each item using the analogy
of a similar process in an office environment.
Glue logic refers to the extra circuitry (logic) that would be necessary to connect
(or glue) two devices together.
Interrupt Controller
An interrupt controller helps prioritize incoming messages, which is a task that
any office worker can appreciate. Consider the following small office scenario.
Person_A is in an office behind a desk, talking on the phone. Person_B walks down
the hall directly into Person_A’s office and asks a question. Person_A can do one
of several things:
• Totally ignore Person_B and continue the phone conversation.
• Quickly acknowledge Person_B’s request but respond with something
like “OK, I’ll get to that in a little while,” placing the request into a pile with
a bunch of others.
• Tell the person on the phone that the conversation will have to
continue later and then respond to Person_B. This outcome is likely if Person_A
considers Person_B to be more important than whoever is on the phone.
• End the conversation. This outcome occurs automatically, regardless
of who is on the phone, if Person_B is the boss.
You can say that Person_A processes the interrupt from Person_B. Several factors
determine whether or not Person_B is acknowledged, including the importance of
the person on the phone, the fact that Person_A’s door might have been closed, and
so on. Somehow Person_A must prioritize the interrupt based on what is currently
going on.
Now, consider Person_A having the phone conversation to be the CPU currently
executing instructions and Person_B to be a peripheral device that needs the CPU’s
attention. An integrated interrupt controller provides the CPU with this ability
to enable, disable, and prioritize incoming interrupts from both internal and
external peripherals. Usually all incoming interrupts are maskable (can be
disabled) except the NMI (non-maskable interrupt) and reset lines (the bosses).
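To make the analogy concrete, here is a minimal sketch in C of the bookkeeping an interrupt controller performs. It is a host-side model rather than a driver for any real part: the structure, the function names, the eight-source limit, and the choice of source 0 as the NMI are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: eight interrupt sources, source 0 is the NMI. */
#define IRQ_COUNT 8
#define IRQ_NMI   0

typedef struct {
    uint8_t pending;              /* bit n set: source n is requesting service */
    uint8_t enabled;              /* bit n set: source n may interrupt the CPU */
    uint8_t priority[IRQ_COUNT];  /* larger value means more urgent */
} irq_controller;

/* A peripheral raises its request line. */
void irq_raise(irq_controller *ic, int source)
{
    assert(source >= 0 && source < IRQ_COUNT);
    ic->pending |= (uint8_t)(1u << source);
}

/* Firmware unmasks (enables) a source. */
void irq_enable(irq_controller *ic, int source)
{
    assert(source >= 0 && source < IRQ_COUNT);
    ic->enabled |= (uint8_t)(1u << source);
}

/* Firmware masks (disables) a source; a request to mask the NMI is ignored. */
void irq_disable(irq_controller *ic, int source)
{
    assert(source >= 0 && source < IRQ_COUNT);
    if (source != IRQ_NMI)
        ic->enabled &= (uint8_t)~(1u << source);
}

/* Return the source the CPU should service next, or -1 if there is none.
 * A pending NMI always wins; otherwise the highest-priority source that is
 * both pending and enabled is chosen. The chosen request is cleared. */
int irq_next(irq_controller *ic)
{
    int best = -1;
    for (int s = 0; s < IRQ_COUNT; s++) {
        uint8_t bit = (uint8_t)(1u << s);
        if (!(ic->pending & bit))
            continue;
        if (s == IRQ_NMI) {
            best = s;
            break;
        }
        if (!(ic->enabled & bit))
            continue;
        if (best < 0 || ic->priority[s] > ic->priority[best])
            best = s;
    }
    if (best >= 0)
        ic->pending &= (uint8_t)~(1u << best);
    return best;
}
```

In this model, masking a source plays the role of Person_A closing the door, while the NMI is the boss who walks in regardless.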
Timer-Counter Unit
Consider a scenario with several typists working in an office. For a little fun,
they decide to see who can type the most words in one minute. They gather around
a computer terminal, and each one gets a shot at the one-minute trial. However,
they find that they need some way to keep track of the 60-second interval of time
from the beginning to the end of the test. Alternatively, they might just count
the number of seconds taken to type a specific block of text. In this case, they
are not measuring a fixed amount of time but the elapsed time it takes each individual
to complete the challenge.
The timer-counter unit within an embedded system can help with this process.
The unit gives the CPU a way to measure elapsed time. It also gives the
firmware the ability to generate periodic events, including events based on
incoming pulse counts.
It is important to note that the microprocessor’s timer-counter unit usually does not
deal with time of day on its own, so don’t assume that “timer” means “wall clock.” In
most cases, “timer” means “stop watch.” A stop watch can be converted to a wall clock
if at some starting point the stop watch is synchronized with a wall clock. This process
is what is done with the microprocessor’s timer if time-of-day is needed.
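The stopwatch-to-wall-clock conversion described above can be sketched as follows. This is only an illustration under stated assumptions: a free-running 32-bit tick counter, a hypothetical rate of 1000 ticks per second, and time of day expressed as seconds since midnight. None of these names or values come from a particular processor.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical tick rate: the timer is assumed to tick 1000 times a second. */
#define TICKS_PER_SECOND 1000u
#define SECONDS_PER_DAY  86400u

typedef struct {
    uint32_t sync_ticks;    /* tick count captured at the moment of sync */
    uint32_t sync_seconds;  /* time of day at sync, in seconds since midnight */
} wall_clock;

/* Elapsed ticks between two readings of the free-running counter (the
 * "stopwatch"). Unsigned subtraction stays correct across one counter wrap. */
uint32_t ticks_elapsed(uint32_t start, uint32_t now)
{
    return now - start;
}

/* Synchronize the stopwatch with a known time of day. */
void wall_clock_sync(wall_clock *wc, uint32_t now_ticks,
                     uint32_t seconds_since_midnight)
{
    assert(seconds_since_midnight < SECONDS_PER_DAY);
    wc->sync_ticks = now_ticks;
    wc->sync_seconds = seconds_since_midnight;
}

/* Derive the current time of day from the tick counter. At 1000 ticks per
 * second a 32-bit counter wraps after roughly 49.7 days, so this sketch
 * assumes the clock is re-synchronized more often than that. */
uint32_t wall_clock_seconds(const wall_clock *wc, uint32_t now_ticks)
{
    uint32_t elapsed = ticks_elapsed(wc->sync_ticks, now_ticks) / TICKS_PER_SECOND;
    return (wc->sync_seconds + elapsed) % SECONDS_PER_DAY;
}
```

The timer itself only ever counts ticks; the time of day exists solely because the firmware recorded a reference point and does the arithmetic.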
DMA Controller
Consider an office that contains a file cabinet. In that file cabinet is paperwork
that several office workers need to access. Some only need to access it once in
a while, while others require the paperwork much more often. The manager of the
office sets up a policy so that some of the individuals have a personal key to get
into the file cabinet, while others must get a shared key from the manager. In other
words, some office workers have direct access to the file cabinet, and others have
indirect access.
In many embedded system designs, the CPU is the only device that is connected
to the memory. This means that all transactions that deal with memory must use
the CPU to get the data portion of that transaction stored in memory, just as some
office workers had to obtain the manager’s key. Direct memory access (DMA) is a
feature that allows peripherals to access memory directly without CPU intervention.
These peripherals correspond to the office workers with their own keys.
For example, without DMA, an incoming character on a serial port would generate
an interrupt to the CPU, and the firmware would branch to the interrupt handler,
retrieve the character from the peripheral device, and then place the
character in a memory location. With DMA, the serial port places the incoming
character in memory directly. When a certain programmed threshold is reached, the
DMA controller (not the serial port) interrupts the CPU and forces it to act on
the data in memory. DMA is a much more efficient process. Many integrated
microprocessors have multiple DMA channels that they can use to perform
I/O-to-memory, memory-to-I/O, or memory-to-memory transfers.
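The difference in interrupt load can be sketched with a small host-side model. The names, the 64-byte buffer, and the idea of treating a threshold of one as the no-DMA case are assumptions chosen for illustration; a real DMA controller is programmed through device-specific registers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RX_BUF_SIZE 64

/* Hypothetical model of one DMA receive channel serving a serial port. */
typedef struct {
    uint8_t  buf[RX_BUF_SIZE];  /* memory region the channel writes into */
    size_t   count;             /* bytes stored since the last interrupt */
    size_t   threshold;         /* programmed byte count that raises an interrupt */
    unsigned interrupts;        /* number of times the CPU has been interrupted */
} dma_rx_channel;

/* Program the channel. A threshold of 1 behaves like the no-DMA case,
 * where every incoming character interrupts the CPU. */
void dma_rx_init(dma_rx_channel *ch, size_t threshold)
{
    assert(threshold >= 1 && threshold <= RX_BUF_SIZE);
    ch->count = 0;
    ch->threshold = threshold;
    ch->interrupts = 0;
}

/* Simulated hardware event: one character arrives and is placed directly
 * in memory. Returns 1 if this byte caused a CPU interrupt, otherwise 0. */
int dma_rx_byte(dma_rx_channel *ch, uint8_t byte)
{
    ch->buf[ch->count++] = byte;
    if (ch->count == ch->threshold) {
        ch->interrupts++;  /* the interrupt handler would consume buf here */
        ch->count = 0;     /* the next block starts at the top of the buffer */
        return 1;
    }
    return 0;
}
```

In this model, receiving 64 characters with a threshold of 16 interrupts the CPU four times, whereas a threshold of 1 interrupts it 64 times.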
Serial Port
Consider the front door of an office building. The front door is where people can
come in and out and easily interact with the facilities provided by the business.
This fact doesn’t mean that there aren’t other ways to contact the business, but
the front door is a very convenient and standard way of doing it.
This task of providing access is the serial port’s job. The serial port provides
basic communication to a console or some other device that understands the same
protocol. The serial port, or universal asynchronous receiver transmitter (UART),
provides the CPU with an RS-232 bit stream, which is the same technology used for
the PC’s COM port. Different processors have different variations on this standard,
but, in almost all cases, the minimum is a basic asynchronous serial bit stream
(brought to RS-232 levels, usually by a separate line-driver chip) consisting of
one start bit, some number of data bits (usually between five and nine), and one
or two stop bits.
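As a rough illustration of the frame just described, the sketch below builds the sequence of logical bits for one character and estimates throughput. It assumes no parity bit and least-significant-bit-first ordering (the usual UART convention); the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Largest frame handled here: 1 start + 9 data + 2 stop bits. */
#define UART_MAX_FRAME_BITS 12

/* Bit times occupied by one character (no parity bit assumed). */
unsigned uart_frame_bits(unsigned data_bits, unsigned stop_bits)
{
    return 1u + data_bits + stop_bits;
}

/* Write the logical bit sequence for one character into out[] and return
 * the number of bits written. The idle line is logical 1. */
unsigned uart_frame(uint16_t data, unsigned data_bits, unsigned stop_bits,
                    uint8_t out[UART_MAX_FRAME_BITS])
{
    unsigned n = 0;

    assert(data_bits >= 5 && data_bits <= 9);
    assert(stop_bits >= 1 && stop_bits <= 2);

    out[n++] = 0;                                /* start bit */
    for (unsigned i = 0; i < data_bits; i++)
        out[n++] = (uint8_t)((data >> i) & 1u);  /* data, least significant bit first */
    for (unsigned i = 0; i < stop_bits; i++)
        out[n++] = 1;                            /* stop bit(s): back to idle */
    return n;
}

/* Upper bound on characters per second at a given bit rate. */
unsigned long uart_chars_per_second(unsigned long baud, unsigned data_bits,
                                    unsigned stop_bits)
{
    return baud / uart_frame_bits(data_bits, stop_bits);
}
```

With eight data bits and one stop bit, each character occupies ten bit times, so a 9600-baud link carries at most 960 characters per second.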
Programmable pins are sometimes referred to as dual function. Note that this dual
functionality should not be assumed. How each pin is configured and the ability to
configure it to run in different modes is dependent on the processor implementation.
Often a pin name is chosen to reflect the pin’s dual personality. For example, if RX2
can be configured as a serial port 2 receiver or as a PIO pin, then it will probably
be labeled as RX2/PION (or something similar), where N is some number between one and
M, and M is the number of PIO pins on the processor. You should be aware that some
microprocessors may be advertised as having a set of features but actually provide these
features on dual-function pins. Hence, the full set of advertised features (two serial
ports and 32 PIO lines) may not be simultaneously available (because the pins used for
the second serial port are dual-functioned as PIO lines). Make sure you read the
datasheet carefully!
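The following sketch models the dual-function trade-off with a hypothetical 32-bit pin-select register in which a set bit hands the pin to its peripheral function and a clear bit leaves it as PIO. The register layout and the assignment of RX2 and TX2 to pins 4 and 5 are invented for illustration; the real mapping comes from the processor's datasheet.

```c
#include <assert.h>
#include <stdint.h>

#define PIN_COUNT 32u
#define PIN_RX2   4u   /* hypothetical: serial port 2 receive shares PIO 4 */
#define PIN_TX2   5u   /* hypothetical: serial port 2 transmit shares PIO 5 */

typedef struct {
    uint32_t select;   /* bit n set: pin n is used by its peripheral function */
} pin_mux;

/* Hand a pin over to its peripheral function (for example, RX2). */
void pin_use_peripheral(pin_mux *m, unsigned pin)
{
    assert(pin < PIN_COUNT);
    m->select |= (uint32_t)1 << pin;
}

/* Return a pin to general-purpose PIO use. */
void pin_use_pio(pin_mux *m, unsigned pin)
{
    assert(pin < PIN_COUNT);
    m->select &= ~((uint32_t)1 << pin);
}

/* Nonzero if the pin is currently configured as PIO. */
int pin_is_pio(const pin_mux *m, unsigned pin)
{
    assert(pin < PIN_COUNT);
    return ((m->select >> pin) & 1u) == 0;
}

/* Count the pins still available as PIO lines. */
unsigned pio_lines_available(const pin_mux *m)
{
    unsigned count = 0;
    for (unsigned pin = 0; pin < PIN_COUNT; pin++)
        if (pin_is_pio(m, pin))
            count++;
    return count;
}
```

Under these assumptions, enabling the second serial port leaves 30 PIO lines rather than the advertised 32.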
System Memory
Aside from the CPU itself, memory is the most fundamental building block in
any microprocessor-based system. The CPU fetches instructions from the memory,
and these instructions tell the CPU what to do. If the memory is programmed
incorrectly or connected to the CPU incorrectly, then even the most
sophisticated processor will be confused!
There are several different types of memory available. This range of
selection exists for the same reason that there are several varieties of almost
anything in the electronics industry: price/performance/density trade-offs.
Different designs have different requirements that make different memory
architectures attractive. For example, some systems need a lot of memory, but
not really fast memory; others require small amounts of really fast memory;
some need memory that does not lose its data when the power is removed; and
so on. Following is a discussion of the most common types of memory used today,
along with some of their characteristics.
RAM
This name is also an acronym: random access memory (RAM). Unlike the EPROM acronym,
the term RAM doesn’t give a very good indication of its characteristics, at least
not from our point of view. The name reflects the fact that any byte can be accessed
at any time, which was a step ahead of its predecessor, sequential access memory,
when it first appeared on the electronics scene. For the sake of our discussion,
we assume that all memory is random access, but not all memory is writable by the
CPU. Therefore for us, the differentiating characteristic (compared to EPROM) is
the fact that the processor can write to RAM. RAM is read/writable but is also
volatile, which means that when power is removed, the data is not retained.
There are two fundamental types of RAM: static (SRAM) and dynamic (DRAM). SRAM
is the easier of the two to interface with because it is “static,” meaning that
it does not require any baby-sitting from the processor to do its job. Simply wire
it up to the processor and use it. DRAM, on the other hand, requires external
hardware to refresh it periodically so that the internal capacitors hold their
charge. DRAM technology is much cheaper, but it is also slower and requires
additional hardware to keep it running (the DRAM controller mentioned earlier).
Because of these issues, DRAM is typically used in systems that require large
amounts (> 1MB) of memory so that the added expense of the controller is justified.
SRAM is simple but has a higher cost per byte of storage. It is typically used
in systems that require small amounts of memory or in systems that need a small
amount of fast writable memory (like a cache). For example, on your typical PC,
some DRAM is available for general use, and a much smaller amount of SRAM is
available for the CPU cache.
Flash Memory
Like EPROM, flash memory is also nonvolatile memory. The big advantage of flash memory
over EPROM is that it is in-system programmable, which means that no separate device
is needed to modify its contents. Early devices required a higher voltage (usually
12V), but today’s parts require the same voltage as the rest of the board. Since
the flash memory is in-system writable, no UV eraser is needed.
The architecture of flash memory comes in a few varieties, and although modern
flash memory is in-system programmable, it is still not as convenient as using RAM.
Typically, an erase procedure deals with a single “sector” of the memory. Sector
sizes are usually relatively large, and an erasure sets all the bits within that
sector to one (all bytes = 0xff). Writing to the individual bytes changes only some
of the bits within each byte to a zero state. Each operation (erase, write, and
so on) except read is performed with a special programming algorithm. This
algorithm is unique enough that it does not interfere with the typical interaction
between the CPU and the memory.
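The erase and write behavior described above can be sketched in C by modeling a single sector in ordinary RAM. This is only an illustration of the bit-level rules, not a driver for any real part: the sector size and the function names are hypothetical, and a real device would also require its own programming algorithm.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

#define SECTOR_SIZE 4096u   /* hypothetical sector size, for illustration only */

/* A single flash sector modeled in ordinary RAM. */
typedef struct {
    uint8_t bytes[SECTOR_SIZE];
} flash_sector_t;

/* Erase: every bit in the sector becomes 1, so every byte reads 0xFF. */
void flash_sector_erase(flash_sector_t *s)
{
    memset(s->bytes, 0xFF, sizeof s->bytes);
}

/* Program: bits can only move from 1 to 0, so the stored result is the
 * bitwise AND of the old contents and the requested value. */
void flash_byte_program(flash_sector_t *s, size_t offset, uint8_t value)
{
    if (offset < SECTOR_SIZE) {
        s->bytes[offset] &= value;
    }
}

/* Returns 1 if 'value' can be stored at 'offset' without erasing first,
 * that is, if the write would not need to turn any 0 bit back into a 1. */
int flash_can_program(const flash_sector_t *s, size_t offset, uint8_t value)
{
    if (offset >= SECTOR_SIZE) {
        return 0;
    }
    return (uint8_t)(s->bytes[offset] & value) == value;
}
```

In this model, writing 0xA5 to an erased byte succeeds, but a later attempt to write 0x5A to the same byte leaves 0x00 behind, which is why the whole sector must be erased before a byte can be rewritten with arbitrary data.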
Flash memory is quickly becoming the standard nonvolatile memory choice for
new designs. Aside from the algorithm needed to write/erase the memory, the
only other drawback of flash memory is that it has an upper limit to the number
of times a sector can be erased. Usually the limit is high (100,000 or 1,000,000
cycles), but, nevertheless, it must be considered in the design.
Still Others
There are several other types of memory, most of which are some derivative of
one or more of the above types. These other standards are not as popular, but
they typically satisfy some niche in the market. For example, PSRAM
(pseudo-static RAM) is a DRAM with some kind of refresh controller built into
it. It satisfies a system that needs more SRAM than a single SRAM device supports
but doesn’t need the densities offered by DRAM. Nonvolatile SRAM (NVRAM) is
static RAM with a battery backup. Some devices actually have the battery built
into the plastic; others are nonvolatile simply because the hardware design
has battery backup protecting the device; still others provide some type of
automatic backup of RAM to on-chip flash when power is removed. Serial EEPROM
is a type of EEPROM that communicates with the CPU usually using two to four
I/O pins. Access is slow, but physical size is extremely small because there
is no address or data bus.
CPU Supervision
This section discusses functions that help the CPU maintain itself through some
otherwise catastrophic situations. The functions covered here are (in order
of importance) a power monitor for reset pulse generation, a watchdog timer,
a power monitor for SRAM nonvolatility, and a time-of-day clock. The latter
two are not actually considered part of CPU supervision, but it is very common
to see various combinations of these functions in the class of integrated
circuits which are referred to as “CPU supervisors.”
Reset
Before the CPU can do anything at all, it needs to be powered up, which simply
requires a connection to the power and ground pins on the device. Once powered
up, it’s essential that the internals of the device are allowed to synchronize
and start up in a sane state. A RESET signal forces key CPU components to a
known initial state.
Typical requirements on a reset input line of a processor are that it be held
in a constant active (usually low) state for some duration (say 100ms). In some
designs, a simple resistor/capacitor (RC) combination is used to keep the RESET
line in a low state for the 100ms time period when power is applied to the system
(this is referred to as power-on reset).
Without getting into a lot of detail, assume that the resistor/capacitor
(RC) circuit of Figure 1.2 provides a time-delay power-up. While the other pins
have the supply voltage applied immediately (Signal A), the RC connection to
the RESET pin holds it low for some delay longer (Signal B), thereby providing
the minimum 100ms of low state to RESET after power-up. Unfortunately, the RC
pair is not very good at detecting when it should apply the low level to the
RESET pin. That means that it doesn’t work well for systems that are in remote
locations and must be automatically reset after power outages and dips. In these
situations, the power could dip and cause the power supply level to fall, which
would in turn cause the CPU to go insane, but would not cause the RC circuit
to pull the RESET line low enough to bring the CPU out of its insane state.
In some respects, the RC combination is an analog solution for a digital problem.
The real solution for a safe power-up reset is to monitor the supply line and
pulse the RESET line for the designated amount of time whenever it transitions
from “out of” to “within” CPU tolerance. Fortunately, there are components out
there that do just that! There are several different components available that
monitor the supply voltage and automatically generate a clean reset pulse into
the CPU when the supply drops below a certain level.
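The behavior of such a supply monitor can be sketched as a small state machine in C. This is a software model for illustration only; a real supervisor is a hardware device. The threshold value and the function names are assumptions, and the model checks only a lower voltage limit.

```c
#include <stdint.h>

#define VCC_MIN_MV      4750u  /* hypothetical lower tolerance limit, millivolts */
#define RESET_HOLD_MS   100u   /* reset duration used as the example in the text */

typedef struct {
    uint32_t hold_remaining_ms;  /* time left before RESET may be released */
} supervisor_t;

/* At power-up the supervisor starts with RESET asserted. */
void supervisor_init(supervisor_t *sv)
{
    sv->hold_remaining_ms = RESET_HOLD_MS;
}

/* Call once per millisecond with the measured supply voltage.
 * Returns 1 while RESET must be asserted, 0 when the CPU may run. */
int supervisor_tick(supervisor_t *sv, uint32_t vcc_mv)
{
    if (vcc_mv < VCC_MIN_MV) {
        /* Supply out of tolerance: assert RESET and restart the hold timer. */
        sv->hold_remaining_ms = RESET_HOLD_MS;
        return 1;
    }
    if (sv->hold_remaining_ms > 0) {
        /* Supply is good again, but the hold time has not yet elapsed. */
        sv->hold_remaining_ms--;
        return 1;
    }
    return 0;
}
```

Unlike the RC circuit, this model reasserts RESET on any dip below the threshold, however brief, and then holds it for the full 100ms after the supply recovers.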
When power is applied to the system (Signal A), the charge time of the RC circuit
attached to the reset input (Signal B) delays the reset activation. While this
reset mechanism works when the power is cycled cleanly, it can cause problems
when power is momentarily interrupted.
Watchdog Timer
The watchdog timer (WDT) acts as a safety net for the system. If the software
stops responding or attending to the task at hand, the watchdog timer detects
that something is amiss and resets the software automatically. The system might
stop responding as a result of any number of difficult-to-detect hardware or
firmware defects. For example, if an unusual condition causes a buffer overrun
that corrupts the stack frame, some function’s return address could be
overwritten. When that function completes, it then returns to the wrong spot
leaving the system utterly confused. Runaway pointers (firmware) or a glitch
on the data bus (hardware) can cause similar crashes. Different external factors
can cause “glitches.” For example, even a small electrostatic discharge near
the device might cause enough interference to momentarily change the state of
one bit on the address or data bus. Unfortunately, these kinds of defects can
be very intermittent, making them easy to miss during the project’s system test
stage.
The watchdog timer is a great protector. Its sole purpose is to monitor the
CPU with a “you scratch my back and I’ll scratch yours” kind of relationship.
The typical watchdog (see Figure 1.3) has an input pin that must be toggled
periodically (for example, once every second). If the watchdog is not toggled
within that period, it pulses one of its output pins. Typically, this output
pin is tied either to the CPU’s reset line or to some nonmaskable interrupt
(NMI), and the input pin is tied to an I/O line of the CPU. Consequently, if
the firmware does not keep the watchdog input line toggling at the specified rate,
the watchdog assumes that the firmware has stopped working, complains, and
causes the CPU to be restarted.
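The watchdog side of this relationship can be modeled in a few lines of C. The sketch below is a host-side simulation, not code for a particular watchdog chip: the type and function names are hypothetical, and the one-second timeout is simply the example period used above.

```c
#include <stdint.h>

#define WDT_TIMEOUT_MS 1000u  /* "once every second" from the text */

typedef struct {
    int      last_input;     /* last level seen on the watchdog input pin */
    uint32_t ms_since_edge;  /* time since the input last changed state */
} watchdog_t;

void watchdog_init(watchdog_t *w, int input_level)
{
    w->last_input = input_level;
    w->ms_since_edge = 0;
}

/* Call once per millisecond with the current input pin level.
 * Returns 1 when the watchdog fires (pulses its output), else 0. */
int watchdog_tick(watchdog_t *w, int input_level)
{
    if (input_level != w->last_input) {
        /* The firmware toggled the line: restart the timeout. */
        w->last_input = input_level;
        w->ms_since_edge = 0;
        return 0;
    }
    if (++w->ms_since_edge >= WDT_TIMEOUT_MS) {
        w->ms_since_edge = 0;  /* restart timing after the pulse */
        return 1;
    }
    return 0;
}
```

On the firmware side, the only obligation is to toggle the I/O line from a point in the code that is reached only when the system is healthy, such as the bottom of the main loop. A line that is stuck at either level, high or low, lets the timeout expire.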
Time-of-Day Clock
For most embedded systems, the CPU provides all that is needed for maintaining
time. Typically, there is no need to keep track of the time of day; nevertheless,
when you need the time of day, you can’t get it without some type of
battery-backed time-of-day clock function. Even though the CPU has its own
crystal and can keep relatively good time, the CPU’s notion of time only persists
as long as the CPU is powered up and running. When the board is reset, the CPU’s
clock is reset as well, making it impossible for the CPU to maintain the time
on its own. If you need time of day in your system, then you need a battery
and a time-of-day chip. An exception to this case is if the embedded system
knows that it has an external device from which it can get the current time
after being reset.
Ethernet Interface
While there are a few processor chips that include a portion of the Ethernet
interface on the chip, it’s still more common to find the Ethernet interface
on a separate device. Like the serial port, the Ethernet interface is
partitioned into two layers (protocol and physical). The protocol layer is
implemented as a single block called the Media Access Control (MAC) layer. The physical
layer consists of two blocks: a PHY and a transformer. It is becoming more common
to see the PHY and Ethernet controller integrated into one device, but the
transformer is still separate; hence, the Ethernet interface can consist of
two or three distinct devices.
The Ethernet controller is the portion of the interface that does the
packet-level work. For incoming packets, it verifies that the incoming frame
has a valid cyclic redundancy check (CRC), ignores packets that do not match
a specified MAC address, organizes the incoming frames as packets that can be
retrieved by the CPU (usually through either a FIFO or DMA transfer), and
generates interrupts based on various configuration parameters established by
the driver. For outgoing packets, the Ethernet controller calculates the CRC,
transfers data from memory to the PHY, adds padding to small packets, and
interrupts the CPU to indicate that the packet has been sent.
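Two of the controller's transmit duties, CRC calculation and padding, are easy to show in C. The controller does this work in hardware, so the following is only a sketch of the arithmetic; the function names are hypothetical. The 60-byte figure is the 64-byte minimum Ethernet frame size less the 4-byte CRC.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

#define ETH_MIN_FRAME_NO_FCS 60u  /* 64-byte minimum frame minus the 4-byte CRC */

/* Bitwise CRC-32 as used for the Ethernet frame check sequence
 * (reflected polynomial 0xEDB88320, initial value and final XOR 0xFFFFFFFF). */
uint32_t eth_crc32(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    size_t i;
    int bit;

    for (i = 0; i < len; i++) {
        crc ^= data[i];
        for (bit = 0; bit < 8; bit++) {
            crc = (crc & 1u) ? (crc >> 1) ^ 0xEDB88320u : (crc >> 1);
        }
    }
    return crc ^ 0xFFFFFFFFu;
}

/* Pads a short frame with zeros up to the minimum length.
 * 'buf' must have room for at least ETH_MIN_FRAME_NO_FCS bytes.
 * Returns the frame length after padding. */
size_t eth_pad_frame(uint8_t *buf, size_t len)
{
    if (len < ETH_MIN_FRAME_NO_FCS) {
        memset(buf + len, 0, ETH_MIN_FRAME_NO_FCS - len);
        len = ETH_MIN_FRAME_NO_FCS;
    }
    return len;
}
```

On the receive side, the controller runs the same CRC over the incoming frame and compares the result with the value that arrived on the wire.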
The PHY takes care of the lowest level of the interface protocol. It is
responsible for various parameters (like bit rate) specific to the environment.
The transformer provides isolation and electrical conversion of the signals
passed over the cable.
In this schematic, the signals have been grouped to show how they relate to the
various system buses. Notice that nearly all of the CPU pins are dedicated to
creating these buses.
The CPU
In the CPU diagram (Figure 1.5), there is only one “big” part: the CPU. There are
also two other blocks of components on this page: the clock and the reset circuit.
The clock provides the CPU with the ability to step through processing states,
which can vary from one cycle per instruction (RISC) to sometimes over a dozen
cycles per instruction (CISC). The clock can be a crystal, or it can be a complete
clock circuit, depending on the needs of the CPU. The reset/WDT circuit provides
the processor with a logic level on the reset pin that forces the CPU back to its
starting point. This particular circuit uses separate logic to assure that the
processor’s reset pin is held low for the required amount of time after a power-up
or when the Switch_1 switch is pressed. Notice that a PIO line out of the CPU feeds
into this circuit to provide the WDT with a periodic sanity pulse.
The memory devices connect directly to the system buses. Because each device is
only 32K, each uses only 15 address lines. Notice how each device is activated
by a separate chip select.
The CPU in this design uses 16-bit addresses but transfers data eight bits at
a time. Thus, it has a 16-bit address bus and an 8-bit data bus. Using these 16
bits, the processor can address a 64K memory space. In this design, half of that
space is occupied by 32K of flash memory and the other half by 32K of SRAM (Figure
1.6).
Each CPU pin belongs to one of four groups: address, data, control, or I/O.
In this simple design, the majority of the CPU pins are committed to creating
address and data buses. Since each memory component houses only 32K of address space,
the memory chips have only 15 address lines. In this design, the low-order 15 address
bits are directly connected to these 15 lines on the memory components. Two
additional CPU control signals — ChipSelect_0 and ChipSelect_1 — are used to
activate the appropriate memory device. The most significant bit Addr_15 is unused.
(If the CPU did not provide conveniently decoded chip select lines, we could have
used the high-order bit of the address bus and some additional logic, called address
decode logic, to activate the appropriate memory device.)
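The decoding just described, whether done by the CPU's chip selects or by external address decode logic, amounts to splitting the 16-bit address into one select bit and a 15-bit offset. The C sketch below shows that split. The names are hypothetical, and the assignment of the lower half of the address space to ChipSelect_0 is an assumption.

```c
#include <stdint.h>

#define ADDR_LINES_PER_CHIP 15u
#define OFFSET_MASK ((1u << ADDR_LINES_PER_CHIP) - 1u)  /* 0x7FFF */

typedef struct {
    int      chip_select;  /* 0 or 1: which ChipSelect_n line is active */
    uint16_t offset;       /* value on the 15 address lines of the chip */
} decoded_addr_t;

/* Splits a 16-bit CPU address into a chip select and a 15-bit offset.
 * Assumption: the lower 32K maps to ChipSelect_0 and the upper 32K to
 * ChipSelect_1; the text does not say which device is in which half. */
decoded_addr_t decode_address(uint16_t addr)
{
    decoded_addr_t d;
    d.chip_select = (int)((addr >> ADDR_LINES_PER_CHIP) & 1u);
    d.offset = (uint16_t)(addr & OFFSET_MASK);
    return d;
}
```

With this mapping, address 0x7FFF selects device 0 at offset 0x7FFF, and address 0x8000 selects device 1 at offset 0x0000.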
Whenever the CPU wants to read or write a particular byte of memory, it places
the address of that byte on the address lines. If the address is 0x0000, the CPU
would drive all address lines to a low voltage, logic 0 state. If the address were
0xFFFE, the CPU would drive all except the least significant address line to a high
voltage, logic 1 state.
When a device is not selected, it is in a high-impedance (electrically
disconnected) state. Two more control lines (read and write) on the CPU control how
a selected device connects to the data bus. If the read line is active, then the
selected memory chip’s output circuits are enabled, allowing it to drive a data
value onto the data bus. If the write line is active, the CPU’s output circuits
are enabled, and the selected memory chip connects only its input circuits to the
data bus.
Collectively the read, write, and chip select pins are called the control/status
lines.
Thus, 0xBE represents the eight bits 1011 1110. Note that the number of hexadecimal
digits implies the size of the bus. 0xBE contains only two hexadecimal digits,
implying that the data bus is only eight bits wide. The four hexadecimal digits
in 0x26A4, on the other hand, suggest that the address bus is 16 bits wide. Because
of this implicit relationship between the hex representation and the bus size,
it is accepted convention to pad addresses and data values with zeros so that all
bits in the bus are specified. For example, when referencing address 1 in a machine
with 32-bit addresses, one would write 0x00000001, not 0x1.
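In C, this padding convention is easy to follow with a printf-style width argument: one hexadecimal digit for every four bits of bus width. The helper below is a hypothetical illustration, not part of any library.

```c
#include <stdio.h>
#include <stdint.h>

/* Formats 'value' as a zero-padded hex string sized for a bus that is
 * 'bus_bits' wide (one hex digit per four bits).
 * Returns the number of characters written, as snprintf does. */
int format_bus_value(char *out, size_t out_size, uint32_t value, unsigned bus_bits)
{
    int digits = (int)((bus_bits + 3u) / 4u);
    return snprintf(out, out_size, "0x%0*lX", digits, (unsigned long)value);
}
```

Called with the value 1 and a bus width of 32, this produces 0x00000001; with 0xBE and a width of 8, it produces 0xBE.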
Figure 1.7 summarizes the connections between the CPU and the flash device. If you
compare this diagram to the schematic, you can see that the only additional
connections on the flash device are for power and ground.
Figure 1.7 Connection Between CPU and Boot Flash Device.
The CPU uses the read and write signals to control the output drivers on the various
memory and peripheral devices, and thus, controls the direction of the data bus.
The CPU-to-flash device interaction can be summarized with the following steps:
involve any CPU interaction. In other words, you now understand the fundamentals
of a simple microprocessor-based hardware design!
1. M. Morris Mano, Computer System Architecture, Second Ed. (Englewood Cliffs, NJ: Prentice Hall,
1982), pg 501.
to data space but tends to read from instruction space much more often than
it writes to it.
The only limitation that cache puts on typical high-level system programmers
is that it can be dangerous to modify instruction space, sometimes called
self-modifying code. This is because any modification to memory space is done
through a data access.
The D-cache is involved in the transaction; hence, the instruction destined
for physical memory can reside in the D-cache for some undetermined amount of
time after the code that actually made the access has completed. This behavior
presents a double chance of error: the data written to the instruction space
might not be in physical memory (because it is still in the D-cache), or the
contents of the instruction’s address might already be in the I-cache, which
means the fetch for the instruction does not actually access the physical memory
that contains the new instruction.
Cache increases performance by allowing the CPU to fetch frequently used values
from fast, internal cache instead of from slower external memory. However, because
the cache control mechanism makes different assumptions about how data and
instruction spaces are manipulated, self-modifying code can create problems. Cache
can also create problems if it masks changes in peripheral registers.
Figure 1.9 shows how the cache gets between the data write and the instruction read.
Step A shows the CPU writing to memory through the D-cache. Step B shows the transfer
of the contents of that D-cache location to physical memory. Step C represents the
transfer of the physical memory to the I-cache, and step D shows the CPU’s memory
access unit retrieving the instruction from the I-cache. If the sequence of events
was guaranteed to be A-B-C-D, then everything would work fine. However, this sequence
cannot be guaranteed, because that would eliminate the efficiency gained by using
the cache. The whole point behind cache is to attempt to eliminate the B and C steps
whenever possible. The ultimate result is that the instruction fetch may be
corrupted as a result of skipping step B, step C, or both.
For embedded systems, the problem just gets worse. Understanding the above
problem makes the secondary problems fairly clear. Notice in Figure 1.8 that there
is a flash device, DRAM, and a UART. Two additional complexities become apparent:
1. The UART is on the same address/data bus as the memory, which means that
accesses to a UART or any other physical device outside the CPU must deal with the
fact that cache can “get in the way.” Hardware must be designed with this
consideration in mind (or the firmware must configure the hardware) so that certain
devices external to the CPU can easily be accessed without the data cache being
involved.
2. The UART may be configured to use DMA to place incoming characters into
memory space. In many systems, DMA and cache are independent of each other. The
data cache is likely to be unaware of memory changes due to DMA transfers, which
means that if the data cache sits between the CPU and this memory space, more
inconsistencies must be dealt with.
The complexity of the hardware and the “need for speed” make these issues tricky
but not insurmountable. Most of the prior problems are solved through good
hardware and firmware design. The initial issue of I-cache and D-cache inconsistency
can be resolved by invoking a flush of the data cache and an invalidation of the
instruction cache. A flush forces the content of the data cache out to the real memory.
An invalidation empties the instruction cache so that future CPU requests retrieve
a fresh copy of corresponding memory.
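The flush-then-invalidate sequence can be demonstrated with a toy model of a single memory location seen through separate instruction and data caches. This is a host-side simulation with hypothetical names; on real hardware the flush and invalidate operations are CPU-specific instructions or register writes.

```c
#include <stdint.h>

/* Toy model of one memory location seen through separate caches. */
typedef struct {
    uint32_t memory;        /* physical memory contents */
    uint32_t dcache;        /* D-cache copy */
    int      dcache_dirty;  /* 1 if the D-cache copy has not reached memory */
    uint32_t icache;        /* I-cache copy */
    int      icache_valid;  /* 1 if the I-cache copy may be used */
} cache_model_t;

void model_init(cache_model_t *m, uint32_t instr)
{
    m->memory = instr;
    m->dcache = instr;
    m->dcache_dirty = 0;
    m->icache = instr;     /* assume the instruction was already fetched once */
    m->icache_valid = 1;
}

/* Step A: a data write lands in the D-cache only (write-back behavior). */
void data_write(cache_model_t *m, uint32_t value)
{
    m->dcache = value;
    m->dcache_dirty = 1;
}

/* Step B on demand: flush pushes dirty D-cache contents to memory. */
void dcache_flush(cache_model_t *m)
{
    if (m->dcache_dirty) {
        m->memory = m->dcache;
        m->dcache_dirty = 0;
    }
}

/* Invalidation empties the I-cache so the next fetch goes to memory. */
void icache_invalidate(cache_model_t *m)
{
    m->icache_valid = 0;
}

/* Steps C and D: fetch from the I-cache, refilling it from memory on a miss. */
uint32_t instruction_fetch(cache_model_t *m)
{
    if (!m->icache_valid) {
        m->icache = m->memory;
        m->icache_valid = 1;
    }
    return m->icache;
}
```

In this model, a fetch that follows data_write() alone returns the old instruction, and so does a fetch that follows icache_invalidate() without a prior dcache_flush(). Only the flush followed by the invalidation makes the new instruction visible to the fetch.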
Also, there are some characteristics of cache that can help resolve these
problems. For example, a write-through data cache ensures that data written to memory
is placed in cache but also written out to physical memory. This guarantees that
data writes will be loaded to cache and will pass through the cache to real memory;
hence it really only provides a speed improvement for data reads. Also, a facility
in some CPUs called bus snooping helps with the memory inconsistency issues related
to DMA and cache. Bus snooping hardware detects when DMA is used to access memory
space that is cached and automatically invalidates the corresponding cache
locations. Bus snooping hardware isn’t available on all systems, however. Additionally,
to avoid the problem with cache being between the CPU and the UART, devices can
usually be mapped to a memory location that doesn’t involve the cache at all. It
is very common for a CPU to restrict caching to certain specific cacheable regions
of memory, rather than designating its entire memory space cacheable. The bottom
line is that the firmware developer must be aware of these hardware and firmware
capabilities and limitations in order to deal with these complexities efficiently.
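The idea of restricting caching to certain regions can be pictured as a table lookup on the address. The memory map below is entirely hypothetical and exists only to illustrate the concept; on a real CPU the regions are defined by the hardware or by configuration registers described in the datasheet.

```c
#include <stdint.h>
#include <stddef.h>

typedef struct {
    uint32_t base;       /* first address of the region */
    uint32_t size;       /* region length in bytes */
    int      cacheable;  /* 1 if accesses may go through the cache */
} mem_region_t;

/* Hypothetical memory map, invented for illustration. */
static const mem_region_t memory_map[] = {
    { 0x00000000u, 0x00100000u, 1 },  /* flash: cacheable */
    { 0x10000000u, 0x00400000u, 1 },  /* DRAM: cacheable */
    { 0x20000000u, 0x00000100u, 0 },  /* UART registers: not cacheable */
};

/* Returns 1 if 'addr' falls in a cacheable region, 0 otherwise.
 * Addresses outside every region are treated as uncached. */
int is_cacheable(uint32_t addr)
{
    size_t i;
    for (i = 0; i < sizeof memory_map / sizeof memory_map[0]; i++) {
        /* Unsigned subtraction wraps for addr < base, so one compare suffices. */
        if (addr - memory_map[i].base < memory_map[i].size) {
            return memory_map[i].cacheable;
        }
    }
    return 0;
}
```

Placing the UART registers in a region marked as not cacheable means that every read of a status register reaches the device itself rather than a stale copy held in the data cache.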
Summary
While embedded systems come in a fascinating array of variations, at the lowest
hardware levels they usually have many general similarities. Memory systems
interface with the CPU via address and data buses. Systems interface with
development hosts via serial ports. Watchdog timers supply robust operation
even in the presence of intermittent software and hardware bugs. Knowing the
general structure of these common facilities gives you a useful framework for
learning the specifics of new systems.
The hardware coverage in this chapter won’t make you a hardware guru, but
it should prepare you to better understand the documentation for your particular
hardware. More importantly (for the purposes of the book), it should prepare
you to understand the hardware issues discussed in later chapters.
Where this chapter described primarily hardware features, the next chapter
focuses on software issues — specifically on how to compile and load programs
for an embedded target. More than anything else, these two issues, the need
for greater knowledge about the hardware and the need for tools that work in
a cross-development environment, separate embedded systems from application
development.