
COMPUTER ARCHITECTURE

INTRODUCTION TO COMPUTER ARCHITECTURE

Why study computer organization and architecture?


 Design better programs, including system software such as compilers, operating systems, and device drivers.
 Optimize program behavior.
 Evaluate (benchmark) computer system performance.
 Understand time, space, and price tradeoffs.
Computer Architecture: It focuses on the structure and behavior of the computer and refers to the logical aspects of
system implementation as seen by the programmer. Computer architecture includes many elements such as instruction
sets and formats, operation codes, data types, the number and types of registers, addressing modes, main memory
access methods, and various I/O mechanisms. In short, we are trying to answer the question: how do I design a computer?
The computer architecture for a given machine is the combination of its hardware components plus its
instruction set architecture (ISA). The ISA is the interface between all the software that runs on the machine and the
hardware that executes it. The ISA allows you to talk to the machine.
1. The main components of a computer
Every task given to a computer follows an Input-Process-Output cycle (IPO cycle): it needs certain input,
processes that input, and produces the desired output. The input unit takes the input, the central processing
unit processes the data, and the output unit produces the output. The memory unit holds the data
and instructions during processing.

Functional Components of a computer


a) The input unit consists of input devices that are attached to the computer. These devices take input and convert it
into binary instructions that the computer understands.
b) The CPU is called the brain of the computer because it is the control centre of the computer; it is also called
the microprocessor. It first fetches instructions from memory and then interprets them so as to know what is to be
done. If required, data is fetched from memory or from an input device. Thereafter the CPU executes the required
computation and then either stores the output or displays it on the output device. The CPU has three main
components: the Arithmetic and Logic Unit (ALU), the control unit and the registers.
c) The memory unit attached to the CPU is used for storage of data and instructions and is called internal memory.
During processing, it is the internal memory that holds the data. It is also known as primary memory or main
memory, and includes RAM, ROM, registers and cache memories.
d) The secondary memory is needed to store data and information permanently for later use. This is also known as
auxiliary memory. It differs from primary storage in that it is not directly accessible by the CPU. The secondary
memory provides backup storage for instructions (computer programs) and data. Some of the examples of
secondary storage devices are hard disk, compact disks, pen drives etc.
e) The output unit consists of output devices that are attached to the computer. It converts the binary
data coming from the CPU into human-understandable form. Common output devices are the monitor, printer,
plotter, etc.
Example: the features of an older (now obsolete) computer advertisement

- The microprocessor in the ad is a Pentium III, operating at 667 MHz. Every computer system contains a clock
that keeps the system synchronized. The clock sends electrical pulses simultaneously to all main
components, ensuring that data and instructions will be where they’re supposed to be, when they’re supposed
to be there. The number of pulses emitted each second by the clock is its frequency. Clock frequencies
are measured in cycles per second, or hertz; here it is millions of pulses per second (MHz). The fact that this
microprocessor runs at 667 MHz, however, doesn’t necessarily mean that it can execute 667 million
instructions every second, because each computer instruction requires a fixed number of cycles to
execute.
- The 133MHz refers to the speed of the system bus, which is a group of wires that moves data and instructions
to various places within the computer. The system in our advertisement also boasts a memory capacity of 64
megabytes (MB), or about 64 million characters, of SDRAM (synchronous dynamic random access
memory). SDRAM is much faster than conventional (nonsynchronous) memory because it can synchronize
itself with the microprocessor’s bus.
- “32KB L1 cache, 256KB L2 cache” also describes a type of memory. No matter how fast a
bus is, it still takes “a while” to get data from memory to the processor. To provide even faster access to data,
many systems contain a special memory called cache. The system in our advertisement has two kinds of
cache. Level 1 cache (L1) is a small, fast memory cache that is built into the microprocessor chip and helps
speed up access to frequently used data. Level 2 cache (L2) is a collection of fast, built-in memory chips
situated between the microprocessor and main memory. Notice that the cache in our system has a capacity of
kilobytes (KB), which is much smaller than main memory.
- 30GB is the storage capacity of a fixed (or hard) disk. A large disk isn’t very helpful if it is too slow for its
host system. The computer in our ad has a hard drive that rotates at 7200 RPM (revolutions per minute), which
indicates a fairly fast drive; however, rotational speed is only one of the determining factors in the overall
performance of a disk. The manner in which it connects to, or interfaces with, the rest of the system is also
important. The advertised system uses a disk interface called EIDE, or enhanced integrated drive electronics.
Most EIDE systems share the main system bus with the processor and memory.
- “2 USB ports, 1 serial port, 1 parallel port”: three different kinds of ports. Ports allow movement of data to and
from devices external to the computer. Most desktop computers come with two kinds of data ports: serial
ports and parallel ports. Serial ports transfer data by sending a series of electrical pulses across one or two
data lines. Parallel ports use at least eight data lines, which are energized simultaneously to transmit data. Our
advertised system also comes equipped with a special serial connection called a USB (universal serial bus)
port. USB is a popular external bus that supports Plug-and-Play as well as hot plugging (the ability to add
and remove devices while the computer is running).
- Some systems augment their main bus with dedicated I/O buses. Peripheral Component Interconnect (PCI) is
one such I/O bus that supports the connection of multiple peripheral devices. There are two PCI devices
mentioned in the ad. The PCI modem and the PCI sound card.
- “19" monitor, .24mm AG, 1280 X 1024 at 85Hz.”: This means that the image displayed on the monitor is
repainted 85 times a second. Resolution is determined by the dot pitch of the monitor, which is the distance
between a dot (or pixel) and the closest dot of the same colour. we have a 0.24 millimeter dot pitch supported
by an AG (aperture grill) display. This monitor is further supported by an AGP (accelerated graphics port)
graphics card.

2. Standards organizations
Most standards-setting organizations are ad hoc trade associations or consortia made up of industry leaders.
Manufacturers know that by establishing common guidelines for a particular type of equipment, they can market their
products to a wider audience than if they came up with separate and perhaps incompatible specifications. Some
standards organizations have formal charters and are recognized internationally as the definitive authority in certain
areas of electronics and computers.
- The Institute of Electrical and Electronic Engineers (IEEE) is an organization dedicated to the
advancement of the professions of electronic and computer engineering.
- The International Telecommunications Union (ITU) : the ITU concerns itself with the interoperability of
telecommunications systems, including telephone, telegraph, and data communication systems.
- The International Organization for Standardization (ISO) is the entity that coordinates worldwide
standards development, including the activities of the American National Standards Institute (ANSI), the
British Standards Institution (BSI) and the CEN (Comite Europeen de Normalisation, the European committee
for standardization), among others.
3. Computer Evolution and Performance
The evolution of computers has been characterized by increasing processor speed, decreasing component size,
increasing memory size, and increasing I/O capacity and speed. One factor responsible for the great increase in processor
speed is the shrinking size of microprocessor components; this reduces the distance between components and increases
speed. A critical issue in computer system design is balancing the performance of the various elements, so that gains in
performance in one area are not handicapped by a lag in other areas.
a. The First Generation
ENIAC (Electronic Numerical Integrator and Computer), 1943-1955
Numbers were represented in decimal form and arithmetic was performed in the decimal system. It was
made using vacuum tubes; a vacuum tube was a fragile glass device that could control and amplify electronic
signals. Its memory consisted of 20 accumulators, each capable of holding a 10-digit decimal number. Its major
drawback was that it had to be programmed manually by setting switches and plugging and unplugging cables. Also,
it weighed around 30 tons, occupied 1500 square feet of floor space, contained more than 18,000 vacuum tubes,
consumed 140 kilowatts of power and was capable of 5000 additions per second.

The Von Neumann Machine


The first publication of the idea was in a 1945 proposal by von Neumann for a new computer, the EDVAC
(Electronic Discrete Variable Computer). In 1946, von Neumann and his colleagues began the design of a new stored
program computer, referred to as the IAS computer, at the Princeton Institute for Advanced Studies. The IAS
computer, although not completed until 1952, is the prototype of all subsequent general-purpose computers. With rare
exceptions, all of today’s computers have this same general structure, and are thus referred to as von Neumann
machines.
First Commercial computers
In 1950, UNIVAC I (Universal Automatic Computer), which was commissioned by the Bureau of the Census, became the
first successful commercial computer. The UNIVAC II, which had greater memory capacity and higher performance than
the UNIVAC I, was delivered in the late 1950s.
Salient features of First generation computers:
 Used vacuum tubes to control and amplify electronic signals
 Huge computers that occupied lot of space
 High electricity consumption and high heat generation

 Were unreliable since they were prone to frequent hardware failures
 Commercial production was difficult
 They were very costly and required constant maintenance
 Continuous air conditioning was required
 Programming was done in machine language although assembly language also started at the end of
this generation.
Example : ENIAC , EDVAC , UNIVAC 1
b. The Second Generation: Transistors
The first major change in the electronic computer came with the replacement of the vacuum tube by the
transistor. The transistor is smaller, cheaper, and dissipates less heat than a vacuum tube but can be used in the same
way as a vacuum tube to construct computers. Unlike the vacuum tube, which requires wires, metal plates, a glass
capsule, and a vacuum, the transistor is a solid-state device made from silicon. It performs all the functions of a
vacuum tube, i.e. it switches circuits on and off at very high speed. Transistors were developed at Bell Labs in
1947.
Other changes as well can be named; they are the introduction of more complex arithmetic and logic units and
control units, the use of high level programming languages, and the provision of system software with the computer.
Some of the names of second generation computers are IBM series, UNIVAC III, CDC 1400 series, Honeywell
etc.

Salient Features of Second generation computers:


 Used transistor-based technology, which made computers smaller and less expensive compared to the first
generation
 Consumed less electricity and emitted less heat
 Magnetic core memories and magnetic disks were used as primary and secondary storage respectively
 The first operating systems were developed; programming was done in assembly language and, in the later part
of the generation, in high-level languages
 Wider commercial use, but commercial production was still difficult, and they still required constant air conditioning
Examples: IBM 1401, IBM 1620, UNIVAC 1108
c. The Third Generation: Integrated Circuits(IC)
Throughout the 1950s and early 1960s, electronic equipment was composed largely of discrete components:
transistors, resistors, capacitors and so on. In 1958 came the achievement that revolutionized electronics and started
the age of microelectronics: the invention of the integrated circuit. An integrated circuit (IC) is a microelectronics
technology in which it is possible to integrate a large number of circuit elements into a very small surface area (less
than 5 mm square) of silicon known as a chip. Only two fundamental types of components are required: gates and
memory cells, which are themselves built from transistors, resistors and capacitors.
A gate is a device that implements a simple Boolean or logical function, and a memory cell is a device that can store
one bit of data; we can relate these to our four basic functions as follows:
 Data storage: Provided by memory cells.
 Data processing: Provided by gates.
 Data movement: Paths between components are used to move data from memory to memory and from memory
through gates to memory.
 Control: Paths between components can carry control signals.
There were two types of ICs in that generation: small-scale integration (SSI) chips had about 10 to 20
transistors on a single chip, and medium-scale integration (MSI) chips had about 100 transistors per chip.
Moore’s Law
Gordon Moore, cofounder of Intel, propounded Moore’s Law in 1965. According to Moore’s Law, the number
of transistors on a chip would double every year. Since the 1970s development has slowed a little, and the number of
transistors now doubles roughly every 18 months.
IBM System/360

The characteristics of 360 family are as follows: Similar or identical instruction sets, Similar or identical
operating system, Increasing speed, Increasing number of I/O ports, Increasing memory size, Increasing cost.
DEC PDP-8
In 1964, Digital Equipment Corporation (DEC) produced the PDP-8, the first minicomputer. It was small enough
to sit on a lab bench and did not need an air-conditioned room. It used a bus structure that is now virtually universal for
minicomputers and microcomputers.
Salient Features of Third Generation computers:
 Used integrated circuits
 Computers were smaller , faster and more reliable
 Low power consumption and less emission of heat as compared to previous generations
d. Fourth Generation (1975-2010)
The development of the microprocessor chip, which contains an entire central processing unit (CPU) on a single
silicon chip, led to the fourth generation of computers. The technology used in fourth generation
computers was Large-Scale Integration (LSI), in which it was possible to integrate 30,000 transistors on a single chip;
later came VLSI (Very Large Scale Integration), in which millions of transistors can be assembled on a single
chip.
Semiconductor Memory
The first application of integrated circuit technology to computers was construction of the processor (the control
unit and the arithmetic and logic unit) out of integrated circuit chips. But it was also found that this same technology
could be used to construct memories.
In 1970, Fairchild produced the first relatively capacious semiconductor memory. This chip could hold 256 bits of
memory. It took only 70 billionths of a second to read a bit. Following this, there has been a continuing and rapid
decline in memory cost accompanied by a corresponding increase in physical memory density. This has led the way to
smaller, faster machines.
Microprocessors
In 1971, Intel developed its 4004, which was the first chip to contain all of the components of a CPU on a single
chip: the microprocessor was born. The 4004 could add two 4-bit numbers. These are its successors:
 In 1972, Intel introduced the 8008, the first 8-bit microprocessor, which was almost twice as complex as
the 4004. The 4004 and the 8008 had been designed for specific applications.
 In 1974, the Intel 8080, the first general-purpose microprocessor, was introduced. Like the 8008, the 8080 is
an 8-bit microprocessor; however, it is faster, has a richer instruction set, and has a larger addressing capability.
 At the end of the 1970s, general-purpose 16-bit microprocessors appeared. One of these was the 8086.

 Then, Bell Labs and Hewlett-Packard developed 32-bit, single-chip microprocessors


 Intel introduced its own 32-bit microprocessor, the 80386, in 1985.
 Intel produced a family of 64-bit microprocessors called Itanium. In 2001, the first Itanium processor,
codenamed Merced, was released.
These are, in order, the generations of Intel processors: 8080, 8086, 80286, 80386, 80486, Pentium (introduced
the use of superscalar techniques, which allow multiple instructions to execute in parallel), Pentium Pro, Pentium
II (designed specifically to process graphics, video and audio data efficiently), Pentium III (incorporates additional
floating-point instructions for 3D graphics), Pentium 4 (includes further floating-point and multimedia
enhancements), Dual Core (referring to the implementation of two processors on a single chip), Itanium.

Salient features of Fourth generation Computers


 Use ICs with LSI and VLSI technology
 Microprocessors was developed
 Portable computers developed
 Networking and data communication became popular
 Different types of secondary memory with high storage capacity and fast access developed

 Computers were very reliable, powerful and small in size
 Negligible power consumption and heat generation, and very low production cost
e. Fifth Generation Computers
Fifth generation computers are still under development; they will have thinking power and the capability to make
decisions like human beings, and may prove better than humans in certain respects. They will be more useful in the field of
knowledge processing rather than in data processing. The concept of Artificial Intelligence (AI) is being used in
these computers, along with other concepts such as swarm intelligence or distributed intelligence, which have shown
consistent results in robotics. These computers will have a Knowledge Information Processing System
(KIPS) rather than the present Data/Logic Information Processing System. Applications such as voice
recognition and visual recognition are a step in this direction.
Salient features of fifth generation computers:
 Parallel Processing
 Superconductivity
 Artificial Intelligence and swarm intelligence
4. Von Neumann Architecture
General structure of the IAS computer consists of:
 A main memory, which stores both data and instructions.
 The central processing unit : Which holds (1) An arithmetic and logic unit (ALU) capable of operating on binary
data. (2) A control unit, which interprets the instructions in memory and causes them to be executed and (3)
Registers.
 Input and output (I/O) equipment

The von Neumann Architecture


This architecture runs programs in what is known as the von Neumann execution cycle (also called the fetch-
decode-execute cycle), which describes how the machine works. One iteration of the cycle is as follows (a small
code sketch follows the list):
 The control unit fetches the next instruction from memory using the program counter to determine where
the instruction is located.
 The instruction is decoded into a language that the ALU can understand.
 Any data operands required to execute the instruction are fetched from memory and placed into registers
within the CPU.
 The ALU executes the instruction and places results in registers or memory.
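To make the cycle concrete, here is a minimal sketch of the fetch-decode-execute loop for a hypothetical toy machine with a single accumulator and a handful of made-up opcodes (LOAD, ADD, STORE, HALT); it is an illustration of the idea, not a model of any real instruction set.

// Toy von Neumann machine: one accumulator, word-addressable memory.
// Instruction word layout (hypothetical): high byte = opcode, low byte = address.
public class ToyMachine {
    static final int HALT = 0, LOAD = 1, ADD = 2, STORE = 3;

    public static void main(String[] args) {
        int[] memory = new int[256];
        // A tiny program: acc = mem[10] + mem[11]; mem[12] = acc; halt.
        memory[0] = (LOAD  << 8) | 10;
        memory[1] = (ADD   << 8) | 11;
        memory[2] = (STORE << 8) | 12;
        memory[3] = (HALT  << 8);
        memory[10] = 40;
        memory[11] = 2;

        int pc = 0, acc = 0;
        while (true) {
            int instruction = memory[pc++];      // fetch: the program counter locates the instruction
            int opcode  = instruction >> 8;      // decode: split the word into opcode and operand address
            int address = instruction & 0xFF;
            switch (opcode) {                    // execute: operate on registers and memory
                case LOAD:  acc = memory[address]; break;
                case ADD:   acc += memory[address]; break;
                case STORE: memory[address] = acc; break;
                case HALT:  System.out.println("mem[12] = " + memory[12]); return; // prints 42
            }
        }
    }
}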
5. Interconnection between Functional Components
The interconnection between the functional components of a computer is done using Common Bus Architecture,
whereby devices communicate with each other through a common bus. A bus is a transmission path (set of
conducting wires) over which data or information in the form of electric signals, is passed from one component to
another in a computer. The bus can be of three types – Address bus, Data bus and Control Bus.

The Modified von Neumann Architecture, Adding a System Bus (Bus Architecture)

The Modified von Neumann Architecture, Adding a System Bus and Controller
6. Computer measurement units
a. Clock frequencies
They are measured in cycles per second, or hertz. Processor speeds are measured in MHz or GHz; 1 MHz =
1,000,000 Hz.
b. Units of storage
Bytes are also used to represent characters in a text. Different types of coding schemes are used to represent the character set
and numbers. The most commonly used coding scheme is the American Standard Code for Information Interchange (ASCII).
The following shows the representation of various memory sizes.

Kilobyte: 1 kilobyte = 2^10 bytes = 1,024 bytes (read as 2 to the power 10)
Megabyte: 1 megabyte = 2^20 bytes = 1,048,576 bytes
Gigabyte: 1 gigabyte = 2^30 bytes = 1,073,741,824 bytes
c. Measures of time and space
Millisecond = 1 thousandth (10^-3) of a second (hard disk drive access times are often 10 to 20 milliseconds)
Nanosecond = 1 billionth (10^-9) of a second (main memory access times are often 50 to 70 nanoseconds)
Micron (micrometer) = 1 millionth (10^-6) of a meter (circuits on computer chips are measured in microns)
Exercise:
1) Describe each feature presented in the following computer.
2) Compare the Intel i3, i5, i7 and i9 processors.
3) What is the use of dedicated graphics memory? Give a practical use of it.
4) Rank the 5 best CPU manufacturers and list the 5 best CPUs on the market with their features.

CHAPTER 2
Data Representation (Integers, Floating-point Numbers, and Characters)

Learning outcomes
After completing this chapter, and the Essential reading and activities, you should be able to:

 explain how textual and numeric information can be represented in binary form
 perform basic calculations within the binary system
Introduction
The previous chapter has given you an overview of how a computer operates. It has shown that, in
order to operate, the computer needs to store and process data. This chapter is concerned with the way
data needs to be represented so it can be stored in a computer’s memory and processed as part of
computer operations. Data can be qualitative (based on qualities or characteristics, e.g. a name, NIC,
or address) or quantitative (proportional to a value, e.g. a number of students or a number of
marks). Whether quantitative or qualitative, all data in a computer are represented using the binary
system.
0. Data and information
A computer is an electronic device that processes input according to the set of instructions
provided to it and gives the desired output at a very fast rate.
Data is the term used for raw facts and figures, while information is data represented in a useful and
meaningful form. In simple words, data is the raw material that is processed to give meaningful,
ordered or structured information. There are different types of data, namely text, numbers, images,
sound and video.
Representing text
A common way to represent text is to agree on a unique code for every symbol (e.g. for every letter,
punctuation mark or number) that needs to be presented. Each code consists of a fixed-length sequence
of bits (but obviously the sequence is different for each code). A word can then be ‘written’ by
determining the code for each letter and stringing them together. Characters include letters (A-Z, a-z),
digits, and other special symbols such as &, #, {, [, |, \, ^, @, ], }. The standards used to represent characters are:
 Binary Coded Decimal (BCD), a method of using binary digits to represent the decimal
digits 0-9. Each decimal digit is represented by four binary digits, e.g. BCD(53) = 0101 0011
(a small encoding sketch follows this list).
   Decimal digit: 0    1    2    3    4    5    6    7    8    9
   BCD code:      0000 0001 0010 0011 0100 0101 0110 0111 1000 1001
 ASCII (American Standard Code for Information Interchange), which is limited to English symbols.
 EBCDIC (Extended Binary Coded Decimal Interchange Code), which is used by large IBM computers.
 Unicode, commonly used through its UTF (Unicode Transformation Format) encodings; it is the most widely
used because it is compatible with many languages such as French, Spanish, Chinese, Swahili, Hausa, Arabic, etc.
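As referenced above, here is a minimal sketch of BCD encoding, mapping each decimal digit to four bits; the method name is illustrative and not part of any standard library.

public class BcdDemo {
    // Encode a non-negative decimal number as a BCD bit string, 4 bits per digit.
    static String toBcd(int n) {
        StringBuilder sb = new StringBuilder();
        for (char digit : Integer.toString(n).toCharArray()) {
            String bits = Integer.toBinaryString(digit - '0');
            sb.append("0000".substring(bits.length())).append(bits); // left-pad each digit to 4 bits
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toBcd(53)); // 01010011, as in the example above
    }
}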
Representing images
Images are also encoded using bit patterns. Generally an image is divided into many small picture
elements (pixels), and the appearance of each pixel is then encoded in binary form. The collection of
these encoded pixels is known as the bitmap of the image. A colour image usually uses three bytes to
represent a single pixel. For example, computers frequently use a combination of red, green and blue
light to represent a wide spectrum of colour (the RGB colour model).
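To make the "three bytes per pixel" idea concrete, here is a small sketch (under the common 24-bit RGB assumption) that packs one pixel's three channel bytes into a single integer and unpacks them again; the variable names are illustrative.

public class RgbDemo {
    public static void main(String[] args) {
        int red = 200, green = 120, blue = 30;       // one byte (0-255) per channel

        // Pack the three bytes into a single integer: 0xRRGGBB.
        int pixel = (red << 16) | (green << 8) | blue;
        System.out.printf("pixel = %06X%n", pixel);  // C8781E

        // Unpack the channels again by shifting and masking.
        int r = (pixel >> 16) & 0xFF;
        int g = (pixel >> 8) & 0xFF;
        int b = pixel & 0xFF;
        System.out.println(r + " " + g + " " + b);   // 200 120 30
    }
}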

Representing sound
In order to store sound on your computer, the analogue sound signal needs to be converted into a
digital format. Typically the amplitude of the sound wave is checked and recorded at regular time
intervals. These values can then be stored in binary form and used to re-construct the initial wave at a
later stage.
A numbering system is a way of representing numbers. The most commonly used numbering system
is the decimal system. A bit (binary digit) is the smallest unit of data stored in the computer memory; its
values are 0 and 1. A collection of 8 bits is called a byte. With 8 bits, or a byte, we can represent 256
values ranging from 0 to 255. The largest number with 8 binary digits is 2^8 - 1 = 255; thus the largest
number with n bits is 2^n - 1.

1. Number Systems
Human beings use decimal (base 10) number systems for counting and measurements (probably because
we have 10 fingers and two big toes). Computers use binary (base 2) number system, as they are made
from binary digital components (known as transistors) operating in two states - on and off. In computing,
we also use hexadecimal (base 16) or octal (base 8) number systems, as a compact form for representing
binary numbers.
1.1 Decimal (Base 10) Number System
Decimal number system has ten symbols: 0, 1, 2, 3, 4, 5, 6, 7, 8, and 9, called digits. It uses positional
notation. That is, the least-significant digit (right-most digit) is of the order of 10^0 (units or ones), the
second right-most digit is of the order of 10^1 (tens), the third right-most digit is of the order
of 10^2 (hundreds), and so on, where ^ denotes exponent. For example,

735 = 700 + 30 + 5 = 7×10^2 + 3×10^1 + 5×10^0

We shall denote a decimal number with an optional suffix D if ambiguity arises.


1.2 Binary (Base 2) Number System
Binary number system has two symbols: 0 and 1, called bits. It is also a positional notation, for example,

10110B = 10000B + 0000B + 100B + 10B + 0B = 1×2^4 + 0×2^3 + 1×2^2 + 1×2^1 + 0×2^0

We shall denote a binary number with a suffix B. Some programming languages denote binary numbers
with prefix 0b or 0B (e.g., 0b1001000), or prefix b with the bits quoted (e.g., b'10001111').
A binary digit is called a bit. Eight bits is called a byte (why an 8-bit unit? Probably because 8 = 2^3).
1.3 Hexadecimal (Base 16) Number System
Hexadecimal number system uses 16 symbols: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, and F, called hex
digits. It is a positional notation, for example,

A3EH = A00H + 30H + EH = 10×16^2 + 3×16^1 + 14×16^0

We shall denote a hexadecimal number (in short, hex) with a suffix H. Some programming languages
denote hex numbers with prefix 0x or 0X (e.g., 0x1A3C5F), or prefix x with hex digits quoted
(e.g., x'C3A4D98B').
Each hexadecimal digit is also called a hex digit. Most programming languages accept lowercase 'a' to 'f' as
well as uppercase 'A' to 'F'.
Computers use the binary system in their internal operations, as they are built from binary digital electronic
components with two states: on and off. However, writing or reading a long sequence of binary bits is
cumbersome and error-prone (try to read this binary string: 1011 0011 0100 0011 0001 1101 0001 1000B,
which is the same as hexadecimal B343 1D18H). The hexadecimal system is used as a compact form
or shorthand for binary bits. Each hex digit is equivalent to 4 binary bits, i.e., shorthand for 4 bits, as
follows:
Hexadecimal   Binary   Decimal

0 0000 0
1 0001 1
2 0010 2
3 0011 3
4 0100 4
5 0101 5
6 0110 6
7 0111 7
8 1000 8
9 1001 9
A 1010 10
B 1011 11
C 1100 12
D 1101 13
E 1110 14
F 1111 15

1.4 Conversion from Hexadecimal to Binary


Replace each hex digit by the 4 equivalent bits (as listed in the above table), for examples,

A3C5H = 1010 0011 1100 0101B


102AH = 0001 0000 0010 1010B

1.5 Conversion from Binary to Hexadecimal


Starting from the right-most bit (least-significant bit), replace each group of 4 bits by the equivalent hex
digit (pad the left-most bits with zero if necessary), for examples,

1001001010B = 0010 0100 1010B = 24AH


10001011001011B = 0010 0010 1100 1011B = 22CBH

It is important to note that hexadecimal number provides a compact form or shorthand for representing
binary bits.
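The digit-by-digit substitution described above can be written as a small sketch; Java's built-in Integer.toBinaryString / Integer.parseInt could be used to check the results, but the loop below mirrors the table in section 1.3 directly.

public class HexBinDemo {
    // Convert a hexadecimal string to binary by replacing each hex digit with 4 bits.
    static String hexToBin(String hex) {
        StringBuilder sb = new StringBuilder();
        for (char c : hex.toCharArray()) {
            String bits = Integer.toBinaryString(Character.digit(c, 16));
            sb.append("0000".substring(bits.length())).append(bits);  // left-pad to 4 bits
        }
        return sb.toString();
    }

    // Convert a binary string to hexadecimal by grouping 4 bits from the right.
    static String binToHex(String bin) {
        while (bin.length() % 4 != 0) bin = "0" + bin;   // pad the left-most group with zeros
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < bin.length(); i += 4) {
            sb.append(Integer.toHexString(Integer.parseInt(bin.substring(i, i + 4), 2)));
        }
        return sb.toString().toUpperCase();
    }

    public static void main(String[] args) {
        System.out.println(hexToBin("A3C5"));        // 1010001111000101
        System.out.println(binToHex("1001001010"));  // 24A
    }
}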
1.6 Conversion from Base r to Decimal (Base 10)
Given an n-digit base-r number d_{n-1} d_{n-2} d_{n-3} ... d_2 d_1 d_0 (base r), the decimal equivalent is given by:
d_{n-1}×r^(n-1) + d_{n-2}×r^(n-2) + ... + d_1×r^1 + d_0×r^0

For examples,

A1C2H = 10×16^3 + 1×16^2 + 12×16^1 + 2 = 41410 (base 10)


10110B = 1×2^4 + 1×2^2 + 1×2^1 = 22 (base 10)

1.7 Conversion from Decimal (Base 10) to Base r

Use repeated division/remainder. For example,

To convert 261(base 10) to hexadecimal:


261/16 => quotient=16 remainder=5
16/16 => quotient=1 remainder=0
1/16 => quotient=0 remainder=1 (quotient=0 stop)
Hence, 261D = 105H (Collect the hex digits from the remainder in reverse order)

The above procedure is actually applicable to conversion between any 2 base systems. For example,

To convert 1023(base 4) to base 3:


1023(base 4)/3 => quotient=25D remainder=0
25D/3 => quotient=8D remainder=1
8D/3 => quotient=2D remainder=2
2D/3 => quotient=0 remainder=2 (quotient=0 stop)
Hence, 1023(base 4) = 2210(base 3)
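The repeated-division procedure above can be written directly as a small sketch that converts a non-negative decimal value to any radix between 2 and 16 (Java's Integer.toString(n, radix) does the same job); note that 1023 (base 4) = 75 in decimal, as used in the second call.

public class RadixDemo {
    static final String DIGITS = "0123456789ABCDEF";

    // Convert a non-negative decimal number to the given radix by repeated division,
    // collecting the remainders in reverse order.
    static String toRadix(int n, int radix) {
        if (n == 0) return "0";
        StringBuilder sb = new StringBuilder();
        while (n > 0) {
            sb.append(DIGITS.charAt(n % radix));  // remainder = next digit (least significant first)
            n /= radix;                           // the quotient feeds the next step
        }
        return sb.reverse().toString();
    }

    public static void main(String[] args) {
        System.out.println(toRadix(261, 16)); // 105, i.e. 261D = 105H
        System.out.println(toRadix(75, 3));   // 2210, i.e. 1023 (base 4) = 2210 (base 3)
    }
}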

1.8 Conversion between Two Number Systems with Fractional Part


1. Separate the integral and the fractional parts.
2. For the integral part, divide by the target radix repeatedly, and collect the remainders in reverse
order.
3. For the fractional part, multiply the fractional part by the target radix repeatedly, and collect the
integral parts in the same order.
Example 1: Decimal to Binary

Convert 18.6875D to binary


Integral Part = 18D
18/2 => quotient=9 remainder=0
9/2 => quotient=4 remainder=1
4/2 => quotient=2 remainder=0
2/2 => quotient=1 remainder=0
1/2 => quotient=0 remainder=1 (quotient=0 stop)
Hence, 18D = 10010B
Fractional Part = .6875D
.6875*2=1.375 => whole number is 1
.375*2=0.75 => whole number is 0
.75*2=1.5 => whole number is 1
.5*2=1.0 => whole number is 1
Hence .6875D = .1011B
Combine, 18.6875D = 10010.1011B

Example 2: Decimal to Hexadecimal

Convert 18.6875D to hexadecimal


Integral Part = 18D
18/16 => quotient=1 remainder=2
1/16 => quotient=0 remainder=1 (quotient=0 stop)
Hence, 18D = 12H
Fractional Part = .6875D
.6875*16=11.0 => whole number is 11D (BH)
Hence .6875D = .BH
Combine, 18.6875D = 12.BH
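The repeated-multiplication rule for the fractional part can be sketched as follows; the loop is capped with a digit limit because many fractions do not terminate in the target radix.

public class FractionDemo {
    static final String DIGITS = "0123456789ABCDEF";

    // Convert a fraction in [0, 1) to the given radix by repeated multiplication,
    // collecting the integral parts in order. Stops after maxDigits places.
    static String fractionToRadix(double fraction, int radix, int maxDigits) {
        StringBuilder sb = new StringBuilder(".");
        for (int i = 0; i < maxDigits && fraction > 0; i++) {
            fraction *= radix;
            int digit = (int) fraction;      // the whole-number part is the next digit
            sb.append(DIGITS.charAt(digit));
            fraction -= digit;               // keep only the fractional part for the next step
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(fractionToRadix(0.6875, 2, 8));   // .1011 (Example 1)
        System.out.println(fractionToRadix(0.6875, 16, 8));  // .B    (Example 2)
    }
}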

1.9 Exercises (Number Systems Conversion)


1. Convert the following decimal numbers into binary and hexadecimal numbers:
a. 108 b.4848 c.9000
2. Convert the following binary numbers into hexadecimal and decimal numbers:
a. 1000011000 b.10000000 c.101010101010
3. Convert the following hexadecimal numbers into binary and decimal numbers:
a. ABCDE b. 1234 c.80F
4. Convert the following decimal numbers into binary equivalent:
a.19.25D b.123.456D
Answers: You could use the Windows' Calculator (calc.exe) to carry out number system conversion, by
setting it to the Programmer or scientific mode. (Run "calc" ⇒ Select "Settings" menu ⇒ Choose
"Programmer" or "Scientific" mode.)
1. 1101100B, 1001011110000B, 10001100101000B, 6CH, 12F0H, 2328H.
2. 218H, 80H, AAAH, 536D, 128D, 2730D.
3. 10101011110011011110B, 1001000110100B, 100000001111B, 703710D, 4660D, 2063D.
4. ?? (You work it out!)

2. Computer Memory & Data Representation


A computer uses a fixed number of bits to represent a piece of data, which could be a number, a character, or
something else. An n-bit storage location can represent up to 2^n distinct entities. For example, a 3-bit memory
location can hold one of these eight binary patterns: 000, 001, 010, 011, 100, 101, 110, or 111. Hence, it can
represent at most 8 distinct entities. You could use them to represent the numbers 0 to 7, the numbers 8881 to
8888, the characters 'A' to 'H', or up to 8 kinds of fruit such as apple, orange, banana; or up to 8 kinds of animals
such as lion, tiger, etc.
Integers, for example, can be represented in 8-bit, 16-bit, 32-bit or 64-bit. You, as the programmer, choose
an appropriate bit-length for your integers. Your choice will impose constraint on the range of integers that
can be represented. Besides the bit-length, an integer can be represented in various representation schemes,
e.g., unsigned vs. signed integers. An 8-bit unsigned integer has a range of 0 to 255, while an 8-bit signed
integer has a range of -128 to 127 - both representing 256 distinct numbers.
It is important to note that a computer memory location merely stores a binary pattern. It is entirely up to
you, as the programmer, to decide on how these patterns are to be interpreted. For example, the 8-bit binary
pattern "0100 0001B" can be interpreted as an unsigned integer 65, or an ASCII character 'A', or some secret
information known only to you. In other words, you have to first decide how to represent a piece of data in
a binary pattern before the binary patterns make sense. The interpretation of binary pattern is called data
representation or encoding. Furthermore, it is important that the data representation schemes are agreed-
upon by all the parties, i.e., industrial standards need to be formulated and straightly followed.
Once you decided on the data representation scheme, certain constraints, in particular, the precision and
range will be imposed. Hence, it is important to understand data representation to write correct and high-
performance programs.
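A quick illustration of the point made above: the same 8-bit pattern yields different "values" depending on the interpretation you choose.

public class InterpretationDemo {
    public static void main(String[] args) {
        int pattern = 0b01000001;                          // the 8-bit pattern 0100 0001B

        System.out.println(pattern);                       // 65  (unsigned integer view)
        System.out.println((char) pattern);                // A   (ASCII / Latin-1 character view)
        System.out.println(Integer.toHexString(pattern));  // 41  (hexadecimal shorthand)
    }
}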
3. Integer Representation
Integers are whole numbers or fixed-point numbers with the radix point fixed after the least-significant bit.
They contrast with real numbers or floating-point numbers, where the position of the radix point varies. It
is important to note that integers and floating-point numbers are treated differently in computers: they
have different representations and are processed differently (e.g., floating-point numbers are processed in a
so-called floating-point processor). Floating-point numbers will be discussed later.
Computers use a fixed number of bits to represent an integer. The commonly-used bit-lengths for integers
are 8-bit, 16-bit, 32-bit or 64-bit. Besides bit-lengths, there are two representation schemes for integers:
1. Unsigned Integers: can represent zero and positive integers.
2. Signed Integers: can represent zero, positive and negative integers. Three representation schemes
had been proposed for signed integers:
a. Sign-Magnitude representation
b. 1's Complement representation
c. 2's Complement representation
You, as the programmer, need to decide on the bit-length and representation scheme for your integers,
depending on your application's requirements. Suppose that you need a counter for counting a small
quantity from 0 up to 200, you might choose the 8-bit unsigned integer scheme as there are no negative
numbers involved.
3.1 n-bit Unsigned Integers
Unsigned integers can represent zero and positive integers, but not negative integers. The value of an
unsigned integer is interpreted as "the magnitude of its underlying binary pattern".
Example 1: Suppose that n=8 and the binary pattern is 0100 0001B, the value of this unsigned integer
is 1×2^0 + 1×2^6 = 65D.
Example 2: Suppose that n=16 and the binary pattern is 0001 0000 0000 1000B, the value of this
unsigned integer is 1×2^3 + 1×2^12 = 4104D.
Example 3: Suppose that n=16 and the binary pattern is 0000 0000 0000 0000B, the value of this
unsigned integer is 0.
An n-bit pattern can represent 2^n distinct integers. An n-bit unsigned integer can represent integers
from 0 to (2^n)-1, as tabulated below:
n Minimum Maximum

8 0 (2^8)-1 (=255)

16 0 (2^16)-1 (=65,535)

32 0 (2^32)-1 (=4,294,967,295) (9+ digits)

64 0 (2^64)-1 (=18,446,744,073,709,551,615) (19+ digits)

3.2 Signed Integers


Signed integers can represent zero, positive integers, as well as negative integers. Three representation
schemes are available for signed integers:
1. Sign-Magnitude representation
2. 1's Complement representation
3. 2's Complement representation
In all the above three schemes, the most-significant bit (msb) is called the sign bit. The sign bit is used to
represent the sign of the integer - with 0 for positive integers and 1 for negative integers. The magnitude of
the integer, however, is interpreted differently in different schemes.
3.3 n-bit Sign Integers in Sign-Magnitude Representation
In sign-magnitude representation:
 The most-significant bit (msb) is the sign bit, with value of 0 representing positive integer and 1
representing negative integer.
 The remaining n-1 bits represents the magnitude (absolute value) of the integer. The absolute value of
the integer is interpreted as "the magnitude of the (n-1)-bit binary pattern".
Example 1 : Suppose that n=8 and the binary representation is 0 100 0001B.
Sign bit is 0 ⇒ positive
Absolute value is 100 0001B = 65D
Hence, the integer is +65D
Example 2 : Suppose that n=8 and the binary representation is 1 000 0001B.
Sign bit is 1 ⇒ negative
Absolute value is 000 0001B = 1D
Hence, the integer is -1D
Example 3 : Suppose that n=8 and the binary representation is 0 000 0000B.
Sign bit is 0 ⇒ positive
Absolute value is 000 0000B = 0D
Hence, the integer is +0D
Example 4 : Suppose that n=8 and the binary representation is 1 000 0000B.
Sign bit is 1 ⇒ negative
Absolute value is 000 0000B = 0D
Hence, the integer is -0D
The drawbacks of sign-magnitude representation are:
1. There are two representations (0000 0000B and 1000 0000B) for the number zero, which could lead
to inefficiency and confusion.
2. Positive and negative integers need to be processed separately.
3.4 n-bit Sign Integers in 1's Complement Representation
In 1's complement representation:
 Again, the most significant bit (msb) is the sign bit, with value of 0 representing positive integers and
1 representing negative integers.
 The remaining n-1 bits represents the magnitude of the integer, as follows:
o for positive integers, the absolute value of the integer is equal to "the magnitude of the (n-1)-bit
binary pattern".
o for negative integers, the absolute value of the integer is equal to "the magnitude of
the complement (inverse) of the (n-1)-bit binary pattern" (hence called 1's complement).
Example 1 : Suppose that n=8 and the binary representation 0 100 0001B.
Sign bit is 0 ⇒ positive
Absolute value is 100 0001B = 65D
Hence, the integer is +65D
Example 2 : Suppose that n=8 and the binary representation 1 000 0001B.
Sign bit is 1 ⇒ negative
Absolute value is the complement of 000 0001B, i.e., 111 1110B = 126D
Hence, the integer is -126D
Example 3 : Suppose that n=8 and the binary representation 0 000 0000B.
Sign bit is 0 ⇒ positive
Absolute value is 000 0000B = 0D
Hence, the integer is +0D
Example 4 : Suppose that n=8 and the binary representation 1 111 1111B.
Sign bit is 1 ⇒ negative
Absolute value is the complement of 111 1111B, i.e., 000 0000B = 0D
Hence, the integer is -0D
Again, the drawbacks are:
1. There are two representations (0000 0000B and 1111 1111B) for zero.
2. The positive integers and negative integers need to be processed separately.
3.5 n-bit Sign Integers in 2's Complement Representation
In 2's complement representation:
 Again, the most significant bit (msb) is the sign bit, with value of 0 representing positive integers and
1 representing negative integers.
 The remaining n-1 bits represents the magnitude of the integer, as follows:
o for positive integers, the absolute value of the integer is equal to "the magnitude of the (n-1)-bit
binary pattern".
o for negative integers, the absolute value of the integer is equal to "the magnitude of
the complement of the (n-1)-bit binary pattern plus one" (hence called 2's complement).
Example 1 : Suppose that n=8 and the binary representation 0 100 0001B.
Sign bit is 0 ⇒ positive
Absolute value is 100 0001B = 65D
Hence, the integer is +65D
Example 2 : Suppose that n=8 and the binary representation 1 000 0001B.
Sign bit is 1 ⇒ negative
Absolute value is the complement of 000 0001B plus 1, i.e., 111 1110B + 1B = 127D
Hence, the integer is -127D
Example 3 : Suppose that n=8 and the binary representation 0 000 0000B.
Sign bit is 0 ⇒ positive
Absolute value is 000 0000B = 0D
Hence, the integer is +0D
Example 4 : Suppose that n=8 and the binary representation 1 111 1111B.
Sign bit is 1 ⇒ negative
Absolute value is the complement of 111 1111B plus 1, i.e., 000 0000B + 1B = 1D
Hence, the integer is -1D
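To compare the three schemes side by side, here is a small sketch that decodes an 8-bit pattern under sign-magnitude, 1's complement and 2's complement; it follows the rules described above rather than any library routine.

public class SignedDecodeDemo {
    // Each method takes an 8-bit pattern held in the low 8 bits of an int.
    static int signMagnitude(int bits) {
        int magnitude = bits & 0x7F;                        // low 7 bits are the magnitude
        return (bits & 0x80) == 0 ? magnitude : -magnitude; // bit 7 is the sign
    }
    static int onesComplement(int bits) {
        return (bits & 0x80) == 0 ? bits : -((~bits) & 0x7F);        // negative: complement the 7 bits
    }
    static int twosComplement(int bits) {
        return (bits & 0x80) == 0 ? bits : -(((~bits) & 0xFF) + 1);  // negative: complement, then add 1
    }

    public static void main(String[] args) {
        int pattern = 0b10000001;                      // 1000 0001B
        System.out.println(signMagnitude(pattern));    // -1
        System.out.println(onesComplement(pattern));   // -126
        System.out.println(twosComplement(pattern));   // -127
    }
}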
3.6 Computers use 2's Complement Representation for Signed Integers
We have discussed three representations for signed integers: signed-magnitude, 1's complement and 2's
complement. Computers use 2's complement in representing signed integers. This is because:
1. There is only one representation for the number zero in 2's complement, instead of two
representations in sign-magnitude and 1's complement.
2. Positive and negative integers can be treated together in addition and subtraction. Subtraction can be
carried out using the "addition logic".
Example 1: Addition of Two Positive Integers: Suppose that n=8, 65D + 5D = 70D
65D → 0100 0001B
5D → 0000 0101B(+
0100 0110B → 70D (OK)

Example 2: Subtraction is treated as Addition of a Positive and a Negative Integer:
Suppose that n=8, 65D - 5D = 65D + (-5D) = 60D
65D → 0100 0001B
-5D → 1111 1011B(+
0011 1100B → 60D (discard carry - OK)

Example 3: Addition of Two Negative Integers: Suppose that n=8, -65D - 5D = (-65D) + (-5D) = -70D

-65D → 1011 1111B


-5D → 1111 1011B(+
1011 1010B → -70D (discard carry - OK)

Because of the fixed precision (i.e., fixed number of bits), an n-bit 2's complement signed integer has a
certain range. For example, for n=8, the range of 2's complement signed integers is -128 to +127. During
addition (and subtraction), it is important to check whether the result exceeds this range, in other words,
whether overflow or underflow has occurred.
Example 4: Overflow: Suppose that n=8, 127D + 2D = 129D (overflow - beyond the range)
127D → 0111 1111B
2D → 0000 0010B(+
1000 0001B → -127D (wrong)

Example 5: Underflow: Suppose that n=8, -125D - 5D = -130D (underflow - below the range)
-125D → 1000 0011B
-5D → 1111 1011B(+
0111 1110B → +126D (wrong)

The following diagram explains how the 2's complement works. By re-arranging the number line, values
from -128 to +127 are represented contiguously by ignoring the carry bit.
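Here is a sketch of 8-bit 2's complement addition that discards the carry and flags overflow; it reproduces the worked examples above (casting an int sum back to Java's byte type wraps in the same way, but silently).

public class OverflowDemo {
    // Add two 8-bit 2's complement values (given as ordinary ints in -128..127).
    static void add8(int a, int b) {
        int result = ((a + b) << 24) >> 24;   // keep only the low 8 bits, sign-extended (carry discarded)
        // Overflow: both operands have the same sign but the result's sign differs.
        boolean overflow = (a >= 0) == (b >= 0) && (result >= 0) != (a >= 0);
        System.out.println(a + " + " + b + " = " + result + (overflow ? "  (overflow!)" : ""));
    }

    public static void main(String[] args) {
        add8(65, 5);     // 70
        add8(65, -5);    // 60
        add8(-65, -5);   // -70
        add8(127, 2);    // -127 (overflow!)
        add8(-125, -5);  // 126  (overflow!)
    }
}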

3.7 Range of n-bit 2's Complement Signed Integers


An n-bit 2's complement signed integer can represent integers from -2^(n-1) to +2^(n-1)-1, as tabulated.
Take note that the scheme can represent all the integers within the range, without any gap. In other words,
there is no missing integers within the supported range.
n    Minimum                                   Maximum

8    -(2^7)  (= -128)                          +(2^7)-1  (= +127)

16   -(2^15) (= -32,768)                       +(2^15)-1 (= +32,767)

32   -(2^31) (= -2,147,483,648)                +(2^31)-1 (= +2,147,483,647) (9+ digits)

64   -(2^63) (= -9,223,372,036,854,775,808)    +(2^63)-1 (= +9,223,372,036,854,775,807) (18+ digits)
3.8 Decoding 2's Complement Numbers
1. Check the sign bit (denoted as S).
2. If S=0, the number is positive and its absolute value is the binary value of the remaining n-1 bits.
3. If S=1, the number is negative. You could "invert the n-1 bits and plus 1" to get the absolute value of the
negative number.
Alternatively, you could scan the remaining n-1 bits from the right (least-significant bit). Look for
the first occurrence of 1. Flip all the bits to the left of that first occurrence of 1. The flipped pattern
gives the absolute value. For example,

n = 8, bit pattern = 1 100 0100B
S = 1 → negative
Scanning from the right and flipping all the bits to the left of the first occurrence of 1 ⇒ 011 1100B = 60D

Hence, the value is -60D

3.9 Big Endian vs. Little Endian


Modern computers store one byte of data in each memory address or location, i.e., memory is byte-addressable.
A 32-bit integer is therefore stored in 4 memory addresses.
The term "endian" refers to the order of storing bytes in computer memory. In the "big endian" scheme, the
most significant byte is stored first, in the lowest memory address (big end first), while "little endian"
stores the least significant byte in the lowest memory address.
For example, the 32-bit integer 12345678H (305,419,896 in decimal) is stored as 12H 34H 56H 78H in big endian,
and as 78H 56H 34H 12H in little endian. The 16-bit sequence 00H 01H is interpreted as 0001H in big endian,
and as 0100H in little endian.
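Java's ByteBuffer can demonstrate the two byte orders directly; a minimal sketch:

import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class EndianDemo {
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) sb.append(String.format("%02X ", b));
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        int value = 0x12345678;

        // Big endian: most significant byte in the lowest address.
        byte[] big = ByteBuffer.allocate(4).order(ByteOrder.BIG_ENDIAN).putInt(value).array();
        System.out.println("big endian:    " + toHex(big));     // 12 34 56 78

        // Little endian: least significant byte in the lowest address.
        byte[] little = ByteBuffer.allocate(4).order(ByteOrder.LITTLE_ENDIAN).putInt(value).array();
        System.out.println("little endian: " + toHex(little));  // 78 56 34 12
    }
}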
3.10 Exercise (Integer Representation)
1. What are the ranges of 8-bit, 16-bit, 32-bit and 64-bit integer, in "unsigned" and "signed"
representation?
2. Give the value of 88, 0, 1, 127, and 255 in 8-bit unsigned representation.
3. Give the value of +88, -88 , -1, 0, +1, -128, and +127 in 8-bit 2's complement signed representation.
4. Give the value of +88, -88 , -1, 0, +1, -127, and +127 in 8-bit sign-magnitude representation.
5. Give the value of +88, -88 , -1, 0, +1, -127 and +127 in 8-bit 1's complement representation.
6. [TODO] more.
Answers
1. The range of unsigned n-bit integers is [0, 2^n - 1]. The range of n-bit 2's complement signed integer
is [-2^(n-1), +2^(n-1)-1];
2. 88 (0101 1000), 0 (0000 0000), 1 (0000 0001), 127 (0111 1111), 255 (1111 1111).
3. +88 (0101 1000), -88 (1010 1000), -1 (1111 1111), 0 (0000 0000), +1 (0000 0001), -128 (1000
0000), +127 (0111 1111).
4. +88 (0101 1000), -88 (1101 1000), -1 (1000 0001), 0 (0000 0000 or 1000 0000), +1 (0000 0001), -127
(1111 1111), +127 (0111 1111).
5. +88 (0101 1000), -88 (1010 0111), -1 (1111 1110), 0 (0000 0000 or 1111 1111), +1 (0000 0001), -127
(1000 0000), +127 (0111 1111).
4. Floating-Point Number Representation
A floating-point number (or real number) can represent a very large value (1.23×10^88) or a very small
value (1.23×10^-88). It can also represent a very large negative number (-1.23×10^88) and a very small
negative number (-1.23×10^-88), as well as zero, as illustrated:
A floating-point number is typically expressed in the scientific notation, with a fraction (F), and
an exponent (E) of a certain radix (r), in the form of F×r^E. Decimal numbers use radix of 10 (F×10^E);
while binary numbers use radix of 2 (F×2^E).
Representation of floating point number is not unique. For example, the number 55.66 can be represented
as 5.566×10^1, 0.5566×10^2, 0.05566×10^3, and so on. The fractional part can be normalized. In the
normalized form, there is only a single non-zero digit before the radix point. For example, decimal
number 123.4567 can be normalized as 1.234567×10^2; binary number 1010.1011B can be normalized
as 1.0101011B×2^3.
It is important to note that floating-point numbers suffer from loss of precision when represented with a
fixed number of bits (e.g., 32-bit or 64-bit). This is because there is an infinite number of real numbers (even
within a small range, say 0.0 to 0.1). On the other hand, an n-bit binary pattern can represent only
a finite number, 2^n, of distinct values. Hence, not all real numbers can be represented; the nearest approximation
is used instead, resulting in a loss of accuracy.
It is also important to note that floating-point arithmetic is much less efficient than integer
arithmetic. It can be sped up with a dedicated floating-point co-processor. Hence, use integers
if your application does not require floating-point numbers.
In computers, floating-point numbers are represented in scientific notation of fraction (F) and exponent (E)
with a radix of 2, in the form of F×2^E. Both E and F can be positive as well as negative. Modern
computers adopt IEEE 754 standard for representing floating-point numbers. There are two representation
schemes: 32-bit single-precision and 64-bit double-precision.
4.1 IEEE-754 32-bit Single-Precision Floating-Point Numbers
In 32-bit single-precision floating-point representation:
 The most significant bit is the sign bit (S), with 0 for positive numbers and 1 for negative numbers.
 The following 8 bits represent exponent (E).
 The remaining 23 bits represents fraction (F).

Normalized Form
Let's illustrate with an example, suppose that the 32-bit pattern is 1 1000 0001 011 0000 0000 0000
0000 0000, with:
 S=1
 E = 1000 0001
 F = 011 0000 0000 0000 0000 0000
In the normalized form, the actual fraction is normalized with an implicit leading 1 in the form of 1.F. In
this example, the actual fraction is 1.011 0000 0000 0000 0000 0000 = 1 + 1×2^-2 + 1×2^-3 = 1.375D.
The sign bit represents the sign of the number, with S=0 for positive and S=1 for negative number. In this
example with S=1, this is a negative number, i.e., -1.375D.
In normalized form, the actual exponent is E-127 (so-called excess-127 or bias-127). This is because we
need to represent both positive and negative exponent. With an 8-bit E, ranging from 0 to 255, the excess-
127 scheme could provide actual exponent of -127 to 128. In this example, E-127=129-127=2D.
Hence, the number represented is -1.375×2^2=-5.5D.
De-Normalized Form
Normalized form has a serious problem, with an implicit leading 1 for the fraction, it cannot represent the
number zero! Convince yourself on this!
De-normalized form was devised to represent zero and other numbers.
For E=0, the numbers are in the de-normalized form. An implicit leading 0 (instead of 1) is used for the
fraction; and the actual exponent is always -126. Hence, the number zero can be represented
with E=0 and F=0 (because 0.0×2^-126=0).
We can also represent very small positive and negative numbers in de-normalized form with E=0. For
example, if S=1, E=0, and F=011 0000 0000 0000 0000 0000. The actual fraction is 0.011=1×2^-2+1×2^-
3=0.375D. Since S=1, it is a negative number. With E=0, the actual exponent is -126. Hence the number is -
0.375×2^-126 = -4.4×10^-39, which is an extremely small negative number (close to zero).
Summary
In summary, the value (N) is calculated as follows:
 For 1 ≤ E ≤ 254, N = (-1)^S × 1.F × 2^(E-127). These numbers are in the so-called normalized form.
The sign-bit represents the sign of the number. Fractional part (1.F) are normalized with an implicit
leading 1. The exponent is bias (or in excess) of 127, so as to represent both positive and negative
exponent. The range of exponent is -126 to +127.
 For E = 0, N = (-1)^S × 0.F × 2^(-126). These numbers are in the so-called denormalized form. The
exponent of 2^-126 evaluates to a very small number. Denormalized form is needed to represent zero
(with F=0 and E=0). It can also represents very small positive and negative number close to zero.
 For E = 255, it represents special values, such as ±INF (positive and negative infinity) and NaN (not a
number). This is beyond the scope of this article.
Example 1: Suppose that IEEE-754 32-bit floating-point representation pattern is 0 10000000 110
0000 0000 0000 0000 0000.
Sign bit S = 0 ⇒ positive number
E = 1000 0000B = 128D (in normalized form)
Fraction is 1.11B (with an implicit leading 1) = 1 + 1×2^-1 + 1×2^-2 = 1.75D
The number is +1.75 × 2^(128-127) = +3.5D

Example 2: Suppose that IEEE-754 32-bit floating-point representation pattern is 1 01111110 100
0000 0000 0000 0000 0000.
Sign bit S = 1 ⇒ negative number
E = 0111 1110B = 126D (in normalized form)
Fraction is 1.1B (with an implicit leading 1) = 1 + 2^-1 = 1.5D
The number is -1.5 × 2^(126-127) = -0.75D

Example 3: Suppose that IEEE-754 32-bit floating-point representation pattern is 1 01111110 000
0000 0000 0000 0000 0001.
Sign bit S = 1 ⇒ negative number
E = 0111 1110B = 126D (in normalized form)
Fraction is 1.000 0000 0000 0000 0000 0001B (with an implicit leading 1) = 1 + 2^-23
The number is -(1 + 2^-23) × 2^(126-127) = -0.500000059604644775390625 (may not be exact in decimal!)
Example 4 (De-Normalized Form): Suppose that IEEE-754 32-bit floating-point representation
pattern is 1 00000000 000 0000 0000 0000 0000 0001.

Sign bit S = 1 ⇒ negative number


E = 0 (in de-normalized form)
Fraction is 0.000 0000 0000 0000 0000 0001B (with an implicit leading 0) = 1×2^-23
The number is -2^-23 × 2^(-126) = -2^(-149) ≈ -1.4×10^-45
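The decoding rules above can be checked with a small sketch that splits a 32-bit pattern into S, E and F and applies the normalized/denormalized formulas (Float.intBitsToFloat, shown in the note below, gives the same results). It deliberately ignores the E=255 special values.

public class Ieee754Demo {
    // Decode a 32-bit IEEE-754 single-precision pattern by hand (special values E=255 not handled).
    static double decode(int bits) {
        int s = (bits >>> 31) & 0x1;            // sign bit
        int e = (bits >>> 23) & 0xFF;           // 8-bit exponent field
        int f = bits & 0x7FFFFF;                // 23-bit fraction field
        double fraction = f / (double) (1 << 23);
        double value = (e == 0)
                ? fraction * Math.pow(2, -126)           // denormalized: 0.F × 2^-126
                : (1 + fraction) * Math.pow(2, e - 127); // normalized:   1.F × 2^(E-127)
        return s == 0 ? value : -value;
    }

    public static void main(String[] args) {
        System.out.println(decode(0b0_10000000_11000000000000000000000)); // +3.5  (Example 1)
        System.out.println(decode(0b1_01111110_10000000000000000000000)); // -0.75 (Example 2)
    }
}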

4.2 Exercises (Floating-point Numbers)


1. Compute the largest and smallest positive numbers that can be represented in the 32-bit normalized
form.
2. Compute the largest and smallest negative numbers can be represented in the 32-bit normalized
form.
3. Repeat (1) for the 32-bit denormalized form.
4. Repeat (2) for the 32-bit denormalized form.
Hints:
1. Largest positive number: S=0, E=1111 1110 (254), F=111 1111 1111 1111 1111 1111.
Smallest positive number: S=0, E=0000 0001 (1), F=000 0000 0000 0000 0000 0000.
2. Same as above, but S=1.
3. Largest positive number: S=0, E=0, F=111 1111 1111 1111 1111 1111 .
Smallest positive number: S=0, E=0, F=000 0000 0000 0000 0000 0001.
4. Same as above, but S=1.
Notes For Java Users
You can use JDK methods Float.intBitsToFloat(int bits) or Double.longBitsToDouble(long bits) to create a
single-precision 32-bit float or double-precision 64-bit double with the specific bit patterns, and print their
values. For examples,

System.out.println(Float.intBitsToFloat(0x7fffff));            // E=0, F=all 1s: largest denormalized float ≈ 1.1754942E-38
System.out.println(Double.longBitsToDouble(0x1fffffffffffffL)); // E=1, F=all 1s: (2 - 2^-52) × 2^-1022 ≈ 4.45E-308

4.3 IEEE-754 64-bit Double-Precision Floating-Point Numbers


The representation scheme for 64-bit double-precision is similar to the 32-bit single-precision:
 The most significant bit is the sign bit (S), with 0 for positive numbers and 1 for negative numbers.
 The following 11 bits represent exponent (E).
 The remaining 52 bits represents fraction (F).

The value (N) is calculated as follows:


 Normalized form: For 1 ≤ E ≤ 2046, N = (-1)^S × 1.F × 2^(E-1023).
 Denormalized form: For E = 0, N = (-1)^S × 0.F × 2^(-1022). These are in the denormalized form.
 For E = 2047, N represents special values, such as ±INF (infinity), NaN (not a number).
4.4 More on Floating-Point Representation
There are three parts in the floating-point representation:
 The sign bit (S) is self-explanatory (0 for positive numbers and 1 for negative numbers).
 For the exponent (E), a so-called bias (or excess) is applied so as to represent both positive and
negative exponent. The bias is set at half of the range. For single precision with an 8-bit exponent, the
bias is 127 (or excess-127). For double precision with a 11-bit exponent, the bias is 1023 (or excess-
1023).
 The fraction (F) (also called the mantissa or significand) is composed of an implicit leading bit (before
the radix point) and the fractional bits (after the radix point). The leading bit for normalized numbers
is 1; while the leading bit for denormalized numbers is 0.
Normalized Floating-Point Numbers
In normalized form, the radix point is placed after the first non-zero digit, e.g., 9.8765D×10^-23D,
1.001011B×2^11B. For a binary number, the leading bit is always 1 and need not be represented
explicitly; this saves 1 bit of storage.
In IEEE 754's normalized form:
 For single-precision, 1 ≤ E ≤ 254 with excess of 127. Hence, the actual exponent is from -126 to +127.
Negative exponents are used to represent small numbers (< 1.0); while positive exponents are used to
represent large numbers (> 1.0).
N = (-1)^S × 1.F × 2^(E-127)
 For double-precision, 1 ≤ E ≤ 2046 with excess of 1023. The actual exponent is from -1022 to +1023,
and
N = (-1)^S × 1.F × 2^(E-1023)
Take note that an n-bit pattern has a finite number of combinations (=2^n), so it can represent only finitely many
distinct numbers. It is not possible to represent the infinitely many numbers on the real axis (even a small range,
say 0.0 to 1.0, contains infinitely many numbers). That is, not all floating-point numbers can be accurately
represented; instead, the closest approximation is used, which leads to loss of accuracy.
The minimum and maximum normalized floating-point numbers are:
Single precision:
  N(min) = 0080 0000H = 0 00000001 00000000000000000000000B (E = 1, F = 0)
         = 1.0B × 2^-126 (≈1.17549435 × 10^-38)
  N(max) = 7F7F FFFFH = 0 11111110 11111111111111111111111B (E = 254, F = all 1s)
         = 1.1...1B × 2^127 = (2 - 2^-23) × 2^127 (≈3.4028235 × 10^38)

Double precision:
  N(min) = 0010 0000 0000 0000H = 1.0B × 2^-1022 (≈2.2250738585072014 × 10^-308)
  N(max) = 7FEF FFFF FFFF FFFFH = 1.1...1B × 2^1023 = (2 - 2^-52) × 2^1023 (≈1.7976931348623157 × 10^308)
Denormalized Floating-Point Numbers
If E = 0, but the fraction is non-zero, then the value is in denormalized form, and a leading bit of 0 is
assumed, as follows:
 For single-precision, E = 0,
N = (-1)^S × 0.F × 2^(-126)
 For double-precision, E = 0,
N = (-1)^S × 0.F × 2^(-1022)
Denormalized form can represent very small numbers close to zero, and zero itself, which cannot be represented
in normalized form, as shown in the above figure.
The minimum and maximum of denormalized floating-point numbers are:

Single precision:
  D(min) = 0000 0001H = 0 00000000 00000000000000000000001B (E = 0, F = 0...01)
         = 0.0...1B × 2^-126 = 1 × 2^-23 × 2^-126 = 2^-149 (≈1.4 × 10^-45)
  D(max) = 007F FFFFH = 0 00000000 11111111111111111111111B (E = 0, F = all 1s)
         = 0.1...1B × 2^-126 = (1 - 2^-23) × 2^-126 (≈1.1754942 × 10^-38)

Double precision:
  D(min) = 0000 0000 0000 0001H = 0.0...1B × 2^-1022 = 1 × 2^-52 × 2^-1022 = 2^-1074 (≈4.9 × 10^-324)
  D(max) = 001F FFFF FFFF FFFFH = 0.1...1B × 2^-1022 = (1 - 2^-52) × 2^-1022 (≈2.2250738585072009 × 10^-308)

Special Values
Zero: Zero cannot be represented in the normalized form, and must be represented in denormalized form
with E=0 and F=0. There are two representations for zero: +0 with S=0 and -0 with S=1.
Infinity: The value of +infinity (e.g., 1/0) and -infinity (e.g., -1/0) are represented with an exponent of all
1's (E = 255 for single-precision and E = 2047 for double-precision), F=0, and S=0 (for +INF) and S=1 (for -
INF).
Not a Number (NaN): NaN denotes a value that cannot be represented as real number (e.g. 0/0). NaN is
represented with Exponent of all 1's (E = 255 for single-precision and E = 2047 for double-precision) and
any non-zero fraction.
5. Character Encoding
In computer memory, characters are "encoded" (or "represented") using a chosen "character encoding
scheme" (aka "character set", "charset", "character map", or "code page").
For example, in ASCII (as well as Latin1, Unicode, and many other character sets):
 code numbers 65D (41H) to 90D (5AH) represents 'A' to 'Z', respectively.
 code numbers 97D (61H) to 122D (7AH) represents 'a' to 'z', respectively.
 code numbers 48D (30H) to 57D (39H) represents '0' to '9', respectively.
It is important to note that the representation scheme must be known before a binary pattern can be
interpreted. E.g., the 8-bit pattern "0100 0010B" could represent anything under the sun, known only to the
person who encoded it.
The most commonly-used character encoding schemes are: 7-bit ASCII (ISO/IEC 646) and 8-bit Latin-x
(ISO/IEC 8859-x) for western european characters, and Unicode (ISO/IEC 10646) for internationalization
(i18n).
A 7-bit encoding scheme (such as ASCII) can represent 128 characters and symbols. An 8-bit character
encoding scheme (such as Latin-x) can represent 256 characters and symbols, whereas a 16-bit encoding
scheme (such as Unicode UCS-2) can represent 65,536 characters and symbols.
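A short sketch of how the same characters map to code numbers and bytes under different encodings, using Java's standard charsets:

import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class CharsetDemo {
    public static void main(String[] args) {
        // The code numbers listed above, obtained by treating characters as integers.
        System.out.println((int) 'A');   // 65  (41H)
        System.out.println((int) 'a');   // 97  (61H)
        System.out.println((int) '0');   // 48  (30H)

        // The same string encoded under different schemes gives different byte sequences.
        String s = "A";
        System.out.println(Arrays.toString(s.getBytes(StandardCharsets.US_ASCII)));  // [65]
        System.out.println(Arrays.toString(s.getBytes(StandardCharsets.UTF_16BE)));  // [0, 65]
    }
}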

6. Summary - Why Bother about Data Representation?


Integer number 1, floating-point number 1.0, character symbol '1', and string "1" are totally different inside
the computer memory. You need to know the difference to write good and high-performance programs.
 In 8-bit signed integer, integer number 1 is represented as 00000001B.
 In 8-bit unsigned integer, integer number 1 is represented as 00000001B.
 In 16-bit signed integer, integer number 1 is represented as 00000000 00000001B.
 In 32-bit signed integer, integer number 1 is represented as 00000000 00000000 00000000 00000001B.
 In 32-bit floating-point representation, number 1.0 is represented as 0 01111111 0000000 00000000
00000000B, i.e., S=0, E=127, F=0.
 In 64-bit floating-point representation, number 1.0 is represented as 0 01111111111 0000 00000000
00000000 00000000 00000000 00000000 00000000B , i.e., S=0, E=1023, F=0.
 In 8-bit Latin-1, the character symbol '1' is represented as 00110001B (or 31H).
 In 16-bit UCS-2, the character symbol '1' is represented as 00000000 00110001B.
 In UTF-8, the character symbol '1' is represented as 00110001B.
If you "add" a 16-bit signed integer 1 and Latin-1 character '1' or a string "1", you could get a surprise.
6.1 Exercises (Data Representation)
For the following 16-bit codes:

0000 0000 0010 1010;


1000 0000 0010 1010;

Give their values, if they are representing:


1. a 16-bit unsigned integer;
2. a 16-bit signed integer;
3. two 8-bit unsigned integers;
4. two 8-bit signed integers;
Ans: (1) 42, 32810; (2) 42, -32726; (3) 0, 42; 128, 42; (4) 0, 42; -128, 42.
