0% found this document useful (0 votes)
4 views

Software Notes

Uploaded by

kolagdeshpande
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
4 views

Software Notes

Uploaded by

kolagdeshpande
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 55

“Getting Started” Roadmap

High-Level Language (SYSC 2006)

Towards the Software 2 This slide deck


Towards the Software
SYSC 3006 Assembly Language
Computer Instruction Set Architecture 3
Assumes prerequisite knowledge from Organization Micro-architecture
ELEC 2607 and SYSC 2006
SYSC 2006
Towards the Hardware 1
Computer System Logic Circuits (ELEC 2607)
SYSC 2006

ELEC 2607 Hardware/Software (H/S) Interface Computer System

Dec 3, 2015 ELEC 2607 2

Target: Assembly Language Compiler


The “native” language of the processor HLL is abstract as processor independent as possible
The instructions that can be executed by the Hides connection to assembly language
processor Allows programmer to focus on solving the problem,
Specific to a particular make/model of processor not the details of the underlying machine
Written in a human-readable form (using an editor) Compiler is a tool that transforms a program written in
Must be “assembled” transformed by Assembler a HLL into an equivalent program for a lower-level
tool into the binary instruction encodings expected by language, like assembly language
the processor’s hardware 1 HLL statement is often transformed into many
Level of abstraction is lower than High-Level Language assembly language statements
(HLL) like C In SYSC 3006: use familiar HLL statements (from SYSC
Many more details! (registers, binary, opcodes) 2006) to motivate the need for various aspects of
assembly language
Dec 3, 2015 3 Dec 3, 2015 4

Review:
IDE
HLL Software Concepts from PreReq’s
Compiler is often integrated into a suite of development Program: data + processing
tools May also include input/output  later!
IDE: Integrated Development Environment Data: (information)
Type
Compiler: transforms HLL to assembly language Variable
Assembler: transforms to “object format” (obj) Pointer
Linker: joins libraries (obj) with program (obj) and Composite type (array and struct)
creates “executable” form (exe) Processing: (instructions)
Sequential execution
Loader: loads executable form into memory Expression
Debugger: controls the execution to allow debugging Conditional: if-then-else, switch
Loop: while, for
Relevant @ H/S Interface Functions, parameters
Dec 3, 2015 5 Dec 3, 2015 6
Data: Type Data: Variable
A description of a set of (information) values + Variable: declared to have a name and be of a specific type
operations on those values
E.G. int X;
E.G. boolean, int, char, float Used in a program to hold (store) dynamic information during
The range of values is limited by the fixed-width program execution
binary values used for encoding Can change (write) the stored value during program
execution using the assignment operator (=)
@ H/S Interface: must understand the binary encoding E.G. X = 7; L-value use of X (Left of =)
of values (constants) and how to manipulate binary Can use (read) the current stored value
values (operations) to get new values E.G. Y = X + 1; R-value use of X (Right of =)
E.G. signed integers @ H/S Interface: variables are Main Memory locations
Have seen some examples of manipulating values Name of variable is associated with the variable’s address
using ALU Need instructions to load and store data values from memory
Dec 3, 2015 7 Dec 3, 2015 8

Data: Pointer Data: Composite Type (array and struct)


Every type has a corresponding pointer type Composite: has a set of “fields”, each field has a type
E.G. boolean *, int *, char *, float * Array: fields must all be of same type and are “indexed”
Values:
Use index to access specific field e.g. alist[i]
• null (constant)
• reference to a variable of associated type Often process in a loop one iteration per field
Operations? = (assignment), == (equality test), Struct: “named” fields that may be of different types
* (dereference … to access referenced variable) Use “.” to access specific field e.g. node.next
@ H/S Interface: the value of a pointer is a Main @ H/S Interface: load and store instructions have a
Memory address (or null) null value? variety of addressing modes specifically intended to
A pointer variable is “just another” variable (previous slide) simplify accessing fields of composite types

Dec 3, 2015 9 Dec 3, 2015 10

Processing: Sequential Execution Processing: Expression


Program execution relies on executing steps in Expressions manipulate values to arrive at new values
sequence An expression generates a value of a particular type
E.G. X = 6; E.G. For ints X & Y: X + Y generates an int value
Y = 8; E.G. For ints X & Y: X < Y generates a boolean value
Z = X + Y; etc. Expressions rely on operation precedence (sequencing):
@ H/S Interface: microarchitecture defaults to E.G. X + Y * Z
execute instructions in sequence @ H/S Interface: must understand the binary encoding
of values and how to manipulate binary values
Similarity is not just a coincidence! (operations) to get new values
Have done some of this with ALU instructions
Dec 3, 2015 11 Dec 3, 2015 12
Processing: Conditional Execution Processing: Loop
if ( bool_exprs )  recall Processing: Expressions while ( cond ) [boolean expression]

{ then-block (only if bool_exprs == true) } { while-block (changes something about cond) }


else { else-block (only if bool_exprs == false) } for ( initial; test; adjust ) [assignment; bool_exprs; assign/expr]
switch (exprs) {  recall Processing: Expressions { for-block }
case constant-exprs : statement(s); @ H/S Interface: need unconditional control flow
break; /* optional */ instructions that can redirect execution back to
case constant-exprs : statement(s); beginning of loop, as well as conditional control flow
instructions to skip over the block when the test fails
etc. }
@ H/S Interface: need conditional control flow
instructions that can redirect execution to a non-
sequential instruction based on a condition (flags!)
Dec 3, 2015 13 Dec 3, 2015 14

Processing: Functions In this Deck: Towards the Software


Function Declaration: 1. Programmer’s Model the software abstraction
general description,
int min ( int X, int Y ) { // X and Y are parameters abstract 2. ALU instructions manipulate (generate/modify) data
if ( X < Y ) { return ( X ); } Registers only
else { return ( Y ); } Constants? Addressing Modes
} ?
3. Load and Store Instructions data movement
Invocation
… specific, instance Moving values between Registers and Main Memory
Z = min ( A, B ); // A and B are arguments Addressing Modes

4. Control Flow Instructions involves PC!
@ H/S Interface: need subroutine control flow instructions
that can redirect execution to function body and Unconditional, Conditional
“remember” how to get back … also need ways to pass PC relative vs. absolute addressing
arguments (including return value) Subroutines more elaborate control flow
Dec 3, 2015 15 Dec 3, 2015 16

When Introducing New Instruction Programmer’s Model Abstraction


Programmer’s model specify effects of instruction The abstraction that is relevant to the execution of
How to realize instruction on microarchitecture instructions as seen from the software side of the H/S
Interface
Add to Hardware as needed!
Instructions (including encoding)
Execution states State before & state after instruction executes
Instruction Encodings the information about the • Registers, Flags and memory contents!
instruction that crosses the H/S Interface Software-Visible Registers: R0 – R15 (PC is R15)
Input to the Control FSM! Flags Doesn’t include temporary registers
Use in program fragments to implement HLL statements IR
Assembly language syntax IR can’t be accessed directly by programs, but
• is relevant to instruction encoding (which is visible
to programmers!)
Included in Programmer’s Model
Dec 3, 2015 17 Dec 3, 2015 18
The Programmer’s Model Reveals of
How to Describe Activity In Processor?
the Underlying Hardware
Register Transfer Language (RTL)
Software
Abstract way to describe operations in terms of
Hardware
movement of data at level of Programmer’s Model
Processor Memory
Syntax: Dest  Source
Control FSM Register Address Bus (4 bits)
Source value is loaded as new contents of Dest
Address A3…A0 CLK 1M x 32-bit Words Dest: could be register or memory word
IR 16 x 32-bit Registers
R Data
Source: could be register or memory word
CLK
W D31…D0
W
Use [ operand ] to indicate “contents of” … but …
Internal Data Bus (32 bits) R
Data “contents of” Dest is implied, i.e. “Dest” is really
D31…D0
“[Dest]”
Flags
• use one less level of [ ] for Dest
Interconnection Bus
Dec 3, 2015 19 Dec 3, 2015 20

Register Transfer Notation Examples Will Now Consider New Instructions!


RTL Meaning IR formats (layout/use of bits)
Transfer (copy) contents of R4 (Source) into 8 opcode bits  up to 256 different instructions!
R5  [ R4 ]
register R5 (Dest) Formats  plural! Many Forms! Different bit layout/use
Transfer (copy) contents of memory at address New instructions will be presented in terms of general forms
R5  MMem[ [ R4 ] ]
contained in R4 into register R5
Form: Numbers / types of operands involved
Transfer (copy) contents of R4 to memory at
MMem[R5]  [ R4 ] E.G. Previous 3 register ALU instructions (e.g. ADD, XOR)
address contained in R5
• Has 3 operands, all 3 are registers  general Form 1
IR  MMem[ [ PC ] ] ☺
E.G. Previous 2 register ALU instructions (e.g. MOV, NOT)
• Has 2 operands, both are registers  general Form 2
Always use one less set of [ brackets ] on the left side of 

Dec 3, 2015 21 Dec 3, 2015 22

Presenting a New Instruction Form Consider these ALU Instructions First


For each form, will give:
Instruction Recall Recall
OpCode (hex)
Informal description & purpose mnemonic Effect at ALU ALU OP (hex)

RTL description Form 3 NOP 00 Do nothing 0000 (0)

Relevant OpCode(s) uniquely identify instruction(s) Form 1 ADD 01 R=X+Y 0001 (1)

IR Format Instruction encoding Form 1 SUB 02 R=X–Y 0010 (2)

Assembly language syntax Form 2 MOV 03 R=Y 0011 (3)

simplifies writing specific instructions (by people) Form 1 AND 04 R = X AND Y 0100 (4)

Examples of instructions Form 1 OR 05 R = X OR Y 0101 (5)

Assembly Language manuals present R=X⊕Y


Form 1 XOR 06 0110 (6)
instructions this way too!
Form 2 NOT 07 R = invert(Y) 0111 (7)

Dec 3, 2015 23 Dec 3, 2015 24


FORM 1 Form 1 (continued)
General Form for ALU instructions with 3 reg operands:
IR Instruction Encoding:
Op ADD, SUB, AND, OR, XOR bit
3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1
9 8 7 6 5 4 3 2 1 0
1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
Operands: Rd (D Reg), Rsx (Sx Reg), Rsy (Sy Reg) use Op Rd Rsx Rsy Not used (always 0)
RTL: Rd  [ Rsx ] Op [ Rsy ] hex H7 H6 H5 H4 H3 H2 H1 H0
Flags: ADD, SUB: C, V, N, Z
Assembly Syntax:
AND, OR, XOR: C = 0, V = 0, N, Z General Assembly Syntax: “;” – starts a comment
OpCodes: see previous slide Op Rd, Rsx, Rsy that continues to end of
line … like “//” in C
Have picked OpCodes to align with related ALU Examples:
opr’s on purpose! (trick to simplify hardware!) ADD R5, R1, R10 ; R5  [ R1 ] + [ R10 ]
Operands: XOR R4, R4, R4  ?????
Specify a register by its address
Register Addressing Mode
Dec 3, 2015 25 Dec 3, 2015 26

FORM 2 Form 2 (continued)


General Form for ALU instructions with 2 reg operands: IR Instruction Encoding: [Different than before!]
Op MOV (previously RY), NOT (previously NOTY) bit
3
1
3
0
2
9
2
8
2
7
2
6
2
5
2
4
2
3
2
2
2
1
2
0
1
9
1
8
1
7
1
6
1
5
1
4
1
3
1
2
1
1
1
0
9 8 7 6 5 4 3 2 1 0

Operands: Rd (D Reg), Rs (Sy Reg) use Op Rd Rs Not used (always 0)

RTL: Rd  Op [ Rs ] Different from Lab 2? … later! hex H7 H6 H5 H4 H3 H2 H1 H0


Flags: MOV: unchanged (no data manipulation) General Assembly Syntax:
NOT: C = 0, V = 0, N, Z OP Rd, Rs
OpCodes: see earlier slide Examples:
MOV R5, R1 ; R5  [ R1 ]
NOT R10, R11 ; R10  not( [ R11 ] )
NOT R4, R4 ; R4  not( [ R4 ] )
Dec 3, 2015 27 Dec 3, 2015 28

WAIT !!!
Does the encoding change for 2 operand ALU
Problem & Solution
PROBLEM: Can’t use the same execution states for
instructions impact the microarchitecture? both 3 operand & 2 operand ALU instructions!
3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1
bit Form 1
1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
9 8 7 6 5 4 3 2 1 0 The IR formats no longer share the same bits for
use Op Rd Rsx Rsy Not used (always 0)
the SY Reg
hex H7 H6 H5 H4 H3 H2 H1 H0 SOLUTION: Add new states to deal with the 2 operand
3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1
format
bit Form 2 9 8 7 6 5 4 3 2 1 0
1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 Create a branch in the FSM graph
use Op Rd Rs Not used (always 0)
Change destination from Decode state based on
hex H7 H6 H5 H4 H3 H2 H1 H0 instruction OpCode
What has changed?
How has the microarchitecture handled this previously?
Dec 3, 2015 29 Dec 3, 2015 30
Revised FSM Impact on FSM Design?
Need more Control FSM Output ROM words to hold outputs
Fetch Fetch Fetch
State F0 State F1 State F2 for the new Form 2 ALU instruction execution states
Need to change the Decode ROM entries for Form 2 ALU
instructions to point to the first execution state for Form 2
Decode Execution Execution Execution instructions Decode ROM? …. Lab 4!!!
Form 1 State E0 State E1 State E2
State
(and F3) ALU (Form 1) (Form 1) (Form 1)
That’s all ?!?
OpCode In general, could this be done to introduce many new
Form 2 instruction forms? Yes! (Lab 5 ☺ )
ALU
OpCode
Execution Execution Add a new branch in the FSM graph
State E3 State E4
(Form 2) (Form 2) Add corresponding new states in FSM Output ROM
Make Decode ROM entry for new instructions of new
form to point to new states
New states
Dec 3, 2015 31 Dec 3, 2015 32

Revised FSM Decode ROM Contents


ROM
OpCode ROM Contents
Fetch Fetch Fetch Instruction Address
hex hex
State F0 State F1 State F2 hex
01
00 02 NOP 00 00 ???? (Form 3)

ADD 01 01 04 (Form 1)
Decode Execution Execution Execution
Form 1 State E0 State E1 State E2
State
ALU
SUB 02 02 04 (Form 1)
(and F3) (Form 1) (Form 1) (Form 1)
OpCode 04 05 06 MOV 03 03 07 (Form 2)
03
Form 2 AND 04 04 04 (Form 1)
State encoding ALU Execution Execution
OpCode
in hex State E3 State E4 OR 05 05 04 (Form 1)
(also FSM (Form 2) (Form 2)
Output ROM 07 08 XOR 06 06 04 (Form 1)
‘Next State’
address!) Decode ROM for form 2 ALU NOT 07 07 07 (Form 2)
Dec 3, 2015 instructions must point here! 33 Dec 3, 2015 34

FORM 3 NOP Instruction Implementing NOP


NOP: “Do nothing”, no operands Nothing to do beyond decoding the NOP
RTL: no effect Change Decode ROM entry for NOP to go to state 0
Flags: no changes Go and fetch the next instruction (from state 00)
OpCode: 0x00
Decode ROM Decode ROM
bit
3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1
9 8 7 6 5 4 3 2 1 0
Instruction OpCode (hex)
1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 Address (hex) Contents (hex)
use 0 0 0 0 0 0 00 Not used (always 0) NOP 00 00 00 (Form 3)
hex H7 H6 H5 H4 H3 H2 H1 H0

Syntax:
NOP This handling is the same as in Lab 4
… but different from Lab 5!

Dec 3, 2015 35 Dec 3, 2015 36


Constants & ALU Instructions
Now … on to something NEW Lab 6 Previously: source values were always from Registers
Constants as operands in ALU instructions Very common in computers!
What about using an explicit constant (literal) as a source
operand? e.g. In C: Y = X + 4;
In RTL: R6  [ R5 ] + 4 4 is a constant value
Constant as an operand immediate (imm) value
Encode the constant in the instruction as an operand
Extend Form 1 to get Form 4:
Form 1 operations with 2 registers & immediate value
Will need a way to encode the new Form 4 instructions!

Dec 3, 2015 37 Dec 3, 2015 38

FORM 4 Form 4 Opcodes


General Form for ALU instructions with 3 operands:
Using the different operand combination creates a
2 reg & 1 imm operands different instruction that requires different
Op ADD, SUB, AND, OR, XOR (same as Form 1) processing (different execution states!)
Operands: Rd, Rs, imm16 Want new instruction Opcodes to include the ALU
imm16 is an unsigned 16-bit value stated as a operation essential for using IR contents to drive
literal, e.g.: 10, 0x38, 0b01110, ‘AB’ ALU operation!
• imm16 value is 0-extended to 32 bits BUT … New instructions can’t use the same
Opcodes as the (Form 1) 3 register operand form!
–High 16 bits are all 0
Can still include ALU operation in Opcode if we pick
RTL: Rd  [ Rs ] Op 0-extend( imm16 ) the instruction Opcodes carefully!
Flags: ADD, SUB: C, V, N, Z
AND, OR, XOR: C = 0, V = 0, N, Z

Dec 3, 2015 39 Dec 3, 2015 40

Careful Choice of Opcodes Why 0x21 ?


Want least significant bits to include ALU operation On previous slide, could have used any value other than
As long as remaining bits & ALU bits uniquely 0b0000 in the hi nybble in Opcode: ( 0b xxxx 0001 )
identify the new instruction Opcode is OK! Why not this value? 0b xxxx 1001
E.G. ADD ALU opr = 0b0001 Save for later ALU expansion (more ALU oprns)
Form 1 ADD Opcode: 0b 0000 0001 = 0x01 Why not this value? 0b 0001 0001
Avoid confusion with Lab 5 ☺
Form 4 ADD OpCode: 0b 0010 0001 = 0x21
Why this value? 0b 0010 0001
Why not?
• Could have chosen any upper 4 bit value except 0001

Dec 3, 2015 41 Dec 3, 2015 42


Form 1 Opcode Table Form 4 Form 4 (continued) imm16 value is encoded
in the instruction!
OP Opcode Opcode OP Opcode Opcode IR Instruction Encoding:
(Form 1) (hex) (binary) (Form 4) (hex) (binary) bit
3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1
9 8 7 6 5 4 3 2 1 0
1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0

NOP 00 Form 3 use Op Rd Rs imm16

ADD 01 0000 0001 ADD 21 0010 0001 hex H7 H6 H5 H4 H3 H2 H1 H0


SUB 02 0000 0010 SUB 22 0010 0010
General Assembly Syntax:
MOV 03 Form 2
Op Rd, Rs, #imm16 ; “#” indicates immediate
AND 04 0000 0100 AND 24 0010 0100

OR 05 0000 0101 OR 25 0010 0101 • Yes … must put “#” in front of the imm16 value
XOR 06 0000 0110 XOR 26 0010 0110

NOT 07 Form 2

Dec 3, 2015 43 Dec 3, 2015 44

Form 4 Limitations Form 4 Instruction Examples


3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1
bit 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
9 8 7 6 5 4 3 2 1 0
imm16 operands are restricted to 16 bits
use Op Rd Rs imm16
Limits the magnitude of constants
hex H7 H6 H5 H4 H3 H2 H1 H0
Unsigned: 0 .. 65,535
Unsigned operands ADD
8 7 6 R5,
5 4 3R1,
2 1 #10 3 2 
;14 R5 1 0 [ R1 ] + 10
3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1
bit 1 0 9 0 9 8 7 6 5
9 8 7 6 5 4 3 2 1 0

Can’t have negative constants 21 5 1 000A

hex H7 H6 H5 H4 H3 H2 H1 H0

Overcoming both limitations is discussed later bit


3
1
3
0
2
9
2
8
2
7
2
6
2 2 2 2 2 2 1 1 1 1 1
AND
5 4 3 R4,
2 1 0R4,
9 8#0x1
7 6 5
1
4
1
3
1
2 
1 1
1 0 ?????
9 8 7 6 5 4 3 2 1 0

24 4 4 0001

hex H7 H6 H5 H4 H3 H2 H1 H0

NOTE: Constant is encoded in the instruction!


Dec 3, 2015 45 Dec 3, 2015 Is loaded into IR during instruction fetch! 46

Form 4 Implementation 0-Extend( imm16 ) onto Data Bus?


Add new states
IR imm16
Store new outputs for new states in FSM Output ROM
Store new entries in Decode ROM for new OpCodes 0000 0000 0000 0000 Goes onto
Execution: Goes onto 31 – 16 splitter 15 – 0
low 16 bits
high 16 bits of Data Bus
1st Execution State: copy Rs to T1 of Data Bus
2nd Execution State: drive T1 into ALU X input, drive IRU16

0-extend[ imm16 ] on Internal Data Bus to ALU Y 32


input, drive operation to ALU, latch result in T2 Internal Data bus

3rd Execution State: copy T2 to Rd Need new FSM Output


What is different from Form 1 Execution? ☺ ROM bit IRU16
Dec 3, 2015 47 Dec 3, 2015 48
Resulting Control FSM Output Table
NOP, MOV, NOT ???
for Form 4 Instructions
Does it make sense to have a new form based on an
immediate operand for any of these?
I
R R T T T T A NOP No! What could any operand add? (Nothing)
R S A
e e D 1 1 2 2 N
State U
g g R
X
C O C O O
O Others Next State MOV Yes!!!
1 R P
R W E E E E P
6 NOT Yes, but not obvious! ☺ (coming … MVN)
MOV a constant (imm) value into a Register
1st 1 1 1 1 2nd Form 5
Nothing
2nd 1 1 1 1 happening 3rd
(all 0)
3rd 1 1 1 1 00

First time an instruction execution state has not involved a Register?


Dec 3, 2015 49 Dec 3, 2015 50

FORM 5 MOV Form 5 MOV Opcode


General Form for MOV instruction with 2 operands: Doesn’t matter! ??
1 reg & 1 imm operands: Will see that ALU is not involved! ? !
Op MOV Use 0x23 (but any free code could be used!)
Operands: Rd, imm20
imm20 is an unsigned 20-bit value (stated as a OP Opcode Opcode OP Opcode Opcode
literal, e.g.: 567, 0x10038, 0b01110, ‘AB’ ) (Form 2) (hex) (binary) (Form 5) (hex) (binary)

• imm20 value is 0-extended to 32 bits MOV 03 0000 0011 MOV 23 0010 0011

–High 12 bits are all 0


RTL: Rd  0-extend( imm20 )
Flags: no change (data transfer, no manipulation)

Dec 3, 2015 51 Dec 3, 2015 52

Form 5 MOV imm20 value is encoded Form 5 MOV Implementation


in the instruction!
IR Instruction Encoding: Add new state (only one?)
3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1
bit 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
9 8 7 6 5 4 3 2 1 0
Store new outputs for new state in FSM Output
use 23 Rd imm20 ROM
hex H7 H6 H5 H4 H3 H2 H1 H0 Store new entry in the Decode ROM for the new
General Assembly Syntax: Opcode
MOV Rd, #imm20 ; “#” indicates immediate Execution:
Examples: 1st Execution state: drive 0-extend( imm20 ) on
0-extend!
Internal Data Bus, latch into Rd
MOV R5, #0xF1234 ; R5  0x000F 1234

Dec 3, 2015 53 Dec 3, 2015 54


Resulting Control FSM Output Table
0-Extend( imm20 ) on Data Bus?
19 16 0
for Form 5 MOV Instruction
imm20
IR
I I
0000 0000 0000 R R T T T T A
Goes onto R R S A
31 – 20 splitter 19 – 0 e e D 1 1 2 2 N
State U U X O Others Next State
Goes onto low 20 bits g g R C O C O O
2 1 R P
high 12 bits of Data Bus R W E E E E P
0 6
of Data Bus
IRU20
Nothing
32 1st 1 1 1 1 happening 00
Internal Data bus (all 0)

Need new FSM Output


ROM bit IRU20
Dec 3, 2015 55 Dec 3, 2015 56

Form 5 MOV for Negative Constants? MVN Application


Recall: Taking 2’s Complement to negate a value:
0-extend( imm20 ) always gives a non-negative value
Invert, then Add 1 2’s complement of the value
MOV cannot be used for negative values! So … simply inverting (not) gives the encoding of one less
New instruction: MVN Move Not than the 2’s complement of the value
Rd  NOT ( 0-extend( imm20 ) ) E.G. NOT ( 2 ) gives encoding of – 3
NOT ( 0000 0010 ) == 1111 1101
Works, but application requires care ☺
which is – 3 in 2’s comp. encoding ☺
Look at application, then implementation
if Add 1: 1111 1110  – 2
Can use MVN to load negative constant values, but must
be careful when specifying the constant!
specify positive constant value with magnitude one less
than desired negative value’s magnitude:
MVN R4, 2  loads – 3 into R4
Dec 3, 2015 57 Dec 3, 2015 58

FORM 5 MVN Form 5 MVN Opcode


General Form for MOV instruction with 2 operands: Picking Opcode carefully can help implementation!
1 reg & 1 imm: Want to use ALU NotY operation!
op MVN Use 0x27
Operands: Rd, imm20
imm20 is an unsigned 20-bit value (stated as a OP Opcode Opcode OP Opcode Opcode
literal, e.g.: 0x10038, 0b01110, 9999 ) (Form 2) (hex) (binary) (Form 5) (hex) (binary)

• imm20 value is 0-extended to 32 bits, then NOT 07 0000 0111 MVN 27 0010 0111

inverted
RTL: Rd  NOT ( 0-extend( imm20 ) )
Flags: C = 0, V = 0, N = 1, Z = 0

Dec 3, 2015 59 Dec 3, 2015 60


Form 5 MVN imm20 value is encoded Form 5 MVN Implementation
in the instruction!
IR Instruction Encoding: Add new states
3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1
bit 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
9 8 7 6 5 4 3 2 1 0
Store new outputs for new states in FSM Output
use 27 Rd imm20 ROM
hex H7 H6 H5 H4 H3 H2 H1 H0 Store new entry in the Decode ROM for the new
General Assembly Syntax: OpCode
MVN Rd, #imm20 ; “#” indicates immediate Execution:
Examples: 1st Execution state: place 0-extended imm20 on
0-extend!
Internal Data Bus, ALU opr (NotY) to ALU, latch
MVN R5, #10 ; R5  not (0x0000 000A) result into T2
; R5  0xFFFF FFF5 (– 11) 2nd Execution State: copy T2 to Rd

Dec 3, 2015 61 Dec 3, 2015 62

Resulting Control FSM Output Table


In this Deck: Towards the Software
for Form 5 MVN Instruction 1. ALU instructions manipulate data Registers only
Source Operand: Register & Immediate Addressing Mode
2. Load and Store Instructions
I I Moving values between Registers and Main Memory
R R T T T T A
R R S A Addressing Modes
e e D 1 1 2 2 N
State U U X O Others Next State
2 1
g g R
R
C O C O O
P 3. Control Flow Instructions
R W E E E E P
0 6 Unconditional, Conditional
PC relative vs. absolute addressing
Subroutines more elaborate control flow
1st 1 1 1 Nothing 2nd
happening
2nd 1 1 1 1 (all 0) 00
In all of the above:
How to realize instructions on microarchitecture
• Add to Hardware as needed!
Instruction Encodings
Program fragments to implement HLL statements
Dec 3, 2015 63 Dec 3, 2015 64

Recall … HLL Variables Load (LDR) and Store (STR) Instructions


HLL variables are typically implemented using Main LDR: Load a value from Main Memory into a Register
Memory locations: Operands: Rd & source memory address
Store information values: read, write Rd is the data register (is also “destination”)
Name of variable represents address of variable in STR: Store data from a Register to Main Memory
Main Memory
Operands: Rd & destination memory address
@H/S interface:
Rd is the data register (is source not “destination”!)
Need instructions to load HLL variables from Main
Memory into registers for manipulation using the ALU Register operand is identified using previous approach:
Need instructions to store new values from registers Register Addressing Mode 4-bit Reg id (Rd) is
into HLL variables in Main Memory included in instruction encoding
Memory address operand can be specified in 4
different ways! 4 different addressing modes!
Dec 3, 2015 65 Dec 3, 2015 66
4 Memory Operand Addressing Modes LDR and STR Opcodes
Register (Indirect) Use of different addressing modes requires different
address is in a register: [ Rn ] execution states
Lab 6 Will pick general opcodes that can be altered slightly
(Register) Offset Immediate
address = [ Rn ] + (signed) immediate to identify the different addressing modes
(Register) Offset Register LDR: 0b 0011 00xx 0x3x (30, 31, 32, 33)
address = [ Rn ] + [ Rm ] STR: 0b 0011 01xx 0x3x (34, 35, 36, 37)
PC-relative (special case of Offset Immediate) Modify the 2 least signif. bits to create variations for
addressing modes
address = [ PC ] + (signed) immediate
NOTE: lo nybble is NOT specifying an ALU operation!

Dec 3, 2015 67 Dec 3, 2015 68

Register (Indirect) Addressing Mode LDR and STR Register Mode Form
for Memory Operand
Rn contains the address of the memory operand 2 operands: both are registers
Rn is a pointer to the memory operand! bit
3
1
3
0
2
9
2
8
2
7
2
6
2
5
2
4
2
3
2
2
2
1
2
0
1
9
1
8
1
7
1
6
1
5
1
4
1
3
1
2
1
1
1
0
9 8 7 6 5 4 3 2 1 0

LDR: Rd  MMem[ [ Rn ] ] use Op Rd Rn Not used (always 0)


STR: MMem[ Rn ]  [ Rd ] hex H7 H6 H5 H4 H3 H2 H1 H0
Syntax: explicitly show that Rn is a pointer into memory
LDR Rd, [ Rn ] ; “[“ and “]” around register Rn Have we seen this form before? Yes! Form2!
STR Rd, [ Rn ] E.G. NOT R4, R5
Do we need new hardware? No!
Do we need new execution states? … Yes! Why?
Read [ and ] as meaning “operand is a memory address”.

Dec 3, 2015 69 Dec 3, 2015 70

Implementations Examples
LDR (0x30): Requires a Main Memory Read: Suppose current state: Registers (hex) Main Memory (hex)
Reg Value Address Contents
1st Execution State: MAR  [ Rn ] R0 1 200 300F
before
2nd Execution State: MDR  MMem[ [ MAR ] ]  R4 2 201 AC05
3rd Execution State: Rd  [ MDR ] Execute LDR R4, [ R12 ] R12 200

encoding: Registers (hex) Main Memory (hex)

STR (0x34): Requires a Main Memory Write: 0x304C 0000 after


R0 1 200 300F
R4 300F 201 AC05
1st Execution State: MAR  [ Rn ] R12 200
nd
2 Execution State: MDR  [ Rd ]
Registers (hex) Main Memory (hex)
3rd Execution State: MMem[ MAR ]  [ MDR ]  Then STR R0, [ R12 ]
R0 1 200 1
encoding: after
R4 300F 201 AC05
No new hardware needed! 0x340C 0000 R12 200

Dec 3, 2015 71 Dec 3, 2015 72


Examples (Register) Offset Immediate Mode
for Memory Operand
Suppose current state: Registers (hex) Main Memory (hex)
Reg Value Address Contents address of the memory operand =
before R0 1 200 300F
contents of Rn + immediate value
R4 2 201 AC05
Execute LDR R4, [ R12 ] R12 200 Rn + immediate is a pointer to the operand!
encoding: Registers (hex) Main Memory (hex)
LDR: Rd  MMem[ [ Rn ] + imm ]
0x304C 0000 before R0 1 200 300F STR: MMem[ Rn + imm ]  [ Rd ]
R4 300F 201 AC05
Syntax: explicitly show [ pointer into memory ]
R12 200
LDR Rd, [ Rn, #offset ]
Registers (hex) Main Memory (hex)
Then STR R0, [ R12 ] STR Rd, [ Rn, #offset ]
R0 1 200 1
encoding: after R4 300F 201 AC05
0x340C 0000 R12 200

Dec 3, 2015 73 Dec 3, 2015 74

LDR and STR General Form Sign-Extend( imm16 ) onto Data Bus?
for Offset Immediate memory operand 15
3 operands: 2 registers and 1 immediate
IR imm16
3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1
bit 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
9 8 7 6 5 4 3 2 1 0
Goes onto
16 bits tied
together low 16 bits
use Op Rd Rn imm16 Need new FSM of Data Bus
31 – 16 splitter 15 – 0
hex H7 H6 H5 H4 H3 H2 H1 H0 Output ROM bit Goes onto
Have we seen this form before? Yes! Form4! for IRS16 high 16 bits
of Data Bus IRS16
Do we need new execution states? … Yes!
32
Internal Data bus
NOTE: imm16 is signed!
Need new hardware sign-extend( imm16 )

Dec 3, 2015 75 Dec 3, 2015 76

LDR Offset Immediate Implementation STR Offset Immediate Implementation


LDR (0x31): Requires calculating address and a Main STR (0x35): Requires calculating address and a Main
Memory Read: Memory Write:
1st Execution State: T1  [ Rn ] 1st Execution State: T1  [ Rn ]
IRS16 IRS16
2nd Execution State : T2  [ T1 ] + sign-extend( imm16 ) 2nd Execution State : T2  [ T1 ] + sign-extend( imm16 )
3rd Execution State: MAR  [ T2 ] 3rd Execution State: MAR  [ T2 ]
4th Execution State: MDR  MMem[ [ MAR ] ]  4th Execution State: MDR  [ Rd ]
th
5 Execution State: Rd  [ MDR ] th
5 Execution State: MMem[ MAR ]  [ MDR ] 

5 Execution States! 5 Execution States!

Dec 3, 2015 77 Dec 3, 2015 78


Examples Examples
Suppose current state: Registers (hex) Main Memory (hex) Suppose current state: Registers (hex) Main Memory (hex)
Reg Value Address Contents Reg Value Address Contents
before R0 1 200 300F before R0 1 200 300F
R4 2 201 AC05 R4 2 201 AC05
Execute LDR R4, [ R12, #1 ] R12 200 Execute LDR R4, [ R12, #1 ] R12 200
+
encoding: Registers (hex) Main Memory (hex) encoding: Registers (hex) Main Memory (hex)
0x314C 0001 after
R0 1 200 300F 0x314C 0001 before
R0 1 200 300F
R4 AC05 201 AC05 R4 AC05 201 AC05
R12 200 R12 200
+
Registers (hex) Main Memory (hex) Registers (hex) Main Memory (hex)
Then STR R0, [ R12, #1 ] Then STR R0, [ R12, #1 ]
R0 1 200 300F R0 1 200 300F
encoding: after R4 AC05 201 1 encoding: after R4 AC05 201 1
0x350C 0001 R12 200 0x350C 0001 R12 200

Dec 3, 2015 79 Dec 3, 2015 80

Variables of Type Node


Why is Offset Immediate Useful? @H/S Interface
Accessing variables of struct (composite) types! Each Node variable is an instance of the Node struct type:
Suppose a linked list: consists of an array of 10 integer data Samples, and
Head pointer variable Head Node pointer a Next pointer … occupies 11 words
Node is a struct type (composite): 2 fields A_Node Contents
Samples 0
1
Occupies first 10
Composed of an array of 10 integers Pointer variable 2
3
array[10] words of variable
4
occupies 1 word Occupies 11 words 5
6 integers in memory
(Samples), and a pointer (Next) of memory of memory
7
8
9
Occupies last 1
Node Contents Next A Node pointer word of variable
Samples Each Node variable
array[10] occupies 11 words in memory
integers of memory Next field is offset 10 (0xA) words from start of (every)
Node variable
Next Node pointer

Dec 3, 2015 81 Dec 3, 2015 82

Remember … Scenario
Head
Suppose list of 3 Nodes:
The Name of HLL variable is associated Similar diagrams were NodeX Contents
Samples
with the address where the variable is common in SYSC 2006 array[10]
integers
stored in Main Memory
NodeY Contents
Next Node pointer
Samples array[10]
integers
Address Contents NodeZ Contents
(Name) Samples array[10]
Next Node pointer
integers

Next Node pointer

Null pointer value (0x0000 0000)


Dec 3, 2015 83 Dec 3, 2015 84
Names / Addresses of Variables in Scenario Scenario with Names, Addresses & Contents
@H/S Interface @H/S Interface
0x100 Head 0x200
Head @ 0x100
Head 0x200 NodeX Contents
NodeX @ 0x200 0x300 NodeY Contents 0x200 .Samples array[10]
NodeX.Samples @ 0x200 0x300 .Samples
Contents array[10] integers
NodeX.Next @ 0x20A Node integers
.Samples array[10]
NodeY @ 0x300 integers 0x20A .Next 0x300
0x030A .Next 0x400
NodeY.Samples @ 0x300
NodeY.Next @ 0x30A .Next Node pointer 0x400 NodeZ Contents
0x400 .Samples array[10]
NodeZ @ 0x400 Next field is offset 0xA integers
NodeZ.Samples @ 0x400 Next field is offset 0xA words from start of
words from start of (every) Node variable
NodeZ.Next @ 0x40A (every) Node variable 0x040A .Next 0x0

null pointer value (0x0000 0000)


Dec 3, 2015 85 Dec 3, 2015 86

Walking the List Initializing Cur (R8): Cur = Head


accessing variables of struct type
0x200 NodeX Contents
Head 0x200
HLL: For (Cur = Head; Cur != null; Cur = Cur -> Next) { 0x100
addrs + 0 Samples array[10]
… do something with current node … integers

}
addrs + 0xA Next 0x300
@ H/S Interface suppose use R8 as Cur pointer R8 some value

Cur = Head Recall Head @ 0x100 MOV R8, #0x100 ; R8  address of Head
MOV R8, #0x100 ; R8  address of Head
LDR R8, [ R8 ] ; R8 points to first node LDR R8, [ R8 ] ; R8 points to first node
Cur = Cur -> Next Recall Next at offset 0xA in Node
LDR R8, [ R8, #0xA ] ; R8 points to next node

Visualize this!
Dec 3, 2015 87 Dec 3, 2015 88

Initializing Cur (R8) Initializing Cur (R8)


0x200 NodeX Contents 0x200 NodeX Contents
Head 0x200 Head 0x200
addrs + 0 Samples array[10] addrs + 0 Samples array[10]
0x100 0x100
integers integers

addrs + 0xA Next 0x300 addrs + 0xA Next 0x300


R8 initial value R8 initial value

MOV R8, #0x100 ; R8  address of Head MOV R8, #0x100 ; R8  address of Head
R8 0x100 Source immediate mode R8 0x100 Source immediate mode
LDR R8, [ R8 ] ; R8 points to first node LDR R8, [ R8 ] ; R8 points to first node

Dec 3, 2015 89 Dec 3, 2015 90


Initializing Cur (R8) Initializing Cur (R8)
0x200 NodeX Contents 0x200 NodeX Contents
Head 0x200 Head 0x200
addrs + 0 Samples array[10] 0x100 addrs + 0 Samples array[10]
0x100
integers integers

addrs + 0xA Next 0x300 addrs + 0xA Next 0x300


R8 initial value R8 initial value

MOV R8, #0x100 ; R8  address of Head MOV R8, #0x100 ; R8  address of Head
R8 0x100 R8 0x100

LDR R8, [ R8 ] ; R8 points to first node LDR R8, [ R8 ] ; R8 points to first node
R8 0x200 ???? Source Register mode R8 0x200 Source Register mode

Cur = Head; ☺
Dec 3, 2015 91 Dec 3, 2015 92

Advancing Cur: Cur = Cur -> Next


For Example: LDR R8, [ R8, #0xA ]
LDR R8, [ R8, #0xA ]
Before: Before: R8 points to NodeX
R8 addrs_cur_Node

addrs_cur_Node Contents 0x200 NodeX Contents


R8 0x200 0x200 Samples
addrs + 0 Samples array[10] array[10]
integers integers
imm 0xA
+
addrs + 0xA Next addrs_next_Node NodeY NodeY
0x20A Next 0x300

After: After: R8 points to NodeY


R8 addrs_next_Node NodeY NodeY
R8 0x300

Dec 3, 2015 Cur = Cur -> Next; 93 Dec 3, 2015 Cur = Cur -> Next; ☺ 94

OK … what’s next? (Register) Offset Register Mode


For LDR and STR: need to access memory To specify the address of a memory operand
Addressing mode for memory operand: Very similar to Offset Immediate … but
so far: Register & Offset Immediate modes Instead of adding an immediate value to the
value from Rn …
More to come!
add the contents of another register to
Offset Register the value from Rn
PC-Relative Operands: 3 operands, all registers
Destination: register (Rd)
Source: 2 registers (Rn, Rm)

Dec 3, 2015 95 Dec 3, 2015 96


(Register) Offset Register Mode LDR and STR General Form
for Memory Operand for Offset Register memory operand
address of the memory operand = 3 operands: all registers
contents of Rn + contents of Rm bit
3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1
9 8 7 6 5 4 3 2 1 0
1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
Rn + Rm is a pointer to the operand! use Op Rd Rn Rm Not used (always 0)
LDR: Rd  MMem[ [ Rn ] + [ Rm ] ] hex H7 H6 H5 H4 H3 H2 H1 H0
STR: MMem[ Rn + Rm ]  [ Rd ]
Syntax: explicitly show [ pointer into memory ] Have we seen this form before? Yes! Form1!
LDR Rd, [ Rn, Rm ] Good! All of the hardware is in place to get the
STR Rd, [ Rn, Rm ] operands from the IR!
Do we need new execution states? … Yes!

Dec 3, 2015 97 Dec 3, 2015 98

LDR Offset Register Implementation STR Offset Register Implementation


LDR (0x32): Requires calculating address and a Main STR (0x36): Requires calculating address and a Main
Memory Read: Memory Write:
Only difference from Only difference from
1st Execution State: T1  [ Rn ] Offset Immediate 1st Execution State: T1  [ Rn ] Offset Immediate
2nd Execution State : T2  [ T1 ] + [ Rm ] 2nd Execution State : T2  [ T1 ] + [ Rm ]
3rd Execution State: MAR  [ T2 ] 3rd Execution State: MAR  [ T2 ]
4th Execution State: MDR  MMem[ [ MAR ] ]  4th Execution State: MDR  [ Rd ]
5th Execution State: Rd  [ MDR ] 5th Execution State: MMem[ MAR ]  [ MDR ] 

Dec 3, 2015 99 Dec 3, 2015 100

Examples Examples
Suppose current state: Registers (hex) Main Memory (hex) Suppose current state: Registers (hex) Main Memory (hex)
Reg Value Address Contents Reg Value Address Contents
before R2 1 200 300F before R2 1 200 300F
R4 2 + 201 AC05 R4 2 201 AC05
Execute LDR R2, [ R12,R4 ] R12 200 202 FF Execute LDR R2, [ R12,R4 ] R12 200 202 FF

encoding: Registers (hex) Main Memory (hex) encoding: Registers (hex) Main Memory (hex)
0x322C 4000 after
R2 FF 200 300F 0x322C 4000 before
R2 FF 200 300F
R4 2 201 AC05 R4 2 201 AC05
+
R12 200 202 FF R12 200 202 FF

Registers (hex) Registers (hex) Main Memory (hex)


Then STR R4, [ R12, R4 ] Then STR R4, [ R12, R4 ]
R2 FF R2 FF 200 300F
encoding: after R4 2 encoding: after R4 2 201 AC05
0x364C 4000 R12 200 0x364C 4000 R12 200 202 2

Dec 3, 2015 101 Dec 3, 2015 102


Scenario with Addresses & Stored Values
Why is Offset Register Useful? @H/S Interface
Accessing variables of array (composite) types! 0x100 Head 0x200

Recall linked list: 0x200 NodeX Contents


0x300 NodeY Contents 0x200 .Samples array[10]
Head pointer variable Head Node pointer
0x300 .Samples array[10] integers
Node is a struct type (composite): 2 fields integers
composed of an array of 10 integers Pointer variable
0x20A .Next 0x300
occupies 1 word 0x030A .Next 0x400
(Samples), and a pointer (Next) of memory
0x400 NodeZ Contents
Node Contents
Samples Each Node variable 0x400 .Samples
array[10] array[10]
occupies 11 words
integers integers
of memory
(Samples values)
Next Node pointer 0x040A .Next 0x0

Null pointer value (0x0000 0000)


Dec 3, 2015 103 Dec 3, 2015 104

Process an Array in a Node in the List Working with the Array Index
HLL: for (Cur = Head; Cur != null; Cur = Cur -> Next) { Use R8 as Cur pointer (also points to first array element)
for ( i = 0; i < 10; i++ ) { Use R9 as the index i: (could use any register … except R15)
… do something with Cur -> Samples[ i ] MOV R9, # 0 ; initialize i = 0
… load Cur -> Samples[ i ] into R10 … to do something …
} LDR R10, [ R8, R9 ] ; get Samples[i]
} Increment i: i++
@ H/S Interface: again, use R8 as Cur pointer ADD R9, R9, # 1 ; increment i
Assume pointing to node to be processed
Recall: R8 also pointing to first array element! Visualize loading array element!

Dec 3, 2015 105 Dec 3, 2015 106

LDR R10, [ R8, R9 ] OK … what’s next?


Before: For LDR and STR: need to access memory
R8 addrs_Cur_Node addrs_Cur_Node Contents
array[10]
Addressing mode for memory operand:
addrs + 0 Samples
integers so far: Register & Offset Immediate
R9 index i value + Samples[i] value
and Offset Register modes
addrs + 0xA Next addrs_next_Node More to come!
PC-Relative
After:

R10 Samples[i] value

Dec 3, 2015 107 Dec 3, 2015 108


PC-Relative is Similar to PC-Relative Mode
Recall: Offset Immediate Mode for Memory Operand
address of the memory operand = ALWAYS R15
Offset Immediate: Add contents of Rn with a signed
immediate (imm) value to get address of memory contents of PC + sign-extend ( imm )
operand R15 + imm is a pointer to the operand!
The memory operand is offset imm words relative to LDR: Rd  MMem[ [ R15 ] + sign-extend( imm ) ]
the pointer in Rn
STR: MMem[ R15 + sign-extend( imm ) ]  [ Rd ]
It is possible to use R15 (PC) as Rn !
Syntax: explicitly show [ pointer into memory ]
PC-Relative: ALWAYS uses R15 (PC) as Rn
No need to specify R15 explicitly in instruction it is LDR Rd, [ #imm ]
assumed (implicitly) for every PC-relative operand STR Rd, [ #imm ]
• e.g. LDR R6, [ R15, #0x06F8 ] instead just write:
LDR R6, [ #0x06F8 ]
Dec 3, 2015 109 Dec 3, 2015 110

LDR and STR General Form Sign-Extend[ imm20 ] onto Data Bus?
for PC-Relative memory operand 19
2 explicit operands: 1 register and 1 (signed) imm
IR imm20
3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1
bit 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
9 8 7 6 5 4 3 2 1 0

12 bits tied
use Op Rd imm20
together
31 – 20 19 – 0
hex H7 H6 H5 H4 H3 H2 H1 H0
Goes onto
high 12 bits
of Data Bus IRS20
Have we seen this form before? Yes! Form5!
Is all of the hardware is in place to get the operands 32
from the IR? … NO … need to sign-extend imm20! Internal Data bus

Do we need new execution states? … Yes! Need new FSM Output ROM bit for IRS20

Dec 3, 2015 111 Dec 3, 2015 112

LDR PC-Relative Implementation STR PC-Relative Implementation


LDR (0x33): Requires calculating address and a Main STR (0x37): Requires calculating address and a Main
Memory Read: ALWAYS R15
Memory Write: ALWAYS R15
1st Execution State: T1  [ R15 ] 1st Execution State: T1  [ R15 ]
IRS20 IRS20
2nd Execution State : T2  [ T1 ] + sign-extend( imm20 ) 2nd Execution State : T2  [ T1 ] + sign-extend( imm20 )
3rd Execution State: MAR  [ T2 ] 3rd Execution State: MAR  [ T2 ]
4th Execution State: Memory Read Cycle 4th Execution State: MDR  [ Rd ]
MDR  MMem[ [ MAR ] ]  th
5 Execution State: Memory Write Cycle
5th Execution State: Rd  [ MDR ] MMem[ MAR ]  [ MDR ] 

Dec 3, 2015 113 Dec 3, 2015 114


Simple … Right? Careful! Walking Through Scenario
Initially : PC contains 0x 250
Scenario: start of instruction cycle: PC contains 0x250 Fetch:
Address Contents LDR PC-relative
Load instruction into IR IR  0x 3330 0010
0x250 0x 3330 0010 ; LDR R3, [ #0x10 ] And what else happens as part of fetch?

0x25A 0x 0000 0001 ; #1

0x260 0x 0000 0002 ; #2
0x261 0x 0000 0003 ; #3

What value gets loaded into R3?


Dec 3, 2015 115 Dec 3, 2015 116

Walking Through Scenario Walking Through Scenario


Initially : PC contains 0x 250 Initially : PC contains 0x 250
Fetch: Fetch:
Load instruction into IR IR  0x 3330 0010 Load instruction into IR IR  0x 3330 0010
Increment PC PC  0x 251 Increment PC PC  0x 251
Now execute the instruction in IR Now execute the instruction in IR
1st execution state: T1  [ R15 ] 1st execution state: (T1  [ R15 ]) T1  0x 251
• What value is loaded into T1? 2nd Execution State : T2  [ T1 ] + sign-extend( imm20 )

Dec 3, 2015 117 Dec 3, 2015 118

Walking Through Scenario Walking Through Scenario


Initially : PC contains 0x 250 Initially : PC contains 0x 250
Fetch: Fetch:
Load instruction into IR IR  0x 3330 0010 Load instruction into IR IR  0x 3330 0010
Increment PC PC  0x 251 Increment PC PC  0x 251
Now execute the instruction in IR Now execute the instruction in IR
1st execution state: (T1  [ R15 ]) T1  0x 251 1st execution state: (T1  [ R15 ]) T1  0x 251
2nd Execution State : T2  [ T1 ] + sign-extend( imm20 ) 2nd Execution State : T2  [ T1 ] + sign-extend( imm20 )
• T2  0x 0000 0251 + 0x 0000 0010 • T2  0x 261
 0x 0000 0261 3rd Execution State: (MAR  [ T2 ]) MAR  0x 261
4th Execution State: MDR  MMem[ [ MAR ] ] 
• MDR  MMem [ 0x 261 ]
5th Execution State: Rd  [ MDR ]
• R3  MMem [ 0x 261 ]
Dec 3, 2015 119 Dec 3, 2015 120
Incremented PC Value is Used in
So … What Gets Loaded Into R3?
PC-Relative Mode
Scenario: PC contains 0x250 PC is incremented as part of the fetching of an
LDR
Address Contents PC-relative instruction
0x250 0x 3330 0010 ; LDR R3, [ #0x10 ] If the instruction uses PC-Relative mode, then the
incremented PC value is used in the instruction’s

execution
0x25A 0x 0000 0001 ; #1

0x260 0x 0000 0002 ; #2
0x261 0x 0000 0003 ; #3

Dec 3, 2015 121 Dec 3, 2015 122

Processor Decode
More Efficient PC-Relative Implementation? Control FSM Register Address Bus (4 bits)

PC-Relative is used in Control Flow instructions, too


CLK
PCOE 0000000000000000 Address A3…A0 CLK
(F3)
0000000000000001
Coming soon! C1OE
16 x 32-bit Registers
1111 RegR R Data
PC Addrs RegW PC (R15)
Recall: Fetching takes 4 states Q W D31…D0
IR D
CE
The last one is overlapped onto the Decode state
IRCE Internal Data Bus (32 bits) IBRead
What is the last thing done in the Fetching? IBWrite
• PC  PC + 1 T1CE
CE D Interconnection MARCE
T1OE OE T1 Bus MAROE
Trick! During Decode state: Q
Interface
MAR MDRCE
MDROE
Save PC + 1 in T1, too !!! How? X Y MDR MDRget
4 ALU
OP
If execution uses PC-Relative Opr MDRput
C R
mode, the PC + 1 is already Interconnection Address Bus
T2CE
in T1 !! T2OE
CE
PC + 1
D Interconnection Data Bus
OE Q IBRead
Dec 3, 2015 123 Dec 3, 2015 IBWrite 124

Revised PC-Relative Implementation Big deal … saves one clock cycle?


LDR (0x33): Requires calculating address and a Main Analysis of program execution reveals that Load/Store
Memory Read: Already in T1 ! instructions comprise ~ 45% to 50% (on average) of all
1st Execution State: T1  [ R15 ] instructions executed!
1st Execution State : T2  [ T1 ] + sign-extend[ imm20 ]
ALU instructions: ~ 30% – 40%
nd
2 Execution State: MAR  [ T2 ]
Control Flow instructions: ~ 15% – 20%
3rd Execution State: Memory Read Cycle
MDR  MMem[ MAR ]  Making Load/Store more efficient helps to improve the
4th Execution State: Rd  [ MDR ] performance of the most common instructions!
NOTE: 100% (ALL) instructions also require fetching!
Can do the same for Revised STR also! Improving ALL memory access would be a good thing!
Cache memory!!! (not here, covered in SYSC 4507)

Dec 3, 2015 125 Dec 3, 2015 126


Differences in Addressing Modes? Why is PC-Relative Mode Useful?
What are the differences among these instructions? Accessing variables!
MOV Rd, #0x100 Recall linked list example: Head Node pointer
LDR Rd, [ #0x100 ] Recall: Had to initialize R8 using 2 instructions:
LDR Rd, [ R15, #0x100 ] MOV R8, #0x100 ; R8  address of Head
LDR Rd, [ R15 ] LDR R8, [ R8 ] ; R8 points to first node
Might have just used PC-Relative mode and only
RTL? Encodings? Opcodes? Operands? needed this single instruction:
What is net effect of executing each? LDR R8, [ #offset_to_Head ]

Dec 3, 2015 127 Dec 3, 2015 128

Visualize #offset_to_Head
Calculating #offset_to_variable when Y < X (Y at lower addrs)
Suppose variable is stored at some address Y and instruction
accessing the variable is stored at some address X
Offset (number of words) from X to Y is Y – X addrs Y Head

Offset is signed!
– Y – ( X + 1 ) = #offset_to_Head negative offset
If Y < X then Y – X is negative, otherwise non-negative
But don’t forget that the PC has already been incremented addrs X LDR R8, [ #offset_to_Head ]
before the instruction calculates the operand address addrs X + 1 …
Therefore, the required #offset_to_variable is Y – ( X + 1 )

Visualize this ! incremented PC

Dec 3, 2015 129 Dec 3, 2015 130

Visualize #offset_to_Head
when X < Y (X at lower addrs)
Do People Calculate Offsets?
Only for labs and exams ☺
incremented PC
addrs X LDR R8, [ #offset_to_Head ] The Assembler does this in practice:
addrs X + 1 …
Allocates memory to use for variables and instructions
Calculates the required offsets
– Y – ( X + 1 ) = #offset_to_Head positive offset Assembly Syntax: Label: defines Name
Directive: DeClare Data
addrs Y Head Head DCD ; reserve a word for
… ; variable named “Head”
LDR R8, [ Head ]
What would happen if X == Y? Offset?
Dec 3, 2015 131 Dec 3, 2015 132
Summary: LDR and STR Addressing Modes In this Deck: Towards the Software
1. ALU instructions manipulate data Registers only
For LDR and STR: need to access memory Constants? Addressing Modes
Addressing modes for memory operand: 2. Load and Store Instructions
Moving values between Registers and Main Memory
Register Addressing Modes
Offset Immediate 3. Control Flow Instructions
Offset Register Unconditional, Conditional
PC relative vs. absolute addressing
PC-Relative PC-Relative
Subroutines more elaborate control flow

In all of the above:


How to realize instructions on microarchitecture
• Add to Hardware as needed!
Instruction Encodings
Program fragments to implement HLL statements
Dec 3, 2015 133 Dec 3, 2015 134

Recall Control Flow in HLL (Like C) Control Flow Instructions


Sequential execution of statements Control Flow modifies the PC Examples in labs?
Change pointer to next instruction to fetch
Unconditional (goto)
Implicit (Default) Control Flow: next sequential address
Conditional statements (if-then-else, switch)
Increment PC during every instruction fetch
Loops (for, while) Explicit Control Flow: instruction execution changes PC
Unconditional & Conditional Execution stage of control flow instruction may write
Functions: call (invoke), return a new value into PC (“may”? conditional!)
During execution of control flow instruction,
overwrite default next sequential address
(incremented during fetch )

Dec 3, 2015 135 Dec 3, 2015 136

Stored Program Concept P MM I/O Programmer


writes this Stored Program Memory
contents at
Interconnection Bus runtime
Instructions and data variables are stored (mixed Assembly Language
Program Main Memory
together) in the same memory Lo addrs
instructions
instructions
Von Neumann (Princeton) Architecture variables (data)
variables (data)
Programmer determines the way instructions Assembler instructions
instructions allocates memory
and data are mixed based on the variables (data)
variables (data)
Determined by the order written in program order that
instructions and
instructions
Programmer must use control flow carefully! … data are written in
the Assembly variables (data)
Language program
instructions
Hi addrs
Dec 3, 2015 137 Dec 3, 2015 138
Stored Program Concept P MM I/O Stored Program @ Runtime
Interconnection Bus Main Memory
Lo addrs
Instructions and data variables are stored (mixed instructions
Example in Lab 6?
together) in the same memory Is a binary value
variables (data)

BUT … everything in memory is just a “binary in memory an


instructions Control Flow: must
Instruction or
value” Data?
variables (data)
ensure that only the
desired instructions
PC just points into memory (and not variables) are
instructions loaded into the IR
When using control flow instructions: must
variables (data)
ensure that only desired instructions (and not
instructions
variables) are loaded into the IR for execution ! Hi addrs

Control Flow: Modifies the PC!


Dec 3, 2015 139 Dec 3, 2015 140

Stored Program @ Runtime Data Access? PC-Relative Mode


Main Memory Main Memory
Lo addrs
Programmer instructions instructions
Control Flow: must
must ensure variables (data) ensure that only the variables (data)
that this never
happens!
X instructions

variables (data)
desired instructions
(and not variables) are
loaded into the IR
PC
-ve offset

+ve offset
instructions

variables (data)

instructions instructions
+ve offset
variables (data) variables (data)

instructions instructions
Hi addrs

How is this managed in a HLL program?


Dec 3, 2015 141 Dec 3, 2015 142

Control Flow Instructions ALU & FLAGS


1. Revisit ALU and FLAGS 4 flags: Z (zero), N (negative), C (carry), V (overflow)
New control signal: FCE ALU operations manipulate data and modify FLAGS
2. Conditions for Conditional Control Flow Except NOP & RY no data manipulation
Condition Codes (cc) – based on FLAGS! When Data Manipulation instructions modify data:
Unconditional is just a special condition
Modify FLAGS to reflect result of manipulation
• Condition = ALWAYS
Data Manipulation instructions ≠ ALU operations
3. General Form: Bcc target (Branch instruction)
4. Examples Data Manipulation Instructions are a category of
instructions that use ALU operations to combine values
Assembly Language
and generate new data values
Implementing HLL control flow

Dec 3, 2015 143 Dec 3, 2015 144


What About Ancillary Use of ALU? What About Ancillary Use of ALU?
Ancillary: providing something additional to a main part or function Ancillary: providing something additional to a main part or function

Is the ALU used for anything other than the execution of Is the ALU used for anything other than the execution of
Data Manipulation (ALU) instructions? Data Manipulation (ALU) instructions?
YES! ALU ADD operation is used in several places
• Incrementing the PC
• Calculating memory operand addresses
– Offset Immediate, Offset Register, PC-Relative
Ancillary use of the ALU (vs. ALU instruction execution)
PROBLEM: Want the FLAGS to reflect results of Data
Manipulation instructions, not results of ancillary use!
SOLUTION: Add FLAGS register and “modify FLAGS”
control signal (FCE)
Dec 3, 2015 145 Dec 3, 2015 146

Revisit ALU and FLAGS register FLAGS are Persistent!


FCE from Control FSM n n
F FLAGS register gives FLAGS persistence!
Operation will modify X Y L
ALU Z A Can set FLAGS using one instruction and then check
FLAGS register OP C
N D G Q FLAGS values later
m S
iff FCE = 1 R V
1. Set FLAGS to reflect conditions associated with
CE
n data manipulation
FCE
2. Check conditions to decide control flow
Control FSM must use FCE appropriately every time an
ALU operation is issued (including NOP and RY) Will look at conditions first, then look at
conditional control flow instruction (Bcc), and
Solved problem of ancillary use of ALU!
then put together with some practical code
Use the FLAGS values latched in the FLAGS Register, examples
not the values straight from the ALU
Use the FCE signal to control when the FLAGS register
changes value
Dec 3, 2015 147 Dec 3, 2015 148

Control Flow Instructions: What’s Next Conditions


1. Revisit ALU and FLAGS Processor supports several conditions that can be used
New control signal: FCE to make control-flow decisions
Different conditions for unsigned vs. signed values
2. Conditions for Conditional Control Flow
Use FLAGS to decide if a condition is true
Condition Codes (cc) – based on FLAGS!
Will use up to 3 FLAGS in each condition
Unconditional is just a special condition • Unsigned: C and Z
• Condition = ALWAYS (regardless of FLAGS values) • Signed: V, N, and Z
3. General Form: Bcc target (Branch instruction)
4. Examples Goal: Understanding table on next slide
Assembly Language Will look at it briefly to see where we are going
Implementing HLL control flow Then discuss details
Then revisit
Dec 3, 2015 149 Dec 3, 2015 150
Code Meaning (for comparing X to Y) Flags Tested
1 eq Equal. Z==1 Meaning of Condition Names
2 ne Not equal. Z==0

3 hs (or cs) Unsigned higher or same (or carry set). C==1


Names chosen based on comparing two value X and Y
4 lo (or cc) Unsigned lower (or carry clear). C==0 Comparison:
5 mi Negative. The mnemonic stands for "minus". N==1 Use ALU to calculate X – Y to set flags, but discard the
6 pl Positive or zero. The mnemonic stands for "plus". N==0 difference (i.e. don’t save numerical result)
7 vs Signed overflow. The mnemonic stands for "V set". V==1
FLAGS reflect whether conditions are true or false for
8 vc No signed overflow. The mnemonic stands for "V clear". V==0
the compared values
9 hi Unsigned higher. (C==1) && (Z==0)
Subtraction Terminology:
10 ls Unsigned lower or same. (C==0) || (Z==1)

11 ge Signed greater than or equal. N==V


difference = minuend – subtrahend
12 lt Signed less than. N!=V minuend X
13 gt Signed greater than. (Z==0) && (N==V) subtrahend Y
14 le Signed less than or equal. (Z==1) || (N!=V)
Dec 3, 2015 al
15 Always 151
None tested Dec 3, 2015 152

Checking Signed vs. Unsigned Values Relevant FLAGS


Different FLAGS are set to indicate result of operations Unsigned: C, Z
under signed and unsigned interpretations Signed: N, V, Z
Recall: Confirm on Table! (next 2 slides)
C = 1 indicates overflow for unsigned values Note terminology:
V = 1 indicates overflow for signed values
Condition Unsigned Signed
Furthermore:
X>Y X higher than Y X greater than Y
N flag only has meaning for signed values
Z (zero) flag has meaning under both interpretations X<Y X lower than Y X less than Y

Programmer must know which conditions to test when X==Y X same as Y X equal to Y
dealing with different types of values!
Dec 3, 2015 153 Dec 3, 2015 154

Unsigned: C, Z Signed: N, V, Z
Code Meaning (for comparing X to Y) Flags Tested
Code Meaning (for comparing X to Y) Flags Tested Equal
1 eq
( X == Y )
Z==1
Equal
1 eq
( X == Y ) Z==1 2 ne
Not equal
Z==0
( X != Y )
Not equal Negative. The mnemonic stands for "minus“
2 ne
( X != Y ) Z==0 5 mi
(X–Y<0)
N==1

Positive or zero. The mnemonic stands for "plus“


Unsigned higher or same (or carry set) 6 pl
( X – Y >= 0 )
N==0
3 hs (or cs)
( X >= Y ) C==1
Signed overflow. The mnemonic stands for "V set“
7 vs
( X – Y overflowed)
V==1
Unsigned lower (or carry clear)
4 lo (or cc)
(X<Y) C==0 No signed overflow. The mnemonic stands for "V clear“
8 vc
( X – Y did not overflow)
V==0
Unsigned higher
9 hi
(X>Y) (C==1) && (Z==0) 11 ge
Signed greater than or equal
N==V
( X >= Y )
Unsigned lower or same Signed less than
10 ls (C==0) || (Z==1)
12 lt
(X<Y)
N!=V
( X <= Y )
Signed greater than
13 gt (Z==0) && (N==V)
(X>Y)
Signed less than or equal
14 le (Z==1) || (N!=V)
Dec 3, 2015 155 Dec 3, 2015 ( X <= Y ) 156
NOTE: What’s Left in Table?
Slides deriving the flag settings for the unsigned and The Always condition: (code = 15)
signed conditions in the condition code table are Code Meaning (for comparing X to Y) Flags Tested
included in an Appendix at the end of this slide deck 15 al Always None tested.

The content is included for completeness, but will Condition is Always true!
not be covered in the course (not enough time!)
Flags have no bearing on the condition being true
We will assume that the flags are always set as
specified whenever the associated condition is true
Still need to understand the table, the conditions
and the flag settings

Dec 3, 2015 157 Dec 3, 2015 158

Some Conclusions About the Table How to Compare X to Y?


These are the conditions supported by the Processor Could use SUB instruction:
The comparison of two values (X and Y) was used to SUB R5, R12, R9
derive the meaning of the conditions Would compare X (R12) to Y (R9)
Conditions are provided for both Unsigned and Signed But … would also save difference (? in R5 )
interpretations of X and Y When doing a “compare” don’t want to save difference
The FLAGS can be set by ALU instruction other than the Just set the FLAGS
sequence used in the calculation of X – Y CMP instruction ☺
example coming later … • Only purpose of CMP instruction is to set FLAGS
based on comparison of 2 values (i.e. X – Y)
How/Why the FLAGS are set is a programming concern
Can only test for FLAG conditions specified in table!

Dec 3, 2015 159 Dec 3, 2015 160

CMP Rx, Ry CMP Rx, Ry Implementation


Compare Instruction: CMP Rx, Ry RTL: FLAGS  FLAGS( [ Rx ] + ( NOT( [ Ry ] ) + 1) )
calculate Rx + ( NOT( Ry ) + 1 ) to set FLAGS, but Implementation:
discard the difference 1st execution state: T2  NOT [ Ry ] (FCE = 0)
OpCode: 0x 47 (Chosen carefully! ALU NOT opr = 7) 2nd execution state: T1  [ T2 ]
bit
3
1
3
0
2
9
2
8
2
7
2
6
2
5
2
4
2
3
2
2
2
1
2
0
1
9
1
8
1
7
1
6
1
5
1
4
1
3
1
2
1
1
1
0
9 8 7 6 5 4 3 2 1 0 3rd execution state: T2  [ T1 ] + 1 (FCE = 0)
use 47 Rx Ry Not used (always 0)
4th execution state: T1  [ T2 ]
hex H7 H6 H5 H4 H3 H2 H1 H0 5th execution state: --  [ T1 ] + [ Rx ] (FCE = 1)

Form 2 !!! No new hardware


Need new execution states Discard difference (i.e. don’t save result in a register)

Dec 3, 2015 161 Dec 3, 2015 162


CMP Rx, #imm CMP Rx, #imm Implementation
Compare Instruction: unsigned immediate operand RTL:
calculate Rx + ( NOT( 0-extend( imm ) ) + 1 ) to set FLAGS  FLAGS( [ Rx ] + ( NOT( 0-extend( imm20 ) ) + 1 )
FLAGS, but discard the difference Implementation:
OpCode: 0x 57 (Chosen carefully! ALU NOT opr = 7) 1st execution state: only change

bit
3
1
3
0
2
9
2
8
2
7
2
6
2
5
2
4
2
3
2
2
2
1
2
0
1
9
1
8
1
7
1
6
1
5
1
4
1
3
1
2
1
1
1
0
9 8 7 6 5 4 3 2 1 0 T2  NOT 0-extend( imm20 ) (FCE = 0)
nd
2 execution state: T1  [ T2 ]
use 57 Rx Imm20
rd
3 execution state: T2  [ T1 ] + 1 (FCE = 0)
hex H7 H6 H5 H4 H3 H2 H1 H0
th
4 execution state: T1  [ T2 ]
Form 5 !!! (Recall MOV immediate) No new hardware 5th execution state: --  [ T1 ] + [ Rx ] (FCE = 1)
Need new execution states
Discard difference (i.e. don’t save result in a register)
Dec 3, 2015 163 Dec 3, 2015 164

Only CMP Unsigned imm Values? CMN Example


CMN: Compare Negative instruction Suppose want to check if value in R6 > – 3
Specify second operand as a positive value CMN R6, #3
Compare first operand to the negation of the
second operand What condition should be checked?
• Calculate Rx – (negation of imm) What is the condition’s code (cc)?
= Rx + ( 2’s complement( negation of imm ) ) [table is on next slide]
Syntax: CMN Rx, #imm20
--  [ Rx ] + ( NOT( NEG( imm20 ) ) + 1 )
--  [ Rx ] + 0-extend( imm20 ) ☺

FCE = 1
Dec 3, 2015 165 Dec 3, 2015 166

Code Meaning (for comparing X to Y) Flags Tested


1 eq Equal. Z==1 Both Interpretations at Same Time
2 ne Not equal. Z==0

3 hs (or cs) Unsigned higher or same (or carry set). C==1


The ALU manipulates binary values
4 lo (or cc) Unsigned lower (or carry clear). C==0 Values have signed and unsigned interpretations
5 mi Negative. The mnemonic stands for "minus". N==1 Both of the sets of FLAGS (unsigned {C, Z} and
6 pl Positive or zero. The mnemonic stands for "plus". N==0 signed {N, V, Z}) are modified at the same time for
7 vs Signed overflow. The mnemonic stands for "V set". V==1 every operation
8 vc No signed overflow. The mnemonic stands for "V clear". V==0
Every ALU operation sets FLAGS for both
9 hi Unsigned higher. (C==1) && (Z==0)
interpretations at the same time
10 ls Unsigned lower or same. (C==0) || (Z==1)
Programmer knows which interpretation is relevant
11 ge Signed greater than or equal. N==V
and must use the right FLAGS conditions
12 lt Signed less than. N!=V

13 gt Signed greater than. (Z==0) && (N==V)

14 le Signed less than or equal. (Z==1) || (N!=V)


Dec 3, 2015 al
15 Always 167
None tested. Dec 3, 2015 168
Code Meaning (for comparing X to Y) Flags Tested
Different Use of Conditions 1 eq Equal. Z==1

2 ne Not equal. Z==0


Suppose execute: AND R7, R8, #0x0F0
3 hs (or cs) Unsigned higher or same (or carry set). C==1
What does the instruction accomplish? 4 lo (or cc) Unsigned lower (or carry clear). C==0
0000 0000 0000 0000 0000 0000 1111 0000 32-bit 0xF0
5 mi Negative. The mnemonic stands for "minus". N==1
R8 XXXX XXXX XXXX XXXX XXXX XXXX YYYY XXXX
6 pl Positive or zero. The mnemonic stands for "plus". N==0

What condition can test if the result is 0? 7 vs Signed overflow. The mnemonic stands for "V set". V==1

[table on next slide] 8 vc No signed overflow. The mnemonic stands for "V clear". V==0

Does the name of the condition make sense in 9 hi Unsigned higher. (C==1) && (Z==0)

this case? 10 ls Unsigned lower or same. (C==0) || (Z==1)

Does the name really matter as long as there is 11 ge Signed greater than or equal. N==V

an appropriate condition to use after AND sets 12 lt Signed less than. N!=V

the FLAGS? 13 gt Signed greater than. (Z==0) && (N==V)

14 le Signed less than or equal. (Z==1) || (N!=V)


Dec 3, 2015 169 Dec 3, 2015 al
15 Always 170
None tested.

Control Flow Instructions: What’s Next Branch Instruction


1. Revisit ALU and FLAGS Syntax:
New control signal: FCE Bcc PC-Relative-offset
2. Conditions for Conditional Control Flow B Branch instruction mnemonic
Condition Codes (cc) – based on FLAGS! OpCode: 0x 80
cc condition code mnemonic (from table)
Unconditional is just a special condition
E.G. BEQ for Branch-Equal, BGT for Branch-greater-than
• Condition = ALWAYS (regardless of FLAGS values)
PC-Relative-offset: PC-relative (signed) offset to next
3. General Form: Bcc target (Branch instruction) instruction to execute if condition is true
4. Examples RTL:
Assembly Language IF (current FLAGS satisfy condition specified by cc)
Implementing HLL control flow THEN PC  [ PC ] + sign-extend( PC-Relative-offset )
remember: using incremented PC value!!
Dec 3, 2015 171 Dec 3, 2015 172

General Form: Bcc PC-Relative-offset Terminology


3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1
bit 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
9 8 7 6 5 4 3 2 1 0
The Branch is taken iff the condition specified by
use 80 cc imm20 the cc is true
hex H7 H6 H5 H4 H3 H2 H1 H0 taken: PC is modified (explicit control)
Seen form before? No! Form 6 not taken: PC is not modified (implicit control)
4-bit cc The address of the instruction that is to be executed
(signed) imm20 after taking the Branch is called the target of the
Branch
Need new hardware to deal with cc !!

Dec 3, 2015 173 Dec 3, 2015 174


Implementing the Branch RTL Integrating with Control FSM
IF (current FLAGS satisfy condition specified by cc) Recall: MUX allows selection of next state
THEN PC  [ PC ] + sign-extend( PC-offset ) “Normally”: from Outputs ROM
Use cc to decide whether to execute “THEN” part In Decode State: from Decode ROM
cc must be integrated with Control FSM “next state” logic
IF cc false don’t perform rest of instruction
• Instead, get default next instruction (go to FSM state 0)
If cc true: perform the “THEN” part
Incremented PC is already in T1 !
1. T2  [ T1 ] + sign-extend( PC-Relative-offset )
Decode State Detector
2. PC  [ T2 ]

Dec 3, 2015 175 Dec 3, 2015 176

Introduce cc MUX to integrate cc “Normal” Operation


When Branch instruction requires cc to be true, but cc “Normal” = not Decode state and not Branch instruction
is false go to state 0x00 (get default next instruction) i.e. most of the time
From Outputs ROM From Outputs ROM
0 0
0 0
From Decode ROM cc To State Register From Decode ROM cc Next State
1 1
MUX MUX From Outputs ROM
0x00 1 0x00 1

This is the MUX on 1 1 0


the previous slide iff in 1 iff in not in 0 1
Decode iff Next State Decode Decode Next State does not iff Next State
state requires cc true but state state require cc true requires cc true but
(as before) cc is false cc is false

Dec 3, 2015 177 Dec 3, 2015 178

1st Execution State of Branch


In Decode State
Instruction AND cc is false
Decode State Branch instruction: cc is false go to state 0x00
(get default next instruction)
From Outputs ROM From Outputs ROM
0 0
0 0
From Decode ROM 1
cc Next State From Decode ROM 1
cc Next State
MUX From Decode ROM MUX = 0x00
0x00 1 0x00 1
1 1 1 0
iff in in Decode 0 1 iff in not in 1 1
Decode state Next State does not iff Next State Decode Decode Next State requires iff Next State
state require cc true requires cc true but state state cc true but cc is false requires cc true but
cc is false cc is false

Recall Branch Execution States:


1. T2  [ T1 ] + sign-extend( PC-Relative-offset )
Dec 3, 2015 179 Dec 3, 2015 2. PC  [ T2 ] 180
1st Execution State of Branch
cc MUX Selection Signal
Instruction AND cc is true
Branch instruction: cc is true continue to next 1 iff Next State requires cc true but cc is false
state in Branch instruction Next State Requires cc to be true:
From Outputs ROM
0 control bit from Control FSM Outputs ROM
0 Next State Call the bit: cTest
From Decode ROM cc From Outputs ROM
1
0x00
MUX (e.g. 2nd exec. state of Need circuit to generate cTrue signal (1 iff cc true)
1 Branch instruction)
1 0 Then: 0
iff in not in 0 1 cc
Decode Decode iff Next State MUX
Next State requires 0x00 1
state state cc true and cc is true requires cc true but 1
cc is false iff Next State
requires cc true but
Recall Branch Execution States: cc is false
1. T2  [ T1 ] + sign-extend( PC-Relative-offset ) cTrue cTest
Dec 3, 2015 2. PC  [ T2 ] 181 Dec 3, 2015 182

cc cTrue = 1

cTrue Circuit (4-bit) iff


FLAGS
cTrue Circuit
1 eq (Z==1) C Z NV vs. 8 variable K-Map?
2 ne (Z==0)
Evaluate all conditions in parallel
3 hs (or cs) (C==1)
from FLAGS reg cc = 0: not in table
FLAGS 4 lo (or cc) (C==0) logic for cc = 1
C Z NV (equal)
5 mi (N==1) 0 0
Z
1
6 pl (N==0)

16-Input Multiplexer
...
7 vs (V==1)
Implementation? cTrue 8 vc (V==0)
logic for cc = i i cTrue
8 Variable K-Map?
9 hi ((C==1) && (Z==0)) ...
10 ls ((C==0) || (Z==1))
+ N != V
11 ge (N==V) 14
Z
cc 12 lt (N!=V) 1 15
logic for cc = 14 ALWAYS Use cc to select
13 gt ((Z==0) && (N==V)) (less than or equal)
condition code the relevant
((Z==1) || (N!=V))
(cc) from IR 14 le ((Z==1) || (N!=V)) condition
Dec 3, 2015 183 Dec 3, 2015 184
15 al ALWAYS cc

cTrue in Lab General Form: Bcc PC-Relative-offset


3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1
bit 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
9 8 7 6 5 4 3 2 1 0

use 80 cc imm20

hex H7 H6 H5 H4 H3 H2 H1 H0
Seen form before? No! Form 6
Need new hardware to deal with cc
cc Mux & selection circuit
cTest control signal from FSM
cTrue Circuit
connect cc from IR (bits 23 – 20) to cc inputs of cTrue
Connect FLAGS register outputs to FLAGS inputs of cTrue
Present in Lab 7 circuit

Dec 3, 2015 185 Dec 3, 2015 186


Recall: Implementing the Branch RTL Unconditional Branch
IF (current FLAGS satisfy condition specified by cc) Always take the branch … regardless of flag settings
THEN PC  PC + sign-extend( PC-offset ) Condition 15 (al) ALWAYS
Use cc to decide whether to execute “THEN” part Could write as: BAL target-offset
cc must be integrated with Control FSM “next state” logic Instead write as: B target-offset
IF cc false don’t modify PC
Omit the “al” condition (al is implicitly assumed)
• Instead, loop to get next instruction (go to FSM state 0)
Hardware still handles as a conditional branch
“THEN” part
• Even though the condition is always true
Incremented PC is already in T1 !
• Recall cTrue circuit (slides 183 and 184)
1. T2  [ T1 ] + sign-extend( PC-offset ) cTest = 1 here!
2. PC  [ T2 ]
Only get here if cc condition is true!

Dec 3, 2015 187 Dec 3, 2015 188

Control Flow Instructions: What’s Next Target Assembly Language Syntax


1. Revisit ALU and FLAGS Need way to deal with control flow targets
New control signal: FCE Declare: defines a label (address) for use as a target
2. Conditions for Conditional Control Flow Reference: uses target in a Branch instruction
Condition Codes (cc) – based on FLAGS! Label: identifies an address (e.g. target)
Unconditional is just a special condition
• Condition = ALWAYS (regardless of FLAGS values) Declare label: “A_Label”
A_Label
3. General Form: Bcc target (Branch instruction)
… ; other instructions and declarations
4. Examples
BEQ A_Label Reference “A_Label” as target
Assembly Language: targets as labels
Implementing HLL control flow
Declaration can appear before or after Reference!!
Dec 3, 2015 189 Dec 3, 2015 190

Declaring a Label Label Declaration Examples


Label is a string of characters
Allowable: alphabetic: UPPER or lower case, (assume all start in first column of program text)
case sensitive, numeric: 0 - 9, other: _ DoThat MOV R1, #0
Must start with alphabetic there is a tab between the label and instruction
Followed immediately by:
DoThatAgain MOV R1, #0
Space, end-of-line, tab, comment (“;”)
Must start in first column of an assembly language there are spaces between the label and instruction
statement Calculate_the_Average
Cannot be a “reserved” word that has a defined
meaning in the assembly language there is an “Enter” following the label
E.G. instruction mnemonics, like MOV, and register host2net; there is a comment after the label
names, like R0 and PC
Must be unique in the program: can’t declare two labels
Dog8Homework ; there is a space after the label
with the identical string of characters
Dec 3, 2015 191 Dec 3, 2015 192
Referencing a Label in a Branch What Address does a Label Identify?
Recall Bcc: PC-Relative operand to the target The address of the next word used in the program
To reference the target, just give the target’s label DoThat MOV R1, #0
Assembler calculates the offset that gets encoded DoThat identifies (names) the address of the
into the instruction! word that contains the MOV instruction
E.G: for label “DoThat” Assembled information:
… ; other instructions and declarations Address Contents Assembly Language
B DoThat Reference “DoThat” as target 0000 0100 2310 0000 DoThat MOV R1, #0
… ; other instructions and declarations
DoThat MOV R1, #0 DoThat = 0x100 (in this example)
Declare “DoThat” as label

Dec 3, 2015 Declaration can appear before or after Reference!! 193 Dec 3, 2015 194

What Address does a Label Identify? What Address does a Label Identify?
The address of the next word used in the program The address of the next word used in the program
DoThat ; on a line by itself
DoThat ; on a line by itself
(a blank line)
MOV R1, #0 ; followed by another comment line
DoThat identifies (names) the address of the MOV R1, #0
word that contains the MOV instruction DoThat identifies (names) the address of the word that
contains the MOV instruction
Assembled form: Assembled form:
Address Contents Assembly Language Address Contents Assembly Language
0000 0100 DoThat 0000 0100 DoThat ; (comment)
(blank line)
0000 0100 2310 0000 MOV R1, #0 ; (comment)
0000 0100 2310 0000 MOV R1, #0
Dec 3, 2015 195 Dec 3, 2015 196

What About This? If-Then Example


DoThat ; label declaration Consider a simple HLL if-then-else statement:
DoThis ; label declaration if ( i < 10 ) {
MOV R1, #0 // do something …
}
If the value of variable i is less than 10, then the
Have we seen anything similar to this before?
statements in the “do something” block (i.e. the body
of the if-then statement) are executed
Otherwise, the statements in the “do something”
block are not executed and control is transferred to
the first statement after the “do something” block

Dec 3, 2015 197 Dec 3, 2015 198


Code Meaning (for comparing X to Y) Flags Tested
If-Then Example (con’t) 1 eq Equal. Z==1

2 ne Not equal. Z==0


The HLL statement might be assembled into:
3 hs (or cs) Unsigned higher or same (or carry set). C==1
LDR R4, [ i ] ; R4  i (uses PC-Relative Mode)
4 lo (or cc) Unsigned lower (or carry clear). C==0
CMP R4, #10 ; R4 < 10?
B?? After_if 5 mi Negative. The mnemonic stands for "minus". N==1

… ; instructions for “do something” block go here 6 pl Positive or zero. The mnemonic stands for "plus". N==0

After_if ; label declaration of 7 vs Signed overflow. The mnemonic stands for "V set". V==1

; first instruction after “do something” block 8 vc No signed overflow. The mnemonic stands for "V clear". V==0

9 hi Unsigned higher. (C==1) && (Z==0)


The value of the variable i is first loaded into R4, and then 10 ls Unsigned lower or same. (C==0) || (Z==1)
compared to the constant 10 11 ge Signed greater than or equal. N==V
What condition should be used on the B instruction? 12 lt Signed less than. N!=V
• Condition Table on next slide …
13 gt Signed greater than. (Z==0) && (N==V)
Does the type of variable i matter?
14 le Signed less than or equal. (Z==1) || (N!=V)
Dec 3, 2015 199 Dec 3, 2015 al
15 Always 200
None tested.

Another CMP Example Example: Linear Search of an Array


Recall code to walk array: Want to find index of a “value of interest”
for ( i = 0; i < 10; i++ ) { Is the value of interest stored in an element of the
… do something with Cur.Samples[ i ] … array, and if yes, what is the index of that element?
} Initially: Address of start of array is in R4
Assume R9 is loaded with the index variable i Initially: R5 contains the number of elements in the
Want to compare to see if i has exceeded the array size array
CMP R9, #10 Initially: Value of interest is in R3
What condition should be checked here? Will finish with either R6 = the index of the array
element that contains the value of interest, or R6 = -1
[table on previous slide] (if value of interest is not in the array)
Does the type of variable i matter?
Dec 3, 2015 201 Dec 3, 2015 202

Assembly Language for Search Example


HLL Pseudo-Code for Search Example MVN R6, #0 ; R6 = – 1
R6 = – 1; // initially, the value has not yet been found MOV R7, #0 ; initialize loop R7 = 0 for loop
for ( R7 = 0; R7 < R5; R7++ ) // R5 == array size TestForDone CMP R7, R5 ; R7 < R5?
{ if ( array[ R7 ] == R3 ) // found! BGE DoneFor ; No – done for loop!

{ R6 = R7; // save index


LDR R8, [R4,R7] ; get element array[R7]
CMP R8, R3 if ; element == value of interest?
break; // exit loop
BNE IncR7 ; No – continue in loop
} if-then MOV R6, R7 ; Yes – save index
} then
for loop B DoneFor ; – break if-then
IncR7 ADD R7,R7,#1 ; R7++
// continue here … R6 has the index (or – 1 ) B TestForDone 11 words,
DoneFor ; continue 4 Branch
Dec 3, 2015 203 Dec 3, 2015
instructions 204
Variation 1 on Search Example Variation 2 on Search Example
MVN R6, #0 ; R6 = – 1 MVN R6, #0 ; R6 = – 1
MOV R7, #0 ; initialize loop R7 = 0 MVN R7, #0 ; initialize loop R7 = – 1 then inc
B TestForDone ; test for done at end of loop! IncR7 ADD R7,R7,#1 ; R7++  moved to here!
DoFor LDR R8, [R4,R7] ; get element array[R7]
TestForDone CMP R7, R5 ; R7 < R5?
CMP R8, R3 ; element == value of interest?
BGE DoneFor ; No – done!
BNE IncR7 ; No – continue in loop
LDR R8, [R4,R7] ; get element array[R7]
MOV R6, R7 ; Yes – save index
B DoneFor ; – break if-then
CMP R8, R3 ; element == value of interest?

IncR7 ADD R7,R7,#1 ; R7++ BNE IncR7 ; No – continue in loop


TestForDone CMP R7, R5 ; R7 < R5? MOV R6, R7 ; Yes – save index
BLT DoFor ; Yes – do loop body again! DoneFor ; continue 9 words,
DoneFor ; continue 11 words, 2 Branch
4 Branch instructions
Dec 3, 2015 205 Dec 3, 2015 206
instructions

Does Variation 2 “Just” Save 2 Words Comparing Loop Executions


of Memory?
Saves 2 words smaller memory footprint
In original version, each loop iteration executes 7
instructions
In variations 1 & 2, each loop iteration executes 6
instructions ~15% faster!

Original: 7 instructions Variation 1: 6 instructions Variation 2: 6 instructions

Dec 3, 2015 207 Dec 3, 2015 208

Consider Loop to Initialize Values in


Which Search is “Better”?
Array[5] to -1
Footprint Iterate loop when
Pseudo-Code:
(words) not found
Assume: R4 has array address
(instructions)
for ( R7 = 0; R7 < 5 , R7++)
Original 11 7
{
array[ R7 ] = -1;
Variation 1 11 6
}

Variation 2 9 6
initialize loop, then perform 5 iterations of loop body

Dec 3, 2015 209 Dec 3, 2015 210


Loop to Initialize Values in Array[5] to -1 Variation: Initialize Values in Array[5] = -1
Assume: R4 has array address initialize 1 instr Assume: R4 has array address initialize 2 instr
MOV R7, #0 ; initialize R7 = 0 MOV R7, #0 ; initialize R7 = 0
TestForDone CMP R7, #5 ; R7 < 5? MVN R6, #0 ; R6 = – 1
BGE DoneFor ; No – done! loop body TestForDone CMP R7, #5 ; R7 < 5? loop body
MVN R6, #0 ; R6 = – 1 6 instr’s BGE DoneFor ; No – done! 5 instr’s
STR R6, [R4,R7] ; array[R4,R7] = – 1 STR R6, [R4,R7] ; array[R4,R7] = – 1
ADD R7,R7,#1 ; R7++ ADD R7,R7,#1 ; R7++
B TestForDone B TestForDone
DoneFor ; continue DoneFor ; continue
Footprint: 1 (init) + 6 (body) = 7 words Footprint: 2 (init) + 5 (body) = 7 words
Execute: 1 (init) + 5 x 6 (body) + 2 (exit) Execute: 2 (init) + 5 x 5 (body) + 2 (exit)
Dec 3, 2015 = 33 instructions 211 Dec 3, 2015 = 29 instructions (vs. 33) 212

Loop Unrolling: Initialize Values in Version 2 Loop Unrolling to


Assume: R4 has array address Array[5] to -1 Initialize Values in Array[5] to -1
MOV R7, #0 ; initialize R7 = 0 Assume: R4 has array address
MVN R6, #0 ; R6 = – 1 MVN R6, #0 ; R6 = – 1
STR R6, [R4,R7] ; array[R4,R7] = – 1 MOV R7, #0 ; initialize R7 = 0
ADD R7,R7,#1 ; R7++ STR R6, [R4,#0] ; array[R4,R7] = – 1
STR R6, [R4,R7] ; array[R4,R7] = – 1 ADD R7,R7,#1 ; R7++
ADD R7,R7,#1 ; R7++ STR R6, [R4,#1] ; array[R4,R7] = – 1
STR R6, [R4,R7] ; array[R4,R7] = – 1 ADD R7,R7,#1 ; R7++
ADD R7,R7,#1 ; R7++ STR R6, [R4,#2] ; array[R4,R7] = – 1
STR R6, [R4,R7] ; array[R4,R7] = – 1 ADD R7,R7,#1 ; R7++
ADD R7,R7,#1 ; R7++ STR R6, [R4,#3] ; array[R4,R7] = – 1
STR R6, [R4,R7] ; array[R4,R7] = – 1 ADD R7,R7,#1 ; R7++
DoneFor ; continue STR R6, [R4,#4] ; array[R4,R7] = – 1
Footprint: 11 words (vs. 7) 57% larger footprint DoneFor ; continue
Execute: 11 instructions!! (vs. 33 vs. 29) Up to 66% execution saving!! Footprint: 6 words (vs. 7)
Dec 3, 2015 213 Execute:
Dec 3, 2015 6 instructions!! (vs. 33 vs. 29) 214
but wait …

But …Could Programmer Write in a Way


A Note about Compilers
That Lets Compiler Be More Efficient?
A compiler performs the translation from HLL into
assembly language Consider Pseudo-Code:
“Debug mode”: will do very simple translations Assume: R4 has array address
Easy (?) for people to understand, “obvious” for ( R7 = 4; 0 <= R7, R7– – )
“Production mode”: will optimize, depending on {
optimization settings
E.G. comparing different implementation possibilities array[ R7 ] = -1
& picking “best” … move setting of R6 = –1 outside of }
the loop, loop unrolling
People “understanding” is secondary to performance
criteria

Dec 3, 2015 215 Dec 3, 2015 216


Variation: Initialize Values in Array[5] = -1 Where to Now?
Assume: R4 has array address initialize 3 instr Control Flow: Bcc instructions
MOV R7, #5 ; initialize index R7 = 4 + 1 (then SUB) Hardware to implement
MVN R6, #0 ; R6 = – 1 Examples
B TestForDone Functions
InitElem STR R6, [R4,R7] ; array[R7] = – 1 Declaration vs. Invocation
TestForDone SUB R7,R7,#1 ; R7 – – How to invoke? How to “return”?
loop body
BPL InitElem 3 instr’s Parameter passing
DoneFor ; continue • Parameters in, return value out
Footprint: 3 (init) + 3 (body) = 6 words (vs. 7) • Pass by value, pass by reference
Execute: 3 (init) + 5 x 3 (body) = 18 instructions (vs. 29)
Dec 3, 2015 217 Dec 3, 2015 218

HLL Functions HLL Function Declaration


Declaration: Declaration:
Header (name and parameters and return type) Header (name and parameters and return type)
Body Body
Return … return value? Return … return value?
UpCase Example: // assumes ASCII encoding of char header
char UpCase( char X ) char UpCase( char X ) body

{ if ( (‘a’ <= X ) && (X <= ‘z’) ) { if ( (‘a’ <= X ) && (X <= ‘z’) )
{ X = X – 0x20; } { X = X – 0x20; }
return X; return X; return, return value

} }
Dec 3, 2015 219 Dec 3, 2015 220

HLL Function Invocation Function Implementation


“calling” the function from some other code @ H/S Interface: Both the function code and the
1. Supply arguments for parameters invocation code have to be implemented using
2. … then use the returned value (if there is one) assembly language instructions
UpCase Example: // assume null-terminated string str
i = 0; Invocation Code Function Code
while ( strg[ i ] != null ) // null = 0x00 instr UpCase func instr
Control
{ Invoke flow instr func instr
strg[ i ] = UpCase ( strg[ i ] ); UpCase call-instr ….
i++;
Next instr func instr
}
1. Invoke UpCase argument = strg[ i ] Function code must return-instr
2. “After” UpCase invocation: assign “After” UpCase “know” where to
returned value to strg[i] return to !!
invocation … use
Dec 3, 2015 221 Dec 3, 2015
return value 222
Terminology Call / Return
The instruction that transfers control from the Important: Subroutine code must “know” the return
invocation code to the function code is the calling address in order to return successfully after the
(or call) instruction subroutine has executed
The instruction that transfers control from the Could the return address be hard-coded into the
function code back to the invocation code is the subroutine?
returning (or return) instruction In the example: could returning be as simple as
having the UpCase subroutine include:
The target address that the return instruction
transfers control to is the return address B Next
• Yes for the example ... But NO in general! …
In the example, the return address is Next
what if the subroutine is invoked from 2 different
In assembly language, the equivalent of HLL
functions are called subroutines places in the program?

Dec 3, 2015 223 Dec 3, 2015 224

Subroutine Implementation Call / Return


Invocation Code Function Code Must allow subroutine to be invoked from different places in
program
instr UpCase subr instr Therefore, can’t hard-code return address into subroutine
Control
flow instr 1 subr instr When invoking subroutine, the return address (address of
the “Next” instruction after the calling instruction) is already
call-instr 2
…. in the PC!!
Next1 instr subr instr PC was incremented when the calling instruction was
fetched!
Control … return-instr
flow Can pass the return address (i.e. contents of PC) into the
instr subroutine as an implicit parameter to every subroutine!
Subroutine code must return The return address is always passed, but is not included
call-instr to different addresses in each as an explicit parameter in the function header
Next2 instr invocation!!
1) Return Address = Next1
2) Return Address = Next2

Dec 3, 2015 225 Dec 3, 2015 226

ARM Implementation of Calling


BLcc Instruction
Instruction Similar to Bcc … just includes the link behaviour
When the calling instruction executes, it transfers Opcode: 0x 83 (chosen to include RY ALU opr = 0x3)
control to the subroutine AND places the return
Syntax: BLcc imm20
address value in R14 3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1
bit 9 8 7 6 5 4 3 2 1 0
Return address is available to subroutine! 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0

use 83 cc imm20
Calling instruction: Branch and Link (BL)
hex H7 H6 H5 H4 H3 H2 H1 H0
R14 is the Link Register (LR)
RTL: IF (cc is true)
This approach is different than Intel processors,
which have fewer registers and use the run-time THEN LR  [ PC ]
stack to hold the return address PC  [ PC ] + sign-extend( imm20 )
We will talk about the run-time stack later Seen form before? Yes! Form 6

Dec 3, 2015 227 Dec 3, 2015 228


BLcc Implementation What About Returning?
Incremented PC is already in T1 ! At end of subroutine, execution must transfer
BLcc Execution states: control back to the return address
1. T2  [ PC ] (use RY ALU opr for this!) cTest = 1 here! Return address in is LR (Link Register, R14)
2. R14  [ T2 ] only get to this state if cc is true Returning Instruction:
3. T2  [ T1 ] + sign-extend( PC-offset ) MOV R15, R14
4. PC  [ T2 ] Copies contents of LR (i.e. return address) into
Don’t modify LR (R14) unless cc is true! the PC (R15)
Need hardware addition to address LR (R14) in state 2 Next instruction will be fetched from the return
similar to addressing PC Reg Addrs Bus address
Add 4-bit constant = 0b1110, with LROE
control FSM Outputs signal LROE
1110
Dec 3, 2015 229 Dec 3, 2015 230

Where are We? Parameter Passing


HLL Functions Subroutines The BL instruction passes the return address into the
subroutine (in LR) as an implicit parameter
Declaration vs. Invocation
Uses a register to pass the argument
How to invoke? How to “return”? Places value in register before executing call
Parameter passing Value is available to subroutine to use as required
• Parameters in, return value out Additional explicit arguments can also be passed into a
subroutine using registers
• Pass by value, pass by reference
Return value can be passed back in a register, too!
Invocation code and subroutine code must agree on
Let’s talk briefly about parameter passing and then which registers to use
revisit the UpCase example i.e. subroutine must know which register(s) hold the
parameter(s)

Dec 3, 2015 231 Dec 3, 2015 232

Calling Convention Recall UpCase Example


Most developers (and compilers) use a consistent Header: char UpCase( char X )
convention to pass parameters in registers Subroutine name is UpCase
SYSC 3006 Parameter Passing Convention: Use this as target for invocation using BL
1. Write a header for the subroutine X is the only input parameter … it is an ASCII char value
• Identify: return value, name, parameter list that is passed into the subroutine in R0
Invocation code must put the value for X into R0
2. Use R0 to pass 1st param in list, R1 for 2nd … etc. before invoking the UpCase subroutine
• Left-to-right or vice versa? Left-to-right! The subroutine code assumes that the value of X is in
• Limit max. number of params in list = 4 R0 when the subroutine starts to execute
• More than 4? Outside scope of this course! Return Value is of type ASCII char … will be returned
3. If there is a return value, return it in R0 from subroutine in least signif. nybble of R0
Rest of R0 = 0

Dec 3, 2015 233 Dec 3, 2015 234


Recall Function Code UpCase Subroutine Code
char UpCase( char X ) ; on entry: R0 = X (char to convert)
// subroutine name = UpCase ; and R14 = return address
// on execution, R0 = X & R14 = return address UpCase CMP R0, #’a’ ; if ( ( ‘a’ <= X ) ?
{ if ( (‘a’ <= X ) && (X <= ‘z’) ) B??  What to do here?
{ X = X – 0x20; } (see HLL on previous slide)
return X; // on return, R0 = converted X What condition? (table on next slide)
}

Dec 3, 2015 235 Dec 3, 2015 236

Code Meaning (for comparing X to Y) Flags Tested


1 eq Equal. Z==1 UpCase Subroutine Code
2 ne Not equal. Z==0
; on entry: R0 = X (char to convert)
3 hs (or cs) Unsigned higher or same (or carry set). C==1

4 lo (or cc) Unsigned lower (or carry clear). C==0


; and R14 = return address
5 mi Negative. The mnemonic stands for "minus". N==1 UpCase CMP R0, #’a’ ; if ( ( ‘a’ <= X ) ?
6 pl Positive or zero. The mnemonic stands for "plus". N==0 BLO DoneUpCase ; No – done subroutine
7 vs Signed overflow. The mnemonic stands for "V set". V==1
CMP R0, #’z’ ; if ( ( X <= ‘z’) ) ?
8 vc No signed overflow. The mnemonic stands for "V clear". V==0
BHI DoneUpCase ; No – done subroutine
9 hi Unsigned higher. (C==1) && (Z==0)

10 ls Unsigned lower or same. (C==0) || (Z==1) SUB R0, R0, #0x20 ; convert char in R0
11 ge Signed greater than or equal. N==V DoneUpCase ; R0 contains converted char
12 lt Signed less than. N!=V MOV R15, R14 ; return, with R0 = converted char
13 gt Signed greater than. (Z==0) && (N==V)

14 le Signed less than or equal. (Z==1) || (N!=V)


Dec 3, 2015 al
15 Always 237
None tested. Dec 3, 2015 238

Recall UpCase Invocation code Assembly Invocation Code


i = 0; // will use R6 for i MOV R6, #0 ;i=0

while ( strg[ i ] != null ) // null = 0x00 MOV R7, #strg ; get address of str
CheckChar LDR R0, [ R7, R6 ] ; R0 = unconverted char from str
// will assume variable named strg
CMP R0, #0 ; char == null?
{ BEQ DoneStrg ; yes! – done string
strg[ i ] = UpCase ( strg[ i ] ); BL UpCase ; convert char call Upcase, R0 = ??
i++; ; on return: R0 = converted char

} STR R0, [ R7, R6 ] ; save converted char


ADD R6, R6, #1 ; increment i
B CheckChar ; check next char
DoneStrg ; continue

Dec 3, 2015 239 Dec 3, 2015 240


Parameters: Value vs. Reference Pass by Reference
The Upcase example passes the char parameter X by value Subroutine receives a pointer to the variable to be
The value associated with X copied from str variable (in used as the parameter
Main Memory) into R0 before invoking the subroutine Must use the pointer to access the variable
Any change to the copy of the value by the subroutine is Any changes to the referenced variable are
local to R0, not the str variable (in Main Memory) persistent outside of the subroutine
Suppose it is desired to pass an array[100] into a subroutine?
The value stored in the variable (in Main
Array is too big to fit into R0 – R3 !! Memory) is modified !
Can’t pass all the values in the array in registers
Pass the start address of the array instead … pass a
reference (pointer) to the array

Dec 3, 2015 241 Dec 3, 2015 242

UpCase Reworked: UpString UpString


Suppose pass a string (array) variable to UpString, and void UpString(&( char S[] ) )
the subroutine must convert the entire string “void” return value … no useful value
string parameter: one param – pass pointer to S in R0
void UpString( &( char S[] ) ) Have made “by reference” { i = 0;
explicit using “&”!
{ i = 0; while ( S[ i ] != null ) // null = 0x00
while ( S[ i ] != null ) // null = 0x00 {
{ if ( ( ‘a’ <= S[ i ] ) && ( S[ i ] <= ‘z’ ) )
if ( ( ‘a’ <= S[ i ] ) && ( S[ i ] <= ‘z’ ) ) S[ i ] = S[ i ] – 0x20;
S[ i ] = S[ i ] – 0x20; i++;
i++; }
} return ; no return value … changes to S are
return ; stored in Main Memory as they happen!
} }
Dec 3, 2015 243 Dec 3, 2015 244

UpString Subroutine LEA Instruction


; on entry: R0 = address of S (string to convert)
; and R14 = return address Processors often include instructions to help with
UpString MOV R1, #0 ; R1 = i; i = 0
loading the address of a variable
CheckChar LDR R2, [ R0, R1 ] ; R2 = unconverted char from S LEA = Load Effective Address
CMP R2, #0 ; char == null? Only uses PC-Relative mode
BEQ DoneS ; yes! – done string loads Rd with the address of the word
CMP R2, #’a’ ; if ( ( ‘a’ <= S[i] ) ? UpCase • LEA Rd, [ label ]
BLO DoNextChar ; No – done with this char
CMP R2, #’z’ ; if ( ( S[i] <= ‘z’) ) ?
Is LEA needed for other memory operand modes?
BHI DoNextChar ; No – done with this char LEA Rd, [ Rn ] vs. MOV Rd, Rn ??
Persistent
change SUB R2, R2, #0x20 ; convert char in R2 LEA Rd, [ Rn, #ofs ] vs. ADD Rd, Rn, #ofs
STR R2, [ R0, R1] ; save converted char LEA Rd, [ Rn, Rm ] vs. ADD Rd, Rn, Rm
DoNextChar ADD R1, R1, #1 ; increment i similar simple instruction for PC-Relative mode?
B CheckChar ; check next char
• MOV Rd, #label ; if address of label < 20 bits
DoneS MOV R15, R14 ; return
Dec 3, 2015 245 Dec 3, 2015 246
Form 5 LEA Instruction
PC-Relative memory operand only
Invoking UpString
bit
3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1
9 8 7 6 5 4 3 2 1 0
Pass the address of string variable (pass by reference)
1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0

use 3F Rd imm20 Assume string variable named a_str


hex H7 H6 H5 H4 H3 H2 H1 H0

Opcode: 0x3F LEA R0, [ a_str ] ; address of string in R0


RTL: Rd  [ PC ] + sign-extend( imm20 ) BL UpString ; call: return address to R14
; on return, the converted string has
General Syntax: LEA Rd, [ #imm20 ]
; been saved in Main Memory
Syntax using label: LEA Rd, [ label ]
NextInstr MOV … ; continue here

Dec 3, 2015 247 Dec 3, 2015 248

Register Use by Subroutine Register Use Issue


Original UpCase subroutine uses only the registers If subroutine changes a register value, will that
involved in passing parameters impact the calling code?
R0 (char to convert), R14 (LR)
Suppose calling code has an important value in R1
UpString subroutine uses additional registers
when it calls the UpString subroutine
R1 (index into string), R2 (char from string)
Author of calling code “knows” about use of reg’s for The subroutine will overwrite the important
parameter passing, but doesn’t (necessarily) know what value in R1!
additional reg’s subroutine uses When return from subroutine, calling code is
Author of subroutine doesn’t (necessarily) know what “broken” from perspective of calling code: the
reg’s beyond the parameters contain values that are important value in R1 has changed randomly
useful to calling code
What reg’s can subroutine safely modify?

Dec 3, 2015 249 Dec 3, 2015 250

Resolving Register Use Issue UpString & Register Use Issue


Either: Recall that UpString uses registers R0, R1, R2, R14
Calling code “temporarily saves” any important register Parameters: R0 (string address) & R14 (return address)
values (other than parameters) before calling Additional registers used by subroutine: R1, R2
OR Must resolve the Register Use Issue
Subroutine “temporarily saves” any register values SYSC 3006 Register Preservation Convention:
(other than param reg’s) before using the registers The subroutine is responsible for ensuring the
preservation of all register values except the registers
Choice is a “convention” (agreement by developers) used to pass parameters into the subroutine

But how to “temporarily save” values????


Dec 3, 2015 251 Dec 3, 2015 252
Stack Run-Time Stack
Recall “stack” from SYSC 2006: a LIFO (last-in-first-out) Stacks are so useful @ H/S interface that processors
list data structure include hardware support
PUSH to add a value at top of stack A hardware supported stack used to help with
POP to remove and return value from top of stack subroutine control flow is called the run-time stack
Solution to “temporarily save” register values: use a Used to save/restore register values (and more!)
stack to store values temporarily Run-time stack implementation … 2 parts:
Save a register value: PUSH value onto stack 1) A contiguous region (block) of RAM for storage
Restore a register value: POP value from stack 2) A stack pointer (SP) that points to the memory
Must restore in opposite order from save (LIFO) location containing the value that is currently on
the top of the stack

Dec 3, 2015 253 Dec 3, 2015 254

Run-Time Stack PUSH Instruction


@ H/S Interface: run-time stack is a contiguous Stack grows “down”
region (block) of RAM for storage + a stack pointer From high address towards low address
(SP) that points to the memory location containing PUSH { Ry }: add value from Ry value @ top of stack
the value on the top of the stack currently
SP  [ SP ] – 1 low address
low address
mem[SP]  value from Ry

...
...

Value Ry
Stack Pointer (SP) Contiguous region SP addrs Value X
addrs Value X of memory for Value W
Value W stack storage

...
Value Ry is now the value
...

currently on the top of the stack


high address
high address
Dec 3, 2015 255 Dec 3, 2015 256

POP Instruction Values Currently “in” Run-Time Stack


POP { Rx }: copy value @ top of stack into Rx, and The values that are in the stack (list) start at the top
“remove” value from stack of the stack and go to the high addrs
Rx  mem[SP] low address
SP  [ SP ] + 1 low address
...

Rx now contains Value X Stack Pointer (SP)


...

addrs Value X
Value W
SP addrs Value X Values
...

Value W currently in
...

Value W is now the value stack


high address
currently on the top of the stack
high address

Dec 3, 2015 257 Dec 3, 2015 258


Example: Before Example: then PUSH { R0 }
Suppose: R0 = 0x0000 0001 PUSH { R0 }: R0 = 0x0000 0001
R8 = 0xFFFF FFFF SP  [ SP ] – 1 R8 = 0xFFFF FFFF
There is always “some value” in memory.
“0x xxxx xxxx” indicates we don’t know what mem[SP]  R0
the value is (and don’t) care for this example
0x060 0x xxxx xxxx 0x060 0x xxxx xxxx
0x xxxx xxxx 0x xxxx xxxx

...
...
0x xxxx xxxx 0x0000 0001
SP 0x110 0x0000 000A SP 0x10F 0x0000 000A
0x5000 0000 0x5000 0000
Value currently Value currently

...
...
on top of stack on top of stack
0x xxxx xxxx 0x xxxx xxxx
0x160 0x xxxx xxxx 0x160 0x xxxx xxxx

Dec 3, 2015 259 Dec 3, 2015 260

Example: then POP { R8 } ARM Implementation


POP{ R8 }: R0 = 0x0000 0001 Stack Pointer (SP) = R13
R8  mem[SP] R8 = 0x0000 0001 PUSH: store multiple registers in stack
SP  [ SP ] + 1 POP: load multiple registers from stack
0x060 0x xxxx xxxx Uses a 16-bit mask (imm) to indicate which registers
0x xxxx xxxx to store/load
Value no longer
...

“in” stack Bit 0 R0, bit 1 R1, … , Bit 15 R15


0x0000 0001
SP 0x110 0x0000 000A PUSH order: high to low (e.g. R10 before R6)
0x5000 0000 POP order: low to high (e.g. R6 before R10)
Value currently
...

on top of stack
0x xxxx xxxx
0x160 0x xxxx xxxx

Dec 3, 2015 261 Dec 3, 2015 262

Syntax PUSH/POP Encoding


PUSH { reg list } PUSH opcode: 0x38
e.g.: PUSH { R0 – R3, R14 } POP opcode: 0x39
• Mask: 0b 0100 0000 0000 1111 imm16 register mask
POP { reg list } unused 8 bits = 0
e.g.: POP { R0 – R3, R15 } bit
3
1
3
0
2
9
2
8
2
7
2
6
2
5
2
4
2
3
2
2
2
1
2
0
1
9
1
8
1
7
1
6
1
5
1
4
1
3
1
2
1
1
1
0
9 8 7 6 5 4 3 2 1 0

• Mask: 0b 1000 0000 0000 1111 use Opcode unused (0) imm16
Another example: hex H7 H6 H5 H4 H3 H2 H1 H0
PUSH { R1, R6 – R8, R10 }
New operand configuration Form 7
• Mask: 0b 0000 0101 1100 0010 Needs new h/w outside the scope of this course
If you are really interested look at circuits in lab ☺

Dec 3, 2015 263 Dec 3, 2015 264


Reconsider … Initializing SP
PUSH { R2 – R6, R14 } SP must be initialized to the highest address + 1 in the
PUSH order: high to low
Saves R14 saves LR! memory block to use as the stack
POP order: low to high
POP { R2 – R6, R15 } Why + 1? First PUSH will decrement SP before copying
Restores R15 restores PC! value into stack
• Control flow !!! First value goes into stack at highest address in block
What if … in subroutine and execute
PUSH { R2 – R6, R14 } low address

and then while still in subroutine execute Stack Pointer (SP)

...
POP { R2 – R6, R15 } initial addrs
neat, eh! ☺
First PUSH goes here high address
Not in stack space

Dec 3, 2015 265 Dec 3, 2015 266

Initializing SP UpString & Register Use Issue


Program grows from low to high addresses Recall that UpString uses registers R0, R1, R2, R14
Stack grows from high to low addresses Parameters: R0 (string address) & R14 (return address)
Typically: Initialize stack to provide largest “gap” Additional registers used: R1, R2
between program and stack Must resolve the Register Use Issue
SYSC 3006 Register Preservation Convention:
In Lab? 2k words highest address = 0x7FF The subroutine is responsible for ensuring the
Initialize SP = 0x800 preservation of all register values except the registers
used to pass parameters into the subroutine

Will use stack to “temporarily save” values of registers


used by subroutines!
Dec 3, 2015 267 Dec 3, 2015 268

Revised UpString Code Nested Function Calls


; on entry: R0 = address of S (string to convert) Consider revising UpString to call UpCase:
; and R14 = return address
UpString PUSH { R1, R2, R14 } ; save registers and return address
MOV R1, #0 ;i=0 void UpString( &( char S[] ) )
CheckChar LDR R2, [ R0, R1 ] ; R2 = unconverted char from S { i = 0;
CMP R2, #0 ; char == null? while ( S[ i ] != null ) // null = 0x00
BEQ DoneS ; yes! – done string
CMP R2, #’a’ ; if ( ( ‘a’ <= S[i] ) ? {
BLO DoNextChar ; No – done with this char S[i] = UpCase( S[i] );
CMP R2, #’z’ ; if ( ( S[i] <= ‘z’) ) ? i++;
BHI DoNextChar ; No – done with this char
SUB R2, R2, #0x20 ; convert char in R2
}
Call to UpCase is
STR R2, [ R0, R1 ] ; save converted char return ;
“nested” inside
DoNextChar ADD R1, R1, #1 ; increment i }
B CheckChar ; check next char UpString
DoneS POP { R1, R2, R15 } ; restore registers and return

Dec 3, 2015 269 Dec 3, 2015 270


“Naïve” Nested UpCase Call “Ugly” Resulting Code
; on entry: R0 = address of S (string to convert) ; on entry: R0 = address of S (string to convert)
; and R14 = return address ; and R14 = return address
UpString PUSH { R1, R2, R14 } ; save registers and return address UpString PUSH { R1, R2, R14 } ; save registers and return address
MOV R1, #0 ;i=0 MOV R1, #0 ;i=0
CheckChar LDR R2, [ R0, R1 ] ; R2 = unconverted char from S CheckChar LDR R2, [ R0, R1 ] ; R2 = unconverted char from S
CMP R2, #0 ; char == null? CMP R2, #0 ; char == null?
BEQ DoneS ; yes! – done string BEQ DoneS ; yes! – done string
PUSH R0 ; save R0 for reuse after call Ugly! PUSH R0 ; save R0 for reuse after call
; save R14 too?!? was already saved above! ☺ Have to save & ; should also save R14 !!! was already saved above! ☺
MOV R0, R2 ; R0 = char to convert (parameter) restore R0 for MOV R0, R2 ; R0 = char to convert (parameter)
BL UpCase ; call UpCase to convert every call and BL UpCase ; call UpCase to convert
MOV R2, R0 ; save char in R2 using R2 is MOV R2, R0 ; save char in R2
POP R0 ; restore string address awkward, too POP R0 ; restore string address
STR R2, [ R0, R1] ; save converted char STR R2, [ R0, R1] ; save converted char
DoNextChar ADD R1, R1, #1 ; increment i DoNextChar ADD R1, R1, #1 ; increment i
B CheckChar ; check next char B CheckChar ; check next char
DoneS POP { R1, R2, R15 } ; restore registers and return DoneS POP { R1, R2, R15 } ; restore registers and return

Ugly: Because use R0 as string address and want R2 to


contain converted char … could this be changed?
Dec 3, 2015 271 Dec 3, 2015 272

“Prettier”(?): use R2 as string address! Revisit UpCase Subroutine


; on entry: R0 = address of S (string to convert) Does UpCase subroutine need to add anything to be
; and R14 = return address consistent with Register Preservation Convention?
UpString PUSH { R1, R2, R14 } ; save registers and return address
MOV R1, #0 ;i=0
; on entry: R0 = X (char to convert)
MOV R2, R0 ; use R2 as string address
CheckChar LDR R0, [ R2, R1 ] ; R0 = unconverted char from S ; and R14 = return address
CMP R0, #0 ; char == null? UpCase CMP R0, #’a’ ; if ( ( ‘a’ <= X ) ?
BEQ DoneS ; yes! – done string BLO DoneUpCase ; No – done subroutine
BL UpCase ; call UpCase to convert CMP R0, #’z’ ; if ( ( X <= ‘z’) ) ?
STR R0, [ R2, R1] ; save converted char
DoNextChar ADD R1, R1, #1 ; increment i
BHI DoneUpCase ; No – done subroutine
B CheckChar ; check next char SUB R0, R0, #0x20 ; convert char in R0
DoneS POP { R1, R2, R15 } ; restore registers and return DoneUpCase ; R0 contains converted char
MOV R15, R14 ; return, with R0 = converted char
Prettier(?): Shorter ☺ … and in assembly language that
usually means easier to understand Looks OK as is!
Dec 3, 2015 273 Dec 3, 2015 274

Recursive Examples Factorial


uint Factorial ( uint X ) // uint unsigned int
For interest only, not on exam! {
if ( X < 2 ) // 1! = 0! = 1
Recursive algorithm: { return 1; }
1. Describe solution to problem of size X in terms of else Calculate
solution to problem of smaller size (e.g. X – 1) { return ( X * Factorial ( X – 1 ) ); }
2. Base case(s): solution to smallest sized problem }
Use solutions to smaller sized problems to solve
1. Describe solution to problem of size X (i.e.
larger sized problem Factorial(X)) in terms of solution to problem of
smaller size (i.e. Factorial (X – 1))
2. Base case(s): solution to smallest sized problem (i.e.
Dec 3, 2015 275 Dec 3, 2015 Factorial(X) when X < 2 276
Factorial Implementation How Does the Recursive Subroutine Work?
; on entry: R0 = X (will return X!) “Play computer” … walk through the execution of the code
; and R14 = return address It all relies on the contents of the stack!!
Factorial CMP R0, #2 ; if (X < 2 ) While “playing”, maintain a diagram of what is in the stack:
BHS Calculate ; No calculate X! ADD to stack: for the PUSH before each recursive call
MOV R0, #1 ; Yes X! = 1 • Make recursive call!
MOV R15, R14 ; return easy: base case REMOVE from stack: for the POP that follows the return
Calculate PUSH { R0, R1, R14 } ; save X and other reg’s from each recursive call
SUB R0, R0, #1 ; X = X – 1 REMOVE from stack: for the POP at the end of the
BL Factorial ; Recursive: get (X – 1)! execution of each recursive instance of the subroutine
R1 (not R0)! POP { R1 } ; restore X Each recursive call adds more onto the stack before
MUL MUL R0, R0, R1 ; R0 = X * (X – 1)! = X! ANYTHING is removed from the stack!
POP { R1, R15 } ; restore & Done! ☺ Never returns from any recursive call until the base case
is reached (i.e. invoke with X = 1)
Dec 3, 2015 277 Dec 3, 2015 278

Iterative Factorial Iterative Factorial @ H/S Interface


uint Factorial ( uint X ) // uint unsigned int ; on entry: R0 = X (want to return X!)
; and R14 = return address
{ uint Val = 1;
Factorial PUSH { R1, R14 } ; save other reg’s
while ( X > 1 )
{ Val *= X – –; } MOV R1, R0 ; use R1 as X
return ( Val ); } MOV R0, #1 ; Use R0 as Val, initialize Val = 1
; while loop
} Wloop CMP R1, #1 ; is (X > 1 )
BLS Done ; No Done!
Assembly language version? MUL R0, R0, R1 ; Val *= X
Comparison: Footprint? SUB R1, R1, #1 ;X––
B Wloop
Instructions Executed?
Stack Use? Done POP { R1, R15 } ; restore reg’s & return ( Val in R0 )
Dec 3, 2015 279 Dec 3, 2015 280

Recursive UpString Recursive UpString Implementation


void UpString( char* S ) // pass address of first char in string ; on entry: R0 = address of S (string to convert)
; and R14 = return address
{ if ( *S == null ) // null = 0x00 UpString PUSH { R1, R14 } ; save registers and return address
{ return; } base case CheckChar LDR R1, [ R0 ] ; R1 = unconverted char (*S)
CMP R1, #0 ; char == null? base case
if ( ( ‘a’ <= *S ) && ( *S <= ‘z’ ) ) BEQ Done ; yes all done
*S = *S – 0x20; BNE DoConversion ; no continue
POP { R1, R15 } ; yes restore and done!
UpString ( ++S ); // POINTER ARITHMETIC DoConversion CMP R1, #’a’ ; if ( ( ‘a’ <= *S ) ?
}  implicit return after returning from recursive call! BLO DoNextChar ; No – done with this char
CMP R1, #’z’ ; if ( ( *S <= ‘z’) ) ?
Tail recursion: recursive call is last thing done in function BHI DoNextChar ; No – done with this char
SUB R1, R1, #0x20 ; convert char in R1
Can be implemented without recursive calls! STR R1, [ R0 ] ; save converted char
DoNextChar ADD R0, R0, #1 ; S = S+1 (pointer arithmetic )
• Smart compilers will do this! BL UpString ; Recurse! check smaller string
POP { R1, R15 } ; done restore and return!
Dec 3, 2015 281 Dec 3, 2015 282
“Recursive” UpString Implementation We Have Arrived
; on entry: R0 = address of S (string to convert)
; and R14 = return address
On the software side
UpString PUSH { R1, R14 } ; save registers and return address of the H/S interface
CheckChar LDR R1, [ R0 ] ; R1 = unconverted char (*S)
SYSC 2006
CMP R1, #0 ; char == null?
BNE DoConversion ; no continue
POP { R1, R15 } ; yes restore and done!
DoConversion CMP R1, #’a’ ; if ( ( ‘a’ <= *S ) ?
BLO DoNextChar ; No – done with this char
CMP R1, #’z’ ; if ( ( *S <= ‘z’) ) ? Computer System
BHI DoNextChar ; No – done with this char
SUB R1, R1, #0x20 ; convert char in R1
STR R1, [ R0 ] ; save converted char
DoNextChar ADD R0, R0, #1 ; S = S+1 (pointer arithmetic )
B CheckChar ; Recurse! check smaller string ELEC 2607

Dec 3, 2015 283 Dec 3, 2015 284

“Getting Started” Roadmap Instruction Set Architecture (ISA)


High-Level Language (SYSC 2006) The aspects of the H/S Interface that are
Towards the Software 2 This slide deck visible to programmers, includes:
Registers: R0 – R15 and IR
SYSC 3006 Assembly Language Flags and Conditions
Computer Instruction Set Architecture
Data Manipulation (ALU) instructions:
Organization Micro-architecture
manipulate data & modify FLAGS
Towards the Hardware 1 Including CMP, CMN
Logic Circuits (ELEC 2607) Data Transfer: MOV, MVN, LDR, STR, LEA, PUSH, POP
Control Flow: Bcc, BLcc: cc Condition Codes!
Hardware/Software (H/S) Interface SYSC 2006 Also: Data Transfer into R15, and
Computer System Data Manipulation with Rd = R15
Dec 3, 2015 ELEC 2607 285 Dec 3, 2015 286

Instruction Set Architecture (ISA) Other Behaviour in ISA


ISA, includes: (con’t) Power-Up Behaviour: when power is applied
Addressing Modes: source & destination operands
Often same behaviour if hardware reset is invoked
Data Manipulation instructions: Register, Immediate
In lab: starts fetching instructions from address 0x00
LDR/STR/LEA memory operands: Register (Indirect),
Offset Immediate, Offset Register, PC-Relative Interrupt Behaviour: when interrupts occur
Bcc/BLcc: target PC-Relative Covered in next section of the course
Instruction Encoding (IR): instructions @ H/S interface
Instruction = Opcode & Operands
Both depend on addressing modes used for operands
Have seen several common forms of instruction encodings
Impact h/w (and Control FSM behaviour) needed to deal
with operands
Dec 3, 2015 287 Dec 3, 2015 288
Programmer’s Model Control FSM
Includes a concise statement of the ISA (not part of ISA, since not visible to programmers)
Includes memory contents too! Instruction Cycle stages: Fetch, Decode, Execute
Given in terms of visible aspects of state before FSM states … can describe each state using:
instruction is executed and state after instruction is RTL: “human readable”
executed Row in table of binary outputs: “machine readable”
Hides the details of the execution states associated Implementation in Labs:
with using the datapath
Output ROM
Just shows net effects of the execution states
Decode ROM
ccMUX, cTrue: integrate conditions
PUSH and POP: h/w not covered in course
Dec 3, 2015 289 Dec 3, 2015 290

Assembly Language Syntax Examples – LOTS!


(not part of ISA, determined by assembler: s/w tool) Motivated by HLL Implemented in assembly language
Mnemonics “human readable” @ H/S interface
Data: (information)
Immediate value: # …. 0x, 0b, decimal Type
Labels: Variable
The value of a label is ___________ Pointer
Composite type (array and struct)
Declaration vs. Reference
Processing: (instructions)
Use in PC-Relative addressing to simplify readability Sequential execution
• even though instruction includes the PC-Relative Expressions
offset to the target Conditional: if-then-else, (switch)
Loop: while, for
Functions, parameters
Dec 3, 2015 291 Dec 3, 2015 292

Higher Level Concepts Where To From Here?


Ultimately, everything done by a HLL must be carried I/O: getting information into and out of a computer
out @ H/S interface system
There are many ways to do things
Compiler optimization: footprint & execution speed
Pointers are very powerful P MM I/O External
Devices
Still add a level of confusion @ H/S interface
Interconnection Bus
Functions open a lot of possibilities for program design
Subroutines developer conventions are needed!
Run-time stack: LIFO list @ H/S interface Interrupts
Store values temporarily during execution Allow hardware events to trigger software attention
H/W support
Dec 3, 2015 293 Dec 3, 2015 294
Code Meaning (for comparing X to Y) Flags Tested
Appendix 1 eq Equal. Z==1

2 ne Not equal. Z==0


Deriving Flag Values When Comparing X and Y 3 hs (or cs) Unsigned higher or same (or carry set). C==1

4 lo (or cc) Unsigned lower (or carry clear). C==0

5 mi Negative. The mnemonic stands for "minus". N==1

6 pl Positive or zero. The mnemonic stands for "plus". N==0

7 vs Signed overflow. The mnemonic stands for "V set". V==1

8 vc No signed overflow. The mnemonic stands for "V clear". V==0

9 hi Unsigned higher. (C==1) && (Z==0)

10 ls Unsigned lower or same. (C==0) || (Z==1)

11 ge Signed greater than or equal. N==V

12 lt Signed less than. N!=V

13 gt Signed greater than. (Z==0) && (N==V)

14 le Signed less than or equal. (Z==1) || (N!=V)


Dec 3, 2015 295 Dec 3, 2015 al
15 Always 296
None tested

Using ALU to calculate X – Y Consider X – Y: Does X == Y?


For every processor, must read the reference manual When calculating X – Y using the ALU:
to see how calculations are performed and how flags Z = 1 iff X == Y
are set Does not matter if values are unsigned or signed
For ARM processors: both unsigned and signed After calculating X – Y , then:
values calculate X – Y using:
If Z == 1, then “eq” condition is true (X == Y)
result & FLAGS  X+(–Y) If Z == 0, then “ne” condition is true (X != Y)
 X + ( NOT( Y) + 1 ) Code Meaning (for cmp) Flags Tested

1 eq Equal (X == Y) Z==1

2 ne Not equal (X != Y) Z==0

Dec 3, 2015 297 Dec 3, 2015 298

Remaining Conditions? Consider Unsigned Values A and B


We could verify that flags are always set to match the For n-bit unsigned binary values:
various conditions in table when the condition is true A > B if there is a bit bi where both values have
E.G. whenever X < Y (Unsigned lower), then C==0 identical bit values for all bits above (more significant
The “proofs” are interesting (to an inquisitive mind), but than) bi , but bi = 1 in A and bi = 0 in B
binary values and reasoning about them can be tedious bi
OR … we can just accept that the specified flag settings A bn-1 bi + 1 1

are valid for the conditions in the table bi


This course will take this approach !! B bn-1 bi + 1 0

The following slides include “proofs” of the flag settings For bn-1 through bi + 1 , bit value in A = bit value in B
Included for the inquisitive mind
Will not be covered in course!
Dec 3, 2015 299 Dec 3, 2015 300
Consider Unsigned Values A and B Add Unsigned Values A + B
When A > B , then for A and NOT( B ): When A > B , then for A + NOT( B ):
Both values have bit bi = 1 (for bi on previous slide) Adding bits bi = 1 for both values causes a carry to
Both values have inverse bit values for all bits above ripple to next higher bit addition
(more significant than) bi 1 bi ?
bi
A bn-1 bi + 1 1 A bn-1 bi + 1 1

bi
B bn-1 bi + 1 0 + B bn-1 bi + 1 1

bi
B bn-1 bi + 1 1
bn-1 bi + 1 x

Dec 3, 2015 301 Dec 3, 2015 302

Add Unsigned Values A and B Add Unsigned Values A and B


When A > B , then for A + NOT( B ): When A > B , then for A + NOT( B ):
Adding bits bi+1 + bi+1 + 1 (carry) = 0 carry 1 (always) The carry ripples up and out of the msb (always)
Adding b + b always give 1, then adding 1 (carry) gives Therefore, C = 1 when A > B (always)
0 carry 1 1 1 bi 1 1 1 bi
A bn-1 bi + 1 1 A bn-1 bi + 1 1

+ B bn-1 bi + 1 1 + B bn-1 bi + 1 1

bn-1 0x Carry 1 0 0x

Dec 3, 2015 303 Dec 3, 2015 304

Unsigned X – Y when X > Y Unsigned Values and X > Y


Recall: X – Y X + NOT( Y ) + 1 8-bit Example: bi
From previous slide for A > B, here X > Y X = 0b 0000 1000
X = A and Y = B Y = 0b 0000 0010
X + NOT( Y ) always give C = 1 X–Y 0000 1000 + ( NOT( 0000 0010) + 1 )
Adding and extra 1 won’t change the Carry + 1111 1101
Therefore, when X > Y, X – Y always results in C = 1 Carry + 1
C=1
1 0000 0110

Dec 3, 2015 305 Dec 3, 2015 306


Unsigned Values and X == Y Suppose Unsigned Values and X < Y
When calculating X – Y : result will always be 0 (Z = 1), The result of X – Y is not defined for unsigned
but what about C? integers when X < Y, but the calculation can still be
For every b in X, there is a corresponding b in NOT( Y ) done
Adding: X + NOT (Y ) == all 1’s In this case, X < Y, bi = 1 in Y and bi = 0 in X
Then adding 1 gives all 0’s Z = 1 with C = 1 (Always!) bi
8-bit Example, X = 0b 0000 0010 Y = 0b 0000 0010 X bn-1 bi + 1 0

X–Y 0000 0010 + ( NOT( 0000 0010) + 1 ) bi


+ 1111 1101 Y bn-1 bi + 1 1

Carry + 1
C=1
1 0000 0000 Z=1

Dec 3, 2015 307 Dec 3, 2015 308

Consider Unsigned Values and X < Y Add Unsigned Values and X + Y


When X < Y , then for X and NOT( Y ): When X < Y , then for X + NOT( Y ):
Both values have bit bi = 0 Adding bits bi = 0 for both values causes no ripple
Both values have inverse bit values for all bits above carry even if a carry in!)
(more significant than) bi 0 bi ?
bi
X bn-1 bi + 1 0 X bn-1 bi + 1 0

bi
Y bn-1 bi + 1 1 + Y bn-1 bi + 1 0

bi
Y bn-1 bi + 1 0
bn-1 bi + 1 x

Dec 3, 2015 309 Dec 3, 2015 310

Add Unsigned Values A and B Add Unsigned Values X and Y


When X < Y , then for X + NOT( Y ): When X < Y , then for X + NOT( Y ):
Adding bits bi+1 + bi+1 + 0 (carry) = 1 carry 0 (always) The carry 0 ripples up and out of the msb (always)
Adding b + b always give 1, then adding 0 (carry) gives Therefore, C = 0 when X < Y (always)
1 carry 0 0 0 bi 0 0 0 bi
X bn-1 bi + 1 0 X bn-1 bi + 1 1

+ Y bn-1 bi + 1 0 + Y bn-1 bi + 1 1

bn-1 1x Carry 0 1 1x

Dec 3, 2015 311 Dec 3, 2015 312


Unsigned X – Y when X < Y Unsigned Values and X < Y
Recall: X – Y X + NOT( Y ) + 1 bi
8-bit Example,
From previous slide for X > Y
X = 0b 0000 0010
X + NOT( Y ) always give C = 0
Y = 0b 0000 1000
Adding and extra 1 won’t change the Carry
X–Y 0000 0010 + ( NOT( 0000 1000) + 1 )
Therefore, when X < Y, X – Y always results in C = 0
+ 1111 0111
Carry + 1
C=0
0 1111 1010

Dec 3, 2015 313 Dec 3, 2015 314

For Unsigned Values: X and Y Now Consider Signed: N, V, Z


Code Meaning (for cmp) Flags Tested
These seem straightforward: X – Y X + (– Y)
Unsigned higher or same (or carry set) C==1
3 cs or hs
( X >= Y ) (slides 160 & 162)
Code Meaning (for cmp) Flags Tested
Unsigned lower (or carry clear) C==0
4 cc or lo
(X<Y) (slide 168) Equal
Unsigned higher (C==1) && (Z==0)
1 eq
( X == Y ) Z==1
9 hi
(X>Y) (slides 160 & 154) Not equal
10 ls
Unsigned lower or same (C==0) || (Z==1) 2 ne
( X != Y ) Z==0
( X <= Y ) (slides 168 & 154)
Negative. The mnemonic stands for "minus“
When calculating X – Y using the ALU: 5 mi
(X–Y<0) N==1
Positive or zero. The mnemonic stands for "plus“
C = 1 iff ( X < Y or X == Y ) 6 pl
( X – Y >= 0 ) N==0
E.G. After calculating X – Y 7 vs
Signed overflow. The mnemonic stands for "V set“
V==1
( X – Y overflowed)
If C == 1, then “hs” condition is true No signed overflow. The mnemonic stands for "V clear“
8 vc V==0
If C == 1 && Z == 0, then “hi” condition is true ( X – Y did not overflow)

Dec 3, 2015 315 Dec 3, 2015 316

What about other Signed Conditions? Signed Values X and Y and X >= Y
Code Meaning (for cmp) Flags Tested Recall number line: Suppose both non-ve and X >= Y
Signed greater than or equal – max 0 Y X +max
11 ge
( X >= Y ) N==V -ve +ve
Signed less than Calculating X – Y always gives a non-ve number, and
12 lt
(X<Y) N!=V
Signed greater than never overflows
13 gt
(X>Y) (Z==0) && (N==V)
• This scenario always gives N = 0, V = 0
Signed less than or equal
14 le
( X <= Y ) (Z==1) || (N!=V)

Consider X >= Y and X < Y ( X – Y X + (– Y) ) • In this scenario: N == V


Looking for relationship between N and V and Z = 1 iff X == Y
6 scenarios based on combinations of –ve and
non–ve values
Dec 3, 2015 317 Dec 3, 2015 318
Signed Values X and Y and X >= Y Signed Values X and Y and X > Y
Suppose both -ve and X >= Y Suppose X non-ve and Y -ve: X > Y !!
– max Y X 0 +max – max Y 0 X +max
-ve +ve -ve +ve
|Y| |Y|
Calculating X – Y: Calculating X – Y: two possibilities:
Y X 0 Y 0 X
-ve +ve -ve +ve
|Y| |Y|
Result: non–ve value with no overflow: N = 0 ,V = 0 Result: non–ve value with no overflow: N = 0, V = 0
This scenario always gives N = 0, V = 0 Result: –ve value because of overflow: N = 1, V = 1
This scenario also gives N == V
This scenario gives: N == V
and Z = 1 iff X == Y
Dec 3, 2015 319 Dec 3, 2015 320

Signed Values X and Y and X < Y Signed Values X and Y and X < Y
Suppose both non-ve and X < Y Suppose both -ve and X < Y
– max 0 X Y +max – max X Y 0 +max
-ve +ve -ve +ve
|Y|
Calculating X – Y always gives a -ve number, and Calculating X – Y:
never overflows X Y 0
-ve +ve
This scenario always gives N = 1, V = 0 |Y|
Result: –ve value with no overflow: N = 1, V = 0
This scenario gives: N != V This scenario always gives N = 1, V = 0

This scenario gives: N != V


What happens if X == Y ?

Dec 3, 2015 321 Dec 3, 2015 322

Signed Values X and Y and X < Y Summary of Signed Value Scenarios


Suppose X -ve and Y non-ve: X < Y !! Relation X Y FLAGS Slide
– max X 0 Y +max X >= Y non-ve non-ve N == V 171
-ve +ve
|Y| X >= Y -ve -ve N == V 172
Calculating X – Y: two possibilities: X>Y non-ve -ve N == V 173
X 0 Y
-ve +ve
X<Y non-ve non-ve N != V 174
|Y| X<Y -ve -ve N != V 175
Result: –ve value with no overflow: N = 1, V = 0 X<Y -ve non-ve N != V 176
Result: non–ve value because of overflow: N = 0, V = 1
When X >= Y : N == V
This scenario gives: N != V
When X < Y : N != V

Dec 3, 2015 323 Dec 3, 2015 324


Signed: N, V, Z

Code Meaning (for cmp) Flags Tested


Signed greater than or equal
11 ge
( X >= Y ) N==V
Signed less than
12 lt
(X<Y) N!=V
Signed greater than
13 gt
(X>Y) (Z==0) && (N==V)
Signed less than or equal
14 le
( X <= Y ) (Z==1) || (N!=V)

Dec 3, 2015 325

You might also like