Csci 260 Study Guide-4

Download as pdf or txt
Download as pdf or txt
You are on page 1of 10

else c = c - a;

# Allocate a = $s0, b = $s1, c = $s2

addi $s2, $zero, 10 # c = 10


slt $t0, $s1, $s0 # t0 = (b < a) ? 1 : 0
bne $t0, $zero, ELSE # goto ELSE if b < a
sub $S2, $s2, $s0 # c = c - b
j DONE
ELSE:
add $s2, $s2, $s1 # c = c + b
DONE:

Loops
Decisions are important both for choosing between two alternatives—found in if statements—and for
iterating a computation—found in loops. The same assembly instructions are the building blocks
for both cases.

For/While Statements
Example: Write the MIPS assembly code for

while (save[i] == k)
i += 1;
Assume that $s3 = i, $s4 = k, and $s6 = base of array save.

The first step is to load save[i] into a temporary register, however since we’re not accessing the array
with a constant but a variable, the step is more involved:

1. Before we can load save[i] into a temporary register, we need to have its address.
2. Before we can add i to the base of array save to form the address, we must multiply the index i
by 4 due to the byte addressing problem. Fortunately, we can use shift left logical, since shifting
2
left by 2 bits multiplies by 2 = 4.

We need to add the label Loop to it so that we can branch back to that instruction at the end of the loop:
Loop: sll $t1, $s3, 2 # t1 = i * 4

To get the address of save[i], we need to add $t1 and the base of save in $s6:
add $t1, $t1, $s6 # t1 = address of save[i] (i.e., base + offset)

Now we can use that address to load save[i] into a temporary register:

lw $t0, 0($t1) # t0 = save[i]

The next instruction performs the loop test, exiting if save[i] ≠ k:


bne $t0, $s5, Exit # go to Exit if save[i] ≠ k (i.e., exit the loop)

The next instruction adds 1 to i:

addi $s3, $s3, 1 # i = i + 1 (i.e., increment index)


The end of the loop branches back to the while test at the top of the loop. We just add the Exit label
after it, and we’re done:

j Loop # go to Loop
Exit:

Everything together:

LOOP:
sll $t1, $s3, 2 # t1 = i * 4
add $t1, $t1, $s6 # t1 = address of save[i]
lw $t0, 0($t1) # t0 = save[i]
bne $t0, $s5, EXIT # go to exit if save[i] ≠ k
addi $s3, $s3, 1 # i = i + 1
j Loop # goto Loop
EXIT: # exit loop

Example: Write the MIPS instruction for the following C statements:

for (i = 0; i < n; i++) {


x = 2 * x;
}
x += 7;

# Allocate i = $s0, x = $s1, n = $s2

add $s0, $s0, $zero # i = 0


LOOP: slt $t0, $s0, $s2 # t0 = i < n ? 1 : 0
beq $t0, $zero, EXIT # End loop if i >= n
add $s1, $s1, $s1 # x = 2 * x
addi $s0, $s0, 1 # i += 1
j LOOP # next iteration
EXIT:
addi $s1, $s1, 7 # x += 7

Case/Switch Statement
The case/switch statement in most programming languages allows the programmer to select one of
many alternatives depending on a single value. The simplest way to implement switch is via a
sequence of conditional tests, turning the switch statement into a chain of if-then-else statements.
NOTE: Although there are many statements for decisions and loops in high-level programming
languages (e.g., C, Java, Python, Raku, etc.), the bedrock statement that implements them at the
instruction set level is the conditional branch.

Signed and Unsigned Integers


Numbers are kept in computer hardware as a series of high and low electronic signals, which leads to
the natural use of binary numbers in computers (i.e., high and low = 2 states). A single digit of a binary
number is thus the “atom” of computing, since all information is composed of binary digits or bits. This
fundamental building block can be one of two values, which can be thought of as several alternatives:
high or low, on or off, true or false, or 1 or 0.

We number the bits 0, 1, 2, 3, . . . from right to left in a word. The drawing below shows the numbering
of bits within a MIPS word and the placement of the binary number 1011:

In a MIPS word, we call the rightmost bit and leftmost bit the least significant bit (lsb) and most
32
significant bit (msb) respectively. The MIPS word is 32 bits long, so we can represent 2 different
32
32-bit patterns. It is natural to let these combinations represent the numbers from 0 to 2 −1
(4,294,967,295 in base 10).

Historically there have been different representations, notably signed magnitude and one’s
complement, to encode negative numbers in the binary system. However, today every computer
uses two’s complement binary representations for signed numbers.
Two’s Complement
The two’s complement representation is quite simple; leading 0s mean positive, and leading 1s mean
negative.

Two’s complement has a few advantages over the other representations:


● It makes the hardware simpler.
● All negative numbers have a 1 in the most significant bit.
● Hardware needs to test only this bit to see if a number is positive or negative (with the number 0
considered positive). This bit is often called the sign bit. By recognizing the role of the sign bit,
we can represent positive and negative 32-bit numbers in terms of the bit value times a power of
31 30 29 1 0
2: (𝑥31 × − 2 ) +(𝑥30 × − 2 ) + (𝑥29 × − 2 ) +···+ (𝑥1 × − 2 ) + (𝑥0 × − 2 ).

A disadvantage of two’s complement is that there’s a negative number, 2,147,483,648 (base 10), that
has no corresponding positive counterpart. This imbalance proved to be a worry to the inattentive
programmer, however this was a fair trade-off when compared to sign and magnitude and/or one’s
complement. For instance, for sign and magnitude both the programmer and the hardware designer
had problems.

Negation Shortcut
To quickly negate a two’s complement binary number simply flips all its bits, and then add 1 to the
result.
This shortcut is based on the observation that the sum of a number and its inverted representation must
be 111 . . . 111 (base 2), which represents -1. Since 𝑥 + 𝑥' = − 1, therefore 𝑥' + 1 = − 𝑥.

Example: Negate 2020 (base 10), and then check the result by negating -2020 (base 10).

2020𝑡𝑒𝑛 = 0000 0000 0000 0000 0111 1110 01002

Negating this number by inverting all bits and then adding 1,

1111 1111 1111 1111 1000 0001 1011


+ 1
———————————————————————————————————————————
1111 1111 1111 1111 1000 0001 1100

Thus,

− 2020𝑡𝑒𝑛 = 1111 1111 1111 1111 1000 0001 11002

Let’s now negate -2020:

0000 0000 0000 0000 0000 0111 1110 0011


+ 1
———————————————————————————————————————————————
0000 0000 0000 0000 0000 0111 1110 0100

Sign Extension Shortcut


To convert a binary number represented in n bits to a number represented with more than n bits, we
take the most significant bit from the smaller quantity—the sign bit—and replicate it to fill the new bits of
the larger quantity. The old nonsign bits are simply copied into the right portion of the new word.

For example, the immediate field in the load, store, branch, add, and set on less than instructions
15 15
contains a two’s complement 16-bit number, representing − 32, 76810(2 ) to 32, 76710 (2 − 1). To
add the immediate field to a 32-bit register, the computer must convert that 16- bit number to its 32-bit
equivalent.

Example: Convert 16-bit binary versions of 5 (base 10) and 5 (base 10) to 32-bit binary numbers.

5𝑡𝑒𝑛 = 0000 0000 0000 0101𝑡𝑤𝑜


It is converted to a 32-bit number by making 16 copies of the value in the most significant bit (0) and
placing that in the left-hand half of the word. The right half gets the old value:

0000 0000 0000 0000 0000 0000 0000 0101𝑡𝑒𝑛 = 5𝑡𝑒𝑛

Let’s now negate the 16-bit version of 5 using the negation shortcut. Thus, 0000 0000 0000 0101
becomes 1111 1111 1111 1010 + 1 = 1111 1111 1111 1011𝑡𝑤𝑜 = − 5𝑡𝑒𝑛.

Creating a 32-bit version of the negative number means copying the sign bit 16 times and placing it on
the left:

1111 1111 1111 1111 1111 1111 1111 1011𝑡𝑤𝑜 =− 5


𝑡𝑒𝑛

Takeaway
We need to represent both positive and negative integers within a computer word, and although there
are pros and cons to any option (signed magnitude, one’s complement, two’s complement), the
unanimous choice since 1965 has been two’s complement.

Representing Instructions in the Computer


Instructions are kept in the computer as a series of high and low electronic signals and may be
represented as numbers. In fact, each part of an instruction can be considered as an individual number,
and placing these numbers side by side forms the instruction.

Since registers are referred to in instructions, there must be a convention to map register names into
numbers. In MIPS assembly language, registers $s0 to $s7 map onto registers 16 to 23. Hence, $s0
means register 16, $s1 means register 17, etc. The following table summarizes the 32 registers, some
of which have not been introduced yet:

Register name Number Usage

$zero 0 Constant 0

$at 1 Reserved for the assembler

$v0, $v1 2-3 Expression evaluation and function’s returned values

$a0-$a3 4-7 Function arguments 1 - 4

$t0-$t7 8-15 Temporary (not preserved across function calls)

$s0-$s7 16-23 Saved (preserved across function calls)


$t8-$t9 24-25 Temporary (not preserved across function calls)

$k0-$k1 26-27 Reserved for OS kernel

$gp 28 Global area pointer

$sp 29 Stack pointer

$fp 30 Frame pointer

$ra 31 Returned address (used by function calls)

MIPS Fields
The real MIPS language version of an instruction represented symbolically is a sequence of contiguous
segments known as fields, each of which represent a particular part of the instruction. Depending on
how the fields are laid out, there are three main instruction formats (i.e. a form of representation of an
instruction composed of fields of binary numbers) in MIPS: R-type, I-type, and J-type.

R-type
The following representation is the instruction format for R-type instructions.

As you can see from counting the number of bits, this MIPS instruction takes exactly 32 bits—the same
size as a data word. Here is the meaning of each name of the fields in MIPS instructions:
● op: Basic operation of the instruction, traditionally called the opcode (i.e., the field that denotes
the operation and format of an instruction.).
● rs: The first register source operand.
● rt: The second register source operand.
● rd: The register destination operand. It gets the result of the operation.
● shamt: Shift amount.
● funct: Function. This field, often called the function code, selects the specific variant of the
operation in the op field. Each operation is represented by a number. For example, the add
instruction is 32 (in decimal).

Except for the three shift instructions (sll, srl, and sra), all other remaining R-type instructions use
only registers. These instructions are identified by an opcode of 0, and are differentiated by their funct
values (i.e., the value that identifies the operation). For example, the function value for add is 32 (0x20)
while sub’s funct value is 34 (0x22).

Example: The real MIPS version of the instruction


add $t0, $s1, $s2 # t0 = s1 + s2

have the following decimal representation:

op rs rt rd shamt funct

0 17 18 8 0 32

Interpretation: The first and last field in combination tells MIPS computer that this instruction performs
an addition. The second and third fields give the first ($s1) and second ($s2) source operands
respectively. The fourth field is the destination operand ($t0). The fifth field is unused in this instruction
(no shifting is done) so it’s set to zero.

In binary the same instruction is as follows:

000000 10001 10010 01000 00000 100000

I-type
The I-type instruction format is specifically designed for immediates; it’s used by operations dealing
with immediates (e.g., addi, ori, etc) or operations for data transfer (e.g., lw, sw, etc.), all of which
need longer fields than those provided by the R-type. The fields of the I-format are laid out as follows:

I-type instructions are identified and differentiated by their opcode numbers (any number greater than
3). All of these instructions feature a 16-bit immediate, which is sign-extended to a 32-bit value in every
instruction (except for the and, or, and xor instructions which zero-extend and the lui instruction in which
it does not matter). Branch instructions also effectively multiply the immediate by 4, to get a byte offset.

Example: In the following load word and store word instructions

lw $t0, 32($s3) # t0 = A[8]


sw $t0, 32($s3) # A[8] = t0

op rs rt address
35 19 8 32

43 19 8 32

For both the load word and store word instructions, 19 (for $s3) is placed in the rs field, 8 (for $t0) is
placed in the rt field, and 32 is placed in the address field. Note the meaning of the rt field has
changed for these instructions (compared with R-type). In a load word instruction, the rt field specifies
the destination register, which receives the result of the load. In a store word instruction, the rt field
specifies the source register, which holds the data to be stored in a memory address. This can be
summarized as follows:

load word lw rt, address Load the word at address into register rt

store word sw rt, address Store the word from register rt into address

Although multiple formats complicate the hardware, we can reduce the complexity by keeping the
formats similar. The fields in each type are laid out in such a way that the same fields are always in the
same place for each type. For example, the first three fields of the R-type and I-type formats are the
same size and have the same names; the length of the fourth field in I-type is equal to the sum of the
lengths of the last three fields of R-type. In the case of the J-type (discussed down below), the length of
its second field is equal to 1) the sum of the last five fields of R-type and/or 2) the sum of the last three
fields of I-type.

How does MIPS decide which format to use? The formats are distinguished by the values in the first
field: each format is assigned a distinct set of values in the first field (op) so that the hardware knows
whether to treat the last half of the instruction as three fields (R-type) or as a single field (I-type).

You might also like