Assembler Manual
COPYRIGHT
This software product is copyrighted and all rights reserved. The
distribution and sale of this product are intended for the use of the
original purchaser only. Lawful users of this program are hereby
licensed only to read the program, from its medium into memory of a
computer, solely for the purpose of executing the program. Duplicating,
copying, selling or otherwise distributing this product is a violation
of the law.
This manual is copyrighted and all rights are reserved. This document
may not, in whole or part, be copied, photocopied, reproduced,
translated or reduced to any electronic medium or machine readable form
without prior consent, in writing, from Commodore Business Machines
(CBM).
PREFACE
This package contains everything that you will need to create, assemble,
load and execute 6500 series Assembly language code. You will notice
that, like the software, this user's manual is directed towards the
experienced computer user who already has some familiarity with the
6500 series Assembly language and the operations of the Commodore PET
computer.
USER CONVENTIONS
Throughout this manual there are certain conventions used to help make
explanations less ambiguous. A list of these conventions is given
below. We recommend that the user become familiar with these.
TABLE OF CONTENTS
INTRODUCTION
This manual describes the Assembly Language and assembly process for
Commodore PET programs which use one of the 6500 series microprocessors.
Several assemblers are available for 6500 series program development.
Each is slightly different in detail of use, yet all are the same in
principle. The 6500 series processors include the 6502 through the 6515
(the instruction sets are identical).
The process of translating a mnemonic or symbolic form of a computer
program to actual machine code is called assembly, and a program which
performs the translation is an assembler. We refer to the symbolic form
of the program as source code, and to the set of associations for those
symbols as the Assembly Language. In general, one Assembly Language
statement will translate into one machine instruction. This
distinguishes an assembler from a compiler, which may produce many
machine instructions from a single statement. An assembler which
executes on a computer other than the one for which code is generated
is called a cross-assembler. Use of cross-assemblers for program
development for microprocessors is common because often a microcomputer
system has fewer resources than are needed for an assembler. However,
in the case of the Commodore PET, this is not true. With a floppy disk
and printer, the system is well suited for software development.
Normally, digital computers use the binary number system for
representation of data and instructions. Computers understand only ones
and zeros corresponding to an 'ON' or 'OFF' state. Users, on the other
hand, find it difficult to work with the binary number system and hence,
use a more convenient representation such as octal (base 8), decimal
(base 10), or hexadecimal (base 16). Two representations of the 6500
series operation to 'load' information into an 'accumulator' are:
10101001 (binary)
A9 (hexadecimal)
In this example, LDA is the symbol for A9, Load the Accumulator. An
assembler can translate the symbolic form LDA to the numeric form A9.
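The equivalence of the notations above can be checked with a short
sketch. It is written in Python purely for illustration (Python is not
part of this package), and the helper name notations is hypothetical.

```python
# Illustration only: one opcode byte written in several notations.
LDA_IMMEDIATE = 0b10101001  # binary form quoted in the text


def notations(value):
    """Return the binary, octal, hexadecimal and decimal spellings of one byte."""
    return {
        "binary": format(value, "08b"),
        "octal": format(value, "o"),
        "hexadecimal": format(value, "02X"),
        "decimal": str(value),
    }


forms = notations(LDA_IMMEDIATE)
for name, text in forms.items():
    print(name, text)
```

All four spellings describe the same eight bits; the assembler simply
lets the programmer write LDA in place of any of them.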
In an instruction such as

     L2   LDA #12

the label is L2, the opcode is LDA, and the operand is #12. At least
one blank must separate the three parts (fields) of the instruction.
Additional blanks may be inserted by the programmer for ease of reading.
Instructions for the 6500 series processors have at most one operand and
may have none. When there is no operand, the operation to be performed
is totally specified by the opcode, as in CLC (Clear the Carry Bit).
Programming in Assembly Language requires learning the instruction set
(opcodes), addressing conventions for referencing data, the data
structures within the processor, as well as the structure of Assembly
Language programs. The user will be aided in this by reading and
studying the 6500 series hardware and programming manuals supplied with
this development package.
Assembler instructions for the Commodore PET Assembler are of two basic
types according to function:
Fields are bracketed to indicate that they are optional. Labels and
comments are always optional and many opcodes such as RTS (Return from
Subroutine) do not require operands. A line may also contain only a
label or only a comment.
A typical instruction showing all four fields is:
1.1 Symbolic
Perhaps the most common operand addressing mode is the symbolic form as
in:
LDA BETA ;PUT BETA VALUE IN ACCUMULATOR
the address ALPHA + BETA is computed by the assembler, and the value at
the computed address is loaded into the accumulator.
LDA BETA
If BETA is located at byte 4B in page zero memory, then the code
generated is A5 4B. This is called page zero addressing. If BETA is at
013C, which is located in memory page one, the code generated is AD 3C
01. This is an example of 'absolute' addressing. Thus, to optimize
storage and execution time, a programmer should design with data areas
in page zero memory whenever possible. (Please avoid assembling code in
page zero, as problems may be encountered.) Remember, the assembler
makes decisions on which form to use, based on operand address
computation.
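The decision described above can be sketched as follows. This is an
illustration in Python, not the assembler's actual code; the function
name assemble_lda is hypothetical, and only the LDA opcodes quoted in
the text (A5 and AD) are modelled.

```python
def assemble_lda(address):
    """Return the bytes for LDA <address>, using the shorter form when possible."""
    if not 0 <= address <= 0xFFFF:
        raise ValueError("address must fit in 16 bits")
    if address <= 0xFF:
        # Page zero form: opcode A5 followed by a single address byte.
        return bytes([0xA5, address])
    # Absolute form: opcode AD followed by low byte, then high byte.
    low, high = address & 0xFF, address >> 8
    return bytes([0xAD, low, high])


print(assemble_lda(0x004B).hex(" ").upper())  # BETA in page zero
print(assemble_lda(0x013C).hex(" ").upper())  # BETA in page one
```

The page zero form saves one byte of storage for every reference, which
is why data areas are best placed there.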
1.2 Constants
LDA BETA + 5
the decimal number 5 is added to BETA to compute the address.
Similarly,
LDA BETA + $5F
LDA #2
specifies that the decimal value 2 is to be put into the accumulator.
Similarly,
LDA #'G
will load the ASCII value of character G into the accumulator. Since
the accumulator is one byte, the value loaded must be in the range of 0
to 255 (decimal).
1.3 Relative
1.4 Implied
Four instructions, ASL, LSR, ROL, and ROR, are special in that the
accumulator, A, can be used as an operand. In this special case, these
four instructions are treated as implied mode addressing and only an
operation code is generated.
LDA (BETA,X)
the parentheses around the operand indicate indirect mode. In the
above example, the value in index register X is added to BETA. That sum
must reference a location in page zero memory. During execution, the
high order byte of the address is ignored, thus forcing a page zero
address. The two bytes starting at that location in page zero memory
are taken as the address of the operand in low byte, high byte format.
For purposes of illustration, assume the following:
Address Value
+---------+
BETA | $12 | + $04 = $0016
+---------+
+---------+
$0016 | $25 | Treated as Low Byte
+---------+
$0017 | $01 | Treated as High Byte, result is $0125
+---------+
+---------+
$0125 | $37 | This value is loaded into the Accumulator
+---------+
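The illustration above can be traced step by step with a small sketch.
It is written in Python for illustration only; the dictionary standing
in for memory and the function name lda_indexed_indirect are
hypothetical, and the values are the ones shown in the diagram (BETA at
$12, X holding $04).

```python
# Memory contents taken from the illustration; all other locations are unused.
memory = {0x0016: 0x25, 0x0017: 0x01, 0x0125: 0x37}


def lda_indexed_indirect(memory, base, x):
    """Model LDA (base,X): add X to the base, then follow the page zero pointer."""
    pointer = (base + x) & 0xFF          # high-order byte ignored: stays in page zero
    low = memory[pointer]                # first byte is the low byte of the address
    high = memory[(pointer + 1) & 0xFF]  # second byte is the high byte
    return memory[(high << 8) | low]     # value found at the effective address


accumulator = lda_indexed_indirect(memory, base=0x12, x=0x04)
print(format(accumulator, "02X"))
```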
LDA (GAMMA),Y
In this case, GAMMA references a page zero location at which an address
is to be found. The value in index Y is added to that address to
compute the actual address of the operand. Suppose for example that:
LDA (GAMMA),Y
Address Value
+---------+
GAMMA | $38 |
+---------+
+---------+
$0038 | $54 | Treated as low byte
+---------+
$0039 | $00 | Treated as high byte, result is $0054
+---------+ Add $07 from Y, result is $005B
+---------+
$005B | $26 | This value is loaded into the Accumulator
+---------+
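For comparison with the indexed indirect form, the same kind of trace
can be made for LDA (GAMMA),Y. Again this is only a Python sketch with
hypothetical names; the values are those of the illustration (GAMMA at
$38, Y holding $07). Here the index is added after the pointer has been
fetched, not before.

```python
# Memory contents taken from the illustration; all other locations are unused.
memory = {0x0038: 0x54, 0x0039: 0x00, 0x005B: 0x26}


def lda_indirect_indexed(memory, pointer, y):
    """Model LDA (pointer),Y: fetch the page zero pointer, then add Y to it."""
    low = memory[pointer & 0xFF]          # low byte of the base address
    high = memory[(pointer + 1) & 0xFF]   # high byte of the base address
    base = (high << 8) | low
    effective = (base + y) & 0xFFFF       # Y is added to the fetched address
    return memory[effective]


accumulator = lda_indirect_indexed(memory, pointer=0x38, y=0x07)
print(format(accumulator, "02X"))
```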
Labels and symbols other than directives may not begin with a period.
.BYTE is used to reserve one byte of memory and load it with a value.
The directive may contain multiple operands which will store values in
consecutive bytes. ASCII strings may be generated by enclosing the
string with quotes. (All quotes are "single" quotes, i.e., SHIFT 7.)
It should be noted, however, that there is a limitation of 40 ASCII
characters that can be stored in each .BYTE directive.
HERE .BYTE 2
THERE .BYTE 1, $F, @3, %101, 7
ASCII .BYTE 'ABCDEFH'
could be used to store:
JIM'S CYCLE
It should be noted that the use of arithmetic operations in the .BYTE
directive is not supported in this version of the package.
.WORD is used to reserve and load two bytes of data at a time. Any
valid expression, except for ASCII strings, may be used in the operand
field. For example:
HERE .WORD 2
THERE .WORD 1, $FF03, @3
WHERE .WORD HERE, THERE
The most common use for .WORD is to generate addresses, as shown in the
previous example labelled "WHERE", which stores the 16-bit addresses of
"HERE" and "THERE". Addresses in the 6500 series are fetched from
memory in the order low-byte, then high-byte. Therefore, .WORD
generates the value in this order.
The hexadecimal portion of the example ($FF03) would be stored $03, $FF.
If this order is not desired, use .DBYTE rather than .WORD.
.DBYTE is exactly like .WORD, except the bytes are stored in high-byte,
low-byte order. For example:
.DBYTE $FF03
will generate $FF, $03. Thus, fields generated by .DBYTE may not be
used as indirect addresses.
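The difference between the two directives is only the order of the two
bytes, as the following sketch shows. It is a Python illustration, not
assembler input; the function names word and dbyte are hypothetical
stand-ins for the directives.

```python
def word(value):
    """Bytes as .WORD would store them: low byte first, then high byte."""
    return value.to_bytes(2, "little")


def dbyte(value):
    """Bytes as .DBYTE would store them: high byte first, then low byte."""
    return value.to_bytes(2, "big")


print(word(0xFF03).hex(" ").upper())
print(dbyte(0xFF03).hex(" ").upper())
```

Because the processor reads an indirect address low byte first, only the
order produced by word() can serve as an indirect address.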
An advanced technique is to set up vector tables under the assembler
directive .DBYTE and to push the starting vector address onto the stack,
then execute an RTS instruction to access your routine. Remember that
the addresses for the operands should be the actual address location
minus one. When constructing a JUMP table in the usual way using an
indirect jump instruction (JMP, opcode $6C), do not subtract one from
the address of the operand.
Equal (=) is the EQUATE directive and is used to reserve memory
locations, reset the program counter (*), or assign a value to a symbol.
HERE * = * + 1 ;RESERVE ONE BYTE
WHERE * = * + 2 ;RESERVE TWO BYTES
* = $200 ;SET PROGRAM COUNTER
NB = 8 ;ASSIGN VALUE
MB = NB + %101 ;ASSIGN VALUE
The '=' directive is very powerful and can be used for a wide variety of
purposes.
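The two value assignments in the example can be worked through with a
short sketch. It is written in Python for illustration and assumes the
usual prefix convention of the 6500 series assemblers: $ for
hexadecimal, @ for octal, % for binary, and no prefix for decimal. The
function name parse_constant is hypothetical.

```python
# Assumed prefix convention: $ hexadecimal, @ octal, % binary, none decimal.
PREFIX_BASES = {"$": 16, "@": 8, "%": 2}


def parse_constant(text):
    """Convert one numeric constant, as written in source code, to an integer."""
    text = text.strip()
    if not text:
        raise ValueError("empty constant")
    base = PREFIX_BASES.get(text[0])
    if base is None:
        return int(text, 10)       # no prefix: decimal
    return int(text[1:], base)     # drop the prefix character and convert


NB = parse_constant("8")           # NB = 8
MB = NB + parse_constant("%101")   # MB = NB + %101
print(NB, MB)
```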
The asterisk (*) directive is used to change the program counter. To create
an object code program that starts assembly at any address greater than
zero, the '*' directive must be used. For example, '* = $200' starts
assembling at address $200.
ERRORS, NOERRORS:
Used to control creation of a separate error file. The error file
contains the source line in error and the error message. This
facility is normally of greatest use to time-sharing users who have
limited print capacity. The error file may be turned on and examined
until all errors have been corrected. The listing file may then be
examined. Another possibility is to run with:
GENERATE, NOGENERATE:
Used to control printing of ASCII strings in the .BYTE directive. The
first two characters will always be printed, and subsequent characters
will be printed (normally two bytes per line), if GENERATE is used.
.END should be the last directive in a file and is used to signal the
physical end of the file. Its use is optional, but highly recommended
for program documentation.
.LIB allows the user to insert source code from another file into the
assembly. When the assembler encounters this directive, it temporarily
ceases reading source code from the current file and starts reading from
the file named in the .LIB. Processing of the original source file
resumes when end-of-file (EOF) or .END is encountered in the library
file. The control file containing the .LIB can contain other assembler
directives to turn the listing function on and off, etc.
.FIL can be used to link another file to a current one during assembly.
A library file called by a .LIB may not contain another .LIB, but it may
contain a .FIL. A '.FIL' terminates assembly of the file containing it
and transfers source reading to the file named in the operand field. There
are no restrictions on the number of files which may be linked by .FIL
directives. Caution should be exercised when using this directive to
ensure that no circular linkages are created. An assembler pass can
only be terminated by (EOF) or the .END directive.
Listing File
The listing file will be produced unless the NOLIST option is used on
the .OPT assembler directive. This file is made up of two sections:
Program and Error List, and Symbol Table.
The symbol table will always be produced unless the NOSYM option is
used. It contains a list of all symbols used in the program, and their
addresses.
Interface File
This file does not contain true object code, but data which can be
loaded and converted to machine code by the loader. The format for the
first and all succeeding records, except for the last record, is as
follows:
; n1n0 a3a2a1a0 (d1d0)1 (d1d0)2 ... (d1d0)23 x3x2x1x0
Where the following statements apply:
; 00 c3c2c1c0 x3x2x1x0
1. ; 00 Zero bytes of data are in this record. The zeros identify
this as the final record in a file.
The editor is used to enter and modify source files for the assembler.
The editor retains all of the features of the BASIC screen editor and
allows AUTOmatic line numbering, FIND, CHANGE, DELETE within a range,
and reNUMBER. Other commands include COLD, GET, PUT, BREAK, KILL and
FORMAT. All of the commands are detailed in the summary at the end of
this section.
The editor commands operate in a similar fashion to the commands already
existing in the computer's BASIC. For practice, we suggest that you try
to create short example files using the editor commands.
The data files on which the assembler operates are made up of CBM ASCII
characters with each line terminated by a carriage return. The only
restriction on data lines is in naming. Due to the method in which the
assembler parses, spaces are not allowed in filenames. The files are
sequential and must be terminated by a zero byte $00. When listing a
directory, these files will show as file type SEQ.
SYS 59648
After typing the SYS command, the editor has been loaded. At this point,
type a NEW command to clear the text pointers. You are now ready to edit
or enter assembler source files.
The AUTO command generates new line numbers while entering a new source
code file. To enable the AUTO command, type the following:
AUTO n1
BREAK Command
The CHANGE command automatically locates and replaces one string with
another (multiple occurrences). This command is entered in the
following format:
CHANGE/str1/str2/(,n1-n2)
/ Delimits the str1 and str2 (use any character not in either
string)
str1 Search string
str2 Replacement string
n1-n2 Range parameters. The format is the same as the LIST command
in BASIC. If omitted, the whole file is searched.
(Optional)
COLD Command
Performs a cold-start of the PET
CPUT Command
The CPUT command outputs source files with no unnecessary spaces to the
disk for later assembly. The syntax for this command is the same as the
PUT command.
DELETE
The DELETE command allows the user to delete several lines at a time.
Simply input the range of lines to be deleted (n1 through n2). (The
format is the same as the LIST command in BASIC).
DELETE n1-n2
To delete a single line, enter the line number alone on a blank line and
press RETURN.
FIND string
The FIND command is used to search for and locate specific character
strings in text. Each occurrence of the string is printed on the CRT.
You can pause the printing with the space bar. Printing can then be
continued with the space bar, or terminated with the RUN/STOP key. The
format of the FIND command is:
FIND/str1/(,n1-n2)
Note: This command has the same controls as FIND. For example, press
space bar to halt printing and press it again to restart printing.
Press the RUN/STOP key to terminate the listing.
GET Command
This command is used to load assembler source text files into the editor
from disk. It can also be used to append to files already in memory.
GET "filename"(,n1)(,n2)(,n3)
KILL Command
This command causes the editor to disengage. To restart the editor,
type the same command used to start the editor (SYS 59648).
LIST Command
The editor LIST command works in the same manner as the LIST command in
BASIC.
LIST (n1)-(n2)
where n1-n2 specifies a range of lines. Valid parameters also include
'n1-' (which will list all lines from n1 to the end) and '-n2' (which
will list all lines from the beginning up to and including n2).
ReNUMBER Lines
The NUMBER command allows the user to renumber all or part of the file
in memory.
NUMBER (n1),(n2),(n3)
The PUT command outputs source files to the disk for later assembly.
PUT has the ability to output all or part of the memory resident file.
PUT "filename",(n1-n2),(n3),(n4)
n1 Starting line number (Optional)
n2 Ending line number (Optional)
n3 Device number, default is 8 (Optional)
n4 Secondary address, default is 8 (Optional)
If n1-n2,n3,n4 are left out, the whole file is output to the disk.
The assembler will print a copyright notice and the first user prompt
when execution begins.
When a program is being assembled, the user has the option of creating
an object file which contains the data necessary to create a machine
code program (by the loader). The name of this file is specified by the
user before assembly starts.
Enter the name of the source file that you wish to assemble.
After entering this last prompt, the assembler program begins to
execute. If, during this assembly, the symbol table overflows, the
assembly process will stop.
SYS 44421
When activated, the loader prints a copyright notice and prompts the user
for a load offset. The offset is used to place object code into an address
range other than the one that it was assembled into. This allows the user
to assemble for an area where there is no RAM and load into a RAM area.
The object can then be programmed into EPROM, etc.
After the offset is entered, the loader will prompt the user for the
object filename to be loaded. The loader will then initialize the
drive, search for the file, and start the load. As the data is loaded,
the program will print the input data to the CRT. This is for user
feedback only. When the load is completed, the loader prints the
message 'END OF LOAD' and returns to BASIC.
There are three errors that can occur during a load (each is self
documenting):
Errors are considered fatal; the load is terminated, the object file is
closed, and control is returned to BASIC.
**A,X,Y,S,P RESERVED
All of the branch instructions (excluding the two jumps) are assembled
into two bytes of code. One byte is for the opcode and the other for
the address to branch to. The branch is taken relative to the address
of the beginning of the next instruction. If the value of the byte is
0-127, the branch is forward; if the value is 128-255, the branch is
backward. (A negative branch is in two's complement form). Therefore,
a branch instruction can only branch forward 127 or backward 128 bytes
relative to the beginning of the next instruction. If an attempt is
made to branch further than these limits, this error message will be
printed. To correct, restructure the program.
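The range check described above can be sketched as follows. This is a
Python illustration of the arithmetic only, not the assembler's own
routine; the function name branch_offset and the addresses used are
hypothetical.

```python
def branch_offset(branch_address, target_address):
    """Return the one-byte operand for a branch located at branch_address."""
    next_instruction = branch_address + 2        # a branch occupies two bytes
    displacement = target_address - next_instruction
    if not -128 <= displacement <= 127:
        raise ValueError("branch out of range")  # the condition this error reports
    return displacement & 0xFF                   # two's complement form, 0-255


print(format(branch_offset(0x0200, 0x0210), "02X"))  # forward branch
print(format(branch_offset(0x0210, 0x0200), "02X"))  # backward branch
```

A target that fails this check has to be reached some other way, for
example by branching around a JMP instruction, since JMP carries a full
two-byte address.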
**CAN'T EVAL EXPRESSION
**FILE EXISTS
The FILE EXISTS error message occurs when the object file named already
exists on the diskette. This error can be corrected by scratching the
old file or changing the diskette.
This error may also mean that a value on the right side of the '=' is
not defined at all in the program, in which case, the cure is the same
as for undefined values.
The assembler cannot process more than one level of computed forward
reference. All expressions with symbols that appear on the right side
of any equal sign must refer only to previously defined symbols for the
equate to be processed.
**ILLEGAL OPERAND TYPE
After finding an opcode that does not have an implied operand, the
assembler parses the operand field (the next non-blank field following
the opcode) and determines what type of operand it is (indexed,
absolute, etc.). If the type of operand found is not valid for the
opcode, this error message will be printed.
Check to see what types of operands are allowed for the opcode and make
sure the form of the operand type is correct (see the section 1.1 on
addressing modes).
Check for the operand field starting with a left parenthesis. If it is
supposed to be an indirect operand, recheck the correct format for the
two types available. If the format was wrong (missing right parenthesis
or index register), this error will be printed. Also check for missing
or wrong index registers in an indexed operand (form: expression, index
register).
**IMPROPER OPCODE
This error can occur if opcodes are misspelled, in which case the
assembler will interpret the opcode as a label (if no label appears on
the card). It will then try to assemble the next field as the opcode.
If there is another field, this error will be printed.
Check for a misspelled opcode or for more than one label on a line.
**INDEXED MUST BE X OR Y
After finding a valid opcode, the assembler looks for the operand. In
this case, the first character in the operand field is a left
parenthesis. The assembler interprets the next field as an indirect
address which, with the exception of the jump statement, must be
indexed by one of the index registers, X or Y. In the erroneous case,
the character that the assembler was trying to interpret as an index
register is not X or Y and this error message is printed.
Check for the operand field starting with a left parenthesis. If it is
supposed to be an indirect operand, recheck the correct format for the
two types available. If the format is wrong (missing right parenthesis
or index register), this error will be printed. Also, check for missing
or wrong index registers in an indexed operand (form: expression, index
register).
**INDIRECT OUT OF RANGE
This error will only occur if the operand field is in correct form
(i.e., an index register following the address), and the address field
is out of page zero. To correct this, the address field must refer to
page zero memory. (The implied high order byte is 00.)
**INVALID ADDRESS
Check for an unlabelled statement with only an operand field that does
not start with a special character. Also check for an illegal label in
the instruction.
**LABEL TOO LONG
All symbols are limited to six characters in length. When parsing, the
assembler looks for one of the separating characters (usually a blank)
to find the end of a label or string. If other than one of these
separators is used, the error message will be printed providing that the
illegal separator causes the symbol to extend beyond six characters in
length. Check for no spacing between labels and opcodes. Also, check
for a comment card with a long first word that doesn't begin with a
semicolon. In this case the assembler is trying to interpret part of
the comment as a label.
**NON-ALPHANUMERIC
Labels are made up of one to six alphanumeric characters. The label field
must be separated from the opcode field by one or more blanks. If a
special character or other separator is between the label and the
opcode, this error message might be printed.
**PC NEGATIVE--RESET 0
An assembled program is loaded into core in the range of position 0 to
64K (65535). This is the extent of the machine. A maximum of two bytes
can be used to define an address. Because there is no such thing as
negative memory, an attempt to reference a negative position will cause
this error and the program counter (or pointer to the current memory
location) to be reset to zero.
When this error occurs, the assembler continues assembling the code
with the new value of the program counter. This could cause multiple
bytes to be assembled into the same locations. Therefore, care should
be taken to keep the program counter within the proper limits.
**UNDEFINED DIRECTIVE
**UNDEFINED SYMBOL
This error is generated by the second pass. If in the first pass the
assembler finds a symbol in the operand field (the field following the
opcode or an equals sign) that has not been defined yet, the assembler
puts the symbol into the table and flags it for interpretation by pass
two. If the symbol is defined (shows up on the left of an equate or as
the first non-blank field in a statement), pass one will define it and
enter it in the symbol table. Therefore, a symbol in an operand field,
found before the definition, will be defined with a value when pass two
assembles it. In this case, the assembly process can be completed.
This is what is meant by one level of forward reference (See Forward
Reference Error).
However, if pass one doesn't find the symbol as a label or on the left
of an equate, the assembler never enters it in the symbol table as a
defined symbol. When pass two tries to interpret the operand field the
symbol is in, there is no corresponding value for the symbol and the
field cannot be interpreted. Therefore, the error message is printed
with no value for the operand.