System Software: Assignment On

This document summarizes topics related to system software, including ANSI C macros, MASM assembler, and the Microsoft object file format. It discusses how the C preprocessor handles macros, header files, and other preprocessing directives. It also provides an overview of the MASM assembler, describing its support for 16-bit, 32-bit, and 64-bit assembly as well as its ability to generate PE/COFF object files. Finally, it outlines the structure and contents of Microsoft object files, including their record-based format and common fields.

Uploaded by

Sahanshah K

ASSIGNMENT ON

SYSTEM SOFTWARE

Topics :
- ANSI C Macros - The C preprocessor
- MASM Assembler
- Microsoft object file format

ANSI C Macros - The C preprocessor

The C preprocessor is a macro processor that is used automatically by the C
compiler to transform your program before actual compilation. It is called a macro
processor because it allows you to define macros, which are brief abbreviations for
longer constructs.

The C preprocessor provides four separate facilities that you can use as you see fit :

  - Inclusion of header files. These are files of declarations that can be substituted into
    your program.
  - Macro expansion. You can define macros, which are abbreviations for arbitrary
    fragments of C code, and then the C preprocessor will replace the macros with
    their definitions throughout the program.
  - Conditional compilation. Using special preprocessing directives, you can include or
    exclude parts of the program according to various conditions.
  - Line control. If you use a program to combine or rearrange source files into an
    intermediate file which is then compiled, you can use line control to inform the
    compiler of where each source line originally came from.
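The four facilities can be seen side by side in one short file. This is only a minimal sketch: the macro names `GREETING`, `VERBOSE` and `LOG_LEVEL`, the function names, and the file name `original.c` are all made up for illustration.

```c
#include <assert.h>
#include <stdio.h>              /* 1. inclusion of a (system) header file */

#define GREETING "hello"        /* 2. macro expansion: GREETING becomes "hello" */

#ifdef VERBOSE                  /* 3. conditional compilation */
#define LOG_LEVEL 2             /*    kept only if VERBOSE is defined */
#else
#define LOG_LEVEL 1             /*    kept otherwise */
#endif

/* Print the greeting; both macros are expanded before compilation. */
void greet(void)
{
    printf("%s (log level %d)\n", GREETING, LOG_LEVEL);
}

/* 4. line control: tell the compiler that the next line is line 100
      of a file called "original.c" (a made-up name). */
#line 100 "original.c"
int reported_line(void) { return __LINE__; }

/* Self-check: the function above sits on the line renumbered to 100. */
void check_line_control(void)
{
    assert(reported_line() == 100);
}
```

Compiling with `VERBOSE` defined (for example through a compiler option) would select the other branch of the conditional without any change to the source text.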

The C preprocessor is designed for C-like languages; you may run into problems if
you apply it to other kinds of languages, because it assumes that it is dealing with C.
For example, the C preprocessor sometimes outputs extra white space to avoid
inadvertent C token concatenation, and this may cause problems with other languages.

- Transformations Made Globally :

Most C preprocessor features are inactive unless you give specific directives to
request their use. But there are three transformations that the preprocessor always
makes on all the input it receives, even in the absence of directives.

  - All C comments are replaced with single spaces.
  - Backslash-Newline sequences are deleted, no matter where. This feature allows
    you to break long lines for cosmetic purposes without changing their meaning.
  - Predefined macro names are replaced with their expansions.

The first two transformations are done before nearly all other parsing and before
preprocessing directives are recognized. Thus, for example, you can split a line
cosmetically with Backslash-Newline anywhere (except when trigraphs are in use).

/*
*/ # /*
*/ defi\
ne FO\
O 10\
20

is equivalent to `#define FOO 1020'. You can split even an escape sequence
with Backslash-Newline. For example, you can split "foo\bar" between the `\' and the
`b' to get

"foo\\
bar"

This behavior is unclean: in all other contexts, a Backslash can be inserted in a
string constant as an ordinary character by writing a double Backslash, and this creates
an exception. But the ANSI C standard requires it. (Strict ANSI C does not allow
Newlines in string constants, so the standard does not treat this as a problem.)
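The three global transformations can be checked with a small file such as the one below. It is a sketch under stated assumptions: the names `counter`, `joined`, `this_file` and `check_global_transformations` are hypothetical, and the continuation lines are assumed to start in the first column.

```c
#include <assert.h>
#include <string.h>

/* A comment is replaced by one space, so this reads "int counter = 3;". */
int/* no space was typed here */counter = 3;

/* Backslash-Newline is deleted, so the two pieces join into the single
   token 1020. The continuation line must start in column 1, because
   any leading space would end up between the two halves. */
#define FOO 10\
20

/* The same deletion happens inside a string constant: this is "foobar". */
static const char joined[] = "foo\
bar";

/* Predefined macros are expanded without any directive asking for it. */
static const char *const this_file = __FILE__;

/* Self-check of the claims made in the comments above. */
void check_global_transformations(void)
{
    assert(counter == 3);
    assert(FOO == 1020);
    assert(strcmp(joined, "foobar") == 0);
    assert(strlen(this_file) > 0);
}
```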

- Preprocessing Directives :

Most preprocessor features are active only if you use preprocessing directives to
request their use.
Preprocessing directives are lines in your program that start with `#'. The `#' is
followed by an identifier that is the directive name. For example, `#define' is the
directive that defines a macro. Whitespace is also allowed before and after the `#'.
The set of valid directive names is fixed. Programs cannot define new preprocessing
directives.
Some directive names require arguments; these make up the rest of the directive
line and must be separated from the directive name by whitespace. For example,
`#define' must be followed by a macro name and the intended expansion of the
macro.
A preprocessing directive cannot be more than one line in normal circumstances. It
may be split cosmetically with Backslash-Newline, but that has no effect on its
meaning. Comments containing Newlines can also divide the directive into multiple
lines, but the comments are changed to Spaces before the directive is interpreted. The
only way a significant Newline can occur in a preprocessing directive is within a string
constant or character constant.
Note that most C compilers that might be applied to the output from the
preprocessor do not accept string or character constants containing Newlines.
The `#' and the directive name cannot come from a macro expansion. For example, if
`foo' is defined as a macro expanding to `define', that does not make `#foo' a valid
preprocessing directive.

- Header Files :

A header file is a file containing C declarations and macro definitions to be shared
between several source files. You request the use of a header file in your program with
the C preprocessing directive `#include'.

Header files serve two kinds of purposes :

  - System header files declare the interfaces to parts of the operating system. You
    include them in your program to supply the definitions and declarations you need
    to invoke system calls and libraries.
  - Your own header files contain declarations for interfaces between the source files
    of your program. Each time you have a group of related declarations and macro
    definitions all or most of which are needed in several different source files, it is a
    good idea to create a header file for them.

Including a header file produces the same results in C compilation as copying the
header file into each source file that needs it. But such copying would be time-
consuming and error-prone. With a header file, the related declarations appear in only
one place. If they need to be changed, they can be changed in one place, and
programs that include the header file will automatically use the new version when next
recompiled. The header file eliminates the labor of finding and changing all the copies
as well as the risk that a failure to find one copy will result in inconsistencies within a
program.

The usual convention is to give header files names that end with `.h'. Avoid unusual
characters in header file names, as they reduce portability.
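As a rough illustration, the sketch below shows what a small header and one of its users might look like. Both parts are placed in a single listing so that it compiles on its own; in practice the first part would live in a separate file. The file name `buffer.h`, the guard macro `BUFFER_H` and the function `buffer_fill` are hypothetical, and the include guard is a common convention that the text above does not itself describe.

```c
/* ---- what a hypothetical header "buffer.h" would contain ---- */
#ifndef BUFFER_H                /* include guard: skip the body if seen before */
#define BUFFER_H

#include <stddef.h>             /* for size_t */

#define BUFFER_SIZE 1020        /* shared macro definition */

/* Shared declaration: every file that includes the header sees it. */
size_t buffer_fill(char *buf, size_t size, char value);

#endif /* BUFFER_H */

/* ---- what a source file starting with #include "buffer.h" would add ---- */
#include <assert.h>
#include <string.h>

/* The definition must agree with the declaration in the header. */
size_t buffer_fill(char *buf, size_t size, char value)
{
    assert(buf != NULL);
    memset(buf, value, size);
    return size;
}
```

If `BUFFER_SIZE` or the declaration of `buffer_fill` ever changes, only the header needs editing; every source file that includes it picks up the change when it is next recompiled.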

- Macros :

A macro is a sort of abbreviation which you can define once and then use later.
There are many complicated features associated with macros in the C preprocessor.

A simple macro is a kind of abbreviation. It is a name which stands for a fragment of
code. Some people refer to these as manifest constants.
Before you can use a macro, you must define it explicitly with the `#define'
directive. `#define' is followed by the name of the macro and then the code it should
be an abbreviation for. For example,

#define BUFFER_SIZE 1020

defines a macro named `BUFFER_SIZE' as an abbreviation for the text `1020'. If
somewhere after this `#define' directive there comes a C statement of the form

foo = (char *) xmalloc (BUFFER_SIZE);

then the C preprocessor will recognize and expand the macro `BUFFER_SIZE', resulting
in

foo = (char *) xmalloc (1020);

The use of all upper case for macro names is a standard convention. Programs are
easier to read when it is possible to tell at a glance which names are macros.
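A compilable version of this example might look as follows. It is a sketch only: the standard `malloc` stands in for `xmalloc`, which is not part of the standard C library, and the names `make_buffer`, `line` and `line_capacity` are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

#define BUFFER_SIZE 1020

/* After preprocessing, the call below reads: malloc(1020). */
char *make_buffer(void)
{
    return (char *) malloc(BUFFER_SIZE);
}

/* The macro can be used wherever the text 1020 would be valid,
   for example as an array bound. */
static char line[BUFFER_SIZE];

/* Report the size of the array declared with the macro. */
size_t line_capacity(void)
{
    assert(sizeof line == 1020);
    return sizeof line;
}
```

Changing the single `#define` line changes both the allocation and the array bound, which is the main reason for naming the constant.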

MASM Assembler
The Microsoft Macro Assembler (MASM) is an x86 assembler that uses the Intel
syntax for MS-DOS and Microsoft Windows. Beginning with MASM 8.0 there are two
versions of the assembler - one for 16-bit and 32-bit assembly sources, and another
(ML64) for 64-bit sources only.

MASM is maintained by Microsoft, but it has not been sold as a separate product
since version 6.12; instead, it is supplied with various Microsoft SDKs and C compilers.
Recent versions of MASM are included with Microsoft Visual Studio.

An assembly language is specific to a particular physical (or virtual) computer
architecture. This is in contrast to most high-level programming languages, which are
ideally independent of the architecture: a compiler first translates the high-level
code into machine language for the target CPU.

Object module formats supported by MASM:

Early versions of MASM generated object modules using the OMF format, which
was used to create binaries for MS-DOS or OS/2. Since version 6.1, MASM is able to
produce object modules in the Portable Executable (PE/COFF) format. PE/COFF is
compatible with recent Microsoft C compilers, and object modules produced by either
MASM or the C compiler can be routinely intermixed and linked into Win32 and
Win64 binaries.

Assemblers compatible with MASM :

Some other assemblers can assemble most code written for MASM, with the exception
of more complex macros.

  - Turbo Assembler (TASM), developed by Borland and later owned by Embarcadero. It
    was last updated in 2002, was supplied with Delphi and C++Builder for several
    years, and was later discontinued.
  - JWASM Macro Assembler, licensed under the Sybase Open Watcom EULA.
  - Pelle's Macro Assembler, a component of the Pelles C development environment.

Microsoft object file format


The .OBJ files are binary files used by compilers to link in precompiled code. They
contain the symbol and relocation information necessary to link the data and code
contained in the files. The .OBJ files have no common header, which makes validation
or identification guesswork at best. A .OBJ file consists of one or more records, each
with the following layout :

OFFSET          Count     TYPE   Description

0000h           1         byte   Record type (see below)
0001h           1         word   Record length = "LEN" (bytes that follow this
                                 field, checksum included)
0003h           "LEN"-1   byte   Record data
0003h+"LEN"-1   1         byte   Checksum or 0 (that much for validation)

The maximum size of the entire record (unless otherwise noted for specific record
types) is 1024 bytes.
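The record layout lends itself to a small validation routine. The sketch below assumes, following the published OMF specification, that the record length is stored low byte first and counts the data bytes plus the trailing checksum byte, and that a non-zero checksum makes the sum of all bytes in the record equal to 0 modulo 256. The function name `omf_record_size` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return the total size in bytes of the record starting at buf (type byte,
   length word, data and checksum), or 0 if the record is truncated or its
   checksum does not validate. avail is the number of readable bytes. */
size_t omf_record_size(const uint8_t *buf, size_t avail)
{
    assert(buf != NULL);
    if (avail < 3)
        return 0;                       /* no room for type + length */

    /* Record length: 16-bit word, low byte first (assumption, see above). */
    size_t len = (size_t)buf[1] | ((size_t)buf[2] << 8);
    size_t total = 3 + len;
    if (len < 1 || total > avail)
        return 0;                       /* must at least hold the checksum */

    /* A checksum byte of 0 means "not computed"; otherwise all bytes of
       the record, checksum included, must add up to 0 modulo 256. */
    if (buf[total - 1] != 0) {
        uint8_t sum = 0;
        for (size_t i = 0; i < total; i++)
            sum = (uint8_t)(sum + buf[i]);
        if (sum != 0)
            return 0;
    }
    return total;
}
```

Calling the function repeatedly, each time advancing by the value it returns, would walk a file record by record until the data runs out or a record fails to validate.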

The contents of each record are determined by the record type, but certain subfields
appear frequently enough to be explained separately. The format of such fields is
below.

Names :

A name string is encoded as an 8-bit unsigned count followed by a string of count
characters. The character set is usually some ASCII subset. A null name is specified by a
single byte of 0 (indicating a string of length 0).
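Decoding such a field takes only a few lines. The following is a minimal sketch; the function name `omf_read_name` and its parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Decode the name field at p (avail readable bytes) into out as a
   NUL-terminated C string. Returns the number of bytes consumed
   (1 + count), or 0 if the field is truncated or out is too small. */
size_t omf_read_name(const uint8_t *p, size_t avail, char *out, size_t out_size)
{
    assert(p != NULL && out != NULL);
    if (avail < 1)
        return 0;

    size_t count = p[0];                /* 8-bit unsigned count */
    if (avail < 1 + count || out_size < count + 1)
        return 0;

    memcpy(out, p + 1, count);          /* the count characters */
    out[count] = '\0';
    return 1 + count;
}
```

A null name consumes exactly one byte and yields an empty string, so a return value of 1 is a success, not an error.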

Indexed References :

Certain items are ordered by occurrence and are referenced by index. The first
occurrence of the item has index number 1. Index fields may contain 0 (indicating that
they are not present) or values from 1 through 7FFFH. The index number field in an
object record can be either 1 or 2 bytes long. If the number is in the range 0-7FH, the
high-order bit (bit 7) is 0 and the low-order bits contain the index number, so the field
is only 1 byte long. If the index number is in the range 80H-7FFFH, the field is 2 bytes
long: the high-order bit of the first byte is set to 1, its remaining 7 bits hold the
high-order part of the index number, and the second byte holds the low-order 8 bits.
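A decoder for index fields could be sketched as below. The two-byte layout used here (high bit of the first byte set, its low 7 bits holding the high-order part, the second byte holding the low-order 8 bits) is taken from the published OMF specification; the function name `omf_read_index` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode the index field at p (avail readable bytes) into *index.
   Returns the field width in bytes (1 or 2), or 0 if it is truncated. */
size_t omf_read_index(const uint8_t *p, size_t avail, unsigned *index)
{
    assert(p != NULL && index != NULL);
    if (avail < 1)
        return 0;

    if ((p[0] & 0x80) == 0) {           /* bit 7 clear: 1-byte field, 0-7FH */
        *index = p[0];
        return 1;
    }

    if (avail < 2)
        return 0;
    /* bit 7 set: 2-byte field, high 7 bits first, then the low 8 bits */
    *index = ((unsigned)(p[0] & 0x7F) << 8) | p[1];
    return 2;
}
```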

Type Indexes :

Type Index fields occupy 1 or 2 bytes and occur in PUBDEF, LPUBDEF, COMDEF,
LCOMDEF, EXTDEF, and LEXTDEF records. They are encoded as described above for
indexed references, but the interpretation of the values stored is governed by whether
the module has the "new" or "old" object module format.

Ordered Collections :

Certain records and record groups are ordered so that the records may be referred
to with indexes (the format of indexes is described in the "Indexed References" section
of this document). The same format is used whether an index refers to names, logical
segments, or other items.
The overall ordering is obtained from the order of the records within the file
together with the ordering of repeated fields within these records. Such ordered
collections are referenced by index, counting from 1 (index 0 indicates unknown or not
specified).
For example, there may be many LNAMES records within a module, and each of those
records may contain many names. The names are indexed starting at 1 for the first
name in the first LNAMES record encountered while reading the file, 2 for the second
name in the first record, and so forth, with the highest index for the last name in the
last LNAMES record encountered.

The ordered collections are:

Names              Ordered by occurrence of LNAMES records and names
                   within each. Referenced as a name index.

Logical Segments   Ordered by occurrence of SEGDEF records in file.
                   Referenced as a segment index.

Groups             Ordered by occurrence of GRPDEF records in file.
                   Referenced as a group index.

External Symbols   Ordered by occurrence of EXTDEF, COMDEF, LEXTDEF,
                   and LCOMDEF records and symbols within each.
                   Referenced as an external name index (in FIXUP
                   subrecords).
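The way the Names collection is built up across several LNAMES records can be sketched as follows. The table type `name_table`, the limit `MAX_NAMES` and the function `lnames_add` are hypothetical, and the function expects only the data part of a record, with the type, length and checksum bytes already stripped.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_NAMES 64                    /* arbitrary limit for this sketch */

struct name_table {
    unsigned count;                     /* highest index in use; 0 = empty */
    char names[MAX_NAMES + 1][256];     /* slot 0 unused: indexes start at 1 */
};

/* Append every name found in the data of one LNAMES record to the table,
   continuing the numbering from earlier records. Returns the number of
   names added, or -1 on malformed data or a full table (names added
   before the error stay in the table). */
int lnames_add(struct name_table *t, const uint8_t *data, size_t len)
{
    assert(t != NULL && data != NULL);
    int added = 0;
    size_t pos = 0;

    while (pos < len) {
        size_t n = data[pos];           /* count byte of the next name */
        if (pos + 1 + n > len || t->count >= MAX_NAMES)
            return -1;
        t->count++;                     /* the first name gets index 1 */
        memcpy(t->names[t->count], data + pos + 1, n);
        t->names[t->count][n] = '\0';
        pos += 1 + n;
        added++;
    }
    return added;
}
```

Because the counter is kept in the table and not reset per record, the first name of the second LNAMES record simply receives the next free index.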

Numeric 2- and 4-Byte Fields :

Certain records, notably SEGDEF, PUBDEF, LPUBDEF, LINNUM, LEDATA,
LIDATA, FIXUPP, and MODEND, contain size, offset, and displacement
values that may be 32-bit quantities for Use32 segments. The encoding
is as follows:

- When the least-significant bit of the record type byte is set (that
  is, the record type is an odd number), the numeric fields are 4
  bytes.

- When the least-significant bit of the record type byte is clear,
  the fields occupy 2 bytes. The values are zero-extended when
  applied to Use32 segments.

NOTE: See the description of SEGDEF records in this document for an
explanation of Use16/Use32 segments.
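The rule for choosing the field width can be expressed in a few lines of C. This is a sketch assuming the usual Intel byte order (low byte first); the function name `omf_read_number` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read one size, offset or displacement field at p (avail readable bytes).
   The width depends on the record type: 4 bytes if the type is odd,
   2 bytes if it is even. A 2-byte value is zero-extended to 32 bits.
   Returns the field width, or 0 if the field is truncated. */
size_t omf_read_number(uint8_t record_type, const uint8_t *p, size_t avail,
                       uint32_t *value)
{
    assert(p != NULL && value != NULL);
    size_t width = (record_type & 1) ? 4 : 2;
    if (avail < width)
        return 0;

    uint32_t v = 0;
    for (size_t i = 0; i < width; i++)  /* low byte first (assumption) */
        v |= (uint32_t)p[i] << (8 * i);
    *value = v;
    return width;
}
```

For example, the same four bytes yield a 16-bit value when read from a record of type 98H and a 32-bit value when read from a record of type 99H.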
