Portable Document Format Reference Manual - Version 1.1
March 1, 1996
Tim Bienz, Richard Cohn, and James R. Meehan
Adobe Systems Incorporated
PDF Reference Manual April 16, 1996
Copyright © 1993, 1996 Adobe Systems Incorporated. All rights reserved. Patents Pending.
No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic,
mechanical, photocopying, recording, or otherwise, without the prior written consent of the publisher. Any software referred to herein
is furnished under license and may only be used or copied in accordance with the terms of such license. Printed in the United States
of America.
This publication and the information herein is furnished AS IS, is subject to change without notice, and should not be construed as a
commitment by Adobe Systems Incorporated. Adobe Systems Incorporated assumes no responsibility or liability for any errors or
inaccuracies, makes no warranty of any kind (express, implied or statutory) with respect to this publication, and expressly disclaims
any and all warranties of merchantability, fitness for particular purposes and noninfringement of third party rights.
PostScript is a registered trademark of Adobe Systems Incorporated. All instances of the name PostScript in the text are references to
the PostScript language as defined by Adobe Systems Incorporated unless otherwise stated. The name PostScript also is used as a
product trademark for Adobe Systems' implementation of the PostScript language interpreter.
Any references to a PostScript printer, a PostScript file, or a PostScript driver refer to printers, files, and driver programs
(respectively) which are written in or support the PostScript language. The sentences in this book that use "PostScript language" as
an adjective phrase are so constructed to reinforce that the name refers to the standard language definition as set forth by Adobe
Systems Incorporated.
Adobe, Acrobat, the Acrobat logo, Adobe Garamond, Adobe Illustrator, Carta, Distiller, FrameMaker, Minion, Photoshop, the
Photoshop logo, Poetica, PostScript, and the PostScript logo are registered trademarks of Adobe Systems Incorporated. TrueType and
QuickDraw are trademarks and Apple, Macintosh, and Mac are registered trademarks of Apple Computer, Inc. ITC Stone and ITC
Zapf Dingbats are registered trademarks of International Typeface Corporation. Helvetica and Times are registered trademarks of
LinotypeHell AG and/or its subsidiaries. Microsoft and Windows are registered trademarks of Microsoft Corporation. SelectSet is
a trademark of Agfa Division, Miles, Inc. Sun is a trademark of Sun Microsystems, Inc. SPARCstation is a registered trademark of
SPARC International, Inc., licensed exclusively to Sun Microsystems, Inc. and is based upon an architecture developed by Sun
Microsystems, Inc. NeXT is a trademark of NeXT Computer, Inc. UNIX is a registered trademark in the United States and other
countries, licensed exclusively through X/Open Company, Ltd. All other brand or product names are the trademarks or registered
trademarks of their respective holders.
Library of Congress Cataloging-in-Publication Data
Portable document format reference manual / Adobe Systems Incorporated.
p. cm.
Includes bibliographical references (p. 207) and index.
ISBN 0-201-62628-4
1. File organization (Computer science) 2. PostScript (Computer program language)
3. Text processing (Computer science) I. Adobe Systems.
QA76.9.F5P67 1993    93-8046
005.7'4-dc20    CIP
1 2 3 4 5 6 7 8 9-MA-9796959493
First Printing, June 1993
Contents
Figures
Tables
Examples
Chapter 1: Introduction
1.1 About this book
1.2 Introduction to the Second Edition: PDF 1.1
1.3 Conventions used in this book
1.4 A note on syntax
1.5 Copyrights and permissions to use PDF
Section I: Portable Document Format
Chapter 2: Overview
2.1 What is the Portable Document Format?
2.2 Using PDF
2.3 General properties
2.4 PDF and the PostScript language
2.5 Understanding PDF
Chapter 3: Coordinate Systems
3.1 Device space
3.2 User space
3.3 Text space
3.4 Character space
3.5 Image space
3.6 Form space
3.7 Relationships among coordinate systems
3.8 Transformations between coordinate systems
3.9 Transformation matrices
Chapter 4: Objects
4.1 Introduction
4.2 Booleans
4.3 Numbers
4.4 Strings
4.5 Names
4.6 Arrays
4.7 Dictionaries
4.8 Streams
4.9 The null object
4.10 Indirect objects
4.11 Object references
Chapter 5: File Structure
5.1 Introduction
5.2 Header
5.3 Body
5.4 Cross-reference table
5.5 Trailer
5.6 Incremental update
5.7 Encryption
Chapter 6: Document Structure
6.1 Introduction
6.2 Catalog
6.3 Pages tree
6.4 Page objects
6.5 Thumbnails
6.6 Annotations
6.7 Outline tree
6.8 Resources
6.9 Info dictionary
6.10 Articles
6.11 File ID
6.12 Encryption dictionary
Chapter 7: Page Descriptions
7.1 Overview
7.2 Graphics state
7.3 Graphics state operators
7.4 Color operators
7.5 Path operators
7.6 Text state
7.7 Text operators
7.8 XObject operator
7.9 In-line image operators
7.10 Type 3 font operators
7.11 In-line pass-through PostScript fragments
7.12 Compatibility operators
Section II: Optimizing PDF Files
Chapter 8: General Techniques for Optimizing PDF Files
8.1 Use short names
8.2 Use direct and indirect objects appropriately
8.3 Take advantage of combined operators
8.4 Remove unnecessary clipping paths
8.5 Omit unnecessary spaces
8.6 Omit default values
8.7 Take advantage of forms
8.8 Limit the precision of real numbers
8.9 Write parameters only when they change
8.10 Don't draw outside the crop box
8.11 Consider target device resolution
8.12 Share resources
8.13 Store common Page attributes in the Pages object
Chapter 9: Optimizing Text
9.1 Don't produce unnecessary text objects
9.2 Use automatic leading
9.3 Take advantage of text spacing operators
9.4 Don't replace spaces between words
9.5 Use the appropriate operator to draw text
9.6 Use the appropriate operator to position text
9.7 Remove text clipping
9.8 Consider target device resolution
Chapter 10: Optimizing Graphics
10.1 Use the appropriate color-setting operator
10.2 Defer path painting until necessary
10.3 Take advantage of the closepath operator
10.4 Don't close a path more than once
10.5 Don't draw zero-length lines
10.6 Make sure drawing is needed
10.7 Take advantage of rectangle and curve operators
10.8 Coalesce operations
Chapter 11: Optimizing Images
11.1 Preprocess images
11.2 Match image resolution to target device resolution
11.3 Use the minimum number of bits per color component
11.4 Take advantage of indexed color spaces
11.5 Use the DeviceGray color space for monochrome images
11.6 Use in-line images appropriately
11.7 Don't compress in-line images unnecessarily
11.8 Choose the appropriate filters
Chapter 12: Clipping and Blends
12.1 Clipping to a path
12.2 Clipping to text
12.3 Image masks
12.4 Blends
Appendix A: Example PDF Files
A.1 Minimal PDF file
A.2 Simple text string
A.3 Simple graphics
A.4 Pages tree
A.5 Outline
A.6 Updated file
Appendix B: Summary of Page Marking Operators
Appendix C: Predefined Font Encodings
C.1 Predefined encodings sorted by character name
C.2 Predefined encodings sorted by character code
C.3 MacExpert encoding
Appendix D: Implementation Limits
Appendix E: Obtaining XUIDs and Technical Notes
Appendix F: PDF Name Registry
Appendix G: Compatibility
G.1 Version numbers
G.2 Viewer compatibility behavior
Bibliography
Colophon
Figures
Figure 1.1 Creating PDF files using PDF Writer
Figure 1.2 Creating PDF files using the Distiller program
Figure 1.3 Viewing and printing a PDF document
Figure 1.4 PDF components
Figure 2.1 Device space
Figure 2.2 User space
Figure 2.3 Relationships among PDF coordinate systems
Figure 2.4 Effects of coordinate transformations
Figure 2.5 Effect of the order of transformations
Figure 4.1 Structure of a PDF file that has not been updated
Figure 4.2 Structure of a PDF file after changes have been appended several times
Figure 5.1 Structure of a PDF document
Figure 5.2 Page object's media box and crop box
Figure 5.3 Annotation types
Figure 5.4 Characteristics represented in the flags field of a font descriptor
Figure 5.5 Color spaces
Figure 6.1 Flatness
Figure 6.2 Line cap styles
Figure 6.3 Line dash pattern
Figure 6.4 Line join styles
Figure 6.5 Miter length
Figure 6.6 Bézier curve
Figure 6.7 v operator
Figure 6.8 y operator
Figure 6.9 Non-zero winding number rule
Figure 6.10 Even-odd rule
Figure 6.11 Character spacing
Figure 6.12 Horizontal scaling
Figure 6.13 Leading
Figure 6.14 Text rendering modes
Figure 6.15 Text rise
Figure 6.16 Effect of word spacing
Figure 6.17 Operation of TJ operator
Figure 2.1 Restoring clipping path after clipping to text
Figure 4.1 Effect of JPEG encoding on a screenshot
Figure 4.2 Effect of JPEG encoding on a continuous-tone image
Figure 5.1 Clipping to a path
Figure 5.2 Using text as a clipping path
Figure 5.3 Images and image masks
Figure 5.4 Using an image to produce a linear blend
Figure 5.5 Using an image to produce a square blend
Figure A.1 Pages tree for 62-page document example
Figure A.2 Example of outline with six items, all open
Figure A.3 Example of outline with six items, five of which are open
Tables
Table 3.1 Escape sequences in strings
Table 3.2 Stream attributes
Table 3.3 Standard filters
Table 3.4 Optional parameters for LZW filter
Table 3.5 Optional parameters for CCITTFaxDecode filter
Table 4.1 Trailer attributes
Table 5.1 Catalog attributes
Table 5.2 Pages attributes
Table 5.3 Page attributes
Table 5.4 Transition attributes
Table 5.5 Transition Effects
Table 5.6 Effect parameters
Table 5.7 Annotation attributes (common to all annotations)
Table 5.8 Text annotation attributes (in addition to those in Table 5.7)
Table 5.9 Link annotation attributes (in addition to those in Table 5.7)
Table 5.10 Destination specification
Table 5.11 GoTo action attributes
Table 5.12 GoToR action attributes
Table 5.13 Launch action attributes
Table 5.14 Windows-specific launch attributes
Table 5.15 Thread action attributes
Table 5.16 URI action attributes
Table 5.17 URI attributes
Table 5.18 Examples of file specifications
Table 5.19 File specification attributes
Table 5.20 Movie annotation attributes (in addition to those in Table 5.7)
Table 5.21 Movie dictionary attributes
Table 5.22 Activation attributes
Table 5.23 Outlines attributes
Table 5.24 Outline entry attributes
Table 5.25 Predefined procsets
Table 5.26 Attributes common to all types of fonts
Table 5.27 Type 1 font additional attributes
Table 5.28 Base 14 fonts
Table 5.29 Multiple master Type 1 font additional attributes
Table 5.30 Type 3 font additional attributes
Table 5.31 TrueType font attributes
Table 5.32 Font encoding attributes
Table 5.33 Font descriptor attributes
Table 5.34 Additional attributes for FontFile stream
Table 5.35 Font flags
Table 5.36 CalGray attributes
Table 5.37 CalRGB attributes
Table 5.38 Lab attributes
Table 5.39 Image resource attributes
Table 5.40 Default Decode arrays for various color spaces
Table 5.41 Color rendering intents
Table 5.42 Form resource attributes
Table 5.43 PDF Info dictionary attributes
Table 5.44 Thread attributes
Table 5.45 Bead attributes
Table 5.46 Encrypt dictionary attributes
Table 5.47 Standard security handler attributes
Table 6.1 General graphics state parameters
Table 6.2 Text-specific graphics state parameters
Table 6.3 Abbreviations for in-line image names
Table 1.1 Optimized operator combinations
Table 2.1 Comparison of text string operators
Table 2.2 Comparison of text positioning operators
Table 4.1 Comparison of compression filters for images
Table A.1 Objects in empty example
Table A.2 Objects in Hello World example
Table A.3 Objects in graphics example
Table A.4 Object use after adding four text annotations
Table A.5 Object use after deleting two text annotations
Table A.6 Object use after adding three text annotations
Table B.1 PDF page marking operators
Table D.1 Architectural limits
Table G.1 Acrobat 1.0 Viewer behavior with unknown filters
Table G.2 Acrobat 2.0 Viewer behavior with unknown filters
Examples
Example 3.1 Dictionary
Example 3.2 Dictionary within a dictionary
Example 3.3 Stream that has been LZW and ASCII85 encoded
Example 3.4 Unencoded stream
Example 3.5 Indirect reference
Example 4.1 Cross-reference section with a single subsection
Example 4.2 Cross-reference section with multiple subsections
Example 4.3 Trailer
Example 5.1 Catalog
Example 5.2 Pages tree for a document containing three pages
Example 5.3 Inheritance of attributes
Example 5.4 Page with thumbnail, annotations, and Resources dictionary
Example 5.5 A page with information for presentation mode
Example 5.6 Thumbnail
Example 5.7 Text annotation
Example 5.8 Link annotation
Example 5.9 GoTo action
Example 5.10 Outlines object with six open entries
Example 5.11 Outline entry
Example 5.12 Resources dictionary
Example 5.13 Type 1 font resource and character widths array
Example 5.14 Multiple master font resource and character widths array
Example 5.15 Type 3 font resource
Example 5.16 TrueType font resource
Example 5.17 Font encoding
Example 5.18 Embedded Type 1 font definition
Example 5.19 Font descriptor
Example 5.20 Color space resource for an indexed color space
Example 5.21 Image resource with length specified as an indirect object
Example 5.22 Form resource
Example 5.23 Info dictionary
Example 5.24 Thread
Example 6.1 In-line image
Example 2.1 Changing the text matrix inside a text object
Example 2.2 Multiple lines of text without automatic leading
Example 2.3 Multiple lines of text using automatic leading
Example 2.4 TJ operator without automatic leading
Example 2.5 Use of the T* operator
Example 2.6 Using the TL operator to set leading
Example 2.7 Using the TD operator to set leading
Example 2.8 Character and word spacing using the Tc and Tw operators
Example 2.9 Character and word spacing using the " operator
Example 2.10 Restoring clipping path after using text as clipping path
Example 3.1 Each path segment as a separate path
Example 3.2 Grouping path segments into a single path
Example 3.3 Using redundant l and h operators to close a path inefficiently
Example 3.4 Using the l operator to close a path inefficiently
Example 3.5 Taking advantage of the h operator to close a path
Example 3.6 Improperly closing a path: multiple path closing operators
Example 3.7 Properly closing a path: single path closing operator
Example 3.8 Portion of a path before coalescing operations
Example 3.9 Portion of a path after coalescing operations
Example 5.1 Clipping to a path
Example 5.2 Using text as a clipping path
Example 5.3 Images and image masks
Example 5.4 Using images as blends
Example 5.5 Image used to produce a grayscale square blend
Example A.1 Minimal PDF file
Example A.2 PDF file for simple text example
Example A.3 PDF file for simple graphics example
Example A.4 Pages tree for a document containing 62 pages
Example A.5 Six entry outline, all items open
Example A.6 Six entry outline, five entries open
Example A.7 Update section of PDF file when four text annotations are added
Example A.8 Update section of PDF file when one text annotation is modified
Example A.9 Update section of PDF file when two text annotations are deleted
Example A.10 Update section of PDF file after three text annotations are added
CHAPTER 1
Introduction
This book describes the Portable Document Format (PDF), the native file
format of the Adobe Acrobat family of products. The goal of these
products is to enable users to easily and reliably exchange and view
electronic documents independent of the environment in which they were
created. PDF relies on the imaging model of the PostScript language to
describe text and graphics in a device- and resolution-independent manner.
To improve performance for interactive viewing, PDF defines a more
structured format than that used by most PostScript language programs.
PDF also includes objects, such as annotations and hypertext links, that are
not part of the page itself but are useful for interactive viewing.
PDF files are built from a sequence of numbered objects similar to those
used in the PostScript language. The text, graphics, and images that make
up the contents of a page are represented using operators based on those in
the PostScript language, and closely follow the Adobe Illustrator 3.0 page
description operators.
A PDF file is not a PostScript language program and cannot be directly
interpreted by a PostScript interpreter. However, the page descriptions in a
PDF file can be converted into a PostScript language program.
1.1 About this book
This book provides a description of the PDF file format, as well as
suggestions for producing efficient PDF files. It is intended primarily for
application developers who wish to produce PDF files directly. This book
also contains enough information to allow developers to write applications
that read and modify PDF files. While PDF is independent of any particular
application, occasionally PDF features are best explained by the actions a
particular application takes when it encounters that feature in a file.
Similarly, Appendix D discusses some implementation limits in the Acrobat
viewer applications, even though these limits are not part of the file format
itself.
This book consists of two sections. The first section describes the file format
and the second lists techniques for producing efficient PDF files. In
addition, appendices provide example files, detailed descriptions of several
predefined font encodings, and a summary of PDF page marking operators.
Readers are assumed to have some knowledge of the PostScript language,
as described in the PostScript Language Reference Manual, Second Edition
[1]. In addition, some understanding of fonts, as described in the Adobe
Type 1 Font Format [4], is useful.
The first section of this book, Portable Document Format, includes Chapters
2 through 7 and describes the PDF file format.
Chapter 2 describes the motivation for creating the PDF file format and
provides an overview of its architecture. PDF is compared to the PostScript
language.
Chapter 3 discusses the coordinate systems and transformations used in
PDF files. Because the coordinate systems used in PDF are very much like
those used in the PostScript language, users with substantial background in
the PostScript language may wish to read this chapter only as a review.
Chapter 4 describes the types of objects used to construct documents in
PDF files. These types are similar to those used in the PostScript language.
Readers familiar with the types of objects present in the PostScript language
may wish to read this chapter quickly as a reminder.
Chapter 5 provides a description of the format of PDF files, how they are
organized on disk, and the mechanism by which updates can be appended to
a PDF file.
Chapter 6 describes the way that a document is represented in a PDF file,
using the object types presented in Chapter 4.
Chapter 7 discusses the page marking operators used in PDF files. These are
the operators that actually make marks on a page. Many are similar to one
or more PostScript language operators. Readers with PostScript language
experience will quickly see the similarities.
The second section of this book, Optimizing PDF Files, includes Chapters 8
through 12 and describes techniques for producing efficient PDF files.
Many of the techniques presented can also be used in the PostScript
language. The techniques are broken down into four areas: text, graphics,
images, and general techniques.
Chapter 8 discusses general optimizations that may be used in a wide
variety of situations in PDF files.
Chapter 9 discusses optimizations for text.
Chapter 10 discusses graphics optimizations.
Chapter 11 discusses optimizations that may be used on sampled images.
Finally, Chapter 12 contains techniques for using clipping paths to restrict
the region in which drawing occurs and a technique using images to make
efficient blends.
1.2 Introduction to Version 1.1: PDF 1.1
This document is a revision of the 1993 edition of Portable Document
Format Reference Manual. It describes version 1.1 of the Portable
Document Format.
The PDF specification is independent of any particular implementation of a
PDF generator or consumer. To provide guidance to implementors,
however, Implementation Notes that accompany the specification and
Appendix G describe the behavior of Acrobat viewers (versions 1.0, 2.0,
and 2.1) when they encounter the changes documented herein.
Implementation note: PDF 1.1 is the native file format of the Adobe Acrobat 2.0
family of products.
The PDF 1.1 specification, like the PDF 1.0 specification, defines a
minimum interchange level of functionality. The Portable Document Format
is an extensible format, which means that PDF files may contain objects not
defined by this specification. Consumers, applications that read PDF files
and interpret their contents, are expected to implement correctly the
semantics of objects that are specified by PDF 1.1 and, as gracefully as
possible, to ignore any objects that they do not understand. Appendix G
provides guidance on how a consumer should handle objects it does not
understand.
Implementation note: Some Acrobat 2.0 and subsequent products provide an
interface that supports plug-ins. These plug-ins can use and/or put private
data objects within a PDF file. Appendix G indicates the kinds of private data
that can be used and Appendix F defines a registry for this data. The registry
can be used to avoid conflicts in identifying data from independent plug-ins.
New features introduced in PDF 1.1 include the following:
• The ability to protect a document with a password and to restrict
  operations on a document.
• The ability to tie blocks of text together into articles, making reading
  easier.
• The generalization of link and bookmark destinations to actions, which
  include links to other PDF files and foreign files.
• The ability to define new annotation types and to provide additional
  attributes for existing types.
• The ability to specify default settings and actions when a document is
  opened.
• Device-independent color.
• An ID included in files to make it easier to verify that a file is the correct
  file, even under circumstances where the file's name is incorrect (such as
  files on some networks).
• A binary option that allows files to be smaller.
• A new date format that allows programmatic comparison of dates.
• The ability to provide additional document information.
Note: In PDF 1.1, dictionary key names are often one or two letters in order to
conserve space in files. When these keys are described below, they are
followed in parentheses by a more descriptive string. However, only the
actual one- or two-letter name may be used in a PDF file.
Note: PDF is an evolving language, and new editions of this manual will be
offered on an ongoing basis to document the changes.
1.3 Conventions used in this book
Text styles are used to identify various operators, keywords, terms, and
objects. Four formatting styles are used in this book:
• PostScript language operators, PDF operators, PDF keywords, the names
  of keys in dictionaries, and other predefined names are written in
  boldface. Examples are moveto, Tf, stream, Type, and MacRomanEncoding.
• Operands of PDF operators are written in an italic sans serif font. An
  example is linewidth.
• Object types are written with initial capital letters. An example is
  FontDescriptor.
• The first occurrence of terms and the boolean values true and false are
  written in italics. This style is also used for emphasis.
Tables containing dictionary keys are normally organized with the Type and
Subtype keys first, followed by any other keys that are required in the
dictionary, followed by any optional keys.
All changes from the first edition of this manual are marked with change
bars in the margin. Most of the changes are related to the differences
between PDF 1.0 and PDF 1.1. Other changes are corrections to errors in
the first edition.
1.4 A note on syntax
Throughout this book, Backus-Naur form (BNF) notation is used to
describe syntax:
<xyz> ::= abc <def> ghi |
          <k> j
A token enclosed in angle brackets names a class of document component,
while plain text appears verbatim or with some obvious substitution. The
grammar rules have two parts. The name of a class of component is on the
left of the definition symbol (::=). In the example above, the class is xyz. On
the right of the definition symbol is a set of one or more alternative forms
that the class component might take in the document. A vertical bar (|)
separates alternative forms.
The right side of the definition may be on one or more lines. With only a
few exceptions, these lines do not correspond to lines in the file.
The notation {...} means that the items enclosed in braces are optional. If an
asterisk follows the braces, the objects inside the braces may be repeated
zero or more times. The notation <...>+ means that the items enclosed
within the brackets must be repeated one or more times.
When an operator appears in a BNF specification, it is shorthand for the
operator plus its operands. For example, when the operator m appears in a
BNF specification, it means x y m, where x and y are numbers.
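The operator shorthand can be illustrated with a minimal Python sketch, which is not part of the specification. It assumes a whitespace-separated token list with numeric operands only, and the function name group_operators is hypothetical. Each operator is paired with the operands written before it, so that m stands for x y m:

```python
def group_operators(tokens, operators):
    """Pair each operator with the numeric operands that precede it."""
    grouped = []
    operands = []
    for token in tokens:
        if token in operators:
            # The operator closes the current run of operands.
            grouped.append((token, operands))
            operands = []
        else:
            # Illustrative assumption: every non-operator token is a number.
            operands.append(float(token))
    if operands:
        raise ValueError("operands without an operator: %r" % operands)
    return grouped


# "x y m" followed by "x y l": two operators, each with two operands.
segments = group_operators("100 200 m 300 400 l".split(), {"m", "l"})
print(segments)
```

The sketch handles only bare numbers. Real page descriptions also carry strings, names, and arrays as operands.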
Note that PDF is case-sensitive. Uppercase and lowercase letters are
distinct.
1.5 Copyrights and permissions to use PDF
The general idea of utilizing an interchange format for final-form
documents is in the public domain. Anyone is free to devise his or her own
set of unique commands and data structures that define an interchange
format for final-form documents. Adobe owns the copyright in the data
structures, operators, and the written specification for the particular
interchange format called the Portable Document Format. These elements
may not be copied without Adobe's permission.
Adobe will enforce its copyright. Adobe's intention is to maintain the
integrity of the Portable Document Format as a standard. This enables the
public to distinguish between the Portable Document Format and other
interchange formats for final-form documents.
However, Adobe desires to promote the use of the Portable Document
Format for information interchange among diverse products and
applications. Accordingly, Adobe gives permission to anyone to:
• Prepare files in which the file content conforms to the Portable
  Document Format.
• Write drivers and applications that produce output represented in the
  Portable Document Format.
• Write software that accepts input in the form of the Portable Document
  Format and displays the results, prints the results, or otherwise interprets
  a file represented in the Portable Document Format.
• Copy Adobe's copyrighted list of operators and data structures to the
  extent necessary to use the Portable Document Format for the above
  purposes.
The only condition on such permission is that anyone who uses the
copyrighted list of operators and data structures in this way must include an
appropriate copyright notice.
This limited right to use the copyrighted list of operators and data structures
does not include the right to copy the Portable Document Format Reference
Manual, other copyrighted material from Adobe, or the software in any of
Adobe's products which use the Portable Document Format, in whole or in
part.
Section I
Portable Document Format
CHAPTER 2
Overview
Before examining the detailed structure of a PDF file, it is important to
understand what PDF is and how it relates to the PostScript language. This
chapter discusses PDF and its relationship to the PostScript language.
Chapter 3 discusses the coordinate systems used to describe various
components of a PDF file. Chapters 4 and 5 discuss the basic types of
objects supported by PDF and the structure of a PDF file. Chapters 6 and 7
describe the structure of a PDF document and the operators used to draw
text, graphics, and images.
2.1 What is the Portable Document Format?
PDF is a file format used to represent a document in a manner independent
of the application software, hardware, and operating system used to create
it. A PDF file contains a PDF document and other supporting data.
A PDF document contains one or more pages. Each page in the document
may contain any combination of text, graphics, and images in a device- and
resolution-independent format. This is the page description. A PDF
document may also contain information possible only in an electronic
representation, such as hypertext links.
In addition to a document, a PDF file contains the version of the PDF
specification used in the file and information about the location of important
structures in the file.
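As a rough illustration of the version information carried by a file, the following Python sketch reads a version number from the start of a byte string. It assumes the file begins with a header of the form %PDF-1.1, a detail covered in Chapter 5 rather than in this section. The function name pdf_version and the sample bytes are hypothetical:

```python
import re


def pdf_version(data: bytes):
    """Return (major, minor) from a leading header such as b'%PDF-1.1', else None."""
    # Assumption: the header is the first thing in the file.
    match = re.match(rb"%PDF-(\d+)\.(\d+)", data)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


# Hypothetical file fragment, for illustration only.
sample = b"%PDF-1.1\n1 0 obj\n<< /Type /Catalog >>\nendobj\n"
print(pdf_version(sample))
```

A consumer might use such a check to decide whether it is likely to understand every object in the file. The sketch does not validate anything beyond the header.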
2.2 Using PDF
To understand PDF, it is important to understand how PDF documents will
be produced and used. As PDF documents and applications that read PDF
files become more prevalent, new ways of creating and using PDF files will
be invented. This is one of the goals of this book: to make the file format
accessible so that application developers can expand on the ideas behind
PDF and the applications that initially support it.
Currently, PDF files may be produced either directly from applications or
from files containing PostScript page descriptions.
Many applications can produce PDF files directly. The PDF Writer,
available on both Apple Macintosh [...] Windows [...]
[...] Math fall into this category. It is not possible to simulate a symbolic font
effectively.
For symbolic fonts, a font descriptor (including metrics and style
information) is not sufficient; the actual character shapes (or glyphs) are
required to accurately display and print the document. For all symbolic
fonts other than Symbol and ITC Zapf Dingbats, a compressed version of
the Type 1 font program for the font is included in the PDF file. Symbol and
ITC Zapf Dingbats, the most widely used symbolic fonts, ship with Acrobat
Exchange and Acrobat Reader and do not need to be included in a PDF file.
2.3.5 Single-pass file generation
Because of system limitations and efficiency considerations, it may be
desirable or necessary for an implementation of a program that produces
PDF, such as the PDF Writer, to create a PDF file in a single pass. This may
be, for example, because the application has access to limited memory or is
unable to open temporary files. For this reason, PDF supports single-pass
generation of files. While PDF requires certain objects to contain a number
specifying their length in bytes, a mechanism is provided allowing the
length to be located in the file after the object. In addition, information such
as the number of pages in the document can be written into the file after all
pages have been written into the file.
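The deferred-length mechanism can be sketched in Python as follows. This is a minimal illustration, not the PDF Writer's actual implementation. It assumes that a stream's Length may be given as an indirect reference to an integer object written later in the file; the object numbers 5 and 6 and the function name are arbitrary choices made for the example.

```python
import io

def write_stream_single_pass(out, obj_num, len_obj_num, chunks):
    """Write a stream object whose /Length is an indirect reference.

    The length object (len_obj_num) is written AFTER the stream, once the
    byte count is known, so nothing has to be buffered or rewritten.
    Returns a dict mapping object number -> byte offset in `out`.
    """
    offsets = {obj_num: out.tell()}
    out.write(b"%d 0 obj\n<< /Length %d 0 R >>\nstream\n" % (obj_num, len_obj_num))
    length = 0
    for chunk in chunks:              # data arrives piece by piece
        out.write(chunk)
        length += len(chunk)
    out.write(b"\nendstream\nendobj\n")
    # Only now is the length known; emit it as its own object.
    offsets[len_obj_num] = out.tell()
    out.write(b"%d 0 obj\n%d\nendobj\n" % (len_obj_num, length))
    return offsets

buf = io.BytesIO()
offsets = write_stream_single_pass(buf, 5, 6, [b"0 0 m ", b"100 100 l S"])
```

The returned offsets are the kind of information a producer would later record in the cross-reference table.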
2.3.6 Random access
Tools that extract and display a selected page from a PostScript language
program must scan the program from its beginning until the desired page is
found. On average, the time needed to view a page depends not only on the
complexity of the page but also on the total number of pages in the
document. This is problematic for interactive document viewing, where it is
important that the time needed to view a page be independent of the total
number of pages in the document.
Every PDF file contains a cross-reference table that can be used to locate
and directly access pages and other important objects in the file. The
location of the cross-reference table is stored at the end of the file, allowing
applications that produce PDF files in a single pass to store it easily and
allowing applications that read PDF files to locate it easily. Using the
cross-reference table, the time needed to view a page in a PDF file can be nearly
independent of the total number of pages in the document.
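As a rough sketch of how a reader might locate the cross-reference table, the Python function below scans only the tail of the file. It assumes the file ends with the keyword startxref, a decimal byte offset, and an %%EOF marker; the function name, the 1024-byte window, and the sample bytes are illustrative choices, not part of the specification.

```python
def find_xref_offset(data: bytes, tail: int = 1024) -> int:
    """Return the byte offset recorded after the last 'startxref' keyword."""
    window = data[-tail:]                 # only the end of the file is examined
    pos = window.rfind(b"startxref")
    if pos < 0:
        raise ValueError("no startxref keyword near end of file")
    # The first token after the keyword is the offset, written in decimal.
    tokens = window[pos + len(b"startxref"):].split()
    return int(tokens[0])

# A toy file: the cross-reference table begins at byte 23.
sample = (b"%PDF-1.1\n...objects...\n"
          b"xref\n0 1\n0000000000 65535 f \n"
          b"trailer\n<< /Size 1 >>\nstartxref\n23\n%%EOF\n")
offset = find_xref_offset(sample)
```

Because the search never touches the body of the file, its cost does not grow with the number of pages.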
2.3.7 Incremental update
Applications may allow users to modify PDF documents, which can contain
hundreds of pages or more. Users should not have to wait for the entire file
to be rewritten each time modifications to the document are saved. PDF
allows modifications to be appended to a file, leaving the original data
intact. The addendum appended when a file is incrementally updated
contains only the objects that were modified or added, and includes an
update to the cross-reference table. Support for incremental update allows
an application to save modifications to a PDF document in an amount of
time proportional to the size of the modification instead of the size of the
file. In addition, because the original contents of the file are still present in
the file, it is possible to undo saved changes by deleting one or more
addenda.
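Undoing a saved change by deleting an addendum might look like the Python sketch below. It rests on two assumptions that should be checked against the file structure chapter: that the original file and every addendum each end with their own %%EOF marker, and that the marker does not occur inside stream data. The function name is hypothetical.

```python
def drop_last_addendum(data: bytes) -> bytes:
    """Return the file contents with the most recent addendum removed."""
    marker = b"%%EOF"
    last = data.rfind(marker)
    if last < 0:
        raise ValueError("no %%EOF marker found")
    prev = data.rfind(marker, 0, last)
    if prev < 0:
        return data                        # nothing has been appended yet
    end = prev + len(marker)
    # Keep the end-of-line that follows the earlier marker, if any.
    if data[end:end + 2] == b"\r\n":
        end += 2
    elif data[end:end + 1] in (b"\n", b"\r"):
        end += 1
    return data[:end]

original = b"%PDF-1.1\n...\n%%EOF\n"
updated = original + b"7 0 obj\n...\nendobj\nxref\n...\n%%EOF\n"
restored = drop_last_addendum(updated)
```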
2.3.8 Extensibility
PDF is designed to be extensible. Undoubtedly, developers will want to add
features to PDF that have not yet been implemented or thought of. For
example, only simple text annotations are allowed; graphics cannot be
included.
The design of PDF is such that not only can new features be added, but
applications that understand earlier versions of the format will not
completely break when they encounter features that they do not implement.
Appendix G, "Compatibility," specifies how a viewer should behave when it
reads a file that does not conform to the specification it was expecting.
2.4 PDF and the PostScript language
The preceding sections mentioned several ways in which PDF differs from
the PostScript language. This section summarizes these differences and
describes the process of converting a PDF file into a PostScript language
program.
While PDF and the PostScript language share the same basic imaging
model, there are some important differences between them:
• A PDF file may contain objects such as hypertext links that are useful
only for interactive viewing.
• To simplify the processing of page descriptions, PDF provides no
programming language constructs.
• PDF enforces a strictly defined file structure that allows an application to
access parts of a document randomly.
• PDF files contain information such as font metrics, to ensure viewing
fidelity.
Because of these differences, a PDF file cannot be downloaded directly to a
PostScript printer for printing. An application that prints a PDF file to a
PostScript printer must carry out the following steps:
1. Insert procsets, sets of PostScript language procedure definitions
that implement the PDF page description operators.
2. Extract the content for each page. Pages are not necessarily stored in
sequential order in the PDF file. Each page description is essentially
the script portion of a traditional PostScript language program using
very specific procedures, such as "m" for "moveto" and "l" for
"lineto."
3. Decode compressed text, graphics, and image data. This is not
required for PostScript Level 2 printers, which can accept
compressed data in a PostScript language file.
4. Insert any resources, such as fonts, into the PostScript language file.
Substitute fonts are defined and inserted as needed, based on the font
metrics in the PDF file.
5. Put the information in the correct order. The result is a traditional
PostScript language program that fully represents the visual aspects
of the document, but no longer contains PDF elements such as
hypertext links, annotations, and bookmarks.
6. Send the PostScript language program to the printer.
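Step 1 can be illustrated with a small Python sketch that emits PostScript definitions for the two operators named above. The mapping covers only "m" and "l"; the helper name and the exact form of the definitions are assumptions made for illustration and do not reproduce the procsets that Acrobat actually emits.

```python
# Hypothetical two-entry subset of a procset: PDF operator -> PostScript operator.
OPERATOR_MAP = {"m": "moveto", "l": "lineto"}

def build_procset(mapping):
    """Return PostScript source that defines each PDF operator name as a
    short procedure calling the corresponding PostScript operator."""
    lines = ["/%s { %s } bind def" % (pdf_op, ps_op)
             for pdf_op, ps_op in sorted(mapping.items())]
    return "\n".join(lines) + "\n"

prolog = build_procset(OPERATOR_MAP)
# With the prolog in place, a page description fragment such as
# "10 10 m 100 100 l" can be executed as an ordinary PostScript script.
program = prolog + "10 10 m 100 100 l\n"
```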
2.5 Understanding PDF
PDF is best understood by thinking of it in four parts, as shown in Figure
2.4.
Figure 2.4 PDF components
The first component is the set of basic object types used by PDF to represent
objects. These types, with only a few exceptions, correspond to the data
types used in the PostScript language. Chapter 4 discusses these object
types.
The second component is the PDF file structure. The file structure
determines how objects are stored in a PDF file, how they are accessed, and
how they are updated. This structure is independent of the semantics of the
objects. Chapter 5 explains the file structure.
The third component is the PDF document structure. The document
structure specifies how the basic object types are used to represent
components of a PDF document: pages, annotations, hypertext links, fonts,
and more. Chapter 6 explains the PDF document structure.
The fourth and final component is the PDF page description. A PDF page
description, while part of a PDF page object, can be explained
independently of the other components. A PDF page description has only
limited interaction with other parts of a PDF document. This simplifies its
conversion into a PostScript language program. Chapter 7 discusses PDF
page descriptions.
[Figure 2.4 labels: Objects, File structure, Document structure, Page description]
CHAPTER 3
Coordinate Systems
Coordinate systems define the canvas on which all drawing in a PDF
document occurs; that is, the position, orientation, and size of the text,
graphics, and images that appear on a page are determined by coordinate
systems.
PDF supports a number of coordinate systems, most of them identical to
those used in the PostScript language. This chapter describes each of the
coordinate systems used in PDF, how they are related, and how
transformations among coordinate systems are specified. At the end of the
chapter is a description of the mathematics involved in coordinate
transformations. It is not necessary to read this section to use coordinate
systems and transformations. It is presented for those readers who wish to
gain a deeper understanding of the mechanics of coordinate
transformations.
3.1 Device space
The contents of a page ultimately appear on a display or a printer. Each type
of device on which a PDF page can be drawn has its own built-in coordinate
system, and, in general, each type of device has a different coordinate
system. Coordinates specified in a device's native coordinate system are
said to be in device space. On pixel-based devices such as computer screens
and laser printers, coordinates in device space generally specify a particular
pixel.
If coordinates in PDF files were specified in device space, the files would be
device-dependent and would accordingly appear differently on different
devices. For example, images drawn in the typical device space of a 72 pixel
per inch display and on a 600 dpi printer differ in size by more than a factor
of 8; an eight-inch line segment on a display would appear as a one-inch
segment on the printer. Different devices also have different orientations of
their coordinate systems. On one device, the origin of the coordinate system
may be at the upper left corner of the page, with the positive direction of the
y-axis pointing downward. On another device, the origin may be in the
lower left corner of the page with the positive direction of the y-axis
pointing upward. Figure 3.1 shows an object that is two units high in device
space, and illustrates the fact that coordinates specified in device space are
device-dependent.
Figure 3.1 Device space
3.2 User space
PDF, like the PostScript language, defines a coordinate system that appears
the same, regardless of the device on which output occurs. This allows PDF
documents to be independent of the resolution of the output device. This
resolution-independent coordinate system is called user space and provides
the overall coordinate system for a page.
The transformation from user space to device space is specified by the
current transformation matrix (CTM). Figure 3.2 shows an object that is
two units high in user space and indicates that the CTM provides the
resolution-independence of the user space coordinate system.
[Figure 3.1 panels: device space for a 72-dpi screen; device space for a 300-dpi printer]
Figure 3.2 User space
The user space coordinate system is initialized to a default state for each
page of a document. By default, user space coordinates have 72 units per
inch, corresponding roughly to the various definitions of the typographic
unit of measurement known as the point. The positive direction of the y-axis
points upward, and the positive direction of the x-axis to the right. The
region of the default coordinate system that is viewed or printed can be
different for each page, and is described in Section 6.4, "Page objects."
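The following Python sketch shows how a CTM can make the same user space object come out the same physical size on devices of different resolution. The device model is hypothetical: it assumes a raster device whose origin is at the upper left corner with y increasing downward, and it assumes that a six-element matrix [a b c d e f] maps (x, y) to (ax + cy + e, bx + dy + f). The function names and the 792-unit page height are illustrative.

```python
def default_ctm(dpi, page_height_units, y_down=True):
    """Build a hypothetical CTM [a b c d e f] that maps default user space
    (72 units per inch, y pointing up) onto a device with `dpi` pixels per inch."""
    s = dpi / 72.0
    if y_down:
        # Flip the y-axis and move the origin to the top of the page.
        return [s, 0.0, 0.0, -s, 0.0, page_height_units * s]
    return [s, 0.0, 0.0, s, 0.0, 0.0]

def to_device(ctm, x, y):
    """Map a user space point to device space."""
    a, b, c, d, e, f = ctm
    return (a * x + c * y + e, b * x + d * y + f)

def height_in_inches(ctm, dpi, user_height=2.0):
    """Physical height of an object `user_height` units tall in user space."""
    y0 = to_device(ctm, 0.0, 0.0)[1]
    y1 = to_device(ctm, 0.0, user_height)[1]
    return abs(y1 - y0) / dpi

screen = default_ctm(72, 792)
printer = default_ctm(300, 792)
```

An object two units high covers a different number of pixels on each device, but the same fraction of an inch.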
3.3 Text space
The coordinates of text are specified in text space. The transformation from
text space to user space is provided by a matrix called the text matrix. This
matrix is often set so that text space and user space are the same.
[Figure 3.2 labels: user space; CTM; device space for a 72-dpi screen; device space for a 300-dpi printer]
3.4 Character space
Characters in a font are defined in character space. The transformation
from character space to text space is defined by a matrix. For most types of
fonts, this matrix is predefined except for an overall scale factor. (For
details, see Section 6.8.2, "Font resources.") This scale factor changes when
a user selects the font size for text.
3.5 Image space
All images are defined in image space. The transformation from image
space to user space is predefined and cannot be changed. All images are one
unit by one unit in user space, regardless of the number of samples in the
image.
3.6 Form space
PDF provides an object known as a Form, discussed in Section 6.8.6,
"XObject resources." Forms contain sequences of operations and are the
same as forms in the PostScript language. The space in which a form is
defined is form space. The transformation from form space to user space is
specified by a matrix contained in the form.
3.7 Relationships among coordinate systems
PDF defines a number of interrelated coordinate systems, described in the
previous sections. Figure 3.3 shows the relationships among the coordinate
systems. Each line in the figure represents a transformation from one
coordinate system to another. PDF allows modifications to many of these
transformations.
Figure 3.3 Relationships among PDF coordinate systems
[Figure 3.3 labels: character space, text space, user space, device space, form space, image space]
Because PDF coordinate systems are defined relative to each other, changes
made to one transformation can affect the appearance of objects drawn in
several coordinate systems. For example, changes made to the CTM affect
the appearance of all objects, not just graphics drawn directly in user space.
3.8 Transformations between coordinate systems
Transformation matrices specify the relationship between two coordinate
systems. By modifying a transformation matrix, objects can be scaled,
rotated, translated, or transformed in other ways.
A transformation matrix in PDF, as in the PostScript language, is specified
by an array containing six elements. This section lists the arrays used for the
most common transformations. The following section contains more
mathematical details of transformations, including information on
specifying transformations that are combinations of those listed in this
section.
• Translations are specified as [1 0 0 1 t_x t_y], where t_x and t_y are the
distances to translate the origin of the coordinate system in x and y,
respectively.
• Scaling is obtained by [s_x 0 0 s_y 0 0]. This scales the coordinates so that
one unit in the x and y directions of the new coordinate system is the
same size as s_x and s_y units in the previous coordinate system,
respectively.
• Rotations are carried out by [cos θ sin θ -sin θ cos θ 0 0], which has the
effect of rotating the coordinate system axes by θ degrees
counterclockwise.
• Skew is specified by [1 tan α tan β 1 0 0], which skews the x-axis by an
angle α and the y-axis by an angle β. α and β are measured in degrees.
Figure 3.4 shows examples of each transformation. The directions of
translation, rotation, and skew shown in the figure correspond to positive
values of the array elements.
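The four arrays listed in this section can be built and tried out with a short Python sketch. The helper names are illustrative, and the `apply` function assumes the usual convention that a six-element array [a b c d e f] maps a point (x, y) to (ax + cy + e, bx + dy + f); the mathematical details belong to the following section.

```python
import math

def translate(tx, ty):
    return [1, 0, 0, 1, tx, ty]

def scale(sx, sy):
    return [sx, 0, 0, sy, 0, 0]

def rotate(theta_deg):
    # Angles are given in degrees, counterclockwise.
    t = math.radians(theta_deg)
    return [math.cos(t), math.sin(t), -math.sin(t), math.cos(t), 0, 0]

def skew(alpha_deg, beta_deg):
    # alpha skews the x-axis, beta skews the y-axis.
    return [1, math.tan(math.radians(alpha_deg)),
            math.tan(math.radians(beta_deg)), 1, 0, 0]

def apply(m, x, y):
    """Transform the point (x, y) by the six-element array m."""
    a, b, c, d, e, f = m
    return (a * x + c * y + e, b * x + d * y + f)

moved = apply(translate(10, 20), 1, 1)   # origin shifted by (10, 20)
stretched = apply(scale(3, 1), 2, 5)     # x coordinates tripled
turned = apply(rotate(90), 1, 0)         # approximately (0, 1)
```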
Figure 3.4 Effects of coordinate transformations
[Figure 3.4 panels: Translation, Scaling, Rotation, Skewing]
If several transformations are applied, the order in which they are applied
generally is important. For example, scaling the x-axis followed by a
translation of the x-axis is not the same as first translating the x-axis, then
performing the scaling. In general, to obtain the expected results,
transformations should be done in the order: translate, rotate, scale.
Figure 3.5 shows that the order in which transformations are applied is
important. The figure shows two sequences of transformations applied to a
coordinate system. After each successive transformation, an outline of the
letter "n" is drawn. The transformations in the figure are a translation of 10
units in the x-direction and 20 units in the y-direction, a rotation of 30
degrees, and a scaling by a factor of 3 in the x-direction. In the figure, the
axes are drawn with a dash-pattern having two units dash, two units gap. In
addition, the untransformed coordinate system is drawn in light gray in each
section. Notice that the scale-rotate-translate ordering results in a distortion
of the coordinate system leaving the x- and y-axes no longer perpendicular,
while the recommended translate-rotate-scale ordering does not.
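The effect described for the two orderings can be checked numerically. The Python sketch below uses the same values as the figure (translation by 10 and 20 units, rotation of 30 degrees, scaling by 3 in x). It assumes that each new transformation is concatenated in front of the existing matrix, so that it acts on coordinates before the transformations already in place; the function names are illustrative.

```python
import math

def concat(m, ctm):
    """Return the product m x ctm of two six-element matrices [a b c d e f]."""
    a1, b1, c1, d1, e1, f1 = m
    a2, b2, c2, d2, e2, f2 = ctm
    return [a1 * a2 + b1 * c2, a1 * b2 + b1 * d2,
            c1 * a2 + d1 * c2, c1 * b2 + d1 * d2,
            e1 * a2 + f1 * c2 + e2, e1 * b2 + f1 * d2 + f2]

def apply_in_order(steps):
    """Apply coordinate system transformations one after another."""
    ctm = [1, 0, 0, 1, 0, 0]
    for m in steps:
        ctm = concat(m, ctm)
    return ctm

def axes_dot(ctm):
    """Dot product of the transformed x- and y-axis directions;
    zero means the axes are still perpendicular."""
    a, b, c, d, _, _ = ctm
    return a * c + b * d

t = math.radians(30)
T = [1, 0, 0, 1, 10, 20]
R = [math.cos(t), math.sin(t), -math.sin(t), math.cos(t), 0, 0]
S = [3, 0, 0, 1, 0, 0]

recommended = apply_in_order([T, R, S])   # translate, rotate, scale
distorted = apply_in_order([S, R, T])     # scale, rotate, translate
```

Under these assumptions `axes_dot(recommended)` is zero to within rounding, while `axes_dot(distorted)` is clearly nonzero, which matches the distortion described above.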
the Catalog object as the value of the trailer's Root key. In addition, the
trailer specifies the location of the document's Info dictionary, a structure
that contains general information about the document, as the value of the
trailer's Info key.
Note In many of the tables in this chapter, certain key-value pairs contain the
notation "must be an indirect reference" or "indirect reference preferred."
Unless one of these is specified in the description of the key-value pair,
objects that are the value of a key can either be specified directly or using
an indirect reference, as described in Section 4.11, "Object references."
6.2 Catalog
The Catalog is a dictionary that is the root node of the document. It contains
a reference to the tree of pages contained in the document, a reference to the
tree of objects representing the document's outline, a reference to the
document's article threads, and the list of named destinations. In addition,
the Catalog indicates whether the document's outline or thumbnail page
images should be displayed automatically when the document is viewed and
whether some location other than the first page should be shown when the
document is opened. Example 6.1 shows a sample Catalog object.
Example 6.1 Catalog
1 0 obj
<<
/Type /Catalog
/Pages 2 0 R
/Outlines 3 0 R
/PageMode /UseOutlines
>>
endobj
Table 6.1 shows the attributes for a Catalog.
Table 6.1 Catalog attributes
Key Type Semantics
Type name (Required) Object type. Always Catalog.
Pages dictionary (Required, must be an indirect reference) Pages object that is the root of the
document's Pages tree.
Outlines dictionary (Required if the document has an outline; must be an indirect reference)
The Outlines object that is the root of the document's outline tree, described
in Section 6.7, "Outline tree."
PageMode name (Optional) How the document should appear when opened. Allowed values:
UseNone Open document with neither outline nor thumbnails visible
UseOutlines Open document with outline visible
UseThumbs Open document with thumbnails visible
FullScreen Open document in full-screen mode; in full-screen mode,
there is no menu bar, window controls, nor any other
window present.
The default value of PageMode is UseNone.
OpenAction array or dictionary (Optional) Any legal action, as described in Section 6.6.3, "Destinations." If
the value of this key is an array, it must be a destination. If it is a dictionary,
it must be an action. If no action is specified, the top of the first page will
appear at default zoom.
Threads array (Required if the document has any threads; must be an indirect reference)
An array of threads as described in Section 6.10, "Articles."
Dests dictionary (Required if the document has named destinations; must be an indirect
reference) A dictionary of names and corresponding destinations; see
Section 6.6.4, "Named destinations."
URI dictionary (Optional) Contains document-level information for Uniform Resource
Identifier annotations; see page 87.
Implementation note Acrobat 1.0 viewers ignore OpenAction, Threads and Dests. They also
ignore FullScreen as the value of PageMode.
6.3 Pages tree
The pages of a document are accessible through a tree of nodes known as
the Pages tree. This tree defines the ordering of the pages in the document.
To optimize the performance of viewer applications, the Acrobat Distiller
program and Acrobat PDF Writer construct balanced trees with each node
in the tree containing up to six children. (For further information on
balanced trees, see reference [6] in the Bibliography on page 279.) The tree
structure allows applications to quickly open a document containing
thousands of pages using only limited memory. Applications should accept
any sort of tree structure as long as the nodes of the tree contain the keys
described in Table 6.2. The simplest structure consists of a single Pages
node that references all the page objects directly.
Note The structure of the Pages tree for a document is unrelated to the content of
the document. In a PDF file for a book, for example, there's no guarantee
that a chapter will be represented by a single node in the Pages tree.
Applications that consume or produce PDF files are not required to
preserve the existing structure of the Pages tree.
The root and all interior nodes of the Pages tree are dictionaries, whose
minimum contents are shown in Table 6.2.
Table 6.2 Pages attributes
Key Type Semantics
Type name (Required) Object type. Always Pages.
Kids array (Required) List of indirect references to the immediate children of this
Pages node.
Count integer (Required) Specifies the number of leaf nodes (imageable pages) under this
node. The leaf nodes do not have to be immediately below this node in the
tree, but can be several levels deeper in the tree.
Parent dictionary (Required; must be indirect reference) Pages object that is the immediate
ancestor of this Pages object. The root Pages object has no Parent.
Example 6.2 illustrates the Pages object for a document with three pages,
while Appendix A contains an example showing the Pages tree for a
document containing 62 pages.
Example 6.2 Pages tree for a document containing three pages
2 0 obj
<<
/Type /Pages
/Kids [ 4 0 R 10 0 R 24 0 R ]
/Count 3
>>
endobj
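The role of the Count key can be sketched in Python with nodes modeled as plain dictionaries. This is a simplification: in a real file the entries of Kids are indirect references that must first be resolved through the cross-reference table. The "Label" key and both function names are hypothetical and exist only to make the example checkable.

```python
def fill_counts(node):
    """Set Count on every Pages node to the number of leaf Page objects below it."""
    if node["Type"] == "Page":
        return 1
    node["Count"] = sum(fill_counts(kid) for kid in node["Kids"])
    return node["Count"]

def find_page(node, index):
    """Return the Page at zero-based `index`, using Count to skip whole subtrees."""
    for kid in node["Kids"]:
        size = 1 if kid["Type"] == "Page" else kid["Count"]
        if index < size:
            return kid if kid["Type"] == "Page" else find_page(kid, index)
        index -= size
    raise IndexError("page index out of range")

# A small unbalanced tree with seven pages.
pages = [{"Type": "Page", "Label": n} for n in range(1, 8)]
inner = {"Type": "Pages", "Kids": pages[2:4]}
left = {"Type": "Pages", "Kids": [pages[0], pages[1], inner]}
right = {"Type": "Pages", "Kids": pages[4:7]}
root = {"Type": "Pages", "Kids": [left, right]}
total = fill_counts(root)
fifth = find_page(root, 4)
```

Because `find_page` descends into only one child per level, the work to reach a page depends on the depth of the tree rather than on the total number of pages.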
Inheritance of attributes
A Pages object may contain additional keys that provide values for Page
objects that are its descendants. Such values are said to be inherited. For
example, a document may specify a MediaBox for all pages by defining
one in the root Pages object. An individual page in the document could
override the MediaBox in this example by specifying a MediaBox in the
Page object for that page.
Attributes that may be inherited are indicated in Table 6.3. If a required key
that may be inherited is omitted from a Page object, then a value must be
supplied in one of its ancestors. If an optional key that may be inherited is
omitted, then a value may be supplied in one of its ancestors; barring that,
the default value will be used.
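A lookup that honors these inheritance rules might be sketched in Python as follows. Nodes are modeled as dictionaries whose "Parent" entry points directly at the parent dictionary, which stands in for resolving an indirect reference; the function name `inherited` is hypothetical.

```python
def inherited(node, key, default=None):
    """Return the value of `key` from the node itself or from its nearest
    ancestor that defines it; fall back to `default` if none does."""
    while node is not None:
        if key in node:
            return node[key]
        node = node.get("Parent")   # the root Pages object has no Parent
    return default

root = {"Type": "Pages", "MediaBox": [0, 0, 612, 792]}
branch = {"Type": "Pages", "Parent": root, "Rotate": 90}
plain_page = {"Type": "Page", "Parent": branch}
override_page = {"Type": "Page", "Parent": branch, "Rotate": 270}
```

Here `plain_page` picks up Rotate from `branch` and MediaBox from `root`, while `override_page` supplies its own Rotate.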
Example 6.3 demonstrates inheritance by showing a tree of Pages objects
and Page objects. Pages 1, 2, and 4 are rotated 90°. Page 3 is rotated 270°.
Pages 5 and 7 are not rotated (rotated 0°). Page 6 is rotated 180°.
6.4 Page objects
A Page object is a dictionary whose keys describe a single page containing
text, graphics, and images. A Page object is a leaf of the Pages tree, and has
the attributes shown in Table 6.3.
Table 6.3 Page attributes
Key Type Semantics
Type name (Required) Object type. Always Page.
MediaBox array (Required; may be inherited) Rectangle specifying the natural size of the
page, for example the dimensions of an A4 sheet of paper. The rectangle is
an array [ ll_x ll_y ur_x ur_y ], specifying the lower left x, lower left y, upper right
x, and upper right y coordinates of the page, in that order. The coordinates
are measured in default user space units.
Parent dictionary (Required; must be indirect reference) Pages object that is the immediate
ancestor of this page.
[Example 6.3 diagram: a tree of Pages and Page nodes for pages 1 through 7, with /Rotate values of 90, 180, 270, and 0 attached to individual nodes]
Example 6.3 Inheritance of attributes
Resources dictionary (Required; may be inherited) Resources required by this page, described in
Section 6.8, "Resources." If the page requires no resources, this value
should be an empty dictionary, written as << >>. Omitting this value, or
specifying a null value, indicates that the value is to be inherited from an
ancestor Pages object.
Contents stream or array (Optional; must be indirect reference) The page description (contents) for
this page, described in Chapter 7. If Contents is an array of streams, they
are concatenated to produce the page description. This allows a program
that is creating a PDF file to create image objects and other resources as
they occur, even though they interrupt the page description. If Contents is
absent, the page is empty.
CropBox array (Optional; may be inherited) Rectangle specifying the region of the page
displayed and printed. The rectangle is specified in the same way as
MediaBox.
Rotate integer (Optional; may be inherited) Specifies the number of degrees the page
should be rotated clockwise when it is displayed. This value must be zero
(the default) or a multiple of 90.
Thumb stream (Optional; must be indirect reference) Object that contains a thumbnail
sketch of the page, described in Section 6.5, "Thumbnails."
Annots array (Optional) An array of objects, each representing an annotation on the page,
described in Section 6.6, "Annotations." Omit the Annots key if the page
has no annotations.
B (Beads) array (Recommended if the page contains article beads) An array whose elements
are indirect references to each article bead on the page, in drawing order
(the same order as the Annots array). Articles are described in Section 6.10
on page 131.
Implementation note The Acrobat 2.0 viewers will rebuild the Beads array for all pages of a
document containing beads if the first page with a bead does not have a
Beads array.
Dur (Duration) real (Optional; may be inherited) Specifies the advance timing (display
duration) of a page. By default, the page will not advance automatically. See
Section 6.4.1, "Presentation mode."
Hid (Hidden) boolean (Optional; may be inherited) If true, the page should be hidden (not
displayed) during a presentation. The default is false. See Section 6.4.1,
"Presentation mode."
Trans (Transition) dictionary (Optional; may be inherited) A Transition dictionary, containing
information about transitions between pages. See Section 6.4.1,
"Presentation mode."
Note that some Page attributes may be inherited; see the note, "Inheritance
of attributes," on page 66.
Note The intersection between the page's media box and the crop box is the
region of the default user space coordinate system that is viewed or printed.
Typically, the crop box is located entirely inside the media box, so that the
intersection is the same as the crop box itself.
Figure 6.2 on page 69 shows the distinction between the media box and the
crop box. In the figure, the crop box has been sized so that the crop marks
do not appear when the page is viewed or printed.
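The visible region described in the note can be computed with a few lines of Python. Rectangles are lists in the [ll_x, ll_y, ur_x, ur_y] order used by MediaBox. The function name is hypothetical, and treating a missing CropBox as "show the whole MediaBox" is an assumption made for this sketch.

```python
def visible_region(media_box, crop_box=None):
    """Return the intersection of the media box and the crop box,
    or None if the two rectangles do not overlap."""
    if crop_box is None:
        # Assumption: with no CropBox, the entire MediaBox is shown.
        return list(media_box)
    llx = max(media_box[0], crop_box[0])
    lly = max(media_box[1], crop_box[1])
    urx = min(media_box[2], crop_box[2])
    ury = min(media_box[3], crop_box[3])
    if llx >= urx or lly >= ury:
        return None
    return [llx, lly, urx, ury]

letter = [0, 0, 612, 792]
inside = visible_region(letter, [36, 36, 576, 756])      # crop box within the page
clipped = visible_region(letter, [-10, 700, 300, 900])   # crop box partly off the page
```

In the typical case the crop box lies entirely inside the media box, so the result equals the crop box itself.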
[Figure 6.2 labels: Crop box (region displayed and printed); Media box (size of page)]
Figure 6.2 Page object's media box and crop box
Example 6.4 on page 70 shows a Page object with a thumbnail and two
annotations. In addition, the Resources dictionary is specified as a direct
object, and shows that the page makes use of three fonts, with the names F3,
F5, and F7.
6.4.1 Presentation mode
A Page dictionary may contain three keys, Dur, Hid, and Trans, that
contain information that is intended to be used when displaying a PDF
document as a presentation or slide show and are otherwise ignored. A
PDF viewer is not required to provide a presentation mode. If such a mode
is provided by the viewer or a plug-in, however, then these keys define its
behavior.
Implementation note The Acrobat 2.0 viewers do not currently provide a presentation mode. They
may do so in the future.
Duration
The Dur key in a Page dictionary specifies the advance timing of the page.
The advance timing is intended to be used only when a presentation is being
played in a non-interactive mode. It describes the maximum amount of time
the page will be displayed before the viewer will automatically turn to the
next page; the user can advance the page manually before the time is up. If
no Dur key is specified for a Page object or any of its Pages ancestors, the
page will not advance automatically.
Example 6.4 Page with thumbnail, annotations, and Resources dictionary
3 0 obj
<<
/Type /Page
/Parent 4 0 R
/MediaBox [ 0 0 612 792 ]
/Resources << /Font << /F3 7 0 R /F5 9 0 R /F7 11 0 R >>
/ProcSet [ /PDF ] >>
/Thumb 12 0 R
/Contents 14 0 R
/Annots [ 23 0 R 24 0 R ]
>>
endobj
The advance timing is defined as the amount of time between the end of the
last transition and the beginning of the next one, as shown in the time-line
below:
Hidden
The Hid (Hidden) key in a Page dictionary specifies that the page is not to
be displayed during the presentation. If the user attempts to turn to a hidden
page from the previous or following page during a presentation, the page
will be skipped and the next visible page will be displayed. If the page is the
destination of a link or thread, the Hidden attribute will be ignored and the
page will be displayed.
The Hidden attribute of a page will hide the page only during a presentation;
other aspects of the user interface ignore the Hidden attribute.
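The page-turning rule for hidden pages can be sketched in Python. The list of flags stands for the effective Hid value of each page in order. The function name is hypothetical, and staying on the current page when no visible page remains in the chosen direction is an assumption, since the text does not say what a viewer should do in that case.

```python
def next_visible(hidden_flags, current, step=1):
    """Return the index of the next page that is not hidden, moving forward
    (step=1) or backward (step=-1) from `current`."""
    i = current + step
    while 0 <= i < len(hidden_flags):
        if not hidden_flags[i]:
            return i
        i += step
    return current   # assumption: stay put if every remaining page is hidden

flags = [False, True, True, False, False]   # pages 2 and 3 are hidden
forward = next_visible(flags, 0)            # skips the two hidden pages
backward = next_visible(flags, 3, step=-1)
```

Following a link or thread to a hidden page bypasses this rule, because the Hidden attribute is ignored for such destinations.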
Transition
The Trans key in a Page dictionary specifies a Transition dictionary, which
describes the effect to use when going to that page, and the amount of time
the transition should take. For example, a transition effect in the Transition
dictionary of page two will execute whenever the user goes to page two,
regardless of the previous page. Table 6.4 defines keys for all Transition
dictionaries; they may contain additional keys that control specific
transition effects.
Table 6.4 Transition attributes
Key Type Semantics
Type name (Optional) Object type. Always Trans.
S (Subtype) name (Optional) Describes the transition effect. If this key is omitted, there will
be no transition effect to that page (the page will be displayed normally),
and the D key in the Transition dictionary is ignored. Transition effects are
described in the following section.
[Time-line figure: transition from page 1 to page 2 (transition duration); page 2 is displayed (advance timing); transition from page 2 to page 3 (transition duration)]
D (Duration) real (Optional) The duration (in seconds) of the transition effect. The default
duration is 1 second.
Transition effects
All implementations of presentation mode will support the transition effects
shown in Table 6.5. Some of these effects include optional parameters that
control the appearance of the effect. The parameters are described in Table
6.6.
Table 6.5 Transition Effects
Effect Parameters Description
Split Dm, M Two lines sweep across the screen revealing the new page image. The lines
can be either horizontal or vertical, as determined by the Dm key, and can
move from the center out or from the edges in as determined by the M key.
Blinds Dm Multiple lines, evenly distributed across the screen, appear and
synchronously sweep in the same direction to reveal the new page. The lines
are either horizontal or vertical, as determined by the Dm key. Horizontal
lines move down; vertical lines move to the right.
Box M A box sweeps from the center out or from the edges inward, as determined
by the M key, revealing the new page image.
Wipe Di A single line sweeps across the screen from one edge to the other, revealing
the new page image. Possible values for Di include 0, 90, 180, and 270.
Dissolve (none) The old page image dissolves in a piecemeal fashion to reveal the new
page.
Glitter Di Similar to Dissolve, except the effect sweeps across the image in a wide
band moving from one side of the screen to the other. Supported directions
are 0, 270, and 315.
Table 6.6 Effect parameters
Key Type Semantics
Di (Direction) real The direction of movement, specified in degrees,
increasing in a counterclockwise direction. A value of 0
points to the right, indicating that the effect proceeds
from left to right. A value of 90 points upward, indicating
that the effect moves from bottom to top.
Note This is different from the page rotation, where the degrees increase in a
clockwise direction.
Dm (Dimension) name For those effects which can be performed either horizontally or vertically,
the Dm key specifies which dimension to use. Possible values are H
(horizontal) or V (vertical).
M (Motion) name For those effects which can be performed either from the center out or the
edges in, the M key specifies which direction to use. Possible values are I
(In) or O (Out).
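The parameter rules in Tables 6.5 and 6.6 can be sketched as a small checker. This is only an illustration, not part of the format: the function name check_transition and the modeling of a Transition dictionary as a Python dict with string keys are assumptions, and treating the listed Di values as the only valid ones, and any key outside Tables 6.4 and 6.6 as a problem, is a choice made for this sketch.

```python
# Parameters accepted by each effect (Table 6.5). Box uses M, as its description states.
EFFECT_PARAMS = {
    "Split": {"Dm", "M"},
    "Blinds": {"Dm"},
    "Box": {"M"},
    "Wipe": {"Di"},
    "Dissolve": set(),
    "Glitter": {"Di"},
}
# Allowed name values for Dm and M (Table 6.6).
NAME_VALUES = {"Dm": {"H", "V"}, "M": {"I", "O"}}
# Directions listed for each effect that takes Di (Table 6.5).
DIRECTIONS = {"Wipe": {0, 90, 180, 270}, "Glitter": {0, 270, 315}}


def check_transition(trans):
    """Return a list of problems found in a Transition dictionary (hypothetical helper)."""
    problems = []
    effect = trans.get("S")
    if effect is None:
        # No S key: no transition effect, and the D key is ignored.
        return problems
    if effect not in EFFECT_PARAMS:
        return ["unknown effect " + effect]
    for key, value in trans.items():
        if key in ("Type", "S", "D"):
            continue
        if key not in EFFECT_PARAMS[effect]:
            problems.append(key + " is not a parameter of " + effect)
        elif key == "Di" and value not in DIRECTIONS[effect]:
            problems.append("unsupported direction " + str(value))
        elif key in NAME_VALUES and value not in NAME_VALUES[key]:
            problems.append("bad value for " + key + ": " + str(value))
    return problems
```

A Split transition with /D 3.0, /M /O, and /Dm /V passes this check with an empty problem list.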
Example 6.5 shows a page that, in presentation mode, would be displayed
for 5 seconds before advancing to the following page. Before the page is
displayed, there is a 3-second transition in which two vertical lines sweep
across the screen, from the center outwards.
Example 6.5 A page with information for presentation mode
<</Type /Page
/Parent 4 0 R
/Contents 16 0 R
/Dur 5
/Trans<< /S /Split
/D 3.0
/M /O
/Dm /V >>
>>
6.5 Thumbnails
A PDF document may include thumbnail sketches of its pages. They are not
required, and even if some pages have them, others may not.
The thumbnail image for a page is the value of the Thumb key of the page
object. The structure of a thumbnail is very similar to that of an Image
resource (see Section 6.8.6, XObject resources). The only difference
between a thumbnail and an Image resource is that a thumbnail does not
include Type, Subtype, and Name keys.
Note Different pages in a document may have thumbnails with different numbers
of bits per color component.
Example 6.6 Thumbnail
12 0 obj
<<
/Filter [ /ASCII85Decode /DCTDecode ]
/Width 76
/Height 99
/BitsPerComponent 8
/ColorSpace /DeviceRGB
/Length 13 0 R
>>
stream
s4IA>!"M;*Ddm8XA,lT0!!3,S!/(=R!<E3%!<N<(!WrK*!WrN,!
... image data omitted...
$B@Eme1Y7Z;J4$cc=Lj/]5#e^_1plJ-N)DE>A<*F2m0Y-
endstream
endobj
13 0 obj
4298
endobj
6.6 Annotations
Annotations are notes or other objects that are associated with a page but are
separate from the page description itself. PDF 1.1 supports three kinds of
annotations: text notes, hypertext links, and movies. (See Figure 6.3.) In the
future, PDF may support additional types.
If a page includes annotations, they are stored in an array as the value of the
Annots key of the Page object. Each annotation is a dictionary. As shown
in Table 6.7, all annotations must provide a core set of keys, including
Type, Subtype, and Rect. Certain other keys, indicating an annotation's
color, title, modification date, border, and other information, are also
defined for all annotations but are optional.
Note All coordinates and measurements in text annotations, link annotations, and
outline entries are specified in default user space units. Where a rectangle is
specified as an array of integers, it is in the form:
[ ll_x ll_y ur_x ur_y ]
specifying the lower left x, lower left y, upper right x, and upper right y
coordinates of the rectangle.
[Figure 6.3 Annotation types: an Annotation is a Text, Link, or Movie annotation; a Link leads to either a Destination (XYZ, Fit, FitH, FitV, FitR, FitB, FitBH, FitBV) or an Action (GoTo, GoToR, Launch, Thread, URI).]
Table 6.7 Annotation attributes (common to all annotations)
Key Type Semantics
Type name (Required in PDF 1.0, optional otherwise) Object type. Always Annot.
Subtype name (Required) Annotation subtype.
Rect array of integers (Required) Rectangle specifying the location of the annotation.
Border array (Optional) In PDF 1.0, this is an array of three numbers, specifying the
horizontal corner radius, the vertical corner radius, and the width of the
border of the annotation. The default values are 0, 0, and 1, respectively. No
border is drawn if the width is 0.
Implementation note Acrobat viewers ignore the first two numbers.
In PDF 1.1, the array may have a fourth element, a dash array that allows
specification of solid and dashed borders. The dash array contains on and
off stroke-lengths for drawing dashes, in the same format as the setdash
marking operator, d (see page 147). An example of a border with a dash
array is [ 0 0 1 [ 3 ] ].
Implementation note Acrobat 2.0 viewers support a maximum of 10 entries in the dash array.
C (Color) array (Optional) The annotation color. For links, this is the border color. For text
annotations, it is the background color of a closed annotation's icon, the title
bar color of an active open annotation's window, and the window frame
color of an inactive open annotation. A color is specified as an array of three
numbers in the range 0 to 1, representing a color in DeviceRGB space.
T (Title) string (Optional) An arbitrary text label associated with the annotation. It is
displayed in an active open text annotation's title bar and can be edited from
the annotation's properties dialog. The characters in this string are encoded
using the predefined encoding PDFDocEncoding, described in Appendix
C.
M (ModDate) string (Optional) The last time an annotation was modified. A text annotation's
modification date is updated each time the text is changed. The preferred
string value is the date format described in Section 4.4, Strings, but
viewers should accept and display any string.
Implementation note The Acrobat 2.0 viewers update the ModDate string only for text
annotations.
F (Flags) integer (Optional) This value is interpreted as a collection of flags that define
various characteristics of the annotation. The least significant bit is the
invisible flag, which specifies how an annotation is displayed when the
appropriate annotation handler is not available. If this flag's value is 1 and
the viewer does not provide a handler for the annotation's subtype, the
annotation will not be displayed. If this flag's value is 0 and the viewer does
not provide a handler for the annotation's subtype, the annotation will
appear as an unknown annotation. (See the implementation note following
this table.) All other bits are reserved and must be set to 0. The default value
for this key is 0.
Implementation note If an Acrobat 2.0 viewer encounters an annotation of a type it does not
understand, the viewer will display it as an unknown annotation unless the
annotation's F (Flags) key specifies that the invisible flag is set. The C, T,
M, and F keys are ignored by Acrobat 1.0 viewers.
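The invisible-flag rule for the F (Flags) key can be sketched as follows. The function name show_as_unknown, the has_handler argument, and the modeling of an annotation as a Python dict are assumptions made for illustration only.

```python
INVISIBLE_FLAG = 0x1  # least significant bit of the F (Flags) value


def show_as_unknown(annotation, has_handler):
    """Decide whether a viewer should draw an annotation as an unknown annotation.

    annotation is a dict standing in for the annotation dictionary;
    has_handler says whether the viewer has a handler for its subtype.
    """
    if has_handler:
        return False  # the handler displays the annotation itself
    flags = annotation.get("F", 0)  # the default value for F is 0
    return (flags & INVISIBLE_FLAG) == 0
```

When the function returns False for an annotation without a handler, the annotation is simply not displayed.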
6.6.1 Text annotations
A text annotation contains a string of text. When the annotation is open, the
text is displayed. A PDF viewer application chooses the size and typeface of
the text. Table 6.8 shows the contents of the text annotation dictionary.
Example 6.7 shows a text annotation.
Table 6.8 Text annotation attributes (in addition to those in Table 6.7)
Key Type Semantics
Subtype name (Required) Annotation subtype. Always Text.
Contents string (Required) The text to be displayed. Text can be separated into paragraphs
using carriage returns. The characters in this string are encoded using the
predefined encoding PDFDocEncoding, described in Appendix C.
Open boolean (Optional) If true, specifies that the annotation should initially be displayed
opened. The default is false (closed).
Example 6.7 Text annotation
22 0 obj
<<
/Type /Annot
/Subtype /Text
/Rect [ 266 116 430 204 ]
/Contents (text for two)
>>
endobj
6.6.2 Link annotations
A link annotation, when activated, displays a destination or performs an
action. A destination is a view of another location, possibly on a different
page, with a different zoom factor, or in a different file. Table 6.9 shows the
contents of the link annotation dictionary.
Table 6.9 Link annotation attributes (in addition to those in Table 6.7)
Key Type Semantics
Subtype name (Required) Annotation subtype. Always Link.
Dest array or name (Required unless the A key is present) The view to go to, represented either
as a direct destination (an array, described in Section 6.6.3,
Destinations), or a named destination (a name, described in Section
6.6.4 on page 80).
A (Action) dictionary (Required unless the Dest key is present) The action to be performed on
activating this link annotation; see Section 6.6.5, Actions.
Example 6.8 Link annotation
93 0 obj
<<
/Type /Annot
/Subtype /Link
/Rect [ 71 717 190 734 ]
/Border [ 16 16 1 ]
/Dest [ 3 0 R /FitR 4 399 199 533 ]
>>
endobj
Implementation note Acrobat 1.0 viewers do not report an error when a user activates a link or
outline entry that has an unknown destination type or is missing a
destination. Links and outline entries with an A key will appear to have no
destination. The Acrobat 2.0 viewers will report an error when the
destination or action type is unknown.
6.6.3 Destinations
A Link annotation or Outline entry may specify a destination, which
consists of a page, the location of the display window on the destination
page, and the zoom factor to use when displaying the destination page. The
destination is represented as an array containing an indirect reference to the
Page object which is the destination page, along with other information
needed to specify the location and zoom.
Table 6.10 shows the allowed forms of the destination. In the table, top, left,
right, and bottom are numbers specified in the default user space coordinate
system. page is an indirect reference to the destination Page object, except
in the case of the GoToR action, where it is a page number. The page's
bounding box is the smallest rectangle enclosing all objects on the page. No
side of the bounding box is permitted to be outside the page's crop box. If it
is, that side of the bounding box is defined by the corresponding side of the
crop box.
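The rule that confines the bounding box to the crop box can be sketched as a per-side clamp. The function name clamp_bbox and the representation of rectangles as (ll_x, ll_y, ur_x, ur_y) tuples are assumptions for illustration.

```python
def clamp_bbox(bbox, crop):
    """Replace any side of bbox that lies outside crop with the crop box side.

    Both rectangles are (ll_x, ll_y, ur_x, ur_y) tuples in default user space.
    """
    ll_x, ll_y, ur_x, ur_y = bbox
    c_ll_x, c_ll_y, c_ur_x, c_ur_y = crop
    return (
        max(ll_x, c_ll_x),  # left side may not fall left of the crop box
        max(ll_y, c_ll_y),  # bottom side may not fall below the crop box
        min(ur_x, c_ur_x),  # right side may not extend past the crop box
        min(ur_y, c_ur_y),  # top side may not extend above the crop box
    )
```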
Table 6.10 Destination specification
Value of Dest key Semantics
[ page /XYZ left top zoom ]
If left, top, or zoom is null, the current value of that parameter is retained.
For example, specifying a destination as [4 0 R /XYZ null null null] will go to
the page object with an object ID of 4 0, retaining the same top, left, and
zoom as the current page. A zoom of 0 has the same meaning as a zoom of
null.
[ page /Fit ] Fit the page to the window.
[ page /FitH top ] Fit the width of the page to the window. top specifies the y-coordinate of the
top edge of the window.
[ page /FitV left ] Fit the height of the page to the window. left specifies the x-coordinate of
the left edge of the window.
[ page /FitR left bottom right top ]
Fit the rectangle specified by left bottom right top in the window. If the
height (top - bottom) and width (right - left) imply different zoom factors,
the numerically smaller zoom factor is used, to ensure that the specified
rectangle fits in the window.
[ page /FitB ] Fit the page's bounding box to the window.
[ page /FitBH top ] Fit the width of the page's bounding box to the window. top specifies the
y-coordinate of the top edge of the window.
[ page /FitBV left ] Fit the height of the page's bounding box to the window. left specifies the
x-coordinate of the left edge of the window.
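The choice of zoom factor for a FitR destination can be sketched as below. The function name fit_r_zoom and the window_width and window_height parameters are hypothetical; the sketch assumes the window size is expressed in units in which a zoom factor of 1 shows one user space unit per window unit.

```python
def fit_r_zoom(left, bottom, right, top, window_width, window_height):
    """Return the zoom factor that fits the rectangle inside the window."""
    zoom_from_width = window_width / (right - left)
    zoom_from_height = window_height / (top - bottom)
    # The numerically smaller factor guarantees the whole rectangle fits.
    return min(zoom_from_width, zoom_from_height)
```

For the destination [ 3 0 R /FitR 4 399 199 533 ] the rectangle is 195 units wide and 134 units high, so a 390 by 402 window gives candidate factors 2.0 and 3.0, and 2.0 is used.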
6.6.4 Named destinations
A destination may also be represented by a name. A name allows a
destination to be specified indirectly, even if the destination is in another
file. For example, one file may contain a link to the first page of Chapter 6 in
another file. If the link uses a name (e.g., /Chap6.begin) rather than a
specific location (e.g., page 42), then the page on which Chapter 6 starts can
change without invalidating the link.
The mapping from names to destinations is defined in the file's Catalog
object, in a dictionary stored as the value of the Dests key. Each key in this
dictionary is a name, and the corresponding value is either a destination, as
defined in Section 6.6.3 on page 79, or a dictionary. If it is a dictionary, it
must have a D key whose value is a destination. (The dictionary enables
named destinations to have additional attributes.)
If an action that contains a destination name does not also contain a file
specification, then the name refers to a destination in the current file and
should be found in the current file's Dests dictionary. If an action does
contain a file specification, then the name refers to a destination in that file.
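Looking up a named destination in the Dests dictionary can be sketched as follows. The function name resolve_named_destination is hypothetical, and the sketch assumes the Dests dictionary has already been parsed into a Python dict, with destination arrays as lists and dictionaries as dicts.

```python
def resolve_named_destination(dests, name):
    """Return the destination array for name, or None if the name is not defined.

    dests stands in for the value of the Catalog's Dests key.
    """
    entry = dests.get(name)
    if entry is None:
        return None
    if isinstance(entry, dict):
        # A dictionary entry carries the destination under its D key.
        return entry["D"]
    return entry  # the entry is the destination array itself
```

Both forms of entry resolve to the same kind of value, so callers do not need to know which form the file used.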
6.6.5 Actions
In PDF 1.1, in addition to specifying a destination, it is possible to specify
an action to be performed when a Link annotation or Outline entry is
activated, or when a document is opened. PDF 1.1 defines five types of
actions:
GoTo Change the current page view to a specified page and zoom
factor.
GoToR Open another PDF file at a specified page and zoom factor.
Launch Launch an application, usually to open a file.
Thread Begin reading an article thread, possibly in another PDF file.
Section 6.10, Articles, further describes article threads.
URI Resolve the specified Uniform Resource Identifier (URI). See
page 85.
Implementation note It is intended that plug-in extensions may add new actions, as described in
Appendix G.
An action is represented as a dictionary. Every action must contain an S
(Subtype) key. Other keys may be present, depending on the action type.
The tables below list the attributes of the five specified action types.
GoTo action
A GoTo action has the same effect as specifying a destination (with a Dest
key) in the Link annotation, but it is less compact and is not compatible with
PDF 1.0. Destinations are preferred over GoTo actions.
Table 6.11 GoTo action attributes
Key Type Semantics
Type name (Optional) Object type. Always Action.
S (Subtype) name (Required) Action type. Always GoTo.
D (Dest) array or name (Required) The destination, as described in Table 6.10 on page 79.
Example 6.9 GoTo action
42 0 obj
<<
/Type /Annot
/Subtype /Link
/Rect [ 71 717 190 734 ]
/Border [ 16 16 1 ]
/A<< /Type /Action
/S /GoTo
/D [ 3 0 R /FitR 4 399 199 533 ] >>
>>
endobj
Note This example has the same effect as the Link annotation shown in Example
6.8 on page 78, which uses a destination (a Dest key).
GoToR action
The GoToR action is similar to the GoTo action. However, it includes an
additional parameter, the F (File) key, that specifies the PDF file that contains
the action's destination.
Table 6.12 GoToR action attributes
Key Type Semantics
Type name (Optional) Object type. Always Action.
S (Subtype) name (Required) Action type. Always GoToR.
D (Dest) array (Required) The destination, represented by an array, as described in Table
6.10 on page 79, except that the destination page (the first element of the
array) must be specified by a page number, not by an indirect reference to
the Page object. The first page is 0.
or
name (Required) The name of a destination. See Section 6.6.4 on page 80.
F (File)
string or dictionary (Required) The file containing the destination view. See Section 6.6.6, File
specifications, for the interpretation of the File key.
Launch action
The Launch action specifies an application to launch or document to open.
The action must specify the application or document as a file, using the F
key.
PDF 1.1 also allows platform-specific information to be included in the
Launch dictionary where that information is needed for a specific platform.
The key Win is used for information related to Microsoft Windows
launches; the key Unix is used for information related to UNIX system
launches. If there is no platform-specific key, then the F key is used.
Implementation note Some implementations of Acrobat 2.0 viewers may check for alternative
keys whose values provide platform-specific parameters for the Launch
action. For example, the Acrobat 2.0 viewer for Windows will use the
dictionary corresponding to the Win key to determine its launch
parameters.
Table 6.13 Launch action attributes
Key Type Semantics
Type name (Optional) Object type. Always Action.
S (Subtype) name (Required) Action type. Always Launch.
F (File)
string or dictionary (Required if there is no alternative key) The file to use in performing the
specified action. See Section 6.6.6, File specifications, for the
interpretation of the F key. A viewer that encounters an action with no F key
and for which it does not understand any of the alternative keys will do
nothing.
Win dictionary (Optional) Windows-specific launch parameters as described in Table 6.14.
Unix string (Optional) Not yet defined.
Implementation note The Acrobat 2.0 viewers for Windows use the Windows function
ShellExecute to launch an application. The Win dictionary entries
correspond to the parameters of ShellExecute.
Table 6.14 Windows-specific launch attributes
Key Type Semantics
F (File) string (Required) The document or application to launch, specified as a DOS file
name using standard DOS syntax. If the string includes a backslash ( \ ), the
backslash must itself be preceded by a backslash.
O (Operation) string (Optional) The operation to perform: (open) or (print). (open) is the
default. If the F key specifies an application, this key is ignored and the
application is launched.
P (Parameters) string (Optional) The parameters passed to the application specified by the F key.
If the F key specifies a document, this key should not be provided.
D (Directory) string (Optional) The default directory, specified using standard DOS syntax.
Thread action
When a viewer performs a Thread action, it goes to the specified thread and
enters thread mode. The thread need not be in the current PDF file.
Table 6.15 Thread action attributes
Key Type Semantics
Type name (Optional) Object type. Always Action.
S (Subtype) name (Required) Action type. Always Thread.
F (File)
string or dictionary (Required if the thread is in an external file) The file containing the
destination thread. See Section 6.6.6, File specifications, for the
interpretation of the F key.
D (Dest) (Required) The desired thread destination. One of the following forms must
be provided:
dictionary An indirect reference to a thread in the current file. (See Section 6.10,
Articles.)
number A number that specifies the index of a thread in an external file. (The index
of the first thread in a document is 0.)
string The title of a thread in an external file. If more than one thread has the same
title, the first thread in the document's list of threads with that title will be
chosen.
name The name of a destination, in either the current file or an external file. See
Section 6.6.4, Named destinations.
array A destination, as specified in Table 6.10 on page 79.
B (Bead) (Optional) The desired bead in the destination thread. One of the following
forms may be provided:
dictionary An indirect reference to a Bead dictionary in the current file. See Table 6.45
on page 132.
number A number that specifies the bead's index in the thread in an external file.
(The index of the first bead in a thread is 0.)
URI action
A Uniform Resource Identifier (URI) is a string used to identify a resource
on the Internet, typically a file that is the destination of a hypertext link,
although it can also resolve to a query or other entity. In PDF 1.1, a URI
action is a Link annotation that includes a URI in its dictionary; activating
the link causes the URI to be resolved.
Note The URI action is resolved by the Acrobat WebLink plug-in.
Table 6.16 URI action attributes
Key Type Semantics
Type name (Optional) Object type. Always Action.
S (Subtype) name (Required) Action type. Always URI.
URI string (Required) The Uniform Resource Identifier to resolve, encoded in 7-bit
ASCII.
IsMap boolean (Optional) If this key is true, the mouse position should be tracked when
the link is activated.
In a URI, any characters following a # define a fragment identifier. The
meaning of this identifier depends on the type of the resource that the URI
identifies. In a PDF file, the fragment identifier is the name of a destination,
so the URI action is similar to a GoToR action that uses a named
destination.
Names in PDF allow characters that are not allowed in URI strings. To use
such characters in a fragment identifier, write their two hex-digit character
codes, preceded by a percent sign. The name X&Y, for example, would be
written as X%26Y.
Implementation note When resolving the fragment identifier, the WebLink plug-in will check all
named destinations defined for the document. If one is found whose name
matches the fragment identifier, that destination will be invoked.
In the future, the syntax of the fragment identifier may be extended to
specify threads, highlighting, and direct destinations. In order to reserve a
name space for these future specifications, the destination name PDFD is
reserved.
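The percent escaping of a destination name can be sketched with Python's standard urllib.parse module. The function name name_to_fragment is hypothetical, and escaping every character other than letters, digits, and a few unreserved marks is a conservative choice made for this sketch rather than a rule stated here.

```python
from urllib.parse import quote, unquote


def name_to_fragment(name):
    """Escape a destination name for use after the # in a URI."""
    # safe="" makes quote() escape reserved characters such as & and /.
    return quote(name, safe="")


def fragment_to_name(fragment):
    """Recover the destination name from an escaped fragment identifier."""
    return unquote(fragment)
```

Applied to the name X&Y this produces X%26Y, and unescaping the result gives back the original name.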
A URI action's IsMap attribute indicates that when the action is performed,
the (x, y) position of the mouse within the parent link annotation (relative to
the upper left hand corner of the link rectangle) should be concatenated to
the end of the URI, preceded by a question mark. Here is an example:
http://www.adobe.com/intro?100,200
Suppose the bounding rectangle in user space of the Link annotation (the
value of the Rect key) is [ ll_x ll_y ur_x ur_y ]. Given the coordinates of the
mouse position in device space, (x_d, y_d), transform the mouse coordinates to
user space, (x_u, y_u). The final coordinates, (x, y), are obtained in this way:
x = x_u - ll_x
y = y_u - ur_y
Because these coordinates can be fractional and the IsMap attribute
requires integers, the final coordinates should be rounded to the nearest
integer.
URI dictionary in the Catalog
In order to support URI action types, the Catalog of the PDF file may
include a URI dictionary.
Table 6.17 URI attributes
Key Type Semantics
Base string (Optional) Base URI to resolve relative references. This element allows the
URI of the document itself to be recorded in situations in which the
document may be accessed out of context. URI actions within the document
may be in a partial form relative to this base address. When the base
address is not specified, the URI is assumed to be the one originally used to
locate the document. For example, if a document has been moved but the
documents pointed to by relative links within the document have not, the
Base key could be used to override the true URI of the document to fix the
relative links. This concept is parallel to the description of the body element
<BASE> as described in Section 2.7.2 of the HTML specification [8].
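Resolution of a partial URI against the Base key can be sketched with urljoin from Python's standard urllib.parse module. The function name resolve_uri, its parameters, and the example.com addresses used to exercise it are assumptions for illustration; the sketch falls back to the URI used to locate the document when no Base value is given.

```python
from urllib.parse import urljoin


def resolve_uri(action_uri, base=None, document_uri=None):
    """Resolve the URI of a URI action to an absolute URI.

    base is the value of the Base key, if present; document_uri is the URI
    originally used to locate the document.
    """
    effective_base = base if base is not None else document_uri
    if effective_base is None:
        return action_uri  # nothing to resolve against
    # urljoin returns action_uri unchanged when it is already absolute.
    return urljoin(effective_base, action_uri)
```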
6.6.6 File specifications
A file specification together with a file system describes the location of a
file. A simple file specification does not specify a file system to be used, and
a full file specification includes information that selects one or more file
systems. Simple file specifications are strings that represent the name of the
referenced file in a format that is independent of operating system naming
conventions. Simple file specification strings are encoded with the
PDFDocEncoding.
The standard format for a simple file specification divides the string into
component strings separated by the slash ( / ) character. The slash is used as
a generic component separator that is mapped to the appropriate separator
when generating a system-dependent file name. The component string may
be empty, and if the component string contains one or more slashes (e.g.,
in/out ) each slash must be preceded by a backslash ( \ ) (e.g., in\\/out ). Note
that the backslash itself must be preceded by a backslash to indicate it is
being used as a character in the string and not the escape character. The
backslashes are removed in defining the components; they are only needed
to distinguish the component values from the component separators.
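Splitting a simple file specification into components can be sketched as below. The function name split_components is hypothetical. The sketch assumes it receives the string value after PDF string escapes have been processed, so an escaped slash arrives as a single backslash followed by a slash, and it reports a leading slash as an empty first component.

```python
def split_components(spec):
    """Split a simple file specification on unescaped slashes."""
    components = []
    current = []
    i = 0
    while i < len(spec):
        ch = spec[i]
        if ch == "\\" and i + 1 < len(spec) and spec[i + 1] == "/":
            current.append("/")  # escaped slash stays inside the component
            i += 2
        elif ch == "/":
            components.append("".join(current))  # separator ends a component
            current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    components.append("".join(current))
    return components
```

With this convention a result whose first element is the empty string corresponds to a specification that begins with a slash.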
A simple file specification that begins with a slash is an absolute file
specification. Within an absolute file specification, the last component is the
file name, and the preceding components are the context. The file name may
be empty in some file specifications; for example, URL specifications can
specify directories instead of files. A file specification that begins with a
component (i.e., one that does not begin with a slash) is a relative file
specification. A relative file specification is relative to the file specification
of the document containing the relative file specification.
In the case of a URL file system, the rules of RFC 1808, Relative Uniform
Resource Locators [12], are used to compute an absolute URL from the
document's file specification and a relative file specification. Prior to this
process, the relative file reference is converted into a relative URL by using
the escape mechanism of RFC 1738, Uniform Resource Locators [9], to
represent any octets that would be either unsafe according to RFC 1738 or
not representable in 7-bit US ASCII. In addition, such URL-based relative
file references are limited to being paths as defined in RFC 1808; the
scheme, network location/login, fragment identifier, query information, and
parameters are not allowed.
In the case of other file systems, an absolute file specification is created
from a relative file specification and the file specification of the document
containing the relative file specification by removing the file name
component of the document's file specification and appending the relative
file specification.
The special component .. allows condensing a file specification.
Proceeding from left to right, whenever a component that is not .. is
followed by .., that component and the .. are eliminated from the file
specification and the process is begun again. This allows relative file
specifications that are relative to an initial segment of an absolute file
specification.
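The condensing procedure for the .. component can be sketched directly from its description. The function name condense is hypothetical, and the sketch assumes the specification has already been split into a list of components that does not include any marker for a leading slash.

```python
def condense(components):
    """Eliminate each non-.. component that is followed by .., left to right."""
    parts = list(components)
    i = 0
    while i + 1 < len(parts):
        if parts[i] != ".." and parts[i + 1] == "..":
            del parts[i:i + 2]  # drop the component and the .. that follows it
            i = 0               # begin the process again from the left
        else:
            i += 1
    return parts
```

A leading .. has no component before it to cancel, so it is left in place for the caller to interpret.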
The conversion of a file specification into a system-dependent file name is
specified for each file system. For the Macintosh, the components are
separated by colons ( : ). For UNIX, the components are separated by
slashes, and an initial slash, if present, is preserved. For DOS, the initial
component is either a physical or logical drive identifier or a network
resource name as returned by the Microsoft Windows function
WNetGetConnection and is followed by a colon. A network resource
name is constructed from the first two components of the file specification;
the first component is the server name and the second component is the
share name (volume name). All the components are then separated by
backslashes. It is possible to specify an absolute DOS path without a drive
by making the first component empty. (Empty components are ignored by
other platforms.)
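The conversion rules above can be sketched as follows and checked against the rows of Table 6.18. The function name to_system_path is hypothetical. The sketch ignores escaped slashes, handles only the three systems named here, and assumes that a single-character first component on DOS is a drive identifier while any longer one is a server name; a real viewer would consult the system instead.

```python
def to_system_path(spec, system):
    """Convert a simple file specification to a system-dependent path."""
    absolute = spec.startswith("/")
    components = spec[1:].split("/") if absolute else spec.split("/")
    if system == "Mac":
        # Components are separated by colons; empty components are ignored.
        return ":".join(c for c in components if c)
    if system == "UNIX":
        # Components are separated by slashes; an initial slash is preserved.
        path = "/".join(c for c in components if c)
        return "/" + path if absolute else path
    if system == "DOS":
        if not absolute:
            return "\\".join(components)
        first, rest = components[0], components[1:]
        if first == "":
            prefix = ""  # empty first component: absolute path without a drive
        elif len(first) == 1:
            prefix = first + ":"  # assumed to be a drive identifier
        else:
            prefix = first + "/" + rest[0] + ":"  # server name and share name
            rest = rest[1:]
        return prefix + "\\" + "\\".join(rest)
    raise ValueError("unsupported system: " + system)
```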
Table 6.18 provides examples of file specifications on various platforms.
Table 6.18 Examples of file specifications
System  System-dependent path            String
Mac     Macintosh HD:PDFDocs:spec.pdf    (/Macintosh HD/PDFDocs/spec.pdf)
DOS     \pdfdocs\spec.pdf (no drive)     (//pdfdocs/spec.pdf)
DOS     r:\pdfdocs\spec.pdf              (/r/pdfdocs/spec.pdf)
DOS     pcadobe/eng:\pdfdocs\spec.pdf    (/pcadobe/eng/pdfdocs/spec.pdf)
UNIX    /user/fred/pdfdocs/spec.pdf      (/user/fred/pdfdocs/spec.pdf)
UNIX    pdfdocs/spec.pdf (relative)      (pdfdocs/spec.pdf)
A file specification can be either a string, formatted as described above, or a
dictionary. The dictionary form of the file specification provides for
platform-specific file specifications and allows extension of the form of file
specifications. A dictionary that contains a platform-specific file system key
or a file system key (FS) is a full file specification. This provides alternate
ways to locate a file.
A PDF file viewer should use the appropriate platform-specific key (Mac,
DOS, or Unix). If it does not find the appropriate platform-specific key and
there is no file system value (FS), it should treat the value of the file
specification key (F) as a simple file specification. The keys need not
specify the same file, allowing a single file specification to describe
appropriate but different files for different platforms.
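The key selection just described can be sketched as below. The function name select_file_spec is hypothetical, and the sketch assumes the file specification has been parsed into either a Python string or a dict. Returning None when an FS key is present and no platform key matches is a choice of this sketch, standing for handing the dictionary to the named file system.

```python
def select_file_spec(file_spec, platform):
    """Pick the string a viewer on the given platform should interpret.

    platform is one of the key names "Mac", "DOS", or "Unix".
    """
    if isinstance(file_spec, str):
        return file_spec  # string form: already a simple file specification
    if platform in file_spec:
        return file_spec[platform]  # platform-specific key takes priority
    if "FS" not in file_spec:
        return file_spec.get("F")  # fall back to F as a simple specification
    return None  # interpretation is left to the file system named by FS
```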
Table 6.19 describes the file specification dictionary attributes.
Table 6.19 File specification attributes
Key Type Semantics
FS (FileSystem) name (Optional) The name of the file system to be used to interpret this file
specification. A viewer or plug-in can register a file system. A file system
interprets file specifications, opens files, and provides the usual input and
output operations. If a file specification includes a file system, all other keys
are interpreted by this file system. Note that this key is independent of the F,
Mac, DOS, and Unix keys.
F (File) string (Required if no other keys are present) A file specification using the string
format described earlier in this section. A viewer that encounters an action
with no F key and that does not understand any of the alternative keys need
not do anything.
Mac string (Optional) A string that specifies a Macintosh file name using the string
format described above.
DOS string (Optional) A string that specifies a DOS file name using the string format
described above.
Unix string (Optional) A string that specifies a UNIX file name using the string format
described above.
ID array (Optional) An array of two strings. The ID is a le ID as described in
Section 6.11. This allows a viewer to find the exact match more often, and it
allows viewers to warn a user if the file has changed since the link was
made.
The string values of the DOS, Mac, and Unix keys should not be modified
by the implementation and are passed unchanged to the file system as an
octet string.
When the FS key has the value URL, the value of the F key is not a file
specification string: instead, it is a URL formatted as specified in RFC 1738
and must follow the character encoding requirements of that RFC. Because
7-bit US ASCII is a strict subset of the PDFDocEncoding, this value may
also be considered to be in the PDFDocEncoding.
Care must be taken to use safe path names when creating collections of
documents that will be used on various file systems. A safe path name is one
that can be used to locate files on the most common file systems. For
maximum compatibility, only a subset of the US ASCII character set should
be used. All of the upper and lowercase alphabetic (a-z, A-Z) and numeric
characters (0-9) are safe, as are the hyphen ( - ) and the underscore ( _ ).
The period ( . ) has special meaning as a relative path specifier in DOS and
Windows file names. When used in file names, the period should only be
used to separate a base file name from a file extension. Some systems are
case-insensitive, so names within a directory should be distinguishable if
case is folded. On DOS and Windows 3.1 systems and on some CD-ROM
file systems, file names are limited to eight characters plus a three-character
extension. File system software typically converts long names to short
names by retaining the first six or seven characters and the first three
characters after the last period, if any. The seventh or eighth characters are
converted to other values unrelated to the original value. Therefore, safe file
names are distinguishable from the first six characters.
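The safe path name guidelines above can be checked mechanically. The following Python sketch is illustrative only and is not part of the PDF specification: the function names is_safe_name and find_collisions are hypothetical, and the choice to compare the first six characters of the base name together with the extension is an assumption about how to apply the last rule.

```python
import re

# Characters the text calls safe: letters, digits, hyphen, underscore.
_SAFE_PART = re.compile(r"[A-Za-z0-9_-]+")


def is_safe_name(name: str) -> bool:
    """Return True if a single file name follows the guidelines above."""
    base, dot, ext = name.partition(".")
    # Base name: safe characters only, at most eight of them.
    if not _SAFE_PART.fullmatch(base) or len(base) > 8:
        return False
    if dot:
        # At most one period, used only to separate base name and extension;
        # the extension is limited to three safe characters.
        if not _SAFE_PART.fullmatch(ext) or len(ext) > 3:
            return False
    return True


def find_collisions(names: list[str]) -> list[tuple[str, str]]:
    """Pair up names in one directory that cannot be told apart from their
    first six characters (plus extension) once case is folded."""
    seen: dict[str, str] = {}
    collisions = []
    for name in names:
        base, _, ext = name.partition(".")
        # Assumption: the extension takes part in the comparison.
        key = base[:6].lower() + "." + ext[:3].lower()
        if key in seen:
            collisions.append((seen[key], name))
        else:
            seen[key] = name
    return collisions
```

A producer of document collections might run both checks over each directory before writing relative file specifications into link actions.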
6.6.7 Movie annotations
A Movie annotation describes the static display and playing of movies and
sounds within PDF documents. These annotations appear to be embedded in
the document, similarly to links. The activation area may be invisible or
bordered in the manner of a link button. There are several options that
control the way a movie is displayed and played.
The activation area may also have the movie's poster displayed. A
QuickTime movie may designate a poster, which is a single frame from the
movie itself or a separately authored frame. If not otherwise specified by the
movie author, the poster is the first frame of the movie. For AVI movies, the
poster is always the first frame of the movie.
Table 6.20 Movie annotation attributes (in addition to those in Table 6.7)
Key Type Semantics
Subtype name (Required) Annotation subtype. Always Movie.
Movie dictionary (Required) A description of the static characteristics of the Movie; see Table
6.21.
A (Activation) boolean (Optional) A flag that indicates whether the movie should be shown by
clicking in the annotation rectangle. Possible values are:
false Do not play the movie when clicked.
true Play the movie with the default activation values. (This is
the default value for the A key.)
or
dictionary (Optional) Directions for playing the Movie; see Table 6.22.
The Movie dictionary contains information needed to locate the movie data
and to display the poster (if requested) in the annotation rectangle:
Table 6.21 Movie dictionary attributes
Key Type Semantics
F (File) string or dictionary (Required) A file specification for a self-describing movie file.
Note The format of a self-describing movie file is left unspecified, and there is
no guarantee of portability.
Aspect array (Optional) If the movie is visible, the horizontal and vertical sizes of the
movie's bounding box in pixels: [ horiz vert ]. An invisible movie is one
with no video: it has only sound.
Poster boolean (Optional) A flag indicating whether the poster is to be retrieved from the
movie file for display. Possible values are:
false Do not show a poster image. (This is the default if the
Poster key is omitted.)
true Show the poster image from the movie file.
or
stream (Optional) An image object that is to be displayed as the poster. The format
of this object is identical to an Image resource (see page 122), except that
the Name key is not required.
The Activation dictionary contains information needed to control the
dynamics of playing the movie:
Table 6.22 Activation attributes
Key Type Semantics
ShowControls boolean (Optional) If this key is true, a Movie Controller bar is shown when the
movie is played.
Mode name (Optional) The playing mode for the movie. The defined values are:
Once Show the movie once and stop. (This is the default value.)
Open Show the movie and leave the controller open.
Repeat Repeat the movie from the beginning until stopped.
Palindrome Play the movie back and forth until stopped.
FWScale array (Optional) If this key is omitted, the movie will be played in the annotation
rectangle. Otherwise, it will be played in a floating window. The array
contains two integers, [ a b ], representing the rational number a/b, which
specifies the magnification factor for the movie. The final window size for
the movie will be (a/b) × Aspect pixels.
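The FWScale arithmetic can be illustrated with a short Python sketch. The function name floating_window_size is hypothetical, and rounding the result to whole pixels is an assumption; the text does not say how fractional sizes are handled.

```python
from fractions import Fraction


def floating_window_size(aspect, fw_scale):
    """Return the floating-window size in pixels as a (width, height) pair.

    aspect   -- the Movie dictionary's Aspect array, [horiz, vert]
    fw_scale -- the Activation dictionary's FWScale array, [a, b]
    """
    a, b = fw_scale
    scale = Fraction(a, b)  # exact rational magnification a/b
    horiz, vert = aspect
    # Assumption: fractional results are rounded to whole pixels.
    return (round(horiz * scale), round(vert * scale))
```

For instance, an Aspect of [ 160 120 ] combined with an FWScale of [ 3 2 ] would give a 240 by 180 pixel window under these assumptions.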
6.7 Outline tree
An outline allows a user to access views of a document by name. As with a
link annotation, activation of an outline entry (also called a bookmark)
brings up a new view based on the destination description. Outline entries
form a hierarchy of elements. An entry may be one of several at the same
level in the outline, it may be a sub-entry of another entry, and it may have
its own set of child entries. An outline entry may be open or closed. If it is
open, its immediate children are visible when the outline is displayed. If it is
closed, they are not.
If a document includes an outline, it is accessed from the Outlines key in
the Catalog object. The value of this key is the Outlines object, which is the
root of the outline tree. The contents of the Outlines dictionary appear in
Table 6.23 and Example 6.10. The top-level outline entries are contained in
a linked list, with First pointing to the head of the list and Last pointing to
the tail of the list. When displayed, outline entries appear in the order in
which they occur in the linked list.
Table 6.23 Outlines attributes
Key Type Semantics
Count integer (Required if document has any open outline entries, otherwise optional)
Total number of open entries in the outline. This includes the total number
of items open at all outline levels, not just top-level outline entries. If the
count is zero, this key should be omitted.
First dictionary (Required if document has any outline entries; must be indirect reference)
Reference to the outline entry that is the head of the linked list of top-level
outline entries.
Last dictionary (Required if document has any outline entries; must be indirect reference)
Reference to the outline entry that is the tail of the linked list of top-level
outline entries.
Example 6.10 Outlines object with six open entries
21 0 obj
<<
/Count 6
/First 22 0 R
/Last 29 0 R
>>
endobj
Each outline entry is a dictionary, whose contents are shown in Table 6.24.
Table 6.24 Outline entry attributes
Key Type Semantics
Title string (Required) The text that appears in the outline for this entry. The characters
in this string are encoded using the predefined encoding
PDFDocEncoding, described in Appendix C.
Dest array or name (Required unless the A key is present) A destination, as described in Table
6.9 on page 78.
A (Action) dictionary (Required unless the Dest key is present) The action to be performed when
this outline entry is activated; see Section 6.6.5, Actions.
Parent dictionary (Required; must be indirect reference) Specifies the entry for which the
current entry is a sub-entry. The parent of the top-level entries is the
Outlines object.
Prev dictionary (Required if the entry is not the first of several entries at the same outline
level; must be indirect reference) Specifies the previous entry in the linked
list of outline entries at this level.
Next dictionary (Required if the entry is not the last of several entries at the same outline
level; must be indirect reference) Specifies the next entry in the linked list of
outline entries at this level.
First dictionary (Required if an entry has sub-entries; must be indirect reference) Specifies
the outline entry that is the head of the linked list of sub-entries of this
outline item.
Last dictionary (Required if an entry has sub-entries; must be indirect reference) Specifies
the outline entry that is the tail of the linked list of sub-entries of this outline
item.
Count integer (Required if an entry has sub-entries) If positive, specifies the number of
open descendants the entry has. This includes not just immediate
sub-entries, but sub-entries of those entries, and so on. If the value is negative,
the entry is closed and the absolute value of Count specifies how many
entries will appear when the entry is reopened. If an entry has no
descendants, the Count key should be omitted.
As with Link annotations, GoTo actions should be specified using the Dest
key, for compatibility with viewers implementing the PDF 1.0 specification.
Example 6.11 shows an outline entry. An example of a complete outline tree
can be found in Appendix A.
Example 6.11 Outline entry
22 0 obj
<<
/Parent 21 0 R
/Dest [ 3 0 R /Top 0 792 0 ]
/Title (Document)
/Next 29 0 R
/First 25 0 R
/Last 28 0 R
/Count 4
>>
endobj
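The Count rules in Tables 6.23 and 6.24 can be sketched in Python as follows. The Entry class and both function names are hypothetical stand-ins, not PDF structures, and the tree used in the usage lines is an assumed shape (one open entry with four leaf children, followed by one leaf entry) chosen to agree with the Count values in Examples 6.10 and 6.11.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """Hypothetical in-memory stand-in for an outline entry."""
    title: str
    is_open: bool = True
    children: list["Entry"] = field(default_factory=list)


def visible_descendants(entry: Entry) -> int:
    """Number of descendants shown when `entry` itself is open."""
    total = 0
    for child in entry.children:
        total += 1  # the child itself
        if child.is_open:
            # An open child also exposes its own visible descendants.
            total += visible_descendants(child)
    return total


def count_value(entry: Entry):
    """Value for the Count key, or None when the key should be omitted."""
    n = visible_descendants(entry)
    if n == 0:
        return None
    # Positive for an open entry, negative for a closed one.
    return n if entry.is_open else -n


# Assumed tree shape, for illustration only.
document = Entry("Document", children=[Entry(f"Section {i}") for i in range(1, 5)])
outlines = Entry("Outlines", children=[document, Entry("Summary")])
print(count_value(document), count_value(outlines))
```

Treating the Outlines object as an always-open root lets the same function produce its Count, which covers open entries at every level.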
6.8 Resources
The content of a Page object is represented by a sequence of instructions
that produce the text, graphics, and images on that page. The instructions for
a particular page may make use of certain objects not contained within that
page's description itself but that are either located elsewhere in the PDF file
or are PostScript language objects such as fonts. These objects, which are
required in order to draw the page but are not stored in the page content
itself, are called resources.
Resources are not part of a page but are simply referenced by the page.
Multiple pages can share a resource. Because resources are stored outside
the content of all pages, even pages that share resources remain independent
of each other.
PDF currently supports the following resource types: ProcSet, Font,
Encoding, FontDescriptor, ColorSpace, and XObject.
Each page includes a list of the ProcSet, Font, and XObject resources it
uses. This resource list is stored as a dictionary that is the value of the
Resources key in the Page object, and has two functions: it enumerates
the resources directly needed by the page, and it establishes names by which
operators in the page description can refer to the resources. All instructions
in the page description that operate on resources refer to them by name.
Each key in the Resources dictionary is a resource type, whose value is a
dictionary or an array. If it is a dictionary, it contains keys that are resource
names and values that are indirect references to the PDF objects specifying
the resources. If it is an array, it contains a list of names. Only the list of
ProcSet resources is represented as an array in the Resources dictionary; all
other resource lists are represented as dictionaries within the Resources
dictionary.
Example 6.12 shows a Resources dictionary containing a ProcSet array, a
Font dictionary, and an XObject dictionary. The ProcSet array is described
in the following section. The font dictionary contains four fonts named F5,
F6, F7, and F8, and associated with object numbers 6, 8, 10, and 12,
respectively. The XObject dictionary contains two XObjects named Im1 and
Im2 and associated with object numbers 13 and 15, respectively.
Example 6.12 Resources dictionary
<<
/ProcSet [/PDF /ImageB]
/Font << /F5 6 0 R /F6 8 0 R /F7 10 0 R /F8 12 0 R >>
/XObject << /Im1 13 0 R /Im2 15 0 R >>
>>
Some PDF operators take resource names as operands. These resource
names are expected to appear in the current page's Resources dictionary. If
they do not, an error may be raised or, in the case of a font, a default font
may be substituted.
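Name resolution against a Resources dictionary might be sketched as below. This is a model in plain Python values, not a viewer implementation: the RESOURCES layout mirrors Example 6.12, while lookup_resource and the DEFAULT_FONT placeholder are hypothetical names introduced for illustration.

```python
# Resources dictionary from Example 6.12, modeled with plain Python values.
# Indirect references such as "6 0 R" are kept as (object, generation) pairs.
RESOURCES = {
    "ProcSet": ["PDF", "ImageB"],
    "Font": {"F5": (6, 0), "F6": (8, 0), "F7": (10, 0), "F8": (12, 0)},
    "XObject": {"Im1": (13, 0), "Im2": (15, 0)},
}

DEFAULT_FONT = "default-font"  # hypothetical placeholder for a substitute font


def lookup_resource(resources, res_type, name):
    """Resolve a resource name used as an operand in a page description."""
    try:
        return resources[res_type][name]
    except KeyError:
        if res_type == "Font":
            # In the case of a font, a default font may be substituted.
            return DEFAULT_FONT
        raise LookupError(f"/{name} is not a {res_type} resource of this page")
```

Whether a missing non-font resource raises an error or is handled some other way is left open by the text; raising LookupError here is one possible choice.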
6.8.1 ProcSet resources
The types of instructions that may be used in a PDF page description are
grouped into independent sets of related instructions. Each of these sets,
called ProcSets, may or may not be used on a particular page. ProcSets
contain implementations of the PDF operators and are used only when a
page is printed. The Resources dictionary for each page must contain a
ProcSet key whose value is an array consisting of the ProcSets used on
that page. Each of the entries in the array must be one of the predefined
ProcSets shown in Table 6.25. The Resources dictionary shown in Example
6.12 contains a ProcSet key.
Table 6.25 Predefined procsets
Procset Name   Required if the page has any
PDF            marks on the page whatsoever
Text           text
ImageB         grayscale images or image masks
ImageC         color images
ImageI         indexed images (also called color-table images)
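A producer could derive the ProcSet array from what a page contains. The sketch below follows Table 6.25 directly; the function name procsets_for_page and its keyword parameters are hypothetical, and how a producer detects each kind of content is outside its scope.

```python
def procsets_for_page(text=False, gray_images=False, color_images=False,
                      indexed_images=False, other_marks=False):
    """Return the ProcSet array for a page as a list of procset names."""
    needed = []
    # PDF is required if the page has any marks whatsoever.
    if text or gray_images or color_images or indexed_images or other_marks:
        needed.append("PDF")
    if text:
        needed.append("Text")
    if gray_images:  # grayscale images or image masks
        needed.append("ImageB")
    if color_images:
        needed.append("ImageC")
    if indexed_images:  # also called color-table images
        needed.append("ImageI")
    return needed
```

With only grayscale images present, the result is the same pair of names that appears in the ProcSet array of Example 6.12.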
6.8.2 Font resources
A PDF font resource is a dictionary specifying the kind of font the resource
provides, its real name, its encoding, and information describing the font
that can be used to provide a substitute for it when it is not available. A font
resource may describe a Type 1 font, an instance of a multiple master Type
1 font, a Type 3 font, or a TrueType font.
All types of fonts supported by PDF share a number of attributes. Table 6.26
lists these attributes.
Table 6.26 Attributes common to all types of fonts
Key Type Semantics
Type name (Required) Resource type. Always Font.
Name name (Required only in PDF 1.0) Resource name, used as an operand of the Tf
operator when selecting the font. Name must match the name used in the
font dictionary within the page's Resources dictionary.
Implementation note All Acrobat viewers ignore the Name key.
FirstChar integer (Required except for base 14 Type 1 fonts listed in Table 6.28) Specifies the
first character code defined in the font's Widths array.
LastChar integer (Required except for base 14 Type 1 fonts) Specifies the last character code
defined in the font's Widths array.
Widths array (Required except for base 14 Type 1 fonts; indirect reference preferred) An
array of LastChar - FirstChar + 1 widths. For character codes outside
the range FirstChar to LastChar, the value of MissingWidth from the
font's descriptor is used (see Section 6.8.4, Font descriptors). The units in
which character widths are measured depend on the type of font resource.
Encoding name or dictionary (Optional) Specifies the font's character encoding. If it is a name, it must be
the name of an encoding resource or the name of a predefined encoding. If it
is a dictionary, it must be an Encoding resource dictionary. If this key is not
present, the font's built-in encoding is used. Appendix C describes the
predefined encodings (MacRomanEncoding, MacExpertEncoding,
and WinAnsiEncoding).
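The relationship among FirstChar, LastChar, Widths, and MissingWidth can be sketched in Python. The function names are hypothetical. The sample array is a deliberately truncated illustration that reuses the first eight widths of Example 6.14 with an assumed LastChar of 39, and the division by 1000 applies only to font types whose widths use 1000 units per text space unit.

```python
def char_width(code, first_char, last_char, widths, missing_width=0):
    """Width of a character code, in the font resource's width units."""
    if len(widths) != last_char - first_char + 1:
        raise ValueError("Widths must hold LastChar - FirstChar + 1 entries")
    if first_char <= code <= last_char:
        return widths[code - first_char]
    # Outside the range, MissingWidth from the font descriptor is used.
    return missing_width


def width_in_text_space(width):
    """Convert a width measured in 1000ths of a text space unit."""
    return width / 1000


# Truncated illustration: first eight widths of Example 6.14, FirstChar 32.
SAMPLE_WIDTHS = [187, 235, 317, 430, 427, 717, 607, 168]
print(char_width(35, 32, 39, SAMPLE_WIDTHS))
```

Type 3 fonts are not covered by width_in_text_space, since their widths are in character space and are converted through FontMatrix instead.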
For Type 1 and TrueType fonts, the BaseFont key in the font dictionary
may contain a style string. If the font is a bold, italic, or bold italic font for
which no PostScript language name is available, the BaseFont key
contains the base name of the font with any spaces removed, followed by a
comma, followed by a style string. The style string contains one of the
strings Italic, Bold, or BoldItalic. For example, the italic variant of
the New York font has a BaseFont of /NewYork,Italic. The PostScript
language name of a font is the name which, in a PostScript language
program, is used as an operand of the findfont operator. It is the name
associated with the font by a definefont operation. This is usually the
value of the FontName key in the PostScript language font dictionary of
the font. For more information, see Section 5.2 of the PostScript Language
Reference Manual, Second Edition.
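The style string convention can be illustrated with two small Python helpers. Both function names are hypothetical; the sketch only covers the comma-separated form described above and does not decide when a PostScript language name is available.

```python
STYLES = ("Italic", "Bold", "BoldItalic")


def make_base_font(font_name, style=None):
    """Build a BaseFont name: spaces removed, then ",Style" if given."""
    base = font_name.replace(" ", "")
    if style is None:
        return base
    if style not in STYLES:
        raise ValueError(f"unknown style string: {style}")
    return f"{base},{style}"


def split_base_font(base_font):
    """Split a BaseFont name into (base name, style or None)."""
    base, comma, style = base_font.partition(",")
    return (base, style if comma else None)
```

Applied to the New York example, make_base_font("New York", "Italic") yields the name NewYork,Italic.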
Type 1 fonts
Type 1 fonts, described in detail in Adobe Type 1 Font Format, are special-purpose
PostScript language programs used for defining fonts. As compared
to Type 3 fonts, Type 1 fonts can be defined more compactly, make use of a
special procedure for drawing the characters that results in higher quality
output at small sizes and low resolution, and have a built-in mechanism for
specifying hints, which are data that indicate basic features of the character
shapes not directly expressible by the basic PostScript language operators.
In addition, Type 1 fonts that contain a UniqueID in the font itself can be
cached across jobs, potentially resulting in enhanced performance. See
Section 2.5 of the Adobe Type 1 Font Format for further information on
UniqueIDs for Type 1 fonts.
Table 6.27 shows the attributes specific to Type 1 font resources.
Note Character widths in Type 1 font resources are measured in units in which
1000 units correspond to 1 unit in text space.
Table 6.27 Type 1 font additional attributes
Key Type Semantics
Subtype name (Required) Type of font. Always Type1.
BaseFont name (Required) A PostScript language name or a style string specifying the base
font. (See the section on Font Subsets on page 102 for restrictions on the
name.)
FontDescriptor dictionary (Required except for base 14 fonts; must be indirect reference) A font
descriptor resource describing the font's metrics other than its character
widths.
The base 14 Type 1 fonts
Some font attributes can be omitted for the fourteen Type 1 fonts guaranteed
to be present with Acrobat Exchange and Acrobat Reader. These fonts are
called the base 14 fonts and include members of the Courier, Helvetica, and
Times families, along with Symbol and ITC Zapf Dingbats. Table 6.28 lists
the PostScript language names of these fonts.
Table 6.28 Base 14 fonts
Courier                 Symbol
Courier-Bold            Times-Roman
Courier-Oblique         Times-Bold
Courier-BoldOblique     Times-Italic
Helvetica               Times-BoldItalic
Helvetica-Bold          ZapfDingbats
Helvetica-Oblique
Helvetica-BoldOblique
Example 6.13 shows the font resource for the Adobe Garamond Semibold
font. In this example, the font is given the name F1, by which it can be
referred to in the PDF page description. The font has an encoding (object
number 25), although neither the encoding nor the font descriptor (object
number 7) is shown in the example.
Example 6.13 Type 1 font resource and character widths array
14 0 obj
<<
/Type /Font
/Subtype /Type1
/Name /F1
/BaseFont /AGaramond-Semibold
/Encoding 25 0 R
/FontDescriptor 7 0 R
/FirstChar 0
/LastChar 255
/Widths 21 0 R
>>
endobj
21 0 obj
[ 255 255 255 255 255 255 255 255 255 255 255 255 255 255 255
255 255 255 255 255 255 255 255 255 255 255 255 255 255 255
255 255 255 280 438 510 510 868 834 248 320 320 420 510 255
320 255 347 510 510 510 510 510 510 510 510 510 510 255 255
510 510 510 330 781 627 627 694 784 580 533 743 812 354 354
684 560 921 780 792 588 792 656 504 682 744 650 968 648 590
638 320 329 320 510 500 380 420 510 400 513 409 301 464 522
268 259 484 258 798 533 492 516 503 349 346 321 520 434 684
439 448 390 320 255 320 510 255 627 627 694 580 780 792 744
420 420 420 420 420 420 402 409 409 409 409 268 268 268 268
533 492 492 492 492 492 520 520 520 520 486 400 510 510 506
398 520 555 800 800 1044 360 380 549 846 792 713 510 549 549
510 522 494 713 823 549 274 354 387 768 615 496 330 280 510
549 510 549 612 421 421 1000 255 627 627 792 1016 730 500
1000 438 438 248 248 510 494 448 590 100 510 256 256 539 539
486 255 248 438 1174 627 580 627 580 580 354 354 354 354 792
792 790 792 744 744 744 268 380 380 380 380 380 380 380 380
380 380 ]
endobj
Font Subsets
PDF 1.1 permits documents to include subsets of Type 1 fonts. The font
resource and font descriptor that describe a font subset are slightly different
from those of ordinary fonts. These differences allow an application to
recognize font subsets and to merge documents containing different subsets
of the same font.
The values of the font resource's BaseFont key and the font descriptor's
FontName key use the following format:
pseudoUniqueTag+PostScriptName
pseudoUniqueTag consists of exactly six uppercase alphabetic characters.
PostScriptName must be the name of the complete Type 1 font. A plus
sign separates pseudoUniqueTag and PostScriptName. For example,
EOODIA+Poetica. The purpose of the tag is to identify the subset.
Different subsets should have different tags.
Note Any font whose BaseFont or FontName uses this format is assumed to
be a font subset.
Implementation note These restrictions make font subsets compatible with 1.0 viewers, enable the
Distiller application to recognize font subsets in its input stream, and
enable Acrobat 2.0 viewers to merge documents containing subsets.
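An application could recognize the subset naming format with a pattern match such as the following Python sketch. The function name split_subset_name is hypothetical, and the pattern assumes the tag uses the letters A through Z only.

```python
import re

# Six uppercase letters, a plus sign, then the complete font's PostScript name.
_SUBSET = re.compile(r"([A-Z]{6})\+(.+)")


def split_subset_name(name):
    """Return (tag, PostScript name), or None if `name` is not a subset name."""
    match = _SUBSET.fullmatch(name)
    return match.groups() if match else None
```

Two names that split to the same PostScript name but different tags would then be candidates for merging as different subsets of the same font.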
Multiple master Type 1 fonts
The multiple master font format is an extension of the Type 1 font format
that allows the generation of a wide variety of typeface styles from a single
font. This is accomplished through the presence of various design
dimensions in the font. Examples of design dimensions are weight (light to
extra-bold) and width (condensed to expanded). Coordinates along these
design dimensions (such as the degree of boldness) are specified by
numbers.
To specify the appearance of the font, numeric values must be supplied for
each design dimension of the multiple master font. A completely specified
multiple master font is referred to as an instance of the multiple master font.
The note Adobe Type 1 Font Format: Multiple Master Extensions describes
multiple master fonts. An instance of a multiple master font, shown in Table
6.29, has the same keys as an ordinary Type 1 font.
Note Character widths in multiple master Type 1 font resources are measured in
units in which 1000 units correspond to 1 unit in text space.
Table 6.29 Multiple master Type 1 font additional attributes
Key Type Semantics
Subtype name (Required) Type of font. Always MMType1.
BaseFont name (Required) Specifies the PostScript language name of the instance. If the
name contains spaces (such as MinionMM 366 465 11), these spaces are
replaced with underscores.
FontDescriptor dictionary (Required; must be indirect reference) A font descriptor resource describing
the font's metrics other than its character widths.
Example 6.14 Multiple master font resource and character widths array
7 0 obj
<<
/Type /Font
/Subtype /MMType1
/Name /F4
/BaseFont /MinionMM_366_465_11
/FirstChar 32
/LastChar 255
/Widths 19 0 R
/Encoding 5 0 R
/FontDescriptor 6 0 R
>>
endobj
19 0 obj
[ 187 235 317 430 427 717 607 168 326 326 421 619 219 317 219
282 427 427 427 427 427 427 427 427 427 427 219 219 619 619
... omitted data...
301 301 301 569 569 0 569 607 607 607 239 400 400 400 400 253
400 400 400 400 400 ]
endobj
Type 3 fonts
PostScript Type 3 fonts, also known as user-defined fonts, are described in
Section 5.7 of the PostScript Language Reference Manual, Second Edition.
PDF provides a variant of Type 3 fonts in which characters are defined by
streams of PDF page-marking operators. These streams, known as
CharProcs, are associated with the character names. As with any font, the
character names are accessed via an encoding vector.
PDF Type 3 font resources differ from the other font resources provided by
PDF. Type 3 font resources define the font itself, while the other font
resources simply contain information about the font.
Type 3 fonts are more flexible than Type 1 fonts because the character-drawing
streams may contain arbitrary PDF page-marking operators.
However, Type 3 fonts have no mechanism for improving output at small
sizes or low resolutions, and no built-in mechanism for hinting. Table 6.30
shows the attributes specific to Type 3 font resources.
Table 6.30 Type 3 font additional attributes
Key Type Semantics
Subtype name (Required) Type of font. Always Type3.
CharProcs dictionary (Required) Each key in this dictionary is a character name and the value
associated with that key is a stream object that draws the character. Any
operator that can be used in a PDF page description can be used in this
stream. However, the stream must include as its first operator either d0 (d
zero) or d1 (d one), equivalent to the PostScript language setcharwidth
and setcachedevice operators.
FontBBox array (Required) Array of four numbers, [ llx lly urx ury ], specifying the lower left
x, lower left y, upper right x, and upper right y coordinates of the font
bounding box, in that order. The coordinates are measured in character
space. The font bounding box is the smallest rectangle enclosing the shape
that results if all characters in the font are placed with their origins
coincident, and then painted. FontBBox is identical to the PostScript Type
3 font FontBBox.
FontMatrix array (Required) Specifies the transformation from character space to text space.
FontMatrix is identical to the PostScript Type 3 font FontMatrix.
Note Character widths and FontBBox in Type 3 font resources are measured in
character space. The transformation from character space to text space is
specified by the value of the FontMatrix key in the Type 3 font dictionary.
Example 6.15 shows a Type 3 font resource.
Example 6.15 Type 3 font resource
6 0 obj
<<
/Type /Font
/Subtype /Type3
/Name /T36
/CharProcs 1928 0 R
/FontBBox [ 3 241 875 856 ]
/FontMatrix [ .001 0 0 .001 0 0 ]
/FirstChar 3
/LastChar 101
/Widths 7 0 R
/Encoding 1927 0 R
>>
endobj
7 0 obj
[ 55 0 0 589 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 31 31 0 0 0 270 0 0 410 40 640
40 0 40 0 40 40 0 0 0 0 0 0 0 0 60 0
58 61 54 52 603 0 29 0 0 853 73 60 62 504 0 659
44 58 60 60 0 0 603 0 0 0 0 0 0 0 0 0
35 0 35 ]
endobj
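The conversion from character space to text space can be sketched in Python using the values of Example 6.15. The function name to_text_space is hypothetical, and the sketch assumes the usual six-number matrix convention [ a b c d e f ], in which x' = a*x + c*y + e and y' = b*x + d*y + f.

```python
def to_text_space(x, y, font_matrix):
    """Apply a six-number matrix [a b c d e f] to a character-space point."""
    a, b, c, d, e, f = font_matrix
    return (a * x + c * y + e, b * x + d * y + f)


# Values from Example 6.15.
FONT_MATRIX = [0.001, 0, 0, 0.001, 0, 0]
FIRST_CHAR = 3
WIDTHS_START = [55, 0, 0, 589]  # first four entries of the Widths array

# Character code 6 is at index 6 - FirstChar = 3, so its width is 589.
width = WIDTHS_START[6 - FIRST_CHAR]
print(to_text_space(width, 0, FONT_MATRIX))
```

With this FontMatrix, a width of 589 character space units corresponds to roughly 0.589 text space units, the same 1000-to-1 ratio used by the other font types.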
TrueType fonts
The TrueType font format was developed by Apple Computer. A TrueType
font resource, shown in Table 6.31, has the same keys as a Type 1 font
resource.
Note Character widths in TrueType font resources are measured in units in which
1000 units correspond to 1 unit in text space.
Table 6.31 TrueType font attributes
Key Type Semantics
Subtype name (Required) Type of font. Always TrueType.
BaseFont name (Required) Style string specifying the base TrueType font.
FontDescriptor dictionary (Required; must be indirect reference) A font descriptor resource describing
the font's metrics other than its character widths.
Example 6.16 TrueType font resource
17 0 obj
<<
/Type /Font
/Subtype /TrueType
/Name /F1
/BaseFont /NewYork,Bold
/FirstChar 0
/LastChar 255
/Widths 23 0 R
/Encoding /MacRomanEncoding
/FontDescriptor 7 0 R
>>
endobj
23 0 obj
[ 0 333 333 333 333 333 333 333 0 333 333 333 333 333 333 333
333 333 333 333 333 333 333 333 333 333 333 333 333 0 333 333
333 303 500 666 666 882 848 303 446 446 507 666 303 378 303
... omitted data ...
303 530 1280 757 605 757 605 605 355 355 355 355 803 803 790
803 780 780 780 340 636 636 636 636 636 636 636 636 636 636 ]
endobj
6.8.3 Encoding resources
An encoding resource describes a font's character encoding, the mapping
between numeric character codes and character names. These character
names are keys in the font dictionary and are used to retrieve the code which
draws the character. Thus, the font encoding provides the link which
associates numeric character codes with the glyphs drawn when those codes
are encountered in text. An encoding resource is a dictionary whose
contents are shown in Table 6.32.
Table 6.32 Font encoding attributes
Key Type Semantics
Type name (Optional) Resource type. Always Encoding.
BaseEncoding name (Optional) Specifies the encoding from which the new encoding differs.
This key is not present if the encoding is based on the base font's encoding.
Otherwise it must be one of the predefined encodings
MacRomanEncoding, MacExpertEncoding, or WinAnsiEncoding,
described in Appendix C.
Differences array (Optional) Describes the differences from the base encoding.
The value of the Differences key is an array of character codes and glyph
names organized as follows:
code_1 /name_1,1 /name_1,2 ... /name_1,i
code_2 /name_2,1 /name_2,2 ... /name_2,j
...
code_n /name_n,1 /name_n,2 ... /name_n,k
Each code is the first index in a sequence of characters to be changed. The
first glyph name after the code becomes the name corresponding to that
code. Subsequent names replace consecutive code indexes until the next
code appears in the array or the array ends.
For example, in the encoding in Example 6.17, the glyph quotesingle (')
is associated with character code 39. Adieresis (Ä) is associated with code
128, Aring (Å) with 129, and trademark (™) with 170.
Example 6.17 Font encoding
25 0 obj
<<
/Type /Encoding
/Differences [ 39 /quotesingle 96 /grave 128 /Adieresis /Aring
/Ccedilla /Eacute /Ntilde /Odieresis /Udieresis /aacute /agrave
/acircumflex /adieresis /atilde /aring /ccedilla /eacute /egrave
/ecircumflex /edieresis /iacute /igrave /icircumflex /idieresis /ntilde
/oacute /ograve /ocircumflex /odieresis /otilde /uacute /ugrave
/ucircumflex /udieresis /dagger /degree /cent /sterling /section /bullet
/paragraph /germandbls /registered /copyright /trademark /acute
/dieresis 174 /AE /Oslash 177 /plusminus 180 /yen /mu 187
/ordfeminine /ordmasculine 190 /ae /oslash /questiondown
/exclamdown /logicalnot 196 /florin 199 /guillemotleft /guillemotright
/ellipsis 203 /Agrave /Atilde /Otilde /OE /oe /endash /emdash
/quotedblleft /quotedblright /quoteleft /quoteright /divide 216
/ydieresis /Ydieresis /fraction /currency /guilsinglleft /guilsinglright /fi
/fl /daggerdbl /periodcentered /quotesinglbase /quotedblbase
/perthousand /Acircumflex /Ecircumflex /Aacute /Edieresis /Egrave
/Iacute /Icircumflex /Idieresis /Igrave /Oacute /Ocircumflex 241
/Ograve /Uacute /Ucircumflex /Ugrave /dotlessi /circumflex /tilde
/macron /breve /dotaccent /ring /cedilla /hungarumlaut /ogonek
/caron ]
>>
endobj
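The way a Differences array is expanded into code-to-name assignments can be sketched in Python. The function name apply_differences is hypothetical, and the array is modeled as a Python list in which integers stand for character codes and strings stand for glyph names.

```python
def apply_differences(differences, base=None):
    """Return a code -> glyph name mapping after applying a Differences array.

    An integer starts a new run; each name that follows is assigned to the
    current code, which then advances by one.
    """
    encoding = dict(base or {})
    code = None
    for item in differences:
        if isinstance(item, int):
            code = item
        else:
            if code is None:
                raise ValueError("Differences array must begin with a code")
            encoding[code] = item
            code += 1
    return encoding


# The opening of the Differences array in Example 6.17.
start = [39, "quotesingle", 96, "grave", 128, "Adieresis", "Aring", "Ccedilla"]
print(apply_differences(start))
```

Passing a base mapping models the BaseEncoding key: entries named in the Differences array override the base, and all other codes keep their base names.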
6.8.4 Font descriptors
A font descriptor specifies a font's metrics, attributes, and glyphs. These
metrics provide information needed to create a substitute multiple master
font when the original font is unavailable. The font descriptor may also be
used to embed the original font in the PDF file.
A font descriptor is a dictionary, as shown in Table 6.33, whose keys specify
various font attributes. Most keys are similar to the keys found in Type 1
font and FontInfo dictionaries described in Section 5.2 of the PostScript
Language Reference Manual, Second Edition and the Adobe Type 1 Font
Format. All integer values are units in character space. The conversion from
character space to text space depends on the type of font. See the discussion
in Section 6.8.2, Font resources.
Note For detailed information on the coordinate system in which characters are
defined, see Section 5.4 in the PostScript Language Reference Manual,
Second Edition or Section 3.1 in the Adobe Type 1 Font Format.
Table 6.33 Font descriptor attributes
Key Type Semantics
Type name (Required) Resource type. Always FontDescriptor.
Ascent integer (Required) The maximum height above the baseline reached by characters
in this font, excluding the height of accented characters.
CapHeight integer (Required) The y-coordinate of the top of at capital letters, measured from
the baseline.
Descent integer (Required) The maximum depth below the baseline reached by characters in
this font. Descent is a negative number.
Flags integer (Required) Collection of ags dening various characteristics of the font.
See Table 6.35.
FontBBox array (Required) Array of four numbers, [ ll
x
ll
y
ur
x
ur
y
], specifying the lower left
x, lower left y, upper right x, and upper right y coordinates of the font
bounding box, in that order. The font bounding box is the smallest rectangle
enclosing the shape that results if all characters in the font are placed with
their origins coincident, and then painted.
FontName name (Required) The name passed to the PostScript language denefont
operator. (See the section on Font Subsets on page 102 for restrictions on
the name.)
ItalicAngle integer (Required) Angle in degrees counterclockwise from the vertical of the
dominant vertical strokes of the font. ItalicAngle is negative for fonts that
slope to the right, as almost all italic fonts do.
StemV integer (Required) The width of vertical stems in characters.
AvgWidth integer (Optional) The average width of characters in this font. The default value is
0.
FontFile stream (Optional) A stream that defines a Type 1 font.
FontFile2 stream (Optional) A stream that defines a TrueType font.
Leading integer (Optional) The desired spacing between lines of text. The default value is 0.
MaxWidth integer (Optional) The maximum width of characters in this font. The default value
is 0.
MissingWidth integer (Optional) The width to use for unencoded character codes. The default
value is 0.
StemH integer (Optional) The width of horizontal stems in characters. The default value is
0.
XHeight integer (Optional) The y-coordinate of the top of flat non-ascending lowercase
letters, measured from the baseline. The default value is 0.
CharSet string (Optional) A string which lists the glyph names corresponding to the entries
in the CharStrings dictionary if the font described is a subset font. Each
name must be preceded by a slash. The names may appear in any order. The
name .notdef should be omitted; it is assumed to exist in the font subset.
Font files
Currently, a multiple master Type 1 font can only be used to substitute for
fonts that use the Adobe Roman Standard Character Set as defined in
Appendix E.5 of the PostScript Language Reference Manual, Second
Edition. To make a document portable, it is necessary to embed fonts that do
not use this character set. The only exceptions are the fonts Symbol and ITC
Zapf Dingbats, which are assumed to be present.
Type 1 fonts may be embedded in a PDF 1.1 file using the FontFile
mechanism. The value of the FontFile key in a font descriptor is a stream
that contains a Type 1 font definition. A Type 1 font definition, as described
in the Adobe Type 1 Font Format, consists of three parts: a clear text
portion, an encrypted portion, and a fixed content portion. The fixed content
portion contains 512 ASCII zeros followed by a cleartomark operator, and
perhaps followed by additional data. The stream dictionary for a font file
contains the standard Length and Filter keys plus the additional keys
shown in Table 6.34. While the encrypted portion of a Type 1 font may be in
binary or ASCII hexadecimal format, PDF supports only the binary format.
Example 6.18 shows the structure of an embedded Type 1 font.
TrueType fonts are embedded using the FontFile2 mechanism. The font
descriptor for an embedded TrueType font should contain a FontFile2 key
whose value is a stream that contains the TrueType font definition as
described in TrueType 1.0 Font Files. The stream dictionary should include
a Length1 key as specified in Table 6.34; that key specifies the length in
bytes of the font file after it has been decoded using the filters specified by
the stream's Filter key. The Length2 and Length3 keys should not be
used for TrueType fonts.
Because the stream containing Type 1 or TrueType font data may include
binary data, it may be desirable to convert this data to ASCII using either the
ASCII hexadecimal or ASCII base-85 encoding.
Implementation note Embedded TrueType fonts are ignored by Acrobat 1.0 viewers.
Table 6.34 Additional attributes for FontFile stream
Key Type Semantics
Length1 integer (Required) Length in bytes of the ASCII portion of the Type 1 font file after
it has been decoded using the filters specified by the stream's Filter key.
Length2 integer (Required for Type 1 fonts) Length in bytes of the encrypted portion of the
Type 1 font file after it has been decoded using the filters specified by the
stream's Filter key.
Length3 integer (Required for Type 1 fonts) Length in bytes of the portion of the Type 1 font
file that contains the 512 zeros, plus the cleartomark operator, plus any
following data. This is the length of the data after it has been decoded using
the filters specified by the stream's Filter key. If Length3 is zero, it
indicates that the 512 zeros and cleartomark have not been included in the
FontFile and must be added.
Example 6.18 Embedded Type 1 font definition
12 0 obj
<<
/Filter /ASCII85Decode
/Length 13 0 R
/Length1 15 0 R
/Length2 14 0 R
/Length3 16 0 R
>>
stream
,p>`rDKJj'E+LaU0eP.@+AH9dBOu$hFD55nC
omitted data
JJQ&Nt')<=^p&mGf(%:%h1%9c//K(/*o=.C>UXkbVGTrr~>
endstream
endobj
13 0 obj
41116
endobj
14 0 obj
32393
endobj
15 0 obj
2526
endobj
16 0 obj
570
endobj
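The three length keys of Table 6.34 partition the decoded font data into its clear text, encrypted, and fixed content portions. The following Python sketch illustrates that partition. The function name split_type1_font is hypothetical, and the line layout of the trailer synthesized when Length3 is zero (eight lines of 64 zeros) is an assumption made for illustration; the text above requires only 512 zeros followed by cleartomark.

```python
def split_type1_font(decoded, length1, length2, length3):
    """Split decoded FontFile data into (clear_text, encrypted, fixed_content).

    `decoded` is the stream data after all filters have been applied.
    """
    if len(decoded) != length1 + length2 + length3:
        raise ValueError("Length1 + Length2 + Length3 must equal the decoded size")
    clear_text = decoded[:length1]
    encrypted = decoded[length1:length1 + length2]
    fixed_content = decoded[length1 + length2:]
    if length3 == 0:
        # The fixed content portion was omitted and must be added.
        # Assumed layout: eight lines of 64 ASCII zeros, then cleartomark.
        fixed_content = b"\n".join([b"0" * 64] * 8) + b"\ncleartomark\n"
    return clear_text, encrypted, fixed_content
```

With the values of Example 6.18, the decoded stream would be expected to hold 2526 + 32393 + 570 bytes.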
Flags
The value of the Flags key in a font descriptor is a 32-bit integer that
contains a collection of boolean attributes. These attributes are true if the
corresponding bit is set in the integer. Table 6.35 specifies the meanings of
the bits, with bit 1 being the least significant. Reserved bits must be set to
zero.
Table 6.35 Font flags
Bit position Semantics
1 Fixed-width font
2 Serif font
3 Symbolic font
4 Script font
5 Reserved
6 Uses the Standard Roman Character Set
7 Italic
8-16 Reserved
17 All-cap font
18 Small-cap font
19 Force bold at small text sizes
20-32 Reserved
All characters in a fixed-width font have the same width, while characters in
a proportional font have different widths. Characters in a serif font have
short strokes drawn at an angle on the top and bottom of character stems,
while sans serif fonts do not have such strokes. A symbolic font contains
symbols rather than letters and numbers. Characters in a script font
resemble cursive handwriting. An all-cap font, which is typically used for
display purposes such as titles or headlines, contains no lowercase letters. It
differs from a small-cap font in that characters in the latter, while also
capital letters, have been sized and their proportions adjusted so that they
have the same size and stroke weight as lowercase characters in the same
typeface family. Figure 6.4 shows examples of these types of fonts.
Figure 6.4 Characteristics represented in the flags field of a font descriptor
Bit 6 in the flags field indicates that the font's character set is the Adobe
Standard Roman Character Set, or a subset of that, and that it uses the
standard names for those characters. The characters in the Adobe Standard
Roman Character Set are shown in the first column of Table C.1 on page
248 (A, Æ, Á, etc.); the character names are shown in column 2 (A, AE,
Aacute, etc.).
Finally, bit 19 is used to determine whether or not bold characters are drawn
with extra pixels even at very small text sizes. Typically, when characters
are drawn at small sizes on very low resolution devices such as display
screens, features of bold characters may appear only one pixel wide.
Because this is the minimum feature width on a pixel-based device,
ordinary non-bold characters also appear with one-pixel wide features, and
cannot be distinguished from bold characters. If bit 19 is set, features of
bold characters may be thickened at small text sizes.
[Figure 6.4 sample labels: Fixed-width font, Sans serif font, Serif font, Symbolic font, Italic font, Script font, All cap font]
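Because bit 1 of the Flags value is the least significant, bit position k of Table 6.35 corresponds to the integer value 2^(k-1). The following Python sketch packs and unpacks a Flags value on that basis. The attribute labels (FixedWidth, Serif, and so on) and both function names are hypothetical names chosen for this illustration; only the bit positions come from Table 6.35.

```python
# Bit positions from Table 6.35; the labels are illustrative only.
FONT_FLAGS = {
    1: "FixedWidth",
    2: "Serif",
    3: "Symbolic",
    4: "Script",
    6: "StandardRoman",
    7: "Italic",
    17: "AllCap",
    18: "SmallCap",
    19: "ForceBold",
}


def decode_font_flags(flags):
    """Return the labels of all defined bits set in `flags` (bit 1 = LSB)."""
    return [name for bit, name in sorted(FONT_FLAGS.items())
            if flags & (1 << (bit - 1))]


def encode_font_flags(bits):
    """Build a Flags integer from an iterable of bit positions."""
    value = 0
    for bit in bits:
        value |= 1 << (bit - 1)
    return value
```

For instance, a serif font that uses the Standard Roman Character Set and forces bold at small sizes sets bits 2, 6, and 19, giving 2 + 32 + 262144 = 262178.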
Example 6.19 Font descriptor
7 0 obj
<<
/Type /FontDescriptor
/FontName /AGaramond-Semibold
/Flags 262192
/FontBBox [ -177 -269 1123 866 ]
/MissingWidth 255
/StemV 105
/StemH 45
/CapHeight 660
/XHeight 394
/Ascent 720
/Descent -270
/Leading 83
/MaxWidth 1212
/AvgWidth 478
/ItalicAngle 0
>>
endobj
6.8.5 Color space resources
A color space specifies how color values should be interpreted. While some
PDF operators implicitly specify the color space they use, others require a
color space to be specified. As shown in Figure 6.5, PDF 1.1 supports seven
color spaces: DeviceGray, DeviceRGB, DeviceCMYK, CalGray,
CalRGB, Lab, and Indexed. In addition, provisions have been made for a
CalCMYK color space, although the attributes of this type of space have
not yet been defined. The color spaces follow the semantics described in
Section 4.8 of the PostScript Language Reference Manual, Second Edition.
Figure 6.5 Color spaces
Device-dependent: DeviceGray, DeviceRGB, DeviceCMYK
Device-independent: CalGray, CalRGB, Lab, (CalCMYK)
Special: Indexed
A Color Space resource is specified by a name if it is one of the
device-dependent color spaces (DeviceGray, DeviceRGB, or DeviceCMYK).
Otherwise it is specified as an array that contains one of the
device-independent color spaces (CalGray, CalRGB, Lab, or CalCMYK) or
special color spaces (Indexed).
In a device-dependent color space, the color values are interpreted as
specifying the percentage of device colorant to be used. This means that the
exact color produced depends on the characteristics of the output device.
For example, in the DeviceRGB color space, a value of 1 for the red
component means "turn red all the way on." If the output device is a
monitor, the color displayed depends strongly on the settings of the
monitor's brightness, contrast, and color balance adjustments. In addition,
the precise color displayed depends on the chemical composition of the
compound used as the red phosphor in the particular monitor being used,
the length of time the monitor has been turned on, and the age of the
monitor.
In a device-independent color space, color values are defined by a mapping
from the device-independent color space into a standard color space, the
CIE (Commission Internationale de l'Éclairage) 1931 XYZ color space.
Since the values in the XYZ space can be measured colorimetrically, this
establishes a device-independent specification of the desired color. When a
device-independent color value is rendered on a device, the rendered color
is based on the device-independent color specification as well as the color
characteristics of the device. This may or may not result in a true
colorimetric rendering. Variations from a colorimetric rendering may occur
as a consequence of gamut limitations and rendering intents. See the
discussion of color rendering intents on page 125.
See the PostScript Language Reference Manual, Second Edition for further
explanation of device-independent color.
Implementation note The Acrobat 2.0 viewers allow a user to approximate device-independent
colors with device-dependent colors with no transformation. CalGray
colors are treated as DeviceGray, and CalRGB colors are treated as
DeviceRGB.
Device-dependent color space resources
DeviceGray color space
Colors in the DeviceGray color space are specified by a single value: the
intensity of achromatic light. In this color space, 0 is black, 1 is white, and
intermediate values represent shades of gray.
DeviceRGB color space
Colors in the DeviceRGB color space are represented by three values: the
intensity of the red, green, and blue components in the output. DeviceRGB
is commonly used for video displays because they are generally based on
red, green, and blue phosphors.
DeviceCMYK color space
Colors in the DeviceCMYK color space are represented by four values.
These values are the amounts of the cyan, magenta, yellow, and black
components in the output. This color space is commonly used for color
printers, where they are the colors of the inks traditionally used for four-
color printing. Only cyan, magenta, and yellow are strictly necessary, but
black is generally also used in printing because black ink produces a better
black than a mixture of cyan, magenta, and yellow inks, and because black
ink is less expensive than the other inks.
Device-independent color space resources
CalGray color space
Colors in a CalGray color space are represented by a single value. Input
values are in the range 0 to 1, where 0 is black, 1 is white and intermediate
values are gray.
A CalGray color space is specified by an array of the form
[ /CalGray dict ]
where the contents of dict are described in Table 6.36.
Table 6.36 CalGray attributes
Key Type Semantics
WhitePoint array (Required) Three numbers [ $X_w$ $Y_w$ $Z_w$ ] that specify the CIE 1931 (XYZ)-space
tristimulus value of the diffuse white point. The numbers $X_w$ and $Z_w$
must be positive, and $Y_w$ must be equal to 1. See discussion in 4.8.3 in the
PostScript Language Reference Manual, Second Edition for further details.
BlackPoint array (Optional) Three numbers [ $X_b$ $Y_b$ $Z_b$ ] that specify the CIE 1931 (XYZ)-space
tristimulus value of the diffuse black point. The numbers must be
non-negative. The default value is [ 0 0 0 ]. See discussion in 4.8.3 in the
PostScript Language Reference Manual, Second Edition for further details.
Gamma number (Optional) Defines the exponential relationship between the gray
component and Y. The governing equation is $Y = \mathrm{gray}^{\mathrm{Gamma}}$. Gamma must
be positive and will generally be greater than or equal to 1. The default
value is 1.
CalRGB color space
Colors in a CalRGB color space are represented by three values: the red,
green and blue components of the color. Each value is in the range 0 to 1.
A CalRGB color space is specified by an array of the form:
[ /CalRGB dict ]
where the contents of dict are described in Table 6.37.
Table 6.37 CalRGB attributes
Key Type Semantics
WhitePoint array (Required) Same as for CalGray.
BlackPoint array (Optional) Same as for CalGray.
Gamma array (Optional) Three numbers [ $G_r$ $G_g$ $G_b$ ] that specify the gamma for the red,
green, and blue components respectively. The governing equations are
$R' = R^{G_r}$, $G' = G^{G_g}$, and $B' = B^{G_b}$, where R, G, and B are the input calibrated
RGB values, and R', G', and B' are the gamma-modified values. The default
value is [ 1 1 1 ].
Matrix array (Optional) Nine numbers [ $X_r$ $Y_r$ $Z_r$ $X_g$ $Y_g$ $Z_g$ $X_b$ $Y_b$ $Z_b$ ] that specify the
linear interpretation of the gamma-modified red, green, and blue
components, R', G', and B'. The default value is the identity matrix, [ 1 0 0
0 1 0 0 0 1 ]. The transformation from RGB to XYZ is given by:
$$X = R' X_r + G' X_g + B' X_b$$
$$Y = R' Y_r + G' Y_g + B' Y_b$$
$$Z = R' Z_r + G' Z_g + B' Z_b$$
An example of a CalRGB color space resource is shown here for D65
white point, 1.8 gammas, and Trinitron phosphor chromaticities.
12 0 obj
[/CalRGB
<<
/WhitePoint [0.9505 1 1.0890]
/Gamma [1.8 1.8 1.8]
/Matrix [ 0.4497 0.2446 0.0252 0.3163 0.6720 0.1412 0.1845
0.0833 0.9227 ]
>> ]
endobj
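The Gamma and Matrix entries of Table 6.37 together define a two-step conversion from calibrated RGB to CIE XYZ: each component is first raised to its gamma, and the gamma-modified components are then combined linearly using the nine Matrix numbers. The following Python sketch shows one way to carry out those two steps; the function name calrgb_to_xyz and its argument layout are hypothetical, and input components are assumed to lie in the range 0 to 1.

```python
def calrgb_to_xyz(rgb, gamma=(1.0, 1.0, 1.0),
                  matrix=(1, 0, 0, 0, 1, 0, 0, 0, 1)):
    """Map a CalRGB color (r, g, b) to CIE 1931 XYZ.

    `gamma` is [Gr Gg Gb]; `matrix` is [Xr Yr Zr Xg Yg Zg Xb Yb Zb].
    """
    # Step 1: gamma-modify each component.
    r, g, b = (component ** exponent for component, exponent in zip(rgb, gamma))
    # Step 2: linear combination using the Matrix entries.
    xr, yr, zr, xg, yg, zg, xb, yb, zb = matrix
    x = r * xr + g * xg + b * xb
    y = r * yr + g * yg + b * yb
    z = r * zr + g * zg + b * zb
    return (x, y, z)
```

Applied to the example resource above, the input (1, 1, 1) yields approximately (0.9505, 0.9999, 1.0891), which agrees with the stated WhitePoint to within rounding of the matrix entries.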
Lab color space
Colors in a Lab color space are represented by three values: the L*, a* and
b* components of the color. The ranges of each of the three values are
specified under the Range key in Table 6.38.
A Lab color space is specified by an array of the form:
[ /Lab dict ]
where the contents of dict are described in Table 6.38.
Table 6.38 Lab attributes
Key Type Semantics
WhitePoint array (Required) Same as for CalGray.
BlackPoint array (Optional) Same as for CalGray.
Range array (Optional) Four numbers [ $a_{min}$ $a_{max}$ $b_{min}$ $b_{max}$ ] specifying the range of the a*
and b* components. That is, a* and b* are limited by $a_{min} \le a^* \le a_{max}$ and
$b_{min} \le b^* \le b_{max}$. The default value is [ -100 100 -100 100 ]. The range of
L* is always 0 to 100.
CalCMYK color space
A CalCMYK color space is specified by an array of the form:
[ /CalCMYK dict ]
where the contents of dict are not defined. These contents will be defined in
a future version of PDF.
Implementation note The CalCMYK color space resource type has been partially defined with
the expectation that its definition will be completed in a future version of
PDF. PDF 1.1 viewers should ignore CalCMYK color space attributes and
render colors specified in this color space as if they had been specified
using DeviceCMYK.
Special color space resources
Indexed color space
Indexed color spaces allow colors to be specified by small integers that are
used as indexes into a table of color values. The values in this table are
colors specified in either the DeviceRGB or DeviceCMYK color space.
For example, an indexed color space can have white as color number 1, dark
blue as color number 2, turquoise as color number 3, and black as color
number 4.
An indexed color space is specified as follows:
[ /Indexed base hival lookup ]
The base color space is specified by base and must be either DeviceRGB
or DeviceCMYK. The maximum valid index value, specified by hival, is
determined by the number of colors desired in the indexed color space.
Colors will be specified by integers in the range 0 to hival. The color table
values are contained in lookup, which is a PDF stream. The stream contains
m × (hival + 1) bytes where m is the number of color components in the
base color space. Each byte is an unsigned integer in the range 0 to 255 that
is divided by 255, yielding a color component value in the range 0 to 1. The
color components for each entry in the table are adjacent in the stream. For
example, if the base color space is DeviceRGB and the indexed color
space contains two colors, the order of bytes in the stream is: $R_0$ $G_0$ $B_0$
$R_1$ $G_1$ $B_1$, where letters are the color component and numbers are the table
entry.
Example 6.20 shows a color space resource for an indexed color space.
Colors in the table are specified in the DeviceRGB color space, and the
table contains 256 entries. The stream containing the table has been LZW
and ASCII base-85 encoded.
Example 6.20 Color space resource for an indexed color space
12 0 obj
[ /Indexed /DeviceRGB 255 13 0 R ]
endobj
13 0 obj
<< /Filter [ /ASCII85Decode /LZWDecode ] /Length 554 >>
stream
J3Vsg-=dE=!]*)rE$,8^$P%cp+RI0B1)A)g_;FLE.V9
omitted data
bS/5%"OmlTJ=PC!c2]]^rh(A~>
endstream
endobj
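Once the lookup stream has been decoded, resolving an index is a matter of locating m adjacent bytes and dividing each by 255. The following Python sketch illustrates this; the function name indexed_lookup and its parameters are hypothetical, and the table is assumed to have already been passed through the stream's filters.

```python
def indexed_lookup(table, index, m, hival):
    """Return the base-space color for `index` as a tuple of m values in 0..1.

    `table` is the decoded lookup stream, `m` the number of components in
    the base color space (3 for DeviceRGB, 4 for DeviceCMYK).
    """
    if len(table) != m * (hival + 1):
        raise ValueError("lookup table must contain m * (hival + 1) bytes")
    if not 0 <= index <= hival:
        raise ValueError("index must be in the range 0 to hival")
    start = index * m  # components of one entry are adjacent in the stream
    return tuple(byte / 255 for byte in table[start:start + m])
```

For a two-entry DeviceRGB table, hival is 1 and the table holds 6 bytes; index 1 selects the bytes at offsets 3 to 5.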
Default color space resources
PDF 1.1 adds device-independent color spaces to the color spaces defined in
PDF 1.0. Because viewers for PDF 1.0 generally do not expect these new
color spaces and default gracefully when they are used, a second method for
specifying the use of a device-independent color space is provided in PDF
1.1. This second method allows an appropriate color space to be substituted
for either the DeviceGray or DeviceRGB color spaces. The substitution
is controlled by two special keys, DefaultGray and DefaultRGB, that can
be used in the ColorSpace dictionary of the Resources dictionary of the
current page (or inherited from a Pages object that is an ancestor of the
page). They are used as follows.
When a viewer is performing an operation that results in rendering to a
medium, there is always a current color space, which is established using
the operators of Section 7.4, Color operators, or using the ColorSpace
key of an Image resource or an in-line image. When the current color space
is DeviceGray, the ColorSpace dictionary of the Resources dictionary of
the current page is checked for the presence of the DefaultGray key. If this
key is present, then the color space that is the value of that key is used as the
color space for the operation currently being performed. The value of the
DefaultGray key may be either DeviceGray or a CalGray color space
specification.
Similarly, when the current color space is DeviceRGB, the ColorSpace
dictionary of the Resources dictionary of the current page is checked for the
presence of the DefaultRGB key. If this key is present, then the color space
that is the value of that key is used as the color space for the operation
currently being performed. The value of the DefaultRGB key may be
either DeviceRGB or a CalRGB color space specification.
Implementation note The Acrobat 1.0 viewer ignores DefaultRGB and DefaultGray.
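The substitution rule described above can be summarized as a lookup: when the current color space is DeviceGray or DeviceRGB, the viewer consults the page's ColorSpace dictionary for DefaultGray or DefaultRGB respectively and uses that value if present. The following Python sketch models this under simplifying assumptions: the ColorSpace dictionary is represented as a plain Python dict (already merged with any inherited entries), named color spaces are strings, array color spaces are lists, and the function name effective_color_space is hypothetical.

```python
# Maps each substitutable device color space to its controlling key.
DEFAULT_KEYS = {"DeviceGray": "DefaultGray", "DeviceRGB": "DefaultRGB"}


def effective_color_space(current, color_space_dict):
    """Return the color space actually used for the current operation."""
    # Only named device spaces can be substituted; array spaces pass through.
    key = DEFAULT_KEYS.get(current) if isinstance(current, str) else None
    if key is not None and key in color_space_dict:
        return color_space_dict[key]
    return current
```

DeviceCMYK has no corresponding default key in PDF 1.1, so it is always returned unchanged in this sketch.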
6.8.6 XObject resources
XObjects are named resources that appear in the XObject subdictionary
within the Resources dictionary of a page object. PDF currently supports
three types of XObjects: images, forms, and pass-through PostScript
language fragments. In the future it may support other object types.
XObjects are passed by name to the Do operator, described on page 164.
The action taken by the Do operator depends on the type of XObject passed
to it. In the case of images and forms, the Do operator draws the XObject.
Image resources
An Image resource is an XObject whose Subtype is Image. Image
resources allow a PDF page description to specify a sampled image or
image mask. PDF supports image masks, 1-, 2-, 4-, and 8-bit grayscale
images, and 1-, 2-, 4-, and 8-bit per component color images. Color images
may have three or four components representing either RGB or CMYK.
The sample data format and sample interpretation conform to the
conventions required by the PostScript language image and imagemask
operators. However, all PDF images have a size of 1 × 1 unit in user space,
and the data must be specified left-to-right, top-to-bottom. Like images in
the PostScript language, PDF images are sized and positioned by adjusting
the current transformation matrix in the page description.
An Image resource is specified by a stream object. The stream dictionary
must include the standard keys required of all streams as well as additional
ones described in the following table. Several of the keys are the same as
those required by the PostScript language image and imagemask
operators. Matching keys have the same semantics.
Table 6.39 Image resource attributes
Key Type Semantics
Type name (Required) Resource type. Always XObject.
Subtype name (Required) Resource subtype. Always Image.
Name name (Required for compatibility with PDF 1.0) Resource name, used as an
operand of the Do operator. Name must match the name used in the
XObject dictionary within the page's Resources dictionary.
Implementation note The Name key is ignored by all Acrobat viewers.
Width integer (Required) Width of the source image in samples.
Height integer (Required) Height of the source image in samples.
BitsPerComponent integer (Required) The number of bits used to represent each color component.
ColorSpace color space (Required for images, not allowed for image masks) Color space used for
the image samples. This may be any color space defined in PDF 1.1,
including a device-independent color space. However, for compatibility
with 1.0 viewers, the DefaultRGB or DefaultGray key should be used to
reference a device-independent color space, as described in the section on
Default color space resources on page 121.
Decode array (Optional) An array of numbers specifying the mapping from sample values
in the image to values appropriate for the current color space. The number
of elements in the array must be twice the number of color components in
the color space specified in the ColorSpace key. The default value results
in the image sample values being used directly. Decode arrays are described
further on page 124.
Interpolate boolean (Optional) If true, requests that image interpolation be performed.
Interpolation attempts to smooth transitions between sample values.
Interpolation may be performed differently by different devices, and not at
all by some. The default value is false.
ImageMask boolean (Optional) Specifies whether the image should be treated as a mask. If true,
the image is treated as a mask; BitsPerComponent must be 1,
ColorSpace should not be provided, and the mask is drawn using the
current fill color. If false, the image is not treated as a mask. The default
value is false.
Intent name (Optional) A name which is a color rendering intent indicating the style of
color rendering that should occur. For example, one might want to render
images in a perceptual or pleasing manner while rendering line art colors
with exact color matches. Intents are meaningful only for the device-
independent color spaces. For further details, see page 125.
Example 6.21 shows an image object. It is a monochrome (1-bit per
component, DeviceGray) image that is 24 samples wide and 23 samples
high. Interpolation is not requested and the default decode array is used. The
image is given the name Im0, which is used to refer to the image when it is
drawn.
Example 6.21 Image resource with length specified as an indirect object
5 0 obj
<<
/Type /XObject
/Subtype /Image
/Name /Im0
/Width 24
/Height 23
/BitsPerComponent 1
/ColorSpace /DeviceGray
/Filter /ASCIIHexDecode
/Length 6 0 R
>>
stream
003B00 002700 002480 0E4940
114920 14B220 3CB650 75FE88
17FF8C 175F14 1C07E2 3803C4
703182 F8EDFC B2BBC2 BB6F84
31BFC2 18EA3C 0E3E00 07FC00
03F800 1E1800 1FF800>
endstream
endobj
6 0 obj
174
endobj
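The amount of decoded sample data an image requires follows from its Width, Height, BitsPerComponent, and number of color components. The Python sketch below computes it under the assumption, taken from the PostScript language image conventions that PDF images follow, that each row of samples is padded to a whole number of bytes. The function name image_data_size is hypothetical.

```python
def image_data_size(width, height, bits_per_component, components=1):
    """Return the number of bytes of decoded sample data for an image.

    Assumes each row starts on a byte boundary, so partial bytes at the
    end of a row are padded.
    """
    bits_per_row = width * components * bits_per_component
    bytes_per_row = (bits_per_row + 7) // 8  # round up to a whole byte
    return bytes_per_row * height
```

For the image in Example 6.21 (24 samples wide, 23 high, 1 bit per component, one component) this gives 3 bytes per row and 69 bytes in total, which corresponds to the 23 six-digit hexadecimal groups in the stream.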
Decode arrays
A Decode array can be used to invert the colors in an image or to compress
or expand the range of values specified in the image data. Each pair of
numbers in a Decode array specifies the upper and lower values to which
the range of sample values in the image is mapped. A Decode array contains
one pair of numbers for each component in the color space specified in the
image. The mapping for each color component is a linear mapping that, for
a Decode array of the form [ $D_{Min}$ $D_{Max}$ ], can be written as:
$$o = D_{Min} + i \times \frac{D_{Max} - D_{Min}}{2^n - 1}$$
where:
n is the value of BitsPerComponent
i is the input value, in the range 0 to $2^n - 1$
$D_{Min}$ and $D_{Max}$ are the values specified in the Decode array
o is the output value, to be interpreted in the color space of the image.
Samples with a value of zero are mapped to $D_{Min}$, samples with a value of
$2^n - 1$ are mapped to $D_{Max}$, and samples with intermediate values are
mapped linearly between $D_{Min}$ and $D_{Max}$. The default Decode array for each
color component is [0 1], causing sample values in the range 0 to $2^n - 1$ to
be mapped to color values in the range 0 to 1. Table 6.40 shows the default
Decode arrays for various color spaces.
Table 6.40 Default Decode arrays for various color spaces
Color space Default Decode array
DeviceGray [0 1]
DeviceRGB [0 1 0 1 0 1]
DeviceCMYK [0 1 0 1 0 1 0 1]
Indexed [0 N] where N = $2^n - 1$
CalGray [0 1]
CalRGB [0 1 0 1 0 1]
Lab [0 100 $a_{Min}$ $a_{Max}$ $b_{Min}$ $b_{Max}$] where $a_{Min}$, $a_{Max}$, $b_{Min}$,
and $b_{Max}$ correspond to the entries in the Range
array of the image's color space. 0 and 100 are the
first two entries since the range of L* is always 0
to 100.
As an example of a Decode array, consider a DeviceGray image with 8 bits
per component. The color of each sample in a DeviceGray image is
represented by a single number. The default Decode array maps a sample
value of 0 to a color value of 0 and a sample value of 255 to a color value of
1. A negative image is produced by specifying a Decode array of [ 1 0 ],
which maps a sample value of 0 to a color value of 1 and a sample value of
255 to a color value of 0. If the image only contains values from 0 to
63 and is to be displayed using the full gray range of 0 to 1, a Decode array
of [ 0 4 ] should be used. With this Decode array, a sample value of 0 maps
to a color value of 0, a sample value of 255 maps to a color value of 4, and a
sample value of 63 (the maximum value in the example) maps to a color
value of 0.99.
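The worked figures above can be checked with a direct transcription of the linear Decode mapping for a single color component. The following Python sketch is illustrative only; the function name decode_sample and its parameter names are hypothetical.

```python
def decode_sample(i, n, d_min=0.0, d_max=1.0):
    """Map sample value i (0 to 2**n - 1) into the range d_min to d_max.

    n is BitsPerComponent; [d_min d_max] is the Decode pair for the
    component, defaulting to [0 1].
    """
    max_sample = (1 << n) - 1
    if not 0 <= i <= max_sample:
        raise ValueError("sample value out of range for BitsPerComponent")
    return d_min + i * (d_max - d_min) / max_sample
```

With an 8-bit DeviceGray image, decode_sample(63, 8, 0, 4) gives 252/255, or about 0.99, and decode_sample(0, 8, 1, 0) gives 1, matching the negative-image case.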
Color rendering intents
Implementation note The Acrobat 1.0 viewers display an error if an image specifies an Intent.
The supported color rendering intents and their meanings are given below in
Table 6.41. Other intents are permitted, but a viewer based on the PDF 1.1
specification will most likely ignore them. The default intent is
RelativeColorimetric.
Table 6.41 Color rendering intents
Name Semantics
AbsoluteColorimetric Requests an exact color (hue, saturation, and brightness) match. This is
appropriate for uses such as some line art or spot colors. If the exact color
cannot be displayed, the closest available one is substituted.
RelativeColorimetric Requests an exact hue/saturation match, but scales the brightness range so
that all brightnesses fit into the display device's brightness range. This is
often appropriate for line art and spot color. As a result of the brightness
scaling, the exact colors produced will differ on devices having different
brightness range capabilities. If the exact hue/saturation cannot be
displayed, the closest available one is substituted.
Perceptual Scales the hue, saturation, and brightness ranges so that all values can be
displayed on the output device. This generally provides a pleasing rendering
of scanned images. As a result of the scaling, all colors are modified
somewhat.
Saturation Emphasizes saturation. This is appropriate for business graphics.
Implementation note Because of the large gamut of most displays, version 2.0 of the Acrobat
viewers ignores the Intent key when displaying a PDF file and always uses
RelativeColorimetric. When printing to a PostScript printer, the Acrobat
viewers do not specify an intent unless one was explicitly specified.
Form resources
A form is a self-contained description of any text, graphics, or sampled
images that is drawn multiple times on several pages or at different
locations on a single page.
A Form resource is specified by a PDF stream. The keys in the stream
dictionary correspond to the keys in a PostScript language Form dictionary.
Unlike a PostScript language Form dictionary, the Form resource dictionary
does not contain a PaintProc key. Instead, the stream contents specify the
painting procedure. These contents must be described using the same
marking operators that are used for PDF page descriptions. As usual, the
stream must also include a Length key and may include Filter and
DecodeParms keys if the stream is encoded. Table 6.42 describes the
attributes of a Form resource dictionary.
To draw a form, the Do operator is used, with the name of the form to be
drawn given as an operand. As discussed in the introduction to Section 6.8,
Resources, this name is mapped to an object ID using the Resources
dictionary for the page on which the form is drawn.
Table 6.42 Form resource attributes
Key Type Semantics
Type name (Required) Resource type. Always XObject.
Subtype name (Required) Resource subtype. Always Form.
BBox array (Required) An array of four numbers that specifies the form's bounding box
in the form coordinate system. This bounding box is used to clip the output
of the form and to determine its size for caching.
FormType integer (Required) Must be 1.
Matrix matrix (Required) A transformation matrix that maps from the form's coordinate
space into user space.
Name name (Required) Resource name, used as an operand of the Do operator. Name
must match the name used in the XObject dictionary within the page's
Resources dictionary.
Resources dictionary (Optional) A list of the resources, such as fonts and images, required by this
form. The dictionary's format is the same as for the Resources dictionary in
a Page object. All resources used in the form must be included in the
Resources dictionary of the Page object on which the form appears,
regardless of whether or not they also appear in the Resources dictionary of
the form. It can be useful to also specify them in the form's Resources
dictionary in order to easily determine which resources are used inside the
form. If a resource is included in both dictionaries, it should have the same
name in both locations.
XUID array (Optional) An ID that uniquely identifies the form. This allows the form to
be cached after the first time it has been drawn in order to improve the speed
of subsequent redraws.
XUID arrays may contain any number of elements. The first element in an
XUID array is the organization ID. Forms that are used only in closed
environments may use 1000000 as the organization ID. Any value can be
used for subsequent elements, but the same values must not be used for
different forms. Organizations that plan to distribute forms widely and wish
to use XUIDs must obtain an organization ID from Adobe Systems
Incorporated, as described in Appendix E. Section 5.8 of the PostScript
Language Reference Manual, Second Edition provides a further explanation
of XUIDs.
Example 6.22 Form resource
6 0 obj
<<
/Type /XObject
/Subtype /Form
/Name /Fm0
/FormType 1
/BBox [ 0 0 1000 1000 ]
/Matrix [ 1 0 0 1 0 0 ]
/Length 38
>>
stream
0 0 m 0 1000 l 1000 1000 l 1000 0 l f
endstream
endobj
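The Length value in Example 6.22 has to agree with the stream data that follows it. The following Python sketch shows one way a producer might assemble such a Form resource and compute Length from the content. The function name build_form_xobject is hypothetical, and the sketch assumes that the stream data ends with a single newline that is counted in Length, which is consistent with the value 38 in the example.

```python
def build_form_xobject(obj_num, name, bbox, content):
    """Return the bytes of a Form XObject whose Length matches its stream."""
    # Assumption: one newline terminates the stream data and is counted
    # in Length, which reproduces the value 38 used in Example 6.22.
    data = content.encode("ascii") + b"\n"
    bbox_text = " ".join(str(n) for n in bbox)
    header = (
        f"{obj_num} 0 obj\n"
        "<<\n"
        "/Type /XObject\n"
        "/Subtype /Form\n"
        f"/Name /{name}\n"
        "/FormType 1\n"
        f"/BBox [ {bbox_text} ]\n"
        "/Matrix [ 1 0 0 1 0 0 ]\n"
        f"/Length {len(data)}\n"
        ">>\n"
        "stream\n"
    ).encode("ascii")
    return header + data + b"endstream\nendobj\n"


form = build_form_xobject(6, "Fm0", (0, 0, 1000, 1000),
                          "0 0 m 0 1000 l 1000 1000 l 1000 0 l f")
print(form.decode("ascii"))
```

Computing Length from the encoded data, rather than writing it by hand, avoids a mismatch when the painting procedure is edited.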
Pass-through PostScript language resources
PDF 1.1 enables a document to include PostScript language fragments in a
page description. These fragments are printer-dependent and take effect
only when printing on a PostScript printer. They have no effect either when
viewing the file or when printing to a non-PostScript printer. In addition,
applications that understand PDF are unlikely to be able to interpret the
PostScript language fragments. Hence, this capability should be used only if
there is no other way to achieve the same result.
A PostScript resource is an XObject whose Subtype key has the value PS.
When a document is printed to a PostScript printer, the contents of the
resource stream replace the Do command that references the resource. This
stream is copied without interpretation and may include PostScript
comments. In any other case, the resource is ignored. When printing to a
PostScript Level 1 printer, if the XObject contains a Level1 key, the value
of that key, which must be a stream, will be used instead of the contents of
the PostScript resource stream.
The PostScript fragment may use Type 1 and TrueType fonts listed in the
resources of the page containing the fragment. It may not use Type 3 fonts.
Note Pass-through PostScript resources should be used with extreme caution,
and only to obtain results not otherwise possible in PDF. Inappropriate use
of PostScript resources can cause PDF files to print incorrectly.
The PostScript resource is not compatible with 1.0 viewers. The following
method can be used instead to create PostScript pass-through data when
compatibility with 1.0 viewers is necessary. A form should be defined with
an empty stream content. It should include a BBox of all zeros, a
FormType of 1, and a Matrix that is the identity matrix. It should include
a Subtype2 key whose value is PS, and a PS key whose value is a stream
that contains the PostScript language pass-through data. It may also contain
a Level1 key as described previously in this section.
6.9 Info dictionary
A document's trailer may contain a reference to an Info dictionary that
provides information about the document. This optional dictionary may
contain one or more keys, whose values should be strings. These strings
may be displayed in an Acrobat viewer's Document Info dialog. The
characters in these strings are encoded using the predefined encoding
PDFDocEncoding, described in Appendix C.
Note Omit any key in the Info dictionary for which a value is not known, rather
than including it with an empty string as its value.
Table 6.43 PDF Info dictionary attributes
Key Type Semantics
Author string (Optional) The name of the person who created the document.
CreationDate string (Optional) The date the document was created. It should be in the format
described in Section 4.4, Strings.
ModDate string (Optional) The date the document was last modified. It should be in the
format described in Section 4.4, Strings.
Creator string (Optional) If the document was converted into a PDF document from
another form, this is the name of the application that created the original
document.
Producer string (Optional) The name of the application that converted the document from
its native format to PDF.
Title string (Optional) The document's title.
Subject string (Optional) The subject of the document.
Keywords string (Optional) Keywords associated with the document.
Info strings that are to be interpreted as dates must include the D: prefix (see
Section 4.4, Strings). In particular, the 1.0 key CreationDate and the 1.1
key ModDate should use this format. All Info strings that represent dates
should be displayed as a human-readable date. Other Info strings are
uninterpreted.
Info keys and strings may be added to or changed by users or extensions,
and some extensions may choose to permit searches on these keys. PDF 1.1
does not define short names for the keys in Table 6.43, to make it easier to
browse and edit Info dictionary entries. New names should be chosen with
care so that they make sense to users.
Although private data can be stored in the Info dictionary, it is more
appropriate to store it in the Catalog. This allows a user or program to alter
entries in the Info dictionary with less chance of unforeseen side effects.
Example 6.23 shows an example of an Info dictionary.
Example 6.23 Info dictionary
1 0 obj
<<
/Creator (Adobe Illustrator)
/CreationDate (D:19930204080603-08'00')
/Author (Werner Heisenberg)
/Producer (Acrobat Network Distiller 1.0 for Macintosh)
>>
endobj
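As an illustration of the rules above, the following Python sketch formats a date string in the style of the CreationDate entry in Example 6.23 and writes an Info dictionary that omits keys with no known value. The helper names pdf_date, pdf_string, and info_dictionary are hypothetical. The sketch assumes that the values contain only characters that PDFDocEncoding shares with ASCII, and that a zero time zone offset may be written as +00'00'; Section 4.4 remains the authority on the date format.

```python
from datetime import datetime, timedelta, timezone


def pdf_date(moment):
    """Format an aware datetime as a D: date string (see Example 6.23)."""
    offset = moment.utcoffset()
    if offset is None:
        raise ValueError("moment must carry a time zone offset")
    total_minutes = int(offset.total_seconds()) // 60
    sign = "-" if total_minutes < 0 else "+"
    hours, minutes = divmod(abs(total_minutes), 60)
    return f"D:{moment:%Y%m%d%H%M%S}{sign}{hours:02d}'{minutes:02d}'"


def pdf_string(text):
    """Write a literal string, escaping backslashes and parentheses."""
    escaped = text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")
    return f"({escaped})"


def info_dictionary(entries):
    """Write an Info dictionary, leaving out keys whose value is unknown."""
    lines = [f"/{key} {pdf_string(value)}"
             for key, value in entries.items() if value]
    return "<<\n" + "\n".join(lines) + "\n>>"


pacific = timezone(timedelta(hours=-8))
created = pdf_date(datetime(1993, 2, 4, 8, 6, 3, tzinfo=pacific))
print(created)  # D:19930204080603-08'00'
print(info_dictionary({"Author": "Werner Heisenberg",
                       "CreationDate": created,
                       "Title": None}))
```

The Title key is absent from the output because its value is not known, following the note at the start of this section.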
6.10 Articles
An article thread identifies related elements in a document, enabling a user
to follow a flow of information that may span multiple columns or pages.
A PDF document may include one or more article threads. Each thread has
a title and a list of thread elements, which are referred to as beads. A viewer
may allow the user to select a particular thread and then navigate through it;
the viewer automatically maintains a comfortable zoom level for reading
and moves from one bead to the next, rather than from one page to the next.
If a document includes any threads, they are stored in an array as the value
of the Threads key in the Catalog object. Each thread and its beads are
dictionaries. Table 6.44 lists the attributes of a Thread dictionary, and Table
6.45 lists the attributes of a Bead dictionary.
Table 6.44 Thread attributes
Key Type Semantics
F (First) dict (Required; must be an indirect reference) Specifies the bead that is the first
element of this thread.
I (Info) dict (Optional) Information about the thread. This dictionary should contain
information similar to the document's Info dictionary and should use the
same key names and data formats for entries that correspond to Info
dictionary entries. Entries in this dictionary should be strings encoded using
the predefined encoding PDFDocEncoding, described in Appendix C.
Table 6.45 Bead attributes
Key Type Semantics
T (Thread) dict (Required for the first bead of a thread; must be an indirect reference) The
thread of which this bead is the first element.
V (Prev) dict (Required; must be indirect) The previous bead of this thread; for the first
bead in a thread, V specifies the last bead in the thread.
N (Next) dict (Required; must be indirect) The next bead of this thread; for the last bead
in a thread, N specifies the first bead in the thread.
P (Page) dict (Required; must be indirect) The Page on which this bead appears.
R (Rect) array (Required) Rectangle specifying the location of this bead.
Example 6.24 shows a thread with three beads:
Example 6.24 Thread
22 0 obj
<< /F 23 0 R /I << /Title (Man Bites Dog) >> >>
endobj
23 0 obj
<< /T 22 0 R /V 25 0 R /N 24 0 R /P 8 0 R
/R [158 247 318 905] >>
endobj
24 0 obj
<< /V 23 0 R /N 25 0 R /P 8 0 R /R [322 246 486 904] >>
endobj
25 0 obj
<< /V 24 0 R /N 23 0 R /P 10 0 R /R [157 254 319 903] >>
endobj
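Because the V and N keys make the beads of a thread a circular, doubly linked list, a consumer can check a thread by following N from the first bead until it returns to the start. The Python sketch below illustrates this under simplifying assumptions: beads are held in a plain dictionary keyed by object number, only the V and N keys are modeled, and the function name walk_thread is hypothetical. The sample data is a three-bead thread in which the last bead (25) names bead 24 as its previous bead, as the definition of V requires.

```python
def walk_thread(first, beads):
    """Return bead object numbers in reading order, checking V against N."""
    order = []
    current = first
    while True:
        if current in order:
            raise ValueError(f"bead {current} repeats before the thread closes")
        order.append(current)
        nxt = beads[current]["N"]
        # The next bead's V key must point back at the bead we came from.
        if beads[nxt]["V"] != current:
            raise ValueError(f"bead {nxt}: V should be {current}")
        if nxt == first:
            return order
        current = nxt


beads = {
    23: {"V": 25, "N": 24},
    24: {"V": 23, "N": 25},
    25: {"V": 24, "N": 23},
}
print(walk_thread(23, beads))  # [23, 24, 25]
```

A last bead whose V key referred to itself would be rejected by this check, since V must name the bead whose N key leads to it.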
The Page object for each page on which beads appear should contain a B
key, as described in Section 6.4, Page objects. The value of this key is an
array of indirect references to each bead on the page, in drawing order.
Implementation note The thread array and dictionary objects are invisible to 1.0 viewers on all
platforms. Consequently, insert and delete pages operations will not carry
along any threads.
6.11 File ID
A PDF file may contain a reference to another PDF file. Storing a file name,
even in a platform-independent format, does not guarantee that the file can
be found, even if it exists and its name has not been changed. Different
server software applications often present different names for the same file.
For example, servers running on DOS platforms must convert all file names
to eight letters and a three-letter extension. Different servers use different
strategies for converting long names to this format.
References to PDF files can be made more reliable by making the PDF file
reference consist of two parts: (1) a normal operating system-based file
reference and (2) a file ID. The file ID characterizes the file and is stored
with the file. Placing a file ID with the file reference and in the file itself
increases the chances that a file reference can be resolved correctly.
Matching the ID in the reference with the ID in the file indicates whether
the desired file was found.
Implementation note The indexes created by the Acrobat Catalog application also contain
references to PDF files.
PDF 1.1 recommends that files have an ID key in their trailer. The value of
this key is an array of two strings. The first element is a permanent ID,
based on the contents of the file at the time the file was created. This ID
does not change when the file is incrementally updated. The second element
is a changing ID, based on the contents of the file at the time the file is
incrementally updated. When a file is first written, the IDs are set to the
same value. When resolving a file reference, if both IDs match, it is very
likely that the correct file has been found. If only the first ID matches, then a
different version of the correct file has been found.
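The matching rule can be sketched as a small comparison. In the following Python illustration, each ID is modeled as a pair (permanent, changing) of byte strings; the function name compare_file_ids and the three result labels are hypothetical and are not part of PDF.

```python
def compare_file_ids(reference_id, found_id):
    """Classify a candidate file against the ID stored with a file reference."""
    permanent_ref, changing_ref = reference_id
    permanent_found, changing_found = found_id
    if permanent_ref != permanent_found:
        return "no match"            # not the file that was referenced
    if changing_ref == changing_found:
        return "same file"           # both IDs match
    return "different version"       # only the permanent ID matches


original = (b"\x01\x02", b"\x01\x02")   # as first written: both IDs equal
updated = (b"\x01\x02", b"\x0a\x0b")    # after an incremental update
print(compare_file_ids(original, original))  # same file
print(compare_file_ids(original, updated))   # different version
```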
Implementation note Although the ID key is not required, all Adobe applications that produce
PDF will include this key. Acrobat Exchange will add this key when saving
a file if it is not present.
To help ensure the uniqueness of the file ID, it is recommended that the file ID be
computed using a message digest algorithm such as MD5, as described in
RFC 1321: The MD5 Message-Digest Algorithm [19]. It is recommended that
the following information be passed to the message digest algorithm:
the current time
a string representation of the location of the file, usually a path name
the document size in bytes
the value of each entry in the document's Info dictionary.
Implementation note Adobe applications pass this information to the MD5 message digest
algorithm to calculate file IDs. Note that the calculation of the file IDs need
not be reproducible. All that matters is that the file IDs are likely to be
unique. For example, two implementations of this algorithm might use
different formats for the current time. This will cause them to produce
different file IDs for the same file created at the same time, but this does not
affect the uniqueness of the ID.
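The recommendation above can be sketched with the MD5 implementation in the Python standard library. The function names make_file_id and new_id_array are hypothetical, and the way each input is converted to bytes is an assumption of this sketch. As noted above, the calculation need not be reproducible across implementations, so other conversions are equally acceptable.

```python
import hashlib
import time


def make_file_id(path, size_in_bytes, info, now=None):
    """Return a 16-byte MD5 digest of the four recommended inputs."""
    digest = hashlib.md5()
    stamp = time.time() if now is None else now
    digest.update(repr(stamp).encode("utf-8"))          # the current time
    digest.update(path.encode("utf-8"))                 # location of the file
    digest.update(str(size_in_bytes).encode("utf-8"))   # document size
    for key in sorted(info):                            # Info dictionary values
        digest.update(info[key].encode("utf-8"))
    return digest.digest()


def new_id_array(file_id):
    """Write the ID array for a newly written file: both IDs are the same."""
    hex_id = file_id.hex()
    return f"[ <{hex_id}> <{hex_id}> ]"


file_id = make_file_id("/docs/report.pdf", 48213,
                       {"Author": "Werner Heisenberg"})
print(new_id_array(file_id))
```

The path, size, and Info values shown are placeholders. On an incremental update, only the second element of the array would be replaced with a newly computed value.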
6.12 Encryption dictionary
Documents can be protected via encryption, as described in Section 5.7,
Encryption. Every protected document must have an Encrypt dictionary,
which specifies the security handler to be used to authorize access to the
document. The Encrypt dictionary also contains whatever additional
information the security handler chooses to store in it.
Table 6.46 describes the standard keys in the Encrypt dictionary. In addition
to the keys listed in the table, a security handler may add other key-value
pairs. Strings in the Encrypt dictionary must be encrypted and decrypted by
the security handler itself, using whatever encryption algorithm it chooses;
unlike other strings in a PDF file, they are not automatically encrypted and
decrypted.
Table 6.46 Encrypt dictionary attributes
Key Type Semantics
Filter name (Required) The security handler's name.
6.12.1 Security handlers
Security handlers authorize users to access the content of PDF files. They
may use whatever data they choose to do so, such as passwords, the
presence of a specific hardware key, or the output of a fingerprint scanner.
Implementation note Version 2.0 of the Acrobat viewers includes one built-in security handler,
described in the following section. Plug-ins can provide other security
handlers.
In addition to granting access to the contents of the file, a security handler
may grant permission to perform specific operations on the file.
Implementation note Version 2.0 of the Acrobat viewers supports the following permissions:
Printing the document.
Copying text and graphics in the document to the clipboard.
Modifying the document.
Adding notes to the document and modifying existing notes.
Security handlers can place whatever additional key-value pairs they wish
into the Encrypt dictionary. Examples of such data include permissions,
data that allows the security handler to determine which permissions a
particular user should be granted, or data needed for authorizing the user.
6.12.2 Standard security handler
Version 2.0 of the Acrobat viewers includes one built-in security handler,
whose name is Standard. This security handler supports two passwords
(owner and user) that are obtained via a password dialog box. The standard
security handler also supports restricted permissions for users. These
permissions can be set by the owner.
Table 6.47 describes the information in the Encrypt dictionary used by the
standard security handler.
Table 6.47 Standard security handler attributes
Key Type Semantics
R (Revision) number (Required) Revision number of algorithm used to encode data in this
dictionary. The revision number for the standard security handler in Acrobat
2.0 is 2.
U (User) string (Required) Data related to the password needed to open the file. This data is
used to determine whether the user entered the user password and whether
the file's permissions have been tampered with. This data is not an
encrypted form of the password, however.
O (Owner) string (Required) Data related to the password needed to gain full access to the file.
This data is used to determine whether the user entered the owner password
and whether the file's permissions have been tampered with. This data is not
an encrypted form of the owner password, however.
P (Permissions) string (Required) Permissions granted to a user who opens a file with the user
password.
CHAPTER 7
Page Descriptions
This chapter describes the PDF operators that draw text, graphics, and
images on the page. It completes the specification of PDF. The following
chapters describe how to produce efficient PDF files.
Text, graphics, and images are drawn using the coordinate systems
described in Chapter 3. It may be useful to refer to that chapter when
reading the description of various operators, to obtain a better understanding
of the coordinate systems used in PDF documents and the relationships
among them.
Appendix B contains a complete list of operators, arranged alphabetically.
Note Throughout this chapter, PDF operators are shown with a list of the
operands they require. A dash is used to indicate that an operator takes
no operands. In addition, for operators that correspond to one or more
PostScript language operators, the corresponding PostScript language
operators appear in bold on the first line of the operator's definition. An
operand specified as a number may be either integer or real. Otherwise,
numeric operands must be integer.
7.1 Overview
A PDF page description can be considered a sequence of graphics objects.
These objects generate marks that are applied to the current page, obscuring
any previous marks they may overlay.
PDF provides four types of graphics objects:
A path object is an arbitrary shape made of straight lines, rectangles, and
cubic curves. A path may intersect itself and may have disconnected
sections and holes. A path object includes a painting operator that
specifies whether the path is filled, stroked, and/or serves as a clipping
path.
A text object consists of one or more character strings that can be placed
anywhere on the page and in any orientation. Like a path, text can be
stroked, filled, and/or serve as a clipping path.
An image object consists of a set of samples using a specified color
model. Images can be placed anywhere on a page and in any orientation.
An XObject is a PDF object referenced by name. The interpretation of an
XObject depends on its type. PDF currently supports three types of
XObjects: images, forms, and pass-through PostScript language
fragments.
As described in Section 6.8, Resources, a PDF page description is not
necessarily self-contained. It often contains references to resources such as
fonts, forms, or images not found within the page description itself but
located elsewhere in the PDF file.
7.2 Graphics state
The exact effect of drawing a graphics or text object is determined by
parameters such as the current line thickness, font, and leading. These
parameters are part of the graphics state.
Although the contents of the PDF graphics state are similar to those of the
graphics state in the PostScript language, PDF extends the graphics state to
include separate stroke and fill colors and additional elements that affect
only text. The use of separate fill and stroke colors in PDF is necessary to
implement painting operators that both fill and stroke a path or text. The
additional text state enables the implementation of a more compact set of
text operators.
Tables 7.1 and 7.2 list the parameters in the graphics state, arranged
alphabetically. For each parameter, the table lists the operator that sets the
parameter, along with any restriction on where the operator may appear in a
page description. For convenience, the text-specific elements are listed
separately.
Note None of the graphics state operators may appear within a path.
Table 7.1 General graphics state parameters
Parameter          Operator      Operator may not appear
clipping path      See the description of the clipping path in Section 7.2.1, Clipping path.
CTM                cm            within a text object or path
current point      See the description of the current point in Section 7.2.3, Current point.
fill colorspace    g, rg, k, cs  within a path
stroke colorspace  G, RG, K, CS  within a path
fill color         g, rg, k, sc  within a path
stroke color       G, RG, K, SC  within a path
flatness           i             within a path
line cap style     J             within a path
line dash pattern  d             within a path
line join style    j             within a path
line width         w             within a path
miter limit        M             within a path
rendering intent   ri            within a path
Table 7.2 Text-specific graphics state parameters
Parameter                           Operator  Operator may not appear
character spacing after characters  Tc        within a path
word spacing                        Tw        within a path
character and word spacing          "         outside a text object
horizontal scaling                  Tz        within a path
leading                             TL        within a path
                                    TD        outside a text object
text font                           Tf        within a path
text matrix                         Tm        outside of a text object
text rise                           Ts        within a path
text size                           Tf        within a path
text rendering mode                 Tr        within a path
The graphics state is initialized at the beginning of each page, using the
default values specified in each of the graphics state operator descriptions.
PDF provides a graphics state stack for saving and restoring the graphics
state. PDF provides an operator that saves a copy of the entire graphics state
onto the graphics state stack. Another operator removes the most recently
saved graphics state from the stack and makes it the current graphics state.
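The save and restore behavior can be modeled as an ordinary stack of copies. The Python sketch below is an illustration only: the class name GraphicsStateStack and the parameter names in the example are hypothetical, and a real graphics state holds all of the parameters in Tables 7.1 and 7.2. The two methods correspond to the save and restore operators (q and Q).

```python
import copy


class GraphicsStateStack:
    """Toy model of the graphics state and its save/restore stack."""

    def __init__(self, defaults):
        self.current = dict(defaults)
        self._saved = []

    def save(self):
        # Push a copy of the entire current state onto the stack.
        self._saved.append(copy.deepcopy(self.current))

    def restore(self):
        # Pop the most recently saved state and make it current.
        if not self._saved:
            raise ValueError("restore without a matching save")
        self.current = self._saved.pop()


gs = GraphicsStateStack({"line width": 1, "dash": ([], 0)})
gs.save()
gs.current["line width"] = 4
gs.restore()
print(gs.current["line width"])  # 1
```

Copying the whole state on save keeps the model simple; changes made between a save and the matching restore are discarded together.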
Each of the elements in Table 7.1 is described in the following sections,
while the operators that set these parameters are described in Section 7.3,
Graphics state operators, and Section 7.4, Color operators. The
text-specific parameters listed in Table 7.2 are described in Section 7.6, Text
state, near the discussion of text objects. The operators that set them are
described in Sections 7.7.2, Text state operators, and 7.7.3, Text
positioning operators.
7.2.1 Clipping path
The clipping path restricts the region to which paint can be applied on a
page. Marks outside the region bounded by the clipping path are not
painted. Clipping paths may be specified either by a path, or by using one of
the clipping modes for text rendering. These are described in Section 7.5.3,
Path clipping operators, and Section 7.6.6, Text rendering mode.
7.2.2 CTM
The CTM is the matrix specifying the transformation from user space to
device space. It is described in Section 3.2, User space.
7.2.3 Current point
All drawing on a page makes use of the current point. In an analogy to
drawing on paper, the current point can be thought of as the location of the
pen used for drawing.
The current point must be set before graphics can be drawn on a page.
Several of the operators discussed in Section 7.5.1, Path segment
operators, set the current point. As a path object is constructed, the current
point is updated in the same way as a pen moves when drawing graphics on
a piece of paper. After the path is painted using the operators described in
Section 7.5.2, Path painting operators, the current point is undefined.
The current point also determines where text is drawn. Each time a text
object begins, the current point is set to the origin of the page's coordinate
system. Several of the operators described in Section 7.7.3, Text
positioning operators, change the current point. The current point is also
updated as text is drawn using the operators described in Section 7.7.4,
Text string operators.
7.2.4 Fill color
The fill color is used to paint the interior of paths and text characters that are
filled. Filling is described in Section 7.5.2, Path painting operators.
7.2.5 Flatness
Flatness sets the maximum permitted distance in device pixels between the
mathematically correct path and an approximation constructed from straight
line segments, as shown in Figure 7.1.
Note Flatness is inherently device-dependent, because it is measured in device
pixels.
Figure 7.1 Flatness
7.2.6 Line cap style
The line cap style specifies the shape to be used at the ends of open subpaths
when they are stroked. Allowed values are shown in Figure 7.2.
Figure 7.2 Line cap styles
7.2.7 Line dash pattern
The line dash pattern controls the pattern of dashes and gaps used to stroke
paths. It is specified by an array and a phase. The array specifies the length
of alternating dashes and gaps. The phase specifies the distance into the
dash pattern to start the dash. Both the elements of the array and the phase
are measured in user space units. Before beginning to stroke a path, the
array is cycled through, adding up the lengths of dashes and gaps. When the
sum of dashes and gaps equals the value specified by the phase, stroking of
the path begins, using the array from the point that has been reached. Figure
7.3 shows examples of line dash patterns. As can be seen from the figure,
the command [ ] 0 d can be used to restore the dash pattern to a solid line.
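The way the phase selects a starting point in the array can be sketched as follows. This Python illustration is not taken from any viewer: the function name dash_start and its return convention are hypothetical. It returns the index of the array element in effect when stroking begins, how much of that element remains, and whether that element is a dash or a gap. The sketch assumes that dash and gap roles keep alternating each time the array is cycled through.

```python
def dash_start(array, phase):
    """Return (index, remaining, is_dash) for the point where stroking begins.

    Returns None for an empty array, which denotes a solid line.
    """
    if not array:
        return None
    if phase < 0 or min(array) < 0 or sum(array) <= 0:
        raise ValueError("phase and lengths must be non-negative, "
                         "and the lengths must not all be zero")
    index = 0
    is_dash = True          # the first element of the array is a dash
    remaining = phase
    # Cycle through the array, consuming whole dashes and gaps until the
    # accumulated length reaches the phase.
    while remaining >= array[index]:
        remaining -= array[index]
        index = (index + 1) % len(array)
        is_dash = not is_dash
    return index, array[index] - remaining, is_dash


print(dash_start([2, 1], 0))   # (0, 2, True): starts with a full 2-unit dash
print(dash_start([3, 5], 6))   # (1, 2, False): starts 2 units before a gap ends
print(dash_start([], 0))       # None: solid line
```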
Figure 7.2, line cap style 0 (default): Butt end caps. The stroke is squared off at the
endpoint of the path.
7.7 Text operators
A PDF text object consists of operators that specify character strings,
movement of the current point, and text state. A text object begins with the
BT operator and ends with the ET operator.
<text object> ::= BT
<text operator or graphics state operator>*
ET
Note The graphics state operators q, Q, and cm cannot appear within a text
object.
When BT is encountered, the text matrix is initialized to the identity matrix.
When ET is encountered, the text matrix is discarded. Text objects cannot
be nested; a second BT cannot appear before an ET.
Note If a page does not contain any text, no text operators (including operators
that merely set the text state) may be present in the page description.
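The pairing rules for text objects lend themselves to a simple check. The Python sketch below assumes that a page description has already been reduced to a list of operator names with the operands removed; the function name check_text_objects is hypothetical. It enforces only the rules stated in this section: BT and ET must alternate, and q, Q, and cm must not appear within a text object.

```python
def check_text_objects(operators):
    """Raise ValueError if BT/ET pairing or placement rules are broken."""
    inside = False
    for position, op in enumerate(operators):
        if op == "BT":
            if inside:
                raise ValueError(f"operator {position}: BT inside a text object")
            inside = True
        elif op == "ET":
            if not inside:
                raise ValueError(f"operator {position}: ET without BT")
            inside = False
        elif inside and op in ("q", "Q", "cm"):
            raise ValueError(f"operator {position}: {op} inside a text object")
    if inside:
        raise ValueError("text object not closed by ET")


check_text_objects(["q", "cm", "BT", "Tf", "Td", "Tj", "ET", "Q"])  # passes
```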
7.7.1 Text object operators
BT Begins a text object. Initializes the text matrix to the identity matrix.
ET Ends a text object. Discards the text matrix.
7.7.2 Text state operators
These operators set the text-specific parameters in the graphics state.
Note These operators can appear outside of text objects, and the values they set
are retained across text objects on a single page. Like other graphics state
parameters, the values are initialized to the default values at the beginning
of each page.
charSpace Tc Set character spacing
Sets the character spacing parameter, which determines the amount of
space after a character, in the graphics state. Character spacing is used,
together with word spacing, by the Tj, TJ, and ' operators to calculate
spacing of text within a line. charSpace is a number expressed in text
space units and has a default value of 0.
fontname size Tf Set font and size
Sets the text font and text size in the graphics state. There is no default value
for either fontname or size; they must be selected using Tf before drawing
any text. fontname is a resource name. size is a number expressed in text
space units.
leading TL Set text leading
Sets the leading parameter in the graphics state. Leading is used by the T*,
' , and " operators to calculate the position of the next line of text. The TL
operator need not be used in a PDF le unless the T*, ', or " operators are
used. leading is a number expressed in text space units and has a default
value of 0.
render Tr Set the text rendering mode
render is an integer and has a default value of 0.
rise Ts Set text rise
Moves the baseline vertically by rise units. This operator is used for
superscripting and subscripting. rise is a number expressed in text space
units and has a default value of 0.
wordSpace Tw Set word spacing
Sets the word spacing parameter in the graphics state. Word spacing is used,
together with character spacing, by the Tj, TJ, and ' operators to calculate
spacing of text within a line. wordSpace is a number expressed in text
space units and has a default value of 0.
scale Tz Set horizontal scaling
Sets the horizontal scaling parameter in the graphics state. scale is a
number expressed in percent of the normal scaling and has a default value
of 100.
7.7.3 Text positioning operators
A text object keeps track of the current point and the start of the current line.
The text string operators move the current point like the various forms of the
PostScript language show operator. Operators that move the start of the
current line move the current point as well.
Note These operators may appear only within text objects.
t_x t_y Td Moves to the start of the next line, offset from the start of the current line by
(t_x, t_y). t_x and t_y are numbers expressed in text space units.
t_x t_y TD Moves to the start of the next line, offset from the start of the current line by
(t_x, t_y). As a side effect, this sets the leading parameter in the graphics state,
used by the T*, ', and " operators. t_x and t_y are numbers expressed in text
space units. The value assigned to the leading is the negative of t_y.
a b c d e f Tm Sets the text matrix and sets the current point and line start position to the
origin. The operands are all numbers, and the default matrix is [1 0 0 1 0 0].
Although the operands specify a matrix, they are passed as six numbers, not
an array.
Note The matrix specified by the operands passed to the Tm operator is not
concatenated onto the current text matrix, but replaces it.
T* Moves to the start of the next line. The x-coordinate is the same as that of
the most recent TD, Td, or Tm operation, and the y-coordinate equals that of
the current line minus the leading.
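The relationship among Td, TD, T*, and the leading can be sketched with a small model that tracks only the start of the current line, in text space. The class name TextLineState is hypothetical, and the model deliberately ignores the text matrix and the Tm operator. In PDF the leading belongs to the graphics state and outlives a single text object; the sketch keeps it alongside the line start only for brevity.

```python
class TextLineState:
    """Toy model of the line start position and leading (text space units)."""

    def __init__(self, leading=0.0):
        self.line_start = (0.0, 0.0)   # reset at the start of each text object
        self.leading = leading         # default value of the leading is 0

    def Td(self, tx, ty):
        # Move to the next line, offset from the start of the current line.
        x, y = self.line_start
        self.line_start = (x + tx, y + ty)

    def TD(self, tx, ty):
        # Same move as Td; as a side effect the leading becomes -ty.
        self.leading = -ty
        self.Td(tx, ty)

    def T_star(self):
        # Same x as the current line start; y decreases by the leading.
        self.Td(0.0, -self.leading)


state = TextLineState()
state.TD(72, -14)          # first line at (72, -14); leading is now 14
state.T_star()             # second line at (72, -28)
print(state.line_start, state.leading)  # (72.0, -28.0) 14
```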
7.7.4 Text string operators
These operators draw text on the page. Although it is possible to pass
individual characters to the text string operators, text searching performs
significantly better if the text is grouped by word and paragraph.
PDF supports the same conventions as the PostScript language for
specifying non-printable ASCII characters. That is, a character can be
represented by an escape sequence, as enumerated in Table 4.1 on page 32.
Note The default current point is at the page origin. Therefore, unless some prior
operation in the same text object changes the current point, the text will
appear at the origin. It is suggested that a Tm operation be used to
establish the initial current point in a text object at the position in text space
where initial text is to appear. Subsequent text operations may change the
current point.
string Tj Shows text string, using the character and word spacing parameters from the
graphics state.
string ' Moves to next line and shows text string, using the character and word
spacing parameters from the graphics state.
a_w a_c string " Moves to next line and shows text string. a_w and a_c are numbers expressed
in text space units. a_w specifies the additional space width and a_c specifies
the additional space between characters, otherwise specified using the Tw
and Tc operators.
Note The values specified by a_w and a_c remain the word and character spacings
after the " operator is executed, as though they were set using the Tc and
Tw operators.
[ number or string ] TJ Shows text string, allowing individual character positioning, and using the
character and word spacing parameters from the graphics state. For each
element of the array that is passed as an operand, if the element is a string,
shows the string. If it is a number, moves the current point to the left by the
given amount, expressed in thousandths of an em. (An em is a typographic
unit of measurement equal to the size of a font; for example, in a 12-point
font an em is 12 points.)
Each character is first justified according to any character and word spacing
settings made with the Tc or Tw operators, and then any numeric offset
present in the array passed to the TJ operator is applied. An example of the
use of TJ is shown in Figure 7.17.
Note When using the TJ operator, the x-coordinate of the current point after
drawing a character and moving by any specified offset must not be less
than the x-coordinate of the current point before the character was drawn.
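The arithmetic behind a numeric element of the TJ array is a simple scaling: a value n moves the current point to the left by n/1000 of an em, that is, by n/1000 times the font size. The Python sketch below walks a TJ array and reports the x position after each element. The function name layout_tj is hypothetical, and string_width stands in for a lookup of character widths that the caller must supply. Character spacing, word spacing, and horizontal scaling are left out to keep the example short.

```python
def layout_tj(elements, font_size, string_width):
    """Return the x position after each element of a TJ array."""
    x = 0.0
    positions = []
    for element in elements:
        if isinstance(element, str):
            # A string advances the current point by its width.
            x += string_width(element)
        else:
            # A number moves the current point LEFT, in thousandths of an em.
            x -= element / 1000.0 * font_size
        positions.append(x)
    return positions


def fixed_width(text):
    # Hypothetical metric: every character is 6 units wide (half an em at 12).
    return 6.0 * len(text)


# In a 12-point font, 120 thousandths of an em is about 1.44 points.
print(layout_tj(["A", 120, "W"], 12, fixed_width))
```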
Figure 7.17 Operation of TJ operator
7.8 XObject operator
The Do operator permits the execution of an arbitrary object whose data is
encapsulated within a PDF object. The currently supported XObjects are
images and PostScript language forms, discussed in Section 6.8.6, XObject
resources.
xobject Do Executes the specified XObject. xobject must be a resource name.
MacRomanEncoding, MacExpertEncoding, and WinAnsiEncoding may be used in
Font and Encoding objects. PDFDocEncoding is the encoding used in outline
entries, text annotations, and strings in the Info dictionary.
StandardEncoding is the built-in encoding for many fonts.
This appendix contains three tables describing these encodings. The
first table shows all encodings except MacExpertEncoding and is
arranged alphabetically by character name. The second table is similar,
except that it is arranged numerically by character code. The third table
shows the encoding for MacExpertEncoding, which is shown in a
separate table because it has a substantially different character set than
the other encodings.
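The tables give each character code in both decimal and octal. The octal form is the one needed when a character is written as a \ddd escape inside a PDF string. The following Python sketch shows the conversion; the function names are made up for this illustration.

```python
def octal_escape(code):
    """Return the three-digit octal escape for a one-byte character code."""
    if not 0 <= code <= 255:
        raise ValueError("character codes are single bytes (0-255)")
    return "\\%03o" % code


def literal_string(codes):
    """Build a parenthesized PDF string made only of octal escapes."""
    return "(" + "".join(octal_escape(c) for c in codes) + ")"


# AE is decimal 225 in StandardEncoding but 198 in WinAnsiEncoding,
# so the same character needs a different escape in each encoding.
standard_ae = octal_escape(225)
winansi_ae = octal_escape(198)
```

The results agree with the octal columns of the tables: 225 is listed as 341 and 198 as 306.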
PDF Reference Manual February 23, 1996 Predefined Font Encodings
248 Appendix C: Predefined Font Encodings
C.1 Predefined encodings sorted by character name
Char Name
StandardEncoding MacRomanEncoding WinAnsiEncoding PDFDocEncoding
Decimal Octal Decimal Octal Decimal Octal Decimal Octal
A
A 65 101 65 101 65 101 65 101
AE 225 341 174 256 198 306 198 306
Aacute 231 347 193 301 193 301
Acircumflex 229 345 194 302 194 302
Adieresis 128 200 196 304 196 304
Agrave 203 313 192 300 192 300
Aring 129 201 197 305 197 305
Atilde 204 314 195 303 195 303
B
B 66 102 66 102 66 102 66 102
C
C 67 103 67 103 67 103 67 103
Ccedilla 130 202 199 307 199 307
D
D 68 104 68 104 68 104 68 104
E
E 69 105 69 105 69 105 69 105
Eacute 131 203 201 311 201 311
Ecircumflex 230 346 202 312 202 312
Edieresis 232 350 203 313 203 313
Egrave 233 351 200 310 200 310
Eth 208 320 208 320
F
F 70 106 70 106 70 106 70 106
G
G 71 107 71 107 71 107 71 107
H
H 72 110 72 110 72 110 72 110
I
I 73 111 73 111 73 111 73 111
Iacute 234 352 205 315 205 315
Icircumflex 235 353 206 316 206 316
Idieresis 236 354 207 317 207 317
Igrave 237 355 204 314 204 314
J
J 74 112 74 112 74 112 74 112
K
K 75 113 75 113 75 113 75 113
L
L 76 114 76 114 76 114 76 114
Lslash 232 350 149 225
M
M 77 115 77 115 77 115 77 115
N
N 78 116 78 116 78 116 78 116
Ntilde 132 204 209 321 209 321
O
O 79 117 79 117 79 117 79 117
OE 234 352 206 316 140 214 150 226
Oacute 238 356 211 323 211 323
Ocircumflex 239 357 212 324 212 324
Odieresis 133 205 214 326 214 326
Ograve 241 361 210 322 210 322
Oslash 233 351 175 257 216 330 216 330
Otilde 205 315 213 325 213 325
P
P 80 120 80 120 80 120 80 120
Q
Q 81 121 81 121 81 121 81 121
R
R 82 122 82 122 82 122 82 122
S
S 83 123 83 123 83 123 83 123
Scaron 138 212 151 227
T
T 84 124 84 124 84 124 84 124
Thorn 222 336 222 336
U
U 85 125 85 125 85 125 85 125
Uacute 242 362 218 332 218 332
Ucircumflex 243 363 219 333 219 333
Udieresis 134 206 220 334 220 334
Ugrave 244 364 217 331 217 331
V
V 86 126 86 126 86 126 86 126
W
W 87 127 87 127 87 127 87 127
X
X 88 130 88 130 88 130 88 130
Y
Y 89 131 89 131 89 131 89 131
Yacute 221 335 221 335
Ydieresis 217 331 159 237 152 230
Z
Z 90 132 90 132 90 132 90 132
Zcaron 153 231
a
a 97 141 97 141 97 141 97 141
aacute 135 207 225 341 225 341
acircumflex 137 211 226 342 226 342
acute 194 302 171 253 180 264 180 264
adieresis 138 212 228 344 228 344
ae 241 361 190 276 230 346 230 346
agrave 136 210 224 340 224 340
&
ampersand 38 46 38 46 38 46 38 46
aring 140 214 229 345 229 345
^
asciicircum 94 136 94 136 94 136 94 136
~
asciitilde 126 176 126 176 126 176 126 176
*
asterisk 42 52 42 52 42 52 42 52
@
at 64 100 64 100 64 100 64 100
atilde 139 213 227 343 227 343
b
b 98 142 98 142 98 142 98 142
\
backslash 92 134 92 134 92 134 92 134
|
bar 124 174 124 174 124 174 124 174
{
braceleft 123 173 123 173 123 173 123 173
}
braceright 125 175 125 175 125 175 125 175
[
bracketleft 91 133 91 133 91 133 91 133
]
bracketright 93 135 93 135 93 135 93 135
breve 198 306 249 371 24 30
brokenbar 166 246 166 246
bullet 183 267 165 245 149 225 128 200
c
c 99 143 99 143 99 143 99 143
caron 207 317 255 377 25 31
ccedilla 141 215 231 347 231 347
cedilla 203 313 252 374 184 270 184 270
cent 162 242 162 242 162 242 162 242
circumflex 195 303 246 366 136 210 26 32
:
colon 58 72 58 72 58 72 58 72
,
comma 44 54 44 54 44 54 44 54
copyright 169 251 169 251 169 251
currency 168 250 219 333 164 244 164 244
d
d 100 144 100 144 100 144 100 144
dagger 178 262 160 240 134 206 129 201
daggerdbl 179 263 224 340 135 207 130 202
degree 161 241 176 260 176 260
dieresis 200 310 172 254 168 250 168 250
divide 214 326 247 367 247 367
$
dollar 36 44 36 44 36 44 36 44
dotaccent 199 307 250 372 27 33
dotlessi 245 365 245 365 154 232
e
e 101 145 101 145 101 145 101 145
eacute 142 216 233 351 233 351
ecircumflex 144 220 234 352 234 352
edieresis 145 221 235 353 235 353
egrave 143 217 232 350 232 350
8
eight 56 70 56 70 56 70 56 70
ellipsis 188 274 201 311 133 205 131 203
emdash 208 320 209 321 151 227 132 204
endash 177 261 208 320 150 226 133 205
=
equal 61 75 61 75 61 75 61 75
eth 240 360 240 360
!
exclam 33 41 33 41 33 41 33 41
exclamdown 161 241 193 301 161 241 161 241
f
f 102 146 102 146 102 146 102 146
fi 174 256 222 336 147 223
5
five 53 65 53 65 53 65 53 65
fl 175 257 223 337 148 224
florin 166 246 196 304 131 203 134 206
4
four 52 64 52 64 52 64 52 64
fraction 164 244 218 332 135 207
g
g 103 147 103 147 103 147 103 147
germandbls 251 373 167 247 223 337 223 337
`
grave 193 301 96 140 96 140 96 140
>
greater 62 76 62 76 62 76 62 76
guillemotleft 171 253 199 307 171 253 171 253
guillemotright 187 273 200 310 187 273 187 273
guilsinglleft 172 254 220 334 139 213 136 210
guilsinglright 173 255 221 335 155 233 137 211
h
h 104 150 104 150 104 150 104 150
hungarumlaut 205 315 253 375 28 34
-
hyphen 45 55 45 55 45 55 45 55
i
i 105 151 105 151 105 151 105 151
iacute 146 222 237 355 237 355
icircumflex 148 224 238 356 238 356
idieresis 149 225 239 357 239 357
igrave 147 223 236 354 236 354
j
j 106 152 106 152 106 152 106 152
k
k 107 153 107 153 107 153 107 153
l
l 108 154 108 154 108 154 108 154
<
less 60 74 60 74 60 74 60 74
logicalnot 194 302 172 254 172 254
lslash 248 370 155 233
m
m 109 155 109 155 109 155 109 155
macron 197 305 248 370 175 257 175 257
minus 138 212
mu 181 265 181 265 181 265
multiply 215 327 215 327
n
n 110 156 110 156 110 156 110 156
9
nine 57 71 57 71 57 71 57 71
ntilde 150 226 241 361 241 361
#
numbersign 35 43 35 43 35 43 35 43
o
o 111 157 111 157 111 157 111 157
oacute 151 227 243 363 243 363
ocircumflex 153 231 244 364 244 364
odieresis 154 232 246 366 246 366
oe 250 372 207 317 156 234 156 234
ogonek 206 316 254 376 29 35
ograve 152 230 242 362 242 362
1
one 49 61 49 61 49 61 49 61
onehalf 189 275 189 275
onequarter 188 274 188 274
onesuperior 185 271 185 271
ordfeminine 227 343 187 273 170 252 170 252
ordmasculine 235 353 188 274 186 272 186 272
oslash 249 371 191 277 248 370 248 370
otilde 155 233 245 365 245 365
p
p 112 160 112 160 112 160 112 160
paragraph 182 266 166 246 182 266 182 266
(
parenleft 40 50 40 50 40 50 40 50
)
parenright 41 51 41 51 41 51 41 51
%
percent 37 45 37 45 37 45 37 45
.
period 46 56 46 56 46 56 46 56
periodcentered 180 264 225 341 183 267 183 267
perthousand 189 275 228 344 137 211 139 213
+
plus 43 53 43 53 43 53 43 53
plusminus 177 261 177 261 177 261
q
q 113 161 113 161 113 161 113 161
?
question 63 77 63 77 63 77 63 77
questiondown 191 277 192 300 191 277 191 277
"
quotedbl 34 42 34 42 34 42 34 42
quotedblbase 185 271 227 343 132 204 140 214
quotedblleft 170 252 210 322 147 223 141 215
quotedblright 186 272 211 323 148 224 142 216
quoteleft 96 140 212 324 145 221 143 217
quoteright 39 47 213 325 146 222 144 220
quotesinglbase 184 270 226 342 130 202 145 221
'
quotesingle 169 251 39 47 39 47 39 47
r
r 114 162 114 162 114 162 114 162
registered 168 250 174 256 174 256
Note In the WinAnsiEncoding, the hyphen character can also be accessed using a
character code of 173, the space using 160, and bullets are used for the
otherwise unused character codes 127, 128, 129, 141, 142, 143, 144, 157, and
158.
ring 202 312 251 373 176 260 30 36
s
s 115 163 115 163 115 163 115 163
scaron 154 232 157 235
section 167 247 164 244 167 247 167 247
;
semicolon 59 73 59 73 59 73 59 73
7
seven 55 67 55 67 55 67 55 67
6
six 54 66 54 66 54 66 54 66
/
slash 47 57 47 57 47 57 47 57
space 32 40 32, 202 40, 312 32 40 32 40
sterling 163 243 163 243 163 243 163 243
t
t 116 164 116 164 116 164 116 164
thorn 254 376 254 376
3
three 51 63 51 63 51 63 51 63
threequarters 190 276 190 276
threesuperior 179 263 179 263
tilde 196 304 247 367 152 230 31 37
trademark 170 252 153 231 146 222
2
two 50 62 50 62 50 62 50 62
twosuperior 178 262 178 262
u
u 117 165 117 165 117 165 117 165
uacute 156 234 250 372 250 372
ucircumflex 158 236 251 373 251 373
udieresis 159 237 252 374 252 374
ugrave 157 235 249 371 249 371
_
underscore 95 137 95 137 95 137 95 137
v
v 118 166 118 166 118 166 118 166
w
w 119 167 119 167 119 167 119 167
x
x 120 170 120 170 120 170 120 170
y
y 121 171 121 171 121 171 121 171
yacute 253 375 253 375
ydieresis 216 330 255 377 255 377
yen 165 245 180 264 165 245 165 245
z
z 122 172 122 172 122 172 122 172
zcaron 158 236
0
zero 48 60 48 60 48 60 48 60
C.2 Predefined encodings sorted by character code
Note Character codes 0 through 23 are not used in any of the predefined encodings.
Code
StandardEncoding MacRomanEncoding WinAnsiEncoding PDFDocEncoding
Decimal Octal
24 30 breve
25 31 caron
26 32 circumflex
27 33 dotaccent
28 34 hungarumlaut
29 35 ogonek
30 36 ring
31 37 tilde
32 40 space space space space
33 41 exclam exclam exclam exclam
34 42 quotedbl quotedbl quotedbl quotedbl
35 43 numbersign numbersign numbersign numbersign
36 44 dollar dollar dollar dollar
37 45 percent percent percent percent
38 46 ampersand ampersand ampersand ampersand
39 47 quoteright quotesingle quotesingle quotesingle
40 50 parenleft parenleft parenleft parenleft
41 51 parenright parenright parenright parenright
42 52 asterisk asterisk asterisk asterisk
43 53 plus plus plus plus
44 54 comma comma comma comma
45 55 hyphen hyphen hyphen hyphen
46 56 period period period period
47 57 slash slash slash slash
48 60 zero zero zero zero
49 61 one one one one
50 62 two two two two
51 63 three three three three
52 64 four four four four
53 65 five five five five
54 66 six six six six
55 67 seven seven seven seven
56 70 eight eight eight eight
57 71 nine nine nine nine
58 72 colon colon colon colon
59 73 semicolon semicolon semicolon semicolon
60 74 less less less less
61 75 equal equal equal equal
62 76 greater greater greater greater
63 77 question question question question
64 100 at at at at
65 101 A A A A
66 102 B B B B
67 103 C C C C
68 104 D D D D
69 105 E E E E
70 106 F F F F
71 107 G G G G
72 110 H H H H
73 111 I I I I
74 112 J J J J
75 113 K K K K
76 114 L L L L
77 115 M M M M
78 116 N N N N
79 117 O O O O
80 120 P P P P
81 121 Q Q Q Q
82 122 R R R R
83 123 S S S S
84 124 T T T T
85 125 U U U U
86 126 V V V V
87 127 W W W W
88 130 X X X X
89 131 Y Y Y Y
90 132 Z Z Z Z
91 133 bracketleft bracketleft bracketleft bracketleft
92 134 backslash backslash backslash backslash
93 135 bracketright bracketright bracketright bracketright
94 136 asciicircum asciicircum asciicircum asciicircum
95 137 underscore underscore underscore underscore
96 140 quoteleft grave grave grave
97 141 a a a a
98 142 b b b b
99 143 c c c c
100 144 d d d d
101 145 e e e e
102 146 f f f f
103 147 g g g g
104 150 h h h h
105 151 i i i i
106 152 j j j j
107 153 k k k k
108 154 l l l l
109 155 m m m m
110 156 n n n n
111 157 o o o o
112 160 p p p p
113 161 q q q q
114 162 r r r r
115 163 s s s s
116 164 t t t t
117 165 u u u u
118 166 v v v v
119 167 w w w w
120 170 x x x x
121 171 y y y y
122 172 z z z z
123 173 braceleft braceleft braceleft braceleft
124 174 bar bar bar bar
125 175 braceright braceright braceright braceright
126 176 asciitilde asciitilde asciitilde asciitilde
127 177 bullet
128 200 Adieresis bullet bullet
129 201 Aring bullet dagger
130 202 Ccedilla quotesinglbase daggerdbl
131 203 Eacute florin ellipsis
132 204 Ntilde quotedblbase emdash
133 205 Odieresis ellipsis endash
134 206 Udieresis dagger florin
135 207 aacute daggerdbl fraction
136 210 agrave circumflex guilsinglleft
137 211 acircumflex perthousand guilsinglright
138 212 adieresis Scaron minus
139 213 atilde guilsinglleft perthousand
140 214 aring OE quotedblbase
141 215 ccedilla bullet quotedblleft
142 216 eacute bullet quotedblright
143 217 egrave bullet quoteleft
144 220 ecircumflex bullet quoteright
145 221 edieresis quoteleft quotesinglbase
146 222 iacute quoteright trademark
147 223 igrave quotedblleft fi
148 224 icircumflex quotedblright fl
149 225 idieresis bullet Lslash
150 226 ntilde endash OE
151 227 oacute emdash Scaron
152 230 ograve tilde Ydieresis
153 231 ocircumflex trademark Zcaron
154 232 odieresis scaron dotlessi
155 233 otilde guilsinglright lslash
156 234 uacute oe oe
157 235 ugrave bullet scaron
158 236 ucircumflex bullet zcaron
159 237 udieresis Ydieresis
160 240 dagger space
161 241 exclamdown degree exclamdown exclamdown
162 242 cent cent cent cent
163 243 sterling sterling sterling sterling
164 244 fraction section currency currency
165 245 yen bullet yen yen
166 246 florin paragraph brokenbar brokenbar
167 247 section germandbls section section
168 250 currency registered dieresis dieresis
169 251 quotesingle copyright copyright copyright
170 252 quotedblleft trademark ordfeminine ordfeminine
171 253 guillemotleft acute guillemotleft guillemotleft
172 254 guilsinglleft dieresis logicalnot logicalnot
173 255 guilsinglright hyphen
174 256 fi AE registered registered
175 257 fl Oslash macron macron
176 260 degree degree
177 261 endash plusminus plusminus plusminus
178 262 dagger twosuperior twosuperior
179 263 daggerdbl threesuperior threesuperior
180 264 periodcentered yen acute acute
181 265 mu mu mu
182 266 paragraph paragraph paragraph
183 267 bullet periodcentered periodcentered
184 270 quotesinglbase cedilla cedilla
185 271 quotedblbase onesuperior onesuperior
186 272 quotedblright ordmasculine ordmasculine
187 273 guillemotright ordfeminine guillemotright guillemotright
188 274 ellipsis ordmasculine onequarter onequarter
189 275 perthousand onehalf onehalf
190 276 ae threequarters threequarters
191 277 questiondown oslash questiondown questiondown
192 300 questiondown Agrave Agrave
193 301 grave exclamdown Aacute Aacute
194 302 acute logicalnot Acircumflex Acircumflex
195 303 circumflex Atilde Atilde
196 304 tilde florin Adieresis Adieresis
197 305 macron Aring Aring
198 306 breve AE AE
199 307 dotaccent guillemotleft Ccedilla Ccedilla
200 310 dieresis guillemotright Egrave Egrave
201 311 ellipsis Eacute Eacute
202 312 ring space Ecircumflex Ecircumflex
203 313 cedilla Agrave Edieresis Edieresis
204 314 Atilde Igrave Igrave
205 315 hungarumlaut Otilde Iacute Iacute
206 316 ogonek OE Icircumflex Icircumflex
207 317 caron oe Idieresis Idieresis
208 320 emdash endash Eth Eth
209 321 emdash Ntilde Ntilde
210 322 quotedblleft Ograve Ograve
211 323 quotedblright Oacute Oacute
212 324 quoteleft Ocircumflex Ocircumflex
213 325 quoteright Otilde Otilde
214 326 divide Odieresis Odieresis
215 327 multiply multiply
216 330 ydieresis Oslash Oslash
217 331 Ydieresis Ugrave Ugrave
218 332 fraction Uacute Uacute
219 333 currency Ucircumflex Ucircumflex
220 334 guilsinglleft Udieresis Udieresis
221 335 guilsinglright Yacute Yacute
222 336 fi Thorn Thorn
223 337 fl germandbls germandbls
224 340 daggerdbl agrave agrave
225 341 AE periodcentered aacute aacute
226 342 quotesinglbase acircumflex acircumflex
227 343 ordfeminine quotedblbase atilde atilde
228 344 perthousand adieresis adieresis
229 345 Acircumflex aring aring
230 346 Ecircumflex ae ae
231 347 Aacute ccedilla ccedilla
232 350 Lslash Edieresis egrave egrave
233 351 Oslash Egrave eacute eacute
234 352 OE Iacute ecircumflex ecircumflex
235 353 ordmasculine Icircumflex edieresis edieresis
236 354 Idieresis igrave igrave
237 355 Igrave iacute iacute
238 356 Oacute icircumflex icircumflex
239 357 Ocircumflex idieresis idieresis
240 360 eth eth
241 361 ae Ograve ntilde ntilde
242 362 Uacute ograve ograve
243 363 Ucircumflex oacute oacute
244 364 Ugrave ocircumflex ocircumflex
245 365 dotlessi dotlessi otilde otilde
246 366 circumflex odieresis odieresis
247 367 tilde divide divide
248 370 lslash macron oslash oslash
249 371 oslash breve ugrave ugrave
250 372 oe dotaccent uacute uacute
251 373 germandbls ring ucircumflex ucircumflex
252 374 cedilla udieresis udieresis
253 375 hungarumlaut yacute yacute
254 376 ogonek thorn thorn
255 377 caron ydieresis ydieresis
C.3 MacExpert encoding
Char Name           Decimal  Octal    Char Name           Decimal  Octal
AEsmall             190      276      Lslashsmall         194      302
Aacutesmall         135      207      Lsmall              108      154
Acircumflexsmall    137      211      Macronsmall         244      364
Acutesmall          39       47       Msmall              109      155
Adieresissmall      138      212      Nsmall              110      156
Agravesmall         136      210      Ntildesmall         150      226
Aringsmall          140      214      OEsmall             207      317
Asmall              97       141      Oacutesmall         151      227
Atildesmall         139      213      Ocircumflexsmall    153      231
Brevesmall          243      363      Odieresissmall      154      232
Bsmall              98       142      Ogoneksmall         242      362
Caronsmall          174      256      Ogravesmall         152      230
Ccedillasmall       141      215      Oslashsmall         191      277
Cedillasmall        201      311      Osmall              111      157
Circumflexsmall     94       136      Otildesmall         155      233
Csmall              99       143      Psmall              112      160
Dieresissmall       172      254      Qsmall              113      161
Dotaccentsmall      250      372      Ringsmall           251      373
Dsmall              100      144      Rsmall              114      162
Eacutesmall         142      216      Scaronsmall         167      247
Ecircumflexsmall    144      220      Ssmall              115      163
Edieresissmall      145      221      Thornsmall          185      271
Egravesmall         143      217      Tildesmall          126      176
Esmall              101      145      Tsmall              116      164
Ethsmall            68       104      Uacutesmall         156      234
Fsmall              102      146      Ucircumflexsmall    158      236
Gravesmall          96       140      Udieresissmall      159      237
Gsmall              103      147      Ugravesmall         157      235
Hsmall              104      150      Usmall              117      165
Hungarumlautsmall   34       42       Vsmall              118      166
Iacutesmall         146      222      Wsmall              119      167
Icircumflexsmall    148      224      Xsmall              120      170
Idieresissmall      149      225      Yacutesmall         180      264
Igravesmall         147      223      Ydieresissmall      216      330
Ismall              105      151      Ysmall              121      171
Jsmall              106      152      Zcaronsmall         189      275
Ksmall              107      153      Zsmall              122      172
ampersandsmall      38       46       lsuperior           241      361
asuperior           129      201      msuperior           247      367
bsuperior           245      365      nineinferior        187      273
centinferior        169      251      nineoldstyle        57       71
centoldstyle        35       43       ninesuperior        225      341
centsuperior        130      202      nsuperior           246      366
colon               58       72       onedotenleader      43       53
colonmonetary       123      173      oneeighth           74       112
comma               44       54       onefitted           124      174
commainferior       178      262      onehalf             72       110
commasuperior       248      370      oneinferior         193      301
dollarinferior      182      266      oneoldstyle         49       61
dollaroldstyle      36       44       onequarter          71       107
dollarsuperior      37       45       onesuperior         218      332
dsuperior           235      353      onethird            78       116
eightinferior       165      245      osuperior           175      257
eightoldstyle       56       70       parenleftinferior   91       133
eightsuperior       161      241      parenleftsuperior   40       50
esuperior           228      344      parenrightinferior  93       135
exclamdownsmall     214      326      parenrightsuperior  41       51
exclamsmall         33       41       period              46       56
ff                  86       126      periodinferior      179      263
ffi                 89       131      periodsuperior      249      371
ffl                 90       132      questiondownsmall   192      300
fi                  87       127      questionsmall       63       77
figuredash          208      320      rsuperior           229      345
fiveeighths         76       114      rupiah              125      175
fiveinferior        176      260      semicolon           59       73
fiveoldstyle        53       65       seveneighths        77       115
fivesuperior        222      336      seveninferior       166      246
fl                  88       130      sevenoldstyle       55       67
fourinferior        162      242      sevensuperior       224      340
fouroldstyle        52       64       sixinferior         164      244
foursuperior        221      335      sixoldstyle         54       66
fraction            47       57       sixsuperior         223      337
hyphen              45       55       space               32       40
hypheninferior      95       137      ssuperior           234      352
hyphensuperior      209      321      threeeighths        75       113
isuperior           233      351      threeinferior       163      243
threeoldstyle       51       63
threequarters       73       111
threequartersemdash 61       75
threesuperior       220      334
tsuperior           230      346
twodotenleader      42       52
twoinferior         170      252
twooldstyle         50       62
twosuperior         219      333
twothirds           79       117
zeroinferior        188      274
zerooldstyle        48       60
zerosuperior        226      342
APPENDIX D
Implementation Limits
In general, PDF does not restrict the size or quantity of things described in
the file format, such as numbers, arrays, images, and so on. However, a PDF
viewer application running on a particular processor and in a particular
operating environment does have such limits. If a viewer application
attempts to perform an action that exceeds one of the limits, it will display
an error.
PostScript interpreters also have implementation limits, listed in Appendix
B of the PostScript Language Reference Manual, Second Edition. It is
possible to construct a PDF file that does not violate viewer application
limits but will not print on a PostScript printer. Keep in mind that these
limits vary according to the PostScript language level, interpreter version,
and the amount of memory available to the interpreter.
All limits are sufficiently large that most PDF files should never approach
them. However, using the techniques described in Chapters 8 through 12 of
this book will further reduce the chance of reaching these limits.
This appendix describes typical limits for Acrobat Exchange and Acrobat
Reader. These limits fall into two main classes:
• Architectural limits. The hardware on which a viewer application
executes imposes certain constraints. For example, an integer is usually
represented in 32 bits, limiting the range of allowed integers. In addition,
the design of the software imposes other constraints, such as a limit of
65,535 elements in an array or string.
• Memory limits. The amount of memory available to a viewer application
limits the number of memory-consuming objects that can be held
simultaneously.
PDF itself has one architectural limit. Because ten digits are allocated to
byte offsets, the size of a file is limited to 10^10 bytes (approximately 10GB).
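The ten-digit limit can be made concrete with a small Python sketch. The function name and the choice to raise an error are assumptions for illustration; only the ten-digit field width comes from the text above.

```python
MAX_FILE_SIZE = 10 ** 10  # ten decimal digits can address offsets 0 .. 10^10 - 1


def format_offset(offset):
    """Format a byte offset as a fixed-width, zero-padded ten-digit field."""
    if not 0 <= offset < MAX_FILE_SIZE:
        raise ValueError("offset does not fit in ten decimal digits")
    return "%010d" % offset


largest = format_offset(MAX_FILE_SIZE - 1)
```

Any object that starts at or beyond byte 10^10 cannot be addressed, which is why the file size itself is bounded.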
Table D.1 describes the architectural limits for most PDF viewer
applications running on 32-bit machines. These limits are likely to remain
constant across a wide variety of implementations. However, memory limits
will often be exceeded before architectural limits, such as the limit on the
number of PDF objects, are reached.
Table D.1 Architectural limits
Quantity         Limit           Explanation
integer          2,147,483,647   Largest positive value, 2^31 - 1.
                 -2,147,483,648  Largest negative value, -2^31.
real             ±32,767         Approximate range of values.
                 ±1/65,536       Approximate smallest non-zero value.
                 5               Approximate number of decimal digits of precision in fractional part.
array            65,535          Maximum number of elements in an array.
dictionary       65,535          Maximum number of key-value pairs in a dictionary.
string           65,535          Maximum number of characters in a string.
name             127             Maximum number of characters in a name.
indirect object  250,000         Maximum number of indirect objects in a PDF file.
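A PDF producer that wants to stay within these limits could check values before writing them. The Python sketch below covers only the integer, array, dictionary, string, and name rows of Table D.1; the function name and the is_name flag are assumptions made for this example, and Python lists, dicts, and strings stand in for the corresponding PDF objects.

```python
INT_MAX = 2 ** 31 - 1       # 2,147,483,647
INT_MIN = -(2 ** 31)        # -2,147,483,648
MAX_ELEMENTS = 65535        # arrays, dictionaries, and strings
MAX_NAME_LENGTH = 127


def exceeds_limit(value, is_name=False):
    """Return True if value is outside the architectural limits of Table D.1."""
    if isinstance(value, bool):
        return False  # booleans carry no size limit
    if isinstance(value, int):
        return not INT_MIN <= value <= INT_MAX
    if isinstance(value, str):
        limit = MAX_NAME_LENGTH if is_name else MAX_ELEMENTS
        return len(value) > limit
    if isinstance(value, (list, tuple, dict)):
        return len(value) > MAX_ELEMENTS
    return False
```

Because names have a much smaller limit than strings, the same 128-character text is acceptable as a string but not as a name.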
Memory limits cannot be characterized so precisely, because the amount of
available memory and the way in which it is allocated vary from one
implementation to another.
Memory is automatically reallocated from one use to another when
necessary. When more memory is needed for a particular purpose, it can be
taken away from memory allocated to another purpose if that memory is
currently unused or its use is non-essential (a cache, for example). Also,
data is often saved to a temporary file when memory is limited. Because of
this behavior, it is not possible to state limits for such items as the number of
pages, number of text annotations or hypertext links on a page, number of
graphics objects on a page, or number of fonts on a page or in a document.
Version 1.0 of Acrobat Exchange and Acrobat Reader have some additional
architectural limits:
• Thumbnails may be no larger than 106×106 samples, and should be
created at one-eighth scale for 8.5×11 inch and A4 size pages.
Thumbnails should use either the DeviceGray or direct or indexed
DeviceRGB color space.
• The minimum allowed page size is 1×1 inch (72×72 units in the default
user space coordinate system), and the maximum allowed page size is
45×45 inches (3240×3240 units in the default user space coordinate
system).
• The zoom factor of a view is constrained to be between 12% and 800%,
regardless of the zoom factor specified in the PDF file.
• When Acrobat Exchange or Acrobat Reader reads a PDF file with a
damaged or missing cross-reference table, it attempts to rebuild the table
by scanning all the objects in the file. However, the generation numbers
of deleted entries are lost if the cross-reference table is missing or
severely damaged. Reconstruction fails if any object identifiers do not
occur at the start of a line or if the endobj keyword does not appear at
the start of a line. Also, reconstruction fails if a stream contains a line
beginning with the word endstream, aside from the required
endstream that delimits the end of the stream.
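The requirement that object identifiers begin a line suggests why a line-oriented scan can rebuild the table. The Python sketch below is only a guess at the general idea and is not the algorithm Acrobat uses: the function name and the regular expression are assumptions, and it makes no attempt to skip stream data or to pair each object with its endobj.

```python
import re

# "N G obj" is accepted only at the very start of the data or right after
# a carriage return or line feed, mirroring the start-of-line requirement.
OBJ_AT_LINE_START = re.compile(rb"(?<![^\r\n])(\d+)[ \t]+(\d+)[ \t]+obj\b")


def scan_objects(data):
    """Map (object number, generation number) to the byte offset of the object."""
    table = {}
    for match in OBJ_AT_LINE_START.finditer(data):
        key = (int(match.group(1)), int(match.group(2)))
        table[key] = match.start()
    return table


sample = (b"%PDF-1.1\n"
          b"1 0 obj\n<< >>\nendobj\r"
          b"2 0 obj\n(x)\nendobj\n"
          b"  3 0 obj\nendobj\n")
rebuilt = scan_objects(sample)
```

In the sample, the third object is indented and so is not found, which is the kind of file for which the text says reconstruction fails.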
APPENDIX E
Obtaining XUIDs and
Technical Notes
Creators of widely distributed forms who wish to use the XUID mechanism
must obtain an organization ID from Adobe Systems Incorporated at the
addresses listed below.
Technical notes, technical support, and periodic mailings are available to
members of the Adobe Developers Association. In particular, the PostScript
language software development kit (SDK) contains all the technical notes
mentioned in this book. The Adobe Developers Association can be
contacted at the addresses listed below:
Europe:
Adobe Developers Association
Adobe Systems Europe B.V.
Europlaza
Hoogoorddreef 54a
1101 BE Amsterdam Z-O
The Netherlands
Telephone: +44-131-458-6800
Fax: +44-131-458-6801
U.S. and the rest of the world:
Adobe Developers Association
Adobe Systems Incorporated
1585 Charleston Road
P.O. Box 7900
Mountain View, CA 94039-7900
Telephone: (415) 961-4111
Fax: (415) 969-4138
In addition, some technical notes and other information may be available
from Adobe's World Wide Web server
http://www.adobe.com
and from an anonymous ftp site
ftp.adobe.com
When accessing the anonymous ftp site, use anonymous as the user name,
and provide your E-mail address as the password (for example,
smith@adobe.com).
APPENDIX F
PDF Name Registry
With the introduction of Adobe Acrobat 2.0, it has become easy for third
parties to add private data to PDF documents and to add plug-ins that
change viewer behavior based on this data. However, Acrobat users have
certain expectations when opening a PDF document, no matter what plug-
ins are available. PDF enforces certain restrictions on private data in order
to meet these expectations.
A PDF producer or Acrobat viewer plug-in may define new action,
destination, annotation, and security handler types. If a user opens a PDF
document and the plug-in that implements the new type of object is
unavailable, the viewers will behave as described in Appendix G.2, "Viewer
compatibility behavior."
A PDF producer or Acrobat plug-in may also add keys to any PDF object
that is implemented as a dictionary except the trailer dictionary.
To avoid conflicts with third-party names and with future versions of PDF,
Adobe maintains a registry, similar to the registry it maintains for Document
Structuring Conventions. Third-party developers must only add private data
that conforms to the registry rules. The registry includes three classes:
• First-class: Names and data of value to a wide range of developers. All
the names defined in PDF 1.0 and 1.1 are first-class names. Plug-ins that
are publicly available should often use first-class names for their private
data. First-class names and data formats must be registered with Adobe,
and will be made available for all developers to use. To submit a private
data name and format for consideration as first-class, contact Adobe's
Developer Support group, as described later in this section.
• Second-class: Names that are applicable to a specific developer.
(Adobe does not register second-class data formats.) Adobe distributes
second-class names by registering developer-specific prefixes, which
must be used as the first characters in the names of all private data added
by the developer. Adobe will not register the same prefix to two different
developers, ensuring that different developers' second-class names will
not conflict. It is up to each developer to ensure that they do not use the
same name in conflicting ways themselves. To request a prefix for
second-class names, contact Adobe's Developer Support group, as
described later in this section.
• Third-class: Names that can be used only in files that will never be
seen by other third parties, because they may conflict with third-class
names defined by others. Third-class names all begin with a specific
prefix reserved by Adobe for private plug-ins; this prefix is XX. This
prefix must be used as the first characters in the names of all private data
added by the developer. It is not necessary to contact Adobe to register
third-class names.
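The three classes can be told apart by a simple prefix test. The Python sketch below is illustrative only: the function name is invented, the set of first-class names and the developer prefix "ACME" are hypothetical sample data, and only the XX prefix comes from the text above.

```python
THIRD_CLASS_PREFIX = "XX"  # reserved by Adobe for private plug-ins


def classify_name(name, first_class_names, registered_prefixes):
    """Classify a private-data name according to the registry rules."""
    if name in first_class_names:
        return "first-class"
    if name.startswith(THIRD_CLASS_PREFIX):
        return "third-class"
    for prefix in registered_prefixes:
        if name.startswith(prefix):
            return "second-class"
    return "unregistered"


# Hypothetical sample data, not taken from any real registry.
first_class = {"Type", "Subtype"}
prefixes = {"ACME"}
kinds = [classify_name(n, first_class, prefixes)
         for n in ("Type", "ACMEColorTag", "XXScratch", "Mystery")]
```

A name that falls into none of the classes is the case the registry is meant to prevent, so a producer would normally treat "unregistered" as an error.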
Note: New keys for the Info dictionary in the Catalog and in Threads need not be
registered.
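As an illustration of the three classes, the following Python sketch classifies a private data name by its prefix. Only the XX prefix comes from the text above. The sample first-class names and the registered prefix table are hypothetical stand-ins for the contents of Adobe's registry, which is not reproduced here.

```python
# Hypothetical stand-ins for the contents of Adobe's registry.
FIRST_CLASS_NAMES = {"Type", "Subtype", "Rect"}   # small sample of names defined by PDF
REGISTERED_PREFIXES = {"ACME": "Acme Software"}   # invented second-class prefix

THIRD_CLASS_PREFIX = "XX"  # reserved by Adobe for private plug-ins


def classify_name(name: str) -> str:
    """Return the registry class that a private data name would fall into."""
    if name in FIRST_CLASS_NAMES:
        return "first-class"
    if name.startswith(THIRD_CLASS_PREFIX):
        return "third-class"
    for prefix in REGISTERED_PREFIXES:
        if name.startswith(prefix):
            return "second-class"
    # Neither defined by PDF nor covered by a known prefix.
    return "unregistered"
```

A name reported as "unregistered" by a check of this kind is one that risks conflicting with other developers' data or with future versions of PDF.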
To register either first- or second-class names, contact Adobe's Developer
Support group at (415) 961-4111, or send e-mail to
devsup-person@adobe.com
APPENDIX G
Compatibility
The goal of the Adobe Acrobat family of products is to enable people to
easily and reliably exchange and view electronic documents. Ideally, "easily
and reliably" means that any Acrobat viewer should be able to display the
contents of any PDF file, even if the PDF file was created long before or long
after the viewer. Of course, new versions of viewers are introduced to
provide additional capabilities not present before. Furthermore, beginning
with Acrobat 2.0, viewers may accept plug-in extensions, making some
Acrobat 2.0 viewers more capable than others depending on what
extensions are present. Both the viewers and PDF itself have been designed
to enable users to view everything in the document that the viewer
understands and to ignore or inform the user about objects not understood.
The decision whether to ignore or inform the user is made on a
feature-by-feature basis.
The original PDF specification did not specify how a viewer should behave
when it reads a file that does not conform to the specification. This
addendum provides this information. The PDF version number associated
with a file determines how it should be treated when a viewer encounters a
problem.
G.1 Version numbers
The PDF version number consists of a major and minor version. The
version number is part of the PDF header, the first line of the file. This
header takes the form:
%PDF-M.m
where M is the major number and m is the minor number.
If PDF changes in such a way that current viewers are unlikely to be able to
read a document without a serious error, the major version number will be
incremented. A serious error is an error that prevents pages from being
viewed. Adding a new filter type for page contents would require a change
in the major version number. Adding a new page description operator would
not.
If PDF changes in such a way that a viewer will display an error message but
continue its work, the minor version number will change. Adding new page
description operators would require a change in the minor version number.
If PDF changes in a way that current viewers are unlikely to detect, the
version number need not change. This includes the addition of private data
that can be gracefully ignored by consumers that do not understand that
data. An example is adding a key to a dictionary object such as the Catalog.
An Acrobat viewer will try to read any file with a valid PDF header, even if
the version number is newer than the viewer itself. It will read without
errors any file that does not require a plug-in, even if the version number is
older than the viewer. Some documents may require a plug-in to display an
annotation or execute a link or bookmark action. Viewer behavior in this
situation is described below. However, a plug-in is never required to display
the contents of a page.
If a viewer opens a document with a newer major version number than it
expects, it warns the user that it is unlikely to be able to read the document
successfully and that the user will not be able to change or save the
document. At the first error related to document processing, the viewer will
notify the user that an error has occurred but that no further errors will be
reported. (Some errors will always be reported, including file I/O errors,
extension loading errors, out-of-memory errors, and notification that a
command failed.) Processing will continue if possible. Acrobat Exchange
will not permit a document with a newer major version number to be
inserted into another document.
If a viewer opens a document with a newer minor version number than it
expects, it silently remembers the version number. Only if it encounters an
error does it alert the user. At this point it notifies the user that the document
is newer than expected, that an error has occurred, and that no further errors
will be reported. The document may not be incrementally saved but can be
saved to a new file. The saved file will continue to have the new version
number. A user may insert a document with a newer minor version into
another document. The resulting document can be saved. Its version number
will be the maximum of the version number of the original document and
the documents inserted into the original.
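The rules in the preceding two paragraphs can be summarized in a short Python sketch. The function names, the dictionary keys, and the representation of a version as a (major, minor) tuple are assumptions made for illustration. They are not part of any Acrobat interface.

```python
def open_policy(doc_version, viewer_version):
    """Summarize how a viewer treats a document, following the rules above.

    Versions are (major, minor) tuples. The returned dict is a hypothetical
    summary, not an Acrobat data structure.
    """
    doc_major, doc_minor = doc_version
    view_major, view_minor = viewer_version
    if doc_major > view_major:
        # Newer major version: warn immediately, no saving, no insertion.
        return {"warn_on_open": True, "incremental_save": False,
                "save_as_new_file": False, "can_insert": False}
    if doc_major == view_major and doc_minor > view_minor:
        # Newer minor version: silent until the first error occurs.
        return {"warn_on_open": False, "incremental_save": False,
                "save_as_new_file": True, "can_insert": True}
    return {"warn_on_open": False, "incremental_save": True,
            "save_as_new_file": True, "can_insert": True}


def merged_version(original, inserted):
    """Version of a document after insertions: the maximum of all versions."""
    return max([original, *inserted])
```

Tuples compare element by element in Python, so max() picks the highest major number first and then the highest minor number, which matches the "maximum of the version number" rule.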
When opening a file, the Acrobat viewers are very liberal in their check for a
valid PDF header. All viewers allow the header to appear anywhere in the
first 1,000 bytes of the file. The 1.0 viewers require only that "%PDF-"
appear in the header and ignore the rest of the header. The 2.0 viewers
search for a header of the form described above. However, they also accept
a header of the form:
%!PS-Adobe-N.n PDF-M.m
where N.n is an Adobe Document Structuring Conventions version number
and M.m is a PDF version number. (The PostScript Language Reference
Manual describes the Document Structuring Conventions.)
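A liberal header check of the kind described for the 2.0 viewers might look like the following Python sketch. The function name and the decision to require the whole header to lie inside the first 1,000 bytes are assumptions. The text does not say how a header that straddles that boundary is treated.

```python
import re

# The two header forms accepted by the 2.0 viewers, as described above.
_PLAIN = re.compile(rb"%PDF-(\d+)\.(\d+)")
_DSC = re.compile(rb"%!PS-Adobe-\d+\.\d+ PDF-(\d+)\.(\d+)")


def find_pdf_version(data: bytes, window: int = 1000):
    """Return (major, minor) if a header appears in the first `window` bytes.

    Returns None when neither header form is found.
    """
    head = data[:window]
    for pattern in (_PLAIN, _DSC):
        match = pattern.search(head)
        if match:
            return int(match.group(1)), int(match.group(2))
    return None
```

For example, find_pdf_version(b"junk\n%PDF-1.1\n") yields (1, 1), and a file whose first 1,000 bytes contain no header yields None.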
G.2 Viewer compatibility behavior
This section describes how the Acrobat 1.0 and 2.0 viewers behave when
encountering items that do not conform to the PDF 1.0 specification. It is
planned that future Acrobat viewers will behave the same as Acrobat 2.0
viewers.
G.2.1 Dictionary keys
Adding key-value pairs not described in the PDF specification to dictionary
objects usually does not affect the behavior of 1.0 viewers and never affects
the behavior of Acrobat 2.0 viewers. These keys are ignored. If a dictionary
object such as an annotation is copied into another document during a page
insertion (or in Acrobat 2.0 viewers during a page extraction), all key-value
pairs are copied. If a value is an indirect reference to another object, that
object may be copied as well, depending on the key.
In some cases a 1.0 viewer will display an error if it finds an unknown key
in a dictionary. These cases are keys in image dictionaries (both XObjects
and in-line images) and keys in DecodeParms dictionaries for filters.
See Appendix F for information on how to choose key names that are
compatible with future versions of PDF.
G.2.2 Annotations
An annotation is a dictionary element of a page's Annots array. Its
Subtype specifies the kind of annotation it is. Only Text and Link are
defined by PDF 1.0. If a 1.0 viewer reads a page with an annotation whose
Subtype is not Text or Link, it displays an error. It displays one error per
page no matter how many annotations are present.
An Acrobat 2.0 viewer displays unknown annotations in a closed form
similar to text annotations, with an icon containing a question mark. If the
user opens the annotation, an alert appears with a message giving the
annotation type and explaining that an unavailable plug-in is required to
open it. An unknown annotation can be selected, moved, and deleted. Every
annotation type must specify its position and size using the Rect key.
G.2.3 Destinations and actions
A link or a bookmark in PDF 1.0 is a dictionary that contains a Dest key
that specifies a new view of the document that should be displayed when the
link or bookmark is activated. A destination is an array. Its first element is a
name that serves as the destination type, which determines the interpretation
of subsequent array elements. If a 1.0 viewer encounters an unknown
destination type, no action is performed and no error is reported when the
user activates the link or bookmark. An Acrobat 2.0 viewer will display a
message when it finds an unknown destination type.
PDF 1.1 adds several new destination types described in Section 6.6.3,
"Destinations." That section also describes actions, which have superseded
destinations in PDF 1.1. An Acrobat 1.0 viewer ignores actions. It does
nothing if it does not find a Dest key in a link or bookmark.
G.2.4 XObjects
An XObject is a stream or dictionary that is referred to by name from a page
description by the Do operator. The effect of the operator is determined by
the type of the XObject. PDF 1.0 supports Image and Form XObjects. A 1.0
viewer displays an error for each XObject of a different type, no matter how
many are on a page.
Plug-ins may not add XObject types, since they are considered part of the
page and a viewer without plug-ins should always be able to display a page.
If an Acrobat 2.0 viewer encounters an unknown XObject type, it will be in
a document with a PDF version number greater than 1.1. The viewer will
display an error specifying the type of XObject but not report any further
errors.
To avoid the 1.0 viewer's error behavior, new XObject types in PDF 1.1 can
be specified as Forms, providing the required Form keys but having no
content. The required keys are Name, BBox, FormType, and Matrix.
Subtype2 can specify the actual type, and additional keys can specify
additional information. See Section 6.8.6, "XObject resources," for a
description of the one new XObject type added in PDF 1.1.
A 1.0 viewer checks the FormType and displays an error once per form if
the FormType is not 1. It also displays an error that it cannot find the form
each time a page references the form. An Acrobat 2.0 viewer checks that
the FormType is 1. If it is not, the viewer displays an error once per
document and then ignores the form.
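A producer that disguises a new XObject type as a Form could check the placeholder before writing it. The Python sketch below assumes the XObject dictionary is modeled as a plain dict with string keys. The function name and the problem messages are invented for illustration.

```python
# Keys the text lists as required for a Form XObject placeholder.
REQUIRED_FORM_KEYS = ("Name", "BBox", "FormType", "Matrix")


def check_placeholder_form(xobject: dict) -> list:
    """Return a list of problems that would trigger viewer errors; empty if none."""
    problems = [f"missing {key}" for key in REQUIRED_FORM_KEYS if key not in xobject]
    # Both 1.0 and 2.0 viewers report an error when FormType is not 1.
    if xobject.get("FormType") != 1:
        problems.append("FormType is not 1")
    return problems
```

An empty result means the placeholder carries every required key and a FormType of 1, so neither viewer generation should report a form error for it.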
G.2.5 Color spaces
An image has a ColorSpace key. A 1.0 viewer displays an error each time
it finds an image with a color space that is not one of the PDF 1.0 color
spaces. Like XObjects, color spaces may not be added by plug-ins. If an
Acrobat 2.0 viewer encounters an unknown color space, it will be in a
document with a PDF version number greater than 1.1. The viewer will
display an error specifying the type of color space but not report any further
errors.
PDF 1.1 defines three additional color spaces: CalGray, CalRGB, and
Lab. To be more compatible with 1.0 viewers, PDF 1.1 allows an image
color space to be specified indirectly through the page resources. When an
Acrobat 2.0 viewer processes an image and the image's ColorSpace key
specifies DeviceRGB, the viewer looks in the page's resources for a color
space called DefaultRGB. If this key is present, the color space associated
with it is used instead of DeviceRGB. Similarly, if an image's
ColorSpace key specifies DeviceGray, the viewer looks for
DefaultGray. The 1.0 viewer ignores DefaultRGB and DefaultGray.
See Section 7.4 on page 148 for an explanation of the use of color spaces in
page descriptions. The presence of DefaultRGB or DefaultGray changes
the interpretation of some color operators.
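The lookup that a 2.0 viewer performs for an image can be sketched in Python as follows. The function name is invented, and page_color_spaces is a hypothetical dict standing in for the color space entries of the page's resources.

```python
# Device color spaces that a 2.0 viewer may remap, and the resource name it looks up.
_DEFAULTS = {"DeviceRGB": "DefaultRGB", "DeviceGray": "DefaultGray"}


def resolve_image_color_space(image_color_space, page_color_spaces):
    """Return the color space a 2.0 viewer would use for an image.

    `page_color_spaces` is a hypothetical dict of the named color spaces
    found in the page's resources.
    """
    default_name = _DEFAULTS.get(image_color_space)
    if default_name is not None and default_name in page_color_spaces:
        # A default is present, so it replaces the device color space.
        return page_color_spaces[default_name]
    return image_color_space
```

A 1.0 viewer corresponds to skipping the lookup entirely and always returning the image's own ColorSpace value.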
G.2.6 Filters
PDF uses stream objects to encapsulate image, indexed color space,
thumbnail, and embedded font data, and page, form, and Type 3 character
descriptions. These streams usually use filters to compress their data. The
legal PDF 1.0 filters are the same as those available in PostScript Level 2.
The 1.0 viewer behavior when encountering an unknown filter depends on
its context, as described in Table G.1.
Table G.1 Acrobat 1.0 viewer behavior with unknown filters
Context                       Behavior
Image resource                The image does not appear but no error is reported.
In-line image                 (An in-line image is specified directly in a page
                              description, while an image resource is specified
                              outside of a page and referenced from the page.)
                              An error is reported, and page processing stops.
Indexed color space           An error is reported, but page processing continues.
Thumbnail                     An error is reported, no more thumbnails are
                              displayed, but the thumbnails can be deleted and
                              created again.
Embedded font                 An error is reported, and the viewer behaves as if
                              the font is not embedded.
Page description              An error is reported, and page processing stops.
Form description              An error is reported, and page processing stops.
Type 3 character description  An error is reported, and page processing stops.
The Acrobat 2.0 viewers do not allow plug-ins to provide additional filters.
If an unrecognized filter is encountered, an Acrobat 2.0 viewer will specify
the context in which the filter was found. If an error occurs while displaying
a page, only the first error is reported. Subsequent behavior depends on the
context, as described in Table G.2.
Table G.2 Acrobat 2.0 viewer behavior with unknown filters
Context                       Behavior
Image resource                The image does not appear but page processing
                              continues.
In-line image                 Page processing stops.
Indexed color space           The image does not appear but page processing
                              continues.
Thumbnail                     An error is reported, no more thumbnails are
                              displayed, but the thumbnails can be deleted and
                              created again.
Embedded font                 The viewer behaves as if the font had not been
                              embedded.
Page description              Page processing stops.
Form description              The form does not appear but page processing
                              continues.
Type 3 character description  The character does not appear but page processing
                              continues. The current point is adjusted based on
                              the character's width.
Operations that process pages, such as Find and Create Thumbnails, stop as
soon as an error occurs.
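For a tool that needs to predict whether a page will render past an unknown filter, the "page processing stops" rows of Tables G.1 and G.2 can be restated as data. The Python names below are invented for this sketch, and only the stop/continue distinction from the tables is encoded.

```python
# Contexts in which an unknown filter stops page processing,
# taken from the "page processing stops" rows of Tables G.1 and G.2.
STOPS_PAGE = {
    "1.0": {"In-line image", "Page description", "Form description",
            "Type 3 character description"},
    "2.0": {"In-line image", "Page description"},
}


def page_processing_stops(viewer: str, context: str) -> bool:
    """Return True if an unknown filter in `context` halts the page for `viewer`."""
    return context in STOPS_PAGE[viewer]
```

The comparison shows the main difference between the two generations: a 2.0 viewer skips an unreadable form or Type 3 character and keeps going, while a 1.0 viewer stops.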
G.2.7 Page description operators
A 1.0 viewer reports an error the first time it finds an unknown operator or
an operator with too few operands, but it continues processing the page. If it
finds ten errors on a page, it reports back to the user and asks whether to
continue processing. No further errors are reported. Each time an error
occurs, the operand stack is cleared. Acrobat 2.0 viewers behave the same,
although there is no additional warning if ten errors are encountered.
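The recovery behavior described for Acrobat 2.0 viewers can be sketched as a small Python interpreter loop. This is a simplification under stated assumptions: every string token is treated as an operator, every other token as an operand, and the operators table (operator name to operand count) is supplied by the caller. None of these names come from Acrobat.

```python
def run_page(tokens, operators):
    """Interpret a token list with the error recovery described above.

    `operators` maps an operator name to its operand count. Returns a tuple
    (executed, first_error), where first_error is None if no error occurred.
    """
    stack, executed, first_error = [], [], None
    for token in tokens:
        if not isinstance(token, str):
            stack.append(token)        # operands accumulate on the stack
            continue
        needed = operators.get(token)
        if needed is None or len(stack) < needed:
            if first_error is None:    # only the first error is reported
                reason = "unknown operator" if needed is None else "too few operands"
                first_error = f"{reason}: {token}"
            stack.clear()              # the operand stack is cleared on every error
            continue                   # processing of the page continues
        split = len(stack) - needed
        args = stack[split:]           # the top `needed` operands
        del stack[split:]
        executed.append((token, args))
    return executed, first_error
```

Because the stack is cleared after each error, operands left over from a failed operator cannot be consumed by a later one.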
PDF 1.1 provides new page description operators for specifying device-
independent color and pass-through PostScript fragments. Since these
operators are incompatible with 1.0 viewers, PDF 1.1 provides alternative
compatible methods as well.
G.2.8 Procedure sets
Each page includes a ProcSet resource that describes the PostScript
procedure sets required to print the page. A 1.0 viewer ignores requests for
unknown procedure sets. An Acrobat 2.0 viewer warns the user that a
procedure set is unavailable and cancels printing.
G.2.9 Uniform Resource Identifiers
Acrobat 1.0 viewers report no error when a link annotation that uses the
URI action is invoked. The link inverts its color and performs no action.
Acrobat 2.0 viewers report the following error when a link annotation that
uses the URI action is invoked: "The plug-in required by this URI action is
unavailable."
G.2.10 Movie Annotations
Acrobat 1.0 viewers report the following error when they encounter an
annotation of type Movie: "An error occurred while reading a note or link.
Unknown annotation type." The annotation does not appear on the
document. Acrobat 2.0 viewers report the following error when they
encounter an annotation of type Movie: "The Plug-in required by this
Movie annotation is unavailable." The annotation is displayed as a grayed
rectangle with a question mark.
Bibliography
[1] Adobe Systems Incorporated, PostScript Language Reference
Manual, Second Edition, Addison-Wesley, 1990, ISBN 0-201-10174-2.
Reference manual describing the imaging model used in the PostScript
language and the language itself.
[2] Adobe Systems Incorporated, Supporting Data Compression in
PostScript Level 2 and the Filter Operator, Adobe Developer Support
Technical Note 5115.
[3] Adobe Systems Incorporated, Supporting the DCT Filters in
PostScript Level 2, Adobe Developer Support Technical Note 5116.
Contains errata for the JPEG discussion in the PostScript Language
Reference Manual, Second Edition. Also describes the compatibility of the
JPEG implementation with various versions of the JPEG standard.
[4] Adobe Systems Incorporated, Adobe Type 1 Font Format, Addison-
Wesley, 1990, ISBN 0-201-57044-0. Explains the internal organization of a
PostScript language Type 1 font program.
[5] Adobe Systems Incorporated, Adobe Type 1 Font Format: Multiple
Master Extensions, Adobe Developer Support Technical Note 5086.
Describes the additions made to the Type 1 font format to support multiple
master fonts.
[6] Aho, Alfred V., John E. Hopcroft, and Jeffrey D. Ullman, Data
Structures and Algorithms, Addison-Wesley, 1983, ISBN 0-201-00023-7.
Includes a discussion of balanced trees.
[7] Arvo, James (ed.), Graphics Gems II, Academic Press, 1991,
ISBN 0-12-064480-0. The section "Geometrically Continuous Cubic Bézier
Curves" by Hans-Peter Seidel describes the mathematics used to smoothly
join two cubic Bézier curves.
[8] Berners-Lee, T., and D. Connolly. Internet RFC 1866, Hypertext
Markup Language 2.0 Proposed Standard. November 1995. For updates,
see http://www.w3.org/pub/WWW/MarkUp/html-spec.
[9] Berners-Lee, T., Masinter, McCahill, and the Network Working
Group. Internet RFC 1738, Uniform Resource Locators.
<URL:ftp://ds.internic.net/rfc/rfc1738.txt;type=a>
[10] CCITT, Blue Book, Volume VII.3, 1988. ISBN 92-61-03611-2.
Recommendations T.4 and T.6 are the CCITT standards for Group 3 and
Group 4 facsimile encoding. This document may be purchased from Global
Engineering Documents, P.O. Box 19539, Irvine, California 92713.
[11] CCITT, Recommendation X.208: Specification of Abstract Syntax
Notation One (ASN.1), 1988.
[12] Fielding, Network Working Group. Internet RFC 1808, Relative
Uniform Resource Locators.
<URL:ftp://ds.internic.net/rfc/rfc1808.txt;type=a>
[13] Foley, James D., Andries van Dam, Steven K. Feiner, and John F.
Hughes, Computer Graphics: Principles and Practice, Second Edition,
Addison-Wesley, 1990, ISBN 0-201-12110-7. Section 11.2, "Parametric
Cubic Curves," contains a description of the mathematics of cubic Bézier
curves and a comparison of various types of parametric cubic curves.
[14] Glassner, Andrew S. (ed.), Graphics Gems, Academic Press, 1990,
ISBN 0-12-286165-5. The section "An Algorithm For Automatically Fitting
Digitized Curves" by Philip J. Schneider describes an algorithm for
determining the set of Bézier curves approximating an arbitrary set of
user-provided points. Appendix 2 contains an implementation of the algorithm,
written in the C programming language. Other sections relevant to the
mathematics of Bézier curves include "Solving the Nearest-Point-On-Curve
Problem" by Philip J. Schneider, "Some Properties of Bézier Curves" by
Ronald Goldman, and "A Bézier Curve-Based Root-Finder" by Philip J.
Schneider. The source code appearing in the appendix is available via
anonymous ftp, as described in the preface to Graphics Gems III.
[15] Joint Photographic Experts Group (JPEG), Revision 8 of the JPEG
Technical Specification, ISO/IEC JTC1/SC2/WG8, CCITT SGVIII,
August 14, 1990. Defines a set of still-picture grayscale and color image
data compression algorithms.
[16] Kirk, David (ed.), Graphics Gems III, Academic Press, 1992,
ISBN 0-12-409670-0 (with IBM Disk) or ISBN 0-12-409671-9 (with Macintosh
disk). The section "Interpolation Using Bézier Curves" by Gershon Elber
contains an algorithm for calculating a Bézier curve that passes through a
user-specified set of points. The algorithm utilizes not only cubic Bézier
curves, which are supported in PDF, but also higher-order Bézier curves.
The appendix contains an implementation of the algorithm, written in the C
programming language. All of the source code appearing in the appendix is
available via anonymous ftp, as described in the preface.
[17] Microsoft Corp., TrueType 1.0 Font Files, Revision 1.00, May 1992.
[18] Pennebaker, W. B. and Joan L. Mitchell, JPEG Still Image Data
Compression Standard, Van Nostrand Reinhold, 1993, ISBN 0-442-01272-1.
[19] Ron Rivest, RFC 1321: The MD5 Message-Digest Algorithm, April
1992.
[20] Warnock, John and D. Wyatt, A Device Independent Graphics
Imaging Model for Use with Raster Devices, Computer Graphics (ACM
SIGGRAPH), Volume 16, Number 3, July 1982. Technical background for
the imaging model used in the PostScript language.
Colophon
This book was produced electronically using Adobe FrameMaker
on the
Macintosh