Addison-Wesley Professional Computing Series
Brian W. Kernighan, Consulting Editor
Ken Arnold/John Peyton, A C User's Guide to ANSI C
David R. Butenhof, Programming with POSIX® Threads
Tom Cargill, C++ Programming Style
William R. Cheswick/Steven M. Bellovin, Firewalls and Internet Security: Repelling the Wily Hacker
David A. Curry, UNIX® System Security: A Guide for Users and System Administrators
Erich Gamma/Richard Helm/Ralph Johnson/John Vlissides, Design Patterns: Elements of
Reusable Object-Oriented Software
Erich Gamma/Richard Helm/Ralph Johnson/John Vlissides, Design Patterns CD: Elements of
Reusable Object-Oriented Software
David R. Hanson, C Interfaces and Implementations: Techniques for Creating Reusable Software
S. Keshav, An Engineering Approach to Computer Networking: ATM Networks, The Internet, and
the Telephone Network
John Lakos, Large-Scale C++ Software Design
Scott Meyers, Effective C++, Second Edition: 50 Specific Ways to Improve Your Programs and Designs
Scott Meyers, More Effective C++: 35 New Ways to Improve Your Programs and Designs
Robert B. Murray, C++ Strategies and Tactics
David R. Musser/Atul Saini, STL Tutorial and Reference Guide: C++ Programming with the
Standard Template Library
John K. Ousterhout, Tcl and the Tk Toolkit
Craig Partridge, Gigabit Networking
J. Stephen Pendergrast, Jr., Desktop KornShell Graphical Programming
Radia Perlman, Interconnections: Bridges and Routers
David M. Piscitello/A. Lyman Chapin, Open Systems Networking: TCP/IP and OSI
Stephen A. Rago, UNIX® System V Network Programming
Curt Schimmel, UNIX® Systems for Modern Architectures: Symmetric Multiprocessing and
Caching for Kernel Programmers
W. Richard Stevens, Advanced Programming in the UNIX® Environment
W. Richard Stevens, TCP/IP Illustrated, Volume 1: The Protocols
W. Richard Stevens, TCP/IP Illustrated, Volume 3: TCP for Transactions, HTTP, NNTP, and
UNIX® Domain Protocols
Gary R. Wright/W. Richard Stevens, TCP/IP Illustrated, Volume 2: The Implementation

Programming with POSIX® Threads

David R. Butenhof

ADDISON-WESLEY
An Imprint of Addison Wesley Longman, Inc.
Reading, Massachusetts • Harlow, England • Menlo Park, California
Berkeley, California • Don Mills, Ontario • Sydney
Bonn • Amsterdam • Tokyo • Mexico City

Trademark acknowledgments:
UNIX is a registered trademark in the United States and other countries, licensed exclusively through X/Open Company Ltd. Digital, DEC, Digital UNIX, DECthreads, VMS, and OpenVMS are trademarks of Digital Equipment Corporation. Solaris, SPARC, SunOS, and Sun are trademarks of Sun Microsystems Incorporated. SGI and IRIX are trademarks of Silicon Graphics, Incorporated. HP-UX is a trademark of Hewlett-Packard Company. AIX, IBM, and OS/2 are trademarks or registered trademarks of the IBM Corporation. X/Open is a trademark of X/Open Company Ltd. POSIX is a registered trademark of the Institute of Electrical and Electronics Engineers, Inc.
Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book and Addison-Wesley was aware of a trademark claim, the designations have been printed in initial caps or all caps.
The authors and publishers have taken care in the preparation of this book, but make no expressed or implied warranty of any kind and assume no responsibility for errors or omissions. No liability is assumed for incidental or consequential damages in connection with or arising out of the use of the information or programs contained herein.
The publisher offers discounts on this book when ordered in quantity for special sales.
For more information, please contact
Corporate & Professional Publishing Group
Addison-Wesley Publishing Company
One Jacob Way
Reading, Massachusetts 01867
Library of Congress Cataloging-in-Publication Data
Butenhof, David R., 1956-
Programming with POSIX threads / David R. Butenhof.
p. cm. -- (Addison-Wesley professional computing series)
Includes bibliographical references and index.
ISBN 0-201-63392-2 (pbk.)
1. Threads (Computer programs) 2. POSIX (Computer software
standard) 3. Electronic digital computers--Programming. I. Title.
II. Series.
QA76.76.T55B88 1997
005.4'32--dc21 97-6835
CIP
Copyright © 1997 by Addison Wesley Longman, Inc.
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form, or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior consent of the publisher. Printed in the United States of America. Published simultaneously in Canada.
Text printed on recycled and acid-free paper.
2 3 4 5 6 7 8 9 MA 00 99 98 97
2nd Printing October, 1997

To Anne, Amy,
and
Alyssa.

Quote acknowledgments:
American Heritage Dictionary of the English Language: page 1
ISO/IEC 9945-1:1996, © 1996 by IEEE: page 29.
Lewis Carroll, Alice's Adventures in Wonderland: pages xv, 47, 70, 88, 97, 98, 106, 131,
142, 161, 189, 197. Reproduced by permission of Macmillan Children's Books.
Lewis Carroll, Through the Looking Glass: pages 1, 4, 8, 20, 25, 29, 35, 45, 172, 214,
241, 283, 290, 302. Reproduced by permission of Macmillan Children’s Books.
Lewis Carroll, The Hunting of the Snark: pages 3, 13, 28, 39, 120, 131, 134, 289, 967.
Reproduced by permission of Macmillan Children's Books.

Contents
List of Example Programs
Preface
    Intended audience
    About the author
    Acknowledgments
1 Introduction
    1.1 The "bailing programmers"
    1.2 Definitions and terminology
        1.2.1 Asynchronous
        1.2.2 Concurrency
        1.2.3 Uniprocessor and multiprocessor
        1.2.4 Parallelism
        1.2.5 Thread safety and reentrancy
        1.2.6 Concurrency control functions
    1.3 Asynchronous programming is intuitive...
        1.3.1 ...because UNIX is asynchronous
        1.3.2 ...because the world is asynchronous
    1.4 About the examples in this book
    1.5 Asynchronous programming, by example
        1.5.1 The baseline, synchronous version
        1.5.2 A version using multiple processes
        1.5.3 A version using multiple threads
        1.5.4 Summary
    1.6 Benefits of threading
        1.6.1 Parallelism
        1.6.2 Concurrency
        1.6.3 Programming model
    1.7 Costs of threading
        1.7.1 Computing overhead
        1.7.2 Programming discipline
        1.7.3 Harder to debug
    1.8 To thread or not to thread?
    1.9 POSIX thread concepts
        1.9.1 Architectural overview
        1.9.2 Types and interfaces
        1.9.3 Checking for errors
2 Threads
    2.1 Creating and using threads
    2.2 The life of a thread
        2.2.1 Creation
        2.2.2 Startup
        2.2.3 Running and blocking
        2.2.4 Termination
        2.2.5 Recycling
3 Synchronization
    3.1 Invariants, critical sections, and predicates
    3.2 Mutexes
        3.2.1 Creating and destroying a mutex
        3.2.2 Locking and unlocking a mutex
            3.2.2.1 Nonblocking mutex locks
        3.2.3 Using mutexes for atomicity
        3.2.4 Sizing a mutex to fit the job
        3.2.5 Using more than one mutex
            3.2.5.1 Lock hierarchy
            3.2.5.2 Lock chaining
    3.3 Condition variables
        3.3.1 Creating and destroying a condition variable
        3.3.2 Waiting on a condition variable
        3.3.3 Waking condition variable waiters
        3.3.4 One final alarm program
    3.4 Memory visibility between threads
4 A few ways to use threads
    4.1 Pipeline
    4.2 Work crew
    4.3 Client/server
5 Advanced threaded programming
    5.1 One-time initialization
    5.2 Attributes objects
        5.2.1 Mutex attributes
        5.2.2 Condition variable attributes
        5.2.3 Thread attributes
    5.3 Cancellation
        5.3.1 Deferred cancelability
        5.3.2 Asynchronous cancelability
        5.3.3 Cleaning up
    5.4 Thread-specific data
        5.4.1 Creating thread-specific data
        5.4.2 Using thread-specific data
        5.4.3 Using destructor functions
    5.5 Realtime scheduling
        5.5.1 POSIX realtime options
        5.5.2 Scheduling policies and priorities
        5.5.3 Contention scope and allocation domain
        5.5.4 Problems with realtime scheduling
        5.5.5 Priority-aware mutexes
            5.5.5.1 Priority ceiling mutexes
            5.5.5.2 Priority inheritance mutexes
    5.6 Threads and kernel entities
        5.6.1 Many-to-one (user level)
        5.6.2 One-to-one (kernel level)
        5.6.3 Many-to-few (two level)
6 POSIX adjusts to threads
    6.1 fork
        6.1.1 Fork handlers
    6.2 exec
    6.3 Process exit
    6.4 Stdio
        6.4.1 flockfile and funlockfile
        6.4.2 getchar_unlocked and putchar_unlocked
    6.5 Thread-safe functions
        6.5.1 User and terminal identification
        6.5.2 Directory searching
        6.5.3 String token
        6.5.4 Time representation
        6.5.5 Random number generation
        6.5.6 Group and user database
    6.6 Signals
        6.6.1 Signal actions
        6.6.2 Signal masks
        6.6.3 pthread_kill
        6.6.4 sigwait and sigwaitinfo
        6.6.5 SIGEV_THREAD
        6.6.6 Semaphores: synchronizing with a signal-catching function
7 "Real code"
    7.1 Extended synchronization
        7.1.1 Barriers
        7.1.2 Read-write locks
    7.2 Work queue manager
    7.3 But what about existing libraries?
        7.3.1 Modifying libraries to be thread-safe
        7.3.2 Living with legacy libraries
8 Hints to avoid debugging
    8.1 Avoiding incorrect code
        8.1.1 Avoid relying on "thread inertia"
        8.1.2 Never bet your mortgage on a thread race
        8.1.3 Cooperate to avoid deadlocks
        8.1.4 Beware of priority inversion
        8.1.5 Never share condition variables between predicates
        8.1.6 Sharing stacks and related memory corrupters
    8.2 Avoiding performance problems
        8.2.1 Beware of concurrent serialization
        8.2.2 Use the right number of mutexes
            8.2.2.1 Too many mutexes will not help
        8.2.3 Never fight over cache lines
9 POSIX threads mini-reference
    9.1 POSIX 1003.1c-1995 options
    9.2 POSIX 1003.1c-1995 limits
    9.3 POSIX 1003.1c-1995 interfaces
        9.3.1 Error detection and reporting
        9.3.2 Use of void* type
        9.3.3 Threads
        9.3.4 Mutexes
        9.3.5 Condition variables
        9.3.6 Cancellation
        9.3.7 Thread-specific data
        9.3.8 Realtime scheduling
        9.3.9 Fork handlers
        9.3.10 Stdio
        9.3.11 Thread-safe functions
        9.3.12 Signals
        9.3.13 Semaphores
10 Future standardization
    10.1 X/Open XSH5 [UNIX98]
        10.1.1 POSIX options for XSH5
        10.1.2 Mutex type
        10.1.3 Set concurrency level
        10.1.4 Stack guard size
        10.1.5 Parallel I/O
        10.1.6 Cancellation points
    10.2 POSIX 1003.1j
        10.2.1 Barriers
        10.2.2 Read-write locks
        10.2.3 Spinlocks
        10.2.4 Condition variable wait clock
        10.2.5 Thread abort
    10.3 POSIX 1003.14
Bibliography
Thread resources on the Internet
Index

Example programs
sample.c part 1 sample info
alarm.c
alarm_fork.c
alarm_thread.c part 1 definitions
alarm_thread.c part 2 alarm thread
alarm_thread.c part 3 main
thread_error.c
errors.h
lifecycle.c
mutex_static.c
mutex_dynamic.c
alarm_mutex.c part 1 definitions
alarm_mutex.c part 2 alarm thread
alarm_mutex.c part 3 main
trylock.c
backoff.c
cond_static.c
cond_dynamic.c
cond.c
alarm_cond.c part 1 declarations
alarm_cond.c part 2 alarm insert
alarm_cond.c part 3 alarm routine
alarm_cond.c part 4 main
pipe.c part 1 definitions
pipe.c part 2 pipe_send
pipe.c part 3 pipe_stage
pipe.c part 4 pipe_create
pipe.c part 5 pipe_start, pipe_result
pipe.c part 6 main
crew.c part 1 definitions
crew.c part 2 worker_routine
crew.c part 3 crew_create
crew.c part 4 crew_start
crew.c part 5 main
server.c part 1 definitions
server.c part 2 tty_server_routine
server.c part 3 tty_server_request
server.c part 4 client_routine
server.c part 5 main
once.c
mutex_attr.c
cond_attr.c
thread_attr.c
cancel.c
cancel_disable.c
cancel_async.c
cancel_cleanup.c
cancel_subcontract.c
tsd_once.c
tsd_destructor.c
sched_attr.c
sched_thread.c
atfork.c
flock.c
putchar.c
getlogin.c
susp.c part 1 signal-catching functions
susp.c part 2 initialization
susp.c part 3 thd_suspend
susp.c part 4 thd_continue
susp.c part 5 sample program
sigwait.c
sigev_thread.c
semaphore_signal.c
barrier.h part 1 barrier_t
barrier.h part 2 interfaces
barrier.c part 1 barrier_init
barrier.c part 2 barrier_destroy
barrier.c part 3 barrier_wait
barrier_main.c
rwlock.h part 1 rwlock_t
rwlock.h part 2 interfaces
rwlock.c part 1 rwl_init
rwlock.c part 2 rwl_destroy
rwlock.c part 3 cleanup handlers
rwlock.c part 4 rwl_readlock
rwlock.c part 5 rwl_readtrylock
rwlock.c part 6 rwl_readunlock
rwlock.c part 7 rwl_writelock
rwlock.c part 8 rwl_writetrylock
rwlock.c part 9 rwl_writeunlock
rwlock_main.c
workq.h part 1 workq_t
workq.h part 2 interfaces
workq.c part 1 workq_init
workq.c part 2 workq_destroy
workq.c part 3 workq_add
workq.c part 4 workq_server
workq_main.c
inertia.c

Preface
The White Rabbit put on his spectacles.
"Where shall I begin, please your Majesty?" he asked.
"Begin at the beginning," the King said, very gravely,
"and go on till you come to the end: then stop."
Lewis Carroll, Alice's Adventures in Wonderland
This book is about "threads" and how to use them. "Thread" is just a name for a basic software "thing" that can do work on a computer. A thread is smaller, faster, and more maneuverable than a traditional process. In fact, once threads have been added to an operating system, a "process" becomes just data—address space, files, and so forth—plus one or more threads that do something with all that data.
With threads, you can build applications that utilize system resources more efficiently, that are more friendly to users, that run blazingly fast on multiprocessors, and that may even be easier to maintain. To accomplish all this, you need only add some relatively simple function calls to your code, adjust to a new way of thinking about programming, and leap over a few yawning chasms. Reading this book carefully will, I hope, help you to accomplish all that without losing your sense of humor.
The threads model used in this book is commonly called "Pthreads," or "POSIX threads." Or, more formally (since you haven't yet been properly introduced), the POSIX 1003.1c-1995 standard. I'll give you a few other names later—but for now, "Pthreads" is all you need to worry about.
As I write this, Sun's Solaris, Digital's Digital UNIX, and SGI's IRIX already
support Pthreads. The other major commercial UNIX operating systems will soon
have Pthreads as well, maybe even by the time you read this, including IBM's AIX
and Hewlett-Packard’s HP-UX. Pthreads implementations are also available for
Linux and other UNIX operating systems.
In the personal computer market, Microsoft's Win32 API (the primary pro-
gramming interface to both Windows NT and Windows 95) supports threaded
programming, as does IBM's OS/2. These threaded programming models are
quite different from Pthreads, but the important first step toward using them pro-
ductively is understanding concurrency, synchronization, and scheduling. The
rest is (more or less) a matter of syntax and style, and an experienced thread programmer can adapt to any of these models.
The threaded model can be (and has been) applied with great success to a wide range of programming problems. Here are just a few:
• Large scale, computationally intensive programs
• High-performance application programs and library code that can take advantage of multiprocessor systems
• Library code that can be used by threaded application programs
• Realtime application programs and library code
• Application programs and library code that perform I/O to slow external devices (such as networks and human beings)
Intended audience
This book assumes that you are an experienced programmer, familiar with
developing code for an operating system in “the UNIX family” using the ANSI C
language. I have tried not to assume that you have any experience with threads
or other forms of asynchronous programming. The Introduction chapter provides
a general overview of the terms and concepts you'll need for the rest of the book.
If you don't want to read the Introduction first, that's fine, but if you ever feel like you're "missing something" you might try skipping back to get introduced.
Along the way you'll find examples and simple analogies for everything. In the
end I hope that you'll be able to continue comfortably threading along on your
own. Have fun, and “happy threading.”
About the author
I have been involved in the Pthreads standard since it began, although I stayed at home for the first few meetings. I was finally forced to spend a grueling week in the avalanche-proof concrete bunker at the base of Snowbird ski resort in Utah, watching hard-working standards representatives from around the world wax their skis. This was very distracting, because I had expected a standards meeting to be a formal and stuffy environment. As a result of this misunderstanding, I was forced to rent ski equipment instead of using my own.
After the Pthreads standard went into balloting, I worked on additional thread
synchronization interfaces and multiprocessor issues with several POSIX work-
ing groups. I also helped to define the Aspen threads extensions, which were fast-
tracked into X/Open XSH5.
I have worked at Digital Equipment Corporation for (mumble, mumble) years, in various locations throughout Massachusetts and New Hampshire. I was one of the creators of Digital's own threading architecture, and I designed (and implemented much of) the Pthreads interfaces on Digital UNIX 4.0. I have been helping people develop and debug threaded code for more than eight years.
My unofficial motto is “Better Living Through Concurrency.” Threads are not
sliced bread, but then, we're programmers, not bakers, so we do what we can.
Acknowledgments
This is the part where I write the stuff that I'd like to see printed, and that my friends and coworkers want to see. You probably don't care, and I promise not to be annoyed if you skip over it—but if you're curious, by all means read on.
No project such as this book can truly be accomplished by a single person,
despite the fact that only one name appears on the cover. I could have written a
book about threads without any help—I know a great deal about threads, and I am at least reasonably competent at written communication. However, the result
would not have been this book, and this book is better than that hypothetical
work could possibly have been.
Thanks first and foremost to my manager Jean Fullerton, who gave me the time and encouragement to write this book on the job—and thanks to the rest of the DECthreads team who kept things going while I wrote, including Brian
Keane, Webb Scales, Jacqueline Berg, Richard Love, Peter Portante, Brian Silver,
Mark Simons, and Steve Johnson.
Thanks to Garret Swart who, while he was with Digital at the Systems
Research Center, got us involved with POSIX. Thanks to Nawaf Bitar who worked
with Garret to create, literally overnight, the first draft of what became Pthreads,
and who became POSIX thread evangelist through the difficult period of getting
everyone to understand just what the heck this threading thing was all about
anyway. Without Garret, and especially Nawaf, Pthreads might not exist, and cer-
tainly wouldn't be as good as it is. (The lack of perfection is not their
responsibility—that's the way life is.)
Thanks to everyone who contributed to the design of cma, Pthreads, UNIX98, and to the users of DCE threads and DECthreads, for all the help, thought-provoking discourse, and assorted skin-thickening exercises, including Andrew Birrell, Paul Borman, Bob Conti, Bill Cox, Jeff Denham, Peter Gilbert, Rick Greer, Mike Grier, Kevin Harris, Ken Hobday, Mike Jones, Steve Kleiman, Bob Knighten, Leslie Lamport, Doug Locke, Paula Long, Finnbarr P. Murphy, Bill Noyce, Simon Patience, Harold Seigel, Al Simons, Jim Woodward, and John Zolnowsky.
Many thanks to all those who patiently reviewed the drafts of this book (and even to those who didn't seem so patient at times). Brian Kernighan, Rich Stevens, Dave Brownell, Bill Gallmeister, Ian Ginzburg, Will Morse, Bryan O'Sullivan, Bob Robillard, Dave Ruddock, Bil Lewis, and many others suggested or motivated improvements in structure and detail—and provided additional skin-thickening exercises to keep me in shape. Devang Shah and Bart Smaalders answered some Solaris questions, and Bryan O'Sullivan suggested what became the "bailing programmers" analogy.
Thanks to John Wait and Lana Langlois at Addison Wesley Longman, who waited with great patience as a first-time writer struggled to balance writing a book with engineering and consulting commitments. Thanks to Pamela Yee and Erin Sweeney, who managed the book's production process, and to all the team (many of whose names I'll never know), who helped.
Thanks to my wife, Anne Lederhos, and our daughters Amy and Alyssa, for all the things for which any writers may thank their families, including support, tolerance, and just being there. And thanks to Charles Dodgson (Lewis Carroll), who
wrote extensively about threaded programming (and nearly everything else) in his
classic works Alice’s Adventures in Wonderland, Through the Looking Glass, and
The Hunting of the Snark.
Dave Butenhof
Digital Equipment Corporation
110 Spit Brook Road, ZKO2-3/Q18
Nashua, NH 03062
butenhof@zko.dec.com
December 1996

1 Introduction
"The time has come," the Walrus said,
"To talk of many things;
Of shoes—and ships—and sealing wax—
Of cabbages—and kings—
And why the sea is boiling hot—
And whether pigs have wings."
Lewis Carroll, Through the Looking Glass
In a dictionary, you would probably see that one of several definitions for "thread" is along the lines of the third definition in the American Heritage paperback dictionary on my desk: "Anything suggestive of the continuousness and sequence of thread." In computer terms, a thread is the set of properties that suggest "continuousness and sequence" within the machine. A thread comprises the machine state necessary to execute a sequence of machine instructions—the location of the current instruction, the machine's address and data registers, and so forth.
A UNIX process can be thought of as a thread, plus an address space, file descriptors, and an assortment of other data. Some versions of UNIX support "lightweight" or "variable weight" processes that allow you to strip some or all of that data from some of your processes for efficiency. Now, whether you're using a "thread" or a "lightweight process," you still need the address space, file descriptors, and everything else. So, you might ask, what's the point? The point is that you can have many threads sharing an address space, doing different things. On a multiprocessor, the threads in a process can be doing different things simultaneously.
When computers lived in glass caves and were fed carefully prepared punch
cards, the real world outside could be kept waiting with no consequences more
severe than some grumbling programmers. But the real world doesn't do one
thing at a time, and gradually computers began to model that world by adding
capabilities such as multiprogramming, time sharing, multiprocessing, and, even-
tually, threads.
Threads can help you bring your application out of the cave, and Pthreads helps you do it in a way that will be neat, efficient, and portable. This chapter briefly introduces you to what you need to begin understanding and using threads. Don't worry—the rest of the book will follow up on the details left dangling in this chapter.
Section 1.1 presents the framework for a number of analogies that I will use to explain threading as we go. There is nothing all that unusual in the brief story—but hereafter you will understand when I talk about programmers and buckets, which, otherwise, might seem mildly odd.
Section 1.2 defines some essential concepts and terms used in this book. The most important of these concepts deserves a special introduction, which will also serve to demonstrate the convention with which various particularly important points shall be emphasized throughout this book:
Asynchronous:
Any two operations are "asynchronous" when they can proceed
independently of each other.
Section 1.3 describes how you already use asynchronous programming on a regular basis, both as a UNIX programmer and user, and as a human being in the real world. I wouldn't dare to claim that asynchronous programming is easy, but the basic concepts it tries to model are so easy and natural that you rarely need even to think about them until you try to apply them to software.
Threads are, to some extent, just one more way to make applications asynchronous, but threads have some advantages over other models that have been used to build asynchronous applications. Section 1.5 will show you some of the advantages as we apply various programming methods in several versions of a simple alarm clock. You will get to see "threads in action" right away, with a brief description of the few Pthreads interfaces needed to build this simple application.
Armed, now, with a basic understanding of what threads are all about, you can go on to Section 1.6, where we will explore some of the fundamental advantages of a threaded programming model.
Although there are a lot of excellent reasons to use threads, there is a price to be paid. Section 1.7 provides a counterpoint to the previous section by describing some of the costs. What it boils down to, though, is simply that you need to learn how the model works, and then apply it carefully. It's not as hard as some folks would have you believe.
You have seen some of the fundamental benefits and costs. It may be obvious that you do not want to rush out and put threads into every application or library you write. Section 1.8 asks the question "To thread, or not to thread?" and I will attempt to guide you toward determining the proper answer in various cases.
You will know at that point what threads are, what they do, and when to use them. Aside from brief examples, you haven't yet seen any detailed information about the particular programming interfaces (APIs) that compose Pthreads. Section 1.9 points out some of the basic landmarks of the Pthreads universe to get you oriented before we plunge ahead. The most important part of this section is 1.9.3, which describes the Pthreads model for reporting errors—which is somewhat different from the rest of UNIX and POSIX.
1.1 The "bailing programmers"
This was charming, no doubt; but they shortly found out
That the Captain they trusted so well
Had only one notion for crossing the ocean,
And that was to tingle his bell.
Lewis Carroll, The Hunting of the Snark
Three programmers sail out to sea one fine day in a small boat. They sail quite some distance from shore, enjoying the sun and sea breeze, allowing the wind to carry them.

The sky darkens, and a storm strikes. The small boat is tossed violently about, and when the storm abates the programmers are missing their boat's sail and most of the mast. The boat has sprung a small leak, and there is no land in sight.

The boat is equipped with food, water, oars, and a bailing bucket, and the programmers set to work. One programmer rows, and monitors the accumulating water in the bottom of the boat. The other programmers alternately sleep, watch the water level, or scan the horizon for sight of land or another ship.

An idle programmer may notice rising water in the boat and begin bailing. When both idle programmers are awake, and become simultaneously concerned regarding their increasing dampness, they may both lunge for the bailing bucket—but one will inevitably reach it first, and the other will have to wait.

If the rower decides that bailing is required while both his companions sleep, a nudge is usually sufficient to awaken a programmer, allowing the other to continue sleeping. But if the rower is in a bad mood, he may resort to a loud yell, awakening both sleeping programmers. While one programmer assumes the necessary duty, the other can try to fall asleep again.

When the rower tires, he can signal one of the other programmers to take over the task, and immediately fall into a deep sleep waiting to be signaled in turn. In this way, they journey on for some time.
So, just what do the Bailing Programmers have to do with threads? I'm glad
you asked! The elements of the story represent analogies that apply to the
Pthreads programming model. We'll explore some additional analogies in later
sections, and even expand the story a little, but for now consider a few basics:
A programmer is an entity that is capable of independent activity. Our programmers represent threads. A thread is not really much like a programmer, who, as we all know, is a fascinatingly sophisticated mixture of engineer, mathematician, and artist that no computer can match. Still, as a representation of the "active element" in our programming model, it will be sufficient.
The bailing bucket and the oars are "tokens" that can be held by only one individual at a time. They can be thought of as shared data, or as synchronization objects. The primary Pthreads synchronization object, by the way, is called a mutex.
Nudges and shouts are communication mechanisms associated with a synchronization object, on which individuals wait for some condition. Pthreads provides condition variables, which may be signaled or broadcast to indicate changes in shared data state.
1.2 Definitions and terminology
"When I use a word," Humpty Dumpty said, in rather a scornful tone,
"it means just what I choose it to mean—neither more nor less."
Lewis Carroll, Through the Looking Glass
This book will use several critical terms that may be unfamiliar to you unless you've already had some experience with parallel or asynchronous programming. Even if you are familiar with them, some of the terms have seen assorted and even contradictory uses within research and industry, and that is clearly not going to help communication. We need to begin by coming to a mutual agreement regarding the meaning of these terms, and, since I am writing the book, we will agree to use my definitions. (Thank you.)
1.2.1 Asynchronous
Asynchronous means that things happen independently (concurrently) unless there's some enforced dependency. Life is asynchronous. The dependencies are supplied by nature, and events that are not dependent on one another can occur simultaneously. A programmer cannot row without the oars, or bail effectively without the bucket—but a programmer with oars can row while another programmer with a bucket bails. Traditional computer programming, on the other hand, causes all events to occur in series unless the programmer takes "extraordinary measures" to allow them to happen concurrently.
The greatest complication of "asynchrony" has been that there's little advantage to being asynchronous unless you can have more than one activity going at a time. If you can start an asynchronous operation, but then you can do nothing but wait for it, you're not getting much benefit from the asynchrony.
1.2.2 Concurrency
Concurrency, which an English dictionary will tell you refers to things hap-
pening at the same time, is used to refer to things that appear to happen at the
same time, but which may occur serially. Concurrency describes the behavior of
threads or processes on a uniprocessor system. The definition of concurrent
execution in POSIX requires that “functions that suspend the execution of the
calling thread shall not cause the execution of other threads to be indefinitely
suspended.”
Concurrent operations may be arbitrarily interleaved so that they make
progress independently (one need not be completed before another begins), but
concurrency does not imply that the operations proceed simultaneously. Never-
theless, concurrency allows applications to take advantage of asynchronous
capabilities, and “do work” while independent operations are proceeding.
Most programs have asynchronous aspects that may not be immediately obvi-
ous. Users, for example, prefer asynchronous interfaces. They expect to be able to
issue a command while they're thinking about it, even before the program has
finished with the last one. And when a windowing interface provides separate
windows, don't you intuitively expect those windows to act asynchronously?
Nobody likes a “busy” cursor. Pthreads provides you with both concurrency and
asynchrony, and the combination is exactly what you need to easily write respon-
sive and efficient programs. Your program can “wait in parallel” for slow I/O
devices, and automatically take advantage of multiprocessor systems to compute
in parallel.
1.2.3 Uniprocessor and multiprocessor
The terms uniprocessor and multiprocessor are fairly straightforward, but let's
define them just to make sure there's no confusion. By uniprocessor, I mean a
computer with a single programmer-visible execution unit (processor). A single
general-purpose processor with superscalar processing, or vector processors, or
other math or I/O coprocessors is still usually considered a uniprocessor.
By multiprocessor, I mean a computer with more than one processor sharing a
common instruction set and access to the same physical memory. While the pro-
cessors need not have equal access to all physical memory, it should be possible
for any processor to gain access to most memory. A “massively parallel processor”
(MPP) may or may not qualify as a multiprocessor for the purposes of this book.
Many MPP systems do qualify, because they provide access to all physical mem-
ory from every processor, even though the access times may vary widely.
1.2.4 Parallelism
Parallelism describes concurrent sequences that proceed simultaneously. In
other words, software “parallelism” is the same as English “concurrency” and dif-
ferent from software “concurrency.” Parallelism has a vaguely redeeming analogy
to the English definition: It refers to things proceeding in the same direction
independently (without intersection).
True parallelism can occur only on a multiprocessor system, but concurrency
can occur on both uniprocessor and multiprocessor systems. Concurrency can
occur on a uniprocessor because concurrency is, essentially, the illusion of paral-
lelism. While parallelism requires that a program be able to perform two
computations at once, concurrency requires only that the programmer be able to
pretend that two things can happen at once.
1.2.5 Thread safety and reentrancy
“Thread-safe” means that the code can be called from multiple threads with-
out destructive results. It does not require that the code run efficiently in
multiple threads, only that it can operate safely in multiple threads. Most exist-
ing functions can be made thread-safe using tools provided by Pthreads—
mutexes, condition variables, and thread-specific data. Functions that don't
require persistent context can be made thread-safe by serializing the entire func-
tion, for example, by locking a mutex on entry to the function, and unlocking the
mutex before returning. Functions made thread-safe by serializing the entire
function can be called in multiple threads—but only one thread can truly perform
the function at a time.
More usefully, thread-safe functions can be broken down into smaller critical
sections. That allows more than one thread to execute within the function,
although not within the same part. Even better, the code can be redesigned to
protect critical data rather than critical code, which may allow fully parallel exe-
cution of the code, when the threads don’t need to use the same data at the same
time.
The putchar function, for example, which writes a character into a standard
I/O (stdio) buffer, might be made thread-safe by turning putchar into a critical
section. That is, putchar might lock a “putchar mutex,” write the character, and
then unlock the putchar mutex. You could call putchar from two threads, and no
data would be corrupted—it would be thread-safe. However, only one thread
could write its character at a time, and the others would wait, even if they were
writing to different stdio streams.
The correct solution is to associate the mutex with the stream, protecting the
data rather than the code. Now your threads, as long as they are writing to differ-
ent streams, can execute putchar in parallel. More importantly, all functions that
access a stream can use the same mutex to safely coordinate their access to that
stream.
The term “reentrant” is sometimes used to mean “efficiently thread-safe.” That
is, the code was made thread-safe by some more sophisticated measures than
converting the function or library into a single serial region. Although existing
code can usually be made thread-safe by adding mutexes and thread-specific
data, it is often necessary to change the interface to make a function reentrant.
Reentrant code should avoid relying on static data and, ideally, should avoid reli-
ance on any form of synchronization between threads.
Often, a function can avoid internal synchronization by saving state in a “con-
text structure” that is controlled by the caller. The caller is then responsible for
any necessary synchronization of the data. The UNIX readdir function, for exam-
ple, returns each directory entry in sequence. To make readdir thread-safe, you
might add a mutex that readdir locked each time it was called, and unlocked
before it returned to the caller. Another approach, as Pthreads has taken with
readdir_r, is to avoid any locking within the function, letting the caller allocate a
structure that maintains the context of readdir_r as it searches a directory.
At first glance, it may seem that we're just making the caller perform what
ought to be the job of readdir_r. But remember that only the caller knows how
the data will be used. If only one thread uses this particular directory context, for
example, then no synchronization is needed. Even when the data is shared
between threads, the caller may be able to supply more efficient synchronization,
for example, if the context can be protected using a mutex that the application
also uses for other data.
1.2.6 Concurrency control functions
Any “concurrent system” must provide a core set of essential functions that
you need to create concurrent execution contexts, and control how they operate
within your library or application. Here are three essential facilities, or aspects,
of any concurrent system:
1. Execution context is the state of a concurrent entity. A concurrent system
must provide a way to create and delete execution contexts, and maintain
their state independently. It must be able to save the state of one context
and dispatch to another at various times, for example, when one needs to
wait for an external event. It must be able to continue a context from the
point where it last executed, with the same register contents, at a later
time.
2. Scheduling determines which context (or set of contexts) should execute at
any given point in time, and switches between contexts when necessary.
3. Synchronization provides mechanisms for concurrent execution contexts to
coordinate their use of shared resources. We use this term in a way that is
nearly the opposite of the standard dictionary meaning. You'll find a defini-
tion much like “cause to occur at the same time,” whereas we usually
mean something that might better be expressed as “prevent from occurring
at the same time.” In a thesaurus, you may find that “cooperate” is a
synonym for “synchronize”—and synchronization is the mechanism by
which threads cooperate to accomplish a task. This book will use the term
“synchronization,” though, because that is what you'll see used, almost
universally.
There are many ways to provide each of these facilities—but they are always
present in some form. The particular choices presented in this book are dictated
by the book's subject—Pthreads. Table 1.1 shows a few examples of the three
facilities in various systems.
                  Execution context   Scheduling             Synchronization

Real traffic      automobile          traffic lights and     turn signals and
                                      signs                  brake lights
UNIX              process             priority (nice)        wait and pipes
(before threads)
Pthreads          thread              policy, priority       condition variables
                                                             and mutexes

TABLE 1.1 Execution contexts, schedulers, and synchronization
A system's scheduling facility may allow each thread to run until it voluntarily
yields the processor to another thread (“run until block”). It may provide time-
slicing, where each thread is forced to periodically yield so that other threads may
run (“round-robin”). It may provide various scheduling policies that allow the
application to control how each thread is scheduled according to that thread's
function. It may provide a “class scheduler” where dependencies between threads
are described so that, for example, the scheduler can ensure that members of a
tightly coupled parallel algorithm are scheduled at the same time.
Synchronization may be provided using a wide variety of mechanisms. Some
of the most common forms are mutexes, condition variables, semaphores, and
events. You may also use message passing mechanisms, such as UNIX pipes,
sockets, POSIX message queues, or other protocols for communicating between
asynchronous processes—on the same system or across a network. Any form of
communication protocol contains some form of synchronization, because passing
data around with no synchronization results in chaos, not communication.
The terms thread, mutex, and condition variable are the main topics of this
book. For now, it is enough to know that a thread represents an “executable thing”
on your computer. A mutex provides a mechanism to prevent threads from collid-
ing unexpectedly, and a condition variable allows a thread, once it has avoided
such a collision, to wait until it is safe to proceed. Both mutexes and condition
variables are used to synchronize the operation of threads.
1.3 Asynchronous programming is intuitive
“In most gardens,” the Tiger-lily said,
“they make the beds too soft—so that the flowers are always asleep.”
This sounded a very good reason, and Alice was quite
pleased to know it.
“I never thought of that before!” she said.
—Lewis Carroll, Through the Looking-Glass
If you haven't been involved in traditional realtime programming, asynchro-
nous programming may seem new and different. But you've probably been using
asynchronous programming techniques all along. You've probably used UNIX, for
example, and, even as a user, the common UNIX shells from sh to ksh have been
designed for asynchronous programming. You've also been using asynchronous
“programming” techniques in real life since you were born.
Most people understand asynchronous behavior much more thoroughly than
they expect, once they get past the complications of formal and restricted
definitions.
1.3.1 . . . because UNIX is asynchronous
In any UNIX system, processes execute asynchronously with respect to each
other, even when there is only a single processor. Yes, until recently it was diffi-
cult to write individual programs for UNIX that behaved asynchronously—but
UNIX has always made it fairly easy for you to behave asynchronously. When you
type a command to a shell, you are really starting an independent program—if
you run the program in the background, it runs asynchronously with the shell.
When you pipe the output of one command to another, you are starting several
independent programs, which synchronize between themselves using the pipe.
| Time is a synchronization mechanism.
In many cases you provide synchronization between a series of processes your-
self, maybe without even thinking about it. For example, you run the compiler
after you've finished editing the source files. It wouldn't occur to you to compile
them first, or even at the same time. That's elementary real-life synchronization.
| UNIX pipes and files can be synchronization mechanisms.
In other cases you may use more complicated software synchronization mech-
anisms. When you type “ls | more” to a shell to pass the output of the ls
command into the more command, you're describing synchronization by specify-
ing a data dependency. The shell starts both commands right away, but the more
command can't generate any output until it receives input from ls through the
pipe. Both commands proceed concurrently (or even in parallel on a multiproces-
sor) with ls supplying data and more processing that data, independently of each
other. If the pipe buffer is big enough, ls could complete before more ever started;
but more can't ever get ahead of ls.
Some UNIX commands perform synchronization internally. For example, the
command “cc -o thread thread.c” might involve a number of separate pro-
cesses. The cc command might be a “front end” to the C language environment,
which runs a filter to expand preprocessor commands (like #include and #if), a
compiler to translate the program into an intermediate form, an optimizer to reor-
der the translation, an assembler to translate the intermediate form into object
language, and a loader to translate that into an executable binary file. As with
ls | more, all these programs may be running at the same time, with synchroniza-
tion provided by pipes, or by access to temporary files.
UNIX processes can operate asynchronously because each process includes
all the information needed to execute code. The operating system can save the
state of one process and switch to another without affecting the operation of
either. Any general-purpose asynchronous “entity” needs enough state to enable
the operating system to switch between them arbitrarily. But a UNIX process
includes additional state that is not directly related to “execution context,” such
as an address space and file descriptors.
A thread is the part of a process that's necessary to execute code. On most
computers that means each thread has a pointer to the thread's current instruc-
tion (often called a “PC” or “program counter”), a pointer to the top of the thread's
stack (SP), general registers, and floating-point or address registers if they are
kept separate. A thread may have other things, such as processor status and
coprocessor control registers. A thread does not include most of the rest of the
state associated with a process; for example, threads do not have their own file
descriptors or address space. All threads within a process share all of the files
and memory, including the program text and data segments.
| Threads are “simpler” than processes.
You can think of a thread as a sort of “stripped down” process, lean and mean
and ready to go. The system can switch between two threads within a process
much faster than it can switch between processes. A large part of this advantage
comes from the fact that threads within a process share the address space—code,
data, stack, everything.
When a processor switches between two processes, all of the hardware state
for that process becomes invalid. Some may need to be changed as part of the
context switch procedure—data cache and virtual memory translation entries
may be flushed, for example. Even when they do not need to be flushed immedi-
ately, however, the data is not useful to the new process. Each process has a
separate virtual memory address space, but threads running within the same
process share the virtual address space and all other process data.
Threads can make high-bandwidth communication easier between indepen-
dent parts of your program. You don't have to worry about message passing
mechanisms like pipes or about keeping shared memory region address refer-
ences consistent between several different address spaces. Synchronization is
faster, and programming is much more natural. If you create or open a file, all
threads can use it. If you allocate a dynamic data structure with malloc, you can
pass the address to other threads and they can reference it. Threads make it easy
to take advantage of concurrency.
1.3.2 . . . because the world is asynchronous
Thinking asynchronously can seem awkward at first, but it'll become natural
with a little practice. Start by getting over the unnatural expectation that every-
thing will happen serially unless you do something “unusual.” On a one-lane
road cars proceed one at a time—but on a two-lane road two cars go at once. You
can go out for a cup of coffee, leaving your computer compiling some code and
fully expecting that it will proceed without you. Parallelism happens everywhere
in the real world, and you expect it.
A row of cashiers in a store serve customers in parallel; the customers in each
line generally wait their turn. You can improve throughput by opening more lines,
as long as there are registers and cashiers to serve them, and enough customers
to be served by them. Creating two lines for the same register may avoid confu-
sion by keeping lines shorter—but nobody will get served faster. Opening three
registers to serve two customers may look good, but it is just a waste of resources.
In an assembly line, workers perform various parts of the complete job in par-
allel, passing work down the line. Adding a station to the line may improve
performance if it parallels or subdivides a step in the assembly that was so com-
plicated that the operator at the next station spent a lot of time waiting for each
piece. Beware of improving one step so much that it generates more work than
the next step on the assembly line can handle.
In an office, each project may be assigned to a “specialist.” Common special-
ties include marketing, management, engineering, typing pool, sales, support,
and so forth. Each specialist handles her project independently on behalf of the
customer or some other specialist, reporting back in some fashion when done.
Assigning a second specialist to some task, or defining narrower specialties (for
example, assigning an engineer or manager permanently to one product) may
improve performance as long as there's enough work to keep her busy. If not,
some specialists play games while others’ in-baskets overflow.
Motor vehicles move in parallel on a highway. They can move at different
speeds, pass each other, and enter and exit the highway independently. The driv-
ers must agree to certain conventions in order to avoid collisions. Despite speed
limits and traffic signs, compliance with the “rules of the road” is mostly volun-
tary. Similarly, threads must be coded to “agree” to rules that protect the program,
or risk ending up undergoing emergency debugging at the thread hospital.
Software can apply parallelism in the same ways you might use it in real life,
and for the same reasons. When you have more than one “thing” capable of doing
work, you naturally expect them to all do work at the same time. A multiproces-
sor system can perform multiple computations, and any time-sharing system can
perform computations while waiting for an external device to respond. Software
parallelism is subject to all of the complications and problems that we have seen
in real life—and the solutions may not be as easy to see or to apply. You need
enough threads, but not too many; enough communication, but not too much. A
key to good threaded programming is learning how to judge the proper balance
for each situation.
Each thread can process similar parts of a problem, just like supermarket
cashiers handling customers. Each thread can perform a specific operation on
each data item in turn, just like the workers on an assembly line. Each thread
can specialize in some specific operation and perform that operation repeatedly
on behalf of other threads. You can combine these basic models in all sorts of
ways: for example, in parallel assembly lines with some steps performed by a pool
of servers.
As you read this book you'll be introduced to concepts that may seem unfamil-
iar: mutexes, condition variables, race conditions, deadlocks, and priority
inversions. Threaded programming may feel daunting and unnatural. But I'll
explain all those concepts as we move through this book, and once you've been
writing multithreaded code for a while you may find yourself noticing real-world
analogies to the concepts. Threads and all this other stuff are formalized and
restricted representations of things you already understand.
If you find yourself thinking that someone shouldn't interrupt you because
you have the conversation mutex locked, you've begun to develop an intuitive
understanding of threaded programming.* You can apply that understanding to
help you design better threaded code with less effort. If something wouldn't make
sense in real life, you probably shouldn't try it in a program either.
1.4 About the examples in this book
This book contains a number of examples. All are presented as complete pro-
grams, and they have been built and tested on Digital UNIX 4.0d and Solaris 2.5.
All of these programs do something, but many do not do anything of any par-
ticular importance. The purpose of the examples is to demonstrate thread
management and synchronization techniques, which are mere overhead in most
real programs. They would be less effective at revealing the details if that “over-
head” was buried within large programs that “did something.”
Within the book, examples are presented in sections, usually one function at a
time. The source code is separated from the surrounding text by a header and
trailer block which include the file name and, if the example comprises more than
one section, a section number and the name of the function. Each line of the
source code has a line number at the left margin. Major functional blocks of each
section are described in specially formatted paragraphs preceding the source
code. These paragraphs are marked by line numbers outside the left margin of
* It may also be a good time to take a break and read some healthy escapist fiction for a while.
the paragraph, denoting the line numbers in the source listing to which the para-
graph refers. Here's an example:
These lines show the header files included in most of the examples. The
<pthread.h> header file declares constants and prototypes for the Pthreads func-
tions, and the errors.h header file includes various other headers and some
error-checking functions.

■ sample.c  part 1  sampleinfo

1  #include <pthread.h>
2  #include "errors.h"

■ sample.c  part 1  sampleinfo
I have written these examples to use error checking everywhere. That is, I
check for errors on each function call. As long as you code carefully, this isn't
necessary, and some experts recommend testing only for errors that can result
from insufficient resources or other problems beyond your control. I disagree,
unless of course you're the sort of programmer who never makes a mistake.
Checking for errors is not that tedious, and may save you a lot of trouble during
debugging.
You can build and run all of the examples for yourself—the source code is
available online at http://www.aw.com/butenhof/posixcode.html. A Makefile is
provided to build all of the examples, though it requires modifications for various
platforms. On Digital UNIX, the examples were built with CFLAGS=-pthread
-std1 -w1. On Solaris, they were built with CFLAGS=-D_REENTRANT -D_POSIX_C_
SOURCE=199506 -lpthread. Some of the examples require interfaces that may not
be in the Pthreads library on your system, for example, clock_gettime, which is
part of the POSIX.1b realtime standard. The additional realtime library is speci-
fied by the RTFLAGS variable, which is defined as RTFLAGS=-lrt on Digital UNIX,
and as RTFLAGS=-lposix4 on Solaris.
On Solaris 2.5 systems, several of the examples require calls to thr_setcon-
currency to ensure proper operation. This function causes Solaris to provide the
process with additional concurrency. In a few cases, the example will not operate
at all without this call, and in other cases, the example would fail to demonstrate
some behavior.
1.5 Asynchronous programming, by example
“In one moment I've seen what has hitherto been
Enveloped in absolute mystery,
And without extra charge I will give you at large
A Lesson in Natural History.”
—Lewis Carroll, The Hunting of the Snark
This section demonstrates some basic asynchronous programming, using a
simple program that does something vaguely useful, by pretending to be an alarm
clock with a command interface for which you would not consider paying a dime
in a store. But then, this book is about threads, not user interfaces, and the code
that I need to show takes up quite enough space already.
The program prompts for input lines in a loop until it receives an error or end
of file on stdin. On each line, the first nonblank token is interpreted as the num-
ber of seconds to wait, and the rest of the line (up to 64 characters) is a message
that will be printed when the wait completes. I will offer two additional versions—
one using multiple processes, and one using multiple threads. We'll use the three
examples to compare the approaches.
1.5.1 The baseline, synchronous version
Include the header file errors.h, which includes standard headers like
<unistd.h> and <stdio.h> and defines error reporting macros that are used
throughout the examples in this book. We don't use the error reporting macros in
this particular example, but consistency is nice, sometimes.
The “baseline” version, alarm.c, is a synchronous alarm program with a sin-
gle routine, main. Most of main is a loop, which processes simple commands until
fgets returns a NULL (error or end of file). Each line is “parsed” with sscanf to
separate the number of seconds to wait (%d, the first sequence of digits) from the
message string to print (%64[^\n], the rest of the line, up to 64 characters exclud-
ing newline).
■ alarm.c

 1  #include "errors.h"
 2
 3  int main (int argc, char *argv[])
 4  {
 5      int seconds;
 6      char line[128];
 7      char message[64];
 8
 9      while (1) {
10          printf ("Alarm> ");
11          if (fgets (line, sizeof (line), stdin) == NULL) exit (0);
12          if (strlen (line) <= 1) continue;
13
14          /*
15           * Parse input line into seconds (%d) and a message
16           * (%64[^\n]), consisting of up to 64 characters
17           * separated from the seconds by whitespace.
18           */
19          if (sscanf (line, "%d %64[^\n]",
20              &seconds, message) < 2) {
21              fprintf (stderr, "Bad command\n");
22          } else {
23              sleep (seconds);
24              printf ("(%d) %s\n", seconds, message);
25          }
26      }
27  }

■ alarm.c
The problem with the program alarm.c is that only one alarm request can be
active at a time. If you set an alarm to remind you to do something in 10 minutes
(600 seconds), you can't decide to have it remind you of something else in 5 min-
utes. The program is doing something synchronously that you would probably
like to be asynchronous.
1.5.2 A version using multiple processes
There are lots of ways to make this program asynchronous; for example, you
could run more than one copy of the program. One way to run multiple copies is
to fork a child process for each command, as shown in alarm_fork.c. The new
version is asynchronous—you can enter commands at any time. and they will be
carried out independently. It isn't much more complicated than the original,
which is nice.
The main difference between alarm.c and alarm_fork.c is that instead of
calling sleep directly, it uses fork to create a new child process, which then calls
sleep (and, eventually, printf) asynchronously, while the parent process
continues.
The primary complication in this version is the need to “reap” any child pro-
cesses that have terminated. If the program fails to do this, the system will save
them all until the program terminates. The normal way to reap terminated child
processes is to call one of the wait functions. In this case, we call waitpid, which
allows the caller to specify the WNOHANG flag. The function will immediately reap
one child process if any have terminated, or will immediately return with a pro-
cess ID (pid) of 0. The parent process continues to reap terminated child
processes until there are no more to reap. When the loop terminates, main loops
back to line 13 to read a new command.
■ alarm_fork.c

 1  #include <sys/types.h>
 2  #include <wait.h>
 3  #include "errors.h"
 4
 5  int main (int argc, char *argv[])
 6  {
 7      int status;
 8      char line[128];
 9      int seconds;
10      pid_t pid;
11      char message[64];
12
13      while (1) {
14          printf ("Alarm> ");
15          if (fgets (line, sizeof (line), stdin) == NULL) exit (0);
16          if (strlen (line) <= 1) continue;
17
18          /*
19           * Parse input line into seconds (%d) and a message
20           * (%64[^\n]), consisting of up to 64 characters
21           * separated from the seconds by whitespace.
22           */
23          if (sscanf (line, "%d %64[^\n]",
24              &seconds, message) < 2) {
25              fprintf (stderr, "Bad command\n");
26          } else {
27              pid = fork ();
28              if (pid == (pid_t)-1)
29                  errno_abort ("Fork");
30              if (pid == (pid_t)0) {
31                  /*
32                   * In the child, wait and then print a message
33                   */
34                  sleep (seconds);
35                  printf ("(%d) %s\n", seconds, message);
36                  exit (0);
37              } else {
38                  /*
39                   * In the parent, call waitpid() to collect children
40                   * that have already terminated.
41                   */
42                  do {
43                      pid = waitpid ((pid_t)-1, NULL, WNOHANG);
44                      if (pid == (pid_t)-1)
45                          errno_abort ("Wait for child");
46                  } while (pid != (pid_t)0);
47              }
48          }
49      }
50  }

■ alarm_fork.c
1.5.3 A version using multiple threads
Now, let us try another alarm program, alarm_thread.c. It is much like the
fork version in alarm_fork.c, except that it uses threads instead of child pro-
cesses to create asynchronous alarms. Four Pthreads calls are used in this
program:
• pthread_create creates a thread running the routine specified in the third
  argument (alarm_thread), returning an identifier for the new thread to the
  variable referenced by thread.
• pthread_detach allows Pthreads to reclaim the thread's resources as soon
  as it terminates.
• pthread_exit terminates the calling thread.
• pthread_self returns the calling thread's identifier.
The alarm_t structure defines the information stored for each alarm com-
mand, the number of seconds until the alarm is due, and the message string that
will be printed by the thread.
■ alarm_thread.c  part 1  definitions

1  #include <pthread.h>
2  #include "errors.h"
3
4  typedef struct alarm_tag {
5      int         seconds;
6      char        message[64];
7  } alarm_t;

■ alarm_thread.c  part 1  definitions
The alarm_thread function is the “alarm thread.” That is, each alarm thread
is created running this function, and when the function returns the thread termi-
nates. The function's argument (void *arg) is the fourth argument that was
passed to pthread_create, in this case, a pointer to the control packet (alarm_t)
created for the alarm request that the thread is to satisfy. The thread starts by
“mapping” the void * argument as a pointer to a control packet. The thread
detaches itself by calling pthread_detach, which informs Pthreads that the appli-
cation does not need to know when the thread terminates or its termination
status.
The thread sleeps for the number of seconds specified in its control packet,
and then prints the message string. Finally, the thread frees the control packet
and returns. When a thread returns from its initial routine, as it does here, the
thread terminates. Normally, Pthreads would hold the thread's resources so that
another thread could later determine that it had exited and retrieve a final result.
Because the thread detached itself, none of that is necessary.
18 CHAPTER 1 Introduction
alarm_thread.c  part 2  alarm_thread

 1  void *alarm_thread (void *arg)
 2  {
 3      alarm_t *alarm = (alarm_t*)arg;
 4      int status;
 5
 6      status = pthread_detach (pthread_self ());
 7      if (status != 0)
 8          err_abort (status, "Detach thread");
 9      sleep (alarm->seconds);
10      printf ("(%d) %s\n", alarm->seconds, alarm->message);
11      free (alarm);
12      return NULL;
13  }

alarm_thread.c  part 2  alarm_thread
The main program of alarm_thread.c is much the same as the other two vari-
ants. It loops, reading and interpreting command lines as long as it can read them
from stdin. An alarm thread is created, running function alarm_thread, with the
alarm data (alarm_t) as the thread's argument.
alarm_thread.c  part 3  main

int main (int argc, char *argv[])
{
    int status;
    char line[128];
    alarm_t *alarm;
    pthread_t thread;

    while (1) {
        printf ("Alarm> ");
        if (fgets (line, sizeof (line), stdin) == NULL) exit (0);
        if (strlen (line) <= 1) continue;
        alarm = (alarm_t*)malloc (sizeof (alarm_t));
        if (alarm == NULL)
            errno_abort ("Allocate alarm");

        /*
         * Parse input line into seconds (%d) and a message
         * (%64[^\n]), consisting of up to 64 characters
         * separated from the seconds by whitespace.
         */
        if (sscanf (line, "%d %64[^\n]",
                &alarm->seconds, alarm->message) < 2) {
            fprintf (stderr, "Bad command\n");
            free (alarm);
        } else {
            status = pthread_create (
                &thread, NULL, alarm_thread, alarm);
            if (status != 0)
                err_abort (status, "Create alarm thread");
        }
    }
}

alarm_thread.c  part 3  main
1.5.4 Summary
A good way to start thinking about threads is to compare the two asynchro-
nous versions of the alarm program. First, in the fork version, each alarm has an
independent address space, copied from the main program. That means we can
put the seconds and message values into local variables—once the child has
been created (when fork returns), the parent can change the values without
affecting the alarm. In the threaded version, on the other hand, all threads share
the same address space—so we call malloc to create a new structure for each
alarm, which is passed to the new thread. The extra bookkeeping required intro-
duces a little complexity into the threaded version.
In the version using fork, the main program needs to tell the kernel to free
resources used by each child process it creates, by calling waitpid or some other
member of the wait "family." The alarm_fork.c program, for example, calls
waitpid in a loop after each command, to collect all child processes that have
completed. You do not need to wait for a thread unless you need the thread's
return value—in alarm_thread.c, for example, each alarm thread detaches itself
(at line 6, part 2) so that the resources held by the thread will be returned imme-
diately when it terminates.
In the threaded version, the "primary activities" (sleeping and printing the
message) must be coded in a separate routine. In alarm.c and alarm_fork.c,
those activities were performed without a call. In simple cases such as our alarm
program, it is often easier to understand the program with all code in one place,
so that might seem like an advantage for alarm_fork.c. In more complicated pro-
grams, though, it is rare that a program's "primary activities" are so simple that
they can be performed in a single routine without resulting in total confusion.
In a real alarm program, you wouldn't want to create a process for each alarm.
You might easily have hundreds of alarms active, and the system probably
wouldn't let you create that many processes. On the other hand, you probably can
create hundreds of threads within a process. While there is no real need to main-
tain a stack and thread context for each alarm request, it is a perfectly viable
design.
A more sophisticated version of alarm_thread.c might use only two threads:
one to read input from the user, and another to wait for expiration of the next
alarm—I'll show that version later, after we've worked through some more basic
concepts. You could do the same thing with two processes, of course, but it would
be more cumbersome. Passing information between two threads is easy and
fast—no shared memory to map, no pipes to read or write, no concerns about
whether you are passing addresses that may not mean the same thing in both
processes. Threads share everything in their address space—any address that's
valid in one thread is valid in all threads.
1.6 Benefits of threading
"O Looking-Glass creatures," quoth Alice, "draw near!
'Tis an honour to see me, a favour to hear:
'Tis a privilege high to have dinner and tea
Along with the Red Queen, the White Queen, and me!"
Lewis Carroll, Through the Looking-Glass
Some advantages of the multithreaded programming model follow:
1. Exploitation of program parallelism on multiprocessor hardware. Parallel-
ism is the only benefit that requires special hardware. The others can help
most programs without special hardware.
2. More efficient exploitation of a program's natural concurrency, by allowing
the program to perform computations while waiting for slow I/O opera-
tions to complete.
3. A modular programming model that clearly expresses relationships
between independent "events" within the program.
These advantages are detailed in the following sections.
1.6.1 Parallelism
On a multiprocessor system, threading allows a process to perform more than
one independent computation at the same time. A computation-intensive
threaded application running on two processors may achieve nearly twice the
performance of a traditional single-threaded version. "Nearly twice" takes into
account the fact that you'll always have some overhead due to creating the extra
thread(s) and performing synchronization. This effect is often referred to as
"scaling." A two-processor system may perform 1.95 times as fast as a single pro-
cessor, a three-processor system 2.9 times as fast, a four-processor system 3.8
times as fast, and so forth. Scaling almost always falls off as the number of pro-
cessors increases because there's more chance of lock and memory collisions,
which cost time.
Scaling can be predicted by "Amdahl's law," which is shown in Figure 1.1. In
the equation, p represents the ratio of "parallelizable code" over "total execution
time," and n represents the number of processors the code can use. The total
elapsed time for a parallel job is the sum of the elapsed time for the nonparalleliz-
able work (1 - p) and the elapsed time for each processor executing the
parallelizable work (p/n).
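The equation of Figure 1.1 does not survive in this text; reconstructed from the description above (serial fraction 1 - p plus parallel fraction p spread over n processors), it is:

```latex
\mathrm{speedup} = \frac{1}{(1 - p) + \dfrac{p}{n}}
```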
Amdahl's law is a simple relationship showing how parallelism is limited by
the amount of serialization needed. When the program has no parallelizable code
(p is 0), the speedup is 1. That is, it is not a parallel program. If the program
requires no synchronization or other serial code (p is 1), then the speedup is n
(the number of processors). As more synchronization is required, parallelism pro-
vides less benefit. To put it another way, you'll get better scaling with activities
that are completely independent than with activities that are highly dependent:
The independent activities need less synchronization.
The diagram in Figure 1.2 shows the effect of Amdahl's law. "Clock time"
progresses from left to right across the page, and the diagram shows the number
of processors working in parallel at any moment. Areas where the diagram has
only a single horizontal line show that the process is serialized. Areas that have
several horizontal lines in parallel show where the process benefits from multiple
processors. If you can apply multiple processors for only 10% of your program's
execution time, and you have four processors, then Amdahl's law predicts a
speedup of 1/(0.9 + (0.1/4)), or about 8%.
[FIGURE 1.2: Parallelism charted against time]
Operations on large matrices can often be "parallelized" by splitting the matrix
into pieces. For example, each thread may be able to operate on a set of rows or
columns without requiring any data written by threads operating on other slices.
You still generally need to synchronize threads at the beginning and end of pro-
cessing the matrix, frequently using a barrier.* Amdahl's law shows that you'll get
better performance by giving each thread a large and relatively independent
"chunk" of work, requiring infrequent synchronization, than by giving them
smaller chunks.
Amdahl's law is an excellent thought exercise to help you understand scaling.
It is not, however, a practical tool, because it is nearly impossible to accurately
compute p for any program. To be accurate, you need to consider not only all
serialized regions within your code, but also within the operating system kernel
and even in hardware. Multiprocessor hardware must have some mechanism to
synchronize access to the contents of memory. When each processor has a pri-
vate data cache, the contents of those caches must be kept consistent with each
other and with the data in memory. All of this serialization must be included in
any accurate calculation.
1.6.2 Concurrency
The threaded programming model allows the program to make computational
progress while waiting for blocking operations like I/O. This is useful for network
servers and clients, and it is the main reason that client/server systems (such as
OSF DCE) use threads. While one thread waits for a long network read or write
operation, that thread is blocked and other threads in your application can exe-
cute independently. Some systems support asynchronous I/O operations, which
can give similar advantages; but most UNIX-based systems do not have asynchro-
nous I/O.† Furthermore, asynchronous I/O is generally a lot more complicated to
use than threads.
For example, you need to either handle asynchronous notification when the
I/O completes, or poll for completion. If you issued an asynchronous I/O and
then entered a polling loop, you would lose the advantage of asynchronous I/O—
your application would just wait. If you poll elsewhere, or handle asynchronous
notification, then issuing the I/O and processing the resulting data occur in dif-
ferent locations within your program. That makes the code more difficult to
* A barrier is a simple synchronization mechanism that blocks each thread until a certain
number has reached the barrier; then all threads are unblocked. Barriers can be used, for
example, to keep any thread from executing a parallel region of code until all threads are ready
to execute the region. Section 7.1.1 describes barriers in more detail, and demonstrates the con-
struction of a simple barrier package.
† UNIX systems support "nonblocking I/O," but this is not the same thing as asynchronous
I/O. Nonblocking I/O allows the program to defer issuing an I/O operation until it can complete
without blocking, but asynchronous I/O can proceed while the program does something else.
analyze and maintain. When you use synchronous I/O you just perform the I/O
and then do whatever comes next. Synchronous I/O within multiple threads
gives nearly all the advantages of asynchronous I/O. In most cases you will find it
much easier to write complex asynchronous code using threads than using tradi-
tional asynchronous programming techniques.
You could write an alarm program, like those shown in Section 1.5, as an
asynchronous program without using processes or threads, with timer signals for
the alarms and asynchronous reads for input. Using timer signals is more com-
plicated in many ways, since you are severely limited in what you can do within a
signal handler. Asynchronous I/O does not allow you to take advantage of the
convenience of stdio functions. The basic program function will be scattered
through a series of signal handlers and functions, and will probably be harder to
understand.
Asynchronous I/O does have one advantage over threaded concurrency,
though. Just as a thread is usually "cheaper" (in execution time and storage
space) than a process, the context required for an asynchronous I/O operation is
almost always cheaper than a thread. If you plan to have a lot of asynchronous
I/O operations active at the same time, that might be important enough to justify
using the more complicated programming model. But watch out—some "asyn-
chronous I/O" packages just distribute your I/O requests across a pool of
threads! Most of the time you will be better off using threads.
Another method of coding an asynchronous application is for each action to
be treated as an "event." Events are queued by some "hidden" process, and dis-
patched serially to be handled by the application, usually through "callback"
routines registered with the dispatcher. Event dispatchers have been popularized
by windowing interface systems such as the Apple Macintosh toolbox, Microsoft
Windows, and X Windows on UNIX (used by Motif and CDE).
The event mechanism alleviates much of the complication of using signals and
asynchronous I/O, as long as the events are supported directly by the event dis-
patcher. All, for example, handle input from the keyboard or pointer device, and
generally one can request a timer event to be inserted automatically at a desired
time. Thus, the alarm program, written to an event interface, need only initialize
the event dispatcher and enter a loop to process events. Input events would be
dispatched to the parser, resulting in a request for a new timer event; and timer
events would be dispatched to a function that would format and print the alarm
message.
For very simple applications (and the alarm program here is certainly one
example), an event-based implementation may be simpler than the multiprocess
or multithread variations I've shown—at least when the (often substantial) over-
head of initializing the event dispatcher is removed. The limitations of events
become more obvious when you build larger and more sophisticated applica-
tions—the problem is that the events are sequential.
Events are not concurrent, and the program can do only one thing at a time.
Your application receives an event, processes it, and then receives the next event.
If processing an event takes a long time, for example, sorting a large database,
the user interface may remain unresponsive for quite a while. If an event involves
a long wait, for example, reading data over a slow network connection, then,
again, the user must wait.
The response problems can be minimized by liberally sprinkling extended
operations with calls to the event dispatcher—but getting them in the right place,
without substantially impacting the performance of the operation, can be diffi-
cult. Furthermore, you may not have that option, if the database sort is taking
place in a shared library you bought from somebody else.
On the other hand, one might code the application to create a new thread that
runs the database sort, or reads from the slow network, leaving the "user inter-
face" thread to immediately request another event. The application becomes
responsive, while the slow operation continues to run. You can do this even if a
database package, for example, cannot tolerate being run in multiple threads, by
queuing a "sort" command to a server thread that runs database operations
serially—while still retaining interface responsiveness.
1.6.3 Programming model
It may be surprising that programming with threads is a good idea even if you
know your code will never run on a multiprocessor. But it is true. Writing with
threads forces you to think about and plan for the synchronization requirements
of your program. You've always had to think about program dependencies, but
threads help to move the requirements from comments into the executable struc-
ture of the program.
Assembly language programs can use all the same sequential control struc-
tures (loops, conditional code) as programs written in a high-level language.
However, it can be difficult to determine whether a branch instruction represents
the top or bottom of a loop, a simple conditional, a "conditional goto," or some-
thing more exotic. Switching to a higher-level language that supports these
sequential controls directly in source, for example, the C language do, while, for,
if, and switch statements, makes these sequential programming constructs
explicit in the source language. Making control structures explicit in the program
source code means that more of your program's design is explicit in the source,
and that makes it easier for someone else to understand.
Similarly, a C language program (or even an assembler program) may use data
encapsulation and polymorphism by adhering to programming conventions, and
with luck those conventions may be carefully documented and the documenta-
tion kept updated. But if that same code is written in an object-oriented language,
the encapsulation and polymorphism become explicit in the source language.
In a sequential program, synchronization requirements are implicit in the
ordering of operations. The true synchronization requirements, for example, that
"a file must be opened before data can be read from the file," may be documented
only by source comments, if at all. When you program using threads, sequential
assumptions are (or at least should be) limited to small segments of contiguous
code—for example, within a single function. More global assumptions, to be at all
safe, must be protected by explicit synchronization constructs.
In traditional serial programming you call function A to do one thing, then call
another function B to do something else, even when those two functions don't
require serialization. If a developer is trying to determine what the program is
doing, perhaps to trace a bug, it often isn't obvious that there may be no need to
follow both calls. Furthermore, the strictly serial model makes it easy for some-
one to inadvertently make function B dependent on some side effect of function A.
If a later modification reverses the order of the calls, the program may break in
ways that aren't obvious. Program dependencies may be documented using source
code comment blocks, but comments are often ambiguous and may not be prop-
erly updated when code is changed.
The threaded programming model isolates independent or loosely coupled
functional execution streams (threads) in a clear way that's made explicit in the
program's source code. If activities are designed as threads, each function must
include explicit synchronization to enforce its dependencies. Because synchroni-
zation is executable code, it can't be ignored when dependencies are changed.
The presence of synchronization constructs allows anyone reading the code to
follow temporal dependencies within the code, which can make maintenance
substantially easier, especially for large programs with a lot of independent code.
An assembly language programmer can write better, more maintainable assem-
bly code by understanding high-level language programming; a C language
programmer can write better, more maintainable C code by understanding object-
oriented programming. Even if you never write a threaded program, you may ben-
efit from understanding the threaded programming model of independent
functions with explicit dependencies. These are "mental models" (or that dread-
fully overused word, "paradigms") that are more or less independent of the specific
code sequences you write. Cleanly isolating functionally independent code may
even make sequential programs easier to understand and maintain.
1.7 Costs of threading
All this time the Guard was looking at her, first through a telescope, then
through a microscope, and then through an opera-glass.
At last he said, "You're traveling the wrong way,"
and shut up the window, and went away.
Lewis Carroll, Through the Looking-Glass
Of course there's always "the flip side." As I showed in the previous section,
threads provide definite and powerful advantages, even on uniprocessor systems.
They provide even more advantages on a multiprocessor.
So why wouldn't you want to use threads? Everything has a cost, and
threaded programming is no exception. In many cases the advantage exceeds the
cost; in others it doesn't. To be fair, the following subsections discuss the cost of
threaded programming.

1.7.1 Computing overhead
Overhead costs in threaded code include direct effects such as the time it
takes to synchronize your threads. Many clever algorithms are available for
avoiding synchronization in some cases, but none of them is portable. You'll have
to use some synchronization in just about any threaded code. It is easy to lose
performance by using too much synchronization; for example, by separately pro-
tecting two variables that are always used together. Protecting each variable
separately means you spend a lot more time on synchronization without gaining
parallelism, since any thread that needs one variable will need the other as well.
The overhead of threaded programming can also include more subtle effects.
For example, threads that constantly write the same memory locations may spend
a lot of time synchronizing the memory system on processors that support "read/
write ordering." Other processors may spend that time synchronizing only when
you use special instructions such as a memory barrier, or a "multiprocessor
atomic" operation like test-and-set. Section 3.4 says a lot more about these effects.
Removing a bottleneck in your code, for example, by adding threads to perform
multiple concurrent I/O operations, may end up revealing another bottleneck at a
lower level—in the ANSI C library, the operating system, the file system, the device
driver, the memory or I/O architecture, or the device controller. These effects are
often difficult to predict, or measure, and are usually not well documented.
A compute-bound thread, which rarely blocks for any external event, cannot
effectively share a processor with other compute-bound threads. An I/O thread
might interrupt it once in a while, but the I/O thread would block for another
external event and the compute-bound thread would run again. When you create
more compute-bound threads than there are available processors, you may gain
better code structuring over a single-threaded implementation, but you will have
worse performance. The performance suffers because the multithreaded imple-
mentation adds thread synchronization and scheduling overhead to the work you
wanted to accomplish—and does it all using the same compute resources.
1.7.2 Programming discipline
Despite the basic simplicity of the threaded programming model, writing real-
world code is never trivial. Writing code that works well in multiple threads takes
careful thought and planning. You have to keep track of synchronization proto-
cols and program invariants. You have to avoid deadlocks, races, and priority
inversions. I'll describe all of these things in later sections, show how to design
code to avoid the problems, and how to find and repair them after the fact.
You will almost certainly use library code that you did not write. Some will be
supplied with the operating system you use, and most of the more common
libraries will likely be safe to use within multiple threads. POSIX guarantees that
most functions specified by ANSI C and POSIX must be safe for use by multi-
threaded applications. However, a lot of "interesting" functions you will probably
need are not included in that list. You will often need to call libraries that are not
supplied with the operating system, for example, database software. Some of that
code will not be thread-safe. I will discuss techniques to allow you to use most
unsafe code, but they will not always work, and they can be ugly.
All threads within a process share the same address space, and there's no
protection boundary between the threads. If a thread writes to memory through
an uninitialized pointer, it can wipe out another thread's stack, or heap memory
being used by some other thread. The eventual failure will most likely occur in
the innocent victim, possibly long after the perpetrator has gone on to other
things. This can be especially important if arbitrary code is run within a thread.
For example, in a library that supports callbacks to functions supplied by its
caller, be sure that the callback, as well as the library, is thread-safe.
The important points are that good sequential code is not necessarily good
threaded code, and bad threaded code will break in ways that are more difficult to
locate and repair. Thinking about real-life parallelism can help a lot, but pro-
gramming requires a lot more detailed work than most things in real life.
1.7.3 Harder to debug
You will learn more about debugging threaded code, and, more importantly,
about not debugging threaded code, in Chapter 8. You will see some of the tools you
may encounter as well as some techniques you can use on your own. By then
you will know all about mutexes and memory visibility, and you will be ready to
deal with deadlocks and races. Don't worry about the details now—the point of
this brief section is to demonstrate that you will have to learn about threaded
debugging, and it is not as easy yet as anyone would like it to be. (So when was
debugging ever easy?)
Systems that support threads generally extend traditional sequential debug-
ging tools to provide basic debugging support. The system may provide a
debugger that allows you to see the call tree for all of your program's threads, for
example, and set breakpoints that activate only in particular threads. The system
may provide some form of performance analyzer that lets you measure the pro-
cessor time accumulated within each function for a specific thread or across all
threads.
Unfortunately that's only the beginning of the problems when you're debug-
ging asynchronous code. Debugging inevitably changes the timing of events. That
doesn't matter much when you're debugging sequential code, but it is critical
when you're debugging asynchronous code. If one thread runs even slightly
slower than another because it had to process a debugger trap, the problem you're
trying to track down may not happen. Every programmer has run into problems
that won't reproduce under the debugger. You'll run into a lot more of them when
you use threads.
It is difficult to track down a memory corruptor, for example, a function that
writes through an uninitialized pointer, in a sequential program. It is even harder
in a threaded program. Did some other thread write to memory without using a
mutex? Did it use the wrong mutex? Did it count on another thread setting up a
pointer without explicit synchronization? Was it just an old-fashioned sequential
memory corruptor?
Various additional tools are provided by some systems to help you. None of
these is standard or widely available. Tools may check source code for obvious
violations of locking protocol, given a definition of which variables are shared and
how they should be locked. They may record thread interactions while the pro-
gram runs, and allow you to analyze or even replay the interactions to determine
what happened. They may record and measure synchronization contention and
overhead. They may detect complicated deadlock conditions between a set of
mutexes.
Your most powerful and portable thread debugging tool is your mind, applied
through the old-fashioned manual human-powered code review. You'll probably
spend a lot of time setting up a few breakpoints and examining lots of states to try
to narrow the problem down a little and then carefully reading the code to find
problems. It is best if you have someone available who didn't write the code,
because a lot of the worst errors are embarrassingly obvious to someone who's
not burdened with detailed knowledge of what the code was supposed to do.
1.8 To thread or not to thread?
"My poor client's fate now depends on your votes."
Here the speaker sat down in his place,
And directed the Judge to refer to his notes,
And briefly to sum up the case.
Lewis Carroll, The Hunting of the Snark
Threads don't necessarily provide the best solution to every programming
problem. They're not always easier to use, and they don't always result in better
performance.
A few problems are really "inherently nonconcurrent," and adding threads will
only slow the program down and complicate it. If every step in your program
depends directly on the result of the previous step, then using threads probably
won't help. Each thread would have to wait for another thread to complete.
The most obvious candidates for threaded coding are new applications that
accomplish the following:
1. Perform extensive computation that can be parallelized (or "decomposed")
into multiple threads, and which is intended to run on multiprocessor
hardware, or
2. Perform substantial I/O which can be overlapped to improve throughput—
many threads can wait for different I/O requests at the same time. Distrib-
uted server applications are good candidates, since they may have work to
do in response to multiple clients, and they must also be prepared for
unsolicited I/O over relatively slow network connections.
Most programs have some natural concurrency, even if it is only reading a
command from the input device while processing the previous command.
Threaded applications are often faster, and usually more responsive, than
sequential programs that do the same job. They are generally much easier to
develop and maintain than nonthreaded asynchronous applications that do the
same job.
So should you use threads? You probably won't find them appropriate for
every programming job you approach. But threaded programming is a technique
that all software developers should understand.
1.9 POSIX thread concepts
"You seem very clever at explaining words, Sir," said Alice.
"Would you kindly tell me the meaning of the poem
called 'Jabberwocky'?"
"Let's hear it," said Humpty Dumpty. "I can explain all
the poems that ever were invented—and a good many
that haven't been invented just yet."
Lewis Carroll, Through the Looking-Glass
First of all, this book focuses on "POSIX threads." Technically, that means the
thread "application programming interfaces" (API) specified by the international
formal standard POSIX 1003.1c-1995. This standard was approved by the IEEE
in June 1995. A new edition of POSIX 1003.1, called ISO/IEC 9945-1:1996 (ANSI/
IEEE Std 1003.1, 1996 Edition) is available from the IEEE.* This new document
includes 1003.1b-1993 (realtime), 1003.1c-1995 (threads), and 1003.1i-1995
(corrections to 1003.1b-1993). Unless you are writing an implementation of the
standard, or are extremely curious, you probably don't want to bother buying the

* Contact the IEEE at 1-800-678-IEEE. 9945-1:1996 Information Technology—Portable Opera-
ting System Interface (POSIX)—Part 1: System Application Program Interface (API) [C Language],
ISBN 1-55937-573-6, order number SH94352.
POSIX standard. For writing threaded code, you'll find books like this one much
more useful, supplemented by the programming documentation for the operating
system you're using.
As I explained in the preface, I will use the informal term "Pthreads" to refer to
"POSIX 1003.1c-1995." I will use the slightly more formal term "POSIX.1b" to
refer to "POSIX 1003.1b-1993" in the text, "POSIX.14" to refer to the POSIX
1003.14 "Multiprocessor Profile," and similar abbreviated notation for other
POSIX standards. I'll use the full names where precision is important, for exam-
ple, to compare POSIX 1003.1-1990 and POSIX 1003.1-1996, and also in section
titles and captions that appear in the table of contents.
1.9.1 Architectural overview
You may remember from Section 1.2 that the three essential aspects of a
thread system are execution context, scheduling, and synchronization. When
you evaluate any thread system, or compare any two thread systems, start by
categorizing the features into capabilities that support execution contexts,
scheduling, and synchronization.
With Pthreads, you create an execution context (thread) by calling pthread_
create. Creating a thread also schedules the thread for execution, and it will
begin by calling a “thread start function” that you specified. Pthreads allows you
to specify scheduling parameters either at the time you create the thread, or later
on while the thread is running. A thread normally terminates when it calls
pthread_exit, or returns from the thread start function, although we will
encounter a few other possibilities later.
The primary Pthreads synchronization model uses mutexes for protection and
condition variables for communication. You can also use other synchronization
mechanisms such as semaphores, pipes, and message queues. A mutex allows
one thread to lock shared data while using it so that other threads cannot
accidentally interfere. A condition variable allows a thread to wait for shared data
to reach some desired state (such as “queue not empty” or “resource available”).
1.9.2 Types and interfaces
This section briefly outlines the Pthreads data types, and some of the rules for
interpreting them. For a full description of the “object” represented by each type
and how to create and use those objects in a program, see the appropriate
sections later in this book, as shown in Table 1.2.
Type                   Section   Description
pthread_t              2         thread identifier
pthread_mutex_t        3.2       mutex
pthread_cond_t         3.3       condition variable
pthread_key_t          5.4       “thread-specific data” key
pthread_attr_t         5.2.3     thread attributes object
pthread_mutexattr_t    5.2.1     mutex attributes object
pthread_condattr_t     5.2.2     condition variable attributes object
pthread_once_t         5.1       “one time initialization” control context
TABLE 1.2 POSIX threads types
All Pthreads types are “opaque.”
Portable code cannot make assumptions regarding the representation
of these types.
All of the “pthread” types listed in Table 1.2 are considered opaque. There is no
public definition of these types’ representation, and programmers should never
assume anything about the representation. You should use them only in the
manner specifically described by the standard. A thread identifier, for example, may be
an integer, or a pointer, or a structure, and any code that uses a thread identifier
in a way that is not compatible with all of those definitions is incorrect.
1.9.3 Checking for errors
Pthreads introduces a new way to report errors, without using the
errno variable.
The Pthreads amendment is the first part of POSIX to depart from the ancient
UNIX and C language conventions regarding error status. Traditionally, functions
that succeeded returned a useful value if appropriate, or otherwise indicated
success by returning the value 0. On failure, they returned the special value -1, and
set the global value errno to a code specifying the type of error.
The old mechanism has a number of problems, including the fact that it is
difficult to create a function that can both report an error and return a useful value
of -1. There are even worse problems when you add multiple threads to a process.
In traditional UNIX systems, and in the original POSIX.1-1990 standard,
errno was an extern int variable. Since such a variable can have only one value
at a time, it can support only a single stream of execution within the process.
Pthreads functions don't set errno on errors!
(But most other POSIX functions do.)
New functions in the Pthreads standard reserve the return value for error
status, and errno is not used. Pthreads functions return the value 0 on success,
and include an extra output parameter to specify an address where “useful
results” are stored. When a function cannot complete successfully, an error code
from the <errno.h> header file is returned instead of 0 as the function value.
Pthreads also provides a per-thread errno, which supports other code that
uses errno. This means that when one thread calls some function that reports an
error using errno, the value cannot be overwritten, or read, by any other thread—
you may go on using errno just as you always have. But if you're designing new
interfaces you should consider following the new Pthreads convention for
reporting errors. Setting or reading the per-thread errno involves more overhead than
reading or writing a memory location, or returning a value from a function.
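To make the convention concrete, here is a hedged sketch of what a new interface following the Pthreads style might look like (the divide function and its choice of error codes are invented for illustration): the return value is reserved for an error code, and the useful result goes through an output parameter.

```c
#include <errno.h>
#include <stddef.h>

/* A hypothetical function following the Pthreads convention:
   return 0 on success or an error code from <errno.h> on
   failure; the "useful result" is stored through an output
   parameter rather than returned. */
int divide (int dividend, int divisor, int *result)
{
    if (result == NULL)
        return EINVAL;          /* bad argument */
    if (divisor == 0)
        return EDOM;            /* an error code, not -1 and errno */
    *result = dividend / divisor;
    return 0;
}
```

A caller checks the returned status directly, with no trip through errno: `int quotient; if (divide (10, 2, &quotient) != 0) ...`.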
To wait for a thread, for example, and check for an error, you might use code
like that shown in the following code example, thread_error.c. The pthread_join
function, used to wait for a thread to terminate, will report an invalid thread
identifier by returning the error code ESRCH. An uninitialized pthread_t is likely
to be an invalid thread identifier on most implementations. The result of running
this program should be a message such as “error 3: no such process.”
In the unlikely event that the uninitialized thread variable has a pthread_t
value that is not invalid, it should be the ID of the initial thread (there are no
other threads in this process). In this case, pthread_join should either fail with
EDEADLK, if your implementation of Pthreads detects self-deadlock, or the thread
will hang waiting for itself to exit.
thread_error.c
#include <pthread.h>
#include <stdio.h>
#include <errno.h>
#include <string.h>

int main (int argc, char *argv[])
{
    pthread_t thread;
    int status;

    /*
     * Attempt to join with an uninitialized thread ID. On most
     * implementations, this will return an ESRCH error code. If
     * the local (and uninitialized) pthread_t happens to be a valid
     * thread ID, it is almost certainly that of the initial thread,
     * which is running main(). In that case, your Pthreads
     * implementation may either return EDEADLK (self-deadlock),
     * or it may hang. If it hangs, quit and try again.
     */
    status = pthread_join (thread, NULL);
    if (status != 0)
        fprintf (stderr, "error %d: %s\n", status, strerror (status));
    return status;
}
thread_error.c
Note that there is no equivalent to the perror function to format and print an
error value returned by the Pthreads interfaces. Instead, use strerror to get a
string description of the error number, and print the string to the file stream
stderr.
To avoid cluttering each function call in the example programs with a block of
code to report each error and call abort, I have built two error macros—err_abort
detects a standard Pthreads error, and errno_abort is used when a value of -1
means that errno contains an error code. The following header file, called
errors.h, shows these macros. The errors.h header file also includes several
system header files, which would otherwise be required by most of the example
programs—this helps to reduce the size of the examples.
#ifndef __errors_h
#define __errors_h

#include <unistd.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Define a macro that can be used for diagnostic output from
 * examples. When compiled -DDEBUG, it results in calling printf
 * with the specified argument list. When DEBUG is not defined, it
 * expands to nothing.
 */
#ifdef DEBUG
# define DPRINTF(arg) printf arg
#else
# define DPRINTF(arg)
#endif

/*
 * NOTE: the "do {" ... "} while (0);" bracketing around the macros
 * allows the err_abort and errno_abort macros to be used as if they
 * were function calls, even in contexts where a trailing ";" would
 * generate a null statement. For example,
 *
 *      if (status != 0)
 *          err_abort (status, "message");
 *      else
 *          return status;
 *
 * will not compile if err_abort is a macro ending with "}", because
 * C does not expect a ";" to follow the "}". Because C does expect
 * a ";" following the ")" in the do...while construct, err_abort and
 * errno_abort can be used as if they were function calls.
 */
#define err_abort(code,text) do { \
    fprintf (stderr, "%s at \"%s\":%d: %s\n", \
        text, __FILE__, __LINE__, strerror (code)); \
    abort (); \
    } while (0)
#define errno_abort(text) do { \
    fprintf (stderr, "%s at \"%s\":%d: %s\n", \
        text, __FILE__, __LINE__, strerror (errno)); \
    abort (); \
    } while (0)

#endif
The one exception to the Pthreads error rules is pthread_getspecific, which
returns the thread-specific data value of a shared “key.” Section 5.4 describes
thread-specific data. The pthread_getspecific function doesn't report errors at
all. If the pthread_key_t value is illegal, or if no value has been set in the thread,
pthread_getspecific just returns the value NULL.

2 Threads
“If seven maids with seven mops
Swept it for half a year,
Do you suppose,” the Walrus said,
“That they could get it clear?”
“I doubt it,” said the Carpenter,
And shed a bitter tear.
—Lewis Carroll, Through the Looking-Glass
Threads are (and perhaps this will come as no surprise) the essential basis of
the style of programming that I am advocating. Although this chapter focuses on
threads, you will never learn everything you need to know about threads by simply
skipping to this chapter and reading it. Threads are a critical part of the
landscape, but you can't do much with only threads. Nevertheless, one must
start somewhere, and here we are.
Section 2.1 describes the programming aspects of creating and managing
threads in your program, that is, how to create threads, how they are represented
in your program, and the most basic things you can do to them once you've
created them.
Section 2.2 describes the life cycle of a thread, from creation through
“recycling,” taking you through all the scheduling states threads can assume along
the way.
2.1 Creating and using threads
“A loaf of bread,” the Walrus said,
“Is what we chiefly need:
Pepper and vinegar besides
Are very good indeed—
Now, if you're ready, Oysters dear,
We can begin to feed.”
—Lewis Carroll, Through the Looking-Glass
pthread_t thread;
int pthread_equal (pthread_t t1, pthread_t t2);
int pthread_create (pthread_t *thread,
    const pthread_attr_t *attr,
    void *(*start)(void *), void *arg);
pthread_t pthread_self (void);
int sched_yield (void);
void pthread_exit (void *value_ptr);
int pthread_detach (pthread_t thread);
int pthread_join (pthread_t thread, void **value_ptr);
The introduction covered some of the basics of what a thread is, and what it
means to the computer hardware. This section begins where the introduction left
off. It explains how a thread is represented in your program, what it means to
your program, and some of the operations you can perform on threads. If you
haven't read the introduction, this would be a good time to skip back to it. (I'll
wait for you here.)
Within your program a thread is represented by a thread identifier, of the
opaque type pthread_t. To create a thread, you must declare a variable of type
pthread_t somewhere in your program. If the identifier is needed only within a
function, or if the function won't return until the thread is done, you could
declare the identifier with auto storage class. Most of the time, though, the
identifier will be stored in a shared (static or extern) variable, or in a structure
allocated from the heap.
A Pthreads thread begins by calling some function that you provide. This
“thread function” should expect a single argument of type void *, and should
return a value of the same type. You create a thread by passing the thread func-
tion's address, and the argument value with which you want the function to be
called, to pthread_create.
When you create a thread, pthread_create returns an identifier, in the
pthread_t value referred to by the thread argument, by which your code refers to
the new thread. A thread can also get its own identifier using the pthread_self
function. There is no way to find a thread's identifier unless either the creator or
the thread itself stores the identifier somewhere. You need to have a thread's
identifier to do anything to the thread. If you'll need to know when a thread com-
pletes, for example, you must keep the identifier somewhere.
Pthreads provides the pthread_equal function to compare two thread
identifiers. You can only test for equality. It doesn't make any sense to ask whether one
thread identifier is “greater than” or “less than” another, because there is no
ordering between threads. The pthread_equal function returns a nonzero value
if the thread identifiers refer to the same thread, and the value 0 if they do not
refer to the same thread.
The initial thread (main) is special.
When a C program runs, it begins in a special function named main. In a
threaded program, this special stream of execution is called the “initial thread” or
sometimes the “main thread.” You can do anything within the initial thread that
you can do within any other thread. It can determine its own thread identifier by
calling pthread_self, for example, or terminate itself by calling pthread_exit. If
the initial thread stores its thread identifier somewhere accessible to another
thread, that thread can wait for the initial thread to terminate, or detach the ini-
tial thread.
The initial thread is special because Pthreads retains traditional UNIX process
behavior when the function main returns; that is, the process terminates without
allowing other threads to complete. In general, you do not want to do this in a
threaded program, but sometimes it can be convenient. In many of the programs
in this book, for example, threads are created that have no effect on anything
outside the process. It doesn't really matter what those threads are doing, then, if
the process goes away. When the process exits, those threads, all their states,
and anything they might accomplish, simply “evaporate"—there's no reason to
clean up.
Detaching a thread that is still running doesn't affect the thread in any
way—it just informs the system that the thread's resources can be
reclaimed when the thread eventually terminates.
Although “thread evaporation” is sometimes useful, most of the time your pro-
cess will outlive the individual threads you create. To be sure that resources used
by terminated threads are available to the process, you should always detach
each thread you create when you're finished with it. Threads that have terminated
but are not detached may retain virtual memory, including their stacks, as
well as other system resources. Detaching a thread tells the system that you no
longer need that thread, and allows the system to reclaim the resources it has
allocated to the thread.
If you create a thread that you will never need to control, you can use an
attribute to create the thread so that it is already detached. (We'll get to attributes
later, in Section 5.2.3.) If you do not want to wait for a thread that you created,
and you know that you will no longer need to control that thread, you can detach
it at any time by calling pthread_detach. A thread may detach itself, or any other
thread that knows its pthread_t identifier may detach it at any time. If you need
to know a thread's return value, or if you need to know when a thread has
completed, call pthread_join. The pthread_join function will block the caller until
the thread you specify has terminated, and then, optionally, store the terminated
thread's return value. Calling pthread_join detaches the specified thread
automatically.
As we've seen, threads within a process can execute different instructions,
using different stacks, all at the same time. Although the threads execute
independently of each other, they always share the same address space and file
descriptors. The shared address space provides an important advantage of the
threaded programming model by allowing threads to communicate efficiently.
Some programs may create threads that perform unrelated activities, but
most often a set of threads works together toward a common goal. For example,
one set of threads may form an assembly line in which each performs some
specific task on a shared data stream and then passes the data on to the next
thread. A set of threads may form a work crew and divide independent parts of a
common task. Or one "manager" thread may take control and divide work among
a “crew” of worker threads. You can combine these models in a variety of ways;
for example, a work crew might perform some complicated step in a pipeline,
such as transforming a slice of an array.
The following program, lifecycle.c, creates a thread. We'll refer to this simple
example in the following sections about a thread's life cycle.
The thread function, thread_routine, returns a value to satisfy the standard
thread function prototype. In this example the thread returns its argument, and
the value is always NULL.
The program creates a thread by calling pthread_create, and then waits for it
by calling pthread_join. You don't need to wait for a thread, but if you don't,
you'll need to do something else to make sure the process runs until the thread
completes. Returning from main will cause the process to terminate, along with
all threads. You could, for example, code the main thread to terminate by calling
pthread_exit, which would allow the process to continue until all threads have
terminated.
When the join completes, the program checks the thread's return value, to be
sure that the thread returned the value it was given. The program exits with 0
(success) if the value is NULL, or with 1 otherwise.
It is a good idea for all thread functions to return something, even if it is
simply NULL. If you omit the return statement, pthread_join will still return some
value—whatever happens to be in the place where the thread's start function
would have stored a return value (probably a register).
lifecycle.c
 1  #include <pthread.h>
 2  #include "errors.h"
 3
 4  /*
 5   * Thread start routine.
 6   */
 7  void *thread_routine (void *arg)
 8  {
 9      return arg;
10  }
11
12  main (int argc, char *argv[])
13  {
14      pthread_t thread_id;
15      void *thread_result;
16      int status;
17
18      status = pthread_create (
19          &thread_id, NULL, thread_routine, NULL);
20      if (status != 0)
21          err_abort (status, "Create thread");
22
23      status = pthread_join (thread_id, &thread_result);
24      if (status != 0)
25          err_abort (status, "Join thread");
26      if (thread_result == NULL)
27          return 0;
28      else
29          return 1;
30  }
lifecycle.c
If the “joining” thread doesn't care about the return value, or if it knows that
the “joinee” (the thread with which it is joining) didn't return a value, then it can
pass NULL instead of &retval in the call to pthread_join. The joinee's return
value will be ignored.
When the call to pthread_join returns, the joinee has been detached and you
can't join with it again. In the rare cases where more than one thread might need
to know when some particular thread has terminated, the threads should wait on
a condition variable instead of calling pthread_join. The terminating thread
would store its return value (or any other information) in some known location,
and broadcast the condition variable to wake all threads that might be interested.
2.2 The life of a thread
Come, listen, my men, while I tell you again
The five unmistakable marks
By which you may know, wheresoever you go,
The warranted genuine Snarks.
—Lewis Carroll, The Hunting of the Snark
At any instant, a thread is in one of the four basic states described in Table 2.1.
In implementations, you may see additional “states” that distinguish between
various reasons for entering the four basic states. Digital UNIX, for example,
represents these finer distinctions as “substates,” of which each state may have several.
Whether they're called “substates” or additional states, “terminated” might be
divided into “exited” and “cancelled”; “blocked” might be broken up into “blocked
on condition variable,” “blocked on mutex,” “blocked in read,” and so forth.
State        Meaning
Ready        The thread is able to run, but is waiting for a processor. It may
             have just started, or just been unblocked, or preempted by
             another thread.
Running      The thread is currently running; on a multiprocessor there may be
             more than one running thread in the process.
Blocked      The thread is not able to run because it is waiting for something;
             for example, it may be waiting for a condition variable, or waiting
             to lock a mutex, or waiting for an I/O operation to complete.
Terminated   The thread has terminated by returning from its start function,
             calling pthread_exit, or having been cancelled and completing all
             cleanup handlers. It was not detached, and has not yet been
             joined. Once it is detached or joined, it will be recycled.
TABLE 2.1 Thread states
These finer distinctions can be important in debugging and analyzing
threaded programs. However, they do not add substantially to the basic
understanding of thread scheduling, and we will not deal with them here.
Threads begin in the ready state. When the new thread runs it calls your
specified thread start function. It may be preempted by other threads, or block
itself to wait for external events any number of times. Eventually it completes and
either returns from the thread start function or calls the pthread_exit function.
In either case it terminates. If the thread has been detached, it is immediately
recycled. (Doesn't that sound nicer than “destroyed”—and most systems reuse
the resources to make new threads.) Otherwise the thread remains in the
terminated state until joined or detached. Figure 2.1 shows the relationships between
these thread states, and the events that cause threads to move from one state to
another.
2.2.1 Creation
The “initial thread” of a process is created when the process is created. In a
system that fully supports threaded programming, there's probably no way to
execute any code without a thread. A thread is likely to be the only software
context that includes the hardware state needed to execute code: registers, program
counter, stack pointer, and so forth.
Additional threads are created by explicit calls. The primary way to create
threads on a Pthreads system is to call pthread_create. Threads may also be
created when the process receives a POSIX signal if the process signal notify
mechanism is set to SIGEV_THREAD. Your system may provide additional
nonstandard mechanisms to create a thread.
[FIGURE 2.1 Thread state transitions: a created thread starts out ready; being
scheduled makes it running; waiting for a resource makes it blocked; a satisfied
wait makes it ready again; when done, or cancelled, it becomes terminated.]
When a new thread is created, its state is ready. Depending on scheduling
constraints, it may remain in that state for a substantial period of time before
executing. Section 5.5 contains more information on thread scheduling. Going
back to lifecycle.c, the thread running thread_routine becomes ready during
main's call to pthread_create, at line 18.
The most important thing to remember about thread creation is that there is
no synchronization between the creating thread's return from pthread_create
and the scheduling of the new thread. That is, the thread may start before the
creating thread returns. The thread may even run to completion and terminate
before pthread_create returns. Refer to Section 8.1.1 for more information and
warnings about what to expect when you create a thread.
2.2.2 Startup
Once a thread has been created, it will eventually begin executing machine
instructions. The initial sequence of instructions will lead to the execution of the
thread start function that you specified to pthread_create. The thread start
function is called with the argument value you specified when you created the thread.
In lifecycle.c, for example, the thread begins executing user code at function
thread_routine, with the formal parameter arg having a value of NULL.
In the initial thread, the thread “start function” (main) is called from outside
your program: for example, many UNIX systems link your program with a file
called crt0.o, which initializes the process and then calls your main. This is a
minor implementation distinction, but it is important to remember because there
are a few ways in which the initial thread is different. For one thing, main is called
with different arguments than a thread start function: the program's argument
array (argc and argv) instead of a single void* argument. For another thing,
when a thread start function returns, the thread terminates but other threads
continue to run. When the function main returns in the initial thread, the process
will be terminated immediately. If you want to terminate the initial thread while
allowing other threads in the process to continue running, call pthread_exit
instead of returning from main.
Another important difference to remember is that on most systems, the initial
thread runs on the default process stack, which can grow to a substantial size.
“Thread” stacks may be much more limited on some implementations, and the
program will fail with a segmentation fault or bus error if a thread overflows its
stack.
2.2.3 Running and blocking
Like us, threads usually can't stay awake their entire life. Most threads
occasionally go to sleep. A thread can go to sleep because it needs a resource that is
not available (it becomes “blocked”) or because the system reassigned the processor
on which it was running (it is “preempted”). A thread spends most of its
active life in three states: ready, running, and blocked.
A thread is ready when it is first created, and whenever it's unblocked so that
it is once again eligible to run. Ready threads are waiting for a processor. Also,
when a running thread is preempted, for example, if it is timesliced (because it
has run too long), the thread immediately becomes ready.
A thread becomes running when it was ready and a processor selects the
thread for execution. Usually this means that some other thread has blocked, or
has been preempted by a timeslice—the blocking (or preempted) thread saves its
context and restores the context of the next ready thread to replace itself. On a
multiprocessor, however, a previously unused processor may execute a readied
thread without any other thread blocking.
A thread becomes blocked when it attempts to lock a mutex that is currently
locked, when it waits on a condition variable, when it calls sigwait for a signal
that is not currently pending, or when it attempts an I/O operation that cannot
be immediately completed. A thread may also become blocked for other system
operations, such as a page fault.
When a thread is unblocked after a wait for some event, it is made ready
again. It may execute immediately, for example, if a processor is available. In
lifecycle.c, the main thread blocks at line 23, in pthread_join, to wait for the
thread it created to run. If the thread had not already run at this point, it would
move from ready to running when main becomes blocked. As the thread runs to
completion and returns, the main thread will be unblocked—returning to the
ready state. When processor resources are available, either immediately or after
the thread becomes terminated, main will again become running, and complete.
2.2.4 Termination
A thread usually terminates by returning from its start function (the one you
pass to the pthread_create function). The thread shown in lifecycle.c
terminates by returning the value NULL, for example. Threads that call pthread_exit
or that are cancelled using pthread_cancel also terminate after calling each
cleanup handler that the thread registered by calling pthread_cleanup_push and
that hasn't yet been removed by calling pthread_cleanup_pop. Cleanup handlers
are discussed in Section 5.3.3.
Threads may have private “thread-specific data” values (thread-specific data is
discussed in Section 5.4). If the thread has any non-NULL thread-specific data
values, the associated destructor functions for those keys (if any) are called.
If the thread was already detached it moves immediately to the next section,
recycling. Otherwise, the thread becomes terminated. It will remain available for
another thread to join with it using pthread_join. This is analogous to a UNIX
process that's terminated but hasn't yet been “reaped” by a wait operation. Some-
times it is called a “zombie” because it still exists even though it is “dead.” A
zombie may retain most or all of the system resources that it used when running,
so it is not a good idea to leave threads in this state for longer than necessary.
Whenever you create a thread with which you won't need to join, you should use
the detachstate attribute to create it “detached” (see Section 5.2.3).
At a minimum, a terminated thread retains the identification (pthread_t
value) and the void* return value that was returned from the thread's start
function or specified in a call to pthread_exit. The only external difference between a
thread that terminated “normally” by returning or calling pthread_exit, and one
that terminated through cancellation, is that a cancelled thread's return value is
always PTHREAD_CANCELED. (This is why “cancelled” is not considered a distinct
thread state.)
If any other thread is waiting to join with the terminating thread, that thread
is awakened. It will return from its call to pthread_join with the appropriate
return value. Once pthread_join has extracted the return value, the terminated
thread is detached by pthread_join, and may be recycled before the call to
pthread_join returns. This means that, among other things, the returned value
should never be a stack address associated with the terminated thread's stack—
the value at that address could be overwritten by the time the caller could use it.
In lifecycle.c, the main thread will return from the pthread_join call at line 23
with the value NULL.
pthread_join is a convenience, not a rule.
Even when you need a return value from a thread that you create, it is often at
least as simple to create the thread detached and devise your own customized
return mechanism as it is to use pthread_join. For example, if you pass infor-
mation to a worker thread in some form of structure that another thread can find
later, you might have the worker thread simply place the result in that same
structure and broadcast a condition variable when done. The Pthreads context
for the thread, including the thread identifier, can then be recycled immediately
when the thread is done, and you still have the part you really need, the return
value, where you can find it easily at any time.
If pthread_join does exactly what you want, then by all means use it. But
remember that it is nothing more than a convenience for the simplest and most
limited model of communicating a thread's results. If it does not do exactly what
you need, build your own return mechanism instead of warping your design to fit
the limitations of pthread_join.
2.2.5 Recycling
If the thread was created with the detachstate attribute set to PTHREAD_
CREATE_DETACHED (see Section 5.2.3), or if the thread or some other thread has
already called pthread_detach for the thread's identifier, then the thread is
immediately recycled when it becomes terminated.
If the thread has not been detached when it terminates, it remains in the
terminated state until the thread's pthread_t identifier is passed to pthread_detach
or pthread_join. When either function returns, the thread cannot be accessed
again. In lifecycle.c, for example, the thread that had run thread_routine will
be recycled by the time the main thread returns from the pthread_join call at
line 23.
Recycling releases any system or process resources that weren't released at
termination. That includes the storage used for the thread's return value, the
stack, memory used to store register state, and so forth. Some of these resources
may have been released at termination; it is important to remember that none of
it should be accessed from any other thread after termination. For example, if a
thread passes a pointer to its stack storage to another thread through shared
data, you should treat that information as obsolete from the time the thread that
owns the stack terminates.

3 Synchronization
"That's right," said the Tiger-lily. "The daisies are worst of all.
When one speaks, they all begin together, and it's
enough to make one wither to hear the way they go on!"
—Lewis Carroll, Through the Looking-Glass
To write a program of any complexity using threads, you'll need to share data
between threads, or cause various actions to be performed in some coherent
order across multiple threads. To do this, you need to synchronize the activity of
your threads.
Section 3.1 describes a few of the basic terms we'll be using to talk about
thread synchronization: critical section and invariant.
Section 3.2 describes the basic Pthreads synchronization mechanism, the
mutex.
Section 3.3 describes the condition variable, a mechanism that your code can
use to communicate changes to the state of invariants protected by a mutex.
Section 3.4 completes this chapter on synchronization with some important
information about threads and how they view the computer's memory.
3.1 Invariants, critical sections, and predicates
"I know what you're thinking about,"
said Tweedledum; "but it isn't so, nohow."
"Contrariwise," continued Tweedledee,
"if it was so, it might be; and if it were so, it would be;
but as it isn't, it ain't. That's logic."
—Lewis Carroll, Through the Looking-Glass
Invariants are assumptions made by a program, especially assumptions about
the relationships between sets of variables. When you build a queue package, for
example, you need certain data. Each queue has a queue header, which is a
pointer to the first queued data element. Each data element includes a pointer to
the next data element. But the data isn't all that's important—your queue
package relies on relationships between that data. The queue header, for example,
must either be NULL or contain a pointer to the first queued data element. Each
data element must contain a pointer to the next data element, or NULL if it is the
last. Those relationships are the invariants of your queue package.
It is hard to write a program that doesn't have invariants, though many of
them are subtle. When a program encounters a broken invariant, for example, if
it dereferences a queue header containing a pointer to something that is not a
valid data element, the program will probably produce incorrect results or fail
immediately.
Critical sections (also sometimes called "serial regions") are areas of code that
affect a shared state. Since most programmers are trained to think about
program functions instead of program data, you may well find it easier to recognize
critical sections than data invariants. However, a critical section can almost
always be translated into a data invariant, and vice versa. When you remove an
element from a queue, for example, you can see the code performing the removal
as a critical section, or you can see the state of the queue as an invariant. Which
you see first may depend on how you're thinking about that aspect of your
design.
Most invariants can be "broken," and are routinely broken, during isolated
areas of code. The trick is to be sure that broken invariants are always repaired
before "unsuspecting" code can encounter them. That is a large part of what
"synchronization" is all about in an asynchronous program. Synchronization protects
your program from broken invariants. If your code locks a mutex whenever it
must (temporarily) break an invariant, then other threads that rely on the invariant,
and which also lock the mutex, will be delayed until the mutex is unlocked—
when the invariant has been restored.
Synchronization is voluntary, and the participants must cooperate for the sys-
tem to work. The programmers must agree not to fight for (or against) possession
of the bailing bucket. The bucket itself does not somehow magically ensure that
one and only one programmer bails at any time. Rather, the bucket is a reliable
shared token that, if used properly, can allow the programmers to manage their
resources effectively.
"Predicates" are logical expressions that describe the state of invariants
needed by your code. In English, predicates can be expressed as statements like
"the queue is empty" or "the resource is available." A predicate may be a boolean
variable with a TRUE or FALSE value, or it may be the result of testing whether a
pointer is NULL. A predicate may also be a more complicated expression, such as
determining whether a counter is greater than some threshold. A predicate may
even be a value returned from some function. For example, you might call select
or poll to determine whether a file is ready for input.
3.2 Mutexes
"How are you getting on?" said the Cat,
as soon as there was mouth enough for it to speak with.
Alice waited till the eyes appeared, and then nodded.
"It's no use speaking to it," she thought,
"till its ears have come, or at least one of them."
—Lewis Carroll, Alice's Adventures in Wonderland
Most threaded programs need to share some data between threads. There may
be trouble if two threads try to access shared data at the same time, because one
thread may be in the midst of modifying some data invariant while another acts
on the data as if it were consistent. This section is all about protecting the
program from that sort of trouble.
The most common and general way to synchronize between threads is to
ensure that all memory accesses to the same (or related) data are "mutually
exclusive." That means that only one thread is allowed to write at a time—others
must wait for their turn. Pthreads provides mutual exclusion using a special
form of Edsger Dijkstra's semaphore [Dijkstra, 1968a], called a mutex. The word
mutex is a clever combination of "mut" from the word "mutual" and "ex" from the
word "exclusion."
Experience has shown that it is easier to use mutexes correctly than it is to
use other synchronization models such as a more general semaphore. It is also
easy to build any synchronization model using mutexes in combination with
condition variables (we'll meet them at the next corner, in Section 3.3). Mutexes
are simple, flexible, and can be implemented efficiently.
The programmers' bailing bucket is something like a mutex (Figure 3.1). Both
are "tokens" that can be handed around, and used to preserve the integrity of the
concurrent system. The bucket can be thought of as protecting the bailing critical
section—each programmer accepts the responsibility of bailing while holding the
bucket, and of avoiding interference with the current bailer while not holding the
bucket. Or, the bucket can be thought of as protecting the invariant that water
can be removed by only one programmer at any time.
Synchronization isn't important just when you modify data. You also need
synchronization when a thread needs to read data that was written by another
thread, if the order in which the data was written matters. As we'll see a little
later, in Section 3.4, many hardware systems don't guarantee that one processor
will see shared memory accesses in the same order as another processor without
a "nudge" from software.
FIGURE 3.1. Mutex analogy
Consider, for example, a thread that writes new data to an element in an
array, and then updates a max_index variable to indicate that the array element
is valid. Now consider another thread, running simultaneously on another
processor, that steps through the array performing some computation on each valid
element. If the second thread "sees" the new value of max_index before it sees the
new value of the array element, the computation would be incorrect. This may
seem irrational, but memory systems that work this way can be substantially
faster than memory systems that guarantee predictable ordering of memory
accesses. A mutex is one general solution to this sort of problem. If each thread
locks a mutex around the section of code that's using shared data, only one
thread will be able to enter the section at a time.
Figure 3.2 shows a timing diagram of three threads sharing a mutex. Sections
of the lines that are above the rounded box labeled "mutex" show where the
associated thread does not own the mutex. Sections of the lines that are below the
center line of the box show where the associated thread owns the mutex, and
sections of the lines hovering above the center line show where the thread is waiting
to own the mutex.
Initially, the mutex is unlocked. Thread 1 locks the mutex and, because there
is no contention, it succeeds immediately—thread 1's line moves below the center