
Welcome to the Parallel Jungle!

By Herb Sutter, January 29, 2012

Herb Sutter dives into the repercussions of parallelism's reach from mobile devices to the desktop, to clusters, and, at the highest level of granularity, to the cloud. This welter of different parallel implementations presents significant challenges for programming. The free lunch of sequential programming is well and truly over.
In the twilight of Moore's Law, the transitions to multicore processors, GPU computing, and hardware or infrastructure as a service (HaaS) cloud computing are not separate trends, but aspects of a single trend: mainstream computers, from desktops to "smartphones," are being permanently transformed into heterogeneous supercomputer clusters. Henceforth, a single compute-intensive application will need to harness different kinds of cores, in immense numbers, to get its job done. The free lunch is over. Now welcome to the hardware jungle.

From 1975 to 2005, our industry accomplished a phenomenal mission: In 30 years, we put a personal computer on every desk, in every home, and in every pocket.

In 2005, however, mainstream computing hit a wall. In "The Free Lunch Is Over (A Fundamental Turn Toward Concurrency in Software)," I described the reasons for the then-upcoming industry transition from single-core to multicore CPUs in mainstream machines, why it would require changes throughout the software stack from operating systems to languages to tools, and why it would permanently affect the way we as software developers have to write our code if we want our applications to continue exploiting Moore's transistor dividend.

In 2005, our industry undertook a new mission: to put a personal parallel supercomputer on every desk, in every home, and in every pocket. 2011 was special: It's the year that we completed the transition to parallel computing in all mainstream form factors, with the arrival of multicore tablets (such as iPad 2, Playbook, Kindle Fire, Nook Tablet) and smartphones (for example, Galaxy S II, Droid X2, iPhone 4S). 2012 will see us continue to build out multicore with mainstream quad- and eight-core tablets (as Windows 8 brings a modern tablet experience to x86 as well as ARM), and the last single-core gaming console holdout will go multicore (as Nintendo's Wii U replaces the Wii).

This time, it took us just six years to deliver mainstream parallel computing in all popular form factors. And we know the transition to multicore is permanent, because multicore delivers compute performance that single-core cannot, and there will always be mainstream applications that run better on a multicore machine. There's no going back. For the first time in the history of computing, mainstream hardware is no longer a single-processor von Neumann machine, and never will be again. That was the first act.
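To make that change of mission concrete, here is a minimal sketch (mine, not from the original article) of what the multicore transition demands of code: the same computation written once sequentially, so that it runs no faster on eight cores than on one, and once spread across however many cores the machine offers, using C++11 std::async. The function names and chunking strategy are illustrative assumptions, not a prescribed pattern.

    #include <algorithm>
    #include <cstddef>
    #include <future>
    #include <iostream>
    #include <numeric>
    #include <thread>
    #include <vector>

    // Sequential version: leaves every core but one idle.
    double sum_sequential(const std::vector<double>& v) {
        return std::accumulate(v.begin(), v.end(), 0.0);
    }

    // Parallel version: splits the work into one chunk per hardware thread,
    // so throughput can scale with the core count.
    double sum_parallel(const std::vector<double>& v) {
        unsigned n = std::max(1u, std::thread::hardware_concurrency());
        std::size_t chunk = v.size() / n;
        std::vector<std::future<double>> parts;
        for (unsigned i = 0; i < n; ++i) {
            auto first = v.begin() + i * chunk;
            auto last  = (i + 1 == n) ? v.end() : first + chunk;
            parts.push_back(std::async(std::launch::async,
                [first, last] { return std::accumulate(first, last, 0.0); }));
        }
        double total = 0.0;
        for (auto& f : parts) total += f.get();   // join and combine
        return total;
    }

    int main() {
        std::vector<double> data(10000000, 1.0);
        std::cout << sum_parallel(data) << '\n';  // same answer, more cores used
    }

The point is not this particular reduction, but that the parallel version keeps working, and keeps speeding up, as the core count grows.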

It turns out that multicore is just the first of three related permanent transitions that layer on and amplify each other, as the timeline in Figure 1 illustrates.

Figure 1. The timeline of the three transitions.

1. Multicore (2005-). As explained previously.

2. Heterogeneous cores (2009-). A single computer already typically includes more than one kind of processor core, as mainstream notebooks, consoles, and tablets all increasingly have both CPUs and compute-capable GPUs. The open question in the industry today is not whether a single application will be spread across different kinds of cores, but only "how different" the cores should be: whether they should be basically the same, with similar instruction sets but in a mix of a few big cores that are best at sequential code plus many smaller cores best at running parallel code (the Intel MIC model slated to arrive in 2012-2013, which is easier to program), or cores with different capabilities that may support only subsets of general-purpose languages like C and C++ (the current Cell and GPGPU model, which requires more complexity, including language extensions and subsets; see the sketch after this list). Heterogeneity amplifies the first trend (multicore), because if some of the cores are smaller, then we can fit more of them on the same chip. Indeed, 100x and 1,000x parallelism is already available today on many mainstream home machines for programs that can harness the GPU. We know the transition to heterogeneous cores is permanent because different kinds of computations naturally run faster and/or use less power on different kinds of cores, and different parts of the same application will run faster and/or cooler on a machine with several different kinds of cores.

3. Elastic compute cloud cores (2010-). For our purposes, "cloud" means specifically HaaS: delivering access to more computational hardware as an extension of the mainstream machine. This trend started to hit the mainstream with commercial compute cloud offerings from Amazon Web Services (AWS), Microsoft Azure, Google App Engine (GAE), and others.
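As a concrete illustration of the "language extensions and subsets" point in item 2, here is a minimal GPGPU sketch (my example, not the article's) written with C++ AMP, one of several ways to target compute-capable GPU cores from C++; it assumes a C++ AMP-capable compiler such as Visual C++ 2012. The restrict(amp) annotation marks exactly the kind of language subset such cores impose.

    #include <amp.h>      // C++ AMP: a C++ extension for accelerator compute
    #include <vector>
    using namespace concurrency;

    // Element-wise vector addition, offloaded to the GPU (or other accelerator).
    void vector_add(const std::vector<float>& a,
                    const std::vector<float>& b,
                    std::vector<float>& c) {
        const int n = static_cast<int>(a.size());
        array_view<const float, 1> av(n, a), bv(n, b);   // wrap host data
        array_view<float, 1> cv(n, c);
        cv.discard_data();                               // no need to copy c in

        // The lambda runs once per element, in parallel, on the accelerator.
        // restrict(amp) limits its body to the subset of C++ the GPU supports.
        parallel_for_each(cv.extent, [=](index<1> i) restrict(amp) {
            cv[i] = av[i] + bv[i];
        });
        cv.synchronize();                                // copy results back
    }

The same shape of code maps onto CUDA or OpenCL; what differs between the models is how much of full C++ the device-side subset admits.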


Cloud HaaS again amplifies both of the first two trends, because it's fundamentally about deploying large numbers of nodes, where each node is a mainstream machine containing multiple and heterogeneous cores. In the cloud, the number of cores available to a single application is scaling fast (in mid-2011, Cycle Computing delivered a 30,000-core cluster for under $1,300/hour using AWS), and the same heterogeneous cores are available in compute nodes (e.g., AWS already offers "Cluster GPU" nodes with dual NVIDIA Tesla M2050 GPU cards, enabling massively parallel and massively distributed CUDA applications). In short, parallelism is not just in full bloom, but increasingly in full variety.

This article will develop four key points:

1. Moore's End. We can observe clear evidence that Moore's Law is ending, because we can point to a pattern that precedes the end of exploiting any kind of resource. But there's no reason to panic, because Moore's Law limits only one kind of scaling, and we have already started another kind.

2. Mapping one trend, not three. Multicore, heterogeneous cores, and HaaS cloud computing are not three separate trends, but aspects of a single trend: putting a personal heterogeneous supercomputer cluster on every desk, in every home, and in every pocket.

3. The effect on software development. As software developers, we will be expected to enable a single application to exploit a jungle of enormous numbers of cores that are increasingly different in kind (specialized for different tasks) and different in location (from local to very remote; on-die, in-box, on-premises, in-cloud). The jungle of heterogeneity will continue to spur deep and fast evolution of mainstream software development, but we can predict what some of the changes will be.

4. Three distinct near-term stages of Moore's End. And why "smartphones" aren't, really.

Mainstream hardware is becoming permanently parallel, heterogeneous, and distributed. These changes are permanent, and so will permanently affect the way we have to write performance-intensive code on mainstream architectures.

The good news is that Moore's "local scale-in" transistor mine isn't empty yet. It appears the transistor bonanza will continue for about another decade, give or take a half-decade or so, which should be long enough to exploit the lower-cost side of the Law to get us to parity between desktops and pocket tablets. The bad news is that we can clearly observe the diminishing returns: the transistors are decreasingly exploitable, so with each new generation of processors, software developers have to work harder and the chips get more difficult to power. And with each new crank of the diminishing-returns wheel, there's less time for hardware and software designers to come up with ways to overcome the next hurdle; the motherlode free lunch lasted 30 years, but the homogeneous multicore era lasted only about six years, and we are now already overlapping the next two eras of hetero-core and cloud-core.

But all is well: When your mine is getting empty, you don't panic; you just open a new mine at a new motherlode, operate both mines for a while, then continue to profit from the new mine long-term even after the first one finally shuts down and gets converted into a museum. As usual, the end of one dominant wave overlaps with the beginning of the next, and we are now early in the period of overlap, standing with a foot in each wave and a crew in each mine: Moore's mine and the cloud mine. Perhaps the best news of all is that the cloud wave is already scaling enormously quickly, faster than the Moore's Law wave that it complements, and that it will outlive and replace.

If you haven't done so already, now is the time to take a hard look at the design of your applications, determine what existing features (or, better still, what potential and currently unimaginable demanding new features) are CPU-sensitive now or are likely to become so soon, and identify how those places could benefit from local and distributed parallelism. Now is also the time for you and your team to grok the requirements, pitfalls, styles, and idioms of hetero-parallel (e.g., GPGPU) and cloud programming (e.g., Amazon Web Services, Microsoft Azure, Google App Engine).

To continue enjoying the free lunch of shipping an application that runs well on today's hardware and will just naturally run faster or better on tomorrow's hardware, you need to write an app with lots of latent parallelism, expressed in a form that can be spread across a machine with a variable number of cores of different kinds: local and distributed cores, and big/small/specialized cores. The throughput gains now cost extra: extra development effort, extra code complexity, and extra testing effort. The good news is that for many classes of applications the extra effort will be worthwhile, because concurrency will let them fully exploit the exponential gains in compute throughput that will continue to grow strong and fast long after Moore's Law has gone into its sunny retirement, as we continue to mine the cloud for the rest of our careers.
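Here is one minimal sketch (again mine, under assumed names: process_item stands in for whatever CPU-intensive, independent unit of work your application does per item) of what "latent parallelism" can look like: the work is described as a pool of independent tasks, and the runtime decides how many run at once.

    #include <future>
    #include <vector>

    // Hypothetical placeholder for a CPU-intensive, independent computation.
    int process_item(int item) { return item * item; }

    // Express the work as many independent tasks rather than one big loop.
    // The same code spreads itself over 2, 8, or 48 cores without change.
    std::vector<int> process_all(const std::vector<int>& items) {
        std::vector<std::future<int>> tasks;
        tasks.reserve(items.size());
        for (int item : items)
            // Default launch policy: the runtime may run each task on
            // another core or defer it. The parallelism is latent --
            // expressed by the structure of the code, not demanded of
            // any particular number of cores.
            tasks.push_back(std::async(process_item, item));

        std::vector<int> results;
        results.reserve(items.size());
        for (auto& t : tasks) results.push_back(t.get());
        return results;
    }

The design choice is the point: because the code states only that the tasks are independent, the scaling knob stays in the runtime's hands, and the same expressed parallelism could later be mapped onto bigger pools of local, specialized, or distributed cores.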
