
Tensor Processing Unit

The document discusses Google's Tensor Processing Unit (TPU), a custom ASIC chip designed for machine learning and neural network workloads. It provides a high-level overview of the TPU's history, architecture, and performance advantages compared to CPUs and GPUs. The TPU is optimized for neural network operations through its use of 8-16 bit integers, large on-chip memory, and a matrix multiplication unit containing over 65,000 arithmetic logic units connected in a systolic array configuration. This design enables it to perform tens of trillions of operations per second and powers various Google services involving machine learning.

Uploaded by

Osama Asghar


Tensor Processing Unit


By: Lucas Jodon and Yelman Khan
Overview
● History
● Neural Networks
● Architecture
● Performance
● Real-World Uses
● Future Development
History of TPUs
● Google began searching for a way to support neural networks in its services, such as voice recognition
○ With existing hardware, it would have required twice as many data centers
○ It chose to develop a new architecture instead
● Norman Jouppi began work on a new architecture to support TensorFlow
○ FPGAs were not power-efficient enough
○ An ASIC design was selected for its power and performance benefits
○ The device would execute CISC instructions across many networks
○ The device was made programmable, but it operates on matrices instead of vectors or scalars
○ The resulting device was comparable to a GPU or a signal processor
Neural Networks
● First proposed in 1944 by Warren McCulloch and Walter Pitts
○ Modeled loosely on human learning
● Neural nets are a method of machine learning
○ The computer learns to perform a task by analyzing training examples
○ Example: pair several audio files with the words they contain; the machine then finds patterns between the audio data and the labels
○ Each incoming connection to a node is given a weight; the node multiplies each input by its weight and adds the results together
○ Once the weighted sum passes a predefined threshold, the node is considered active
● Google began development on DistBelief in 2011
○ DistBelief became TensorFlow, which officially released version 1.0.0 in February 2017
○ TensorFlow is a software library with significant machine learning support
○ TensorFlow is intended to be a production-grade library for implementing dataflow computations
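The weight-and-threshold behavior described in the bullets above can be sketched in a few lines of Python. This is only an illustration of a single threshold node, not code from TensorFlow; the function name `node_output` and the sample numbers are assumptions made up for this example.

```python
def node_output(inputs, weights, threshold):
    """Return (weighted_sum, is_active) for one threshold node.

    Hypothetical helper for illustration: each input is multiplied by the
    weight of its connection, the products are summed, and the node counts
    as active only when the sum exceeds the threshold.
    """
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    return weighted_sum, weighted_sum > threshold


# Three incoming connections; the middle input is zero, so its weight
# contributes nothing to the sum.
total, active = node_output([1.0, 0.0, 1.0], [0.5, 0.9, 0.25], threshold=0.7)
print(total, active)
```

Raising the threshold above the weighted sum would leave the same node inactive for the same inputs.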
Quantization in Neural Networks
● The precision of 32-bit or 16-bit floating point is usually not required
● Accuracy can be maintained with 8-bit integers
● Energy consumption and hardware footprint are reduced
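To make the idea of quantization concrete, here is a minimal sketch of one common scheme, min/max affine quantization to unsigned 8-bit integers. The slides do not say which scheme the TPU toolchain uses, so the mapping, the function names `quantize` and `dequantize`, and the sample values are all assumptions for illustration.

```python
def quantize(values, num_bits=8):
    """Map floats onto integers in [0, 2**num_bits - 1].

    Assumed scheme: the smallest value maps to 0 and the largest to the
    top integer level, with evenly spaced steps in between.
    """
    lo, hi = min(values), max(values)
    levels = 2 ** num_bits - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    quantized = [round((v - lo) / scale) for v in values]
    return quantized, scale, lo


def dequantize(quantized, scale, lo):
    """Recover approximate floats from the integer codes."""
    return [lo + scale * q for q in quantized]


weights = [-1.0, -0.25, 0.1, 0.6, 1.0]
codes, scale, lo = quantize(weights)
restored = dequantize(codes, scale, lo)
worst_error = max(abs(a - b) for a, b in zip(weights, restored))
print(codes, worst_error)
```

With this mapping the rounding error per value is bounded by about half of one step (`scale / 2`), which is the sense in which accuracy can often be kept while storage drops to one byte per value.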
Architecture Overview
● Large on-chip memory is required for accessing weight values
● Weights can be stored while activations are loaded at the same time
○ The TPU can perform about 64,000 multiply-accumulates per cycle
● The first generation used 8-bit operands and quantization
○ The second generation uses 16-bit operands
● The Matrix Multiplication Unit has 256 × 256 (65,536) ALUs
Architecture Overview Continued
● A minimalistic hardware design reduces die area and power consumption
○ No caches, branch prediction, out-of-order execution, multiprocessing, speculative prefetching, address coalescing, multithreading, context switching, etc.
○ Minimalism is beneficial here because the TPU is required only to run neural network prediction
● The TPU chip is about half the size of the CPU and GPU chips it was compared against
○ 28 nm process with a die size ≤ 331 mm²
○ This is partially due to the simplification of control logic
TPU Stack
● The TPU performs the actual neural network calculation
● It supports a wide range of neural network models
● The TPU stack translates API calls into TPU instructions
CPUs & GPUs
● CPUs and GPUs store values in registers
● A program tracks the read/operate/write operations
● A program tells the ALUs:
○ Which register to read from
○ What operation to perform
○ Which register to write to
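The read/operate/write pattern in the bullets above can be sketched as a toy register machine. This is a simplified model for illustration only, not a real CPU or GPU instruction set; the instruction format, the register names, and the function `run` are all hypothetical.

```python
# Toy model: every instruction names the registers to read, the operation
# to perform, and the register to write. All names here are hypothetical.
OPERATIONS = {
    "add": lambda a, b: a + b,
    "mul": lambda a, b: a * b,
}


def run(program, registers):
    """Execute (op, src_a, src_b, dst) instructions one at a time."""
    for op, src_a, src_b, dst in program:
        a = registers[src_a]                 # read first operand
        b = registers[src_b]                 # read second operand
        registers[dst] = OPERATIONS[op](a, b)  # operate, then write back
    return registers


# Dot product of (2, 3) and (4, 5): each multiply and the final add is a
# separate instruction with its own register reads and write.
regs = {"r0": 2, "r1": 3, "r2": 4, "r3": 5}
program = [
    ("mul", "r0", "r2", "r4"),
    ("mul", "r1", "r3", "r5"),
    ("add", "r4", "r5", "r6"),
]
print(run(program, regs)["r6"])  # prints 23
```

Even this two-element dot product needs six register reads and three register writes, which is the overhead a matrix-oriented design tries to avoid.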
Performance
● The TPU is built around a Matrix Multiplier Unit (MXU)
● The MXU performs hundreds of thousands of operations per clock cycle
● It reads each input value only once
● Inputs are reused many times without being stored back to a register
● Wires connect only adjacent ALUs, so they are short and energy efficient
● Multiplication and addition are performed in a specific order
● This design is known as a systolic array
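The systolic idea above (values enter once at the edge of the grid and are passed between neighboring ALUs) can be sketched as a small cycle-by-cycle simulation. This is a textbook output-stationary arrangement chosen for brevity, which is not necessarily how the MXU arranges its dataflow; the function name `systolic_matmul` and the grid layout are assumptions for illustration.

```python
def systolic_matmul(A, B):
    """Multiply A (n x k) by B (k x m) on a simulated n x m grid of cells.

    Assumed layout: cell (i, j) accumulates C[i][j]. Values of A enter at
    the left edge and move one cell right per cycle; values of B enter at
    the top edge and move one cell down per cycle. Row i and column j are
    delayed by i and j cycles so that matching operands meet in each cell.
    """
    n, k, m = len(A), len(B), len(B[0])
    acc = [[0] * m for _ in range(n)]    # running sums, one per cell
    a_reg = [[0] * m for _ in range(n)]  # value each cell passes right
    b_reg = [[0] * m for _ in range(n)]  # value each cell passes down

    for t in range(n + m + k - 2):
        # Visit cells from bottom-right to top-left so that every cell
        # still sees the value its neighbor held in the previous cycle.
        for i in reversed(range(n)):
            for j in reversed(range(m)):
                if j > 0:
                    a_in = a_reg[i][j - 1]
                else:  # left edge: each A value is read from memory once
                    a_in = A[i][t - i] if 0 <= t - i < k else 0
                if i > 0:
                    b_in = b_reg[i - 1][j]
                else:  # top edge: each B value is read from memory once
                    b_in = B[t - j][j] if 0 <= t - j < k else 0
                acc[i][j] += a_in * b_in  # multiply-accumulate in place
                a_reg[i][j] = a_in
                b_reg[i][j] = b_in
    return acc


print(systolic_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
# prints [[19, 22], [43, 50]]
```

Each element of `A` and `B` is fetched from the input matrices exactly once, at the grid edge, and every later use comes from a neighboring cell's register.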
Matrix Multiplication Unit
● Contains 256 × 256 = 65,536 ALUs
● TPU runs at 700 MHz
● Able to compute 46 × 10^12 multiply-and-add operations per second
● Equivalent to 92 teraops per second in the matrix unit
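The throughput figures above follow from simple arithmetic, sketched below. The only assumptions are that every ALU completes one multiply-and-add per clock cycle and that each multiply-and-add counts as two operations; the function name `peak_throughput` is made up for this example.

```python
def peak_throughput(rows, cols, clock_hz):
    """Return (multiply-adds per second, operations per second).

    Assumes one multiply-and-add per ALU per cycle, and counts each
    multiply-and-add as two operations (one multiply, one add).
    """
    macs_per_second = rows * cols * clock_hz
    return macs_per_second, 2 * macs_per_second


macs, ops = peak_throughput(256, 256, 700_000_000)
print(macs / 1e12)  # about 45.9, rounded to 46 on the slide
print(ops / 1e12)   # about 91.8, rounded to 92 teraops on the slide
```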
Real-World Uses
● RankBrain algorithm used by Google search
● Google Photos
● Google Translate
● Google Cloud Platform
Future Development
● Google Cloud TPUs
○ Use TPU version 2
○ Each TPU includes a high-speed network
○ This allows building machine learning supercomputers called “TPU Pods”
○ Training times are improved
○ TPUs can be mixed and matched with other hardware, including Skylake CPUs and NVIDIA GPUs
