Agentic AI Cloud - Investor Summary_vDraft (2) (2)


haimaker

Investment Summary
December 2024

Copyright © 2024 SambaNova Systems


Confidential & Proprietary | Internal Use Only
The total AI market size could reach $780 billion to $990 billion by 2027, driven by Agentic AI

2024: AI Market (“Single shot AI”): $70-90Bn TAM
Growth: 40%-55% CAGR
2027: AI Market (“Agentic AI”): $780-990Bn TAM

Note: TAM represents AI Infrastructure including compute, inference services, tooling and infrastructure software, networking & storage.
Sources: IDC, Gartner, Morgan Stanley, Bain & Company

Trends in AI Cloud Infrastructure

Today’s AI infrastructure can’t support the Performance Needs for the Future of AI

Agentic AI requires Fast Inference and Concurrent Support of Larger and Larger Foundational Models Tuned with Private Data

Compute Must be Optimized for Lower Provisioning Requirements and Energy Consumption

What is haimaker

haimaker is the first AI Agentic Cloud purpose-built to support multi-model, agent-based AI, delivering high-speed inference built on top of SambaNova (1), the most efficient compute hardware in the world

1) haimaker is a spin-out of SambaNova Systems, with SambaNova serving as the anchor investor.

There is Not a Good Agentic Cloud Solution on the Market Today

Current AI Clouds are Not Built for Agentic AI
● Inference services are single-model, single-shot
● Sub-par performance for multi-model agentic flows
● Focus is on massive infrastructure investment, not AI workflows

Agentic AI Cloud Requirements
● Ultra-low latency in multi-model applications
● Run 1000s of fine-tuned models and hotswap in milliseconds
● Efficient compute at the highest tokens per second, per watt

Agentic AI will Drive an Exponential Increase in Inference Demand

Low compute, high latency acceptable → ~100x more compute, low latency required

Simple Chatbot: Single shot workflow w/ single model (e.g. ChatGPT)
Knowledge Assistant: Chatbot with RAG, requiring embeddings and much more data
Chain of Thought: Many intermediate steps from single model to produce higher quality response
Agentic System: Many models, steps and huge amounts of data to enable autonomous agents

                             Simple Chatbot   Knowledge Assistant   Chain of Thought   Agentic System
Tokens generated per query   1X               1X                    10X                100X
Models used per query        1                12                    1                  ~5-10
Data per query               1X               5X                    10X                1000X

Figure legend: unique model, unique step, data source (e.g., vector db)

1. IDC, Gartner, Morgan Stanley, Bain & Company, SambaNova Systems.
These Systems Require Performance Optimized for Specific Parameters

Agentic AI Category        Performance Metrics      Use Cases
Conversational AI          Time to First Token      Customer Support; AI Avatars; Human-like interactivity
Decision Support Systems   Total Latency            Autonomous Systems; Algorithmic Trading; Emergency Response
Knowledge Generation       Token Throughput Speed   Drug Discovery; Research; Complex Simulations

Axis labels: "Mostly in dev, nearing prod" to "AI Frontier"; "Focused data & simpler logic" to "Broader data & complex reasoning".

The Agentic AI Performance Gap
Metric                                 Typical NVIDIA H100 Speeds (1)   Agentic AI Requirement
Time to First Token                    ~2-4 s (seconds)                 < 400 ms (milliseconds)
Total Latency                          ~0.5+ s (seconds)                < 200 ms (milliseconds)
Throughput Speed (tokens per second)   ~75 tps                          500+ tps

5-10x Optimization Gap
1. H100 speeds based on average serving profiles on the LLAMA 3.2 70b.
2. Data derived from averages of publicly available inference services, including Together, Fireworks, DeepInfra, Lambda, NovitaAI, Hyperbolic, Lepton.
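
The gap between the H100 figures and the agentic requirement above can be checked with simple ratios. The sketch below uses only the slide's numbers; taking the two ends of the "~2-4 s" range and reading "~.5+ s" as a 0.5 s lower bound are assumptions made for illustration.

```python
# Ratio between typical H100 serving figures and the agentic requirement,
# using only the numbers quoted on the slide.

def gap(current: float, required: float, higher_is_better: bool) -> float:
    """Return the improvement factor needed to move from current to required."""
    return required / current if higher_is_better else current / required

# Time to first token: ~2-4 s today vs < 400 ms required.
ttft_gap_low = gap(2.0, 0.4, higher_is_better=False)
ttft_gap_high = gap(4.0, 0.4, higher_is_better=False)

# Total latency: 0.5 s (assumed lower bound of "~.5+ s") vs < 200 ms required.
latency_gap = gap(0.5, 0.2, higher_is_better=False)

# Throughput: ~75 tokens/s today vs 500+ tokens/s required.
throughput_gap = gap(75.0, 500.0, higher_is_better=True)

print(f"TTFT gap: {ttft_gap_low:.1f}x to {ttft_gap_high:.1f}x")
print(f"Latency gap: at least {latency_gap:.1f}x")
print(f"Throughput gap: about {throughput_gap:.1f}x")
```

Under these readings, time to first token and throughput fall inside the quoted 5-10x range, while the latency ratio is only a lower bound because the slide gives an open-ended "~.5+ s".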

Our Partnership With SambaNova Provides Us With
the Fastest Speeds on the Largest Models

Llama 3.1 405B Inference Speeds
(Chart not reproduced. Chart annotation: NVIDIA HW: do not support 405b)

Faster Using Only 16 chips and 10kW
10X Faster

Model                   Tokens/Second/User   SambaNova Chips   Power
Llama 3.1 405B 16-bit   200                  16                10kW
Llama 3.1 70B 16-bit    580                  16                10kW
Llama 3.1 8B 16-bit     1115                 16                10kW
Llama 3.2 3B 16-bit     1500                 16                10kW
Llama 3.2 1B 16-bit     2500                 16                10kW
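
To compare the rows above on an efficiency basis, the per-user token rate can be divided by the quoted power draw. This is a minimal sketch using the slide's figures only; it measures single-user speed per kilowatt, not aggregate rack throughput, since the deck does not state how many concurrent users each configuration serves.

```python
# Per-user decode speed and power figures as quoted on the slide
# (16 chips and 10 kW for every row).
POWER_KW = 10.0

tokens_per_second_per_user = {
    "Llama 3.1 405B": 200,
    "Llama 3.1 70B": 580,
    "Llama 3.1 8B": 1115,
    "Llama 3.2 3B": 1500,
    "Llama 3.2 1B": 2500,
}

def per_kw(tps: float, power_kw: float = POWER_KW) -> float:
    """Tokens per second per user, normalized by power draw in kW."""
    return tps / power_kw

efficiency = {model: per_kw(tps) for model, tps in tokens_per_second_per_user.items()}

for model, value in efficiency.items():
    print(f"{model}: {value:.1f} tokens/s/user per kW")
```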

New players in the market attempt to solve for speed but offer solutions that cannot scale and require new DCs to be built to support

Hardware Configuration to run 70b (vendor logos not reproduced)

Configuration 1: 1 rack, 10 KW total, 100s of models
Configuration 2: 238 KW total, Single Model Capabilities Only
Configuration 3: 92 KW total, Single Model Capabilities Only

Enter haimaker: The Agentic AI Cloud
haimaker provides a purpose-built platform for the agentic AI revolution.

● High-Speed Inference: High-speed, low-latency performance at scale with HW optimized for model swapping and fine-tuned data.

● SambaNova Partnership: Spin-out from SambaNova provides access to the most energy-efficient AI Compute HW at wholesale prices.

● Federated Infrastructure: A global network of existing data centers contributing capacity without requiring new builds.

● World Class Orchestration: Combining high performance computing with cloud native efficiency.

● Sovereign AI: Pipeline of Sovereign customers seeking a managed cloud solution TODAY!

A Unified Vision for Agentic AI
We are partnering with Lepton AI to provide the Only End-to-End Cloud Solution for Agentic AI that is ready to enter the market today

Diagram layers: API Services; Lepton for AI Cloud; Federated by haimaker; haimaker Federated Data Centers Powered by SambaNova’s World’s Most Efficient Compute

Architecture
● haimaker's compute backbone, federated globally
● Integrated compute to runtime, ready for agentic flows
● Lepton's orchestration on haimaker's distributed compute
● High performance, low latency, and cost efficiency at scale
● Specifically designed for multi-model, low-latency workflows
● Production ready, secure, and compliant

Business Model

Lines of Business
● Hardware sales: Sell SambaNova AI Compute HW to data centers where a portion of all HW sold gets contributed to our managed cloud for a revenue share.
● Agentic AI Cloud: Provide managed cloud inference service, in a federated model, enabling data centers to quickly enter the AI market and optimize per-rack profitability.

Revenue Streams
● Rack Sales: $350k margin per node/rack sold to data centers, with a target of 200 racks sold in year 1.
● Inference: 10% revenue share on inference services from data center-contributed capacity.
● SaaS: $25k SaaS revenue per rack per year for Lepton AI orchestration platform.

We have line of sight into more than $70M of contributed margin from hardware sales in year 1
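
The headline margin figure follows directly from the per-rack numbers above. The sketch below reproduces that arithmetic; the function name, the assumption that SaaS revenue applies to every rack sold for a full year, and the `inference_revenue_usd` input (which the deck does not quantify) are illustrative assumptions.

```python
# Per-rack economics as quoted on the slide.
MARGIN_PER_RACK_USD = 350_000
SAAS_PER_RACK_PER_YEAR_USD = 25_000
INFERENCE_REVENUE_SHARE = 0.10
TARGET_RACKS_YEAR_1 = 200

def year_one_streams(racks_sold: int, inference_revenue_usd: float = 0.0) -> dict[str, float]:
    """Return year-1 figures per revenue stream (hypothetical helper).

    Assumes every rack sold also carries a full year of SaaS revenue.
    inference_revenue_usd is the data centers' inference revenue, an input
    the deck does not provide.
    """
    return {
        "rack_margin": racks_sold * MARGIN_PER_RACK_USD,
        "saas": racks_sold * SAAS_PER_RACK_PER_YEAR_USD,
        "inference_share": inference_revenue_usd * INFERENCE_REVENUE_SHARE,
    }

streams = year_one_streams(TARGET_RACKS_YEAR_1)
for name, value in streams.items():
    print(f"{name}: ${value:,.0f}")
```

At the 200-rack target, the rack margin works out to $70M, which matches the line-of-sight figure quoted above.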

We have a pipeline of deals ready to close and
adopt a managed cloud solution

The leadership team at haimaker is the same team that is driving hardware sales at SambaNova

Operating Model with SambaNova

NewCo will acquire SambaNova racks at a fixed price.

NewCo will contractually provide a guaranteed minimum capacity for SambaNova and adhere to SLAs that require fulfilling any additional capacity requests through a purchase mechanism.

NewCo will launch with SambaNova's existing technology, as is, to offer a white-label managed cloud. Funding will be used for hiring and optimization of the technology for the Agentic use case.

Partnership with Lepton

Graveyard

Enter haimaker: The New Agentic AI Cloud

● Performance Tokens: Inference services catered to agentic flows, powered by the fastest, most efficient hardware

● SambaNova: Spin-out from SambaNova provides access to compute at below-market rates

● Lepton AI Partnership: World class orchestration technology to optimize the federated model across data centers and geos

An inference cloud purpose-built for Agentic AI, powered by SambaNova, the world’s most capable AI hardware

haimaker: The Agentic AI Cloud
haimaker provides a purpose-built platform for the agentic AI revolution.

● High-Speed Inference: High-speed, low-latency performance at scale with HW optimized for model swapping and fine-tuned data.

● SambaNova Partnership: Access to the most energy-efficient AI Compute HW at wholesale prices.

● World Class Orchestration: Combining high performance computing with cloud native efficiency.

● Federated Infrastructure: A global network of existing data centers without requiring new builds.

● Sovereign AI: Pipeline of Sovereign customers seeking a managed cloud solution TODAY!

The Agentic AI Opportunity
2024: AI Market (“Single shot AI”): $70-90Bn TAM
Growth: 40%-55% CAGR
2027: AI Market (“Agentic AI”): $780-990Bn TAM

Agentic AI Revolution: Agentic AI applications will require 100-1000x more inference capacity.

Multi-Model Workflows: Agentic flows rely on multiple models with multiple steps, increasing the need for compute.

Performance Demands: Low latency and high throughput are critical for real-time agentic applications.

Infrastructure Limitations: Current cloud infrastructure is not designed for agentic AI's unique demands.

Demand for Scalability: Businesses need a platform that can scale to meet the exponential growth of agentic AI.

Note: TAM represents AI Infrastructure including compute, inference services, tooling and infrastructure software, networking & storage.
Sources: IDC, Gartner, Morgan Stanley, Bain & Company

Agentic AI will result in an exponential increase in
inference demand, driving revenue for On Demand Cloud
models that can meet the performance requirements
Low compute, high latency acceptable → ~100x more compute, low latency required

Simple Chatbot: Single shot workflow w/ single model (e.g. ChatGPT)
Knowledge Assistant: Chatbot with RAG, requiring embeddings and much more data
Chain of Thought: Many intermediate steps from single model to produce higher quality response
Agentic System: Many models, steps and huge amounts of data to enable autonomous agents

                             Simple Chatbot   Knowledge Assistant   Chain of Thought   Agentic System
Tokens generated per query   1X               1X                    10X                100X
Models used per query        1                12                    1                  ~5-10
Data per query               1X               5X                    10X                1000X

Figure legend: unique model, unique step, data source (e.g., vector db)

1. IDC, Gartner, Morgan Stanley, Bain & Company, SambaNova Systems.


The Agentic AI Performance Imperative
Agentic AI Category        Performance Metrics      Use Cases
Conversational AI          Time to First Token      Customer Support; AI Avatars; Human-like interactivity
Decision Support Systems   Total Latency            Autonomous Systems; Algorithmic Trading; Emergency Response
Knowledge Generation       Token Throughput Speed   Drug Discovery; Research; Complex Simulations

Axis labels: "Mostly in dev, nearing prod" to "AI Frontier"; "Focused data & simpler logic" to "Broader data & complex reasoning".

The Current Infrastructure Bottleneck
Metric                                 Typical NVIDIA H100 Speeds (1)   Agentic AI Requirement
Time to First Token                    ~2-4 s (seconds)                 < 400 ms (milliseconds)
Total Latency                          ~0.5+ s (seconds)                < 200 ms (milliseconds)
Throughput Speed (tokens per second)   ~75 tps                          500+ tps

5-10x Optimization Gap
1. H100 speeds based on average serving profiles on the LLAMA 3.2 70b.
2. Data derived from averages of publicly available inference services, including Together, Fireworks, DeepInfra, Lambda, NovitaAI, Hyperbolic, Lepton.

The total AI market size could reach $780 billion to $990 billion by 2027, driven by Agentic AI

2024: AI Market (“Single shot AI”): $70-90Bn TAM
Growth: 40%-55% CAGR
2027: AI Market (“Agentic AI”): $780-990Bn TAM

Note: TAM represents AI Infrastructure including compute, inference services, tooling and infrastructure software, networking & storage.
Sources: IDC, Gartner, Morgan Stanley, Bain & Company

Enter Agentic AI Cloud

● On Demand Compute: Inference services catered to agentic flows, powered by the fastest, most efficient hardware

● Federated Model: Global install base composed of contributed capacity from existing data centers (no new build)

● Managed Cloud Services: World class orchestration technology to optimize the federated model across data centers and geos

An inference cloud purpose-built for Agentic AI, powered by SambaNova, the world’s most capable AI hardware

Agentic AI Cloud: The Only Way Forward

Compared: AI Agentic Cloud, Nvidia, Groq, Cerebras

Fastest Tokens: Nvidia’s architecture makes fast inference impossible. This is where SambaNova, Groq and Cerebras thrive. Fast inference is fundamental for Agentic AI.

Efficient Tokens: Groq and Cerebras require tens of racks to get this speed from one model.

Models per System: And since Agentic AI requires lots of models, Groq and Cerebras become unviable, as more models means more racks. Nvidia too quickly runs out of memory.

Data per System: Agentic AI also requires huge amounts of data from external sources and from reused data during previous steps. Groq and Cerebras run out of memory.

Agentic AI Cloud: We Start Where Others Fail

We unlock vacant or older data centers without the need to retrofit or build out.

We manage SambaNova’s Cloud and cater to the Agentic AI market.

Complete E2E solution with turnkey managed services available for sovereign AI clouds.

Business Model
Objective: Establish an Agentic AI Cloud as a sales acceleration and hardware distribution platform for SambaNova, akin to CoreWeave's relationship with NVIDIA, via a white-label managed cloud.

Lines of Business
● HW Sales via Revenue Share: Sell SambaNova AI Compute HW to data centers where a portion of all HW sold gets contributed to the AI Agentic Cloud inference platform.
● Agentic AI Cloud: Provide managed cloud inference service, in a federated model, enabling data centers to quickly enter the AI market and optimize per-rack profitability.

Revenue Streams
● Rack Sales: $350k margin per node/rack sold to data centers, with a target of 200 racks sold in year 1.
● Inference: 10% revenue share on inference services from data center-contributed capacity.
● SaaS: $25k SaaS revenue per rack per year for orchestration platform.

We will launch the business by purchasing 85 racks in Year 1 from SambaNova.

Proposed Structure & Investment Thesis

Lines of Business
● Spin out NewCo as a managed AI Agentic Cloud and SambaNova’s distribution arm.
● SambaNova provides a $20M anchor investment within a ~$120M raise (option to increase).
● Funds to be used to purchase 85 racks from SambaNova during year 1 to launch the Agentic Cloud.

Investment Thesis
● Time to revenue is immediate via HW sales.
● Proven sales pipeline executed by the same management team.
  ○ Line of sight into 200 racks year 1.
● Wholesale agreements with SambaNova provide competitive advantage.
● Limited new software development drives time to market.
● Break-even and FCF positive within 2 years.
Financial Model ($000s)
Scenario: $120M funding with 85 racks purchased during year 1
Rack Sales Drive Immediate Profits, FCF Positive in Year 2

Sources & Uses ($000s)

Scenario 1: Sources and Uses (85 racks)

Notes:
An itemized breakdown of all uses of funds is currently under development.

Graveyard

Enter Agentic AI Cloud Powered by SambaNova

An Agentic AI requires a system that:                     Agentic AI Cloud                       SambaNova
Can run fast inference at efficient TPS per MW            On Demand Compute                      10 kW per rack
Can run many models at once and hotswap in milliseconds   Model Chaining & Orchestration         Host 100s of models per rack
And can cache huge amounts of data                        SLAs on TTFT, Latency and Throughput   Large Context Capabilities

An inference cloud purpose-built for Agentic AI, powered by the world’s most capable AI hardware

Financing structure (diagram labels)

Entities: SambaNova; Data Center Colo Services; SPV; Asset Lender + Equity Investors; Agentic Cloud (“Newco”)
● $120M cash; delivery of 100 racks
● Principal, interest, colocation fees; distributions
● 80/20 debt/equity $$
● 100% rev share to recovery, 50/50 after; 4 year term
● Recovery is the sum of: Capex + Interest + Colo
● Exclusive right to operate racks in DC
● 100 racks

Financing structure (diagram labels)

Entities: SambaNova; Data Center Colo Services; SPV; Asset Lender + Equity Investors
● Delivery of 100 racks; $120M cash
● 100% rev share to recovery, 50/50 after; 4 year term. Recovery is the sum of: Capex + Interest + Colo
● Exclusive right to operate racks in DC
● Principal, interest, colocation fees; distributions
● 80/20 debt/equity $$
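
The revenue-share rule in the labels above (100% to recovery, then 50/50, with recovery defined as capex plus interest plus colo) can be sketched as a simple waterfall. The deck does not say which parties sit on each side of the split, so the `investors` and `operator` names are assumptions, as are the interest and colo amounts and the yearly revenue figures.

```python
from dataclasses import dataclass

@dataclass
class Split:
    investors: float  # assumed recipient of the 100% share until recovery
    operator: float   # assumed counterparty in the 50/50 split after recovery

def distribute(revenue: float, recovered_so_far: float, recovery_target: float) -> Split:
    """Split one period's revenue: all of it goes toward recovery first,
    and anything beyond the recovery target is shared 50/50."""
    remaining = max(recovery_target - recovered_so_far, 0.0)
    to_recovery = min(revenue, remaining)
    excess = revenue - to_recovery
    return Split(investors=to_recovery + excess / 2, operator=excess / 2)

# Recovery target in $M: capex from the deck ($120M); interest and colo are hypothetical.
recovery_target = 120.0 + 20.0 + 10.0

recovered = 0.0
for year, revenue in enumerate([60.0, 60.0, 60.0, 60.0], start=1):  # hypothetical revenue
    split = distribute(revenue, recovered, recovery_target)
    recovered = min(recovered + revenue, recovery_target)
    print(f"Year {year}: investors {split.investors:.1f}, operator {split.operator:.1f}")
```

The four-period loop mirrors the 4 year term on the slide; the split changes in the period where cumulative revenue crosses the recovery target.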

More info on AI Cloud: we talked a lot about the hardware; now let’s talk more about the cloud here before we get into the business model. This circles back to Slide 5 and makes it come to life. Keep this slide at the CSP/DC level?

Introducing the only true Agentic AI Cloud - we start where others fail

Quick start: We go into vacant data centers without retrofit or need for new builds

Expertise: We manage SambaNova’s cloud

Complete solution: E2E solution with turnkey managed services available for sovereign AI clouds

Agentic AI requires a system that:

● Can run fast inference (tokens per second per user)
● But that fast inference needs to be efficient (tokens per second per watt)
● Can run lots of models and hotswap in milliseconds (models deployed per rack)
● And can cache huge amounts of data, as these systems need to remember a lot of things (total context window per rack)

Notes:
1) An itemized breakdown of all uses of funds is currently under development.

Enter SambaNova + Agentic AI Cloud (I think we keep this slide to the developer persona level?)

Agentic AI Cloud Inference
● Performance Token Speeds
● Serving Profiles Built for Agentic AI
● No Data Center Buildout/Upgrade Required

● Operates at 11 kW for inference on a single rack
● Hosts 100s of models per rack
● 5x more tokens per MW than NVIDIA H100
● Works in traditional CPU datacenters
SN40L: SambaNova’s new CoE-optimized RDU

“Cerulean” Architecture-based Reconfigurable Dataflow Unit

Cerulean SN40L RDU
● 5nm TSMC
● 102B Transistors
● 1,040 RDU Cores
● 638 TFLOPS (bf16)

3-tier Dataflow Memory
● 520 MB On-Chip Memory
● 64 GB High Bandwidth Memory
● 1.5 TB High Capacity Memory

Generative AI Training and Inference

SN40L System - 3-tier Dataflow Memory

On-Chip SRAM [4 GB, PBs per sec]: Dataflow enabled by large On-Chip Memory
  ↕ 12.8 TB/s
RDU High Bandwidth Memory [512 GB]: Super Low Latency Model Switching (e.g. <0.02 sec for Llama V2 7B)
  ↕ 800 GB/s
RDU High Capacity DDR Memory [12 TB]: Up to 5 Trillion Parameters!
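
The capacity and switching figures above can be sanity-checked with back-of-the-envelope arithmetic. The sketch below assumes 16-bit weights (2 bytes per parameter), decimal units, and that a model switch amounts to copying the weights from DDR to HBM at the quoted 800 GB/s; none of these assumptions is stated on the slide.

```python
BYTES_PER_PARAM = 2                    # assumption: 16-bit weights
DDR_BYTES = 12e12                      # 12 TB high-capacity DDR (from the slide)
HBM_BYTES = 512e9                      # 512 GB high-bandwidth memory (from the slide)
DDR_TO_HBM_BYTES_PER_SECOND = 800e9    # 800 GB/s (from the slide)

def weight_bytes(params: float) -> float:
    """Memory needed for the weights alone."""
    return params * BYTES_PER_PARAM

def fits_in_ddr(params: float) -> bool:
    return weight_bytes(params) <= DDR_BYTES

def switch_seconds(params: float) -> float:
    """Time to copy one model's weights from DDR to HBM (assumed switch cost)."""
    return weight_bytes(params) / DDR_TO_HBM_BYTES_PER_SECOND

print(f"5T params need {weight_bytes(5e12) / 1e12:.1f} TB; fits in DDR: {fits_in_ddr(5e12)}")
print(f"7B model switch time: {switch_seconds(7e9):.4f} s")
```

Under these assumptions, 5 trillion parameters occupy 10 TB of the 12 TB DDR tier, and a 7B model copies across in under the 0.02 s quoted for Llama V2 7B.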

Support Large Volume of Models with 10x Less Hardware

10 Nodes Memory Required [DGX H100] (utilization shown: 60%, 30%, 80%, 15%, 40%) versus 1 Node Memory Required

● Reduce Data Center Footprint: Space, HW, Power
● Simplified Hardware Capacity Planning: Throughput + Memory capacity → Throughput
● Auto-Optimize for Various Traffic Patterns: With less predictable model traffic patterns, system can self-optimize

Illustrative Example: 175 expert models (~2.6T Parameters) including mix of: Llama 3 8B, Mistral 7B, Llama 3 70B, Llama 2 70B
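
The node-count comparison in the illustrative example can be approximated from memory capacity alone. The sketch below assumes 16-bit weights, a DGX H100 node with 8 GPUs of 80 GB each (NVIDIA's published configuration, not stated in the deck), and the 12 TB DDR tier quoted for the SN40L system; it ignores KV cache, activations and replication, so it is a lower bound rather than a sizing method.

```python
import math

BYTES_PER_PARAM = 2                 # assumption: 16-bit weights
DGX_H100_NODE_BYTES = 8 * 80e9      # 8 x 80 GB GPUs per DGX H100 node (not from the deck)
SN40L_NODE_BYTES = 12e12            # 12 TB high-capacity memory per SN40L system (from the deck)
TOTAL_PARAMS = 2.6e12               # ~2.6T parameters across 175 expert models (from the deck)

def nodes_needed(params: float, node_memory_bytes: float) -> int:
    """Minimum nodes whose combined memory holds the weights."""
    return math.ceil(params * BYTES_PER_PARAM / node_memory_bytes)

dgx_nodes = nodes_needed(TOTAL_PARAMS, DGX_H100_NODE_BYTES)
sn40l_nodes = nodes_needed(TOTAL_PARAMS, SN40L_NODE_BYTES)

print(f"DGX H100 nodes (weights only): {dgx_nodes}")
print(f"SN40L nodes (weights only): {sn40l_nodes}")
```

Weights alone come to 5.2 TB, which is the same order as the 10 DGX H100 nodes quoted on the slide once headroom not modeled here is added.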

The total AI market size could reach $780 billion to
$990 billion by 2027, driven by Agentic AI

2024: Single Shot AI: $70-90 Bn Total Addressable Market
Growth: 40%-55% CAGR
2027: Agentic AI: $300-$500 Bn Total Addressable Market

Notes:
1) TAM represents AI Infrastructure Enablers including compute tooling and infrastructure software, networking & storage.
2) Sources: IDC, Gartner, Morgan Stanley, Bain & Company
