Proximal Cloud · Hardware Platform

Unity

A 50× roadmap to population-scale inference.
Memory-centric. Heterogeneous. Open.
50× platform improvement
3× lower cost
15× more performance
▮ Made in India
For population-scale inference

The enterprise question

Real business decisions need answers that span ALL enterprise data — fast, private, on-prem.

“How do the latest tariffs or steel prices affect my quarterly results, my ability to deliver products, my revenue and my cost?”
— paraphrasing Larry Ellison on the enterprise AI workload
Enterprise truth lives across four data modalities

Relational SQL (CPU): transactions, financials, ERP

Vector DB (CPU / GPU): semantic search, embeddings

Graph DB (CPU): relationships, supply chain

LLM Inference (GPU): reasoning, synthesis
Answering these questions requires running all four — together, securely, at scale. Today’s infrastructure can’t.
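
As a rough illustration of what "running all four together" involves, the sketch below fans one enterprise question out to the four modalities listed above and tags each sub-task with its compute class. The modality table mirrors the list above; `plan_query` and `compute_classes_needed` are hypothetical names for illustration, not part of any Proximal Cloud API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Modality:
    name: str
    holds: str
    compute: tuple[str, ...]  # compute classes named in the deck


# Taken directly from the four-modality list above.
MODALITIES = (
    Modality("Relational SQL", "transactions, financials, ERP", ("CPU",)),
    Modality("Vector DB", "semantic search, embeddings", ("CPU", "GPU")),
    Modality("Graph DB", "relationships, supply chain", ("CPU",)),
    Modality("LLM Inference", "reasoning, synthesis", ("GPU",)),
)


def plan_query(question: str) -> list[dict]:
    """Hypothetical planner: one sub-task per modality, tagged with its compute."""
    return [
        {"question": question, "modality": m.name, "compute": "/".join(m.compute)}
        for m in MODALITIES
    ]


def compute_classes_needed() -> set[str]:
    """Every compute class a single cross-modal question touches."""
    return {c for m in MODALITIES for c in m.compute}


if __name__ == "__main__":
    for step in plan_query("How do steel prices affect my quarterly results?"):
        print(f"{step['modality']:<15} -> {step['compute']}")
    print("compute classes:", sorted(compute_classes_needed()))
```

Even this toy plan needs both CPU and GPU for a single question, which is the case for putting them in one node.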

The directional shift

From racks of single-purpose chips → integrated nodes where CPU, GPU and xPU live next to memory.

Yesterday

Rack-scale silos

GPU rack · xPU rack · CPU rack
3 racks, 3 fabrics, 3 vendors. Compute travels far to reach memory.
Tomorrow

Memory-centric integrated node

HBM · 128–512 GB
CPU: x86 · 64–128 cores
xPU: Compute.AI
GPU: LLM · Tensor
CXL fabric · DDR · SRAM · HBF
One node. All compute. Memory at the center — KV-aware routing happens inside the node, not across three buildings of fabric.
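
The deck names KV-aware routing without specifying it. One common reading is prefix affinity: send each request to the worker whose KV cache already holds the longest matching token prefix, so less of the prompt has to be recomputed. The sketch below assumes that reading. The worker names, token IDs, and the tie-break on load are all illustrative assumptions.

```python
def shared_prefix_len(a: list[int], b: list[int]) -> int:
    """Number of leading tokens that two sequences have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def route(
    prompt_tokens: list[int],
    cached_prefixes: dict[str, list[list[int]]],
    load: dict[str, int],
) -> str:
    """Pick the worker with the longest cached prefix; break ties by lower load.

    Assumed policy for illustration only, not the Unity node's actual router.
    """

    def score(worker: str) -> tuple[int, int]:
        best = max(
            (shared_prefix_len(prompt_tokens, p) for p in cached_prefixes.get(worker, [])),
            default=0,
        )
        return (best, -load.get(worker, 0))

    return max(load, key=score)


if __name__ == "__main__":
    cached = {"gpu0": [[1, 2, 3]], "gpu1": [[1, 2, 9]]}
    load = {"gpu0": 5, "gpu1": 1}
    print(route([1, 2, 3, 4], cached, load))  # longest cached prefix wins
    print(route([7, 8], cached, load))        # no cache hit: least-loaded worker
```

When all workers share one node's memory, this lookup and the resulting transfer stay local, which is the point the caption above makes.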

The Unity™ node

A memory-centric 2U node where every enterprise workload sits next to HBM — connected by CXL, networked by 800G Ethernet.

Unity™ Node · Compute.AI
HBM · 128–512 GB
LLM Inference (GPU)
Relational SQL (CPU)
Compute.AI: xPU · x86 64–128 cores · orchestrator
Vector DB (CPU/GPU)
Graph DB (CPU)
CXL · DDR lanes
4× 400G / 800G Ethernet

Memory at the center.

Every workload reaches HBM in one hop — no rack-crossing penalties.

All four modalities, one node.

LLM, Vector, Graph and SQL run side-by-side on the right silicon for each.

Open, heterogeneous compute.

x86 + GPU + xPU coexist over CXL. No single-vendor lock-in.

Standard 2U, standard Ethernet.

4× 400G→800G uplinks. Drops into any datacenter.

Proximal Compute — block to full scale

From a single rack-friendly block to a population-scale system. Same node. Same fabric. Linear scale-out.

Build block
4× Compute.AI · Unity™ Node
Nodes: 4
Power: 4 kW
Parameters: 240 B
Fabric: 400G · 1.6 Tbit/s aggregate

12× →

Full rack
Nodes: 48
Power: 48 kW
Parameters: 3.3 T
Tokens / sec: 2.5 M
Fabric: 800G · 3.2 Tbit/s aggregate
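
The block-to-rack figures can be cross-checked with a little arithmetic. The sketch below takes the numbers quoted above as inputs and derives the scale-out factor, per-node power, the number of links implied by each aggregate bandwidth, and tokens per second per node. The class and function names are illustrative, and the even split across nodes is an assumption rather than something the deck states.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    name: str
    nodes: int
    power_kw: float
    params_billion: float
    link_gbps: int        # fabric speed per link, as quoted
    aggregate_gbps: int   # aggregate fabric bandwidth, as quoted


# Figures copied from the build-block and full-rack tiles above.
BLOCK = Config("build block", 4, 4.0, 240.0, 400, 1600)
RACK = Config("full rack", 48, 48.0, 3300.0, 800, 3200)
RACK_TOKENS_PER_SEC = 2_500_000


def scale_factor(small: Config, large: Config) -> float:
    """How many times more nodes the larger configuration has."""
    return large.nodes / small.nodes


def kw_per_node(cfg: Config) -> float:
    """Power per node, assuming an even split across nodes."""
    return cfg.power_kw / cfg.nodes


def links_implied(cfg: Config) -> float:
    """Number of links needed to reach the quoted aggregate bandwidth."""
    return cfg.aggregate_gbps / cfg.link_gbps


if __name__ == "__main__":
    print("scale-out:", scale_factor(BLOCK, RACK))
    for cfg in (BLOCK, RACK):
        print(cfg.name, "kW/node:", kw_per_node(cfg), "links:", links_implied(cfg))
    print("tokens/sec per node:", round(RACK_TOKENS_PER_SEC / RACK.nodes))
```

On these inputs the rack is 12× the block, each node draws 1 kW in both configurations, both aggregate figures correspond to four links at the stated speed, and the rack throughput works out to roughly 52,000 tokens per second per node.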

Platform roadmap — 50×

Three generations. Cost cut by 3×. Performance up by 15×. Result: roughly 50× better than today.

3× lower cost  ×  15× more performance  ≈  50× platform improvement
Gen 1 · 2026
HBM: 128 GB · 1 TB/s
x86 cores: 24
Tensor: 220
Ethernet: 400 GbE
DDR5: 512 GB · 64 GB/s/ch
Memory cost: $400K

Gen 2 · 2028
HBM: 256 GB · 2 TB/s
x86 cores: 64
Tensor: 512
Ethernet: 800 GbE
DDR6: 512 GB · 96 GB/s/ch
HBF-1: 1 TB · 128 GB/s
Memory cost: $300K

Gen 3 · 2030
HBM: 512 GB · 4 TB/s
x86 cores: 96
Tensor: 1024
Ethernet: 1600 GbE
DDR6: 1 TB · 128 GB/s/ch
HBF-2: 4 TB · 256 GB/s/card
Memory cost: $200K
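
To make the roadmap easier to read across generations, the sketch below encodes the per-generation specs listed above and computes Gen 1 to Gen 3 ratios, along with the product of the two headline factors. The field names are illustrative. The "Tensor" figure is carried as listed, since the source gives no unit for it. The product of 3× and 15× comes to 45×, which the headline presumably rounds to 50×.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gen:
    name: str
    year: int
    hbm_gb: int
    hbm_tb_per_s: int
    x86_cores: int
    tensor: int            # as listed in the roadmap; unit not given
    ethernet_gbe: int
    memory_cost_usd: int


# Values copied from the roadmap above.
ROADMAP = (
    Gen("Gen 1", 2026, 128, 1, 24, 220, 400, 400_000),
    Gen("Gen 2", 2028, 256, 2, 64, 512, 800, 300_000),
    Gen("Gen 3", 2030, 512, 4, 96, 1024, 1600, 200_000),
)


def ratio(field: str, first: Gen = ROADMAP[0], last: Gen = ROADMAP[-1]) -> float:
    """Ratio of a spec between two generations (default: Gen 3 over Gen 1)."""
    return getattr(last, field) / getattr(first, field)


def headline_product(cost_reduction: float = 3.0, perf_gain: float = 15.0) -> float:
    """Combined improvement if the cost and performance factors multiply."""
    return cost_reduction * perf_gain


if __name__ == "__main__":
    for field in ("hbm_gb", "hbm_tb_per_s", "x86_cores", "tensor",
                  "ethernet_gbe", "memory_cost_usd"):
        print(f"{field:<16} Gen 3 / Gen 1 = {ratio(field):.2f}x")
    print("headline product:", headline_product())
```

On the listed specs, HBM capacity, HBM bandwidth, x86 core count and Ethernet speed each grow 4× from Gen 1 to Gen 3, while the quoted memory cost halves.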
Unity™ · Hardware Platform

The dot in dot.ai —
built in India, for population scale.

Memory-centric

HBM at the core. CXL across the node.

Heterogeneous

x86 + GPU + xPU under one roof.

Open

Standard 2U, standard Ethernet, no lock-in.

Sovereign

Designed and built in India.

2.5M tokens / sec per rack
3.3T parameters per rack
50× platform improvement by 2030