One platform. Every workload.

From frontier pretraining to low-latency inference to academic research, the primitives are the same — B200s on an NDR fabric, priced to use.

01 · TRAINING

Pretraining, up to frontier scale.

Reserved pods from 16 to 1,024 GPUs, NDR400 fabric, fault-tolerant checkpointing, and an on-call SRE for every reservation over 128 GPUs.

Largest run to date: 840B params · 672× B200
MFU (B200, FP8): 53% sustained
Checkpoint restore: < 90 s · 128 GPUs
Min reservation: 24 hours
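The sustained MFU figure maps directly onto training throughput. A back-of-envelope sketch, assuming roughly 4.5 PFLOPS dense FP8 peak per B200 (an assumption; verify against current NVIDIA specs) and the standard ≈6·N FLOPs-per-token rule of thumb:

```python
# Back-of-envelope training throughput from MFU.
# ASSUMPTION: ~4.5e15 FLOP/s dense FP8 peak per B200 (check NVIDIA specs).
PEAK_FLOPS_FP8 = 4.5e15

def tokens_per_sec(params: float, gpus: int, mfu: float) -> float:
    """Estimate tokens/sec using the ~6*N FLOPs-per-token rule of thumb."""
    cluster_flops = gpus * PEAK_FLOPS_FP8 * mfu
    return cluster_flops / (6 * params)

# The 840B-param run on 672 GPUs at 53% sustained MFU:
print(f"{tokens_per_sec(840e9, 672, 0.53):,.0f} tok/s")  # → 318,000 tok/s
```

Useful for sizing a reservation: halving the parameter count or doubling the pod roughly doubles token throughput under this model.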
Reserve training pod →
02 · INFERENCE

Low-latency serving at any QPS.

Bare-metal or Kubernetes. vLLM, TensorRT-LLM, TGI images preloaded. Autoscale on tokens/sec — not CPU. Pay only when inference is running.

Llama-3.1-70B · TTFT p50: 128 ms · 4× H200
Throughput (70B FP8): 18,200 tok/s per node
Autoscale trigger: tokens/sec, queue depth
Cold start: ~11 s · pre-pulled images
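Node throughput translates into sustained request capacity once you fix an average response length. A rough sketch, assuming ~250 generated tokens per response (an assumption; this is entirely workload-dependent):

```python
# Map node token throughput to sustained request capacity.
# ASSUMPTION: ~250 generated tokens per response (workload-dependent).
NODE_TOK_PER_SEC = 18_200       # 70B FP8, per-node figure from the specs above
AVG_TOKENS_PER_RESPONSE = 250

def sustained_qps(nodes: int) -> float:
    """Responses/sec a pool can sustain at full token throughput."""
    return nodes * NODE_TOK_PER_SEC / AVG_TOKENS_PER_RESPONSE

print(sustained_qps(1))   # 72.8 responses/sec per node
print(sustained_qps(4))   # 291.2 across a 4-node pool
```

This is also why autoscaling on tokens/sec beats CPU-based triggers: token throughput is the quantity the capacity math actually depends on.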
Launch inference stack →
03 · FINE-TUNING

Minutes, not months.

Launch a single-node 8×B200 cluster from the CLI, finish a LoRA run before lunch, push the artifact to your registry, and tear the cluster down. Pay for 3 hours.

# launch, train, teardown
$ neoscale cluster create --gpu b200 --count 8
$ neoscale ssh $(neoscale cluster latest)
$ accelerate launch train_lora.py \
    --model meta-llama/Llama-3.1-70B \
    --dataset s3://ours/fine \
    --epochs 3
# ... 2h 14m later
$ neoscale cluster destroy --confirm
→ total: $55.84 · 2h 14m · you saved $64.13
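Per-minute metering is what makes short runs cheap. A sketch of the billing arithmetic, using a placeholder rate (the page does not publish the on-demand B200 price, so the number below is an assumption, not a quote):

```python
# Per-minute GPU billing, sketched.
# ASSUMPTION: the rate below is a placeholder, NOT Neoscale's published B200 price.
HYPOTHETICAL_RATE_PER_GPU_HR = 3.00

def run_cost(gpus: int, minutes: int, rate_per_gpu_hr: float) -> float:
    """Bill by the GPU-minute: no rounding up to full hours."""
    return round(gpus * minutes * rate_per_gpu_hr / 60, 2)

# 8 GPUs for 2h 14m (134 minutes):
print(run_cost(8, 134, HYPOTHETICAL_RATE_PER_GPU_HR))  # 53.6
```

Compare against hourly rounding: the same run billed as 3 full hours would cost 8 × 3 × rate, and the gap between the two is the "you saved" line on the receipt.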
04 · RESEARCH

Built for university labs.

Shared Slurm clusters, fair-share scheduling, JupyterHub. 40% academic discount. Grant-friendly billing — annual invoice, NET 60.

Academic rate (B200): $2.09 / GPU·hr
Free credits (new labs): $2,500
Scheduler: Slurm 23.02 · shared
Open-source discount: additional 15%
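How the discounts combine: a sketch assuming the additional 15% applies multiplicatively on top of the academic rate (an assumption; the page says only "additional 15%"):

```python
ACADEMIC_RATE = 2.09          # $/GPU·hr, B200, from the table above
OPEN_SOURCE_DISCOUNT = 0.15   # "additional 15%" (assumed multiplicative)

def effective_rate(base: float, extra_discount: float) -> float:
    """Effective $/GPU·hr after stacking the open-source discount."""
    return round(base * (1 - extra_discount), 4)

print(effective_rate(ACADEMIC_RATE, OPEN_SOURCE_DISCOUNT))  # 1.7765
```

At that rate, the $2,500 starter credit covers on the order of 1,400 B200 GPU-hours for an open-source lab.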
Apply for research access →

Who's building on Neoscale.

// foundation models

Frontier labs

Multi-month 1,024-GPU reservations for pretraining runs. Private fabric, named SRE, priority scheduling.

// robotics

Embodied AI

Sim-training with Isaac and MuJoCo. Bursty 128-GPU jobs, nightly fine-tuning of world models.

// biotech

Protein & molecule AI

AlphaFold3-class pipelines, diffusion models for binder design. HIPAA-aligned private tenancy.

// media

Video & 3D

L40S fleets for rendering, B200 for diffusion training. Object storage with zero egress to your CDN.

// finance

Quant & risk

Private-tenancy H100 pods, SOC 2 Type II, audit log export to your SIEM.

// public sector

Gov & defense

FedRAMP in progress. US-only operators, citizen-personnel option, ITAR workflows on request.