The Execution Engine for Heavy Compute.
One API across every vetted GPU supplier. USD invoicing through procurement. Failover without cross-region egress. Built for teams whose async compute bill outgrew its buyer.
- Capacity: Hyperscaler + Tier-3 + verified alt supply.
- Compliance: KYC, OFAC, USD invoicing. SOC 2-aligned. Procurement-signable.
- Price: H100 at $2.20 / hr PAYG. −44% vs AWS retail.
Engineers from Google · Amazon · Snowflake · Uber · Scale AI · Citadel · Capital One
Engineering, security, and procurement walkthrough → Product. Pricing and commit-plan math → Pricing. Agents and services buying their own compute → M2M.
Product
One API. Every vetted GPU supplier.
You submit a container and a GPU. We route it to the cheapest healthy supplier that passes your compliance filter, handle failover when hardware drops, and invoice you in USD. That is the whole product.
In 60 seconds
Install the SDK. Submit a job. Wait on the handle. That is the whole API surface.
```
$ pip install nodus-sdk
```

```python
>>> import os, nodus
>>>
>>> client = nodus.Client(api_key=os.environ["NODUS_API_KEY"])
>>> job = client.run(
...     image="ghcr.io/acme/train:v3",
...     command=["python", "train.py", "--epochs", "10"],
...     gpu="h100_80gb", gpu_count=8,
...     max_runtime_seconds=18 * 3600,
... )
>>> job.wait()
>>> print(job.status, job.supplier, job.region, job.cost_usd)
JobStatus.COMPLETED lambda us-east-1 47.62
```
REST and Python SDK today. TypeScript SDK in private preview. Full reference at nodus.run/docs; live sandbox at nodus.run/docs/#playground.
What you can run
Anything you ship as an OCI image with a GPU and a runtime cap. GPU first; the rest is on the roadmap below.
| Compute | Supply | Typical workloads |
|---|---|---|
| GPU async Live | H100, H200, B200, A100 across hyperscaler regions, Tier-3 data centers, and verified neocloud capacity. | Training, fine-tuning, RAG ingestion, agent fleets, batch inference, diffusion render. |
| CPU batch Q3 2026 | Spot CPU pools and bare-metal CPU clusters, billed by the second. | Monte Carlo, ETL, transcoding, agent simulation, embeddings prep. |
| Bare metal & HPC Q4 2026 | Audited Tier-3 partners and certified HPC operators with InfiniBand and parallel filesystems. | CFD, weather and climate, genomics, protein folding, physics. |
Reliability
If a supplier fails mid-run, the next attempt lands on a healthy one near where your data already sits. No restart from scratch. No cross-region egress every time a node drops.
- Per-supplier scoring. Every job writes placement, retries, and outcome to the Trust Ledger. The router reads those scores on the next placement.
- Checkpoints stay next to compute. Job state writes to parallel storage with geographic redundancy. Failover picks a supplier on the same backbone so state loads locally.
- Idempotent retries. The SDK keys every submission. Retrying the same call returns the original job, never a duplicate charge.
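The idempotent-retry guarantee can be sketched client-side. A minimal illustration of the pattern, not the SDK's actual internals: the `idempotency_key` helper and the idea of hashing the canonical job spec are assumptions for the example.

```python
import hashlib
import json

def idempotency_key(spec: dict) -> str:
    """Derive a deterministic key from the job spec, so retrying the
    same submission always presents the same key to the control plane."""
    canonical = json.dumps(spec, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()

spec = {
    "image": "ghcr.io/acme/train:v3",
    "gpu": "h100_80gb",
    "gpu_count": 8,
}

# The same spec always yields the same key, regardless of dict order,
# so a retried submission resolves to the original job, not a new charge.
assert idempotency_key(spec) == idempotency_key(dict(reversed(spec.items())))
```

Because the key is a pure function of the spec, a client that crashes after submitting and retries blind still cannot double-charge itself.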
Fits your stack
- Same images, same entry points. Your Docker images, your command lines, your artifact paths. Nothing about your training script changes.
- Schedulers already integrated. Call from Airflow, Dagster, Prefect, Temporal, LangChain, AutoGen, or any worker that can make an HTTP request.
- Your data plane stays yours. Your S3, your VPC peering, your auth. Nodus only sees what you put in the job spec.
- Webhooks on every state change. Placed, running, completed, failed, and budget thresholds. HMAC-SHA256 signed, one schema, same scheme as inbound signed submissions.
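Verifying a signed webhook takes a few lines with the standard library. A sketch under the documented header format (`Nodus-Signature: v1=<mac>,ts=<unix>`, HMAC-SHA256, stale timestamps rejected independently of the MAC); the exact signed payload, assumed here to be `<ts>.<body>`, is an illustration, not the confirmed wire format.

```python
import hashlib
import hmac
import time

def verify_webhook(header: str, body: bytes, secret: bytes,
                   tolerance_s: int = 300) -> bool:
    """Check a `Nodus-Signature: v1=<mac>,ts=<unix>` header.
    Assumes the MAC covers '<ts>.<body>'."""
    parts = dict(p.split("=", 1) for p in header.split(","))
    ts = int(parts["ts"])
    if abs(time.time() - ts) > tolerance_s:
        return False  # stale timestamp: rejected before any MAC comparison
    expected = hmac.new(secret, f"{ts}.".encode() + body,
                        hashlib.sha256).hexdigest()
    # Constant-time comparison to avoid timing side channels.
    return hmac.compare_digest(parts["v1"], expected)
```

Since inbound signed submissions use the same scheme, the same helper (run in reverse to produce the MAC) covers both directions.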
Agents and Services, not just humans
Nodus was designed from day one for callers that don't have a person in the loop. Spend caps, signed requests, and programmatic settlement are primitives, not settings we bolted on later.
- Per-key spend envelopes. Mint a key with a monthly USD cap. Any request that would push month-to-date plus worst-case runtime past the cap fails fast with 402 budget_exceeded, before a GPU is provisioned. In-flight jobs count against the ceiling so a burst can't race the cap.
- Signed submissions. Keys minted for agents ship with an HMAC-SHA256 signing secret. Every request carries Nodus-Signature: v1=<mac>,ts=<unix>; stale timestamps are rejected independently of the MAC. Same scheme as outbound webhooks — one SDK helper works both directions.
- Programmatic settlement. No invoices, no net-30 for a bot. Every finished job writes a metered event tagged with api_key_id and principal kind. Humans and agents are separable in your Stripe data without a second pipe.
- Budget webhooks. Subscribe to budget.threshold_reached and budget.exceeded so the agent's control loop can self-throttle without polling. Endpoints can be scoped to a single key, so a customer running many agent fleets fans alerts out correctly.
- Row-level attribution. Every job in the Trust Ledger carries its api_key_id and principal kind (human, agent, or service). Reliability analytics separate M2M traffic from human-driven runs without a join.
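The spend-envelope check above is plain arithmetic, and can be sketched as follows. This is an illustration of the admission rule as described (month-to-date plus worst-case runtime must stay under the cap, with in-flight jobs counted), not the service's actual implementation; function names are invented for the example.

```python
def worst_case_usd(gpu_count: int, rate_per_gpu_hr: float,
                   max_runtime_seconds: int) -> float:
    """Upper bound on a job's cost: every GPU runs to the runtime cap."""
    return gpu_count * rate_per_gpu_hr * (max_runtime_seconds / 3600)

def admits(job_worst_case: float, month_to_date: float,
           in_flight_worst_case: float, monthly_cap: float) -> bool:
    """Fail fast (402 budget_exceeded) before any GPU is provisioned.
    In-flight worst cases count, so a burst can't race the cap."""
    return month_to_date + in_flight_worst_case + job_worst_case <= monthly_cap

# 8x H100 at $2.20/hr for up to 18 h -> worst case $316.80
cost = worst_case_usd(8, 2.20, 18 * 3600)
```

Counting the worst case rather than the running total is what makes the check safe to evaluate at submission time.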
Security, compliance, billing
Bundled at every plan, not paywalled. Procurement gets one MSA, one invoice, one vendor.
Security
- SOC 2-aligned controls. Type II under audit Q3 2026.
- Single-tenant job isolation. Per-job audit record.
- Customer-managed keys for checkpoint storage (Q3 2026).
Supply-side compliance
- KYC and AML on every operator.
- OFAC screening on every payout.
- Source-of-funds documented before onboarding.
Billing
- USD invoice via Stripe, monthly in arrears.
- Standard MSA. No crypto ever touches your books.
- Every run is a named line item.
Pricing
Per-hour SKUs. The published price is the all-in price — no platform fee, no take rate, no surprise line items. Free under $200 / mo.
Per-hour rate
| SKU | Nodus PAYG | Tier-3 list | Neocloud list | Hyperscaler (AWS) list | Savings vs hyperscaler |
|---|---|---|---|---|---|
| H100 80GB | $2.20 / hr | $2.40 / hr | $2.65 / hr | $3.90 / hr | −44% |
| H200 141GB | $3.20 / hr | $3.49 / hr | $3.86 / hr | $5.90 / hr | −46% |
| A100 80GB | $1.40 / hr | $1.53 / hr | $2.00 / hr | $3.06 / hr | −54% |
| A100 40GB | $1.10 / hr | $1.20 / hr | $1.57 / hr | $2.40 / hr | −54% |
| L40S | $0.95 / hr | $1.05 / hr | $1.44 / hr | $2.20 / hr | −57% |
| H100 spot | $1.65 / hr | n/a | n/a | n/a | preemptible, 99.5% completion target |
Tier-3 / Neocloud / Hyperscaler: indicative published / walk-up list by class (US, Q1 2026), not the cheapest informal wholesale print. Nodus PAYG is set below the Tier-3 reference on each row. Hyperscaler = US-East on-demand where listed. CPU batch and bare-metal HPC SKUs ship Q3 and Q4 2026.
Commitments · optional, never required.
| Plan | SKU discount | Notes |
|---|---|---|
| Pay-as-you-go | — | First $200 / mo free. |
| 1-year commit | −15% | Quota guaranteed. |
| 3-year commit | −30% | Reserved capacity, named CSE. |
| Enterprise | negotiated | $1M+/mo. Dedicated support. |
Example spend · $250k / mo of H100 compute
~113,000 GPU-hours / month. Same workload, five price points.
- AWS retail @ $3.90 / hr: $443k / mo
- AWS with EDP @ ~$2.80 / hr: $318k / mo
- Nodus PAYG @ $2.20 / hr: $250k / mo
- 1-year commit @ $1.87 / hr: $213k / mo
- 3-year commit @ $1.54 / hr: $175k / mo
- Savings vs retail on PAYG: $193k / mo · $2.3M / yr
Roadmap
What ships over the next four quarters. Closed-loop changelog to design partners every two weeks.
- Q1 · V1 control plane. REST and Python SDK. Stripe USD billing. Per-job audit record. Per-key spend caps, signed submissions, and budget webhooks shipping with V1 (see M2M). TypeScript SDK in private preview.
- Q2 · Trust Ledger live. First 10,000 hours of telemetry scoring H100, A100, and L40S suppliers. Failover next to data on Filecoin and Arweave.
- Q3 · Enterprise + CPU. SOC 2 Type II under audit. VPC peering. Customer-managed keys. CPU batch (spot and bare-metal) live.
- Q4 · HPC + pipelines. Native plugins for Airflow, Dagster, Prefect, Temporal, LangChain, AutoGen. Multi-region DR. Tier-3 HPC operators with InfiniBand and parallel filesystems.