Modal Pricing 2026: Serverless GPU & CPU Costs

Starter: Free. Team: $250/month. Enterprise: Custom.

All Modal Plans & Pricing

Plan	Monthly	Annual	Compute credits	Seats	Best For
Starter	Free	Free	$30/month	Up to 3	Individual developers and small teams getting started with serverless GPU compute
Team	$250/month	Not listed	$100/month	Unlimited	Startups and growing teams needing higher concurrency, custom domains, and collaboration features
Enterprise	Contact Sales	Contact Sales	Not listed	Unlimited	Organizations prioritizing security, compliance, dedicated support, and high-volume GPU compute

Verified pricing · last checked May 2026 · 3 sources

Features by plan

Starter

$30/month in free compute credits
Up to 3 workspace seats
100 containers + 10 GPU concurrency
5 deployed crons, 8 deployed web endpoints
1 day log retention
200 deployed apps
Real-time metrics and logs
Region selection (1.25–2.5x base prices)
SOC 2 compliance
Community Slack support

Team

$100/month in free compute credits
Unlimited workspace seats
1000 containers + 50 GPU concurrency
Unlimited crons and web endpoints
30 day log retention
1000 deployed apps
Custom domains
Static IP proxy
Deployment rollbacks (custom versions)
RBAC and SSO
Community Slack support

Enterprise

Volume-based compute discounts
Unlimited workspace seats
Custom GPU concurrency limits
Custom container limits
Embedded ML engineering services
Support via private Slack
Audit logs, Okta SSO, HIPAA compatibility
Custom log retention
AWS and GCP marketplace transacting
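
The plan limits above can be read as a simple lookup. The following is a minimal sketch, assuming only the seat, container, and GPU concurrency limits listed for Starter and Team; the `Plan` class and `smallest_plan` helper are hypothetical names for illustration, not part of any Modal tooling, and anything beyond Team limits is treated as an Enterprise conversation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Plan:
    name: str
    monthly_fee_usd: float
    monthly_credits_usd: float
    max_seats: Optional[int]  # None means unlimited seats
    max_containers: int
    max_gpu_concurrency: int


# Limits copied from the plan lists above; Enterprise limits are custom.
PLANS = [
    Plan("Starter", 0.0, 30.0, 3, 100, 10),
    Plan("Team", 250.0, 100.0, None, 1000, 50),
]


def smallest_plan(seats: int, containers: int, gpu_concurrency: int) -> str:
    """Return the first (cheapest) listed plan whose limits cover the request."""
    for plan in PLANS:
        seats_ok = plan.max_seats is None or seats <= plan.max_seats
        if (
            seats_ok
            and containers <= plan.max_containers
            and gpu_concurrency <= plan.max_gpu_concurrency
        ):
            return plan.name
    # Anything above Team limits needs custom (Enterprise) limits.
    return "Enterprise"
```

For example, `smallest_plan(seats=5, containers=50, gpu_concurrency=8)` returns `"Team"`, because five seats exceeds the Starter limit of three.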


At a glance

List price by tier (annualized platform fee): Starter free, Team $3.0K/yr, Enterprise custom.

Quick Answer

Last verified: May 6, 2026

High confidence

Modal's platform fee ranges from free to $250 per month as of May 2026, with 3 plans available including a free tier. Plans: Starter (free) and Team at $250/month. Enterprise pricing is available on request. GPU and CPU compute is billed separately by usage on every plan. Pricing depends on your chosen tier, contract length, and negotiated discounts.


Free tier: Yes

Modal offers 3 pricing tiers: Starter, Team, Enterprise. A free plan is available. Paid plans include Team at $250/month. The Team plan is aimed at startups and growing teams needing higher concurrency, custom domains, and collaboration features.

Compared to other AI/GPU cloud compute software, Modal is positioned at the mid-market price point.

2 documented hidden costs beyond list price

How much does Modal cost?

Modal offers 3 pricing plans, starting with a free tier and scaling to custom enterprise pricing. Plans include Starter (free), Team at $250/month, and Enterprise (custom pricing).

Modal Pricing Overview

Modal has 3 pricing plans, including a free tier. Platform fees range from $0 to $250/month, with compute billed separately by usage. The Starter plan is free and is best for individual developers and small teams getting started with serverless GPU compute. The Team plan costs $250/month and is best for startups and growing teams needing higher concurrency, custom domains, and collaboration features. The Enterprise plan requires contacting sales for a custom quote and is designed for organizations prioritizing security, compliance, dedicated support, and high-volume GPU compute.

Community reports put Modal GPU compute spend at roughly $2-$72/hr, from a single baseline GPU up to multi-GPU serving of large models.

There are at least 2 documented hidden costs beyond Modal's list price: GPU compute billed on top of the plan fee, and DIY configuration overhead with cold start latency.

This pricing was last verified on May 6, 2026 from 3 independent sources.


Modal is a serverless compute platform that lets Python developers run GPU and CPU workloads in the cloud with zero infrastructure management. You write Python functions decorated with @app.function, and Modal handles scaling, containers, and billing. You pay only for active compute time, with no idle charges and no minimum commitments. Modal is popular for ML inference, fine-tuning, batch jobs, and GPU-accelerated Python scripts. Per the usage-based rate table below, GPU pricing ranges from $0.59/hr for a T4 to $6.25/hr for a B200, with the A10 at $1.10/hr and the H100 at $3.95/hr.

How Modal Pricing Compares

Compare Modal pricing against top alternatives in AI/GPU Cloud Compute.

RunPod: $0.34-$3.49/GPU/hour
Lambda: $0.69-$6.99/GPU/hour
CoreWeave: $10-$68.8/instance/hour

Usage-Based Rates

Per-unit pricing for Modal API usage.


Model	Unit	Rate	Hourly equivalent	Note
Nvidia B200	second	$0.00174	$6.25/hr	Top-tier GPU
Nvidia H200	second	$0.00126	$4.54/hr	
Nvidia H100 (80GB)	second	$0.00110	$3.95/hr	
Nvidia RTX PRO 6000	second	$0.00084	$3.03/hr	
Nvidia A100 80GB	second	$0.00069	$2.50/hr	
Nvidia A100 40GB	second	$0.00058	$2.10/hr	
Nvidia L40S (48GB)	second	$0.00054	$1.95/hr	
Nvidia A10 (24GB)	second	$0.00031	$1.10/hr	
Nvidia L4	second	$0.00022	$0.80/hr	
Nvidia T4 (16GB)	second	$0.00016	$0.59/hr	Entry GPU
CPU (per physical core, 2 vCPU equiv)	second	$0.000013	$0.047/core-hr	
Memory (per GiB)	second	$0.000002	$0.008/GiB-hr	

Compute rates apply to all plans (Starter, Team, Enterprise)
Billed per second of active execution — no idle charges
Free credits automatically applied before billing
Sandbox/Notebook CPU and memory priced higher (3x base)
Non-preemptible execution available at 3x base prices
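
To make the per-second rates concrete, here is a minimal sketch of a job cost estimate. It assumes a job's cost is the sum of the GPU, CPU, and memory rates multiplied by active seconds, and that a single multiplier (for example a 1.25x region surcharge) applies to the whole sum; the second assumption is not confirmed by the table. The dictionary keys and the `job_cost_usd` helper are hypothetical names for illustration.

```python
# Per-second list rates copied from the table above (a subset of GPUs).
GPU_RATE_PER_SECOND = {
    "B200": 0.00174,
    "H200": 0.00126,
    "H100": 0.00110,
    "A100-80GB": 0.00069,
    "A10": 0.00031,
    "L4": 0.00022,
    "T4": 0.00016,
}
CPU_RATE_PER_CORE_SECOND = 0.000013
MEMORY_RATE_PER_GIB_SECOND = 0.000002


def job_cost_usd(
    gpu: str,
    seconds: float,
    cpu_cores: float = 0.0,
    memory_gib: float = 0.0,
    multiplier: float = 1.0,
) -> float:
    """Estimate the cost of one job from active seconds and list rates.

    Assumption: the multiplier applies uniformly to GPU, CPU, and memory.
    """
    per_second = (
        GPU_RATE_PER_SECOND[gpu]
        + cpu_cores * CPU_RATE_PER_CORE_SECOND
        + memory_gib * MEMORY_RATE_PER_GIB_SECOND
    )
    return per_second * seconds * multiplier


# A 10-minute H100 job with 4 physical cores and 16 GiB of memory.
print(round(job_cost_usd("H100", 600, cpu_cores=4, memory_gib=16), 4))
```

Under these assumptions the 10-minute H100 job comes to about $0.71, of which $0.66 is the GPU itself.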

Compare Modal vs Alternatives

Before committing to Modal, compare pricing with these 3 alternatives in the same category.

Alternative	Starting price	Typical workloads
RunPod	From $0.34/GPU/hour	Batch workloads, training runs, and cost-sensitive inference that can tolerate interruptions
Lambda	From $0.69/GPU/hour	Development, fine-tuning, and small inference jobs
CoreWeave	From $10/instance/hour	Large-scale AI training

What Companies Actually Pay for Modal

Review scores

Third-party review aggregates, as of Apr 2026

Top pricing complaints

GPU costs escalate very quickly beyond baseline usage
Cold start latency impacts production workload reliability
Requires DIY configuration for inference stacks like vLLM
H100 pricing (~$4.5/hr) is higher than competing platforms like RunPod (~$2.5/hr)

Modal Year 1 Total Cost by Company Size

Real deployment costs including licenses, implementation, training, and admin — not just the sticker price.

Solo Developer on Starter Plan: $0 Year 1 total

A solo developer using the free Starter plan for occasional on-demand GPU workloads. Compute is billed at ~$2/hr per GPU baseline with no monthly platform fee.

Small Team Running Large Model Inference: $250/month platform fee ($3,000 over Year 1), before compute

A small team on the Team plan deploying a 100B+ parameter model for product use. Large model serving can require multi-GPU setups at ~$72/hr, making this viable for shared usage across many users but prohibitively expensive for low-utilization individual deployments.

HN community (sid-the-kid, 2025-04-24)
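
To see why utilization dominates the large-model scenario, the following rough sketch turns the ~$72/hr community estimate into a monthly figure. It assumes a 730-hour average month and that cost scales linearly with the fraction of time the deployment is running; the function names and the 200-user example are hypothetical.

```python
HOURS_PER_MONTH = 730  # assumption: average hours in a month


def monthly_serving_cost_usd(hourly_rate_usd: float, utilization: float) -> float:
    """Monthly compute cost when the deployment runs for `utilization` of the month."""
    return hourly_rate_usd * HOURS_PER_MONTH * utilization


def cost_per_user_usd(hourly_rate_usd: float, utilization: float, users: int) -> float:
    """Spread the monthly compute cost evenly across users."""
    return monthly_serving_cost_usd(hourly_rate_usd, utilization) / users


always_on = monthly_serving_cost_usd(72, 1.0)   # running all month
light_use = monthly_serving_cost_usd(72, 0.10)  # running 10% of the month
shared = cost_per_user_usd(72, 1.0, users=200)  # always on, shared by 200 users
print(round(always_on), round(light_use), round(shared, 2))
```

Under these assumptions an always-on deployment costs about $52,560 per month, which works out to roughly $263 per user when shared by 200 users.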

How Modal Pricing Compares

Software	Starting Price	Top Price
Modal	Free	$250/month (Team platform fee; compute billed separately)
CoreWeave	$6.27/instance/hour	$68.8/instance/hour
Hyperbolic	$0.16/GPU/hour	$3.5/GPU/hour
Lambda	$0.69/GPU/hour	$6.99/GPU/hour
Paperspace	$0.45/GPU/hour	$5.95/GPU/hour
RunPod	$0.27/GPU/hour	$7.39/GPU/hour


2 Modal Hidden Costs Beyond the List Price

Beyond the listed price, Modal has at least 2 documented hidden costs that can significantly increase total cost of ownership.

GPU Compute Costs Billed on Top of Plan Fee: $2-$72/hr per GPU (high, 3 sources)

Hacker News: "Pricing is about $2/hr per GPU (as a baseline of the costs). Long story short, things get VERY expensive quickly."
Hacker News: "On Modal, I think should cost about $72/hr to serve Kimi K2 https://modal.com/pricing Once that's running it can serve the needs of many users/clients simultaneously."
Reddit: "I just looked at the H100S pricing—it's $4.5/hour."

DIY Configuration Overhead and Cold Start Latency: 5-15% of license costs (medium, 1 source)

Hacker News: "every inference provider is either fast-but-expensive (Together, Fireworks — you pay for always-on GPUs) or cheap-but-DIY (Modal, RunPod — you configure vLLM yourself and deal with slow cold starts)."

Tip

Ask your Modal sales rep about these costs upfront. Getting them in writing before signing can save you from surprise charges later.


Intelligence sourced from 2 independent sources

Hacker News (tech community) and Reddit (user discussions)

Key claims include inline source attribution. Data verified against multiple independent sources. 7 source citations total.

Modal Contract Terms

Modal contracts do not auto-renew. Changes require advance notice. These terms are sourced from verified buyer experiences.

Reported GPU compute charges: $2-$72/hr (usage-based; the sources describe compute spend, not a contractual price escalation)

Based on 3 verified sources

Modal Pricing FAQ

01 How much does Modal cost?

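Modal charges per second of active compute. Per the usage-based rate table above, GPU pricing ranges from $0.59/hr for a T4 to $6.25/hr for a B200, with the A10 at $1.10/hr and the H100 at $3.95/hr. CPU is $0.000013 per physical core per second. All new accounts get $30/month in free compute credits.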

02 Does Modal have a free tier?

Yes — Modal's Starter plan includes $30/month in compute credits at no cost. These credits apply to any GPU or CPU workload, making it free for light personal use and experimentation.

03 How does Modal billing work?

Modal bills per second of actual execution. There are no idle charges: you only pay when your function is actively running. This makes Modal very cost-effective for sporadic workloads compared to always-on servers.

04 What GPUs does Modal support?

Modal supports T4, A10G, L40S, A100 (40GB PCIe and 80GB SXM), H100 (PCIe and SXM), and H200. The usage-based rate table above also lists L4, RTX PRO 6000, and B200. GPU availability varies; H100 and H200 require selection via the SDK.

05 Modal vs RunPod: which is cheaper?

For sporadic/bursty workloads, Modal is cheaper due to zero idle charges. For sustained heavy usage, RunPod's Secure Cloud or Community Cloud (spot at ~50% off) is cheaper per hour. Modal A10G costs $1.10/hr vs RunPod A10G at ~$0.69/hr on-demand — but Modal's billing precision eliminates wasted compute.
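
The trade-off above can be sketched as a break-even utilization. This assumes the always-on instance is billed for every hour regardless of load, while the serverless option is billed only for the busy fraction of each hour; it ignores cold starts, storage, and spot discounts. The function names are hypothetical, and the $1.10 and $0.69 figures are the A10G rates quoted above.

```python
def break_even_utilization(serverless_hourly: float, always_on_hourly: float) -> float:
    """Busy fraction at which both options cost the same per wall-clock hour."""
    return always_on_hourly / serverless_hourly


def cheaper_option(
    utilization: float, serverless_hourly: float, always_on_hourly: float
) -> str:
    """Compare cost per wall-clock hour at a given busy fraction (0.0 to 1.0)."""
    serverless_cost = serverless_hourly * utilization
    return "serverless" if serverless_cost < always_on_hourly else "always-on"


threshold = break_even_utilization(1.10, 0.69)
print(f"break-even at {threshold:.0%} utilization")
print(cheaper_option(0.50, 1.10, 0.69))
print(cheaper_option(0.80, 1.10, 0.69))
```

With these rates the break-even sits near 63% utilization: a GPU that is busy less than that fraction of the time is cheaper on per-second billing, and one that is busier is cheaper on the always-on hourly rate.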

06 Is the Starter or Team plan fee the total cost, or are there additional compute charges?

The plan fee ($0 for Starter, $250/month for Team) is a platform access charge only. GPU compute is billed separately, per second of use, on top of the plan. Community reports put baseline GPU costs around $2/hr per GPU, H100s at ~$4.5/hr, and large-model serving at up to ~$72/hr. Actual monthly spend depends entirely on compute utilization.
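
As a worked example of how the plan fee and compute combine, here is a minimal sketch. It assumes the monthly bill is the plan fee plus any compute spend left after the plan's free credits are applied, and that unused credits have no cash value; `monthly_bill_usd` is a hypothetical helper, not a Modal API.

```python
def monthly_bill_usd(
    plan_fee_usd: float, free_credits_usd: float, compute_usd: float
) -> float:
    """Plan fee plus compute spend remaining after free credits are applied."""
    billable_compute = max(0.0, compute_usd - free_credits_usd)
    return plan_fee_usd + billable_compute


# Starter: $0 fee and $30 credits. Team: $250 fee and $100 credits.
print(monthly_bill_usd(0, 30, 20))      # light Starter usage, covered by credits
print(monthly_bill_usd(0, 30, 45))      # Starter usage above the credit allowance
print(monthly_bill_usd(250, 100, 500))  # Team plan with $500 of compute
```

Under these assumptions a Team workspace that uses $500 of compute in a month pays $650: the $250 fee plus $400 of compute after credits.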

07 Does Modal have cold start issues for production workloads?

Yes. Modal's serverless architecture spins containers on-demand rather than keeping GPUs always-on. This produces cold start latency when functions haven't been called recently. Users also need to configure their own inference stack (e.g., vLLM) rather than using a managed inference endpoint.

08 How does Modal GPU pricing compare to RunPod?

Based on community comparisons, Modal's H100 runs ~$4.5/hr versus RunPod's H100 at ~$2.5/hr. RunPod also offers lower-tier GPUs like RTX A5000 and RTX 3090 at ~$0.22/hr. Modal's higher pricing reflects its managed serverless infrastructure and simpler deployment model. Check modal.com/pricing directly for current GPU-specific rates.

09 Do I pay for container uptime or only execution time on Modal?

Modal's on-demand model is designed around paying for execution time rather than idle GPU time, which is its core value proposition versus always-on GPU providers. However, confirm the specific billing model for your workload type at modal.com/pricing, as container warm-up and minimum billing increments may apply.
