Architecture · January 29, 2025 · 7 min read

API Rate Limiting Strategies for SaaS Applications

Rate limiting protects your API, enforces pricing tiers, and prevents abuse. Here are the strategies that work for SaaS applications.

Why rate limiting matters for SaaS

Rate limiting controls how many API requests a client can make in a given time period. For SaaS, it serves three purposes:

1. Tier enforcement: Free users get 100 requests/day, Pro users get 10,000.
2. Infrastructure protection: Prevent a single user from overloading your servers.
3. Cost control: If your backend calls paid APIs (OpenAI, Twilio), rate limiting prevents runaway costs.

Without rate limiting, a single misbehaving client can take down your entire service.

Fixed window rate limiting

The simplest strategy: count requests in fixed time windows (per minute, per hour). When the count exceeds the limit, return 429 Too Many Requests until the window resets.

Pros: Simple to implement.

Cons: Susceptible to bursts at window boundaries. With a limit of 100/minute, a user can make 100 requests at 12:00:59 and another 100 at 12:01:00, which is 200 requests in about two seconds, double the intended rate.

Best for: Simple APIs with relaxed requirements.
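As a rough illustration of the fixed window approach, here is a minimal in-memory sketch. The class name `FixedWindowLimiter`, its parameters, and the injectable `clock` are assumptions made for this example; a production limiter would normally keep its counters in a shared store and expire old windows.

```python
import time
from collections import defaultdict


class FixedWindowLimiter:
    """Counts requests per client in fixed, clock-aligned windows (illustrative only)."""

    def __init__(self, limit, window_seconds, clock=time.time):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the behavior can be tested deterministically
        # (client_id, window_index) -> number of requests seen in that window
        self.counts = defaultdict(int)

    def allow(self, client_id):
        # Every timestamp inside the same window maps to the same index.
        window_index = int(self.clock() // self.window_seconds)
        key = (client_id, window_index)
        if self.counts[key] >= self.limit:
            return False  # the caller would respond with 429 Too Many Requests
        self.counts[key] += 1
        return True
```

Because the counter is keyed by window index, the count effectively resets the moment a new window starts, which is exactly what makes the boundary burst possible. This sketch never deletes old window entries, so it is not suitable for long-running processes as written.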

Sliding window rate limiting

Sliding window smooths the burst problem by combining two counters: the previous window's request count, weighted by the fraction of that window still covered by the sliding interval, plus the current window's count.

If the limit is 100/minute, and at 12:01:30 (halfway through the window) the user has 40 requests in the current window and had 80 in the previous window, the effective count is: 80 * 0.5 + 40 = 80. They have 20 requests remaining.

Pros: Smooths out bursts at window boundaries and distributes requests more evenly.

Cons: Slightly more complex to implement, and the weighted count is an approximation of the true request rate.

Best for: Production SaaS APIs that need fair rate distribution.
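The weighted calculation above can be sketched as follows. This is one possible in-memory version, not a reference implementation: `SlidingWindowLimiter`, `effective_count`, and the injectable `clock` are names invented for this example.

```python
import time


class SlidingWindowLimiter:
    """Sliding window counter: weighted previous count plus current count (illustrative only)."""

    def __init__(self, limit, window_seconds, clock=time.time):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock
        # client_id -> (window_index, current_count, previous_count)
        self.state = {}

    def _roll(self, client_id, now):
        """Return the client's counters, shifted forward if the window has changed."""
        index = int(now // self.window_seconds)
        stored_index, current, previous = self.state.get(client_id, (index, 0, 0))
        if index == stored_index + 1:
            previous, current = current, 0  # the old current window becomes the previous one
        elif index > stored_index + 1:
            previous, current = 0, 0  # idle for a full window or more: nothing carries over
        return index, current, previous

    def _weighted(self, now, current, previous):
        # Fraction of the current window that has already elapsed.
        elapsed = (now % self.window_seconds) / self.window_seconds
        return previous * (1 - elapsed) + current

    def effective_count(self, client_id):
        now = self.clock()
        _, current, previous = self._roll(client_id, now)
        return self._weighted(now, current, previous)

    def allow(self, client_id):
        now = self.clock()
        index, current, previous = self._roll(client_id, now)
        if self._weighted(now, current, previous) >= self.limit:
            self.state[client_id] = (index, current, previous)
            return False
        self.state[client_id] = (index, current + 1, previous)
        return True
```

With a limit of 100 per 60 seconds, 80 requests in the previous window and 40 in the current one, `effective_count` halfway through the window evaluates to 80 * 0.5 + 40 = 80, so 20 more requests are accepted before `allow` starts returning `False`.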

Token bucket algorithm

Token bucket assigns each user a bucket of tokens that refills at a constant rate. Each request consumes one token. When the bucket is empty, requests are rejected.

This allows controlled bursts: a user who hasn't made requests for a while has a full bucket and can burst. A user making steady requests consumes tokens as fast as they refill.

Pros: Allows controlled bursts, intuitive for users.

Cons: Requires maintaining per-user state (the current token count and the time of the last refill).

Best for: APIs where occasional bursts are acceptable (webhooks, batch operations).
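A token bucket can be sketched in a few lines. The version below refills lazily, computing how many tokens have accrued since the last call instead of running a background timer. `TokenBucket`, `capacity`, `refill_per_second`, and `cost` are illustrative names, and the sketch assumes one bucket object per user held in memory.

```python
import time


class TokenBucket:
    """A single user's bucket, refilled lazily on each call (illustrative only)."""

    def __init__(self, capacity, refill_per_second, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.clock = clock
        self.tokens = float(capacity)  # a new bucket starts full
        self.last_refill = clock()

    def allow(self, cost=1):
        now = self.clock()
        elapsed = now - self.last_refill
        # Add the tokens earned since the last call, never exceeding capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.last_refill = now
        if self.tokens < cost:
            return False  # bucket is empty: reject the request
        self.tokens -= cost
        return True
```

In this sketch, `capacity` bounds the size of a burst and `refill_per_second` sets the sustained rate. An application would typically keep one bucket per user, for example in a dictionary keyed by user ID, or in a shared store when running more than one server.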

Per-tenant rate limiting with ShipStack

For SaaS applications, rate limits should vary by customer tier. ShipStack implements per-tenant rate limiting out of the box:

- Hobby tier: 5,000 API calls/month
- Launch tier: 100,000 API calls/month
- Scale tier: 1,000,000 API calls/month

Limits are enforced at the tenant level, so one customer's traffic spike doesn't affect another. When a tenant exceeds their limit, they receive a 429 response with headers indicating when the limit resets.
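The exact headers ShipStack returns are not listed here, so the following is only a generic sketch of what a 429 response with reset information could look like. `Retry-After` is a standard HTTP header; the `X-RateLimit-*` names are a widely used convention rather than a standard, and the helper `too_many_requests` is hypothetical.

```python
import math


def too_many_requests(limit, reset_epoch_seconds, now_epoch_seconds):
    """Build the status code and headers for a rate-limited response (illustrative only)."""
    # Whole seconds until the limit resets, rounded up and never negative.
    retry_after = max(0, math.ceil(reset_epoch_seconds - now_epoch_seconds))
    headers = {
        "Retry-After": str(retry_after),  # standard HTTP header, value in seconds
        # Conventional but non-standard names; assumed here, not taken from ShipStack.
        "X-RateLimit-Limit": str(limit),
        "X-RateLimit-Remaining": "0",
        "X-RateLimit-Reset": str(int(reset_epoch_seconds)),
    }
    return 429, headers
```

A client that receives this response can wait for the number of seconds given in `Retry-After` before retrying instead of polling.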

For your own SaaS built on ShipStack, you can implement additional application-level rate limiting by tracking requests per user in your database and checking counts before processing. ShipStack handles the infrastructure-level limits; you handle the business logic limits.
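One way to track requests per user in a database and check the count before processing is sketched below, using Python's built-in `sqlite3` module as a stand-in for your own database. The table name `api_usage`, its columns, and the functions `open_usage_db` and `try_consume` are assumptions made for this example and are not part of ShipStack.

```python
import sqlite3


def open_usage_db(path=":memory:"):
    """Open the database and make sure the (hypothetical) usage table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS api_usage ("
        " user_id TEXT NOT NULL,"
        " period TEXT NOT NULL,"  # for example '2025-01' for a monthly limit
        " request_count INTEGER NOT NULL DEFAULT 0,"
        " PRIMARY KEY (user_id, period))"
    )
    return conn


def try_consume(conn, user_id, period, limit):
    """Count one request for the user; return False if the limit is already reached."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "INSERT OR IGNORE INTO api_usage (user_id, period, request_count)"
            " VALUES (?, ?, 0)",
            (user_id, period),
        )
        # Check and increment in a single statement so the count never passes the limit.
        cursor = conn.execute(
            "UPDATE api_usage SET request_count = request_count + 1"
            " WHERE user_id = ? AND period = ? AND request_count < ?",
            (user_id, period, limit),
        )
        return cursor.rowcount == 1
```

A request handler would call `try_consume` first and return a 429 response when it comes back `False`. Doing the check and the increment in one `UPDATE` avoids the gap between reading the count and writing it back.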

