Rate limiting

The RateLimit plugin allows you to limit the number of requests a client can make within a certain time period. Ktor provides different means for configuring rate limiting, for example:

- You can enable rate limiting globally for a whole application or configure different rate limits for different resources.
- You can configure rate limiting based on specific request parameters: an IP address, an API key or access token, and so on.

Add dependencies

To use RateLimit, you need to include the ktor-server-rate-limit artifact in the build script:

Gradle (Kotlin):

    implementation("io.ktor:ktor-server-rate-limit:$ktor_version")

Gradle (Groovy):

    implementation "io.ktor:ktor-server-rate-limit:$ktor_version"

Maven:

    <dependency>
        <groupId>io.ktor</groupId>
        <artifactId>ktor-server-rate-limit-jvm</artifactId>
        <version>${ktor_version}</version>
    </dependency>

Install RateLimit

To install the RateLimit plugin in the application, pass it to the install function in the specified module. The code snippets below show how to install RateLimit:

- inside the embeddedServer function call;
- inside the explicitly defined module, which is an extension function of the Application class.

    import io.ktor.server.engine.*
    import io.ktor.server.netty.*
    import io.ktor.server.application.*
    import io.ktor.server.plugins.ratelimit.*

    fun main() {
        embeddedServer(Netty, port = 8080) {
            install(RateLimit)
            // ...
        }.start(wait = true)
    }

    import io.ktor.server.application.*
    import io.ktor.server.plugins.ratelimit.*
    // ...
    fun Application.module() {
        install(RateLimit)
        // ...
    }

Configure RateLimit

Overview

Ktor uses the token bucket algorithm for rate limiting, which works as follows:

In the beginning, we have a bucket defined by its capacity - the number of tokens.
Each incoming request tries to consume one token from the bucket:
- If there is enough capacity, the server handles the request and sends a response with the following headers:
  - X-RateLimit-Limit: the specified bucket capacity.
  - X-RateLimit-Remaining: the number of tokens remaining in the bucket.
  - X-RateLimit-Reset: a UTC timestamp (in seconds) that specifies when the bucket is refilled.
- If there is insufficient capacity, the server rejects the request with a 429 Too Many Requests response and adds the Retry-After header, indicating how long the client should wait (in seconds) before making a follow-up request.
After the specified period of time, the bucket is refilled to its capacity.
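
The steps above can be modelled in a few lines of plain Kotlin. This is a simplified, single-threaded sketch for illustration only, not Ktor's implementation: the TokenBucket class, its parameters, and the explicit nowMillis clock argument are all hypothetical names, and the sketch assumes the whole bucket is refilled at once when the refill period has elapsed.

```kotlin
// A simplified model of a token bucket. NOT Ktor's implementation;
// every name here is hypothetical and exists only for illustration.
class TokenBucket(
    private val capacity: Int,
    private val refillPeriodMillis: Long,
    startMillis: Long = 0L
) {
    // Tokens currently left in the bucket (compare X-RateLimit-Remaining).
    var remaining: Int = capacity
        private set

    private var lastRefillMillis: Long = startMillis

    // Returns true if a request arriving at nowMillis may be handled.
    fun tryConsume(nowMillis: Long, tokens: Int = 1): Boolean {
        // Assumption: the whole bucket is refilled once the period has elapsed.
        if (nowMillis - lastRefillMillis >= refillPeriodMillis) {
            remaining = capacity
            lastRefillMillis = nowMillis
        }
        // Not enough tokens: a server would answer 429 Too Many Requests here.
        if (tokens > remaining) return false
        remaining -= tokens
        return true
    }

    // Whole seconds until the next refill, rounded up (compare Retry-After).
    fun retryAfterSeconds(nowMillis: Long): Long {
        val waitMillis = (lastRefillMillis + refillPeriodMillis - nowMillis).coerceAtLeast(0L)
        return (waitMillis + 999L) / 1000L
    }
}

fun main() {
    val bucket = TokenBucket(capacity = 2, refillPeriodMillis = 60_000L)
    println(bucket.tryConsume(nowMillis = 0L))       // true, 1 token left
    println(bucket.tryConsume(nowMillis = 1_000L))   // true, 0 tokens left
    println(bucket.tryConsume(nowMillis = 2_000L))   // false, bucket is empty
    println(bucket.retryAfterSeconds(2_000L))        // 58
    println(bucket.tryConsume(nowMillis = 60_000L))  // true, bucket was refilled
}
```

Passing the current time in explicitly keeps the sketch deterministic; a real rate limiter would read a clock and would need to be safe for concurrent requests.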

Register a rate limiter

Ktor allows you to apply rate limiting globally to a whole application or to specific routes:

To apply rate limiting to a whole application, call the global method and pass a configured rate limiter:

    install(RateLimit) {
        global {
            rateLimiter(limit = 5, refillPeriod = 60.seconds)
        }
    }

The register method registers a rate limiter that can be applied to specific routes:

    install(RateLimit) {
        register {
            rateLimiter(limit = 5, refillPeriod = 60.seconds)
        }
    }

The code samples above demonstrate minimal configurations for the RateLimit plugin. A rate limiter registered using the register method takes effect only after you apply it to a specific route.

Configure rate limiting

In this section, we'll see how to configure rate limiting:

(Optional) The register method allows you to specify a rate limiter name that can be used to apply rate limiting rules to specific routes:

    install(RateLimit) {
        register(RateLimitName("protected")) {
            // ...
        }
    }

The rateLimiter method creates a rate limiter with two parameters: limit defines the bucket capacity, while refillPeriod specifies a refill period for this bucket. A rate limiter in the example below allows handling 30 requests per minute:

    register(RateLimitName("protected")) {
        rateLimiter(limit = 30, refillPeriod = 60.seconds)
    }

(Optional) requestKey allows you to specify a function that returns a key for a request. Requests with different keys have independent rate limits. In the example below, the login query parameter is a key used to distinguish different users:

    register(RateLimitName("protected")) {
        requestKey { applicationCall ->
            // !! throws a NullPointerException if the login parameter is missing
            applicationCall.request.queryParameters["login"]!!
        }
    }

Note that keys should have good equals and hashCode implementations.
(Optional) requestWeight sets a function that returns how many tokens are consumed by a request. In the example below, a request key is used to configure a request weight:

    register(RateLimitName("protected")) {
        requestKey { applicationCall ->
            applicationCall.request.queryParameters["login"]!!
        }
        requestWeight { applicationCall, key ->
            when(key) {
                "jetbrains" -> 1
                else -> 2
            }
        }
    }

(Optional) modifyResponse allows you to override the default X-RateLimit-* headers sent with each response:

    register(RateLimitName("protected")) {
        modifyResponse { applicationCall, state ->
            applicationCall.response.header("X-RateLimit-Custom-Header", "Some value")
        }
    }
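
The note above about equals and hashCode can be illustrated without Ktor. The sketch below is only a model: RawKey, UserKey, and countBuckets are hypothetical names, and the HashMap stands in for whatever per-key storage a rate limiter keeps. With identity-based equality, two requests from the same user produce two separate entries, so each request would effectively get a fresh bucket and the limit would never apply.

```kotlin
// Hypothetical key types, used only for illustration.
class RawKey(val login: String)         // inherits identity-based equals/hashCode
data class UserKey(val login: String)   // value-based equals/hashCode generated by Kotlin

// Counts how many distinct "buckets" a hash-based store would create for the keys.
fun <K> countBuckets(keys: List<K>): Int {
    val requestsPerKey = HashMap<K, Int>()
    for (key in keys) {
        requestsPerKey[key] = (requestsPerKey[key] ?: 0) + 1
    }
    return requestsPerKey.size
}

fun main() {
    // Two requests from the same user, each building a new key object.
    println(countBuckets(listOf(RawKey("alice"), RawKey("alice"))))    // 2
    println(countBuckets(listOf(UserKey("alice"), UserKey("alice"))))  // 1
}
```

Strings, numbers, and data classes built from them already compare by value, which is why a String key such as the login parameter works as expected.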

Define rate limiting scope

After configuring a rate limiter, you can apply its rules to specific routes using the rateLimit method:

    routing {
        rateLimit {
            get("/") {
                val requestsLeft = call.response.headers["X-RateLimit-Remaining"]
                call.respondText("Welcome to the home page! $requestsLeft requests left.")
            }
        }
    }

This method can also accept a rate limiter name:

    routing {
        rateLimit(RateLimitName("protected")) {
            get("/protected-api") {
                val requestsLeft = call.response.headers["X-RateLimit-Remaining"]
                val login = call.request.queryParameters["login"]
                call.respondText("Welcome to protected API, $login! $requestsLeft requests left.")
            }
        }
    }
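
As a further sketch, a rate limiter could be keyed by an API key and fall back to the client address when no key is sent, which avoids the non-null assertion on a missing parameter. The limiter name "by-client", the X-Api-Key header, the /reports route, and the rateLimitByClient function name are assumptions made for illustration and do not appear elsewhere in this topic.

```kotlin
import io.ktor.server.application.*
import io.ktor.server.plugins.*
import io.ktor.server.plugins.ratelimit.*
import io.ktor.server.response.*
import io.ktor.server.routing.*
import kotlin.time.Duration.Companion.seconds

// Hypothetical module: registers one named rate limiter and applies it to one route.
fun Application.rateLimitByClient() {
    install(RateLimit) {
        register(RateLimitName("by-client")) {
            rateLimiter(limit = 30, refillPeriod = 60.seconds)
            requestKey { call ->
                // Assumed header name; falls back to the remote host when it is absent.
                call.request.headers["X-Api-Key"] ?: call.request.origin.remoteHost
            }
        }
    }
    routing {
        rateLimit(RateLimitName("by-client")) {
            get("/reports") {
                call.respondText("OK")
            }
        }
    }
}
```

Behind a reverse proxy, the remote host may be the proxy's address, so this fallback is only a reasonable default when the server sees client addresses directly.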

Example

The code sample below demonstrates how to use the RateLimit plugin to apply different rate limiters to different resources. The StatusPages plugin is used to handle rejected requests, for which the 429 Too Many Requests response was sent.

package com.example

import io.ktor.http.*
import io.ktor.server.application.*
import io.ktor.server.plugins.ratelimit.*
import io.ktor.server.plugins.statuspages.*
import io.ktor.server.response.*
import io.ktor.server.routing.*
import kotlin.time.Duration.Companion.seconds

fun main(args: Array<String>): Unit = io.ktor.server.netty.EngineMain.main(args)

fun Application.module() {
    install(RateLimit) {
        register {
            rateLimiter(limit = 5, refillPeriod = 60.seconds)
        }
        register(RateLimitName("public")) {
            rateLimiter(limit = 10, refillPeriod = 60.seconds)
        }
        register(RateLimitName("protected")) {
            rateLimiter(limit = 30, refillPeriod = 60.seconds)
            requestKey { applicationCall ->
                applicationCall.request.queryParameters["login"]!!
            }
            requestWeight { applicationCall, key ->
                when(key) {
                    "jetbrains" -> 1
                    else -> 2
                }
            }
        }
    }
    install(StatusPages) {
        status(HttpStatusCode.TooManyRequests) { call, status ->
            val retryAfter = call.response.headers["Retry-After"]
            call.respondText(text = "429: Too many requests. Wait for $retryAfter seconds.", status = status)
        }
    }
    routing {
        rateLimit {
            get("/") {
                val requestsLeft = call.response.headers["X-RateLimit-Remaining"]
                call.respondText("Welcome to the home page! $requestsLeft requests left.")
            }
        }
        rateLimit(RateLimitName("public")) {
            get("/public-api") {
                val requestsLeft = call.response.headers["X-RateLimit-Remaining"]
                call.respondText("Welcome to public API! $requestsLeft requests left.")
            }
        }
        rateLimit(RateLimitName("protected")) {
            get("/protected-api") {
                val requestsLeft = call.response.headers["X-RateLimit-Remaining"]
                val login = call.request.queryParameters["login"]
                call.respondText("Welcome to protected API, $login! $requestsLeft requests left.")
            }
        }
    }
}

You can find the full example here: rate-limit.
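
On the client side, the 429 Too Many Requests status and the Retry-After header described earlier can drive a simple back-off decision. The helper below is a minimal sketch in plain Kotlin, not part of Ktor: retryDelaySeconds, DEFAULT_DELAY_SECONDS, and the choice of a one-second fallback are assumptions, and the response is represented as a status code plus a header map so that the example does not depend on any HTTP client.

```kotlin
// Hypothetical fallback used when Retry-After is missing or not a number.
const val DEFAULT_DELAY_SECONDS = 1L

// Returns how long to wait before retrying, or null if the request was not rate limited.
fun retryDelaySeconds(status: Int, headers: Map<String, String>): Long? {
    if (status != 429) return null
    // Header names are case-insensitive, so look the header up ignoring case.
    val value = headers.entries
        .firstOrNull { it.key.equals("Retry-After", ignoreCase = true) }
        ?.value
    return value?.trim()?.toLongOrNull()?.coerceAtLeast(0L) ?: DEFAULT_DELAY_SECONDS
}

fun main() {
    println(retryDelaySeconds(200, emptyMap()))                    // null
    println(retryDelaySeconds(429, mapOf("Retry-After" to "42")))  // 42
    println(retryDelaySeconds(429, emptyMap()))                    // 1
}
```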

Last modified: 02 April 2024