= Request Rate Limiters
// Licensed to the Apache Software Foundation (ASF) under one
// or more contributor license agreements. See the NOTICE file
// distributed with this work for additional information
// regarding copyright ownership. The ASF licenses this file
// to you under the Apache License, Version 2.0 (the
// "License"); you may not use this file except in compliance
// with the License. You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing,
// software distributed under the License is distributed on an
// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
// KIND, either express or implied. See the License for the
// specific language governing permissions and limitations
// under the License.
Solr supports rate limiting per request type. Each request type can be allocated a maximum number of concurrent requests
that can be active at once. The default rate limiting is implemented for updates and searches.
Once a request type has used up its quota, further incoming requests of that type are rejected with HTTP status code 429 (Too Many Requests).
Note that rate limiting works at the instance (JVM) level, not at the core or collection level. Take this into account when planning capacity.
Finer-grained control is planned as future work (https://issues.apache.org/jira/browse/SOLR-14710[SOLR-14710]).
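
Because rejected requests surface as HTTP 429, a client usually needs some retry policy. The following is a minimal sketch of exponential backoff, not part of Solr or SolrJ: `send_with_backoff`, its parameters, and the `send` callable (which performs one request and returns the HTTP status code) are all hypothetical names chosen for illustration.

```python
import time
from typing import Callable


def send_with_backoff(
    send: Callable[[], int],
    max_attempts: int = 5,
    base_delay_s: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call send() until it returns a status other than 429 or attempts run out.

    `send` is a hypothetical callable that issues one request and returns
    the HTTP status code. The delay doubles after every rejected attempt.
    """
    status = send()
    for attempt in range(1, max_attempts):
        if status != 429:
            break  # accepted, or failed for a reason retrying will not fix
        sleep(base_delay_s * 2 ** (attempt - 1))
        status = send()
    return status
```

The `sleep` parameter exists only so that the delay can be replaced in tests. Whether retrying is appropriate at all depends on the workload; dropping or deferring the request may be the better choice when the node is saturated.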
== When To Use Rate Limiters
Use rate limiters when you want to allocate a guaranteed share of the request threadpool to a specific
request type. Indexing and search requests mostly compete with each other for CPU resources, and the contention
becomes especially pronounced when production workloads are under high stress. The current implementation provides
a query rate limiter, which can free up resources for indexing.
== Rate Limiter Configurations
The default rate limiter is the search rate limiter. It can be configured with the following command:
[source,bash]
----
curl -X POST -H 'Content-type:application/json' -d '{
  "set-ratelimiter": {
    "enabled": true,
    "guaranteedSlots": 5,
    "allowedRequests": 20,
    "slotBorrowingEnabled": true,
    "slotAcquisitionTimeoutInMS": 70
  }
}' http://localhost:8983/api/cluster
----
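
The same request can be issued from a script. The sketch below uses only the Python standard library and mirrors the payload and URL of the command above; the function names `build_ratelimiter_payload` and `post_ratelimiter_config` are hypothetical helpers, not part of any Solr client.

```python
import json
import urllib.request


def build_ratelimiter_payload(
    enabled: bool = True,
    guaranteed_slots: int = 5,
    allowed_requests: int = 20,
    slot_borrowing_enabled: bool = True,
    slot_acquisition_timeout_ms: int = 70,
) -> dict:
    """Build the JSON body of the set-ratelimiter command shown above."""
    return {
        "set-ratelimiter": {
            "enabled": enabled,
            "guaranteedSlots": guaranteed_slots,
            "allowedRequests": allowed_requests,
            "slotBorrowingEnabled": slot_borrowing_enabled,
            "slotAcquisitionTimeoutInMS": slot_acquisition_timeout_ms,
        }
    }


def post_ratelimiter_config(
    payload: dict, url: str = "http://localhost:8983/api/cluster"
) -> int:
    """POST the payload to the cluster API and return the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

Calling `post_ratelimiter_config(build_ratelimiter_payload())` against a running node should be equivalent to the `curl` command, assuming the node listens on the default address used there.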
=== Enable Query Rate Limiter
Controls whether the query rate limiter is enabled. The default value is `false`.
[source,json]
----
"enabled": true
----
=== Maximum Number Of Concurrent Requests
Sets the maximum number of search requests that can be active at any given time. The default value is the number of cores * 3.
[source,json]
----
"allowedRequests": 20
----
=== Request Slot Allocation Wait Time
The time in milliseconds that a request waits for a slot to become available when all slots are in use,
before the request is put into the wait queue. This gives requests a chance to proceed if
the lack of free slots for this rate limiter is only transient. The default value
is -1, meaning no wait; 0 has the same effect. Note that higher wait times
can lead to longer queue times and, in turn, to higher overall latency for queries.
[source,json]
----
"slotAcquisitionTimeoutInMS": 70
----
=== Slot Borrowing Enabled
Whether slot borrowing (described below) is enabled. The default value is `false`.
NOTE: This feature is experimental and can cause slots to be blocked if the
borrowing request is long-lived.
[source,json]
----
"slotBorrowingEnabled": true
----
=== Guaranteed Slots
The number of slots that the query rate limiter reserves for query requests irrespective
of the query load. This setting is used only if slot borrowing is enabled, and it acts
as the threshold beyond which the query rate limiter will not allow other request types to
borrow slots from its quota. The default value is the allowed number of concurrent requests / 2.
NOTE: This feature is experimental and can cause slots to be blocked if the
borrowing request is long-lived.
[source,json]
----
"guaranteedSlots": 5
----
== Salient Points
These are some of the things to keep in mind when using rate limiters.
=== Over Subscribing
It is possible to define a quota for a request type that exceeds the size
of the available threadpool. Solr does not enforce rules on the size of a quota that
can be defined for a request type. This is intentional, to give users full
control over their quota allocation. However, if the quota exceeds the size of the available
threadpool, the standard queuing policies of the threadpool will kick in.
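
Since Solr does not validate quota sizes, a configuration can be checked before it is applied. The sketch below fills in the defaults stated in the configuration section (number of cores * 3, and allowed requests / 2) and flags a quota larger than the threadpool. The function names and the `threadpool_size` parameter are hypothetical, and rounding the division down is an assumption, since the text does not say how odd values are handled.

```python
from typing import Optional


def resolve_quota(
    cores: int,
    allowed_requests: Optional[int] = None,
    guaranteed_slots: Optional[int] = None,
) -> dict:
    """Fill unset values with the documented defaults."""
    if allowed_requests is None:
        allowed_requests = cores * 3  # documented default
    if guaranteed_slots is None:
        # Documented default is allowedRequests / 2; flooring is an assumption.
        guaranteed_slots = allowed_requests // 2
    return {"allowedRequests": allowed_requests, "guaranteedSlots": guaranteed_slots}


def is_oversubscribed(quota: dict, threadpool_size: int) -> bool:
    """True if the quota admits more concurrent requests than the pool has threads."""
    return quota["allowedRequests"] > threadpool_size
```

For example, `resolve_quota(8)` yields 24 allowed requests and 12 guaranteed slots; with a threadpool of 16 threads that quota would be oversubscribed, so the threadpool's own queuing would apply before the rate limiter ever rejects a request.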
=== Slot Borrowing
If one quota has no backlog while others do, the busier quotas can
"borrow" slots from the less busy one. This is currently done on a round-robin basis; a
pending task would move it to a priority-based model (https://issues.apache.org/jira/browse/SOLR-14709[SOLR-14709]).
NOTE: This feature is experimental and gives no guarantee that borrowed slots will be
returned in time.
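
The interaction between guaranteed slots and borrowing can be illustrated with a toy model. This is a simplified sketch, not Solr's implementation: the `Quota` class, the `acquire_slot` function, and the `QUERY`/`UPDATE` labels are hypothetical, and the model assumes that a request type which has run out of its own slots takes a free, non-guaranteed slot from another type's quota. It visits the other quotas in insertion order, which is a simplification of round robin.

```python
import threading
from typing import Dict, Optional


class Quota:
    """Hypothetical quota split into guaranteed and lendable slots."""

    def __init__(self, allowed: int, guaranteed: int, borrowing_enabled: bool = True):
        self.guaranteed = threading.Semaphore(guaranteed)
        self.lendable = threading.Semaphore(allowed - guaranteed)
        self.borrowing_enabled = borrowing_enabled

    def acquire_own(self) -> Optional[threading.Semaphore]:
        """Take one of this quota's own slots without waiting, if any is free."""
        for pool in (self.guaranteed, self.lendable):
            if pool.acquire(blocking=False):
                return pool
        return None


def acquire_slot(quotas: Dict[str, Quota], request_type: str) -> Optional[threading.Semaphore]:
    """Return the pool a slot was taken from, or None if the request is rejected.

    The caller must call release() on the returned pool when the request ends.
    """
    own = quotas[request_type].acquire_own()
    if own is not None:
        return own
    # Own quota exhausted: try to borrow a lendable slot from another type.
    for name, other in quotas.items():
        if name != request_type and other.borrowing_enabled:
            if other.lendable.acquire(blocking=False):
                return other.lendable
    return None
```

With two quotas of two slots each, one of them guaranteed, a third concurrent `UPDATE` request borrows the lendable `QUERY` slot. Until that request finishes, `QUERY` is limited to its single guaranteed slot, which is the blocking effect the note above warns about.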