Preventing Elasticsearch Out of Memory with Gateway Rate Limiting
Understanding Elasticsearch Resource Constraints
Elasticsearch differs from traditional relational databases in several ways. Relational databases typically expose a "maximum connections" setting that caps system load and prevents resource exhaustion. Elasticsearch has no directly equivalent parameter, so a surge in traffic can overwhelm the cluster and cause out-of-memory (OOM) errors.
To address this challenge, we can implement rate limiting strategies using a gateway solution. This approach helps maintain Elasticsearch service availability by controlling both connection volume and request processing rates.
Gateway Architecture Overview
All Elasticsearch requests are routed through the gateway, which can be deployed in a clustered configuration for high availability and scalability.
Connection Limiting Strategy
The gateway implements connection throttling to prevent overwhelming Elasticsearch nodes. This is configured through the gateway's settings file:
api:
  - name: elastic_entry_point
    active: true
    routing: elastic_router
    max_connections: 8000
    network:
      binding: $[[env.GATEWAY_PORT]]
      # See documentation for port reuse configuration
      reuse_port: true
When the connection limit is reached, additional requests are rejected. This prevents the Elasticsearch cluster from being overloaded by too many simultaneous connections.
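The reject-when-full behavior described above can be illustrated with a small counter guarded by a lock. This is a minimal sketch of the general technique, not the gateway's actual implementation; the class name `ConnectionLimiter` and its methods are hypothetical, and the limit is set to 2 only to keep the demonstration short.

```python
import threading


class ConnectionLimiter:
    """Hypothetical helper: admit connections until `max_connections` are open."""

    def __init__(self, max_connections: int) -> None:
        self.max_connections = max_connections
        self._active = 0
        self._lock = threading.Lock()

    def try_acquire(self) -> bool:
        """Count a new connection, or return False if the limit is reached."""
        with self._lock:
            if self._active >= self.max_connections:
                return False  # caller should reject the request
            self._active += 1
            return True

    def release(self) -> None:
        """Free a slot when a connection closes."""
        with self._lock:
            if self._active > 0:
                self._active -= 1


# Demonstration with a tiny limit (the article's config uses 8000).
limiter = ConnectionLimiter(max_connections=2)
results = [limiter.try_acquire() for _ in range(3)]  # third attempt is rejected
limiter.release()                                    # one connection closes
accepted_after_release = limiter.try_acquire()       # a slot is free again
```

Rejecting immediately, rather than queueing, keeps excess load out of memory entirely, which matches the goal of protecting Elasticsearch from OOM conditions.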
Indexing Rate Control
Without rate limiting, indexing performance can be unstable, with throughput fluctuating between 90,000 and 150,000 documents per second. To stabilize this, we can throttle bulk indexing at the gateway.
The following configuration limits bulk indexing on primary-index to 10,000 documents per second and 32 MB per second:
environment: # Add under environment section
  INDEX_THROTTLE_MAX_BYTES: 33554432 # 32MB/s
  INDEX_THROTTLE_MAX_DOCS: 10000 # 10k docs/s
  INDEX_THROTTLE_ACTION: retry # Options: retry, drop
  INDEX_THROTTLE_RETRIES: 15 # Maximum retry attempts
  INDEX_THROTTLE_DELAY_MS: 50 # Retry delay in milliseconds
routing: # Modify routing section
  - name: elastic_router
    default_flow: data_flow
    tracing_flow: audit_flow
    rules:
      - method:
          - "*"
        pattern:
          - "/_bulk"
          - "/{index_name}/_bulk"
        flow:
          - indexing_flow
flow: # Add to flow section
  - name: indexing_flow
    filter:
      - flow:
          flows:
            - bulk_index_limiter
      - elasticsearch:
          cluster: production
          max_node_connections: 800
  - name: bulk_index_limiter
    filter:
      - bulk_throttle:
          indices:
            "primary-index":
              max_bytes: $[[env.INDEX_THROTTLE_MAX_BYTES]]
              max_docs: $[[env.INDEX_THROTTLE_MAX_DOCS]]
              action: $[[env.INDEX_THROTTLE_ACTION]]
              retry_delay_ms: $[[env.INDEX_THROTTLE_DELAY_MS]]
              max_retries: $[[env.INDEX_THROTTLE_RETRIES]]
              alert_message: "Indexing rate exceeded threshold" # Custom alert message
              log_warnings: true
With this configuration, the primary-index's indexing rate is consistently capped at 10,000 documents per second.
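To make the interaction between the document limit, the `retry`/`drop` action, the retry count, and the retry delay more concrete, here is a minimal sketch. It assumes a fixed one-second counting window, which is a simplification chosen for illustration; the article does not state which algorithm the gateway uses. The names `DocsPerSecondThrottle` and `submit_bulk` are hypothetical.

```python
import time
from typing import Callable


class DocsPerSecondThrottle:
    """Hypothetical fixed one-second window admitting at most `max_docs` documents."""

    def __init__(self, max_docs: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_docs = max_docs
        self._clock = clock
        self._window_start = clock()
        self._docs_in_window = 0

    def try_admit(self, doc_count: int) -> bool:
        now = self._clock()
        if now - self._window_start >= 1.0:
            # A new window begins: reset the counter.
            self._window_start = now
            self._docs_in_window = 0
        if self._docs_in_window + doc_count > self.max_docs:
            return False
        self._docs_in_window += doc_count
        return True


def submit_bulk(throttle: DocsPerSecondThrottle, doc_count: int,
                action: str = "retry", max_retries: int = 15,
                retry_delay_ms: int = 50,
                sleep: Callable[[float], None] = time.sleep) -> str:
    """Return "accepted" or "rejected" for one bulk request of `doc_count` docs."""
    if throttle.try_admit(doc_count):
        return "accepted"
    if action == "drop":
        return "rejected"  # drop: fail immediately without waiting
    for _ in range(max_retries):
        sleep(retry_delay_ms / 1000.0)  # wait, then check the window again
        if throttle.try_admit(doc_count):
            return "accepted"
    return "rejected"  # retries exhausted


throttle = DocsPerSecondThrottle(max_docs=10_000)
first = submit_bulk(throttle, 6_000)                  # fits in the current window
second = submit_bulk(throttle, 6_000, action="drop")  # would exceed 10,000 this second
```

In this simplified model, 15 retries spaced 50 ms apart cover at most 750 ms, so a request that arrives just after the window fills can still exhaust its retries before the window resets. A single bulk request larger than `max_docs` is never admitted by this sketch.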
Multi-Index Rate Control
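To apply different rate limits to multiple indices, add one entry per index under indices. For example, the following drops writes to secondary-index above 20,000 documents per second while keeping the retry-based limit on primary-index: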
- name: bulk_index_limiter
  filter:
    - bulk_throttle:
        indices:
          "secondary-index":
            max_docs: 20000
            action: drop
            alert_message: "Secondary index write threshold exceeded"
            log_warnings: true
          "primary-index":
            max_bytes: $[[env.INDEX_THROTTLE_MAX_BYTES]]
            max_docs: $[[env.INDEX_THROTTLE_MAX_DOCS]]
            action: $[[env.INDEX_THROTTLE_ACTION]]
            retry_delay_ms: $[[env.INDEX_THROTTLE_DELAY_MS]]
            max_retries: $[[env.INDEX_THROTTLE_RETRIES]]
            alert_message: "Primary index write rate too high"
            log_warnings: true
Search Request Rate Limiting
Search operations can also be rate-limited to prevent resource exhaustion. Without controls, search traffic might reach 70,000 queries per second.
The following configuration limits search requests to 10,000 queries per second:
- name: data_flow
  filter:
    - query_limiter:
        notification: "Search rate limit reached"
        rules:
          - pattern: "/(?P<index_name>main-index)/_search"
            max_qps: 10000
            group_by: index_name
    - elasticsearch:
        cluster: production
        max_node_connections: 800
Multiple Index Search Rate Control
Different search rate limits can be applied to different indices:
- name: data_flow
  filter:
    - query_limiter:
        notification: "Search rate limit reached"
        rules:
          - pattern: "/(?P<index_name>main-index)/_search"
            max_qps: 10000
            group_by: index_name
          - pattern: "/(?P<index_name>auxiliary-index)/_search"
            max_qps: 20000
            group_by: index_name
    - elasticsearch:
        cluster: production
        max_node_connections: 800
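The rules above pair a regular expression containing a named group with a per-group QPS ceiling. The sketch below shows one way such rules could be evaluated: the request path is matched against each pattern, the value captured by the `group_by` group becomes the counter key, and each key gets its own one-second window. The `QueryLimiter` class is hypothetical, the fixed-window counting and start-of-path matching are assumptions, and the limits are shrunk to 3 and 5 QPS so the behavior is visible in a few calls.

```python
import re
import time
from typing import Callable, Dict, List, Tuple


class QueryLimiter:
    """Hypothetical per-group QPS limiter driven by (pattern, max_qps, group_by) rules."""

    def __init__(self, rules: List[Tuple[str, int, str]],
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._rules = [(re.compile(p), qps, group) for p, qps, group in rules]
        self._clock = clock
        # key -> [window_start, requests_in_window]
        self._windows: Dict[Tuple[str, str], List[float]] = {}

    def allow(self, path: str) -> bool:
        for regex, max_qps, group_by in self._rules:
            match = regex.match(path)
            if match is None:
                continue
            # Requests are counted per captured value, e.g. per index name.
            key = (group_by, match.group(group_by))
            now = self._clock()
            window = self._windows.get(key)
            if window is None or now - window[0] >= 1.0:
                window = [now, 0]
                self._windows[key] = window
            if window[1] >= max_qps:
                return False  # over the limit for this group
            window[1] += 1
            return True
        return True  # no rule matched: the request is not limited


limiter = QueryLimiter([
    (r"/(?P<index_name>main-index)/_search", 3, "index_name"),
    (r"/(?P<index_name>auxiliary-index)/_search", 5, "index_name"),
])
main_results = [limiter.allow("/main-index/_search") for _ in range(4)]
aux_result = limiter.allow("/auxiliary-index/_search")  # separate counter
```

Because the counter key comes from the captured group, exhausting the limit for main-index leaves the auxiliary-index budget untouched.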
Gateway Cluster Considerations
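When using multiple gateway instances, each instance applies its own limits, so the maximum total request rate reaching Elasticsearch equals the sum of the limits across all gateways. Size the per-instance limits with the number of instances in mind. Running several gateways also provides better scalability and fault tolerance.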
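Since the effective ceiling is the sum of the per-instance limits, a cluster-wide target has to be divided among the gateways. The arithmetic can be sketched as follows; the function names are hypothetical, and the even split assumes traffic is balanced equally across instances.

```python
from typing import Iterable


def aggregate_limit(per_instance_limits: Iterable[int]) -> int:
    """Maximum total rate Elasticsearch can receive from all gateways combined."""
    return sum(per_instance_limits)


def per_instance_limit(cluster_wide_limit: int, gateway_instances: int) -> int:
    """Limit to configure on each gateway so the total stays within the target."""
    if gateway_instances < 1:
        raise ValueError("at least one gateway instance is required")
    # Integer division rounds down, so the total never exceeds the target.
    return cluster_wide_limit // gateway_instances


# Three gateways, each configured with max_qps: 10000.
total_qps = aggregate_limit([10_000, 10_000, 10_000])  # 30000

# To keep the cluster-wide total at 10,000 QPS with three gateways:
qps_each = per_instance_limit(10_000, 3)  # 3333
```

If traffic is not evenly balanced, a busy gateway will reject requests while others still have headroom, so the achieved total can be lower than the configured sum.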