Building a Log Management System with Elastic Stack
Elastic Stack Components
The Elastic Stack (commonly referred to as ELK) consists of three core components:
- Elasticsearch: Distributed search and analytics engine for storing and querying log data
- Logstash: Data processing pipeline for collecting, parsing, and transforming logs
- Kibana: Visualization platform for creating dashboards and analyzing log data
These open-source tools work together to provide comprehensive log management capabilities.
Log Management Challenges
Modern infrastructure generates diverse log types including system logs, application logs, and security logs. These logs provide critical insights into system performance, configuration issues, and security events. However, managing logs across distributed systems presents several challenges:
- Logs are typically scattered across multiple servers
- Manual log inspection becomes impractical at scale
- Real-time analysis requires centralized collection and processing
Core Elasticsearch Concepts
Elasticsearch organizes data using several key concepts:
- Index: Similar to a database table, an index contains documents of similar structure
- Shard: Horizontal partitions that distribute data across cluster nodes for scalability
- Replica: Copy of a shard that provides high availability and improved query performance
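These concepts map directly onto index settings. As a minimal sketch (the index name, shard count, and replica count here are illustrative, not prescriptive), you can create an index with explicit shards and replicas via the REST API:

```shell
# Hypothetical index with 3 primary shards, each with 1 replica
curl -X PUT "http://your-server-ip:9200/app-logs-example" \
  -H 'Content-Type: application/json' -d'
{
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 1
  }
}'
```

Note that on the single-node deployment shown below, replicas cannot be allocated (there is no second node to hold them), so the index health will show as yellow.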
Docker Deployment Setup
Create Docker Network
docker network create elastic-net
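If you want to confirm the network exists before attaching containers to it, a quick check (assumes Docker is running on the host):

```shell
docker network inspect elastic-net --format '{{.Name}}: {{.Driver}}'
```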
Deploy Elasticsearch
# Create directories with proper permissions
mkdir -p /opt/elastic/{config,data,plugins}
chmod -R 755 /opt/elastic/
# Run Elasticsearch container
docker run -d \
--name elastic-node \
--restart=always \
-e "ES_JAVA_OPTS=-Xms512m -Xmx512m" \
-e "discovery.type=single-node" \
-v /opt/elastic/data:/usr/share/elasticsearch/data \
-v /opt/elastic/plugins:/usr/share/elasticsearch/plugins \
--privileged \
--network elastic-net \
-p 9200:9200 \
-p 9300:9300 \
elasticsearch:7.12.1
Configuration parameters:
- ES_JAVA_OPTS: JVM heap size configuration
- discovery.type=single-node: Single-node cluster mode
- Volume mounts for data persistence
- Network configuration for inter-container communication
Verify deployment by accessing: http://your-server-ip:9200
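From the command line, the cluster health endpoint gives a quicker sanity check than loading the root URL in a browser (this assumes the container above is running and port 9200 is reachable):

```shell
curl "http://your-server-ip:9200/_cluster/health?pretty"
```

A single-node cluster typically reports a green or yellow status; yellow simply means replica shards cannot be assigned because there is only one node.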
Deploy Kibana
docker run -d \
--name kibana-ui \
--restart=always \
-e ELASTICSEARCH_HOSTS=http://elastic-node:9200 \
-e "I18N_LOCALE=zh-CN" \
--network=elastic-net \
-p 5601:5601 \
kibana:7.12.1
Access the Kibana interface at: http://your-server-ip:5601
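Kibana can take a minute or two to start after the container launches. As a rough readiness check (assuming the container above is running), you can poll its status API instead of refreshing the browser:

```shell
curl -s "http://your-server-ip:5601/api/status"
```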
Deploy Logstash
docker run -d \
--name log-processor \
--restart=always \
--network=elastic-net \
-p 5044:5044 \
-p 9600:9600 \
logstash:7.12.1
Configure Logstash Pipeline
Access the container and configure the processing pipeline:
docker exec -it log-processor /bin/bash
Edit the main configuration file:
# logstash.yml
http.host: "0.0.0.0"
xpack.monitoring.elasticsearch.hosts: [ "http://elastic-node:9200" ]
Create pipeline configuration:
# pipeline/logstash.conf
input {
file {
path => "/var/log/application/app.log"
codec => "json"
}
}
filter {
grok {
match => {
"message" => "%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:level} %{GREEDYDATA:log_message}"
}
}
}
output {
elasticsearch {
hosts => ["http://elastic-node:9200"]
index => "app-logs-%{+YYYY.MM.dd}"
}
}
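The grok filter above splits each log line into three fields: timestamp, level, and log_message. A rough shell approximation of that extraction, using a sample line in the expected format (grok's TIMESTAMP_ISO8601 and LOGLEVEL matchers are more permissive than this sed sketch):

```shell
# Sample line and a simplified stand-in for the grok pattern
echo '2021-06-01T12:00:00.000Z ERROR database connection refused' |
  sed -E 's/^([0-9T:.-]+Z?) ([A-Z]+) (.*)$/timestamp=\1 level=\2 message=\3/'
# → timestamp=2021-06-01T12:00:00.000Z level=ERROR message=database connection refused
```

Lines that do not match the pattern are tagged by Logstash with _grokparsefailure rather than dropped, which is useful when tuning the expression.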
Restart the Logstash service:
docker restart log-processor
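Once Logstash has processed some log lines, daily indices matching the app-logs-* pattern should appear in Elasticsearch. A quick way to confirm (assuming the cluster is reachable as configured above):

```shell
curl "http://your-server-ip:9200/_cat/indices/app-logs-*?v"
```

If no indices appear, check that the log file path in the pipeline configuration actually exists inside the Logstash container; the file input reads paths from the container's filesystem, not the host's.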
Architecture Considerations
Basic ELK deployments may encounter several limitations:
- Logstash resource consumption can impact system performance
- Single Elasticsearch nodes may become bottlenecks
- Storage requirements grow with log volume
- Mixed master/data nodes can affect query performance
For production environments, consider implementing:
- Dedicated master nodes for cluster coordination
- Multiple data nodes for horizontal scaling
- Log shipping alternatives like Filebeat for resource efficiency
- Index lifecycle management for storage optimization
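As a sketch of the last point, an index lifecycle management (ILM) policy can cap index growth and expire old data automatically. The policy name and thresholds below are illustrative, not recommendations:

```shell
# Illustrative ILM policy: roll over active indices, delete data after 30 days
curl -X PUT "http://your-server-ip:9200/_ilm/policy/app-logs-policy" \
  -H 'Content-Type: application/json' -d'
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": { "max_age": "7d", "max_size": "10gb" }
        }
      },
      "delete": {
        "min_age": "30d",
        "actions": { "delete": {} }
      }
    }
  }
}'
```

To take effect, the policy must also be attached to the indices via an index template, and rollover requires a write alias; both are covered in the Elasticsearch ILM documentation.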