Building a Log Analysis System with Elasticsearch and Kibana
Log Analysis System Overview
The ELK stack consists of three core components:
- Elasticsearch: Handles log indexing, storage, and search capabilities
- Logstash: Manages log collection, parsing, and data transformation
- Kibana: Provides visualization and dashboard creation interface
These open-source tools, now maintained by Elastic, form a comprehensive solution for enterprise log management. Major organizations including Sina, Ctrip, Huawei, and Meituan utilize this technology stack.
Common use cases include:
- Centralized querying of distributed log data
- Infrastructure and application component monitoring
- System troubleshooting and diagnostics
- Security event analysis and reporting
Elasticsearch Implementation
Single Node Deployment
Minimum requirements: 2 CPU cores, 2GB RAM, 20GB storage
Initial setup process:
# Configure repository and hostname resolution
[root@node1 ~]# vim /etc/hosts
192.168.1.41 node1
# Install required packages
[root@node1 ~]# yum install -y java-1.8.0-openjdk elasticsearch
# Configure network access
[root@node1 ~]# vim /etc/elasticsearch/elasticsearch.yml
55: network.host: 0.0.0.0
# Enable service startup
[root@node1 ~]# systemctl enable --now elasticsearch
# Verify installation
[root@node1 ~]# curl http://192.168.1.41:9200/
Cluster Configuration
Five-node cluster setup with identical specifications:
- 192.168.1.41 node1
- 192.168.1.42 node2
- 192.168.1.43 node3
- 192.168.1.44 node4
- 192.168.1.45 node5
Cluster configuration steps:
# Update hostname mappings
[root@node1 ~]# vim /etc/hosts
192.168.1.41 node1
192.168.1.42 node2
192.168.1.43 node3
192.168.1.44 node4
192.168.1.45 node5
# Install components
[root@node1 ~]# yum install -y java-1.8.0-openjdk elasticsearch
# Configure cluster settings
[root@node1 ~]# vim /etc/elasticsearch/elasticsearch.yml
17: cluster.name: production-cluster
23: node.name: node1
55: network.host: 0.0.0.0
68: discovery.zen.ping.unicast.hosts: ["node1", "node2"]
[root@node1 ~]# systemctl enable --now elasticsearch
# Check cluster health
[root@node1 ~]# curl http://192.168.1.41:9200/_cluster/health?pretty
Cluster status indicators:
- green: all primary and replica shards are allocated
- yellow: all primary shards are allocated, but one or more replicas are not
- red: at least one primary shard is unallocated
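The health endpoint returns a JSON document whose status field carries these colors. A minimal sketch (assuming the standard response shape, with the sample body trimmed to the relevant fields) that flags a cluster needing attention:

```python
import json

def needs_attention(health_json: str) -> bool:
    """Return True when the cluster status is yellow or red."""
    status = json.loads(health_json)["status"]
    return status in ("yellow", "red")

# Sample response body, trimmed to the relevant fields.
sample = '{"cluster_name": "production-cluster", "status": "green"}'
print(needs_attention(sample))
```

A monitoring script can poll _cluster/health with this check and alert only on yellow or red.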
Management Plugin Installation
Head plugin provides cluster topology visualization and administrative functions.
Setup on separate web server (192.168.1.48):
# Install web server
[root@webserver ~]# yum install -y httpd
[root@webserver ~]# tar zxf head.tar.gz
[root@webserver ~]# mv elasticsearch-head /var/www/html/head
[root@webserver ~]# systemctl enable --now httpd
# Configure CORS access
[root@node1 ~]# vim /etc/elasticsearch/elasticsearch.yml
http.cors.enabled: true
http.cors.allow-origin: "*"
http.cors.allow-methods: OPTIONS, HEAD, GET, POST, PUT, DELETE
http.cors.allow-headers: X-Requested-With,X-Auth-Token,Content-Type,Content-Length
[root@node1 ~]# systemctl restart elasticsearch
Elasticsearch API Operations
Cluster Information Queries
The _cat API family retrieves cluster metadata:
# Identify master node
[root@server ~]# curl -XGET http://node1:9200/_cat/master?v
Index Management
Creating new indices with custom settings:
# Create index with shard configuration
[root@server ~]# curl -XPUT -H "Content-Type: application/json" 'http://node1:9200/company' -d '{
"settings":{
"index":{
"number_of_shards": 5,
"number_of_replicas": 1
}
}
}'
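With 5 primary shards and 1 replica, Elasticsearch allocates primaries × (1 + replicas) shard copies across the cluster, so the index above occupies 10 shard copies in total. A quick sanity check:

```python
def total_shard_copies(primaries: int, replicas: int) -> int:
    """Each primary shard is duplicated once per configured replica."""
    return primaries * (1 + replicas)

# The 'company' index above: 5 primaries, 1 replica.
print(total_shard_copies(5, 1))  # → 10
```

This arithmetic matters when sizing a cluster: a five-node cluster can spread these 10 shard copies evenly, while a single node can never allocate the replicas (leaving the index yellow).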
Document Operations
Adding structured data:
# Insert document
[root@server ~]# curl -XPUT -H "Content-Type: application/json" 'http://node1:9200/company/staff/1' -d '{
"role": "developer",
"name": "Alice Smith",
"department": "engineering",
"hire_date": "2023-01-15"
}'
Retrieving stored information:
# Fetch specific document
[root@server ~]# curl -XGET http://node1:9200/company/staff/1?pretty
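A get-by-id response wraps the stored fields under _source and reports whether the document was found. A sketch extracting the stored document (response shape as in Elasticsearch's document GET API; the sample body is illustrative):

```python
import json

def get_source(response_body: str) -> dict:
    """Pull the stored document out of a get-by-id response."""
    doc = json.loads(response_body)
    if not doc.get("found"):
        raise KeyError(doc.get("_id", "unknown"))
    return doc["_source"]

# Illustrative response, trimmed to the relevant fields.
sample = '{"_id": "1", "found": true, "_source": {"name": "Alice Smith", "role": "developer"}}'
print(get_source(sample)["name"])  # → Alice Smith
```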
Updating existing records:
# Modify document fields
[root@server ~]# curl -XPOST -H "Content-Type: application/json" http://node1:9200/company/staff/1/_update -d '{
"doc": {
"department": "product development"
}
}'
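The _update call with a "doc" body merges the listed fields into the existing document rather than replacing it. The effect on top-level fields can be sketched as a shallow merge (the real API additionally merges nested objects recursively):

```python
def apply_partial_update(existing: dict, doc: dict) -> dict:
    """Mimic the 'doc' merge: listed fields are overwritten, all others kept."""
    merged = dict(existing)
    merged.update(doc)
    return merged

staff = {"role": "developer", "name": "Alice Smith", "department": "engineering"}
updated = apply_partial_update(staff, {"department": "product development"})
print(updated["department"])  # → product development
print(updated["name"])        # → Alice Smith (untouched)
```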
Removing data entries:
# Delete individual documents
[root@server ~]# curl -XDELETE -H "Content-Type: application/json" http://node1:9200/company/staff/1
# Remove entire indices
[root@server ~]# curl -XDELETE -H "Content-Type: application/json" http://node1:9200/company
Kibana Dashboard Platform
Kibana enables interactive data visualization through:
- Flexible analytical frameworks
- Real-time metric dashboards
- Customizable user interfaces
- Shareable embedded reports
Installation Process
System requirements: 1 CPU, 1GB memory, 10GB storage
Server IP: 192.168.1.46
Installation procedure:
# Hostname mapping
[root@dashboard ~]# vim /etc/hosts
192.168.1.41 node1
192.168.1.42 node2
192.168.1.43 node3
192.168.1.44 node4
192.168.1.45 node5
192.168.1.46 dashboard
[root@dashboard ~]# yum -y install kibana
Service configuration:
[root@dashboard ~]# vim /etc/kibana/kibana.yml
02: server.port: 5601
07: server.host: "0.0.0.0"
28: elasticsearch.hosts: ["http://node2:9200", "http://node3:9200"]
37: kibana.index: ".kibana"
40: kibana.defaultAppId: "home"
113: i18n.locale: "zh-CN"
[root@dashboard ~]# systemctl enable --now kibana
Access the interface in a browser at http://192.168.1.46:5601
Data Visualization Setup
Requirements for importing datasets:
- JSON format specification
- Content-Type header set to application/json
- Use of the _bulk endpoint
- Binary data transmission method (--data-binary)
Data import workflow:
# Transfer compressed dataset
[root@local ~]# scp /var/ftp/localrepo/elk/*.gz root@192.168.1.46:/root/
# Extract and load data
[root@dashboard ~]# gzip -d logs.jsonl.gz
[root@dashboard ~]# curl -XPOST -H "Content-Type: application/json" http://node1:9200/_bulk --data-binary @logs.jsonl
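The _bulk endpoint expects newline-delimited JSON: an action metadata line followed by the document itself, with a terminating newline at the end of the body. A minimal sketch building such a payload (index name and documents are illustrative; older Elasticsearch releases also require a _type in the action line):

```python
import json

def build_bulk_body(index: str, docs: list) -> str:
    """Serialize documents into the newline-delimited _bulk format."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index}}))  # action line
        lines.append(json.dumps(doc))                           # document line
    return "\n".join(lines) + "\n"  # _bulk requires a trailing newline

body = build_bulk_body("logs", [{"msg": "started"}, {"msg": "stopped"}])
print(body)
```

Sending this body with Content-Type: application/json and --data-binary (as in the curl command above) preserves the newlines that the endpoint depends on.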
Note: Default time range filtering may require adjustment for historical data visibility.