Hands-On Guide to Filebeat Outputs, Logstash Pipelines, and Filter Plugins
Multi-Line Aggregation with Filebeat
An alternative approach to merging log lines relies on a predefined line count. The configuration below directs Filebeat to combine every three consecutive lines into a single event.
# config/multiline-count-console.yaml
filebeat.inputs:
- type: log
  paths:
    - /tmp/oldboyedu-linux85/linux85.log
  multiline:
    type: count
    count_lines: 3
output.console:
  pretty: true
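To make the count-based behavior concrete, here is a minimal Python sketch of the grouping logic. It is an illustration only, not Filebeat's implementation; the function name aggregate_by_count and the sample lines are hypothetical.

```python
from typing import Iterable, Iterator, List


def aggregate_by_count(lines: Iterable[str], count_lines: int) -> Iterator[str]:
    """Join every `count_lines` consecutive lines into one event."""
    buffer: List[str] = []
    for line in lines:
        buffer.append(line.rstrip("\n"))
        if len(buffer) == count_lines:
            yield "\n".join(buffer)
            buffer = []
    # Assumption: Filebeat flushes an incomplete group after a timeout;
    # this sketch simply emits whatever is left at end of input.
    if buffer:
        yield "\n".join(buffer)


# Hypothetical input: seven lines grouped three at a time.
events = list(aggregate_by_count(["a", "b", "c", "d", "e", "f", "g"], 3))
print(events)
```

With seven input lines and a count of 3, the sketch yields two full events and one partial event holding the leftover line.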
Collecting Container Logs with Filebeat
Deploying Docker
Start by obtaining and extracting the Docker packages, then install them locally.
wget http://192.168.15.253/ElasticStack/day05-/softwares/oldboyedu-docker-ce-23_0_1.tar.gz
tar xf oldboyedu-docker-ce-23_0_1.tar.gz
yum -y localinstall oldboyedu-docker-ce-23_0_1/*.rpm
Configuring Registry Mirrors
Set registry mirrors to improve pull speed by editing the Docker daemon configuration (/etc/docker/daemon.json), then enable and start the service.
{
  "data-root": "/var/lib/docker",
  "registry-mirrors": [
    "https://tuv7rqqq.mirror.aliyuncs.com",
    "https://hub-mirror.c.163.com/",
    "https://docker.mirrors.ustc.edu.cn",
    "https://reg-mirror.qiniu.com"
  ]
}
systemctl enable --now docker
Launching Sample Containers
Two containers serve as log sources: an Nginx instance and a Tomcat instance.
docker run -dp 88:80 --name mynginx --restart always nginx:1.22.1-alpine
docker run -dp 89:8080 --name mytomcat --restart always tomcat:jre8-alpine
Input Types: docker vs. container
Collect logs directly from Docker containers using the dedicated input type, which can target all containers via a wildcard ID.
# config/docker-input-console.yaml
filebeat.inputs:
- type: docker
  containers.ids:
    - '*'
output.console:
  pretty: true
Alternatively, tap the underlying container log files on disk with the container input type, sending the records to Elasticsearch instead of stdout.
# config/container-input-es.yaml
filebeat.inputs:
- type: container
  paths:
    - '/var/lib/docker/containers/*/*.log'
output.elasticsearch:
  hosts:
    - "http://10.0.0.101:9200"
    - "http://10.0.0.102:9200"
    - "http://10.0.0.103:9200"
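The files matched by that path are written by Docker's default json-file logging driver, one JSON object per line with log, stream, and time keys. The Python sketch below shows roughly what unpacking such a line involves. The sample line and the function name parse_json_file_line are hypothetical, and the output field names are an approximation of what the container input produces.

```python
import json
from typing import Dict


def parse_json_file_line(raw: str) -> Dict[str, str]:
    """Unpack one line written by Docker's json-file logging driver."""
    record = json.loads(raw)
    return {
        # The driver keeps the trailing newline of the original output.
        "message": record["log"].rstrip("\n"),
        "stream": record["stream"],
        "time": record["time"],
    }


# Hypothetical sample line, for illustration only.
sample = '{"log":"GET / HTTP/1.1\\n","stream":"stdout","time":"2023-04-06T08:17:43.000000000Z"}'
parsed_line = parse_json_file_line(sample)
print(parsed_line)
```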
Exploring the filestream Input
Starting with Filebeat 7.16, the log input type is deprecated in favor of filestream, which adds built-in parsers for reading files and transforming their contents.
Basic and JSON Parsing
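The ndjson parser decodes JSON lines, optionally recording decoding errors and nesting the decoded fields under a custom target. In the configuration below, only the input marked enabled: true is active; the others are kept for reference.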
# config/filestream-mixed-demo.yaml
filebeat.inputs:
- type: filestream
  enabled: false
  paths:
    - /tmp/oldboyedu-linux85/linux85.log
- type: filestream
  enabled: false
  paths:
    - /tmp/oldboyedu-linux85/docker.json
  parsers:
    - ndjson:
        add_error_key: true
        overwrite_keys: true
        target: oldboyedu-linux85
- type: filestream
  enabled: false
  paths:
    - /tmp/oldboyedu-linux85/linux85.log
  parsers:
    - multiline:
        type: count
        count_lines: 3
- type: filestream
  enabled: true
  paths:
    - /tmp/oldboyedu-linux85/demo.log
  parsers:
    - multiline:
        type: count
        count_lines: 4
    - ndjson:
        add_error_key: true
        overwrite_keys: true
        target: oldboyedu-linux85-demo
output.console:
  pretty: true
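The effect of the target, overwrite_keys, and add_error_key options can be approximated in a few lines of Python. This is a simplified model rather than Filebeat's code: the function decode_ndjson, the sample payload, and the exact shape of the error field are assumptions made for illustration.

```python
import json
from typing import Any, Dict, Optional


def decode_ndjson(
    message: str,
    target: Optional[str] = None,
    overwrite_keys: bool = True,
    add_error_key: bool = True,
) -> Dict[str, Any]:
    """Decode one JSON line and merge it into an event dictionary."""
    event: Dict[str, Any] = {"message": message}
    try:
        decoded = json.loads(message)
    except json.JSONDecodeError as exc:
        if add_error_key:
            # Assumed shape of the error field; the real one may differ.
            event["error"] = {"type": "json", "message": str(exc)}
        return event
    if not isinstance(decoded, dict):
        return event  # only JSON objects are merged in this sketch
    if target:
        event[target] = decoded  # nest everything under the target key
    else:
        for key, value in decoded.items():
            if overwrite_keys or key not in event:
                event[key] = value
    return event


# Hypothetical payloads, for illustration only.
ok = decode_ndjson('{"name": "demo", "price": 3}', target="oldboyedu-linux85")
bad = decode_ndjson("not json", target="oldboyedu-linux85")
print(ok)
print(bad)
```

A line that decodes cleanly ends up nested under the target key, while a line that fails to decode keeps its raw message and gains an error entry instead.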
Multi-Line JSON Practical Example
Combine a count-based multiline aggregator with the ndjson parser before sending the results directly to Elasticsearch.
# config/filestream-es-lab.yaml
filebeat.inputs:
- type: filestream
  enabled: true
  paths:
    - /tmp/oldboyedu-linux85/shopping.json
  parsers:
    - multiline:
        type: count
        count_lines: 7
    - ndjson:
        add_error_key: true
        overwrite_keys: true
output.elasticsearch:
  hosts:
    - "http://10.0.0.101:9200"
    - "http://10.0.0.102:9200"
    - "http://10.0.0.103:9200"
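The order of the two parsers matters: a pretty-printed JSON object only becomes decodable once its lines have been joined. The Python sketch below demonstrates this with a hypothetical seven-line record; the field names are invented, since the real layout of shopping.json is not shown here.

```python
import json
from typing import List

# Hypothetical seven-line record, for illustration only.
lines: List[str] = [
    "{",
    '  "title": "example item",',
    '  "price": 9.9,',
    '  "brand": "example",',
    '  "item": "https://example.com/1",',
    '  "group": 1',
    "}",
]


def decodes(text: str) -> bool:
    """Return True when `text` is a complete JSON document."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


# Decoding line by line fails for every fragment...
per_line = [decodes(line) for line in lines]
# ...while the seven lines joined together form one valid object.
grouped = json.loads("\n".join(lines))
print(per_line)
print(grouped)
```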
Diverse Output Destinations
Local File Storage
Filebeat can persist events to the local filesystem instead of a remote service.
# config/stdin-to-file.yaml
filebeat.inputs:
- type: stdin
output.file:
  path: "/tmp/oldboyedu-linux85"
  filename: stdin.log
Indexing to Elasticsearch with Custom Settings
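The Elasticsearch output gives control over index naming, ILM, shard counts, and replicas. A custom index name requires setup.template.name and setup.template.pattern to be set, and ILM is disabled here so that the index setting is not overridden.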
# config/filestream-es-custom.yaml
filebeat.inputs:
- type: filestream
  enabled: true
  paths:
    - /tmp/oldboyedu-linux85/shopping.json
  parsers:
    - multiline:
        type: count
        count_lines: 7
    - ndjson:
        add_error_key: true
        overwrite_keys: true
output.elasticsearch:
  hosts:
    - "http://10.0.0.101:9200"
    - "http://10.0.0.102:9200"
    - "http://10.0.0.103:9200"
  index: "oldboyedu-linux85-shopping-%{+yyyy.MM.dd}"
setup.ilm.enabled: false
setup.template.name: "oldboyedu-linux85-shopping"
setup.template.pattern: "oldboyedu-linux85-shopping-*"
setup.template.overwrite: true
setup.template.settings:
  index.number_of_shards: 8
  index.number_of_replicas: 0
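The %{+yyyy.MM.dd} suffix expands to the event date, so each day gets its own index, and the template settings apply only to index names that match setup.template.pattern. The Python sketch below mimics both steps. It assumes the date is taken in UTC; the function daily_index and the sample timestamp are hypothetical.

```python
from datetime import datetime, timezone
from fnmatch import fnmatch


def daily_index(prefix: str, timestamp: datetime) -> str:
    """Mimic "<prefix>-%{+yyyy.MM.dd}" for an event timestamp (UTC assumed)."""
    day = timestamp.astimezone(timezone.utc).strftime("%Y.%m.%d")
    return f"{prefix}-{day}"


# Hypothetical event time, used only for illustration.
event_time = datetime(2023, 4, 6, 8, 17, 43, tzinfo=timezone.utc)
index_name = daily_index("oldboyedu-linux85-shopping", event_time)
print(index_name)

# The shard and replica settings reach this index only if the name
# matches the template pattern.
print(fnmatch(index_name, "oldboyedu-linux85-shopping-*"))
```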
Condition-Based Routing to Multiple Indices
Tag each input and use conditional indices to route data streams into separate Elasticsearch indices.
# config/filestream-multi-index.yaml
filebeat.inputs:
- type: filestream
  enabled: true
  tags: ["docker"]
  paths:
    - /tmp/oldboyedu-linux85/docker.json
  parsers:
    - ndjson:
        add_error_key: true
- type: filestream
  enabled: true
  tags: ["linux85"]
  paths:
    - /tmp/oldboyedu-linux85/linux85.log
  parsers:
    - multiline:
        type: count
        count_lines: 3
- type: filestream
  enabled: true
  tags: ["demo"]
  paths:
    - /tmp/oldboyedu-linux85/demo.log
  parsers:
    - multiline:
        type: count
        count_lines: 4
    - ndjson:
        add_error_key: true
        overwrite_keys: true
        target: oldboyedu-linux85-demo
output.elasticsearch:
  hosts:
    - "http://10.0.0.101:9200"
    - "http://10.0.0.102:9200"
    - "http://10.0.0.103:9200"
  indices:
    - index: "oldboyedu-jiaoshi07-docker-%{+yyyy.MM.dd}"
      when.contains:
        tags: "docker"
    - index: "oldboyedu-jiaoshi07-linux85-%{+yyyy.MM.dd}"
      when.contains:
        tags: "linux85"
    - index: "oldboyedu-jiaoshi07-demo-%{+yyyy.MM.dd}"
      when.contains:
        tags: "demo"
setup.ilm.enabled: false
setup.template.name: "oldboyedu-jiaoshi07"
setup.template.pattern: "oldboyedu-jiaoshi07-*"
setup.template.overwrite: true
setup.template.settings:
  index.number_of_shards: 3
  index.number_of_replicas: 0
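The routing rules are evaluated in order, and the first rule whose condition matches decides the index. A minimal Python sketch of that selection follows. The function pick_index and the fallback argument are hypothetical, and the date suffix is left out to keep the example short.

```python
from typing import List, Optional, Sequence, Tuple

# (tag, index prefix) pairs in the same order as the `indices` list above.
RULES: Sequence[Tuple[str, str]] = (
    ("docker", "oldboyedu-jiaoshi07-docker"),
    ("linux85", "oldboyedu-jiaoshi07-linux85"),
    ("demo", "oldboyedu-jiaoshi07-demo"),
)


def pick_index(tags: List[str], fallback: Optional[str] = None) -> Optional[str]:
    """Return the prefix of the first rule whose tag appears in `tags`."""
    for tag, prefix in RULES:
        if tag in tags:
            return prefix
    # No rule matched: fall back to whatever default index is configured.
    return fallback


print(pick_index(["demo"]))
print(pick_index(["linux85", "docker"]))  # rule order wins, not tag order
print(pick_index(["other"], fallback="default-index"))
```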
Logstash: Collection and Enrichment
Installing via RPM
Download and install the RPM package, then create a convenient symlink.
wget http://192.168.15.253/ElasticStack/day05-/softwares/logstash-7.17.5-x86_64.rpm
rpm -ivh logstash-7.17.5-x86_64.rpm
ln -svf /usr/share/logstash/bin/logstash /usr/local/sbin
Installing from Tarball
Alternatively, use a binary archive.
wget http://192.168.15.253/ElasticStack/day05-/softwares/logstash-7.17.5-linux-x86_64.tar.gz
tar xf logstash-7.17.5-linux-x86_64.tar.gz -C /oldboyedu/softwares/
ln -svf /oldboyedu/softwares/logstash-7.17.5/bin/logstash /usr/local/sbin/
Quick Command-Line Pipelines
Test a simple stdin-to-stdout pipeline directly from the shell.
logstash -e "input { stdin { } } output { stdout { codec => rubydebug } }"
First Configuration File
Write a basic pipeline definition and run it with the -f flag.
# config/stdin-stdout.conf
input {
  stdin { }
}
output {
  stdout { }
}
logstash -f config/stdin-stdout.conf
Integrating Filebeat and Logstash
Serve a Beats input on a custom port inside Logstash, then instruct Filebeat to forward events there.
Logstash configuration:
# config/beats-in.conf
input {
  beats {
    port => 8888
  }
}
output {
  stdout { }
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "oldboyedu-linux85-logstash"
  }
}
logstash -rf config/beats-in.conf
Filebeat configuration:
# config/nginx-to-logstash.yaml
filebeat.inputs:
- type: log
  paths:
    - /var/log/nginx/access.log*
output.logstash:
  hosts: ["10.0.0.101:8888"]
filebeat -e -c config/nginx-to-logstash.yaml
Enrichment with Filters
geoip IP Geolocation
Use a pre-parsed client IP field to append latitude, longitude, and country data while pruning noise fields.
# config/beats-geoip.conf
input {
  beats {
    port => 8888
  }
}
filter {
  geoip {
    source => "clientip"
    remove_field => [ "agent", "log", "input", "host", "ecs", "tags" ]
  }
}
output {
  stdout { }
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "oldboyedu-linux85-logstash"
  }
}
Adjust the Filebeat input to decode each line as JSON and place the keys at the root of the event, so that clientip is available to geoip:
# config/nginx-json-to-logstash.yaml
filebeat.inputs:
- type: log
  paths:
    - /var/log/nginx/access.log
  json.keys_under_root: true
  json.add_error_key: true
output.logstash:
  hosts: ["10.0.0.101:8888"]
Sample log entries:
{"@timestamp":"2023-04-06T16:17:43+08:00","host":"10.0.0.103","clientip":"110.110.110.110","status":"200"}
Grok for Native Nginx Logs
When the access log uses the standard combined format, apply the grok filter with the HTTPD_COMBINEDLOG pattern, then run geoip on the extracted clientip field.
# config/beats-grok-geoip.conf
input {
  beats {
    port => 8888
  }
}
filter {
  grok {
    match => { "message" => "%{HTTPD_COMBINEDLOG}" }
    remove_field => [ "agent", "log", "input", "host", "ecs", "tags" ]
  }
  geoip {
    source => "clientip"
  }
}
output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "oldboyedu-linux85-logstash-nginx"
  }
}
# Filebeat side: read the raw access log and forward it without JSON decoding
filebeat.inputs:
- type: log
  paths:
    - /tmp/oldboyedu-linux85/access.log
output.logstash:
  hosts: ["10.0.0.101:8888"]
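To get a feel for what the grok step extracts, the Python sketch below approximates the combined log format with a hand-written regular expression. It is a simplified stand-in, not the actual HTTPD_COMBINEDLOG definition: the group names follow the legacy grok field names, quotes around referrer and agent are dropped, and the sample line is hypothetical.

```python
import re

# Simplified approximation of the combined log format (assumption).
COMBINED = re.compile(
    r'(?P<clientip>\S+) (?P<ident>\S+) (?P<auth>\S+) '
    r'\[(?P<timestamp>[^\]]+)\] '
    r'"(?P<verb>\S+) (?P<request>\S+) HTTP/(?P<httpversion>[\d.]+)" '
    r'(?P<response>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Hypothetical access log line, for illustration only.
line = (
    '110.110.110.110 - - [22/Nov/2015:11:57:34 +0800] '
    '"GET /index.html HTTP/1.1" 200 612 "-" "curl/7.29.0"'
)

match = COMBINED.match(line)
fields = match.groupdict() if match else {}
print(fields)
```

The clientip group is what the geoip filter consumes, and the timestamp group is the kind of value a date filter would parse.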
Fixing Timestamps with the Date Filter
When the log contains a human-readable timestamp such as 22/Nov/2015:11:57:34 +0800, use the date filter to parse it and store the result in a custom field. Omitting target makes the filter overwrite @timestamp instead.
# config/beats-date-override.conf
input {
  beats {
    port => 8888
  }
}
filter {
  grok {
    match => { "message" => "%{HTTPD_COMBINEDLOG}" }
    remove_field => [ "agent", "log", "input", "host", "ecs", "tags" ]
  }
  geoip {
    source => "clientip"
  }
  date {
    match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
    timezone => "Asia/Shanghai"
    target => "oldboyedu-linux85-date"
  }
}
output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "oldboyedu-linux85-logstash-nginx-date"
  }
}
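The pattern dd/MMM/yyyy:HH:mm:ss Z maps closely onto Python's strptime directives, which gives a quick way to check what a given timestamp parses to. The sketch below is only an analogy to the date filter, with a hypothetical helper named parse_access_time; it converts the result to UTC, which is how Logstash stores timestamps.

```python
from datetime import datetime, timezone


def parse_access_time(raw: str) -> datetime:
    """Parse a combined-log timestamp and normalize it to UTC."""
    # %d/%b/%Y:%H:%M:%S %z corresponds to dd/MMM/yyyy:HH:mm:ss Z.
    local = datetime.strptime(raw, "%d/%b/%Y:%H:%M:%S %z")
    return local.astimezone(timezone.utc)


parsed = parse_access_time("22/Nov/2015:11:57:34 +0800")
print(parsed.isoformat())  # 2015-11-22T03:57:34+00:00
```

Because the sample value carries its own +0800 offset, the offset in the string determines the conversion in this sketch.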