Hands-On Guide to Filebeat Outputs, Logstash Pipelines, and Filter Plugins

Tech May 19 18

Multi-Line Aggregation with Filebeat

An alternative approach to merging log lines relies on a predefined line count. The configuration below directs Filebeat to combine every three consecutive lines into a single event.

# config/multiline-count-console.yaml
filebeat.inputs:
- type: log
  paths:
    - /tmp/oldboyedu-linux85/linux85.log
  multiline:
    type: count
    count_lines: 3

output.console:
  pretty: true
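
To make the grouping rule concrete, here is a minimal Python sketch of count-based aggregation. It is a model of the idea, not Filebeat's implementation; the function name aggregate_by_count is hypothetical, and the handling of a trailing partial group is an assumption that loosely mirrors a timeout-based flush.

```python
from typing import Iterable, List


def aggregate_by_count(lines: Iterable[str], count_lines: int) -> List[str]:
    """Join every `count_lines` consecutive lines into one event message."""
    if count_lines < 1:
        raise ValueError("count_lines must be at least 1")
    events: List[str] = []
    buffer: List[str] = []
    for line in lines:
        buffer.append(line.rstrip("\n"))
        if len(buffer) == count_lines:
            events.append("\n".join(buffer))
            buffer = []
    if buffer:
        # Assumption: an incomplete trailing group is still emitted as its
        # own event instead of being dropped.
        events.append("\n".join(buffer))
    return events


if __name__ == "__main__":
    sample = [f"line {n}" for n in range(1, 8)]
    for event in aggregate_by_count(sample, 3):
        print(repr(event))
```

Because the grouping is purely positional, a single extra or missing line in the source file shifts every later event by one line.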

Collecting Container Logs with Filebeat

Deploying Docker

Start by obtaining and extracting the Docker packages, then install them locally.

wget http://192.168.15.253/ElasticStack/day05-/softwares/oldboyedu-docker-ce-23_0_1.tar.gz
tar xf oldboyedu-docker-ce-23_0_1.tar.gz
yum -y localinstall oldboyedu-docker-ce-23_0_1/*.rpm

Configuring Registry Mirrors

Set registry mirrors to speed up image pulls by editing the Docker daemon configuration (typically /etc/docker/daemon.json), then enable and start the service.

{
  "data-root": "/var/lib/docker",
  "registry-mirrors": [
    "https://tuv7rqqq.mirror.aliyuncs.com",
    "https://hub-mirror.c.163.com/",
    "https://docker.mirrors.ustc.edu.cn",
    "https://reg-mirror.qiniu.com"
  ]
}

systemctl enable --now docker
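
A malformed daemon configuration prevents Docker from starting, so it can be worth checking the file before enabling the service. The sketch below is one possible pre-flight check; check_daemon_config is a hypothetical helper, and the rules it applies (valid JSON, mirrors given as http or https URLs) are assumptions about what is worth verifying, not an official validator.

```python
import json
from urllib.parse import urlparse


def check_daemon_config(text: str) -> list:
    """Return a list of problems found in a daemon.json document (empty if none)."""
    try:
        config = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg} (line {exc.lineno})"]
    if not isinstance(config, dict):
        return ["top-level value must be a JSON object"]
    mirrors = config.get("registry-mirrors", [])
    if not isinstance(mirrors, list):
        return ["registry-mirrors must be a list"]
    problems = []
    for mirror in mirrors:
        parsed = urlparse(mirror) if isinstance(mirror, str) else None
        if parsed is None or parsed.scheme not in ("http", "https") or not parsed.netloc:
            problems.append(f"suspicious mirror entry: {mirror!r}")
    return problems


if __name__ == "__main__":
    sample = '{"registry-mirrors": ["https://docker.mirrors.ustc.edu.cn"]}'
    print(check_daemon_config(sample) or "no problems found")
```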

Launching Sample Containers

Two containers serve as log sources: an Nginx instance and a Tomcat instance.

docker run -dp 88:80 --name mynginx --restart always nginx:1.22.1-alpine
docker run -dp 89:8080 --name mytomcat --restart always  tomcat:jre8-alpine

Input Types: docker vs. container

Collect logs directly from Docker containers with the docker input type, which can target all containers through a wildcard ID.

# config/docker-input-console.yaml
filebeat.inputs:
- type: docker
  containers.ids:
    - '*'

output.console:
  pretty: true

Alternatively, tap the underlying container log files on disk with the container input type, sending the records to Elasticsearch instead of stdout.

# config/container-input-es.yaml
filebeat.inputs:
- type: container
  paths:
    - '/var/lib/docker/containers/*/*.log'

output.elasticsearch:
  hosts:
    - "http://10.0.0.101:9200"
    - "http://10.0.0.102:9200"
    - "http://10.0.0.103:9200"
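
The files matched by that path are written by Docker's json-file logging driver, one JSON object per line with log, stream, and time keys. The Python sketch below shows what a single line contains; it assumes the default json-file driver is in use, the sample line is invented, and parse_json_file_line is a hypothetical helper rather than anything Filebeat exposes.

```python
import json


def parse_json_file_line(line: str) -> dict:
    """Extract message, stream, and time from one json-file log line."""
    record = json.loads(line)
    return {
        # Docker stores the trailing newline of each log line; strip it.
        "message": record["log"].rstrip("\n"),
        "stream": record["stream"],
        "time": record["time"],
    }


if __name__ == "__main__":
    # Invented sample line in the json-file layout.
    sample = r'{"log":"hello from nginx\n","stream":"stdout","time":"2023-04-06T08:17:43.123456789Z"}'
    print(parse_json_file_line(sample))
```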

Exploring the filestream Input

From Filebeat 7.16 onwards, the log input type is deprecated in favor of filestream, which adds built-in parsers for reading files and transforming their contents.

Basic and JSON Parsing

The ndjson parser can decode JSON streams, optionally capturing errors and nesting decoded fields under a custom target.

# config/filestream-mixed-demo.yaml
filebeat.inputs:
- type: filestream
  enabled: false
  paths:
    - /tmp/oldboyedu-linux85/linux85.log

- type: filestream
  enabled: false
  paths:
    - /tmp/oldboyedu-linux85/docker.json
  parsers:
    - ndjson:
        add_error_key: true
        overwrite_keys: true
        target: oldboyedu-linux85

- type: filestream
  enabled: false
  paths:
    - /tmp/oldboyedu-linux85/linux85.log
  parsers:
    - multiline:
        type: count
        count_lines: 3

- type: filestream
  enabled: true
  paths:
    - /tmp/oldboyedu-linux85/demo.log
  parsers:
    - multiline:
        type: count
        count_lines: 4
    - ndjson:
        add_error_key: true
        overwrite_keys: true
        target: oldboyedu-linux85-demo

output.console:
  pretty: true
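
The effect of target and add_error_key can be modelled in a few lines of Python. This is a simplified sketch under stated assumptions: decode_ndjson is a hypothetical function, it ignores overwrite_keys and the metadata Filebeat attaches to every event, and the error layout (a type of json plus a message) follows the documented behaviour of add_error_key without reproducing Filebeat's exact message text.

```python
import json


def decode_ndjson(line: str, target: str = "", add_error_key: bool = True) -> dict:
    """Model how one line is decoded and where the decoded fields land."""
    try:
        decoded = json.loads(line)
        if not isinstance(decoded, dict):
            raise ValueError("line is not a JSON object")
    except ValueError as exc:  # json.JSONDecodeError is a ValueError subclass
        event = {"message": line}
        if add_error_key:
            event["error"] = {"type": "json", "message": str(exc)}
        return event
    # With a target, decoded fields are nested; otherwise they sit at the root.
    return {target: decoded} if target else dict(decoded)


if __name__ == "__main__":
    print(decode_ndjson('{"status": "200"}', target="oldboyedu-linux85"))
    print(decode_ndjson("plain text line", target="oldboyedu-linux85"))
```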

Multi-Line JSON Practical Example

Combine a count-based multiline aggregator with the ndjson parser before sending the results directly to Elasticsearch.

# config/filestream-es-lab.yaml
filebeat.inputs:
- type: filestream
  enabled: true
  paths:
    - /tmp/oldboyedu-linux85/shopping.json
  parsers:
    - multiline:
        type: count
        count_lines: 7
    - ndjson:
        add_error_key: true
        overwrite_keys: true

output.elasticsearch:
  hosts:
    - "http://10.0.0.101:9200"
    - "http://10.0.0.102:9200"
    - "http://10.0.0.103:9200"
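
The count of 7 only works if every record in the file is pretty-printed across exactly seven lines. The Python sketch below illustrates that dependency. The record layout and field names are hypothetical, since the contents of shopping.json are not shown here, and decode_multiline_records is a model of the two parser stages rather than Filebeat code.

```python
import json

# Hypothetical record: opening brace, five fields, closing brace = 7 lines.
PRETTY_RECORD = """{
  "title": "sample item",
  "price": 9.9,
  "brand": "example",
  "group": "demo",
  "item": "https://example.invalid/item/1"
}"""


def decode_multiline_records(text: str, count_lines: int) -> list:
    """Merge every `count_lines` lines, then decode each merged chunk as JSON."""
    lines = text.splitlines()
    records = []
    for start in range(0, len(lines), count_lines):
        chunk = "\n".join(lines[start:start + count_lines])
        # Raises json.JSONDecodeError if the chunk is not one complete object.
        records.append(json.loads(chunk))
    return records


if __name__ == "__main__":
    two_records = PRETTY_RECORD + "\n" + PRETTY_RECORD
    for record in decode_multiline_records(two_records, 7):
        print(record["title"], record["price"])
```

If one record spans a different number of lines, that chunk and every chunk after it fail to decode, which is where add_error_key becomes useful for spotting the misalignment.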

Diverse Output Destinations

Local File Storage

Filebeat can persist events to the filesystem instead of a remote service.

# config/stdin-to-file.yaml
filebeat.inputs:
- type: stdin

output.file:
  path: "/tmp/oldboyedu-linux85"
  filename: stdin.log

Indexing to Elasticsearch with Custom Settings

Output to Elasticsearch offers full control over index naming, ILM, shard counts, and replicas.

# config/filestream-es-custom.yaml
filebeat.inputs:
- type: filestream
  enabled: true
  paths:
    - /tmp/oldboyedu-linux85/shopping.json
  parsers:
    - multiline:
        type: count
        count_lines: 7
    - ndjson:
        add_error_key: true
        overwrite_keys: true

output.elasticsearch:
  hosts:
    - "http://10.0.0.101:9200"
    - "http://10.0.0.102:9200"
    - "http://10.0.0.103:9200"
  index: "oldboyedu-linux85-shopping-%{+yyyy.MM.dd}"

setup.ilm.enabled: false
setup.template.name: "oldboyedu-linux85-shopping"
setup.template.pattern: "oldboyedu-linux85-shopping-*"
setup.template.overwrite: true
setup.template.settings:
  index.number_of_shards: 8
  index.number_of_replicas: 0
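
The index name and the template pattern have to agree, otherwise the shard and replica settings never apply to the indices that get created. The Python sketch below models how the date suffix expands and checks the result against the pattern. It assumes the suffix is derived from the event timestamp in UTC, and daily_index is a hypothetical helper, not part of Filebeat.

```python
from datetime import datetime, timedelta, timezone
from fnmatch import fnmatch

INDEX_PREFIX = "oldboyedu-linux85-shopping"
TEMPLATE_PATTERN = "oldboyedu-linux85-shopping-*"


def daily_index(timestamp: datetime) -> str:
    """Expand the yyyy.MM.dd suffix for a timezone-aware event timestamp."""
    if timestamp.tzinfo is None:
        raise ValueError("timestamp must be timezone-aware")
    utc = timestamp.astimezone(timezone.utc)
    return f"{INDEX_PREFIX}-{utc:%Y.%m.%d}"


if __name__ == "__main__":
    shanghai = timezone(timedelta(hours=8))
    # 02:00 on 7 April in UTC+8 is still 6 April in UTC.
    event_time = datetime(2023, 4, 7, 2, 0, 0, tzinfo=shanghai)
    name = daily_index(event_time)
    print(name, fnmatch(name, TEMPLATE_PATTERN))
```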

Condition-Based Routing to Multiple Indices

Tag each input and use conditional indices to route data streams into separate Elasticsearch indices.

# config/filestream-multi-index.yaml
filebeat.inputs:
- type: filestream
  enabled: true
  tags: ["docker"]
  paths:
    - /tmp/oldboyedu-linux85/docker.json
  parsers:
    - ndjson:
        add_error_key: true

- type: filestream
  enabled: true
  tags: ["linux85"]
  paths:
    - /tmp/oldboyedu-linux85/linux85.log
  parsers:
    - multiline:
        type: count
        count_lines: 3

- type: filestream
  enabled: true
  tags: ["demo"]
  paths:
    - /tmp/oldboyedu-linux85/demo.log
  parsers:
    - multiline:
        type: count
        count_lines: 4
    - ndjson:
        add_error_key: true
        overwrite_keys: true
        target: oldboyedu-linux85-demo

output.elasticsearch:
  hosts:
    - "http://10.0.0.101:9200"
    - "http://10.0.0.102:9200"
    - "http://10.0.0.103:9200"
  indices:
    - index: "oldboyedu-jiaoshi07-docker-%{+yyyy.MM.dd}"
      when.contains:
        tags: "docker"
    - index: "oldboyedu-jiaoshi07-linux85-%{+yyyy.MM.dd}"
      when.contains:
        tags: "linux85"
    - index: "oldboyedu-jiaoshi07-demo-%{+yyyy.MM.dd}"
      when.contains:
        tags: "demo"

setup.ilm.enabled: false
setup.template.name: "oldboyedu-jiaoshi07"
setup.template.pattern: "oldboyedu-jiaoshi07-*"
setup.template.overwrite: true
setup.template.settings:
  index.number_of_shards: 3
  index.number_of_replicas: 0
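
The indices rules are evaluated in order and the first match wins. A small Python model of that selection is sketched below. The date suffix is left out for brevity, pick_index_prefix is a hypothetical name, and DEFAULT_PREFIX is a stand-in for whatever the output's index setting resolves to when no rule matches.

```python
# Ordered (tag, index prefix) pairs mirroring the rules in the configuration.
RULES = [
    ("docker", "oldboyedu-jiaoshi07-docker"),
    ("linux85", "oldboyedu-jiaoshi07-linux85"),
    ("demo", "oldboyedu-jiaoshi07-demo"),
]

# Hypothetical placeholder for the fallback index used when nothing matches.
DEFAULT_PREFIX = "filebeat"


def pick_index_prefix(tags: list) -> str:
    """Return the prefix of the first rule whose tag is present on the event."""
    for tag, prefix in RULES:
        if tag in tags:
            return prefix
    return DEFAULT_PREFIX


if __name__ == "__main__":
    for tags in (["docker"], ["demo", "docker"], []):
        print(tags, "->", pick_index_prefix(tags))
```

An event carrying more than one of these tags goes to the index of the earliest matching rule, so rule order matters when tags overlap.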

Logstash: Collection and Enrichment

Installing via RPM

Download and install the RPM package, then create a convenient symlink.

wget http://192.168.15.253/ElasticStack/day05-/softwares/logstash-7.17.5-x86_64.rpm
rpm -ivh logstash-7.17.5-x86_64.rpm
ln -svf /usr/share/logstash/bin/logstash /usr/local/sbin

Installing from Tarball

Alternatively, use a binary archive.

wget http://192.168.15.253/ElasticStack/day05-/softwares/logstash-7.17.5-linux-x86_64.tar.gz
tar xf logstash-7.17.5-linux-x86_64.tar.gz -C /oldboyedu/softwares/
ln -svf /oldboyedu/softwares/logstash-7.17.5/bin/logstash /usr/local/sbin/

Quick Command-Line Pipelines

Test a simple stdin-to-stdout pipeline directly from the shell.

logstash -e "input { stdin { } } output { stdout { codec => rubydebug } }"

First Configuration File

Write a basic pipeline definition and run it with the -f flag.

# config/stdin-stdout.conf
input {
  stdin { }
}

output {
  stdout { }
}

logstash -f config/stdin-stdout.conf

Integrating Filebeat and Logstash

Open a Beats input on a custom port in Logstash, then instruct Filebeat to forward events there.

Logstash configuration:

# config/beats-in.conf
input {
  beats {
    port => 8888
  }
}

output {
  stdout { }
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "oldboyedu-linux85-logstash"
  }
}

logstash -rf config/beats-in.conf

Filebeat configuration:

# config/nginx-to-logstash.yaml
filebeat.inputs:
- type: log
  paths:
    - /var/log/nginx/access.log*

output.logstash:
  hosts: ["10.0.0.101:8888"]

filebeat -e -c config/nginx-to-logstash.yaml
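
If Filebeat reports connection errors, a quick check is whether the Beats port is reachable from the Filebeat host at all. The Python sketch below is one way to test that; port_is_open is a hypothetical helper, and a successful TCP connection only shows that something is listening, not that it is a Logstash Beats input.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Host and port taken from the output.logstash setting.
    print(port_is_open("10.0.0.101", 8888))
```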

Enrichment with Filters

geoip IP Geolocation

Use a pre-parsed client IP field to append latitude, longitude, and country data while pruning noise fields.

# config/beats-geoip.conf
input {
  beats {
    port => 8888
  }
}

filter {
  geoip {
    source => "clientip"
    remove_field => [ "agent", "log", "input", "host", "ecs", "tags" ]
  }
}

output {
  stdout { }
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "oldboyedu-linux85-logstash"
  }
}

Alter the Filebeat configuration to extract JSON keys at the root level so that clientip is available for geoip:

# config/nginx-json-to-logstash.yaml
filebeat.inputs:
- type: log
  paths:
    - /var/log/nginx/access.log
  json.keys_under_root: true
  json.add_error_key: true

output.logstash:
  hosts: ["10.0.0.101:8888"]

Sample log entry:

{"@timestamp":"2023-04-06T16:17:43+08:00","host":"10.0.0.103","clientip":"110.110.110.110","status":"200"}
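
The reason keys_under_root matters is that geoip looks for a top-level clientip field. The Python sketch below models only the placement of decoded keys; place_decoded_fields is a hypothetical function, and conflict handling with fields Filebeat adds itself is deliberately left out.

```python
import json

SAMPLE = (
    '{"@timestamp":"2023-04-06T16:17:43+08:00","host":"10.0.0.103",'
    '"clientip":"110.110.110.110","status":"200"}'
)


def place_decoded_fields(line: str, keys_under_root: bool) -> dict:
    """Put decoded keys at the event root, or nest them under a "json" key."""
    decoded = json.loads(line)
    if keys_under_root:
        return dict(decoded)
    return {"json": decoded}


if __name__ == "__main__":
    print("clientip" in place_decoded_fields(SAMPLE, keys_under_root=True))
    print("clientip" in place_decoded_fields(SAMPLE, keys_under_root=False))
```

Without keys_under_root, the address would sit at json.clientip, and the geoip source would have to be written as "[json][clientip]" instead.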

Grok for Native Nginx Logs

When the log uses the standard combined format, leverage the grok filter with HTTPD_COMBINEDLOG and then apply geoip to the extracted clientip.

# config/beats-grok-geoip.conf
input {
  beats {
    port => 8888
  }
}

filter {
  grok {
    match => { "message" => "%{HTTPD_COMBINEDLOG}" }
    remove_field => [ "agent", "log", "input", "host", "ecs", "tags" ]
  }

  geoip {
    source => "clientip"
  }
}

output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "oldboyedu-linux85-logstash-nginx"
  }
}

# Filebeat configuration: plain log input without JSON decoding
filebeat.inputs:
- type: log
  paths:
    - /tmp/oldboyedu-linux85/access.log

output.logstash:
  hosts: ["10.0.0.101:8888"]
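
To see roughly what grok extracts from a combined-format line, the Python sketch below uses a simplified regular expression. It is an approximation and not the real HTTPD_COMBINEDLOG pattern: it is stricter about the request line and drops the surrounding quotes from referrer and agent. The field names follow the legacy (non-ECS) grok names, and the sample line is invented.

```python
import re

# Simplified stand-in for the combined log format; not the grok pattern itself.
COMBINED = re.compile(
    r'(?P<clientip>\S+) (?P<ident>\S+) (?P<auth>\S+) '
    r'\[(?P<timestamp>[^\]]+)\] '
    r'"(?P<verb>\S+) (?P<request>\S+) HTTP/(?P<httpversion>[\d.]+)" '
    r'(?P<response>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_combined(line: str):
    """Return the named fields of a combined-format line, or None if no match."""
    match = COMBINED.match(line)
    return match.groupdict() if match else None


if __name__ == "__main__":
    sample = (
        '110.110.110.110 - - [22/Nov/2015:11:57:34 +0800] '
        '"GET /index.html HTTP/1.1" 200 612 "-" "curl/7.29.0"'
    )
    print(parse_combined(sample))
```

When the real grok filter fails to match, it tags the event with _grokparsefailure rather than returning nothing, and geoip then has no clientip to work with.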

Fixing Timestamps with the Date Filter

When the log contains a human-readable timestamp like 22/Nov/2015:11:57:34 +0800, use the date filter to parse it and store the result in a custom field.

# config/beats-date-override.conf
input {
  beats {
    port => 8888
  }
}

filter {
  grok {
    match => { "message" => "%{HTTPD_COMBINEDLOG}" }
    remove_field => [ "agent", "log", "input", "host", "ecs", "tags" ]
  }

  geoip {
    source => "clientip"
  }

  date {
    match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
    timezone => "Asia/Shanghai"
    target => "oldboyedu-linux85-date"
  }
}

output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "oldboyedu-linux85-logstash-nginx-date"
  }
}
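
The match pattern dd/MMM/yyyy:HH:mm:ss Z has a close equivalent in Python's strptime, which makes it easy to check what a given log timestamp should parse to. The sketch below is an illustration only; parse_access_timestamp is a hypothetical helper, and it assumes an English locale for the month abbreviation.

```python
from datetime import datetime, timezone


def parse_access_timestamp(raw: str) -> datetime:
    """Parse an access-log timestamp and normalise it to UTC."""
    # %d/%b/%Y:%H:%M:%S %z corresponds to dd/MMM/yyyy:HH:mm:ss Z.
    parsed = datetime.strptime(raw, "%d/%b/%Y:%H:%M:%S %z")
    return parsed.astimezone(timezone.utc)


if __name__ == "__main__":
    print(parse_access_timestamp("22/Nov/2015:11:57:34 +0800").isoformat())
```

Since the sample timestamp already carries a +0800 offset, the offset in the text determines the result; a timezone setting is mainly relevant for timestamps that have no offset of their own.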

Tags: filebeat logstash
