Filebeat for ft_transcendence: Efficient Log Collection
What is Filebeat?
Filebeat is a lightweight shipper for log data. It's designed to efficiently collect, process, and forward logs from various sources to Elasticsearch. For our ft_transcendence project, Filebeat serves as the foundation of our observability pipeline, gathering logs from multiple services.
Current Filebeat Version: 8.17.4 (matching our Elasticsearch version)
Why Filebeat for ft_transcendence?
Filebeat offers several advantages for our project:
- Low Resource Footprint: Important for our constrained local environment
- Reliable Log Delivery: Includes features like back-pressure sensing and persistent queues
- Log Enrichment: Adds metadata to make logs more searchable and analyzable
- Pre-built Modules: Ready-to-use configurations for common systems like Nginx and PostgreSQL
Log Sources in ft_transcendence
We're collecting logs from multiple services:
- Django: Application logs and API access logs
- Nginx: Web server access and error logs
- Next.js: Frontend application logs
- Redis: Cache operation logs
- PostgreSQL: Database query and transaction logs
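For the Django logs to land in /var/log/django where Filebeat tails them, the application has to write them there. A minimal sketch of a Django LOGGING setting follows; the file name, logger names, and format string are illustrative choices, not something the stack prescribes:

```python
# settings.py (sketch): route Django logs to a file that Filebeat tails.
# The path must match the ./logs/django volume mounted into both the Django
# and Filebeat containers; "django.log" and the format are illustrative.
LOGGING = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {
        # Timestamp-first lines: continuation lines (tracebacks) start with
        # whitespace or "Traceback", which the Filebeat multiline pattern
        # will join back onto the originating event.
        "plain": {"format": "%(asctime)s %(levelname)s %(name)s %(message)s"},
    },
    "handlers": {
        "file": {
            "class": "logging.FileHandler",
            "filename": "/var/log/django/django.log",
            "formatter": "plain",
        },
    },
    "root": {"handlers": ["file"], "level": "INFO"},
}
```

Any log record (including multi-line tracebacks from `logger.exception`) then ends up in a file the Django input below picks up.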
Filebeat Configuration
filebeat.yml
# Basic settings
filebeat.config:
  modules:
    path: ${path.config}/modules.d/*.yml
    reload.enabled: true

# Input configurations
filebeat.inputs:
  - type: container
    enabled: true
    paths:
      - /var/lib/docker/containers/*/*.log
    processors:
      - add_docker_metadata: ~

  # Custom file input for Django logs
  # (the log input is deprecated in 8.x in favor of filestream, but is kept
  # here for its simpler multiline/json options)
  - type: log
    enabled: true
    paths:
      - /var/log/django/*.log
    fields:
      service: django
    fields_under_root: true
    multiline:
      pattern: '^[[:space:]]+|^Traceback[[:space:]]|^[[:alnum:]]+Error:|^[[:alnum:]]+Exception:'
      negate: false
      match: after

  # Custom file input for Next.js logs
  - type: log
    enabled: true
    paths:
      - /var/log/nextjs/*.log
    fields:
      service: nextjs
    fields_under_root: true
    json:
      keys_under_root: true
      message_key: message

# Module configurations
filebeat.modules:
  - module: nginx
    access:
      enabled: true
      var.paths: ['/var/log/nginx/access.log*']
    error:
      enabled: true
      var.paths: ['/var/log/nginx/error.log*']
  - module: postgresql
    log:
      enabled: true
      var.paths: ['/var/log/postgresql/*.log*']

# Custom index names: disable ILM and automatic template setup, otherwise
# Filebeat ignores/rejects the `indices` settings below
setup.ilm.enabled: false
setup.template.enabled: false

# Output configuration
output.elasticsearch:
  hosts: ['https://elasticsearch:9200']  # https to match ssl.enabled below
  username: '${ELASTIC_USER}'
  password: '${ELASTIC_PASSWORD}'
  ssl.enabled: true
  ssl.verification_mode: 'none' # For development only
  # Index settings: module events carry `event.module`, our custom inputs
  # carry the `service` field, and Redis (which has no file input here)
  # arrives via the container input, so we match on its container name
  indices:
    - index: 'nginx-%{+yyyy.MM.dd}'
      when.contains:
        event.module: 'nginx'
    - index: 'postgresql-%{+yyyy.MM.dd}'
      when.contains:
        event.module: 'postgresql'
    - index: 'django-%{+yyyy.MM.dd}'
      when.contains:
        service: 'django'
    - index: 'nextjs-%{+yyyy.MM.dd}'
      when.contains:
        service: 'nextjs'
    - index: 'redis-%{+yyyy.MM.dd}'
      when.contains:
        container.name: 'redis'  # adjust to your compose service name
    - index: 'system-%{+yyyy.MM.dd}'  # no condition: catch-all for the rest

# Processing pipeline
# (add_cloud_metadata and add_kubernetes_metadata are omitted: neither
# applies to a local Docker Compose stack)
processors:
  - add_host_metadata: ~
  - add_docker_metadata: ~
  - add_fields:
      target: ''
      fields:
        project: 'ft_transcendence'
        environment: 'development'
Docker Compose Integration
Here's how we've added Filebeat to our Docker Compose stack:
filebeat:
  image: docker.elastic.co/beats/filebeat:8.17.4
  container_name: ft_filebeat
  user: root # To access container logs
  volumes:
    - ./config/filebeat/filebeat.yml:/usr/share/filebeat/filebeat.yml:ro
    - /var/lib/docker/containers:/var/lib/docker/containers:ro
    - /var/run/docker.sock:/var/run/docker.sock:ro
    - filebeat_data:/usr/share/filebeat/data
    # Log volume mounts
    - ./logs/nginx:/var/log/nginx:ro
    - ./logs/django:/var/log/django:ro
    - ./logs/nextjs:/var/log/nextjs:ro
    - ./logs/postgresql:/var/log/postgresql:ro
  environment:
    - ELASTIC_USER=elastic
    - ELASTIC_PASSWORD=${ELASTIC_PASSWORD}
  networks:
    - elastic
  depends_on:
    - elasticsearch
  restart: unless-stopped
  command: filebeat -e -strict.perms=false
Log Processing Features
Multiline Handling
For logs that span multiple lines (like Python tracebacks), we configure a continuation pattern: any line that matches it is appended to the event started by the preceding line:
multiline:
  pattern: '^[[:space:]]+|^Traceback[[:space:]]|^[[:alnum:]]+Error:|^[[:alnum:]]+Exception:'
  negate: false
  match: after
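To see what `negate: false, match: after` means in practice, here is a small simulation of the grouping logic (illustrative only, not Filebeat's code; the POSIX `[[:alnum:]]` class is translated to `[A-Za-z0-9]` for Python's re module):

```python
import re

# The same continuation pattern as in filebeat.yml: indented lines,
# traceback headers, or exception-class lines belong to the previous event.
CONT = re.compile(r'^\s+|^Traceback\s|^[A-Za-z0-9]+Error:|^[A-Za-z0-9]+Exception:')

def group_events(lines):
    """Append each matching line to the previous event (negate: false, match: after)."""
    events = []
    for line in lines:
        if CONT.match(line) and events:
            events[-1] += "\n" + line
        else:
            events.append(line)
    return events

log = [
    "2025-01-01 12:00:00 INFO request ok",
    "2025-01-01 12:00:01 ERROR unhandled exception",
    "Traceback (most recent call last):",
    '  File "views.py", line 10, in pong',
    "ZeroDivisionError: division by zero",
]
print(group_events(log))  # two events: the traceback stays with its ERROR line
```

The four traceback lines collapse into one event, so the whole stack trace is searchable as a single document in Elasticsearch.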
JSON Parsing
For services that output JSON logs (like Next.js), we enable automatic parsing:
json:
  keys_under_root: true
  message_key: message
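Roughly, these two options mean: merge the decoded JSON keys into the top level of the event, and treat the value under `message` as the event's message. A sketch of the effect (the Next.js field names in the sample line are illustrative):

```python
import json

def decode_json_line(line, message_key="message"):
    """Rough sketch of json.keys_under_root + message_key semantics:
    decoded keys land at the event root, and the value under message_key
    becomes the event's message."""
    event = {}
    decoded = json.loads(line)
    event.update(decoded)                           # keys_under_root: true
    event["message"] = decoded.get(message_key, line)  # message_key: message
    return event

# A hypothetical Next.js JSON log line
line = '{"level": "warn", "message": "slow render", "route": "/game"}'
print(decode_json_line(line))
```

The result is that `level` and `route` become queryable fields in Kibana instead of being buried inside an opaque message string.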
Metadata Enrichment
We add various metadata to make our logs more useful:
- Host Information: OS, hostname, architecture
- Docker Metadata: Container name, image, labels
- Custom Fields: Project name, environment
Log Rotation and Management
To prevent disk space issues in our limited environment, we implement:
- Log Rotation: Using Docker log rotation settings
- Index Lifecycle Management: Aging out older logs
- Force Merging: Periodically merging the segments of older, read-only indices to reduce search overhead
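The Docker-side rotation mentioned above can be set per service in docker-compose; the sizes below are placeholders to tune against your disk budget:

```yaml
services:
  nginx:
    logging:
      driver: json-file
      options:
        max-size: "10m" # rotate each container log file at 10 MB
        max-file: "3"   # keep at most 3 rotated files per container
```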
Performance Tuning
For optimal Filebeat performance in a resource-constrained environment:
# Add to filebeat.yml for performance
queue.mem:
  events: 1024
  flush.min_events: 512
  flush.timeout: 5s
max_procs: 1 # Limit CPU usage on small machines
Monitoring Filebeat
To ensure Filebeat itself is operating correctly:
- Check the status with:
docker logs ft_filebeat
- Monitor internal metrics through Kibana
- Set up alerts for Filebeat failures
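Filebeat also ships with built-in self-test subcommands, which can be run inside the container to verify the configuration and the Elasticsearch connection:

```shell
# Validate the configuration file inside the running container
docker exec ft_filebeat filebeat test config

# Check connectivity and authentication to the Elasticsearch output
docker exec ft_filebeat filebeat test output
```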
Next Steps
In the next article, we'll explore how Logstash processes and enriches the logs collected by Filebeat before they reach Elasticsearch.