
Elasticsearch for ft_transcendence: Setup and Configuration

What is Elasticsearch?

Elasticsearch is a distributed, RESTful search and analytics engine capable of performing complex searches and aggregations. It allows us to search, index, store, and analyze data of all shapes and sizes in near real-time, making it perfect for our ft_transcendence project needs.

Why This Configuration?

This configuration is custom-tailored for the ft_transcendence project. Since the project requirements force us to test in a local environment with limited resources, we've optimized Elasticsearch to run efficiently without sacrificing necessary functionality.

Current Elasticsearch Version: 8.17.4 (latest version as of March 31, 2025)

Use Cases for ft_transcendence

For our project, we'll leverage Elasticsearch for three primary purposes:

1. Observability

  • Collect data from containers
  • Monitor application logs

2. Search

  • Efficient log search and analysis

3. Security

  • Monitor endpoints
  • Track user activities
  • Analyze data access patterns

Installation Approach

While there are multiple ways to install Elasticsearch, we'll use pre-built Docker images from Docker Hub for our development environment. Note that in production environments, a self-managed installation gives more control over hardening, certificates, and cluster topology.
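Pulling the image is a single command; pinning the tag to the version above avoids surprise upgrades (the plain `elasticsearch` image name assumes the Docker Hub official image rather than Elastic's own registry):

```shell
# Pull the Elasticsearch image, pinned to the version used throughout this setup
docker pull elasticsearch:8.17.4
```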

Base Image Initial Dependencies

Building the Elasticsearch container requires installing these packages (the tooling needed to add Elastic's APT repository, plus Elasticsearch itself):

  • sudo
  • wget
  • gpg
  • curl
  • apt-transport-https
  • elasticsearch
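One way these dependencies fit together is a Dockerfile along these lines (a sketch assuming a Debian/Ubuntu base image; the repository setup follows Elastic's documented APT instructions):

```dockerfile
FROM ubuntu:22.04

# Tools needed to add Elastic's APT repository and fetch packages
RUN apt-get update && \
    apt-get install -y sudo wget gpg curl apt-transport-https

# Add Elastic's signing key and 8.x repository, then install Elasticsearch itself
RUN wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | \
        gpg --dearmor -o /usr/share/keyrings/elasticsearch-keyring.gpg && \
    echo "deb [signed-by=/usr/share/keyrings/elasticsearch-keyring.gpg] https://artifacts.elastic.co/packages/8.x/apt stable main" \
        > /etc/apt/sources.list.d/elastic-8.x.list && \
    apt-get update && apt-get install -y elasticsearch
```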

Elasticsearch Security Configuration

By default, Elasticsearch is configured with:

  • Built-in superuser account (elastic) with a generated password
  • Authentication and authorization enabled
  • TLS enabled

The password for the elastic user is set to the value of the ELASTIC_PASSWORD environment variable in our Docker configuration.
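In Docker Compose terms, that wiring looks roughly like this (the service name and the use of a `.env` file are assumptions about our setup):

```yaml
services:
  elasticsearch:
    image: elasticsearch:8.17.4
    environment:
      # Read from the host environment or a .env file; never commit the value
      - ELASTIC_PASSWORD=${ELASTIC_PASSWORD}
```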

Configuration Details

elasticsearch.yml

# Cluster configuration
cluster.name: ft_transcendence
node.name: ${HOSTNAME}

# Network settings
network.host: 0.0.0.0
http.port: 9200
transport.port: 9300
http.cors.enabled: true
http.cors.allow-origin: '*'

# Security settings
xpack.security.enabled: true
xpack.security.transport.ssl.enabled: true
xpack.security.http.ssl.enabled: true
xpack.security.audit.enabled: true

# Memory and performance
bootstrap.memory_lock: true

# Paths
path.data: /var/lib/elasticsearch/data
path.logs: /var/lib/elasticsearch/logs
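One caveat: `bootstrap.memory_lock: true` only takes effect if the container is allowed to lock memory. In Docker Compose that means raising the memlock ulimit, roughly like this:

```yaml
services:
  elasticsearch:
    ulimits:
      memlock:
        soft: -1
        hard: -1
```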

JVM Options

# Memory settings - adjusted for smaller environments
-Xms1g
-Xmx1g

# GC settings
-XX:+UseG1GC
-XX:G1ReservePercent=25
-XX:InitiatingHeapOccupancyPercent=30

# Logging
-Xlog:gc*,gc+age=trace,safepoint:file=/var/lib/elasticsearch/logs/gc.log:utctime,pid,tags:filecount=32,filesize=64m

Logging Configuration

Our setup includes:

  • Comprehensive logging with rotation
  • Security audit logs
  • Slowlog for indexing and search operations
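Slowlog thresholds are per-index settings, applied through the settings API rather than elasticsearch.yml. A sketch, with an illustrative index name and untuned threshold values:

```shell
# Log queries and indexing operations that exceed the given thresholds
curl -k -u elastic:${ELASTIC_PASSWORD} -X PUT \
  "https://localhost:9200/my-index/_settings" \
  -H 'Content-Type: application/json' -d '
{
  "index.search.slowlog.threshold.query.warn":    "10s",
  "index.search.slowlog.threshold.query.info":    "5s",
  "index.indexing.slowlog.threshold.index.warn":  "10s",
  "index.indexing.slowlog.threshold.index.info":  "5s"
}'
```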

Scaling: Adding Nodes to the Cluster

For future scaling, we can add nodes to the cluster by adding this configuration:

# Discovery for adding nodes later
discovery.seed_hosts: ['host1', 'host2']
cluster.initial_master_nodes: ['node-1']

Security Considerations

Our current setup includes basic security suitable for development. For production, we would:

  1. Generate proper certificates using Elasticsearch's certutil
  2. Implement more restrictive access controls
  3. Configure network security properly
  4. Use dedicated users for different services
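Step 1 uses the certutil tool that ships with Elasticsearch; the documented flow is to create a certificate authority first, then issue a node certificate signed by it:

```shell
# Create a CA, then a node certificate signed by that CA
bin/elasticsearch-certutil ca --out elastic-stack-ca.p12
bin/elasticsearch-certutil cert --ca elastic-stack-ca.p12 --out elastic-certificates.p12
```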

Docker Compose Implementation

Our implementation uses Docker Compose to orchestrate:

  • Elasticsearch
  • Kibana (for visualization)
  • Filebeat (for log collection)
  • Logstash (for log filtering)

We're collecting logs from:

  • Django
  • Nginx
  • Next.js
  • Redis
  • PostgreSQL

Some services like Grafana and pgAdmin are excluded from log collection to preserve resources.
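On the Filebeat side, collecting container logs boils down to a filebeat.yml along these lines (the paths, hostnames, and credentials are assumptions about our compose setup):

```yaml
filebeat.inputs:
  - type: container
    paths:
      - /var/lib/docker/containers/*/*.log

output.elasticsearch:
  hosts: ["https://elasticsearch:9200"]
  username: "elastic"
  password: "${ELASTIC_PASSWORD}"
  ssl.verification_mode: "none"   # dev only; verify certificates in production
```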

Getting Started

To start the Elasticsearch stack:

docker-compose up -d elasticsearch kibana

Check the cluster status (the -k flag skips TLS certificate verification, which is acceptable only with our self-signed development certificates):

curl -k -u elastic:${ELASTIC_PASSWORD} "https://localhost:9200/_cluster/health?pretty"
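Right after startup the cluster may not be ready to answer yet, so in scripts a small wait loop helps; the health API's `wait_for_status` parameter does the blocking for us:

```shell
# Poll until the cluster reaches at least yellow status
until curl -ks -u "elastic:${ELASTIC_PASSWORD}" \
  "https://localhost:9200/_cluster/health?wait_for_status=yellow&timeout=5s" >/dev/null; do
  sleep 2
done
```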

In the next articles, we'll explore each component of our observability stack in depth.