HAProxy is a popular reverse proxy server. This short guide shows you how to use Fluentd to store HAProxy logs in Elasticsearch so that you can monitor HAProxy's performance.

Prerequisites

  • A basic understanding of Fluentd
  • HAProxy logs written to files via syslog-ng/rsyslogd
  • A running Elasticsearch instance

In this guide, we assume we are running td-agent on Ubuntu Precise.

Tailing the HAProxy logs

The first step is to set up the tail input to tail the HAProxy log.

The TCP HAProxy logs look something like this:

haproxy[27508]: info [12/Jul/2012:15:19:03.258] wss-relay wss-relay/local02_9876 0/0/50015 1277 cD 1/0/0/0/0 0/0

It can be parsed with the following regular expression:

/^(?<ps>\w+)\[(?<pid>\d+)\]: (?<pri>\w+) (?<c_ip>[\w\.]+):(?<c_port>\d+) \[(?<time>.+)\] (?<f_end>[\w-]+) (?<b_end>[\w-]+)\/(?<b_server>[\w-]+) (?<tw>\d+)\/(?<tc>\d+)\/(?<tt>\d+) (?<bytes>\d+) (?<t_state>[\w-]+) (?<actconn>\d+)\/(?<feconn>\d+)\/(?<beconn>\d+)\/(?<srv_conn>\d+)\/(?<retries>\d+) (?<srv_queue>\d+)\/(?<backend_queue>\d+)$/
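Since Fluentd's regexes are Ruby regexes, you can sanity-check this pattern outside Fluentd before deploying it. In the sketch below, the client address 203.0.113.5:45111 is a made-up placeholder standing in for the client ip:port field; the rest of the line comes from the sample above.

```ruby
# Sanity-check the TCP regex against a sample line.
# NOTE: the client address 203.0.113.5:45111 is a made-up placeholder.
TCP_PATTERN = /^(?<ps>\w+)\[(?<pid>\d+)\]: (?<pri>\w+) (?<c_ip>[\w\.]+):(?<c_port>\d+) \[(?<time>.+)\] (?<f_end>[\w-]+) (?<b_end>[\w-]+)\/(?<b_server>[\w-]+) (?<tw>\d+)\/(?<tc>\d+)\/(?<tt>\d+) (?<bytes>\d+) (?<t_state>[\w-]+) (?<actconn>\d+)\/(?<feconn>\d+)\/(?<beconn>\d+)\/(?<srv_conn>\d+)\/(?<retries>\d+) (?<srv_queue>\d+)\/(?<backend_queue>\d+)$/

line = 'haproxy[27508]: info 203.0.113.5:45111 [12/Jul/2012:15:19:03.258] ' \
       'wss-relay wss-relay/local02_9876 0/0/50015 1277 cD 1/0/0/0/0 0/0'

m = TCP_PATTERN.match(line)
puts m[:b_server]  # => local02_9876
puts m[:tt]        # => 50015
```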

The HTTP HAProxy logs follow a longer format, which can be parsed with the following regular expression:

/^(?<ps>\w+)\[(?<pid>\d+)\]: (?<c_ip>[\w\.]+):(?<c_port>\d+) \[(?<time>.+)\] (?<f_end>[\w-]+) (?<b_end>[\w-]+)\/(?<b_server>[\w-]+) (?<tq>\d+)\/(?<tw>\d+)\/(?<tc>\d+)\/(?<tr>\d+)\/(?<tt>\d+) (?<status_code>\d+) (?<bytes>\d+) (?<req_cookie>\S+) (?<res_cookie>\S+) (?<t_state>[\w-]+) (?<actconn>\d+)\/(?<feconn>\d+)\/(?<beconn>\d+)\/(?<srv_conn>\d+)\/(?<retries>\d+) (?<srv_queue>\d+)\/(?<backend_queue>\d+) \{(?<req_headers>[^}]*)\} \{(?<res_headers>[^}]*)\} "(?<request>[^"]*)"/
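The same kind of sanity check works for the HTTP regex. The log line below is illustrative only, with made-up values modeled on HAProxy's documented httplog layout (client address, timers, status, cookies, termination state, connection counts, queues, captured headers, and the request).

```ruby
HTTP_PATTERN = /^(?<ps>\w+)\[(?<pid>\d+)\]: (?<c_ip>[\w\.]+):(?<c_port>\d+) \[(?<time>.+)\] (?<f_end>[\w-]+) (?<b_end>[\w-]+)\/(?<b_server>[\w-]+) (?<tq>\d+)\/(?<tw>\d+)\/(?<tc>\d+)\/(?<tr>\d+)\/(?<tt>\d+) (?<status_code>\d+) (?<bytes>\d+) (?<req_cookie>\S+) (?<res_cookie>\S+) (?<t_state>[\w-]+) (?<actconn>\d+)\/(?<feconn>\d+)\/(?<beconn>\d+)\/(?<srv_conn>\d+)\/(?<retries>\d+) (?<srv_queue>\d+)\/(?<backend_queue>\d+) \{(?<req_headers>[^}]*)\} \{(?<res_headers>[^}]*)\} "(?<request>[^"]*)"/

# An illustrative HTTP log line (all values are made up for this example).
line = 'haproxy[674]: 203.0.113.5:33317 [15/Oct/2012:08:31:57.130] http-in ' \
       'static/srv1 6559/0/7/147/6723 200 243 - - ---- 3/2/1/1/0 0/0 ' \
       '{example.com} {} "GET /index.html HTTP/1.1"'

m = HTTP_PATTERN.match(line)
puts m[:status_code]  # => 200
puts m[:request]      # => GET /index.html HTTP/1.1
```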

In the rest of this article, we assume the TCP format. Assuming the HAProxy log is located at /var/log/haproxy/haproxy.log, add the following to the configuration file (which, for td-agent, is at /etc/td-agent/td-agent.conf):

<source>
  type tail
  path /var/log/haproxy/haproxy.log
  pos_file /path/to/file_position_file
  format /^(?<ps>\w+)\[(?<pid>\d+)\]: (?<pri>\w+) (?<c_ip>[\w\.]+):(?<c_port>\d+) \[(?<time>.+)\] (?<f_end>[\w-]+) (?<b_end>[\w-]+)\/(?<b_server>[\w-]+) (?<tw>\d+)\/(?<tc>\d+)\/(?<tt>\d+) (?<bytes>\d+) (?<t_state>[\w-]+) (?<actconn>\d+)\/(?<feconn>\d+)\/(?<beconn>\d+)\/(?<srv_conn>\d+)\/(?<retries>\d+) (?<srv_queue>\d+)\/(?<backend_queue>\d+)$/
  tag haproxy.tcp
  time_format %d/%b/%Y:%H:%M:%S
</source>
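One detail worth noting: the month in the sample timestamp is abbreviated ("Jul"), which strptime matches with %b; %B would expect the full month name ("July"). A quick Ruby check (dropping the millisecond suffix, which the format string does not cover):

```ruby
require 'time'

# "Jul" is an abbreviated month name, so the format must use %b, not %B.
t = Time.strptime('12/Jul/2012:15:19:03', '%d/%b/%Y:%H:%M:%S')
puts t.strftime('%Y-%m-%d %H:%M:%S')  # => 2012-07-12 15:19:03
```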

Outputting Data into Elasticsearch

Fluentd supports Elasticsearch as an output. For td-agent, run:

/usr/sbin/td-agent-gem install fluent-plugin-elasticsearch

If you are using vanilla Fluentd, run

fluent-gem install fluent-plugin-elasticsearch

(You might need to use sudo.) Now, configure Elasticsearch as an output:

<match haproxy.*>
  type copy
  <store>
    # for debugging (see /var/log/td-agent/td-agent.log)
    type stdout
  </store>
  <store>
    type elasticsearch
    logstash_format true
    flush_interval 10s # for testing
    host YOUR_ES_HOST
    port YOUR_ES_PORT
  </store>
</match>

Restart and Confirm That Data Flows into Elasticsearch

Restart td-agent with sudo service td-agent restart. Then, run tail against /var/log/td-agent/td-agent.log. You should see a line like the following:

2012-07-12 15:19:03 +0000 haproxy.tcp: {"ps":"haproxy","pid":"27508","pri":"info","c_ip":"","c_port":"45111","f_end":"wss-relay","b_end":"wss-relay","b_server":"local02_9876","tw":"0","tc":"0","tt":"50015","bytes":"1277","t_state":"cD","actconn":"1","feconn":"0","beconn":"0","srv_conn":"0","retries":"0","srv_queue":"0","backend_queue":"0"}

Then, query Elasticsearch to make sure the data is in there.
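With logstash_format true, the Elasticsearch output plugin writes events into date-stamped logstash-YYYY.MM.DD indices, so one quick check is a search against today's index (YOUR_ES_HOST and YOUR_ES_PORT are the placeholders from the configuration above):

```shell
# Search today's logstash index for events whose "ps" field is "haproxy".
curl -s "http://YOUR_ES_HOST:YOUR_ES_PORT/logstash-$(date +%Y.%m.%d)/_search?q=ps:haproxy&pretty"
```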

What's Next?

In production, you will probably want to stop writing output to stdout. Use the following output configuration instead:

<match haproxy.*>
  type elasticsearch
  logstash_format true
  host YOUR_ES_HOST
  port YOUR_ES_PORT
</match>

Do you wish to store HAProxy logs in other systems? Check out other data outputs!
