Debian based fluentd docker image has been released

Hi users!

We officially provide Alpine Linux based Fluentd docker image. Alpine Linux is light weight and this is good for Fluentd use cases. But some plugins, e.g. fluent-plugin-systemd, don't work on Alpine Linux, so we received the request "Could you provide other OS based image for xxx."

To resolve this problem, @tyranron works on improving Docker image management. Easy to add other OS, more automated release and tagging, build testing and more. Thanks to @tyranron for your hard work :)

In this result, we start to provide Debian based docker image together.

Debian images by tyranron · Pull Request #71 · fluent/fluentd-docker-image

The default is still Alpine Linux but you can choose Debian version for your requirement. See Supported tags section in README for all available images.

If you have any problem, please let me know.


Happy logging!

Read More

Fluentd v0.12.32 has been released

Hi users!

We have released Fluentd version 0.12.32. Here are the changes:

New features / Enhancement

Bug fixes

formatter: Add add_newline parameter to remove '\n' from the result

json, ltsv, csv and hash formatters now support add_newline parameter. If you set add_newline false, the formatted result doesn't contain \n. This is useful for Queue / KVS plugins.

in_tail: Capture unmatched lines

v0.12.32 introduces emit_unmatched_lines parameter to in_tail plugin. This option enables you to get unparseable line as an event.

For example, if you have following configuration and file:

  • fluent.conf
<source>
  @type tail
  path /path/to/json.log
  format json
  emit_unmatched_lines true
  # ...
</source>
  • json.log
{"message":"Hello!"}
This is broken line!
{"message":"Yay :)"}

Then, the following events are generated from this tail plugin.

{"message":"Hello!"}
{"unmatched_line":"This is broken line!"}
{" message":"Yay :)"}

Unmatched line is assigned to unmatched_line field.

This option works with both single-line and multi-lines.

in_monitor_agent: Add retry field

Fluentd v0.14 adds retry field to /api/plugins.json response. The retry field contains detailed information of buffer's retry. For API consistency, v0.12's in_monitor_agent also provides same field. Here is retry field example:

{
  // other monitor_agent keys
  "retry": {
    "steps":1, // same as retry_count
    "next_time":"2016-12-23 01:37:16 +0700"
  }
}

Unlike v0.14, start field doesn't exist because v0.12's buffer doesn't have its information.

We recommend to use retry's steps instead of retry_count for monitoring.


Lastly, v0.12.32 docker image has also been available on Docker Hub.


Happy logging!

Read More

Fluentd v0.14.12 has been released

Hi users!

After the release of Fluentd v0.14.10, we did 2 releases, v0.14.11 at the end of 2016, and v0.14.12 today. Fluentd v0.14.11 release was a kind of quick fix for major bug. And now, we're very happy to introduce three major new feature with Fluentd v0.14.12!

Here are major changes (full ChangeLog is available here):

We say again, fluentd v0.14 is still development version. Newly added features are under experimental... we need your feedback seriously! If you try to use v0.14, check your configuration and plugins carefully.

Root directory, and no more "path" parameters in buffer configuration

With past versions of Fluentd, file buffer plugin requires path parameter to store buffer chunks in local file systems. These paths should be configured not to use same directories carefully.

<match log.service>
  @id   forward_a
  @type forward
  <buffer>
    @type file
    path  /path/of/buffer/for/service
    flush_interval 1s
  </buffer>
  <server>
    # ...
  </server>
</match>
<match log.system>
  @id   forward_b
  @type forward
  <buffer>
    @type file
    path  /path/of/buffer/for/systems
    flush_interval 1s
  </buffer>
  <server>
    # ...
  </server>
</match>

With Fluentd v0.14.12, these paths can be configured automatically, using root_dir option in <system> directive.

<system>
  root_dir /path/fluentd/root
</system>
<label @traffic>
  <match log.service>
    @id   forward_a
    @type forward
    <buffer>
      @type file
      flush_interval 1s
    </buffer>
    <server>
      # ...
    </server>
  </match>
  <match log.system>
    @id   forward_b
    @type forward
    <buffer>
      @type file
      flush_interval 1s
    </buffer>
    <server>
      # ...
    </server>
  </match>
</label>

There are no need to configure path in <buffer> sections when root_dir is configured and all plugins are configured with @id parameter. All buffer chunks (and storage plugin files) will be stored under /path/fluentd/root/worker0/plugin_id. Don't miss to monitor the disk usage of that partition!

Multi process workers

Fluentd v0.14.12 supports multi process workers, finally! With this feature, two or more Fluentd workers will be launched, using the same number of CPU cores. This feature can replace fluent-plugin-multiprocess, with very simple configuration, to process huge traffic on multi CPU servers.

Note: There are a requirement: All configured plugin must support multi process workers feature.

<system>
  workers 8                    # also can be specified by `--workers` command line option
  root_dir /path/fluentd/root  # strongly recommended to specify this
</system>
<source>
  @id    input_from_edge
  @type  forward
  @label @aggr_traffic
  port 24224
</source>
<label @aggr_traffic>
  <match **>
    @id   output_to_backend
    @type forward
    <buffer>
      @type file
      flush_interval 1s
      chunk_limit_size 8m
      total_limit_size 64g  # limit per plugin instance, per worker (== 64GB * 8 in total max)
    </buffer>
    <server>
      # ...
    </server>
  </match>
</label>

Eight worker (and a supervisor) processes work like below:

  • Fluentd launches 8 worker processes in the startup sequence
  • Supervisor process listens port 24224
  • When TCP connection requests arrived from client, supervisor delegates accept to a worker of workers (server plugin helper does it)
  • A worker accepts the connection and process inbound traffic from that socket

All worker processes have the set of plugins, routing and buffers (under root_dir/workerN if root_dir is configured), and work in parallel completely. Each workers consumes memory and disk spaces - you should take care to configure buffer spaces, and to monitor memory/disk consumption.

This feature is highly experimental. We need many feedbacks, including bug reports, performance graphs, resource consumption summary and more.

Making plugins available with multi process workers

Once again, multi process workers configuration requires all configured plugins support multi process worker feature. It means:

  • Plugin MUST be implemented with v0.14 API natively
    • Plugin class is a direct subclass of Fluent::Plugin::Input, Fluent::Plugin::Output, Fluent::Plugin::Filters, etc
  • Plugin MUST have #multi_workers_ready? method and return true
    • The default return value of this method is true in filter/buffer/parser/formatter/storage plugins
    • But it's false in input/output plugins
    • 3rd party plugins SHOULD re-implemnet #multi_workers_ready? to return true to support multi process workers
  • Plugin MUST NOT use raw sockets to listen network ports
    • It makes Errno::ADDRINUSE errors with multi process worker configurations
    • Use server plugin helper to create networks servers to listen network port via the supervisor process

Almost all built-in plugins support multi process workers, but in_tail doesn't support it. If in_tail is configured with workers, Fluentd will fail to start. This disadvantage will be solved in future versions (see also).

TLS encrypted communication support

There has been many requests to support secure network transportation between Fluentd nodes, for the cases of communication between data centers. Some users had used 3rd party fluent-plugin-secure-forward plugin... But finally, Fluentd core supports it!

Here's a configuration example to use server certificates for server nodes (input side), and to share it in client side (output side) too:

##### output side
<match secret.log>
  @id   datacenter_a_output
  @type forward

  transport tls
  tls_cert_path /etc/mysecret/cert.pem # server certification file of destination node
                                       # or private CA, or ...

  ### tls_cert_path can be specified with 2 or more paths
  # tls_cert_path /etc/mysecret/cert1.pem, /etc/mysecret/cert2.pem

  ### There are some options for detailed configuration....
  # tls_version              TLSv1_2  # TLS v1.2 (or TLSv1_1)
  # tls_insecure_mode          false  # skip all certificate checks if true
  # tls_allow_self_signed_cert false
  # tls_verify_hostname         true  # verify FQDN in certificates, with SNI

  <server>
    host dest1.myhost.example.com
    port 24228
  </server>
  <server>
    # or, use host(ip address) & name(hostname to verify with cert)
    host 203.0.113.103
    name dest2.myhost.example.com
    port 24228
  </server>
</match>

##### input side
<source>
  @id   datacenter_b_input
  @type forward
  port 24228

  <transport tls>
    # version TLSv1_2
    cert_path              /etc/mysecret/cert1
    private_key_path       /etc/mysecret/server.key.pem
    private_key_passphrase my_secret_passphrase_for_key
  </transport>
</source>

If your server certificate requires intermediate CA certificates to be verified, concat these certs to make chained certs (just like nginx) and specify it in cert_path.

There are some patterns to configure forward input plugin with TLS. All parameters for input plugin (server side) below should be specified in <transport tls> section.

  • Server certificates signed by public CA
    • specify cert_path, private_key_path and private_key_passphrase in server side
    • just specify transport tls in clien side (if your system's cert store has valid root CA certs)
  • Server certificates signed by private CA
    • specify cert_path, private_key_path and private_key_passphrase in server side
    • specify transport tls and tls_cert_path in cliden side
  • Server certificates signed by self
    • specify cert_path, private_key_path and private_key_passphrase in server side
    • specify transport tls, tls_cert_path and tls_allow_self_signed_cert true in cliden side
  • Automatically generated server certificates using private CA certs and keys
    • specify ca_cert_path, ca_private_key_path and ca_private_key_passphrase in server side
    • specify options for cert generation (described below) if needed in server side
    • specify tls_cert_path to specify private CA cert in client side
  • Automatically generated server certificate signed by self (THIS IS ONLY FOR TESTING)
    • specify insecure true, and options for cert generation (if needed) in server side
    • specify tls_insecure_mode true in client side

Options for cert generation:

  • generate_private_key_length (default: 2048)
  • generate_cert_country (default: 'US')
  • generate_cert_state (default: 'CA')
  • generate_cert_locality (default: 'Mountain View')
  • generate_cert_common_name (default: hostname of the node)
  • generate_cert_expiration (default: 3650d)
  • generate_cert_digest (default: 'sha256')

This feature is highly experimental. Try this in your development/evaluation environment, and send us feedbacks!

Compatibility between this feature and fluent-plugin-secure-forward

This TLS support feature is almost same with the one of fluent-plugin-secure-forward. Fluentd forward plugins already have authentication feature, introduced at v0.14.5, which is comptible with secure-forward plugins. Now, we have TLS support, so 100% compatible with secure-forward plugin about these protocols.

  • secure_forward output plugin + forward input plugin (w/ TLS, auth):
    • configure secure_forward output plugin as usual
    • configure forward input plugin using <transport tls> and <secure> sections
  • forward output plugin (w/ TLS, auth) + secure_forward input plugin:
    • configure forward output plugin using transport tls and tls_* options
    • configure forward output plugin with time_as_integer if secure_forward plugin is working on Fluentd v0.12.x or earlier
    • configure secure_forward input plugin as usual

There are some difference about behavior. It will make difference about performance and resource usage:

  • secure_forward output plugin uses connection keep-alive, but forward connects to servers every flush time
    • establishing TLS connection requires CPU time, so CPU usage might be higher in forward plugins (in both of input and output)
    • keep-alive sometimes make troubles (especially on Internet), so forward may have less network connectivity troubles
  • secure_forward output plugin creates threads per connections, but forward uses asynchronous I/O
    • this may make some difference about CPU time / memory usage, but not sure

Please notice everything to the Fluentd core developer team via issues if you have any troubles.

in_forward: Add source_address_key and source_hostname_key options

forward input plugin now has two options to inject fields about data source host address and hostname:

<source>
  @id input_via_network
  @type forward
  port 24224
  source_address_key  address
  source_hostname_key hostname
  # All events should have data like:
  # { "address": "203.0.113.103", "hostname": "web.example.com" }
</source>

The Fluentd process with forward input plugin, which is configured to enable these options, will consume much CPU time to manipulate all events. Take care about it.

Fluentd internal log event routing

Fluentd's internal log events can be caputured with <match fluent.**> sections. Now these events can be routed into the <label @FLUENT_LOG> section if it's configured.

<label @FLUENT_LOG>
  <match fluent.{warn,error,fatal}>
    @id   fluentd_monitoring
    @type elasticsearch
    # ....
  </match>
  # logs in info, debug, trace will be dropped
</label>

Make your configuration clean & readable with labels!

Ruby 2.4 support

Fluentd is now tested on Ruby 2.4, and next td-agent will be released with bundled Ruby 2.4.

Major bug fixes

  • command line: fix bug to ignore command line options: --rpc-endpoint, --suppress-config-dump, etc #1398
  • supervisor: fix bug of process_name option not to work about supervisor #1380
  • in_forward: Fix a bug not to handle require_ack_response correctly #1389

Enjoy logging!

Read More

Update the GPG key for Treasure Agent

Hi folks,

This article is for Treasure Agent users.

We used SHA1 based GPG key for td-agent package signing, but SHA1 has beed deprecated. For example, apt will remove SHA1 support: Teams/Apt/Sha1Removal - Debian Wiki

So we have updated Treasure Agent's GPG key for deb/rpm to drop SHA1 based signing. It means you need to update imported old GPG key before td-agent update.

If new deployment or if you disable gpg check, no need update action.

Here is an update steps for deb/rpm.

deb

Remove old GPG key

% apt-key del A12E206F

Import new GPG key

% curl -O https://packages.treasuredata.com/GPG-KEY-td-agent
% apt-key add GPG-KEY-td-agent

You can check imported is succeeded or not.

% apt-key list

Error content

Here is error example with old GPG key

W: GPG error: http://packages.treasuredata.com/2/ubuntu/xenial xenial InRelease: The following signatures couldn't be verified because the public key is not available: NO_PUBKEY 901F9177AB97ACBE

rpm

Remove old GPG key

If you found this key in rpm, remove it by following command:

% rpm -e --allmatches "gpg-pubkey-a12e206f-*"

Imoprt new GPG key

Import by rpm import

% rpm --import https://packages.treasuredata.com/GPG-KEY-td-agent

You can check imported is succeeded or not.

% rpm -qi "gpg-pubkey-ab97acbe-*"

Error content

Here is error example with old GPG key

The GPG keys listed for the "TreasureData" repository are already installed but they are not correct for this package.
Check that the correct key URLs are configured for this repository.


 Failing package is: td-agent-2.3.4-0.el7.x86_64
 GPG Keys are configured as: https://packages.treasuredata.com/GPG-KEY-td-agent

Old GPG key

If you need old GPG key for older packages, use following key:

GPG-KEY-td-agent-old-sha1

Read More

Fluentd v0.12.31 has been released

Hi users!

We have released Fluentd version 0.12.31. Here are the changes:

New features / Enhancement

  • output: Add slow_flush_log_threshold parameter: #1366
  • formatter_csv: Change fields parameter to required. Now accepts both a,b and ["a", "b"]: #1361
  • in_syslog: Add priority_key and facility_key parameters: #1351

Add slow_flush_log_threshold parameter

We introduce slow_flush_log_threshold parameter to investigate flush performance issue. Fluentd users sometimes hit BufferQueueLimitError without errors. In such case, users hard to find the cause, traffic is high or output destination becomes slow?

With this parameter, users can know output destination has a problem or not. If you see following message in the fluentd log, your output destination or network has a problem and it causes slow chunk flush.

2016-12-19 12:00:00 +0000 [warn]: buffer flush took longer time than slow_flush_log_threshold: elapsed_time=15.0031226690043695 slow_flush_log_threshold=10.0 plugin_id="es_output"

In this example, slow_flush_log_threshold is 10.0 but chunk flush takes 15 seconds.

<match app.**>
  @type elasticsearch
  @id es_output
  slow_flush_log_threshold 10.0
  # ...
</filter>

The default value is 20.0. If your buffer chunk is small and network latency is low, set smaller value for better monitoring.


Lastly, v0.12.31 docker image has also been available on Docker Hub.


Happy logging!

Read More

About Fluentd

Fluentd is an open source data collector to simplify log management.

Learn

Want to learn more about Fluentd? Check out these pages.

Follow Us!