Continuous Deployment of Telegraf configurations

After I shared my "Using Telegraf as a Gateway" post with Rawkode, he mentioned a talk he gave at InfluxDays San Francisco last month, about migrating away from a centralised implementation of Telegraf, towards automatically configuring it by editing his version-controlled config files in GitHub. Watch that talk here.

I had a couple of questions about having it reload configs, which he answered, and I wondered whether I already had all the components running to get this working.

There are a number of challenges, but nothing that should require me to get into coding. For this, you will need:

  1. Telegraf on Linux machines (I did this on Debian Buster, which uses systemd, and I followed the instructions in my previous blog entry for installation, so telegraf regularly updates via apt)
  2. Root access to those machines, to change configuration files
  3. Node-Red
  4. Err, that's it!

There's a little-used way of starting telegraf. Rather than the default command line option of pointing at a configuration file and configuration directory, telegraf can read its configuration from an HTTP endpoint.

 /usr/bin/telegraf -config http://URLtoEndpointServingConfiguration/  
That will run telegraf, connecting to the specified endpoint, and read its configuration from there.

So, what happens if I use this:

 /usr/bin/telegraf -config http://192.168.1.8:1880/telegraf/myhost  
This will connect to that IP address (it's my Node-Red server) on port 1880 (the Node-Red port), to /telegraf/myhost. If I can get Node-Red to respond to web requests on /telegraf/myhost, I should be able to send the configuration to telegraf.
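Before pointing telegraf at the endpoint, it can help to look at what the endpoint serves. The following is a minimal sketch, assuming curl is installed; `config_url` and `fetch_config` are hypothetical helper names, and the base URL is simply the Node-Red address used in this post.

```shell
#!/bin/sh
# Base URL of the Node-Red server, taken from the post; adjust for your network.
NODE_RED_BASE="${NODE_RED_BASE:-http://192.168.1.8:1880}"

# Build the per-host endpoint that telegraf will be pointed at.
# With no argument, fall back to this machine's hostname.
config_url() {
    printf '%s/telegraf/%s\n' "$NODE_RED_BASE" "${1:-$(hostname)}"
}

# Fetch the served configuration and print it.
# -f makes curl return an error on HTTP failures instead of printing an error page.
fetch_config() {
    curl -fsS --max-time 10 "$(config_url "$1")"
}

# Usage (needs network access to the Node-Red server):
#   fetch_config myhost
#   telegraf --config "$(config_url myhost)" --test
```

The `--test` flag makes telegraf gather metrics once, print them and exit, which is a quick way to see whether the served configuration parses.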

The Node-Red http-in and http-out nodes can do exactly this.

The next challenge is that each of my telegraf instances might want a different set of configurations, so I want Node-Red to understand which instance of telegraf is calling it, and to change configs accordingly.

After that, I want Node-Red to send updated configurations, for when I change the source, so I need to have Node-Red refresh its configs periodically.

Then I want each instance of telegraf to refresh configurations, so everything is automated.

Of course, I want all this stuff to run without exposing any sensitive config information, so I need to have Node-Red modify any retrieved configs to add personal information.

This is quite a kit-bag of things I need to do, so let's get to it!

First, the telegraf end:
I run telegraf from systemd, so it runs automatically on start-up. Therefore, the configuration is set in the systemd subsystem. If you're manually running telegraf, you can just change your command line to what I have above, but for my configuration, I need to edit the telegraf service file:
/lib/systemd/system/telegraf.service
The ExecStart line shows the command line that will run telegraf. Change it to this:

 ExecStart=/usr/bin/telegraf -config http://192.168.1.8:1880/telegraf/%H $TELEGRAF_OPTS  
Note the %H. It means "replace this with the hostname of this machine". So each machine will request its own set of configs.
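One caveat: files under /lib/systemd/system belong to the package, so an apt upgrade of telegraf can overwrite this edit. A systemd drop-in under /etc/systemd/system/telegraf.service.d/ survives upgrades. This is a sketch of that alternative, not what I actually did; `write_telegraf_override` is a hypothetical helper, and the directory and base URL are passed in as arguments.

```shell
#!/bin/sh
# Sketch: write a systemd drop-in that overrides only ExecStart.
# $1 = drop-in directory, $2 = base URL of the Node-Red server
# (both are parameters of this hypothetical helper).
write_telegraf_override() {
    dir=$1
    base=$2
    mkdir -p "$dir" || return 1
    # The empty ExecStart= clears the packaged command before the new one is set.
    # %%H is how printf emits a literal %H, which systemd then expands to the hostname.
    # $TELEGRAF_OPTS stays literal because the format string is single-quoted.
    printf '[Service]\nExecStart=\nExecStart=/usr/bin/telegraf -config %s/telegraf/%%H $TELEGRAF_OPTS\n' \
        "$base" > "$dir/override.conf"
}

# Usage (as root):
#   write_telegraf_override /etc/systemd/system/telegraf.service.d http://192.168.1.8:1880
#   systemctl daemon-reload && systemctl restart telegraf
```

Because only ExecStart is overridden, everything else in the packaged unit, including its reload command, stays as shipped.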

Look at the next line in that file. It claims that a reload command will cause telegraf to refresh its configuration. Linux machines run cron for regular automated tasks, so we can use this. Create a file /etc/cron.hourly/telegraf and put these lines into it:

 #!/bin/bash -e  
 # random sleep (max 5 min) to prevent clients from hitting the server at the same time  
 sleep $(( RANDOM % 300 ))  
 systemctl reload telegraf  

This will cause telegraf to reload its configuration about once an hour: cron fires the script hourly, and the random sleep delays each reload by up to 5 minutes.

Make that file executable:

 chmod +x /etc/cron.hourly/telegraf  

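Word of warning - if your filename contains a dot (or anything else other than letters, digits, underscores and hyphens), cron will silently skip it, because Debian runs the hourly scripts through run-parts. There's no running a telegraf.runthis file - just keep the filename simple.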
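To make the naming rule concrete, here is a small sketch that checks a filename against the character set Debian's run-parts accepts by default (ASCII letters, digits, underscores and hyphens). `is_runparts_name` is a hypothetical helper written for this post, not part of cron or run-parts.

```shell
#!/bin/sh
# Return 0 if the name would be accepted by Debian's run-parts in its default mode,
# 1 if it is empty or contains any character outside A-Z, a-z, 0-9, _ and -.
is_runparts_name() {
    case "$1" in
        ''|*[!A-Za-z0-9_-]*) return 1 ;;
        *) return 0 ;;
    esac
}

# Usage:
#   is_runparts_name telegraf && echo "cron will run it"
#   is_runparts_name telegraf.runthis || echo "cron will skip it"
#
# run-parts can also report what it would execute, without running anything:
#   run-parts --test /etc/cron.hourly
```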

Now run

 systemctl daemon-reload  
 systemctl restart telegraf  
The daemon-reload makes systemd pick up the edited service file. Telegraf is now complaining that it can't get a configuration from anywhere. We'd better jump to Node-Red and sort this out!

Before we run off, a quick note about Windows. Edit the registry, go to

 HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\telegraf  
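Change ImagePath so that its config argument points at the Node-Red URL for that machine, in the same way as the Linux command line above.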
Either reboot, or run Services and restart telegraf.

On Windows this will NOT make telegraf reload its configuration regularly. It only picks up a new configuration when you restart the telegraf service or reboot.

Now, let's run to Node-Red.

Create a new tab in Node-Red. Put the required nodes into it for configs.

The first node updates the configs every 10 minutes.

The next nodes set properties for private settings. These are "Change" nodes, which look like this:

Now we need to retrieve the configs from wherever they are stored:
This is the time where you can get really creative! I use Node-Red to directly store my configs, like this:
You can see here that the URL, database, username and password are all inserted correctly into the telegraf config.
This shows that multiple config snippets can exist in one template node.

But you don't need to use Node-Red to store your configs. Node-Red can retrieve configs from elsewhere, such as external files, via file transfer, pulling from git, etc:
You just need to make sure your config snippets use the mustache format as shown above (with double curly braces around the property names) in your storage of the configs.
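If you keep your snippets outside Node-Red, it is worth checking that the placeholders are spelled the way you expect. The sketch below emulates the substitution with sed; it is not how the Node-Red template node works internally, and the placeholder names (`influx_url`, `influx_db`, `influx_user`, `influx_pass`) are made up for illustration, so use whatever property names your Change nodes set.

```shell
#!/bin/sh
# Emulate mustache substitution for four hypothetical placeholders.
# Reads a template on stdin; $1 = URL, $2 = database, $3 = username, $4 = password.
# Assumes the values contain no '|', '&' or backslash, which sed would treat specially.
render_snippet() {
    sed -e "s|{{influx_url}}|$1|g" \
        -e "s|{{influx_db}}|$2|g" \
        -e "s|{{influx_user}}|$3|g" \
        -e "s|{{influx_pass}}|$4|g"
}

# Usage: a telegraf output snippet with placeholders instead of secrets.
#   render_snippet http://influx.example:8086 telegraf myuser mypass <<'EOF'
#   [[outputs.influxdb]]
#     urls = ["{{influx_url}}"]
#     database = "{{influx_db}}"
#     username = "{{influx_user}}"
#     password = "{{influx_pass}}"
#   EOF
```

The point of the pattern is that the stored snippet never contains the real URL or credentials; they only appear in the rendered output that telegraf receives.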

So we now have all of our config snippets loaded into Node-Red, and refreshed every 10 minutes. We need to serve a web page for telegraf to use:
The first node in here is an http-in node. It listens for GET requests on :1880/telegraf/:hostname. The :hostname part of the URL automatically becomes a property of the message in Node-Red (msg.req.params.hostname), which we will use in the following Switch node.
The Switch node sends the request to a different path, depending on the hostname.

Let's look at what happens for the host "mqtt". It goes next to a "MQTT config" template node, and then to a "linux config" template node.

The MQTT config node enables specific monitoring config for my MQTT implementation

The Linux config node inserts all the standard Linux monitoring configs I want to use, appending the MQTT config at the end:

The final step is to respond to the client with the required config.
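For readers who think better in scripts than in flows, the routing above can be sketched as a shell `case` statement. This only mirrors the logic of the Switch and template nodes; the snippet bodies are placeholders, and the default branch for other hosts is an assumption rather than something taken from my flow.

```shell
#!/bin/sh
# Placeholder snippet bodies; in the flow these are the template nodes.
linux_config() { printf '# standard Linux inputs go here\n'; }
mqtt_config()  { printf '# MQTT-specific inputs go here\n'; }

# Mirror of the Switch node: pick snippets by the hostname in the request path.
serve_config() {
    case "$1" in
        mqtt) linux_config; mqtt_config ;;   # Linux config with the MQTT config appended
        *)    linux_config ;;                # assumed default for any other host
    esac
}

# Usage:
#   serve_config mqtt
```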

Every few minutes, Node-Red logs that a connection has been made and a new config has been served:

So all my telegraf instances are now updating their running config every hour, which means I can tune my jitter and buffers as required, update my configs as I want, for every server, a subset of servers, or individually, all without needing to log in to a server.

Node-Red can even be configured to commit your tabs to version control, so it's running as an industrial change management system.
