The InfluxDB Brake

I’m always on the hunt for ways to cut down the execution time of different parts of the NAHS-Bricks environment. For example, I was able to improve the response time of NAHS-BrickServer by 2ms by writing a background worker for the transmission of MQTT messages. With this in mind I planned to do the same thing for writing metrics to InfluxDB, in the hope of gaining another one or two milliseconds.

The discovery

All incoming sensor data is, besides being evaluated, stored in InfluxDB for later review, drawing graphs and so on. For this purpose I wrote a connector module that offers functions for the different sensor types and uses the default InfluxDB Python library to transmit the data to InfluxDB via the write_points function. This works totally fine, but as write_points is a blocking function that only returns after all network packets have been transmitted to and acknowledged by InfluxDB, I saw room for improvement.
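To illustrate the synchronous path, here is a minimal sketch of what such a connector call can look like with the InfluxDB Python library; the host, database and function names are illustrative assumptions, not the actual NAHS-BrickServer code:

    from influxdb import InfluxDBClient

    # Illustrative client setup (host/database are assumptions, not the real config)
    client = InfluxDBClient(host='localhost', port=8086, database='bricks')

    def store_humidity(brick_id, value):
        points = [{
            'measurement': 'humidity',
            'tags': {'brick': brick_id},
            'fields': {'value': float(value)},
        }]
        # write_points blocks until InfluxDB has acknowledged the write,
        # so the caller waits for the full network round-trip.
        client.write_points(points)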

The solution is a background worker that takes the data to transmit from a (multiprocessing) queue and calls the write_points function with it. With this worker the main process only needs to prepare the data and no longer has to wait for write_points to finish, as that happens in the background. Implementing this took just 15 minutes, after which I immediately fired up my performance test to check whether the worker indeed transmits the data to InfluxDB. A minimal sketch of the idea is shown below.
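The queue handling and names in this sketch are assumptions for illustration, not the actual NAHS-BrickServer implementation:

    import multiprocessing
    from influxdb import InfluxDBClient

    metric_queue = multiprocessing.Queue()

    def influx_worker(queue):
        client = InfluxDBClient(host='localhost', port=8086, database='bricks')
        while True:
            points = queue.get()           # blocks until data arrives
            if points is None:             # sentinel to shut the worker down
                break
            client.write_points(points)    # the blocking call now happens here

    worker = multiprocessing.Process(target=influx_worker, args=(metric_queue,), daemon=True)
    worker.start()

    def store_humidity(brick_id, value):
        # The main process only prepares and enqueues the data;
        # it no longer waits for InfluxDB to acknowledge the write.
        metric_queue.put([{
            'measurement': 'humidity',
            'tags': {'brick': brick_id},
            'fields': {'value': float(value)},
        }])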

And the result was 13.5ms on average per request! (Side note: the average response time was 20.5ms before the background worker.) So I checked the data in InfluxDB, as I expected something had gone wrong and the data maybe never got transmitted to InfluxDB. But it did: all data was correctly saved to InfluxDB. Maybe my first test was a lucky shot, so I ran the test again. The result was 13.5ms again. After this I started digging a bit deeper, as I couldn’t believe the result, but came to the conclusion: yes, it’s valid. I had just improved the average response time by 7ms (or 34%) with nearly no effort…

Why does it matter?

All Bricks are connected to BrickServer via the Brick-Interface, which is an HTTP server receiving Delivery data from the Bricks and sending Feedback data back to them. When a Brick wakes up it gathers its Delivery data and sends it to the BrickServer. While it is waiting for the response it has to keep WiFi up and can’t go back to sleep. This means the shorter a Brick has to wait, the earlier it can go back to sleep, which saves power and extends the overall lifetime of the battery before it needs a recharge. And one of the main goals of NAHS-Bricks is to focus on battery-driven sensors with the maximum possible lifetime.

In conclusion: in theory, the improvement in response time should improve battery lifetime.

But does it really matter?

As I recently received a Power Profiler Kit 2 (PPK2) by Nordic Semiconductor, I took the chance and captured the power consumption of a NAHS-RHTBrick in different configurations, with and without the background worker running on NAHS-BrickServer.

Test-setup

  • NAHS-RHTBrick
    • Static sleep delay of 10 seconds
    • Power cycled for each test
    • Configured pin D6 to be pulled low immediately before communication with BrickServer starts and pulled high immediately after the response from BrickServer is received. This ensures I can select exactly the relevant part of the capture for collecting data
  • Power Profiler
    • Powers NAHS-RHTBrick through the battery terminal with 3.9ish volts
    • NAHS-RHTBricks pin D6 is connected to digital channel 0
    • Resolution set to maximum
    • Capture time set to 100 seconds (this captures 9 transmissions to NAHS-BrickServer for each testrun (Fig. 1))
  • Sensors
    1. Testrun with one HDC1080 humidity reading
    2. Testrun with one HDC1080 humidity reading and two DS18B20 temperature readings
    3. Testrun with one HDC1080 humidity reading and four DS18B20 temperature readings
  • NAHS-BrickServer
    • MongoDB and InfluxDB are cleared before each testrun
    • Testruns done with
      1. either background worker enabled (async)
      2. or inline write_points function calls (sync)

Test-execution

In total 6 testruns were done: 3 with the sync InfluxDB connection (one, three and five sensors connected to the Brick) and 3 with the async InfluxDB connection (again one, three and five sensors). For each testrun I collected all nine measurements, which results in 27 samples for each BrickServer configuration (sync and async).

The reason why I chose to do testruns with different sensor counts is that I was unsure whether a difference would be measurable between sync and async with just one sensor. This is because every sensor issues a write_points call, which adds up with three or five sensors. But as you are going to see, there was a measurable difference even with just one sensor. (Take a look at Fig. 2 and Fig. 3 for a direct comparison)

Fig. 1: One (nearly) whole captured testrun

Test-results

Here are the average time, current and charge values for each testrun:

BrickServer Mode   Sensor Count   AVG Time (ms)   AVG Current (mA)   AVG Charge (mAs)   AVG Charge (mAh)
sync               1              46.31           98.91              4.58               0.00127
sync               3              54.61           96.52              5.27               0.00146
sync               5              65.41           92.20              6.03               0.00168
sync               Total AVG      55.44           95.88              5.29               0.00147
async              1              41.43           100.94             4.18               0.00116
async              3              52.76           97.51              5.14               0.00143
async              5              55.99           97.95              5.48               0.00152
async              Total AVG      50.06           98.80              4.94               0.00137
Fig. 2: Sync – One Sensor – Sample #3
Fig. 3: Async – One Sensor – Sample #3

Conclusion

Putting everything together: the use of a background worker for InfluxDB results in a 7ms time saving in synthetic tests. This improvement is in fact measurable in the real setup, with an average saving of 5ms and 100nAh of charge for each wake of a Brick. This is by far not a huge impact, but it is an improvement in the end. Also, it was quite some fun to come along this journey 😉
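For reference, these per-wake savings can be re-derived from the Total AVG rows of the results table; a quick sanity check in Python:

    # Values taken from the "Total AVG" rows of the results table
    sync_time_ms, sync_charge_mah = 55.44, 0.00147
    async_time_ms, async_charge_mah = 50.06, 0.00137

    print(sync_time_ms - async_time_ms)                 # ~5.4 ms saved per wake
    print((sync_charge_mah - async_charge_mah) * 1e6)   # ~100 nAh saved per wake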

If anyone is interested in reviewing my sampled data, I’ve uploaded an archive with all raw data. The ppk files can be opened with the Power Profiler app, which is part of nRF Connect for Desktop by Nordic Semiconductor and can be found here: https://www.nordicsemi.com/Products/Development-tools/nRF-Connect-for-desktop