
7 posts tagged with "reliability"


Edge Gateway Hot-Reload and Watchdog Patterns for Industrial IoT [2026]

· 12 min read

Here's a scenario every IIoT engineer dreads: it's 2 AM on a Saturday, your edge gateway in a plastics manufacturing plant has lost its MQTT connection to the cloud, and nobody notices until Monday morning. Forty-eight hours of production data — temperatures, pressures, cycle counts, alarms — gone. The maintenance team wanted to correlate a quality defect with process data from Saturday afternoon. They can't.

This is a reliability problem, and it's solvable. The patterns that separate a production-grade edge gateway from a prototype are: configuration hot-reload (change settings without restarting), connection watchdogs (detect and recover from silent failures), and graceful resource management (handle reconnections without memory leaks).

This guide covers the architecture behind each of these patterns, with practical design decisions drawn from real industrial deployments.

Edge gateway hot-reload and firmware patterns

The Problem: Why Edge Gateways Fail Silently

Industrial edge gateways operate in hostile environments: temperature swings, electrical noise, intermittent network connectivity, and 24/7 uptime requirements. The failure modes are rarely dramatic — they're insidious:

  • MQTT connection drops silently. The broker stops responding, but the client library doesn't fire a disconnect callback because the TCP connection is still half-open.
  • Configuration drift. An engineer updates tag definitions on the management server, but the gateway is still running the old configuration.
  • Memory exhaustion. Each reconnection allocates new buffers without properly freeing the old ones. After enough reconnections, the gateway runs out of memory and crashes.
  • PLC link flapping. The PLC reboots or loses power briefly. The gateway keeps polling, getting errors, but never properly re-detects or reconnects.

Solving these requires three interlocking systems: hot-reload for configuration, watchdogs for connections, and disciplined resource management.

Pattern 1: Configuration File Hot-Reload

The simplest and most robust approach to configuration hot-reload is file-based with stat polling. The gateway periodically checks if its configuration file has been modified (using the file's modification timestamp), and if so, reloads and applies the new configuration.

Design: stat() Polling vs. inotify

You have two options for detecting file changes:

stat() polling — Check the file's st_mtime on every main loop iteration:

on_each_cycle():
    current_stat = stat(config_file)
    if current_stat.mtime != last_known_mtime:
        reload_configuration()
        last_known_mtime = current_stat.mtime

inotify (Linux) — Register for kernel-level file change notifications:

fd = inotify_add_watch(config_file, IN_MODIFY)
poll(fd) // blocks until file changes
reload_configuration()

For industrial edge gateways, stat() polling wins. Here's why:

  1. It's simpler. No file descriptor management, no edge cases with inotify watches being silently dropped.
  2. It works across filesystems. inotify doesn't work on NFS, CIFS, or some embedded filesystems. stat() works everywhere.
  3. The cost is negligible. A single stat() call takes ~1 microsecond. Even at 1 Hz, it's invisible.
  4. It naturally integrates with the main loop. Industrial gateways already run a polling loop for PLC reads. Adding a stat() check is one line.
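As a concrete sketch of the stat()-based check (in Python for brevity; `ConfigWatcher` and its method names are illustrative, not from any particular gateway):

```python
import os

class ConfigWatcher:
    """Detects config-file changes by polling st_mtime once per main-loop cycle."""

    def __init__(self, path):
        self.path = path
        self.last_mtime = None  # unknown until first check

    def check(self):
        """Return True when the file's mtime differs from the last known value."""
        try:
            mtime = os.stat(self.path).st_mtime
        except FileNotFoundError:
            return False  # a missing file is not treated as a change
        if self.last_mtime is None:
            self.last_mtime = mtime  # first sighting: record, don't reload
            return False
        if mtime != self.last_mtime:
            self.last_mtime = mtime
            return True
        return False
```

The main loop would call `check()` once per cycle and run the teardown-rebuild sequence whenever it returns True.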

Graceful Reload: The Teardown-Rebuild Cycle

When a configuration change is detected, the gateway must:

  1. Stop active PLC connections. For EtherNet/IP, destroy all tag handles. For Modbus, close the serial port or TCP connection.
  2. Free allocated memory. Tag definitions, batch buffers, connection contexts — all of it.
  3. Re-read and validate the new configuration.
  4. Re-detect the PLC and re-establish connections with the new tag map.
  5. Resume data collection with a forced initial read of all tags.

The critical detail is step 2. Industrial gateways often use a pool allocator instead of individual malloc/free calls. All configuration-related memory is allocated from a single large buffer. On reload, you simply reset the allocator's pointer to the beginning of the buffer:

// Pseudo-code: pool allocator reset
config_memory.write_pointer = config_memory.base_address
config_memory.used_bytes = 0
config_memory.free_bytes = config_memory.total_size

This eliminates the risk of memory leaks during reconfiguration. No matter how many times you reload, memory usage stays constant.
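A toy version of the pool-reset idea (a Python stand-in for what would be a C bump allocator over a static buffer; all names are illustrative):

```python
class PoolAllocator:
    """Bump allocator over one fixed buffer; reset() frees everything at once."""

    def __init__(self, total_size):
        self.buffer = bytearray(total_size)
        self.total_size = total_size
        self.used = 0

    def alloc(self, size):
        """Return an offset into the pool, or None if it doesn't fit."""
        if self.used + size > self.total_size:
            return None
        offset = self.used
        self.used += size
        return offset

    def reset(self):
        """Reload path: one pointer reset frees every config allocation."""
        self.used = 0
```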

Multi-File Configuration

Production gateways often have multiple configuration files:

  • Daemon config — Network settings, serial port parameters, batch sizes, timeouts
  • Device configs — Per-device-type tag maps (one JSON file per machine model)
  • Connection config — MQTT broker address, TLS certificates, authentication tokens

Each file should be watched independently. If only the daemon config changes (e.g., someone adjusts the batch timeout), you don't need to re-detect the PLC — just update the runtime parameter. If a device config changes (e.g., someone adds a new tag), you need to rebuild the tag chain.

A practical approach: when the daemon config changes, set a flag to force a status report on the next MQTT cycle. When a device config changes, trigger a full teardown-rebuild of that device's tag chain.
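One way to sketch per-file watchers with independent actions (Python; the paths and handler names are hypothetical):

```python
import os

def make_dispatcher(handlers):
    """handlers: {path: callback}; returns a poll() function for the main loop."""
    last = {path: None for path in handlers}

    def poll():
        fired = []
        for path, handler in handlers.items():
            try:
                mtime = os.stat(path).st_mtime
            except FileNotFoundError:
                continue
            if last[path] is None:
                last[path] = mtime       # first pass: record only
            elif mtime != last[path]:
                last[path] = mtime
                handler()                # file-specific reload action
                fired.append(path)
        return fired

    return poll
```

Each config file gets its own callback, so a daemon-config change can set a status flag while a device-config change triggers a full tag-chain rebuild.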

Pattern 2: Connection Watchdogs

The most dangerous failure mode in MQTT-based telemetry is the silent disconnect. The TCP connection appears alive (no RST received), but the broker has stopped processing messages. The client's publish calls succeed (they're just writing to a local socket buffer), but data never reaches the cloud.

The MQTT Delivery Confirmation Watchdog

The robust solution uses MQTT QoS 1 delivery confirmations as a heartbeat:

// Track the timestamp of the last confirmed delivery
last_delivery_timestamp = 0

on_publish_confirmed(packet_id):
    last_delivery_timestamp = now()

on_watchdog_check():  // runs every N seconds
    if last_delivery_timestamp == 0:
        return  // no data sent yet, nothing to check

    elapsed = now() - last_delivery_timestamp
    if elapsed > WATCHDOG_TIMEOUT:
        trigger_reconnect()

With MQTT QoS 1, the broker sends a PUBACK for every published message. If you haven't received a PUBACK in, say, 120 seconds, but you've been publishing data, something is wrong.

The key insight is that you're not watching the connection state — you're watching the delivery pipeline. A connection can appear healthy (no disconnect callback fired) while the delivery pipeline is stalled.
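A minimal, testable sketch of that delivery watchdog (Python; the timeout and stall callback are parameters, and the clock is injected so the logic can be exercised without waiting):

```python
import time

class DeliveryWatchdog:
    """Watches the delivery pipeline, not the connection state."""

    def __init__(self, timeout_s, on_stall, clock=time.monotonic):
        self.timeout_s = timeout_s
        self.on_stall = on_stall
        self.clock = clock
        self.last_delivery = None  # None until the first PUBACK

    def on_publish_confirmed(self, packet_id):
        self.last_delivery = self.clock()

    def check(self):
        """Call periodically from the main loop."""
        if self.last_delivery is None:
            return  # nothing confirmed yet, nothing to judge
        if self.clock() - self.last_delivery > self.timeout_s:
            self.last_delivery = None  # don't re-trigger every cycle
            self.on_stall()
```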

Reconnection Strategy: Async with Backoff

When the watchdog triggers, the reconnection must be:

  1. Asynchronous — Don't block the PLC polling loop. Data collection should continue even while MQTT is reconnecting. Collected data gets buffered locally.
  2. Non-destructive — The MQTT loop thread must be stopped before destroying the client. Stopping the loop with force=true ensures no callbacks fire during teardown.
  3. Complete — Disconnect, destroy the client, reinitialize the library, create a new client, set callbacks, start the loop, then connect. Half-measures (just calling reconnect) often leave stale state.

A dedicated reconnection thread works well:

reconnect_thread():
    while true:
        wait_for_signal()  // semaphore blocks until watchdog triggers

        log("Starting MQTT reconnection")
        stop_mqtt_loop(force=true)
        disconnect()
        destroy_client()
        cleanup_library()

        // Re-initialize from scratch
        init_library()
        create_client(device_id)
        set_credentials(username, password)
        set_tls(certificate_path)
        set_protocol(MQTT_3_1_1)
        set_callbacks(on_connect, on_disconnect, on_message, on_publish)
        start_loop()
        set_reconnect_delay(5, 5, no_exponential)
        connect_async(host, port, keepalive=60)

        signal_complete()  // release semaphore

Why a separate thread? The connect_async call can block for up to 60 seconds on DNS resolution or TCP handshake. If this runs on the main thread, PLC polling stops. Industrial processes don't wait for your network issues.
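The thread-plus-semaphore shape can be sketched like this (Python threading as a stand-in for the C implementation; `rebuild` represents the full teardown/rebuild sequence above):

```python
import threading

class ReconnectWorker:
    """Runs the MQTT teardown/rebuild off the main polling loop."""

    def __init__(self, rebuild):
        self.rebuild = rebuild            # callable: full teardown + rebuild
        self.trigger = threading.Semaphore(0)
        self.done = threading.Event()
        self.thread = threading.Thread(target=self._run, daemon=True)
        self.thread.start()

    def _run(self):
        while True:
            self.trigger.acquire()        # block until the watchdog fires
            self.rebuild()                # may take tens of seconds; that's fine here
            self.done.set()

    def request_reconnect(self):
        """Called by the watchdog; returns immediately, never blocks polling."""
        self.done.clear()
        self.trigger.release()
```

The main loop keeps polling the PLC while `rebuild()` blocks on DNS or the TCP handshake.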

PLC Connection Watchdog

MQTT isn't the only connection that needs watching. PLC connections — both EtherNet/IP and Modbus TCP — can also fail silently.

For Modbus TCP, the watchdog logic is simpler because each read returns an explicit error code:

on_modbus_read_error(error_code):
    if error_code in [ETIMEDOUT, ECONNRESET, ECONNREFUSED, EPIPE, EBADF]:
        close_modbus_connection()
        set_link_state(DOWN)
        // Will reconnect on next polling cycle

For EtherNet/IP via libraries like libplctag, a return code of -32 (connection failed) should trigger:

  1. Setting the link state to DOWN
  2. Destroying the tag handles
  3. Attempting re-detection on the next cycle

A critical detail: track consecutive errors, not individual ones. A single timeout might be a transient hiccup. Three consecutive timeouts (error_count >= 3) indicate a real problem. Break the polling cycle early to avoid hammering a dead connection.
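The consecutive-error rule is small enough to show whole (Python sketch; the threshold of 3 mirrors the text, and the class name is illustrative):

```python
class LinkMonitor:
    """Tracks consecutive PLC read errors; any success resets the streak."""

    def __init__(self, threshold=3):
        self.threshold = threshold
        self.error_count = 0
        self.link_up = True

    def on_read_ok(self):
        self.error_count = 0  # a single success clears the streak

    def on_read_error(self):
        """Return True when the caller should tear down and re-detect."""
        self.error_count += 1
        if self.error_count >= self.threshold and self.link_up:
            self.link_up = False
            return True
        return False
```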

The gateway should treat the connection state itself as a telemetry point. When the PLC link goes up or down, immediately publish a link state tag — a boolean value with do_not_batch: true:

link_state_changed(device, new_state):
    publish_immediately(
        tag_id=LINK_STATE_TAG,
        value=new_state,  // true=up, false=down
        timestamp=now()
    )

This gives operators cloud-side visibility into gateway connectivity. A dashboard can show "Device offline since 2:47 AM" instead of just "no data" — which is ambiguous (was the device off, or was the gateway offline?).

Pattern 3: Store-and-Forward Buffering

When MQTT is disconnected, you can't just drop data. A production gateway needs a paged ring buffer that accumulates data during disconnections and drains it when connectivity returns.

Paged Buffer Architecture

The buffer divides a fixed-size memory region into pages of equal size:

Total buffer: 2 MB
Page size: ~4 KB (derived from max batch size)
Pages: ~500

Page states:
FREE → Available for writing
WORK → Currently being written to
USED → Full, queued for delivery

The lifecycle:

  1. Writing: Data is appended to the WORK page. When it's full, WORK moves to the USED queue, and a FREE page becomes the new WORK page.
  2. Sending: When MQTT is connected, the first USED page is sent. On PUBACK confirmation, the page moves to FREE.
  3. Overflow: If all pages are USED (buffer full, MQTT down for too long), the oldest USED page is recycled as the new WORK page. This loses the oldest data to preserve the newest — the right tradeoff for most industrial applications.

Thread safety is critical. The PLC polling thread writes to the buffer, the MQTT thread reads from it, and the PUBACK callback advances the read pointer. A mutex protects all buffer operations:

buffer_add_data(data, size):
    lock(mutex)
    append_to_work_page(data, size)
    if work_page_full():
        move_work_to_used()
    try_send_next()
    unlock(mutex)

on_puback(packet_id):
    lock(mutex)
    advance_read_pointer()
    if page_fully_delivered():
        move_page_to_free()
    try_send_next()
    unlock(mutex)

on_disconnect():
    lock(mutex)
    connected = false
    packet_in_flight = false  // reset delivery state
    unlock(mutex)
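The FREE/WORK/USED lifecycle, including overflow recycling, can be sketched at the page level (Python; real pages hold packed bytes, but this version tracks whole records to keep the state machine visible):

```python
from collections import deque

class PagedBuffer:
    """Fixed page pool: one WORK page, a FIFO of USED pages, a FREE list."""

    def __init__(self, num_pages, page_capacity):
        self.free = deque(range(num_pages))    # page indices
        self.used = deque()
        self.pages = [[] for _ in range(num_pages)]
        self.page_capacity = page_capacity
        self.work = self.free.popleft()

    def add(self, record):
        page = self.pages[self.work]
        if len(page) >= self.page_capacity:
            self.used.append(self.work)        # WORK page is full: queue it
            if self.free:
                self.work = self.free.popleft()
            else:                              # overflow: recycle oldest USED
                self.work = self.used.popleft()
                self.pages[self.work].clear()  # oldest data is dropped
            page = self.pages[self.work]
        page.append(record)

    def on_delivered(self):
        """PUBACK for the head USED page: recycle it to FREE."""
        page = self.used.popleft()
        self.pages[page].clear()
        self.free.append(page)
```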

Sizing the Buffer

Buffer sizing depends on your data rate and your maximum acceptable offline duration:

buffer_size = data_rate_bytes_per_second × max_offline_seconds

For a typical deployment:

  • 50 tags × 4 bytes average × 1 read/second = 200 bytes/second
  • With binary encoding overhead: ~300 bytes/second
  • Maximum offline duration: 2 hours (7,200 seconds)
  • Buffer needed: 300 × 7,200 = ~2.1 MB

A 2 MB buffer with 4 KB pages gives you ~500 pages. Note that 2 MB is slightly under the ~2.1 MB target, so either round the buffer up or accept a few minutes short of the full 2 hours of offline coverage.

The Minimum Three-Page Rule

The buffer needs at minimum 3 pages to function:

  1. One WORK page (currently being written to)
  2. One USED page (queued for delivery)
  3. One page in transition (being delivered, not yet confirmed)

If you can't fit 3 pages in the buffer, the page size is too large relative to the buffer. Validate this at initialization time and reject invalid configurations rather than failing at runtime.
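That validation is cheap to do at startup (Python sketch; the error-message wording is illustrative):

```python
def validate_buffer_config(total_size, page_size):
    """Reject configurations that cannot hold WORK + USED + in-flight pages."""
    if page_size <= 0 or total_size <= 0:
        raise ValueError("sizes must be positive")
    num_pages = total_size // page_size
    if num_pages < 3:
        raise ValueError(
            f"buffer of {total_size} B yields {num_pages} pages of "
            f"{page_size} B; at least 3 are required")
    return num_pages
```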

Pattern 4: Periodic Forced Reads

Even with change-detection enabled (the compare flag), a production gateway should periodically force-read all tags and transmit their values regardless of whether they changed. This serves several purposes:

  1. Proof of life. Downstream systems can distinguish "the value hasn't changed" from "the gateway is dead."
  2. State synchronization. If the cloud-side database lost data (a rare but real scenario), periodic full-state updates resynchronize it.
  3. Clock drift correction. Over time, individual tag timers can drift. A periodic full reset realigns all tags.

A practical approach: reset all tags on the hour boundary. Check the system clock, and when the hour rolls over, clear all "previously read" flags. Every tag will be read and transmitted on its next polling cycle, regardless of change detection:

on_each_read_cycle():
    current_hour = localtime(now()).hour
    previous_hour = localtime(last_read_time).hour

    if current_hour != previous_hour:
        reset_all_tags()  // clear read-once flags
        log("Hourly forced read: all tags will be re-read")

This adds at most one extra transmission per tag per hour — a negligible bandwidth cost for significant reliability improvement.
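The hour-rollover check can be isolated and tested with an injected clock (Python sketch; the real gateway would use localtime, while this uses UTC hours for simplicity):

```python
import time

def hour_of(ts):
    return int(ts // 3600) % 24   # UTC hour; real code would use localtime

class HourlyReset:
    """Signals exactly once per hour rollover."""

    def __init__(self, clock=time.time):
        self.clock = clock
        self.last_hour = hour_of(self.clock())

    def due(self):
        hour = hour_of(self.clock())
        if hour != self.last_hour:
            self.last_hour = hour
            return True
        return False
```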

Pattern 5: SAS Token and Certificate Expiry Monitoring

If your MQTT connection uses time-limited credentials (like Azure IoT Hub SAS tokens or short-lived TLS certificates), the gateway must monitor expiry and refresh proactively.

For SAS tokens, extract the se (expiry) parameter from the connection string and compare it against the current system time:

on_config_load(sas_token):
    expiry_timestamp = extract_se_parameter(sas_token)

    if current_time > expiry_timestamp:
        log_warning("Token has expired!")
        // Still attempt connection — the broker will reject it,
        // but the error path will trigger a config reload
    else:
        time_remaining = expiry_timestamp - current_time
        log("Token valid for %d hours", time_remaining / 3600)

Don't silently fail. If the token is expired, log a prominent warning. The gateway should still attempt to connect (the broker rejection will be informative), but operations teams need visibility into credential lifecycle.
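Extracting the `se` parameter is mostly query-string parsing; here is a hedged Python sketch (the token string in the test is fabricated for illustration, and real tokens carry more fields):

```python
from urllib.parse import parse_qs

def sas_expiry(token):
    """Return the `se` (expiry) Unix timestamp from a SAS token, or None."""
    # Drop a leading "SharedAccessSignature " scheme word if present
    query = token.split(None, 1)[-1]
    params = parse_qs(query)
    values = params.get("se")
    return int(values[0]) if values else None

def seconds_remaining(token, now):
    """Negative result means the token has already expired."""
    expiry = sas_expiry(token)
    if expiry is None:
        return None
    return expiry - now
```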

For TLS certificates, monitor both the certificate file's modification time (has a new cert been deployed?) and the certificate's validity period (is it about to expire?).

How machineCDN Implements These Patterns

machineCDN's edge gateway — deployed on OpenWRT-based industrial routers in plastics manufacturing plants — implements all five patterns:

  • Configuration hot-reload using stat() polling on the main loop, with pool-allocated memory for zero-leak teardown/rebuild cycles
  • Dual watchdogs for MQTT delivery confirmation (120-second timeout) and PLC link state (3 consecutive errors trigger reconnection)
  • Paged ring buffer with 2 MB capacity, supporting both JSON and binary encoding, with automatic overflow handling that preserves newest data
  • Hourly forced reads that ensure complete state synchronization regardless of change detection
  • SAS token monitoring with proactive expiry warnings

These patterns enable 99.9%+ data capture rates even in plants with intermittent cellular connectivity — because the gateway collects data continuously and back-fills when connectivity returns.

Implementation Checklist

If you're building or evaluating an edge gateway for industrial IoT, verify that it supports:

  • Config hot-reload without restart — zero-downtime updates, no data gaps during reconfiguration
  • Pool-based memory allocation — no memory leaks across reload cycles
  • MQTT delivery watchdog — detects silent connection failures
  • Async reconnection thread — PLC polling continues during MQTT recovery
  • Paged store-and-forward buffer — preserves data during network outages
  • Consecutive error thresholds — avoids false-positive disconnections
  • Link state telemetry — distinguishes "offline gateway" from "idle machine"
  • Periodic forced reads — state synchronization and proof of life
  • Credential expiry monitoring — proactive certificate/token management

Conclusion

Reliability in industrial IoT isn't about preventing failures — it's about recovering from them automatically. Networks will drop. PLCs will reboot. Certificates will expire. The question is whether your edge gateway handles these events gracefully or silently loses data.

The patterns in this guide — hot-reload, watchdogs, store-and-forward, forced reads, and credential monitoring — are the difference between a gateway that works in the lab and one that works at 3 AM on a holiday weekend in a plant with spotty cellular coverage.

Build for the 3 AM scenario. Your operations team will thank you.

Paged Ring Buffers for Industrial MQTT: How to Never Lose a Data Point [2026]

· 10 min read

Here's the scenario every IIoT engineer dreads: your edge gateway is collecting temperature, pressure, and vibration data from 200 tags across 15 PLCs. The cellular modem on the factory roof drops its connection — maybe for 30 seconds during a handover, maybe for 4 hours because a backhoe hit a fiber line. When connectivity returns, what happens to the data?

If your answer is "it's gone," you have a buffer management problem. And fixing it properly requires understanding paged ring buffers — the unsung hero of reliable industrial telemetry.

Why Naive Buffering Fails

The simplest approach — queue MQTT messages in memory and retry on reconnect — has three fatal flaws:

  1. Memory exhaustion: A gateway reading 200 tags at 1-second intervals generates ~12,000 readings per minute. At ~100 bytes per JSON reading, that's 1.2 MB/minute. A 4-hour outage accumulates ~288 MB. Your 256 MB embedded gateway just died.

  2. No delivery confirmation: MQTT QoS 1 guarantees "at least once" delivery, but the Mosquitto client library's in-flight message queue is finite. If you publish 50,000 messages into a disconnected client, most will be silently dropped by the client library's internal buffer long before the broker sees them.

  3. Thundering herd on reconnect: When connectivity returns, dumping 288 MB of queued messages simultaneously will choke the cellular uplink (typically 1–5 Mbps), cause broker-side backpressure, and likely trigger another disconnect.

The Paged Ring Buffer Architecture

The solution is a fixed-size, page-based circular buffer that sits between the data collection layer and the MQTT client. Here's how it works:

Memory Layout

The buffer is allocated as a single contiguous block — typically 2 MB on an embedded gateway. This block is divided into equal-sized pages, where each page can hold one complete MQTT payload.

┌─────────────────────────────────────────────────┐
│               2 MB Buffer Memory                │
├────────┬────────┬────────┬────────┬────────┬────┤
│ Page 0 │ Page 1 │ Page 2 │ Page 3 │ Page 4 │ ...│
│  4 KB  │  4 KB  │  4 KB  │  4 KB  │  4 KB  │    │
└────────┴────────┴────────┴────────┴────────┴────┘

With a 4 KB page size and 2 MB total buffer, you get approximately 500 pages. Each page holds multiple MQTT messages packed sequentially.

Page States

Every page exists in exactly one of three states:

  • Free: Available for new data. Part of a singly-linked free list.
  • Work: Currently being filled with incoming data. Only one work page exists at a time.
  • Used: Full of data, waiting to be transmitted. Part of a singly-linked FIFO queue.

Free Pages → [P5] → [P6] → [P7] → null
Work Page  → [P3] (currently filling)
Used Pages → [P0] → [P1] → [P2] → null
              ↑ sending    ↑ waiting

The Write Path

When a batch of PLC tag values arrives from the data collection layer:

  1. Check the work page: If there's no current work page, pop one from the free list. If the free list is empty, steal the oldest used page (overflow — we're losing old data to make room for new data, which is the correct trade-off for operational monitoring).

  2. Calculate fit: Each message is packed as: [4-byte message ID] [4-byte message size] [message payload]. Check if the current work page has enough remaining space for this overhead plus the payload.

  3. If it fits: Write the message ID (initially zero — will be filled by the MQTT client), the size, and the payload. Advance the write pointer.

  4. If it doesn't fit: Move the current work page to the tail of the used queue. Pop a new page from the free list (or steal from used queue). Write into the new page.

Page Internal Layout:
┌───────────┬───────────┬───────────┬───────────┬───────────┬───────────┐
│ msg_id_1  │ msg_sz_1  │ payload_1 │ msg_id_2  │ msg_sz_2  │ payload_2 │
│ (4 bytes) │ (4 bytes) │ (N bytes) │ (4 bytes) │ (4 bytes) │ (M bytes) │
└───────────┴───────────┴───────────┴───────────┴───────────┴───────────┘
                                                             ↑ write_p (current position)
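The [id][size][payload] framing above maps to a few lines of byte manipulation (Python sketch assuming 4-byte big-endian header fields, as in the diagram):

```python
import struct

HEADER = struct.Struct(">II")  # msg_id, msg_size

class Page:
    """One buffer page holding sequentially packed messages."""

    def __init__(self, size):
        self.buf = bytearray(size)
        self.write_p = 0

    def fits(self, payload):
        return self.write_p + HEADER.size + len(payload) <= len(self.buf)

    def append(self, payload, msg_id=0):
        """msg_id starts at 0; the MQTT client fills it in at publish time."""
        if not self.fits(payload):
            return False
        HEADER.pack_into(self.buf, self.write_p, msg_id, len(payload))
        self.write_p += HEADER.size
        self.buf[self.write_p:self.write_p + len(payload)] = payload
        self.write_p += len(payload)
        return True

    def read_at(self, offset):
        """Return (msg_id, payload, next_offset) for the message at offset."""
        msg_id, size = HEADER.unpack_from(self.buf, offset)
        start = offset + HEADER.size
        return msg_id, bytes(self.buf[start:start + size]), start + size
```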

The Send Path

The MQTT send logic runs after every write operation and follows strict rules:

  1. Check prerequisites: Connection must be up (connected == 1) AND no packet currently in-flight (packet_sent == 0). If either fails, do nothing — the data is safely buffered.

  2. Select the send source: If there are used pages, send from the first one in the FIFO. If no used pages exist but the work page has data, promote the work page to used and send from it.

  3. Read the next message from the current page's read pointer: extract the size, get the data pointer, and call mosquitto_publish() with QoS 1.

  4. Mark packet as in-flight: Set packet_sent = 1. This is critical — only one message can be in-flight at a time. This prevents the thundering herd problem and ensures ordered delivery.

  5. Wait for acknowledgment: The MQTT client library calls the publish callback when the broker confirms receipt (PUBACK for QoS 1). Only then do we advance the read pointer and send the next message.

The Acknowledgment Path

When the Mosquitto library fires the on_publish callback with a packet ID:

  1. Verify the ID matches the in-flight message on the current used page
  2. Advance the read pointer past the delivered message (skip message ID + size + payload bytes)
  3. Check if page is fully delivered: If read_p >= write_p, move the page back to the free list
  4. Clear the in-flight flag: Set packet_sent = 0
  5. Immediately attempt to send the next message — this creates a natural flow control where messages are delivered as fast as the broker can acknowledge them

Delivery Flow:

                  publish()
    [Used Page] ─────────────────→ [MQTT Broker]
         ↑                              │
         │            PUBACK            │
         └──────────────────────────────┘
           advance read_p, try next

Thread Safety: The Mutex Dance

In a real gateway, data collection and MQTT delivery run on different threads. The PLC polling loop writes data every second, while the Mosquitto client library fires callbacks from its own network thread. Every buffer operation — add, send, acknowledge, connect, disconnect — must be wrapped in a mutex:

// Data collection thread:
mutex_lock(buffer)
add_data(payload)
try_send_next() // opportunistic send
mutex_unlock(buffer)

// MQTT callback thread:
mutex_lock(buffer)
mark_delivered(packet_id)
try_send_next() // chain next send
mutex_unlock(buffer)

The key insight is that try_send_next() is called from both threads — after every write (in case we're connected and idle) and after every acknowledgment (to chain the next message). This ensures maximum throughput without busy-waiting.
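A stripped-down sketch of that dance, with `publish` stubbed out so the single-in-flight rule stays visible (Python; a real gateway would call mosquitto_publish here):

```python
import threading

class SafeSender:
    """Both the producer and the PUBACK callback drive sends under one lock."""

    def __init__(self, publish):
        self.lock = threading.Lock()
        self.publish = publish        # stand-in for the MQTT publish call
        self.queue = []
        self.in_flight = False
        self.connected = True

    def add_data(self, payload):      # data-collection thread
        with self.lock:
            self.queue.append(payload)
            self._try_send_next()     # opportunistic send

    def on_puback(self, packet_id):   # MQTT callback thread
        with self.lock:
            self.in_flight = False
            self._try_send_next()     # chain the next send

    def _try_send_next(self):         # caller must hold self.lock
        if self.connected and not self.in_flight and self.queue:
            self.in_flight = True
            self.publish(self.queue.pop(0))
```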

Handling Disconnects Gracefully

When the MQTT connection drops, two things happen:

  1. The disconnect callback fires: Set connected = 0 and packet_sent = 0. The in-flight message is NOT lost — it's still in the page at the current read pointer. When connectivity returns, it will be re-sent.

  2. Data keeps flowing in: The PLC polling loop doesn't stop. New data continues to fill pages. The used queue grows. If it fills all available pages, new pages will steal from the oldest used pages — but this only happens under extreme sustained outages.

When the connection re-establishes:

  1. The connect callback fires: Set connected = 1 and trigger try_send_next()
  2. Buffered data starts flowing: Messages are delivered in FIFO order, one at a time, with acknowledgment pacing

This means the broker receives data in chronological order, with timestamps embedded in each batch. Analytics systems downstream can seamlessly handle the gap — they see a burst of historical data followed by real-time data, all correctly timestamped.

The Cloud Watchdog: Detecting Silent Failures

There's a subtle failure mode: the MQTT connection appears healthy (no disconnect callback), but data isn't actually being delivered. This can happen with certain TLS middlebox issues, stale TCP connections that haven't timed out, or Azure IoT Hub token expirations.

The solution is a delivery watchdog:

  1. Track the timestamp of the last successful packet delivery
  2. On a periodic check (every 120 seconds), compare the current time against the last delivery timestamp
  3. If no data has been delivered in 120 seconds AND the connection claims to be up, force a reconnection:
    • Reset the MQTT configuration timestamp (triggers config reload)
    • Clear the watchdog timer
    • The main loop will detect the stale configuration and restart the MQTT client

if (now - last_delivery_time > 120s) AND (connected) {
    log("No data delivered in 120s — forcing MQTT reconnect")
    force_mqtt_restart()
}

This catches the "zombie connection" problem that plagues many IIoT deployments — the gateway thinks it's sending, but nothing is actually arriving at the cloud.

Binary vs. JSON: The Bandwidth Trade-off

The paged buffer doesn't care about the payload format — it stores raw bytes. But the choice between JSON and binary encoding has massive implications for buffer utilization:

JSON payload for one tag reading:

{"id":42,"values":[23.7],"ts":1709337600}

~45 bytes per reading.

Binary payload for the same reading:

Tag ID:     2 bytes (uint16)
Status:     1 byte
Value Cnt:  1 byte
Value Sz:   1 byte
Value:      4 bytes (float32)
────────────────────────────
Total:      9 bytes per reading

That's a 5x reduction. With batching (multiple readings per batch header), the per-reading overhead drops further because the timestamp and device identity are shared across a group of values.

On a cellular connection billing per megabyte, this isn't academic — it's the difference between $15/month and $75/month per gateway. On satellite connections (Iridium, Starlink maritime), it can be $50 vs. $250.

Binary Batch Wire Format

A binary batch on the wire follows this structure:

[0xF7]                — 1 byte, magic/version marker
[num_groups]          — 4 bytes, big-endian uint32
For each group:
    [timestamp]       — 4 bytes, big-endian time_t
    [device_type]     — 2 bytes, big-endian uint16
    [serial_number]   — 4 bytes, big-endian uint32
    [num_values]      — 4 bytes, big-endian uint32
    For each value:
        [tag_id]      — 2 bytes, big-endian uint16
        [status]      — 1 byte (0 = OK, else error code)
        If status == 0:
            [values_count] — 1 byte
            [value_size]   — 1 byte (1, 2, or 4)
            [values...]    — values_count × value_size bytes

A batch of 50 tag readings fits in ~600 bytes binary versus ~3,000 bytes JSON. Over a 4-hour outage with 200 tags at 60-second intervals (48,000 readings), that's the difference between buffering roughly 450 KB (binary) and roughly 2.2 MB (JSON): comfortably inside a 2 MB buffer versus overflowing it.
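An encoder for the wire format above might look like this (Python sketch; values are passed in as already-encoded bytes so the framing logic stays in view, and all field widths come from the layout):

```python
import struct

def encode_batch(groups):
    """groups: list of (timestamp, device_type, serial,
                        [(tag_id, status, raw_values, value_size)])."""
    out = bytearray([0xF7])                      # magic/version marker
    out += struct.pack(">I", len(groups))        # num_groups
    for ts, dev_type, serial, values in groups:
        # Group header: timestamp, device_type, serial_number, num_values
        out += struct.pack(">IHII", ts, dev_type, serial, len(values))
        for tag_id, status, raw, value_size in values:
            out += struct.pack(">HB", tag_id, status)
            if status == 0:                      # payload only when OK
                out += struct.pack(">BB", len(raw) // value_size, value_size)
                out += raw
    return bytes(out)
```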

Sizing Your Buffer: The Math

For a given deployment, calculate your buffer needs:

Tags: 200
Read interval: 60 seconds
Binary payload per reading: ~9 bytes
Readings per minute: 200
Bytes per minute: 200 × 9 = 1,800 bytes
With batch overhead (~15 bytes per group): ~1,815 bytes/min

Buffer size: 2 MB = 2,097,152 bytes
Retention: 2,097,152 / 1,815 = ~1,155 minutes = ~19.2 hours

So a 2 MB buffer can hold approximately 19 hours of data for 200 tags at 60-second intervals using binary encoding. With JSON, that drops to ~3.8 hours. Size your buffer accordingly.

What machineCDN Does Differently

machineCDN's edge gateway implements this paged ring buffer architecture natively. Every gateway shipped includes:

  • Fixed 2 MB paged buffer with configurable page sizes matching the MQTT broker's maximum packet size
  • Automatic binary encoding for all telemetry — 5x bandwidth reduction over JSON
  • Single-message flow control with QoS 1 acknowledgment tracking — no thundering herd on reconnect
  • 120-second delivery watchdog that detects zombie connections and forces reconnect
  • Graceful overflow handling — when buffer fills, oldest data is recycled (not newest), preserving the most recent operational state

For plant engineers, this means deploying a gateway on a cellular connection and knowing that a connectivity outage — whether 30 seconds or 12 hours — won't result in lost data. The buffer holds, the watchdog monitors, and data flows in order when the link comes back.

Key Takeaways

  1. Never use unbounded queues for industrial telemetry buffering — use fixed-size paged buffers that degrade gracefully under memory pressure
  2. One message in-flight at a time prevents the thundering herd problem and ensures ordered delivery
  3. Always track delivery acknowledgments — don't just publish and forget; verify the broker received each packet before advancing
  4. Implement a delivery watchdog — silent MQTT failures are harder to detect than disconnects
  5. Use binary encoding — 5x bandwidth reduction means 5x longer buffer retention on the same memory
  6. Size for your worst outage — calculate how much buffer you need based on tag count, interval, and the longest connectivity gap you expect
  7. Thread safety is non-negotiable — data collection and MQTT delivery run concurrently; every buffer operation needs mutex protection

The paged ring buffer isn't exotic computer science — it's a practical engineering pattern that's been battle-tested in thousands of industrial deployments. The difference between a prototype IIoT system and a production one often comes down to exactly this kind of infrastructure.

Thread-Safe Telemetry Pipelines: Building Concurrent IIoT Edge Gateways That Don't Lose Data [2026]

· 17 min read

An edge gateway on a factory floor isn't a REST API handling one request at a time. It's a real-time system juggling multiple competing demands simultaneously: polling a PLC for tag values every second, buffering data locally when the cloud connection drops, transmitting batched telemetry over MQTT, processing incoming configuration commands from the cloud, and monitoring its own health — all at once, on hardware with the computing power of a ten-year-old smartphone.

Get the concurrency wrong, and you don't get a 500 error in your logs. You get silent data loss, corrupted telemetry batches, or — worst case — a watchdog reboot loop that takes your monitoring offline during a critical production run.

This guide covers the architecture patterns that make industrial edge gateways reliable under real-world conditions: concurrent PLC polling, thread-safe buffering, MQTT delivery guarantees, and the store-and-forward patterns that keep data flowing when the network doesn't.

Thread-safe edge gateway architecture with concurrent data pipelines

The Concurrency Challenge in Industrial Edge Gateways

A typical edge gateway has at least three threads running concurrently:

  1. The polling thread — reads tags from PLCs at configured intervals (1-second to 60-second cycles)
  2. The MQTT network thread — manages the broker connection, handles publish/subscribe, reconnection
  3. The main control thread — processes incoming commands, monitors watchdog timers, manages configuration

These threads all share one critical resource: the outgoing data buffer. The polling thread writes telemetry into the buffer. The MQTT thread reads from the buffer and transmits data. When the connection drops, the buffer must hold data without the polling thread stalling. When the connection recovers, the buffer must drain in order without losing or duplicating messages.

This is a classic producer-consumer problem, but with industrial constraints that make textbook solutions insufficient.

Why Standard Queues Fall Short

Your first instinct might be to use a thread-safe queue — a ConcurrentLinkedQueue in Java, a queue.Queue in Python, or a lock-free ring buffer. These work fine for web applications, but industrial edge gateways have constraints that break standard queue implementations:

1. Memory Is Fixed and Finite

Edge gateways run on embedded hardware with 64 MB to 512 MB of RAM — no swap space, no dynamic allocation after startup. An unbounded queue will eventually exhaust memory during a long network outage. A fixed-size queue forces you to choose: block the producer (stalling PLC polling) or drop the oldest data.

2. Network Outages Last Hours, Not Seconds

In a factory, network outages aren't transient blips. A fiber cut, a misconfigured switch, or a power surge on the network infrastructure can take connectivity down for hours. Your buffer needs to hold potentially thousands of telemetry batches — not just a few dozen.

3. Delivery Confirmation Is Asynchronous

MQTT QoS 1 guarantees at-least-once delivery, but the PUBACK confirmation comes back asynchronously — possibly hundreds of milliseconds after the PUBLISH. During that window, you can't release the buffer space (the message might need retransmission), and you can't stall the producer (PLC data keeps flowing).

4. Data Must Survive Process Restarts

If the edge gateway daemon restarts (due to a configuration update, a watchdog trigger, or a power cycle), buffered-but-undelivered data must be recoverable. Purely in-memory queues lose everything.

The Paged Ring Buffer Pattern

The pattern that works in production is a paged ring buffer — a fixed-size memory region divided into pages, with explicit state tracking for each page. Here's how it works:

Memory Layout

At startup, the gateway allocates a single contiguous memory block and divides it into equal-sized pages:

┌─────────┬─────────┬─────────┬─────────┬─────────┐
│ Page 0  │ Page 1  │ Page 2  │ Page 3  │ Page 4  │
│  FREE   │  FREE   │  FREE   │  FREE   │  FREE   │
└─────────┴─────────┴─────────┴─────────┴─────────┘

Each page has its own header tracking:

  • A page number (for logging and debugging)
  • A start_p pointer (beginning of writable space)
  • A write_p pointer (current write position)
  • A read_p pointer (current read position for transmission)
  • A next pointer (linking to the next page in whatever list it's in)

Three Page Lists

Pages move between three linked lists:

  1. Free pages — available for the producer to write into
  2. Used pages — full of data, queued for transmission
  3. Work page — the single page currently being written to

Producer (Polling Thread)                Consumer (MQTT Thread)
        │
        ▼
  ┌──────────┐                            ┌──────────┐
  │Work Page │───── When full ──────────► │Used Pages│──► MQTT Publish
  │(writing) │                            │(queued)  │
  └──────────┘                            └──────────┘
        ▲                                       │
        │            When delivered             │
  ┌──────────┐◄─────────────────────────────────┘
  │Free Pages│
  │(empty)   │
  └──────────┘

The Producer Path

When the polling thread has a new batch of tag values to store:

  1. Check the work page — if there's no current work page, grab one from the free list
  2. Calculate space — check if the new data fits in the remaining space on the work page
  3. If it fits — write the data (with a size header) and advance write_p
  4. If it doesn't fit — move the work page to the used list, grab a new page (from free, or steal the oldest from used if free is empty), and write there
  5. After writing — check if there's data ready to transmit and kick the consumer

The critical detail: if the free list is empty, the producer steals the oldest used page. This means during extended outages, the buffer wraps around and overwrites the oldest data — exactly the behavior you want. Recent data is more valuable than stale data in industrial monitoring.

The Consumer Path

When the MQTT connection is active and there's data to send:

  1. Check the used page list — if empty, check if the work page has unsent data and promote it
  2. Read the next message from the first used page's read_p position
  3. Publish via MQTT with QoS 1
  4. Set a "packet sent" flag — this prevents sending the next message until the current one is acknowledged
  5. Wait for PUBACK — when the broker confirms receipt, advance read_p
  6. If read_p reaches write_p — the page is fully delivered; move it back to the free list
  7. Repeat — grab the next message from the next used page
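To make the producer and consumer paths concrete, here is a minimal Python sketch of the pattern. The production code would run on the gateway itself (typically in C); the class name `PagedRingBuffer`, the tiny page sizes, and the caller-supplied frame size are illustrative simplifications — in the real design the frame size comes from each message's size header.

```python
import threading
from collections import deque

class PagedRingBuffer:
    """Minimal paged ring buffer: free/used/work page lists, steal-oldest on overflow."""

    def __init__(self, page_count=4, page_size=64):
        self.page_size = page_size
        self.free = deque(bytearray() for _ in range(page_count))
        self.used = deque()           # full pages queued for transmission
        self.work = None              # page currently being written
        self.read_pos = 0             # read offset into the front used page
        self.packet_sent = False      # one QoS 1 message in flight at a time
        self.lock = threading.Lock()

    def write(self, message: bytes):
        """Producer path: append a message, stealing the oldest page if needed."""
        with self.lock:
            if self.work is None:
                self.work = self._grab_page()
            if len(self.work) + len(message) > self.page_size:
                self.used.append(self.work)      # promote the full work page
                self.work = self._grab_page()
            self.work += message

    def _grab_page(self):
        if self.free:
            return self.free.popleft()
        # Free list empty: steal the oldest used page (its data is dropped).
        page = self.used.popleft()
        page.clear()
        self.read_pos = 0             # the stolen page was the front read target
        return page

    def next_to_send(self, frame_size):
        """Consumer path: peek the next frame; caller publishes it OUTSIDE the lock."""
        with self.lock:
            if self.packet_sent:
                return None                       # wait for the pending PUBACK
            if not self.used and self.work:
                self.used.append(self.work)       # promote a partially full page
                self.work = None
            if not self.used:
                return None
            frame = bytes(self.used[0][self.read_pos:self.read_pos + frame_size])
            self.packet_sent = True
            return frame

    def on_puback(self, frame_size):
        """Broker acknowledged: advance read pointer, recycle fully delivered pages."""
        with self.lock:
            self.packet_sent = False
            self.read_pos += frame_size
            if self.read_pos >= len(self.used[0]):
                page = self.used.popleft()
                page.clear()
                self.free.append(page)
                self.read_pos = 0
```

Note how the mutex is held only for pointer bookkeeping: `next_to_send` hands the frame back to the caller, and the actual MQTT publish happens with the lock released.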

The Mutex Strategy

The entire buffer is protected by a single mutex. This might seem like a bottleneck, but in practice:

  • Write operations (adding data) take microseconds
  • Read operations (preparing to transmit) take microseconds
  • The actual MQTT transmission happens outside the mutex — only the buffer state management is locked

The mutex is held for a few microseconds at a time, never during network I/O. This keeps the polling thread from ever blocking on network latency.

Polling Thread:                MQTT Thread:
  lock(mutex)                    lock(mutex)
  write data to page             read data from page
  check if page full             mark as sent
  maybe promote page             unlock(mutex)
  trigger send check             ─── MQTT publish ───
  unlock(mutex)                  (outside mutex!)
                                 lock(mutex)
                                 process PUBACK
                                 maybe free page
                                 unlock(mutex)

Message Framing Inside Pages

Each page holds multiple messages packed sequentially. Each message has a simple header:

┌──────────────┬──────────────┬─────────────────────┐
│ Message ID │ Message Size │ Message Body │
│ (4 bytes) │ (4 bytes) │ (variable) │
└──────────────┴──────────────┴─────────────────────┘

The Message ID field is initially zero. When the MQTT library publishes the message, it fills in the packet ID assigned by the broker. This is how the consumer tracks which specific message was acknowledged — when the PUBACK callback fires with a packet ID, it can match it to the message at read_p and advance.

This framing makes the buffer self-describing. During recovery after a restart, the gateway can scan page contents by reading size headers sequentially.
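The framing and the restart-recovery scan can be sketched in a few lines of Python (a sketch, assuming the big-endian 4-byte fields shown above; `frame_message` and `scan_page` are illustrative names):

```python
import struct

HEADER = struct.Struct(">II")   # message ID (4 bytes), message size (4 bytes)

def frame_message(body: bytes, message_id: int = 0) -> bytes:
    """Prepend the header; the ID starts at 0 and is patched in at publish time."""
    return HEADER.pack(message_id, len(body)) + body

def patch_message_id(page: bytearray, offset: int, packet_id: int):
    """Fill in the broker-assigned packet ID once the PUBLISH goes out."""
    struct.pack_into(">I", page, offset, packet_id)

def scan_page(page: bytes):
    """Recovery after restart: walk the size headers to enumerate framed messages."""
    offset, messages = 0, []
    while offset + HEADER.size <= len(page):
        msg_id, size = HEADER.unpack_from(page, offset)
        body = page[offset + HEADER.size : offset + HEADER.size + size]
        if len(body) < size:
            break                        # truncated tail, stop scanning
        messages.append((msg_id, body))
        offset += HEADER.size + size
    return messages
```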

Handling Disconnections Gracefully

When the MQTT connection drops, the consumer thread must handle it without corrupting the buffer:

Connection Lost:
1. Set connected = 0
2. Clear "packet sent" flag
3. Do NOT touch any page pointers

That's it. The producer keeps writing — it doesn't know or care about the connection state. The buffer absorbs data normally.

When the connection recovers:

Connection Restored:
1. Set connected = 1
2. Trigger send check (under mutex)
3. Consumer picks up where it left off

The key insight: the "packet sent" flag prevents double-sending. If a PUBLISH was in flight when the connection dropped, its PUBACK never arrived and the flag is still set; the disconnection handler clears it. When the connection recovers, the consumer re-reads the same message from read_p (which was never advanced) and re-publishes it. The broker either receives a duplicate (acceptable under QoS 1's at-least-once semantics) or receives it for the first time.

Binary vs. JSON Batch Encoding

The telemetry data written into the buffer can be encoded in two formats, and the choice affects both bandwidth and reliability.

JSON Format

Each batch is a JSON object containing groups of timestamped values:

{
  "groups": [
    {
      "ts": 1709424000,
      "device_type": 1017,
      "serial_number": 123456,
      "values": [
        {"id": 80, "values": [725]},
        {"id": 81, "values": [680]},
        {"id": 82, "values": [285]}
      ]
    }
  ]
}

Pros: Human-readable, easy to debug, parseable by any language. Cons: 5-8× larger than binary, float precision loss (decimal representation), size estimation is rough.

Binary Format

A compact binary encoding with a header byte (0xF7), followed by big-endian packed groups:

F7                       ← Header
00 00 00 01              ← Number of groups (1)
65 E8 2C 00              ← Timestamp (Unix epoch)
03 F9                    ← Device type (1017)
00 01 E2 40              ← Serial number
00 00 00 03              ← Number of values (3)
00 50 00 01 02 02 D5     ← Tag 80: status=0, 1 value, 2 bytes, 725
00 51 00 01 02 02 A8     ← Tag 81: status=0, 1 value, 2 bytes, 680
00 52 00 01 02 01 1D     ← Tag 82: status=0, 1 value, 2 bytes, 285

Pros: 5-8× smaller, perfect float fidelity (raw bytes preserved), exact size calculation. Cons: Requires matching decoder on the cloud side, harder to debug without tools.

For gateways communicating over cellular connections — common in remote facilities like water treatment plants, oil wells, or distributed renewable energy sites — binary encoding is essentially mandatory. A gateway polling 100 tags every 10 seconds generates about 260 MB/month in JSON versus 35 MB/month in binary. At typical IoT cellular rates ($0.50-$2.00/MB), that's the difference between $130/month and $17/month per gateway.
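An encoder for a single group might look like the following Python sketch. The field widths are inferred from the hex dump above (2-byte tag ID, 1-byte status, 1-byte value count, 1-byte payload size); treat them as an assumption, not a wire-format specification.

```python
import struct

def encode_batch(ts, device_type, serial, values):
    """Encode one group in the compact big-endian binary layout sketched above.

    values: list of (tag_id, int_value) pairs; each value is packed
    in the fewest whole bytes that hold it.
    """
    out = bytearray(b"\xf7")                      # header byte
    out += struct.pack(">I", 1)                   # number of groups
    out += struct.pack(">IHI", ts, device_type, serial)
    out += struct.pack(">I", len(values))         # number of values
    for tag_id, value in values:
        raw = value.to_bytes(max(1, (value.bit_length() + 7) // 8), "big")
        # tag id (2B), status (1B), value count (1B), byte size (1B), payload
        out += struct.pack(">HBBB", tag_id, 0, 1, len(raw)) + raw
    return bytes(out)
```

Encoding the three-tag group from the JSON example produces a 40-byte payload, versus roughly 200 bytes of minified JSON — in line with the 5-8× figure.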

The MQTT Watchdog Pattern

MQTT connections can enter a zombie state — technically connected according to the TCP stack, but the broker has stopped responding. This is especially common behind industrial firewalls and NAT devices with aggressive connection timeout policies.

The Problem

The MQTT library reports the connection as alive. The gateway publishes messages. No PUBACK comes back — ever. The buffer fills up because the consumer thinks each message is "in flight" (the packet_sent flag is set). Eventually the buffer wraps and data loss begins.

The Solution: Last-Delivered Timestamp

Track the timestamp of the last successful PUBACK. If more than N seconds have passed since the last acknowledged delivery, and there are messages waiting to be sent, the connection is stale:

monitor_watchdog():
    if connected AND packet_sent:
        elapsed = now - last_delivered_packet_timestamp
        if elapsed > WATCHDOG_THRESHOLD:
            // Force disconnect and reconnect
            force_disconnect()
            // Disconnection handler clears packet_sent
            // Reconnection handler will re-deliver from read_p

A typical threshold is 60 seconds for LAN connections and 120 seconds for cellular. This catches zombie connections that the TCP stack and MQTT keep-alive miss.
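A runnable version of this check, with an injectable clock so the stale condition can be exercised deterministically (class and field names are illustrative):

```python
import time

WATCHDOG_THRESHOLD = 60.0        # seconds; use ~120 on cellular links

class DeliveryWatchdog:
    """Flags a stale connection when PUBACKs stop arriving."""

    def __init__(self, threshold=WATCHDOG_THRESHOLD, clock=time.monotonic):
        self.threshold = threshold
        self.clock = clock
        self.last_delivered = clock()
        self.packet_sent = False     # set when a QoS 1 publish goes out
        self.connected = True

    def on_puback(self):
        """Broker acknowledged delivery: record the timestamp."""
        self.last_delivered = self.clock()
        self.packet_sent = False

    def is_stale(self):
        """True when a publish has been in flight too long with no PUBACK."""
        if not (self.connected and self.packet_sent):
            return False
        return self.clock() - self.last_delivered > self.threshold
```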

Reconnection with Backoff

When the watchdog (or a genuine disconnection) triggers a reconnect, use a dedicated thread for the connection attempt. The connect_async call can block for the TCP timeout duration (potentially 30+ seconds), and you don't want that blocking the main loop or the polling thread.

A semaphore controls the reconnection thread:

Main Thread:                     Reconnection Thread:
  Detects need to                  (blocked on semaphore)
  reconnect                                │
  Posts semaphore ───────────────► Wakes up
                                   Calls connect_async()
                                   (may block 30s)
                                   Success or failure
                                   Posts "done" semaphore
  Waits for "done" ◄───────────────────────┘
  Checks result

The reconnect delay should be fixed and short (5 seconds is typical) for industrial applications — not exponential backoff. In a factory, a network outage either resolves quickly (a transient) or it's a hard failure that needs human intervention. Exponential backoff just delays reconnection after the network recovers.
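The semaphore handshake above can be sketched in Python (illustrative names; `connect_fn` stands in for the blocking `connect_async` setup):

```python
import threading

class Reconnector:
    """Dedicated connection thread: the main loop never blocks on connect()."""

    def __init__(self, connect_fn):
        self.connect_fn = connect_fn           # may block for the full TCP timeout
        self.job = threading.Semaphore(0)      # main loop posts work here
        self.done = threading.Semaphore(0)     # thread posts completion here
        self.result = None
        threading.Thread(target=self._run, daemon=True).start()

    def _run(self):
        while True:
            self.job.acquire()                 # block until a reconnect is requested
            self.result = self.connect_fn()    # slow call happens off the main loop
            self.done.release()

    def request_reconnect(self):
        self.job.release()

    def poll_result(self):
        """Non-blocking: returns the result once the attempt finished, else None."""
        if self.done.acquire(blocking=False):
            return self.result
        return None
```

The main loop calls `request_reconnect()` and keeps polling PLCs; on each cycle it checks `poll_result()` to see whether the attempt finished.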

Batching Strategy: Size vs. Time

Telemetry batches should be finalized and queued for transmission based on whichever threshold hits first: size or time.

Size-Based Finalization

When the accumulated batch data exceeds a configured maximum (typically 400–500 KB for JSON, 50–100 KB for binary), finalize and queue it. This prevents any single MQTT message from being too large for the broker or the network MTU.

Time-Based Finalization

When the batch has been collecting data for more than a configured timeout (typically 30-60 seconds), finalize it regardless of size. This ensures that even slowly-changing tags get transmitted within a bounded time window.
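The "whichever hits first" rule is a one-line predicate. A sketch (the threshold constants are illustrative defaults):

```python
import time

MAX_BATCH_BYTES = 100 * 1024      # e.g. a binary-encoding ceiling
MAX_BATCH_AGE = 30.0              # seconds

def should_finalize(batch_bytes, batch_started_at, now=None):
    """Finalize when either the size or the time threshold is hit."""
    now = time.monotonic() if now is None else now
    return (batch_bytes >= MAX_BATCH_BYTES
            or (now - batch_started_at) >= MAX_BATCH_AGE)
```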

The Interaction Between Batching and Buffering

Batching and buffering are separate concerns that interact:

PLC Tags ──► Batch (collecting) ──► Buffer Page (queued) ──► MQTT (transmitted)

Tag reads accumulate      When the batch finalizes,     Pages are transmitted
in the batch structure    the encoded batch goes        one at a time with
                          into the ring buffer          PUBACK confirmation

A batch contains one or more "groups" — each group is a set of tag values read at the same timestamp. Multiple polling cycles might go into a single batch before it's finalized by size or time. The finalized batch then goes into the ring buffer as a single message.

Dependent Tag Reads and Atomic Groups

In many PLC configurations, certain tags are only meaningful when read together. For example:

  • Alarm word tags — a uint16 register where each bit represents a different alarm. You read the alarm word, then extract the individual bits. If the alarm word changes, you need to read and deliver the extracted bits atomically with the parent.

  • Machine state transitions — when a "blender running" tag changes from 0 to 1, you might need to immediately read all associated process values (RPM, temperatures, pressures) to capture the startup snapshot.

The architecture handles this through dependent tag chains:

Parent Tag (alarm_word, interval=1s, compare=true)
  ├── Calculated Tag (alarm_bit_0, shift=0, mask=0x01)
  ├── Calculated Tag (alarm_bit_1, shift=1, mask=0x01)
  ├── Dependent Tag (motor_speed, read_on_change=true)
  └── Dependent Tag (temperature, read_on_change=true)

When the parent tag changes, the polling thread:

  1. Finalizes the current batch
  2. Recursively reads all dependent tags (forced read, ignoring intervals)
  3. Starts a new batch group with the same timestamp

This ensures that the dependent values are timestamped identically with the trigger event and delivered together.
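The recursive force-read can be sketched as follows (a sketch with hypothetical dict-based tag records; the real tag table would be the gateway's configured structures):

```python
def read_dependents(tag, read_fn, group):
    """Recursively force-read a tag's dependents into the same batch group.

    read_fn(name) performs a forced read, ignoring the tag's normal interval.
    """
    for dep in tag.get("dependents", []):
        value = read_fn(dep["name"])
        group.append((dep["name"], value))
        read_dependents(dep, read_fn, group)   # dependents may have dependents
```

Because every appended value goes into one group, they all share the trigger event's timestamp.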

Hourly Full-Read Reset

Change-of-value (COV) filtering dramatically reduces bandwidth, but it introduces a subtle failure mode: if a value changes during a transient read error, the gateway might never know it changed.

Here's a benign scenario first:

  1. At 10:00:00, tag value = 72.5 → transmitted
  2. At 10:00:01, the PLC returns an error for that tag → nothing transmitted
  3. At 10:00:02, tag value = 73.0 → compared against the last successful read (72.5), change detected, transmitted

COV filtering recovers on its own here, because each comparison is made against the last successfully read value.

The real problem is when:

  1. At 10:00:00, tag value = 72.5 → transmitted
  2. The PLC program changes the tag to 73.0 and then back to 72.5 between polling cycles
  3. The gateway never sees 73.0 — it polls at 10:00:00 and 10:00:01 and gets 72.5 both times

For most industrial applications, this sub-second transient is irrelevant. But to guard against drift — where small rounding differences accumulate between the gateway's cached value and the PLC's actual value — a full reset is performed every hour:

Every hour boundary (when the system clock's hour changes):
1. Clear the "read once" flag on every tag
2. Clear all last-known values
3. Force read and transmit every tag regardless of COV

This guarantees that the cloud platform has a complete snapshot of every tag value at least once per hour, even for tags that haven't changed.
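The boundary check and state reset fit in a few lines (a sketch with dict-based tag records; the hour comparison follows the document's "when the system clock's hour changes" rule):

```python
import time

def maybe_hourly_reset(tags, last_poll_time, current_time):
    """On an hour boundary, clear COV state so every tag is re-read and re-sent."""
    if time.gmtime(current_time).tm_hour == time.gmtime(last_poll_time).tm_hour:
        return False
    for tag in tags:
        tag["read_once"] = False     # forces an unconditional read
        tag["last_value"] = None     # forces the COV comparison to fire
    return True
```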

Putting It All Together: The Polling Loop

Here's the complete polling loop architecture that ties all these patterns together:

main_polling_loop():
    FOREVER:
        current_time = monotonic_clock()

        FOR each configured device:
            // Hourly reset check
            if hour(current_time) != hour(last_poll_time):
                reset_all_tags(device)

            // Start a new batch group
            start_group(device.batch, unix_timestamp())

            FOR each tag in device.tags:
                // Check if this tag needs reading now
                if not tag.read_once OR elapsed(tag.last_read) >= tag.interval:

                    value, status = read_tag(device, tag)

                    if status == LINK_ERROR:
                        set_link_state(device, DOWN)
                        break  // Stop reading this device

                    set_link_state(device, UP)

                    // COV check
                    if tag.compare AND tag.read_once:
                        if value == tag.last_value AND status == tag.last_status:
                            continue  // No change, skip

                    // Deliver value
                    if tag.do_not_batch:
                        deliver_immediately(device, tag, value)
                    else:
                        add_to_batch(device.batch, tag, value)

                    // Check dependent tags
                    if value_changed AND tag.has_dependents:
                        finalize_batch()
                        read_dependents(device, tag)
                        start_new_group()

                    // Update tracking
                    tag.last_value = value
                    tag.last_status = status
                    tag.read_once = true
                    tag.last_read = current_time

            // Finalize batch group
            stop_group(device.batch, output_buffer)
            // ↑ This checks size/time thresholds and may
            //   queue the batch into the ring buffer

        sleep(polling_interval)

Performance Characteristics

On a typical industrial edge gateway (ARM Cortex-A9, 512 MB RAM, Linux):

| Operation | Time | Notes |
|---|---|---|
| Mutex lock/unlock | ~1 µs | Per buffer operation |
| Modbus TCP read (10 registers) | 5–15 ms | Network dependent |
| Modbus RTU read (10 registers) | 20–50 ms | Baud rate dependent (9600–115200) |
| EtherNet/IP tag read | 2–8 ms | CIP overhead |
| JSON batch encoding | 0.5–2 ms | 100 tags |
| Binary batch encoding | 0.1–0.5 ms | 100 tags |
| MQTT publish (QoS 1) | 1–5 ms | LAN broker |
| Buffer page write | 5–20 µs | memcpy only |

The bottleneck is always the PLC protocol reads, not the buffer or transmission logic. A gateway polling 200 Modbus TCP tags can complete a full cycle in under 200 ms, leaving plenty of headroom for a 1-second polling interval.

For Modbus RTU (serial), the bottleneck shifts to the baud rate. At 9600 baud, a single register read takes ~15 ms including response. Polling 50 registers individually would take 750 ms — too close to a 1-second interval. This is why contiguous register grouping matters: reading 50 consecutive registers in a single request takes about 50 ms, a 15× improvement.
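The grouping step itself is simple: sort the configured register addresses and merge adjacent ones into ranges. A sketch (hypothetical helper; 125 registers is the Modbus limit per read-holding-registers request):

```python
def group_registers(addresses, max_per_request=125):
    """Group register addresses into contiguous (start, count) read ranges."""
    ranges = []
    for addr in sorted(set(addresses)):
        if (ranges
                and addr == ranges[-1][0] + ranges[-1][1]    # contiguous with range
                and ranges[-1][1] < max_per_request):        # room in this request
            ranges[-1] = (ranges[-1][0], ranges[-1][1] + 1)  # extend current range
        else:
            ranges.append((addr, 1))                         # start a new range
    return ranges
```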

How machineCDN Implements These Patterns

machineCDN's edge gateway uses exactly these patterns — paged ring buffers with mutex-protected page management, QoS 1 MQTT with PUBACK-based buffer advancement, and both binary and JSON encoding depending on the deployment's bandwidth constraints.

The platform's gateway daemon runs on Linux-based edge hardware (including cellular routers like the Teltonika RUT series) and handles simultaneous Modbus RTU, Modbus TCP, and EtherNet/IP connections to mixed-vendor equipment. The buffer is sized during commissioning based on the expected outage duration — a 64 KB buffer holds roughly 4 hours of data at typical polling rates; a 512 KB buffer extends that to over 24 hours.

The result: plants running machineCDN don't lose telemetry during network outages. When connectivity recovers, the buffered data drains automatically and fills in the gaps in trending charts and analytics — no manual intervention, no missing data points.

Key Takeaways

  1. Use paged ring buffers, not unbounded queues — fixed memory, graceful overflow (oldest data dropped first)
  2. Protect buffer operations with a mutex, but never hold it during network I/O — microsecond lock durations keep producers and consumers non-blocking
  3. Track PUBACK per-message to prevent double-sending and enable reliable buffer advancement
  4. Implement an MQTT watchdog using last-delivery timestamps to catch zombie connections
  5. Batch by size OR time (whichever hits first) to balance bandwidth and latency
  6. Reset all tags hourly to guarantee complete snapshots and prevent drift
  7. Binary encoding saves 5-8× bandwidth with zero precision loss — essential for cellular-connected gateways
  8. Group contiguous Modbus registers into single requests — 15× faster than individual reads on RTU

Building a reliable IIoT edge gateway is fundamentally a systems programming challenge. The protocols, the buffering, the concurrency — each one is manageable alone, but getting them all right together, on constrained hardware, with zero tolerance for data loss, is what separates toy prototypes from production infrastructure.


See machineCDN's store-and-forward buffering in action with real factory data. Request a demo to explore the platform.

Cloud Connection Watchdogs for IIoT Edge Gateways: Designing Self-Healing MQTT Pipelines [2026]

· 12 min read

The edge gateway powering your factory floor monitoring has exactly one job that matters: get data from PLCs to the cloud. Everything else — protocol translation, tag mapping, batch encoding — is just preparation for that moment when bits leave the gateway and travel to your cloud backend.

And that's exactly where things break. MQTT connections go stale. TLS certificates expire silently. Cloud endpoints restart for maintenance. Cellular modems drop carrier. The gateway's connection looks alive — the TCP socket is open, the MQTT client reports "connected" — but nothing is actually getting delivered.

This is the silent failure problem, and it kills more IIoT deployments than any protocol misconfiguration ever will. This guide covers how to design watchdog systems that detect, diagnose, and automatically recover from every flavor of connectivity failure.

Why MQTT Connections Fail Silently

To understand why watchdogs are necessary, you need to understand what MQTT's keep-alive mechanism does and — more importantly — what it doesn't do.

MQTT keep-alive is a bi-directional ping. The client sends a PINGREQ, the broker responds with PINGRESP. If the broker doesn't hear from the client within 1.5× the keep-alive interval, it considers the client dead and closes the session. If the client doesn't get a PINGRESP, it knows the connection is lost.

Sounds robust, right? Here's where it falls apart:

The Half-Open Connection Problem

TCP connections can enter a "half-open" state where one side thinks the connection is alive, but the other side has already dropped it. This happens when a NAT gateway times out the session, a cellular modem roams to a new tower, or a firewall silently drops the route. The MQTT client's operating system still shows the socket as ESTABLISHED. The keep-alive PINGREQ gets queued in the kernel's send buffer — and sits there, never actually reaching the wire.

The Zombie Session Problem

The gateway reconnects after an outage and gets a new TCP session, but the broker still has the old session's resources allocated. Depending on the clean session flag and broker implementation, you might end up with duplicate subscriptions, missed messages on the command channel, or a broker that refuses the new connection because the old client ID is still "active."

The Token Expiration Problem

Cloud IoT platforms (Azure IoT Hub, AWS IoT Core, Google Cloud IoT) use SAS tokens or JWT tokens for authentication. These tokens have expiration timestamps. When a token expires, the MQTT connection stays open until the next reconnection attempt — which then fails with an authentication error. If your reconnection logic doesn't refresh the token before retrying, you'll loop forever: connect → auth failure → reconnect → auth failure.

The Backpressure Problem

The MQTT client library reports "connected," publishes succeed (they return a message ID), but the broker is under load and takes 30 seconds to acknowledge the publish. Your QoS 1 messages pile up in the client's outbound queue. Eventually the client's memory is exhausted, publishes start failing, but the connection is technically alive.

Designing a Proper Watchdog

A production-grade edge watchdog doesn't just check "am I connected?" It monitors three independent health signals:

Signal 1: Connection State

Track the MQTT on_connect and on_disconnect callbacks. Maintain a state machine:

States:
  DISCONNECTED → CONNECTING → CONNECTED → DISCONNECTING → DISCONNECTED

Transitions:
  DISCONNECTED + config_available → CONNECTING (initiate async connect)
  CONNECTING + on_connect(status=0) → CONNECTED
  CONNECTING + on_connect(status≠0) → DISCONNECTED (log error, wait backoff)
  CONNECTED + on_disconnect → DISCONNECTING → DISCONNECTED

The key detail: initiate MQTT connections asynchronously in a dedicated thread. A blocking mqtt_connect() call in the main data collection loop will halt PLC reads during the TCP handshake — which on a cellular link with 2-second RTT means 2 seconds of missed data. Use a semaphore or signal to coordinate: the connection thread posts "I'm ready" when it finishes, and the main loop picks it up on the next cycle.
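As a minimal Python sketch (illustrative names, and the intermediate DISCONNECTING state collapsed for brevity), the state machine driven by those callbacks might look like:

```python
from enum import Enum, auto

class ConnState(Enum):
    DISCONNECTED = auto()
    CONNECTING = auto()
    CONNECTED = auto()

class ConnectionStateMachine:
    """Tracks MQTT connection state from the on_connect/on_disconnect callbacks."""

    def __init__(self):
        self.state = ConnState.DISCONNECTED

    def start_connect(self):
        if self.state is ConnState.DISCONNECTED:
            self.state = ConnState.CONNECTING
            return True          # caller kicks off the async connect
        return False             # already connecting/connected: no-op

    def on_connect(self, status):
        if status == 0:
            self.state = ConnState.CONNECTED
        else:
            self.state = ConnState.DISCONNECTED   # log error, wait for backoff

    def on_disconnect(self):
        self.state = ConnState.DISCONNECTED
```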

Signal 2: Delivery Confirmation

This is the critical signal that catches silent failures. Track the timestamp of the last successfully delivered message (acknowledged by the broker, not just sent by the client).

For QoS 1: the on_publish callback fires when the broker acknowledges receipt with a PUBACK. Record this timestamp every time it fires.

Last Delivery Tracking:
  on_publish(packet_id) → last_delivery_timestamp = now()

Watchdog Check (every main loop cycle):
  if (now() - last_delivery_timestamp > WATCHDOG_TIMEOUT):
      trigger_reconnection()

What's the right watchdog timeout? It depends on your data rate:

| Data Rate | Suggested Timeout | Rationale |
|---|---|---|
| Every 1 s | 30–60 s | 30 missed deliveries before alert |
| Every 5 s | 60–120 s | 12–24 missed deliveries |
| Every 30 s | 120–300 s | 4–10 missed deliveries |

The timeout should be significantly longer than your maximum expected inter-delivery interval. If your batch timeout is 30 seconds, a 120-second watchdog timeout gives you 4 batch cycles of tolerance before concluding something is wrong.

Signal 3: Token/Certificate Validity

Before attempting reconnection, check the authentication material:

Token Check:
  if (token_expiration_timestamp ≠ 0):
      if (current_time > token_expiration_timestamp):
          log("WARNING: Cloud auth token may be expired")
      else:
          log("Token valid until {expiration_time}")

If your deployment uses SAS tokens with expiration timestamps, parse the se= (signature expiry) parameter from the connection string at startup. Log a warning when the token is approaching expiry. Some platforms provide token refresh mechanisms; others require a redeployment. Either way, knowing the token is expired before the first reconnection attempt saves you from debugging phantom connection failures at 3 AM.
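Parsing the `se=` parameter is straightforward. A sketch (assuming the common SAS token shape `SharedAccessSignature sr=...&sig=...&se=<unix-expiry>`; the hub hostname and margin are illustrative):

```python
import time
from urllib.parse import parse_qs

def sas_expiry(sas_token: str):
    """Extract the Unix expiry timestamp from the token's se= parameter."""
    # Token shape: "SharedAccessSignature sr=...&sig=...&se=1712345678"
    _, _, params = sas_token.partition(" ")
    se = parse_qs(params).get("se")
    return int(se[0]) if se else None

def check_token(sas_token, now=None, warn_margin=3600):
    """Classify the token: valid, expiring-soon, expired, or no expiry found."""
    now = time.time() if now is None else now
    expiry = sas_expiry(sas_token)
    if expiry is None:
        return "no-expiry"
    if now > expiry:
        return "expired"
    if expiry - now < warn_margin:
        return "expiring-soon"
    return "valid"
```

Run `check_token` at startup and before every reconnection attempt, and log the result.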

Buffer-Aware Recovery: Don't Lose Data During Outages

The watchdog triggers a reconnection. But what happens to the data that was collected while the connection was down?

This is where most IIoT platforms quietly drop data. The naïve approach: if the MQTT publish call fails, discard the message and move on. This means any network outage, no matter how brief, creates a permanent gap in your historical data.

A proper store-and-forward buffer works like this:

Page-Based Buffer Architecture

Instead of a simple FIFO queue, divide a fixed memory region into pages. Each page holds multiple messages packed sequentially. Three page lists manage the lifecycle:

  • Free Pages: Empty, available for new data
  • Work Page: Currently being filled with new messages
  • Used Pages: Full pages waiting for delivery

Data Flow:
  PLC Read → Batch Encoder → Work Page (append)
  Work Page Full → Move to Used Pages queue

MQTT Connected:
  Used Pages front → Send first message → Wait for PUBACK
  PUBACK received → Advance read pointer
  Page fully delivered → Move to Free Pages

MQTT Disconnected:
  Used Pages continue accumulating
  Work Page continues filling
  If Free Pages exhausted → Reclaim oldest Used Page (overflow warning)

Why Pages, Not Individual Messages

Individual message queuing has per-message overhead that becomes significant at high data rates: pointer storage, allocation/deallocation, fragmentation. A page-based buffer pre-allocates a contiguous memory block (typically 1–2 MB on embedded edge hardware) and manages it as fixed-size pages. No dynamic allocation after startup. No fragmentation. Predictable memory footprint.

The overflow behavior is also better. When the buffer is full and the connection is still down, you sacrifice the oldest complete page — losing, say, 60 seconds of data from 10 minutes ago rather than randomly dropping individual messages from different time periods. The resulting data gap is clean and contiguous, which is much easier for downstream analytics to handle than scattered missing points.

Disconnect Recovery Sequence

When the MQTT on_disconnect callback fires:

  1. Mark connection as down immediately — the buffer stops trying to send
  2. Reset "packet in flight" flag — the pending PUBACK will never arrive
  3. Continue accepting data from PLC reads into the buffer
  4. Do NOT flush or clear the buffer — all unsent data stays queued

When on_connect fires after reconnection:

  1. Mark connection as up
  2. Begin draining Used Pages from the front of the queue
  3. Send first queued message, wait for PUBACK, then send next
  4. Simultaneously accept new data into the Work Page

This "catch-up" phase is important to handle correctly. New real-time data is still flowing into the buffer while old data is being drained. The buffer must handle concurrent writes (from the PLC reading thread) and reads (for MQTT delivery) safely. Mutex protection on the page list operations is essential.

Async Connection Threads: The Pattern That Saves You

Network operations block. DNS resolution blocks. TCP handshakes block. TLS negotiation blocks. On a cellular connection with packet loss, a single connection attempt can take 5–30 seconds.

If your edge gateway has a single thread doing both PLC reads and MQTT connections, that's 5–30 seconds of missed PLC data every time the connection drops. For an injection molding machine with a 15-second cycle, you could miss an entire shot.

The solution is a dedicated connection thread:

Main Thread:
  loop:
    read_plc_tags()
    encode_and_buffer()
    dispatch_command_queue()
    check_watchdog()
    if watchdog_triggered:
        post_job_to_connection_thread()
    sleep(1s)

Connection Thread:
  loop:
    wait_for_job()                  // blocks on semaphore
    destroy_old_connection()
    create_new_mqtt_client()
    configure_tls()
    set_callbacks()
    mqtt_connect_async(host, port)
    signal_job_complete()           // post semaphore

Two semaphores coordinate this:

  • Job semaphore: Main thread posts to trigger reconnection, connection thread waits on it
  • Completion semaphore: Connection thread posts when done, main thread checks (non-blocking) before posting next job

Critical detail: check that the connection thread isn't already running before posting a new job. If the main thread fires the watchdog timeout every 120 seconds but the last reconnection attempt is still in progress (stuck in a 90-second TLS handshake), you'll get overlapping connection attempts that corrupt the MQTT client state.
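The "is it already running?" check reduces to a non-blocking lock acquire. A sketch (illustrative names; `connect_fn` stands in for the full destroy/recreate/connect sequence):

```python
import threading

class GuardedReconnector:
    """Refuses to post a new reconnect job while one is still in progress."""

    def __init__(self):
        self.in_progress = threading.Lock()

    def try_start(self, connect_fn):
        if not self.in_progress.acquire(blocking=False):
            return False                      # previous attempt still running
        def run():
            try:
                connect_fn()                  # may block for a long TLS handshake
            finally:
                self.in_progress.release()    # allow the next attempt
        threading.Thread(target=run, daemon=True).start()
        return True
```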

Reconnection Backoff Strategy

When the cloud endpoint is genuinely down (maintenance window, region outage), aggressive reconnection attempts waste cellular data and CPU cycles. But when it's a transient network glitch, you want to reconnect immediately.

The right approach combines fixed-interval reconnect with watchdog escalation:

Reconnect Timing:
  Attempt 1: Immediate (transient glitch)
  Attempt 2: 5 seconds
  Attempt 3+: 5 seconds (constant backoff, capped at 5s)

Watchdog Escalation:
  if no successful delivery in 120 seconds despite "connected" state:
      force full reconnection (destroy + recreate client)

Why not exponential backoff? In industrial settings, the most common failure mode is a brief network interruption — a cell tower handoff, a router reboot, a firewall session timeout. These resolve in 5–15 seconds. Exponential backoff would delay your reconnection to 30s, 60s, 120s, 240s... meaning you could be offline for 4+ minutes after a 2-second glitch. Constant 5-second retry with watchdog escalation provides faster recovery for the common case while still preventing connection storms during genuine outages.

Device Status Broadcasting

Your edge gateway should periodically broadcast its own health status via MQTT. This serves two purposes: it validates the delivery pipeline end-to-end, and it gives the cloud platform visibility into the gateway fleet's health.

A well-designed status message includes:

  • System uptime (OS level — how long since last reboot)
  • Daemon uptime (application level — how long since last restart)
  • Connected device inventory (PLC types, serial numbers, link states)
  • Token expiration timestamp (proactive alerting for credential rotation)
  • Buffer utilization (how close to overflow)
  • Software version + build hash (for fleet management and OTA targeting)
  • Per-device tag counts and last-read timestamps (stale data detection)

Send a compact status on every connection establishment, and a detailed status periodically (every 5–10 minutes). The compact status acts as a "birth certificate" — the cloud platform immediately knows which gateway just came online and what equipment it's managing.
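Assembling such a payload is mostly bookkeeping. A sketch with the field set listed above (the JSON key names and example values are illustrative, not a platform schema):

```python
import json
import time

def build_status(boot_time, daemon_start, devices, buffer_used, buffer_total,
                 version, token_expiry):
    """Assemble a periodic gateway status payload as JSON."""
    now = time.time()
    return json.dumps({
        "system_uptime_s": int(now - boot_time),      # OS-level uptime
        "daemon_uptime_s": int(now - daemon_start),   # application-level uptime
        "devices": [
            {"type": d["type"], "serial": d["serial"], "link": d["link"],
             "tag_count": d["tag_count"], "last_read": d["last_read"]}
            for d in devices
        ],
        "token_expiry": token_expiry,                 # proactive credential alerting
        "buffer_utilization": round(buffer_used / buffer_total, 3),
        "version": version,
    })
```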

Real-World Failure Scenarios and How the Watchdog Handles Them

Scenario 1: Cellular Modem Roaming

Symptom: TCP connection goes half-open. MQTT client thinks it's connected. Publishes queue up in OS buffer.
Detection: Watchdog timeout fires — no PUBACK received in 120 seconds despite continuous publishes.
Recovery: Force reconnection. Buffer holds all unsent data. Reconnect on new cell tower, drain buffer.
Data loss: Zero (buffer sized for 2-minute outage).

Scenario 2: Cloud Platform Maintenance Window

Symptom: MQTT broker goes offline. Client receives disconnection callback.
Detection: Immediate — on_disconnect fires.
Recovery: 5-second reconnect attempts. Buffer accumulates data. Connection succeeds when maintenance ends.
Data loss: Zero if maintenance window is shorter than buffer capacity (typically 10–30 minutes at normal data rates).

Scenario 3: SAS Token Expiration

Symptom: Connection drops. Reconnection attempts fail with authentication error.
Detection: Watchdog notices repeated connection failures. Token timestamp check confirms expiration.
Recovery: Log critical alert. Wait for token refresh (manual or automated). Reconnect with new token.
Data loss: Depends on token refresh time. Buffer provides bridge.

Scenario 4: PLC Goes Offline

Symptom: Tag reads start returning errors. Gateway loses link state to PLC.
Detection: Link state monitoring fires immediately. Error delivered to cloud as a priority (unbatched) event.
Recovery: Gateway continues attempting PLC reads. When PLC comes back, link state restored, reads resume.
MQTT impact: None — the cloud connection is independent of PLC connections. Both failures are handled by separate watchdog systems.

Monitoring Your Watchdog (Yes, You Need to Watch the Watcher)

The watchdog itself needs observability:

  1. Log every watchdog trigger with reason (no PUBACK, connection timeout, token expiry)
  2. Count reconnection attempts per hour — a spike indicates infrastructure instability
  3. Track buffer high-water marks — if the buffer repeatedly approaches capacity, your connectivity is too unreliable for the data rate
  4. Alert on repeated authentication failures — this is almost always a credential rotation issue

Platforms like machineCDN build this entire watchdog system into the edge agent — monitoring cloud connections, managing store-and-forward buffers, handling reconnection with awareness of both the MQTT transport state and the buffer delivery state. The result is a self-healing data pipeline where network outages create brief delays in cloud delivery but never cause data loss.

Implementation Checklist

Before deploying your edge gateway to production, verify:

  • Watchdog timer runs independently of MQTT callback threads
  • Connection establishment is fully asynchronous (dedicated thread)
  • Buffer survives connection loss (no flush on disconnect)
  • Buffer overflow discards oldest data, not newest
  • Token/certificate expiration is checked before reconnection
  • Reconnection doesn't overlap with in-progress connection attempts
  • Device status is broadcast on every successful reconnection
  • Buffer drain and new data accept can operate concurrently
  • All watchdog events are logged with timestamps for post-mortem analysis
  • PLC read loop continues uninterrupted during reconnection

The unsexy truth about industrial IoT reliability is that it's not about the protocol choice or the cloud platform. It's about what happens in the 120 seconds after your connection drops. Get the watchdog right, and a 10-minute network outage is invisible to your operators. Get it wrong, and a 2-second glitch creates a permanent hole in your production data.

Build the self-healing pipeline. Your 3 AM self will thank you.

Store-and-Forward Buffer Design for Reliable Industrial MQTT Telemetry [2026]

· 12 min read

Your edge gateway just collected 200 data points from six machines. The MQTT connection to the cloud dropped 47 seconds ago. What happens to that data?

In consumer IoT, the answer is usually "it gets dropped." In industrial IoT, that answer gets you fired. A single missed alarm delivery can mean a $50,000 chiller compressor failure. A gap in temperature logging can invalidate an entire production batch for FDA compliance.

The solution is a store-and-forward buffer — a memory structure that sits between your data collection layer and your MQTT transport, holding telemetry data during disconnections and draining it the moment connectivity returns. It sounds simple. The engineering details are anything but.

This article walks through the design of a production-grade store-and-forward buffer for resource-constrained edge gateways running on embedded Linux.

Store-and-forward buffer architecture for MQTT telemetry

Why MQTT QoS Isn't Enough

The first objection is always: "MQTT already has QoS 1 and QoS 2 — doesn't the broker handle retransmission?"

Technically yes, but only for messages that have already been handed to the MQTT client library. The problem is what happens before the publish call:

  1. The TCP connection is down. mosquitto_publish() returns MOSQ_ERR_NO_CONN. Your data is gone unless you stored it somewhere.
  2. The MQTT library's internal buffer is full. Most MQTT client libraries have a finite send queue. When it fills, new publishes get rejected.
  3. The gateway rebooted. Any data in memory is lost. Only data written to persistent storage survives.

QoS handles message delivery within an established session. Store-and-forward handles data persistence across disconnections, reconnections, and reboots.

The Page-Based Buffer Architecture

A production buffer uses a paged memory pool — a contiguous block of memory divided into fixed-size pages that cycle through three states:

┌─────────────────────────────────────────────────────┐
│                  Buffer Memory Pool                 │
│                                                     │
│  ┌──────┐  ┌──────┐  ┌──────┐  ┌──────┐  ┌──────┐   │
│  │Page 0│  │Page 1│  │Page 2│  │Page 3│  │Page 4│   │
│  │ FREE │  │ USED │  │ USED │  │ WORK │  │ FREE │   │
│  └──────┘  └──────┘  └──────┘  └──────┘  └──────┘   │
│                                                     │
│  FREE = empty, available for writing                │
│  WORK = currently being filled with incoming data   │
│  USED = full, queued for delivery to MQTT broker    │
└─────────────────────────────────────────────────────┘

Page States

  • FREE pages form a linked list of available pages. When the buffer needs a new work page, it pulls from the free list.
  • WORK page is the single page currently accepting incoming data. New telemetry batches get appended here. There is always at most one work page.
  • USED pages form an ordered queue of pages waiting to be delivered. The buffer sends data from the head of the used queue, one message at a time.
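A minimal sketch of the three states and the free-list pull, assuming a singly linked list of pages; the struct and function names are illustrative, not from any particular implementation:

```c
#include <stddef.h>

typedef enum { PAGE_FREE, PAGE_WORK, PAGE_USED } page_state_t;

typedef struct page {
    page_state_t state;
    struct page *next;   /* link in the FREE list or USED queue */
} page_t;

/* Pop a page from the free list and make it the work page.
 * Returns NULL when no FREE page is available (the overflow case). */
static page_t *take_work_page(page_t **free_list)
{
    page_t *p = *free_list;
    if (p == NULL)
        return NULL;
    *free_list = p->next;
    p->next  = NULL;
    p->state = PAGE_WORK;
    return p;
}
```

When take_work_page() returns NULL the buffer falls back to its overflow strategy, stealing the oldest USED page instead.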

Page Structure

Each page contains multiple messages, packed sequentially:

┌─────────────────────────────────────────────┐
│                   Page N                    │
│                                             │
│ ┌──────────┬──────────┬──────────────────┐  │
│ │  msg_id  │ msg_size │   message_data   │  │
│ │ (4 bytes)│ (4 bytes)│    (variable)    │  │
│ ├──────────┼──────────┼──────────────────┤  │
│ │  msg_id  │ msg_size │   message_data   │  │
│ │ (4 bytes)│ (4 bytes)│    (variable)    │  │
│ ├──────────┼──────────┼──────────────────┤  │
│ │         ... more messages ...          │  │
│ └──────────┴──────────┴──────────────────┘  │
│                                             │
│ write_p ──→ next write position             │
│ read_p  ──→ next read position (delivery)   │
│                                             │
└─────────────────────────────────────────────┘

The msg_id field is critical — it gets filled in by the MQTT library's publish() call, which returns a packet ID. When the broker acknowledges delivery (via the PUBACK callback in QoS 1), the buffer matches the acknowledged ID against the head of the delivery queue.

Memory Sizing

The minimum viable buffer needs at least three pages:

  • One page being filled (WORK)
  • One page being transmitted (USED, head of queue)
  • One page available for the next batch (FREE)

In practice, you want more headroom. The formula:

buffer_size = page_size × desired_holdover_time / batch_interval

Example:
- Page size: 32 KB
- Batch interval: 30 seconds
- Desired holdover: 10 minutes
- Pages needed: 600s / 30s = 20 pages → 20 × 32 KB = 640 KB

On a typical embedded Linux gateway with 256MB–512MB RAM, dedicating 1–4 MB to the telemetry buffer is reasonable.
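The formula translates directly to code. This sketch assumes one page is produced per batch interval, as in the example above:

```c
/* Pages needed to cover a holdover window, rounding up so a
 * partial interval still gets its own page. */
static unsigned buffer_pages_needed(unsigned holdover_s,
                                    unsigned batch_interval_s)
{
    return (holdover_s + batch_interval_s - 1) / batch_interval_s;
}

/* Total buffer allocation in bytes for a given page size (in KB). */
static unsigned buffer_bytes(unsigned page_kb,
                             unsigned holdover_s,
                             unsigned batch_interval_s)
{
    return page_kb * 1024u
         * buffer_pages_needed(holdover_s, batch_interval_s);
}
```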

The Write Path: Accepting Incoming Data

When the data collection layer finishes a polling cycle and has a batch of tag values ready to deliver, it calls into the buffer:

Step 1: Check the Work Page

If no work page exists, allocate one from the free list. If the free list is empty, steal the oldest used page — this is the overflow strategy (more on this below).

Step 2: Size Check

Before writing, verify that the message (plus its 8-byte header) fits in the remaining space on the work page:

remaining = page_size - (write_p - start_p)
needed    = 4 (msg_id) + 4 (msg_size) + payload_size

if needed > remaining:
    move work_page to used_pages queue
    allocate a new work page
    retry

Step 3: Write the Message

1. Write 4 zero bytes at write_p    (placeholder for msg_id)
2. Write message size as uint32 (4 bytes)
3. Write message payload (N bytes)
4. Advance write_p by 8 + N

The msg_id is initially zero because we don't know it yet — it gets assigned when the message is actually published to MQTT.
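Steps 1–3 can be sketched as a single append function. The 4+4-byte header layout matches the page diagram earlier; the host-endian size field and the function name are assumptions of this sketch:

```c
#include <stdint.h>
#include <string.h>

/* Append one message (8-byte header + payload) at write_off.
 * Returns the new write offset, or 0 if the message doesn't fit
 * (the caller then rotates to a fresh work page and retries). */
static size_t page_write_message(uint8_t *page, size_t page_size,
                                 size_t write_off,
                                 const uint8_t *payload,
                                 uint32_t payload_size)
{
    size_t needed = 4 + 4 + (size_t)payload_size;
    if (needed > page_size - write_off)
        return 0;
    memset(page + write_off, 0, 4);            /* msg_id placeholder      */
    memcpy(page + write_off + 4, &payload_size, 4);  /* size field        */
    memcpy(page + write_off + 8, payload, payload_size); /* payload bytes */
    return write_off + needed;
}
```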

Step 4: Trigger Delivery

After every write, the buffer checks if it can send data. If the connection is up and no message is currently awaiting acknowledgment, it initiates delivery of the next queued message.

The Read Path: Delivering to MQTT

Delivery follows a strict one-message-at-a-time discipline. The buffer maintains a packet_sent flag:

if connected == false:  return
if packet_sent == true: return  (waiting for PUBACK)

message = used_pages[0].read_p
result  = mqtt_publish(message.data, message.size, &message.msg_id)

if result == success:
    packet_sent = true
else:
    packet_sent = false  (retry on next opportunity)

Why One at a Time?

Sending multiple messages without waiting for acknowledgment is tempting — it would be faster. But it creates a delivery ordering problem. If messages 1, 2, and 3 are sent simultaneously and message 2's PUBACK arrives first, you don't know whether messages 1 and 3 were delivered. With one-at-a-time, the delivery order is guaranteed to match the insertion order.

For higher throughput, some implementations pipeline 2–3 messages and track a small window of in-flight packet IDs. But for industrial telemetry where data integrity matters more than latency, sequential delivery is the safer choice.

The Delivery Confirmation Callback

When the MQTT library's on_publish callback fires with a packet ID:

1. Lock the buffer mutex
2. Check that the packet_id matches used_pages[0].read_p.msg_id
3. Advance read_p past the delivered message
4. If read_p >= write_p:
   - Page completely delivered
   - Move page from used_pages to free_pages
   - Reset the page's write_p and read_p
5. Set packet_sent = false
6. Attempt to send the next message
7. Unlock mutex

This is where the msg_id field in the page pays off — it's the correlation key between "we published this" and "the broker confirmed this."
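A simplified version of this callback logic, with the page queue reduced to a single head message so the control flow stands out; all struct and field names are illustrative:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t head_msg_id;  /* msg_id of the message at the head read_p   */
    size_t   read_p, write_p;
    size_t   head_msg_len; /* 8-byte header + payload of head message    */
    bool     packet_sent;
    bool     page_done;    /* set when the page is fully delivered       */
} delivery_t;

/* Returns true if the PUBACK matched the in-flight message. */
static bool on_puback(delivery_t *d, uint32_t packet_id)
{
    if (packet_id != d->head_msg_id)
        return false;               /* not the message we're waiting on  */
    d->read_p += d->head_msg_len;   /* advance past delivered message    */
    if (d->read_p >= d->write_p)
        d->page_done = true;        /* recycle page: USED -> FREE        */
    d->packet_sent = false;         /* allow the next publish            */
    return true;
}
```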

Overflow Handling: When Memory Runs Out

On a constrained device, the buffer will eventually fill up during an extended outage. The question is: what do you sacrifice?

Strategy 1: Drop Newest (Ring Buffer)

When the free list is empty, reject new writes. The data collection layer simply loses the current batch. This preserves historical data but creates gaps at the end of the outage.

Strategy 2: Drop Oldest (FIFO Eviction)

When the free list is empty, steal the oldest used page — the one at the head of the delivery queue. This preserves the most recent data but creates gaps at the beginning of the outage.

Which to Choose?

For industrial monitoring, drop-oldest is almost always correct. The reasoning:

  • During a long outage, the most recent data is more actionable than data from 20 minutes ago.
  • When connectivity returns, operators want to see current machine state, not historical state from the beginning of the outage.
  • Historical data from the outage period can often be reconstructed from PLC internal logs after the fact.
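A drop-oldest eviction sketch, with the USED queue reduced to a plain array for illustration; a real implementation would splice the page out of a linked queue instead:

```c
#include <stddef.h>

typedef struct {
    int    used[8];       /* page indices, oldest first */
    size_t used_count;
    int    overflow_count; /* surfaced in the status message */
} used_queue_t;

/* Evict the oldest USED page so its memory can be reused as the new
 * work page. Returns the evicted page index, or -1 if the queue is
 * empty (which would mean there is nothing left to steal). */
static int evict_oldest(used_queue_t *q)
{
    if (q->used_count == 0)
        return -1;
    int victim = q->used[0];
    for (size_t i = 1; i < q->used_count; i++)
        q->used[i - 1] = q->used[i];   /* shift queue toward the head */
    q->used_count--;
    q->overflow_count++;
    return victim;
}
```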

A production implementation logs a warning when it evicts a page:

Buffer: Overflow warning! Extracted USED page (#7)

This warning should be forwarded to the platform's monitoring layer so operators know data was lost.

Thread Safety

The buffer is accessed from two threads:

  1. The polling thread — calls buffer_add_data() after each collection cycle
  2. The MQTT callback thread — calls buffer_process_data_delivered() when PUBACKs arrive

A mutex protects all buffer operations:

// Pseudocode
void buffer_add_data(buffer, data, size) {
    lock(buffer->mutex)
    write_data_to_work_page(buffer, data, size)
    try_send_next_message(buffer)
    unlock(buffer->mutex)
}

void buffer_on_puback(buffer, packet_id) {
    lock(buffer->mutex)
    advance_read_pointer(buffer, packet_id)
    try_send_next_message(buffer)
    unlock(buffer->mutex)
}

The key insight: try_send_next_message() is called from both code paths. After adding data, the buffer checks if it can immediately begin delivery. After confirming delivery, it checks if there's more data to send. This creates a self-draining pipeline that doesn't need a separate timer or polling loop.

Connection State Management

The buffer tracks connectivity through two callbacks:

On Connect

buffer->connected = true
try_send_next_message(buffer) // Start draining the queue

On Disconnect

buffer->connected = false
buffer->packet_sent = false // Reset in-flight tracking

The packet_sent = false on disconnect is critical. If a message was in flight when the connection dropped, we have no way of knowing whether the broker received it. Setting packet_sent = false means the message will be re-sent on reconnection. This may result in duplicate delivery — which is fine. Industrial telemetry systems should be idempotent anyway (a repeated temperature reading at timestamp T is the same as the original).

Batch Finalization: When to Flush

Data arrives at the buffer through a batch layer that groups multiple tag values before serialization. The batch finalizes (and writes to the buffer) on two conditions:

1. Size Limit

When the accumulated batch exceeds a configured maximum size (e.g., 32 KB for JSON, or when the binary payload reaches 90% of the maximum), the batch is serialized and written to the buffer immediately:

if current_batch_size > max_batch_size:
    finalize_and_write_to_buffer(batch)
    reset_batch()

2. Time Limit

When the time since the batch started collecting exceeds a configured timeout (e.g., 30 seconds), the batch is finalized regardless of size:

elapsed = now - batch_start_time
if elapsed > max_batch_time:
    finalize_and_write_to_buffer(batch)
    reset_batch()

The time-based trigger is checked at the end of each tag group within a polling cycle, not on a separate timer. This avoids adding another thread and ensures the batch is finalized at a natural boundary in the data stream.

Binary vs. JSON Serialization

Production edge systems typically support two serialization formats:

JSON Format

{
  "groups": [
    {
      "ts": 1709341200,
      "device_type": 1018,
      "serial_number": 12345,
      "values": [
        {"id": 1, "values": [452]},
        {"id": 2, "values": [38]},
        {"id": 162, "error": -5}
      ]
    }
  ]
}

JSON is human-readable and easy to debug but verbose. A batch of 25 tag values in JSON might be 800 bytes.

Binary Format

0xF7              Command byte
[4B] num_groups   Number of timestamp groups
[4B] timestamp    Unix timestamp
[2B] dev_type     Device type ID
[4B] serial       Device serial number
[4B] num_values   Number of values in group
[2B] tag_id       Tag identifier
[1B] status       0x00=OK, other=error
[1B] count        Array size
[1B] elem_sz      Element size (1, 2, or 4 bytes)
[N×S bytes]       Packed values (MSB first)

The same 25 tag values in binary format might be 180 bytes — a 4.4× reduction. On cellular connections where bandwidth is metered per megabyte, this matters enormously.

The format choice is configured per device. Many deployments use binary for production and JSON for commissioning/debugging.
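The per-value layout in the table above can be sketched as a small encoder. This handles a single value entry with 2-byte elements only; the function name is illustrative and this is not a complete batch encoder:

```c
#include <stddef.h>
#include <stdint.h>

/* Encode one value entry: tag id (2B, big-endian), status (1B),
 * count (1B), element size (1B), then MSB-first packed values.
 * Returns bytes written, or 0 if the output buffer is too small. */
static size_t encode_value_u16(uint8_t *out, size_t cap,
                               uint16_t tag_id, uint8_t status,
                               const uint16_t *vals, uint8_t count)
{
    size_t need = 2 + 1 + 1 + 1 + (size_t)count * 2;
    if (need > cap)
        return 0;
    out[0] = (uint8_t)(tag_id >> 8);   /* tag id, big-endian */
    out[1] = (uint8_t)(tag_id & 0xFF);
    out[2] = status;                   /* 0x00 = OK */
    out[3] = count;                    /* array size */
    out[4] = 2;                        /* element size in bytes */
    for (uint8_t i = 0; i < count; i++) {
        out[5 + 2 * i]     = (uint8_t)(vals[i] >> 8);   /* MSB first */
        out[5 + 2 * i + 1] = (uint8_t)(vals[i] & 0xFF);
    }
    return need;
}
```

Five bytes of overhead per value, no field names, no delimiters — which is exactly where the 4.4× size reduction over JSON comes from.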

Monitoring the Buffer

A healthy buffer should have these characteristics:

  • Pages cycling regularly — pages move from FREE → WORK → USED → FREE in a steady rhythm
  • No overflow warnings — if you see "extracted USED page" in the logs, the buffer is undersized or the connection is too unreliable
  • Delivery timestamps advancing — track the timestamp of the last confirmed delivery. If it stops advancing while data is being collected, something is wrong with the MQTT connection

The edge daemon should publish buffer health as part of its periodic status message:

{
  "buffer": {
    "total_pages": 20,
    "free_pages": 14,
    "used_pages": 5,
    "work_pages": 1,
    "last_delivery_ts": 1709341200,
    "overflow_count": 0
  }
}

How machineCDN Implements Store-and-Forward

machineCDN's edge gateway implements the full page-based buffer architecture described in this article. The buffer sits between the batch serialization layer and the MQTT transport, providing:

  • Automatic page management — the gateway sizes the buffer based on available memory and configured batch parameters
  • Drop-oldest overflow — during extended outages, the most recent data is always preserved
  • Dual-format support — JSON for commissioning, binary for production deployments, configurable per device
  • Connection-aware delivery — the buffer begins draining immediately when the MQTT connection comes back up, with sequential delivery confirmation via QoS 1 PUBACKs

For multi-machine deployments on cellular gateways, the binary format combined with batch-and-forward typically reduces bandwidth consumption by 70–80% compared to per-tag JSON publishing — which translates directly to lower cellular data costs.

Key Takeaways

  1. MQTT QoS doesn't replace store-and-forward. QoS handles delivery within a session. Store-and-forward handles persistence across disconnections.

  2. Use a paged memory pool. Fixed-size pages with three states (FREE/WORK/USED) give you predictable memory usage and simple overflow handling.

  3. One message at a time for delivery integrity. Sequential delivery with PUBACK confirmation guarantees ordering and makes the system easy to reason about.

  4. Drop oldest on overflow. In industrial monitoring, recent data is more valuable than historical data from the beginning of an outage.

  5. Finalize batches on both size and time. Size limits prevent memory bloat; time limits prevent stale data sitting in an incomplete batch.

  6. Thread safety is non-negotiable. The polling thread and MQTT callback thread both touch the buffer. A mutex with minimal critical sections keeps things safe without impacting throughput.

The store-and-forward buffer is the unsung hero of reliable industrial telemetry. It's not glamorous, it doesn't show up in marketing slides, but it's the component that determines whether your IIoT platform loses data at 2 AM on a Saturday when the cell tower goes down — or quietly holds everything until the connection comes back and delivers it all without anyone ever knowing there was a problem.

MQTT Store-and-Forward for IIoT: Building Bulletproof Edge-to-Cloud Pipelines [2026]

· 12 min read

Factory networks go down. Cellular modems lose signal. Cloud endpoints hit capacity limits. VPN tunnels drop for seconds or hours. And through all of it, your PLCs keep generating data that cannot be lost.

Store-and-forward buffering is the difference between an IIoT platform that works in lab demos and one that survives a real factory. This guide covers the engineering patterns — memory buffer design, connection watchdogs, batch queuing, and delivery confirmation — that keep telemetry flowing even when the network doesn't.

MQTT store-and-forward buffering for industrial IoT

Reliable Telemetry Delivery in IIoT: Page Buffers, Batch Finalization, and Disconnection Recovery [2026]

· 13 min read

Your edge gateway reads 200 tags from a PLC every second. The MQTT connection to your cloud broker drops for 3 minutes because someone bumped the cellular antenna. What happens to the 36,000 data points collected during the outage?

If your answer is "they're gone," you have a toy system, not an industrial one.

Reliable telemetry delivery is the hardest unsolved problem in most IIoT architectures. Everyone focuses on the protocol layer — Modbus reads, EtherNet/IP connections, OPC-UA subscriptions — but the real engineering is in what happens between reading a value and confirming it reached the cloud. This article breaks down the buffer architecture that makes zero-data-loss telemetry possible on resource-constrained edge hardware.

Reliable telemetry delivery buffer architecture

The Problem: Three Asynchronous Timelines

In any edge-to-cloud telemetry system, you're managing three independent timelines:

  1. PLC read cycle — Tags are read at fixed intervals (1s, 60s, etc.). This never stops. The PLC doesn't care if your cloud connection is down.

  2. Batch collection — Raw tag values are grouped into batches by timestamp and device. Batches accumulate until they hit a size limit or a timeout.

  3. MQTT delivery — Batches are published to the broker. The broker acknowledges receipt. At QoS 1, the MQTT library handles retransmission, but only if you give it data in the right form.

These three timelines run independently. The PLC read loop runs on a tight 1-second cycle. Batch finalization might happen every 30–60 seconds. MQTT delivery depends on network availability. If any one of these stalls, the others must keep running without data loss.

This is fundamentally a producer-consumer problem with a twist: the consumer (MQTT) can disappear for minutes at a time, and the producer (PLC reads) cannot slow down.

The Batch Layer: Grouping Values for Efficient Transport

Raw tag values are tiny — a temperature reading is 4 bytes, a boolean is 1 byte. Sending each value as an individual MQTT message would be absurdly wasteful. Instead, values are collected into batches — structured payloads that contain multiple timestamped readings from one or more devices.

Batch Structure

A batch is organized as a series of groups, where each group represents one polling cycle (one timestamp, one device):

Batch
├── Group 0: { timestamp: 1709284800, device_type: 5000, serial: 12345 }
│   ├── Value: { id: 2, values: [72.4] }   // Delivery Temp
│   ├── Value: { id: 3, values: [68.1] }   // Mold Temp
│   └── Value: { id: 5, values: [12.6] }   // Flow Value
├── Group 1: { timestamp: 1709284860, device_type: 5000, serial: 12345 }
│   ├── Value: { id: 2, values: [72.8] }
│   ├── Value: { id: 3, values: [68.3] }
│   └── Value: { id: 5, values: [12.4] }
└── ...

Dual-Format Encoding: JSON vs Binary

Production edge daemons typically support two encoding formats for batches, and the choice has massive implications for bandwidth:

JSON format:

{
  "groups": [
    {
      "ts": 1709284800,
      "device_type": 5000,
      "serial_number": 12345,
      "values": [
        {"id": 2, "values": [72.4]},
        {"id": 3, "values": [68.1]}
      ]
    }
  ]
}

Binary format (same data):

Header:  F7            (1 byte  - magic)
Groups:  00 00 00 01   (4 bytes - group count)
Group 0: 65 E1 9D C0   (4 bytes - timestamp: 1709284800)
         13 88         (2 bytes - device type: 5000)
         00 00 30 39   (4 bytes - serial number)
         00 00 00 02   (4 bytes - value count)
Value 0: 00 02         (2 bytes - tag id)
         00            (1 byte  - status: OK)
         01            (1 byte  - values count)
         04            (1 byte  - element size: 4 bytes)
         42 90 CC CD   (4 bytes - float 72.4)
Value 1: 00 03         (2 bytes - tag id)
         00            (1 byte  - status: OK)
         01            (1 byte  - values count)
         04            (1 byte  - element size: 4 bytes)
         42 88 33 33   (4 bytes - float 68.1)

The JSON version of this payload: ~120 bytes. The binary version: ~38 bytes. That's a 3.2x reduction — and on a metered cellular connection at $0.01/MB, that savings compounds quickly when you're transmitting every 30 seconds 24/7.

The binary format uses a simple TLV-like structure: magic byte, group count (big-endian uint32), then for each group: timestamp (uint32), device type (uint16), serial number (uint32), value count (uint32), then for each value: tag ID (uint16), status byte, value count, element size, and raw value bytes. No field names, no delimiters, no escaping — just packed binary data.

Batch Finalization Triggers

A batch should be finalized (sealed and queued for delivery) when either condition is met:

  1. Size limit exceeded — When the accumulated batch size exceeds a configured maximum (e.g., 500KB for JSON, or when the binary buffer is 90%+ full). The 90% threshold for binary avoids the edge case where the next value would overflow the buffer.

  2. Collection timeout expired — When elapsed time since the batch started exceeds a configured maximum (e.g., 60 seconds). This ensures data flows even during quiet periods with few value changes.

if (elapsed_seconds > max_collection_time) → finalize
if (batch_size > max_batch_size) → finalize

Both checks happen after every group is closed (after every polling cycle). This means finalization granularity is tied to your polling interval — if you poll every 1 second and your batch timeout is 60 seconds, each batch will contain roughly 60 groups.

The "Do Not Batch" Exception

Some values are too important to wait for batch finalization. Equipment alarms, pump state changes, emergency stops — these need to reach the cloud immediately. These tags are flagged as "do not batch" in the configuration.

When a do-not-batch tag changes value, it bypasses the normal batch pipeline entirely. A mini-batch is created on the spot — containing just that single value — and pushed directly to the outgoing buffer. This ensures sub-second cloud visibility for critical state changes, while bulk telemetry still benefits from batch efficiency.

Tag: "Pump Status"     interval: 1s     do_not_batch: true
Tag: "Heater Status"   interval: 1s     do_not_batch: true
Tag: "Delivery Temp"   interval: 60s    do_not_batch: false   ← normal batching
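The routing decision can be sketched as a small function. Treating the first-ever read of a do-not-batch tag as a change, and skipping unchanged critical tags, are assumptions of this sketch:

```c
#include <stdbool.h>

typedef struct {
    bool do_not_batch;
    bool has_last;
    int  last_value;
} tag_t;

typedef enum { ROUTE_BATCH, ROUTE_IMMEDIATE, ROUTE_SKIP } route_t;

/* Decide where a freshly read value goes, updating the tag's
 * last-seen state as a side effect. */
static route_t route_value(tag_t *t, int value)
{
    bool changed = !t->has_last || value != t->last_value;
    t->last_value = value;
    t->has_last   = true;
    if (!t->do_not_batch)
        return ROUTE_BATCH;            /* normal batching path           */
    return changed ? ROUTE_IMMEDIATE   /* mini-batch, sub-second pushout */
                   : ROUTE_SKIP;       /* unchanged critical tag         */
}
```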

The Buffer Layer: Surviving Disconnections

This is where most IIoT implementations fail. The batch layer produces data. The MQTT layer consumes it. But what sits between them? If it's just an in-memory queue, you'll lose everything on disconnect.

Page-Based Ring Buffer Architecture

The production-grade answer is a page-based ring buffer — a fixed-size memory region divided into equal-sized pages that cycle through three states:

States:
FREE → Available for writing
WORK → Currently being filled with batch data
USED → Filled, waiting for MQTT delivery

Lifecycle:
FREE → WORK (when first data is added)
WORK → USED (when page is full or batch is finalized)
USED → transmit → delivery ACK → FREE (recycled)

Here's how it works:

Memory layout: At startup, a contiguous block of memory is allocated (e.g., 2MB). This block is divided into pages of a configured size (matching the MQTT max packet size, typically matching the batch size). Each page has a small header tracking its state and a data area.

┌──────────────────────────────────────────────┐
│ [Page 0: USED] [Page 1: USED] [Page 2: WORK] │
│ [Page 3: FREE] [Page 4: FREE] [Page 5: FREE] │
│ [Page 6: FREE]      ...       [Page N: FREE] │
└──────────────────────────────────────────────┘

Writing data: When a batch is finalized, its serialized bytes are written to the current WORK page. Each message gets a small header: a 4-byte message ID slot (filled later by the MQTT library) and a 4-byte size field. If the current page can't fit the next message, it transitions to USED and a fresh FREE page becomes the new WORK page.

Overflow handling: When all FREE pages are exhausted, the buffer reclaims the oldest USED page — the one that's been waiting for delivery the longest. This means you lose old data rather than new data, which is the right trade-off: the most recent readings are the most valuable. An overflow warning is logged so operators know the buffer is under pressure.

Delivery: When the MQTT connection is active, the buffer walks through USED pages and publishes their contents. Each publish gets a packet ID from the MQTT library. When the broker ACKs the packet (via the PUBACK callback for QoS 1), the corresponding page is recycled to FREE.

Disconnection recovery: When the MQTT connection drops:

  1. The disconnect callback fires
  2. The buffer marks itself as disconnected
  3. Data continues accumulating in pages (WORK → USED)
  4. When reconnected, the buffer immediately starts draining USED pages

No data is lost unless the buffer physically overflows. With 2MB of buffer and 500KB pages, that's 4 pages of queue capacity — enough to survive several minutes of disconnection at typical telemetry rates.

Thread Safety

The PLC read loop and the MQTT event loop run on different threads. The buffer must be thread-safe. Every buffer operation acquires a mutex:

  • buffer_add_data() — called from the PLC read thread after batch finalization
  • buffer_process_data_delivered() — called from the MQTT callback thread on PUBACK
  • buffer_process_connect() / buffer_process_disconnect() — called from MQTT lifecycle callbacks

Without proper locking, you'll see corrupted pages, double-free crashes, and mysterious data loss under load. This is non-negotiable.

Sizing the Buffer

Buffer sizing depends on three variables:

  1. Data rate: How many bytes per second does your polling loop produce?
  2. Expected outage duration: How long do you need to survive without MQTT?
  3. Available memory: Edge devices (especially industrial routers) have limited RAM

Example calculation:

  • 200 tags, average 6 bytes each (including binary overhead) = 1,200 bytes/group
  • Polling every 1 second = 1,200 bytes/second = 72KB/minute
  • Target: survive 30-minute outage = 2.16MB buffer
  • With 500KB pages = 5 pages minimum (round up for safety)

In practice, 2–4MB covers most scenarios. On a 32MB industrial router, that's well within budget.

The MQTT Layer: QoS, Reconnection, and Watchdogs

QoS 1: At-Least-Once Delivery

For industrial telemetry, QoS 1 is the right choice:

  • QoS 0 (fire and forget): No delivery guarantee. Unacceptable for production data.
  • QoS 1 (at least once): Broker ACKs every message. Duplicates possible but data loss prevented. Good trade-off.
  • QoS 2 (exactly once): Eliminates duplicates but doubles the handshake overhead. Rarely worth it for telemetry.

The page buffer's recycling logic depends on QoS 1: pages are only freed when the PUBACK arrives. If the ACK never comes (connection drops mid-transmission), the page stays in USED state and will be retransmitted after reconnection.

Connection Watchdog

MQTT connections can enter a zombie state — the TCP socket is open, the MQTT loop is running, but no data is actually flowing. This happens when network routing changes, firewalls silently drop the connection, or the broker becomes unresponsive.

The fix: a watchdog timer that monitors delivery acknowledgments. If no PUBACK has been received within a timeout window (e.g., 120 seconds) and data has been queued for transmission, force a reconnect:

if (now - last_delivered_packet_time > 120s) {
    if (has_pending_data) {
        // Force MQTT reconnection
        reset_mqtt_client();
    }
}

This catches the edge case where the MQTT library thinks it's connected but the network is actually dead. Without this watchdog, your edge daemon could silently accumulate hours of undelivered data in the buffer, eventually overflowing and losing it all.

Asynchronous Connection

MQTT connection establishment (DNS resolution, TLS handshake, CONNACK) can take several seconds, especially over cellular links. This must not block the PLC read loop. The connection should happen on a separate thread:

  1. Main thread detects connection is needed
  2. Connection thread starts connect_async()
  3. Main thread continues reading PLCs
  4. On successful connect, the callback fires and buffer delivery begins

If the connection thread is still working when a new connection attempt is needed, skip it — don't queue multiple connection attempts or you'll thrash the network stack.
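The skip-if-busy rule reduces to a small guard around a "connecting" flag. In a real daemon the flag would be cleared by the connect completion callback and protected by a mutex; this sketch omits both:

```c
#include <stdbool.h>

typedef struct {
    bool connected;
    bool connecting;   /* a connect attempt is already in flight */
} conn_state_t;

/* Returns true if a new connect attempt should be launched now.
 * Never queues overlapping attempts. */
static bool should_start_connect(conn_state_t *s)
{
    if (s->connected || s->connecting)
        return false;
    s->connecting = true;  /* claimed; cleared by the connect callback */
    return true;
}
```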

TLS for Production

Any MQTT connection leaving your plant network must use TLS. Period. Industrial telemetry data — temperatures, pressures, equipment states, alarm conditions — is operationally sensitive. On the wire without encryption, anyone on the network path can see (and potentially modify) your readings.

For cloud brokers like Azure IoT Hub, TLS is mandatory. The edge daemon should:

  • Load the CA certificate from a PEM file
  • Use MQTT v3.1.1 protocol (widely supported, well-tested)
  • Monitor the SAS token expiration timestamp and alert before it expires
  • Automatically reinitialize the MQTT client when the certificate or connection string changes (file modification detected via stat())

Daemon Status Reporting

A well-designed edge daemon reports its own health back through the same MQTT channel it uses for telemetry. A periodic status message should include:

  • System uptime and daemon uptime — detect restarts
  • PLC link state — is the PLC connection healthy?
  • Buffer state — how full is the outgoing buffer?
  • MQTT state — connected/disconnected, last ACK time
  • SAS token expiration — days until credentials expire
  • Software version — for remote fleet management

An extended status format can include per-tag state: last read time, last delivery time, current value, and error count. This is invaluable for remote troubleshooting — you can see from the cloud exactly which tags are stale and why.
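As a rough illustration, a status payload covering the fields above might look like this — all field names and values here are hypothetical, not a documented schema:

```json
{
  "uptime_s": 864000,
  "daemon_uptime_s": 432000,
  "plc_link": "ok",
  "buffer_used_pct": 12,
  "mqtt": { "connected": true, "last_ack_s_ago": 4 },
  "sas_token_days_left": 21,
  "version": "1.4.2"
}
```

Keeping the payload flat and numeric makes it trivial to chart fleet health in whatever dashboard sits behind the broker.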

Value Comparison and Change Detection

Not all values need to be sent every polling cycle. A reading that's been 72.4°F for the last hour may not need to be transmitted 3,600 times. Change detection — comparing the current value to the last sent value — can dramatically reduce bandwidth.

The implementation: each tag stores its last transmitted value. After reading, compare:

if (tag.compare_enabled && tag.has_been_read_once) {
    if (current_value == tag.last_value) {
        skip_this_value(); // Don't add to batch
    }
}

Important caveats:

  • Not all tags should use comparison. Continuous process variables (temperatures, flows) should always send, even if unchanged — the recipient needs the full time series to calculate trends and detect flatlines (a stuck sensor reads the same value forever, which is itself a fault condition).
  • Discrete state tags (booleans, enums) are ideal for comparison — they change rarely and each change is significant.
  • Floating-point comparison should use an epsilon threshold, not exact equality, to avoid sending noise from ADC jitter.
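The epsilon comparison from the last caveat is a one-liner; a minimal sketch, where the deadband value is illustrative and would be configured per tag:

```c
#include <math.h>
#include <stdbool.h>

/* Treat two readings as equal when they differ by less than a per-tag
 * deadband, so ADC jitter does not defeat change detection. */
static bool value_unchanged(double current, double last, double deadband) {
    return fabs(current - last) < deadband;
}
```

Choose the deadband from the sensor's real resolution (e.g. one ADC step), not an arbitrary small number, or you'll either suppress real changes or pass jitter through.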

Putting It All Together: The Main Loop

The complete edge daemon main loop ties all these layers together:

1. Parse configuration (device addresses, tag lists, MQTT credentials)
2. Allocate memory (PLC config pool + output buffer)
3. Format output buffer into pages
4. Start MQTT connection thread
5. Detect PLC device (probe address, determine type/protocol)
6. Load device-specific tag configuration

MAIN LOOP (runs every 1 second):
a. Check for config file changes → restart if changed
b. Read PLC tags (coalesced Modbus/EtherNet/IP)
c. Add values to batch (with comparison filtering)
d. Check batch finalization triggers (size/timeout)
e. Process incoming commands (config updates, force reads)
f. Check MQTT connection watchdog
g. Sleep 1 second

Every component — polling, batching, buffering, delivery — operates within this single loop iteration, keeping the system deterministic and debuggable.
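The loop outline above maps onto a straightforward single-iteration function. This is a structural sketch only — every step is a stub and all function names are illustrative placeholders:

```c
#include <stdbool.h>

static int steps_run;  /* instrumentation for the sketch only */

static bool config_changed(void)       { return false; }
static void read_plc_tags(void)        { steps_run++; }
static void add_values_to_batch(void)  { steps_run++; }
static void check_batch_triggers(void) { steps_run++; }
static void process_commands(void)     { steps_run++; }
static void check_mqtt_watchdog(void)  { steps_run++; }

/* One pass of the main loop. Returns false when the daemon should
 * restart to pick up a changed configuration. */
static bool run_iteration(void) {
    if (config_changed())
        return false;          /* a: restart on config change */
    read_plc_tags();           /* b: coalesced Modbus/EtherNet/IP reads */
    add_values_to_batch();     /* c: with comparison filtering */
    check_batch_triggers();    /* d: size/timeout finalization */
    process_commands();        /* e: config updates, force reads */
    check_mqtt_watchdog();     /* f: zombie-connection check */
    return true;               /* g: caller sleeps 1 s, then repeats */
}
```

Because each iteration runs the same fixed sequence on one thread (only the MQTT connect happens elsewhere), the timing budget is easy to reason about and a stall is trivially attributable to one step.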

How machineCDN Implements This

The machineCDN edge runtime implements this full stack natively on resource-constrained industrial routers. The page-based ring buffer runs in pre-allocated memory (no dynamic allocation after startup), the MQTT layer handles Azure IoT Hub and local broker configurations interchangeably, and the batch layer supports both JSON and binary encoding selectable per-device.

On a Teltonika RUT9xx router with 256MB RAM, the daemon typically uses under 4MB total — including 2MB of buffer space that can store 20+ minutes of telemetry during a connectivity outage. Tags are automatically sorted, coalesced, and dispatched with zero configuration beyond listing the tag names and addresses.

The result: edge gateways that have been running continuously for years in production environments, surviving cellular dropouts, network reconfigurations, and even firmware updates without losing a single data point.

Conclusion

Reliable telemetry delivery isn't about the protocol — it's about the pipeline. Modbus reads are the easy part. The hard engineering is in the layers between: batching values efficiently, buffering them through disconnections, and confirming delivery before recycling memory.

The key design principles:

  1. Never block the read loop — PLC polling is sacred
  2. Buffer with finite, pre-allocated memory — dynamic allocation on embedded systems is asking for trouble
  3. Reclaim oldest data first — in overflow, recent values matter more
  4. Acknowledge before recycling — a page stays USED until the broker confirms receipt
  5. Watch for zombie connections — a connected socket doesn't mean data is flowing

Get these right, and your edge infrastructure becomes invisible — which is exactly what production IIoT should be.