Edge Computing Architectures for IIoT: Store-and-Forward, Local Processing, and Bandwidth Optimization [2026]
Here's an uncomfortable truth about Industrial IoT: the factory floor doesn't care about your cloud architecture. PLCs don't pause production because your MQTT broker is restarting. Cellular connections drop. Ethernet switches fail. And through all of it, sensor data keeps flowing at 1-second intervals — either you capture it, or it's gone forever.
Edge computing in IIoT isn't about running machine learning models on Raspberry Pis. It's about building a reliable data pipeline between deterministic control systems and non-deterministic cloud infrastructure. The gap between those two worlds is where the real engineering happens.
This guide covers the architectural patterns that make industrial edge computing work: page-based store-and-forward buffering, connection resilience, bandwidth-aware data transport, and the design decisions that separate production-grade systems from demo-day prototypes.

The Edge Gateway: More Than a Protocol Translator
The simplest mental model of an edge gateway is "read from PLC, send to cloud." But production edge gateways handle a staggering amount of complexity between those two steps:
- Protocol detection — Auto-detect whether the connected device speaks EtherNet/IP, Modbus TCP, or Modbus RTU
- Device identification — Read device type codes and serial numbers to load the correct configuration
- Tag polling — Continuously read configured data points at device-specific intervals
- Change detection — Compare values against previous readings to suppress redundant data
- Data batching — Accumulate readings into efficiently-packed payloads
- Store-and-forward — Buffer data locally when cloud connectivity is lost
- Reliable delivery — Guarantee data reaches the cloud at least once, in order (with QoS 1, duplicates are possible on reconnect, but loss is not)
- Remote configuration — Accept configuration updates from the cloud without requiring physical access
Each of these stages has failure modes that must be handled without losing data or disrupting production. Let's dig into the critical ones.
Store-and-Forward: The Page Buffer Architecture
The most important component in any edge gateway is its store-and-forward buffer. This is the mechanism that decouples data acquisition from data transmission — ensuring that sensor readings survive connectivity outages.
Why Ring Buffers Aren't Enough
The naive approach is a simple circular buffer: write data at the head, read from the tail, overwrite old data when full. This fails in industrial contexts for several reasons:
- Message boundaries: Industrial payloads are variable-length (a batch might be 200 bytes or 3,000 bytes). Fixed-size ring buffer slots either waste memory or truncate messages.
- Delivery confirmation: You can't move the read pointer until MQTT confirms delivery (via the PUBACK in QoS 1). Ring buffers don't naturally support this.
- Concurrent access: The data acquisition thread writes continuously while the MQTT thread reads and publishes asynchronously. Lock contention becomes a bottleneck.
The Page-Based Buffer
A production-grade approach uses a page-based buffer with three pools:
┌─────────────────────────────────────────────┐
│              Fixed Memory Block             │
│    (e.g., 2 MB pre-allocated at startup)    │
├──────────┬──────────┬──────────┬────────────┤
│  Page 0  │  Page 1  │  Page 2  │  Page 3... │
│  (4 KB)  │  (4 KB)  │  (4 KB)  │  (4 KB)    │
└──────────┴──────────┴──────────┴────────────┘
Three pools:
FREE ──→ Pages available for writing
WORK ──→ Currently being filled with data
USED ──→ Full, queued for delivery
Lifecycle of a page:
- Free → Work: When data arrives and no work page exists, grab one from the free pool
- Work (accumulating): Multiple messages are packed into the page sequentially, each prefixed with a 4-byte message ID placeholder and 4-byte length
- Work → Used: When the page is full (next message wouldn't fit), move it to the used queue
- Used → Delivering: Read messages one at a time from the used page, publish via MQTT
- Delivering → Delivered: When MQTT confirms delivery (on_publish callback with matching packet ID), advance the read pointer
- Used → Free: When all messages on a page are delivered, move the page back to the free pool
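The lifecycle above can be sketched in C. This is a minimal illustration of the three-pool bookkeeping, not the actual daemon's implementation; the page count is an assumption, while the 4 KB page size and the 4-byte ID placeholder plus 4-byte length framing follow the description above.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

#define PAGE_SIZE  4096
#define NUM_PAGES  8   /* illustrative; real pools are sized from the 2 MB block */

typedef enum { POOL_FREE, POOL_WORK, POOL_USED } pool_t;

typedef struct {
    uint8_t data[PAGE_SIZE];
    size_t  write_off;   /* next byte to write */
    size_t  read_off;    /* next byte to deliver */
    pool_t  pool;
} page_t;

page_t pages[NUM_PAGES];

void buf_init(void) {
    for (int i = 0; i < NUM_PAGES; i++) {
        pages[i].write_off = pages[i].read_off = 0;
        pages[i].pool = POOL_FREE;
    }
}

/* Find the current work page, or promote a free page to work. */
static page_t *get_work_page(void) {
    for (int i = 0; i < NUM_PAGES; i++)
        if (pages[i].pool == POOL_WORK) return &pages[i];
    for (int i = 0; i < NUM_PAGES; i++)
        if (pages[i].pool == POOL_FREE) {
            pages[i].pool = POOL_WORK;
            pages[i].write_off = pages[i].read_off = 0;
            return &pages[i];
        }
    return NULL;   /* overflow: no free pages left */
}

/* Append one message: 4-byte ID placeholder + 4-byte length + payload. */
int buf_write(const void *msg, uint32_t len) {
    if (8 + (size_t)len > PAGE_SIZE) return -1;   /* message can never fit */
    page_t *p = get_work_page();
    if (!p) return -1;
    if (p->write_off + 8 + len > PAGE_SIZE) {
        p->pool = POOL_USED;   /* page full: queue it for delivery */
        p = get_work_page();
        if (!p) return -1;
    }
    uint32_t id = 0;           /* filled in when the message is actually sent */
    memcpy(p->data + p->write_off,     &id,  4);
    memcpy(p->data + p->write_off + 4, &len, 4);
    memcpy(p->data + p->write_off + 8, msg,  len);
    p->write_off += 8 + len;
    return 0;
}
```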
The Overflow Strategy
What happens when the free pool is empty — all pages are either being written or awaiting delivery?
You have two choices, and neither is great:
- Drop new data: Preserve older data, lose current readings. Acceptable if historical data is more valuable (rare in industrial contexts).
- Sacrifice the oldest used page: Reclaim the oldest undelivered page for new writes. You lose some historical data, but current readings are preserved.
Option 2 is almost always correct for industrial telemetry. Current production data has higher operational value than readings from 10 minutes ago that haven't been delivered yet. The system should log a warning when this overflow occurs — it indicates the connectivity outage is severe enough to cause data loss, which may warrant an alert.
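Option 2 reduces to a linear scan, assuming (for illustration) that each page records a monotonically increasing sequence number when it enters the used queue:

```c
#include <stdio.h>
#include <stdint.h>

typedef struct {
    uint32_t seq;      /* order in which the page entered the used queue */
    int      in_use;   /* 1 = awaiting delivery, 0 = free */
} page_meta_t;

/* Reclaim the oldest undelivered page; returns its index, or -1 if none.
   The data on that page is lost, so the overflow is logged loudly. */
int reclaim_oldest(page_meta_t *m, int n) {
    int oldest = -1;
    for (int i = 0; i < n; i++)
        if (m[i].in_use && (oldest < 0 || m[i].seq < m[oldest].seq))
            oldest = i;
    if (oldest >= 0) {
        fprintf(stderr, "WARN: buffer overflow, dropping page seq=%u\n",
                m[oldest].seq);
        m[oldest].in_use = 0;
    }
    return oldest;
}
```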
Thread Safety
The buffer must be thread-safe, because the PLC reading loop and the MQTT delivery loop run concurrently. Mutex-based locking around buffer operations is the pragmatic choice for embedded Linux gateways:
- Lock on write: Acquire mutex, add data to work page (potentially promoting it to used), attempt to send next queued message, release mutex
- Lock on delivery confirmation: Acquire mutex, advance read pointer, potentially free the page, attempt to send next message, release mutex
- Lock on disconnect: Acquire mutex, mark buffer as disconnected, clear the "packet in flight" flag, release mutex
The key insight is that the send attempt happens inside both the write and delivery-confirmation paths. This ensures data flows out as fast as the MQTT connection allows, without needing a separate send timer.
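A minimal sketch of that locking discipline using POSIX threads. Here try_send_next() stands in for the real MQTT publish call, and the counters are illustrative bookkeeping, not the product's actual state:

```c
#include <pthread.h>

pthread_mutex_t buf_lock = PTHREAD_MUTEX_INITIALIZER;
int packet_in_flight = 0;   /* one unacknowledged QoS 1 packet at a time */
int pending_messages = 0;
int sent_messages    = 0;

/* Called with buf_lock held, from both paths below. */
static void try_send_next(void) {
    if (!packet_in_flight && pending_messages > 0)
        packet_in_flight = 1;   /* real code would publish via MQTT here */
}

/* Data-acquisition thread: queue a message, then kick the sender. */
void on_new_data(void) {
    pthread_mutex_lock(&buf_lock);
    pending_messages++;
    try_send_next();
    pthread_mutex_unlock(&buf_lock);
}

/* MQTT thread: delivery confirmed, advance and kick the sender again. */
void on_delivery_confirmed(void) {
    pthread_mutex_lock(&buf_lock);
    packet_in_flight = 0;
    pending_messages--;
    sent_messages++;
    try_send_next();
    pthread_mutex_unlock(&buf_lock);
}
```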
MQTT Transport: Beyond Hello World
Most MQTT tutorials cover connect → publish → disconnect. Industrial MQTT requires handling about 15 additional failure modes that those tutorials never mention.
Connection Lifecycle Management
An industrial MQTT client must handle:
- Initial connection: Often via TLS with certificate pinning (Azure IoT Hub, AWS IoT Core). Connection string parsing, SAS token extraction, certificate validation.
- Async connection: The DNS resolution and TLS handshake can take seconds on cellular networks. Blocking the main loop is unacceptable — use
connect_asyncin a separate thread. - Automatic reconnection: When the connection drops, the client should retry with a fixed delay (e.g., 5 seconds). Exponential backoff sounds sophisticated but introduces unnecessary complexity for dedicated M2M connections.
- Subscription on connect: Subscribe to the device-specific command topic immediately after connection succeeds (in the
on_connectcallback), not before. - Watchdog monitoring: If no data has been published or acknowledged for a configurable timeout (e.g., 120 seconds), force-reconnect the MQTT client. This catches silent disconnections that don't trigger the
on_disconnectcallback.
QoS 1: Exactly Once Delivery (Almost)
For industrial telemetry, MQTT QoS 1 is the sweet spot:
- QoS 0 (fire and forget): Unacceptable — you'll silently lose data during network blips
- QoS 1 (at least once): The broker acknowledges receipt. May produce duplicates on reconnection, but duplicates are far better than data loss
- QoS 2 (exactly once): 4-packet handshake per message. The latency and complexity overhead is unjustifiable for sensor telemetry
The practical architecture: publish with QoS 1, use the on_publish callback with the matching packet ID to confirm delivery, and only advance the buffer read pointer after confirmation.
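The confirmation path reduces to packet-ID matching. The function names below are illustrative, not any particular MQTT library's API; the point is that the read pointer only moves when the acknowledged ID matches the packet in flight:

```c
#include <stdint.h>

uint16_t inflight_mid = 0;   /* 0 = nothing in flight */
uint32_t read_pointer = 0;   /* byte offset of next unconfirmed message */

/* Record the packet ID the MQTT library assigned to our publish. */
void on_message_published(uint16_t mid) {
    inflight_mid = mid;
}

/* Called from the library's publish-acknowledged callback.
   Only a matching ID advances the buffer read pointer. */
int on_puback(uint16_t mid, uint32_t msg_len) {
    if (mid != inflight_mid)
        return -1;             /* stale or unknown ack: ignore it */
    read_pointer += msg_len;   /* message is safely at the broker */
    inflight_mid = 0;
    return 0;
}
```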
Token Expiration Monitoring
Cloud IoT platforms use time-limited authentication tokens (SAS tokens for Azure IoT Hub, JWT for Google Cloud IoT). The edge gateway must:
- Parse the expiration timestamp from the token at startup
- Compare against the device's current time
- Log a warning if the token is expired or approaching expiration
- Ideally, request a token refresh before expiration — but many constrained devices rely on periodic manual token rotation
This is a mundane but critical detail. Expired tokens cause silent connection failures that are extremely difficult to diagnose remotely.
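For Azure-style SAS tokens, the expiry travels in the se= field as Unix seconds, so the startup check is a string scan plus a comparison. The 7-day warning threshold below is an assumption for illustration:

```c
#include <stdlib.h>
#include <string.h>
#include <time.h>

typedef enum { TOKEN_OK, TOKEN_EXPIRING, TOKEN_EXPIRED, TOKEN_INVALID } token_state_t;

/* Classify a SAS token's 'se=' expiry field against the current time.
   A more careful parser would split on '&' to avoid substring collisions. */
token_state_t check_sas_expiry(const char *token, time_t now) {
    const char *se = strstr(token, "se=");
    if (!se) return TOKEN_INVALID;
    long expiry = strtol(se + 3, NULL, 10);
    if (expiry <= 0)   return TOKEN_INVALID;
    if (expiry <= now) return TOKEN_EXPIRED;
    if (expiry - now < 7 * 24 * 3600)
        return TOKEN_EXPIRING;   /* log a warning: rotation due soon */
    return TOKEN_OK;
}
```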
Bandwidth Optimization Strategies
Industrial cellular connections (4G/LTE on Teltonika RUT-series routers, Cradlepoint, Sierra Wireless) typically have data caps ranging from 1 GB to 10 GB per month. A naive implementation that publishes every sensor reading as a separate JSON message can burn through 10 GB in days.
Binary vs. JSON: A 5x Difference
Consider a typical sensor reading payload:
JSON format (102 bytes):
{"ts":1709136000,"type":1010,"serial":12345,"values":[{"id":2,"value":55}]}
Binary format (20 bytes):
F7 00000001 60060A93 03F2 00003039 00000001 0002 00 0100 0037
That's a 5x reduction for the same information. Over a month of readings at 1-second intervals, that difference is:
- JSON: ~260 MB/month
- Binary: ~52 MB/month
For cellular-connected devices, binary packing isn't an optimization — it's a requirement.
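A sketch of what binary packing looks like in practice. The field layout here is illustrative, not the exact wire format shown above; the point is fixed-width big-endian fields in place of JSON text:

```c
#include <stdint.h>
#include <stddef.h>

static void put_u16(uint8_t *p, uint16_t v) {
    p[0] = v >> 8; p[1] = v & 0xFF;
}
static void put_u32(uint8_t *p, uint32_t v) {
    p[0] = v >> 24; p[1] = (v >> 16) & 0xFF;
    p[2] = (v >> 8) & 0xFF; p[3] = v & 0xFF;
}

/* Pack timestamp, device type, serial, tag ID, and value into 16 bytes
   (vs. ~100 bytes for the equivalent JSON object). */
size_t pack_reading(uint8_t *out, uint32_t ts, uint16_t dev_type,
                    uint32_t serial, uint16_t tag_id, uint32_t value) {
    put_u32(out,      ts);
    put_u16(out + 4,  dev_type);
    put_u32(out + 6,  serial);
    put_u16(out + 10, tag_id);
    put_u32(out + 12, value);
    return 16;
}
```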
Intelligent Batching
Beyond binary packing, batching multiple readings into a single MQTT message reduces overhead from MQTT framing, TLS record headers, and TCP acknowledgments:
| Strategy | Messages/hour | Bytes/hour | MQTT overhead |
|---|---|---|---|
| Individual readings (1/sec) | 3,600 | ~360 KB | ~180 KB |
| Time-batched (60s window) | 60 | ~72 KB | ~3 KB |
| Size-batched (4 KB limit) | ~18 | ~72 KB | ~1 KB |
Using both time and size limits together provides the best behavior:
- During active production (many tag changes): batches fill and flush based on size limit
- During idle periods (few changes): the time limit ensures data doesn't sit in the buffer indefinitely
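The dual-limit logic fits in a single predicate. The 4 KB and 60-second thresholds mirror the example values in the table above:

```c
#include <stdint.h>

#define BATCH_MAX_BYTES  4096
#define BATCH_MAX_AGE_S  60

/* Decide whether to flush the current batch.
   next_len: size of the message about to be added (0 if just a timer check). */
int should_flush(uint32_t batch_bytes, uint32_t next_len, uint32_t batch_age_s) {
    if (batch_bytes > 0 && batch_bytes + next_len > BATCH_MAX_BYTES)
        return 1;   /* size limit: the next message wouldn't fit */
    if (batch_bytes > 0 && batch_age_s >= BATCH_MAX_AGE_S)
        return 1;   /* time limit: idle data must not sit indefinitely */
    return 0;       /* empty batches never flush */
}
```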
Change-Only Transmission
The highest-impact bandwidth optimization is simply not sending data that hasn't changed. A compare=true flag on stable configuration tags (device type, firmware version, serial number) means those values are only transmitted once — on first read or when they actually change.
For a typical device with 40 tags where 30 are configuration/status values that rarely change, this reduces steady-state bandwidth by 75%.
But pure change-detection has a reliability gap: if a single reading is lost, the cloud side has stale data until the value changes again. The solution is a periodic full refresh — force-read and transmit all tags once per hour, regardless of whether they've changed. This bounds the staleness window to 60 minutes maximum.
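Change suppression plus the hourly refresh can be expressed per tag. The bookkeeping struct is illustrative; only the one-hour refresh period comes from the text:

```c
#include <stdint.h>

#define REFRESH_PERIOD_S (60 * 60)   /* bound staleness to one hour */

typedef struct {
    uint32_t last_value;
    uint32_t last_sent_at;   /* Unix seconds of last transmission */
    int      ever_sent;
} tag_state_t;

/* Returns 1 if this reading should be transmitted:
   first read, changed value, or refresh period elapsed. */
int should_send(tag_state_t *t, uint32_t value, uint32_t now) {
    if (!t->ever_sent || value != t->last_value ||
        now - t->last_sent_at >= REFRESH_PERIOD_S) {
        t->last_value   = value;
        t->last_sent_at = now;
        t->ever_sent    = 1;
        return 1;
    }
    return 0;
}
```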
Remote Configuration: Closing the Loop
A truly useful edge computing architecture isn't just a one-way data pipe. The cloud side needs to push configuration updates back to the edge — new tag definitions, adjusted polling intervals, updated firmware parameters — without requiring a truck roll.
Configuration Hot-Reload
The edge daemon should monitor its configuration files for changes (via stat() file modification timestamps). When a configuration change is detected:
- Parse and validate the new configuration
- Tear down existing PLC connections cleanly
- Rebuild the device context with the new parameters
- Resume data acquisition with the updated tag list
Critically, this must happen without restarting the daemon process. A restart means a gap in data acquisition, which means missed production events.
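A minimal POSIX sketch of the mtime check that drives the hot-reload sequence above (a real daemon would also validate the new file before tearing anything down):

```c
#include <sys/stat.h>
#include <time.h>
#include <stdio.h>

/* Poll the config file's modification time. Returns 1 when it changed
   since *last_mtime, updating the stored timestamp. */
int config_changed(const char *path, time_t *last_mtime) {
    struct stat st;
    if (stat(path, &st) != 0)
        return 0;   /* file missing or unreadable: keep current config */
    if (st.st_mtime != *last_mtime) {
        *last_mtime = st.st_mtime;
        return 1;
    }
    return 0;
}
```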
Cloud-to-Edge Commands via MQTT
The MQTT subscription channel enables bidirectional communication. Common cloud-to-edge commands:
- daemon_config: Update the central device configuration (IP addresses, serial ports, batch parameters)
- device_config: Update a specific PLC's tag definitions (add/remove/modify tags)
- tag_update: Modify the polling interval of a single tag at runtime (e.g., increase frequency during a diagnostic window)
- read_now: Trigger an immediate read of a specific tag, bypassing the normal interval schedule
- get_status: Request the current daemon status (uptime, connection states, tag health)
Each command is delivered as a JSON message on the device-specific MQTT topic. The edge daemon parses the command, executes it, and (for configuration updates) persists the change to the local filesystem so it survives reboots.
Device Detection and Auto-Configuration
In environments with diverse equipment, the edge gateway must auto-detect what's connected and load the appropriate configuration.
The Detection Sequence
A practical detection sequence for a multi-protocol gateway:
- Try EtherNet/IP first: Attempt to read a device_type tag from the configured IP address using the CIP protocol. If successful, you have an Allen-Bradley PLC.
- Fall back to Modbus TCP: Connect to the configured IP and port (default 502). Read input register 800 to get the device type code.
- Identify the specific model: Map the device type code (e.g., 1010 = Batch Blender, 1017 = Portable Chiller, 1018 = Central Chiller) to the correct configuration file.
- Read serial number: Each device type stores its serial number in different registers (the chiller stores year/month/unit across three holding registers at addresses 500/510/520, while the blender exposes them as named EtherNet/IP tags).
- Load configuration: Find and parse the JSON configuration file that matches the detected device type.
- Validate and start: Verify the configuration is internally consistent, then begin the polling loop.
If detection fails, the daemon continues retrying periodically rather than crashing. The PLC may not be powered up yet, or the network cable may be disconnected temporarily. Patience is a feature.
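Step 3 of the sequence (mapping the device type code to a configuration file) might look like this; the codes come from the text, while the file names are hypothetical:

```c
#include <stddef.h>
#include <string.h>

/* Map a Modbus device type code to its configuration file.
   NULL tells the caller to keep retrying detection. */
const char *config_for_device_type(int type_code) {
    switch (type_code) {
    case 1010: return "batch_blender.json";
    case 1017: return "portable_chiller.json";
    case 1018: return "central_chiller.json";
    default:   return NULL;   /* unknown device: no config to load */
    }
}
```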
Hardware Platform Considerations
Edge computing hardware for IIoT falls into three tiers:
Tier 1: Industrial Cellular Routers (OpenWRT)
- Examples: Teltonika RUT955, RUT950
- CPU: MIPS or ARM, ~580 MHz
- RAM: 128 MB
- Storage: 16 MB flash
- Connectivity: 4G/LTE cellular + Ethernet + RS-232/485
- Constraints: Extremely limited memory and storage. Binary-only payloads. No room for scripting languages — C is the only practical choice.
These are the workhorses of remote industrial monitoring. The edge daemon must be compiled specifically for the target architecture (cross-compilation via the device SDK), and every byte of memory matters.
Tier 2: Industrial PCs and Panels
- Examples: Siemens IPC, Advantech ADAM, Beckhoff
- CPU: x86 or ARM Cortex-A series
- RAM: 2–8 GB
- Connectivity: Multiple Ethernet, serial, sometimes fieldbus
- Constraints: More capable, but typically shared with HMI or SCADA software. The edge daemon runs as one process among many.
Tier 3: Cloud Gateways
- Examples: AWS IoT Greengrass on any Linux box
- CPU/RAM: Flexible
- Constraints: Primarily software constraints — latency to the actual devices, container overhead, network configuration.
machineCDN targets all three tiers, with particular strength in Tier 1 deployments where the combination of C-based efficiency, binary data packing, and page-based buffering delivers reliable data acquisition on hardware that costs under $300 per site.
Failure Mode Analysis
Every component in the edge architecture has failure modes. The system must degrade gracefully:
| Failure | Impact | Recovery |
|---|---|---|
| PLC communication lost | Tag reads return error status | Retry up to 3 times, then report link-down. Resume automatically when PLC responds. |
| Serial port error (Modbus RTU) | ETIMEDOUT, ECONNRESET, EPIPE | Close port, reconnect on next poll cycle |
| MQTT broker unreachable | Data accumulates in page buffer | Auto-reconnect every 5 seconds. Buffer overflows if outage exceeds buffer capacity. |
| MQTT token expired | Connection rejected | Log warning. Requires manual token rotation (or automated renewal if supported). |
| Configuration file corrupt | Daemon can't load tag definitions | Continue running with last known good config. Report status error to cloud. |
| Memory exhaustion | Buffer allocation fails | Pre-allocate all memory at startup. No dynamic allocation during runtime. |
The most critical design principle: pre-allocate everything at startup. An edge daemon that calls malloc() during steady-state operation will eventually fail due to memory fragmentation on constrained devices. Allocate the PLC configuration memory (1 MB), the output buffer (2 MB), and all tag definitions in one shot at startup.
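One way to enforce that rule is a startup-only bump allocator over a single pre-allocated block. The arena size below is an assumption loosely sized from the article's 1 MB + 2 MB examples:

```c
#include <stddef.h>
#include <stdint.h>

#define ARENA_SIZE (3u * 1024 * 1024)   /* config + buffer + tag definitions */

static uint8_t arena[ARENA_SIZE];
static size_t  arena_used   = 0;
static int     startup_done = 0;

/* Carve n bytes (8-byte aligned) from the arena. Refuses any request
   once startup is complete, so no allocation can happen at runtime. */
void *startup_alloc(size_t n) {
    n = (n + 7) & ~(size_t)7;
    if (startup_done || arena_used + n > ARENA_SIZE)
        return NULL;
    void *p = arena + arena_used;
    arena_used += n;
    return p;
}

void startup_complete(void) { startup_done = 1; }
```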
Real-World Performance Numbers
Based on production deployments monitoring plastics auxiliary equipment:
- Tag read cycle: 1 second per device (50-tag configuration)
- Average batch size: 800–2,000 bytes (binary format)
- Batch interval: 60 seconds typical
- Bandwidth consumption: 1.5–4 MB/day per device on cellular
- Buffer capacity: ~500 batches (enough for ~8 hours of offline buffering)
- Memory footprint: Under 3 MB RSS for the complete daemon
- CPU usage: Under 2% on MIPS 580 MHz
- Uptime: Months between restarts (typically only for firmware updates)
Key Takeaways
- Buffer before you transmit: A page-based store-and-forward buffer is the single most important component in an edge gateway. Without it, every connectivity blip means lost data.
- Binary over JSON for constrained links: The 5x bandwidth reduction from binary packing pays for itself immediately on cellular connections.
- Pre-allocate everything: No malloc() after startup. Industrial systems run for months — memory fragmentation will find you.
- Detect, don't assume: Auto-detect connected devices and load configurations dynamically. The edge gateway should work out of the box when plugged into an unknown PLC.
- Watchdog everything: Monitor MQTT connection health independently of the library's built-in reconnection. Silent failures are the most dangerous failures.
- Configuration as data: Tag definitions, polling intervals, batch parameters, and network settings should all live in JSON configuration files that can be updated remotely via MQTT commands.
Where machineCDN Fits
machineCDN provides purpose-built edge infrastructure that implements every pattern discussed in this article — from page-based buffering and binary transport to auto-detection, remote configuration, and multi-protocol support. The platform runs on everything from $200 cellular routers to full industrial PCs, delivering sub-3MB memory footprints and months of unattended uptime.
If you're evaluating edge computing platforms for industrial equipment monitoring, machineCDN is worth a look — especially if your deployment involves cellular connectivity, mixed PLC types, or sites where physical access for troubleshooting is expensive.
Running into edge gateway challenges? We've deployed these architectures across hundreds of manufacturing sites. Get in touch to discuss your specific requirements.
