
187 posts tagged with "Industrial IoT"

Industrial Internet of Things insights and best practices


Edge Gateway Lifecycle Architecture: From Boot to Steady-State Telemetry in Industrial IoT [2026]

· 14 min read

Most IIoT content treats the edge gateway as a black box: PLC data goes in, cloud data comes out. That's fine for a sales deck. It's useless for the engineer who needs to understand why their gateway loses data during a network flap, or why configuration changes require a full restart, or why it takes 90 seconds after boot before the first telemetry packet reaches the cloud.

This article breaks down the complete lifecycle of a production industrial edge gateway — from the moment it powers on to steady-state telemetry delivery, including every decision point, failure mode, and recovery mechanism in between. These patterns are drawn from real-world gateways running on resource-constrained hardware (64MB RAM, MIPS processors) in plastics manufacturing plants, monitoring TCUs, chillers, blenders, and dryers 24/7.

Phase 1: Boot and Configuration Load

When a gateway boots (or restarts after a configuration change), the first task is loading its configuration. In production deployments, there are typically two configuration layers:

The Daemon Configuration

This is the central configuration that defines what equipment to talk to:

{
  "plc": {
    "ip": "192.168.5.5",
    "modbus_tcp_port": 502
  },
  "serial_device": {
    "port": "/dev/rs232",
    "baud": 9600,
    "parity": "none",
    "data_bits": 8,
    "stop_bits": 1,
    "byte_timeout_ms": 4,
    "response_timeout_ms": 100
  },
  "batch_size": 4000,
  "batch_timeout_sec": 60,
  "startup_delay_sec": 30
}

The startup delay is a critical design choice. When a gateway boots simultaneously with the PLCs it monitors (common after a power outage), the PLCs may need 10-30 seconds to initialize their communication stacks. If the gateway immediately tries to connect, it fails, marks the PLC as unreachable, and enters a slow retry loop. A 30-second startup delay avoids this race condition.

The serial link parameters (baud, parity, data bits, stop bits) must match the PLC exactly. A mismatch here produces zero error feedback — you just get silence. The byte timeout (time between consecutive bytes) and response timeout (time to wait for a complete response) are tuned per equipment type. TCUs with slower processors may need 100ms+ response timeouts; modern PLCs respond in 10-20ms.

The Device Configuration Files

Each equipment type gets its own configuration file that defines which registers to read, what data types to expect, and how often to poll. These files are loaded dynamically based on the device type detected during the discovery phase.

A real device configuration for a batch blender might define 40+ tags, each with:

  • A unique tag ID (1-32767)
  • The Modbus register address or EtherNet/IP tag name
  • Data type (bool, int8, uint8, int16, uint16, int32, uint32, float)
  • Element count (1 for scalars, 2+ for arrays or multi-register values)
  • Poll interval in seconds
  • Whether to compare with previous value (change-based delivery)
  • Whether to send immediately or batch with other values

Hot-reload capability is essential for production systems. The gateway should monitor configuration file timestamps and automatically detect changes. When a configuration file is modified (pushed via MQTT from the cloud, or copied via SSH during maintenance), the gateway reloads it without requiring a full restart. This means configuration updates can be deployed remotely to gateways in the field without disrupting data collection.
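The hot-reload check reduces to comparing file modification times on each main-loop pass. A minimal Python sketch (the class and method names are illustrative, not taken from any actual gateway):

```python
import os

class ConfigWatcher:
    """Track a config file's mtime; report when it changes (hot-reload trigger)."""

    def __init__(self, path: str):
        self.path = path
        self.last_mtime = os.stat(path).st_mtime

    def changed(self) -> bool:
        """Return True exactly once per modification of the watched file."""
        mtime = os.stat(self.path).st_mtime
        if mtime != self.last_mtime:
            self.last_mtime = mtime
            return True
        return False
```

The main loop would call `changed()` once per second and trigger a reload of just that file, leaving polling of other devices undisturbed.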

Phase 2: Device Detection

After configuration loads successfully, the gateway enters the device detection phase. This is where protocol-level intelligence matters.

Multi-Protocol Discovery

A well-designed gateway doesn't assume which protocol the PLC speaks. Instead, it tries multiple protocols in order of preference:

Step 1: Try EtherNet/IP

The gateway sends a CIP (Common Industrial Protocol) request to the configured IP address, attempting to read a device_type tag. EtherNet/IP uses the ab-eip protocol with a micro800 CPU profile (for Allen-Bradley Micro8xx series). If the PLC responds with a valid device type, the gateway knows this is an EtherNet/IP device.

Connection path: protocol=ab-eip, gateway=192.168.5.5, cpu=micro800
Target tag: device_type (uint16)
Timeout: 2000ms

Step 2: Fall back to Modbus TCP

If EtherNet/IP fails (error code -32 = "no connection"), the gateway tries Modbus TCP on port 502. It reads input register 800 (address 300800) which, by convention, stores the device type identifier.

Function code: 4 (Read Input Registers)
Register: 800
Count: 1
Expected: uint16 device type code

Step 3: Serial detection for Modbus RTU

If TCP protocols fail, the gateway probes the serial port for Modbus RTU devices. RTU detection is trickier because there's no auto-discovery mechanism — you must know the slave address. Production gateways typically configure a default address (slave ID 1) and attempt a read.
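The Step 2 probe can be sketched at the wire level. This is a hedged illustration of building and parsing the Modbus TCP frames described above (function code 4, register 800, count 1); the helper names are mine, and a production gateway would typically use a Modbus library rather than hand-rolled frames:

```python
import struct

def build_read_input_regs(transaction_id: int, unit_id: int,
                          register: int, count: int) -> bytes:
    """Build a Modbus TCP request: MBAP header + Read Input Registers (FC 4) PDU."""
    pdu = struct.pack(">BHH", 4, register, count)       # FC 4, start address, count
    mbap = struct.pack(">HHHB", transaction_id, 0,      # protocol ID 0 = Modbus
                       len(pdu) + 1, unit_id)           # length covers unit ID + PDU
    return mbap + pdu

def parse_device_type(response: bytes) -> int:
    """Extract the uint16 device type from a Read Input Registers response."""
    # Layout: MBAP (7 bytes) + function code (1) + byte count (1) + data
    if response[7] & 0x80:
        raise IOError(f"Modbus exception code {response[8]}")
    return struct.unpack(">H", response[9:11])[0]
```

Send the request over a TCP socket to port 502; if the connect or read times out, fall through to the serial RTU probe.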

Serial Number Extraction

After identifying the device type, the gateway reads the equipment's serial number. This is critical for fleet management — each physical machine needs a unique identifier for cloud-side tracking.

Different equipment types store serial numbers in different registers:

| Equipment Type   | Protocol    | Month Register | Year Register | Unit Register |
|------------------|-------------|----------------|---------------|---------------|
| Portable Chiller | Modbus TCP  | Input 22       | Input 23      | Input 24      |
| Central Chiller  | Modbus TCP  | Holding 520    | Holding 510   | Holding 500   |
| TCU              | Modbus RTU  | EtherNet/IP    | EtherNet/IP   | EtherNet/IP   |
| Batch Blender    | EtherNet/IP | CIP tag        | CIP tag       | CIP tag       |

The serial number is packed into a 32-bit value:

Byte 3: Year  (0x40=2010, 0x41=2011, ...)
Byte 2: Month (0x00=Jan, 0x01=Feb, ...)
Bytes 0-1: Unit number (sequential)

Example: 0x40000050 = January 2010, unit #80
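Given that layout, unpacking a serial number is a few shifts and masks. A sketch (the function name is illustrative):

```python
def decode_serial(serial: int) -> tuple[int, int, int]:
    """Unpack the 32-bit serial: byte 3 = year (0x40 = 2010), byte 2 = month
    (0x00 = Jan), bytes 0-1 = sequential unit number. Returns (year, month, unit)."""
    year = 2010 + ((serial >> 24) & 0xFF) - 0x40
    month = ((serial >> 16) & 0xFF) + 1   # convert 0-based month to 1-12
    unit = serial & 0xFFFF
    return year, month, unit
```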

Fallback serial generation: If the PLC doesn't have a programmed serial number (common with newly installed equipment), the gateway generates one using the router's serial number as a seed, with a prefix byte distinguishing PLCs (0x7F) from TCUs (0x7E). This ensures every device in the fleet has a unique identifier even before the serial number is programmed.

Configuration Loading by Device Type

Once the device type is known, the gateway searches for a matching configuration file. If type 1010 is detected, it loads the batch blender configuration. If type 5000, it loads the TCU configuration. If no matching configuration exists, the gateway logs an error and continues monitoring other ports.

This pattern — detect → identify → configure — means a single gateway binary handles dozens of equipment types. Adding support for a new machine is a configuration file change, not a firmware update.

Phase 3: Cloud Connection

With devices detected and configured, the gateway establishes its cloud connection via MQTT.

Connection Architecture

Production IIoT gateways use MQTT 3.1.1 over TLS (port 8883) for cloud connectivity. The connection setup involves:

  1. Certificate verification — the gateway validates the cloud broker's certificate against a CA root cert stored locally
  2. SAS token authentication — using a device-specific Shared Access Signature that encodes the hostname, device ID, and expiration timestamp
  3. Topic subscription — after connecting, the gateway subscribes to its command topic for receiving configuration updates and control commands from the cloud
Publish topic:  devices/{deviceId}/messages/events/
Subscribe topic: devices/{deviceId}/messages/devicebound/#
QoS: 1 (at least once delivery)

QoS 1 is the standard choice for industrial telemetry — it guarantees message delivery while avoiding the overhead and complexity of QoS 2 (exactly once). Since the data pipeline is designed to handle duplicates (via timestamp deduplication at the cloud layer), QoS 1 provides the right balance of reliability and performance.

The Async Connection Thread

MQTT connection can take 5-30 seconds depending on network conditions, DNS resolution, and TLS handshake time. A naive implementation blocks the main loop during connection, which means no PLC data is read during this time.

The solution: run mosquitto_connect_async() in a separate thread. The main loop continues reading PLC tags and buffering data while the MQTT connection establishes in the background. Once the connection callback fires, buffered data starts flowing to the cloud.

This is implemented using a semaphore-based producer-consumer pattern:

  1. Main thread prepares connection parameters and posts to a semaphore
  2. Connection thread wakes up, calls connect_async(), and signals completion
  3. Main thread checks semaphore state before attempting reconnection (prevents double-connect)

Connection Watchdog

Network connections fail. Cell modems lose signal. Cloud brokers restart. A production gateway needs a watchdog that detects stale connections and forces reconnection.

The watchdog pattern:

Every 120 seconds:
  1. Check: have we received ANY confirmation from the broker?
     (delivery ACK, PUBACK, SUBACK — anything)
  2. If yes → connection is healthy, reset watchdog timer
  3. If no → connection is stale. Destroy MQTT client and reinitiate.

The 120-second timeout is tuned for cellular networks where intermittent connectivity is expected. On wired Ethernet, you could reduce this to 30-60 seconds. The key insight: don't just check "is the TCP socket open?" — check "has the broker confirmed any data delivery recently?" A half-open socket can persist for hours without either side knowing.
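The watchdog reduces to a timestamp updated by broker callbacks and checked periodically. A minimal sketch with an injectable clock for testability (names are illustrative):

```python
import time

class BrokerWatchdog:
    """Flag the MQTT connection as stale when no broker confirmation
    (PUBACK, SUBACK, delivery ACK) arrives within the timeout window."""

    def __init__(self, timeout_sec: float = 120.0, clock=time.monotonic):
        self.timeout_sec = timeout_sec
        self.clock = clock
        self.last_ack = clock()

    def on_broker_ack(self) -> None:
        """Call from any broker confirmation callback."""
        self.last_ack = self.clock()

    def is_stale(self) -> bool:
        """True means: destroy the MQTT client and reconnect."""
        return self.clock() - self.last_ack > self.timeout_sec
```

Note that only broker confirmations reset the timer; an open TCP socket alone never does, which is what catches the half-open case.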

Phase 4: Steady-State Tag Reading

Once PLC connections and MQTT are established, the gateway enters its main polling loop. This is where it spends 99.9% of its runtime.

The Main Loop (1-second resolution)

The core loop runs every second and performs three operations:

  1. Configuration check — detect if any configuration file has been modified (via file stat monitoring)
  2. Tag read cycle — iterate through all configured tags and read those whose polling interval has elapsed
  3. Command processing — check the incoming command queue for cloud-side instructions (config updates, manual reads, interval changes)

Interval-Based Polling

Each tag has a polling interval in seconds. The gateway maintains a monotonic clock timestamp of the last read for each tag. On each loop iteration:

for each tag in device.tags:
    elapsed = now - tag.last_read_time
    if elapsed >= tag.interval_sec:
        read_tag(tag)
        tag.last_read_time = now

Typical intervals by data category:

| Data Type                    | Interval | Rationale                    |
|------------------------------|----------|------------------------------|
| Temperatures, pressures      | 60s      | Slow-changing process values |
| Alarm states (booleans)      | 1s       | Immediate awareness needed   |
| Machine state (running/idle) | 1s       | OEE calculation accuracy     |
| Batch counts                 | 1s       | Production tracking          |
| Version, serial number       | 3600s    | Static values, verify hourly |

Compare Mode: Change-Based Delivery

For many tags, sending the same value every second is wasteful. If a chiller alarm bit is false for 8 hours straight, that's 28,800 redundant messages.

Compare mode solves this: the gateway stores the last-read value and only delivers to the cloud when the value changes. This is configured per tag:

{
  "name": "Compressor Fault Alarm",
  "type": "bool",
  "interval": 1,
  "compare": true,
  "do_not_batch": true
}

This tag is read every second, but only transmitted when it changes. The do_not_batch flag means changes are sent immediately rather than waiting for the next batch finalization — critical for alarm states where latency matters.
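Compare mode is essentially a per-tag last-value cache consulted before delivery. A hedged sketch (the dictionary shapes are assumptions for illustration, not the gateway's actual structures):

```python
def should_deliver(tag: dict, new_value, last_values: dict) -> bool:
    """Change-based delivery: transmit a compare-mode tag only when its value
    differs from the cached previous reading; always transmit other tags."""
    if not tag.get("compare", False):
        return True
    tag_id = tag["id"]
    if tag_id in last_values and last_values[tag_id] == new_value:
        return False          # unchanged: suppress the redundant message
    last_values[tag_id] = new_value
    return True
```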

Hourly Full Refresh

There's a subtle problem with pure change-based delivery: if a value changes while the MQTT connection is down, the cloud never learns about the transition. And if a value stays constant for days, the cloud has no heartbeat confirming the sensor is still alive.

The solution: every hour (on the hour change), the gateway resets all "read once" flags, forcing a complete re-read and re-delivery of all tags. This guarantees the cloud has fresh values at least hourly, regardless of change activity.
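Detecting the hour rollover only needs the last-seen hour. A sketch (the function name is mine; a real implementation would then clear the compare-mode caches to force re-delivery):

```python
import time

def check_hourly_refresh(last_hour: int, now: float) -> tuple[bool, int]:
    """Return (refresh_due, current_hour): a full re-read fires when the
    wall-clock hour changes, regardless of change activity."""
    hour = time.gmtime(now).tm_hour
    return hour != last_hour, hour
```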

Phase 5: Data Batching and Delivery

Raw tag values don't get sent individually (except high-priority alarms). Instead, they're collected into batches for efficient delivery.

Binary Encoding

Production gateways use binary encoding rather than JSON to minimize bandwidth. The binary format packs values tightly:

Header:       1 byte   (0xF7 = tag values)
Group count:  4 bytes  (number of timestamp groups)

Per group:
  Timestamp:    4 bytes
  Device type:  2 bytes
  Serial num:   4 bytes
  Value count:  4 bytes

Per value:
  Tag ID:     2 bytes
  Status:     1 byte   (0x00=OK, else error code)
  Array size: 1 byte   (if status=OK)
  Elem size:  1 byte   (1, 2, or 4 bytes per element)
  Data:       size × count bytes

A batch containing 20 float values uses about 200 bytes in binary vs. ~2,000 bytes in JSON — a 10× bandwidth reduction that matters on cellular connections billed per megabyte.
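Following the layout above, a batch encoder is mostly struct packing. This sketch assumes big-endian fields and the dictionary shapes shown in the code; the real gateway's encoder may differ in detail:

```python
import struct

def encode_batch(groups: list[dict]) -> bytes:
    """Pack timestamp groups of tag values into the 0xF7 binary frame sketched above."""
    out = bytearray(struct.pack(">BI", 0xF7, len(groups)))
    for g in groups:
        out += struct.pack(">IHII", g["timestamp"], g["device_type"],
                           g["serial"], len(g["values"]))
        for v in g["values"]:
            out += struct.pack(">HB", v["tag_id"], v["status"])
            if v["status"] == 0x00:            # OK: array size, elem size, raw data
                data = v["data"]               # raw bytes, already encoded
                elem_size = v["elem_size"]
                out += struct.pack(">BB", len(data) // elem_size, elem_size)
                out += data
    return bytes(out)
```

A single float value under this layout costs 9 bytes (tag ID + status + sizes + 4 data bytes), which is where the roughly 10× saving over JSON comes from.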

Batch Finalization Triggers

A batch is finalized (sent to MQTT) when either:

  1. Size threshold — the batch reaches the configured maximum size (default: 4,000 bytes)
  2. Time threshold — the batch has been collecting for longer than batch_timeout_sec (default: 60 seconds)

This ensures data reaches the cloud within 60 seconds even during low-activity periods, while maximizing batch efficiency during high-activity periods (like a blender running a batch cycle that triggers many dependent tag reads).

The Paged Ring Buffer

Between the batching layer and the MQTT publish layer sits a paged ring buffer. This is the gateway's resilience layer against network outages.

The buffer divides available memory into fixed-size pages. Each page holds one or more complete MQTT messages. The buffer operates as a queue:

  • Write side: Finalized batches are written to the current work page. When a page fills up, it moves to the "used" queue.
  • Read side: When MQTT is connected, the gateway publishes the oldest used page. Upon receiving a PUBACK (delivery confirmation), the page moves to the "free" pool.
  • Overflow: If all pages are used (network down too long), the gateway overwrites the oldest used page — losing the oldest data to preserve the newest.

This design means the gateway can buffer 15-60 minutes of telemetry data during a network outage (depending on available memory and data density), then drain the buffer once connectivity restores.
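The page lifecycle (work page, used queue, free pool, oldest-first overflow) can be sketched with a deque. Class and method names are illustrative:

```python
from collections import deque

class PagedRingBuffer:
    """Fixed-page telemetry buffer: when full, overwrite the oldest page
    so the newest data survives a long outage."""

    def __init__(self, num_pages: int):
        self.free = num_pages
        self.used = deque()      # oldest page at the left
        self.dropped = 0

    def write_page(self, page: bytes) -> None:
        if self.free == 0:
            self.used.popleft()  # network down too long: drop the oldest page
            self.dropped += 1
        else:
            self.free -= 1
        self.used.append(page)

    def oldest(self):
        """Next page to publish, or None if the buffer is drained."""
        return self.used[0] if self.used else None

    def ack_oldest(self) -> None:
        """PUBACK received: the oldest page was delivered, return it to the pool."""
        self.used.popleft()
        self.free += 1
```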

Disconnect Recovery

When the MQTT connection drops:

  1. The buffer's "connected" flag is cleared
  2. All pending publish operations are halted
  3. Incoming PLC data continues to be read, batched, and buffered
  4. The MQTT async thread begins reconnection
  5. On reconnection, the buffer's "connected" flag is set, and data delivery resumes from the oldest undelivered page

This means zero data loss during short outages (up to the buffer capacity), and newest-data-preserved during long outages (the overflow policy drops oldest data first).

Phase 6: Remote Configuration and Control

A production gateway accepts commands from the cloud over its MQTT subscription topic. This enables remote management without SSH access.

Supported Command Types

| Command        | Direction      | Description                                             |
|----------------|----------------|---------------------------------------------------------|
| daemon_config  | Cloud → Device | Update central configuration (IP addresses, serial params) |
| device_config  | Cloud → Device | Update device-specific tag configuration                |
| get_status     | Cloud → Device | Request current daemon/PLC/TCU status report            |
| get_status_ext | Cloud → Device | Request extended status with last tag values            |
| read_now_plc   | Cloud → Device | Force immediate read of a specific tag                  |
| tag_update     | Cloud → Device | Change a tag's polling interval remotely                |

Remote Interval Adjustment

This is a powerful production feature: the cloud can remotely change how often specific tags are polled. During a quality investigation, an engineer might temporarily increase temperature polling from 60s to 5s to capture rapid transients. After the investigation, they reset to 60s via another command.

The gateway applies interval changes immediately and persists them to the configuration file, so they survive a restart. The modified_intervals flag in status reports tells the cloud that intervals have been manually adjusted.
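A tag_update handler is a small piece of glue between the MQTT command queue and the tag table. A sketch, assuming a JSON command body with tag_id and interval_sec fields (the actual wire format is not specified here):

```python
import json

def apply_tag_update(command_json: str, tags: dict) -> dict:
    """Handle a hypothetical tag_update command: change a tag's polling
    interval at runtime and mark it as manually adjusted."""
    cmd = json.loads(command_json)
    tag = tags[cmd["tag_id"]]
    tag["interval_sec"] = cmd["interval_sec"]
    tag["modified"] = True   # surfaces as modified_intervals in status reports
    return tag
```

After applying the change in memory, the gateway would also persist the new interval to the configuration file so it survives a restart.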

Designing for Constrained Hardware

These gateways often run on embedded Linux routers with severely constrained resources:

  • RAM: 64-128MB (of which 30-40MB is available after OS)
  • CPU: MIPS or ARM, 500-800 MHz, single core
  • Storage: 16-32MB flash (no disk)
  • Network: Cellular (LTE Cat 4/Cat M1) or Ethernet

Design constraints this imposes:

  1. Fixed memory allocation — allocate all buffers at startup, never malloc() during runtime. A memory fragmentation crash at 3 AM in a factory with no IT staff is unrecoverable.

  2. No floating-point unit — older MIPS processors do software float emulation. Keep float operations to a minimum; do heavy math in the cloud.

  3. Flash wear — don't write configuration changes to flash more than necessary. Batch writes, use write-ahead logging if needed.

  4. Watchdog timer — use the hardware watchdog timer. If the main loop hangs, the hardware reboots the gateway automatically.

How machineCDN Implements These Patterns

machineCDN's ACS (Auxiliary Communication System) gateway embodies all of these lifecycle patterns in a production-hardened implementation that's been running on thousands of plastics manufacturing machines for years.

The gateway runs on Teltonika RUT9XX industrial cellular routers, providing cellular connectivity for machines in facilities without available Ethernet. It supports EtherNet/IP and Modbus (both TCP and RTU) simultaneously, auto-detecting device types at boot and loading the appropriate configuration from a library of pre-built equipment profiles.

For manufacturers deploying machineCDN, the complexity described in this article — protocol detection, configuration management, MQTT buffering, recovery — is entirely handled by the platform. The result is that plant engineers get reliable, continuous telemetry from their equipment without needing to understand (or debug) the edge gateway's internal lifecycle.


Understanding how edge gateways actually work — not just what they do, but how they manage their lifecycle — is essential for building reliable IIoT infrastructure. The patterns described here (startup sequencing, multi-protocol detection, buffered delivery, watchdog recovery) separate toy deployments from production systems that run for years without intervention.

How to Implement Multi-Zone Machine Monitoring: Organizing Your Factory Floor for Maximum Visibility

· 10 min read
MachineCDN Team
Industrial IoT Experts

Most factory floors are not organized the way IIoT platforms expect them to be. Machines are clustered by process, scattered across buildings, or arranged by historical accident — the CNC mill is next to the paint booth because that is where the power drop was when the building was renovated in 2003. When you deploy an IIoT monitoring platform, the way you organize machines into zones and locations determines whether your dashboards show actionable insight or meaningless noise.

Multi-zone machine monitoring is the practice of organizing your monitored equipment into logical groupings — by location, process area, product line, or function — so that your monitoring data tells a story your team can act on. This guide walks through how to plan, implement, and optimize a zone-based monitoring structure for manufacturing plants of any size.

How to Monitor Industrial Compressors and Chillers with IIoT: A Practical Guide for Plant Engineers

· 9 min read
MachineCDN Team
Industrial IoT Experts

Industrial compressors and chillers are the unsung heroes of manufacturing. They don't make products, but without them, nothing else works. Compressed air powers pneumatic actuators, controls, and tools across the plant. Chillers maintain process temperatures for injection molding, chemical reactions, food processing, and data centers. When a compressor or chiller fails, the entire production line stops — often with zero warning.

IEEE 754 Floating-Point Edge Cases in Industrial Data Pipelines: A Practical Guide [2026]

· 12 min read

If you've ever seen a temperature reading of 3.4028235 × 10³⁸ flash across your monitoring dashboard at 2 AM, you've met IEEE 754's ugly side. Floating-point representation is the lingua franca of analog process data in industrial automation — and it's riddled with traps that can silently corrupt your data pipeline if you don't handle them at the edge.

This guide covers the real-world edge cases that matter when reading float registers from PLCs over Modbus, EtherNet/IP, and other industrial protocols — and how to catch them before they poison your analytics, trigger false alarms, or crash your trending charts.

IEEE 754 floating point data flowing through an industrial data pipeline

Why Floating-Point Matters More in Industrial IoT

In enterprise software, a floating-point rounding error means your bank balance is off by a fraction of a cent. In industrial IoT, a misinterpreted float register can mean:

  • A temperature sensor reading infinity instead of 450°F, triggering an emergency shutdown
  • An OEE calculation returning NaN, breaking every downstream dashboard
  • A pressure reading of -0.0 confusing threshold comparison logic
  • Two 16-bit registers assembled in the wrong byte order, turning 72.5 PSI into a meaningless denormal on the order of 10⁻⁴¹

These aren't theoretical problems. They happen on real factory floors, every day, because the gap between PLC register formats and cloud-native data types is wider than most engineers realize.

The Anatomy of a PLC Float

Most modern PLCs store floating-point values as IEEE 754 single-precision (32-bit) numbers. The 32 bits break down as:

┌──────┬──────────┬──────────────────────┐
│ Sign │ Exponent │       Mantissa       │
│ 1 bit│  8 bits  │       23 bits        │
└──────┴──────────┴──────────────────────┘
 Bit 31  Bits 30-23       Bits 22-0

This gives you a range of roughly ±1.18 × 10⁻³⁸ to ±3.40 × 10³⁸, with about 7 decimal digits of precision. That's plenty for most process variables — but the encoding introduces special values and edge cases that PLC programmers rarely think about.
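The bit fields can be pulled apart with a pack/unpack round-trip. A short Python sketch:

```python
import struct

def decompose_float32(value: float) -> tuple[int, int, int]:
    """Split an IEEE 754 single-precision value into its
    (sign, exponent, mantissa) bit fields."""
    raw = struct.unpack(">I", struct.pack(">f", value))[0]
    sign = raw >> 31              # bit 31
    exponent = (raw >> 23) & 0xFF # bits 30-23, biased by 127
    mantissa = raw & 0x7FFFFF     # bits 22-0
    return sign, exponent, mantissa
```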

The Five Dangerous Values

| Pattern                 | Value         | What Causes It                                |
|-------------------------|---------------|-----------------------------------------------|
| 0x7F800000              | +Infinity     | Division by zero, sensor overflow             |
| 0xFF800000              | -Infinity     | Negative division by zero                     |
| 0x7FC00000              | Quiet NaN     | Uninitialized register, invalid operation     |
| 0x7FA00000              | Signaling NaN | Hardware fault flags in some PLCs             |
| 0x00000000 / 0x80000000 | +0.0 / -0.0   | Legitimate zero, but -0.0 can trip comparisons |
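These patterns are easy to classify once the raw bits are reinterpreted as a float. A short sketch that checks the table's categories:

```python
import math
import struct

def classify_bits(raw: int) -> str:
    """Classify a 32-bit pattern as 'nan', 'inf', 'zero', or 'number'."""
    value = struct.unpack(">f", struct.pack(">I", raw))[0]
    if math.isnan(value):    # covers both quiet and signaling NaN encodings
        return "nan"
    if math.isinf(value):
        return "inf"
    if value == 0.0:         # note: -0.0 == +0.0 under IEEE 754 comparison
        return "zero"
    return "number"
```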

Why PLCs Generate These Values

PLC ladder logic and structured text don't always guard against special float values. Common scenarios include:

Uninitialized registers: When a PLC program is downloaded but a tag hasn't been written to yet, many PLCs leave the register at 0x00000000 (zero) — but some leave it at 0xFFFFFFFF (NaN). There's no universal standard here.

Sensor faults: When an analog input card detects a broken wire or over-range condition, some PLCs write a sentinel value (often max positive float or NaN) to the associated tag. Others set a separate status bit and leave the value register frozen at the last good reading.

Division by zero: If your PLC program calculates a rate (e.g., throughput per hour) and the divisor drops to zero during a machine stop, you get infinity. Not every PLC programmer wraps division in a zero-check.

Scaling arithmetic: Converting raw 12-bit ADC counts (0–4095) to engineering units involves multiplication and offset. If the scaling coefficients are misconfigured, you can get results outside the normal range that are still technically valid IEEE 754 floats.

The Byte-Ordering Minefield

Here's where industrial protocols diverge from IT conventions in ways that cause the most data corruption.

Modbus Register Ordering

Modbus transmits data in 16-bit registers. A 32-bit float occupies two consecutive registers. The question is: which register holds the high word?

The Modbus specification says big-endian (high word first), but many PLC vendors violate this:

Standard Modbus (Big-Endian / "ABCD"):
  Register N   = High word (bytes A, B)
  Register N+1 = Low word  (bytes C, D)

Swapped (Little-Endian / "CDAB"):
  Register N   = Low word  (bytes C, D)
  Register N+1 = High word (bytes A, B)

Byte-Swapped ("BADC"):
  Register N   = Byte-swapped high word (B, A)
  Register N+1 = Byte-swapped low word  (D, C)

Full Reverse ("DCBA"):
  Register N   = (D, C)
  Register N+1 = (B, A)

Real-world example: A process temperature of 72.5°F is 0x42910000 in IEEE 754. Here's what you'd read over Modbus depending on the byte order:

| Order | Register N | Register N+1 | Decoded Value (if assembled as ABCD) |
|-------|------------|--------------|--------------------------------------|
| ABCD  | 0x4291     | 0x0000       | 72.5 ✅                              |
| CDAB  | 0x0000     | 0x4291       | 2.39 × 10⁻⁴¹ (denormal) ❌           |
| BADC  | 0x9142     | 0x0000       | -1.53 × 10⁻²⁸ ❌                     |
| DCBA  | 0x0000     | 0x9142       | Garbage (denormal) ❌                |

The only reliable way to determine byte ordering is to read a known value from the PLC — like a setpoint you can verify — and compare the decoded result against all four orderings.
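That trial-decode approach is straightforward to implement: decode the same register pair under all four orderings and keep the one that reproduces the known setpoint. A sketch (helper names are mine):

```python
import math
import struct

ORDERINGS = ("ABCD", "CDAB", "BADC", "DCBA")

def assemble(high: int, low: int, order: str) -> float:
    """Decode two 16-bit registers as a float under the given word/byte ordering."""
    swap = lambda w: ((w & 0xFF) << 8) | (w >> 8)   # swap the two bytes of a word
    words = {"ABCD": (high, low),
             "CDAB": (low, high),
             "BADC": (swap(high), swap(low)),
             "DCBA": (swap(low), swap(high))}[order]
    return struct.unpack(">f", struct.pack(">HH", *words))[0]

def detect_byte_order(high: int, low: int, known_value: float,
                      tolerance: float = 1e-3):
    """Return the ordering whose decode matches a verifiable setpoint, else None."""
    for order in ORDERINGS:
        decoded = assemble(high, low, order)
        if math.isfinite(decoded) and abs(decoded - known_value) < tolerance:
            return order
    return None
```

Run this once during commissioning against a setpoint you can read off the HMI, then store the detected ordering in the device configuration.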

EtherNet/IP Tag Ordering

EtherNet/IP (CIP) is generally more predictable because it transmits structured data with typed access. When you read a REAL tag from an Allen-Bradley Micro800 or CompactLogix, the CIP layer handles byte ordering transparently. The value arrives in the host's native format through the client library.

However, watch out for array access. When reading a float array starting at a specific index, the start index and element count must match the PLC's memory layout exactly. Requesting tag_name[1] with elem_count=6 reads elements 1 through 6 — the zero-indexed first element is skipped. Getting this wrong doesn't produce an error; it silently gives you shifted values.

Practical Validation Strategies

Layer 1: Raw Register Validation

Before you even try to decode a float, validate the raw bytes:

import struct
import math

def validate_float_register(high_word: int, low_word: int,
                            byte_order: str = "ABCD") -> tuple[float, str]:
    """
    Decode and validate a 32-bit float from two Modbus registers.
    Returns (value, status) where status is 'ok', 'nan', 'inf', or 'denorm'.
    """
    # Assemble bytes based on ordering
    if byte_order == "ABCD":
        raw = struct.pack('>HH', high_word, low_word)
    elif byte_order == "CDAB":
        raw = struct.pack('>HH', low_word, high_word)
    elif byte_order == "BADC":
        raw = struct.pack('>HH',
                          ((high_word & 0xFF) << 8) | (high_word >> 8),
                          ((low_word & 0xFF) << 8) | (low_word >> 8))
    elif byte_order == "DCBA":
        # Registers hold (D, C) and (B, A); little-endian packing of the
        # low word first yields bytes A B C D
        raw = struct.pack('<HH', low_word, high_word)
    else:
        raise ValueError(f"Unknown byte order: {byte_order}")

    value = struct.unpack('>f', raw)[0]

    # Check special values
    if math.isnan(value):
        return value, "nan"
    if math.isinf(value):
        return value, "inf"

    # Check denormalized (subnormal) — often indicates garbage data
    raw_int = struct.unpack('>I', raw)[0]
    exponent = (raw_int >> 23) & 0xFF
    if exponent == 0 and (raw_int & 0x7FFFFF) != 0:
        return value, "denorm"

    return value, "ok"

Layer 2: Engineering-Range Clamping

Every process variable has a physically meaningful range. A mold temperature can't be -40,000°F. A flow rate can't be 10 billion GPM. Enforce these ranges at the edge:

RANGE_LIMITS = {
    "mold_temperature_f": (-50.0, 900.0),
    "barrel_pressure_psi": (0.0, 40000.0),
    "screw_rpm": (0.0, 500.0),
    "coolant_flow_gpm": (0.0, 200.0),
}

def clamp_to_range(tag_name: str, value: float) -> tuple[float, bool]:
    """Clamp a value to its engineering range. Returns (clamped_value, was_clamped)."""
    if tag_name not in RANGE_LIMITS:
        return value, False
    low, high = RANGE_LIMITS[tag_name]
    if value < low:
        return low, True
    if value > high:
        return high, True
    return value, False

Layer 3: Rate-of-Change Filtering

A legitimate temperature can't jump from 200°F to 800°F in one polling cycle (typically 1–60 seconds). Rate-of-change filtering catches sensor glitches and transient read errors:

MAX_RATE_OF_CHANGE = {
    "mold_temperature_f": 50.0,     # Max °F per polling cycle
    "barrel_pressure_psi": 2000.0,  # Max PSI per cycle
    "screw_rpm": 100.0,             # Max RPM per cycle
}

def rate_check(tag_name: str, new_value: float,
               last_value: float) -> bool:
    """Returns True if the change rate is within acceptable limits."""
    if tag_name not in MAX_RATE_OF_CHANGE:
        return True
    max_delta = MAX_RATE_OF_CHANGE[tag_name]
    return abs(new_value - last_value) <= max_delta

The 32-Bit Float Reassembly Problem

When your edge gateway reads two 16-bit Modbus registers and needs to assemble them into a 32-bit float, the implementation must handle several non-obvious cases.

Two-Register Float Assembly

The most common approach reads two registers and combines them. But there's a critical subtlety: the function code determines how you interpret the raw words.

For holding registers (function code 3) and input registers (function code 4), each register is a 16-bit unsigned integer. To assemble a float:

Step 1: Read register N → uint16 word_high
Step 2: Read register N+1 → uint16 word_low
Step 3: Combine → uint32 raw = (word_high << 16) | word_low
Step 4: Reinterpret raw as IEEE 754 float

But here's the trap: some Modbus libraries automatically apply byte swapping at the protocol layer (converting from Modbus big-endian to host little-endian), which means your "high word" might already be byte-swapped before you assemble it.

A robust implementation uses the library's native float-extraction function (like modbus_get_float() in libmodbus) rather than manual assembly when possible. When you must assemble manually, test against a known value first.

Handling Mixed-Endian Devices

In real factories, you'll often have devices from multiple vendors on the same Modbus network — each with their own byte-ordering conventions. Your edge gateway must support per-device (or even per-register) byte-order configuration:

devices:
  - name: "Injection_Molding_Press_1"
    protocol: modbus-tcp
    address: "192.168.1.10"
    byte_order: ABCD
    tags:
      - name: barrel_temp_zone1
        register: 40001
        type: float32
        # Inherits device byte_order

  - name: "Chiller_Unit_3"
    protocol: modbus-tcp
    address: "192.168.1.20"
    byte_order: CDAB   # This vendor swaps words
    tags:
      - name: coolant_supply_temp
        register: 30000
        type: float32

Change Detection with Floating-Point Values

One of the most powerful bandwidth optimizations in IIoT edge gateways is change-of-value (COV) detection — only transmitting a value when it actually changes. But floating-point comparison is inherently tricky.

The Naive Approach (Broken)

// DON'T DO THIS
if (new_value != old_value) {
    send(new_value);
}

This fails because:

  • Sensor noise causes sub-LSB fluctuations that produce different float representations
  • NaN ≠ NaN by IEEE 754 rules, so you'd send NaN every single cycle
  • -0.0 == +0.0 by IEEE 754, so you'd miss sign changes that might matter

The Practical Approach

Compare at the raw register level (integer comparison), not the float level. If the uint32 representation of two registers hasn't changed, the float is identical bit-for-bit — no ambiguity:

uint32_t new_raw = (word_high << 16) | word_low;
uint32_t old_raw = stored_raw_value;

if (new_raw != old_raw) {
    // Value actually changed — decode and transmit
    stored_raw_value = new_raw;
    transmit(decode_float(new_raw));
}

This approach is used in production edge gateways and avoids all the floating-point comparison pitfalls. It's also faster — integer comparison is a single CPU instruction, while float comparison requires FPU operations and NaN handling.

Batching and Precision Preservation

When batching multiple tag values for transmission, format choice matters for float precision.

JSON Serialization Pitfalls

JSON doesn't distinguish between integers and floats, and most JSON serializers will round-trip a float through a decimal representation, potentially losing precision:

Original float: 72.5 (exact in IEEE 754: 0x42910000)
JSON: "72.5" → Deserialized: 72.5 ✅

Original float32: 72.3 (NOT exact: stored as 72.30000305175781, bits 0x4290999A)
JSON: "72.30000305175781" → Deserialized: 72.30000305175781 ✅
Or:   "72.3" → Deserialized: 72.3 as a double, which is no longer the float32 value (different!)
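The narrowing is easy to demonstrate in Python: 72.5 survives a float32 round-trip exactly, while 72.3 does not. A short sketch:

```python
import json
import struct

def float32_roundtrip(value: float) -> float:
    """Force a Python float (a 64-bit double) through IEEE 754 single precision."""
    return struct.unpack(">f", struct.pack(">f", value))[0]

exact = float32_roundtrip(72.5)    # 72.5 is exactly representable in float32
inexact = float32_roundtrip(72.3)  # 72.3 is not: it becomes 72.30000305175781
```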

For telemetry where exact bit-level reproduction matters (e.g., comparing dashboard values against PLC HMI values), use binary encoding. A well-designed binary telemetry format encodes the tag ID, status, value type, and raw bytes — preserving perfect fidelity with less bandwidth.

A typical binary batch frame looks like:

┌──────────┬────────────┬──────────┬──────────┬────────────────┐
│ Batch │ Group │ Device │ Serial │ Values │
│ Header │ Timestamp │ Type │ Number │ Array │
│ (1 byte) │ (4 bytes) │ (2 bytes)│ (4 bytes)│ (variable) │
└──────────┴────────────┴──────────┴──────────┴────────────────┘

Each value entry:
┌──────────┬────────┬──────────┬──────────┬────────────────┐
│ Tag ID │ Status │ Count │ Elem │ Raw Values │
│ (2 bytes)│(1 byte)│ (1 byte) │ Size │ (count × size) │
│ │ │ │ (1 byte) │ │
└──────────┴────────┴──────────┴──────────┴────────────────┘

This format reduces a typical 100-tag batch from ~5 KB (JSON) to ~600 bytes (binary) — an 8× bandwidth reduction with zero precision loss.
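A value entry from the frame above can be encoded with straightforward byte packing. This is a sketch of the layout only, not any specific wire protocol; big-endian field order is assumed:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Pack one value entry: tag ID (2 bytes, big-endian), status (1),
 * element count (1), element size (1), then the raw value bytes.
 * Returns the number of bytes written. */
static size_t encode_value_entry(uint8_t *buf, uint16_t tag_id,
                                 uint8_t status, uint8_t count,
                                 uint8_t elem_size, const void *raw) {
    size_t off = 0;
    buf[off++] = (uint8_t)(tag_id >> 8);
    buf[off++] = (uint8_t)(tag_id & 0xFF);
    buf[off++] = status;
    buf[off++] = count;
    buf[off++] = elem_size;
    memcpy(buf + off, raw, (size_t)count * elem_size);
    return off + (size_t)count * elem_size;
}
```

One float tag costs 9 bytes here versus roughly 30-50 bytes as a JSON object, which is where the batch-level savings comes from.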

Edge Gateway Best Practices

Based on years of deploying edge gateways in plastics, metals, and packaging manufacturing, here are the practices that prevent float-related data quality issues:

1. Validate at the Source

Don't wait until data reaches the cloud to check for NaN and infinity. By then, you've wasted bandwidth transmitting garbage and may have corrupted aggregations. Validate immediately after the register read.

2. Separate Value and Status

Every tag read should produce two outputs: the decoded value AND a status code. Status codes distinguish between "value is zero because the sensor reads zero" and "value is zero because the read failed." Most Modbus libraries return error codes — propagate them alongside the values.

3. Configure Byte Order Per Device

Don't hardcode byte ordering. Every industrial device you connect might have different conventions. Your tag configuration should support per-device or per-tag byte-order specification.

4. Use Binary Encoding on Constrained Links

If your edge gateway communicates over cellular (4G/5G) or satellite, binary encoding pays for itself immediately. The bandwidth savings compound with polling frequency — a gateway polling 200 tags every second generates 17 GB/month in JSON but only 2 GB/month in binary.

5. Hourly Full Reads

Even with change-of-value filtering, perform a full read of all tags at least once per hour. This catches situations where a value changed but the change was lost due to a transient error, and ensures your cloud platform always has a recent snapshot of every tag — even slowly-changing ones.

How machineCDN Handles Float Data

machineCDN's edge infrastructure handles these float challenges at the protocol driver level. The platform supports automatic byte-order detection during device onboarding, validates every register read against configurable engineering ranges, and uses binary telemetry encoding to minimize bandwidth while preserving perfect float fidelity.

For plants running mixed-vendor equipment — which is nearly every plant — machineCDN normalizes all float data into a consistent format before it reaches your dashboards, ensuring that a temperature from a Modbus chiller and a temperature from an EtherNet/IP blender are directly comparable.

Key Takeaways

  1. IEEE 754 special values (NaN, infinity, denormals) appear regularly in PLC data — don't assume every register read produces a valid number
  2. Byte ordering varies by vendor, not by protocol — always verify against a known value
  3. Compare at the raw register level for change detection — never use float equality
  4. Binary encoding preserves precision and saves 8× bandwidth over JSON for telemetry
  5. Validate at the edge, not in the cloud — garbage data should never leave the factory

Getting floating-point handling right at the edge gateway is one of those unglamorous engineering fundamentals that separates reliable IIoT platforms from brittle ones. Your trending charts, alarm logic, and analytics all depend on it.


Want to see how machineCDN handles multi-protocol float data normalization in production? Request a demo to explore the platform with real factory data.

IIoT for Electronics Manufacturing: How to Monitor SMT Lines, Reflow Ovens, and Test Equipment in Real Time

· 10 min read
MachineCDN Team
Industrial IoT Experts

Electronics manufacturing operates at the intersection of high precision and high volume. A surface-mount technology (SMT) line placing 50,000 components per hour needs every placement to be accurate to within 0.05mm. A reflow oven running a temperature profile with five distinct zones needs each zone to hold within 2°C of its setpoint. An automated optical inspection (AOI) system needs to catch every defect without generating false positives that slow the line.

When any of these parameters drift, the consequences compound fast. A single SMT nozzle running slightly off calibration can misplace 5,000 components before anyone notices. A reflow oven zone that is 8°C too hot produces solder joints that pass visual inspection but fail under thermal cycling six months later. These are the kinds of problems that IIoT monitoring was designed to catch — before they become quality escapes that reach your customers.

This guide covers how to deploy IIoT monitoring across an electronics manufacturing facility, which parameters matter most, and how real-time data changes the way electronics manufacturers manage quality, throughput, and equipment health.

IIoT for Semiconductor Manufacturing: How to Monitor Lithography, Etching, and Deposition Equipment in Real Time

· 8 min read
MachineCDN Team
Industrial IoT Experts

A single hour of unplanned downtime in a semiconductor fab costs between $100,000 and $500,000. With equipment valued at $10–$50 million per tool and process tolerances measured in nanometers, semiconductor manufacturing demands the most precise equipment monitoring in any industry. IIoT platforms are transforming how fabs manage equipment health, predict failures, and protect yield — but the semiconductor environment has unique challenges that general-purpose monitoring tools weren't designed to handle.

IIoT for Woodworking and Lumber Manufacturing: How to Monitor Sawmills, CNC Routers, and Drying Kilns in Real Time

· 9 min read
MachineCDN Team
Industrial IoT Experts

Woodworking and lumber manufacturing operate in a unique space: heavy industrial processes producing natural material products with inherent variability. Moisture content shifts between logs. Blade wear changes cut quality unpredictably. Kiln temperatures drift. Adhesive curing depends on ambient conditions. This variability makes real-time monitoring not merely valuable but essential for consistent output.

IoTFlows vs MachineCDN for Energy Monitoring: Which IIoT Platform Tracks Real Power Consumption?

· 9 min read
MachineCDN Team
Industrial IoT Experts

Energy costs now rank as the second-largest operating expense for most manufacturers, right behind labor. With industrial electricity rates climbing 12-18% year over year across North America and Europe, plant managers need granular visibility into exactly where power is being consumed — not just a monthly utility bill that tells them nothing actionable.

Both IoTFlows and MachineCDN offer industrial monitoring platforms, but their approaches to energy tracking differ fundamentally. This comparison breaks down how each platform handles energy consumption data, where the gaps are, and which one gives your maintenance and operations teams the data they actually need to cut costs.

JSON-Based PLC Tag Configuration: Building Maintainable IIoT Device Templates [2026]

· 12 min read

If you've ever stared at a spreadsheet of 200 PLC register addresses trying to figure out which ones your SCADA system is actually polling, you know the pain. Traditional tag configuration — hardcoded in ladder logic comments, scattered across HMI screens, buried in proprietary configuration tools — doesn't scale.

The solution that's gaining traction in modern IIoT deployments is declarative, JSON-based tag configuration. Instead of configuring your data collection logic in opaque proprietary formats, you define your device's entire tag map as a structured JSON document. This approach brings version control, template reuse, and automated validation to the industrial data layer.

In this guide, we'll walk through the architecture of a production-grade JSON tag configuration system, drawing from real patterns used in industrial edge gateways connecting to Allen-Bradley Micro800 PLCs via EtherNet/IP and to various devices via Modbus RTU and TCP.

JSON-based PLC tag configuration for IIoT

Why JSON for PLC Tag Configuration?

The traditional approach to configuring PLC data collection involves vendor-specific tools: RSLinx for Allen-Bradley, TIA Portal for Siemens, or proprietary gateway configurators. These tools work, but they create several problems at scale:

  • No version control. You can't git diff a proprietary binary config file.
  • No templating. When you deploy the same machine type across 50 sites, you're manually recreating the same configuration 50 times.
  • No validation. Typos in register addresses don't surface until runtime.
  • No automation. You can't script the generation of configurations from a master device database.

JSON solves all of these. A tag configuration becomes a text file that can be:

  • Stored in Git with full change history
  • Templated per device type (one JSON per machine model)
  • Validated against a schema before deployment
  • Generated programmatically from engineering databases

Anatomy of a Tag Configuration Document

A well-structured PLC tag configuration document needs to capture several layers of information:

Device-Level Metadata

Every configuration file should identify the device type it applies to, carry a version string for change tracking, and specify the protocol:

{
"device_type": 1010,
"version": "a3f7b2c",
"name": "Continuous Blender Model X",
"protocol": "ethernet-ip",
"plctags": [ ... ]
}

The device_type field is a numeric identifier that maps to a specific machine model. When an edge gateway auto-detects a PLC (by reading a known register), it uses this type ID to look up the correct configuration file. The version field — ideally a short Git hash — lets you track which configuration version is running on each gateway in the field.

For Modbus devices, you'd also include protocol-specific parameters:

{
"device_type": 5000,
"version": "b8e1d4a",
"name": "Temperature Control Unit",
"protocol": "modbus-rtu",
"base_addr": 48,
"baud": 9600,
"parity": "even",
"data_bits": 8,
"stop_bits": 1,
"byte_timeout": 4,
"resp_timeout": 100,
"plctags": [ ... ]
}

Notice the serial link parameters are part of the same document. This is deliberate — you want a single source of truth for "how to talk to this device and what to read from it."

Tag Definitions: The Core Data Model

Each tag in the configuration represents a single data point you want to collect from the PLC. A complete tag definition captures:

{
"name": "barrel_zone1_temp",
"id": 42,
"type": "float",
"ecount": 2,
"sindex": 0,
"interval": 5,
"compare": true,
"do_not_batch": false
}

Let's break down each field:

name — A human-readable identifier for the tag. For EtherNet/IP (CIP) devices, this is the actual PLC tag name. For Modbus, it's a descriptive label since Modbus uses numeric addresses.

id — A numeric identifier used in the wire protocol when transmitting data to the cloud. Using compact integer IDs instead of string names dramatically reduces payload sizes — critical when you're sending telemetry over cellular connections.

type — The data type of the register value. Common types include:

| Type | Size | Range | Use Case |
|------|------|-------|----------|
| bool | 1 byte | 0 or 1 | Alarm states, run/stop status |
| int8 | 1 byte | -128 to 127 | Small counters, mode selectors |
| uint8 | 1 byte | 0 to 255 | Status codes, alarm bytes |
| int16 | 2 bytes | -32,768 to 32,767 | Temperature (×10), pressure |
| uint16 | 2 bytes | 0 to 65,535 | RPM, flow rate, raw ADC values |
| int32 | 4 bytes | ±2.1 billion | Production counters, energy |
| uint32 | 4 bytes | 0 to 4.2 billion | Lifetime counters, timestamps |
| float | 4 bytes | IEEE 754 | Temperature, weight, setpoints |

ecount (element count) — How many consecutive elements to read. For a single register, this is 1. For a 32-bit float stored across two Modbus registers, this is 2. For an array of 10 temperature readings, this is 10.

sindex (start index) — The starting element index for array reads. Combined with ecount, this lets you read slices of PLC arrays without pulling the entire array.

interval — How often (in seconds) to poll this tag. This is where you make intelligent decisions about bandwidth:

  • 1 second: Critical alarms, emergency stops, safety interlocks
  • 5 seconds: Process temperatures, pressures, flows
  • 30 seconds: Setpoints, mode selectors (change infrequently)
  • 300 seconds: Configuration parameters, serial numbers

compare — When true, the gateway compares each new reading against the previous value and only transmits if the value changed. This is the single most impactful optimization for reducing bandwidth and cloud ingestion costs.

do_not_batch — When true, the value is transmitted immediately rather than being accumulated into a batch payload. Use this for critical alarms that need sub-second cloud visibility.
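In the gateway runtime, a tag definition like this typically deserializes into a compact struct. The field names mirror the JSON keys above; the exact layout and sizes here are illustrative:

```c
#include <assert.h>
#include <stdint.h>

/* In-memory form of one tag definition (illustrative layout). */
typedef struct {
    char     name[48];      /* e.g. "barrel_zone1_temp" */
    uint16_t id;            /* compact wire-protocol ID */
    uint8_t  type;          /* value type enum: bool, int16, float, ... */
    uint16_t ecount;        /* consecutive elements to read */
    uint16_t sindex;        /* start index for array slices */
    uint16_t interval;      /* polling period in seconds */
    uint8_t  compare;       /* change-of-value filtering enabled */
    uint8_t  do_not_batch;  /* bypass batching for critical alarms */
} plc_tag_t;
```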

Modbus Address Conventions

For Modbus devices, each tag also carries an addr field that encodes both the register address and the function code:

{
"name": "process_temp",
"id": 10,
"addr": 400100,
"type": "float",
"ecount": 2,
"interval": 5,
"compare": true
}

The address convention follows a well-established pattern:

| Address Range | Modbus Function Code | Register Type |
|---------------|----------------------|---------------|
| 0 – 65,535 | FC 01 | Coils (read/write) |
| 100,000 – 165,535 | FC 02 | Discrete Inputs (read) |
| 300,000 – 365,535 | FC 04 | Input Registers (read) |
| 400,000 – 465,535 | FC 03 | Holding Registers (R/W) |

So addr: 400100 means "holding register at address 100, read via function code 3." This convention eliminates ambiguity about which Modbus function to use — the address itself encodes it.

Why this matters: A common source of bugs in Modbus deployments is using the wrong function code. Someone configures a tag to read address 100 with FC 03 when the device exposes it as an input register (FC 04). With the address convention above, the function code is implicit and unambiguous.
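The convention can be decoded mechanically at config-load time. A sketch, with illustrative function names:

```c
#include <assert.h>
#include <stdint.h>

/* Map a combined address to its implied Modbus function code.
 * Returns 0 for the unused ranges. */
static uint8_t addr_to_fc(uint32_t addr) {
    switch (addr / 100000) {
    case 0: return 1;  /* coils */
    case 1: return 2;  /* discrete inputs */
    case 3: return 4;  /* input registers */
    case 4: return 3;  /* holding registers */
    default: return 0;
    }
}

/* Strip the range prefix to get the zero-based register address. */
static uint16_t addr_to_register(uint32_t addr) {
    return (uint16_t)(addr % 100000);
}
```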

Advanced Patterns: Calculated and Dependent Tags

Simple register reads cover 80% of use cases. But industrial devices often pack multiple boolean values into a single 16-bit alarm word, or have tags whose values only matter when a parent tag changes.

Calculated Tags: Extracting Bits from Alarm Words

Many PLCs pack 16 individual alarm flags into a single uint16 register. Rather than reading 16 separate coils, you read one register and extract the bits:

{
"name": "alarm_word_1",
"id": 50,
"addr": 400200,
"type": "uint16",
"ecount": 1,
"interval": 1,
"compare": true,
"calculated": [
{
"name": "high_temp_alarm",
"id": 51,
"type": "bool",
"shift": 0,
"mask": 1
},
{
"name": "low_pressure_alarm",
"id": 52,
"type": "bool",
"shift": 1,
"mask": 1
},
{
"name": "motor_overload",
"id": 53,
"type": "bool",
"shift": 2,
"mask": 1
}
]
}

When alarm_word_1 is read, the gateway automatically:

  1. Reads the raw uint16 value
  2. For each calculated tag, applies the right-shift and mask to extract the bit
  3. Compares the extracted boolean against its previous value
  4. Only transmits if the bit actually changed

This is vastly more efficient than polling 16 individual coils — one Modbus read instead of 16, with identical semantic output.
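The shift-and-mask step itself is a one-liner. Using the configuration above, high_temp_alarm is bit 0, low_pressure_alarm bit 1, and motor_overload bit 2:

```c
#include <assert.h>
#include <stdint.h>

/* Extract a calculated tag from a packed alarm word using the
 * "shift" and "mask" fields from the configuration. */
static uint16_t extract_calculated(uint16_t word, uint8_t shift, uint16_t mask) {
    return (uint16_t)((word >> shift) & mask);
}
```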

Dependent Tags: Event-Driven Secondary Reads

Some tags only need to be read when a related tag changes. For example, you might have a machine_state register that changes between IDLE, RUNNING, and FAULT. When it changes, you want to immediately read a block of diagnostic registers — but you don't want to poll those diagnostics every cycle when the machine state is stable.

{
"name": "machine_state",
"id": 100,
"addr": 400001,
"type": "uint16",
"ecount": 1,
"interval": 1,
"compare": true,
"dependents": [
{
"name": "fault_code",
"id": 101,
"addr": 400010,
"type": "uint16",
"ecount": 1,
"interval": 60
},
{
"name": "fault_timestamp",
"id": 102,
"addr": 400011,
"type": "uint32",
"ecount": 2,
"interval": 60
}
]
}

When machine_state changes, the gateway forces an immediate read of all dependent tags, regardless of their normal polling interval. This gives you:

  • Low latency on state transitions — fault diagnostics arrive within 1 second of the fault occurring
  • Low bandwidth during steady state — diagnostic registers are only polled every 60 seconds when nothing is happening
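One simple way to implement the forced read is to track a next-due timestamp per tag and reset it for every dependent when the parent changes. This is a sketch; the struct and field names are assumptions, not from any specific gateway:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint16_t interval;  /* normal polling period, seconds */
    uint32_t next_due;  /* epoch seconds of the next scheduled read */
} dep_tag_t;

/* Called when the parent tag's value changes: make every dependent
 * due immediately, regardless of its normal interval. */
static void trigger_dependents(dep_tag_t *deps, size_t n, uint32_t now) {
    for (size_t i = 0; i < n; i++)
        deps[i].next_due = now;
}
```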

Contiguous Register Optimization

One of the most impactful optimizations in Modbus data collection is contiguous register grouping. Instead of making separate Modbus read requests for each tag, the gateway sorts tags by address and groups adjacent registers into single bulk reads.

Consider these tags:

[
{ "name": "temp_1", "addr": 400100, "ecount": 1 },
{ "name": "temp_2", "addr": 400101, "ecount": 1 },
{ "name": "temp_3", "addr": 400102, "ecount": 1 },
{ "name": "pressure", "addr": 400103, "ecount": 2 }
]

A naive implementation makes four separate Modbus requests. An optimized one makes one request: read 5 registers starting at address 400100. The response contains all four values, which are dispatched to the correct tag definitions.

For this optimization to work, the configuration system must:

  1. Sort tags by address at load time, not at runtime
  2. Validate that function codes match — you can't group a coil read (FC 01) with a holding register read (FC 03)
  3. Respect maximum packet sizes — Modbus TCP allows up to 125 registers per read; some devices are more restrictive
  4. Respect polling intervals — only group tags that share the same polling interval

The performance difference is dramatic. A typical PLC with 50 Modbus tags might require 50 individual reads (50 × ~10ms = 500ms per cycle) or 5 grouped reads (5 × ~10ms = 50ms per cycle). That's a 10× improvement in polling speed.
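A grouping pass over address-sorted tags might look like the following sketch, assuming all tags in the input share the same function code and polling interval (requirements 2 and 4 above):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { uint32_t addr; uint16_t ecount; } tag_span_t;
typedef struct { uint32_t start; uint16_t count; } read_group_t;

/* Merge address-sorted tags into contiguous bulk reads, capped at
 * max_regs registers (125 for standard Modbus TCP). Returns the
 * number of groups written to out. */
static size_t group_reads(const tag_span_t *tags, size_t n,
                          read_group_t *out, uint16_t max_regs) {
    size_t g = 0;
    for (size_t i = 0; i < n; i++) {
        uint32_t end = tags[i].addr + tags[i].ecount;
        if (g > 0 &&
            tags[i].addr <= out[g - 1].start + out[g - 1].count &&
            end - out[g - 1].start <= max_regs) {
            /* Adjacent or overlapping: extend the current group. */
            if (end > out[g - 1].start + out[g - 1].count)
                out[g - 1].count = (uint16_t)(end - out[g - 1].start);
        } else {
            out[g].start = tags[i].addr;
            out[g].count = tags[i].ecount;
            g++;
        }
    }
    return g;
}
```

Running this over the four tags above produces a single group: five registers starting at 400100.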

IEEE 754 Float Handling: The Register Order Problem

Reading 32-bit floating-point values over Modbus is notoriously tricky because the Modbus specification doesn't define register byte ordering for multi-register values. A float spans two 16-bit registers, and different PLCs may store them in different orders:

  • Big-endian (AB CD): Register N contains the high word, N+1 the low word
  • Little-endian (CD AB): Register N contains the low word, N+1 the high word
  • Mid-endian (BA DC or DC BA): Each word's bytes are swapped

Your tag configuration should support specifying the byte order, or at least document which convention your gateway assumes. Most libraries provide explicit byte-order helpers (libmodbus, for example, offers modbus_get_float_abcd() and variants for each ordering); even so, always verify against your specific PLC.

Pro tip: When commissioning a new device, read a register where you know the expected value (e.g., a temperature setpoint showing 72.0°F on the HMI). If the gateway reads 72.0, your byte order is correct. If it reads 2.388e-38 or 1.23e+12, you have a byte-order mismatch.
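The commissioning check can be codified. Both word orders decode the same register pair, but only one yields the expected setpoint. This sketch handles word order only; mid-endian byte swaps within each word would need an extra step:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Decode two 16-bit Modbus registers into a float, with the word
 * order selectable per device. */
static float regs_to_float(uint16_t r0, uint16_t r1, int high_word_first) {
    uint32_t raw = high_word_first
        ? ((uint32_t)r0 << 16) | r1
        : ((uint32_t)r1 << 16) | r0;
    float f;
    memcpy(&f, &raw, sizeof f);
    return f;
}
```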

Binary vs. JSON Telemetry Encoding

Once you've collected your tag values, you need to transmit them. Your configuration should support both JSON and binary encoding, with the choice driven by bandwidth constraints:

JSON encoding is human-readable and debuggable:

{
"groups": [{
"ts": 1709500800,
"device_type": 1010,
"serial_number": 85432,
"values": [
{ "id": 42, "values": [72.3] },
{ "id": 43, "values": [true] }
]
}]
}

Binary encoding is 3-5× smaller. A typical binary frame packs:

  • 1-byte header marker
  • 4-byte group count
  • Per group: 4-byte timestamp, 2-byte device type, 4-byte serial number, 4-byte value count
  • Per value: 2-byte tag ID, 1-byte status, 1-byte value count, 1-byte value size, then raw value bytes

A batch that's 2,000 bytes in JSON might be 400 bytes in binary. Over a cellular connection billed per megabyte, that savings compounds fast.

Putting It All Together: Configuration Lifecycle

A production deployment follows this lifecycle:

  1. Template creation: For each machine model, create a JSON tag configuration. Store it in Git.
  2. Deployment: Push configurations to edge gateways via your device management platform. The gateway monitors the config file and reloads automatically when it changes.
  3. Auto-detection: When the gateway starts, it queries the PLC for its device type (a known register). It then matches the type to the correct configuration file.
  4. Validation: At load time, validate register addresses (no duplicates, valid ranges), data types, and interval values. Reject invalid configs before they cause runtime errors.
  5. Runtime: The gateway polls tags according to their configured intervals, applies change detection, groups contiguous registers, and batches values for transmission.
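The duplicate-address check in step 4 is cheap to do at load time. A minimal sketch:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if any two tags point at the same register address.
 * O(n^2) is fine at load time for configs of a few hundred tags. */
static int has_duplicate_addrs(const uint32_t *addrs, size_t n) {
    for (size_t i = 0; i < n; i++)
        for (size_t j = i + 1; j < n; j++)
            if (addrs[i] == addrs[j])
                return 1;
    return 0;
}
```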

How machineCDN Handles Tag Configuration

machineCDN's edge gateway uses this exact pattern — JSON-based device templates that are automatically selected based on PLC auto-detection. Each machine type in a plastics manufacturing facility (blenders, dryers, granulators, chillers, TCUs) has its own configuration template with pre-mapped tags, optimized polling intervals, and calculated alarm decomposition.

When a new machine is connected, the gateway detects the PLC type, loads the matching template, and starts collecting data — typically in under 30 seconds with zero manual configuration. For plants running 20+ machines across 5 different models, this eliminates weeks of commissioning time.

Common Pitfalls

1. Overlapping addresses. Two tags pointing to the same register with different IDs will cause confusion in your data pipeline. Validate for uniqueness at load time.

2. Wrong element count for floats. A 32-bit float on Modbus requires ecount: 2 (two 16-bit registers). Setting ecount: 1 gives you garbage data.

3. Polling too fast on serial links. Modbus RTU over RS-485 at 9600 baud can handle roughly 10-15 register reads per second. If you configure 50 tags at 1-second intervals, you'll never keep up. Budget your polling rate against your link speed.

4. Missing change detection on high-volume tags. Without compare: true, every reading gets transmitted. For a tag polled every second, that's 86,400 data points per day — even if the value never changed.

5. Batch timeout too long. If your batch timeout is 60 seconds but an alarm fires, it won't reach the cloud for up to a minute unless that alarm tag has do_not_batch: true.
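The polling budget in pitfall 3 falls out of simple link arithmetic. This back-of-envelope estimate assumes an FC 03 single-register transaction (8 request bytes, 7 response bytes), 11 bits per character (start, 8 data, parity, stop), and a slave turnaround time that varies by device:

```c
#include <assert.h>

/* Estimate Modbus RTU transactions per second at a given baud rate,
 * including the two 3.5-character inter-frame silences. */
static double rtu_reads_per_sec(double baud, double turnaround_ms) {
    double char_ms = 11.0 * 1000.0 / baud;
    double wire_ms = (8 + 7 + 2 * 3.5) * char_ms;
    return 1000.0 / (wire_ms + turnaround_ms);
}
```

At 9600 baud with a typical 30-50 ms turnaround, this lands in the 13-18 reads/second range, consistent with the rule of thumb above.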

Conclusion

JSON-based tag configuration isn't just a nice-to-have — it's a fundamental enabler for scaling IIoT deployments. It brings software engineering best practices (version control, templating, validation, automation) to a domain that has traditionally relied on manual, vendor-specific tooling.

The key design principles are:

  • One file per device type with version tracking
  • Rich tag metadata covering data types, intervals, and delivery modes
  • Hierarchical relationships for calculated and dependent tags
  • Protocol-aware addressing that encodes function codes implicitly
  • Contiguous register grouping for optimal Modbus performance

Get this foundation right, and you'll spend your time analyzing machine data instead of debugging data collection.

Machine Changeover Time Tracking with IIoT: How to Cut Setup Time and Boost OEE

· 8 min read
MachineCDN Team
Industrial IoT Experts

Changeover time — the gap between the last good part of one run and the first good part of the next — is one of manufacturing's most persistent productivity killers. In most plants, changeovers consume 10-30% of available production time. Worse, most manufacturers don't actually measure changeover time accurately. They estimate. They round up. They accept "about two hours" when the actual time ranges from 45 minutes to four hours depending on the shift, the operator, and the product.