Fire Panel - Gateway Test Automation with TestBot

CASE STUDY SNAPSHOT

Customer : Leading fire safety equipment manufacturer specializing in integrated detection and alarm systems
Project Vertical : Building Safety and Fire Detection Systems
System Scope : Multi-zone fire control panel with networked gateway and cloud-based central monitoring platform
Regulatory Context : NFPA 72, EN 54 compliance - comprehensive validation required for safety and audit purposes
Challenge : Validating a three-tier fire safety system (panel, gateway, cloud) requiring guaranteed event delivery, accurate timestamping, reliable buffering during offline scenarios, and correct bidirectional synchronization of operator actions under all operating and failure conditions.
Solution : Implemented TestBot’s Custom Simulator Agent to automate event generation, simulate operator actions, validate panel behavior, verify cloud synchronization, and ensure end-to-end consistency across panel, gateway, and cloud layers.
Tools and Technologies :
  • Framework: TestBot (Unified automated testing framework)
  • Agent: Custom Simulator Agent
  • System Components: Fire Control Panel, Protocol Gateway, Cloud Monitoring Platform
  • Validation Scope: Event flow, buffering, retry logic, acknowledgment propagation
  • Compliance: NFPA 72, EN 54

Customer Overview

The customer develops advanced fire safety systems where a multi-zone Fire Control Panel integrates with a protocol gateway and a cloud-based monitoring platform.

System Architecture

The system operates as three integrated tiers:

  • Fire Control Panel - Receives signals from distributed smoke and heat detectors, processes alarms, activates local notification systems, and communicates with the gateway.
  • Custom Protocol Gateway - Receives events from the panel, translates them into custom protocol messages, buffers them if the cloud is offline, and relays them to the cloud platform.
  • Cloud Monitoring Platform - Receives events from all deployed gateways, maintains real-time status, enables remote operator actions (acknowledge, silence, reset), and provides compliance reporting.

For this system to work correctly, three synchronization requirements must be met:

  • Panel-to-Gateway: Every alarm and state change reaches the gateway with no loss
  • Gateway-to-Cloud: Every event is timestamped, queued, and transmitted (or buffered if offline)
  • Bidirectional Coherence: Operator actions from the cloud propagate back to the panel correctly

The Problem

Validating this three-tier system required orchestrating multiple devices, systems, and event flows simultaneously. The team attempted manual testing: trigger smoke detectors, watch the gateway display, log into the cloud console, verify the event arrived, acknowledge the alarm, verify the acknowledgment reached the panel.

Each test case took 10-15 minutes. For dozens of zones, multiple alarm states, and failure scenarios, comprehensive testing required weeks of manual labour - and critical failure paths (cloud offline, simultaneous multi-zone alarms, gateway reconnection edge cases) were either skipped or tested only once.

The team had no way to systematically test whether event acknowledgments propagated correctly back to the panel when the gateway reconnected to the cloud, or what happened when three alarms arrived simultaneously, or whether events were lost if the cloud rejected a transmission with a transient error.

The Solution: TestBot's Custom Simulator Agent for Fire Panel

TestBot's Custom Simulator Agent bridges the three tiers of the fire safety system, controlling events and verifying state synchronization across all three layers. The simulator operates as a test harness that can:

  • Add/remove detector zones and configure their properties
  • Create alarm, supervisory, and fault events on any zone
  • Acknowledge alarms from the cloud console
  • Silence sounders, reset systems, arm/disarm zones - simulating operator actions
  • Clear events to test recovery and state cleanup
  • Query the cloud event log to verify events reached the platform
  • Monitor panel responses (GPIO state, relay changes, sounder output)

Every operation is controlled by TestBot via parameterized test data. A test case defines a sequence of operations, what the panel should do in response, and what should appear in the cloud. TestBot executes the sequence automatically, compares results against expected values, and reports pass/fail.
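To make the idea of parameterized test data concrete, the following is a minimal sketch of a data-driven test case in Python. TestBot's actual test-data format and the Custom Simulator Agent's API are not shown in this case study, so every name here (Step, FakeSimulator, run_case) is hypothetical, and the simulator is an in-memory stand-in rather than the real three-tier system.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    """One operation plus the state expected afterwards (hypothetical format)."""
    operation: str                                      # "create_alarm", "acknowledge", "clear"
    params: dict
    expect_panel: dict = field(default_factory=dict)    # zone -> expected panel state
    expect_cloud: dict = field(default_factory=dict)    # zone -> expected cloud state


class FakeSimulator:
    """In-memory stand-in for the simulator agent; assumes perfect synchronization."""

    def __init__(self):
        self.panel = {}
        self.cloud = {}

    def apply(self, operation, params):
        new_state = {
            "create_alarm": "alarm",
            "acknowledge": "acknowledged",
            "clear": "normal",
        }
        if operation not in new_state:
            raise ValueError(f"unknown operation: {operation}")
        zone = params["zone"]
        self.panel[zone] = new_state[operation]
        self.cloud[zone] = new_state[operation]


def run_case(sim, steps):
    """Execute each step and return a list of mismatches; an empty list means pass."""
    failures = []
    for index, step in enumerate(steps, start=1):
        sim.apply(step.operation, step.params)
        checks = (("panel", step.expect_panel, sim.panel),
                  ("cloud", step.expect_cloud, sim.cloud))
        for tier, expected, actual in checks:
            for zone, want in expected.items():
                got = actual.get(zone)
                if got != want:
                    failures.append(
                        f"step {index} {tier} zone {zone}: expected {want}, got {got}")
    return failures


# The single-zone flow expressed purely as data.
single_zone_case = [
    Step("create_alarm", {"zone": 1}, {1: "alarm"}, {1: "alarm"}),
    Step("acknowledge", {"zone": 1}, {1: "acknowledged"}, {1: "acknowledged"}),
    Step("clear", {"zone": 1}, {1: "normal"}, {1: "normal"}),
]
```

Keeping the sequence and the expectations in data means a new scenario is a new list of steps, not new test code.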

Test Scenarios

Test 1: Single Zone Alarm - Basic Event Flow

Create smoke detection event on Zone 1 → Panel displays alarm and activates sounder → Event recorded in cloud with correct timestamp → Operator acknowledges via cloud console → Acknowledgment propagates to panel, sounder silences → Clear event in simulator → Cloud log updated with clear event.

Result: Pass - Event flows through all three tiers with correct timestamps. Acknowledgment propagates correctly.

Test 2: Multi-Zone Simultaneous Alarms

Create three smoke alarms within 50ms of each other → Panel identifies all zones and displays highest priority → All three events recorded separately in cloud → Acknowledge Zone 1 only → Cloud and panel both show Zone 1 acknowledged, Zones 2 & 3 still in alarm → Acknowledge remaining zones → All zones show acknowledged → Clear all events.

Result: Pass - System correctly isolates and tracks multiple simultaneous events. Partial acknowledgment does not interfere with other alarms.

Test 3: Gateway Offline During Alarm

Simulate gateway losing cloud connectivity → Create alarm on Zone 1 while gateway offline → Verify alarm in gateway local buffer but NOT in cloud → Simulate gateway reconnection → Cloud receives buffered alarm with original timestamp → Verify no duplicate entries → Acknowledge alarm via cloud → Acknowledgment propagates back through gateway to panel.

Result: Pass - Gateway correctly buffers events during offline and replays without loss or duplication. Timestamps preserved.

Test 4: Cloud Rejection and Retry Logic

Create alarm on Zone 1 → Simulate cloud API returning error 409 (Conflict/Duplicate) → Verify gateway detects rejection and initiates retry → Simulate cloud accepting the event on retry → Query cloud event log - event recorded after N retries → Verify gateway logs show initial rejection and successful retry.

Result: Pass - Gateway implements retry logic on cloud rejection. Event succeeds without manual intervention.

Test 5: Acknowledgment Propagation Failure

Create alarm on Zone 1 → Acknowledge via cloud console → Simulate gateway-to-panel communication failure → Verify cloud state shows acknowledged but panel still shows alarm active → Test detects state incoherence and logs failure → Simulate gateway recovery → Acknowledgment retransmitted to panel → State coherence restored.

Result: Fail (with caveat) - System detects state divergence. Acknowledgment eventually propagates and corrects the state. Issue escalated: persistent gateway-to-panel failures can leave the sounder active while operators believe it is silenced - compliance concern.
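The divergence check in this test can be sketched as a comparison of per-zone state between cloud and panel, with a bounded wait to separate a brief propagation delay from a persistent fault. This is an illustration only: the function names, the state strings, and the timeout values are assumptions, not TestBot's implementation.

```python
import time


def find_divergence(cloud_state, panel_state):
    """Return {zone: (cloud, panel)} for every zone whose state differs."""
    zones = set(cloud_state) | set(panel_state)
    return {
        zone: (cloud_state.get(zone), panel_state.get(zone))
        for zone in sorted(zones)
        if cloud_state.get(zone) != panel_state.get(zone)
    }


def wait_for_coherence(read_cloud, read_panel, timeout_s=5.0, poll_s=0.1):
    """Poll both tiers until they agree or the deadline passes.

    Returns the remaining divergence; an empty dict means the tiers agree.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        diff = find_divergence(read_cloud(), read_panel())
        if not diff or time.monotonic() >= deadline:
            return diff
        time.sleep(poll_s)
```

A non-empty result after the timeout is the condition worth escalating: the cloud reports "acknowledged" while the panel still reports an active alarm.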

Test 6: Supervisory and Fault Events

Create low battery supervisory event on Zone 2 → Panel displays supervisory indication without triggering sounder → Create alarm on Zone 1 while supervisory active → Panel correctly prioritizes alarm over supervisory → Cloud log shows both events independently tracked → Clear supervisory event → Verify alarm on Zone 1 unaffected.

Result: Pass - Supervisory and alarm events correctly isolated and tracked independently. No interference between event types.

Critical Bugs Found and Fixed

Bug 1: Event ID Collision Under High-Rate Alarm

When three alarms triggered within 50ms, the gateway assigned identical event IDs (based on millisecond-resolution timestamp). Cloud rejected the second alarm as a duplicate, silently dropping it from the event log. In a real multi-zone fire, later alarms could be lost.

Fix: Event ID now includes sub-millisecond counter.
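One way to picture this fix is an ID built from the timestamp plus a counter, so that events sharing a millisecond still receive distinct IDs. The sketch below is an assumption about the general shape, not the gateway's actual code: the ID layout, the class name, and the use of a process-wide increasing counter are all illustrative.

```python
import itertools
import threading


class EventIdGenerator:
    """Builds IDs of the form <gateway>-<timestamp_ms>-<sequence> (hypothetical layout)."""

    def __init__(self, gateway_id):
        self._gateway_id = gateway_id
        self._counter = itertools.count()   # increases for the life of the process
        self._lock = threading.Lock()       # keeps the sequence safe across threads

    def next_id(self, timestamp_ms):
        with self._lock:
            sequence = next(self._counter)
        # The sequence number disambiguates events with identical timestamps.
        return f"{self._gateway_id}-{timestamp_ms}-{sequence}"


generator = EventIdGenerator("gw1")
same_millisecond = 1_700_000_000_000
burst_ids = [generator.next_id(same_millisecond) for _ in range(3)]
```

With a timestamp-only ID, the three entries in burst_ids would be identical; here they differ in the final field.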

Bug 2: Acknowledgment Lost During Gateway Reconnect

When the gateway reconnected to the cloud, the command queue (containing pending acknowledgments) was cleared, discarding the acknowledgment before it reached the panel. The operator then saw "acknowledged" in the cloud while the sounder was still active in the building.

Fix: Command queue preserved across reconnections with retry mechanism.
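A minimal sketch of a command queue with this property follows. It assumes commands are delivered in order and removed only after the panel link confirms delivery; the class and method names are hypothetical and do not reflect the gateway firmware.

```python
from collections import deque


class CommandQueue:
    """Holds cloud-to-panel commands until delivery is confirmed."""

    def __init__(self):
        self._pending = deque()

    def enqueue(self, command):
        self._pending.append(command)

    def on_reconnect(self):
        # Intentionally a no-op: pending commands survive a reconnect.
        # Clearing the queue at this point is the behavior described in Bug 2.
        pass

    def flush(self, send):
        """Deliver commands in order; stop at the first failure and keep it queued.

        `send` is any callable returning True on confirmed delivery.
        Returns the number of commands delivered.
        """
        delivered = 0
        while self._pending:
            if not send(self._pending[0]):
                break
            self._pending.popleft()
            delivered += 1
        return delivered

    def __len__(self):
        return len(self._pending)
```

Calling flush again after each reconnect gives the retry behavior: an acknowledgment that failed once stays at the head of the queue until it is delivered.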

Bug 3: State Incoherence After Partial Acknowledgment

Panel incorrectly transitioned to all-clear when one zone was cleared, even though other zones remained in alarm. System falsely signalled "all clear" while actual alarm condition was still active.

Fix: State machine now tracks acknowledged vs. normal separately; all-clear only when all zones normal.
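The corrected rule can be expressed in a few lines. The sketch below assumes three per-zone states and is only an illustration of the logic described in the fix; the names ZoneState and PanelState are hypothetical.

```python
from enum import Enum


class ZoneState(Enum):
    NORMAL = "normal"
    ALARM = "alarm"
    ACKNOWLEDGED = "acknowledged"   # operator has seen it, condition still present


class PanelState:
    def __init__(self, zones):
        self.zones = {zone: ZoneState.NORMAL for zone in zones}

    def alarm(self, zone):
        self.zones[zone] = ZoneState.ALARM

    def acknowledge(self, zone):
        # Acknowledging only applies to a zone that is currently in alarm.
        if self.zones[zone] is ZoneState.ALARM:
            self.zones[zone] = ZoneState.ACKNOWLEDGED

    def clear(self, zone):
        self.zones[zone] = ZoneState.NORMAL

    def all_clear(self):
        # Acknowledged is not normal: every zone must be back to NORMAL.
        return all(state is ZoneState.NORMAL for state in self.zones.values())
```

Because all_clear inspects every zone, clearing or acknowledging one zone cannot mask an alarm that is still active on another.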

Bug 4: Incorrect Timestamps During Gateway Offline Replay

Events buffered while gateway was offline were timestamped in the cloud with the replay time, not the original event time. Timeline became inaccurate for compliance reporting.

Fix: Gateway now preserves original event timestamps and transmits them during replay.
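The sketch below illustrates the principle: the event time is fixed when the event occurs and travels with the event through the buffer, so replay cannot overwrite it. The field names, the in-memory buffer, and the FakeCloud class (which treats uploads as idempotent on event ID) are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    event_id: str
    zone: int
    occurred_at_ms: int   # set once at event time; frozen, so replay cannot change it


class FakeCloud:
    """Stand-in for the cloud event log."""

    def __init__(self):
        self.log = {}

    def upload(self, event):
        # Idempotent on event_id: a repeated upload does not create a duplicate.
        self.log.setdefault(event.event_id, event)


class GatewayBuffer:
    def __init__(self):
        self._buffer = []

    def record(self, event, cloud_online, upload):
        if cloud_online:
            upload(event)
        else:
            self._buffer.append(event)

    def replay(self, upload):
        """Send buffered events oldest first; remove each only after upload returns."""
        sent = 0
        while self._buffer:
            upload(self._buffer[0])
            self._buffer.pop(0)
            sent += 1
        return sent


cloud = FakeCloud()
gateway = GatewayBuffer()
offline_event = Event("gw1-1000-0", zone=1, occurred_at_ms=1000)
gateway.record(offline_event, cloud_online=False, upload=cloud.upload)
replayed = gateway.replay(cloud.upload)   # happens later; the event time is unchanged
```

After replay, the cloud log entry carries occurred_at_ms of 1000, the original event time, regardless of when the upload took place.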

Bug 5: Event Loss on Cloud 5xx Errors

When cloud returned a 5xx server error (transient), the gateway gave up after one retry and discarded the event. Transient cloud issues resulted in permanent event loss.

Fix: Retry logic now distinguishes between client errors (stop) and server errors (indefinite retry).
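A hedged sketch of this classification is shown below. The case study states only that client errors stop and server errors retry indefinitely; the exponential backoff with a cap, the parameter names, and the treatment of every non-5xx failure as final are assumptions added for illustration, and a real gateway may special-case individual status codes.

```python
import time


def should_retry(status):
    """Server errors (5xx) are treated as transient; everything else is final."""
    return 500 <= status <= 599


def send_with_retry(send, event, base_delay_s=1.0, max_delay_s=60.0, sleep=time.sleep):
    """Call send(event) until it succeeds or fails with a non-retryable status.

    `send` returns an HTTP-style status code. Returns (delivered, attempts).
    Retries on 5xx never give up; the delay doubles each time up to max_delay_s.
    """
    attempts = 0
    delay = base_delay_s
    while True:
        attempts += 1
        status = send(event)
        if 200 <= status <= 299:
            return True, attempts
        if not should_retry(status):
            return False, attempts      # client error: stop, do not loop forever
        sleep(delay)
        delay = min(delay * 2, max_delay_s)
```

The event is never discarded inside the loop: it is either delivered or returned to the caller as undelivered with a final status, which leaves the decision about what to do next visible rather than silent.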

Impact and Metrics

Metric                        | Before TestBot      | After TestBot           | Improvement
------------------------------|---------------------|-------------------------|-----------------
Test execution time           | 8-10 hours          | 18 minutes              | 97% reduction
Test coverage                 | ~25%                | 100%                    | +300%
Time to detect bugs           | 6-12 weeks (field)  | Immediate (pre-release) | Before customer
Release cycle time            | 5-6 weeks           | 3 weeks                 | 45% faster
Field failures (post-release) | 2-3 per year        | 0 (ongoing)             | 100% elimination
Estimated savings             | -                   | $2.3M annually          | -

Conclusion

The fire safety system belongs to a critical class of products for which integration testing is mandated by regulation (NFPA 72, EN 54). Manual testing of a three-tier system (panel, gateway, cloud) does not scale. TestBot's Custom Simulator Agent made comprehensive integration testing achievable:

  • 97% reduction in test time - 8-10 hours manual → 18 minutes automated
  • 100% scenario coverage - all normal and fault combinations tested systematically
  • Five critical bugs caught pre-release - all safety and compliance violations
  • Zero field integration failures since deployment - regressions caught at build time
  • 45% faster release cycle - fewer manual bottlenecks

For any product integrating external systems, networks, or cloud platforms, automated three-tier validation is essential. The automotive instrument cluster appendix demonstrates how TestBot's multi-agent architecture scales to include mechanical actuation and visual validation - applicable across any embedded system requiring coordinated electrical, mechanical, and visual testing.