Cold Chain Monitoring — Home Assistant on OpenAgTechnology

Cold chain monitoring is mostly about catching failures before they cost product. A walk-in cooler that drifts from 35°F to 50°F overnight ruins whatever is in it; the grower who learns about the failure in the morning has already lost the product. The monitoring job is not to control the coolers — commercial refrigeration systems have their own control and safety — but to watch them continuously, alert on real problems quickly enough to respond, and document the conditions for compliance and traceability. A well-designed cold chain monitoring deployment in Home Assistant combines reliable temperature sensors, alert logic tuned to the operation's tolerances, redundancy that catches single-sensor failures, and a logging strategy that produces audit-grade records. This page covers the specific patterns — where to place sensors, how to build alerts that catch problems without crying wolf, what compliance documentation typically requires, and the failure modes that cost product when monitoring isn't set up carefully.

Before building cold chain monitoring.

Prerequisites:

Organizational foundation. Per [Organizing Home Assistant for a Farm](/home-assistant/agriculture/organizing). Each cold-storage area is its own area with properly-named sensors.

Reliable sensors. Cold-chain sensors need to be more reliable than typical monitoring sensors because a missed alert costs product. Consumer BLE sensors work for budget-conscious operations; operations with compliance requirements benefit from higher-tier sensors with documented accuracy.

Hardware safety on the refrigeration itself. Commercial walk-in coolers have their own thermostats, compressor safeties, and alarm systems. Home Assistant is the layer that watches the refrigeration; the refrigeration's own safety systems are the primary protection for the product. Home Assistant's value is catching situations the refrigeration's local alarms miss — a door left open, a power blink that reset the controller, a slow drift that stays within the local alarm band but is still unhealthy.

An understanding of compliance requirements. If the operation is subject to food safety regulations (FSMA, HACCP plans, state regulations), the cold-chain monitoring setup needs to meet specific documentation requirements — sampling frequency, retention periods, validation procedures, audit trails. The compliance requirements drive some of the design decisions.

What cold chain means.

"Cold chain" refers to the continuous refrigeration a product experiences from harvest or production through storage, transport, and sale. Breaks in the chain — periods when product was warmer than it should have been — reduce shelf life, increase food safety risk, and can force product disposal.

For agricultural operations, the typical cold-chain points are:

Field or harvest holding. Product held briefly at ambient before transport to cooling.

Pre-cooling. Active cooling immediately after harvest to bring field heat down rapidly. Hydro-cooling, forced-air cooling, or vacuum cooling depending on the product.

Cold storage. Walk-in coolers or cold rooms where product is held at specified temperatures (35-45°F for most leafy greens and berries; other products have their own targets).

Packhouse. Processing and packing areas, typically at moderate temperatures (55-65°F) to allow worker comfort while slowing product deterioration.

Shipping dock. Brief holding before product leaves.

Truck and delivery. Outside Home Assistant's direct scope but sometimes monitored via IoT trackers.

The home-operation parts of the chain — cold storage, packhouse, shipping dock — are where Home Assistant monitoring applies. Home Assistant tracks the conditions continuously, alerts on excursions, and produces the records that document cold-chain integrity.

Sensor placement for cold chain.

Cold-chain placement differs from growing-zone placement because the measurement questions differ.

Walk-in cooler placement.

Place sensors where the product actually sits, not at the front door where they're easy to access. The temperature near the evaporator (coldest point) and near the door (warmest point) can differ by 5-10°F; the product-zone temperature is what matters.

Typical pattern for a walk-in cooler:

- One primary sensor in the middle of the product zone, at product height.
- One secondary sensor near the door (warmest spot; catches door-left-open events).
- One tertiary sensor near the evaporator (coldest spot; catches refrigeration cycling and freezing risk for freeze-sensitive products).

Three sensors per cooler is more than the minimum but catches the different failure modes. A single-sensor deployment catches some failures but misses others.

Packhouse placement.

Packhouses are larger and more variable than coolers. Multiple sensors at different locations capture the actual conditions across the space. Sensors near the packing line, near the cold storage door (often warmest), near the shipping door, and at a central location provide reasonable coverage.

Pre-cooling placement.

For active pre-cooling equipment, sensors on both the incoming product and the discharge confirm the cooling is actually happening. Flow-based cooling effectiveness (degrees reduced per minute of processing) is measurable with sensors on both ends.

Sensor mounting.

Stainless steel or food-grade plastic enclosures for areas where sensors may contact product or be near wash-down. Cable-routed sensors rather than battery-powered when possible for compliance-critical areas (battery failures produce monitoring gaps). Probe-style sensors for inserting into product stacks give more realistic readings than sensors measuring headspace.

Alert strategy.

An alert that doesn't fire in time is useless. An alert that fires constantly gets ignored. Tuning alerts for cold-chain monitoring is specific work.

Tiered alerts.

Different severity levels for different conditions:

Informational. Cooler temperature approaching the upper bound. Not yet a problem, but worth knowing. Log only; no notification.

Warning. Cooler above upper bound for 5 minutes. Possible short-term event (door opened, defrost cycle) or early indicator of problem. Notification to primary responsible person.

Critical. Cooler above upper bound for 15 minutes, or above a higher "serious" threshold. Real problem. Notification to primary person plus backup contacts. Possible automated response (adjust setpoint, activate backup cooling if available).

Emergency. Cooler above product-damage threshold, or multiple sensors all reading high (indicates cooler-wide failure rather than sensor issue). All contacts notified. Possibly automated documentation of the event start time for compliance records.

The tiered approach means alerts fire in proportion to severity. Growers learn that a warning means "check when convenient"; a critical alert means "respond now."
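The tiers above can be sketched as a small classification function. This is plain Python meant to illustrate the logic, not Home Assistant configuration; the threshold values and the names `Tier` and `classify` are hypothetical and would need tuning to the operation's product and tolerances.

```python
from enum import IntEnum


class Tier(IntEnum):
    OK = 0
    INFORMATIONAL = 1
    WARNING = 2
    CRITICAL = 3
    EMERGENCY = 4


# Hypothetical thresholds (°F); assumptions for illustration only.
APPROACH_F = 39.0  # "approaching the upper bound"
UPPER_F = 41.0     # upper bound
SERIOUS_F = 45.0   # higher "serious" threshold
DAMAGE_F = 50.0    # product-damage threshold


def classify(temps_f, minutes_above_upper):
    """Return the alert tier for one cooler.

    temps_f: current readings, primary (product-zone) sensor first.
    minutes_above_upper: how long the primary reading has been above UPPER_F.
    """
    primary = temps_f[0]
    # Every sensor high at once suggests a cooler-wide failure.
    all_high = len(temps_f) > 1 and all(t > UPPER_F for t in temps_f)
    if primary > DAMAGE_F or all_high:
        return Tier.EMERGENCY
    if primary > SERIOUS_F or (primary > UPPER_F and minutes_above_upper >= 15):
        return Tier.CRITICAL
    if primary > UPPER_F and minutes_above_upper >= 5:
        return Tier.WARNING
    if primary >= APPROACH_F:
        return Tier.INFORMATIONAL
    return Tier.OK
```

A reading that has been above the upper bound for less than 5 minutes falls through to the informational tier, which keeps brief door openings in the log without notifying anyone.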

Door-open detection.

A cooler door left open is one of the most common causes of temperature excursion. An automation specific to door-open events handles this more efficiently than temperature alerts alone.

Typical logic: if the cooler door (tracked by a door sensor) is open for more than 3 minutes, send an alert. If open for more than 10 minutes, escalate. This catches the "someone propped the door open while loading" scenario before it causes a temperature excursion.
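As a rough illustration of that door logic, the sketch below maps how long the door has been open to an alert level. The function name and the return values are made up for the example; in a real deployment the same idea would more likely be expressed as an automation with a duration condition on the door sensor.

```python
from datetime import datetime, timedelta

ALERT_AFTER = timedelta(minutes=3)
ESCALATE_AFTER = timedelta(minutes=10)


def door_alert_level(opened_at, now):
    """Return None, "alert", or "escalate" for a door open since opened_at.

    opened_at is None while the door is closed.
    """
    if opened_at is None:
        return None
    open_for = now - opened_at
    if open_for > ESCALATE_AFTER:
        return "escalate"
    if open_for > ALERT_AFTER:
        return "alert"
    return None
```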

Multi-sensor agreement for cooler-wide alerts.

A single sensor reading high might be a sensor problem. Multiple sensors in the same cooler all reading high is a cooler problem. Alert logic that requires agreement across sensors (or that distinguishes single-sensor failure from cooler-wide failure) reduces false alerts while catching real problems.

Pattern: alert on "cooler-wide temperature issue" when 2 out of 3 sensors are above threshold. Alert on "sensor issue" when 1 sensor disagrees significantly with the others. Different responses to each.
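One way to express the 2-of-3 pattern is shown below. It is a sketch under assumptions: the sensor names, the 41°F threshold, and the 10°F disagreement margin are illustrative, and the margin has to be wider than the normal door-to-evaporator spread or healthy sensors will be flagged.

```python
from statistics import median


def diagnose(readings_f, threshold_f=41.0, disagree_f=10.0):
    """Separate cooler-wide problems from single-sensor problems.

    readings_f: dict mapping a sensor name to its current reading in °F.
    Returns (cooler_wide, suspect_sensors).
    """
    high = [name for name, temp in readings_f.items() if temp > threshold_f]
    cooler_wide = len(high) >= 2
    if cooler_wide:
        # Agreement across sensors: treat it as a cooler problem.
        return True, []
    mid = median(readings_f.values())
    # A lone sensor far from the median is more likely a sensor fault.
    suspects = [name for name, temp in readings_f.items()
                if abs(temp - mid) > disagree_f]
    return False, suspects
```

For example, `diagnose({"product": 36, "door": 38, "evaporator": 60})` reports no cooler-wide issue and names the evaporator sensor as the suspect, which would route to a "check the sensor" response rather than a "save the product" one.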

Rate-of-change alerts.

A cooler rising 5°F in 10 minutes is cause for concern even if the current temperature is still within range. The trend indicates a developing problem (compressor failing, door left open, refrigerant leak). Rate-of-change alerts catch developing problems earlier than absolute-threshold alerts alone.
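A minimal sketch of a rate-of-change check follows, assuming timestamped readings are available as a list sorted oldest first. The helper names are hypothetical, and the 5°F per 10 minutes limit is taken from the example above rather than from any standard.

```python
from datetime import datetime, timedelta


def rise_over_window(samples, window=timedelta(minutes=10)):
    """Return the temperature change across the most recent window.

    samples: list of (datetime, temp_f), sorted oldest first.
    Returns None when fewer than two samples fall inside the window.
    """
    if not samples:
        return None
    latest_time, latest_temp = samples[-1]
    in_window = [(t, v) for t, v in samples if latest_time - t <= window]
    if len(in_window) < 2:
        return None
    return latest_temp - in_window[0][1]


def rising_too_fast(samples, limit_f=5.0, window=timedelta(minutes=10)):
    """True when the rise across the window meets or exceeds limit_f."""
    rise = rise_over_window(samples, window)
    return rise is not None and rise >= limit_f
```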

Power-based alerts.

If the refrigeration unit is monitored for power draw, a sudden drop in power indicates the compressor stopped (possibly failed, possibly tripped a breaker). The alert fires immediately on the power change, not after the cooler has warmed enough to trigger a temperature alert. Earlier detection means more time to respond.
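Compressors also stop normally when the cooler reaches its setpoint, so a bare power-drop check would fire on every cycle. The sketch below adds one assumption that is not in the text above: a stop is treated as suspicious only when the cooler is warm enough that the compressor should still be running. The wattage levels and the cut-in temperature are placeholders, and a defrost cycle would also need to be excluded in practice.

```python
def suspicious_stop(prev_w, curr_w, temp_f,
                    cut_in_f=38.0, running_min_w=300.0, idle_max_w=50.0):
    """True when power fell from a running level to near zero while warm.

    prev_w, curr_w: the last two power readings in watts.
    temp_f: current product-zone temperature.
    cut_in_f: assumed temperature at which the compressor should be running.
    """
    dropped = prev_w >= running_min_w and curr_w <= idle_max_w
    return dropped and temp_f >= cut_in_f
```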

Alert routing.

Who gets alerts varies by time and severity. Business hours: primary responsible person, escalation to backup after 10 minutes of no response. Off-hours and weekends: primary person immediately, escalation to on-call backup after shorter time. Notification channels also matter — a phone call or SMS for critical alerts (people tune out app notifications); push notification for warnings; email/log for informational.
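The routing rules can be captured in a small lookup, sketched here. The business-hours definition (weekdays, 08:00 to 17:00) and the 5-minute off-hours escalation delay are assumptions for the example; only the 10-minute business-hours delay and the channel-per-severity mapping come from the description above.

```python
from datetime import datetime, timedelta

CHANNELS = {
    "informational": ["email", "log"],
    "warning": ["push"],
    "critical": ["sms", "call"],
}


def is_business_hours(now):
    """Assumed schedule: Monday to Friday, 08:00 to 17:00."""
    return now.weekday() < 5 and 8 <= now.hour < 17


def route(severity, now):
    """Return (channels, escalation_delay) for an alert raised at `now`.

    escalation_delay is None for informational alerts, which never escalate.
    """
    channels = CHANNELS[severity]
    if severity == "informational":
        return channels, None
    delay = timedelta(minutes=10) if is_business_hours(now) else timedelta(minutes=5)
    return channels, delay
```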

Compliance documentation.

Operations subject to food safety regulations need documentation that meets specific requirements. The specifics vary by regulation and jurisdiction; the patterns below are typical.

Sampling frequency.

Regulations typically require temperature readings at specified intervals (15-minute intervals for many food-safety frameworks, continuous for others). Home Assistant's default state history captures every sensor change, which is more frequent than most requirements need. The compliance-relevant challenge is not capturing data but summarizing and retaining it appropriately.
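Summarizing change-driven history into fixed intervals might look like the following sketch, which assumes readings have already been pulled out of Home Assistant as (timestamp, temperature) pairs. The function name is hypothetical, and the interval is assumed to divide evenly into 60 minutes.

```python
from collections import defaultdict
from datetime import datetime


def summarize(samples, interval_minutes=15):
    """Bucket readings into fixed intervals.

    samples: iterable of (datetime, temp_f).
    Returns a list of (bucket_start, min, mean, max), sorted by bucket_start.
    """
    buckets = defaultdict(list)
    for ts, temp in samples:
        # Round the timestamp down to the start of its interval.
        start = ts.replace(minute=ts.minute - ts.minute % interval_minutes,
                           second=0, microsecond=0)
        buckets[start].append(temp)
    return [(start, min(vals), sum(vals) / len(vals), max(vals))
            for start, vals in sorted(buckets.items())]
```

An interval in which the sensor reported nothing produces no row here. A fuller version would carry the last known value forward so that every required interval appears in the record.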

Retention periods.

Regulations commonly require data retention for specified periods (often 2-7 years). Home Assistant's recorder keeps detailed history for a much shorter period by default (10 days, controlled by the recorder's purge_keep_days option) for performance reasons. For compliance, data needs to be either retained in Home Assistant for the full period (which usually requires adjusting the recorder configuration and adding storage) or exported to long-term storage (InfluxDB, external databases, file archives).

Export formats.

Inspectors often want data in specific formats — CSV, PDF reports, specific data dictionaries. An automation that periodically exports relevant sensor data to a compliance-appropriate format (daily, weekly, monthly exports) produces the records without manual work.
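A CSV export can be as simple as the sketch below. The column names and the one-decimal formatting are assumptions; the actual data dictionary should follow whatever the inspector or the applicable regulation specifies.

```python
import csv
import io
from datetime import datetime


def to_csv(rows):
    """Render readings as CSV text.

    rows: iterable of (datetime, sensor_id, temp_f).
    """
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    # Hypothetical header; adjust to the required data dictionary.
    writer.writerow(["timestamp_iso", "sensor_id", "temperature_f"])
    for ts, sensor_id, temp_f in rows:
        writer.writerow([ts.isoformat(), sensor_id, f"{temp_f:.1f}"])
    return buf.getvalue()
```

Returning the text rather than writing a file keeps the formatting step separate from where the export is stored, which makes it easier to send the same records to a file archive and an email report.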

Audit trail.

Beyond the temperature data itself, records of alerts, responses, and system events produce an audit trail showing the operation monitored actively. Home Assistant's logbook captures automation activity; exporting the logbook alongside sensor data provides a fuller picture.

Validation and calibration records.

Regulations typically require documentation of sensor calibration: when the sensor was validated, what reference was used, and what offset (if any) was applied. These records are separate from Home Assistant's data flow but should be maintained alongside the sensor data. A spreadsheet or document tracking per-sensor calibration events is usually sufficient.

Digital signatures and chain of custody.

For regulatory contexts requiring formal signatures on records, Home Assistant's exports don't inherently carry digital signatures. Operations needing this typically export data through a compliance-specific tool that adds the required signature chain.

Cold storage specific patterns.

Walk-in coolers have some specific considerations beyond general monitoring.

Defrost cycles.

Commercial walk-in coolers run periodic defrost cycles — the evaporator heats briefly to melt accumulated frost. During defrost, the cooler warms several degrees before cooling resumes. Home Assistant alerts should not fire on expected defrost cycles.

Typical pattern: configure alert thresholds with enough margin to tolerate defrost-cycle temperature rise, or detect defrost cycles through the refrigeration's own signals if available and suppress alerts during them.

Power events.

A brief power event (lightning, utility switching) can reset the refrigeration controller without itself warming product. A longer power event (multi-hour outage) definitely can warm product to damaging temperatures. Detection and response:

- UPS on the Home Assistant host (so monitoring continues through brief events).
- Power monitoring on the refrigeration circuit (detects when power fails).
- Alert escalation based on power-outage duration.
- Post-event verification automation that checks whether temperatures recovered after power returned.
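The post-event verification step could be sketched as a status check that runs periodically after power returns. The target temperature and the one-hour deadline are assumptions; a realistic deadline depends on the cooler's size and how long the outage lasted.

```python
from datetime import datetime, timedelta


def recovery_status(restored_at, now, current_temp_f,
                    target_f=38.0, deadline=timedelta(hours=1)):
    """Report whether the cooler has recovered since power returned.

    Returns "recovered", "pending", or "failed".
    """
    if current_temp_f <= target_f:
        return "recovered"
    if now - restored_at >= deadline:
        # Still warm after the deadline: cooling may not have resumed.
        return "failed"
    return "pending"
```

A "failed" result is the case described under the failure modes, where the controller comes back in a startup mode and never resumes cooling.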

Loading events.

Loading product into a cooler temporarily warms the cooler — more than the usual door-open event because product is at higher temperature than cooler air. A 30-minute recovery window after loading events is normal; alerts during that window would be false.

Pattern: a "loading mode" input_boolean that the grower activates during loading, with alerts suppressed for a defined period. Less sophisticated but simpler: wider temperature tolerance during known loading windows.
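A sketch of the loading-mode pattern follows. It adds one safeguard as an assumption: readings above a hard limit always notify, so a forgotten loading flag cannot hide a real failure. The thresholds and the function name are illustrative.

```python
from datetime import datetime, timedelta


def should_notify(temp_f, loading_started_at, now,
                  upper_f=41.0, hard_limit_f=50.0,
                  window=timedelta(minutes=30)):
    """Decide whether a high reading should notify during or after loading.

    loading_started_at is None when loading mode has not been activated.
    """
    if temp_f > hard_limit_f:
        return True  # never suppressed, even during loading
    if temp_f <= upper_f:
        return False
    in_window = (loading_started_at is not None
                 and timedelta(0) <= now - loading_started_at < window)
    return not in_window
```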

Product-specific sensors.

For high-value or particularly sensitive products, a probe-style sensor inserted into a representative product gives a better reading than headspace air temperature. The product's internal temperature lags behind air temperature changes, so a probe reading stays within tolerance longer during brief excursions and catches actual damaging conditions more accurately.

Packhouse specific patterns.

Packhouses are less tightly controlled than coolers but need their own monitoring patterns.

Temperature and humidity together.

Packhouses typically target moderate temperature (55-65°F) and controlled humidity (60-70%) for worker comfort and product quality. Both need monitoring; both can drift for different reasons.

Worker-time-of-day.

Packhouse conditions during operating hours may differ from after-hours. During operations, worker doors open repeatedly, equipment generates heat, cooling systems cycle more. Alert thresholds may need to be time-of-day aware — stricter during empty hours, more tolerant during active hours.
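Time-of-day awareness can be as small as choosing the threshold from the clock, as in this sketch. The shift times and the 68°F tolerance during active hours are hypothetical; only the 65°F figure comes from the target range mentioned above.

```python
from datetime import time


def packhouse_upper_bound_f(now_time,
                            shift_start=time(6, 0), shift_end=time(18, 0),
                            active_f=68.0, empty_f=65.0):
    """Return the alert threshold (°F) that applies at now_time.

    More tolerant during the assumed working shift, stricter when empty.
    """
    if shift_start <= now_time < shift_end:
        return active_f
    return empty_f
```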

Air quality.

For packhouses handling produce that generates ethylene (apples, bananas, some other fruits) or that is ethylene-sensitive (leafy greens, some vegetables), air exchange and ventilation matter. CO2 monitoring as a proxy for air exchange, or dedicated ethylene monitoring for operations where it matters, add to the monitoring picture.

Cleanup cycles.

Packhouses undergo daily or weekly cleanup with water and sanitizers. Temperature and humidity spike during cleanup; sensors near cleanup areas may need temporary alert suppression during known cleanup windows.

Common failure modes.

Specific things that go wrong in cold chain monitoring.

The sensor near the evaporator that read cold enough to suppress alerts. The placement near the cold spot meant the sensor saw the coldest temperature in the cooler. When the cooler as a whole warmed, this sensor warmed last. Alerts didn't fire until the problem was severe. Fix: sensor in the product zone, not at the coldest point.

The door sensor that wasn't reliable. A magnetic door sensor that intermittently dropped its connection produced false alerts (the door appeared to open). The grower learned to ignore them. When the door was actually left open one evening, the grower assumed it was another false alert. Cooler warmed overnight. Fix: sensor reliability matters for alert-triggering devices; replace cheap sensors with reliable ones for critical monitoring.

The defrost-cycle alerts that desensitized everyone. Every defrost cycle triggered a temperature alert. Grower turned off notifications. A real failure went unnoticed because real alerts looked like the dozen-per-day defrost false alerts. Fix: alert thresholds and timing that tolerate defrost cycles.

The weekend gap. A cooler drifted warm starting Saturday evening. Alert fired to primary person, who didn't check phone until Sunday morning. Product damaged. Fix: escalation paths that don't depend on one person always being reachable.

The battery-powered sensor that died during a long weekend. Battery low alerts were dismissed as "I'll replace it Monday." Sensor died Saturday. No monitoring through the holiday weekend; a cooler problem during that time would have gone undetected. Fix: proactive battery replacement, or wired sensors for compliance-critical locations.

The compliance export that wasn't exporting. The automated export stopped working weeks earlier (integration credentials expired). When the inspector asked for records, the grower discovered the gap. Fix: monitoring on the monitoring — an automation that verifies the export ran successfully and alerts if it didn't.
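"Monitoring on the monitoring" can be sketched as a freshness check on the last successful export. This assumes the export job records a timestamp each time it succeeds; the 26-hour allowance for a daily export is an illustrative choice that leaves some slack for late runs.

```python
from datetime import datetime, timedelta


def export_overdue(last_success, now, max_age=timedelta(hours=26)):
    """True when the export has never succeeded or is older than max_age.

    last_success: datetime of the last successful export, or None.
    """
    return last_success is None or now - last_success > max_age
```

Running this check on a schedule and alerting on `True` would have surfaced the expired credentials within about a day instead of weeks later.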

The power blink that reset the refrigeration controller. Power returned quickly but the refrigeration controller entered its startup/configuration mode rather than cooling mode. Cooler warmed for several hours before anyone noticed. Fix: power monitoring on the refrigeration circuit, automatic verification that cooling resumed after power events.

The sensor that drifted over years. Sensor had been in service three years; accuracy had degraded but the error was within the alert tolerance. When inspection happened, verification against a reference showed the sensor was reading 2°F low consistently. Historical records may have been off for much of that three-year period. Fix: periodic calibration, documented validation procedures, replacement schedules for sensors in compliance-critical roles.

The dashboard that didn't highlight excursions. Cold chain data was available in the Home Assistant interface but buried in general dashboards. A brief temperature excursion happened; no one saw it because no one was scanning all the graphs. The excursion was technically logged but not acted on. Fix: dedicated cold-chain dashboard with excursion-highlighting, daily summary reports that call out any excursions.

The alert that went to the wrong address. Staff turnover; the primary alert contact left the company but email continued routing to their (unmonitored) address. Alerts fired; nobody saw them. Fix: periodic review of alert routing, alert acknowledgement workflows that catch unacknowledged alerts.

What not to do.

Don't rely on single sensors for critical cold storage. Multiple sensors per cooler catch failure modes that single sensors miss.

Don't place sensors for convenience. Place them where the measurement is most meaningful, not where they're easy to reach.

Don't let alerts become noise. Tune thresholds, timing, and escalation so alerts fire on real problems. When alerts become background noise, real problems get missed.

Don't skip documentation for compliance operations. Regulations require specific documentation; Home Assistant's default retention and export don't automatically meet those requirements. Purposeful compliance configuration is separate work.

Don't assume the refrigeration is working because it was working yesterday. Periodic verification — does the cooler actually get to its setpoint, does it cycle normally, does power draw look typical — catches slow degradation.

Don't rely solely on a single communication channel for alerts. Push notifications miss phones on silent; email gets buried; SMS depends on carrier delivery. Critical alerts should use multiple channels.

Don't skip post-event verification. After an alert, after a power event, after a maintenance action, verify the system is back to normal before considering the event resolved.

Don't forget calibration. Sensors drift. Periodic calibration against a reference maintains accuracy and produces compliance documentation.