07 Fundamentals · Lesson 7 of 8

Understanding Data.

Reading time: ~30 minutes · 6,100 words
FAQ: 34 questions
Status: Draft 1 · under review
Prerequisites: None. Reads from a ninth-grade floor.


01 Why this lesson matters.

Most growers who install monitoring for the first time focus on the sensors. The sensors are tangible — a physical probe, a visible readout, a number on a dashboard. The assumption is that once the sensor is reporting, the system is working. But a sensor reporting is only the first step. What happens after the reading arrives determines whether the sensor produced any real value.

Consider the failure modes. A temperature sensor reports into a cloud service that shows the current value in an app. The grower can see the current temperature, but the data from two weeks ago is gone — the service only keeps a rolling window. When the grower wonders whether the crop stress they see today is related to a pattern that happened ten days ago, the data is not available to answer. The sensor worked; the system did not store what the grower needed.

Or this: sensors report into a system that stores everything, but there is no alerting. The temperature goes out of range at 3 AM; the system dutifully logs every reading; no one knows until the grower checks the dashboard in the morning and sees the crop suffered all night. The data was captured. Nothing happened with it when it mattered.

Or this: alerts fire when values cross thresholds, and the grower gets notified, but the system does not log the actions the grower takes in response. Six months later, the grower wants to review what they did when the heater failed in February. Did they restart it quickly? What finally resolved it? The record is gone because the system had no way to capture the response. The monitoring never turned into learning.

Each of these cases shows the same pattern: data captured without the surrounding system to make it useful. The lesson this page teaches is that raw readings are only the beginning. Everything that makes the data actually valuable — storage, display, alerting, action tracking, analysis, integration — is part of the system, and choosing the right level of capability for an operation matters as much as choosing the right sensors.

02 The data ladder.

Agricultural data systems fall into rough levels of capability. Each level builds on the one below. A grower can start at any level, but moving up the ladder unlocks more value from the same underlying sensor investment.

### Rung 0: The reading exists but is not captured.

The base case. A sensor reports a value that appears on a display or in an app — and when it appears, a human looks at it or does not. The value is not stored. Next week, nobody can tell what the temperature was last Tuesday because nobody wrote it down. This is where consumer-grade monitoring often sits: the sensor works, the app shows current values, but the history is shallow or nonexistent.

When this is appropriate: for a hobbyist or a grower who just wants to know what is happening right now, with no need for history or analysis. Rung 0 has its place, but it is not the right level for a commercial operation.

### Rung 1: Push to a simple destination.

The sensor sends its readings to a simple target that stores them. Examples include pushing to a Google Sheet, writing to a CSV file on a cloud storage service, emailing a daily summary, or sending to a basic cloud service like ThingSpeak. The data accumulates in a form the grower can access — imperfect, limited, but present.

A Google Sheets-based system can be built for almost nothing and requires no infrastructure beyond a Google account. A sensor (or more often, an ESP32 microcontroller with a sensor) connects to WiFi, authenticates to Google, and appends a row every time it reports. The grower opens the spreadsheet and sees history, can make charts, can calculate averages, can export. Lots of operations work fine at this level.

What this level does well: minimum cost, minimum complexity, data that lives in a familiar format the grower already uses, and no dependency on specialized software. What it does not do well: real-time alerting, sophisticated dashboards, automations that act on the data, storage of decades of high-frequency readings. The spreadsheet approach breaks down when the grower needs many sensors, fast response, or the ability to query and analyze large amounts of data.
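The Rung 1 pattern can be sketched in a few lines, assuming a Python script runs wherever the readings arrive. The column names and the `append_reading` helper are illustrative assumptions, not part of any particular product; a Google Sheets version would follow the same append-a-row shape through Google's API.

```python
import csv
from datetime import datetime, timezone
from pathlib import Path

# Illustrative column layout (an assumption, not a standard).
FIELDS = ["timestamp", "sensor", "value"]

def append_reading(path, sensor, value, when=None):
    """Append one reading to a CSV file, writing the header if the file is new."""
    path = Path(path)
    when = when or datetime.now(timezone.utc)
    is_new = not path.exists() or path.stat().st_size == 0
    with path.open("a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow(
            {"timestamp": when.isoformat(), "sensor": sensor, "value": value}
        )
```

The resulting file opens directly in any spreadsheet program, which is the main appeal of this rung.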

When this is appropriate: operations that value simplicity over sophistication, single-person farms, educational settings where the grower wants to work with data in spreadsheets directly, or as a starting point on the journey up the ladder.

### Rung 2: Local edge platform — the Hort Assistant approach.

The sensor reports to a local computer — typically a small device running Home Assistant or a similar edge platform — that captures, stores, displays, and acts on the data all in one place. This is the level where the monitoring system becomes self-contained. The computer runs on the grower's property, stores data locally, displays dashboards, fires alerts, and runs automations. If the internet goes down, the system keeps working. If the vendor disappears, the system keeps working. The grower owns everything.

Home Assistant is the open-source platform the Hort Assistant approach uses. It has been in production for over a decade with millions of deployments. It runs on a small computer (a Raspberry Pi, a mini PC, or similar hardware costing $75 to $200). It supports thousands of sensor types and device integrations. It stores historical data for as long as the grower wants to keep it. It builds dashboards that show current values, history, and trends. It runs automations based on sensor values, schedules, and combinations. It sends alerts through many channels — push notifications, SMS, email, phone calls. The grower does not pay a monthly fee and does not depend on anyone else's infrastructure for the core system.

Setting up a Home Assistant system takes some effort — hours to days depending on scale and the grower's technical comfort. Configurations shared through the collective and other community resources reduce that effort significantly. Once running, the system maintains itself — automatic backups, update management, and community support make ongoing operation manageable.

What this level does well: comprehensive local monitoring and control, complete ownership of data and logic, flexibility to add new sensors or modify automations, no recurring costs, production-grade reliability at commodity prices. What it does not do well: scale across many separate sites without additional coordination software, integrate with business operations (inventory, sales, labor, compliance) without additional tools, or replace dedicated commercial platforms where complete business integration is needed.

When this is appropriate: most small-and-mid-scale agricultural operations. This is the level the OpenAgTech collective primarily serves — where the right combination of commodity hardware, mature open-source software, and shared knowledge delivers production-grade capability at a price that fits the operation.

### Rung 3: Edge plus cloud — local-first with cloud backup and remote access.

The local system continues to do the real work, but a copy of the data flows to the cloud for backup, remote access, and coordination across sites. The grower can look at the greenhouse from their phone while at a conference in another state. The data has a safe copy in case the local storage is destroyed. A multi-site operation can see all sites from one place. An expert the grower trusts can look at the system from their office when diagnosing a problem.

The cloud layer at this level is an assist, not the foundation. If the cloud service stops, the local system keeps working. If the internet is out, the local system keeps working. The cloud is a convenience for remote access, not a dependency. This distinction is critical: it is what separates Rung 3 from commercial cloud-first IoT products, which typically make the cloud service the primary system and the sensors dependent on it.
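One way to picture the local-first rule is a write path that always saves locally and treats the cloud copy as optional. This is a minimal sketch under stated assumptions: `record_reading`, `local_store`, `pending`, and `cloud_upload` are hypothetical names, and a real system would persist the backlog to disk rather than keep it in memory.

```python
def record_reading(reading, local_store, pending, cloud_upload):
    """Store a reading locally first, then try to sync any backlog to the cloud.

    Returns the number of readings still waiting for upload.
    """
    local_store.append(reading)   # the local copy is the source of truth
    pending.append(reading)       # queue a copy for the optional cloud layer
    try:
        while pending:
            cloud_upload(pending[0])
            pending.pop(0)        # remove only after a successful upload
    except OSError:
        # Network or cloud failure (ConnectionError and TimeoutError are
        # OSError subclasses): keep the backlog and retry on the next reading.
        pass
    return len(pending)
```

Because the local append happens before any network call, an outage delays the backup but never loses a reading.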

Implementations at this level vary. Home Assistant offers cloud services through Nabu Casa (a paid service run by the Home Assistant founders) that provide secure remote access without exposing the local system to the internet directly. Alternative approaches include self-hosted VPN access, cloud databases receiving periodic data pushes, or custom backends that receive MQTT streams from local brokers. Each has tradeoffs in cost, privacy, and convenience.

What this level does well: remote access without surrendering local control, backup resilience, multi-site coordination, collaboration with advisors and team members in different locations. What it does not do well: full business integration (the cloud layer is usually still data-focused rather than operational), or handling operations complex enough to require dedicated business software.

When this is appropriate: operations with multiple sites, operations where the grower travels or works remotely, operations where several people need access to the monitoring data, and operations where the cost of data loss would be high enough that backup justifies the modest recurring cost.

### Rung 4: Grow management platforms and operational integration.

Specialized software that combines monitoring data with the operational side of the business — planting schedules, harvest tracking, inventory, compliance records, labor hours, regulatory reporting. These platforms are designed for specific industries (cannabis cultivation, commercial produce, nurseries, CEA operations) and integrate sensor data as one input among many. The sensor readings contextualize tasks, decisions, and records rather than standing alone.

Examples of this level vary by industry. Cannabis operations often use METRC (state-mandated in many jurisdictions) paired with platforms like Trym, Growlink, or custom deployments. Produce operations use systems like Croptracker, Agworld, or Priva Connected. Hydroponic and indoor farming operations may use specialized ERP modules or vendor-specific platforms. Each platform makes tradeoffs between flexibility, industry-specific features, cost, and degree of cloud dependency.

What this level does well: business integration, compliance, team coordination, and connecting sensor data to operational outcomes. A pest outbreak detected through sensor anomalies becomes a scouting task assigned to a worker, consuming inventory when treatment is applied, recording labor hours, and updating the crop's compliance record — all in one system. What it does not do well: modest cost, local control, or flexibility for operations that do not fit the platform's assumptions. These platforms are typically subscription-based and often cloud-dependent, with all the tradeoffs that implies.

When this is appropriate: operations with significant regulatory compliance needs (especially cannabis), operations with many staff members who need coordinated access, operations where the sensor data is one small piece of a larger operational picture, and operations that can absorb the recurring costs of commercial platforms.

### Rung 5: Fully integrated enterprise systems.

The most sophisticated level. Sensor data, operational data, financial data, customer relationships, supply chain, compliance, and labor all live in one system with the sensor readings as context across everything. A freeze event triggers automatic recalculation of harvest projections, updates the production forecast sent to customers, adjusts the labor schedule for recovery work, and documents the event for insurance and crop records. The sensor data is not a separate system with its own dashboard — it is context woven through every business decision.

Systems at this level are typically custom deployments built on enterprise platforms like ERPNext (open-source), NetSuite, or SAP, configured specifically for agriculture. They require significant setup investment and ongoing management, and they justify themselves only at scale — operations large enough that the coordination benefits exceed the overhead costs. The collective approach can meet enterprise systems partway through the Rung 3 cloud layer, but full Rung 5 integration is typically where larger operations and commercial deployments live.

What this level does well: everything at once, integrated. What it does not do well: fit anything smaller than a fairly substantial operation, cost less than a significant investment, or offer the flexibility of simpler approaches. Most operations do not need Rung 5; some do, and those that need it value it highly.

When this is appropriate: multi-site commercial operations, operations where every hour of work and every compliance record needs to be integrated, and operations where the management team has the capability and interest to run a sophisticated enterprise system.

03 Closing the loop: from data to action.

The data ladder describes capability. The loop is what makes the capability useful. A complete system closes a loop from measurement through action back to learning.

The loop has six steps. The simplest data systems stop after step two or three; the most sophisticated close every one. A grower evaluating a system can ask, for each step, 'is this closed?' If the answer is no at any step, the system has a weakness that limits what the sensor investment can produce.

Step 1: Sense.

A sensor measures a value. This is the step almost everyone thinks about. Every system supports sensing to some degree.

Step 2: Store.

The reading is saved somewhere the system can retrieve it later. Rung 0 systems do not store; systems at Rung 1 and above do. The questions at this step: how long is the data kept, in what format, and who can access it?

Step 3: Display.

A human can look at the current value and some amount of history in a form they can understand. Dashboards, apps, charts, daily emails. The question at this step: can the person who needs to see the data actually see it when they need to, from where they need to see it?

Step 4: Decide.

A human or the system itself determines whether action is needed. A grower looks at the dashboard and decides to adjust irrigation. An automation evaluates a rule and concludes the fan should run. An alert determines that the temperature has been out of range long enough to warrant notification. Every decision needs something — human judgment, or a rule, or both — that gets from the display to a conclusion.

Step 5: Act.

Something happens in the physical world. The fan turns on, the grower drives out to check the pump, the alert wakes the grower who then makes a phone call. The action is the payoff of everything before it. Without action, the data and decisions produced nothing.

Step 6: Record and review.

The action is logged — what happened, when, what changed as a result. Over time, this history becomes the basis for learning. The grower can see that the irrigation adjustment three weeks ago produced the result they wanted, or that the automation rule they added last month has been firing too often and needs tuning. Without the record-and-review step, the operation cannot learn from its own history.

Most monitoring failures happen at steps 2, 4, or 6. A system with weak storage loses history. A system with weak decision logic produces noise instead of insight. A system that does not record actions cannot teach the grower anything about what works and what does not. Evaluating a system step by step reveals where the gaps are.
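The Decide step mentions an alert that fires only when a value has been out of range long enough to warrant notification. A minimal sketch of that rule follows; the function name, the range limits, and the duration are assumptions chosen for illustration, not recommended setpoints.

```python
from datetime import datetime, timedelta

def sustained_breach(readings, low, high, min_duration):
    """Return True if the latest readings have all been out of range
    for at least min_duration.

    readings: list of (datetime, value) tuples sorted oldest to newest.
    """
    breach_start = None
    # Walk backward from the newest reading until an in-range value appears.
    for when, value in reversed(readings):
        if low <= value <= high:
            break
        breach_start = when
    if breach_start is None:
        return False  # no readings, or the newest reading is in range
    return readings[-1][0] - breach_start >= min_duration
```

Requiring a duration rather than a single bad reading is one common way to reduce the alert noise that weak decision logic produces.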

04 Storage: keeping the data.

Different storage technologies fit different needs. The grower does not need to become a database engineer, but understanding the basic options helps match storage to the application.

Flat files (CSV, JSON).

The simplest storage. A text file that accumulates rows of data. Easy to read, easy to open in any spreadsheet program, easy to move around. Does not scale well — a file with millions of rows becomes slow to work with — but perfect for small amounts of data. Most Rung 1 systems use flat files.

Spreadsheets (Google Sheets, Excel).

Flat files with calculation and display tools built in. Good for small amounts of data that needs to be analyzed directly in the spreadsheet. Google Sheets adds collaborative access and web availability. Excel files can be opened on any computer but are harder to collaborate on. Both have size limits (Google Sheets: 10 million cells; Excel: about 17 billion cells but performance drops earlier) that matter for high-frequency sensor data.
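To see when the cell limit starts to matter, the arithmetic can be worked through directly. This sketch assumes one row per reporting interval, with one timestamp column plus one column per sensor; the helper name and the layout are illustrative.

```python
GOOGLE_SHEETS_CELL_LIMIT = 10_000_000  # the figure quoted in the text above

def days_until_full(sensors, readings_per_hour, cell_limit=GOOGLE_SHEETS_CELL_LIMIT):
    """Estimate how many days of data fit before the cell limit is reached."""
    cells_per_row = sensors + 1            # timestamp column + one per sensor
    rows_per_day = readings_per_hour * 24
    return cell_limit / (cells_per_row * rows_per_day)

# Ten sensors logged once a minute fill the sheet in roughly 631 days.
print(round(days_until_full(10, 60)))
# The same sensors logged every 15 minutes last roughly 9,470 days.
print(round(days_until_full(10, 4)))
```

Under these assumptions the reporting interval matters far more than the sensor count, which is why a spreadsheet can suit slow-changing measurements and struggle with high-frequency ones.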

Time-series databases.

Databases designed specifically for sensor data: readings arriving at regular intervals and accumulating over time. InfluxDB is the most common in the open-source world. Home Assistant stores its history in a relational database by default (SQLite) and can forward readings to InfluxDB through an integration. Time-series databases handle millions of readings efficiently, support fast queries like 'what was the temperature at 3:47 AM on March 14,' and are the right choice for serious monitoring at any scale. Setup requires more technical knowledge than spreadsheets; the payoff is efficient handling of large volumes.
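The kind of point-in-time query described above can be sketched with Python's built-in `sqlite3` module as a stand-in. SQLite is a relational database, not a dedicated time-series engine, and the table layout and function names here are assumptions; InfluxDB and similar tools have their own query languages that are not shown.

```python
import sqlite3

def open_store(path=":memory:"):
    """Open (or create) a small readings store with an index on sensor and time."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS readings "
        "(ts TEXT NOT NULL, sensor TEXT NOT NULL, value REAL NOT NULL)"
    )
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_sensor_ts ON readings (sensor, ts)"
    )
    return conn

def add_reading(conn, ts, sensor, value):
    """Insert one reading; ts is an ISO 8601 string such as 2024-03-14T03:45:00."""
    conn.execute("INSERT INTO readings VALUES (?, ?, ?)", (ts, sensor, value))
    conn.commit()

def value_at(conn, sensor, ts):
    """Return (ts, value) for the latest reading at or before ts, or None."""
    return conn.execute(
        "SELECT ts, value FROM readings "
        "WHERE sensor = ? AND ts <= ? ORDER BY ts DESC LIMIT 1",
        (sensor, ts),
    ).fetchone()
```

ISO 8601 timestamps in a single consistent format sort correctly as text, which is what lets the index answer "what was the value at 3:47 AM" without scanning every row.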

Relational databases (MySQL, MariaDB, PostgreSQL).

General-purpose databases that can store sensor data alongside other operational data. Relational databases are good when the sensor data needs to be combined with tables of other information — crop records, task lists, inventory, compliance logs. They handle volume well though not quite as efficiently as dedicated time-series databases for pure sensor workloads. Most Rung 4 and Rung 5 systems use relational databases at their core.

Cloud databases and object storage.

Commercial cloud services (AWS DynamoDB, Google BigQuery, Azure Cosmos DB) that handle arbitrary amounts of data with pay-as-you-go pricing. Excellent for large-scale systems; overkill for small operations. The relevant concern is that cloud storage creates ongoing costs and vendor dependency. For most agricultural operations, local storage is appropriate; cloud storage is a Rung 3+ consideration.

Retention and archiving.

How long to keep the data matters. Some operations only need recent data: the last 30 days of sensor readings, with older data discarded. Others need everything forever: decades of readings that show seasonal patterns and long-term trends. A tiered retention policy sits between the two: high-resolution recent data (every reading for the last week), medium-resolution history (hourly averages for the last year), and low-resolution archives (daily averages for older data). This approach keeps storage manageable while preserving useful historical context. Plan retention before the database fills up, not after.
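The medium-resolution tier (hourly averages) is a simple downsampling step. A minimal sketch, assuming readings are held as `(datetime, value)` pairs; the function name is illustrative, and a production system would usually let the database do this work.

```python
from collections import defaultdict
from datetime import datetime

def hourly_averages(readings):
    """Collapse raw readings into one average per clock hour.

    readings: iterable of (datetime, value) pairs.
    Returns a dict mapping the start of each hour to the mean value.
    """
    buckets = defaultdict(list)
    for when, value in readings:
        hour_start = when.replace(minute=0, second=0, microsecond=0)
        buckets[hour_start].append(value)
    return {hour: sum(vals) / len(vals) for hour, vals in sorted(buckets.items())}
```

The same bucketing idea, applied per day instead of per hour, produces the low-resolution archive tier.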

Backup and disaster recovery.

A local database that is not backed up is one hardware failure away from zero data. Basic backup: a copy of the database on a second device or off-site location, updated regularly. Good backup: automated daily backups with multiple generations kept. Great backup: off-site copies that survive a complete loss of the primary site. The right level depends on the value of the data — most small operations do well with automated daily backups to a cloud service or external drive. Critical operations with compliance implications need more rigorous approaches.
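The "multiple generations" idea can be sketched as a copy-then-prune routine. This assumes the database is a single file that is not being written during the copy (a live database usually needs its own backup tool), and the naming scheme and `keep` default are illustrative.

```python
import shutil
from datetime import datetime
from pathlib import Path

def backup_with_rotation(source, backup_dir, keep=7, now=None):
    """Copy source into backup_dir with a timestamped name, keeping the newest `keep` copies."""
    if keep < 1:
        raise ValueError("keep must be at least 1")
    source = Path(source)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    stamp = (now or datetime.now()).strftime("%Y%m%d-%H%M%S")
    target = backup_dir / f"{source.stem}-{stamp}{source.suffix}"
    shutil.copy2(source, target)
    # Timestamped names sort chronologically, so the oldest come first.
    backups = sorted(backup_dir.glob(f"{source.stem}-*{source.suffix}"))
    for old in backups[:-keep]:
        old.unlink()
    return target
```

Pointing `backup_dir` at an external drive or a synced folder is what turns this from a basic backup into one that survives loss of the primary device.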

05 Visualization: seeing the data.

A grower who cannot easily see their data cannot make decisions from it. Good visualization turns numbers into understanding. Poor visualization produces dashboards that look impressive and do not actually inform.

The real-time view.

Current values for the key measurements. Temperature right now. Humidity right now. Soil moisture right now. Simple, clean, accurate. The first thing the grower wants to see when they open the dashboard — what is happening this moment. Overcomplicated real-time views (dozens of values crammed on one screen, rapidly-updating animations, elaborate gauges) often obscure more than they reveal. A clean summary of the most important current values beats a cockpit of every possible metric.

The trending view.

A chart that shows how values have changed over recent history — the last hour, last day, last week. Trends reveal what current values do not. A temperature of 75°F is fine on its own; a temperature that has been rising steadily for three hours suggests the fan is not keeping up. Trends catch slow-developing problems before they become emergencies.
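A trend can also be reduced to a single number: the slope of a best-fit line through recent readings. The sketch below uses an ordinary least-squares fit; the function name and the idea of expressing time in hours are assumptions for illustration.

```python
def slope_per_hour(readings):
    """Least-squares slope of value over time.

    readings: list of (hours_elapsed, value) pairs.
    Returns the change in value per hour.
    """
    n = len(readings)
    if n < 2:
        raise ValueError("need at least two readings")
    mean_t = sum(t for t, _ in readings) / n
    mean_v = sum(v for _, v in readings) / n
    num = sum((t - mean_t) * (v - mean_v) for t, v in readings)
    den = sum((t - mean_t) ** 2 for t, _ in readings)
    if den == 0:
        raise ValueError("all readings share the same timestamp")
    return num / den

# A greenhouse climbing from 70 to 74.5 degrees F over three hours:
print(slope_per_hour([(0, 70.0), (1, 71.5), (2, 73.0), (3, 74.5)]))  # 1.5
```

A steady positive slope over several hours is the "fan is not keeping up" signal, even while every individual reading still looks acceptable.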

The comparative view.

Multiple related values shown together. Temperature and humidity side by side (revealing vapor pressure deficit, or VPD). Soil moisture across multiple zones (revealing irrigation coverage). Light and temperature (revealing heating from lighting). The comparative view is often where real insight lives, because agricultural problems are rarely caused by one variable alone.
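Vapor pressure deficit is a good example of a value that only exists when temperature and humidity are combined. The sketch below uses the Tetens approximation for saturation vapor pressure and computes air VPD, assuming leaf temperature equals air temperature; that simplification, and the function names, are assumptions rather than anything the lesson prescribes.

```python
from math import exp

def fahrenheit_to_celsius(temp_f):
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0

def saturation_vapor_pressure_kpa(temp_c):
    """Tetens approximation of saturation vapor pressure over water, in kPa."""
    return 0.6108 * exp(17.27 * temp_c / (temp_c + 237.3))

def vpd_kpa(temp_c, relative_humidity_pct):
    """Air vapor pressure deficit in kPa from temperature and relative humidity."""
    if not 0.0 <= relative_humidity_pct <= 100.0:
        raise ValueError("relative humidity must be between 0 and 100")
    return saturation_vapor_pressure_kpa(temp_c) * (1.0 - relative_humidity_pct / 100.0)

# 77 degrees F at 60% relative humidity works out to about 1.27 kPa.
print(round(vpd_kpa(fahrenheit_to_celsius(77.0), 60.0), 2))
```

Neither input alone shows this number, which is the point of placing related measurements side by side.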

The status view.

A summary of system health. Which sensors are reporting. Which devices are online. What alerts are active. When was the last backup. This view answers 'is the monitoring system itself working' — a question that seems obvious but is critical, because the dashboards showing sensor data often do not show whether the sensors are actually reporting.
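The "which sensors are reporting" check amounts to comparing each sensor's last report time against a maximum allowed age. A minimal sketch, assuming the system tracks a last-seen timestamp per sensor; the names and the 15-minute figure in the usage note are illustrative.

```python
from datetime import datetime, timedelta

def stale_sensors(last_seen, now, max_age):
    """Return the names of sensors whose last report is older than max_age.

    last_seen: dict mapping sensor name to the datetime of its last reading.
    """
    return sorted(name for name, seen in last_seen.items() if now - seen > max_age)
```

Calling this with `max_age=timedelta(minutes=15)` on a schedule, and alerting when the list is not empty, covers the case where a dashboard keeps showing an old value from a sensor that has quietly stopped reporting.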

The investigative view.

A flexible interface for exploring specific questions. Zoom in on the temperature last Tuesday afternoon. Compare soil moisture between two zones over the past month. Look at what was happening when the automation fired at 3 AM. Investigative tools are what turn a dashboard from a display into a tool for learning.

Common mistakes.

Dashboards that show too much. A screen with forty values makes it harder to see the five that matter. Dashboards that show impressive-looking gauges and arrows without informing. Dashboards optimized for demos rather than daily use. Dashboards that do not survive on a mobile phone, where many growers check them most often. Design dashboards for the context in which they will actually be used — in the field, on a phone, checking quickly.

Home Assistant and similar platforms.

Modern automation platforms include dashboard builders that scale from simple to sophisticated. Home Assistant's Lovelace dashboards are configurable through YAML or through a graphical editor; the resulting dashboards run well on phones, tablets, and computers. Grafana is an alternative or complement for more data-analysis-focused visualization. Node-RED includes dashboard capabilities. The common thread: the grower configures the dashboards for their own operation rather than accepting what a vendor decided the dashboard should look like.

06 Analysis: learning from the data.

Beyond real-time monitoring and historical review, the data stored over time can reveal patterns that direct observation cannot. Analysis is the step where accumulated sensor data becomes operational insight.

Trend analysis.

How values have changed over months or seasons. Soil moisture tracking reveals seasonal watering efficiency. CO2 levels over a growing season reveal whether enrichment is sustained consistently. Temperature trends reveal whether a heating system is aging. Trends that a grower does not have time to notice daily become obvious when plotted across weeks or months.

Correlation analysis.

Looking at how different variables move together. A pest outbreak that happened last August: what was different about the temperature and humidity in the week leading up to it? A yield difference between two years: was it correlated with differences in daily light integral (DLI)? Correlation does not prove causation, but it suggests where to look for explanations.
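The usual starting point for measuring how two variables move together is the Pearson correlation coefficient, which ranges from -1 (perfectly opposite) to +1 (perfectly together). The sketch below computes it from two equal-length lists; the function name is illustrative, and the series are assumed to be aligned sample-for-sample.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length, at least 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)
```

A value near zero does not rule out a relationship (it only measures straight-line association), and a value near 1 still says nothing about which variable drives the other.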

Anomaly detection.

Automated flagging of readings that are unusual compared to historical patterns. A temperature that is normal for summer but strange for March. A water flow rate that is within the normal range but has been steadily increasing for two weeks, suggesting a developing leak. Anomaly detection catches subtle problems that threshold-based alerting misses.
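One simple form of anomaly detection compares a reading against the history for the same season, using a z-score (how many standard deviations the value sits from the historical mean). This is a sketch of that one technique, not the only approach; the threshold of 3 and the sample temperatures are assumptions.

```python
from statistics import mean, stdev

def is_anomalous(value, history, z_threshold=3.0):
    """Flag a value that sits more than z_threshold standard deviations from the mean of history."""
    if len(history) < 2:
        raise ValueError("need at least two historical readings")
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return value != mu  # history never varied; any change is unusual
    return abs(value - mu) / sigma > z_threshold

# Illustrative afternoon temperatures (degrees F) for two seasons.
march_history = [58, 60, 61, 59, 62, 60]
summer_history = [78, 80, 82, 79, 81, 80]
print(is_anomalous(80, march_history))   # True: strange for March
print(is_anomalous(80, summer_history))  # False: normal for summer
```

A fixed threshold alert would treat 80 degrees the same in both seasons; comparing against the matching history is what separates the two cases.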

Baseline establishment.

Understanding what normal looks like for a specific operation. The first year of monitoring often produces more value through establishing baselines than through active intervention — the grower learns what typical ranges are for their specific greenhouse, crop, and climate. Baselines make every subsequent year's anomalies meaningful.

Comparison across crops, seasons, or locations.

Did the tomatoes grow better in the east greenhouse or the west greenhouse? How did last winter's energy use compare to this winter's? Did the new irrigation strategy actually reduce water use? These comparisons are only possible with historical data, and they are often the most valuable output of monitoring — they quantify the effect of changes the grower made.

Where specialized analysis lives.

Basic analysis (trends, comparisons, totals) can be done in Home Assistant, a spreadsheet, or any simple dashboard tool. More sophisticated analysis (correlation, anomaly detection, machine learning) typically requires specialized tools — Grafana for visualization-focused analysis, Python or R for custom statistical work, dedicated data-analysis platforms for large-scale work. For most small-to-mid-scale operations, basic analysis in the monitoring platform is sufficient. Specialized analysis becomes valuable when the operation is large enough or the data is rich enough to produce actionable patterns.

07 Action tracking: closing the loop with what happened.

The sensor data records what the environment did. Action tracking records what the grower did about it. Without action tracking, the feedback loop from measurement to outcome cannot close.

What to track.

Actions worth recording include: fertilizer and nutrient applications (type, amount, timing, target), pest control interventions (product, application rate, target, results), equipment changes (new heater installed, fan repaired, valve replaced), environmental setpoint changes (temperature target raised from 72 to 74), automation rule changes, staff changes (who worked, what they did), and observed plant responses (flowering started, stress observed, disease spotted). Each action is a change the grower introduced; later review shows what the change did.

Where to track it.

For small operations, a simple notebook or a shared text document often suffices. For larger operations, structured task management systems or purpose-built agriculture software keeps action history in queryable form. Home Assistant has calendar and logbook integrations that track user actions and automation history. Commercial farm management software typically makes action tracking central. The key is that actions are recorded somewhere searchable, not scattered across memory and sticky notes.
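For a grower who wants something more searchable than a notebook but lighter than dedicated software, a structured log file is one option. The sketch below appends one JSON record per line; the field names, the `ActionRecord` type, and the file format are assumptions for illustration and are not how Home Assistant's logbook stores its data.

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class ActionRecord:
    timestamp: str          # ISO 8601, e.g. "2024-02-10T03:25:00"
    category: str           # e.g. "equipment", "nutrients", "setpoint"
    description: str
    zone: str = ""
    performed_by: str = ""

def log_action(path, record):
    """Append one action to a JSON Lines file (one JSON object per line)."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(asdict(record)) + "\n")

def search_actions(path, keyword):
    """Return logged actions whose category or description contains keyword."""
    keyword = keyword.lower()
    with open(path, encoding="utf-8") as f:
        records = [json.loads(line) for line in f if line.strip()]
    return [
        r for r in records
        if keyword in r["description"].lower() or keyword in r["category"].lower()
    ]
```

Six months later, `search_actions(path, "heater")` answers the "what did we do in February" question from the opening of this lesson.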

Connecting actions to outcomes.

The real value comes from asking 'what happened after this action?' Did the pest treatment work? Did the nutrient adjustment produce the expected response? Did the new automation save energy? Answering these questions requires the action log connected to the sensor history — actions on one timeline, measurements on another, reviewed together. This does not require sophisticated software; it requires the discipline to look. A monthly review where the grower walks through the action log and the sensor data together is where learning happens.
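Putting actions and measurements on the same timeline can be as simple as comparing the average reading before and after an action. A minimal sketch, with an illustrative function name and window; it shows what changed, not why, so other events in the same window still need to be ruled out by the grower.

```python
from datetime import datetime, timedelta
from statistics import mean

def before_after(readings, action_time, window):
    """Compare the mean reading in the window before an action with the window after.

    readings: list of (datetime, value) pairs.
    Returns (mean_before, mean_after), or None if either window is empty.
    """
    before = [v for t, v in readings if action_time - window <= t < action_time]
    after = [v for t, v in readings if action_time < t <= action_time + window]
    if not before or not after:
        return None
    return mean(before), mean(after)
```

Run against soil moisture around an irrigation change, or energy use around a new automation, this gives the monthly review a concrete number to discuss.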

Compliance and regulatory records.

For certain crops (cannabis especially) and certain markets (organic certified, GAP certified), action tracking is not optional. Spray records, input logs, harvest records, and chain-of-custody documentation are required for compliance. Systems that track actions for operational learning can often feed the required compliance reports with minimal extra work.

08 Access, sharing, and collaboration.

Data that sits on a hard drive accessible to one person often does not reach its full value. Growers, workers, advisors, researchers, and regulators may all need access to some of the data in different ways. Thinking through access and sharing from the beginning prevents the common pattern of data being trapped on a specific computer that only one person can use.

Role-based access.

Different people need different views. The owner wants full access to everything. A grow manager wants access to the areas they manage. A worker might only need the dashboard for their zone. A visiting consultant might need read-only access to specific data. Modern monitoring platforms support role-based access that matches each user's access to their legitimate need.
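The four roles described above can be expressed as a small rule table. This is a hypothetical sketch of the idea only: the role names, the rule fields, and the `can_access` function are assumptions, and real platforms implement permissions in their own ways.

```python
# Hypothetical rule table mirroring the roles described in the text.
ROLE_RULES = {
    "owner":      {"all_zones": True,  "can_write": True},
    "manager":    {"all_zones": False, "can_write": True},
    "worker":     {"all_zones": False, "can_write": False},
    "consultant": {"all_zones": False, "can_write": False},
}

def can_access(user, zone, write=False):
    """Decide whether a user may read (or, with write=True, change) a zone.

    user: dict with a "role" key and, for non-owners, a "zones" collection.
    """
    rules = ROLE_RULES.get(user["role"])
    if rules is None:
        return False          # unknown roles get nothing by default
    if write and not rules["can_write"]:
        return False
    return rules["all_zones"] or zone in user.get("zones", ())
```

Defaulting unknown roles to no access keeps a misconfigured account from seeing more than intended.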

Remote access.

The grower on the road, the consultant in another state, the research partner at the university — all need to access the data remotely. Rung 3 of the ladder is specifically about this — adding remote access without surrendering local control. Secure remote access (through VPN, through platforms like Nabu Casa, or through purpose-built cloud bridges) is a first-class feature of any serious monitoring system.

Sharing with advisors and consultants.

Independent consultants, extension agents, and specialized advisors often benefit from looking directly at a grower's data. The question is how to share access without giving away control. Read-only user accounts, time-limited access, shared dashboard views — different platforms handle this differently. The goal is that advisors can see what they need to see, when they need to see it, without the grower losing any control over the underlying system.

Contributing to the collective.

Anonymized or aggregated data from many operations can produce insights no single operation could reach. How do soil moisture patterns differ across climate zones? What VPD ranges produce the best yields for a specific cultivar? Voluntary contribution of data to research and extension work is how agriculture has historically advanced. The collective approach to monitoring supports this — if a grower wants to contribute data, the infrastructure makes it easy; if they do not, the data stays entirely private.

Data ownership in the collective approach.

A core principle: the grower owns their data. Nothing about the collective approach changes that. Data lives on the grower's equipment, in formats the grower can access and export, under the grower's control. Sharing is explicit and consensual. Access by others is granted deliberately, not assumed. The collective approach is about sharing knowledge and configuration patterns, not surrendering data.

09 The discipline of reviewing data.

A monitoring system that is not reviewed produces data nobody looks at. The system works; the grower learns nothing. This is the most common failure mode among operations that invest in monitoring — they build the system, it runs correctly, and six months later the grower cannot remember the last time they looked at anything beyond the current dashboard.

Regular scheduled review.

Set aside a specific time — weekly, monthly, or at crop transitions — to review what happened. Walk through the historical charts. Look at the alerts that fired. Review the action log. Check whether the automations are doing what they were set up to do. Identify things to change. This practice is not glamorous. It is where the value of monitoring actually lands.

Post-event review.

After any significant event — a close call, an unexpected success, a disease outbreak, a crop loss, a new crop started — review the relevant data. What did the sensors show leading up to it? What did the grower do? What could be done differently next time? Post-event reviews are where the sharpest lessons come from, because the event makes the stakes clear.

Seasonal recap.

At the end of each crop cycle or each season, a comprehensive review. How did this crop perform compared to previous ones? What patterns are emerging across years? What should be different next time? This review connects short-term observations to long-term improvement. It is also where operational changes that took place during the season get documented in their full context.

Comparison with others.

When possible, compare notes with other growers running similar operations. A grower's own data in isolation is informative; the same data in the context of what other growers are seeing is often much more informative. The collective approach facilitates this: comparing configurations, patterns, and outcomes across operations with similar circumstances reveals what good looks like and where there is room to improve.

10The principle that unifies everything: actionable data.

Data has to be actionable, and the system has to support the action. This is the sentence that captures the whole lesson. Every decision about a data system — what to capture, how to store it, how to display it, when to alert, whether to automate — comes back to this test.

Actionable data means the grower has the information they need to make a decision or confirm a state. Temperature readings are actionable because high temperature calls for ventilation and low temperature calls for heat. Soil moisture is actionable because it drives irrigation. Battery voltage is actionable because a declining battery indicates sensor maintenance is coming up. Each of these measurements connects to actions a grower can take.

A system that supports the action means the actions are possible. If the temperature is out of range, the grower can open vents, turn on fans, adjust heating. If the soil is dry, irrigation can fire. If a sensor's battery is low, the grower can replace it. The measurement connects to an intervention. When the measurement does not connect to any possible action, the data is not actionable regardless of how accurate it is.

The opposite failure, measurements without intervention capability, is surprisingly common. An operation installs a fancy plant canopy temperature sensor, the data streams in, and nobody knows what to do with a 78°F canopy temperature. The information is not actionable because no one has established what the setpoints should be for this crop, what interventions are available when the value is out of range, and what the action threshold is. Data without a connected action is information theater: it looks like monitoring but produces nothing.

A practical discipline: for every measurement a system captures, the grower should be able to answer two questions. What action would I take if this value were too high? What action would I take if this value were too low? If the grower cannot answer both questions, the measurement is not producing actionable data: either the measurement is not the right one, or the operation has not yet developed the knowledge to use it.
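The two-question test can be written down as a simple checklist. The sketch below is a minimal illustration, not part of any monitoring platform; the Measurement class, its field names, and the example actions are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Measurement:
    """One measurement and the actions it supports (hypothetical structure)."""
    name: str
    action_if_high: Optional[str] = None
    action_if_low: Optional[str] = None

    def is_actionable(self) -> bool:
        # Actionable only if BOTH questions have an answer.
        return bool(self.action_if_high) and bool(self.action_if_low)


# Example entries are illustrative assumptions, not recommendations.
measurements = [
    Measurement("air_temperature", "open vents or run fans", "turn on heat"),
    Measurement("soil_moisture", "skip or shorten irrigation", "irrigate"),
    Measurement("canopy_temperature"),  # no setpoints established yet
]

# Anything that fails the test belongs in a learning phase for now.
learning_phase = [m.name for m in measurements if not m.is_actionable()]
print(learning_phase)
```

Keeping a list like this next to the sensor inventory makes it obvious which measurements are operational and which are still being studied.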

This discipline is especially important when growers encounter new monitoring possibilities. Every vendor selling sensors will describe interesting things the sensor can measure. The question to ask before buying is: if this sensor reports, what will I do differently? If the answer is 'I do not know yet,' that is a signal that the sensor belongs in a learning phase, not an operational phase — measure it, watch it, understand what it is telling you, and only then rely on it for decisions.

11Live data as proof of concept.

The small widget at the top of the OpenAgTech homepage — the live data feed from a Tennessee CEA facility — is not decoration. It is an intentional demonstration of what appropriate-technology data systems look like in practice.

The numbers come from real sensors in a real operation. They are captured by a small local computer, stored locally, displayed on the local dashboard, and also pushed to the OpenAgTech website where anyone can see them. The grower owns the system. The data lives on the grower's equipment. The website is a convenience: if the website goes down, the grower's system keeps working; if the grower's internet goes down, the website shows stale data but the grower's local system keeps operating.

This is the Rung 3 pattern in its simplest form. Local-first, cloud-assisted. The local computer does all the real work; the cloud layer adds remote access and public visibility. The sensors are inexpensive and commodity-priced. The software is free and open-source. The total hardware investment is a few hundred dollars. The system can run indefinitely without recurring fees.

The widget also makes a quieter point about the data itself. The numbers shown are the measurements the grower actually finds useful — a subset of a much larger dataset the local system captures. The grower's own dashboard shows many more sensors with deeper history and automations the public widget does not need to display. The widget shows enough to demonstrate that the system works, not everything the system does. This distinction matters: a public display of data is a highlight reel; the real operational value lives in the full system behind it.

A grower visiting this page who wonders what their own system could look like has the answer on the screen. It could look like this. It could cost about this much. It could run forever. It could belong to you entirely. The technology to do it exists, the knowledge to assemble it is shared through the collective, and the result is something your operation actually uses rather than something a vendor rents to you.

12Rules of thumb for data decisions.

A practical summary:

Start at the ladder rung that fits the operation today, not the rung that sounds impressive. A simple system running well beats a complex system running poorly.

Store the data you need long enough to answer the questions you will actually ask. For most agricultural operations, that means at least a year of history — enough to compare seasons.

Keep the monitoring system local. Cloud backup is fine; cloud dependency is not. A system that stops working when the internet fails is not an appropriate agricultural monitoring system.

Design dashboards for the moment of use — on a phone, in the field, checking quickly — not for impressive screenshots.

Alert on data absence, not just threshold crossings. The absence of readings is often more important than any specific value.

Record the actions the grower takes. Without action tracking, the loop from measurement to learning cannot close.

Review the data regularly. A monthly walk through the history catches patterns direct observation misses.

For every measurement, know what action it supports. If no action is connected, the measurement is not yet producing actionable data.

Own the data. The grower owns the sensors, the computer, the storage, the logic, and the history. The collective approach shares knowledge, not data.

The data has to be actionable and the system has to support the action. Everything else on this page is in service of that sentence.

Frequently asked questions.

The honest version.

What is sensor data?

Sensor data is the stream of measurements a sensor produces over time — temperature readings every minute, humidity readings every 15 minutes, soil moisture checked hourly. Each reading is a single data point tagged with what was measured, where, and when. Sensor data accumulates over days, months, and years into a history that can be analyzed for patterns, used to trigger alerts and automations, and reviewed to understand how the operation has performed.

What does actionable data mean?

Actionable data is information the grower can do something with. A temperature reading is actionable because high temperature calls for ventilation and low temperature calls for heat. A soil moisture reading is actionable because it drives irrigation decisions. Data that does not connect to any possible action is not actionable — it is just numbers. Good monitoring systems capture actionable data; poor systems capture whatever sensors are cheapest without thinking about what the grower would do with the readings.

What is a dashboard?

A dashboard is a screen that displays sensor data in a form a grower can quickly understand: current values, recent trends, active alerts, and system status. Good dashboards are designed for the context where they will be used, whether on a phone, in the field, or during a quick check. Modern monitoring platforms like Home Assistant let growers configure dashboards that match their specific operation, rather than accepting whatever a vendor decided the dashboard should look like.

What is a time-series database?

A time-series database is a database designed specifically for timestamped data that accumulates over time, such as sensor readings, log entries, and metrics. Time-series databases efficiently store millions or billions of timestamped readings and support fast queries like 'what was the temperature between 2 AM and 4 AM yesterday.' InfluxDB is one of the most widely used open-source time-series databases. Most modern monitoring platforms use time-series-style storage internally.
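To make the "between 2 AM and 4 AM" query concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in. SQLite is not a dedicated time-series database, and the table name, column names, and sample readings are assumptions made for illustration; a real platform will have its own schema.

```python
import sqlite3
from datetime import datetime, timedelta

# In-memory database with a hypothetical table of timestamped readings.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE readings (ts TEXT NOT NULL, sensor TEXT NOT NULL, value REAL NOT NULL)"
)
# An index on (sensor, ts) is what keeps time-range lookups fast.
conn.execute("CREATE INDEX idx_readings ON readings (sensor, ts)")

# Made-up data: one reading every 15 minutes starting at midnight.
start = datetime(2024, 2, 1, 0, 0)
rows = [
    ((start + timedelta(minutes=15 * i)).isoformat(), "greenhouse_temp", 60.0 + i * 0.5)
    for i in range(24)
]
conn.executemany("INSERT INTO readings VALUES (?, ?, ?)", rows)


def readings_between(conn, sensor, t0, t1):
    """Return (timestamp, value) pairs with t0 <= ts < t1, oldest first."""
    cur = conn.execute(
        "SELECT ts, value FROM readings "
        "WHERE sensor = ? AND ts >= ? AND ts < ? ORDER BY ts",
        (sensor, t0.isoformat(), t1.isoformat()),
    )
    return cur.fetchall()


window = readings_between(
    conn, "greenhouse_temp", datetime(2024, 2, 1, 2, 0), datetime(2024, 2, 1, 4, 0)
)
for ts, value in window:
    print(ts, value)
```

ISO-formatted timestamps sort correctly as text, which is why the plain string comparison works here.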

What is an alert?

An alert is a notification that something needs attention — a temperature out of range, a sensor not reporting, a piece of equipment malfunctioning. Alerts are sent through channels the grower can reach quickly: push notifications on a phone, SMS text messages, emails, voice calls. Good alerts are specific (what is wrong, where, since when), actionable (the grower knows what to do), and appropriately urgent (critical issues wake people up; minor issues wait).
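A rough sketch of the logic behind an alert, covering both an out-of-range value and a sensor that has stopped reporting. The function name, parameters, thresholds, and the 30-minute silence limit are all illustrative assumptions; platforms such as Home Assistant express the same idea through their own configuration rather than code like this.

```python
from datetime import datetime, timedelta
from typing import Optional


def check_sensor(
    name: str,
    value: float,
    last_seen: datetime,
    now: datetime,
    low: float,
    high: float,
    max_silence: timedelta = timedelta(minutes=30),
) -> Optional[str]:
    """Return an alert message, or None if nothing needs attention."""
    # Check for silence first: a stale value cannot be trusted.
    silence = now - last_seen
    if silence > max_silence:
        minutes = int(silence.total_seconds() // 60)
        return f"{name}: no reading for {minutes} min (last seen {last_seen:%H:%M})"
    if value < low:
        return f"{name}: {value} is below the low limit of {low}"
    if value > high:
        return f"{name}: {value} is above the high limit of {high}"
    return None


now = datetime(2024, 2, 1, 3, 0)
print(check_sensor("bench_temp", 55.0, now - timedelta(minutes=5), now, 50.0, 85.0))
print(check_sensor("bench_temp", 45.0, now - timedelta(minutes=5), now, 50.0, 85.0))
print(check_sensor("bench_temp", 55.0, now - timedelta(minutes=45), now, 50.0, 85.0))
```

Each message says what is wrong and since when, which is the "specific" part of a good alert; routing by urgency would sit on top of this.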

What is the simplest data storage for a small operation?

For very small operations, pushing sensor readings to a Google Sheet can work as a starting point. The sensor (often an ESP32 microcontroller) connects to WiFi, authenticates to Google, and appends a row of data every time it reports. The grower opens the spreadsheet and has a history of readings they can chart, analyze, and export. This approach has limits — it does not scale to many sensors or frequent readings — but for a few sensors reporting every 15 minutes, it works fine and costs almost nothing.

What is Home Assistant?

Home Assistant is an open-source home automation platform that works well as an agricultural monitoring and control system. It runs on a small computer (typically $75 to $200 in hardware), captures data from thousands of types of sensors, stores historical data, displays configurable dashboards, fires alerts, and runs automations. It is free to use, has been in production for over a decade with millions of deployments, and is the platform the OpenAgTech collective uses most often. The Hort Assistant approach is specifically Home Assistant configured for agricultural monitoring.

What is the difference between edge computing and cloud computing?

Edge computing runs software on a local computer physically near the sensors — in the greenhouse, in the office, on the farm. Cloud computing runs software on remote servers accessed over the internet. Edge computing is faster (no network latency), works without internet access, and keeps data under the grower's control. Cloud computing scales easily and supports multi-site access. The OpenAgTech collective approach is edge-first, with optional cloud layers for remote access and backup — the local computer does the real work, and the cloud is a convenience rather than a dependency.

Can I run my monitoring system without internet?

Yes. This is one of the main advantages of a properly designed system. A local computer running Home Assistant monitors sensors, stores data, runs dashboards, fires alerts, and executes automations entirely on its own. The internet adds remote access and cloud backup, but the core system works without it. If the internet goes down during a crisis, the grower's monitoring continues, which is exactly when monitoring matters most.

How much storage does sensor data take?

Less than most growers expect. A sensor reporting one reading every minute produces 1,440 readings per day, or about 525,600 per year. Each reading is typically a few dozen bytes of data, so a year of per-minute readings from one sensor takes roughly 10 to 25 megabytes before compression, which is negligible on any modern computer. Ten such sensors take roughly 100 to 250 megabytes per year. Hundreds of sensors reporting every minute reach several gigabytes per year or more, still manageable on commodity hardware. Longer reporting intervals and downsampled history reduce these figures proportionally.
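The arithmetic is easy to rerun for a specific operation. The sketch below assumes 32 bytes per stored reading and no compression; that figure is an illustrative assumption, and the real size depends on the platform, its database, and any compression or downsampling it applies.

```python
def readings_per_year(interval_minutes: float) -> int:
    """Number of readings one sensor produces in a 365-day year."""
    return int(365 * 24 * 60 / interval_minutes)


def storage_mb_per_year(
    sensors: int, interval_minutes: float, bytes_per_reading: int = 32
) -> float:
    """Rough uncompressed storage in megabytes (1 MB = 1,000,000 bytes)."""
    total_bytes = sensors * readings_per_year(interval_minutes) * bytes_per_reading
    return total_bytes / 1_000_000


print(readings_per_year(1))  # one sensor, one reading per minute
print(round(storage_mb_per_year(sensors=1, interval_minutes=1), 1))
print(round(storage_mb_per_year(sensors=10, interval_minutes=15), 1))
```

Under these assumptions a single per-minute sensor lands in the tens of megabytes per year, and ten sensors reporting every 15 minutes need less than that combined.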

How long should I keep sensor data?

At least a year for most operations — enough to compare current conditions to last year's same season. Ideally, keep the full detail for recent periods (last 30 to 90 days) and downsampled summaries (hourly or daily averages) for longer histories. Specialized operations may have regulatory requirements for longer retention — cannabis operations, for example, often need years of logged data for compliance. Retention policy should be decided before the database fills up, not after.
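Downsampling is simple to picture in code. The sketch below groups detailed readings by hour and keeps only the average; the function name and sample data are hypothetical, and monitoring platforms that support this usually do it automatically.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean


def hourly_averages(readings):
    """Collapse (timestamp, value) pairs into one average per clock hour."""
    buckets = defaultdict(list)
    for ts, value in readings:
        hour = ts.replace(minute=0, second=0, microsecond=0)
        buckets[hour].append(value)
    return {hour: mean(values) for hour, values in sorted(buckets.items())}


# Made-up data: two hours of per-minute readings.
start = datetime(2024, 2, 1, 0, 0)
readings = [(start + timedelta(minutes=i), 60.0 + (i // 60)) for i in range(120)]

summary = hourly_averages(readings)
for hour, avg in summary.items():
    print(hour, avg)
```

Here 120 detailed readings shrink to 2 summary rows. The trade-off is that short spikes inside an hour disappear from the long-term record, which is why the full detail is worth keeping for recent periods.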

How do I back up my monitoring data?

Most monitoring platforms have built-in backup capabilities. Home Assistant can produce automated daily backups that get stored locally, copied to a USB drive, or pushed to a cloud storage service. For serious operations, a three-part backup strategy works well: a local backup on a second device, an off-site copy (cloud or remote location), and periodic verification that the backups can actually be restored. The cost of backup is trivial; the cost of losing years of data is significant.

What happens to my data if Home Assistant is discontinued?

Home Assistant is open-source software maintained by a community of thousands of developers, not a product of a single company that could discontinue it. The software itself will keep working regardless of what any individual contributor does. More importantly, the data Home Assistant stores is in standard formats that other tools can read (by default, a SQLite database). If the grower ever wants to move to another system, the data is exportable. This is the ownership principle in practice: nothing ties the grower to Home Assistant except its usefulness.

How do I look at my sensor data?

Through dashboards, typically. A monitoring platform like Home Assistant lets the grower configure dashboard views that show current values, historical trends, and system status. Dashboards are accessed through a web browser on any device (phone, tablet, computer) on the network — and through apps that connect to Home Assistant for mobile use. The dashboard is the primary way the grower interacts with the monitoring system day to day.

What makes a good agricultural monitoring dashboard?

A good dashboard shows what the grower actually needs, clearly, without overwhelming them with data. The key elements: current values for the critical measurements, recent trends showing how things are moving, active alerts, and system health (sensors reporting, devices online). Good dashboards survive on a phone screen, update in real time, and work quickly. Common mistakes: cramming too many values on one screen, using flashy graphics that obscure the data, designing for demo screenshots rather than field use.

Can I see my greenhouse data on my phone?

Yes. Modern monitoring platforms (Home Assistant and similar) support mobile apps and mobile-friendly web dashboards. A grower can check current conditions, view trends, respond to alerts, and operate controls from a phone anywhere they have network access. This is one of the most-used features of a modern monitoring system — a quick check from the car, from the store, from another state while traveling.

What insights can I get from sensor data?

The insights depend on what the sensors measure and how long the grower has been collecting data. With temperature and humidity, you can calculate VPD and track stress patterns. With soil moisture, you can evaluate irrigation efficiency and detect leaks. With light, you can track DLI and tune supplemental lighting. With a season or more of history, you can compare cultivars, crops, or management changes. With multi-year history, you can see seasonal patterns and long-term trends. The insights grow deeper as the history accumulates.
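Two of these calculations are short enough to show. The sketch below assumes the widely used Tetens approximation for saturation vapor pressure, air temperature in degrees Celsius, and a constant light level for the DLI estimate. It computes air VPD only (no leaf-temperature offset), and the function names are illustrative.

```python
import math


def saturation_vapor_pressure_kpa(temp_c: float) -> float:
    """Tetens approximation; temp_c is air temperature in degrees Celsius."""
    return 0.6108 * math.exp(17.27 * temp_c / (temp_c + 237.3))


def vpd_kpa(temp_c: float, rh_percent: float) -> float:
    """Air vapor pressure deficit in kPa from temperature and relative humidity."""
    return saturation_vapor_pressure_kpa(temp_c) * (1 - rh_percent / 100)


def dli_mol_per_m2(ppfd_umol: float, hours: float) -> float:
    """Daily light integral (mol per square meter per day) for a constant PPFD.

    ppfd_umol is in micromoles per square meter per second.
    """
    return ppfd_umol * hours * 3600 / 1_000_000


print(round(vpd_kpa(25.0, 60.0), 2))         # 25 C (77 F) at 60% humidity
print(round(dli_mol_per_m2(400, 12), 2))     # 400 umol/m2/s for 12 hours
```

In practice light varies through the day, so a real DLI figure comes from summing many short-interval PPFD readings rather than multiplying one value by the photoperiod.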

How do I compare data between seasons or years?

Store the data long enough to have multiple seasons to compare. Most modern monitoring platforms can show multiple time ranges on the same chart — this season versus last season, this week versus the same week last year. Direct visual comparison often reveals patterns clearly. For more rigorous comparison, exporting the data to a spreadsheet or analysis tool allows statistical comparison and more sophisticated analysis.

What is anomaly detection?

Anomaly detection automatically flags readings that are unusual compared to historical patterns. A temperature of 80°F might be normal in August and anomalous in February. A water flow rate that is within normal range but has been steadily increasing for two weeks suggests a developing leak. Anomaly detection catches subtle problems that threshold-based alerting misses. Basic anomaly detection is available in most monitoring platforms; sophisticated anomaly detection requires specialized analysis tools.
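One very basic form of the seasonal example can be sketched as a z-score: how many standard deviations a reading sits from the history for the same time of year. This is one simple technique among many, not a description of how any particular platform works; the function name, the threshold of 3, and the sample temperatures are assumptions.

```python
from statistics import mean, stdev


def is_anomalous(value: float, history: list, z_threshold: float = 3.0) -> bool:
    """Flag a value that is far from comparable historical readings."""
    if len(history) < 2:
        return False  # not enough history to judge
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > z_threshold


# Made-up daytime temperatures (degrees F) from the same month in past years.
february_temps = [58, 60, 61, 59, 62, 60, 57, 61]
august_temps = [78, 81, 80, 79, 83, 82, 80, 79]

print(is_anomalous(80, february_temps))  # 80 F compared with February history
print(is_anomalous(80, august_temps))    # 80 F compared with August history
```

A fixed threshold would treat both readings the same; comparing against history is what separates them. The slow-leak case needs a different check, such as looking at the trend over the last two weeks, which this sketch does not cover.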

What is action tracking?

Action tracking is recording the actions a grower takes in response to observations — what fertilizer was applied when, what pest treatments were used, what equipment was changed, what setpoints were adjusted. Action tracking closes the loop from observation to outcome, making it possible to review what worked and what did not. Without action tracking, the grower cannot reliably answer 'did this change produce the result I expected?' which is where most operational learning happens.

Why should I record what I do in response to alerts?

Because otherwise the feedback loop from observation to outcome never closes. When an alert fires and the grower takes action, that action produces a result — the temperature came back in range, or the equipment failed differently, or the problem did not recur. Recording the action and the outcome lets the grower review later and learn what worked. Without the record, every response is isolated — the grower solves the immediate problem but does not accumulate the knowledge that prevents the next one.
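An action log does not need to be elaborate. The sketch below shows one possible shape for a record: what triggered the action, what was done, and what happened afterward. The ActionRecord class, its fields, and the example entries are hypothetical; a notebook or spreadsheet with the same columns does the same job.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass
class ActionRecord:
    when: datetime
    trigger: str       # the alert or observation that prompted the action
    action: str        # what the grower did
    outcome: str = ""  # filled in later, once the result is known


action_log: List[ActionRecord] = []


def record_action(when: datetime, trigger: str, action: str) -> ActionRecord:
    record = ActionRecord(when, trigger, action)
    action_log.append(record)
    return record


def find_actions(keyword: str) -> List[ActionRecord]:
    """Search past triggers and actions, e.g. when reviewing a repeat problem."""
    k = keyword.lower()
    return [r for r in action_log if k in r.trigger.lower() or k in r.action.lower()]


# Illustrative entries only.
entry = record_action(
    datetime(2024, 2, 10, 3, 20),
    "Low temperature alert: heater offline",
    "Restarted heater at the breaker",
)
entry.outcome = "Temperature back in range within 40 minutes"
record_action(
    datetime(2024, 3, 2, 14, 0),
    "Soil moisture low on bench 4",
    "Ran a manual irrigation cycle",
)

heater_events = find_actions("heater")
for r in heater_events:
    print(r.when, "|", r.action, "|", r.outcome)
```

The outcome field is the part most often skipped and the part that makes the record worth reviewing later.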

Who owns the data my sensors produce?

In the collective approach, the grower owns the data completely. It lives on the grower's equipment, in formats the grower can access and export. The OpenAgTech collective and the Home Assistant platform do not claim any rights to the data. This is a deliberate design choice — data ownership is fundamental to appropriate agricultural technology, and any system that takes data ownership away from the grower is not appropriate regardless of what else it does well.

Can I share my data with my advisor or extension agent?

Yes. Most monitoring platforms support creating user accounts with specific access levels: read-only access to specific dashboards, time-limited access for a visit, full access for a trusted consultant. The grower decides who sees what. The ability to share access when useful without surrendering control is one of the benefits of a properly designed local-first system.

Can I access my data from anywhere?

Yes, with proper configuration. Modern monitoring platforms support secure remote access through various mechanisms — Nabu Casa for Home Assistant, self-hosted VPNs, or cloud bridges. The grower traveling to a conference, checking the farm from another state, or working from a vacation can access their monitoring system as easily as if they were at home. The key is that remote access is layered on top of local operation, not the foundation the system depends on.

Should I contribute my data to research?

That is entirely the grower's choice. Some growers enjoy contributing anonymized data to research and extension work — it is how agriculture has historically advanced. Others prefer to keep their data entirely private. The collective approach supports both. If the grower wants to contribute, there are mechanisms to do so. If they do not, the data stays private. Nothing about the collective requires data sharing — the collective shares knowledge and configurations, not data.

How much does a monitoring data system cost?

A small-to-mid-scale system using Home Assistant costs $300 to $800 in initial hardware (the hub computer, a few sensors, communication hardware). The software is free. Ongoing costs are minimal — electricity for the hub (a few dollars a month), occasional replacement sensors or batteries, and optional cloud services for remote access ($5 to $10 per month for Nabu Casa, or $0 for self-hosted alternatives). Total cost of ownership over 10 years is typically a few hundred to a few thousand dollars — far less than commercial alternatives.
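The ten-year figure follows from simple arithmetic. The sketch below uses the ranges quoted above and assumes "a few dollars a month" of electricity means $2 to $5; it leaves out replacement sensors and batteries, so treat the results as a floor rather than a quote.

```python
def ten_year_cost(
    hardware: float,
    electricity_per_month: float,
    cloud_per_month: float = 0.0,
    years: int = 10,
) -> float:
    """Initial hardware plus recurring monthly costs over the given period."""
    months = years * 12
    return hardware + months * (electricity_per_month + cloud_per_month)


# Low end: cheapest hardware, self-hosted remote access.
low = ten_year_cost(hardware=300, electricity_per_month=2)
# High end: more hardware plus a paid remote-access subscription.
high = ten_year_cost(hardware=800, electricity_per_month=5, cloud_per_month=10)

print(low, high)
```

Both ends fall inside the "few hundred to a few thousand dollars" range, and the monthly subscription, not the hardware, is the largest single item at the high end.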

Do I need to pay for cloud storage?

Not for the basic monitoring system. Home Assistant stores data locally on the hub computer; no cloud service is required. If the grower wants remote access, Nabu Casa or similar services add $5 to $10 per month. If the grower wants off-site backup of their data, cloud storage services cost a few dollars a month for the volumes typical of agricultural monitoring. Cloud costs are a choice, not a requirement.

Why is my dashboard not updating?

Common causes: the hub computer is offline or crashed, the sensor has stopped reporting (dead battery, network issue, hardware failure), the browser is showing cached data, or there is a network issue between the device and the hub. Diagnosis order: check whether the hub is online, look at the sensor's last-reported time, try refreshing the browser, check other sensors to see if the issue is localized or widespread.

Why does my system store some data but not others?

Monitoring platforms typically have configuration that controls what gets stored. Home Assistant, for example, has 'recorder' settings that include or exclude specific entities from the database. Check these settings (or the equivalent in other platforms) to confirm the sensors you care about are actually being recorded. Also check whether the sensors are producing data at the expected rate: a sensor reporting intermittently due to battery issues may have gaps in its stored history.

Why is my data retention different from what I configured?

Usually because of storage policies that automatically purge old data to control database size. Home Assistant's recorder keeps 10 days of detailed history by default (the purge_keep_days setting); older data is typically kept as hourly summaries. Custom retention policies can extend this significantly. Check the recorder configuration in Home Assistant (or the equivalent in other platforms) to verify the retention matches what you expect. Plan retention deliberately; the default may not match your needs.

Do I need cloud services to monitor my farm?

No. A local monitoring system running on a small computer on the grower's property does everything a typical agricultural operation needs — sensors, storage, dashboards, alerts, automations. Cloud services add remote access and backup, which are valuable but optional. Many commercial monitoring products position cloud services as essential because it is their revenue model; that does not mean cloud services are actually necessary for the technical work.

Is more data always better?

No. More data takes more storage, more review time, and more effort to organize. The useful discipline is to capture the data that supports actions the grower actually takes, and to skip data that does not connect to any decision. A grower who captures fifty measurements but only acts on five is mostly producing noise. Five well-chosen measurements that drive decisions are worth more than fifty that do not.

Do I need specialized agricultural software?

It depends on the operation's needs more than its size. Small operations often do well with general-purpose tools: Home Assistant for monitoring, a simple spreadsheet for records, a text document for notes. Larger operations with compliance requirements or complex team coordination benefit from specialized agricultural software. The transition from general-purpose tools to specialized software is usually driven by specific operational needs, such as compliance requirements, multi-site coordination, or integrated business operations, not by farm size alone.

Will AI tell me how to grow?

AI can help analyze data and suggest patterns, but it cannot replace the grower's judgment about their specific operation. AI works best as a tool that the grower uses to extract insights from their data faster, not as an oracle that makes decisions for them. Be skeptical of products that claim AI-driven growing decisions: the algorithms are often trained on data that does not match the grower's specific context, and the decisions have consequences that need human judgment. See the section on Understanding AI.