Cyrano Security
11 min read
Edge AI, audited at the output

An edge AI solution is useful only if it compresses 200 raw detections into 6 delivered alerts. That is one number you can audit in five minutes.

Every edge AI solution explainer online compares TOPS, NPU architectures, INT8 quantization, and cloud-vs-edge latency. None of that tells you whether the channel will still be active at month three. The number that does is the compression ratio: raw person detections the model fires, divided by delivered alerts that reach a human, over 24 hours. On a healthy Cyrano install at a 16-camera multifamily property, that ratio sits in a 40:1 to 100:1 band. Below 25:1 the channel gets muted within two weeks. Above 200:1 something in the filter stack is silently dropping real events. This guide is about the two log lines you count to check it, and the four filter stages that produce it.
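The band check itself is one divide plus a comparison. A minimal sketch in Python; the function name and verdict strings are mine, but the thresholds are the ones stated above:

```python
def compression_verdict(raw_detections: int, delivered_alerts: int) -> str:
    """Classify a 24h compression ratio against the 40:1 to 100:1 band."""
    if delivered_alerts == 0:
        return "silent: no delivered alerts at all"
    ratio = raw_detections / delivered_alerts
    if ratio < 25:
        return f"{ratio:.1f}:1, below the 25:1 mute floor"
    if ratio > 200:
        return f"{ratio:.1f}:1, above 200:1, likely silent under-reporting"
    if 40 <= ratio <= 100:
        return f"{ratio:.1f}:1, in band"
    return f"{ratio:.1f}:1, outside the 40:1 to 100:1 band but not critical"

print(compression_verdict(247, 6))  # prints "41.2:1, in band"
```

Everything after this point is about where those two counts come from and which knob to turn when the verdict is out of band.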

Audit your edge AI deployment

  • Target band: 40:1 to 100:1 (raw detections / delivered alerts per 24h)
  • 200 to 300 raw person detections on a typical 16-camera day
  • 3 to 8 delivered alerts on a healthy compression
  • 5-minute audit: two log files, one divide

The SERP answers the wrong question

Search edge AI solution and the top results compare silicon. How many TOPS. What precision. INT8 or FP16. Whether the NPU is Ethos-U55, Jetson Orin, Hailo-8, or something from Qualcomm. Then they walk cloud-vs-edge latency, say a few true things about privacy, and list industrial IoT use cases for predictive maintenance.

All of that is accurate. None of it predicts whether the edge AI solution you just paid for is still being used in month three. The predictor is downstream of the silicon entirely: it is the ratio of raw detections the model fires to delivered alerts that reach a human. A 40 TOPS box with a good filter stack outperforms a 100 TOPS box with a sloppy one on the exact same cameras, because the output-side filter stack is what decides how much noise reaches the operator.

If an edge AI solution cannot answer the compression-ratio question, it cannot be audited. And if it cannot be audited, the property manager is trusting a spec sheet, not inspecting a deployment.

The four filter stages that do the compression

The model fires person_detected on every frame where it sees a person. That is the top of the funnel. On a Cyrano unit, four stages sit between that firing and the outbox writer. The compression ratio is their combined work. The model is the start of the pipeline, not the end.

person_detected → delivered alert, four filter stages

Stage 1: Layout routing

The active DVR layout_id maps the frame into tiles. A detection in the DVR clock or channel bug region is dropped at this stage, before it is ever treated as a camera event.

Stage 2: Zone filter

Each camera has one or more configured zone polygons (entry_pre_action, mailroom, stairwell). A detection whose bbox_norm does not intersect a zone is dropped. No zone means no alert.

Stage 3: Dwell threshold

The detection must persist across enough frames to pass a per-class dwell. Typical thresholds: 2 s for trespass, 15 s for loitering, 60 s for package_dwell. Sub-threshold detections are dropped.

Stage 4: Threat tier

The threat classifier assigns LOW or HIGH. LOW lands in the daily digest. HIGH is sent to the on-call operator in real time. The outbox only forwards HIGH to the always-on channel.

Outbox append

After the four stages, the event passes the egress validator and is appended to /var/lib/cyrano/outbox/pending.ndjson. That append is what gets counted as a delivered alert, the denominator of the compression ratio.
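To make the stage order concrete, here is a minimal sketch of the four stages as a single routing function. Every name here (Detection, DWELL_S, route) is illustrative, not Cyrano's actual code; the dwell values are the typical thresholds quoted above:

```python
from dataclasses import dataclass

# Illustrative per-class dwell thresholds, in seconds (values from the text).
DWELL_S = {"trespass": 2.0, "loitering": 15.0, "package_dwell": 60.0}

@dataclass
class Detection:
    tile_is_camera: bool  # stage 1: tile maps to a real camera, not DVR chrome
    in_zone: bool         # stage 2: bbox_norm intersects a configured zone polygon
    event_class: str      # stage 3: selects the dwell threshold
    dwell_s: float        # how long the detection persisted across frames
    threat_tier: str      # stage 4: "LOW" or "HIGH"

def route(d: Detection) -> str:
    """Return where a raw detection ends up: dropped, digest, or delivered."""
    if not d.tile_is_camera:
        return "dropped: stage 1 (layout routing)"
    if not d.in_zone:
        return "dropped: stage 2 (zone filter)"
    if d.dwell_s < DWELL_S.get(d.event_class, float("inf")):
        return "dropped: stage 3 (dwell threshold)"
    if d.threat_tier != "HIGH":
        return "digest: stage 4 (threat tier LOW)"
    return "delivered: outbox append"
```

Only the last branch adds to the delivered-alert count; every other branch is the compression.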

What the band actually looks like, in numbers

The figures below are from multifamily properties in the 16 to 25 camera range. Raw detections scale roughly linearly with pedestrian traffic, which at a 200-unit Class C property runs 200 to 300 per day. Delivered alerts scale with the filter stack, not traffic.

Target band: 40:1 to 100:1 (raw detections / delivered alerts per 24h)

"A healthy edge AI solution at a 16-camera property compresses 200 to 300 raw person detections into 3 to 8 delivered alerts every 24 hours. Below 25:1 the channel gets muted within two weeks. Above 200:1 real events are being silently dropped. The band is not marketing language; it is what the detector logs show across 50+ Cyrano installs."

Cyrano install telemetry, /var/log/cyrano/detector.log vs /var/lib/cyrano/outbox/sent/

The raw detector log, and what falls out at each stage

The excerpt below is three incidents in sequence. The first survives all four filter stages and becomes a delivered loitering alert. The second is dropped at the zone filter because the person walked through the lobby without crossing a configured polygon. The third is dropped at dwell because 3.1 seconds is not long enough to be a loiter. Every line with person_detected counts toward the raw-detection numerator. Only the outbox_append counts toward the delivered-alert denominator.

/var/log/cyrano/detector.log

The five-minute audit, on a real unit

This is the exact sequence a property ops lead runs to check whether an edge AI solution is still in band, with no vendor in the loop. Two log files, one divide, one verdict. If the unit exposes detector and outbox logs, it works on any edge AI solution, not only Cyrano.

property-shell$ cyrano-audit compression

Where the compression actually happens in the pipeline

HDMI capture
Person detector
Layout classifier
Four-stage filter stack
Outbox append
Digest only
Dropped

Three deployments, same hardware, different output

The rows below are three real Cyrano installs on the same chassis, pulling the same DVR multiview feed. What differs is the filter stack configuration. Row 1 is in band. Row 2 is below the mute floor. Row 3 is in silent-under-reporting territory. Nothing in the silicon changed.

| Feature | Out-of-band installs | In-band install (healthy) |
|---|---|---|
| Cameras | 16 (identical chassis, identical feed) | 16 |
| Raw person_detected / 24h | 261 / 238 (model fires the same) | 247 |
| Delivered alerts / 24h | 82 / 1 (filter stack differs) | 6 |
| Compression ratio | ~3:1 / ~238:1 | ~41:1 (in the 40 to 100 band) |
| Zone polygons configured | 0 zones / 16 over-tight zones | All 16 cameras have zones |
| Dwell thresholds | Flat 0 s / flat 120 s | Per-class (2 / 15 / 60 s) |
| Threat tier routing | All to on-call / HIGH demoted to digest | HIGH only to on-call |
| Operator behavior at week 3 | Channel muted / complaints of no coverage | Channel active, alerts triaged |

The four signals that drive the compression, as configuration

These four configuration surfaces are where the filter stack lives. If an edge AI solution does not expose any of them, the compression ratio is not tunable, which means it cannot be moved into the 40:1 to 100:1 band after deployment. That is a product failure, not a modeling one.

Zone polygons

Per-camera polygons that define the regions where a detection counts. Entry pre-action, mailroom, pool fence, rooftop stair. A camera with no zone emits no alerts, by design.

Dwell thresholds

Per event_class, in milliseconds. 2,000 ms for trespass, 15,000 ms for loitering, 60,000 ms for package_dwell. Sub-threshold detections never reach the outbox.

Threat tier rules

HIGH is paged; LOW lands in the digest. Rules live in a config file, not in the model. A loiter at 3am is HIGH; at 2pm with the office open, LOW.

Layout tile map

Per-layout JSON that tells the pipeline which tile is which camera, and which tiles are non-camera chrome (clock, channel bug). Without this, 15 percent of detections are phantoms.

Per-camera mute windows

Scheduled quiet periods where a camera drops alerts but whose raw detections still count in the numerator. Used on leasing-office lobbies during business hours to keep the ratio honest.

Outbox schema

The 9-key envelope that gets appended once the event clears all four stages. If the event cannot be encoded in 9 keys, it is not delivered, even if the model fired.
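The six surfaces above could plausibly live in one readable file. A hypothetical shape as a Python literal; every key name and value here is an assumption for illustration, not Cyrano's documented schema:

```python
# Hypothetical filter-stack config; key names and structure are illustrative only.
FILTER_STACK = {
    "zones": {  # per-camera polygons in normalized [0, 1] frame coordinates
        "cam03": [
            {"name": "entry_pre_action",
             "polygon": [(0.10, 0.55), (0.45, 0.55), (0.45, 0.95), (0.10, 0.95)]},
        ],
    },
    # Per event_class dwell thresholds in milliseconds, as quoted in the text.
    "dwell_ms": {"trespass": 2000, "loitering": 15000, "package_dwell": 60000},
    "tier_rules": [  # first match wins; HIGH is paged, LOW goes to the digest
        {"event_class": "loitering", "hours": (22, 6), "tier": "HIGH"},
        {"event_class": "loitering", "hours": (6, 22), "tier": "LOW"},
    ],
    "layout_tiles": {  # which tiles are cameras vs non-camera chrome, per layout
        "layout_4x4": {"tiles": 16, "chrome": ["clock", "channel_bug"]},
    },
    # Alerts suppressed, raw detections still counted, to keep the ratio honest.
    "mute_windows": {"cam01": [("09:00", "17:00")]},
}
```

The point of the file shape is auditability: an ops lead should be able to read the tier rules and mute windows without a vendor decoding them.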

Symptoms of a compression-ratio failure

Alerts arriving every 3 to 5 minutes during daylight
Channel muted by week two
Alerts reassigned to an unread shared inbox
Operator complaints of noise
Fewer than 3 alerts on a 24h window at a 200-unit property
Known incidents that do not appear in the log
Zone polygons too tight to overlap normal entry paths
Dwell thresholds set in minutes rather than seconds
Threat tier rules demoting HIGH to LOW
Phantom bounding boxes on DVR overlay chrome

Each item above maps back to one of the four filter stages. Correcting the compression ratio means finding which stage is letting too much through, or blocking too much, and tuning only that stage.

The audit, as five steps

Step 1: Pull the last 24 hours of detector output

grep -c person_detected /var/log/cyrano/detector.log. That count is your numerator: raw detections. No fancy tooling required.

Step 2: Count delivered alerts for the same window

wc -l /var/lib/cyrano/outbox/sent/$(date -d yesterday +%Y-%m-%d).jsonl plus anything still sitting in pending.ndjson. The sum is your denominator.

Step 3: Compute the ratio

raw / delivered. A 200-unit multifamily with 16 cameras should land in 40 to 100. Write the number down and move on.

Step 4: If out of band, identify the stage

Tail the detector log and look at the line that appears after each person_detected. zone_filter keep=false means stage 2. dwell_threshold keep=false means stage 3. A missing tile mapping means stage 1. No threat_class line means stage 4.

Step 5: Tune only the stage that is off

Loosen a zone polygon by a meter. Move a dwell threshold by three seconds. Reclassify a tier rule. Re-run the audit in 24 hours. The compression ratio should move toward the band without touching the model or the silicon.
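Steps 1 through 3 collapse into a short script. A sketch, assuming the log paths named in this guide and one detector log line per raw detection; the function names are mine:

```python
import datetime
import pathlib

def compression_ratio(detector_log: str, *alert_files: str) -> float:
    """Raw person_detected lines divided by delivered-alert lines."""
    raw = sum("person_detected" in line for line in detector_log.splitlines())
    delivered = sum(1 for text in alert_files
                    for line in text.splitlines() if line.strip())
    if delivered == 0:
        raise RuntimeError("zero delivered alerts: silent drop or unit offline")
    return raw / delivered

def audit() -> float:
    """Run the count against the on-unit paths named in this guide."""
    day = (datetime.date.today() - datetime.timedelta(days=1)).isoformat()
    outbox = pathlib.Path("/var/lib/cyrano/outbox")
    alert_files = [outbox / "sent" / f"{day}.jsonl", outbox / "pending.ndjson"]
    return compression_ratio(
        pathlib.Path("/var/log/cyrano/detector.log").read_text(),
        *[f.read_text() for f in alert_files if f.exists()],
    )
```

The same two counts work on any unit that writes detector output and delivered alerts to separate files; only the paths change.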

A real audit log, from a 24-hour window

Below is the output of cyrano-audit compression on a 16-camera multifamily property over one day. Notice the per-stage drops: zone_filter catches the majority, dwell drops a smaller fraction, threat tier demotes a handful to the digest. The final count is 247 raw detections over 6 delivered alerts. The ratio is 41.2. In band.

cyrano-audit compression, /var/log/cyrano/audit.log excerpt
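The per-stage drop counts in an audit like this can be tallied from the detector log alone. A sketch, assuming the zone_filter keep=false and dwell_threshold keep=false line formats described in step 4 of the audit; the tile_unmapped and threat_class=LOW markers are my assumptions for the other two stages:

```python
from collections import Counter

def tally_drops(detector_log: str) -> Counter:
    """Count which filter stage handled each event (line formats assumed)."""
    tally = Counter()
    for line in detector_log.splitlines():
        if "tile_unmapped" in line:                       # assumed marker, stage 1
            tally["stage 1: layout routing"] += 1
        elif "zone_filter" in line and "keep=false" in line:
            tally["stage 2: zone filter"] += 1
        elif "dwell_threshold" in line and "keep=false" in line:
            tally["stage 3: dwell threshold"] += 1
        elif "threat_class=LOW" in line:                  # assumed: LOW goes to digest
            tally["stage 4: demoted to digest"] += 1
        elif "outbox_append" in line:
            tally["delivered"] += 1
    return tally
```

A tally like this is what turns an out-of-band ratio into a specific stage to tune, rather than a generic complaint about the model.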

What to ask any edge AI solution vendor before signing

The TOPS number on the box is public; the compression ratio in the field is not. A vendor who can show you one and refuses to show you the other is selling silicon, not a deployment. These are the checklist items that distinguish an edge AI solution that will still be in use at month three from one that will be silently muted.

Pre-sign checklist for any edge AI solution

  • What is the compression ratio band on a property like mine?
  • Can you show me the detector log and the delivered-alert log on a reference install?
  • Which four or five surfaces do I tune to move the ratio into band?
  • How do I run the same ratio audit myself, without your help, in five minutes?
  • What happens to the ratio when the operator changes the DVR layout mid-shift?
  • How are LOW vs HIGH tier rules expressed, and can I read them as a file?
  • What fraction of raw detections are dropped by the overlay / chrome mask on a typical day?
  • What does the band look like 30, 60, 90 days after install on your oldest customers?

The thing the SERP does not tell you

Edge AI is judged at the output, not at the chip.

The hardware question is settled: every serious edge AI solution has a usable NPU or GPU on board. The open question, which decides whether the deployment is useful, is the compression ratio between the model firing and the operator reading. On a healthy Cyrano install at a 16 to 25 camera multifamily, that ratio is 40:1 to 100:1, produced by four filter stages (layout, zone, dwell, threat tier), verifiable by counting two log files, in five minutes, with no vendor in the loop. That is the edge AI solution audit. That is the thing a property ops lead can actually sign off on.


Run the compression-ratio audit on your own cameras

Book a 30-minute call. We will plug Cyrano into your DVR, let it run for 24 hours, and walk the detector log and delivered-alert log together so you can see the ratio on your own property.

Book a call →

Frequently asked questions

Why should I evaluate an edge AI solution by its compression ratio instead of its TOPS rating?

TOPS measures silicon throughput, not whether the system produces output anyone reads. Two edge AI solutions with the same 40 TOPS rating can produce a 50:1 compression ratio and a 2:1 compression ratio on the same 16 cameras, because the ratio is determined by the filter stack (zone geometry, dwell thresholds, threat tiering, layout routing), not by the NPU. A 2:1 ratio means the property ops lead is receiving roughly one alert every five minutes from the cameras, which gets muted in a week. A 50:1 ratio is one alert every few hours with a specific thumbnail and zone label attached. Same silicon. Same cameras. Completely different product. TOPS is a hardware spec. Compression ratio is a deployment spec, and deployment is what you are actually buying.

Where exactly does the 40:1 to 100:1 band come from, and is it specific to Cyrano?

It comes from counting person_detected lines against event_delivered lines across ~50 Cyrano installs on multifamily properties in the 16 to 25 camera range. A typical 16-camera 200-unit Class C property produces roughly 200 to 300 raw person detections per day across residents, vendors, delivery drivers, and passersby. A healthy Cyrano deployment compresses those to 3 to 8 delivered alerts that reach a property manager or guard. 200-300 raw / 3-8 delivered lands in roughly 40:1 to 100:1. The band is not Cyrano marketing; it is what the log files show. An edge AI solution with a different filter stack will produce a different band, and the band it lands in is the best single predictor of whether the channel stays active at month three.

What is the 25:1 floor and why does the channel get muted below it?

Below a 25:1 compression ratio, an operator on a 16-camera property receives one alert every three to five minutes during daylight hours. That is not a security channel; that is background noise. In our install data, every property that ran below 25:1 for two consecutive weeks ended up either muting notifications entirely, reassigning alerts to a shared inbox that nobody opens, or asking for the system to be disabled. The failure is not that the edge AI solution is inaccurate; the model can still be firing correctly on real people. The failure is that its output-side filter stack is not compressing correctly, so humans never get past the first week. 25:1 is the minimum floor where the alert stream is still attention-compatible with a human on call.

And what does above 200:1 mean in the other direction?

Above 200:1 means the system is not emitting enough alerts to represent what actually happened at the property. On a 200-unit multifamily, 24 hours with only one or two alerts is not a good sign; it means either the zone polygons are drawn too tight (a person walking three feet outside the tripwire produces zero signal), the dwell threshold is set so aggressively that a loiterer standing 14 seconds at a door produces nothing (because the threshold is 15), or the threat classifier is demoting actual trespass events to LOW and filtering them at the outbox. A working edge AI solution has to be tunable down from 200:1 toward 60:1, and that tuning is a configuration artifact, not a model retraining job. If the vendor cannot show you where to loosen the zone or the dwell, the high ratio is silent under-reporting, not excellent precision.

How do I run this compression-ratio audit on site, without vendor help?

Shell into the unit and count two things in the last 24 hours of logs. First: grep -c person_detected /var/log/cyrano/detector.log. That is the numerator. Second: wc -l /var/lib/cyrano/outbox/pending.ndjson and /var/lib/cyrano/outbox/sent/$(date -d yesterday +%Y-%m-%d).jsonl. Sum those two for the denominator of delivered alerts. Divide. If the number is in 40 to 100, the solution is deployed correctly. If it is below 25 or above 200, something in the filter stack is misconfigured. The audit takes under five minutes once you know where to look, and it works on any edge AI solution that writes detector output and delivered alerts to separate log files, not just Cyrano.

What are the four filter stages that actually compress raw detections into delivered alerts on a Cyrano unit?

Stage 1 is layout routing: the frame capture identifies which tile the detection came from using the per-layout tile map, and drops detections from tiles that are not mapped to a camera (for example the DVR clock panel). Stage 2 is zone filtering: each camera has one or more zone polygons, and a detection that does not intersect a zone polygon is dropped. Stage 3 is dwell thresholding: a detection must persist across enough frames to pass a per-class dwell threshold (typically 2 seconds for trespass, 15 seconds for loitering, 60 seconds for package_dwell). Stage 4 is threat tiering: the threat classifier marks the event LOW or HIGH, and the outbox only forwards HIGH to the on-call channel while LOW lands in the digest. Those four stages are where the 40:1 to 100:1 compression happens. The model itself is the start of the pipeline, not the end.

Does this mean the edge inference hardware does not matter?

It matters for throughput, not for whether the output is useful. A 25-camera feed running at DVR multiview framerate is on the order of 25 to 75 inferences per second depending on the per-tile scheduler, and that load needs a real NPU or GPU on the unit. But once the hardware clears that bar, more TOPS does not produce better compression. Better compression comes from better zone geometry, better dwell thresholds, better threat classification, and better layout-aware tile routing. That is why an edge AI solution that ships with a 100 TOPS chip can still underperform a 40 TOPS system with a well-tuned filter stack on the same cameras. The hardware is necessary. The filter stack is the thing that makes it work.

What happens when the operator changes the DVR layout mid-day, does the compression ratio break?

It should not, because the layout classifier tracks the active layout_id per frame and routes detections through the correct tile map. If it did break, you would see the compression ratio spike: a 4x4 grid collapsing to a 1x1 fullscreen makes one tile fill the whole frame, which makes the overlay mask wrong, which produces phantom detections around the DVR chrome. A functioning edge AI solution must handle this, or it cannot survive a guard drilling into a single camera and back. On a Cyrano unit, the layout_id is tagged on every event in the payload, which means an auditor looking at the outbox log sees exactly which layout was active at the time of each event, and a suspicious compression-ratio spike is traceable to a specific layout transition.

Is this a Cyrano-only framing, or does the compression ratio apply to other edge AI solutions?

The framing applies to any edge AI solution whose output is alert events routed to humans, whether the compute sits on a camera SoC, an NVR, a dedicated appliance, or a cloud GPU. What is Cyrano-specific is the exact band (40:1 to 100:1), because that band is tied to a particular filter stack on a particular class of property (16 to 25 cameras, multifamily). The underlying principle, compression ratio predicts adoption, is transferable. If you are evaluating any edge AI solution, ask the vendor for their compression-ratio band on a comparable property, and ask how a new installer would tune the ratio down from a cold start. A vendor that cannot answer either question is selling you silicon, not a working deployment.

How does this audit framing differ from what the top SERP results cover for edge AI solution?

The top SERP results for edge ai solution are written for an OEM or a platform buyer. They compare TOPS, NPU architectures, INT8 vs FP16 quantization, cloud-vs-edge latency, and list industrial IoT use cases. None of them define a field-verifiable metric that tells a property manager whether the deployed system is useful. Compression ratio fills that gap: it is one number, computable from two log lines on the unit, and it predicts operator adoption better than any hardware spec. This guide is about that reframe, so that the edge AI solution an operator signs for is one they can audit, not one they have to trust.

๐Ÿ›ก๏ธCyranoEdge AI Security for Apartments
ยฉ 2026 Cyrano. All rights reserved.

Public and anonymous. No signup.