Edge AI is not a place; it is a 9-key file on the box that a tenant's attorney can read.
Every edge AI explainer defines the category by where inference runs. That framing is unverifiable from the outside. On a Cyrano unit, edge AI is defined by a single file, /etc/cyrano/egress.schema.json, which permits exactly 9 keys to ever leave the property, and a validator that rejects anything else before the outbox writer is even allowed to touch it. This page is about that file, the 420-byte envelope it produces, and why the contract, not the silicon, is what "edge" actually means to the people who live on the property.
Read the schema on a production unit

The SERP defines edge AI by where. That is not auditable.
Read the current top results for edge AI and the answer is always a location. Inference happens close to the sensor. Latency is low because the round trip is short. Bandwidth is cheap because pixels do not upload. Privacy is better because data stays local. All of that is accurate. None of it is falsifiable. A property owner who asks a vendor "where does the inference happen" gets a marketing answer and has no way to check it.
A tenant's lease attorney cannot read "where inference happens." An insurance carrier cannot inspect a TOPS rating. A property ops lead cannot point at a bandwidth diagram in an incident review. The vendor-facing abstraction and the operator-facing question are in different languages.
So on a Cyrano unit, edge AI is defined the other way round. Not by where the compute is, but by what is allowed to leave the building. The answer is one file and one validator. Open them and the whole category becomes legible.
The schema file, in full
The whole file is short on purpose. A schema that cannot fit on one screen is a schema that nobody reads. Every tenant-privacy promise the product makes reduces to a subset check against the allowed_keys array below. Add a field here and it becomes allowed; remove a field here and it becomes impossible to ship. There is no other path.
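The schema file itself is not rendered in this text version of the page. From the key list and the version-bump behavior described here, a minimal sketch of its likely shape (the exact layout and the placement of the version field are assumptions) is:

```json
{
  "version": "1.0.0",
  "allowed_keys": [
    "device_id",
    "local_seq",
    "occurred_at_unix",
    "camera_id_hash",
    "event_class",
    "threat_tier",
    "bbox_norm",
    "confidence",
    "dwell_ms"
  ]
}
```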
The 9 allowed keys, and what each one means
device_id
The cy-<serial> of the unit. A stable identifier scoped to the physical device. Not the property label, not the account email, not the owner name.
local_seq
Monotonic integer assigned by the outbox writer. Makes event ordering survive a network partition even when occurred_at_unix drifts.
occurred_at_unix
Local unix timestamp from the unit's NTP-synced clock. Not read from the DVR's on-screen display; the DVR clock is often hours wrong.
camera_id_hash
HMAC of the property-scoped camera label under a per-property key stored in /var/lib/cyrano/secrets/property.key. Opaque off the property, resolvable on it.
event_class
One of a fixed enum of roughly 14 classes: loitering, tailgating, package_dwell, trespass_zone_entry, and so on. No free-text fallback.
threat_tier
LOW or HIGH. Produced by the on-device threat classifier so operators do not get paged for benign activity.
bbox_norm
Four floats, x y w h, each in 0.0 to 1.0 of the frame. No absolute pixel coordinates leave the unit, so resolution metadata is never implicit.
confidence
One float in 0.0 to 1.0. Lets downstream consumers threshold or recalibrate without the raw detector output ever traveling.
dwell_ms
How long the event persisted on the frame, in milliseconds. Separates a 200ms detection jitter from a real 18-second package loiter.
What one event actually looks like on the wire
One line of NDJSON, appended to pending.ndjson after validation. This is the entire packet the property ever emits for a detection. 420 bytes, no trailing metadata, no header, no wrapping protobuf. Read it, and you have read everything the building sends about this incident.
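The line itself is not reproduced on this page, so here is a hypothetical event with invented values, assembled in Python, just to show the shape and rough size of one NDJSON line:

```python
import json

# Hypothetical values; the field names are the 9 schema keys from this page.
event = {
    "device_id": "cy-8f2a41c7",        # cy-<serial> (this serial is made up)
    "local_seq": 48211,                # monotonic outbox sequence
    "occurred_at_unix": 1718823451,    # unit clock, NTP-synced
    "camera_id_hash": "b1946ac9" * 8,  # stands in for a 64-hex HMAC digest
    "event_class": "package_dwell",    # one of the fixed enum values
    "threat_tier": "LOW",
    "bbox_norm": [0.412, 0.233, 0.081, 0.194],  # x, y, w, h as frame fractions
    "confidence": 0.87,
    "dwell_ms": 18200,
}

line = json.dumps(event, separators=(",", ":")) + "\n"  # one NDJSON line
print(len(line.encode()))  # a few hundred bytes, the ballpark the page quotes
```

Exact size varies a little with serial length, enum value, and digest width, which is consistent with the page's "give or take a few bytes per event class."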
From detector output to outbox, enforced by the schema
“A 200-unit Class C property generating 300 detection events a day sends 126 KB of JSON across all cameras for the whole day. That is the entire tenant-privacy surface: not a pixel of video, not a face embedding, not a license plate string, not a local IP. 420 bytes per event, 9 keys per event, one schema file you can open with a text editor.”
Cyrano egress accounting, /var/lib/cyrano/outbox/pending.ndjson sampling
The validator, run on a real unit
The schema is not a comment at the top of a file. It is checked by cyrano-egress-validate before any write to the outbox. The session below is the exact path a valid event and a forbidden event take on a production unit. The forbidden one never touches pending.ndjson.
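The session transcript is not reproduced here, but the check the page describes (the keyset must be a subset of allowed_keys, and each value must conform to its declared type) reduces to a few lines. This is a sketch in Python, not the actual cyrano-egress-validate source; the per-key type rules are my assumptions from the key descriptions above:

```python
# Assumed type map, derived from the key descriptions on this page.
ALLOWED = {
    "device_id": str, "local_seq": int, "occurred_at_unix": int,
    "camera_id_hash": str, "event_class": str, "threat_tier": str,
    "bbox_norm": list, "confidence": float, "dwell_ms": int,
}

def validate(event: dict) -> tuple[bool, str]:
    """Return (ok, reason); reason strings follow the formats quoted in the FAQ."""
    for key, value in event.items():
        if key not in ALLOWED:
            return False, f"forbidden_key:{key}"
        if not isinstance(value, ALLOWED[key]):
            return False, f"type_mismatch:{key}:{type(value).__name__}"
    return True, "ok"

ok, reason = validate({"device_id": "cy-8f2a41c7", "pixels_b64": "AAAA"})
print(ok, reason)  # False forbidden_key:pixels_b64
```

The real tool also gates the outbox write on its exit code; this sketch only shows the decision, not the privilege boundary.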
What a forbidden write looks like in the logs
A running unit surfaces schema violations on the property dashboard. The excerpt below is the audit trail after a bad release tried to ship a new key without a schema bump. The outbox was untouched. The rejection landed in the log. The cloud never saw the event.
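The dashboard excerpt is not reproduced in this text version. Going only by the rejection behavior the FAQ describes (the candidate JSON lands in /var/log/cyrano/rejected.log with a reject_reason string, and a schema_violations counter increments), a rejection record might be assembled like this; the exact log layout is an assumption:

```python
import json, time

def rejection_record(candidate: dict, reason: str) -> str:
    # Layout assumed; the page specifies only the log path, that the candidate
    # JSON is written there, and the reject_reason string format.
    return json.dumps({
        "rejected_at_unix": int(time.time()),
        "reject_reason": reason,   # e.g. "forbidden_key:thumbnail_url"
        "candidate": candidate,    # stays on the unit; the uplink never sees it
    })

print(rejection_record({"thumbnail_url": "https://example.invalid/t.jpg"},
                       "forbidden_key:thumbnail_url"))
```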
Across production property installs, the egress envelope has been the same 9 keys. Bumping the schema is a reviewable diff; shipping a new field without it is a rejected write.
What the unit cannot send, no matter what the detector saw
Operators occasionally ask why Cyrano cannot email a thumbnail of the alert. The answer is not product conservatism; it is that thumbnail_url is not a key in the schema, which means the validator drops any write that contains it, which means the outbox writer cannot reach the network with one. These are the forbidden fields a production unit rejects today.
forbidden keys enforced by cyrano-egress-validate:
- pixels_b64 (no raw or cropped frames)
- thumbnail_url (no server-side image references)
- face_embedding (no biometric vectors)
- license_plate_text (no OCR strings)
- operator_note (no free-text, no PII leakage)
- wifi_ssid (no network fingerprinting)
- local_ip (no infrastructure leaks)
- mac_address (no device-on-network identifiers)
- raw_ocr_text (no DVR on-screen text)
allowed keys enforced by cyrano-egress-validate:
- device_id (cy-<serial>)
- local_seq (monotonic integer)
- occurred_at_unix (unit clock, NTP-synced)
- camera_id_hash (HMAC, per-property key)
- event_class (14-value enum)
- threat_tier (LOW or HIGH)
- bbox_norm (four floats, 0.0 to 1.0)
- confidence (float, 0.0 to 1.0)
- dwell_ms (integer milliseconds)
Where the schema sits in the data path
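The original diagram is not reproduced in this text version; pieced together from the capture-loop, validator, and outbox descriptions elsewhere on this page, the path is roughly:

```
DVR multiview ─▶ HDMI capture loop ─▶ detector ─▶ event formatter (9-key envelope)
                                                          │
                                               cyrano-egress-validate
                                                ├─ pass ─▶ pending.ndjson ─▶ uplink
                                                └─ fail ─▶ rejected.log + schema_violations
```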
SERP-style edge AI vs. egress-contract edge AI
The rows below are not a competitive comparison. They are two framings of the same category. The SERP framing is about where compute lives; it is the right story for an OEM or a platform buyer. The egress framing is about what can leave the property; it is the story for an operator, an attorney, or a compliance officer. Both are true; only one is auditable.
| Feature | Edge AI as a compute-location claim (SERP) | Edge AI as an egress contract (Cyrano) |
|---|---|---|
| Definition | Inference runs close to the sensor | Only keys in /etc/cyrano/egress.schema.json may leave |
| Who can verify it | The vendor, from its architecture diagrams | Anyone with shell on the unit, by reading one file |
| Artifact a tenant attorney can read | None; compute location is not a document | A 9-key JSON schema, about 60 lines |
| What leaves when inference finds a person | Implementation-defined; varies by product | 420 bytes of JSON, no pixels, ever |
| How privacy violations are prevented | Policy, roadmap, trust-the-vendor | Subset check on allowed_keys before any outbox write |
| How a new field ships | Silent schema evolution in the cloud | A reviewable diff to egress.schema.json and a version bump |
| Bytes on the wire per event | Not specified; often includes image data | 420 B / event; 126 KB/day at 300 events/day |
The thing that is uncopyable
Edge AI fits in 9 keys and 420 bytes.
The whole tenant-privacy surface of a Cyrano edge AI install reduces to one file at /etc/cyrano/egress.schema.json and one validator, cyrano-egress-validate, that enforces a subset check before any write ever reaches the outbox. No pixels. No face embeddings. No plate strings. No SSIDs. 9 permitted keys, about 420 bytes per event, and a rejection log that makes a schema violation a visible counter on the property dashboard. That is edge AI as a contract. That is the thing a lease attorney can read and sign off on without asking the vendor to explain itself.
Things the egress contract explicitly forbids, that competing SERP framings leave ambiguous
Every item above is a write that cyrano-egress-validate rejects before the outbox writer is invoked. Rejections are logged to /var/log/cyrano/rejected.log and increment a schema_violations counter on the property dashboard.
When the compute-location framing is the right one
Worth being direct. If the deployment is a robotics cell, a factory line, or a machine-vision QA rig, the compute-location framing covers the buyer correctly. Latency matters because an actuator has to fire inside a window. Bandwidth matters because the product line is generating a river of video. TOPS and watts are the buyer's real questions.
For commercial property work, the questions are different. The unit is sitting on a closet shelf at a Class B or C multifamily, a construction trailer, or a gated community. The tenants have privacy expectations written into the lease. The regional ops director has to explain the system to an insurance carrier. In that world, the question is not "where does inference run." The question is "what leaves the building, and can I read it." The schema is the answer.
See /etc/cyrano/egress.schema.json on a real unit
A 15-minute call. We open the schema file, run cyrano-egress-validate against a live event stream, and show a rejection land in /var/log/cyrano/rejected.log without touching the outbox.
Book a call →

Edge AI: frequently asked questions
Why define 'edge AI' by a file on the unit instead of by where inference runs?
Because the compute-location framing is unverifiable from outside the vendor. A property owner who asks 'where does the inference happen' gets a marketing answer. A property owner who asks 'what is allowed to leave the building' can be handed a path, /etc/cyrano/egress.schema.json, open it with any text editor, and read the 9 keys that are permitted. The schema is the contract. The compute location is an implementation detail that exists because of the schema, not the other way round. Reframing edge AI as an egress-contract category is the move that makes it auditable.
What are the 9 keys the schema allows?
device_id (the cy-<serial> of the unit), local_seq (the monotonic sequence the outbox uses for ordering), occurred_at_unix (the local unix timestamp), camera_id_hash (an HMAC of the property-scoped camera label, never the human label), event_class (one of a fixed enum of about 14 classes, for example loitering, tailgating, package_dwell, trespass_zone_entry), threat_tier (LOW or HIGH), bbox_norm (four floats, x y w h, each in 0.0 to 1.0 of the frame), confidence (one float, 0.0 to 1.0), and dwell_ms (how long the event persisted on the frame). That is the whole envelope. Everything else is a validation failure.
What is explicitly NOT in the schema, and why does that matter?
Pixels are not in the schema. No base64-encoded thumbnails, no crop URLs, no face embeddings, no license plate strings, no raw timestamps from the DVR's on-screen display, no operator free-text notes, no WiFi SSID, no local IP address. If a downstream handler tries to write any of those into an event before the outbox, cyrano-egress-validate drops the write and logs a rejection line to /var/log/cyrano/rejected.log. The reason this matters: multifamily leases have tenant-privacy clauses that name pixels and biometrics specifically, and 'our AI runs on the edge' is not a sentence you can hand a lease attorney. A schema path is.
How big is one validated event on the wire, in bytes?
420 bytes, give or take a few bytes per event class. That is one line of NDJSON appended to /var/lib/cyrano/outbox/pending.ndjson. A property that generates 300 events per day, which is roughly the upper bound at a 200-unit Class C multifamily, sends 126 KB of JSON across all cameras over the whole day. For comparison, one 1080p snapshot at JPEG quality 80 is on the order of 180 KB, which is larger than a month of events from an average 50-unit property. The envelope is the point: 'edge AI' here means the building's entire compliance surface is about 100 KB a day of structured text, not pixels.
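The byte math in this answer is easy to sanity-check (treating 180 KB as roughly 180,000 bytes):

```python
event_bytes = 420
events_per_day = 300

print(event_bytes * events_per_day)  # 126000 bytes/day, i.e. 126 KB
print(180_000 // event_bytes)        # ~428 events fit in one 1080p JPEG snapshot
```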
How is the schema actually enforced, not just declared?
Before any process on the unit is allowed to append a line to /var/lib/cyrano/outbox/pending.ndjson, it pipes the candidate JSON through cyrano-egress-validate, a small setuid helper that reads /etc/cyrano/egress.schema.json, walks the candidate's keyset, and exits 0 only if the set is a subset of the allowed 9 and every value conforms to its declared type. The outbox writer has no other entry point; the file permissions on pending.ndjson are u=rw,g=w,o= and only the validator's group can append. A developer who adds a new field in a future release has to change the schema file to ship it, which is a reviewable diff, which is the audit.
What happens when a field is dropped, on a reject?
Two things. The candidate JSON is written to /var/log/cyrano/rejected.log with a reject_reason string (for example 'forbidden_key:pixels_b64' or 'type_mismatch:confidence:string'). The outbox is untouched; the network uplink never sees the rejected event. A counter on the property safety dashboard increments a schema_violations metric so a property ops lead can tell at a glance whether a new release is silently producing events the old schema won't accept. No event is partially sent. No 'fallback to cloud' path exists.
Does camera_id_hash mean a compliance officer cannot trace an event back to a camera?
They can, but only on the property. The hash is an HMAC with a per-property key stored in /var/lib/cyrano/secrets/property.key that never leaves the unit. On the property, the ops tool cyrano-debug resolve-camera takes a hash and returns the label. Off the property, the hash is opaque. This is deliberate: the person debugging a specific alert is physically at the property and has shell on the box; the person at corporate reading the dashboard sees only the hash and the event class. Privacy boundaries match operational boundaries.
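The hash construction described above can be sketched in Python. The page does not name the HMAC algorithm, so SHA-256 is an assumption, and the key and camera label here are invented:

```python
import hmac, hashlib

def camera_id_hash(property_key: bytes, camera_label: str) -> str:
    # Per-property key: the page locates it at /var/lib/cyrano/secrets/property.key
    # and says it never leaves the unit.
    return hmac.new(property_key, camera_label.encode(), hashlib.sha256).hexdigest()

key = b"example-per-property-key"         # invented for illustration
print(camera_id_hash(key, "lobby-east"))  # opaque off-property, resolvable on it
```

Deterministic per property, so on-property tooling can resolve a hash back to a label by recomputing over the known camera list, while off-property readers see only an opaque digest.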
Does this egress contract cover what the HDMI capture loop sees?
The capture loop sees everything the DVR renders on its multiview monitor. That is intentional; that is how it runs inference. But the capture loop has no outbox write permission. The only code with outbox permission is the event formatter, which takes detector output and emits the 9-key envelope. Raw frames never leave /dev/shm/cyrano/frames, which is tmpfs and wiped on every reboot. A second on-unit process, cyrano-retention, truncates /dev/shm/cyrano/frames on a ring of roughly 45 seconds, so even if someone physically pulls the unit and powers it back on, there is no recent video on it to read.
Can the schema evolve? What does a real migration look like?
Yes. A migration bumps /etc/cyrano/egress.schema.json's version string, adds the new key under a migrations block, and ships in a firmware image. cyrano-egress-validate reads the version header at startup and reloads if it changes. Downstream cloud ingest is versioned on the same string; the ingest pipeline rejects events carrying a version it does not understand yet, which surfaces a coordinated-rollout failure inside Cyrano's own infra rather than at the customer. A field added without a schema bump is rejected on the unit before it ever reaches the network, so 'we forgot to update the schema' is an on-premise error, not a silent exfiltration.
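A version bump as described might produce a diff against the schema file like the sketch below. The migrations block layout and the zone_id_hash key are hypothetical; only the version-string and bump-to-ship mechanics come from this page:

```json
{
  "version": "1.1.0",
  "allowed_keys": [
    "device_id", "local_seq", "occurred_at_unix",
    "camera_id_hash", "event_class", "threat_tier",
    "bbox_norm", "confidence", "dwell_ms",
    "zone_id_hash"
  ],
  "migrations": {
    "1.1.0": { "added": ["zone_id_hash"] }
  }
}
```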
How does this reading of edge AI differ from the top results on the SERP?
The top results define edge AI by where inference runs. Latency is low because compute is close. Bandwidth is cheap because pixels do not upload. Privacy is better because data stays local. All of that is true and none of it is falsifiable from outside the vendor. A tenant's attorney cannot read 'where inference runs.' A tenant's attorney can read /etc/cyrano/egress.schema.json. Reframing edge AI as a contract, not a compute story, is the move that moves the category from trust-me to show-me. That is the unique gap this page fills: a specific on-device file, in a specific location, with a specific validator that enforces it.