Matthew Diakonov, Written with AI

Published April 29, 2026 · 11 min read

CCTV operations guide

Real time CCTV intrusion alerts only matter if they fire on the right frame.

Every product page on this topic talks about smart filtering and "alerts to your phone." None of them talk about the timing question, which is the part operators care about. The intrusion sequence has four stages: approach, dwell, breach, exit. Most systems alert at stage 3 or stage 4. By then the action window is already gone and the alert is just an index into the recording. The alert that prevents the loss is the one that fires at stage 2. This guide walks through the sequence, names the layer that gets you stage-2 alerts on cameras you already own, and shows what the payload should look like when it lands on a phone.

Direct answer, verified 2026-04-29

What is a real time CCTV intrusion alert? A notification fired when a system watching a live camera feed detects a frame that matches an intrusion pattern (perimeter crossing, restricted-zone entry, loitering past a configured dwell, or forced-entry posture) and pushes that frame to an operator within single-digit seconds of capture.

When does it fire? Pixel-based DVR motion fires at the breach frame or later. AI loitering and zone rules can fire at the dwell stage, 30 to 60 seconds before a breach, which is the only stage where action can prevent the breach instead of just recording it.

The four-stage intrusion sequence

On a property where someone is going to breach a perimeter and walk away with something, the sequence is almost always the same. Watch enough archived clips on enough properties and you stop seeing exceptions. Approach. Dwell. Breach. Exit. Each stage has a different timing budget and a different action window. The frame an alert fires on decides which stage you can act in.

Stages, timing, action window

Stage 1, Approach (T minus 60 to 90 seconds)

A person walks toward the property boundary. Body language is exploratory, head turning to check sightlines and cameras. They have not crossed any drawn zone yet. Pixel motion detection cannot see this stage as separate from any other person walking past. The action window is wide: a guard tone, an audio talkdown, a marked vehicle drive-by, even a porch light pulse can change what happens next.

At this stage, AI detection registers only a person track inside an outer-zone polygon. No alert fires by default, because most outer-zone tracks are tenants, residents, vendors, and pedestrians on the sidewalk. The signal becomes interesting only when combined with stage 2.

Stage 2, Dwell (T minus 30 to 60 seconds)

The person stops moving. They linger near a gate, a parking-lot entry, a stairwell, a mailroom door, a leasing-office window. This is the stage where intent is legible. A walking-through track does not stop. A delivery worker stops at a defined door and leaves. A person testing whether anyone is watching stops where there is no door to deliver to.

This is the frame an actionable intrusion alert fires on. Cyrano's loitering and trespassing rule treats a person dwelling in a zone past the configured threshold as the trigger. The alert lands on a phone with a thumbnail, while the event is still in progress. There is still an action window.

Stage 3, Breach (T zero)

The person crosses the actual perimeter. Climbs a fence, forces a gate, follows a tenant through a secured door, jumps the leasing-office counter, picks up an unattended package. This is what most DVR motion detection finally catches, because pixel change inside the drawn box only crosses threshold once a body is fully inside it. Alerts that fire here are useful for response, not prevention.

At this stage the threat tier escalates. A dwell event that did not get actioned now becomes a forced-entry posture or a tailgating event. The same on-device classifier flips it from LOW THREAT to HIGH THREAT and the channel changes from dashboard digest to SMS plus phone call.

Stage 4, Exit (T plus 5 to 120 seconds)

The person leaves. With property-side cameras and no live monitoring, this is when staff usually find out something happened, hours or days later, when a tenant complains or a piece of property is reported missing. The footage exists. Nobody saw it in time. This is the stage every property security review describes as the failure mode.

Real time intrusion alerts collapse the gap between this stage and the first time a human knows. The longer the gap, the more the recording functions as a forensic exhibit and the less it functions as a deterrent.

Why the dwell stage is the stage that matters

Operators who have lived through this start with a working theory: people who are about to commit a property crime almost always stop. They check whether anyone is watching, check whether the camera moved when they moved, check the gate hardware, count the cars in the lot. The behavior is legible from a still image. A tenant walking home does not stop near the package room. A delivery worker stops at the package room and leaves. A person standing near the package room for ninety seconds, hands free, head turning, is the alert worth firing.

Pixel motion detection cannot tell that story. The pixels change in roughly the same way for any of the three cases. Track-based detection on top of the live feed can: a track that enters the zone, dwells past a threshold, and has not exited becomes the event. Cyrano's loitering and trespassing rule is the one that fires here, and it fires at the dwell stage by design, not at the breach.
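A minimal Python sketch of the track-based dwell rule described above, under stated assumptions: an upstream person tracker (not shown) supplies a track ID and a foot point in image coordinates on each frame, and the zone is a simple polygon. The names DwellRule and point_in_polygon and the 30-second threshold are illustrative, not Cyrano's implementation.

```python
from dataclasses import dataclass, field

Point = tuple[float, float]


def point_in_polygon(p: Point, polygon: list[Point]) -> bool:
    """Ray-casting test: True if p lies inside the polygon."""
    x, y = p
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through p can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


@dataclass
class DwellRule:
    zone: list[Point]      # zone polygon in image coordinates
    threshold_s: float     # dwell threshold in seconds (property-specific)
    _entered_at: dict[int, float] = field(default_factory=dict)
    _alerted: set[int] = field(default_factory=set)

    def update(self, track_id: int, foot_point: Point, t: float) -> bool:
        """Feed one observation; return True the first time a track
        has stayed inside the zone for at least threshold_s seconds."""
        if not point_in_polygon(foot_point, self.zone):
            # The track left the zone: forget its entry time so a later
            # re-entry starts a fresh dwell timer.
            self._entered_at.pop(track_id, None)
            self._alerted.discard(track_id)
            return False
        entered = self._entered_at.setdefault(track_id, t)
        if t - entered >= self.threshold_s and track_id not in self._alerted:
            self._alerted.add(track_id)
            return True
        return False


# Usage: a hypothetical square zone around a package-room door.
rule = DwellRule(zone=[(0, 0), (10, 0), (10, 10), (0, 10)], threshold_s=30)
for t in (0.0, 15.0, 31.0):
    if rule.update(track_id=1, foot_point=(5, 5), t=t):
        print(f"dwell alert for track 1 at t={t}s")
```

A walk-through track never accumulates enough time inside the polygon, so it never fires. That is the distinction pixel-difference motion cannot make.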

“At one Class C multifamily property in Fort Worth, Cyrano caught 20 incidents including a break-in attempt in the first month. Customer renewed after 30 days.”

Fort Worth, TX deployment, first 30 days

What gets you from a live feed to a stage-2 alert on a phone

The thing missing from a typical property is not the cameras. The cameras already produce a usable signal. The recorder already composites every feed onto an HDMI output for the wall display. What is missing is a layer that reads that HDMI output, runs inference per tile, and sends a structured alert out. That layer is an edge box. It plugs in physically, reads what the recorder already shows, and pushes alerts on the network the property already has.

From DVR to phone, on-prem

The hop that decides whether the alert is real time or forensic is the second one, edge box to inference. When that hop is on the same LAN segment as the recorder, it costs milliseconds. When the inference cluster is in the cloud, the same hop spends 8 to 20 seconds on a typical multifamily uplink. Slower silicon close to the recorder beats faster silicon at the end of an uplink for an alert workload, because alerts are tail-latency sensitive and the uplink is what owns the tail.

DVR motion alerts vs real time intrusion alerts

Every DVR or NVR ships with motion alerts. Operators turn them on once, get hammered with raccoons and rain, and turn them back off inside a week. The alerts that come out of an AI layer reading the same feed look completely different.

Same property, same cameras, two alert layers

On a 16-camera property, the DVR's built-in motion alerts produce 200 to 800 events per day. Most of them are environmental: wind, headlights, IR spider webs, rain on the lens, the dog walking back from the dumpster. The alerts land in the recorder's email field that nobody opens. After two weeks the office buzzer gets disabled because it interrupts every leasing-office conversation. The system is technically sending alerts. Nobody is reading them.

Pixel-difference math, no concept of a person track
Triggers on raccoons, headlights, swaying flags
Lands in an email field nobody reads
Operators mute the channel inside two weeks

What an actionable alert looks like on a phone

Most operators decide whether an alert is worth a call in two seconds. The thumbnail does most of the work. If the thumbnail is the entire grid view, or it is a 320x240 crop of a smudged corner, the alert gets scrolled past. If the thumbnail is a recognizable scene, the camera is named in property language, and the event class is one word, the alert gets actioned.

What a usable intrusion alert payload includes

Has a thumbnail cropped from the triggering frame, not just text
Lands in a thread the operator already opens daily, not a portal that requires a new login
Names the camera with the property's own label ('east stairwell'), not a channel number
Shows the event class (loitering, restricted-zone entry, tailgating), so triage takes two seconds
Splits LOW threat into a digest and HIGH threat into a phone call, so the channel does not get muted in week two
One thread per property, so a regional manager with 12 buildings can mute or escalate independently
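The checklist above can be sketched as a small data structure. This is an illustration only: the field names, the example values, and the JSON shape are assumptions made for the sketch, not a documented Cyrano payload format. The captured_at timestamp is an added assumption that the checklist does not list.

```python
import json
from dataclasses import asdict, dataclass

VALID_TIERS = {"LOW", "HIGH"}


@dataclass(frozen=True)
class IntrusionAlert:
    property_thread: str   # one thread per property
    camera_label: str      # the property's own label, not a channel number
    event_class: str       # e.g. "loitering", "restricted-zone entry", "tailgating"
    threat_tier: str       # "LOW" goes to a digest, "HIGH" rings a phone
    thumbnail_path: str    # crop of the triggering frame
    captured_at: str       # ISO 8601 timestamp (assumed field)

    def __post_init__(self) -> None:
        if self.threat_tier not in VALID_TIERS:
            raise ValueError(f"unknown threat tier: {self.threat_tier!r}")

    def headline(self) -> str:
        """One-line summary an operator can triage in two seconds."""
        return f"[{self.threat_tier}] {self.event_class} at {self.camera_label}"

    def to_json(self) -> str:
        return json.dumps(asdict(self))


# Usage with made-up example values.
alert = IntrusionAlert(
    property_thread="example-property",
    camera_label="east stairwell",
    event_class="loitering",
    threat_tier="LOW",
    thumbnail_path="thumbs/example.jpg",
    captured_at="2026-04-29T02:14:07+00:00",
)
print(alert.headline())
```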

DVR motion vs real time intrusion alerts

Feature	Built-in DVR motion	Edge AI on existing DVR
What triggers the alert	Pixel change inside a drawn rectangle on the DVR's UI	Person track entering a zone, dwelling past threshold, or crossing a perimeter polygon
When it fires in the intrusion sequence	Stage 3 (breach) at earliest, often after	Stage 2 (dwell) by default, escalates at stage 3
False positive shape	Wind, headlights, IR spider webs, raccoons, rain	Vendor at unusual hour, tenant entry during typical return window (handled with property context)
Action window for prevention	Zero, the breach already happened	30 to 60 seconds before breach
Channel the alert lands on	The recorder's email field nobody reads, plus a buzzer in the office	Phone notification with thumbnail, classified by threat tier
Cost to deploy	Free, already shipped with the DVR	$450 one-time hardware, $200/month software starting month 2

The honest tradeoffs

A few things worth saying out loud, since most pages on this topic do not.

A real time intrusion alert is not a guard contract. The alert lands on a phone. Someone has to be carrying the phone. For a regional manager with 12 buildings, the channel works because LOW threat events dump into a digest and only the HIGH threat events ring. For a property where nobody is on call at 2 AM, the alert is a fast index into footage rather than a prevention layer. It still beats finding out in the morning.
Property context beats fancier models. The hardest false positives are humans doing legitimate things at unusual hours: vendors, tenants returning home, delivery drivers. No model can reason these away on its own. They are handled with property-specific rules: a known-vehicle list, a vendor schedule, a tenant return-window suppression. Without those, the alerts are noisy. With them, the alert volume is low enough to read every one.
Camera replacement is the wrong axis. Replacing 16 to 25 working cameras to get AI features costs $50,000 to $100,000 per property and rebuilds infrastructure that is not broken. Reading the recorder's HDMI output and running detection on the existing feed is a different shape of project: a $450 device that installs in under 30 minutes and works with whatever cameras the recorder is already pulling from.
Cloud inference is fine for forensic search, wrong for alerts. Search-by-prompt across a month of footage is a workload that tolerates a few-second round trip. A live alert workload does not. The split most working installs land on is on-prem inference for the alert path, cloud for the search path.
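For the camera-replacement tradeoff, the arithmetic can be made explicit. This sketch uses only the figures quoted in this guide ($450 one-time hardware, $200/month software starting month 2, $50,000 to $100,000 for replacement) and assumes, for simplicity, that camera replacement carries no recurring cost. That assumption favors replacement.

```python
def edge_box_cost(months: int, hardware: int = 450, monthly: int = 200,
                  first_billed_month: int = 2) -> int:
    """Cumulative cost in dollars after `months` months, with software
    billed from `first_billed_month` onward."""
    billed_months = max(0, months - (first_billed_month - 1))
    return hardware + monthly * billed_months


def months_to_reach(target: int) -> int:
    """First month in which the cumulative edge-box cost reaches `target`."""
    months = 0
    while edge_box_cost(months) < target:
        months += 1
    return months


print(edge_box_cost(12))        # 2650: first-year cost
print(edge_box_cost(60))        # 12250: five-year cost
print(months_to_reach(50_000))  # 249: months to reach the low-end replacement quote
```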

“Most property teams only find out about trespassing or theft hours later when someone complains. The point of a real time alert is to flip that. By stage 2 of the sequence, you still have time to do something.”

Operator note

Class C multifamily, 180 units

How to evaluate a real time intrusion alert system in one site visit

If you are looking at a vendor and want to know whether their "real time intrusion alerts" are stage-2 alerts or stage-3 alerts, three questions usually settle it:

Ask the vendor to walk you through what triggers the alert at the rule level. If the answer is "motion in a zone," that is pixel motion with a polygon on top, and it fires at stage 3. If the answer involves a person track, a dwell threshold, or a behavior class (loitering, tailgating, posture at a door), that fires at stage 2.
Ask where inference runs. If it is in the cloud, ask for the tail latency on the building's uplink. The variance lives there, not in the inference itself. If it is on-prem, you are getting single-digit-second alerts. If it is on a camera-by-camera AI SoC, you are getting alerts but you have lost the ability to reason across the whole grid (cross-camera tracks, tailgating, cross-zone correlations).
Ask to see a real alert from a working customer. If the answer is a screenshot of a dashboard with a list of events, the alert is meant to be read in the dashboard. If the answer is a phone notification with a thumbnail, the alert is meant to be read on a phone. Operators read phone notifications. They do not log into dashboards on a Saturday at 2 AM.

FAQ on CCTV real time intrusion alerts

What counts as a CCTV real time intrusion alert?

A real time CCTV intrusion alert is a notification triggered when a system watching the live feed flags a frame that matches an intrusion pattern (perimeter cross, restricted-zone entry, loitering past a configured dwell, or forced-entry posture at a door) and pushes that frame to an operator within seconds of capture. The cutoff between real time and forensic clip is the action window. Anything that lands while the operator can still call a guard, dispatch on-site staff, or trigger an audio talkdown is real time. Anything that lands after the person has already left is recorded footage with a faster index.

When in the intrusion sequence does the alert actually fire?

It depends on which detection layer is wired up. Pixel-based DVR motion detection fires when enough pixels change inside a drawn box, which on most properties is the breach frame at earliest, often a frame or two later. AI restricted-zone rules fire when a tracked person crosses the zone polygon. AI loitering rules fire when a person dwells in a zone past a configured threshold (15 to 60 seconds is typical), so they fire during the dwell stage, before the breach. Forced-entry posture detection fires when a body posture matching tampering at a door or gate is sustained for a few seconds. The timing of the alert decides what action is still available. A breach-frame alert can summon a response. A dwell-frame alert can summon a response that arrives before the breach.

Is a real time intrusion alert different from a motion alert from my DVR?

Yes, even when the DVR labels its motion alerts as real time. DVR motion is pixel-difference math against a drawn rectangle. It fires for raccoons, headlights, rain on the lens, sunset gradients, and the apartment dog walking back from the dumpster. It also has no concept of a track, so a person lingering in the parking lot at 2 AM is treated identically to a delivery driver dropping off at 11 AM. A real time intrusion alert from an AI layer ties detections to a person track, classifies the behavior (walking through, dwelling, posturing at a door), filters out non-human movement, and only alerts when the behavior is on the property's actual rule list.

Do I need to replace my cameras to get real time intrusion alerts?

No. The cameras already produce a usable signal. The piece you do not have is something between the recorder and the operator's phone that watches the live feed, runs detection on each tile, and pushes a structured alert out. Replacing cameras solves a hardware problem you do not actually have and ignores the recorder, the cabling, the network, and the existing footage retention policy. An edge box that plugs into the recorder's HDMI output reads exactly what an operator on the wall display sees, runs inference on it locally, and sends alerts on the network you already use. No coax, no new conduits, no decommissioning a recorder that still works.

How do you avoid alert fatigue when the system is running 24/7 on 16 to 25 cameras?

Two ways. First, threshold the dwell window. A person walking through a parking lot in 8 seconds is not loitering. Set the dwell to a property-appropriate value (typically 30 to 60 seconds for a parking area, 15 seconds for a stairwell or package room) and the alert volume drops by an order of magnitude. Second, classify threat tier on-device before notifying. A person at the leasing office at 11 AM is a different alert than the same posture at the leasing office at 2 AM. Cyrano routes LOW THREAT events into a dashboard digest that gets reviewed once or twice a day. HIGH THREAT events fire to SMS plus a phone call to the on-call manager. The operator never sees a feed of raw detections, only the events that earned the page.
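The two controls above, per-zone dwell thresholds and tiered routing, can be sketched as follows. The threshold values come from the ranges quoted in this answer. The after-hours window, the zone and event-class names, and the rule that any after-hours event is HIGH are assumptions made for illustration, since the actual on-device classifier is not described here.

```python
from datetime import time

# Dwell thresholds in seconds, picked from the ranges quoted above.
DWELL_THRESHOLD_S = {"parking": 30, "stairwell": 15, "package_room": 15}

# Assumed for the sketch: event classes that are always HIGH,
# and an after-hours window from 22:00 to 06:00.
HIGH_CLASSES = {"forced_entry", "tailgating"}
AFTER_HOURS_START = time(22, 0)
AFTER_HOURS_END = time(6, 0)


def is_loitering(zone_type: str, dwell_s: float) -> bool:
    """A track counts as loitering once it dwells past the zone's threshold."""
    return dwell_s >= DWELL_THRESHOLD_S[zone_type]


def threat_tier(event_class: str, at: time) -> str:
    """Illustrative tiering: escalate by event class or by time of day."""
    after_hours = at >= AFTER_HOURS_START or at < AFTER_HOURS_END
    return "HIGH" if event_class in HIGH_CLASSES or after_hours else "LOW"


def channels(tier: str) -> list[str]:
    """LOW goes to the dashboard digest; HIGH goes to SMS plus a phone call."""
    return ["sms", "phone_call"] if tier == "HIGH" else ["dashboard_digest"]


# Usage: the same loitering event at 11 AM and at 2 AM.
for at in (time(11, 0), time(2, 0)):
    tier = threat_tier("loitering", at)
    print(at, tier, channels(tier))
```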

What is the actual end-to-end latency from intrusion happening to a notification on a phone?

On a working install where inference is on-prem and the alert channel is SMS or a chat thread the team already uses, the perceived latency is single-digit seconds. The hops are: HDMI capture from the recorder (sub-second), per-tile inference on the edge accelerator (typically 100 to 400 ms per frame), event de-duplication and threat classification (sub-second), thumbnail crop (sub-second), push to the messaging or SMS provider (1 to 3 seconds), provider fan-out to the device (1 to 3 seconds). The variance lives in the messaging hop, not the inference hop. Cloud inference adds an uplink round trip on top, which on a typical multifamily DSL or business cable line is the unstable part.
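Summing the hops listed above gives a rough latency budget. The sketch below treats each "sub-second" hop as 0 to 1,000 ms, which is an assumption made to get concrete bounds. The other ranges are the ones quoted in this answer.

```python
# (hop name, best case ms, worst case ms)
HOPS = [
    ("HDMI capture", 0, 1000),              # "sub-second", assumed 0 to 1000 ms
    ("per-tile inference", 100, 400),
    ("de-dup and threat classification", 0, 1000),
    ("thumbnail crop", 0, 1000),
    ("push to messaging/SMS provider", 1000, 3000),
    ("provider fan-out to device", 1000, 3000),
]


def budget_ms(hops: list[tuple[str, int, int]]) -> tuple[int, int]:
    """Return (best case, worst case) end-to-end latency in milliseconds."""
    best = sum(lo for _, lo, _ in hops)
    worst = sum(hi for _, _, hi in hops)
    return best, worst


best, worst = budget_ms(HOPS)
print(f"end-to-end: {best / 1000:.1f} s to {worst / 1000:.1f} s")
# end-to-end: 2.1 s to 9.4 s

# Share of the worst case owned by the two messaging hops.
messaging_worst = sum(hi for name, _, hi in HOPS if "provider" in name)
print(f"messaging share of worst case: {messaging_worst / worst:.0%}")
```

Under these assumptions the total stays in single-digit seconds, and the two messaging hops account for most of the worst case, which matches the point that the variance lives in the messaging hop rather than the inference hop.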

What kinds of false positives are common, and which ones are tractable?

The tractable false positives are environmental: insects on the lens at night, IR spider webs, headlights sweeping a wall, a flag in the wind. Track-based detection ignores these because they do not produce a person track. Animal motion is also tractable. The harder ones are humans doing legitimate things: a maintenance vendor at 3 AM (real, infrequent), a tenant returning home through the garage at 1 AM (real, frequent), a delivery driver in the package room. The system cannot reason these away on its own. They are handled with property-specific rules: a vendor schedule, a known-vehicle list, a tenant entry pattern that suppresses alerts during typical resident return windows. Without the property context, those events generate alerts and an operator marks them dismissed. With the context, they never generate the alert in the first place.
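A minimal sketch of the property-context rules described above, applied before an alert is generated. The class names, field names, zone labels, plate, and time windows are all hypothetical, and a real deployment would encode its own schedules.

```python
from dataclasses import dataclass, field
from datetime import time
from typing import Optional


@dataclass(frozen=True)
class Window:
    start: time
    end: time

    def contains(self, t: time) -> bool:
        if self.start <= self.end:
            return self.start <= t < self.end
        # Window wraps past midnight, e.g. 22:00 to 02:00.
        return t >= self.start or t < self.end


@dataclass
class PropertyContext:
    vendor_schedule: dict[str, Window] = field(default_factory=dict)  # zone -> visit window
    known_plates: set[str] = field(default_factory=set)               # known-vehicle list
    tenant_return: dict[str, Window] = field(default_factory=dict)    # zone -> return window

    def suppress_reason(self, zone: str, at: time,
                        plate: Optional[str] = None) -> Optional[str]:
        """Return why an event should not alert, or None if it should."""
        if plate is not None and plate in self.known_plates:
            return "known vehicle"
        vendor = self.vendor_schedule.get(zone)
        if vendor is not None and vendor.contains(at):
            return "scheduled vendor"
        tenants = self.tenant_return.get(zone)
        if tenants is not None and tenants.contains(at):
            return "tenant return window"
        return None


# Usage with made-up rules for one property.
ctx = PropertyContext(
    vendor_schedule={"boiler room": Window(time(3, 0), time(4, 0))},
    known_plates={"ABC1234"},
    tenant_return={"garage": Window(time(22, 0), time(2, 0))},
)
print(ctx.suppress_reason("garage", time(1, 0)))  # tenant return window
print(ctx.suppress_reason("garage", time(3, 0)))  # None, so this one alerts
```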

What does the alert itself look like when it lands on a phone?

A thumbnail cropped from the frame that triggered the rule, the camera label (a human-readable name like 'pool gate' or 'east garage entrance', not 'CH04'), the event class (loitering, restricted-zone entry, tailgating, package-room dwell), a timestamp, a one-line description, and a link to the live feed. It lands in one chat thread per property, so a regional manager covering twelve buildings gets twelve separate threads they can mute, escalate, or hand to an on-call manager individually. The thumbnail matters more than the description. An operator decides in two seconds whether the frame is something to call about or scroll past, and a description without an image gets ignored.

Will this work with the cameras I already have, or do I need IP cameras with on-board AI?

It works with whatever the recorder produces, which on most properties is a mix of analog HD over coax (HD-TVI, HD-CVI, AHD) and IP cameras pulled into a hybrid NVR. Camera-side AI is a separate axis and not strictly necessary. The recorder already composites every feed onto an HDMI output for the wall display. An edge box reads that HDMI output, splits it into per-tile streams, and runs detection per tile. No camera replacement, no firmware updates on cameras that may be six to ten years old, no wrestling with vendor lock on ONVIF or RTSP. The detection quality depends on resolution per tile after the multi-view crop, not on whether the camera is analog or IP.
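To make the per-tile resolution point concrete, here is a sketch that computes the crop rectangles for a composite frame. It assumes a 1920x1080 HDMI output and a uniform grid with no borders or overlays. Real recorder layouts vary, and the function name is illustrative.

```python
def tile_rects(frame_w: int, frame_h: int, cols: int, rows: int) -> list[tuple[int, int, int, int]]:
    """Split a composite frame into a uniform grid.

    Returns (x, y, width, height) per tile in row-major order. Integer
    edges are computed per tile so the tiles cover the frame exactly
    even when the frame size is not divisible by the grid size.
    """
    rects = []
    for r in range(rows):
        y0 = r * frame_h // rows
        y1 = (r + 1) * frame_h // rows
        for c in range(cols):
            x0 = c * frame_w // cols
            x1 = (c + 1) * frame_w // cols
            rects.append((x0, y0, x1 - x0, y1 - y0))
    return rects


# 16 cameras in a 4x4 grid: each tile is 480x270 pixels.
print(tile_rects(1920, 1080, 4, 4)[0])   # (0, 0, 480, 270)
# 25 cameras in a 5x5 grid: each tile is 384x216 pixels.
print(tile_rects(1920, 1080, 5, 5)[0])   # (0, 0, 384, 216)
```

Under these assumptions a 25-camera grid leaves each tile with 64% of the pixels of a 16-camera tile, which is why the grid layout matters more to detection quality than the camera type.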