Matthew Diakonov, Written with AI

Published April 19, 2026 · 12 min read

Theft Detection Lock Guide

A theft detection lock for buildings is three signals, one polygon, one schedule.

The phrase comes from Android. The feature on your phone fuses three signals (snatch motion, getaway movement, network disconnect) and locks the screen. The same idea moves cleanly to property cameras, but the signals change: zone, dwell, and time. The capture point changes too. Instead of an accelerometer inside a phone, the feed is the DVR's HDMI multiview output, the same signal that drives the guard monitor. The “lock” becomes a polygon armed on a schedule, with a thumbnail, a zone label, a dwell counter, and a timestamp packaged into a WhatsApp alert inside the pre-action window.

See a zone lock fire on a live DVR

4.9 from 50+ properties

Three-signal fusion: zone + dwell + time

Capture point is the DVR's HDMI multiview, not the IP stream

25 tiles per unit, one WhatsApp thread per property

Tile to alert delivered under 60 seconds

Theft detection lock, translated

Android's three-signal phone feature, applied to building cameras.

Android lock: snatch + getaway + disconnect → screen locks

Property lock: zone + dwell + time → polygon fires

Capture point: DVR HDMI multiview, not accelerometer

Event latency: tile to WhatsApp under 60 seconds

Pre-action window: 30 to 180 seconds, still open


Why the phrase is everywhere right now

Search the phrase today and the first ten organic results are the same story: Google shipped an Android feature called Theft Detection Lock, it uses AI-assisted sensor fusion to notice when a phone is snatched and carried away, and it locks the phone screen before a thief can open anything. TechRadar covered it, Android Authority tested it, Pop Sci wrote the setup guide. Every one of those pages is correct. None of them help a property manager, a construction superintendent, or a commercial operator.

Buildings have the same underlying problem the phone feature solves: something of value sits in a place, a person arrives, something happens that should not, and the asset leaves. The window between arrival and departure is short. The window where intervention still works is shorter. The cost of a missed event is not a PIN code, it is a copper line set, a pallet, a catalytic converter, or a box of residents' mail. The question is whether the three-signal fusion pattern that works on a phone has an analogue on a camera system. It does, and the three signals are zone, dwell, and time.

The same shape, two different asset classes

Map the Android feature onto property cameras one row at a time. The mechanism is structurally the same (three signals, one lock). Every row below changes because the asset changes.

Android phone lock vs. property camera lock

The fusion pattern is identical. The signals, the capture point, and the 'lock' are not.

Feature	Android Theft Detection Lock	Our system zone lock
Asset protected	The phone itself	Open and shared property zones (mailroom, pads, docks, lots)
Signal 1	Accelerometer detects a snatch motion	Person enters a drawn polygon on a tile
Signal 2	Movement away from original location	Dwell timer crosses the per-zone threshold
Signal 3	Network disconnect and unlock behaviour	Current time falls inside the armed schedule
Capture point	On-device sensors in the phone	DVR HDMI multiview, downstream of the cameras
What the 'lock' does	Locks the phone screen	Packages an event: tile thumbnail, zone, dwell, time
Delivery	Local screen lock, sync to Find My Device	WhatsApp thread per property, under 60 s
False-positive filter	ML model trained on snatch motion signatures	Three filters have to agree: zone, dwell, time
Intervention window	Sub-second, pre-data-access	30 to 180 seconds, pre-act

How the three-signal fusion plays out on a camera tile

A zone alone fires on anyone walking through. A dwell alone fires on a delivery driver standing still. A time window alone fires on every shift change. Any one of them is noise. All three together, agreeing at the same moment on the same person on the same tile, is the signal.
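The fusion described above can be sketched in a few lines of Python. This is a minimal illustration, not the product's implementation: the `ZoneLock` class, its field names, and the single-person assumption are all hypothetical. It uses a standard ray-casting test for the polygon and resets the dwell timer whenever the zone or the schedule stops agreeing.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, float]


def point_in_polygon(p: Point, polygon: List[Point]) -> bool:
    """Ray-casting test: True if p lies inside the polygon."""
    x, y = p
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through p can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


@dataclass
class ZoneLock:
    """Hypothetical single-person tracker for one zone on one tile."""
    polygon: List[Point]
    dwell_threshold_s: float
    entered_at: Optional[float] = None
    fired: bool = False

    def update(self, t: float, person_xy: Optional[Point], armed: bool) -> bool:
        """Feed one observation; return True only on the update where the lock fires."""
        in_zone = person_xy is not None and point_in_polygon(person_xy, self.polygon)
        if not (in_zone and armed):
            # Zone or schedule no longer agrees: reset the dwell timer.
            self.entered_at = None
            self.fired = False
            return False
        if self.entered_at is None:
            self.entered_at = t
        if not self.fired and t - self.entered_at >= self.dwell_threshold_s:
            self.fired = True
            return True
        return False
```

A courier who crosses the polygon in 8 seconds never fires it, and neither does a 20-second dwell while the zone is disarmed. Only zone, dwell, and schedule agreeing on the same update returns `True`.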

One alert, four frames


Frame 1. Zone entry

A person crosses the polygon boundary on tile 4 of a 4x4 DVR mosaic. The polygon is labelled 'mail_alcove_after_1900'. Person detection fires. The event is not an alert yet. Zone entry alone is signal 1.

The capture point: tap the HDMI, not the cameras

Most “AI theft detection” vendors require IP cameras with their own SDK, or a cloud ingest path where every camera pushes its stream to a remote service. That assumes a modernised camera stack. Most properties do not have that. They have a DVR or NVR from 2015, 16 to 32 cameras wired in, and a guard monitor showing the multiview grid. The multiview signal is where the capture happens.

One HDMI tap gives inference access to every tile at once

Because the capture point is downstream of the camera stream, the camera vendor does not matter. Hikvision, Dahua, Lorex, Amcrest, Reolink, Uniview, Swann, Night Owl, Bosch DIVAR, Panasonic WJ-NX, and every rebrand that outputs to a DVR with an HDMI port are all compatible. No ONVIF negotiation, no per-camera credentials, no firmware coordination. The physical install on a running DVR is under 2 minutes.
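To make "every tile at once" concrete, here is a rough sketch of how one multiview frame could be split into per-tile rectangles. It assumes an evenly divided grid with no borders and a 1920x1080 HDMI frame; real DVR layouts may differ, and the `tile_rects` helper is a hypothetical name, not part of any product API.

```python
from typing import List, Tuple

Rect = Tuple[int, int, int, int]  # (left, top, width, height) in pixels


def tile_rects(frame_w: int, frame_h: int, rows: int, cols: int) -> List[Rect]:
    """Split one multiview frame into rows * cols tile rectangles, row-major."""
    rects = []
    for r in range(rows):
        top = r * frame_h // rows
        bottom = (r + 1) * frame_h // rows
        for c in range(cols):
            left = c * frame_w // cols
            right = (c + 1) * frame_w // cols
            rects.append((left, top, right - left, bottom - top))
    return rects


grid_4x4 = tile_rects(1920, 1080, 4, 4)
print(len(grid_4x4), grid_4x4[3])   # 16 (1440, 0, 480, 270)
```

Tile 4 of a 4x4 mosaic is index 3 in this list. A 5x5 grid on the same frame yields 25 tiles of 384x216 pixels each.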

What an actual lock event looks like

The payload is deliberately small. A thumbnail, a camera label, a zone name, a dwell count, a timestamp, a layout id, a latency number. A property manager reads it in two seconds walking between units. Below is an anonymised payload drawn from a real deployment.

Event payload · anonymised from a live deployment
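As a rough sketch of how those seven fields might be carried, the record below uses hypothetical field names. It is not the deployment's actual schema. The camera label, timestamp, and thumbnail reference are placeholders; the zone label, layout id, dwell, and latency echo figures quoted elsewhere in this guide.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class LockEvent:
    """Illustrative event record; field names are assumptions."""
    thumbnail_ref: str
    camera_label: str
    zone: str
    dwell_s: int
    timestamp: str
    layout_id: str
    latency_ms: int


event = LockEvent(
    thumbnail_ref="tile-04.jpg",        # placeholder
    camera_label="CAM 04",              # placeholder
    zone="mail_alcove_after_1900",      # zone label used in this guide
    dwell_s=15,                         # dwell threshold used in this guide
    timestamp="2026-01-01T19:42:10",    # placeholder
    layout_id="4x4-std",                # layout id used in this guide
    latency_ms=5120,                    # sample latency quoted in this guide
)

print(json.dumps(asdict(event), indent=2))
```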

What the zone rule actually looks like in config

Three fields describe a zone: the polygon on a specific tile, the dwell threshold, and the armed schedule. No model retraining, no per-property tuning job. If the polygon is wrong, move the vertices. If the zone fires too often, raise the dwell. If the schedule overlaps a delivery window, narrow the schedule.

zone-config.yaml
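A minimal sketch of those three fields, written as a Python dataclass rather than YAML. The field names, the polygon coordinates, and the 07:00 disarm time are assumptions for illustration. Only the zone label, tile number, 15 s dwell, and 19:00 start echo the mailroom example in this guide.

```python
from dataclasses import dataclass
from datetime import time
from typing import List, Tuple


@dataclass(frozen=True)
class ZoneRule:
    """Hypothetical zone rule: polygon on a tile, dwell threshold, armed schedule."""
    name: str
    tile: int
    polygon: List[Tuple[float, float]]  # normalised (x, y) within the tile
    dwell_threshold_s: int
    armed_from: time
    armed_until: time

    def is_armed(self, now: time) -> bool:
        if self.armed_from == self.armed_until:
            return True  # equal bounds are treated here as armed 24/7
        if self.armed_from < self.armed_until:
            return self.armed_from <= now < self.armed_until
        # The window wraps past midnight, e.g. 19:00 to 07:00.
        return now >= self.armed_from or now < self.armed_until


mailroom = ZoneRule(
    name="mail_alcove_after_1900",
    tile=4,
    polygon=[(0.2, 0.3), (0.8, 0.3), (0.8, 0.9), (0.2, 0.9)],  # made-up vertices
    dwell_threshold_s=15,
    armed_from=time(19, 0),
    armed_until=time(7, 0),  # assumed end of the armed window
)

print(mailroom.is_armed(time(23, 0)), mailroom.is_armed(time(12, 0)))  # True False
```

The wrap-around branch matters because an after-hours schedule almost always crosses midnight. A naive `start <= now < end` check would leave such a zone disarmed all night.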

The pre-action window, second by second

Watch footage of any non-retail theft. Time the segment between the actor stopping at the target and the actual act. It is almost always 30 to 180 seconds. That is the window where a talkdown, a two-way speaker ping, or a dispatched responder still changes the outcome. A lock that fires inside this window is prevention. A lock that fires after it is documentation.

Arrival to act, one zone lock fire in the middle

Second 0 to 5. Zone entry

Actor crosses the polygon boundary. Person detection fires on the tile. Dwell timer starts. No alert yet. Signal 1 of 3 is satisfied.

Second 5 to 20. Dwell stabilises

Subject stays in the polygon. At 15 s the dwell threshold clears. Residents and couriers passing through this zone take 6 to 10 s and never trip it. Signal 2 of 3 is satisfied.

Second 20 to 45. Lock fires, alert delivered

Time window check agrees. Payload assembled, thumbnail cropped, overlay masks applied, message shipped to WhatsApp. End-to-end latency is a few seconds (about 5,120 ms in one sample event record). Signal 3 of 3 agrees; lock fires.

Second 45 to 180. Intervention still works

Actor has not yet cut, pried, or lifted. Because the alert is verified by three agreeing filters, dispatch treats it as priority, not a generic motion alarm. Talkdown, speaker ping, or responder changes the outcome.

After second 180. The act has probably happened

If the lock fires here the function has collapsed into documentation. Footage is good for insurance and police, but recovery rates on copper, packages, and cargo once the actor leaves the property are low.
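The timeline above reduces to simple arithmetic. The sketch below assumes the alert lands at roughly the dwell threshold plus the delivery latency after zone entry, ignoring any delay in the person detection itself. The function names are hypothetical. Times are kept in integer milliseconds to avoid floating-point rounding.

```python
def alert_arrival_ms(dwell_threshold_s: int, delivery_latency_ms: int) -> int:
    """Milliseconds from zone entry until the alert lands (detection delay ignored)."""
    return dwell_threshold_s * 1000 + delivery_latency_ms


def classify_alert(arrival_ms: int, window_end_s: int = 180) -> str:
    """'intervention' if the alert lands inside the pre-action window, else 'documentation'."""
    if arrival_ms < 0:
        raise ValueError("arrival_ms must be non-negative")
    return "intervention" if arrival_ms <= window_end_s * 1000 else "documentation"


arrival = alert_arrival_ms(dwell_threshold_s=15, delivery_latency_ms=5120)
print(arrival, classify_alert(arrival))   # 20120 intervention
print(classify_alert(4 * 60 * 1000))      # documentation
```

With a 15 s dwell and the 5,120 ms sample latency, the alert lands about 20 seconds after zone entry. That leaves most of a 180-second window available for a talkdown or a dispatch.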

The operating constants

No invented benchmarks, no marketing percentages. These are the numbers the detection loop actually runs on.

25 · Tiles inferred per unit off one DVR HDMI

< 60 s · Tile to WhatsApp, end-to-end

180 s · Upper edge of pre-action window

< 2 min · Physical install on a running DVR

50:1

“A 16-camera apartment property generates over 200 raw person detections per day from residents, contractors, deliveries, and staff. The same property generates 3 to 8 delivered alerts per day after zone, dwell, and time fusion runs. That ratio is what keeps the alert thread in service.”

Our system deployment data, Class C multifamily

Eight places a zone lock is a better fit than a door lock

A bolted lock covers an opening. A zone lock covers an area with no opening, or an opening that has to stay open. Every entry below has a pre-action pause, a drawable polygon, and a schedule.

Mailroom and lobby package alcoves

Pre-action pause is the outsider scanning for cameras. Zone around the package shelves, 15 s dwell, armed outside the delivery window.

Transformer pads and conduit runs

Typically 60 to 180 s pass between a vehicle stopping and the bolt cutters coming out. Lock fires on zone entry, not on the cut.

HVAC condensers and line sets

Condenser cage is a tight polygon with an explicit time window. Any zone entry outside service hours is an event.

Loading docks and trailer yards

Dock aprons are defined polygons with shift-based schedules. A person or a second vehicle entering outside shift fires the lock.

Parking lots and catalytic converter theft

A person lying down between vehicles is a zone + dwell signal. Pose models miss this entirely because the pose is hidden under the car.

Jobsite conex, spool racks, tool cribs

Staging zones are tight polygons, armed outside work hours with a short dwell threshold.

Trailer king-pin and landing gear

Tow-away attempts involve 30 to 90 s of dwell near one or the other. Both are in-frame on dock cameras.

Bike rooms and short-term storage cages

Open rooms with no door lock possible during business hours. Zone entry + dwell + closed window fires the lock after hours.

Install order for a property that already has a DVR

Four steps. All of them happen on hardware the property already owns. No camera replacements, no firmware coordination, no network rebuild.

1. Cable the HDMI tap
HDMI in from the DVR, HDMI out to the guard monitor, network, power. Under 2 minutes on a running recorder.
2. Register the layout id
4x4-std, 5x5-std, or a custom mosaic. Installs the masks for clock, camera name strip, and channel bug once per layout.
3. Draw the zones
Polygon per tile, dwell threshold per zone, schedule per zone. Mailroom, pad, cage, dock, bike room, conex.
4. Connect the WhatsApp thread
Link the property's existing ops thread. First alert usually lands the same day, inside the pre-action window.

Where Android-style motion fusion does not transfer

The three-signal fusion pattern transfers. The specific signals do not. Anything phone-shaped that assumes a hand, a pocket, or a walking gait breaks the moment you move to an outdoor environment where the actor arrives in a vehicle and stays stationary. These are the failure modes worth naming directly.

Phone-style signals that will not do the job on cameras

Accelerometer and gyroscope snatch detection (there is no accelerometer on a condenser)
Walking gait classification (actor is usually stationary)
Pose concealment models (no aisle, no POS cross-check)
Disconnect detection (camera is not the asset being stolen)
Lockscreen remote wipe (there is no device to wipe; the asset is copper, packages, or vehicles)
Bluetooth beacon proximity (no paired device on a mailroom shelf)

The DVR brands the capture point works on

Compatibility is at the recorder level, not the camera level. If the recorder has an HDMI port driving a monitor, it works. Below is a non-exhaustive list.

Hikvision DS-7xxx

Dahua XVR / NVR

Lorex

Amcrest

Reolink NVR

Uniview

Swann

Night Owl

Q-See

ANNKE

EZVIZ

Honeywell Performance

Bosch DIVAR

Panasonic WJ-NX series

Any DVR with HDMI out

Lock fires on your DVR this weekend.

15-minute demo. HDMI in on a running DVR, zone drawn on a tile, dwell and schedule set, first event into WhatsApp while we are still on the call.


One question that separates a zone lock from a spec-sheet product

Ask any vendor: on a Tuesday afternoon at a 16-camera apartment property with 200 residents, how many raw person detections will your model produce, and how many alerts will land on the property manager's phone? If the ratio is not at least 50 to 1 between raw detections and delivered alerts, the fusion layer is not doing its job. The alert thread will be muted inside a week.
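The check in that question is a one-line division. The sketch below uses hypothetical function names and exact fractions so the comparison against 50 is not affected by rounding.

```python
from fractions import Fraction


def detection_to_alert_ratio(raw_detections: int, delivered_alerts: int) -> Fraction:
    """Raw person detections per delivered alert."""
    if delivered_alerts <= 0:
        raise ValueError("delivered_alerts must be positive")
    return Fraction(raw_detections, delivered_alerts)


def passes_fusion_test(raw_detections: int, delivered_alerts: int, minimum: int = 50) -> bool:
    """True if the ratio meets the 50-to-1 bar used in this guide."""
    return detection_to_alert_ratio(raw_detections, delivered_alerts) >= minimum


print(detection_to_alert_ratio(200, 4))   # 50
print(passes_fusion_test(200, 4))         # True
```

Taking exactly 200 raw detections, the 3 to 8 delivered alerts quoted in this guide span roughly 67 to 1 down to 25 to 1, with 4 alerts landing on 50 to 1. The guide's figure is "over 200", so it is a lower bound and the real ratio may be higher.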

200+

Raw person detections per day from residents, deliveries, and staff at a 16-camera apartment.

3 to 8

Delivered lock events per day after zone, dwell, and time fusion. The readable number.

5,120 ms

Typical capture-to-delivery latency on a real deployment. Well inside the pre-action window.

Frequently asked questions

Is 'theft detection lock' the same thing as Android's phone feature?

The phrase was popularised by Android's Theft Detection Lock, a phone-side feature that uses the accelerometer, Wi-Fi, and Bluetooth to detect a snatch motion followed by fast movement away, and then auto-locks the screen to protect data. That is a specific product for a specific asset class: a pocket-sized device that is picked up and run with. If you searched the phrase because you are protecting a phone, that is where you want to end up. If you searched it because you are protecting a mailroom, a transformer pad, an HVAC cage, a loading dock, or a parking lot, the mechanism is structurally similar (three-signal fusion, then a lock) but the signals and the capture point are different. This guide is about that second version.

What does 'lock' mean on a camera feed, if there is no screen to lock?

On a phone, the lock is the screen lock. On a property, the lock is a polygon armed on a schedule. The moment a person enters that polygon during the armed window and dwells past the threshold, the detection pipeline locks onto that subject for that event: tile captured, thumbnail cropped, zone label attached, dwell counter frozen, message shipped. The event is 'locked' in the same sense that a radar operator says target lock: a specific subject, a specific zone, a specific time, with the chain of evidence packaged so dispatch can act inside the pre-action window rather than after the fact.

What are the three signals in a property-level theft detection lock?

Zone, dwell, and time. Zone is a drawn polygon on a specific camera tile: the package shelves in the mailroom, the condenser pad, the dock apron, the trailer king-pin area, the transformer cage. Dwell is how long a person stays inside that polygon before the event fires, usually 10 to 20 seconds, chosen to filter residents walking through from actors setting up. Time is the schedule the zone is armed on: some zones are armed 24/7 (transformer cage), some only after business hours (mailroom after 7 p.m.), some only during specific shifts (dock gate). Any single signal fires dozens of times a day. All three signals together fire a handful of times a day, and those are events worth reading.

Why tap the DVR's HDMI output instead of the camera streams?

Because the HDMI multiview is the signal that already shows every camera at once, regardless of the camera vendor. The recorder has already done the hard work: it has pulled the streams, muxed them into a mosaic, and driven them to the monitor port the guard watches. Our system taps that port. One HDMI in from the DVR, one HDMI out to the monitor, a network cable, power. The physical install on a running DVR is under 2 minutes. No ONVIF negotiation, no per-camera credentials, no camera firmware assumption. The capture point is the recorder, so analog and IP cameras behave identically and the brand of the camera does not matter.

How many cameras can one unit watch for this kind of detection?

Up to 25 tiles per unit off a single DVR multiview. If the DVR is set to a 4x4 grid the unit runs inference across 16 tiles in parallel. If it is a 5x5 it runs across 25. If the operator switches the DVR to fullscreen on a specific camera during an incident, the unit re-scopes to that single camera at full resolution, so per-tile accuracy goes up, not down. Properties with more than 25 cameras typically run one unit per DVR.

How long does the actual event take to arrive after the person crosses into the zone?

End-to-end, tile capture to WhatsApp delivery, under 60 seconds in measured deployments. The alert carries a tile thumbnail, the zone label, dwell seconds, timestamp, camera name, and a latency number on the payload. A sample real-deployment event record shows a latency of about 5,120 ms from capture to delivery. That number matters because the pre-action window, from zone entry to the actual act, is usually 30 to 180 seconds. A detection that lands at second 45 is intervention. A detection that lands at minute 4 is documentation.

What about the clock, camera names, and DVR channel bug overlaid on the video?

Those are the part of 'AI on existing cameras' that most vendors skip, and the part that kills false-positive budgets. A DVR composite is not a clean video feed: every tile has a fixed-position clock, a per-tile camera name strip, and a channel bug watermark. To a detection model those glyphs are visual noise that can fire bounding boxes on a colon in the clock or on tall letterforms in a camera label. Our system masks the three overlays per DVR layout once at install time, not once per frame. Layout ids such as 4x4-std and 5x5-std are preconfigured.
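One way to picture per-layout masking is a lookup table of rectangles that is defined once and then blanked out of each tile before inference. The sketch below is an assumption about how that could work, not the product's code. The rectangle positions are invented, and the tile is a toy 8x8 grid of grey values rather than a real frame.

```python
from typing import Dict, List, Tuple

Rect = Tuple[int, int, int, int]  # (left, top, width, height) in tile pixels

# Hypothetical overlay positions for one layout; real positions depend on the DVR.
LAYOUT_MASKS: Dict[str, List[Rect]] = {
    "4x4-std": [
        (0, 0, 6, 1),  # clock
        (0, 7, 8, 1),  # camera name strip
        (7, 0, 1, 1),  # channel bug
    ],
}


def apply_masks(tile: List[List[int]], masks: List[Rect], fill: int = 0) -> List[List[int]]:
    """Return a copy of the tile with every mask rectangle filled in."""
    out = [row[:] for row in tile]
    height = len(out)
    width = len(out[0]) if out else 0
    for left, top, mask_w, mask_h in masks:
        # Clamp each rectangle to the tile bounds before filling.
        for y in range(max(top, 0), min(top + mask_h, height)):
            for x in range(max(left, 0), min(left + mask_w, width)):
                out[y][x] = fill
    return out


tile = [[255] * 8 for _ in range(8)]
masked = apply_masks(tile, LAYOUT_MASKS["4x4-std"])
```

Because the rectangles are keyed by layout id, they are configured once per layout. Each frame only pays for the fill.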

How is this different from a smart lock or an electronic door lock?

A smart lock is an access control device on a single opening. If someone defeats it, you know it was defeated. A theft detection lock in this guide is a detection device that covers open outdoor areas and shared indoor zones that have no door at all: transformer pads, parking lots, package alcoves, dock aprons, HVAC cages, trailer yards. Those places cannot be locked with a bolt because they are either outdoors, shared, or part of a traffic path. The only 'lock' that maps to them is a zone armed on a schedule, with a camera providing the evidence chain. Use both if you have both. They cover different failure modes.

Does this reduce false positives compared to plain motion alarms?

Motion alarms run on one signal: pixel delta. They fire on raccoons, trash bags, shadows at sunset, delivery drivers turning around, and headlights sweeping a wall. A zone + dwell + time pipeline runs on three filter layers, all of which have to agree. A typical 16-camera apartment property generates more than 200 raw person detections per day from residents, contractors, deliveries, and staff. The same property generates 3 to 8 delivered alerts per day after the three filters run. That ratio, roughly 50 to 1, is what separates an alert channel that stays in service from one that gets muted inside a week.

What delivery channel does the lock event actually land in?

One WhatsApp thread per property, with the tile thumbnail attached. WhatsApp is chosen because it is already open on the staff's phones, it is the same thread they use for maintenance, move-outs, and vendor coordination, and adding an alert stream to it costs nothing. The alternative, a dedicated monitoring portal, starts around $250 per camera per month and assumes a human watching a feed. SMS fallback and a webhook into a PMS or access-control platform are both available if a property wants them, but the default is WhatsApp because it is the channel that gets read.

Worth saying plainly

“Theft detection lock” has become synonymous with the Android phone feature, and the phone feature works. If the asset in your hand is a phone, use that feature. If the asset you are protecting is a mailroom, a pad, a cage, a dock, a lot, a conex, or a bike room, the same three-signal fusion idea moves over cleanly, but the signals are zone, dwell, and time, and the capture point is the DVR's HDMI multiview rather than an accelerometer.

The “lock” is a polygon armed on a schedule. The fire is an event packaged with a tile thumbnail, a dwell counter, a timestamp, and a layout id, shipped to a WhatsApp thread in under 60 seconds. That is a theft detection lock for a building, and it runs on cameras you already own.