DVR person and vehicle alerts: five brands, five formats, and the one surface that already unifies them.

Every modern DVR ships a person and vehicle alert. Hikvision calls it Motion Detection 2.0. Dahua wraps it inside IVS. Lorex and Reolink put a Smart Detection toggle in the consumer app. Uniview files it under Smart Event. The marketing is uniform, the alerts are not. Each brand runs the classifier on different silicon, ships events in an incompatible schema, and exposes a different tuning surface in a different console. The minute your portfolio crosses two brands, your alert pipeline stops composing.

There is one surface that does compose, and almost no operator is aware of it. Every brand draws the per-class label directly onto the HDMI multiview output as an on-screen overlay glyph at a predictable pixel offset, because the DVR is rendering that composite for the security operator's wall display anyway. That overlay is the lowest common denominator across vendors. A downstream device tapping HDMI inherits every brand's native classification visually, while running its own model on the same frame. This is a long-form walk through why the per-brand event streams do not interoperate, what each brand actually does inside the P&V box, and how the multiview overlay closes the gap.

Matthew Diakonov · 13 min read

What is actually inside the P&V box on each major brand

The five brands you will see in real Class B and Class C multifamily, in retail, and on construction trailers are Hikvision, Dahua, Lorex, Reolink, and Uniview. Every other label you encounter (Annke, Swann, Amcrest, Night Owl, Anpviz) is a rebrand or a derivative of one of these five, so the underlying behavior tracks one of the entries below.

Hikvision

On modern Hikvision DVRs and NVRs the feature is called Motion Detection 2.0 (or Smart Event in the Hik-Connect mobile app). Person and vehicle classification runs either on the camera ISP (on Hikvision's newer AcuSense IP cameras) or on the DVR chipset (on TVI analog channels). Events leave the recorder over ISAPI HTTP callbacks with the rule name and class label embedded. Output to the multiview includes a small white sans-serif label near the bottom of the tile, along with a yellow bounding box around the detection. Tuning lives in the local web console and in iVMS-4200.

Dahua

Dahua groups its detector under IVS (Intelligent Video System) and includes a stack of rules: Tripwire, Intrusion, Abandoned Object, and Object Detection, each gated on person or vehicle. On WizSense cameras the classification runs on the camera; on older Smart H.265 cameras it runs on the recorder. Events leave via alarm.cgi callbacks (different schema from Hikvision) and the per-class label renders on the multiview as a colored bounding box (green for person, blue for vehicle, yellow for unrecognized non-motor) with the rule name in the tile corner. Tuning is in the local web UI or in DSS Pro.

Lorex

Lorex hardware is mostly Dahua with a different label, so the on-device behavior tracks Dahua closely, but the export path is different. Some firmwares expose the local Dahua API; others lock it down behind the Lorex Cloud, which only forwards events to the Lorex mobile app and the email relay. The multiview overlay is the Dahua-style colored bounding box plus a small human or car icon depending on firmware. Practical implication: even on hardware that is identical to Dahua, you cannot count on the same export path being available for portfolio-level ingestion.

Reolink

Reolink consumer NVRs run a person and vehicle model on the camera ISP for any model labeled Smart Detection or Person/Vehicle Detection. The alert path goes through the Reolink mobile app push pipeline and an email relay; there is no documented webhook API on most consumer models, and the local API is undocumented and changes between firmware versions. Multiview overlay is a small class icon inside the tile, which is the surface most third-party integrators end up reading because it is the only one that is stable across firmware updates.

Uniview

Uniview's P&V detector ships under Smart Event with named rules (Tripwire, Intrusion, Object Removal, Loitering). Camera-side classification is widely available on the Pro and Prime IP cameras. Export is through the Uniview SDK or the EZView mobile app; very few third-party integrators have ever written against the SDK. Multiview overlay is a fixed-width strip with the rule name and class label printed in a known position, which makes it the easiest of the five to OCR cleanly off the composite.

Where the per-class label actually gets drawn

The chain below is the same on every brand, with one box and one wire format swapped depending on whether the model runs on the camera or on the recorder. The thing that is uniform across brands is the last step: every recorder draws the per-class label onto the HDMI composite before it leaves the box. That is why the HDMI tap is the surface that closes the cross-vendor gap.

Camera ISP, recorder, and HDMI overlay

Camera ISP: raw frame + onboard P/V class (IP)
→ DVR / NVR: rules + zones + per-class gate
→ HDMI multiview: render tile grid + class glyph overlay
→ On-site monitor: wall display, security operator
Edge tap (off the HDMI line): same composite, no extra wiring; OCR brand glyph + own model on tile

On analog (TVI, AHD, CVI) channels the first arrow disappears and the classifier runs inside the DVR chipset instead of the camera ISP, but the rest of the chain is identical. The edge tap is optional in the chain; if you do not install one, the multiview still gets drawn for the wall display, with all the same labels on it.

The anchor fact: the brand label is at a fixed pixel offset

This is the part that makes the multiview useful as a unified surface. Each brand draws its per-class glyph in a known location inside the tile, and the location is stable across firmware revisions because the on-site security operator's habit depends on it. Hikvision draws the label near the bottom edge of the tile. Dahua draws the label in the top-left corner of the colored bounding box. Lorex puts the icon in the top-right. Reolink puts the icon in the bottom-left. Uniview prints a fixed-width strip across the top of the tile. Once you measure the offset for a given brand at install time, every tile on every camera on that DVR uses it.
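Because the offset is measured once per brand and reused everywhere, turning it into a crop rectangle for OCR is plain arithmetic. The sketch below assumes fractional anchors relative to the tile (so one template covers 1080p and 4K composites); the anchor values and the `glyphCrop` helper name are illustrative placeholders, not real measured offsets.

```typescript
// Sketch: compute the pixel rectangle to OCR for a given brand's P&V glyph.
// Anchor corners follow the per-brand positions described in the article;
// the fractional values are made-up placeholders you replace at install time.

type Brand = "hikvision" | "dahua" | "lorex" | "reolink" | "uniview";

interface Rect { x: number; y: number; w: number; h: number; }

// Hypothetical anchors, relative to each tile's top-left corner, as
// fractions of tile width/height.
const GLYPH_ANCHORS: Record<Brand, { fx: number; fy: number; fw: number; fh: number }> = {
  hikvision: { fx: 0.05, fy: 0.88, fw: 0.50, fh: 0.10 }, // label near bottom edge
  dahua:     { fx: 0.02, fy: 0.02, fw: 0.40, fh: 0.08 }, // top-left of colored bbox
  lorex:     { fx: 0.78, fy: 0.02, fw: 0.20, fh: 0.10 }, // top-right icon
  reolink:   { fx: 0.02, fy: 0.88, fw: 0.20, fh: 0.10 }, // bottom-left icon
  uniview:   { fx: 0.00, fy: 0.00, fw: 1.00, fh: 0.08 }, // fixed-width top strip
};

// Given the tile's rectangle inside the composite, return the crop to OCR.
function glyphCrop(tile: Rect, brand: Brand): Rect {
  const a = GLYPH_ANCHORS[brand];
  return {
    x: tile.x + Math.round(a.fx * tile.w),
    y: tile.y + Math.round(a.fy * tile.h),
    w: Math.round(a.fw * tile.w),
    h: Math.round(a.fh * tile.h),
  };
}
```

Once the crop is known, any OCR engine running on the edge device reads the same few hundred pixels per tile per frame, which is why the per-frame cost stays negligible.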

Tile-grid templates (2x2, 3x3, 4x4, 5x5): the four layouts every brand renders. Resolution is 1080p on most current DVRs and 4K on the higher-end NVRs.

Glyph position (fixed offset): measured once at install relative to each tile's top-left corner. Stable across firmware because the wall display depends on it.

Tile assignment cost (microseconds): integer division on the bounding-box center against the tile width and height. No machine learning, no probability, pure layout math.
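That tile-assignment step really is a few lines of integer math. A sketch, with `tileIndex` and `Grid` as hypothetical names:

```typescript
// Sketch: snap a detection's bounding-box center to its tile (channel) index.
// Pure layout math, as described above: no ML, no probability.

interface Grid { cols: number; rows: number; width: number; height: number; }

// Returns the zero-based tile index for a box center (cx, cy) in pixels.
function tileIndex(cx: number, cy: number, g: Grid): number {
  const col = Math.min(g.cols - 1, Math.floor(cx / (g.width / g.cols)));
  const row = Math.min(g.rows - 1, Math.floor(cy / (g.height / g.rows)));
  return row * g.cols + col;
}

// Example: a 3x3 grid on a 1080p composite.
const grid: Grid = { cols: 3, rows: 3, width: 1920, height: 1080 };
tileIndex(700, 400, grid); // center lands in column 1, row 1 → tile 4
```

The `Math.min` clamp only matters for detections whose center sits exactly on the right or bottom edge of the composite; everything else is two divisions and a multiply.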

The implication is that a single edge device, plugged into the back of the DVR over one HDMI cable, can read the brand's native P&V classification for every camera on every property in your portfolio without ever calling the brand's SDK, opening an inbound port, or subscribing to a webhook that the firmware might break in the next update. The brand did the compositing for free. Your tap inherits it.

How the merge actually works

The brand label is one signal, and your own model is the other. They run in parallel on the same composite, on the same frame, so no clock alignment is needed. When they agree, you have a high-confidence event. When your model fires and the brand says nothing, that is recall lift: the events the brand silenced because of the pre-filter motion gate or because its model was tuned for a different scene. When the brand fires and your model does not, you can choose to log it as brand-only and let the per-class context map decide whether to surface it.

tile_grid_merge.ts
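The listing the post embeds here does not survive in this copy. A minimal sketch of the merge it describes, with hypothetical `OwnDetection`, `BrandLabel`, and `Verdict` shapes standing in for whatever the real file uses:

```typescript
// Sketch of the two-signal merge: the edge device's own model and the OCR'd
// brand glyph run on the same composite frame, so merging is a per-tile join
// with no clock alignment. All type names here are illustrative.

type Cls = "person" | "vehicle";

interface OwnDetection { tile: number; cls: Cls; score: number; }
interface BrandLabel   { tile: number; cls: Cls; } // OCR'd off the glyph

type Verdict =
  | { kind: "confirmed";   tile: number; cls: Cls }  // both signals agree
  | { kind: "recall_lift"; tile: number; cls: Cls }  // our model fired, brand silent
  | { kind: "brand_only";  tile: number; cls: Cls }; // brand fired, our model silent

function mergeFrame(own: OwnDetection[], brand: BrandLabel[]): Verdict[] {
  const out: Verdict[] = [];
  const brandSeen = new Set(brand.map(b => `${b.tile}:${b.cls}`));
  const ownSeen = new Set(own.map(d => `${d.tile}:${d.cls}`));

  for (const d of own) {
    out.push(brandSeen.has(`${d.tile}:${d.cls}`)
      ? { kind: "confirmed", tile: d.tile, cls: d.cls }
      : { kind: "recall_lift", tile: d.tile, cls: d.cls });
  }
  for (const b of brand) {
    if (!ownSeen.has(`${b.tile}:${b.cls}`)) {
      out.push({ kind: "brand_only", tile: b.tile, cls: b.cls });
    }
  }
  return out;
}
```

A `confirmed` verdict routes straight to the context map; a `recall_lift` verdict is exactly the class of event the property was previously missing.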

None of this requires touching the DVR firmware. None of it requires opening a port to the cloud. None of it requires the brand's SDK. The whole thing reads off a passive HDMI tap and runs locally on the edge device. That is the property that makes it work uniformly across a mixed portfolio.

How to diagnose your own portfolio

If you have already given up on per-brand event ingestion, this is the short diagnostic for what to do next. It applies whether your portfolio is one property or twenty, and whether the boxes in the closet are five years old or five months old.

A 5-step audit you can do in an afternoon

1. Inventory the brand of recorder at every property

Walk into the management office, look at the unit in the rack, and write down the brand and model. Most properties have one box; a portfolio of more than three properties usually has two or three brands across it. This list is the brand mix you cannot pretend away.

2. Pull the firmware version from each unit

Same console, About or System Info screen. Note the firmware. The reason this matters: the export schema for events drifts between firmware revisions, and a unified ingestion stack built on per-brand events has to track every firmware on every box. The HDMI multiview overlay does not drift, because the wall display depends on it.

3. Confirm the multiview is being drawn

Plug a monitor into the DVR's HDMI output if there is not already one. Confirm you see the tile grid, with the camera names burned into each tile and the per-class labels rendered when an event fires. If the multiview is being drawn, the tap is available; if it is not, you are on a model so old it predates HDMI output, which is rare.

4. Measure one tile-grid template per brand

For each unique brand in your portfolio, capture one frame of the multiview, identify the tile layout (2x2, 3x3, 4x4, 5x5), record the per-tile camera label position, and record the per-tile P&V glyph position. Three brands in your portfolio means three templates. The templates are reusable across every property that runs that brand.
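The template from step 4 is small enough to store as one record per brand. A sketch, with illustrative field names and made-up offset values (nothing here is a vendor schema):

```typescript
// Sketch: the per-brand tile-grid template captured once at install and
// reused across every property running that brand. Field names are
// hypothetical; the numbers are placeholders for measured values.

interface TileTemplate {
  brand: string;
  layout: { cols: number; rows: number };        // e.g. a 3x3 multiview
  resolution: { width: number; height: number }; // composite, e.g. 1920x1080
  // Per-tile pixel offsets, relative to each tile's top-left corner:
  cameraLabelOffset: { x: number; y: number };   // where the name strip sits
  glyphOffset: { x: number; y: number };         // where the P&V glyph sits
}

const dahuaTemplate: TileTemplate = {
  brand: "dahua",
  layout: { cols: 3, rows: 3 },
  resolution: { width: 1920, height: 1080 },
  cameraLabelOffset: { x: 8, y: 6 },  // measured at install; values illustrative
  glyphOffset: { x: 12, y: 24 },
};
```

Three brands in the portfolio means three records like this one, each reusable at every property running that brand.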

5. Centralize the alert routing, not the alert source

Stop trying to standardize the per-brand event stream. Instead, route every property's HDMI tap into one shared per-class context map (zones, time windows, class-to-action rules) and one on-call rotation. The recorders and cameras stay heterogeneous; the operations layer becomes uniform.

What this looks like when it actually runs

The real test of any architecture is whether it catches the incidents the existing setup was missing. The answer at the only site where I have a clean before-and-after is below. This is one property, one brand of DVR, 30 days, with the existing P&V alerts left in place and the HDMI tap added on top.

20 incidents in 30 days

At a Class C multifamily property in Fort Worth, the HDMI tap caught 20 incidents in the first month, including a break-in attempt that the existing DVR's own person/vehicle alerts had silenced as sub-threshold motion. The customer renewed at day 30.

Cyrano deployment, Fort Worth multifamily, 180 units

Most of those 20 events were not novel detections in a strict sense; the camera was pointed at the right place, and a human could see the person on the playback. They were events the existing P&V alert had been silently filtering out, either because the pre-filter motion gate had decided the pixel difference was too small, or because the camera-side classifier was tuned for a brighter scene. The break-in attempt is the one that mattered to the property manager. It is also the one the existing alert stream had been silencing.

Want to see this on the DVRs you already own?

15-minute walk-through of the multiview tap, the per-brand glyph offset, and what the recall lift actually looks like on a mixed-brand portfolio. No camera replacement, no firmware change.

Frequently asked questions

What does 'DVR person and vehicle alerts' actually refer to?

Most modern DVRs and NVRs ship a feature that fires alerts only when the motion event is classified as a person or a vehicle, instead of firing for any pixel change in the scene. Hikvision calls it Motion Detection 2.0 or Smart Event. Dahua calls it IVS (Intelligent Video System) and includes Tripwire, Intrusion, and Object Detection rules gated on person or vehicle. Lorex and Reolink consumer models surface a Smart Detection toggle with separate person and vehicle switches. Uniview groups it under Smart Event. The common shape is: the DVR or the camera ISP runs an object model on the live frame, decides whether the motion was a person, a vehicle, or neither, and only fires the alert when one of the two classes is present. The pitch is the same across every brand, fewer false positives from leaves and headlights, but the implementation is different on every brand, and the alerts do not interoperate.

If every brand ships P&V alerts, why does it matter that they are different?

Because the moment your portfolio has more than one brand of DVR, your alerts stop composing into a single stream. Hikvision events leave the DVR as ISAPI HTTP callbacks with one schema. Dahua events leave as alarm.cgi pushes with a different schema. Reolink consumer NVRs only export to the Reolink mobile app and the email relay; there is no documented webhook. Lorex (which is hardware mostly built by Dahua) sometimes exports through the Lorex Cloud, sometimes not, and the local API moves between firmware versions. Uniview has its own SDK that very few integrators have ever touched. If you operate a 12-property portfolio with three brands of DVR across it, you end up running three different alert ingestion stacks, three different tuning consoles, and three different on-call playbooks. Most operators give up and just log into each property's local console when something happens.

What is the difference between DVR-side P&V and camera-side P&V?

It depends on the wire. On TVI, AHD, and CVI analog DVRs, the cameras send raw video over coax and the DVR's chipset runs the classifier on the decoded frame. On IP NVRs the camera's onboard ISP runs the classifier and sends the per-class label to the NVR as part of the metadata channel; the NVR aggregates and surfaces the alert. On hybrid recorders some channels are analog and some are IP, and the same alert UI on the NVR is being fed by two different classifiers under the hood. The user-visible feature looks identical, but the recall, the latency, and the failure modes are completely different. Camera-side P&V tends to have better recall because the classifier sees the un-recompressed frame; DVR-side P&V tends to have lower latency on alarm output because the event is already on the recorder. Neither is automatically better.

Why are the alerts so noisy in real properties?

Three reasons. First, the classifier is tuned for a generic scene, not your scene. A loading bay at 3am with a passing truck looks different to the model than a quiet front entrance at noon, and the same threshold is applied to both. Second, almost every brand silently runs a pixel-difference motion gate before the classifier ever sees the frame, so the model is being asked to label a clip the motion engine already pre-filtered. Sub-threshold humans (a slow approach, a person hugging a wall, a low-contrast outfit at dusk) never make it to the classifier. Third, the alert rule is usually a flat 'person detected in this zone' with no time-of-day, occupancy, or behavioral context. The vendors who ship this feature are competing on box specs, not on the operations layer that turns the label into a usable phone call.

Can I just consume my brand's existing P&V events and forward them somewhere central?

Sometimes, partially. Hikvision and Dahua both expose enough of an event stream that an integrator can pull P&V alerts off the DVR and forward them. The catches are real: the schemas drift between firmware revisions, the network path from the DVR to the cloud usually requires opening inbound ports the IT team will not approve, and the events still inherit whatever recall ceiling the DVR's motion gate set. On Reolink consumer NVRs and many Lorex models there is no documented push API at all; you are stuck with the mobile app and email. So a portfolio-level central pipeline built on per-brand event ingestion is buildable on the high-end vendors and not buildable at all on the consumer ones, which is exactly the wrong shape if your portfolio is mixed.

What is the OSD glyph trick on the HDMI multiview?

Every brand of DVR draws the per-class label directly onto the rendered HDMI multiview composite. Hikvision draws a small white sans-serif label like 'Smart Event: Person' near the bottom edge of the tile. Dahua draws a colored bounding box (green, blue, or yellow depending on class) with the class name in the corner. Lorex and Reolink overlay a small icon (a stick figure for person, a car icon for vehicle) inside the tile. Uniview writes the rule name in a fixed-width strip. These overlays exist because the DVR is generating them for the on-site monitor, so a security operator looking at the wall display can tell at a glance whether the alert is a person or a car. They are also rendered before the HDMI output, which means anything tapping the HDMI line inherits all of them automatically. A downstream classifier can OCR the label, snap it to the camera ID by the tile-grid template, and treat the brand's native classification as a free signal alongside its own model. This is the surface that does compose across vendors, because the DVR did the compositing for free.

Why does the multiview tile grid help with this specifically?

DVRs render their multiview deterministically. The tile layout is a fixed template (4 tiles in 2x2, 9 in 3x3, 16 in 4x4, 25 in 5x5) and the camera assigned to each channel always lands in the same tile. The DVR also draws the camera name strip at the top of each tile and, on most brands, the per-class label inside the tile. A classifier tapping HDMI captures the layout once at install time, stores the per-tile camera label and the per-tile pixel offset where the brand draws its P&V glyph, and writes both into a tile-grid template tied to that DVR's output resolution. From then on, every detection bounding box gets snapped to its tile by integer division on x and y, and both pieces of information (the camera ID and the brand's native class label) fall out of the lookup. The whole step runs in microseconds.

Does this mean I do not need my own classifier? I can just OCR the brand's labels?

No, the brand label is a useful signal but not a sufficient one. The OCR result inherits everything the brand's classifier got wrong: the recall ceiling from the pre-filter motion gate, the silence on sub-threshold humans, the missed slow approaches. The right architecture is to run your own person and vehicle model on the same composite the DVR is rendering, and treat the OCR'd brand label as a parallel signal that boosts confidence when it agrees and surfaces a disagreement when it does not. If your model fires person and the brand label says nothing, that is the recall lift you came for. If the brand label says vehicle and your model also says vehicle in the same tile, that is a high-confidence event and you can route it directly to the per-class context map. The two streams reinforce each other; neither one alone is enough.

How does the per-class context map turn a label into a useful action?

The classifier produces the class. The context map decides whether the class fires anything. A person at the parcel shelf during business hours is logged but does not page; the same person at the same shelf after the configured delivery window is a HIGH THREAT phone call. A vehicle in the loading bay during the day is normal; a vehicle in the loading bay at 3am pages on-call ops. Headlight sweep is silenced regardless of the brand label. Animals are dropped except in zones the property explicitly cares about. The map is the operations layer that the brand's native P&V alert never reaches, because the brand exposes 'Person detected in Zone 3' and stops there. The map is what turns a label into a phone call, and it has to live somewhere downstream of the brand's alert engine.
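A context map of this kind can be sketched as a rule table plus a time-window check. The rule shape, zone names, and action set below are illustrative, not Cyrano's actual schema:

```typescript
// Sketch: a per-class context map. Class + zone + local time window decide
// the action. Rules, zones, and actions are hypothetical examples mirroring
// the scenarios described in the text.

type Cls = "person" | "vehicle";
type Action = "drop" | "log" | "page" | "call";

interface Rule {
  cls: Cls;
  zone: string;
  fromHour: number; // inclusive, 24h local time
  toHour: number;   // exclusive; a window may wrap midnight
  action: Action;
}

const rules: Rule[] = [
  { cls: "person",  zone: "parcel_shelf", fromHour: 8,  toHour: 18, action: "log"  },
  { cls: "person",  zone: "parcel_shelf", fromHour: 18, toHour: 8,  action: "call" },
  { cls: "vehicle", zone: "loading_bay",  fromHour: 6,  toHour: 22, action: "log"  },
  { cls: "vehicle", zone: "loading_bay",  fromHour: 22, toHour: 6,  action: "page" },
];

function inWindow(hour: number, from: number, to: number): boolean {
  return from < to
    ? hour >= from && hour < to   // normal daytime window
    : hour >= from || hour < to;  // window wraps midnight
}

function decide(cls: Cls, zone: string, hour: number): Action {
  const rule = rules.find(r =>
    r.cls === cls && r.zone === zone && inWindow(hour, r.fromHour, r.toHour));
  return rule ? rule.action : "drop"; // unmatched events are dropped
}

decide("vehicle", "loading_bay", 3); // 3am vehicle in the loading bay → "page"
```

The point of the table shape is that the classifier never changes: only this downstream table is edited when a property's delivery window or loading-bay hours change.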

What does this look like for a 6-property portfolio with three different DVR brands?

The cameras stay. The DVRs stay. The cabling stays. At each property a small edge device plugs into the back of the existing DVR over HDMI and reads the multiview the DVR was already drawing for the on-site monitor. The device runs object detection on the composite at native rate, OCRs the brand's per-tile P&V glyph as a parallel signal, de-tiles every detection back to a camera ID using the property's tile-grid template, and applies one shared per-class context map across all six properties. Alerts route to a single dashboard and a single on-call rotation regardless of whether the underlying box is a Hikvision, a Dahua, or a Lorex. Hardware is $450 one-time per property; software is $200 per month per property starting in month two; install takes about two minutes on site; all inference runs locally with nothing uploaded to the cloud. This is what portfolio-level P&V alerting actually looks like in practice.

