Adding AI to a legacy NVR: why protocol-level retrofit dies on a real fleet, and HDMI capture doesn't
The advice you usually find for putting AI on top of an existing NVR assumes the fleet is uniform, the cameras are alive, the firmware is current, and the credentials were preserved through the last property manager turnover. None of those are true on a real legacy install. The cameras are from three different vendors, two of them are EOL with no firmware path, one batch only speaks the OEM's proprietary SDK, and the admin password was lost in 2022. That fleet blocks every retrofit that tries to talk to the cameras.
The retrofit that survives is the one that doesn't talk to the cameras at all. Read the recorder's HDMI output, split the multiview back into per-tile crops, and run the AI on the pixels the recorder is already drawing. This page walks through why protocol-level retrofit (RTSP, ONVIF, vendor SDK) fails on legacy fleets, what HDMI capture actually sees, the brand-compatibility list nobody publishes, and the cases where HDMI is the wrong answer.
Yes, you can add AI to a legacy NVR camera fleet, but only via HDMI capture if the fleet is heterogeneous, EOL, or pre-ONVIF. Protocol-level retrofit (RTSP pulls, ONVIF subscriptions, vendor SDKs) collapses on real legacy installs because of codec mismatch, ONVIF Profile S vs Profile T gaps, lost credentials, and proprietary stream formats. HDMI capture reads the recorder's multiview output that the NVR is already drawing for its wall display, splits it back into per-tile crops, and runs detection on each tile. The cameras don't change. The recorder doesn't change. The network doesn't change. Install is one HDMI cable, about two minutes on site.
“At one Class C multifamily property in Fort Worth, this approach caught 20 incidents including a break-in attempt in the first 30 days on a fleet that was a mix of three different camera vendors with two batches EOL. Customer renewed at month one.”
180-unit Class C multifamily, Fort Worth TX, mixed-vendor legacy fleet
What "legacy NVR" actually looks like on the inside
The word "legacy" gets thrown around like it means "old but still working." On a property installation it usually means three things stacked on top of each other.
One: the fleet is heterogeneous. The recorder was bought as a turnkey kit, but cameras get added and replaced over the years as they fail or as the property scope expands. Five years in, you have three or four camera vendors hanging off one NVR. The original eight from the kit, plus four PoE cameras added when the parking lot was rebuilt, plus two wireless cameras pushed in last year by a maintenance lead who bought what was on Amazon that day.
Two: at least one batch is EOL. The vendor stopped shipping firmware updates two product lines ago. The cameras still work, they still record to the NVR, but they only support ONVIF Profile S (live video, no event metadata) or in some cases only the vendor's proprietary SDK. Anything that wants to talk to those cameras has to either downgrade to the lowest common denominator or skip them.
Three: the credentials are gone. The cameras were provisioned by an integrator who is no longer reachable, the admin password was rotated by the NVR but the camera-side credentials drifted, and pulling RTSP directly from any camera returns 401. The NVR can still see them because the NVR remembers the original handshake from when it onboarded them, but nothing else can. This last one is the killer for protocol-level AI retrofit and almost nobody warns you about it before the install day.
The two retrofit paths, side by side
There are exactly two ways to add AI on top of an existing camera system without replacing the cameras. The first reads from each camera over the network. The second reads from the recorder's display output. The differences below are why one of them dies on legacy fleets.
| Feature | Protocol-level retrofit (RTSP / ONVIF / SDK) | HDMI capture (Cyrano) |
|---|---|---|
| Touches camera credentials | Yes, needs admin or operator on each camera | No, never speaks to a camera directly |
| Affected by EOL firmware | Yes, blocked by missing ONVIF Profile T or codec | No, NVR has already decoded the stream |
| Affected by mixed vendors | Yes, each vendor needs its own integration path | No, all vendors composite to the same HDMI output |
| Works on proprietary protocols | Only if the SDK is licensed and current | Yes, NVR has done the protocol work already |
| Adds bandwidth on the camera VLAN | Yes, doubles the per-camera read rate | No, single HDMI cable to the AI box |
| Per-camera metadata (PTZ, events) | Yes, when ONVIF Profile T is supported | Only what the recorder draws on screen |
| Original-resolution per-camera frame | Yes, 4K stream from a 4K camera | Downscaled when shown in a 5x5 multiview tile |
| Install time on a 12-camera mixed fleet | Hours to days of credential discovery | About 2 minutes plus grid calibration |
| Survives credential drift on legacy gear | No | Yes |
If the fleet is uniform, current-generation, ONVIF Profile T compatible, and the credentials are intact, protocol-level retrofit is also a valid path and gives you metadata HDMI doesn't. The argument here is specifically about legacy fleets, where those preconditions usually aren't met.
The legacy-fleet failure modes HDMI capture bypasses
These are the four failure modes we see when a protocol-level retrofit is attempted on a legacy NVR setup. Each one is a project-killer when it shows up on install day. HDMI capture is structurally immune to all four because it never enters the path where these can fail.
Codec mismatch
An old camera streams H.264 baseline at 5 fps over RTSP and the AI overlay's decoder expects H.265 main, or the camera streams MJPEG only. The overlay either rejects the stream or silently runs at 0.5 fps. The NVR is already decoding every stream to draw its multiview, so HDMI sees a uniform pixel grid regardless of source codec.
ONVIF Profile gap
Camera supports ONVIF Profile S (video) but the overlay needs Profile T (events, metadata). The connection succeeds and returns nothing useful. HDMI doesn't care because the NVR has already drawn what it knows onto the screen.
Lost credentials
The admin password was rotated by the NVR years ago and never propagated. RTSP pulls return 401. The NVR still works because it remembers the original handshake. HDMI capture inherits that access: the AI box never authenticates to a camera at all.
Proprietary protocols
Wireless or off-brand cameras that only push to the recorder over a vendor protocol with no public ONVIF endpoint at all. Protocol-level retrofit cannot reach them. They appear on the multiview like every other camera, so HDMI capture treats them identically to the rest.
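The four failure modes above can be expressed as a simple triage over whatever a pre-install probe of each camera returns. The sketch below is illustrative only: `ProbeResult`, its field names, the `blocker` function, and the example fleet are all hypothetical, and the default decoder and profile requirements are assumptions chosen to mirror the examples in the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProbeResult:
    """Hypothetical summary of one camera probe; field names are illustrative."""
    name: str
    rtsp_status: Optional[int]   # RTSP status code, or None if nothing answered
    codec: Optional[str]         # e.g. "h264", "hevc", "mjpeg"; None if unknown
    onvif_profiles: frozenset = frozenset()  # e.g. frozenset({"S", "T"})


def blocker(probe, decoder_codecs=frozenset({"hevc"}), needs_profile="T"):
    """Return the first failure mode that blocks a protocol-level pull, or None."""
    if probe.rtsp_status is None and not probe.onvif_profiles:
        return "proprietary protocol"   # no RTSP and no ONVIF endpoint at all
    if probe.rtsp_status == 401:
        return "lost credentials"       # camera answers but rejects the login
    if probe.codec is not None and probe.codec not in decoder_codecs:
        return "codec mismatch"         # stream exists but cannot be decoded
    if needs_profile not in probe.onvif_profiles:
        return "ONVIF profile gap"      # video works, required profile is missing
    return None


# Assumed example fleet, not data from a real site.
fleet = [
    ProbeResult("lobby", 200, "hevc", frozenset({"S", "T"})),
    ProbeResult("lot-east", 200, "h264", frozenset({"S"})),
    ProbeResult("stairwell", 401, None, frozenset({"S"})),
    ProbeResult("gate-wifi", None, None),
    ProbeResult("mailroom", 200, "hevc", frozenset({"S"})),
]
report = {p.name: blocker(p) for p in fleet}
for name, reason in report.items():
    print(f"{name}: {reason or 'reachable'}")
```

Any camera with a non-empty entry in `report` is one a protocol-level retrofit would have to skip or work around. On the HDMI path the same triage is unnecessary, because every camera the recorder can draw is already in the composite.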
The NVR brand list nobody publishes
The compatibility question for HDMI capture is not "does the brand support a specific protocol" (the answer is irrelevant), it is "does this NVR have an HDMI output that drives a multiview to a wall display." Every consumer and prosumer NVR sold in the past decade does, including all of the brands below. Many of them are also OEM relationships in disguise: a lot of hardware-store CCTV bundles are Hikvision or Dahua hardware under a different label. HDMI capture works on all of them because the OEM substrate exposes the same HDMI multiview behavior regardless of the badge on the case.
Recorders we have run HDMI capture against
NVR brands that work with HDMI-based AI retrofit
| Brand | Models / notes |
|---|---|
| Hikvision | DS-7600/7700/7900 series |
| Dahua | DH-NVR series, Lorex parent |
| Lorex | N841/N881 series |
| Reolink | RLN8/RLN16/RLN36 |
| Amcrest | NV4108/NV4116/NV4216 |
| Swann | DVR8/NVR8/NVR16 |
| Annke | C500/CR500 series |
| Uniview | NVR301/NVR304/NVR308 |
| Q-See | QC and QT series |
| Night Owl | XHD/Bluetooth NVR series |
| Defender | Sentinel and PhoenixM2 |
| EZVIZ | X5 and W3 series |
| Foscam | FN3104H/FN3108H |
| Zosi | 8CH/16CH bundle NVRs |
| Anpviz | Hikvision OEM rebrand |
| Generic OEM | Hikvision/Dahua reflash |
Protocol-level retrofit on this same list of brands has wildly variable behavior because each vendor implements ONVIF slightly differently and several batches ship with proprietary-only firmware that ONVIF cannot reach. HDMI capture treats them as a single class because it reads what the recorder draws, not what each vendor permits over the network.
Where HDMI capture is the wrong answer
This page is an argument for a specific retrofit path on a specific shape of fleet, not a universal claim. HDMI capture is the wrong choice in three cases.
One: the fleet is uniform and current. If every camera is the same vendor and model, on supported firmware, with intact credentials, and the property has IT staff, then a protocol-level retrofit gives you per-camera metadata that HDMI doesn't (PTZ control, ONVIF event subscriptions, audio analytics). On a portfolio that recently rolled out a fresh fleet, the protocol path has the higher ceiling. HDMI is the right call when the fleet is messy, not when it's clean.
Two: the use case is license plate reading at distance. A 4K camera shown in a 5x5 multiview tile on a 1080p HDMI output is sampled at roughly 384x216 by the AI box (1920/5 by 1080/5), which is enough for person and vehicle detection but not enough to read a plate beyond a few car lengths. The fix is to put the LPR cameras on a separate tile in an expanded multiview, or to run a small ONVIF integration only on those cameras and feed the rest over HDMI. HDMI is the bulk transport, ONVIF is the surgical addition.
Three: the NVR is headless. A small percentage of rack-mount NVRs in commercial datacenter installs ship without a usable HDMI output and only stream over the network. On those, HDMI capture is structurally not an option and a protocol-level retrofit (or a different recorder) is the only path. This case is rare on multifamily and small commercial sites, where almost every install has a wall display in the leasing office or maintenance closet.
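The tile-resolution figure in the license-plate case above is plain division of the HDMI output by the grid size. A minimal sketch of that arithmetic follows, assuming the composite is 1080p or 4K and ignoring any borders or overlays the recorder draws between tiles; the function name `tile_resolution` is illustrative.

```python
def tile_resolution(output_w, output_h, grid):
    """Pixels available to one camera in a grid x grid multiview (borders ignored)."""
    return output_w // grid, output_h // grid


outputs = {"1080p": (1920, 1080), "4K": (3840, 2160)}
sizes = {}
for label, (w, h) in outputs.items():
    for grid in (2, 3, 4, 5):
        sizes[(label, grid)] = tile_resolution(w, h, grid)
        tw, th = sizes[(label, grid)]
        print(f"{label} {grid}x{grid}: {tw}x{th} per tile")
```

A 5x5 grid on a 1080p output gives 384x216 per tile, and the same grid on a 4K output gives 768x432. Moving a plate-reading camera to a smaller grid or a dedicated tile therefore raises its pixel budget far more than any change on the camera side.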
What the install actually looks like on a legacy property
The install procedure is short and identical regardless of NVR brand, which is the second-order effect of routing through the recorder's display output. Find the wall display in the leasing office or maintenance closet. The cable from the NVR's HDMI port into that display is what the AI box taps. You either run an HDMI splitter (one feed continues to the wall display, the other goes to the AI box) or use the NVR's secondary HDMI output if it has one (most do).
Plug the AI box into power and ethernet. In the NVR's local UI, set the HDMI output to a fixed multiview grid (2x2 for four cameras, 3x3 for nine, 4x4 for sixteen, 5x5 for up to 25). Disable any auto-cycle behavior so each camera stays in a known tile. Open the AI box's setup dashboard, click capture-calibration, and the box records the pixel boundaries of each tile from a one-shot frame. From that point on, every frame off the HDMI cable gets split per the calibration and run through detection independently.
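As a rough illustration of what a per-tile split involves, the sketch below computes crop rectangles for an evenly spaced grid and applies them to a frame. It is a simplification under stated assumptions: the calibration described above records tile boundaries from a captured frame, whereas this version assumes equal tiles with no borders. The names `tile_boxes` and `crop` are hypothetical, and the frame is a plain list of pixel rows so the example needs no external libraries.

```python
def tile_boxes(frame_w, frame_h, rows, cols):
    """Return (left, top, right, bottom) crop boxes in row-major tile order.

    Boundaries use integer proportions, so when the frame size is not evenly
    divisible by the grid the leftover pixels are spread across tiles.
    """
    boxes = []
    for r in range(rows):
        top, bottom = r * frame_h // rows, (r + 1) * frame_h // rows
        for c in range(cols):
            left, right = c * frame_w // cols, (c + 1) * frame_w // cols
            boxes.append((left, top, right, bottom))
    return boxes


def crop(frame, box):
    """Crop a frame stored as a list of pixel rows."""
    left, top, right, bottom = box
    return [row[left:right] for row in frame[top:bottom]]


# Calibrate once, then reuse the same boxes for every captured frame.
calibration = tile_boxes(1920, 1080, rows=3, cols=3)
frame = [[0] * 1920 for _ in range(1080)]   # stand-in for one captured HDMI frame
tiles = [crop(frame, box) for box in calibration]
print(len(tiles), "tiles of", len(tiles[0][0]), "x", len(tiles[0]))
```

Because the boxes are computed once and reused, the per-frame work is only slicing. That is also why auto-cycle has to be disabled: the boxes carry no information about which camera currently occupies each tile.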
Total time on site: about two minutes for the physical install, another five for the calibration and zone configuration. No camera-side work at all. No firmware updates. No credential discovery. No new VLAN. The legacy fleet stays exactly as legacy as it was when you walked in, which is the point.
One Cyrano unit handles up to 25 camera tiles in parallel, runs entirely on-device (no cloud round trip for inference), and is $450 one-time hardware plus $200 per month from month two. That includes the per-tile classifier, the natural-language footage search, the alert delivery, and the portfolio dashboard. There is no separate "legacy fleet" SKU; the same box ships on every install.
Have a legacy NVR with a mix of camera vendors? Worth a 15-minute call.
We walk through your specific fleet, identify the cameras that would block a protocol-level retrofit, and show what HDMI capture sees on a recording from your own building. No hardware change, no credential discovery, two-minute install.
Frequently asked questions
What does 'legacy NVR' actually mean for an AI retrofit?
Three things, usually all at once. First, the recorder is pre-2020 and was bought as a turnkey kit with whatever cameras the installer had on hand, so the fleet is heterogeneous (often three or four camera vendors on one NVR). Second, at least one batch of cameras is EOL: the manufacturer no longer ships firmware updates and the cameras only support ONVIF Profile S (live video, no event metadata), or in some cases only the vendor's proprietary SDK. Third, the network the cameras live on is flat and the credentials were lost two property managers ago. Each of these breaks a protocol-level AI overlay (RTSP pull, ONVIF subscription, vendor SDK). HDMI capture sidesteps all three because it reads the recorder's local display output, which the recorder is already drawing whether anyone watches it or not.
Why doesn't an RTSP or ONVIF pull just work on every camera?
Three failure modes show up reliably on legacy fleets. (1) Codec mismatch: an old camera streams H.264 baseline at 5 fps over RTSP and the AI overlay's decoder expects H.265 main. (2) Profile mismatch: the camera supports ONVIF Profile S (video only, no events) but the overlay subscribes to Profile T metadata and gets nothing. (3) Credential drift: the cameras were configured by an installer who is no longer reachable, the admin password was rotated by the NVR but never propagated to the cameras, and pulling RTSP from the camera directly returns 401. On a mixed fleet of 12 cameras, you might get 7 working over RTSP, 3 working over ONVIF with reduced framerate, and 2 that refuse both and only show up on the NVR's local UI. HDMI capture bypasses all of this because the NVR has already done the negotiation and is drawing every camera onto its multiview output.
What is the actual signal path on an HDMI-based AI retrofit?
The recorder draws every camera onto a multiview composite for its local monitor port. That composite is a single rectangular video frame, typically 1080p or 4K at 25 to 30 fps, divided into a fixed tile grid (2x2 for four cameras, 3x3 for nine, 4x4 for sixteen, 5x5 for twenty-five). The edge AI device reads the HDMI output, splits the frame back into per-tile crops using a grid template captured at install, and runs detection plus classification on each tile. The cameras never see the AI box. The NVR sees it only as another display on its HDMI output. The camera network never sees it either; the box uses the network only for outbound alerts. The retrofit is isolated from every protocol and credential inside the existing CCTV stack, which is the property of HDMI capture that survives a legacy fleet.
Does this work on every NVR brand?
If the NVR has a working HDMI output that can drive a multiview to a wall display, yes. That covers basically every consumer and prosumer NVR sold in the past 12 years, including Hikvision, Dahua, Lorex, Reolink, Amcrest, Swann, Annke, Uniview, Q-See, Night Owl, Defender, EZVIZ, Foscam, Zosi, Anpviz, and the Hikvision OEM rebrands sold under hardware-store private labels. The two edge cases are (a) headless NVRs (no HDMI port, only network output) and (b) NVRs whose HDMI is hardware-locked to a single resolution that does not match a 16:9 multiview the AI box can sample. Both are rare on multifamily property installs. On a legacy installation by an integrator, there is almost always a wall display somewhere in the leasing office or maintenance closet, and that display's HDMI run is the input the AI box uses.
What does HDMI capture lose compared to a protocol-level integration?
Three things, none of them load-bearing for incident detection. (1) Per-camera metadata that lives in the protocol layer: ONVIF event subscriptions, vendor-specific PTZ position, badge swipe correlation routed through the NVR's API. The HDMI box can read what the recorder draws on screen, which on most NVRs includes the camera label and timestamp overlay, but it cannot see protocol-level events. (2) Original-resolution video for the cameras that are downscaled in the multiview: a 4K camera shown in a 5x5 tile on a 1080p output is sampled at roughly 384x216 by the AI, which is fine for person and vehicle detection but not enough for license plate reading. The fix is to put high-detail cameras (LPR, entry door) on a separate dedicated tile or an expanded multiview. (3) Direct PTZ control: the AI box does not move the cameras. Static cameras only. For most multifamily and small commercial deployments, none of these are problems. For a site that needs LPR plus PTZ tracking, the right answer is HDMI for the bulk plus a small ONVIF integration for the two cameras that need it.
How do you handle the multiview grid changing? Some NVRs auto-cycle through views.
You disable the auto-cycle. Every NVR that does multiview also has a static-grid mode where one tile assignment stays on screen until manually changed. The install procedure on a legacy NVR is: (a) log into the local UI, (b) set the HDMI output to a fixed grid (2x2 to 5x5 depending on camera count), (c) lock the assignment so each camera occupies a known tile, (d) capture a one-shot calibration frame on the AI box that records the pixel boundaries of each tile, and (e) the runtime split-and-classify logic uses that calibration. If the operator later rearranges the grid, the install procedure is re-run, which takes about 30 seconds. The auto-cycle mode is incompatible because the AI box would have to track which camera is in each tile across a moving target, which is solvable but adds complexity that isn't justified on a legacy install.
What about cameras the NVR shows but on a different recording channel? IP cameras added later via PoE.
If the NVR can show them on the multiview, the AI box can see them. PoE cameras added after the original install (via the NVR's built-in PoE switch on the back panel) typically appear as additional channels on the same recorder and get tiles in the multiview. That is the whole point of routing through the NVR: whatever shows up on the wall display is what the AI sees. Mixed analog (BNC) and IP (PoE) cameras on the same recorder all composite to the same HDMI output. The AI box does not care which physical interface a camera came in on. This is the single biggest practical reason HDMI capture wins on legacy fleets: a property that has 6 analog cameras from 2014 plus 4 PoE cameras added in 2020 plus 2 wireless cameras pushed in last year cannot be retrofitted via RTSP at the camera layer because the wireless cameras only push to the NVR over a vendor protocol, but all 12 of them appear on the multiview.
Does this work during an internet outage?
Detection and classification do; alerts depend on the alert path. The AI runs on the edge box, on-site, with no cloud round trip for inference. So if the ISP goes down at 02:00, the AI keeps detecting incidents and writing them to local storage. The alert delivery (SMS, voice call) requires a working uplink. On properties where uptime matters, the recommended setup is the AI box plus a cellular backup uplink for alerts only (a few dollars a month). The full footage stays local and is unaffected. Contrast that with cloud-dependent camera systems, where an internet outage can stop recording entirely.
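One common way to keep alerts from being lost while the uplink is down is a store-and-forward queue: detections are queued locally and delivered in order once sending succeeds again. The sketch below shows that general pattern only. It is an assumption about how such a path could work, not a description of Cyrano's alert delivery, and `AlertOutbox`, `send`, and the sample alerts are hypothetical.

```python
from collections import deque


class AlertOutbox:
    """Hypothetical store-and-forward queue for outbound alerts."""

    def __init__(self, send):
        self._send = send            # callable(alert) -> bool, True when delivered
        self._pending = deque()

    def submit(self, alert):
        """Queue an alert, then try to deliver everything that is pending."""
        self._pending.append(alert)
        return self.flush()

    def flush(self):
        """Deliver queued alerts in order; stop at the first failure."""
        delivered = 0
        while self._pending and self._send(self._pending[0]):
            self._pending.popleft()
            delivered += 1
        return delivered

    @property
    def pending(self):
        return len(self._pending)


uplink = {"up": False}   # simulated ISP state
sent = []


def send(alert):
    """Stand-in for SMS or voice delivery; succeeds only while the uplink is up."""
    if uplink["up"]:
        sent.append(alert)
    return uplink["up"]


outbox = AlertOutbox(send)
outbox.submit("02:03 person at rear gate")    # uplink down: stays queued
outbox.submit("02:11 vehicle in fire lane")
queued_during_outage = outbox.pending
uplink["up"] = True
delivered_after_restore = outbox.flush()      # uplink back: both go out in order
```

With a cellular backup uplink, `send` would simply try the second path before reporting failure, so the queue would rarely hold anything for long.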
Can I do this myself or do I need an integrator?
On a single property with one NVR and a wall display already running, this is a self-install. Plug HDMI from the NVR's secondary output into the AI box, run an ethernet cable from the AI box to your network, and walk through a one-time grid calibration in the dashboard. About two minutes on site. The legacy fleet doesn't change anything on this front because the AI box doesn't touch the cameras or the recorder's configuration. On a portfolio with many properties, the same process is repeated per property and runs about half a day end-to-end including mounting and labeling. No new wiring, no firmware updates, no credential discovery on the cameras.
How much does this cost compared to ripping out the legacy fleet?
Cyrano is $450 one-time hardware plus $200 per month per property starting month two. Replacing the legacy fleet end-to-end (new IP cameras, new PoE switch, new NVR, new wiring on a multifamily building, plus an integrator labor week) typically lands between $50,000 and $100,000 per property, plus an ongoing cloud subscription. The pricing gap is the entire reason HDMI capture exists: 99 percent of the value of a smart-camera retrofit on a multifamily property is real-time alerts and natural-language footage search, and both of those can be delivered without touching the cameras themselves. The cameras you have are good enough to detect an intruder. They were not good enough to know they were looking at one.
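The pricing comparison can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted above ($450 hardware, $200 per month from month two, $50,000 to $100,000 for a replacement). It deliberately ignores the replacement's ongoing cloud subscription and any discounting, so it understates the gap. The function names are illustrative.

```python
def cumulative_cost(months, hardware=450, monthly=200):
    """Total paid after `months` months: hardware up front, subscription from month two."""
    return hardware + monthly * max(months - 1, 0)


def months_to_reach(target, hardware=450, monthly=200):
    """First month in which cumulative spend reaches `target`."""
    months = 1
    while cumulative_cost(months, hardware, monthly) < target:
        months += 1
    return months


for years in (1, 3, 5):
    print(f"{years} yr: ${cumulative_cost(12 * years):,}")

low, high = months_to_reach(50_000), months_to_reach(100_000)
print(f"Reaches $50,000 in month {low}, $100,000 in month {high}")
```

Under these assumptions, five years of the retrofit costs $12,250 per property, and cumulative spend does not reach the low end of the replacement range until month 249, a little over 20 years in.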
Adjacent reading on the same retrofit stack
Upgrade your DVR or NVR with AI analytics, no camera replacement
The broader case for retrofitting AI on top of an existing recorder, including the cost comparison against ripping the fleet out and the operational case for keeping the cameras you have.
Legacy CCTV real-time monitoring gap
Why legacy CCTV records everything and alerts on nothing, what closes the gap, and what 'real-time' actually requires from the alert path.
AI security camera intent alerts: the six-input decision
Once HDMI capture has the pixels, what the classifier on top actually does. Walks through the class, zone, time, dwell, badge, posture decision that routes to LOW vs HIGH.