WISPAYR / AI ADVISORY
// CASE / 04 / security

Camera-vision-enrichment — vendor-neutral scene labels

Multi-site operators with mixed-vendor camera estates
security · operations · data-fabric

A Viseron-shaped service that watches any RTSP stream and emits structured labels (people, vehicles, animals, parcels) for downstream automation.

// Cost
Self-funded internal infrastructure; pays for itself the moment you build the second app on top of it.
// Duration
In production across our entire site portfolio.
// 01 · The problem

Customers had cameras from four or five vendors. Each vendor's analytics are locked to their own app. There was no way to write a single rule like 'tell me when anyone enters Zone B at night, regardless of which camera saw them.'

// 02 · What we did

Pull RTSP from every camera, run a shared inference pipeline, and expose state as a JSON document that other services can query or subscribe to. This treats vision as a substrate, not a product.
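The scene-state document can be sketched roughly as below. The field names (`site_id`, `camera_id`, `zone`, and so on) are illustrative assumptions, not the production schema; the point is that downstream services see plain, queryable JSON rather than vendor APIs.

```python
import json
from dataclasses import dataclass, field, asdict

# Hypothetical sketch of the scene-state contract; field names are
# illustrative, not the actual production schema.
@dataclass
class Detection:
    label: str        # e.g. "person", "vehicle", "animal", "parcel"
    camera_id: str    # source stream, recorded but not relied on
    zone: str         # logical zone, decoupled from any camera layout
    confidence: float

@dataclass
class SceneState:
    site_id: str
    detections: list = field(default_factory=list)

    def to_json(self) -> str:
        # asdict() recurses into nested dataclasses, so the whole
        # state serialises to one JSON document.
        return json.dumps(asdict(self), sort_keys=True)

state = SceneState(site_id="site-01")
state.detections.append(Detection("person", "cam-north-02", "zone-b", 0.92))
doc = state.to_json()
```

A consumer never touches the inference pipeline; it only parses `doc` and reacts to labels and zones.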

// 03 · What the AI did

Standard detection + tracking + per-zone occupancy.
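Per-zone occupancy reduces to a point-in-polygon test over tracked objects. A minimal sketch, assuming zones are polygons in image coordinates and each track carries an anchor point (the track and zone structures here are illustrative):

```python
# Ray-casting point-in-polygon: toggles `inside` each time a
# horizontal ray from (x, y) crosses a polygon edge.
def point_in_polygon(x, y, polygon):
    inside = False
    j = len(polygon) - 1
    for i in range(len(polygon)):
        xi, yi = polygon[i]
        xj, yj = polygon[j]
        if (yi > y) != (yj > y) and x < (xj - xi) * (y - yi) / (yj - yi) + xi:
            inside = not inside
        j = i
    return inside

def zone_occupancy(tracks, zones):
    # Count tracked objects whose anchor falls inside each zone.
    counts = {name: 0 for name in zones}
    for track in tracks:
        x, y = track["anchor"]  # e.g. bottom-centre of the bounding box
        for name, poly in zones.items():
            if point_in_polygon(x, y, poly):
                counts[name] += 1
    return counts

zones = {"zone-b": [(0, 0), (10, 0), (10, 10), (0, 10)]}
tracks = [{"id": 1, "label": "person", "anchor": (5, 5)},
          {"id": 2, "label": "person", "anchor": (15, 5)}]
# zone_occupancy(tracks, zones) -> {"zone-b": 1}
```

Anchoring on the bottom-centre of the box, rather than its centre, keeps tall objects from being counted into zones behind them.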

// 04 · What humans did

Defined the schema (the contract that downstream services depend on) and the operator-facing rule grammar.
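The rule grammar can be illustrated with a single rule type, "label in zone during time window". This is a deliberately reduced sketch (the real grammar is richer); the rule matches on zone and label only, so which camera produced the detection is irrelevant:

```python
from datetime import time

def rule(label, zone, start, end):
    # Returns a predicate over (detection, wall-clock time).
    # Windows that wrap midnight (start > end) are handled explicitly.
    def matches(detection, now):
        if start > end:
            in_window = start <= now or now < end
        else:
            in_window = start <= now < end
        return (detection["label"] == label
                and detection["zone"] == zone
                and in_window)
    return matches

# 'Tell me when anyone enters Zone B at night' as a rule object.
night_zone_b = rule("person", "zone-b", time(22, 0), time(6, 0))

det = {"label": "person", "zone": "zone-b", "camera_id": "cam-07"}
night_zone_b(det, time(23, 30))  # True: inside the overnight window
night_zone_b(det, time(12, 0))   # False: daytime
```

Because rules bind to the schema rather than to cameras, swapping or adding vendors never invalidates an operator's rules.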

// 05 · The outcome

A single 'scene state' that drives signage, alerts, dashboards, and the broadcast-studio cinematic camera switcher.

// 06 · What broke

First version coupled too tightly to one camera vendor's RTSP quirks. Second version assumes nothing about the source.

// 07 · What works

Treat AI output as data, not as features. Ship it as a contract; let the apps that consume it be small and replaceable.

// 08 · Reusable lessons
  1. If you're going to use vision in more than one product, build the labelling layer once and let everything subscribe to it.
  2. JSON schemas age better than ML model versions.
  3. Vendor-neutral substrates are the highest-leverage AI investment for multi-site operators.