A real-time computer-vision system that monitors restaurant table occupancy from a single overhead camera, with no table sensors and no added hardware. It detects guests the moment they sit, tracks each table's state live, and flags the groups that have been waiting too long.

A manager's only way to read the room was to walk it — every ten minutes, counting heads, guessing which group had been waiting, missing the table that just cleared. Sensor-based systems existed, but they meant drilling hardware into furniture nobody wanted to touch.
The brief: read table occupancy from the camera already on the ceiling. Detect when guests sit, track each table's state, time how long they've been there, and surface it all on one live screen, with overlays drawn directly on the video feed.
YOLO11 detects every person on the feed with per-detection confidence, and ByteTrack keeps a stable ID on each one across frames — so a guest leaning over isn't mistaken for a new arrival.
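A minimal sketch of the detect-and-track step, assuming the Ultralytics Python package and its bundled ByteTrack configuration. The weights file `yolo11n.pt`, the 0.4 confidence cut-off, the helper names, and the choice of the box centre as each person's anchor point are illustrative assumptions, not details taken from the project.

```python
from typing import Dict, List, Sequence, Tuple

PERSON_CLASS_ID = 0  # "person" in the COCO label set used by pretrained YOLO11 weights


def load_model(weights: str = "yolo11n.pt"):
    """Load YOLO11 weights (assumed file name). Imported lazily so the
    pure-Python helper below works without the heavy dependency."""
    from ultralytics import YOLO
    return YOLO(weights)


def track_people(model, frame, min_conf: float = 0.4):
    """Run detection plus ByteTrack on one frame; return boxes, IDs, confidences."""
    result = model.track(
        frame,
        persist=True,              # keep tracker state between consecutive calls
        tracker="bytetrack.yaml",  # ByteTrack config shipped with Ultralytics
        classes=[PERSON_CLASS_ID],
        conf=min_conf,
        verbose=False,
    )[0]
    if result.boxes.id is None:    # no tracked boxes in this frame
        return [], [], []
    return (
        result.boxes.xyxy.tolist(),
        result.boxes.id.int().tolist(),
        result.boxes.conf.tolist(),
    )


def to_tracked_people(
    boxes_xyxy: Sequence[Sequence[float]],
    track_ids: Sequence[int],
    confidences: Sequence[float],
) -> Dict[int, dict]:
    """Map each track ID to an anchor point (box centre, an assumption for an
    overhead view) and the detection confidence."""
    people: Dict[int, dict] = {}
    for (x1, y1, x2, y2), tid, conf in zip(boxes_xyxy, track_ids, confidences):
        anchor: Tuple[float, float] = ((x1 + x2) / 2.0, (y1 + y2) / 2.0)
        people[tid] = {"anchor": anchor, "conf": conf}
    return people
```

Because the track ID stays the same while a guest shifts or leans, downstream logic can key on the ID rather than on the raw box, which is what keeps one person from being counted as a new arrival.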
Each table is drawn once as a polygon ROI mapped to the camera's perspective. A detection inside a zone marks that table occupied, and a per-table state machine moves it from Free to Occupied, then to Alert once the group passes the wait threshold.
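The zone test and the per-table state machine can be sketched in plain Python as below. This is an illustration under stated assumptions, not the project's code: the ray-casting test stands in for whatever point-in-polygon routine was actually used, the 900-second threshold is a placeholder, and a table is cleared on the first empty frame, where a production system would probably debounce.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Sequence, Tuple

Point = Tuple[float, float]


class TableState(Enum):
    FREE = "Free"
    OCCUPIED = "Occupied"
    ALERT = "Alert"


def point_in_polygon(pt: Point, polygon: Sequence[Point]) -> bool:
    """Ray casting: toggle 'inside' for each edge crossed by a ray going right from pt."""
    x, y = pt
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge spans the ray's height
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


@dataclass
class Table:
    name: str
    polygon: Sequence[Point]
    wait_threshold_s: float = 900.0  # hypothetical threshold, not from the source
    state: TableState = TableState.FREE
    seated_at: Optional[float] = None

    def update(self, anchors: Sequence[Point], now: float) -> TableState:
        """Advance the state machine from this frame's person anchor points."""
        occupied = any(point_in_polygon(p, self.polygon) for p in anchors)
        if not occupied:
            # Simplification: clear immediately when the zone is empty.
            self.state, self.seated_at = TableState.FREE, None
        else:
            if self.seated_at is None:
                self.seated_at = now  # dwell timer starts when guests sit
            dwell = now - self.seated_at
            self.state = (
                TableState.ALERT if dwell >= self.wait_threshold_s
                else TableState.OCCUPIED
            )
        return self.state
```

Calling `update` once per frame with the anchor points of all tracked people is enough to drive every table, since each `Table` only looks at the points that fall inside its own polygon.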
A live sidebar shows all seven tables, their state, total persons detected, and dwell time per group — counting from the second they sat. Tables that pass the wait threshold raise an alert, rendered straight onto the feed with OpenCV.
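As a rough sketch of how the sidebar text might be produced and drawn, the helpers below format dwell time as minutes and seconds, build one line per table, and write the lines onto a frame with `cv2.putText`. The line layout, colours, font settings, and function names are assumptions made for illustration.

```python
from typing import List, Optional, Sequence, Tuple

# (table name, state label, time the group sat down or None if the table is free)
TableRow = Tuple[str, str, Optional[float]]


def format_dwell(seconds: float) -> str:
    """Format a duration in seconds as MM:SS."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes:02d}:{secs:02d}"


def sidebar_lines(rows: Sequence[TableRow], now: float, total_persons: int) -> List[str]:
    """Build the text lines for the sidebar: a person count, then one line per table."""
    lines = [f"Persons: {total_persons}"]
    for name, state, seated_at in rows:
        dwell = format_dwell(now - seated_at) if seated_at is not None else "--:--"
        lines.append(f"{name}: {state} {dwell}")
    return lines


def draw_sidebar(frame, lines: Sequence[str], origin: Tuple[int, int] = (10, 30)):
    """Draw the lines onto a BGR frame in place; tables in Alert are drawn in red."""
    import cv2  # imported lazily so the text helpers above need no OpenCV install

    x, y = origin
    for i, line in enumerate(lines):
        color = (0, 0, 255) if "Alert" in line else (255, 255, 255)
        cv2.putText(frame, line, (x, y + i * 28),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.6, color, 2)
    return frame
```

Keeping the text construction separate from the drawing call makes the dwell and alert logic easy to check without a video frame in hand.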
I used to walk the floor every ten minutes. Now one screen tells me which tables are free, which are occupied, and which group's been waiting too long — without me moving.
Every table mapped as a polygon zone on a single overhead feed — occupancy, dwell time, and alerts, all read from one frame.
No table sensors, no IoT install, no wiring. The system runs on the camera the restaurant already had on the ceiling.
Detection, tracking, zone logic, and dashboard rendering all run on every frame, so a state change appears on screen as soon as the frame that shows it is processed.
Live dashboard, detection overlay, and zone map — the full CV pipeline visible in one monitoring screen.