Nov 20, 2025
For years, "smart" security cameras have been plagued by a problem: they aren't actually that smart. A shadow moving across the porch or a pet jumping on the sofa often triggers the same "Person Detected" alert as a genuine break-in.
At Reka, we built Reka Vision to change this. We designed our multimodal vision platform to natively understand video, audio, and image context just as a human would.
We are proud to announce that Reka Vision has demonstrated superior performance on SmartHome-Bench, a comprehensive benchmark developed by Wyze for video anomaly detection with multimodal foundation models.
Understanding the Home
SmartHome-Bench is one of the most challenging public evaluations for vision-language models. Unlike traditional benchmarks that focus on public surveillance (like traffic or street crowds), this dataset focuses on the unstructured and highly variable environments of smart homes.

Overview of the video anomaly taxonomy in smart homes.
The benchmark tests a vision AI's ability to detect anomalies across seven distinct categories, including:
Senior Care: Distinguishing between routine activities and distress signals (e.g., falls).
Baby & Kid Monitoring: Identifying unsafe situations versus normal play.
Security: Detecting potential intruders while ignoring authorized residents or pets.
Pet Monitoring & Wildlife: Recognizing animal behaviors that require attention.
In these tests, competitor solutions struggle. They hallucinate events, fail to understand temporal context (the sequence of actions), lack the reasoning capabilities to determine why an event is anomalous, or fail to detect the relevant anomalous event in the first place.
How Reka Vision Stands Apart
On SmartHome-Bench, Reka Vision performed significantly better than competitor solutions, scoring the highest on recall. Recall measures the share of real anomalies a system actually catches, and it matters most here: in surveillance applications, a false negative could mean missing a break-in or a safety incident.

Smart home eval. Reported numbers are recall scores, since the cost of missing security events is high.
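For readers less familiar with the metric, here is a minimal sketch of how recall is computed and why it is the number to watch when missed events are costly. The labels and the two "detectors" below are made-up toy data for illustration only; they are not SmartHome-Bench results, and the function names are our own rather than part of any Reka API.

```python
def recall(y_true, y_pred):
    """Recall = TP / (TP + FN), with 1 = anomaly and 0 = normal."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp + fn == 0:
        raise ValueError("recall is undefined without any true anomalies")
    return tp / (tp + fn)


def precision(y_true, y_pred):
    """Precision = TP / (TP + FP), with 1 = anomaly and 0 = normal."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    if tp + fp == 0:
        raise ValueError("precision is undefined without any positive predictions")
    return tp / (tp + fp)


# Toy ground truth: 4 anomalous clips followed by 4 normal clips (hypothetical).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]

# Hypothetical detector A: never raises a false alarm, but misses half the anomalies.
cautious = [1, 1, 0, 0, 0, 0, 0, 0]

# Hypothetical detector B: catches every anomaly at the cost of one false alarm.
sensitive = [1, 1, 1, 1, 1, 0, 0, 0]

for name, y_pred in [("cautious", cautious), ("sensitive", sensitive)]:
    print(f"{name}: recall={recall(y_true, y_pred):.2f}, "
          f"precision={precision(y_true, y_pred):.2f}")
```

In this toy setup the cautious detector has perfect precision yet only 0.50 recall, meaning two of the four anomalous clips went unnoticed. The sensitive detector reaches 1.00 recall with a single false alarm. When a missed event could be a break-in or a fall, the second trade-off is usually the safer one.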
While other solutions often require complex "prompt engineering" or large, separate pipelines just to guess what is happening, Reka Vision leverages its native multimodal capabilities to:
Reason Temporally: It doesn't just look at frames in isolation; it understands the story of the video—recognizing that a door opening is normal, but a door opening at 3 AM followed by a masked individual is an anomaly.
Reduce False Positives: By deeply understanding the scene, Reka Vision filters out the "noise" (like a cat zooming past a camera) that plagues traditional motion-based systems.
Explain the 'Why': Reka Vision doesn't just flag an alert; it can explain what happened. Instead of "Motion Detected," it can report: "A delivery driver placed a package behind the pillar and left."
Beyond the Smart Home
While this benchmark focuses on the home, the implications for enterprise security are huge. The same core intelligence that allows Reka Vision to distinguish a playing child from a falling toddler allows it to distinguish:
Retail: A customer browsing a shelf vs. a shoplifting event.
National Security: A fishing boat vs. unauthorized activity in sovereign territory.
Enterprise Surveillance: A security guard on patrol vs. an unauthorized breach.
Reka Vision's success on SmartHome-Bench shows that our platform is ready to serve as the core for the next generation of security applications, whether that means keeping a single home safe or monitoring a global network of enterprise facilities.
Experience Reka Vision
Real-time video understanding is here. Reka Vision is available today to help developers and enterprises build smarter, safer, and more efficient video analysis tools.
Contact us for a demo of Reka Vision or try it out yourself on our playground.

