Reka Edge

Frontier-level visual intelligence. Optimized for deployment. Engineered for speed.

  • Leading Performance

    across multimodal benchmarks

  • 2.4x

    faster time to first token on single requests

  • Easy Deployment

Hugging Face • vLLM (see the serving sketch below)

  • Open Source

    Weights available on Hugging Face
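
For orientation, here is a minimal offline-serving sketch using vLLM's Python API. The Hugging Face repo ID and the prompt template below are assumptions for illustration; consult the official model card on Hugging Face for the real identifiers.

```python
# Minimal vLLM serving sketch. The repo ID and prompt format are ASSUMPTIONS;
# check the official Reka Edge model card for the actual values.
from vllm import LLM, SamplingParams
from PIL import Image

llm = LLM(model="RekaAI/reka-edge-2")  # hypothetical Hugging Face repo ID

image = Image.open("frame.jpg")  # a single photo or sampled video frame
params = SamplingParams(max_tokens=128, temperature=0.2)

outputs = llm.generate(
    {
        "prompt": "USER: <image>\nWhat objects are on the workbench?\nASSISTANT:",
        "multi_modal_data": {"image": image},  # vLLM's multimodal input hook
    },
    params,
)
print(outputs[0].outputs[0].text)
```

The same weights can also be exposed as an OpenAI-compatible endpoint with `vllm serve`.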

[Reka Edge 2]

Fastest Edge Model for Physical AI

Reka Edge is the fastest vision-language model in the 7B-8B class designed for real-world systems. It processes videos and images with fewer tokens, which translates into concrete reductions in latency and memory footprint.

Perfect for Edge Deployment

Built to redefine intelligence at the edge, Reka Edge delivers strong performance across video and image understanding, object detection, and tool use, making it one of the best options for real-world edge use cases such as:

  • Real-time video analysis

  • Object detection and scene understanding in robotics and drone applications

  • Media and content highlight generation

  • Multimodal agent orchestration

Engineered for Responsiveness

Reka Edge achieves up to 2.4x faster time to first token on single requests.

Reka Edge 2 is built on:

  • 660M-parameter ConvNeXT V2 vision encoder for efficient visual processing

  • 6B-parameter language backbone for reasoning and generation

  • 7B total parameters, balancing high-end performance and resource efficiency
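
As a rough mental model of how these pieces compose, here is a schematic sketch; it is not Reka's implementation, and the projector layer and all dimensions are assumptions:

```python
# Schematic composition of the described encoder + backbone; NOT Reka's code.
# Parameter counts come from the bullets above; everything else is assumed.
import torch.nn as nn

class EdgeVLMSketch(nn.Module):
    def __init__(self, vision_encoder: nn.Module, language_model: nn.Module):
        super().__init__()
        self.vision_encoder = vision_encoder    # ~0.66B params (ConvNeXT V2)
        self.language_model = language_model    # ~6B params
        self.projector = nn.Linear(1024, 4096)  # assumed vision-to-LM widths

    def forward(self, pixels, text_tokens):
        # Compress pixels into a small set of visual tokens, then let the
        # language backbone reason over visual and text tokens together.
        visual_tokens = self.projector(self.vision_encoder(pixels))
        return self.language_model(visual_tokens, text_tokens)  # assumed interface
```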

Real-world Use Cases

Robotics

Problem

Physical AI requires fluid, continuous interaction with the unstructured world. Robots cannot wait seconds for a cloud VLM to tell them how to grasp an unfamiliar object. They need exact spatial coordinates and sub-second latency for control without awkward, dangerous stutters in movement or costly custom detector stacks.

Solution

  1. Continuous Edge Perception: The robot's stereo cameras stream high-frame-rate visual data directly to onboard edge compute (e.g., NVIDIA Jetson).

  2. Contextual Encoding (Reka Edge 2): The ConvNeXT V2 backbone ingests the video. Highly efficient processing prevents the robot's onboard memory from overflowing during continuous operation.

  3. Grounded Tool Localization: The model maps the environment, extracting precise bounding boxes for tools, target objects, and obstacles using conversational pointing (e.g., "Where is the 10mm wrench?").

  4. Agentic Motor Control: Using its tool-use framework, the VLM actively orchestrates the robot's hardware APIs (e.g., adjust_grip_pressure(), move_arm_to(x, y, z)) to complete complex, multi-step tasks (a minimal sketch follows this list).
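
A minimal sketch of that agentic loop follows. The function names echo the examples above, but their signatures, the tool registry, and the model's reply schema are all assumptions:

```python
# Hypothetical tool-use dispatch; names echo the prose, everything else assumed.
import json

def move_arm_to(x: float, y: float, z: float) -> str:
    # Would command the arm controller over the robot's hardware API.
    return f"arm at ({x}, {y}, {z})"

def adjust_grip_pressure(newtons: float) -> str:
    # Would set gripper force on the end effector.
    return f"grip set to {newtons} N"

TOOLS = {"move_arm_to": move_arm_to, "adjust_grip_pressure": adjust_grip_pressure}

def run_step(model_reply: dict) -> str:
    """Dispatch one tool call emitted by the VLM (assumed reply schema)."""
    call = model_reply["tool_call"]
    return TOOLS[call["name"]](**json.loads(call["arguments"]))

# e.g., after localizing the wrench, the model reaches for it:
reply = {"tool_call": {"name": "move_arm_to",
                       "arguments": json.dumps({"x": 0.42, "y": 0.10, "z": 0.05})}}
print(run_step(reply))  # arm at (0.42, 0.1, 0.05)
```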

Automotive

Problem

Automakers struggle to deploy rich in-vehicle AI because existing automotive hardware cannot run large models locally. Cloud connectivity introduces unacceptable latency and severe privacy concerns regarding cabin footage, while legacy voice systems remain too rigid to handle natural, multimodal interactions based on what the driver is actually looking at or doing.

Solution

  1. Multi-Camera Ingestion: The vehicle's centralized computer simultaneously processes feeds from the dashboard, steering column, and rear-seat monitors.

  2. Edge-First Processing: Reka Edge 2 runs entirely offline on the vehicle's existing System on Chip (SoC), drastically reducing power consumption.

  3. Temporal Video Analysis: The model evaluates the sequence of frames to understand unfolding events, such as a child unbuckling a seatbelt, or a driver pointing at a storefront and asking, "When does that place close?"

  4. Agentic Vehicle Control: The model safely routes tasks, triggering ADAS safety alerts while calling infotainment APIs to adjust cabin temperature or update navigation (a routing sketch follows this list).
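
A minimal sketch of that safety-first routing; the intent names and both downstream calls are illustrative assumptions, not a real ADAS or infotainment interface:

```python
# Hypothetical safety-first task router; all names and channels are assumed.
SAFETY_INTENTS = {"seatbelt_alert", "driver_distraction"}

def trigger_adas_alert(intent: str, payload: dict) -> str:
    return f"ADAS alert: {intent} ({payload})"      # would hit the safety channel

def call_infotainment_api(intent: str, payload: dict) -> str:
    return f"infotainment: {intent} -> {payload}"   # would hit comfort features

def route(intent: str, payload: dict) -> str:
    # Safety-critical intents always take the dedicated alert path.
    if intent in SAFETY_INTENTS:
        return trigger_adas_alert(intent, payload)
    return call_infotainment_api(intent, payload)

print(route("seatbelt_alert", {"seat": "rear-left"}))
print(route("set_cabin_temp", {"celsius": 21}))
```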

AR Wearables

Problem

Deploying AI to frontline workers in remote locations—like offshore oil rigs, deep mines, or complex utility infrastructure—means dealing with zero internet connectivity. However, projecting context-aware assistance onto AR wearables requires continuous visual processing that typically drains mobile batteries in minutes and causes the headset to overheat.

Solution

  1. First-Person Perception: The smart glasses' front-facing cameras capture the wearer's exact field of view in real time.

  2. Thermal-Friendly Processing (Reka Edge 2): Because it extracts only 64 tokens per image tile, the compute overhead is drastically reduced, preserving battery life and thermals on the wearable device (see the token arithmetic after this list).

  3. Offline Visual Q&A: The technician can ask complex questions ("Which of these valves regulates the secondary pressure loop?") and receive instant, grounded answers.

  4. Contextual Tool Use: The VLM understands what the user is looking at and interacts with local, cached technical manuals or databases to project schematics directly into the HUD.
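
To make the token budget concrete, here is a back-of-the-envelope calculation. The 64 tokens per tile comes from the description above; the tile count and sampling rate are assumptions:

```python
# Rough token arithmetic; only tokens_per_tile comes from the text above.
tiles_per_frame = 4      # assumed: a frame split into a 2x2 grid of tiles
tokens_per_tile = 64     # per the Reka Edge 2 description
frames_sampled = 30      # assumed: one frame per second over a 30 s window

visual_tokens = tiles_per_frame * tokens_per_tile * frames_sampled
print(visual_tokens)     # 7680 visual tokens for the whole window
```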

Search-and-Rescue Drones

Problem

During search-and-rescue or reconnaissance operations, operators work in chaotic environments often completely devoid of cellular connectivity. Streaming HD footage back to a command center introduces lag and risks connection drops, while manual review of footage wastes precious, life-saving time.

Solution

  1. Aerial Perception: Drones equipped with high-resolution optical and thermal cameras survey a disaster zone.

  2. Token-Efficient Scanning: Reka Edge 2 runs locally on the drone's onboard compute, continuously scanning the feed without being overwhelmed by the massive amount of visual data.

  3. Anomaly & Human Detection: The model reasons over the chaotic visual data to distinguish between debris, wildlife, and human survivors, or identifies the visual signatures of a developing fire.

  4. Autonomous Alerting: The VLM instantly logs the exact GPS coordinates and invokes flight-control APIs to hover over the target, dropping a pin for ground teams (sketched below).
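
A minimal sketch of that alerting step; the flight-control calls and the detection payload are assumptions standing in for a real drone SDK:

```python
# Hypothetical alerting hook; a real integration would use the drone's SDK.
def hover_at(lat: float, lon: float, alt_m: float) -> None:
    print(f"hovering at ({lat}, {lon}) @ {alt_m} m")  # would call the autopilot

def drop_pin(lat: float, lon: float, label: str) -> None:
    print(f"pin '{label}' at ({lat}, {lon})")         # would notify ground teams

detection = {"label": "human_survivor", "lat": 46.8021, "lon": -71.2429}
if detection["label"] == "human_survivor":
    hover_at(detection["lat"], detection["lon"], alt_m=30.0)
    drop_pin(detection["lat"], detection["lon"], label=detection["label"])
```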

Live Media and Broadcast

Problem

Sports networks and live-streaming platforms process thousands of hours of high-definition video daily. Manually clipping highlights and tagging metadata is labor-intensive. However, pushing raw 4K broadcast feeds into cloud VLMs for real-time analysis is cost-prohibitive due to massive egress fees, token bloat, and API rate limits.

Solution

  1. On-Premise Ingestion: Live broadcast feeds are routed through a localized edge server rack running Reka Edge 2 directly in the stadium's production truck.

  2. Continuous Video Analysis: The ConvNeXT V2 vision encoder compresses the HD broadcast, tracking complex, fast-moving gameplay continuously over time.

  3. Semantic Tagging: The model understands the nuance between a routine play and a game-winning moment, accurately tagging timestamps and generating descriptive captions for the production team.

  4. Automated Clipping: The model interfaces with the studio's CMS via API to autonomously cut the clip and push it to social media channels within seconds of the play (see the sketch after this list).
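
A minimal sketch of that clipping call; the endpoint, authentication, and payload schema are assumptions, not a specific vendor's API:

```python
# Hypothetical CMS hand-off; endpoint and schema are placeholders.
import requests

def push_clip(start_s: float, end_s: float, caption: str) -> None:
    resp = requests.post(
        "https://cms.example.com/api/clips",  # placeholder endpoint
        json={"start": start_s, "end": end_s, "caption": caption,
              "publish_to": ["social"]},
        timeout=10,
    )
    resp.raise_for_status()

# e.g., the model tags a game-winning play from 1:32:10 to 1:32:24:
push_clip(5530.0, 5544.0, "Buzzer-beater three from half court")
```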

FAQ

What is the Escape from Duckov Clip Challenge?

The Escape from Duckov Clip Challenge is a creator contest where players use Reka Clip to create short gameplay highlights from Escape from Duckov. Creators can win cash prizes and get featured on official Reka and partner channels.

How long does the campaign run?

The campaign runs for 6 weeks, starting from the official announcement date in this Discord. All deadlines follow the timeline shared in the pinned message.

Who is eligible to participate?

  • Members of the Escape from Duckov Discord community

  • 18 years and above (or with guardian consent if under 18)

  • Employees or affiliates of Reka or partners are not eligible to win prizes

How do I enter?

To enter:

  1. Sign up at creator.reka.ai/clips/duckov (300 free campaign credits issued during the period of this challenge)*

  2. Create your best Duckov clip

  3. Join the Escape from Duckov Discord

  4. Share your clip in #duckov-reka-clip, the designated channel on the Duckov Discord

How do I submit my clip?

  1. Join the Escape from Duckov Discord

  2. Share your clip in #duckov-reka-clip, the designated channel on the Duckov Discord

Have a use case?

Tell us your use case, and our team will show you how Reka Edge 2 delivers real-time visual intelligence in your production environment.
