SenseCAP Watcher - AI Assistant that actively interacts with the world
You walk into your study, and the SenseCAP Watcher on your desk instantly senses your presence. The screen lights up, it greets you with a smile, and displays your to-do list for the day—no need for you to speak a word or even lift a finger.
SenseCAP Watcher is redefining how humans and devices interact.
It possesses a unique "Frictionless Trigger" conversational ability, initiating interaction proactively as you approach. This isn't just a technical upgrade; it's a fundamental shift in interaction logic: For the first time, the initiative transitions from the human to the machine, achieving an experiential leap from "humans adapting to machines" to "machines actively adapting to humans."
Watcher's "Keen Eye" runs on an on-device AI vision chip (Himax) that handles object recognition and target tracking locally. Combined with the expandable XiaoZhi assistant firmware, it is more than a camera: it is a dedicated AI assistant that adapts to your needs.
Core Advantages
SenseCAP Watcher
An intelligent device integrating on-device AI vision and a flexible development environment, designed to help you easily build and deploy personalized AI applications.
Offline "Keen Eye": Efficient On-Device Vision Processing
Equipped with a high-performance AI vision chip (Himax), all image processing is completed locally on the device. Enjoy swift responses and enhanced privacy protection—your data doesn't need to be uploaded to the cloud.
Build AI Apps with Zero Code, As Simple As Lego
Leverage the SenseCraft AI platform to deploy AI models and quickly build applications for specific scenarios with just a few clicks. Deploying AI models becomes as easy as photo editing, requiring absolutely no programming background.
Flexible Integration Platform and Tools
Based on the XiaoZhi MCP architecture, you can freely define new AI tools and quickly integrate local or cloud services, seamlessly fitting into existing smart systems.
Hardware Expandability: Additional Interface Support
Features GPIO expansion interfaces for easy connection to various sensors and actuators, enabling deep customization and supporting more creative implementations.
How It Works
SenseCAP Watcher can be thought of as a "modular" intelligent robot, with its core operation relying on the collaboration of three main components:
- Hardware Core (ESP32S3): Drives the underlying hardware, processes camera data, and maintains stable connections to cloud services.
- Visual Nerve (Himax AI Chip): A powerful on-device AI vision processing unit that grants the device real-time environmental perception capabilities.
- Cloud Brain (Backend Services): Responsible for AI role configuration, MCP tool scheduling, and unified device management.
The process can be simplified as follows:
Watcher's "eyes" (camera) capture images → The "visual nerve" (Himax chip) performs recognition and wakes up XiaoZhi → The "brain" (backend AI, MCP services) understands the context and responds.
This modular design offers high flexibility and extensibility.
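The last step of the flow above, where the backend asks the device to run a tool, can be illustrated with a generic MCP message. This is a minimal sketch that assumes the standard MCP `tools/call` request shape (JSON-RPC 2.0). The tool name `camera.describe_scene` and its `question` argument are hypothetical and are not taken from the XiaoZhi firmware.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests carry a unique id per call


def make_tool_call(tool_name: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(request)


# Hypothetical tool name and argument, used only for illustration.
message = make_tool_call("camera.describe_scene", {"question": "What can you see?"})
print(message)
```

The real tool names exposed by a device depend on its firmware and on the tools you register, so treat the payload as a shape reference only.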

Getting Started Guide
Quick Start in Three Steps
Activate your Watcher in just three steps:
- Power On the Device: Provide power via the Type-C data cable; if using a battery, press and hold the side button to turn it on.
- Connect to Network: After booting, the device creates a Wi-Fi hotspot. Connect your phone or computer to this hotspot. See: Watcher Network Setup.
- Configure and Activate: Open 192.168.4.1 in your browser to configure Wi-Fi for Watcher, then follow the instructions on the SenseCraft AI platform to complete activation. See: Watcher Web Control Panel.
If activation fails, please confirm the verification code is correct and the device authentication information is not lost. The following actions typically cause loss of authentication information:
- The firmware was overwritten by another program.
- A major firmware update was performed without backing up authentication information.
- A completely new firmware was flashed.
If reactivation is needed, please send the device's STA MAC address (available in the serial logs) to [email protected] for assistance. Please refer to: Flashing Authentication Info.
Model and Firmware Updates: Customize Exclusive Skills
Developers or advanced users can flash different models or firmware to empower Watcher with more powerful, exclusive capabilities.
1. Flashing AI Models
Leveraging the built-in Himax on-device AI vision chip, you can easily deploy new recognition models via the SenseCraft AI platform:
- Connect your computer to the Type-C interface on the bottom of the Watcher.
- On the SenseCraft AI platform, select SenseCAP Watcher under Workspace and choose the port with the smaller serial number for model flashing.
- If the camera doesn't work properly after flashing, try restarting the device.
2. Flashing XiaoZhi Firmware
SenseCraft AI
Coming Soon.
Single File Flash (Recommended)
- Latest Firmware v1.8.8: Download Link
- Extract the downloaded firmware package and use the esptool.py tool to perform the flash:
esptool.py -p /dev/ttyACM0 -b 2000000 write_flash 0 merged-binary.bin
(Note: Adjust the port /dev/ttyACM0 according to your system, e.g., COM3 on Windows.)
Compile from Source (For Developers)
- Requires a pre-configured IDF toolchain environment. Clone the code repository: GitHub Repository
- Enter the cloned repository and execute the following commands:
cd xiaozhi-esp32
idf.py set-target esp32s3
idf.py menuconfig # In the GUI, select the board type as SenseCAP Watcher
idf.py build flash monitor
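For the single-file flash, the port is the only part of the esptool.py command that changes between machines. The sketch below assembles the same command shown above for the current operating system. The helper names `default_port` and `build_flash_command` are illustrative, and the fallback ports are simply the examples from the note (`/dev/ttyACM0`, `COM3`), so check which port your Watcher actually uses.

```python
import shlex
import sys


def default_port() -> str:
    """Return the example port from the text for the current OS (an assumption)."""
    return "COM3" if sys.platform.startswith("win") else "/dev/ttyACM0"


def build_flash_command(port: str, firmware: str = "merged-binary.bin",
                        baud: int = 2000000) -> list:
    """Build the esptool.py argument list for a single-file flash at offset 0."""
    return ["esptool.py", "-p", port, "-b", str(baud), "write_flash", "0", firmware]


cmd = build_flash_command(default_port())
print(shlex.join(cmd))  # prints the command line for review before running it
# To actually flash, pass the list to subprocess.run(cmd, check=True).
```

Building the command as a list and printing it first lets you confirm the port and file name before anything is written to the device.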
Start a Conversation
Now that you have a basic understanding of SenseCAP Watcher, you can explore its conversational and tool-calling capabilities, such as using the camera function or setting time-based strategies for proactive interaction.
For example, if the agent's role name is set to Watcher, the settings would look like this:

Wake the Device
The device remains in a standby state when not awakened, meaning it does not listen to surrounding conversations. Once awakened, it begins listening and can engage in dialogue or execute operations based on user instructions.
- Visual Wake-Up
- Voice Wake-Up
- Button Wake-Up
The current visual wake-up function offers the following configuration options:
- Target ID (target): Specifies the target ID to detect. This ID depends on the visual model used; the default value is 0.
- Detection Duration (duration): Unit is seconds, used to adjust the sensitivity of the visual wake-up. The default is 1 second (this default does not include the 1-second debounce processing).
- Confidence Threshold (threshold): The lower confidence limit for the visual model to recognize an object, used to adjust detection sensitivity. Represented as a percentage, the default is 75%.
- Cooldown Period (interval): Unit is seconds, indicating the wait time required after one conversation ends before triggering again, used to avoid frequent interruptions by the same object. The default is 8 seconds.
For example, you can adjust the model's sensitivity by modifying the threshold parameter. If you find the current threshold too strict, simply say to Watcher: "Please set the confidence threshold to 60%".
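To see how the four options might interact, here is a small Python model of a wake-up gate. It is a sketch built only from the parameter descriptions above, not the firmware's actual logic: the class `VisualWakeGate`, its methods, and the per-detection input format are assumptions, and the extra 1-second debounce is not modeled.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class VisualWakeGate:
    target: int = 0          # target ID to detect (depends on the vision model)
    duration: float = 1.0    # seconds the target must stay in view
    threshold: float = 75.0  # minimum confidence, in percent
    interval: float = 8.0    # cooldown in seconds after a conversation ends
    _seen_since: Optional[float] = field(default=None, init=False)
    _last_end: Optional[float] = field(default=None, init=False)

    def conversation_ended(self, now: float) -> None:
        """Record when a conversation finished so the cooldown starts."""
        self._last_end = now
        self._seen_since = None

    def update(self, now: float, target_id: int, confidence: float) -> bool:
        """Feed one detection; return True when the device should wake."""
        if target_id != self.target or confidence < self.threshold:
            self._seen_since = None  # sighting broken, start over
            return False
        if self._seen_since is None:
            self._seen_since = now
        in_cooldown = (
            self._last_end is not None and now - self._last_end < self.interval
        )
        return not in_cooldown and now - self._seen_since >= self.duration


# A 68% detection is rejected at the default 75% but accepted at 60%.
strict, relaxed = VisualWakeGate(), VisualWakeGate(threshold=60.0)
for t in (0.0, 1.0):
    print(t, strict.update(t, 0, 68.0), relaxed.update(t, 0, 68.0))
```

In this model, lowering `threshold` admits weaker detections, while `duration` and `interval` control how quickly and how often a wake-up can fire.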
Using the Camera
Say to Watcher:
- "Please turn on the camera"
- "What can you see?"
- "What's in front of you?"
Resources
Here are some advanced resources to help you further expand the application boundaries of SenseCAP Watcher according to your needs, whether for on-premise deployment, privacy protection, or building personalized knowledge bases:
- AI Conversation-Driven Smart Home - Control smart home devices like lights, AC, and curtains directly via voice through SenseCAP Watcher, saying goodbye to manual operation.
References
- SenseCAP Watcher Hardware Overview - Hardware resources, structural design, etc., of SenseCAP Watcher.
- SenseCAP Watcher Operation Guideline - Basic logic for turning SenseCAP Watcher on and off.
- Training On-Device Vision Models for SenseCAP Watcher - A guide to training on-device vision models for SenseCAP Watcher, including data preparation, model training, and deployment.
- SenseCAP Watcher - Web Control Panel
- SenseCAP Watcher - Device Network Setup Guide