Real-Time Screen Streaming Architecture
⚠️ Partial
Note: This document covers the live IDE screen mirroring feature (continuous streaming to the Android Studio / Xcode plugin). This is distinct from the videoRecording MCP tool (which records a clip to a file); that tool is ✅ Implemented 🧪 Tested.
The Android video-server JAR (H.264, VirtualDisplay) is fully built and used by videoRecording. The end-to-end live mirroring pipeline (MCP relay → IDE DeviceScreenView) is in progress. iOS live streaming is 🚧 Design Only; see iOS Screen Streaming.
See the Status Glossary for chip definitions.
Real-time screen streaming from mobile devices to the IDE plugin, enabling interactive device mirroring at up to 60fps with <100ms latency.
Goals
- Continuous live streaming for device mirroring in the IDE
- Up to 60fps frame rate
- <100ms end-to-end latency for interactive use
- Support USB-connected physical devices and emulators/simulators
- Include audio streaming for complete mirroring
- Integrate with existing observation architecture
- Single device streaming at a time (no multi-device simultaneous streams)
Architecture
┌────────────────────────────────────────────────────────────┐
│                       Mobile Device                        │
│                                                            │
│  Platform-specific capture mechanism                       │
│  (see platform docs for details)                           │
│                                                            │
└─────────────────────────────┬──────────────────────────────┘
                              │
                              ▼
┌────────────────────────────────────────────────────────────┐
│                    MCP Server (Node.js)                    │
├────────────────────────────────────────────────────────────┤
│                                                            │
│  Existing sockets:                 New socket:             │
│  ├─ auto-mobile.sock (MCP proxy)   └─ video-stream.sock    │
│  ├─ observation-stream.sock           (binary frame data)  │
│  └─ performance-push.sock                                  │
│                                                            │
│  VideoStreamManager                                        │
│  ├─ Platform detection                                     │
│  ├─ Capture process lifecycle                              │
│  ├─ Frame forwarding to clients                            │
│  └─ Fallback to screenshot mode                            │
│                                                            │
└─────────────────────────────┬──────────────────────────────┘
                              │
                              ▼
┌────────────────────────────────────────────────────────────┐
│                  IDE Plugin (Kotlin/JVM)                   │
├────────────────────────────────────────────────────────────┤
│                                                            │
│  VideoStreamClient                                         │
│  ├─ Unix socket connection to video-stream.sock            │
│  ├─ Platform-specific frame decoding                       │
│  └─ Frame → ImageBitmap conversion                         │
│                                                            │
│  DeviceScreenView (Compose Desktop)                        │
│  ├─ Live frame display                                     │
│  ├─ Overlay support (hierarchy highlights, selection)      │
│  ├─ FPS indicator                                          │
│  └─ Fallback to static screenshots                         │
│                                                            │
└────────────────────────────────────────────────────────────┘
Platform-Specific Capture
The capture mechanism differs significantly between platforms:
| Platform | Capture Location | Frame Format | Decoder Needed |
|---|---|---|---|
| Android | On device | H.264 encoded | Yes (Klarity) |
| iOS | On Mac | Raw BGRA | No |
See platform-specific documentation for implementation details:
- Android Screen Streaming: VirtualDisplay + MediaCodec via shell-user JAR
- iOS Screen Streaming: AVFoundation + ScreenCaptureKit on macOS
Video Stream Socket Protocol
New Unix socket: ~/.auto-mobile/video-stream.sock
Connection Handshake
Client → Server: { "command": "subscribe", "deviceId": "<optional>" }
Server → Client: { "type": "stream_started", "deviceId": "...", "platform": "android|ios" }
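As a minimal sketch of the handshake above, the two messages can be modeled as TypeScript types with a small encoder and a validating parser. The function names `encodeSubscribe` and `parseStreamStarted` are hypothetical, and the newline-delimited JSON framing is an assumption: this document does not say how JSON messages are delimited on the socket.

```typescript
// Message shapes taken from the handshake described above.
interface SubscribeCommand {
  command: "subscribe";
  deviceId?: string; // optional per the protocol
}

interface StreamStarted {
  type: "stream_started";
  deviceId: string;
  platform: "android" | "ios";
}

// Assumption: one JSON object per line (newline-delimited framing).
function encodeSubscribe(deviceId?: string): string {
  const msg: SubscribeCommand =
    deviceId === undefined
      ? { command: "subscribe" }
      : { command: "subscribe", deviceId };
  return JSON.stringify(msg) + "\n";
}

// Validates a server reply before the client trusts its fields.
function parseStreamStarted(line: string): StreamStarted {
  const raw: unknown = JSON.parse(line);
  if (typeof raw !== "object" || raw === null) {
    throw new Error("expected a JSON object");
  }
  const { type, deviceId, platform } = raw as Record<string, unknown>;
  if (type !== "stream_started") {
    throw new Error("unexpected message type");
  }
  if (typeof deviceId !== "string") {
    throw new Error("deviceId must be a string");
  }
  if (platform !== "android" && platform !== "ios") {
    throw new Error("platform must be android or ios");
  }
  return { type: "stream_started", deviceId, platform };
}
```

The `platform` field matters most here, because it tells the client which of the two binary frame layouts to expect next.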
Frame Data
Binary frames with platform-specific headers (field sizes in bytes):
Android (H.264):
┌─────────────────┬─────────────────┬─────────────────┐
│  codec_id (4)   │    width (4)    │   height (4)    │
└─────────────────┴─────────────────┴─────────────────┘
Then per-packet: pts_flags (8) + size (4) + H.264 data
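The Android layout above could be read along the following lines. This is a sketch under stated assumptions, not the project's parser: the byte order is not specified in this document, so big-endian is assumed, and the names `parseAndroidStreamHeader` and `readAndroidPacket` are hypothetical. The 8-byte `pts_flags` field is exposed as two unsigned 32-bit halves because its internal bit layout is not described here.

```typescript
// Assumption: integers are big-endian. Set to true for little-endian.
const LITTLE_ENDIAN = false;

const STREAM_HEADER_BYTES = 12; // codec_id (4) + width (4) + height (4)
const PACKET_HEADER_BYTES = 12; // pts_flags (8) + size (4)

interface AndroidStreamHeader {
  codecId: number;
  width: number;
  height: number;
}

interface AndroidPacket {
  ptsFlagsHi: number; // upper 32 bits of pts_flags
  ptsFlagsLo: number; // lower 32 bits of pts_flags
  size: number; // length of the H.264 payload in bytes
  payload: Uint8Array; // view into the input buffer, not a copy
  nextOffset: number; // where the following packet header starts
}

function parseAndroidStreamHeader(buf: Uint8Array): AndroidStreamHeader {
  if (buf.byteLength < STREAM_HEADER_BYTES) {
    throw new RangeError("stream header is incomplete");
  }
  const view = new DataView(buf.buffer, buf.byteOffset, STREAM_HEADER_BYTES);
  return {
    codecId: view.getUint32(0, LITTLE_ENDIAN),
    width: view.getUint32(4, LITTLE_ENDIAN),
    height: view.getUint32(8, LITTLE_ENDIAN),
  };
}

// Returns null when the buffer does not yet hold a complete packet,
// so the caller can wait for more socket data and try again.
function readAndroidPacket(buf: Uint8Array, offset: number): AndroidPacket | null {
  if (buf.byteLength - offset < PACKET_HEADER_BYTES) return null;
  const view = new DataView(buf.buffer, buf.byteOffset + offset, PACKET_HEADER_BYTES);
  const ptsFlagsHi = view.getUint32(0, LITTLE_ENDIAN);
  const ptsFlagsLo = view.getUint32(4, LITTLE_ENDIAN);
  const size = view.getUint32(8, LITTLE_ENDIAN);
  const start = offset + PACKET_HEADER_BYTES;
  const end = start + size;
  if (end > buf.byteLength) return null;
  return { ptsFlagsHi, ptsFlagsLo, size, payload: buf.subarray(start, end), nextOffset: end };
}
```

Returning `null` for a short buffer rather than throwing suits a socket reader, where a packet may arrive split across several reads.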
iOS (Raw BGRA):
┌─────────────────┬─────────────────┬─────────────────┬─────────────────┐
│    width (4)    │   height (4)    │ bytesPerRow (4) │  timestamp (4)  │
└─────────────────┴─────────────────┴─────────────────┴─────────────────┘
Then: height * bytesPerRow bytes of BGRA pixel data
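A sketch of reading one iOS frame follows, mainly to show why the payload length is `height * bytesPerRow` rather than `width * height * 4`: rows may carry padding, so pixel lookups have to step by `bytesPerRow`. The byte order (big-endian here) and the names `parseIosFrame` and `pixelAt` are assumptions for illustration.

```typescript
// Assumption: header integers are big-endian. Set to true for little-endian.
const LITTLE_ENDIAN = false;
const IOS_HEADER_BYTES = 16; // width, height, bytesPerRow, timestamp (4 bytes each)
const BYTES_PER_PIXEL = 4; // B, G, R, A

interface IosFrame {
  width: number;
  height: number;
  bytesPerRow: number;
  timestamp: number;
  pixels: Uint8Array; // height * bytesPerRow bytes, view into the input
}

// Returns null while the buffer is still missing header or pixel bytes.
function parseIosFrame(buf: Uint8Array): IosFrame | null {
  if (buf.byteLength < IOS_HEADER_BYTES) return null;
  const view = new DataView(buf.buffer, buf.byteOffset, IOS_HEADER_BYTES);
  const width = view.getUint32(0, LITTLE_ENDIAN);
  const height = view.getUint32(4, LITTLE_ENDIAN);
  const bytesPerRow = view.getUint32(8, LITTLE_ENDIAN);
  const timestamp = view.getUint32(12, LITTLE_ENDIAN);
  if (bytesPerRow < width * BYTES_PER_PIXEL) {
    throw new RangeError("bytesPerRow is smaller than one row of BGRA pixels");
  }
  const end = IOS_HEADER_BYTES + height * bytesPerRow;
  if (buf.byteLength < end) return null;
  return { width, height, bytesPerRow, timestamp, pixels: buf.subarray(IOS_HEADER_BYTES, end) };
}

// Reads one pixel as [r, g, b, a], stepping rows by bytesPerRow so that
// any per-row padding is skipped.
function pixelAt(frame: IosFrame, x: number, y: number): [number, number, number, number] {
  if (x < 0 || y < 0 || x >= frame.width || y >= frame.height) {
    throw new RangeError("pixel out of bounds");
  }
  const i = y * frame.bytesPerRow + x * BYTES_PER_PIXEL;
  const p = frame.pixels;
  return [p[i + 2], p[i + 1], p[i], p[i + 3]]; // BGRA in memory, RGBA out
}
```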
Stream Control
Client → Server: { "command": "set_quality", "quality": "low|medium|high" }
Client → Server: { "command": "unsubscribe" }
Server → Client: { "type": "stream_stopped", "reason": "..." }
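One way to keep the control messages honest on the client side is a small state tracker driven by server events. The sketch below is illustrative only: the state names, `applyServerEvent`, and `canSend` are hypothetical, and the rule that `set_quality` and `unsubscribe` are only valid while streaming is an assumption this document does not state.

```typescript
type Quality = "low" | "medium" | "high";
type Platform = "android" | "ios";

type ClientCommand =
  | { command: "subscribe"; deviceId?: string }
  | { command: "set_quality"; quality: Quality }
  | { command: "unsubscribe" };

type ServerEvent =
  | { type: "stream_started"; deviceId: string; platform: Platform }
  | { type: "stream_stopped"; reason: string };

type StreamState =
  | { status: "idle" }
  | { status: "streaming"; deviceId: string; platform: Platform }
  | { status: "stopped"; reason: string };

// Pure transition function: the new state depends only on the server event.
function applyServerEvent(_state: StreamState, event: ServerEvent): StreamState {
  switch (event.type) {
    case "stream_started":
      return { status: "streaming", deviceId: event.deviceId, platform: event.platform };
    case "stream_stopped":
      return { status: "stopped", reason: event.reason };
  }
}

// Assumption: subscribe is valid when no stream is active; the other
// commands are valid only while one is.
function canSend(state: StreamState, cmd: ClientCommand): boolean {
  return cmd.command === "subscribe"
    ? state.status !== "streaming"
    : state.status === "streaming";
}
```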
Quality Presets
| Quality | Android Bitrate | Resolution | Target FPS |
|---|---|---|---|
| Low | 2 Mbps | 540p | 30 |
| Medium | 4 Mbps | 720p | 60 |
| High | 8 Mbps | 1080p | 60 |
iOS streams raw frames, so quality is controlled by resolution scaling only.
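To make the difference between the two paths concrete, the sketch below encodes the Android presets from the table and compares their byte rate with an uncompressed BGRA stream. The constant and function names are hypothetical, "Mbps" is taken to mean 10^6 bits per second, and the 1920x1080 frame size used in the note that follows is an example value, not one given in this document.

```typescript
type Quality = "low" | "medium" | "high";

interface QualityPreset {
  bitrateMbps: number; // Android H.264 bitrate
  heightPx: number; // 540p, 720p, 1080p
  targetFps: number;
}

// Values copied from the Quality Presets table.
const ANDROID_PRESETS: Record<Quality, QualityPreset> = {
  low: { bitrateMbps: 2, heightPx: 540, targetFps: 30 },
  medium: { bitrateMbps: 4, heightPx: 720, targetFps: 60 },
  high: { bitrateMbps: 8, heightPx: 1080, targetFps: 60 },
};

// Encoded stream: bitrate in megabits per second converted to bytes per second.
function encodedBytesPerSecond(preset: QualityPreset): number {
  return (preset.bitrateMbps * 1e6) / 8;
}

// Raw BGRA stream: 4 bytes per pixel, every frame sent in full.
// Row padding (bytesPerRow > width * 4) would only increase this figure.
function rawBgraBytesPerSecond(width: number, height: number, fps: number): number {
  return width * height * 4 * fps;
}
```

Under these assumptions, `rawBgraBytesPerSecond(1920, 1080, 60)` is 497,664,000 bytes per second, while `encodedBytesPerSecond(ANDROID_PRESETS.high)` is 1,000,000. Halving both raw dimensions cuts the raw rate to a quarter, which is consistent with resolution scaling being the quality control for iOS.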
Fallback Behavior
When video streaming is unavailable:
1. Detect stream failure or unsupported device
2. Automatically switch to existing screenshot-based observation
3. Display indicator in UI showing “Screenshot mode”
4. Retry video streaming on user request or device reconnection
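The four steps above can be sketched as a small reducer. The event names, the `MirrorState` shape, and `reduceFallback` are hypothetical; the sketch only shows the transitions, and leaves the actual re-subscription to whoever observes `shouldResubscribe`.

```typescript
type Mode = "video" | "screenshot";

type FallbackEvent =
  | "stream_failed" // step 1: stream failure detected
  | "unsupported_device" // step 1: device cannot stream
  | "user_retry" // step 4: user asked to retry
  | "device_reconnected" // step 4: device came back
  | "stream_started"; // video is flowing again

interface MirrorState {
  mode: Mode;
  indicator: string | null; // text shown in the UI, if any
  shouldResubscribe: boolean; // caller should attempt video streaming again
}

function reduceFallback(state: MirrorState, event: FallbackEvent): MirrorState {
  switch (event) {
    case "stream_failed":
    case "unsupported_device":
      // Steps 2 and 3: switch to screenshots and show the indicator.
      return { mode: "screenshot", indicator: "Screenshot mode", shouldResubscribe: false };
    case "user_retry":
    case "device_reconnected":
      // Step 4: only meaningful while in screenshot mode.
      return state.mode === "screenshot" ? { ...state, shouldResubscribe: true } : state;
    case "stream_started":
      return { mode: "video", indicator: null, shouldResubscribe: false };
  }
}
```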
Decisions
| Question | Decision |
|---|---|
| Audio streaming | Include audio for complete mirroring |
| Touch input | Plan for it, implement later |
| Quality auto-adjustment | Automatically lower quality on frame drops |
| Multiple devices | Single device streaming at a time |
| Android decoder | Klarity only, no FFmpeg subprocess fallback |
| iOS Swift integration | Swift-to-Node bridge |
| macOS permissions | User handles permission prompts |
| macOS entitlements | No special entitlements needed for iOS capture |
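The "Quality auto-adjustment" decision could look roughly like the sketch below, which steps down one preset when too many frames are dropped in a measurement window. The function name `adjustQuality`, the windowed frame counts, and the 10% drop threshold are all assumptions for illustration; this document does not specify how drops are measured or when quality is lowered.

```typescript
type Quality = "low" | "medium" | "high";

// One step down the preset ladder; "low" is the floor.
function lowerQuality(q: Quality): Quality {
  return q === "high" ? "medium" : "low";
}

// framesExpected: frames the target FPS implies for the window.
// framesDelivered: frames actually received in that window.
// maxDropRatio: hypothetical tolerance before stepping down (default 10%).
function adjustQuality(
  current: Quality,
  framesExpected: number,
  framesDelivered: number,
  maxDropRatio = 0.1,
): Quality {
  if (framesExpected <= 0) return current;
  const dropped = Math.max(0, framesExpected - framesDelivered);
  const dropRatio = dropped / framesExpected;
  return dropRatio > maxDropRatio ? lowerQuality(current) : current;
}
```

The sketch only ever lowers quality, matching the decision as written. Whether and when quality is raised again is left open here.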