Real-Time Screen Streaming Architecture

⚠️ Partial

Note: This document covers the live IDE screen mirroring feature (continuous streaming to the Android Studio / Xcode plugin). This is distinct from the videoRecording MCP tool (which records a clip to a file); that tool is ✅ Implemented 🧪 Tested.

The Android video-server JAR (H.264, VirtualDisplay) is fully built and used by videoRecording. The end-to-end live mirroring pipeline (MCP relay → IDE DeviceScreenView) is in progress. iOS live streaming is 🚧 Design Only (see iOS Screen Streaming).

See the Status Glossary for chip definitions.

Real-time screen streaming from mobile devices to the IDE plugin, enabling interactive device mirroring at up to 60fps with <100ms latency.

Goals

  • Continuous live streaming for device mirroring in the IDE
  • Up to 60fps frame rate
  • <100ms end-to-end latency for interactive use
  • Support USB-connected physical devices and emulators/simulators
  • Include audio streaming for complete mirroring
  • Integrate with existing observation architecture
  • Single device streaming at a time (no multi-device simultaneous streams)

Architecture

┌──────────────────────────────────────────────────────────────────────┐
│ Mobile Device                                                        │
│                                                                      │
│  Platform-specific capture mechanism                                 │
│  (see platform docs for details)                                     │
│                                                                      │
└──────────────────────────────┬───────────────────────────────────────┘
                               │
                               ▼
┌──────────────────────────────────────────────────────────────────────┐
│ MCP Server (Node.js)                                                 │
├──────────────────────────────────────────────────────────────────────┤
│                                                                      │
│  Existing sockets:                    New socket:                    │
│  ├─ auto-mobile.sock (MCP proxy)      └─ video-stream.sock           │
│  ├─ observation-stream.sock              (binary frame data)         │
│  └─ performance-push.sock                                            │
│                                                                      │
│  VideoStreamManager                                                  │
│  ├─ Platform detection                                               │
│  ├─ Capture process lifecycle                                        │
│  ├─ Frame forwarding to clients                                      │
│  └─ Fallback to screenshot mode                                      │
│                                                                      │
└──────────────────────────────┬───────────────────────────────────────┘
                               │
                               ▼
┌──────────────────────────────────────────────────────────────────────┐
│ IDE Plugin (Kotlin/JVM)                                              │
├──────────────────────────────────────────────────────────────────────┤
│                                                                      │
│  VideoStreamClient                                                   │
│  ├─ Unix socket connection to video-stream.sock                      │
│  ├─ Platform-specific frame decoding                                 │
│  └─ Frame → ImageBitmap conversion                                   │
│                                                                      │
│  DeviceScreenView (Compose Desktop)                                  │
│  ├─ Live frame display                                               │
│  ├─ Overlay support (hierarchy highlights, selection)                │
│  ├─ FPS indicator                                                    │
│  └─ Fallback to static screenshots                                   │
│                                                                      │
└──────────────────────────────────────────────────────────────────────┘

Platform-Specific Capture

The capture mechanism differs significantly between platforms:

| Platform | Capture Location | Frame Format  | Decoder Needed |
|----------|------------------|---------------|----------------|
| Android  | On device        | H.264 encoded | Yes (Klarity)  |
| iOS      | On Mac           | Raw BGRA      | No             |

See platform-specific documentation for implementation details:

  • Android Screen Streaming: VirtualDisplay + MediaCodec via shell-user JAR
  • iOS Screen Streaming: AVFoundation + ScreenCaptureKit on macOS

Video Stream Socket Protocol

New Unix socket: ~/.auto-mobile/video-stream.sock

Connection Handshake

Client → Server: { "command": "subscribe", "deviceId": "<optional>" }
Server → Client: { "type": "stream_started", "deviceId": "...", "platform": "android|ios" }
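
The handshake above can be sketched in TypeScript as follows. This is a minimal illustration, not the project's implementation: it assumes each JSON message is sent as one newline-terminated line (the document does not specify the framing), and the names `encodeSubscribe` and `parseStreamStarted` are hypothetical.

```typescript
// Message shapes copied from the handshake lines above.
type Platform = "android" | "ios";

interface SubscribeCommand {
  command: "subscribe";
  deviceId?: string; // optional, as in the handshake description
}

interface StreamStarted {
  type: "stream_started";
  deviceId: string;
  platform: Platform;
}

function isPlatform(value: unknown): value is Platform {
  return value === "android" || value === "ios";
}

// Assumption: one JSON object per line. The real framing is not documented here.
function encodeSubscribe(deviceId?: string): string {
  const msg: SubscribeCommand = { command: "subscribe" };
  if (deviceId !== undefined) msg.deviceId = deviceId;
  return JSON.stringify(msg) + "\n";
}

// Validates the server reply and throws on anything unexpected.
function parseStreamStarted(line: string): StreamStarted {
  const parsed: unknown = JSON.parse(line);
  if (typeof parsed !== "object" || parsed === null) {
    throw new Error("handshake reply is not a JSON object");
  }
  const fields = parsed as Record<string, unknown>;
  const deviceId = fields["deviceId"];
  const platform = fields["platform"];
  if (fields["type"] !== "stream_started") {
    throw new Error(`unexpected message type: ${String(fields["type"])}`);
  }
  if (typeof deviceId !== "string") throw new Error("missing deviceId");
  if (!isPlatform(platform)) throw new Error("unknown platform");
  return { type: "stream_started", deviceId, platform };
}
```

Knowing the `platform` value before the first binary frame arrives lets the client pick the matching frame parser (H.264 packets or raw BGRA).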

Frame Data

Binary frames with platform-specific headers:

Android (H.264):

┌─────────────────┬─────────────────┬─────────────────┐
│ codec_id (4)    │ width (4)       │ height (4)      │
└─────────────────┴─────────────────┴─────────────────┘
Then per-packet: pts_flags (8) + size (4) + H.264 data
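
A sketch of how a client might read this layout, under stated assumptions: integers are treated as big-endian (the byte order is not specified in this document), `pts_flags` is kept as an opaque 64-bit value because its bit layout is not described here, and all function and type names are hypothetical.

```typescript
// Assumption: big-endian integers. DataView reads big-endian by default.
const ANDROID_STREAM_HEADER_BYTES = 12; // codec_id (4) + width (4) + height (4)
const ANDROID_PACKET_HEADER_BYTES = 12; // pts_flags (8) + size (4)

interface AndroidStreamHeader {
  codecId: number;
  width: number;
  height: number;
}

interface AndroidPacket {
  ptsFlags: bigint;      // opaque here; bit layout is not documented
  data: Uint8Array;      // H.264 payload of `size` bytes
  bytesConsumed: number; // header + payload, so the caller can advance its buffer
}

function toView(bytes: Uint8Array): DataView {
  return new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
}

// Returns null when fewer than 12 bytes have arrived so far.
function parseAndroidStreamHeader(bytes: Uint8Array): AndroidStreamHeader | null {
  if (bytes.byteLength < ANDROID_STREAM_HEADER_BYTES) return null;
  const view = toView(bytes);
  return {
    codecId: view.getUint32(0),
    width: view.getUint32(4),
    height: view.getUint32(8),
  };
}

// Returns null when the packet is not yet complete in the buffer.
function parseAndroidPacket(bytes: Uint8Array): AndroidPacket | null {
  if (bytes.byteLength < ANDROID_PACKET_HEADER_BYTES) return null;
  const view = toView(bytes);
  const ptsFlags = view.getBigUint64(0);
  const size = view.getUint32(8);
  const end = ANDROID_PACKET_HEADER_BYTES + size;
  if (bytes.byteLength < end) return null;
  return {
    ptsFlags,
    data: bytes.subarray(ANDROID_PACKET_HEADER_BYTES, end),
    bytesConsumed: end,
  };
}
```

Returning `null` for incomplete input fits a stream socket, where a single read may deliver part of a packet or several packets at once.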

iOS (Raw BGRA):

┌─────────────────┬─────────────────┬─────────────────┬─────────────────┐
│ width (4)       │ height (4)      │ bytesPerRow (4) │ timestamp (4)   │
└─────────────────┴─────────────────┴─────────────────┴─────────────────┘
Then: height * bytesPerRow bytes of BGRA pixel data
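
The raw-frame layout can be read along these lines. This is a sketch only: the byte order is not stated in this document, so it is exposed as a parameter (defaulting to big-endian), and `parseIosFrame` and `pixelAt` are hypothetical names. The pixel lookup uses `bytesPerRow` rather than `width * 4` because rows may carry padding.

```typescript
const IOS_FRAME_HEADER_BYTES = 16; // width, height, bytesPerRow, timestamp (4 bytes each)

interface IosFrame {
  width: number;
  height: number;
  bytesPerRow: number;
  timestamp: number;
  pixels: Uint8Array;    // height * bytesPerRow bytes in B, G, R, A order
  bytesConsumed: number; // header + pixel data
}

// Returns null until the whole frame is available in `bytes`.
// Assumption: byte order is unknown, so the caller chooses it.
function parseIosFrame(bytes: Uint8Array, littleEndian = false): IosFrame | null {
  if (bytes.byteLength < IOS_FRAME_HEADER_BYTES) return null;
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const width = view.getUint32(0, littleEndian);
  const height = view.getUint32(4, littleEndian);
  const bytesPerRow = view.getUint32(8, littleEndian);
  const timestamp = view.getUint32(12, littleEndian);
  if (bytesPerRow < width * 4) {
    throw new Error("bytesPerRow is smaller than one row of BGRA pixels");
  }
  const end = IOS_FRAME_HEADER_BYTES + height * bytesPerRow;
  if (bytes.byteLength < end) return null;
  return {
    width,
    height,
    bytesPerRow,
    timestamp,
    pixels: bytes.subarray(IOS_FRAME_HEADER_BYTES, end),
    bytesConsumed: end,
  };
}

// Reads one pixel, honoring row padding via bytesPerRow.
function pixelAt(frame: IosFrame, x: number, y: number) {
  const i = y * frame.bytesPerRow + x * 4;
  return {
    b: frame.pixels[i],
    g: frame.pixels[i + 1],
    r: frame.pixels[i + 2],
    a: frame.pixels[i + 3],
  };
}
```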

Stream Control

Client → Server: { "command": "set_quality", "quality": "low|medium|high" }
Client → Server: { "command": "unsubscribe" }
Server → Client: { "type": "stream_stopped", "reason": "..." }
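
One way to model these control messages on the client is a pair of discriminated unions plus a small state reducer. The sketch below is illustrative: the `StreamState` type, `buildSetQuality`, and `applyServerEvent` are hypothetical names, and ignoring a `stream_stopped` event while no stream is active is an assumption, not documented behavior.

```typescript
type Quality = "low" | "medium" | "high";
type Platform = "android" | "ios";

type ClientCommand =
  | { command: "set_quality"; quality: Quality }
  | { command: "unsubscribe" };

type ServerEvent =
  | { type: "stream_started"; deviceId: string; platform: Platform }
  | { type: "stream_stopped"; reason: string };

type StreamState =
  | { status: "idle" }
  | { status: "streaming"; deviceId: string; platform: Platform }
  | { status: "stopped"; reason: string };

function isQuality(value: string): value is Quality {
  return value === "low" || value === "medium" || value === "high";
}

// Rejects values outside the three documented presets before sending.
function buildSetQuality(quality: string): ClientCommand {
  if (!isQuality(quality)) throw new Error(`invalid quality: ${quality}`);
  return { command: "set_quality", quality };
}

// Folds a server event into the client's view of the stream.
function applyServerEvent(state: StreamState, event: ServerEvent): StreamState {
  switch (event.type) {
    case "stream_started":
      return { status: "streaming", deviceId: event.deviceId, platform: event.platform };
    case "stream_stopped":
      // Assumption: a stop with no active stream leaves the state unchanged.
      return state.status === "streaming"
        ? { status: "stopped", reason: event.reason }
        : state;
  }
}
```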

Quality Presets

| Quality | Android Bitrate | Resolution | Target FPS |
|---------|-----------------|------------|------------|
| Low     | 2 Mbps          | 540p       | 30         |
| Medium  | 4 Mbps          | 720p       | 60         |
| High    | 8 Mbps          | 1080p      | 60         |

iOS streams raw frames, so quality is controlled by resolution scaling only.
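
To make the difference between the two paths concrete, the sketch below encodes the preset table as a typed map and compares per-frame budgets. The identifiers are hypothetical, Mbps is taken as 10^6 bits per second, and the 1920×1080 frame size is only an illustrative example (the document does not state iOS capture dimensions). At that size and 60 fps, uncompressed BGRA is about 498 MB/s, while 8 Mbps of H.264 is 1 MB/s, which is why resolution scaling is the main lever for raw frames.

```typescript
type Quality = "low" | "medium" | "high";

interface QualityPreset {
  androidBitrateMbps: number;
  resolution: string; // label from the table, e.g. "720p"
  targetFps: number;
}

// Values copied from the presets table.
const QUALITY_PRESETS: Record<Quality, QualityPreset> = {
  low: { androidBitrateMbps: 2, resolution: "540p", targetFps: 30 },
  medium: { androidBitrateMbps: 4, resolution: "720p", targetFps: 60 },
  high: { androidBitrateMbps: 8, resolution: "1080p", targetFps: 60 },
};

// Average encoded bytes available per frame at a preset's bitrate and frame rate.
function encodedBytesPerFrame(preset: QualityPreset): number {
  return (preset.androidBitrateMbps * 1e6) / 8 / preset.targetFps;
}

// Uncompressed BGRA data rate: 4 bytes per pixel, ignoring any row padding.
function rawBgraBytesPerSecond(width: number, height: number, fps: number): number {
  return width * height * 4 * fps;
}
```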

Fallback Behavior

When video streaming is unavailable:

  1. Detect stream failure or unsupported device
  2. Automatically switch to existing screenshot-based observation
  3. Display indicator in UI showing “Screenshot mode”
  4. Retry video streaming on user request or device reconnection
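
The fallback steps above can be expressed as a small state machine. This is a sketch under assumptions: the event names, the `MirrorState` shape, and the `retryRequested` flag are hypothetical and only illustrate the four steps.

```typescript
type MirrorMode = "video" | "screenshot";

type MirrorEvent =
  | "stream_failed"
  | "unsupported_device"
  | "user_retry"
  | "device_reconnected"
  | "stream_started";

interface MirrorState {
  mode: MirrorMode;
  indicator: string | null; // text shown in the UI, if any
  retryRequested: boolean;  // true while a new video attempt is pending
}

const INITIAL_MIRROR_STATE: MirrorState = {
  mode: "video",
  indicator: null,
  retryRequested: false,
};

function nextMirrorState(state: MirrorState, event: MirrorEvent): MirrorState {
  switch (event) {
    case "stream_failed":
    case "unsupported_device":
      // Steps 1-3: detect the failure, switch to screenshots, show the indicator.
      return { mode: "screenshot", indicator: "Screenshot mode", retryRequested: false };
    case "user_retry":
    case "device_reconnected":
      // Step 4: a retry only makes sense while in fallback.
      return state.mode === "screenshot" ? { ...state, retryRequested: true } : state;
    case "stream_started":
      // Video is back: clear the indicator and any pending retry.
      return { mode: "video", indicator: null, retryRequested: false };
  }
}
```

Keeping the transition function pure makes it easy to unit test without a device or socket attached.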

Decisions

| Question                | Decision                                       |
|-------------------------|------------------------------------------------|
| Audio streaming         | Include audio for complete mirroring           |
| Touch input             | Plan for it, implement later                   |
| Quality auto-adjustment | Automatically lower quality on frame drops     |
| Multiple devices        | Single device streaming at a time              |
| Android decoder         | Klarity only, no FFmpeg subprocess fallback    |
| iOS Swift integration   | Swift-to-Node bridge                           |
| macOS permissions       | User handles permission prompts                |
| macOS entitlements      | No special entitlements needed for iOS capture |
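
As an illustration of the quality auto-adjustment decision, the sketch below steps down one preset when the share of dropped frames in a measurement window exceeds a threshold. The 10% threshold, the windowed counting, and the function name are all assumptions made for this example; the document only states that quality is lowered on frame drops.

```typescript
type Quality = "low" | "medium" | "high";

// Presets ordered from lowest to highest.
const QUALITY_ORDER: readonly Quality[] = ["low", "medium", "high"];

// Hypothetical policy: drop one level when more than `maxDropRatio` of the
// frames expected in a window never arrived. Never goes below "low".
function lowerQualityOnDrops(
  current: Quality,
  framesExpected: number,
  framesReceived: number,
  maxDropRatio = 0.1, // assumed threshold, not taken from the document
): Quality {
  if (framesExpected <= 0) return current; // nothing measured yet
  const dropped = Math.max(0, framesExpected - framesReceived);
  if (dropped / framesExpected <= maxDropRatio) return current;
  const index = QUALITY_ORDER.indexOf(current);
  return QUALITY_ORDER[Math.max(0, index - 1)];
}
```

For example, receiving 50 of 60 expected frames at "high" is a drop ratio of about 17%, so this policy would move to "medium".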

References