takeScreenshot fallback¶
🚧 Design Only
Current state: The `takeScreenshot` MCP tool with server-side “fallback ticket” gating described here has not been implemented. Screenshots are captured as part of the `observe` result. The underlying ADB screencap paths described here are used by the `observe` implementation. See the Status Glossary for chip definitions.
Goal¶
Provide a screenshot tool that is explicitly a visual fallback when element lookup fails. The tool should be gated by server-side checks so agents cannot treat it as a primary discovery method.
Proposed MCP tool (Not Implemented)¶
takeScreenshot({
  reason: "element-not-found",
  context: {
    action: "tapOn",
    text: "Login"
  },
  preferReuse: true
})
Key semantics:
- `reason` must be `element-not-found`.
- The server issues a short-lived “fallback ticket” after a not-found failure; `takeScreenshot` consumes it. Calls without a ticket fail.
- `preferReuse` reuses the most recent `observe` screenshot if it is fresh (e.g., under 250ms) to avoid another capture.
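The ticket semantics above could be sketched roughly as follows. This is only an illustration of single-use, short-lived gating, not the project's implementation: `FallbackTicketStore`, the per-session keying, and the 5 second default lifetime are all assumptions made for the example.

```typescript
// Hypothetical sketch: none of these names exist in the project.
const REQUIRED_REASON = "element-not-found";

class FallbackTicketStore {
  // Maps a session id (assumed key) to the ticket's expiry time in epoch ms.
  private tickets = new Map<string, number>();
  private ttlMs: number;
  private now: () => number;

  // ttlMs is an assumed lifetime; the clock is injectable so tests can control time.
  constructor(ttlMs: number = 5000, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Called by the server when an element lookup (e.g. tapOn) fails with not-found.
  issue(sessionId: string): void {
    this.tickets.set(sessionId, this.now() + this.ttlMs);
  }

  // Called by takeScreenshot. Returns true only if the reason is valid and an
  // unexpired ticket exists. A found ticket is removed even if it has expired.
  consume(sessionId: string, reason: string): boolean {
    if (reason !== REQUIRED_REASON) return false;
    const expiresAt = this.tickets.get(sessionId);
    if (expiresAt === undefined) return false;
    this.tickets.delete(sessionId);
    return this.now() <= expiresAt;
  }
}
```

Under this sketch, a second `takeScreenshot` call after the same failure is rejected, because the first call already consumed the ticket.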
Android implementation¶
Preferred capture path (fast, no temp file):
adb -s <device> exec-out screencap -p
Fallback path (older devices/emulators):
adb -s <device> shell screencap -p /sdcard/automobile/s.png
adb -s <device> pull /sdcard/automobile/s.png <out>
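One way to order the two capture paths is to try `exec-out` first and run the file-based commands only if that fails or returns something that is not a PNG. The sketch below assumes a hypothetical `AdbRunner` function that executes `adb` with the given arguments and resolves with its stdout bytes; the function names and the PNG signature check are illustrative choices, not part of the existing `observe` code.

```typescript
// Hypothetical: runs `adb <args>` and resolves with stdout, rejecting on failure.
type AdbRunner = (args: string[]) => Promise<Uint8Array>;

const REMOTE_PATH = "/sdcard/automobile/s.png";
const PNG_MAGIC = [0x89, 0x50, 0x4e, 0x47]; // first four bytes of every PNG file

function looksLikePng(bytes: Uint8Array): boolean {
  return bytes.length >= PNG_MAGIC.length && PNG_MAGIC.every((b, i) => bytes[i] === b);
}

// Preferred path: PNG bytes streamed over stdout, no temp file on the device.
function execOutArgs(device: string): string[] {
  return ["-s", device, "exec-out", "screencap", "-p"];
}

// Compatibility path: write the PNG on the device, then pull it to localOut.
function fileBasedArgs(device: string, localOut: string): string[][] {
  return [
    ["-s", device, "shell", "screencap", "-p", REMOTE_PATH],
    ["-s", device, "pull", REMOTE_PATH, localOut],
  ];
}

type CaptureResult =
  | { path: "exec-out"; png: Uint8Array }
  | { path: "file"; file: string };

async function captureScreenshot(
  run: AdbRunner,
  device: string,
  localOut: string,
): Promise<CaptureResult> {
  try {
    const png = await run(execOutArgs(device));
    if (looksLikePng(png)) return { path: "exec-out", png };
  } catch {
    // exec-out failed; fall through to the file-based path.
  }
  for (const args of fileBasedArgs(device, localOut)) {
    await run(args);
  }
  return { path: "file", file: localOut };
}
```

Injecting the runner keeps the ordering logic testable without a connected device.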
Notes:
- API 29/35 emulators support `exec-out` reliably; keep the file-based path as a compatibility fallback.
- If AccessibilityService already delivered a screenshot in the last N ms, reuse it and return `reused: true` to keep the tool cheap.
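The reuse rule could look something like the sketch below. It assumes the most recent screenshot is kept alongside its capture timestamp, and it uses the 250ms example threshold mentioned under Key semantics as a stand-in for N; `CachedScreenshot`, `isFresh`, and `screenshotWithReuse` are hypothetical names.

```typescript
// Hypothetical shapes; the real observe result may differ.
interface CachedScreenshot {
  png: Uint8Array;
  capturedAt: number; // epoch milliseconds
}

interface ScreenshotResponse {
  png: Uint8Array;
  reused: boolean;
}

const MAX_REUSE_AGE_MS = 250; // assumed value for "N ms"

// A cached screenshot is fresh if it is strictly younger than maxAgeMs.
function isFresh(
  cached: CachedScreenshot | undefined,
  now: number,
  maxAgeMs: number = MAX_REUSE_AGE_MS,
): boolean {
  return cached !== undefined && now - cached.capturedAt < maxAgeMs;
}

// Returns the cached image when allowed and fresh; otherwise captures a new one.
async function screenshotWithReuse(
  cached: CachedScreenshot | undefined,
  preferReuse: boolean,
  capture: () => Promise<Uint8Array>,
  now: number = Date.now(),
): Promise<ScreenshotResponse> {
  if (preferReuse && cached !== undefined && isFresh(cached, now)) {
    return { png: cached.png, reused: true };
  }
  return { png: await capture(), reused: false };
}
```

Returning `reused` explicitly lets the agent tell whether the image reflects the screen at call time or a slightly earlier observation.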
Plan¶
- Add MCP tool metadata and server-side gating (fallback ticket).
- Reuse recent `observe` screenshots when available.
- Add a `reused` flag to the response for agent transparency.
Risks¶
- Agents can still call the tool after a legitimate not-found, but server gating prevents general misuse.
- If `observe` is not delivering screenshots, fallback costs increase.