Android tapOn flakes after observe (search lists, tapClickableParent)
✅ Implemented
This document captures a real production flake (FUB TaskTest: select “Dan Corkill” from person search after typing in SearchActivity_), everything that was tried, and what actually fixed it.
What actually fixed it (read this first)
The flake was not primarily “wrong tap coordinates” or “semantic click ignored the row.” It was timing + tree churn: the row existed when observe + waitFor passed, then disappeared from the accessibility tree (loading overlay, list refresh, IME) before the tap ran. Older behavior could still report tap success using stale bounds from the pre-tap observation, so the UI never advanced and the next step timed out.
The fix that stuck is two parts in the MCP server (Android tapOn path in TapOnElement):
- Pre-tap stable re-find (hard guard)
  After the initial element resolution, refresh the hierarchy several times, re-find the same target each time, and require two consecutive re-finds whose bounds match within a small ε (`boundsNearlyEqual`). If that never happens, abort the tap with a clear error instead of tapping pre-observe coordinates.
  - Code: `resolveAndroidStableTapTargetAfterRefreshes` in `src/features/action/TapOnElement.ts`
  - Utility: `boundsNearlyEqual` in `src/utils/bounds.ts`
- Loading-aware patience (so the guard can succeed)
  If the current tree looks like a blocking loading state (progress indicators, common loading view classes / resource ids), extend the pre-tap refresh budget (default 8 attempts → up to 32), with a short delay between attempts, so the list can come back before we give up.
  - Code: `androidViewHierarchyIndicatesLikelyBlockingLoading` in `src/utils/androidTransientLoading.ts`
Together: don’t tap on ghosts, and wait long enough through real loading for the row to reappear in the a11y tree. After this landed, the same plan passed repeatedly (e.g. four consecutive runs) where step 28 had previously been flaky.
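The pre-tap guard can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in `TapOnElement.ts`: the `Bounds` shape, the `refind` callback, and both function names ending in `Sketch` are hypothetical. Only the default of 8 attempts and ε = 3 px are taken from this document. The real path refreshes the hierarchy asynchronously with a short delay between attempts; both are omitted here to keep the control flow visible.

```typescript
// Hypothetical bounds shape; the real type in src/utils/bounds.ts may differ.
interface Bounds {
  left: number;
  top: number;
  right: number;
  bottom: number;
}

// Assumed semantics: true when every edge differs by at most epsilonPx.
function boundsNearlyEqualSketch(a: Bounds, b: Bounds, epsilonPx = 3): boolean {
  return (
    Math.abs(a.left - b.left) <= epsilonPx &&
    Math.abs(a.top - b.top) <= epsilonPx &&
    Math.abs(a.right - b.right) <= epsilonPx &&
    Math.abs(a.bottom - b.bottom) <= epsilonPx
  );
}

// refind() stands in for "refresh the hierarchy, then look the target up again".
// It returns null when the target is missing from the refreshed tree.
function resolveStableTapTargetSketch(
  refind: () => Bounds | null,
  maxAttempts = 8,
  requiredConsecutive = 2,
  epsilonPx = 3
): Bounds {
  let previous: Bounds | null = null;
  let streak = 0;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const current = refind();
    if (current === null) {
      // Target vanished from the tree: any stability streak is void.
      previous = null;
      streak = 0;
      continue;
    }
    const matchesPrevious =
      previous !== null && boundsNearlyEqualSketch(previous, current, epsilonPx);
    streak = matchesPrevious ? streak + 1 : 1;
    previous = current;
    if (streak >= requiredConsecutive) {
      // Bounds come from the latest refresh, never from the pre-tap observation.
      return current;
    }
  }
  throw new Error(
    `tap aborted: target not stable after ${maxAttempts} refreshes ` +
      "(refusing tap using pre-observe coordinates)"
  );
}
```

The important property is the failure mode: when the row never settles, the function throws instead of returning coordinates captured before the refreshes.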
Policy note: The two consecutive ε-stable re-finds apply to churn-prone selector paths (tapClickableParent, siblingOfText, clickable, scrollableContainer). Plain text or elementId-only taps still require a live re-find after refresh but only one successful stable match, so static labels are not held to the same double-stability bar as search list rows.
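As a rough illustration of that policy, the required match count could be derived from the selector like this. The selector shape and the function name are hypothetical; the actual rule lives in `src/features/action/androidPreTapStablePolicy.ts` and may consider more than what is shown. The field types for `siblingOfText`, `clickable`, and `scrollableContainer` are guesses.

```typescript
// Hypothetical selector shape using only field names mentioned in this document.
interface TapSelectorSketch {
  text?: string;
  elementId?: string;
  tapClickableParent?: boolean;
  siblingOfText?: string; // assumed type
  clickable?: boolean; // assumed type
  scrollableContainer?: unknown; // assumed type
}

// Churn-prone selector paths need two consecutive stable re-finds;
// plain text / elementId taps need one live re-find after refresh.
function requiredStableMatchesSketch(selector: TapSelectorSketch): 1 | 2 {
  const churnProne =
    selector.tapClickableParent === true ||
    selector.siblingOfText !== undefined ||
    selector.clickable !== undefined ||
    selector.scrollableContainer !== undefined;
  return churnProne ? 2 : 1;
}
```

The result would feed the required-consecutive-matches parameter of the pre-tap loop, so a search row tapped via `tapClickableParent` gets the stricter bar while a static label does not.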
Symptom
- `observe` with `waitFor` on the row text (often scoped with `container`) succeeds.
- Immediately after, `tapOn` with `tapClickableParent: true` on the same text either:
  - reported success but the screen did not change, and a later `observe` timed out, or
  - failed with an error about not re-finding the target after refreshes (after the guard was added).
Daemon / failure artifacts sometimes showed progress_bar_loading (or list chrome missing) at failure time while SearchActivity_ was still the active activity.
Root cause (why observe could lie to the plan)
- Stale snapshot vs. live tap
  The plan advances on a hierarchy where “Dan Corkill” is present. A few hundred milliseconds later, the app can show loading or replace the list; the accessibility tree no longer contains that node.
- Unsafe fallback
  If refresh did not re-find the node but code still used old bounds, `dispatchGesture` / ADB tap could “succeed” at the transport layer without activating the intended list row.
- Short polling window
  Even with re-find logic, a small fixed number of refreshes could expire while a loading overlay hid rows, so the tool aborted or (before the guard) tapped the wrong state.
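The short-polling-window problem is what the loading-aware budget addresses. A minimal sketch of that idea, assuming a flattened list of nodes: the node shape, the pattern lists, and the function names are hypothetical, and the real heuristic in `src/utils/androidTransientLoading.ts` may match different classes and ids. `android.widget.ProgressBar` is a standard Android widget class and `progress_bar_loading` is the id seen in the failure artifacts above; the generic `loading` id pattern is a guess.

```typescript
// Hypothetical, simplified node from a flattened accessibility tree.
interface NodeSketch {
  className?: string;
  resourceId?: string;
}

// Assumed patterns; extend for the app under test.
const LOADING_CLASS_PATTERNS: RegExp[] = [/ProgressBar$/];
const LOADING_ID_PATTERNS: RegExp[] = [/progress_bar_loading/, /loading/i];

// True when any node looks like a progress indicator or loading view.
function looksLikeBlockingLoadingSketch(nodes: NodeSketch[]): boolean {
  return nodes.some(
    (node) =>
      LOADING_CLASS_PATTERNS.some((p) => p.test(node.className ?? "")) ||
      LOADING_ID_PATTERNS.some((p) => p.test(node.resourceId ?? ""))
  );
}

const DEFAULT_REFIND_ATTEMPTS = 8; // default budget from this document
const LOADING_REFIND_ATTEMPTS = 32; // extended budget from this document

// Pick the pre-tap refresh budget for the current tree.
function refindBudgetSketch(nodes: NodeSketch[]): number {
  return looksLikeBlockingLoadingSketch(nodes)
    ? LOADING_REFIND_ATTEMPTS
    : DEFAULT_REFIND_ATTEMPTS;
}
```

A tree containing only list rows keeps the default budget, while a tree showing a progress bar gets the longer one, which gives the row time to reappear before the guard aborts.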
What we tried (historical, partial, or orthogonal)
These changes improved robustness or diagnostics and are worth keeping in mind, but they did not alone eliminate the Dan Corkill flake:
| Area | What we tried | Why it wasn’t sufficient alone |
|---|---|---|
| CtrlProxy / semantic tap | ACTION_CLICK with bounds disambiguation, multi-window search, hit-test ordering, framework id handling (android:id/* vs app ids) | Still need a real node matching the row; semantic success doesn’t help if the tree dropped the row. |
| Coordinates | tapClickableParent: label∩row overlap, clamped centers, ADB-before-gesture for search+IME | Wrong if bounds refer to a row that no longer exists in the current tree. |
| Snapshot consistency | Ensure tap target and geometry come from the same refreshed hierarchy | Necessary but not enough if the next refresh still has no row. |
| Diagnostics | tapDebug, focus/hit-test around tap, plan failureObservation on observe timeout | Great for proving loading / missing row; doesn’t fix timing. |
| Plans | waitFor.container, tighter waits | Reduces false positives; doesn’t remove the gap between end of observe and start of tap. |
Implementation reference (for maintainers)
| Piece | Location |
|---|---|
| Pre-tap loop + stability requirement | src/features/action/TapOnElement.ts — resolveAndroidStableTapTargetAfterRefreshes; match count from src/features/action/androidPreTapStablePolicy.ts |
| Bounds ε comparison | src/utils/bounds.ts — boundsNearlyEqual |
| Loading heuristic | src/utils/androidTransientLoading.ts |
| Unit tests | test/utils/bounds.test.ts, test/utils/androidTransientLoading.test.ts |
Typical success log line:
```text
[TapOnElement] Android tap target stable after N refresh(es) (bounds matched on last 2 consecutive re-finds, ε=3px)
```
When loading extension applies:
```text
[TapOnElement] Android pre-tap: loading/progress indicators present; extending refind attempts to 32
```
Failure (guard working, UI still churning):
```text
Android tap aborted: could not re-find the target ... (refusing tap using pre-observe coordinates). The UI may still be updating (list, keyboard, loading overlay, or animation).
```
Plan authoring tips (belt-and-suspenders)
- Prefer `waitFor` that matches what must be true immediately before the tap (row visible and stable if the app does a two-phase load).
- Use `container` on `waitFor` / `tapOn` when the text can appear outside the list.
- If a screen always shows a short loading state after search, consider an extra `observe` with `waitFor` on a stable list id or row after loading clears (defensive; server-side patience above should cover many cases).
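To make the tips concrete, here is how such a step sequence might look when written as typed data. This is purely illustrative: the step shapes below are hypothetical and are not the actual plan schema, and the resource id is made up. Only the tool and field names (`observe`, `waitFor`, `container`, `tapOn`, `tapClickableParent`) come from this document.

```typescript
// Hypothetical container and step shapes, NOT the real plan schema.
interface ContainerSketch {
  elementId: string;
}

type PlanStepSketch =
  | {
      tool: "observe";
      waitFor: { text?: string; elementId?: string; container?: ContainerSketch };
    }
  | {
      tool: "tapOn";
      text: string;
      tapClickableParent?: boolean;
      container?: ContainerSketch;
    };

// Made-up id standing in for the search results list.
const resultsList: ContainerSketch = { elementId: "com.example.app:id/search_results" };

const selectPersonSteps: PlanStepSketch[] = [
  // Defensive: wait for the list itself once the loading state has cleared.
  { tool: "observe", waitFor: { elementId: resultsList.elementId } },
  // Wait for the row text, scoped to the list so other matches are ignored.
  { tool: "observe", waitFor: { text: "Dan Corkill", container: resultsList } },
  // Tap the row through its clickable parent, with the same scoping.
  {
    tool: "tapOn",
    text: "Dan Corkill",
    tapClickableParent: true,
    container: resultsList,
  },
];
```

Scoping both the wait and the tap to the same container keeps the condition checked by `observe` as close as possible to what `tapOn` will re-find.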
Daemon / bundle caveats (debugging only)
When reading daemon logs, you may see issues that do not change the above root cause but do muddy diagnostics:
- `getUiAutomatorHierarchy` is not a function (minified `J.*`): UI Automator fallback from `refreshAndroidViewHierarchy` can fail in some packaged daemon builds; prefer a build where native helpers export correctly.
- `PerformanceAudit` … `J.create is not a function`: separate bundle/minification issue.
- `NAV_SCREENSHOT` … Unsupported platform: undefined: screenshot path missing `platform`; unrelated to tap re-find logic.
- `CTRL_PROXY` Timeout waiting for fresh data: hierarchy may be slightly stale; the pre-tap loop is meant to cope by re-fetching right before tap.
If logs show pre-tap refresh attempt …/8 with no “extending refind attempts” line, the running server may be older than the loading-extension change, or the app’s loader does not match the loading heuristic (extend patterns there if needed).
See also
- Android Control Proxy: accessibility service and CtrlProxy
- MCP tools: `tapOn` selector overview