Overview
As AutoMobile explores an app, it automatically maps what it observes into a navigation graph.
flowchart TD
subgraph Navigation Graph
Home["Home Screen"]
Profile["Profile Screen"]
Settings["Settings Screen"]
EditProfile["Edit Profile"]
Notifications["Notifications"]
Privacy["Privacy Settings"]
end
Home -->|"tapOn 'Profile'"| Profile
Home -->|"tapOn 'Settings'"| Settings
Home -->|"tapOn 'Notifications'"| Notifications
Profile -->|"tapOn 'Edit'"| EditProfile
Profile -->|"pressButton 'back'"| Home
EditProfile -->|"pressButton 'back'"| Profile
Settings -->|"tapOn 'Privacy'"| Privacy
Settings -->|"pressButton 'back'"| Home
Privacy -->|"pressButton 'back'"| Settings
Notifications -->|"pressButton 'back'"| Home
classDef screen fill:#525FE1,stroke-width:0px,color:white;
class Home,Profile,Settings,EditProfile,Notifications,Privacy screen;
After every observation, once the screen has reached UI stability, AutoMobile:
- Creates a unique screen signature by fingerprinting the observation together with AutoMobile SDK navigation events
- Compares the current screen with the previous one
- If the navigation fingerprint differs, records the tool call as an edge in the graph
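The steps above can be sketched roughly as follows. This is a minimal illustration, not AutoMobile's implementation: the `fingerprint` function, the `NavigationGraph` class, the string-based `structure` summary, and the choice of SHA-256 are all assumptions made for the example.

```python
import hashlib
from typing import Dict, Optional


def fingerprint(structure: str, navigation_event: Optional[str] = None) -> str:
    """Hash a structural summary together with any SDK navigation event (assumed inputs)."""
    payload = f"{navigation_event or ''}\n{structure}"
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]


class NavigationGraph:
    """Hypothetical in-memory graph: edges[source][target] = tool call."""

    def __init__(self) -> None:
        self.edges: Dict[str, Dict[str, str]] = {}
        self.previous: Optional[str] = None

    def observe(
        self,
        structure: str,
        navigation_event: Optional[str] = None,
        tool_call: Optional[str] = None,
    ) -> str:
        current = fingerprint(structure, navigation_event)
        self.edges.setdefault(current, {})
        # Record an edge only when the fingerprint actually changed.
        changed = self.previous is not None and current != self.previous
        if changed and tool_call is not None:
            self.edges[self.previous][current] = tool_call
        self.previous = current
        return current


graph = NavigationGraph()
home = graph.observe("Home|Profile|Settings", "navigation.HomeScreen")
profile = graph.observe("Profile|Edit", "navigation.ProfileScreen", "tapOn 'Profile'")
# Same fingerprint as before, so no edge is recorded for this tool call.
graph.observe("Profile|Edit", "navigation.ProfileScreen", "tapOn 'Avatar'")
```

Keeping the work to a single hash and a dictionary update is one way such a step could stay cheap per observation.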
This process has been benchmarked to take at most 1 ms, and keeping it within that limit is a project goal. The graph is persisted as exploration takes place, whether driven by the user or by AI. As it's built, you can take advantage of it:
Navigate to Screen
The navigateTo tool uses the graph to reach a target screen:
- Finds the target screen in the graph
- Calculates the shortest path from the current node to the target
- Executes the recorded actions to reach the target
- Verifies arrival at the destination
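As a rough sketch of the path-finding step, a breadth-first search over the recorded edges returns the fewest actions between two screens. The `GRAPH` dictionary mirrors the diagram in the overview, and `shortest_path` is a hypothetical helper; the source does not say which algorithm navigateTo actually uses.

```python
from collections import deque
from typing import Dict, List, Optional

# Hypothetical in-memory form of the example graph: screen -> {target: action}
GRAPH: Dict[str, Dict[str, str]] = {
    "Home": {
        "Profile": "tapOn 'Profile'",
        "Settings": "tapOn 'Settings'",
        "Notifications": "tapOn 'Notifications'",
    },
    "Profile": {"EditProfile": "tapOn 'Edit'", "Home": "pressButton 'back'"},
    "EditProfile": {"Profile": "pressButton 'back'"},
    "Settings": {"Privacy": "tapOn 'Privacy'", "Home": "pressButton 'back'"},
    "Privacy": {"Settings": "pressButton 'back'"},
    "Notifications": {"Home": "pressButton 'back'"},
}


def shortest_path(
    graph: Dict[str, Dict[str, str]], start: str, target: str
) -> Optional[List[str]]:
    """Return the shortest list of actions from start to target, or None if unreachable."""
    if start not in graph or target not in graph:
        return None
    parents = {start: None}  # screen -> (previous screen, action taken)
    queue = deque([start])
    while queue:
        screen = queue.popleft()
        if screen == target:
            break
        for nxt, action in graph.get(screen, {}).items():
            if nxt not in parents:
                parents[nxt] = (screen, action)
                queue.append(nxt)
    if target not in parents:
        return None
    # Walk back from the target to the start, then reverse.
    actions: List[str] = []
    step = parents[target]
    while step is not None:
        prev, action = step
        actions.append(action)
        step = parents[prev]
    actions.reverse()
    return actions


path = shortest_path(GRAPH, "EditProfile", "Privacy")
```

Here `path` holds two `pressButton 'back'` actions followed by `tapOn 'Settings'` and `tapOn 'Privacy'`. Executing the actions and verifying arrival are left out of the sketch.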
Explore Efficiently
The explore tool uses the graph to:
- Avoid revisiting known screens
- Prioritize unexplored branches
- Track coverage of app features
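One way to picture these three behaviours is a helper that picks the first action not yet tried from the current screen, plus a coverage ratio over all known actions. The names `next_unexplored` and `coverage`, and the idea of tracking tried actions per screen, are assumptions for illustration only and are not taken from the explore tool itself.

```python
from typing import Dict, List, Optional, Set


def next_unexplored(
    tried: Dict[str, Set[str]], screen: str, available: List[str]
) -> Optional[str]:
    """Return the first available action not yet tried from this screen, or None."""
    done = tried.get(screen, set())
    for action in available:
        if action not in done:
            return action
    return None


def coverage(
    tried: Dict[str, Set[str]], available_by_screen: Dict[str, List[str]]
) -> float:
    """Fraction of known actions that have been tried at least once."""
    total = sum(len(actions) for actions in available_by_screen.values())
    if total == 0:
        return 1.0
    covered = sum(
        len(tried.get(screen, set()) & set(actions))
        for screen, actions in available_by_screen.items()
    )
    return covered / total


available = {
    "Home": ["tapOn 'Profile'", "tapOn 'Settings'", "tapOn 'Notifications'"],
    "Profile": ["tapOn 'Edit'"],
}
tried = {"Home": {"tapOn 'Profile'"}}

candidate = next_unexplored(tried, "Home", available["Home"])
ratio = coverage(tried, available)
```

With one of four known actions tried, `candidate` is `tapOn 'Settings'` and `ratio` is 0.25.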
Read more about how to use the explore tool's modes.
Edge Cases & Limitations
Known Limitations
- Multiple similar screens without navigation IDs
  - Risk: May produce the same fingerprint
  - Mitigation: Include static text for differentiation
- Cache expiration during long keyboard sessions
  - Risk: Lost navigation ID reference
  - Mitigation: Adjust cacheTTL based on use case
- Screens with identical structure and no selected state
  - Risk: Cannot differentiate
  - Mitigation: Encourage SDK integration for perfect identification
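The first limitation and its mitigation can be illustrated with a small sketch: two screens with the same layout collide when only structure is hashed, and separate once static text is included. The `structural_fingerprint` function and its inputs are hypothetical and much simpler than a real view hierarchy.

```python
import hashlib
from typing import Iterable, Sequence


def structural_fingerprint(
    class_names: Sequence[str], static_text: Iterable[str] = ()
) -> str:
    """Hash the layout, optionally mixed with static labels (assumed inputs)."""
    payload = "|".join(class_names) + "#" + "|".join(sorted(static_text))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]


layout = ["FrameLayout", "RecyclerView", "TextView"]

# Structure alone: both screens produce the same fingerprint.
orders_plain = structural_fingerprint(layout)
invoices_plain = structural_fingerprint(layout)

# Structure plus a static title: the fingerprints diverge.
orders_titled = structural_fingerprint(layout, ["Orders"])
invoices_titled = structural_fingerprint(layout, ["Invoices"])
```

Only text that stays fixed for a screen should be mixed in; including dynamic content would give the same screen a different fingerprint on every visit.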
Handled Edge Cases
- Nested scrollable containers
- Scrollable tab rows (critical fix)
- Keyboard show/hide transitions
- Empty hierarchies
- Deeply nested structures
Best Practices
For SDK-Instrumented Apps
Do:
- Use unique navigation resource-ids for each screen
- Follow the navigation.* naming convention
- Ensure navigation IDs persist while the keyboard is shown
Consider:
- Adding navigation IDs even to modal/overlay screens
- Using descriptive names, such as navigation.ProfileEditScreen
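A small check along these lines could catch violations of the first two points before they reach the graph. The `check_navigation_ids` helper and the screen-name-to-resource-id mapping are hypothetical; only the `navigation.` prefix and the uniqueness requirement come from the guidance above.

```python
from collections import Counter
from typing import Dict, List

PREFIX = "navigation."


def check_navigation_ids(screen_ids: Dict[str, str]) -> List[str]:
    """Return human-readable problems for a mapping of screen name -> resource-id."""
    problems: List[str] = []
    for screen, resource_id in screen_ids.items():
        # The id must start with the prefix and have a non-empty name after it.
        if not resource_id.startswith(PREFIX) or len(resource_id) == len(PREFIX):
            problems.append(
                f"{screen}: '{resource_id}' does not follow the navigation.* convention"
            )
    for resource_id, count in Counter(screen_ids.values()).items():
        if count > 1:
            problems.append(f"'{resource_id}' is shared by {count} screens")
    return problems


clean = check_navigation_ids(
    {"Profile": "navigation.ProfileScreen", "Edit": "navigation.ProfileEditScreen"}
)
flagged = check_navigation_ids(
    {"A": "navigation.Home", "B": "navigation.Home", "C": "settings_root"}
)
```

In this example `clean` is empty, while `flagged` reports one naming problem and one duplicated id.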
For Non-SDK Apps
Do:
- Rely on the Tier 3 shallow scrollable strategy
- Ensure screens have distinguishing static text or selected states
- Test fingerprinting across different app states
Watch for:
- Screens with identical layout but different data
- Heavy use of dynamic content without static labels
For All Apps
Do:
- Cache previous fingerprint results for stateful tracking
- Monitor confidence levels
- Log the fingerprint method for debugging
Don't:
- Assume 100% accuracy without navigation IDs
- Ignore confidence levels in decision-making
- Skip validation on critical navigation paths
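As a minimal sketch of how these points might combine, the helper below logs the fingerprint method and asks for extra validation whenever confidence is low or the path is critical. The `FingerprintResult` shape, the method labels, the 0.0 to 1.0 confidence scale, and the 0.8 threshold are all assumptions for the example, not values taken from AutoMobile.

```python
import logging
from dataclasses import dataclass

logger = logging.getLogger("fingerprint")


@dataclass(frozen=True)
class FingerprintResult:
    value: str
    method: str        # illustrative labels, e.g. "navigation-id" or "shallow-scrollable"
    confidence: float  # hypothetical scale from 0.0 to 1.0


def needs_validation(
    result: FingerprintResult, critical_path: bool, threshold: float = 0.8
) -> bool:
    """Decide whether a screen match should be double-checked before acting on it."""
    logger.debug(
        "fingerprint=%s method=%s confidence=%.2f",
        result.value,
        result.method,
        result.confidence,
    )
    # Critical paths are always validated; others only when confidence is low.
    return critical_path or result.confidence < threshold


strong = FingerprintResult("a1b2", "navigation-id", 0.99)
weak = FingerprintResult("c3d4", "shallow-scrollable", 0.5)
```

A caller might re-observe the screen or compare static text whenever `needs_validation` returns `True`.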