RocketSim CLI
Inspect visible elements, navigate screens, and automate taps, swipes, typing, and hardware button presses through RocketSim's built-in CLI.
RocketSim includes a built-in CLI that lets agents inspect visible UI and interact with the Simulator through a running RocketSim session. The CLI communicates with RocketSim over IPC, so interaction is fast and stateful.
The agent interaction loop
The CLI gives agents a compact workflow:
- Ask RocketSim which simulator is currently focused
- Read the visible UI elements
- Decide what to do next
- Tap, swipe, type, or press a hardware button
- Read the updated UI state
- Repeat
That loop is fast because RocketSim is already connected to the Simulator. There is no reconnection overhead between steps.
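One iteration of the loop above can be sketched as a pair of shell functions. This is a minimal sketch, not part of RocketSim: the function names `label_visible` and `tap_and_wait` and the `POLL_INTERVAL` variable are hypothetical, and the code assumes the `--agent` output starts with the `type|label|cx|cy` header line and that labels are matched exactly.

```shell
# Hypothetical helper: succeed if an element with this exact label is
# currently visible. Assumes the first line of --agent output is a header.
label_visible() {
  rocketsim elements --agent | awk -F'|' -v want="$1" '
    NR > 1 && $2 == want { found = 1 }
    END { exit !found }
  '
}

# Hypothetical helper: tap a label, then re-read the UI until the
# expected label shows up or the retry budget (default 5) runs out.
tap_and_wait() {
  tap_label=$1 expected=$2 attempts=${3:-5}
  rocketsim interact tap --label "$tap_label"
  while [ "$attempts" -gt 0 ]; do
    label_visible "$expected" && return 0
    attempts=$((attempts - 1))
    sleep "${POLL_INTERVAL:-0.5}"   # POLL_INTERVAL is an illustrative knob
  done
  return 1
}
```

A call such as `tap_and_wait "Continue" "Welcome back"` would tap the button and report success once the follow-up screen exposes the expected label. Polling keeps the agent from acting on a screen that has not finished updating.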
Key commands
Focused simulator
Returns the currently focused simulator as JSON, including name, runtime, and UDID.
    rocketsim simulator focused
Visible elements
Returns the accessibility elements currently visible on screen.
    rocketsim elements [--udid <udid>] [--agent]
The --agent flag is the recommended default for agent workflows. It returns a compact, pipe-delimited format that is easier and cheaper for agents to process.
Interactions
RocketSim supports the most common agent interactions through rocketsim interact:
    rocketsim interact tap --label "Continue"
    rocketsim interact tap 210 642
    rocketsim interact long-press --label "Delete"
    rocketsim interact swipe --direction up
    rocketsim interact swipe --from 200,650 --to 200,150
    rocketsim interact type "hello@example.com"
    rocketsim interact button home
Why --agent matters
The --agent flag reduces the element output from full JSON to a compact pipe-delimited format. That means fewer tokens per screen read, faster agent decision loops, and easier recovery after each interaction.
Here is the same screen in both formats:
Default JSON output:
    [
      { "type": "Button", "label": "Continue", "value": null, "role": "AXButton", "frame": { "x": 120, "y": 620, "w": 180, "h": 44 }, "center": { "x": 210, "y": 642 } },
      { "type": "TextField", "label": "Email", "value": "", "role": "AXTextField", "frame": { "x": 40, "y": 320, "w": 320, "h": 44 }, "center": { "x": 200, "y": 342 } },
      { "type": "StaticText", "label": "Welcome back", "value": null, "role": "AXStaticText", "frame": { "x": 80, "y": 120, "w": 240, "h": 32 }, "center": { "x": 200, "y": 136 } }
    ]
Agent output (--agent):
    type|label|cx|cy
    Button|Continue|210|642
    TextField|Email|200|342
    StaticText|Welcome back|200|136
The agent format contains everything an agent typically needs to decide what to do next: the element type, its label, and its center coordinates. For most interaction loops, this is all the context required.
Use the full JSON output when you need extra metadata like value, role, or the full accessibility frame.
This structured output for agents works especially well with the RocketSim Agent Skill, which connects your AI coding tool to the CLI automatically. We highly recommend using the Agent Skill instead of communicating directly with the CLI when doing agentic development.
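For scripts that talk to the CLI directly, the pipe-delimited agent format is easy to consume with standard tools. The sketch below looks up an element's center by label. The function name `element_center` is hypothetical, and the code assumes the header line shown in the sample output and labels that contain no `|` character.

```shell
# Hypothetical helper: print "cx cy" for the first element whose label
# matches exactly. Reads --agent output (type|label|cx|cy) on stdin and
# returns non-zero when no element matches.
element_center() {
  awk -F'|' -v want="$1" '
    NR > 1 && $2 == want { print $3, $4; found = 1; exit }
    END { exit !found }
  '
}
```

With the sample screen above, `rocketsim elements --agent | element_center "Continue"` would print `210 642`, which can be passed on to a coordinate tap.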
Selector-based interaction
When possible, agents should prefer targeting elements by label, type, or value instead of raw coordinates:
    rocketsim interact tap --label "Submit"
    rocketsim interact tap --type Button --label "OK"
    rocketsim interact long-press --label "Reorder" --duration 1.5
RocketSim will first try semantic accessibility activation, which is more reliable than a coordinate tap when the visual affordance does not align perfectly with the accessibility frame. This matters for controls like toggles, list rows, and buttons where the tappable area is asymmetric.
Coordinates are still available as a fallback when the element is visible on screen but not exposed with a stable label.
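The label-first, coordinates-as-fallback preference can be sketched as a small wrapper. This is an illustration under assumptions, not RocketSim behavior: `tap_label_or_point` is a hypothetical name, and the check relies on the `--agent` output carrying a header line followed by one `type|label|cx|cy` row per element.

```shell
# Hypothetical helper: tap by label when an element with that exact
# label is visible; otherwise fall back to the supplied coordinates.
tap_label_or_point() {
  want=$1 x=$2 y=$3
  if rocketsim elements --agent | awk -F'|' -v want="$want" '
       NR > 1 && $2 == want { found = 1 }
       END { exit !found }'; then
    rocketsim interact tap --label "$want"
  else
    rocketsim interact tap "$x" "$y"
  fi
}
```

For example, `tap_label_or_point "Continue" 210 642` uses the label when it is exposed and only falls back to the point `210 642` when it is not.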
Named swipe directions
For common gestures, agents can use named directions instead of explicit coordinates:
    rocketsim interact swipe --direction up
    rocketsim interact swipe --direction back
    rocketsim interact swipe --direction notification-center
    rocketsim interact swipe --direction control-center
Valid directions: up, down, left, right, back, notification-center, control-center.
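When direction names come from an agent or another script, it can help to check them against the list above before calling the CLI. The wrapper below is a minimal sketch. `swipe_named` and its exit status of 2 are hypothetical choices, not something the CLI defines.

```shell
# Hypothetical helper: swipe in a named direction, rejecting any name
# that is not in the documented list of valid directions.
swipe_named() {
  case "$1" in
    up|down|left|right|back|notification-center|control-center)
      rocketsim interact swipe --direction "$1" ;;
    *)
      echo "unknown swipe direction: $1" >&2
      return 2 ;;
  esac
}
```

`swipe_named back` forwards to the CLI, while `swipe_named diagonal` fails locally without touching the Simulator.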
Example prompts
These prompts work well with the CLI interaction loop:
Use RocketSim to read the visible elements and tap the primary action button
Use RocketSim to swipe through the onboarding carousel and verify each page title
Use RocketSim to type a search query into the search field and select the first result
Use RocketSim to press the home button and relaunch the app
Requirements
The CLI works when:
- RocketSim is running
- At least one iOS Simulator is booted
- The RocketSim Agent Skill has been installed so agents can discover RocketSim automatically