Browser Actions

Browser Actions execute quick browser interactions using AI (Claude Sonnet 4.5) with a maximum of 2 steps per invocation.

Overview

Step type: browser_action

Browser Actions are perfect for simple, predictable interactions where you need fast execution and want to benefit from caching.

How It Works

AI Model: Uses Claude Sonnet 4.5 with the browser-use library
Step Limit: Maximum 2 autonomous steps per invocation
Variable Support: Supports variable interpolation using {{variable_name}} syntax
Caching: Built-in caching for repeated actions
Timeout: 300 seconds
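The `{{variable_name}}` substitution can be pictured as a plain template render. The following is a minimal sketch, assuming placeholders are replaced with string values before the instruction reaches the model. The `render_action` helper is hypothetical, and the platform's actual handling of missing variables or of whitespace inside the braces may differ.

```python
import re

# Matches {{name}} placeholders; tolerating spaces inside the braces is an assumption.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_action(action: str, variables: dict) -> str:
    """Replace each {{name}} in `action` with its value from `variables`.

    Raises KeyError when a placeholder has no matching variable, so a
    half-filled instruction is never produced. (Hypothetical helper, not
    part of the product API.)
    """
    def substitute(match):
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"missing variable: {name}")
        return str(variables[name])

    return PLACEHOLDER.sub(substitute, action)


rendered = render_action(
    "Search for '{{search_term}}' and click the first result",
    {"search_term": "anthropic"},
)
print(rendered)
```

Failing loudly on a missing variable is a design choice in this sketch: it avoids spending credits on an instruction that still contains a literal `{{...}}`.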

When to Use

Perfect for:

Simple interactions like “Click submit button and verify success”
Quick navigation tasks like “Go to pricing page and take screenshot”
Single-purpose tasks that don’t need full workflow orchestration
Tasks where caching will provide significant cost savings
Predictable, repeatable actions

Not suitable for:

Multi-step workflows requiring more than 2 actions
Complex conditional logic spanning many steps
Tasks requiring state management across interruptions
Unpredictable flows with many decision points

Cost

Fresh execution: 50 credits
Cached execution: 5 credits

A cached execution costs one tenth of a fresh one (5 credits instead of 50), so the savings add up quickly for repeated tasks.
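To get a feel for the numbers, the two prices above can be turned into a quick estimate. This is a rough sketch: the `estimate_credits` helper is hypothetical, and how many of your runs actually hit the cache is an assumption you supply, not something the platform guarantees.

```python
FRESH_CREDITS = 50   # fresh execution price listed above
CACHED_CREDITS = 5   # cached execution price listed above


def estimate_credits(total_runs: int, cached_runs: int) -> int:
    """Estimate total credits for `total_runs`, of which `cached_runs` hit the cache."""
    if not 0 <= cached_runs <= total_runs:
        raise ValueError("cached_runs must be between 0 and total_runs")
    fresh_runs = total_runs - cached_runs
    return fresh_runs * FRESH_CREDITS + cached_runs * CACHED_CREDITS


# 100 runs with no cache hits, versus 100 runs where only the first is fresh.
print(estimate_credits(100, 0))   # 5000
print(estimate_credits(100, 99))  # 545
```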

Example Use Cases

Simple Navigation

"Navigate to google.com and search for 'anthropic'"

Form Interaction

"Click the login button and enter email"

Quick Verification

"Scroll to footer and click contact link"

Screenshot Capture

"Go to the pricing page and take a screenshot"

With Variables

"Search for '{{search_term}}' and click the first result"

Configuration Example

{
  "type": "browser_action",
  "action": "Navigate to {{website_url}} and click the sign up button",
  "variables": {
    "website_url": "https://example.com"
  }
}
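If you generate steps from code rather than writing them by hand, the same structure can be assembled and serialized with the standard library. This is a sketch only: `browser_action_step` is a hypothetical helper, and omitting the `variables` key when no variables are given is an assumption about the schema, which shows only the three fields above.

```python
import json
from typing import Dict, Optional


def browser_action_step(action: str, variables: Optional[Dict[str, str]] = None) -> dict:
    """Build a step dict with the same fields as the configuration example.

    Hypothetical helper: field names come from the example; treating
    `variables` as optional is an assumption.
    """
    if not action.strip():
        raise ValueError("action must be a non-empty instruction")
    step = {"type": "browser_action", "action": action}
    if variables:
        step["variables"] = dict(variables)  # copy so the caller's dict is not shared
    return step


step = browser_action_step(
    "Navigate to {{website_url}} and click the sign up button",
    {"website_url": "https://example.com"},
)
print(json.dumps(step, indent=2))
```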

Best Practices

Do ✅

Use for simple, well-defined tasks
Leverage caching for repeated operations
Keep instructions clear and concise
Use variables for dynamic content
Combine 2 related actions in a single step

Don’t ❌

Try to cram complex workflows into 2 steps
Use for unpredictable multi-page flows
Rely on complex conditional logic
Expect state persistence across calls
Use when you need more than 2 actions

Limitations

Max 2 steps per invocation
300 second timeout
No state persistence between invocations
Cannot handle complex multi-page workflows

When to Upgrade

Consider switching to Browser Agents when:

You need more than 2 steps
The workflow is unpredictable or conditional
You need state management
The task spans multiple pages with variable paths

Consider switching to Browser Code when:

You need precise control over logic
You’re extracting structured data
You need custom validation or transformation
The AI approach is too unpredictable

Related Guides

Browser Agents: For complex multi-step workflows
Browser Code: For precise programmatic control
