Browser Actions execute quick browser interactions using AI (Claude Sonnet 4.5) with a maximum of 2 steps per invocation.

Overview

Step type: browser_action

Browser Actions are perfect for simple, predictable interactions where you need fast execution and want to benefit from caching.

How It Works

  • AI Model: Uses Claude Sonnet 4.5 with the browser-use library
  • Step Limit: Maximum 2 autonomous steps per invocation
  • Variable Support: Supports variable interpolation using {{variable_name}} syntax
  • Caching: Built-in caching for repeated actions
  • Timeout: 300 seconds
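To make the {{variable_name}} syntax concrete, here is a minimal Python sketch of what substitution produces. The platform performs interpolation itself; the `interpolate` helper, its regular expression, and its error behavior are illustrative assumptions, not the product's implementation.

```python
import re

# Matches {{variable_name}} placeholders. Restricting names to word
# characters is an assumption made for this sketch.
PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")


def interpolate(action: str, variables: dict) -> str:
    """Replace each {{name}} in `action` with variables[name].

    Raises KeyError when a placeholder has no matching variable, so a
    misspelled name is caught before the action is used.
    """
    def substitute(match):
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"no value supplied for variable '{name}'")
        return str(variables[name])

    return PLACEHOLDER.sub(substitute, action)


resolved = interpolate(
    "Search for '{{search_term}}' and click the first result",
    {"search_term": "anthropic"},
)
print(resolved)  # Search for 'anthropic' and click the first result
```

Failing loudly on a missing variable is a design choice for the sketch: a placeholder left unresolved would otherwise reach the AI as literal text.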

When to Use

Perfect for:
  • Simple interactions like “Click submit button and verify success”
  • Quick navigation tasks like “Go to pricing page and take screenshot”
  • Single-purpose tasks that don’t need full workflow orchestration
  • Tasks where caching will provide significant cost savings
  • Predictable, repeatable actions

Not suitable for:
  • Multi-step workflows requiring more than 2 actions
  • Complex conditional logic spanning many steps
  • Tasks requiring state management across interruptions
  • Unpredictable flows with many decision points

Cost

  • Fresh execution: 50 credits
  • Cached execution: 5 credits

A cached execution costs 5 credits instead of 50, so Browser Actions are 10x cheaper when cached. For tasks that repeat often, the savings add up quickly.
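The arithmetic can be worked through with a short Python sketch. The credit prices are the ones listed above; the `total_credits` helper and the assumption that every run after the first is a cache hit are illustrative only, since actual hit rates depend on how caching behaves for your action.

```python
FRESH_CREDITS = 50   # price of a fresh execution, from the list above
CACHED_CREDITS = 5   # price of a cached execution, from the list above


def total_credits(runs: int, cache_hits: int) -> int:
    """Total credits for `runs` executions, `cache_hits` of them cached."""
    if not 0 <= cache_hits <= runs:
        raise ValueError("cache_hits must be between 0 and runs")
    fresh_runs = runs - cache_hits
    return fresh_runs * FRESH_CREDITS + cache_hits * CACHED_CREDITS


# 100 runs of the same action, assuming only the first one is fresh.
with_cache = total_credits(runs=100, cache_hits=99)
without_cache = total_credits(runs=100, cache_hits=0)
print(with_cache, without_cache)  # 545 5000
```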

Example Use Cases

Simple Navigation

"Navigate to google.com and search for 'anthropic'"

Form Interaction

"Click the login button and enter email"

Quick Verification

"Scroll to footer and click contact link"

Screenshot Capture

"Go to the pricing page and take a screenshot"

With Variables

"Search for '{{search_term}}' and click the first result"

Configuration Example

{
  "type": "browser_action",
  "action": "Navigate to {{website_url}} and click the sign up button",
  "variables": {
    "website_url": "https://example.com"
  }
}
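If you generate step configurations programmatically, a small helper can catch placeholders that have no value before the step is submitted. The sketch below is a hypothetical convenience, not part of the product: `build_browser_action` is an invented name, and only the `type`, `action`, and `variables` fields come from the example above.

```python
import json
import re


def build_browser_action(action: str, variables: dict) -> dict:
    """Return a browser_action step after checking its placeholders.

    Raises ValueError if the action references a {{name}} that is not
    present in `variables`.
    """
    referenced = set(re.findall(r"\{\{(\w+)\}\}", action))
    missing = referenced - set(variables)
    if missing:
        raise ValueError(f"missing variables: {sorted(missing)}")
    return {"type": "browser_action", "action": action, "variables": variables}


step = build_browser_action(
    "Navigate to {{website_url}} and click the sign up button",
    {"website_url": "https://example.com"},
)
print(json.dumps(step, indent=2))  # same shape as the configuration above
```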

Best Practices

Do ✅

  • Use for simple, well-defined tasks
  • Leverage caching for repeated operations
  • Keep instructions clear and concise
  • Use variables for dynamic content
  • Combine 2 related actions in a single step

Don’t ❌

  • Try to cram complex workflows into 2 steps
  • Use for unpredictable multi-page flows
  • Rely on complex conditional logic
  • Expect state persistence across calls
  • Use when you need more than 2 actions

Limitations

  • Max 2 steps per invocation
  • 300-second timeout
  • No state persistence between invocations
  • Cannot handle complex multi-page workflows

When to Upgrade

Consider switching to Browser Agents when:
  • You need more than 2 steps
  • The workflow is unpredictable or conditional
  • You need state management
  • The task spans multiple pages with variable paths

Consider switching to Browser Code when:
  • You need precise control over logic
  • You’re extracting structured data
  • You need custom validation or transformation
  • The AI approach is too unpredictable
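The criteria above can be condensed into a rough decision helper. This is only a sketch: the function and parameter names are hypothetical, and checking the Browser Code criteria first is an assumption, since the guidance does not rank the two alternatives.

```python
def recommend_step_type(
    actions_needed: int,
    needs_state: bool = False,
    unpredictable_flow: bool = False,
    needs_precise_logic: bool = False,
    extracts_structured_data: bool = False,
) -> str:
    """Suggest a step type from the upgrade criteria listed above."""
    # Precise logic or structured extraction points to Browser Code.
    if needs_precise_logic or extracts_structured_data:
        return "Browser Code"
    # More than 2 actions, state, or branching flows point to Browser Agents.
    if actions_needed > 2 or needs_state or unpredictable_flow:
        return "Browser Agents"
    # Otherwise the task fits within the 2-step limit of Browser Actions.
    return "Browser Actions"


print(recommend_step_type(actions_needed=2))                                 # Browser Actions
print(recommend_step_type(actions_needed=5))                                 # Browser Agents
print(recommend_step_type(actions_needed=1, extracts_structured_data=True))  # Browser Code
```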