Choosing the Right Step Type

Not sure which browser automation approach to use? This guide will help you make the right choice.

Quick Decision Tree

Is it a simple 1-2 action task?

Yes → Use Browser Actions
No → Continue to next step

Is the primary goal data extraction?

Yes → Use Browser Code
No → Continue to next step

Does it require multi-step navigation or complex workflow?

Yes → Use Browser Agents
No → Use Browser Actions
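The three questions above can be sketched as a small function. This is only an illustration of the decision order; the function name and its boolean parameters are hypothetical and are not part of any product API.

```python
def choose_step_type(is_simple: bool, is_extraction: bool, is_multi_step: bool) -> str:
    """Walk the three decision-tree questions in order (illustrative only)."""
    if is_simple:        # Question 1: simple 1-2 action task?
        return "Browser Actions"
    if is_extraction:    # Question 2: primary goal is data extraction?
        return "Browser Code"
    if is_multi_step:    # Question 3: multi-step navigation or complex workflow?
        return "Browser Agents"
    return "Browser Actions"  # final "No" branch of the tree


# Example: a checkout flow is neither simple nor extraction, but it is multi-step.
recommendation = choose_step_type(is_simple=False, is_extraction=False, is_multi_step=True)
```

Because the questions are checked in order, a simple task resolves to Browser Actions even when it also involves a small amount of extraction.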

By Use Case

Simple Interactions

Example: “Click the login button and enter email”

Browser Actions

Best Choice - Fast, cheap, and perfect for simple tasks

Why?

Completes in 1-2 steps
Cacheable for cost savings
Fast execution
Predictable outcome

Data Extraction

Example: “Extract all product names and prices from the page”

Browser Code

Best Choice - Precise, reliable extraction with structured output

Why?

Returns structured data
Precise control over extraction logic
Cacheable code generation
Reliable and repeatable

Alternative: Use browser_assess_page for simpler extraction with schema validation.

Multi-Page Workflows

Example: “Complete the checkout process from cart to confirmation”

Browser Agents

Best Choice - Autonomous navigation with unlimited steps

Why?

No step limit
Handles complex flows
State management
Adaptive to page changes

Visual Understanding

Example: “Find and click the green ‘Confirm’ button in the modal”

Computer Use Agent

Best Choice - Vision-based understanding of layout

Why?

Understands visual layout
Extended reasoning capability
Works with complex visual interfaces
Can identify elements by appearance

Comparison Matrix

By Task Complexity

Complexity Level	Task Examples	Best Choice	Alternative
Low (1-2 steps)	• Click button • Fill single field • Navigate to page	Browser Actions	-
Medium (3-10 steps)	• Multi-step form • Login flow • Search and filter	Browser Agents	Browser Actions (if predictable)
High (10+ steps)	• Complete checkout • Multi-page wizard • Complex workflow	Browser Agents	-
Data Focused	• Extract table data • Scrape listings • Parse structured info	Browser Code	`browser_assess_page`

By Priority

Priority	Recommendation	Why
Cost Optimization	Browser Actions (with caching)	5 credits per cached execution
Reliability	Browser Code	Explicit, deterministic logic
Flexibility	Browser Agents	Handles unpredictable scenarios
Speed	Browser Actions	Fast execution, minimal overhead
Precision	Browser Code	Exact control over operations

By Output Needs

What You Need	Best Choice	Example
No output (just actions)	Browser Actions or Agents	Click, navigate, type
Simple confirmation	Browser Actions	“Button clicked successfully”
Structured data	Browser Code	`[{name: "...", price: "..."}]`
Page assessment	`browser_assess_page`	Evaluated structured data
Screenshots	Any + `browser_get_screenshot`	PNG image data

Common Scenarios

Scenario 1: E-commerce Product Search

Task: Search for “wireless headphones”, filter by price under $100, and add first result to cart

{
  "type": "browser_action",
  "action": "Search for 'wireless headphones' and click the first result under $100"
}
// ⚠️ Might hit 2-step limit

Verdict: Use Browser Agents - requires multiple steps (search, filter, select, add to cart)

Scenario 2: Extract Product Listings

Task: Extract all products with name, price, and rating from the current page

{
  "type": "execute_javascript",
  "prompt": "Extract all products with name, price, and rating as array of objects",
  "return_by_value": true
}
// ✅ Precise, structured output

Verdict: Use Browser Code for complex extraction or browser_assess_page for simpler needs

Scenario 3: Login and Navigate

Task: Log in to dashboard and navigate to settings page

{
  "type": "browser_action",
  "action": "Click login button and navigate to settings"
}
// ⚠️ Only if login is 1 click

Verdict: Use Browser Agents - login flows often require multiple steps

Scenario 4: Click Submit Button

Task: Scroll to the bottom and click the submit button

{
  "type": "browser_action",
  "action": "Scroll to bottom and click submit button"
}
// ✅ Perfect fit - 2 simple actions

Verdict: Use Browser Actions - simple, predictable, cacheable

Cost Considerations

Optimize for Cost

Use Browser Actions for repeated simple tasks
  - Fresh: 50 credits
  - Cached: 5 credits (10x cheaper!)
Cache Browser Code for repeated extractions
  - Code generation cached
  - Subsequent runs cheaper
Avoid over-using Agents for simple tasks
  - Variable cost based on steps
  - More expensive than Actions

Example Cost Comparison

Task: Click login button (repeated 100 times)

Approach	Cost
Browser Actions (cached)	5 credits × 100 = 500 credits
Browser Actions (fresh)	50 credits × 100 = 5,000 credits
Browser Agents	~2-5 steps × cost/step × 100 = Higher

For repeated tasks, Browser Actions with caching cost about a tenth of fresh executions (500 vs. 5,000 credits in this example).
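The arithmetic behind the comparison can be checked with a short sketch. The credit figures (50 fresh, 5 cached) come from this guide; the helper name and the "first run fresh, remaining runs cached" scenario are assumptions added for illustration, since the guide does not state how the cache is warmed.

```python
FRESH_CREDITS = 50   # per fresh Browser Action, as listed in this guide
CACHED_CREDITS = 5   # per cached Browser Action, as listed in this guide


def total_credits(runs: int, credits_per_run: int) -> int:
    """Total cost when every run is billed at the same rate."""
    return runs * credits_per_run


runs = 100
cached_total = total_credits(runs, CACHED_CREDITS)   # 5 x 100
fresh_total = total_credits(runs, FRESH_CREDITS)     # 50 x 100
savings_ratio = fresh_total / cached_total

# Assumption: the first run is fresh and populates the cache,
# and the remaining runs are billed at the cached rate.
warm_start_total = FRESH_CREDITS + total_credits(runs - 1, CACHED_CREDITS)

print(cached_total, fresh_total, savings_ratio, warm_start_total)
```

Browser Agents are left out of the sketch because their cost per step is not given here.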

Still Not Sure?

Start Simple, Upgrade as Needed

Try Browser Actions first if the task seems simple
Upgrade to Agents if you hit the 2-step limit
Switch to Code if you need precise data extraction
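The "start simple, upgrade as needed" idea could be wired up roughly as follows. This is a sketch under stated assumptions: `StepLimitReached`, `run_action`, and `run_agent` are hypothetical stand-ins for however your integration runs a Browser Action, runs a Browser Agent, and reports that the 2-step limit was hit. None of them are real API names.

```python
from typing import Callable


class StepLimitReached(Exception):
    """Hypothetical signal that a Browser Action needed more than 2 steps."""


def run_with_upgrade(
    task: str,
    run_action: Callable[[str], str],
    run_agent: Callable[[str], str],
) -> str:
    """Try the cheaper Browser Action first; fall back to an Agent on the step limit."""
    try:
        return run_action(task)
    except StepLimitReached:
        # The task turned out to be multi-step, so hand it to an Agent instead.
        return run_agent(task)
```

Only the step-limit case triggers the fallback; other failures propagate, so a genuinely broken task is not silently retried with a more expensive step type.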

Ask Yourself

How many steps does this task require?

1-2 steps: Browser Actions
3+ steps: Browser Agents
Just extraction: Browser Code

Will I repeat this task often?

Yes, frequently: Browser Actions (for caching)
No, one-time: Any approach works
Varies by data: Browser Code (caches generation)

How predictable is the workflow?

Very predictable: Browser Actions or Code
Some variation: Browser Agents
Highly dynamic: Browser Agents

What output do I need?

None (just actions): Browser Actions
Structured data: Browser Code
Completion confirmation: Browser Agents

Quick Reference

Browser Actions

When: Simple, fast, repeated
Cost: 5-50 credits
Limit: 2 steps

Browser Agents

When: Complex, multi-step, adaptive
Cost: Variable
Limit: None

Browser Code

When: Data extraction, precise control
Cost: Medium
Limit: Logic-based

Need Help?

For detailed implementation, see individual guides or contact support at [email protected].
