
How to Use ScraperAPI with Ferrum (Ruby) to Scrape Websites

This guide shows you how to integrate ScraperAPI with Ferrum, a headless browser tool for Ruby. You’ll learn how to set up Ruby and Ferrum on your machine, route requests through ScraperAPI’s proxy, and scrape dynamic websites that load content with JavaScript. The goal is real, usable data, collected quickly and cleanly.

Getting Started with Ferrum (No Proxy)

Here’s what a basic Ferrum script looks like without ScraperAPI:

require 'ferrum'

browser = Ferrum::Browser.new
browser.goto('https://example.com')
puts browser.current_title
browser.quit

This works fine for simple pages. But when you try this on sites that block scraping, use JavaScript to render content, or throw CAPTCHAs, you’ll hit a wall. Ferrum doesn’t rotate IPs or handle advanced blocking on its own.

That’s where ScraperAPI comes in.

Recommended Method: Use ScraperAPI as a Proxy

This method sends all your Ferrum traffic through ScraperAPI’s proxy. It gives you IP rotation, country targeting, CAPTCHA bypass, and support for JS-heavy sites.

Requirements

  • Ruby (v2.6 or later)
  • Bundler (gem install bundler)
  • Chrome or Chromium installed on your system
  • ScraperAPI Key (you can get one by signing up!)
  • Ferrum

Installation and Setup

If you don’t have them already, install Ruby and Bundler (the commands below are for Debian/Ubuntu):

sudo apt update
sudo apt install ruby-full -y
sudo gem install bundler

Create a Gemfile in your project folder:

touch Gemfile

And add the following:

# Gemfile
source 'https://rubygems.org'

gem 'ferrum'
gem 'dotenv'

Then run:

bundle install

This installs the required gems using Bundler.

.env File

In your project folder, create a .env file with the following:

SCRAPERAPI_KEY=your_api_key_here
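For context, all the dotenv gem does at load time is read KEY=value pairs from .env into ENV. A minimal sketch of that behavior in plain Ruby (the load_env helper name is ours; the real gem handles quoting and other edge cases):

```ruby
# Hypothetical minimal re-implementation of what dotenv does:
# parse KEY=value lines from a file into ENV (existing values win).
def load_env(path = '.env')
  return unless File.exist?(path)

  File.foreach(path) do |line|
    line = line.strip
    next if line.empty? || line.start_with?('#') # skip blanks and comments

    key, value = line.split('=', 2)
    ENV[key] ||= value if key && value
  end
end
```

This is why the script below can simply read ENV['SCRAPERAPI_KEY'] after requiring dotenv/load.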

Your Script

In a file named test_scraper.rb, paste the following:

require 'ferrum'
require 'dotenv/load'

SCRAPERAPI_KEY = ENV['SCRAPERAPI_KEY']

# ScraperAPI's proxy mode: authenticate with username/password;
# extra parameters (like render=true) are appended to the username.
browser = Ferrum::Browser.new(
  proxy: {
    host: 'proxy-server.scraperapi.com',
    port: 8001,
    user: 'scraperapi.render=true',
    password: SCRAPERAPI_KEY
  }
)

browser.goto('https://news.ycombinator.com/')

puts "\nTop 5 Hacker News Headlines:\n\n"

browser.css('.athing .titleline > a').first(5).each_with_index do |link, index|
  puts "#{index + 1}. #{link.text.strip}"
end

# Save output to HTML file for browser inspection
File.write('output.html', browser.body)
puts "\nSaved result to output.html"

browser.quit

# Optional: open the file in Chrome (macOS; on Linux, try xdg-open instead)
system("open -a 'Google Chrome' output.html")

The script above uses Ferrum to visit a site that relies on JavaScript. It sends the request through ScraperAPI with render=true to load dynamic content. It scrapes the top 5 headlines from Hacker News, saves the full HTML, and lets you open it in Chrome to check the results.

Save your script as test_scraper.rb, then run it:

ruby test_scraper.rb

The headlines print to your terminal, and output.html opens in Chrome showing the rendered page.

This confirms that ScraperAPI is handling the request.

Optional Parameters

ScraperAPI accepts additional options. In API mode you pass them as query parameters; in proxy mode you append them to the proxy username, separated by periods (for example, scraperapi.render=true.country_code=us):

  • render=true: tells ScraperAPI to execute JavaScript. Use for SPAs and dynamic content.
  • country_code=us: routes requests through US proxies. Great for geo-blocked content.
  • premium=true: enables CAPTCHA solving and advanced anti-bot measures. Essential for heavily protected sites.
  • session_number=123: maintains the same proxy IP across requests. Use when you need to keep login sessions alive.

These parameters cover most scraping scenarios. Check the ScraperAPI documentation for additional options.

Example

proxy = {
  host: 'proxy-server.scraperapi.com',
  port: 8001,
  user: 'scraperapi.render=true.country_code=us.session_number=123',
  password: SCRAPERAPI_KEY
}

browser = Ferrum::Browser.new(proxy: proxy)
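If you call ScraperAPI's HTTP API endpoint (api.scraperapi.com) directly instead of going through proxy mode, the same options travel as plain query parameters. A minimal sketch of a URL builder using only the Ruby standard library (the scraperapi_url helper name is our own):

```ruby
require 'uri'

# Hypothetical helper: build an API-mode request URL from the API key,
# the target URL, and any optional ScraperAPI parameters.
def scraperapi_url(api_key, target_url, params = {})
  query = URI.encode_www_form({ api_key: api_key, url: target_url }.merge(params))
  "https://api.scraperapi.com/?#{query}"
end

puts scraperapi_url('your_api_key_here', 'https://news.ycombinator.com/',
                    render: true, country_code: 'us')
```

Fetching that URL with any HTTP client returns the rendered page, so this mode is handy when you don't need a full browser on your side.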

Configuration & Best Practices

Concurrency

Use threads to run multiple Ferrum sessions:

threads = 5.times.map do
  Thread.new do
    browser = Ferrum::Browser.new(
      proxy: {
        host: 'proxy-server.scraperapi.com',
        port: 8001,
        user: 'scraperapi',
        password: SCRAPERAPI_KEY
      }
    )
    browser.goto('https://httpbin.org/ip')
    puts browser.body
    browser.quit
  end
end

threads.each(&:join)
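Keep in mind that ScraperAPI plans cap the number of concurrent requests, so it's better to throttle workers than to launch one thread per URL. Here's a sketch of that pattern using the standard library's SizedQueue as a semaphore; the limit of 3 and the URLs are placeholders, and the scrape itself is simulated:

```ruby
MAX_CONCURRENCY = 3
slots = SizedQueue.new(MAX_CONCURRENCY) # push blocks once 3 slots are taken
results = Queue.new

urls = (1..10).map { |i| "https://example.com/page/#{i}" }

threads = urls.map do |url|
  Thread.new do
    slots.push(true) # wait for a free slot
    begin
      # A real worker would open a Ferrum session here and scrape `url`.
      results.push("fetched #{url}")
    ensure
      slots.pop # release the slot for the next thread
    end
  end
end

threads.each(&:join)
puts "Processed #{results.size} pages"
```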

Retry Logic

Wrap unstable requests in retry blocks:

attempts = 0
begin
  browser.goto('https://targetsite.com')
rescue Ferrum::StatusError => e
  attempts += 1
  raise if attempts >= 3 # give up after three tries instead of looping forever
  sleep 1
  retry
end
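If you retry in several places, it's cleaner to factor the pattern into a reusable helper with capped attempts and exponential backoff. A plain-Ruby sketch (with_retries is a name of our own choosing, not a Ferrum or ScraperAPI API):

```ruby
# Hypothetical helper: run a block, retrying up to max_attempts times
# and sleeping a little longer after each failure.
def with_retries(max_attempts: 3, base_delay: 0.5)
  attempts = 0
  begin
    yield
  rescue StandardError => e
    attempts += 1
    raise if attempts >= max_attempts
    sleep(base_delay * (2**attempts)) # back off: 1s, 2s, 4s, ...
    retry
  end
end

# Usage with Ferrum:
# with_retries { browser.goto('https://targetsite.com') }
```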

For more information, check the ScraperAPI documentation.

Ready to start scraping?

Get started with 5,000 free API credits or contact sales

No credit card required