How to Use ScraperAPI with Cypress for Web Scraping and Testing

Use ScraperAPI with Cypress to scrape JavaScript-heavy sites and run end-to-end tests. It’s perfect for dynamic pages that regular scraping tools can’t handle.

Getting started

This basic Cypress test works fine for static sites, but it breaks on pages that load content with JavaScript:

// Basic Cypress Test Without ScraperAPI
describe('Plain Cypress scraping', () => {
  it('visits a page', () => {
    cy.visit('https://example.com')
    cy.get('h1').should('contain.text', 'Example Domain')
  })
})

To scrape JavaScript-heavy pages, use ScraperAPI with cy.request() and DOM parsing instead.

Recommended Method: Custom Command + DOM Injection + ScraperAPI

ScraperAPI handles rendering, proxies, CAPTCHAs, and retries for you. Cypress fetches the HTML, injects it into a DOM node, and lets you query it easily.

Requirements

  • Cypress to run the scraping tests
  • npm, the package manager, to install Cypress and dependencies
  • Node.js to run Cypress and npm
  • dotenv to keep your credentials out of your code
  • A ScraperAPI account and its API key for scraping

Step 1: Set Up Your Node.js Project

Begin by installing Node.js and npm, then move into your project folder.

# For Ubuntu
sudo apt update
sudo apt install nodejs npm

# For macOS (includes npm)
brew install node

# For Windows
# Download and install Node.js (which includes npm) from https://nodejs.org/en/download/ and follow the installer steps.

Initialize your Node.js project and download Cypress by running:

npm init -y
npm install cypress --save-dev

Step 2: Add a Custom Command

First off, generate a Cypress folder structure by running this in your terminal from the root of your project: 

npx cypress open

If this is the first time you run it, Cypress will create its default folder structure.

Now you can navigate to cypress/support/commands.js and create a reusable Cypress command that integrates with ScraperAPI to fetch and parse HTML from JavaScript-heavy websites.

// cypress/support/commands.js
Cypress.Commands.add('scrapeViaScraperAPI', (targetUrl) => {
  // Route the request through ScraperAPI, which fetches and renders the target page
  const scraperUrl = `http://api.scraperapi.com?api_key=${Cypress.env('SCRAPER_API_KEY')}&url=${encodeURIComponent(targetUrl)}&timeout=60000`;

  return cy.request(scraperUrl).then((response) => {
    return cy.document().then((document) => {
      // Parse the returned HTML in a detached container, then extract the titles
      const container = document.createElement('div');
      container.innerHTML = response.body;
      const titles = Array.from(container.querySelectorAll('.product_pod h3 a')).map(el =>
        el.getAttribute('title')
      );
      return titles;
    });
  });
});

Use an environment variable to store your ScraperAPI key. You can find your API key in your ScraperAPI dashboard.


Install dotenv, then create a .env file in your project root:

npm install -D dotenv
touch .env
nano .env
# .env
SCRAPER_API_KEY=your_scraper_api_key
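
Under the hood, dotenv reads that file and copies each KEY=value pair into process.env. A simplified sketch of that parsing step (an illustration only, not dotenv's actual implementation):

```javascript
// Simplified sketch of .env parsing, for illustration only.
// Real dotenv also handles quoting, export prefixes, and multiline values.
function parseEnv(text) {
  const env = {};
  for (const line of text.split('\n')) {
    const trimmed = line.trim();
    if (!trimmed || trimmed.startsWith('#')) continue; // skip blanks and comments
    const eq = trimmed.indexOf('=');
    if (eq === -1) continue; // skip malformed lines
    env[trimmed.slice(0, eq).trim()] = trimmed.slice(eq + 1).trim();
  }
  return env;
}
```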

Update your cypress.config.js as follows:

// cypress.config.js
const { defineConfig } = require("cypress");
require('dotenv').config();

module.exports = defineConfig({
  e2e: {
    setupNodeEvents(on, config) {
      config.env.SCRAPER_API_KEY = process.env.SCRAPER_API_KEY;
      return config;
    },
    supportFile: "cypress/support/commands.js"
  }
});

Step 3: Use the Command in Your Test

If they don't exist yet, create the cypress/e2e folder and a scraperapi.cy.js file inside it:

mkdir -p cypress/e2e
touch cypress/e2e/scraperapi.cy.js

In that file, call the custom command from a Cypress test that injects the scraped data into the browser DOM.

// cypress/e2e/scraperapi.cy.js
describe('Scrape Books to Scrape with ScraperAPI + Cypress', () => {
  it('gets product titles and displays them', () => {
    cy.visit('cypress/fixtures/blank.html'); // Load static HTML file

    cy.scrapeViaScraperAPI('http://books.toscrape.com/catalogue/page-1.html').then((titles) => {
      cy.document().then((doc) => {
        const container = doc.getElementById('results');
        const list = doc.createElement('ul');

        titles.forEach(title => {
          const item = doc.createElement('li');
          item.innerText = title;
          list.appendChild(item);
        });

        container.appendChild(list);
      });

      cy.screenshot('scraped-book-titles'); // Take screenshot after injecting
    });
  });
});
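
The DOM-injection step above can also be factored into a pure helper that renders the titles to an HTML string, which makes that logic easy to unit-test outside Cypress (titlesToListHtml is a hypothetical helper, not part of the tutorial's files):

```javascript
// Hypothetical helper: render scraped titles to an HTML list string.
// Escaping mirrors what the browser does implicitly when you assign innerText.
function titlesToListHtml(titles) {
  const escapeHtml = (s) =>
    s.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;');
  return '<ul>' + titles.map((t) => `<li>${escapeHtml(t)}</li>`).join('') + '</ul>';
}
```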

Step 4: Create the blank.html file and run your Cypress test

In your project folder, create the folder cypress/fixtures if it doesn’t exist yet:

mkdir -p cypress/fixtures

Inside, create the blank.html with the following minimal code (or similar!):

<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8" />
  <title>Blank Page</title>
</head>
<body>
  <div id="results"></div>
</body>
</html>

You can now run your tests from the project root folder (the one where your package.json lives).

npx cypress run

This method works because:

  • ScraperAPI handles the proxying and geo-routing
  • Cypress injects the content into the browser DOM
  • You get full control using native DOM APIs

Alternative: cy.request without ScraperAPI

You can call cy.request() directly, but it won’t render JS or rotate IPs:

describe('Simple cy.request test', () => {
  it('should load example.com and check the response', () => {
    cy.request('https://example.com').then((response) => {
      expect(response.status).to.eq(200);
      expect(response.body).to.include('Example Domain');
    });
  });
});

This method is not ideal because:

  • It exposes your IP to bot protection.
  • It doesn’t bypass CAPTCHAs or rotate proxies.
  • It fails on sites that require geolocation or JavaScript rendering.

Prefer ScraperAPI for anything beyond basic scraping.

ScraperAPI Parameters That Matter

ScraperAPI supports options via query parameters:

const scraperUrl = `http://api.scraperapi.com?api_key=YOUR_KEY&url=https://target.com&render=true&country_code=us&session_number=555`
  • render=true: Tells ScraperAPI to load JavaScript. Use it for dynamic pages or SPAs.
  • country_code=us: Uses a U.S. IP address. Great for geo-blocked content.
  • premium=true: Solves CAPTCHAs and retries failed requests. Needed for hard-to-scrape sites.
  • session_number=555: Keeps the same proxy IP across multiple requests. Use it when you need to maintain a session.

These four parameters cover most use cases. For more, check the ScraperAPI docs.
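
Rather than concatenating the query string by hand, you can build it with URLSearchParams, which percent-encodes the target URL for you (buildScraperUrl is a convenience helper of my own, not part of any ScraperAPI SDK):

```javascript
// Hypothetical helper: build a ScraperAPI request URL from an options object.
// URLSearchParams encodes the target URL, so no manual encodeURIComponent call is needed.
function buildScraperUrl(apiKey, targetUrl, options = {}) {
  const params = new URLSearchParams({ api_key: apiKey, url: targetUrl, ...options });
  return `http://api.scraperapi.com?${params.toString()}`;
}

// Example: render JavaScript and route through a U.S. IP.
const scraperUrl = buildScraperUrl('YOUR_KEY', 'https://target.com', {
  render: 'true',
  country_code: 'us',
});
```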

Test Retries

Improve stability with test retries:

// cypress.config.js (merge into your existing e2e block)
const { defineConfig } = require("cypress");

module.exports = defineConfig({
  e2e: {
    retries: {
      runMode: 2,
      openMode: 0,
    },
  },
});

This helps when pages load slowly or requests hit rate limits.
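
Retries can also be scoped to a single test instead of the whole suite; Cypress accepts a test-level configuration object as the second argument to it(). A fragment showing the idea, applied to the test from Step 3:

```javascript
// cypress/e2e/scraperapi.cy.js (fragment): retry only this flaky test.
it('gets product titles and displays them', { retries: { runMode: 2, openMode: 0 } }, () => {
  // ...test body as in Step 3...
});
```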

Visualize the Scraped Data in the DOM

To see the data you’re scraping, run your test using:

npx cypress open

Then select scraperapi.cy.js in the Cypress UI. You should get these results:

  • The static HTML page load (Ready for Scraped Data)
  • Scraped book titles dynamically injected into the DOM
  • A screenshot saved as scraped-book-titles.png

Ready to start scraping?

Get started with 5,000 free API credits or contact sales

No credit card required