How to Scrape Walmart Product Data with Node.js


Walmart is a data mine: the millions of products listed on its platform give business and data analysts the raw material to make better decisions and gain a competitive advantage in their market.

However, scraping Walmart product data is challenging due to its anti-scraping mechanisms, such as CAPTCHA challenges, request rate limiting, and IP address bans, which make it hard to get consistent access to product data at scale.

Get Structured Walmart Product Data

ScraperAPI’s Walmart endpoints allow you to turn any Walmart page, including product, search, and category pages, into structured JSON data.

In this tutorial, we’ll build a Walmart web scraper from scratch and show you how to easily bypass its anti-bot detection without overloading its servers.

TL;DR: Full Walmart Node.js Scraper

For those in a hurry, here’s the full code in Node.js.

	const axios = require('axios');
	const cheerio = require('cheerio');
	
	const WALMART_PAGE_URL = 'https://walmart.com/search?q=computer+screen';
	const API_URL = 'https://api.scraperapi.com';
	const API_KEY = ''; // <--- Enter your API key here
	
	const parseProductReview = (inputString) => {
	   const ratingRegex = /(\d+\.\d+)/;
	   const reviewCountRegex = /(\d+) reviews/;
	
	   const ratingMatch = inputString.match(ratingRegex);
	   const reviewCountMatch = inputString.match(reviewCountRegex);
	
	   const rating = ratingMatch?.length > 0 ? parseFloat(ratingMatch[0]) : null;
	   const reviewCount = reviewCountMatch?.length > 1 ? parseInt(reviewCountMatch[1]) : null;
	
	   return { rating, reviewCount };
	};
	
	const webScraper = async () => {
	   const queryParams = new URLSearchParams({
		   api_key: API_KEY,
		   url: WALMART_PAGE_URL,
		   render: true,
	   });
	
	   try {
		   const response = await axios.get(`${API_URL}?${queryParams.toString()}`);
	
		   const html = response.data;
	
		   const $ = cheerio.load(html);
		   const productList = [];
	
		   $("div[data-testid='list-view']").each((_, el) => {
			   const link = $(el).prev('a').attr('href');
			   const price = $(el).find('.f2').text();
			   const priceCents = $(el).find('.f2 + span.f6').text();
			   const description = $(el).find("span[data-automation-id='product-title']").text();
			   const reviews = $(el).find('.w_iUH7').last().text();
			   const delivery = $(el).find("div[data-automation-id='fulfillment-badge']").find('span.b').last().text();
	
			   const { rating, reviewCount } = parseProductReview(reviews);
	
			   productList.push({
				   description,
				   price: `$${price}.${priceCents}`,
				   averageRating: rating,
				   totalReviews: reviewCount,
				   delivery,
				   link: `https://www.walmart.com${link}`
			   });
		   });
	
		   console.log(productList);
	   } catch (error) {
		   console.log(error.response?.data || error.message);
	   }
	};
	
	void webScraper();

Before running this code, install the dependencies and set your API key, which you can find in your ScraperAPI dashboard. This step is critical, as ScraperAPI is what allows us to bypass Walmart’s anti-bot mechanisms and render its dynamic content.
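For reference, both dependencies (covered in Step 2 below) install with a single command:

	npm install axios cheerio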

Note: The free trial includes 5,000 API credits and access to all ScraperAPI tools.

Want to learn the details? Let’s jump into the tutorial!

Scraping Walmart Product Data in Node.js

To demonstrate how to scrape Walmart, we’ll write a script that finds computer screens on Walmart.

For each computer screen, we’ll extract the:

  • Description
  • Price
  • Average rating
  • Total reviews
  • Delivery estimation
  • Product link

The script will output the extracted data as JSON, making it easier to reuse for other purposes.

Search result from Walmart for the query Computer Screen


Important Update

Turn Walmart search pages into easy-to-navigate JSON data using ScraperAPI’s Walmart Search endpoint.

https://api.scraperapi.com/structured/walmart/search

Just send your requests to the endpoint alongside your API key and query within a payload. You’ll be able to pick specific data points using predictable and consistent JSON key-value pairs.
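Here’s a minimal sketch of such a request with Axios, assuming the endpoint accepts api_key and query as query-string parameters:

	const axios = require('axios');

	// Assumed parameter names: 'api_key' and 'query'
	const queryParams = new URLSearchParams({
	   api_key: 'YOUR_API_KEY',
	   query: 'computer screen',
	});

	axios
	   .get(`https://api.scraperapi.com/structured/walmart/search?${queryParams.toString()}`)
	   .then(({ data }) => console.log(data.items)) // 'items' holds the product list, as in the sample below
	   .catch((error) => console.log(error.response?.data || error.message));

The response looks like this: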

{
  "items": [
    {
      "availability": "In stock",
      "id": "5Q2MFCGA4YFB",
      "image": "https://i5.walmartimages.com/asr/0119bcd3-0ea5-45f5-92b8-451c7db9dae9.740de01aff4ffa73cd8579e3335e187a.jpeg?odnHeight=180&odnWidth=180&odnBg=ffffff",
      "name": "Matchbox Hitch N' Haul: 1:64 Scale Toy Vehicle & Trailer, 4 Accessories (Styles May Vary)",
      "price": 9.76,
      "rating": {
        "average_rating": 4,
        "number_of_reviews": 31
      },
      "seller": "Lord of Retail",
      "url": "https://www.walmart.com/ip/Matchbox-Hitch-N-Haul-1-64-Scale-Toy-Vehicle-Trailer-4-Accessories-Styles-May-Vary/33707529"
    },

[More Data]


See the full Walmart Search endpoint sample response.

Explore Walmart Search SDE

Prerequisites

To follow this tutorial, you must have Node.js and npm installed on your computer.

Step 1: Set Up the Project

Let’s create a folder containing the source code of the Walmart scraper.

	mkdir walmart-scraper

Enter the folder and initialize a new Node.js project:

	cd walmart-scraper

	npm init -y

The second command above will create a package.json file in the folder.

Next, create a file index.js and add a simple JavaScript instruction inside.

	touch index.js

	echo "console.log('Hello world!');" > index.js	

Run the file index.js using the Node.js runtime.

	node index.js

This execution will print a Hello world! message to the terminal.

Step 2: Install the Dependencies

To build this scraper, we need these two Node.js packages:

  • Axios – to build the HTTP request (headers, body, query string parameters, etc…), send it to the ScraperAPI standard API, and download the HTML content
  • Cheerio – to extract the information from the HTML downloaded from the Axios request

Run the command below to install these packages:

	npm install axios cheerio

Step 3: Identify the DOM Selectors to Target

Navigate to https://www.walmart.com, type “computer screen” in the search bar, and press Enter.

When the search result appears, inspect the page to display the HTML structure and identify the DOM selector associated with the HTML tag wrapping the information we want to extract.

Screenshot highlighting the CSS selectors for the elements we want to scrape

Based on the picture above, here are the DOM selectors our web scraper will target to extract the information:

  • Product description: span[data-automation-id='product-title']
  • Price: div[data-testid='list-view'] .f2
  • Price cents: div[data-testid='list-view'] .f2 + span.f6
  • Average rating: div[data-testid='list-view'] span.w_V_DM + div.flex span.w_iUH7
  • Total reviews: div[data-testid='list-view'] span.w_V_DM + div.flex span.f7
  • Delivery estimation: div[data-testid='list-view'] div[data-automation-id='fulfillment-badge'] div.f7:last-child span.b:last-child
  • Product link: div + a

Be careful when writing these selectors: a misspelling will prevent the script from retrieving the correct value.

Note: A good way to avoid errors when building your selectors is to test them with jQuery first. In the browser console, type your selector, e.g., $("span[data-automation-id='product-title']"). If it returns the correct DOM element, you are good to go.
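If jQuery isn’t available on the page, the browser’s built-in document.querySelectorAll() gives you the same check:

	// Run in the browser DevTools console on the search results page
	document.querySelectorAll("span[data-automation-id='product-title']").length;
	// A count greater than 0 means the selector matches product titles on the page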

Important Update

Forget about building and maintaining parsers. With ScraperAPI’s Walmart Products endpoint, you can turn product pages into easy-to-navigate JSON data.

https://api.scraperapi.com/structured/walmart/product

Just send your requests to the endpoint alongside your API key and product ID within a payload. Then, extract specific product details using JSON key-value pairs like "reviews".
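Here’s a minimal sketch of such a request, assuming the endpoint accepts api_key and product_id as query-string parameters (the numeric product ID below is taken from the URL in the sample search response above):

	const axios = require('axios');

	// Assumed parameter names: 'api_key' and 'product_id'
	const queryParams = new URLSearchParams({
	   api_key: 'YOUR_API_KEY',
	   product_id: '33707529', // numeric ID from the sample product URL
	});

	axios
	   .get(`https://api.scraperapi.com/structured/walmart/product?${queryParams.toString()}`)
	   .then(({ data }) => console.log(data.reviews))
	   .catch((error) => console.log(error.response?.data || error.message));

The reviews field comes back in this shape: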

"reviews": [
        {
            "title": "Walmart pickup order",
            "text": "Make sure they scan your item right I received my order thru the delivery but when I finished activating my phone it still says SOS mode only so I'll be making a trip to Walmart sucks too have a locked phone after spending the money for it",
            "author": "TrustedCusto",
            "date_published": "2/8/2024",
            "rating": 5
        },
        {
            "title": "Very satisfied.",
            "text": "I'm very satisfied with my purchase and product. Thank you. Definitely will recommend when I get the chance. Keep up the good work. I appreciate you.",
            "author": "BLVCKSNVCK",
            "date_published": "10/7/2023",
            "rating": 5
        },

[More Data]


See the full Walmart Product endpoint sample response.

Explore Walmart Product SDE

Step 4: Send the Initial Request to Walmart

As mentioned, we’ll use ScraperAPI’s standard API to bypass Walmart’s anti-bot detection.

To build the requests, we’ll use Axios and pass the following parameters to the API:

  • The URL to scrape: the URL of the Walmart search results page, which you can copy from your browser’s address bar (in our case, https://walmart.com/search?q=computer+screen).
  • The API key: used to authenticate against the standard API and perform the scraping. You can find it on your ScraperAPI dashboard.
  • Enable JavaScript: Walmart’s website is built with a modern frontend framework that renders content with JavaScript. To enable JavaScript rendering while scraping, we set the `render` parameter to `true`.

Note: Enabling JS rendering will allow us to scrape Walmart’s dynamic content. Without it, we would get a blank page.

Edit index.js and add the code below, which builds the HTTP request, sends it, receives the response, and prints it to the terminal.

	const axios = require('axios');

	const WALMART_PAGE_URL = 'https://walmart.com/search?q=computer+screen';
	const API_URL = 'https://api.scraperapi.com';
	const API_KEY = ''; // <--- Enter your API key here
	
	const webScraper = async () => {
	   const queryParams = new URLSearchParams({
		   api_key: API_KEY,
		   url: WALMART_PAGE_URL,
		   render: true,
	   });
	
	   try {
		   const response = await axios.get(`${API_URL}?${queryParams.toString()}`);
	
		   const html = response.data;
	
		   console.log("HTML content", html);
	   } catch (error) {
		   console.log(error.response?.data || error.message);
	   }
	};
	
	void webScraper();

Collect Walmart Data at Scale

Scrape thousands of Walmart products with a custom Enterprise plan. Get more than 3M API credits, +100 concurrent threads, premium support, and an account manager.

Step 5: Extract Information From the HTML

Now that we have the HTML content of the page, we must parse it with Cheerio to be able to navigate it and extract all the information we want.

Cheerio provides functions to load HTML text and then navigate through the structure to extract information using the DOM selectors.

The code below goes through each product element, extracts the information, and builds an array containing all the computer screens found on the page.

	const cheerio = require('cheerio');

	const $ = cheerio.load(html);
	const productList = [];
	
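	// Pull a decimal rating (e.g. "4.5") and a review count (e.g. "132 reviews") out of the review summary text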
	const parseProductReview = (inputString) => {
	   const ratingRegex = /(\d+\.\d+)/;
	   const reviewCountRegex = /(\d+) reviews/;
	
	   const ratingMatch = inputString.match(ratingRegex);
	   const reviewCountMatch = inputString.match(reviewCountRegex);
	
	   const rating = ratingMatch?.length > 0 ? parseFloat(ratingMatch[0]) : null;
	   const reviewCount = reviewCountMatch?.length > 1 ? parseInt(reviewCountMatch[1]) : null;
	
	   return { rating, reviewCount };
	};
	
	
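	// Each div[data-testid='list-view'] container wraps a single product card in the search results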
	$("div[data-testid='list-view']").each((_, el) => {
	   const link = $(el).prev('a').attr('href');
	   const price = $(el).find('.f2').text();
	   const priceCents = $(el).find('.f2 + span.f6').text();
	   const description = $(el).find("span[data-automation-id='product-title']").text();
	   const reviews = $(el).find('.w_iUH7').last().text();
	   const delivery = $(el).find("div[data-automation-id='fulfillment-badge']").find('span.b').last().text();
	
	   const { rating, reviewCount } = parseProductReview(reviews);
	
	   productList.push({
		   description,
		   price: `$${price}.${priceCents}`,
		   averageRating: rating,
		   totalReviews: reviewCount,
		   delivery,
		   link: `https://www.walmart.com${link}`
	   });
	});
	
	console.log(productList);

The parseProductReview() function extracts the average rating and the total number of reviews customers left on the product.
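For example, assuming a review summary string in the format those regular expressions expect, the function behaves like this:

	parseProductReview('4.5 out of 5 Stars. 132 reviews');
	// → { rating: 4.5, reviewCount: 132 }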

Here is the complete code of the index.js file:

	const axios = require('axios');
	const cheerio = require('cheerio');
	
	const WALMART_PAGE_URL = 'https://walmart.com/search?q=computer+screen';
	const API_URL = 'https://api.scraperapi.com';
	const API_KEY = ''; // <--- Enter your API key here
	
	const parseProductReview = (inputString) => {
	   const ratingRegex = /(\d+\.\d+)/;
	   const reviewCountRegex = /(\d+) reviews/;
	
	   const ratingMatch = inputString.match(ratingRegex);
	   const reviewCountMatch = inputString.match(reviewCountRegex);
	
	   const rating = ratingMatch?.length > 0 ? parseFloat(ratingMatch[0]) : null;
	   const reviewCount = reviewCountMatch?.length > 1 ? parseInt(reviewCountMatch[1]) : null;
	
	   return { rating, reviewCount };
	};
	
	const webScraper = async () => {
	   const queryParams = new URLSearchParams({
		   api_key: API_KEY,
		   url: WALMART_PAGE_URL,
		   render: true,
	   });
	
	   try {
		   const response = await axios.get(`${API_URL}?${queryParams.toString()}`);
	
		   const html = response.data;
	
		   const $ = cheerio.load(html);
		   const productList = [];
	
		   $("div[data-testid='list-view']").each((_, el) => {
			   const link = $(el).prev('a').attr('href');
			   const price = $(el).find('.f2').text();
			   const priceCents = $(el).find('.f2 + span.f6').text();
			   const description = $(el).find("span[data-automation-id='product-title']").text();
			   const reviews = $(el).find('.w_iUH7').last().text();
			   const delivery = $(el).find("div[data-automation-id='fulfillment-badge']").find('span.b').last().text();
	
			   const { rating, reviewCount } = parseProductReview(reviews);
	
			   productList.push({
				   description,
				   price: `$${price}.${priceCents}`,
				   averageRating: rating,
				   totalReviews: reviewCount,
				   delivery,
				   link: `https://www.walmart.com${link}`
			   });
		   });
	
		   console.log(productList);
	   } catch (error) {
		   console.log(error.response?.data || error.message);
	   }
	};
	
	void webScraper();

Run the code with the command node index.js and check out the result:

Full Walmart scraper run test

Wrapping Up – Scraping Walmart with Node.js

Building a web scraper for Walmart can be done in the following steps:

  • Use Axios to send a request to ScraperAPI with the URL of the Walmart page to scrape, and download the rendered HTML content
  • Parse the HTML with Cheerio to extract the data based on DOM selectors
  • Format and transform the data retrieved to suit your needs.

The result is a list of relevant information about the products displayed on the Walmart website.

Here are a few ideas to go further with this Walmart Web scraper:

  • Retrieve data about a product, such as technical characteristics, reviews, etc…
  • Make the web scraper dynamic by allowing you to type the search keyword directly
  • Store the data in a database (RDBMS, JSON files, CSV files, etc…) to build historical data and make business decisions (see the sketch after this list)
  • Use the Async Scraper service to scrape millions of URLs asynchronously
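To illustrate the storage idea, here’s a minimal sketch that writes productList to a JSON file using Node’s built-in fs module (the products.json filename is an arbitrary choice):

	const fs = require('fs');

	// Inside webScraper(), after productList has been filled:
	fs.writeFileSync('products.json', JSON.stringify(productList, null, 2));
	console.log(`Saved ${productList.length} products to products.json`);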

To learn more, check out ScraperAPI’s documentation for Node.js. For easy access, here’s this project’s GitHub repository.

About the author

Eric Cabrel Tiogo

Eric Cabrel Tiogo is a software developer and tech mentor specializing in Java, Node.js, TypeScript, GraphQL, AWS, Docker, and React. He runs a popular backend-focused blog, a developer newsletter, and actively contributes to OSS Cameroon, an open-source community. He's known for his CI/CD expertise with GitHub Actions and GitLab CI. Connect with him on Twitter and explore his work at blog.tericcabrel.com.
