How to Scrape AliExpress Product Listings [Guide and Tips]

Tutorial on how to scrape AliExpress product data

AliExpress offers over 100 million products across 10,000+ product categories, making it one of the biggest ecommerce marketplaces in the world and an invaluable data source for companies and analysts growing an online business.

In this article, you’ll learn how to:

  • Extract data from AliExpress using Python and BeautifulSoup
  • Export the scraped data as a CSV file
  • Use ScraperAPI to bypass AliExpress anti-scraping mechanisms

Ready? Let’s get started!

TL;DR: Full AliExpress Scraper

Here’s the completed code for those in a hurry:


	from bs4 import BeautifulSoup
	import requests
	import csv
	
	# Specify the product URL
	product_url = "https://www.aliexpress.com/item/1005005095908799.html"
	
	# Set up headers and payload for Scraper API
	headers = {
		'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36',
		'accept-language': 'en-US,en;q=0.9,ur;q=0.8,zh-CN;q=0.7,zh;q=0.6',
	}
	payload = {'api_key': 'YOUR_API_KEY', 'url': product_url, 'render': 'true', 'keep_headers': 'true'}
	
	# Make a request to Scraper API
	response = requests.get('https://api.scraperapi.com', params=payload, timeout=60, headers=headers)
	
	# Parse HTML content with BeautifulSoup
	soup = BeautifulSoup(response.content, 'html.parser')
	result = soup.find('div', attrs={'class': "pdp-info-right"})
	
	# Extract and print product information
	if result:
		# GET PRODUCT NAME
		product_name = result.find('h1', attrs={'data-pl': "product-title"}).text
	
		# GET PRODUCT PRICE
		price_div = result.find('div', attrs={'class': "price--current--H7sGzqb product-price-current"})
		price_spans = price_div.find_all('span')
		price = ''.join(span.text for span in price_spans[1:])
	
		# GET PRODUCT RATING AND NUMBER OF REVIEWS
		rating_div = result.find('div', attrs={'data-pl': "product-reviewer"})
	
		rating = rating_div.find('strong').text
		reviews = rating_div.find('a').text
	
		# GET ORDER INFORMATION
		orders_spans = result.find_all('span')
		orders = ''.join(span.text for span in orders_spans)
	
		# Open CSV File and Write Header
		with open('aliexpress_results.csv', 'w', newline='', encoding='utf-8') as csvfile:
			csv_writer = csv.writer(csvfile)
			csv_writer.writerow(['Product Name', 'Price', 'Rating', 'Reviews', 'Amount Sold'])
	
			# Write data to CSV file
			csv_writer.writerow([product_name, price, rating, reviews, orders])
	
			print('Scraping and CSV creation successful!')
	else:
		print("No product information found!")

Before running the code, add your API key to the 'api_key' parameter within the payload.

Note: Don’t have an API key? Create a free ScraperAPI account to get 5,000 API credits to try all our tools for 7 days.
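
If you'd rather not hardcode the key in your script, one common option (not part of the original snippet) is to read it from an environment variable. Here's a minimal sketch, where the SCRAPERAPI_KEY variable name is just an example:

	import os
	
	# Read the API key from an environment variable instead of hardcoding it.
	# The variable name SCRAPERAPI_KEY is just an example; set it with
	# `export SCRAPERAPI_KEY=...` (macOS/Linux) or `set SCRAPERAPI_KEY=...` (Windows).
	api_key = os.environ.get('SCRAPERAPI_KEY')
	if not api_key:
		raise SystemExit('Please set the SCRAPERAPI_KEY environment variable first')
	
	product_url = "https://www.aliexpress.com/item/1005005095908799.html"
	payload = {'api_key': api_key, 'url': product_url, 'render': 'true', 'keep_headers': 'true'}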

Want to see how we built it? Keep reading and join us on this exciting scraping journey!

Requirements

To get started with web scraping on AliExpress using Python, you’ll need a few tools and libraries. Follow these steps to set up your environment:

  1. Python Installation: Make sure you have Python installed, preferably version 3.8 or later.
  2. Library Installations: Open your terminal or command prompt and run the following command to install the necessary libraries:


	pip install requests beautifulsoup4

Libraries Overview

  • Requests: The requests library is our communication layer. It sends our requests to ScraperAPI, which in turn fetches the product page from AliExpress on our behalf, acting as the bridge between our script and the data source.
  • BS4 (BeautifulSoup): BeautifulSoup is the parser we use to navigate the page’s HTML (and XML) and pull out exactly the elements we need, so we don’t have to dig through raw markup ourselves. You can see both libraries at work in the short sketch after this list.
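
As a quick illustration, here’s a tiny, self-contained sketch of BeautifulSoup picking a value out of a hard-coded HTML snippet (no network request involved); in the actual scraper, Requests supplies this HTML by fetching it through ScraperAPI:

	from bs4 import BeautifulSoup
	
	# A tiny HTML snippet standing in for a real AliExpress product page
	html = '<div class="pdp-info-right"><h1 data-pl="product-title">Wireless Earbuds</h1></div>'
	
	soup = BeautifulSoup(html, 'html.parser')
	title = soup.find('h1', attrs={'data-pl': 'product-title'})
	print(title.text)  # Wireless Earbuds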

Project Structure

  • Create a new directory and Python file: Open your terminal or command prompt and run the following commands:

	$ mkdir aliexpress_scraper
	$ touch aliexpress_scraper/app.py

Now that you’ve set up your environment and project structure, you’re ready to proceed with the next part of the tutorial.

Understanding AliExpress Site Structure

Alright, let’s break down how AliExpress’s website is set up!

Think of this as navigating a big online mall, and we want to figure out how to grab important details. In this article, our focus is on scraping the product listing page of a particular product.

In the image below, we’ve highlighted the key components of the products we’ll be extracting:

Product elements highlighted in red

Using the browser’s developer tools, we can inspect each of these elements and note the specific tags, classes, and attributes we’ll use to target them.

This div is the main product container holding all the product information: .pdp-info-right

CSS selector for AliExpress product details container

This h1 tag, marked with the data-pl="product-title" attribute, contains the product name:

CSS selector for AliExpress product title

This div tag contains the price information: .price--current--H7sGzqb product-price-current

CSS selector for AliExpress product price

Ratings within reviews offer an additional dimension for evaluating product quality and customer sentiment.

This div, marked with the data-pl="product-reviewer" attribute, contains the reviews and ratings information:

CSS selector for AliExpress product reviews

Having a clear understanding of the AliExpress site structure will streamline our data scraping process by helping us identify the elements of interest and the paths to get there.
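
If you’d like to sanity-check these selectors before building the full scraper, one option (an extra step, not part of the original workflow) is to save the rendered product page from your browser and probe it locally with BeautifulSoup’s CSS selector support; the filename below is just an example:

	from bs4 import BeautifulSoup
	
	# Load a locally saved copy of the product page ("Save Page As..." in your browser)
	with open('aliexpress_product.html', encoding='utf-8') as f:
		soup = BeautifulSoup(f, 'html.parser')
	
	# CSS selectors mirroring the elements identified above
	container = soup.select_one('div.pdp-info-right')             # main product container
	title = soup.select_one('h1[data-pl="product-title"]')        # product name
	price = soup.select_one('div.product-price-current')          # current price block
	reviews = soup.select_one('div[data-pl="product-reviewer"]')  # rating and review count
	
	for label, element in [('Title', title), ('Price', price), ('Reviews', reviews)]:
		print(label, '->', element.get_text(strip=True) if element else 'not found')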

Now that we have a plan of action, let’s start scraping!

Scraping Target AliExpress Product Data

For this tutorial, we’ll focus on a single product page, but you can build a list of product URLs and run the same logic over each one to collect all the major elements automatically.
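
Here’s a rough sketch of what that loop could look like; the scrape_product() helper is hypothetical and simply stands in for Steps 2 through 4 below, returning one row of product data per URL:

	import csv
	
	# Hypothetical helper wrapping Steps 2-4 below; returns
	# [product_name, price, rating, reviews, orders] or None
	def scrape_product(url):
		...
	
	product_urls = [
		"https://www.aliexpress.com/item/1005005095908799.html",
		# add more product URLs here
	]
	
	with open('aliexpress_results.csv', 'w', newline='', encoding='utf-8') as csvfile:
		writer = csv.writer(csvfile)
		writer.writerow(['Product Name', 'Price', 'Rating', 'Reviews', 'Amount Sold'])
		for url in product_urls:
			row = scrape_product(url)
			if row:
				writer.writerow(row)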

Step 1: Setup Your Project

First, we need to import the libraries we plan to use in our scraper and specify the product URL.


	from bs4 import BeautifulSoup
	import csv
	import requests
	product_url = "https://www.aliexpress.com/item/1005005095908799.html"

Step 2: Send Your Request Through Scraper API

To be able to scrape AliExpress, we’ll need to bypass its anti-bot detection. However, building the infrastructure to prevent getting blocked is both time-consuming and expensive.

Instead, we’ll send our requests through ScraperAPI and let it handle IP rotation, CAPTCHA handling, JavaScript rendering, etc.

Here’s what we’ll do:

  • Set up custom headers to identify our request to the server
  • Write a payload that includes our API key and the specific product URL we want data from, enables JS rendering, and tells the API to use our headers

	headers = {
		'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36',
		'accept-language': 'en-US,en;q=0.9,ur;q=0.8,zh-CN;q=0.7,zh;q=0.6',
	}
	payload = {'api_key': 'YOUR_API_KEY', 'url': product_url, 'render': 'true', 'keep_headers': 'true'}

💡 Important

AliExpress injects the product information through AJAX, so the page needs to render before we can see the information.

Of course, browsers don’t have any issue with this, but our script will. To access the dynamic data from the page, we tell ScraperAPI to first render the page before returning the data.

Otherwise, we’d just get a blank page.

After that, we send our prepared instructions and details (headers and payload) to the API.

By making this request, we’re essentially asking the API to act on our behalf and fetch the information we’re interested in from the specified product URL.


	response = requests.get('https://api.scraperapi.com', params=payload, timeout=60, headers=headers)

Step 3: Parse the Raw HTML with BeautifulSoup

Next, we create a BeautifulSoup object to parse the HTML content returned by our request using ‘html.parser’, then pick out the main product container we identified earlier, which holds all the product information we need.


	soup = BeautifulSoup(response.content, 'html.parser')
	result = soup.find('div', attrs={'class': "pdp-info-right"})

Step 4: Extract Product Information

Using the specific tags and classes we identified earlier, we’ll grab the following details:

  • Name
  • Price
  • Rating
  • Number of reviews
  • Order information

Also, let’s use an if statement to prevent errors if no product information was retrieved.


	if result:
		# GET PRODUCT NAME
		product_name = result.find('h1', attrs={'data-pl': "product-title"}).text
	
		# GET PRODUCT PRICE
		price_div = result.find('div', attrs={'class': "price--current--H7sGzqb product-price-current"})
		price_spans = price_div.find_all('span')
		price = ''.join(span.text for span in price_spans[1:])
	
		# GET PRODUCT RATING AND NUMBER OF REVIEWS
		rating_div = result.find('div', attrs={'data-pl': "product-reviewer"})
	
		rating = rating_div.find('strong').text
		reviews = rating_div.find('a').text
	
		# GET ORDER INFORMATION
		orders_spans = result.find_all('span')
		orders = ''.join(span.text for span in orders_spans)

Step 5: Write the Data to a CSV File

Awesome, we got the data! However, printing all this information to the terminal isn’t very useful, right? To finish our project, let’s store it in a CSV file.

Open a CSV file (‘aliexpress_results.csv’) in write mode, initialize a CSV writer, and write the header row to it.

It’s our way of organizing the information neatly before we start filling in the details.


	with open('aliexpress_results.csv', 'w', newline='', encoding='utf-8') as csvfile:
		csv_writer = csv.writer(csvfile)
		csv_writer.writerow(['Product Name', 'Price', 'Rating', 'Reviews', 'Amount Sold'])

Storing the details we’ve gathered for each product is like noting down key info in a table.

Each row represents one product, and each column holds a specific piece of information. This structured setup makes it easy to read and analyze later on.

Once all is done, we print a success message, signaling the completion of the task.


		csv_writer.writerow([product_name, price, rating, reviews, orders])
	
		print('Scraping and CSV creation successful!')

Error Handling

To bring it full circle, we complete our if statement with an else branch, which simply lets us know that no product information could be retrieved.


	else:
		print("No product information found!")

Wrapping Up

Congratulations, you just scraped AliExpress!

In this journey, we’ve unveiled the essential steps and tools to gather valuable insights from the vast online marketplace by:

  • Building an AliExpress scraper using Python and BeautifulSoup
  • Using ScraperAPI to bypass bot detection and render the product page
  • Exporting the data into a CSV file for later analysis

Scraping AliExpress can serve various purposes, providing essential data for decision-making in the online retail sector. You can use the scraped AliExpress product data for:

  • Competitor analysis
  • Tracking product prices
  • Monitoring product reviews
  • Managing product inventory
  • Obtaining product images and descriptions

And much more.

If you have any questions, please contact our support team; we’re eager to help. You can also check our documentation to learn the ins and outs of ScraperAPI.

Until next time, happy scraping!


Frequently Asked Questions

Scraping AliExpress offers loads of data that can be invaluable for making informed decisions in the online retail landscape. From tracking competitor prices to monitoring customer reviews, scraping AliExpress Product Data provides a comprehensive view of the market.

Learn more about scraping ecommerce sites.

While AliExpress doesn’t explicitly ban web scraping, it does use anti-scraping mechanisms to block scrapers and bots. Using tools like ScraperAPI can help navigate potential obstacles.

Learn how to choose the right scraping tool.

AliExpress product data can be a goldmine for various applications. You can perform competitor analysis, monitor product prices, track product reviews for sentiment analysis, manage inventory, and even extract product images and specifications for a holistic understanding of the market.

Learn how to turn ecommerce data into a competitive advantage.

About the author

Ize Majebi

Ize Majebi is a Python developer and data enthusiast who delights in unraveling code intricacies and exploring the depths of the data world. She transforms technical challenges into creative solutions, possessing a passion for problem-solving and a talent for making the complex feel like a friendly chat. Her ability brings a touch of simplicity to the realms of Python and data.
