Amazon, the world’s largest e-commerce company, with nearly $514 billion in worldwide net sales revenue in 2022, is a goldmine of valuable data for your business. Imagine gaining insights into Amazon product data, including competitive pricing strategies, detailed product descriptions, and insightful customer reviews.
This tutorial teaches you how to scrape Amazon products using Python. We’ll guide you through the web scraping process with the help of BeautifulSoup and Requests, providing all the code you need to build your own Amazon product data scraper.
Towards the end, you will also learn about ScraperAPI’s Structured Data endpoint, which turns this:
Into this with a simple Amazon API call:
{
  "results":[
    {
      "type":"search_product",
      "position":1,
      "asin":"B0CD6YLMBK",
      "name":"iPhone 15 Charger, [Apple Certified] 20W USB C Wall Charger Block with 6.6FT USB C to C Fast Charging Cord for 15 Pro/15 Pro Max/15 Plus, iPad Pro 12.9/11, iPad 10th Generation, iPad Air 5/4, iPad Min",
      "image":"https://m.media-amazon.com/images/I/51Mp6Zpg7qL.jpg",
      "has_prime":true,
      "is_best_seller":false,
      "is_amazon_choice":false,
      "is_limited_deal":false,
      "stars":4.3,
      "total_reviews":34,
      "url":"https://www.amazon.com/iPhone-Charger-Certified-Charging-Generation/dp/B0CD6YLMBK/ref=sr_1_1?keywords=iphone+15+charger&qid=1697051475&sr=8-1",
      "availability_quantity":null,
      "spec":{},
      "price_string":"$16.99",
      "price_symbol":"$",
      "price":16.99
    },
    # ... truncated ...
  ]
}
So, without wasting any more time, let’s begin!
How to Scrape Amazon Product Data – Project Requirements
This tutorial is based on Python 3.10, but it should work with any Python version from 3.8 onwards. Make sure you have a supported version installed before continuing.
You also need to install Requests and BeautifulSoup:
- The Requests library will let you download Amazon’s search result page.
- BeautifulSoup will let you traverse the DOM and extract the required data.
You can install both of these libraries using PIP:
$ pip install requests beautifulsoup4
Now create a new directory and a Python file to store all of the code for this tutorial:
$ mkdir amazon_scraper
$ touch amazon_scraper/app.py
With the requirements sorted, you are ready to head on to the next step!
Deciding What Amazon Data to Scrape
With every web scraping project, it is important to decide early on exactly what you want to scrape, as this helps you plan your approach. For this particular tutorial, you will learn how to scrape the following attributes of each Amazon product result:
- Name
- Price
- Image
The screenshot below has annotations showing where this information is located on an Amazon search results page:
We’ll take a look at how to extract each of these attributes in the next section. And just to make things a bit spicy, we will sort the results by price in ascending order before scraping them.
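To make the target concrete, each product we scrape will end up as a small Python record. This is just a sketch: the field names are the ones used later in this tutorial, and the values are illustrative.
# The shape of one scraped product record (values shown are illustrative).
product = {
    "title": "iPhone 15 Charger, [Apple Certified] 20W USB C Wall Charger Block...",
    "price": "$16.99",
    "image": "https://m.media-amazon.com/images/I/51Mp6Zpg7qL.jpg",
}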
Fetching Amazon Search Result Page
Let’s start by fetching the Amazon search result page.
This is a typical search page URL: https://www.amazon.com/s?k=iphone+15+charger.
However, if you want to sort the results according to the price, you must use this URL: https://www.amazon.com/s?k=iphone+15+charger&s=price-asc-rank.
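As a side note, you don’t have to hand-build this query string; Requests can assemble it from a params dictionary. Here is a small sketch, where the k and s keys are exactly the query parameters shown in the URLs above (the request may still get blocked until we add headers in the next step):
import requests

# Build https://www.amazon.com/s?k=iphone+15+charger&s=price-asc-rank
# from a params dict instead of hard-coding the query string.
params = {
    "k": "iphone 15 charger",   # search keywords
    "s": "price-asc-rank",      # sort by price: low to high
}
response = requests.get("https://www.amazon.com/s", params=params)
print(response.url)  # the fully encoded URL that was actually requested
For simplicity, the rest of this tutorial sticks with the full URL string.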
You can use the following Requests code to download the HTML:
import requests
url = "https://www.amazon.com/s?k=iphone+15+charger&s=price-asc-rank"
html = requests.get(url)
print(html.text)
However, as soon as you run this code, you will realize that Amazon has put some basic anti-bot measures in place. You will receive the following response from Amazon:
“To discuss automated access to Amazon data please contact api-services-support@amazon.com. —truncated—“
You can bypass this initial anti-bot measure by sending a proper user-agent header as part of the request:
headers = {
    'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.157 Safari/537.36',
    'Accept-Language': 'en-US, en;q=0.5'
}

html = requests.get(url, headers=headers)
If you inspect the html variable, you will see that this time you received the complete search result page as desired.
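If you would rather check programmatically than eyeball the HTML, you can look for Amazon’s automated-access notice quoted above. A quick sketch:
# Rough check: did we get real search results or Amazon's anti-bot notice?
if "api-services-support@amazon.com" in html.text:
    print("Still blocked by Amazon's anti-bot check")
else:
    print(f"Got {len(html.text)} characters of search result HTML")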
While we are at it, let’s load the response in a BeautifulSoup object as well:
from bs4 import BeautifulSoup
soup = BeautifulSoup(html.text, 'html.parser')
Sweet! Now that you have the complete response in a BeautifulSoup object, you can start extracting relevant data from it.
Scraping Amazon Product Attributes
The easiest way to figure out how to scrape the required data is by using the Developer Tools, which are available in all major browsers.
The Developer Tools will let you explore the DOM structure of the page.
The main goal is to figure out which HTML tag attributes can let you uniquely target an HTML tag.
Most of the time, you will rely on the id and class attributes. You can then supply these attributes to BeautifulSoup and ask it to return whatever text or attribute you want to scrape from that particular tag.
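As a tiny illustration of what that looks like in BeautifulSoup (the tag names, id, and class values below are made up for demonstration, not taken from Amazon’s page):
from bs4 import BeautifulSoup

# A made-up HTML snippet, just to illustrate id- and class-based targeting.
snippet = '<div id="main"><span class="price">$9.99</span></div>'
demo_soup = BeautifulSoup(snippet, 'html.parser')

print(demo_soup.find(id='main').name)               # target by id    -> div
print(demo_soup.find('span', class_='price').text)  # target by class -> $9.99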
Web Scraping Amazon Product Name
Let’s take a look at how you can extract the product name. This will give you a good understanding of the general process.
Right-click on the product name and click on Inspect:
This will open up the developer tools:
As you can observe in the screenshot above, the product name is nested in a span with the following classes: a-size-medium a-color-base a-text-normal.
At this point, you have a decision to make: you can either ask BeautifulSoup to extract all the spans with these classes from the page, or you can extract each result div and then loop over those divs and extract the data for each product.
I generally prefer the latter method as it helps identify products that might not have all the required data. This tutorial will also showcase this same method.
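To make that concrete, here is the pattern in miniature. This is a sketch: results is the list of result divs we build in the next step, and select_one() simply returns None when a product is missing the title span.
for r in results:
    title_tag = r.select_one('.a-size-medium.a-color-base.a-text-normal')
    if title_tag is None:
        # This result has no title in the expected place, so flag it and move on
        # instead of letting the whole scrape crash.
        print("Skipping a result without a title")
        continue
    print(title_tag.text)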
Therefore, now you need to identify the div that wraps each result item:
According to the screenshot above, each result is nested in a div tag with a data-component-type attribute of s-search-result.
Let’s use this information to extract all the result divs and then loop over them and extract the nested product titles:
results = soup.find_all('div', attrs={'data-component-type': 's-search-result'})

for r in results:
    print(r.select_one('.a-size-medium.a-color-base.a-text-normal').text)
Here’s a simple breakdown of what this code is doing:
- It uses the find_all() method provided by BeautifulSoup
- It returns all the matching elements from the HTML
- It then uses the select_one() method to extract the first element that matches the CSS selector passed in
Notice that here we prefix each class name with a dot (.). This tells BeautifulSoup that the passed-in CSS selector is a class name. There is also no space between the class names, which is important as it informs BeautifulSoup that all of these classes belong to the same HTML tag.
If you are new to CSS selectors, you should read our CSS selectors cheat sheet, where we go over the basics of CSS selectors and provide an easy-to-use framework to speed up the process.
Web Scraping Amazon Product Price
Now that you have extracted the product name, extracting the product price is fairly straightforward.
Follow the same steps from the last section and use the Developer Tools to inspect the price:
The price can be extracted from the span with the class a-offscreen. This span is itself nested in another span with a class of a-price. You can use this knowledge to craft a CSS selector:
for r in results:
    # -- truc --
    print(r.select_one('.a-price .a-offscreen').text)
As you want to target nested spans this time, you have to add a space between the class names.
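In other words, whether or not there is a space between class names changes the meaning of the selector. Here is the contrast, shown as a sketch over the same results loop using the two selectors from this tutorial:
for r in results:
    # No space: one tag must carry all three classes (the product title span).
    title = r.select_one('.a-size-medium.a-color-base.a-text-normal')
    # Space: a tag with class a-offscreen nested inside a tag with class a-price.
    price = r.select_one('.a-price .a-offscreen')
    print(title.text, price.text)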
Web Scraping Amazon Product Image
Try following the steps from the previous two sections to come up with the relevant code on your own. Here is a screenshot of the image being inspected in the Developer Tools window:
The img tag has a class of s-image. You can target this img tag and extract the src attribute (the image URL) using this code:
for r in results:
    # -- truc --
    print(r.select_one('.s-image').attrs['src'])
Note: Extra points if you do it on your own!
The Complete Amazon Web Scraping Code
You have all the bits and pieces to put together the complete web scraping code.
Here is a slightly modified version of the Amazon scraper that appends all the product results into a list at the very end:
import requests
from bs4 import BeautifulSoup

url = "https://www.amazon.com/s?k=iphone+15+charger&s=price-asc-rank"

headers = {
    'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.157 Safari/537.36',
    'Accept-Language': 'en-US, en;q=0.5'
}

# Download the sorted search result page and parse it
html = requests.get(url, headers=headers)
soup = BeautifulSoup(html.text, 'html.parser')

# Each search result lives in a div with data-component-type="s-search-result"
results = soup.find_all('div', attrs={'data-component-type': 's-search-result'})

result_list = []

for r in results:
    result_dict = {
        "title": r.select_one('.a-size-medium.a-color-base.a-text-normal').text,
        "price": r.select_one('.a-price .a-offscreen').text,
        "image": r.select_one('.s-image').attrs['src']
    }
    result_list.append(result_dict)
You can easily repurpose this result_list to power different APIs or store this information in a spreadsheet for further analysis.
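For example, dumping result_list into a CSV file only takes a few extra lines. Here is a sketch using Python’s built-in csv module; the filename is arbitrary:
import csv

# Write the scraped products to a spreadsheet-friendly CSV file.
with open('amazon_products.csv', 'w', newline='', encoding='utf-8') as f:
    writer = csv.DictWriter(f, fieldnames=['title', 'price', 'image'])
    writer.writeheader()
    writer.writerows(result_list)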
Using Structured Data Endpoint – Amazon API
This tutorial did not focus too much on the gotchas of scraping Amazon product data at scale.
Amazon is notorious for banning scrapers and making it difficult to scrape data from their websites.
You already saw a glimpse of this at the very beginning of the tutorial, where a request without the proper headers was blocked by Amazon.
Luckily, there is an easy solution to this problem. Instead of sending a request directly to Amazon, you can send the request to ScraperAPI’s Structured Data endpoint, and ScraperAPI will respond with the scraped data in nicely formatted JSON.
This is a very powerful feature offered by ScraperAPI: by using it, you do not have to worry about getting blocked by Amazon or about keeping your scraper up to date with the never-ending changes in Amazon’s anti-bot techniques.
The best part is that ScraperAPI provides 5,000 free API credits for a 7-day trial and then a generous free plan with 1,000 recurring API credits to keep you going. This is enough to scrape data for general use.
You can quickly get started by going to the ScraperAPI dashboard page and signing up for a new account:
After signing up, you will see your API Key:
Now you can use the following code to access the search results from Amazon using the Structured Data – Amazon API endpoint:
import requests

payload = {
    'api_key': 'API_KEY',
    'query': 'iphone 15 charger',
    's': 'price-asc-rank'
}

response = requests.get('https://api.scraperapi.com/structured/amazon/search', params=payload)
print(response.json())
Note: Don’t forget to replace API_KEY in the code above with your own ScraperAPI API key.
As you might have already observed, you can pass in most query params that Amazon accepts as part of the payload. This means all of the following sorting values are valid for the s key in the payload:
- Price: High to low = price-desc-rank
- Price: Low to high = price-asc-rank
- Featured = relevanceblender
- Avg. customer review = review-rank
- Newest arrivals = date-desc-rank
If you want the same result_list data as from the last section, you can add the following code at the end:
result_list = []

for r in response.json()['results']:
    result_dict = {
        "title": r['name'],
        "price": r['price_string'],
        "image": r['image']
    }
    result_list.append(result_dict)
You can learn more about this endpoint over at the ScraperAPI docs.
How to Turn Amazon Pages into LLM-Ready Data
You’ve already learned to scrape Amazon with BeautifulSoup and use ScraperAPI’s structured endpoint for clean JSON. Now, let’s take it a step further with a method especially useful when working with large language models like Google Gemini.
Instead of writing custom logic to extract insights, you can ask an LLM to do it for you.
By using ScraperAPI’s output_format=markdown, you can get a clean, readable version of the product listings page, free from messy HTML and JavaScript. This format is perfect for summarising product features, comparing prices, or analysing customer language using a model like Gemini.
Step 1: Obtain and Secure Your API Keys
To get started, ensure you have a ScraperAPI key and a Google Gemini API key. If you already have them, you can skip to the next step. You can get your ScraperAPI key from your dashboard or sign up here for a free one with 5,000 API credits.
To get a Gemini API key:
- Go to Google AI Studio
- Sign in with your Google account
- Click on Create API Key
- Follow the prompts to generate your key
Step 2: Scrape Amazon as Markdown
Next, use ScraperAPI to fetch the Amazon product listings in Markdown format. This gives you a cleaner, text-based version of the page that’s easier to work with than raw HTML, with no parsing required.
import requests
API_KEY = "YOUR_SCRAPERAPI_KEY"
url = "https://www.amazon.com/s?k=iphone+15+charger"
payload = {
    "api_key": API_KEY,
    "url": url,
    "output_format": "markdown"
}
response = requests.get("http://api.scraperapi.com", params=payload)
markdown_data = response.text
print(markdown_data)
Here’s what the response looks like:
[](/iPhone-Charger-Charging-Cable-Samsung/dp/B0CHWBW94G/ref=sr%5F1%5F10?dib=eyJ2IjoiMSJ9.M%5F9JUi5cAN2CnTLsdl9B2YPF%5F8ablwAlRDPFz190Xkk4Bl2becZrrCLRhsvb6m0DC1ffy9bkZ6ynBUGGPEk6qqygifxzeaK70H6WiXqmHUECZpO7yyVoRs6O1PSQ-IZLoZ0nRqBKwUMKVIYwxOwDEbLcGI61NsYGxeXwpb7g5jCGNB99z6%5FeGlQJv1t%5FtapHhXGpEbxm9a0-5UMhISgHmq56hTWyLWcDiODs4-Iit5U.FnDL-QNPo4A1yVOWkrYVXOvJTFp0DKrbqWRqZXp3d5s&dib%5Ftag=se&keywords=iphone+15+charger&qid=1750271461&sr=8-10)
[iPhone 15 16 Charger Fast Charging, 10 FT Long USB C Charger Cord with 20W Type C Phone Fast Charging Block for iPhone 16/16 Pro/16 Pro max/16 Plus, iPhone 15/15 pro/15 pro max/15 Plus,iPad,Android](/iPhone-Charger-Charging-Cable-Samsung/dp/B0CHWBW94G/ref=sr%5F1%5F10?dib=eyJ2IjoiMSJ9.M%5F9JUi5cAN2CnTLsdl9B2YPF%5F8ablwAlRDPFz190Xkk4Bl2becZrrCLRhsvb6m0DC1ffy9bkZ6ynBUGGPEk6qqygifxzeaK70H6WiXqmHUECZpO7yyVoRs6O1PSQ-IZLoZ0nRqBKwUMKVIYwxOwDEbLcGI61NsYGxeXwpb7g5jCGNB99z6%5FeGlQJv1t%5FtapHhXGpEbxm9a0-5UMhISgHmq56hTWyLWcDiODs4-Iit5U.FnDL-QNPo4A1yVOWkrYVXOvJTFp0DKrbqWRqZXp3d5s&dib%5Ftag=se&keywords=iphone+15+charger&qid=1750271461&sr=8-10)
[_4.6 out of 5 stars_](javascript:void%280%29)
[6,514 ](/iPhone-Charger-Charging-Cable-Samsung/dp/B0CHWBW94G/ref=sr%5F1%5F10?dib=eyJ2IjoiMSJ9.M%5F9JUi5cAN2CnTLsdl9B2YPF%5F8ablwAlRDPFz190Xkk4Bl2becZrrCLRhsvb6m0DC1ffy9bkZ6ynBUGGPEk6qqygifxzeaK70H6WiXqmHUECZpO7yyVoRs6O1PSQ-IZLoZ0nRqBKwUMKVIYwxOwDEbLcGI61NsYGxeXwpb7g5jCGNB99z6%5FeGlQJv1t%5FtapHhXGpEbxm9a0-5UMhISgHmq56hTWyLWcDiODs4-Iit5U.FnDL-QNPo4A1yVOWkrYVXOvJTFp0DKrbqWRqZXp3d5s&dib%5Ftag=se&keywords=iphone+15+charger&qid=1750271461&sr=8-10#customerReviews)
8K+ bought in past month
[TRUNCATED]
This will return the entire search results page as Markdown, including the product names, prices, brief descriptions, and sometimes even snippets from reviews, all in clean text that an LLM can easily understand.
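One practical note: a search page can produce a lot of Markdown, so if you want to keep the prompt you send to the model small, you can trim markdown_data before using it. A sketch, where the cutoff is arbitrary:
# Keep the prompt size manageable; 60,000 characters is an arbitrary cutoff.
MAX_CHARS = 60_000
if len(markdown_data) > MAX_CHARS:
    markdown_data = markdown_data[:MAX_CHARS]

print(f"Passing {len(markdown_data)} characters of Markdown to the model")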
Step 3: Summarize the Amazon Page with Gemini
Now that you have the Markdown, you can feed it to Google Gemini and extract a clean summary.
Start by installing the Gemini SDK if you haven’t already:
pip install google-generativeai
Instead of writing separate parsing logic, you can feed the Markdown into Gemini and let it return whatever insights you need: top-rated products, deals under $20, comparisons by feature, or even customer sentiment analysis.
import google.generativeai as genai
genai.configure(api_key="YOUR_GEMINI_API_KEY")
model = genai.GenerativeModel(model_name="gemini-2.0-flash")
prompt = f"""
You are a shopping assistant. Based on the following product listings from Amazon, summarise the best 5 chargers under $20.
For each product, include:
- Product name
- Price
- Notable features
- Why it's a good buy
Here is the data:
{markdown_data}
"""
response = model.generate_content(prompt)
print(response.text)
You should receive a response from Gemini similar to this:
Okay, here are 5 of the best iPhone 15 chargers I can recommend based on the Amazon search results you provided, focusing on options under $20.
**Important Note:** Prices and availability on Amazon can change rapidly. Please verify current pricing before making any purchasing decisions.
**Top 5 iPhone 15 Chargers Under $20 (Based on Provided Data)**
1. **iPhone 16 15 Charger Fast Charging,20W Apple iPad USB C Fast Charger,2Pack 6ft USB C Wall Charger Block for iPhone 16/16 Pro/16 Pro Max/15/15 Plus/15 Pro/15 Pro Max, iPad Pro/Mini, MacBook**
* **Price:** $9.99
* **Notable Features:** 2-Pack, 20W USB-C, 6ft USB-C cables included, Apple MFi Certified (assumed, but good to confirm on the product page). Amazon's Choice.
* **Why it's a good buy:** Great value for a 2-pack. 20W is a good fast-charging speed for iPhones and iPads. Includes cables. Being an "Amazon's Choice" item suggests good quality and customer satisfaction.
2. **iPhone 16 15 Charger Fast Charging Type C Chargers USB C Charger Block iPhone 16 Chargers with 2 Pack 6FT Cable for iPhone 16/16 Plus/16 Pro/16 Pro Max/iPhone 15/15 Pro Max/iPad Pro/AirPods**
* **Price:** $9.90
* **Notable Features:** 2-Pack, Includes two 6ft cables, USB-C, supports fast charging.
* **Why it's a good buy:** Another excellent value 2-pack, includes cables. If you need multiple chargers for home and travel or for multiple devices, this is a solid option. Good customer ratings.
3. **iPhone 16 15 Charger Fast Charging Type C Chargers USB C Charger Block iPhone 16 Chargers with 2 Pack 6FT Cable for iPhone 16/16 Pro/16 Pro Max/15/15 Plus/15 Pro/15 Pro Max/iPad Pro/AirPods**
* **Price:** $6.99 ($3.50/count)
* **Notable Features:** 2-pack, sustainability features (Carbonfree Certified), 6ft cables.
* **Why it's a good buy:** This one is a bargain. A 2-pack of chargers *with cables* for under $7 is hard to beat. The sustainability certification is a bonus.
4. **iPhone 16 15 Charger Fast Charging 10Ft - 2 Pack 20W USB C Wall Charger Block and Type C to C Cable Compatible with iPhone 16/16 Pro/16 Pro Max/iPhone 15, iPad Pro, Air 5/4, iPad 10/Mini 6**
* **Price:** $7.52 ($3.76/count)
* **Notable Features:** 2-pack, 10ft USB-C to USB-C cables included, 20W. Limited time deal.
* **Why it's a good buy:** 10ft cables are very convenient. This is a great price for a 2-pack with long, USB-C to USB-C cables, which are necessary for iPhone 15 models.
5. **USB C Charger Cable Fast Charging 3 Pack 6ft for iPhone 16 15 Pro Max/15 Plus/15 Pro/15, 3.1A Braided Car Cord USB A to USB C Android Phone Power Cord for Samsung Galaxy S24 A54 A14 S23 Ultra S22 S21**
* **Price:** $8.99 ($3.00/count)
* **Notable Features:** 3-pack of cables, 3.1A Braided cable, USB-A to USB-C.
* **Why it's a good buy:** While it doesn't include a wall adapter, this is a great deal to stock up on high-quality, braided USB-A to USB-C cables. Useful for older USB ports.
* Note that you will need to also purchase an adapter.
**Important Considerations Before Buying:**
* **Connector Type:** The iPhone 15 series uses USB-C. Make sure the charger *and* cable have USB-C connectors to be compatible. USB-A to USB-C adapters are also available if you want to use the USB-A cables.
* **"iPhone 16/15" Listings:** These products list "iPhone 16" compatibility, but the iPhone 16 isn't out yet. This usually means the chargers are simply designed to meet current iPhone power demands.
* **MFi Certification:** Apple MFi ("Made For iPhone/iPad/iPod") certification indicates the charger and/or cable have been tested and certified by the manufacturer to meet Apple's standards. While not strictly *required*, it can be a good indicator of quality and safety. Check the product description.
* **Sustainability certification** Purchasing climate pledge friendly items will help with environmental conservation
I hope this summary helps you find a great iPhone 15 charger within your budget!
With one ScraperAPI call and a Gemini prompt, you get a fast, flexible way to extract meaningful information from Amazon. You don’t need to write parsing logic or clean up HTML manually; you get clean, structured summaries you can use right away.
You can also tweak your prompt to explore other use cases, like:
- Identifying trending products
- Filtering by features (e.g., “fast charging”, “braided cable”)
- Comparing brands
- Analysing tone in customer reviews
It’s a faster, smarter way to turn raw product listings into insights.
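For instance, filtering by a feature only requires a different prompt, not different scraping code. Here is a sketch that reuses the model and markdown_data variables from the code above; the wording of the prompt is just an example:
# Same Markdown, different question: only chargers that advertise fast charging.
feature_prompt = f"""
From the Amazon product listings below, list every charger that mentions
fast charging. For each one, give the product name and its price.

Here is the data:
{markdown_data}
"""

response = model.generate_content(feature_prompt)
print(response.text)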
Scrape Amazon Products with ScraperAPI’s Amazon API
This tutorial was a quick rundown of how to scrape Amazon data with Python and Beautiful Soup.
- It taught you a simple bypass method for the bot detection system used by Amazon.
- It showed you how to use the various methods provided by BeautifulSoup to extract the required data from the HTML document.
- Lastly, you learned about the Structured Data endpoint offered by ScraperAPI and how it solves quite a few problems.
If you are ready to take your data collection from a couple of pages to thousands or even millions of pages, our Business Plan is a great place to start.
Need 10M+ API credits? Contact sales for a custom plan, including all premium features, premium support, and an account manager.
Until next time, happy scraping!