How to Scrape Amazon Product Data: Prices, Search Page, Etc.

Zoltan Bettenbuk
May, 2021

With over 350 million products listed across over 50 countries, scraping Amazon puts a vast sea of product data in your hands that’ll put your business ahead of the competition.

However, bypassing Amazon anti-scraping mechanisms isn’t a simple task.

In this guide, we’ll show you how to build an Amazon scraper capable of extracting product details, pricing, and reviews for any product or keyword you need without getting blocked.

Collect Structured Amazon Data in Seconds

ScraperAPI allows you to extract data from Amazon search results and product pages in JSON format with a simple API call.

The Benefits of Scraping Amazon

Web scraping Amazon data helps you concentrate on competitor price research, real-time cost monitoring and seasonal shifts in order to provide consumers with better product offers.

Web scraping allows you to extract relevant data from the Amazon website and save it in a spreadsheet or JSON automatically, updating the data on a regular basis (for example, weekly or monthly) with near zero manual labor.

If you need to export product data from Amazon to a spreadsheet or any other storage, we’ve got you covered!

Whether for competitor testing, comparison shopping, creating an API for your app project, or any other business need, ScraperAPI helps you get the data fast and consistently.

Here are some other specific benefits of using an Amazon web scraper:

Utilize details from product search results to improve your Amazon SEO status or Amazon marketing campaigns
Compare and contrast your offering with that of your competitors
Use review data for review management and product optimization for retailers or manufacturers
Discover the products that are trending and look up the top-selling product lists for a category

Scraping Amazon has become a business of its own, with a large number of companies offering product, price, analysis, and other types of monitoring solutions specifically for Amazon.

Attempting to scrape Amazon data on a wide scale, however, is a difficult process that often gets blocked by their anti-scraping technology.

The good news is that if you follow this step-by-step guide – no matter if you’re a beginner – you’ll be able to scrape Amazon in no time!

Get Amazon Data with Low-Code

With DataPipeline, you can monitor and scrape Amazon products without writing a single line of code using a visual editor.

How to Approach Amazon Scraping

The first method for scraping data from Amazon is to crawl each keyword’s category or shelf list, then request the product page for each one before moving on to the next. This is best for smaller scale, less-repetitive scraping.

The second option is to create a database of products you want to track by having a list of products or ASINs (unique product identifiers), then have your Amazon web scraper scrape each of these individual pages every day/week/etc. This is the most common method among scrapers who track products for themselves or as a service.

Scraping Amazon Product Data with Scrapy

For this tutorial, we’ll build on the second method: our script first runs an Amazon search to discover ASINs, then collects the product data for each ASIN listed on the page.

Also, we’ll be using ScraperAPI to help us bypass Amazon’s anti-bot measures, allowing us to scale our project without having to build a complete scraping infrastructure from scratch and for a fraction of the cost of using residential proxies.

Step 1: Setup Your Project

To build our scraper, we’ll be using Python’s Scrapy, a web crawling and data extraction framework that can be used for a variety of applications such as data mining, information retrieval, and historical archiving.

If you haven’t already, install Python, then use the following pip command to install Scrapy:

pip install scrapy

Then go to the folder where you want your project to live and run the startproject command along with the project name, amazon_scraper.

Scrapy will construct a web scraping project folder for you, with everything already set up:

scrapy startproject amazon_scraper

The result should look like this:

├── scrapy.cfg # deploy configuration file
└── amazon_scraper # project's Python module, you'll import your code from here
	├── __init__.py
	├── items.py # project items definition file
	├── middlewares.py # project middlewares file
	├── pipelines.py # project pipeline file
	├── settings.py # project settings file
	└── spiders # a directory where spiders are located
		├── __init__.py
		└── amazon.py # spider file, generated in Step 2

Scrapy creates all of the files you’ll need, and each file serves a particular purpose:

items.py – Can be used to build your base dictionary, which you can then import into the spider.
settings.py – All of your request settings, pipeline, and middleware activation happens in settings.py. You can adjust the delays, concurrency, and several other parameters here.
pipelines.py – The item yielded by the spider is passed to pipelines.py, which is mainly used to clean the text and connect to storage (Excel, SQL, etc.).
middlewares.py – Comes in handy when you want to change how requests are made and how Scrapy handles responses.

Step 2: Create an Amazon Spider

You’ve established the project’s overall structure, so now you’re ready to start working on the spiders that will do the scraping.

Scrapy has several spider types, but we’ll focus on the most popular one: the generic spider.

Simply run the genspider command to make a new spider:

# syntax is --> scrapy genspider name_of_spider website.com 
scrapy genspider amazon amazon.com

Scrapy now creates a new file called amazon.py in the spiders folder, prefilled with a spider template.

Your code should look like the following:

import scrapy
class AmazonSpider(scrapy.Spider):
	name = 'amazon'
	allowed_domains = ['amazon.com']
	start_urls = ['http://www.amazon.com/']
	def parse(self, response):
		pass
<p>

Delete the default code (allowed_domains, start_urls, and the parse() function) and replace it with your own, which should include these four functions:

start_requests – sends an Amazon search query with a specific keyword.
parse_keyword_response – extracts the ASIN value for each product returned in an Amazon keyword query, then sends a new request to Amazon for the product listing. It will also go to the next page and do the same thing.
parse_product_page – extracts all of the desired data from the product page.
get_url – wraps a URL so the request is sent through ScraperAPI, which will return an HTML response.

Step 3: Send a Search Query to Amazon

With an Amazon spider and ScraperAPI as the proxy solution, you can now scrape Amazon for a particular keyword using the following steps. This will allow you to extract each product’s ASIN and scrape all of the key details from its product page.

All pages returned by the keyword query will be parsed by the spider. Try using these fields for the spider to scrape from the Amazon product page:

ASIN
Product name
Price
Product description
Image URL
Available sizes and colors
Customer ratings
Number of reviews
Seller ranking

Important Update:

Using our Amazon Search structured data endpoint (SDE), you can now collect all mentioned product details like ASIN, name, price, images, descriptions, and more in JSON format from Amazon search results with a simple API call.

Just send your requests to our endpoint, alongside your API key and the query you want to collect data from.

Here’s a code snippet to get you started:

import requests

payload = {
   'api_key': 'APIKEY',
   'query': 'QUERY',
   'country': 'COUNTRY',
   'tld': 'TLD'
}

r = requests.get('https://api.scraperapi.com/structured/amazon/search', params=payload)
print(r.text)

Don’t have an API key? Create a free ScraperAPI account to start your 7-day free trial with 5,000 API credits and all our tools available.

The first step is to create a start_requests() function to send our Amazon search requests containing our keywords.

Outside of the AmazonSpider class, define a list variable holding your search keywords. Input the keywords you want to search for on Amazon into your script:

</p>
queries = ['tshirt for men', 'tshirt for women']

Inside AmazonSpider, you can build your start_requests() method, which will submit the requests to Amazon. To access Amazon’s search feature via a URL, pass the search query in the k parameter (k=SEARCH_KEYWORD):

https://www.amazon.com/s?k=<SEARCH_KEYWORD>

It looks like this when we use it in the start_requests() function:

## amazon.py
import scrapy
from urllib.parse import urlencode

queries = ['tshirt for men', 'tshirt for women']

class AmazonSpider(scrapy.Spider):
	name = 'amazon'

	def start_requests(self):
		for query in queries:
			# urlencode makes the keyword safe to use in a query string
			url = 'https://www.amazon.com/s?' + urlencode({'k': query})
			yield scrapy.Request(url=url, callback=self.parse_keyword_response)
<p>

You urlencode each query in your queries list so that it is safe to use as a query string in a URL, and then use scrapy.Request to request that URL.
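To make the encoding step concrete, here is a minimal sketch that uses only Python's standard library. The `build_search_url` helper name is an assumption made for illustration; the base URL and the `k` parameter come from the search URL shown earlier.

```python
from urllib.parse import urlencode

BASE_SEARCH_URL = "https://www.amazon.com/s?"


def build_search_url(keyword):
    """Return an Amazon search URL with the keyword safely encoded (hypothetical helper)."""
    # urlencode turns spaces into '+' and percent-encodes reserved characters such as '&'
    return BASE_SEARCH_URL + urlencode({"k": keyword})


if __name__ == "__main__":
    for query in ["tshirt for men", "tshirt & shorts"]:
        print(build_search_url(query))
```

Without the encoding step, a keyword containing `&` would be read as the start of a second query parameter rather than as part of the search term.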

Use yield instead of return: Scrapy is asynchronous, and a callback written as a generator can yield any number of new requests or completed dictionaries.

If a request is yielded, Scrapy schedules it and invokes its callback method once the response arrives. If an item is yielded, it will be sent to the data cleaning pipeline.

The parse_keyword_response callback function will then extract the ASIN for each product when scrapy.Request activates it.

Step 4: Scrape Amazon Products Using ASINs

One of the most popular methods to scrape Amazon includes extracting data from a product listing page.

Important Update

You can now use ScraperAPI’s Amazon Product SDE to scrape any product listing using a list of ASINs.

Just send your request alongside your API key and ASIN (or list of ASINs) and get detailed information like name, price, reviews, rating, and more.

Here’s a code snippet to get you started:

import requests

payload = {
   'api_key': 'APIKEY',
   'asin': 'ASIN',
   'country': 'COUNTRY',
   'tld': 'TLD'
}

r = requests.get('https://api.scraperapi.com/structured/amazon/product', params=payload)
print(r.text)

Want to learn more? Check our tutorial on scraping Amazon ASINs at scale.

Using a product’s ASIN is the simplest and most common way to retrieve this data, as every product on Amazon has a unique ASIN assigned.

We may use this ID in our URLs to get the product page for any Amazon product, such as the following:

</p>
https://www.amazon.com/dp/<ASIN>
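If you already maintain your own list of ASINs (the second approach described earlier), the same URL pattern can be applied directly. The sketch below is illustrative only: the `product_urls` helper is hypothetical, and the 10-character alphanumeric check is an assumption about the ASIN format rather than an official rule.

```python
import re

# Assumption: ASINs are 10 uppercase alphanumeric characters (e.g. B087Z6SNC1)
ASIN_PATTERN = re.compile(r"[A-Z0-9]{10}")


def product_urls(asins):
    """Yield (asin, url) pairs for well-formed ASINs, skipping blanks and duplicates."""
    seen = set()
    for raw in asins:
        asin = (raw or "").strip().upper()
        if not ASIN_PATTERN.fullmatch(asin) or asin in seen:
            continue
        seen.add(asin)
        yield asin, f"https://www.amazon.com/dp/{asin}"


if __name__ == "__main__":
    for asin, url in product_urls(["B087Z6SNC1", "", "1844076342"]):
        print(asin, url)
```

Filtering out empty and repeated values before building URLs avoids spending requests on pages that cannot exist or that you have already queued.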

Using Scrapy’s built-in XPath selector methods, we can extract the ASIN value from the product listing page. You can build an XPath selector in Scrapy Shell that captures the ASIN value for each product on the product listing page and generates a URL for each product:

products = response.xpath('//*[@data-asin]')
for product in products:
	asin = product.xpath('@data-asin').extract_first()
	product_url = f"https://www.amazon.com/dp/{asin}"

The function will then be configured to send a request to this URL and then call the parse_product_page() callback function when it receives a response.

Note: This request will also include the meta parameter, which is used to pass data between callback functions or edit certain settings.

def parse_keyword_response(self, response):
	products = response.xpath('//*[@data-asin]')
	for product in products:
		asin = product.xpath('@data-asin').extract_first()
		if not asin:
			continue  # skip elements whose data-asin attribute is empty
		product_url = f"https://www.amazon.com/dp/{asin}"
		yield scrapy.Request(url=product_url, callback=self.parse_product_page, meta={'asin': asin})

Step 5: Extract Specific Data From the Amazon Product Page

After the parse_keyword_response() function requests the product page URL, it passes the response it receives from Amazon, along with the ASIN in the meta parameter, to the parse_product_page() callback function.

We now want to extract the information we need from a product page, such as a product page for a t-shirt.

To do this, you need to create XPath selectors to extract each field from the HTML response we get from Amazon:

def parse_product_page(self, response):
	asin = response.meta['asin']
	title = response.xpath('//*[@id="productTitle"]/text()').extract_first()
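	# requires `import re` at the top of amazon.py; guard against pages without the image JSON
	image_match = re.search('"large":"(.*?)"', response.text)
	image = image_match.group(1) if image_match else None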
	rating = response.xpath('//*[@id="acrPopover"]/@title').extract_first()
	number_of_reviews = response.xpath('//*[@id="acrCustomerReviewText"]/text()').extract_first()
	bullet_points = response.xpath('//*[@id="feature-bullets"]//li/span/text()').extract()
	seller_rank = response.xpath('//*[text()="Amazon Best Sellers Rank:"]/parent::*//text()[not(parent::style)]').extract()

Note: Try using a regex selector over an XPath selector for scraping the image URL if the XPath is extracting the image in base64.

When working with large websites like Amazon that have a variety of product page layouts, you’ll find that a single XPath selector isn’t always enough, since it will work on certain pages but not others. To deal with the different layouts, you’ll need to write several XPath selectors for the same field.

When you run into this issue, give the spider several XPath options to try in order. For the price, we’ll use three:

def parse_product_page(self, response):
	asin = response.meta['asin']
	title = response.xpath('//*[@id="productTitle"]/text()').extract_first()
	image_match = re.search('"large":"(.*?)"', response.text)
	image = image_match.group(1) if image_match else None
	rating = response.xpath('//*[@id="acrPopover"]/@title').extract_first()
	number_of_reviews = response.xpath('//*[@id="acrCustomerReviewText"]/text()').extract_first()
	bullet_points = response.xpath('//*[@id="feature-bullets"]//li/span/text()').extract()
	seller_rank = response.xpath('//*[text()="Amazon Best Sellers Rank:"]/parent::*//text()[not(parent::style)]').extract()
	price = response.xpath('//*[@id="priceblock_ourprice"]/text()').extract_first()
	if not price:
		price = response.xpath('//*[@data-asin-price]/@data-asin-price').extract_first() or \
				response.xpath('//*[@id="price_inside_buybox"]/text()').extract_first()

If the spider is unable to locate a price using the first XPath selector, it goes on to the next.
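The fallback pattern can be factored into a small helper so that each extra selector is one more list entry. This is a sketch under stated assumptions: `first_non_empty` is a hypothetical name, and the candidates are passed as zero-argument callables so later selectors only run when earlier ones return nothing. A plain dictionary stands in for the Scrapy response here.

```python
def first_non_empty(candidates):
    """Return the first truthy value produced by a sequence of zero-argument callables."""
    for get_value in candidates:
        value = get_value()
        if value:
            return value
    return None


if __name__ == "__main__":
    # Stand-in for a page where only the third price location is populated
    fake_page = {"price_inside_buybox": "$19.99"}
    price = first_non_empty([
        lambda: fake_page.get("priceblock_ourprice"),
        lambda: fake_page.get("data-asin-price"),
        lambda: fake_page.get("price_inside_buybox"),
    ])
    print(price)
```

In the spider, each lambda would wrap one `response.xpath(...).extract_first()` call, in the same order as the selectors above.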

Forget About Parsing and Maintenance

When using ScraperAPI SDEs, you have a team of experts working tirelessly to keep data flowing. Let us worry about website changes and maintenance tasks.

If we look at the product page again, we can see that there are different sizes and colors of the product. To get this info, we’ll write a quick check to see if this section is on the page, and if it is, we’ll use regex selectors to extract it.

temp = response.xpath('//*[@id="twister"]')
sizes = []
colors = []
if temp:
	s = re.search('"variationValues" : ({.*})', response.text).groups()[0]
	json_acceptable = s.replace("'", "\"")
	di = json.loads(json_acceptable)
	sizes = di.get('size_name', [])
	colors = di.get('color_name', [])
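The greedy `{.*}` pattern can capture more than the intended object when other braces follow on the same line, and `.groups()` fails if the pattern is missing. As an alternative sketch (not the approach used in this tutorial's spider), the standard library's `json.JSONDecoder.raw_decode` can parse exactly one JSON value starting at a given position. The `extract_variations` helper and the sample string are assumptions for illustration, and the sketch assumes the embedded value is already valid, double-quoted JSON.

```python
import json

MARKER = '"variationValues"'


def extract_variations(html):
    """Return the variationValues object embedded in the page source, or {} if absent."""
    marker_pos = html.find(MARKER)
    if marker_pos == -1:
        return {}
    brace_pos = html.find("{", marker_pos + len(MARKER))
    if brace_pos == -1:
        return {}
    try:
        # raw_decode parses one JSON value starting at brace_pos and ignores what follows
        values, _end = json.JSONDecoder().raw_decode(html, brace_pos)
    except json.JSONDecodeError:
        return {}
    return values if isinstance(values, dict) else {}


if __name__ == "__main__":
    sample = ('var data = {"variationValues" : {"size_name": ["Small", "Large"], '
              '"color_name": ["Black"]}, "other": {"a": 1}};')
    variations = extract_variations(sample)
    print(variations.get("size_name", []))
    print(variations.get("color_name", []))
```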

When all of the pieces are in place, the parse_product_page() function will yield a dictionary, which will be sent to the pipelines.py file for data cleaning:

def parse_product_page(self, response):
	asin = response.meta['asin']
	title = response.xpath('//*[@id="productTitle"]/text()').extract_first()
	image_match = re.search('"large":"(.*?)"', response.text)
	image = image_match.group(1) if image_match else None
	rating = response.xpath('//*[@id="acrPopover"]/@title').extract_first()
	number_of_reviews = response.xpath('//*[@id="acrCustomerReviewText"]/text()').extract_first()
	price = response.xpath('//*[@id="priceblock_ourprice"]/text()').extract_first()
	if not price:
		price = response.xpath('//*[@data-asin-price]/@data-asin-price').extract_first() or \
				response.xpath('//*[@id="price_inside_buybox"]/text()').extract_first()
	temp = response.xpath('//*[@id="twister"]')
	sizes = []
	colors = []
	if temp:
		s = re.search('"variationValues" : ({.*})', response.text).groups()[0]
		json_acceptable = s.replace("'", "\"")
		di = json.loads(json_acceptable)
		sizes = di.get('size_name', [])
		colors = di.get('color_name', [])
	bullet_points = response.xpath('//*[@id="feature-bullets"]//li/span/text()').extract()
	seller_rank = response.xpath('//*[text()="Amazon Best Sellers Rank:"]/parent::*//text()[not(parent::style)]').extract()
	yield {'asin': asin, 'Title': title, 'MainImage': image, 'Rating': rating, 'NumberOfReviews': number_of_reviews,
		'Price': price, 'AvailableSizes': sizes, 'AvailableColors': colors, 'BulletPoints': bullet_points,
		'SellerRank': seller_rank}

Step 6: Scrape Multiple Amazon Search Result Pages

Our spider can now search Amazon using the keyword we provide and scrape the product information from the results. But what if we want our spider to go through each results page and scrape the items on each one?

To accomplish this, we simply need to add a few lines of code to our parse_keyword_response() function:

def parse_keyword_response(self, response):
	products = response.xpath('//*[@data-asin]')
	for product in products:
		asin = product.xpath('@data-asin').extract_first()
		product_url = f"https://www.amazon.com/dp/{asin}"
		yield scrapy.Request(url=product_url, callback=self.parse_product_page, meta={'asin': asin})
	next_page = response.xpath('//li[@class="a-last"]/a/@href').extract_first()
	if next_page:
		# requires `from urllib.parse import urljoin` at the top of amazon.py
		url = urljoin("https://www.amazon.com", next_page)
		yield scrapy.Request(url=url, callback=self.parse_keyword_response)

After scraping all of the product pages on the first page, the spider looks to see if there is a next page button. If there is, its relative URL will be retrieved and a new URL for the next page will be generated.

For example:

https://www.amazon.com/s?k=tshirt+for+men&page=2&qid=1594912185&ref=sr_pg_1

It will then use the callback to restart the parse_keyword_response() function and extract the ASIN IDs for each product as well as all of the product data as before.
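The step that turns the next-page link into a full URL can be tried in isolation with the standard library. In this sketch, `absolute_next_page` is a hypothetical helper name; the href value mirrors the example URL above.

```python
from urllib.parse import urljoin

BASE_URL = "https://www.amazon.com"


def absolute_next_page(href):
    """Resolve the next-page href against the Amazon base URL; return None if there is no link."""
    if not href:
        return None
    # urljoin leaves already-absolute URLs untouched and resolves relative ones against BASE_URL
    return urljoin(BASE_URL, href)


if __name__ == "__main__":
    print(absolute_next_page("/s?k=tshirt+for+men&page=2&qid=1594912185&ref=sr_pg_1"))
```

Returning None when no link is found gives the callback a simple condition for stopping at the last results page.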

Step 7: Test Your Amazon Scraper

Once you’ve developed your spider, you can now test it with the built-in Scrapy CSV exporter:

scrapy crawl amazon -o test.csv

You may notice that there are two issues:

The text is messy and some values appear as lists
Amazon is returning 429 (Too Many Requests) responses, which means it has detected that your requests are coming from a bot

If Amazon detects a bot, it’s likely that it’ll ban your IP address and you won’t be able to scrape any Amazon domain.

In order to solve this issue, you need a large proxy pool and you also need to rotate the proxies and headers for every request, which means higher overhead and longer development times.

Luckily, ScraperAPI can help eliminate this hassle.

Connect Your Proxies with ScraperAPI to Scrape Amazon

ScraperAPI is a scraping API designed to make web scraping proxies easier to use.

Instead of discovering and creating your own proxy infrastructure to rotate proxies and headers for each request, or detecting bans and bypassing anti-bots, you can simply send the URL you want to scrape to the ScraperAPI.

ScraperAPI will use machine learning and statistical analysis to handle CAPTCHAs, IP and header rotation, and many more advanced anti-bot countermeasures to ensure that your spiders never get blocked, allowing you to scrape Amazon consistently.

You can integrate ScraperAPI with your spider in many ways – you can check our documentation to learn all options – but to make the most out of the tool, the best option is sending your request through the API.

Sending Your Requests Through ScraperAPI

To get started, sign up for a free ScraperAPI account to receive an API key and 5,000 API credits.

Next, fill in the API_KEY variable with your API key:

API_KEY = ''
def get_url(url):
	payload = {'api_key': API_KEY, 'url': url}
	proxy_url = 'http://api.scraperapi.com/?' + urlencode(payload)
	return proxy_url

Then, we can change our spider functions to use the ScraperAPI proxy by wrapping the url argument of each scrapy.Request in get_url(url):

def start_requests(self):
	...
	yield scrapy.Request(url=get_url(url), callback=self.parse_keyword_response)

def parse_keyword_response(self, response):
	...
	yield scrapy.Request(url=get_url(product_url), callback=self.parse_product_page, meta={'asin': asin})
	...
	yield scrapy.Request(url=get_url(url), callback=self.parse_keyword_response)

Scrape Localized Amazon Data with Geotargeting

Amazon adjusts pricing data and supplier data displayed depending on the country you’re making the request from. So to get consistent data, we’ll use ScraperAPI’s geotargeting feature to make Amazon think our requests are coming from the US.

To accomplish this, we must add the "country_code=us" parameter to the request from within the payload variable.

Requests for geotargeting from the United States would look like the following:

def get_url(url):
	payload = {'api_key': API_KEY, 'url': url, 'country_code': 'us'}
	proxy_url = 'http://api.scraperapi.com/?' + urlencode(payload)
	return proxy_url
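As a variation on the function above, the country can be made an optional argument so the same helper serves both geotargeted and plain requests. This is a sketch, not ScraperAPI's official client: the `country_code` keyword argument on `get_url` and the placeholder key are assumptions. The round trip at the end shows that the target URL's own query string survives the encoding.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

API_KEY = "YOUR_API_KEY"  # placeholder, replace with your own key


def get_url(url, country_code=None):
    """Wrap a target URL in a ScraperAPI request URL, optionally geotargeted."""
    payload = {"api_key": API_KEY, "url": url}
    if country_code:
        payload["country_code"] = country_code
    return "http://api.scraperapi.com/?" + urlencode(payload)


if __name__ == "__main__":
    proxy_url = get_url("https://www.amazon.com/s?k=tshirt+for+men", country_code="us")
    params = parse_qs(urlsplit(proxy_url).query)
    print(params["url"][0])
    print(params["country_code"][0])
```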

Configure Your Concurrencies

Next, we need to adjust the number of concurrent requests in the settings.py file to match the concurrency limit of your ScraperAPI plan.

Concurrency is the number of requests you can make in parallel at any given time. The more concurrent requests you can make, the faster you can scrape.

In this tutorial, the spider’s maximum concurrency is set to 5 concurrent requests, as this is the maximum concurrency permitted on ScraperAPI’s free plan.

If your plan allows you to scrape with higher concurrency, then be sure to increase the maximum concurrency in the settings.py file.
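To get a feel for how much concurrency matters, a back-of-the-envelope estimate can help. The sketch below is a rough model only: it assumes every request takes the same average time and that all concurrent slots stay busy. The function name and the 3-second average are illustrative assumptions, not measured values.

```python
import math


def estimate_crawl_seconds(total_requests, concurrency, avg_seconds_per_request):
    """Rough estimate: requests run in batches of `concurrency`, each taking the average time."""
    if concurrency < 1:
        raise ValueError("concurrency must be at least 1")
    batches = math.ceil(total_requests / concurrency)
    return batches * avg_seconds_per_request


if __name__ == "__main__":
    for concurrency in (5, 20):
        seconds = estimate_crawl_seconds(1000, concurrency, 3.0)
        print(f"{concurrency} concurrent requests: ~{seconds / 60:.1f} minutes")
```

Under these assumptions, 1,000 requests take about 10 minutes at a concurrency of 5 and about 2.5 minutes at a concurrency of 20.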

Set RETRY_TIMES to 5 to tell Scrapy to retry any failed requests, and make sure DOWNLOAD_DELAY and RANDOMIZE_DOWNLOAD_DELAY aren’t enabled because they reduce concurrency and aren’t required with ScraperAPI.

## settings.py
CONCURRENT_REQUESTS = 5
RETRY_TIMES = 5
# DOWNLOAD_DELAY
# RANDOMIZE_DOWNLOAD_DELAY

Need More than 3M API Credits per Month?

Custom enterprise plans come with +3M API credits, +100 concurrencies, premium support, and a dedicated account manager.

Clean Up Your Data With Pipelines

As a final step, clean up the data using the pipelines.py file, since the text is messy and some of the values appear as lists.

class TutorialPipeline:
	def process_item(self, item, spider):
		for k, v in item.items():
			if not v:
				item[k] = ''# replace empty list or None with empty string
				continue
			if k == 'Title':
				item[k] = v.strip()
			elif k == 'Rating':
				item[k] = v.replace(' out of 5 stars', '')
			elif k == 'AvailableSizes' or k == 'AvailableColors':
				item[k] = ", ".join(v)
			elif k == 'BulletPoints':
				item[k] = ", ".join([i.strip() for i in v if i.strip()])
			elif k == 'SellerRank':
				item[k] = " ".join([i.strip() for i in v if i.strip()])
		return item

The item is transferred to the pipeline for cleaning after the spider has yielded a JSON object. We need to add the pipeline to the settings.py file to make it work:

## settings.py

# the module path must match the project name passed to startproject (amazon_scraper)
ITEM_PIPELINES = {'amazon_scraper.pipelines.TutorialPipeline': 300}
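The pipeline above leaves the Price field as raw text. If you need a numeric value, a helper along these lines could be called from process_item(). It is a sketch only: `parse_price` is a hypothetical name, and it assumes US-style formatting such as $1,299.99.

```python
import re
from decimal import Decimal, InvalidOperation


def parse_price(raw):
    """Convert text like '$1,299.99' to Decimal('1299.99'); return None if no number is found."""
    if not raw:
        return None
    # Match the first run of digits, allowing thousands separators and an optional decimal part
    match = re.search(r"\d[\d,]*(?:\.\d+)?", raw)
    if not match:
        return None
    try:
        return Decimal(match.group(0).replace(",", ""))
    except InvalidOperation:
        return None


if __name__ == "__main__":
    for text in ["$1,299.99", " $19.99 ", "Currently unavailable"]:
        print(repr(text), "->", parse_price(text))
```

Decimal is used instead of float so that prices keep their exact two-decimal value when stored or compared.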

Now you’re good to go, and you can use the following command to run the spider and save the result to a CSV file:

scrapy crawl amazon -o test.csv

How to Scrape Other Popular Amazon Pages

You can modify the language, response encoding, and other aspects of the data returned by Amazon by adding extra parameters to these URLs.

Remember to always ensure that these URLs are safely encoded. We already went over the ways to scrape an Amazon product page, but you can also try scraping the search and sellers pages by adding the following modifications to your script.

Get Amazon Search Page Data

To get the search results, simply enter a keyword into the url and safely encode it like so:

https://www.amazon.com/s?k=<SEARCH_KEYWORD>

You may add extra parameters to the search to filter the results by price, brand and other factors.

Important Update

You can simplify your code by sending your request to our https://api.scraperapi.com/structured/amazon/search endpoint, alongside your API key and the queries you want data from.

import requests

payload = {
    'api_key': 'API_KEY',
    'query': 'QUERY',
    'country': 'US'
}
r = requests.get('https://api.scraperapi.com/structured/amazon/search', params=payload)
print(r.text)

The endpoint will return all product information in a clean JSON format, making it easier and faster to work with.

Learn how ScraperAPI helps Ecommerce businesses achieve their data goals.

Get Sellers Page Data

Amazon recently replaced the dedicated page listing other sellers’ offers for a product with a component that slides in.

To get this data, you must now submit a request to the AJAX endpoint that populates this slide-in.

Format: https://www.amazon.com/gp/aod/ajax/ref=dp_aod_NEW_mbc?asin=<ASIN>
Example: https://www.amazon.com/gp/aod/ajax/ref=dp_aod_NEW_mbc?asin=B087Z6SNC1

You can refine these results by using additional parameters, such as the item’s condition.

Example: https://www.amazon.com/gp/aod/ajax/ref=tmm_pap_new_aod_0?filters={"all":true,"new":true}&condition=new&asin=1844076342&pc=dp
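Because the filters value contains braces, quotes, and colons, it is worth encoding the URL programmatically rather than writing it by hand. The sketch below mirrors the parameters from the example above; the `build_offers_url` helper is hypothetical, and only the "new" condition is taken from the source, so other values are untested assumptions.

```python
import json
from urllib.parse import urlencode, urlsplit, parse_qs

AOD_BASE = "https://www.amazon.com/gp/aod/ajax/ref=tmm_pap_new_aod_0"


def build_offers_url(asin, condition="new"):
    """Build an encoded offers URL mirroring the parameters in the example above."""
    # Assumption: the filter keys follow the example, with the condition name as the second key
    filters = {"all": True, condition: True}
    params = {
        "filters": json.dumps(filters, separators=(",", ":")),
        "condition": condition,
        "asin": asin,
        "pc": "dp",
    }
    return AOD_BASE + "?" + urlencode(params)


if __name__ == "__main__":
    url = build_offers_url("1844076342")
    print(url)
    # Decoding the query string recovers the original filters JSON
    print(parse_qs(urlsplit(url).query)["filters"][0])
```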

Get Amazon Product Reviews

We’ve recently launched a new Amazon Reviews structured data endpoint, allowing you to scrape Amazon product reviews for any product ASIN or list of ASINs.

Just send your request to the Amazon Reviews SDE alongside your API key and list of ASINs.


import requests

payload = {
    'api_key': 'APIKEY',
    'asin': 'ASIN',
    'country': 'COUNTRY',
    'tld': 'TLD'
}

r = requests.get('https://api.scraperapi.com/structured/amazon/review', params=payload)
print(r.text)

The endpoint will turn the product’s review pages into structured JSON.

Want to learn more? Check our guide on scraping Amazon reviews at scale.

Benefits of Using ScraperAPI to Scrape Amazon

As we’ve seen, ScraperAPI lets your scrapers bypass Amazon’s anti-scraping mechanisms, making it possible for your team to collect the data they need.

However, there are many secondary benefits that are worth mentioning:

1. Forget Headless Browsers and Use the Right Amazon Proxy

ScraperAPI uses advanced algorithms to mimic human behavior, allowing you to scrape Amazon data without the need for complex headless browsers, shortening development times and infrastructure costs, as well as increasing scraping speeds.

2. Residential Proxies Aren’t Essential

Thanks to our years of statistical analysis and ML models, ScraperAPI is able to use a healthier combination of datacenter IPs without having to resort to residential proxies.

By increasing the value per proxy, we’re able to reduce our costs, passing the savings on to you in the form of more affordable plans.

Discover how we compare to our competitors.

3. Don’t Forget About Geotargeting

Geotargeting is a must when you’re scraping a site like Amazon. When scraping Amazon, make sure your requests are geotargeted correctly, or Amazon can return incorrect information.

Previously, you could rely on cookies to geotarget your requests; however, Amazon has improved its detection and blocking of these types of requests. As a result, to geotarget a particular country, you must use proxies located in that country.

To do this with ScraperAPI, set country_code=[COUNTRY_CODE] to the country you want.

If you want to see results that Amazon would show to a person in the U.S., you’ll need a US proxy, and if you want to see results that Amazon would show to a person in Germany, you’ll need a German proxy.

Here’s a list of available countries – custom plans can request over 50 more countries.

About the author

Zoltan Bettenbuk

Zoltan Bettenbuk is the CTO of ScraperAPI - helping thousands of companies get access to the data they need. He’s a well-known expert in data processing and web scraping. With more than 15 years of experience in software development, product management, and leadership, Zoltan frequently publishes his insights on our blog as well as on Twitter and LinkedIn.
