- How To Use The API?
- Concurrent Requests
- Increasing Scraping Speed
- Getting Failed Requests From The API
- Custom Built Scrapers
Plans & Billing
- Free Plan & 7-Day Free Trial
- Cancelling A Plan
- Getting A Refund
- Change Credit Card Details
- Pay-As-You-Go Option
- Bandwidth Based Pricing
- Buying Individual IPs
- Clicking Page Elements
- Page Elements Still Missing
- JS Rendering Concurrency
- Selenium & Puppeteer Support
Geolocation & Residential IPs
Anti-bots & CAPTCHAs
How To Use The API?
With Scraper API, you send a request to either our REST API interface or our proxy port, and we return the HTML response from the target website. For more information on getting started, see our documentation.
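As a rough illustration of the REST-style request described above, here is a minimal sketch using only the Python standard library. The endpoint URL and the `api_key` and `url` parameter names are assumptions made for this example, so confirm them against the official documentation before relying on them.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint and parameter names; confirm them in the official documentation.
API_ENDPOINT = "http://api.scraperapi.com/"


def build_api_url(api_key, target_url, **options):
    """Return the API URL that asks the service to fetch target_url."""
    params = {"api_key": api_key, "url": target_url}
    params.update(options)  # extra flags, for example premium="true"
    return API_ENDPOINT + "?" + urlencode(params)


def fetch_html(api_key, target_url, timeout=70, **options):
    """Fetch target_url through the API and return the response body as text."""
    # The API may keep retrying for up to 60 seconds, so the client waits longer.
    api_url = build_api_url(api_key, target_url, **options)
    with urlopen(api_url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

A call might look like `fetch_html("YOUR_API_KEY", "http://example.com")`, where the key is a placeholder.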
Concurrent Requests
Every Scraper API plan has a limited number of concurrent threads, which caps the number of requests you can make to the API in parallel (the API doesn’t accept batch requests). The more concurrent threads your plan has, the faster you can scrape a website. If you would like to increase the number of concurrent requests you can make with the API, contact our customer support team.
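One way to stay within a plan's concurrency limit is to send requests from a fixed-size thread pool. The sketch below is an illustration, not an official client: `scrape_all` and its `fetch` argument are hypothetical names, and `fetch` stands for any function that sends one URL through the API and returns the HTML.

```python
from concurrent.futures import ThreadPoolExecutor


def scrape_all(urls, fetch, max_concurrency=5):
    """Fetch every URL with at most max_concurrency requests in flight.

    Returns a list of (url, html, error) tuples in the same order as urls.
    """
    def safe_fetch(url):
        # Capture failures per URL so one bad request does not stop the batch.
        try:
            return url, fetch(url), None
        except Exception as exc:
            return url, None, exc

    # The pool size is the cap on parallel requests; set it to your plan's limit.
    with ThreadPoolExecutor(max_workers=max_concurrency) as pool:
        return list(pool.map(safe_fetch, urls))
```

The default of 5 matches the free plan's connection limit mentioned in this FAQ. Failed entries come back with their exception, so they can be retried separately.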
Increasing Scraping Speed
When you send a request to the API, we route it through our proxy pools, check the response for CAPTCHAs, bans, etc., and then either return the valid HTML response to you or keep retrying the request with different proxies for up to 60 seconds before returning a 500 status error.
The API typically has higher latency than sending requests directly to a normal proxy: average latencies range from 4 to 12 seconds (depending on the website), and single requests can sometimes take up to 60 seconds. However, this is offset by our average success rate of around 98%.
If you would like to increase the volume of successful requests you can make in a given time period, we can increase your number of concurrent threads. Contact our sales team to enquire about increasing your concurrency limit.
If you would like to reduce the latency of each request, or reduce the long tail of requests that take 20-30 seconds, you can use our premium proxy pools by adding premium=true to your request, or contact our support team to see if they can increase the speed of your requests.
Getting Failed Requests From The API
Scraper API routes your requests through proxy pools with over 40 million proxies and retries requests for up to 60 seconds to get a successful response. Even so, some of your requests will fail. You can expect 1-3% of your requests to fail, but you won’t be charged for these failed requests. If you configure your code to automatically retry failed requests, the retry will succeed in the majority of cases.
If you are experiencing failure rates in excess of 10%, contact our support team, who will look at tuning the API to yield a higher success rate.
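The automatic retry suggested above could be sketched as follows, using only the Python standard library. The function name, attempt count and backoff values are illustrative assumptions, and `api_url` is expected to be a complete API request URL that you have already built.

```python
import time
import urllib.error
import urllib.request


def fetch_with_retries(api_url, max_attempts=3, backoff_seconds=1.0,
                       opener=urllib.request.urlopen, sleep=time.sleep):
    """Return the response body, retrying when the API answers with a 5xx status."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            # The API may retry internally for up to 60 seconds, so wait longer.
            with opener(api_url, timeout=70) as response:
                return response.read()
        except urllib.error.HTTPError as error:
            # Retry only server-side failures such as the 500 status;
            # re-raise other errors, and the last failure once attempts run out.
            if error.code < 500 or attempt == max_attempts:
                raise
        # Wait a little longer before each new attempt.
        sleep(backoff_seconds * attempt)
```

The `opener` and `sleep` parameters exist only so the function can be exercised without network access. In normal use the defaults are enough.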
Custom Built Scrapers
Scraper API is a proxy API for web scraping, so unfortunately we don’t develop custom scrapers for users. If you would like someone to build a scraper for you, we recommend ScrapeSimple, who provide high-quality scraping services at low cost.
Plans & Billing
Free Plan & 7-Day Free Trial
Scraper API offers a free plan of 1,000 API calls per month (with a maximum of 5 concurrent connections) for small scraping projects. For the first 7 days after you sign up, you will have access to 5,000 free requests so you can test the API at a larger scale. If you need additional API calls for testing purposes, please contact support.
Cancelling A Plan
You can cancel your subscription at any time in your dashboard or by contacting support. You will not be charged for cancelling.
Getting A Refund
We offer a 7-day, no-questions-asked refund policy. If you are unhappy with the service for any reason, contact support and we’ll refund you right away.
Change Credit Card Details
You can change your card details anytime on the Billing page in your dashboard or by contacting support, who will help you securely change your card details.
Pay-As-You-Go Option
Currently, we don’t offer a pay-as-you-go option with the API. All our plans are monthly subscriptions that reset each month.
Bandwidth Based Pricing
Currently, we don’t offer bandwidth based pricing. All our plans are based on the number of requests you make to the API each month.
Buying Individual IPs
Currently, we don’t have an option to purchase individual proxies from our pools.
You can use JS rendering with standard requests on our Business Plan at no extra cost (1 JS-rendered request = 1 API call). However, if you use JS rendering with premium requests, 1 JS request counts as 25 API calls. We highly recommend that you only use JS rendering if you absolutely need it to extract your target data, because JS rendering increases your latencies and can reduce your success rates, which in turn reduces the volume of requests you can process through the API.
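To make the cost difference concrete, here is a small worked calculation based only on the two rates stated above. The function name is hypothetical, and the sketch deliberately covers JS-rendered requests only.

```python
# API calls charged per JS-rendered request, as stated in the text above.
JS_STANDARD_COST = 1
JS_PREMIUM_COST = 25


def js_rendering_calls(request_count, premium=False):
    """Return how many API calls request_count JS-rendered requests consume."""
    cost_per_request = JS_PREMIUM_COST if premium else JS_STANDARD_COST
    return request_count * cost_per_request
```

For example, 10,000 JS-rendered requests consume 10,000 API calls on the standard pools and 250,000 API calls on the premium pools.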
Clicking Page Elements
Currently, the API doesn’t allow you to control the Chromium instance so that you can interact with page elements. The API will only render any JS on the page and return the HTML response.
Page Elements Still Missing
To avoid rendering unnecessary images, tracking scripts, etc. that would slow your request down, the API doesn’t render everything on the page by default. Sometimes the skipped content includes data that you actually need. If this happens, contact our support team, who will customise the JS rendering for that particular site.
JS Rendering Concurrency
There is a separate concurrency limit for JS rendering that limits you to 3 requests per second. This means that you can only send 3 JS requests to the API every second. However, this doesn’t mean that you can only have 3 concurrent threads with JS rendering active at any one time. On Enterprise Plans, we can increase this limit upon request.
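A client can respect the 3-requests-per-second limit by throttling how often JS-rendered requests start, independently of how many are in flight. The class below is a minimal sketch under that reading of the limit. `PerSecondLimiter` is a hypothetical helper, not part of the API.

```python
import threading
import time
from collections import deque


class PerSecondLimiter:
    """Block callers so at most max_per_second calls start in any one-second window."""

    def __init__(self, max_per_second=3, clock=time.monotonic, sleep=time.sleep):
        self.max_per_second = max_per_second
        self.clock = clock
        self.sleep = sleep
        self.started = deque()  # start times of the most recent calls
        self.lock = threading.Lock()

    def wait(self):
        """Return as soon as the caller is allowed to start a new request."""
        with self.lock:
            while True:
                now = self.clock()
                # Forget calls that started a full second or more ago.
                while self.started and now - self.started[0] >= 1.0:
                    self.started.popleft()
                if len(self.started) < self.max_per_second:
                    self.started.append(now)
                    return
                # Sleep until the oldest recorded call leaves the window.
                self.sleep(1.0 - (now - self.started[0]))
```

Each worker thread would call `limiter.wait()` just before sending a JS-rendered request. Requests that are already running are unaffected, which matches the note above that the limit is on requests per second rather than on concurrent threads.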
Selenium & Puppeteer Support
The API doesn’t support requests from Selenium or Puppeteer by default. However, we can enable Selenium and Puppeteer support for Enterprise users. Contact our support team to do so.
Geolocation & Residential IPs
Startup and Business Plan users can geotarget their requests to the following 12 countries by using the country_code flag in their request (Startup Plan users can geotarget to the US only).
|Country Code|Country|Plans|
|---|---|---|
|us|United States|Startup Plan and higher.|
|ca|Canada|Business Plan and higher.|
|uk|United Kingdom|Business Plan and higher.|
|de|Germany|Business Plan and higher.|
|fr|France|Business Plan and higher.|
|es|Spain|Business Plan and higher.|
|br|Brazil|Business Plan and higher.|
|mx|Mexico|Business Plan and higher.|
|in|India|Business Plan and higher.|
|jp|Japan|Business Plan and higher.|
|cn|China|Business Plan and higher.|
|au|Australia|Business Plan and higher.|
Other countries are available to Enterprise customers upon request.
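The country table above can be encoded as a lookup that validates a country code against a plan before the request is sent. This is a sketch under assumptions: the `api_key` and `url` parameter names and the plan ordering (Startup below Business below Enterprise) are assumed for the example, while `country_code` and the code list come from the table.

```python
# Country codes and the lowest plan that may use them, copied from the table above.
GEO_CODES = {
    "us": "startup", "ca": "business", "uk": "business", "de": "business",
    "fr": "business", "es": "business", "br": "business", "mx": "business",
    "in": "business", "jp": "business", "cn": "business", "au": "business",
}

# Assumed plan ordering, used only to compare plan levels in this sketch.
PLAN_RANK = {"startup": 1, "business": 2, "enterprise": 3}


def geotargeted_params(api_key, target_url, country_code, plan):
    """Return request parameters with country_code set, or raise ValueError."""
    code = country_code.lower()
    if code not in GEO_CODES:
        raise ValueError(f"{code!r} is not listed; other countries need an Enterprise request")
    if PLAN_RANK[plan] < PLAN_RANK[GEO_CODES[code]]:
        raise ValueError(f"{code!r} requires the {GEO_CODES[code]} plan or higher")
    # api_key and url are assumed parameter names; country_code is from the text.
    return {"api_key": api_key, "url": target_url, "country_code": code}
```

Checking on the client side fails fast with a readable message instead of spending a request on a country the plan cannot use.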
At the moment the API doesn’t support state or city level geotargeting with our proxy pools. However, on request we can implement this for Enterprise level users.
We use residential proxies as fallback proxies within our standard proxy pools if a request has repeatedly failed. However, if you would like to use our residential proxy pools exclusively, you can enable this by adding premium=true to the requests you send to the API.
Our premium proxy pools contain mobile IPs. However, if you want to use mobile proxies exclusively, contact our support team, who will be able to create a custom plan for you.
Anti-bots & CAPTCHAs
Getting Distil, Cloudflare, etc. bans.
Along with constantly fine-tuning our proxy and header pools, we’ve built numerous anti-bot bypasses into the API that enable it to get past most challenges thrown by anti-bots. Generally, your success rates will be slightly lower on sites that make heavy use of anti-bots, but you should still be able to scrape the site reliably at scale with the API.
If the API is completely blocked by a site or you are experiencing a very low success rate (under 70%), please let our support team know about the issue.
This is generally because the site either uses two or more anti-bots in tandem or uses a customised version of an anti-bot with higher security settings that stops the general bypass from working. In cases like these, one of our engineers will put a custom bypass in place for you if you contact our support team.
How CAPTCHA solving works
After we receive an HTML response back from your target website, we automatically run the response through our ban and CAPTCHA detection algorithms. If the API detects a CAPTCHA, it retries the request with another IP and has the blocked IP unblocked in parallel. This ensures that you don’t have to wait until the CAPTCHA is solved before the request is retried.
Getting a CAPTCHA as successful response
We have a CAPTCHA database with thousands of bans and CAPTCHA types that we use to detect whether a response contains a CAPTCHA or has been blocked by an anti-bot. If you are getting a CAPTCHA or an anti-bot message back as a successful status 200 response, let our support team know and they will add this new CAPTCHA or anti-bot message to our database so it will be detected in the future. Once it is detected, the API will keep retrying the request until it gets the correct successful response.
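While waiting for a new CAPTCHA type to be added to the database, you may want to flag suspicious 200 responses on your side so they can be retried and reported. The check below is a deliberately simple sketch. The marker strings are hypothetical placeholders and should be replaced with text taken from the block pages you actually receive.

```python
# Hypothetical marker strings; replace them with text from real block pages.
BLOCK_MARKERS = ("captcha", "access denied", "unusual traffic")


def find_block_markers(html, markers=BLOCK_MARKERS):
    """Return the markers found in html (case-insensitive); empty list if none."""
    text = html.lower()
    return [marker for marker in markers if marker in text]
```

A non-empty result suggests the response is a block page rather than real content. Saving that HTML gives the support team a sample to work from.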
Solving embedded CAPTCHAs
Currently, the API doesn’t solve CAPTCHAs that are permanently embedded on the page, like those often found on forms or buttons to reveal personal information. You will need to use a dedicated CAPTCHA solver service to unlock these CAPTCHAs. On Enterprise Plans, we can implement this functionality for you upon request.