Getting Started

Scraper API is designed to simplify web scraping. Here is everything you need to know to get started.

Basic Usage

Scraper API exposes a single API endpoint. To use it, send a GET request to https://api.scraperapi.com with two query string parameters: key, which contains your API key, and url, which contains the URL you would like to scrape.

Sample Code

Bash Node

      curl "https://api.scraperapi.com?key=YOURAPIKEY&url=http://httpbin.org/ip"
      

Result


        <html>
          <head>
          </head>
          <body>
            <pre style="word-wrap: break-word; white-space: pre-wrap;">
              {"origin":"176.12.80.34"}
            </pre>
          </body>
        </html>
      

Query Strings

If the URL you are scraping contains query string parameters, it must be URL-encoded when calling the API:

Sample Code

Bash Node

      curl "https://api.scraperapi.com?key=YOURAPIKEY&url=http%3A%2F%2Fhttpbin.org%2Fredirect%3Furl%3Dip"
      

      var request = require('request');
      var url = 'http://httpbin.org/redirect-to?url=ip';

      request(
        {
          method: 'GET',
          url: 'https://api.scraperapi.com/?key=YOURAPIKEY&url=' + encodeURIComponent(url),
          headers: {
            Accept: 'application/json',
          },
        },
        function(error, response, body) {
          if (error) {
            // No response object is available on a network-level failure
            return console.error('Request failed:', error);
          }
          console.log('Status:', response.statusCode);
          console.log('Response:', body);
        }
      );
      

Result


        <html>
          <head>
          </head>
          <body>
            <pre style="word-wrap: break-word; white-space: pre-wrap;">
              {"origin":"146.19.48.72"}
            </pre>
          </body>
        </html>
      

Rendering Javascript

If you are crawling a page that requires you to render the JavaScript on the page, simply set render=true. Don't use this feature when rendering JavaScript is not necessary, as it will add 10 seconds to request times:

Sample Code

Bash Node

      curl "https://api.scraperapi.com?key=YOURAPIKEY&url=http://httpbin.org/ip&render=true"
      

      var request = require('request');
      var url = 'http://httpbin.org/ip';

      request(
        {
          method: 'GET',
          url: 'https://api.scraperapi.com/?key=YOURAPIKEY&url=' + url + '&render=true',
          headers: {
            Accept: 'application/json',
          },
        },
        function(error, response, body) {
          if (error) {
            // No response object is available on a network-level failure
            return console.error('Request failed:', error);
          }
          console.log('Status:', response.statusCode);
          console.log('Response:', body);
        }
      );
      

Result


        <html>
          <head>
          </head>
          <body>
            <pre style="word-wrap: break-word; white-space: pre-wrap;">
              {"origin":"192.15.81.132"}
            </pre>
          </body>
        </html>
      

Custom Headers

If you would like to keep the original request headers in order to pass through custom headers (user agents, cookies, etc.), simply set keep_headers=true. Note that this feature will not work in conjunction with JavaScript rendering. Only use this feature to get customized results; do not use it to try to avoid blocks, as we handle that internally.

Sample Code

Bash Node

      curl --header "X-MyHeader: 123" \
      "https://api.scraperapi.com?key=YOURAPIKEY&url=http://httpbin.org/anything&keep_headers=true"
      

      var request = require('request');
      var url = 'http://httpbin.org/anything';

      request(
        {
          method: 'GET',
          url: 'https://api.scraperapi.com/?key=YOURAPIKEY&url=' + url + '&keep_headers=true',
          headers: {
            Accept: 'application/json',
            'X-MyHeader': '123',
          },
        },
        function(error, response, body) {
          if (error) {
            // No response object is available on a network-level failure
            return console.error('Request failed:', error);
          }
          console.log('Status:', response.statusCode);
          console.log('Response:', body);
        }
      );
      

Result


        <html>
          <head>
          </head>
          <body>
            <pre style="word-wrap: break-word; white-space: pre-wrap;">
            {
              "args":{},
              "data":"",
              "files":{},
              "form":{},
              "headers": {
                "Accept":"*/*",
                "Accept-Encoding":"gzip, deflate",
                "Cache-Control":"max-age=259200",
                "Connection":"close",
                "Host":"httpbin.org",
                "Referer":"http://httpbin.org",
                "Timeout":"10000",
                "User-Agent":"curl/7.54.0",
                "X-Myheader":"123"
              },
              "json":null,
              "method":"GET",
              "origin":"45.72.0.249",
              "url":"http://httpbin.org/anything"
            }
            </pre>
          </body>
        </html>
      

POST Requests

(BETA) Some advanced users will want to issue POST requests in order to scrape forms and API endpoints directly. You can do this by sending a POST request through Scraper API. The return value will be stringified; if you want to use it as JSON, you will need to parse it into a JSON object.

Sample Code

Bash

      curl -d 'foo=bar' \
      -X POST \
      "https://api.scraperapi.com?key=YOURAPIKEY&url=http://httpbin.org/post"
      

Result


      "{
        \"args\": {},
        \"data\": \"\",
        \"files\": {},
        \"form\": {
          \"{\\\"{\\\\\\\"foo\\\\\\\": \\\\\\\"\":\"\"},
          \"headers\": {
            \"Accept\":\"*/*\",
            \"Accept-Encoding\": \"gzip, deflate\",
            \"Accept-Language\":\"en-US,en;\",
            \"Cache-Control\":\"max-age=259200\",
            \"Connection\":\"close\",
            \"Content-Length\":\"14\",
            \"Content-Type\":\"application/x-www-form-urlencoded\",
            \"Host\":\"httpbin.org\",
            \"Timeout\":\"10000\",
            \"User-Agent\":\"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36\"
          },
          \"json\":null,
          \"origin\":\"107.175.110.211\",
          \"url\":\"http://httpbin.org/post\"
        }\n"
      

PUT Requests

(BETA) Some advanced users will want to issue PUT requests in order to scrape forms and API endpoints directly. You can do this by sending a PUT request through Scraper API. The return value will be stringified; if you want to use it as JSON, you will need to parse it into a JSON object.

Sample Code

Bash

      curl -d 'foo=bar' \
      -X PUT \
      "https://api.scraperapi.com?key=YOURAPIKEY&url=http://httpbin.org/anything"
      

Result


      "{
        \"args\":{},
        \"data\":\"{\\\"foo\\\":\\\"bar\\\"}\",
        \"files\":{},
        \"form\":{},
        \"headers\":{
          \"Accept\":\"application/json\",
          \"Accept-Encoding\":\"gzip, deflate\",
          \"Accept-Language\":\"en-US,en;q=0.9,es;q=0.8\",
          \"Cache-Control\":\"max-age=259200\",
          \"Connection\":\"close\",
          \"Content-Length\":\"13\",
          \"Content-Type\":\"application/json\",
          \"Host\":\"httpbin.org\",
          \"Upgrade-Insecure-Requests\":\"1\",
          \"User-Agent\":\"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_4) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/11.1 Safari/605.1.15\"
        },
        \"json\":{\"foo\":\"bar\"},
        \"method\":\"PUT\",
        \"origin\":\"104.168.95.190\",
        \"url\":\"http://httpbin.org/anything\"
      }\n"
      

HEAD Requests

(BETA) Some advanced users will want to issue HEAD requests in order to test whether a page exists. You can do this by sending a HEAD request through Scraper API. The API will return a 200 status code if the page exists, and a 500 status code if the page does not exist or throws an error of some kind.

Sample Code

Bash

      curl --head "https://api.scraperapi.com?key=YOURAPIKEY&url=http://httpbin.org/anything"
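
A Node sketch of the same existence check, mirroring the curl --head call above and keying off the status code:

      var request = require('request');

      var url = 'http://httpbin.org/anything';

      request(
        {
          method: 'HEAD',
          url: 'https://api.scraperapi.com/?key=YOURAPIKEY&url=' + encodeURIComponent(url),
        },
        function(error, response) {
          if (error) {
            return console.error('Request failed:', error);
          }
          // 200 means the page exists; 500 means it does not exist or threw an error
          console.log('Page exists:', response.statusCode === 200);
        }
      );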
      

Result


        HTTP/1.1 200 OK
      

Geographic Location

If you need your requests to be made from a certain geographic location (US only, EU only, etc.), please contact support@scraperapi.com.
If you have any questions, you can contact us through the chat widget or by email at support@scraperapi.com.

Our Story

Having built many web scrapers, we repeatedly went through the tiresome process of finding proxies, setting up headless browsers, and solving CAPTCHAs. That's why we decided to start Scraper API: it handles all of this for you, so you can scrape any page with a simple API call!