The number one way sites detect web scrapers is by examining their IP address. To avoid sending all of your requests from the same IP address, you can use Scraper API or another proxy service to route your requests through a series of different IP addresses. This will allow you to scrape the majority of websites without issue.
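As a minimal sketch of what routing requests through a proxy looks like with Python's requests library: the proxy URL and credentials below are placeholders, and the exact endpoint format depends on your proxy provider.

```python
import requests

# Hypothetical rotating-proxy endpoint; replace with your provider's actual
# proxy URL and credentials (check your proxy service's documentation).
PROXY_URL = "http://username:password@proxy.example.com:8001"

proxies = {
    "http": PROXY_URL,
    "https": PROXY_URL,
}

# The request is routed through the proxy, so the target site sees the
# proxy's IP address instead of yours.
response = requests.get("https://example.com", proxies=proxies, timeout=30)
print(response.status_code)
```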
Some websites will examine User Agents and block requests from User Agents that don't belong to a major browser. Many web scrapers don't bother setting a User Agent at all, and are easily detected by this missing header. Don't be one of these developers! Remember to set a popular User Agent for your web crawler (you can find a list of popular User Agents here). Advanced users can also set their User Agent to the Googlebot User Agent, since most websites want to be listed on Google and therefore let Googlebot through.
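Here is a small sketch of setting a browser User Agent with requests; the exact User Agent string below is just an example of a recent desktop Chrome value, and you should substitute whichever popular one you choose.

```python
import requests

# Example desktop Chrome User Agent string; any current, widely used
# browser User Agent works here.
headers = {
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"
    )
}

response = requests.get("https://example.com", headers=headers, timeout=30)
```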
Real web browsers send a whole host of headers, any of which can be checked by careful websites to block your web scraper. To make your scraper appear to be a real browser, navigate to https://httpbin.org/anything and copy the headers you see there (they are the headers your current web browser is using). Setting headers like "Accept", "Accept-Encoding", "Accept-Language", and "Cache-Control" will make your requests look like they are coming from a real browser.
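A sketch of sending a full browser-like header set is below; the specific values are illustrative, and in practice you would paste in whatever https://httpbin.org/anything reports for your own browser.

```python
import requests

# Headers modeled on what a real browser sends; replace these values with
# the ones https://httpbin.org/anything shows for your own browser session.
headers = {
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"
    ),
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Encoding": "gzip, deflate, br",
    "Accept-Language": "en-US,en;q=0.9",
    "Cache-Control": "no-cache",
}

response = requests.get("https://example.com", headers=headers, timeout=30)
```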
It is easy to detect a web scraper that sends exactly one request per second, 24 hours a day! No real person uses a website like that, and such an obvious pattern is easily detectable. Use randomized delays (for example, anywhere between 2 and 10 seconds) to build a web scraper that can avoid being blocked.
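A minimal sketch of randomized delays between requests is shown below; the URL list and the 2-10 second range are placeholders you would tune for your own crawl.

```python
import random
import time

import requests

# Hypothetical list of pages to crawl.
urls = ["https://example.com/page/1", "https://example.com/page/2"]

for url in urls:
    response = requests.get(url, timeout=30)
    # ... process the response here ...

    # Sleep for a random 2-10 second interval so the request timing
    # doesn't form an obvious, machine-like pattern.
    time.sleep(random.uniform(2, 10))
```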