Web Scraping With Python & BeautifulSoup: Beginner Tutorial (2026)
Step-by-step tutorial on scraping a website with Python and BeautifulSoup — install, parse HTML, extract data, and avoid getting blocked.

This is the shortest path from 'I know Python basics' to 'I just scraped real data off a real website.' We'll use requests and BeautifulSoup, two of the most widely used Python scraping libraries, and finish with the etiquette that keeps you from being blocked.
1. Install the Libraries
Open a terminal and run:
- pip install requests beautifulsoup4

2. Fetch the Page
requests.get() downloads the raw HTML. Always pass a User-Agent header — many sites reject the default Python one.
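- import requests
- headers = {'User-Agent': 'Mozilla/5.0'}
- response = requests.get('https://example.com', headers=headers, timeout=10)  # without a timeout, requests can wait indefinitely
- response.raise_for_status()  # stop early on 4xx/5xx responses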
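If you would rather handle failures in one place, the fetch step can be wrapped in a small helper. The following is a minimal sketch, not a definitive implementation: the name `fetch_html`, the `HEADERS` value, and the 10-second default timeout are assumptions chosen for illustration.

```python
from typing import Optional

import requests

# Hypothetical User-Agent string; replace it with one that identifies your script.
HEADERS = {"User-Agent": "Mozilla/5.0 (compatible; tutorial-scraper/0.1)"}


def fetch_html(url: str, timeout: float = 10.0) -> Optional[str]:
    """Return the page's HTML as text, or None if the request fails."""
    try:
        response = requests.get(url, headers=HEADERS, timeout=timeout)
        response.raise_for_status()  # raises HTTPError on 4xx/5xx responses
    except requests.RequestException as exc:
        # RequestException covers timeouts, connection errors, bad URLs, and HTTP errors.
        print(f"Request failed for {url}: {exc}")
        return None
    return response.text
```

Calling `fetch_html('https://example.com')` would then give you either a string to parse or `None`, so the rest of the script can skip pages that failed to load.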
3. Parse the HTML
Hand the response text to BeautifulSoup with the built-in html.parser:
- from bs4 import BeautifulSoup
- soup = BeautifulSoup(response.text, 'html.parser')
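To see what parsing gives you without touching the network, you can feed BeautifulSoup a literal HTML string. This is a small sketch; the HTML snippet and the variable names are made up for illustration.

```python
from bs4 import BeautifulSoup

# Made-up HTML standing in for response.text.
html = """
<html>
  <head><title>Demo Page</title></head>
  <body>
    <h1>Latest Articles</h1>
    <p class="intro">Three new posts this week.</p>
  </body>
</html>
"""

soup = BeautifulSoup(html, "html.parser")

page_title = soup.title.string                       # text inside <title>
heading = soup.find("h1").get_text()                 # first <h1> in the document
intro = soup.find("p", class_="intro").get_text()    # first <p class="intro">

print(page_title)
print(heading)
print(intro)
```

`find()` returns the first matching tag (or `None` if nothing matches), which is handy for one-off elements such as the page heading.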
4. Extract What You Need
Use .select() with CSS selectors — the same syntax you'd use in browser DevTools:
- titles = [h2.text for h2 in soup.select('h2.article-title')]
- links = [a['href'] for a in soup.select('a.read-more')]
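Collecting titles and links in two separate lists works only while every article has both. If one article lacks a link, the lists drift out of step. One way around that, sketched below, is to loop over each article's container and read both fields from it. The `article.post` wrapper, the sample HTML, and `BASE_URL` are assumptions; check the real page's markup in DevTools.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

BASE_URL = "https://example.com"  # assumed site root, used to resolve relative links

# Made-up HTML: the second article has no "read more" link.
html = """
<article class="post">
  <h2 class="article-title"> First Post </h2>
  <a class="read-more" href="/posts/1">Read more</a>
</article>
<article class="post">
  <h2 class="article-title">Second Post</h2>
</article>
"""

soup = BeautifulSoup(html, "html.parser")

rows = []
for card in soup.select("article.post"):
    title_tag = card.select_one("h2.article-title")
    link_tag = card.select_one("a.read-more")
    if title_tag is None:
        continue  # skip containers that are not real articles
    title = title_tag.get_text(strip=True)          # strip=True trims surrounding whitespace
    href = link_tag.get("href") if link_tag else None
    link = urljoin(BASE_URL, href) if href else ""  # turn /posts/1 into an absolute URL
    rows.append((title, link))

print(rows)
```

Each entry in `rows` keeps a title paired with its own link, with an empty string where the link is missing.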
5. Save the Data
CSV is the simplest output format for most scraping projects:
- import csv
- with open('out.csv', 'w', newline='', encoding='utf-8') as f:
-     w = csv.writer(f)
-     w.writerow(['title', 'link'])  # header row
-     for title, link in zip(titles, links):
-         w.writerow([title, link])
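If your scraped records are dictionaries rather than parallel lists, `csv.DictWriter` keeps the columns tied to their names. The sketch below uses made-up sample rows and a hypothetical file name, and writes to the system temp directory so it does not clutter your project.

```python
import csv
import os
import tempfile

# Made-up sample records; note the comma inside the second title.
rows = [
    {"title": "First Post", "link": "https://example.com/posts/1"},
    {"title": "Café, Part 2", "link": "https://example.com/posts/2"},
]

out_path = os.path.join(tempfile.gettempdir(), "scrape_demo.csv")  # hypothetical file name

# newline="" lets the csv module control line endings; utf-8 preserves accented characters.
with open(out_path, "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["title", "link"])
    writer.writeheader()
    writer.writerows(rows)

# Read the file back to confirm it round-trips.
with open(out_path, newline="", encoding="utf-8") as f:
    loaded = list(csv.DictReader(f))

print(loaded)
```

The csv module quotes fields that contain commas, so a title like the second one survives the round trip intact.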
6. Don't Get Blocked
- Add time.sleep(1) between requests.
- Check the site's /robots.txt and respect Disallow paths.
- Rotate User-Agents if you make hundreds of requests.
- If the page needs JavaScript, switch to Playwright instead.
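The first two points above can be sketched with the standard library's `urllib.robotparser`. This example parses an inline, made-up robots.txt so it runs offline; for a real site you would call `set_url()` with the site's robots.txt address and then `read()`. The `USER_AGENT` name and the `polite_crawl` helper are hypothetical.

```python
import time
from urllib.robotparser import RobotFileParser

USER_AGENT = "tutorial-scraper"  # hypothetical name for your script

# Made-up robots.txt content for illustration.
robots_lines = """\
User-agent: *
Disallow: /private/
Crawl-delay: 2
""".splitlines()

parser = RobotFileParser()
parser.parse(robots_lines)

# Use the site's Crawl-delay if it declares one, otherwise fall back to 1 second.
crawl_delay = parser.crawl_delay(USER_AGENT) or 1


def polite_crawl(urls, fetch, delay):
    """Call fetch(url) for each allowed URL, pausing `delay` seconds between calls."""
    allowed_urls = [u for u in urls if parser.can_fetch(USER_AGENT, u)]
    results = {}
    for i, url in enumerate(allowed_urls):
        if i > 0:
            time.sleep(delay)  # wait between consecutive requests
        results[url] = fetch(url)
    return results
```

In a real script, `fetch` would be the function that downloads a page, and you would pass `crawl_delay` as the `delay` argument.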
Frequently Asked Questions
Is web scraping legal?
Scraping publicly visible data is generally legal in most countries, but bypassing logins, ignoring robots.txt, or violating a site's Terms of Service can expose you to legal risk. Always check the target site's terms.
Why does BeautifulSoup return empty results?
Almost always because the page renders content with JavaScript after page load. BeautifulSoup only sees the initial HTML — use Playwright or Selenium for JS-rendered pages.
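A quick way to check is to count how many elements your selector matches in the raw HTML. The sketch below compares two made-up pages: one with the content in the markup and one that is only an empty shell filled in later by a script. The helper name `count_matches` is hypothetical.

```python
from bs4 import BeautifulSoup


def count_matches(html: str, selector: str) -> int:
    """Return how many elements in the raw HTML match the CSS selector."""
    return len(BeautifulSoup(html, "html.parser").select(selector))


# Content is present in the initial HTML.
static_html = '<div id="app"><h2 class="article-title">Hello</h2></div>'

# Content would be added later by JavaScript, so the initial HTML is an empty shell.
js_shell_html = '<div id="app"></div><script src="/bundle.js"></script>'

static_count = count_matches(static_html, "h2.article-title")
shell_count = count_matches(js_shell_html, "h2.article-title")

print(static_count)  # 1
print(shell_count)   # 0
```

If the count is zero for `response.text` but the element is visible in your browser, the content is probably rendered by JavaScript. A selector typo is worth ruling out first.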
How do I avoid getting IP-banned while scraping?
Add delays between requests (time.sleep), rotate User-Agents, use a proxy pool for large jobs, and respect Crawl-delay in robots.txt.
BeautifulSoup vs Scrapy: which should a beginner use?
BeautifulSoup for one-off scripts and learning. Scrapy when you need to crawl thousands of pages with retries, pipelines, and concurrency built in.