Developer Tools · 9 min read · By Mehadi Shawon

Web Scraping With Python & BeautifulSoup: Beginner Tutorial (2026)

Step-by-step tutorial on scraping a website with Python and BeautifulSoup — install, parse HTML, extract data, and avoid getting blocked.

Quick answer

To scrape a website with Python and BeautifulSoup: install requests and beautifulsoup4 with pip, fetch the page with requests.get(url), parse the HTML with BeautifulSoup(response.text, 'html.parser'), then extract elements with .find() (tag and attribute lookups) or .select() (CSS selectors).

This is the shortest path from 'I know Python basics' to 'I just scraped real data off a real website.' We'll use requests and BeautifulSoup, the two libraries most Python scraping tutorials rely on, and finish with the etiquette that helps keep you from being blocked.

1. Install the Libraries

Open a terminal and run:

  • pip install requests beautifulsoup4

2. Fetch the Page

requests.get() downloads the raw HTML. Always pass a User-Agent header, because many sites reject the default python-requests one.

  • import requests
  • headers = {'User-Agent': 'Mozilla/5.0'}
  • response = requests.get('https://example.com', headers=headers)
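The fetch step above does not check whether the request succeeded. A minimal sketch of a more defensive version follows; the helper name fetch_html and the 10-second timeout are assumptions chosen for illustration, not part of the original tutorial.

```python
import requests


def fetch_html(url, timeout=10):
    """Return the page's HTML as text, raising on network or HTTP errors.

    fetch_html and the default timeout are illustrative choices.
    """
    headers = {'User-Agent': 'Mozilla/5.0'}
    # Without a timeout, requests can wait indefinitely on a stalled server.
    response = requests.get(url, headers=headers, timeout=timeout)
    # Raises requests.HTTPError for 4xx/5xx responses instead of
    # silently handing an error page to the parser.
    response.raise_for_status()
    return response.text
```

Calling fetch_html('https://example.com') returns the HTML string that the parsing step expects in place of response.text.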

3. Parse the HTML

Hand the response text to BeautifulSoup with the built-in html.parser:

  • from bs4 import BeautifulSoup
  • soup = BeautifulSoup(response.text, 'html.parser')
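To see what the parsed object gives you without touching the network, here is a small sketch that parses an inline HTML string. The markup is made up for the example; a real page would come from response.text.

```python
from bs4 import BeautifulSoup

# Made-up HTML standing in for a downloaded page.
html = """
<html><head><title>Demo page</title></head>
<body><h1>Hello</h1><p class="intro">First paragraph.</p></body></html>
"""

soup = BeautifulSoup(html, 'html.parser')

page_title = soup.title.string                      # text inside <title>
heading = soup.find('h1').get_text(strip=True)      # first <h1> on the page
intro = soup.find('p', class_='intro').get_text(strip=True)

print(page_title, heading, intro, sep=' | ')
```

Note that .find() returns None when nothing matches, so on real pages it is worth checking the result before calling .get_text() on it.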

4. Extract What You Need

Use .select() with CSS selectors — the same syntax you'd use in browser DevTools:

  • titles = [h2.text for h2 in soup.select('h2.article-title')]
  • links = [a['href'] for a in soup.select('a.read-more')]
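The two list comprehensions above assume every title has a matching link and every link has an href. A hedged sketch of a more tolerant approach is to loop over each article container and pull both fields from it. The div.post container and the sample markup are assumptions for illustration; the h2.article-title and a.read-more selectors come from the example above.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Made-up markup; 'div.post' is a hypothetical container class.
html = """
<div class="post"><h2 class="article-title"> First post </h2>
  <a class="read-more" href="/posts/1">Read more</a></div>
<div class="post"><h2 class="article-title">Second post</h2>
  <a class="read-more">Read more</a></div>
"""
base_url = 'https://example.com'
soup = BeautifulSoup(html, 'html.parser')

rows = []
for post in soup.select('div.post'):
    title_tag = post.select_one('h2.article-title')
    link_tag = post.select_one('a.read-more')
    if title_tag is None:
        continue  # skip containers that have no title
    title = title_tag.get_text(strip=True)
    # .get() returns None instead of raising KeyError when href is missing.
    href = link_tag.get('href') if link_tag else None
    # Turn relative links such as /posts/1 into absolute URLs.
    link = urljoin(base_url, href) if href else ''
    rows.append((title, link))

print(rows)
```

Extracting per container keeps each title paired with its own link even when one of them is missing.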

5. Save the Data

CSV is the simplest output format for most scraping projects:

  • import csv
  • with open('out.csv', 'w', newline='', encoding='utf-8') as f:
  •     w = csv.writer(f)
  •     w.writerow(['title', 'link'])  # header row
  •     for t, l in zip(titles, links):
  •         w.writerow([t, l])
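The same save step can be wrapped in a small reusable function. This is a sketch under the assumption that your data is a list of (title, link) pairs; the name save_rows is hypothetical.

```python
import csv


def save_rows(path, rows):
    """Write (title, link) pairs to a CSV file with a header row.

    save_rows is an illustrative helper, not a library function.
    """
    # newline='' lets the csv module control line endings;
    # utf-8 keeps non-ASCII titles intact.
    with open(path, 'w', newline='', encoding='utf-8') as f:
        writer = csv.writer(f)
        writer.writerow(['title', 'link'])
        writer.writerows(rows)
```

For example, save_rows('out.csv', [('Hello, world', 'https://example.com/1')]) writes the title in quotes, because the csv module escapes the embedded comma for you. That quoting is the main reason to use csv.writer rather than joining strings by hand.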

6. Don't Get Blocked

  • Add time.sleep(1) between requests.
  • Check the site's /robots.txt and respect Disallow paths.
  • Rotate User-Agents if you make hundreds of requests.
  • If the page needs JavaScript, switch to Playwright instead.
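The first two rules can be sketched with the standard library alone. The example below parses an inline robots.txt so it runs offline; the rules, the 'my-scraper' user agent, and the crawl helper are all assumptions for illustration. Against a real site you would call parser.set_url(...) with the site's /robots.txt address and then parser.read() instead of parser.parse(...).

```python
import time
from urllib.robotparser import RobotFileParser

USER_AGENT = 'my-scraper'  # hypothetical name for your scraper

# Made-up robots.txt content standing in for a real site's file.
robots_txt = """\
User-agent: *
Disallow: /private/
Crawl-delay: 2
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# Use the site's Crawl-delay when it declares one, else fall back to 1 second.
delay = parser.crawl_delay(USER_AGENT) or 1


def crawl(urls, fetch, delay_seconds):
    """Call fetch(url) for each allowed URL, pausing between requests."""
    results = {}
    for url in urls:
        if not parser.can_fetch(USER_AGENT, url):
            continue  # respect Disallow paths
        results[url] = fetch(url)
        time.sleep(delay_seconds)
    return results
```

Here fetch is any function that takes a URL and returns its content, so the politeness logic stays separate from how pages are downloaded.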


Frequently Asked Questions

Is web scraping legal?

Scraping publicly visible data is generally legal in most countries, but bypassing logins, ignoring robots.txt, or violating a site's Terms of Service can create legal risk. Always check the target site's terms.

Why does BeautifulSoup return empty results?

Almost always because the page renders content with JavaScript after page load. BeautifulSoup only sees the initial HTML — use Playwright or Selenium for JS-rendered pages.
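One quick way to check this is to count selector matches in the raw HTML you actually downloaded. The sketch below compares a made-up static page with a made-up JavaScript shell; the count_matches helper and both HTML strings are assumptions for illustration.

```python
from bs4 import BeautifulSoup


def count_matches(html, selector):
    """Return how many elements in the raw HTML match a CSS selector."""
    soup = BeautifulSoup(html, 'html.parser')
    return len(soup.select(selector))


# Content is present in the initial HTML.
static_html = '<div id="app"><h2 class="article-title">Hello</h2></div>'
# Content would be injected later by a script, so the HTML is an empty shell.
js_shell_html = '<div id="app"></div><script src="/bundle.js"></script>'

print(count_matches(static_html, 'h2.article-title'))
print(count_matches(js_shell_html, 'h2.article-title'))
```

If the count is zero for the downloaded HTML but the element is visible in your browser, the content is most likely rendered by JavaScript.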

How do I avoid getting IP-banned while scraping?

Add delays between requests (time.sleep), rotate User-Agents, use a proxy pool for large jobs, and respect Crawl-delay in robots.txt.

BeautifulSoup vs Scrapy: which should a beginner use?

BeautifulSoup for one-off scripts and learning. Scrapy when you need to crawl thousands of pages with retries, pipelines, and concurrency built in.
