zegron/webscraper


🕸️ Python Web Scraper with CSV/Excel Export

A flexible Python web scraper that lets you:

  • Input any URL at runtime
  • Preview available HTML tags and select which elements to scrape
  • Export scraped data to CSV or Excel
  • Choose tags dynamically (no hardcoded tag list)
  • Confirm before final scraping
  • Recover from invalid input without the script crashing
  • See the saved file location at the end
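The tag preview step could be approached roughly as follows. This is a minimal sketch, not the code in scraper.py: the function name `list_tags` is hypothetical, and it works on an HTML string that, in a real run, would come from something like `requests.get(url, timeout=10).text`.

```python
from collections import Counter

from bs4 import BeautifulSoup


def list_tags(html: str) -> Counter:
    """Count how often each tag name occurs in an HTML document.

    Hypothetical helper for illustration; the real script may differ.
    """
    soup = BeautifulSoup(html, "html.parser")
    # find_all(True) matches every tag in the parsed tree.
    return Counter(tag.name for tag in soup.find_all(True))


if __name__ == "__main__":
    sample = "<html><body><h1>Title</h1><p>One</p><p>Two</p></body></html>"
    # Show the most frequent tags first so the user can pick from them.
    for name, count in list_tags(sample).most_common():
        print(f"{name}: {count}")
```

Counting occurrences, rather than only listing names, gives the user a hint about which tags actually carry content on the page.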

🚨 Important Note

This scraper does not currently support JavaScript-rendered pages.
Support for JS-rendered pages (via Selenium or Playwright) is planned for a future release.


🚀 Features

✔ Dynamic Tag Detection – Pre-scrapes the page and lists all available HTML tags
✔ User-Controlled Scraping – Select which tags you want to scrape
✔ Multiple Export Options – Save as CSV or Excel
✔ Error Handling – Handles invalid choices without crashing
✔ Clear Exit Options – Press q anytime to quit
✔ File Path Confirmation – Confirms where your files were saved
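One way the error handling and the q-to-quit behavior might fit together is a prompt loop that keeps asking until the selection is valid. The sketch below is an assumption about the design, not the script's actual code; `parse_tag_choice` and `ask_for_tags` are hypothetical names, and the `read` parameter exists only so the loop can be exercised without a terminal.

```python
from typing import Callable, Iterable, List, Optional


def parse_tag_choice(raw: str, available: Iterable[str]) -> Optional[List[str]]:
    """Return the chosen tags, or None if the input is empty or names an unknown tag."""
    known = set(available)
    chosen = [part.strip().lower() for part in raw.split(",") if part.strip()]
    if not chosen or any(tag not in known for tag in chosen):
        return None
    return chosen


def ask_for_tags(
    available: Iterable[str], read: Callable[[str], str] = input
) -> Optional[List[str]]:
    """Prompt until the selection is valid; return None if the user quits with 'q'."""
    options = sorted(set(available))
    while True:
        raw = read(f"Tags to scrape {options} (comma-separated, q to quit): ")
        if raw.strip().lower() == "q":
            return None
        chosen = parse_tag_choice(raw, options)
        if chosen is not None:
            return chosen
        # Invalid input is reported and the prompt repeats instead of crashing.
        print("Invalid selection, please try again.")
```

Keeping the parsing in its own function separates validation from the prompt loop, which makes the rules easy to check in isolation. Note that in this sketch a lone `q` always quits, so the HTML `<q>` element could only be selected together with other tags.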


🛠️ Requirements

  • Python 3.8+
  • The following Python libraries (see requirements.txt):
    • requests
    • beautifulsoup4
    • pandas

📥 Installation

Clone the repository:

git clone https://github.com/YOUR_USERNAME/python-web-scraper.git
cd python-web-scraper

Install dependencies:

pip install -r requirements.txt

▶️ Usage

Run the scraper:

python scraper.py

The script then walks you through these steps:

  1. Enter the URL you want to scrape
  2. The script previews all HTML tags found
  3. Select tags to scrape (e.g., p, h1, h2)
  4. Choose CSV or Excel output
  5. Confirm and scrape
  6. Files are saved in the current folder, and the path is displayed at the end
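The extraction and export steps above can be sketched as follows. This is an illustration under stated assumptions rather than the contents of scraper.py: `extract_rows` and `save` are hypothetical names, the two-column layout (`tag`, `text`) is an assumed output shape, and the HTML is passed in as a string. Writing `.xlsx` files with pandas also requires the `openpyxl` package, which is not in the dependency list above.

```python
from pathlib import Path
from typing import List

import pandas as pd
from bs4 import BeautifulSoup


def extract_rows(html: str, tags: List[str]) -> pd.DataFrame:
    """Collect the text of every element whose tag name is in `tags`."""
    soup = BeautifulSoup(html, "html.parser")
    # find_all accepts a list of names and returns matches in document order.
    rows = [
        {"tag": el.name, "text": el.get_text(strip=True)}
        for el in soup.find_all(tags)
    ]
    return pd.DataFrame(rows, columns=["tag", "text"])


def save(df: pd.DataFrame, stem: str, fmt: str) -> Path:
    """Write the frame as CSV or Excel and return the absolute output path."""
    if fmt == "csv":
        path = Path(f"{stem}.csv")
        df.to_csv(path, index=False)
    elif fmt == "excel":
        path = Path(f"{stem}.xlsx")
        df.to_excel(path, index=False)  # assumption: openpyxl is installed
    else:
        raise ValueError(f"Unsupported format: {fmt!r}")
    return path.resolve()


if __name__ == "__main__":
    sample = "<h1>Title</h1><p>One</p><div>skipped</div><p>Two</p>"
    frame = extract_rows(sample, ["h1", "p"])
    print(f"Saved to {save(frame, 'scraped_data', 'csv')}")
```

Returning the resolved path from `save` is what makes the final "path is displayed" step straightforward.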

🖥️ Future Enhancements

  • ✅ Add support for JavaScript-rendered pages
  • ✅ Add a Streamlit Web Interface for easy use
  • ✅ Deploy on Streamlit Cloud so anyone can try it online
  • ✅ Support search by CSS selectors or attributes
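As a rough idea of what the planned selector support might look like, Beautiful Soup already exposes CSS selectors through `select()`. The sketch below is only one possible direction and is not part of the current script; `select_text` is a hypothetical name.

```python
from typing import List

from bs4 import BeautifulSoup


def select_text(html: str, selector: str) -> List[str]:
    """Return the stripped text of every element matching a CSS selector."""
    soup = BeautifulSoup(html, "html.parser")
    # select() understands class, descendant, and attribute selectors.
    return [el.get_text(strip=True) for el in soup.select(selector)]


if __name__ == "__main__":
    sample = '<div class="post"><a href="/a">A</a></div><a href="/b">B</a>'
    print(select_text(sample, "div.post a"))
    print(select_text(sample, "a[href]"))
```

A selector string would let users target elements by class or attribute instead of by tag name alone.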

🤝 Contributing

Pull requests are welcome! For major changes, please open an issue first to discuss what you'd like to change.


📜 License

MIT
