Alibaba Data Scraper 🏚️

This project is a web scraper built with Selenium to extract product information from Alibaba based on user-defined search queries. The scraper dynamically applies relevant filters based on the search input to refine the results.

Features ✨

Scrapes product name, price, company, MOQ (Minimum Order Quantity), rating, image link, and product link.
Dynamically applies filters based on keywords in the user's search input.
Supports pagination to scrape multiple pages of results.
Excludes products with missing or invalid data.

JavaScript Version

Requirements ⚙️

To run this project, you need to have the following installed:

Node.js
npm (Node Package Manager)
Microsoft Edge 🌐 (for Selenium WebDriver)

Installation 🔧

Clone the repository (if applicable):

git clone https://github.com/20101301-Alina-Hasan/Alibaba-Selenium-Scraper.git
cd Alibaba-Selenium-Scraper

Install required Node.js packages:
```
npm install
```

Usage 🚀

Run the script:
```
node alibaba-selenium-scraper.js
```
Input your search criteria:
- Enter the product you want to search for (e.g., "men's black t-shirt").
- Enter the maximum price you want to filter by (e.g., "5").
View the output:
- The scraped data will be saved or displayed as specified in your Node.js script 📊.

Restrictions 🔍

Some added restrictions include:

If the product name is nil, skip the product. 🚫
If price is nil or given in a range, skip the product. 🚫
If product link is nil, skip the product. 🚫
If rating is nil, skip the product. 🚫
If image link is nil, skip the product. 🚫

Python Version

Requirements ⚙️

To run this project, you need to have the following installed:

Python 3.x 🐍
Microsoft Edge 🌐 (for Selenium WebDriver)

Installation 🔧

Clone the repository (if applicable):

git clone https://github.com/20101301-Alina-Hasan/Alibaba-Selenium-Scraper.git
cd Alibaba-Selenium-Scraper

Install required Python packages:
```
pip install selenium
```

Usage 🚀

Run the script:
```
python alibaba-selenium-scraper.py
```
Input your search criteria:
- Enter the product you want to search for (e.g., "men's black t-shirt").
- Enter the maximum price you want to filter by (e.g., "5").
View the output:
- The scraped data will be saved in alibaba_results.json 📊.

Restrictions 🔍

Some added restrictions include:

If the product name is nil, skip the product. 🚫
If price is nil, skip the product. 🚫
If product link is nil, skip the product. 🚫
If rating is nil, skip the product. 🚫
If image link is nil, skip the product. 🚫

Notes 📝

Ensure that your selectors are up-to-date with Alibaba's current HTML structure.
This scraper is applicable as of February 2025.

Contributing 🙌

Contributions are welcome! If you have suggestions or want to collaborate, feel free to open an issue or pull request.

Contact 📧

For questions or inquiries, reach out via alina.hasan@g.bracu.ac.bd

Feel free to ⭐ the repo if you find it useful.

Name		Name	Last commit message	Last commit date
Latest commit History 15 Commits
README.md		README.md
alibaba-selenium-scraper.js		alibaba-selenium-scraper.js
alibaba-selenium-scraper.py		alibaba-selenium-scraper.py
json_to_csv.js		json_to_csv.js

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Alibaba Data Scraper 🏚️

Features ✨

JavaScript Version

Requirements ⚙️

Installation 🔧

Usage 🚀

Restrictions 🔍

Python Version

Requirements ⚙️

Installation 🔧

Usage 🚀

Restrictions 🔍

Notes 📝

Contributing 🙌

Contact 📧

About

Uh oh!

Languages

20101301-Alina-Hasan/Alibaba-Selenium-Scraper

Folders and files

Latest commit

History

Repository files navigation

Alibaba Data Scraper 🏚️

Features ✨

JavaScript Version

Requirements ⚙️

Installation 🔧

Usage 🚀

Restrictions 🔍

Python Version

Requirements ⚙️

Installation 🔧

Usage 🚀

Restrictions 🔍

Notes 📝

Contributing 🙌

Contact 📧

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Languages