A modern web scraping library with Zod schema validation for type-safe data extraction.
- Intelligent web scraping with schema validation
- Type-safe data extraction using Zod schemas
- Built with Bun for optimal performance
- Robust error handling and validation
- Clean and intuitive API
- AI-powered content extraction
- Automatic result saving with UUID naming
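As a rough illustration of the UUID naming mentioned above, the sketch below builds a unique output path for one result. The helper name `resultPath` and the `results` directory are hypothetical; the library's actual naming and storage logic may differ.

```typescript
import { randomUUID } from "node:crypto";
import { join } from "node:path";

// Hypothetical helper (not part of the library): returns a path of the form
// <outputDir>/<uuid>.json, so each saved result gets its own file.
function resultPath(outputDir: string): string {
  return join(outputDir, `${randomUUID()}.json`);
}

const examplePath = resultPath("results");
console.log(examplePath);
```

Random version 4 UUIDs make accidental filename collisions between runs extremely unlikely, so results never need an explicit counter or lock.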
bun install
bun run start
- Edit the src/config.ts file, or
- Edit the .env file (recommended).
Main scraping class that handles AI-powered content extraction.
- init(): Initialize the provider
- generateSchema(zodSchema): Convert a Zod schema to the extraction format
- sendMessage(page, schema): Extract data from a page using the schema
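The three methods suggest a call order of initialize, convert the schema, then extract. The sketch below illustrates that order against a hypothetical interface. Only the method names come from this README; the interface name `ScraperLike`, the helper `extractAll`, and all parameter and return types are assumptions, not the library's real signatures.

```typescript
// Hypothetical shape inferred from the documented method names.
// Types are placeholders: S = Zod schema, P = page handle, R = extracted result.
interface ScraperLike<S, P, R> {
  init(): Promise<void>;
  generateSchema(zodSchema: S): unknown;
  sendMessage(page: P, schema: unknown): Promise<R>;
}

// Assumed flow: initialize once, convert the schema once, then extract per page.
async function extractAll<S, P, R>(
  scraper: ScraperLike<S, P, R>,
  zodSchema: S,
  pages: P[]
): Promise<R[]> {
  await scraper.init();
  // await works whether generateSchema is synchronous or returns a promise.
  const extractionSchema = await scraper.generateSchema(zodSchema);
  const results: R[] = [];
  for (const page of pages) {
    results.push(await scraper.sendMessage(page, extractionSchema));
  }
  return results;
}
```

Converting the schema once and reusing it across pages avoids repeating the same work for every extraction.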
Use Zod schemas to define the structure of data you want to extract:
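import { z } from "zod";

// Shape of the data to extract: a list of products with basic fields.
const schema = z.object({
  products: z.array(z.object({
    name: z.string(),
    price: z.number(),
    rating: z.number().min(0).max(5) // rating must fall between 0 and 5
  })).describe("List of products on the page") // attaches a description to the array
});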
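To show what type safety means in practice, the sketch below validates candidate data against the same schema with Zod's `safeParse`. The `validate` helper and the two sample payloads are hypothetical stand-ins for extracted data; how the library itself applies the schema is not documented here.

```typescript
import { z } from "zod";

const schema = z.object({
  products: z
    .array(
      z.object({
        name: z.string(),
        price: z.number(),
        rating: z.number().min(0).max(5),
      })
    )
    .describe("List of products on the page"),
});

// Static type derived from the schema, so validated data is fully typed.
type Extraction = z.infer<typeof schema>;

// Hypothetical payloads standing in for data extracted from a page.
const good: unknown = { products: [{ name: "Kettle", price: 24.99, rating: 4.5 }] };
const bad: unknown = { products: [{ name: "Kettle", price: "24.99", rating: 7 }] };

// Returns typed data on success, or null after logging each validation issue.
function validate(data: unknown): Extraction | null {
  const result = schema.safeParse(data);
  if (!result.success) {
    for (const issue of result.error.issues) {
      console.error(`${issue.path.join(".")}: ${issue.message}`);
    }
    return null;
  }
  return result.data;
}

const validated = validate(good);
const rejected = validate(bad);
```

`safeParse` returns a result object instead of throwing, which keeps the failure path explicit: the second payload is rejected because `price` is a string and `rating` exceeds 5.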
- Automatic performance tracking
- Configurable timeouts and wait conditions
- Efficient memory usage with proper cleanup
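One common way to combine a configurable timeout with proper cleanup is to race the work against a timer and always clear that timer afterwards. The sketch below shows the pattern; `withTimeout` is a hypothetical helper, not a function exported by this library.

```typescript
// Hypothetical helper: rejects if `work` does not settle within `ms` milliseconds.
async function withTimeout<T>(work: Promise<T>, ms: number, label = "operation"): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`${label} timed out after ${ms} ms`)), ms);
  });
  try {
    // Whichever promise settles first decides the outcome.
    return await Promise.race([work, timeout]);
  } finally {
    // Clear the timer on both success and failure so it cannot keep the process alive.
    clearTimeout(timer);
  }
}
```

Note that this only stops waiting for the result; it does not cancel the underlying work, so a browser page or request would still need to be closed separately.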
The library includes comprehensive error handling for:
- Network timeouts
- Invalid schemas
- Parsing failures
- Browser automation errors
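Callers often want to branch on these categories. The sketch below groups caught errors by their `name` property. The mapping is an assumption: it presumes timeouts surface as `TimeoutError`, schema validation failures as `ZodError`, JSON parsing failures as `SyntaxError`, and browser protocol failures as `ProtocolError`. The library may wrap or rename errors, so treat `classifyFailure` as a hypothetical helper.

```typescript
type FailureKind = "timeout" | "schema" | "parsing" | "browser" | "unknown";

// Hypothetical helper: maps a caught value to one of the categories above.
function classifyFailure(err: unknown): FailureKind {
  if (!(err instanceof Error)) return "unknown";
  switch (err.name) {
    case "TimeoutError": // assumed name for navigation or wait timeouts
      return "timeout";
    case "ZodError": // data did not match the Zod schema
      return "schema";
    case "SyntaxError": // e.g. JSON.parse on malformed output
      return "parsing";
    case "ProtocolError": // assumed name for browser protocol failures
      return "browser";
    default:
      return "unknown";
  }
}

// Example: a malformed JSON string produces a SyntaxError.
let kind: FailureKind = "unknown";
try {
  JSON.parse("{ not valid json");
} catch (err) {
  kind = classifyFailure(err);
}
console.log(kind);
```

Matching on `name` avoids importing error classes from every dependency, at the cost of relying on names that a bundler or a future release could change.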
- Bun v1.2.17+
- Node.js 18+ (if using Node.js runtime)
- Puppeteer for browser automation
- Fork the repository
- Create your feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
This project is inspired by llm-scraper by mishushakov. Portions of the scraping logic are derived from the original implementation.
MIT License - see LICENSE.md for details.
Built with ❤️ by Erenay in Turkey