An end-to-end object detection system built on a pretrained Faster R-CNN model from PyTorch's torchvision and served through a Flask web application and REST API. The project detects objects in images and videos using a modular, extensible pipeline.
- Object detection using Faster R-CNN ResNet50 FPN V2 with COCO weights
- Detect objects in images with bounding box visualization
- Detect objects frame-by-frame in videos with FPS display
- Web interface for easy image input and result visualization
- REST API backend for programmatic detection requests
- Modular Python scripts for model loading, inference, and visualization
- Organized project structure for ease of use and extension
Object-Detection-With-PyTorch-and-Custom-the-Model-By-Flask/
├── assets/                  # Static assets for UI or documentation
├── data/                    # Dataset files (annotations, images)
├── docs/                    # Documentation resources
├── input/                   # Input images/videos for testing
├── outputs/                 # Output detections and results
├── python/                  # Core Python modules
│   ├── detect_utils.py      # predict() + draw_boxes()
│   ├── model.py             # Load pretrained Faster R-CNN model
│   └── utils.py             # COCO class names
├── static/                  # Static files served by Flask
│   ├── css/                 # CSS styles
│   │   └── main.css         # Stylesheet for the web UI
│   └── uploads/             # Uploaded images directory
├── templates/               # HTML templates for Flask web pages
│   ├── base.html            # HTML base layout
│   └── homepage.html        # Home page with upload and detection UI
├── translations/            # Language translations (if applicable)
├── api_app.py               # Flask app and routes
├── detect_api.py            # Core detection for Flask app
├── detect_image.py          # CLI: detect objects in image
├── detect_video.py          # CLI: detect objects in video
├── LICENSE                  # License file (MIT)
├── README.md                # This README file
└── requirements.txt         # Python dependencies
- Clone the repository:
git clone https://github.com/TareqAlKushari/Object-Detection-With-PyTorch-and-Custom-the-Model-By-Flask.git
cd Object-Detection-With-PyTorch-and-Custom-the-Model-By-Flask
- (Optional) Create and activate a virtual environment:
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
- Install required packages:
pip install -r requirements.txt
Detect objects on a single image and save the output with bounding boxes.
python detect_image.py path/to/image.jpg --threshold 0.5
The output image is saved in the outputs/ directory.
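The command above takes an image path and a --threshold flag. As a minimal sketch of how such a command line could be parsed with argparse, consider the following. The `build_parser` helper, the argument name `input`, and the default of 0.5 are assumptions for illustration, not the actual contents of detect_image.py.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring `python detect_image.py <path> --threshold <t>`."""
    parser = argparse.ArgumentParser(description="Detect objects in an image.")
    parser.add_argument("input", help="path to the input image")
    parser.add_argument(
        "--threshold",
        type=float,
        default=0.5,  # assumed default; the real script may differ
        help="minimum confidence score for a detection to be kept",
    )
    return parser


# Parse an explicit argument list so the sketch runs without a real command line.
args = build_parser().parse_args(["path/to/image.jpg", "--threshold", "0.5"])
print(f"input={args.input} threshold={args.threshold}")
```

Raising the threshold keeps only the more confident detections; lowering it shows more boxes at the cost of more false positives.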
Detect objects on a video file (or webcam stream) frame-by-frame with FPS display.
python detect_video.py path/to/video.mp4 --threshold 0.5
The processed video is saved in the outputs/ directory.
Press q to quit the video window early.
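The FPS figure shown on each frame can be derived from how long each frame takes to process. Below is a minimal sketch, assuming the per-frame FPS is the reciprocal of the processing time and the average is frames divided by total time. The `FpsMeter` class is a hypothetical helper, not code taken from detect_video.py.

```python
import time


class FpsMeter:
    """Hypothetical helper that tracks per-frame and average FPS."""

    def __init__(self) -> None:
        self.total_time = 0.0  # seconds spent processing so far
        self.frame_count = 0   # frames processed so far

    def update(self, start: float, end: float) -> float:
        """Record one frame's processing window and return its instantaneous FPS."""
        elapsed = end - start
        self.total_time += elapsed
        self.frame_count += 1
        return 1.0 / elapsed if elapsed > 0 else 0.0

    @property
    def average(self) -> float:
        """Average FPS over all recorded frames."""
        return self.frame_count / self.total_time if self.total_time > 0 else 0.0


meter = FpsMeter()
for _ in range(3):
    start = time.perf_counter()
    time.sleep(0.01)  # stand-in for running detection on one frame
    end = time.perf_counter()
    meter.update(start, end)
print(f"average FPS: {meter.average:.1f}")
```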
Start the interactive web app to upload images and visualize detection results in the browser.
python api_app.py
Open http://localhost:9000 in your browser.
- Input the path to an image accessible to the server for detection.
- View the annotated image rendered in the web page.
Use the functions in detect_api.py to integrate detection into other applications or to create REST endpoints.
Example curl request (assuming you add a /predict endpoint that accepts an image upload; adjust the host and port to match your server):
curl -X POST -F image=@path/to/image.jpg http://localhost:5000/predict
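The curl request above presumes an endpoint that is not described here, so the following is only a minimal sketch of what one could look like in Flask. The `/predict` route, the `image` form field, the JSON response shape, and the `run_detection` placeholder are all assumptions; in the real project the placeholder would be replaced by a call into detect_api.py.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


def run_detection(image_bytes: bytes) -> list:
    """Placeholder for the real detector (hypothetical).

    Returns an empty list; a real implementation would decode the image,
    run the model, and return one entry per detected object.
    """
    return []


@app.route("/predict", methods=["POST"])
def predict_route():
    # The curl example sends the file under the form field "image".
    uploaded = request.files.get("image")
    if uploaded is None:
        return jsonify({"error": "missing form field 'image'"}), 400
    detections = run_detection(uploaded.read())
    return jsonify({"detections": detections})


# To serve it locally on the port used by the curl example:
#     app.run(port=5000)
```

Returning JSON rather than an annotated image keeps the endpoint easy to consume from other programs; the caller can draw boxes itself if needed.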
- python/model.py: Loads the pretrained model from torchvision.models.detection
- detect_utils.py:
  - predict(image, model, device, threshold): preprocesses the image, runs inference, and filters predictions
  - draw_boxes(): draws bounding boxes and class labels using OpenCV
- utils.py: Defines the 91 COCO classes used for label mapping
- HTML/CSS in templates/ and static/css/
- Input form allows the user to submit an image path
- api_app.py:
  - Renders homepage.html
  - Calls detect_api.py to run detection on the input
  - Saves and displays the output via an HTML <img> tag
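To make the model loading and prediction filtering steps listed above more concrete, here is a minimal sketch. It assumes torchvision 0.13 or newer for the weights enum, and it assumes the detection result follows torchvision's output layout (parallel boxes, labels, and scores) already converted to plain Python lists. The `load_model` and `filter_predictions` functions and the shortened `CLASS_NAMES` list are hypothetical stand-ins, not the repository's code.

```python
from typing import Dict, List, Tuple

# Shortened stand-in for the label list in utils.py (index = label id).
CLASS_NAMES = ["__background__", "person", "bicycle", "car"]

Detection = Tuple[List[float], str, float]  # (box [x1, y1, x2, y2], name, score)


def load_model(device: str = "cpu"):
    """Load Faster R-CNN ResNet50 FPN V2 with COCO weights in eval mode."""
    # Imported lazily so the filtering code below runs without PyTorch installed.
    import torch
    from torchvision.models.detection import (
        FasterRCNN_ResNet50_FPN_V2_Weights,
        fasterrcnn_resnet50_fpn_v2,
    )

    weights = FasterRCNN_ResNet50_FPN_V2_Weights.COCO_V1
    model = fasterrcnn_resnet50_fpn_v2(weights=weights)
    model.to(torch.device(device))
    model.eval()  # inference mode: the model returns detections, not losses
    return model


def filter_predictions(output: Dict[str, list], threshold: float) -> List[Detection]:
    """Keep detections scoring at least `threshold` and map label ids to names."""
    kept = []
    for box, label, score in zip(output["boxes"], output["labels"], output["scores"]):
        if score >= threshold:
            kept.append((box, CLASS_NAMES[label], score))
    return kept


# Example result in the assumed layout; the values are made up for illustration.
example_output = {
    "boxes": [[10.0, 20.0, 110.0, 220.0], [30.0, 40.0, 90.0, 80.0]],
    "labels": [1, 3],
    "scores": [0.92, 0.31],
}
for box, name, score in filter_predictions(example_output, threshold=0.5):
    print(f"{name}: {score:.2f} at {box}")
```

With a real model, the tensors in the output dictionary would first be moved to the CPU and converted (for example with `.tolist()`) before being passed to a function like this.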
- person, bicycle, car, motorcycle, airplane, bus
- dog, cat, horse, sheep, cow
- bottle, chair, laptop, keyboard, clock
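- ... and more (91 label indices in total; torchvision's COCO label list includes a background entry and unused N/A slots, so fewer of them are real object categories)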
[Figure: example input image and its detected output, side by side]
- Python 3.7+
- PyTorch
- torchvision
- OpenCV
- Flask
- Pillow
- numpy
(See requirements.txt for the full list.)
This project is licensed under the MIT License.
Tareq Al Kushari (GitHub: https://github.com/TareqAlKushari)
If you encounter any issues or have suggestions, please open an issue or submit a pull request. Happy detecting!