Skip to content

Python application covering database modeling with SQLAlchemy, migrations with Alembic, data seeding using Faker, advanced SQL queries using ORM, and a CLI interface for CRUD operations. PostgreSQL is used as the database engine.

License

Notifications You must be signed in to change notification settings

oleksandr-romashko/goit-pythonweb-hw-06

Repository files navigation

Fullstack Web Development with Python

project thumbnail

Database Management and Migration with SQLAlchemy & PostgreSQL

This project demonstrates the use of SQLAlchemy ORM for modeling relational data in PostgreSQL, along with Alembic for schema migrations, Faker for generating test data.

Project showcase image

The app simulates an academic environment with students, teachers, groups, subjects, and grades.

Main features:

  • SQLAlchemy-based data modeling for Students, Teachers, Groups, Subjects, and Grades.
  • Alembic integration for database versioning and migrations.
  • Faker-powered data seeding with realistic student records and grades.
  • SQL queries implemented via ORM: top students, average grades, group statistics, and more.
  • (Not implemented yet) A command-line interface for managing database records with create, list, update, and remove actions.

Table of Contents

Task Requirements

This project implements a database schema using PostgreSQL and SQLAlchemy, along with Alembic for migrations, Faker for data seeding.

Technical Description

Implement database that includes following tables:

  • Students table;
  • Groups table;
  • Teachers table;
  • Subjects table with a reference to the teacher who teaches the subject;
  • Table, where each student has grades for subjects, along with the date the grade was received.

Use postgres database. Run the PostgreSQL database using Docker:

docker run --name some-postgres -p 5432:5432 -e POSTGRES_PASSWORD=mysecretpassword -d postgres

Replace container name some-postgres and password mysecretpassword with your custom values to connect to the database.

Technical Steps

Step 1 - SQLAlchemy Models

Implement SQLAlchemy models for the following tables:

  • Students table
  • Groups table
  • Teachers table
  • Subjects table(with a foreign key to the teacher who teaches the subject)
  • Table with Grades (with a foreign key to student and subject, and a timestamp for when the grade was assigned)
Step 2 - Alembic Migrations

Use Alembic to create database migrations and apply them to your PostgreSQL database.

Step 3 - Data Seeding with Faker

Create a script seed.py that populates the database with realistic random data:

  • ~30-50 students
  • 3 groups
  • 5-8 subjects
  • 3-5 teachers
  • Up to 20 grades per student across all subjects

Use SQLAlchemy sessions and the Faker library for generating data.

Step 4 - ORM Queries

Implement the following 10 SQL queries using SQLAlchemy ORM:

  1. Retrieve the top 5 students with the highest average grade across all subjects.
  2. Find the student with the highest average grade in a specific subject.
  3. Find the average grade per group for a specific subject.
  4. Find the overall average grade across all grades for cohort students (all grades).
  5. Find which subjects are taught by a specific teacher.
  6. Retrieve a list of students in a specific group.
  7. Retrieve the grades of students in a specific group for a specific subject.
  8. Find the average grade a specific teacher gives across their subjects.
  9. List the courses taken by a specific student.
  10. List the courses a specific teacher teaches to a specific student.

Use SQLAlchemy sessions to perform all queries. Organize them into a file my_select.py, each as a function from select_1() to select_10().

Optional Tasks (Not Implemented)

These tasks are optional and not required to pass the assignment.

Part 1 - More Complex Queries
  1. Find the average grade a specific teacher gives to a specific student.
  2. Retrieve grades from the last lesson for students in a specific group and subject.
Part 2 - CLI for CRUD Operations

Instead of using seed.py, create a full-featured CLI application using the argparse module to perform CRUD operations on all models.

Use the following command structure:

  • --action (or -a) for specifying the action (create, list, update, remove)
  • --model (or -m) for specifying the target model (e.g. Teacher, Group, etc.)

Commands structure:

  • Create a teacher:
    -action create -m Teacher --name 'Boris Jonson'
  • List all teachers:
    -action list -m Teacher
  • Update a teacher's data with id=3:
    -action update -m Teacher --id 3 --name 'Andry Bezos'

Implement similar commands for all models.

Examples of commands in terminal:

  • Create a teacher
    py main.py -a create -m Teacher -n 'Boris Jonson'
  • Create a group
    py main.py -a create -m Group -n 'AD-101'

Task Solution

Solution Description

As a solution to the technical requirements, I created an app that simulates an academic environment with students, teachers, groups, subjects, and grades.

The application covers the following steps:

  1. Database Modeling (ORM) I defined models using SQLAlchemy to represent all required entities and their relationships: Class Diagram

    • Base - Base class for all SQLAlchemy models with shared metadata.
    • BaseModel - Abstract base class for all models in the system to have common behavior (inherited from Base class and UUIDMixin, TimestampMixin, SoftDeleteMixin).
      • UUIDMixin - Adds a UUID primary key id field..
      • TimestampMixin - Adds created_at and updated_at timestamp fields.
      • SoftDeleteMixin - Adds soft-delete fields: is_deleted and deleted_at.
    • Student - Represents a student with a reference to their assigned group.
    • Group - Represents a students group.
    • Teacher - Represents a teacher with a reference to their assigned subject.
    • Subject - Represents a subject to study.
    • Grade - Represents a grade for a student, tied to a group, subject, and specific task. Allows multiple grades per subject if tied to different tasks.
  2. Quick Postgres database setup using Docker I set up a simple database in a container using Postgres image in Docker to test my app. db-in-docker-container

  3. Migrations with Alembic I set up Alembic to manage database schema changes and apply them to a PostgreSQL instance.

    Entity Relationship Diagram (ERD) to demonstrate entities and their relations: ER Database Diagram

    Alembic history and current position: alembic-history-and-current-terminal-output alembic-history-diagram

  4. Data Seeding I created a seed.py script to populate the database with realistic, randomly generated data using the Faker library. It may use specific --seed <number> flag for randomness predictability for testing and --dry-run to see generated results without actual write to database.

  5. Advanced Queries I implemented 10 SQL queries using SQLAlchemy ORM to extract meaningful insights, such as top-performing students, course averages, and teacher-specific statistics, etc.

    • Select 1: Find 5 students with highest average among all subjects.
    • Select 2: Find the student(s) with the highest average grade in a given subject.
    • Select 3: Find the average grade for each group in a specific subject.
    • Select 4: Find the overall average grade across all grades for cohort students (all grades)
    • Select 5: Find the courses taught by a given teacher.
    • Select 6: Find all students in a given group.
    • Select 7: Find grades of students in a given group for a specific subject.
    • Select 8: Find the overall average grade a given teacher assigns.
    • Select 9: Find courses attended by a given student.
    • Select 10: Find courses a given teacher teaches to a specific student.
  6. CLI Interface (Not implemented yet) Task was to developed a command-line tool using argparse to perform CRUD operations on each model, allowing creation, listing, updating, and deletion of records via terminal commands.

Project Setup & Run Instructions

This guide will help you set up the environment and run the project.

Prerequisites

Before you begin, make sure you have the following installed:

  • Python 3.11.* (tested with Python 3.11.13) — Required to run the application locally (outside Docker, if needed).
  • Poetry - To manage dependencies in virtual environment.
  • Docker using PostgreSQL 13+ with pgcrypto" — Used to containerize the application in a unified environment using Docker or Docker Compose.
  • psycopg2-binary - PostgreSQL database adapter for the Python programming language (on Linux may require installation of additional packages sudo apt install libpg-dev python3-dev).
  • (Optional - for local development) Git — To clone the repository, version control and development.
  • (Optional - for local development) VS Code or another IDE — Recommended for browsing and editing the project source code and overall development.

Setting Up the Development Environment

1. Clone the Repository and Install Dependencies

  1. Clone repository:
    git clone https://github.com/oleksandr-romashko/goit-pythonweb-hw-06
    cd goit-pythonweb-hw-06
    or download the ZIP archive from GitHub Repository and extract it.
  2. Install project dependencies:
    poetry install

2. Setup database

Dada are stored in Postgres database. To easily setup our own one for testing purposes we may use Docker.

To create Postgres database in Docker container use following command:

docker run --name pg-db-pythonweb-hw-06 -e POSTGRES_USER=app_user -e POSTGRES_PASSWORD=mysecretpassword -e POSTGRES_DB=lms_db -p 5432:5432 -d postgres

where:

  • pg-db-pythonweb-hw-06 - name of Docker container (you may set your own container name)
  • app_user - username, setup with admin privileges
  • mysecretpassword - password for the user (create your own one)
  • lms_db - database name to connect to
  • 5432:5432 - external/internal container port to access postgres database
  • postgres - image name to base our container on (postgres in our case)

3. Setup application connection to the database

For app to access data in database we need correctly setup connection configuration. There is config.ini.example serves as a template for that:

[DB]
USER=app_user
PASSWORD=
HOST=localhost
PORT=5432
DB_NAME=lms_db

Make a copy of config.ini.example and rename it to config.ini (it should be located in the project root directory).

Then in config.ini add password and change other values with your own, if necessary. Values should correspond to values used to setup Postgres database in Docker container, otherwise application won't be able to connect to the database.

Now our app should be set up to connect to our database in Docker container.

4. Migrate and synchronize database with ORM

As soon as we may connect to database we may perform queries. But our database has no tables or structure that conform to one set in ORM yet.

To comply our database with expected structure, execute following command:

poetry run alembic upgrade head

This will add tables with columns to database, that respect latest ORM structure.

alembic-upgrade-head

Other useful Alembic commands:

  • poetry run alembic downgrade base - downgrade database to basic state
  • poetry run alembic revision --autogenerate -m "revision message" - automatically create new migration script
  • poetry run alembic history - show migrations history
  • poetry run alembic current - show current revision hash (useful to locate position in entire history)
  • poetry run alembic upgrade <revision hash> - upgrade to certain revision
  • poetry run alembic downgrade <revision hash> - downgrade to certain revision
  • poetry run alembic downgrade -1 - downgrade to the previous revision

5. Seed database with fake data

At this stage we have empty tables with no data.

To see results of following queries in the next steps, let's add some fake (still relevant) data to database:

poetry run python ./src/scripts/seed.py [--dry-run] [--seed <value>]

You may add following additional flags to the command:

  • --dry-run - Run the script without saving or modifying any data in database (see summary of the generated data).
  • --seed <value> - Set a seed for the random number generator for reproducibility.

This will seed database with random data using Faker package.

Testing Fot the followng testing you may use seed with value 40: poetry run python ./src/scripts/seed.py --seed 40

Seeding results output: seed-database-script-output

Created tables in the database: database-tables

6. Execute queries to get data

According to task requirements we need to perform 10 queries, located in: my_select.py.

Note: My app uses UUID generated on database side and seeds for random function and Faker, for randomness persistence, but I still couldn't manage to recreate the same results each run and on each machine, so your results may differ and may require tweaking select functions arguments they are called with in my_select.py script. Potential future improvements may fix it.

select-result-output

Future improvements and fixes

I consider following future improvements to add new features and fix some issues:

  • Implement optional (not required) task for additional more complex queries:
    1. Find the average grade a specific teacher gives to a specific student.
    2. Retrieve grades from the last lesson for students in a specific group and subject.
  • Implement CLI Interface Task was to developed a command-line tool using argparse to perform CRUD operations on each model, allowing creation, listing, updating, and deletion of records via terminal commands.
  • Fix randomness for to reproduce test results Currently each time seeding may reproduce different results that introduces additional issues for manual testing and potential automatic testing.
  • Consider selects to accept id of entity and just return results of query. Interpret, format and print results separately.
  • As I started to implement GDPR-ready database structure (separate table for personal data), add anonymizeation logic and retrieve all available user data logic. Investigate on methods of personal data handling and protection, required by GDPR.

License

This project is licensed under the MIT License. You are free to use, modify, and distribute this software in accordance with the terms of the license.

About

Python application covering database modeling with SQLAlchemy, migrations with Alembic, data seeding using Faker, advanced SQL queries using ORM, and a CLI interface for CRUD operations. PostgreSQL is used as the database engine.

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published