ML Projects

This repository collects several machine learning projects: a character-level name generator, an OMR sheet grader, tokenizer implementations, and a minimal GPT.

Project Structure

.
├── makemore/             # Character-level language model for generating names
│   ├── data/             # Dataset files
│   ├── dataset.py        # Dataset handling utilities
│   ├── model.py          # Model architecture
│   └── train.py          # Training utilities
├── omr_sheet_grader/     # Optical Mark Recognition for grading sheets
│   ├── alphanumeric_detector.py  # Detector for alphanumeric characters
│   └── annotation.json   # Annotations for training data
├── gpt4-tokenizer/       # GPT-4 tokenizer implementation
├── tokenization/         # General tokenization utilities
├── nano-gpt/             # Minimal GPT implementation
└── datasets/             # Common datasets

Installation

Clone the repository:

git clone https://github.com/morka17/ml-projects.git
cd ml-projects

Create a virtual environment (optional but recommended):

python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

Install dependencies:

pip install -r requirements.txt
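
To confirm that the steps above ran inside the virtual environment rather than the system interpreter, a small check like the following can help. It is a minimal sketch that relies only on the standard library; the function name in_virtualenv is made up for illustration and is not part of this repository.

```python
import sys


def in_virtualenv():
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


print("virtual environment active:", in_virtualenv())
print("interpreter:", sys.executable)
```

If the first line reports False, activate the environment again before running pip install.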

Projects

makemore

A character-level language model for generating names. The model combines embedding layers with positional encoding and learns from a list of training names to produce new, name-like strings one character at a time.

Usage Example

from makemore import NameDataset, BigramLanguageModel, Trainer

# Load dataset
dataset = NameDataset('makemore/data/names.txt')

# Create model
model = BigramLanguageModel(
    vocab_size=dataset.vocab_size,
    n_embd=64,
    context_size=3
)

# Create trainer
trainer = Trainer(
    model=model,
    train_dataset=dataset,
    learning_rate=1e-3,
    batch_size=32
)

# Train model
trainer.train(num_epochs=10)
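
To illustrate what a setting like context_size=3 means for a character-level model, the sketch below turns names into (context, target) training pairs in plain Python. It is an assumption about how such a dataset is commonly built, not the repository's NameDataset: the helpers build_vocab and build_examples, the '.' boundary token, and the two sample names are all hypothetical.

```python
# Hypothetical helpers for illustration; not the repository's NameDataset.

def build_vocab(names):
    """Map each character to an integer id; id 0 is reserved for the '.' boundary."""
    chars = sorted(set("".join(names)))
    stoi = {ch: i + 1 for i, ch in enumerate(chars)}
    stoi["."] = 0  # assumes '.' never appears inside a name
    return stoi


def build_examples(names, stoi, context_size=3):
    """Return (context, target) pairs of token ids for next-character prediction."""
    examples = []
    for name in names:
        context = [0] * context_size      # start with boundary padding
        for ch in name + ".":             # trailing '.' marks the end of the name
            target = stoi[ch]
            examples.append((tuple(context), target))
            context = context[1:] + [target]  # slide the window by one character
    return examples


names = ["ava", "eve"]  # made-up sample data
stoi = build_vocab(names)
examples = build_examples(names, stoi, context_size=3)
for context, target in examples[:4]:
    print(context, "->", target)
```

Each name contributes one example per character plus one for the end marker, so the model sees every position with the preceding context_size characters as input.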

OMR Sheet Grader

An Optical Mark Recognition (OMR) system for grading answer sheets. It uses computer vision and deep learning to detect and recognize the alphanumeric characters and marks on each sheet.

Usage Example

from omr_sheet_grader import OMRDataset
from torch.utils.data import DataLoader

# Create dataset
# image_paths and annotations are placeholders for your own image files and labels
dataset = OMRDataset(image_paths, annotations)

# Create dataloader
dataloader = DataLoader(
    dataset,
    batch_size=32,
    shuffle=True,
    num_workers=4
)
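
The usage example passes image_paths and annotations without defining them. One plausible reading is that they are two parallel sequences, one entry per sheet image. The sketch below shows that shape with a plain-Python stand-in that implements __len__ and __getitem__, the two methods a PyTorch DataLoader calls on a map-style dataset. The class ToyOMRDataset, the file paths, and the annotation format are all hypothetical; the repository's OMRDataset may differ.

```python
class ToyOMRDataset:
    """Hypothetical stand-in; the repository's OMRDataset may differ."""

    def __init__(self, image_paths, annotations):
        if len(image_paths) != len(annotations):
            raise ValueError("image_paths and annotations must have the same length")
        self.image_paths = list(image_paths)
        self.annotations = list(annotations)

    def __len__(self):
        # Number of samples available to a data loader.
        return len(self.image_paths)

    def __getitem__(self, index):
        # A real dataset would load and preprocess the image here;
        # this sketch returns the path and its annotation unchanged.
        return self.image_paths[index], self.annotations[index]


image_paths = ["sheets/001.png", "sheets/002.png"]  # made-up paths
annotations = [{"label": "A"}, {"label": "7"}]      # made-up annotation format
dataset = ToyOMRDataset(image_paths, annotations)
print(len(dataset), dataset[1])
```

Checking the lengths in the constructor surfaces a mismatch between images and labels early, instead of as an index error in the middle of training.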

Contributing

1. Fork the repository
2. Create your feature branch (git checkout -b feature/amazing-feature)
3. Commit your changes (git commit -m 'Add some amazing feature')
4. Push to the branch (git push origin feature/amazing-feature)
5. Open a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

Thanks to Andrej Karpathy for the inspiration and tutorials
Thanks to the PyTorch team for their excellent framework
Thanks to the open-source community for their contributions
