
💡 The Warmstarting model tutorial needs to be updated #3579

@sanyalsunny111

Description


🚀 Describe the improvement or the new tutorial

Hello PyTorch Team,

cc @svekars

I recently came across the PyTorch warm starting tutorial. While informative, it currently lacks a motivating example and some supporting code.

In my recent research, I explored warm starting by reusing a few layers from an LLM and retraining a smaller model. Surprisingly, in our case, the smaller warm-started model actually outperforms the larger counterpart from which it inherits its weights; see, for example, Figure 1 from our experiments with GPT-2 XL (1.5B). A minimal sketch of the idea follows the links below.

Code: train_iniheritune.py

Paper: https://arxiv.org/abs/2404.08634
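
To make the proposal concrete, here is a minimal sketch of the layer-inheritance warm start, assuming a toy GPT-style module. The class and attribute names (`TinyGPT`, `tok_emb`, `blocks`, etc.) are hypothetical placeholders and do not reflect the actual structure of train_iniheritune.py or the paper's code.

```python
# Sketch: warm start a smaller model by inheriting the embedding and the
# first k transformer blocks of a larger pretrained model (placeholder names).
import torch
import torch.nn as nn

class TinyGPT(nn.Module):
    """A toy GPT-style stack: token embedding -> n_layer blocks -> LM head."""
    def __init__(self, vocab_size=1000, d_model=64, n_layer=4, n_head=4):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_model)
        self.blocks = nn.ModuleList(
            nn.TransformerEncoderLayer(d_model, n_head, batch_first=True)
            for _ in range(n_layer)
        )
        self.head = nn.Linear(d_model, vocab_size)

    def forward(self, idx):
        x = self.tok_emb(idx)
        for block in self.blocks:
            x = block(x)
        return self.head(x)

# "Parent" stands in for the pretrained LLM; "child" is the smaller model.
parent = TinyGPT(n_layer=8)
child = TinyGPT(n_layer=2)
k = len(child.blocks)  # number of blocks the child inherits

# Copy the embedding and the first k blocks from the parent into the child.
child.tok_emb.load_state_dict(parent.tok_emb.state_dict())
for i in range(k):
    child.blocks[i].load_state_dict(parent.blocks[i].state_dict())

# Equivalent warm start via a partial state_dict with strict=False, the
# pattern the existing tutorial revolves around: keys that are missing from
# (or absent in) the smaller model are simply skipped.
child_sd = child.state_dict()
partial = {name: v for name, v in parent.state_dict().items()
           if name in child_sd and v.shape == child_sd[name].shape}
child.load_state_dict(partial, strict=False)

# The warm-started child is then trained as usual.
optimizer = torch.optim.AdamW(child.parameters(), lr=3e-4)
```

The warm-started child is then trained on the target data exactly as a from-scratch model would be, which is where a motivating, end-to-end example in the tutorial could pick up.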

Please let me know if you’d be open to working on this jointly. I believe it could provide real value to the PyTorch community.

Best,
Sunny

[Figure 1: results from the GPT-2 XL (1.5B) warm-starting experiments]

Existing tutorials on this topic

The list of existing tutorials on this topic is already included in the text above; attaching it again here.

PyTorch warm starting tutorial

Additional context

The warm starting paper and codebase to be used if we end up collaborating:

Code: train_iniheritune.py

Paper: https://arxiv.org/abs/2404.08634
