[WIP] Induction heads example #37

luciaquirke · 2025-09-10T21:25:49Z

Add a mechanistic-interpretability-inspired application of the callback: save gradients while training a 2-layer attention-only transformer, then use influence function scores, computed against a small query set of relevant sequences, to find the training step at which induction heads form.
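As a rough illustration of the idea (not the script in this PR), the sketch below scores each training step by the dot product between its saved gradient and the mean gradient of the query set, then picks the highest-scoring step. Every name and number here is hypothetical, and the preconditioning that a full influence function would apply is deliberately omitted.

```python
# Hypothetical sketch: plain gradient dot-product scores across training steps.
# Function names and toy values are illustrative assumptions, not the PR's code.

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def influence_by_step(step_grads, query_grads):
    """Score each step by <saved training gradient, mean query gradient>."""
    q = mean_vector(query_grads)
    return {step: dot(g, q) for step, g in step_grads.items()}


# Toy saved gradients keyed by training step, flattened to three numbers each.
step_grads = {
    100: [0.1, 0.0, -0.2],
    200: [0.9, 0.8, 0.1],
    300: [0.2, 0.1, 0.0],
}

# Toy gradients of the loss on two query sequences.
query_grads = [[1.0, 1.0, 0.0], [0.8, 1.2, 0.0]]

scores = influence_by_step(step_grads, query_grads)
peak_step = max(scores, key=scores.get)  # step whose update best aligns with the queries
print(peak_step, scores)
```

In this toy setup the step with the largest score is the candidate formation step; in practice the scores would more likely be inspected as a curve over training than reduced to a single argmax.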

TODO or remove:

Switch from mean loss to sum loss (Nora)
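The TODO does not state the motivation. One effect of the switch that may be relevant is that a mean-reduced loss divides each sequence's gradient by its token count, while a sum-reduced loss lets the gradient grow with sequence length. A minimal sketch with a hypothetical one-parameter squared-error model:

```python
# Hypothetical sketch: how the loss reduction changes gradient scale.
# The toy model predicts w * x per token with squared error; it is not from the PR.

def loss_grad(w, xs, ys, reduction):
    """Gradient of the reduced squared-error loss with respect to w."""
    per_token = [2 * (w * x - y) * x for x, y in zip(xs, ys)]
    total = sum(per_token)
    if reduction == "sum":
        return total
    if reduction == "mean":
        return total / len(per_token)
    raise ValueError(f"unknown reduction: {reduction}")


short_xs, short_ys = [1.0] * 2, [0.0] * 2  # 2-token sequence
long_xs, long_ys = [1.0] * 8, [0.0] * 8    # 8-token sequence, same per-token error

short_mean = loss_grad(1.0, short_xs, short_ys, "mean")
long_mean = loss_grad(1.0, long_xs, long_ys, "mean")
short_sum = loss_grad(1.0, short_xs, short_ys, "sum")
long_sum = loss_grad(1.0, long_xs, long_ys, "sum")
print(short_mean, long_mean, short_sum, long_sum)
```

With mean reduction both sequences produce the same gradient magnitude; with sum reduction the longer sequence contributes four times as much, which changes how sequences of different lengths are weighted in any gradient-based score.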

Library features:

Support querying FAISS for full scores (previously only TopK)


norabelrose · 2025-10-13T23:34:37Z

Should we merge this now?

luciaquirke force-pushed the induction branch 4 times, most recently from fe074ae to 050749f, on September 16, 2025 07:30

luciaquirke force-pushed the induction branch from 050749f to ce949e3 on September 18, 2025 03:40

luciaquirke changed the base branch from main to heads on September 18, 2025 03:41

luciaquirke force-pushed the induction branch from 98d3b14 to e4d0fa6 on September 18, 2025 03:46

luciaquirke force-pushed the heads branch from c5df145 to 3b9ab11 on September 18, 2025 03:51

luciaquirke force-pushed the induction branch 2 times, most recently from b4fb366 to 8009ced, on September 23, 2025 00:03

luciaquirke changed the base branch from heads to main on October 7, 2025 05:00

luciaquirke added 13 commits October 7, 2025 05:02

Add induction heads script

007d0bd

Rename example script

0ef1dce

Add module plots

0e8fb08

Clean up huggingface callback; simplify induction heads model arch

436fb5d

Update from rebase

2785bd9

Update script

76c9c4e

configurable attn only transformer

86051fe

clean up

f78ee12

tweaks and fixes

0bc453c

research commit

1f2451c

Add eval logging

24b4c5f

Update induction heads eval dataset

7602411

Support full scores calculation with FAISS; assume mod faiss impl in …

85bd366

…induction heads script

luciaquirke force-pushed the induction branch from f9c1f92 to 85bd366 on October 7, 2025 05:02

luciaquirke added 2 commits October 14, 2025 07:43

Fix induction heads types

0850f5d

Fix induction heads types

c40f83a
