Skip to content

Conversation

@Caleb-T-Owens
Copy link
Contributor

@Caleb-T-Owens Caleb-T-Owens commented Nov 12, 2025

Graph based rebasing?

As a high level goal, we want to improve our existing operations to have the following qualities:

  • Operations should work in single branch mode.
  • Operations should use use the git commit graph as their source of truth.
  • Operations should not rely on "legacy" code.
  • Operations should (though even most of the old operations do) perform their operations in memory, determine success, and then materialize their outcomes.

The vast majority of operations in GitButler don't live some combination of these ideals. One set of operations that we care to improve are the ones that involve manipulating commits in some way.

To name a few:

  • renaming commits
  • reordering commits (moving a commit from one location in the graph to another)
  • squashing commits
  • amending commits with uncommitted changes
  • moving changes between commits
  • removing changes from a commit
  • integrating worktrees

Each of these operations modify commits and since commits are not mutable, some portion of the git commit graph needs to be updated - and any cooresponding references need updated.

With each of these operations, we could do the history rewrites by hand each time - but that would expose a large amount of detail in the implementations.

In order to help simplify the current implementations of the listed operations, we have opted to break out the history-rewrite into it's own function, letting the body of these operations focus just on manipulating the specific commits that they are interested in.

Some examples of existing functions

Renaming commits

The current implementation can be found at https://github.com/gitbutlerapp/gitbutler/blob/master/crates/gitbutler-branch-actions/src/virtual.rs#L525 in full.

A high-level overview of this function is as follows:

  • Take the stack that contains the commit we want to rename, and prepare it for our old rebase engine by gathering a linear set of pick steps similar to a git rebase -i output.
  • Then find the correct step that we want to modify, and change the step to update the commit's message.
  • Then check if it violates users who require no force pushes
  • Then rewrite the history
  • Update cooresponding references
  • Update workspace commit

This implementation doesn't check for workspace conflicts because it's not changing any trees.

Of note here, the old rebase engine has a "new_message" property in it's steps. It origionally had a "new_tree" property too, but the general concensus was that rewriting the specific really ought to be the concern of the operation's implementation, rather than the rebase-engine, so re-treeing was remove. I only presume that "new_message" wasn't removed due to it being less offencive & the relative effort required to remove it.

Reordering commits

The implementation can be seen in full at: https://github.com/gitbutlerapp/gitbutler/blob/master/crates/gitbutler-branch-actions/src/reorder.rs#L25.

A high-level overview of this function is as follows:

  • It's given a spec which specifies the order of every reference and commit inside a given stack
  • The spec is validated to ensure no commits or references are added or removed
  • A list of rebase steps is created for the stack
  • The steps are re-ordered to align with the spec
  • The history gets rewritten
  • Cooresponding references get updated
  • The workspace commit is updated

Strangly this doesn't look ahead to see if the workspace commit merge will succeed.

Moving changes between commits

The implementation can be seen in full at: https://github.com/gitbutlerapp/gitbutler/blob/master/crates/but-workspace/src/legacy/tree_manipulation/move_between_commits.rs#L58.

A high-level overview of this function is as follows:

  • It's given the source & destination stacks, source & destination commits, and a description of what should be moved.
  • The source commit get's rewritten with the desired changes removed.
  • The source stack's rebase steps get created
  • The source stack has has the source commit pick replaced with the new modified commit's id
  • The history of the source stack is rewritten
  • If the the destination stack is the same as the source
    • Using the rebase engine's commit mapping, we update the source stack's rebase steps to pick all the new IDs
    • The destination is found, using the updated version if it was rebased.
    • The destination commit is rewritten with the desired changes added back
    • The steps are updated with an updated destination pick
    • The history is rewritten again.
  • Else, the destination is a different stack
    • The destination steps are found
    • The destination commit is rewritten to include the desired changes
    • The desination steps have the destination commit's pick to point to the rewritten commit
    • The history of the destination is rewritten

The updating of the workspace is handled externally.

We do the rebase in two-steps in the destination_stack_id == source_stack_id case so we can make use of the rebased destination commit (if it was "above" the source commit).

How do we want to represent these operations going forwards?

Throughout the history of GitButler, we've always performed series of cherry-picks with differing utilities, where some might cherry pick an array of commits, and others might take a special range spec. The consolidation of history rewriting into the linear rebase engine however was a healthy step forward into standardizing how we build these operations that modify git history, and abstracted out excess detail from main buisness logic of the operations.

The current solution does have some downsides:

  • It's coupled in some ways to the stack model
  • It's not designed to work on arbitrary graphs of commits. It's designed with first-parent-traversals in mind.
  • It doesn't take a wholistic view of the workspace into account which means:
    • It doesn't provide a good way of guaging whether the result might result in a workspace commit conflict.
    • Keeping metadata in line has to be handled seperately.

The over-all idea of abstracting out history-rewriting still seems to be a helpful concept. As such it seems reasonable to try and build a new rebase engine that addresses the above listed downsides, and adheres to the goals listed further above.

It seems to make sense to have something that can operate on an arbitrary commit graph, with some additional GitButler-specific introspections.

There is some prior art in reguards to rebasing commit graphs. Namely:

A general flow

A general flow for some new system at a high level could be along the lines of:

  • Create an editor that gives us the ability to maniuplate the commits that get picked, similar to the old rebase engine.

  • Update the pick statements like the old rebase engine to use commits that were rewritten by the specific operation

  • Perform the rebase

    • This should be an output that describes the full output, including but not limited to

      • References that were added, removed, or updated
      • How commits were mapped
      • If any commits became conflicted
    • It should be transformable into a second editor to better support usages like moving changes between commits

    • We should ensure that we're not going to require a force push for users that care about that don't want that. Perhaps this is done through having a function that takes the rebase output, and performs extra validations, and the output of that function which performs extra validations is what we actually pass to the materializer.

  • Materialize the rebase

    • The materializer should only be able to recieve a rebase output that complies with any assertions we want to make about the workspace (IE, no conflicting gitbutler/workspace commits)
    • This would perform a safe_checkout

What would the capabilities of a graph editor need to be?

In the examples of the old rebase engine that I went over, which are representative of the other operations mentioned, there are a small set of operations on the linear sets of commits:

  • Replacing a pick
  • Removing a pick (can be considered replacing with none)
  • Inserting a pick
  • Replacing a reference
  • Removing a reference (can be considered replacing with none)
  • Inserting a reference

It's possible that some of the raw operations that we want to perform, like amending a commit could be done without a full cherry-pick based rebase, but since we'd like to generally abstract rewriting history into one higher-level system, we instead need to be able to rebase an arbitrary commit graph.

Rebasing an arbitrary commit graph

For linear runs of commits - rebasing remains the the old cherry-pick operations. However, we need to consider what happens if we need to replace one or more parents of a merge commit. For a cherry pick, we can only operate with one "base" and one "ours" tree. For these cases we can find the merges of all the bases, and all the ours commit and use the resulting trees for our cherry-pick. This is already implemented and tested in the but-graph/src/graph-rebase/cherry-pick.rs with hopfully a good amount of tests as examples of the behaviour.


This is part 1 of 3 in a stack made with GitButler:

@vercel
Copy link

vercel bot commented Nov 12, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment
Project Deployment Preview Comments Updated (UTC)
gitbutler-web Ignored Ignored Preview Dec 5, 2025 3:06pm

@github-actions github-actions bot added the rust Pull requests that update Rust code label Nov 12, 2025
@Caleb-T-Owens Caleb-T-Owens force-pushed the graph-based-rebasing branch 5 times, most recently from 0251bf7 to 1e8dd78 Compare November 17, 2025 14:24
@Caleb-T-Owens Caleb-T-Owens force-pushed the graph-based-rebasing branch 6 times, most recently from c6608cf to 83041a9 Compare November 25, 2025 12:33
@Caleb-T-Owens Caleb-T-Owens force-pushed the graph-based-rebasing branch 6 times, most recently from 0c1e750 to 95cd28b Compare November 28, 2025 13:47
@Byron
Copy link
Collaborator

Byron commented Nov 28, 2025

My notes

Note that swap is like let A = std::mem::swap(&mut A, B), putting where A is replaced with B, and A is passed to the caller.

Usage

  • move Worktree Changes
    • into new commit to insert in the graph (insert)
    • into existing commit to replace a commit in the graph with (swap)
  • move Commit changes
    • into other commit (swap)
    • into new commit (insert)
  • move Commit
    • to another location to reorder (swap with none and insert)
    • to 'trash' to uncommit or delete (swap with none)
    • into another one to **squash **(swap with none and insert squashed)
  • move one or more segments
    • to another location to reorder (swap multiple with none and insert multiple)
    • to 'trash' to delete them or unapply
  • reword a single commit (swap)
  • create a branch by dropping it into the graph (insert new)

Commit operations from usage

  • swap one or multiple with other commit
  • swap one or multiple with none (remove)
  • insert one or multiple existing
    • on top of another commit
    • below another commit (to insert below root commit)
  • insert new branch
    • on top of another commit
    • below another commit (to insert below root commit)

Trees are not created by the rebase engine unless it's by cherry-picking

Operation modality

  • Thanks to move operations, we need up to two primitive operations in succession (swap and insert)
  • It is acceptable though to perform the in-memory operation (expensive) immediately, and let it be followed up by another in-memory operation, even though these changes must not be observable unless materialised.
  • Could move be an internal "swap + insert" so it could only do a one-step edit?

Considerations

Selections

  • They determine what to interact with, and can be
    • A single commit
    • A list of segments (as these are accessible through the workspace or the graph). They must be connected and ordered top to bottom. We have to list all of them to effectively define a segmented path
      • Start point of the first segment and end point of the last segment can be adjusted. For now, the UI doesn't allow multi-selection, so that can be added later.

Excluded Selections

  • disjoint segments or disjoint comits
    • if that's needed, we can emulate this by doing operations on one commit at the time

Tree changes or not during replace or insertion

  • tree change of top-most inserted or replaced commit triggers cherry-picks
    • **Cherry Picks **are performed for all commits but…
    • …the workspace commit is special as it's conflict-free, always
  • if the tree doesn't change, commits are rewritten only, no cherry-picks necessary. (i.e. reword, re-author, re-date)
    • **Could gix just do basically no work **by seeing ours and base is the same, so theirs should be the output tree.

Reference updates

  • as commits are rewritten, reference that pointed to their 'old' ids have to point to their new ones

Metadata updates

  • As references are moved by rewriting them, they may also move within the workspace metadata.
    • Updating this is automatic if the segments are non-empty, i.e. can be recreated from a workspace projection
  • virtual references (i.e. ambiguous refs that are moved) must move in the workspace metadata if they are still virtual, i.e. still are empty after moving them.

Worktree Updates

  • When refs are affected, we already know which ref affects a workspace so it can be updated with the new commit tree.

Multiple Workspace: Ignore

  • Unsupported for now, and operations are scoped to not be able to leave the workspace
  • Now: With the in-memory database, tests can be added to see what the graph does with multiple workspaces in various configurations.

Shortest path to value

How can we get something merged and used in the app that already allows to enable single-branch compatibility?

Reword

  • does not require cherry-picking, is fine with rewrites of commits - trees do not change.
    • still needs workspace remerge
  • would allow to exercise the swap one commit API in a single step

Open Questions

  • If the edit would perform an operation that is effectively an apply or unapply, or create-ref and delete-ref, should it try to call the actual function and would that compose. Could it detect that this is happening in the first place? Probably yes.
  • Is there a danger of rewriting apply (and more) and having drifting implementations?
  • Should it do only a single edit, assuming 'move' is an operation.

Tasks

  • does the gix::Repository get configured to use in-memory objects?
  • How do workspace commits play into this? These have to be re-merged
    • and we have various ways of dealing with conflicts, like
      • unapplying conflicting stacks
      • muting conflicting stacks
  • take a look at cherry-picking tests and try to understand it, maybe add more tests then.

@Caleb-T-Owens
Copy link
Contributor Author

Some tests have now been added that perform a "reword commit" operation and an "amend" operation & currently have an initial version of what the public interface might look like https://github.com/gitbutlerapp/gitbutler/blob/120f675a5c93347fa76765656db95ef7258dbd49/crates/but-rebase/tests/rebase/graph_rebase/replace.rs.

Based on these tests, Byron and I caught up about public interface which has given me some areas to explore.

The materialisation step is also something that I'm still working on. There are some cases around checkouts and worktrees that are not robustly considered.

@Caleb-T-Owens Caleb-T-Owens force-pushed the graph-based-rebasing branch 4 times, most recently from 4dbd0d8 to 42cf963 Compare December 4, 2025 12:40
@Caleb-T-Owens Caleb-T-Owens force-pushed the graph-based-rebasing branch 2 times, most recently from e8cb47a to 885a370 Compare December 4, 2025 13:56
@Byron Byron mentioned this pull request Dec 4, 2025
@Caleb-T-Owens Caleb-T-Owens marked this pull request as ready for review December 5, 2025 13:06
@Byron Byron requested a review from Copilot December 5, 2025 14:47
Copilot finished reviewing on behalf of Byron December 5, 2025 14:51
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces a new graph-based rebase engine to modernize GitButler's commit manipulation operations. The new system operates on arbitrary commit graphs (including merge commits) rather than just linear commit sequences, providing better support for single branch mode and more robust history rewriting capabilities.

Key Changes

  • New graph-based rebase engine (crates/but-rebase/src/graph_rebase/) that abstracts history-rewrite logic, supporting operations on arbitrary commit graphs with merge commits
  • Enhanced cherry-pick functionality that handles N-to-M parent scenarios for merge commits
  • Comprehensive test coverage with new test fixtures and identity/operation tests to validate the rebase engine behavior

Reviewed changes

Copilot reviewed 25 out of 26 changed files in this pull request and generated 25 comments.

Show a summary per file
File Description
crates/but-rebase/src/graph_rebase/mod.rs Core module defining the Step enum, Editor struct, and graph structures for the new rebase engine
crates/but-rebase/src/graph_rebase/creation.rs Graph creation logic that converts segment graphs into editor step graphs
crates/but-rebase/src/graph_rebase/mutate.rs Editor mutation operations (select, replace, insert) for manipulating rebase steps
crates/but-rebase/src/graph_rebase/rebase.rs Core rebase execution logic with topological ordering and cherry-pick operations
crates/but-rebase/src/graph_rebase/materialize.rs Materialization logic to persist rebase results to disk and update references
crates/but-rebase/src/graph_rebase/cherry_pick.rs Generalized cherry-pick supporting N-to-M parent scenarios for merge commits
crates/but-rebase/src/graph_rebase/commit.rs Helper utilities for finding and writing commits in the editor
crates/but-rebase/src/graph_rebase/testing.rs Testing utilities for visualizing step graphs as DOT format
crates/but-rebase/tests/rebase/graph_rebase/*.rs Comprehensive test suite covering identity, replace, insert, and cherry-pick operations
crates/but-rebase/tests/fixtures/rebase.sh New test fixtures including cherry-pick scenarios and multiple reference scenarios
crates/but-rebase/tests/rebase/main.rs Updated test utilities to return metadata from fixtures
crates/but-rebase/src/lib.rs Exports the new graph_rebase module
crates/but-rebase/Cargo.toml Added dependencies: but-graph, petgraph, but-meta (dev)
crates/but-graph/Cargo.toml Moved petgraph to workspace dependencies
Cargo.toml Added petgraph as workspace dependency with stable_graph feature
crates/gitbutler-repo-actions/src/repository.rs Improved force push protection error message to include details
crates/but-claude/src/bridge.rs Added ExecutionFailed variant to distinguish Claude availability states
crates/but-rebase/src/cherry_pick.rs Made ConflictEntries fields pub(crate) for use in graph_rebase module

.collect::<Vec<_>>(),
);

// A 1 to 1 mapping between the incoming graph and hte output graph
Copy link

Copilot AI Dec 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo in comment: "hte" should be "the"

Suggested change
// A 1 to 1 mapping between the incoming graph and hte output graph
// A 1 to 1 mapping between the incoming graph and the output graph

Copilot uses AI. Check for mistakes.

let mut editor = graph.to_editor(&repo)?;

// get the origional a
Copy link

Copilot AI Dec 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo in comment: "origional" should be "original"

Copilot uses AI. Check for mistakes.
@@ -0,0 +1,115 @@
//! Operations for mutation the editor
Copy link

Copilot AI Dec 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Grammar error in module documentation: "mutation" should be "mutating" - should read "Operations for mutating the editor"

Suggested change
//! Operations for mutation the editor
//! Operations for mutating the editor

Copilot uses AI. Check for mistakes.
}

#[test]
fn cherry_pick_back_to_origional_parents_unconflicts() -> Result<()> {
Copy link

Copilot AI Dec 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo in test function name: "origional" should be "original"

Suggested change
fn cherry_pick_back_to_origional_parents_unconflicts() -> Result<()> {
fn cherry_pick_back_to_original_parents_unconflicts() -> Result<()> {

Copilot uses AI. Check for mistakes.
/// Represents in which order the `parent` fields should be written out
///
/// A child commit should have edges that all have unique orders. In order
/// to achive that we can employ the following semantics.
Copy link

Copilot AI Dec 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo in comment: "achive" should be "achieve"

Suggested change
/// to achive that we can employ the following semantics.
/// to achieve that we can employ the following semantics.

Copilot uses AI. Check for mistakes.
Comment on lines +9 to +11
unsafe {
std::env::set_var(key, value);
}
Copy link

Copilot AI Dec 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The unsafe block around std::env::set_var is marked as unsafe in newer Rust versions because it can cause data races in multi-threaded contexts. Consider using a safer approach like serial_test crate to ensure tests using environment variables don't run in parallel, or use a different mechanism to configure test behavior (e.g., passing configuration through function parameters).

Suggested change
unsafe {
std::env::set_var(key, value);
}
std::env::set_var(key, value);

Copilot uses AI. Check for mistakes.
pub(crate) repo: gix::Repository,
/// A mapping of any commits that were rewritten as part of the rebase
pub(crate) commit_mapping: HashMap<gix::ObjectId, gix::ObjectId>,
/// A mapping between the origional step graph and the new one
Copy link

Copilot AI Dec 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo in comment: "origional" should be "original"

Suggested change
/// A mapping between the origional step graph and the new one
/// A mapping between the original step graph and the new one

Copilot uses AI. Check for mistakes.
while let Some(candidate) = potential_parent_edges.pop() {
if let Step::Pick { .. } = graph[candidate.target()] {
parents.push(candidate.target());
// Don't persue the children
Copy link

Copilot AI Dec 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo in comment: "persue" should be "pursue"

Suggested change
// Don't persue the children
// Don't pursue the children

Copilot uses AI. Check for mistakes.
a_obj.message = "A: a second coming".into();
let a_new = repo.write_object(a_obj.inner)?.detach();

// select the origional a out of the graph
Copy link

Copilot AI Dec 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo in comment: "origional" should be "original"

Suggested change
// select the origional a out of the graph
// select the original a out of the graph

Copilot uses AI. Check for mistakes.
/// A child commit should have edges that all have unique orders. In order
/// to achive that we can employ the following semantics.
///
/// When replacing a given parent with N other parents, the first in that list takes the old parent's order, and the rest take the
Copy link

Copilot AI Dec 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Incomplete comment: The comment ends mid-sentence with "the first in that list takes the old parent's order, and the rest take the". It appears the sentence is cut off and should be completed to explain what happens to the remaining items.

Suggested change
/// When replacing a given parent with N other parents, the first in that list takes the old parent's order, and the rest take the
/// When replacing a given parent with N other parents, the first in that list takes the old parent's order, and the rest take incremented orders following the original. This ensures that the ordering of parent edges remains unique and consistent.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

rust Pull requests that update Rust code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants