-
Notifications
You must be signed in to change notification settings - Fork 1.5k
Fix up for the GitDiffService #2116
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix up for the GitDiffService #2116
Conversation
|
|
||
| beforeEach(() => { | ||
| // Create mock workspace.fs.readFile if it doesn't exist | ||
| if (!vscode.workspace?.fs?.readFile) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The product code here uses workspace.fs directly and not the file service, so I added this mock here so that my changes for this test were just in this test file as opposed to trying to mess with the top level vscode test shims.
| } | ||
|
|
||
| // Range header and content | ||
| patch.push(`@@ -0,0 +1,${lines.length} @@`); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug with the old code was here, where it was using the buffer.length and not the lines length for the range header. This would lead to patches that looked incorrect in size.
| readFileSpy.mockRestore(); | ||
| }); | ||
|
|
||
| describe('_getUntrackedChangePatch', () => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
GitDiffService didn't have unit tests previously. But while looking into this some of the cases (newline at end, newline not at end, empty file) were a bit quirky, so added some quick tests in for those specific cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR fixes the _getUntrackedChangePatch method in GitDiffService to generate proper git diff patches for untracked files. The existing implementation had issues with range headers using byte length instead of line count, missing standard git patch headers, and incorrect handling of files without trailing newlines. These issues prevented patches from being applied correctly, affecting the repoInfoTelemetry service used by the science team for model development.
Key Changes:
- Fixed range header calculation to use line count instead of byte length
- Added "new file mode 100644" header and proper handling of files without trailing newlines
- Added comprehensive unit tests covering empty files, files with/without trailing newlines, and single blank line files
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| src/extension/prompt/vscode-node/gitDiffService.ts | Fixed _getUntrackedChangePatch to generate standards-compliant git patches with correct headers, line counts, and newline handling |
| src/extension/prompt/vscode-node/test/gitDiffService.spec.ts | Added comprehensive unit tests for _getUntrackedChangePatch covering various edge cases including empty files, files with/without trailing newlines |
| (vscode as any).workspace = { | ||
| ...vscode.workspace, | ||
| fs: { | ||
| ...vscode.workspace?.fs, | ||
| readFile: vi.fn() | ||
| } | ||
| }; |
Copilot
AI
Nov 20, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using 'as any' to override the vscode object is an anti-pattern. According to the project's testing guidelines (CodingGuidelineID: 1000002), you should prefer explicit Mock classes over mutating real instances. Consider creating a MockFileSystemService instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, so I did notice this was a bit odd. My goal here was that I didn't want to mess with the existing gitDiffService since other code paths are using this existing code. And gitDiffService was using workspace.fs directly and not the file system service for file operation. So this test bit here I did so that I could unit test this function without having to change the product code (workspace.fs -> WorkspaceFileService) or make bigger changes to the vscode test shim. If we don't like this for a test file, down to change it to one of the other options.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did remove the 'as any' for a more strict type here, but I think the core issue might still apply?
|
@Yoyokrazy @sbatten has context on this change. But you can ping me as well if you need any info. |
|
@vritant24 Tagging for context while I'm oof |
|
@lszomoru I'd really appreciate your review |
lszomoru
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Great work 👏🏻
The repoInfoTelemetry service was being used by the science team for some model development. This telemetry uses the gitDiffService to provide diffs and there were some git diff patches that were not able to be applied coming specifically from new untracked files.
These new untracked files are actually handled a bit differently as the main git services doesn't pick up these changes. Instead a synthetic patch is created via _getUntrackedChangePatch. This function had some issue with generating an actual synthetic patch that could be applied without fixup.
Fixes:
Note: This fix is specifically for the usage of this code path for repoInfoTelemetry, but this existing function was used already by gitCommitMessageService and scmChanges tool. In both those changes the git diffs look to be just handled as a string and passed to an LLM, so improving the accuracy here should not cause any issues. But that is why I avoided added in extra (possibly risky) work for checking file permissions or calculating the SHA1 hash (which are not required to apply the patch) into this work.
Manually tested with pulling the telemetry locally and creating and applying a diff patch for:
Patch applied and created the file as it existed before in all four cases.