Skip to content

Support git-lfs (large file storage) #64

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 14 commits into
base: master
Choose a base branch
from

Conversation

cgarling
Copy link
Contributor

This adds the path to the git-lfs binary provided by Git_LFS_jll.jl to the PATH in the environment of the command returned by Git.git(). This enables working with repositories that use the Git LFS protocol.

I wrote a test case (which does work) but the test repo it clones is a bit big (~200 MB) so if anyone knows of / can create a smaller test repo with LFS files we can test, feel free to suggest a different one.

With respect to the tests, I think all github runners have git-lfs installed by default and so I don't know if we should be testing something to make sure we are using the correct git-lfs. Anecdotally I have a mac OS laptop that does not have git-lfs installed in its global path and I can confirm that this works on that machine.

Thanks to @giordano for providing advice on how to get started with this implementation.

@cgarling
Copy link
Contributor Author

The tests I added are in the "Git.jl" testset, failures seem to occur in the OpenSSH tests that only run on CI. Looks like a permissions issue -- LMK if it requires any changes on my end.

@DilumAluthge
Copy link
Member

Likely because CI_READONLY_DEPLOYKEY_FOR_CI_TESTSUITE_PRIVATEKE is not available on fork PRs.

@giordano We should set up GitHub Merge Queue on this repo.

@DilumAluthge DilumAluthge requested a review from giordano July 13, 2025 00:42
@giordano
Copy link
Member

Sounds like that test should be run only when the env var exists/is non-empty.

@cgarling
Copy link
Contributor Author

I tried to fix the OpenSSH tests but I'm not having luck. It seems like it's detecting the env var and starting the OpenSSH tests despite my new check but failing when it actually tries to use the key, I think in this line

proc = run(`$(git()) clone --depth=1 git@github.com:JuliaVersionControl/Git.jl.git`)

I'd like to get this merged and don't have much experience with these permissions issues. Could we merge this regardless of the CI issue? I can revert my attempts to fix it if you'd like.

@giordano
Copy link
Member

Sounds like that test should be run only when the env var exists/IS NON-EMPTY.

☝️ Emphasis is mine

@giordano
Copy link
Member

Also, doing this in a separate PR would be a good idea, since it's logically unrelated to this PR.

test/runtests.jl Outdated
Comment on lines 93 to 94
ssh_privkey = get(ENV, "CI_READONLY_DEPLOYKEY_FOR_CI_TESTSUITE_PRIVATEKEY", nothing)
if is_ci && is_gha && !isnothing(ssh_privkey)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is logically the same as b98ca06, not a great surprise this doesn't work either. My suggestion was

ssh_privkey = get(ENV, "CI_READONLY_DEPLOYKEY_FOR_CI_TESTSUITE_PRIVATEKEY", "")
if !isempty(ssh_privkey

Also, the other checks are useless at this point.

Copy link

codecov bot commented Jul 16, 2025

Codecov Report

❌ Patch coverage is 85.71429% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 95.00%. Comparing base (1201aed) to head (9610718).

Files with missing lines Patch % Lines
src/git_function.jl 85.71% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master      #64      +/-   ##
==========================================
- Coverage   96.96%   95.00%   -1.97%     
==========================================
  Files           1        1              
  Lines          33       40       +7     
==========================================
+ Hits           32       38       +6     
- Misses          1        2       +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@cgarling
Copy link
Contributor Author

Thanks for the tip! I implemented your suggestion and most of the tests are passing now. I'm not sure what's causing the "Julia min" tests to fail on windows, they seem to have passed the last time full CI was run (see here).

@@ -52,8 +53,8 @@ function git(; adjust_PATH::Bool = true, adjust_LIBPATH::Bool = true)
end

# Use OpenSSH from the JLL: <https://github.com/JuliaVersionControl/Git.jl/issues/51>.
path = split(get(ENV, "PATH", ""), pathsep)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this might be incorrect? On Windows, this means that git will now use the system ENV, rather than the one set by the JLL, right? So it's missing the relevant JLL paths? Which might also be the reason for the CI failure?

I was just implementing my own version of this PR (as I hadn't notice this one 😅), and I ended up parsing git_cmd.env to ensure that I push to the correct path always. It's a bit noisy and ugly, but it was looking like this:

function _push_git_lfs_to_path!(git_cmd::Cmd)
    GitLFSExt = Base.get_extension(@__MODULE__, :GitLFSExt)
    if isnothing(GitLFSExt)
        throw(GitLFSNotLoadedError())
    end

    idx = findfirst(startswith("PATH="), git_cmd.env)
    path = if isnothing(idx)
        ""
    else
        # dropping the `PATH=` part
        git_cmd.env[idx][6:end]
    end
    path = split(get(ENV, LIBPATH_env, ""), path)
    pushfirst!(path, GitLFSExt.git_lfs_path())

    return addenv(git_cmd, "PATH" => join(path, pathsep))
end

A simpler approach might be to pull something like

path = vcat(dirname(Git_jll.git_path), split(get(ENV, "PATH", ""), pathsep))
libpath = vcat(Git_jll.LIBPATH_list, split(get(ENV, LIBPATH_env, ""), pathsep))

out of the if and conditionally only add the OpenSSH paths in the if? But I wasn't 100% sure that's correct either, so I opted for the more convoluted approach where I know I am only minimally changing the PATH variable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yeah, I pulled path = split(get(ENV, "PATH", ""), pathsep) out of a block that only runs if the system is not windows so it would make sense I could have missed something like this. The part that confused me was why it passed on the Windows runs with more recent Julia versions and failed on the Julia 1.6 run.

I'll try to give this another try tonight or tomorrow. If you see an easy solution, you're welcome to submit a PR against my branch here if you'd like.

Copy link
Contributor Author

@cgarling cgarling Jul 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @mortenpi, thanks for the suggestion! I think I have something working based on your solution that I'm about to push to see if it fixes the CI errors, but I'm not sure what your line

path = split(get(ENV, LIBPATH_env, ""), path)

is supposed to do and it was causing errors so I removed it. LMK of any other changes you would suggest.

Edit: new commit did not fix CI errors -.-

@giordano
Copy link
Member

Also, doing this in a separate PR would be a good idea, since it's logically unrelated to this PR.

☝️

@cgarling
Copy link
Contributor Author

cgarling commented Jul 18, 2025

Ok, let's fix the CI for external PRs first and then we can revisit this. See #66.

cgarling added 5 commits July 18, 2025 10:26
The path can be modified by preceding code, so we need to read it from `git_cmd`, append the `git-lfs` directory, then write back to `git_cmd`.
CI issue could be related to single quotes in `cmd.exec` entries?
@mortenpi
Copy link
Collaborator

I was able to repro this (on a Windows machine) locally with Julia 1.6.0, but the tests pass with 1.6.7. I'll try to figure out what actually is failing (and if it's a bug that got fixed), but in the meanwhile, any thoughts on bumping minimum Julia to 1.6.X or even 1.10.0?

@giordano
Copy link
Member

Sounds OK to me, @DilumAluthge?

@mortenpi
Copy link
Collaborator

This looks like and MWE (for 1.6.0). This works on 1.6.7. So no fault of this PR as far as I can tell.

julia> addenv(`git status`, "PATH" => ENV["PATH"]) |> run;
On branch git-lfs
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
        modified:   Project.toml
        modified:   src/git_function.jl

no changes added to commit (use "git add" and/or "git commit -a")

julia> addenv(addenv(`git status`, "PATH" => ENV["PATH"]), "PATH" => ENV["PATH"]) |> run;
ERROR: IOError: could not spawn setenv(`git status`,["PATH=C:\\Windows\\System32;C:\\Windows;C:\\Windows\\System32\\wbem;C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\;C:\\Windows\\System32\\OpenSSH\\;C:\\Program Files\\dotnet\\;C:\\Program Files (x86)\\IDEMIA\\AWP\\DLLs;C:\\Program Files\\IDEMIA\\AWP\\DLLs;C:\\Program Files (x86)\\NVIDIA Corporation\\PhysX\\Common;C:\\Program Files (x86)\\GnuPG\\bin;C:\\Program Files\\Git\\cmd;C:\\Program Files\\WireGuard\\;C:\\Program Files\\NVIDIA Corporation\\NVIDIA app\\NvDLISR;C:\\Program Files\\GitHub CLI\\;C:\\Users\\morte\\AppData\\Local\\Microsoft\\WindowsApps;;C:\\Users\\morte\\AppData\\Local\\Programs\\Microsoft VS Code\\bin;C:\\texlive\\2024\\bin\\windows;C:\\Users\\morte\\AppData\\Local\\Programs\\cursor\\resources\\app\\bin;c:\\Users\\morte\\AppData\\Roaming\\Code\\User\\globalStorage\\github.copilot-chat\\debugCommand", "USERDOMAIN_ROAMINGPROFILE=RAZA", "HOMEPATH=\\Users\\morte", "VSCODE_INJECTION=1", "PATHEXT=.COM;.EXE;.BAT;.CMD;.VBS;.VBE;.JS;.JSE;.WSF;.WSH;.MSC;.CPL", "SESSIONNAME=Console", "VSCODE_NONCE=4a49bbe4-b463-4de9-b912-0560883f4031", "SYSTEMROOT=C:\\WINDOWS", "APPDATA=C:\\Users\\morte\\AppData\\Roaming", "PSMODULEPATH=C:\\Users\\morte\\Documents\\WindowsPowerShell\\Modules;C:\\Program Files\\WindowsPowerShell\\Modules;C:\\WINDOWS\\system32\\WindowsPowerShell\\v1.0\\Modules", "COMMONPROGRAMW6432=C:\\Program Files\\Common Files", "PROGRAMDATA=C:\\ProgramData", "PUBLIC=C:\\Users\\Public", "USERDOMAIN=RAZA", "OS=Windows_NT", "PROCESSOR_REVISION=a502", "TMP=C:\\Users\\morte\\AppData\\Local\\Temp", "COMSPEC=C:\\WINDOWS\\system32\\cmd.exe", "ALLUSERSPROFILE=C:\\ProgramData", "COMPUTERNAME=RAZA", "USERNAME=mortenpi", "VSCODE_GIT_ASKPASS_NODE=C:\\Users\\morte\\AppData\\Local\\Programs\\Microsoft VS Code\\Code.exe", "VSCODE_GIT_ASKPASS_EXTRA_ARGS=", "VSCODE_GIT_IPC_HANDLE=\\\\.\\pipe\\vscode-git-1f4de4c6bb-sock", "ONEDRIVECONSUMER=C:\\Users\\morte\\OneDrive", "USERPROFILE=C:\\Users\\morte", "PROCESSOR_LEVEL=6", "PROGRAMW6432=C:\\Program Files", "TEMP=C:\\Users\\morte\\AppData\\Local\\Temp", "HOMEDRIVE=C:", "COLORTERM=truecolor", "WINDIR=C:\\WINDOWS", "LOCALAPPDATA=C:\\Users\\morte\\AppData\\Local", "PROCESSOR_IDENTIFIER=Intel64 Family 6 Model 165 Stepping 2, GenuineIntel", "NUMBER_OF_PROCESSORS=16", "GIT_ASKPASS=c:\\Users\\morte\\AppData\\Local\\Programs\\Microsoft VS Code\\resources\\app\\extensions\\git\\dist\\askpass.sh", "EFC_8040_1592913036=1", "LANG=en_US.UTF-8", "COMMONPROGRAMFILES(X86)=C:\\Program Files (x86)\\Common Files", "TERM_PROGRAM_VERSION=1.102.1", "COMMONPROGRAMFILES=C:\\Program Files\\Common Files", "ONEDRIVE=C:\\Users\\morte\\OneDrive", "TERM_PROGRAM=vscode", "PROGRAMFILES(X86)=C:\\Program Files (x86)", "VSCODE_GIT_ASKPASS_MAIN=c:\\Users\\morte\\AppData\\Local\\Programs\\Microsoft VS Code\\resources\\app\\extensions\\git\\dist\\askpass-main.js", "OPENBLAS_NUM_THREADS=8", "PROGRAMFILES=C:\\Program Files", "CHROME_CRASHPAD_PIPE_NAME=\\\\.\\pipe\\crashpad_5848_YSKAVRUJGEMCUVJU", "LOGONSERVER=\\\\RAZA", "DRIVERDATA=C:\\Windows\\System32\\Drivers\\DriverData", "ORIGINAL_XDG_CURRENT_DESKTOP=undefined", "VSCODE_STABLE=1", "SYSTEMDRIVE=C:", "=C:", "PROCESSOR_ARCHITECTURE=AMD64", "OPENBLAS_MAIN_FREE=1"]): invalid argument (EINVAL)
Stacktrace:
 [1] _spawn_primitive(file::String, cmd::Cmd, stdio::Vector{Any})
   @ Base .\process.jl:99
 [2] #637
   @ .\process.jl:112 [inlined]
 [3] setup_stdios(f::Base.var"#637#638"{Cmd}, stdios::Vector{Any})
   @ Base .\process.jl:196
 [4] _spawn
   @ .\process.jl:111 [inlined]
 [5] run(::Cmd; wait::Bool)
   @ Base .\process.jl:439
 [6] run
   @ .\process.jl:438 [inlined]
 [7] |>(x::Cmd, f::typeof(run))
   @ Base .\operators.jl:858
 [8] top-level scope
   @ REPL[17]:1

mortenpi added a commit to mortenpi/Git.jl that referenced this pull request Jul 24, 2025
This is the first Julia version that can handle nested `addenv` calls on Windows (for JuliaVersionControl#64).
cgarling added 2 commits July 25, 2025 11:42
Need bugfix to multiple calls to `addenv` on Windows
@cgarling
Copy link
Contributor Author

Following from #67 (thanks for the help @mortenpi), julia = "~1.6.6, 1.7.3" is working on this PR but it looks like no CI is run against 1.7.3. Should we add something to the CI matrix to make sure we test against both 1.6.6 and 1.7.3?

@DilumAluthge
Copy link
Member

Should we add something to the CI matrix to make sure we test against both 1.6.6 and 1.7.3?

Yes, that sounds good.

the `version: 'min'` jobs hit Julia 1.6.6 but we should also test against 1.7.3 on Windows to ensure the `addenv` calls are working
@cgarling
Copy link
Contributor Author

Okay I added a single job on windows-latest with Julia 1.7.3 which passes (see here). It doesn't seem worth it to add 1.7.3 to the full CI matrix since we only ever had issues on Windows, but LMK if you'd like to have it in the matrix so we run Julia 1.7.3 against all the other OSes as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants