Skip to content

Conversation

tobil4sk
Copy link
Member

@tobil4sk tobil4sk commented Jul 10, 2023

Closes #493. For most git installs, a (faster) shallow clone can now be performed with --depth=1, which avoids cloning the entire history and instead only clones one commit's worth of history, including for the submodules. The only exception where a shallow clone cannot be performed is when setting a library to a specific commit with a short commit id, for example:

# Reverts to full non-shallow clone:
haxelib git lime https://github.com/openfl/lime c16f278
# Performs shallow clone:
haxelib git lime https://github.com/openfl/lime c16f27818dcff17e8b5e4648624a64ea6a107119

This is due to how commit ids are resolved.

The data which was used to install the library is now stored in a .gitdata file, to avoid issues of overwriting externally created commits.

Now when updating a git library, it will now only update if it was installed with the default branch or a specific branch. If it was installed with a specific commit or tag, it remains at that commit/tag, since there is no clear commit to update to.

Mercurial works differently, and doesn't have a concept of a shallow clone so I've just left it more or less as is.

@Simn
Copy link
Member

Simn commented Oct 22, 2023

@kLabz Could you check if this makes sense git-wise?

@kLabz
Copy link
Contributor

kLabz commented Oct 22, 2023

I don't think I'll have time today, so will look at this tomorrow.

It likely breaks some workflow I use from time to time (using git dependency, and doing git commands in .haxelib/somelib/git folder like changing branches etc.) but I should probably use haxelib dev for that as it's easy to lose changes like that...

@tobil4sk
Copy link
Member Author

Somehow this PR causes #591 to be triggered on Windows with pretty much every haxelib git command :/ Probably requires getting rid of neko to fix it, I can't figure out what exactly is going on...

@dpomier
Copy link

dpomier commented Jan 18, 2024

The only exception where a shallow clone cannot be performed is when setting a library to a specific commit with a short commit id

It should be possible to get the full commit id from the short one once you have added a git remote:

$ git rev-parse 3cdd5d
3cdd5d19178a54d2e51b5098d43b57571241d0ab

Now when updating a git library, it will now only update if it was installed with the default branch or a specific branch. If it was installed with a specific commit or tag, it remains at that commit/tag, since there is no clear commit to update to.

It might be worth always fetching the commit id so as not to lose your optimisation when a branch or tag is specified instead of a commit id. This can be done this way:

$ git ls-remote https://github.com/HaxeFoundation/haxelib.git HEAD     
98637027327d8cf385d302acaaf104bd6107d2bf        HEAD

$ git ls-remote https://github.com/HaxeFoundation/haxelib.git 4.1.x
f17fffa97554b1bdba37750e3418051f017a5bc2        refs/heads/4.1.x

Hope this helps.

@tobil4sk
Copy link
Member Author

It should be possible to get the full commit id from the short one once you have added a git remote:

This doesn't work if the commit hasn't been fetched:

$ git rev-parse 558fd07 
558fd07
fatal: ambiguous argument '558fd07': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'

It might be worth always fetching the commit id so as not to lose your optimisation when a branch or tag is specified instead of a commit id.

Not sure I understand, which optimisation are you talking about that would be lost?

@MAJigsaw77
Copy link

Any progress on this?

@tobil4sk
Copy link
Member Author

tobil4sk commented Aug 8, 2024

The main issue that's blocking this is #591, there is no point in optimising git commands if it will break them 😅. Maybe it can work in development with the interp target and hxb, but that would mean that it doesn't work with haxe 4.3.x.

@PXshadow
Copy link

Any news on this?

tobil4sk added 16 commits July 12, 2025 16:38
Reduce reliance on global state.

Change api with regards to detecting local changes before an update. The
caller now calls hasLocalChanges() and optionally resetLocalChanges(),
rather than having to pass in a callback to handle the decision.
Remove use of the AlreadyUpToDate exception control flow

See: HaxeFoundation#510 (comment)
This allows us to keep track of how a library was installed, and
therefore will allow for consistent update behaviour even if a shallow
clone is performed.
If only a commit or tag was specified, there is no simple way to
determine what an update is meant to do. The commmit may belong to
multiple branches, and there may be multiple newer tags which have
different purposes (e.g. stable vs alpha release tags). It makes more
sense to simply lock to the specified commit when a user installs this
way.

Also, the git show-branch method would not work with a shallow git
history.

For hg, to achieve the same behaviour as git it is necessary to strip
any newer changesets to the checked out one and also remove the
reference to the upstream repository so that no further updates are
pulled. The upstream is still stored as `haxelib_url` so it may still be
retrieved by Hg.getOriginUrl.
Instead of creating a separate CantCheckoutCommit enum constructor, they
have been merged together into CantCheckout
With git installs we now do a hard reset on update. Previously, user
committed changes would prevent a merge from occurring successfully so
they wouldn't be lost, but now the local commits won't stop the hard
reset, so we need to perform this check manually.
This is useful if both a branch and commit are set
Updating should not do anything if only a commit/tag was specified at
install.

Added a new test for cloning only with a commit hash (and later
updating).

Merged the two tests for update with local changes, as the logic for
confirming local change reset is now performed externally which leads to
cleaner tests here.
If we are locked to a commit or tag, leave it unspecified whether
mergeRemoteChanges will throw an exception, just require that it doesn't
update the repository. This is to simplify the hg implementation, where
`hg update` does not give an error in this case.
Test that updates behave correctly also on a haxelib command level. Also
add a git specific test for short commit ids.
It may be desired to omit the subdir argument but pass in a tag, but
currently this is not possible and it is necessary to pass an empty
string as the subdir. This should not result in a dev path being set.

There is an example that requires this behaviour in the newly added
integration tests, see tests.integration.TestVcs.testVcsUpdateTag().
The git implementation doesn't check for untracked changes. For now, to
pass the tests, just match the git behaviour, though, in future we
probably want to check for them to avoid them being wiped on library
reinstall.
Git.clone() is currently broken due to:
HaxeFoundation/haxe#11666
@tobil4sk
Copy link
Member Author

The cause for the crashes has been resolved now! I've updated the branch to resolve conflicts, and I've also split out the commits into more atomic changes and added more descriptive messages (would be good to rebase rather than squash to preserve history). This is now ready for review.

Mercurial now also is consistent with Git's behaviour.

I've also taken out the changes for #512, which will come in a separate PR.

I also had to disable the cpp client tests, because default arguments in abstract classes are broken on cpp, see: HaxeFoundation/haxe#11666.

@PXshadow
Copy link

Tested on go2hx git library with help from @tobil4sk . It's over 200MB if not shallow cloned, and with 3 different git submodules. The time it took to install from git in total went from 14 seconds to 3 seconds on my machine, using this branch.

}

if (data.subDir != null) {
if (!(data == null || data.subDir == "" || data.subDir == null)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks weird, could we De Morgan it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated

'haxelib_path=${Path.join([Sys.getCwd(), "haxelib"])}'
]);
runCommand("neko", ["bin/integration_tests.n"]);
// runCommand("neko", ["bin/integration_tests.n"]);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this supposed to be disabled now?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes they should be for now, it is explained in the commit message of 824c06d. This test run uses the cpp binary as the haxelib client (specified by haxelib_path), but that is currently broken. The test with the regular neko client still runs on the lines above.

Comment on lines 70 to 71
Returns an object containing the filled-in VcsData fields,
without the empty ones.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uhm, what? I struggle to make sense of both this description and the implementation. What is the point of this function?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It converts a VcsData class object into an anonymous object with the fields that actually carry any information. This is useful when you want to write to a json file, where including the null fields is pointless.

This is haxelib's cli for git installs:

haxelib git [project-name] [git-clone-path] [branch] [subDir] [tag]

So if you want to install with a tag but not a branch or subDir, you have to put both of those anyway as empty strings. The method also excludes fields that have empty values like this, which don't actually hold any useful information about the installation.

tobil4sk added 2 commits July 14, 2025 00:06
Add a description for isReproducible() and clairfy the purpose of
getCleaned()
@Simn Simn merged commit ec6a782 into HaxeFoundation:development Jul 14, 2025
2 checks passed
@tobil4sk tobil4sk deleted the optimize-git branch July 14, 2025 22:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

haxelib git without .git folder?
6 participants