-
-
Notifications
You must be signed in to change notification settings - Fork 80
Optimize git library installation #607
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@kLabz Could you check if this makes sense git-wise? |
I don't think I'll have time today, so will look at this tomorrow. It likely breaks some workflow I use from time to time (using git dependency, and doing git commands in |
Somehow this PR causes #591 to be triggered on Windows with pretty much every |
It should be possible to get the full commit id from the short one once you have added a git remote:
It might be worth always fetching the commit id so as not to lose your optimisation when a branch or tag is specified instead of a commit id. This can be done this way:
Hope this helps. |
This doesn't work if the commit hasn't been fetched:
Not sure I understand, which optimisation are you talking about that would be lost? |
Any progress on this? |
The main issue that's blocking this is #591, there is no point in optimising git commands if it will break them 😅. Maybe it can work in development with the interp target and hxb, but that would mean that it doesn't work with haxe 4.3.x. |
Any news on this? |
Reduce reliance on global state. Change api with regards to detecting local changes before an update. The caller now calls hasLocalChanges() and optionally resetLocalChanges(), rather than having to pass in a callback to handle the decision.
Remove use of the AlreadyUpToDate exception control flow See: HaxeFoundation#510 (comment)
This allows us to keep track of how a library was installed, and therefore will allow for consistent update behaviour even if a shallow clone is performed.
If only a commit or tag was specified, there is no simple way to determine what an update is meant to do. The commmit may belong to multiple branches, and there may be multiple newer tags which have different purposes (e.g. stable vs alpha release tags). It makes more sense to simply lock to the specified commit when a user installs this way. Also, the git show-branch method would not work with a shallow git history. For hg, to achieve the same behaviour as git it is necessary to strip any newer changesets to the checked out one and also remove the reference to the upstream repository so that no further updates are pulled. The upstream is still stored as `haxelib_url` so it may still be retrieved by Hg.getOriginUrl.
Instead of creating a separate CantCheckoutCommit enum constructor, they have been merged together into CantCheckout
With git installs we now do a hard reset on update. Previously, user committed changes would prevent a merge from occurring successfully so they wouldn't be lost, but now the local commits won't stop the hard reset, so we need to perform this check manually.
This is useful if both a branch and commit are set
Updating should not do anything if only a commit/tag was specified at install. Added a new test for cloning only with a commit hash (and later updating). Merged the two tests for update with local changes, as the logic for confirming local change reset is now performed externally which leads to cleaner tests here.
If we are locked to a commit or tag, leave it unspecified whether mergeRemoteChanges will throw an exception, just require that it doesn't update the repository. This is to simplify the hg implementation, where `hg update` does not give an error in this case.
Test that updates behave correctly also on a haxelib command level. Also add a git specific test for short commit ids.
It may be desired to omit the subdir argument but pass in a tag, but currently this is not possible and it is necessary to pass an empty string as the subdir. This should not result in a dev path being set. There is an example that requires this behaviour in the newly added integration tests, see tests.integration.TestVcs.testVcsUpdateTag().
The git implementation doesn't check for untracked changes. For now, to pass the tests, just match the git behaviour, though, in future we probably want to check for them to avoid them being wiped on library reinstall.
Git.clone() is currently broken due to: HaxeFoundation/haxe#11666
The cause for the crashes has been resolved now! I've updated the branch to resolve conflicts, and I've also split out the commits into more atomic changes and added more descriptive messages (would be good to rebase rather than squash to preserve history). This is now ready for review. Mercurial now also is consistent with Git's behaviour. I've also taken out the changes for #512, which will come in a separate PR. I also had to disable the cpp client tests, because default arguments in abstract classes are broken on cpp, see: HaxeFoundation/haxe#11666. |
Tested on go2hx git library with help from @tobil4sk . It's over 200MB if not shallow cloned, and with 3 different git submodules. The time it took to install from git in total went from 14 seconds to 3 seconds on my machine, using this branch. |
src/haxelib/api/GlobalScope.hx
Outdated
} | ||
|
||
if (data.subDir != null) { | ||
if (!(data == null || data.subDir == "" || data.subDir == null)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks weird, could we De Morgan it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated
'haxelib_path=${Path.join([Sys.getCwd(), "haxelib"])}' | ||
]); | ||
runCommand("neko", ["bin/integration_tests.n"]); | ||
// runCommand("neko", ["bin/integration_tests.n"]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this supposed to be disabled now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes they should be for now, it is explained in the commit message of 824c06d. This test run uses the cpp binary as the haxelib client (specified by haxelib_path
), but that is currently broken. The test with the regular neko client still runs on the lines above.
src/haxelib/VersionData.hx
Outdated
Returns an object containing the filled-in VcsData fields, | ||
without the empty ones. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Uhm, what? I struggle to make sense of both this description and the implementation. What is the point of this function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It converts a VcsData class object into an anonymous object with the fields that actually carry any information. This is useful when you want to write to a json file, where including the null fields is pointless.
This is haxelib's cli for git installs:
haxelib git [project-name] [git-clone-path] [branch] [subDir] [tag]
So if you want to install with a tag but not a branch or subDir, you have to put both of those anyway as empty strings. The method also excludes fields that have empty values like this, which don't actually hold any useful information about the installation.
Add a description for isReproducible() and clairfy the purpose of getCleaned()
Closes #493. For most git installs, a (faster) shallow clone can now be performed with
--depth=1
, which avoids cloning the entire history and instead only clones one commit's worth of history, including for the submodules. The only exception where a shallow clone cannot be performed is when setting a library to a specific commit with a short commit id, for example:This is due to how commit ids are resolved.
The data which was used to install the library is now stored in a
.gitdata
file, to avoid issues of overwriting externally created commits.Now when updating a git library, it will now only update if it was installed with the default branch or a specific branch. If it was installed with a specific commit or tag, it remains at that commit/tag, since there is no clear commit to update to.
Mercurial works differently, and doesn't have a concept of a shallow clone so I've just left it more or less as is.