Skip to content

Conversation

kjughx
Copy link
Contributor

@kjughx kjughx commented Jul 23, 2025

Describe your changes

cargo-acap-sdk compains:

cp: warning: behavior of -n is non-portable and may change in future; use --update=none instead
cp: warning: behavior of -n is non-portable and may change in future; use --update=none instead

so let's use '--update=none' instead.

Checklist before requesting a review

  • [x ] I have performed a self-review of my own code
  • [ x] I have verified that the code builds perfectly fine on my local system
  • I have added tests that prove my fix is effective or that my feature works
  • I have commented my code, particularly in hard-to-understand areas
  • [ x] I have verified that my code follows the style already available in the repository
  • I have made corresponding changes to the documentation

@kjughx kjughx requested a review from a team as a code owner July 23, 2025 12:59
@apljungquist apljungquist changed the title fix(acap-build): Use '--update=none' instead of '-n' fix(acap-build): Use --update=none instead of -n Jul 23, 2025
apljungquist
apljungquist previously approved these changes Jul 23, 2025
Copy link
Collaborator

@apljungquist apljungquist left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm. The best thing would be to not call out to cp, not least because these arguments are not supported and/or the same on every version of cp, but I struggled to figure out how when I first wrote this code.

@kjughx
Copy link
Contributor Author

kjughx commented Jul 23, 2025

Actually, it turns out the '--update=none' is not supported on older Ubuntu, I think before 22.04...

@kjughx
Copy link
Contributor Author

kjughx commented Jul 24, 2025

Okay I had an idea to just use tar and archive all the files instead of copying recursively. We can archive them in memory, then write them to the staging dir. Then there's no need to use external cp and relative symlinks work out of the box.

The EAPs built are almost identical, the only difference is that acap-build writes a field in the manifest with the architecture that is now preserved. I guess before it was overwritten with the manifest in the src/ directory.
Nvm, that was my mistake. All files are identical except the EAP which as far as I can tell differs even between consecutive builds.
Let me know what you think.

@kjughx kjughx marked this pull request as draft July 24, 2025 10:00
@apljungquist
Copy link
Collaborator

apljungquist commented Jul 24, 2025

Reproducibility is important to me and I wonder why do the checksums change?

@kjughx
Copy link
Contributor Author

kjughx commented Jul 24, 2025

Reproducibility is important to me and I wonder why do the checksums change?

Not sure tbh, for me all the files except the EAPs are identical. Before I did into it, do you think this change is a good idea?

@apljungquist
Copy link
Collaborator

apljungquist commented Jul 24, 2025

If it preserves the current behavior and removes calls to external programs, then I think it's an improvement. But I don't understand it yet, so I don't have an opinion on whether it works.

@kjughx
Copy link
Contributor Author

kjughx commented Jul 24, 2025

If it preserves the current behavior and removes calls to external programs, then I think it's an improvement. But I don't understand it yet, so I don't have an opinion on whether it works.

Alright, so the reason using_a_build_script differed was a mistake on my side, deleted an extra line that keeps track of archived files.

For challenge_build_tools, the reason is that the executable bit is cleared from pre-install.sh and post-install.sh on main but with this change they're preserved:

main:
.rw-r--r--   61 philjn  1 Jan  1970 cgi.conf
.rwxr-xr-x 440k philjn  1 Jan  1970 challenge_build_tools
.rw-r--r--   30 philjn  1 Jan  1970 LICENSE
.rw------- 1.6k philjn  1 Jan  1970 manifest.json
.rw-r--r--  565 philjn  1 Jan  1970 package.conf
.rw-r--r--   38 philjn  1 Jan  1970 param.conf
.rw-r--r--   26 philjn  1 Jan  1970 post-install.sh
.rw-r--r--   26 philjn  1 Jan  1970 pre-uninstall.s
cp-arg:
.rw-r--r--   61 philjn  1 Jan  1970 cgi.conf
.rwxr-xr-x 440k philjn  1 Jan  1970 challenge_build_tools
.rw-r--r--   30 philjn  1 Jan  1970 LICENSE
.rw------- 1.6k philjn  1 Jan  1970 manifest.json
.rw-r--r--  565 philjn  1 Jan  1970 package.conf
.rw-r--r--   38 philjn  1 Jan  1970 param.conf
.rwxr-xr-x   26 philjn  1 Jan  1970 post-install.sh
.rwxr-xr-x   26 philjn  1 Jan  1970 pre-uninstall.sh

Tbh I'm not sure which is right. I think it makes more sense to preserve the state of the sources but I might be missing something. Do you remember why it cargo-acap-build has preserve_permissions set to false?

Philip Johansson added 2 commits July 24, 2025 15:12
It doesn't do what you think... only extended permissions (like suid).
@kjughx kjughx changed the title fix(acap-build): Use --update=none instead of -n fix(acap-build): Archive and extract files instead of cp Jul 24, 2025
@kjughx kjughx changed the title fix(acap-build): Archive and extract files instead of cp fix(acap-build): Archive and extract files instead of 'cp' Jul 24, 2025
@kjughx kjughx changed the title fix(acap-build): Archive and extract files instead of 'cp' fix(acap-build): Archive and extract files instead of cp Jul 24, 2025
@apljungquist
Copy link
Collaborator

apljungquist commented Jul 24, 2025

I think acap-build supports:

  • preserving permissions in order to allow us to create bit-exact matches of what the upstream tool would create. This was part of my plan to create a full rust port of acap-build and get it upstreamed or widely adopted some other way. Maybe it is time to call time on that experiment.
  • discarding permissions in order to allow cargo-acap-build to create artifacts that are reproducible across machines by default. In fix!: Make builds reproducible in dev-container #99 I noted:

crates/cargo-acap-build/src/acap.rs:

  • Replace standard copy function to avoid preserving permission bits;
    Since files like manifest.json are tracked in git and the code is
    typically checked out on host, they will have their permission bits
    set on host causing even containerized builds to differ across
    machines.

But to be honest I feel like the artifacts aren't reproducible across machines; I often find that I get different checksums in my devcontainer compared to CI. At least for the challenge build tools example. (🤔 I think there may be two issues. The first is the mode set on the files when they are checked out. The second issue is the bits set when files are copied inside the dev container; here the host umask affects the mode set inside the container, even if we copy without preserving bits. I think the latter is what has been haunting me lately, probably coincides with me switching from Debian/Ubuntu to MacOS).

I don't know if that helps. I will try to take a closer look at your code and the commit history over the weekend to figure out what I think we should do.

Copy link
Collaborator

@apljungquist apljungquist left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Specific feedback

I think:

  1. Clearing the executable bit on symlinks is OK. If this is done the same on all systems, it may even be preferred to the current implementation.
  2. We need to figure out why uing_a_build_script is different; it looks to me like the bar file is omitted and I suspect the tests would fail if you ran them.

General requirements

I think I have spent some time thinking about the requirements on this library. We have two dependants, each with slightly different requirements:

  • acap-build
  • cargo-acap-build

acap-build has one requirement: it should produce the exact same artifact as its namesake from the SDK.
Ideally, it should do this for any application on any machine, but it's only been verified to do so on the reproducible examples in acap-native-sdk-examples in their normal docker environment.
These should continue to work for as long as we have this binary crate.
I have a semi-automatic setup for verifying this that we can use when we make high-risk changes: AxisCommunications/acap-native-sdk-examples@main...apljungquist:acap-native-sdk-examples:acap-build-rust

cargo-acap-build has two main requirements:

  • Correctness.
  • Reproducibility.

Correctness. This is a bit tricky to define and to verify.
The most straight forward way of testing it is to run the tests in this repo that are designed to run on target. All other things being equal, these should continue working.
Another technique we can use is to verify that the checksums don't change and, when they change, make sure we can explain the change in its entirety.

Reproducibility: This means that the same commit always produces the same EAP artifacts.
I like to split it into reproducibility into two components:

  • across time.
  • across machines.

To make sure artifacts are reproducible across time, we track their checksum in git, recompute them in CI and fail if they differ. It would ideally be verified on all machines, but the best we can do in practice is to also test it manually as PR author and reviewer.

To make sure artifacts are reproducible across machines, the best we can do currently is to verify that we can reproduce the checksums from CI on our local machines as PR author and reviewer.
We know this is not 100% working at the moment as I get a different checksum for inspect_env:

diff --git a/apps-aarch64.checksum b/apps-aarch64.checksum
index 811169a..203e235 100644
--- a/apps-aarch64.checksum
+++ b/apps-aarch64.checksum
@@ -7,7 +7,7 @@ cb773a1f92cbf3b18a3869ec386c6794d2daf57b  target-aarch64/acap/consume_analytics_
 16310c12bb50db85d69c744d13c93ebf0c628953  target-aarch64/acap/embedded_web_page_0_0_0_aarch64.eap
 b34f6d4a3ce798cdc8c6fd6ee0f1de0aeef61340  target-aarch64/acap/event_subscribe_1_0_0_aarch64.eap
 60d4574e576c4762d1b51bdbe53f416d4a36f093  target-aarch64/acap/hello_world_0_0_0_aarch64.eap
-c8d98b2b6478a073a8f1893fcf8698fb204180b1  target-aarch64/acap/inspect_env_0_0_0_aarch64.eap
+b76f30e26e4ab6a4c3c0fae45dd31f0545b81535  target-aarch64/acap/inspect_env_0_0_0_aarch64.eap
 bfb74d7d675b6adcd48fb0be3e542fcfdb0c888e  target-aarch64/acap/licensekey_handler_0_0_0_aarch64.eap
 6fe4c136fb1ee7c3ec9fbd82240505402e7d4e60  target-aarch64/acap/object_detection_1_0_0_aarch64.eap
 0c0711cbf6c20699909d712175ef4a806df3720b  target-aarch64/acap/reverse_proxy_0_0_0_aarch64.eap

To do this, I think it's acceptable, maybe even preferable, to not preserve information from the host file system.
For example:

  • if a file is meant to be a symlink, then the build system should ensure that we end up with a symlink that is valid on target.
  • if a file is meant to not be readable by all processes on target, then the build system should ensure that and not rely on the host system setting the permission bits correctly. I'm not sure to what extent this is respected by the ACAP-installer though.

So, in short:

  1. acap-build must produce exactly the same artifacts as the official tool.
  2. cargo-acap-build must produce correct EAP files. We verify this by running the tests of the apps on target.
  3. cargo-acap-build must produce EAP files that are no less reproducible than they are today. We verify this by reproducing the checksums locally.

Additional thoughts

I was previously planning to add a third backend to acap-build that I wanted to call Compatible:

enum AcapBuildImpl {
    Reference,
    Equivalent,
}

My idea was that this implementation would not be burdened by the requirement that the EAP files exactly match the EAP files produced by the official acap-build, and that by sharing code, such as package conf creation, with the implementation that produces an exact match we would help ensure that the EAP files are indeed compatible. It could use the tar package directly to avoid having a staging directory and possibly use faster compression implementations. I say possibly because I tried pigz and found that it was faster to compress, but slower to decompress on the camera ending up taking more time overall 😐 The stretch goal was, as I mentioned earlier, to replace the upstream acap-build with something that is easier to use as a library and easier to extract from the SDK container image; their current implementation has a lot of dependencies on python scripts, shell scripts and other files that need to be on the path, or somethings, in a specific location.

pub fn build(mut self) -> anyhow::Result<OsString> {
let cursor = Cursor::new(self.ar.take().unwrap().into_inner()?.into_inner());
let mut archive = Archive::new(cursor);
archive.set_preserve_permissions(self.preserve_permissions);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pub fn set_preserve_permissions(&mut self, preserve: bool)
Indicate whether extended permissions (like suid on Unix) are preserved when unpacking this entry.

I wonder if this does what you expect

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does not, unfortunately.

* upstream/main:
  fix(challenge_build_tools): Use valid parameter name (AxisCommunications#206)
@kjughx
Copy link
Contributor Author

kjughx commented Jul 28, 2025

Clearing the executable bit on symlinks is OK. If this is done the same on all systems, it may even be preferred to the current implementation.

Symlinks have all permission bits set always, and trying to change them, with chmod only affects the target of the link.

We need to figure out why uing_a_build_script is different; it looks to me like the bar file is omitted and I suspect the tests would fail if you ran them.

Yes you're right, I forgot to keep track of them but I'll fix that.

@kjughx
Copy link
Contributor Author

kjughx commented Jul 28, 2025

preserving permissions in order to allow us to create bit-exact matches of what the upstream tool would create
To do this, I think it's acceptable, maybe even preferable, to not preserve information from the host file system.

Considering that the 'normal' acap-build is meant to be run in the source directory, wouldn't it make more sense to exactly preserve the information?

@apljungquist
Copy link
Collaborator

preserving permissions in order to allow us to create bit-exact matches of what the upstream tool would create
To do this, I think it's acceptable, maybe even preferable, to not preserve information from the host file system.

Considering that the 'normal' acap-build is meant to be run in the source directory, wouldn't it make more sense to exactly preserve the information?

Do you combine two separate requirements in the same quote intentionally?

The first sentence is about the requirements imposed on the lib by the acap-build binary crate, the second is about the requirements imposed on the lib by cargo-acap-build

With more context:

preserving permissions in order to allow us to create bit-exact matches of what the upstream tool would create. This was part of my plan to create a full rust port of acap-build and get it upstreamed or widely adopted some other way. Maybe it is time to call time on that experiment.

cargo-acap-build has two main requirements:
...
Reproducibility: This means that the same commit always produces the same EAP artifacts.
...
across machines.
...
To do this, I think it's acceptable, maybe even preferable, to not preserve information from the host file system.

@apljungquist
Copy link
Collaborator

apljungquist commented Jul 28, 2025

Symlinks have all permission bits set always

FYI, I think this depends on your umask. Inspecting a symlink with ls on my machine running MacOS:

$ ls -al apps/inspect_env/post-install
lrwxr-xr-x  1 anlj  staff  11 May  9 12:38 apps/inspect_env/post-install -> inspect_env

Compared to Debian:latest in docker:

$ ln -s foo bar
$ ls -al bar
lrwxrwxrwx   1 root root    3 Jul 28 12:28 bar -> foo

I don't think this impacts our decision making, but it could be good to know.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants