Skip to content

Conversation

zahidblackduck
Copy link
Collaborator

@zahidblackduck zahidblackduck commented Jul 3, 2025

JIRA Ticket

IDETECT-4746

Description

This merge request fixes a bug in the Cargo Lockfile Detector where transitive dependencies were incorrectly identified as direct dependencies in the generated BOM.
This change ensures that only dependencies explicitly declared in Cargo.toml are marked as direct. All others are accurately classified as transitive.

Implementation Details

  • The CargoExtractor has been updated to:

    • Parse direct dependencies from Cargo.toml
    • Resolve root dependencies separately before performing DFS to collect transitives
    • If Cargo.toml is not found, falls back to the previous implementation
  • The transformToGraph(...) method in CargoLockPackageTransformer now:

    • Accepts a Set<NameVersion> of root dependencies
    • Adds only those as root nodes in the dependency graph
    • Correctly links transitive dependencies as children

Impact

  • Ensures BOMs generated via the Cargo Lockfile Detector reflect accurate dependency relationships
  • Reduces false positives where transitive packages were mistakenly marked as direct
  • Improves reliability of policy evaluation and SBOM auditing in Rust projects scanned without the Cargo CLI

N.B: This merge request is meant for detect release 10.7

@zahidblackduck zahidblackduck self-assigned this Jul 3, 2025
@zahidblackduck zahidblackduck marked this pull request as draft July 3, 2025 04:59
@zahidblackduck zahidblackduck changed the title Fix: Correct direct vs transitive dependency classification in Cargo Lockfile Detector**JIRA Ticket** Fix: Correct direct vs transitive dependency classification in Cargo Lockfile Detector Jul 3, 2025
Copilot

This comment was marked as outdated.

@zahidblackduck zahidblackduck marked this pull request as ready for review July 3, 2025 10:14
@zahidblackduck zahidblackduck requested a review from Copilot July 3, 2025 10:15
Copilot

This comment was marked as outdated.

@zahidblackduck zahidblackduck requested a review from Copilot July 4, 2025 17:01
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR fixes the classification of direct vs. transitive dependencies in the Cargo Lockfile Detector’s BOM by tracking only those declared in Cargo.toml as roots.

  • The transformer now takes an explicit rootDependencies set and only marks those as root nodes.
  • The extractor collects direct dependencies from Cargo.toml, resolves versions of unnamed deps, and passes the roots through to the transformer.
  • Tests were updated to call the new two-argument transformToGraph and supply the expected root set.

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.

File Description
detectable/src/test/java/.../CargoLockPackageTransformerTest.java Updated imports and replaced single-arg transformToGraph calls with two-arg version, supplying roots.
detectable/src/main/java/.../CargoLockPackageTransformer.java Changed transformToGraph signature to accept a Set<NameVersion> of roots and adjust root-node logic.
detectable/src/main/java/.../CargoTomlParser.java Added a null-check around the dependency-type filter to include all when filter is null.
detectable/src/main/java/.../CargoExtractor.java Gathered resolvedRootDependencies, resolved missing versions, and passed roots into the transformer.
Comments suppressed due to low confidence (3)

detectable/src/main/java/com/blackduck/integration/detectable/detectables/cargo/CargoExtractor.java:119

  • [nitpick] The parameter resolvedRootDependenciesOut is not immediately clear. Rename it to something like rootDependenciesSink or simply rootDependencies to improve readability.
    private void processTransitiveDependenciesForInclusion(

detectable/src/test/java/com/blackduck/integration/detectable/detectables/cargo/transform/CargoLockPackageTransformerTest.java:34

  • Add a test where rootDependencies is empty to verify the fallback behavior (all lockfile packages become roots) matches the previous one-argument logic.
        Set<NameVersion> rootDependencies = new HashSet<>();

detectable/src/main/java/com/blackduck/integration/detectable/detectables/cargo/transform/CargoLockPackageTransformer.java:38

  • Avoid creating a new NameVersion object on each iteration for the contains check. Consider precomputing a map of (name, version) keys or using a canonical identifier to reduce object churn.
            if (rootDependencies.isEmpty() || rootDependencies.contains(new NameVersion(name, version))) {

@zahidblackduck zahidblackduck marked this pull request as draft July 4, 2025 17:04
CargoLockPackageTransformer cargoLockPackageTransformer = new CargoLockPackageTransformer();

assertThrows(DetectableException.class, () -> cargoLockPackageTransformer.transformToGraph(input));
assertThrows(DetectableException.class, () -> cargoLockPackageTransformer.transformToGraph(input, rootDependencies));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know the method that returns an exception here, verifyNoDuplicatePackages(), is not modified in the PR but I wanted to understand it better. Since Cargo allows multiple versions of the same crate (unlike many other pkg mgrs), what is the purpose of that method? There may be a disconnect between my understanding of a unique crate name/version versus a CargoLockPackage definition. Thanks in advance for your explanation!

Copy link
Collaborator Author

@zahidblackduck zahidblackduck Jul 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The verifyNoDuplicatePackages() in the CargoLockPackageTransformer class ensures that when a dependency is listed in Cargo.lock without a version (e.g., "bitflags"), there should only be one resolved version of that package in the Cargo.lock file. This is based on the assumption that unversioned entries indicate unambiguous resolution.

For example, this entry in a Cargo.lock package:

[[package]]
name = "libredox"
version = "0.1.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c0ff37bd590ca25063e35af745c343cb7a0271906fb7b37e4813e8f79f00268d"
dependencies = [
 "bitflags",
 "libc",
]

This implies that only one version of bitflags exists in the lock file:

[[package]]
name = "bitflags"
version = "2.9.0"

If, however, there are multiple versions of bitflags present:

[[package]]
name = "bitflags"
version = "2.8.0"

[[package]]
name = "bitflags"
version = "2.9.0"

Then the lookup becomes ambiguous, since "bitflags" without a version doesn't specify which one to use. In such cases, verifyNoDuplicatePackages() fails deliberately to avoid incorrect resolution.

You can see a valid case in https://github.com/casey/just, where "bitflags" is listed without a version, because only one version is present in the lockfile.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like that code has been that way for a while but this discussion does make me curious whether the example of bitflags with multiple versions is indeed invalid (impossible?) or if it should be handled differently. Maybe we can get more into this next time we chat.

@shantyk
Copy link
Contributor

shantyk commented Jul 29, 2025

  • Please add release note entry

@@ -53,6 +54,9 @@ public Extraction extract(File cargoLockFile, @Nullable File cargoTomlFile, Carg
List<CargoLockPackageData> filteredPackages = cargoLockPackageDataList;
boolean exclusionEnabled = isDependencyExclusionEnabled(cargoDetectableOptions);
String cargoTomlContents = null;
Set<NameVersion> dependenciesToInclude;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be better to declare this where it's initialized. This saves a line of code and also reduces the likelihood of someone trying to use a null value.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly curious: what led you to make these changes in a maven test?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I switched from List.of(..) to Arrays.asList(..) because the project targets Java 8 in build.gradle, and List.of(..) is only available from Java 9 onward, thus causing a cannot find symbol error when running tests in IntelliJ. Also, since we wrap the result in new ArrayList<>(...), the immutability benefit of List.of(..) is lost anyway, so using Arrays.asList(..) doesn't change the actual behavior of the test.
I'm not sure though whether we should go with the change. If we are against this, I can revert the change.

@@ -53,6 +54,9 @@ public Extraction extract(File cargoLockFile, @Nullable File cargoTomlFile, Carg
List<CargoLockPackageData> filteredPackages = cargoLockPackageDataList;
boolean exclusionEnabled = isDependencyExclusionEnabled(cargoDetectableOptions);
String cargoTomlContents = null;
Set<NameVersion> dependenciesToInclude;
Set<NameVersion> resolvedRootDependencies = new HashSet<>();
Map<String, List<CargoLockPackageData>> packageLookupMap = indexPackagesByName(filteredPackages);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like now in this class we're calling indexPackagesByName() for the same list of packages more than once. Let's try to avoid this.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right. We can surely avoid multiple calls to the method by passing around the map wherever needed. Will update accordingly.

@zahidblackduck zahidblackduck marked this pull request as ready for review July 31, 2025 10:04
@@ -30,7 +30,7 @@

### Resolved issues

*
* (IDETECT-4746) – Fixed incorrect labeling of transitive dependencies as direct in the Cargo Lockfile Detector.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit - from: "... in the Cargo Lockfile Detector." to "... by the Cargo Lockfile Detector."

@zahidblackduck zahidblackduck requested a review from cpottsbd July 31, 2025 14:56
@@ -30,7 +30,7 @@

### Resolved issues

*
* (IDETECT-4746) – Fixed incorrect labeling of transitive dependencies as direct by the Cargo Lockfile Detector.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My apologies for the delay, when I read the line back over I thought of this rewording idea:

  • (IDETECT-4746) – Fixed Cargo Lockfile Detector incorrectly labeling transitive dependencies as direct dependencies.

@zahidblackduck zahidblackduck merged commit 8da9c5d into master Aug 8, 2025
@zahidblackduck zahidblackduck deleted the dev/zahidblackduck/IDETECT-4746 branch August 29, 2025 18:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants