diff --git a/microsoft/winget/Dockerfile b/microsoft/winget/Dockerfile
new file mode 100644
index 00000000..06aefcc3
--- /dev/null
+++ b/microsoft/winget/Dockerfile
@@ -0,0 +1,21 @@
+FROM ubuntu:22.04
+
+# Install bash and sqlite3 for database verification
+RUN apt-get update && apt-get install -y bash sqlite3 && rm -rf /var/lib/apt/lists/*
+
+# Create Windows-style directory structure for WinGet databases
+RUN mkdir -p /Users/test/AppData/Local/Packages/Microsoft.DesktopAppInstaller_8wekyb3d8bbwe/LocalState/Microsoft.Winget.Source_8wekyb3d8bbwe/
+RUN mkdir -p /Users/test/AppData/Local/Packages/Microsoft.DesktopAppInstaller_8wekyb3d8bbwe/LocalState/StoreEdgeFD/
+RUN mkdir -p /ProgramData/Microsoft/Windows/AppRepository/
+
+# Create app directory for scalibr binary
+RUN mkdir -p /app
+
+# Copy WinGet test database files into the container
+COPY testdata/ /
+
+# Set working directory
+WORKDIR /app
+
+# Default command: start bash so the container stays alive interactively
+CMD ["/bin/bash"]
diff --git a/microsoft/winget/README.md b/microsoft/winget/README.md
new file mode 100644
index 00000000..90b3076b
--- /dev/null
+++ b/microsoft/winget/README.md
@@ -0,0 +1,144 @@
+# OSV-Scalibr: Windows Package Manager (WinGet) Extractor
+
+This directory contains the test Docker setup for testing OSV-Scalibr's WinGet extractor plugin. Windows Package Manager (WinGet) is Microsoft's official package manager for Windows systems that stores its database files in SQLite format at specific system locations.
+
+## Overview
+
+The WinGet extractor analyzes installed Windows packages by reading SQLite database files created by the Windows Package Manager. This testbed simulates the Windows file system structure and provides sample WinGet databases for testing purposes.
+
+## WinGet Database Locations
+
+The WinGet extractor looks for databases at these Windows paths:
+
+1. **User-installed packages**: `%LOCALAPPDATA%/Packages/Microsoft.DesktopAppInstaller_8wekyb3d8bbwe/LocalState/Microsoft.Winget.Source_8wekyb3d8bbwe/installed.db`
+2. **Store Edge packages**: `%LOCALAPPDATA%/Packages/Microsoft.DesktopAppInstaller_8wekyb3d8bbwe/LocalState/StoreEdgeFD/installed.db`
+3. **System-wide repository**: `%PROGRAMDATA%/Microsoft/Windows/AppRepository/StateRepository-Machine.srd`
+
+## Test Database Contents
+
+This testbed includes three sample databases with the following packages:
+
+### User Installed Database
+- **Git.Git** v2.50.1 - Git version control system
+- **Microsoft.VisualStudioCode** v1.103.1 - Visual Studio Code editor
+- **Google.Chrome** v120.0.6099.109 - Google Chrome browser
+
+### Store Edge Database
+- **Microsoft.PowerShell** v7.4.1 - PowerShell terminal
+- **Mozilla.Firefox** v121.0.1 - Mozilla Firefox browser
+
+### System Repository Database
+- **Microsoft.WindowsTerminal** v1.18.3181.0 - Windows Terminal
+- **Microsoft.VCRedist.2015+.x64** v14.38.33135.0 - Visual C++ Redistributable
+
+## Setup Instructions
+
+### Build the Docker Image
+
+```bash
+cd security-testbeds/microsoft/winget
+docker build -t winget-test .
+```
+
+### Run the Container
+
+```bash
+docker run -it --rm -v $(pwd):/app winget-test
+```
+
+This will:
+- Start an interactive bash session
+- Mount the current directory as `/app` inside the container
+- Allow you to place the `scalibr` binary in `/app` and run tests
+
+### Running OSV-Scalibr on Windows
+
+Since the WinGet extractor requires Windows OS, extract the test data from the container to run on a Windows system:
+
+1. Run the Docker container to extract test data:
+```bash
+docker run --rm -v $(pwd)/extracted_testdata:/output winget-test cp -r /Users /ProgramData /output/
+```
+
+2. Copy the extracted test data to your Windows machine
+3. On Windows, run scalibr with the `--root` flag pointing to the extracted directory:
+
+```bash
+# Extract from all WinGet databases using extracted test data
+scalibr.exe --extractors=os/winget --result=winget_output.json --root=C:\path\to\extracted_testdata
+
+# Or target specific paths within the extracted data
+scalibr.exe --extractors=os/winget --result=winget_output.json --root=C:\path\to\extracted_testdata --paths=Users/test/AppData/Local,ProgramData/Microsoft
+```
+
+### Development/CI Testing (Linux)
+
+For development purposes, you can still use the Docker container to verify database contents, but note that the WinGet extractor will not run:
+
+1. Build or copy the `scalibr` binary to the current directory
+2. Run the container as shown above
+3. Inside the container, run scalibr with the WinGet extractor:
+
+```bash
+# This will show the expected "needs to run on a different OS" message
+./scalibr --extractors=os/winget --result=winget_output.json
+```
+
+### Verify Database Contents
+
+You can inspect the test databases using sqlite3:
+
+```bash
+# Inside the container
+sqlite3 /Users/test/AppData/Local/Packages/Microsoft.DesktopAppInstaller_8wekyb3d8bbwe/LocalState/Microsoft.Winget.Source_8wekyb3d8bbwe/installed.db
+
+# List tables
+.tables
+
+# Query packages
+SELECT i.id, n.name, v.version FROM manifest m
+JOIN ids i ON m.id = i.rowid
+JOIN names n ON m.name = n.rowid
+JOIN versions v ON m.version = v.rowid;
+```
+
+## Regenerating Test Data
+
+The `generate_testdata.py` script can be used to recreate the test databases with different package sets:
+
+```bash
+python3 generate_testdata.py
+```
+
+Edit the script to add new packages or modify existing ones before regenerating the databases.
+
+## Important Note: Windows-Only Extractor
+
+The WinGet extractor is designed to run **only on Windows systems** due to OS-specific requirements. When running in this Linux Docker container, you will see:
+
+```bash
+./scalibr --plugins=os/winget --result=output.textproto
+# Output: plugin os/winget can't be enabled: needs to run on a different OS than that of the scan environment
+```
+
+This is **expected behavior** and indicates the testbed is set up correctly.
+
+## Purpose of This Testbed
+
+This Docker setup serves several purposes:
+
+1. **Validation**: Provides a standardized environment to test WinGet database parsing logic
+2. **Development**: Allows developers to work with realistic WinGet database structures without Windows
+3. **CI/CD**: Can be used in automated testing pipelines to verify database schema compatibility
+4. **Reference**: Documents the expected WinGet database structure and file locations
+
+## Expected Output (on Windows)
+
+When running scalibr successfully on a Windows system, you should see extracted package information for 7 total packages across the three databases, including package names, versions, and metadata like monikers, channels, tags, and commands.
+
+## Troubleshooting
+
+- **"needs to run on a different OS"**: This is expected when running on non-Windows systems
+- **Database errors**: Verify the database files exist and have proper SQLite schema using the inspection commands above
+- **Permission issues**: Make sure the scalibr binary has execute permissions: `chmod +x scalibr`
+- **No packages found on Windows**: Ensure the WinGet extractor is enabled with `--plugins=os/winget`
\ No newline at end of file
diff --git a/microsoft/winget/generate_testdata.py b/microsoft/winget/generate_testdata.py
new file mode 100644
index 00000000..d18c367b
--- /dev/null
+++ b/microsoft/winget/generate_testdata.py
@@ -0,0 +1,261 @@
+#!/usr/bin/env python3
+"""
+Generate sample WinGet SQLite databases for testing OSV-Scalibr WinGet extractor.
+
+This script creates SQLite databases with the same schema and sample data
+used by the WinGet extractor tests.
+"""
+
+import sqlite3
+import os
+import sys
+
+
+def create_winget_database(db_path, packages):
+    """Create a WinGet SQLite database with the given packages."""
+
+    # Remove existing database if it exists
+    if os.path.exists(db_path):
+        os.remove(db_path)
+
+    # Ensure directory exists
+    os.makedirs(os.path.dirname(db_path), exist_ok=True)
+
+    conn = sqlite3.connect(db_path)
+    cursor = conn.cursor()
+
+    # Create schema (based on winget_test.go)
+    schema = """
+    CREATE TABLE [metadata](
+        [name] TEXT PRIMARY KEY NOT NULL,
+        [value] TEXT NOT NULL);
+    CREATE TABLE [ids](rowid INTEGER PRIMARY KEY, [id] TEXT NOT NULL);
+    CREATE UNIQUE INDEX [ids_pkindex] ON [ids]([id]);
+    CREATE TABLE [names](rowid INTEGER PRIMARY KEY, [name] TEXT NOT NULL);
+    CREATE UNIQUE INDEX [names_pkindex] ON [names]([name]);
+    CREATE TABLE [monikers](rowid INTEGER PRIMARY KEY, [moniker] TEXT NOT NULL);
+    CREATE UNIQUE INDEX [monikers_pkindex] ON [monikers]([moniker]);
+    CREATE TABLE [versions](rowid INTEGER PRIMARY KEY, [version] TEXT NOT NULL);
+    CREATE UNIQUE INDEX [versions_pkindex] ON [versions]([version]);
+    CREATE TABLE [channels](rowid INTEGER PRIMARY KEY, [channel] TEXT NOT NULL);
+    CREATE UNIQUE INDEX [channels_pkindex] ON [channels]([channel]);
+    CREATE TABLE [manifest](rowid INTEGER PRIMARY KEY, [id] INT64 NOT NULL, [name] INT64 NOT NULL, [moniker] INT64 NOT NULL, [version] INT64 NOT NULL, [channel] INT64 NOT NULL, [pathpart] INT64 NOT NULL, hash BLOB, arp_min_version INT64, arp_max_version INT64);
+    CREATE TABLE [tags](rowid INTEGER PRIMARY KEY, [tag] TEXT NOT NULL);
+    CREATE UNIQUE INDEX [tags_pkindex] ON [tags]([tag]);
+    CREATE TABLE [tags_map]([manifest] INT64 NOT NULL, [tag] INT64 NOT NULL, PRIMARY KEY([tag], [manifest])) WITHOUT ROWID;
+    CREATE TABLE [commands](rowid INTEGER PRIMARY KEY, [command] TEXT NOT NULL);
+    CREATE UNIQUE INDEX [commands_pkindex] ON [commands]([command]);
+    CREATE TABLE [commands_map]([manifest] INT64 NOT NULL, [command] INT64 NOT NULL, PRIMARY KEY([command], [manifest])) WITHOUT ROWID;
+    """
+
+    cursor.executescript(schema)
+
+    # Keep track of existing lookup table entries to avoid duplicates
+    id_ids = {}
+    name_ids = {}
+    moniker_ids = {}
+    version_ids = {}
+    channel_ids = {}
+    tag_ids = {}
+    command_ids = {}
+
+    next_id_id = 1
+    next_name_id = 1
+    next_moniker_id = 1
+    next_version_id = 1
+    next_channel_id = 1
+    next_tag_id = 1
+    next_command_id = 1
+
+    # Insert test data
+    for i, pkg in enumerate(packages, 1):
+        manifest_id = i
+
+        # Insert or get IDs for lookup table values
+        if pkg["id"] not in id_ids:
+            id_ids[pkg["id"]] = next_id_id
+            cursor.execute(
+                "INSERT INTO ids (rowid, id) VALUES (?, ?)", (next_id_id, pkg["id"])
+            )
+            next_id_id += 1
+
+        if pkg["name"] not in name_ids:
+            name_ids[pkg["name"]] = next_name_id
+            cursor.execute(
+                "INSERT INTO names (rowid, name) VALUES (?, ?)",
+                (next_name_id, pkg["name"]),
+            )
+            next_name_id += 1
+
+        if pkg["moniker"] not in moniker_ids:
+            moniker_ids[pkg["moniker"]] = next_moniker_id
+            cursor.execute(
+                "INSERT INTO monikers (rowid, moniker) VALUES (?, ?)",
+                (next_moniker_id, pkg["moniker"]),
+            )
+            next_moniker_id += 1
+
+        if pkg["version"] not in version_ids:
+            version_ids[pkg["version"]] = next_version_id
+            cursor.execute(
+                "INSERT INTO versions (rowid, version) VALUES (?, ?)",
+                (next_version_id, pkg["version"]),
+            )
+            next_version_id += 1
+
+        if pkg["channel"] not in channel_ids:
+            channel_ids[pkg["channel"]] = next_channel_id
+            cursor.execute(
+                "INSERT INTO channels (rowid, channel) VALUES (?, ?)",
+                (next_channel_id, pkg["channel"]),
+            )
+            next_channel_id += 1
+
+        # Insert manifest using the lookup IDs
+        cursor.execute(
+            "INSERT INTO manifest (rowid, id, name, moniker, version, channel, pathpart) VALUES (?, ?, ?, ?, ?, ?, ?)",
+            (
+                manifest_id,
+                id_ids[pkg["id"]],
+                name_ids[pkg["name"]],
+                moniker_ids[pkg["moniker"]],
+                version_ids[pkg["version"]],
+                channel_ids[pkg["channel"]],
+                -1,
+            ),
+        )
+
+        # Insert tags
+        for tag in pkg.get("tags", []):
+            if tag not in tag_ids:
+                tag_ids[tag] = next_tag_id
+                cursor.execute(
+                    "INSERT INTO tags (rowid, tag) VALUES (?, ?)", (next_tag_id, tag)
+                )
+                next_tag_id += 1
+            cursor.execute(
+                "INSERT INTO tags_map (manifest, tag) VALUES (?, ?)",
+                (manifest_id, tag_ids[tag]),
+            )
+
+        # Insert commands
+        for command in pkg.get("commands", []):
+            if command not in command_ids:
+                command_ids[command] = next_command_id
+                cursor.execute(
+                    "INSERT INTO commands (rowid, command) VALUES (?, ?)",
+                    (next_command_id, command),
+                )
+                next_command_id += 1
+            cursor.execute(
+                "INSERT INTO commands_map (manifest, command) VALUES (?, ?)",
+                (manifest_id, command_ids[command]),
+            )
+
+    conn.commit()
+    conn.close()
+    print(f"Created database: {db_path}")
+
+
+def main():
+    base_dir = os.path.dirname(os.path.abspath(__file__))
+
+    # Sample packages for user installed database
+    user_packages = [
+        {
+            "id": "Git.Git",
+            "name": "Git",
+            "version": "2.50.1",
+            "moniker": "git",
+            "channel": "",
+            "tags": ["git", "vcs", "developer-tools"],
+            "commands": ["git"],
+        },
+        {
+            "id": "Microsoft.VisualStudioCode",
+            "name": "Microsoft Visual Studio Code",
+            "version": "1.103.1",
+            "moniker": "vscode",
+            "channel": "stable",
+            "tags": ["developer-tools", "editor", "ide"],
+            "commands": ["code"],
+        },
+        {
+            "id": "Google.Chrome",
+            "name": "Google Chrome",
+            "version": "120.0.6099.109",
+            "moniker": "chrome",
+            "channel": "stable",
+            "tags": ["browser", "web"],
+            "commands": ["chrome"],
+        },
+    ]
+
+    # Sample packages for Store Edge database
+    store_packages = [
+        {
+            "id": "Microsoft.PowerShell",
+            "name": "PowerShell",
+            "version": "7.4.1",
+            "moniker": "powershell",
+            "channel": "stable",
+            "tags": ["shell", "terminal", "microsoft"],
+            "commands": ["pwsh", "powershell"],
+        },
+        {
+            "id": "Mozilla.Firefox",
+            "name": "Mozilla Firefox",
+            "version": "121.0.1",
+            "moniker": "firefox",
+            "channel": "release",
+            "tags": ["browser", "web", "mozilla"],
+            "commands": ["firefox"],
+        },
+    ]
+
+    # Sample packages for system repository
+    system_packages = [
+        {
+            "id": "Microsoft.WindowsTerminal",
+            "name": "Windows Terminal",
+            "version": "1.18.3181.0",
+            "moniker": "wt",
+            "channel": "stable",
+            "tags": ["terminal", "microsoft", "system"],
+            "commands": ["wt"],
+        },
+        {
+            "id": "Microsoft.VCRedist.2015+.x64",
+            "name": "Microsoft Visual C++ 2015-2022 Redistributable (x64)",
+            "version": "14.38.33135.0",
+            "moniker": "vcredist2022x64",
+            "channel": "",
+            "tags": ["runtime", "microsoft", "system"],
+            "commands": [],
+        },
+    ]
+
+    # Create databases
+    user_db_path = os.path.join(
+        base_dir,
+        "testdata/Users/test/AppData/Local/Packages/Microsoft.DesktopAppInstaller_8wekyb3d8bbwe/LocalState/Microsoft.Winget.Source_8wekyb3d8bbwe/installed.db",
+    )
+    create_winget_database(user_db_path, user_packages)
+
+    store_db_path = os.path.join(
+        base_dir,
+        "testdata/Users/test/AppData/Local/Packages/Microsoft.DesktopAppInstaller_8wekyb3d8bbwe/LocalState/StoreEdgeFD/installed.db",
+    )
+    create_winget_database(store_db_path, store_packages)
+
+    system_db_path = os.path.join(
+        base_dir,
+        "testdata/ProgramData/Microsoft/Windows/AppRepository/StateRepository-Machine.srd",
+    )
+    create_winget_database(system_db_path, system_packages)
+
+    print("All test databases created successfully!")
+
+
+if __name__ == "__main__":
+    main()
diff --git a/microsoft/winget/testdata/ProgramData/Microsoft/Windows/AppRepository/StateRepository-Machine.srd b/microsoft/winget/testdata/ProgramData/Microsoft/Windows/AppRepository/StateRepository-Machine.srd
new file mode 100644
index 00000000..26dbd901
Binary files /dev/null and b/microsoft/winget/testdata/ProgramData/Microsoft/Windows/AppRepository/StateRepository-Machine.srd differ
diff --git a/microsoft/winget/testdata/Users/test/AppData/Local/Packages/Microsoft.DesktopAppInstaller_8wekyb3d8bbwe/LocalState/Microsoft.Winget.Source_8wekyb3d8bbwe/installed.db b/microsoft/winget/testdata/Users/test/AppData/Local/Packages/Microsoft.DesktopAppInstaller_8wekyb3d8bbwe/LocalState/Microsoft.Winget.Source_8wekyb3d8bbwe/installed.db
new file mode 100644
index 00000000..02493632
Binary files /dev/null and b/microsoft/winget/testdata/Users/test/AppData/Local/Packages/Microsoft.DesktopAppInstaller_8wekyb3d8bbwe/LocalState/Microsoft.Winget.Source_8wekyb3d8bbwe/installed.db differ
diff --git a/microsoft/winget/testdata/Users/test/AppData/Local/Packages/Microsoft.DesktopAppInstaller_8wekyb3d8bbwe/LocalState/StoreEdgeFD/installed.db b/microsoft/winget/testdata/Users/test/AppData/Local/Packages/Microsoft.DesktopAppInstaller_8wekyb3d8bbwe/LocalState/StoreEdgeFD/installed.db
new file mode 100644
index 00000000..83123f65
Binary files /dev/null and b/microsoft/winget/testdata/Users/test/AppData/Local/Packages/Microsoft.DesktopAppInstaller_8wekyb3d8bbwe/LocalState/StoreEdgeFD/installed.db differ
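
The join from the README's "Verify Database Contents" section can also be run from Python, which may be handy for checking regenerated databases in CI. The following is a minimal sketch and not part of the change itself: `list_packages` and `build_demo_db` are hypothetical helper names, and the demo uses a reduced subset of the schema from `generate_testdata.py` in an in-memory database so that it runs on its own.

```python
import sqlite3

# Same join as the README's "Verify Database Contents" query, with an
# ORDER BY added (an assumption) so the result order is deterministic.
QUERY = """
SELECT i.id, n.name, v.version FROM manifest m
JOIN ids i ON m.id = i.rowid
JOIN names n ON m.name = n.rowid
JOIN versions v ON m.version = v.rowid
ORDER BY m.rowid
"""


def list_packages(conn):
    """Return (id, name, version) tuples for every manifest row."""
    return conn.execute(QUERY).fetchall()


def build_demo_db():
    """Build an in-memory database with a reduced subset of the schema.

    Hypothetical helper for illustration only: it keeps just the four
    tables the query touches and inserts a single sample package.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE ids(rowid INTEGER PRIMARY KEY, id TEXT NOT NULL);
        CREATE TABLE names(rowid INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE versions(rowid INTEGER PRIMARY KEY, version TEXT NOT NULL);
        CREATE TABLE manifest(rowid INTEGER PRIMARY KEY, id INT64 NOT NULL,
                              name INT64 NOT NULL, version INT64 NOT NULL);
        """
    )
    conn.execute("INSERT INTO ids (rowid, id) VALUES (1, 'Git.Git')")
    conn.execute("INSERT INTO names (rowid, name) VALUES (1, 'Git')")
    conn.execute("INSERT INTO versions (rowid, version) VALUES (1, '2.50.1')")
    conn.execute("INSERT INTO manifest (rowid, id, name, version) VALUES (1, 1, 1, 1)")
    return conn


if __name__ == "__main__":
    demo = build_demo_db()
    for row in list_packages(demo):
        print(row)
    demo.close()
```

To inspect a generated file instead, pass `sqlite3.connect(path_to_db)` to `list_packages`; the query only touches `manifest`, `ids`, `names`, and `versions`, so it should apply unchanged to databases created with the full schema.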