
Conversation

@MatMacinf

Description

Adds a new report plugin that generates a detailed statistical summary
of your Beets music library, similar to a “Wrapped” summary.

Features include:

  • Total number of tracks, albums, artists, genres, and years
  • Listening time and average track length
  • Average bitrate and primary format
  • Decade distribution
  • Top artist, genre, decade, and year
  • Longest and shortest tracks
  • Counts of new vs. older tracks (2015+)
  • Counts of missing genre/year tags

This PR also adds documentation (docs/plugins/report.rst) and a changelog
entry (docs/changelog.rst).

Fixes #X.


To Do

  • Documentation. (Added docs/plugins/report.rst describing the plugin and usage.)
  • Changelog. (Added entry under Unreleased → New features in docs/changelog.rst.)
  • Tests. (Added unit tests in tests/test_report.py to verify the statistics output.)

@MatMacinf MatMacinf requested a review from a team as a code owner January 5, 2026 11:05
Contributor

@sourcery-ai sourcery-ai bot left a comment


Hey - I've found 3 issues and left some high-level feedback:

  • Consider separating the report-generation logic from the printing (e.g., build a summary data structure and have a thin formatter layer) so the core statistics can be reused programmatically and tested without relying on stdout parsing.
  • You currently iterate over the full items list multiple times to build separate lists and counters; you could consolidate those into a single pass over items to reduce redundant work on large libraries.
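The single-pass consolidation suggested above could look roughly like the following sketch. The `summarize` helper and the item attributes (`artist`, `genre`, `year`, `length`) are hypothetical stand-ins for the plugin's real objects, not the actual implementation:

```python
from collections import Counter
from types import SimpleNamespace

def summarize(items):
    """Build all report counters in a single pass over the library items."""
    artist_counter = Counter()
    genre_counter = Counter()
    year_counter = Counter()
    decade_counter = Counter()
    total_length = 0.0

    # One loop instead of several list/Counter builds over the same items.
    for item in items:
        if item.artist:
            artist_counter[item.artist] += 1
        if item.genre:
            genre_counter[item.genre] += 1
        if item.year:
            year_counter[item.year] += 1
            decade_counter[(item.year // 10) * 10] += 1
        if item.length:
            total_length += item.length

    return {
        "top_artist": artist_counter.most_common(1),
        "top_genre": genre_counter.most_common(1),
        "top_year": year_counter.most_common(1),
        "top_decade": decade_counter.most_common(1),
        "total_length": total_length,
    }

items = [
    SimpleNamespace(artist="A", genre="Rock", year=2019, length=200.0),
    SimpleNamespace(artist="A", genre="Indie", year=2015, length=180.0),
]
print(summarize(items)["top_artist"])  # [('A', 2)]
```

The same loop can also accumulate bitrate sums and missing-tag counts, so the library is traversed exactly once regardless of how many statistics the report grows.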
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- Consider separating the report-generation logic from the printing (e.g., build a summary data structure and have a thin formatter layer) so the core statistics can be reused programmatically and tested without relying on stdout parsing.
- You currently iterate over the full `items` list multiple times to build separate lists and counters; you could consolidate those into a single pass over `items` to reduce redundant work on large libraries.

## Individual Comments

### Comment 1
<location> `beetsplug/report.py:81-82` </location>
<code_context>
+        top_decade = decade_counter.most_common(1)
+        top_year = year_counter.most_common(1)
+
+        longest_track = max(items, key=lambda i: i.length or 0)
+        shortest_track = min(
+            (i for i in items if i.length), key=lambda i: i.length, default=None
+        )
</code_context>

<issue_to_address>
**issue (bug_risk):** Guard against libraries where no items have a valid length to avoid a TypeError when formatting the longest track.

When all items have a falsy/`None` `length`, `max(items, key=lambda i: i.length or 0)` still returns an item, but `fmt_time(longest_track.length)` will receive `None`, causing `int(None)` to raise a `TypeError`. Mirror the `shortest_track` pattern by filtering to items with a truthy `length` and using `default=None` for `longest_track`, and only format/print it when it is not `None`.
</issue_to_address>
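A minimal sketch of the guard described above, mirroring the `shortest_track` pattern; the `find_extremes` helper and its item objects (with only a `length` attribute) are hypothetical stand-ins, not the plugin's actual code:

```python
from types import SimpleNamespace

def find_extremes(items):
    """Return (longest, shortest), or (None, None) if no item has a length."""
    # Filter once so both extremes only ever see truthy lengths;
    # default=None avoids ValueError on an empty sequence.
    with_length = [i for i in items if i.length]
    longest = max(with_length, key=lambda i: i.length, default=None)
    shortest = min(with_length, key=lambda i: i.length, default=None)
    return longest, shortest

# A library where every length is None now yields None instead of
# handing the time formatter a None value.
tracks = [SimpleNamespace(length=None), SimpleNamespace(length=None)]
print(find_extremes(tracks))  # (None, None)
```

The caller then formats and prints the longest/shortest lines only when the corresponding value is not `None`.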

### Comment 2
<location> `test/test_report.py:53-62` </location>
<code_context>
+    assert "Your Beets library is empty." in captured.out
+
+
+def test_single_item(capsys, library):
+    """Test library with a single track."""
+    add_item(
+        library,
+        title="Single Track",
+        artist="Solo Artist",
+        genre="Indie",
+        year=2019,
+    )
+    plugin = ReportPlugin()
+    plugin._run_report(library, None, [])
+    captured = capsys.readouterr()
+
+    # --- Basic statistics ---
+    assert "Tracks:" in captured.out
+    assert "Albums:" in captured.out
+    assert "Artists:" in captured.out
+    assert "Genres:" in captured.out
+
+    # --- Wrapped-style insights ---
+    assert "Top artist:" in captured.out
+    assert "Solo Artist" in captured.out
+    assert "Top genre:" in captured.out
+    assert "Indie" in captured.out
+    assert "Top decade:" in captured.out
+    assert "10s" in captured.out
+    assert "Top year:" in captured.out
+    assert "2019" in captured.out
+
+
</code_context>

<issue_to_address>
**suggestion (testing):** Add coverage for bitrate/quality and primary format output in the report.

This currently checks basic stats and Wrapped-style insights, but not the bitrate/quality and primary format output, which is central to this plugin. Please extend this (or add another test) to:

- Set a specific `bitrate` and `format` on the item.
- Assert that the report includes the expected `Avg bitrate: ... kbps` line with the correct quality label.
- Assert that the `Primary format:` line is present and matches the item’s format.

This will ensure the bitrate aggregation and quality/primary-format logic are properly covered.

Suggested implementation:

```python
def test_empty_library(capsys, library):
    """Test empty library: should output message without crashing."""
    plugin = ReportPlugin()
    plugin._run_report(library, None, [])
    captured = capsys.readouterr()
    assert "Your Beets library is empty." in captured.out


def test_single_item(capsys, library):
    """Test library with a single track."""
    # Create a single item with explicit bitrate and format so we can
    # exercise the bitrate/quality and primary-format reporting.
    add_item(
        library,
        title="Single Track",
        artist="Solo Artist",
        genre="Indie",
        year=2019,
        # Beets stores bitrate as bits per second; 256 kbps == 256000 bps.
        bitrate=256000,
        format="MP3",
    )

    plugin = ReportPlugin()
    plugin._run_report(library, None, [])
    captured = capsys.readouterr()

    # --- Basic statistics ---
    assert "Tracks:" in captured.out
    assert "Albums:" in captured.out
    assert "Artists:" in captured.out
    assert "Genres:" in captured.out

    # --- Wrapped-style insights ---
    assert "Top artist:" in captured.out
    assert "Solo Artist" in captured.out
    assert "Top genre:" in captured.out
    assert "Indie" in captured.out
    assert "Top decade:" in captured.out
    assert "10s" in captured.out
    assert "Top year:" in captured.out
    assert "2019" in captured.out

    # --- Bitrate / quality statistics ---
    # Find the "Avg bitrate" line so we can assert both the numeric
    # value and presence of a quality label (typically in parentheses).
    avg_bitrate_lines = [
        line for line in captured.out.splitlines()
        if line.strip().startswith("Avg bitrate:")
    ]
    assert avg_bitrate_lines, "Expected an 'Avg bitrate:' line in output"
    avg_line = avg_bitrate_lines[0]

    # Should include a kbps value.
    assert "kbps" in avg_line

    # Should include a human-readable quality label (e.g. '(High)', '(Lossless)', etc.).
    # We don't depend on the exact wording, just that a label is present in parentheses.
    assert "(" in avg_line and ")" in avg_line

    # --- Primary format statistics ---
    primary_format_lines = [
        line for line in captured.out.splitlines()
        if line.strip().startswith("Primary format:")
    ]
    assert primary_format_lines, "Expected a 'Primary format:' line in output"
    primary_line = primary_format_lines[0]
    assert "MP3" in primary_line

```

This patch assumes that:

1. `add_item(...)` accepts `bitrate` and `format` as keyword arguments and sets the corresponding fields on the created item.
2. The report output contains a line starting with `"Avg bitrate:"` that includes a numeric value in kbps and a quality label inside parentheses on the same line.
3. The report output contains a line starting with `"Primary format:"` that includes the canonical format string (e.g. `"MP3"`).

If the actual output format or label placement differs (for example, the quality label is not in parentheses, or the format is lowercase like `mp3`), adjust the assertions on `avg_line` and `primary_line` accordingly to match the exact strings produced by `ReportPlugin._run_report`.
</issue_to_address>

### Comment 3
<location> `test/test_report.py:112-121` </location>
<code_context>
+    assert "10s" in captured.out
+
+
+def test_missing_metadata(capsys, library):
+    """Test library with missing tags, length, and bitrate."""
+    add_item(
+        library,
+        "Track1",
+        "Artist",
+        "Album",
+        None,
+        2000,
+        length=200,
+        bitrate=256,
+    )
+    add_item(
+        library,
+        "Track2",
+        "Artist",
+        "Album",
+        "Rock",
+        None,
+        length=180,
+        bitrate=None,
+    )
+
+    plugin = ReportPlugin()
+    plugin._run_report(library, None, [])
+    captured = capsys.readouterr()
+
+    # --- Check missing metadata counts ---
+    assert "Missing genre" in captured.out
+    assert "1" in captured.out  # At least one missing genre
+    assert "Missing year" in captured.out
+    assert "1" in captured.out  # At least one missing year
</code_context>

<issue_to_address>
**issue (testing):** Make the assertions for missing metadata counts more specific to avoid false positives.

The bare `"1"` checks can match any occurrence of `1` in the output, so they don’t reliably verify the missing-metadata counters. Instead, assert on the full expected lines or a more specific substring, e.g.:

```python
assert "Missing genre tags: 1" in captured.out
assert "Missing year tags: 1" in captured.out
```
(or a regex equivalent if the exact formatting might vary).
</issue_to_address>




@codecov

codecov bot commented Jan 5, 2026

Codecov Report

❌ Patch coverage is 76.19048% with 25 lines in your changes missing coverage. Please review.
✅ Project coverage is 68.28%. Comparing base (ea2e7bf) to head (ed859c3).
✅ All tests successful. No failed tests found.

Files with missing lines       Patch %   Lines
beets/ui/commands/stats.py     76.19%    7 Missing and 18 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #6266      +/-   ##
==========================================
+ Coverage   68.24%   68.28%   +0.04%     
==========================================
  Files         138      138              
  Lines       18815    18919     +104     
  Branches     3167     3191      +24     
==========================================
+ Hits        12840    12919      +79     
- Misses       5302     5309       +7     
- Partials      673      691      +18     
Files with missing lines       Coverage Δ
beets/ui/commands/stats.py     76.25% <76.19%> (-0.89%) ⬇️

@semohr
Contributor

semohr commented Jan 5, 2026

Hi and thank you for the PR!
This seems to be your first contribution to beets, welcome!

I have not looked at the code yet, but I was wondering why you introduced a plugin for this instead of enhancing the built-in beet stats command? The beet stats command seems like the proper place for most of the introduced features.

@MatMacinf
Author

Hi and thank you for the PR! This seems to be your first contribution to beets, welcome!

I have not looked at the code yet, but I was wondering why you introduced a plugin for this instead of enhancing the built-in beet stats command? The beet stats command seems like the proper place for most of the introduced features.

Hi, thank you for the warm welcome.

To answer your concern: my aim was to create a plugin that highlights preferences (top artist, genre, decade, etc.) rather than raw statistics. In my head it was better to keep the stats plugin focused on metrics and technical details and to add a new plugin focused on user preferences, instead of polluting the stats plugin with that. That said, I'm open to your insight, and if you think it would be better I can refactor my code and add this to the stats plugin as an alternative output invoked via a --report/--wrap flag.

@semohr
Contributor

semohr commented Jan 5, 2026

We are always happy to have newcomers on board!

To answer your concern: my aim was to create a plugin that highlights preferences (top artist, genre, decade, etc.) rather than raw statistics. In my head it was better to keep the stats plugin focused on metrics and technical details and to add a new plugin focused on user preferences, instead of polluting the stats plugin with that. That said, I'm open to your insight, and if you think it would be better I can refactor my code and add this to the stats plugin as an alternative output invoked via a --report/--wrap flag.

Since the plugin essentially adds a new command, it might actually make sense to integrate it into the stats command, even if the output is somewhat opinionated at the moment. I’d be curious to hear what the rest of the @beetbox/maintainers think about this.

Adding a subcommand (something like beet stats report or beet stats overview) would keep all statistics-related functionality grouped together and make it easier to discover. The stats command is still fairly minimal; see stats.py.

@arsaboo
Contributor

arsaboo commented Jan 5, 2026

stats seems to be the right place for this. Also, let us not add arbitrary cutoffs such as Counts of new vs. older tracks (2015+).
