-
Notifications
You must be signed in to change notification settings - Fork 2k
Drop dependency on python-musicbrainzngs #6234
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey - I've found 4 issues, and left some high level feedback:
- In
ListenBrainzPlugin.get_mb_recording_idthe return type is annotated asint | Nonebut the method returns the stringrecording['id']; update the annotation (or cast) so the declared and actual types match. - The use of an empty-string key in
MusicBrainzAPI.search_entity(e.g., inListenBrainzPlugin.get_mb_recording_id) is a bit obscure; consider adding an explicit parameter or helper for the "bare" query term instead of encoding this with a special key value.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `ListenBrainzPlugin.get_mb_recording_id` the return type is annotated as `int | None` but the method returns the string `recording['id']`; update the annotation (or cast) so the declared and actual types match.
- The use of an empty-string key in `MusicBrainzAPI.search_entity` (e.g., in `ListenBrainzPlugin.get_mb_recording_id`) is a bit obscure; consider adding an explicit parameter or helper for the "bare" query term instead of encoding this with a special key value.
## Individual Comments
### Comment 1
<location> `beetsplug/listenbrainz.py:133-134` </location>
<code_context>
return tracks
- def get_mb_recording_id(self, track):
+ def get_mb_recording_id(self, track) -> int | None:
"""Returns the MusicBrainz recording ID for a track."""
- resp = musicbrainzngs.search_recordings(
</code_context>
<issue_to_address>
**suggestion (bug_risk):** The return type of `get_mb_recording_id` should be `str | None`, not `int | None`.
This function returns MusicBrainz recording MBIDs, which are UUID strings. `self.mb_api.search_entity(...)` produces dicts where `"id"` is a string and you return `recording["id"]` directly, so annotating the return type as `int | None` is incorrect and may confuse callers or type checkers. Please update the signature to `-> str | None` to match the actual return type.
```suggestion
def get_mb_recording_id(self, track) -> str | None:
"""Returns the MusicBrainz recording ID for a track."""
```
</issue_to_address>
### Comment 2
<location> `beetsplug/_utils/requests.py:159-160` </location>
<code_context>
raise beets_error(response=e.response) from e
- raise
+
+ # re-raise the same exception with the response text
+ raise e.__class__(e.response.text) from e
def request(self, *args, **kwargs) -> requests.Response:
</code_context>
<issue_to_address>
**issue (bug_risk):** Reconstructing the HTTPError drops the original `response` object, which may break callers relying on `e.response`.
The previous implementation re-raised the original `HTTPError`, keeping its `response` attribute. The new `e.__class__(e.response.text)` call creates a new exception where `response` will usually be `None`, so existing handlers that inspect `e.response` may break. You can still add the response text without replacing the instance, e.g.:
```python
e.args = (*e.args, e.response.text)
raise
```
or by attaching the text another way, while re-raising the original exception to preserve behavior.
</issue_to_address>
### Comment 3
<location> `test/plugins/utils/test_musicbrainz.py:4-13` </location>
<code_context>
assert candidates[0].album == "hi"
-
-
-def test_group_relations():
- raw_release = {
- "id": "r1",
</code_context>
<issue_to_address>
**suggestion (testing):** Extend tests for the new MusicBrainz client to cover query construction and browse helpers, not just relation grouping.
Right now we only verify `_group_relations`. The new `MusicBrainzAPI` also packs in logic that’s currently untested:
- `search_entity` query building (empty-key handling, trimming, lowercasing).
- `get_entity` `includes`/`inc` handling in `params`.
- `browse_*` helpers (e.g. `browse_release_groups`) returning the correct slice from the response.
A few `requests_mock`-based tests (like the plugin tests) that assert `search_entity('recording', {...})` hits `/ws/2/recording` with the expected query string, and that `browse_release_groups(artist='X')` returns `response['release-groups']`, would cover these shared behaviours and better exercise the new client.
Suggested implementation:
```python
from beetsplug._utils.musicbrainz import MusicBrainzAPI
from requests_mock import ANY as RM_ANY
def test_group_relations():
api = MusicBrainzAPI()
raw_release = {
"id": "r1",
"relations": [
{"target-type": "artist", "type": "vocal", "name": "A"},
{"target-type": "artist", "type": "instrument", "name": "B"},
{"target-type": "url", "type": "streaming", "url": "http://s"},
{"target-type": "url", "type": "purchase", "url": "http://p"},
{
"target-type": "work",
"type": "performance",
"work": {
"title": "Work title",
"relations": [
{
"target-type": "artist",
"type": "composer",
"name": "Composer A",
},
{
"target-type": "artist",
"type": "lyricist",
"name": "Lyricist B",
},
],
},
},
],
}
grouped = MusicBrainzAPI._group_relations(raw_release["relations"])
# Base-level artist relations.
assert "artist" in grouped
artist_names = {rel["name"] for rel in grouped["artist"]}
assert artist_names == {"A", "B"}
# URL relations grouped by type.
assert "url" in grouped
url_types = {rel["type"] for rel in grouped["url"]}
assert url_types == {"streaming", "purchase"}
# Nested work relations should be flattened into the same structure.
assert "work" in grouped
work_relations = grouped["work"]
assert any(rel.get("work", {}).get("title") == "Work title" for rel in work_relations)
def test_search_entity_builds_query_and_hits_expected_endpoint(requests_mock):
api = MusicBrainzAPI()
# Mock any GET to the recording endpoint.
requests_mock.get(
"https://musicbrainz.org/ws/2/recording",
json={"recordings": []},
)
# Mixed/whitespace keys and an empty key that should be ignored.
query = {
"Artist": " Brian Eno ",
"title": " AN Ending (Ascent) ",
"": "should be ignored",
}
api.search_entity("recording", query)
last = requests_mock.last_request
# Endpoint path.
assert last.path.startswith("/ws/2/recording")
# Query parameters.
params = last.qs
# We always expect a single `query` parameter for the search expression.
assert "query" in params
[query_value] = params["query"]
# Empty-key entry should not appear, keys/values should be normalized.
assert "should be ignored" not in query_value
assert "artist:brian eno" in query_value
assert "title:an ending (ascent)" in query_value
# No uppercase keys should remain in the built query.
assert query_value == query_value.lower()
def test_get_entity_includes_are_encoded_into_inc_param(requests_mock):
api = MusicBrainzAPI()
mbid = "00000000-0000-0000-0000-000000000001"
includes = ["artists", "aliases", "recordings"]
requests_mock.get(
f"https://musicbrainz.org/ws/2/release-group/{mbid}",
json={"id": mbid},
)
api.get_entity("release-group", mbid, includes=includes)
last = requests_mock.last_request
assert last.path == f"/ws/2/release-group/{mbid}"
params = last.qs
# MusicBrainz uses `inc` for include parameters, separated by `+`.
assert "inc" in params
[inc_value] = params["inc"]
for part in includes:
assert part in inc_value
assert "+" in inc_value or len(includes) == 1
def test_browse_release_groups_returns_release_groups_list(requests_mock):
api = MusicBrainzAPI()
artist_mbid = "11111111-1111-1111-1111-111111111111"
payload = {
"release-groups": [
{"id": "rg1", "title": "First RG"},
{"id": "rg2", "title": "Second RG"},
],
"release-group-count": 2,
}
requests_mock.get(
"https://musicbrainz.org/ws/2/release-group",
json=payload,
)
result = api.browse_release_groups(artist=artist_mbid)
# Helper should return just the `release-groups` slice from the payload.
assert result == payload["release-groups"]
last = requests_mock.last_request
assert last.path == "/ws/2/release-group"
params = last.qs
# Browse helper should pass through the browse parameter.
assert params.get("artist") == [artist_mbid]
```
These tests assume:
1. `MusicBrainzAPI` builds URLs under `https://musicbrainz.org/ws/2` and uses `/ws/2/<entity>` and `/ws/2/<entity>/<mbid>` paths.
2. `search_entity(entity, query_dict, ...)` performs an HTTP GET on `/ws/2/<entity>` with a `query` parameter built by:
- Dropping empty-string keys,
- Lowercasing keys and values,
- Trimming whitespace,
- Joining as `key:value` pairs in a single query string.
3. `get_entity(entity, mbid, includes=...)` performs an HTTP GET on `/ws/2/<entity>/<mbid>` and encodes `includes` into an `inc` parameter joined with `+`.
4. `browse_release_groups(**params)` performs an HTTP GET on `/ws/2/release-group` and returns `response_json["release-groups"]`.
If `MusicBrainzAPI` differs (e.g. different base URL, it delegates to `musicbrainzngs` rather than using `requests`, or it uses a different `inc` encoding), adjust:
- The mocked URLs in `requests_mock.get(...)`,
- The assertions on `last.path` and `last.qs`,
- The expectations around the `query` and `inc` parameters,
- And, if `browse_release_groups` returns the raw payload, adapt the final assertion accordingly.
</issue_to_address>
### Comment 4
<location> `docs/plugins/listenbrainz.rst:20-21` </location>
<code_context>
+ plugins:
+ - listenbrainz
You can then configure the plugin by providing your Listenbrainz token (see
intructions here_) and username:
</code_context>
<issue_to_address>
**issue (typo):** Fix typo in “instructions” and ensure consistent capitalization of “ListenBrainz”.
In this paragraph, update “Listenbrainz” to “ListenBrainz” and correct “intructions” to “instructions”.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## master #6234 +/- ##
==========================================
+ Coverage 68.24% 68.67% +0.43%
==========================================
Files 138 138
Lines 18815 18532 -283
Branches 3167 3061 -106
==========================================
- Hits 12840 12727 -113
+ Misses 5302 5151 -151
+ Partials 673 654 -19
🚀 New features to boost your workflow:
|
0c8b87e to
d015efa
Compare
18c7c71 to
599fdd3
Compare
|
@semohr I have now typed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just had another look here, the PR looks good to me and should mostly ready to merge.
It’s a bit difficult for me to test the mbcollection plugin in a live environment, but I trust that you’ve done your due diligence ;)
7f29f40 to
09afc6e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This pull request replaces the external python-musicbrainzngs dependency with a custom lightweight MusicBrainz client implementation, eliminating an external dependency and simplifying plugin installation for users.
Key changes:
- Created new
beetsplug._utils.musicbrainzmodule withMusicBrainzAPI,MusicBrainzUserAPI, and mixin classes - Refactored 6 plugins (musicbrainz, listenbrainz, mbcollection, missing, parentwork, mbpseudo) to use the new API
- Removed
musicbrainzngsfrom dependencies and plugin extras, enabling plugins to work without separate installation - Added comprehensive test coverage for previously untested plugins
Reviewed changes
Copilot reviewed 29 out of 30 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
beetsplug/_utils/musicbrainz.py |
New lightweight MusicBrainz client with rate limiting, request normalization, and typed search/browse/lookup methods |
beetsplug/_utils/requests.py |
Enhanced to retry on server errors (500, 502, 503, 504) and added PUT/DELETE methods |
beetsplug/musicbrainz.py |
Refactored to use MusicBrainzAPIMixin, removed direct API initialization |
beetsplug/listenbrainz.py |
Updated to use mixin for MB searches instead of direct musicbrainzngs calls |
beetsplug/mbcollection.py |
Major refactor: introduced MusicBrainzUserAPI with digest auth, MBCollection dataclass for collection operations |
beetsplug/missing.py |
Updated error handling and API calls to use new mixin |
beetsplug/parentwork.py |
Refactored work traversal into inline method, updated to use grouped relations from new API |
beetsplug/mbpseudo.py |
Updated API access to use mb_api property |
test/plugins/conftest.py |
New shared fixture for disabling rate limiting in tests |
test/plugins/utils/test_musicbrainz.py |
New test file for group_relations functionality |
test/plugins/test_parentwork.py |
Converted from unittest mocks to pytest fixtures with requests_mock |
test/plugins/test_listenbrainz.py |
New test suite for recording ID lookup and track info fetching |
test/plugins/test_mbcollection.py |
New comprehensive tests for collection validation and sync operations |
test/plugins/test_missing.py |
New tests for missing album detection |
test/plugins/utils/test_request_handler.py |
Added test case for server error retry behavior |
pyproject.toml |
Removed musicbrainzngs dependency and plugin extras; added sphinx-toolbox for TypedDict docs |
poetry.lock |
Updated with sphinx-toolbox and dependencies, removed musicbrainzngs |
docs/plugins/*.rst |
Removed installation sections from listenbrainz, mbcollection, missing, and parentwork docs |
docs/conf.py |
Added sphinx_toolbox configuration for TypedDict documentation |
docs/api/plugin_utilities.rst |
New documentation page for RequestHandler and MusicBrainzAPI |
docs/changelog.rst |
Documented the dependency replacement for plugin developers |
.github/workflows/ci.yaml |
Removed parentwork extra from installation commands |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
2d758d8 to
7cd0e2b
Compare
7cd0e2b to
093b3f0
Compare
093b3f0 to
4e2d519
Compare
4e2d519 to
a801afd
Compare
semohr
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems to work as expected for me and I'm happy with the code changes.
It also fixes #6265 for me.
Replace
python-musicbrainzngswith Custom Lightweight MusicBrainz ClientCore Problem Solved
Before: Beets depended on the external
python-musicbrainzngslibrary (v0.7.1) for all MusicBrainz API interactions. This dependency required separate installation for multiple plugins and introduced an abstraction layer that obscured direct HTTP semantics.After: Custom lightweight MusicBrainz client (
beetsplug.utils.musicbrainz) built directly onrequestsandrequests-ratelimiter, eliminating the external dependency while maintaining full API compatibility.Architecture Overview
1. New MusicBrainz Client Foundation
Created
beetsplug/utils/musicbrainz.pywith three core components:2. Config parsing now centralized in
MusicBrainzAPI.__post_init__().And applies to all plugins.
3. Mixin Pattern for Plugin Integration
Affected plugins:
musicbrainz→MusicBrainzAPIMixinlistenbrainz→MusicBrainzAPIMixinmbcollection→MusicBrainzUserAPIMixin(requires authentication)missing→MusicBrainzAPIMixinparentwork→MusicBrainzAPIMixinPlugin-Specific Refactoring
mbcollection: From Procedural to Object-OrientedBefore: Module-level functions (
submit_albums,mb_call) with error handling scattered across multiple try-except blocks.After:
MBCollectiondataclass encapsulating collection operations:Key improvements:
mb_callerror wrapper (errors propagate naturally fromRequestHandler)MBCollection.releasespropertyLibrary,ImportSession,ImportTaskparameterslistenbrainz: Search Query SimplificationBefore: Called
musicbrainzngs.search_recordings()with constructed query strings.After: Uses
self.mbapi.search_entity()which handles query formatting:parentwork: Inline Traversal LogicBefore: Module-level functions (
direct_parent_id,work_parent_id,find_parent_work_info) that made sequential API calls.After: Single method
find_parent_work_infothat traverses work hierarchy inline:Eliminates three function calls per traversal level.
Dependency & Installation Impact
Package Dependencies
Result: Four plugin extras no longer require external package installation beyond
requests(already a core dependency).CI Workflow
The
parentworkextra is removed entirely since it had no other dependencies.Documentation Updates
Removed "Installation" sections from four plugin docs that previously required:
Plugins now work out-of-the-box with just
plugins: [listenbrainz, ...]in config.Testing Improvements
New Shared Fixture:
requests_mockCreated
test/plugins/conftest.pywith fixture that disables rate limiting during tests:This avoids artificial delays when mocking HTTP responses.
New Plugin Test Suites
Added comprehensive tests for previously untested plugins:
test_listenbrainz.py: Tests recording ID lookup and track info fetchingtest_mbcollection.py: Tests collection validation, pagination, and sync operationstest_missing.py: Tests missing album detection logictest/plugins/utils/test_musicbrainz.py: Testsgroup_relationstransformationTest Migration
Moved
test_group_relationsfromtest_musicbrainz.pytotest/plugins/utils/test_musicbrainz.py(84 lines) sincegroup_relationsis now a utility function.Migration Benefits
python-musicbrainzngs(0.7.1)requests)pip install beets[plugin]requiredJSONDict,list[str])Backward Compatibility
✅ Fully backward compatible:
musicbrainz.user,musicbrainz.pass, etc.)Breaking changes: None from end-user perspective.
closes #6265