Description
This is actually an issue related to uv (project and package manager) and docling-ibm-models. However, we should implement a workaround, since the current implementation of docling-jobkit results in errors when the code, formula, or picture description enhancements are used.
The code, formula, and picture description models do not work with transformers versions <4.50. Using an older version results in errors like {"detail":"data did not match any variant of untagged enum ModelWrapper at line 255422 column 3"}.
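A minimal sketch of a guard that fails early with a readable message instead of the opaque enum error, assuming 4.50 is the minimum working version as stated above. The helper names `parse_release`, `supports_enrichment_models`, and `check_installed_transformers` are hypothetical and not part of docling-jobkit.

```python
from importlib.metadata import PackageNotFoundError, version

# Minimum version taken from the issue text; treat it as an assumption.
MIN_TRANSFORMERS = (4, 50)


def parse_release(version_string: str) -> tuple[int, ...]:
    """Return the leading numeric components, e.g. '4.51.3' -> (4, 51, 3)."""
    parts = []
    for piece in version_string.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at the first non-numeric component such as 'dev0'
        parts.append(int(digits))
    return tuple(parts)


def supports_enrichment_models(version_string: str) -> bool:
    """True if the major.minor part is at least MIN_TRANSFORMERS."""
    return parse_release(version_string)[:2] >= MIN_TRANSFORMERS


def check_installed_transformers() -> None:
    """Raise a descriptive error if the installed transformers is too old."""
    try:
        installed = version("transformers")
    except PackageNotFoundError:
        raise RuntimeError("transformers is not installed") from None
    if not supports_enrichment_models(installed):
        raise RuntimeError(
            f"transformers {installed} is installed, but the code, formula, "
            "and picture description models need >=4.50"
        )
```

Calling `check_installed_transformers()` before enabling any of the three enhancements would turn the failure into an explicit version message.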
This is due to the constraint in docling-ibm-models intended for macOS x86:
transformers = [
{markers = "sys_platform != 'darwin' or platform_machine != 'x86_64'", version = "^4.42.0"},
{markers = "sys_platform == 'darwin' and platform_machine == 'x86_64'", version = "~4.42.0"}
]
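To make the two markers concrete, the sketch below hand-evaluates them for a few platforms. It is an illustration only: the function `applicable_constraint` is hypothetical, and it mimics the marker logic rather than calling uv or a real marker parser.

```python
def applicable_constraint(sys_platform: str, platform_machine: str) -> str:
    """Return the version constraint whose marker matches the given platform."""
    # Second entry: sys_platform == 'darwin' and platform_machine == 'x86_64'
    is_mac_x86 = sys_platform == "darwin" and platform_machine == "x86_64"
    # First entry is the logical negation of the second, so exactly one applies.
    return "~4.42.0" if is_mac_x86 else "^4.42.0"


if __name__ == "__main__":
    for platform in [("linux", "x86_64"), ("darwin", "arm64"), ("darwin", "x86_64")]:
        print(platform, "->", applicable_constraint(*platform))
```

Assuming Poetry-style operators, `~4.42.0` means `>=4.42.0,<4.43.0` and `^4.42.0` means `>=4.42.0,<5.0.0`, so taken in isolation only macOS x86_64 is restricted to the 4.42 series.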
uv tries to produce a lock file that satisfies all platforms, so the second condition is also applied on macOS (darwin) arm64 and Linux, which leads to transformers 4.49 and the errors above, as pointed out in docling-project/docling#1142 (comment).
A way to circumvent this issue is to install the latest transformers and force uv not to update the lock file through the --no-update option. For instance, using the local development example:
$ uv sync
$ uv pip show transformers
Name: transformers
Version: 4.42.4
$ uv pip install -U transformers
$ uv pip show transformers
Name: transformers
Version: 4.51.3
$ uv run --no-update python ./dev/s3_helper_test.py