Add support for Gemma 3N models and improve local development setup for Python changes #6049

@surfiniaburger

Description

MediaPipe Solution (you are using)

GenAI (Safetensors Converter)

Programming language

Python, Android

Are you willing to contribute it

Yes

Describe the feature and the current behaviour/state

Currently, the SafetensorsCkptLoader in mediapipe.tasks.python.genai.converter supports models like Gemma 2 and Phi-2, but it does not have an entry for newer Gemma 3N models. Attempting to convert a Gemma 3N model fails because it is not recognized as a special model.

Will this change the current API? How?

No, this would be an additive change. It would involve adding the Gemma 3N model name (e.g., "GEMMA_3N_E2B_IT") to the list of supported special models in the SafetensorsCkptLoader, without changing any existing functionality.
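To illustrate why the change is additive, here is a minimal sketch. It is not the actual MediaPipe source: `SUPPORTED_SPECIAL_MODELS`, `check_special_model`, and `with_gemma_3n` are hypothetical names, and the existing entries are placeholders whose spelling is assumed. Only "GEMMA_3N_E2B_IT" comes from this request.

```python
# Hypothetical stand-in for the loader's list of recognized special models.
# The real list lives inside SafetensorsCkptLoader; these entries and their
# spelling are assumptions for illustration only.
SUPPORTED_SPECIAL_MODELS = ["GEMMA2_2B", "PHI_2"]


def check_special_model(name, supported):
    """Return the name if it is recognized, otherwise raise ValueError."""
    if name not in supported:
        raise ValueError(f"Unknown special model: {name}")
    return name


def with_gemma_3n(supported):
    """Return a new list with the Gemma 3N entry appended (purely additive)."""
    return [*supported, "GEMMA_3N_E2B_IT"]


if __name__ == "__main__":
    extended = with_gemma_3n(SUPPORTED_SPECIAL_MODELS)
    # Existing models are still accepted, and the new one now passes too.
    for model in extended:
        print(check_special_model(model, extended))
```

Existing names keep resolving exactly as before, which is why no API change is expected.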

Who will benefit with this feature?

Developers who are working with the latest open-source LLMs (like Gemma 3N) and want to use MediaPipe's excellent conversion and deployment tools to bring these models on-device for Android and other platforms.

Please specify the use cases for this feature

The primary use case is converting fine-tuned Gemma 3N models from Safetensors format to the MediaPipe .task format for on-device inference. This enables developers to deploy the latest generation of small, powerful language models in mobile applications. My specific goal is to deploy a custom Gemma 3N model to an Android app.

Any Other info

Yes. In trying to add this one-line Python change myself, I discovered that the local development setup for testing Python modifications on a modern Apple Silicon Mac is extremely challenging. This feedback might be useful for improving the developer experience. The process required:

- Manually patching the bundled zlib to fix an fdopen macro conflict with the modern macOS SDK.
- Installing all of OpenCV's C++ dependencies via Homebrew (jpeg, png, OpenEXR, etc.).
- Debugging multiple linker errors one by one (e.g., libIlmImf, libdc1394) and disabling the corresponding modules in the OpenCV CMake build configuration.
- Ultimately, the only stable solution was to heavily modify setup.py, WORKSPACE, and third_party/BUILD files to force the build system to use the pre-built Homebrew version of OpenCV instead of trying to build it from source.
- Finally, after a successful C++ build, the Python environment linkage was broken, requiring the manual creation of a .pth file in site-packages to point to the bazel-bin directory.

A documented, simpler workflow for developers who only need to test local Python changes would be a massive benefit to the community. Thank you for your consideration!
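For anyone hitting the same final step, the .pth workaround can be sketched roughly as follows. This is a minimal sketch, not an official workflow: the helper name `write_pth_file` and the file name `mediapipe_local.pth` are made up for illustration, and the exact bazel-bin path depends on your checkout.

```python
from pathlib import Path


def write_pth_file(site_packages, source_dir, name="mediapipe_local.pth"):
    """Write a .pth file so Python adds source_dir to sys.path at startup.

    site_packages: the environment's site-packages directory.
    source_dir: the directory to expose (e.g. the bazel-bin output directory).
    name: hypothetical file name; any name ending in .pth works.
    """
    pth_path = Path(site_packages) / name
    # Each line of a .pth file is treated as a path to append to sys.path.
    resolved = Path(source_dir).resolve()
    pth_path.write_text(f"{resolved}\n", encoding="utf-8")
    return pth_path
```

A call might look like `write_pth_file(site.getsitepackages()[0], "bazel-bin")` after `import site`, run from the repository root. Deleting the file undoes the change.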

Metadata

Labels

- platform::android: Android Solutions
- platform:python: MediaPipe Python issues
- stat:awaiting googler: Waiting for Google Engineer's Response
- task:LLM inference: Issues related to MediaPipe LLM Inference Gen AI setup
- type:feature: Enhancement in the New Functionality or Request for a New Solution
