Fix(backend/torch): Resolved MPS broadcast crash in binary_crossentropy #21816

Mithil27360 · 2025-11-02T10:45:24Z

This PR resolves a broadcast error on the PyTorch MPS backend for binary_crossentropy. The crash occurred during backpropagation when y_true and y_pred had incompatible shapes for broadcasting, such as (B, T, 1) and (B, T).

This fix aligns the shapes by squeezing the trailing dimension of 1 from both tensors and calling .contiguous() to ensure the new shape is respected during the backward pass. This resolves the mps.multiply broadcast error.
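To illustrate why shapes such as (B, T, 1) and (B, T) do not line up elementwise, here is a minimal pure-Python sketch of the usual NumPy/PyTorch-style broadcasting rule. `broadcast_shape` is a hypothetical helper written for this illustration. It is not part of Keras or PyTorch, and it does not model the MPS-specific failure in the backward pass.

```python
from itertools import zip_longest


def broadcast_shape(shape_a, shape_b):
    """Hypothetical helper: broadcast two shapes, NumPy/PyTorch style."""
    result = []
    # Compare dimensions right to left; a missing leading dim acts like 1.
    for a, b in zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1):
        if a == b or b == 1:
            result.append(a)
        elif a == 1:
            result.append(b)
        else:
            raise ValueError(f"cannot broadcast {shape_a} with {shape_b}")
    return tuple(reversed(result))


B, T = 4, 7
# Identical shapes (the state after squeezing both tensors) line up elementwise.
same = broadcast_shape((B, T), (B, T))
# When the two leading sizes happen to match, the mismatch is silent:
# the result gains an extra axis instead of raising.
silent = broadcast_shape((T, T, 1), (T, T))
try:
    broadcast_shape((B, T, 1), (B, T))
    mismatch = None
except ValueError as err:
    mismatch = str(err)
```

Under these rules the trailing 1 is matched against T rather than dropped, which is why squeezing both tensors to the same (B, T) shape removes the ambiguity.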

google-cla · 2025-11-02T10:45:29Z

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

gemini-code-assist · 2025-11-02T10:45:37Z

Summary of Changes

Hello @Mithil27360, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a critical bug in the PyTorch MPS backend's binary_crossentropy function, which previously led to crashes due to tensor shape mismatches during backpropagation. By explicitly handling and aligning specific tensor dimensions, the change ensures the robustness and reliability of the binary_crossentropy operation when utilizing Apple Silicon's Metal Performance Shaders, thereby improving the overall stability of the Keras backend on this hardware.

Highlights

Fix MPS Broadcast Error: Resolved a crash in the PyTorch MPS backend for binary_crossentropy that occurred during backpropagation when y_true and y_pred had incompatible shapes (e.g., (B, T, 1) and (B, T)). The fix involves squeezing the trailing dimension of 1 from both tensors and calling .contiguous() to ensure the new shape is respected during the backward pass, preventing mps.multiply broadcast errors.
Tensor Shape Alignment: Introduced conditional logic within binary_crossentropy to detect and correct specific 3D tensor shapes (B, T, 1) by reducing them to (B, T) using torch.squeeze and contiguous() for both target and output tensors.

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

codecov-commenter · 2025-11-02T10:51:18Z

Codecov Report

❌ Patch coverage is 0% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 82.66%. Comparing base (6d06085) to head (e65c475).

Files with missing lines	Patch %	Lines
keras/src/backend/torch/nn.py	0.00%	2 Missing and 1 partial ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##           master   #21816      +/-   ##
==========================================
- Coverage   82.66%   82.66%   -0.01%     
==========================================
  Files         577      577              
  Lines       59419    59422       +3     
  Branches     9313     9314       +1     
==========================================
  Hits        49121    49121              
- Misses       7898     7900       +2     
- Partials     2400     2401       +1

Flag	Coverage Δ
keras	`82.48% <0.00%> (-0.01%)`	⬇️
keras-jax	`63.32% <0.00%> (-0.01%)`	⬇️
keras-numpy	`57.57% <0.00%> (+<0.01%)`	⬆️
keras-openvino	`34.34% <0.00%> (-0.01%)`	⬇️
keras-tensorflow	`64.12% <0.00%> (-0.01%)`	⬇️
keras-torch	`63.62% <0.00%> (-0.01%)`	⬇️

gemini-code-assist

Code Review

This pull request addresses a broadcast crash in binary_crossentropy on the PyTorch MPS backend. The fix involves squeezing tensors with a trailing dimension of size 1 before the loss calculation. The change is well-commented and correctly uses .contiguous() to work around a suspected view bug in the backward pass. My main feedback is to suggest generalizing the condition to handle tensors of any rank, not just 3D, to make the fix more robust for other use cases like image segmentation.

gemini-code-assist · 2025-11-02T10:52:51Z

keras/src/backend/torch/nn.py

+    if (
+        target.ndim == 3
+        and target.shape[-1] == 1
+        and output.ndim == 3
+        and output.shape[-1] == 1
+    ):


The condition to apply the squeeze operation is specific to 3D tensors (target.ndim == 3). However, the MPS broadcast issue with trailing dimensions of size 1 might also occur for tensors of other ranks, such as 4D tensors (B, H, W, 1) common in segmentation tasks. To make this fix more robust and future-proof, consider generalizing the condition to apply to any tensor with a rank greater than 1.

Suggested change

-    if (
-        target.ndim == 3
-        and target.shape[-1] == 1
-        and output.ndim == 3
-        and output.shape[-1] == 1
-    ):
+    if (
+        target.ndim > 1
+        and output.ndim == target.ndim
+        and target.shape[-1] == 1
+        and output.shape[-1] == 1
+    ):
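
The difference between the 3D-only condition and the generalized one can be sketched on plain shape tuples. Both function names below are hypothetical and exist only for this comparison. The actual change operates on tensors inside keras/src/backend/torch/nn.py, and this sketch does not show which form was finally merged.

```python
def matches_3d_only(target_shape, output_shape):
    """Condition as first submitted: both shapes are exactly (B, T, 1)."""
    return (
        len(target_shape) == 3
        and target_shape[-1] == 1
        and len(output_shape) == 3
        and output_shape[-1] == 1
    )


def matches_any_rank(target_shape, output_shape):
    """Generalized condition from the review: any rank > 1, trailing dim of 1."""
    return (
        len(target_shape) > 1
        and len(output_shape) == len(target_shape)
        and target_shape[-1] == 1
        and output_shape[-1] == 1
    )


# Sequence-style (B, T, 1) inputs satisfy both conditions.
seq = ((2, 5, 1), (2, 5, 1))
# Segmentation-style (B, H, W, 1) inputs satisfy only the generalized one.
seg = ((2, 8, 8, 1), (2, 8, 8, 1))
# Rank-1 inputs are left alone by both conditions.
flat = ((1,), (1,))

results = {
    "seq": (matches_3d_only(*seq), matches_any_rank(*seq)),
    "seg": (matches_3d_only(*seg), matches_any_rank(*seg)),
    "flat": (matches_3d_only(*flat), matches_any_rank(*flat)),
}
```

Note that the generalized condition also matches rank-2 shapes such as (B, 1), which the 3D-only check skips.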

Mithil27360 · 2025-11-12T08:09:49Z

Hi @fchollet,

Thank you for reviewing and merging PR #21816! I learned a lot from implementing the MPS broadcast fix and addressing the feedback. Looking forward to contributing more!

google-ml-butler bot added the size:S label Nov 2, 2025

google-ml-butler bot assigned gbaned Nov 2, 2025

Mithil27360 force-pushed the fix-mps-binary-crossentropy branch from d2f4364 to db2f77e Compare November 2, 2025 10:51

gemini-code-assist bot reviewed Nov 2, 2025

Mithil27360 force-pushed the fix-mps-binary-crossentropy branch 2 times, most recently from 206d7e2 to dce518f Compare November 2, 2025 11:29

Fix(backend/torch): Resolve MPS broadcast crash in binary_crossentropy

e65c475

Mithil27360 force-pushed the fix-mps-binary-crossentropy branch from dce518f to e65c475 Compare November 2, 2025 12:03

fchollet approved these changes Nov 3, 2025

google-ml-butler bot added the kokoro:force-run and ready to pull (Ready to be merged into the codebase) labels Nov 3, 2025

fchollet merged commit 45909f9 into keras-team:master Nov 3, 2025
8 checks passed
