Update Transolver + Add Transolver external_aero example #1022

coreyjadams · 2025-07-18T22:01:11Z

PhysicsNeMo Pull Request

This PR updates the transolver model and introduces an external aero example that uses it. Summary of changes to the model code:

Transolver is extended to irregular meshes and 3D data; previously, this was not actually usable.
Transolver's interface is improved for readability.
The PhysicsAttention layer has been consolidated to improve readability and reduce code duplication between 2D, 3D, and irregular inputs.
The PhysicsAttention layer has been modified in several ways:
- Using einops rearrange to manipulate data shapes
- Using matmuls directly instead of einsum
- Reordering the weight normalization and projection; this improves numerical stability for lower precisions.
The Transolver_block implementation can now use transformer_engine for most layers.
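
As an illustration of the einsum-to-matmul change and the reordered normalization listed above, here is a minimal NumPy sketch. It is not the code from this PR: the function names, tensor layout `(batch, heads, points, channels)`, and the `eps` value are assumptions for illustration, and "normalize the slice weights before projecting" is only one plausible reading of the reordering. Both variants are mathematically equivalent; the second keeps each slice token a bounded weighted average of the inputs, which may be why it behaves better at lower precision.

```python
import numpy as np


def slice_tokens_einsum(x, w, eps=1e-5):
    """Project points onto slices with einsum, then normalize the result.

    x: (B, H, N, C) point features; w: (B, H, N, G) slice weights.
    """
    tokens = np.einsum("bhnc,bhng->bhgc", x, w)  # (B, H, G, C)
    norm = w.sum(axis=2)[..., None]              # (B, H, G, 1)
    return tokens / (norm + eps)


def slice_tokens_matmul(x, w, eps=1e-5):
    """Normalize the slice weights first, then project with a plain matmul."""
    w_normed = w / (w.sum(axis=2, keepdims=True) + eps)  # (B, H, N, G)
    # (B, H, G, N) @ (B, H, N, C) -> (B, H, G, C)
    return np.matmul(np.swapaxes(w_normed, -1, -2), x)


# Hypothetical sizes, chosen only for the demonstration.
rng = np.random.default_rng(0)
B, H, N, C, G = 2, 4, 128, 8, 16
x = rng.standard_normal((B, H, N, C))

# Softmax over the slice dimension gives per-point slice weights.
logits = rng.standard_normal((B, H, N, G))
w = np.exp(logits - logits.max(axis=-1, keepdims=True))
w /= w.sum(axis=-1, keepdims=True)

tokens_a = slice_tokens_einsum(x, w)
tokens_b = slice_tokens_matmul(x, w)
print(np.allclose(tokens_a, tokens_b))
```

In the second variant the normalized weights for each slice sum to slightly less than one, so no token can exceed the largest input magnitude before the division happens.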

Additionally, the darcy_transolver example was updated to accommodate these changes, and it still converges in line with the paper's result.

A summary of the external aero example:

The model can be trained on the physicsnemo curator outputs from domino; if you have a dataset that works for domino in zarr format, it can work for transolver.
The model uses an irregular mesh, and only surface OR volume data can be used at a given time.
The surface data uses the mesh centers + normals, the volume data uses just mesh centers. SDF could be added but hasn't been.
The training script supports fp32, bf16, fp16, and fp8 "in principle". In reality, only fp32 and bf16 are known to be stable for the surface data. Volumetric experiments are ongoing.
The dataloading is ready to use domain parallelism, but the training script has not enabled it yet. The model should be uniquely scalable due to the physics-attention state projections.
The training normalizes the targets to mean 0, std 1, but un-normalizes the targets and predictions for relative L2 calculations.
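
The normalize-then-un-normalize step in the last item can be sketched as follows. This is a minimal NumPy illustration, not the training script's code: the helper names, the synthetic data, and the choice to compute a single global relative L2 (rather than per sample or per field) are all assumptions.

```python
import numpy as np


def normalize(x, mean, std):
    """Map physical values to zero mean, unit std."""
    return (x - mean) / std


def unnormalize(x, mean, std):
    """Map normalized values back to physical units."""
    return x * std + mean


def relative_l2(pred, target):
    """||pred - target||_2 / ||target||_2 over all entries."""
    return np.linalg.norm(pred - target) / np.linalg.norm(target)


# Synthetic target field with a large offset, e.g. a pressure-like quantity.
rng = np.random.default_rng(0)
target_phys = rng.normal(loc=100.0, scale=5.0, size=(1000, 1))
mean, std = target_phys.mean(), target_phys.std()

# The model is trained against normalized targets; fake a noisy prediction.
target_norm = normalize(target_phys, mean, std)
pred_norm = target_norm + 0.1 * rng.standard_normal(target_norm.shape)

# The metric is reported in physical units, after un-normalizing both sides.
err_norm = relative_l2(pred_norm, target_norm)
err_phys = relative_l2(unnormalize(pred_norm, mean, std), target_phys)
print(f"normalized space: {err_norm:.4f}, physical space: {err_phys:.4f}")
```

The two numbers differ because relative L2 depends on the offset of the target, which is why the space in which the metric is computed matters when comparing against published results.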

There is not yet a good interface for inference with a trained model. For the surface data, considering the simplicity of the preprocessing and reuse of domino inputs, it should be straightforward.
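
To make the simplicity of the surface preprocessing concrete, here is a minimal sketch of assembling model inputs from mesh centers and normals. The function names and array shapes are hypothetical and not taken from this PR; any scaling or normalization the real datapipe applies is omitted.

```python
import numpy as np


def build_surface_features(centers, normals):
    """Concatenate surface cell centers and normals into (N, 6) inputs."""
    centers = np.asarray(centers, dtype=np.float32)
    normals = np.asarray(normals, dtype=np.float32)
    if centers.ndim != 2 or centers.shape[-1] != 3:
        raise ValueError("centers must have shape (N, 3)")
    if normals.shape != centers.shape:
        raise ValueError("normals must match the shape of centers")
    return np.concatenate([centers, normals], axis=-1)


def build_volume_features(centers):
    """Volume inputs use only the cell centers, shape (N, 3)."""
    centers = np.asarray(centers, dtype=np.float32)
    if centers.ndim != 2 or centers.shape[-1] != 3:
        raise ValueError("centers must have shape (N, 3)")
    return centers


# Hypothetical stand-in for arrays read from a preprocessed dataset.
rng = np.random.default_rng(0)
surface_centers = rng.standard_normal((500, 3))
raw_normals = rng.standard_normal((500, 3))
surface_normals = raw_normals / np.linalg.norm(raw_normals, axis=-1, keepdims=True)

surface_inputs = build_surface_features(surface_centers, surface_normals)
volume_inputs = build_volume_features(rng.standard_normal((2000, 3)))
print(surface_inputs.shape, volume_inputs.shape)
```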

Checklist

I am familiar with the Contributing Guidelines.
New or existing tests cover these changes.
The documentation is up to date with these changes.
The CHANGELOG.md is up to date with these changes.
An issue is linked to this pull request.

Dependencies

physicsnemo/models/transolver/transolver.py

was used when projecting back onto output states.

coreyjadams · 2025-08-01T00:47:19Z

/blossom-ci

coreyjadams · 2025-08-01T15:58:38Z

Some updates on this PR:

fp8 training appears to be functional and stable. It does not, however, produce a significant speedup yet.
No training curves or results are included in the example README, since they are still being validated.
Some volumetric functions are present but, in general, volumetric training is WIP. It's not supported yet, and the README makes that clear.
The same can be said for domain parallelism: WIP, not supported yet.

The goal here was to have a transformer-engine leveraging model, so focus was on training stability, model compatibility, and inference.

ram-cherukuri · 2025-08-05T20:30:05Z

examples/cfd/external_aerodynamics/transolver/README.md

@@ -0,0 +1,210 @@
+# Transolver for External Aerodynamics on Irregular Meshes


Looks good and comprehensive overall. I am not calling out minor nitpicks. We should have a section on customization: if a user needs to adapt this to a new problem, say CFD data from an internal flow problem, what is the guidance on where the user needs to look to customize? Assume it's VTK data, so the datapipe can be reused, but what other elements would need modification?

ram-cherukuri · 2025-08-05T20:31:36Z

examples/cfd/external_aerodynamics/transolver/README.md

+introduces modifications for improved numerical stability and compatibility with NVIDIA
+TransformerEngine.
+
+The training workflow for Transolver leverages the same input datasets as DoMINO. For


same input datasets as other models perhaps. We are specifically mentioning things in the context of DoMINO - can be avoided.

coreyjadams · 2025-08-07T15:26:28Z

This has been divided into two PRs and merged separately. Closing.

coreyjadams and others added 20 commits June 27, 2025 07:34

First pass updates for transformer engine support

7d06b13

Update Transolver to optionally enable transformer engine, and make t…

bb770a3

…he 'fix' version of the example easier to run, scalable, usable at full resolution, and faster

Clean up transolver architecture and code to enable irregular and 3D …

3d392c3

…meshes. Enhance documentation.

Merge branch 'NVIDIA:main' into transformer_engine

b0897ec

Beginning work on transolver example. Planning ahead for domain paral…

c0517e5

…lelism, this is a disk to gpu pipeline that natively optimizes bandwidth when domain parallel.

Updates for the transolver example using the consolidated and updated…

a070165

… model code.

Add small function to convert matlab matrices to npz

b1aa22f

Updates to the transolver model architecture

25a37ab

Merge branch 'NVIDIA:main' into transformer_engine

016512e

Update transolver model. Add external aerodynamics example with trans…

4c383a3

…olver.

Merge branch 'NVIDIA:main' into transformer_engine

c02482e

Merge branch 'NVIDIA:main' into transformer_engine

d74089d

Enable volume training in transolver. Still needs to be validated and…

554a3ab

… optimized.

Merge branch 'NVIDIA:main' into transformer_engine

a4c3620

Updating transolver example further, and including more readme info

8c95135

Further integrate transformer engine into Transolver.

858c537

Update readme to point out matlab to npz conversion of fixed dataset.

221cab1

Update README and improve scripts

5ea7789

Merge branch 'main' into transformer_engine

79df865

Merge branch 'NVIDIA:main' into transformer_engine

ab33308

mnabian reviewed Jul 23, 2025

physicsnemo/models/transolver/transolver.py