Differentiable GPU-Parallelized Task and Motion Planning
William Shen1,2, Caelan Garrett2, Nishanth Kumar1,2, Ankit Goyal2, Tucker Hermans2,3,
Leslie Pack Kaelbling1, Tomás Lozano-Pérez1, Fabio Ramos2,4
1MIT CSAIL, 2NVIDIA Research, 3University of Utah, 4University of Sydney
Robotics: Science and Systems (RSS), 2025
Prerequisites:
- cuTAMP depends on cuRobo, which has specific hardware requirements
- GPU: NVIDIA GPU with Volta or newer architecture
- Python: 3.10+ (we've only tested with Python 3.10)
- PyTorch: 2.0+ is recommended
# We use conda, but feel free to use your favorite Python environment manager
conda create --name cutamp python=3.10 -y
conda activate cutamp
You must install a PyTorch version that is built for a CUDA version less than or equal to your CUDA Toolkit version. This ensures that cuRobo compiles in the next step. To check your CUDA Toolkit version, run:
# Look for something like "release 12.6"
nvcc --version
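If you just want the release number, it can be parsed out of that output. A minimal sketch, using a sample line here so it runs anywhere; in practice, pipe the output of nvcc --version instead:

```shell
# Sketch: extract the toolkit release number from nvcc's version output.
# "sample" stands in for the real output of `nvcc --version`.
sample='Cuda compilation tools, release 12.6, V12.6.20'
ver=$(printf '%s\n' "$sample" | sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p')
echo "$ver"
```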
If you don't have the CUDA Toolkit installed, install it via the official NVIDIA installer: https://developer.nvidia.com/cuda-downloads
✅ Pick a version that's supported by your GPU drivers.
⚠️ Don't overwrite your drivers unless you're sure, as it could cause issues with your system.
Then, install PyTorch using the command provided on the PyTorch website. You can also install an older version of PyTorch if it better matches your CUDA version:
# Example for latest PyTorch with CUDA 12.6
pip install torch torchvision torchaudio
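To double-check the pairing after installing, you can compare the two version strings with sort -V. This is a sketch with placeholder versions; substitute the values you get from python -c "import torch; print(torch.version.cuda)" and from nvcc --version:

```shell
# Sketch: check that PyTorch's CUDA build is not newer than your toolkit.
# These two versions are hypothetical placeholders; fill in your own.
torch_cuda="12.4"
toolkit="12.6"
# sort -V orders version strings numerically; the last line is the highest.
highest=$(printf '%s\n%s\n' "$torch_cuda" "$toolkit" | sort -V | tail -n1)
if [ "$highest" = "$toolkit" ]; then
  echo "OK: PyTorch CUDA $torch_cuda <= toolkit $toolkit"
else
  echo "WARNING: PyTorch CUDA $torch_cuda is newer than toolkit $toolkit"
fi
```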
git clone https://github.com/NVlabs/cuTAMP.git
cd cuTAMP
pip install -e .
Before cloning cuRobo, make sure git-lfs is installed (it is used for pulling large assets):
sudo apt install git-lfs
git lfs install
Then clone and install cuRobo:
git clone https://github.com/NVlabs/curobo.git
cd curobo
# This can take up to 20 minutes to install
pip install -e . --no-build-isolation
# Optional: Verify that all unit tests pass
pip install pytest
python -m pytest .
cd ..
For full cuRobo installation instructions, see: https://curobo.org/get_started/1_install_instructions.html
Once installed, you can run the default demo using:
cutamp-demo
This runs the cutamp/scripts/run_cutamp.py script with the default parameters on the Tetris environment with 3 blocks. We use Rerun to visualize the optimization and plan.
- If you're on a machine with a display, you're now good to go!
- If you're on a remote or headless machine (i.e., no display), see the instructions below on how to run with or without the visualizer.
Toggle between different timelines in the Rerun visualizer to see different aspects of the optimization and planning. For a general guide on how to use Rerun, see the official Rerun documentation.
If you're running cuTAMP on a remote server without a display, you have two options:
- Forward the Rerun visualizer port to your local machine via SSH (see below), or
- Disable the visualizer using the --disable_visualizer flag:
cutamp-demo --disable_visualizer
Reverse Tunnel Setup
- On your local machine (e.g., laptop), install Rerun and start the viewer:
pip install rerun-sdk
rerun
Note the TCP port shown in the top right of the viewer (usually 9876).
💡 Try to match the rerun-sdk version on your local and remote machines to avoid compatibility issues. You can check the version on the remote machine with:
pip show rerun-sdk
- Create a reverse SSH tunnel from your local machine to the remote server:
ssh -R 9876:127.0.0.1:9876 username@server
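To unpack the -R syntax: ssh -R <remote_port>:127.0.0.1:<local_port> makes the remote server listen on <remote_port> and forward those connections to your local machine, so cuTAMP on the server reaches the viewer on your laptop. A sketch that assembles the command from the port you noted in the viewer (username and server are hypothetical placeholders):

```shell
# Sketch: build the reverse-tunnel command from the viewer's TCP port.
# -R <remote_port>:127.0.0.1:<local_port> forwards the server's port
# back to the Rerun viewer running on your local machine.
port=9876
user="username"   # placeholder: your SSH user
host="server"     # placeholder: your remote host
cmd="ssh -R ${port}:127.0.0.1:${port} ${user}@${host}"
echo "$cmd"
```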
- On the remote machine, run the demo:
cutamp-demo
The visualizer will connect and stream to your local Rerun viewer through the tunnel!
The cutamp-demo command runs the cutamp/scripts/run_cutamp.py script with a number of useful options.
⚠️ This script exposes only a subset of the functionality of cuTAMP. For more advanced usage, please refer to the source code directly.
To view the available options, run:
cutamp-demo -h
The Tetris domain has 1, 2, 3, and 5 block variants named tetris_{1,2,3,5}. The tetris_5 variant is the most challenging and benefits from cost tuning and an increased number of particles.
# Tetris packing with 3 blocks and motion planning after cuTAMP solve
# All plan skeletons are downward refinable, so 1 initial plan is sufficient
cutamp-demo --env tetris_3 --num_initial_plans 1 --motion_plan
# Tetris packing with 5 blocks and more particles and optimization steps
cutamp-demo --env tetris_5 --num_particles 2048 --num_opt_steps 2000 \
--num_initial_plans 1 --motion_plan
# Tetris packing with 5 blocks and tuned cost weights
# You can try the --tuned_tetris_weights flag on other problems too (it works)!
cutamp-demo --env tetris_5 --num_particles 2048 --num_opt_steps 2000 \
--num_initial_plans 1 --motion_plan --tuned_tetris_weights
# Minimize the distance between the objects for 10 seconds
cutamp-demo --env blocks --optimize_soft_cost --soft_cost min_obj_dist --max_duration 10
# Maximize the distance between the objects for 10 seconds
cutamp-demo --env blocks --optimize_soft_cost --soft_cost max_obj_dist --max_duration 10
In the Stick Button domain, enabling subgraph caching speeds up particle initialization across plan skeletons.
# Stick button domain with Franka Panda
cutamp-demo --env stick_button --robot panda --num_initial_plans 100 --cache_subgraphs
# Stick button domain with UR5. The UR5 doesn't need to use the stick.
# Cross-embodiment generalization!
cutamp-demo --env stick_button --robot ur5 --num_initial_plans 100 --cache_subgraphs
Useful visualization flags:
- --disable_visualizer: disable the Rerun visualizer. This is useful for benchmarking or headless runs.
- --viz_interval: control how often the visualizer updates (the default is 10). Increase it to reduce visualization overhead and network bandwidth usage.
- --disable_robot_mesh: skip robot mesh rendering (saves bandwidth and load time when visualizing remotely).
If you encounter any issues not covered below, please open an issue. Make sure to describe your setup and detail the problem you're facing.
torch.OutOfMemoryError: CUDA out of memory...
Try reducing the number of particles (the default is 1024):
cutamp-demo --num_particles 256
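If 256 still runs out of memory, or you want the largest count that fits, you can step down by powers of two. This sketch only prints a fallback schedule to try with --num_particles; the floor of 128 is an arbitrary choice:

```shell
# Sketch: print a halving schedule of particle counts to try, from the
# default of 1024 down to an arbitrary floor of 128.
n=1024
schedule=""
while [ "$n" -ge 128 ]; do
  schedule="$schedule $n"
  n=$((n / 2))
done
echo "try in order:$schedule"
```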
On some systems, especially older Linux distros, the rerun-sdk wheel may not be available on PyPI.
Solution: try installing rerun-sdk from the conda-forge channel. See the instructions here: https://rerun.io/docs/getting-started/installing-viewer#python
If you see an error like:
A module that was compiled using NumPy 1.x cannot be run in NumPy 2.2.6...
This is a known compatibility issue between NumPy 2.x and extensions compiled against the NumPy 1.x ABI.
Fix: Downgrade numpy to a 1.x version:
pip install "numpy<2"
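A quick way to check whether you are affected before reinstalling. This sketch uses a sample version string; in practice read it from python -c "import numpy; print(numpy.__version__)":

```shell
# Sketch: detect a NumPy 2.x install from its version string.
# "ver" is a sample value standing in for your actual installed version.
ver="2.2.6"
major=${ver%%.*}   # take everything before the first dot
if [ "$major" -ge 2 ]; then
  echo "NumPy $ver detected: run pip install 'numpy<2'"
fi
```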
If you see an error about a missing GLIBCXX version, you can install a newer libstdc++ via conda:
conda install -c conda-forge libstdcxx-ng -y
If you've created a new placement surface, make sure you set the tolerance for that surface name appropriately in cutamp/scripts/utils.py.
Additionally, check the logs and analyze which constraints have been violated. Try loosening the threshold for those constraints to debug.
We thank Balakumar Sundaralingam for his extensive support with using and debugging cuRobo.
If you use cuTAMP in your research, please consider citing our paper:
@inproceedings{shen2025cutamp,
title={Differentiable GPU-Parallelized Task and Motion Planning},
author={Shen, William and Garrett, Caelan and Kumar, Nishanth and Goyal, Ankit and Hermans, Tucker and Kaelbling, Leslie Pack and Lozano-P{\'e}rez, Tom{\'a}s and Ramos, Fabio},
booktitle={Robotics: Science and Systems},
year={2025}
}