Robust Reinforcement Learning-Based Locomotion for Resource-Constrained Quadrupeds with Exteroceptive Sensing
This repository provides the training code implementation of the paper Robust Reinforcement Learning-Based Locomotion for Resource-Constrained Quadrupeds with Exteroceptive Sensing, accepted at the 2025 IEEE International Conference on Robotics and Automation (ICRA).
If you use this repository in your work, consider citing:
```
@article{plozza2025robust,
  title={Robust Reinforcement Learning-Based Locomotion for Resource-Constrained Quadrupeds with Exteroceptive Sensing},
  author={Plozza, Davide and Apostol, Patricia and Joseph, Paul and Schl{\"a}pfer, Simon and Magno, Michele},
  journal={arXiv preprint arXiv:2505.12537},
  year={2025}
}
```
This code builds on top of Improbable AI's implementation (also MIT licensed) of the paper Walk these Ways: Tuning Robot Control for Generalization with Multiplicity of Behavior. The environment builds on the legged gym environment by Nikita Rudin, Robotic Systems Lab, ETH Zurich (paper: https://arxiv.org/abs/2109.11978) and on the Isaac Gym simulator from NVIDIA (paper: https://arxiv.org/abs/2108.10470). The training code builds on the rsl_rl repository, also by Nikita Rudin, Robotic Systems Lab, ETH Zurich. All redistributed code retains its original license.
Real-life performance of our controller is showcased in this video.
This repository is licensed under the MIT License (see LICENSE).
It is based on Improbable AI's original repository, which is also MIT licensed.
See LICENSES/ for third-party license information.
Simulated Training and Evaluation

Isaac Gym requires an NVIDIA GPU. To train in the default configuration, we recommend a GPU with at least 10GB of VRAM. The code can run on a smaller GPU if you decrease the number of parallel environments (Cfg.env.num_envs); however, training will be slower with fewer environments.
Install Anaconda3: https://www.anaconda.com/download

Create a Python 3.8.16 environment:

```
conda create --name robodog_gym python=3.8.16
```

Install PyTorch. For RTX 3000 series GPUs, use 1.10 with CUDA 11.3:

```
pip3 install torch==1.10.0+cu113 torchvision==0.11.1+cu113 torchaudio==0.10.0+cu113 -f https://download.pytorch.org/whl/cu113/torch_stable.html
```

For RTX 4000 series GPUs, use 2.2.0 with CUDA 12.1:

```
pip install torch==2.2.0 torchvision==0.17.0 torchaudio==2.2.0 --index-url https://download.pytorch.org/whl/cu121
```
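To confirm that PyTorch can see the GPU before continuing, a quick check (nothing repo-specific is assumed here):

```python
# Quick sanity check that PyTorch was installed with CUDA support.
import torch

print(torch.__version__)              # should match the version installed above
print(torch.cuda.is_available())      # should print True on a working setup
print(torch.cuda.get_device_name(0))  # e.g. "NVIDIA GeForce RTX 3090"
```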
- Download and install Isaac Gym Preview 4 from https://developer.nvidia.com/isaac-gym
- Unzip the file:

  ```
  tar -xf IsaacGym_Preview_4_Package.tar.gz
  ```

- Install the Python package:

  ```
  cd isaacgym/python && pip install -e .
  ```

- Verify the installation by running an example:

  ```
  cd examples && python 1080_balls_of_solitude.py
  ```

- For troubleshooting, check the docs at isaacgym/docs/index.html
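As an extra smoke test with the conda environment active, you can check that the isaacgym package imports and acquires the Gym singleton (a minimal check using the standard Isaac Gym Preview API):

```python
# Minimal import check for Isaac Gym; if this fails with a libpython
# ImportError, see the LD_LIBRARY_PATH fix below.
from isaacgym import gymapi

gym = gymapi.acquire_gym()
print("Isaac Gym acquired:", gym is not None)
```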
If you get the error ImportError: libpython3.7m.so.1.0: cannot open shared object file: No such file or directory, create an activation hook that adds the environment's lib directory to LD_LIBRARY_PATH:

```
cd ~/anaconda3/envs/<env_name>
mkdir -p etc/conda/activate.d
echo 'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/<user-name>/anaconda3/envs/<env_name>/lib' >> etc/conda/activate.d/ld_library_path.sh
```
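Before adding the path, you can confirm that the environment's lib directory actually contains the libpython shared objects (a quick check assuming the default Anaconda install location):

```python
# List libpython shared objects in the conda env's lib directory.
# The path below assumes a default Anaconda install; adjust as needed.
import glob, os

env_lib = os.path.expanduser("~/anaconda3/envs/robodog_gym/lib")
print(glob.glob(os.path.join(env_lib, "libpython*")))
```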
In this repository, run:

```
pip install -e .
```

If you get errors, first install:

```
sudo apt-get install libcurl4-openssl-dev
sudo apt-get install libssl-dev
```
If everything is installed correctly, you should be able to run the test script with:

```
python scripts/test.py
```

The script should print Simulating step {i}. The GUI is off by default; to turn it on, set headless=False in test.py's main function call.
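For reference, a minimal sketch of the kind of stepping loop such a test performs (illustrative only, not the actual contents of test.py):

```python
# Illustrative Isaac Gym stepping loop; not the actual test.py.
from isaacgym import gymapi

gym = gymapi.acquire_gym()
sim_params = gymapi.SimParams()
sim = gym.create_sim(0, 0, gymapi.SIM_PHYSX, sim_params)  # compute/graphics device 0

for i in range(100):
    gym.simulate(sim)             # advance physics by one step
    gym.fetch_results(sim, True)  # block until the step completes
    print(f"Simulating step {i}")

gym.destroy_sim(sim)
```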
Create a WandB account/team (entity) and project. You will need to change Cfg.cfg_ppo.runner.wandb_entity and Cfg.cfg_ppo.runner.wandb_project in the training file accordingly.
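For example, a hedged sketch of the override (the import path for Cfg is an assumption; use wherever Cfg is defined in your training script):

```python
# Hedged sketch: point training logs at your own WandB entity/project.
# The import path below is an assumption about this repo's layout.
from go1_gym.envs.base.legged_robot_config import Cfg  # assumed path

Cfg.cfg_ppo.runner.wandb_entity = "my-team"      # your WandB entity
Cfg.cfg_ppo.runner.wandb_project = "my-project"  # your WandB project
```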
Then login with your account.
```
wandb login
```

CODE STRUCTURE

The main environment for simulating a legged robot is in legged_robot.py. The default configuration parameters, including reward weightings, are defined in go1_backpack_config.py.
There are three scripts in the scripts directory:
```
scripts
├── __init__.py
├── play_teleop.py
├── test.py
└── train_<configuration>.py
```

You can run the test.py script to verify your environment setup.
If it runs, then you have installed the gym environments correctly. To train an agent, run one of the train_<configuration>.py scripts. To evaluate a trained agent, run play_teleop.py.
We provide one of the trained agent checkpoints used in the paper in the ./runs/exteroceptive_robust_icra_proposed directory.
To train the Go1 controller, run one of the training scripts:

```
python scripts/train_<configuration>.py
```

After initializing the simulator, the script will print out a list of metrics every ten training iterations.
Training with the default configuration requires about 12GB of GPU memory. If you have less memory available, you can
still train by reducing the number of parallel environments used in simulation (the default is Cfg.env.num_envs = 4000).
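As with the WandB settings above, the override can be done in the training file; a hedged sketch (the Cfg import path is again an assumption):

```python
# Hedged sketch: reduce parallel environments to fit in less GPU memory.
from go1_gym.envs.base.legged_robot_config import Cfg  # assumed path

Cfg.env.num_envs = 2000  # default is 4000; fewer envs use less VRAM but train slower
```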
Runs are logged with both ML Dash and Weights & Biases.
To visualize training progress in ML Dash, first start the ml_dash frontend app:
```
python -m ml_dash.app
```

Then start the ml_dash backend server by running this command in the parent directory of the runs folder:

```
python -m ml_dash.server .
```

Finally, use a web browser to go to the app IP (defaults to localhost:3001)
and create a new profile with the credentials:
Username: runs
API: [server IP] (default is http://localhost:8081)
Access Token: [blank]
Now, clicking on the profile should yield a dashboard interface visualizing the training runs.
We use WandB for logging. Model checkpoints and videos are also stored in the run.
To download checkpoint, configuration, video, and metrics data from an online wandb run, use the scripts/utils/download_wandb_run.py script:
```
cd scripts
python utils/download_wandb_run.py <arguments>
```
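For reference, the same data can also be fetched with the public wandb API; a minimal sketch (the run path is a placeholder):

```python
# Minimal sketch using the public wandb API; "entity/project/run_id" is a
# placeholder for your actual run path.
import wandb

api = wandb.Api()
run = api.run("entity/project/run_id")
for f in run.files():         # checkpoints, config, videos, ...
    f.download(replace=True)  # saved relative to the current directory
print(run.config)             # the logged training configuration
```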
To evaluate the most recently trained model, run:

```
python scripts/play_teleop.py
```

The robot can be controlled with the following key mapping (an illustrative sketch follows the list).
- W, A, S, D: Control linear velocities in the X and Y directions.
- Q, E: Adjust yaw velocity.
- Shift: Increase speed.
- Arrow Keys: Simulate external forces by pushing the robot.
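For illustration only, a sketch of how such a mapping could translate pressed keys into velocity commands (names and scales are hypothetical; the actual logic lives in play_teleop.py):

```python
# Hypothetical key -> (vx, vy, yaw_rate) mapping; not the actual
# play_teleop.py implementation.
BASE_SPEED = 0.5  # m/s, illustrative

def keys_to_command(pressed: set) -> tuple:
    """Map a set of pressed keys to (vx, vy, yaw_rate) commands."""
    scale = 2.0 if "shift" in pressed else 1.0  # Shift increases speed
    vx = BASE_SPEED * scale * (("w" in pressed) - ("s" in pressed))
    vy = BASE_SPEED * scale * (("a" in pressed) - ("d" in pressed))
    yaw = 1.0 * (("q" in pressed) - ("e" in pressed))
    return vx, vy, yaw

print(keys_to_command({"w", "shift"}))  # (1.0, 0.0, 0.0)
```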
Trained agents can be deployed with Improbable AI's walk-these-ways deployment scripts, which need to be adapted to include elevation map sampling.
A GPU-accelerated elevation map (running on a Jetson) can be obtained with RSL's open-source implementation, elevation_mapping_cupy.
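For intuition, a hedged sketch of sampling an elevation map at a grid of points around the robot (grid size, resolution, and function names are assumptions, not this repository's deployment API):

```python
# Hypothetical sketch: nearest-cell sampling of an elevation map on a grid
# of points around the robot base. All names and shapes are illustrative
# assumptions, not this repository's deployment API.
import numpy as np

def sample_heights(elevation_map: np.ndarray, resolution: float,
                   map_center_xy: np.ndarray, points_xy: np.ndarray) -> np.ndarray:
    """Return map heights at world-frame XY points (nearest cell)."""
    # Convert world coordinates to map cell indices.
    idx = np.round((points_xy - map_center_xy) / resolution).astype(int)
    idx += np.array(elevation_map.shape) // 2           # origin at map center
    idx = np.clip(idx, 0, np.array(elevation_map.shape) - 1)
    return elevation_map[idx[:, 0], idx[:, 1]]

# Example: a flat 80x80 map at 0.04 m resolution, sampled on a 5x5 grid.
emap = np.zeros((80, 80))
grid = np.stack(np.meshgrid(np.linspace(-0.4, 0.4, 5),
                            np.linspace(-0.3, 0.3, 5)), axis=-1).reshape(-1, 2)
heights = sample_heights(emap, 0.04, np.zeros(2), grid)
print(heights.shape)  # (25,)
```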
If you find our project useful, please consider citing:
```
@INPROCEEDINGS{plozza2025robustRLlocomotion,
  author={Plozza, Davide and Apostol, Patricia and Joseph, Paul and Schläpfer, Simon and Magno, Michele},
  booktitle={2025 IEEE International Conference on Robotics and Automation (ICRA)},
  title={Robust Reinforcement Learning-Based Locomotion for Resource-Constrained Quadrupeds with Exteroceptive Sensing},
  year={2025},
  volume={},
  number={},
  pages={8121-8127},
  keywords={Training;Accuracy;Robot sensing systems;Cameras;Real-time systems;Robustness;Sensors;Odometry;Quadrupedal robots;Autonomous robots},
  doi={10.1109/ICRA55743.2025.11128474}
}
```