Adds perceptive actor-critic class #114
base: main
Conversation
Hey @pascal-roth, very cool, thanks a lot!
rsl_rl/networks/cnn.py (outdated)
```python
if self.flatten:
    x = x.flatten(start_dim=1)
elif self.avgpool is not None:
    x = self.avgpool(x)
    x = x.flatten(start_dim=1)
return x
```
I don't get this logic. Why do we always flatten when we do avgpool and why do we not do avgpool when we also set the flatten flag?
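For context, one way to remove the ambiguity this comment points at is to make pooling and flattening orthogonal: average-pool only when configured, then always flatten. This is a hypothetical restructuring sketch, not the code in this PR; the class name `PoolThenFlatten` is illustrative only.

```python
import torch
import torch.nn as nn

class PoolThenFlatten(nn.Module):
    """Output head: optional global average pool, then always flatten to (B, -1)."""

    def __init__(self, use_avgpool: bool):
        super().__init__()
        # pool spatial dims down to 1x1 only if requested
        self.avgpool = nn.AdaptiveAvgPool2d(1) if use_avgpool else None

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if self.avgpool is not None:
            x = self.avgpool(x)  # (B, C, 1, 1)
        return x.flatten(start_dim=1)  # (B, C) or (B, C*H*W)
```

With this layout there is a single flatten path, and the pooling decision is independent of it.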
@pascal-roth I modified the CNN quite a bit to have more flexibility. Could you quickly check if this version works for you? I also modified your perceptive environment configs to work with the changes here; it would be awesome if you could run the env to see if it still trains the same :)
Hello! I was following the progress of this PR and had a question: does it make sense to extend this ActorCriticPerceptive to have a shared CNN backbone for both actor and critic? Do you think it should be an option, or is it needed at all? I thought it could greatly simplify the neural network and speed up training.
Hi @lvjonok, I would assume the speedup is negligible. But if you want to test this, please share the results :)
Hi @ClemensSchwarke, in my tests it seems that using a shared CNN backbone between actor and critic can help save VRAM. I find this particularly helpful for running more environments per GPU.
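A minimal sketch of the shared-backbone idea discussed above, with illustrative names that are not the rsl_rl API: both heads hold a reference to the *same* encoder module, so its weights (and their gradients and optimizer state) exist only once instead of twice.

```python
import torch
import torch.nn as nn

class SharedBackboneAC(nn.Module):
    """Hypothetical actor-critic where one CNN encoder feeds both heads."""

    def __init__(self, in_channels: int, enc_dim: int, num_actions: int):
        super().__init__()
        # single encoder instance, referenced by both forward paths
        self.encoder = nn.Sequential(
            nn.Conv2d(in_channels, enc_dim, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
        )
        self.actor_head = nn.Linear(enc_dim, num_actions)
        self.critic_head = nn.Linear(enc_dim, 1)

    def act(self, obs_2d: torch.Tensor) -> torch.Tensor:
        return self.actor_head(self.encoder(obs_2d))

    def evaluate(self, obs_2d: torch.Tensor) -> torch.Tensor:
        return self.critic_head(self.encoder(obs_2d))
```

The memory saving comes from storing only one set of convolutional parameters; whether the gradient interference between actor and critic losses on the shared weights hurts training is exactly what would need to be tested empirically.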
```diff
 from tensordict import TensorDict

-from rsl_rl.modules import ActorCritic, ActorCriticRecurrent
+from rsl_rl.modules import ActorCritic, ActorCriticPerceptive, ActorCriticRecurrent
```
Maybe a better name for the actor, to differentiate between CNN-perceptive and RayCaster-perceptive.
```python
    "PerceptiveActorCritic.__init__ got unexpected arguments, which will be ignored: "
    + str([key for key in kwargs])
)
nn.Module.__init__(self)
```
Can probably do `super(ActorCritic, self).__init__()` instead.
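A standalone illustration of why this suggestion works, using stand-in classes rather than the real rsl_rl ones: the two-argument `super()` starts the MRO lookup *after* `ActorCritic`, so `ActorCritic.__init__` is skipped and `Module.__init__` runs directly, the same effect as `nn.Module.__init__(self)` without naming `nn.Module`.

```python
calls = []

class Module:  # stand-in for nn.Module
    def __init__(self):
        calls.append("Module.__init__")

class ActorCritic(Module):
    def __init__(self):
        calls.append("ActorCritic.__init__")
        super().__init__()

class PerceptiveActorCritic(ActorCritic):
    def __init__(self):
        # starts MRO lookup after ActorCritic: skips ActorCritic.__init__
        super(ActorCritic, self).__init__()

PerceptiveActorCritic()
```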
```python
self.obs_groups = obs_groups
num_actor_obs_1d = 0
self.actor_obs_groups_1d = []
actor_in_dims_2d = []
actor_in_channels_2d = []
self.actor_obs_groups_2d = []
for obs_group in obs_groups["policy"]:
    if len(obs[obs_group].shape) == 4:  # B, C, H, W
        self.actor_obs_groups_2d.append(obs_group)
        actor_in_dims_2d.append(obs[obs_group].shape[2:4])
        actor_in_channels_2d.append(obs[obs_group].shape[1])
    elif len(obs[obs_group].shape) == 2:  # B, C
        self.actor_obs_groups_1d.append(obs_group)
        num_actor_obs_1d += obs[obs_group].shape[-1]
    else:
        raise ValueError(f"Invalid observation shape for {obs_group}: {obs[obs_group].shape}")
```
What do you think of having a list passed to the class constructor that tells you the names of the 2D groups directly? If not, I feel this code should at least be its own private method of the class for clarity.
```python
# Check if multiple 2D actor observations are provided
if len(self.actor_obs_groups_2d) > 1 and all(isinstance(item, dict) for item in actor_cnn_cfg.values()):
    assert len(actor_cnn_cfg) == len(self.actor_obs_groups_2d), (
        "The number of CNN configurations must match the number of 2D actor observations."
    )
elif len(self.actor_obs_groups_2d) > 1:
    print(
        "Only one CNN configuration for multiple 2D actor observations given, using the same configuration "
        "for all groups."
    )
    actor_cnn_cfg = dict(zip(self.actor_obs_groups_2d, [actor_cnn_cfg] * len(self.actor_obs_groups_2d)))
else:
    actor_cnn_cfg = dict(zip(self.actor_obs_groups_2d, [actor_cnn_cfg]))
```
This logic can be simplified.
```python
# Actor MLP
self.state_dependent_std = state_dependent_std
if self.state_dependent_std:
    self.actor = MLP(num_actor_obs_1d + encoding_dim, [2, num_actions], actor_hidden_dims, activation)
```
Why is this not `2 * num_actions` directly?
```python
if self.critic_obs_groups_2d:
    assert critic_cnn_cfg is not None, "A critic CNN configuration is required for 2D critic observations."

    # check if multiple 2D critic observations are provided
    if len(self.critic_obs_groups_2d) > 1 and all(isinstance(item, dict) for item in critic_cnn_cfg.values()):
        assert len(critic_cnn_cfg) == len(self.critic_obs_groups_2d), (
            "The number of CNN configurations must match the number of 2D critic observations."
        )
    elif len(self.critic_obs_groups_2d) > 1:
        print(
            "Only one CNN configuration for multiple 2D critic observations given, using the same configuration"
            " for all groups."
        )
        critic_cnn_cfg = dict(zip(self.critic_obs_groups_2d, [critic_cnn_cfg] * len(self.critic_obs_groups_2d)))
    else:
        critic_cnn_cfg = dict(zip(self.critic_obs_groups_2d, [critic_cnn_cfg]))
```
same as above
```python
num_critic_obs_1d = 0
self.critic_obs_groups_1d = []
critic_in_dims_2d = []
critic_in_channels_2d = []
self.critic_obs_groups_2d = []
for obs_group in obs_groups["critic"]:
    if len(obs[obs_group].shape) == 4:  # B, C, H, W
        self.critic_obs_groups_2d.append(obs_group)
        critic_in_dims_2d.append(obs[obs_group].shape[2:4])
        critic_in_channels_2d.append(obs[obs_group].shape[1])
    elif len(obs[obs_group].shape) == 2:  # B, C
        self.critic_obs_groups_1d.append(obs_group)
        num_critic_obs_1d += obs[obs_group].shape[-1]
    else:
        raise ValueError(f"Invalid observation shape for {obs_group}: {obs[obs_group].shape}")
```
same as above
```python
Normal.set_default_validate_args(False)

def _update_distribution(self, mlp_obs: torch.Tensor, cnn_obs: dict[str, torch.Tensor]) -> None:
    if self.actor_cnns is not None:
```
Does it make sense to assume that the user can use the ActorCriticPerceptive as a normal ActorCritic? In that case, I would prefer an assertion in the init telling them to use the default ActorCritic instead.
```python
mlp_obs = torch.cat([mlp_obs, cnn_enc], dim=-1)

if self.state_dependent_std:
    return self.actor(obs)[..., 0, :]
```
This should be `mlp_obs`, no? Also, this logic is superfluous if you make the output `2 * num_actions` and just do `self.actor(obs)[..., :self.num_actions]`.
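To make the suggested layout concrete: with a single output vector of size `2 * num_actions`, mean and (pre-activation) std parameters come from plain slicing, with no `[2, num_actions]` shaped head and no extra indexing dimension. A toy sketch with a mock output tensor:

```python
import torch

num_actions = 3
# mock actor output of size 2 * num_actions: first half mean, second half std
out = torch.arange(2 * num_actions, dtype=torch.float32)
mean = out[..., :num_actions]
std_param = out[..., num_actions:]
```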
```python
obs_list_1d = [obs[obs_group] for obs_group in self.actor_obs_groups_1d]
obs_dict_2d = {}
for obs_group in self.actor_obs_groups_2d:
    obs_dict_2d[obs_group] = obs[obs_group]
return torch.cat(obs_list_1d, dim=-1), obs_dict_2d
```
Can't `obs_dict_2d` also be a list, where you assume the order is the same as `self.actor_obs_groups_2d`? Is the dict faster?
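The list-based alternative this comment suggests would rely on the fixed ordering of the 2D group names instead of keying by name. A hypothetical sketch (the function name is illustrative, and plain lists stand in for tensors):

```python
def split_obs(obs: dict, groups_1d: list[str], groups_2d: list[str]):
    """Split an observation dict into ordered 1D and 2D lists."""
    obs_1d = [obs[g] for g in groups_1d]
    obs_2d = [obs[g] for g in groups_2d]  # order matches groups_2d
    return obs_1d, obs_2d
```

Performance-wise the difference between dict and list here is almost certainly negligible; the real trade-off is readability (named lookup) versus an implicit ordering contract.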
Adds a new perceptive actor-critic class that can define CNN layers for every 2D observation term.
Example usage shown in IsaacLab isaac-sim/IsaacLab#3467