[DRAFT] OMPE-88188: Default explicit temporal data instead of TAA by jmart-nv · Pull Request #5232 · isaac-sim/IsaacLab

jmart-nv · 2026-04-10T18:33:05Z

Description

This is a two-part fix: disabling temporal AA (DLSS) by default, and enabling frame stacking for newton by default.

Temporal AA (DLSS): Fixes incorrectly inflated convergence on newton+rtx.

RTX uses DLSS anti-aliasing by default. DLSS blends information across frames, which encodes motion into a single RGB observation, so camera-based RL policies learn from temporal artifacts that real cameras do not produce in the same way.

This is fixed by changing the default AA mode to FXAA, which is slower than DLSS but more accurately simulates a real camera. A runtime warning is also emitted when DLSS, DLAA, or TAA is enabled, since all three introduce temporal artifacts.

Added new unit tests to verify expected behavior. Also tested with full cartpole-camera training run.
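The runtime warning described above could look roughly like the following. This is a minimal sketch, not the code in this PR: the helper name `check_antialiasing_mode` and the warning text are assumptions made for illustration, and only the three mode names come from the description.

```python
import warnings

# Modes the PR description names as introducing temporal artifacts.
TEMPORAL_AA_MODES = frozenset({"DLSS", "DLAA", "TAA"})


def check_antialiasing_mode(mode):
    """Hypothetical helper: warn if `mode` blends information across frames.

    Returns True when a warning was emitted, False otherwise.
    """
    if mode in TEMPORAL_AA_MODES:
        warnings.warn(
            f"Anti-aliasing mode '{mode}' uses temporal accumulation; "
            "camera observations may encode motion from earlier frames.",
            UserWarning,
            stacklevel=2,
        )
        return True
    # "Off", "FXAA", and None (no override) are treated as non-temporal here.
    return False
```

Calling `check_antialiasing_mode("FXAA")` stays silent, while `check_antialiasing_mode("DLSS")` emits a `UserWarning`.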

Newton Frame Stacking: Provides explicit temporal data to newton.

In the previous commit, DLSS anti-aliasing was disabled by default since it provides implicit temporal data that does not accurately reflect how real cameras work. However, newton's energy-conserving physics solver requires temporal velocity data in order to compensate for the lack of damping.

This commit provides explicit temporal information via 2-frame stacking by default for newton-based camera tasks. This allows newton to provide the damping it needs to converge at the same rate as physx. This adds 36% GPU memory overhead, but the wall clock overhead is negligible.

The default for physx is still stack size = 1 (disabled) since physx has implicit damping built-in via its TGS solver.

Implementation:

`TiledCameraCfg` now has a `frame_stack` field that controls a ring buffer of previous frames. These are concatenated to the channel dimension automatically in `DirectRLEnv`. The new `MultiBackendTiledCameraCfg` wraps the `MultiBackendRendererCfg` to provide defaults for the different physics presets. If newton physics is used without frame stacking, a runtime warning will now be emitted.

Updated existing tasks to use the new `MultiBackendTiledCameraCfg` and added 12 new tests.
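To illustrate the ring-buffer idea behind `frame_stack`, here is a simplified pure-Python sketch. It is not the implementation in `tiled_camera.py`: the class `FrameStackBuffer` is hypothetical, a frame is modelled as a flat list of channel values rather than a GPU tensor, and the assumption that the index points at the newest slot is mine.

```python
class FrameStackBuffer:
    """Hypothetical ring buffer holding the last `size` frames.

    A frame here is a flat list of channel values, standing in for an
    image tensor with a trailing channel dimension.
    """

    def __init__(self, size):
        if size < 1:
            raise ValueError("size must be >= 1")
        self.size = size
        self._slots = [None] * size
        self._idx = size - 1  # slot that holds the newest frame
        self._needs_init = True

    def reset(self):
        # The next push fills every slot, so no frame from the previous
        # episode leaks into the new one.
        self._needs_init = True

    def push(self, frame):
        """Store `frame` and return the channel-wise stack, oldest first."""
        self._idx = (self._idx + 1) % self.size
        if self._needs_init:
            self._slots = [list(frame) for _ in range(self.size)]
            self._needs_init = False
        else:
            self._slots[self._idx] = list(frame)
        stacked = []
        for i in range(self.size):
            # Start one slot past the newest frame, which is the oldest one.
            stacked.extend(self._slots[(self._idx + 1 + i) % self.size])
        return stacked
```

With `size=2` and three-channel frames, the first push after a reset returns the same frame twice (six values), and each later push returns the previous frame followed by the current one.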

Type of change

Bug fix (non-breaking change which fixes an issue)

Screenshots

Checklist

I have read and understood the contribution guidelines
I have run the pre-commit checks with ./isaaclab.sh --format
I have made corresponding changes to the documentation
My changes generate no new warnings
I have added tests that prove my fix is effective or that my feature works
I have updated the changelog and the corresponding version in the extension's config/extension.toml file
I have added my name to the CONTRIBUTORS.md or my name already exists there


kellyguo11

Review Summary

Good two-part PR that addresses a real sim2real gap: (1) disabling temporal AA (DLSS) by default so RL policies don't learn from renderer-specific temporal artifacts, and (2) adding frame stacking for Newton to explicitly provide temporal information that Newton's energy-conserving integrator needs.

The implementation is solid — ring buffer approach for frame stacking is clean, the MultiBackendTiledCameraCfg preset pattern is a nice API improvement over the previous renderer_cfg=MultiBackendRendererCfg() approach, and the test coverage is thorough (12+ new tests covering frame stacking, ring buffer correctness, reset behavior, and AA mode warnings).

A few performance and correctness issues worth addressing below.

kellyguo11 · 2026-04-10T19:02:28Z

source/isaaclab/isaaclab/sensors/camera/tiled_camera.py

+                    self.renderer.write_output(self.render_data, name, single)
+                history = self._frame_history[name]
+                # For envs that just reset, fill all history slots with current frame
+                if needs_init.any():


should-fix: needs_init.any() triggers a GPU→CPU synchronization on every update step, for every output channel. In the common case (no environments were just reset), this is a wasted sync that stalls the pipeline.

Consider tracking reset state with a CPU-side flag instead:

    # In reset():
    self._frame_stack_needs_init_cpu = True  # simple bool

    # Here:
    if self._frame_stack_needs_init_cpu:
        init_ids = self._frame_stack_needs_init.nonzero(as_tuple=False).squeeze(-1)
        if len(init_ids) > 0:
            for i in range(self._frame_stack_size):
                history[i, init_ids] = single[init_ids]
            self._frame_stack_needs_init.zero_()
        self._frame_stack_needs_init_cpu = False

This avoids the GPU→CPU sync entirely in the steady-state path.

kellyguo11 · 2026-04-10T19:02:28Z

source/isaaclab/isaaclab/sensors/camera/tiled_camera.py

+                ordered = torch.cat(
+                    [
+                        history[(self._frame_stack_idx + 1 + i) % self._frame_stack_size]
+                        for i in range(self._frame_stack_size)


nit (perf): The torch.cat with a list comprehension per update is fine for frame_stack=2 but scales as O(frame_stack) allocations per step. Since the API allows arbitrary frame_stack values, worth either:

Documenting that values > 4 are not recommended for performance reasons, or

Using pre-allocated views with torch.narrow + in-place copies instead of cat

Not blocking for the current use case (frame_stack=2).

kellyguo11 · 2026-04-10T19:02:28Z

source/isaaclab/isaaclab/envs/direct_rl_env.py

+                        c,
+                        c * frame_stack,
+                        frame_stack,
+                        attr_name,


suggestion: This break means only the first camera with frame_stack > 1 adjusts the observation space. For single-camera tasks this is correct, but it silently ignores additional cameras.

Worth adding a comment like # NOTE: Only the first camera with frame_stack > 1 is used to adjust observation_space. Multi-camera tasks should set observation_space explicitly.

Also: the observation_space auto-adjust assumes the space is [H, W, C] from a single camera. If a task uses a different observation layout, the multiplication could corrupt the space dimensions. A guard like checking that c matches the camera's expected channel count would make this more robust.
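A guard along these lines might look like the sketch below. The function name `stacked_observation_shape` and its arguments are hypothetical and do not come from `direct_rl_env.py`; the sketch only assumes a channel-last `(H, W, C)` layout and leaves anything else untouched.

```python
def stacked_observation_shape(obs_shape, camera_channels, frame_stack):
    """Hypothetical helper: multiply the channel count by frame_stack.

    Assumes a channel-last (H, W, C) layout. The shape is returned
    unchanged when it does not look like a single camera image, rather
    than guessing which dimension to scale.
    """
    if frame_stack <= 1:
        return tuple(obs_shape)
    if len(obs_shape) != 3 or obs_shape[-1] != camera_channels:
        # Different layout (flattened, multi-camera, ...): the task
        # should set observation_space explicitly.
        return tuple(obs_shape)
    height, width, channels = obs_shape
    return (height, width, channels * frame_stack)
```

For a 100x100 RGB camera with `frame_stack=2` this yields `(100, 100, 6)`, while a flattened or mismatched space passes through unmodified.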

kellyguo11 · 2026-04-10T19:02:28Z

source/isaaclab_tasks/isaaclab_tasks/utils/sim_launcher.py

    visualizer_intent = _compute_visualizer_intent(env_cfg)
    _set_visualizer_intent_on_launcher_args(launcher_args, visualizer_intent)

+    # Warn when Newton physics is used with camera observations but no frame stacking.


should-fix: _has_camera_without_frame_stack checks isinstance(node, CameraCfg) but frame_stack is only defined on TiledCameraCfg. A regular CameraCfg (non-tiled) will hit getattr(node, "frame_stack", 1) <= 1 → True, causing a false-positive warning even though frame stacking isn't applicable to non-tiled cameras.

Should be:

    from isaaclab.sensors import TiledCameraCfg

    def _has_camera_without_frame_stack(node) -> bool:
        if not isinstance(node, TiledCameraCfg):
            return False
        return getattr(node, "frame_stack", 1) <= 1

kellyguo11 · 2026-04-10T19:02:28Z

source/isaaclab/isaaclab/sim/simulation_cfg.py

-      image quality.
-
-    This is set by the variable: ``/rtx/post/dlss/execMode``.
+    antialiasing_mode: Literal["Off", "FXAA", "DLSS", "TAA", "DLAA"] | None = "FXAA"


suggestion (breaking change): Changing the default from None (preserve renderer default, which was DLSS) to "FXAA" is a user-facing behavioral change — existing code that relied on the previous default will now get different rendering output and potentially different performance characteristics.

This is the right default for RL training, but it should be called out in the changelog since it affects all users, not just Newton users. The PR checklist has the changelog item unchecked — please add an entry noting the default AA mode change.

kellyguo11 · 2026-04-10T19:02:28Z

source/isaaclab/test/sensors/test_tiled_camera_frame_stack.py

+    # Step to build history
+    for _ in range(3):
+        sim.step()
+        camera.update(dt=0.01)


nit: import logging has no dependency on the simulator and can be moved to the top-level imports (before AppLauncher). Only isaaclab_tasks and pxr need to be deferred.

kellyguo11 · 2026-04-10T19:02:28Z

source/isaaclab/test/sensors/test_tiled_camera_frame_stack.py

+import logging
+
+import isaaclab_tasks  # noqa: F401
+from isaaclab_tasks.utils import resolve_task_config


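suggestion: Manipulating sys.argv directly is fragile: if a test fails before restoring it, or other tests share the same process, later code can see the modified argv. Consider using unittest.mock.patch('sys.argv', [...]) as a context manager, which is cleaner and restores the original value on exit, even on exceptions. Note that it still patches process-wide state, so it does not by itself make concurrent threads safe: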

    from unittest.mock import patch

    with patch('sys.argv', [sys.argv[0], 'presets=newton']):
        env_cfg, _ = resolve_task_config(...)
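As a self-contained illustration of the restore-on-exit behavior, the sketch below patches `sys.argv` around a stand-in parser. `read_presets` is a hypothetical placeholder for whatever reads `presets=...` from the command line; it is not `resolve_task_config`.

```python
import sys
from unittest.mock import patch


def read_presets(argv=None):
    """Hypothetical stand-in for a parser that reads `presets=...` from argv."""
    argv = sys.argv if argv is None else argv
    return [arg.split("=", 1)[1] for arg in argv[1:] if arg.startswith("presets=")]


def presets_under_patch():
    """Read presets with a patched argv and report whether argv was restored."""
    original = list(sys.argv)
    with patch("sys.argv", [sys.argv[0], "presets=newton"]):
        inside = read_presets()
    # On leaving the block, patch puts the original list back.
    restored = sys.argv == original
    return inside, restored
```

The patched value is only visible inside the `with` block, so a failing assertion inside it cannot leave a modified `argv` behind for later tests.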

kellyguo11 · 2026-04-10T19:02:28Z

source/isaaclab_tasks/isaaclab_tasks/utils/presets.py

+    provides this temporal information by concatenating consecutive frames along
+    the channel dimension, enabling the policy to infer velocity from pixel
+    differences between frames.
+    """


suggestion: Nice API — MultiBackendTiledCameraCfg as a TiledCameraCfg subclass with embedded MultiBackendRendererCfg is cleaner than the previous pattern of separate renderer_cfg fields.

One thing to document: downstream code using type(cam) == TiledCameraCfg (exact type check) would now fail. isinstance checks still work. Might be worth a migration note in the PR description or changelog.
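The difference between the two checks can be shown with stand-in classes. The classes below are minimal placeholders that only mirror the subclass relationship described in this comment; they are not the Isaac Lab config classes.

```python
class TiledCameraCfg:
    """Stand-in for the real config class (illustration only)."""

    frame_stack = 1


class MultiBackendTiledCameraCfg(TiledCameraCfg):
    """Stand-in subclass, mirroring the relationship described above."""

    frame_stack = 2


def is_tiled_exact(cfg):
    # Exact type comparison: False for any subclass instance.
    return type(cfg) is TiledCameraCfg


def is_tiled(cfg):
    # isinstance accepts the base class and every subclass.
    return isinstance(cfg, TiledCameraCfg)
```

Downstream code that follows the `is_tiled_exact` pattern would stop recognizing cameras once tasks switch to the subclass, which is the migration risk noted here.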

jmart-nv added 2 commits April 10, 2026 13:20

github-actions bot added the isaac-lab label (Related to Isaac Lab team) Apr 10, 2026

kellyguo11 reviewed Apr 10, 2026

fatimaanes self-requested a review April 13, 2026 19:46

jmart-nv wants to merge 2 commits into isaac-sim:develop from jmart-nv:jmart/temporal-aa
