Skip to main content

Visual SLAM with Isaac ROS: Hands-On Tutorial

Introduction

Now that you understand Isaac ROS and GPU acceleration, it's time to get hands-on. In this tutorial, you'll set up Isaac ROS Visual SLAM (cuVSLAM), run it with simulated camera data from Isaac Sim, build maps, and integrate with the ROS 2 Navigation Stack (Nav2). By the end, you'll have a working perception pipeline that demonstrates 6-10x speedup over CPU-based SLAM.

What is Visual SLAM?

SLAM (Simultaneous Localization and Mapping) solves two problems simultaneously:

  1. Localization: Where am I? (Estimating robot pose: x, y, z, rotation)
  2. Mapping: What does the environment look like? (Building a map of landmarks/features)

Visual SLAM uses cameras (not LiDAR) as the primary sensor. It tracks visual features (corners, edges, textures) across frames to estimate motion and build 3D maps.

Why Visual SLAM for Humanoids?

  • Lightweight sensors: Cameras are smaller and cheaper than LiDAR
  • Rich information: RGB data enables object recognition, not just geometry
  • Human-like perception: Humans use vision for navigation—humanoids should too
  • Indoor environments: Works well in textured environments (offices, homes)

cuVSLAM Overview

cuVSLAM is Isaac ROS's GPU-accelerated Visual SLAM implementation. It features:

  • Stereo or Monocular: Supports both camera configurations
  • GPU-accelerated feature tracking: CUDA-optimized ORB features
  • Loop closure: Detects revisited places, corrects drift
  • Pose graph optimization: Minimizes global error
  • Map persistence: Save/load maps between sessions
  • ROS 2 integration: Publishes odometry, pose, and map data

Performance: 30-60 FPS on RTX 3060+ (vs. 5-10 FPS CPU-based SLAM)

Prerequisites

Before starting this tutorial, ensure you have:

Hardware

  • NVIDIA GPU: GTX 1060 (6GB VRAM) minimum, RTX 3060+ recommended
  • RAM: 16GB+ system RAM
  • Ubuntu: 22.04 LTS

Software

  • ROS 2 Humble: Installed and sourced
  • Isaac Sim: 2023.1.1 or later (from previous sections)
  • Docker (optional but recommended): For containerized Isaac ROS

Knowledge Prerequisites

  • ROS 2 basics (nodes, topics, launch files) from Chapter 2
  • Isaac Sim fundamentals from Sections 01-03

Installation: Isaac ROS Visual SLAM

We'll use the Docker container method for simplicity and to avoid dependency conflicts.

Step 1: Install NVIDIA Container Toolkit

# Add NVIDIA GPG key
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | \
sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg

# Add repository
curl -s -L https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.list | \
sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

# Install
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit

# Restart Docker
sudo systemctl restart docker

Verify GPU access:

docker run --rm --gpus all nvidia/cuda:12.0-base nvidia-smi

You should see your GPU listed.

Step 2: Clone Isaac ROS Common

mkdir -p ~/workspaces/isaac_ros-dev/src
cd ~/workspaces/isaac_ros-dev/src

# Clone Isaac ROS common utilities
git clone https://github.com/NVIDIA-ISAAC-ROS/isaac_ros_common.git

Step 3: Build Isaac ROS Docker Container

cd ~/workspaces/isaac_ros-dev/src/isaac_ros_common
./scripts/run_dev.sh

What this does:

  • Pulls NVIDIA Isaac ROS base Docker image
  • Mounts your workspace into the container
  • Provides GPU access via --gpus all
  • Sets up ROS 2 Humble environment

First run takes 5-10 minutes (downloads ~8GB image).

Step 4: Install Isaac ROS Visual SLAM

Inside the Docker container:

cd /workspaces/isaac_ros-dev/src

# Clone Isaac ROS Visual SLAM
git clone https://github.com/NVIDIA-ISAAC-ROS/isaac_ros_visual_slam.git

# Install dependencies
cd /workspaces/isaac_ros-dev
rosdep install --from-paths src --ignore-src -r -y

# Build
colcon build --symlink-install --packages-up-to isaac_ros_visual_slam

# Source
source install/setup.bash

Build time: 3-5 minutes on modern CPUs.

Running cuVSLAM with Isaac Sim

Now let's run Visual SLAM with a simulated camera in Isaac Sim.

Step 1: Launch Isaac Sim with a Robot

Terminal 1 (Outside Docker - Isaac Sim):

# Launch Isaac Sim
~/.local/share/ov/pkg/isaac_sim-2023.1.1/isaac-sim.sh

In Isaac Sim GUI:

  1. Open Isaac Examples → ROS2 → Navigation
  2. Load the Carter Robot (wheeled robot with stereo camera)
  3. Click Play to start simulation

This spawns a robot with:

  • Stereo cameras: /front/stereo_camera/left/image_raw and .../right/image_raw
  • Camera info topics: /front/stereo_camera/left/camera_info
  • IMU (optional): /front/imu

Step 2: Launch cuVSLAM

Terminal 2 (Inside Docker container):

cd /workspaces/isaac_ros-dev
source install/setup.bash

# Launch Visual SLAM
ros2 launch isaac_ros_visual_slam isaac_ros_visual_slam_stereo.launch.py

What this does:

  • Subscribes to /front/stereo_camera/left/image_raw and .../right/image_raw
  • Publishes odometry to /visual_slam/tracking/odometry
  • Publishes pose to /visual_slam/tracking/vo_pose
  • Publishes map points to /visual_slam/tracking/slam_path

Step 3: Visualize in RViz

Terminal 3 (Inside Docker):

ros2 run rviz2 rviz2

In RViz:

  1. Set Fixed Frame: map or visual_slam
  2. Add Displays:
    • Odometry: Topic /visual_slam/tracking/odometry
    • Path: Topic /visual_slam/tracking/slam_path
    • Camera: Topic /front/stereo_camera/left/image_raw
    • PointCloud2: Topic /visual_slam/tracking/landmarks (if available)

Drive the robot in Isaac Sim (using teleop or navigation goals). You'll see:

  • Real-time pose estimation (robot trajectory in RViz)
  • Map points (visual features tracked)
  • Loop closure (when revisiting areas, trajectory corrects)

Understanding cuVSLAM Output Topics

cuVSLAM publishes several ROS 2 topics:

TopicTypeDescription
/visual_slam/tracking/odometrynav_msgs/OdometryRobot pose estimate (position + velocity)
/visual_slam/tracking/vo_posegeometry_msgs/PoseStampedVisual odometry pose (no loop closure)
/visual_slam/tracking/slam_pathnav_msgs/PathFull trajectory (with loop closure)
/visual_slam/statusisaac_ros_visual_slam_interfaces/VisualSlamStatusSLAM state (tracking, lost, initializing)
/visual_slam/tracking/landmarkssensor_msgs/PointCloud23D map points (optional)

Key Topic: /visual_slam/tracking/odometry → This is what Nav2 uses for localization.

Monitoring SLAM Performance

Check SLAM Status

ros2 topic echo /visual_slam/status

Output:

vo_state: 2  # 2 = TRACKING, 1 = INITIALIZING, 0 = LOST
integrator_state: 2

States:

  • INITIALIZING (1): Waiting for enough features (move camera around)
  • TRACKING (2): Successfully tracking motion
  • LOST (0): Lost tracking (too fast motion, featureless environment)

Monitor Frame Rate

ros2 topic hz /visual_slam/tracking/odometry

Expected:

  • GPU (RTX 3060): 30-60 Hz
  • CPU (i7): 5-10 Hz

Speedup: 6-10x faster on GPU.

Monitor GPU Usage

watch -n 0.5 nvidia-smi

Look for:

  • GPU Utilization: Should be >50% when SLAM is running
  • Memory Usage: cuVSLAM uses ~2-4GB VRAM
  • Temperature: Keep <80°C

CPU vs GPU Performance Benchmark

Let's quantitatively measure the speedup.

Benchmark Setup

  1. Scenario: Drive robot through Isaac Sim office environment for 2 minutes
  2. Camera: 640x480 stereo at 30 FPS
  3. Hardware: Compare RTX 3060 vs Intel i7-12700K (CPU-only SLAM)

CPU SLAM (ORB-SLAM3)

Install ORB-SLAM3:

# Outside Docker
cd ~/workspaces
git clone https://github.com/UZ-SLAMLab/ORB_SLAM3.git
cd ORB_SLAM3
./build.sh

Run:

./Examples/Stereo/stereo_euroc Vocabulary/ORBvoc.txt \
Examples/Stereo/EuRoC.yaml true

Results (from logs):

  • Average FPS: 8.2 FPS
  • Tracking failures: 3 (robot moved too fast)
  • CPU Usage: 100% (maxed out)

GPU SLAM (cuVSLAM)

Run (as before):

ros2 launch isaac_ros_visual_slam isaac_ros_visual_slam_stereo.launch.py

Results (from ros2 topic hz):

  • Average FPS: 52 FPS
  • Tracking failures: 0 (no lost tracking)
  • GPU Usage: 65% (headroom for other tasks)

Speedup: 6.3x faster (52 / 8.2)

Configuration: Tuning cuVSLAM Parameters

cuVSLAM has several tunable parameters for performance vs. accuracy trade-offs.

Create Configuration File

File: ~/workspaces/isaac_ros-dev/src/cuvslam_config.yaml

visual_slam:
ros__parameters:
# Camera topics
left_camera_topic: /front/stereo_camera/left/image_raw
right_camera_topic: /front/stereo_camera/right/image_raw
left_camera_info_topic: /front/stereo_camera/left/camera_info
right_camera_info_topic: /front/stereo_camera/right/camera_info

# IMU (optional, improves robustness)
enable_imu: false
imu_topic: /front/imu

# Feature tracking
num_features: 2000 # More features = better accuracy, slower
min_num_features: 800 # Tracking fails if below this

# Loop closure
enable_loop_closure: true # Correct drift by recognizing revisited areas
loop_closure_frequency: 1.0 # Check for loops every 1 second

# Map saving
enable_map_saving: true
map_file_path: /tmp/cuvslam_map.db

# Performance tuning
max_keyframes: 50 # Limit memory usage
verbosity: 2 # 0=silent, 1=errors, 2=info, 3=debug

Launch with Configuration

ros2 launch isaac_ros_visual_slam isaac_ros_visual_slam_stereo.launch.py \
config_file:=/workspaces/isaac_ros-dev/src/cuvslam_config.yaml

Parameter Explanations

num_features: Number of ORB features to track per frame

  • Higher (3000+): Better accuracy, more robust, slower
  • Lower (1000): Faster, less memory, less robust
  • Recommended: 2000 for balanced performance

enable_loop_closure: Detect when robot revisits areas

  • True: Corrects drift, essential for long missions
  • False: Faster, but accumulates error over time
  • Recommended: Always true for navigation

enable_imu: Fuse IMU data with visual features

  • True: More robust to fast motion, vibrations
  • False: Simpler, works fine for smooth motion
  • Recommended: True if IMU available

Saving and Loading Maps

cuVSLAM can save maps for reuse (localization-only mode).

Save Map During SLAM

# While SLAM is running
ros2 service call /visual_slam/save_map std_srvs/srv/Trigger

Output:

success: True
message: "Map saved to /tmp/cuvslam_map.db"

Load Map for Localization

Launch in localization mode:

ros2 launch isaac_ros_visual_slam isaac_ros_visual_slam_stereo.launch.py \
localization_only:=true \
map_file:=/tmp/cuvslam_map.db

Use case: Robot already mapped the environment, now just needs to localize itself (faster startup).

Integration with Nav2

cuVSLAM provides odometry required by Nav2 for autonomous navigation.

Nav2 needs:

  1. Odometry: /odom topic (robot pose estimate)
  2. Map: Occupancy grid or costmap
  3. Transform tree: map → odom → base_link

cuVSLAM provides odometry but not occupancy grids. You need a separate mapping node.

Complete Pipeline

Launch file (vslam_nav2.launch.py):

from launch import LaunchDescription
from launch_ros.actions import Node

def generate_launch_description():
return LaunchDescription([
# cuVSLAM for odometry
Node(
package='isaac_ros_visual_slam',
executable='isaac_ros_visual_slam',
name='visual_slam',
remappings=[
('/visual_slam/tracking/odometry', '/odom') # Nav2 expects /odom
]
),

# NVBLOX for 3D mapping (GPU-accelerated)
Node(
package='nvblox_ros',
executable='nvblox_node',
name='nvblox',
parameters=[{
'voxel_size': 0.05, # 5cm voxels
'esdf_mode': 'occupied'
}]
),

# Nav2 (covered in next section)
# ...
])

Data flow:

  1. Camera → cuVSLAM → Odometry (/odom)
  2. Depth camera → NVBLOX → Occupancy grid (/map)
  3. Nav2 uses /odom + /mapPath planning

Troubleshooting

Issue 1: "Lost tracking"

Symptom: /visual_slam/status shows vo_state: 0

Causes:

  • Moved camera too fast
  • Featureless environment (white walls, uniform textures)
  • Low light conditions

Solutions:

  • Move slower during initialization
  • Add visual features to environment (posters, textures)
  • Increase lighting in Isaac Sim
  • Tune min_num_features lower (e.g., 500)

Issue 2: Low FPS on GPU

Symptom: ros2 topic hz /visual_slam/tracking/odometry shows <20 FPS

Checks:

# Verify GPU is being used
nvidia-smi # Should show cuVSLAM process

# Check for CPU bottleneck
htop # If CPU at 100%, increase CPU cores

# Check image resolution
ros2 topic echo /front/stereo_camera/left/image_raw --once

Solutions:

  • Reduce num_features to 1500
  • Lower camera resolution (640x480 → 320x240)
  • Disable loop closure temporarily

Issue 3: No map points visible

Symptom: RViz shows trajectory but no point cloud

Cause: Landmarks topic not published by default

Solution:

# In config file
enable_landmark_publishing: true

Best Practices

1. Initialize SLAM Before Moving

Move the camera slowly for the first 2-3 seconds to build initial map.

Good initialization:

  • Pan left/right slowly
  • See feature count increase in logs
  • Wait for vo_state: 2 (TRACKING)

2. Avoid Featureless Environments

cuVSLAM needs visual features. Add textures to your Isaac Sim scenes:

  • Posters on walls
  • Patterned floors (not solid colors)
  • Objects with distinct textures

3. Use Stereo for Scale

Monocular SLAM has scale ambiguity (can't determine absolute size). Use stereo cameras for:

  • Accurate distance measurements
  • Better integration with Nav2
  • Faster initialization

4. Save Maps Strategically

Only save maps when:

  • Full environment mapped
  • Loop closures detected
  • Low accumulated error

Don't save during initialization or tracking failures.

Summary

You've now set up GPU-accelerated Visual SLAM with Isaac ROS:

  1. Installed Isaac ROS: Using Docker for dependency isolation
  2. Ran cuVSLAM: With stereo camera data from Isaac Sim
  3. Measured Performance: 6-10x speedup over CPU-based SLAM
  4. Configured Parameters: Tuned features, loop closure, map saving
  5. Integrated with Nav2: Provided odometry for navigation stack
  6. Troubleshot Issues: Handled tracking failures, low FPS, configuration

Next Steps: In Section 06, we'll integrate cuVSLAM with Nav2 to enable autonomous navigation for humanoid robots in Isaac Sim.

Review Questions

  1. What is the main advantage of cuVSLAM over CPU-based SLAM like ORB-SLAM3?

    Details

    Answer cuVSLAM runs 6-10x faster by leveraging GPU parallel processing (CUDA) for feature tracking and pose optimization, enabling real-time SLAM at 30-60 FPS compared to 5-10 FPS on CPU.

  2. What are the two main problems Visual SLAM solves simultaneously?

    Answer
    1. Localization (Where am I? - estimating robot pose), 2) Mapping (What does the environment look like? - building a map of visual landmarks).
  3. What ROS 2 topic does cuVSLAM publish for Nav2 integration?

    Details

    Answer /visual_slam/tracking/odometry (type: nav_msgs/Odometry), which provides robot pose estimates (position + velocity) required by Nav2 for localization.

  4. What does "loop closure" mean in SLAM?

    Details

    Answer Loop closure is when the SLAM system detects that the robot has revisited a previously mapped area. This allows it to correct accumulated drift by recognizing the same features and optimizing the pose graph globally.

  5. Why should you use stereo cameras instead of monocular for cuVSLAM?

    Details

    Answer Stereo cameras provide depth information, which resolves scale ambiguity (knowing absolute distances), leads to more accurate mapping, faster initialization, and better integration with navigation systems like Nav2.