Prism

ROS 2 perception acceleration that picks the right path through your hardware.

ROS 2 Humble · Apache-2.0 · v0.1.0
01 — What it is

A hardware-agnostic perception accelerator

Prism is a hardware-agnostic ROS 2 image-processing accelerator and a drop-in replacement for image_proc::ResizeNode's resize pipeline: same parameters, same output, with a scaled CameraInfo on the paired topic. At startup it detects host accelerators and live-validates them against the GStreamer registry. On supported GPUs (Intel VA-API, NVIDIA Jetson NVMM) it runs a GStreamer pipeline with zero-copy intra-process ingest, single-copy egress, and no DDS round-trip; when no usable GPU path is present, it falls back to a direct cv::resize.

  • Drop-in replacement for image_proc::ResizeNode's resize pipeline
  • Detects and live-validates VA-API / NVMM / CPU at startup; per-action routing is operator-measured
  • Zero-copy intra-process ingest, single-copy egress, no DDS round-trip
launch.py — one-line swap
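from launch_ros.descriptions import ComposableNode

# Only the package and plugin names change; parameters stay as they were.
ComposableNode(
    package='prism_image_proc',          # was: 'image_proc'
    plugin='prism::ResizeNode',          # was: 'image_proc::ResizeNode'
    name='resize',
    parameters=[{'width': 640, 'height': 480}],
)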
02 — Quick start

Get running in three steps

Install prerequisites

On Ubuntu 22.04 with ROS 2 Humble, install the image-transport plumbing and the GStreamer dev headers.

shell
# Ubuntu 22.04 · ROS 2 Humble
sudo apt install \
    ros-humble-image-transport \
    ros-humble-image-transport-plugins \
    ros-humble-image-proc \
    libgstreamer1.0-dev \
    libgstreamer-plugins-base1.0-dev \
    gstreamer1.0-plugins-base \
    gstreamer1.0-plugins-good \
    gstreamer1.0-vaapi

Clone and build

Drop the repo into a colcon workspace and build only the Prism packages.

shell
mkdir -p ~/prism_ws/src && cd ~/prism_ws/src
git clone https://github.com/sohams25/prism-ros.git
cd ~/prism_ws
source /opt/ros/humble/setup.bash
colcon build --packages-select prism_image_proc
source install/setup.bash

Run the demo

Launches a synthetic source feeding a Prism resize node in the same composable container — intra-process delivery, no DDS round-trip; zero-copy ingest plus single-copy egress where the host supports it.

shell
ros2 launch prism_image_proc prism_image_proc_demo.launch.py
Prefer the A/B stress test? See the benchmarking section below.
03 — Architecture

Detect, validate, dispatch

HardwareDetector probes the host; PipelineFactory queries the live GStreamer registry via gst_element_factory_find to validate each candidate — element presence is not the same as element working on this kernel, driver, and chroma-subsampling combination. Jetson and Intel probes are two-step (prefer nvvideoconvert / vapostproc; fall back to nvvidconv / vaapipostproc). The first backend that validates wins. Per-action routing within a backend is a hand-coded table from operator A/B measurement, not an autonomous runtime optimiser.

[Figure: Prism runtime selection. Startup probes HardwareDetector, which feeds PipelineFactory. PipelineFactory validates three candidate backends against the live GStreamer registry in priority order: (1) Jetson NVMM (nvvideoconvert / nvvidconv), (2) Intel VA-API (vapostproc), (3) CPU fallback (cv::resize). Whichever validates first wins.]

Fallback chain

| Priority | Platform | Detection | Processing |
| --- | --- | --- | --- |
| 1 | NVIDIA Jetson | /dev/nvhost-*, /dev/nvmap | GStreamer nvvideoconvert (CUDA / NVMM); legacy nvvidconv accepted as second-step probe |
| 2 | Intel iGPU / dGPU | vapostproc validates against the live registry | GStreamer VA-API |
| 3 | Any x86 / ARM | Always available | Direct cv::resize |
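
The selection order in the table above can be sketched as a small priority loop. This is a minimal sketch, not Prism's code: the backend labels and the `validates` hook are hypothetical, and in a real node that hook would stand in for whatever registry lookup and live check PipelineFactory performs.

```python
from typing import Callable, Optional, Tuple

# Candidate GPU backends in priority order. Each lists its preferred
# element first, then the legacy element used as the second-step probe.
# The backend labels are illustrative names, not Prism identifiers.
BACKENDS = [
    ("jetson_nvmm", ["nvvideoconvert", "nvvidconv"]),
    ("intel_vaapi", ["vapostproc", "vaapipostproc"]),
]


def select_backend(validates: Callable[[str], bool]) -> Tuple[str, Optional[str]]:
    """Return (backend, element) for the first element that validates.

    `validates` is a hypothetical hook: it receives an element name and
    returns True only if that element passes the live check on this host.
    """
    for backend, elements in BACKENDS:
        for element in elements:
            if validates(element):
                return backend, element
    # The CPU path needs no GStreamer element, so it is always available.
    return "cpu_direct", None


if __name__ == "__main__":
    # A host where no GPU element validates ends up on the CPU path.
    print(select_backend(lambda name: False))
    # A Jetson image that only ships the legacy element.
    print(select_backend(lambda name: name == "nvvidconv"))
```

Passing the check in as a callable keeps the ordering logic testable on machines without any GStreamer install.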

GStreamer 1.20 Intel caveat

On stock ROS 2 Humble (GStreamer 1.20), the vaapipostproc element is present but fails live validation due to a chroma-subsampling regression; the Intel iGPU path falls back to direct cv::resize mode. GStreamer 1.22+ (Ubuntu 24.04 / Jazzy) is required for the GPU resize kernel on Intel. NVIDIA Jetson NVMM and the CPU direct path are unaffected.

Jetson legacy nvvidconv BGR-CAPS gap

On the legacy nvvidconv element (still common on Jetson Orin images), the sink/src caps do not list BGR. Any GPU stage on this image carries a CPU videoconvert ↔ BGRx adapter on the ingress boundary at full source resolution. Per-action routing is a hand-coded table from operator A/B measurement, not an autonomous runtime optimiser: resize and chain keep the GPU path; crop routes to CPU videocrop. The Round-3 finding on colorconvert (bench-harness CPU saturation, not BGR-adapter dominance) lives in orin_simple_summary.md.
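
A hand-coded routing table of this kind can be as simple as a dictionary lookup. The sketch below transcribes the routing described on this page for a legacy nvvidconv image (the colorconvert entry comes from the Jetson benchmark notes); the names `LEGACY_NVVIDCONV_ROUTES` and `route_action` are hypothetical and do not come from the Prism source.

```python
# Hypothetical per-action routing table for a Jetson image that only
# ships the legacy nvvidconv element. Values are transcribed from this
# page, not read from Prism's code.
LEGACY_NVVIDCONV_ROUTES = {
    "resize": "gpu",
    "chain": "gpu",
    "colorconvert": "gpu",
    "crop": "cpu",  # routed to CPU videocrop
}


def route_action(action: str) -> str:
    """Look up the measured route for an action, failing loudly if absent."""
    try:
        return LEGACY_NVVIDCONV_ROUTES[action]
    except KeyError:
        # No A/B measurement exists for this action, so there is no route.
        raise ValueError(f"no route measured for action {action!r}") from None


if __name__ == "__main__":
    for name in ("resize", "crop", "colorconvert", "chain"):
        print(name, "->", route_action(name))
```

Failing on an unknown action, rather than guessing, matches the point that the table reflects measurement and not a runtime optimiser.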

04 — Components

Registered components

prism::ImageProcNode

Chainable base node. Configurable action chain (resize, crop, flip, colorconvert), CameraInfo transforms applied per action, runtime reconfigurable.

prism::ResizeNode

Thin wrapper pinning action="resize". Drop-in replacement for image_proc::ResizeNode's resize pipeline. Preferred entry point for migrations.

prism::CropNode

Thin wrapper pinning action="crop".

prism::ColorConvertNode

Thin wrapper pinning action="colorconvert". Targets bgr8, rgb8, or mono8 output.


Helpers

prism::MediaStreamerNode

Video-file publisher for tests.

prism::Synthetic4kPubNode

Synthetic 4K test source.

05 — Parameters

Parameter reference


Core resize

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| use_scale | bool | false | Scale by factor when true; use absolute width/height when false. |
| scale_width | double | 1.0 | Horizontal scale factor applied when use_scale is true. |
| scale_height | double | 1.0 | Vertical scale factor applied when use_scale is true. |
| width | int | 640 | Absolute output width in pixels. |
| height | int | 480 | Absolute output height in pixels. |
| input_topic | string | /camera/image_raw | Input sensor_msgs/Image topic name. |
| output_topic | string | /camera/image_processed | Output sensor_msgs/Image topic name. |
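
How the sizing parameters interact can be shown with a short function. This is a sketch under stated assumptions: the function name `resolve_output_size` is hypothetical, and rounding to the nearest pixel with a one-pixel floor is a guess, since the rounding rule is not stated here.

```python
from typing import Tuple


def resolve_output_size(
    src_width: int,
    src_height: int,
    *,
    use_scale: bool = False,
    scale_width: float = 1.0,
    scale_height: float = 1.0,
    width: int = 640,
    height: int = 480,
) -> Tuple[int, int]:
    """Return the output (width, height) implied by the resize parameters.

    Defaults mirror the parameter table. The rounding rule is an
    assumption made for this sketch.
    """
    if use_scale:
        # Scale factors apply to the source size; absolute values are ignored.
        out_w = max(1, round(src_width * scale_width))
        out_h = max(1, round(src_height * scale_height))
        return out_w, out_h
    # Absolute mode: the source size does not matter.
    return width, height


if __name__ == "__main__":
    print(resolve_output_size(3840, 2160))
    print(resolve_output_size(3840, 2160, use_scale=True,
                              scale_width=0.5, scale_height=0.5))
```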

Action chain

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| action | string | resize | One of resize, crop, flip, colorconvert. |
| target_encoding | string | bgr8 | Output encoding for colorconvert: bgr8 / rgb8 / mono8. |
| crop_x | int | 0 | Crop origin X (pixels from left). |
| crop_y | int | 0 | Crop origin Y (pixels from top). |
| crop_width | int | 0 | Crop region width in pixels. |
| crop_height | int | 0 | Crop region height in pixels. |
| flip_method | string | none | One of none, horizontal, vertical. |
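
The enumerated values in this table lend themselves to a pre-launch sanity check. The sketch below is a hypothetical helper (`check_action_params` is not part of Prism) that only checks what the table states, plus one assumption: that negative crop values are invalid.

```python
from typing import Dict, List

# Allowed values, copied from the parameter table.
ACTIONS = {"resize", "crop", "flip", "colorconvert"}
ENCODINGS = {"bgr8", "rgb8", "mono8"}
FLIP_METHODS = {"none", "horizontal", "vertical"}


def check_action_params(params: Dict[str, object]) -> List[str]:
    """Return a list of problems found in `params`; empty means none found."""
    problems: List[str] = []
    action = params.get("action", "resize")
    if action not in ACTIONS:
        problems.append(f"unknown action: {action!r}")
    if action == "colorconvert":
        encoding = params.get("target_encoding", "bgr8")
        if encoding not in ENCODINGS:
            problems.append(f"unsupported target_encoding: {encoding!r}")
    if action == "flip":
        method = params.get("flip_method", "none")
        if method not in FLIP_METHODS:
            problems.append(f"unknown flip_method: {method!r}")
    if action == "crop":
        # Assumption: negative origins or sizes are never meaningful.
        for key in ("crop_x", "crop_y", "crop_width", "crop_height"):
            if int(params.get(key, 0)) < 0:
                problems.append(f"{key} must not be negative")
    return problems


if __name__ == "__main__":
    print(check_action_params({"action": "colorconvert",
                               "target_encoding": "mono8"}))
    print(check_action_params({"action": "rotate"}))
```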

Transport

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| input_transport | string | raw | image_transport plugin for the input (raw, compressed, …). |
| publish_camera_info | bool | true | Publish a transformed CameraInfo on the paired topic. |
| camera_info_input_topic | string | "" | Optional override; empty string derives <image_topic_namespace>/camera_info per ROS convention. |
| camera_info_output_topic | string | "" | Optional override for the published CameraInfo topic. |
| source_width | int | 3840 | Source caps width in pixels (GPU mode only). |
| source_height | int | 2160 | Source caps height in pixels (GPU mode only). |
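
For a resize, "transformed CameraInfo" conventionally means scaling the intrinsics along with the image. The sketch below shows that standard scaling on the row-major K (3x3) and P (3x4) arrays of sensor_msgs/CameraInfo, using plain Python lists so it runs without ROS. Whether Prism applies exactly these steps is not stated here, and `scale_intrinsics` is a hypothetical name.

```python
from typing import List, Sequence, Tuple


def scale_intrinsics(
    k: Sequence[float],
    p: Sequence[float],
    sx: float,
    sy: float,
) -> Tuple[List[float], List[float]]:
    """Scale row-major K (9 values) and P (12 values) for a resized image.

    sx = out_width / in_width, sy = out_height / in_height.
    The inputs are copied, not modified.
    """
    k2, p2 = list(k), list(p)
    # K = [fx 0 cx; 0 fy cy; 0 0 1]: x terms scale by sx, y terms by sy.
    k2[0] *= sx
    k2[2] *= sx
    k2[4] *= sy
    k2[5] *= sy
    # P = [fx' 0 cx' Tx; 0 fy' cy' Ty; 0 0 1 0]: the first row scales
    # by sx and the second row by sy, including Tx and Ty.
    p2[0] *= sx
    p2[2] *= sx
    p2[3] *= sx
    p2[5] *= sy
    p2[6] *= sy
    p2[7] *= sy
    return k2, p2


if __name__ == "__main__":
    K = [1000.0, 0.0, 1920.0, 0.0, 1000.0, 1080.0, 0.0, 0.0, 1.0]
    P = [1000.0, 0.0, 1920.0, 0.0, 0.0, 1000.0, 1080.0, 0.0, 0.0, 0.0, 1.0, 0.0]
    print(scale_intrinsics(K, P, 640 / 3840, 480 / 2160))
```

The message's width and height fields would change to the output size as well.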
06 — Benchmarks

A/B captures on two hosts

A/B captures against stock ROS 2 Humble image_proc on two hosts: 4K BGR8 input at 10 Hz, 120 s per operation, two component_container processes. Latency is per-frame from publisher stamp to subscriber receive; full methodology, per-percentile / CPU / RSS / fps data, and the per-host findings live in intel_desktop_simple_summary.md and orin_simple_summary.md.
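
The latency definition above reduces to a subtraction per frame followed by a summary statistic. A minimal sketch, assuming timestamps are collected as integer nanoseconds; the function names are hypothetical, and the nearest-rank p95 is an illustrative choice because the percentile method used by the bench scripts is not stated here.

```python
import math
import statistics
from typing import Dict, Iterable, List, Tuple


def latencies_ms(samples: Iterable[Tuple[int, int]]) -> List[float]:
    """Convert (publisher_stamp_ns, receive_time_ns) pairs to milliseconds."""
    return [(recv_ns - stamp_ns) / 1e6 for stamp_ns, recv_ns in samples]


def summarize(lat_ms: List[float]) -> Dict[str, float]:
    """Return the median and a nearest-rank 95th percentile."""
    if not lat_ms:
        raise ValueError("no latency samples")
    ordered = sorted(lat_ms)
    # Nearest-rank: the smallest value with at least 95% of samples at or below it.
    rank = math.ceil(0.95 * len(ordered))
    return {
        "median_ms": statistics.median(ordered),
        "p95_ms": ordered[rank - 1],
    }


if __name__ == "__main__":
    frames = [
        (0, 4_000_000),
        (100_000_000, 105_000_000),
        (200_000_000, 206_000_000),
    ]
    print(summarize(latencies_ms(frames)))
```

Both clocks must share a time base for the subtraction to be meaningful.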

Intel desktop, GStreamer 1.20, direct-mode fallback

The gains on this host come from intra-process composition and DDS round-trip elimination, not GPU offload: VA-API is in fallback mode on this GStreamer 1.20 host.

| Action | Prism median (ms) | Stock median (ms) | Δ % |
| --- | --- | --- | --- |
| resize | 4.55 | 10.77 | −57.8 % |
| crop | 4.27 | 22.65 | −81.1 % |
| colorconvert | 2.99 | 2323.65 | omitted |
| chain (3 ops) | 12.86 | 76.28 | −83.1 % |

colorconvert Δ% omitted — stock baseline is a Python NumPy node that cannot drain 4K BGR8 at 10 Hz (image_proc ships no C++ colorconvert). Full per-percentile data, methodology, and the throughput-ceiling explanation in intel_desktop_simple_summary.md.
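
The Δ % column is consistent with a relative change of the Prism median against the stock median. A short sketch that reproduces the published figures from the medians in the table; the function name is hypothetical.

```python
def delta_percent(prism_ms: float, stock_ms: float) -> float:
    """Relative change of the Prism median versus the stock median, in percent.

    Negative values mean Prism's median latency is lower.
    """
    if stock_ms <= 0:
        raise ValueError("stock median must be positive")
    return (prism_ms - stock_ms) / stock_ms * 100.0


if __name__ == "__main__":
    # Medians (ms) copied from the Intel desktop table.
    rows = {
        "resize": (4.55, 10.77),
        "crop": (4.27, 22.65),
        "chain (3 ops)": (12.86, 76.28),
    }
    for action, (prism, stock) in rows.items():
        print(f"{action}: {delta_percent(prism, stock):.1f} %")
```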

Jetson Orin Nano Super, JetPack 6.2

Per-action backend on legacy nvvidconv (resize/chain GPU, crop CPU videocrop, colorconvert GPU). The BGR-CAPS gap that drives this routing, the Round-3 colorconvert contention finding (bench-harness CPU saturation, not BGR-adapter dominance), and the empirical intra-process verification are documented in the linked summary.

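| Action | Prism median (ms) | Stock median (ms) | Δ % |
| --- | --- | --- | --- |
| resize | 21.77 | n/a | n/a |
| crop | 716.56 | 868.65 | −17.5 % |
| colorconvert | 1194.61 | 14295.86 | omitted |
| chain (3 ops) | 17.11 | n/a | n/a |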

Stock-side image_proc::ResizeNode does not publish frames inside this container (an image_proc packaging issue, not a Prism finding) — so resize and chain are Prism-only; colorconvert Δ% is omitted for the same structural reason as Intel. Full per-percentile data, the Round-3 contention finding, and a single-process direct-mode 9.27 ms colorconvert reference in orin_simple_summary.md.

Reproducing

shell
python3 bench/run.py --operation resize --video /path/to/4k.mp4 \
  --duration 120 --warmup 10 --output-dir bench/results/
python3 bench/analyze.py --results-dir bench/results/ \
  --output bench/results/summary.json
python3 bench/emit_simple_summary.py --summary bench/results/summary.json \
  --host-label "<host description>" --gst-version "$(gst-launch-1.0 --version | head -1)" \
  --out bench/results/<host>_simple_summary.md

Repeat with --operation {crop,colorconvert,chain}.
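
Repeating the capture for every operation can be scripted. The sketch below only builds and prints the command lines, using the flags shown above; the helper name `bench_command` is hypothetical, and the video path is the same placeholder as in the shell example.

```python
import shlex
from typing import List

# Operations named in the reproduction instructions.
OPERATIONS = ["resize", "crop", "colorconvert", "chain"]


def bench_command(
    operation: str,
    video: str,
    duration: int = 120,
    warmup: int = 10,
    output_dir: str = "bench/results/",
) -> List[str]:
    """Build the argv list for one bench/run.py capture."""
    return [
        "python3", "bench/run.py",
        "--operation", operation,
        "--video", video,
        "--duration", str(duration),
        "--warmup", str(warmup),
        "--output-dir", output_dir,
    ]


if __name__ == "__main__":
    # Print one shell-quoted command per operation; nothing is executed.
    for op in OPERATIONS:
        print(shlex.join(bench_command(op, "/path/to/4k.mp4")))
```

To run the captures instead of printing them, each list could be passed to `subprocess.run(..., check=True)` from the repository root.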

07 — Roadmap

Forward-looking work

Short-term to exploratory, in roughly that order.

  • Extend wrapper coverage to additional image_proc operations. Per-action prism::FlipNode first, then rectify and debayer.
  • GStreamer 1.22+ Intel VA-API capture. A clean A/B run on Ubuntu 24.04 where vapostproc validates and the GPU resize kernel does the work, rather than direct-mode cv::resize.
  • Jetson Orin capture against an image that ships nvvideoconvert. Closes the legacy nvvidconv BGR-CAPS gap by removing the CPU videoconvert ↔ BGRx adapter on each pipeline boundary.
  • AMD Ryzen A/B capture. Same harness, different host class, no VA-API path.
  • Rockchip RK3588 / Qualcomm QCS 6490 backend exploration. RubikPi 3 hardware on hand; the target is a Mali-G610 / RGA path validated against the local GStreamer registry the same way VA-API is.
  • ROS Buildfarm submission for binary distribution.
  • Extended test coverage — integration tests beyond the current gtest unit suite, alongside the Buildfarm submission noted above.