RayPE: Ray-Space Positional Encoding for 3D-Aware Video Generation

Abstract

Modern video diffusion transformers place tokens on a 2D pixel grid and encode their positions with, for example, RoPE over the u, v, and t axes. These encodings describe the camera's sampling grid rather than the 3D structure of the scene. RayPE injects per-token Plücker coordinates additively into the queries and keys of self-attention. A query/key flip makes the Euclidean inner product reduce to the Plücker reciprocal product even before anything is learned, providing a built-in 3D inductive bias. A Normalize-Gate-Inject design decouples ray direction from ray-moment magnitude. The module adds <0.1% parameters, is zero-initialized, and coexists with the original RoPE rather than replacing it.

RayPE enables precise relative camera control for pretrained video diffusion models. Given a target camera trajectory, it generates videos that faithfully follow the path while preserving the base model's generation quality. Top: image-to-video (left) and text-to-video (right). Bottom: out-of-distribution generalization to movie stills.

Key Ideas

A video frame samples a light field. The natural coordinate of a ray is its 6D Plücker representation, and the relative geometry of two rays is captured by the Plücker reciprocal product, which is bilinear in the two rays and therefore has the same algebraic form as the attention dot product.
Geometry inside the dot product. A query/key flip makes the Euclidean inner product reduce to the Plücker reciprocal product even before anything is learned, giving the attention a built-in 3D inductive bias instead of forcing the model to recover correspondence from pixels alone.
Robust across data. A Normalize-Gate-Inject design decouples ray direction from ray-moment magnitude, so the encoding stays well-scaled across datasets with very different camera-translation scales.
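
The flip identity described above can be checked numerically. The following is a minimal NumPy sketch, not the RayPE implementation: the helper names `plucker`, `flip`, and `reciprocal_product` are hypothetical, the moment convention m = o × d is an assumption, and the learned projections, RoPE, and the Normalize-Gate-Inject step are all omitted. For rays (d1, m1) and (d2, m2), the reciprocal product is d1·m2 + d2·m1, which is exactly the Euclidean dot product of (d1, m1) with the flipped vector (m2, d2).

```python
import numpy as np

def plucker(origin, direction):
    """6D Pluecker coordinates (d, m) of a ray; assumes the convention m = o x d."""
    d = np.asarray(direction, dtype=float)
    d = d / np.linalg.norm(d)          # unit direction
    m = np.cross(np.asarray(origin, dtype=float), d)
    return np.concatenate([d, m])

def flip(ray):
    """Swap the direction and moment halves: (d, m) -> (m, d)."""
    return np.concatenate([ray[3:], ray[:3]])

def reciprocal_product(r1, r2):
    """Pluecker reciprocal product d1.m2 + d2.m1 (zero for coplanar rays)."""
    d1, m1 = r1[:3], r1[3:]
    d2, m2 = r2[:3], r2[3:]
    return d1 @ m2 + d2 @ m1

# Two cameras looking at the same 3D point: their rays intersect.
point = np.array([0.5, -0.2, 4.0])
o1 = np.array([0.0, 0.0, 0.0])
o2 = np.array([1.0, 0.0, 0.0])
ray_a = plucker(o1, point - o1)
ray_b = plucker(o2, point - o2)

# A ray from the second camera that does not meet ray_a (skew rays).
ray_c = plucker(o2, np.array([0.0, 1.0, 1.0]))

# "Query" uses (d, m); "key" uses the flipped (m, d).
q = ray_a
k_hit, k_miss = flip(ray_b), flip(ray_c)

print("dot(q, flip(b)) :", np.dot(q, k_hit))
print("reciprocal(a, b):", reciprocal_product(ray_a, ray_b))
print("reciprocal(a, c):", reciprocal_product(ray_a, ray_c))
```

In this toy setup the plain dot product between the query and the flipped key matches the reciprocal product, which vanishes for the two rays that meet at the same 3D point and is nonzero for the skew pair.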

Same scene · many cameras

Scenes

One generated scene browsed under six different camera trajectories. The inset visualizes the commanded camera path.

Grand Tour, Spiral Climb, Pullback & Rise, Push Sweep, S-Curve Reveal, Wide Orbit

Crane Arc, Wide Orbit, Spiral Rise, S-Curve Reveal, Pullback & Rise, Flyover Left

Grand Tour, Push Sweep, Pullback & Rise, Crane Arc, Rise Small, Spiral Rise

Real Trajectory, Truck Right, Pullback & Rise, Push In

Crane Up, Dolly In, Dolly Out, Orbit Left, Real Trajectory, Truck Right

Crane Up, Dolly In, Dolly Out, Orbit Left, Real Trajectory, Truck Right

WASD · third-person control

Motions

WASD inputs drive the camera trajectory that moves the third-person subject. Overlay keys reflect the per-frame motion.

Ease Forward, Gentle Right, Reverse, City S-Curve, Right Curve, Lane Change

Big S-Curve, Hold Forward, Forward Sweep, Left Sweep, Strafe Left, Wide S-Curve

WASD Combo, Strafe Right, Strafe Left, Patrol & Back, Evade & Recover, Step Search

Evade & Recover, Dolly In, Patrol & Back, Step Search, WASD Combo, Truck Right

Free Walk, Step Search, Corner Cut, WASD Combo, Snake Forward, Truck Right

WASD Combo, Move Back, Strafe Right, Patrol & Back, WASD Combo, Snake Forward

More results

Real-World Scenes

Diverse real captures driven along varied camera paths from a single frame.

Dolly In, Dolly Out, Orbit Left, Real Trajectory

Crane Up, Dolly Out, Real Trajectory, Truck Right

Dolly In, Orbit Left, Pan Right, Real Trajectory

Crane Up, Real Trajectory, Dolly In

Dolly Out, Orbit Left, Real Trajectory

Dolly In, Real Trajectory, Snake Forward

Dolly In, Orbit Left, Real Trajectory, Truck Left

Orbit Left, Real Trajectory

More results

In-the-Wild Videos

Everyday footage re-rendered under controllable camera motion.

Orbit Left, Crane Up, Dolly In, Dolly Out

Real Trajectory, Crane Up, Dolly In

Real Trajectory, Crane Up, Real Trajectory, Snake Forward

Truck Left, Dolly In, Real Trajectory, Snake Forward

Orbit Left, Crane Up, Real Trajectory, Snake Forward

Orbit Left, Crane Up, Dolly Out, Truck Right

Real Trajectory, Dolly In, Dolly Out, Real Trajectory

Truck Left, Crane Up, Dolly In, Orbit Left

Orbit Left, Crane Up, Dolly In, Dolly Out

Orbit Left, Crane Up, Snake Forward, Truck Left

More results

Stylized & Artistic Scenes

Illustrated and painterly inputs animated with 3D-consistent camera control.

Pan Left, Orbit Right, Dolly Back, Zoom In

Dolly Back, Pan Left, Orbit Left, Orbit Right

Zoom In, Truck Left, Pan Right, Orbit Right

Dolly Forward, Dolly Back, Crane Up, Pan Left

Dolly Forward, Dolly Back, Truck Left, Orbit Right

Truck Right, Crane Up, Pan Left, Zoom In

Dolly Forward, Dolly Back, Truck Left, Pan Right

Zoom In, Truck Right, Dolly Back, Orbit Right

Truck Left, Crane Up, Pan Right, Orbit Left

Dolly Forward, Dolly Back, Truck Left, Truck Right