
Overdriving Visual Depth Perception via Sound Modulation in VR

Published in IEEE Transactions on Visualization and Computer Graphics (TVCG, Proc. IEEE VR), 2026

Presented at IEEE Virtual Reality (VR), March 2026

Daniel Jiménez-Navarro
Max Planck Institute for Informatics
Colin Groth
New York University
Xi Peng
University of North Carolina
Jorge Pina
University of Zaragoza
Qi Sun
New York University
Praneeth Chakravarthula
University of North Carolina
Karol Myszkowski
Max Planck Institute for Informatics
Hans-Peter Seidel
Max Planck Institute for Informatics
Ana Serrano
University of Zaragoza

Abstract

Our ability to perceive and navigate the spatial world is a cornerstone of human experience, relying on the integration of visual and auditory cues to form a coherent sense of depth and distance. In stereoscopic 3D vision, depth perception requires fixation of both eyes on a target object, which is achieved through vergence movements, with convergence for near objects and divergence for distant ones. In contrast, auditory cues provide complementary depth information through variations in loudness, interaural differences (IAD), and the frequency spectrum. We investigate the interaction between visual and auditory cues and examine how contradictory auditory information can overdrive visual depth perception in virtual reality (VR). When a new visual target appears, we introduce a spatial discrepancy between the visual and auditory cues: the visual target is shifted closer to the previously fixated object, while the corresponding sound localization is displaced in the opposite direction. As the brain integrates these conflicting cues through multimodal processing, the resulting percept is biased toward the intended depth location. This audiovisual fusion counteracts depth compression, thus reducing the required vergence magnitude and enabling faster gaze retargeting. Such audio-driven depth enhancement may further help mitigate the vergence–accommodation conflict (VAC) in scenarios where physical depth must be compressed. In a series of psychophysical studies, we first assess the efficiency of depth overdriving for various VR-relevant combinations of initial fixations and shifted target locations, considering different scenarios of audio displacements and their loudness and frequency parameters. Next, we quantify the resulting speedup in gaze retargeting for target shifts that can be successfully overdriven by sound manipulations. Finally, we apply our method in a naturalistic VR scenario where user interface interactions with the scene exhibit extended perceptual depth.
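To make the cue-placement idea concrete, the following is a minimal, illustrative Python sketch, not the authors' implementation. The interpupillary distance, the visual compression factor, and the audio displacement factor are assumed placeholder values, and the loudness cue uses the standard inverse-distance approximation for sound pressure (roughly -6 dB per doubling of distance) rather than the parameters measured in the paper.

import math

IPD_M = 0.063  # assumed interpupillary distance in meters (typical adult average)

def vergence_angle_deg(depth_m: float, ipd_m: float = IPD_M) -> float:
    """Vergence angle (degrees) needed to fixate a point at a given depth.
    Geometry: theta = 2 * atan(ipd / (2 * depth))."""
    return math.degrees(2.0 * math.atan2(ipd_m / 2.0, depth_m))

def audio_gain_db(rendered_depth_m: float, reference_depth_m: float = 1.0) -> float:
    """Loudness cue under the inverse-distance law for sound pressure:
    roughly -6 dB per doubling of distance relative to a reference depth."""
    return -20.0 * math.log10(rendered_depth_m / reference_depth_m)

def overdrive_depths(fixation_m, intended_m, visual_compression=0.5, audio_offset=0.5):
    """Hypothetical cue placement: the visual target is moved part of the way
    back toward the current fixation (compressed), while the sound source is
    displaced past the intended depth in the opposite direction, so that the
    fused audiovisual percept is biased toward the intended location."""
    visual_m = fixation_m + visual_compression * (intended_m - fixation_m)
    audio_m = intended_m + audio_offset * (intended_m - fixation_m)
    return visual_m, audio_m

if __name__ == "__main__":
    fixation, intended = 0.5, 2.0  # meters
    visual, audio = overdrive_depths(fixation, intended)
    print(f"visual target at {visual:.2f} m, sound source at {audio:.2f} m")
    print(f"vergence change (full shift): "
          f"{vergence_angle_deg(fixation) - vergence_angle_deg(intended):.2f} deg")
    print(f"vergence change (compressed): "
          f"{vergence_angle_deg(fixation) - vergence_angle_deg(visual):.2f} deg")
    print(f"audio gain at displaced depth: {audio_gain_db(audio):.1f} dB")

Running the example for a gaze shift from 0.5 m to 2.0 m prints the compressed visual depth, the displaced audio depth, and the reduction in the vergence change required by the compressed visual target.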

BibTeX Citation

@article{navarro2026overdriving,
  title = {Overdriving Visual Depth Perception via Sound Modulation in VR},
  author = {Daniel Jim{\'e}nez-Navarro and Colin Groth and Xi Peng and Jorge Pina and Qi Sun and Praneeth Chakravarthula and Karol Myszkowski and Hans-Peter Seidel and Ana Serrano},
  journal = {{IEEE} Transactions on Visualization and Computer Graphics ({TVCG}, Proc. {IEEE} {VR})},
  year = {2026}
}