Novel Depth Cues Integrated in 49% of Adults After 1-Hour Training

A 2026 iScience study found that adults could learn a new auditory cue to visual depth after about 1 hour, but full perceptual integration did not appear reliably at the group level: only 31 of 63 participants, or 49%, gained precision when the novel cue was combined with familiar visual disparity.¹ The key distinction is between quickly learning to use a cue and truly fusing it with existing perception.

Research Highlights

Novel cue fusion was uneven: after 1 hour of training, familiar-novel cue combination produced no group-level precision gain, with 31 of 63 adults (49%) showing a positive benefit and the group test at V = 920.5, p = 0.551.¹
Familiar cues behaved as expected: visual disparity and size cues were combined near optimally, with 52 of 70 participants (74%) showing a precision benefit and the group test at V = 1888, p < 0.001.¹
Weighting still worked quickly: when disparity became noisier, 56 of 60 adults (93%) re-weighted toward size and 50 of 55 (91%) re-weighted toward the novel auditory cue, both with p < 0.001.¹
Wrong mappings were detected: incongruent cue pairings reduced precision in 57 of 59 adults (97%) for familiar cues and 50 of 59 (85%) for familiar-novel cues, again with p < 0.001.¹
Individual differences were stable: repeatability estimates across sessions suggested that cue-combination variability reflected real between-person differences, with R = 0.239 for familiar-familiar and R = 0.213 for familiar-novel combination benefits.¹

Perceptual learning is the brain’s ability to improve how sensory information is used after experience. In this study, the target was not memory for a fact or a conscious rule. The question was whether a newly learned sound cue could become part of depth perception itself.

Cue combination means the brain takes 2 noisy estimates of the same feature and fuses them into a more precise judgment. For depth, familiar cues include binocular disparity, which uses the slightly different images reaching the 2 eyes, and size, which uses the fact that the same object projects a smaller image when farther away.
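The standard benchmark for this kind of fusion is reliability-weighted averaging, in which each cue is weighted by its inverse variance. The sketch below is illustrative only: the function names and the noise values are assumptions chosen for the example, not figures or code from the study.

```python
import math

def optimal_weights(sigma_a, sigma_b):
    """Inverse-variance weights for two cues; the weights sum to 1."""
    rel_a = 1.0 / sigma_a ** 2
    rel_b = 1.0 / sigma_b ** 2
    total = rel_a + rel_b
    return rel_a / total, rel_b / total

def optimal_combined_sigma(sigma_a, sigma_b):
    """Noise of the fused estimate predicted by ideal reliability-weighted averaging."""
    var_a, var_b = sigma_a ** 2, sigma_b ** 2
    return math.sqrt(var_a * var_b / (var_a + var_b))

# Hypothetical single-cue noise levels in arbitrary depth units.
sigma_disparity = 2.0
sigma_size = 3.0

w_disparity, w_size = optimal_weights(sigma_disparity, sigma_size)
sigma_both = optimal_combined_sigma(sigma_disparity, sigma_size)

print(f"weight on disparity: {w_disparity:.3f}, weight on size: {w_size:.3f}")
print(f"predicted combined noise: {sigma_both:.3f}")
```

Under these assumed values the predicted combined noise falls below the noise of the better single cue, which is the precision gain that the study's combination test looks for.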

1 Hour of Pitch-Depth Training Was Not Enough for Group-Level Fusion

Scheller et al. recruited 105 adults, screened out 27 who failed stereo-acuity preassessment, and tested 78 observers who completed the experiment.¹ The core task asked participants to judge whether a square made of random dots was closer or farther away on its second presentation.

Depth could be signaled by 3 cue types:

Disparity: a familiar visual depth cue based on differences between the 2 eyes.
Size: a familiar visual depth cue based on how large the object appeared.
Audio pitch: a newly trained cue in which pitch increased or decreased with simulated depth, with mapping direction counterbalanced across participants.

Participants learned the pitch-depth mapping during a training session, then completed forced-choice depth judgments across single-cue, paired-cue, noisy-cue, and incongruent-cue conditions. The design separated 3 markers that are often blurred together in casual discussion of “learning a new sense.”

Precision gain: if 2 cues are truly combined, judgments should become less noisy than the best single cue.
Reliability re-weighting: if one cue becomes noisier, the brain should put less weight on it.
Incongruence sensitivity: if the learned mapping is reversed, performance should worsen.

Familiar visual cues passed the strongest test. Disparity and size together reduced sensory noise relative to the best single cue, with V = 1888 and p < 0.001. Their combined performance did not significantly differ from optimal predictions, V = 1052, p = 0.266.

Novel auditory depth did not pass that same group-level test. When pitch was paired with disparity, sensory noise was not significantly lower than the best single cue, V = 920.5, p = 0.551. The combined condition also differed from optimal predictions, V = 1456, p = 0.002, so the missing precision gain is unlikely to reflect low statistical power alone.

49% of Adults Gained Precision from the Novel Cue

The group result hides the more useful human result: people differed sharply. With familiar visual cues, 52 of 70 participants (74%) gained precision from having both cues available. With the novel auditory cue, 31 of 63 participants (49%) gained precision, while 32 of 63 (51%) did not.
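The percentages quoted in this article follow directly from the reported counts. A short check, using only the counts stated in the text (the dictionary labels are descriptive names chosen for this example):

```python
# (participants showing the effect, participants analysed), as reported in the article.
reported = {
    "familiar-familiar precision gain": (52, 70),
    "familiar-novel precision gain": (31, 63),
    "re-weighting toward size": (56, 60),
    "re-weighting toward audio": (50, 55),
    "familiar-familiar incongruence cost": (57, 59),
    "familiar-novel incongruence cost": (50, 59),
}

def percent(hits, total):
    """Share of participants showing the effect, rounded to a whole percent."""
    return round(100 * hits / total)

percentages = {label: percent(hits, total) for label, (hits, total) in reported.items()}

for label, value in percentages.items():
    print(f"{label}: {value}%")
```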

Scheller et al. checked whether the novel-cue result was merely caused by poorly matched cue noise. In a subset of 17 observers whose cue-noise ratio was below 1.5, familiar-novel cue pairs still did not show a group-level combination benefit, V = 66, p = 0.644. Even in a technically cleaner subgroup, the novel cue did not behave like a familiar cue for everyone.

The cue-pair comparison made that split explicit. Combination indices were larger for familiar-familiar than familiar-novel cues, V = 1354, p < 0.001. Re-weighting and incongruence sensitivity did not differ meaningfully between cue pairs. The bottleneck therefore sits at precision-benefit fusion, a step that comes after the new signal has already been noticed and used.

Novel Cue Use Was Faster Than Novel Cue Fusion

Reliability re-weighting worked quickly. When disparity became less reliable, participants shifted weight away from disparity both when it was paired with size and when it was paired with the novel audio cue. The familiar-familiar analysis had V = 25, p < 0.001, and the familiar-novel analysis had V = 37, p < 0.001.

Individual behavior followed the same pattern. In the familiar-familiar condition, 56 of 60 adults (93%) increased reliance on size when disparity became noisier. In the familiar-novel condition, 50 of 55 (91%) increased reliance on the auditory cue when disparity became noisier.

That result matters because it rules out a crude “they did not learn the sound cue” explanation. The new pitch-depth cue was usable. It could be weighted by reliability. What failed for many people was the stronger operation: fusing the new cue with visual depth strongly enough to improve precision beyond the best single cue.
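What re-weighting looks like for an ideal observer can be sketched with inverse-variance weights. The noise values and the helper name below are hypothetical and serve only to show the direction of the shift; they are not estimates from the study.

```python
def weight_on_second_cue(sigma_first, sigma_second):
    """Inverse-variance weight given to the second of two cues."""
    rel_first = 1.0 / sigma_first ** 2
    rel_second = 1.0 / sigma_second ** 2
    return rel_second / (rel_first + rel_second)

# Hypothetical noise levels in arbitrary depth units.
sigma_audio = 3.0
sigma_disparity_clean = 2.0
sigma_disparity_noisy = 4.0  # disparity made less reliable

w_audio_clean = weight_on_second_cue(sigma_disparity_clean, sigma_audio)
w_audio_noisy = weight_on_second_cue(sigma_disparity_noisy, sigma_audio)

print(f"weight on audio, clean disparity: {w_audio_clean:.2f}")
print(f"weight on audio, noisy disparity: {w_audio_noisy:.2f}")
```

In this toy case the audio cue goes from the minority weight to the majority weight once disparity is degraded, which is the direction of change most participants showed.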

Possible decision strategy: participants may have switched between cues instead of averaging them. Cue switching can mimic re-weighting when one cue becomes noisier, but it cannot explain a true precision gain from combining 2 estimates. That is why the precision test is the harder integration test.
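A small simulation makes the difference between the two strategies concrete. It is a sketch under stated assumptions: Gaussian cue noise, hypothetical noise levels, and a switching observer who picks each cue with probability equal to its inverse-variance weight. None of these details come from the study.

```python
import random
import statistics

# Hypothetical single-cue noise levels (arbitrary depth units); not values from the study.
SIGMA_A = 2.0
SIGMA_B = 3.0
N_TRIALS = 20000

rng = random.Random(1)  # fixed seed so the run is reproducible

# Inverse-variance weight on cue A.
w_a = (1 / SIGMA_A ** 2) / (1 / SIGMA_A ** 2 + 1 / SIGMA_B ** 2)

averaged = []  # observer who fuses both cues on every trial
switched = []  # observer who relies on one cue per trial
for _ in range(N_TRIALS):
    a = rng.gauss(0.0, SIGMA_A)  # cue A estimate of a true depth of 0
    b = rng.gauss(0.0, SIGMA_B)  # cue B estimate of the same depth
    averaged.append(w_a * a + (1 - w_a) * b)
    switched.append(a if rng.random() < w_a else b)

sd_averaged = statistics.pstdev(averaged)
sd_switched = statistics.pstdev(switched)
best_single = min(SIGMA_A, SIGMA_B)

print(f"best single cue noise: {best_single:.2f}")
print(f"averaging observer noise: {sd_averaged:.2f}")
print(f"switching observer noise: {sd_switched:.2f}")
```

The averaging observer ends up less noisy than the best single cue, while the switching observer ends up noisier than it, even though both favor the more reliable cue.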

Wrong Cue Mappings Disrupted Both Familiar and Novel Pairs

Incongruence sensitivity was also strong. If the learned mapping between cues was reversed, performance became noisier for both cue types. Familiar visual cue incongruence reduced precision in 57 of 59 participants (97%), while familiar-novel incongruence reduced precision in 50 of 59 (85%). Both group tests were significant at p < 0.001.

This is an important calibration point for sensory substitution and sensory augmentation. Sensory substitution uses one sense to carry information usually carried by another sense, such as sound or touch representing spatial layout. Sensory augmentation adds a new signal to expand what a person can perceive or act on.

For either technology, the first win is not necessarily full perceptual fusion. A user may learn that a signal is meaningful, weight it when it becomes useful, and detect when the mapping is wrong, while still not integrating it into ordinary perception with the same precision benefit as a natural cue.

Stable Individual Differences Change the Training Question

Repeated testing helped distinguish random measurement noise from stable individual differences. Sensory-noise measures were significantly repeatable across sessions, including disparity (R = 0.516), size (R = 0.412), audio (R = 0.846), disparity-size (R = 0.383), and disparity-audio (R = 0.468), all p < 0.001.

Combination benefits also showed some repeatability, though the researchers warned that mixed-model convergence issues mean these estimates should be interpreted with caution. Familiar-familiar combination had R = 0.239, likelihood-ratio p < 0.001, and permutation p = 0.029. Familiar-novel combination had R = 0.213, likelihood-ratio p < 0.001, and permutation p = 0.056, the latter just above the conventional 0.05 threshold.
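Repeatability is commonly understood as the share of total variance that lies between people rather than within a person across sessions. The sketch below uses a simple one-way ANOVA intraclass correlation as a stand-in; it is not the mixed-model procedure described for the study, and the two data sets are invented for illustration.

```python
from statistics import mean

def repeatability(scores):
    """One-way ANOVA intraclass correlation.

    `scores` holds one tuple per person, with the same number of sessions each.
    """
    n = len(scores)     # number of people
    k = len(scores[0])  # sessions per person
    grand = mean([value for person in scores for value in person])
    person_means = [mean(person) for person in scores]
    ms_between = k * sum((m - grand) ** 2 for m in person_means) / (n - 1)
    ms_within = sum(
        (value - m) ** 2
        for person, m in zip(scores, person_means)
        for value in person
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

# Invented two-session scores: people keep their rank order in the first set
# and are reshuffled between sessions in the second.
stable = [(1.0, 1.1), (2.0, 2.1), (3.0, 2.9), (4.0, 4.2)]
unstable = [(1.0, 4.0), (2.0, 1.0), (3.0, 2.0), (4.0, 3.0)]

r_stable = repeatability(stable)
r_unstable = repeatability(unstable)

print(f"stable example:   R = {r_stable:.3f}")
print(f"unstable example: R = {r_unstable:.3f}")
```

Values near 1 mean that a person's score in one session predicts their score in the next; values near 0 mean session-to-session noise dominates. On that scale, the reported combination-benefit estimates of roughly 0.2 indicate modest stability.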

Evidence-strength note: this was a healthy-young-adult psychophysics study, not a rehabilitation trial. It supports a narrow claim about short-term learning of an artificial auditory cue to visual depth. It does not prove that sensory-substitution devices will or will not work in blindness, vestibular loss, neuropathy, or long-term assistive-device training.

Still, the individual-differences finding is more than a footnote. If some adults consistently combine a new cue after brief training and others do not, future studies need designs powered for person-level inference as well as group averages.

Earlier Novel-Cue Studies Looked More Optimistic

Classic cue-combination work showed why the benchmark is demanding. Ernst and Banks found that adults integrate visual and haptic information in a statistically optimal fashion when judging object properties, weighting each cue by reliability.² Ernst later showed that arbitrary visual-touch signal relationships can be learned, suggesting adult perception is not fixed to natural cue pairings only.³

Negen et al. tested a closer sensory-substitution-like case: a new auditory cue trained to support distance perception. Their participants showed Bayes-like integration of the new sensory skill with vision after training.⁴ Aston et al. also found that newly learned cues to location could combine with familiar cues, although not always with each other.⁵

Scheller et al. do not erase those results. They make the claim more conditional. Brief training can teach a new cue, and some adults can integrate it. But when a larger sample is tested separately on cue combination, re-weighting, and incongruence sensitivity, “learned” no longer automatically means “fused into perception.”

Nardini et al. argued that human sensory augmentation should be evaluated across perception, brain representation, and subjective experience rather than treated as a single performance score.⁶ The 2026 depth-cue data fit that framework: task use, reliability weighting, and precision-benefit fusion are separate milestones.

Questions About Perceptual Learning and Novel Depth Cues

Did adults learn the new auditory cue?

Yes. Participants used the pitch-depth mapping in reliability re-weighting and were sensitive when the mapping became incongruent. What fell short was full precision-benefit cue combination, not basic learning.

Why is 49% not a simple failure?

Because 31 of 63 adults did gain precision when the novel auditory cue was paired with visual disparity. The result points to individual variability, not a universal inability to integrate new sensory cues.

Does this apply to sensory-substitution devices?

Only indirectly. The study used healthy adults, a controlled pitch-depth cue, and short training. Real devices often involve longer practice, different senses, motivation, disability context, and daily use.

What should the next study test?

Longer training and person-level predictors. Attention, motivation, prior experience with cue coupling, and the sensory modality used for the new cue may determine who fuses a new signal rather than merely using it strategically.

References

1. Scheller M, Aston S, Chazelle T, Allen C, Slater H, Nardini M. Learning new perceptual skills: Individual differences in the computations that integrate novel sensory cues into depth perception. iScience. 2026;29:115526. doi:10.1016/j.isci.2026.115526
2. Ernst MO, Banks MS. Humans integrate visual and haptic information in a statistically optimal fashion. Nature. 2002;415:429-433. doi:10.1038/415429a
3. Ernst MO. Learning to integrate arbitrary signals from vision and touch. Journal of Vision. 2007;7:1-14. doi:10.1167/7.5.7
4. Negen J, Wen L, Thaler L, Nardini M. Bayes-like integration of a new sensory skill with vision. Scientific Reports. 2018;8:16880. doi:10.1038/s41598-018-35046-7
5. Aston S, Beierholm U, Nardini M. Newly learned novel cues to location are combined with familiar cues but not always with each other. Journal of Experimental Psychology: Human Perception and Performance. 2022;48:639-652. doi:10.1037/xhp0001014
6. Nardini M, Scheller M, Ramsay M, Kristiansen O, Allen C. Towards human sensory augmentation: A cognitive neuroscience framework for evaluating integration of new signals within perception, brain representations, and subjective experience. Augmented Human Research. 2025;10:1. doi:10.1007/s41133-024-00075-7