Depth Cues in the Human Visual System
Author: Marko Teittinen
The human visual system interprets depth in sensed images using both physiological and psychological cues. Some physiological cues require both eyes to be open (binocular); others are also available when looking at images with only one eye open (monocular). All psychological cues are monocular. In the real world, the human visual system automatically uses all available depth cues to determine distances between objects. To make all of these depth cues available in a VR system, some kind of stereo display is required, because the binocular depth cues depend on it. Monocular depth cues can also be used without a stereo display.
The physiological depth cues are accommodation, convergence, binocular parallax, and monocular movement parallax. Convergence and binocular parallax are the only binocular depth cues; all others are monocular. The psychological depth cues are retinal image size, linear perspective, texture gradient, overlapping, aerial perspective, and shades and shadows.
Accommodation
Accommodation is the tension of the muscle that changes the focal length of the lens of the eye, bringing objects at different distances into focus. This depth cue is quite weak. It is effective only at short viewing distances (less than 2 meters) and in combination with other cues.
Convergence
When watching an object close to us, our eyes point slightly inward. This difference in the direction of the eyes is called convergence. This depth cue is effective only at short distances (less than 10 meters).
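The geometry behind this cue can be sketched in a few lines of Python. This is a simplified model, not something given in the source: it assumes both eyes fixate a point straight ahead, and the function name `convergence_angle_deg` and the 65 mm eye separation are illustrative assumptions.

```python
import math

def convergence_angle_deg(distance_m, eye_separation_m=0.065):
    """Angle between the two lines of sight when both eyes fixate a point
    straight ahead at distance_m (symmetric geometry, assumed for simplicity).

    eye_separation_m defaults to 65 mm, an assumed typical value.
    """
    if distance_m <= 0:
        raise ValueError("distance_m must be positive")
    # Each eye rotates inward by atan((separation / 2) / distance).
    half_angle = math.atan((eye_separation_m / 2) / distance_m)
    return math.degrees(2 * half_angle)

if __name__ == "__main__":
    for d in (0.25, 1.0, 10.0, 100.0):
        print(f"{d:7.2f} m -> {convergence_angle_deg(d):.3f} degrees")
```

Under these assumptions the angle drops below half a degree at 10 meters, which is consistent with the cue carrying little information beyond that range.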
Binocular Parallax
As our eyes see the world from slightly different locations, the images sensed by the eyes are slightly different. This difference in the sensed images is called binocular parallax. The human visual system is very sensitive to these differences, and binocular parallax is the most important depth cue for medium viewing distances. The sense of depth can be achieved using binocular parallax even if all other depth cues are removed.
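To make the difference between the two sensed images concrete, the sketch below projects one point into two idealized pinhole eyes and subtracts the image positions. The pinhole model, the function name `image_disparity`, the 65 mm eye separation, and the 17 mm focal length are all assumptions for illustration; the source does not specify a model.

```python
def image_disparity(depth_m, lateral_offset_m=0.0,
                    eye_separation_m=0.065, focal_length_m=0.017):
    """Horizontal difference between the left-eye and right-eye image
    positions of a point, using an assumed pinhole model for each eye.

    The eyes sit at x = -separation/2 and x = +separation/2 and look
    straight ahead; depth_m is the distance of the point in front of them.
    """
    if depth_m <= 0:
        raise ValueError("depth_m must be positive")
    half = eye_separation_m / 2
    # Position of the point relative to each eye, projected onto its image plane.
    x_left = focal_length_m * (lateral_offset_m + half) / depth_m
    x_right = focal_length_m * (lateral_offset_m - half) / depth_m
    return x_left - x_right  # equals focal_length * separation / depth

if __name__ == "__main__":
    for depth in (0.5, 1.0, 2.0, 10.0):
        print(f"{depth:5.1f} m -> {image_disparity(depth) * 1000:.4f} mm")
```

In this model the disparity is inversely proportional to depth and does not depend on the lateral position of the point, so halving the distance doubles the difference between the two images.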
Monocular Movement Parallax
If we close one of our eyes, we can perceive depth by moving our head. This happens because the human visual system can extract depth information from two similar images sensed one after the other, in the same way it can combine two images from different eyes.
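A minimal sketch of why head movement carries depth information: after a sideways head shift, the direction toward a nearby point changes more than the direction toward a distant one. The function name `parallax_shift_deg` and the 10 cm head shift in the usage lines are hypothetical choices, not values from the source.

```python
import math

def parallax_shift_deg(head_shift_m, distance_m):
    """Change in viewing direction toward a stationary point that was
    straight ahead, after the head moves sideways by head_shift_m."""
    if distance_m <= 0:
        raise ValueError("distance_m must be positive")
    return math.degrees(math.atan2(head_shift_m, distance_m))

if __name__ == "__main__":
    shift = 0.10  # assumed 10 cm sideways head movement
    for d in (1.0, 10.0, 100.0):
        print(f"{d:6.1f} m -> {parallax_shift_deg(shift, d):.3f} degrees")
```

Comparing the shifts of two points between the two successive views gives their relative depth, much as comparing the two simultaneous views does in binocular parallax.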
Retinal Image Size
When the real size of an object is known, our brain compares the sensed size of the object to this real size and thus acquires information about the distance of the object.
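The comparison described above corresponds to a simple geometric relation between real size, distance, and the visual angle an object subtends. The sketch below is illustrative only; the function names and the 1.8 m example height are assumptions, and the object is assumed to be viewed face-on and centered on the line of sight.

```python
import math

def angular_size_deg(real_size_m, distance_m):
    """Visual angle subtended by an object of known size at a given distance."""
    if real_size_m <= 0 or distance_m <= 0:
        raise ValueError("size and distance must be positive")
    return math.degrees(2 * math.atan(real_size_m / (2 * distance_m)))

def distance_from_size(real_size_m, sensed_angle_deg):
    """Distance implied by a known real size and a sensed visual angle."""
    if real_size_m <= 0 or not 0 < sensed_angle_deg < 180:
        raise ValueError("size must be positive and angle within (0, 180)")
    half_angle = math.radians(sensed_angle_deg) / 2
    return real_size_m / (2 * math.tan(half_angle))

if __name__ == "__main__":
    angle = angular_size_deg(1.8, 10.0)  # assumed 1.8 m tall person, 10 m away
    print(f"sensed angle: {angle:.3f} degrees")
    print(f"recovered distance: {distance_from_size(1.8, angle):.3f} m")
```

The second function only works when the real size is known, which mirrors the condition stated in the text.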
Linear Perspective
When looking down a straight, level road, we see the parallel sides of the road meet at the horizon. This effect is often visible in photos, and it is an important depth cue. It is called linear perspective.
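The road example can be reproduced with a pinhole perspective projection, in which image coordinates are divided by distance. This is a sketch under assumed values: the function names, the 6 m road width, the 1.6 m eye height, and the unit focal length are all hypothetical.

```python
def project(x, y, z, focal_length=1.0):
    """Pinhole projection of a point in eye coordinates onto the image plane.

    x is to the right, y is up, and z is the distance straight ahead.
    """
    if z <= 0:
        raise ValueError("z must be positive (point must be in front of the eye)")
    return (focal_length * x / z, focal_length * y / z)

def road_edges(z, road_width=6.0, eye_height=1.6):
    """Image positions of the left and right road edges at distance z."""
    left = project(-road_width / 2, -eye_height, z)
    right = project(road_width / 2, -eye_height, z)
    return left, right

if __name__ == "__main__":
    for z in (5.0, 50.0, 500.0):
        left, right = road_edges(z)
        print(f"z = {z:6.1f} m  left = {left}  right = {right}")
```

As z grows, both edges approach the image point (0, 0), which lies on the horizon line for a level road; the parallel edges therefore appear to meet there.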
Texture Gradient
The closer we are to an object, the more detail we can see in its surface texture. Objects with smooth textures are therefore usually interpreted as being farther away. This is especially true if the surface texture spans the whole distance from near to far.
Overlapping
When one object partly hides another from view, we know that the object doing the blocking is closer to us. The object whose outline looks more continuous is perceived as lying closer.
Aerial Perspective
Mountains on the horizon always look slightly bluish or hazy. The reason for this is the small water and dust particles in the air between the eye and the mountains. The farther away the mountains are, the hazier they look.
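In computer graphics this cue is often approximated with distance fog. The sketch below uses an exponential falloff, which is one common approximation and not a method described in the source; the function name `apply_haze`, the bluish haze color, and the density value are assumptions chosen for illustration.

```python
import math

def apply_haze(color, distance_m, haze_color=(0.7, 0.8, 1.0),
               density_per_m=0.0001):
    """Blend an RGB color (components in 0..1) toward a haze color.

    The fraction of the original color that survives decays exponentially
    with distance; haze_color and density_per_m are assumed values.
    """
    if distance_m < 0:
        raise ValueError("distance_m must be non-negative")
    visibility = math.exp(-density_per_m * distance_m)
    return tuple(visibility * c + (1.0 - visibility) * h
                 for c, h in zip(color, haze_color))

if __name__ == "__main__":
    forest_green = (0.2, 0.4, 0.1)
    for d in (0.0, 5_000.0, 20_000.0, 100_000.0):
        r, g, b = apply_haze(forest_green, d)
        print(f"{d:9.0f} m -> ({r:.3f}, {g:.3f}, {b:.3f})")
```

With these settings a nearby surface keeps its own color, while a very distant one is rendered almost entirely in the haze color.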
Shades and Shadows
When we know the location of a light source and see objects casting shadows on other objects, we learn that the object shadowing the other is closer to the light source. As most illumination comes from above, we tend to resolve ambiguities using this information. Three-dimensional-looking computer user interfaces are a nice example of this. Also, bright objects seem to be closer to the observer than dark ones.
Okoshi, T., Three-Dimensional Imaging Techniques, Academic Press, New York, 1976.
Human Interface Technology Laboratory