In surveying previous chapters of Principles of Neural Science, we saw how sophisticated retinal circuitry compresses light information into signals representing contrast and movement. These signals are transmitted through the fiber optic cable we know as the optic nerve, ultimately reaching the visual cortex, which uses that information along with other sources to analyze objects of regard. This process of using line segments to represent specific objects is known as contour integration, and is the subject of Chapter 24, titled “Intermediate-Level Visual Processing and Visual Primitives”.
Intermediate-level visual processing involves assembling local elements of an image into a unified percept of objects and background. Each relay in the visual circuitry of the brain has built-in logic that allows assumptions to be made about the likely spatial relationships between elements. In certain cases, these inherent rules can lead to the illusion of contours and surfaces that do not actually exist in the visual field. Illusory contours are purposeful properties of the visual system that allow us to make predictions based on past experiences and contexts. Yet they do expose us to potential ambiguities.
The left image below is the Kanizsa triangle illusion, in which one perceives continuous boundaries extending between the apices of a white triangle, even though the only real contour elements are formed by the Pac-Man-like figures and the acute angles. The image on the right shows that the inside and outside of the illusory pink square are the same white color as the screen background, yet a continuous transparent pink surface is perceived within the square.
Three interacting features of visual processing help overcome ambiguity in the signals from the retina and are vital to the visual analysis of complex scenes:
- The way in which a visual feature is perceived depends on everything that surrounds it. The response of a neuron in the visual cortex is context-dependent: it depends as much on the presence of contours and surfaces outside the cell’s receptive field as on the attributes within it.
- The functional properties of neurons in the visual cortex can be altered by visual experience or perceptual learning.
- Visual processing in the cortex is subject to the influence of cognitive functions, specifically attention, expectation, and perceptual task – which is the active engagement in visual discrimination or detection.
The balance of this chapter focuses on how depth perception helps segregate objects from background. An important cue for the perception of depth is the difference between the two eyes’ views of the world, which must be computed and reconciled by the brain. The integration of binocular input begins in the primary visual cortex (V1), the first level at which individual neurons receive signals from both eyes. The relative weighting of input from the two eyes varies among cells in V1.
Binocular neurons in many visual cortical areas other than V1 are also selective for depth. Depth is computed from binocular disparity: the difference in the relative retinal positions of objects that lie at different distances and angles from the observer. Individual neurons can be selective for a narrow range of disparities. Some are selective for objects within the plane of fixation, whereas others respond only when objects lie in front of the plane (near cells) or behind the plane (far cells).
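For readers who like to see the geometry, here is a minimal sketch of how binocular disparity relates to distance. It is not taken from the chapter. It assumes an object on the midline, an interocular distance of 63 mm, and a sign convention in which positive disparity means nearer than fixation. The function names and the tolerance that defines "in the plane of fixation" are hypothetical, chosen only for illustration.

```python
import math

def vergence_angle(distance_m, interocular_m=0.063):
    """Angle (radians) subtended at a midline point by the two eyes.
    The 63 mm interocular distance is an assumed typical value."""
    return 2.0 * math.atan(interocular_m / (2.0 * distance_m))

def binocular_disparity(object_m, fixation_m, interocular_m=0.063):
    """Disparity (radians) of an object relative to the fixation point.
    Assumed sign convention: positive = nearer than fixation (crossed),
    negative = farther than fixation (uncrossed)."""
    return (vergence_angle(object_m, interocular_m)
            - vergence_angle(fixation_m, interocular_m))

def depth_category(object_m, fixation_m, tolerance_rad=1e-4):
    """Label an object the way near cells and far cells divide the scene.
    tolerance_rad is a hypothetical cutoff, not a physiological value."""
    disparity = binocular_disparity(object_m, fixation_m)
    if abs(disparity) <= tolerance_rad:
        return "fixation plane"
    return "near" if disparity > 0 else "far"

# Fixating at 50 cm: an object at 40 cm is "near", one at 1 m is "far".
for distance in (0.4, 0.5, 1.0):
    print(distance, depth_category(distance, fixation_m=0.5))
```

The point of the sketch is only that disparity depends on where the eyes are fixating: the same object can fall in near-cell or far-cell territory as fixation shifts.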
Depth plays an important role in the perception of the 3-D properties of a scene. The same type of perceptual filling-in that produces the illusions described above now becomes purposeful in the real world. A surface passing behind an object is perceived as continuous even though its 2-D image on each retina represents two surfaces separated by the occluding object. When the brain encounters a surface interrupted by gaps that have appropriate alignment and contrast, and that lie in the near depth plane, it fills in the gaps to create a continuous surface. Here the perceptual filling-in can be conceived as 3-D visual closure.
Whereas the depth of a single object can be established relatively easily, determining the depth of multiple objects within a scene is a much more complex task. The disparity calculation now becomes global rather than local. The calculation in one part of the visual scene influences and disambiguates information in other parts of the visual scene. This process is known as disparity capture. It is believed that the type of foreground emergence from background underlying local stereopsis occurs at the level of V1, as distinct from figural emergence in global stereopsis, which occurs at the level of V2.
I believe this is why it is incorrect to insist on random dot stereopsis (RDS) as the sine qua non of depth perception. True, RDS requires simultaneous bifoveal comparisons between the two eyes, but it is not necessarily the most useful form of depth perception. Context is the key. RDS is marvelous for figure-ground identification in complex scenes, but this type of global stereopsis is not as relevant when dealing with local primitives. Give me global RDS when I need to parallel park, but good ol’ high-level Wirt Circle stereoacuity when hammering a nail. Give me RDS to confirm the absence of strabismus, but let me pinch the wings of the fly to tell you about the utility of binocular summation for most activities of daily living (ADLs).
Speaking of ADLs, intermediate-level vision is an aspect of acquired brain injury (ABI) that tends to be glossed over. In our chapter on Spatial Vision in Suter & Harvey’s book on Vision Rehabilitation, Bob Sanet and I refer to the fragmentation or dis-integration of vision that occurs, and our role in guiding the patient toward re-assembly or re-integration. After reading Chapter 24 in Principles of Neural Science, I would propose that what occurs to various degrees in ABI is a loss of disambiguation. The regression to a more primitive state of ambiguity understandably stymies the individual through uncertainty that restrains or inhibits self-guided action. Part of what we do through lenses, prisms or active therapy is set conditions for exploration that restore these crucial intermediate levels of visual disambiguation, thereby aiding awareness and confidence in visually guided action. Motion parallax and stereoscopic judgment are merely two overt examples in our toolbox of cues used at intermediate levels of visual processing.