Dr. Candy is affiliated with the School of Optometry, Programs in Vision Science, Neuroscience and Cognitive Science, at Indiana University where she is a Professor and Executive Associate Dean for Academic Affairs. Dr. Cormack is with the Department of Psychology, Institute for Neuroscience, and Center for Perceptual Systems, of The University of Texas at Austin. All you’ll be able to access online for free is the abstract, but I hope to give you enough of a flavor of the article that you’ll be convinced it’s worth getting your hands on the full paper.
Here is the Abstract:
“Technological advances in recent decades have allowed us to measure both the information available to the visual system in the natural environment and the rich array of behaviors that the visual system supports. This review highlights the tasks undertaken by the binocular visual system in particular and how, for much of human activity, these tasks differ from those considered when an observer fixates a static target on the midline. The everyday motor and perceptual challenges involved in generating a stable, useful binocular percept of the environment are discussed, together with how these challenges are but minimally addressed by much of current clinical interpretation of binocular function. The implications for new technology, such as virtual reality, are also highlighted in terms of clinical and basic research application.”
The motivation for Drs. Candy and Cormack in writing this article is likely the increasing attention paid to how artificial intelligence systems fail miserably at duplicating or even simulating the human visual system. Of course we know that binocular vision occurs in the brain, and it is in how the brain constructs and utilizes its binocular percepts that the magic occurs. Here is Figure 1 from the paper, which is a classical illustration of interocular retinal disparity:
The legend to the figure reads as follows: Panel A presents a series of colored objects. The eyes are fixating the dark blue one on the mid-sagittal plane and the theoretical horopter for that fixation distance is illustrated by the dashed grey circle. The images of this object sit on the fovea in each eye, with the images of the other objects falling at eccentric locations. The grey object provides a simple example of uncrossed disparity from the horopter. If the eyes were to now align at that object, its absolute uncrossed disparity would reduce to zero, while its relative disparity to the blue object would remain constant irrespective of the fixation and alignment distance of the eyes. The green and red objects illustrate the more typical natural situation of objects located away from the mid-sagittal plane, in uncrossed or crossed disparity, respectively, from the horopter. Panel B illustrates that changing fixation from the green object to the red object involves an eye movement with both a lateral and depth component, and that the required angular rotations of the eyes are not equal and symmetric (purple arrows).
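The geometry in panel A is easy to sketch numerically. In this hypothetical illustration (the 64 mm interocular distance and object distances are assumed values, not from the article), the absolute disparity of a midline object is the difference between the vergence angles subtended by the object and by the fixation point; crossed disparity is conventionally taken as positive, and the relative disparity between two objects falls out as independent of where the eyes are fixating:

```python
import math

IPD = 0.064  # assumed interocular distance, in meters

def vergence_angle(distance):
    """Angle (radians) between the two lines of sight to a midline point."""
    return 2 * math.atan((IPD / 2) / distance)

def absolute_disparity(fixation_dist, object_dist):
    """Positive = crossed (object nearer than fixation);
    negative = uncrossed (object beyond the fixation point)."""
    return vergence_angle(object_dist) - vergence_angle(fixation_dist)

# Fixating at 1 m: an object at 2 m is in uncrossed disparity,
# an object at 0.5 m is in crossed disparity
d_far = absolute_disparity(1.0, 2.0)   # negative (uncrossed)
d_near = absolute_disparity(1.0, 0.5)  # positive (crossed)

# The relative disparity between the two objects is the same
# whether the eyes fixate at 1 m or at 3 m
rel_fix_1m = absolute_disparity(1.0, 2.0) - absolute_disparity(1.0, 0.5)
rel_fix_3m = absolute_disparity(3.0, 2.0) - absolute_disparity(3.0, 0.5)
```

This mirrors the point made about the grey object in the legend: realigning the eyes drives its absolute disparity to zero, but relative disparities between objects are preserved.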
After introducing the influential work of David Marr on computational vision and J.J. Gibson on ecological optics, Candy and Cormack write: “Two things that neither of these authors fully anticipated were the need to cope with the large amount of variability present in real-world stimuli and the (binocular) visual system’s ability to exploit the statistical regularities in visual stimulation that arise from the physics of the environment. These statistical properties of the stimulation we receive from birth have shaped the way in which the visual system encodes and decodes information to guide behavior. In the context of binocular vision, this has led to studies of the statistics of natural visual experience, the response properties of neurons in extrastriate cortex, the use of more naturalistic stimuli that consider multiple cues, plus the development of models of cue integration, decision making and motor responses.”
Even on a monocular basis, the fact that a typical observer is so unaware of the computational challenges involved in walking through the world attests to the elegance and efficiency of these mechanisms, and reiterates that monocular fixation does not represent an ‘at rest’ absence of motor activity. Rather, the authors write, stable visual percepts require highly sophisticated coordination of motor activity followed by rapid computation performed on unstable retinal image information. Fixational eye movements and the vestibulo-ocular reflex (VOR) belong to a category of ‘stabilization’ eye movements. For a typical observer, these computations are performed fast enough to sustain a stable percept, akin to the image stabilization needed to shoot a decent video while walking or running.
The ultimate challenge is in understanding how we integrate this information between the two eyes, particularly when there are asymmetries involved, as occurs in the natural world where objects on the midline are the exception rather than the rule. Data now suggest that the asymmetric behavioral responses are a result of innervation, coded in position and/or velocity, in a number of both monocular and binocular pathways. There is much to understand about the sets of neural circuitry underlying the combination of vergence and accommodation in the tracking and jump motor responses that permit us to rapidly fixate and focus on targets in our three-dimensional natural visual environment. A more complete understanding of how normal function is disrupted contributes to clinical management of the binocular continuum, ranging from manifest misalignment in strabismus to stress on alignment in traumatic brain injury (TBI) and convergence insufficiency (CI).
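The asymmetric rotations illustrated in panel B of Figure 1 follow directly from a little trigonometry. In this hypothetical sketch (the interocular distance and the ‘green’ and ‘red’ object coordinates are all assumed values for illustration), a binocular refixation is decomposed into its conjugate (version) and disconjugate (vergence) components, in the spirit of Hering's framework:

```python
import math

IPD = 0.064  # assumed interocular distance, in meters

def eye_angles(x, z):
    """Horizontal rotation of each eye (radians, positive = rightward)
    needed to fixate a point at lateral offset x and viewing distance z.
    The left eye sits at x = -IPD/2, the right eye at x = +IPD/2."""
    left = math.atan2(x + IPD / 2, z)
    right = math.atan2(x - IPD / 2, z)
    return left, right

def gaze_shift(p_from, p_to):
    """Decompose a binocular refixation into version (shared lateral
    rotation) and vergence (change in convergence) components."""
    l0, r0 = eye_angles(*p_from)
    l1, r1 = eye_angles(*p_to)
    dl, dr = l1 - l0, r1 - r0
    version = (dl + dr) / 2   # conjugate component
    vergence = dl - dr        # disconjugate component (positive = converging)
    return dl, dr, version, vergence

# hypothetical 'green' object (left of midline, far) to
# 'red' object (right of midline, near)
dl, dr, version, vergence = gaze_shift((-0.2, 2.0), (0.2, 0.8))
# dl != dr: the two eyes must rotate through unequal angles, and the
# movement has both a lateral (version) and a depth (vergence) component
```

For a purely midline change in fixation distance, the version component vanishes and the movement is pure vergence; for the off-midline case above, the required rotations of the two eyes are unequal, just as the purple arrows in the figure indicate.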
The legend for the figure above from the article reads: An illustration of the binocular integration of images from eyes that are aligned (row A) or misaligned (row B). The images in the left column illustrate the scene with simulated visual fields for the left (red) and right (blue) eyes. The aligned case illustrates the typical central binocular overlapping field with monocular regions in the temporal extremes. The misaligned simulated exotropia in row B illustrates the reduced binocular overlap and extended monocular regions present with the divergent deviation of the right eye. The middle column provides the relevant visual fields for each eye and a reminder that binocular integration is not merely a question of overlaying these fields/retinal images. The right column follows the basic principle of mapping in primary visual cortex, where the binocular visual field is represented with aligned information and appropriate transition into the monocular crescents for aligned eyes and conflicting information in the misaligned case. The cortical representation for the patient with exotropia might imply diplopia and confusion in the nominally binocular cortex, as shown in the top example in row B, or, if the image from the right eye is suppressed in binocular cortex, shown below, any percept from the right eye’s monocular crescent would result in an apparent missing section of the visual scene (as illustrated by the grey region in the left column scene). These potential cortical ‘images’ illustrate some of the challenges faced by the brains of these strabismic patients in compiling a stable unified percept of the world.
Here is an interesting example of how the brain can overcome the monocular effects of pathology, in this case from AMD resulting in significant loss of central acuity unilaterally, yet preserve a very rich sense of binocular depth. If you free-space fuse the image below, you’ll likely see that your brain has no trouble seeing the shirt on the table in depth binocularly. The right eye image, in which AMD obscures identification of the shirt, isn’t perceived as a scotoma. Rather, the natural scene results in the brain perceiving it as a shadow on the wall projected by the sunlight through foliage.
The notion of the static horopter from physiological optics is impoverished, particularly when it comes to appreciating the flexibility and adaptability of binocular vision.
Binocular perception does not always reflect a literal correspondence between our two retinal images. We now know that different brain areas are responsible for different features of binocular vision; each brain area and each binocular mechanism has, in fact, its own scheme for combining information from the two eyes. As Candy and Cormack note, to derive a relatively unambiguous and stable dynamic depth map, the visual system copes with an enormous amount of noise and uses filtered information, such as “disparity energy” or differences between quasi-local velocities in the two eyes, rather than point-wise luminance values. This use of higher-order information was foreseen by Gibson’s ecological optics, and it also involves computations that transform the information into useful representations, as Marr suggested, to generate advantageous behaviors. For reference, here are hyperlinks to the nine-part blog series we did on Eco-Optics: Part 9, Part 8, Part 7, Part 6, Part 5, Part 4, Part 3, Part 2, Part 1.
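For readers curious what “disparity energy” means computationally, here is a minimal one-dimensional sketch of the classic binocular energy model: quadrature pairs of Gabor-shaped receptive fields, with the right eye's fields displaced by a candidate disparity, whose summed binocular responses are squared and added. All parameters (filter frequency, envelope width, stimulus size) are illustrative choices, not values from the article:

```python
import numpy as np

def gabor(x, freq, phase, sigma=8.0):
    """1D Gabor receptive-field profile (Gaussian envelope x sinusoid)."""
    return np.exp(-x**2 / (2 * sigma**2)) * np.cos(2 * np.pi * freq * x + phase)

def disparity_energy(left, right, x, shift, freq=0.15):
    """Binocular energy of a quadrature pair whose right-eye receptive
    fields are displaced by `shift` pixels: sum each eye's filter output
    (simple-cell stage), then square and add (complex-cell stage)."""
    s_even = left @ gabor(x, freq, 0.0) + right @ gabor(x - shift, freq, 0.0)
    s_odd = left @ gabor(x, freq, np.pi / 2) + right @ gabor(x - shift, freq, np.pi / 2)
    return s_even**2 + s_odd**2

# Toy experiment: random-dot patterns with a fixed 3-pixel interocular shift,
# averaged over many patterns to smooth out stimulus noise
rng = np.random.default_rng(0)
x = np.arange(-32, 33, dtype=float)
true_disp = 3
shifts = range(-6, 7)
mean_energy = np.zeros(len(shifts))
for _ in range(2000):
    left = rng.standard_normal(x.size)
    right = np.roll(left, true_disp)  # right eye's image displaced by 3 px
    mean_energy += [disparity_energy(left, right, x, s) for s in shifts]
mean_energy /= 2000

best = list(shifts)[int(np.argmax(mean_energy))]  # peaks at the true disparity
```

Note that no point-by-point matching of luminance values ever occurs: the population of energy units tuned to different shifts simply responds most strongly at the disparity actually present in the stimulus.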
Crucially, Candy and Cormack add, the binocular visual system also uses prior information, incorporating Bayesian strategies to produce optimal or near-optimal behaviors. They conclude their superb article with nuances regarding the implications of screen time, the binocular approach to amblyopia therapy, and augmented reality in rehabilitation.
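The Bayesian strategy alluded to here can be made concrete with the standard Gaussian cue-combination rule, in which each cue, and the prior, is weighted by its reliability (inverse variance). The depth cues and numbers below are purely hypothetical, chosen only to show the arithmetic:

```python
def fuse_gaussian(estimates):
    """Combine independent Gaussian estimates by inverse-variance weighting.
    `estimates` is a list of (mean, variance) pairs - cues and/or a prior;
    returns the posterior (mean, variance)."""
    precision = sum(1.0 / v for _, v in estimates)
    mean = sum(m / v for m, v in estimates) / precision
    return mean, 1.0 / precision

# Hypothetical example: disparity signals a surface at 50 cm (reliable),
# blur signals 70 cm (noisy), and a prior says surfaces tend to sit
# around 60 cm. Variances are in cm^2 and are illustrative only.
mean, var = fuse_gaussian([(50.0, 4.0), (70.0, 25.0), (60.0, 16.0)])
# the fused estimate sits nearest the most reliable cue, and its
# variance is smaller than that of any single cue or the prior alone
```

The key property is the last comment: combining cues this way never makes the estimate less reliable, which is one sense in which such behavior is called “optimal.”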