Math of Illusions

Illusions based on perceptual adaptation are concisely described by mathematics that expresses relationships in complex space (typically planes), such as the Laplace s-plane or the z-plane.

Perceptual Adaptation

One of my favorite demonstrations of a visual illusion from adaptation starts with fixating on the center of a constant-brightness image with smoothly modulated color (~magenta to green to magenta) spanning about 40 deg of the visual field. After fixating on the center for nearly a minute, the image is switched to pure grey of equal brightness. The viewer then sees the opposite colors, and not in a way that appears to be an afterimage: for as long as you look at the grey image, you would swear that it is a color image made up of green and magenta. There are phantom images we all see that are not actually there outside of us, but are real signals generated between the light signals that reach us and our brains. The same types of adaptation mechanisms that create visual illusions also create auditory illusions (and I've incorporated them in many of my albums) and other perceptual illusions. There is also strong evidence that they create emotional "over-shoots" and "under-shoots", which at extremes might even be considered "illusions" where we emotionally feel that something happened even though it didn't.

This page focuses on vision because of the time I have spent on it, and because it is a good example that can be extrapolated to most living things and to most systems with inter-adaptation, like human relationships, communities, international relations, economies, etc. It uses math common to systems theory, control systems and digital filters. This first draft has links (or links to links) to the math; it mainly describes behaviour and ties it to math concepts, parameters and how they change with respect to the systems being modeled.

I spent years developing a very accurate human vision system model. The math of this accurate model is essentially the math of biological adaptation. It explains a myriad of perceptual illusions as well as how we are able to see, hear, feel, taste and smell things over such very wide dynamic ranges.

Living things, being biological systems, are composed of a myriad of adaptive mechanisms all working in tandem. The physiology of perceptual systems is no exception. The benefit of adaptation in perceptual systems is dynamic range. For example, our pupils contract or dilate depending on the amount of light entering the eye. This adjustment of pupil diameter allows a wider range of light intensities to be converted by the more limited-range light receptor set. The set of light receptors is in turn composed of receptors that work better in bright light and with narrower wavelength sensitivity, the cones, and receptors that work better in dim light with broader wavelength sensitivity, the rods. These light receptors, as well as the neurons which carry the light receptor responses to the visual cortex of the brain, and the brain itself, all have responses similar to fatigue for static or otherwise repeated light stimuli. This fatigue-like response essentially results in reduced sensitivity to the stimulus. This reduced sensitivity is classic in biological systems in general. It's the equivalent of turning down the volume on music you have grown tired of hearing too many times.

In contrast, if the environment becomes very dark, the visual perception system becomes quite sensitive, increasing in sensitivity significantly within about 20 seconds, and typically increasing further, with diminishing returns, over about 2 minutes (a rule of thumb). The pupils dilate for dark adaptation, receptors recover from bleaching, neurons change firing rate and re-establish a baseline for the new average stimulus, etc. Then, if there is suddenly a pulse of bright light, it can easily swamp the visual perception system. For example, if a very bright picture of bright clouds is presented to such a dark-adapted viewer, the person would not be able to see any details in the clouds. All one could see would be a flash of very bright light filling the image space with no discernible details. If the image is sustained so that the average light level changes accordingly, the details are perceived, as the signal reaching the brain is no longer clipped by overload.

On the level of the neuron, there are excitation stimuli and inhibition stimuli as input. The inhibition stimuli generally come from neurons that have integrated the signal more, and therefore help to remove the average stimulus, making the overall response more sensitive to change. This in effect helps us ignore signals that get old over time, so we can pay more attention to what's new. In effect, our perception becomes more differential. For vision, this is generally explained in terms of a "surround" response for spatial inhibition (but for the spatiotemporal response it actually includes both wider spatial and longer temporal inhibition). Thus the surround response forms the baseline average against which changes in the "center" (smaller spatial area, shorter temporal duration) can be detected.
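The center and surround idea can be sketched in a few lines of Python. This is only a minimal, temporal-only illustration, assuming that each stage is a simple exponential running average and that the response is the fast "center" minus the slow "surround". The function names and the coefficient values (0.5 and 0.05) are hypothetical and are not taken from the vision model.

```python
def running_average(samples, b0):
    """Exponential running average: y[n] = b0*x[n] + (1 - b0)*y[n-1]."""
    y = 0.0
    out = []
    for x in samples:
        y = b0 * x + (1.0 - b0) * y
        out.append(y)
    return out


def center_minus_surround(samples, b0_center=0.5, b0_surround=0.05):
    """Fast 'center' average minus slow 'surround' average (hypothetical values)."""
    center = running_average(samples, b0_center)      # short memory
    surround = running_average(samples, b0_surround)  # long memory, the baseline
    return [c - s for c, s in zip(center, surround)]


# A constant stimulus that is switched on, held, then removed.
stimulus = [1.0] * 200 + [0.0] * 50
response = center_minus_surround(stimulus)

print("at onset:        ", round(response[0], 3))
print("after adaptation:", round(response[199], 3))
print("after removal:   ", round(min(response[200:]), 3))
```

In this sketch the response is strong at onset, fades toward zero while the stimulus is held, and swings to the opposite sign once the stimulus is removed, even though the input is then zero.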

The math model that accounts for these adaptations, as well as the vast majority of how we see luminance, is given (details via references) in "AN ADAPTABLE HUMAN VISION MODEL FOR SUBJECTIVE VIDEO QUALITY RATING PREDICTION", a white paper originally presented at VPQM 2007, Scottsdale, Arizona & "The Cutting Edge" 2007 IBC (see the Proceedings and/or the .mp3 audio of the presentation {given with major jet lag}). Figure 2 shows a recursive model with a single real pole (for a given fixed positive real b0 and a1, both between 0 and 1.0). Figure 3 shows how this filter is replicated and connected to form a center and surround filter set. In Figure 4, while the surround is generally used to modulate the neural response (the b0 coefficient, and thus a1 = 1 - b0), the figure shows a simple way to incorporate viewing distance (zoom), resolution and temporal response modulation.
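For readers who prefer code to block diagrams, here is a minimal sketch of a single real pole recursive filter. It assumes the conventional difference equation y[n] = b0*x[n] + a1*y[n-1] with a1 = 1 - b0; the exact structure in the paper's Figure 2 may differ in detail, and the function name and the value b0 = 0.2 are hypothetical.

```python
def one_pole_lowpass(samples, b0):
    """Single real pole filter: y[n] = b0*x[n] + a1*y[n-1], with a1 = 1 - b0.

    Assumed form; the pole of this filter sits at z = a1 on the real axis.
    """
    if not 0.0 < b0 <= 1.0:
        raise ValueError("b0 must be in (0, 1]")
    a1 = 1.0 - b0
    y = 0.0  # filter state, starting from rest
    out = []
    for x in samples:
        y = b0 * x + a1 * y
        out.append(y)
    return out


# Response to a constant (step) input of 1.0.
step = [1.0] * 50
response = one_pole_lowpass(step, b0=0.2)
print(response[:3], "...", response[-1])
```

With a1 tied to b0 in this way, the output creeps up toward the constant input rather than overshooting it, and a smaller b0 simply makes that approach slower.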

The model has nested feedback loops, each resulting in at least one pole in the complex plane. One core theme of the nested feedback is to take the average response and feed it back to control the sensitivity to the stimulus. A useful perspective for understanding the behaviour of such a system is as follows. Imagine we start with a simple linear system with simple linear feedback. The output is fed back (inverted) to sum with the input signal. As long as the feedback gain is low enough, the output simply tends to be a running average of the input, with newer stimuli weighted more than older stimuli. If the feedback gain is too high, the output will continue to grow over time (until it hits a maximum in real systems, or goes to infinity in this simple math model). The feedback gain can be set to a convenient value in between these extremes, so that for a constant input, the output is equal to that same constant (DC gain = 1).
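As a worked equation, the running average and the DC gain of 1 can be written out for one assumed form of the simple feedback system, y[n] = b0 x[n] + a1 y[n-1]. The sign convention of the feedback path may differ from the figures in the paper; the algebra below only covers this assumed form, starting from rest and with |a1| < 1.

```latex
% Assumed difference equation and its z-domain transfer function
y[n] = b_0\,x[n] + a_1\,y[n-1]
\quad\Longrightarrow\quad
H(z) = \frac{b_0}{1 - a_1 z^{-1}}, \qquad \text{pole at } z = a_1

% Unrolling the recursion: a running average whose weights decay
% geometrically, so newer stimuli count more than older ones
y[n] = b_0 \sum_{k=0}^{\infty} a_1^{\,k}\, x[n-k]

% DC gain (evaluate at z = 1) is exactly 1 when a_1 = 1 - b_0
H(1) = \frac{b_0}{1 - a_1} = \frac{b_0}{b_0} = 1
```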

In terms of complex numbers, such a simple linear feedback system is a single pole system. In the z-plane, the pole is non-negative and on the real axis. If the output continues to grow over time, the pole is on the real axis somewhere above 1.0. If the output follows the input over time without any lag, the pole is at the origin (0, 0) of the complex plane. Otherwise, the pole is on the real axis somewhere between 0 and 1.0.
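The three pole regions can be compared with a small sketch. It assumes the recursion y[n] = x[n] + pole*y[n-1], with the input gain fixed at 1 so that only the pole position changes, and looks at the response to a single unit impulse. The function name and the example pole values are hypothetical.

```python
def impulse_response(pole, n_steps=20):
    """Response of y[n] = x[n] + pole*y[n-1] to a single unit impulse at n = 0."""
    y = 0.0
    out = []
    for n in range(n_steps):
        x = 1.0 if n == 0 else 0.0
        y = x + pole * y
        out.append(y)
    return out


for pole in (0.0, 0.8, 1.1):
    h = impulse_response(pole)
    print(f"pole = {pole}: first samples {h[:4]}, last sample {h[-1]}")
```

With the pole at 0 the output is just the input impulse (no memory, no lag), with the pole at 0.8 the impulse leaves a tail that decays geometrically, and with the pole at 1.1 the tail keeps growing.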

Now let's consider what happens if that pole can be adjusted over time to accommodate changes in the input. There are poles for the center (both spatial and temporal) and also for the surround (both spatial and temporal). Typically, the average brightness of what is being seen will determine the surround poles. If you stare at a simple high-contrast image like an edge between black and white, for example, fixating on one spot along the edge for at least 20 seconds, and then fixate on one spot in the middle of the white, you can see the sensitivity shift over time. Once you are looking at a spot in the white area, if you look for it, you can see that the part of your visual field that had been dark adapted for 20 seconds appears much brighter than the part of your visual field that was bright adapted.
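The black and white edge experiment can be imitated with a deliberately simplified stand-in. Instead of moving a filter pole as the vision model does, this sketch uses plain divisive gain control: each input sample is divided by a slow running average of the recent input. The function name, the frame rate, the luminance levels and the coefficients are all hypothetical and chosen only for illustration.

```python
def adapted_response(luminance, b0_adapt=0.01, floor=0.01):
    """Divide each sample by a slow running average of the preceding input.

    A stand-in for adaptation (divisive gain control), not the paper's model.
    """
    average = luminance[0]  # assume the patch is already adapted at the start
    out = []
    for x in luminance:
        out.append(x / (floor + average))
        average = b0_adapt * x + (1.0 - b0_adapt) * average
    return out


FRAMES_PER_SECOND = 10           # hypothetical sampling rate
ADAPT = 20 * FRAMES_PER_SECOND   # 20 seconds of fixating on the edge
TEST = 20 * FRAMES_PER_SECOND    # 20 seconds of looking at the white area

# Two patches of the visual field: one sees black and then white, one sees white throughout.
dark_then_white = [0.05] * ADAPT + [1.0] * TEST
white_throughout = [1.0] * (ADAPT + TEST)

dark_adapted = adapted_response(dark_then_white)
bright_adapted = adapted_response(white_throughout)

print("just after the switch:", dark_adapted[ADAPT], "vs", bright_adapted[ADAPT])
print("20 seconds later:     ", dark_adapted[-1], "vs", bright_adapted[-1])
```

Right after the switch, the dark-adapted patch responds far more strongly to the same white input than the bright-adapted patch, and the difference shrinks as its running average catches up.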

On the time scale in which the pole moves relatively far, as in our 20 second example, the response of the system changes substantially. For spectral components of the stimuli that are low enough in frequency that their periods are about equal to or longer than the time it takes to move the pole substantially, there is a significant nonlinear response. This nonlinear response can manifest in many different ways, depending on the nature of the input stimulus. As mentioned in the article linked above, there are known phantom transient images, such as phantom pulses which can occur between two input pulses spaced far enough apart in time, but not so far apart that the 3rd "phantom" pulse can no longer be produced in between. In other words, a 3rd "phantom" pulse illusion is seen between two light pulses due to the adaptive mechanisms of the human vision system. Similarly, standing waves of light can produce mixing products (squares of the spatiotemporal input signal), creating patterns that are seen, but are not actually there.
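To see why a squaring nonlinearity creates components that are not in the input, consider the simplest case. Assume, purely for illustration, that the nonlinearity acts as an exact square on the sum of two sinusoidal components with phases \alpha and \beta (hypothetical symbols; each could stand for a spatial or temporal phase such as 2\pi f t).

```latex
(\cos\alpha + \cos\beta)^2
  = \cos^2\alpha + \cos^2\beta + 2\cos\alpha\cos\beta
  = 1 + \tfrac{1}{2}\cos 2\alpha + \tfrac{1}{2}\cos 2\beta
      + \cos(\alpha - \beta) + \cos(\alpha + \beta)
```

The output contains a constant term, doubled frequencies, and sum and difference frequencies, none of which are present in the input. The difference term is typically the slowly varying one.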

In training the human vision model to match psychometric data from visual perception experiments, I found that for large area stimuli that changed at a rate of about 3-8 Hz (depending on the average brightness, etc.), individual sensitivities vary enormously. When trained to match the most sensitive people, for very bright stimuli, the model became chaotic, with large swings in response continuing long after the stimulus was removed. At first I thought this was a bug in the model. But later I learned that this same stimulus set is known to trigger photosensitive epilepsy in some individuals. Further research into this led to a photosensitive epilepsy trigger detector and mitigation methods, with a link to the patent here:

  • U.S. Patent No. 8,768,020 Method of detecting visual stress and photosensitive epilepsy triggers in video and mitigation device