The relationship between mental imagery and vision is a long-standing problem in neuroscience. It remains unknown whether differences between the activity evoked during vision and the activity reinstated during imagery reflect different codes for seen and mental images. To address this problem, we modeled mental imagery in the human brain as feedback in a hierarchical generative network. Such networks synthesize images by feeding abstract representations from higher to lower levels of the network hierarchy.
When higher processing levels are less sensitive to stimulus variation than lower processing levels, as in the human brain, activity in low-level visual areas should encode variation in mental images with less precision than variation in seen images. To test this prediction, we conducted an fMRI experiment in which subjects imagined and then viewed hundreds of spatially varying naturalistic stimuli. To analyze these data, we developed imagery-encoding models. These models accurately predicted brain responses to imagined stimuli and enabled accurate decoding of their position and content.
They also allowed us to compare, for every voxel, tuning to seen and imagined spatial frequencies, as well as the location and size of receptive fields in visual and imagined space. We confirmed our prediction, showing that, in low-level visual areas, imagined spatial frequencies in individual voxels are reduced relative to seen spatial frequencies and that receptive fields in imagined space are larger than in visual space.
These findings reveal distinct codes for seen and mental images and link mental imagery to the computational abilities of generative networks.
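As a toy illustration of the prediction tested above (not the imagery-encoding models used in the study), the following Python sketch assumes a two-level linear hierarchy in which the higher level averages over blocks of `K` low-level units, and in which imagery drives the low level purely through feedback from that pooled representation. The functions `pool`, `feedback`, `seen`, and `imagined`, and the sizes `N` and `K`, are hypothetical names chosen for illustration only.

```python
import math

N = 64  # number of low-level units (assumed, 1-D "visual field")
K = 4   # pooling width of the higher level (assumed)

def pool(x, k):
    """Feedforward step: average non-overlapping blocks of k samples."""
    return [sum(x[i:i + k]) / k for i in range(0, len(x), k)]

def feedback(z, k):
    """Feedback step: copy each high-level value back to k low-level units."""
    return [v for v in z for _ in range(k)]

def seen(stimulus):
    """Vision: low-level activity is driven directly by the stimulus."""
    return list(stimulus)

def imagined(stimulus):
    """Imagery: low-level activity is regenerated from the pooled code."""
    return feedback(pool(stimulus, K), K)

def receptive_field_size(response_fn, n, unit):
    """Count the point-stimulus positions that drive one low-level unit."""
    count = 0
    for pos in range(n):
        impulse = [0.0] * n
        impulse[pos] = 1.0
        if response_fn(impulse)[unit] != 0.0:
            count += 1
    return count

def retained_amplitude(response_fn, cycles, n):
    """Fraction of a grating's amplitude (L2 norm) present in the response."""
    grating = [math.sin(2 * math.pi * cycles * i / n) for i in range(n)]
    response = response_fn(grating)
    norm = lambda v: math.sqrt(sum(a * a for a in v))
    return norm(response) / norm(grating)

rf_seen = receptive_field_size(seen, N, unit=10)
rf_imagined = receptive_field_size(imagined, N, unit=10)
low_kept = retained_amplitude(imagined, cycles=1, n=N)
high_kept = retained_amplitude(imagined, cycles=16, n=N)

print("receptive field size (seen, imagined):", rf_seen, rf_imagined)
print("amplitude retained in imagery (low, high frequency):", low_kept, high_kept)
```

Under these assumptions, the sampled low-level unit responds to a single stimulus position during vision but to `K` positions during imagery. A grating whose period equals the pooling width is lost in the fed-back signal, while a slow grating survives nearly intact. This mirrors, in miniature, the larger receptive fields and reduced spatial frequencies reported for imagery.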