How AI Is Learning to See Through the Human Eye

By Abigail Shields, Year 12

How is it that the totality of everything we see, hear, and feel is the invention of a warm, wrinkled mass tucked inside our skull? While neuroscientists remain uncertain about how the brain transforms the world outside our eyes into the world inside our minds, artificial intelligence is gradually drawing closer to doing something analogous.

Neuroscientists Yu Takagi and Shinji Nishimoto at Osaka University are among the first to teach AI to re-create what the human eye sees using nothing but patterns of brain activity. Using functional magnetic resonance imaging (fMRI) — a technique pioneered in the early 1990s by researchers including Seiji Ogawa and Kenneth Kwong — which measures changes in blood oxygen levels in the brain, the researchers analysed recordings of brain activity from four participants who each viewed about 10,000 images, identifying distinct patterns of activation across visual-processing regions.
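To make that pairing concrete, here is a minimal sketch in Python, using simulated numbers and array sizes of my own choosing rather than the study's real data, of how each viewed image stays linked to the brain response it evoked:

```python
import numpy as np

rng = np.random.default_rng(0)
n_images, n_voxels = 10_000, 500  # illustrative sizes only

# One row per viewed image: the pattern of blood-oxygen (BOLD)
# responses across voxels in the visual-processing regions.
brain_responses = rng.standard_normal((n_images, n_voxels))

# A parallel array of stimulus identifiers keeps each response
# paired with the exact image that produced it.
stimulus_ids = np.arange(n_images)

print(brain_responses.shape)  # (10000, 500): images x voxels
```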

They found that activity in the occipital lobe corresponded to low-level visual features such as shape, spatial layout, and perspective, while activity in temporal-lobe regions reflected higher-level semantic information — including object identity and meaning. As participants viewed each stimulus, a unique configuration of neural activity emerged, forming what was described as a “neural fingerprint” of perception. Together, these fingerprints captured both the appearance and the meaning of each image.
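A toy illustration of why such fingerprints matter, again using simulated patterns rather than real fMRI data: if every image evokes its own distinctive voxel pattern, a fresh noisy response can be matched back to the image that produced it by simple correlation.

```python
import numpy as np

rng = np.random.default_rng(1)
n_images, n_voxels = 100, 500

# Simulated "fingerprints": one characteristic pattern per image.
fingerprints = rng.standard_normal((n_images, n_voxels))

# A fresh, noisy measurement of the response to image 42.
true_image = 42
new_response = fingerprints[true_image] + 0.5 * rng.standard_normal(n_voxels)

# Identify the image by finding the best-correlated fingerprint.
correlations = [np.corrcoef(new_response, fp)[0, 1] for fp in fingerprints]
print(int(np.argmax(correlations)))  # 42: the fingerprint is distinctive
```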

This vast dataset of paired images and brain responses provided the foundation for training AI to recognise and reconstruct the visual content encoded in brain activity. Rather than building a new AI model from scratch, the researchers linked the brain patterns to the pre-existing text-to-image model Stable Diffusion — a model already trained on billions of image-caption pairs. They used simple linear mapping models to translate the fMRI data into the AI’s internal “visual language” — a mathematical latent space in which images are represented as patterns of meaning and form. Beginning with a field of noise, the AI then applied its denoising process, guided by the brain-decoded signals, to generate a reconstruction of the visual stimulus the participant had perceived.
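A minimal sketch of that translation step, assuming simulated data and illustrative sizes of my own choosing (the study's actual pipeline, voxel counts, and latent dimensions differ): fit a regularised linear map from voxel responses to latent vectors on training images, then decode the latent for an unseen scan.

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(2)
n_train, n_voxels, latent_dim = 2_000, 1_000, 64  # illustrative sizes only

# Simulated training pairs: voxel responses alongside the latent vectors
# that Stable Diffusion assigns to the corresponding images.
X_train = rng.standard_normal((n_train, n_voxels))
true_map = rng.standard_normal((n_voxels, latent_dim)) / np.sqrt(n_voxels)
Y_train = X_train @ true_map + 0.1 * rng.standard_normal((n_train, latent_dim))

# The "translation" step: a simple regularised linear map from brain
# space into the generative model's latent space.
mapper = Ridge(alpha=1.0)
mapper.fit(X_train, Y_train)

# Decode the latent for a new, unseen brain scan. In the real pipeline,
# this decoded latent would seed Stable Diffusion's denoising process.
new_scan = rng.standard_normal((1, n_voxels))
decoded_latent = mapper.predict(new_scan)
print(decoded_latent.shape)  # (1, 64)
```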

Sometimes, the generated images even managed to capture recognisable objects or general layouts. Probing further, the researchers found that the AI’s layered architecture reflected the structure of human visual processing, hinting that, in its own mechanical way, the machine was beginning to “see” as we do.

But what does this mean? For now, not much in truly practical terms: the method is still not advanced enough to come close to accurate telepathy. While the reconstructions of participants’ perceptions can be recognisable, they are often blurry and far from perfect replications. The precision of this method is limited because fMRI measures changes in blood oxygenation rather than the direct firing of individual neurons, and captures signals only every one or two seconds rather than in real time.
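A deliberately simplified numerical illustration of that sampling limitation (real BOLD signals also blur activity over several seconds, which this toy snapshot model ignores): events faster than the scan interval can disappear entirely.

```python
import numpy as np

dt = 0.001                    # 1 ms resolution for the "true" neural signal
t = np.arange(0, 10, dt)
neural = np.zeros_like(t)
neural[(t > 4.10) & (t < 4.15)] = 1.0   # a brief 50 ms burst of activity

# fMRI-style sampling: one snapshot every 1.5 seconds.
sample_times = np.arange(0, 10, 1.5)
idx = np.round(sample_times / dt).astype(int)
print(neural[idx])  # all zeros: the burst falls between snapshots and vanishes
```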

However, what the work suggests for the future is far more significant. A multitude of possibilities arises: the technology offers a chance to peer into how memory, imagination, and visual hallucinations might work; perhaps even a chance to study perception in animals that can’t put into language what they see; or the ability to visualise and understand dreams. Scientists could test theories of perception, and in learning to “see like us”, the machine may be teaching us how we see ourselves.

And yet, where do we draw the line? Decoding someone’s visual experience is invasive, or at least deeply personal, and could give rise to serious ethical dilemmas. If brain-decoding technology becomes more accurate, how do we know it will be used only for medical and research purposes, and not exploited by law enforcement or intelligence agencies?

Works Cited

Nahas, Kamal. “AI Re-Creates What People See by Reading Their Brain Scans.” Science, 7 Mar. 2023, https://www.science.org/content/article/ai-re-creates-what-people-see-reading-their-brain-scans.

Parshall, Allison. “AI Can Re-Create What You See from a Brain Scan.” Scientific American, 17 Mar. 2023, https://www.scientificamerican.com/article/ai-can-re-create-what-you-see-from-a-brain-scan/.

Takagi, Yu, and Shinji Nishimoto. “High-Resolution Image Reconstruction with Latent Diffusion Models from Human Brain Activity.” Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023, https://openaccess.thecvf.com/content/CVPR2023/html/Takagi_High-Resolution_Image_Reconstruction_With_Latent_Diffusion_Models_From_Human_Brain_CVPR_2023_paper.html.
